The Dublin Core and Warwick Framework: A Review of the Literature, March 1995 - September 1997

D-Lib Magazine, January 1998. ISSN 1082-9873.

Harold Thiele
Department of Library and Information Science
School of Information Sciences
University of Pittsburgh
Pittsburgh, Pennsylvania
hthiele@lis.pitt.edu

Overview

The purpose of this essay is to identify and explore the dynamics of the literature associated with the Dublin Core Workshop Series. The essay opens by identifying the problems that the Dublin Core Workshop Series is addressing, the status of the Internet at the time of the first workshop, and the contributions each workshop has made to the ongoing discussion. The body of the essay describes the characteristics of the literature, highlights key documents, and identifies the major researchers. The essay closes with an evaluation of literary trends and consideration of future research directions. The essay concludes that a shift from a descriptive emphasis to a more empirical form of literature is about to take place. Future research questions are identified in the areas of satisfying searcher needs, the impact of surrogate descriptions on search engine performance, and the effectiveness of surrogate descriptions in authenticating Internet resources.

Introduction

The literature associated with the Dublin Core Workshop Series is of recent origin, having started in 1995. It focuses on the development and promotion of metadata elements that facilitate the discovery of both textual and non-textual resources in a networked environment and support heterogeneous metadata interoperability. The object is to develop a simple metadata set and associated syntax that will be used by information producers and providers to describe their networked resources, thereby improving their chance of discovery.

Background: Information Discovery, Search, and Retrieval

The Internet's rapid expansion following the introduction of the World Wide Web (WWW) and the Mosaic WWW client/browser in 1993 (Poulter, 1997) was not accompanied by an equally radical change in the way the net was searched. Rather, like Gopher before it, the WWW depends upon two main classes of Internet search engines - keyword and subject directory.
The development of robot programs that copy the contents of web pages back to a central site for indexing is magnifying the retrieval problems, because HTML does not have mandatory resource description sections. Underscoring the magnitude of the resource description problem, Bray's (1996) November 1995 survey and quantitative description of the web identified 11,366,121 unique gopher, ftp, and http URLs with an average page size of 6500 bytes. Bray observed that page size has remained consistent since the start of the Open Text Index, but the number of pages is increasing dramatically. Using a different experimental design, Woodruff and others (1996), in their examination of 2.6 million HTML (http) documents retrieved by the Inktomi Web crawler in November 1995, reported an average size of 4400 bytes for their sample. In contrast to Bray's observations, which included the more mature gopher and ftp sites, Woodruff and others commented that the properties of the HTML documents were changing exceptionally quickly, especially in increasing page size and the URLs' inability to persist for an extended length of time. They agreed with Bray that the number of pages is increasing dramatically, finding that the size of the Internet seemed to double between October and November 1995, going from 1.3 million to 2.6 million HTML documents.

Growth in size and heterogeneity represents one set of challenges for designers of search systems. A second set of challenges arises from searchers' behavior. Recent studies have shown that users have difficulty in finding the resources they are seeking. Using log file analysis, Catledge and Pitkow (1995) found that users typically did not know the location of the documents they sought and used various heuristics to navigate the Internet, with the use of hyperlinks being the most popular method. They also found that users rarely cross more than two layers in a hypertext structure before returning to their entry point.

The Dublin Core Workshops

It is against this background that the Dublin Core Workshop Series has provided the catalyst and direction for the development of the literature. Employing a multidiscipline approach and focus group methodology to develop consensus, each of the workshops has contributed to the refinement of the arguments and redirection of the research as described below.

The first Dublin Core Metadata Workshop (DC-1) was held in March 1995 (Weibel and others, 1995). It was sponsored by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA). The first workshop's product was a core set of 13 metadata elements (the Dublin Core Metadata Element Set) that could be used to describe networked textual resources. A year later, in April 1996, a second workshop sponsored by OCLC and the United Kingdom Office for Library and Information Networking (UKOLN) was convened at Warwick University in England (Dempsey and Weibel, 1996). The products of the Warwick Metadata Workshop (DC-2) included a proposed Dublin Core syntax, guidelines, and a modular based framework (the Warwick Framework) for packaging the metadata. In September 1996, OCLC and the Coalition for Networked Information (CNI) sponsored a third workshop, the CNI/OCLC Image Metadata Workshop (DC-3), in Dublin, Ohio. The products of DC-3 included the expansion of the Dublin Core Element Set to 15 elements, and the refinement of the element set to support the description of networked non-textual resources.
Canberra, Australia, hosted the 4th Dublin Core Metadata Workshop (DC-4) in March 1997 (Weibel, Iannella, and Cathro, 1997). This workshop was sponsored by OCLC, the Distributed Systems Technology Centre (DSTC) and the National Library of Australia (NLA). The products of DC-4 included formalized qualifiers (the Canberra Qualifiers) and the development of syntactical expressions related to HTML. In October 1997, the 5th Dublin Core Metadata Workshop (DC-5) met in Helsinki, Finland, under the sponsorship of OCLC and the National Library of Finland. Stuart Weibel will report on the results of this meeting in a future issue of this magazine.

Characteristics of the Literature

In the two and one half years since DC-1, a vast and highly varied literature has developed around the ideas and concepts emanating from this ongoing series of workshops. The literature is primarily available through online sources. Where a print source is available, there is usually an electronic counterpart. Papers published as part of conference proceedings or transactions are generally available in both electronic and print format. Both an advantage and a disadvantage of this emphasis on electronic publication is that the articles are subject to update at frequent intervals. The most popular outlet for publishing information about the Dublin Core and Warwick Framework is D-Lib Magazine (ISSN: 1082-9873) [URL: http://www.dlib.org/]. The second most popular outlet is Ariadne (ISSN: 1361-3200) [URL: http://www.ariadne.ac.uk/]. The central resource page for workshop publications and web pages is The Dublin Core Metadata Element Set Home Page. This site maintains current links to workshop homepages, resources, and products, which is very important considering the emphasis on electronic publishing associated with this research area. Additional electronic resources are located at the various Digital Library and Metadata project websites that are experimenting with the use of the Dublin Core and Warwick Framework. See Hakala (1997) for a listing of recent project reports.

The literature easily separates into several distinct clusters. The first cluster, which forms the central literary core, contains the proceedings and reports from the various Dublin Core workshops. This literature grows from the description of the basic elemental metadata set (Weibel and others, 1995) and the container architecture and syntax (Dempsey and Weibel, 1996) to grappling with extension development, and international and multidiscipline implementation (Weibel, Iannella, and Cathro, 1997). This cluster presents the findings from the various workshops, the consensus reached and areas yet unresolved, as well as the direction of future efforts. At this point in time, most of the literature from these workshops can be characterized as descriptive or broadly conceptual in nature.

The second literary cluster focuses on crosswalks, or mapping the Dublin Core Metadata Elements to other metadata systems. While much of this literature consists of attempts to make simple one-to-one correlations without discussion of the problems involved, there are a few efforts that have provided more insight. Caplan and Guenther (1996) explored the difficulties in mapping a syntax-independent format (Dublin Core) against a highly syntax-dependent format (USMARC) and concluded that for successful machine mapping either generic fields will have to be added to the MARC scheme or the Dublin Core will have to be made more complex.
As part of a project to identify a minimal searchable metadata set, an extensive crosswalk matrix was prepared for the Federal Geographic Data Committee (FGDC) by the MITRE Corporation. The Dublin Core was used as the reference set for comparison with eight different metadata systems and identification of equivalent and non-equivalent metadata elements (MITRE Corporation, 1996). The Library of Congress (LOC) developed a crosswalk between the Dublin Core, MARC, and GILS that provides very specific MARC record formatting information (Library of Congress, 1997a). As part of this crosswalk, the LOC provided detailed rationale for their decisions. Where options were available for different possible mappings, the various options were fully explored so that accurate conversions can be made as the use of the Dublin Core becomes more widespread.

A third literary cluster revolves around how key standards making organizations are reacting to and incorporating the Dublin Core into their standards. As with the workshops, the methodology used is consensus building. The two major standards organizations concerned with Dublin Core issues at this time are the Library of Congress and the Internet Engineering Task Force (IETF). As part of their traditional methodology of gathering information and building consensus, the Library of Congress produced a series of discussion papers dealing with the integration of the Dublin Core Metadata Elements into the MARC record format. MARBI Discussion Paper No. 88 (Library of Congress, 1995) grappled with the problem of defining a generic author field that corresponded with the Dublin Core author element. This exercise resulted in formal approval of the MARC record 720 field in 1996. MARBI Discussion Paper No. 99 (Library of Congress, 1997b) builds on their earlier experience and the new revisions to the MARC record to provide a revised crosswalk and commentary on resolving ongoing mapping difficulties.

Two internet-drafts, the first stage in the IETF standards process, have been produced for consideration by the IETF. Building on their workshop experience and feedback from the various stakeholder communities, an internet-draft by Weibel, Kunze, and Lagoze (1997) provides a description of the Dublin Core and its semantics. Utilizing their previous work in directory related metadata research, a second internet-draft prepared by Hamilton, Iannella, and Knight (1997) provides a mapping from the Dublin Core to the X.500 and [C]LDAP directory service protocols by treating the Dublin Core elements as an object class of X.500/[C]LDAP attribute/value pairs. (Both X.500 and [C]LDAP define protocols for accessing information directories. The X.500 is designed to deal with all forms of telecommunication systems. The [C]LDAP, while based on the X.500, supports TCP/IP and was developed for Internet use (Kille, 1996; Shuh, 1997).) Two additional internet-drafts (Musella, 1997; Daviel, 1997) provide examples of how the META tag may be used in HTML documents to provide cataloging and resource identification information; an illustrative sketch of this approach appears below.

A fourth literary cluster is generated by ongoing Digital Library and/or Metadata projects around the world that are examining or actively incorporating the Dublin Core into their activities. The general formats used are technical reports or white papers that describe the conceptual approaches being used. At this point in time, the few empirical results reported are generated by prototypes and testbed exercises.
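To illustrate the META tag approach discussed in the Musella and Daviel drafts, the sketch below embeds Dublin Core elements in the HEAD of an HTML document. This is a representative example only: the element names follow the widely used "DC." prefix convention, the exact attribute syntax varied between the draft proposals, and the document content and URL shown here are hypothetical.

    <html>
    <head>
    <title>A Sample Networked Resource</title>
    <!-- Dublin Core elements expressed as HTML META tags -->
    <meta name="DC.title"      content="A Sample Networked Resource">
    <meta name="DC.creator"    content="Smith, Jane">
    <meta name="DC.date"       content="1997-09">
    <meta name="DC.format"     content="text/html">
    <meta name="DC.identifier" content="http://www.example.org/sample.html">
    </head>
    ...
    </html>

Because the description travels inside the document itself, a harvesting robot can index author-supplied metadata without requiring any change to the HTTP protocol or any separate metadata service.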
Miller (1996) discusses an extension to the Dublin Core that describes archaeological resources collected by the Archaeology Data Service (ADS). The expansion of the Dublin Core to describe non-textual information, and the inclusion of a tag linking every metadata item with its reference description, is a key part of this report. Godby and Miller (1997) describe a tool, the Spectrum Cataloging Markup Language (SCML), that can be used to extract information from structured records, implement extensions to the Dublin Core element set, and generate Dublin Core records.

The fifth literary cluster comprises articles and papers published in journals and conference proceedings that don't fit into the other four categories. It is in this cluster that many of the articles and studies using comparative and experimental research methodologies are reported, as well as the descriptions of smaller projects incorporating the Dublin Core. Desai (1997) compares the Dublin Core Elements List and the Semantic Header, and concludes that the Semantic Header is better suited for resource discovery on the Internet because it supports a more systematic approach to indexing information. Karttunen, Holmlund, and Nowotny (1996) describe how The Internet Pilot to Physics (TIPTOP) incorporated the Dublin Core as a critical component of a uniform and open information infrastructure for physics research and education.

Key Documents

There are a few documents that stand out from the rest of the literary field because of the completeness of the reporting or the importance of the concepts being discussed. First, there are two excellent descriptive studies that review the many metadata formats currently available. Heery (1996a) compared five metadata formats (IAFA/Whois++, MARC, Text Encoding Initiative, Dublin Core, and Uniform Resource Characteristics) against a set of five criteria in the context of bibliographic control. She provides an excellent description of the criteria, and follows it consistently in her comparison of the five formats. She also provides historical background, an example of a completed record where possible, and detailed description against each of the criteria for each of the formats. Heery concludes that while the constituencies promoting the various metadata formats are acknowledging the need for simplified records, there has been little progress towards rule simplification for content or towards consensus on the degree of semantic structural complexity required. She closes by stating that any successful metadata format will need to incorporate flexible change procedures and also have the ability to deal with legacy systems.

The second review of metadata formats is a product of the Development of a European Service for Information on Research and Education (DESIRE) project funded by the European Union. Dempsey and others (1997) examined 22 metadata formats from the point of view of metadata for information resources. The formats were distributed amongst three typological bands based on formats, standards, and other characteristics. Extensive commentary is provided for each metadata format. The object of the study was to provide background information on each of the formats so that the implications of their use could be assessed. The authors argued that the Dublin Core should remain optimized for its target use as a simple resource description format, linked with the Warwick Framework that is used to aggregate metadata objects and facilitate their interchange.
Based on the information in this report, Heery (1996b) included the Dublin Core as one of the four metadata formats recommended for additional investigation by the DESIRE project.

A second set of documents deals with the development of the metadata container architecture referred to as the Warwick Framework. This set of documents provides one of the few illustrations of the progression from theory to implementation in this body of literature. Kahn and Wilensky (1995) provided the theoretical structure for the development of the Warwick Framework. The concept of the distributed digital object is defined, and a method for naming, identifying, and invoking the digital object is described. The presentation is deliberately abstract, and the content-based aspects of the infrastructure are purposefully not addressed. Lagoze, Lynch, and Daniel (1996), building on Kahn and Wilensky's theoretical work, describe the container architecture of the Warwick Framework that is to be used to aggregate logically distinct packages of metadata. The Warwick Framework has two distinct components: the container that aggregates the metadata sets, and the metadata sets themselves, i.e., the packages. This modular architecture allows the aggregation of containers within other containers, where they are treated as packages. Among the issues to be resolved are: the semantic overlap between packages; the need for a metadata type registry; the requirement for some form of interactive container syntax; the efficiency of distributed architecture; and repository access protocols. Building on this description, Knight and Hamilton (1997a) describe implementations of the Warwick Framework using the Multipurpose Internet Mail Extensions (MIME) [Internet RFC-1522]. They conclude that MIME is suitable for the encapsulation and transport of metadata as well as data, and meets all the requirements of the Warwick Framework. MIME already has in place a large body of code and implementation experience and a central type registry, and is being updated to allow the use of Unicode [ISO 10646] character sets.

A third set of documents addresses the Platform for Internet Content Selection (PICS) efforts to demonstrate how the Dublin Core and Warwick Framework can be integrated into this proposed Internet standard. The growing interest in the Dublin Core and Warwick Framework as a mechanism for transporting resource descriptive information is illustrated by the efforts from both the Dublin Core research group and the PICS research group to include PICS content descriptive values in the Dublin Core Metadata set. Iannella (1997) showed how the Dublin Core could be extended to accommodate the PICS extension mechanism by defining the rat-inherit and sub-label extensions. Braun, König, and Wichmann (1997), building on their work with the PICS-SE standard, propose a slightly different approach that does not require introducing changes in the PICS syntax. This is accomplished by defining a set of PICS-SE classes for the Dublin Core, making use of the Knight and Hamilton (1997b) Dublin Core qualifiers.

Key Researchers

Most of the research and developmental work in this area is associated with a small number of researchers whose names always seem to appear whenever the Dublin Core or Warwick Framework are mentioned. The most prominent name that surfaces whenever the Dublin Core is mentioned is that of Stuart L. Weibel, a Senior Research Scientist with the Office of Research and Special Projects at OCLC.
A name that often appears when the Warwick Framework or container architecture is mentioned is that of Carl Lagoze, head of the Cornell Digital Library Research Group at Cornell University. A third name that also appears quite often in the literature is that of Renato Iannella, a Senior Research Scientist at the Distributed Systems Technology Centre in Brisbane, Australia.

Future Considerations

Reflection on the current state of the literature and the general trends developing in the related research areas indicates that a crucial period is approaching. Most of the literature up to this point has been of a descriptive nature. Now that efforts are turning towards implementation of the Dublin Core and Warwick Framework, the emphasis needs to shift to a more empirical form of research. The proposed research falls into three general topics: behavioral, technical, and sociological.

On the behavioral side, areas where research needs to be pursued include expanded user studies on how effective the Dublin Core actually is, in comparison with other metadata concepts, in satisfying searcher needs. A second question that user studies should address is how effective or efficient surrogate descriptions are in improving precision ratios in the retrieval activity for searchers in a very large networked environment.

On the technical side, research should be done into what effect surrogate descriptions like the Dublin Core have on improving cache performance in the search process and on reducing the bandwidth problems associated with the indexing of the Internet. A second technical question that needs study is whether or not surrogate descriptions like the Dublin Core favor centralized indexing search engines like AltaVista over non-centralized indexing engines like Harvest.

On the broader sociological issues, one long range question that should be considered is whether or not this form of creator-based surrogate descriptive indexing will split the resources available on the Internet into two distinct groups: one group of resources associated with the traditional academic and research paradigms, which will employ the surrogate descriptions, and a second group generated by individuals who are not part of the academic and research paradigm, for which surrogate descriptions will not be employed. Related to this is the question of whether or not surrogate descriptions will act as authenticating or validating mechanisms for Internet resources.

In conclusion, the research in this area seems poised to switch its orientation from a predominantly descriptive approach towards a more empirical one. The implementation of the Dublin Core on a wider scale provides opportunities for new questions to be asked and additional research methodologies to be employed.

Bibliography

Braun, Ingo, Andreas König, and Thorsten Wichmann. (1997). Using PICS-SE with Dublin Core Metadata. [Online]. 21 April 1997. Available: http://kulturbox.cs.tu-berlin.de/aid/pics-se/dc.html. [Accessed: 8 September 1997].

Bray, Tim. (1996). Measuring the Web. Computer Networks and ISDN Systems. 28 (7-11). May 1996. [ISSN: 0169-7552]: 993-1005.

Caplan, Priscilla, and Rebecca Guenther. (1996). Metadata for Internet Resources: The Dublin Core Metadata Elements Set and Its Mapping to USMARC. Cataloging & Classification Quarterly. 22 (3/4). (1996). [ISSN: 0163-9374]: 43-58.

Catledge, Lara D. and James E. Pitkow. (1995). Characterizing Browsing Strategies in the World-Wide Web.
In The Third International World-Wide Web Conference: Technology, Tools and Applications. April 10-14, 1995. Darmstadt, Germany. [Online]. 36824 bytes. Available: http://www.igd.fhg.de/www/www95/proceedings/papers/80/userpatterns/UserPatterns.Paper4.formatted.html. [Accessed: 12 September 1997].

Daviel, A. (1997). HTTP and HTML Metadata Linking Mechanism. [Online]. May 1997. Available: ftp://ietf.org/internet-drafts/draft-daviel-metadata-link-00.txt. [Accessed: 12 September 1997].

Dempsey, Lorcan, Rachel Heery, Martin Hamilton, Debra Hiom, Jon Knight, Traugott Koch, Marianne Peereboom, and Andy Powell. (1997). A Review of Metadata: A Survey of Current Resource Description Formats. Work Package 3 of Telematics for Research project DESIRE (RE 1004). Version 1.0. [Online]. 19 March 1997. Available: http://www.ukoln.ac.uk/metadata/DESIRE/overview/. [Accessed: 8 September 1997].

__________, and Stuart L. Weibel. (1996). The Warwick Metadata Workshop: A Framework for the Deployment of Resource Description. D-Lib Magazine. [Online]. July/August 1996. [ISSN: 1082-9873]. 51227 bytes. Available: http://www.oclc.org/oclc/research/publications/review96/warwick.htm. [Updated: October 1996]. [Accessed: 3 September 1997].

Desai, Bipin C. (1997). Supporting Discovery in Virtual Libraries. Journal of the American Society for Information Science. 48 (3). (March 1997). [ISSN: 0002-8231]: 190-204.

Godby, C. Jean and Eric J. Miller. (1997). A Metalanguage for Describing Internet Resources. The Annual Review of OCLC Research 1996. [Online]. (September 1997). [ISSN: 0894-198X]. 26424 bytes. Available: http://www.oclc.org/oclc/research/publications/review96/godby.htm. [Accessed: 12 September 1997].

Hakala, Juha. (1997). The 5th Dublin Core Metadata Workshop. Helsinki, Finland, October 6-8, 1997. Project Presentations. [Online]. Available: http://linnea.helsinki.fi/meta/projects.html. [Updated: 24 October 1997]. [Accessed: 16 December 1997].

Hamilton, Martin, Renato Iannella, and Jon Knight. (1997). Representing the Dublin Core within X.500, LDAP and CLDAP. [Online]. July 1997. Available: ftp://ietf.org/internet-drafts/draft-hamilton-dcxl-02.txt. [Accessed: 12 September 1997].

Heery, Rachel. (1996a). Review of Metadata Formats. Program: Automated Library and Information Systems. 30 (4). (October 1996). [ISSN: 0033-0337]: 345-373.

__________. (1996b). Resource Description: Initial Recommendations for Metadata Formats. Work Package 3 of Telematics for Research project DESIRE (no. 1004 (RE)). [Online]. July 1996. Available: http://www.ukoln.ac.uk/metadata/DESIRE/recommendations/. [Accessed: 17 September 1996].

Iannella, Renato. (1997). Using PICS as an Internet Metadata Repository. In Australian World Wide Web Technical Conference - Advanced Web Technologies & Industrial Applications. 1-9 May 1997, Brisbane, Queensland, Australia. [Online]. 16 April 1997. Available: http://www.dstc.edu.au/aw3tc/papers/iannella.html. [Accessed: 11 September 1997].

Kahn, Robert and Robert Wilensky. (1995). A Framework for Distributed Digital Object Services. [Online]. 13 May 1995. Available: http://WWW.CNRI.Reston.VA.US/home/cstr/arch/k-w.html. [Accessed: 19 September 1997].

Karttunen, Mikko, Kenneth Holmlund, and Günther Nowotny. (1996). The Internet Pilot to Physics: An Open Information System for Physics Research and Education. International Journal of Modern Physics C. [Online]. Available: http://www.tp.umu.se/TIPTOP/Articles/ijmpc96.html. [Updated: 12 December 1996]. [Accessed: 4 September 1997].

Kille, Steve. (1996). X.500 and LDAP.
Messaging Magazine. 2 (5). [Online]. September/October 1996. [ISSN: 1072-1959]. Available: http://www.ema.org/html/pubs/mmv2n5/x500ldap.htm. [Accessed: 22 December 1997].

Knight, Jon and Martin Hamilton. (1997a). MIME Implementation for the Warwick Framework. [Online]. 3 July 1997. 23713 bytes. Available: http://www.roads.lut.ac.uk/MIME-WF.html. [Accessed: 12 September 1997].

__________. (1997b). Dublin Core Qualifiers. [Online]. 21 February 1997. Available: http://www.roads.lut.ac.uk/Metadata/DC-SubElements.html. [Accessed: 12 September 1997].

Lagoze, Carl, Clifford A. Lynch, and Ron Daniel Jr. (1996). The Warwick Framework: A Container Architecture for Aggregating Sets of Metadata. Cornell Computer Science Technical Report TR96-1593. [Online]. 12 July 1996. 96924 bytes. Available: http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/ncstrl.cornell%2fTR96-1593. [Accessed: 3 September 1997].

Library of Congress. Network Development and MARC Standards Office. (1997a). Dublin Core/MARC/GILS Crosswalk. [Online]. Available: http://www.loc.gov/marc/dccross.html. [Updated: 7 April 1997]. [Accessed: 12 September 1997].

__________. (1997b). Discussion Paper No: 99. Metadata, Dublin Core and USMARC: A Review of Current Efforts. [Online]. Available: gopher://marvel.loc.gov/00/.listarch/usmarc/dp99.doc. [Accessed: 2 September 1997].

__________. (1995). Discussion Paper No: 88. Defining a Generic Author Field in USMARC. [Online]. Available: gopher://marvel.loc.gov:70/00/.listarch/usmarc/dp88.doc. [Accessed: 17 September 1997].

Miller, Paul. (1996). An Application of Dublin Core from the Archaeology Data Service - Draft. Version 1.05. [Online]. 10 July 1996. Available: http://www.ncl.ac.uk/~napml/ads/metadata.html. [Accessed: 3 September 1997].

MITRE Corporation. (1996). Task 3 - Identification of the Minimal Metadata Element Set for Search.

Musella, Davide. (1997). The META Tag of HTML. [Online]. 24 March 1997. Available: ftp://ietf.org/internet-drafts/draft-musella-html-metatag-03.txt. [Accessed: 12 September 1997].

Poulter, Alan. (1997). The Design of World Wide Web Search Engines: A Critical Review. Program: Electronic Library and Information Systems. 31 (2). April 1997. [ISSN: 0033-0337]: 131-145.

Shuh, Barbara. (1997). Directories and X.500: An Introduction. Network Notes #45. [Online]. March 14, 1997. [ISSN: 1201-4338]. Available: http://www.nlc-bnc.ca/pubs/netnotes/notes45.htm. [Accessed: 22 December 1997].

Weibel, Stuart [L.], Jean Godby, Eric [J.] Miller, and Ron Daniel. (1995). OCLC/NCSA Metadata Workshop Report. [Online]. Available: http://www.oclc.org:5046/conferences/metadata/dublin_core_report.html. [Accessed: 4 September 1997].

__________, Renato Iannella and Warwick Cathro. (1997). The 4th Dublin Core Metadata Workshop Report. D-Lib Magazine. [Online]. June 1997. [ISSN: 1082-9873]. 40327 bytes. Available: http://www.dlib.org/dlib/june97/metadata/06weibel.html. [Updated: June 1997]. [Accessed: 3 September 1997].

__________, John A. Kunze, and Carl Lagoze. (1997). Dublin Core Metadata for Simple Resource Discovery. [Online]. 27 August 1997. Available: ftp://ietf.org/internet-drafts/draft-kunze-dc-01.txt. [Accessed: 12 September 1997].

__________ and Eric J. Miller. (1996). Image Description on the Internet: Summary of CNI/OCLC Image Metadata Workshop. Annual Review of OCLC Research 1996. [Online]. [ISSN: 0894-198X]. 34499 bytes. Available: http://www.oclc.org/oclc/research/publications/review96/image.htm. [Updated: September 1997]. [Accessed: 12 September 1997].

Woodruff, Allison, Paul M. Aoki, Eric Brewer, Paul Gauthier, and Lawrence A. Rowe. (1996). An Investigation of Documents from the World Wide Web. In Fifth International World Wide Web Conference, May 6-10, 1996, Paris, France. [Online]. Available: http://www5conf.inria.fr/fich_html/papers/P7/Overview.html. [Updated: June 1996]. [Accessed: 12 September 1997].

Copyright © 1998 Harold Thiele
hdl:cnri.dlib/january98-thiele

Communication, Collaboration and Enhancing the Learning Experience: Developing a Collaborative Virtual Enquiry Service in University Libraries in the North of England

Liz Jolly, Director, Library and Information Services, Teesside University, UK
Sue White, Director of Computing and Library Services, University of Huddersfield, UK

Introduction

Using the case study of developing a collaborative out-of-hours virtual enquiry service (VES), this paper explores the importance of communication and collaboration in enhancing student learning. Set against the context of a rapidly changing UK higher education sector, the paper considers both the benefits and challenges of collaboration, alongside the real and potential benefits for the student experience and the role of the library in enhancing learning. The paper is structured as follows:

• The National Higher Education Context
• Academic Libraries and Learning
• A Review of Previous Activity in Shared and Collaborative Enquiry Services
• Enhancing the Learning Experience: Developing a Collaborative Virtual Enquiry Service
• Project Outcomes
• Communication and Collaboration in Service Development: Benefits and Challenges
• Lessons Learned and Next Steps

The National Higher Education Context

The UK higher education sector is currently in a state of flux. The introduction of student fees in the 1990s as recommended by Dearing (1997), and the ensuing further reforms after the Higher Education Act 2004 (see, for example: Department for Business, Innovation and Skills 2009, 2011; Browne 2010), with ever higher fee limits, together with the introduction of new types of higher education providers, have changed the higher education landscape. There is a perceived increase in marketization of the sector and commodification of the undergraduate student experience, linked to an increasingly competitive culture between institutions. The recent Green Paper (Department for Business, Innovation and Skills 2015) focuses on, amongst other issues, measuring teaching excellence, which will link to tuition fees, leading to further differentiation within the sector. However, within a culture of financial retrenchment, the idea of shared services in the higher education sector has also gained currency as a way of reducing expenditure and improving service delivery to the end user (see, for example, Universities UK 2011, 2015). A JISC study noted that "there is little overt enthusiasm for the introduction of shared services…administrative services are too important to institutions to take significant risk: no manager is going to gamble the institution in shared services" (Duke and Jordan, 2008, p.23). Rothwell and Herbert (2015) note that the changing financial climate may be responsible for the increased uptake in shared services since then. They summarise three broad types of shared services in HE, based on the work of Clark, Ferrell and Hopkins (2011): top down or bottom up; closeness (geographical, philosophical (mission groups) or technological); and 'I do it, we do it, you do it'.
How the Northern Collaboration has exploited geographical closeness combined with a technological solution to develop a shared service is explored later in this article.

Technology has obviously had a critical impact on higher education, and in the UK the Committee of Enquiry into the Changing Learner Experience (2009) was convened to assess its impact on future policy development. Information and Communications Technology, along with procurement and human resources services, are cited by JISC (2009) as the most usual shared services. However, this is very much expressed in terms of shared 'back office' functions rather than an exploration of how technology could be used to enhance the student learning experience in a digital world.

Enhancing the student experience has been a key focus of funding councils, the Quality Assurance Agency and the Higher Education Academy in the UK. The Ramsden Report (2008) highlighted the importance of students as partners in developing their own learning experience, which is a 'joint responsibility' between them and their institution, and in many universities students are now involved in formal and informal decision making and planning. However, the meaning of the student experience has changed under the current tuition fee regime, as Temple and Callendar (2015) point out, with students appearing to have "become customers rather than partners in the academic enterprise". In this context the National Student Survey "gathers students' opinions on the quality of their courses" (HEFCE, n.d.) and is used as a benchmarking shorthand for the quality of the overall student experience, and the current Green Paper aims to create an Office for Students as a 'new sector regulator and student champion' (Department for Business, Innovation and Skills, 2015). With this changing student perception of their role, universities will need to be clear about their offer as they try to attract prospective 'customers' and retain 'satisfaction' in an increasingly differentiated marketplace.

A holistic approach to learning and the student experience is now commonplace in UK institutions, with changes both in organisational structure, such as super-converged services (Melling and Weaver, 2013), and in service delivery, such as the one stop shop approach. Similar debates have occurred in the US and elsewhere, with the Learning Reconsidered report (Keeling, 2004) articulating that effective student learning involves a holistic approach with collaboration from across the institution.

Academic Libraries and Learning

In the UK, current prevailing pedagogical practice is predominantly constructivist, with learners constructing knowledge based upon their current or past knowledge and experience (Light and Cox, 2001). The (UK) Higher Education Academy has noted "The need to develop new ways of learning has become a live issue in HE, largely linked with the demand for increased flexibility of pace, place and mode of delivery" (HEA, 2015), and its Flexible Pedagogies project aims to address these issues and provide examples of effective pedagogies that will empower learners. In this context academic libraries are central to the learning, teaching and research enterprise of their institutions.
Brophy (2005) emphasised the key role: "Academic libraries are here to enable and enhance learning in all its forms - whether it be the learning of a first year undergraduate coming to terms with what is meant by higher education or the learning of a Nobel Prize winning scientist seeking to push forwards the frontiers of her discipline". In the US, Lankes (2011) has stated that "the mission of librarians is to improve society through facilitating knowledge creation in their communities". Too often in the past, library services and facilities have been designed to optimise delivery of library operations rather than with the learner at the centre (Bennett, 2015). Much has been written on library buildings as ideal places for John Seely Brown's learning conversations (Brown and Duguid, 2000), and this can be applied to library services as a whole. Laurillard (2001) developed the Conversational Framework as an approach to learning and teaching that is "an iterative dialogue between teacher and students that operates on two levels: the discursive, theoretical, conceptual level and the active, practical, experiential level". We would argue that academic librarians have a key role to play in the Framework as they become more embedded in learning and teaching delivery. Pan et al (2004), inspired by the Boyer report in the United States (Boyer, 1998), write of a Learning Ecosystem "cultivated between student and instructor; student and librarian; and instructor and librarian." In this context, library help services, whether face-to-face or virtual, are key elements of an ecosystem of support for learners rather than purely a library enquiry service.

In the UK and elsewhere, students are viewed as key partners in the development of their learning experience, whether as customer / consumer (see, for example, Department for Business, Innovation and Skills, 2015) or as co-producer (Neary and Winn, 2009). Collaboration for enabling and supporting learning needs to build upon institutional experience of this 'students as partners' approach.

Shared services to directly support student learning across institutions in the UK are less well developed. One example within the higher education sector is Falmouth Exeter Plus, the "service delivery partner" of Falmouth University (Falmouth) and the University of Exeter (UoE). It aims to "deliver shared services and facilities for UoE and Falmouth in Cornwall underpinned by close collaboration with FXU, the combined students' union for Falmouth and UoE" (Falmouth Exeter Plus, n.d.). Its current portfolio of services includes the Library, Student Services, IT services and Academic Skills. A cross-sectoral example of shared services to enable and support learning is The Hive, a combined university and public library and archive service developed in partnership between Worcestershire County Council and the University of Worcester. Both these examples involve close working relationships between two organisations. National collaboration between higher education libraries has so far been focussed on the SCONUL Access reciprocal borrowing scheme.

Previous Developments in Shared and Collaborative Enquiry Services

This paper offers as a case study the development of a shared enquiry service in the Northern Collaboration, a group of university libraries in the North of England, UK (The Northern Collaboration, n.d.-a).
Before commencing the project, a literature review was undertaken to establish the extent of previous activity in this space and whether there were lessons to be learned of value to the Northern Collaboration. The literature revealed considerable activity in the use of chat and instant messaging by individual libraries, particularly in the USA (see, for example, Bicknell-Holmes, 2008). In the UK, the Open University was one of the leaders in online digital reference (Payne & Bradbury, 2002). A virtual enquiry project at Edinburgh Napier University (Barry, Bedoya, Groom & Patterson, 2009) provided a useful overview of the use of virtual reference services (defined as the use of instant messaging or webchat for enquiries, which allow users to interact with library staff in real time) in academic libraries.

In terms of collaborative reference services, a 24/7 reference tool was developed by Coffman and McGlamery (2002), which later became the OCLC 24/7 co-operative reference service. Recent case studies of collaborative virtual reference in academic libraries are fairly infrequent (Johnson (2013) mentions the discontinuation of several institutional and collaborative virtual reference services in the US in the past ten years), but include those of New Zealand, where a consortium of four university libraries developed "a toolkit for providing virtual reference through instant messaging" (Clements, 2008), and the AskColorado/AskAcademic Virtual Reference Cooperative in the US: "one of only a dozen or so states to ever offer statewide online reference service to patrons via 'cooperative reference service'." (Johnson, 2013). In the UK, as mentioned elsewhere in this article, collaborative reference has been developed by the public library sector (Berube, 2003) but has not been attempted before by academic libraries.

Enhancing the Learning Experience: Developing a Collaborative Virtual Enquiry Service

Background to the Project

The project began life as one of the strands of activity emanating from a UK Higher Education Academy Change Academy programme called COLLABORATE in 2011. The purpose of COLLABORATE was to explore the potential for university library services in the North of England to work together on developing new services. The outcome was the Northern Collaboration, an organisation comprising 25 university libraries in Northern England, a region of the UK spanning from the Scottish border in the North to Merseyside in the West and Humberside in the East. One of the first projects which library directors approved for progression was the shared Virtual Enquiry Service (VES). A project group of ten institutions undertook the next steps, which comprised a literature review, project scoping, agreement on definitions of enquiries, data collection and analysis, and consideration of business models. The literature review (see above) confirmed that there was no collaborative enquiry service for academic libraries in the UK, and that there was merit in further exploration of the concept. The scoping exercise took place over several months, and was informed by two periods of data collection. The data captured the enquiry services provided in each library, including the format (face-to-face, phone, email, chat), hours of delivery, level of staff providing the service (professionally qualified or assistant), the types of enquiries (e.g. reference enquiries, IT enquiries, directional) and costs of service provision.
After analysis it became clear that the range and costs of services varied significantly between institutions. This was unsurprising, given the variety of institutions represented in the project group, which ranged from large research-intensive universities to small, teaching-led institutions. The average annual cost of enquiry services per library was around £70,000, representing a sizeable proportion of the library budget. Through an iterative process, the project scope was refined to an out-of-hours library enquiries service. 'Out-of-hours' was defined as the periods outside the normal working day when staff were not available to answer enquiries, namely evenings, overnight, weekends and bank holidays. One of the potential business models was to establish our own internal shared service, but given that external organisations were already providing similar services, it was agreed to investigate these first. Subsequently it was agreed to progress a partnership with OCLC, the American-based co-operative, well known for its work on bibliographic data and also a provider of a collaborative enquiry service through its QuestionPoint software. Examples of deployment of this 24/7 Reference Co-operative may be found in many academic libraries in the USA and globally, and also in the UK public library services, where it is branded 'Enquire' (People's Network, 2009). The primary medium for both services is web chat, though enquiries via email are also offered. Web chat represented a new enquiry medium for many of the libraries in the Northern Collaboration project, and one which informal research suggested would be popular with students. After endorsement by the library directors, a 15 month pilot with OCLC was implemented, commencing in May 2013.

Aims and Objectives

To recap, what emerged from the diversity of institutions among the Northern Collaboration membership was a consensus around the need for an effective 'out-of-hours' enquiry service, primarily to cover the periods when local staff were not able to answer enquiries: evenings, overnight, weekends and bank holidays. There was no appetite for replacing the services provided during the normal working week. Some routine, procedural library enquiries could already be accommodated by NorMAN, an out-of-hours IT enquiry service available to further and higher education institutions (NorMAN, 2014). The priority for the VES project was therefore to satisfy the 'reference' enquiries, incorporating information resources, subject and referencing enquiries. The overall aim of the project was to enhance student learning and the student experience, with specific objectives to:

• Pilot and evaluate a cost-effective, real-time out-of-hours enquiry service, which was sufficiently flexible to support diverse opening hours and organisational models.
• Explore the benefits and challenges of working collaboratively, both within the Northern Collaboration and with an external partner.

It took over a year to achieve this level of clarification about the project, as it was important to attain consensus amongst Northern Collaboration directors, who were effectively the project sponsors.

The Pilot

As noted above, the OCLC 24/7 Reference Cooperative was well established in the USA. The principle on which it operates is that enquiries may be handled by a librarian from any member of the co-operative.
No specific training is required of these librarians, as they all have access to 'policy pages' (information supplied by participating libraries about their policies, procedures and information resources). Using a combination of the policy pages and reference interview skills, the librarians are able to answer the majority of enquiries. Because of the time difference between the UK and the USA, the majority of out-of-hours UK enquiries are picked up by colleagues in the western states of the USA. Within the UK, two universities subscribed as individual members to the global co-operative, but prior to the VES pilot there was no consortial academic library membership in the UK. For the pilot we effectively created a new business model in which each institution paid a subscription to purchase an out-of-hours enquiry service, with no requirement to supply staff from their own institution to answer enquiries from other member libraries. Subscriptions were differentiated according to JISC bands, and ranged from approximately £1500 to £3000 per year. Seven institutions took part in the pilot, representing diverse mission groups, sizes and organizational structures: some libraries operated as stand-alone directorates whereas others were part of converged services with Information Technology (IT) or Student Services. Start-up involved creating the 'policy page' (see above) and varying degrees of liaison with relevant departments, including IT and Marketing, to enable the QuestionPoint 'chat' widget on each institution's web pages. Support for the start-up was provided by the QuestionPoint Product Manager, but increasingly as the pilot progressed, the operational leads within each institution created a community of practice (Wenger, 1998) in which they learned from each other. Each institution was able to 'switch on' the service at different times in the evening to meet its own service delivery requirements.

Evaluation

The pilot was rigorously evaluated. Usage statistics were analysed on an on-going basis throughout the pilot; user satisfaction with the service was recorded; the quality of responses to enquiries was evaluated by librarians; and each of the pilot institutions produced a case study, outlining the practical experience of delivering the service, the challenges, enablers and impact on the student experience. It is beyond the scope of this article to provide detailed analysis of the data; however, readers may find the following overview useful. The first significant usage of the service started in September 2013, once all pilot libraries were up and running. During the period September 2013 to May 2014, approximately 3000 enquiries were handled in total across all institutions. Table 1 and Figure 1 below show the variance between institutions, with the monthly average ranging from 102 enquiries to 13. The criteria for success appeared to include: prior experience of student use of web chat; an effective promotional campaign to raise awareness; and high visibility of the chat widget on web pages.
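As an aside on what 'enabling the chat widget' involved in practice: embedding a third-party chat service of this kind typically amounts to adding a small fragment of vendor-supplied markup to the library's web pages. The fragment below is a generic, hypothetical sketch for illustration only; the actual QuestionPoint embed code, URLs and parameters supplied by OCLC differ.

    <!-- Hypothetical library chat embed; real vendor markup,
         URLs and parameters will differ from this sketch -->
    <div id="library-chat">
      <iframe src="https://chat.vendor.example/widget?institution=YOUR-ID"
              title="Ask a librarian (out-of-hours chat)"
              width="300" height="420"></iframe>
    </div>

Even a fragment this small helps explain two observations reported in this article: start-up required liaison with institutional IT and web teams to deploy the markup, and uptake depended on how prominently the widget was placed on institutional pages.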
Table 1: Out-of-hours enquiries by institution, September 2013 – May 2014

Name            Sep  Oct  Nov  Dec  Jan  Feb  Mar  Apr  May  Total  Monthly average
University 1      8   10   24   12   13   12   10   18   10    117        13
University 2     26   26   30   29   19   27   30   33   45    914       102
University 3      6   20   16   12   14   11    6   22   23    130        14
University 4     19   36   27   29   29   20   11   30   14    215        24
University 5     11   19   13   14   17    9    4   20   13    249        28
University 6     24   24   38   21   24   19  175  105  145    575        64
University 7     44   77   83   49   56   81  108  131  122    751        83
Total Enquiries 138  212  231  166  172  179  344  359  372   2951        47

Figure 1: Graph showing out-of-hours enquiries by institution, September 2013 – May 2014. [Graph omitted: monthly enquiry counts, one line per institution, September to May; y-axis: number of enquiries.]

The majority of Monday to Friday enquiries were received between 17:00 and 23:59 and between 07:00 and 08:59 (see Figure 2 below). Over the weekends, enquiries were distributed more evenly across the day and evenings.

Figure 2: Out-of-hours enquiries Monday to Friday by time of day, September 2013 – May 2014. [Chart omitted: hourly bands from 17:00-17:59 through 08:00-08:59; y-axis: number of enquiries.]

The types of enquiries were categorised into six areas in order to give sufficient granularity for data analysis. As noted above, the pilot was particularly interested in the reference enquiries, namely those relating to information resources, referencing and subject enquiries. Analysis showed, not unexpectedly, that a high proportion of enquiries were procedural/directional or related to IT, but it was pleasing to note that nearly 40% of all enquiries were classified as reference. Enquiries were also analysed using the categories required for the annual SCONUL statistical return (SCONUL, 2015). Both sets of data are summarised in Figures 3 and 4 below.

Figure 3: Out-of-hours enquiries analysed by type of enquiry (using VES categorisation of 6 enquiry types). [Chart omitted; values: VES-IT 17%; VES-Non Library 4%; VES-Procedural/directional 40%; VES-Referencing 7%; VES-Resources 25%; VES-Subject 7%.]

Figure 4: Out-of-hours enquiries analysed by type of enquiry (using SCONUL categorisation of 4 enquiry types). [Chart omitted; values: information resource related enquiries 38%; procedural/directional enquiries 40%; enquiries about IT-related matters 18%; enquiries about other university matters 4%.]

The cost per enquiry was calculated by each pilot member and compared with the hypothetical costs of providing a service in-house, based on the staffing grades they would expect to deploy in their library service to answer the same volume and types of enquiries. Actual costs varied from approximately £3 to £20 per enquiry, which compared to hypothetical costs of up to several hundred pounds per enquiry.

Project Outcomes

Clearly, the chief beneficiaries of an initiative like this were the service users. Although take-up for the service was relatively low, the experience of service users was positive. Student feedback, obtained through brief surveys, demonstrated that 75% of respondents were satisfied with the answer to their enquiry and 81% would use the service again.
The following comments illustrate the value that students attached to the new service: "Excellent help and would definitely use again. Thank you."; "Really, really helpful. I wish I'd found this facility 6 hours ago!!" Feedback suggested the service was particularly valued by part-time students and distance learners who had limited opportunities to visit the physical campus. The consensus amongst the pilot group was that the new out-of-hours enquiry service complemented other 24/7 services offered, namely 24/7 physical access to the library and 24/7 virtual access to online information resources. One university summarised the impact as follows: "The VES provides a real enhancement to our students' experience, and a service which is available at the time the students need it."

From a financial perspective there was clear evidence of value for money, enabling the provision of a 24/7 enquiry service at the relatively modest extra cost of a few thousand pounds per year. To provide the equivalent service in-house would have been prohibitively expensive. Feedback from senior institutional managers suggested that in addition to enhancing the student experience, the new service was perceived as offering a tangible and cost effective benefit of membership of the Northern Collaboration, and constructive engagement with the national shared services agenda. The VES also enabled a strong message that the institution provided a 24/7 professional library enquiry service.

For some institutions the introduction of a chat system involved a major cultural change in terms of student expectations and the nature of student support. Where there was a longstanding culture of using such services, take-up was much higher. Most institutions had a 'soft launch' of the new service, and in retrospect this resulted in low visibility of the service. Although the cost of the service was relatively modest, it was recognised that effective publicity was essential in order to optimise investment.

An evaluation report was presented to the Northern Collaboration directors in July 2014. This incorporated a proposed business model and subscription levels, negotiated with OCLC, for rolling out the service to any members of the Northern Collaboration who wished to participate. Over the following year, the number of subscribing institutions increased to sixteen.

Communication and Collaboration: Benefits and Challenges for Service Development

This section considers the role of communication and collaboration in the development of the new out-of-hours enquiry service, and highlights both the challenges and the significant benefits which ensued. Communication and collaboration are inextricably linked, and both were key to the success of the VES. Communication may be defined as "the activity or process of expressing ideas and feelings or of giving people information" (Oxford Advanced Learners Dictionary, 2015a), whilst collaboration is "the act of working with another person or group of people to create or produce something" (Oxford Advanced Learners Dictionary, 2015b). To work effectively with other people or groups, there has to be exchange of information between all parties, an ability to articulate ideas, and a willingness to communicate regularly and openly.
Librarians tend to be good at this. Indeed, libraries across the world have a long tradition of collaboration. In the academic sector this may occur within the sector (Fraser, Shaw and Ruston, 2013; Harrasi and Jabur, 2014; Melling and Weaver, 2013), across sectors (Lawton and Lawton, 2009; Lucas, 2013; Ullah, 2015), or with vendors and suppliers (Marks, 2005). Communication on the VES project occurred at many levels and for different purposes, as summarised below.

Table 2: Communication and collaboration activities apparent during service development

Level: Macro – outside the Northern Collaboration
Participants: Library Directors; senior OCLC personnel
Communication / collaboration activities: Relationship development; negotiation; discussion; decision making; presentation

Level: Regional – within the Northern Collaboration (all members)
Participants: Library Directors and Heads of Service
Communication / collaboration activities: Discussion; report writing; evaluation; decision making

Level: Regional pilot – between the sub-set of institutions that developed the service
Participants: Library operational leads; OCLC product manager; colleagues in university departments (IT, marketing)
Communication / collaboration activities: Service implementation; development of good practice; shared evaluation; benchmarking quality of enquiry responses; mystery shopping

Level: Local – within each institution that adopted the service
Participants: Reference service providers; service users (students, academic staff)
Communication / collaboration activities: Service implementation; user feedback; continuous improvement

At the macro level, the Northern Collaboration developed an effective working relationship with OCLC. The overlap in the common purpose of the two organisations undoubtedly helped. Amongst the stated aims of the Northern Collaboration are the provision 'of a framework within which libraries can work together to improve the quality of services, to be more efficient, and to explore new models' (The Northern Collaboration, n.d.-b); whilst the mission of OCLC as 'a global library cooperative is to provide shared technology services, original research and community programs for its membership and the library community at large' (OCLC, 2015). Through regular communication and open discussion, the library directors and senior UK-based OCLC personnel developed a shared understanding of what the Northern Collaboration wished to achieve.

Engagement of the Northern Collaboration Directors Group was achieved through regular progress reports by the project leads, culminating in a comprehensive evaluation of the pilot. Whilst it was always understood that taking part in the VES was optional, it was nevertheless extremely important to ensure that all Northern Collaboration directors were fully informed so that they were able to make appropriate decisions for their libraries. This level of engagement also gave the project substantial potential leverage, for example in making the case to OCLC for technical improvements to the product.

Significant benefits of collaboration were achieved at an operational level, where a strong community of practice developed. Experiences were shared willingly, leading to the development of good practice in start-up, implementation, service promotion, training, evaluation, benchmarking and quality control. OCLC provided effective basic training and technical assistance with start-up, but the ways in which the project group worked together brought added value. One institution, for example, volunteered to undertake mystery shopping as a means of measuring the quality of responses.
Another shared a particularly successful promotional campaign, which had resulted in a five-fold increase in service usage.

Collaboration with colleagues in other university departments was not always so effective. Enlisting the support of IT departments to prioritise the installation of the chat widget was sometimes problematic, due to competing priorities. These challenges were fortunately all resolved, but were a reminder of the need to engage all stakeholders in collaborative projects early in the process, and to explain clearly the project rationale. Engagement with students took place primarily after the launch of the pilot service, and has continued on an ongoing basis through the online feedback forms which follow a web chat enquiry. There is potential for greater student involvement in the further development of the scheme.

A further important benefit of collaboration has been the opportunities afforded to library colleagues for professional development, particularly in terms of skills development, project working, and in developing the professional community of practice alluded to above.

Lessons Learned and Next Steps

Rothwell and Herbert (2015) note that 'the UK already has plenty of strengths regarding shared services and collaborative working' and believe 'the future is global, collaborative and shared'. By working collaboratively both with other institutions and with OCLC, the Northern Collaboration has demonstrated the benefits in terms of student and learning experience and value for money. Amongst the key lessons learned were: the importance of setting clear objectives for the project; ensuring the involvement of key stakeholders within our departments, across our institutions and among Northern Collaboration directors; and communicating clearly with both students and stakeholders to ensure the success of the project and its successful operationalisation as a service. With regard to this last point, publicity and promotion were critical to the visibility and uptake of the new service. The effective communication of the two Northern Collaboration operational Project Leads with OCLC on technical and data analysis issues, and with project team members in each institution, was a further critical success factor.

Reflecting on the experience of working together during the project, it is clear that building effective collaborative practices takes time. The pilot group of seven institutions worked exceptionally well together, but inevitably it takes longer to achieve consensus and to make decisions than with a project involving just one institution, and this needs to be factored into the planning process. In many senses the process of staff learning to be collaborative was as important as the outcome of the project. In terms of staff learning and development, the shared VES has the potential to enable the further development of a community of practice which will continue to enhance communication and collaboration in service design and improvement. This relates to Sennett's dialogical model of co-operation, which emphasises mutual exchange as an intrinsic good: the dialogical conversation "prospers through empathy, the sentiment of curiosity about who other people are in themselves" (2013). The Northern Collaboration service now has sixteen members and is likely to extend to a national service co-ordinated by SCONUL, the UK university library directors' group.
At the time of writing, initial positive expressions of interest have been received from over 60% of UK higher education institutions. There is potential to develop a variety of models to suit the needs of institutions and to more actively involve students as partners in this development. David Watson (2015) stated that "if UK higher education is going to prosper in the contemporary world it is going to have to become messier, less precious, more flexible and significantly more co-operative." By offering clear enhancements to the student learning experience, collaborative development opportunities for our staff and financial benefits to our institutions, the Northern Collaboration Shared Virtual Enquiry Service is a small step towards this goal.

Acknowledgements

The authors would like to thank the Northern Collaboration Project Team: Jackie Oliver (Teesside University) and Russ Jones (Leeds Beckett University) (Co-Leads); Jane Robinson (University of Cumbria); Anne Middleton (Newcastle University); Claire Smith (Durham University); Sue Hoskins and Nicola Haworth (University of Salford); Anthony Osborne (University of Huddersfield); and Andrew Hall, Chris Jones and Susan McGlamery of OCLC for their contributions to the success of the project.

References

Barry, E., Bedoya, J. K., & Patterson, L. (2010). Virtual reference in UK academic libraries: the virtual enquiry project 2008-2009. Library Review, 59(1), 40-55.

Bennett, S. (2015). Putting learning into library planning. Portal: Libraries and the Academy, 15(2), 215-232.

Berube, L. (2003). Collaborative digital reference: An ask a librarian (UK) overview. Program: Electronic Library and Information Systems, 38(1), 29-41.

Bicknell-Holmes, T. (2008). Chat & instant messaging for reference services: a selected bibliography. Retrieved from http://digitalcommons.unl.edu/libraryscience/151/

Boyer Commission on Educating Undergraduates in the Research University. (1998). Reinventing undergraduate education: a blueprint for America's research universities. Stony Brook, New York: State University of New York, Stony Brook.

Brophy, P. (2005). The academic library (2nd ed.). London: Facet.

Brown, J. S., & Duguid, P. (2000). The social life of information. Boston, Mass.: Harvard Business Press.

Browne, J. (2010). Securing a sustainable future for higher education: an independent review of higher education funding and student finance. Retrieved from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/422565/bis-10-1208-securing-sustainable-higher-education-browne-report.pdf

Clark, M., Ferrell, G., & Hopkins, P. (2011). Study of early adopters of shared services and cloud computing within further and higher education. Newcastle upon Tyne: HE Associates / JISC.

Clements, C. (2008). Collaborating to implement social software solutions for university libraries. Retrieved from http://www.lianza.org.nz/collaborating-implement-social-software-solutions-university-libraries

Committee of Inquiry into the Changing Learner Experience (CLEX). (2009). Higher education in a web 2.0 world: report of an independent committee of inquiry into the impact on higher education of students' widespread use of web 2.0 technologies. London: JISC.

Department for Business Innovation and Skills. (2009). Higher ambitions: the future of universities in a knowledge economy. London: HMSO.

Department for Business Innovation and Skills. (2011). Higher education: students at the heart of the system (Cm 8122). London: HMSO.
Department for Business Innovation and Skills. (2015). Fulfilling our potential: teaching excellence, social mobility and student choice (Cm 9141). London: HMSO.

Dearing, R. (1997). Higher education in the learning society: report of the National Committee of Enquiry into Higher Education. London: HMSO.

Duke & Jordan Ltd et al. (2008). JISC study of shared services in UK further and higher education: Report 4: Conclusions and proposals. Undertaken on behalf of the JISC by Duke & Jordan Ltd with AlphaPlus Ltd, Mary Auckland, Chris Cartledge, Simon Marsden and Bob Powell. London: JISC.

Falmouth Exeter Plus. (n.d.). Working for Falmouth Exeter Plus. Retrieved from http://www.fxplus.ac.uk/work/working-falmouth-exeter-plus

Fraser, J., Shaw, K., & Ruston, S. (2013). Academic library collaboration in supporting students pre-induction: the head start project. New Review of Academic Librarianship, 19(2), 125-140.

Harrasi, N. A., & Jabour, N. H. (2014). Factors contributing to successful collaboration among Omani academic libraries. Interlending and Document Supply, 42(1), 26-32.

HEFCE: Higher Education Funding Council for England. (n.d.). National student survey. Retrieved from http://www.hefce.ac.uk/lt/nss/

Higher Education Academy. (2015). Flexible pedagogies: preparing for the future. Retrieved from https://www.heacademy.ac.uk/flexible-pedagogies-preparing-future

Johnson, K. (2013). AskColorado: a collaborative virtual reference service. In B. Thomsett-Scott (Ed.), Implementing virtual reference services (pp. 115-135). Chicago: ALA TechSource.

Keeling, R. (2004). Learning reconsidered: a campus-wide focus on the student experience. Washington DC: National Association of Student Personnel Administrators; American College Personnel Association.

Lankes, D. (2011). The atlas of new librarianship. Cambridge, Massachusetts: Massachusetts Institute of Technology / Association of College and Research Libraries.

Laurillard, D. (2001). Rethinking university teaching in the digital age. Retrieved from https://net.educause.edu/ir/library/pdf/ffp0205s.pdf

Lawton, J. R., & Lawton, H. B. (2009). Public-academic library collaboration: a case study of an instructional hour and property history research program for the public. The American Archivist, 72, 496-514.

Light, G., & Cox, R. (2001). Learning and teaching in higher education: the reflective professional. London: Paul Chapman Publishing.

Lucas, F. (2013). Many spokes same hub: Facilitating collaboration among library and early-childhood services to improve outcomes for children. The Australian Library Journal, 62(3), 196-203.

Marks, K. E. (2005). Vendor/library collaboration - an opportunity for sharing. Resource Sharing and Information Networks, 18(1-2), 203-214.

Melling, M., & Weaver, M. (Eds.). (2013). Collaboration in libraries and learning environments. London: Facet.

Neary, M., & Winn, J. (2009). The student as producer: reinventing the student experience in higher education. In L. Bell, H. S. Stevenson & M. Neary (Eds.), The future of higher education: policy, pedagogy and the student experience (pp. 192-210). London: Continuum.
NorMAN. (2014). Complete your services with the out of hours helpline. Retrieved from http://www.outofhourshelp.ac.uk/

The Northern Collaboration. (n.d.-a). About us. Retrieved from http://www.northerncollaboration.org.uk/content/about-us

The Northern Collaboration. (n.d.-b). Aims. Retrieved from http://www.northerncollaboration.org.uk/content/aims

OCLC. (2015). Together we make breakthroughs possible. Retrieved from http://www.oclc.org/about.en.html

Oxford Advanced Learners Dictionary. (2015a). Definition: Communication. Retrieved from http://www.oxforddictionaries.com/definition/communciation

Oxford Advanced Learners Dictionary. (2015b). Definition: Collaboration. Retrieved from http://www.oxforddictionaries.com/definition/collaboration

Pan, D., Ferrer-Vinenet, I., & Bruehl, M. (2014). Library value in the classroom: assessing student learning outcomes from instruction and collections. The Journal of Academic Librarianship, 40(3-4), 332-338.

Payne, G. F., & Bradbury, D. (2002). An automated approach to online digital reference: the Open University Library OPAL project. Program: Electronic Library and Information Systems, 36(1), 5-12.

Putting virtual reference on the map: Susan McGlamery - Metropolitan Cooperative Library System. (2002). Library Journal, 48.

People's Network. (2009). What the Enquire service is. Retrieved from http://www.peoplesnetwork.gov.uk/enquire/about.html

Ramsden, P. (2008). The future of higher education: teaching and the student experience. Retrieved from http://www.improvingthestudentexperience.com/essential-information/undergraduate-literature/general/

Rothwell, A., & Herbert, I. (2015). Collaboration and shared services in UK higher education: potential and possibilities. London: Efficiency Exchange.

SCONUL. (2015). SCONUL statistics. Retrieved from http://www.sconul.ac.uk/tags/sconul-statistics

Sennett, R. (2013). Together: the rituals, pleasures and politics of cooperation. London: Penguin.

Stockham, M., Turtle, E., & Hansen, E. (2002). KANAnswer: a collaborative statewide virtual reference pilot. Reference Librarian, 38(79-80), 257-266.

Temple, P., & Callendar, C. (2015). The changing student experience. Retrieved from http://wonkhe.com/blogs/the-changing-student-experience/

Ullah, A. (2015). Examining collaboration among central library and seminar libraries of leading universities in Pakistan. Library Review, 64(4-5), 321-334.

Universities UK. (2015). Efficiency, effectiveness and value for money. Retrieved from http://www.universitiesuk.ac.uk/highereducation/Pages/EfficiencyEffectivenessValueForMoney.aspx#.VX661PlVhHw

Universities UK Efficiency and Modernisation Task Group. (2011). Efficiency and effectiveness in higher education. London: Universities UK.
Watson, D. (2015). The coming of post-institutional higher education. Oxford Review of Higher Education, 41(5), 549-562.

Wenger, E. (1998). Communities of practice: learning, meaning, and identity. Cambridge: Cambridge University Press.

work_2ijebvabljbfthx55rq74iokyy ----

The DDC and OCLC

Joan S. Mitchell
Diane Vizine-Goetz
Online Computer Library Center, Inc.

Note: This is a pre-print version of a paper forthcoming in the Journal of Library Administration. Please cite the published version; a suggested citation appears below. Correspondence about the article may be sent to the authors at joan_mitchell@oclc.org or vizine@oclc.org.

Abstract

This article highlights key events in the relationship between OCLC Online Computer Library Center, Inc. and the Dewey Decimal Classification (DDC) system. The formal relationship started with OCLC's acquisition of Forest Press and the rights to the DDC from the Lake Placid Education Foundation in 1988, but OCLC's research interests in the DDC predated that acquisition and have remained strong during the relationship. Under OCLC's leadership, the DDC's value proposition has been enhanced by the continuous updating of the system itself, development of interoperable translations, mappings to other schemes, and new forms of representation of the underlying data. The amount of categorized content associated with the system in WorldCat and elsewhere has grown, as has worldwide use of the system. Emerging technologies are creating new opportunities for publishing, linking, and sharing DDC data.

Keywords

DDC; Dewey Decimal Classification; Forest Press; OCLC

© 2009 OCLC Online Computer Library, Inc. 6565 Kilgour Place, Dublin, Ohio 43017-3395 USA http://www.oclc.org/ Reproduction of substantial portions of this publication must contain the OCLC copyright notice.

Suggested citation: Mitchell, Joan S. and Diane Vizine-Goetz. 2010. "The DDC and OCLC." Journal of Library Administration, 49(6): 657-667. Pre-print available online at: http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf
INTRODUCTION

OCLC acquired the Dewey Decimal Classification (DDC) system and Forest Press from the Lake Placid Education Foundation in 1988. The promise of OCLC's direct involvement in the DDC is presaged in the publisher's foreword to DDC 20:

The year 1988 witnessed two events which will have a profound effect on the future of the Dewey Decimal Classification and other general classification systems. Curiously, both events took place on the same day. On July 29, a computer tape containing substantially all the text of DDC 20 was delivered to a firm in Massachusetts to begin production of this edition. … On the same date, Forest Press and the Dewey Decimal Classification became a part of the OCLC Online Computer Center. … Joining the DDC with the talents and resources of OCLC will allow the development of the computer products and services which are needed by DDC users (Paulson, 1989, p. xi).

While the relationship between the DDC and OCLC predated the acquisition in terms of research projects and inclusion of DDC numbers in WorldCat records, the system has flourished along a number of dimensions under OCLC's leadership. In addition to publishing numerous works based wholly or partly on the DDC, OCLC developed the first electronic version of a general classification system and made it available by subscription. International use of the system and the number of DDC translations have grown, as have mappings of the DDC to other terminologies. OCLC has played a prominent role in classification research in general, one that has resulted in new models of representation, prototypes of new services, and emerging uses of classification in the Web environment. This article highlights important events in the DDC-OCLC relationship, and concludes with prospects for future contributions (Mitchell & Vizine-Goetz, 2006; Mitchell & Vizine-Goetz, in press).

ACQUISITION OF THE FOREST PRESS AND THE DDC

The 1988 library literature contains several reports announcing OCLC's acquisition of the rights to the DDC and the assets of Forest Press (the DDC's publisher) from the Lake Placid Education Foundation for a reported $3.8 million.

The foundation was broke (revenues from DDC went back into DDC products and development, including contract payments to the Library of Congress); … it needed a buyer who could carry DDC into the computerized environment of the 21st century. OCLC, which had worked with Forest Press in earlier cooperative activities, was that buyer (Plotnik, 1988a, p. 736).
In another report, the focus of OCLC on the electronic promise of the DDC is clearly stated: "OCLC will explore publishing electronic versions of the DDC, as well as continuing the ongoing revision and publication in print form …" (OCLC, 1988, p. 443). In yet another report, then OCLC President Roland Brown commented, "The synergy between the legacy of Melvil Dewey and the mission of OCLC is powerful" (Plotnik, 1988b, p. 641). In a 1999 interview following his retirement as Executive Director of Forest Press, Peter Paulson noted the sale to OCLC first among the most important occurrences during his leadership:

First and most important, the sale of Forest Press and DDC to OCLC in 1988. This move brought us the skills and resources we needed, and OCLC has turned out to be a very good home for us (Intner, 1999, pp. 2-3).

MANAGEMENT OF THE DDC

When Forest Press was first acquired by OCLC, Peter Paulson remained executive director and the Forest Press office remained physically in Albany, NY. The Dewey Editorial Office continued at the Library of Congress (LC), where it had been located since 1923. OCLC took over annual payments to the Library of Congress to fund the Dewey editorial staff positions and operations—in 1988, all of these positions were filled by LC employees.

In late 1991, the editor of the DDC, John P. Comaromi, died suddenly. There was a hiring freeze at the Library of Congress during the period candidates were being considered for the position to succeed Dr. Comaromi. OCLC and LC agreed to convert the editor-in-chief position from an LC employee fully funded by OCLC to an OCLC employee physically located in the Dewey Editorial Office at the Library of Congress. Joan S. Mitchell was hired under these circumstances as editor in April 1993.

When Forest Press first joined OCLC, it was organizationally under a group devoted to electronic publications and information. The following year, it moved under the cataloging area, where it has remained nearly continuously until the present day. Peter Paulson retired at the end of 1998; upon his retirement, Joan Mitchell also took on the business operations of Forest Press and served simultaneously as editor-in-chief and executive director from 1999 through early 2003.[1]

In mid-1999, the physical assets of Forest Press were moved from Albany, NY, to OCLC headquarters in Dublin, OH. Also in 1999, the editorial team was expanded by one member. Giles Martin, an Australian, was the first non-U.S.-citizen to be hired on the Dewey team, and the first editor to be based at OCLC headquarters in Dublin. In 2009, Michael Panzer became the first former member of a Dewey translation team to be appointed assistant editor.[2] In addition to the aforementioned, current editorial team members include Assistant Editors Julianne Beall (an LC employee) and Rebecca Green (an OCLC employee), both based in the Dewey Editorial Office at LC, plus a part-time editorial assistant.
One other important piece in the management of the DDC is the 10-member international advisory board, the Decimal Classification Editorial Policy Committee (EPC). EPC is a joint committee of OCLC and the American Library Association (ALA), and advises the DDC editors and OCLC on DDC content and strategic directions. The committee has existed in its present form since the early 1950s—prior to 1988, it was a Forest Press- ALA joint committee. The committee plays an important role in bringing a global viewpoint to the development of the DDC—current members are from Australia, Canada, South Africa, the United Kingdom, and the United States. Representatives of DDC translations serve as corresponding members of EPC and receive proposals at the same time as EPC members for consideration and comment. PUBLICATIONS Prior to joining OCLC, the Forest Press publications list was focused primarily on the full and abridged print editions of the DDC plus separate publications associated with them, Dewey-related conference proceedings, and a few Dewey-related texts. After Forest Press became part of OCLC, the publications list expanded to a wide variety of Dewey publications in print and electronic form, plus DDC-related products such as bookmarks and posters. A majority of the print publications and all of the electronic publications were developed and produced in cooperation with marketing and research staff at OCLC. In recent years, OCLC has chosen to license the production of DDC-related products to library vendors and has focused internal DDC publication efforts on the full and abridged editions of the DDC in print and electronic versions. OCLC also licenses the underlying DDC databases associated with the full and abridged editions as XML data files. The electronic editions and data files are discussed further in the Electronic Editions section of the article. ELECTRONIC EDITIONS An important relationship between OCLC and the DDC started several years prior to the acquisition of Forest Press with the Dewey Decimal Classification Online Project. The history and results of the study are available in full in the study report (Markey, & Demeyer, 1986); a short summary follows. In the early 1980s, the OCLC Office of Research became interested in how classification could assist library catalog users in performing subject searches in an online environment. The Office of Research learned that DDC 19 had been produced by computerized photocomposition—this led OCLC to inquire about the availability of the tapes for research purposes. Also in 1984, Inforonics Inc. was retained by Forest Press to develop an online database management system to support Dewey editorial operations. In January 1984, the DDC Online Project was initiated by the OCLC Office of Research with the support of the Council on Library Resources, Forest Press, and OCLC. In the http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0006� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0006� Mitchell & Vizine-Goetz: The DDC and OCLC http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf Page 5 of 11. 
In the study, led by OCLC Research Scientist Karen Markey, researchers built two catalogs, one of which (the Dewey Online Catalog) included subject-rich data from DDC captions, notes, and Relative Index terms linked through the DDC class number to MARC records drawn from participating libraries' collections. This groundbreaking study, along with OCLC's eventual acquisition of the rights to the DDC, no doubt prompted OCLC's continued interest and experimentation in the creation and use of DDC data in electronic form.

OCLC gained access to all of the Dewey schedules and tables in 1989 after the publication of DDC 20, the first edition produced using an online Editorial Support System (ESS). The ESS database was used by the OCLC Office of Research to prototype the Electronic Dewey software. In November 1992, catalogers at eight libraries began testing the prototype CD-ROM version of the DDC. The eight libraries were: National Library of Australia, Carnegie Mellon University Library, Columbus (OH) Metropolitan Library, Columbus (OH) City Schools, University of Illinois Library at Urbana-Champaign, Library of Congress (Decimal Classification Division), Stockton-San Joaquin County (CA) Public Library, and the New York State Library. Electronic Dewey was released the following year, making Dewey the first library classification scheme available to users in electronic form. The system ran on a personal computer and enabled keyword searching of the schedules, tables, Relative Index, and Manual of DDC 20 on CD-ROM.

In summer 1996, OCLC Forest Press published DDC 21 and released a new version of the Dewey software. For the first time, a new edition of the classification was published in two formats: the traditional four-volume print format and an electronic version on CD-ROM (Dewey for Windows[3]). The publication of Dewey for Windows followed several years of close collaboration between the OCLC Office of Research and the Dewey editorial team; the groups continue to work together today on a range of research and development projects.

The year 2000 marked another milestone in the evolution of the Dewey software, the debut of a Web-based product. WebDewey, a Web-based version of DDC 21, was released by OCLC as part of the Cooperative Online Resources Catalog (CORC) service. The CORC release included features to apply authority control to Dewey numbers and to generate DDC numbers for Web resources automatically. Two years later, WebDewey and Abridged WebDewey, the latter a Web-based version of Abridged 13, became available in the OCLC Connexion cataloging service.

The DDC is also available in multiple XML representations. The XML files are used in OCLC products and services and distributed to translation partners and other licensed users. As part of an update of the Editorial Support System, the proprietary representations are being converted to ones based on the MARC 21 formats for Classification and Authority data. The MARC 21 versions will be available as XML files.
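To make the planned XML distribution concrete, the sketch below shows what a single DDC class might look like when serialized as MARCXML under the MARC 21 Classification format, together with a small reader. The paper does not publish OCLC's actual file layout, so the record shown is only an illustration; field 153, with subfields $a (class number) and $j (caption), is the standard MARC 21 Classification field for this purpose, and the class/caption pair 641.5/Cooking is a real DDC example used here purely for demonstration.

```python
# A minimal sketch (not OCLC's actual distribution format) of one DDC class
# serialized as MARCXML under the MARC 21 Classification format, and a
# reader for it that uses only the Python standard library. Field 153
# carries the class number ($a) and caption ($j); values are illustrative.
import xml.etree.ElementTree as ET

SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="153" ind1=" " ind2=" ">
    <subfield code="a">641.5</subfield>
    <subfield code="j">Cooking</subfield>
  </datafield>
</record>"""

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def read_ddc_class(xml_text):
    """Return the (class number, caption) pair from field 153 of one record."""
    record = ET.fromstring(xml_text)
    field = record.find("m:datafield[@tag='153']", NS)
    subfields = {sf.get("code"): sf.text for sf in field.findall("m:subfield", NS)}
    return subfields.get("a"), subfields.get("j")

print(read_ddc_class(SAMPLE_RECORD))  # ('641.5', 'Cooking')
```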
TRANSLATIONS

The Dewey Decimal Classification is used in over 200,000 libraries in 138 countries—a reach into the global community that extends past OCLC's other services. An important feature of the DDC, its language-independent representation of concepts, makes it ideally suited as a global knowledge organization system.

Since OCLC acquired the DDC in 1988, authorized translations of the full and abridged editions of the DDC have been published in the following languages: Arabic, French, German, Greek, Hebrew, Icelandic, Italian, Norwegian, Russian, Spanish, Turkish, and Vietnamese. Updated versions of the top three levels of the DDC are available in Arabic, Chinese, Czech, French, German, Hebrew, Italian, Norwegian, Portuguese, Russian, Spanish, Swedish, and Vietnamese. Plans are currently under way for a new Indonesian abridged translation and the first Swedish translation of the DDC (the latter currently envisioned as a mixed Swedish-English version of the full edition of the DDC). Currently, only the German translation is available in a Web version, but Web versions of the DDC are under exploration for the French, Greek, Italian, Norwegian, and Swedish translations.

Translations of the DDC start with an agreement between OCLC and a recognized bibliographic agency in the country/language group. For example, under an agreement with OCLC, Deutsche Nationalbibliothek leads efforts on the German translation with the cooperation of bibliographic agencies in Germany, Austria, and Switzerland. Current translations are localized and interoperable with reference to the English-language edition on which the translation is based—localized in terms of terminology and examples appropriate to the country/language group, and interoperable in terms of authorized expansions or contractions of provisions in the base edition. A common example of the latter is an expansion of the geographic table in a translation. The Vietnamese translation of Abridged Edition 14 contains an extended geographic table for Vietnam in which the explicit provisions for the areas of Vietnam are at a deeper level than those found in the current abridged and full English-language editions of the DDC—in other words, the English-language version is a logical abridgment of the version found in the Vietnamese translation (Beall, 2003).

MAPPINGS

Mappings between Dewey and thesauri, subject heading lists, and other classification schemes enrich the vocabulary associated with DDC numbers and enable the use of the DDC as a switching system. Mappings to new concepts in other systems also help to keep the classification up-to-date.

The electronic versions of DDC contain selected mappings between Dewey numbers and three subject headings systems—Library of Congress Subject Headings (LCSH), Medical Subject Headings (MeSH), and H.W. Wilson's Sears List of Subject Headings. The Dewey editors consult LCSH and MeSH as sources of terminology for the DDC and map terminology from both systems to the classification. Dewey for Windows was the first electronic edition to include intellectually mapped LCSH; MeSH mappings were introduced in WebDewey with the release of DDC 22 in 2003.
The work is part of OCLC's Next Generation Cataloging project which is piloting automated techniques for enriching publisher and vendor metadata (How the Pilot Works, n.d., para. 3). The mappings are used to add Dewey numbers to publisher records and BISAC subject headings to bibliographic records. Subject heading-DDC number pairs statistically derived from WorldCat are also included in OCLC products and services. The OCLC publications Subject Headings for Children and People, Places & Things are lists of LC subject headings with corresponding DDC numbers. Both include statistical mappings as do all of the electronic versions of the DDC, beginning with Electronic Dewey. Statistical mappings supplement the mappings provided by the Dewey editors. Several Dewey translation partners have projects under way to map Dewey numbers to local subject heading systems. Headings from Schlagwortnormdatei (SWD), the German subject heading authority file, are being mapped to Dewey numbers in the Criss-Cross project to date, 61,500 SWD headings have been mapped to the DDC (“Mapping of German” Subject Headings n.d.). At the Italian National Central Library in Florence, work is under way to map Dewey numbers to Nuovo Soggettario, the Italian subject heading list (Nuovo Soggettario, 2006; Paradisi, 2006). In addition to mappings between Dewey numbers and subject headings, several concordances have been developed between Dewey and other classification systems. The Library of Congress's Classification Web system includes statistical correlations among LCSH, Library of Congress Classification (LCC), and DDC based on the co-occurrence of the three in Library of Congress bibliographic records. The National Library of Sweden maintains a mapping between SAB, the Swedish classification system, and the DDC (Svanberg, 2008). The Czech National Library has built a concordance between UDC and DDC for the purposes of collection assessment (Bal kov 2007). RESEARCH For many years, the OCLC Office of Research has focused its DDC-related efforts in three main areas: prototyping classification tools for catalogers, developing automated classification software, and applying and refining statistical mapping techniques. Several of the outcomes of this work are discussed in the Electronic Editions and Mappings sections of this paper. While OCLC remains interested in these areas, recent projects are taking DDC research in new directions. One of these is the DeweyBrowser prototype (Vizine-Goetz, 2006). The DeweyBrowser is an end user system that incorporates many features of next generation library catalogs, http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0015� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0019� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0001� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0001� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0001� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0020� Mitchell & Vizine-Goetz: The DDC and OCLC http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf Page 8 of 11. including tag clouds and multi-faceted searching and navigation. 
The clouds provide a visual representation of the number of titles in each of the top three levels of the DDC (known collectively as the DDC Summaries). In the prototype, users can navigate the Summaries in English, French, German, Norwegian, Spanish, and Swedish. The Summaries provide an ideal browsing structure for multilingual environments. In another project, OCLC researchers have developed an experimental classification service that provides access to classification information from more than 36 million WorldCat records (“Overview,” n.d., para 1). The OCLC FRBR Work-Set algorithm is used to group bibliographic records to provide a work-level summary of the DDC numbers, Library of Congress Classification numbers, and National Library of Medicine Classification numbers assigned to a work. The beta service is accessible through a human interface and as Web service. The Web service supports machine-to-machine interaction. Two additional Web services are being developed to deliver DDC data. One will offer a history of changes for a DDC class (Panzer, 2009); the other will provide a generic view of a DDC class across all editions/versions and languages. Finally, OCLC is investigating the issues involved in transforming the DDC into a Web information resource, including the design of Uniform Resource Identifiers (URIs) and the modeling of DDC in Simple Knowledge Organization System (SKOS) Panzer, 2008; Panzer, 2008 August). Emerging data models and new technologies (e.g., SKOS and linked data) will provide new opportunities for publishing, linking, and sharing DDC data in the years to come. CONCLUSION As we look back over the 20 years since OCLC acquired the rights to the DDC in 1988, we reflect on how OCLC has impacted Dewey's value proposition. The basic system features—well-defined categories and well-developed hierarchies, all interconnected by a rich network of relationships—have been enhanced by interoperable translations, mappings to other schemes, and new forms of representation of the underlying data. The amount of categorized content associated with the system in WorldCat and elsewhere has grown, as has worldwide use of the system. Dewey's language-independent representation of concepts makes it ideally suited to a myriad of uses in the current and future information environment. Its ongoing success as a knowledge organization tool will depend on the aggressive leadership that OCLC, in cooperation with the worldwide community of Dewey users, is willing to provide along a number of dimensions— updating and development of the system itself, availability of the system for experimentation and use, association of the system with content, mappings to other schemes, translations, and innovative research. ACKNOWLEDGMENTS The authors thank Julianne Beall (Library of Congress), plus the following OCLC colleagues for their advice and assistance in preparing this article: Mary Bray, Terry Butterworth, Robin Cornette, Libbie Crawford, Tam Dalrymple, Rebecca Green, Larry http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0014� https://www.informaworld.com/smpp/section?content=a915764349&fulltext=713240928#CIT0012� Mitchell & Vizine-Goetz: The DDC and OCLC http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf Page 9 of 11. Olszewski, Michael Panzer, Phil Schieber, and MaryAnn Semigel. 
All opinions expressed and any omissions or errors remain the responsibility of the authors. Connexion, DDC, Dewey, Dewey Decimal Classification, WebDewey, and WorldCat are registered trademarks of OCLC Online Computer Library Center, Inc. REFERENCES Bal kov , M. (2007) UDC in Czechia. Proceedings of the International Seminar “Information Access for the Global Community, The Hague, June 4-5, 2007, Extensions and Corrections to the UDC, 29 pp. 191-227. — Retrieved February 27, 2009, from http://dlist.sir.arizona.edu/2379/01/MBalikova_UDC_Seminar2007.pdf Beall, J. (2003, August) Approaches to expansions: Case studies from the German and Vietnamese translations. A paper presented at the World Library and Information Congress (69th IFLA General Conference and Council) Berlin — Retrieved March 1, 2009, from http://www.ifla.org/IV/ifla69/papers/123e-Beall.pdf How the pilot works — Information retrieved February 27, 2009 from http://www.oclc.org/partnerships/material/nexgen/nextgencataloging.htm Intner, S. (1999) Stream of consciousness: An interview with Dewey's Peter Paulson. Technicalities 19 , pp. 2-3. Mapping of German Subject Headings to the Dewey Decimal Classification — Information retrieved February 27, 2009 from http://linux2.fbi.fh- koeln.de/crisscross/swd-ddc-mapping_en.html Markey, K. and Demeyer, A. (1986) Dewey Decimal Classification Online Project: Evaluation of a Library Schedule and Index Integrated into the Subject Searching Capabilities of an Online Catalog , OCLC/OPR/RR-86/1 . OCLC Online Computer Library Center , Dublin, OH Mitchell, J. S. and Vizine-Goetz, D. (2006) Moving beyond the Presentation Layer: Content and Context in the Dewey Decimal Classification (DDC) System Haworth Press. Co-published simultaneously in Cataloging & Classification Quarterly, 42, 2006 , Binghamton, NY Mitchell, J. S. and Vizine-Goetz, D. Bates, M. J. and Maack, M. (eds) Dewey Decimal Classification. Encyclopedia of Library and Information Science 2nd ed., Taylor & Francis , New York — (in press) (2006) Nuovo Soggettario: Guida al Sistema Italiano di Indicizzazione per Soggetto: Prototipo del Thesaurus pp. 175-177. Editrice Bibliografica , Milano OCLC acquires Forest Press, publisher of Dewey Decimal Classification (1988, December) Information Technology and Libraries p. 443. http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf� http://dlist.sir.arizona.edu/2379/01/MBalikova_UDC_Seminar2007.pdf� http://www.ifla.org/IV/ifla69/papers/123e-Beall.pdf� http://www.oclc.org/partnerships/material/nexgen/nextgencataloging.htm� http://linux2.fbi.fh-koeln.de/crisscross/swd-ddc-mapping_en.html� http://linux2.fbi.fh-koeln.de/crisscross/swd-ddc-mapping_en.html� Mitchell & Vizine-Goetz: The DDC and OCLC http://www.oclc.org/research/publications/library/2009/mitchell-dvg-jla.pdf Page 10 of 11. Overview — Information retrieved February 27, 2009 from http://www.oclc.org/research/researchworks/classify/ Panzer, M. Greenburg, J. and Klas, W. (eds) (2008) Cool URIs for the DDC: Towards Web-scale accessibility of a large classification system. Metadata for Semantic and Social Applications: Proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, September 22-26, 2008 pp. 183-190. Dublin Core Metadata Initiative and Universit tsverlag G ttingen , G ttingen Panzer, M. 
Panzer, M. (2008, August). DDC, SKOS, and linked data on the Web. Presentation at the OCLC/ISKO-NA Preconference to the 10th International ISKO Conference, Université de Montréal, Canada. Retrieved February 27, 2009, from http://www.oclc.org/us/en/news/events/presentations/2008/ISKO20080805-deweyskos-panzer.ppt

Panzer, M. (2009, January). More than lists of changes: tracing the history of DDC concepts. Presentation at the Dewey Breakfast/Update, ALA Midwinter Meeting, Denver, CO. Retrieved March 2, 2009, from http://www.oclc.org/us/en/dewey/news/conferences/more_than_lists.ppt

Paradisi, F. (2006, August). Linking DDC numbers to the new 'Soggettario Italiano'. Presentation at the Dewey Translators Meeting, World Library and Information Congress (72nd IFLA General Conference and Council), Seoul, Korea. Retrieved February 25, 2009, from http://www.oclc.org/dewey/news/conferences/ddc_and_soggetario_ifla_2006.ppt

Paulson, P. (1989). Publisher's Foreword. In M. Dewey, Dewey Decimal Classification and Relative Index (J. P. Comaromi, J. Beall, W. E. Matthews Jr., & G. R. New, Eds.), Vol. 1, p. xi.

Plotnik, A. (1988a). Would Dewey have done it? American Libraries, 19, p. 736.

Plotnik, A. (1988b). OCLC pays $3.8 million for Dewey Classification. American Libraries, 19, p. 641.

Svanberg, M. (2008). Mapping Two Classification Schemes—DDC and SAB. In New Perspectives on Subject Indexing and Classification: International Symposium in Honour of Magda Heiner-Freiling (pp. 41-51). Leipzig, Frankfurt am Main, Berlin: Deutsche Nationalbibliothek.

Vizine-Goetz, D. (2006). DeweyBrowser. In J. S. Mitchell & D. Vizine-Goetz (Eds.), Moving beyond the Presentation Layer: Content and Context in the Dewey Decimal Classification (DDC) System (pp. 213-220). Binghamton, NY: Haworth Press. Co-published simultaneously in Cataloging & Classification Quarterly, 42, 2006, 213-220.

NOTES

[1] At the request of Joan Mitchell, she returned to serving solely as editor-in-chief in early 2003. Dewey business operations were taken over by a business director in the OCLC cataloging area, and they have remained separate from the editorial operations since that period, mirroring the Forest Press/Dewey Editorial Office organization that had been in place for many years.

[2] In the mid 2000s, Michael Panzer headed the technical team based at Cologne University of Applied Sciences that first translated Dewey into German. Michael Panzer succeeds long-time Assistant Editor and LC employee Winton E. Matthews Jr., but is based at OCLC headquarters in Dublin.

[3] A Microsoft Windows®-based version of the software.
work_2iq2znowljdlpbfr43l5lr2tju ----

European Laboratory for Particle Physics
CERN - Geneva

CERN-AS-99-003
24th March 1999

The electronic journal service at CERN, a first evaluation: User access interfaces and user awareness

Eliane CHANEY, Catherine BULLIARD, Caroline CHRISTIANSEN
CERN Scientific Information Service, European Laboratory for Particle Physics, Geneva

Introduction

CERN is the European Laboratory for Particle Physics located in Geneva. It provides scientific research facilities to some 3000 permanent staff and 6500 visiting scientists, engineers or technicians from all over the world, who come to explore and study the atom. These scientists exchange theories and participate in large experiments. The development of their work and the results are regularly described in scientific articles. These are submitted as preprints to our server and subsequently to the now famous Los Alamos preprints electronic archives. In 1997, some 1800 CERN articles were published in journals or conference proceedings.

Towards a journal desktop service

Preprints and printed journal articles are complementary publishing outlets for authors. But while in a preprint archive it takes a matter of hours from submitting an article to its publication on the Web, publishing the same article in a printed journal may take months. These delays are shorter with electronic publishing. And every day new readers, well used to online technologies, discover the advantages of a digital library environment, working with full-text preprints and journal collections available on their desktop. Users value the reliability and stability of an integrated service and the time saving. But there is still the problem for users that journals put on the Internet are spread over different publishers' web sites. The library at CERN has recognised the need to provide effortless access to our electronic journals collection. This meant that we needed to develop, integrate and adapt specific online interfaces for these documents.

Present collections

Presently, besides the paper collection, which includes some 480 scientific titles, we offer access to 260 full-text online journals. Due to repeated acquisition budget reductions and regular subscription price increases, we have had to limit our online journal acquisitions within the printed subscriptions allocation (see also: Bulletin des bibliothèques de France, 44(2), 1999, pp. 27-32). As a consequence, titles have been selected when they are:

- available at the same or lower price than the paper version;
- accessible without charge as part of the paper subscription;
- acquired through a license between a publisher, Springer, and a consortium of Swiss academic and research libraries;
- available free of charge on the Internet and considered of scientific interest for our community.

We also point to some 200 online tables of contents which are available free of charge to us. For each new title, we always test the service and the access reliability. We have a strong preference for IP-based access control for the whole site.
Titles that require personal ID and password access are ignored. The resulting collection can be retrieved from our OPAC catalogue or from dedicated electronic journals web screens described below.

Expectations

The main benefits for the library expected from electronic journals are the savings in processing time and storage space, and hopefully in the future reduced or stabilised costs. We like to cite as an interesting model a new journal in our field: « JHEP, Journal of High Energy Physics », produced at SISSA (Trieste), and published by the Italian Physical Society. This title is primarily an electronic journal, peer-reviewed, and accessible without charge on the Internet. Articles are added as they become available and later a 'printed archive' volume is published on paper and distributed on subscription at a reasonable cost. Physicists support and encourage this new publication, as much for its quality as for the distribution speed of the full text published articles.

Towards a better access

At present, we feel that we are providing a useful service, and we have the support of our community, but as explained further in this article, we need a better understanding of readers' attitudes towards electronic resources. The future should bring improvements. Currently there is a large variety of publication and distribution patterns for electronic journals, and the management and distribution methods of electronic journals are cumbersome, but we hope that in the future publishers will harmonise their services as they have done with printed editions. We also hope for the organisation of electronic archives by national or international institutions. We believe these two developments should enable libraries to provide a more satisfactory integrated service for electronic resources.
Also, the printed and online electronic collection catalogues can be easily differentiated when produced. Furthermore, in our integrated system the existence of separate records for e-versions allows us to manage e-journal subscriptions separately and easily.

[1] Olson, Nancy B., ed. Cataloging Internet Resources: A Manual and Practical Guide. 2nd ed. Dublin, OH: OCLC Online Computer Library Center, Inc., 1997. http://www.purl.org/oclc/cataloging-internet

Location and electronic access

Four new fields have been introduced into our library catalogue record in order to describe the characteristics of the electronic resources available on the Internet. The first added field is a sort of semaphore signal that indicates either that access is restricted to the CERN Intranet or that the full text is freely available on the Internet. A second added field is dedicated to the Uniform Resource Locator (URL), together with a standardised note that qualifies the resource type: i.e. the online version of a printed journal, or an electronic-only journal. A third field links both versions of a periodical through their system numbers. Finally, a fourth new field defines the electronic services the publisher offers free of charge: tables of contents, abstracts, selected articles. The field is structured to contain the URL, to describe the service type, to give it a code for retrieval purposes, and to indicate its time span. Mining for these data, and maintaining their validity, is time-consuming but is proving worthwhile for the users. A medium level of cataloguing is carried out by our library. The journal bibliographic record provides the electronic format availability, the holding information, and any bibliographic changes. Together these data complete the description of the electronic resource and support the user interface features.

Interface to the electronic journals collection

Readers can access full-text articles or online tables of contents in different ways. One point of access is through the OPAC, i.e. the GUI to the catalogue. This interface has been designed in-house, with bibliographic records offering links to the electronic documents. The OPAC mostly serves the needs of searching. For browsing, we have developed two lists. In the first, the titles are ordered alphabetically, with clear mention of documents with restricted access. For each title with a corresponding paper version, a link to the local holdings is indicated. The titles with full-text availability have been graphically distinguished from the titles with table-of-contents access. What is special about the alphabetical list is its automated production. The list is generated via a program that queries the catalogue and retrieves all titles with online features. It produces and formats the HTML files, which are then put online. This development has considerably reduced the time needed to maintain the list, which was previously done manually. The sine qua non of its production is correct cataloguing. The list by subject allows the reader to choose a specific subject category and to browse the journal list under that heading. The links lead to the bibliographic records in the OPAC, from where access to the document is possible. Another interface to electronic journals, called GO DIRECT, is a Web form that retrieves requested articles on the fly if the reader has the precise reference (i.e. title, year, volume and page number). The engine avoids unnecessary navigation through publishers' web pages.
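As an illustration of the idea behind GO DIRECT, the following minimal Python sketch maps a precise citation onto a predicted full-text URL. The journal-to-template mapping shown is a hypothetical placeholder, not the actual pattern used by GO DIRECT or by any publisher.

```python
# Hypothetical sketch of a GO DIRECT-style citation-to-URL resolver.
# The journal-to-template mapping below is invented for illustration;
# the real service maintains patterns matching each publisher's scheme.
URL_TEMPLATES = {
    "electronics letters online": (
        "https://publisher.example.org/el/{year}/v{volume}/p{page}"
    ),
}

def resolve_citation(title: str, year: int, volume: int, page: int) -> str:
    """Build the predicted full-text URL from a precise citation."""
    template = URL_TEMPLATES.get(title.strip().lower())
    if template is None:
        raise KeyError(f"No URL pattern known for journal: {title!r}")
    return template.format(year=year, volume=volume, page=page)

if __name__ == "__main__":
    # Mirrors the example given with the paper's Fig. 3:
    # Electronics Letters Online, volume 35, 1999, page 1.
    print(resolve_citation("Electronics Letters Online", 1999, 35, 1))
```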
GO DIRECT was invented and launched by a physicist at our institute, who felt it was necessary to have a search tool that could speed up access to full-text articles. The facility offered by GO DIRECT corresponds well to our users' needs, because in fact they usually do have precise references. The GO DIRECT script constructs a URL from the citation the user supplies. Because the publishers use URLs based on citations, it is possible for the script to reliably predict the correct URL from the citation. This understanding of the logical construction of URLs is also widely used to insert into the OPAC record of a preprint a link to the electronically published version of the article. The method permits immediate access to both the preprint and the published version of the article.

Promotion and evaluation of the service

One of the main goals of the library service is to circulate information on the availability of new services and products to readers. The Library is very keen to promote the use of electronic journals, for example through awareness-raising advertising campaigns. There are many reasons for this. One is our view of electronic journal access as a service for everybody, one which has to be successful, i.e. used. The electronic journal gateways described above serve first the readers who are already familiar with online journals and should then persuade the others to access journals from their desktop. The whole community should experience the facilities offered by the electronic medium. Another reason to develop this service relates to foreseen cost savings. Finally, we believe that electronic journals are an important step towards the digital library, reducing the amount of paper that needs to be stored in the library and making diverse information sources available. After promotion, the next step in the development of the service is proper evaluation. It is important for us to measure the acceptance of electronic journals by our readers and the usage of our collection. At first, observation of attitudes towards electronic journals was primarily informal, at the information desk in the Library. Then we decided we needed a user survey. On the one hand, we have analysed the connections to the web pages dedicated to electronic journals. This quantitative data has informed us about the extent of the usage: more than 35% of the population on the site seem to have visited the pages within a period of 4 months, and 30% of these connections are by regular readers. On the other hand, an online questionnaire was launched, requesting the opinions of both users and non-users of electronic journals. We considered the data qualitative because of the low response rate. The analysis of the questionnaire data established the high degree of satisfaction of the users with electronic journals and the service provided by the Library, as well as the willingness of the non-users to adopt new reading habits. In the future, we hope that publishers will collaborate in the task of measuring the usage of electronic journal collections. At the moment, we have only been able to collect data on the global usage of our collection and had to rely on the in-house methods used in our recent survey. But publishers, we think, are able to, and should, provide detailed figures about the usage of individual titles. When carrying out our survey, we contacted our publishers in order to obtain statistics.
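A minimal sketch of the kind of quantitative connection analysis mentioned above, counting distinct hosts that visited the e-journal pages in a standard web-server access log. The log file name, the page prefix, and the 10-visit threshold for "regular" readers are illustrative assumptions, not CERN's actual configuration.

```python
# Hedged sketch: estimate how many distinct hosts visited the
# e-journal pages, from a Common Log Format access log.
from collections import Counter

EJOURNAL_PREFIX = "/library/electronic_journals/"  # assumed prefix

def count_visitors(log_path: str) -> Counter:
    visits = Counter()
    with open(log_path) as log:
        for line in log:
            parts = line.split()
            if len(parts) < 7:
                continue  # skip malformed lines
            host, request_path = parts[0], parts[6]
            if request_path.startswith(EJOURNAL_PREFIX):
                visits[host] += 1
    return visits

if __name__ == "__main__":
    visits = count_visitors("access.log")  # placeholder file name
    print(f"{len(visits)} distinct hosts visited the e-journal pages")
    regulars = sum(1 for n in visits.values() if n >= 10)
    print(f"{regulars} of them returned 10 times or more")
```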
The result of our enquiry established that they are not yet organised to answer detailed questions. In the future we hope to get data in the form of log files in a standard format or, at a minimum, simply the willingness to collaborate in usage analysis.

Further reading

Chaney, E.; Bulliard, C.; Christiansen, C.; Cressent, J.-P. Une bibliothèque de recherche face à l'édition électronique: l'exemple du CERN, in: Bulletin des bibliothèques de France, 44(2), 1999, pp. 27-32.
Le Coadic, Y. F. Le besoin d'information: formulation, négociation, diagnostic. Paris: ADBS, 1998 (Collection Sciences de l'information, série Etudes et techniques).
Peterson Bishop, A. Measuring access, use, and success in digital libraries, in: The Journal of Electronic Publishing [Online], 4(2), 1998. http://www.press.umich.edu/jep/04-02/bishop.html
Pettenati, C. What is a virtual library?, in: ELAG, 17th Library System Seminar, Graz, 14-16 April 1993, pp. 145-163.

URLs

CERN Scientific Information Service homepage: http://wwwas.cern.ch/library/
What is the APS link manager? http://publish.aps.org/linkfaq.html

Figures

Fig. 1 E-journals homepage: http://wwwas.cern.ch/library/electronic_journals/ej.html
Fig. 2 OPAC screens of the electronic and the printed editions of Classical and Quantum Gravity:
2.1 http://weblib.cern.ch/cgi-bin/showfull?uid=0571916_24074&base=CERCER&sysnb=0236052
2.2 http://weblib.cern.ch/cgi-bin/showfull?uid=0571916_24074&base=CERCER&sysnb=144241
Fig. 3 GO DIRECT: http://wwwas.cern.ch/library/electronic_journals/from_ref_to_text_AE.html (please fill in the form with the following example: Electronics Letters Online, volume 35, 1999, page 1)
Fig. 4 OPAC information for a preprint with access both to the original full-text document and to the published version: ETHZ-IPP-PR-97-01 (hep-ex/9705004). Towards a Precise Parton Luminosity Determination at the CERN LHC, by Dittmar, M; Pauss, F; Zürcher, D (14 p), 1997. Publ. Ref.: Phys. Rev. D 56 (1997) 7284-7290. Published version; Access to fulltext document; More information; Mark document.

work_2jf6gyb665cuvixtewowbpp42u ---- Iranome: A catalog of genomic variations in the Iranian population

Human Mutation. 2019;1-17. DOI: 10.1002/humu.23880. Received: 16 March 2019 | Revised: 16 July 2019 | Accepted: 22 July 2019

DATABASES

Iranome: A catalog of genomic variations in the Iranian population

Zohreh Fattahi1,2 | Maryam Beheshtian1,2 | Marzieh Mohseni1,2 | Hossein Poustchi3 | Erin Sellars4 | Sayyed Hossein Nezhadi5 | Amir Amini6 | Sanaz Arzhangi1 | Khadijeh Jalalvand1 | Peyman Jamali7 | Zahra Mohammadi3 | Behzad Davarnia1 | Pooneh Nikuei8 | Morteza Oladnabi1 | Akbar Mohammadzadeh1 | Elham Zohrehvand1 | Azim Nejatizadeh8 | Mohammad Shekari8 | Maryam Bagherzadeh4 | Ehsan Shamsi-Gooshki9,10 | Stefan Börno11 | Bernd Timmermann11 | Aliakbar Haghdoost12,13 | Reza Najafipour14 | Hamid Reza Khorram Khorshid1 | Kimia Kahrizi1 | Reza Malekzadeh3 | Mohammad R.
Akbari4,15,16 | Hossein Najmabadi1,2

1Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran; 2Kariminejad-Najmabadi Pathology & Genetics Center, Tehran, Iran; 3Digestive Diseases Research Center, Digestive Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; 4Women's College Research Institute, University of Toronto, Toronto, Ontario, Canada; 5Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; 6Information Technology Office, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran; 7Shahrood Genetic Counseling Center, Welfare Office, Semnan, Iran; 8Molecular Medicine Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran; 9Medical Ethics and History of Medicine Research Center, Tehran University of Medical Sciences, Tehran, Iran; 10Department of Medical Ethics, Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran; 11Max Planck Institute for Molecular Genetics, Berlin, Germany; 12Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran; 13Regional Knowledge Hub, and WHO Collaborating Centre for HIV Surveillance, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran; 14Cellular and Molecular Research Centre, Qazvin University of Medical Sciences, Qazvin, Iran; 15Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada; 16Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada

Correspondence: Mohammad R. Akbari, Women's College Hospital, University of Toronto, 76 Grenville Street, Room 6421, M5S 1B2 Toronto, Ontario, Canada. Email: mohammad.akbari@utoronto.ca. Hossein Najmabadi, Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Daneshjoo Blvd, Koodakyar St., Evin, Tehran 1985713834, Iran. Email: hnajm12@yahoo.com

Funding information: Vice deputy for research and technology at Iran Ministry of Health and Medical Education, Grant/Award Number: 700/150; Iran Vice-President office for Science and Technology, Grant/Award Number: 11/66100

Abstract

Considering the application of human genome variation databases in precision medicine, population-specific genome projects are continuously being developed. However, the Middle Eastern population is underrepresented in current databases. Accordingly, we established the Iranome database (www.iranome.com) by performing whole exome sequencing on 800 individuals from eight major Iranian ethnic groups, representing the second largest population of the Middle East. We identified 1,575,702 variants, of which 308,311 were novel (19.6%). Also, by presenting higher frequencies for 37,384 novel or known rare variants, the Iranome database can improve the power of molecular diagnosis. Moreover, its attainable clinical information makes this database a good resource for classifying the pathogenicity of rare variants. Principal components analysis indicated that, apart from Iranian-Baluchs, Iranian-Turkmen, and Iranian-Persian Gulf Islanders, who form their own clusters, the rest of the population was genetically linked, forming a super-population.
Furthermore, only 0.6% of the novel variants showed counterparts in the Greater Middle East Variome Project, emphasizing the value of Iranome at the national level by releasing a comprehensive catalog of Iranian genomic variations, and also filling another gap in the catalog of human genome variations at the international level. We introduce Iranome as a resource which may also be applicable in other countries located in neighboring regions historically called Greater Iran (Persia).

KEYWORDS: genome project, genomic variation database, Iran, Iranome, whole exome sequencing

1 | INTRODUCTION

Completion of the Human Genome Project (HGP) was a turning point in the field of human genetics, providing the first human reference DNA sequence. With the advent of next generation sequencing (NGS) technologies, some personal genomes were sequenced and numerous single nucleotide polymorphisms (SNPs), mostly with lower allele frequencies, were identified which were not present in the dbSNP database at that time (Naidoo, Pawitan, Soong, Cooper, & Ku, 2011). To obtain a more complete picture of the rare variants in different human populations, the 1,000 Genomes (1KG) Project expanded and provided the largest catalog of human genetic variations, applying whole exome sequencing (WES) and whole genome sequencing (WGS) to 2,504 individuals from 26 different populations (Auton et al., 2015; Naidoo et al., 2011). This project revealed over 88 million variants in the human genome, providing a valuable resource for research on the genetic basis of human disorders. The majority of these variants were rare (allele frequency <0.5%), and the total number of variants differed among the 26 populations, with the rare ones limited to closely related populations, known as geographical clustering of the rare variants (Auton et al., 2015; Tennessen et al., 2012). Furthermore, WES of 6,515 European American and African American individuals in the NHLBI GO Exome Sequencing Project (ESP) was used to infer mutation ages and determined that about 86% of single nucleotide variants (SNVs) had been created recently. The results showed that the majority of these variants were rare SNVs arising as a result of population growth, and that the recently emerged rare deleterious SNVs were significantly increased in disease genes (Fu et al., 2013). Such variants are of great importance in the field of genetic diagnosis of Mendelian disorders. Basically, the pipelines used to identify the causal variant from the extensive list of genetic variations detected by NGS methods include filtering the variants based on high allele frequency. Therefore, the presence of large-scale databases of genetic variations with representatives from many different ethnic groups is essential to provide a more complete picture of human genome variations and their allele frequencies. This provides a good resource for clinical and functional interpretation of the variants, distinguishing real disease-causing variants from polymorphisms. In line with this demand, a few collaborative projects were established recently that aggregated a large number of WES and WGS data sets, providing a more comprehensive summary of human genome variations. These included the Exome Aggregation Consortium (ExAC), aggregating 60,706 exome sequences, and the later Genome Aggregation Database (gnomAD), aggregating 125,748 exome sequences in addition to 15,708 whole-genome sequences of unrelated individuals from various ancestries (Lek et al., 2016).
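To make this filtering step concrete, the following is a minimal sketch of frequency-based candidate filtering against a population allele-frequency table. The record fields and the 1% cutoff are illustrative assumptions, not a prescribed pipeline.

```python
# Hedged sketch: rare-variant filtering of NGS candidates by
# population allele frequency, as in typical diagnostic pipelines.
from typing import Iterable

RARE_AF_CUTOFF = 0.01  # a common 1% threshold; pipelines vary

def filter_rare(variants: Iterable[dict], af_db: dict) -> list:
    """Keep variants absent from the reference database or seen
    there with an allele frequency below the cutoff."""
    kept = []
    for v in variants:
        key = (v["chrom"], v["pos"], v["ref"], v["alt"])
        af = af_db.get(key, 0.0)  # unseen variants count as AF 0
        if af < RARE_AF_CUTOFF:
            kept.append(v)
    return kept

if __name__ == "__main__":
    candidates = [
        {"chrom": "2", "pos": 166201311, "ref": "C", "alt": "T"},
        {"chrom": "12", "pos": 52200511, "ref": "G", "alt": "A"},
    ]
    # A population-specific AF table: the first variant is common
    # in this (made-up) population, so it is filtered out.
    af_db = {("2", 166201311, "C", "T"): 0.034}
    print(filter_rare(candidates, af_db))
```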
Data analysis of such large-scale projects led to the conclusion that rare variants are more likely to be population-specific. This shows the necessity of conducting population-specific genome projects to identify their genetic backgrounds. Such attempts will help in building a more complete picture of genetic variations in the human genome by introducing novel and population-specific rare variants. In fact, this was the aim of the 1KG Project, which selected samples from 26 different populations. However, the need for population-specific genome projects still remains, because many ethnic groups are not represented in the 1KG Project, or the number of individuals per population is insufficient to attain reliable allele frequencies (An, 2017; Dopazo et al., 2016; Yamaguchi-Kabata et al., 2015). Accordingly, lack of representatives from specific populations and ethnic groups in human genome databases may lead to marginalization of members of those populations in the era of the genomic revolution, which might put them in danger of discrimination by depriving them of the benefits of new advances in genetic technologies and the associated medical advances. Creation of population-specific genomic variation databases will play an important role in genomic medicine and healthcare, as the interpretation of a causal variant in the clinical setting requires knowledge of its frequency in the population the patient comes from (An, 2017; Auton et al., 2015; MacArthur et al., 2014). Therefore, from an ethical point of view, population-specific genome projects improve health equity at a population and global level (Boomsma et al., 2014; The Genome of the Netherlands Consortium, 2014). Middle Eastern genomes are completely absent from the most renowned human genome variation datasets (Auton et al., 2015; Lek et al., 2016). The Middle East, a large region encompassing 17 countries from West Asia to North Africa, is renowned as home to the cradle of civilization and also as an important gateway of the modern human migratory routes out of Africa that thereafter populated the whole world (Henn, Cavalli-Sforza, & Feldman, 2012). Furthermore, its overall 411 million residents come from diverse ethnic groups with rapid population growth (The Middle East Population (2018-05-20), retrieved from http://worldpopulationreview.com/continents/the-middle-east-population/). Iran, the second largest population in the Middle East, is geographically located in West Asia in a historical region known as the "Fertile Crescent", where the initial migrations out of Africa towards Asia and Europe (Eurasia) occurred (Map of Human Migration, retrieved from https://genographic.nationalgeographic.com/human-journey/; Figure 1a; Alkan et al., 2014; Henn et al., 2012). In addition, the important role and geographical location of Iran in the expansion and distribution of gene mutations during the Silk Road trade, an important period causing population admixture after the divergence of western and eastern Eurasia, cannot be overlooked (Comas et al., 1998; Derenko et al., 2013; Zarei & Alipanah, 2014). All these factors emphasize the valuable and significant additive information on human genome variation that can be gained by sequencing genomes from the Middle Eastern population, and especially from the Iranian population, as is the main focus of this study.
As a result of the tradition of consanguineous marriage, a high burden of recessive disorders is reported in the region, and attempts to clarify the genetic variants and their allele frequencies in the Middle Eastern population have recently been initiated, aiming to improve precision medicine. First, "The Greater Middle East (GME) Variome Project" included 2,497 WES data sets selected from the Greater Middle East (Figure 1a); the later "Al Mena" project aggregated WES and WGS sequence data from 2,115 unrelated individuals from the Middle East and North Africa (MENA) region (Koshy, Ranawat, & Scaria, 2017; Scott et al., 2016).

Figure 1: (a) Map of Iran, its neighboring countries, and countries in the Middle East and North Africa. The blue arrows show the initial migrations out of Africa towards the Fertile Crescent (a region located in the Middle East which stretches from the Zagros Mountains in southwestern Iran to northern Mesopotamia and into southeast Anatolia) and then the migration of early humans from this region to Asia and Europe (Eurasia). The black man symbols show the countries and populations which were investigated in The Greater Middle East (GME) Variome Project. (b) Map of Iran with its provinces. The man symbols show where all 800 samples in the Iranome project were taken. The red, dark blue, light green, pink, dark green, light blue, purple, and yellow symbols mark the provinces where samples were collected for Iranian-Arabs, Iranian-Azeris, Iranian-Baluchs, Iranian-Kurds, Iranian-Lurs, Iranian-Persians, Iranian-Persian Gulf Islanders, and Iranian-Turkmen, respectively.

Although these large-scale aggregation projects have provided an excellent view of allelic frequencies in the Middle Eastern population, the number of Iranian individuals included is insufficient to provide a detailed account of allelic frequencies in this specific population, particularly because it is considered to be one of the ancient founder populations in the Middle East (Scott et al., 2016) and it is a multiethnic and multilinguistic community (Amanolahi, 2005). These Iranian ethnic groups are among the most underrepresented populations in the human genomic variation databases currently available. Here, we describe the design of "Iranome" as a population-specific project to address the aforementioned issues in medical genetics and precision medicine among Middle Easterners, and in particular Iranian populations. With this in mind, we established the Iranome database (www.iranome.com and/or www.iranome.ir) by performing whole exome sequencing on 800 individuals from eight major ethnic groups in Iran, with 100 healthy individuals from each ethnic group. The eight ethnic groups were as follows: Iranian-Arabs, Iranian-Azeris, Iranian-Baluchs, Iranian-Kurds, Iranian-Lurs, Iranian-Persians, Iranian-Persian Gulf Islanders, and Iranian-Turkmen. They represent over 80 million Iranians and, to some degree, the half a billion individuals who live in the Middle East.
2 | MATERIALS AND METHODS

2.1 | Editorial policies and ethical considerations

The present study obtained approval from the Biomedical Research Ethics Committee, part of the National Institute for Medical Research Development (certification number: IR.NIMAD.REC.1395.003). In addition, all of the volunteers were properly informed of the project's objective and signed consent forms approving the publication of their results anonymously, in aggregation with others.

2.2 | Project design and selection of samples

The Iranome project was designed to reflect the demographic context of the country. Iran is geographically located in southwest Asia, bordering the Caspian Sea, the Persian Gulf, and the Gulf of Oman (Figure 1b). With over 80 million residents, it is considered to be the 18th most populous country in the world, comprising 1.07% of the world population. Throughout its history, Iran has been invaded by other countries on many occasions, each making their own contributions to the gene pool of the local population (Farhud et al., 1991). In addition, this country was a passageway between the Far East and the West, as it lay on the Silk Road trade route, and as a result the Iranian people have interacted and intermarried with many different nations and races, such as Greeks, Arabs, Mongols, Turks, and other tribes. So the Iranian population is very heterogeneous and is composed of various ethnic groups (up to 26 in some reports) who live in geographically distinct regions of this vast country. Because the national census information is collected based on provinces and not on ethnicities, there are only unofficial statistics about these ethnicities, which are mostly categorized based on their language or religion and not on race or biological factors (Amanolahi, 2005; Rashidvash, 2016). Although there are some discrepancies regarding the proportion of each ethnic group, it is well known that Persians (Fars) are the dominant majority of the ethnic component in Iran, and that Azeris or Azerbaijanis comprise the second largest ethnic group (largest ethnic minority) in the country (Material S1). In some reports, the percentages of Iranian ethnicities are as follows: Persians (65%), Azeris (16%), Kurds (7%), Lurs (6%), Arabs (2%), Baluchs (2%), Turkmen (1%), Qashqai (1%), and non-Persian, non-Turkic groups such as Armenians, Assyrians, and Georgians (less than 1%). However, other reports consider Persians as the dominant population (51%), followed by Azeris (24%), Gilakis and Mazandaranis (8%), Kurds (7%), Arabs (3%), Lurs (2%), Baluchs (2%), Turkmen (2%), Zabolis (2%), and others (1%) (Banihashemi, 2009; Hassan, 2008; Majbouri & Fesharaki, 2017). There are some inconsistencies in considering the Gilaki and Mazandarani people, who live on the Caspian Sea coast, as a separate ethnicity. Some reports consider these people as originally Persian and hold that any differences present between Mazandarani and Gilaki people are due not to race but to environmental differences. However, they speak Persian dialects which are distinctive compared to those of Persian speakers from the central plateau, such as those in Tehran or Shiraz (Curtis, Hooglund, & Division, 2008; Rashidvash, 2012, 2013). The geographical separation and different historical origins of these ethnic groups play a significant role not only in building specific languages, cultures, and lifestyles, but possibly also have a bearing on their varied genetic backgrounds.
However, some reports claim that the common genetic roots of the ethnicities present in Iran have not been changed significantly by encounters with other races throughout history (Farjadian & Safi, 2013). As the objective of the Iranome project was to develop a population-specific framework of genomic variations, providing a good resource for classifying these differences and producing reasonable national health care plans, we decided to include 100 individuals from each of the following ethnic groups: Arabs, Azeris, Baluchs, Kurds, Lurs, Persians, Turkmen, and also Persian Gulf Islanders (Figure 1b). Samples were obtained through a network of local physicians in different provinces who were trained to collect samples according to the project's criteria, shown in Table 1 (see the questionnaire in Material S2). All participants, whose ancestors were born in Iran, were registered in the project anonymously upon having pure race up to at least two generations (four grandparents), and after clinical evaluation according to the completed questionnaire and the complete blood count (CBC) and urine test results. Familial relationships were also explored as far as possible, to prevent including relatives with similar genetic backgrounds in the study.

TABLE 1. Detailed information examined for each participant in the study (criterion: handling)
- Age: inclusion criterion: >30 years
- Ethnicity: inclusion criterion: pure race up to at least two generations for each ethnicity
- Sex: ~1:1 ratio in the final 100 samples from each ethnicity
- Weight: recorded (all included)
- Height: recorded (all included)
- Blood pressure: recorded as normal, hypertension, or not known (all included)
- Smoking: recorded as smoker or nonsmoker (all included)
- Record of medication use: recorded with drug information (all included)
- Disease history (cardiovascular, nephrological, urological, nervous system, gastrointestinal, endocrine and metabolic, blood, connective tissue, cancer): recorded (if any); known rare Mendelian disorders, cancer, seizure, and epilepsy were excluded; multifactorial phenotypes were not excluded
- Record of physical disability: excluded: individuals with apparent physical disability
- Disease history in parents and relatives: recorded (all included)
- Relative relationship of parents: recorded (all included)
- Twin pregnancy: recorded
- CBC and urine test results: excluded: hemoglobin < 10, glucose in urine
Abbreviation: CBC, complete blood count.

2.3 | Sequencing and data analysis

All 800 Iranian DNA samples underwent exome enrichment using the Agilent SureSelectXT Human All Exon V6 kit (Agilent Technologies Inc, Santa Clara, CA) to capture 60 Mb of the human genome, and paired-end sequencing was then performed using different Illumina sequencers (Illumina, San Diego, CA). The generated paired-end reads of 100-150 bp were aligned to the Homo sapiens (human) genome assembly GRCh37(hg19)-1KG-decoy using the Burrows-Wheeler Aligner (BWA; V0.7.5a), after proper quality control assessment using the FastQC toolkit (Li & Durbin, 2010). BAM processing was implemented by applying Picard tools (V2.2.1) and then the GATK pipeline (V3.7), adhering to best practices (Van der Auwera et al., 2013), which included marking and filtering duplicate reads, filtering low quality reads, insertion/deletion realignment, and base quality recalibration. The alignment metrics were assessed using Picard tools to perform quality control of the BAM files, followed by coverage assessment using the GATK pipeline. Variant calling of the WES samples was performed with the HaplotypeCaller module of the GATK pipeline, followed by joint genotyping. The final variant recalibration and filtering were accomplished by GATK Variant Quality Score Recalibration (VQSR). The variants identified were then annotated using the last updated versions of various databases and tools via the ANNOVAR package and the SNP and Variation Suite (SVS; Wang, Li, & Hakonarson, 2010). Statistics for all of the samples were acquired with SVS, RTG tools, VCFtools, and awk programming (Danecek et al., 2011).

2.4 | Database design

All the variants identified were made publicly available to the scientific community through a web-based genomic variation browser at http://www.iranome.com/ and http://iranome.ir/. The Iranome Browser uses the open source code developed initially for the ExAC browser by the laboratory of Dr Daniel G. MacArthur at the Broad Institute of MIT and Harvard Universities, Cambridge, MA, with some modifications made to it by Golden Helix Inc. (Bozeman, MT).
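The Section 2.3 workflow can be summarized, in heavily simplified form, by the following Python sketch that shells out to the tools named above (BWA, Picard, GATK 3.x). All paths, sample names, and thread counts are placeholders, and several steps of the actual pipeline (indel realignment, base quality recalibration, VQSR) are only indicated in comments.

```python
# Hedged sketch of the Section 2.3 workflow: BWA alignment, Picard
# duplicate marking, GATK 3.x per-sample gVCFs, then joint genotyping.
import subprocess

REF = "GRCh37_1kg_decoy.fasta"  # placeholder reference path

def run(cmd: str) -> None:
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

def align_and_dedup(sample: str) -> str:
    # Map paired-end reads, then sort and mark duplicates with Picard.
    run(f"bwa mem -t 8 -R '@RG\\tID:{sample}\\tSM:{sample}' "
        f"{REF} {sample}_R1.fastq.gz {sample}_R2.fastq.gz > {sample}.sam")
    run(f"java -jar picard.jar SortSam I={sample}.sam "
        f"O={sample}.sorted.bam SORT_ORDER=coordinate")
    run(f"java -jar picard.jar MarkDuplicates I={sample}.sorted.bam "
        f"O={sample}.dedup.bam M={sample}.dup_metrics.txt")
    # (Indel realignment and base quality recalibration would follow.)
    return f"{sample}.dedup.bam"

def joint_genotype(samples: list) -> None:
    # Per-sample gVCFs with HaplotypeCaller, then joint genotyping.
    for s in samples:
        bam = align_and_dedup(s)
        run(f"java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R {REF} "
            f"-I {bam} --emitRefConfidence GVCF -o {s}.g.vcf.gz")
    gvcfs = " ".join(f"--variant {s}.g.vcf.gz" for s in samples)
    run(f"java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R {REF} "
        f"{gvcfs} -o cohort.vcf.gz")
    # (VQSR would then recalibrate and filter the joint call set.)

if __name__ == "__main__":
    joint_genotype(["sample001", "sample002"])
```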
2.5 | Analysis of the population genetic structure

The Eigenstrat method was used for analysis of the population genetic structure among the samples. In this method, principal components analysis (PCA) is applied to SNVs to infer continuous axes of genetic variation. To avoid clustering of individuals based on regions of linkage disequilibrium (LD), SNPs in two known high-LD regions (the human leukocyte antigen (HLA) region on chromosome 6 and a polymorphic chromosome 8 inversion) and dependent SNPs with r2 ≥ 0.2 over a shifting window of 500 kb were excluded, and the remaining genetic variants were used for PCA. The eigenvectors of the first two PCs with the largest eigenvalues were plotted for the individuals of each population to visualize the genetic structure of the different ethnic groups in comparison with each other. Then, to compare the genetic structure of the Iranian population with other populations, the common variants between the Iranome database and the 26 populations of the 1KG database were pooled together and PCA was applied to the pooled database.

2.6 | Runs of homozygosity identification

The Golden Helix SVS algorithm (version 8.8.3) was used for the runs of homozygosity (ROH) estimation, using the WES data for all 800 individuals. First, stringent filtering was applied: variants which did not pass the following quality criteria were removed: VQSR filtering, number of supporting reads (≥10 reads), genotype quality (≥40), and alternate allele read ratio (≤0.15 for Ref_Ref variants, >0.3 and <0.7 for Alt_Ref variants, and >0.85 for Alt_Alt variants). Then, variants located on chromosomes X and Y were excluded. The filtered list of variants was used to assess all of the possible runs per sample based on the following criteria specified to the algorithm: the minimum run length was taken as 500 kb, with a minimum of 25 variants per run; also, one heterozygous and five missing calls per run were allowed. Next, a second algorithm provided a clustered list of ROHs shared by at least 20 samples in the data set. Finally, the highly overlapping ROH regions were calculated as optimal ROH clusters.
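As an illustration of the run-scanning criteria just described, here is a minimal pure-Python sketch. It is not the Golden Helix SVS implementation; the genotype encoding (0/2 = homozygous, 1 = heterozygous, None = missing) and the greedy restart at each breakpoint are simplifying assumptions.

```python
# Hedged sketch of ROH detection with the stated thresholds:
# runs of at least 500 kb and 25 variants, allowing at most
# 1 heterozygous and 5 missing calls per run.
MIN_LENGTH_BP = 500_000
MIN_VARIANTS = 25
MAX_HET, MAX_MISSING = 1, 5

def find_roh(positions, genotypes):
    """Greedy scan of one sample's sorted per-chromosome variants."""
    runs = []
    start_idx, het, missing = 0, 0, 0

    def close(end_idx):
        # Record the window [start_idx, end_idx] if it meets both
        # the length and the variant-count thresholds.
        if end_idx >= start_idx:
            length = positions[end_idx] - positions[start_idx]
            count = end_idx - start_idx + 1
            if length >= MIN_LENGTH_BP and count >= MIN_VARIANTS:
                runs.append((positions[start_idx], positions[end_idx]))

    for i, gt in enumerate(genotypes):
        het += (gt == 1)
        missing += (gt is None)
        if het > MAX_HET or missing > MAX_MISSING:
            close(i - 1)  # end the run before the offending call
            # Restart at the current variant, counting its own status.
            start_idx = i
            het, missing = int(gt == 1), int(gt is None)
    close(len(genotypes) - 1)
    return runs

if __name__ == "__main__":
    import random
    random.seed(1)
    pos = sorted(random.sample(range(1, 2_000_000), 200))
    # Homozygous for the first ~1.2 Mb, noisy afterwards.
    gts = [0 if p < 1_200_000 else random.choice([0, 1, 2]) for p in pos]
    print(find_roh(pos, gts))
```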
3 | RESULTS

3.1 | Demography of Iranome project samples

The total number of samples included in the project was 800 individuals who were not suffering from severe rare Mendelian disorders. Table 2 provides a summary of the demographic information for the Iranome samples. The approximate 1:1 ratio of females to males was reflected in the overall list of samples as well as in each ethnic group. To decrease the bias introduced by late-onset Mendelian disorders, samples were selected from individuals who were >30 years old. The mean age of individuals at blood draw was 50.61 years (standard deviation [SD] 9.33 years); the mean age of female individuals was 50.65 (SD 9.08) years, and the mean age of male individuals was 50.59 (SD 9.56) years. The age range of individuals included in the project was 30-84 years. Also, as mentioned before, representative sampling was according to the ethnicities and not the geographical regions. The provinces in which samples from each ethnicity were selected are shown in Table 2 and also in Figure 1b.

TABLE 2. Demographic information of Iranome samples: number (n) and age (mean ± SD, years) of female, male, and total participants, and the provinces in which samples were taken
- Arab: 44 female (50.05 ± 9.14), 56 male (46.79 ± 8), 100 total (48.22 ± 8.63); Khuzestan
- Azeri: 44 female (49.52 ± 9.27), 56 male (50.48 ± 7.93), 100 total (50.06 ± 8.51); Eastern Azerbaijan, Western Azerbaijan, Ardebil, Tehran, Zanjan
- Baluch: 40 female (46.87 ± 11.73), 60 male (47.55 ± 11.54), 100 total (47.28 ± 11.56); Sistan & Baluchistan
- Kurd: 48 female (48.94 ± 7.13), 52 male (49.77 ± 6.37), 100 total (49.37 ± 6.72); Kurdistan, Kermanshah
- Lur: 59 female (50.77 ± 8.69), 41 male (53 ± 11.8), 100 total (51.69 ± 10.08); Lorestan, Fars, Kohgiluyeh and Boyer-Ahmad
- Persian: 50 female (52.24 ± 9.09), 50 male (54.08 ± 9.67), 100 total (53.16 ± 9.39); Fars, Semnan, Tehran, Bushehr, Khuzestan, Razavi-Khorasan
- Persian Gulf Islanders: 50 female (52.38 ± 7.73), 50 male (51.98 ± 8.17), 100 total (52.18 ± 7.92); Hormozgan
- Turkmen: 48 female (53.44 ± 8.84), 52 male (55.2 ± 10.01), 100 total (52.96 ± 9.43); Golestan
- Total: 383 female (50.65 ± 9.08), 417 male (50.59 ± 9.56), 800 total (50.61 ± 9.33); Iran

3.2 | Genomic structure of the Iranian population

The genetic structure and ancestry among the seven ethnicities and also the Persian Gulf Islanders across Iran were estimated using principal components analysis (PCA), and the population clusters are shown in Figure 2. The population clusters of the Arab, Azeri, Kurd, Lur, and Persian ethnicities look genetically very similar to each other (Figure 2a), which is more distinctive in Arabs and Azeris (Figure 2b). The other three populations, including Baluchs, Turkmen, and Persian Gulf Islanders, are genetically more distinct from the other five, which may be explained by the separation of these groups from the rest of the population through geographical and cultural isolation (Figure 2a). Comparison of the Iranian population to the five super populations in the 1KG project (African, American, East Asian, European, and South Asian) showed that the population clusters of Arabs, Azeris, Kurds, Lurs, and Persians are genetically distinct, and these should probably be considered to be a sixth super population (the main Iranian cluster) with its own genetic background, distinct from the other five already known super populations (Figure 2c). Interestingly, this main Iranian cluster is located between Europeans and South Asians, as is predictable from their geographical locations. In comparison to the five super populations of the 1KG project, the Baluchs and Persian Gulf Islanders are located genetically between the main Iranian cluster and South Asians. In addition, Turkmen are located between the main Iranian cluster and East Asians (Figure 2d). Additional PCs for each of the abovementioned analyses are shown in Figures S1 and S2.

Figure 2: Results of principal components analysis (PCA) performed on the eight groups studied in the Iranome project; each color shows an ethnicity cluster. (a) PCA of the seven Iranian ethnicities and also Persian Gulf Islanders (PGI) shows a common overlapping cluster for Arabs, Azeris, Kurds, Lurs, and Persians, while Baluchs, Turkmen, and Persian Gulf Islanders occur in separate clusters (PC2 and PC3 are shown). (b) PCA results for five Iranian ethnicities (Arabs, Azeris, Kurds, Lurs, and Persians), showing a similar genetic background, although Arabs and Azeris are more distinctive from the other three ethnic groups (PC2 and PC3 are shown). (c) Comparison of the Iranian population (shown as a super population including Arabs, Azeris, Kurds, Lurs, and Persians) with the five super populations of the 1KG project (PC1 and PC3 are shown). (d) Comparison of the Iranian population (seven Iranian ethnicities and also Persian Gulf Islanders) with the five super populations of the 1KG project (PC1 and PC3 are shown).
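For readers who want to reproduce this style of analysis, the following is a minimal sketch of PCA on an LD-pruned genotype matrix. It uses scikit-learn rather than the Eigenstrat software named in Section 2.5, and the input encoding (samples × variants, alternate-allele counts 0/1/2) is an assumption.

```python
# Hedged sketch: PCA of a genotype matrix to visualize population
# structure, standing in for the Eigenstrat analysis of Section 2.5.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless plotting
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_structure(X, labels):
    """X: samples x variants matrix of 0/1/2 allele counts, LD-pruned;
    labels: one population name per sample."""
    Xc = X - X.mean(axis=0)  # center each variant before projection
    pcs = PCA(n_components=4).fit_transform(Xc)
    for group in sorted(set(labels)):
        idx = [i for i, g in enumerate(labels) if g == group]
        plt.scatter(pcs[idx, 1], pcs[idx, 2], s=10, label=group)
    plt.xlabel("PC2")
    plt.ylabel("PC3")  # the axes shown in Figure 2a
    plt.legend()
    plt.savefig("population_structure.png", dpi=150)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(100, 5000)).astype(float)
    labels = ["GroupA"] * 50 + ["GroupB"] * 50
    plot_structure(X, labels)
```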
3.3 | The Iranome data set

For the 800 samples sequenced in this project, the mean depth of coverage for the exons of the human genome, based on CCDS Release 15, was 84X, with 97% and 93% coverage at 10X and 20X or more, respectively. In total, we identified 1,575,702 variants within the protein coding regions captured by the SureSelect Human All Exon V6 kit which passed a filter based on the following quality metrics: VQSR filtering (using the 99.0 tranche), depth of coverage (>4 reads for Ref_Ref and Alt_Alt variants, >8 reads for Ref_Alt variants), genotype quality (≥15), alternate allele read ratio (≤0.15 for Ref_Ref variants, >0.25 and <0.7 for Alt_Ref variants, and >0.8 for Alt_Alt variants), and strand bias estimated using Fisher's exact test (FisherStrand <30). These high quality variants included 1,332,298 SNPs and 243,404 insertions/deletions (indels), and represent one variant in approximately every 38 bp of the captured 60 Mbp exome interval. Among these 1,575,702 variants, 52.5% were singletons, and 308,311 variants (including 240,256 SNPs and 68,055 indels) had no record in the following public databases: the dbSNP catalog (version 149), the dbSNP Common catalog (version 151), the ExAC database, the gnomAD database, the NHLBI ESP6500 database, 1KG Phase 3, the Avon Longitudinal Study of Parents and Children (ALSPAC) data set, the UK10K Twins data set, and the TOPMed data set; therefore, they were considered to be novel variants, representing 19.6% of the entire set of detected variants (Table 3). As expected, the majority of these novel variants were singletons (81%). On the other hand, an additional 50% (793,806) of the detected variants were observed in public databases but with an allele frequency of less than 0.01 (rare variants). Therefore, about 70% of the variants identified in this data set belong to the category of rare/novel variants (Table 3). However, among these 1,102,117 rare/novel variants, we identified 37,384 variants (3.4%) with an alternate allele frequency of greater than 1% in the Iranome data set (Figure 3b). Therefore, in addition to introducing 308,311 novel variants to the catalog of human genome variation, the Iranome database can improve the power of molecular diagnosis by showing an alternative allele frequency of higher than 1% for 37,384 novel or previously known rare variants. The average number of genomic variations per Iranian individual within the regions covered by the SureSelect Human All Exon V6 kit was 92,162.
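Summary statistics of this kind, including the Ti/Tv ratio reported in the next paragraph, can be computed directly from the call set. A minimal sketch for counting transitions and transversions from a VCF follows; "cohort.vcf" is a placeholder path, and multi-allelic records are simply skipped.

```python
# Hedged sketch: compute the transition/transversion (Ti/Tv) ratio
# of biallelic SNVs from a VCF file.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def ti_tv(vcf_path: str) -> float:
    ti = tv = 0
    with open(vcf_path) as vcf:
        for line in vcf:
            if line.startswith("#"):
                continue  # skip header lines
            fields = line.split("\t")
            ref, alt = fields[3], fields[4]
            if len(ref) == 1 and len(alt) == 1:  # biallelic SNV only
                if (ref, alt) in TRANSITIONS:
                    ti += 1
                else:
                    tv += 1
    return ti / tv if tv else float("nan")

if __name__ == "__main__":
    print(f"Ti/Tv = {ti_tv('cohort.vcf'):.2f}")
```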
This is also shown in Table 4 for all eight ethnicities investigated. In addition, the average numbers of singletons, transitions, and transversions per individual in the Iranome database were 796.67, 55,481.25, and 23,340, respectively. This represents a mean Ti/Tv ratio of 2.38 in the data set. The total number of genetic variations did not differ significantly among the Iranian ethnic groups studied. Also, all ethnicities had similar proportions of the total number of novel genetic variations detected in the Iranian population, with the highest detected in Azeris and the lowest in Turkmen (Figure 3a,c). In addition, Baluchs and Turkmen had the lowest percentages of ethnic-specific novel variants among the total detected novel variants in the Iranian population, whereas the other ethnic groups showed nonsignificant differences. The detailed statistics for each ethnic group are shown in Table 4. This indicates that, although notable differences can be observed among these eight ethnicities in terms of their appearance, language, and geographical location, they are genetically similar and contribute equally to the genetic pool of the Iranian population. So, we propose that the application of this current database can be extended to the other ethnic minorities and the Iranian population in general.

TABLE 3. Proportion of variants identified in the Iranome project based on alternate allele frequency (AF) and variant type. Columns: Total / Novel / Rare (AF < 0.01) / Common (AF ≥ 0.01)
All variants: 1,575,702 / 308,311 (19.57%) / 793,806 (50.38%) / 473,585 (30.05%)
Annotated by RefSeq Genes 105 Interim v1, NCBI: 1,440,997 / 280,691 / 733,334 / 426,972
LoF effect:
- exon_loss_variant: 4 / 2 / 0 / 2
- stop_lost: 316 / 82 / 173 / 61
- stop_gained: 4,805 / 1,333 / 2,949 / 523
- initiator_codon_variant: 703 / 178 / 388 / 137
- frameshift_variant: 11,065 / 6,332 / 3,586 / 1,147
- splice_acceptor_variant: 1,773 / 517 / 989 / 267
- splice_donor_variant: 1,583 / 507 / 883 / 193
- Total: 20,249 (1.4%) / 8,951 / 8,968 / 2,330
Missense effect:
- 5'-UTR_premature_start_codon_gain_variant: 1,926 / 318 / 1,205 / 403
- disruptive_inframe_deletion: 189 / 79 / 83 / 27
- disruptive_inframe_insertion: 60 / 18 / 34 / 8
- missense: 238,922 / 39,443 / 152,877 / 46,602
- inframe_deletion: 5,871 / 1,139 / 3,137 / 1,595
- inframe_insertion: 3,103 / 581 / 1,628 / 894
- Total: 250,071 (17.35%) / 41,578 / 158,964 / 49,529
Other effects:
- 5'_UTR_variants: 39,225 / 8,452 / 19,904 / 10,869
- 3'_UTR_variants: 56,794 / 11,431 / 27,864 / 17,499
- intron_variant: 878,672 / 187,215 / 402,483 / 288,974
- synonymous_variant: 161,893 / 18,062 / 97,670 / 46,161
- splice_region_variant: 33,944 / 4,967 / 17,410 / 11,567
- stop_retained_variant: 149 / 35 / 71 / 43
- Total: 1,170,677 (81.24%) / 230,162 / 565,402 / 375,113
Abbreviation: LoF, loss of function.

Figure 3: (a) Relative geographical distributions of individuals from each ethnicity, shown by color intensity on maps of Iran (the maps are based on a 2010 poll and are retrieved from Wikimedia Commons, the free media repository, under the terms of the GNU Free Documentation License). The area of each pie chart under each map represents the total number of variants identified in each ethnicity and in the Persian Gulf Islanders; each pie chart is divided into three slices, showing known variants (darkest blue), common novel variants (medium blue), and ethnic-specific novel variants (light blue). (b) Portion of frequent variants (MAF > 0.01) in the Iranome data set among the variants that are rare or novel compared to public databases. (c) Proportion of known and novel variants per ethnicity.

TABLE 4. Proportion of variants identified in the Iranome project based on ethnicity, together with the average number of genomic variations per individual in each of the eight ethnic groups. Columns: No. of individuals / All variants detected in the ethnic group / Average no. of variants per individual / Novel variants detected / Novel variants as % of all variants detected in the ethnic group / Ethnic-specific novel variants / Ethnic-specific novel variants as % of all novel variants in Iranome
- Arab: 100 / 578,143 / 94,066 / 51,842 / 8.97% / 28,808 / 9.34%
- Azeri: 100 / 526,479 / 91,486 / 51,794 / 9.84% / 29,880 / 9.69%
- Baluch: 100 / 526,858 / 105,071 / 47,049 / 8.93% / 20,961 / 6.80%
- Kurd: 100 / 502,970 / 94,135 / 50,057 / 9.95% / 26,399 / 8.56%
- Lur: 100 / 482,663 / 87,215 / 47,056 / 9.75% / 23,861 / 7.74%
- Persian: 100 / 490,559 / 85,060 / 47,017 / 9.58% / 25,611 / 8.31%
- Persian Gulf Islanders: 100 / 564,481 / 103,916 / 48,281 / 8.55% / 22,020 / 7.14%
- Turkmen: 100 / 462,059 / 76,351 / 38,891 / 8.42% / 20,509 / 6.65%
- All ethnicities: 800 / 1,575,702 / 92,162 / 308,311 / - / - / -

3.4 | Functional annotation of variants in the Iranome data set

The genomic variations detected in the Iranome database were annotated with RefSeq Genes 105 Interim v1, NCBI, which led to the identification of 1,440,997 variants annotated on verified mRNA transcripts, of which 426,972 (29.63%) were frequently observed in public genomic databases with a frequency of greater than or equal to 0.01, 733,334 (50.89%) were rarely observed with a frequency of less than 0.01, and an additional 280,691 (19.47%) were novel (Table 3; Figure 4a). Based on functional variant effects, all of these variants were categorized into three main groups: variants with a loss-of-function (LoF) effect, constituting 1.4% of the total variants, and variants with a missense effect and variants with other effects, constituting 17.35% and 81.24%, respectively (Figure 4a). These three groups were subcategorized into 19 different sequence ontology terms, as described in Table 3. In group 1 (LoF effects), 54.64% of the variants were frameshifts, whereas in group 2 (missense effects), 95.54% were missense variants. The most frequent variants in group 3 (other effects) were intronic variants (75.06%). As shown in Table 3, variants with a LoF effect were more prevalent among the rare/novel categories (Figure 4a,b). In total, LoF variants constituted 1.4% (20,249 variants) of the database, of which 21% were located in Online Mendelian Inheritance in Man (OMIM) genes with a reported associated phenotype. These LoF variants were located in 8,365 unique genes, of which 75.5% were registered in the OMIM database. In addition, while the number of LoF variants was significantly decreased in the common variants category, the percentage of LoF variants located in OMIM genes with a reported associated phenotype did not differ significantly among the three categories of novel, rare, and common variants (Figure 4c).

Figure 4: (a) Proportions of the three main groups of variants (loss-of-function (LoF) effect, missense effect, and other effects), categorized by allele frequency. (b) Pie charts showing the proportions of some types of functional variants by allele frequency. (c) Distribution of the identified LoF variants based on their frequency in public databases, their location in OMIM genes, and the genes with OMIM phenotypes.
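A minimal sketch of how annotated variants can be bucketed into the three effect groups used above. The term sets follow the sequence-ontology labels listed in Table 3; exact term spellings vary between annotation tools, so the sets shown are an assumption.

```python
# Hedged sketch: group variants into the LoF / missense / other
# categories of Table 3 from their sequence-ontology effect terms.
from collections import Counter

LOF_TERMS = {
    "exon_loss_variant", "stop_lost", "stop_gained",
    "initiator_codon_variant", "frameshift_variant",
    "splice_acceptor_variant", "splice_donor_variant",
}
MISSENSE_TERMS = {
    "5_prime_UTR_premature_start_codon_gain_variant",
    "disruptive_inframe_deletion", "disruptive_inframe_insertion",
    "missense_variant", "inframe_deletion", "inframe_insertion",
}

def classify(effect: str) -> str:
    if effect in LOF_TERMS:
        return "LoF"
    if effect in MISSENSE_TERMS:
        return "missense"
    return "other"  # UTR, intronic, synonymous, splice_region, ...

if __name__ == "__main__":
    effects = ["missense_variant", "stop_gained", "intron_variant"]
    print(Counter(classify(e) for e in effects))
```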
3.5 | Genomic structure of the Iranome data set compared to the Greater Middle East Variome data set

Among the final list of 308,311 novel variants identified in the Iranian population through this project, we aimed to clarify the degree of population specificity by comparing these variants with the Greater Middle East (GME) Variome data set, which includes samples from populations in geographical regions closer to Iran than those of the other public databases. Contrary to expectations, we found only about 1,896 (0.6%) of the novel variants overlapping with the GME data set (including 601 variants with an allele frequency greater than 0.01 and 1,295 variants with allele frequencies < 0.01). So, apparently, most of the novel variations identified in the Iranome project are not represented outside the corresponding population, even in closely located geographical regions. In fact, this is another indication of the importance of such ethnic-specific databases in the clinical setting and in molecular diagnosis.

3.6 | Known pathogenic/likely pathogenic variants observed in the Iranome data set

The contribution of known pathogenic variants in the Iranome database was assessed using the latest update of the ClinVar database (last updated on 2019-01-01). In total, 50,030 variants from the Iranome database were classified in ClinVar, of which 721 were reported to be pathogenic and/or likely pathogenic. We assessed the allele frequency of these variants in the Iranian population: 668 of them (92.6%) were rare variants with an alternative allele frequency of less than or equal to 1%, and only 53 variants had an alternate allele frequency of greater than 1%. Further in-depth investigation of these frequent pathogenic variants, as well as of the rare ones appearing as homozygous (for rare recessive disorders) or heterozygous (for rare dominant disorders), along with exclusion of the variants known to be pathogenic for phenotypes with a chance of being present in the Iranome database (refer to Table 1), led to the identification of 12 reported pathogenic/likely pathogenic variants in the ClinVar database that were suggested to cause rare Mendelian disorders although they were seen in apparently normal individuals of the Iranome database. The additional information provided by the Iranome data set resulted in the reclassification of these variants as variants of uncertain significance according to the American College of Medical Genetics and Genomics (ACMG) guideline (Table S1; Richards et al., 2015).

3.6.1 | Iranome adds uncertainty to the pathogenicity of four variants observed with rare allele frequency in other similar public databases

The rare missense variant, p.Val1081Met, in the KDM5C gene is recognized as a pathogenic variant in the ClinVar database (with no supportive evidence) for mental retardation, X-linked, syndromic, Claes-Jensen type, XLR (MIM# 300534). This variant was detected as hemizygous in one of the Persian individuals in the Iranome database.
Additional clinical follow-up revealed that this individual was apparently normal and unlikely to be affected by this phenotype. Therefore, according to the ACMG guidelines, this variant should be reclassified as a variant of uncertain significance, based on the observation of a healthy adult individual in the Iranome database.

The rare missense variant, p.Ala338Val, in the ALDOB gene is recognized as a pathogenic/likely pathogenic variant in the ClinVar database for fructose intolerance, hereditary, AR (MIM# 229600; Davit-Spraul et al., 2008; Esposito et al., 2010). This variant was identified as homozygous in one of the Baluch individuals in the Iranome database. Additional clinical follow-up revealed that this individual was apparently normal and unlikely to be affected by this phenotype; therefore, this variant should be reclassified as a variant of uncertain significance, based on the observation of a healthy adult individual in the Iranome database.

The rare heterozygous missense variant, p.Arg848Gln, in the KIF1A gene is recognized as likely pathogenic in the ClinVar database (with no supportive evidence) for mental retardation, autosomal dominant 9 (MIM# 614255). Two heterozygous individuals (both Persian) were identified in the Iranome database. These individuals were apparently normal (unlikely to be affected by mental retardation, autosomal dominant 9); therefore, this variant should be reclassified as a variant of uncertain significance, based on the observation of two healthy adult individuals in the Iranome database.

The rare heterozygous stop-gain variant, p.Tyr206Ter, in the SCN8A gene is recognized as likely pathogenic in the ClinVar database (with no supportive evidence) for cognitive impairment with or without cerebellar ataxia, AD (MIM# 614306). Four heterozygous Lur individuals were identified in the Iranome database. These individuals were apparently normal (unlikely to be affected by cognitive impairment with or without cerebellar ataxia); therefore, this variant should be reclassified as a variant of uncertain significance, based on the observation of four healthy adult individuals in the Iranome database.

3.6.2 | Iranome adds uncertainty to the pathogenicity of two variants with homozygous genotypes but not in trans with other causative variants

The rare homozygous variant, p.Gly674Arg, in the WFS1 gene was observed in an individual of Arab ethnicity who was apparently normal (unlikely to be affected by Wolfram syndrome). The variant is considered to be pathogenic/likely pathogenic (Khanim, Kirk, Latif, & Barrett, 2001), but our data support the idea that this variant is a polymorphism in the homozygous state and that it can be considered causative only when occurring in trans with other variants in the WFS1 gene (Häkli, Kytövuori, Luotonen, Sorri, & Majamaa, 2014). This variant should also be reclassified as of uncertain significance in its homozygous state.
The next rare homozygous variant, p.Phe55Leu, in the PAH gene has similarly been reported as compound heterozygous along with another pathogenic variant in several patients presenting different types of PAH-related disorders, especially mild phenylketonuria (PKU) and hyperphenylalaninemia (HPA). The observation of one homozygous individual (Persian) in the Iranome database who was apparently normal for this phenotype supports the view that this variant should be reclassified as of uncertain significance in its homozygous state.

3.6.3 | Iranome confirms the uncertainty about GPR161 being responsible for pituitary stalk interruption syndrome

Karaca et al. (2015) reported a homozygous missense variant, p.Leu19Gln, in the GPR161 gene as a potential novel cause of pituitary stalk interruption syndrome. In addition, this variant is recognized as likely pathogenic in the ClinVar database. However, the observation of homozygous individuals in public databases brought its contribution to this phenotype into question. Two homozygous Baluch individuals were also observed in the Iranome database. Additional clinical follow-up revealed that these individuals were apparently normal and unlikely to be affected by pituitary stalk interruption syndrome. Therefore, this variant should be reclassified as a variant of uncertain significance according to the ACMG guideline (see Table S1 for supporting evidence).

3.6.4 | Iranome adds uncertainty to the pathogenicity of a ZC3H14 variant but does not exclude this gene as the cause of mental retardation, autosomal recessive 56

Pak et al. (2011) reported ZC3H14 as a novel causative gene in two families presenting nonsyndromic intellectual disability. They confirmed the expression of this gene in adult and fetal human brain and showed a critical role of its ortholog in normal Drosophila melanogaster development and neuronal function. The homozygous 25 bp intronic deletion, c.2204+8_2204+32del25, observed in the second family is recognized as pathogenic in the ClinVar database. However, the observation of homozygous individuals in public databases brought its contribution to this phenotype into question. The two homozygous individuals from the Iranome database belong to the Arab and Baluch ethnicities. Additional clinical follow-up was carried out: these individuals were apparently normal (unlikely to be affected by mental retardation, autosomal recessive 56). Therefore, this variant should be reclassified as a variant of uncertain significance according to the ACMG guideline (see Table S1 for supporting evidence).

3.6.5 | Iranome adds uncertainty to the pathogenicity of a CHD4 variant but does not exclude this gene as the cause of Sifrim-Hitz-Weiss syndrome

Sifrim et al. (2016) identified the de novo variant p.Val1608Ile in the CHD4 gene in a patient with Sifrim-Hitz-Weiss syndrome, AD (MIM# 617159). However, the observation of heterozygous individuals in public databases, as well as of three heterozygous individuals in the Iranome database (of Lur ethnicity and Persian Gulf Islanders), does not support its pathogenicity. Therefore, this variant should be reclassified as a variant of uncertain significance, based on the observation of three healthy adult individuals in the Iranome database.
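The reasoning applied throughout Section 3.6 can be expressed as a simple screen: flag ClinVar pathogenic or likely pathogenic calls that are contradicted by genotypes observed in healthy Iranome individuals. A minimal sketch follows, with invented record fields and a deliberately simplified reading of the ACMG BS2 criterion.

```python
# Hedged sketch of the Section 3.6 screen: a ClinVar
# pathogenic/likely-pathogenic variant observed with a fitting
# genotype (homozygous for recessive; heterozygous or hemizygous for
# dominant or X-linked) in healthy adults argues for downgrading it
# to "uncertain significance" (cf. ACMG criterion BS2). Record
# fields here are invented for illustration.
def flag_for_reclassification(variant: dict) -> bool:
    if variant["clinvar"] not in {"pathogenic", "likely_pathogenic"}:
        return False
    inheritance = variant["inheritance"]  # "AR", "AD", or "XL"
    if inheritance == "AR":
        contradicted = variant["healthy_hom_count"] > 0
    else:  # AD or XL: even carriers should be affected
        contradicted = variant["healthy_het_or_hemi_count"] > 0
    return contradicted

if __name__ == "__main__":
    example = {
        "gene": "ALDOB", "hgvs_p": "p.Ala338Val",
        "clinvar": "pathogenic", "inheritance": "AR",
        "healthy_hom_count": 1, "healthy_het_or_hemi_count": 4,
    }
    print(flag_for_reclassification(example))  # True -> review as VUS
```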
These three classes correspond to ancient haplotypes, population background relatedness, and recent parental relatedness, respectively (Pemberton et al., 2012). ROH regions are considered to have functional significance in populations and are regarded as potential target regions with a tendency to harbor variants for rare and common disorders (Magi et al., 2014). Long ROHs are a consequence of recent parental relatedness and can therefore be observed more frequently in inbred populations, and they are more likely to surround causative variants in individuals coming from such populations (Hu et al., 2018). Due to the high rate of consanguinity in the Iranian population, we assessed the ROHs of Iranian individuals using WES data to determine the autozygome map of Iranian individuals. In total, 1,446 highly overlapping ROH clusters were calculated (optimal clusters), of which 35.3% were short, 55.3% were of intermediate length, and 9.4% were long ROHs (Table S2). The distribution of these ROHs based on their length in the Iranome database and for each ethnicity is shown in Figure 5a. The longest ROH was about 12 Mb and was located on chromosome 16, encompassing 28 genes, all of which, surprisingly, were pseudogenes. We expected an increased burden and length of ROHs in the Iranian population similar to what was observed in the GME database. When the percentage of each ROH category in Iranome was compared with the approximately 55%, 35%, and 10% of short, intermediate, and long ROHs reported in different 1KG populations (Pippucci, Magi, Gialluisi, & Romeo, 2014), we observed an increase of 20.3 percentage points in intermediate ROHs and a decrease of 19.7 percentage points in short ROHs, compatible with expectations. Interestingly, we observed no significant difference in the proportion of long ROHs. The overall rate of consanguineous marriage in Iran is estimated at 38.6%, with an approximately equal distribution across ethnicities and the highest rate observed in Baluchs (Saadat, Ansari-Lari, & Farhud, 2004). We assessed the number of ROHs for each ethnicity and compared their relative distributions to see whether they reflect the distribution pattern of consanguinity in each ethnic group (Figure 5a,b). Interestingly, the largest number of ROHs was observed in Baluchs and Persian Gulf Islanders, the two ethnicities that formed their own clusters separate from the rest of the population in PCA, showing an increased burden of ROHs due to inbreeding but not an increased ROH length compared with the other ethnicities. The highest percentages of long ROHs were observed in the Turkmen, Persian, Lur, and Arab ethnicities, who had 49%, 45%, 39.3%, and 49% consanguineous marriage rates, respectively (Saadat et al., 2004).

3.8 | Web-based interface of the Iranome database

The Iranome project has produced a reference panel of genomic variations in Iran. The database provides the allele frequency of all variants identified in the Iranian population and includes the genomic coordinates of the corresponding alternate alleles; variant quality scores (including the overall quality score, BaseQRankSum, read depth (DP), Fisher's strand bias (FS), mapping quality (MQ), MQRankSum, quality by depth (QD), and ReadPosRankSum); the corresponding dbSNP ID; the alternate allele counts and frequencies; the genotype (GT) and genotype quality (GQ); and the number of heterozygotes and homozygotes in all 800 samples as well as in each ethnic group.
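These quality annotations follow standard VCF conventions (Danecek et al., 2011) and GATK-style INFO keys (Van der Auwera et al., 2013). Purely as an illustration (the record below is invented, and this is not Iranome code), extracting such fields from one VCF line might look like this:

    # Invented VCF record for illustration; field names follow VCF/GATK conventions.
    record = ("16\t2103447\trs123456\tG\tA\t1771.77\tPASS\t"
              "DP=85;FS=2.632;MQ=60.00;QD=21.95;MQRankSum=0.12;ReadPosRankSum=-0.44\t"
              "GT:GQ\t0/1:99")

    cols = record.split("\t")
    info = dict(item.split("=", 1) for item in cols[7].split(";"))  # INFO key=value pairs
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))      # FORMAT -> sample values

    print(info["QD"], info["FS"], info["MQ"])  # 21.95 2.632 60.00
    print(sample["GT"], sample["GQ"])          # 0/1 99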
Each variant was annotated, and its clinically relevant information, including the transcript name, Human Genome Variation Society (HGVS) nomenclature, sequence ontology, exon number, gene name, and predictions from most of the well-known bioinformatics programs and algorithms, was made available. To provide rapid and free access to these data, a web-based interface is provided at the following two links: http://www.iranome.com/ and http://iranome.ir/. The web server is user-friendly and provides a search box on its home page for exploring variants by genomic position, gene symbol, transcript, or dbSNP ID (rs number), or by providing the coordinates of a genomic region of interest. The gene pages list all variants identified in the entire data set; variant classification is based on the most clinically relevant transcript of a gene, or on the longest transcript when the clinically relevant one is not known. Each variant is hyperlinked to its own page. The variant page provides the following details: the allele frequencies of the variant in all ethnic groups studied, shown separately in a table, together with functional and clinical annotations, bioinformatics prediction scores, and quality metrics for the specified variant. The variant page also provides useful links to the corresponding public databases, such as dbSNP, the UCSC genome browser, ClinVar, the ExAC Browser, the NHLBI ESP database, Ensembl, and the gnomAD Browser, providing additional information for the specified variant (Figure 6).

4 | CONCLUSION

4.1 | Impact of the Iranome project at the national level

For the best characterization of rare variants in a population, sequencing as many individuals as possible is critical. To fulfill this objective, in phase I of the Iranome project we sequenced 800 individuals selected from the main Iranian ethnic groups. We released a comprehensive catalog of Iranian genomic variations and constructed the Iranome database, the first public database of allele frequencies of genomic variants in the Iranian population, based on whole exome sequencing of a substantial number of individuals from the main ethnic groups. Approximately 70% of the variants identified in our database were novel or had frequencies of less than 1% (rare variants) in public databases. Given the prominence of such variants in the diagnosis and management of patients suffering from rare Mendelian disorders, the Iranome database offers a comprehensive healthcare resource at the national level by providing population-specific allele frequencies of such variants. The data are also accessible from an ethnicity-specific viewpoint, which can be useful when interpreting the variants identified in patients from specific Iranian ethnic groups. The Iranian plateau has been exposed to invasions by different peoples throughout history, including incursions of nomads from the central Asian steppes, Arab Muslims, Seljuq Turks originating from Oghuz tribes, and then Mongols. This large number of invasions and migrations played a major role in generating the diverse demographic structure of the Iranian plateau, which was apparently shaped by the resulting gene flows.
This genetic diversity is confirmed by mtDNA sequencing analysis (Derenko et al., 2013), which also proposed a "common maternal ancestral gene pool" shared by Iranian people speaking Indo-Iranian languages and the Turkic-speaking Qashqais. This is in line with the results obtained from the principal components analysis of Iranome samples, where, apart from the Iranian Baluch, Turkmen, and Persian Gulf Islander populations, which form their own clusters, the remaining populations are genetically very similar. We also observed that the proportion of variants, and in particular of ethnic-specific novel variants, did not differ among the ethnic groups investigated. In addition, the inclusion in the Iranome project of 100 non-Arab people living in the Persian Gulf islands played a prominent role in clarifying the genetic background and diversity of this less-studied subpopulation in Iran. Furthermore, our analysis showed that only 0.6% of novel variants in the Iranome data set have counterparts in databases of Middle Eastern populations, emphasizing the value of the Iranome project in clarifying the genetic background and allele frequencies of the Iranian population.

4.2 | Impact of the Iranome project at the international level

This project introduced 308,311 additional variants into the human genomic variation catalog and fills in another small corner of the human genetic variation picture. Furthermore, this database is an excellent resource for other countries in the region, specifically the neighboring countries located in a region historically called Greater Iran (Greater Persia). This historical term refers to a region that included parts of the Caucasus, West Asia and the Middle East (Bahrain, Kurdistan, the modern state of Iran, and some parts of Iraq), Central Asia (Uzbekistan, Tajikistan, Turkmenistan, Xinjiang), and parts of South Asia (Afghanistan and Pakistan), which were under the control of the Persian Empire and were therefore historically influenced by Iranian culture (https://en.wikipedia.org/wiki/Greater_Iran). Iranic (Iranian) people, defined as people who speak Indo-Iranian languages and their dialects, are present in this region and are estimated to number about 150–200 million individuals (https://en.wikipedia.org/wiki/Iranian_peoples). Moreover, applying mtDNA sequencing analysis, Derenko et al. (2013) showed that there is a common set of maternal lineages between people living in Iran and people living in Anatolia, the Caucasus, and the Arabian Peninsula. The Iranome database can therefore be considered a good resource for all of the Iranic people who live not only in the country of Iran but throughout this historical region. The inclusion of 100 Iranian Persians in the database can make it a useful resource for Persian-speaking people living in Afghanistan, Tajikistan, the Caucasus, Uzbekistan, Bahrain, Kuwait, and Iraq. The inclusion of 100 Iranian Kurds can make it a useful resource for other Kurdish people living in Iraqi Kurdistan, Turkey, Syria, Armenia, Israel, Georgia, and Lebanon. The presence of 100 Iranian Baluchs can be considered a useful resource for Baluchi people living in Pakistan, Oman, Afghanistan, Turkmenistan, Saudi Arabia, and the UAE. The genetic information on 100 Iranian Azeris can also be used for people living in Azerbaijan, Turkey, Russia, and Georgia.
Although Iranian Arabs are admixed with other ethnicities in Iran, such as Persians, Turks, and Lurs, the genetic variants identified in the 100 Iranian Arabs investigated in this study can be a useful resource for the populations of Arab countries that are geographically close to Iran. Furthermore, Iranians appear to be among the most probable parent populations of the Hungarian ethnic groups: after Slavs and Germans, Iranians and Turks are the most likely contributors, given the relatively short genetic distances and average admixture estimates (Guglielmino, Silvestri, & Beres, 2000). In general, genomic databases are useful for clarifying the actual role of pathogenic/likely pathogenic variants in human diseases. They are all the more valuable when dealing with rare homozygous variants found in a small number of individuals in public databases. Knowing the clinical features of these individuals, or at least having access to the required clinical data from such individuals, is valuable for medical geneticists. The Iranome database should be very helpful in this regard; as already discussed above, 12 known pathogenic/likely pathogenic variants could be reclassified with the help of this data set. In conclusion, we believe that the Iranome database, as the first national genomic effort to clarify the genetic background of the country, will be useful in medical genomics and in the healthcare system in Iran as well as for other Indo-Iranian-speaking populations in the region. We plan to improve the database by adding more individuals from other Iranian ethnic groups who were not represented in the present phase of the project, in order to comprehensively identify Iranian genomic variations.

ACKNOWLEDGMENTS

This national project could not have been completed without the support of the University of Social Welfare & Rehabilitation Sciences in Tehran, Iran. The authors would like to acknowledge the support of Dr. Sorena Sattari, Vice-President for Science and Technology, and Dr. Hossein Vatanpour, the general manager of Technology at the Ministry of Health and Medical Education. The authors would also like to acknowledge all of the 800 individuals who participated in this project as volunteers and the network of medical experts who helped in sample collection throughout the country. This study was funded by the Iran Vice-Presidency for Science and Technology (grant number 11/66100) and the Vice Deputy for Research and Technology at the Iran Ministry of Health and Medical Education (grant number 700/150).

CONFLICT OF INTEREST

The authors declare that there is no conflict of interest.

ORCID

Hossein Najmabadi http://orcid.org/0000-0002-6084-7778

REFERENCES

Alkan, C., Kavak, P., Somel, M., Gokcumen, O., Ugurlu, S., Saygi, C., … Bekpen, C. (2014). Whole genome sequencing of Turkish genomes reveals functional private alleles and impact of genetic interactions with Europe, Asia and Africa. BMC Genomics, 15, 963. https://doi.org/10.1186/1471-2164-15-963
Amanolahi, S. (2005). A note on ethnicity and ethnic groups in Iran. Iran & the Caucasus, 9(1), 37–41.
An, J. Y. (2017). National human genome projects: An update and an agenda. Epidemiology and Health, 39, e2017045. https://doi.org/10.4178/epih.e2017045
Auton, A., Brooks, L. D., Durbin, R. M., Garrison, E. P., Kang, H. M., Korbel, J. O., & Abecasis, G. R. (2015). A global reference for human genetic variation. Nature, 526(7571), 68–74. https://doi.org/10.1038/nature15393
Van der Auwera, G. A., Carneiro, M. O., Hartl, C., Poplin, R., Del Angel, G., Levy-Moonshine, A., … DePristo, M. A. (2013). From FastQ data to high-confidence variant calls: The genome analysis toolkit best practices pipeline. Current Protocols in Bioinformatics, 43, 11.10.1–11.10.33. https://doi.org/10.1002/0471250953.bi1110s43
Banihashemi, K. (2009). Iranian human genome project: Overview of a research process among Iranian ethnicities. Indian Journal of Human Genetics, 15(3), 88–92. https://doi.org/10.4103/0971-6866.60182
Boomsma, D. I., Wijmenga, C., Slagboom, E. P., Swertz, M. A., Karssen, L. C., Abdellaoui, A., … van Duijn, C. M. (2014). The Genome of the Netherlands: Design, and project goals. European Journal of Human Genetics, 22(2), 221–227. https://doi.org/10.1038/ejhg.2013.118
Comas, D., Calafell, F., Mateu, E., Pérez-Lezaun, A., Bosch, E., Martínez-Arias, R., … Bertranpetit, J. (1998). Trading genes along the silk road: MtDNA sequences and the origin of central Asian populations. The American Journal of Human Genetics, 63(6), 1824–1838. https://doi.org/10.1086/302133
The Genome of the Netherlands Consortium (2014). Whole-genome sequence variation, population structure and demographic history of the Dutch population. Nature Genetics, 46(8), 818–825. https://doi.org/10.1038/ng.3021
Curtis, G. E., & Hooglund, E. J. (2008). Iran: A country study. Washington, DC: Federal Research Division, Library of Congress.
Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., … Durbin, R. (2011). The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. https://doi.org/10.1093/bioinformatics/btr330
Davit-Spraul, A., Costa, C., Zater, M., Habes, D., Berthelot, J., Broué, P., … Baussan, C. (2008). Hereditary fructose intolerance: Frequency and spectrum mutations of the aldolase B gene in a large patients cohort from France: Identification of eight new mutations. Molecular Genetics and Metabolism, 94(4), 443–447. https://doi.org/10.1016/j.ymgme.2008.05.003
Derenko, M., Malyarchuk, B., Bahmanimehr, A., Denisova, G., Perkova, M., Farjadian, S., & Yepiskoposyan, L. (2013). Complete mitochondrial DNA diversity in Iranians. PLoS One, 8(11), e80673. https://doi.org/10.1371/journal.pone.0080673
Dopazo, J., Amadoz, A., Bleda, M., Garcia-Alonso, L., Alemán, A., García-García, F., … Antiñolo, G. (2016). 267 Spanish exomes reveal population-specific differences in disease-related genetic variation. Molecular Biology and Evolution, 33(5), 1205–1218. https://doi.org/10.1093/molbev/msw005
Esposito, G., Imperato, M. R., Ieno, L., Sorvillo, R., Benigno, V., Parenti, G., … Salvatore, F. (2010). Hereditary fructose intolerance: Functional study of two novel ALDOB natural variants and characterization of a partial gene deletion. Human Mutation, 31(12), 1294–1303. https://doi.org/10.1002/humu.21359
Farhud, D. D., Mahmoudi, M., Kamali, M. S., Marzban, M., Andonian, L., & Saffari, R. (1991). Consanguinity in Iran. Iranian Journal of Public Health, 20, 1–16.
Farjadian, S., & Safi, S. (2013). Genetic connections among Turkic-speaking Iranian ethnic groups based on HLA class II gene diversity. International Journal of Immunogenetics, 40(6), 509–514. https://doi.org/10.1111/iji.12066
Fu, W., O'Connor, T. D., Jun, G., Kang, H. M., Abecasis, G., Leal, S. M., … Akey, J. M. (2013). Corrigendum: Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature, 495, 270. https://doi.org/10.1038/nature12022
Guglielmino, C. R., Silvestri, A., & Beres, J. (2000). Probable ancestors of Hungarian ethnic groups: An admixture analysis. Annals of Human Genetics, 64(Pt 2), 145–159. https://doi.org/10.1017/S0003480000008010
Hassan, H. D. (2008). Iran: Ethnic and religious minorities (Report No. RL34021). Washington, DC: Congressional Research Service. Retrieved from the University of North Texas Libraries Digital Library: https://digital.library.unt.edu/ark:/67531/metadc795725/
Henn, B. M., Cavalli-Sforza, L. L., & Feldman, M. W. (2012). The great human expansion. Proceedings of the National Academy of Sciences, 109(44), 17758–17764. https://doi.org/10.1073/pnas.1212380109
Hu, H., Kahrizi, K., Musante, L., Fattahi, Z., Herwig, R., Hosseini, M., … Najmabadi, H. (2018). Genetics of intellectual disability in consanguineous families. Molecular Psychiatry, 24, 1027–1039. https://doi.org/10.1038/s41380-017-0012-2
Häkli, S., Kytövuori, L., Luotonen, M., Sorri, M., & Majamaa, K. (2014). WFS1 mutations in hearing-impaired children. International Journal of Audiology, 53(7), 446–451. https://doi.org/10.3109/14992027.2014.887230
Karaca, E., Buyukkaya, R., Pehlivan, D., Charng, W. L., Yaykasli, K. O., Bayram, Y., … Lupski, J. R. (2015). Whole-exome sequencing identifies homozygous GPR161 mutation in a family with pituitary stalk interruption syndrome. The Journal of Clinical Endocrinology & Metabolism, 100(1), E140–E147. https://doi.org/10.1210/jc.2014-1984
Khanim, F., Kirk, J., Latif, F., & Barrett, T. G. (2001). WFS1/wolframin mutations, Wolfram syndrome, and associated diseases. Human Mutation, 17(5), 357–367. https://doi.org/10.1002/humu.1110
Koshy, R., Ranawat, A., & Scaria, V. (2017). al mena: A comprehensive resource of human genetic variants integrating genomes and exomes from Arab, Middle Eastern, and North African populations. Journal of Human Genetics, 62(10), 889–894. https://doi.org/10.1038/jhg.2017.67
Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., … MacArthur, D. G. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536(7616), 285–291. https://doi.org/10.1038/nature19057
Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26(5), 589–595. https://doi.org/10.1093/bioinformatics/btp698
MacArthur, D. G., Manolio, T. A., Dimmock, D. P., Rehm, H. L., Shendure, J., Abecasis, G. R., … Gunter, C. (2014). Guidelines for investigating causality of sequence variants in human disease. Nature, 508(7497), 469–476. https://doi.org/10.1038/nature13127
Magi, A., Tattini, L., Palombo, F., Benelli, M., Gialluisi, A., Giusti, B., … Pippucci, T. (2014). H3M2: Detection of runs of homozygosity from whole-exome sequencing data. Bioinformatics, 30(20), 2852–2859. https://doi.org/10.1093/bioinformatics/btu401
Majbouri, M., & Fesharaki, S. (2017). Iran's multi-ethnic mosaic: A 23-year perspective. Social Indicators Research. https://doi.org/10.1007/s11205-017-1800-4
Naidoo, N., Pawitan, Y., Soong, R., Cooper, D. N., & Ku, C. S. (2011). Human genetics and genomics a decade after the release of the draft sequence of the human genome. Human Genomics, 5(6), 577–622.
Pak, C., Garshasbi, M., Kahrizi, K., Gross, C., Apponi, L. H., Noto, J. J., … Kuss, A. W. (2011). Mutation of the conserved polyadenosine RNA binding protein, ZC3H14/dNab2, impairs neural function in Drosophila and humans. Proceedings of the National Academy of Sciences, 108(30), 12390–12395. https://doi.org/10.1073/pnas.1107103108
Pemberton, T. J., Absher, D., Feldman, M. W., Myers, R. M., Rosenberg, N. A., & Li, J. Z. (2012). Genomic patterns of homozygosity in worldwide human populations. The American Journal of Human Genetics, 91(2), 275–292. https://doi.org/10.1016/j.ajhg.2012.06.014
Pippucci, T., Magi, A., Gialluisi, A., & Romeo, G. (2014). Detection of runs of homozygosity from whole exome sequencing data: State of the art and perspectives for clinical, population and epidemiological studies. Human Heredity, 77(1-4), 63–72. https://doi.org/10.1159/000362412
Rashidvash, V. (2012). The race of the Azerbaijani people in Iran (Atropatgan). International Journal of Research in Social Sciences (IJRSS), 2(3), 437–449.
Rashidvash, V. (2013). Iranian people: Iranian ethnic groups. International Journal of Humanities and Social Science, 3(15), 216–226.
Rashidvash, V. (2016). Iranian people and the race of people settled in the Iranian plateau. International Journal of Humanities & Social Science Studies, 3(1), 181–191.
Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., … Rehm, H. L. (2015). Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405–423. https://doi.org/10.1038/gim.2015.30
Saadat, M., Ansari-Lari, M., & Farhud, D. D. (2004). Short report: Consanguineous marriage in Iran. Annals of Human Biology, 31(2), 263–269. https://doi.org/10.1080/03014460310001652211
Scott, E. M., Halees, A., Itan, Y., Spencer, E. G., He, Y., Azab, M. A., … Gleeson, J. G. (2016). Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery. Nature Genetics, 48(9), 1071–1076. https://doi.org/10.1038/ng.3592
Sifrim, A., Hitz, M. P., Wilsdon, A., Breckpot, J., Turki, S. H. A., Thienpont, B., … Hurles, M. E. (2016). Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing. Nature Genetics, 48(9), 1060–1065. https://doi.org/10.1038/ng.3627
Tennessen, J. A., Bigham, A. W., O'Connor, T. D., Fu, W., Kenny, E. E., Gravel, S., … Akey, J. M. (2012). Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science, 337(6090), 64–69. https://doi.org/10.1126/science.1219240
Wang, K., Li, M., & Hakonarson, H. (2010). ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Research, 38(16), e164. https://doi.org/10.1093/nar/gkq603
Yamaguchi-Kabata, Y., Nariai, N., Kawai, Y., Sato, Y., Kojima, K., Tateno, M., … Nagasaki, M. (2015). iJGVD: An integrative Japanese genome variation database based on whole-genome sequencing. Human Genome Variation, 2, 15050. https://doi.org/10.1038/hgv.2015.50
Zarei, F., & Alipanah, H. (2014). Mitochondrial DNA variation, genetic structure and demographic history of Iranian populations. Molecular Biology Research Communications, 3(1), 45–65.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section.

How to cite this article: Fattahi Z, Beheshtian M, Mohseni M, et al. Iranome: A catalog of genomic variations in the Iranian population. Human Mutation. 2019;1–17. https://doi.org/10.1002/humu.23880
work_2k6g7qh56ngh5knnfbdukho22i ----
Adoption and Integration of Persistent Identifiers in European Research Information Management – Preliminary Findings –
LIBER2017, Patras – July 2017
Rebecca Bryant, PhD, OCLC Research, USA – http://orcid.org/0000-0002-2753-3881 – bryantr@oclc.org – @rebeccabryant18
Annette Dortmund, PhD, OCLC EMEA – http://orcid.org/0000-0003-1588-9749 – dortmuna@oclc.org – @libsun

What is Research Information Management (RIM)?
The aggregation, curation, & utilization of metadata about research activities.
Overlapping terms:
• CRIS (Current Research Information System)
• RNS (Research Networking System)
• RPS (Research Profiling System)
• RIMs ≠ researcher platforms like ResearchGate or Academia.edu
• RIM ≠ Research Data Management (RDM)

What are Persistent Identifiers (PID)?
"Long-lasting reference to a digital object that gives information about that object regardless what happens to it." (Definition: http://dictionary.casrai.org/Persistent_identifier)
Types:
• Digital Object Identifiers (e.g. DOI)
• Person Identifiers (e.g. ORCID, ISNI, DAI)
• Organization Identifiers (e.g. GRID, ISNI)
• PIDs ≠ authority files
• Not every ID is a PID
• OCLC ≠ ISNI
(A small ORCID validation sketch follows this part of the deck, after the "Why us?" slide.)

Why study PIDs in RIM?
• Research institutions increasingly engaged in RIM
• Scaling efforts at national and transnational level
• Advancing technologies & standards offer new opportunities for interoperability and discoverability
• PIDs expected to be playing a key role in these developments

OCLC Research & RIM
ORLP working groups:
• Survey on Research Information Management Practices (in collaboration with euroCRIS)
• Value proposition of libraries in RIM
Webinars
Listserv

Why us?
• Community resource for shared Research and Development (R&D) since 1978
• Devoted to challenges facing libraries and archives
• Engagement with OCLC members and the community around shared concerns
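An aside not found in the slides: part of what makes ORCID iDs machine-friendly is that the final character is an ISO 7064 MOD 11-2 check digit, so an iD can be sanity-checked offline. A minimal sketch, using one of the iDs shown on the title slide:

    # Minimal sketch (not from the deck): validate an ORCID iD via its
    # ISO 7064 MOD 11-2 check digit, which ORCID uses for the final character.

    def orcid_check_char(base15: str) -> str:
        """Compute the expected 16th character from the first 15 digits."""
        total = 0
        for digit in base15:
            total = (total + int(digit)) * 2
        result = (12 - total % 11) % 11
        return "X" if result == 10 else str(result)

    def is_valid_orcid(orcid: str) -> bool:
        digits = orcid.replace("-", "")
        return (len(digits) == 16 and digits[:15].isdigit()
                and orcid_check_char(digits[:15]) == digits[15])

    print(is_valid_orcid("0000-0002-2753-3881"))  # True: iD from the title slide
    print(is_valid_orcid("0000-0002-2753-3882"))  # False: corrupted final digit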
OCLC Research – RESEARCH PROJECT
Adoption and Integration of Persistent Identifiers in European Research Information Management

Project Information
• "Adoption and Integration of PIDs in European RIM"
• Examining the nexus of RIM with persistent identifiers
• Focus on person and organizational identifiers
• Objective: Gain useful insights on emerging practices and challenges in research management at different levels of scale

Project team
OCLC Research: • Rebecca Bryant (lead) • Annette Dortmund • Constance Malpas
In collaboration with LIBER: • Kristiina Hormia-Poutanen (National Library Finland) • Birgit Schmidt (State and University Library Göttingen) • Esa-Pekka Keskitalo (National Library Finland)

Case Study Approach
• Investigate RIM practices in three national contexts: Finland, Germany, and the Netherlands
• Desk research followed by semi-structured interviews
• Focus on adoption and integration of persistent identifiers; identify incentives for adoption
• Investigate potential links between PID adoption and different levels of scale

Interview partners
• Netherlands: Leiden University, VU Amsterdam, University of Amsterdam, Radboud University / euroCRIS, SURF
• Germany: University of Münster, University of Kassel, FAU (Friedrich-Alexander-Universität Erlangen-Nürnberg), BASE Bielefeld, German National Library
• Finland: Aalto University, University of Eastern Finland, University of Jyväskylä, University of Helsinki, CSC
• PID organizations: ORCID & ISNI

Project schedule
• Define research scope – Winter 2016-2017
• Desk research – Late winter 2017
• Interviews – Spring 2017
• Synthesis & writing – Summer & fall 2017
• Publish research report – Late fall 2017
• Additional webinars & presentations – 2018

NATIONAL INFRASTRUCTURES

Person PIDs (NL): Established national standard
[Slide diagram: Dutch RIM infrastructure, with DAI as the person PID in use]
Incentives:
• Internal (publication management, researcher profiles …)
• External (national mandate for DAI, OA mandate, wish to internationalize ...)
• Scaling (consortial efforts, data sharing opportunities, …)
Efforts to internationalize the existing national standard (DAI to ISNI/ORCID).

Person PIDs (FI): Strong incentives
[Slide diagram: institutional RIM systems feeding the national VIRTA / Juuli services]
Incentives:
• Internal (publication management, researcher profiles, need for unique persistent IDs …)
• External (publishers requesting ORCID iDs, complete data on publications for national funding …)
• Scaling (planned national research information hub, open science …)
Barriers: Not required by funders at the outset. ROI unclear.

Person PIDs (DE): „There are no external incentives“
[Slide diagram: institutional RIM systems and the BASE repository]
Incentives:
• Internal (publication management, researcher profiles, …)
• External / scaling (standards, good practice) [very limited]
Barriers: Not required by funders; no real need in the absence of regional or national RIM scaling efforts. ROI unclear. Effects of ORCID links with BASE and the national authority file GND yet unknown.

Infrastructures in comparison
[Comparison figure on slide]

FURTHER OBSERVATIONS

Top down vs. bottom up
Mandates drive RIM and PID adoption today:
• Impact assessment (national, funder)
• Open access (national, funder)
• Requirement of ORCID iDs by funders, publishers
• Concern: Loss of institutional autonomy / control
Convenience important to drive researcher engagement:
• ORCID auto-update capabilities
• ORCID integration into identity management

Organizational identifiers
Much interest, little activity. „Watching the space.“
• No urgent need, no immediate problem to solve.
• Big need, but too complex to solve alone.
• ICT organizations brainstorming options for (temporary) work-arounds at national level, but no firm plans yet.
• Getting funders on board will be very important for adoption of whatever PID develops.

Role of Researchers in RIM
Broad range of views on the researcher role in RIM, ranging from:
• full responsibility – "it is their job to register their publications"
• to some responsibility / role – requesting researchers to enter information, with support from a support desk or the library
• to no responsibility / role – "researchers do not touch the CRIS" [as an option preferred by the admins!]

WHAT'S NEXT?

What's next? Stay tuned
• Upcoming blog post on hangingtogether.org
• Report to be published in late fall this year
• Webinars & presentations in 2018
• OCLC Research in progress (ORLP):
  • Survey of Research Information Management Practices (opens in late 2017, in collaboration with euroCRIS)
  • Research report on the valuable roles of libraries in RIM (Fall 2017)
oc.lc/rim #OCLCResearch hangingtogether.org http://www.oclc.org/research/themes/research-collections/rim.html

Discussion
Rebecca Bryant, PhD, OCLC Research, USA – http://orcid.org/0000-0002-2753-3881 – bryantr@oclc.org – @rebeccabryant18
Annette Dortmund, PhD, OCLC EMEA – http://orcid.org/0000-0003-1588-9749 – dortmuna@oclc.org – @libsun

work_2lmhzpj6rzayxcwyhma3pxazya ----
Published in Interlending and Document Supply 43, no. 2: 68-75

Open Access: Help or Hindrance to Resource Sharing?
Tina Baich, IUPUI University Library

Introduction
The growing acceptance of the open access movement has created an increasingly large body of free, online information that library users may have difficulty navigating. Students, in particular, may not be fully aware of open access and the corpus of knowledge available to them. As a result, users still request open access materials through interlibrary loan (ILL) despite their ability to access these materials directly. The Resource Sharing & Delivery Services (RSDS) department of Indiana University-Purdue University Indianapolis' (IUPUI) University Library began tracking ILL borrowing requests for open access materials in 2009. RSDS tracks any request fitting the general criteria of open access content established by Peter Suber: "digital, online, free of charge, and free of most copyright and licensing restrictions" (Suber, 2013). Therefore, the collected data include requests for grey literature, electronic theses and dissertations (ETDs), and public domain works in addition to open access journal content. In 2011, the author used the collected data to study open access borrowing requests over two fiscal years (July 2009-June 2011) (Baich, 2012). This period showed an increase in open access requests while overall borrowing requests held relatively steady. This paper presents an update on that research using data for July 2011-June 2013. The new study provides evidence that the number of borrowing requests for open access documents continued to grow in the ensuing two years.

Literature Review
It is a commonly held belief that interlibrary loan is and will continue to be adversely affected by the growth of open access.
Studies conducted in Japan and Belgium do show a reduction in the number of ILL requests and place at least partial blame on open access (Koyama et al., 2011; Corthouts et al., 2011), but Schöpfel recently stated there is "little empirical evidence for this causal relation" (Schöpfel, 2014). McGrath (2014) also notes the many situations in which interlibrary loan will still be necessary despite any negative impact of open access. The experience at IUPUI University Library has so far run counter to the argument of open access as a hindrance to resource sharing. Open access has not reduced the number of article requests to date, and users actually request open access materials through interlibrary loan. The author could locate only one study sharing the view that open access is "an extraordinarily useful source for librarians to perform document delivery service" (Hu and Jiang, 2014). One of the key reasons users submit ILL requests for open access materials is difficulty with discovery. There are a vast number of resources for locating open access materials, but users want ease of access. Connaway et al. found this is so imperative for users that they will "readily sacrifice content for convenience" (Connaway et al., 2011). Additionally, a 2010 report on the findings of twelve user behavior studies found that Google and other search engines are increasingly central to the search for information (Connaway and Dickey, 2010). In fact, when "information consumers" were asked by OCLC Research where they begin their information search, 84 percent indicated beginning in a search engine while not a single person began their search on a library website (DeRosa et al., 2010). As Kroll and Forsman note, "researchers find Google and Google Scholar to be amazingly effective in finding isolated bits of information or getting to publications or findings of interest to them" (Kroll and Forsman, 2010). As a result, users are unlikely to search multiple resources for the information they seek, both out of convenience and because of the possible perception that what they seek has been found. These user behaviors present a particular problem for open access content housed in repositories. Google Scholar doesn't follow the same metadata standards as libraries, which causes a level of incompatibility that can impact discovery. Google Scholar does have Inclusion Guidelines for Webmasters to help increase the likelihood an open access repository will be indexed, but some libraries may lack the knowledge or resources to implement these guidelines (Arlitsch and O'Brien, 2011). Arlitsch and O'Brien found that "in general, IRs [institutional repositories] that followed these guidelines had a much higher indexing ratio (88-98 percent) than sites that did not (38-48 percent)" (Arlitsch and O'Brien, 2011). The current inconsistency in discovery of open access content through a Google or Google Scholar search has a negative impact on user discovery. The discovery problem extends beyond open access materials. The most recent study of literature regarding ILL requests for owned items summarizes the literature by stating, "most … found that interlibrary loan requests for items owned or available through electronic access through the library represented 30 percent or greater of the total cancelled requests" (Kress et al., 2011).
Libraries offer numerous methods for locating an item – online catalogs, databases, A-Z e-journal lists, and OpenURL link resolvers – that retrieve different formats and results. This does not align with users' need for convenience and ease of access and may result in a greater reliance on ILL to locate information. The initial 2000 study of ILL requests for owned items suggested that users may "take the line of least resistance in a search and believe that if it is not in the first place they look, it must not exist" (Yontz et al., 2000). This proves to be a prescient statement in light of later research. As discussed earlier, users' demand for ease of access has only increased in the ensuing years.

Overview of IUPUI and University Library

IUPUI
IUPUI is an urban university with nineteen schools and academic units from both Indiana University and Purdue University enrolling more than 30,000 students. IUPUI is administratively linked to Indiana University (IU) and is considered a core campus in the IU system along with Bloomington. The IU system also includes six regional campuses around the state. IUPUI has its own extension campus, Indiana University-Purdue University Columbus, located approximately forty-five miles south of Indianapolis (IUPUI, n.d.; Indiana University, n.d.). All Indiana University campus libraries collaborate in a number of ways including a shared online catalog, a remote circulation service, and some shared subscriptions.

University Library
IUPUI University Library serves the faculty, staff and students of all IUPUI schools except the law, medicine and dentistry schools, which have their own libraries. The Herron School of Art also has its own library, but it reports administratively to the Dean of University Library. Resource sharing services for Herron users are provided by University Library. David W. Lewis, the Dean of University Library, is a well-known proponent of the open access publishing model and has written a number of articles on scholarly communication and open access. One of his earliest works on the topic is an unpublished paper titled "Six Reasons Why the Price of Scholarly Information Will Fall in Cyberspace" (Lewis, 1997). Guided by Dean Lewis' vision, University Library has developed a strong structure and services to support the scholarly communication of IUPUI faculty. One of the library's earliest open access initiatives was the launch of an institutional repository in 2003. The largest collection within the institutional repository, now known as IUPUI ScholarWorks, is that of electronic theses, dissertations, and doctoral papers from IUPUI graduates. The growth of this collection was largely facilitated by the IUPUI Graduate Office's ETD submission requirement, which began in 2005. In 2008, the library launched its first hosted open access journal in Open Journal Systems (OJS). At the time of its transition to OJS, Advances in Social Work was edited by an IUPUI faculty member. That same year, University Library refocused the existing Digital Libraries Team as the Digital Scholarship Team to be more inclusive of scholarly communication and open access issues. The increased focus on scholarly communication resulted in the addition of several librarian positions to the Team including a Digital Scholarship and Data Management Librarian (2011); Scholarly Communications Librarian (2013); Digital Scholarship Outreach Librarian (2013); Digital User Experience Librarian (2014); and Digital Humanities Librarian (2014).
These new positions are largely focused on supporting IUPUI faculty research and its dissemination to a broader audience. In 2013, the Digital Scholarship Team increased its outward focus again with the creation of the IUPUI University Library Center for Digital Scholarship to enrich the research capabilities of scholars at IUPUI, within Indiana communities, and beyond by:
• Digitally disseminating unique scholarship, data, and artifacts created by IUPUI faculty, students, staff and community partners;
• Advocating for the rights of authors, fair use, and open access to information and publications;
• Implementing and promoting best practices for creation, description, preservation, sharing, and reuse of digital scholarship, data, and artifacts;
• Strategically applying research-supporting technologies;
• Teaching digital literacy (IUPUI University Library, 2014).

University Library also launched another major service in 2013 in the form of the IUPUI Open Access Publishing Fund. This fund "underwrites reasonable publication charges for articles published in fee-based, peer-reviewed journals that are openly accessible" (IUPUI University Library, 2013). Though administered by University Library, financial support comes from several key campus stakeholders including the Office of the Vice Chancellor for Research, IU School of Dentistry, Robert H. McKinney School of Law, and University Library. Since its launch, the Fund has supported the publication of sixteen faculty articles representing nine of IUPUI's schools (Jere Odell, personal communication, 17 Dec. 2014). In September 2014, the Director of the Center was elevated to Associate Dean for Digital Scholarship, once again emphasizing the importance of scholarly communication and open access to the library.

Open Access Policies
In April 2009, the IUPUI Library Faculty, the governing body for all librarians from the campus' five libraries plus Columbus, passed a Deposit Mandate in affirmation of its support of the open access movement. The mandate requires IUPUI and IUPUC librarians to deposit their scholarly articles in the institutional repository, IUPUI ScholarWorks. After months of work, the IUPUI Faculty Council's Library Affairs Committee introduced an Open Access Policy. Following discussion at IUPUI Faculty Council meetings and a series of town hall meetings, the IUPUI Faculty Council passed the Open Access Policy on October 7, 2014, becoming the first Indiana University faculty body to do so. The Policy requires the deposit of faculty-created scholarly articles in IUPUI ScholarWorks.

Overview of Resource Sharing Operations
IUPUI University Library's RSDS department provides interlibrary loan and document delivery services to the faculty, staff and students of all IUPUI schools except the law, medicine and dentistry schools, which have their own libraries. University Library also has an agreement with Martin University, a local university without its own library, to provide ILL services to its affiliates. RSDS consists of half an FTE librarian, three FTE staff (two of whom have responsibility for resource sharing services) and two to three FTE student employees. IUPUI University Library is an OCLC supplier, participates in RapidILL, and uses the OCLC ILLiad ILL management system. Total ILL borrowing requests have decreased slightly over the past three fiscal years, but each decrease can be attributed to fewer loan requests.

Figure 1. Borrowing Requests Submitted by Fiscal Year
When borrowing copy requests are considered separately, the statistics show an increase in this type of request every fiscal year since 2008/2009. The large increase in article requests in 2008/2009 can be attributed to the implementation of a document delivery service for articles and book chapters owned by the library. These trends are illustrated in Figures 1 and 2.

Figure 2. Borrowing Copy Requests Submitted
Fiscal Year    Submitted    % Change from Previous Year
2007/2008      7,516        –
2008/2009      10,441       38.92%
2009/2010      10,867       4.08%
2010/2011      11,422       5.11%
2011/2012      11,466       0.39%
2012/2013      12,065       5.22%

Open Access ILL Workflow
The RSDS department utilizes the OCLC ILLiad ILL management software, which supports the creation of custom routing rules, queues, and emails that assist staff in automating workflows. Two custom queues, "Awaiting Open Access Searching" and "Awaiting Thesis Processing," allow staff to monitor potential open access borrowing requests. Items published in the US prior to 1923 are considered to be within the public domain and free from copyright restrictions. A custom routing rule directs any borrowing request with a pre-1923 publication date into the "Awaiting Open Access Searching" queue so staff can search for freely available electronic copies prior to sending the request to another library. Staff members use ILLiad addons, which automatically execute searches in HathiTrust, Internet Archive, and Google or Google Scholar based on information in the request. With the increase in availability of electronic theses and dissertations (ETDs), staff members now search for open access versions if a title is not part of our ProQuest Dissertations & Theses (PQDT) subscription before submitting a request via OCLC. The "Awaiting Thesis Processing" queue facilitates this by segregating all requests with a document type of thesis or containing the phrase "Dissertation Abstracts." When a thesis or dissertation request is submitted, an RSDS staff member first searches the PQDT database to determine whether IUPUI University Library has access through its subscription. If access is not possible through ProQuest, the staff member searches Google Scholar and/or Google for an ETD deposited in an institutional repository. It is only after failing to find an ETD that the staff member will turn to OCLC, where she will confirm there is no electronic resource record or URL included in the print record. If no ETD is located, the staff member will submit a request for a physical copy from another library. A visual representation of these workflows is depicted in Figure 3.

Figure 3. Open Access ILL Workflow
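The two routing rules just described are, in effect, a small decision procedure. The sketch below restates that logic purely for illustration; ILLiad expresses routing rules in its own match syntax, and the request field names and the default queue name here are assumptions, not actual ILLiad fields.

    # Illustrative restatement of the routing logic described above; not ILLiad
    # rule syntax. Field names ("publication_year", "document_type", "cited_in")
    # are assumed for the example.

    def route_borrowing_request(request: dict) -> str:
        year = request.get("publication_year")
        if year is not None and year < 1923:
            # Likely US public domain: look for a free electronic copy first.
            return "Awaiting Open Access Searching"
        if (request.get("document_type") == "Thesis"
                or "Dissertation Abstracts" in request.get("cited_in", "")):
            # Check the PQDT subscription, then repositories, before OCLC.
            return "Awaiting Thesis Processing"
        return "Awaiting Request Processing"  # hypothetical default queue

    print(route_borrowing_request({"publication_year": 1907}))
    print(route_borrowing_request({"document_type": "Thesis", "publication_year": 2010}))
    # -> Awaiting Open Access Searching / Awaiting Thesis Processing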
All other article requests are sent into the RapidILL system, which also checks for open access titles. Very few of the open access requests received by IUPUI University Library are fulfilled through RapidILL's open access check (6 of 1,557 requests, or 0.4%). Staff members search the article title in Google Scholar for open access versions when requests are returned from RapidILL as unfilled. RapidILL also returns requests that are part of our local holdings, which can result in the identification of additional open access items when requests are searched in the library's e-journal portal. The library uses Serials Solutions as its vendor for electronic resource management. Within the administrative module, it is possible to activate "subscriptions" to various open access journal collections. An example of how this appears in the user interface is shown in Figure 4. Thanks to this feature, resources such as PubMed Central and the Directory of Open Access Journals, as well as various collections of freely accessible journal titles, are linked through the library's e-journal portal. This allows staff to fill requests for gold OA articles, as well as green OA articles archived in PubMed Central, with minimal searching and without burdening possible lenders with requests for open access materials.

Figure 4. Example Open Access Article as Located in E-Journal Portal

Post-1922 conference paper and report copy requests are screened for open access versions prior to submission to OCLC. Specific open access searching is typically not done for post-1922 book chapter or loan requests, but staff members are conscious of electronic resource records in OCLC and may sometimes identify an open access item based on the URL included in the record. Extensive searching for open access options does not occur for book chapter and loan requests until all other borrowing options have been exhausted. When an open access item is located, the staff member enters tracking information into the Call Number and Location fields within the request form and records "open" or "etds" (depending on the document type) as the Lending Library. She then saves the PDF to the ILLiad web server and sends the user a custom email notifying him both of the document's availability on his account and of its location on the open Web. Requests for which an open access version is located are considered filled by RSDS since the staff member has used her time and expertise to find and deliver the item to the user.

Data Overview
Since the publication of the author's 2011 study, open access requests have increased by 24-34 percent each year. Figure 5 shows the number of borrowing requests filled with open access materials during fiscal years 2010 through 2013. Despite these substantial increases, open access requests only account for 7 percent of total borrowing copy requests (1,557 of 23,531).

Figure 5. Open Access Borrowing Requests by Fiscal Year

As stated in the introduction, students may not be fully aware of open access and the corpus of knowledge available to them. The number of requests by user status seems to support this assertion. Alternatively, or perhaps in addition, students may have greater difficulty with the discovery issues covered in the literature review. When taken in combination, IUPUI undergraduate and graduate student requests account for 70 percent of open access requests received in 2011/2012 and 2012/2013. If requests from Martin University students are added, then student requests account for 74 percent of open access borrowing requests. Figure 6 shows the number of open access borrowing requests by user status.

Figure 6. Open Access Borrowing Requests by User Status

Users representing 63 unique departments or schools submitted open access requests in 2011/2012 and 2012/2013. Figure 7 shows the number of open access requests submitted by users from the top fifteen departments or schools. Of the top fifteen departments, seven are STEM or health sciences disciplines. The amount of open access materials available in these disciplines may relate to the public access policies enacted by the National Science Foundation and National Institutes of Health, which require that research funded by these federal agencies be accessible to the public.

Figure 7. Open Access Borrowing Requests by User Department or School
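As a quick arithmetic cross-check (ours, not the article's), the 7 percent figure above follows from the annual totals reported in Figure 2 and in Figure 8 below:

    # Cross-checking the reported totals; numbers are taken from Figures 2 and 8.
    copy_requests = {"2011/2012": 11_466, "2012/2013": 12_065}  # Figure 2
    oa_requests = {"2011/2012": 672, "2012/2013": 885}          # Figure 8

    total_copy = sum(copy_requests.values())  # 23,531
    total_oa = sum(oa_requests.values())      # 1,557
    print(f"{total_oa} of {total_copy} = {total_oa / total_copy:.1%}")
    # -> 1557 of 23531 = 6.6%, which the article rounds to 7 percent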
Open Access Document Types and Resources
The 1,557 open access requests received during fiscal years 2012 and 2013 represent a variety of material types (see Figure 8).

Figure 8. Open Access Borrowing Requests by Document Type
Doc Type        2011/2012    2012/2013
Article         478          655
Book/Chapter    45           80
Thesis          65           75
Conference      32           22
Report          52           53
Grand Total     672          885
Percent change on previous year: 27% (2011/2012), 24% (2012/2013)

Article Requests
Nearly three-quarters (n=1,133, 72%) of open access requests in fiscal years 2012 and 2013 were for articles. These requests were filled from a wide variety of both gold (open access journals) and green (self-archiving) sources. Though it is difficult to identify percentages with precision due to a variety of factors (i.e. language barrier, changes in access, multiple OA options), only 170 article requests (15%) can be clearly identified as gold OA. An additional 58 open access articles were retrieved from the sites of journals that still rely primarily on a subscription-based publishing model. Thirty article requests were for public domain materials and were filled using HathiTrust, Internet Archive and library digital collections. Based on this analysis, the majority of open access article requests were filled via green (self-archiving) sources. Within the administrative module of University Library's electronic resource management system, it is possible to activate "subscriptions" to open access journal collections. Access to these collections then becomes available in the e-journal portal (see Figure 4 above). The activation of open access collections within University Library's e-journal portal resulted in the location of 152 (13%) articles in open access journals and repositories (see Figure 9), a decrease from 25 percent in the previous study. Another 49 (4%) requests were filled from open access journals not included in e-journal portal open access collections, while more than 50 (4%) requests were for open access articles included in journals that still rely primarily on a subscription model.

Figure 9. Number of Open Access Borrowing Requests Filled through E-Journal Portal

Open access repositories were a major source for articles. There are several types of open access repositories including subject, institutional, consortial, and national. Though no one subject repository was the location for a significant number of articles, subject repositories as a whole provided access to 83 articles (see Figure 10). Eighty-four open access article requests were located in institutional repositories, while consortial and national repositories such as Dialnet, REDALyC, and SciELO accounted for another 16 requests.[1] When taken together, these open access repositories represented 16 percent of total open access borrowing requests for articles.

Figure 10. Number of Open Access Borrowing Requests Filled through Subject Repositories (Including E-Journal Portal)
Subject Repository                                Number of Requests
arXiv.org                                         10
CiteSeerX                                         11
Digital Library for Physics and Astronomy        1
Education Resources Information Center (ERIC)    8
Europe PubMed Central                             3
Project Euclid                                    1
PubMed Central                                    44
Optics InfoBase                                   3
Organic ePrints                                   2
Total                                             83

[1] Dialnet is the consortial repository of 80 Spanish libraries. REDALyC is a repository of Iberoamerican journal content based in Mexico. SciELO stands for Scientific Electronic Library Online and exists in a number of iterations to serve as the national repository of various South American countries.
Book and Book Chapter Requests

Books and book chapters represented only 8 percent (n=125) of open access requests, which is a decrease from 16 percent in the previous study. More than two-thirds (65%, n=81) of book and book chapter requests were published in the 19th and 20th centuries, with another 7 percent (n=9) published in the 15th, 17th, and 18th centuries. Twenty-six percent (n=33) were published in the 21st century. One item had an unknown publication date. The majority (60%) of freely available books were located in HathiTrust (50) and Internet Archive (25). This is a shift from the previous two fiscal years when the most common source was Google Books, which is down to just two requests from 50. In all likelihood, this is due to the ability to automatically search HathiTrust and Internet Archive within the ILL management system. Though still a small percentage, four requests (3%) were for recently published open access e-books, which is a new development from the previous study.

Thesis and Dissertation Requests

Theses and dissertations accounted for 9 percent (n=140) of total open access requests, which is a decrease from 13 percent in the previous study. However, a greater percentage (18%) of total thesis and dissertation borrowing requests (n=764) were filled using ETDs than in the previous study (10%). Not surprisingly, graduate students were the most frequent requesters of ETDs (71%). Ninety-three percent of ETDs were located within an open access repository. The institutional repository of the granting institution was most common with 75 percent (n=105) of requests, followed by the consortial repository OhioLINK ETD Center with 11 percent (n=16). The national repositories Theses Canada Portal (7) and EThOS (1) comprised 6 percent of the total ETD requests. One request was located in the subject repository Education Resources Information Center (ERIC). Following open access repositories, a new entrant to the ETD field was PQDT Open. Thesis and dissertation authors now have the option to publish their work as open access through ProQuest's UMI Dissertation Publishing service for a fee. These ETDs are available both through the ProQuest Dissertations & Theses subscription database as well as the public interface, PQDT Open (ProQuest, 2011). Six requests (4%) were filled with ETDs published as open access through ProQuest.

Conference Paper and Report Requests

Conference papers represented three percent (n=54) of open access borrowing requests, which is a substantial decrease from the previous study, where conference papers accounted for 13 percent of the total. This is primarily due to changes with All Academic, an online conference management tool that was previously an excellent source for open access conference papers. In the previous study, 45 percent (n=46) of open access conference papers were located in All Academic or the related repository, Political Research Online, compared to 11 percent (n=6) in the current period under study. Linking within the All Academic site appears to have changed, causing many dead links from Google Scholar. Once a search is re-executed in All Academic, the index page for a given paper can be confusing and frequently does not yield a link to the full text. These changes have greatly reduced the usefulness of All Academic for locating conference papers. Instead, conference papers were located in a variety of repositories and websites including those of the conference or sponsoring organization.
In the previous study, reports represented such a small number (n=44) of open access borrowing requests that they were not discussed. However, report requests are now more numerous than those for conference papers at 105 (7%) of the 1,557 open access requests. Of these 105 requests, 33 percent (n=35) were located on the issuing institution's website and 27 percent (n=28) in ERIC. Other sources included government agency websites (9), open access repositories (9), and the National Criminal Justice Reference Service (7).

Conclusion

The discovery problems surrounding information retrieval do not align with users' need for convenience and ease of access and may result in a greater reliance on ILL to locate information. An example can be taken from IUPUI University Library's own document delivery service. RSDS offers document delivery of articles and book chapters from the library's print collection for all users. However, users do not limit their requests to items from the print collection. From July 2011 through June 2013, RSDS filled 7,626 document delivery requests of which 61 percent were available through the library's electronic holdings. Users clearly find it easier to request through ILL rather than completing the search process themselves even though this means a delay in access. The data presented here show that this is clearly the case for open access materials as the number of ILL requests for such content steadily rises.

The request volume and discovery problems may make open access feel like a hindrance to resource sharing. ILL practitioners may themselves be overwhelmed or frustrated by the number of possible sources for open access materials. The growth in the number of requests for these materials also adds a manual workflow and the burden of filling requests that could have been located by the user. Despite these potential drawbacks to the use of open access materials in ILL, the benefits are clear. Open access helps resource sharing in three ways.

First is the increased ability to fulfill borrowing requests. Theses and dissertations as well as grey literature like conference papers and reports are notoriously difficult to obtain due to lack of holdings or unwillingness on the part of the owning library to lend. In these instances, open access is an enormous help to ILL practitioners in that it allows them to obtain materials for users that they may not be able to otherwise.

Second is speed. By utilizing open access materials, the turnaround time for these requests is greatly reduced. The requests do not need to be sent to other libraries or handled by lending library staff. A parallel can be drawn between the difference in turnaround time for borrowing versus document delivery requests since document delivery requests can be filled with material immediately at hand just as requests for open access materials can be. The RSDS department's overall turnaround times for borrowing and document delivery requests during the two years under study vary by 2.91 days. If you limit the comparison to items delivered electronically, the borrowing turnaround time was 2.9 days while the document delivery turnaround time was 1.75 days. Immediate access to the material requested saved 1.15 days, a clear benefit to ILL services and users.

Third is cost. Since open access materials are free of charge, libraries are saved potential borrowing and shipping fees that a typical ILL transaction could incur. During the two years included in this study, RSDS filled 1,557 borrowing requests using open access materials. The potential cost of borrowing these items through traditional ILL is $27,247.50 based on Mary Jackson's 2004 cost estimate of $17.50 per borrowing transaction (Jackson 2004, p. 31). By utilizing open access materials, the cost for these requests is reduced to a minimal amount of staff time.
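Both the speed and cost benefits are simple arithmetic on the figures just given; the sketch below reproduces them.

    # Speed: turnaround difference for electronically delivered items.
    borrowing_days, document_delivery_days = 2.9, 1.75
    print(f"{borrowing_days - document_delivery_days:.2f} days saved")   # 1.15

    # Cost: avoided borrowing fees over the two years (Jackson 2004 estimate).
    open_access_requests = 1557
    cost_per_transaction = 17.50                            # US dollars
    print(open_access_requests * cost_per_transaction)      # 27247.5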
These benefits will outweigh the potential pitfalls, especially as open access continues to grow. If ILL practitioners want the number of requests for open access materials to decrease, we need to take an active role in the education of our users through our websites, electronic communications, and by working with our colleagues to embed information about open access in instruction. As expert searchers, ILL practitioners are also perfectly positioned to assist their colleagues in improving the discovery of open access materials. Users should be able to discover open access items with ease using intuitive, user-friendly systems and interfaces. In the meantime, ILL practitioners must embrace the idea that we provide a vital service in aiding users with the discovery of open access resources as well as the benefits this large body of literature provides us.

References

Arlitsch, K. and O'Brien, P.S. (2011), "Invisible institutional repositories: addressing the low indexing ratios of IRs in Google Scholar", Library Hi Tech, Vol. 30 No. 1, pp. 60-81.

Baich, T. (2012), "Opening interlibrary loan to open access", Interlending & Document Supply, Vol. 40 No. 1, pp. 55-60.

Connaway, L.S. and Dickey, T.J. (2010), The Digital Information Seeker: Report of the Findings from Selected OCLC, RIN, and JISC User Behavior Projects, The Higher Education Funding Council for England, on behalf of JISC, available at: http://www.jisc.ac.uk/media/documents/publications/reports/2010/digitalinformationseekerreport.pdf (accessed May 12, 2014).

Connaway, L.S., Dickey, T.J. and Radford, M.L. (2011), "'If it is too inconvenient, I'm not going after it:' convenience as a critical factor in information-seeking behaviors", Library and Information Science Research, Vol. 33, pp. 179-190, pre-print available at: http://www.oclc.org/research/publications/library/2011/connaway-lisr.pdf (accessed May 12, 2014).

Corthouts, J., Van Borm, J. and Van den Eynde, M. (2011), "Impala 1991-2011: 20 years of ILL in Belgium", Interlending & Document Supply, Vol. 39 No. 2, pp. 101-110.

DeRosa, C. et al. (2010), Perceptions of Libraries, 2010: Context and Community, OCLC, Dublin, OH, available at: https://www.oclc.org/en-US/reports/2010perceptions.html (accessed May 12, 2014).

Hu, F. and Jiang, H. (2014), "Open access and document delivery services: a case study in Capital Normal University Library", Interlending & Document Supply, Vol. 42 No. 2/3, pp. 79-82.

Indiana University (n.d.), "Campuses", available at: http://www.iu.edu/campuses/index.shtml (accessed May 7, 2014).

IUPUI (n.d.), "About IUPUI", available at: http://www.iupui.edu/about/ (accessed May 7, 2014).

IUPUI University Library (2013), "IUPUI Open Access Publishing Fund", available at: http://www.ulib.iupui.edu/digitalscholarship/oafund (accessed January 21, 2015).
IUPUI University Library (2014), "Center for Digital Scholarship: About", available at: http://www.ulib.iupui.edu/digitalscholarship/about (accessed January 21, 2015).

Jackson, M.E. (2004), Assessing ILL/DD Services: New Cost-Effective Alternatives, Greenwood, Westport, Connecticut.

Koyama, K., Sato, Y., Tutiya, S. and Takeuchi, H. (2011), "How the digital era has transformed ILL services in Japanese university libraries: a comprehensive analysis of NACSIS-ILL transaction records from 1994 to 2008", Interlending & Document Supply, Vol. 39 No. 1, pp. 32-39.

Kress, N., Del Bosque, D. and Ipri, T. (2011), "User failure to find known library items", New Library World, Vol. 112 No. 3/4, pp. 150-170.

Kroll, S. and Forsman, R. (2010), A Slice of Research Life: Information Support for Research in the United States, OCLC, Dublin, OH, available at: http://www.oclc.org/content/dam/research/publications/library/2010/2010-15.pdf (accessed May 12, 2014).

Lewis, D. (1997), "Six Reasons Why the Price of Scholarly Information Will Fall in Cyberspace", unpublished manuscript, available at: http://hdl.handle.net/1805/170 (accessed January 21, 2015).

McGrath, M. (2014), "Viewpoint: open access – a nail in the coffin of ILL?", Interlending and Document Supply, Vol. 42 No. 4, pp. 196-198.

ProQuest (2011), "Open Access Publishing PLUS from ProQuest: frequently asked questions (FAQ)", available at: http://media2.proquest.com/documents/open_access_faq.pdf (accessed May 15, 2014).

Schöpfel, J. (2014), "Open access and document supply", Interlending and Document Supply, Vol. 42 No. 4, pp. 187-195.

Suber, P. (2013), "Open access overview", available at: http://www.earlham.edu/~peters/fos/overview.htm (accessed May 14, 2014).

Yontz, E., Williams, P. and Carey, J.A. (2000), "Interlibrary loan requests for locally held items: why aren't they using what we've got?", Journal of Interlibrary Loan, Document Delivery & Information Supply, Vol. 11 No. 1, pp. 119-128.

----

Recent developments in Remote Document Supply (RDS) in the UK – 3

Stephen Prowse, Kings College London

British Library to pull out of document supply

You read it here first. Purely for the sake of artistic and dramatic licence I've omitted the question mark that should rightly accompany that heading. But even with the question mark firmly in place it still comes as a shock, doesn't it? Can you imagine life without the Document Supply Centre? Can you think the unthinkable? Why should the BL pull out and where would that leave RDS? Rather like a science fiction dystopia, I've tried to imagine what such a post-apocalyptic world would look like and what form RDS might take. Assuming, that is, that RDS would survive the fallout. This article will attempt to show why the BL may be on the verge of abandoning document supply and what could fill some of the huge gap that would be left.
Seven minutes to midnight

We can think of the likelihood of a post-BL document supply world in the same terms as the Doomsday Clock positing the likelihood of nuclear Armageddon – the nearer to midnight the clock hands, the closer the reality. Perhaps we'll start it at seven minutes to and then adjust in future articles? Perhaps a clock could adorn the cover of this esteemed journal?

It could be argued that trends have been pushing the BL towards an exit for a while – the relatively swift and ongoing collapse of the domestic RDS market, for example. But the idea was first publicly mooted or threatened (take your pick) at a seminar jointly organised by the BL and CURL on 5th December 2006 at the BL in London. Presentations from the event can still be found on the CURL website [1]. This event brought together all those with a stake or an interest in the proposed UK Research Reserve (UKRR), a collaborative store of little-used journals and monographs.

Librarians are notoriously loath to completely discard items, preferring to hang on to them in case of future need. Sooner or later this creates a storage problem as space runs out. Acquiring extra space is often problematic and expensive. What is to be done? Moving to e-only preserves access to the content and frees up space but can't be wholly trusted, so print needs to be held somewhere – just in case. Co-operating with other libraries, HE institutions can transfer print holdings to an off-site storage depot and, once an agreed number of copies have been retained, can dispose of the rest. This is the theory underpinning the UKRR.

The UKRR is a co-operative that will eventually invite institutions to become partners or subscribers. The first phase involves the following institutions working with the BL – Imperial College (lead site), and the universities of Birmingham, Cardiff, Liverpool, St Andrew's, and Southampton. Research has shown that the BL already holds most of the stock that libraries would classify as low use and seek to discard – an 80% overlap of journals held by the BL and CURL libraries has been identified. Additional retention copies (meaning a minimum of three in total) would be required to placate fears of stock accidentally being destroyed. It is not felt that extra building will be necessary – stock will be accommodated at BLDSC and at designated sites by encouraging some institutions to hold on to their volumes so that others can discard. SCONUL will be the broker in negotiations as to who will be asked to retain copies.

The first phase began in January 2007 with 17.3 km of low-use journals being identified among the partners for storage/disposal. If the BL already holds most of these volumes, and there is a need to ensure that two more copies are kept, it will be interesting to see how much of the 17.3 km will actually make it to disposal. I expect that libraries will be asked to hold on to much of the material that they would like to send for disposal. In fact, at a subsequent CURL members' meeting in April 2007, Imperial disclosed that 30 out of 1300 metres of stock selected for de-duplication had been sent to BL. This represents only 2.3% being offloaded. Once participation widens then there will be increased scope for disposal, but I can't see the partner institutions creating much space until that happens. Should the UKRR really take off then there may be a need for more building space to accommodate stock, although the BL now has a new, high density storage facility at Boston Spa.
The business model behind the UKRR will mark a change in the way remote document supply is offered to HE institutions and could determine the future of the service. Instead of the current transaction-based model, the new model will be subscription-based and will comprise two elements – 1) a charge to cover the cost of BL storage and 2) a charge according to usage. Institutions that don't subscribe, including commercial organisations, will be charged premium rates. The theory is that costs will not exceed those currently sustained for document supply. Assuming funding is provided for Phase 2, we will see the roll-out of this new model after June 2008.

Advocacy will be crucial to the success of the UKRR. The original study reported widespread buy-in to the idea, but will that translate into subscriptions? Many libraries will already be undertaking disposal programmes, particularly those with more confidence in and/or subscriptions to e-repositories such as Portico. Will anyone really want access to that material once it's out of sight (and out of mind)? If requests remain low (and decline further) and take-up isn't great then that could spell the end as far as the BL and RDS go. The editor has already commented on the apparent lack of commitment to RDS from the chief executive of the BL (McGrath, 2006). This year the BL faces potentially very damaging cutbacks from the 2007 Spending Review, with threats to reading rooms, opening hours and collections together with the possible need to introduce admissions charges. RDS wasn't mentioned as a possible target for cutbacks, but then the BL will want to see how the UKRR and the new model fare. Tough financial targets in the future coupled with low use/low take-up could lead to a time when the BL announces enough is enough.

I'm sure that scenario has been considered by a new group of senior managers set up within the British Library – the Document Supply Futures Group. Not surprisingly, little is known about this Group (again this was something divulged at the CURL December presentation) but if it's looking at all possible futures then it must also be considering no future. The group is headed by Steve Morris, the BL's Director of Finance and Corporate Services. McGrath reported in his paper just quoted that senior figures were seriously considering the future of document supply in 2001. Whatever comes from this Group's deliberations, the present tangible outcome is a commitment to the UKRR. We'll see where that goes – the clock's ticking.

Alternative universes

If ever we're left adrift in RDS without the BL then what are the alternatives? One option is to go it alone and request from whoever will supply. For this a good union catalogue will be a fundamental requirement. COPAC has had a facelift and, as with other search tools, the Google effect can be seen in the immediate presentation of a simplified 'quick search' screen. Expansion is taking place with the catalogues of libraries outside of CURL also being added, e.g. the National Art Library at the Victoria and Albert Museum already added and the Cathedral Libraries' Catalogue forthcoming. The national libraries of both Wales and Scotland are on COPAC, as is Trinity College Dublin. The National Library of Scotland has always had an active ILL unit, although this is far too small to take on too many requests. Further development of COPAC has seen support for OpenURLs so that users are linked to document supply services at their home libraries.
CISTI is the Canadian document supplier that would welcome more UK customers. However, it should be remembered that the BL acts as a backup supplier for CISTI, so without them CISTI could only play a minor role. A new service for 2007 is the supply of ebooks. For US$25 users can access the ebook online for 30 days, after which the entitlement expires. As far as I'm aware this is the first solution aimed at tackling the problem of RDS in ebooks. Ejournal licences have become less restrictive and usually allow libraries to print an article from an ejournal and then send it to another library. This obviously isn't an option for ebooks, and neither can libraries download and pass on the whole thing or permit access, so the CISTI solution is an attractive option.

Of course a major undertaking of the BL is to act as a banker on behalf of libraries for the supply of requests. Libraries quote their customer numbers on requests to each other and charges can then be debited and credited to suppliers once suppliers inform BL (via an online form or by sending a spreadsheet). IFLA vouchers can act as currency but these are paper-based rather than electronic, even though an e-voucher has long been desired and projects have looked at producing one.

Realistically, survivors in a post-BL document supply world would need to band together with like-minded others to form strong consortia and reap the benefits from membership of a large group. Effectively that boils down to two options – joining up with OCLC or with Talis.

Despatches from the Unity front – no sign of thaw in the new cold war

OCLC and Talis have both, naturally enough, been promoting their distinct approaches to a national union catalogue and an RDS network that can operate on the back of that, while firing an occasional blast into the opposing camp. Barely was the ink dry on the contract between The Combined Regions (TCR) and OCLC for the production of UnityUK than opening salvos were being exchanged. At the time Talis had announced they were going ahead with their own union catalogue and RDS system. Complaints relating to data handover and data quality were being lobbed Talis' way. Since then news items on UnityUK have appeared regularly in CILIP's 'Library + Information Update' (Anon, 2006) along with a stream of letters (Chad, Froud, Graham, Green, Hendrix, McCall, 2006), including one from a Talis user and two from senior Talis staff, bemoaning the situation and seeking 'a unified approach'. Talis' position is that they would like to enable interoperability between Talis Source and OCLC for both the union catalogue and the RDS system. I'm sure TCR's position is that OCLC won the tender to provide a union catalogue and RDS services, while Talis didn't bid, and they are happy to press on without Talis, thank you very much.

An article by Rob Froud, chair of TCR, in a previous issue of ILDS (Froud, 2006b), providing some history and an update on progress, was met with a counter-blast from Talis' Dr Paul Miller in the Talis Source blog (Miller, 2007). A particular bone of contention was the decision taken by Rob Froud to withdraw a number of TCR libraries' holdings from the Talis Source union catalogue. Not an especially surprising move given the circumstances, but neutrals should note that libraries can contribute holdings records freely to both. However, access to the union catalogue will only be free with Talis. This free access for contributors has seen more FE and HE libraries joining Talis Source.
It's interesting comparing membership lists. While there isn't great overlap between the two, there is a significant minority of public library authorities who are members of both. Will this continue, and, if so, for how long? UnityUK and Talis Source have staked their claims to be the pre-eminent union catalogue and RDS network on their respective websites [2, 3]. UnityUK have this to say - "In 2007, the combined UnityUK and LinkUK services will bring together 87% of public libraries in Great Britain, Jersey and Guernsey into one national resource sharing service." They show their extent of local authority coverage with the following membership figures -

97% County Councils
97% London Boroughs
97% Metropolitan authorities
75% Unitary authorities

Meanwhile, Talis Source announces itself as "the largest union catalogue in the UK comprising 26 million catalogue items and 55 million holdings from over 200 institutions." (April 25th, 2007).

No more ISO ILL for NLM

In January 2007 the National Library of Medicine (NLM) in the U.S. said that it would no longer accept ILL requests into its DOCLINE system via ISO ILL. The reasons cited were poor take-up (only three libraries were using it), and the drain on resources by having to test separately with every supplier and every institution that wanted to use it. The protocol itself is quite long but implementers do not have to implement every item – they can select. This meant however that each implementer had to test with NLM even if they were using one (out of only four) of the systems suitable for use. The time and effort required to support ISO ILL was too much and so the NLM pulled the plug. This raises a number of questions about the use of ISO ILL and its future. It doesn't seem to be well-used in the U.S., e.g. OCLC's Resource Sharing website lists nearly four times as many Japanese libraries using it compared to those in the U.S., and the British Library hasn't developed its own ISO ILL gateway since that came on stream. That gateway is of course run on VDX. On the other hand, ISO ILL is used in VDX-based consortia in the UK (UnityUK), the Netherlands, Australia and New Zealand. Quite where all this leaves ISO ILL I don't know, but I wouldn't be too optimistic about its prospects.

Big deals - unpicking the unused from the unsubscribed

Statistics on ejournal usage have moved on apace since publishers committed themselves to achieving COUNTER compliancy in their reports. By creating a common standard, COUNTER reports from one publisher can be meaningfully compared with those of another, knowing that both treat data in the same way. SUSHI takes that a step further by consolidating reports from several publishers into one to provide easy comparisons and show usage across platforms. These can be accessed via Electronic Resource Management Systems (ERMS) or by subscribing to a service such as ScholarlyStats. By utilising such tools analysis of these statistics will become increasingly sophisticated, but I suspect that for the moment it remains at a somewhat elementary level. After all, who has the time to look much beyond full text downloads and what titles are or are not being used?

The Evidence Base team at the University of Central England have been running a project involving 14 HE institutions that looks at their usage of ejournals, specifically big deals. Libraries are given reports on their usage of ejournals within selected deals and how these rate for value etc.
Furthermore, libraries can compare their use with use made at other libraries in the project. At King's we have received a number of reports including our use of Blackwell's STM collection in 2004-05, ScienceDirect in 2004-05 and Project Muse in 2005 (Conyers, 2006-07). The Blackwell's report runs to 22 pages and provides a wealth of detail. Some key findings are highlighted –

• 19% increase in usage from 2004 to 2005
• 91% of requests come directly from the publisher's web-site, compared to 9% through Ingenta
• The average number of requests per FTE user was 6.7 in 2004 and 8.4 in 2005
• 50% of titles in the STM deal were used 100 times or more and 96% of total requests were generated by these titles
• 62% of high priced titles in the deal (£400 and over) were used 100 times or more. Higher priced titles were used more frequently than those with a low price (under £200)
• 78% of subscribed titles and 39% of unsubscribed titles were used 100 times or more
• 62 titles (14% of total) received nil or low use (under 5 requests) in 2005. 22 of these (35%) were unpriced titles not fully available within the deal and a further 18 (29%) were low price (under £200)
• The average number of requests per title in 2005 was 369. Average requests for a subscribed title were 860 and for an unsubscribed title 186
• The heaviest used title was Journal of Advanced Nursing, which recorded 15,049 requests in 2005 and 13,840 in 2004

So the report confirms that heavy use is made of titles in the deal, that practically all use is concentrated on half the titles, although practically every title gets some use, and that it is the expensive titles that are most used, but also that unsubscribed titles can attract heavy use. Furthermore, in discussing costs the report finds that the average cost of a request to a subscribed title is 84p in 2005, and just 16p to an unsubscribed title. Pretty good value when all is said and done.

The second report confirms much of what the first found. I'll focus on two of the deals – ScienceDirect (SD) and Project Muse (PM) – as the first is our biggest deal (and will be the case for other libraries too) and PM has a humanities focus which provides a nice contrast. In SD 35% of titles were used 100 times or more, in PM 15%. SD had 2% of titles with nil use*, PM 4% (*nil use doesn't include 'unpriced' titles with limited availability). SD had 80% of subscribed titles used 100 times or more and 27% of unsubscribed titles; for PM the figures were 36% and 9% respectively. This reflects the relative importance of ejournals to users in STM and Humanities fields but also shows how much users gain from a big deal like SD. The average cost for a request to a subscribed SD title was £1.12 and only 2p for an unsubscribed title.

One of the arguments against big deals is that you are buying content that you don't really need – a lot of filler is thrown in with the good stuff. While not totally dispelling that presumption, research such as that produced by Evidence Base can counter that argument somewhat and certainly puts a lot more flesh on bare bones. If you choose carefully which deals you sign up to then your users can make good use of this extra content. At the time of writing (June) Evidence Base were recruiting institutions for a second round of the project.
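The per-request cost figures cited above are, in essence, annual spend divided by COUNTER-reported full-text requests. A minimal sketch of that calculation follows; the spend and usage numbers are invented for illustration, and only the method is implied by the reports.

    def cost_per_request(annual_spend_pounds, fulltext_requests):
        # Cost per use: subscription spend divided by COUNTER full-text requests.
        return annual_spend_pounds / fulltext_requests

    subscribed_spend, subscribed_requests = 250_000, 297_000   # hypothetical
    pence = cost_per_request(subscribed_spend, subscribed_requests) * 100
    print(f"{pence:.0f}p per request")                          # ~84p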
A report from Content Complete (the ejournals negotiation agent for FE, HE and the Research Councils) outlined what they discovered from trials involving five publishers and ten HE institutions that took place between January and December 2006 (Content Complete Ltd, 2007). The idea behind the trials was to look at alternative models to the traditional big deal, and in particular to focus on unsubscribed or non-core content and acquiring this via pay per view (PPV). Although the common idea of PPV as a user-led activity was quickly dropped as impractical, a cheaper download cost per article was agreed for all but one of the publishers instead. PPV was then considered in the context of two models – one where unsubscribed content is charged per downloaded article, and the second also with a download charge per article, but this time, should downloading reach a certain threshold, PPV would convert to a subscription and there would be no further download charges. This second option appears more attractive to librarians at first glance as it puts a ceiling on usage, and therefore cost per title, but costs could still mount up considerably if the library saw heavy usage across a wide range of unsubscribed content and was forced into taking further subscriptions.

The report highlights a number of problems to do with accurately measuring downloads, such as the need to discount articles that are freely available, to not count twice those that are looked at in both HTML and PDF, and to include those downloaded via intermediaries' gateways. Ultimately these problems proved too much of a technical and administrative difficulty to overcome during the trials for both publishers and librarians. Such problems are likely to continue for some time, although one imagines, given sufficient incentive, they could be overcome with automation and developments to COUNTER and SUSHI. However, would the incentive exist? For the trials also found that the PPV models didn't compare too well against the traditional big deals in terms of management, and in almost all cases ended up more expensive.
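The two trial models lend themselves to a toy comparison. The sketch below uses invented prices, thresholds and download counts, and assumes (as one reading of the model) that the subscription fee replaces further download charges once the threshold is reached; the trials themselves did not publish a formula.

    def ppv_cost(downloads, price_per_article):
        # Model 1: every download of unsubscribed content is charged.
        return downloads * price_per_article

    def capped_ppv_cost(downloads, price_per_article, threshold, subscription):
        # Model 2: per-article charges apply until a threshold converts the
        # title to a subscription, after which no further download charges accrue.
        if downloads < threshold:
            return downloads * price_per_article
        return threshold * price_per_article + subscription

    for n in (20, 60, 200):   # hypothetical download counts for one title
        print(n, ppv_cost(n, 10.0), capped_ppv_cost(n, 10.0, 50, 400.0))

For light use the two models cost the same; for heavy use Model 2 caps the cost of a single title, but, as the text notes, a library forced into many such conversions could still see its total spend mount up.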
Updates

In Recent Developments…2 I reported on the RDS proposal for the NHS in England. There's been some progress on this but there's still quite a way to go. A list of options has been trimmed to five to undergo cost-benefit analysis before deciding on an eventual winner. The options range from doing little or nothing, to improving direct access to content, to using a vendor's RDS system, to outsourcing. Building a search engine across catalogues or developing a national union catalogue were the rejected options. It won't be until November that the preferred option is chosen; then, should procurement prove necessary, that will take until September 2008, with implementation following early in 2009 (ILDS Task Group, 2007).

There have been two significant developments on open access (OA). Firstly, the UK version of PubMed Central launched in January 2007. Like the original U.S. version this will be a permanent archive of freely available articles from biomedical and life sciences journals. Although initially set up as a mirror service, the UK version has 307 such journals at the time of writing (June 2007) against 334 in the U.S. version. We can expect future developments to favour UK and European resources. The UK version is supported by a number of organisations – the British Library, the European Bioinformatics Institute and Manchester University are the suppliers, while a number of organisations including the Wellcome Trust provide funding. Secondly, for researchers who do not have access to an institutional or subject repository, JISC is now offering a service called the Depot, where peer-reviewed papers can be deposited. The Depot is not intended as a long-term repository but rather more of a stop-gap until more become available.

eTheses – a long time coming

Of course, repositories don't just have to be homes for journal articles; they can contain a lot more. The possibility of institutions holding their own theses in electronic form has been mooted since the early to mid nineties. Early projects often had a Scottish base and had wider dissemination of research material as a key factor in their raison d'être. An important group looking into the subject was the University Theses On-line Group (UTOG), chaired by Fred Friend. A survey they undertook showed how important theses were to those who consulted them, how authors would be happy to see their own theses more widely consulted, and that most theses were being produced in electronic form and so should be easily adapted to storage in an electronic form (Roberts, 1997). One of the members of the UTOG, the Robert Gordon University, subsequently led a smaller group to look at etheses production, submission, management and access. The recommendations from that group led to the EThOS (Electronic Theses Online Service) project, which in turn is in the process of establishing itself as a service. From that service researchers will be able to freely access theses online, while deposit can be directly into EThOS or by harvesting from institutional repositories. Digitisation of older theses can also be undertaken by the British Library as part of the service. Around the peak of BLDSC's RDS operations in 1996-97 over 11,000 theses were supplied as loans, with more than 3,000 also being sold as copies (Smith, 1997).

Final point

With UK PubMed Central and EThOS the British Library will be making material freely available that would previously have had to be obtained via RDS. That seems to be the way that much RDS has been going. Previously it was quite expensive, took a while and had to be done via an intermediary; increasingly the documents traditionally obtained via RDS are free and available directly to users immediately. It's an interesting turnaround, isn't it?

Notes

1 BL & CURL presentations on the UKRR from the December 2006 meeting can be found at http://www.curl.ac.uk/projects/CollaborativeStorageEventDec06.htm
2 TCR/UnityUK http://tcr.futurate.net/index.html
3 Talis Source http://www.talis.com/source/

References

Anon. (2006), "Will UnityUK bring ILL harmony?", Library + Information Update, Vol. 5 No 5, pp. 4.

Anon. (2006), "OCLC Pica/FDI and Talis set out their stalls", Library + Information Update, Vol. 5 No 5, pp. 4.

Chad, K. (2006), "Removing barriers to create national catalogue", Library + Information Update, Vol. 5 No 7-8, pp. 24.

Content Complete Ltd (2007), JISC business models trials: a report for JISC Collections and the Journals Working Group, available at http://www.jisc-collections.ac.uk/media/documents/jisc_collections/business%20models%20trials%20report%20public%20version%207%206%2007.pdf (Accessed 28th June 2007).
Conyers, A. (2006-2007), Analysis of usage statistics, Evidence Base, UCE, Birmingham, unpublished reports.

Froud, R. (2006), "Small price to pay for a proper inter-library lending system", Library + Information Update, Vol. 5 No 7-8, pp. 25.

Froud, R. (2006b), "Unity reaps rewards: an integrated UK ILL and resource discovery solution for libraries", Interlending & Document Supply, Vol. 34 No 4, pp. 164-166.

Graham, S. (2006), "We want a unified approach to inter-library lending", Library + Information Update, Vol. 5 No 9, pp. 25.

Green, S. (2006), "Make Unity UK freely available to boost demand", Library + Information Update, Vol. 5 No 6, pp. 24.

Hendrix, F. (2006), "Struggle for national union catalogue", Library + Information Update, Vol. 5 No 6, pp. 26.

ILDS Task Group (2007), Strategic business case for interlending and document supply (ILDS) in the NHS in England: recap and update on short listing of options, unpublished report.

McCall, C. (2006), "Seeking a unified approach to inter-library lending", Library + Information Update, Vol. 5 No 10, pp. 21.

McGrath, M. (2006), "Our digital world and the important influences on document supply", Interlending & Document Supply, Vol. 34 No 4, pp. 171-176.

Miller, P. (2007), "Unity reaps rewards: a response", Talis Source Blog, available at http://www.talis.com/source/blog/2007/03/unity_reaps_rewards_a_response_1.html (Accessed 7th June 2007).

Roberts, A. (1997), Survey on the Use of Doctoral Theses in British Universities: report on the survey for the University Theses Online Group, available at http://www.lib.ed.ac.uk/Theses/ (Accessed 28th June 2007).

Smith, M. (1997), How theses are currently made available in the UK, available at http://www.cranfieldlibrary.cranfield.ac.uk/library/content/download/678/4114/file/smith.pdf (Accessed 6th July, 2007).

----

Cataloging for Consortial Collections: A Survey

Shi Deng, Aislinn Sotelo & Rebecca Culbertson

Cataloging & Classification Quarterly, 56(2-3), ISSN 1544-4554. Permalink: https://escholarship.org/uc/item/4j5118zb. Publication date: 2017-11-13.

Note: The Version of Record of this manuscript has been published and is available in Cataloging & Classification Quarterly on November 13, 2017, 56:2-3, 171-187, DOI: 10.1080/01639374.2017.1388327. To link to this article: https://doi.org/10.1080/01639374.2017.1388327

Abstract: Libraries face many challenges in making electronic resources accessible and discoverable. In particular, the exponentially increasing number of licensed and open access electronic resources and the dynamic nature of consortial collections (platform changes, title transference between packages, and package overlap) present challenges for cataloging and discovery.
From March 21 to April 10, 2017, the authors performed a selected review of library literature and conducted a survey of library consortia worldwide to ascertain the cataloging models, strategies, and advanced technological tools used to ensure discovery of consortial collections. The findings from the literature review and survey are summarized in this article.

Keywords: electronic resources, consortial cataloging, cooperative cataloging, centralized cataloging, vendor records, MARC record service (MRS), batch cataloging

Introduction

There are several benefits to being part of a library consortium, including opportunities to leverage expertise, participate in cost sharing, and create a cohesive user experience. One significant benefit is the opportunity to license electronic resource materials on a consortial level. The cost sharing benefits, however, come with many challenges, especially when it comes to cataloging, accessibility, and discoverability of consortial collections.

The authors of this article are metadata librarians and managers at the University of California, San Diego, which is the home site for the Shared Cataloging Program (SCP) of the California Digital Library (CDL). Established in 2000, the SCP is a centralized cataloging unit that provides catalog records for CDL licensed and selected open access electronic resources for the ten University of California (UC) campuses. Over the past seventeen years, the SCP has managed many consortial cataloging changes and challenges as described in the One for Ten article in this festschrift issue. Additionally, several UC campuses have recently or will soon migrate to a new Integrated Library System (ILS) / Library Service Platform (LSP), which will potentially impact the SCP operations and workflows. To help prepare for upcoming ILS/LSP workflow and operational changes, the authors sought to identify how other library consortia manage discovery of their consortially licensed materials. To do this, they performed a selected literature review (focusing on articles that emphasized cataloging consortial collections) and conducted a survey to better understand the current landscape of consortial cataloging. Their findings are summarized in this article.

Literature Review

To provide context for the survey results, the literature review seeks to illuminate current and past cataloging practices and approaches to the discovery of consortial collections. The authors found articles covering topics such as consortial acquisition, consortial licensing, consortial electronic resources management (ERM), shared collection development, consortial Demand Driven Acquisition (DDA), return on investment calculation, cataloging partnerships between or among consortial members, and how to be a responsible collaborative participant in a library consortium, but there were only a handful of articles that focused specifically on cataloging for consortial collections. The authors will try to summarize the literature on cataloging for consortial collections in this article.
The literature can be divided into two distinct time periods, the 2000s and 2010s, which clearly document a progression or change in the approach to providing access to consortially licensed materials.

The first cluster of articles was published between 2000 and 2005 and illustrates developing approaches for cataloging for a consortium. In 2000, Cary and Ogburn1 detailed the development of the Virtual Library of Virginia (VIVA), a consortium of separate universities funded by the state legislature. "There is no one integrated, online library system used in the state; instead, many different systems are in use among member institutions" (p. 46-47). The VIVA Cataloging and Intellectual Access Task Force (VIVACAT) was charged in October 1996 "to establish a process for sharing information about cataloging VIVA materials; develop a draft of goals, principles, and guidelines for cataloging VIVA materials; and determine future means of cooperative communication for cataloging staff in the VIVA libraries" (p. 46). VIVACAT developed shared cataloging standards, cataloger training for member institutions, and shared vendor records for each member institution to load into their local system. Following up in 2002, Shieh, Summers, and Day2 discussed how VIVA member libraries standardized vendor MARC records and distributed them within the consortium.

In 2002, Boyle and Hughes3 documented the cooperative cataloging efforts of the ten member libraries of the Nebraska Independent College Library Consortium (NICLC), emphasizing the need for standardization in cataloging. When the NICLC transitioned from local catalogs to a union catalog, they decided to use a single record to represent their consortial collections. One vendor provided authority control. Another vendor merged ten local records into one record in the NICLC database and faced the difficulty of reconciling and merging different versions of OCLC records that were downloaded at different times by member libraries. They were unable to resolve the issue due to local edits which could not simply be overlaid with updated OCLC master records. Today, libraries can facilitate and resolve the same difficulty with new services such as the OCLC Collection Manager updating record service. NICLC moved to OCLC's Worldshare Management Services (WMS) in 2012 and uses WorldCat Local as the library catalog.4

In 2002, French, Culbertson, and Hsiung5 made a case for centralized cataloging for consortia (the Shared Cataloging Program of the California Digital Library) as an effective cost saving method for supplying catalog records, provided that agreed upon standards were met. Stumpf6 conducted an analysis in 2003 of raw data collected from the Municipal Library Consortium (MLC) of St Louis County, a consortium of eight small public libraries who "share an OPAC, an interlibrary loan system, and maintains reciprocal borrowing among its patrons" (p. 93). Stumpf's survey "covered all aspects of a library's cataloging process" (p. 94) such as staffing, processing, location, turnaround time, etc. Based on her analysis, she believed it would be feasible and more efficient to establish a centralized acquisitions-cataloging facility hosted by one of the member institutions. If MLC decided to implement, it would require a further in-depth study.
In 2003, O'Connell7 recalled the information presented at a workshop, led by Paul Moeller and Wendy Baia at the University of Colorado at Boulder, on cataloging within a consortium catalog. The article outlined issues of serials cataloging in a consortium setting and made recommendations for addressing these issues. "Moeller provided an overview of the interesting features of several consortia. The CIC Virtual Electronic Library does not provide a central catalog but allows searching of all member library catalogs individually or as a group. VIVA, the Virtual Library of Virginia, does not have a consortium catalog, but has documentation providing guidelines for best cataloging practices for the electronic resources it purchases. OhioLINK, which was a leader in the development of a central catalog to facilitate the sharing of resources, has become a purchaser of e-resources. CDL, the California Digital Library, employs centralized cataloging of the e-resources purchased by the consortium" (p. 230-231). Moeller also mentioned that "the CDL has done an especially good job of documenting their cataloging policies" (p. 231), making them available to other libraries, and recommended that "those developing documentation for consortium cataloging practices follow CONSER standards" (p. 231).

In 2004, Xiaotian Chen et al.8 presented the results of a 2003 survey conducted by a task force of the Illinois Library Computer Systems Organization (ILCSO) (now the Consortium of Academic and Research Libraries in Illinois (CARLI)) that focused on e-resource cataloging practices in academic libraries and consortia. With over 60 responses from institutions and consortia, the survey found that libraries were "dealing with a volatile set of unstable resources which change names, contents, providers, and URLs with alarming frequency and thereby require repeated revisions to their surrogate records" (p. 174). Additionally, "no clear or thorough national standards exist…leaving policy making at the institutional level" (p. 174). Xiaotian Chen et al. also discussed emerging trends such as the adoption of "the CONSER B+ option, or the aggregator-neutral record" (p. 176), the "growing tendency of obtaining sets of resources from vendors" (p. 176) which moves "libraries away from the single record approach" (p. 177), and the development of OpenURL link server or resolver software for e-resource management (p. 177).

Following up in 2005, Chew and Braxton9 shared their perspectives and reflections on their deliberative process of developing recommended standards for consortial cataloging of electronic resources for ILCSO. Covering many factors, from defining the purpose of the shared catalog, local and historical matters, online holdings, the MARC environment, vendor records, and the single record vs. separate record approach, to catalog functionalities, their deliberations centered on "two of the major issues facing libraries today: the challenge of providing access to electronic resources, and of doing so in a distributed environment" (p. 324).

Review of the first cluster of articles revealed that, in the early years, library consortia faced similar challenges in establishing practices for cataloging consortial collections. Library consortia adopted one of two methods of providing access and discovery for consortial collections: centralized cataloging, or local cataloging and record sharing.
The MLC and the CDL took a centralized cataloging approach, while member institutions of CARLI, VIVA, and NICLC contributed and shared catalog records either through a union catalog (CARLI and NICLC) or members' local catalogs (VIVA). Regardless of the record provider method, the challenges of establishing cataloging standards for consortia were similar, and were identified as the lack of cataloging standards for bibliographic records shared within a consortium, the need to reconcile different local practices (e.g., using single vs. separate records for e-journals) with national practices, how to manage versions of shared records, and/or how to manage vendor records for large sets of e-books. While VIVA developed shared cataloging guidelines for consortium members at the local level, there were also emerging national standards such as the CONSER B+ option to treat multiple providers using one bibliographic record (now the PCC policy and standard for creating provider-neutral records).

The second cluster of articles, published between 2010 and 2013, demonstrates a trend towards utilizing batch cataloging methods and vendor records for cataloging consortial collections. In 2010, Martin and Mundle10 shared the University of Illinois at Chicago (UIC) Library's experience of managing and improving the quality of bibliographic records for a large e-book collection in a consortial setting, and the challenges of working with vendor records. In 2011, Preston11 examined the ever-changing workflow for the production and distribution of MARC records for e-books in the OhioLINK system designed by the Database Management & Standards Committee (DMSC). Comprised of 25 technical service librarians from various libraries in the system, the DMSC developed the Standards for Cataloging Electronic Monographs. Initially, the Standards applied to vendor records only but later played a broader role for Ohio libraries that were involved in cooperative cataloging, leading to the development of consistent cataloging standards for the Ohio libraries.
In 2013, Lu and Chambers13 described the University of Colorado (CU) system PDA program with a focus on how MARC records are prepared and distributed to the CU libraries--one campus catalogs the records using MarcEdit and then distributes them to other campuses within the system. While some of the challenges mentioned in the first cluster of articles continue to be debated, other challenges are emerging such as the reliance on vendor records to handle the deluge of e-resources, using batch processes to improve large sets of vendor records (OhioLINK, VIVA, CU), developing workflows for record distribution (OhioLINK, CU), and managing consortial DDA records (CU). Based on the literature review, it was not possible to obtain an overall picture of the number of consortia cataloging for consortial collections and the approaches they use. Most of the literature reviewed for this article focused on individual consortium or consortia members managing cataloging for consortia. Only the survey conducted by Xiaotian Chen, et al. in 2003 was distributed widely to libraries and consortia. With the increased need to use vendor records, MARC record services, and batch cataloging to manage the exponentially increasing number of e-resources (especially e-books), what are the unique issues facing consortia cataloging and how are they being addressed? To answer these questions, the authors designed a survey to conduct an environmental scan to learn more about how library consortia catalog and create access to their materials. The survey will be the first step in an effort towards creating a conversation about possible best practices for consortial cataloging. Survey Design From March 21 to April 10, 2017, library consortia worldwide were invited to participate in a survey about current consortial cataloging practices. The survey was posted to several Cataloging for Consortial Collections: A Survey 8 listservs including ICOLC, PCCLIST, CONSRLST, Autocat, OCLC-cat, and OCLC-CJK. Comprised of thirteen questions, the survey was a combination of multiple choice and open- ended questions (see attached survey form in Appendix). It was intentionally brief to allow participants to answer the questions within five to fifteen minutes. The goal of the survey was to determine if each consortium had a systematic way to acquire licensed and open access electronic resources. If yes, did these consortial collections get cataloged, and what cataloging models, strategies, and advanced technological tools were employed? Additionally, how did library consortia address the challenges mentioned above and ensure the discovery and access of consortia collections? Survey Results During the twenty-one day survey period, thirty-two consortia participated and submitted thirty-six valid responses. In order to analyze and organize the survey data accurately, adjustments were made for multiple responses from the same consortium. Since multiple member libraries from the same consortium responded to the survey separately, some answers were combined and counted only once. For example, only one response was considered for the following questions: Question 2: “Does your consortium have a systematic way to acquire licensed and open access electronic resources?”, and Question 3: “Do your consortium collections get cataloged?” For other questions with multiple answer choices, multiple responses from the same consortium were counted only once, if the responses were the same. 
However, all non-duplicated responses from the same consortium were included in the data analysis. For example, if there were two varied responses from the same consortium for Question 6: "Who catalogs for the collections selected by the consortium?" (e.g., vendor records and MARC record service), both answers were included in the results.

Question 1: Participating Consortia

Thirty-two consortia responded to the survey. Table 1 below lists the name and acronym of the participating consortia in the first and second columns, respectively. The demographic makeup of the respondents was varied and well-represented, encompassing a variety of libraries (academic, public, special, school, museum, archives, county, state, government, and the United Nations) across four continents (Asia, Africa, North America, and Europe). Twenty-nine of the thirty-two respondents represented U.S. library consortia. Of the twenty-nine U.S. library consortia, five consortia served member libraries in multiple states or nationwide, and the remaining U.S. library consortia were from fifteen states serving member libraries at the state, city, or small region levels. The membership of each consortium varied from 3 to 1,700 member libraries. Some consortia served university systems, while others served member libraries within a state or a region. The membership of state or regional library consortia also varied and was determined either geographically (e.g., all libraries and museums within a specific region), by the type of library (e.g., academic/research libraries of higher education institutions, public libraries, and county/state libraries), or both.

Table 1. List of consortia to which the participants belong

Consortium Name | Acronym | Content Acquisitions
Abilene Library Consortium | ALC | Yes
Bavaria Consortium |  | Yes
Big Ten Academic Alliance (formerly the CIC) | BTAA | Yes
California Digital Library | CDL | Yes
California State University | CSU | Yes
Consortium of Academic and Research Libraries in Illinois | CARLI | Yes
CONsortium on Core Electronic Resources in Taiwan | CONCERT | No
Cooperative Computer Services | CCS | Yes
Florida Academic Library Services Cooperative | FALSC | Yes
LYRASIS | Lyrasis | Yes
MassCat, a service of the Massachusetts Library System | MassCat | No
MOBIUS | MOBIUS | Yes
Moody Bible Institute -- 3 branches -- Chicago, Spokane, Michigan |  | Yes
NorthEast Research Libraries consortium | NERL | No
Ohio Library and Information Network | OhioLINK | Yes
PALCI | PALCI | Yes
Partnership Among South Carolina Academic Libraries | PASCAL | Yes
Private Academic Library Network of Indiana | PALNI | No
Public and Academic Library Network | PALnet | Yes
SHARE, part of the Illinois Heartland Library System | SHARE | No
South African National Library and Information Consortium | SANLiC | Yes
South Carolina Library Evergreen Network Delivery System | SC LENDS | Yes
Statewide California Electronic Library Consortium | SCELC | Yes
TexShare (LEIAN group) |  | Yes
The Louisiana Library Network | LOUIS | Yes
U.S. National Park Service |  | No
UN System Electronic Information Acquisitions Consortium | UNSEIAC | Yes
University of Texas System Digital Library | UTSDL | Yes
University of Wisconsin System |  | Yes
University System of Maryland and Affiliated Institutions | USMAI | Yes
Western New York Library Resources Council | WNYLRC | Yes
Western North Carolina Library Network | WNCLN | No

Question 2: Does your consortium have a systematic way to acquire licensed and open access electronic resources?
Twenty-five consortia (79%) responded "yes" and seven consortia (21%) responded "no." See the third column in Table 1 above for corresponding answers by consortia and see Table 2 below for the summary of responses. The consortia who chose "no" were asked to skip questions 3-13 and submit the survey.

Question 3: Do your consortium collections get cataloged?

Twenty-two out of twenty-five consortia (88%) responded "yes" and three consortia (12%) responded "no." See Table 2 below for a summary of responses. The consortia who chose "no" were asked to skip questions 4-10 and complete questions 11-13.

Table 2. Number of consortia who acquire and catalog licensed and OA e-resources

 | Number of consortia | Yes | No
Systematically acquire licensed and open access e-resources | 32 | 25 consortia (79%) | 7 consortia (21%)
Consortium collections get cataloged? | 25 | 22 consortia (88%) | 3 consortia (12%)

Question 4: What categories of resources are cataloged?

Twenty-one consortia submitted twenty-six responses. See Figure 1 below for a summary of the number of responses by type of resource. Answers for the "Other" category were as follows:
● State documents, Federal documents
● Items purchased in perpetuity is being catalogued, subscriptions not
● Nonbook formats

[Figure 1. What categories of resources are cataloged?]

Question 5: Do you provide an avenue for bibliographers to request cataloging of open access resources, either single serials, or open access packages?

Twenty-four consortia submitted responses to this question. Twelve respondents (about 50%) indicated that they did not have any formal cataloging request process in place, at either the local or consortium level. The responses below expanded on some of the "yes" or "no" answers:
● Libraries may request creation of record sets for open access resources through our request tracking system. Collections are built as time permits
● No formal process, but they could ask
● Yes. Bibliographers may request that resources be added to the catalog/discovery service. MARC records currently received from Serials Solutions.
● Some institutions do, but we do not manage this as a consortia. Each institution decides on their own process.
● Yes. Bibliographers
● Not formally, but we would try to accommodate any requests that came in
● We're experimenting with this -- what level of access can be provided by our discovery layer vs what do we need to catalog ourselves if we want to enhance access.
● Yes, but the request must come from selectors from our library (not from any consortium member)
● Yes (although this may vary from campus to campus)
● Yes. both a single form for serials and, lately, a newer form for groups of resources

Question 6: Who catalogs for the collections selected by the consortium?

Twenty-one consortia submitted twenty-six responses. If a respondent selected "Other" and provided an answer that duplicated one of the supplied answer categories (see Figure 2 below), that answer was included in the data for the specific category rather than in "Other." As a result, only three out of seven responses in the category "Other" were counted as such.
With these adjustments, the data showed that nine consortia (42.9%) made use of "MARC record" services (such as a Knowledge Base, or Alma Network/Community/Institutional Zones), member libraries of five consortia (23.8%) supplied their own cataloging, four consortia (19%) had designated cataloging units to share cataloging workloads, and one consortium (4.8%) divided cataloging among member libraries. The latter consortium also made use of "MARC record" services. Three consortia (14.3%) selected "Other" and provided the following answers:
● A central office imports records batch-cataloged to a minimum level, which member libraries can locally correct & enhance
● Most are based at the State Library owing to nature of collections
● Records are batch loaded and not touched by catalogers

[Figure 2. Who catalogs for the consortium collections?]

Question 7: What strategies/methods are used for cataloging?

Twenty-one consortia submitted twenty-six responses. See Figure 3 below for a summary. Sixteen consortia (76.2%) made use of vendor records. Twelve consortia (57.1%) made use of "MARC record" services (such as a Knowledge Base, or Alma Network/Community/Institutional Zones). Ten consortia (47.6%) batch created MARC records. Nine consortia (42.9%) distributed records to their member libraries. Six consortia (28.6%) selected "Other" and provided the following answers:
● OCLC record sets to acquire MARC records, then scripted cleanup to bring them to a minimum level of quality
● Original cataloging as needed (especially serials), enhanced copy cataloging, and batch editing
● Databases are not catalogued. Content is made available by our discovery tool. Perpetual items are supplied by: Batch creation of MARC records and Copy cataloging via OCLC
● OCLC, original cataloging
● Original
● It depends on the institution doing the cataloging; my library primarily uses MARC record services

[Figure 3. What strategies/methods are used for cataloging?]

Question 8: If records are cataloged, do you attach holdings in a utility (such as OCLC) to aid in resource sharing?

Twenty-one consortia submitted twenty-six responses. Only consortia that contributed cataloging for consortial collections responded to this question. Based on the responses, this question was not specific enough to determine whether the answers addressed cataloging at the consortium level or the individual library level, or even varied within the same consortium. Therefore, "yes" answers were counted only once. Overall, eighteen consortia (86%) attach holdings for resource sharing.

Table 3. Number of consortia who attach holdings in a utility (such as OCLC)

Response | Number of consortia
Yes | 9
Yes in some limited situations, such as for shared e-books, or if records are in OCLC or contain an OCLC number | 3
Yes or maybe yes at the individual library level, or not at the consortium level | 6
No | 3
Total | 21

Question 9: What about updating/maintaining catalog records?

Like question 8, only consortia that contributed cataloging for consortial collections responded to this question. Twenty-five consortia responded mostly "yes," but their answers varied to a certain degree. For example, some consortia updated/maintained records at the consortium level and distributed them to member libraries, while at other consortia individual libraries managed their own records.
Some consortia depended on the source of records (either the content provider or OCLC Collection Manager) for record maintenance, and others maintained only collections in certain categories. For example, a few consortia disclosed that they don't maintain records for titles in subscription packages. One consortium relied heavily on its discovery tool, annual KBART lists, and maintenance of its subscriptions in the discovery knowledge base to support record maintenance.

Question 10: How do you prioritize cataloging of resources?

Twenty-two consortia responded to this question and reported that they prioritize cataloging by consulting with their member libraries (if they are using a centralized cataloging model) or by allowing individual member libraries to determine their own priorities. Beyond that, they prioritized cataloging based on a variety of factors, including the number of participating members for a resource, perpetual or licensed content over all other categories, a first-in-first-out basis, print before electronic resources, and/or batch loads first.

Question 11: If cataloging is not provided (for some or all collections), what strategies and methods are used to make resources discoverable?

Twenty-three consortia (including three that do not provide cataloging) responded to this question. Some strategies and methods for discovery are categorized in Table 4. Note that some consortia utilize a combination of the strategies.

Table 4. Strategies and methods used if collections are not cataloged

Strategies and methods | Number of consortia
Rely on discovery tools | 10
Link from library website (either a list or a link to the vendor's platform) | 6
Use OCLC KB | 2
Use link resolver | 1
Provide MARC records to members to do as they wish | 1
Others (all cataloged, cataloged licensed, N/A, etc.) | 5

Question 12: What practices do you feel are most effective for providing access to your resources?

Twenty-four consortia responded to this question, touching upon many areas. See Table 5 below. The buzzwords "central" and "centralized" dominated, in concepts such as centralized metadata (without loading records into the local system), a central knowledge base, centralized services, centralized distribution, and a central catalog. Some consortia emphasized the importance of the currency of knowledge bases for providing access to resources in library catalogs and discovery tools. Several participants also pointed out the need to use multiple cataloging practices and services in combination to achieve the most effective results.

Table 5. Effective practices for cataloging for consortial collections

Most effective practices | Number of consortia
Central services, centralized distribution of records | 3
Central KB and MARC record sharing services | 8
Central catalog (inclusive) | 4
Discovery service (connected to local catalog), link resolvers | 9
Vendor records (improving quality) | 2
Batch editing | 2

Question 13: What are challenges/issues? What are opportunities?
Twenty-five respondents identified the following as current challenges/issues of consortial cataloging:
● Lack of cataloging staff/expertise at both the consortium and local levels; it's a challenge to keep up with the large quantity of e-resources with very limited staff
● Lack of time and funds: significant time/funds invested in resources but lack of time/funds allotted for cataloging
● Some large packages do not get the same level of attention as other packages
● Issues with consortial catalog architecture; records must be loaded into individual local library catalogs
● Overlapping content or duplication of titles in multiple collections/sources
● The OCLC WorldShare Collection Manager is complicated and has a steep learning curve
● Inconsistent quality and currency of records in knowledge bases or catalogs; the latest titles or contents are not always available
● Dependency on vendor records, which often lack consistency in the quality of their metadata and compliance with standards
● Dependency on relevance ranking of discovery services
● Off-campus, VPN, and EZproxy authentication issues affecting user access

Twenty-five respondents identified the following as opportunities and advantages of consortial cataloging:
● Sharing of staff expertise and workload
● Sharing records
● Working with vendors to improve the quality of metadata in records, and being able to insist on better/more complete records as a condition for purchase
● Leveraging discovery service indices effectively
● Utilizing OCLC WorldShare to tag e-book holdings for resource sharing
● Improving cooperative cataloging in utilities such as OCLC WorldCat
● Opportunities to make more resources discoverable, more quickly, with less cataloger effort in knowledge bases

Discussion

The survey results revealed five current trends in consortial cataloging.

Trend 1: As expected, the local library catalog is not dead yet! Twenty-two (88%) out of twenty-five consortia that systematically acquire licensed e-resources maintain that providing MARC records for the individual local catalogs in their consortium is important.

Trend 2: The MARC records that the consortia provide in their union catalogs heavily favor licensed materials, especially licensed e-monographs, followed closely by licensed e-serials. Surprisingly, only half of the consortia supply records in their catalogs for e-journals from aggregator databases. Despite the increasing availability of open access resources, cataloging for open access resources is handled on an "if requested" basis. This may change as more publishers expand their selection of open access journals and even switch previously licensed titles to open access status.

Trend 3: We found several approaches to consortial cataloging. While it is true that most of the consortia obtain their cataloging records from a MARC record service, there is a minority that uses the centralized cataloging and record distribution model. One successful centralized cataloging model is the Shared Cataloging Program (SCP) of the California Digital Library, which is documented in the article One for Ten (also in this issue). The SCP catalogers in this model often use a variety of methods to acquire records, from batch loading records for e-books created through spreadsheets to providing the highest level of CONSER records for e-journals.
Not reflected in the survey: one large consortium initially provided cooperative cataloging through the efforts of volunteer libraries within the consortium, but due to budgetary cuts and a lack of local staffing within the member libraries, combined with a large increase in e-book records, this consortium switched to a centralized cataloging model. Another consortium moved from using volunteer member libraries to loading vendor records. It's possible that more consortia would adopt the centralized cataloging model if there were a standard formula for success. Though the initial planning process for a centralized cataloging operation can be daunting, it is essential for successful implementation and longevity. The member libraries must have good communication and be able to trust the output of the catalog record provider. Another possibility is that, as resource budgets move into the electronic resources realm and local libraries' cataloging staffing predictably declines, more cataloging will move into the orbit of the centralized operation.

Trend 4: Most institutions utilize vendor records in some manner. The majority of consortia (76.2%) make use of vendor records, and 57.1% make use of MARC record services. The one glitch in the survey was that the term "MARC record services" wasn't well defined. Consequently, respondents lumped vendor records issued to accompany specific packages and batches of records provided by all-encompassing record services such as OCLC or ProQuest into the same category. Respondents also used the term "MARC record services" interchangeably to refer to providers that supply all the MARC records needed by an institution (complete or stub records), and even to vendor records offered by jobbers such as Gobi Library Solutions. In any of these cases, the major advantage of using vendor or MARC records is the speed with which they are brought into the catalog. The problem, however, is that speed and stub (or skimpy) records often go hand in hand. Fortunately for libraries that attach their holdings to all their records (including vendor records) in utilities such as OCLC for resource sharing, complementary record maintenance services may be available for facilitating automatic record updates based on a library's individual profile.

Trend 5: Although three-quarters of consortia rely on vendor records, many of them agree that the most effective method of providing discovery and access for consortial collections is through some kind of "central" or "centralized" service, such as centralized metadata, a central knowledge base and MARC record sharing services, centralized distribution of records, and discovery services connected to a local catalog.

Conclusion

Based on the literature review and survey of consortial cataloging, it appears that more libraries and consortia are concentrating their efforts on consolidating their cataloging at a local level into services such as the OCLC Knowledge Base and the Alma Community, Network, and Institution zones. Meanwhile, some of the challenges of using such a service or centralized knowledge base for consortia need to be addressed. What is needed henceforth are more in-depth studies that shine a light on new and innovative ways of providing or maintaining batch processes for e-book records, both MARC and, increasingly, non-MARC. The latter will be important as libraries move to Linked Data for bibliographic description.
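To make the idea of such a batch process concrete, here is a minimal sketch of a scripted quality scan over a file of vendor e-book records, written with the open-source pymarc library. The file name and the specific checks are illustrative assumptions, not a profile drawn from the survey; in practice each consortium's own standards would define what counts as a stub record.

    from pymarc import MARCReader

    # Flag "stub" vendor records that lack an ISBN (020), any subject
    # heading (6XX), or an access URL (856 $u), so they can be routed
    # to batch correction before loading.
    flagged = 0
    with open("vendor_ebooks.mrc", "rb") as fh:    # illustrative file name
        for record in MARCReader(fh):
            if record is None:                     # unreadable record in the batch
                continue
            problems = []
            if not record.get_fields("020"):
                problems.append("no ISBN")
            if not any(f.tag.startswith("6") for f in record.get_fields()):
                problems.append("no subject headings")
            if not any(f.get_subfields("u") for f in record.get_fields("856")):
                problems.append("no access URL")
            if problems:
                flagged += 1
                title = record["245"]
                label = (title["a"] if title else None) or "(no title)"
                print(label, "->", "; ".join(problems))
    print(flagged, "records flagged for review or batch correction")

The same loop could rewrite records instead of merely reporting on them, which is essentially what the respondents who described "scripted cleanup" of OCLC record sets are doing.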
We also need more in-depth assessment studies to determine whether the functionality of knowledge bases and discovery services is meeting the needs of consortia to effectively make their collections available for discovery and access at both the institutional and consortial levels. Lastly, best practices are needed both for consortial cataloging and for managing consortial electronic resources in centralized knowledge bases. The library community would greatly benefit from a set of best practices that can be used both by the providers and the recipients of vendor records. The Program for Cooperative Cataloging vendor guide, published in 2011, was an attempt to provide best practices in this area, but it is out of date. Fortunately, NISO has convened a working group to prepare a document on best practices for e-book cataloging. It will be released for worldwide review in 2018.

Notes

1. Karen Cary and Joyce L. Ogburn, "Developing a consortial approach to cataloging and intellectual access," Library Collections, Acquisitions, & Technical Services 24, no. 1 (Spring 2000): 45-51. http://dx.doi.org/10.1016/S1464-9055(99)00095-0
2. Jackie Shieh, Ed Summers, and Elaine Day, "A Consortial Approach to Cooperative Cataloging and Authority Control: The Virtual Library of Virginia (VIVA) Experience," Resource Sharing & Information Networks 16, no. 1 (2002): 33-52. http://dx.doi.org/10.1300/J121v16n01_04
3. Thomas Boyle and Patrice Hughes, "Cataloging in a Consortium Environment," Nebraska Library Association Quarterly 33, no. 4 (Winter 2002): 39-42.
4. Sabrina Riley, "What's happened to the Library's Catalog?" Union College News, accessed June 9, 2017. https://www.ucollege.edu/news/2012/11/13/whats-happened-library-catalog
5. Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12. http://dx.doi.org/10.1016/S0098-7913(01)00169-1
6. Frances F. Stumpf, "Centralized cataloging and processing for public library consortia," Bottom Line: Managing Library Finances 16, no. 3 (August 2003): 93-99. http://dx.doi.org/10.1108/08880450310488003
7. Paul Moeller, Wendy Baia, and Jennifer O'Connell, "Cataloging for Consortium Catalogs," Serials Librarian 44, no. 3-4 (June 2003): 229-235. http://dx.doi.org/10.1300/J123v44n03_12
8. Xiaotian Chen, Larry Colgan, Courtney Greene, Elizabeth Lowe, and Conrad Winke, "E-Resource Cataloging Practices: A Survey of Academic Libraries and Consortia," Serials Librarian 47, no. 1-2 (July 2004): 153-179. http://dx.doi.org/10.1300/J123v47n01_11
9. Chew Chiat Naun and Susan M. Braxton, "Developing recommendations for consortial cataloging of electronic resources: lessons learned," Library Collections, Acquisitions, & Technical Services 29, no. 3 (September 2005): 307-325. http://dx.doi.org/10.1016/j.lcats.2005.08.005
10. Kristin E. Martin and Kavita Mundle, "Cataloging E-Books and Vendor Records: A Case Study at the University of Illinois at Chicago," Library Resources & Technical Services 54, no. 4 (October 2010): 227-237.
11. Carrie A. Preston, "Cooperative e-Book Cataloging in the OhioLINK Library Consortium," Cataloging & Classification Quarterly 49, no. 4 (2011): 257-276. http://dx.doi.org/10.1080/01639374.2011.571147
12. Kristin E. Martin, Judith Dzierba, Lynnette Fields, and Sandra K. Roe, "Consortial Cataloging Guidelines for Electronic Resources: I-Share Survey and Recommendations," Cataloging & Classification Quarterly 49, no. 5 (2011): 361-386. http://dx.doi.org/10.1080/01639374.2011.588996
13. Wen-ying Lu and Mary Beth Chambers, "PDA Consortium Style," Library Resources & Technical Services 57, no. 3 (July 2013): 164-178.

Appendix. Survey on Cataloging for Consortial Collections

1. The name of the consortium to which you belong ______________________
2. Does your consortium have a systematic way to acquire licensed and open access electronic resources?
o Yes (If yes, please complete the next question)
o No (If no, please STOP here and submit the survey. Thank you!)
3. Do your consortium collections get cataloged?
o Yes (If yes, please complete questions 4-13; write NA for question 11 if not applicable)
o No (If no, please SKIP questions 4-10 and complete questions 11-13. Thank you!)
4. What categories of resources are cataloged? (Check all that apply)
o Licensed monographs
o Licensed serials
o Aggregator packages
o Databases (single)
o Databases (with analytics)
o Open access monographs
o Open access serials
o Other: _____________________
5. Do you provide an avenue for bibliographers to request cataloging of open access resources, either single serials, or open access packages?
6. Who catalogs for the collections selected by the consortium?
o Each member library supplies their own cataloging
o A designated cataloging unit to share cataloging
o Sharing (divvying up among member libraries) of cataloging
o Make use of "MARC record" services (such as Knowledge Base, Alma Network/Community/Institutional Zones)
o Other: ______________________
7. What strategies/methods are used for cataloging? (Check all that apply)
o Batch creation of MARC records
o Record distribution to member libraries
o Make use of vendor records
o Make use of "MARC record" services (such as Knowledge Base, Alma Network/Community/Institutional Zones)
o Other: ______________________
8. If records are cataloged, do you attach holdings in a utility (such as OCLC) to aid in resource sharing?
9. What about updating/maintaining catalog records?
10. How do you prioritize cataloging of resources?
11. If cataloging is not provided (for some or all collections), what strategies and methods are used to make resources discoverable?
12. What practices do you feel are most effective for providing access to your resources?
13. What are challenges/issues? What are opportunities?
work_2qua5s2mfbbe5egathe3xce7de ----

[Repository cover page; the recoverable citation for the deposited item:] Banwell, Linda and Gannon-Leary, Pat (2000). "JUBILEE: monitoring user information behaviour in the electronic age." OCLC Systems & Services 16, no. 4: 189-193. ISSN 1065-075X. Published by Emerald. DOI: http://dx.doi.org/10.1108/10650750010354148. Deposit: Northumbria Research Link, http://nrl.northumbria.ac.uk/2037/
work_32ttl6ta7zh5dcoqjrbkovnzgm ----

Discovery with Linked Open Data: Leveraging Wikidata for Context and Exploration
Lucas Mak
Devin Higgins (me) @devinhhi
Michigan State University Libraries
Digital Repository: https://d.lib.msu.edu

Main Idea
Use linked data to provide meaningful information about subject headings, and provide avenues for discovery through hierarchical subject connections and enriched context.

FAST subjects provide the links
● OCLC publishes FAST as linked data in 2011.
○ Available via bulk download or via API
○ Contains links to:
■ LCSH, LCNAF, VIAF, GeoNames
■ And more...
● Experimental reconciliation against Wikipedia in 2016
https://www.oclc.org/research/themes/data-science/fast/download.html
https://www.oclc.org/developer/develop/web-services/fast-api/linked-data.en.html

Subject Knowledge Cards
Connections: Broader, Narrower and Related Terms
Context: Data points from Wikidata and DBpedia

Use AJAX to gather data about subjects via API
Broader Terms: Subject →
Related Terms: Subject →
Narrower Terms: → Subject
Cross reference in repository: Only display terms that appear in our index.

Data points captured via SPARQL query of Wikidata
General: Image, Abstract, Wikipedia link
Geographic names: Coordinates, Country, Capital, Official language
Corporate bodies: Founder, Start date, End date, Location, Headquarters location, Website
Person: Birth date, Death date, Gender, Occupation

Limitations
● Not every concept/entity has a Wikidata entry
○ Esp. subdivided headings, compound headings, headings qualified by nationality, ethnic group, or language
● Differences in data modeling
○ Wikidata has separate entries for Asparagus (genus), Asparagus officinalis (species), and Asparagus (vegetable). FAST / LCSH collapses these into one.
● Name changes
○ FAST / LCSH have separate entries for Michigan State University / Michigan State College. Wikidata treats these as aliases of the same entity.

Limitations
● Wikipedia: its vastness, inconsistency, lacunae
○ Tertiary source, "needs its own lens of both caution and possibility" -- Anasuya Sengupta, DLF 2018
● Library of Congress: slowness to change, incompleteness, unfitness for many knowledges.
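The knowledge cards' context data points come from SPARQL queries against Wikidata, as the slides above note. As an illustration only, here is a minimal sketch in Python against the public Wikidata endpoint; the property choices (P18 image, P569 birth date, P570 death date, P106 occupation) follow Wikidata's published vocabulary, but the entity ID and query shape are assumptions for the example, not the MSU implementation.

    import requests

    # Fetch "person" data points for one Wikidata entity. Q7186 (Marie
    # Curie) is just an example ID; a knowledge card would instead use
    # the entity reconciled from the FAST heading.
    QUERY = """
    SELECT ?image ?birth ?death ?occupationLabel WHERE {
      OPTIONAL { wd:Q7186 wdt:P18  ?image . }        # image
      OPTIONAL { wd:Q7186 wdt:P569 ?birth . }        # date of birth
      OPTIONAL { wd:Q7186 wdt:P570 ?death . }        # date of death
      OPTIONAL { wd:Q7186 wdt:P106 ?occupation . }   # occupation
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "knowledge-card-demo/0.1"},  # endpoint etiquette
    )
    for row in response.json()["results"]["bindings"]:
        for name, cell in row.items():
            print(name, "=", cell["value"])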
work_3h5okhuxgvbalnp46lhaneoq24 ----

The Profession

Literature Loss in International Relations
Charles A. Schwartz, Rice University

Literature loss has two levels of meaning. In the research process, it refers to the difference between what is "out there" in some published form and what is likely to be found in a literature search. On an institutional level, it refers to the declining ability of research libraries to maintain comprehensive collections in the wake of the extraordinary growth and price inflation of scholarly literature since the 1970s. Recent technological innovations in the library field provide the means not only to measure literature loss on an institutional level, but also to alleviate it in one's own research work. An application of this technology to the area of international relations is presented here. Part one of the paper is a brief history of the commentary on literature loss in the social sciences. Part two provides an analysis of the growing gap in the international relations area between book publication output and the aggregate holdings of 70 institutions in the Association of Research Libraries (ARL).1 Part three describes the relevance of the new technology in greater detail, with an illustration for peace and conflict studies. Finally, part four draws some conclusions on the problem of literature loss at both the research and institutional levels, as well as on the closely related question of information retrieval.

Commentary on Literature Loss

The problem of literature loss first came to the fore 20 years ago when Robert Lane observed, in his presidential address to the American Political Science Association, that publication of political science articles in general journals was hindering development of a cumulative body of research in the political science field itself (1972, 181). During the 1970s, a few scholars prodded the association to sponsor the organization of a more comprehensive bibliographic database.
However, as Carl Beck reported toward the end of the decade, such efforts had to be abandoned for lack of colleague support:

The need for information retrieval systems in the social sciences is both real and apparent but, given the ability of many researchers to gather idiosyncratic research support, the need is not perceived as acute. For most of us, our information search behaviors were shaped by payoffs in the random walk that we undertook as students in college and graduate schools. Although little has been done to analyze the nature of this walk in the social sciences, it seems a tenable hypothesis that our information behavior was shaped by "authorities": the authority of either a journal or a publisher who had attained high status in the profession. The authority-conscious nature of our information usage is reinforced by the information explosion; to undertake that walk again with all the many information sources available is a traumatic experience. . . . The information milieu in the social sciences gets further compounded by the changing nature of the social sciences. Under these conditions we have a serious problem of information loss (1977, 428-29).

A similar effort to organize a comprehensive database for peace research under the auspices of the United Nations also failed, although the UNESCO Yearbook of Peace and Conflict Studies was launched as a compromise measure in 1980 (Hoivik 1980; Beck 1980; Chatfield 1980).

Discussion of the problem of information loss lapsed in the 1980s.2 However, a large-scale survey in 1986 of scholars' views on libraries and databases tends to corroborate Beck's concern about the relatively narrow, authority-conscious nature of literature-searching behavior. This survey found that, among 3,835 social scientists, 77% rely primarily on personal libraries for doing research; only 18% attach much importance to computerized card catalogs; and just 6% use online databases of periodical literature with any regularity (Morton and Price 1989). Evidently, social scientists tend to confront the expanding information environment by limiting their searches to relatively small, familiar parts of it.

Measurement of Literature Loss Among Research Libraries

Underlying any analysis of literature loss is the problem of assessing the size of the universal stock of academic books in a given field. That problem has no definitive solution, nor any prospect of one. Moreover, even if accurate book publication figures could be compiled, it would hardly make sense to consider them as realistic or worthy goals for research library collection development. The ancient ideal of the Library of Alexandria—to hold everything that was ever published—has been considered impossible and undesirable since the early part of the twentieth century.

For the purpose of the analysis presented here—to assess the financial performance of the nation's research libraries within reasonable bounds—book publication output is derived from the Online Computer Library Center, Inc. (OCLC) Union Catalog (OLUC) bibliographic database.
The decision to use this database recognized the utter infeasibility of attempting to locate the publication lists of hundreds of university and trade publishers, scholarly societies, and the like (in peace studies alone, UNESCO lists 313 research bodies); and involved the simplifying assumption that just about any book that appeared over the past decade in the Americas or in Europe would have shown up in the OCLC-OLUC system, which contains the shared cataloging of over 4,800 libraries, including the Library of Congress, in 26 countries.

[Figure 1. Book Output on International Relations, 1978-1987: line A, total book output; line B, aggregate ARL holdings; line C, subset of rare ARL holdings. The regions between the lines mark literature loss, the mainstream literature, and the gray literature. Source: OCLC-OLUC and OCLC/AMIGOS CACD databases.]

Figure 1 compares the aggregate holdings of 70 ARL institutions with total book output on international relations for the period 1978-87.3 Line A shows that output fluctuated from a low of 601 titles in 1978 to a high of 904 titles in 1984, with 867 titles in 1987. On average, 808 titles were published annually, with a standard deviation of 89.

Line B—covering that part of book publication output that is held in at least one of 70 ARL institutions—is derived from a different database, OCLC/AMIGOS Collection Analysis CD (CACD). It shows aggregate ARL holdings by year of publication (not by year of acquisition). Such holdings in international relations fluctuated from a low of 525 for titles published in 1987 to a high of 662 for titles published in 1983. On average, 570 titles were acquired per year of publication, with a standard deviation of 38. Overall, ARL holdings dropped from 75% of book output in the late 1970s to 65% in the late 1980s.

Line C demarcates the mainstream literature from the gray literature. The former includes books held by at least 10% of ARL institutions and comprises 55% of ARL aggregate holdings. The gray literature is held in less than seven institutions and mainly consists of small press publications, foreign-language materials, conference proceedings, and legal texts.

Table 1 clarifies the breakdown between the mainstream and gray literatures on international relations by showing their relative distribution within ranges of ARL holdings. The distribution of mainstream literature is covered in the top nine ranges, starting with the 115 titles held in 90%-100% of ARL institutions and descending to the 10%-19% range with holdings of 838 titles. Strikingly different is the distribution of the gray literature; as many as 1,817 titles are held in only a few of the ARL institutions, with another 1,216 titles being in a "unique" category. Thus, the main point brought out by the table is that the holdings of books on international relations among ARL institutions are highly skewed, or disparate.

Among the largest ARL institutions, the average collection holds 60% of the mainstream literature and 18% of total output. Among smaller academic libraries (with about 1 million volumes), the average collection holds 23% of the mainstream literature and 12% of total output.
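To make the arithmetic behind these loss figures explicit, the short sketch below recomputes them from the numbers reported above (annual output averaging 808 titles over the ten years, and the Table 1 title counts reproduced in the next section). It is a back-of-the-envelope check added for illustration, not part of the original study, and it rounds about as loosely as the article does.

    # Reconstruct the literature-loss estimates from the reported figures.
    total_output = 808 * 10            # ~808 titles/year, 1978-87

    # Table 1 counts: titles held by at least one of 70 ARL institutions.
    mainstream = 115 + 191 + 184 + 154 + 143 + 182 + 225 + 364 + 838  # held by >=10% of ARLs
    gray = 1817 + 1216                 # held by <10% of ARLs, incl. unique copies
    held_anywhere = mainstream + gray  # 5,429 titles in the CACD file

    not_held = total_output - held_anywhere
    print(f"not held in any ARL library: {not_held / total_output:.0%}")   # 33% ("thirty percent")
    print(f"gray-literature share:       {gray / total_output:.0%}")       # 38% ("another 40%")
    print(f"mainstream share:            {mainstream / total_output:.0%}") # 30% ("remaining 30%")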
Research Applications of the New Technology

The prospect of finding a relatively small and diminishing part of the book literature in a major research library reflects a substantial literature loss and presumably arouses interest in the OCLC-OLUC database and its scope for information retrieval. OCLC was established in 1967 to facilitate shared cataloging operations. Books could be searched by author and/or title, but not by subject. The OLUC subsystem (OCLC's EPIC search service), incorporating subject and keyword retrieval capabilities, began in 1990. By then, the OCLC database contained over 20 million records, with 2 million being added each year. Searches can be directed to records in one or more of 365 languages, and to particular formats—books, serials, musical scores, sound recordings, machine-readable files, maps, and media (films, videos). Retrieval can be formatted to show which libraries own an item. Online charges are $24.00 per hour with an additional telecommunications charge of $8.40 per hour, but most searches can be done by a trained librarian in a few minutes.

Table 1. Distribution of Holdings: Association of Research Libraries Books* Published 1978-87 on International Relations

Percentage of 70 ARL Institutions | Number of Titles per Range
90-100 | 115
80-89 | 191
70-79 | 184
60-69 | 154
50-59 | 143
40-49 | 182
30-39 | 225
20-29 | 364
10-19 | 838
3-9 | 1,817
Unique | 1,216
Total (1-100) | 5,429

*Excluding dissertations and U.S. government reports.
Some scholars have urged a return in peace research to its traditional focus on international security issues (Boulding 1989; Quester 1989; Soroos 1990). However, no such trend is evident; the amount of peace research that is cross-referenced in OCLC-OLUC with such "negative peace" descriptors as arms control or nuclear weapons has not changed much over the past decade, hovering around 10%. OCLC-OLUC is also useful for checking perceived gaps in the litera- ture. For example, Kenneth Boulding proposed a number of topics for peace historians (1989). Below are a few of these topics, along with the number of probably relevant records retrieved in a quick search for pur- poses of this paper: • Peace or war in Paleolithic, Neo- lithic, or hunting societies—153; • Relation of demography to war or peace—21; • Relation of agriculture to war or peace—1,040; • Relation of religion in the Middle Ages to peace—8. Conclusions At the institutional level, the mag- ' nitude of literature loss in the field of international relations is estimated as follows: • Thirty percent of total book out- put for 1978-87 is not held in any of 70 ARL institutions. • Another 40% of book output, being held in just one or a few ARL institutions, is also virtually lost except through EPIC. • Of the remaining 30% that com- prises the mainstream literature, the average ARL institution holds about a third of it, or roughly 13% of the books published on international relations during the period under review. • In the largest ARL institutions, a researcher will find about 60% of the mainstream literature and up to 18% of total book output for the period. As Figure 1 shows, literature loss grows rather steadily. Underlying the deteriorating position of the nation's research libraries are certain infla- tionary tendencies built into the scholarly communications system. During the 1978-87 period, the average price of a hardbound book on political science increased 180% (Bentley 1991, 404); and the cumula- tive inflation rate for institutional subscriptions to political and social science journals increased 270%, ris- ing to 356% by 1991 (Carpenter and Alexander 1991, 59). In any particu- lar field, new journals (13 in political science since 1978) also tend to "crowd out" book acquisitions. However, with the advent of OCLC-OLUC, researchers are no longer tied to their institutional card catalog systems. The importance of comprehensive retrieval, of course, 722 PS: Political Science & Politics Literature Loss in International Relations depends on the nature of the research problem at hand. OCLC- OLUC is especially useful for his- torical, interdisciplinary, or foreign- language searches. For example, 29% of the records in OCLC-OLUC on peace studies are non-English in lan- guage, as opposed to only 10% of the mainstream social-science period- ical literature (Gareau 1984). OCLC- OLUC should also help to alleviate what is called the "problem of unnecessary originality": [In some quarters, it is held,] all that is worth knowing is contained in the very latest journals or books. . . . Unfortunately, this view has led to an antihistorical bias and caused a great deal of unnecessary originality in our discipline. . . . Reincarnations of con- temporary ideas and supposedly new ideas might be greatly improved if ideational antecedents or previous investigations were sought out and incorporated into the " n e w " produc- tion of knowledge (Ault and Ekelund 1987, 652). 
More directly, an OCLC-OLUC search may have value in demon- strating to editors that a research proposal does indeed fill a gap in the literature. As a final note, while the OCLC- OLUC system for comprehensive book retrieval now extends online to some 4,500 institutions, the CACD system for collection analysis has only been acquired by 44 institutions. Combining the two systems to measure literature loss in a given field has not been done before, so there are no- other studies against which to compare the results of the analysis presented here. Notes The author would like to express his thanks to Helen Hughes of the AMIGOS Biblio- graphic Council for technical advice; and to Stephen P. Harter, David Kaser, and Herbert S. White for their helpful comments on an earlier draft of this article. 1. ARL currently comprises 117 institu- tions. For the period 1978-87 under review, 70 institutions are included in the OCLC/ AMIGOS Collection Analysis CD. The par- ticular holdings of two research libraries not included—Stanford University and the New York Public Library—may be of interest to peace historians, since those institutions col- lect comprehensively in the area of inter- national relations. Such holdings, however, would not materially affect the results of this study, which is based on the contemporary literature. 2. A few tangential studies were done in the library field. Two citation analyses of political science journals—one for the period 1910-60 (Robinson 1973), the other for 1968-70 (Stewart 1970)—showed the same dis- tribution of sources used in research: 30% from within the discipline, 40% from other social sciences, and 30% from outside the social sciences. Drawing on those studies, Elliot Palais (1976) assessed the coverage of various periodical indexes for 179 journals commonly cited in political science. He found the range of coverage to vary between 40% and 70% of the journals. Thus, reliance on any one index to search a broad topic means information loss in the range of 30%-60%. 3. The literature on international relations comprises the " J X " range of the Library of Congress classification scheme and falls under three primary subject headings: inter- national relations, world politics, and national security. In this hierarchical scheme, titles falling under the scores of more specific subject headings (e.g., peace, nuclear arms) are also retrieved and counted. However, the numbers are not precise owing to inevitable inconsistencies, duplications, and errors in any shared-cataloging system as OCLC. (For example, the "first" book on nuclear warfare in the database is the 1932 edition of Aldous Huxley's Brave New World, which actually dealt with chemical warfare; this record was created, apparently on the basis of a cata- loger's faulty recollection of the novel's theme, sometime in the 1970s.) Also, while researchers in the area of international rela- tions are obviously interested in far more of the literature than is classified in the " J X " range, inclusion of non-"JX" titles in any general analysis would be an arbitrary and endless process. For the analysis presented here, it was necessary to structure and bound the literature on international relations. The set of 5,500 titles, together with the set of 70 ARL research collections, are adequate to draw reasonably accurate patterns of litera- ture loss over the past decade. References Ault, Richard W. and Robert B. Ekelund. 1987. "The Problem of Unnecessary Originality in Economics." 
Southern Eco- nomic Journal 53(1): 650-61. Beck, Carl. 1977. "Information Systems and Social Sciences." American Behavioral Scientist 20 (January): 427-48. Beck, Carl. 1981. "Peace Research and Information Systems." In UNESCO Year- book on Peace and Conflict Studies 1980. Westport, CT: Greenwood Press, pp. 14-24. Bentley, Stella. 1991. "Prices of U.S. and Foreign Published Materials." In The Bowker Annual Library and Trade Book Almanac, ed. Filomena Simora. New Providence, NJ: R. R. Bowker, pp. 399-442. Boulding, Elise. 1989. "Introduction." In Peace and World Order Studies: A Cur- riculum Guide, ed. Daniel C. Palmer and Michael T. Kare. Boulder, CO: Westview Press, pp. 3-4. Boulding, Kenneth E. 1989. "A Proposal for a Research Program in the History of Peace." Peace & Change 14 (October): 461-69. Carpenter, Kathryn and Adrian W. Alexan- der. 1991. "Price Index for U.S. Period- icals." Library Journal 116 (April 15): 52-59. Chatfield, Charles. 1981. "The Dissemination of Peace Research Through Periodicals: A Tentative Review and Recommenda- tions." In UNESCO Yearbook on Peace and Conflict Studies 1980. Westport, CT: Greenwood Press, pp. 25-56. Gareau, Frederick H. 1984. "An Empirical Analysis of the International Structure of American Social Science." The Social Sci- ence Journal 21 (July): 23-36. Hoivik, Tord. 1981. "New Developments in Information and Documentation on Peace and Conflict Studies." In UNESCO Year- book on Peace and Conflict Studies 1980. Westport, CT: Greenwood Press, pp. 3-13. Lane, Robert E. 1972. "To Nurture a Disci- pline." American Political Science Review 66 (March): 164-82. Morton, Herbert C. and Anne J. Price. 1989. The ACLS Survey of Scholars: Views on Publications, Computers, and Libraries. Washington, DC: American Council of Learned Societies. Palais, Elliot S. 1976. "The Significance of Subject Dispersion for the Indexing of Political Science Journals." The Journal of Academic Librarianship 2(2): 72-76. Quester, George H. 1989. "International Security Criticisms of Peace Research." In Peace Studies: Past and Future, ed. George A. Lopez. The Annals of the American Academy of Political and Social Science 504 (July): 98-105. Robinson, William C. 1973. "Subject Disper- sion in Political Science: An Analysis of References Appearing in the Journal Lit- erature, 1910-1960." Ph.D. diss., Univer- sity of Illinois. Soroos, Marvin S. 1990. "Global Policy Studies and Peace Research." Journal of Peace Research 27(2): 117-25. Stewart, June L. 1970. "The Literature of Politics: A Citation Analysis.'1 Interna- tional Library Review 2(4): 329-53. Wiberg, Hakan. 1988. "The Peace Research Movement." In Peace and Research: Achievements and Challenges, ed. Peter Wallensteen. Boulder, CO: Westview Press, pp. 30-53. About the Author Charles A. Schwartz is the social sciences bibliographer at the Fondren Library, Rice University, Houston, TX 77251, and he has a Ph.D. in foreign affairs from the University of Virginia. December 1992 723 work_3lldtroklzbw5notnkfjr56kyi ---- JLIS.it 11, 1 (January 2020) ISSN: 2038-1026 online Open access article licensed under CC-BY DOI: 10.4403/jlis.it-12554 __________ © 2020, The Author(s). This is an open access article, free of all copyright, that anyone can freely read, download, copy, distribute, print, search, or link to the full texts or use them for any other lawful purpose. 
ISNI and traditional authority work
Amy Armitage, Mary Jane Cuneo, Isabel Quintana, Karen Carlson Young
Harvard University

Contact: Amy Armitage, amy_armitage@harvard.edu; Mary Jane Cuneo, cuneo@fas.harvard.edu; Isabel Quintana, quintana@fas.harvard.edu; Karen Carlson Young, karen_young@gse.harvard.edu
Received: 1 March 2019; Accepted: 21 July 2019; First Published: 15 January 2020

ABSTRACT
This article describes key differences between ISNI (International Standard Name Identifier) and the authority work traditionally performed at libraries. Authority work is concerned with establishing a unique form of name for a person and collocating materials under that form of name. ISNI, on the other hand, is concerned with establishing a unique numerical identifier for each entity, and differentiating distinct entities. The focus of the work becomes identity management rather than the establishment of authorized name forms. This article looks not only at the differences in workflows, but also explains how these theoretical differences can affect the way librarians identify and collocate named entities. The focus is on the future, and how we can best use our skills to ensure that entities are properly differentiated and accessible to our patrons.

KEYWORDS
Identity management; ISNI; Authority work.

CITATION
Armitage, A., Cuneo, M.J., Quintana, I., and Carlson Young, K. "ISNI and traditional authority work." JLIS.it 11, 1 (January 2020): 151-163. DOI: 10.4403/jlis.it-12554.

This article presents the key differences between traditional library authority work and identity management. In this article, traditional library authority work is exemplified by the Name Authority Cooperative Program (NACO) of the Program for Cooperative Cataloging (PCC). This program is based primarily in the United States, Canada, and the United Kingdom. The program cooperatively adheres to the same cataloguing standards in order to freely share records for both bibliographic and authority entities ("About NACO" 2018). The International Standard Name Identifier (ISNI) is presented as an example of identity management work. ISNI is an international file based primarily in Europe. The comparison reviews differences in provenance, veracity of metadata, corporate body name changes, and duplication. Our focus is on how traditional library authority work and identity management approach the disambiguation of entities differently. These differences create different types of databases, which perhaps would be most successfully used in different workflows. This article does not address how these identifiers, once constructed, are used by the international community. The article ends with thoughts on how librarians might incorporate identity management strategies as part of their library work.

Since its creation in 2012, the ISNI database has been the subject of various articles published throughout the world.
The literature has focused on the nature of ISNI – why and how it was established (Angjeli, 2012, 2014; Gatenby and MacEwan, 2011) – and on its use in various settings (Bălan, 2017; "ISNI Assignments Top 6.5 Million," 2013). However, the process of working in ISNI has not been compared with the process of working with traditional library authority data. This article attempts to fill this gap by comparing both the theoretical and practical aspects of working in these different environments.

What is ISNI?

Let's face it, finding the right identity on the web can be very difficult at times. These slides, from the ISNI.org website, illustrate the problem. They depict three different people who all have the same name: Michele Smith. Two of them work in the field of music. It might be easy to conflate these identities. The third Michele Smith is an author.

Screen shot for Michele Smith, the singer (ISNI, 2018)
Screen shot for Michele Smith, the musician (ISNI, 2018)
Screen shot for Michele Smith, the author (ISNI, 2018)

The same problem occurs in our library catalogues. Establishing the correct entity can be a time-consuming and detailed process. Once the correct identity is found, we want to be able to find other materials related to this person. Traditionally, libraries have used a unique form of the person's name to collocate materials by that person. ISNI takes a different approach, assigning instead a unique number to identify a person. This number can be used in linked data as a bridge identifier to connect this person to other resources by and about the same person, without the need to regularize the form of the name (ISNI website 2018, homepage).

ISNI, or "International Standard Name Identifier," is an ISO standard, and, as such, works much like the ISBN does to identify books. It is a persistent, unique identifier. ISNIs are primarily assigned for persons, but corporate bodies are included as well. ISNI is supported by the OCLC offices in Leiden, the Netherlands, and has two Quality Teams working at the British Library and the Bibliothèque nationale de France ("International Standard Name Identifier" Wikipedia 2018).
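Because an ISNI is a fixed-length number rather than a name string, its integrity can be checked mechanically, much as an ISBN's can. The minimal sketch below assumes the ISO 7064 MOD 11-2 check-character scheme specified for ISNI (the same computation ORCID uses); the function names are ours, for illustration only:

    # Sketch of ISNI check-character validation, assuming the ISO 7064
    # MOD 11-2 scheme; function names are illustrative, not from any API.
    def mod11_2_check(base15: str) -> str:
        total = 0
        for ch in base15:                  # the first 15 digits
            total = (total + int(ch)) * 2
        result = (12 - total % 11) % 11
        return "X" if result == 10 else str(result)

    def is_valid_isni(isni: str) -> bool:
        compact = isni.replace(" ", "").replace("-", "")
        if len(compact) != 16 or not compact[:15].isdigit():
            return False
        return mod11_2_check(compact[:15]) == compact[15].upper()

A check like this can reject a mistyped identifier, but it says nothing about which identity the number denotes; that remains the work of the database behind it.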
What are the differences between ISNI and NACO?

The fundamental difference between ISNI and NACO is that NACO provides authority control, where a unique form of name is created to represent each entity; ISNI, instead, serves identity management, where a unique identifier is assigned to each entity. Both programs attempt to distinguish entities, collecting information about one entity versus another one with a similar name. However, in authority files, the focus is on the name string. NACO training primarily teaches the cataloguing rules for how to formulate a unique name string. In contrast, ISNI focuses on the identity itself. ISNI practitioners put most of their effort into determining whether one identity is the same as another, or different. In other words, most of an ISNI practitioner's time is devoted to research. This key difference manifests itself even in how the databases are searched to identify the correct entity.

For example, here are the results of a search for John the Baptist in ISNI:

[Screen shot: ISNI search results for "John the Baptist" (ISNI, 2018)]

To a NACO practitioner, the results can be confusing. The names appear in various languages, and they include artists and musicians. Is the correct entity no. 3, 5, or 9? Clicking on no. 9 brings up this display:

ISNI record for the identity "Yohane Osuboni" (der Taufer v1-29) in the web client list (ISNI, 2018)

Expanding the "name variant" box produces a long list in many languages, including "John the Baptist (baptizer of Jesus)." So, the name does appear as expected – if one speaks German. This suggests several things: (1) there is no preferred form of name; (2) the ISNI community is broader than any single library authority file, or all librarians who speak English; and (3) ISNI emphasizes machine actionability. Human eye-readable elements are present, but they are not primary, as an ISNI is designed for automation and for users and contributors worldwide.

The same search in the NACO file yields this result:

NACO browse search for "John the Baptist" in the NACO file in OCLC Connexion (OCLC, 2018)

When the NACO record for "John the Baptist" is expanded, this is the result:

Section of the NACO authority record for "John the Baptist" in the NACO file in OCLC Connexion (OCLC, 2018)

Note the hybrid forms in many of the variant fields (400s). For example, the yellow highlighted field has a German term, "der Taufer," followed by "Saint" in English. The NACO rules about qualifiers produce this effect. NACO is concerned with the form of the name in the string, regardless of whether the string is the authorized form (MARC 100) or a variant form (MARC 400) ("NACO Training" 2018, Module 1, slide 22).

As suggested above, the NACO and ISNI communities differ in scope. NACO member institutions are libraries, primarily in the US and Great Britain ("Frequently asked Questions about Joining the NACO Program" 2019, 1). Because the names they contribute come from library resources, they represent primarily authors and their subjects. ISNI member institutions include libraries, too, but also other kinds of organizations in the information supply chain, from all over the world ("How ISNI works" 2019). They may contribute names for the creators of non-literary works, or works that are not published – creating records that cite, for example, buildings, garden plans, or photograph albums. Furthermore, some ISNI records have no citations at all for related works, as we will see later on. Non-library member organizations bring interests and goals to ISNI that may differ from ours, but they are no less legitimate. ISNI opens the door to a more diverse catalogue of identities, a wider member cohort with points of view to consider, and new audiences with whom to share what we have to offer.

There are also key differences in how metadata is recorded between ISNI and NACO. This is partly due to input workflows. The LC/NACO authority file has been built one record at a time, with care and attention to the PCC cataloguing rules. Cataloguers must check carefully before adding to the file to ensure the record is not a duplicate ("NACO Training" 2018, Module 1, slides 74-76). In contrast, most ISNI records have been batch loaded. ISNI ingests the data first and then improves it. A sophisticated algorithm assigns ISNIs, or not, based on programmed confidence levels, and highlights possible duplicates. A provisional record is one that has been added as a proposal and has not yet been assigned an ISNI identifier, pending further investigation by an ISNI participant. Humans intervene at the end of the process to resolve problems. Humans can also enter new records manually into ISNI ("Data Quality Policy" 2019).
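The assignment algorithm itself is internal to the ISNI system, but its threshold-based behavior can be caricatured in a few lines. Everything below – the names, the thresholds, and the similarity measure – is invented for illustration; it shows the shape of the decision, not ISNI's actual rules:

    # Hypothetical sketch of confidence-threshold matching; the real ISNI
    # algorithm and its confidence levels are not public and differ from this.
    from difflib import SequenceMatcher

    ASSIGN_THRESHOLD = 0.95   # invented: confident enough to treat as the same
    REVIEW_THRESHOLD = 0.75   # invented: flag as a possible duplicate

    def name_similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def match_decision(candidate: dict, existing: dict) -> str:
        score = name_similarity(candidate["name"], existing["name"])
        # Corroborating metadata (e.g., a shared title citation) raises confidence.
        if set(candidate.get("titles", [])) & set(existing.get("titles", [])):
            score = min(1.0, score + 0.15)
        if score >= ASSIGN_THRESHOLD:
            return "merge"    # same identity: reuse the existing record
        if score >= REVIEW_THRESHOLD:
            return "flag"     # possible duplicate: hold provisionally for review
        return "new"          # distinct identity: propose a new record

The point of the caricature is the middle band: records that are neither clearly equal nor clearly distinct stay provisional until a human resolves them, which is exactly where practitioner research effort goes.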
As noted earlier, the NACO and ISNI programs differ significantly in how their contributors are onboarded. Participation in NACO is allowed only after a cataloguer has completed a detailed 5-day training course and has been reviewed, which usually entails the contribution of more than 100 records ("NACO Training" 2018, Module 8, slides 8-4, 11). The training required for ISNI participation is much shorter, as is the review period, which evaluates the source institution rather than the individual contributor ("PCC ISNI Pilot Home" 2019). An ISNI contributor does not have to be a cataloguer. This is because the form of a name does not depend upon library cataloguing conventions; indeed, the form of the name does not matter. What matters is entering the data, and correctly identifying the entity (Liss 2017, 5, 7).

There is another key difference between ISNI and NACO. Provenance, which means the legitimacy of data based on where it comes from, is important in both NACO and ISNI, but the criteria are different. NACO relies upon "usage" ("Frequently Asked Questions on creating Personal Name Authority Records (NARs) for NACO" 2018): How does the entity represent itself in resources, and where did the name appear? This information is cited, to support or prove the assertions made in an authority record ("Descriptive Cataloging Manual" 2018, section Z1, 670). To give an example, the authorized heading for Sandro Botticelli is "Botticelli, Sandro, $d 1444 or 1445-1510." In the authority record for this artist there are statements to support the preferred form of his name. One citation notes that Botticelli, painter of Florence, by H.P. Horne, published in 1980, states on page 1 that the artist is commonly called Sandro Botticelli.

In ISNI there is no need to prove the veracity of the metadata. A contributing ISNI institution is called a "source." ISNI trusts its sources, provided the ISNI algorithm finds no problem when comparing new data with existing data. If more than one source says the same thing, this is even better. ISNI is interested in which institution added the metadata, and how many institutions corroborate the same metadata ("ISNI Manual" 2019, section 1.5, 5–8).

Section of the NACO authority record for "Umberto Eco" in the NACO file in OCLC Connexion (OCLC, 2018)

In the example above, the NACO record for Umberto Eco indicates that he was born on January 5, 1932 and died on February 19, 2016. This information is coded in the 046 field, and the birth date is supported by the last 670 visible in the illustration. The 040 field lists codes for all the libraries that have contributed to this authority record. It readily shows which institutions contributed the metadata, but not which institution contributed what piece of metadata.
NACO bases metadata on "literary warrant" (RDA Toolkit 2017, 0.4.3.4), i.e., where the information appeared formally, so each 670 lists a title, year of publication, and sometimes the author ("NACO Training" 2018, Module 1, slides 208-209). In contrast, ISNI links each piece of metadata with the source institution's symbol ("Data Contributors" 2019).

Section of the ISNI record for "Umberto Eco" (ISNI, 2018)

The fourth line of this example states that the Bibliothèque nationale de France cites Umberto Eco's birthday as January 5, 1932. Other institutions list both his birth and death date. The information is repeated, even if it is identical.

NACO and ISNI also differ in how they handle name changes for corporate bodies. NACO requires the creation of a new authority record when a corporate body changes its name; the two authority records are then linked in an earlier/later relationship (RDA Toolkit 2017, 11.2.2.6, 32.1.1.3). In ISNI, a corporate body continues to have the same identifier unless there is a change in the structure of the corporate body, a merge, or a split ("ISNI Manual" 2019, section 7, 31). In other words, to warrant a new identifier, the nature of the corporate body must change. Absent that, a name change is just a name variant in ISNI. However, ISNI batch loads data from sources that use mutually incompatible models for handling corporate name changes – mostly, European sources versus US sources. ISNI may then automatically merge records, creating some strange combinations. Recognizing that the result can be problematic, the ISNI Quality Team works to correct these records manually ("ISNI Manual" 2019, section 7, 22–30).

Despite the differences between ISNI and traditional authority work, there are two issues that are key to any good database of entities: (1) the need to differentiate between entities; and (2) the need to avoid duplicate records for the same entity. To meet these criteria, identifying information must be provided for each entity. For example, let's look at this list of names:

- Quintana, Isabel del Carmen
- Quintana, Isabel
- Quintana, Isabel
- Quintana y Gonzalez, Isabel

The list tells us nothing about who this person, or these people, are, and therefore neither humans nor a computer can know if one, two, or several entities are represented. In NACO, one or more citations provide context by connecting the name with a work (or works). It could be a book written by or about this person ("Descriptive Cataloging Manual" 2018, section Z1, 670). For example, in the screen shot below the NACO authority record ascertains that Isabel Quintana is the author of a book titled Figuras de la experiencia en el fin de siglo, published in 2001.

Section of the NACO authority record for "Quintana, Isabel" in the NACO file in OCLC Connexion (OCLC, 2018)

In contrast, data batch loaded into ISNI may be "sparse" – for example, only a name may be present. In these cases, the computer cannot match the entity with any other entities in the database, and duplicates result. If the sparse records remain provisional, they do not display to the public and so any redundancies may not be of concern. However, if they are upgraded to "assigned" status, they need to be resolved ("Data Quality Policy" 2019).
Duplicates are a major problem for any authority file. In NACO, great care is taken to ensure that duplicates are not created in the first place ("NACO Training" 2018, Module 1, slides 75, 82-83, 94-98). Yet we find them. This is partly because it is labor-intensive to search every known variant of the name, especially if it is a common name. When cataloguing a book by William Smith, one must look at many records in the current file, including variants such as Will Smith or Bill Smith. The person might also have been established with additional information not present in the book in hand, such as "Smith, William A." It can be tricky and time-consuming to avoid creating a duplicate. ISNI resolves as many duplicates as possible automatically and flags others for manual resolution. Because it is much easier to merge duplicate entities than to separate conflated ones, the matching algorithm is extremely cautious ("Data Quality Policy" 2019), with the result that duplicates are common.

How easy is it to merge entities in ISNI? A search for "Michael, George" retrieves a complete record for the British singer who passed away in 2016. It also finds a very brief provisional record. The latter looks like this:

Section of the provisional ISNI record for "Michael, George" (ISNI, 2018)

A possible match is indicated, and the contributor is asked to compare this record with the fuller record for George Michael. The two records will then be displayed side by side, and one can respond: "Equal," "Unequal," or "Don't know" ("ISNI Web Interface Usage Guidelines" 2019, section 4, 19-20). Since ISNI logins are at the institution level ("Data Contributors" 2019), the participant's assertion will be recorded as a statement made by that institution. So, although one will find more duplicates in ISNI than in NACO, there are efficient ways to deal with the duplicates – once one has determined that they are duplicates.

This case is a typical example of the kind of inquiry sometimes needed to resolve duplicates. There is so little information in the provisional record that it is difficult to know whether these entities are the same. A search for IPDA in ISNI's list of sources (Ibid.) reveals that it stands for the International Performers Database Association. A web search shows that the main objective of the International Performers Database (IPD) is to identify individual performers in audio recordings and audio-visual works ("IPD" 2019). So, these are likely the same person.

In summary, we in the library community have traditionally practiced authority work rather than identity management. However, many of the tasks are similar. We want to have different records set up for unique entities. As librarians it makes sense for us to experiment with ISNI and see where identity management and ISNI workflows could enhance or change our current approach to names. ISNI also provides an opportunity for Anglo-American cataloguers to share data with our colleagues globally, and with representatives of institutions outside of the library sphere who are interested in identities too. Because the identifiers are machine actionable, ISNI is a step toward linked data and discovery of library resources on the web.

What are some possible next steps?

The PCC created a Pilot program ("PCC ISNI Pilot Home" 2019) to determine how librarians can better incorporate ISNI into our workflows.
We might experiment with ISNI for local authority files. Many libraries have thousands of legacy local authorities that they cannot add to NACO because they lack the resources to enter the records manually. The PCC created a task group to work on mapping MARC library data to ISNI to facilitate future batch processing ("Pilot Joint Focus Areas & Deliverables" 2019). Perhaps these local authorities can be sent to ISNI, matched against the database, and staff can process ISNI reports that result from batch processing.

We might consider dissertations. Many university libraries have lists of people who have written dissertations; information that in some cases is stored electronically. However, NACO contributors are reluctant to create a personal name authority based on a thesis, since in many cases the form of name as it appears there is not necessarily the form preferred by the author. Creating these names in ISNI allows flexibility in the form of the name, and provides the opportunity to establish the scholar's affiliation with a university early in his/her career.

What about other names? Perhaps we have a collection of objects, or realia, with associated names which would not fit nicely into traditional authority files. What about local university clubs, such as student organizations? Many of these corporate bodies never issue a document, yet we have information that they exist and about their focus.

In support of the PCC ISNI Pilot, the PCC has appointed an ISNI Training Task Group, charged with developing an ISNI training curriculum for PCC ISNI participants, documenting procedures and workflows for creating and maintaining ISNI records, and using ISNI tools. The task group will work closely with the ISNI Pilot participants to make the training curriculum available to others also interested in participating in ISNI ("ISNI Documentation & Training" 2019).

In short, ISNI presents opportunities to look at named entities differently. It is critical for librarians to experiment, to determine how best to represent entities using the various data structures available; to explore how we might broaden the community within which we cooperate to do this work; and to meet contemporary information seekers wherever they are, with high-quality metadata for these entities.

References

"About NACO," Library of Congress, NACO website, accessed December 17, 2018, https://www.loc.gov/aba/pcc/naco/about.html.

Angjeli, Anila. 2012. "ISNI: un identifiant passerelle." Documentation et bibliothèques 58(3):101–108. Accessed April 16, 2019. DOI: 10.7202/1028900ar.

Angjeli, Anila. 2014. "ISNI: consolidating identities, connecting nodes." International Journal of Knowledge and Learning 9(4):326–346. Accessed April 16, 2019. DOI: 10.1504/IJKL.2014.069534.

Bălan, Dorina. 2017. "ISNI: International Standard Name Identifier." Biblioteca 2017(7):203–204, 222–223. Accessed April 16, 2019. https://search-proquest-com.ezp-prod1.hul.harvard.edu/docview/2012832329?accountid=11311&rfr_id=info%3Axri%2Fsid%3Aprimo.

"Data Contributors," ISNI website, accessed February 1, 2019, http://www.isni.org/content/data-contributors.

"Data Quality Policy," ISNI website, accessed February 1, 2019, http://www.isni.org/content/data-quality-policy.
"Descriptive Cataloging Manual," Cataloger's Desktop, last updated October 2018, http://desktop.loc.gov/search?view=document&id=68&fq=myresources|true.

"Frequently asked Questions about Joining the NACO Program," Library of Congress website, accessed January 31, 2019, https://www.loc.gov/aba/pcc/naco/nacoprogfaq.html#1.

"Frequently asked Questions on creating Personal Name Authority Records (NARs) for NACO," Library of Congress website, last updated March 13, 2018, https://www.loc.gov/aba/pcc/naco/personnamefaq.html.

Gatenby, Janifer, and Andrew MacEwan. 2011. "ISNI: a new system for name identification." Information Standards Quarterly 23(3):4–9. Accessed April 16, 2019. DOI: 10.3789/isqv23n3.2011.2.

"How ISNI works," ISNI website, accessed February 1, 2019, http://www.isni.org/how-isni-works.

"International Standard Name Identifier," Wikipedia, updated December 6, 2018, https://en.wikipedia.org/wiki/International_Standard_Name_Identifier.

"IPD," SCAPR Societies' Council for the Collective Management of Performers' Rights website, last updated 2019, https://www.scapr.org/tools-projects/ipd/.

"ISNI Assignments Top 6.5 Million." 2013. Information Standards Quarterly 25(3):37. DOI: 10.3789/isqv25no3.2013.06.

"ISNI Documentation & Training," PCC ISNI Pilot DuraSpace wiki, accessed January 31, 2019, https://wiki.duraspace.org/display/PCCISNI/ISNI+documentation+and+training.

"ISNI Manual," PCC ISNI Pilot DuraSpace wiki, accessed January 31, 2019, https://wiki.duraspace.org/display/PCCISNI/ISNI+documentation+and+training#ISNIdocumentationandtraining-ISNIManual.

"ISNI Web Interface Usage Guidelines," PCC ISNI Pilot DuraSpace wiki, accessed January 31, 2019, https://wiki.duraspace.org/display/PCCISNI/ISNI+documentation+and+training.

ISNI website, accessed December 14, 2018, www.isni.org.

Liss, Jennifer A. 2017. "Identity Management or Authority Control?" American Library Association website, accessed February 1, 2019, https://connect.ala.org/HigherLogic/System/DownloadDocumentFile.ashx?DocumentFileKey=4221e787-22e6-4a7a-855e-8e471edffa18%3E.

"NACO Training," Library of Congress website, last updated October 29, 2018, https://www.loc.gov/catworkshop/courses/naco-RDA/index.html.

"Pilot Joint Focus Areas & Deliverables," PCC ISNI Pilot DuraSpace wiki, accessed January 31, 2019, https://wiki.duraspace.org/pages/viewpage.action?pageId=87465691.

"PCC ISNI Pilot Home," PCC ISNI Pilot DuraSpace wiki, accessed January 31, 2019, https://wiki.duraspace.org/display/PCCISNI/PCC+ISNI+Pilot+Home.

RDA Toolkit, last updated April 2017, http://access.rdatoolkit.org/.
TECHCAST

Welcome to the new Collection Management column Techcast. With less time to devote to collection responsibilities, librarians are casting about for ways to streamline collection management activities. At the same time, rapidly changing information technology brings forth new software, Web applications, and innovative practices that will potentially streamline and enhance collection management activities. The Techcast column will highlight these new ideas and help librarians assess them for use at their own institutions. Our first Techcast column examines the WorldCat Collection Analysis Interlibrary Loan Analyses module. We hope that this article and those that follow will guide librarians searching for ways to improve and streamline collection management activities.

Margaret Mellinger, Column Editor

Further Reflections on the WorldCat Collection Analysis Tool

Hilary Davis, Annette Day, Darby Orcutt

ABSTRACT. This article focuses on a recent enhancement to the WorldCat Collection Analysis tool, the Interlibrary Loan Analyses module, exploring the possibilities that this enhancement offers for strategic collection development. The study concentrates on the tool as a way to assess the impact of the recent growth at North Carolina State University in biomedicine and human medicine programs. The research contained in this article originated from a session presented at the XXVII Annual Charleston Conference, which was held in Charleston, South Carolina, on November 8, 2007.

KEYWORDS. WorldCat Collection Analysis, strategic collection decisions, interlibrary loan data, interlibrary loan analyses

Hilary Davis is Collection Manager for Physical Sciences, Engineering, and Data Analysis. She holds an MLS from the University of Missouri-Columbia and an MS in biology from the University of Missouri-St. Louis (E-mail: hilary_davis@ncsu.edu). Annette Day is Associate Head of Collection Management. She earned her MLS from Leeds Metropolitan University in the United Kingdom and holds a BS in computer science and mathematics from Leeds University, United Kingdom (E-mail: annette_day@ncsu.edu). Darby Orcutt is Senior Collection Manager for Humanities and Social Sciences. He holds an MS in library science, an MA in communication studies, and a BA in speech communication and religious studies (E-mail: darby_orcutt@ncsu.edu). All are at North Carolina State University Libraries, 2 Broughton Drive, Box 7111, Raleigh, NC 27695-7111.

Collection Management, Vol. 33(3), 2008. Available online at http://www.haworthpress.com. © 2008 by The Haworth Press.
All rights reserved. doi: 10.1080/01462670802045566

INTRODUCTION

A 2006 article (Orcutt and Powell) reviewed the WorldCat Collection Analysis (WCA) tool as used at the North Carolina State University (NCSU) Libraries. It detailed many shortcomings and necessary capabilities of any collection analysis tool for contemporary and future collection assessment. In their review, Orcutt and Powell noted that the WCA tool enabled fast gathering of data related to the holdings of a single institution (i.e., a single Online Computer Library Center (OCLC) holding symbol) and allowed comparison to a group of two to five others to quickly produce exportable title lists in targeted areas. They also reported that there were some limitations to the WCA tool. For one, the WCA's and WorldCat's use of matching on accession number instead of matching at the title level produces duplicative results. Instead of a concise list of unique titles, many multiple records for same or like items were often returned, with up to 80% duplication. In addition, they reported that the tool supports "single-institution analyses rather than the cross-institutional comparisons needed for the cooperative decision-making increasingly desired in practice" (Orcutt and Powell, 2006).

This article focuses on a recent enhancement to the WCA tool, the Interlibrary Loan (ILL) Analyses module, exploring the possibilities that this enhancement offers for strategic collection development. The study concentrates on the tool as a way to assess the impact of the recent growth at NCSU in biomedicine and human medicine programs. The NCSU Libraries had anecdotal evidence of the demands placed on its collections by this growth trend but needed evidence to justify increased spending focus and funding requests. The WCA ILL data study described herein provided an opportunity to flesh out broad-level observations that originated in part as a response to the 2006 study conducted with the Triangle Research Libraries Network (TRLN) consortium. TRLN comprises NCSU, Duke University, the University of North Carolina at Chapel Hill, and North Carolina Central University, with combined resources approaching 14 million volumes and total collections budgets of more than $29 million. The TRLN study aimed to produce actionable interpretations of our users' collections needs within the consortial context (Triangle Research Libraries Network 2006).

SETTING THE SCENE

As context for the study, some details about the NCSU Libraries are necessary. As part of the overall mission, the NCSU Libraries support more than 31,000 students and 8,000 faculty in areas focusing on engineering, science/technology, mathematics, and veterinary medicine. NCSU is ranked third in industry-sponsored research spending, compared with all public universities without a medical school (or, at least, a medical school of the non-veterinary variety). As noted above, the NCSU Libraries are also members of the Association of Research Libraries and the TRLN consortium. All of these elements factor into the questions we asked regarding the tool and how we reviewed and interpreted the data retrieved.

DATA GATHERING AND RESULTS

To investigate the ILL Analyses module as a possible tool for guiding strategic collection development and to understand the impact of programmatic changes on the NCSU Libraries' collections, we posed a series of questions.
First, we wanted to know whether the ILL Analyses from WCA could help us identify subject areas in which there was a clear need for resources that were not already part of our collection. The analysis represented in Figure 1 highlights the demand for health professions and public health materials, with over 20,000 requests in a 3.5-year period (nearly 17 requests per day). It should be noted that the subjects listed in the figure are based on the OCLC Conspectus of subjects, which is also based on multiple classification schemes. These are the subject divisions provided by WCA. When applying the tool, users have no control over how the subjects are defined.

FIGURE 1. Number of Requests Across 3.5 Years of Data Provided by the Interlibrary Loan Analyses in WorldCat Collection Analysis Spanning Mid-2003 Through Mid-2007. [Bar chart: total borrowed items across all formats; number of items requested, by subject.]

Once we had identified an area of demonstrated need, namely, health professions and public health, we wanted to know whether the need for resources in that subject was format-specific, that is, for serials, books, or other formats. Figure 2 shows that for health professions and public health, 95% of requests were for serials and only 3% were for books. Further data analysis provided us with specific title information. For this subject area, we were also able to determine which journals were in highest demand over the past 3.5 years.

FIGURE 2. Top 10 Most Requested Subjects Based on Format. [Stacked bar chart: percent of requests for serials, books, and other formats across the top 10 subjects.]

Another area to analyze was the publication date of requested materials. Were the articles requested from newer or older publications? For health professions and public health, in the past 3.5 years, there was a clear indication that articles published in the 1980s and 1990s represented the greatest need for serials (Figure 3). This would imply that backfile
purchases may be necessary to fill gaps. We used these ILL data to identify which publishers were most highly represented, based on the journals that were requested, and mapped those data to publication dates to help determine which backfile packages were available to fill our collection gaps.

FIGURE 3. Interlibrary Loan (ILL) Requests for Serials in the Health Professions and Public Health Subject, Showing Level of Need and Age of Materials Requested. [Bar chart: number of requests by publication decade, 1960s through 2000s, for request years 2003-2006.]

Finally, with regard to our local library consortium, TRLN, we wanted to know how many of our users' ILL requests were being filled by TRLN. Figure 4 shows the top 10 lending libraries, with the top 4 being from the TRLN consortium (representing about 49% of our ILL requests). The ILL Analyses also allows us to see the same data for any specific subject. For example, we were able to see that for requests in health professions and public health, the top four lending libraries are from the TRLN consortium (representing just over 60% of our ILL requests for this subject).

FIGURE 4. Top 10 Lending Libraries Across All Subjects and All Formats. [Bar chart: number of items borrowed by NCSU from each lending library.]

On the flip side, the ILL Analyses module allows us to look at our activity as a lender. Figure 5 shows the top 10 libraries requesting materials in health professions and public health for both books and serials. We found that we lend the greatest amount of content in the same area in which our users generate the most requests: health professions and public health. Because of the very broad nature of the Conspectus headings, interpretation of our lending data is a little difficult, but it is likely that we have a specialized collection in the subject area among our history of science and toxicology and psychology collections to explain these data.

FIGURE 5. The Top 10 Borrowing Libraries in the Subject of Health Professions and Public Health for Both Books and Serials, From the Perspective of the North Carolina State University Libraries as a Lender. [Bar chart: number of items requested from NCSU, serials and books, by requesting library.]

Analyzing our ILL activity with regard to TRLN provides some interesting insights into the cooperative collection development practiced by the consortium.
At the NCSU Libraries, we have a relatively new research collection. It has only been in the last 15 or so years that we have had adequate funding and a dedicated collection management department to strategically build our expanding research collection. The cooperative collection building through TRLN has enabled us to provide materials to our users that supplement our collections' strengths and foci. Over time, the focus of research has changed, and materials that were supplemental to NCSU research have now become central. The demand for health professions and public health materials clearly demonstrates this change in focus. Working with these data, collection managers, in consultation with faculty and researchers, are embarking on a program to ensure that the NCSU Libraries' local collections meet the developing needs of its constituents in these areas. Using this analysis to streamline and sharpen the collaborative collection building of TRLN, the ILL Analyses module of the WCA tool may provide valuable guidance in developing more meaningful cooperative collection building strategies.

WCA ILL ANALYSES AND OUR LOCAL ILL DATA

While the WCA ILL Analyses module was able to give us some meaningful data, as previously discussed, we found that to gain a full understanding of our ILL activity and its implications for collections we needed to use the WCA data in conjunction with our local ILL data. Our local data, gathered from the NCSU Libraries' ILL system, ILLiad, gives us a useful level of granularity, indicating the college or department the requestor is from and whether he/she is a faculty member, graduate student, staff member, and so on. This level of granularity is not available from the WCA. Another limitation/difficulty we encountered with the WCA was the ability to retrieve ILL data based on a specific call number range. Our ILL librarian can provide us with locally generated data to tell us the ILL activity in a specific call number range. With the WCA, we had to use the broad Conspectus headings. However, the WCA does have the advantage of easily providing a big-picture overview of our ILL activity over a broad time period. This is not easy to obtain with our local data. While we have those precise granular data, pulling back from that detailed level to create a broad overview would require much data manipulation and would be time-consuming.
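To make the trade-off concrete, rolling a granular ILL log up into a WCA-style overview is scripting work rather than a built-in report. The sketch below is hypothetical: the export file and its column names (request_date, subject) are invented, and a real ILLiad export would first have to be mapped onto them.

    # Hypothetical sketch: aggregating granular ILL request data into a
    # broad overview (requests per subject per year). Column names invented.
    import pandas as pd

    requests = pd.read_csv("ill_requests.csv", parse_dates=["request_date"])

    overview = (
        requests
        .assign(year=requests["request_date"].dt.year)
        .groupby(["subject", "year"])
        .size()                       # count requests in each subject-year cell
        .unstack(fill_value=0)        # one row per subject, one column per year
        .assign(total=lambda t: t.sum(axis=1))
        .sort_values("total", ascending=False)
    )
    print(overview.head(10))          # the ten most-requested subjects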
Ideally, both sets of data are necessary, as they complement each other well. This combined approach enabled us to see the complete picture of our ILL activity and pointed us to some useful interpretations for our collections. The WCA was able to provide us with a snapshot of where there may be gaps in our collections, which we were then able to flesh out with the more granular local data. Using those local data, we could identify specific departments needing materials, and we could speak to them to learn more of their needs and demands on our collections.

DATA INTERPRETATION

Regarding data interpretation, it is important not to look at the data alone but to ensure that it makes sense with what you know historically and currently about your collection and also the programmatic and research activities on your campus. One of the first things that we had to bear in mind was that these data may show us false gaps. For example, a book that was requested via ILL does not automatically mean that the NCSU Libraries does not hold that title in the collection. It may mean that we do not have enough copies to meet the demand. We found this to be the case when we examined the title lists generated by the analysis for some subjects. This is useful information to have, and we can approach meeting the demand in several ways in consultation with our patrons. Those ways include purchasing additional copies, buying an online version when available, or perhaps putting the book on reserve if the demand is being generated by a specific class.

We also needed to think about how our patrons use ILL. There may well be a segment of need that we do not see in the data. It is quite possible that we have not captured all the gaps in our collections from this analysis because not all of our constituents use ILL. They may use their own network of colleagues at different institutions to fax or e-mail them an article they need. They also may in some cases choose to purchase the article themselves from a publisher's Web site if either they are unaware that the library can get them the article or they perceive the process as too time-consuming when they need the article immediately.

As was previously mentioned, it is important to understand your collections to make sense of the data and to be aware of any anomalous results that may arise. The NCSU Libraries has a fairly young research collection, and it is only in the past 15 or so years that we have been funded to develop a strong and rich research collection with a dedicated collection management department. In light of this historical context, the gaps we saw in our collections that pertained to serials from the 1970s to the 1990s made sense.

Finally, we also interpreted the results in the light of the information landscape in which we are working. Serials and e-resources are consuming a large proportion of budgets. As a result, monograph collections are beginning to feel that squeeze. In addition, as publishers' print runs get smaller and smaller, collection managers often find that if a book is not purchased within a few months of its publication, it becomes out of print; thus, it is harder to find and more expensive to purchase. The financial squeeze, coupled with short print runs, could explain why the demand for monographs focused on more recently published titles.

CONCLUSIONS

Regarding the ILL Analyses enhancement, within the context of the NCSU Libraries we found the tool to be extremely useful for providing a big-picture overview of ILL activity over a broad time period. We were able to use those overview data to generate effective graphs, which helped us describe collection needs to administrators. The overview data as communicated via graphs helped administrators understand the justification for concentrating funding on a specific area or for increasing budget requests for library resources. Much local manipulation of the data from the WCA ILL Analyses module was needed to create the graphs/charts and title lists that we used in our analysis. As a result, we feel that there is a learning curve associated with using this tool to mine, manipulate, and interpret the data. At the NCSU Libraries, we are encouraged to use data sources such as the WCA and are given the time to work with such tools and to develop the skills to use them effectively. In other libraries, we know that many pressures and demands on librarians' time make this sort of focus difficult. Therefore, some possible enhancements to the tool would be to increase the output capabilities.
Enhancing these output capabilities would make the data more usable and applicable and enable users to make graphical presentations more complex and flexible, such as those that have been presented in this article. Currently, there are some very basic graphs and charts that can be output from the tool, but they do not describe the data in as much detail or as clearly as those we generated ourselves using the source data from the WCA. Alternatively, perhaps OCLC could offer this tool via a service model whereby OCLC would run the analyses and work with the library to produce output tailored to their needs.

REFERENCES

Orcutt, Darby and Tracy Powell. 2006. Reflections on the OCLC WorldCat Collection Analysis tool: we still need the next step. Against the Grain 18, no. 4 (November): 44–48.

Triangle Research Libraries Network. 2006. TRLN OCLC Collection Analysis Task Group: Report to the Committee on Information Resources (CIR), June 2006. Document online. Retrieved from http://www.trln.org/TaskGroups/CollectionAnalysis/TRLN CollAnalysis June2Report.pdf
The example below shows a display of the input format designed by OCLC to accommodate the data elements required to make a complete bibliographic record. It uses codes known as tags and indicators to identi{y various elements of bibliographic data 寄: Ch"n刊g & R"a恥,~ : U"de .,.tandin~ MARC : AnOlher Look '" (refe rred to as fie lds) 、叭的in most fields, it uses subfield codes known as delimiters, to further narrow the identification of data elements 010 gh84.1258 C謂 “協扭曲785 (pbk) 050 1 HQ814 l聞 10 Wakzak. Yvette 245 10 Divorce : b the child's point of 叫ew / c Yvette 、lialczak with Sheila Burns 260 0 London : 11 Sa n Francisco : h Harper & Row, c 1 9.倒 A more sophisticated but still incomplete understanding of MARC recognizes that the MARC format also conta Îns fixed fie lds which accommodate data such as language of text, presence of a bibliography, source of cataloging, etc. In an OCLC display these appear at the top of the record. as shown below Type: a Bib lvl: m Go叫抖'" Lang: 自tg Source: d 1IlW;: Rep E耽'"' Conf pub: 0 Ctry:目'" Dat tp: s 卸νF/B : 10 J:>e,;c: a Int lvl Dat-純 l曲4 010 gb84.1 2!:姐 Another misconception involves the understanding of the term “ fu ll MARC record" , which 的 often thought to be one that di叩lays all the tags required by the data in the record. rather than one which represents the standard USMARC format In actuality, few librarians have ever seen the real standard record fonnat known as LCMARC or USMARC, since it was designed for the co mputer, not for the human eye. The display formats shown above are only the visible part of the MARC record. The other parts of the real MARC record are the leader and the directory, both invisible in online systems, but “ vÎtal to 這6 吵~ 1" Joumal uf Educa !lonal M..d旭& Library Scienct喝 27 : 2 (Winler 1唉1II) communication and some forms of processing ......" You may never see a leader or a directory on line, but those elements make USMARC processi ng efficient and fl exible. 2 In the real MARC record, the tags are nol included with the indicators attached to a particular field. Nor does the USMA RC leader match the same fields as the fix ed fields of the familiar input format . The visible input formats used by th e va riou s bibliographic uti 1ities (OCLC MARC, UT LAS MARC, etc.) a ll have the sa me general st ructu re as USMARC, bu t vary frorn eac h othe r in their use of ex tended nan-USMARC fields. FOf exam ple, OCLC MARC uses an 049 tagged field to s how item holdings, while RLl N MAR C uses 95X tagged fields for th e same purpose Most bibliographic da taba se sys tem s use the formats shown above to display a bibliographic record in place of the conventional 3'" x 5'" card format. 50me system s, how ever, o ff er library users the option of viewi ng the sa me record online in a 3"'x 5' card format or, as in the case of OCLC. may also produce ca rds offline for use in the Iibrary's ca rd catalog , As accustomed as librarians a re to the traditional catalog card , th叮 are beco ming increasing ly comfortabl e with the tagged format, which offers more room and flexibility in displaying and storing a bibl iog ra phic record 、Vhy ßother Trying to Understand MARC Despite some of the ir misconceptions, librarians have for yea rs managed to make practical use of MARC. 50 why bother tryi ng to learn what the MARC format really is? Walt Crawford, in the introduction to his book , MAR C for Library Use. 
summariz es the reasons that today's librarians need to increase their understanding of the MARC record Many lîbrarîans create and use MARC records wîthout ever unde r standîng the nature of MARC îtsel f. Whîle no such understandîng is ,.~"~門;."! n幫手 ChanR & R叫j,," : Und""nanding MARC : Ano!h"r 自刷k r叫uireJ for calaloging, librarians need 10 know more about MARC as tneir uses of computers "xpand. A thorough und"rsta吋ing of MARC w.11 help when dealing with vendors of services, when considering online catalogs and other aUlomated systems, and when considering possible local development of aUlomaled syslm es. 3 '" In recent years more and more libraries are implementing some type of automated system. Because of this trend, the need to understand the MARC record foramt takes on new impor- tance. Most librarians have heard enough abOUl MARC to know that i1 represents the standa rd for machine-readable record formats As computer applicat自ons become more common in libraries and the opportunity to share bibliographic records inc reases, librarians are becoming more conscious of the need for standardization of bibliographic records. We need to be assured that our automated systems are in accord with whatever standards MARC has establîshed . When faced with the responsibility of choosing an automated system, the lîbrarÎan must assess th e capabÎI Îty of that system to accc阱, store, an d process MARC records. Since most venclors claîm t ha t thei r systems a re MA R C compatible, it is important that librarians know enough about MARC to be able to verify suc h claims The StructuTe of MARC A complete description of the MARC structure is beyond the scope of this paper. Readers who are interested in the cletailed s peci fi cations of MARC should refer to the MARC documentations of Lîbra ry of Congress or of national bibliographic systems such as RLIN , WLN, UTLAS, or OCLC and to W. C rawford's MARC for Library Use. However , a g e n e ral descr ip tion of the real structure of MARC may be helpful in visualîzing the ove rall p lcture The MARC format is divicled into three main parts: the lea der, :可 ι lH j州"n ,, 1 "f Eoj ",-n川的論 I ~'l叫.n &. 1..1>...r) S山-"....~'n :l(W州n l'~叫 the directory. and the variable fields. The va rîabl e fields a re, in lum, subdivided inlO twO g roups: the variable control fields and th e variable data fields. The following ex扎mple îllustrates the overa ll format of a MARC record LEADER RECORD Dl RECTORY VARIABLE FIELDS 01 2:'“ 567ft. Use rs of the MARC record do not see lhis data On Ihe screen. For exa mpl e, what dU' uscr sees in the beginning po喝ition of an OCLC record is the rccord îdentîfication number. Thi s should not be co nfuscd wilh thc record length. which occupies t he beginning position in u USMARC record 扒)lIowing the de sig nati on of re地ord lenglh art.' three l.ch.lfacler data fields for coding the r(..cord statlls (new, T{'vised. del l'll'd , t' IC.), the type of reco rd (Iar明uage maten圳, muslc、 map. etc.), and the blbliographic levl'l (monograph, sc ria l, etc.). The品l' Ihree fidd討“long with characrer 17, the encoding level (c!egret. of complctencss of rt-!mrd) can be 間en online in Ihe fix t.'1ARC provides a common ground for sharing d..ta; without ∞mpatibility. a library is forec1osing 包uch sh..ring.5 D.S 恥1cPherson offers related advice When a library evaluates an automated sySlem , concern about record formats may take a back seat to other criteria such as system features and purchase price. ln the long term. however. 
l的e of a system that does not meet e",叫 ing standards may prove extremely cos ll y.6 '" We sometimes hea r that a library is using a local system to download its own bibliog raphical records from a national bibliog raphic database system, by connecting the system to the printing port of t he terminal and sending each of its records off the printing port. It is true that the local system can capture all data elements of each record as it appears on the computer screen or print-out. But that record is not the same as the one on the national system archival tape. Therefore, the local system needs to employ an additional program (separate from the program that processes the standard MARC record) in order to be able to process the records down loaded from the terminal We a1so hear at tîmes 出at certain Iîbra ri es are using micro computer packages such as dBASE m + or RBASE 5的o to ca阻10g special materials, and that they are creatin g the自 r records in MARC format. It is quite possible that such database management systems could be used to produce a true MARC record. However, it would require extensive programming efforts to achieve so complex a record format, because while both dBASE III + and RBASE 50個 can handle fixed field records quite eas i1 y, they cannot dea l effective1y with variab le length records. In fact, any re lationa l database management system wou ld be unsuitable for handli ng variable length records since the relational record characteristica Jl y places records in a flat talbe form. Although samples of the output '" Joumal of Ed uca lÌonal Media & Library Scicnc". 'zl : 2 {Wimer 19裕的 record format show that MARC tag s are used to indicate data field s such as author, t itle, and subject, the data in these fields is often truncated when space requi rem ents exceed the flat tab le limits Some of the system vendors point \0 t heir tagged display format as evidence that the ir systcm ca n process and autput MARC records. Such assertions are not uncomman but should not be taken at fa ce value. As such tÎm間, the Ii brarian must insist on asking the essential qu estion: Can th e Jocal system reproduce a comp!ete MARC record fron、 the records stored in the syste m, if in the future the records had 10 be transferred \0 another system ? lt is easy \0 understand why the Technical Standard s for Libra ry Automation Committee (TESLA ) of ALA became con cerned several years ago aboul the MARC compatibility of automated lib叩門 systems being marketed. As a result, it launched a compatibility survey of various vendor's products. The survey ~:~/ indi阻ted that there was a generally strong vendor comm it ment to the MARC forma t. "However, t her e were e nough nonstandard practices reported to indicate that MARC compatibility cannot be assumed and the customers should qu estion prospective vendors carefully in a number of a reas 叮 Standardization and Data Communication The codes and data fields within the MARC structure are not the same as the MARC structure itsel f. To be MARC compatible, both the structure and the inlerior codes must follow a set of standards. 
Some of the system vendors point to their tagged display format as evidence that their system can process and output MARC records. Such assertions are not uncommon but should not be taken at face value. At such times, the librarian must insist on asking the essential question: can the local system reproduce a complete MARC record from the records stored in the system, if in the future the records had to be transferred to another system?

It is easy to understand why the Technical Standards for Library Automation Committee (TESLA) of ALA became concerned several years ago about the MARC compatibility of automated library systems being marketed. As a result, it launched a compatibility survey of various vendors' products. The survey indicated that there was a generally strong vendor commitment to the MARC format. "However, there were enough nonstandard practices reported to indicate that MARC compatibility cannot be assumed and the customers should question prospective vendors carefully in a number of areas."7

Standardization and Data Communication

The codes and data fields within the MARC structure are not the same as the MARC structure itself. To be MARC compatible, both the structure and the interior codes must follow a set of standards. The MARC structure standard was set by the American National Standards Institute and is known as ANSI Z39.2-1979.8 Although the ANSI standard did not specify the standard for tags, indicators, and data-element identifiers (delimiters), those defined by LC have been accepted as standard practice, for example, 100 for main entry-personal name, 245 for title, etc. The need for strict adherence to the standard is related to the…

[…]

…fields, but it will simply leave them there as part of the record; it will not do anything with them. Other tags or subfields not defined by USMARC or OCLC MARC will be rejected by the system. Because each bibliographic utility has adopted a somewhat different set of extended tags and subfield codes, bibliographic communication among these systems requires additional processing. For a discussion of alternative ways to resolve conflict in communication among systems, see R. Renaud's "Resolving Conflict in MARC Exchange."9

Computer programs for reading a MARC record are much easier to develop than those for constructing a MARC record. Many vendors will use MARC records produced by LC, OCLC, RLIN, etc. as input data, but they will not reconstruct MARC records for other systems to use. In some cases, the local system will use a totally different structure to store records. For example, the LCS system in Illinois has a much simplified record format, although its structure still maintains the framework of leader, directory, and variable fields. A specially constructed program would, again, be required to process these records.
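As an illustration of the "program for reading a MARC record" discussed above, the following minimal sketch (again an editorial addition, following the published Z39.2 structure rather than any vendor's code) walks the directory and slices out each variable field:

```python
ENTRY_LENGTH = 12  # each directory entry: tag (3) + field length (4) + start position (5)

def parse_fields(record: bytes):
    """Walk the directory of a Z39.2/ISO 2709 record and slice out each variable field."""
    base = int(record[12:17])        # leader positions 12-16: base address of data
    directory = record[24:base - 1]  # directory sits between leader and a 0x1E terminator
    for i in range(0, len(directory), ENTRY_LENGTH):
        entry = directory[i:i + ENTRY_LENGTH].decode("ascii")
        tag = entry[0:3]
        length = int(entry[3:7])
        start = int(entry[7:12])
        # each field's last byte is the 0x1E field terminator, so drop it
        data = record[base + start : base + start + length - 1]
        yield tag, data.decode("utf-8", errors="replace")
```

Constructing a record, by contrast, requires computing every length and offset in the directory before any data can be written, which is one reason the reading programs mentioned above are so much easier to develop than the writing programs.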
Issues Concerning MARC

There is no question that MARC is a highly complex record format. To develop a program for reading the MARC record would not be an easy task for a novice programmer. The easiest record for a programmer to work with and for a computer to process is one containing limited data elements which can be entered into fixed rather than variable fields. For example, a record consisting of library staff names and telephone numbers requires a very simple format. Both the name and the number can be treated as fixed-field data. On the other hand, a book record with title and author data is a difficult record format, even though it also involves only two data elements: the title and the author. A book title can vary greatly in length from very short to very long. Similarly, there can be a single author, no author, or more than one author indicated. The length of the author's name is also widely variable. Since a large portion of the MARC record consists of variable fields, it is easy to foresee the difficulties of developing processing programs. The presence of numerous subfields, especially the complex subfields in the 049 field of OCLC MARC, presents added difficulties for processing. To complicate matters even further, many of the tags and subfields can also be used repeatedly within the MARC structure. All in all, the large number of tags, indicators, and subfield codes combined with variable-length data elements makes the MARC format a highly complex record structure to deal with.

Is there a good reason for the complexity of MARC? We may reply that it is the nature of the bibliographic record that makes the machine record structure so complex, and the users' information needs that, in turn, dictate the nature of the bibliographic record. We librarians require a record format that will accommodate all needed data elements related to a bibliographic record. MARC, with its flexibility to accommodate multiple authors, multiple subjects, and all types of subject headings, is a format designed to fill this need.

It is inaccurate to maintain that in developing a machine-readable record "we put the card catalog in electronic form."10 The MARC record is not limited to the traditional access points found on a catalog card, but allows the record to be manipulated in numerous additional ways. For example, catalog records may be accessed by LC card number, ISBN, keywords, etc., none of which is accessible in the card catalog. It is true that MARC arranges the variable data elements in the approximate order in which they appear on a catalog card, e.g., the call number (050, etc.) comes before the main entry (100, etc.), and the main entry comes before the title (245), and so on. The designer of the MARC format most likely reasoned that librarians are accustomed to this order of displaying bibliographic elements. However, the input and output format of a record can be independent of the format that is stored in the computer. Name entries, whether main entries (1XX) or added entries (7XX), can be accessed together, despite their location in different fields. Similarly, series data, whether they occupy 4XX fields or 8XX fields, share identical access. The order of these tags does not prohibit the programmer from reordering them when they are processed. However, once the meanings of these tags, indicators, and subfield codes are set, they should be standardized so that all systems can easily process each other's records.

How to utilize the data elements within a MARC record is really up to the local system. As far as computers are concerned, any field can be selected to be indexed for quick retrieval. Any of the data elements can be extracted from the MARC structure for building any type of data model: network, hierarchical, or relational. The flexibility of the MARC record allows for the creation of specialized types of data files. For example, a subject authority file can be created by selecting data entered in the fields tagged 6XX.
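Continuing the illustrative sketches above, pulling the 6XX fields out of the (tag, data) pairs produced by such a reader might look as follows; the subfield handling here is deliberately simplified, and the " -- " joining convention is an assumption of mine:

```python
SUBFIELD_DELIMITER = "\x1f"

def subject_entries(fields):
    """Collect 6XX field data for a rudimentary subject authority file."""
    subjects = set()
    for tag, data in fields:
        if tag.startswith("6"):                   # 600, 650, 651, ...
            # A variable data field begins with two indicator characters,
            # followed by delimiter-prefixed subfields (e.g. \x1fa Heading).
            parts = data[2:].split(SUBFIELD_DELIMITER)
            heading = " -- ".join(p[1:].strip() for p in parts if p)
            subjects.add(heading)
    return sorted(subjects)
```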
The complaint that "the MARC record does not provide adequate subject access to the very materials it has been used to access"11 reflects a misunderstanding of MARC's potential. The MARC format provides the fields for any number of subject headings and sub-headings. It is up to others to use them for providing adequate access. The local cataloger must accept the responsibility for inputting whatever subject entries are deemed necessary for adequate access to any bibliographic record. The MARC format itself cannot be blamed for the failure to make use of its capabilities.

Conclusion

MARC records have long served as the key data source for library automation systems. These systems must not only be able to meet the bibliographic information needs of the library's users, including the librarians themselves, but they must also embody the standardization of format which is a prerequisite to data sharing and system communication. Although librarians are making increasing use of MARC for both technical and public services, their conception of its real nature remains cloudy. Because MARC is made for "machine eyes," it is not easy for those librarians who are not yet computer-literate to fully understand its nature and its potential. In the automation age MARC is essential to the library profession. If librarians persist in continuing ignorance of the MARC format, the future electronic catalog could be totally at the mercy of system designers and data processing personnel. The main principles of library service might be severely compromised in favor of convenience of data processing. As G. Patton suggests, the best results will come from the active participation of system experts who understand library functions and librarians who have a fundamental understanding of computer systems.12

The key to the excellence of future library operations is library automation. Many library systems that are being developed today will have a long-term effect on library services. One of the key elements in system development is the MARC record. To be able to take an active role in the decision-making related to library automation systems, one must have a good understanding of the nature and possibilities of the MARC format. Librarians as a profession must accept the challenge of understanding MARC and how it relates to the future of library service.

Notes

1. Robert A. Walton & F. R. Bridge, "Automated System Marketplace 1987: Maturity and Competition," Library Journal 113:6 (April 1, 1988).
2. Walt Crawford, MARC for Library Use: Understanding the USMARC Formats (White Plains, N.Y.: Knowledge Industry Publications, 1984), p. 7.
3. Ibid., p. 1.
4. Michele I. Dalehite, "MARC Format on Tape: A Tutorial," in From Tape to Product, ed. Barry B. Baker (Ann Arbor, Mich.: Pierian Press, 1985), p. 25.
5. Crawford, p. 52.
6. Dorothy S. McPherson, "MARC Compatibility: A TESLA Survey of Vendors," Information Technology and Libraries 4 (September 1985), p. 241.
7. Ibid.
8. For a discussion of the "American National Standard for Bibliographic Information Interchange on Magnetic Tape," see Crawford, p. 29.
9. Robert Renaud, "Resolving Conflict in MARC Exchange," Information Technology and Libraries 3 (September 1984).
10. Kevin Hegarty, "Myths of Library Automation," Library Journal 110 (October 1, 1985), p. 471.
11. Ibid., p. 48.
12. Glenn Patton, "Interaction: Letters to the Editor," Library Resources & Technical Services 32:1 (January 1988), p. 9.
work_3pug2ytmnfafto2ee7vszr3xou ---- The Code4Lib Journal, Issue 40, 2018-05-04

Editorial: Beyond Posters: On Hospitality in Libtech

I'd like to thank both Becky Yoose and Kim Phipps (Messiah College President) for fixing the term hospitality in my mind, Chris Bourg for the challenge she extended and the phrase "for the love of baby unicorns," Linda Ballinger for asking questions, Julie Hardesty for a year of collaboration toward a useful metadata documentation plan, the Samvera Documentation Working Group for their work and for helping the metadataists formulate a plan, and the C4LJ Editorial Committee and Becky for providing feedback on this draft.

I once owned a Star Trek uniform. It was Deep Space Nine[1] style, blue for science and for Jadzia Dax, and an awkward, terrible fit on me. The second website I ever built was a DS9 Geocities fan site. I was the kind of teenage girl who spent hours downloading minuscule 30-second trailers of episodes on StarTrek.com, just to get her fix, and I don't regret that… for all that I still can be a walking Memory Alpha[2]. What I'm writing about today is hospitality in libtech, and my starting point is Chris Bourg's 2018 Code4Lib Keynote (keynote video). I understand, appreciate, and support the message of the study Chris cited which mentioned Star Trek posters. I'm writing on the side of Trekkies for inclusion.
I shouldn't assume, for example, you know that Memory Alpha is the Trek wiki, named for the United Federation of Planets' library/data archive.[3] This is the last Trek reference, I promise.

I'm going to walk through some of my own reflections on hospitality in library technology, building from Chris's talk to encounters with the idea I've had over the last year in particular. My undergraduate institution pushed "hospitality" hard (I was given a foot-washing towel at graduation) and it's taken a long time for me to move beyond feelings about its pitfalls and a buzzword feel to appreciate its real worth as a concept. Listening to Becky Yoose describe documentation as hospitality in an interview on the ( Open paren podcast began the word's rehabilitation for me. Chris challenged me again to think intentionally about spaces I occupy and not just how I speak to but about (vouching for) others.

In this editorial, I will be using the word hospitality to mean the intentional welcome of others into a space which one currently occupies, possibly as a member of a dominant group. I do not wish to encourage the idea that one should cultivate or maintain a role of benevolent host in a way that forces others to remain forever guest or outsider, although there will always be newcomers. Hospitality may be a first step to ceding one's position as host in a space. It may be expanding that space to become a place with many potential hosts, each respected for their varied contributions and skillsets. It may also be supporting those in a different space or a different role, such as those who use the technologies we build and support (both colleagues and patrons), and respecting them in that space.

Intentionality as Hospitality

When it comes to behavior in the physical and virtual spaces we create for our community, a thoughtful and enforceable Code of Conduct is only the bare minimum for hospitality. Actually following up on and enforcing such codes takes time and labor, but that work, followed by reports back to the whole community, is the only way to build a deserved trust. Beyond the language, behavior, and imagery covered by codes of conduct, however, we should intentionally consider the less overt ways in which we may exclude. Do we consider whether everyone in a group knows what we mean by ILS, git branching, 245$a, or other terms which arise in our daily work? At the 2017 Samvera Connect, my colleague Linda Ballinger encouraged her fellow metadataists on proactive responses to feeling overwhelmed by a barrage of new-to-you terms. Such terms are much like Trek posters, inoffensive in themselves, but exclusionary in certain contexts. The onus is on all of us to create environments where it's ok not to know a thing and safe to ask. The Recurse Center social rules, which are becoming more common at libtech events, provide guidance for creating such an environment. They read as follows:

- No feigning surprise
- No well-actually's
- No back-seat driving
- No subtle -isms

These are an excellent starting point. If we're trying to extend boundaries of the spaces we occupy, I would challenge us all to try the following as well:

- Recognize that you can learn from those just starting out.
- Embrace familiar questions as an opportunity to reflect on why they recur.
- Consider your usage of acronyms and jargon.
- Value both technical (code, QA, testing, UX, and accessibility) and interpersonal contributions to projects.
Regarding acronyms and jargon, this challenge is not to stop using them entirely but to proactively ensure everyone involved has sufficient context to understand what they mean.

Documentation as Hospitality

Beyond physical spaces and gatherings for sharing and learning, we need to be intentional about the kinds of materials we build for those learning on the job through self-teaching. I would venture that anyone, no matter their area of expertise, has encountered some kind of documentation which fell two or three steps short of where they needed it to be. Perhaps it was a primer with assumptions. Perhaps they could find beginner and advanced materials but nothing to guide folks who've got the basics down but want to move to an intermediate level. And, of course, sometimes there's no documentation at all.

As mentioned in the introduction, Becky Yoose spoke on the ( Open paren podcast[4] of the concept of Documentation as Hospitality. She then expands on her definition of hospitality as:

"making sure that folks can enter in a particular community with the lowest bar possibly that I can do personally or I can build into the community systematically (or persuade others or bribe others) to make a more inclusive community" (emphasis added)

This one-on-one work is a critical aspect of building community, but relying solely on it will leave out those uncertain who or where to ask questions and burn out those offering such assistance. As communities, we would do well to determine systematic approaches to easing community entry, both through concerted efforts at building regular, inclusive learning spaces and through our documentation.[5]

A project which I've been watching and admiring is the Samvera Documentation Working Group's A Guide for the Perplexed Samvera Developer. While still under development, it is an excellent example of documentation which tries to avoid presumption. Early on in the "New? Start here" section, the page "How to Ask for Help" includes directions for joining the Samvera Slack. Most real-time conversation happens on the Slack, particularly on its dev channel. The page then points readers to Slack's own documentation on getting started as a Slack user. Will many readers already know what Slack is and how to use it? Sure. Will all? Nope, and the Documentation Working Group made the welcoming choice.

Meanwhile, while Terry Reese wrote his previous editorial on learning to be a selfish librarian, he puts a great deal of work into supporting MarcEdit. While many may be familiar with his tutorial videos, I only recently encountered the in-progress Learning MarcEdit, which includes full chapters to spell out things like what every choice in preferences means, with context. As with Samvera, this documentation does not require one to start out at some intermediate stage while still describing how to do complex tasks.

From my own experience, I'm aware how difficult it is to block out time to write up documentation, even if you consider it a critical component of a project. Besides time, however, I see a couple other impediments which documenters face. One is skill: knowing a lot about a thing and knowing how to write do not necessarily mean you'll be good at writing its documentation. The other is that one may feel uncertain or even presumptuous in attempting documentation alone. Does one know everything which should be documented? Can one assess where others will need to start?
This is one case where what I'm grateful for overlaps with the Write the Docs community, which provides principles and hosts events where one may practice them. Working on the practice in a group, as above, also means that one person need not feel the full weight of creating welcoming documentation. One has partners, gets feedback, and benefits from multiple perspectives and backgrounds. The Samvera Metadata Interest Group has had false starts in figuring out how such a group should work, although we are now trying to align our work more with the developer-focused Documentation WG, an effort which I hope will be productive.[6]

Design as Hospitality

As we work to create more welcoming environments and the means for everyone in them to participate in the work we're doing, we may also turn our attention to what we build (or implement) together. A fundamental level of how we design hospitality in our systems is accessibility. Rather than give a meager overview in this short space, I'll refer readers to Ng and Schofield's article in issue 37 of this journal. Beyond their accessibility and usability, however, there are other aspects of our systems which I would like to frame through the lens of hospitality. Specifically, I want to address system design, record-keeping, and hospitality.

In my January 2017 editorial, I dedicated a section to "The Records We Keep." Here I am concerned with a false hospitality which alleges we can become more hospitable through aggressive data collection. Jones and Salo have written an overview of learning analytics systems in academia and how these intersect with libraries and library ethics. Besides the continual pressure libraries feel to justify themselves through learning outcome assessment, these programs are pitched as helping students do better in college. If such programs actually improve student success (according to some definitions of success), should we change our principles around privacy? Beyond the challenges to this assertion outlined by Jones & Salo, I think we must also ask ourselves what kind of a world it is we want to be creating for our students. Is it one in which every institution surveils them and reports back to others, ostensibly with their own good in mind? Is it one in which yet another dataset about them could be given or sold to researchers, demanded by governments, or compromised by hackers? Hardly a hospitable choice.

Nor is the problem limited to academic libraries. Public libraries already collect and share certain portions of patron data with a variety of content and service providers, including Amazon. Through its new product, "Wise," OCLC now proposes to partner with libraries to collect all that data in one place, managed in the cloud by OCLC. Some of the marketing around this product struck me as connected to the editorial I was already drafting. In particular, their blog post includes the following testimonial, that Wise "removes subjectivity—instead of an interpretation of customer wants or what customers say they want, the system uses data to determine need." I read this statement (from a customer, but promoted by OCLC) as a fundamentally false, hollow hospitality. Algorithms developed by human beings will reflect their assumptions and biases and impart their own oppressions.
Since the product is only just being marketed, I contacted OCLC with some of my questions and concerns.[7] The responses I received from their new Data Protection Officer were primarily marketing language, uncertainty about most technical specifics (it appears they still need to learn what's in the Wise system, which they purchased from its developer, and build from there), and assurances that they take privacy seriously without any firm commitments to minimum standards. I believe the following excerpt may be an accurate assessment of what Wise intends to accomplish: "…it takes the very best of disparate systems and combines them into a unified, powerful solution; the collection and use of data, however, is not materially different from today's practices."[8] I was reminded of Facebook's recent statement to Reuters, that "data collection is fundamental to how the internet works." Data collection certainly is the way things are right now. But rather than designing systems to collect, aggregate, and analyze that data more smoothly, perhaps we should assess where we are and reconsider. Humans will remain subjective, frustrating, and beautiful, no matter what kind of technology we throw at them. I hope we'll remain aware enough about that to meet them as fellow humans whose inquiry, escapism, and entertainment we facilitate and whose privacy and safety in doing so we respect.

The Labor of Hospitality

If any of the above were easy, we wouldn't have to talk about it. Intentionality about our space and words takes work. For most of us, it's a practice that requires mindful attention. I certainly have a way to go when it comes to jargon. Writing the documentation may take as much time as creating the code did, with far less administrative support or reward in our careers. And trying to understand or negotiate data in external systems or advocate for better choices in what we're making may require a good deal of time and emotional labor. Pushing back may even be dangerous to one's employment. I'm aware of and grateful for the time another colleague recently put into phone calls and emails with a vendor after an internal group expressed concerns regarding privacy. Neither getting the answers nor documenting them in a way that the whole group could understand was easy.

If Code4Lib is good for one thing, though, it's community. As we continue to recognize where we fall short, personally and collectively, in welcoming others, let us support each other in lowering barriers to entry and in building systems which extend that same hospitality and respect to the communities we serve.

End Notes

[1] Deep Space Nine was the third live-action Star Trek series, running in the mid-to-late 1990s. Over its 7 years, two styles of Starfleet uniform were used. Mine was, regrettably, the former.

[2] Memory Alpha is a wiki-based reference project for the Star Trek universe.

[3] I'm now overcome by the desire to write a comparison of the Star Trek United Federation of Planets' library/archive setup and that of the Galactic Empire as shown in Star Wars: Rogue One.

[4] ( Open Paren. [ for the writer's peace of mind…let's close these ) ) ) ]

[5] Speaking of systematic approaches, tools which serve as programmatic intermediaries, such as MarcEdit, the C# MARC Editor, and OpenRefine, and the accompanying documentation and tutorials, are invaluable in lowering bars to entry and expanding communities.
Their support of multiple skill levels also provides an opportunity for users to gradually build advanced skills (and confidence in their tooling expertise) which they may bring to other contexts.

[6] In my own experience of trying to lower barriers for entry, I've also realized some of the pitfalls when documentation may become the primary way one learns about a thing. In creating the EADiva project, my goal was simple: rewrite the EAD tag library in a way my classmates, who weren't familiar with the language of Document Type Declarations, could use, and interlink it for usability. However, simply learning what elements exist in EAD and how to encode data in them is not a good way to learn the practice of archival description. Beyond linking out to the HTML form of Describing Archives: A Content Standard and pointing out on the about page that it is primarily a technical reference, I sometimes struggle with how I can shape the site to make it clear that the site really is about the technical aspects of encoded archival description and not about its practices.

[7] In the interest of disclosure on my end, I asked the following questions:

- Is the entire Wise program opt-in for the patrons themselves? (e.g. if my public library adopts it, will it opt me in to some aspects and allow me to opt into others, or does it give me a full opt-in option on my account page or elsewhere? Does the library decide this?)
- Can a patron continue to use a public library which has adopted it without their data being gathered in some way by Wise? (This question includes shadow profiles, similar to those Facebook created for people without accounts.)
- Is the data stored entirely within an ILS system or centrally on OCLC's servers/in the cloud? Answer: cloud, like OCLC's ILSes.
- Will the data from multiple libraries be aggregated for overall analysis by OCLC? While OCLC has made clear it won't sell patron data, will it be sharing that data with researchers or other library partners? Which other OCLC products might be informed by that data? Answer: Reiteration that they won't sell patron data.
- If patrons opt in, are they told explicitly how their data is being aggregated and used? (I'd appreciate a copy of whatever disclosure is given, but also how it's being conveyed in a non wall-of-text format, since it's a known issue that end-users don't often read such agreements in detail.)
- If data is stored centrally, would this be the largest dataset of library patrons' borrowing and taste information ever compiled? What plans does OCLC have to keep an unprecedented and tempting dataset safe from hackers?
- Does OCLC have a plan for what will happen if Wise data (centralized or in one library) gets compromised? This question includes everything from possible punitive fines to the damage to libraries' reputations as a whole.

[8] Ranjan, Brandy. (2018). Re: Questions re: OCLC Wise (context for Code4Lib Journal Editorial). [email].

work_3z6xb6cxfbb43gqthyoaczylpi ---- University Library Exterior View, 1995. Copyright 2002-2004. The Trustees of Indiana University.
IUPUI Image Collection: A Usability Survey

Elsa F. Kramer

The author, an editor of consumer horticulture books and scholastic materials, received her M.L.S. degree in December 2005 from the Indiana University School of Library and Information Science in Indianapolis.

Research paper

Purpose: To measure functionality, content, and awareness of an online digital image collection by observing participants in a controlled search for a specific image and evaluating their responses to questions, with the objective of improving the site.

Methodology: Participants were recruited from among faculty, staff, and students based on interests stated in online profiles or indicated by type of professional work or academic major. Participants were timed while searching for a target photo and interviewed afterward about their search experience and their use of digital images.

Findings: Most potential users are not aware of the IUPUI Image Collection and have some difficulty locating it, but when introduced to it find the site attractive and navigable and its content interesting and useful for research, instruction, publication, or class work.

Research limitations: There were only 5,100 images uploaded to the collection at the time of the study. The participant group was limited to 70.

Practical implications: User data and suggestions help the Archives staff work with CONTENTdm™ to make improvements to the site's function and metadata; choose photos to scan for the collection; and develop site-marketing plans.

Originality/value: This study provides quantitative measurements of user habits and concerns. Observation, sometimes rejected as an obtrusive methodology, is shown here as an important tool to evaluate human-computer interaction for improved understanding of user perspectives on navigability and functionality.

Keywords: Archival materials -- Digitization; Human-computer interaction; IUPUI (Campus); Web sites -- Evaluation

Introduction

Digital image collections accessible via the Internet are still a relatively new library resource, most created no more than 15 years ago and many in just the last few years. Books and journal articles proliferate on technical aspects of the digitization process and image quality assessment. Until recently, however, comparatively little research had been undertaken with respect to the access needs of targeted users and the factors that influenced their use of digital image delivery systems. But as consumer expectation of Web-accessible images has grown exponentially along with technological advances in their delivery, libraries have hurried to take advantage of a new way to increase use of their photographic holdings while simultaneously identifying and preserving them, and perhaps also protecting them from repeated hands-on use. A decade ago librarians might have asked if they should digitize photo collections and why the task might be important. Today librarians are trying to decide which images to digitize first and where to get the money to fund the projects. Just a few years ago photographs might have been scanned only as low-resolution JPEG files in acknowledgement of long download times for patrons using 56K dialup modems. Now, high-resolution TIFF files are increasingly desired and available as more users have high-speed online service and demand top-quality images. The role of catalogers in collection access used to be hidden from public view.
Now, in an era when most computer users understand the importance of Google keyword searches, the primary issue facing managers of cutting-edge digital image collections is how to provide bibliographic access of equally high quality that is also portable enough to accommodate the inevitable changes in technology and likely need for future collaborative interoperability.

The IUPUI Image Collection provides online access to campus photos of historical interest. Indianapolis Public Library Hospital Service, ca. 1922. Copyright 2002-2004. The Trustees of Indiana University.

The IUPUI Image Collection

Indiana University-Purdue University Indianapolis (IUPUI) was created in 1969 as a partnership between Indiana and Purdue universities. IUPUI is a campus of Indiana University that offers degrees in more than 180 programs from both IU and Purdue. The IUPUI University Library's Ruth Lilly Special Collections and Archives include the Manuscript Collections, University Archives, and Rare Books. The University Archives preserves the official records of IUPUI and its various predecessor institutions, including materials related to the history and unique structure of the institution. The Archives includes approximately 250,000 photographic images.

Many photos in the IUPUI Image Collection document development of the campus. Two-Story Outhouse on Campus Grounds, 1941. Photo by Howard W. Fieber. Copyright 2002-2004. The Trustees of Indiana University.

In an effort to increase use of photographs in the Archives, improve preservation of the materials by reducing the amount of manual contact with them, and reduce the number of staff-assisted photo searches, Archivist Brenda L. Burk created the IUPUI Image Collection, an online resource launched in October 2002 (http://indiamond6.ulib.iupui.edu/IUPUIphotos/). The author observed and timed 70 test subjects while they searched for a target photo, and interviewed them afterward about their search experience and their use of digital images. Based on results from a pilot test of the survey conducted in July 2004, questions were posed that were designed to elicit information needed to improve the site.

Baseline Data

At the time of the survey there were approximately 5,100 images uploaded to the IUPUI Image Collection. All photos (people, buildings, and events) pertain in some way to the IUPUI campus or the development of the urban area it occupies in downtown Indianapolis. According to statistics kept by the Archives, faculty and staff are the most frequent users of IUPUI's archived images. Since the launch of the online resource, requests for printed copies of photographs have declined from a high of 232 in 1998-99 to a low of 6 in 2003-04. Requests for printed images are leveling off as users learn to retrieve or request digital images through the Web site. (Source: University Photograph Collection Reprint Statistics, Ruth Lilly Special Collections & Archives, IUPUI University Library, 1997-2004. Brenda L. Burk, archivist.)

WebTrends data are available only as an aggregate for the entire period of site operation and not by month. Because the images reside within CONTENTdm™ proprietary software, it is not possible to link directly to specific IUPUI Image Collection photos from a search engine such as Google. Users must first locate the IUPUI Image Collection site in order to perform a specific photo search. Also because of the software, it is not possible to display a feedback button or other survey mechanism when users complete their searches.
These limitations prompted the Archives' request for a usability study with locally collected statistics.

Protocol

The author conducted interviews with selected users and potential users about their online experience with the IUPUI Image Collection after observing them while they conducted a controlled search. Participants were recruited from among IUPUI faculty, staff, and students based on interests stated in their online profiles, type of professional work, or academic major. The author gathered data in October and November 2004 to evaluate the functionality of, content satisfaction with, and awareness of the online resource through observation interviews of current, past, and potential users from within the IUPUI community. The author provided each participant with written instructions to complete an online search for a designated photo, recorded their search path, and noted the time required to reach the target. After the participants completed the search, simple questions were asked about their experience to obtain descriptive answers and constructive criticism. The author provided the search scenario to each participant on a printed card, stopped participants who had not reached the target photo within 4 minutes, and helped them find the IUPUI Image Collection home page, http://indiamond6.ulib.iupui.edu/IUPUIphotos/, to try the search again. The scenario was:

••• Please search the IUPUI Image Collection to find a 1938 photo of Franklin D. Roosevelt talking with hydrotherapy patients at the Union Building. I will take notes while you search but cannot answer questions until you have finished. •••

The scenario was constructed so that the most important keywords, "IUPUI Image Collection," were provided first, followed by terms to narrow the search, including two intentionally incorrect terms, "patients" and "Union Building," to see what participants who tried to narrow their searches too early would do when confronted with no results. Several photographs in the collection met the search criteria; any of them was considered a target.

A specific photo search must start from this page.

Accepted results for the targeted search for a photo of Franklin D. Roosevelt.

The author took step-by-step notes regarding where participants began the search, what search terms they used, what path they took to the site, and how long it took them to find the target photograph. Search engines used were also recorded, if any, and which search option on the site itself was chosen.

Demographic data and interview questions

Participants were asked their age (noted in spans that correspond with U.S. Census age groups) and university affiliation (undergraduate or graduate student and school; faculty and school; staff and department). After they completed the search, they were also asked if they had ever used or seen the IUPUI Image Collection previously; if so, how they found out about the site and if they had found what they were looking for; and whether they would use this resource in the future. They were then asked to describe changes that could be made to the site that would encourage them to use it again. While discussing possible changes, participants were asked if they found the site visually appealing and easy to navigate, using follow-up queries to elicit details. Their personal and professional uses of digital images were also queried, and the types of images they would like to see included on the site.

With the goal of understanding how the site might best be marketed to potential users, participants were asked about their reading habits, with a focus on publications of local interest.

RESULTS

Total participants: 70

Age Group | # of participants | % of total
18-24 | 13 | 18.6
25-34 | 14 | 20.0
35-44 | 12 | 17.1
45-54 | 21 | 30.0
55-64 | 10 | 14.3

Gender | # of participants | % of total
Females | 32 | 45.7
Males | 38 | 54.3

Position | # of participants | % of total
Faculty | 20 | 28.6
Staff | 16 | 22.9
Graduate students | 17 | 24.3
Undergraduate students | 17 | 24.3

Departmental affiliation | # of participants
Alumni Office | 3
Anthropology | 4
Business | 1
Use of digital images # of participants (multiple responses) Personal or professional work 21 Publications 19 Instruction (faculty) 18 Web site 18 Class projects (students) 17 Research 11 Don’t use 11 Quantity and type of downloads # of participants Multiple, high-res 27 Multiple, low-res 5 One at a time, high-res 8 One at a time, low-res 7 Doesn’t download images 23 Publications read (in order of frequency mentioned) The local daily newspaper (especially online) The student newspaper The campus e-newsletter for faculty and staff Indiana Historical Society publications 11 The local alternative weekly newspaper IU and Purdue alumni magazines Departmental or specialty Listservs and newsletters Office of Professional Development e-newsletter Indiana State Library newsletter Indiana Historic Landmarks newsletter Remember seeing a photo from the site used somewhere After the survey, 48 participants did not recall seeing a photo from the site in print or electronic use, 1 might have, and 21 were able to name a specific place they had seen an image from the collection published. Approach to the search # of participants From the IUPUI homepage 44 From the University Library homepage 12 From Google 12 From IUPUI’s Philanthropy Library site 1 From a bookmark for the collection 1 Elapsed time for the target search # of participants Less than 1 minute 27 1 minute 1 1.5 minutes 6 2 minutes 7 2.5 minutes 7 3 minutes 3 3.5 minutes 3 4 minutes 2 5 minutes 8 Of the 19 participants who said they had or maybe had used the site previously, 13 (68.4%) reached the target in 1 minute or less, and all (100%) reached it 3 minutes or less. Of the 51 participants who had not previously used the site, 14 (27.5%) reached it in 1 minute or less, and 32 (62.8%) reached it in 3 minutes or less. Of all 70 participants, 51 (73%) reached the target photo in 3 minutes or less. The two most direct paths to the site are: From Google > search “IUPUI Image Collection” > click on first link From any IUPUI home page > search “IUPUI Image Collection” (or “image collection”) > click on first link. Of the 70 participants, 6 took the most direct path from Google and 16 took the most direct path from IUPUI. They were able to reach the target photograph in less than 1 minute. 12 Others also started at these points but followed paths through additional links (sometimes many) before reaching the target photo. From outside university home pages, a Google search provides the most direct path to the IUPUI Image Collection. The URL is unusual compared to most other IUPUI domain syntax. 13 The shortest path from the University Library’s home page to the IUPUI Image Collection is through the link to the Ruth Lilly Special Collections and Archives near the bottom right. Instead, most participants linked from here to the online catalog. 14 Only a few participants found this prominent link from the Ruth Lilly Special Collections and Archives home page. 15 Many survey participants looked in the Library’s resources listings for the word “image” and did not scroll down to “IUPUI,” where the link is listed. Conclusions Analysis of the responses suggests that most potential users are not aware of the IUPUI Image Collection but when introduced to it find the site attractive and navigable and its content interesting and useful. They appreciate the prominent inclusion of information about image copyrights and how to obtain permissions. Problems identified are more related to site design than content. 
Users are grateful for the site and enthusiastic about its potential. Use of the site will continue to increase as more photos and features are added and the URL is publicized to potential users. Users will find the site more quickly and navigate it more easily as enhancements are made to the CONTENTdm software. Recommendations When updates to the CONTENTdm software make it possible, online user surveys will be an important additional tool for evaluation of the IUPUI Image Collection. A feedback mechanism given to visitors as they exit the site can take them to such a survey and provide perspectives from a wide variety of users. 16 Including digital image Web site designers in a larger study would invite valuable feedback on functionality and content. Subjects with expertise in search syntax could provide suggestions for improvements in metadata. A more intuitive and easy-to-remember URL consistent in syntax with other IUPUI URLs might help to increase repeat use of the site. Adding the simple search link (or a “search again” link) at the top of each results page will increase navigability. Adding the word “or” between the simple search and browse search boxes will eliminate some confusion. Ease of searching will increase the number of return visitors to the site. While the image collection is logically catalogued using controlled language (Library of Congress subject headings so that records can be exported to OCLC and IUCAT), searches might be made simpler—especially for those with poor computer literacy skills—by extending the natural language search capability used in the descriptive metadata to the subject headings. (The National Library of Medicine’s medical subject headings have been added for some images in order to facilitate searches.) MeSH headings have been added to some photos to facilitate keyword searches. Anatomy Class with Cadaver, n.d. Copyright 2002-2004. The Trustees of Indiana University. Routine publicity of the site and its updates through intra-university and print and electronic communications will increase faculty awareness. Faculty should be encouraged to introduce students to the site at the beginning of each semester and to include mention of it in every class syllabus, if possible. Alumni and other specialized university audience publications also can be targeted for promotion of the site. The student newspaper could collaborate on projects that promote use of the site and student journalists can use historical images from the collection to illustrate feature articles. 17 Local publications that focus on or sometimes feature the history of downtown Indianapolis also should be targeted for promotion of the site and its updates. As more users become aware of the site, an automated mechanism for downloading and purchasing high- resolution photos could further reduce the need for Archives staff assistance. The author believes that the rapid development of digital libraries has dramatically outpaced the average user’s navigation and information literacy skills. Although the site search for this survey was not designed to test individual online research skills, most participants who became lost in the effort appeared to have poor understanding of the basics of search syntax. That lack of understanding, moreover, was not statistically more significant in any one group over another. 
The ongoing collaboration between academic librarians and others in the university community to promote information literacy and teach mastery of online research skills will undoubtedly go as far as any software update or design change in increasing use of and user satisfaction with digital resources. Recommended Reading The following articles can be helpful in constructing a framework for measuring user evaluations of a digital image collection; identifying delivery, content, quality, and support variables that affect user satisfaction and frequency of use; and setting benchmarks for measuring the success of changes after they are implemented. Rieger and Gay’s 1999 article in RLG DigiNews is especially helpful in understanding how much more meaningful evaluation of an electronic resource can be when investigators use data collection and analysis tools that take into account human-computer interaction. Instead of assessing a technological resource in isolation, which often leads to emphasis on simple measurements alone, a framework for evaluation based on a “social construction of technology” model allows for a richer assessment that combines statistics with expert evaluation of resource design and investigators’ observation of human use of the electronic resource. Aladwani, Adel M., and Prashant C. Palvia. 2002. “Developing and Validating an Instrument for Measuring User-Perceived Web Quality.” Information & Management 39, no. 6, 467–476. Baca, Murtha, ed. 1998, 2000. Introduction to Metadata: Pathways to Digital Information. Los Angeles: Getty Research Institute. Battleson, Brenda, Austin Booth, and Jane Weintrop. 2001. “Usability Testing of an Academic Library Web Site: A Case Study.” The Journal of Academic Librarianship 27, no. 3, 188–198. Besser, Howard. 1999. “Digital Image Distribution: A Study of Costs and Uses.” D-Lib Magazine 5, no. 10. http://www.dlib.org/dlib/october99/10besser.html. Bishop, Ann Peterson. 2002. “Logins and Bailouts: Measuring Access, Use, and Success in Digital Libraries.” The Journal of Electronic Publishing 4, no. 2. http://www.press.umich.edu/jep/04-02/bishop.html. 18 Bishop, Ann Peterson, and Bertram (Chip) Bruce. 2002. “Digital Library Evaluation as Participative Inquiry.” Graduate School of Library and Information Science. University of Illinois. http://www.isrl.uiuc.edu/~chip/pubs/02delos.pdf California Digital Library Evaluation Activity Reports, University of California, 2003. http://www.cdlib.org/inside/assess/evaluation_activities.html Dickstein, Ruth, and Vicki Mills. 2000. “Usability Testing at the University of Arizona Library: How to Let the Users in on the Design.” Information Technology and Libraries 19, no. 3, 144–151. Dillon, Andrew, and Min Song. 1997. “An Empirical Comparison of the Usability for Novice and Expert Searchers of a Textual and Graphic Interface to an Art Resource Database.” Journal of Digital Information 1, no. 1. http://jodi.ecs.soton.ac.uk/Articles/v01/i01/Dillon Ester, Michael. Digital Image Collections: Issues and Practices. Washington, D.C.: Commission on Preservation and Access, December 1996. Farley, Laine. “Digital Images Come of Age.” Syllabus Magazine, May 2004. http://www.syllabus.com/article.asp?id=9363. Garrison, William A. 2001. “Retrieval Issues for the Colorado Digitization Project’s Heritage Database.” D-Lib Magazine7, no. 10. http://www.dlib.org/dlib/october01/garrison/10garrison.html. Gaynor, Edward. “Cataloging Digital Images: Issues.” Proceedings of the Seminar on Cataloging Digital Documents. 
Washington, D.C.: Library of Congress, October 13, 1994. http://www.loc.gov/catdir/semdigdocs/gaynor.html. Greenstein, Daniel. 2000. “Digital Libraries and Their Challenges.” Library Trends 49, no. 2, 290–303. Hill, Linda L., et al. 1997. “User Evaluation: Summary of the Methodologies and Results for the Alexandria Digital Library, University of California at Santa Barbara.” http://www.asis.org/annual-97/alexia.htm. Jeng, Judy H. 2004. “What Is Usability in the Context of Digital Libraries and How Can It Be Measured?” Information Technology and Libraries 24, no. 2 (June 2005), 3-12. Jones, Michael L. W. et al. 1999. “Project Soup: Comparing Evaluations of Digital Collection Efforts.” D-Lib Magazine 5, no. 11. http://www.dlib.org/dlib/november99/11jones.html. Kenney, Anne R., and Oya Y. Rieger. 2000. Moving Theory Into Practice: Digital Imaging for Libraries and Archives. Mountain View, CA: Research Libraries Group. 19 ———. 2000. “Preserving Digital Assets: Cornell’s Digital Image Collection Project.” First Monday 5, no. 6. http://www.firstmonday.org/issues/issue5_6/kenney/index.html. Kilker, Julian, and Geri Gay. 1998. “The Social Construction of a Digital Library: A Case Study Examining Implications for Evaluation.” Information Technology and Libraries 17, no. 2, 60-69. Long, Holley. 2002. “An Assessment of the Current State of Digital Library Evaluation.” Master’s Paper. University of North Carolina at Chapel Hill. “Subject Access to Images.” MARBI Discussion Paper no. 2005-DP01. Washington, D.C.: Library of Congress, December 10, 2004. http://www.loc.gov/marc/marbi/2005/2005-dp01.html. MIT Libraries Web Site Usability Test, March 1999. http://macfadden.mit.edu:9500/webgroup/usability/results/process.html. Moyo, Lesley M. 2002. “Collections on the Web: Some Access and Navigation Issues.” Library Collections, Acquisitions, & Technical Services 26, 47-59. Ostrow, Stephen. Digitizing Historical Pictorial Collections for the Internet. Washington, D.C.: Commission on Preservation and Access, February 1998. http://www.clir.org/pubs/reports/ostrow/pub71.html. Payette, Sandy D., and Ola Y. Rieger. 1997. “The User’s Perspective.” D-LibMagazine 2, no. 2. http://www.dlib.org/dlib/april97/cornell/04payette.html. Rieger, Robert, and Geri Gay. 1999. “Tools and Techniques in Evaluating Digital Imaging Projects.” RLG DigiNews 3, no. 3. http://www.rlg.org/preserv/diginews/diginews3-3.html. Saracevic, Tefko, and Marija Dalbello. 2001. “Digital Library Evaluation: Toward an Evolution of Concepts.” Library Trends 49, no. 2, 350–369. Stokes, John R. 1999. “Imaging Pictorial Collections at the Library of Congress.” RLG DigiNews 3, no. 2. http://www.rlg.org/legacy/preserv/diginews/diginews3-2.html#feature. Talbot, Dawn, Gerald R. Lowell, and Kerry Martin. 1998. “From the User’s Perspective: The UCSD Libraries User Survey Project.” Journal of Academic Librarianship 24, no. 5, 357–364. Visual Image User Study, Pennsylvania State University, December 2003. http://www.libraries.psu.edu/vius/reports.html. 20 Visual Resources Association. February 2005. Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images. Washington, D.C.: Digital Library Federation. http://www.vraweb.org/CCOweb/index.html. Weibel, Stuart and Eric Miller. 1997. “Image Description on the Internet: A Summary of the CNI/OCLC Image Metadata Workshop.” D-Lib Magazine (January 1997). http://www.dlib.org/dlib/january97/oclc/01weibel.html. 
work_3z7bi74tqbdcdou7hf2cipnhka ---- Next generation systems workshop
Liber, Helsinki, 2016

 ABES co-leads a national study for the implementation of a mutualized IT system for libraries
 A tender based on a public order grouping:
– 46 institutions (universities, research organizations)
– 60% of French higher education libraries
– Competitive dialogue
– Multi-awarded
– Sustained competition between suppliers for nearly 20 months

A win-win dialogue
 Sustained competition between suppliers for nearly 20 months
 Very precise specifications:
– 250 features
– 70 practical cases
– 30 experts
 The task force:
– ABES: 1 project director, 2 project leaders (1 librarian, 1 IT)
– 9 project leaders, 1 for each pilot-site library
 Nearly 145 people involved in a highly structured organization, 30 experts regularly consulted
 1,250 person-days on our side, and probably 250 person-days for each provider

Multi-awarded tender process
 The goal at the French level: do not depend on just one supplier, which would create a monopoly.
 Minimum 3 potential providers:
– chosen by a group or a single library through subsequent awards that constitute 'waves'
 Opening of the bids: end of June (today!)
 The tender commission will select only some of the 5 candidates
 Only the approved systems can take part in a subsequent tender

Next steps
 Summer: analysis
 September 2016: commission decision
 Autumn 2016: first subsequent tender
 January 2017: first wave implementation
 Autumn 2017: system running

Economic issues
 The weight of the French institutions acting together (60%) helps shake up the pricing system towards greater transparency and lower prices
 A study commissioned by ADBU (the library directors' association) to identify the total cost of ownership concluded that:
– the adoption of a cloud library management system translates into gains only if the institution reorganizes its workflows
 Why not launch the reorganization without a new system?
– If your system is no longer state of the art you have to change it; but if not?

University landscape context
 Universities are merging to constitute big structures at city or regional level
 Amalgamation is an opportunity to choose a common system

Rethink the complexity
 The SGBm project enabled us both
– to rethink the complexity of the relationships between local, regional and national levels, and
– to reinvent close cooperation between these levels.
 It forces the suppliers to structure their offer better in three levels (an illustrative sketch follows the 'Monopoly risk' slide below):
– international level: global data and services for all libraries sharing the system,
– syndicated level: data and services specific to a group of libraries,
– local level: own data and services for a particular library

An evolution of services offered by ABES
 According to the project phases:
– During the competitive dialogue:
• ownership assistance
• assisting the experimental sites with project management
– Other pooling axes will emerge:
• IT co-development;
• training and media with sharing platforms (presentations, scripts, ...);
• coordination of data collection needs;
• production, enrichment and correction of data;
• strategic organization of local initiatives ...

Monopoly risk: mapping actors internationally
 Remain masters of our cataloging standards
 Move towards a shared authority repository that transcends boundaries, including our Francophonie (IdRef, Rameau, the future ABES/BNF national authority file)
 The issue of the architecture of such systems becomes crucial:
– How to work with, and lean on, multiple authority files (physical persons, legal persons, subjects, works, etc.)?
– And how to have, as much as possible, formats allowing multilingual search?
Shared open source solution?
 Recent acquisitions of major solutions are not reassuring regarding sustainable competition conditions
– Open source solutions?
– Open data solutions?

Questions
 Are these new categories of library software really new-generation systems?
 Are we able, at the European level, to build a metadata platform that is
– open
– federative
– real time
– complete?
 What is the role of national union catalogues?

work_445r4iaonffsppnpohftmlmh4e ---- Education digital libraries management: Sharing the experience of UNICAMP education faculty
Rosemary Passos, Gildenir Carolino Santos and Célia Maria Ribeiro
State University of Campinas, São Paulo, Brazil
OCLC Systems & Services: International digital library perspectives, Vol. 24 No. 4, 2008, pp. 219-226. DOI 10.1108/10650750810914229

Abstract
Purpose – This work aims to report the experience of implementing a university digital library by introducing all the technical/administrative and scientific production in the education field.
Design/methodology/approach – The paper describes the process of conception, the information architecture, and the steps and methodology for structuring and establishing the Digital Library of the Faculty of Education of the State University of Campinas (BDE – FE/UNICAMP), in partnership with the UNICAMP Libraries System (SBU/UNICAMP), which manages the Nou-Rau software and stores the Digital Library of UNICAMP (BDU). It also identifies the skills and abilities the information professional must have concerning the definition of criteria for the evaluation and selection of documents to be scanned, and establishes management procedures for the implementation of services derived from this new tool for retrieving information in the educational area.
Findings – The paper finds that the constitution of a multidisciplinary staff and the skills and abilities required of an information professional involved in designing digital libraries is the object of discussion in several forums. The technical-scientific skills are the most important ones, since this professional must be able to act in a changing environment, from an analogical to a digital culture.
The attitude of the information professional, long associated with strictly technical work over the course of the career's development, also goes through changes, and now demands a professional who is a manager, a leader, a visionary and a strategist or, in other words, a real agent of change.
Originality/value – The purpose of the BDE – FE/UNICAMP is to store and make electronically available to users the production of professors, students, employees (technicians) and the administration staff, generated within the Faculty. The implementation of this source of research, therefore, meets the expectations of the users and helps to spread the information.
Keywords: Digital libraries, Information management, Information literacy, Brazil
Paper type: Case study

1. Introduction
It is clear today that digital libraries are well established in institutions of higher education in Brazil. The scientific production and research developed by the academic community demand real-time procedures of validation, confirmation and dissemination of the information produced, due to the importance of speed in knowledge communication.
The arrival of the electronic information society has drastically changed the time and space delimitation of information. The emergence of information technology tools provided the necessary infrastructure for changes with no return in the relationship between information and its users (Barreto, 2006).
This perspective refers to the role played by this information storing and spreading tool – the digital library – which has become fundamental in allowing simultaneous interactivity. At the same time, digital libraries are responsible for real-time access to information, "[. . .] and give access in multiple ways of interaction between the receiver and the information structure contained in this space" (Barreto, 2006).
The context described opens the discussion proposed in this article and reveals the concern with the building of digital library models, reservoirs of information stocks, as well as the way users will have access to this documentation, as stated by Barreto (2006). The information flow between the digital stocks and the receivers comprises two criteria: that of information technology, which aims to make possible the widest and best access available; and the information science criterion, which intervenes to qualify this access in terms of users' individual competencies to assimilate the information.
The responsibility for the information science criterion falls to the information professional, who will navigate the "non-presential spaces," adding new abilities and competencies to the "traditional tasks of the occupation." These are necessary to manage a digital library architecture, as well as to select the informational contents that will be available to users.

2. Digital libraries – a brief concept
According to da Silva et al. (2006), the subject of digital libraries is a frequent theme in information science and librarianship journal articles and communications at events.
To emphasize this, the authors mention the publication of a special issue of Information Science in 2001 and the International Workshop "Information Policies on Digital Libraries," which took place at UNICAMP in 2003, as examples of facts that reveal the importance of this discussion nowadays.
For this reason, some authors have explored the concept of the digital library, sometimes comparing it to the electronic library. The difference between these sources of information is that the digital library puts the full contents of documents, already digitized, at users' disposal; the electronic library, in its turn, because it is completely automated, offers online services for users (Machado et al., 1999).
Scanned collections are analyzed by some authors. Zang et al. (2001) consider the digital library a way of presenting collections that can be scanned and stored in several types of media, such as floppy and hard disks or tape and compact disc. Pereira and Rutina (1999), quoted by Santos (2005, p. 281), state that the digital library is the one that, besides the catalog, has "[. . .] texts of documents of the collection stored in digital format, so they can be read from the screen of a computer or imported (downloaded) to the hard disk [. . .]". The permission to read documents and the possibility of downloading are the main characteristics of digital libraries at the moment, when the focus on digital inclusion and the expansion of access to the virtual world prevails.
To Rosetto (2003), the digital library is the one that considers documents generated in or transposed to the digital (electronic) environment, an information service (in all sorts of formats) where all the resources are available in the form of electronic processing (acquisition, storing, preservation, retrieval and access through digital technologies).
The scientific and technological development in Brazil happens at the same time as the establishment of digital libraries in different fields of activity. They become a tool for knowledge access, sharing and cooperation, which allows all the scattered and disorganized information available from the internet to be selected and stored, creating a channel of relevant information distribution with good quality.

3. Digital library of education
The Education Digital Library of UNICAMP (BDE/UNICAMP) was conceived in July 2006, aiming to store and make the scientific and technical production of UNICAMP Education Faculty researchers electronically available in its architecture, according to technical norms. In the scope of a university whose principles of action are to stimulate teaching, research and specialization, the digital library becomes a proactive deed to offer the academic community the opportunity to regularly publish their work through web systems, spreading knowledge, optimizing the scientific communication flow and reducing the cycle of new knowledge generation (Vicentini, 2006).
It is a small-sized digital library with an access link on the homepage of the Digital Library of UNICAMP (BDU/UNICAMP), a service offered by UNICAMP through its library system. The management of this system is provided by open-source software conceived at the University, called "Nou-Rau." It is based on the Linux operating system, the Apache WWW server, the PHP language and a PostgreSQL database. BDE has four distinct parts to organize the distribution of bibliographic formats. They comprise a repository of digital contents of all the UNICAMP Education Faculty production.
3.1. Architecture of the Education Digital Library (BDE) – methodology
The architecture of BDE was developed according to the BDU's architecture, which hosts the access link to the Education Digital Library. Its structure is composed of four large groups, as in the Digital Library of UNICAMP's architecture, namely professors' production, students' production, technical production and administrative production:
(1) Professors' production – works of professors of the Education Faculty.
(2) Students' production – works of students of the Education Faculty.
(3) Technical production – works of the technical support staff of the Education Faculty.
(4) Administrative production – works of administrative collaborators of the Education Faculty.
Besides this division by authors of works, the BDE's architecture is divided into four levels:
(1) Main topic. This shows the BDU's homepage, where the Library's content is listed. At this level is the access link to BDE.
(2) Subtopic. This describes BDE's content, where the four large groups comprising the Faculty of Education's intellectual production (professors, students, technical and administrative) can be found.
(3) Subtopic. This describes the organization, from A to Z, of producers of the works introduced in the Library's database.
(4) Subtopic. This describes the distribution of material types (article, book excerpt, chapter, etc.) as they are registered in BDE's catalog structure.
Figures 1-3 show the steps to access the intellectual production of FE/UNICAMP in digital format, in the same environment as BDU.
[Figure 1. BDE main entrance from the BDU web site. Figure 2. Main screen with BDE's production levels. Figure 3. Access screen with the types of materials included in BDE.]

4. Management of the education digital library – procedures
It is possible to build digital libraries, provided that there is a proper plan, knowledge of informatics language, and a suitable software environment to manage the processes. Regarding the requirements for digital library management, Vicentini (2006) emphasizes the following items, considered ideal for planning:
(1) Collections/contents.
(2) Human resources:
. multidisciplinary staff; and
. training.
(3) Standardization:
. metadata;
. MARC;
. format of digital archive; and
. digitization patterns.
(4) Technology:
. Hardware.
. Software:
– open; and
– proprietary.
. Flexibility of development.
. Programming language.
. Use of communication protocols to import and export data.
(5) Digitization.
(6) Authorship warranty right (copyright).
(7) Digital document preservation.
Observing these basic requirements, the management of a digital library will be suited to its principles when conducted by an information professional with the necessary competencies and abilities to run an academic or public cyberspace, whether in an education institution or not. In this context, BDE followed every step suggested in the Faculty of Education Digital Library planning and management, becoming the first academic library to own a logical content organization distributed by type of technical-scientific production.
To add value to the BDE's indexed information, therefore, all the materials were classified according to the Dewey Decimal Classification (DDC), aiming to make it easier for users to connect subjects to the traditional collections in the sense of technical organization (a brief illustrative sketch of the structure described in this section follows).
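To make the structure described above more concrete, here is a minimal sketch of how the BDE's four production groups, four-level browse hierarchy, and DDC classification might be modeled. This is illustrative only: Nou-Rau is a PHP/PostgreSQL system, and every class, field, and record value below is hypothetical, not taken from the actual implementation.

```python
# Minimal, hypothetical sketch of the BDE structure described above.
from dataclasses import dataclass, field

# The four large production groups (level 2 of the hierarchy).
BDE_GROUPS = (
    "Professors' production",
    "Students' production",
    "Technical production",
    "Administrative production",
)

@dataclass
class Record:
    """One work registered in the BDE catalog (level 4: material type)."""
    group: str          # one of BDE_GROUPS
    producer: str       # level 3: producer, filed A-Z
    material_type: str  # e.g. "article", "chapter"
    title: str
    ddc: str            # Dewey Decimal Classification number

@dataclass
class DigitalLibrary:
    """Level 1: the entry point hosting the BDE access link."""
    name: str
    records: list[Record] = field(default_factory=list)

    def add(self, record: Record) -> None:
        if record.group not in BDE_GROUPS:
            raise ValueError(f"unknown production group: {record.group}")
        self.records.append(record)

    def browse(self) -> dict:
        """Rebuild the browse tree: group -> producer -> material type."""
        tree: dict = {}
        for r in sorted(self.records, key=lambda r: (r.group, r.producer)):
            tree.setdefault(r.group, {}) \
                .setdefault(r.producer, {}) \
                .setdefault(r.material_type, []).append(r.title)
        return tree

bde = DigitalLibrary("BDE - FE/UNICAMP")
bde.add(Record("Professors' production", "Santos, G.C.",
               "article", "Mapeamento dos suportes...", "370"))
print(bde.browse())
```

A relational schema in PostgreSQL (one table per level, joined by foreign keys) would express the same hierarchy in the environment the article actually names.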
5. Final considerations
Considering the new era of traditional libraries, when new services and tools are introduced into daily routines, and concerning digital library management, the role of librarians acquires a magnitude that demands additional qualified knowledge about appropriate technologies, learning, and interaction with several kinds of organizations, people and institutions dedicated to the production, communication and diffusion of information in a larger geographic space (Santos and Passos, 2004).
The digital library was the means chosen to disseminate and preserve a collection in continuous growth, mainly because of the possibility of expanding the limits of access and use of information beyond the time and physical space of traditional libraries (Pavão et al., 2005).
To Vicentini (2006), one of the agents that benefits the development of a digital library is the composition of a multidisciplinary staff involved in the project, taking into consideration the following aspects: human resources (people specialized in the use of the equipment, with the specific knowledge to develop the tasks); technological resources (acquisition of the right equipment and outlining of economic resources); and motivational resources (incentives to the staff and collaborators for the development of the digital library product).
The competencies and abilities of the information professional have been the subject of discussions in several forums. Competencies were classified in 2002 into four main categories:
(1) communication and expression competencies;
(2) technical-scientific competencies;
(3) managerial competencies; and
(4) political competencies.
Regarding the abilities demanded from this professional to develop digital libraries, the technical-scientific competencies are emphasized, since this professional must be able to select, register, store, retrieve and diffuse recorded information in any electronic medium; he or she must also use and disseminate information sources, products and resources in different supporting systems, plan and execute studies, and provide training for information users.
The competencies of the information professional in digital libraries' planning and operation, therefore, are necessarily associated with the discussions about the deep changes from the analogical culture to the digital culture, as well as about the change of attitude of the information professional in relation to technical conditions, observed along this career's development. The present circumstances in information units, however, demand someone who is a manager, a leader, a visionary and a strategist or, in other words, a real agent of change.

References
Barreto, A.A. (2006), "Prefacio", in Marcondes, C.H., Kuramoto, H. and Toutain, L.B. (Eds), Bibliotecas digitais: saberes e práticas, 2nd ed., UFBA and IBICT, Salvador and São Paulo, pp. 7-9.
da Silva, H.P., Jambeiro, O. and Barreto, A.M. (2006), "Bibliotecas digitais: uma nova cultura, um novo conceito, um novo profissional", in Marcondes, C.H., Kuramoto, H. and Toutain, L.B. (Eds), Bibliotecas digitais: saberes e práticas, 2nd ed., UFBA and IBICT, Salvador and São Paulo, pp. 259-82.
Machado, R.N., Novaes, M.S.F. and Santos, A.H. (1999), "Biblioteca do futuro na percepção de profissionais da informação", Transinformação, Vol. 11 No. 3, September/December, pp. 215-22.
Pavão, C.G., Pouzada, E.V.S. and Mathias, M.A. (2005), Concepção e implementação da Biblioteca Digital da Universidade Federal do Rio Grande do Sul, Porto Alegre.
Pereira, E.C. and Rutina, R. (1999), "O século XXI e o sonho da biblioteca universal: quase seis mil anos de evolução na produção, registro e socialização do conhecimento", Perspectivas Ciência da Informação, Belo Horizonte, Vol. 4 No. 1, January/June, pp. 5-19.
Rosetto, M. (2003), "Metadados e recuperação da informação: padrões para bibliotecas digitais", Ciberética: Simpósio Internacional de Propriedade Intelectual, Informação e Ética, 2, available at: www.ciberetica.org.br (accessed 20 March 2007).
Santos, G.C. (2005), "Mapeamento dos suportes de auxílio ao ensino tradicional: uma contextualização, da biblioteca, do livro, do computador, da Internet e da tecnologia na educação", in Bittencourt, A.B. and Oliveira, W.M. Jr (Eds), Estudo, pensamento e criação, FE/UNICAMP, Campinas, pp. 277-89.
Santos, G.C. and Passos, R. (2004), "Estratégias para a estruturação de um website no desenvolvimento de bibliotecas digitais", ETD – Educação Temática Digital, Campinas, Vol. 6 No. 1, June, available at: http://143.106.58.55/revista/include/getdoc.php?id=88&article=26&mode=pdf (accessed 25 May 2007).
Vicentini, L.A. (2006), "Gestão em bibliotecas digitais", in Marcondes, C.H., Kuramoto, H. and Toutain, L.B. (Eds), Bibliotecas digitais: saberes e práticas, 2nd ed., UFBA and IBICT, Salvador and São Paulo, pp. 239-57.
Zang, N., Filipiak, E., Senger, I. and da Silva, T.L. (2001), "Biblioteca virtual: conceito, metodologia e implantação", Revista de Pesquisa e Pós-Graduação, Erechim, Vol. 1 No. 1, pp. 217-36, available at: www.uri.br/publicacoes/revistappg/ano1 n1/ (accessed 20 February 2001).

work_4bg65qeuhjhszcahm6ikchwddm ---- American Archivist / Vol. 57 / Spring 1994
Automating the Archives: A Case Study
CAROLE PRIETTO
Abstract: The establishment of an archival automation program requires that the archivist address issues of both a technical and a managerial nature. These issues include needs assessment, selection of hardware and software to meet identified needs, redesigning archival tasks in light of the system selected, and ongoing maintenance of the system selected. The present article discusses the issues Washington University Archives staff members faced in developing an automation program and the solutions they adopted. It concludes with a brief discussion of possible future directions for the automation program.
About the author: Carole Prietto holds a B.A. in history from the University of California at Santa Barbara and an M.A. in history from UCLA. From 1986 to 1989 she was the assistant in the UCLA University Archives; since 1990, she has been university archivist at Washington University, St. Louis.
The program chosen was Marcon, then manufactured by AIRS, Incorporated. My predecessor did not make automa- tion a high priority, and the personal com- puter in University Archives, when it was used at all, was used for correspondence. Item-level finding aids, accession registers, and statistics were prepared, as they always had been, on a typewriter. The result was a small archives staff overburdened with clerical tasks while facing both a large backlog and a heavy reader services load. Soon after my arrival in 1990, it became "The commercial database management packages discussed here are trademarks of their respective man- ufacturers. The author has no connection with any of the manufacturers whose products are discussed here. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 366 American Archivist / Spring 1994 apparent that the time had come to take a critical look at archives procedures, with an eye toward streamlining them. Automation offered a means to do this. We had the computers and the software; what we now needed was a plan to exploit our personal computer's capabilities to the fullest. Needs assessment came first: what activ- ities should be automated? The activities best suited for automation were those fre- quent and repetitive in nature, heavily pa- per-based, and involving a great deal of word processing. In examining our opera- tions, we found the activity that best fit the above criteria was the creation of archival and manuscript finding aids. Once we de- cided what to automate, we had both the tools (personal computers and software) and a clear sense of what we wanted to accomplish. Before any further progress could be re- alized, I had to learn to use Marcon Plus, the database package I had inherited, and then train the archives staff in its use. My strategy for learning Marcon was to find a test collection, design a database structure for that collection, enter data into Marcon, and generate a finding aid. These steps would provide the training I needed, which I could then pass on to the staff. The da- tabase file and finding aid that resulted could be used to evaluate Marcon's index- ing, searching, and reporting capabilities. The test collection was a group of au- diotapes documenting Washington Univ- ersity's ongoing lecture program, the Assembly Series. The Assembly Series was an ideal test collection because of its size (about 1,000 tapes) and the need to increase the number of access points to the collection. The only finding aid available for the collection was a typed list of the lectures in rough chronological order; cross-indexes by speaker's name, lecture ti- tle, or sponsoring organization did not exist. Putting this information into a data- base structure would enable us to generate these cross-indexes easily. These indexes could easily be updated as more informa- tion was added to the database. Unfortunately, Marcon had a number of drawbacks. The first I noticed was that the program's processing speed slowed dra- matically after we entered one collection of approximately a thousand records. Over time, thousands of records would be en- tered into the database, and I did not want a program that would be bogged down by the presence of large files. We also discov- ered other problems. 
Printed reports—an important component because the program would have to produce not only finding aids but also the results of on-line searches—were difficult to set up and dif- ficult to modify. The data structure could not be modified except by completely eras- ing the file and reentering the data. For no apparent reason, indexes became corrupt, and on several occasions hundreds of re- cords were lost. Because of these problems, I proposed dBASE III+ for archives use. Concerns were raised about dBASE's lack of com- patibility with the MARC format and the feasibility of having two database manage- ment systems within Special Collections. I was assigned to investigate database man- agement systems used in archival and man- uscript repositories and make recommen- dations to the library. The head of Special Collections and associate dean for Collec- tions and Services gave me permission to use dBASE on a trial basis, pending the outcome of my investigation. Investigating the Options The investigation of database manage- ment programs began in February 1990. The first step was ascertaining what pro- grams were used in archival repositories. To find out, I queried archival colleagues in the St. Louis area. At this beginning stage, I was interested only in basic infor- mation: what programs were used, who the manufacturers were, how much the pro- grams cost, and strengths and weaknesses D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 Automating the Archives: A Case Study 367 of the respective systems. Follow-up con- tacts were made with manufacturers, who provided sales literature, user- group infor- mation, and demonstration disks. Several additional programs came to my attention through a software review column in the Midwestern Archivist.5 The search resulted in a preliminary list of seventeen database management pro- grams. Based on information from soft- ware reviews and comments from users, the initial field of seventeen was narrowed to seven: Advanced Revelation, dBASE III+, Georgetown Archives Management System, Marcon Plus, Workflow, MicroMARC:amc, and Minaret. The 1990 Society of American Archivists (SAA) meeting in Seattle played an important role in the database management project be- cause it provided an opportunity to obtain detailed information about all seven pro- grams. The Marcon, MicroMARC, and Minaret user groups would be meeting, and information-sharing sessions (called "swap shops") for users of Advanced Revelation, dBASE, and Minaret were part of the pro- gram. In preparation for the Seattle meet- ing, I reviewed the literature and user comments I had received for our final group of seven programs and worked out, in consultation with other staff in Special Collections, the criteria to be used for se- lecting our database management system. They were as follows: • Reliability. Had other users experi- enced loss of data or system crashes while using a program? • Ease of use. Factors to be considered included ease of installation and setup, amount of time and level of technical knowledge needed to learn 'Glen McAnich, ed., "Reviews: Computer Appli- cations Programs," Midwestern Archivist 11, no. 1 (1986): 69-83. The programs reviewed were dBASE III, PFS File/PFS Report, DataEase, Savvy PC 4.0 and 5.1, Marcon II, PC File III, and DB Master 4 Plus. 
the program, the extent to which the program would allow modifications in either the data structure or the data it- self, and the quality of the user inter- face. A related factor was that no one on the Special Collections staff, and few people within the Olin Library System, had expertise in computer programming. Because of this, it was important that our database manage- ment system not be dependent on such expertise. • Adaptability. The system should be adaptable to the needs of both the University Archives and the Manu- scripts Section. The primary concern was whether the program could ac- commodate both folder- and item- level description. • Quality of documentation. Is it easy to understand? Does the program come with a tutorial, either print or on line? If so, how useful is it in learning the program? A related factor was avail- ability of resources beyond those pro- vided by the manufacturer: are there user groups, classes, or books availa- ble to assist the user? • Manufacturer's support of the prod- uct. Does the customer have to pay for technical support? How much sup- port, if any, is included in the pur- chase price, and what is the cost of ongoing support? Do users have dif- ficulty getting through to the manu- facturer? Are they happy with the service they get? In the case of pro- grams developed by an individual, how much technical support could we expect from the developer and how much would it cost? • Cost implications. Cost was inter- preted not only as the cost of the pro- gram itself but also as costs associated with technical support and the level of hardware needed to run the program. It was important to have a program that would run on our existing hard- D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 368 American Archivist / Spring 1994 ware with little or no sacrifice of per- formance. Could the program run on our local area network? Comments from the Marcon Users Group at the Seattle meeting confirmed what I had experienced. Others had expe- rienced problems such as sudden and unex- plained locking of the keyboard, corrupted indexes, data loss, and report forms that produced garbage text. The comments re- lated to performance alone were enough to remove Marcon from contention, but the users had other concerns: a poorly written manual, lack of a tutorial, and poor tech- nical support from the manufacturer. Marcon's manufacturer, Interactive Sup- port Services, did not send a representative to the meeting, with the result that the chair of the user group had the unenviable task of addressing the concerns of a hostile group of Marcon users. The manufacturer was working on a new release of Marcon that would fix the many bugs in the pro- gram, but the release had no definite ship- ping date. It was also announced that all work on the development of MARC-MAR- CON, a Marcon Plus utility that would have given the program the capacity to cre- ate MARC records, was being abandoned because the archival market was too small to warrant the costs involved. By the end of the meeting, many user-group members were speaking openly about plans to aban- don Marcon. Their comments made it plain that we, too, would be best served by mov- ing in a new direction. Fortunately, we were in a position to do so because our investment in Marcon had been small. Workflow and the Georgetown Archives Management System (GAMS) are derived from dBASE III+. 
Workflow is written in the dBASE programming language; GAMS is written using the dBASE lan- guage and the Clipper compiler.6 Both 6A compiler is a program that converts a user's pro- gram files to stand-alone applications that do not re- were designed to meet the needs of specific institutions (UCLA and Georgetown Uni- versity, respectively) by staff from those institutions. The Georgetown system recognizes three levels of hierarchical description used in archives and manuscripts: collection, box, and folder. Data for each level is linked to the next with a machine-gener- ated ID number. Index terms, filled in by the user during data entry, may be linked to each folder record and can be searched. The results of searches can be displayed on screen or printed, and the system can gen- erate finding aids at folder level. In creating the Workflow system, the de- velopers began with the premise that proc- essing requires a number of products above and beyond the finding aid, such as gift acknowledgements, monthly and annual statistics, inventories of archival supplies, and, of course, the MARC record.7 Work- flow consists of a series of databases and programs which are designed to track the actions taken on a collection, beginning with the initial contact with the donor and continuing with accessioning, creating a rinding aid, and cataloging. Information pertaining to a given collection exists in- dependently in the various databases until the programs format the data into whatever product is desired: accession register, gift acknowledgement (including news re- leases), finding aid, or MARC record, as appropriate. quire the presence of a particular program to run and can be legally distributed. The presence of Clipper means that GAMS, unlike Workflow, does not require the presence of dBASE to run. In fact, GAMS makes use of features found only in Clipper that prevent it from running directly in dBASE III+ or dBASE IV. One such feature allows GAMS to accommodate up to 64 kilobytes of free-text description per folder, by- passing dBASE's limit of 254 characters. T o r a detailed outline of the Workflow system, see Dan Luckenbill, "Using dBASE IIP- for Finding Aids and a Manuscripts Processing Workflow," Rare Book and Manuscript Librarianship 15, no. 1 (1990): 23-31. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 Automating the Archives: A Case Study 369 For both programs, my primary concern was adaptability: neither program was able to accommodate the item-level description needed by our manuscripts curator. An- other concern was the availability of tech- nical support. Ashton-Tate, then the manufacturer of dBASE, had a policy of not providing assistance to users of cus- tomized dBASE applications. The devel- opers of Workflow and GAMS were full-time archivists in Los Angeles and Washington, D.C., respectively. The dis- tance to St. Louis from either location would make site visits prohibitively expen- sive and difficult to arrange. Telephone service would have to be scheduled to ac- commodate the developers' work schedule. Advanced Revelation (A-Rev.), manu- factured by Revelation Technologies, was the most powerful program I saw. A- Rev. consists of an array of programming tools that allow the user to custom design a com- plete database management system without having to write programming code. 
These tools allow the developer to paint data en- try screens (fields can be placed anywhere on screen, and the developer can determine how much or how little data shows on screen), develop multiple levels of menus, develop pop-up windows that provide the user with lists of options at any point, and employ multiple levels of data verification. Data fields are stored in a central data dictionary, allowing changes to be made in the database structure without requiring that the developer restructure the entire data file or modify an entire application. A- Rev. allows variable- length description and can accommodate records up to 64 kil- obytes in size. Boolean and proximity searches are both possible; report forms used for finding aids can be developed and stored in a centralized reports library. If the tools provided are not adequate, the devel- oper can create others, thanks to the pres- ence of a programming language, R-BASIC, and an internal compiler and de- bugger. All the A-Rev. users I met commented that the user pays a price for A-Rev.'s power in the form of a steep learning curve—the program is difficult to learn. Having read the program's sales literature, I had to agree with their assessment. Clearly, the program's power was going to present a significant obstacle for us. There were no A-Rev. user groups in the St. Louis area. No one in Olin Library had heard of the program, much less knew how to use it. Thus, we would be faced with mastering a difficult program with few lo- cal resources to draw on. Although the training issues were significant, even more significant was the discovery, gained from conversations with other A-Rev. users, of two significant hardware limitations. The first was that A- Rev.'s file management system could not handle volumes of data larger than 32 megabytes, thereby putting an upper limit on the amount of informa- tion we could store in our computers. The second was that A-Rev. would not run on our local network without significant re- configuration of all our existing hardware. For those two reasons, A-Rev. was not considered the best option. The database management program that emerged as the best option for Special Col- lections use was dBASE III+. Its cost was the lowest,8 and its performance was the least affected by having to run on older, slower computers. Unlike A-Rev., dBASE III+ could easily run within our network and it set no limits on how much data 8The low cost was due in part to the fact that d- BASEIII+ was beginning to give way to dBASE IV. Although dBASE IV was the newer product, I never considered it for our use because the early versions of dBASE IV received poor reviews in the popular personal computer journals. dBASE III+, on the other hand, was a product with a proven track record, and Ashton-Tate had no plans to stop supporting it. Since that time, Ashton-Tate has been taken over by Bor- land International, and dBASE III+ is no longer man- ufactured or supported. Borland has made a number of improvements to dBASE IV, and we will be up- grading our database manager in the near future. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 370 American Archivist / Spring 1994 could be stored—dBASE III+ can handle as much as the computer's hard disk can hold.9 Unlike GAMS and Workflow, d- BASE III+ could accommodate the differ- ing descriptive needs of the University Archives and the Manuscripts Division. 
Because dBASE III+ was a commercial product, rather than the product of an in- dividual developer, customer service was a phone call away at any time. The software was widely used within the archival com- munity, and the many tutorial books, ref- erence guides, and third-party utilities designed for it constituted a virtual dBASE industry.10 It had the additional advantage of strong institutional support. During the course of the database management study, the library administration had selected dBASE as the officially supported database manager. This meant that on- site service (if needed) and upgrades could be easily obtained. It also meant that, in terms of in- formation sharing with other units, dBASE would not isolate us from the rest of the library system.11 One thing dBASE III+ could not give us was the ability to create MARC records that could be loaded into the OCLC data- base and our local NOTIS catalog. To that 'dBASE III+ has a limit of one million records per file but no limit on the number of files that can be created. In effect, dBASE can handle as much data as can fit on the hard disk. With Advanced Revelation, 32 megabytes is all the program can handle, even if the hard disk has 200 megabytes of free space. This problem can be solved using multiple DOS partitions, but such partitioning is not possible with later ver- sions of DOS. 10A related consideration at the time was that both Ashton-Tate and dBASE had remained stable for many years; thus we could be reasonably confident that Ashton-Tate and dBASE would be stable entities over the long term. Within six months after the con- clusion of the database management study, Ashton- Tate became a subsidiary of Borland International, one of dBASE's former competitors. "It should be noted that the Special Collections De- partment was not at any time forced to use dBASE. The library administration encouraged us to look at a number of options and propose the solution we felt was best. end, Minaret and MicroMARC were ex- plored. We were evaluating not only the usefulness of these programs for creating and exporting MARC records, but also whether one of these programs would sub- stitute for, or serve as an adjunct to, dBASE III+. I preferred Minaret to MicroMARC because it had a more user-friendly inter- face, could work with word-processing pro- grams such as WordPerfect to create finding aids and catalog cards, could read dBASE files, and was more widely used by archival colleagues in the St. Louis area. While in Seattle, I spoke with represen- tatives from the manufacturer and attended the Minaret users group meeting. On re- turning to St. Louis, I obtained demonstra- tion disks, tried out the program, and consulted with Minaret users in the St. Louis area. In the end, we decided against a PC-based MARC AMC utility and in fa- vor of a modem with OCLC's Passport software.12 Besides being a more cost-ef- fective solution, Passport would allow us to enter MARC AMC records directly into OCLC without having to convert data into a format OCLC could read, as would be necessary with Minaret or MicroMARC.13 Another benefit would be access to the OCLC authority file and other utilities we would need for our cataloging. 12Passport is the terminal emulation software for OCLC's PRISM system. In layman's terms, Passport enables a personal computer to function as an OCLC terminal. 
13 Minaret could send records to OCLC over telephone lines using the ProComm telecommunications package and a third-party utility; however, this required a number of data conversions. For a discussion of Minaret's uploading procedure, see Carson, "American Medical Association." MicroMARC had, at the time, no way to send AMC records to OCLC via telephone lines. MicroMARC users had to copy completed records to a floppy disk and send them to Michigan State University. At Michigan State, records were tape-loaded into OCLC via the university's mainframe. Both MicroMARC and Minaret have since added modules for importing and exporting MARC records.

The study of database management needs in Special Collections ended in October 1990, with a two-part recommendation to the library administration: full access to OCLC and NOTIS for cataloging needs, and dBASE III+ for in-house database management functions, including the preparation of the finding aids that make a MARC record possible. This recommendation was accepted, and work with dBASE began in November 1990.

Implementing the System
Once dBASE was installed, the next tasks were staff training and putting dBASE to use in the archives. The Assembly Series database, created during the initial trials with dBASE and originally designed with that specific collection in mind, was modified so that it could accommodate audiovisual materials from all collections. Databases for folder-level description of paper records and item-level description of printed items were added. Because our databases are grouped along lines of format (paper, audiovisual, or print), it is possible for items from the same collection to appear in three different databases. To keep track of where information was stored, and to assign classification numbers, two more databases were created for collection-level data—one for university records and one for our St. Louis-area manuscripts.

The implementation of dBASE was accompanied by changes in our processing procedures for paper records. Emphasis was placed on folder-level, rather than item-level, processing of paper records. This change in emphasis maintains intellectual control over collections while reducing the amount of staff and student time needed to process them. Another change is that we no longer create draft finding aids at various stages of processing. As part of the arrangement and description process, we review old folder headings and assign new ones when appropriate, as we did in the past. Once the arrangement and description are complete, the processor, working from the folders themselves, enters the finished folder-level data into the computer database. When all folders have been entered, the processor generates the finding aid using the dBASE report form. Audiovisual materials and printed items are described at item level, but they are checked in using the computer rather than manually. The computer greatly simplifies the process of shelving an item, updating box and folder numbering, and producing a corrected finding aid.
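The workflow just described — key in folder-level data, then generate the finding aid as a formatted report — can be pictured with a small sketch. This is not the repository's actual dBASE report form; it is a hypothetical Python model of the same idea, with invented field names and sample data.

```python
# Illustrative sketch only: models folder-level description records and a
# finding-aid "report form" analogous to the dBASE one described above.

folders = [
    # (record group, series, box, folder, folder heading, dates) - all invented
    ("RG 3", "Chancellor's Office", 1, 1, "Correspondence, A-C", "1958-1960"),
    ("RG 3", "Chancellor's Office", 1, 2, "Correspondence, D-F", "1958-1960"),
    ("RG 3", "Chancellor's Office", 2, 1, "Budget files", "1961"),
]

def finding_aid(records):
    """Format folder-level records into a simple series/box/folder list."""
    lines, current_series = [], None
    for rg, series, box, folder, heading, dates in sorted(records):
        if series != current_series:
            lines.append(f"\nSeries: {series} ({rg})")
            current_series = series
        lines.append(f"  Box {box}, Folder {folder}: {heading}, {dates}")
    return "\n".join(lines)

print(finding_aid(folders))
```

Regenerating the report after a correction — the equivalent of re-running the dBASE report form — is what makes updated box and folder numbering cheap compared with retyping a finding aid.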
Of course, dBASE could accomplish nothing unless the staff and student assistants were trained to use it. Like all university archivists, I am faced with a constant turnover of student assistants, so dBASE training is ongoing. It is also incremental. In teaching a student the basics of dBASE, I begin with an introduction to our databases and the types of materials they describe. This introduction serves a dual purpose: in explaining what the various database fields mean, I am also giving a primer on the principles of archival arrangement. Because the manuscripts, publications, and audiovisual databases have similar data structures, learning to navigate one file means that the others can be quickly mastered. The students are first taught the four most basic commands used in data entry: USE, EDIT, APPEND, and QUIT. Once those commands are mastered, I move on to searching commands, such as LOCATE, LIST, and DISPLAY, and global-replace commands such as REPLACE WITH that expedite data entry by reducing the amount of repetitive typing. With few exceptions, students are comfortable with the computer and have little trouble with dBASE commands.

Once the students are familiar with both dBASE and basic archival hierarchy, they are introduced to actual processing of collections. For them, processing is a welcome addition to the more routine tasks of refoldering, paging and retrieval, and photocopying. For me there is the challenge of assigning appropriate work. As a recent American Archivist article rightly points out,14 student assistants cannot substitute for staff and their work assignments must reflect that fact. When I assign processing projects, the students are given collections that need little rearrangement but do need review of folder headings and input into the computer. Collections that require complex rearrangement, require access decisions, or have significant preservation problems are processed by my assistant. The staff consists of one half-time paraprofessional and two undergraduate student assistants, and it is not unusual to have three or four projects in progress simultaneously. The volume of materials processed with the same number of staff has increased dramatically.

14 Barbara L. Floyd and Richard W. Oram, "Learning by Doing: Undergraduates as Employees in Archives," American Archivist 55 (Summer 1992): 440-52.

MARC records, however, are not a student task. Because I was to be responsible for creating catalog records for archival collections, I, too, needed training, some of which was provided by technical services staff in Olin Library. Like many archivists, I have taken the SAA workshop on Archives, Personal Papers, and Manuscripts; I have also taken a course on OCLC authority files sponsored by our regional OCLC office. Additional workshops will be necessary in order to keep up with current cataloging practice.

The procedures used for creating MARC records take full advantage of OCLC's ability to copy screens to a personal computer's hard disk. First the OCLC workform for AMC records is copied to the computer's hard disk and saved as a WordPerfect file. This procedure allows extensive editing of the record without incurring large amounts of connect time. Once the basic information (main entry, title, physical description, organization and arrangement, restrictions, scope and content, biographical/historical note) is in final form, the WordPerfect file containing the filled-in workform is printed out. Data from the printout is keyed into OCLC and added to the OCLC save file. Potential subject and added entries are searched in the OCLC authority file and appropriate entries are added to the saved record. Technical services staff review the completed record for conformity with both OCLC conventions and AACR2, then the record is added to the OCLC database. OCLC records are loaded into the library's NOTIS system via the library's weekly tape load, without the need for further intervention on our part.
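As a concrete illustration of the fields just listed, a skeletal AMC-style workform might look something like the following. The tags are standard USMARC fields for these notes (245 title, 300 physical description, 351 organization and arrangement, 506 restrictions, 520 scope and content, 545 biographical/historical note), but the content, indicators, and subfield coding shown here are invented for illustration and are not taken from the repository's records.

```
245 10 $a Office records, $f 1958-1975.
300    $a 12 linear ft. (24 boxes)
351    $a Organized in three series; $b arranged chronologically within each series.
506    $a Access restricted; advance notice required.
520    $a Correspondence, budget files, and subject files documenting the office's activities.
545    $a The office was established in 1958 to oversee...
```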
Potential sub- ject and added entries are searched in the OCLC authority file and appropriate entries are added to the saved record. Technical services staff review the completed record for conformity with both OCLC conven- tions and AACR2, then the record is added to the OCLC database. OCLC records are loaded into the library's NOTIS system via the library's weekly tape load, without the need for further intervention on our part. As of February 1994, our databases con- tain approximately 25,000 collection-level, folder-level, and item-level records span- ning over 70 record groups. As the data- bases grow in size and scope we gain an increasingly useful on-line searching tool. Already we have gained greater productivity from the same number of staff. Now that the various components of our automation program are up and running, keeping it run- ning smoothly is an important activity. To that end, we perform regular backups of our data and regular checks of our hardware for viruses and signs of impending hard-disk failure. Planning for the future is already taking place in a number of areas. A number of special collections departments are now using the Internet communications protocol known as Gopher to make finding aids available over the Internet. We are explor- ing ways to do the same.15 Planning and developing an automation program taught me two important lessons. The first was that developing an automa- "Seventeen special collections departments have set up Gopher servers as of February 1994, and the number continues to increase. An important part of making our finding aids accessible over Gopher will be retrospective conversion of older finding aids that exist only in typed form. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 Automating the Archives: A Case Study 373 tion program has both technical and man- continually developing process. In the be- agerial components, and neither is more ginning I assumed that selecting a system important than the other. In fact, the two would mean the end of my work. I now are constantly overlapping. The second les- know it was only the beginning of an on- son was that an automation program is a going, long-term process. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.9p4t712558174274 by C arnegie M ellon U niversity user on 06 A pril 2021 work_4cpbh6r4kvec5dd4uyf7ubcfri ---- Holding Patterns: Current Trends in Serial Holding Statements William & Mary From the SelectedWorks of Betty J Sibley Winter December 17, 2009 Holding Patterns: Current Trends in Serial Holding Statements Betty J Sibley, College of William and Mary Available at: https://works.bepress.com/jean_sibley/3/ http://www.wm.edu https://works.bepress.com/jean_sibley/ https://works.bepress.com/jean_sibley/3/ PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [Sibley, B Jean] On: 17 December 2009 Access details: Access Details: [subscription number 917927818] Publisher Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37- 41 Mortimer Street, London W1T 3JH, UK Technical Services Quarterly Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t792306978 Holding Patterns: Current Trends in Serial Holdings Statements B. 
B. JEAN SIBLEY
College of William and Mary, Williamsburg, VA

Online publication date: 17 December 2009. Received 6 May 2008; accepted 23 June 2008. Address correspondence to B. Jean Sibley, College of William and Mary, Earl Gregg Swem Library, P.O. Box 8794, Williamsburg, VA 23187. E-mail: bjsibley@wm.edu

This work reports the results of a survey on current trends in serial holdings statements. Respondents described the types of formats used in constructing holdings statements and the associated display and system issues. The value of holdings statements, often misunderstood, was weighed against the effort library staff spend maintaining their accuracy. The study examines what libraries are currently doing to convert their holdings statements to comply with the American National Standards Institute/National Information Standards Organization Z39.71-2006 display standard and the MARC21 Format for Holdings Data.

KEYWORDS: serial holdings statements, MARC holdings, MARC21 Format for Holdings Data, American National Standards Institute, National Information Standards Organization, standards

Serial holdings statements in the online catalog give the library user information about the serial titles and serial issues a given library owns or has access to via gift or subscription. The statements may reflect enumeration, chronology, missing issues, location, and format of issues. Behind the scenes, detailed and involved work is required of library staff to update and maintain the accuracy of serial holdings while attempting to make them readable to patrons. With the release of the new American National Standards Institute (ANSI) and National Information Standards Organization (NISO) standard, ANSI/NISO Z39.71-2006, replacing the 1999 version, libraries are obliged to ensure that their holdings statements comply with the latest standard. The ongoing transition of print serials to electronic form increases the importance of evaluating holdings statements in the OPAC. This article reports how libraries around the United States and Canada are handling conversion to the revised standard, given staff constraints. Several studies have discussed serial holdings statements, but few have focused on how libraries are formatting holdings, given the idiosyncrasies associated with integrated library systems. This research attempts to fill a gap in the current literature.
HOLDINGS STANDARDS

A brief review of the history of standards shows that ANSI, in New York, released a standard in 1980 (ANSI Z39.42-1980) for serial holdings at summary level 3, the highest level of enumeration and chronology. In 1986, a second standard (ANSI Z39.44-1986) replaced the first and covered serial holdings at both the summary level and the detailed level 4. Three years later, a consolidated standard for monographs and non-serial items was created; ANSI Committee Z39 had become known as NISO, which released ANSI/NISO Z39.57-1989. In 1997, in Geneva, Switzerland, the International Organization for Standardization (ISO) issued a standard for holdings at the summary level (ISO 10324:1997).

Compatibility with international standards prompted a proposed holdings standard to replace both Z39.44 and Z39.57, which led to Z39.71-1999. The draft was based on ISO 10324. It addressed both serial and monographic materials and defined display requirements for holdings statements. It also covered electronic resources and could be used for both manual and automated recording of holdings.

NISO standards are reviewed every five years. The latest standard (ANSI/NISO Z39.71-2006) was approved by ANSI on October 6, 2006 [1]. This version was a maintenance revision of the 1999 standard, with minor updates, corrections, and clarifications based on review of the older standard. The new, flexible standard specifies display requirements, in terms of layout and punctuation, for holdings statements for bibliographic items in any physical format or electronic medium. It is intended to promote consistency in the communication and exchange of holdings (MARC21 format). Holdings statements created under earlier standards are accommodated in the 2006 version, with multiple presentation options possible. Both Ellen Rappaport [2] (2000) and Marjorie Bloss and Helen Gbala [3] (2001) provide excellent brief histories of standards development up to Z39.71-1999.

TYPES OF HOLDINGS

For this study, libraries were asked to expound on the types of formats they used to construct holdings statements. The descriptions below are based on the SirsiDynix Unicorn serials check-in module but can be applied to all automated check-in systems. The purpose of a serial holdings statement is to convey useful information to those who operate, depend upon, or use the data to verify or locate a specific serial issue. Ultimately, a holdings statement must evolve to the issue-specific level [4]. Holdings statements may be organized into several constructed information fields, such as a general union list statement, bound or unbound holdings, special issues or supplements, cumulative index holdings, and missing issues.
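By way of illustration, the sketch below shows one way such a display string might be assembled from structured issue data. It is a minimal example only: the data model, field names, and compression rule here are invented for the illustration and do not come from the survey or from any particular ILS.

    from dataclasses import dataclass

    @dataclass
    class Run:
        """A continuous run of received volumes (hypothetical model)."""
        first_vol: int
        first_year: int
        last_vol: int | None = None   # None = run is still open-ended
        last_year: int | None = None

    def summary_statement(runs: list[Run]) -> str:
        """Compress each run to 'first-last' form (summary level 3);
        a break between runs implies a gap in holdings."""
        parts = []
        for r in runs:
            start = f"v.{r.first_vol} ({r.first_year})"
            if r.last_vol is None:
                parts.append(f"{start}-")        # open-ended statement
            elif (r.last_vol, r.last_year) == (r.first_vol, r.first_year):
                parts.append(start)              # single-volume run
            else:
                parts.append(f"{start}-v.{r.last_vol} ({r.last_year})")
        return ", ".join(parts)

    # Two runs separated by a gap, the second still current:
    print(summary_statement([Run(1, 1990, 4, 1993), Run(6, 1995)]))
    # -> v.1 (1990)-v.4 (1993), v.6 (1995)-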
The MARC21 Format for Holdings Data (MFHD) is a communication format, not a display format; it carries holdings information and defines the structure and coding of data elements for serial items [5]. The NISO standard specifies the content of holdings statements, while MARC21 provides the structure of holdings records. The MARC 852 field denotes the location of the serial items. Four sets of holdings data fields follow. Natural language identifiers, also referred to as captions and patterns, are designated in fields 853-855; these can be automatically generated by the serials check-in system and establish the format of the display. Issue identification data, the enumeration and chronology, are contained in fields 863-865; these fields are also system-generated and indicate the actual enumeration and chronology of the received issues. Holdings are updated automatically as individual serial issues are checked in, and if the feature is selected, the most recently arrived issue will display in the OPAC. Textual holdings fields 866, 867, and 868 contain a textual description of holdings. They can be generated automatically or entered manually: an 866 is generated when receiving an unpredicted basic issue, an 867 is generated to receive supplements, and an 868 is generated to receive index issues. A holdings record may contain information for physical items at one or more locations, recorded in fields 876-878. An 856 field may be used to link electronic resources to the MARC holdings record; it allows for the electronic transfer of a file, subscription to an electronic journal, or logon to an electronic resource.

Holdings may be recorded at general or issue-specific levels. Summary-level encoding (level 3) indicates that holdings are recorded at the first level of enumeration or chronology in compressed form, using only the first and last issues in the holdings statement. At the detailed or itemized encoding level (level 4), each serial item is listed individually. Z39.71-2006 allows a mixed level of holdings statements, with part at the summary level and part at the detailed level.

THE SURVEY

An online survey was conducted to obtain general information from a group of diverse libraries on the following questions:

- What is the type of library, and what library automation system do you use?
- Do you check in serials? If so, do the checked-in issues display in the OPAC?
- What types of formats do you use for constructing holdings statements?
- For which types of serials are holdings statements done?
- Are holdings statements formatted according to the latest ANSI/NISO Standard Z39.71-2006?
- Do holdings statements reflect your institution's OCLC holdings?
- Who is responsible for updating serial holdings statements?
- How much time is spent each week updating holdings statements?
- Are holdings statements easy for patrons to read? Are they worth the time and effort to maintain?

An electronic link to the survey was sent to the SERIALST listserv, which serves as an informal electronic forum for those involved in most aspects of serials processing in libraries. The online survey was sent out on April 1, 2008, and data collection closed on April 18, 2008. Information on type of library and library automation system was collected for comparative purposes.

RESULTS

Responses from 236 libraries were collected and analyzed.
The majority of the responding libraries were from within the United States, but there were also some responses from libraries in Australia, Canada, India, the Middle East, New Zealand, and South Africa. The findings of the survey are summarized below, followed by discussion based on the responses.

- Of the 236 libraries responding to the survey, 61.7% were university libraries, 17.4% four-year college libraries, 11.5% special libraries, 4.7% community college libraries, and 4.7% public libraries.
- A large share of the responding academic libraries (39.5%) used Innovative Interfaces as their library automation system. Ex Libris (Aleph and Voyager) accounted for 33.2%, SirsiDynix (Dynix, Horizon, and Unicorn) was used in 21.8% of institutions, and Endeavor in 5.5% of libraries. A few libraries used DRA Classic, EOS, Follett, Innovative Millennium, InMagic, OLIB, SOUL (India), SydneyPlus (Australia), TLC Library Solutions, Virtua, or an in-house system.
- Of the 236 libraries responding, 219 (92.8%) checked in serials, with the issues displaying in the OPAC. Fifteen libraries (6.4%) had serial issues that did not display, and two libraries (0.8%) did not check in serials.
- Libraries were asked to describe the types of formats they used to construct holdings statements; more than one category could apply. More libraries (31.3%) used compressed holdings (first and last issues only) than expanded, itemized holdings (19.7%), where each issue is listed. The majority of respondents (67.4%) recorded gaps when an issue was missing (detailed level), as opposed to not accounting for gaps (25.3%) with summary-level holdings. Enumeration and chronology were mostly recorded adjacent to each other (58.8%; e.g., v.5 (1999)-v.7 (2001)), compared with separate enumeration and chronology (30.5%; e.g., v.5-7 (1999-2001)). Many libraries (71.7%) used broad phrases, such as "Library retains current year," for limited-retention titles. Open-ended statements (e.g., 1999- ; 63.1% of libraries) and captions (e.g., v. for volume; 46.8%) were also used in constructing holdings statements. An "Other" category let individual libraries give examples of their holdings statements; many used a combination of formats depending on the material or collection, often the result of changes in practice over time (see Table 1).
- Holdings statements were done for a variety of materials; more than one category could apply. Most statements were done for bound print journals (93.1%) and microform (81.4%). Holdings were also created for monographic series (42.0%), electronic serials (37.2%), and analyzed serials (36.4%), that is, serials in which each volume has a distinctive title that receives an individual bibliographic record (analytic); summary holdings may be attached to the main serial record (see Table 2).
- More than one-third of respondents (34.4%) had no idea whether their holdings statements were formatted to comply with the latest ANSI/NISO Z39.71-2006 standard. Many libraries (33.0%) believed their statements were compliant, 9.6% were in the process of conversion, and 23.0% had no plans at present to redo their holdings to comply with the current standard (see Table 3).
- Most of the libraries' holdings statements (47.2%) reflected the institution's OCLC holdings in WorldCat, whereas 7.3% did not.
Holdings were partially reflected in OCLC for 33.9% of libraries, while 8.6% had no idea whether they were; seven libraries (3.0%) were not OCLC members.

TABLE 1. Types of Formats Used to Construct Holdings Statements (233 responses; more than one category could apply)

                                                Count   %
  A. Compressed (first and last issues only)      73    31.3
  B. Itemized (each issue listed)                 46    19.7
  C. Summary level 3                              59    25.3
  D. Detailed level 4                            157    67.4
  E. Enumeration and chronology adjacent         137    58.8
  F. Enumeration and chronology separate          71    30.5
  G. Captions                                    109    46.8
  H. Open-ended statements                       147    63.1
  I. Broad phrases                               167    71.7

TABLE 2. Types of Serials for Which Holdings Statements Are Done (231 responses; more than one category could apply)

                                                Count   %
  A. Bound print serials                         215    93.1
  B. Analyzed serials                             84    36.4
  C. Electronic serials                           86    37.2
  D. Microform                                   188    81.4
  E. Monographic series                           97    42.0

TABLE 3. Holdings Statements Formatted According to ANSI/NISO Z39.71-2006 (230 responses)

                                                Count   %
  A. Yes, they are                                76    33.0
  B. In process of conversion                     22     9.6
  C. No plans to redo holdings                    53    23.0
  D. Have no idea                                 79    34.4

- Paraprofessional staff (in 74.8% of libraries), such as library technicians, assistants, and some student workers, were primarily responsible for updating and maintaining serial holdings statements, while 25.2% of libraries relied on professional staff. Replies in the "Other" category gave details of staffing, showing that in many cases both paraprofessional and professional staff worked on holdings statements: paraprofessional staff did most of the input and updating, while professional staff was responsible for quality control.
- The majority of libraries updated holdings statements only when needed (59.5%). Others estimated they spent about 90 minutes each day for one to two days per week (8.2%), one to two hours per week (10.8%), one hour or more daily (20.2%), or no time at all (1.3%) maintaining holdings statements, which involved revising them to reflect receipt of current issues as well as compressing lengthy holdings statements.
- One hundred and seven libraries responded to the question of whether their library's holdings statements were user-friendly and worth the time and effort to maintain. The primary response (43.9%) was that maintaining serial holdings statements was worth the staff's time and effort. Only 19.6% voiced concern that staff time could be better spent doing other things, since "patrons don't usually read them [holdings statements] anyway." A few libraries (5.6%) thought that library staff received more benefit from holdings statements than patrons. Over 10% of libraries were in the process of converting their statements. Some (17.8%) were simply undecided about the value of serial holdings statements (see Table 4).

TABLE 4. Perceived Value of Holdings Statements in the OPAC (107 responses)

                                                     Count   %
  A. Yes, worth the time and effort to maintain        47    43.9
  B. No, not worth the time and effort to maintain     21    19.6
  C. More benefit to library staff than patrons         6     5.6
  D. In process of conversion                          14    13.1
  E. Undecided (includes no comment)                   19    17.8

EXAMPLES OF HOLDINGS STATEMENTS

The following are illustrations of different types of holdings statements for the same title, JAMA: The Journal of the American Medical Association.
University Library: Unicorn (SirsiDynix)

This statement combines both general and issue-specific elements. Enumeration and chronology are adjacent.

  v47(1906)-248[#1-16 18-24]249-273#18(1995)
  v273:19(05/17/1995)-278:23(12/17/1997)
  v279:1(01/07/1998)-288:20(11/27/2002)
  v288:23(12/18/2002)-290:2(07/09/2003)
  v.290:no.6(2003:Aug.13)-v.290:no.23(2003:Dec.17)
  v.291:no.1(2004:Jan.7)-v.294:no.2(2005:July13)
  v.294:no.4(2005:July27)-v.296:no.23(2006:Dec.20)
  v.297:no.1(2007:Jan.3)-v.299:no.11(2008:Mar.19)
  v.299:no.13 (2008:Apr. 2)-v.299:no.16 (2008:Apr. 23)

Community College: Aleph (Ex Libris)

This shows group holdings for colleges in a consortium. Enumeration and chronology are separate. Broad statements are used, and format is specified.

  Location: Key West Periodicals
  Holdings: v.285:no.1-v.288:no.17 (2001:Jan.01-2002:Nov.16) Scattered issues
  Location: Gulf Coast Periodicals SERIAL
  Holdings: v.287:no.1-v.298:no.12 (2002:Jan.-2007:Sep.) Bound, Current issues on display
  Location: Hillsborough Periodicals
  Holdings: Retains hard copy one year plus current year

Public Library: Innovative Interfaces

Holdings are format-specific. An open-ended dash is used, as well as a broad statement of location.

  Paper: Vol. 133 (1947)-v.221, no. 13 (Sept. 25, 1972)
  Microfilm: Vol. 219 (1972)-v.222 (1972)
  Paper: Vol. 223 (1973)- Current issue on display

Electronic Holdings Statement: Serials Solutions

This entry was in the A-Z list of e-journals. A link to full-text holdings is present; the link also appeared in the OPAC via a Find It button.

  JAMA: the journal of the American Medical Association (0098-7474) from 01/01/1998 to present in American Medical Association journals

DISCUSSION

Benefits Derived from Holdings Statements

According to the survey, serial holdings statements were considered invaluable to reference and public service staff and useful to library patrons, though some believed that librarians probably used them more than patrons did. Many libraries have made a concerted effort to make holdings patron-readable; one special library taught users how to read the statements. Detailed but compressed holdings statements seemed to be easier to read, especially in the case of long runs. In this day and age of remote users, who are geographically distant or unable to visit the library in person, a good indication of a library's holdings was important: the patron is spared the time and expense of additional queries and/or an unnecessary trip to the library. Accurate holdings information was an essential part of the serial record. The belief was expressed that users do not want "serials," they want "serial issues." This was particularly true of supplements and special issues, such as the swimsuit issue of Sports Illustrated.

Close to 50% of libraries responded that their institution's holdings statements reflected their OCLC holdings in WorldCat. The accuracy of these holdings is therefore vital to interlibrary loan operations that rely on a union list. OCLC holdings were not as detailed as those in the OPAC, however, and often relied on broad phrases such as "Library retains two years" in local data records (LDRs) to minimize holdings maintenance. Some libraries did not maintain LDRs but updated their holdings on the serial bibliographic records in OCLC.
Serial holdings statements were essential for inventory control and fiscal accountability. Several libraries maintained holdings only for those titles that were paid subscriptions separate from aggregated databases. Unless serial items were barcoded, as bound periodicals often are, holdings statements were the only way to account for thousands of pieces, making the statements vital to inventory. Libraries felt obliged to those who funded them to account for every issue received; this became important when claiming issues as well. Holdings records were also useful for tracking and payment purposes for such materials as analyzed and monographic series that were cataloged separately.

Disadvantages Associated with Holdings Statements

According to the survey, the major problem with current serial holdings statements appeared to be the manner in which the formatted holdings displayed in the OPAC, regardless of the integrated library system used. Holdings were cryptic, difficult to read, and confusing to patrons. The results of adhering to strict traditional rules were not easily interpreted by patrons, and mixed practices over the years made holdings displays even more confusing and unreliable. Accounting for gaps confounded the display of the holdings. Due to time and staff constraints, holdings were not always updated, becoming inaccurate or incomplete in the catalog. Furthermore, holdings did not necessarily reflect what was available on the shelf but rather what issues had been checked in.

Usability was a big issue. Many libraries expressed concern that patrons do not actually read the holdings statements: they check the publication date or a call number and expect the issue to be on the shelf. Students were not instructed in how to read the holdings statements, and many users did not have a clue as to what the statements meant. It could be frustrating for the user to scroll down to the bottom of the screen to read the holdings. The terminology and punctuation used in statements may not be self-evident to a layperson. Individual library holdings may occasionally be confused with the bibliographic holdings in the MARC 362 field. To compound difficulties, members of consortia had their holdings displayed together in the OPAC. One special library stated that interpreting serials holdings information in the OPAC continued to give users more difficulty than anything else.

Electronic Serial Holdings

Today's shifting focus toward electronic journal content has most likely compelled libraries to evaluate their print serial collections and holdings. Approximately 37% of responding libraries maintained holdings data for electronic serials. Typically, libraries use electronic resource management (ERM) software or a coverage service such as EBSCO's A-Z, Innovative's Content Access Service (CASE), MARCit! by Ex Libris, or Serials Solutions. These supply holdings data for electronic journals through the e-journal portal. Additionally, some libraries use OCLC's e-serials holdings service to keep electronic holdings current in OCLC. These holdings were represented with open-ended statements in the OPAC, as well as in an A-Z list.
When patrons link from the link resolver to a title (registered full text via the publisher), they are directed to the OPAC, and so accurate holdings are essential. However, holdings in many integrated library systems are very difficult or impossible to export to a format usable by the link resolver.

Compliance with ANSI/NISO Z39.71-2006

At present, 33.0% of the libraries that responded, mainly universities and four-year colleges, were compliant with the Z39.71 standard published in 2006. Another 9.6% were in the process of conversion, while 23.0% reported no immediate plans to redo their holdings, and 34.4% did not know whether their holdings statements were formatted according to the latest standard.

Libraries in the process of converting their holdings statements took numerous approaches. Some institutions combined holdings statements for various formats into one inclusive, compact statement. Others switched from level 4 textual holdings (fields 866-868) to pattern and coded field (853/863) pairs for some serials. Holdings statements for inactive and dead-run titles were eliminated, and the textual holdings fields for active titles were also deleted. Conversely, a few libraries switched to using 866 detailed summary holdings and maintained them manually. In retrospective serials conversion projects, older serial records were re-cataloged and detailed holdings listed. Conversion of holdings statements was well received by reference and interlibrary loan personnel. Libraries tried their best to conform, but some were unsure whether it had been done correctly. Several libraries expected to migrate to MFHD in the near future, driven by systems issues rather than by patron usability, and many were induced to redo holdings after migrating to a new ILS. OPAC display, time, and staff were considered in decisions to comply. One four-year college, in particular, regretted the effort its staff went through to convert to the latest standard, and a university librarian was of the opinion that we spend far too much time trying to maintain these holdings statements and correcting them to match the standard.

CONCLUSIONS

The survey examined how libraries construct holdings statements and their efforts to comply with the latest ANSI/NISO standard, Z39.71-2006. The question remains: are holdings statements worth the time and effort to maintain? Wallace (1997) searched 372 online library catalogs and surveyed 80 of them to determine to what degree libraries were providing holdings information in their OPACs [6]. She concluded that reliance upon summary holdings statements to assist users in locating serial volumes was a necessity, rather than a "nuisance," for most academic libraries; no replacement was acceptable.

This study similarly concludes that the majority of responding libraries (see Table 4; 43.9%) agreed that the maintenance of serial holdings statements is of vital importance to library operations and users. It was shown that paraprofessional staff were chiefly responsible for most holdings maintenance in libraries (74.8%), with professional librarians responsible for quality control measures. The time spent updating these holdings varied from about one hour or more daily (20.2%), to 90 minutes each day for one to two days per week (8.2%), to one to two hours per week (10.8%).
Most libraries updated their holdings only when needed (59.5%) or not at all (1.3%); libraries that did not update at all had their holdings updated automatically on check-in by the integrated library system.

Nearly all libraries with reference and interlibrary loan operations are citation-driven. Users with a citation in hand require explicit, detailed holdings in order to locate material, and titles frequently requested through interlibrary loan call for up-to-date holdings information. Detailed holdings were also important in special libraries that possessed unique titles, less common serials, or rare collections. However, with more libraries undergoing a transition from print to electronic serials, it may not be efficacious to continue to maintain holdings for some print titles. Radical changes such as eliminating serials check-in would do away with claiming individual serial issues and consequently curtail holdings data maintenance. Overall, holdings were perceived as an integral part of serials maintenance, and holdings statements were invaluable for the library staff who assist users. Yet these efforts must be balanced against time constraints and the overall usefulness of the information.

FUTURE IMPLICATIONS

The future of holdings has been pondered for quite some time. Ten years ago, Frieda Rosenberg (1998) spoke at a North American Serials Interest Group (NASIG) workshop when ANSI/NISO Standard Z39.71-1999 had recently been approved and was soon to be released [7]. She noted that there were many frontiers in holdings: web linking between citations in remote databases and individual local holdings records was already being offered by some commercial companies, and in terms of public service, user-friendly holdings notes regarding format and title changes were being added to local holdings records. At the time, libraries wanted more consistency in holdings, a better model, and better system implementations. This is still the case. With holdings updated automatically on serials check-in, OPAC display issues remain a primary concern. The new standard accommodates older practices from previous standards. Of the 236 libraries surveyed, a small percentage were in the process of converting their holdings to meet the new standard and MARC21 format, while a third had no plans for conversion. Many have postponed conversion due to system limitations, or until integrated library systems improve the way holdings data are displayed. Ultimately, holdings were deemed worthwhile from a librarian's viewpoint.

NOTES

1. ANSI/NISO Z39.71-2006: Holdings statements for bibliographic items. (2006). National Information Standards Series. Bethesda, MD: NISO Press.
2. Ellen C. Rappaport. (2000). What's new in Z39.71-1999? Technical Services Law Librarian, 25(1), 29-31.
3. Marjorie E. Bloss, Helen E. Gbala (workshop leaders), and Kevin M. Randall (recorder). (2001). Formatting holdings statements according to the NISO Standard Z39.71-1999. The Serials Librarian, 40, 261-266.
4. Audrey N. Grosch. (1977). Theory and design of serial holding statements in computer-based serials systems. The Serials Librarian, 1, 341-352.
5. MARC21 concise holdings: Holdings data, general information. Retrieved April 2, 2008, from http://www.loc.gov/marc/holdings/echdgenr.html
6. Patricia M. Wallace. (1997). Serial holdings statements: A necessity or a nuisance? Technical Services Quarterly, 14, 11-24.
7. Frieda Rosenberg. (June 1998).
Do holdings have a future? NASIG workshop. Retrieved March 24, 2008, from http://www.lib.unc.edu/cat/mfh/mfhfuture.html

work_4ebfg5i34rhinj5ehzokl47c2a ---- Retrieving the Dewey Decimal Classification from other databases: a catalog cleanup project

Stefano Bargioni, Michele Caputo, Alberto Gambardella, Luigi Gentile

(Italian translation by the author.)

1 Introduction

The Library of the Pontificia Università della Santa Croce [1] is a research library belonging to the URBE network, the Unione Romana Biblioteche Ecclesiastiche [2]. It currently holds about 167,000 volumes, corresponding to 145,000 bibliographic records cataloged in MARC21 format. Three Integrated Library Systems (ILS) have been used in succession to manage the library: Aleph 300, Amicus 3.5.4 and the current Koha [3] 3.2.7. Authority records were introduced at the same time as the adoption of the highly productive open source ILS Koha, whose flexibility has also opened new avenues of operational experimentation that would ordinarily not be feasible with a commercial ILS.

In order to offer users more tools for semantic catalog searching, and bearing in mind that subject indexing based on the Nuovo Soggettario Thesaurus of the Biblioteca nazionale centrale di Firenze is a recent activity, it was decided to develop the potential of the Dewey classification [4], already partially adopted in the library for about ten years and assigned to roughly 25% of the documents held.

Thus arose the idea of increasing the presence of the Dewey classification in the bibliographic records by importing it from other databases [5], using the ISBN [6] as the key for retrieving the missing classifications. We first identified sources (databases) that significantly met our needs, both qualitatively and quantitatively; experience with copy cataloging, one of Koha's strong points, was fundamental in this regard. Once the potential sources, both national and international, had been chosen, methods for accessing them programmatically were identified. The diverse ways in which the various institutions publish their data made it necessary to diversify the query methods in order to access the information systematically, ranging from the most modern case, OCLC's Classify [7], an experimental web service specifically devoted to classification, to less straightforward cases in which one must resort to HTML pages.

Notes: [1] http://www.pusc.it/bib. [2] http://www.urbe.it. [3] http://koha-community.org. [4] http://dewey.info. [5] Importing data from other bibliographic sources is justified by the "principle of sharing," felt and put into practice by public catalogs almost from the beginning.
This principle underlies the exchange of information via OPACs, Z39.50, web interfaces and so on, and also aims at mutual comparison and control of the records and at identifying the library that is the source of the information, ensured in MARC21, for example, by field 035. The imports were carried out in compliance with any conditions or recommendations stated on the web pages of the sites queried. Commercial use of the retrieved information could be a different matter. [6] http://www.isbn.org/standards/home/index.asp. [7] http://classify.oclc.org.

To control the quality of the Dewey classifications obtained, a dedicated algorithm was created, described below in the section "Quality control". The search and import process also had to be analyzed in terms of the load it puts on both the source system and the destination system: servers cannot be queried at an excessive rate, and for this reason some of them expressly publish recommendations for the software, called crawlers or web robots, that queries them.

2 Identifying the records to modify

The catalog records to enrich are those that have an ISBN (tag 020) but lack a Dewey classification (tag 082). In Koha they can be identified through an SQL query (see Listing 1), specific to the MySQL database, applied to the marcxml field [8] of the biblioitems table [9].

Listing 1: Query for selecting the records in Koha.

    SELECT biblionumber, listaISBN
    FROM biblioitems
    WHERE isbn_presente
      AND dewey_assente
      AND lingua_008 = '...'

Since this is not an index-based search, identification takes place by scanning the database record by record. This aspect of the project therefore depends on the computing power of the server hosting the ILS.

Notes: [8] The biblioitems.marcxml field contains the representation of the bibliographic record in the MARCXML format, http://www.loc.gov/standards/marcxml/, http://en.wikipedia.org/wiki/MARC_standards#MARCXML. [9] The main elements of the query are described in Table 9 in the Appendix.
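As an illustration of this selection step, the sketch below spells out the placeholders of Listing 1 using the ExtractValue expressions detailed in Table 9 of the Appendix. The project's actual programs were written in Perl with DBI (see "Software used" below); this Python version, with the mysql-connector-python driver and made-up connection details, is only a sketch of the same query.

    import mysql.connector  # assumed driver; the original code used Perl DBI

    SQL = """
    SELECT biblionumber,
           ExtractValue(marcxml, '//datafield[@tag=020]/subfield[@code="a"]') AS isbns
    FROM   biblioitems
    WHERE  ExtractValue(marcxml, 'count(//datafield[@tag=020]/subfield[@code="a"])>0')
      AND  ExtractValue(marcxml, 'count(//datafield[@tag=082]/subfield[@code="a"])=0')
      AND  SUBSTR(ExtractValue(marcxml, '//controlfield[@tag=008]'), 36, 3) = %s
    """

    def records_without_dewey(lang: str):
        """Yield (biblionumber, ISBN list) for records with an ISBN but no 082."""
        cnx = mysql.connector.connect(user="koha", password="...", database="koha")
        cur = cnx.cursor()
        cur.execute(SQL, (lang,))
        for biblionumber, isbns in cur:
            yield biblionumber, isbns.split()  # occurrences are space-separated
        cnx.close()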
Inoltre in diversi casi la ricerca è stata limitata alla lingua prevalente della fonte interrogata, sia per evitare un eccessivo numero di ricerche, sia perché ritenuta più attendibile. Tra le lingue 1 Classify Classify di OCLC 2 LC Library of Congress 3 BNF Bibliothèque nationale de France 4 DNB Deutsche Nationalbibliothek 5 BNCF Biblioteca Nazionale Centrale di Firenze 6 BNCR Biblioteca Nazionale Centrale di Roma 7 BNB British National Bibliography Tabella 1: Fonti di classificazione Dewey interrogate. TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) presenti in catalogo, lo spagnolo non è stato trattato, in mancanza di basi di dati da noi ritenute sufficientemente significative allo scopo. Il metodo adottato non consente di effettuare confronti tra le diverse fonti a parità di condizioni, ma permette pur sempre un’analisi statistica dell’uso della classificazione Dewey nelle diverse fonti, come si vedrà in seguito. La Tabella rappresentata in figura 1 mostra l’indirizzo, il tipo di dato restituito, il tipo di servizio contattato per ogni fonte e la lingua interessata: Le fonti di tipo diverso da quelle web forniscono gli Figura 1: Caratteristiche delle fonti di classificazione Dewey interrogate. estremi della connessione nelle rispettive pagine di spiegazione del servizio. Per le fonti di tipo web, invece, connessione e interrogazio- ne vanno quasi sempre dedotte empiricamente, in genere a partire dalla schermata di interrogazione avanzata del catalogo. Per poter individuare i parametri da inviare, compreso quello dell’ISBN, si può procedere in uno dei modi elencati in Appendice. Sempre nel caso di pagine web, la tecnica adottata per l’estrazio- ne del dato è particolarmente specifica. Si deve applicare quello che TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey comunemente viene denominato web scraping,10 screen scraping o in generale data scraping. Occorre in sostanza capire se si dispone di un metodo per indivi- duare ed estrarre il dato di interesse dall’interno del codice HTML ottenuto, operazione che gli altri tipi di risposte rendono più facile e standard visto che forniscono dati strutturati. Il Web 2.0 e ancor più l’incalzante web dei linked data fanno auspicare che le fonti di dati offrano non solo interfacce web, essenzialmente destinate alla fruizione dell’uomo, ma soprattutto interfacce con risposte standard strutturate, fruibili da altre macchine e stabili nel tempo. La logica utilizzata nei programmi di interrogazione delle fonti dati è riconducibile all’algoritmo rappresentato in figura 2. Figura 2: Rappresentazione della logica utilizzata nei programmi di interrogazione delle fonti dati. Fa eccezione il caso di Classify, come detto, per il quale il passo di ”interrogazione della fonte dati per l’ISBN corrente” deve essere seguito da istruzioni specifiche (figura 3.). Figura 3: Rappresentazione dell’eccezione alla logica utilizzata nei program- mi di interrogazione delle fonti dati da Classify. 10http://en.wikipedia.org/wiki/Web_scraping. http://en.wikipedia.org/wiki/Web_scraping TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) Il paragrafo 3 dell’Appendice riporta esempi per ognuno dei tre tipi di dati ottenuti come risposta: XML, MARC e HTML. La risposta di Classify11 è tipicamente di quattro tipi, come da tabella 2. Response code Significato 2 ISBN corrispondente a una singola opera 4 ISBN corrispondente a più opere 101 ISBN errato 102 ISBN non trovato Tabella 2: Tipi di risposte di Classify. 
Nel caso di risposta di ”ISBN corrispondente a più opere”, Clas- sify12 fornisce un elenco di identificatori OCLC# delle relative ope- re. È stata preferita la prima di queste, andando a reperire il re- cord descrittivo tramite il suo OCLC# con un’altra interrogazio- ne del tipo: http://classify.oclc.org/classify2/Classify?summary= false&swid=OCLC#, che ovviamente ha response code 2, singola opera. La risposta di Classify per singola opera (se ne veda un esem- pio al paragrafo 1 dell’Appendice) riporta sia le aggregazioni delle classificazioni Dewey e LCC assegnate all’opera dai numerosi catalo- ghi che contribuiscono a OCLC, sia un elenco di edizioni, corredate dalla classificazione. È parso preferibile importare la classificazione 11Le API di Classify sono descritte in http://classify.oclc.org/classify2/api_docs/ index.html e possono essere provate tramite il Classify API Explorer alla pagina http://classify.oclc.org/classify2/api_docs/classify.html. 12Le aggregazioni in Classify avvengono per applicazione di FRBR. Alla pa- gina http://www.oclc.org/research/activities/classify.html (al 21.1.2013) si affer- ma: ”Bibliographic records are grouped using the OCLC FRBR Work-Set algorithm to form a work-level summary of the class numbers and subject headings assigned to a work. You can retrieve a summary by ISBN, ISSN, UPC, OCLC number, author/title, or subject heading”. http://classify.oclc.org/classify2/Classify?summary=false&swid=OCLC# http://classify.oclc.org/classify2/Classify?summary=false&swid=OCLC# http://classify.oclc.org/classify2/api_docs/index.html http://classify.oclc.org/classify2/api_docs/index.html http://classify.oclc.org/classify2/api_docs/classify.html http://www.oclc.org/research/activities/classify.html http://www.oclc.org/research/activities/frbralgorithm.html TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey della prima edizione in elenco, perché rispetto alle altre era spesso più completa. Le fonti Z39.50 richiedono sostanzialmente di estrarre il valore del tag della classificazione Dewey, secondo le regole del relativo formato MARC, come da Tabella 4. sottocampo sottocampoo Formato MARC tag del codice dell’edizione MARC21 082 a 2 InterMARC o UNIMARC 676 a v Tabella 3: Tag della classificazione Dewey in alcuni dialetti MARC. 4 Il ”controllo di qualità” Prima del progetto, il catalogo era popolato da classificazioni Dewey riferentesi alle edizioni dalla 19 alla 23. La scelta di non introdurre né classificazioni di tipo ridotto né classificazioni di edizioni Dewey inferiori alla 19 ha implicato di dover rinunciare a numerose clas- sificazioni trovate, come riportato nelle statistiche della tabella 7 a pagina 14. È parso opportuno privilegiare la qualità alla quantità per ottenere un arricchimento più possibile allineato alla politica di catalogazione. In concreto, oltre a limitare l’edizione alla 19 o superiori, sono state scartate classificazioni con indicatori 1 e 2 di- versi dal ”0 0” e ”0 4”.13 Sono state eliminate anche le classificazioni contenenti caratteri non numerici o mancanti di edizione. Infine le classificazioni sono state normalizzate prima di essere inserite nel record. 13Secondo il MARC21, il primo indicatore del campo 082 con valore ”0” indica uso dell’edizione completa della Dewey, il secondo indicatore con valore ”0” indi- ca Dewey assegnata dalla Library of Congress mentre il valore ”4” corrisponde a notazione assegnata da una agenzia diversa dalla Library of Congress. TR AD UZ IO N E JLIS.it. Vol. 4, n. 
2 (Luglio/July 2013) 5 Il tag 035 Contestualmente alla modifica del record, è parso opportuno tenere traccia degli estremi del record da cui è stata tratta la classificazione Dewey importata, tramite l’utilizzo del tag 035 del MARC21, come nel seguente esempio: Listing 2: Esempio di utilizzo del tag 035 di MARC21. 00872nam a2200265 i 4500 001 000000035650 003 IT-RoPUS 005 20121121122621.0 008 041027r19851982xxk u000 u eng c 020 $a 0198247761 035 $a (OCoLC)007946090 040 $a IT-RoPUS $b ita 082 04 $a 111.85 $2 19 100 1 $a Savile, Anthony. $9 70779 245 14 $a The test of time : $b an essay in philosophical aesthetics / $c Anthony Savile. ... Nel caso di fonte non MARC21 o comunque senza MARC Organiza- tion Code,14 è stato scelto di assegnare un codice più logico possibile, come da Tabella 4 nella pagina seguente. L’ID è stato estratto dal record in posizioni diverse caso per caso. Per le fonti Z39.50 si trova nel tag 001, mentre per la Library of Con- gress si ricorre al tag 010. Anche Classify lo riporta espressamente nel record XML, mentre il reperimento dai record in formato HTML è particolarmente complesso. 14http://www.loc.gov/marc/organizations/. http://www.loc.gov/marc/organizations/ TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey Tabella 4: Codici istituzione per lo 035. 1 Classify di OCLC OCoLC ufficiale 2 Library of Congress DLC ufficiale 3 Bibliothèque nationale de France FR-PaBFM ufficiale 4 Deutsche Nationalbibliothek DE-101 ufficialea 5 Biblioteca Nazionale Centrale di Firenze BNCF non ufficiale 6 Biblioteca Nazionale Centrale di Roma BNCR non ufficiale 7 British National Bibliography BNB non ufficiale a http://dispatch.opac.d-nb.de/DB=1.2/LNG=EN. Questa scelta consente di collegare il record bibliografico a quello di un catalogo esterno, utile per costruire un link di interesse sia a livello di OPAC (figura 4 a pagina 12) che di linked data. Il link nell’OPAC viene costruito, per ogni occorrenza del tag 035, sulla base dei link della tabella 5 a fronte. La permanenza di alcuni è certa (permalink). Negli altri casi, il link, di natura molto più instabile, può essere costruito ricorrendo alla vista di ogni singolo record offerta dal catalogo. 6 Attese durante la ricerca sulle fonti Come accennato nell’Introduzione, un uso continuo, facilmente ot- tenibile con interrogazioni automatizzate, può gravare sul server interrogato. La lettura di pagine web di tipo ”Terms and Conditions” permette di regolare le condizioni di utilizzo delle fonti. Ad esempio, la Library of Congress richiede esplicitamente15 che i crawler utiliz- zino il server Z39.50 con un ritmo inferiore alle 10 interrogazioni al minuto. Il server Z39.50 della Bibliothèque nationale de France chiu- de il collegamento dopo la decima interrogazione. Il programma 15http://lccn.loc.gov/lccnperm-faq.html#n12. http://dispatch.opac.d-nb.de/DB=1.2/LNG=EN http://lccn.loc.gov/lccnperm-faq.html#n12 TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) Tabella 5: Costruzione di link nell’OPAC a partire da un’occorrenza di tag 035. Classify di OCLC - World- Cat http://www.worldcat.org/search?q=no%3AID permalinka Library of Congress http://lccn.loc.gov/ID permalinkb Bibliothèque nationale de France http://catalogue.bnf.fr/servlet/biblio ?idNoeud=1&SN1=0&SN2=0&host=catalogue& ID=ID Deutsche Nationalbiblio- thek http://d-nb.info/ID permalinkc Biblioteca Nazionale Cen- trale di Firenze http://opac.bncf.firenze.sbn.it/opac/ controller.jsp? 
action=notizia_view¬izia_idn=ID Biblioteca Nazionale Cen- trale di Roma http://193.206.215.17/BVE/ricercaEsperta. php?dove=esperta &cerca=Avvia+la+ricerca& textexpert=di%3DID British National Bibliogra- phy http://search.bl.uk/primo_library/libweb /action/search.do?vid=BLBNB&fn =search&vl%28freeText0%29=ID a http://www.oclc.org/worldcatorg/linking/how.htm#oclc-number. b http://lccn.loc.gov/lccnperm-faq.html. c Dedotto dalla visualizzazione di un singolo record al termine di una ricerca qualunque. http://www.oclc.org/worldcatorg/linking/how.htm#oclc-number http://lccn.loc.gov/lccnperm-faq.html TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey Figura 4: Vista di record nell’OPAC, arricchito con Dewey e link prelevati da DNB. deve pertanto riaprirlo con la stessa frequenza. Il sito della Bibliote- ca nazionale centrale di Firenze non si presta ad essere consultato senza pause, dato che sembra sovraccaricarsi quasi subito. È anche opportuno verificare, per le fonti interrogate tramite pro- tocollo http, se vi sono indicazioni ai crawler nel file /robots.txt, dove a volte si trovano restrizioni anche per la frequenza di acces- so.16 Pertanto per tutte le fonti sono state definite attese dai 4 ai 6 secondi tra le interrogazioni. Le pause hanno permesso anche di non sovraccaricare il nostro catalogo. Infatti ad ogni modifica di record, il motore di indicizzazione Zebra17 usato da Koha e il motore di 16http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_ directive. 17http://www.indexdata.dk/zebra. http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive http://www.indexdata.dk/zebra TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) 1 numero di sistema ISBN ISBN non trovato 2 numero di sistema ISBN ISBN errato 3 numero di sistema ISBN ISBN relativo a più opere 4 numero di sistema ISBN Dewey non trovata 5 numero di sistema ISBN Classificazione ed edizione trovate Non soddisfacenti 6 numero di sistema ISBN Classificazione ed edizione trovate Record modificato Tabella 6: Tipi di record di log. Il tipo 2 e 3 sono relativi solo a Classify. ricerca per liste sviluppato in proprio,18 intervengono per aggiornare i propri indici e possono rallentare la consultazione dell’OPAC e il lavoro ordinario. Un aspetto da valutare in funzione della potenza di calcolo a disposizione. Il ritmo imposto dalle pause suddette di fatto prolunga il processo di importazione per ore se non per giorni, in funzione del numero di ISBN da elaborare. Questo può comportare degli adattamenti del programma, per esempio parametrizzandolo affinché lavori solo in certe fasce orarie. 7 Log Il processo di importazione è stato monitorato al fine di raccogliere statistiche sul lavoro svolto. Sono stati registrati i tipi di record di log descritti nella tabella 6. 18Koha non dispone al momento di ricerche a scorrimento di indici, note an- che come ricerche browse. È stato possibile aggiungere questa funzionalità al- la nostra installazione di Koha tramite un applicativo basato su Solr (http:// lucene.apache.org/solr) e sviluppato dalla nostra biblioteca. Questo browse è stato presentato all’incontro internazionale di utenti Koha tenutosi ad Edimbur- go a giugno 2012 (http://wiki.koha-community.org/wiki/KohaCon12_Schedule# Adding_browse_to_Koha_using_Solr_.2815-20_min.29) e verrà integrato in succes- sive versioni di Koha, in particolare quando Solr sarà in alternativa a Zebra o lo sostituirà. 
http://lucene.apache.org/solr http://lucene.apache.org/solr http://wiki.koha-community.org/wiki/KohaCon12_Schedule#Adding_browse_to_Koha_using_Solr_.2815-20_min.29 http://wiki.koha-community.org/wiki/KohaCon12_Schedule#Adding_browse_to_Koha_using_Solr_.2815-20_min.29 TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey 8 Statistiche I log generati permettono di costruire le seguenti tabelle e confronta- re le diverse fonti sotto alcuni aspetti. Record Record ISBN non Dewey non Dewey Più opere per ISBN Fonte Lingua esaminati modificati trovati trovate scartate stesso ISBN errato Classify tutte 42387 10267 5321 6607 20059 8240 133 LC tutte 31999 1252 21195 8562 1011 BNF tutte 30903 2253 21327 7268 55 DNB ger 4193 163 3867 163 0 BNCF ita 12017 4088 3643 3542 744 BNCR ita 7549 1515 3003 2978 53 BNB eng 6215 193 5449 55 518 Totale 19710 Tabella 7: Conteggi. Fonte Campioni Ed. 19 (%) Ed. 20 (%) Ed. 21 (%) Ed. 22 (%) Ed. 23 (%) Classify 10267 19,86 23,03 36,18 20,13 0,79 LC 1231 28,11 25,83 24,29 19,58 2,19 BNF 2253 0,00 0,09 0,36 99,56 0,00 DNB 163 0,00 0,00 0,00 100,00 0,00 BNCF 4088 9,10 23,46 55,04 12,40 0,00 BNCR 1515 2,38 9,70 87,92 0,00 0,00 BNB 193 16,58 19,69 26,42 28,50 8,81 Totale 19710 Tabella 8: Distribuzione delle edizioni, relativa alle classificazioni reperite. La tabella 8 è riprodotta nei grafici raccolti nella figura 5 nella pagina successiva, uno per fonte. Si notano alcune scelte precise, quali BNF, DNB e BNCR, di pri- vilegiare una sola edizione. D’altra parte, visto quanto è riportato per Classify, mediamente chi ha intrapreso l’uso della classificazione Dewey da tempo, non sembra aver provveduto ad un aggiornamen- to delle notazioni Dewey nel catalogo, certamente per la complessità TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) Figura 5: Distribuzione delle edizioni. dell’operazione. Infine si nota la (ancora) scarsa diffusione dell’edi- zione 23. Come indicato in precedenza, il catalogo si è arricchito di 19710 nuove classificazioni Dewey in altrettanti record bibliografici. L’aumento è stato del 47,8%, dato che in precedenza i record con tag 082 erano 41255. La distribuzione attuale delle classificazioni Dewey, mostrata nella figura 6 nella pagina seguente, traccia un profilo del posseduto che riflette le aree di interesse delle facoltà e di crescita della biblioteca. La distribuzione delle edizioni Dewey in catalogo è rappresentata dalla figura 7 nella pagina successiva. L’assenza di edizione per un numero significativo di record bibliografici è un caso di disomogeneità catalografica per la cui bonifica si potrebbe utilizzare un metodo molto simile a quello illustrato nel presente lavoro. TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey Figura 6: Distribuzione del posseduto secondo le divisioni della classifica- zione Dewey. Figura 7: Distribuzione delle edizioni della classificazione Dewey. TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) 9 L’indice Dewey nell’OPAC Tramite gli indici a scorrimento, mostrati con l’esempio della figura 8 e citati in precedenza, è possibile offrire nell’OPAC un percorso di ricerca semantico basato sulla classificazione Dewey. I conteggi delle ricerche effettuate dall’utenza mostrano che l’indice di maggior utilizzo è proprio quello della classificazione Dewey, superiore anche a quello dell’indice dei nomi, peraltro particolarmente importante per i rinvii dei numerosi autori antichi e dei papi. Figura 8: L’indice a scorrimento della classificazione Dewey in Koha. 
TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey 10 Software utilizzato I sette programmi di interrogazione sono stati scritti nel linguaggio Perl, ricorrendo alle API di Koha e alle seguenti librerie:19 LWP per le connessioni HTTP, ZOOM per le connessioni Z39.50, DBI per le connessioni al database MySQL, XML::XPath per il trattamento dei dati XML, WWW::Scraper per il trattamento dei dati HTML, MARC::Record per il trattamento dei record MARC. 11 Conclusioni Il presente lavoro ha permesso di comprendere il valore e le proble- matiche del reperimento in rete di informazione che può concorrere a migliorare cataloghi bibliografici. Ordinariamente si considera di interesse la catalogazione derivata per ottenere l’intero record, ma – attraverso identificativi univoci quali l’ISBN o altri – è possi- bile reperire informazione parziale o ”atomica” con cui si possono raggiungere diversi scopi: • arricchire il catalogo in modo statico, come nel caso presentato; • arricchire l’OPAC in modo dinamico, recuperando uno o più dati al momento della visualizzazione di un record; • aumentare la navigabilità per una migliore fruizione da parte dell’utente dell’OPAC; • contribuire a bonificare situazioni pregresse; • effettuare controlli di qualità; • offrire strumenti di supporto al lavoro di catalogazione; 19Ogni libreria è documentata e reperibile in http://search.cpan.org. http://search.cpan.org TR AD UZ IO N E JLIS.it. Vol. 4, n. 2 (Luglio/July 2013) • aumentare il numero di identificativi univoci presenti in cata- logo; • effettuare confronti tra basi di dati. 12 Appendice 12.1 Elementi della query per la selezione dei record senza Dewey biblionumber il numero di sistema del record bibliografico listaISBN ExtractValue(marcxml,’//datafield[@tag=020]/subfield [@code=a]’) si tratta dell’elenco delle occorrenze del sot- tocampo $a del tag 020, separate da spazio; normalmente l’occorrenza è unica isbn_presente ExtractValue(marcxml,’count(//datafield[@tag=020] /subfield[@code=a])>0’) almeno una occorrenza di 020$a dewey_assente ExtractValue(marcxml,’count(//datafield [@tag=082]/subfield[@code=a])=0’) nessuna occorrenza di 082$a lingua_008 substr(ExtractValue(marcxml,’//controlfield[\@tag=\008 \]’),36,3) = ’codice_lingua’ Tabella 9: Elementi principali della query per la selezione dei record bibliografici da trattare. La funzione ExtractValue,20 presente in MySQL 5.1.5 o superiori, permette l’interrogazione di dati XML, specificando come parametri il campo da esaminare e una espressione Xpath.21 20http://dev.mysql.com/doc/refman/5.1/en/xml-functions.html. 21http://it.wikipedia.org/wiki/XPath. http://dev.mysql.com/doc/refman/5.1/en/xml-functions.html http://it.wikipedia.org/wiki/XPath TR AD UZ IO N E S. Bargioni, Recupero della classificazione decimale Dewey 12.2 Parametri per le ricerche di tipo web Per individuare i parametri con cui comporre l’url della ricerca, compreso quello dell’ISBN, si può procedere in uno dei seguenti modi: • lanciare la query e notare l’url della risposta; se questo non contiene i parametri, cioè nel caso di form con method=post, cambiare il parametro method al valore get tramite ”Inspect Element”, presente in diversi browser premendo il tasto destro sulla form, e lanciare l’interrogazione; • oppure analizzare la richiesta http inoltrata dall’interroga- zione tramite un plugin per l’analisi del traffico o apposita funzionalità del browser. 
12.3 Examples of responses

An example of an XML response from Classify22 is the following. (The XML markup was lost in extraction; only the surviving character data is reproduced, namely the OCLC number, the query ISBN, and a few attribute values.)

Listing 3: XML (markup lost; recoverable values only)
014271167 ... 2204022659 ...
[... omissis ...]

An example of a Z39.50 response23 (MARC21), in its human-readable representation:

Listing 4: MARC21
00932cam 2200253 a 4500
001    500315
005    20050929180451.0
008    851021s1986 nyua 000 0 eng
035    $9 (DLC) 85073338
010    $a 85073338
020    $a 0874472466 (pbk.) : $c $8.95
040    $a DLC $c DLC $d DLC
050 00 $a LB2353.57 $b .A16 1986
082 00 $a 371.2/6 $2 19
245 00 $a 10 SATs : $b the actual and [...] prepare for it.
250    $a 2nd ed.
260    $a New York : $b College Entrance Examination Board : $b ...
300    $a 304 p. : $b ill. ; $c 28 cm.
[... omissis ...]

An example of HTML code:24 (the markup was largely lost in extraction; only the recoverable content is shown)

Listing 5: HTML (recoverable content only)
DNB, Katalog der Deutschen Nationalbibliothek
[... omissis ...]
Link zu diesem Datensatz  http://d-nb.info/977758214
[... omissis ...]
DDC-Notation  231.6 [DDC22ger]
[... omissis ...]

Its rendering in the browser is shown in Figure 9.

Figure 9: Result of an ISBN search on the catalogue of the Deutsche Nationalbibliothek. [figure omitted]

22 http://classify.oclc.org/classify2/Classify?summary=false&isbn=2204022659
23 From the Library of Congress, lx2.loc.gov:210/LCDB, find @attr 1=7 0874472466.
24 https://portal.dnb.de/opac.htm?query=isbn%3D9783525563427&method=simpleSearch

For the purposes of correct indexing, readers are invited to cite exclusively the English-language text, the only version that carries the page numbers, the abstract, the keywords and the dates of the editorial process.

Bargioni, S., M. Caputo, A. Gambardella, et al. "Recupero della classificazione decimale Dewey da altre basi di dati: un progetto di bonifica del catalogo". JLIS.it Vol. 4, n. 2 (Luglio/July 2013): Art. #8766, p. 1-25. DOI: 10.4403/jlis.it-8766. http://dx.doi.org/10.4403/jlis.it-8766

----

Türk Kütüphaneciliği 9,1 (1995), 62-64

British Library Document Supply Services (İngiliz Kütüphanesi Belge Sağlama Hizmetleri)

Translated by Yeşim Cömertoğlu (British Council)

Introduction

BLDSC (the British Library Document Supply Centre) is one of the leading organizations supplying published information to users, not only in the United Kingdom but throughout the world. Over thirty years of development it has built up a very comprehensive research-level collection, with which it met 3.3 million requests in 1993/94. The collection comprises 242,000 journals (47,000 of them current), 2.9 million books, over 4 million reports, 550,000 doctoral theses, 330,000 conference proceedings and roughly 550,000 translations. BLDSC was founded in 1962.
Its primary purpose is best expressed by its name at the time, the "National Lending Library for Science and Technology". Indeed, requests in science and technology still make up 70% of all requests. However, as supplying photocopies gradually became more important than lending, a change of name turned yesterday's "Lending Library" into today's "Document Supply Centre".

BLDSC and the International Information Market

Thirty years ago BLDSC's function was to meet the demand for scientific and technical information within the United Kingdom. More recently, users from other countries have steadily gained importance; as a result, over the last 20 years sales to other countries have risen from one tenth to one quarter of all sales. This growth has been achieved not only by maintaining a comprehensive collection but also by continually modernizing the processing of requests.

One of BLDSC's main strategies for making its services more widely available outside the United Kingdom has been to reach a worldwide agreement with the British Council. Having an agent in key countries around the world means that BLDSC can respond more effectively to its users' needs. In Turkey, for example, with the British Council acting as our agent in Ankara and Istanbul, we can offer our users all the advantages of a locally supported service, both in supplying information and in allowing payment in local currency. The British Council also registers new members.

Services Offered to International Users

Of the services developed to meet users' needs, the photocopy service is currently the most popular. Requests for photocopies of documents are sent to BLDSC at Boston Spa by post, fax or automated request transmission, and a growing number of our users now use our own ARTTel system. For 89% of the requests for documents held in stock, the requested photocopy is dispatched to the user by air mail within 48 hours of the order. With our Urgent Action Service, which offers the options of faxing requests or sending them by courier, the response time drops to just two hours. Although this service was introduced only recently, in 1993/94 BLDSC passed 155,000 requests on to the libraries backing it, and thus managed to satisfy 93% of the 3.7 million requests received.

Both the standard and the urgent services are offered to BLDSC's registered customers, who continue to receive regular updates on all of BLDSC's services and products. For those who do not expect to use the service often enough to register, photocopies can be requested through our Lexicon Easy Order Service. To save time, instead of filling in individual forms, users of the Lexicon service can submit their requests as a list of bibliographic references by post, fax or telephone. Under the Lexicon system, orders of 15 or more copies are sent by international courier at no extra charge.

For users subject to copyright obligations there is also our Copyright Clearance Service: for a small premium, users of any of the services described above are freed from almost all copyright restrictions.
Finally, borrowing books from our collection is still among BLDSC's services, and users from other countries must complete a special registration to benefit from it.

Collection Information and Current Awareness Services

Besides sending photocopies of documents on request, BLDSC also offers a range of current awareness services. Catalogues and lists of all the major collections are published regularly at title level and offered to users in microfiche and electronic formats. To help researchers follow developments in their fields closely, BLDSC also offers a CD-ROM-based ETOC (Electronic Table of Contents) service. There are two ETOC products. Inside Information, published monthly, gives article-level access to the 10,000 most requested current serials (roughly one million articles). Inside Conferences, published four times a year, lists the titles of some 500,000 conference papers each year. In addition, by subscribing to the Journal Contents Page Service, users can regularly receive photocopies of the contents pages of any of the 47,000 journals currently taken.

New Developments

Many new services have been developed, such as ordering through the growing number of database networks like DIALOG and OCLC. Through a recent agreement with OCLC, users with online access can for the first time reach BLDSC's journals and its grey literature (hard-to-obtain report material). New features are added to our services almost daily, such as users sending their requests by electronic mail. A fax-back system, which will be very useful to customers for whom urgency matters, is also being developed alongside the one already in use.

One of the most exciting developments is the project, currently under way, of delivering documents electronically to users' computers. This is promising both for the development of automation systems and for new agreements with agencies on copyright clearance. One outstanding problem is the wide-ranging competition among international telecommunications services, which means that some time must pass before the service can be brought to users. Trials between BLDSC and selected organizations are planned.

Providing and developing these services means major investment. Constantly seeking ways to serve its 15,000 customers worldwide more efficiently and effectively, BLDSC aims to maintain its international leadership in the supply of information and documents.
Contact addresses:

Pınar Yamaç, British Council Library, Kırlangıç Sokak 9, Gaziosmanpaşa 06700 - Ankara. Tel: 312-4686201, Fax: 312-4276182
Meral Kırkalı, British Council Library, PK 436, Beyoğlu, İstanbul. Tel: 212-2490574, Fax: 212-2437682
Roderic Vassie, Marketing Unit, BLDSC, and Helen Parnaby, Customer Services, British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, UK. Tel: 44 937 546243, Fax: 44 937 546333

----

Effects of tropospheric ozone pollution on net primary productivity and carbon storage in terrestrial ecosystems of China

Wei Ren, Hanqin Tian, Mingliang Liu, Chi Zhang, Guangsheng Chen, Shufen Pan, Benjamin Felzer, and Xiaofeng Xu
(School of Forestry and Wildlife Sciences, Auburn University, Auburn, Alabama, USA; B. Felzer: Ecosystem Center, Marine Biological Laboratory, Woods Hole, Massachusetts, USA)

Received 8 February 2007; revised 8 May 2007; accepted 15 October 2007; published 17 November 2007.

[1] We investigated the potential effects of elevated ozone (O3), along with climate variability, increasing CO2, and land use change, on net primary productivity (NPP) and carbon storage in China's terrestrial ecosystems for the period 1961-2000 with a process-based Dynamic Land Ecosystem Model (DLEM) forced by gridded data of historical tropospheric O3 and other environmental factors. The simulated results showed that elevated O3 could result in a mean 4.5% reduction in NPP and a 0.9% reduction in total carbon storage nationwide from 1961 to 2000. The reduction in carbon storage varied from 0.1 Tg C to 312 Tg C (a rate of decrease ranging from 0.2% to 6.9%) among plant functional types. The effects of tropospheric O3 on NPP were strongest in east-central China, and significant reductions in NPP occurred in northeastern and central China, where a large proportion of cropland is distributed. The O3 effects on carbon fluxes and storage depend on other environmental factors, so the direct and indirect effects of O3, as well as its interactive effects with other environmental factors, should be taken into account in order to assess the regional carbon budget in China accurately. The results showed that the adverse influence of increasing O3 concentrations across China on NPP could become an important disturbance to carbon storage in the near future, and that improving air quality in China could enhance the capability of China's terrestrial ecosystems to sequester more atmospheric CO2. Our estimates of O3 impacts on NPP and carbon storage in China must, however, be used with caution because of the limitations of the historical tropospheric O3 data and other uncertainties associated with model parameters and field experiments.

Citation: Ren, W., H. Tian, M. Liu, C. Zhang, G. Chen, S. Pan, B. Felzer, and X. Xu (2007), Effects of tropospheric ozone pollution on net primary productivity and carbon storage in terrestrial ecosystems of China, J. Geophys. Res., 112, D22S09, doi:10.1029/2007JD008521.

1. Introduction

[2] The tropospheric ozone (O3) level has been increasing across a range of scales: local, national, continental, and even global [e.g., Akimoto, 2003], and tropospheric O3 levels might increase substantially in the future [Streets and Waldhoff, 2000]. Advection from the Asian continent increases pollutant levels over the Pacific Ocean [Jacob et al., 1999; Mauzerall et al., 2000] and eventually influences North America and Europe through intercontinental transport [Jaffe et al., 2003; Wild and Akimoto, 2001]. Ozone can influence both ecosystem structure and function [e.g., Heagle, 1989; Heagle et al., 1999; Ashmore, 2005; Muntifering et al., 2006].
Over 90% of vegetation damage may be the result of tropospheric ozone alone, and ozone could cause reductions in crop yield and forest production ranging from 0% to 30% [Adams et al., 1989]. Approximately 50% of forests might be exposed to higher O3 levels (>60 ppb) by 2100. There is therefore an urgent need to investigate the adverse effects of O3 on terrestrial ecosystem production.

[3] Air pollution is one of the most pressing environmental concerns in China [Liu and Diamond, 2005]. The rapid urbanization, industrialization, and intensive agricultural management of the past decades are closely related to increasing fossil fuel combustion and fertilizer application. Between 1980 and 1995, fertilizer use in China was 36% higher than the average in developed countries (where fertilizer use has been decreasing) and 65% higher than the average in developing countries [Aunan et al., 2000]. Both fossil fuel consumption and N-fertilizer application contribute strongly to the total emissions of NOx, a main O3 precursor, and consequently raise the atmospheric O3 concentration. It has been estimated that China's NOx emissions might increase by a factor of four toward the year 2020, compared with the emissions in 1990, under a non-control scenario [van Aardenne et al., 1999], which would lead to a much larger increase of surface O3, with levels of 150 ppb in some locations [Elliott et al., 1997]. Consequently, it is important to study the impacts of O3 on terrestrial ecosystems in China. Although ozone studies have been carried out in China for about 20 a, observations of O3 concentrations are still limited, and the records of most sites are discontinuous [e.g., Chameides et al., 1999; Liu et al., 2004; Wang et al., 2007]. Several experiments have demonstrated the interaction of O3 and CO2 on locally grown species and cultivars in China [e.g., Wang, 1995; Wang et al., 2002; Guo et al., 2001; Bai et al., 2003], but these studies rarely involved other plant functional types (PFTs), such as forests and grassland. An assessment of O3 effects on different PFTs at a large scale over a long time period has not yet been made.

[4] Quantitative assessment of O3 effects on terrestrial ecosystem production has been conducted since the 1980s on the basis of empirical or process-based dynamic simulations [Ren and Tian, 2007]. Well-documented empirical models, such as the Weibull function, are based on exposure indices and corresponding exposure-response relationships, and have been used to assess crop and forest production losses as well as economic losses [e.g., Heck et al., 1984; Heggestad and Lesser, 1990; Chameides et al., 1994; Aunan et al., 2000; Kuik et al., 2000; Wang and Mauzerall, 2004]. Although not process-based or ecosystem-dependent, such models may be extrapolated to entire ecosystems.
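To make the empirical approach concrete, the Weibull exposure-response form used in open-top chamber analyses of the kind cited above expresses relative yield as RY = exp[-(D/omega)^lambda], where D is a seasonal ozone exposure index and omega and lambda are fitted parameters. The following minimal Perl sketch evaluates this form; the parameter values are purely illustrative and are not fitted values from any of the studies cited.

#!/usr/bin/perl
use strict;
use warnings;

# Weibull exposure-response: relative yield as a function of an
# ozone exposure index D (e.g., a seasonal mean or cumulative dose).
# omega scales the exposure; lambda controls the curve's shape.
sub relative_yield {
    my ($dose, $omega, $lambda) = @_;
    return exp( -( ($dose / $omega) ** $lambda ) );
}

# Illustrative parameters only; real values are crop- and
# study-specific (cf. the Weibull fits of Heck et al. [1984]).
my ($omega, $lambda) = (120.0, 2.5);
for my $dose (0, 30, 60, 90, 120) {
    printf "exposure %3d -> relative yield %.2f\n",
        $dose, relative_yield($dose, $omega, $lambda);
}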
[5] Process-based models allow plant growth responses to vary with dynamic environments, such as high O3 concentration, elevated CO2 concentration, and climate change [Tian et al., 1998a]. Several process-based models have attempted to study the effects of O3 on vegetation [e.g., Reich, 1987; Ollinger et al., 1997; Ollinger, 2002; Martin et al., 2001; Felzer et al., 2004, 2005]. Reich's [1987] model is not actually process-based, but he generalized a linear model to describe the response of crops and trees to O3 and argued that crops were more sensitive to O3. Ollinger et al. [1997] used O3-response relationships with the PnET-II model to simulate tree growth and ecosystem functions; such models can apply the dynamic O3 damage mechanisms of seedling and mature trees from the leaf level to the canopy level. Ollinger and colleagues [Ollinger, 2002] applied the model to study the effect of O3 on NPP for specific sites within the northeastern U.S. (a reduction in NPP of between 3% and 16%) and the combined effects of CO2, O3, and N deposition in the context of historical land use changes for hardwoods in the northeastern U.S. with a new version of PnET (PnET-CN). Felzer et al. [2004, 2005] incorporated the algorithms of Reich [1987] and Ollinger et al. [1997] for hardwoods, conifers, and crops into a biogeochemical model (TEM); their study across the conterminous U.S. indicated a 2.6-6.8% mean reduction in annual NPP in the U.S. during the late 1980s and early 1990s. Unlike Ollinger's and Felzer's work, in which the effects of ozone on stomatal conductance were not considered, Martin et al. [2001] incorporated O3 effects on photosynthesis and stomatal conductance into the functional-structural tree growth model ECOPHYS (http://www.nrri.umn.edu/default) by using O3 flux data. They not only combined the well-accepted equations of mechanistic biochemical models of photosynthesis (e.g., the equations of Farquhar et al. [1980] and von Caemmerer and Farquhar [1981]) with the equations of phenomenological models of stomatal conductance (Ball et al. [1987], as adapted by Harley et al. [1992]), but also explored the underlying mechanisms of O3-inhibited photosynthesis. They found that O3 damage could reduce both the protective scavenging detoxification system (Vcmax) and the light-saturated rate of electron transport (Jmax) through the accumulated amounts of O3 above the damage threshold entering the inner leaves. Considering the advantages and disadvantages of the different models in simulating O3 effects, a coupled mechanistic model that fully couples energy, carbon, nitrogen, and water, as well as vegetation dynamics, is needed [Tian et al., 1998a].

[6] In this research we used a highly integrated process-based model, the Dynamic Land Ecosystem Model (DLEM; a detailed description of the model is given by Tian et al. [2005]). The dynamic O3 damage mechanisms were extrapolated from a small spatial scale (the leaf level) and a short time scale to the corresponding long-term mechanism at the ecosystem scale. The O3 module is primarily based on the work of Ollinger et al. [1997], and the equations of Farquhar et al. [1980] and Ball et al. [1987] were used to simulate photosynthesis and stomatal conductance, similar to Martin et al. [2001]. This module simulates O3 damage to plant photosynthesis and NPP. We also developed spatial data sets of historical climate, soil information, and land use change across China over a long period.
The ozone sensitivities of the different PFTs, including crops, coniferous trees, hardwoods, and other vegetation types, were based on Reich's compilation of open-top chamber (OTC) experiments in the U.S., which we assume to be applicable to China as well.

[7] Growing ozone pollution in China is closely related to domestic food security and to the global environment of the future [e.g., Chameides et al., 1999; Akimoto, 2003]. Unlike other studies in China [Aunan et al., 2000; Wang and Mauzerall, 2004; Felzer et al., 2005], we attempt to illustrate the effects of tropospheric O3 pollution on terrestrial ecosystem productivity throughout the country between 1961 and 2000. We focus on the analysis of ozone effects on NPP and carbon storage in the context of multiple environmental stresses, including increasing O3, changing climate, elevated CO2, and land use change (including nitrogen fertilization and irrigation on croplands) across China. In this paper we first briefly describe our model development, data preparation, and experimental design; we then examine the relative effects of ozone and other environmental factors on carbon storage across the country. The sensitivity of the different PFTs to ozone pollution is also examined. Finally, we discuss and analyze the simulation results and their uncertainty.

2. Methods

2.1. Dynamic Land Ecosystem Model (DLEM)

[8] The DLEM couples major biogeochemical cycles, the hydrological cycle, and vegetation dynamics to generate daily, spatially explicit estimates of water, carbon (CO2, CH4), and nitrogen fluxes (N2O) and pool sizes (C and N) in terrestrial ecosystems (see Figure 1). DLEM includes five core components: (1) biophysics, (2) plant physiology, (3) soil biogeochemistry, (4) dynamic vegetation, and (5) land use and management. The biophysical component covers the instantaneous exchanges of energy, water, and momentum with the atmosphere; it includes aspects of micrometeorology, canopy physiology, soil physics, radiative transfer, hydrology, and the influence of surface fluxes of energy, moisture, and momentum on the simulated surface climate. The plant physiology component simulates major physiological processes, such as photosynthesis, autotrophic respiration, allocation among the various plant parts (root, stem, and leaf), turnover of living biomass, nitrogen uptake and fixation, transpiration, and phenology. The soil biogeochemistry component simulates N mineralization, nitrification/denitrification [Li et al., 2000], NH3 volatilization, leaching of soil mineral N, and decomposition and fermentation [Huang et al., 1998]; DLEM is thus able to estimate emissions of multiple trace gases (CO2, CH4, and N2O) from soils simultaneously. The dynamic vegetation component simulates two kinds of processes: biogeographical redistribution when climate changes, and plant competition and succession during vegetation recovery after disturbances. Like most dynamic global vegetation models (DGVMs), DLEM builds on the concept of the PFT to describe vegetation distributions (Figure 2). DLEM also emphasizes the simulation of managed ecosystems, including agricultural ecosystems, plantation forests, and pastures.
The DLEM 1.0 version has been used to simulate the effects of climate variability and change, atmospheric CO2, tropospheric ozone, land use change, nitrogen deposition, and disturbances (e.g., fire, harvest, hurricanes) on terrestrial carbon storage and fluxes in China [Tian et al., 2005; Chen et al., 2006; Ren et al., 2007]. The model has been calibrated against field data from various ecosystems, including forests, grassland, and croplands, and the simulated results have been evaluated against independent field data [Tian et al., 2005].

[9] In DLEM, the carbon balance of vegetation is determined by photosynthesis, autotrophic respiration, litterfall (related to tissue turnover rate and leaf phenology), and the plant mortality rate. Plants assimilate carbon by photosynthesis and use this carbon to compensate for the carbon lost through maintenance respiration, tissue turnover, and reproduction. The photosynthesis module of DLEM estimates the net C assimilation rate, the leaf daytime maintenance respiration rate, and gross primary productivity (GPP, in gC/m2/day). The photosynthesis rate is first calculated at the leaf level; the results are then multiplied by leaf area index to scale up to the canopy level [Tian et al., 2005; Chen et al., 2006; Ren et al., 2007; Zhang et al., 2007]. Photosynthesis is the first process by which most carbon and chemical energy enter ecosystems, so it has critical impacts on ecosystem production. The GPP calculation can be expressed as:

GPP_i = (A_i + Rd_i) * LAI_i * dayl    (1)

A_i = f(PPFDleaf_i, g_i, leafN_i, Tday, Ca, dayl)    (2)

where GPP_i (gC/m2/day) is the gross primary productivity of the ecosystem for leaf type i; i is the leaf type (sunlit or shaded); A (g/s/m2 leaf) and Rd (g/s/m2 leaf) are the daytime photosynthesis rate and leaf respiration rate, respectively; LAI is the leaf area index; dayl (s) is the length of daytime; PPFD (umol/m2/s) is the photosynthetic photon flux density; g (m/s) is the stomatal conductance of the leaf to CO2 flux; Tday (degrees C) is the daytime temperature; Ca (ppmv) is the atmospheric CO2 concentration; and leafN (gN/m2 leaf) is the leaf N content. On the basis of the "strong optimality" hypothesis [Dewar, 1996], DLEM allocates the leaf N between the sunlit and shaded fractions each day, according to the relative PPFD absorbed by each fraction, so as to maximize the photosynthesis rate. In this study, the NPP of an ecosystem and the annual net carbon exchange (NCE) of the terrestrial ecosystem with the atmosphere were computed with the following equations (a small worked sketch follows at the end of this passage):

NPP = GPP - Rd    (3)

NCE = NPP - RH - ENAD - EAD - EP    (4)

where NPP is net primary productivity; Rd is plant respiration; RH is soil respiration; ENAD is the magnitude of the carbon loss from natural disturbances, assigned 0 here because it is difficult to simulate under present conditions; EAD is the carbon loss during the conversion of natural ecosystems to agricultural land; and EP is the sum of the carbon emissions from the decomposition of products [McGuire et al., 2001; Tian et al., 2003]. For natural ecosystems, EP and EAD equal 0, so NCE equals net ecosystem production (NEP).

[Figure 1. Framework of the Dynamic Land Ecosystem Model (DLEM). DLEM includes five core components: (1) biophysics, (2) plant physiology, (3) soil biogeochemistry, (4) dynamic vegetation, and (5) land use and management [Tian et al., 2005].]
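The following minimal Perl sketch works through the bookkeeping of equations (3) and (4); the input fluxes are arbitrary placeholders, not DLEM outputs.

#!/usr/bin/perl
use strict;
use warnings;

# Equations (3) and (4): NPP = GPP - Rd;
# NCE = NPP - RH - ENAD - EAD - EP.
# For natural ecosystems EAD = EP = 0 (and ENAD is set to 0 in
# this study), so NCE reduces to NEP = NPP - RH.
sub npp { my ($gpp, $rd) = @_; return $gpp - $rd; }

sub nce {
    my ($npp, $rh, $enad, $ead, $ep) = @_;
    return $npp - $rh - $enad - $ead - $ep;
}

# Arbitrary placeholder fluxes in gC/m2/yr.
my $gpp = 1200;
my $rd  = 600;
my $rh  = 550;
my $npp = npp($gpp, $rd);

# Natural ecosystem: ENAD = EAD = EP = 0, so NCE equals NEP.
printf "NPP = %d, NCE (= NEP) = %d gC/m2/yr\n",
    $npp, nce($npp, $rh, 0, 0, 0);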
Unlike models that estimate the cropland C cycle on the basis of simulating the potential vegetation of the agricultural grid cells [McGuire et al., 2001], the agricultural ecosystems in DLEM are not based on natural vegetation but are parameterized against several intensively studied agricultural sites in China (http://www.cerndata.ac.cn/).

[10] To simulate the detrimental effect of air pollution on ecosystem productivity, an ozone module was developed on the basis of previous work [Ollinger et al., 1997; Felzer et al., 2004, 2005], in which the direct effect of ozone on photosynthesis and its indirect effect on stomatal conductance, through changes in intercellular CO2 concentration, are simulated. The ratio of ozone damage to photosynthesis is defined as O3eff, similar to Ollinger et al. [1997], and the sensitivity coefficient a for each plant functional type (Table 1) is based on the work of Felzer et al. [2004]. The range of a is 2.6 x 10^-6 +/- 2.8 x 10^-7 for hardwoods (based on the value used by Ollinger et al. [1997]), 0.8 x 10^-6 +/- 3.6 x 10^-7 for conifers (based on pines), and 4.9 x 10^-6 +/- 1.6 x 10^-7 for crops, calculated from the empirical model of Reich [1987]. The errors are based on the standard deviation of the slope of the dose-response curves and the standard error of the mean stomatal conductance.

GPPO3 = GPP * O3eff    (5)

O3eff = 1 - a * gs * AOT40    (6)

gs = f(GPPO3)    (7)

Here GPPO3 is the ozone-limited GPP; gs is the stomatal conductance (mm s-1); and AOT40 is a cumulative ozone index (the accumulated hourly ozone dose over a threshold of 40 ppb, in ppb/h); in this study we use a monthly accumulative index, as in Felzer et al. [2004]. The AOT40 index has often been used to represent vegetation damage due to ozone [Fuhrer et al., 1997]. Because of the limited ozone data for China, we use the model-derived AOT40 values from Felzer et al. [2005]. (A sketch illustrating equations (5)-(7) together with the AOT40 index follows below.)

[11] Our photosynthesis module, based on the model of Farquhar et al. [1980], could potentially use ozone concentration as input, similar to Martin et al. [2001], if ozone flux data become available in the future. In DLEM the leaf C:N ratio is also affected by ozone, but we do not use this mechanism in the current study because of the ambiguous role of ozone in the plant C:N ratio [Lindroth et al., 2001].

[Figure 2. Contemporary plant functional types in China used in DLEM: 1, tundra; 2, boreal broadleaf deciduous forest; 4, boreal needleleaf deciduous forest; 5, temperate broadleaf deciduous forest; 6, temperate broadleaf evergreen forest; 7, temperate needleleaf evergreen forest; 8, temperate needleleaf deciduous forest; 9, tropical broadleaf deciduous forest; 10, tropical broadleaf evergreen forest; 11, deciduous shrub; 12, evergreen shrub; 13, C3 grass; 14, C4 grass; 15, dry farmland; 16, paddy land; 17, wetland; 18, Gobi and desert; 19, built-up area; 20, water body.]
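To illustrate how equations (5) and (6) act together with the AOT40 index, here is a minimal Perl sketch: it accumulates a monthly AOT40 from hourly ozone mixing ratios (only hours above the 40 ppb threshold contribute, following the index's definition) and then applies the ozone multiplier to GPP. The hourly series, the stomatal conductance value, and the GPP value are illustrative placeholders; DLEM's internal coupling of gs to the ozone-limited GPP (equation (7)) is iterative and is not reproduced here.

#!/usr/bin/perl
use strict;
use warnings;

# Monthly AOT40 (ppb h): sum of hourly exceedances over 40 ppb.
sub aot40 {
    my @hourly_ppb = @_;
    my $sum = 0;
    for my $o3 (@hourly_ppb) {
        $sum += $o3 - 40 if $o3 > 40;
    }
    return $sum;
}

# Equations (5)-(6): O3eff = 1 - a * gs * AOT40; GPPO3 = GPP * O3eff.
# a: PFT-specific sensitivity (Table 1); gs: stomatal conductance.
sub ozone_limited_gpp {
    my ($gpp, $a, $gs, $aot40) = @_;
    my $o3eff = 1 - $a * $gs * $aot40;
    $o3eff = 0 if $o3eff < 0;    # guard against unphysical values
    return $gpp * $o3eff;
}

# Illustrative deterministic hourly ozone series (ppb), 30-70 ppb,
# standing in for one month of daytime hours.
my @hourly = map { 30 + 40 * (sin($_ / 6) ** 2) } 1 .. 360;
my $aot40  = aot40(@hourly);

# Crop sensitivity a = 3.9e-6 from Table 1; gs in mm/s and GPP in
# gC/m2/day are placeholders.
my $gs  = 3;
my $gpp = 10;
printf "AOT40 = %.0f ppb h, ozone-limited GPP = %.2f gC/m2/day\n",
    $aot40, ozone_limited_gpp($gpp, 3.9e-6, $gs, $aot40);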
2.2. Input Data

[12] The input data sets include (1) elevation, slope, and aspect maps derived from the 1-km resolution digital elevation data set of China (http://www.wdc.cn/wdcdrre); (2) soil data sets (pH, bulk density, depth to bedrock, and soil texture represented as the percentage content of loam, sand, and silt) derived from the 1:1,000,000 soil map based on the second national soil survey of China [Wang et al., 2003; Shi et al., 2004; Zhang et al., 2005; Tian et al., 2006]; (3) a vegetation (land cover) map from the 2000 land use map of China (LUCC_2000), developed from Landsat Enhanced Thematic Mapper (ETM) imagery [Liu et al., 2005a]; (4) a potential vegetation map, constructed by replacing the croplands of LUCC_2000 with potential vegetation from the global potential vegetation maps developed by Ramankutty and Foley [1998]; (5) the standard IPCC (Intergovernmental Panel on Climate Change) historical CO2 concentration data set [Enting et al., 1994]; (6) an AOT40 data set (see below for details); (7) a long-term land use history (the cropland and urban distribution of China from 1661 to 2000), developed from three recent (1990, 1995, and 2000) land cover maps [Liu et al., 2003, 2005a, 2005b] and historical census data sets of China [Ge et al., 2003; Xu, 1983]; and (8) daily climate data (maximum, minimum, and average temperature, precipitation, and relative humidity). Daily climate data for the period 1961-2000 were produced from 746 climate stations in China and 29 stations in surrounding countries, using an interpolation method similar to that of Thornton et al. [1997]. To account for cropland management, we also used data from the National Bureau of Statistics of China recording the annual irrigated areas and fertilizer amounts in each province from 1978 to 2000 (Figure 3b). We did not construct an irrigation data set, because of a lack of data; instead, we simulated the effects of irrigation by refilling the soil water pool to field capacity whenever cropland soil reached the wilting point. All data sets have a spatial resolution of 0.5 x 0.5 degrees; the climate and AOT40 data sets were developed at a daily time step, and the CO2 and land use data sets at a yearly time step.

2.2.1. Description of Ozone Data

[13] The methods used for monitoring ozone vary among the limited ground-level ozone monitoring sites in China [Chameides et al., 1999; Li et al., 2000; Chen et al., 1998], so it is difficult to develop a spatially explicit historical AOT40 data set by interpolating site-level data, as Felzer et al. [2004] did for the U.S. In this study the AOT40 data set was derived from the global historical AOT40 data sets constructed by Felzer et al. [2005]. This AOT40 index is calculated by combining geographic data from the MATCH model (Multiscale Atmospheric Transport and Chemistry) [Lawrence and Crutzen, 1999; Rasch et al., 1997; von Kuhlmann et al., 2003] with hourly zonal ozone from the MIT IGSM (Integrated Global Systems Model). The average monthly boundary layer MATCH ozone values for 1998 are scaled by the ratio of the zonal average ozone from the IGSM (3-hourly values linearly interpolated to hourly values) to the zonal ozone from the monthly MATCH, so as to maintain the zonal ozone values from the IGSM [Wang et al., 1998; Mayer et al., 2000]. This procedure was applied for the period 1977-2000.
For 1860 to 1976, the zonal ozone values were assumed to increase by 1.6% per year, on the basis of Marenco et al. [1994].

[14] The AOT40 data (Figure 4) show a significant increase in ozone pollution over the past 40 a, and the trend accelerated rapidly from the early 1990s, possibly because of the rapid urbanization of China during that period [Liu et al., 2005b]. The data set shows seasonal variation in AOT40, with a first peak of ozone concentration in early summer and a second in September. Both peaks occur at roughly the critical times (the growing and harvest seasons) for crops in China, so ozone pollution may have significant impacts on crop production in China.

[15] Although AOT40 generally increased throughout the nation, the severity of ozone pollution varied from region to region and from season to season (Figure 5). The central-eastern section of north China experienced severe ozone pollution, especially in spring and summer. The greatest increase in AOT40 appeared in winter in northwest China, probably because of rapid industrialization and the transport of air pollution from Europe [Akimoto, 2003]. In contrast, the change in AOT40 in south China is relatively small despite the large urban population and rapid industrial development of that region.

Table 1. Values of the Sensitivity Coefficient a for Different Functional Types (values directly modified from Felzer et al. [2004])

Functional type                            | a coefficient                      | Reference
Crops                                      | 3.9 x 10^-6 (d = 5.27 x 10^-7)     | Reich [1987]
Coniferous trees                           | 0.7 x 10^-6 (d = 2.45 x 10^-7)     | Reich [1987]
Deciduous trees and other vegetation types | 2.6 x 10^-6 (d = 2.3 x 10^-7)      | Ollinger et al. [1997]

2.2.2. Description of Other Input Data

[16] From 1961 to 2000 the CO2 concentration increased steadily from 312 ppmv to 372 ppmv (Figure 6a), while temperature and precipitation fluctuated substantially (Figures 6b and 6c). Since the mid-1980s China has experienced observable climate warming. Annual precipitation in the 1990s was higher than in the 1980s, and there was a relatively long dry period between 1965 and 1982, apart from high annual precipitation in 1970 and 1974 (Figures 6c and 6e). Figure 3a shows that since the late 1980s cropland expanded, while forest and other land areas gradually decreased.

[Figure 3. (a) Variations in land use and (b) irrigated area (10^10 m2) and fertilizer amount (10^10 kg).]

2.3. Simulation Design

[17] In this study, six experiments were designed to analyze the effects of ozone on NPP, NCE, and carbon storage in the terrestrial ecosystems of China (Table 2). Experiment I examines the impact of transient ozone on terrestrial ecosystem productivity while holding the other environmental factors constant. Experiments II and III analyze the combined effects of O3 with CO2 fertilization and of O3 with climate change; both help determine the relative impacts of O3, CO2, and climate on the ecosystem. Experiment IV simulates the overall effect of climate change, atmospheric change, and land use change. Experiment V studies the overall combined effect without irrigation. The final experiment, VI, excludes ozone effects and serves as the baseline for comparison with the other experiments.

[18] The model simulation began with an equilibration run to develop the baseline C, N, and water pools for each grid cell.
A spin-up of about 100 a was then applied if climate change was included in the simulation scenario. Finally, the model was run in transient mode, driven by the daily and/or annual input data.

3. Results and Analyses

3.1. Overall Change in Net Primary Productivity and Carbon Storage

[19] In the simulation experiments, O3 had negative effects on average NPP and carbon storage throughout the study period (1961-2000). Average annual NPP and total C storage in China increased from the 1960s to the 1990s by 0.66% and 0.06%, respectively, under the full factorial (climate, land use, CO2, and O3 all varying; hereafter referred to as OCLC), while they increased by 7.77% and 1.63%, respectively, under the scenario without O3 (hereafter CLC) (Table 3). This difference indicates that under the full factorial, O3 decreased NPP (by about 1.64% in the 1960s and 8.11% in the 1990s) and total C storage (by about 0.06% in the 1960s and 1.61% in the 1990s) in China's terrestrial ecosystems. Although NPP and total C storage increased over time in both scenarios, soil and litter C storage decreased (by 0.18% and 0.67%, respectively) under the full factorial, while they increased (by 0.30% and 1.03%, respectively; Table 3) under the scenario without O3. O3 thus reduced soil and litter C storage by about 0.03% and 0.16% in the 1960s and by 0.52% and 1.84% in the 1990s, respectively, in China. The model results show that NPP and carbon storage, including vegetation, soil, and litter carbon, decreased with O3 exposure, and that NPP was reduced more than carbon storage. The changing rates in the 1960s and 1990s indicate that increasing O3 concentrations could result in less NPP and carbon storage, which further implies that under the influence of O3 alone China's soil ecosystem would be a net C source, whereas without O3 it would have been a net C sink (Table 3).

[Figure 4. (a) Annual mean monthly AOT40 (ppb/h) from 1961 to 2000 and (b) monthly AOT40 (ppb/h) in 1961, 1980, and 2000. Derived from the atmospheric chemistry model MATCH (Multiscale Atmospheric Transport and Chemistry) [Lawrence and Crutzen, 1999; Mahowald et al., 1997; Rasch et al., 1997; von Kuhlmann et al., 2003] and the IGSM (Integrated Global Systems Model) [Wang et al., 1998; Wang and Prinn, 1999; Mayer et al., 2000].]

[20] The accumulated NCE across China under three simulation experiments (O3 only; the combined effects without O3, CLC; and with O3, OCLC) from 1961 to 2000 indicates that O3 effects could cause carbon release from the terrestrial ecosystem to the atmosphere (Table 4). The accumulated NCE was -919.1 Tg C under the influence of O3 only, 1177.4 Tg C under the combined influence of changing climate, CO2, and land use (CLC), and 677.3 Tg C under the full factorial (OCLC). These values imply that China was a CO2 source when influenced only by O3, but a sink under both the CLC and OCLC scenarios. The accumulated NCE under OCLC was 620.4 Tg C less than under CLC (OCLC - CLC), implying that the interactions between O3 and the other factors (CO2, climate, and land use change) were very strong; these interactions decreased the emission of CO2 from terrestrial ecosystems. For the 1990s, a period of rapid atmospheric O3 change, our simulated results show that the cumulative NCE under the full factorial (OCLC) throughout China decreased compared with the results without O3 influence (CLC) (Table 3 and Figure 7).
In central-eastern and northeastern China, some places even released 150 g/m2 more C to the atmosphere under O3 influence during the 1990s.

[21] The accumulated carbon storage of the different PFTs under the three simulation experiments indicates that PFTs respond very differently to increasing O3 concentration and to its interaction with the other environmental factors (Table 4). From 1961 to 2000, the accumulated NCE with O3 exposure alone decreased by as little as 0.1 Tg C in wetlands and by as much as 615.9 Tg C in temperate broadleaf deciduous forests. The accumulated NCE under CLC ranged from a decrease of 67.7 Tg C (tundra) to an increase of 720.4 Tg C (temperate needleleaf evergreen forest), while under OCLC it ranged from a decrease of 547.0 Tg C (temperate broadleaf deciduous forest) to an increase of 720.4 Tg C (temperate needleleaf evergreen forest). Temperate broadleaf deciduous forest was thus the biggest C source under the full factorial, and temperate needleleaf evergreen forest the biggest C sink, from 1961 to 2000. Compared with the combined effects with O3 (OCLC) and without O3 (CLC), the O3-only scenario releases more C for all PFTs. Compared with CLC, OCLC resulted in lower accumulated NCE, most notably for temperate broadleaf deciduous forest (321.2 Tg C) and dry farmland crops (46.5 Tg C), which means that temperate broadleaf deciduous forest and dry farmland were more sensitive to O3 than the other PFTs. In general, we found that C3 grass was more sensitive to O3 than C4 grass, dry farmland more sensitive than paddy farmland, and deciduous forest more sensitive than needleleaf forest.

[Figure 5. Average monthly AOT40 in (a) spring, (b) summer, (c) autumn, and (d) winter from 1990 to 2000 in China (unit: 1000 ppb/h, i.e., ppm/h). Derived from the atmospheric chemistry model MATCH [Lawrence and Crutzen, 1999; Mahowald et al., 1997; Rasch et al., 1997; von Kuhlmann et al., 2003] and the IGSM [Wang et al., 1998; Wang and Prinn, 1999; Mayer et al., 2000].]

[22] Overall, the results indicate that O3 has negative impacts on terrestrial ecosystem production (Figure 4 and Table 3), and these negative effects became severe as the O3 concentration increased across China in the past decades, especially after the 1990s [e.g., Aunan et al., 2000]. Biomes showed complicated carbon storage responses to O3 because of the different sensitivities of each PFT as well as different environmental conditions. For example, some arid sites exhibit a small ozone effect on photosynthesis because the low stomatal conductance of arid-land plants leads to relatively low ozone uptake. Other studies, however, have shown that O3-induced reductions in photosynthesis were accompanied by decreased water use efficiency (WUE), resulting in even larger reductions in productivity, particularly at arid sites [Ollinger et al., 1997]. This may be why wetlands show a relatively small reduction in carbon storage (Table 4) and why dry farmland crops are more sensitive to O3 than paddy farmland crops. It may also reflect the fact that the crop parameters in the Reich [1987] model are more sensitive to O3 than those for deciduous and coniferous forests.
In addition, in response to elevated O3 concentrations, plants allocate more carbon to leaves and stems than to roots because of increased defense mechanisms [e.g., Younglove et al., 1994; Piikki et al., 2004], which may result in greater carbon storage loss in broadleaf deciduous forests than in evergreen forests. Variations in biome-level responses to ozone pollution clearly need to be addressed.

3.2. Spatiotemporal Variations of C Flux and C Storage

[23] Mean annual NPP changes from the 1960s to the 1990s under the O3-only scenario showed a pronounced spatial pattern (Figure 8). Mean annual NPP decreased most in eastern China, partly because eastern China has experienced faster urbanization, industrialization, and agricultural intensification over the past several decades than the remote areas of western China [Aunan et al., 2000; Wang and Mauzerall, 2004; Liu et al., 2005a, 2005b]; this development is closely tied to increased use of fossil fuels and fertilizer. The regional imbalance in O3 concentrations could also produce fluctuations in annual NPP.

[24] Under the influence of O3 and CO2, the simulation results show that mean annual NPP in the 1990s increased by 140.6 Tg C compared with the 1960s (Figures 8 and 9), and total carbon storage increased by 46.3 Tg over the past 40 a because of the cumulative increase in vegetation and soil C storage (Figure 10). The increased C storage may be attributed to the direct effects of increasing atmospheric CO2 [Melillo et al., 1993; Tian et al., 1999, 2000]; ozone, however, can partially offset the positive effects of CO2 fertilization.

[25] DLEM estimates that the total carbon storage of potential vegetation under O3 and climate influences decreased by 15.9 Tg C, a decrease attributable mainly to the large decrease in soil C storage, vegetation C having decreased relatively little from 1961 to 2000 (Figure 10). The interannual variation of NPP followed a trend similar to that of annual precipitation from 1961 to 2000 (Figures 9 and 6c); for example, annual NPP decreased with the lower precipitation of the late 1960s and the 1970s (Figure 9). Under the influence of the monsoon climate, precipitation and temperature in China exhibited large interannual variability during the study period. The results indicate that NPP was more sensitive to changes in precipitation than in temperature, while soil carbon storage was more closely linked to temperature through the response of decomposition. In addition, the pattern of change in mean annual NPP from the 1960s to the 1990s indicates that NPP decreased more in eastern and northern China than elsewhere over the same period (Figure 8b); these variations may be related to the seasonal-to-decadal magnitude and spatial distribution of rainfall [Fu and Wen, 1999; Tian et al., 2003].

[Figure 6. Variations in (a) mean annual atmospheric CO2 concentration, (b) mean annual temperature, (c) annual precipitation, (d) annual precipitation anomalies, and (e) annual temperature anomalies (relative to the 1961-1990 normal period) from 1961 to 2000.]
The combined effects of changes in air temperature and precipitation together with increasing O3 concentration are therefore complex, and the NPP loss may reflect the balance among O3, CO2, and water uptake through the changes in stomatal conductance caused by the combined effects of O3, temperature, and CO2.

[26] The analyses above address the response of potential terrestrial ecosystems to historical O3 concentration, changing climate, and atmospheric CO2 concentration. Land use change and management, however, have substantially modified land ecosystems across China over the past 40 a [e.g., Liu et al., 2005a, 2005b]. On the basis of the DLEM simulation, the total terrestrial carbon storage in China increased by 16.8 Tg C during 1960-2000 (Figure 10). Annual NPP over time under OCLC (simulation experiment IV) increased slightly (mean 0.1%), and the interannual variation of NPP is similar to the result obtained from climate change alone (Figure 9). The distribution of the difference in annual NPP between the 1960s and the 1990s under the OCLC scenario indicates smaller NPP changes than under the O3-only scenario, which may be caused by the modifying interactive effects of changing climate, increasing CO2, and land use change. Comparing the OCLC effects with and without irrigation (Figure 9), there was a mean 3% reduction in annual NPP from 1961 to 2000, implying that water conditions can alter ozone-induced damage.

4. Discussion

4.1. Comparison With Estimates of Ozone Damage From Other Research in China

[27] Research on the effects of ozone on crop growth and yield loss in China has been conducted since 1990 using field experiments and model simulations. Field studies have shown that ozone exposure can reduce crop yields by 0-86% under experimental treatments with varying ozone concentrations and durations [e.g., Wang and Guo, 1990; Huang et al., 2004]. Estimates of crop yield loss have been made at the regional scale (such as the Yangtze River Delta) and at the national level [Feng et al., 2003; Wang and Mauzerall, 2004; Aunan et al., 2000], and the results indicate crop losses of 0-23% from historical ozone in China and of 2.3-64% as ozone rises in the future [Wang et al., 2007]. Felzer et al. [2005] found that crops were more sensitive to ozone damage at low ozone levels in both China and Europe than in the U.S. All the above crop studies and assessments of yield loss were based on empirical exposure-response relationships between crop yield and ozone; few studies have used process-based ecosystem models to estimate ozone damage to diverse PFTs at the national level. In our study, the DLEM model was used to address the influence of ozone concentration on NPP and carbon storage across China over the 40 a from 1960 to 2000, and we estimated the influence of historical ozone on the different PFTs. As in previous studies, NPP was reduced when ozone effects were considered. The ozone effects on terrestrial ecosystems across China were also consistent with the study of Felzer et al. [2005] in that the spatial variations were largely due to varying ozone concentration, climate change, and other stress factors. The damage peaked in eastern China, with the greatest reductions in NPP exceeding 70% in some places.
Table 2. Experimental Arrangement of O3, Climate, CO2, Land Use, and Management (Fertilizer and Irrigation) in Each Scenario (Climate_Lucc_CO2 = CLC; O3_Climate_Lucc_CO2 = OCLC)

Scenario | Name                  | O3         | Climate    | CO2        | Land use   | Fertilizer | Irrigation
Balance  |                       | 0          | constant   | constant   | constant   | 0          | 0
I        | O3 only               | historical | constant   | constant   | constant   | 0          | 0
II       | O3_CO2                | historical | constant   | historical | constant   | 0          | 0
III      | O3_Climate            | historical | historical | constant   | constant   | 0          | 0
IV       | O3_Climate_Lucc_CO2   | historical | historical | historical | historical | historical | historical
V        | O3_Climate_Lucc_CO2_N | historical | historical | historical | historical | historical | 0
VI       | Climate_Lucc_CO2      | 0          | historical | historical | historical | historical | historical

Table 3. Overall Changes in Carbon Fluxes and Pools During 1961-2000

          | OCLC 1960s, Pg | OCLC 1990s, Pg | OCLC net change, % | CLC 1960s, Pg | CLC 1990s, Pg | CLC net change, % | Difference with/without O3, 1960s, % | 1990s, %
NPP       | 3.04   | 3.06   | 0.66  | 3.09   | 3.33   | 7.77 | -1.64 | -8.11
Veg C     | 26.02  | 26.32  | 1.15  | 26.03  | 27.38  | 5.19 | -0.04 | -3.87
Soil C    | 59.79  | 59.68  | -0.18 | 59.81  | 59.99  | 0.30 | -0.03 | -0.52
Litter C  | 19.32  | 19.19  | -0.67 | 19.35  | 19.55  | 1.03 | -0.16 | -1.84
Total C   | 105.14 | 105.2  | 0.06  | 105.2  | 106.92 | 1.63 | -0.06 | -1.61

Table 4. Accumulated NCE of the Different Plant Functional Types Under Three Scenarios (O3 Only, CLC, OCLC) and the Difference Between CLC and OCLC, 1961-2000 (Tg C). There is no dry farmland or paddy farmland under the O3-only scenario. Climate_Lucc_CO2 = CLC; O3_Climate_Lucc_CO2 = OCLC.

Type | Plant functional type                | O3 only | CLC    | OCLC   | OCLC - CLC
1    | tundra                               | -30.6   | -67.7  | -130.2 | -62.5
2    | boreal broadleaf deciduous forest    | -1.2    | -17    | -18.3  | -1.3
4    | boreal needleleaf deciduous forest   | -9.9    | 94.9   | 85.1   | -9.7
5    | temperate broadleaf deciduous forest | -615.9  | -225.8 | -547.0 | -321.3
6    | temperate broadleaf evergreen forest | -26     | 231.8  | 206.5  | -25.3
7    | temperate needleleaf evergreen forest| -72.1   | 720.4  | 720.4  | 43.2
8    | temperate needleleaf deciduous forest| -5.7    | 14.9   | 14.9   | -4.0
10   | tropical broadleaf evergreen forest  | -19.7   | -48.1  | -48.0  | -19.8
11   | deciduous shrub                      | -4.9    | 17.9   | 17.9   | -5.8
12   | evergreen shrub                      | -11.5   | 37.1   | 37.1   | 13.6
13   | C3 grass                             | -120.6  | 301.6  | 301.6  | -136.9
14   | C4 grass                             | -0.9    | 24.4   | 3.8    | -4.0
15   | wetland                              | -0.1    | 96.7   | 93.7   | -30.1
16   | dry farmland                         |         | -3.6   | -50.1  | -46.5
17   | paddy farmland                       |         | -0.1   | -10.1  | -10.0
     | Total accumulated NCE in China since 1961 | -919.1 | 1177.4 | 677.3 | -620.4

[Figure 7. Difference in (top) NPP and (bottom) cumulative NCE in the 1990s between CLC with O3 and without O3 (g m-2).]

[Figure 8. Rate of change of average annual NPP from the 1960s to the 1990s under different O3-related forcing factors (%): (a) O3 only, (b) O3_Climate, (c) O3_LUCC, (d) O3_CO2, (e) O3_Climate_LUCC_CO2, and (f) Climate_LUCC_CO2.]

[Figure 9. Changes in annual NPP during 1961-2000 as forced by different factors and their combinations.]

Furthermore, our work shows different sensitivities for the different PFTs, in part because we incorporated the Reich [1987] dose-response functions into DLEM. Deciduous forests and dry farmland crops are relatively more sensitive to ozone than the other PFTs, and dry farmland showed a greater reduction in yield than paddy farmland. This points to an important need for further study of the variations among PFTs and their underlying mechanisms. (The factorial attribution logic behind these tables is illustrated in the sketch below.)
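The attribution logic behind Tables 2-4 (and the OCLC - CLC column in particular) amounts to simple factorial differencing of simulation runs. A minimal Perl sketch, using the tundra row of Table 4 (the subroutine-free arithmetic below is illustrative and is not DLEM code):

#!/usr/bin/perl
use strict;
use warnings;

# Accumulated NCE (Tg C, 1961-2000) for tundra, from Table 4.
my %nce = (
    o3_only => -30.6,    # ozone varies, everything else held constant
    clc     => -67.7,    # climate + land use + CO2, without ozone
    oclc    => -130.2,   # full factorial including ozone
);

# Ozone effect in the presence of the other factors: the difference
# between the runs with and without ozone (the OCLC - CLC column).
my $o3_effect_combined = $nce{oclc} - $nce{clc};    # -62.5 Tg C

# Comparing it with the ozone-only run exposes the interaction
# between ozone and the other environmental factors.
my $interaction = $o3_effect_combined - $nce{o3_only};

printf "O3 effect within the full factorial: %.1f Tg C\n", $o3_effect_combined;
printf "O3-only effect:                      %.1f Tg C\n", $nce{o3_only};
printf "Interaction with other factors:      %.1f Tg C\n", $interaction;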
4.2. Ozone and Its Interactive Effects on Net Primary Productivity and Carbon Storage

[28] In our study, under the effect of O3 alone, average annual NPP decreased by 0.01 Pg C/a and the accumulated NCE was -0.92 Pg C from 1961 to 2000 (Figure 9 and Table 4). In line with most field experiments in the U.S., Europe, and China, and with other model results [e.g., Heagle, 1989, and references therein], our results show that ozone has negative effects on terrestrial ecosystem production owing to the direct ozone-induced reduction in photosynthesis.

[29] The combined effects of O3, CO2, climate, and land use show very different results. O3 may offset CO2 fertilization and cause NPP losses in different plant types over time across China (Figures 8-10). Climate variability increased the ozone-induced reduction in carbon storage and led to substantial year-to-year variation in the carbon fluxes (NPP and NCE), consistent with many previous studies [e.g., Cao and Woodward, 1998; McGuire et al., 2001; Tian et al., 1998b, 1999, 2003]. Elevated CO2 has a direct positive effect on photosynthesis and biomass production [e.g., Agrawal and Deepak, 2003], as well as reducing stomatal conductance. Climate warming increased decomposition, resulting in a continuous loss of soil and total carbon storage since 1990 (Figures 10b and 10c), even though annual precipitation was substantial during this period (Figures 6c and 6d). China is influenced by a monsoon climate, with the summer monsoon bringing most of the annual precipitation [Fu and Wen, 1999], and the combination of changing ozone concentration and climate warming in arid areas may result in greater variability in productivity.

[Figure 10. Annual changes in terrestrial carbon storage in China from 1961 to 2000 under four influencing factors.]

[30] Similarly, the contribution of land use change to the terrestrial carbon budget varied over time and among ecosystem types [e.g., Houghton and Hackler, 2003]. In our study we considered the combined effects of land use change with historical ozone concentration, atmospheric CO2 concentration, and climate variability. When considering the effects of land use together with ozone, three aspects need to be addressed. The first is the transformation between land use types, such as the conversion of forest to cropland or the regrowth of natural vegetation after cropland abandonment, which can result in carbon loss or carbon uptake [e.g., Tian et al., 2003]. The other two are the differing sensitivities of biomes to ozone exposure and agricultural management: the former results in different rates of carbon loss, while the effects of the latter in combination with ozone pollution depend on the changing soil environment, such as water and nitrogen conditions. In addition, dry farmland and C3 grass are more sensitive than paddy farmland and C4 plant types, which helps explain the field results on the relationship between ozone effects, photosynthesis, and stomatal conductance. Reich's [1987] study indicated that a secondary response to ozone may be a reduction in stomatal conductance, as the stomata close in response to increased internal CO2. Tjoelker et al. [1995] found a decoupling of photosynthesis from stomatal conductance as a result of long-term exposure to ozone.
Such a decoupling implies that ozone-induced reductions in photosynthesis would also be accompanied by decreased water use efficiency (WUE), resulting in even larger reductions in productivity, particularly at arid sites, although many studies indicate that drought-induced stress could reduce ozone stress [Smith et al., 2003]. Unlike C4 photosynthesis, which adds a set of carbon-fixation reactions that enable some plants to increase photosynthetic water use efficiency in dry environments, C3 grass and dry farmland in China are always water limited and are more sensitive to ozone exposure. In addition, the modeling studies of Felzer et al. [2004] indicated that ozone pollution can reduce NPP further when fertilizer is applied. In contrast to land use change in the eastern U.S. [Felzer et al., 2004], it is necessary to consider how to manage irrigation in arid areas because China has many moisture-limited regions that require irrigation. Reasonable irrigation management can both enhance water use efficiency and moderate ozone damage.

[31] Besides the environmental factors discussed above, recent reviews of the global carbon budget also indicate that terrestrial ecosystem productivity could be affected by other changes in atmospheric chemistry, such as nitrogen deposition and aerosols [e.g., Pitcairn et al., 1998; Bergin et al., 2001; Lü and Tian, 2007]. These changes can directly alter carbon storage. For example, aerosols and regional haze may reduce ecosystem productivity by decreasing solar radiation and changing climate conditions [e.g., Huang et al., 2006]. Nitrogen deposition could also affect terrestrial carbon storage in complex ways [Aber et al., 1993]. For example, many terrestrial ecosystems in middle and high latitudes are nitrogen limited [e.g., Melillo, 1995], and the positive effect of elevated CO2 on photosynthesis can be constrained by limited nitrogen availability. However, increasing nitrogen deposition could reduce total plant phosphorus uptake [e.g., Cleland, 2005], and nitrogen deposition can bias estimates of carbon flux and carbon storage either high or low depending on the nature of interannual climate variations [Tian et al., 1999]. In this study, we ran the model with a closed nitrogen cycle because a database containing a time series of nitrogen deposition was not available for our transient analyses. To fully understand the effects of air pollution on terrestrial ecosystem productivity, future work should take these atmospheric chemistry factors into account.

4.3. Uncertainty and Future Work

[32] An integrated assessment of O3 impacts on terrestrial ecosystem production at a large scale requires the following types of information: (1) an O3 data set that reflects air quality in the study area, together with other environmental data such as climate (temperature, precipitation, and radiation), plant (types, distribution, and parameters), and soil (texture, moisture, etc.) information; (2) mechanisms of O3 impacts on ecosystem processes, which describe relationships between air-pollutant dose and ecophysiological processes such as photosynthesis, respiration, allocation, toleration, and competition; and (3) an integrated process-based model that is able to quantify the damage of ozone to ecosystem processes. To reduce uncertainty in our current work, future work needs to address the following. First, we used one set of O3 data from a combination of global atmospheric chemistry models, which have not been well validated against field observations because of limited field data. We found that the seasonal pattern of our simulated ozone data (AOT40) matched the limited observational data sets [Wang et al., 2007], with the highest ozone concentrations in summer. However, we still need observed AOT40 values to calibrate and validate our O3 data set in the future.
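For reference, AOT40 is not defined in the text; under the standard convention it is the ozone exposure accumulated over a 40 ppb threshold during daylight hours:

$$\mathrm{AOT40} \;=\; \sum_{i=1}^{N} \max\!\left([\mathrm{O_3}]_i - 40\ \mathrm{ppb},\; 0\right)\,\Delta t ,$$

where $[\mathrm{O_3}]_i$ is the hourly mean surface ozone concentration in daylight hour $i$, $\Delta t = 1\ \mathrm{h}$, and $N$ is the number of daylight hours in the accumulation period (typically the growing season), giving units of ppb·h.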
Second, the ozone module in our DLEM focuses on the direct effects on photosynthesis and the indirect effects on other processes, such as stomatal conductance, carbon allocation, and plant growth. The quantitative relationship between O3 and these processes remains untested by field studies. Third, in our simulations, we simulate land use change accompanied by optimum fertilization and irrigation management. It is hard to separate the contributions of land use change from those of its combination with fertilizer and irrigation. In particular, according to the studies of Felzer et al. [2004, 2005], the ozone effect combined with fertilizer management could increase the damage to ecosystem production, whereas irrigation management in dry lands could reduce the negative effect of ozone [Ollinger et al., 1997]. In our model, crops were classified as dry farmland and paddy farmland, which improved the simulation of crops compared to previous process-based models; but different crop types, such as spring wheat, winter wheat, corn, rice, and soybean, differ greatly in sensitivity to ozone and could show different changes in ecosystem production due to the effects of ozone [e.g., Heck, 1989; Wang et al., 2007]. The agricultural ecosystem module needs to be improved by including different crop types and varied management practices (fertilizer, irrigation, tillage, and so on) to study the ozone effect.

[33] To accurately assess the impacts of ozone and other pollutants on NPP and carbon storage at a regional scale, both the observational data in China and the ecosystem model need to be improved. Future research may therefore focus on the following: (1) validation of simulation results with site data; (2) refinement of the present ozone data with field observations and comparison of the present ozone data with other simulated ozone data; (3) development of the ozone module by coupling the effects of ozone on LAI and stomatal conductance; (4) improvement of the process-based agricultural ecosystem model to study the effects of air pollutants on different crop types; and (5) inclusion of other air pollutants (e.g., aerosols and nitrogen deposition) in the model.

5. Conclusions

[34] This work investigated tropospheric O3 pollution in China and its influence on NPP and carbon storage across China from 1961 to 2000 using the Dynamic Land Ecosystem Model (DLEM). Our simulated results show that elevated tropospheric O3 concentration led to a mean 4.5% reduction in NPP nationwide during the period 1961–2000. Our simulations suggest that the interactions of ozone with increasing CO2, climate change, and land use and management are significant, and that the interaction of ozone with climate and land use change can cause terrestrial ecosystems to release carbon to the atmosphere. Ozone effects on NPP varied among plant functional types, ranging from 0.2% to 6.9% during 1961 to 2000.
Dry farmland, C3 grasses, and deciduous forests are more sensitive to ozone exposure than paddy farmland, C4 grasses, and evergreen forests. In addition, following ozone exposure experiments, we allowed crops to be more sensitive than deciduous trees and deciduous trees to be more sensitive than coniferous trees; crops therefore showed the highest mean reduction (over 14.5%) since the late 1980s. Spatial variations in ozone pollution and in ozone effects on NPP and carbon storage indicate that eastern-central China is the most sensitive area. Significant reductions in NPP occurred in northeast, central, and southeast China, where most crops are planted. Direct and indirect effects of air pollutants on ecosystem production, especially on agricultural ecosystem carbon cycling, should be considered in future work in China. The lack of ozone observation data sets and of sensitivity experiments for different PFTs could introduce uncertainties into this study. To accurately assess the impact of elevated O3 levels on NPP and carbon storage, it is necessary to develop an observation network across China to measure tropospheric O3 concentration and its effects on ecosystem processes, and to enhance the capacity of process-based ecosystem models through rigorous field data-model comparison.

[35] Acknowledgments. This research is funded by the NASA Interdisciplinary Science Program (NNG04GM39C). We thank Dengsheng Lu, Art Chappelka, Chelsea Nagy, and two anonymous reviewers for their constructive suggestions and comments.

References

Aber, J. S., E. E. Spellman, and M. P. Webster (1993), Landsat remote sensing of glacial terrain, in Glaciotectonics and Mapping of Glacial Deposits, edited by J. S. Aber, Can. Plains Proc., 25(1), 215–225.
Adams, R. M., D. J. Glyer, S. L. Johnson, and B. A. McCarl (1989), A reassessment of the economic effects of ozone on United States agriculture, J. Air Pollut. Control Assoc., 39, 960–968.
Agrawal, M., and S. S. Deepak (2003), Physiological and biochemical responses of two cultivars of wheat to elevated levels of CO2 and SO2, singly and in combination, Environ. Pollut., 121, 189–197.
Akimoto, H. (2003), Global air quality and pollution, Science, 302(5651), 1716–1719.
Ashmore, M. R. (2005), Assessing the future global impacts of ozone on vegetation, Plant Cell Environ., 28, 949–964.
Aunan, K., T. K. Berntsen, and H. M. Seip (2000), Surface ozone in China and its possible impact on agriculture crop yields, Ambio, 29, 294–301.
Bai, Y. M., C. Y. Wang, M. Wen, and H. Huang (2003), Experimental study of single and interactive effects of double CO2, O3 concentrations on soybean (in Chinese), J. Appl. IED Meteorol. Sci., 14(2), 245–251.
Ball, J. T., I. E. Woodrow, and J. A. Berry (1987), A model predicting stomatal conductance and its contribution to the control of photosynthesis under different environmental conditions, in Progress in Photosynthesis Research, vol. IV, Proceedings of the International Congress on Photosynthesis, edited by I. Biggins, pp. 221–224, Springer, New York.
Bergin, M. H., R. Greenwald, J. Xu, Y. Berta, and W. L. Chameides (2001), Influence of aerosol dry deposition on photosynthetically active radiation available to plants: A case study in the Yangtze delta region of China, Geophys. Res. Lett., 28(18), 3605–3608.
Cao, M. K., and F. I. Woodward (1998), Dynamic responses of terrestrial ecosystem carbon cycling to global climate change, Nature, 393, 249–252.
Chameides, W. L., P. Kasibhatla, J. Yienger, and H. Levy II (1994), Growth of continental-scale metro-agro-plexes, regional ozone pollution and world food production, Science, 264, 74–77.
Chameides, W., X. Li, X. Zhou, C. Luo, C. S. Kiang, J. St. John, R. D. Saylor, S. C. Liu, and K. S. Lam (1999), Is ozone pollution affecting crop yield in China?, Geophys. Res. Lett., 26, 867–870.
Chen, G., H. Tian, M. Liu, W. Ren, C. Zhang, and S. Pan (2006), Climate impacts on China's terrestrial carbon cycle: An assessment with the dynamic land ecosystem model, in Environmental Modeling and Simulation, edited by H. Q. Tian, pp. 56–70, ACTA Press, Calgary, Alberta, Canada.
Chen, L. Y., H. Y. Liu, K. S. Lam, and T. Wang (1998), Analysis of the seasonal behavior of tropospheric ozone at Hong Kong (in Chinese), Atmos. Environ., 32(2), 159–168.
Cleland, E. E. (2005), The influence of multiple interacting global changes on the structure and function of a California annual grassland ecosystem, thesis, Stanford Univ., Stanford, Calif. (Available at http://gradworks.umi.com/31/62/3162365.html)
Dewar, R. C. (1996), The correlation between plant growth and intercepted radiation: An interpretation in terms of optimal plant nitrogen content, Ann. Bot., 78, 125–136.
Elliott, S., D. R. Blake, R. A. Duce, C. A. Lai, I. McCreary, L. A. McNair, F. S. Rowland, A. G. Russell, G. E. Streit, and R. P. Turco (1997), Motorization of China implies changes in Pacific air chemistry and primary production, Geophys. Res. Lett., 24(21), 2671–2674.
Enting, I. E., T. M. L. Wigley, and M. Heimann (1994), Future emissions and concentrations of carbon dioxide: Key ocean/atmosphere/land analyses, Tech. Pap. 31, Div. of Mar. and Atmos. Res., Commonw. Sci. and Ind. Res. Organ., Hobart, Tas., Australia.
Farquhar, G. D., S. von Caemmerer, and J. A. Berry (1980), A biochemical model of photosynthetic CO2 assimilation in leaves of C3 species, Planta, 149, 78–90.
Felzer, B., D. Kicklighter, J. Melillo, C. Wang, Q. Zhuang, and R. Prinn (2004), Effects of ozone on net primary production and carbon sequestration in the conterminous United States using a biogeochemistry model, Tellus, Ser. B, 56, 230–248.
Felzer, B. S., J. M. Reilly, D. W. Kicklighter, M. Sarofim, C. Wang, R. G. Prinn, and Q. Zhuang (2005), Future effects of ozone on carbon sequestration and climate change policy using a global biochemistry model, Clim. Change, 73, 195–425.
Feng, Z., M. Jin, and F. Zhang (2003), Effects of ground-level ozone (O3) pollution on the yields of rice and winter wheat in the Yangtze River Delta, J. Environ. Sci. China, 15, 360–362.
Fu, C. B., and G. Wen (1999), Variation of ecosystems over east Asia in association with seasonal interannual and decadal monsoon climate variability, Clim. Change, 43, 477–494.
Fuhrer, J., L. Skarby, and M. R. Ashmore (1997), Critical levels for ozone effects on vegetation in Europe, Environ. Pollut., 97(1–2), 91–106.
Ge, Q. S., J. H. Dai, F. N. He, J. Y. Zheng, Z. M. Man, and Y. Zhao (2003), Spatiotemporal dynamics of reclamation and cultivation and its driving factors in parts of China during the last three centuries, Prog. Nat. Sci., 14(7), 605–613.
Guo, J. P., C. Y. Wang, M. Wen, Y. M. Bai, and H. Z. Guo (2001), The experimental study on the impact of atmospheric O3 variation on rice (in Chinese), Acta Agron. Sinica, 27, 822–826.
Harley, P. C., R. B. Thomas, J. F. Reynolds, and B. R. Strain (1992), Modelling photosynthesis of cotton grown in elevated CO2, Plant Cell Environ., 15, 271–282.
Heagle, A. S. (1989), Ozone and crop yield, Annu. Rev. Phytopathol., 27, 397–423.
Heagle, A. S., J. S. Miller, F. L. Booker, and W. A. Pursley (1999), Ozone stress, carbon dioxide enrichment, and nitrogen fertility interactions in cotton, Crop Sci., 39, 731–741.
Heck, W. W. (1989), Assessment of crop losses from air pollutants in the United States, in Air Pollution's Toll on Forests and Crops, edited by J. J. MacKenzie and M. T. El-Ashry, pp. 235–315, Yale Univ. Press, New Haven, Conn.
Heck, W. W., W. W. Cure, J. O. Rawlings, L. J. Zaragoza, and A. S. Heagle (1984), Assessing impacts of ozone on agricultural crops. I. Overview, J. Air Pollut. Control Assoc., 34, 729–735.
Heggestad, H. E., and V. M. Lesser (1990), Effects of ozone, sulfur dioxide, soil water deficit, and cultivar on yields of soybean, J. Environ. Qual., 19, 488–495.
Houghton, R. A., and J. L. Hackler (2003), Sources and sinks of carbon from land-use change in China, Global Biogeochem. Cycles, 17(2), 1034, doi:10.1029/2002GB001970.
Huang, H., C. Wang, Y. Bai, and M. Wen (2004), A diagnostic experimental study of the composite influence of increasing O3 and CO2 concentration on soybean (in Chinese), Chin. J. Atmos. Sci., 28, 601–612.
Huang, Y., R. L. Sass, and F. M. Fisher Jr. (1998), A semi-empirical model of methane emission from flooded rice paddy soils, Global Change Biol., 3, 247–268.
Huang, Y., R. E. Dickinson, and W. L. Chameides (2006), Impact of aerosol indirect effect on surface temperature over east Asia, Proc. Natl. Acad. Sci. U. S. A., 103(12), 4371–4376.
Jacob, D. J., J. A. Logan, and P. P. Murti (1999), Effect of rising Asian emissions on surface ozone in the United States, Geophys. Res. Lett., 26(14), 2175–2178.
Jaffe, D., I. McKendry, T. Anderson, and H. Price (2003), Six 'new' episodes of trans-Pacific transport of air pollutants, Atmos. Environ., 37(3), 391–404.
Kuik, O. J., J. F. M. Helming, C. Dorland, and F. A. Spaninks (2000), The economic benefits to agriculture of a reduction of low-level ozone pollution in the Netherlands, Eur. Rev. Agric. Econ., 27, 76–90.
Lawrence, M. G., and P. J. Crutzen (1999), Influence of NOx emissions from ships on tropospheric photochemistry and climate, Nature, 402, 167–170.
Li, C., J. Aber, F. Stange, K. Butterbach-Bahl, and H. Papen (2000), A process-oriented model of N2O and NO emissions from forest soils: 1. Model development, J. Geophys. Res., 105(D4), 4369–4384.
Lindroth, R. L., B. J. Kopper, W. F. Parsons, J. G. Bockheim, D. F. Karnosky, G. R. Hendrey, K. S. Pregitzer, J. G. Isebrands, and J. Sober (2001), Consequences of elevated carbon dioxide and ozone for foliar chemical composition and dynamics in trembling aspen (Populus tremuloides) and paper birch (Betula papyrifera), Environ. Pollut., 115(3), 395–404.
Liu, J., and J. Diamond (2005), China's environment in a globalizing world—How China and the rest of the world affect each other, Nature, 435, 1179–1186.
Liu, J., M. Liu, D. Zhuang, Z. Zhang, and X. Deng (2003), Study on spatial pattern of land-use change in China during 1995–2000, Sci. China, Ser. D, 46(4), 373–384.
Liu, J. D., X. J. Zhou, Q. Yu, P. Yan, J. P. Guo, and G. A. Ding (2004), A numerical simulation of the impacts of ozone in the ground layer atmosphere on crop photosynthesis, Chin. J. Atmos. Sci., 28(1), 59–68.
Liu, J., M. Liu, H. Q. Tian, D. Zhuang, Z. Zhang, W. Zhang, X. Tang, and X. Deng (2005a), Current status and recent changes of cropland in China: An analysis based on Landsat TM data, Remote Sens. Environ., 98, 442–456.
Liu, J., H. Q. Tian, M. Liu, D. Zhuang, J. M. Melillo, and Z. Zhang (2005b), China's changing landscape during the 1990s: Large-scale land transformation estimated with satellite data, Geophys. Res. Lett., 32, L02405, doi:10.1029/2004GL021649.
Lü, C., and H. Tian (2007), Spatial and temporal patterns of nitrogen deposition in China: Synthesis of observational data, J. Geophys. Res., 112, D22S05, doi:10.1029/2006JD007990.
Mahowald, N. M., P. J. Rasch, B. E. Eaton, S. Whittlestone, and R. G. Prinn (1997), Transport of 222radon to the remote troposphere using the model of atmospheric transport and chemistry and assimilated winds from ECMWF and the National Center for Environmental Prediction/NCAR, J. Geophys. Res., 102, 28,139–28,152.
Marenco, A., H. Gouget, P. Nédélec, J.-P. Pagés, and F. Karcher (1994), Evidence of a long-term increase in tropospheric ozone from Pic du Midi data series: Consequences: Positive radiative forcing, J. Geophys. Res., 99(D8), 16,617–16,632.
Martin, M. J., G. E. Host, and K. E. Lenz (2001), Simulating the growth response of aspen to elevated ozone: A mechanistic approach to scaling a leaf-level model of ozone effects on photosynthesis to a complex canopy architecture, Environ. Pollut., 115, 425–436.
Mauzerall, D. L., D. Narita, H. Akimoto, L. Horowitz, and S. Waters (2000), Seasonal characteristics of tropospheric ozone production and mixing ratios of east Asia: A global three-dimensional chemical transport model analysis, J. Geophys. Res., 105, 17,895–17,910.
Mayer, M., C. Wang, M. Webster, and R. G. Prinn (2000), Linking local air pollution to global chemistry and climate, J. Geophys. Res., 105(D18), 22,869–22,896.
McGuire, A. D., et al. (2001), CO2, climate and land-use effects on the terrestrial carbon balance, 1920–1992: An analysis with four process-based ecosystem models, Global Biogeochem. Cycles, 15, 183–260.
Melillo, J. M. (1995), Human influences on the global nitrogen budget and their implications for the global carbon budget, in Toward Global Planning of Sustainable Use of the Earth: Development of Global Ecos-Engineering, edited by S. Murai and M. Kimura, pp. 117–133, Elsevier, Amsterdam.
Melillo, J. M., D. McGuire, D. W. Kicklighter, B. Moore III, C. J. Vorosmarty, and A. L. Schloss (1993), Global climate change and terrestrial net primary production, Nature, 363, 234–240.
Muntifering, R. B., A. H. Chappelka, J. C. Lin, D. F. Karnosky, and G. L. Somers (2006), Chemical composition and digestibility of Trifolium exposed to elevated ozone and carbon dioxide in a free-air (FACE) fumigation system, Functional Ecol., 20, 269–275.
Ollinger, S. V. (2002), Interactive effects of nitrogen deposition, tropospheric ozone, elevated CO2 and land use history on the carbon dynamics of northern hardwood forests, Global Change Biol., 8, 545–562.
Ollinger, S. V., J. D. Aber, and P. B. Reich (1997), Simulating ozone effects on forest productivity: Interactions among leaf-canopy and stand-level processes, Ecol. Appl., 7(4), 1237–1251.
Piikki, K., G. Sellden, and H. Pleijel (2004), The impact of tropospheric ozone on leaf number duration and tuber yield of the potato (Solanum tuberosum L.) cultivars Bintje and Kardal, Agric. Ecosyst. Environ., 104, 483–492.
Pitcairn, C. E. R., I. D. Leith, L. J. Sheppard, M. A. Sutton, D. Fowler, R. C. Munro, S. Tang, and D. Wilson (1998), The relationship between nitrogen deposition, species composition and foliar nitrogen concentrations in woodland flora in the vicinity of livestock farms, Environ. Pollut., 102, 41–48.
Ramankutty, N., and J. A. Foley (1998), Characterizing patterns of global land use: An analysis of global croplands data, Global Biogeochem. Cycles, 12(4), 667–685.
Rasch, D. A. M. K., E. M. T. Hendrix, and E. P. J. Boer (1997), Replication-free optimal designs in regression analysis, Comput. Stat., 12, 19–52.
Reich, P. B. (1987), Quantifying plant response to ozone: A unifying theory, Tree Physiol., 3, 63–91.
Ren, W., and H. Q. Tian (2007), Ozone pollution and terrestrial ecosystem productivity (in Chinese), J. Plant Ecol., 31(2), 219–230.
Ren, W., H. Q. Tian, G. Chen, M. Liu, C. Zhang, A. Chappelka, and S. Pan (2007), Influence of ozone pollution and climate variability on net primary productivity and carbon storage in China's grassland ecosystems from 1961 to 2000, Environ. Pollut., 149(3), 327–335, doi:10.1016/j.envpol.2007.05.029.
Shi, X., D. Yu, X. Pan, W. Sun, H. Wang, and Z. Gong (2004), 1:1,000,000 soil database of China and its application (in Chinese), in Proceedings of 10th National Congress of Soil Science of China, pp. 142–145, Sci. Press, Beijing.
Smith, G., J. Coulston, E. Jepsen, and T. Prichard (2003), A national ozone biomonitoring program—Results from field surveys of ozone sensitive plants in northeastern forests (1994–2000), Environ. Monit. Assess., 87, 271–291.
Streets, D. G., and S. Waldhoff (2000), Present and future emission of air pollutants in China: SO2, NOx, CO, Atmos. Environ., 34, 363–374.
Thornton, P. E., S. W. Running, and M. A. White (1997), Generating surfaces of daily meteorological variables over large regions of complex terrain, J. Hydrol., 190, 241–251.
Tian, H. Q., C. A. S. Hall, and Y. Qi (1998a), Modeling primary productivity of the terrestrial biosphere in changing environments: Toward a dynamic biosphere model, Crit. Rev. Plant Sci., 17(5), 541–557.
Tian, H. Q., J. M. Melillo, D. W. Kicklighter, A. D. McGuire, J. Helfrich, B. Moore III, and C. J. Vörösmarty (1998b), Effect of interannual climate variability on carbon storage in Amazonian ecosystems, Nature, 396, 664–667.
Tian, H. Q., J. M. Melillo, D. W. Kicklighter, A. D. McGuire, and J. Helfrich (1999), The sensitivity of terrestrial carbon storage to historical atmospheric CO2 and climate variability in the United States, Tellus, Ser. B, 51, 414–452.
Tian, H. Q., J. M. Melillo, D. W. Kicklighter, A. D. McGuire, J. Helfrich, B. Moore III, and C. J. Vörösmarty (2000), Climatic and biotic controls on annual carbon storage in Amazonian ecosystems, Global Ecol. Biogeogr., 9, 315–336.
Tian, H. Q., J. M. Melillo, D. W. Kicklighter, S. F. Pan, J. Y. Liu, A. D. McGuire, and B. Moore III (2003), Regional carbon dynamics in monsoon Asia and its implications for the global carbon cycle, Global Planet. Change, 37, 201–217.
Tian, H. Q., M. L. Liu, C. Zhang, W. Ren, G. S. Chen, and X. F. Xu (2005), DLEM—The Dynamic Land Ecosystem Model, user manual, Ecosyst. Sci. and Reg. Anal. Lab., Auburn Univ., Auburn, Ala.
Tian, H. Q., S. Wang, J. Liu, S. Pan, H. Chen, C. Zhang, and X. Shi (2006), Patterns of soil nitrogen storage in China, Global Biogeochem. Cycles, 20, GB1001, doi:10.1029/2005GB002464.
Tjoelker, M. G., J. C. Volin, J. Oleksyn, and P. B. Reich (1995), Interaction of ozone pollution and light effects on photosynthesis in a forest canopy experiment, Plant Cell Environ., 18, 895–905.
van Aardenne, J. A., G. R. Carmichael, H. Levy II, D. Streets, and L. Hordijk (1999), Anthropogenic NOx emissions in Asia in the period 1990–2020, Atmos. Environ., 33, 633–646.
Von Caemmerer, S., and G. D. Farquhar (1981), Some relationships between the biochemistry of photosynthesis and the gas exchange of leaves, Planta, 153, 376–387.
von Kuhlmann, R., M. G. Lawrence, P. J. Crutzen, and P. J. Rasch (2003), A model for studies of tropospheric ozone and nonmethane hydrocarbons: Model description and ozone results, J. Geophys. Res., 108(D9), 4294, doi:10.1029/2002JD002893.
Wang, C., and R. G. Prinn (1999), Impact of emissions, chemistry and climate on atmospheric carbon monoxide: 100-year predictions from a global chemistry-climate model, Chemosphere, 1(1–3), 73–81.
Wang, C., R. G. Prinn, and A. Sokolov (1998), A global interactive chemistry and climate model: Formulation and testing, J. Geophys. Res., 103(D3), 3399–3418.
Wang, C. Y. (1995), A study of the effect of ozone on crop (in Chinese), Q. J. Appl. Meteorol., 6(3), 343–349.
Wang, C. Y., J. P. Guo, Y. M. Bai, and M. Wen (2002), Experimental study of impacts by increasing ozone concentration on winter wheat, Acta Meteorol. Sinica, 60(2), 239–242.
Wang, S., H. Q. Tian, J. Liu, and S. Pan (2003), Pattern and change in soil organic carbon storage in China: 1960s–1980s, Tellus, Ser. B, 55, 416–427.
Wang, X., and Q. Guo (1990), The effects of ozone on respiration of the plants Fuchsia hybrida Voss. and Vicia faba L. (in Chinese), Environ. Sci., 11, 31–33.
Wang, X. K., W. Manning, Z. W. Feng, and Y. G. Zhu (2007), Ground-level ozone in China: Distribution and effects on crop yields, Environ. Pollut., 147(2), 394–400.
Wang, X. P., and D. L. Mauzerall (2004), Characterizing distributions of surface ozone and its impact on grain production in China, Japan and South Korea: 1990 and 2020, Atmos. Environ., 38, 4383–4402.
Wild, O., and H. Akimoto (2001), Intercontinental transport of ozone and its precursors in a three-dimensional global CTM, J. Geophys. Res., 106, 27,729–27,744.
Xu, D. F. (1983), Statistical Data of Agricultural Production and Trade in Modern China, Shanghai People's Press, Shanghai, China.
Younglove, T., P. M. McCool, R. C. Musselmann, and M. E. Kahl (1994), Growth-stage dependent crop yield response to ozone exposure, Environ. Pollut., 86, 287–295.
Zhang, C., H. Q. Tian, J. Liu, S. Wang, M. Liu, S. Pan, and X. Shi (2005), Pools and distributions of soil phosphorus in China, Global Biogeochem. Cycles, 19, GB1020, doi:10.1029/2004GB002296.
Zhang, C., H. Q. Tian, A. H. Chappelka, W. Ren, H. Chen, S. F. Pan, M. L. Liu, D. Styers, G. S. Chen, and Y. Wang (2007), Impacts of climatic and atmospheric changes on carbon dynamics in the Great Smoky Mountains National Park, Environ. Pollut., 149(3), 336–347, doi:10.1016/j.envpol.2007.05.028.

G. Chen, M. Liu, S. Pan, W. Ren, H. Tian, X. Xu, and C. Zhang, School of Forestry and Wildlife Sciences, Auburn University, Auburn, AL 36849, USA. (tianhan@auburn.edu)
B. Felzer, Ecosystem Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA.

work_4nwrkif3ibanfb2zq4ayey3xee ----

Making Journals Accessible Front & Back: Examining Open Journal Systems at CSU Northridge

Manuscript Citation: The following APA style citation may be used to reference this manuscript:
Borchard, L., Biondo, M., Kutay, S., Morck, D., & Weiss, A. (2015). Making journals accessible front & back: Examining open journal systems at CSU Northridge.
Retrieved from http://scholarworks.csun.edu

Version of Record Information:
Citation: Borchard, L., Biondo, M., Kutay, S., Morck, D., & Weiss, A. (2015). Making journals accessible front & back: Examining open journal systems at CSU Northridge. OCLC Systems & Services: International Digital Library Perspectives, 31(1), 35-50.
Copyright: Copyright © Emerald Publishing Limited
DOI: http://dx.doi.org/10.1108/OCLC-02-2014-0013

This item was retrieved from CSUN ScholarWorks, the open-access, institutional repository of California State University, Northridge. http://scholarworks.csun.edu

This is the author's penultimate, peer-reviewed, post-print manuscript as accepted for publication. The publisher-formatted PDF may be available through the journal web site or through your college and university library.

Structured Abstract:

Title: Making Journals Accessible Front & Back: Examining Open Journal Systems at CSU Northridge
Author(s): Laurie Borchard, Michael Biondo, Stephen Kutay, David Morck, Andrew Weiss
Journal: OCLC Systems and Services: International Digital Library Perspectives
Year: 2014

Purpose – This study examines the Public Knowledge Project (PKP) Open Journal Systems for its overall web accessibility and compliance with the Federal Electronic and Information Technology Accessibility and Compliance Act, also known as Section 508.

Design/methodology/approach – 21 individual web pages in the CSUN test instance of PKP's OJS version 2.4.0, used in three back-end journal development user roles, were examined using three web-accessibility tools (WAVE, Fangs, Functional Accessibility Evaluator). Errors in accessibility were then logged and mapped to specific Web Content Accessibility Guidelines (WCAG) criteria.

Findings – In all, 202 accessibility errors were reported across the 21 Open Journal Systems pages selected for testing. Because of this, the OJS cannot be efficiently utilized by assistive technologies and therefore does not pass the minimal level of acceptability as described in the Web Content Accessibility Guidelines 2.0. However, the authors found that the types of errors reported in this study could be simply and effectively remedied.

Research limitations/implications – Further studies will need to corroborate on a larger scale the problems of accessibility found in the specific pages. Only three user roles were examined; other roles will need to be analyzed for their own problems with accessibility. Finally, although specific errors were noted, most can be easily fixed.

Practical implications – There is an important need for accessible software design. In the case of CSUN, one of our campus partners will be better served by improving the web accessibility of our online open access journals.

Originality/value – Although many studies and analyses of Section 508 compliance of front-facing web resources have been conducted, very few appear to address the back-end of such tools. This is the first to examine what problems in accessibility journal users with disabilities might encounter as OJS system administrators, journal managers or journal editors.

Keywords: accessibility, accessible software design, library publishing, Open Journal Systems, open access, open source software design, Federal Electronic and Information Technology Accessibility and Compliance Act, Section 508, web accessibility

Article Type: Research paper
I. Project Overview

The emerging role of the library as publisher is an example of how librarians have become not only organizers of information but also facilitators of knowledge creation. With this new role, however, comes the responsibility to provide publishing tools that are accessible to all users through compliance with the Americans with Disabilities Act (ADA). Although analyses of front-end user interfaces have been prevalent, examinations of the administration and editing functionality of these publishing platforms have been overlooked. In response to this need, this article reports on the findings of a pilot project, implemented at the Oviatt Library at California State University, Northridge, which examines the open source publishing platform Open Journal Systems (OJS). Usability testing was conducted on the administration and editing functions to determine non-ADA-compliant issues for this subset of specialized back-end users.

California State University, Northridge (CSUN) is located in the center of the San Fernando Valley, 25 miles north of Los Angeles. CSUN comprises a diverse population of over 38,000 students (approximately 86-90% undergraduate) and 4,000 faculty members across nine colleges, including the university library. The Oviatt Library is located in the center of campus and serves CSUN students, faculty, and staff, along with the local community. There are over 30 tenured or tenure-track librarians in two major departments covering reference and technical services. In fall 2012 the library conducted strategic planning with special so-called "diagonal slice" groups dedicated to public service, print and e-resource management, space utilization, and marketing. Out of this strategic planning came the development of the Digital Publishing Implementation Group (DPI group). This working group's original charge was to position the Oviatt Library's institutional repository, ScholarWorks, as a central CSUN open-access-mandated press and publisher of e-texts and journals, as well as a repository for open educational resources. Along with this internal mandate within the library, in August 2013 CSUN's president, Dianne Harrison, signed the Berlin Declaration on Open Access, making open access initiatives a priority for the whole campus. Later, in November 2013, CSUN's Faculty Senate also passed an Open Access Resolution encouraging the faculty to publish and archive their work in open access repositories. Given the current push on campus for open access initiatives, the DPI group felt it was imperative to provide a recommendation for publishing open journals in order to contribute to and help foster the nascent campus open access movement. As the DPI group initially convened, it was decided that it would be best to develop a pilot project in which members would experiment with publishing a journal using a more user-friendly open journal platform rather than relying entirely on CSUN's ScholarWorks DSpace institutional repository. Although a central part of the campus's open access movement through its ETDs mandate, ScholarWorks was seen as insufficient for journal publishing overall, as DSpace does not provide sufficient workflows and user roles for the various tasks found in journal publishing. As a result, alternative open journal software platforms were researched and discussed, and eventually Public Knowledge Project's (PKP) Open Journal Systems (OJS) was chosen for the DPI group's journal publishing pilot project.
Following this, the DPI group discussed current CSUN publications and campus entities that might be willing to partner with the group. The group initially reached out to CSUN's Center on Disabilities (COD), which supports and implements the Annual International Technology and Persons with Disabilities Conference. Although the conference has been held for nearly 30 years, the COD has only just begun to publish conference proceedings via the Journal on Technology and Persons with Disabilities. The group met with Sean Goggin, the Technologies Manager at the Center on Disabilities, in early September. At that time he expressed his willingness to participate, but he also shared his experience with using OJS in the past. Mr. Goggin described some of the issues his staff had encountered with the inaccessibility of OJS for contributors and editors with disabilities. Despite the issues raised, the Journal on Technology and Persons with Disabilities was included in the initial pilot project along with three other journals: The Northridge Review, a creative writing journal for CSUN students; the California Geographer, the journal of the California Geographical Society (formerly edited by CSUN faculty); and the New Journal of Student Research Abstracts, a journal publishing the abstracts of science projects created by students in K-12 and edited by CSUN Biology Department emeritus faculty member Steven Oppenheimer. These journals were chosen because each represents different constituencies, potential user groups, and target audiences found at and outside of CSUN. The Journal on Technology and Persons with Disabilities, for example, despite being centered at CSUN, has a unique, international, decidedly non-CSUN audience with very specific user needs. In particular, the COD has a distinct need for accessible software. The Northridge Review is a CSUN student in-house publication designed to highlight the creative writing of the English Department's students, but it is currently published only in a print version. The California Geographer was chosen because it had previously been scanned in its entirety and placed in ScholarWorks (Biondo and Weiss 2013). As a result, it is a prime candidate for comparing the functionality of OJS and DSpace. Finally, the New Journal of Student Research Abstracts was chosen for its outreach role in fostering K-12 education. Each of the four journals in this pilot, then, would be able to provide a different perspective on the utility of OJS.

A note on the installation of OJS

In order to create a testing and development environment for OJS, an Intel Xeon based server was repurposed locally for the DPI group's test environment. The server was reformatted, and Red Hat Enterprise Linux Server 6.4 was installed and configured to send and receive from a static IP (internet protocol) address reserved for the library. Once the operating system was installed and configured on the Oviatt Library's IP, Apache 2.2.15 was installed, built, and configured on August 2, 2013, with firewall protection in place to limit access to local campus IP ranges. Once Apache was properly serving web data, additional services were installed via Red Hat's package system, including PHP 5.3.3 (PHP: Hypertext Preprocessor, a dynamic web programming language), MySQL 5.1.69 (open source database and management system), and phpMyAdmin (web-based database management). The DPI group chose to go with OJS version 2.4.2, which was the latest stable release at the time of installation.
The downloaded package was then extracted into our server's document root, and permissions were granted to make the specified directories writeable. A directory for OJS uploads was created below the document root to maintain a secure upload directory. A database user was created via phpMyAdmin to enable the installation script to work properly. Relevant server, administrative user, and configuration information was entered into the config.inc.php file provided with the OJS installation. Once everything was configured correctly, the command-line installation option was used, which populated the OJS database using the previously created database user. After this the site was serving correctly to browsers, and access to the administrative interface was tested successfully. The only other modifications made were some slight changes to the CSS (cascading style sheets) for aesthetic purposes.

II. Background: Accessibility, Open Access publishing & accessible software design

This case study was formulated to determine the feasibility of using OJS to provide a viable publishing platform for the aforementioned Journal on Technology & Persons with Disabilities. The DPI group wanted to assess which features of the software could present problems to those with disabilities who would assume organizational and editorial roles within the publication.

Accessibility

In 1998, the Federal Electronic and Information Technology Accessibility and Compliance Act was added as an amendment to the Rehabilitation Act of 1973. Commonly known as Section 508, the amendment dictates that all Federal agencies must make information equally usable and accessible to disabled individuals through the use of assistive technology and the proper construction of governmental webpages. Mandates of Section 508 have also been codified at state and local government levels for organizations that deal with Federal agencies or receive Federal funds. Likewise, these accessibility standards are broadly supported by the World Wide Web Consortium (W3C). The W3C formed the Web Content Accessibility Guidelines Working Group and in 1999 adopted the industry standard Web Content Accessibility Guidelines 1.0 (WCAG). An updated version, known as WCAG 2.0, was recommended in 2008. These standards are in place across the California State University system, where accessibility is "viewed as a necessity and an investment" (CSU web accessibility FAQ 2014). Such standards form the basis for this case study.

Literature review

A number of articles have been written concerning Section 508 of the Rehabilitation Act and the accessibility of academic library websites. The most recent of these is the exhaustive follow-up study by Comeaux and Schmetzke in 2012, which seeks to provide a "snapshot of web accessibility" across the webpages of 56 North American academic libraries (Comeaux and Schmetzke 2012). They found that accessibility is on the rise as a result of a trio of factors. First, more academic libraries have shifted their online presence to content management systems such as Drupal. Second, the use of CSS for web page layout has allowed for more accessible design. Finally, the adoption of campus-wide policies and standards for institutional webpages has greatly contributed to the development of web pages accessible to all. Similarly, in their article "Differently Able" published in 2011, Cassner et al. found that the "large majority of ARL libraries (88%) [had] a web page for people with disabilities" (Cassner, Maxey-Harris and Anaya 2011).
While these two recent works examine the front-end usability of academic library websites and the task of making information accessible, little has been written about the need to make information-creation and publishing tools such as OJS more accessible and usable for people with disabilities on the back end. Indeed, a search for literature related to the use of OJS and its compliance with Section 508 of the Rehabilitation Act has turned up little relevant material. Some of this can be attributed to the relative novelty of library publishing. The Library Publishing Toolkit, published in the summer of 2013, provides the most current and comprehensive collection of experiences of library publishing, as well as discussion of trends in scholarly publishing. Accessibility, however, is only mentioned in broad terms, as merely something to consider; it is never discussed in detail and certainly not in terms of the back-end editing functionality of a software platform. Cowan (2013) includes accessibility issues in her workflow checklist under technical considerations. Newton, et al. (2013) state that accessibility and usability principles are adhered to for their projects, but do not discuss these principles in greater detail. MacGregor, et al. (2013) discuss the Public Knowledge Project and focus their chapter on reviewing the embedded workflow within OJS. However, the accessibility issues with the back end of the OJS tool itself are not discussed and were not reviewed.

PKP's discussion forums for OJS were also searched for topics relating to accessibility issues and ADA compliance. Only one discussion topic was found in relation to ADA compliance. The topic thread started in 2008 with a user enquiring whether OJS would work with screen readers such as JAWS. A representative from the PKP support team at that time stated that they had not tested the product with screen readers, but that they were working on becoming completely 508 compliant, and they encouraged more user feedback. In 2010 another OJS user commented that they had run a journal table of contents through WAVE and identified problems relating to labels for language and the two search fields. Once again a representative from PKP promptly replied and stated that they were working on becoming completely 508 compliant, but that they had in fact not spent much time "determining where we might be slack" (MacGregor 2010). The OJS representative provided a list of recommendations based on a project they were currently working on and once again encouraged more user feedback. Apart from the example cited above, the authors have been unable to find any more recent discussion of accessibility in the OJS support forums that PKP offers. This particular discussion indicates that although the developers claim to be working toward complete 508 compliance, they do not yet appear to have completed extensive usability/accessibility tests (OJS).

As a result of this dearth of scholarship indicating the OJS's overall level of web accessibility, the authors decided to implement web usability tests. The DPI group first checked the front-end accessibility of the OJS pages by implementing several software solutions designed to test compliance with web accessibility standards.
The first, called the WAVE toolbar (a plug-in for the Mozilla Firefox browser), checks the content displayed on a webpage and provides visual notifications of non-compliance at the exact locations within a webpage. To corroborate these red flags in accessibility, FANGS, a screen reader simulator for Mozilla Firefox browsers, was installed to approximate what a disabled user might encounter using assistive technologies. Finally, the Functional Accessibility Evaluator (FAE) toolbar, developed by the University of Illinois at Urbana-Champaign, was deployed to identify accessibility markup issues within the HTML code. (See Table 1)

Table 1. Accessibility Tools and Resources Utilized

Functional Accessibility Evaluator (FAE)
  Web- and browser-based accessibility testing tool from the Illinois Center for Information Technology and Web Accessibility (iCITA), which includes a range of tools for validating and testing webpages. The browser version was utilized for our testing.
  - Browser Toolbar: https://addons.mozilla.org/en-US/firefox/addon/accessibility-evaluation-toolb/
  - Online Tool: http://fae.cita.uiuc.edu/

FANGS
  Screen reader emulation tool. Renders a text version of webpages to represent how a screen reader would interpret the semantic markup.
  - Browser Toolbar: https://addons.mozilla.org/en-US/firefox/addon/fangs-screen-reader-emulator/

WAVE
  Web Accessibility Evaluation Tool (WAVE) from WebAIM (Web Accessibility in Mind, based at Utah State University); creates WAVE accessibility reports in the browser, or is able to send pages to wave.webaim.org for detailed reports. Also analyzes structure and order, and is able to provide text overlays on live pages with accessibility concerns labelled.
  - Browser Toolbar: http://wave.webaim.org/toolbar/
  - Online Accessibility Evaluation Tool: http://wave.webaim.org/
  - WebAIM Home: http://webaim.org/

Section 508
  The Section 508 Amendment to the Rehabilitation Act of 1973 is a federal act that requires Federal agencies to ensure that electronic and information technology is accessible to people with disabilities, and provides guidelines to facilitate accessibility.
  - Section 508 Homepage: https://www.section508.gov/

WCAG
  Web Content Accessibility Guidelines, developed through the W3C (World Wide Web Consortium) in order to provide guidelines for web content developers, tool developers, and policy makers to create accessible content for those with disabilities.
  - WCAG Guidelines Home: http://www.w3.org/WAI/intro/wcag

W3C
  The World Wide Web Consortium is an organization that works with developers, vendors, and organizations in an effort to standardize core principles and components of the World Wide Web.
  - W3C Home: http://www.w3.org/

Table 1. The resources and technologies used for determining ADA compliance include W3C, WCAG, FANGS, WAVE, and FAE. Each is described above.

After running the tests, it was determined that OJS provides sufficient ADA compliance for outward-facing content display pages. Those with disabilities are likely to find no issues with accessing the content as long as those developing the content on the back end ensure that they provide ADA-compliant content. The functionality exists to allow journal creators, editors, and managers to publish ADA-compliant content.
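As an illustration of what such compliant front-end content involves, the sketch below shows the kind of semantic markup a published article page needs: a declared language, text alternatives for images, and descriptive link text. The snippet is the present reviewers' own example, not code taken from the OJS templates, and the file names are hypothetical.

    <!-- Illustrative WCAG-conformant article markup; not actual OJS template code -->
    <html lang="en">
      <body>
        <h1>Article Title</h1>
        <h2>Abstract</h2>
        <p>...</p>
        <!-- Images carry text alternatives so screen readers can describe them -->
        <img src="figure1.png" alt="Bar chart of accessibility errors by user role">
        <!-- Link text names the destination rather than saying "click here" -->
        <a href="galley.pdf">Download the full text (PDF)</a>
      </body>
    </html>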
Those with disabilities are likely to find no issues with accessing the content as long as those developing the content on the back end ensure that they provide ADA-compliant content. The functionality exists to allow journal creators, editors, and managers to publish ADA compliant content. However, given the increasing need for accessible software design, the authors decided to focus on the accessibility of back-end development pages, which would allow those with disabilities to participate in content creation, journal maintenance and administration, editing, and peer-reviewing. One of the main issues raised by our inquiry is the need for web-based digital tools -- especially those in the open source domain -- that are themselves ADA compliant. Based upon our examination of the literature and the perceived lack of in-depth analysis of ADA compliance in open source software design, our essential question is to examine whether PKP’s OJS would work across three major back-end user roles for people with disabilities. III. Methodology For this study the authors examined the implementation of accessible functionality across 21 back-end web pages specific to three user roles in the CSUN- installed instance of the OJS (version 2.4.2.0) for their adherence to ADA section 508 criteria. (See Figure 1 below.) Figure 1. Screenshot showing the 9 pages analyzed for the OJS California Geographer Journal Editor role, including the following: Editor Home, Unassigned, In Review, In Editing, Archives, Create Issue, Notify Users, Future Issues, Back Issues. The procedure was as follows: first, selected pages were examined via the WAVE program and their warnings and findings were noted. (WebAim 2014) Second, the Mozilla browser plug-in FANGS accessibility simulator was applied to each page to approximate what a person using such software would hear using a screen reader application. (Mozilla Foundation) The FANGS output confirmed the significance of the errors reported by the WAVE evaluator. Third, a report from the Functional Accessibility Evaluator (FAE) was generated for each page to examine issues of accessibility not addressed directly by either WAVE or FANGS. (FAE 2014) The errors found on each page were recorded, tallied, and mapped to a Section 508 compliance checklist developed by the W3C from the Web Content Accessibility Guidelines version 2.0. (W3C 2013) These guidelines provide success criteria from which ADA compliance is measured. As OJS allows for multiple user roles to be created, it was determined that those roles most used during the journal building process would be examined for their adherence to these ADA section 508 criteria. As a result, Site Administrator workflow pages, Journal Manager workflow pages and Journal Editor workflow pages were examined. A total of 21 unique web pages were examined, including 7 Site Administrator pages, 5 Journal Manager pages, and 9 Journal Editor pages. Within these roles the specific user pages examined are as follows: A. Site Administrator user role [7 pages]: Site Management: · [Page 1] Site Settings · [Page 2] Hosted Journals · [Page 3] Languages · [Page 4] Authentication Sources · [Page 5] Categories nistrative Functions: · [Page 6] System Information · [Page 7] Merge Users B. Journal Manager (California Geographer) user role [5 pages]: Journal Setup: Five Steps to a Journal Web Site · [Page 1] Details: Name of journal, ISSN, contacts, sponsors, and search engines. 
· [Page 2] Policies: Focus, peer review, sections, privacy, security, and additional about items.
· [Page 3] Submissions: Author guidelines, copyright, and indexing (including registration).
· [Page 4] Management: Access and security, scheduling, announcements, copyediting, layout, and proofreading.
· [Page 5] The Look: Homepage header, content, journal header, footer, navigation bar, and style sheet.

C. Journal Editor (California Geographer) user role [9 pages]:
· [Page 1] Editor Home
Submissions:
· [Page 2] Unassigned
· [Page 3] In Review
· [Page 4] In Editing
· [Page 5] Archives
Issues:
· [Page 6] Create Issue
· [Page 7] Notify Users
· [Page 8] Future Issues
· [Page 9] Back Issues

Each web page within the OJS system's designated user role was examined as outlined above, and the results were archived as individual PDFs indicating their adherence to specific ADA Section 508 criteria. The results of the accessibility tests for each role and page are explained in the following section.

IV. Results

Combined reports from both the WAVE and FAE evaluators reveal a total of 202 accessibility errors across the twenty-one pages selected from the OJS Web application (See Table 2). These errors were mapped to the following five associated WCAG 2.0 criteria, listed below in order of frequency:

1. Criterion 1.3.1 - Info and Relationships (166 errors). This success criterion ensures "that information and relationships that are implied by visual or auditory formatting are preserved when the presentation format changes" (W3C, 2013).
2. Criterion 3.1.1 - Language of Page (21 errors). This success criterion ensures "that content developers provide information in the Web page that user agents need to present text and other linguistic content correctly" (W3C, 2013).
3. Criterion 2.4.6 - Headings and Labels (7 errors). This success criterion helps "users understand what information is contained in Web pages and how that information is organized" (W3C, 2013).
4. Criterion 2.4.4 - Link Purpose (4 errors). This success criterion helps "users understand the purpose of each link so they can decide whether they want to follow the link" (W3C, 2013).
5. Criterion 4.1.1 - Parsing (4 errors). This success criterion ensures "that user agents, including assistive technologies, can accurately interpret and parse content" (W3C, 2013).

TABLE 2: Total Accessibility Errors

OJS Page                        1.3.1          2.4.4      2.4.6       3.1.1     4.1.1      Total
                                Relationships  Link Text  Navigation  Language  Structure
Admin 1 - Site Setting          5              n/a        1           1         1          8
Admin 2 - Journals              2              2          n/a         1         n/a        5
Admin 3 - Languages             3              n/a        n/a         1         n/a        4
Admin 4 - Authentication        3              n/a        n/a         1         1          5
Admin 5 - Categories            2              n/a        n/a         1         n/a        3
Admin 6 - System Info           2              n/a        n/a         1         n/a        3
Admin 7 - Merge Users           7              n/a        n/a         1         n/a        8
Manager 1 - Details             1              n/a        n/a         1         n/a        2
Manager 2 - Policies            9              n/a        n/a         1         n/a        10
Manager 3 - Guiding Subs        35             n/a        n/a         1         n/a        36
Manager 4 - Mng. Journal        9              n/a        n/a         1         n/a        10
Manager 5 - Custom Look         24             n/a        3           1         n/a        28
Editor 1 - Home                 10             n/a        n/a         1         n/a        11
Editor 2 - Subs - Unassigned    12             n/a        n/a         1         n/a        13
Editor 3 - Subs - In Review     12             n/a        n/a         1         1          14
Editor 4 - Subs - In Editing    12             n/a        n/a         1         1          14
Editor 5 - Subs - Archives      12             n/a        n/a         1         n/a        13
Editor 6 - Issues - Create      2              n/a        2           1         n/a        5
Editor 7 - Issues - Notify      3              n/a        1           1         n/a        5
Editor 8 - Issues - Future      n/a            n/a        n/a         1         n/a        1
Editor 9 - Issues - Back        1              2          n/a         1         n/a        4
Totals                          166            4          7           21        4          202

Table 2. Combined accessibility errors as reported from WAVE and FAE evaluators and mapped to WCAG 2.0 criteria.
Independent analysis of errors reported by the WAVE evaluator identified accessibility issues occurring in the front-end interface in 19 of the 21 pages (See Table 3). Of these, 'missing form labels' (WCAG 2.0, 1.3.1) was the most commonly cited error at 149 instances. This is followed by: 6 instances of 'orphaned form labels' (WCAG 2.0, 2.4.6); 4 instances of 'problematic link text' (WCAG 2.0, 2.4.4); and 1 instance of 'multiple form labels' (WCAG 2.0, 2.4.6).

TABLE 3: WAVE Accessibility Errors

OJS Page                        WAVE Error Description   WCAG 2.0 Criterion   Error Instances
Admin 1 - Site Setting          Missing form label       1.3.1                4
                                Orphaned form label      2.4.6                1
Admin 2 - Journals              Missing form label       1.3.1                2
                                Problematic link text    2.4.4                2
Admin 3 - Languages             Missing form label       1.3.1                3
Admin 4 - Authentication        Missing form label       1.3.1                3
Admin 5 - Categories            Missing form label       1.3.1                2
Admin 6 - System Info           Missing form label       1.3.1                2
Admin 7 - Merge Users           Missing form label       1.3.1                7
Manager 1 - Details             No errors                n/a                  0
Manager 2 - Policies            Missing form label       1.3.1                8
Manager 3 - Guiding Subs.       Missing form label       1.3.1                23
Manager 4 - Mng. Journal        Missing form label       1.3.1                8
Manager 5 - Custom Look         Missing form label       1.3.1                24
                                Orphaned form label      2.4.6                3
Editor 1 - Home                 Missing form label       1.3.1                10
Editor 2 - Subs - Unassigned    Missing form label       1.3.1                12
Editor 3 - Subs - In Review     Missing form label       1.3.1                12
Editor 4 - Subs - In Editing    Missing form label       1.3.1                12
Editor 5 - Subs - Archives      Missing form label       1.3.1                12
Editor 6 - Issues - Create      Missing form label       1.3.1                1
                                Orphaned form label      2.4.6                1
                                Multiple form labels     2.4.6                1
Editor 7 - Issues - Notify      Missing form label       1.3.1                3
                                Orphaned form label      2.4.6                1
Editor 8 - Issues - Future      No errors                n/a                  0
Editor 9 - Issues - Back        Missing form label       1.3.1                1
                                Problematic link text    2.4.4                2
Total                                                                         160

Table 3. Accessibility errors reported using the WAVE accessibility evaluator.

Independent analysis of errors reported by the FAE evaluator identified accessibility issues occurring in the underlying code across all pages (See Table 4). Of these, 'no language attribute' (WCAG 2.0, 3.1.1) was the most commonly cited error, occurring in all pages with 21 total instances. This is followed by: 11 instances of 'two or more levels of nesting' (WCAG 2.0, 1.3.1); 4 instances of 'improperly nested headings' (WCAG 2.0, 4.1.1); and 2 instances each of 'no table summary', 'no table header ID', and 'no table header' (WCAG 2.0, 1.3.1).

TABLE 4: FAE Accessibility Errors

OJS Page                        FAE Error Description        WCAG 2.0 Criterion   Error Instances
Admin 1 - Site Setting          No language attribute        3.1.1                1
                                Heading improperly nested    4.1.1                1
                                2+ levels of nesting         1.3.1                1
Admin 2 - Journals              No language attribute        3.1.1                1
Admin 3 - Languages             No language attribute        3.1.1                1
Admin 4 - Authentication        No language attribute        3.1.1                1
                                Heading improperly nested    4.1.1                1
Admin 5 - Categories            No language attribute        3.1.1                1
Admin 6 - System Info           No language attribute        3.1.1                1
Admin 7 - Merge Users           No language attribute        3.1.1                1
Manager 1 - Details             No language attribute        3.1.1                1
                                2+ levels of nesting         1.3.1                1
Manager 2 - Policies            No language attribute        3.1.1                1
                                2+ levels of nesting         1.3.1                1
Manager 3 - Guiding Subs.       No language attribute        3.1.1                1
                                No table summary             1.3.1                2
                                No table header ID           1.3.1                2
                                No table header              1.3.1                2
                                2+ levels of nesting         1.3.1                6
Manager 4 - Mng. Journal        No language attribute        3.1.1                1
                                2+ levels of nesting         1.3.1                1
Manager 5 - Custom Look         No language attribute        3.1.1                1
Editor 1 - Home                 No language attribute        3.1.1                1
Editor 2 - Subs - Unassigned    No language attribute        3.1.1                1
Editor 3 - Subs - In Review     No language attribute        3.1.1                1
                                Heading improperly nested    4.1.1                1
Editor 4 - Subs - In Editing    No language attribute        3.1.1                1
                                Heading improperly nested    4.1.1                1
Editor 5 - Subs - Archives      No language attribute        3.1.1                1
Editor 6 - Issues - Create      No language attribute        3.1.1                1
                                2+ levels of nesting         1.3.1                1
Editor 7 - Issues - Notify      No language attribute        3.1.1                1
Editor 8 - Issues - Future      No language attribute        3.1.1                1
Editor 9 - Issues - Back        No language attribute        3.1.1                1
Total                                                                             42

Table 4. Accessibility errors reported using the Functional Accessibility Evaluator (FAE).
Manager 4 - Mng. Journal | No language attribute | 3.1.1 | 1
Manager 4 - Mng. Journal | 2+ levels of nesting | 1.3.1 | 1
Manager 5 - Custom Look | No language attribute | 3.1.1 | 1
Editor 1 - Home | No language attribute | 3.1.1 | 1
Editor 2 - Subs - Unassigned | No language attribute | 3.1.1 | 1
Editor 3 - Subs - In Review | No language attribute | 3.1.1 | 1
Editor 3 - Subs - In Review | Heading improperly nested | 4.1.1 | 1
Editor 4 - Subs - In Editing | No language attribute | 3.1.1 | 1
Editor 4 - Subs - In Editing | Heading improperly nested | 4.1.1 | 1
Editor 5 - Subs - Archives | No language attribute | 3.1.1 | 1
Editor 6 - Issues - Create | No language attribute | 3.1.1 | 1
Editor 6 - Issues - Create | 2+ levels of nesting | 1.3.1 | 1
Editor 7 - Issues - Notify | No language attribute | 3.1.1 | 1
Editor 8 - Issues - Future | No language attribute | 3.1.1 | 1
Editor 9 - Issues - Back | No language attribute | 3.1.1 | 1
Total | | | 42

Table 4. Accessibility errors reported using the Functional Accessibility Evaluator (FAE).

Analysis of the pages assigned to 'user types' in our study reveals that the highest number of accessibility errors was within the 'manager' pages, with 86 accessibility errors. The pages assigned for use by 'editors' contained 80 accessibility errors. The pages designated for site 'administrators' had the fewest accessibility errors at 36. Among all three user types, errors within the WCAG 2.0 1.3.1 success criterion were most prevalent due to the high occurrence of form boxes that did not contain labels.

V. Discussion: Implications and quick fixes

Immediate implications of the results: Given the lack of in-depth analysis in the library and information science literature, analyzing the design of the back end helps librarians determine whether or not a proposed software package will serve users with disabilities. What the authors have found in this study is that although the software functions tolerably well on the display side, the back-end pages exhibited several distinct non-accessible behaviors. Essentially, five distinct WCAG criteria were determined to have specific errors detectable by the three tools employed.

The most commonly discovered error -- found on all but one of the pages examined and designated as criterion 1.3.1 by WCAG -- is defined as "Information and relationships conveyed through presentation can be programmatically determined, and notification of changes to these is available to user agents, including assistive technologies." (W3C, 2013) The test results showed that the lack of form labels in the pages generally contributes to the possibility of overlooked or misunderstood input prompts. The lack of form labels on the pages examined effectively hides the fact that forms for input exist. A person with disabilities using a screen reader would not recognize the form box in which to supply information regarding the journal (see Figure 2).

Figure 2: Screenshot showing results of the WAVE analysis of the "Editor Home" webpage. Various accessibility issues, including missing form labels, are indicated with red flags. This page showed 10 accessibility errors according to the software.

Additionally, under criterion 1.3.1, six of the pages sampled with the FAE test (11 instances in all) showed two or more levels of nesting in headings and tables. Tables were also found to be missing header attributes and header IDs. These are relatively minor problems but still result in less information being conveyed to the disabled user. Assistive technologies must be able to interpret relationships between elements found on the page to ensure accurate rendering from within the assistive technology.
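To make the form-label problem concrete, consider the following minimal markup sketch (the field name is hypothetical and is not taken from the OJS templates). In the first fragment a screen reader announces only an anonymous text box; in the second, the label is programmatically associated with the input through the for/id pairing, so the field's purpose can be announced:

    <!-- Inaccessible: no label element is associated with the input -->
    Journal title: <input type="text" name="journalTitle" />

    <!-- Accessible: the label is tied to the input via matching for/id values -->
    <label for="journalTitle">Journal title</label>
    <input type="text" id="journalTitle" name="journalTitle" />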
The second most common error, found on all pages and designated as criterion 3.1.1 by WCAG, is defined as "The primary natural language or languages of the Web unit can be programmatically determined." (W3C, 2013) The results show that although the lack of a defined language appears on all pages, this is mainly a minor inconvenience. It is likely to have little or no impact on users unless language translations are required.

The third most common error, found on four of the pages and designated as criterion 2.4.6 by WCAG, is defined as "When a Web unit or authored component is navigated sequentially, components receive focus in an order that follows relationships and sequences in the content." Criterion 2.4.6 does not require labels, as in 1.3.1, but rather requires that existing labeled elements be logically interpreted. The results show that a few of the form labels indicated in the pages are orphaned from their respective form boxes. As a result, these labels are not properly ordered within the page. Additionally, if the form labels do not accurately describe the correct form boxes, the boxes remain ambiguous in their definitions on the page, further causing problems with sequential navigation.

The fourth most common error, found on 4 of the pages and designated as criterion 4.1.1 by WCAG, is defined as "Web units or authored components can be parsed unambiguously, and the relationships in the resulting data structure are also unambiguous." These errors were primarily flagged by the FAE tool in the form of improperly nested headings. Again, the problem is that those relying on screen readers may not necessarily understand the parts of the page without the reader spelling them out.

The fifth most common error, found on 2 of the pages and designated as criterion 2.4.4 by WCAG, is defined as "Each link is programmatically associated with text from which its purpose can be determined." These errors manifest in the OJS pages as problematic link text. Though likely a minor issue, such links do not provide text from which a user relying on a screen reader can understand the action or destination intended by the link.

Addressing Issues of Accessibility

The majority of the accessibility issues presented were missing form labels and missing language definitions; most of the missing form labels occur in the inward-facing administration pages rather than in the forward-facing, publicly accessible pages. OJS leaves its templates open and editable on the server side, so these issues could be addressed by adding form labels and by using meta tags in the head sections of the pages to give a language definition. While we installed 2.4.2 as a stable release for OJS, we have yet to patch to the newest (2.4.3) version. The risk here is that any changes to the templating system may be patched over when upgrading versions; however, this has not yet been established as a problem. As a result, any changes that are made would have to be documented in case they need to be re-established. One option is to fork a version of OJS from the project's GitHub repository and share back any commits in the hope that the changes would be implemented in a future public release. OJS also lacks keyboard shortcuts to enable the up, down, left, and right arrows in screen readers, which would have to be addressed programmatically.
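A minimal sketch of the kinds of template edits proposed above follows; the markup is illustrative only and is not excerpted from the actual OJS templates. Declaring the page language addresses criterion 3.1.1, descriptive link text addresses criterion 2.4.4, and explicit table headers address the table problems discussed below under criterion 1.3.1:

    <!-- 3.1.1: declare the primary language of the page on the root element -->
    <html lang="en">

    <!-- 2.4.4: link text should convey the action or destination -->
    <a href="futureIssues">View the list of future issues</a>

    <!-- 1.3.1: mark up tabular data with a summary and explicit column headers -->
    <table summary="Submissions awaiting editor assignment">
      <tr><th scope="col">ID</th><th scope="col">Title</th><th scope="col">Status</th></tr>
    </table>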
As noted above, some tables lack summaries, captions, and row and column headers. Tables designated purely for layout should be summarized to note this, and tables with tabular data should have clearly defined row and column headers in order to be accessible. This could be easily addressed with some templating changes. In several of the administrative sections there is also a lack of link text relating to navigation, as well as abbreviations that are not semantically marked up to provide explanations. Again, these issues could be addressed in templating and could ultimately be adopted back into the project by PKP.

VI. Future Studies, Limitations and Conclusion

Impact on CSUN's future in journal publishing

As OJS is a prime candidate for the development of a CSUN-based library Open Access publishing platform, its adherence to current university-wide web policy becomes more important. Because CSUN has been mandated to adhere as closely as possible to ADA standards, using OJS for official university publishing must therefore meet the established guidelines for web presence. As a result, it may be somewhat problematic to establish OJS as the go-to publishing tool without significant discussion and analysis of its ADA compliance. As the authors intend the CSUN Open Journals to be used by all of its stakeholders, including faculty, staff, and students, compliance for all aspects of the software becomes essential.

Limitations of the study: Although the study was able to pinpoint some errors in accessibility, there are some limitations that should be addressed. Awareness of these limitations will nonetheless help to indicate how future studies might produce more accurate results. First, the study was conducted without examining other similar types of software or similar online publishing platforms housing digital content, such as DSpace, CONTENTdm, and the like. Additionally, although the version tested was the most current at the time, a newer version of OJS has since been developed. As a result, further studies would be advised to examine similar types of products for their adherence to ADA Section 508 guidelines, as well as to look at future updates of the software. Next, only selected roles (the three most common, as assumed by the authors) and selected pages were chosen for the test. This represents a less-than-complete examination of the web functionality of the whole site; a more comprehensive examination would need to cover a significantly higher number of pages, as such a small sample is not representative of the entire application. Finally, there is no one universal tool that definitively examines all aspects of Section 508 compliance. For this study multiple tools were needed to analyze the site, yet each tool provides a different overall analysis. It is possible that a fourth or even a fifth tool would indicate further errors not found by the others.

Conclusions: It will be imperative for those interested in ADA compliance to examine whether more appropriate alternatives to OJS exist. Although the front end of most software is largely ADA compliant, the tools themselves also need to be accessible to the people who use them. This emphasis on accessible software design will likely be an important consideration for all web-based software tools. Ultimately, the authors find that the back end of the PKP OJS software platform will pose some problems for users with disabilities. In the case of CSUN's Center on Disabilities (COD), one of CSUN Open Journals' partners, this lack of compliance is a deal breaker.
The COD is unlikely to use OJS with Oviatt Library beyond the front-end capacity, since peer reviewers, editors, and other staff affiliated with the COD's conference and journal may need fully accessible software. It is argued, however, that most of these errors can be easily fixed. Once such issues are addressed, it is likely that CSUN Oviatt Library will continue to use OJS to develop viable open access journals for all its constituents.

CITATIONS:

Anon. (2014), California State University Accessible Technology Initiative FAQ, [Online]. Available at: http://www.calstate.edu/accessibility/webaccessibility/web_accessibility_FAQs.shtml (Accessed 4 February 2014).

Biondo, M. and Weiss, A. (2013), "The California Geographer and the OA movement: using the green OA institutional repository as a publishing platform", in Brown, A. (Ed.), Library Publishing Toolkit, IDS Project Press, Geneseo, NY, pp. 207-214.

Cassner, M., Maxey-Harris, C. and Anaya, T. (2011), "Differently Able: A Review of Academic Library Websites for People with Disabilities", Behavioral & Social Sciences Librarian, 30(1), pp. 33-51.

Comeaux, D. and Schmetzke, A. (2013), "Accessibility of Academic Library Websites in North America: Current Status and Trends (2002-2012)", Library Hi Tech, 31(1), pp. 8-33.

Cowan, S. (2013), "Open Access Journal Incubator at University of Lethbridge Library", in Brown, A. (Ed.), Library Publishing Toolkit, IDS Project Press, Geneseo, NY, pp. 179-186.

FAE (2014), Functional Accessibility Evaluator 1.1, [Online]. Available at: http://fae.cita.uiuc.edu/ (Accessed 16 April 2014).

Mozilla Foundation, Fangs Screen Reader Emulator: Add-ons for Firefox, [Online]. Available at: https://addons.mozilla.org/en-US/firefox/addon/fangs-screen-reader-emulator/ (Accessed 16 April 2014).

MacGregor, J. (2010), "Re: OJS and screen readers", PKP Support [online discussion forum], available at: http://pkp.sfu.ca/support/forum/viewtopic.php?f=2&t=3048&p=10931&hilit=accessibility#wrap (accessed 30 January 2014).

MacGregor, J., Meijer-Kline, K., Owen, B., Stranack, K. and Willinsky, J. (2013), "The Public Knowledge Project: Open Source-Publishing Services for Your Library", in Brown, A. (Ed.), Library Publishing Toolkit, IDS Project Press, Geneseo, NY, pp. 359-366.

Newton, M., Cunningham, E. and Morris, J. (2013), "Emerging Opportunities in Library Services: Planning for the Future of Scholarly Publishing", in Brown, A. (Ed.), Library Publishing Toolkit, IDS Project Press, Geneseo, NY, pp. 109-118.

"OJS and Screen Readers", PKP Support [online discussion forum], available at: http://pkp.sfu.ca/support/forum/viewtopic.php?f=2&t=3048&p=10931&hilit=accessibility#wrap (accessed 30 January 2014).

WebAIM (2014), WebAIM: Web Accessibility In Mind, [Online]. Available at: http://wave.webaim.org/ (Accessed 16 April 2014).

W3C World Wide Web Consortium (2013), "Understanding WCAG 2.0", available at: http://www.w3.org/TR/WCAG20/ (Accessed 2 February 2014).
About the Authors:

Laurie Borchard is a Digital Learning Initiatives Librarian at California State University, Northridge; she can be reached at laurie.borchard@csun.edu. Michael Biondo is the ScholarWorks Assistant at California State University, Northridge; he is a graduate student in the School of Library and Information Science at San Jose State University. Stephen Kutay is a Digital Services Librarian at California State University, Northridge. David Morck is a Web Programmer at California State University, Northridge. Andrew Weiss is a Digital Services Librarian at California State University, Northridge.

Structured Abstract:
I. Project Overview
II. Background: Accessibility, Open Access publishing & accessible software design
III. Methodology
IV. Results
V. Discussion: Implications and quick fixes
VI. Future Studies, Limitations and Conclusion
CITATIONS:
About the Authors:

work_4rt7xda2gffx3ah27z4ywby7om ----

CONTACT Taemin Kim Park, park@indiana.edu, or Andrea M. Morrison, amorriso@indiana.edu, Indiana University Bloomington, Technical Services Department, Herman B Wells Library E350, 1320 E. 10th Street, Bloomington, IN 47405-3907, USA.

To cite this article: Taemin Kim Park & Andrea M. Morrison (2017) The Nature and Characteristics of Bibliographic Relationships in RDA Cataloging Records in OCLC at the Beginning of RDA Implementation, Cataloging & Classification Quarterly, 55:6, 361-386, DOI: 10.1080/01639374.2017.1319451

The Nature and Characteristics of Bibliographic Relationships in RDA Cataloging Records in OCLC at the Beginning of RDA Implementation

Taemin Kim Park & Andrea M. Morrison

Technical Services Department, Indiana University Libraries, Bloomington, Indiana, USA

Abstract

This research examines the characteristics and types of bibliographic relationships in Resource Description and Access (RDA) book cataloging records produced in OCLC after RDA implementation. Data was sampled (n = 1,550), coded, and analyzed for work-to-work, expression-to-expression, and manifestation-to-manifestation relationships. Results show work-to-work bibliographic relationships are most frequently recorded in both PCC records (57.4%) and non-PCC (59.5%); expression-to-expression are recorded the least in PCC (8.3%) and non-PCC (15.8%); and manifestation-to-manifestation relationships fall between, with PCC (34.4%) and non-PCC (24.7%). This study also investigates the MARC fields used to record relationships and common characteristics in cataloging records with bibliographic relationships.
Introduction

Cataloging data created by RDA helps information users find, identify, select, and obtain bibliographic resources.1 The global cataloging community has established a strong theoretical and practical foundation since Charles Ammi Cutter's publication of Rules for a Dictionary Catalog, 4th edition, 1904.2 The rich history of cataloging codes is evidence of a thriving, collaborative effort to support improved resource discovery. Landmark cataloging standards reflect the community's principle-based practice from the Paris Statement of Principles (1961); International Standard for Bibliographic Description (ISBD); AACR2 (1981) and its revisions; MARC 21 Formats; Statement of International Cataloguing Principles (2009); and Resource Description and Access (RDA, 2010-).

From 2005-2015, the Joint Steering Committee for the Development of RDA (JSC) was responsible for maintaining the newest international cataloging standard, Resource Description and Access (RDA), with the Committee of Principles. In 2015, a new governing structure for RDA was announced. On November 6, 2015, the JSC was renamed the RDA Steering Committee, and the Committee of Principles was renamed the RDA Board. The RDA Board and the RDA Steering Committee currently guide the development of RDA as a global cataloging standard. The new governing structure is expected to be firmly established by 2019.3

The International Federation of Library Associations and Institutions (IFLA) Study Group on the Functional Requirements for Bibliographic Records developed an entity-relationship model as a tool to obtain a generalized view of the bibliographic universe in its Functional Requirements for Bibliographic Records: Final Report. It was intended to be independent of any cataloging code.4 The final FRBR report was first published in 1998 and later updated in 2009. Almost a decade was devoted to formulating the conceptual model of FRBR and the related Functional Requirements for Authority Data (FRAD), first published in 2009.5 Later, Functional Requirements for Subject Authority Data (FRSAD) was approved in 2010 and issued in its final form in 2011.6 FRBR and FRAD were the first conceptual models used as the underlying framework in RDA. FRSAD was added later as RDA subject standards were developed. The original RDA introduction of April 2012 stated: "a key element in the design of RDA is its alignment with the conceptual models for bibliographic and authority data developed by the International Federation of Library Associations and Institutions (IFLA)".7

IFLA is developing the latest conceptual model of bibliographic data needed by users. Currently, the IFLA Library Reference Model (LRM) is only available in draft.8 The RDA Steering Committee agreed at its November 2016 meeting to adopt the draft IFLA Library Reference Model as a conceptual model for the future development of Resource Description and Access. This new model will replace the Functional Requirements family of models (FRBR, FRAD, and FRSAD), which are superseded by the LRM.9 The vocabularies from the new model have been published on the RDA Registry and will be a major part of the complete overhaul of RDA in its RDA Toolkit Restructure and Redesign (3R) Project.10 Revised terminology from the LRM was included in the RDA Toolkit release of February 14, 2017.11

In the history of cataloging practice, RDA is the first cataloging standard founded on an explicit conceptual model: FRBR.
The instruction and user tasks in RDA were closely tied to the FRBR and FRAD models. Our research will investigate RDA entities and relationships as defined by these conceptual models and standardized in RDA. As Barbara Tillett stated in What is FRBR? A Conceptual Model for the Bibliographic Universe, "FRBR offers us a fresh perspective on the structure and relationships of bibliographic and authority records, and also a more precise vocabulary to help future cataloging rule makers and system designers in meeting user needs".12 The implementation of the new cataloging standard, RDA, in 2013, became a unique opportunity to investigate bibliographic relationships in RDA records.

The FRBR model identifies three groups comprising eleven entities in the bibliographic universe. Group 1 refers to the products of intellectual or artistic endeavor -- work, expression, manifestation, and item. Group 2 stands for those entities responsible for the intellectual or artistic content, the production and dissemination, or the custodianship of such products -- person, family, and corporate body. Group 3 refers to an additional set of entities that, together with the entities in the first and second groups, may serve as the subject of a work, such as concept, object, event, and place. The entity-relationship model comprises descriptions of entities, relationships, and attributes that reflect the nature and characteristics of the bibliographic universe. One of the FRBR family of models, Functional Requirements for Authority Data (FRAD), further assists the analysis of attributes of these eleven entities and the relationships among them, which are the central focus of authority data.13 One of the strengths of RDA, as defined in its objectives and principles (RDA 0.4), is the functionality of records produced under the standard.14 These records are more responsive to user needs than records created under past cataloging standards. The principle of 'relationships' states that "the data describing a resource should indicate significant relationships between the resource described and other resources."15 The explicit underlying conceptual models of FRBR and FRAD in RDA assist catalogers to better understand and create metadata for bibliographic relationships, which ultimately assists users with improved information discovery.

Bibliographic relationships for the Group 1 entities are included in several RDA sections and related LC-PCC Policy Statements (LC-PCC PS). RDA Chapter 17 provides instructions and general guidelines on recording primary relationships between work, expression, manifestation, and item.
Primary relationships, the hierarchical relationships between work, expression, manifestation, and item in the FRBR Group 1 entities, are defined as:

a) the relationship between a work and an expression through which that work is realized, and the reciprocal relationship from the expression to the work;
b) the relationship between an expression of a work and a manifestation that embodies that expression, and the reciprocal relationship from the manifestation to the expression; and
c) the relationship between a manifestation and an item that exemplifies that manifestation, and the reciprocal relationship from the item to the manifestation.16

In its first iteration, RDA provides instructions for three techniques to use in recording bibliographic relationships: recording the identifier for the work, expression, manifestation, or item17; using authorized access point(s) for the work or expression18; and constructing structured or unstructured notes to describe the relationship(s). However, LC/PCC policies for RDA Chapter 17 are not supplied under the current implementation scenario.

In RDA, Chapters 25-28 cover bibliographic relationships between different instances of the FRBR Group 1 entities: related works, related expressions, related manifestations, and related items. RDA Appendix J includes a list of relationship designators to be used to specify the nature of relationships between works, expressions, manifestations, and items in the bibliographic description.

Under AACR2 standards, catalogers recorded some bibliographic relationships, but the instructions were not as explicit as under RDA. The practice of explicitly recording and identifying the types of bibliographic relationships in cataloging records is relatively new. Recent research has demonstrated that it is important for user discovery. An empirical study by Hider and Liu in 2013 confirmed that users frequently used bibliographic information regarding related works and manifestations.19

Our research investigated the RDA cataloging practices of PCC and non-PCC OCLC member institutions reflected in OCLC book cataloging records. We analyzed the nature and extent of RDA cataloging practices represented in new RDA book records first contributed to OCLC in April 2013, which was the first full month of RDA cataloging implementation by the Library of Congress, other national libraries, and other participating libraries. All records were full encoding level input by either OCLC participants (I) or PCC members (blank). No batch process or any other encoding level was included. The primary purpose of our research was to identify the relationships recorded between works, expressions, and manifestations (RDA 2012 version, Chapters 24-27) in these records and to identify the characteristics and extent of specific bibliographic relationships. RDA resource-to-resource relationships in cataloging records may be expressed with an identifier, a structured or unstructured description (e.g., note), and/or authorized access points. Because RDA and Library of Congress/PCC core requirements varied, the researchers kept the PCC and non-PCC sample records separate. These record groups were also separately analyzed because the cataloging standards vary -- PCC member libraries contribute high quality records conforming to approved standards, while non-PCC libraries contribute full cataloging. These were the instructions the catalogers were following when they created the records.
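As a concrete illustration of these recording conventions (the names, titles, and numbers below are hypothetical examples, not records drawn from the study sample), a relationship may be expressed through a structured description carrying an identifier, through an authorized access point with an Appendix J relationship designator, or through an unstructured note:

    776 08 $i Also issued as: $t Aviation handbook (Online) $w (OCoLC)000000001
           (structured description of a related manifestation; $w carries an identifier)
    700 1_ $i Adaptation of (work): $a Doe, Jane, $d 1950- $t Aviation handbook
           (authorized access point for a related work; $i carries the relationship designator)
    500 __ $a Originally published: London : Example Press, 2005.
           (unstructured note describing a related manifestation)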
Specific MARC fields used were also identified, as well as other specific characteristics of the bibliographic relationships in the records.

The goal of our study was to answer the following questions from an analysis of the sample:

• What percentage of the sample has bibliographic relationships between works, expressions, and manifestations?
• What are the types of these relationships?
• What are the categories of bibliographic relationships, based on RDA work-to-work, expression-to-expression, and manifestation-to-manifestation?
• What are the preferred methods of explicitly expressing the bibliographic relationships?
• To what extent do bibliographic relationships appear by discipline, as defined by the Dewey Decimal Classification (DDC) assigned to the bibliographic records?
• What is the extent of bibliographic relationships in the records by language, country, and date of publication?

Literature review

The purpose of bibliographic control can be traced to an early statement by Charles A. Cutter in 1904 on the "objects of the catalog" and the "means" for attaining the objects in his Rules for a Dictionary Catalog.20 Cutter's objectives of a library catalog were further developed in Seymour Lubetzky's cataloging code of 1960.21 These objectives were: "first, to facilitate the location of a particular publication, i.e., of a particular edition of a work, which is in the library. Second, to relate and display together the editions which a library has of a given work and the works which it has of a given author."22 In 1991, Barbara Tillett restated Lubetzky's second objective as follows: a library catalog should group related materials together (a collocation function) and display the associated bibliographic records.23

Before the FRBR model, bibliographic relationships were studied empirically in several doctoral dissertations. Seminal works by Tillett (1987),24 Smiraglia (1992),25 and Vellucci (1995)26 contributed significantly to our understanding of the nature and characteristics of bibliographic relationships. Based on her 1987 dissertation, Tillett reported on bibliographic relationships.27 In 1991, Tillett further defined bibliographic relationships as "an association between two or more bibliographic items or works" and noted that bibliographic relationships have been incorporated in library catalogs for more than a century.28 She developed a taxonomy of bibliographic relationships based on the analysis of 24 cataloging codes and defined seven categories of bibliographic relationships.29 The latter are paraphrased as:

1. Equivalence relationships refer to relationships between exact copies of the same manifestation of a work, or between an original item and its reproductions, as long as the intellectual and artistic content and authorship are preserved. Examples include copies, issues, facsimiles and reprints, photocopies, microforms, and other reproductions.
2. Derivative relationships refer to relationships between a bibliographic item and a modification based on that same item, including variations or versions of another work. Examples include editions, revisions, translations, summaries, abstracts, digests, adaptations, changes in genre, paraphrases, etc.
3. Descriptive relationships refer to relationships between a bibliographic item or work and a description, criticism, evaluation, or review of that work. Examples include annotated editions, casebooks, commentaries, critiques, etc.
4. Whole-part (or part-whole) relationships refer to relationships between a component part of a bibliographic item or work and its whole. Examples include excerpts from other titles, collections and their constituent parts, or monographic series.
5. Accompanying relationships refer to relationships between a bibliographic item and the bibliographic item it accompanies. Examples include supplements, concordances, indexes, etc.
6. Sequential relationships refer to relationships between bibliographic items that continue or precede one another. Examples include successive titles of a serial, sequels of a monograph, or a numbered series, etc.
7. Shared characteristic relationships refer to relationships between a bibliographic item and other bibliographic items that are not otherwise related but which share common characteristics. Examples include access point, language, date of publication, or country of publication.

Tillett asserted that two "functions" of the library catalog -- finding and collocation -- are only achieved with sufficient bibliographic description and the display of related items and groups of bibliographic records.30 The conceptual structure defines the framework that encompasses the items to be described in the catalog, the elements in the items, and relationships among items. Therefore, the conceptual structure is a rationale to guide the creation and use/searching of the catalog.31 She examined and reported on linking techniques in catalog records for "catalog entries," "uniform titles," "series statements," and "additions to the physical descriptions," among others.32 Tillett's equivalence relationships were later described in RDA as relationships between manifestation-to-manifestation. Her derivative relationships were described in RDA as expression-to-expression or work-to-work relationships. Categories such as whole-part, accompanying, and sequential relationships were described in RDA as work-to-work or expression-to-expression relationships. She defined shared characteristics of bibliographic relationships between items in relation to language, date of publication, country of publication, and others.

In 1992, Smiraglia observed that derivative bibliographic relationships occurred often and, furthermore, developed a taxonomy of derivative relationships.33 Based on the "work" concept, he refined the definition of derivative relationships, stating that "derivative bibliographic relationships exist between any new conception of a work and its original source (the progenitor), or, its successor, or both."34

In a later work by Smiraglia and Leazer, seven different categories of derivation were explained as follows:

"1. Simultaneous derivations: works that are published in two editions simultaneously, or nearly simultaneously …
2. Successive derivations. Works that are revised one or more times, and issued with statements such as 'second … edition,' [and] works that are issued successively with new authors. ...
3. Translations ….
4. Amplifications, including illustrated texts, musical setting, and criticisms, concordances and commentaries. ...
5. Extractions, including abridgements, condensations and excerpts
6. Adaptations, including simplification, screenplays, librettos, arrangements of musical works, and other modifications
7. Performances including sound, visual … recordings.
8. Predecessors, works from which a progenitor is clearly derived (e.g., a short story from which a novel is derived)"35

Smiraglia and Leazer (1999) acknowledged that the concept of "work" has been difficult and inconsistent as defined by scholars and in cataloging rules; accordingly, the authors suggested an operational definition.36 A bibliographic entity has two properties -- physical and intellectual. The physical properties comprise bibliographic characteristics such as physical description and bibliographic data regarding title, names, and publication information. "Work" is defined as the intellectual content of a bibliographic entity. They strongly agreed that with these definitions, the two concepts are completely separable. Any variation in the linguistic content of a work is considered a new work. A bibliographic family was defined as a set of related bibliographic works, with the bibliographic family linked through standardized access points in the catalog.

Smiraglia and Leazer studied derivative bibliographic relationships using OCLC WorldCat records.37 They identified bibliographic families, as well as the size of each family and its bibliographic characteristics. Almost a third of progenitor works in the sample had bibliographic families greater than one. The mean family size for all families was 1.77, and the mean family size for families exhibiting derivation was 3.54. Derivative bibliographic relationships were found more frequently in the academic collection (about 50%) than in the general collections in WorldCat. Some bibliographic characteristics of derivative relationships were related to the age of the progenitor work, the collection type (e.g., humanities, fiction), and the popularity of works. However, disciplines, forms, mediums, and genre had no influence on derivation of works.

Using a sample of 1,000 bibliographic records in COBISS (Slovenia Cooperative Bibliographic System and Services), Marija Petek studied derivative bibliographic relationships.38 The proportion of derivative works was about 25%, the mean size of all bibliographic families was 1.57, and the mean size of families having more than one member was 3.2. Among the derivation types in the records under study, successive derivations occurred most often (65%). There was a significantly lower percentage of simultaneous derivations (10.7%) and translations (9.1%). The proportion of unexpressed derivative relationships was about 59% of all relationships. These bibliographic records did not contain data on relationship information, such as code for original language, translators, authors of adaptations, edition area, note area, etc.

Applying the FRBR model's work concept, Bennett, Lavoie, and O'Neill shared some interesting statistics from a sample of cataloging records and work clusters in WorldCat.39 Based on their analysis, and assuming that each bibliographic record in WorldCat described a manifestation, they estimated that 47 million manifestations could be traced back to about 32 million distinct works. They estimated that about 78% of works in WorldCat contain a single manifestation, about 20% of all works include two or more manifestations, and only 1% contain more than 20 manifestations. Bennett, Lavoie, and O'Neill further classed the works as elemental, simple, or complex works.
Elemental works containing a single expression and single manifestation accounted for the largest proportion (78%), while complex works containing multiple expressions accounted for the smallest proportion (6%).40 They identified six categories of complex works, including these five: augmented works, revised works, collected/selected works, multiple translations, and multiple forms of expression. The authors concluded that the largest benefit of applying the FRBR model to these records was in improved descriptions of complex works. As a result, these descriptions helped users to identify many work expressions and supported improved user navigation and discovery.

In 2012, Arsenault and Noruzi analyzed work-to-work bibliographic relationships in Canadian publications according to the FRBR model.41 Among the 28,633 monographic records, a total of 1,261 records (about 4.4%) demonstrated work-to-work relationships. The most frequently recorded work-to-work relationship was supplement (59%), followed by successor (24%), transformation (10%), and adaptation (6%). The majority of successor works appeared in literature, while the majority of supplements were in the disciplines of social science, language, science, and technology. Bibliographic relationships most frequently occurred in the literature class (about 26%), while the fewest relationships were found in philosophy, psychology, and religion. Arsenault and Noruzi also noted that catalogers and publishers sometimes missed recording the bibliographic relationships.42

The FRBR Final Report described relationships: "Relationships serve as the vehicle for depicting the link between one entity and another, and thus as the means of assisting the user to "navigate" the universe that is represented in a bibliography, catalogue, or bibliographic database".43 The report emphasized that a relationship was not operative unless the entities on each side of the relationship were explicitly identified and recorded.

Prior to the implementation of RDA, a series of RDA-related studies appeared in Cataloging & Classification Quarterly. In 2012, Picco and Repiso presented FRBR as the most significant change in the cataloging world since ISBD. They reported on the contributions of the FRBR and FRAD models in RDA instructions relating to the representation of bibliographic relationships.44 They noted that the FRBR and FRAD models are closely incorporated into RDA instructions -- four sections were on entities and six sections on relationships, while four among twelve appendices were designated for relationships. Their analysis focused on the RDA primary relationships among the Group 1 entities (work, expression, manifestation, and item) and relationships between works, expressions, manifestations, and items. They presented types of relationships and the conventions used to represent them in RDA.

Also in 2012, Riva and Oliver described the history of RDA development in relation to FRBR and FRAD in their "Evaluation of RDA as an Implementation of FRBR and FRAD."45 They conducted an in-depth study of RDA by examining RDA instructions in relation to the FRBR and FRAD models. Their research analyzed the alignment of RDA entities, attributes, and bibliographic relationships in relation to the FRBR and FRAD models. They examined commonalities and differences of the vocabularies and instructions in FRBR, FRAD, and RDA. For example, four RDA user tasks (to find, identify, select, and obtain) are identical to the terms used in FRBR.
Four RDA user tasks related to authority data (to find, identify, clarify, and understand) showed some divergence from the FRAD authority-related user tasks (find, identify, contextualize, and justify). The RDA standard was subsequently updated several times. The user tasks based on FRAD were redefined in RDA 0.0 Purpose and Scope (February 2017 release): "The data created using RDA to describe an entity associated with a resource (an agent, concept, etc.) are designed to assist users performing the following tasks:" find, identify, clarify, and understand.46

In 2012, Tami Morse examined bibliographic relationships in sheet maps by applying Tillett's taxonomy, Smiraglia's categories of derivation, FRBR, and RDA.47 Correlations of relationships among these taxonomies were extensively analyzed and demonstrated. Morse concluded that RDA explicitly uses Tillett's taxonomy and FRBR's instance relationships to categorize bibliographic relationships. Her research confirmed that Tillett's taxonomies were frequently applied in map records. However, some of Smiraglia's categories of derivation were not apparent in bibliographic records of sheet maps. Morse noted that the two models by Tillett and Smiraglia were developed primarily based on textual resources.

Subsequently, in 2013, Hider and Liu reported an empirical study on the RDA core elements and how they were used in supporting user information tasks in the setting of a medium-sized academic library.48 Using a think-aloud research method, they asked users to verbalize their thoughts relating to their use of the various elements in catalog records. The researchers discovered that only 37 RDA elements were used. The most commonly used element was title proper, followed by the term for the concept. Most RDA core elements were never used by information seekers. In contrast, non-core elements such as other title information, mode of issuance, related work, related manifestation, and the summarization of content were more frequently used.

A case study in 2013 by Noruzi and Arsenault examined work-to-work bibliographic relationships in the FRBR context by analyzing supplementary relationships recorded in 2009 Canadian national catalog records.49 Publications such as teacher's guides, student manuals, answer keys, and workbooks were evaluated. The findings showed that about 47% of all Canadian works with supplementary bibliographic relationships were educational material. The distribution of educational work-to-work relationships most often appeared in science (27%), technology (22%), and social sciences (20%).

In a 2016 study involving careful examination of RDA instructions, Wallheim reported a deficiency in operational definitions in RDA regarding recording bibliographic relationships.50 He studied relationships between different works, expressions, manifestations, and items by analyzing a sample of relationship designators from Appendix J. The author found a lack of instructions or consistent cataloger "guidance on when and on what ground to assign relationship designators" and reported this as a serious and fundamental problem.51

Our research focused only on relationships within FRBR Group 1: work-to-work, expression-to-expression, and manifestation-to-manifestation. The RDA 2012 version defined these relationships as associated or related works in RDA 24.1.3.
Related Work, Expression, Manifestation, and Item:

• the term "related work" refers to a work that is related to the work represented by an identifier, an authorized access point, or a description (e.g., an adaptation, commentary, supplement, sequel, part of a larger work);
• the term "related expression" refers to an expression related to the expression represented by an identifier, an authorized access point, or a description (e.g., a revised version, a translation); and
• the term "related manifestation" refers to a manifestation related to the resource being described (e.g., a manifestation in a different format).52

The RDA 2012 version identified no core, or required, elements to be included in the bibliographic description according to RDA 24.3, stating: "The recording of relationships between works, expressions, manifestations, and items is not required except for the primary relationships as specified under 17.3."53 For the purposes of this research, primary relationships as defined in RDA 17.4.1 were not included. We focused on the relationships between works, expressions, and manifestations covered by the 2012 RDA instructions and related LC-PCC policy statements in these chapters:

• Chapter 25. Related works
• Chapter 26. Related expressions
• Chapter 27. Related manifestations
• RDA Appendix J (J2-J4)

Data collection and sampling

The researchers requested the new RDA MARC records entered into the OCLC Connexion database during April 2013. The data request was submitted using the OCLC Research Office standard research data application form. The data parameters were limited to MARC 21 bibliographic records that were full-level RDA cataloging for monographs, separated by PCC or non-PCC member library input. The record sample included the first record version contributed to OCLC in April 2013, without further modification. The specific limiters selected were:

• BLvl = m, the bibliographic level for monographs;
• encoding level for full-level cataloging, either "blank" for PCC member cataloging agencies or "I" for non-PCC cataloging agencies; and
• the standard MARC input for RDA records: both MARC 040 subfields, $b eng and $e RDA.

The total number of monographic records in the sample was 17,371 records entered in April 2013. Upon analysis of the MARC-XML RDA records received, we realized that the files contained records for some map, music, and video formats because our original request for the data set did not specify Type "a" in the 008 MARC field. As a result, we further filtered the file to limit it to book records only (13,941), resulting in a final data set of 5,569 PCC records and 8,372 non-PCC records.

We used a systematic sampling method to limit the number of cataloging records to be analyzed; random selection of the starting point helps researchers avoid any human bias.54 The first record, number "2", was randomly selected to begin sampling both the PCC records file and the non-PCC records file. Each subsequent ninth record was selected to complete the samples. The final sample consisted of 619 PCC records and 931 non-PCC records.

Specific MARC fields from each sample record were selected and recorded in Excel spreadsheets, keeping the PCC and non-PCC record samples separate. The researchers selected specific bibliographic data collected from the 008 field and the variable fields of each cataloging record. Language, country of publication, and publication date were collected from the 008 field.
Variable field data, such as classification, descriptive elements, and primary relationships, was also collected to help the researchers understand bibliographic relationships in context and to categorize records. Finally, variable field data for non-primary (horizontal) bibliographic relationships relating to other works, expressions, and manifestations was collected from selected MARC fields. The researchers selected data for analysis from these MARC fields in each sample record, choosing fields predicted to contain relationship information as well as fields with potential data that would help clarify relationships. The researchers predicted that DDC would not be included in every record; therefore, other classification schemes were collected for possible conversion to DDC.

MARC fields recorded from each sample record:

From the 008 field (to identify shared characteristics of records with relationships):
language (Lang)
place of publication (Ctry)
date of publication (Dates)

From variable fields:
001 (OCLC control number; served as an accession number across spreadsheets)
041 (language code)
007 (physical characteristics)
082 (DDC call number by PCC, only subfield $a)
092 (DDC by member library, only subfield $a)
050 (LC call number by PCC, only subfield $a)
090 (LCC by member library, only subfield $a)
055 (classification numbers assigned in Canada, only subfield $a)
060 (National Library of Medicine call number, only subfield $a)
070 (National Library of Agriculture call number, only subfield $a)
1XX fields: 100, 110, 111, 130 (names or preferred title access points; helped identify relationships)
2XX fields: 240, 245, 246, 250, 264 (elements for the manifestation; served to clarify relationships)
338 field (carrier type; served to exclude non-book formats)
4XX fields (series statements; served to identify relationships)
5XX fields (general and specialized notes; served to identify relationships)
6XX fields: 600, 610, 611, 630 (subject access entries and terms; served to identify disciplines and clarify context)
7XX fields (access points for names and titles with relationships)
8XX fields (series access points; served to identify relationships)

During data collection, the researchers assigned general DDC classification for 390 bibliographic records lacking DDC numbers, using these methods. Library of Congress Classification (LCC) numbers were converted to DDC classification numbers utilizing the "Dewey Class Number = LC Class Number" bibliographic correlation search in LC's Classification Web. For records lacking DDC or LCC numbers, a general DDC classification was assigned based on the record's first Library of Congress Subject Heading, using the "LC Subject Heading = Dewey Class Number" correlation search. For records lacking any classification or LCSH, researchers assigned DDC by using the DDC 22nd edition Summaries,55 along with a careful examination of other bibliographic data in the record (e.g., subject access points). Occasionally, the researchers performed a broader online search to identify the subject matter.

When data collection was completed, the researchers devised a system of codes identifying horizontal bibliographic relationships based on the 2012 RDA Appendix J.
The research was focused on work-to-work, expression-to-expression, and manifestation-to-manifestation relationships.56

Coding of bibliographic relationships and preliminary testing

To capture relationship information, the researchers individually examined each sample record and recorded and categorized the relationship information. Particular attention was focused on data elements identifying bibliographic relationships -- including identifiers, structured or unstructured notes, and access points. Relationship information contained in the 2012 RDA Chapters 24-27 and RDA Appendix J was consulted. Evaluating correct or incorrect application of the MARC and RDA standards in each record was outside the scope of this study.

Standardized lists for coding work, expression, and manifestation relationships were developed based on the RDA 2012 version Appendix J relationship information. Relationships were limited to the RDA 2012 version relationship instructions in force as of April 2013 (April 2012 Update), because these were the instructions the catalogers followed to create the records included in the sample.

A pre-test was conducted to assess the researchers' data collection and bibliographic relationship coding methods. The researchers tested the method using the first 50 sample records from both the PCC and non-PCC record groups and manually coded relationships between works, expressions, and manifestations for each record. For each printed record, a researcher assigned and recorded codes for the RDA relationships expressed in the record. Anomalies and confusing relationships were discussed in depth. If the nature of the specific relationship was unclear, researchers coded the record for the broader relationship. For example, based on (work) is the relationship designator defined as "A work used as the source for a derivative work".57 More specific relationship designators listed under it in the Appendix are abridgement of (work) and adaptation of (work). If none of the specific relationships were appropriate for coding a bibliographic relationship, the researchers coded for the general relationship designator. For example, if abridgement or abstract were not appropriate concepts, the researchers assigned the general code for based on (work). This pre-testing allowed the researchers to think more thoroughly about the coding method, the data elements and attributes collected, and the nature of bibliographic relationships.

After the pre-test, the researchers developed guidelines for coding relationships in the bibliographic records based on the 2012 RDA Appendix J, which defined the code names and their definitions, coding examples, exceptions, and special-case decisions. For example, a government publication Shipping List number in a record's general note field was not recorded as a relationship. However, a government publication general note such as NUREG-0581, Rev. 2 was recorded as an unstructured revision note, as it identified a relationship between expressions. The MARC field 500 note, "Originally presented as the author's thesis (doctoral)," was considered an unstructured note for an expression relationship, while a 500 note, "First published in the UK as The fall of the Reich in 2000," was identified as an unstructured note for a manifestation relationship. Occasionally, further searching on related resources in OCLC Connexion and/or online was necessary to evaluate whether the bibliographic information truly described a bibliographic relationship.
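As described in the following paragraphs, the assigned codes were recorded in spreadsheets and later processed with custom-written PHP scripts. The sketch below suggests the kind of tally such a script might perform; the file name, column layout, and code strings are hypothetical assumptions, not excerpts from the study's actual scripts:

    <?php
    // Hypothetical sketch: tally relationship types (WW, EE, MM) from a CSV of
    // coded sample records. Assumed columns: OCLC number, MARC field,
    // relationship code (e.g., "WW 830 contained in series").
    $counts = array('WW' => 0, 'EE' => 0, 'MM' => 0);
    $total = 0;
    $handle = fopen('pcc_relationship_codes.csv', 'r');
    while (($row = fgetcsv($handle)) !== false) {
        if (!isset($row[2])) {
            continue;                    // skip blank or malformed rows
        }
        $type = substr($row[2], 0, 2);   // the code begins with WW, EE, or MM
        if (isset($counts[$type])) {
            $counts[$type]++;
            $total++;
        }
    }
    fclose($handle);
    foreach ($counts as $type => $n) {
        // Report each type as a share of all coded relationships
        printf("%s: %d (%.1f%%)\n", $type, $n, $total > 0 ? 100 * $n / $total : 0);
    }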
The researchers found some relationships difficult to code, for example a 500 field unstructured note: "Selection of the author's articles appearing weekly since 2000 in the column … in the newspaper …." We coded this as a WW (work-to-work) relationship. Another example was a 500 field unstructured note that we coded as an EE (expression-to-expression) relationship: "The first part of the author's habilitation ([University, date]) was published under the title … ([University, date])."

The researchers examined each record in the total sample of 1,550. Using the relationship coding guidelines, the researchers assigned specific codes to the printed MARC records to identify the relationship information. Researchers consulted on the validity of coding whenever questions arose. Coding required a significant effort. For each record, the relationship code and type were recorded next to the related MARC field. For example, 'WW 830 contained in series' was the code used to indicate a work-to-work relationship expressed in a MARC 830 field (preferred series title). All MARC fields with relationships were collected, and the raw data contained all relationships recorded per record. After coding was completed, a research assistant recorded the relationship codes in an Excel spreadsheet, along with the record ID number (OCLC number) and associated MARC fields.

The PHP programming language, although most commonly associated with dynamic webpages, can also be used to collect, analyze, and parse data. With the help of our research assistant, a series of custom-written PHP scripts was used to categorize, clean, and otherwise analyze the data, which were stored in .csv files. The researchers specified the data collected for each spreadsheet. The sample record number remained constant across all spreadsheets. No other tools or software were used.

Data analysis

General characteristics of data

The sample of 1,550 records consisted of 619 PCC records (39.9%) and 931 non-PCC records (60.1%). About 43.4% (673 titles) were published in the United States, followed by China with 5.9% (91 titles), Germany with 4.4% (69 titles), and Korea with 3.8% (58 titles). The CJK materials were about 12% of the total sample. Titles in the sample were published in 111 countries. Resources in the English language (53.1%) were the most common, followed by Spanish (7.48%), Chinese (5.9%), and German (4.9%). At least 52 languages were identified in the sampled records. Publication dates ranged from before 1800 to 2014, although more than 78.3% of the sampled records were published after 2001 (with an error variance of .008 due to non-numeric dates).

Based on DDC classification number analysis, the sample records covered all academic disciplines. Social science material was the highest (28%), followed by literature (19.5%), geography & travel (12.9%), applied sciences (11.9%), arts (7.1%), natural sciences (6.2%), and religion (6.1%). The lowest percentages (8% in total) included the categories of languages, philosophy, and general classes.

Bibliographic relationships

Researchers focused on relationships between works (WW), expressions (EE), and manifestations (MM), also called "horizontal" relationships.58 Horizontal bibliographic relationships were represented in about 54% of all sample records: about 58% in the PCC record sample and 51.34% in the non-PCC record sample.
Data analysis

General characteristics of the data

The sample of 1,550 records consisted of 619 PCC records (39.9%) and 931 non-PCC records (60.1%). About 43.4% (673 titles) were published in the United States, followed by China with 5.9% (91 titles), Germany with 4.4% (69 titles), and Korea with 3.8% (58 titles). The CJK materials made up about 12% of the total sample. Titles in the sample were published in 111 countries. Resources in the English language (53.1%) were the most common, followed by Spanish (7.48%), Chinese (5.9%), and German (4.9%). At least 52 languages were identified in the sampled records. Publication dates ranged from before 1800 to 2014, although more than 78.3% of the sampled records were published after 2001 (with an error variance of .008 due to non-numeric dates).

Based on an analysis of DDC classification numbers, the sample records covered all academic disciplines. Social sciences material ranked highest (28%), followed by literature (19.5%), geography & travel (12.9%), applied sciences (11.9%), arts (7.1%), natural sciences (6.2%), and religion (6.1%). The lowest percentages (8% in total) fell in the categories of languages, philosophy, and the general classes.

Bibliographic relationships

The researchers focused on relationships between works (WW), expressions (EE), and manifestations (MM) -- also called "horizontal" relationships.58 Horizontal bibliographic relationships were represented in about 54% of all sample records: about 58% in the PCC record sample and 51.34% in the non-PCC record sample. In records with relationships, many of the records represented more than one relationship (15.67% of PCC records and 12.4% of non-PCC records).

Types and methods of recording bibliographic relationships: PCC records

The types and methods of recording bibliographic relationships were analyzed separately for PCC and non-PCC records. A total of 605 occurrences of WW, MM, and EE relationships were found in the 619 sampled PCC records. Among those horizontal relationships, work relationships were recorded most frequently (57.4%), followed by manifestation relationships (34.4%) and expression relationships (8.3%). Some PCC records contained more than one relationship or more than one type of relationship (WW, EE, or MM). For example, a record entitled Aircraft contained a 776 field for the print version (an MM relationship) and an 830 field (a WW relationship), while another record entitled Islam and international law contained both a 505 field (WW relationships) and an 830 field (a WW relationship). About 15.67% of PCC records with relationships contained two or more types of relationships (e.g., a record with WW relationships as well as EE and/or MM relationships).

Figure 1. Bibliographic relationships between works, expressions, and manifestations.

Within work relationships in the PCC records, whole-part work relationships appeared most frequently (99.1%), followed by derivative work relationships (0.6%) and accompanying work relationships (0.3%). The whole-part work relationships were recorded in MARC fields 490, 500, 505, 700, 730, 740, 773, 800, 810, and 830. More than half of the whole-part work relationships were recorded only in MARC fields 830, 810, and 800 (58.7%). The remainder of the whole-part work relationships were recorded in MARC field 490 alone (20.4%) and in the 505 contents note (14.5%). Derivative and accompanying work relationships together accounted for less than 1% of work-to-work relationships. In general, derivative work relationships were recorded in the 500 note fields, as abridgement of (work) or based on (work), while accompanying work relationships were recorded in the 520 field as augmented by (work).

Within expression relationships in PCC records, derivative relationships were represented most frequently (94%), followed by whole-part relationships (6%). Among derivative relationships, translation and revision were the most common types of expression relationships. The MARC fields most frequently used to represent all derivative relationships were fields 041 (31.9%) and 240 (29.8%). The MARC fields used for recording expression relationships in notes varied. The 500 fields with revision notes were used most often (14.9%), followed by translation notes recorded in fields 500 or 546 combined (14.9%). MARC fields 130 and 775 were used less frequently. In general, whole-part expression relationships were recorded in MARC fields 700 and 730.

All manifestation relationships in PCC records were equivalent relationships, represented by the MARC fields 020 (identifier for a different manifestation), 500, 520, 530, 533, 775, 776, and 856. The MARC fields used most often were 776 (39.9%) and 856 (29.8%), followed by field 500 with the unstructured note also issued as (9.6%).

Types and methods of recording bibliographic relationships: Non-PCC records

In the 931 non-PCC sample records, a total of 823 WW, MM, or EE relationships were recorded.
Work relationships were the most commonly recorded (59.5%), followed by manifestation relationships (24.7%) and expression relationships (15.8%). About 12.4% of non-PCC records contained two or more types of relationships. Among work relationships (WR), whole-part relationships were most commonly recorded (95.3%), followed by accompanying relationships (2.9%), derivative relationships (1.6%), and sequential relationships (0.2%).

Whole-part work relationships were recorded in MARC fields 246, 490, 500, 501, 505, 520, 700, 710, 730, 740, 773, 800, 830, and 856. The most commonly used MARC fields were 830 (57.6%), 490 (11.7%), the 505 contents note (11.5%), 740 (5.3%), the 500 note (4%), and 800 (3.9%). Less than 5% of the work-to-work relationships were accompanying, derivative, or sequential work relationships. Accompanying work relationships were recorded in MARC fields 500 (64%), 700 (14%), 730 (14%), and 520 (7%). The most common MARC fields used for derivative work relationships were 500 (25%) and 730 (25%); fields 520, 710, 787, and 846 were used much less frequently.

Within the expression relationships (ER), 87% represented derivative expression relationships, while the rest were whole-part expression relationships (13%). The MARC fields most frequently used to record the derivative expression relationships were 041 (27.3%), 240 preferred title with language of translation (24.7%), and the 500 translation of note (22.1%). Used less frequently was the 500 revision note (13.2%), with small percentages in MARC fields 546, 730, and 130. For the whole-part expression relationships, MARC fields 700 (47%) and 740 (24%) were used most frequently; fields 730, 505, and 800 were used much less often.

Within the manifestation relationships (MR), equivalent manifestation relationships appeared most frequently, at almost 96%. The MARC fields used most often were 776 (37.8%), the 500 note (21%), 856 (12.2%), and 020 (10.2%). Used less frequently was field 530 (7.1%), followed by small percentages in the 533, 534, 580, 775, and 787 fields. Less than 4% represented whole-part and accompanying manifestation relationships; MARC fields 500 and 856 were used for those relationships.

Figure 2. Types of bibliographic relationships within works, expressions, and manifestations.

Discussion of the findings

Horizontal bibliographic relationships were represented in about 54% of the sample: about 58% of the PCC records and 51.34% of the non-PCC records. Within records with relationships, work-to-work (WW) relationships were the most common type of relationship in both the PCC (57.4%) and non-PCC (59.5%) records. Expression-to-expression (EE) relationships were least common in both the PCC (8.3%) and non-PCC (15.8%) records. Manifestation-to-manifestation (MM) relationships occurred more frequently than expression relationships in both the PCC (34.4%) and non-PCC (24.7%) records. There was a noticeable difference between the cataloging practices of the PCC and non-PCC groups in recording expression-to-expression and manifestation-to-manifestation relationships (a gap of more than 10 percentage points).

The findings also showed the specific types of relationships within the work-to-work (WW) and expression-to-expression (EE) relationships.
Within WW relationships, whole-part relationships were predominant (99% in PCC and 95% in non-PCC records), with small percentages of derivative relationships (0.6% in PCC and 1.6% in non-PCC) and accompanying relationships (0.3% in PCC and 2.9% in non-PCC). Three MARC fields were most commonly used to record WW relationships: 800, 810, and 830 (59% in PCC and 62% in non-PCC records).

EE relationships were primarily derivative relationships (94% in PCC and 87% in non-PCC records). The MARC fields most commonly used to record EE relationships were the 041 field (32% in PCC and 27% in non-PCC), the 240 field (30% in PCC and 25% in non-PCC), and the 500 field (15% in PCC and 22% in non-PCC).

Manifestation-to-manifestation relationships represented equivalent relationships in both the PCC record group (100%) and the non-PCC record group (96%). The most common MARC field used was the 776 field (40% in PCC and 38% in non-PCC). The MARC fields used less frequently were 856 (30% in PCC and 12% in non-PCC) and 500 (10% in PCC and 21% in non-PCC). This shows a wide variation in cataloging practice between the PCC and non-PCC groups.

In the final analysis of the bibliographic relationships in this sample, over 90% represented whole-part work relationships (WW), derivative expression relationships (EE), and equivalent manifestation relationships (MM).

For the PCC group, a total of twenty-two MARC fields were used to record all of the bibliographic relationships. The most commonly used MARC fields, by frequency from most to least, were 830, 776, 490, 856, 500, and 505. These six MARC fields contained more than 82% of the bibliographic relationship information in PCC records. All relationships (WW, EE, and MM) were chiefly represented in these six MARC fields. To record work relationships, MARC fields 830, 490, 505, and 810 were most commonly used (92%). To record expression relationships, MARC fields 041, 240, 500, and 546 were most frequently used (86%). To record manifestation relationships, the most commonly used MARC fields were 776, 856, and 500 (88%). Among the relationships, EE relationships were recorded mainly in unstructured notes in 5XX fields, with less common use of more structured and granular bibliographic data (e.g., MARC field 775).

Figure 3. Methods of recording bibliographic relationships (PCC).

Catalogers need to take more time and effort to apply the new standards for bibliographic relationships at a granular level, using specific relationship designators and MARC fields for structured description (7XX), in order to communicate these relationships more effectively and ultimately help users with improved information discovery. Perhaps catalogers need more training and confidence in order to utilize all of the new RDA elements that could represent relationships. The researchers suggest that expanded practical examples in both the RDA instructions and the LC-PCC Policy Statements would assist catalogers in recording bibliographic relationships more effectively. Improved complex examples with accompanying explanation would also be beneficial. Additional open-access examples from the cataloging community would assist catalogers with fewer subscription resources or tools.

For the non-PCC group, a total of twenty-seven MARC fields were used to represent WW, EE, and MM relationships. The MARC fields most commonly used, by frequency from most to least, were 830, 500, 776, 490, and 505, representing almost 69% of the total bibliographic relationships.
Three MARC fields (830, 490, and 505) represented more than 77% of the work-to-work relationships. Compared to the PCC group, this group used the 505 contents field less frequently. For expression-to-expression relationships, three MARC fields alone (500, 041, and 240) represented more than 76% of these relationships. This indicates a heavier reliance on a few fields and on the use of the MARC 500 field for an unstructured expression relationship. Finally, four MARC fields (776, 500, 856, and 020) represented more than 72% of the manifestation-to-manifestation relationships. Within the non-PCC group, there is more reliance on unstructured note fields, and structured relationships in 7XX MARC fields are significantly underutilized.

Figure 4. Methods of recording bibliographic relationships (Non-PCC).

Overall, the 830 field is the MARC field most frequently used for recording bibliographic relationships (PCC and non-PCC). This is not surprising, but it suggests that other fields, such as the 7XX fields for structured description, could be used more frequently. Whole-part work relationships are well represented and recorded (field 830), but we suggest that many other relationships are underrepresented in bibliographic records. For example, accompanying, derivative, and sequential work relationships were very rare in our sample (less than 5% of the work-to-work relationships), and we hypothesize that these relationships occur more frequently but were not recorded. In addition, because of the popularity of using 5XX fields for unstructured description of all types of relationships, we also suggest that the specific nature of the relationships could be clarified by increased use of relationship designators and structured description in 7XX fields. Derivative relationships are the most frequently recorded expression relationships in our sample; however, if the reciprocal relationships were recorded in each related record, for instance 'translation of' and 'translated as,' we would have expected a higher frequency of 7XX fields.

Other characteristics of records with bibliographic relationships

Cataloging records containing bibliographic relationships between works, expressions, and manifestations were analyzed in relation to the place, language, and date of publication, as well as the DDC classes of the publications. The researchers were interested in determining whether these shared attributes of bibliographic records reflect a difference in the bibliographic relationships recorded. For example, in relation to place of publication, resources published in the U.S. (43% of the total) represented the highest percentage of relationships (54% for PCC and 45% for non-PCC records). Titles published in other countries were, by frequency of relationships: China including Taiwan (8.6% in non-PCC, 2% in PCC); Germany (6% in PCC, 5% in non-PCC); the UK (4.4% in non-PCC, 4.1% in PCC); Italy (5.6% in PCC, 1.6% in non-PCC); Japan (3.5% in PCC, 2.6% in non-PCC); and Korea (3.3% in non-PCC, 2.5% in PCC).

The languages of the resources in the sample were also analyzed. PCC records for English- and German-language materials represented the highest percentages of MM and EE relationships compared to records for resources in other languages, while English and Italian materials represented the highest percentages of WW relationships. Among non-PCC records, MM and WW relationships appeared frequently in English- and Chinese-language materials, while EE relationships appeared frequently in English, German, and French materials.
These results are not surprising, because more than 53% of the sample records were for English-language materials. More research is needed to analyze why English materials appear to represent more relationships than non-English materials. Awareness of this pattern could help catalogers pay attention to capturing relationships for non-English resources in order to prevent possible bias in bibliographic description.

Publications published after 2010 represented the highest percentage of relationships (92% in PCC and 53% in non-PCC records). Publications published in 2000-2009 represented a much smaller share of relationships (4% in PCC and 19% in non-PCC records), and publications published before 2000 a smaller share still (.05% in PCC and 26.4% in non-PCC records). These results largely track the composition of the sample, which contained 66% resources published after 2010 and 13% published during 2000-2009. In all, 79% of the resources in the sample were published in 2000 or later, and 20.3% before 2000 (with an error variance of .008 due to non-numeric dates). There is a significant gap in cataloging practice between recording relationships for newer resources and for older resources. It appears that the PCC member libraries contributed a much greater percentage of recently published RDA records than the non-PCC libraries in our sample, and that relationships in records for older materials are not recorded as carefully. Awareness of this pattern could help catalogers avoid this possible bias and pay attention to capturing relationships for older materials.

The researchers analyzed the extent to which bibliographic relationships appeared by discipline, as defined by the DDC classes. In the PCC records, relationships were represented most frequently in the social sciences (44.1%), followed by literature (16.2%); technology (9.1%); geography & history (7.8%); natural sciences (6.3%); languages (6.1%); arts (4.6%); religion (3.5%); philosophy & psychology (1.3%); and the general class (1%). Resources in the general class, together with the philosophy and religion classes, had the lowest percentages of all relationships. The non-PCC records showed a similar distribution of relationships by discipline. The social sciences class was highest (24.4%), followed by literature (20.2%); technology (17.1%); geography & history (14.6%); natural sciences (8.4%); religion (5.6%); arts (5.5%); philosophy & psychology (2.6%); languages (1.6%); and the general class (less than 1%). The top five classes were the same in both record groups. For the PCC and non-PCC record groups together, titles in the general class, philosophy & psychology, and languages showed fewer relationships. It is interesting to note that PCC and non-PCC practice in the languages discipline varied by almost 5 percentage points.

Figure 5. Bibliographic relationships by discipline.

These research findings are similar to those of a previous study, which reported that bibliographic relationships occurred frequently in the literature class (about 26%), while the fewest relationships appeared in the philosophy, psychology, and religion classes.59 Our study shows that the social sciences had the most frequent bibliographic relationships, almost 20 percentage points higher in PCC records than in non-PCC records. It is also surprising that relationships in PCC records are recorded much more frequently in the social sciences than in literature.
Conclusions and future research

This research examined how the RDA instructions for horizontal bibliographic relationships were applied in the RDA book cataloging records contributed to OCLC during the first month of RDA implementation by national and member libraries (April 2013), a landmark implementation of the new international cataloging standard. Using a sample of cataloging records first contributed to OCLC during April 2013, the researchers separated PCC and non-PCC records into two groups and analyzed the bibliographic relationships, which are a key concept in cataloging with RDA.

Among the bibliographic relationships recorded, our research showed that within work-to-work relationships around 97% represented whole-part work relationships; within expression-to-expression relationships around 90% represented derivative expression relationships; and within manifestation-to-manifestation relationships around 98% represented equivalent manifestation relationships. It appears that at the beginning of the implementation period, catalogers were still new to, and unpracticed in, analyzing and recording relationships. These three types of bibliographic relationships may be the most familiar to catalogers, and it appeared that catalogers recorded them most frequently as a part of standard cataloging practice. However, the records did not display the variety or specificity of the other relationships identified in RDA. For example, catalogers often used unstructured notes alone to represent relationships, without adding more specific and structured relationship information to the record. The limited range of MARC fields used for relationships illustrated a lack of variety in representing the different relationships and their specific nature. In addition, when authorized access points and structured notes were used (7XX fields), our sample data showed that catalogers often did not assign relationship designators. The researchers speculate that the scope and variety of the RDA relationship designators in Appendix J may not have been easy for catalogers to apply in cataloging practice. Since the implementation of RDA, catalogers may have gained more experience and expertise in applying the RDA instructions and may have improved their practice of recording bibliographic relationships. Even so, our findings could be incorporated into improved RDA training and teaching, especially concerning underrepresented relationships and the specific nature of relationships.

Ultimately, if our long-term goal is to display these work-to-work, expression-to-expression, and manifestation-to-manifestation relationships in a fully actionable linked data environment, we need to encourage catalogers to create more granular and structured bibliographic relationship data. Further study is needed to determine what progress has been made in recording relationships, the specific nature of relationships, and reciprocal relationships in related records. In particular, more detailed analysis is needed both on the use of subfield "i" in the 7XX structured note fields and on the ways relationships are recorded (by structured or unstructured note, identifier, or access point).

This research was limited to book cataloging records first contributed to OCLC during the first month after RDA implementation.
The researchers examined only the horizontal relationships between works, expressions, and manifestations; other significant relationships were excluded, e.g., the primary relationships between work, expression, and manifestation, and relationships between works and subjects. The research also confirmed that there are other horizontal relationships that need to be acknowledged in our cataloging standard; relationships such as whole work-to-part expression or derivative expression-to-work need to be studied further.

More research on the nature of relationships in RDA cataloging records is important, especially since RDA is undergoing major revisions based on the IFLA Library Reference Model. It would be interesting to use this research as benchmark data and, in the future, periodically compare the progress made in recording bibliographic relationships since the first implementation of RDA. It would also be interesting to identify the bibliographic relationships in non-book formats and to establish how they differ from or resemble book cataloging practice. It would be productive for future researchers to examine and identify those relationships that are not recorded or are underrepresented. Our data show that three types of relationships (whole-part, derivative, and equivalent relationships) were recorded at a much higher frequency than other types. Analysis of cataloging practice for recording bibliographic relationships at regular intervals would be valuable, because our standards are continually revised. In addition, examining other types of bibliographic relationships would enrich our understanding of bibliographic relationships, help catalogers enhance metadata creation with improved bibliographic relationship data, and ultimately improve user information navigation and discovery.

Acknowledgments

The authors thank OCLC for providing the WorldCat bibliographic records that supported this research. The authors wish to express their gratitude to Andrew Tsou, our research assistant, for his expertise in computing skills for data input and analysis. The raw data used in this research will be deposited in the IUScholarWorks Repository at https://scholarworks.iu.edu/.

Funding

We are grateful for the funding support of research grants from the Indiana University Libraries and the Indiana University Librarians Association (InULA).

Notes

1. The Joint Steering Committee for Development of RDA, Resource Description and Access (Chicago: ALA Publishing, 2010-), section 0.0, February 14, 2017 version. http://access.rdatoolkit.org/rdachp0.html (accessed February 23, 2017).
2. Charles A. Cutter, Rules for a Dictionary Catalog, 4th ed. (Washington: Government Printing Office, 1904).
3. "Transition timescales for RDA Governance," RDA Toolkit, http://www.RDAtoolkit.org/sites/default/files/statement_from_cop_050815.pdf (accessed February 16, 2017).
4. IFLA Study Group on the Functional Requirements for Bibliographic Records, Functional Requirements for Bibliographic Records: Final Report, as amended and corrected through 2009 (Munich: K.G. Saur, 1998; 2009). http://www.ifla.org/files/assets/cataloguing/frbr/frbr_2008.pdf (accessed February 16, 2017).
5. IFLA Working Group on Functional Requirements and Numbering of Authority Records (FRANAR), Functional Requirements for Authority Data: A Conceptual Model, as amended and corrected through 2013 (Munich: K.G. Saur, 2009; 2013).
http://www.ifla.org/files/assets/cataloguing/frad/frad_2013.pdf (accessed February 16, 2017).
6. IFLA Working Group on the Functional Requirements for Subject Authority Records (FRSAR), Functional Requirements for Subject Authority Data: A Conceptual Model (Berlin/Munich: De Gruyter Saur, 2011). http://www.ifla.org/files/assets/classification-and-indexing/functional-requirements-for-subject-authority-data/frsad-final-report.pdf (accessed February 16, 2017).
7. The Joint Steering Committee for Development of RDA, Resource Description and Access, section 0.3.1, 2012 April Update. http://access.rdatoolkit.org/rdarev201204chp0.html (accessed February 16, 2017).
8. IFLA Consolidation Editorial Group of the FRBR Review Group, FRBR-Library Reference Model, draft for world-wide review (The Hague: IFLA, 2016). http://www.ifla.org/files/assets/cataloguing/frbr-lrm/frbr-lrm_20160225.pdf (accessed February 16, 2017).
9. RDA Steering Committee website, "Implementation of the LRM in RDA," http://www.rda-rsc.org/ImplementationLRMinRDA (accessed February 16, 2017).
10. "RDA Toolkit Restructure and Redesign Project," RDA Toolkit, http://www.rdatoolkit.org/blog/3RProject (accessed February 16, 2017).
11. "RDA Toolkit Release – February 14, 2017," RDA Toolkit, http://www.rdatoolkit.org/development/February2017release (accessed February 16, 2017).
12. Barbara Tillett, What is FRBR? A Conceptual Model for the Bibliographic Universe (Library of Congress, Cataloging Distribution Service, revised February 2004): 2.
13. IFLA Working Group on Functional Requirements and Numbering of Authority Records (FRANAR), 2009.
14. The Joint Steering Committee for Development of RDA, Resource Description and Access, section 0.4, February 14, 2017 version. http://access.rdatoolkit.org/rdachp0.html (accessed February 23, 2017).
15. Ibid., section 0.4.3.3.
16. Ibid., section 17.4.1.
17. Ibid., section 17.4.2.1, 2012 April Update, http://access.rdatoolkit.org/rdachp17.html (accessed February 23, 2017).
18. Ibid., section 17.4.2.2.
19. Philip Hider and Ying-Hsang Liu, "The Use of RDA Elements in Support of FRBR User Tasks," Cataloging & Classification Quarterly 51, no. 8 (2013): 857-872.
20. Cutter, 1904.
21. Seymour Lubetzky, Code of Cataloging Rules, Author and Title Entry (1960), as cited in Michael Carpenter and Elaine Svenonius, eds., Foundations of Cataloging: A Sourcebook (Littleton, CO: Libraries Unlimited, 1985), 191.
22. Ibid.
23. Barbara B. Tillett, "A Taxonomy of Bibliographic Relationships," Library Resources & Technical Services 35, no. 2 (1991): 150-158.
24. Barbara B. Tillett, Bibliographic Relationships: Toward a Conceptual Structure of Bibliographic Information Used in Cataloging (Ph.D. dissertation, School of Library and Information Science, University of California, Los Angeles, 1987).
25. Richard P. Smiraglia, Authority Control and the Extent of Derivative Bibliographic Relationships (Ph.D. dissertation, Graduate Library School, University of Chicago, 1992).
26. Sherry Vellucci, Bibliographic Relationships among Musical Bibliographic Entities: A Conceptual Analysis of Music Represented in a Library Catalog with a Taxonomy of the Relationships Discovered (D.L.S. dissertation, Columbia University, 1995).
27. Tillett, 1987.
28. Tillett, 1991, p. 150.
29. Ibid., p. 156.
30. Ibid.
31. Ibid., p. 152.
32. Barbara B. Tillett, "The History of Linking Devices," Library Resources & Technical Services 36, no. 1 (1992): 23-36.
33. Richard P.
Smiraglia and Gregory H. Leazer, "Derivative Bibliographic Relationships: The Work Relationship in a Global Bibliographic Database," Journal of the American Society for Information Science 50, no. 6 (1999): 495.
34. Ibid., p. 493-504.
35. Ibid.
36. Ibid., p. 495.
37. Ibid.
38. Marija Petek, "A Conceptual Model for COBIB," International Cataloguing & Bibliographic Control 37, no. 2 (2008): 35-38.
39. Rick Bennett, Brian F. Lavoie, and Edward T. O'Neill, "The Concept of a Work in WorldCat: An Application of FRBR," Library Collections, Acquisitions, and Technical Services 27, no. 1 (2003): 45-59.
40. Ibid.
41. Clément Arsenault and Alireza Noruzi, "Analysis of Work-to-Work Bibliographic Relationships through FRBR: A Canadian Perspective," Cataloging & Classification Quarterly 50, no. 5-7 (2012): 641-652.
42. Ibid.
43. IFLA Study Group on the Functional Requirements for Bibliographic Records, Functional Requirements for Bibliographic Records: Final Report, 56.
44. Paola Picco and Virginia Ortiz Repiso, "The Contribution of FRBR to the Identification of Bibliographic Relationships: The New RDA-Based Ways of Representing Relationships in Catalogs," Cataloging & Classification Quarterly 50, no. 6 (2012): 622-640.
45. Pat Riva and Chris Oliver, "Evaluation of RDA as an Implementation of FRBR and FRAD," Cataloging & Classification Quarterly 50, no. 5-7 (2012): 564-586.
46. The Joint Steering Committee for Development of RDA, Resource Description and Access, section 0.4, February 14, 2017 version. http://access.rdatoolkit.org/rdachp0.html (accessed February 23, 2017).
47. Tami Morse, "Mapping Relationships: Examining Bibliographic Relationships in Sheet Maps from Tillett to RDA," Cataloging & Classification Quarterly 50, no. 4 (2012): 225-248.
48. Hider & Liu, 2013.
49. Alireza Noruzi and Clément Arsenault, "Educational Supplementary Bibliographic Relationships from the FRBR Point of View: A Canadian Case Study," Library Collections, Acquisitions, & Technical Services 37, issue 1-2 (2013): 66-72.
50. Henrik Wallheim, "From Complex Reality to Formal Description: Bibliographic Relationships and Problems of Operationalization in RDA," Cataloging & Classification Quarterly 54, issue 7 (2016): 483-503.
51. Ibid., p. 487.
52. The Joint Steering Committee for Development of RDA, Resource Description and Access, section 24.1.3, 2012 April Update. http://access.rdatoolkit.org/rdarev201204chp24.html (accessed February 21, 2017).
53. Ibid., section 24.3. http://access.rdatoolkit.org/rdarev201204chp24.html (accessed February 21, 2017).
54. Earl Babbie, The Practice of Social Research, 9th ed. (Belmont, CA: Wadsworth/Thomson Learning, 2001), 197.
55. Dewey Decimal Classification summaries, 22nd ed., OCLC Dewey Services, https://www.oclc.org/dewey/features/summaries.en.html (accessed February 16, 2017).
56. The Joint Steering Committee for Development of RDA, Resource Description and Access, Appendix J, April 2012 Update. http://access.rdatoolkit.org/rdarev201204appj.html (accessed February 16, 2017).
57. Ibid., section J.2.2, April 2012 Update. http://access.rdatoolkit.org/rdarev201204appj.html (accessed February 16, 2017).
58. Robert L. Maxwell, Maxwell's Handbook for RDA, Resource Description & Access (Chicago: American Library Association, 2013), 629.
59. Arsenault & Noruzi, 2012.
A Study on the User-contributed Reviews for the Next Generation Library Catalogs

Cheong-Ok Yoon (윤정옥)*

Contents

1. Introduction
  1.1 Purpose and Need for the Study
  1.2 Method and Scope of the Study
2. Background of the Study
  2.1 General Background
  2.2 Prior Studies
3. User Contribution in WorldCat
  3.1 Holding Libraries
  3.2 User Reviews
  3.3 User Tags
  3.4 User Reading Lists
4. User Reviews in External Sources
  4.1 Current Status and Growth of User Reviews
  4.2 A Case of User Reviews
5. Conclusion

ABSTRACT

The purpose of this study is to examine the current status of user-contributed reviews for the Next Generation Library Catalogs, and the potential impact of user reviews available from the external sources, including Amazon.com and GoodReads.com. During the period of February 16th through April 4th, 2012, the number of holding libraries and user-contributed reviews, tags and reading lists of ten selected books were examined from the WorldCat. Also the user-contributed reviews for the same books available from Amazon.com and GoodReads.com were examined, and a case of reviews for one book was analyzed. The result shows that only a few users participated in the WorldCat, and user-contributed reviews were rarely used, when compared with tags or reading lists. Several hundred to thousand user-contributed reviews for the same books were available from Amazon.com and GoodReads.com directly linked with bibliographic records. A case of one book from Amazon.com reveals the possibility of distorting the function of user-contribution. With the introduction of the function of user-contribution, it is expected that its growth rate should be carefully observed and its potential impact on users should be thoroughly and systematically analyzed in the near future.

Keywords: Next Generation Library Catalog, WorldCat, User Contribution, User-contributed Reviews, Amazon.com, GoodReads.com

* Professor, Department of Library and Information Science, Cheongju University (jade@cju.ac.kr)
Received: April 15, 2012; First review: April 16, 2012; Accepted: May 8, 2012
Journal of the Korean Society for Library and Information Science, 46(2): 115-132, 2012. [http://dx.doi.org/10.4275/KSLIS.2012.46.2.115]

1. Introduction

1.1 Purpose and Need for the Study

One of the principal features of what has come to be called the "Next Generation Catalog" or "discovery interface" (Breeding 2007a; 2010), a type of library catalog that has been spreading over the past several years, is that it allows user participation. Participation features began to be emphasized in catalogs along with the emergence of the Web 2.0, or Library 2.0, concept, and they include descriptions, summaries, reviews, criticism, annotations, ratings and rankings, and tagging or folksonomies (Yang and Hofmann 2011). Although these features have opened to users part of the bibliographic record, formerly the cataloger's exclusive professional territory, it cannot yet be said that users are actively exploiting them.

Shim Kyung, one of the Korean researchers who took note of the next-generation catalog concept early on, observed that whether user participation, such as adding opinions to existing reviews or contributing one's own reviews, can actually be elicited is "still an open question" (Shim 2008, 27). Lee Jee-Yeon and Min Ji-Yeon (2008), in a study surveying 97 users of Korean scholarly information services about their awareness of and requirements for Web 2.0 technologies, found that users rated user-created content such as tags and reviews highly for its usefulness in managing personal materials, but took a comparatively passive stance toward the act of creating such content themselves.

This study directs its attention to user reviews among the participation features of the next-generation library catalog. A review, unlike a tag or a rating, in which a fragmentary opinion is expressed in a single word or score, is mostly written in full sentences and conveys the writer's subjective value judgments. Moreover, a review demands not only expertise in the subject field and writing ability but also a firm self-awareness as a critic of the literature and a sufficient understanding of what reviewing entails (Kim 1994).
It is therefore still uncertain how many reviews non-expert users will actually contribute to library catalogs, how far the contributed reviews will satisfy the basic requirements described above, and, if they do not, whether they can still offer the "findability" or the potential for knowledge sharing that some researchers (Gu and Kwak 2007) have claimed.

Meanwhile, the next-generation library catalog also provides, as enhanced contents, links not only to external professional review databases but also to sources of reviews written and contributed by the general public. Because users can link directly from a catalog's bibliographic record to the websites of external online bookstores such as Amazon (Amazon.com) or Barnes & Noble (BarnesandNoble.com), they can see the user reviews attached to a book's product information. They can also link straight to user-review sites such as GoodReads (GoodReads.com), so catalog users can read the reviews that ordinary readers have posted there. In other words, the next-generation library catalog offers the opportunity both for catalog users to add their own reviews to bibliographic records and, at the same time, to see the user reviews supplied by a variety of external sources.

In this context, the purpose of this study is to examine how much the user review feature is actually used in next-generation library catalogs and what influence the user reviews of the external sources reachable from bibliographic records might exert.

1.2 Method and Scope of the Study

This study was conducted between February 6 and April 6, 2012, using quantitative and case analyses of bibliographic records, and it consists of three main parts:

First, to examine the use of user-participation features, ten books selected for Amazon's 'Best Books of 2010' were checked twice, on February 16 and April 4, in OCLC's WorldCat, which can be regarded as a union catalog of major libraries worldwide, for the number of holding libraries, the distribution of user reviews, tags, and reading lists, and the changes over the period.

Second, at the same points in time and for the same books, the numbers of reviews in Amazon and GoodReads, the external sources linked from the bibliographic records, and their changes were examined. Because these sites do not provide user tags or reading lists, only user-contributed reviews could be examined.

Third, to consider the influence of the user reviews provided by external sources, the case of The Uncharted Path, a 2011 title that recently became a subject of controversy in Korea, was analyzed.

2. Background of the Study

2.1 General Background

WorldCat, the object of analysis in this study, is the online union catalog of OCLC, a bibliographic utility based on academic libraries that was launched in the U.S. state of Ohio in 1967; the catalog itself was built in 1971. Some forty years on, WorldCat has established itself as a worldwide union catalog embracing every type of library: national, academic, special, public, and school libraries. As of November 2011, WorldCat contained 250,021,271 bibliographic records, with holdings information, for the materials of some 72,000 libraries in more than 170 countries, and it is growing very rapidly, with a new bibliographic record added roughly every ten seconds (OCLC 2011).

WorldCat uses WorldCat Local, a next-generation catalog interface that OCLC has distributed since 2007, and it allows anyone in the world, not only users of WorldCat member libraries, to log in on the web and add tags, reviews, and reading lists. From a bibliographic record in WorldCat, users can also link directly to external bookstore sites, including Amazon, through the "Buy It" feature, and to external review sites such as GoodReads through the "Reviews" feature. These features are something that many online catalog users had asked for (OCLC 2009), and they mean that pointers to external sources are now shared directly within the library catalog.

2.2 Prior Studies

The usefulness of user-contributed reviews in library catalogs has not yet been discussed much. One reason is that participation features such as user reviews were introduced into library catalogs only after the concept of the next-generation catalog, or discovery interface, emerged in the mid-2000s, barely six or seven years ago. Considering the recent pace at which information technologies and services in general have spread, that is not necessarily a short period; but since it has been called "a revolutionary step and a new concept" (Yang and Hofmann 2010, 702) for users to become involved in the catalog, which had been the librarian's own professional domain, it may still be difficult to measure directly how far such participation has spread and what influence it has had.

Korean researchers appear to view the user-participation features of the catalog relatively positively. Lee Hyun-Sil and others (2009) pointed out that "as the web-based Internet information technologies represented by Web 2.0 become key technologies of library systems," the boundaries of the library's traditional areas of work are disappearing, and that, "as seen in social tagging and tag clouds, user participation is being encouraged in the catalog, which used to be the librarian's own work."

Recently, Noh Dong-Jo and Min Sook-Hee surveyed the websites of 179 four-year university libraries in Korea in 2011 and reviewed the adoption of sixteen "Library 2.0" features. The most widely adopted feature was the user review, offered by 113 university libraries (63.1%). Compared with the next most common features, RSS (52 libraries, 29.1%) and tags and tag clouds (47 libraries, 26.3%), the user review feature had been adopted more than twice as often. Their study did not address how much users actually use these features once the libraries adopt them.

Gu Jung-Eok and Kwak Seung-Jin (2007) emphasized that, in next-generation OPACs, content created directly by users, such as tags, reviews, comments, and ratings, raises the findability with which users can locate the materials they want and enables them to share knowledge with other users, but they presented no evidence that such effects are actually being achieved.

By contrast, the academic libraries of North America, where the next-generation catalog concept was introduced earlier than in Korea, turn out to provide user-participation features rather sparingly. When Yang and Hofmann (2011) examined 273 OPACs of 260 academic libraries in the United States and Canada, only 30 libraries (11%) allowed tagging, 18 libraries (7%) offered a review feature, and 11 libraries (4%) offered ratings or rankings. The researchers pointed out that an important reason for restricting user participation in this way is catalogers' concern about the quality of the bibliographic data and the appropriateness of user-contributed content.
In another study, Yang and Hofmann (2010) examined eighteen catalog discovery tools, open-source tools such as Blacklight and VuFind as well as commercial ones such as AquaBrowser and Primo, against a twelve-feature checklist to see how far they had achieved the functions of the so-called next-generation catalog. They observed that ordinary users increasingly look for what other users say about materials they find online and tend to give weight to what they take to be those users' evaluations. Their review found, however, that only eight of the discovery tools actually allowed user participation. Among these, BiblioCommons offered the most, with eight features including tags, annotations, summaries, citations, notices, and ratings; LibraryFind offered three (tags, reviews, and ratings); Primo, Scriblio, Sopac, VuFind, and WorldCat Local offered two features each, including tags; and Encore offered only one feature, tags.

Neither in Korea nor abroad has research yet examined what direct influence the participation features introduced into catalogs may have on users. Recently, Yoon Cheong-Ok (2012) pointed out the possibility that, for a book on a controversial topic in WorldCat, the reviews of the external sources linked directly from the bibliographic record could influence catalog users. That study, however, was exploratory in character: it inferred the potential influence of user reviews but did not reach the stage of demonstrating an actual effect.

3. User Contribution in WorldCat

This chapter examines the state of participation through user reviews, tags, and reading lists in WorldCat, focusing on the top ten titles of the 'Top 100 Editors' Picks' selected by Amazon's editors from among the books published in 2010. Table 1 lists the ten books in rank order, with title, author, publisher, and date of publication. Table 2 shows, for each book, the number of holding libraries in WorldCat as of February 16 and April 4, 2012, the number of user-contributed reviews linked to its bibliographic record, the number of user-added tags, and the number of reading lists containing the book, together with the changes over the period.

<Table 1> The top ten of Amazon's 2010 'Top 100 Editors' Picks'

Rank / Title
1. The Immortal Life of Henrietta Lacks / Rebecca Skloot. Crown. (February 2, 2010)
2. Faithful Place: A Novel / Tana French. Viking Adult. 1st ed. (July 13, 2010)
3. Matterhorn: A Novel of the Vietnam War / Karl Marlantes. Atlantic Monthly Press. 1st ed. (March 23, 2010)
4. Unbroken: A World War II Story of Survival, Resilience, and Redemption / Laura Hillenbrand. Random House. 1st ed. (November 16, 2010)
5. The Warmth of Other Suns: The Epic Story of America's Great Migration / Isabel Wilkerson. Random House. 1st ed. (September 7, 2010)
6. Freedom: A Novel / Jonathan Franzen. Farrar, Straus and Giroux. 1st ed. (August 31, 2010)
7. The Girl Who Kicked the Hornet's Nest (Millennium Trilogy) / Stieg Larsson. Knopf. (May 25, 2010)
8. To the End of the Land / David Grossman. Knopf. 1st ed. (September 21, 2010)
9. Just Kids [Paperback] / Patti Smith. Ecco. Reprint ed. (November 2, 2010)
10. The Big Short: Inside the Doomsday Machine [Hardcover] / Michael Lewis. W. W. Norton & Company. 1st ed. (March 15, 2010)

<Table 2> User participation in WorldCat

Rank | Holding libraries (2/16 / 4/4 / increase / rate) | Reviews (2/16 / 4/4) | Tags (2/16 / 4/4) | Lists (2/16 / 4/4)
1 | 2,949 / 3,030 / 81 / 2.7% | 0 / 1 | 5 / 5 | 77 / 81
2 | 1,337 / 1,376 / 39 / 2.8% | 0 / 0 | 1 / 1 | 11 / 12
3 | 1,675 / 1,714 / 39 / 2.3% | 0 / 0 | 1 / 1 | 16 / 16
4 | 2,406 / 2,524 / 118 / 4.7% | 1 / 1 | 9 / 9 | 40 / 43
5 | 2,128 / 2,176 / 48 / 2.2% | 1 / 1 | 4 / 4 | 23 / 24
6 | 2,489 / 2,531 / 42 / 1.7% | 0 / 0 | 8 / 8 | 37 / 37
7 | 2,770 / 2,839 / 69 / 2.4% | 0 / 0 | 2 / 2 | 40 / 42
8 | 1,091 / 1,131 / 40 / 3.5% | 0 / 0 | 2 / 2 | 8 / 9
9 | 1,362 / 1,403 / 41 / 2.9% | 0 / 0 | 0 / 0 | 15 / 16
10 | 2,099 / 2,125 / 26 / 1.2% | 0 / 0 | 6 / 6 | 31 / 31
Average | 2,031 / 2,085 / 54 / 2.6% | 0.2 / 0.3 | 3.8 / 3.8 | 29.8 / 31.1

3.1 Holding Libraries

Looking up the ten books of Table 1 in WorldCat shows that each is held by at least a thousand to more than two thousand libraries in the United States. As of February 16, 2012, the ten books were held by an average of 2,031 U.S. libraries each. When the holdings were checked again on April 4, the average had risen to 2,085 libraries, a growth of about 2.6% in holding institutions over the period.

By holdings alone, the most widely held book as of February 16 was The Immortal Life of Henrietta Lacks, held by 2,949 libraries. In the April 4 survey it was held by 3,030 libraries, an increase of 81, or a growth of 2.7% over the period. The next most widely held book was The Girl Who Kicked the Hornet's Nest, held by 2,770 libraries in February and 2,839 in April, a growth of 2.4%.

The book with the fewest holding libraries as of the February 16 survey was To the End of the Land, held by 1,091 libraries; by April the number had risen by 40 to 1,131 libraries, a growth of 3.5%.

The highest growth in holding libraries was shown by Unbroken: A World War II Story of Survival, Resilience, and Redemption, which went from 2,406 libraries in February to 2,524 in April, an increase of 118 libraries, or 4.7%.
3.2 User Reviews

As seen in the previous section, even though the number of libraries holding these books in WorldCat is considerable and keeps growing steadily, there were almost no user-contributed reviews of these books. As of February 16, of the ten books, only two, Unbroken and The Warmth of Other Suns, carried one user-contributed review each, and at the April 4 check just one more review had been added, to The Immortal Life of Henrietta Lacks. Although all of these books were published in 2010 and would have been held in most of the libraries for at least a year, reviews posted by catalog users were thus minimal.

A study conducted by OCLC (2009) reported that many online catalog users asked for user-participation features. Looking at Table 2, however, the point made by Yang and Hofmann (2011), that it is still an open question whether catalog users will contribute to catalogs the way they do to commercial sites such as Amazon or iTunes, seems well taken. They suggested that something like "the mass of tags" would become meaningful only at the scale of WorldCat, where anyone in the world can create an account to tag or to contribute reviews; it is therefore striking that only three reviews are attached to ten books shown to be held by so many libraries in WorldCat.

3.3 User Tags

The books in Table 1 do not carry many user tags either, though at least there are more tags than reviews: the ten books carry an average of 3.8 user tags each. There was no change between February 16 and April 4.

The most heavily tagged of the ten books was Unbroken, with nine tags: a world war ii story, acer, book, cheap, duc--history, greast book, love, price, and book thief. Next, Freedom: A Novel carried eight tags: 201103, duc--social issues, freedom, general collection, general fiction, oprah, own, and rainy day books. The Big Short carried six tags, The Immortal Life of Henrietta Lacks five, and The Warmth of Other Suns four. Of the remaining four books, two carried two tags each and two carried one tag each, and Just Kids carried no tags at all.

The number of tags attached to these ten books is small in itself, and because one person can add several tags to a single book, it is hard to judge how many users made use of the tagging feature. With so few reviews and tags, moreover, no meaningful relationship between them can be inferred; but since adding a tag expressed in a word or two is far easier than composing a review in full sentences, participation through tags can, for now, be said to be somewhat greater.

3.4 User Reading Lists

Table 2 suggests that participation through user reading lists is considerably more active than participation through reviews or tags. As of February 16, the ten books were each included in an average of 29.8, or roughly 30, reading lists, and by April 4 in an average of 31.1 reading lists.

As of February 16, the book included in the most reading lists was The Immortal Life of Henrietta Lacks, in 77 lists; next, Unbroken and The Girl Who Kicked the Hornet's Nest were each included in 40 lists. Freedom: A Novel was included in 37 reading lists, The Big Short in 31, and The Warmth of Other Suns in 23, and the remaining books in fewer than twenty lists each.

At the April 4 check, the number of reading lists containing each book had increased slightly, except that for two books, Matterhorn: A Novel of the Vietnam War and Freedom: A Novel, the counts remained unchanged at 16 and 37 lists respectively.

The user reading list performs part of the participation function in that users make their own reading lists public in the catalog, but in one sense it shows the most passive form of participation. A reading list is less the product of intellectual engagement with the content of a book, as a review or even a tag may be, than a form that simply links bibliographic records according to personal interest or reading history.

When the contributor is an institution such as a library, or a librarian, however, the matter can be seen differently. A reading list posted by a library or a librarian is not a record of an individual's reading but a "recommended" list tied to a course or to new acquisitions, one in which some degree of evaluation of the books has been involved.

For example, The Immortal Life of Henrietta Lacks, which as of February 16 appeared in the most reading lists (77), was in 80 lists a month or so later, on March 26, and in 81 lists on April 4. A considerable number of the reading lists linked with this book appear to have been added by institutional contributors. A list titled "ALA Notable Non-Fiction for Adults," updated on March 20, 2012, contains 152 books in all. The profile of its contributor, under the ID clacklib, gives the address "Librarian Clackamas CC, Oregon City, Oregon, United States" and the email "reference@clackamass.edu"; the list can thus be presumed to have been posted by a reference librarian of the Clackamas Community College (Clackamas CC) library in Oregon City, Oregon. Twelve lists are linked to this contributor's ID in WorldCat, all course-related, such as "CCC RD 090 Fiction-Baker" and "CCC ANT 103 Keeler (Ethnographies recommended for Cultural Anthropology)."

For "Leisure Reading--Schusterman Library," a reading list that contains 33 books along with The Immortal Life of Henrietta Lacks, the contributor jcjanzen does not disclose a profile.
According to the list's description, however, it is a list of new books placed in the "Leisure Reading area" of the library of the University of Oklahoma-Tulsa (OU-Tulsa). There are many other lists whose titles alone suggest a library connection, such as "New Books at the Park Library" and "November 2010-New Books @HPULibraries." "New Book List" contains 496 books and carries a description identifying it as a list of new books acquired by the UC Berkeley Fong Optometry and Health Science Library, so it is immediately recognizable as a list contributed by an academic library.

There are, of course, individual contributors as well. "Finalists for Boise State Campus Read," posted by the contributor amyvecchione, contains 10 books; judging from the profile, the contributor appears to be an individual named Amy Vecciones, who has posted 14 lists in WorldCat. The list "Books I have Read Recently," containing 31 books, was posted by the contributor rwillits, whose identity is hard to establish because no profile is disclosed, but the title suggests a personal reading list. Another list, "Books I have Read," was posted by the contributor bowlib and contains only two books; the profile suggests an individual, Sarah Snavely (Bowman, North Dakota, United States), who has posted only this one list in WorldCat.

On the other hand, some contributors cast doubt on the reliability of the reading lists. The contributor MCHayes, for example, posted three lists, "ann," "kjahfs," and "Wayne's test," between September 29 and 30, 2011. "ann" contains two books, "kjahfs" two, and "Wayne's test" three, and the book common to all three lists is The Immortal Life of Henrietta Lacks. This contributor discloses no profile and has posted nine lists in WorldCat in all.

WorldCat also makes public the number of "viewers" of these reading lists. Of the 80 reading lists associated with this book, the earliest, "Things I Recommend," posted on February 3, 2010, is counted as having been viewed 124 times by WorldCat users, while the most recent, "Spring 2012 Library Associate Book Suggestions," posted on March 20, 2012, had been viewed 30 times as of March 26. "Finalists for Boise State Campus Read," mentioned above, is counted at 274 views. The viewer counts of all 80 reading lists were not examined here, but it is clearly the case that WorldCat's catalog users do consult the reading lists posted by other users.

4. User Reviews in External Sources

4.1 Current Status and Growth of User Reviews

Table 3 shows the state of user reviews in external sources for the ten books named in Table 1: the number of reviews in GoodReads, the external review source linked from each book's bibliographic record in WorldCat, and the number of reviews attached to each book's product information in the online bookstore Amazon. Like Table 2, Table 3 gives the review counts from the two surveys of February 16 and April 4 and the growth over the period.

One noteworthy function of the next-generation library catalog is that it can link directly to external sources such as online bookstores and review sites. The 'GoodReads Reviews' reached from WorldCat's bibliographic records are drawn from the review database of GoodReads (GoodReads.com), a website where ordinary readers around the world post reviews and recommendations; strictly speaking, this is less a function unique to the next-generation catalog than an instance of what Breeding early on called the catalog's 'enhanced contents' (Breeding 2004).

GoodReads, launched in January 2007, provided as of February 2012 information on more than 250,000,000 books contributed by more than 7,100,000 members.1) Since the review contributors on this site are mostly ordinary readers, its reviews are distinguished from the professional reviews provided by Publishers Weekly, the New York Times, School Library Journal, and the like.

1) Goodreads.com homepage. "About Goodreads." [online]. [cited 2012.2.17].

WorldCat also provides direct links through "Buy it" to Amazon, Barnes & Noble, Better World Books, and Google eBooks. How often WorldCat's catalog users actually move from a bibliographic record to these external sites cannot be known directly, but such links demonstrate that the catalog is no longer confined within the boundary called the library and that it offers opportunities to interact with commercial sources.

<Table 3> User reviews in GoodReads and Amazon

Rank | GoodReads (2/16 / 4/4 / increase / rate) | Amazon (2/16 / 4/4 / increase / rate)
1 | 9,226 / 9,924 / 698 / 7.6% | 991 / 1,035 / 44 / 4.4%
2 | 1,896 / 1,974 / 78 / 4.1% | 234 / 240 / 6 / 2.6%
3 | 150 / 153 / 3 / 2.0% | 593 / 610 / 17 / 2.9%
4 | 9,414 / 10,475 / 1,061 / 11.3% | 2,118 / 2,289 / 171 / 8.1%
5 | 890 / 985 / 95 / 10.7% | 314 / 328 / 14 / 4.5%
6 | 7,223 / 7,449 / 226 / 3.1% | 1,017 / 1,038 / 21 / 2.1%
7 | 10,801 / 11,442 / 641 / 5.9% | 1,391 / 1,486 / 95 / 6.8%
8 | 258 / 285 / 27 / 10.5% | 88 / 91 / 3 / 3.4%
9 | 2,765 / 2,921 / 156 / 5.6% | 292 / 302 / 10 / 3.4%
10 | 2,007 / 2,093 / 86 / 4.3% | 731 / 745 / 14 / 1.9%
Average | 4,463 / 4,770 / 307 / 6.5% | 776 / 816 / 40 / 4.0%

As Table 3 shows, the user-contributed reviews of the ten books in GoodReads and Amazon run from several hundred to several thousand per title. Over the period of this study, from February 16 to April 4, 2012, the reviews grew by an average of 307 per title (6.5%) in GoodReads and 40 per title (4.0%) in Amazon. Figure 1 shows the growth rate of user reviews for each book over the period.

As of February 16, 2012, each of the ten books in Table 1 carried an average of 4,463 reviews in GoodReads; on April 4 the average was 4,770 reviews, an average increase of 307 reviews, or 6.5%, over the period.
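A note on the averages: the reported 6.5% matches the mean of the ten per-title growth rates, whereas the aggregate review counts (and therefore the mean count, 4,463 to 4,770) grew by about 6.9% over the same period; the Amazon column shows the same pattern (4.0% per-title mean versus about 5.1% aggregate growth). This inference about how the averages were computed is ours, drawn only from the published figures. A minimal sketch reproducing the 6.5% figure from the per-title counts in Table 3:

<?php
// GoodReads review counts for the ten titles (Table 3),
// February 16 and April 4, 2012.
$feb = [9226, 1896, 150, 9414, 890, 7223, 10801, 258, 2765, 2007];
$apr = [9924, 1974, 153, 10475, 985, 7449, 11442, 285, 2921, 2093];

// Per-title growth rates over the period.
$rates = [];
foreach ($feb as $i => $before) {
    $rates[] = ($apr[$i] - $before) / $before;
}

// Mean of the per-title rates: ~6.5%, matching the reported average.
printf("mean of per-title rates: %.1f%%\n", 100 * array_sum($rates) / count($rates));

// Growth of the aggregate (equivalently, of the mean count): ~6.9%.
printf("aggregate growth: %.1f%%\n",
    100 * (array_sum($apr) - array_sum($feb)) / array_sum($feb));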
Of the ten books, the one with the most GoodReads reviews as of February 16 was The Girl Who Kicked the Hornet's Nest, the third volume of the 'Millennium trilogy,' which carried no fewer than 10,801 reviews; on April 4 it carried 11,442 in all, a growth of 641 reviews, or 5.9%, over the period.

<Figure 1> Growth rates of user reviews in GoodReads and Amazon

The book with the fewest GoodReads reviews among the ten was Matterhorn: A Novel of the Vietnam War, with only 150 reviews posted as of February 16 and 153 as of April 4. Its reviews grew by just 3 over the period, and its growth rate of 2.0% was also the lowest. Exceptionally, however, one user review of this book has been posted in WorldCat. To the End of the Land carried 258 reviews, rising to 285, an increase of 27, or a growth of 10.5%.

In Amazon, meanwhile, each of the books in Table 1 carried an average of 776 user reviews on February 16; on April 4 the average had risen by 40 to 816 reviews, a growth of 4.0%. The book with the most Amazon reviews on February 16 was Unbroken: A World War II Story of Survival, Resilience, and Redemption, published in November 2010, with 2,118 reviews in all. On April 4 its review count stood at 2,289, an increase of 171 and the highest growth rate, 8.1%. The book with the fewest reviews among the ten was To the End of the Land, with 88; on April 4 it carried 91 reviews, 3 more, a below-average growth of 3.4%.

Overall, taking February 16 as the basis, the average number of reviews per book in GoodReads was about 5.7 times that in Amazon, and comparing the April 4 results, the growth rate over the period was also about 1.6 times higher. From this one may infer that ordinary users contribute reviews "more" to a review-specialist site such as GoodReads than to a commercial bookstore site such as Amazon. It is also clear that, compared with sites carrying hundreds to thousands of reviews per book, the number of people contributing user reviews to a catalog such as WorldCat is still negligible, and to that extent it is too early to call the catalog a direct source of user reviews.

How much WorldCat's users view and consult the GoodReads or Amazon reviews linked to each book's bibliographic record cannot be known directly. But the fact that, should users want review information, such external sources are close at hand, reachable immediately beyond the library's boundary, means that their reviews could exert some influence on library catalog users.
따라서 아직 출이 가능하진 않았고, 그런 만큼 이용자 태 그나 서평 등이 추가되어 있지 않았다. WorldCat에서 직 가든, 혹은 WorldCat Local에 기반한 개별 도서 목록을 통해서든 서지 코드에서 아마존뿐만이 아니라 다른 온 라인 서 으로도 연결될 수 있다. 아마존의 서 평 공간에서 격렬한 토론이 진행되고 있는 반 면 반스앤노블 온라인서 에는 2012년 2월 5일 재 이 책의 서평이 단 일곱 건 올라와 있었다. 이 일곱 건의 서평 가운데 네 건은 별 한 개를, 세 건은 별 다섯 개를 각각 부여함으로써 이 책 의 평균 등 은 두 개 반으로 평가되어 있었다. 이 서평들은 앞서 한겨 신문 의 기사가 지 한 것과 동일한 양상을 보이고 있었다. 최 로 이 책의 서평이 올라온 것은 2011년 10월 24일 로 이 책의 공식 발행일인 2011년 11월 1일보 다 앞서고 있다. 이 최 의 서평은 익명의 독자 가 쓴 것으로 별 한 개의 평 과 “this story is a pure fabrication. better not get it”이라는 단 한 의 평가를 덧붙이고 있다. 이후 한 삼주 정도는 아무런 서평이 추가되지 않았고, 그 다 음으로는 갑자기 11월 13일에 5건, 17일에 1건 의 서평이 올라왔으며, 그 이후로는 다시 서평이 추가되지 않았다. 이 책에 별 다섯 개의 평 을 세 건의 서평은 13일에 올라온 2건과 17일에 올라온 1건으로 모두 한 의 서평만 포 함하고 있다. WorldCat에 연결된 다른 온라인 서 인 Better World Books에는 2월 16일 재 이 책에 한 서평이 하나도 올라와 있지 않았다. 더욱이 이 책 자체의 소개에는 “Written by the CEO of Korean car company Hyundai…”라고 하 여 자가 한국의 직 통령이 아니라, 자동 차 회사 의 CEO라고 되어 있다.2) 특히 이 사이트에서는 서평을 올리기 하여 굿리즈 계 정을 만들어야 하는데, 이 연구가 진행되는 시 에는 아무도 서평을 올리지 않은 것으로 보 다. 이 책과 련하여 아마존의 사례를 심으로 살펴보면 이용자 서평에는 몇 가지 주목할 만 한 이 있다: 첫째, 이용자가 주는 서평과 평 의 내용이 반드시 일치하지 않을 수도 있다는 이다. 통 상 평 인 별의 수가 많으면 “좋다”라고 인식 하게 마련인데, 상당수의 서평들은 이 책에 해 평 은 높게 주면서 서평의 내용은 “좋지 않 다”에 무게를 싣고 있다. 따라서 독자들이 처음 2) BetterWorldBooks. The Uncharted Path. [cited 2012.2.6]. . 128 한국문헌정보학회지 제46권 제2호 2012 부터 평 과 서평의 불일치 가능성을 감안하지 않는다면, 서평에 의존한 단에 혼란을 느낄 수도 있다. 둘째, 이용자들이 반드시 책을 읽지 않고도 서평란에 을 올릴 수 있다는 이다. 앞서 한 겨 신문 이 이 책의 서평란에 43개의 “댓 ” 이 달려있다고 하 는데, 실제로도 이들의 내 용은 책을 읽고 기고한 “서평”이라기보다는 일 반 신문기사에 달리는 “댓 ” 정도 다. 서평 의 공간이 마치 특정한 인물에 한 정치 혹 은 사회 공론장이 된 듯 하 다. 다시 말하면 실제 책 자체의 순수한 가치에 한 서평을 공 유하는 것이 아니라, 자 자체에 한 호불호 로 비롯된 의견 표출의 공론장이 되었다는 것 이다. 이용자 서평을 허용하는 공간에서 이러한 일이 다른 책과 련해서도 발생할 가능성이 있다. 셋째, 이용자가 서평으로 책을 평가하지만, 다 른 이용자는 그 서평 자체가 유용한가를 다시 평가함으로써 평가의 순환 구조가 만들어진다 는 이다. 이 책의 경우에는 많은 이용자들이 객 서평으로서 “유용함(helpful)”이 아니라 자나 책에 해 “동조할 만한 의견”으로서의 “유용함”을 단 기 으로 삼은 것으로 보 다. 이용자 서평이 비록 문 서평의 조건이나 수 을 갖추지 못한다고 해도, “동료” 독자의 높이 에서 의견을 제시한다는 에서 어떤 면에서는 더 쓸모가 있을 수도 있다. 따라서 서평의 “유용 함”에 한 평가는 서평을 기고하는 이용자들을 고무하거나 제어하는 역할을 할 수도 있다. 그런 맥락에서 이 책의 사례처럼 “유용함”의 평가가 의견에 한 동의 여부를 표 하는 수단이 된다 면 그 기능이 왜곡될 우려가 있다. 5. 맺음말 차세 도서 목록의 개념이 등장하고 새로 운 기능의 하나로서 다양한 형태의 이용자 참 여에 한 심이 늘어나고 있는 것은 사실이 다. 이용자가 서평, 태그, 평 , 독서 리스트 등 으로 목록에 직 참여할 수 있는 방법이 생겼 고, 특히 WorldCat과 같이 세계 도서 들의 방 한 서지 코드를 수록하고 있는 종합목록 은 세계 이용자들에게 참여 공간을 열어두 게 되었다. 그러나 아직은 이용자 참여가 그 게 활발하지 않은 것으로 보 다. 이 연구에서 2010년 간행된 열 권의 책을 심으로 WorldCat 상 이용자가 추가한 서평, 태 그와 독서 리스트의 황을 살펴본 결과는 세 가지 이용자 참여 기능 독서 리스트가 그래 도 많이 사용되었고, 태그와 서평은 거의 이용 되지 않은 것으로 나타났다. 독서 리스트는 일 반 목록이용자의 독서 리스트뿐만 아니라 학 도서 이나 공공도서 의 추천도서 리스트, 신 간도서 리스트 등도 포함되어 있어, 개인 기 차원의 참여가 있는 것으로 보 다. 같은 책들에 하여 WorldCat의 “Buy It” 기능을 통해서 연결되는 아마존의 이용자 서평 외부 서평 사이트인 굿리즈의 서평도 살펴 본 결과, 각 책마다 게는 수백 건에서 많게는 수천 건의 이용자 서평이 달려있었다. 이러한 사이트에서의 이용자 서평 활동이 매우 활발한 데 비하여, WorldCat의 이용자 서평 참여는 아 직 미진한 것으로 보이는 이유를 다음과 같이 추론해 볼 수 있다. 첫째, WorldCat이 세계 수많은 도서 을 회원으로 갖고 있으며 매우 빈번하게 이용되지 차세 도서 목록의 이용자 서평에 한 고찰 129 만, 아직은 자발 인 이용자 서평의 참여 공간 으로서 잘 인식되지 못하기 때문이다. 1980년 온라인 목록의 등장 이래 수십 년 동안 이용 자들에게는 조 도 개방되지 않았던 목록의 공 간에 차세 도서 목록 혹은 발견 인터페이 스의 도입과 함께 이용자들의 참여를 허용하 다 해도 아직은 인식의 환이 쉽지는 않을 수 도 있다. 둘째, WorldCat은 웹 상에서 일반 이용자들 로 하여 회원으로 가입하여 서평, 태그 등을 기고할 수 있게 허용하지만, 특별히 개인이 이 용자 서평을 기고함으로써 얻을 수 있는 혜택 이 아직은 에 띄지 않는 것으로 보인다. 아마 존과 같이 상업 인 웹사이트에서는 서평을 기 고하면 자상거래에서 상품평을 할 때와 마찬 가지로 일정한 크 딧을 수도 있다. 한 굿 리즈는 아 처음부터 서평 공유와 토론을 목 으로 하는 사람들이 모이는 공간이므로 자발 참여의 동기가 부여될 수 있다. 그런 면에서 WorldCat과 같은 목록은 아직은 이용자의 직 참여 공간으로서 자리매김 하기까지 시간이 걸릴 수도 있다. 물론 이 연구에서 단지 열 권의 책을 살펴보 았으므로 WorldCat에 이용자 참여가 “ 으 로” 조하다고 단정하기는 어렵다. 
왜냐면 실 제로 WorldCat 상 조앤 롤링의 베스트셀러 해 리포터 시리즈 책들에는 여기서 살펴본 책들보 다는 많은 이용자 서평이 달려있기 때문이다. 그럼에도 불구하고 분명하게 드러나는 사실 은 이용자 참여 기능 가운데 간단한 단어로 참 여할 수 있는 태그나 서지를 포함시키기만 하 면 되는 독서 리스트 같은 것이 상 으로 더 많고, 이용자 서평과 같이 이용자의 지 노력 이 많이 투입되는 참여의 증 추이는 계속 지 켜보아야 할 것이라는 이다. 한 도서 목록에서 이용자 서평 참여가 아직 미미한 만큼, 서지 코드에서 직 연결 되는 아마존이나 굿리즈 등 외부 정보원이 제 공하는 이용자 서평이 상 으로 이용자들의 선택에 향을 미칠 수도 있다. 여기에서 The Uncharted Path라는 책과 련하여 아마존의 이용자 서평 공간에서 발생한 논쟁의 사례를 통해 이용자 참여가 왜곡되는 경우도 발생함을 볼 수 있었다. 이 책은 이용자 서평 공간이 정치 공론장이 됨으로써 외 사례일 수도 있 으나, 다른 책과 련해서도 그럴 가능성을 배 제할 수 없으며, 이용자 참여가 허용되는 목록 의 서평 공간 한 그 게 될 가능성이 있다. 여기에서는 “그럴 수도 있다”는 가능성을 논 하 다. 실제로 차세 도서 목록에서 이용 자 서평을 통한 참여 기능이 얼마나 이용되며, 거기 연결된 외부 정보원의 이용자 서평이 목 록 이용자들의 자료 선택에 어떤 향을 미치 는가에 하여 보다 심층 인 연구가 필요하다. 어느 정도 이용자 서평의 기능이 확산되고 안 정된 이후 실제 이용자에 한 향 여부를 측 정하는 연구가 진행될 수 있을 것으로 기 한 다. 한 아직은 부분 도서 들에서 목록의 이용자 서평 기능을 도입하는 기 단계이지만 아마존에서 발생한 것과 같은 이용자 서평 공 간의 왜곡 상을 제어할 수 있는 장치를 마련 할 필요가 있다. 130 한국문헌정보학회지 제46권 제2호 2012 참 고 문 헌 [1] 구 억, 곽승진. 2007. 차세 OPAC의 인터페이스와 기능에 한 연구. 한국비블리아학회지 , 18(2): 61-88. [2] 김상호. 1994. 문헌비평을 한 서평의 분석 고찰: 서평문화와 출 을 심으로. 한국비블리 아학회지 , 7(1): 247-262. [3] 노동조, 민숙희. 2011. 학도서 웹사이트 분석을 통한 도서 2.0 기반 서비스 운 실태 분석. 정보 리연구 , 42(4): 195-223. [4] 심 경. 2008a. 차세 도서 목록. 도서 문화 , 49(9): 22-28. [5] 심 경. 2008b. 차세 도서 목록의 사례: AquaBrowser. 도서 문화 , 49(10): 48-56. [6] 심 경. 2008c. 차세 도서 목록의 사례 (2): WorldCat Local. 도서 문화 , 49(11): 54-61. [7] 윤정옥. 2012. 도서 목록의 지식 확산 도구 역할에 한 시론: WorldCat을 심으로. 한국도서 ․정보학회지 , 43(1): 123-141. [8] 윤정옥. 2010. 차세 도서 목록 사례의 고찰. 한국도서 ․정보학회지 , 41(1): 1-28. [9] 이지연, 민지연. 2008. 라이 러리 2.0에 한 이용자 인식 요구사항에 한 실증 연구. 한국문 헌정보학회지 , 42(1): 213-231. [10] 이 실. 2009. OPAC 근 향상을 한 도서 툴바의 제공 사서 평가 연구 - W 학 도서 사례를 심으로. 한국도서 ․정보학회지 , 40(3): 157-180. [11] 이 실, 배창섭, 이은주, 한성국. 2009. 지식 서비스 지향 도서 시스템의 논리 모델. 정보 리학 회지 , 26(3): 45-67. [12] 임종업. 2011. MB 문 자서 아마존서 찬바람. 한겨 신문 , 11월 13일. [13] 최경 , 설갑수. 2012. 이 통령 문 자서 , 미국서 1014권 팔려, 홍보비만 1억 이상... 스티 잡스가 통곡할 일. OhmyNews , 1월 25일. [online]. [cited 2012.2.5]. . [14] Arko, R. A., Ginger, K. M., Kastens, K. A., & Weatherly, J. 2006. “Using annotations to add value to a digital library for education.” D-Lib Magazine, 2(3). [online]. [cited 2012.2.6]. . [15] Breeding, Marshall. 2010. “State of the art in Library Discovery 2010.” Computers in Libraries, 31(1): 31-35. [16] Breeding, Marshall. 2007a. “Next-Generation Catalogs.” Library Technology Reports, 43(4): 1-44. [17] Breeding, Marshall. 2007b. “Small world: OCLC launches WorldCat Local.” Small Libraries 차세 도서 목록의 이용자 서평에 한 고찰 131 Newsletter, 27(6): 3. [18] Breeding, Marshall. 2004. “Integrated library software: A guide to multiuser, multifunction systems.” Library Technology Reports, 40(1): 1-44. [19] OCLC. Online Catalogs: What Users and Librarians Want: An OCLC Report. 2009. [online]. [cited 2009.12.10]. . [20] OCLC. 홈페이지. “WorldCat Facts and Statistics.” ; “WorldCat: A Global Catalog.” ; “Brief history of OCLC Activities with National Libraries Outside the U.S.” . [online]. [cited 2011.12.3]. [21] Sierra, Tito, Ryan, Joseph, & Wust, Markus. 2007. “Beyond OPAC 2.0: Library Catalog as versatile discovery platform.” Code{4}Lib Journal, 1. [online]. [cited 2009.12.31]. . [22] Yang, Sharon Q., & Hofmann, Melissa A. 2011. “Next generation or current generation?: A study of the OPACs of 260 academic libraries in the USA and Canada.” Library Hi Tech, 29(2): 266-300. [23] Yang, Sharon Q., & Hofmann, Melissa A. 2010. “Evaluating and comparing discovery tools: How close are we towards next generation catalog?” Library Hi Tech, 28(4): 690-709. 
work_4v42beeknre7vdta3lfv5z5xki ----

Evidence Based Library and Information Practice 2012, 7.4

Evidence Summary

Interlibrary Loan Rates for Academic Libraries in the United States of America Have Increased Despite the Availability of Electronic Databases, but Fulfilment Rates Have Decreased

A Review of: Williams, J. A., & Woolwine, D. E. (2011). Interlibrary loan in the United States: An analysis of academic libraries in a digital age. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 21(4), 165-183. doi: 10.1080/1072303X.2011.602945

Reviewed by: Kathryn Oxborrow, Team Leader, Hutt City Libraries, Lower Hutt, New Zealand. Email: Kathryn.Oxborrow@huttcity.govt.nz

Received: 1 June 2011. Accepted: 5 Oct. 2011.

© 2012 Oxborrow. This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike License 2.5 Canada (http://creativecommons.org/licenses/by-nc-sa/2.5/ca/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly attributed, not used for commercial purposes, and, if transformed, the resulting work is redistributed under the same or similar license to this one.
Abstract

Objectives – To determine the number of interlibrary loan (ILL) requests in academic libraries in the United States of America over the period 1997-2008, and how various factors have influenced these rates. These factors included electronic database subscriptions, size of print journal and monograph collections, and the presence of link resolvers. Data were collected from libraries as both lenders and borrowers. The study also looked at whether the number of professional staff in an ILL department had changed during the period studied, and whether ILL departments led by a professional librarian correlated positively with rates of ILL.

Design – Online questionnaire.

Setting – Academic library members of the Online Computer Library Center (OCLC) ILL scheme in the United States of America.

Subjects – A total of 442 academic library members of the OCLC ILL scheme.

Methods – An electronic questionnaire was sent to 1,433 academic library member institutions of the OCLC ILL scheme. Data were collected for libraries as both lending and borrowing institutions. Data were analyzed using a statistical software package, specifically to calculate Spearman's rank correlations between the variables and rates of ILL.
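The statistic named in the Methods, Spearman's rank correlation, correlates the ranks of two variables rather than their raw values, which makes it robust to the kind of skewed counts ILL data produce. Purely as an illustrative aside (the sample numbers below are hypothetical, not data from the study), a minimal from-scratch version looks like this:

```python
# Illustrative only: a from-scratch Spearman rank correlation, the statistic
# the reviewed study used to relate library characteristics to ILL rates.
# The sample numbers below are hypothetical, not data from the study.

def ranks(values):
    """Assign ranks (1 = smallest), averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0          # average of the tied positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical example: print monograph collection size vs. ILL requests.
monographs = [120_000, 450_000, 80_000, 900_000, 300_000]
ill_requests = [1_500, 4_200, 900, 7_800, 2_600]
print(round(spearman(monographs, ill_requests), 3))  # 1.0 (perfectly monotonic here)
```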
The data covers a wide range of scenarios: separate datasets were recorded for ILL requests made and fulfilled, and for libraries as borrowers and as lenders. The presentation of the data makes it difficult for the reader to interpret the results. The authors use a coding system to represent ranges of numbers of ILL requests. All of the figures and tables are presented in this way, so the reader must continually refer back to the coding, and in some cases the code numbers are divided into decimal points, meaning the reader must calculate for themselves what the actual figures are. The coding is also misleading as the numerical range covered by each code number varies widely. The scales used on the graphical representations of the results also vary, which makes it very difficult to compare among them. The results of this study do not give the reader the full story. Although a great number of institutions were surveyed, every institution did not answer every question, and there is no explanation of this in the text. Furthermore, information is not given about the size and distribution of the institutions surveyed. This makes some of the conclusions which the researchers draw somewhat shaky, as they do not discuss how these factors may also have a bearing on the results. There are other aspects which the researchers did not cover in their conclusions, such as the fact that the presence of link resolvers may facilitate the ILL application process, and that non-fulfilment in many cases may be due to library users’ inability to find items in the collection. This is an interesting research area, and the authors suggest that further study could be undertaken into the declining numbers of ILL requests being fulfilled. Further studies could also look at the staffing question touched upon by the authors, such as a smaller scale comparison of the success rates of professionally qualified staff and paraprofessionals in the fulfilment of ILL requests. A study similar to that carried out by Bernardini & Mangiaracina (2011), who Evidence Based Library and Information Practice 2012, 7.4 116 investigated the types of items that are being requested by ILL after the introduction of subscription bundles in Italy, would also be an interesting addition to the literature. ILL is a rapidly changing area of librarianship, particularly with continuing technological advances such as e-books, so it is an area which requires continual research. Implications for practice include the importance of providing information literacy training to library users to ensure they can access the available material. References Bernardini, E., & Mangiaracina, S. (2011). The relationship between ILL/document supply and journal subscriptions. Interlending and Document Supply, 39(1), 9-25. doi: 10.1108/02641611111112101 work_53gkllp4fvbvjp2sxxjzad5g2a ---- ELIS_OTDCF_v21no3.doc by Norm Medeiros Coordinator for Bibliographic and Digital Services Haverford College Haverford, PA Electronic Resource Usage Statistics: The Challenge and the Promise ___________________________________________________________________________________________________ {A published version of this article appears in the 21:3 (2005) issue of OCLC Systems & Services.} “I ain't what I used to be, but who the hell is?” – Dizzy Dean ABSTRACT This article discusses the challenge of maintaining meaningful usage statistics for electronic resources. 
The article notes the potential value of longitudinal data at the institution level that would help libraries make renewal and cancellation decisions. It describes a proposal to integrate usage statistics with other criteria in order to develop a decision support mechanism. KEYWORDS Electronic resource usage statistics; e-resources; Caryn Anderson; Andrew Nagy; Tim McGeary This past Christmas, my mother-in-law gave me a book of quotes from the great Peter Drucker. Like the tear-away desktop calendars my wonderful mother-in-law is also fond of gifting me, the Drucker book is designed to impart one iota of wisdom per day, a ration that fulfills my mind’s capacity for such things. Although there are no usage prescriptions for either the Drucker compilation or the calendar (“Amazing but True each are meant to be read at the start of one’s day. Because I’ve positioned the calendar in line with my office window – a window that receives many exasperated stares, usually following receipt of a budget-crushing journal invoice – the calendar never falls far behind the actual date. Occasionally I catch myself lapsing by a day or two, but a quick rip of the pages gets me synchronized. The Drucker text, however, I find much more difficult to keep current. For one, the book isn’t as obvious as the calendar; it rests near other books and papers that I generally ignore. Moreover, unlike the calendar, the dates don’t stare angrily at me, as if saying, “January 7th was a week ago. Get with the program.” Yet when I do open Drucker’s book of “insight and motivation,” it provides me with thoughts that resonate. I think the March 2nd anecdote is especially revealing of some library applications (Drucker, 2004): “The test of an innovation is whether it creates value. Innovation means the creation of new values and new satisfaction for the customer. A novelty only creates amusement. Yet, again and again, managements decide to innovate for no other reason than that they are bored with doing the same thing or making the same product day in and day out. The test of an innovation, as well as the test of ‘quality,’ is not ‘Do we like it?’ It is ‘Do customers want it and will they pay for it?’” I’ve had the good fortune to meet recently with a small group of individuals from Villanova University, Simmons College, and Lehigh University to discuss the need for an application that will help libraries manage e -resource usage statistics. The catalyst for this discussion was the recognition that these statistics could be incredibly valuable to libraries if harnessed in a meaningful and at least partially-automated way. The scope of our initial discussion has since grown into a much larger framework, all of which I believe to be an “innovation” rather than merely a “novelty” to classify using Drucker’s terms. A look to the past often lends insight into today’s issues. In researching how periodical usage statistics were managed in past decades, I came across an article by Robert Broadus that discusses the value of use studies, but cautions that these studies “measure not what should have been used, but what was used” (Broadus, 1985). Large packages of e-journals that provide access to formally-unavailable e-journals – unavailable because the library recognized that the title was not relevant to the curriculum, pertinent to faculty research, or of academic value – often receive usage because they’re just a click away. These uses of convenience, unfortunately, can be neither counted nor prevented. 
Broadus continues his piece by positing that a well-performed use study should predict future use of periodicals in a library. For instance, if Journal X is only marginally used during years one -through-three of a use study, and the faculty and curriculum in the discipline to which Journal X is aligned remain constant, it’s reasonable to assume year-four use of Journal X will remain low. Likewise, high use of Journal Y throughout a three-year period should result in continued high use of Journal Y in year-four of the study, given no changes in the faculty and curriculum of the discipline to which Journal Y is aligned. Evidence from the journal study I administer is consistent with this theory. Certainly there are instances where spikes in usage are consequential of a class assignment or other one-time need, but over the course of several years’ study, usage trends have remained fairly steady. Broadus raises a question, however, for which little research has been done; that is, how consistent are journal uses between similar libraries? If Journal X maintains low usage in my liberal arts college library in Pennsylvania, does this journal have similarly low usage in liberal arts colleges elsewhere in the States? Phil Davis provides some insight with his look at the Northeast Regional Libraries’ (NERL) use of the Academic Ideal e -journal package (Davis, 2002). Davis found that the research and medical institutions within NERL during the two years studied tended to use the same group of e -journals most frequently. On the other hand, undergraduate institutions tended to show little similarity in their uses of e-journals within the Ideal stable. Further study substantiating Davis’ findings would be of value to collection development officers. The development work of Caryn Anderson (Simmons), Andrew Nagy (Villanova), and Tim McGeary (Lehigh) mentioned above will fill a void in the e -resources spectrum. Although the Digital Library Federation (DLF) Electronic Resource Management Initiative’s (ERMI) functional specifications accommodate both metadata about the availability, frequency, and location of usage statistics, as well as the actual storage of usage statistics, it’s unlikely vendors building e-resource systems will soon begin work on this important, but glamourless feature. The Anderson/Nagy/McGeary model would incorporate usage statistics into a larger framework that would include elements such as price, impact factor, and faculty interest. The result would be a decision support mechanism that could communicate with library management and electronic resource systems. It’s a powerful idea that I hope will acquire the credentials of the DLF or another funding agency so that this work can be realized. It’s only fitting to end this column the way it began, with a serving of wisdom from Peter Drucker (Drucker, 2004): “Everything improved or new needs first to be tested on a small scale; that is, it needs to be piloted. The way to do this is to find somebody within the enterprise who really wants the new. Everything new gets into trouble. And then it needs a champion. It needs somebody who says, ‘I am going to make this succeed,’ and who then goes to work on it. … If the pilot test is successful – it finds the problems nobody anticipated but also finds the opportunities that nobody anticipated, whether in terms of design, or market, or service – the risk of change is usually quite small.” Dated March 11th; read July 14th. REFERENCES Broadus, R.N. (1985). 
REFERENCES

Broadus, R.N. (1985). "The measurement of periodicals use," Serials Review, vol. 11, p. 57-61.

Davis, P.M. (2002). "Patterns of electronic journal usage: Challenging the composition of geographic consortia," College & Research Libraries, v. 63, no. 6, p. 484-497.

Drucker, P.F. (2004). The Daily Drucker: 366 Days of Insight and Motivation for Getting the Right Things Done. New York, NY: HarperBusiness.

work_56ff3ssfwbgo5eua3m2s2jebda ----

FAST: Development of Simplified Headings for Metadata [1]

Rebecca J. Dean
OCLC

[1] FAST is an OCLC Office of Research project. The members of the FAST team are: Edward T. O'Neill, Eric Childress, Rebecca Dean, Kerre Kammerer, Diane Vizine-Goetz, Anya Dyer (OCLC, Dublin, OH, USA); Lois Mai Chan (University of Kentucky, Lexington, Kentucky, USA); Lynn El-Hoshy (Library of Congress, Washington, D.C., USA).

Introduction

The enormous volume and rapid growth of resources available on the World Wide Web, as well as the emergence of numerous metadata schemas, have spurred a re-examination of the way subject data are provided for Web resources. There is broad agreement that a subject schema for metadata must exhibit both simplicity and interoperability. Simplicity refers to usability by non-catalogers. Interoperability enables users to search both across discipline boundaries and across information retrieval and storage systems. Additional requirements identified by the ALCTS/SAC/Subcommittee (1999) specify that the schema should:

• Be simple and easy to apply and to comprehend,
• Be intuitive so that sophisticated training in subject indexing and classification, while highly desirable, is not required in order to implement,
• Be logical so that it requires the least effort to understand and implement,
• Be scalable for implementation from the simplest to the most sophisticated.

Another central issue involving the syntax revolves around the choice of pre-coordination or post-coordination. Both have precedence in cataloging and indexing practices. Subject vocabularies used in traditional cataloging typically consist of pre-coordinated subject heading strings, while controlled vocabularies used in online databases are mostly single-concept descriptors, relying on post-coordination for complex subjects. For the sake of simplicity and semantic interoperability, the post-coordinate approach is more in line with the basic premises and characteristics of the online environment. Chan et al. (2001) provide additional background on the metadata requirements, particularly as they relate to Dublin Core applications.
The ALCTS/SAC/Subcommittee recommended that metadata for subject analysis of Web resources include a mixture of keywords and controlled vocabulary. The potential sources of controlled vocabulary the Subcommittee identified included:

• Using an existing schema(s),
• Adapting or modifying existing schema(s),
• Developing new schema(s).

Each of these options offers clear advantages. The use of an existing schema is certainly the simplest approach if a suitable one can be found. Of the existing schemas, LCSH is the most obvious choice, but its complexity greatly limits its use by nonprofessionals. There are many excellent subject-specific schemas available but, since the Web is so interdisciplinary, combining diverse schemas is likely to create significant interoperability problems. Obtaining rights to the required schemas could also pose a serious problem. At first glance, developing an entirely new schema appears to be very attractive. However, the effort required to develop a new subject indexing system appears considerably less attractive upon further examination. The cost would be very high, without any guarantee that the new schema would necessarily be superior to one of the existing schemas. It is quite possible that a new system could trade a set of known problems for its own set of unknown problems. It quickly became clear that attempting to develop a system as comprehensive as LCSH would be very challenging. As was concluded by the ALCTS/SAC/Subcommittee, the option of modifying an existing schema appeared more attractive. As a result, the FAST project team concluded that the most viable option for a general-purpose metadata subject schema was to adapt LCSH. This new schema, known as FAST (Faceted Application of Subject Terminology), is derived from LCSH but will be applied with a simpler syntax. The objective of the FAST project is to develop a subject-heading schema based on LCSH suitable for metadata that is easy to use, understand, and maintain. To achieve this objective, this new schema is being designed to minimize the need to construct new headings and to simplify the syntax while retaining the richness of the LCSH vocabulary. The primary data source used for the research effort was OCLC's WorldCat database, whose bibliographic records contain approximately eight million unique topical and geographic headings.

Library of Congress Subject Headings

LCSH is the most widely used indexing vocabulary and offers many significant advantages:

• Its rich vocabulary covers all subject areas,
• It has the strong institutional support of the Library of Congress,
• It imposes synonym and homograph control,
• It has been extensively used by libraries,
• It is contained in millions of bibliographic records, and
• It has a long and well-documented history.

While LCSH has served libraries and their patrons well for over a century, its complexity greatly restricts its use beyond the traditional cataloging environment. It was designed for card catalogs and excelled in that environment.
However, because real estate on a 3x5 card is limited and each printed subject heading requires a new card, the number of headings that could be assigned per item was severely restricted. Since the card catalog is incompatible with post-coordination, pre-coordinated headings were the only option available.

LCSH is not a true thesaurus in the sense that it is not a comprehensive list of all valid subject headings. Rather, LCSH combines authorities, now five volumes in their printed form, with a four-volume manual of rules detailing the requirements for creating headings that are not established in the authority file and for the further subdivision of the established headings. The rules for using free-floating subdivisions controlled by pattern headings illustrate some of these complexities. Under specified conditions, these free-floating subdivisions can be added to established headings. The scope of patterns is limited to particular types (patterns) of headings. For example, Burns and scalds—Patients—Family relationships is a valid heading formed by adding two pattern subdivisions to the established heading Burns and scalds. The subdivision 'Patients' is one of several hundred subdivisions that can be used with headings for diseases and other medical conditions. Therefore it can be used to subdivide Burns and scalds. However, the addition of Patients changes the meaning of the heading from a medical condition to a class of persons. Now, since Family relationships is authorized under the pattern for classes of persons, it can also be added to complete the heading.

Other examples of the complexities are illustrated by a type of authority record known as 'multiples'. Multiples are headings that establish a pattern of use; for example, the subdivision $x Translating into French [German, etc.] indicates that the language 'French' can be replaced with the name of any established language. The 'multiple' heading that actually appears in the 1xx field of an authority record should never be used in its multiple form in a bibliographic record, and not all of the possible headings that can be created from 'multiples' are included in LCSH.

A third area that illustrates the complexities is music. The complexities involved include determining the group for each solo instrument (e.g., wind instruments), the ordering of instruments within the individual group, and deciding when a heading should and should not be qualified (e.g., Concertos). Overall, music accounted for the largest number of correctly constructed headings represented by the fewest number of authority records.

While the rich vocabulary and semantic relationships in LCSH provide subject access far beyond the capabilities of keywords, its complex syntax presents a stumbling block that limits its application beyond the traditional cataloging environment. Not only are the rules for pattern headings complex, their application requires extensive domain knowledge, since there is no explicit coding that identifies which pattern subdivisions are appropriate for particular headings. Although FAST will retain headings authorized under these rules, they will be established in the authority file, effectively hiding the complexity of the rules under which they were created.
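To make the 'multiples' mechanism described above concrete, here is a minimal sketch, assuming a simple textual pattern, of how one such heading could be expanded programmatically. This is not LC's actual tooling, and the language list is a stand-in for the real file of established languages.

```python
# A minimal sketch (not LC's actual tooling) of how a 'multiple' subdivision
# pattern could be expanded: the bracketed part of the heading marks a slot
# that any established language name may fill. The language list here is a
# stand-in for the real file of established languages.

import re

def expand_multiple(pattern, established_languages):
    """'$x Translating into French [German, etc.]' -> concrete subdivisions."""
    m = re.match(r"(.*)\bFrench\s*\[German, etc\.\]", pattern)
    if not m:
        return [pattern]                  # not a multiple; use as-is
    stem = m.group(1)                     # e.g. '$x Translating into '
    return [stem + lang for lang in established_languages]

languages = ["French", "German", "Swedish", "Korean"]  # illustrative subset
for subdivision in expand_multiple("$x Translating into French [German, etc.]",
                                   languages):
    print(subdivision)
# $x Translating into French
# $x Translating into German ... and so on for each established language
```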
The LCSH environment has resulted in a complex system requiring skilled professionals for its successful application and has prompted several simplification attempts. Among these, the Subject Subdivisions Conference (The Future of Subdivisions, 1992) attempted to simplify the application of LCSH subdivisions. Recently, the ALCTS/SAC/Subcommittee on Metadata and Subject Analysis (Subject Data in the Metadata Record…, 1999) recommended that LCSH strings be broken up [faceted] into topic, place, period, language, etc., particularly in situations where non-catalogers are assigning the headings. The Library of Congress has also embarked on a series of efforts to simplify LCSH.

The FAST Schema

After reviewing the previous attempts to update LCSH or to provide other subject schemas, OCLC decided to develop the FAST schema. While FAST is derived from LCSH, it has been redesigned as a post-coordinated faceted vocabulary for an online environment. Specifically, it is designed to:

• Be usable by people with minimal training and experience,
• Enable a broad range of users to assign subject terminology to Web resources,
• Be amenable to automated authority control,
• Be compatible with use as embedded metadata,
• Focus on making use of LCSH as a post-coordinate system in an online environment.

The first phase of the FAST development includes the development of facets based on the vocabulary found in LCSH topical and geographic headings and is limited to six facets: topical, geographic, form, and period, with the most recent work focused on faceting personal and corporate names. This leaves headings for conferences/meetings, uniform titles, and name-title entries for future phases. With the exception of the period facet, all FAST headings will be fully established in a FAST authority file.

Topical Facet

The topical facet consists of topical main headings and their corresponding general subdivisions. FAST topical headings look very similar to the established form of LCSH topical headings, with the exception that established headings will include all commonly used (i.e., free-floating) topical subdivisions, and each of the common multiple headings will be individually established. FAST topical headings will be created from:

• LCSH main headings from topical headings (650) assigned to MARC records,
• All associated general ($x) subdivisions from any type of LCSH heading,
• Period subdivisions containing topical aspects from any type of LCSH heading.

All topical heading strings will be established in an authority file. Examples of typical FAST topical headings are shown below:

Project management $x Data processing
Colombian poetry
Blacksmithing $x Equipment and supplies
Epic literature $x History and criticism
Pets and travel
Quartets (Pianos (2), percussion)
Natural gas pipelines $x Electric equipment
School psychologists
Blood banks
Loudspeakers $x Design and construction
Burns and scalds $x Patients $x Family relationships

FAST headings retain the hierarchical structure of LCSH, but topical headings can be subdivided only by topical subdivisions, geographic headings only by geographic headings, and so on. For example, in FAST, one would not see headings of the type:

Colombian poetry $v Indexes
Pets and travel $v Guidebooks
Quartets (Pianos (2), percussion) $v Scores and parts
Blood banks $z Italy $z Florence
Italy $x History $y To 476

Geographic Facet

The geographic facet includes all geographic names and, following the practice of the Library of Congress, populated places are the default and are not qualified by type of geographic unit. However, in FAST, these place names will be established and used in indirect order.
For example, Ohio—Columbus is the established form in FAST rather than the direct order form, Columbus (Ohio). In LCSH, place names used as main headings are entered in direct order, but when they are used as subdivisions, those representing localities appear in indirect order.

First-level geographic names in FAST will be far more limited than in LCSH. They will be restricted to names from the Geographic Area Codes table. Linking the first-level entries with the Geographic Area Codes also provides additional specificity and hierarchical structure to the headings. In this way, the Geographic Area Codes can be used to limit a search.

During the process of linking first-level heading entries with Geographic Area Codes, some established geographic headings could only be associated with the code for 'Other'. These include headings associated with geographic locations for the earth, sun, and the planets in its solar system, as well as comets, stars, satellites, and planets in other galaxies. Creating a set of headings with 'Other' as the first level did not meet the goal of providing specificity, and after evaluating the headings that were associated with 'Other', a proposal for new Geographic Area Codes was submitted to the MARC Standards Office. As a result, a series of new codes were established:

x Earth
xa Eastern Hemisphere
xb Northern Hemisphere
xc Southern Hemisphere
xd Western Hemisphere
zd Deep space
zju Jupiter
zma Mars
zme Mercury
zmo Moon
zne Neptune
zo Outer space
zpl Pluto
zs Solar system
zsa Saturn
zsu Sun
zur Uranus
zve Venus

Second-level names will be entered as subdivisions under the name of the smallest first-level geographic area in which the place is fully contained. For example, the Maya Forest, which spans Belize, Guatemala, and Mexico, would be established as North America—Maya Forest instead of simply as Maya Forest.

The same geographic name may appear significantly different in its direct and indirect forms. In LCSH, North Carolina as a first-level entry or as a subdivision is spelled out but, as a qualifier, it is abbreviated as N.C. (e.g., Chapel Hill (N.C.)). To ensure a comprehensive search, users frequently must search for multiple forms of the same name. Some examples of FAST geographic headings and their corresponding Geographic Area Codes are:

England $z Coventry [e-uk-en]
Great Lakes [nl]
Great Lakes $z Lake Erie [nl]
Italy [e-it]
Maryland $z Worcester County [n-us-md]
Ohio $z Columbus [n-us-oh]
Deep space $z Milky Way [zd]
Solar system $z Hale-Bopp comet [zs]

Type qualifiers (County, Lake, Kingdom, Princely State, etc.) will be used when the name is not a unique geographic name. For the United States, county names will be the most common means of identifying a particular place when the name is not unique within the state. For example, there are two Beaver Islands in Michigan; the larger and better-known island is in Lake Michigan, but another Beaver Island exists in the Isle Royale National Park, located in Lake Superior. To uniquely specify the island in Lake Michigan, Beaver Island would be qualified by the county:

Michigan $z Beaver Island (Charlevoix County) [n-us-mi]
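The construction rules above lend themselves to a simple illustration. The following sketch is illustrative only, with a deliberately tiny code table drawn from the examples in this section; it builds indirect-order FAST geographic headings and attaches the Geographic Area Code that would let a system limit searches by region.

```python
# A simplified sketch of FAST geographic heading construction: the first-level
# name comes from the Geographic Area Code table, second-level names are
# entered indirectly beneath it, and the GAC lets a system limit searches by
# region. Only a few codes are shown; the celestial ones are from the newly
# established codes listed above.

GAC = {
    "Ohio": "n-us-oh",
    "Michigan": "n-us-mi",
    "Italy": "e-it",
    "Great Lakes": "nl",
    "Solar system": "zs",
    "Deep space": "zd",
}

def fast_geographic(first_level, second_level=None, qualifier=None):
    """Build an indirect-order FAST geographic heading with its GAC."""
    if first_level not in GAC:
        raise ValueError(f"{first_level!r} is not a first-level GAC name")
    heading = first_level
    if second_level:
        heading += f" $z {second_level}"
        if qualifier:                      # e.g., a county or an entity type
            heading += f" ({qualifier})"
    return f"{heading} [{GAC[first_level]}]"

print(fast_geographic("Ohio", "Columbus"))                 # Ohio $z Columbus [n-us-oh]
print(fast_geographic("Solar system", "Hale-Bopp comet"))  # Solar system $z Hale-Bopp comet [zs]
print(fast_geographic("Michigan", "Beaver Island", "Charlevoix County"))
```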
When different types of geographic entities use the same name, the name is qualified to reflect the type of entity. For example, Otsego Lake is both a town and a lake in Michigan; to distinguish between the town and the lake, a qualifier would be added to the heading for the lake, leaving the populated place unqualified:

Michigan $z Otsego Lake [n-us-mi]
Michigan $z Otsego Lake (Lake) [n-us-mi]

In some cases, an LCSH geographic heading for a city section contains more information than can be expressed in two FAST levels. In FAST, headings of this type will be expressed as three levels. For example, headings of the type Hollywood (Los Angeles, Calif.) and German Village (Columbus, Ohio) would be expressed in FAST as:

California $z Los Angeles $z Hollywood [n-us-ca]
Ohio $z Columbus $z German Village [n-us-oh]

Form Facet

The form facet includes all form subdivisions. The form headings were established by extracting all form subdivisions from LCSH topical and geographic headings. However, because many form subdivisions are currently still coded as $x instead of subfield $v in LCSH headings, they were algorithmically identified and re-coded as $v prior to their extraction. O'Neill et al. [Forthcoming] provide the details of the algorithm used to identify the form subdivisions for re-coding. Some examples of FAST form subdivisions are:

$v Translations into French
$v Rules
$v Dictionaries $x Swedish
$v Controversial literature
$v Early works to 1800
$v Statistics
$v Databases
$v Bibliography
$v Graded lists
$v Slides
$v Directories
$v Juvenile literature
$v Scores

As with the topical and geographic facets, all form headings will be established in the authority file.

Chronological Facet

The period facet follows the practice recommended by the SAC/ALCTS Subcommittee, and a continuation of the recommendations discussed at the Airlie Conference, specifically, that chronological headings reflect the actual time period of coverage of the resource. In FAST, all period headings will be expressed as either a single numeric date or as a date range. In cases where the date is expressed in LCSH as a century (e.g., 20th century), in FAST the date is expressed as a range of dates, 1900-1999. Similarly, periods related to prehistoric eras would be expressed as dates; the Jurassic would be expressed as 190000000-140000000 B.C. The only exception to this practice is for period headings that are represented in the authority file as established topical headings; these will be treated as topical headings, and not as periods (e.g., Twentieth century when found used as a main heading). Since the only general restriction on periods is that, when a date range is used, the second date must be greater than the first, there is no need to routinely create authority records for period headings. For example, no period authority record would be created for the period facet $y To 1500.

Complexities remain in the treatment of period facets in headings of the type [Geographic] $x History $y [topical descriptor, date range]. Examples of these types of headings include Argentina $x History $y Peronist Revolt, 1956, and Maine $x History $y King William's War, 1689-1697. In these examples, the chronological subdivision contains more information than can be expressed in a date or date range (e.g., King William's War). As research on faceting headings of this type continues, the objective of the FAST project remains to develop a subject-heading schema based on LCSH suitable for metadata that is easy to use, understand, and maintain.
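The normalization just described, converting centuries to explicit date ranges and validating that the second date of a range exceeds the first, can be sketched in a few lines. This is a simplified illustration; real FAST processing would handle many more patterns, including B.C. dates and the topical exceptions noted above.

```python
# A hedged sketch of the century-to-date-range normalization described above
# (e.g., '20th century' -> '1900-1999'); real FAST processing handles many
# more patterns. Ranges are validated per the rule that the second date must
# be greater than the first.

import re

def fast_period(lcsh_chronological):
    """Normalize a simple LCSH $y subdivision into a FAST period facet."""
    m = re.fullmatch(r"(\d+)(?:st|nd|rd|th) century", lcsh_chronological)
    if m:
        start = (int(m.group(1)) - 1) * 100
        return f"{start}-{start + 99}"
    m = re.fullmatch(r"(\d+)-(\d+)", lcsh_chronological)
    if m and int(m.group(2)) <= int(m.group(1)):
        raise ValueError("second date must be greater than the first")
    return lcsh_chronological           # single dates, 'To 1500', etc. pass through

print(fast_period("20th century"))   # 1900-1999
print(fast_period("1689-1697"))      # 1689-1697 (valid range kept as-is)
```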
Names Facet

The facet for personal and corporate names is the area of most recent research. Similar to the topical main facet, FAST headings for personal and corporate names are very similar, and in most cases identical, to the established name heading in the LC authority file. Unlike the approach taken for the topical, geographic, and chronological facets, however, more restrictions were implemented when selecting headings from bibliographic records for inclusion in the FAST scheme. In part, this decision was made simply because of the difference in the number of name authority records versus the number of subject authority records: currently, there are over 5.4 million name authority records, in contrast to the approximately 270,000 subject authority records. The restrictions are:

• Name headings found in bibliographic records must be represented in the LC names file, AND
• The name heading must be used at least one time as a subject heading.

Multi-faceted Phrase Headings

There are a small number of Library of Congress Subject Headings that contain multiple facets presented in a phrase-like structure, all bounded within a single $a. Examples of these types of headings are:

• Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986
Extracted from MARC21 Bibliographic tag FAST Facet Expressed as Dublin Core Qualifier 650, second indicator 0, $a Topical Subject 6xx, second indicator 0, $x Topical Subject 6xx, second indicator 0, $y Topical Subject 6xx, second indicator 0, $y Chronological Period 6xx, second indicator 0, $v Form Type 651, second indicator 0, $a Geographic Coverage.spatial 6xx, second indicator 0, $z Geographic Coverage.spatial 600, second indicator 0, $abcdq Personal name Creator/namePersonal or Contributor/namePersonal 610, second indicator 0, $abndc Corporate name Creator/nameCorporate or Contributor/namePersonal For example, the LCSH heading: 650 0 Authority files (Information retrieval) $z Italy $z Florence $v Congresses would be faceted into the following three FAST headings: ♣ Topical: Authority files (Information retrieval) ♣ Geographic: Italy $z Florence ♣ Form: Congresses And re-expressed in Dublin Core as: ♣ Subject: Authority files (Information retrieval) ♣ Coverage.spatial Italy · Florence ♣ Type: Congresses Page 9 of 15 Similarly, the LCSH heading: 651 0 United States $x Civilization $x Italian influences $x History $y 20th century $v Sources would be faceted into the following four FAST headings: ♣ Geographic: United States ♣ Topical: Civilization $x Italian influences $x History ♣ Period: 1900-1999 ♣ Form: Sources And re-expressed in Dublin Core as: ♣ Coverage.spatial United States ♣ Subject: Civilization · Italian influences · History ♣ Period: 1900-1999 ♣ Type: Sources However, to express the same data in MARC21 format presented problems, as neither the MARC21 bibliographic or authority formats had defined tags to support the entry of chronological data as a main ($a) subfield. As a result, the team met with staff at the Library of Congress, and later wrote a MARBI proposal to expand the MARC21 bibliographic and authority formats. In 2002, the proposal was accepted by MARBI committee, and allows complete mapping of FAST facets to MARC21 bibliographic tags: FAST Facet Expressed as Dublin Core Qualifier Expressed in MARC21 Bibliographic tag Topical Subject 650, second indicator 7, $a/$x, $2 fast Chronological Period 648, second indicator 7, $a, $2 fast Form Type 655, second indicator 7, $a, $2 fast Geographic Coverage.spatial 651, second indicator 7, $a/$z, $2 fast Personal name Creator/namePersonal or Contributor/namePersonal 600, second indicator 7, $abcdq, $2 fast Corporate name Creator/nameCorporate or Contributor/namePersonal 610, second indicator 7, $abndc, $2 fast In authority records, the MARC21 tags for the FAST facets are: FAST Facet Expressed in MARC21 Authority tag Topical 150 Chronological 148 Form 155 Geographic 151 Personal name 100 Corporate name 110 Authority records The FAST team selected the MARC 21 Authority Format is because the format is a well-proven, sophisticated protocol specifically designed to carry controlled vocabulary elements and support a synthetically-structured database. In FAST, the synthetically structured database was expanded to include the retention of obsolete authority records to ensure compatibility within a linked structure. To minimize the number of broken links, once a heading has been established and an authority created, that heading and its authority record Page 10 of 15 will be permanently retained in the FAST authority file with its 1XX field unchangeable. 
Authority Records

The FAST team selected the MARC21 Authority Format because the format is a well-proven, sophisticated protocol specifically designed to carry controlled vocabulary elements and support a synthetically-structured database. In FAST, the synthetically structured database was expanded to include the retention of obsolete authority records to ensure compatibility within a linked structure. To minimize the number of broken links, once a heading has been established and an authority record created, that heading and its authority record will be permanently retained in the FAST authority file, with its 1XX field unchangeable. FAST authority records whose 1XX field contains an obsolete heading will carry the value 'o' (Obsolete) in Leader/05 to indicate that the heading is not the preferred term. The difference between Leader/05 'o' and Leader/05 'd' is purely a physical one: Leader/05 'o' identifies authority records in which the heading is obsolete but the record physically remains in the file to support the linked structure of the database, while Leader/05 value 'd' indicates that the record should be physically deleted from the file.

A second area the FAST team identified as lacking in the MARC21 Authority Format was one to facilitate systematic maintenance as headings, and the relationships between headings, change. Below are the four basic types of heading changes that occur in LCSH and how each would be handled within FAST using current and newly defined MARC elements. All FAST records will be linked back to the LC authority record from which they were derived using 7xx linking fields. The final component of the MARBI proposal defined a new $w/1 subfield value for the 700-785 fields to support the automatic replacement of headings. Three codes were defined that can be used by systems to automatically update bibliographic records with the replacement heading(s) (a minimal sketch follows the list):

• a - Heading replacement does not require review: identifies headings that are always used to replace the obsolete heading.
• b - Heading replacement requires review: identifies headings that may be used as replacements, but require subject analysis to determine their appropriateness.
• n - Not applicable: the heading is not being replaced; if code n is applicable, $w/1 need not be used.
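These three codes are what make automated maintenance possible. As a sketch (not production OCLC code) of how a system might act on them, the function below takes the 7xx linking fields of an obsolete record and separates the headings that can replace it automatically from those needing review.

```python
# A sketch (not production OCLC code) of how the three $w/1 codes defined in
# the MARBI proposal could drive automated maintenance: given the 7xx linking
# fields of an obsolete FAST authority record, report which replacement
# headings may be applied to bibliographic records without review.

def replacement_plan(linking_fields):
    """linking_fields: [(heading, w1_code), ...] from an obsolete record's 7xx fields."""
    automatic, needs_review = [], []
    for heading, code in linking_fields:
        if code == "a":                # always replaces the obsolete heading
            automatic.append(heading)
        elif code == "b":              # candidate; subject analysis required
            needs_review.append(heading)
        # code 'n': the heading is not a replacement, so it is ignored here

    return automatic, needs_review

# The 'and/or' example below: Alms and almsgiving -> Charity and/or Charities.
auto, review = replacement_plan([("Charity", "b"), ("Charities", "b")])
print("replace automatically:", auto)      # []
print("review before replacing:", review)  # ['Charity', 'Charities']
```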
In this instance, one or the other, and maybe both, of the identified headings would be the appropriate replacement for the obsolete heading. ♣ The incoming authority record distributed by the Library of Congress containing the 150 heading Alms and almsgiving would contain the value ‘d’ (Deleted) in the Leader/05 position. Two new authority records, with the value ‘n’ in the Leader/05 position would also be distributed for the headings Charity and Charities, respectively. LC Authority record Leader /05 ‘d’ 001 [OCLC assigned number] 010 [LC control number] 040 DLC $c DLC $d DLC 150 Alms and almsgiving LC Authority record Leader /05 ‘n’ 001 2137277 010 sh 85022672 040 DLC $c DLC $d DLC 150 Charity 450 Alms and almsgiving LC Authority record Leader /05 ‘n’ 001 2137212 010 sh 85022665 040 DLC $c DLC $d DLC 150 Charities Page 12 of 15 450 Alms and almsgiving ♣ Using value ‘o’ in the Leader/05 position of the authority record containing Alms and almsgiving and value ‘n’ (New) in the Leader/05 position of the two new authority records created for Charity and Charities. The presence of the same text appearing in the 450 field in multiple records would generate $w b in the 750 FAST linking fields of the obsolete record, indicating that the one or both headings may be used as a replacement. FAST Authority record Leader /05 ‘o’ 001 [OCLC assigned number] 005 [OCLC assigned date/time stamp] 040 OCoLC $b eng $c OCoLC $f fast 150 Alms and almsgiving 750 0 Alms and almsgiving $0(DLC) sh 85136516 750 7 Charity $7(fast)[OCLC assigned number] $w b 750 7 Charities $7(fast)[OCLC assigned number] $w b FAST Authority record Leader /05 ‘n’ 001 [OCLC assigned number] 005 [OCLC assigned date/time stamp] 040 OCoLC $b eng $c OCoLC $f fast 150 Charity 450 Alms and almsgiving 750 0 Charity $0(DLC) sh 85022672 FAST Authority record Leader /05 ‘n’ 001 [OCLC assigned number] 005 [OCLC assigned date/time stamp] 040 OCoLC $b eng $c OCoLC $f fast 150 Charities 450 Alms and almsgiving 750 0 Charities $0(DLC) sh 85022665 3. ‘Or’ changes, for example, the heading Hotels, taverns, etc. is replaced by two or more different headings—in this case, the replacement headings are Bars (Drinking establishments), and/or Hotels, and/or Taverns (Inns). ♣ Similar with the and/or changes, the incoming authority record distributed by the Library of Congress containing the 150 heading Hotels, taverns, etc would contain the value ‘d’ in the Leader/05 position. Three new authority records, with the value ‘n’ in the Leader/05 position would also be distributed for the headings Bars (Drinking establishments), and/or Hotels, and/or Taverns (Inns), respectively. ♣ Using value ‘o’ in the Leader/05 position of the authority record containing Hotels, taverns, etc. and value ‘n’ in the Leader/05 position of the three new authority records created for Bars (Drinking establishments), Hotels, and Taverns (Inns) respectively. The presence of the same text appearing in the 450 field in multiple records would generate $w b in the 750 FAST linking field of the obsolete record, indicating that the heading Page 13 of 15 may be used as a replacement, but requires subject analysis to determine its appropriateness. 4. ‘And’ changes such as occur with the faceting of a particular type of FAST heading that occurs when a single LCSH heading contains multiple facets within a single subfield (e.g., $a). LC Authority record 001 2488003 010 sh 89000691 040 DLC $c DLC $d DLC 150 Geo. A. 
150 Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986

• FAST would use value 'o' in the Leader/05 of the authority record for Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986. The 7xx FAST linking fields in the record would remain, with $w a added to indicate that the heading for Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986 is replaced by multiple FAST headings.

FAST authority record (Leader/05 'o')
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
150 Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986
710 7 Geo. A. Hormel & Company $7(fast) [OCLC assigned number] $w a
750 7 Strikes and lockouts $7(fast) [OCLC assigned number] $w a
751 7 Minnesota $z Austin $7(fast) [OCLC assigned number] $w a
748 7 1985-1986 $7(fast) [OCLC assigned number] $w a
750 0 Geo. A. Hormel & Company Strike, Austin, Minn., 1985-1986 $0(DLC) sh 89000691 $w n1

• Value 'n' would appear in the Leader/05 position of the FAST authority records for the replacement headings.

FAST authority record (Leader/05 'n')
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
110 Geo. A. Hormel & Company
710 0 Geo. A. Hormel & Company $0(DLC) n 84082628

FAST authority record (Leader/05 'n')
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
150 Strikes and lockouts
750 0 Strikes and lockouts $0(DLC) sh 85128731

FAST authority record (Leader/05 'n')
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
043 n-us-mn
151 Minnesota $z Austin
751 0 Austin (Minn.) $0(DLC) n 79105963

Other decisions regarding what information from the Library of Congress should be part of FAST authority records are still under review. Most 4xx fields will be retained, some 5xx fields will be retained, and some select 6xx note fields. In general, 4xx and 5xx fields are retained if the heading does not cross facets.

Example 1:

LC authority record
001 4478097
010 sh 97006510
040 DLC $c DLC $d DLC
005 20010306142236.0
151 Maya Forest
451 Selva Maya
550 Rain forests $z Belize $w g
550 Rain forests $z Guatemala $w g
550 Rain forests $z Mexico $w g

FAST authority record
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
043 n
151 North America $z Maya Forest
451 Selva Maya
751 0 Maya Forest $0(DLC) sh 97006510

Example 2: Topical

LC authority record
001 2000367
010 sh 85000004
040 DLC $c DLC $d DLC
005 19960530131610.0
150 20th Century Limited (Express train)
450 Twentieth Century Limited (Express train)
550 Express trains $z United States $w g
670 Work cat.: Rose, A. 20th Century Limited, 1984.

FAST authority record
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
150 20th Century Limited (Express train)
450 Twentieth Century Limited (Express train)
750 0 20th Century Limited (Express train) $0(DLC) sh 85000004

Example 3: Form

LC authority record
010 sh 99001298
040 DLC $b eng $c DLC $d DLC
005 20010202130538.0
073 H 1095 $z lcsh
185 $v Bibliography of bibliographies
480 $x Bibliography $v Bibliography $w nne
585 $v Bibliography $w g
680 $i Use as a form subdivision under subjects for works consisting of lists of bibliographies on those subjects.
681 $i Reference under the heading $a Bibliography of bibliographies

FAST authority record
001 [OCLC assigned number]
005 [OCLC assigned date/time stamp]
040 OCoLC $b eng $c OCoLC $f fast
155 Bibliography of bibliographies
555 Bibliography
785 0 $v Bibliography of bibliographies $0(DLC) sh 99001298

Conclusions

Although much work remains before the FAST authority files are complete and ready for use, the project has demonstrated that it is viable to derive a new subject schema based on the terminology of the Library of Congress Subject Headings but with simpler syntax and application rules. Upon completion, the FAST authority records will be extensively tested and evaluated. After the evaluation, we will know whether we have achieved our goal of creating a new subject schema for metadata that retains the rich vocabulary of LCSH while being easy to maintain, apply, and use.

References

Chan, Lois Mai, Eric Childress, Rebecca Dean, Edward T. O'Neill, and Diane Vizine-Goetz. 2001. A Faceted Approach to Subject Data in the Dublin Core Metadata Record. Journal of Internet Cataloging 4, No. 1/2: 35-47.

The Future of Subdivisions in the Library of Congress Subject Headings System: Report from the Subject Subdivisions Conference May 9-12, 1991, edited by Martha O'Hara. 1992. Washington, D.C.: Library of Congress, Cataloging Distribution Service.

O'Neill, Edward T., Lois Mai Chan, Eric Childress, Rebecca Dean, Lynn El-Hoshy, Kerre Kammerer, and Diane Vizine-Goetz. [Forthcoming] Form Subdivisions: Their Identification and Use in LCSH. Library Resources & Technical Services 45, No. 4: 187-197.

Subject Data in the Metadata Record Recommendations and Rationale: A Report from the ALCTS/SAC/Subcommittee on Metadata and Subject Analysis. 1999. http://www.govst.edu/users/gddcasey/sac/MetadataReport.html Accessed 06/26/01.

work_56gd6dj7xzdtnclidpoplgxlme ----

Juried Paper Proposal

Edward T. O'Neill, Ph.D., OCLC Online Computer Library Center, Inc.
Lynn Silipigni Connaway, Ph.D., OCLC Online Computer Library Center, Inc.
Timothy J. Dickey, Ph.D., OCLC Online Computer Library Center, Inc.

Estimating the Audience Level for Library Resources

Note: This is a pre-print version of a paper published in Journal of the American Society for Information Science and Technology. Please cite the published version; a suggested citation appears below.

Abstract

WorldCat, OCLC's bibliographic database, identifies books and the libraries that hold them. The holdings provide detailed information about the type and number of libraries that have acquired the material. Using this information, it is possible to infer the type of audience for which the material is intended. A quantitative measure, the audience level, is derived from the types of libraries that have selected the resource. The audience level can be used to refine discovery, analyze collections, advise readers, and enhance reference services.

© 2009 OCLC Online Computer Library Center, Inc. 6565 Kilgour Place, Dublin, Ohio 43017-3395 USA http://www.oclc.org/ Reproduction of substantial portions of this publication must contain the OCLC copyright notice.

Suggested citation: O'Neill, Edward T., Lynn Silipigni Connaway, and Timothy J. Dickey. 2008. "Estimating the Audience Level for Library Resources." Journal of the American Society for Information Science & Technology 59,13: 2042-2050.
Pre-print available online at: http://www.oclc.org/research/publications/archive/2008/oneill-jasist.pdf

Introduction and Statement of the Problem

Current financial restrictions make it critical for librarians to use empirical data to assess and manage collections. Librarians assess collections to determine subject areas for acquisition, deaccession, digitization, preservation, and remote storage. They also must determine if the sources are relevant to their primary users' needs and expectations. One collection assessment method is to examine usage statistics, such as circulation and interlibrary loan data. Librarians employ usage data as one indicator of the materials' relevance. Determining if the materials' content and presentation match the needs of the library's primary user groups is another form of collection assessment. The enormous number of sources retrieved in the online environment can make it difficult to determine what content is appropriate for the intended audience's need. The audience level for a book theoretically represents the type of reader for which the resource is most appropriate, and thus can improve collection assessment and the development of a ranking system for discovery. Estimating the audience level also can enhance information retrieval by increasing the relevance of items retrieved.

Determining the audience level is difficult because there is no standard requiring the inclusion of this information in the bibliographic record, other than the Target Audience element in the MARC record and the Library of Congress Subject Headings (LCSH) form subdivisions, which in terms of the audience level are both used primarily to identify juvenile material. The researchers hypothesized that the audience level could be estimated from the types of libraries – research, academic, public, and school – that have acquired the resource. WorldCat, OCLC's bibliographic database, serves not only as an aggregator of bibliographic data, but also includes detailed holdings information that can support such an analysis. In July 2007, the WorldCat database contained more than 81 million records and identified more than a billion holding locations for library resources. WorldCat includes a holding symbol for every member library holding an item represented in the WorldCat database. Each holding represents a discrete selection decision implying that the material is relevant to the library's patrons and is consistent with the library's collection development strategy. Thus, the totality of these individual decisions can serve as an indicator of audience level.

Literature Review

The literature on management and assessment of library collections is vast, but only recently has expanded to assess and describe collections by the characteristics of the libraries owning the collections. As early as 1979, Bonk and Magrill (pp. 305-313) attempted to collect an authoritative bibliography of the various methods for collection analysis. The principal methodologies at that time were either checklist-based, or based upon quantitative measures such as total volumes and total expenditures.
Magrill's later, more exhaustive literature review of collection analysis methodologies revealed "variations on the traditional checking of standard bibliographies" (Magrill, 1985, 279). One of the most extensive bibliographies of collection assessment tools was that of Strohl (1999), who expanded the list of methods to include checklists, circulation data, citation analysis, the RLG-OCLC subject Conspectus, document delivery and ILL data, faculty recommendations, and user-centered evaluation. Phillips and Williams (2003) were able to add little to the literature in terms of assessment methods, though they documented that the number of studies had increased exponentially. They mentioned only one study which used WorldCat holdings data as an assessment tool (Senkevitch and Sweetland 1996, see below).

At the same time, researchers have suggested that the WorldCat database represents an "aggregate collection" (Lavoie, Connaway, & O'Neill April 2007, 107), which is appropriate for bibliometric study. Lavoie, Connaway, and O'Neill mined data from WorldCat to "map the landscape" of digital resources cataloged in WorldCat and held by member libraries, discovering more than one million digital resources within the database, and describing characteristics of this aggregate digital collection that support library decision-making. Bernstein (2006) studied a random sample of bibliographic records from WorldCat, in a "demographic study" to determine the characteristics of the aggregate monographic collection in the database (see also Schonfeld & Lavoie, 2006). These studies acknowledged that WorldCat does not "represent the totality of world library holdings" (Bernstein, 80), though as an aggregate collection, analysis of its contents "affords a high-level perspective on historical patterns, suggests future trends, and supplies useful intelligence with which to inform decision making" (Lavoie, Connaway, & O'Neill April 2007, 107).

Several researchers have specifically used WorldCat's holdings to evaluate various types of library materials in the aggregate collection. The findings of Perrault (2002), for instance, reinforce the applicability of the WorldCat database as an aggregate collection for research. Specifically, she reported that the presence and accuracy of monographic holdings in WorldCat was mirrored by a profile of research libraries' collections. In two earlier studies, Perrault (1995; 1999) used the OCLC AMIGOS product as a source of data on general library collection patterns in the United States. Carpenter and Getz (1995), Ciliberti (1994), Velluci (in Gottlieb 1994; 1993), Gyeszky, Allen and Smith (1992), Harrell (1992), Joy (1992), Schwartz (1994), and Webster (1995) also used the OCLC AMIGOS Collection Analysis product for their research. However, many of these studies were evaluating the effectiveness of the AMIGOS tool itself rather than the collections. Connaway, O'Neill, and Prabha (2006) used WorldCat holdings specifically as the point of analysis, to identify a body of "last copies" and to provide data for deaccession, digitization, and preservation decisions. Other researchers used WorldCat holdings to assess collections.
Serebnick (1992) identified and described small publishers' books owned by libraries and cataloged in WorldCat, while Serebnick and Cullars (1984) and Shaw (1991) assessed adult fiction collections included in WorldCat. The language and literature collections in WorldCat were identified and assessed by Sweetland and Christiansen (1997), while Wiberley (2002; 2004) focused on humanities and social science collections.

Researchers have attempted to exploit WorldCat's library holdings data as a generalized tool for library collection analysis. Wallace and Boyce (1989) offered an early example of bibliometric analysis of WorldCat holdings as a measure of journal "value." The authors determined that they could not support a solitary correlation between how widely a journal is held and the journal's score on other evaluative criteria, such as citation analysis, ISI impact factors, and circulation statistics. Senkevitch and Sweetland (1996) adopted a similar approach to using WorldCat holdings as a verification tool for titles in an adult fiction collection. Their results exposed some discrepancies between a "standard" list and public library holdings of these titles in WorldCat. Budd (1991) used WorldCat holdings as a tool to evaluate library collections in comparison to a standard recommended core list of books, the Books for College Libraries (Association of College and Research Libraries, 1988). He was tentatively able to support this checklist based upon WorldCat holdings. Calhoun (2000) developed a general model for collection development which blends WorldCat holdings with two major sources of book reviews and the associated value of monograph publishers.

Several studies have used WorldCat holdings to measure the audience level of individual titles. White (1995), and later Lesniaski (2004), used WorldCat holdings as part of a collection analysis tool for individual titles (see also Twiss, 2001). Both worked from the premise that the sheer number or paucity of libraries holding a title alone reflects its "difficulty effect" (White, 1995, 10); therefore, the most specialized research titles should have minimal worldwide library holdings, and the most basic and generalist titles, the maximum number of library holdings. The approach also assumed (with more empirical justification) that "libraries holding half or more of the items at a higher level certainly will hold half or more of the items at a lower level" (Lesniaski, 2004, 13). Bernstein (2006) also used WorldCat holdings to predict the level of an item, in his terms nonexistent, unique, scarce, or non-scarce. However, his analysis was based solely upon the number of libraries holding an item, with the presumption that the more broadly an item is held, the more general its appeal, and vice versa.

Algorithm

The current approach utilizes the knowledge acquired from the earlier studies and extends the difficulty effect described by White (1995) and further tested by Lesniaski (2004) and Bernstein (2006) by considering the types of libraries holding the resource.
By assigning a weight to each type of library that owns the title in WorldCat, an audience level can be calculated for each title based on aggregate library holdings in WorldCat. This approach was originally reported by O'Neill (2003) and later described in more detail by Connaway, O'Neill, Prabha, and Snyder (2004). An algorithm was developed to estimate the audience level for each WorldCat resource.

The audience level is determined in two steps. First, a weighted holdings value is derived, either using the target audience in the 008 field from the bibliographic record, or based on the types of libraries holding the resource. This weighted holdings value is a numeric value between zero and one. In the second step, the weighted holdings value is converted to a percentile to form the audience level.

If one of the following codes for the target audience had been assigned in the bibliographic record, the weighted holdings value for the resource was derived directly from the target audience code using the associated weighted holdings values:

0.00  a (Preschool)
0.10  b (Primary)
0.15  c (Pre-adolescent)
0.25  d (Adolescent)
0.15  j (Juvenile)

If none of the above codes for target audience had been assigned, the weighted holdings value was calculated using a weighted sum based upon the types of libraries that hold the resource. The following weighting is used:

0.00  School libraries
0.33  Public libraries
0.67  Academic libraries
1.00  Research libraries

Research libraries are defined as those libraries that are members of the Association of Research Libraries (ARL), and academic libraries as those at non-ARL academic institutions. Using the library holdings data attached to each item, the weighted holdings value is calculated for each resource in WorldCat. Only the four above types of libraries are considered when calculating the audience level. There are, of course, many other types of libraries among the OCLC members. However, some of the other library types, such as special libraries, government libraries, and library networks, have very heterogeneous collections, making it difficult to place them on the school-to-research-library spectrum. Fortunately, these four types account for 93% of all WorldCat holdings, so excluding the other types of libraries does not have a major impact. The most significant impact of their exclusion is that there are a few resources for which the weighted holdings value cannot be calculated. If a particular resource is held only by a special library, it will not have any usable holdings information; therefore, no weighted holdings value can be calculated.
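To make this first step concrete, the following is a minimal Python sketch (illustrative only; the function and variable names are not from the paper). It encodes the target audience codes and library-type weights listed above and averages the weights of the usable holding libraries:

# A minimal sketch of step one; names and data structures are assumptions.

TARGET_AUDIENCE_WEIGHTS = {  # MARC 008 target audience codes
    "a": 0.00,  # Preschool
    "b": 0.10,  # Primary
    "c": 0.15,  # Pre-adolescent
    "d": 0.25,  # Adolescent
    "j": 0.15,  # Juvenile
}

LIBRARY_TYPE_WEIGHTS = {
    "school": 0.00,
    "public": 0.33,
    "academic": 0.67,
    "research": 1.00,
}

def weighted_holdings_value(target_audience_code, holding_library_types):
    """Return the weighted holdings value (0.0-1.0) for one record,
    or None if no usable information is available."""
    # A target audience code, when present, takes precedence.
    if target_audience_code in TARGET_AUDIENCE_WEIGHTS:
        return TARGET_AUDIENCE_WEIGHTS[target_audience_code]
    # Otherwise average the weights of the four usable library types;
    # 'other' types (special, government, networks) are ignored.
    weights = [LIBRARY_TYPE_WEIGHTS[t] for t in holding_library_types
               if t in LIBRARY_TYPE_WEIGHTS]
    if not weights:
        return None  # e.g., held only by a special library
    return sum(weights) / len(weights)

Applied to the example in Table 1 below, the ten usable holdings (three research, six academic, one public) give (3(1.00) + 6(0.67) + 1(0.33))/10 = 0.735.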
As an example, Build Community: the Leader's Guide to Building Community (OCLC #65514085) is held by 12 OCLC member libraries, as shown in Table 1.

Table 1. Computing Audience Level for Build Community: the Leader's Guide to Building Community

Library Symbol | Library Name                   | Library Type | Weight
OUN            | Ohio University                | Research     | 1.00
KSU            | Kent State University          | Research     | 1.00
CIN            | University of Cincinnati       | Research     | 1.00
BGU            | Bowling Green State University | Academic     | 0.67
TOL            | University of Toledo           | Academic     | 0.67
MIA            | Miami University               | Academic     | 0.67
HIR            | Hiram College                  | Academic     | 0.67
YNG            | Youngstown State University    | Academic     | 0.67
OHI            | State Library of Ohio          | Other        | x
OCO            | Columbus Metropolitan Library  | Public       | 0.33
BGF            | Firelands College              | Academic     | 0.67
OSD            | SEO Automation Consortium      | Other        | x

Four different types of libraries (research, academic, public, and 'other') hold this book. However, the 'other' category is not included, so these two libraries are ignored in computing the weighted holdings value. The weighted holdings value for the book is then:

weighted holdings value = (sum of the weights) / (number of libraries) = 7.35 / 10 = 0.735

Although the weighted holdings value is a valid measure of the audience level, its meaning can be difficult to interpret. The distribution of the weighted holdings values for all of the resources in WorldCat is shown in Figure 1.

[Figure 1. Distribution of Weighted Holding Values: percent of WorldCat records (0%-30%) plotted against weighted audience level (0.0-1.0).]

As can be seen, the weighted holdings values are not uniformly distributed and cluster at several points. The clustering observed at the lower values is primarily the result of using the target audience to derive the weighted holdings. The 'j' (juvenile) code is commonly assigned, creating a large cluster at 0.15. Similar, although much smaller, clusters are created by the other target audience codes. Approximately half of the resources in WorldCat are held by only a single library. These uniquely held resources generate large clusters at 1.00 (resources held by a single research library), 0.67 (academic), 0.33 (public), and 0.00 (school). Smaller clusters also result from resources held by a small number of libraries.

To simplify the measure and make it easier to interpret, the weighted holdings value is converted to a percentile to create the audience level. For the above example, the weighted holdings value of 0.735 is converted to a percentile to form the audience level of 0.66. This audience level value indicates that 34% of the books in WorldCat have a higher audience level while 66% have a lower value.

The audience level is a property of the work rather than a property of a particular edition or manifestation, and is computed for the work as a whole.[1] In the above Build Community example, there is only a single manifestation of the work, so this distinction was not relevant. The distinction is significant for works with multiple manifestations. The necessity of making this distinction was first observed for Mother Goose, the famous children's story.
There are a large number of different manifestations of Mother Goose, and some of the editions are rare and have very limited holdings while others are very widely held. Initially, when the audience level was derived at the manifestation level, it was observed that there was little consistency across editions; some editions had audience levels of 1.0, some had 0.0, and everything in between. The rare editions are typically held by research libraries. Those editions held only by research libraries would receive a weighted holdings value of 1.0. It is these rare, or at least rarely held, editions that created the wide variation in audience level values for Mother Goose. Since the audience level is a work property, the solution is to derive the audience level for the work as a whole and to use that value for all manifestations of the work. For computational efficiency, the weighted holdings value is computed for each record (manifestation) in WorldCat and then combined to create the weighted holdings value for the work.

[1] For the detailed definitions of work and manifestation as defined for the Functional Requirements for Bibliographic Records (FRBR), see IFLA (1998).

To identify all the manifestations of a work, the audience level algorithm relies on the workset algorithm developed by Hickey and Toves (2005). Their algorithm has been used to "FRBRize" WorldCat. A second example illustrates the procedure for deriving the audience level for Courtroom Criminal Evidence. This work has 4 manifestations, as shown in Table 2.

Table 2. Audience Level Computation for Courtroom Criminal Evidence

OCLC No.  | Total Holdings | Usable Holdings | Manifestation Audience Level
15504400  | 139            | 114             | 0.783825
29613712  | 161            | 117             | 0.769453
40393191  | 193            | 136             | 0.789426
62762763  | 174            | 124             | 0.758274

As with other works with multiple manifestations, the first step is to compute the weighted holdings values for each manifestation using the same methodology as in the previous example. The weighted holdings value is then calculated for the work as a whole by taking a weighted average of those for the individual manifestations. In this example, the weighted holdings value for the work is 0.775. The final step is the conversion of the weighted holdings value to a percentile to create the audience level for the work, in this case a value of 0.76 (24% of the works in WorldCat have a higher weighted holdings value). By deriving the audience level value at the work level, the variability associated with rarely held and other atypical manifestations is minimized, and the resulting audience level is more reflective of the content of the work as a whole.
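The second step, work-level aggregation followed by percentile conversion, can be sketched the same way. The paper does not spell out the weighting of the average; the sketch below assumes each manifestation's value is weighted by its usable holdings, an assumption that reproduces the 0.775 figure for Courtroom Criminal Evidence:

# A minimal sketch of step two; the holdings-weighted average is an
# assumption, and the workset grouping (Hickey & Toves 2005) is taken
# as given.
from bisect import bisect_left

def work_weighted_holdings(manifestations):
    """Weighted average of manifestation values, weighted by the number
    of usable holdings behind each value."""
    total = sum(n for n, _ in manifestations)
    return sum(n * v for n, v in manifestations) / total

def audience_level(work_value, all_work_values_sorted):
    """Convert a weighted holdings value to a percentile (0.0-1.0)
    against the sorted values for every work in the database."""
    rank = bisect_left(all_work_values_sorted, work_value)
    return rank / len(all_work_values_sorted)

# Courtroom Criminal Evidence: (usable holdings, manifestation value)
manifestations = [(114, 0.783825), (117, 0.769453),
                  (136, 0.789426), (124, 0.758274)]
print(round(work_weighted_holdings(manifestations), 3))  # -> 0.775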
Audience levels have been computed for each of the nearly 100 million individual resources in WorldCat and are beginning to be used in various OCLC applications. Access to the audience level values for WorldCat resources is publicly available through the Audience Level Prototype.[2] The audience level was successfully used to identify the scholarly books in WorldCat for Microsoft Live Search. Audience levels are also used in OCLC's FictionFinder prototype, which provides access to 2.8 million works of fiction found in the WorldCat database.[3] The measure is also being used to enhance retrieval with the DeweyBrowser, and as an evaluation tool for the aggregate of works by and about an individual in WorldCat Identities.[4] It appears to be a valuable tool for analyzing and evaluating library collections, and its potential in this area is being evaluated as part of the OhioLINK Collection Analysis Project.[5]

Evaluation of Algorithm and Findings

The testing of the calculations for various titles indicates the audience level is an accurate and appropriate measure. However, two test methodologies were developed and conducted to systematically evaluate the calculations.

The first test began with the generation of a random sample of 126 monographic titles held by an ARL library that was accessible to the research team. The team visited the library to examine some of the resources, which allowed the researchers to determine if the audience levels were a meaningful measure of the target audience. The team examined the covers, title pages, tables of contents, indexes, text, and images to assess the calculated audience level for each title in the sample. Digital images were captured for future reference and discussion. Although this evaluation was very subjective, it was encouraging to find that the audience levels appeared to be appropriate.

The second test compared the rankings of the audience level against ranking decisions made by human subjects. A sample of 30 books was ranked by each of 21 test participants, and a set of test rankings was created for each of the books. The test collection consisted of a stratified sample of 30 books from WorldCat. The books were all in the field of zoology, published in the year 2004, and representative of the entire spectrum of audience level rankings. Zoology was chosen because it is a field with a wide variety of books ranging from children's books to highly specialized scholarly material. It also was believed that limiting the books to a single subject would facilitate comparison. The 2004 publication date was chosen since the acquisition and cataloging processes for 2004 books should be nearly complete but the books still would be current. All WorldCat records meeting the first two conditions were stratified by audience level (i.e., by their ranking within the entire database), and three were randomly selected from each decile.[6]

[2] http://www.oclc.org/research/projects/audience/
[3] http://fictionfinder.oclc.org/; on this application, see also Pisanski & Žumer (2007).
[4] http://deweybrowser.oclc.org/ddcbrowser2/; http://orlabs.oclc.org/Identities/; on the DeweyBrowser, see also Vizine-Goetz & Mitchell (2006).
[5] http://platinum.ohiolink.edu/cbtf/oclcres.ppt
[6] The ranking numbers used for data analysis differ slightly from those used for the random sampling of the books. The sampling was done from Audience Level computations made on 18 January 2006; the following data analysis compares the test results to updated Audience Level figures, taking into account WorldCat holdings current to July 2007.
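The decile-stratified draw described above can be sketched as follows (the field names and the pre-filtering to zoology titles published in 2004 are assumptions for illustration):

# A minimal sketch of the stratified sampling; assumes each decile
# contains at least per_decile records.
import random

def stratified_sample(records, per_decile=3, seed=None):
    """records: list of (oclc_number, audience_level) pairs already
    filtered to zoology titles published in 2004."""
    rng = random.Random(seed)
    deciles = [[] for _ in range(10)]
    for oclc_number, level in records:
        i = min(int(level * 10), 9)  # a level of 1.0 falls in the top decile
        deciles[i].append(oclc_number)
    sample = []
    for bucket in deciles:
        sample.extend(rng.sample(bucket, per_decile))
    return sample  # 30 OCLC numbers, three per decile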
Table 3 identifies the books in the sample.

Table 3. Sample of Test Books

Audience Level Rank | Author | Title | OCLC Record # | Audience Level | Average Subject Ranking | Holdings in WorldCat
1  | Legg, G.       | Octopusses and squid                                          | 54857977 | 0.06 | 1.60  | 213
2  | Miller, H.     | Mosquito                                                      | 52757310 | 0.06 | 2.05  | 100
3  | Burnie, D.     | Bird [Eyewitness Books]                                       | 56189296 | 0.08 | 3.95  | 2914
4  | Hall, D.       | The ultimate guide to snakes and reptiles                     | 56749767 | 0.14 | 13.05 | 83
5  | Chittenden, R. | Birds of prey of the world                                    | 54718467 | 0.15 | 8.10  | 73
6  | Mancini, J.    | Guide to backyard birds                                       | 54415882 | 0.15 | 7.55  | 190
7  | Romashko, S.   | The complete collector's guide to shells and shelling         | 56960134 | 0.16 | 10.30 | 10
8  |                | Curious critters of the natural world: Reptiles & amphibians | 62674819 | 0.27 | 3.45  | 19
9  | Haas, S.       | Birds of Pennsylvania                                         | 60687811 | 0.31 | 11.50 | 17
10 | Palmer, T.     | Landscape with reptile: Rattlesnakes in an urban world        | 54046614 | 0.32 | 16.10 | 504
11 | Thompson, B.   | South Carolina bird-watching: A year-round guide              | 55700823 | 0.34 | 8.55  | 8
12 | Patterson, B.  | The lions of Tsavo: Exploring the legacy                      | 54472084 | 0.34 | 14.65 | 347
13 | Heinrich, B.   | Bumblebee economics                                           | 56128472 | 0.37 | 22.85 | 1214
14 | Humann, P.     | Reef fish identification: Baja to Panama                      | 56980668 | 0.42 | 14.40 | 74
15 | Hartman, W.    | A guide to the birds of Door County                           | 57358137 | 0.46 | 11.95 | 2
16 | Elzinga, R.    | Fundamentals of entomology                                    | 50510931 | 0.51 | 21.25 | 1345
17 | Gaston, A.     | Seabirds: A natural history                                   | 56349814 | 0.51 | 18.75 | 361
18 | Duff, A.       | Mammals of the world: A checklist                             | 56204329 | 0.57 | 19.40 | 355
19 | Podulka, S.    | Handbook of bird biology                                      | 57003728 | 0.60 | 21.55 | 303
20 | Porter, R.     | Birds of the Middle East                                      | 57148591 | 0.66 | 14.90 | 36
21 | Bradley, R.    | In Ohio's backyard: Spiders                                   | 57662538 | 0.66 | 10.60 | 20
22 | Powler, C.     | Dynamics of large mammal populations                          | 57894103 | 0.69 | 25.90 | 441
23 | Legros, G.     | Fabre, poet of science                                        | 60576417 | 0.72 | 18.65 | 63
24 | Fascione, N.   | People and predators: from conflict to coexistence            | 54694499 | 0.74 | 20.65 | 249
25 | National Research Council | Atlantic salmon in Maine                          | 56493371 | 0.78 | 23.20 | 199
26 | Borrow, N.     | Birds of western Africa                                       | 57733231 | 0.80 | 13.95 | 91
27 | Broughton, J.  | Prehistoric human impacts on California birds                 | 57203355 | 0.90 | 27.25 | 74
28 | Barr, T.       | A classification and checklist of the genus                   | 55500979 | 0.91 | 25.55 | 20
29 | Minter, L.     | Atlas and red data book of the frogs of South Africa          | 61303229 | 0.93 | 24.00 | 5
30 | Wallace and Dietz | Phylogeny and systematics of the treehopper subfamily      | 54460359 | 0.96 | 29.35 | 33

Twenty-one participants, who had no prior affiliation with the audience level project, agreed to test the results. All participants were volunteers from the staff at OCLC headquarters in Dublin, Ohio, representing a reasonably broad demographic spectrum within that population. Eleven females and ten males participated in the test, with job responsibilities ranging from internships to upper management. Eleven participants held a Master's degree in library or information science or its equivalent. The median educational level was two years' graduate study, and the median age range reported was 40-50.
Eight of the subjects reported professional cataloging experience, and four reported library reference experience. The tests were conducted in the OCLC Usability Lab between July 17 and July 28, 2006. Each of the 21 participants was seated at a table with the 30 books arranged in a pseudorandom order; the order was identical for each participant. The instructions given to the subjects attempted to minimize any bias in the results by not dictating what criteria the participants should use or consider in their ranking. The participants were given the following instructions: "Please reorder these books in increasing order of difficulty, starting with pre-school books and proceeding to advanced scholarly material. Please let us know when you are done. Thank you for participating in this study." The participants were given freedom to work however they desired in the space, and extra bookends were provided for their convenience.

All but one of the participants produced a unique ordering of the books by perceived audience level.[7] Each participant's ranking of the books was recorded. With the exclusion of the one questionable data set, the tests produced a total of 20 valid sets of rankings. None of the participants required the full ninety minutes allocated for their session.

Each individual's approach may have been the greatest factor in the individuality of the results. Many participants worked from an initial "rough sort" into three or four piles, or as many as nine; others began by quickly identifying books at both ends of the ranking spectrum and working inward. At least two participants took an extremely fast approach, scanning at most a few pages in the few books they opened; others took the time to read prefatory material and interior passages, sometimes even comparing passages from two or three works simultaneously. In post-test debriefings, most of the participants spoke of the simple presence or absence of features such as footnotes, bibliographies, charts and tables, or pictures. One participant claimed that the presence or absence of Latin genus/species names was a deciding factor in his choices.

Some individual books showed a greater variance in the test rankings than others. The two ends of the ranking spectrum were the most consistent, since the ends are "bound" by the nature of the test, with no participant able to rank books below 1 or above 30. Figure 2 compares the subjects' rankings with the computed audience level rankings.

[7] One participant mistook the directions midway through the test, and returned the books to the original order. The subject re-took the test after debriefing; the data, however, were excluded from any further analysis.

[Figure 2. Audience Level vs. Subjects' Rankings: subjects' rankings (y-axis, 0-30) plotted against audience level ranking (x-axis, 1-30).]

The bars represent the range of the observed subject rankings, and the diamond is the average of the subjects' rankings for the book. The dotted line is the ranking predicted by the audience level. As indicated by the length of the bars, there was wide variation in the subjects' assessment of the books' difficulty.
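For reference, the summary statistics plotted in Figure 2 (the bar, and the diamond marking the mean) can be computed from the 20 valid ranking sets along these lines (the names are illustrative, not from the paper):

# A small sketch of the per-book summary behind Figure 2.
def ranking_summary(rankings_by_subject):
    """rankings_by_subject: list of 20 lists, each a permutation of 1..30
    giving the rank each subject assigned to books 1..30."""
    n_books = len(rankings_by_subject[0])
    summary = []
    for book in range(n_books):
        ranks = [subject[book] for subject in rankings_by_subject]
        summary.append({
            "book": book + 1,                  # audience level rank 1..30
            "low": min(ranks),                 # bottom of the bar
            "high": max(ranks),                # top of the bar
            "mean": sum(ranks) / len(ranks),   # the diamond
        })
    return summary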
Three books, identified by the wide range of observed rankings, seemed to be particularly difficult. Each presented to the subjects a different kind of cognitive challenge:

• Fabre, Poet of Science (Rank 23) is a reprint edition of a naturalist's diary, which could be considered either as light reading or as a scholarly work. The subjects ranked this book from a low of 5th to a high of 29th. The audience level for the book was high since nearly all of the libraries that hold this work and its two earlier manifestations (1913 and 1921) are college and university libraries. For many individual subjects, placement of this work was one of the last decisions made.

• The Ultimate Guide to Snakes and Reptiles (Rank 4) includes a great amount of information, but is a picture book. Its low audience level reflects the fact that the majority of libraries holding this work are public libraries.

• A Guide to the Birds of Door County (Rank 15) is a practical bird-watching guide, with hand-drawn pictures, which adds to the difficulty of assessing its level.

However, except for The Ultimate Guide to Snakes and Reptiles, even for these challenging books the average subject ranking was reasonably close to that predicted by the audience level. In all but two cases, the audience level ranking was within the range of the subjects' rankings. In the two cases where the audience level ranking was outside of the subjects' range, different factors may have contributed to the divergence:

• The Ultimate Guide to Snakes and Reptiles (Rank 4), discussed above, is an example of a discrepancy between the subjects' rankings and the audience level rankings. The subjects consistently ranked the book higher than predicted by the audience level.

• Bumblebee Economics (Rank 13) was also consistently ranked higher by the subjects than predicted by the audience level. This may be attributed to the subjects' (correct) perceptions of the book's basis in scholarly research and its lengthy bibliography. They appeared to disregard the non-specialist, flowing style of the author's prose.[8]

The Spearman Rank Coefficient of Correlation, a non-parametric statistical test, was run separately on the results for each subject. The results, with associated p-values, are shown in Table 4. The Spearman test evaluates the degree of correlation between the subject's ranking and the audience level ranking. All the rho values are significant at the 5% confidence level. The conclusion is that there exists a significant correlation between the human subjects' rankings and the audience level rankings. The most important result of this test is the indication that the audience level and human subjects' perceptions are strongly correlated.

[8] Bumblebee Economics may be another installment in the trend for respected scholars to compose books specifically geared towards generalist audiences (Fermat's Last Theorem, A Brief History of Time, Brunelleschi's Dome). The first edition of Bumblebee Economics was cited in the New York Times Book Review as one of the "Best Books of 1979" (Nov. 25, 1979, section BR4), and it was nominated in both 1980 and 1982 for the American Book Awards.
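For readers who wish to replicate this kind of check, SciPy provides the statistic directly; the subject ordering below is invented purely for illustration:

# A minimal sketch of the per-subject analysis using SciPy.
from scipy.stats import spearmanr

audience_level_rank = list(range(1, 31))  # predicted order of the 30 books

# Hypothetical subject ordering: mostly agrees, with a few swaps.
subject_rank = [1, 3, 2, 4, 6, 5, 7, 8, 10, 9,
                11, 13, 12, 14, 15, 17, 16, 18, 19, 20,
                22, 21, 23, 25, 24, 26, 27, 29, 28, 30]

rho, p_value = spearmanr(audience_level_rank, subject_rank)
print(f"rho = {rho:.4f}, p = {p_value:.4g}")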
Table 4. Correlation between Audience Level and Testers' Rankings

Tester | Rho    | P-value
1      | 0.6828 | <0.0001
2      | 0.7526 | <0.0001
3      | 0.8394 | <0.0001
4      | 0.6512 | 0.0001
5      | 0.7531 | <0.0001
6      | 0.7620 | <0.0001
7      | 0.8363 | <0.0001
8      | 0.5092 | 0.0041
9      | 0.5564 | 0.0014
10     | 0.8389 | <0.0001
11     | 0.6966 | <0.0001
12     | 0.7682 | <0.0001
13     | 0.8469 | <0.0001
14     | 0.7544 | <0.0001
15     | 0.7237 | <0.0001
16     | 0.7464 | <0.0001
17     | 0.7918 | <0.0001
18     | 0.5737 | 0.0009
19     | 0.8274 | <0.0001
20     | 0.8852 | <0.0001

Correlation between Audience Level and Holdings

As discussed earlier, both White (1995) and Lesniaski (2004) assumed that the number of libraries holding a title alone could be used to estimate its audience level, or what White referred to as the "difficulty effect" (White, 1995, 10). Figure 3 depicts the relationship between the audience level and the average number of holdings.

[Figure 3. Relationship between Audience Level and Holdings: average holdings (y-axis, 0-300) plotted against audience level (x-axis, 0-100).]

The figure indicates a strong inverse relationship between the audience level and the average number of holdings for audience levels greater than 0.5. Books with high audience levels are not widely held. However, the reverse is not generally true; books with low audience levels are not necessarily widely held. Hence the number of libraries holding a resource is not by itself a good predictor of its audience level. The number of holdings and the audience level, in fact, are measures of different although related attributes. The audience level is really a predictor of the target audience, while, within a given audience level, the number of holdings is a predictor of the perceived quality or popularity of the resource. Resources with very high audience levels by definition will be held predominantly by research libraries. Since, compared to other types of libraries, there are relatively few research libraries, resources with high audience levels will never be widely held. To be widely held, resources must have broad appeal.
Conclusions

The audience level is a valuable aid in identifying the appropriate resources for a particular audience. The algorithm produced audience level values that were consistent with those of human evaluators, as demonstrated both by the analysis of the actual books and by the comparison of the algorithmic results to those of a test group of human subjects. Based on the findings of this research, the audience level is a new tool with the potential to improve information relevance for discovery and selection, for collection analysis, readers' advisory, and reference services. Since the audience level is a valid predictor of the target audience for a resource, it can be integrated into existing and new systems. The audience level has already been integrated into several OCLC prototypes (FictionFinder, WorldCat Identities, and DeweyBrowser) which can aid both librarians and users in discovering and selecting appropriate materials through various services. It is currently being applied to the OhioLINK Collection Analysis Project in anticipation of integration into future reference and collection assessment services. The audience level was used to enhance discovery of scholarly books in Microsoft Live Search; there is potential for integration of the audience level into other discovery systems, such as WorldCat.org and WorldCat Local. This integration would benefit librarians and users in their discovery and selection of materials.

Acknowledgement

The authors would like to acknowledge the statistical advice and analysis provided by Dr. Stanley Lemeshow and the Biostatistics Laboratory at The Ohio State University.

References

Amazon.com: Online shopping for electronics, apparel, books, computers, and more. (n.d.). Retrieved September 20, 2006 from http://www.amazon.com.

Association of College and Research Libraries. (1988). Books for college libraries: A core collection of 50,000 titles. 3rd ed. Chicago: American Library Association.

Bernstein, J. H. (2006). From the ubiquitous to the nonexistent: A demographic study of OCLC WorldCat. LRTS, 50(2), 79-90.

Bonk, W. J., & Magrill, M. J. (1979). Building library collections. 5th ed. Metuchen, NJ: Scarecrow Press.

Budd, J. M. (1991, July). The utility of a recommended core list: An examination of Books for College Libraries, 3rd ed. Journal of Academic Librarianship, 17(3), 140-144.

Calhoun, J. C. (1998, January). Gauging the reception of Choice reviews through online union catalog holdings. LRTS, 42(1), 21-43.

Calhoun, J. C. (2001, July). Reviews, holdings and presses and publishers in academic library book acquisitions. LRTS, 45(3), 127-177.

Carpenter, D. E., & Getz, M. (1995). Evaluation of library resources in the field of economics: A case study. Collection Management, 20(1/2), 49-89.

Ciliberti, A. C. (1994, Winter). Collection evaluation and academic review: A pilot study using the OCLC/AMIGOS Collection Analysis CD. Library Acquisitions: Practice and Theory, 18(4), 431-445.

Connaway, L. S. (2007). Mountains, valleys, and pathways: Serials users' needs and steps to meet them. Part I: Preliminary analysis of focus group and semi-structured interviews at colleges and universities. Serials Librarian, 52(1/2), 223-236.

Connaway, L. S., O'Neill, E. T., & Prabha, C. (2006, July). Last copies: What's at risk? College and Research Libraries, 67(4), 370-379.

Connaway, L. S., O'Neill, E. T., Prabha, C., & Snyder, C. (2004, October). Estimating audience level of monographs using holding patterns in WorldCat. Paper presented at the 3rd annual Library Research Seminar, Kansas City, MO. Retrieved September 20, 2007 from http://www.oclc.org/research/presentations/connaway/lrsIII_audience.ppt.

Gottlieb, J. (1994). Collection assessment in music libraries. MLA Technical Reports, 22. Canton, MA: Music Library Association.

Gyeszky, S., Allen, G., & Smith, C. R. (1992). Achieving academic excellence in higher education through improved library research collections: Using OCLC/AMIGOS Collection Analysis CD for collection building.
In American libraries: Achieving excellence in higher education, 197-206. Chicago: American Library Association.

Harrell, J. (1992). Use of the OCLC/AMIGOS Collection Analysis CD to determine comparative collection strength in English and American literature: A case study. Technical Services Quarterly, 9(3), 1-14.

Hernon, P. (1992). Statistics: A component of the research process. Norwood, NJ: Ablex.

Hickey, T., & Toves, J. (2005, April). FRBR work-set algorithm. Retrieved September 20, 2007 from http://www.oclc.org/research/projects/frbr/default.htm.

IFLA Committee on the Functional Requirements for Bibliographic Records. (1998). FRBR final report. Munich: K. G. Saur. Retrieved September 20, 2007 from http://www.ifla.org/VII/s13/wgfrbr/bibliography.htm.

Joy, A. H. (1992). The OCLC/AMIGOS Collection Analysis CD: A unique tool for collection evaluation and development. Resource Sharing and Information Networks, 8(1), 23-45.

Lavoie, B. F., Connaway, L. S., & O'Neill, E. T. (2007, April). Mapping WorldCat's digital landscape. LRTS, 51(2), 106-115.

Lesniaski, D. (2004). Evaluating collections: A discussion and extension of Brief tests of collection strength. College & Undergraduate Libraries, 11(1), 11-24.

Magrill, R. M. (1985). Evaluation by type of library. Library Trends, 33(3), 267-295.

OCLC Online Computer Library Center, Inc. (2006a). Audience Level prototype. Retrieved September 20, 2007 from http://www.oclc.org/research/researchworks/audience/default.htm.

OCLC Online Computer Library Center, Inc. (2006b). WorldCat.org. Retrieved September 20, 2007 from http://www.worldcat.org.

O'Neill, E. T. (2003). Estimating the audience level of books from holding patterns. Paper presented at the ASIST 2003 Annual Conference, Long Beach, California, October 22, 2003.

Perrault, A. M. (1995, Winter). The changing print resource base of academic libraries in the United States. Journal of Education for Library and Information Science, 36(4), 295-308.

Perrault, A. M. (1999). National collecting trends: collection analysis methods and findings. Library & Information Science Research, 21(1), 47-67.

Perrault, A. M. (2002). Global collective resources: A study of monographic bibliographic records in WorldCat. Retrieved September 20, 2007 from http://www.oclc.org/research/grants/reports/perrault/intro.pdf.

Phillips, L. L., & Williams, S. R. (2003). Collection development embraces the digital age: A review of the literature, 1997-2003. LRTS, 48(4), 273-299.

Pisanski, J., & Žumer, M. (2007). Functional requirements for bibliographic records: An investigation of two prototypes. Program: Electronic Library and Information Systems, 41(4), 400-417.

Schwartz, C. A. (1994, April). Empirical analysis of literature loss. LRTS, 38(2), 133-138.

Schonfeld, R. C., & Lavoie, B. F. (2006). Books without boundaries: A brief tour of the system-wide print book collection. Journal of Electronic Publishing, 9(2).
Retrieved September 20, 2007 from http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;cc=jep;q1=Summer%202006;op2=and;op3=and;rgn=main;rgn1=citation;rgn2=title;rgn3=title;view=text;idno=3336451.0009.208;hi=0.

Senkevitch, J. J., & Sweetland, J. H. (1996, Fall). Evaluating public library adult fiction: Can we define a core collection? RQ, 36(1), 103-117.

Serebnick, J. (1992, July). Selection and holdings of small publishers' books in OCLC libraries: a study of the influence of reviews, publishers, and vendors. Library Quarterly, 62(3), 259-294.

Serebnick, J., & Cullars, J. (1984). An analysis of reviews and library holdings of small publishers' books. LRTS, 28(1), 4-14.

Shaw, D. (1991). An analysis of the relationship between book reviews and fiction holdings in OCLC. Library & Information Science Research, 31(2), 147-154.

Sweetland, J. H., & Christiansen, P. G. (1997, March). Developing language and literature collections in academic libraries: A survey. Journal of Academic Librarianship, 23(2), 119-125.

Twiss, T. M. (2001). A validation of Brief Tests of Collection Strength. Collection Management, 25(3), 23-37.

Velluci, S. L. (1993, Summer). OCLC/AMIGOS Collection Analysis CD: Broadening the scope of use. OCLC Systems and Services, 9(2), 49-53.

Vizine-Goetz, D., & Mitchell, J. S. (2006). DeweyBrowser. Cataloging & Classification Quarterly, 42(3/4), 213-220.

Wagner, S. F. (1992). Introduction to statistics. New York: HarperCollins.

Wallace, D. P., & Boyce, B. R. (1989, January). Holdings as a measure of journal value. Library & Information Science Research, 11(1), 59-71.

Webster, M. G. (1995). Using the AMIGOS/OCLC Collection Analysis CD and student credit hour statistics to evaluate collection growth patterns and potential demand. Library Acquisitions: Practice and Theory, 19(2), 197-210.

White, H. D. (1995). Brief tests of collection strength: A methodology for all types of libraries. Westport, CT: Greenwood Press.

Wiberley, S. E. (2002). The humanities: Who won the '90s in scholarly book publishing. Portal: Libraries and the Academy, 2(3), 357-374.

Wiberley, S. E. (2004). The social sciences: Who won the '90s in scholarly book publishing. College & Research Libraries, 65(6), 505-523.

Electronic error reporting via Internet in the VAX environment

by Sam A. Khosh-Khui
Serials Cataloging Librarian
Albert B. Alkek Library, Southwest Texas State University, San Marcos, Texas, USA

Abstract

This article describes an electronic OCLC error reporting (OER) program developed at the Albert B. Alkek Library, Southwest Texas State University, in response to the OCLC announcement that OCLC users could begin submitting bibliographic record change requests and duplicate record reports via Internet e-mail. OER is a menu-driven program written in VAX VMS which facilitates sending OCLC error reports by providing blank error-report forms for various error-reporting activities. This is accomplished by adding constant and system-supplied information to the forms and then automatically sending the forms, while giving ample opportunities to review the accuracy of the outgoing report. Doing so provides more uniformity and accuracy in the reporting process and saves money and staff time. This article suggests that, although the program is written for the SWT library, it may easily be modified and used by other compatible institutions.

OCLC announced in October 1994 that OCLC users could begin submitting bibliographic record change requests and duplicate record reports via Internet e-mail (AMIGOS, 1994). This announcement was good news for our department here at the Albert B.
Alkek Library, Southwest Texas State University, as, owing to budget limitations for this fiscal year and the library's commitment to save money for next year, we are trying to reduce all costs, including the cost of paper and postage. In a departmental meeting it was agreed to send error reports via e-mail except for those cases where supporting documentation is necessary, as specified in the OCLC Bibliographic Formats and Standards (OCLC, 1994). To facilitate this process it was decided to create a menu and automate most of the routine procedures. Additionally, it was decided to include other error reports on the menu. To make sure that this would be agreeable to OCLC, the cataloging department head, Elaine Sanchez, contacted OCLC and AMIGOS officials to clarify the following questions:

• May we send "Incorrect filing indicator report" and "Type code change request" forms through Internet?
• May we copy portions of the printed forms, which were used for years and with which our staff are familiar, into our online system and send them through e-mail?

The answers to both questions were affirmative. We were told that we could report incorrect filing indicator reports and type code change requests in the same manner as reporting bibliographic record change requests. However, we were not yet given permission to send "Authority file change request" forms via Internet.

Retrieving OCLC files

The files shown in Table I were retrieved from OCLC via FTP according to OCLC's instructions.

Table I. Files retrieved from OCLC via FTP according to OCLC's instructions

File name         | Form/instructions
bib.change.report | Electronic bibliographic change report
dup.report        | Electronic duplicate report
bib.instructions  | Instructions for OCLC's electronic error/duplicate record reporting

The contents of these original files are listed in Appendices 1-3. Except for the OCLC instruction file, which remained unchanged, each file was edited to remove information provided by the program and to include information from the printed forms. Also, two additional forms for "Filing indicator report" and "Type code change request" were created. The edited files were renamed as follows to be used for the OCLC error-reporting program:

• oclc_bib_change_report.form;
• oclc_dup_record_report.form;
• oclc_filing_ind_report.form;
• oclc_type_code_change_req.form;
• oclc_error_report_instructions.txt.

The content of each form is listed in Appendices 4-7.
OCLC error-reporting program

We are connected to the University's VAX cluster, which permits us to use e-mail and other network utilities. One of the VAX features is the ability to write predefined procedures into command files in the VMS language. This feature saves time and assures accuracy and simplicity. Using this feature, the author prepared the OER.COM command file. "OER" is short for "OCLC error reporting" (see Appendix 8 for the program). Although this program is written basically for the SWT Library, with a little modification it could be used on any VAX computer. One can use this program to retrieve automatically the OCLC error report forms, put the user in the editing mode, and then forward the edited form to OCLC via Internet. Once the program is available in the directory, entering "@OER" at the "$" prompt will run the program. On activating this program, the following menu will appear:

B  Bibliographic record change request.
D  Duplicate record report.
F  Filing indicator report.
T  Type code change request.
I  OCLC's instructions for error reporting.
X  Cancel/Return to previous menu.

OCLC's instructions for error reporting

Option "I" will bring up the "Instructions for OCLC's electronic error/duplicate record reporting" file. It is a good idea to become familiar with these instructions. Since this program uses the "edit/tpu/read_only" command to read the file, one can use page-down and page-up to navigate in the instruction file. Using Ctrl/Z will terminate reading and return to the "OCLC error reporting" menu.

Selecting a form

Select the appropriate option by entering "B", "D", "F", or "T". Each option will display a form that is very similar to the paper format. Once one of the above options is selected, the program will do the following:

• Copy the appropriate blank form to a file in the sys$scratch directory.
• Add "==> OCLC_Symbol (Institution's Name)" in the "REPORTED BY (OCLC SYMBOL)" line. Presently the program adds "TXI (Southwest Texas State University)". However, the command file may be modified for any appropriate OCLC symbol and institution's name.
• Automatically read the list of library staff from a file called "lib_staff.dat" and fill in the proper name without the individual needing to enter it. Each record in this list is entered in the format shown in Table II.
• Add "==> {Full name (e-mail address) [Phone no]}" in the "SUBMITTED BY" line, e.g. "==> Sam A. Khosh-Khui (SK03@ADMIN.SWT.EDU) [512/245-2288]". Modify "Internet_address" and "area_code" in the command file to reflect the appropriate address and phone number for your institution.
• Read the system date and add "==> mm/dd/yy" in the "DATE" line, e.g. "==> 12/14/94".

Editing a form

To avoid changing the structure of the blank forms, users are advised to use Ctrl/A to change the keyboard from "Insert" mode to "Overstrike" mode. The status of the keyboard is always displayed on the status line. Ctrl/Z completes the editing session. When editing is completed, the following choices are available:

E = Edit.
R = Read.
S = Send.
X = Cancel.

Option "E" may be used if re-editing of the form is necessary. If one performs more editing, the program will provide the above options at completion. Using the "R" option allows one to proofread the completed form before sending it to OCLC. Page-up, page-down, and find keys are available during "Read", which facilitates navigation through the document. Using Ctrl/Z will return the user to the above choices. Option "X" may be used to cancel the selection. Doing so will lose any changes made on the blank form.
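Although the original program is written in DCL (see Appendix 8), the fill-in step can be re-expressed in Python for illustration; the column offsets follow Table II below, and the e-mail domain and area code are the article's own examples. None of these names come from the original program:

# A minimal Python sketch of the constant-information fill-in.
import datetime

def parse_staff_record(line):
    """Parse one fixed-width record from the staff list (see Table II)."""
    return {
        "user": line[0:4].strip(),      # cols 1-4: user name
        "first": line[5:25].strip(),    # cols 6-25: first name
        "middle": line[26:27].strip(),  # col 27: middle initial
        "last": line[28:48].strip(),    # cols 29-48: last name
        "phone": line[48:57].strip(),   # cols 49-57: phone number
    }

def submitted_by_line(rec, area_code="512", domain="ADMIN.SWT.EDU"):
    middle = f' {rec["middle"]}.' if rec["middle"] else ""
    name = f'{rec["first"]}{middle} {rec["last"]}'
    return f'==> {name} ({rec["user"]}@{domain}) [{area_code}/{rec["phone"]}]'

def date_line(today=None):
    today = today or datetime.date.today()
    return today.strftime("==> %m/%d/%y")  # e.g. "==> 12/14/94"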
Table II. How to enter a library staff record

Field          | Columns | Size | Comments
User Name      | 1-4     | 4    |
               | 5       | 1    | Blank
First Name     | 6-25    | 20   |
               | 26      | 1    | Blank
Middle Initial | 27      | 1    |
               | 28      | 1    | Blank
Last Name      | 29-48   | 20   |
               | 49      | 1    | Blank
Phone_number   | 49-57   | 8    | "-" included

Example:
USER FIRST NAME          M LAST_NAME            PHONE_NO
SK03 Sam                 A Khosh-Khui           245-2288

Sending the form to OCLC

Choose "S" to send the completed form to OCLC. The accuracy of the form is the responsibility of the sender; this program only facilitates filling out the form and the sending process. We hope that using this menu will facilitate reporting and save time and money.

References

AMIGOS (1994), Agenda & OCLC Connection, October, p. 17.

OCLC (1994), OCLC Bibliographic Formats and Standards, 2nd ed., OCLC Online Computer Library Center, Dublin, OH, pp. 51-63.

Appendix 1: electronic bibliographic change report (bib.change.report)

ELECTRONIC BIBLIOGRAPHIC CHANGE REPORT
------------------------------------------------------------------------
RETURN TO: bibchange@oclc.org
------------------------------------------------------------------------
REPORTED BY (OCLC SYMBOL) ==>
SUBMITTED BY ==>
DATE ==>
FORMAT ==>
GIVE THE OCLC CONTROL NUMBER, IDENTIFY ERROR, and GIVE A BRIEF DESCRIPTION OF THE REQUIRED CORRECTION (repeat as necessary).
==>

Appendix 2: electronic duplicate report (dup.report)

ELECTRONIC DUPLICATE REPORT
------------------------------------------------------------------------
RETURN TO: bibchange@oclc.org
------------------------------------------------------------------------
REPORTED BY (OCLC SYMBOL) ==>
SUBMITTED BY ==>
DATE ==>
FORMAT ==>

Appendix 3: instructions for OCLC's electronic error/duplicate record reporting (bib.instructions)

INSTRUCTIONS FOR OCLC'S ELECTRONIC ERROR/DUPLICATE RECORD REPORTING

OCLC encourages members to report errors in the database such as:
• filing indicator changes;
• type code changes;
• name and subject corrections; and
• duplicates.

Your efforts in identifying and reporting needed corrections help in maintaining the quality of the online union catalog. Reports may be sent through the mail, by phone, or by the Internet. Follow these instructions for Internet reports. Proof for the change can be the citation to the appropriate authority record(s) or other reference source. If proof requires a photocopy of the title page or other item, please do not send the report via Internet; rather, use the paper reporting forms.

Bibliographic change report form (file name = bib.change.report)

(1) Read each section and type the appropriate response next to the arrow ==>.
(2) In the "SUBMITTED BY" field, give the name of the person to whom questions should be referred.
(3) When submitting a list that has accumulated over a period, recheck the system to verify that the requested changes are still necessary.
(4) Refer to the "Bibliographic formats and standards" manual for guidelines on the types of errors that can be reported. Note: OCLC reserves the right to process the requested changes at its discretion.
(5) Return the completed form via Internet e-mail to: bibchange@oclc.org.

Appendix 4: electronic bibliographic change report (oclc_bib_change_report.form)

FORMAT ==>
GIVE THE OCLC CONTROL NUMBER, IDENTIFY ERROR, and GIVE A BRIEF DESCRIPTION OF THE REQUIRED CORRECTION (repeat as necessary).
==>
-----------------------------------------------------------------------
AUTHOR (1xx)
-----------------------------------------------------------------------
TITLE (245 $a)
-----------------------------------------------------------------------
FIELD    | FIXED FIELD CODE:
         |
---------|-------------------------------------------------------------
FIELD    | REQUESTED CHANGE:
         |
---------|-------------------------------------------------------------
LINE NO. | TEXT FROM RECORD:
         |
---------|-------------------------------------------------------------
LINE NO. | TEXT FROM RECORD:
         |
---------|-------------------------------------------------------------
TAG      | REQUESTED CHANGE:
         |
-----------------------------------------------------------------------

Appendix 5: electronic duplicate report (oclc_dup_record_report.form)

FORMAT ==>
-----------------------------------------------------------------------
                 |                DUPLICATE RECORDS
PREFERRED RECORD |-----------------------------------------------------
                 |      1      |      2      |      3      |      4
-----------------|-------------|-------------|-------------|-----------
                 |             |             |             |
-----------------|-------------|-------------|-------------|-----------
                 |             |             |             |
-----------------|-------------|-------------|-------------|-----------
                 |             |             |             |
-----------------|-------------|-------------|-------------|-----------
                 |             |             |             |
-----------------------------------------------------------------------

Appendix 6: incorrect filing indicator report (oclc_filing_ind_report.form)

--------------------------------------------------------------------------------
    | OCLC    |                 |         |           |
    | CONTROL | IF SERIALS      | FIELD   | INCORRECT | CHANGE
    | NUMBER  | PLEASE INDICATE | TAG     | INDICATOR | TO
----|---------|-----------------|---------|-----------|----------
  1 |         |                 |         |           |
----|---------|-----------------|---------|-----------|----------
  2 |         |                 |         |           |
----|---------|-----------------|---------|-----------|----------
  3 |         |                 |         |           |
----|---------|-----------------|---------|-----------|----------
  4 |         |                 |         |           |
----|---------|-----------------|---------|-----------|----------
  5 |         |                 |         |           |
--------------------------------------------------------------------------------

Appendix 7: type code change request (oclc_type_code_change_req.form)

--------------------------------------------------------------------------------
    | OCLC    |           |      |
    | CONTROL | INPUTTING | FROM | TO
    | NUMBER  | LIBRARY   | TYPE | TYPE
----|---------|-----------|------|------
  1 |         |           |      |
----|---------|-----------|------|------
  2 |         |           |      |
----|---------|-----------|------|------
  3 |         |           |      |
----|---------|-----------|------|------
  4 |         |           |      |
----|---------|-----------|------|------
  5 |         |           |      |
--------------------------------------------------------------------------------

Appendix 8: OER.COM

$info:
$! NAME:      OCLC ERROR REPORTING
$! COMMAND:   @OER.COM (v.4.2)
$! AUTHOR:    Sam A. Khosh-khui, Ph.D.
$! DATE:      December, 1994
$! SITE:      Southwest Texas State University, Alkek Library
$! NOTE:      This program facilitates reporting OCLC error forms via the Internet
$! COPYRIGHT: This program may be copied by non-commercial users provided that
$!            appropriate credit is given to the author.
$!
$blank_form_files:
$blank_form_B == "disk$b:[sk03.pubd.oclcd]oclc_bib_change_report.txt"
$blank_form_D == "disk$b:[sk03.pubd.oclcd]oclc_dup_record_report.txt"
$blank_form_F == "disk$b:[sk03.pubd.oclcd]oclc_filing_ind_report.txt"
$oclc_instr_file == "disk$b:[sk03.pubd.oclcd]oclc_error_report_instructions.txt"
$blank_form_T == "disk$b:[sk03.pubd.oclcd]oclc_type_code_change_req.txt"
$staff_file == "disk$b:[rr02.lupd]libg.dat"
$!
$definitions:
$ask :== "read sys$command /time_out=200/error=errpass/prompt="
$ev_r == "edit/tpu/command=disk$b:[sk03.comd]ez.tpu/nomodify/read_only"
$file_name :== ""
$form_name :== ""
$form_file :== ""
$oclc_form :== ""
$qm :== """
$report_name :== ""
$!
$constant_info:
$user_name = f$user() - "[" - "]"
$day = f$cvtime(,,"day")
$if f$extract(0,1,day) .eqs. "0" then day = f$extract(1,1,day)
$month = f$cvtime(,,"month")
$if f$extract(0,1,month) .eqs. "0" then month = f$extract(1,1,month)
$year = f$cvtime(,,"year")
$date = month+"/"+day+"/"+year
$!
$open_staff_info_file:
$open/read staff_list 'staff_file
$!
$! Each line of the staff file looks like the following:
$! FIRST NAME        MIDDLE N.  LAST_NAME       USER_NAME  PHONE_NO
$! Sam               A          Khosh-Khui      SK03       218 2288
$!
$read_staff_info_file:
$read/end_of_file=close_staff_info_file staff_list staff_record
$if f$extract(48,4,staff_record) .eqs. ""+user_name
$then
$   first_name = f$extract(0,15,staff_record)
$   first_name = f$edit(first_name, "trim,compress")
$   middle_name = f$extract(16,10,staff_record)   ! field offsets reconstructed; printed listing garbled here
$   middle_name = f$edit(middle_name, "trim,compress")
$   if middle_name .nes. "" then middle_name = middle_name+"."
$   last_name = f$extract(27,20,staff_record)     ! offset reconstructed
$   last_name = f$edit(last_name, "trim,compress")
$   phone_no = f$extract(57,4,staff_record)
$   name = first_name+" "+middle_name+" "+last_name
$   goto close_staff_info_file
$endif
$goto read_staff_info_file
$!
$close_staff_info_file:
$close staff_list
$!
$oclc_error_report_menu:
$write sys$output "          OCLC ERROR REPORTING"
$write sys$output "   B   Bibliographic Record Change Request"
$write sys$output "   D   Duplicate Record Report"
$write sys$output "   F   Filing Indicator Report"
$write sys$output "   T   Type Code Change Request"
$write sys$output "   I   OCLC's Instructions For Error Reporting"
$write sys$output "   X   Cancel/Return to Previous Menu"
$ask " Please Enter Your Choice and Press RETURN>>" choice
$choice = f$edit(choice, "trim,compress,upcase")
$if choice .eqs. "B"
$then
$   file_name = ""+blank_form_B
$   report_name = " ELECTRONIC BIBLIOGRAPHIC CHANGE REPORT"
$   form_name = ""
$   oclc_form = "OCLC Bibliographic Record Change Request"
$   form_file = "sys$scratch:oclc_bib_change_report.form"
$   goto make_oclc_form
$endif
$!
$if choice .eqs. "D"
$then
$   file_name = ""+blank_form_D
$   report_name = " ELECTRONIC DUPLICATE REPORT"
$   form_name = ""
$   oclc_form = "OCLC Duplicate Record Report"
$   form_file = "sys$scratch:oclc_dup_record_report.form"
$   goto make_oclc_form
$endif
$!
$if choice .eqs. "F"
$then
$   file_name = ""+blank_form_F
$   report_name = " ELECTRONIC BIBLIOGRAPHIC CHANGE REPORT"
$   form_name = " (INCORRECT FILING INDICATOR REPORT)"
$   form_file = "sys$scratch:oclc_filing_ind_report.form"
$   oclc_form = "OCLC Incorrect Filing Indicator Report"
$   goto make_oclc_form
$endif
$!
$if choice .eqs. "I" then goto oclc_instructions
$if choice .eqs. "X" then exit
$if choice .eqs. "T"
$then
$   file_name = ""+blank_form_T
$   report_name = " ELECTRONIC BIBLIOGRAPHIC CHANGE REPORT"
$   form_name = " (TYPE CODE CHANGE REQUEST)"
$   oclc_form = "OCLC Type Code Change Request"
$   form_file = "sys$scratch:oclc_type_code_change_req.form"
$   goto make_oclc_form
$endif
$write sys$output " Incorrect Choice"
$goto oclc_error_report_menu
$!
$oclc_instructions:
$assign/use sys$command sys$input
$edit/tpu/nomodify/read_only 'oclc_instr_file
$goto oclc_error_report_menu
$!
$make_oclc_form:
$write sys$output " Working ..."
$create 'form_file
$open/write form_file 'form_file
$write form_file ""
$write form_file ""
$write form_file ""+report_name
$write form_file ""+form_name
$write form_file ""
$line = "-------------------------------------------------------------"
$write form_file ""+line
$write form_file "RETURN TO: bibchange@oclc.org"
$write form_file ""+line
$write form_file ""
$write form_file "REPORTED BY:"
$write form_file ""
$write form_file "==> TXI (Southwest Texas State University)"
$write form_file ""
$write form_file "SUBMITTED BY:"
$write form_file ""
$write form_file "==> "+name+" ("+user_name+"@swt.edu) ["+phone_no+"]"
$write form_file ""
$write form_file "DATE"
$write form_file ""
$write form_file "==> "+date
$!
$open_oclc_form_file:
$open/read form_lines 'file_name
$!
$read_oclc_form_file:
$read/end_of_file=close_form_file form_lines form_record
$write form_file ""+form_record
$goto read_oclc_form_file
$close_form_file:
$close form_file
$close form_lines
$!
$edit_oclc_form:
$edit/tpu 'form_file                              ! reconstructed from garbled listing
$!
$edit_form_choices:
$write sys$output " E = Edit   R = Read   S = Send   X = Cancel"
$resp = ""
$ask " Enter Your Choice>>" resp
$resp = f$edit(resp, "trim,compress,upcase")
$!
$test_edit_form_choices:
$if resp .eqs. "E" then goto edit_oclc_form
$if resp .eqs. "R" then goto read_oclc_form
$if resp .eqs. "S" then goto send_oclc_form
$if resp .eqs. "X" then goto delete_oclc_file
$write sys$output " Incorrect Choice"
$goto edit_form_choices
$!
$delete_oclc_file:
$delete/nolog/noconfirm sys$scratch:oclc*.*;*
$form_file = ""
$goto oclc_error_report_menu
$!
$read_oclc_form:
$assign/use sys$command sys$input
$ev_r 'form_file
$goto edit_form_choices
$!
$send_oclc_form:
$oclc_address = "in%"+qm+"bibchange@oclc.org"+qm
$!oclc_address = "in%"+qm+"sk03@swt.edu"+qm      ! test address, normally commented out
$subj = qm+oclc_form+qm                          ! reconstructed; definition lost in the printed listing
$!
$create_user_dist_list:
$create sys$scratch:oclc_user.dis
$open/write oclc_file sys$scratch:oclc_user.dis
$write oclc_file ""+oclc_address+""
$close oclc_file
$!
$write sys$output " Working ..."
$mail 'form_file "@sys$scratch:oclc_user" /subject='subj
$write sys$output " "+oclc_form+" Mailed to: "+oclc_address
$delete/nolog/noconfirm sys$scratch:oclc*.*;*
$ask " Please Press <RETURN> to Continue ..." ret
$goto oclc_error_report_menu

work_5cmz22jlhfhdlfksx3vsestcia ----

Multi-Media, Multi-Cultural, and Multi-Lingual Digital Libraries
Or How Do We Exchange Data In 400 Languages?

Christine L. Borgman
Professor and Chair
Department of Library and Information Science
Graduate School of Education & Information Studies
University of California, Los Angeles
Los Angeles, California
cborgman@ucla.edu

D-Lib Magazine, June 1997
ISSN 1082-9873

Contents: Introduction | Medium, Culture, and Language | From Local Systems to Global Systems | Design Tradeoffs | Representation in Digital Form | Language and Character Sets | Transliteration and Other Forms of Data Loss | Character Encoding | Mono-lingual, Multi-lingual, and Universal Character Sets | Library Community Approaches | Summary and Conclusions | References

Introduction

The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages.
Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange (Libicki, 1995) related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure (Kahin & Abbate, 1995), for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard (Libicki, 1995, p. 63). The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, and lastly address technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language- and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.

Medium, Culture, and Language

Speaking is different from writing, and still images are different from moving images; verbal and graphical communication are yet more different from each other. Speaking in one's native language to people who understand that language is different than speaking through a translator. Language translations, whether oral or written, manual or automatic, cannot be true equivalents due to subtle differences between languages and the cultures in which they originate. Thus the content and effect of messages are inseparable from the form of communication and the language in which they are communicated.
For all of these reasons, we wish to capture DL content in the richest forms possible to assure the maximum potential for communication. We want accurate representations of the original form and minimal distortion of the creators' (author, artist, film maker, engineer, etc.) intentions. At the same time, we want to provide the widest array of searching, manipulation, display, and capture capabilities to those who seek the content, for the searchers or users of these digital libraries may come from different cultures and speak different languages than those of the creators. Herein lies the paradox of information retrieval: the need to describe the information that one does not have. We have spent decades designing mechanisms to match up the expressions of searchers with those of the creators of textual documents (centuries, if manual retrieval systems are considered). This is an inherently insolvable problem due to the richness of human communication. People express themselves in distinctive ways, and their terms often do not match those of the creators and indexers of the information sought, whether human or machine. Conversely, the same terms may have multiple meanings in multiple contexts. In addition, the same text string may retrieve words in multiple languages, adding yet more variance to the results. Better retrieval techniques will narrow the gap between searchers and creators of content, but will never close that gap completely. Searching for information in multi-media digital libraries is more complex than text-only searching. Consider the many options for describing sounds, images, numeric data sets, and mixed-media objects. We might describe sounds with words, or with other sounds (e.g., playing a tune and finding one like it); we might describe an image with words, by drawing a similar object, or by providing or selecting an exemplar. As Croft (1995) notes in an earlier D-Lib issue, general solutions to multi-media indexing are very difficult, and those that do exist tend to be of limited utility. The most progress is being made in well-defined applications in a single medium, such as searching for music or for photographs of faces. Cultural issues pervade digital library applications, whether viewing culture at the application level, such as variations in approaches to image retrieval by the art, museum, library, scientific, and public school communities, or on a multi-national scale, such as the differing policies on information access between the United States and Central and Eastern Europe. Designing digital libraries for distributed environments involves complex tradeoffs between tailoring to local cultures and meeting the standards and practices necessary for interoperability with other systems and services (Borgman, et al., 1996). From Local Systems to Global Systems The easiest systems to design are those for well-defined applications and well-defined user populations. Under these conditions, designers can build closed systems tailored to a community of users, iteratively testing and refining capabilities. These are rare conditions today, however. More often, we are designing open systems that serve not only a local population, but also remote and perhaps unknown populations. 
Examples include digital libraries of scholarly materials built by and for one university, then later made openly available on the Internet; business assets databases developed and tested at a local site and then provided to corporate sites around the world; scientific databases designed for research applications, later made available for educational purposes; and library catalogs designed for a local university, later incorporated into national and international databases for resource sharing. Any of these applications could involve content in multiple media and multiple languages.

Design Tradeoffs

Consider how the design issues change from local to distributed systems. In local systems, designers can tailor user interfaces, representation of content, and functional capabilities to the local culture and to the available hardware and software. Input and output parameters are easily specified. If users need to create sounds or to draw, these capabilities can be provided, along with display, capture, and printing abilities in the matching standards. Keyboards can be set to support the local language(s) of input; screens and printers can be set to support the proper display of the local languages as well. Designers have far less control over digital libraries destined for use in globally distributed environments. Users' hardware and software platforms are typically diverse and rapidly changing. Designers often must specify a minimum configuration or require a minimum version of client software, making tradeoffs between lowering the requirements to reach a larger population or raising requirements to provide more sophisticated capabilities. The more sophisticated the multi-media or multi-lingual searching capabilities, the higher the requirements are likely to be, and the fewer people that are likely to be served. While good design includes employing applicable standards, determining which standards are appropriate in the rapidly evolving global information infrastructure involves tradeoffs as well. The use of some standards may be legislated by the parent organization or funding agency, and the use of other standards may be a matter of judging which are most stable and which are most likely to be employed in other applications with which the current system needs to exchange data. In the case of character sets for representing text in digital libraries, designers sometimes face a choice between a standard employed within their country to represent their national language and a universal character set in which their national language is more commonly represented in other countries. At present, massive amounts of textual data are being generated in digital form and represented in formats specific to applications, languages, and countries. The sooner the digital library community confronts this tidal wave of "legacy data" in incompatible representations, the more easily this interoperability problem may be solved.

Representation in Digital Form

Although we have been capturing text, images, and sounds in machine-readable forms for several decades, issues of representation became urgent only when we began to access, maintain, exchange, and preserve data in digital form. In information technologies such as film, phonograph, CD-ROM, and printing, electronic data often is an intermediate format. When the final product was published or produced, the electronic data often were destroyed, and the medium (disks, tapes, etc.) reused.
In digital libraries, the perspective changes in two important ways: (1) from static output to dynamic data exchange; and (2) from a transfer mechanism to a permanent archival form. In sound or print recordings, for example, once the record is issued or the book printed, it no longer matters how the content was represented in machine-readable form. In a digital library, the representation matters because the content must be continuously searched, processed, and displayed, and often must be exchanged with other applications on the same and other computers. When electronic media were viewed only as transfer mechanisms, we made little attempt to preserve the content. Many print publications exist only in paper form, the typesetting tapes used to generate them long since overwritten. Much of the early years of television broadcasts were lost, as the recording media were reused or allowed to decay. Now we recognize that digital data must be viewed as a permanent form of representation, requiring means to store content in complete and authoritative forms, and to migrate content to new technologies as they appear. Language and Character Sets Character set representation is a problem similar to that of representing multi-media objects in digital libraries, yet is more significant due to the massive volume of textual communication and data exchange that takes place on computer networks. Culture plays a role here as well: speakers of all languages wish to preserve their language in its complete and authoritative form. Incomplete or incorrect data exchange results in failures to find information, in failures to authenticate identities or content, and in the permanent loss of information. Handling character sets for multiple languages is a pervasive problem in automation, and one of great concern to libraries, network developers, government agencies, banks, multi-national companies, and others exchanging information over computer networks. Much to the dismay of the rest of the world, computer keyboards were initially designed for the character set of the English language, containing only 26 letters, 10 numbers, and a few special symbols. While variations on the typical English-language keyboard are used to create words in most other languages, doing so often results in either (1) a loss of data, or (2) encoding characters in a language-specific or application-specific format that is not readily transferable to other systems. We briefly discuss the problems involved in data loss and character encoding, then discuss some potential solutions. Transliteration and Other Forms of Data Loss Languages written in non-Roman scripts, such as Japanese, Arabic, Chinese, Korean, Persian (Farsi), Hebrew, and Yiddish (the "JACKPHY" languages), and Russian, are transliterated into Roman characters in many applications. Transliteration matches characters or sounds from one language into another; it does not translate meaning. Considerable data loss occurs in transliteration. The process may be irreversible, as variations occur due to multiple transliteration systems for a given language (e.g., Peking vs. Beijing, Mao Tse-tung vs. Mao Zedong (Chinese), Tchaikovsky vs. Chaikovskii (Russian)), and the transliterated forms may be unfamiliar to speakers of that language. 
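The irreversibility of transliteration is easy to demonstrate with a small sketch in Python. The two letter tables below are invented, simplified fragments of popular and ALA-LC-style romanization practice, not the full rules; the point is only that the same Cyrillic name yields incompatible Roman search keys:

# Simplified, partial Cyrillic-to-Roman tables (illustrative fragments only).
POPULAR = {"Ч": "Tch", "а": "a", "й": "i", "к": "k",
           "о": "o", "в": "v", "с": "s", "и": "i"}
ALA_LC_LIKE = {"Ч": "Ch", "а": "a", "й": "i", "к": "k",
               "о": "o", "в": "v", "с": "s", "и": "i"}

def romanize(word, table, final_iy):
    """Character-by-character transliteration, with a scheme-specific
    rule for the word-final '-ий' sequence."""
    if word.endswith("ий"):
        return "".join(table[c] for c in word[:-2]) + final_iy
    return "".join(table[c] for c in word)

name = "Чайковский"
print(romanize(name, POPULAR, final_iy="y"))       # Tchaikovsky
print(romanize(name, ALA_LC_LIKE, final_iy="ii"))  # Chaikovskii

The two output keys will never match in a merged index, and neither can be mechanically converted back to the Cyrillic original, which is exactly the data loss described above.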
Languages written in extensions of the Roman character set, such as French, Spanish, German, Hungarian, Czech, and Polish, are maintained in incomplete form in some applications by omitting diacritics (accents, umlauts, and other language-specific marks) that distinguish their additional characters. These forms of data loss are similar to those of "lossy" compression of images, in which data are discarded to save storage costs and transmission time while maintaining an acceptable reproduction of the image. Any kind of data loss creates problems in digital libraries. Variant forms of words will not match and sort properly, incomplete words will not exchange properly with digital libraries using complete forms, and incomplete forms may not be adequate for authoritative or archival purposes. The amount of acceptable loss varies by application. Far more data loss is acceptable in applications such as email, where rapid communication is valued over authoritative form, than in financial or legal records, where authentication is essential, for example. Character Encoding The creation of characters in electronic form involves hardware and software to support input, storing, processing, sorting, displaying, and printing. The internal representation of each character determines how it is treated by the hardware (keyboard, printer, VDT, etc.) and the application software. Two characters may appear the same on a screen but be represented differently due to their different sorting positions in multiple languages, for example. Conversely, the same key sequence on two different keyboards may produce two different characters, depending upon the internal representation that is generated. Character encoding for digital libraries includes all of these aspects: The keyboard commands used to generate characters, especially characters with diacritics, for building the digital library content; The keyboard commands used to generate characters to search the digital library; Rules for sorting characters in correct alphabetic sequence, which are dependent on the internal representation of the character; the correct sequence varies by language; Display of characters on computer screens; and Output of characters on printers and other devices. Numerous possibilities exist for mismatches and errors in access to digital libraries in distributed environments, considering the vast array of hardware and software employed by DLs and their users and the variety of languages and character encoding systems that may be involved. Mono-lingual, Multi-lingual, and Universal Character Sets Many standards and practices exist for encoding characters. Some are language-specific, others are script-specific (e.g., Latin or Roman, Arabic, Cyrillic), and "universal" standards that support most of the world's written languages are now available. Exchanging data among digital libraries that employ different character encoding formats is the crux of the problem. If mono-lingual DLs all use the same encoding format, such as ASCII for English, data exchange should be straightforward. If mono-lingual DLs use different formats, such as the three encoding formats approved for the Hungarian language by the Hungarian standards office (Számítástechnikai karakterkódok. A grafikus karakter magyar referenciakészlete, 1992), then data exchange encounters problems. 
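The mismatch is concrete at the byte level. A minimal Python sketch (byte value 0xF5 is just one example) shows that the same stored byte is a different character depending on which Latin extension the receiving system assumes:

# The same byte, decoded under two different Latin extensions.
stored = b"\xf5"  # one byte from a record created under ISO 8859-2 (Latin-2)

print(stored.decode("iso-8859-2"))  # 'ő' -- o with double acute, used in Hungarian
print(stored.decode("iso-8859-1"))  # 'õ' -- o with tilde: the wrong character

No error is raised in the second case; the loss is silent, which is why such mismatches surface only later, as failed searches, incorrect sorting, and garbled displays.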
Characters generated by a keyboard that is set for one encoding system may not match characters stored under another encoding system; characters with diacritics may display or print incorrectly or not display at all. DLs using the same script-specific formats, such as Latin-2 extended ASCII that encompasses the major European languages, should be able to exchange data with each other. When DLs using Latin-2 attempt to exchange data in those same languages with DLs using language-specific formats, mismatches may occur. Similarly, mismatches may occur when DLs that employ Latin-2 for European languages exchange data with DLs that employ a different multi-lingual set such as the American Library Association character set (The American Library Association Character Set, 1989) commonly used in the United States. After many years of international discussion on the topic, Unicode appears to be emerging as the preferred standard to support most of the world's written languages. A universal character set offers great promise for solving the data exchange problem. If data in all written languages are encoded in the same format, then data can be exchanged between mono-lingual and multi-lingual digital libraries. Just as the networked world is moving toward hardware platform-independent solutions, adopting Unicode widely would move us toward language-independent solutions to distributed digital libraries and to universal data exchange. Techniques for automatic language translation would be assisted by a common character set standard as well. Any solution that appears too simple probably is. Major hardware and software vendors are beginning to support Unicode, but it is not embedded in much application software yet. Unicode requires 16 bits to store each character -- twice as much as ASCII, at 8 bits. However, Unicode requires only half as much space as the earlier version of ISO 10646 (32 bits), the competing and more comprehensive universal character set. Unicode emerged as the winner in a long standards battle, eventually merging with ISO 10646, because it was seen as easier to implement and thus more likely to be adopted widely. As storage costs continue to decline, the storage requirements of Unicode will be less of an issue for new applications. Massive amounts of text continue to be generated not only in language-specific and script-specific encoding standards, but in local and proprietary formats. Any of this text maintained in digital libraries may become "legacy data" that has to be mapped to Unicode or some other standard in the future. At present, digital library designers face difficult tradeoffs between the character set standards in use by current exchange partners, and the standard likely to be in international use in the future for a broader variety of applications. Library Community Approaches The international library community began developing large, multi-language digital libraries in the 1960s. Standards for record structure and character sets were established long before the Internet was created, much less Unicode. Hundreds of millions of bibliographic records exist around the world in variations of the MARC (MAchine Readable Cataloging) standard, although in multiple character set encoding formats. 
OCLC Online Computer Library Center, the world's largest cataloging cooperative, serves more than 17,000 libraries in 52 countries and contains over 30 million bibliographic records with over 500 million records of ownership attached in more than 370 languages (Mitchell, 1994; OCLC Annual Report, 1993; Smith, 1994). OCLC uses the American Library Association (ALA) character set standard, which extends the English-language keyboard to include diacritics from major languages (Agenbroad, 1992; The ALA Character Set, 1989). Text in most other languages is maintained in transliterated form. The Library of Congress, which contributes its records in digital form to OCLC, RLIN (Research Libraries Information Network, the other major U.S.-based bibliographic utility), and other cooperatives, also does original script cataloging for the JACKPHY languages mentioned earlier. RLIN pioneered the ability to encode the JACKPHY languages in their original script form for bibliographic records, using available script-specific standards (Aliprand, 1992). Records encoded in full script form are exchanged between the Library of Congress, RLIN, OCLC, other bibliographic utilities in the U.S. and elsewhere, and many digital libraries maintained by research libraries. Catalog cards are printed in script and Romanized forms from these databases, but direct use of the records in script form requires special equipment to create and display characters properly. Records from OCLC, RLIN, and other sources are loaded into the online catalogs of individual libraries, where they usually are searchable only in transliterated forms. Some online catalogs support searching with diacritics, while others support only the ASCII characters. Regardless of the local input and output capabilities, if characters are represented internally in their fullest form, they will be available for more sophisticated uses in the future when the search and display technologies become more widely available. Libraries always have taken a long-term perspective on preserving and providing access to information. They manage content in many languages and cooperate as an international community to exchange data in digital form. Thus it is not surprising that libraries were among the first institutions to tackle the multi-lingual character set problem. Over the last 30 years, libraries have created massive stores of digital data. Not only do libraries create and maintain new bibliographic records in digital form, a growing number of the world's major research libraries have converted all of their historical records -- sometimes dating back several hundred years -- into a common record structure. By now, libraries have the expertise and influence to affect future developments in standards for character sets and other factors in data exchange. The library world is changing, however, as new regions of the world come online. The European Union is promoting Unicode and funding projects to support Unicode implementation in library automation (Brickell, 1997). Automation in Central and Eastern Europe (CEE) has advanced quickly since 1990 (Borgman, in press). A survey of research libraries in six CEE countries, each with its own national language and character set, indicates that a variety of coding systems are in use. As of late 1994, more than half used ASCII Latin2, one used Unicode, and the rest used a national or system-specific format; none used the ALA character set (Borgman, 1996).
The national libraries in these countries are responsible for preserving the cultural heritage of their countries that appears in published form, and thus require that their language be preserved in its most complete and authoritative digital form. Transliterated text or characters stripped of diacritics are not acceptable. Several of these national libraries are now working closely with OCLC, toward the goal of exchanging data in authoritative forms. As libraries, archives, museums, and other cultural institutions throughout the world become more aware of the need to preserve digital data in archival forms, character set representation becomes a political as well as technical issue. Many agencies are supporting projects to ensure preservation of bibliographic data in digital forms that can be readily exchanged, including the Commission of the European Communities, International Federation of Library Associations, Soros Foundation Open Society Institute Regional Library Program, and the Mellon Foundation (Segbert & Burnett, 1997). Summary and Conclusions Massive volumes of text in many languages are becoming available online, whether created initially in digital form or converted from other media. Much of this data will be stored in digital libraries, whether alone or in combination with sounds and images. Digital formats are no longer viewed as an intermediate mechanism for transferring data to print, film, tape, or media. Rather, they have become permanent archival forms for many applications, including digital libraries. DL content is used directly in digital form -- searched, processed, and often reformatted for reuse in other applications. Data is exchanged between DLs, whether in large batch transfers, such as tape loads between bibliographic utilities and online catalogs, electronic funds transfers between financial institutions, or as hyperlinks between DLs distributed across the Internet. In networked environments, searchers speaking many different languages, with many different local hardware and software platforms, may access a single digital library. For all of these reasons, we need to encode characters in a standard form that can support most of the world's written languages. The first step is for designers of digital libraries to recognize that the multi-lingual character set problem exists. The goal of this essay, and the choice of publication venue, is to bring the problem to the attention of a wider audience than the technical elite who have been grappling with it for many years now. The second step is to take action. The solution will not come overnight, but given the great strides already taken toward platform-independent network applications, and toward standards for exchanging sounds and images, the foundation for progress has been laid. Designers of networked applications are more aware of interoperability, portability, and data exchange issues than in the past. Experience in migrating data from one application to another provides object lessons in the need to encode data in standard formats. Unicode appears to be the answer for new applications and for mapping legacy data from older applications. However, designers still must weigh factors such as the amount of data currently existing in other formats, the standards in use by other systems with which they must exchange data regularly, the availability of application software that supports Unicode and other universal standards for encoding character sets, and the pace at which conversion will occur. 
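As a minimal sketch of that migration path (Python; the file names are hypothetical, and the legacy encoding is assumed to be known -- here ISO 8859-2 -- since the bytes alone do not identify it), re-encoding legacy text into a Unicode encoding such as UTF-8 is mechanically simple; the difficult tradeoffs described above are organizational rather than computational:

def migrate_to_utf8(src_path, dst_path, legacy_encoding="iso-8859-2"):
    """Re-encode a legacy text file into UTF-8.

    The legacy encoding must be supplied: bytes alone do not identify
    the character set, which is why undocumented 'legacy data' is so
    costly to rescue.
    """
    with open(src_path, "r", encoding=legacy_encoding) as src:
        text = src.read()        # decode legacy bytes into Unicode
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)          # write the same text back as UTF-8

# Hypothetical usage:
# migrate_to_utf8("catalog_latin2.txt", "catalog_utf8.txt")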
The sooner that the digital library community becomes involved in these discussions, the sooner we will find a multi-media, multi-cultural, and multi-lingual solution to exchanging data in all written languages.

References

Agenbroad, J.E. (1992). Nonromanization: Prospects for improving automated cataloging of items in other writing systems. Cataloging Forum, Opinion Papers, No. 3. Washington, DC: Library of Congress.

The ALA character set and other solutions for processing the world's information. (1989). Library Technology Reports, 25(2), 253-273.

Aliprand, J.M. (1992). Arabic script on RLIN. Library Hi Tech, 10(4), Issue 40, 59-80.

Borgman, C.L. (1996). Automation is the answer, but what is the question? Progress and prospects for Central and Eastern European libraries. Journal of Documentation, 52(3), 252-295.

Borgman, C.L. (In press). From acting locally to thinking globally: A brief history of library automation. Library Quarterly. (To appear July, 1997)

Borgman, C.L.; Bates, M.J.; Cloonan, M.V.; Efthimiadis, E.N.; Gilliland-Swetland, A.; Kafai, Y.; Leazer, G.L.; Maddox, A. (1996). Social aspects of digital libraries. Final Report to the National Science Foundation; Computer, Information Science, and Engineering Directorate; Division of Information, Robotics, and Intelligent Systems; Information Technology and Organizations Program. Award number 95-28808.

Bossmeyer, C.; Massil, S.W. (Eds.). (1987). Automated systems for access to multilingual and multiscript library materials: Problems and solutions. Papers from the pre-conference held at Nihon Daigaku Kaikan, Tokyo, Japan, August 21-22, 1986. International Federation of Library Associations and Institutions, Section on Library Services to Multicultural Populations and Section on Information Technology. Munich and New York: K.G. Saur.

Brickell, A. (1997). Unicode/ISO 10646 and the CHASE project. In M. Segbert & P. Burnett (eds.), Proceedings of the Conference on Library Automation in Central and Eastern Europe, Budapest, Hungary, April 10-13, 1996. Soros Foundation Open Society Institute Regional Library Program and Commission of the European Communities, Directorate General XIII, Telecommunications, Information Market and Exploitation of Research, Libraries Programme (DG XIII/E-4). Budapest: Open Society Institute.

Croft, W.B. (1995). What do people want from information retrieval? (The top 10 research issues for companies that use and sell IR systems). D-Lib Magazine, November.

Kahin, B.; Abbate, J. (eds.). (1995). Standards policy for information infrastructure. Cambridge, MA: MIT Press.

Libicki, M.C. (1995). Standards: The rough road to the common byte. In B. Kahin & J. Abbate (eds.), Standards policy for information infrastructure. Cambridge, MA: MIT Press, 35-78.

Számítástechnikai karakterkódok. A grafikus karakter magyar referenciakészlete. (1992). Budapest: Magyar Szabványügyi Hivatal. (Character sets and single control characters for information processing. Hungarian reference version of graphic characters. Budapest: Hungarian Standards Office.)

Mitchell, J. (1994). OCLC Europe: Progress report, March, 1994. European Library Automation Group Annual Meeting, Budapest.

OCLC Online Computer Library Center, Inc. (1993). Furthering access to the world's information (Annual Report 1992/93). Dublin, OH: Author.

Segbert, M. & Burnett, P. (eds.). (1997). Proceedings of the Conference on Library Automation in Central and Eastern Europe, Budapest, Hungary, April 10-13, 1996.
Soros Foundation Open Society Institute Regional Library Program and Commission of the European Communities, Directorate General XIII, Telecommunications, Information Market and Exploitation of Research, Libraries Programme (DG XIII/E-4). Budapest: Open Society Institute.

Smith, K.W. (1994). Toward a global library network. OCLC Newsletter, 208, 3.

Copyright ©1997 Christine L. Borgman
cnri.dlib/june97-borgman

work_5e7t3mgpx5dj3dmqruaxhxuocy ----

A Case Study of the CDRS for Effective Operation of Collaborative Digital Reference Service in Korea*

Soon-Ja Bae**

Contents: 1. Introduction; 2. Introduction and Development of Collaborative Digital Reference Service (CDRS): 2.1 Background, 2.2 Concept and Development; 3. Case Analysis of CDRS: 3.1 Conduct of the Survey, 3.2 Analysis of the Survey Results; 4. Conclusion

Abstract: This study analyzes cases of collaborative digital reference service (CDRS) in order to chart directions for the future development of CDRS in Korea. The international cases, which pursue the specialization of reference service through worldwide cooperation, are services led by the national libraries of seven countries, beginning with QuestionPoint in the United States; the domestic case is "Ask a Librarian" (사서에게 물어보세요) of the National Library of Korea. The analysis shows that use of the Korean service grows every year and that it is applied to the research and work of university students and working adults, confirming its potential to develop into a specialized service.

ABSTRACT
This paper aims to seek ways of optimizing the operations of Collaborative Digital Reference Service (CDRS) in Korea. CDRS tries to specialize the reference service through world-wide collaborative service. In this study seven international CDRS services including "QuestionPoint" operated by ALA and one national service, "Ask a Librarian" by the National Library of Korea were surveyed. A focused analysis of CDRS in Korea shows not only a sharp increase in use by the public, but also much application in research and academic activities by university students and workers.

Keywords: collaborative digital reference service, digital information service, global network, QuestionPoint, Ask a Librarian (사서에게 물어보세요), specialized information service

* This study was supported by a research grant from Jeonju University.
** Professor, Library and Information Science, Division of Social Sciences, Jeonju University (sj1bae@jj.ac.kr)
Received: January 16, 2011; First review: January 19, 2011; Accepted: February 9, 2011.
Journal of the Korean Society for Library and Information Science, 45(1): 11-27, 2011. [DOI:10.4275/KSLIS.2011.45.1.011]

1. Introduction

The rapid development of the information environment has become one of the most powerful sources of change in the settings where information is provided and used. In particular, the quantitative growth of information has acted decisively on the emergence and methods of information services. Beginning with the increase in publications, a principal cause of the rise of library service, today the technical factors of diverse information media, mass production of information, and ease of information acquisition have combined with the specialization of user demands to force continual change in the field of information services.

These social factors of the information environment shifted library-management-centered service to user-centered service, converting libraries from a preservation-oriented to a use-oriented system, and thereby established the library as an important social organization. This in turn legitimized the public value of libraries and obliged them to make continual internal efforts toward specialization in order to fulfill their social responsibilities. The axis of change demanded in information service has thus moved from the mass production of information, through library automation, to web-based cooperation suited to the digital environment. A representative response now being carried out in information centers is collaborative digital reference service (CDRS), which provides more highly specialized information more easily on a worldwide scale.

CDRS is the most recent model of information service, developed in response to these various factors of change, and is used by many libraries to raise the efficiency of their information services. In particular, it is recognized as a method that combines the recently demanded concepts of "virtual" and "cooperative" service. It has been pointed out that, while the technical development of electronic reference service since the advance of information technology has reached a high level in accessibility and speed, cooperative service activities remain rather modest compared with other library activities (김석영 2002).

Begun with an awareness of this situation, this study surveys the collaborative digital reference services newly in use worldwide and seeks to determine their effectiveness. On this basis, it aims to obtain baseline data for providing a better CDRS in Korea, where the service is operated through cooperation among public libraries under the leadership of the National Library of Korea. It also seeks to assess the service's prospects by diagnosing the early state of a CDRS that, though limited to public libraries, has been conducted nationwide at the national level for just over two years. To this end, domestic and foreign CDRS cases are reviewed for the purposes of the study, and for Korea the National Library of Korea's "Ask a Librarian" (사서에게 물어보세요) service is taken as the concrete case.
The cases examined in this study, domestic and foreign alike, are cooperative operations led by national libraries; this is in order to grasp the overall state of collaborative reference service at the national level in each country.

2. Introduction and Development of Collaborative Digital Reference Service (CDRS)

2.1 Background

The advent of the web brought an explosive acceleration of Internet use and innovation in the processes and methods of providing and receiving information services. The digital environment produced an enormous quantitative increase in the production and delivery of information, yet in the efficient use of information through the selection of suitable material we experience the irony of "a drought of information amid a flood of information." Library digitization freed face-to-face reference service from the physical limits of place, but in creating the value of knowledge and information it has been judged somewhat insufficient to win full trust. Moreover, the rapidly developing information environment has turned users into skilled searchers whose information needs are increasingly segmented and specialized, prompting reference librarians to seek cooperation beyond their own regions. CDRS was devised precisely as librarians' response to such rapid development and change in the field. Its principal motivation is to answer users' specialized information needs with verified knowledge, drawn through cooperation across countries and library types from an explosively growing volume of information, while exploiting to the full the speed and accessibility of the web-based digital environment.

This background, together with the advantages of remote access for users now accustomed to obtaining information without visiting the library and of real-time service over the Internet, has led to worldwide adoption. Because this new service model operates within a cooperative (network) framework, it accords with the cooperation-oriented concept of information management, and its use can be expected to grow further.

2.2 Concept and Development

The basic concept of CDRS is a digital information service grounded in Internet technology: a specialized service performed in real time by electronic means, drawing on the cooperation of expert librarians across regional boundaries through networks. Because communication between librarian and user proceeds in real time by electronic means rather than face to face, its superior physical accessibility makes it an extension and improvement that overcomes the limits of traditional reference service.

Information service in the digital environment was first offered in the mid-1980s, and as it became web-based in the 1990s, user-centered service became more feasible. The cooperative real-time approach of CDRS, introduced from the late 1990s, is a service model that makes the most of the combined strengths of the Internet and information and communication technology. It therefore embraces the various names given to technology-based information services conceived as new forms of traditional library service, and is used interchangeably with the terms virtual reference service, real time service, digital reference service, internet reference service, and online reference service.

Virtual reference service, the conceptual basis of CDRS, is distinguished from traditional reference work by its electronic, real-time character and excels in openness, psychological accessibility, remoteness, and transcendence of time and space. CDRS aims to add to these strengths the specialization of service achieved through cooperation. In this system, an information request received at a particular library is resolved and delivered through a single interface on the Internet by subject specialist librarians or experts at cooperating information centers. Its operation therefore presupposes the formation of a national or international network.

The greatest advantage obtained through CDRS is that, unconstrained by geographic limits, the specialized resources of each cooperating library are used to the full, overcoming the resource limits of individual libraries and raising the level of the service provided. Secondary effects include savings in service costs at each information center, the activation of cooperative systems among regional libraries, and the ability to grasp concretely the trends in users' information needs.

Compared with digital reference service, the concept on which CDRS rests, the common ground is an information service built on information sources and web resources using the interactive potential of the Internet. The difference is that several libraries form cooperative relations and, making the most of each institution's expertise, process questions and answers through a single interface on the Internet (최은주, 이선희 2004).

Since the aims of CDRS include, above all, achieving and maintaining a high level of information service, it is also known by the name "collaborative online knowledge and information service." QuestionPoint, the first implemented model, built at the Library of Congress, set its goal as the provision of expert-level service, raising the standard of reference work a step further. Similar collaborative digital reference services modeled on it have since been adopted in many countries as an effective method of specialized service. A 2002 survey of cooperative web-based reference services reported that at that time such services were offered mainly by commercial organizations rather than libraries, and that library operation was still at an early stage (Curtis and Mann 2002).

3. Case Analysis of CDRS

3.1 Conduct of the Survey

The cases surveyed in this study are limited to collaborative digital reference services led by national libraries, including QuestionPoint, operated jointly by LC and OCLC. In the United States, for example, many services are led by state libraries or run through cooperative systems organized by type of library, but these were not included, because the purpose of this study is to obtain baseline data for the future development of CDRS in Korea, where the nationwide service is administered by the National Library of Korea. The foreign cases are thus drawn from seven countries: the United States, the United Kingdom, Canada, Australia, New Zealand, Norway, and Japan.

The Korean case is the service named "Ask a Librarian" (사서에게 물어보세요), operated through cooperation between the National Library of Korea and the public libraries of each region. The service began in May 2008, but until April 2009 it ran only as a pilot limited to the National Library of Korea; since May 2009 it has operated as a nationwide cooperative system in which only public libraries participate.
The Korean national library case is analyzed in more concrete detail. Accordingly, the survey of the Korean CDRS covers only questions processed at the National Library of Korea; questions handled directly by regional libraries are excluded, which is a limitation of this study. They were excluded because direct surveys at individual libraries were in many cases impossible through the author's channels of access, and because more than 90% of the questions received by "Ask a Librarian" are in practice referred to the National Library of Korea for processing.

For the foreign cases, given that the service is delivered over the Internet, CDRS operations were surveyed through the home pages of the national libraries concerned, and the synthesized results reflect the situation as of October 2010. For the domestic case, data were obtained from the home page of the National Library of Korea and from the librarians in charge, supplemented by interviews with, and data from, the librarians responsible at each regional central library. In particular, the analysis of "Ask a Librarian" questions draws on records kept by the service: users describe the content of their question at submission and, after receiving an answer, record their satisfaction with the result. The periods surveyed were May through December 2009, from the start of nationwide participation, and January through August 2010 -- eight months in each year.

3.2 Analysis of the Survey Results

The results are presented separately for the foreign and domestic cases. For the foreign cases, the operations currently performed in each country's CDRS are summarized. For the domestic case, the general situation is described, followed by analyses of the questions processed and of service satisfaction. The CDRS situation in the eight countries -- Korea and the seven foreign cases -- is summarized in Table 1.

<Table 1> Overview of collaborative digital reference services in eight countries

Country       Service name                        Started   Languages supported     Cooperation
USA           QuestionPoint                       2002      English, 24 languages   Libraries of all types
UK            Enquire: Ask a Librarian            2005      English                 Libraries/museums/archives
Canada        Ask Us a Question                   2006      English, French         Libraries/museums/archives/research
Australia     AskNow                              2005      English                 Public libraries/National Library of New Zealand
New Zealand   AnyQuestions                        2008      English, Maori          Libraries/Ministry of Education
Norway        Ask the Library (Biblioteksvar)     2008      English, Norwegian      Public and special libraries
Japan         Collaborative Reference Database    2005      English, Japanese       Public, special, and university libraries
Korea         Ask a Librarian                     2008      English, Korean         Public libraries

3.2.1 Foreign Cases

Each country's collaborative digital reference service has been strongly influenced by QuestionPoint, the first such system to be developed and operated, and in some cases actual operations run on software supplied by QuestionPoint. The foreign cases are: (1) the United States, (2) the United Kingdom, (3) Canada, (4) Australia, (5) New Zealand, (6) Norway, and (7) Japan.

(1) QuestionPoint

The Library of Congress had operated a CDRS since 1998 and, from early 2000 to September of that year, conducted several rounds of pilot tests with 16 cooperating libraries to judge the feasibility of the cooperative undertaking as an inter-library model. Having confirmed its feasibility, and after preparing the standardization of data elements and expanding network resources, the project was launched in 2002 as a worldwide collaborative digital reference service.

In January 2002 LC and OCLC, the two mainstays of the CDRS, agreed on joint development, with OCLC taking charge of managing participant profiles and providing technical support. In June 2006 the service was renamed QuestionPoint and, its use previously restricted to participating libraries, was opened to the general public, becoming in name and reality a worldwide collaborative system of expert-level information service.

The service consists of two parts: user questions with answers, and an archiving function to support future answers. In operation, when a user requests information through a member library, the question is sent, via a search of member profiles, to a library able to provide the answer; the answer from that library is delivered to the user through the library that forwarded the question. Unicode is used so that the world's languages are supported, and a Request Manager (RM) mediates between member profiles and answer delivery, acting as the hub between questions and answers.

About 1,000 libraries had joined as cooperating libraries at the time of the 2006 renaming; the number grows every year, and by type of library, university libraries participate in the greatest numbers. To perform "24/7 reference" efficiently for the users of cooperating libraries worldwide, four regional councils are operated, based at OCLC headquarters and covering the United States, EMEA (Europe, the Middle East, Africa, and India), and the Asia-Pacific region. The QuestionPoint User Council facilitates communication between members and headquarters, and user-group meetings at the ALA Annual Conference and Midwinter discuss user feedback and improvements in follow-up management.

(2) Enquire: Ask a Librarian

The UK's CDRS operates in parallel with the American QuestionPoint and, with government financial support, provides service through the integration of libraries, museums, and archives. The CDRS service Enquire has run since May 2005 through the People's Network of the Museums, Libraries and Archives Council (MLA). With MLA funding, OCLC's 24/7 service, and QuestionPoint software support, it aims at 24/7 service provision.

The People's Network is a cooperative web-resource operation financially supported by the Lottery and run by the MLA (http://www.mla.gov.uk). For the convenience of users -- some 35% of the UK population visits a public library at least once a month -- it aims to realize, in a ubiquitous spirit, a nationwide public library service, planning specialized public library service with high-speed access, through trained staff, to books and a variety of other information resources.

The People's Network home page offers three types of website, each providing a related, trustworthy specialized service, together sustaining millions of transactions every year: "Discover(!)", which helps users find the right public library and the treasure-trove of useful information obtainable through it; "Read( )", which supports reading clubs; and "Enquire(?)", which supports expert consultation providing answers to specialized questions.
Enquire is accessible Monday through Friday, 9am-5pm, through the cooperation of about 80 public libraries across England and Scotland; outside those hours sessions connect to one of the partner libraries in the United States. E-mail addresses are kept so that further answers can be added later where necessary, promptness is pursued so that questions can be put to use within the next working day, and answers are evaluated with a view to improving future service.

(3) Ask Us a Question

Virtual Reference Canada (VRC), a national-level digital reference service run on the CDRS model, is organized to cooperate not only with information centers of all kinds within Canada but also with archives and museums, making it a notable special case embracing diverse forms of information resources. Library and Archives Canada (LAC) has led the networking of digital reference since 1998; the service is supported in English and French and, through membership in QuestionPoint, has been extended to international scope.

Under the name Ask Us a Question, questions are received through the VRC home page, composed according to the submission form required by each cooperating institution. A received question is automatically routed by a matching algorithm to an appropriate institution; the answer obtained is posted on the home page and delivered to the user through the institution that received the question. Questions are also accepted by telephone, fax, or e-mail, and specialized questions may be received and answered through consultations with experts by appointment.

(4) AskNow

This CDRS operation is led by National and State Libraries Australasia (NSLA): eleven national and state libraries, including the National Library of New Zealand, and thirteen public libraries cooperate. About 19 librarians from Australia and New Zealand, working in teams of three, conduct a specialized service designed around real-time interaction between librarian and user. Questions are submitted on a web form and handled in chat with a librarian able to answer, or by e-mail where necessary. Software supplied by OCLC's QuestionPoint allows fuller answers to be provided, and in chat the system lets the librarian and user share the same screen.

Chat hours are coordinated among the cooperating libraries so that real-time service is efficiently available to users on weekdays, and the service aims at user satisfaction on the basis of users' evaluations. As of 2010 it is operated through cooperation among public libraries; middle and high school students are its heaviest users, and no decision has yet been made on extending it to other types of library.

(5) AnyQuestions

This New Zealand service, under the subtitle "real time, real people, real help with your homework," is run jointly by the National Library of New Zealand and the Ministry of Education. With the support of the school library association it covers all of New Zealand and is provided free to students. Separate sites are run for parents, teachers, and library staff, and access is possible from schools, libraries, and homes. In particular, because the service covers the school curriculum, parents make heavy use of it at home to guide their children's study.

The system is offered in English and the indigenous Maori language; service is provided through online connection, and where fuller consultation is needed, kidsline and youthline are maintained separately so that counselling can be requested by telephone.

(6) Ask the Library (Biblioteksvar / Bibliotekenes svartjeneste)

This national-level cooperative service is led by the National Library of Norway; as of October 2010 about 200 librarians at 55 libraries nationwide handle questions received by e-mail and text chat. Run through the cooperation of special and public libraries, the operation is treated as part of the digital library work of the Norwegian Archive, Library and Museum Authority (ABM-utvikling), from which it receives financial support. Questions submitted by Internet chat, text message, or e-mail are answered within the next working day, promptness being the foundation of the service. Factual questions are answered by text message, while somewhat specialized questions are supplemented by referral service or by expert consultation by e-mail on a set web form. Several public libraries in countries such as Denmark, Sweden, and Finland (e.g., Helsinki) now provide similar services on the same system.

(7) Collaborative Reference Database

In Japan, the National Diet Library leads a national-level collaborative digital reference service (CDRS) in which, as of October 2010, a total of 524 libraries participate, including 10 National Diet Library units, 335 public libraries, 137 university libraries, and 41 special libraries. Implementation was planned in August 2002, and after rounds of briefings, public hearings, and system development, the service has operated since April 2005.

Participating libraries' profiles are compiled into a database shared over the Internet among the cooperating libraries, and questions are accumulated in a database from which answers are resolved. The accumulation of received reference questions in a database to raise the efficiency of cooperative reference service distinguishes this operation from those of other countries. Questions are received separately from participating libraries and from general users, and the range of supported materials is broad, extending to expert consultation on specialized themes and to special materials including private collections. Consultations can be requested by e-mail or telephone, and since September 21, 2010, cooperative reference requests can also be made through Twitter (http://twitter.com/crd_twitter).

3.2.2 The Domestic Case: The National Library of Korea

As for the operation of CDRS in Korea, it began with private institutions (이선희, 최희윤 2004, 336); among public institutions, QuestionPoint+, developed by KISTI in 2004, is currently suspended, and the Korean CDRS model developed by KERIS in the same year was never implemented. As of August 2010 the operating service is "Ask a Librarian" (사서에게 물어보세요), run under the National Library of Korea in a cooperative framework limited to public libraries. For the domestic case, beyond the general situation, the questions and the results of their processing were analyzed on the basis of the question content recorded at submission and the processing records. The question analyses comprise an "analysis of questions and answers by year and age," an "analysis of questions by subject," and an "analysis of questions by purpose"; the processing analysis is a "satisfaction analysis" based on the responses users themselves entered after receiving answers.
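Before turning to the domestic results, the routing step the foreign services above share -- QuestionPoint's Request Manager and VRC's matching algorithm both compare a question against member profiles and forward it to a library able to answer, escalating otherwise -- can be sketched in miniature as follows (Python; the member names and subject tags are invented for illustration, and real member profiles are far richer):

# Toy profile-based question routing; all names and tags are invented.
MEMBER_PROFILES = {
    "Library A": {"music", "art"},
    "Library B": {"law", "government"},
    "Library C": {"medicine", "chemistry"},
}

def route_question(subject_tags, fallback="National Library"):
    """Forward a question to the member whose profile best overlaps
    its subject tags; escalate to a fallback when nothing matches."""
    best, overlap = None, 0
    for member, profile in MEMBER_PROFILES.items():
        n = len(profile & subject_tags)
        if n > overlap:
            best, overlap = member, n
    return best if best else fallback

print(route_question({"music"}))      # Library A
print(route_question({"astronomy"}))  # National Library (escalated)

The escalation default parallels the Korean referral chain described below, in which questions a unit library cannot answer are passed up to the National Library of Korea.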
• Ask a Librarian (사서에게 물어보세요)

(1) Current State

In Korea, preparations for this service began in September 2006 with a questionnaire survey of public libraries nationwide and workshops of the regional central libraries; full-scale system development began in May 2007. After the system was completed in January 2008, the service ran only at the National Library of Korea, and from May 15 through the end of December 2008 it was operated as a pilot extended to the regional central libraries nationwide. From February 2009 participation was expanded to the whole country, and as of August 2010 a total of 343 public libraries nationwide take part in the cooperative service.

The chain of cooperation runs from the unit library to the regional central library to the National Library of Korea. The service comprises three areas -- a "question-and-answer service," "reference sources," and "reading information" -- and rests on the free, real-time provision of knowledge and scholarly information. Of the three, "reference sources" is a research-type service that provides answers by pathfinder, the area through which CDRS seeks to realize its aim of specialized information service. A question posted on the "Ask a Librarian" home page is received and answered by the library closest to the user, and transferred to the National Library of Korea when it cannot be handled. Service is also possible face to face with a librarian, by telephone, or by post, and answers and other useful information are stored in a knowledge-information database for use in future answers.

As of August 2010, 343 public libraries cooperate in the CDRS, including Jeongdok Library, the representative library of Seoul; participation by region is shown in Table 2. The participation rates shown are relative to all general public libraries in each region, excluding special libraries such as libraries for the disabled.

<Table 2> Regional participation of Korean CDRS cooperating libraries

Region      Participating libraries                      Number (%)
Seoul       Jeongdok Library and others                  50 (53.6)
Busan       Busan Metropolitan Simin Library and others  19 (61.2)
Incheon     Michuhol Library and others                  12 (42.9)
Daegu       Daegu Metropolitan Simin Library and others  13 (56.1)
Daejeon     Hanbat Library and others                    16 (66.7)
Gwangju     Gwangju City Library and others               9 (90.0)
Ulsan       Ulsan Nambu Library and others                4 (36.4)
Gyeonggi    Provincial Gwacheon Library and others       27 (28.1)
Gangwon     Wonju Library and others                     30 (57.7)
Gyeongnam   Changwon Library and others                  30 (55.6)
Gyeongbuk   Provincial Gumi Library and others           32 (81.2)
Jeonnam     Naju Public Library and others               27 (46.5)
Jeonbuk     Jeonju City Library and others                9 (19.1)
Chungnam    Cheonan Jungang Library and others           29 (54.8)
Chungbuk    Chungbuk Jungang Library and others          15 (44.1)
Jeju        Halla Library and others                     21 (40.3)
Total                                                   343 (52.1)

The regional public libraries currently cooperating in the CDRS exceed half (52.1%) of all general public libraries nationwide, indicating a generally positive response to cooperation in the service. The region with the highest participation is Gwangju, considerably above the average (90.0%); the lowest is Jeonbuk, rather below the average (19.1%). Seven metropolitan regions fall below the national average: Jeonbuk, followed by Gyeonggi, Ulsan, Jeonnam, Chungbuk, Incheon, and Jeju.

<Figure 1> Analysis of questions by year and age, 2009-2010 (figure)

(2) Analysis by Year and Age

The analysis of questions and answers by year and by age was intended to show users' awareness of the service and yearly changes in its user base; the results are shown in Figure 1. All questions received and processed at the National Library of Korea, directly or by transfer, were included; 2008 was excluded because in that first year the service ran only as a pilot at the National Library of Korea. Since public library service rests on a principle of equality irrespective of specialization, age, or education, this item analyzes all CDRS users regardless of the specialization of their questions.

To identify the main user groups, ages were divided into children of elementary school age and below (13 and under), middle and high school students (14-19), university students and working adults (20-39), middle-aged users (40-59), and older users (60 and over).

By year, use of the National Library of Korea's CDRS rose from 756 questions in 2009, the first year of nationwide operation, to 1,154 in 2010, an increase of about 52.9%. By age, the main users are the younger adult group representing university students and working adults, averaging above 82% across the two years. By contrast, teenagers of middle and high school age account for less than 1% in both years, a very low level of use. Since this pattern holds in both years, the service can be seen as one used mainly by adults from university age upward.

(3) Analysis of Questions by Subject

Questions were analyzed by subject in order to gauge their specialization, the background of CDRS being the enhancement of the specialization of library service. The analysis was limited to questions in the "reference sources" area, answered by pathfinder. Because such questions are answered by pathfinder, they constitute bibliographic service -- classed among the specialized types of library service -- their subjects are comparatively clear, and they therefore provide information for judging the service's potential as the specialized service CDRS intends. Their content is described concretely at submission, giving grounds for analyzing both subject and degree of specialization.

<Figure 2> Analysis of questions by subject, 2009/2010 (figure)

The subject categories shown in Figure 2 were compiled from the concrete content of the "reference sources" questions and divided into seven subjects by discipline, including general works. Because only the "reference sources" area was analyzed, the questions numbered 80 in 2009 and 179 in 2010, an average of 13.7% of all questions -- a figure that itself gives some indication of the degree of specialization of the CDRS.

Questions in the "reference sources" area thus average less than 15% of all questions, so specialized use of CDRS in Korea is still modest. However, the share rose from 10.7% of all questions in 2009 to 15.7% in 2010, a somewhat improved showing that suggests further progress is possible through publicity and improvement of the service. The question content recorded at submission also made it possible to judge the specialization of questions apart from their subjects, and here too questions involving real specialization proved very few.
(2) Analysis by Year and Age
The analysis of the questions and answers handled in CDRS by year and by age sought to determine users' awareness of the service and the year-by-year change in its user base; the results are shown in Figure 1 (Figure 1: Analysis by year and age, 2009-2010). The survey covered all questions received and processed by the National Library of Korea, whether directly or by transfer; because in 2008, the service's first year, it ran only as a pilot limited to the National Library of Korea, only 2009 and 2010 were included. Since public library use rests on a principle of equality regardless of expertise, age, or education, this item reports on all users of CDRS irrespective of whether their questions were specialized. For the age analysis, intended to identify the main user groups, results were divided into children of elementary school age and younger (13 and under), middle and high school students (14-19), working adults including university students (20-39), the middle-aged (40-59), and the elderly (60 and over).
By year, use of the National Library of Korea CDRS grew from a total of 756 cases in 2009, the first year of nationwide operation, to 1,154 in 2010, an increase of about 52.9%. By age, the main users are young adults (university students and working adults), exceeding 82% on average across the two years. In contrast, teenage users in the middle and high school bracket stayed below 1% in both years, an extremely low level of use. Because this age pattern is identical in both years, the service's main user base is adults of university age and above.
(3) Analysis of Questions by Subject
Received questions were analyzed by subject in order to gauge their degree of specialization, since the background to introducing CDRS is the enhancement of the professionalism of library service. To examine this, the subject analysis was limited to questions under "reference sources," which are answered with pathfinders. Because such questions are answered with pathfinders, their subjects are relatively clear and they amount to bibliographic service; they therefore provide information for gauging the potential for expert service, the core intention behind the program. The questions included here are described in concrete terms at submission, so they can serve as a basis for subject analysis and for judging specialization: pathfinder-based answering is bibliographic service, which among the types of library service counts as expert service.
The subject divisions shown in Figure 2 (Figure 2: Analysis of questions by subject, 2009/2010) aggregate the concrete content of "reference sources" questions, subdivided into seven subjects by academic field, beginning with general works. Because only the "reference sources" area of the questions received by 사서에게 물어보세요 was analyzed, the numbers analyzed were 80 questions in 2009 and 179 in 2010, on average about 13.7% of all questions. This level offers at least some indication of the degree of specialization of the CDRS.
"Reference sources" questions thus fall short of 15% of all questions on average, so specialized use of CDRS in Korea is still low. However, whereas 2009 stood at only 10.7% of all questions, 2010 reached 15.7%, a modest improvement that suggests future progress through promotion and refinement of the service. The concrete question content recorded at submission also made it possible to judge, beyond the subject analysis, whether the questions themselves were specialized; here too, specialized questions were extremely few. That is, there were few questions in science and technology or on specialized humanities subjects such as philosophy and religion, and even in the social sciences most questions concerned current affairs and general social topics.
(4) Analysis of Questions by Purpose
The purpose analysis, like the subject analysis, aimed to determine the degree of specialized use of CDRS. Purposes were classified as interest/hobby, research project, and work performance; the results are shown in Figure 3 (Figure 3: Analysis of questions by purpose, 2009/2010). This analysis covered all questions received by 사서에게 물어보세요, with 2009 and 2010 shown separately for comparison.
The background to implementing CDRS includes the intention of attaining professionalism in information service, a basic aim shared by libraries worldwide that run the service. Investigating question purposes to confirm this, responses marked "other" were fairly numerous (45.7%); among stated purposes, "carrying out a research project" ranked first, though at a level (30.8%) that is not very high. Next came questions asked out of "interest or hobby" (12.4%), with "work performance" (11.1%) last. If, however, the questions classified under research projects and work performance are regarded as specialized use, their combined level slightly exceeds 40%. Viewed by purpose, then, 사서에게 물어보세요 is being used for somewhat specialized ends.
(5) Analysis of Service Satisfaction
The satisfaction of actual users with a library's services is a yardstick of future perceptions of the library and of the will to use it. On this premise, user satisfaction with the domestic CDRS was investigated through the experience of 사서에게 물어보세요, with the results shown in Figure 4 (Figure 4: Satisfaction with answer processing, 2009/2010); the percentages in parentheses give each satisfaction level relative to all service users.
Because recording satisfaction was not compulsory, non-responses numbered 411 (54.3%) in 2009 and 622 (53.8%) in 2010, on average 54.3% of all questions. Figure 4 analyzes the 45.7% who responded; among them, users answering "very satisfied" and "satisfied" were 29.4% and 9.3% respectively, so roughly 40% of users were satisfied with the handling of their answers. Moreover, on every satisfaction measure 2009 and 2010 show nearly identical patterns, and "insufficient" and "very insufficient" responses stay below 5%, so the satisfaction results give positive expectations for the future effect of the service.
4. Conclusion
Library information service carries the duty of meeting users' information needs through continual change and improvement in response to the environment inside and outside the library. Collaborative digital reference service (CDRS), developed as part of that duty, is a system that embraces the digital environment, real-time operation, and cooperative service, and its adoption is accelerating worldwide.
This study was conducted to obtain basic data for improving the CDRS operated under the leadership of the National Library of Korea in cooperation among domestic public libraries. It also aimed, by diagnosing the early status of a CDRS that, though limited to public libraries, has run nationwide at the national level for barely two years, to clarify the service's potential for development. Overseas CDRS cases were included in the survey: the national libraries of seven countries where the national library leads a cooperative service, namely the United States, the United Kingdom, Canada, Australia, New Zealand, Norway, and Japan.
At the National Library of Korea the service was limited to the NLK for four months after its start in January 2008, extended from May 2008 to cooperation among the 16 regional central libraries nationwide, and from February 2009 to public libraries generally; as of August 2010, a total of 343 domestic public libraries participate. Question receipts rose about 53% in 2010 over 2009, but the user base is uneven, concentrated among those in their 20s and 30s. That most of their questions serve work or research purposes raises the expectation that the professionalism CDRS intends can be secured, yet the subject analysis showed that scholarly questions in fields such as science and technology, philosophy, and religion remain few. As for satisfaction, the many non-responses somewhat limit the reliability of the results, but in every year since the service began the "insufficient" categories have stayed below 5% and the "satisfied" categories above the 40% level; on this basis, higher levels can be expected through future promotion and effort.
Taken together, these results confirm that CDRS in Korea enjoys generally high use and that sustained user satisfaction is driving usage upward. At the same time, obstacles to efficient domestic CDRS operation are also evident: a severely unbalanced user base, few specialized questions, and a very low rate of in-house handling of the questions received at each unit library. On the basis of these findings, this study concludes by proposing the following improvements for a more advanced domestic CDRS:
• Expand the cooperation currently limited to public libraries to university and special libraries, so as to draw specialized questions on more diverse subjects.
• To remedy the extremely low use by middle and high school students, build cooperative links with school libraries; for frontline teachers and parents, establish a channel connecting directly to the National Library of Korea, which holds subject specialist librarians, so that CDRS can be expected to play a role in public education.
• Strengthen continuing education for unit library librarians and, especially at the regional central libraries and the National Library of Korea, secure subject specialist librarians so as to draw more specialized questions.
• Expand cooperative relations with library-related institutions such as museums, cultural centers, and archives, so that libraries can perform a broader range of functions.
• Use newly developed communication technologies, including Twitter and other social network tools, to secure diverse access channels to users, so that the library performs its function as a social institution more dynamically.
Beyond these proposals, it is hoped that through more active promotion and development of 사서에게 물어보세요, still only in its early stage of operation, CDRS will contribute to realizing user-centered information service, a main axis of the service concept.
References
[1] National Library of Korea. 2008. Annual Report of National Library of Korea, 2007. Seoul: National Library of Korea.
[2] Kim, Sukyoung. 2002. "Recent developments and trends of digital reference service." Journal of the Korean Society for Information Management, 19(4): 213-232.
[3] Kim, Seunghee. 2005. "A study for the online digital reference service." Journal of the Korean Society for Information Management, 22(1): 249-265.
[4] Ministry of Culture, Sports and Tourism, ed. 2008. Study on How to Proliferate Library Cooperation System through the Pilot Project. Seoul: Planning Group for Library and Information Policy.
[5] Lee, Seon-Hee. 2007. "A study on the case of CDRS and for development." Digital Library, 48: 30-44.
[6] Lee, Seon-Hee & Choi, Hee-Yoon. 2004. "A study on the implementation of collaborative digital reference service using global network." Journal of Korean Library and Information Science Society, 36(4): 329-347.
[7] Cho, Mi Seon et al. 2006. "A study on foreign cases for cooperative digital reference." Doseogwan, 61(1): 115-163.
[8] Choi, Eun-Ju & Lee, Seon-Hee. 2004. "A study on the collaborative digital reference service: Focused on the implementation of QuestionPoint at KISTI." Journal of the Korean Society for Information Management, 21(2): 69-87.
[9] Curtis, Susan & Mann, Barbara. 2002. "Cooperative reference: Is there a consortium model?" Reference and User Services Quarterly, 41(4): 344-359.
[10] 사서에게 물어보세요 (Ask a Librarian). [online]. [cited 2010.12.1].
[11] AnyQuestion. [online]. [cited 2010.10.24].
[12] Ask the Library. [online]. [cited 2010.10.30].
[13] AskNow. [online]. [cited 2010.10.30].
[14] Ask Us a Question. [online]. [cited 2010.11.4].
[15] Collaborative Reference Database. [online]. [cited 2010.11.21].
[16] Enquire; Ask a Librarian. [online]. [cited 2010.11.21].
[17] QuestionPoint. [online]. [cited 2010.10.24].
(References [1]-[8] are English translations / romanizations of references originally written in Korean.)
work_5fgw74mmenhn7cj2v22le7vd5y ----
BAJO PALABRA. Revista de Filosofía, II Época, Nº 10 (2015)
The Journal Bajo Palabra successfully diffuses the authors' research results mainly through the Institutional Repository of the Humanities Library at the Autonomous University of Madrid, as well as through different databases, catalogues, institutional repositories, specialized blogs, etc. Bajo Palabra has a good ranking in several quality editorial indexes:
• LATINDEX
• DICE. Diffusion and Editorial Quality of Spanish Journals of the Humanities and of Social and Legal Sciences
• BDDOC CSIC: Journals of Social Sciences and Humanities
• CIRC: Integrated Classification of Scientific Journals
• ANEP: The National Evaluation and Foresight Agency. ANEP category: B
• ISOC, Social Sciences and Humanities
• RESH. Spanish Journals of the Social Sciences and Humanities
• Ulrich's Periodicals Directory
• CECIES. Journals of Latin-American Thought and Studies
• I2OR. International Institute of Organized Research
• DRJI. Directory of Research Journals Indexing
• IN-RECH. Impact index of Spanish journals in the Human Sciences
• MIAR (a system for quantitatively measuring the visibility of social science journals based on their presence in different types of databases; ICDS of BAJO PALABRA: 4.230)
• The Philosopher's Index
It is freely accessible through:
• Institutional Repository of the UAM, Biblos-e Archivo
• DIALNET, web portal for the diffusion of Spanish scientific production
• Universia Library
• E-REVISTAS. Plataforma de Open Access de Revistas Científicas Electrónicas (CSIC)
• REDIB. Red Iberoamericana de Innovación y Conocimiento Científico
• REBIUN. Network of University Libraries
• Virtual Library of Biotechnology for the Americas
• AL-DIA. Specialized journals
• COPAC. National, Academic and Specialist Library Catalogue (United Kingdom)
• ZDB. Deutsche Digitale Bibliothek (Germany)
• SUDOC Catalogue (France)
• OCLC WorldCat
• DULCINEA. SHERPA/RoMEO
• EBSCO's database products
• DOAJ, Directory of Open Access Journals
It has been quoted in multiple blogs and websites:
• CANAL BIBLOS: blog of the Library and Archive of the UAM
• LA CRIÉE: PÉRIODIQUES EN LIGNE
• HISPANA. Directory and collector of digital resources
• ESSENTIAL PHILOSOPHICAL LIBRARY
And thanks to the excellent journal-exchange service provided by the Humanities Library of the Autonomous University of Madrid, issues of Bajo Palabra can be consulted in numerous libraries; the journal is currently conducting an exchange with more than 40 different humanities journals.
*Bajo Palabra. Journal of Philosophy is indexed in the SUDOC catalogue (France), the new UAM electronic journals portal, the Red Iberoamericana de Innovación y Conocimiento Científico (REDIB), the International Institute of Organized Research (I2OR), and OCLC WorldCat. The journal is currently under the indexing process with Philosophy Journals Index, CARHUS, SCOPUS, and the Arts and Humanities Citation Index (ISI).
work_5hjvkmvb7vbodpfjhfauf2ikf4 ----
Ius Humani | revista de derecho. Volume 7, 2018. Published in December 2018. Annual frequency. ISSN 1390-440X.
Ius Humani, Revista de derecho is a forum open to researchers from all over the world, in all languages, publishing original studies on the rights of the human being (natural, human, or constitutional) and on the most effective procedures for their protection, from the perspective both of philosophy and of the higher norms of the legal order. The print version of the journal has appeared annually since 2016 and is printed at the end of the period, under ISSN 1390-440X. The digital version carries e-ISSN 1390-7794 and operates as a continuous publication: contributions are published as soon as they are approved. The journal is indexed in multiple systems, such as Latindex, Academic Search Premier, Advance Sciences Index, Cosmos Impact Factor, Crossref, Dialnet, DOAJ, EBSCO Legal Source, Emerging Sources Citation Index (ESCI), ERIH Plus, Europub, Fuente Académica, Global Impact Factor, Google Scholar, Heinonline, I2OR, Infobase Index, IPI, ISIFI, JIFactor, JournalGuide, JournalTOCs, Miar, Microsoft Academic, OAJI, OCLC, REDIB, Saif, Ulrich's, VLex, and in many other catalogues and portals (Copac, Sudoc, ZDB, JournalGuide, etc.).
www.iushumani.org
For exchanges and subscriptions write to: revista@derecho.uhemisferios.edu.ec
Ius Humani: Revista de derecho. Faculty of Legal and Political Sciences, Universidad de los Hemisferios, www.uhemisferios.edu.ec
EDITION: Servicio de Publicaciones de la Universidad de los Hemisferios (SPUH), revista@derecho.uhemisferios.edu.ec
Address: Universidad de los Hemisferios / Paseo de la Universidad Nro.
300 y Juan Díaz (Urbanización Iñaquito Alto) / Quito, Ecuador. Postal code: EC170135. ISSN: 1390-440X. Quito, Ecuador. Print run: 100 copies.
Text revision and proofreading: Esteban Cajiao Brito. Layout and typesetting: Mario de la Cruz.
Ius Humani | revista de derecho. Universidad de Los Hemisferios, Quito, Ecuador.
Rector: Diego Alejandro Jaramillo.
Dean of the Academic Unit of Legal and Political Sciences: Dr. René Bedón Garzón.
Journal Director: Dr. Juan Carlos Riofrío Martínez-Villalba, Univ. de los Hemisferios (Quito, Ecuador).
Scientific Committee: Dr. Pedro Rivas Palá, Universidad de la Coruña (La Coruña, Spain) and Universidad Austral (Buenos Aires, Argentina); Dr. Hernán A. Olano García, Universidad de la Sabana (Bogotá, Colombia); Dr. Carlos Hakansson Nieto, Universidad de Piura (Piura, Peru); Dr. Hernán Pérez Loose, Universidad Católica Santiago de Guayaquil (Ecuador); Dr. Luis Castillo Córdova, Universidad de Piura (Piura, Peru).
Editorial Committee: D.a Maria Cimino, Università degli Studi di Napoli Parthenope (Naples, Italy); Dr. Julián Mora Aliseda, Universidad de Extremadura (Extremadura, Spain); Ph.D. Gustavo Arosemena, Maastricht University (Maastricht, Netherlands); Dr. Alfredo Larrea Falcony, Universidad de los Hemisferios (Quito, Ecuador); Ph.D. Juan Cianciardo, Universidad de Navarra (Pamplona, Spain); Dr. Jaime Flor Rubianes, Pontificia Universidad Católica del Ecuador (Quito, Ecuador); Mgr. María Teresa Riofrío M.-V., Univ. de Villanueva (Madrid, Spain); Dr. Edgardo Falconi Palacios, Universidad Central del Ecuador (Quito, Ecuador).
Contents
JUAN LARREA HOLGUÍN INTERNATIONAL PRIZE
Ius Resistendi: Derechos de participación, garantismo, resistencia y represión a partir de las definiciones de Juan Larrea Holguín [Ius Resistendi: rights of participation, guarantees, resistance, and repression based on Juan Larrea Holguín's definitions]. Gabriel Hidalgo Andrade, 249-324
ARTICLES
El delito de desaparición forzada de personas en América Latina / The crime of forced disappearance of people in Latin America. Isaac Marcelo Basaure Miranda, 9-36
A comparative study of separability of arbitration clause under main contract of LICA and the UNCITRAL model law / Una comparación de la separabilidad de la cláusula arbitral del contrato LICA y de la ley modelo CNUDMI. Atefeh Darami Zadeh, Shapur Farhangpur, 37-52
The jurisprudential foundations of Iran's criminal policy concerning the spread of prostitution covered in the 2013 Penal Code / Fundamentos jurisprudenciales de la política penal de Irán relacionados con la propagación de la prostitución del código penal de 2013. Ardavan Arzhang, Abozar Fadakar Davarani, Mohammad Nozari Ferdowsieh, 53-66
Is the Current System of Criminal Procedure of Iran Efficient? / ¿Es eficiente el actual sistema procesal penal de Irán?
Mehdi Fazli, Jalaleddin Ghiasi, Mohammad Khalil Salehi, 67-87
El derecho a la igualdad en el ámbito educativo / Right to Equality in Education. Mercedes Navarro Cejas, Gerardo Ramos Serpa, Magda Cejas Martínez, 89-104
The principles of the dependent and independent nationality of spouses: advantages and disadvantages / Los principios de la nacionalidad dependiente e independiente de los cónyuges: ventajas y desventajas. Seyyed Mohsen Hashemi-Nasab Zavareh, Elham Ghaffarian, Naser Ghamkhar, 105-121
Participação democrática e cidadã como mecanismo de superação da crise ecológica no Brasil / Democratic participation and citizens as a mechanism for overcoming the ecological crisis in the Brazilian legal context. Cristiane Velasque, Thiago Germano Álvares Da Silva, Wambert Gomes Di Lorenzo, 123-144
Regulations of determining law governing to arbitrability / La regulación sobre la ley que rige el arbitraje. Ardalan Haghpanah, Nejad Ali Almasi, 145-166
El rol del ejecutivo en la producción normativa: el Ecuador entre 1979 y 2016 / The role of the executive in normative production: Ecuador between 1979 and 2016. Santiago Llanos Escobar, Wladimir García Vinueza, 167-197
Diferencias planeadas: la Asignación Universal por Hijo en Argentina desde una perspectiva de derechos humanos / Planned Differences: Argentina's Universal Child Allowance from a human rights perspective. Horacio Javier Etchichury, 199-222
Los principios de aplicación de los derechos en la Constitución ecuatoriana: una mirada desde la doctrina y la jurisprudencia / The Principles of Implementation of Rights in the Ecuadorian Constitution: a view from the doctrine and jurisprudence. Esteban Javier Polo Pazmiño, 223-247
work_5hzpw526evexbh6fqmx2jaapna ----
Journal of Educational Media & Library Sciences, 42 : 2 (December 2004) : 167-174
Access to Basque Sound Recordings: A Unique Minimal Level Cataloging Project
Vicki Toy-Smith, Catalog Librarian, Getchell Library, University of Nevada, Reno, Nevada, U.S.A.
Abstract
It is important for catalogers to be able to streamline methods of cataloging linguistically-unique titles in their library's collection of resources. One such group of items at the University of Nevada, Reno that needed to be cataloged included several hundred Basque sound recordings. It is also important to have local cataloging procedures that are readily available, accessible, current, and understandable. By combining a new local procedure with minimal level cataloging, the result was a streamlined and innovative cataloging project that provided bibliographic access to an unusual collection of Basque sound recordings.
Keywords: Basque sound recordings; Cataloging projects; Minimal level cataloging
There are a variety of backlogs in Catalog Departments that often remain on the back burner as far as priorities are concerned. This is primarily due to lack of personnel, expertise, and time. In addition, with initiatives shifting in technical services units and pressing demands placed on our time, there are some cataloging projects that will not be addressed unless deemed to be important in the library's mission. XML, metadata, and Dublin Core crosswalks are among the new priorities in Technical Services Departments.
The Problem
At the University of Nevada, Reno, one such proposal would not have been undertaken unless a minimal level cataloging project had been implemented during the 2001-2002 academic year. It came to my attention that there was a growing backlog of sound recordings in my Catalog Department.
A few years ago, several hundred sound recordings in the Basque language were ordered by the Basque Studies Library staff. These sound recordings comprise part of a linguistically-unique group of materials which will ultimately be housed in UNR's Basque Studies Library. I found that doing full level (Encoding Level "I" in OCLC's Bibliographic Formats and Standards) [1] bibliographic records would add 40 to 50 minutes of processing time per item. Thus, full level cataloging of such records would lead to a larger amount of time spent processing each title in the collection.
It has been noted by non-print catalogers and cataloging professors alike that sound recording cataloging is complex and time-consuming. There are many rules and rule interpretations about uniform titles and access considerations (e.g., composers and performers) that come into play. For instance, Nancy Olson states in her book, Cataloging of Audiovisual Materials and Other Special Materials: "There are special, and lengthy, rules and rule interpretations for the main entry of sound recording. Rule 21.23D splits recordings into those 'in which the participation of the performer(s) goes beyond that of performance, execution or interpretation' and those in which it does not." [2] She lists various excerpts from some of the rules and rule interpretations from chapters 6 and 21 of the Anglo-American Cataloging Rules, Revised edition (AACR2R) and the Library of Congress Rule Interpretations, along with examples of how certain sound recordings should be cataloged.
Considering the fact that there are so many rules and rule interpretations concerning sound recordings, I tried to include only relevant information in the procedures in order to clear up any confusion. Therefore, the library technicians would have a straightforward cataloging procedure to follow. However, I did include some pertinent Library of Congress cataloging rules and rule interpretations that would be necessary if one were inputting a full level (Encoding Level "I") sound recording record into OCLC.
There were approximately 300-400 sound recordings in the departmental backlog. The project was begun during the 2001 academic year here at the University of Nevada, Reno. I began by examining the OCLC input standards for minimal level (Encoding Level "K") [3] records. In addition, I inquired about which library technicians in the unit had any time available to undertake such a project. I have written up a local procedure for the department's technical library assistants to follow. More information has been appended to the local procedure over time. A more detailed procedure can only serve as an enhanced resource tool for use by the Catalog Department staff. During the course of the project I have found new subject categories to add to the local procedure. In addition, corresponding LC call numbers were added. Currently, there are 11 categories of music included in the local cataloging procedure.
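The time stakes behind the encoding-level choice can be made concrete with a rough calculation that simply reuses the figures given above (a 300-400 item backlog, an extra 40 to 50 minutes per item for full level records); taking the midpoints is my assumption, not the article's.

# Rough estimate of the extra staff time full level (Encoding Level "I")
# cataloging would have added, using the article's own ranges.
items = 350                  # midpoint of the 300-400 item backlog
extra_minutes_per_item = 45  # midpoint of the 40-50 minute range
extra_hours = items * extra_minutes_per_item / 60
print(f"~{extra_hours:.0f} additional hours")  # about 262 additional hours

At roughly 260 additional staff hours, the case for minimal level (Encoding Level "K") records over full level records is easy to see.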
Preliminary Steps
I have trained one technical library assistant, who also does copy cataloging of non-print materials, to undertake the Basque sound recording minimal level (Encoding Level "K" in the OCLC Bibliographic Formats and Standards) project. It took a couple of training sessions to explain the various fields that relate to sound recording records and the various rules presented in Chapters 6 and 21 of the Anglo-American Cataloging Rules (AACR2R). At first the technical library assistant was concerned about the fact that the sound recordings were in the Basque language. In addition, she was hesitant about starting the project since it would incorporate original cataloging along with handling a format, sound recordings, with which she was unfamiliar. However, after several months on the project, she became used to processing the sound recordings and has been able to catalog 150 bibliographic records this past year.
Due to a need for cataloging quality and consistency, I continue to revise the technical library assistant's records. The records are loaded into the OCLC WorldCat database and, in addition, our local online catalog (III). There are approximately 200 more sound recordings that need to be processed.
The project was implemented using standard Library of Congress Classification Schedules and the Library of Congress Subject Headings. The categories used to describe the majority of the sound recordings included folk music and popular music. The various classical sound discs encompass a wide range of musical subjects; these include choral, sacred, organ music, etc. I am planning to group like categories of sound discs in the departmental backlog so that the library technician will be able to catalog similar items of a specific type at one time. The technical library assistant has been encouraged to listen to the sound recording for several minutes if the piece-in-hand does not appear to fit one of the categories that are listed in the procedure. If the item is too complicated for the library technician to catalog, I have requested that she refer the item to me for further investigation.
Local Procedure
The following part of the article addresses the purposes and, in addition, the procedure to which the Catalog Department staff could refer when inputting minimal level Basque sound recording records into OCLC. The procedure is still being supplemented as various new categories of music are located in our collection.
Local Cataloging Procedures for Basque Sound Recordings - Original Records
Basic Procedure: The following guidelines apply to inputting OCLC member records for Basque sound recordings.
Fixed Fields and Variable Fields
Fixed Fields (these fixed fields are required for this project)
Desc: a  Elvl: K  DtSt:  Lang:  Ctry:  Dates:
Variable Fields
040 xxx $c xxx
007 s $b d $d f $e s $f n $g g $h n $i n $m e $n u
028 00 (publisher number) $b publisher
090* See LC call number & corresponding LC subject headings for suggested 090 fields.
049 xxxx
100 1 Author (This can be a composer or an author).
245 1_ Title $h [sound recording] : $b other title information / $c principal composer or performer.
260 Place of publication : $b publisher, $c year.
300 1 sound disc : $b digital ; $c 4 3/4 in.
490 0 (For this project, do not trace any series statements)
511 0 List performers/artists, conductor(s) in this field.
505 0 (always list titles of songs on the sound recordings)
650 0* See LC call number & corresponding LC subject headings for suggested 650 fields.
650 4 Basques $v Music. (Always add this subject heading to the Basque sound recording bibliographic records).
The local subject heading (Basques $v Music) is added in order to help patrons, researchers, faculty, and staff members locate Basque sound recordings. Whenever a patron needs to find a Basque sound recording, all the person has to do is type the words "Basques" and "Music" in a subject search or a keyword search.
947 *ov=.0 (add order record number)
949 2 $l bsqav $o Basq $t5 $i(barcode)
*LC call number [4] & corresponding LC subject headings [5]
Examples:
090 M1630.18.cutter for main entry $b 2nd cutter for title date
650 0 Popular music.
090 M1627.cutter for m.e. $b 2nd cutter for title date
650 0 Folk music.
090 M5.cutter for m.e. $b 2nd cutter for title date
650 0 Instrumental music.
090 ML1500 $b .cutter for m.e. date
650 0 Choral music.
090 M1997.cutter for m.e. $b 2nd cutter for title date
650 0 Children's songs.
090 M1999.cutter for m.e. (if 2 or more composers) $b second cutter for title date
650 0 Sacred vocal music.
090 M3.1.cutter for m.e. (if 1 composer) $b 2nd cutter for title date
650 0 Sacred vocal music.
090 ML2102.cutter for m.e. $b 2nd cutter for title date
650 0 Sacred music.
090 M1000.cutter for m.e. $b 2nd cutter for title date
650 0 Orchestral music.
090 M1619 $b cutter for m.e. date (for collections of two or more composers)
650 0 Songs (High voice) with piano.
090 M7.cutter for m.e. $b 2nd cutter for title date
650 0 Organ music.
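The eleven class number and subject heading pairings above amount to a lookup table from music category to a suggested 090 stem and 650 heading. The Python sketch below restates them as such; it is an illustration of the published table, not part of the local procedure itself, and the single cuttering pattern it prints is a simplification (several classes above subfield the cutter differently).

# The procedure's 11 music categories as a lookup table:
# category -> (suggested 090 class stem, suggested 650 0 heading).
CATEGORY_TABLE = {
    "popular":                        ("M1630.18", "Popular music."),
    "folk":                           ("M1627",    "Folk music."),
    "instrumental":                   ("M5",       "Instrumental music."),
    "choral":                         ("ML1500",   "Choral music."),
    "children's songs":               ("M1997",    "Children's songs."),
    "sacred vocal (2+ composers)":    ("M1999",    "Sacred vocal music."),
    "sacred vocal (1 composer)":      ("M3.1",     "Sacred vocal music."),
    "sacred":                         ("ML2102",   "Sacred music."),
    "orchestral":                     ("M1000",    "Orchestral music."),
    "songs, high voice, collections": ("M1619",    "Songs (High voice) with piano."),
    "organ":                          ("M7",       "Organ music."),
}

def suggest_fields(category, cutter, year):
    """Return suggested 090 and 650 field strings for one recording."""
    stem, heading = CATEGORY_TABLE[category]
    return f"090 {stem}.{cutter} {year}", f"650 0 {heading}"

print(suggest_fields("folk", "A12", 2001))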
Sample Records
I have added several sample records to the local procedure in order to illustrate what a typical minimal level sound recording record would look like.
Example 1:
Type: j  ELvl: K  Srce: d  Audn:  Ctrl:  Lang: N/A
BLvl: m  Form:  Comp: pp  AccM:  MRec:  Ctry: sp
Desc: a  FMus: n  LTxt:  DtSt: s  Dates: 1998,
040 xxx $c xxx
007 s $b d $d f $e s $f n $g g $h n $i n $m e $n u
028 00 4082 $b Prion
090 M5.E76 $b A375 2001
049 xxxx
110 2 Ernach Trio.
245 10 Abriendo fronteras $h [sound recording] / $c Ernach Trio.
260 Spain : $b Prion : $b Eusko Jaurlaritza, Gobierno Vasco, $c p2001.
300 1 sound disc : $b digital ; $c 4 3/4 in.
511 0 Monica Chirita, violin ; Oskar Espina Ruiz, clarinete ; Noriko Nagasawa, piano.
505 0 Zortziko / N. Otaño -- Ume Malkoak / P. Sorozabal -- Txori Abestiak / P. Sorozabal -- Suite Vasca / J.D. de Santa Teresa -- Sistema de adioses / C. Villasol -- Pavane pour une infante défunte / M. Ravel -- Navarra / P. Sarasate -- Suite op. 157b / D. Milhaud -- Serenade for three / P. Schickele.
650 0 Instrumental music.
650 4 Basques $v Music.
700 1 Chirita, Monica.
700 1 Ruiz, Oskar Espina.
700 1 Nagasawa, Noriko.
Example 2:
Type: j  ELvl: K  Srce: d  Audn:  Ctrl:  Lang: N/A
BLvl: m  Form:  Comp: mu  AccM:  MRec:  Ctry: sp
Desc: a  FMus: n  LTxt:  DtSt: s  Dates: 2001,
040 xxx $c xxx
007 s $b d $d f $e s $f n $g g $h n $i n $m e $n u
028 00 046 $b PSM
090 M5.I75 $b A64 2000
049 xxxx
100 1 Irizar, Juan Carlos.
245 10 Ametsen artean $h [sound recording] = $b Entre sueños / $c Juan Carlos Irizar.
246 31 Entre sueños
260 Zestoa [Spain] : $b PSM, $c [2002?]
300 1 sound disc : $b digital ; $c 4 3/4 in.
511 0 Juan Carlos Irizar, piano, along with various musicians.
505 0 Amets bat -- Ekain -- Kontrapas -- Larrosa -- Euskal pizkundea -- Erromesari ; Al peregrino -- Lurkoi baserria, Busturian -- Lainoen atzetik -- Adios maitia -- Eguzki printzak -- Mendaro -- Nirekin (emoztazu mosutxue) -- Jonen balada -- Zilbor hesteak -- Ertxinetik zerurantz -- Erribera -- Hotel Arocena -- Nafarroa, arragoa -- Orbela -- Ametsen artean -- Maiteaz galdezka.
650 0 Instrumental music.
650 4 Basques $v Music.
Other Considerations
I was asked to add information that would help Catalog Department staff members decide on the main entries and added entries that should be used for specific artists and performers listed on a sound recording. Therefore, I added some further information on rule interpretations as far as composers, principal performers, and conductors are concerned. In addition, a few useful cataloging resource tools were added to the end of the local procedure.
Consider the following information when inputting a minimal level Basque sound recording record into OCLC.
1xx (100 1 or 110 2) used for composer or principal performer (see Rule 21.23C) [6]
When two or more performers are named in the chief source of information, consider to be principal performers those given the greatest prominence there. If all the performers named in the chief source of information are given equal prominence there, consider all of them to be principal performers. When only one performer is named in the chief source of information, consider that performer to be a principal performer.
7xx (700 1 or 710 2) used for performer(s) or conductor(s) when the composer is considered to be the main entry (see Rule 21.29D) [7]
When a featured performer is accompanied by an unnamed group that, if it had a name, would be given an added entry as a corporate body, do not make added entries for the individual members of the group. Do not make an added entry for a performer who receives a main entry heading as principal performer under 21.23C1. [8]
For minimal level cataloging, use the following information as a guideline:
1. Three or fewer artists or composers = use 100, 7xx, & 511 fields [9]
2. More than three artists or composers = use a 511 field; 7xx field for the first one named.
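The two-part guideline above reduces to a small decision rule: with three or fewer named artists or composers, trace a main entry and added entries as well as the 511 note; with more than three, keep the 511 and trace only the first named. The sketch below illustrates that rule only; the function and its output format are hypothetical, and it deliberately ignores the finer 21.23C/21.29D distinctions discussed above.

# Illustrative sketch of the minimal level access-point guideline above.
def plan_access_points(names):
    """names: artists/composers in the order named on the item."""
    plan = {"511": names}            # always list the performers in a 511
    if len(names) <= 3:
        plan["1xx"] = names[0]       # main entry
        plan["7xx"] = names[1:]      # added entries for the others
    else:
        plan["7xx"] = [names[0]]     # trace only the first one named
    return plan

print(plan_access_points(["Chirita, Monica", "Ruiz, Oskar Espina", "Nagasawa, Noriko"]))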
Helpful Sources:
AACR2R and LCRIs, Chapters 6 and 21 [10]
OCLC's Bibliographic Formats and Standards [11]
Cataloging of Audiovisual Materials, by Nancy B. Olson, 4th ed. (1998) [12]
Conclusion
I wrote the local procedure and gave a preliminary version to the various staff members who would be involved in using it. There were requests that more specific examples of the types of Library of Congress call numbers and LC Subject Headings be appended to the local procedure. The addition of more examples would assist them in readily locating what kinds of Library of Congress call numbers (090 fields) and LC Subject Headings (650 fields) to add to each new bibliographic record.
Several positive results occurred due to the implementation of the unique minimal level cataloging project. Bibliographic access is now provided to unique, hard to find Basque sound recordings. In addition, these bibliographic records can be accessed by other libraries and organizations throughout the world via OCLC's WorldCat database. Original sound cataloging and Basque language expertise continue to grow among technical services staff at the University of Nevada, Reno. This undertaking has served as a streamlined, efficient use of minimal level (Encoding Level "K") bibliographic records. Finally, this innovative cataloging project can serve as a benchmark for other library catalog departments to follow when catalog department staff members are searching for ways to diminish problematical backlogs of various kinds of materials.
Notes
1. Available on the Web at http://oclc.org (Bibliographic Formats and Standards).
2. Nancy B. Olson, Cataloging of Audiovisual Materials (DeKalb, Ill.: Minnesota Scholarly Press, 1998).
3. Op. cit., Bibliographic Formats and Standards.
4. Available on the Library of Congress Cataloger's Desktop (and also in print): Library of Congress Classification Web and Library of Congress Subject Cataloging Manuals.
5. Op. cit., Library of Congress Classification Web and Library of Congress Subject Cataloging Manuals.
6. Available on the Library of Congress Cataloger's Desktop (and also in print): Library of Congress Rule Interpretations.
7. Op. cit., Library of Congress Rule Interpretations.
8. Op. cit., Library of Congress Rule Interpretations.
9. Op. cit., Bibliographic Formats and Standards.
10. Available on the Library of Congress Cataloger's Desktop (and also in print): Anglo-American Cataloging Rules, 2nd edition, revised, and Library of Congress Rule Interpretations.
11. Op. cit., Bibliographic Formats and Standards.
12. Op. cit., Nancy B. Olson.
work_5jarf4azhrentiyorawu4ymutq ----
Opportunities and Challenges: The Current Situation of Copyright Protection for Document Supply in China
Zhao Xing, Director, Document Supply Center, National Library of China, Beijing, China. Email address: zhaox@nlc.cn
2019 IFLA ILDS Proceedings (ISBN: 978-80-86504-40-7). https://www.techlib.cz/en/84026
Copyright © 2019 by Zhao Xing. This work is made available under the terms of the Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
Abstract
Purpose: This study aims to explore and articulate the copyright problems of document supply resulting from changes in the digital age in China, introducing current Chinese Copyright Law and "fair use" in library services and exploring the challenges and opportunities of copyright protection for document supply in China.
Design: From statistical analysis of the changes to document delivery services in the digital age, based on the professional experience of the National Library of China (NLC), copyright problems are presented. Current Chinese Copyright Law and "fair use" are introduced. The measures NLC has taken to protect copyright in document supply are summarized.
Findings: With increasing digital document delivery, the potential risks of copyright infringement in document supply have become more and more serious; we must take proper steps to protect copyright, especially in the digital age in China.
Value: This is the first article in English to describe the current situation of copyright protection for document supply in China. It also presents the problems based on the professional experiences of NLC and recommends solutions for the digital age today.
Keywords: Copyright protection; Document delivery services; National Library of China; Digital Age
1. Introduction
Libraries shoulder an important mission for knowledge dissemination from the day knowledge is created.
As the basic form of resource sharing, interlibrary loan (ILL) and document delivery services (DDS) are effective ways of spreading knowledge and important expressions of library core values. Since the 1990s, with the development of digital technology, profound changes have taken place in ILL and DDS; the scope, content, and mode of service have changed tremendously in the digital age. The changes have also brought libraries various problems related to copyright protection. While libraries try their best to supply documents in order to meet users' needs, they may step into the forbidden zone of copyright protection if they are careless. How to find the balance between copyright protection and document supply has become an important issue for the sustainable development of resource sharing in China. In order to illustrate the changes to document supply and the problems related to copyright protection, we take NLC as an example.
2. Changes in the digital age
NLC has a long history of ILL and DDS. NLC has been developing ILL since 1927. In 1997, a dedicated Document Delivery Center (DDC) was established to provide ILL and DDS. In order to improve work efficiency, DDC at NLC began to use an Interlibrary Loan and Document Delivery System (ILDDS) in 2007. By 2018, ILDDS had served more than 200,000 users. ILDDS users cover all kinds of members, including scientific researchers, educational institutions, enterprises and institutions, as well as individual users.
2.1 Cover a wide area and serve more users
The number of ILL and DDS transactions grew explosively after the establishment of DDC at NLC. The total number of ILL and DDS requests increased year by year except during 2012 to 2014, when the main NLC building was closed for remodeling and lending services for some literature were suspended (Figure 1). After establishing the ILDDS, DDC filled an average of 52,511 ILL and DDS requests per year from 2007 to 2018. In 2010, NLC began participating in OCLC WorldShare and has formed partnerships with 603 libraries in 120 countries and regions. By 2018, DDC had cooperated with more than 600 libraries, covering 34 provinces and autonomous regions all over China. NLC's DDC has become the world's largest Chinese literature guarantee base and the largest supply center of foreign literature in China today.
Figure 1: ILL/DDS transactions 2007-2018
2.2 Focus on special documents
Many ILL/DDS transactions focus primarily on special documents which are difficult to obtain on the market. According to an analysis of ILDDS data for the last five years (Table 1), many transactions focus on preserved books and periodicals; microform documents; Taiwanese, Hong Kong, and Macao documents; and theses. These items represent over 50 percent of the total transactions from 2014 to 2018 and have in common that they are old, rare, and difficult to obtain on the market. Although open access has increased and most electronic resources can be obtained by users themselves, old books and periodicals are still difficult to obtain on the Internet or on the market because they have not yet been digitized or cannot be bought from bookstores, so they can be supplied by libraries only.
Table 1: Types of special documents of ILL/DDS 2014-2018
2.3 Focus on foreign literature
ILL/DDS transactions focus on foreign literature, especially Western books and periodicals; the ratio of Chinese to foreign language on annual average from 2014 to 2018 was about 1:8 (Figure 2). This highlights the important role of NLC DDC as the largest foreign language reference center in China. Advanced information in the foreign literature has important value for scientific researchers, but it is difficult for researchers to buy original foreign literature by themselves in China. Researchers may, if they manage to navigate the tedious purchasing process, miss the best timing for conducting research. DDC has solved this problem: we deliver foreign documents to researchers in a timely and effective way, so they can carry out scientific research in time.
Figure 2: Transactions of Chinese and foreign documents, annual average volume, 2014-2018
2.4 Electronic delivery increasing
With the development of digital technology, users are already accustomed to the convenience and efficiency of networked services and no longer accept long waiting times for paper copies. Even if the documents are not electronic, users are more likely to obtain them through scans of the original text, photocopies, restored micrographics, or other electronic delivery modes. Libraries are paying more attention to reducing intermediary barriers to document delivery, accelerating the speed of user access to documents. The number of DDC electronic deliveries increased each year from 2008 to 2017 (Figure 3); 2010, notably, nearly doubled compared with 2008, and annual electronic deliveries have exceeded 20,000 since 2011. The main mode of document delivery has changed gradually from traditional paper mail to electronic delivery. With more and more people relying on mobile devices to obtain information, many libraries have begun using mobile applications (apps) to meet mobile users' information needs. Some libraries in China have tried to open WeChat widgets or other apps for document delivery.
Figure 3: Number of electronic deliveries, 2008-2017
2.5 Mobile payment increasing
In 2016, ILDDS was upgraded comprehensively and an Alipay wallet was connected. The introduction of mobile payment not only shortens users' payment times, but also caters to users' interaction needs and behavior habits in the new media age. We analyzed ILDDS Alipay wallet data for 2017, and the results showed (Figure 4) that users who used Alipay wallets to pay for DDS were primarily younger than 45 years old, representing over 90% of total users. Among them, the group aged 25-29 was the largest, 24.48% of the total; at the same time, this group contributed 20.06% of transactions. Only 9.83% of users were over 45 years old. This is basically consistent with the audience of social networking, online shopping, online games, and other new media services. According to market research, the majority of users of "Online to Offline" (O2O) commercial websites are 18-40 years old. Such users have a greater acceptance of novel consumption patterns, a greater openness to unfamiliar areas, and tireless curiosity.
Figure 4: Age distribution of users using an ILDDS Alipay wallet, 2017
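The age breakdown reported above is the kind of tabulation that falls straight out of an aggregation over payment records. The sketch below is illustrative only: the record layout, sample values, and bucket labels are assumptions, since the real ILDDS schema is not described here; only the idea of bucketing payer ages follows the text.

# Illustrative tabulation of payer ages into coarse buckets like Figure 4's.
# Hypothetical record layout; sample values are made up.
payments = [{"age": 27}, {"age": 27}, {"age": 41}, {"age": 52}, {"age": 24}]

buckets = {"under 25": 0, "25-29": 0, "30-44": 0, "45 and over": 0}
for p in payments:
    a = p["age"]
    if a < 25:
        buckets["under 25"] += 1
    elif a <= 29:
        buckets["25-29"] += 1
    elif a < 45:
        buckets["30-44"] += 1
    else:
        buckets["45 and over"] += 1

total = len(payments)
for label, n in buckets.items():
    print(f"{label}: {100 * n / total:.1f}%")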
2.6 Resource sharing increasing
Document supply has a high degree of universality in the digital age; user communities differ from traditional ones and are moving towards integration. Convenient communication technologies promote resource sharing and break regional restrictions, allowing services to become regional and international in scope. Registered users in different systems achieved unified authentication once system connections were realized through cooperation between libraries. Users can enjoy the services of other libraries nationwide or even globally from within their own library systems, with "one-stop service" becoming a reality.
Since 2010, NLC has established cooperative relations with BALIS (the Beijing Academic Library & Information System), CALIS (the China Academic Library & Information System), NSTL (the National Science and Technology Library), OCLC, and SUBITO, respectively, allowing resource sharing to break the restrictions of library types and systems. Data from NLC DDC in 2018 showed that applications from BALIS, CALIS, and other document guarantee institutions to NLC accounted for 47% of the total (Figure 5). As early as 2008, the article "Interlibrary Loan and Document Delivery in the Digital Age," published by Chen Li, NLC's Director, mentioned "Borrowing a Ship to Sea" and "revealing the resources and services of NLC by means of other document guarantee institutions and platforms, expanding the scope of our library's service." Now this goal is gradually being achieved; the trend of users' cross-system convergence is unstoppable.
Figure 5: Applications from different platforms to NLC, 2018
2.6.1 Cooperation with BALIS
In order to make the use of NLC resources more convenient for Beijing university researchers, NLC's DDC began to cooperate with BALIS in 2010, and this cooperation has provided strong support for the literature resources of the higher education literature guarantee system in metropolitan Beijing. As can be seen from Figure 6, ILL and DDS transactions have been increasing almost every year since the cooperation began. In the Beijing area, city logistics are convenient and interlibrary loan service costs are low, so the volume of ILL applications is about 8-10 times that of DDS.
Figure 6: Volume of ILL/DDS of the NLC to BALIS, 2014-2018
The cooperation between NLC and BALIS is not limited to ILL and DDS, but also extends to quality training for readers on how to obtain information. From 2014 to 2018, for 5 consecutive years, DDC and BALIS carried out such training. In the past five years, DDC has given many of the center's librarians the opportunity to visit more than 53 colleges and universities, such as Tsinghua University, Beijing Foreign Studies University, the Beijing University of Posts and Telecommunications, Beijing Normal University, the Beijing Institute of Technology, the University of Science and Technology Beijing, and so on.
Face-to-face, librarians explained NLC's resources and services to instructors and students and answered specific questions about document delivery services. The lecture activities, which lasted for one month, were warmly welcomed by instructors and students, achieved good results in cultivating users' skills and improving service efficiency, and brought about actual growth in applications.
2.6.2 Cooperation with CALIS
NLC officially opened its cooperation with CALIS on November 23, 2013. Since then, users of academic libraries can obtain NLC resources and services through a cooperation platform in one place. This not only enables users to obtain more documents more conveniently, but also effectively enhances NLC's resource guarantee capabilities. By 2018, 296 CALIS member libraries had already conducted ILL and DDS with NLC. The data on the cooperation between NLC and CALIS is just the opposite of BALIS (see Figure 7): the quantity of DDS is much higher than that of ILL. This is mainly because the CALIS members are located all over China, while BALIS is only in Beijing; the costs of logistics for ILL between different cities are higher than within the same city, so colleges outside of Beijing are more likely to choose DDS rather than ILL.
Figure 7: Volume of ILL/DDS, NLC to CALIS, 2014-2018
2.6.3 Cooperation with OCLC
In order to improve international lending, NLC joined OCLC (Online Computer Library Center, based in the U.S.) in 2010. Foreign users can find Chinese literature more effectively through OCLC and domestic users can also find literature from all over the world, so international loan transactions have increased rapidly since 2010. The cooperation contributed to rapid, sustained growth; the amount of ILL via OCLC increased by more than 52% in comparison to 2012. In cooperation with OCLC, we have had more lending requests than borrowing requests (see Figure 8). Annual lending requests run about 3 times the borrowing requests, and even reached as much as 7 times the borrowing requests in 2016. This shows that the demand of users in various countries for access to Chinese literature is very strong and growing rapidly. Besides OCLC, NLC cooperates with the British Library, the National Diet Library, SUBITO, the Russian State Library, and other guarantee institutions to establish a broadly cooperative service for international loan.
Figure 8: ILL borrowing and lending requests through OCLC, NLC volume and fill rates
3. Problems
Traditional document delivery services were mostly based on limited numbers of items from paper collections and could easily be accommodated within the category of "fair use." Few people raised the issue of copyright protection for many years in China. But digital delivery has overcome space barriers and has expanded the scope of services in the digital age.
In particular, some large-scale, comprehensive and professional, cross-system and intra-system, national and regional document delivery systems, such as NLC, CALIS, and BALIS, have realized resource sharing and collaborative services, so the scope of document delivery has become wider, which has led to some copyright problems.
3.1 Infringement of the right of reproduction
In the process of document delivery, libraries inevitably have to copy a certain number of documents. The right of reproduction is protected by the Copyright Law of China, namely Article 10, Paragraph 5, which states: "the right of reproduction, that is, the right to produce one or more copies of a work by printing, photocopying, lithographing, making a sound recording or video recording, duplicating a recording, or duplicating a photographic work, or by other means."
Traditional document delivery does not have a negative impact on the literature market; any adverse impact on the interests of the owner is negligible. On the one hand, traditional document delivery services adopt a "one-to-one" mode: documents are delivered to specific users and libraries, which is not substantially different from library lending services. On the other hand, the law usually stipulates that users or libraries who obtain the documents should not copy them. Even if someone copies documents without authorization, doing so not only carries considerable costs, but also leaves a clear difference in quality between the "duplicate" and the "original". Such infringements are easy to find, identify, and combat.
But in the digital, networked environment, digital delivery is quite different. First, the scope of dissemination expands rapidly and a "one-to-many" mode becomes a reality. In the instant of dissemination through the network, the number of users increases rapidly, which can cause great damage to the interests of the owner(s). Second, digital technologies are convenient and fast: copying takes only simple "fingertip operations," and a digital copy differs from the original work in no way that matters for copyright. Infringement can be concealed and can be difficult to find and punish. As a result, a large number of copyrighted works have been copied and used by individuals, schools, and libraries; even copying for profit has appeared. Even for teaching or scientific research, the number of copies often exceeds the limit of "fair use," and this has caused problems for copyright protection in China.
3.2 Infringement of the right of communication through the information network
Article 10, Paragraph 12 of the Copyright Law of China stipulates the right of communication through the information network. It is an absolute right in China's current copyright system. That the copyright owner grants the library the right to digitize the owner's works does not mean that the right of communication through the information network has been handed over to the library at the same time. If libraries deliver documents in the form of digitized versions of traditional paper works, they should acquire digitization rights and network communication rights at the same time. Otherwise, the library may assume copyright liability.
3.3 Joint liability for readers' torts
Although the document delivery service itself does not infringe copyright, if readers obtain copies through a library and then commit infringement, the library may bear joint liability for the infringement. Even if libraries can prove they were not at fault, they are not entirely exempt from liability in China. According to the provisions of Article 5 and Article 6 of the Interpretation of Several Questions Concerning the Application of Law in the Trial of Computer Network Copyright Disputes, issued by the Supreme People's Court of China in November 2001, libraries may bear joint liability for infringement.
4. Current Chinese copyright law
Libraries are public service organizations; "public welfare" is their defining character. In order to ensure that libraries can fulfill their social mission, copyright laws have formulated special provisions for the rational use of copyrighted works by libraries, also known as "fair use" or "exceptions" for libraries, in order to restrict the rights of copyright owners. The Copyright Law of China and the Regulations on the Protection of the Right of Communication through the Information Network both stipulate "fair use." In some cases, a work may be used without permission and without payment of remuneration to the copyright owner. "Fair use" is also an important basis on which libraries avoid copyright problems when carrying out document delivery.
4.1 Copyright Law of the People's Republic of China
Three of the 12 cases of "fair use" stipulated in Article 22, Section 4 (Limitations on Rights) of the Copyright Law of China are particularly applicable to libraries:
1. Paragraph (1): "use of another person's published work for purposes of the user's own personal study, research or appreciation." This article guarantees readers the full right to read. Readers can freely use library books without permission or payment to the copyright owner.
2. Paragraph (6): "translation, or reproduction in a small quantity of copies of a published work by teachers or scientific researchers for use in classroom teaching or scientific research, provided that the translation or the reproductions are not published for distribution." This article makes clear that a small amount of reproduction of published works by libraries and readers for the purpose of teaching or scientific research is "fair use." It must, of course, be non-profit. Under this premise, whether an item is copied in a library or copies are obtained through resource sharing, interlibrary loan, and document delivery, the activity should be considered "fair use." Beyond this premise, libraries may commit infringement.
3. Paragraph (8): "reproduction of a work in its collections by a library, archive, memorial hall, museum, art gallery, etc., for the purpose of display, or preservation of a copy, of the work." Libraries are allowed to reproduce their collections for the purpose of displaying or preserving editions. But is it legal to copy works collected by other libraries? There is no specific provision in the Copyright Law of China. If the second case is cited, a small number of copies of published works held by other libraries, made for teaching or scientific research, may be considered "fair use."
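Read together, the three paragraphs above can be treated as a screening checklist applied to an incoming delivery request. The sketch below encodes that reading; it is an interpretive illustration only, not a statement of the law or of NLC practice, and the predicate names are hypothetical.

# Interpretive screen over the three Article 22 cases discussed above.
def falls_under_fair_use(purpose, quantity_small, for_profit, own_collection):
    if for_profit:
        return False  # all three cases presuppose non-commercial use
    if purpose in {"personal study", "research", "appreciation"}:
        return True   # Paragraph (1)
    if purpose in {"classroom teaching", "scientific research"} and quantity_small:
        return True   # Paragraph (6): small quantity, not published for distribution
    if purpose in {"display", "preservation"} and own_collection:
        return True   # Paragraph (8): reproduction of the library's own holdings
    return False

print(falls_under_fair_use("scientific research", True, False, False))  # True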
If the second case is cited, making a small number of copies of published works held by other libraries for teaching or scientific research may be considered “fair use.”

4.2 Regulations on the protection of the right of communication through the information network

In 2001, the revised Copyright Law of China stipulated in Article 10, Paragraph 12 “the right of communication through the information network, that is, the right to make a work available to the public by wire or by wireless means, so that people may have access to the work from a place and at a time individually chosen by them.” Since then, the “right of information network dissemination” has had legal status in China's copyright system. On May 28, 2006, the State Council formally promulgated the Regulations on the Protection of the Right of Communication through the Information Network (hereinafter referred to as the Regulations), which specifically regulate the dissemination of works by libraries through the network:

Article 7: A library, archive, memorial hall, museum, or art gallery, and so on may make available to the service recipients on its premises through the information network a digital work in its collection which is legally published, or a work which is reproduced in digital form for the purpose of displaying, or preserving copies of the same work in accordance with law, without permission from, and without payment of remuneration to, the copyright owner, provided that no direct or indirect financial benefit is gained therefrom, unless the parties have agreed otherwise.

The Regulations further limit the scope of such digital works:

The work reproduced in digital form for display or preservation purpose, as referred to in the preceding paragraph, shall be a work of which a copy in the collection is on the brink of damage or is damaged, lost or stolen, or of which the storage format is outmoded, and which is unavailable or only available at a price obviously higher than the marked one on the market.

There are too many restrictive conditions and obvious legal uncertainties in applying the Regulations, so libraries face greater liability risks when applying this provision.

5. Current measures

5.1 Application of “fair use”

From the point of view of the current law in China, as long as document delivery is limited to “fair use” as prescribed by the Copyright Law, it belongs to a category of activity in which documents can be transferred without permission and payment. However, “fair use” must be subject to the following conditions:

1. Control the price charged. Document delivery fees can still be charged, but only “at cost.” The fees can only include reasonable mailing fees, telecommunication transmission fees, network communication fees, replication fees, and so on.

2. Control the number of deliveries. The quantity of document delivery must be controlled within the scope of “fair use”; large numbers of document deliveries beyond the scope of “fair use” should not be carried out. Article 22, Paragraph 6 of the Copyright Law does not specify whether “a small quantity of copies” refers to a small number of copies of a work or a small amount of the content of a work; thus, NLC currently provides readers with no more than 1/3 of the full content of a work to ensure “a small quantity of copies.”
The reproduction of the whole content of a work should be regarded as beyond the scope of “fair use” in China.

3. Pay attention to certain types of works which are not allowed to be delivered under copyright law. These are mainly computer software and audio-visual products. The delivery of such works must be authorized in writing by the copyright owner and royalties paid.

4. Pay attention to the copyright notice on the works. If the author expressly declares that delivery of their work is not allowed, the document center shall not deliver it; otherwise, the library will bear certain liability for copyright infringement. In the process of document delivery, the copyright information of a work must not be modified or deleted at any time.

5.2 Delivery to registered users only

All users submitting applications to NLC’s DDC must be registered users of ILLDDS. Users need to provide their valid ID number, name, address, email address, and the reader’s card number obtained when registering. All this information is used for preserving documents and delivery files, or to investigate and verify infringements when they occur.

5.3 Necessary copyright statement

When users submit applications to NLC’s DDC through ILLDDS, they must sign a copyright protection confirmation statement. There is a clear “Copyright Notice” in the reader's interface, a prompt that “Reproduction furnished by the National Library of China Document Supply Center should be used only for purposes of private study, scholarship, or research. If a user makes a request for, or later uses a photocopy or reproduction for purposes in excess of ‘fair use’ specified by the PRC Copyright Law, that user may be liable for copyright infringement.” (Figure 9) Users can submit their applications only after reading this Copyright Notice and clicking that they “agree” with it. There are further rules: transferred copies may not be further copied, altered, or forwarded, and only a single paper copy may be printed. In addition, all electronic versions of copies must be deleted after successful printing.

Figure 9: The NLC ILLDDS copyright statement (screenshot not reproduced)

5.4 Perfecting the library legal system

The role of law is to balance the relationship between owner and user. When that relationship becomes unbalanced, it is necessary to improve the legal system so that it can play its role. There is no specific provision for electronic document delivery in China’s current copyright law, so the law needs to be improved. We should protect the interests of intellectual property owners and pay attention to the rights of document users as well.

6. Conclusion

In conclusion, despite the risk of infringement, ILL and DDS are still an important way to meet the needs of users in resource sharing. The core idea of copyright law is to seek a balance between copyright protection and users’ rights. While emphasizing the protection of copyright and information network dissemination rights, the law also increases the restrictions on these rights, thus leaving a certain space for ILL and DDS. To date, China has no special library law or other laws to regulate ILL and DDS. Both China’s copyright law and the ordinance regarding the right to information network dissemination protect copyright-related rights from the perspective of copyright owners.
There is a lack of relevant laws for libraries and readers (as the users) to protect their rights, especially in electronic delivery; this has seriously hindered the library’s ability to play an important role in the dissemination of knowledge in the digital age. Extensive cooperation ensures NLC resources are well utilized, but NLC must solve new problems related to copyright protection in the future.

Notes

1. Available at: http://www.nlc.cn/
2. Available at: http://www.nlc.cn/newkyck/kyfw/201011/t20101122_11696.htm
3. Available at: http://wxtgzx.nlc.cn:8111/gateway/login.jsf
4. Available at: http://www.nlc.cn/dsb_zyyfw/wdtsg/dzzn/dsb_gtzy/

work_5nepvtlpuraojogir77dwiulse ----

Subject searching in OPAC of a special library: problems and issues

M.S. SRIDHAR
Head, Library and Documentation
ISRO Satellite Centre
Bangalore,
India
e-mail: sridharmirle@hotmail.com OR sridhar@isac.ernet.in

Abstract

This paper, drawing data from a comparative study of the use of the Online Public Access Catalogue (OPAC) and the card catalogue of the ISRO Satellite Centre (ISAC) library, examines the steady decline in the use of subject searching by end-users and the associated problems and issues. It presents data to highlight the negligible use of Boolean operators and combination searches, variations in the descriptors assigned to books of the same class numbers, too many records tagged with too-broad descriptors, etc. It concludes that moving from the traditional card catalogue to a modern OPAC has not made subject searching more attractive and effective.

Keywords: OPAC; Subject Search

1. INTRODUCTION

The `subject approach' to knowledge has been a long and extensive concern of librarianship and has been assumed to be the major approach (access method) of users for a very long period. In the card catalogue days, both the classified part of the catalogue and the subject catalogue (based on assigned standard descriptors) were assumed to help users gain subject access to the resources of a library. Access by classification number is believed to be more common in Europe and India than in the US. Unfortunately, very few studies were conducted in those days to see the relative importance of this approach, the success or failure of searches, and user behavior towards subject access. Card catalogues had plenty of cross references to help users even if they were not aware of standard descriptors; the same are lacking in OPACs. On the other hand, card catalogues were primarily meant for pre-coordinated search, whereas OPACs enable post-coordinated searches using Boolean operators and other combination searches. Further, OPACs also enable executing vague and free-text queries wherever KWIC indexes are provided, which is a great boon to users who are normally unaware of descriptors selected from a thesaurus or subject heading list. Despite limitations, users do prefer free-text searching. There are also efforts to create intelligent natural-language front ends which use subject headings / thesauri for searching the OPAC.

2. SOME PAST STUDIES

Studies on the use of OPACs are plentiful, and there exist a good number of reviews of OPAC studies (Larson, 1991; Hildreth, 1985; O'Brien, 1994). In a 30-month transaction analysis of patron searches on the OPAC of Ohio State University, Norden and Lawrence (1981) found that use of subject search commands increased rapidly (quoted from Hildreth, 1991, p 262-3). Unfortunately, OPACs are criticised as being more difficult to use and less serviceable than card catalogues, and as used more for finding known items than to seek information or to solve information-based problems (Borgman, 1996, p 494). However, there is a significant and consistent decline in controlled-vocabulary-based subject searching over the years in favour of title keyword searching (Larson, 1991). The relative use of the subject index in OPACs varied widely, from 10 to 62%, in different studies. In the famous CLR project, subject searching was found to constitute the majority (59%) (Matthews, et al., 1983). But Larson found a gradual decline in the use of the subject index as the size of the database increases. The frequency of title keyword searching exceeded that of subject searching over a period. But, as mentioned before, many title keyword searches are for known items. `Subject access is the most problematic area of online catalogues' (O'Brien, 1994).
It often leads to either failure or the retrieval of too many references. Most OPAC studies have identified the need to tackle related issues like free-text search, field-directed search, training users, adjunct thesaurus help, limiting devices with a `filtering' effect, relevance feedback, ranking of retrieved references, etc., to reduce search failures, as most users cannot be expected to put serious extra effort into subject searching. Even computer literacy does not ensure that users will do better subject searches. The OPAC is a `black box' to users, and they know very little about what happens inside the system. For example, the trade-off between precision and recall is rarely known to end-users. This may also be true of many library professionals who deal with the OPAC routinely.

OPACs are diverse in features as well as size. The functional layers mediating access to the OPAC are the user interface, the DBMS interface, the DBMS, and the database (with indexes). Conceptually they can be grouped into (i) those that deal with the users and (ii) those that deal with the storage and retrieval of bibliographic data. Most enhancements to subject searching are made through enhancements of the database and DBMS layers. The database layer can be seen as some portion of the contents of the card catalog in machine-readable form. The other layers provide procedures to facilitate the process of delivering information to users (Larson, 1991, p 190-191).

Boolean logic appears to be one of the most difficult aspects of information retrieval and is not `common sense' for most people (Borgman, 1986, p 390). Users tend to perform simple searches using only the basic features. Even scientists and engineers who have expertise in logic for other applications often use `AND' and `OR' in their linguistic sense (Borgman, 1986a, 1986b). Combination searches and Boolean operators can greatly help users to reduce recall and increase precision so as to obtain a browsable number of hits. Users' information needs range from highly specific to very general ones. Yet they make only simple searches. Most users search with a single term, which defeats the purpose of combination searches and Boolean logic. "… Users rarely ventured beyond a minimal set of system features. The majority of searches were simple, specifying only one field or data type to be searched… [and] advanced search features…. were rarely used…" (Borgman, 1986, p 389-390).

Most studies monitoring transactions have reported significant frequencies of input unidentifiable by the system, aborted sessions, and searches with no matches. "Usage rates are low enough that many online catalog users probably remain permanent novices" (Borgman, 1986, p 390); moreover, users tend to perform simple searches using only the basic features and do not utilise index terms or headings unless 'forced' to do so (Borgman, 1986, p 389-390). Like the use of other library services and interactions with the library, use of the OPAC is also skewed, with a few using it a lot and most using it little. Further, most end-users search the OPAC only occasionally and do not access the system on a regular basis; they tend to learn only enough to do simple searches reasonably quickly and to regard further instruction as unnecessary and more extensive expertise as a burden (Yuan, 1997, p 218). In OPAC use studies, no-match subject searches ranged from 35% to 57% (Borgman, 1986, p 389-390).
Dickson (1984) found that 37% of all title searches and 23% of all author searches resulted in no matches. She determined that 39.5% of the no-match title searches and 51.3% of the no-match author searches were for records that existed in the database and were not due to user error in searching (quoted from Borgman, 1986). Hirst (1999), through a questionnaire survey of users with different levels of IT expertise about the use of hypertext interfaces to LIS, found that OPAC searches were mainly conducted for specific items and that most were successful. It may be noted that most specific-item searches are searches other than subject searches. Interestingly, novice users tended to achieve higher success rates than expert users. "Identifying search terms for the subject catalog is hardest of all, since people often do not recognise that the subject entries are drawn from a controlled list or thesaurus that is separately searchable itself. Instead, they enter the catalog using the free-text keywords they know best, often on a trial and error basis" (Borgman, 1996, p 497). One way to guide end-users is to enable them to choose a subject term from the classification number of the book. There is a great need to link subject headings to classification numbers.

Drabenstott and Weller (1994) summarised the problems of subject access based on the findings of past studies: (i) one-third of subject queries fail to produce results; (ii) large retrievals (high recall) discourage users from scanning the results; (iii) few instances of successful matching of a query with the controlled vocabulary are one-word queries; (iv) users are discouraged by subject access and are seeking alternative approaches. They feel that system designs enhancing subject headings, developing menu-based interfaces, and extending online catalog functionality to other databases meet the demands of library staff, and not necessarily those of users, for search functionality. Hence the problems found in the earliest OPACs still continue to exist as far as users are concerned. They have proposed `search trees' for subject searching after running an experimental online catalog called ASTUTE.

Search failures are usually due to misspelling, lack of knowledge of the thesaurus, `false drops', lack of user understanding of Boolean operators, lack of cross references, lack of an online thesaurus, and lack of training. There was a significant positive correlation between the failure rate and the percentage of subject searching (Larson, 1991), and a negative correlation between failure rate and time. The longer the processing time, the greater the chance of the user abandoning the search. Subject searches often lead to unusually high recall and create the problem of information overload. The average number of records retrieved is very high (about 77), and users look at only a few (about 9) (Larson, 1991). Often, users prefer browsing the shelf rather than browsing through subject access. Larson (1991, p20) concludes that "the desire to do topical searching has not diminished, but that the penalties incurred by the user in the process of using the subject index have led to the decline in use". Even the CLR survey found that topical searching is more prevalent among those who are less experienced with the library and its catalogs (Matthews, et al., 1986).
O'Brien (1994, p223) says that the "…subject indexing of monographs is both superficial and inadequate", as monographs lack in-depth treatment, particularly composite books and conference volumes, as well as lacking TOCs, blurbs, etc. Larson (1991) suggests remedies to the problems of subject searching, grouped under (i) the database, (ii) the search processing and retrieval algorithms, and (iii) the user interface. The database-related remedies include expanding records by adding words from the TOC, index and blurbs of books, enhancing records with terms from classification schemes, and increasing the number of descriptors per book. The second group of remedies includes limiting search results by Boolean intersections using additional terms or dates, partial matching and stemming of keywords, relevance ranking of outputs, automatic mapping from input search terms to controlled vocabulary terms through thesaurus look-up, automatic spelling correction or phonetic matching of terms, and relevance feedback. The last group of remedies, relating to the user interface, includes the `browsability' of existing subject headings using the online thesaurus and classification assignments.

However, in the context of digital libraries, Lesk (2003) questions the very need for traditional subject classification and indexing (which are usually meant for a possible future query) when the actual query itself can be searched on demand in seconds. Multistage searching and display, saving searches, set building, etc. are considered no longer required in the Web and future digital libraries. OCLC's work on the new bibliographic model FRBR (Functional Requirements for Bibliographic Records) offers some hope for subject access by suggesting the retrieval of groups of related documents. This program is supposed to generate sets of records that can be grouped for display as single works irrespective of their manifestations.

In a transaction log analysis of OCLC online catalogs, Tolle (1983) worked out correlation coefficients for transaction search patterns and found that the probability of going from the `begin' state to `browse' is the highest (0.643), and that going from one `error' state to another `error' state is the next highest (0.598). In other words, once an error was made, the next transaction/command was an error 59.8% of the time. Hardly 9% moved from an error state to ending the session. This speaks of user frustration and a great waste of end-user effort (a short sketch of how such transition probabilities are estimated appears at the end of this section). A study of the variability of subject searching in an OPAC at a university library revealed that subject searching varied from 22% to 74% over the hours of the day, from 17% to 64% over the days of the week, and from 12% to 70% over the weeks of the semester (Kaske, 1988a, 1988b). A more recent questionnaire-based survey (Oduwole, et al., 2002) of the use of the OPAC by 286 users at a Nigerian university found that the OPAC was used mostly for self-searching rather than delegated searching, with author as the major access point (59%) followed by subject (30.8%), and that a large majority (75%) were very satisfied with the OPAC.
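As an aside on method: the state-to-state figures in Tolle's transaction analysis above are first-order transition probabilities, which can be estimated by simple counting over logged sessions. The sketch below is a minimal illustration only, not Tolle's actual program; the session data and state names are invented for the example.

from collections import Counter

# Hypothetical session log: each session is the ordered list of states a
# user passed through (state names loosely follow Tolle's begin/browse/error
# categories; this is not Tolle's data).
sessions = [
    ["begin", "browse", "browse", "end"],
    ["begin", "error", "error", "browse", "end"],
    ["begin", "browse", "error", "end"],
]

# Count transitions between consecutive states in each session.
pair_counts = Counter()
from_counts = Counter()
for s in sessions:
    for a, b in zip(s, s[1:]):
        pair_counts[(a, b)] += 1
        from_counts[a] += 1

# First-order transition probability estimates, e.g. P(error -> error).
prob = {(a, b): n / from_counts[a] for (a, b), n in pair_counts.items()}
print(prob[("begin", "browse")], prob.get(("error", "error"), 0.0))

With estimates of this kind, a tendency like the reported 0.598 error-to-error figure shows up directly in the ("error", "error") entry of the transition table.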
Extensive review of past OPAC use studies is neither feasible nor desirable here. However, some of the important findings and suggestions from them are summarized in the two boxes below.

Important findings:
- Too many failed searches or no records retrieved
- Retrieval of unmanageably large numbers of records
- Navigational frustrations
- Failure to match the system's subject vocabulary
- Recall problems in subject browsing displays
- Users' search requests/terms are too broad or too narrow
- Inadequate or lacking cross references
- No Boolean queries formulated
- Lack of user perseverance
- Users make a variety of errors when entering a search request
- Too few subject headings per bibliographic record (the average bibliographic record contains less than 2 subject headings)
- Too many or too few bibliographic records linked to a subject heading

Suggestions for improvements:
- Enhance records with tables of contents (TOC), publisher summaries (blurbs), index terms, chapter summaries, MARC records, etc.
- Allow use of natural-language search
- Provide additional search aids/assistance tools like search trees
- Use information visualisation software and user interfaces
- Provide a front-end database
- Provide access to classification information
- Point the user from keywords to subject headings
- Clean up bibliographic records
- Utilize authority control
- Review the frequency of assigned subject headings

3. BACKGROUND OF THE STUDY

An observation-based study of the use of the card catalogue of the ISRO Satellite Centre (ISAC) library carried out during 1985 (Sridhar, 1986) revealed that the classified catalogue was not used, the report number catalogue (which is like a classified catalogue for reports) was negligibly used, the author and title catalogues were moderately used, and the subject catalogue was heavily used. The library was automated during the early 1980s, and the OPAC was made available on the LAN from 1991. A study (Sridhar, 2004) of the use of the OPAC, based on observation, interaction with users, and recording by professional staff at the site, was carried out over 80 hours (the equivalent of 10 working days or two working weeks), with due representation for all times of the day and all days of the week, during July-August 2002. Unfortunately, the new software does not provide for collecting transaction log data of the OPAC. At the time of the study, the OPAC (2002) had over 2 lakh (200,000) records. Surprisingly, when the results of this study were compared with those of the earlier (1985) card catalogue study, it was found that the quantum of usage of the OPAC was not even as much as that of the card catalogue. The data from this study were extracted to further investigate the issues and problems relating to the decline in subject searching.

4. DATA AND DISCUSSION

The kinds of access users prefer while searching the OPAC as well as the card catalogue are very interesting and probably more revealing for the development of search tools and techniques. The above-mentioned study (2004), based on critical incident observation, primarily aimed at knowing the approaches of users of the ISAC library while searching the OPAC. Table I and Figure 1 present the statistics relating to the different approaches adopted by users for searching/querying the system. Also juxtaposed in the Table and the corresponding Figure are the data from the previous study of the use of the card catalogue in 1985 (Sridhar, 1986). The data reveal that the title approach is adopted by a maximum of 38.3% while using the OPAC, as against a maximum of 54.2% adopting the subject (descriptor) approach in the card catalogue case. It may be noted that additional search features/approaches like KWIC and combination searches were obviously not found in the card catalogue, and hence the magnitude of subject search on the OPAC can be assumed to be 33% after adding the KWIC and combination searches. Even then, subject searches have substantially dropped from the card catalogue days to the OPAC days. This is in conformity with the findings of the past studies. As noted earlier, there is generally a small but significant decline in controlled subject searching in favour of keyword (free-text) searching (Larson, 1991). However, even the percentage of searches on the author catalogue has dropped, from 35.4% in the card catalogue to 26.8% in the OPAC. This also conforms to the finding of Norden, et al. (1981) that title searches were most frequent and that the ratio of title to author searches was higher than that in the card catalogue. Assuming that specific-item searches are mostly non-subject searches, Hirst (1999), through a questionnaire survey of users with different levels of IT expertise, found that OPAC searches were mainly conducted for specific items and that most were successful. In the present study, hardly 2.5% of searches are `combination' searches. These kinds of advanced searches are, as expected, done by very few end-users, and this trend also conforms to the findings of most earlier studies of OPACs. (See Table I and Figure 1.)

Concentrating only on the major approaches, namely title, author and subject, and merging the data relating to KWIC searches into that of subject searches, the same data extracted from Table I are presented in Table II. This table, together with Figure 1 relating to title, author and subject searches, presents an interesting comparison of the findings on the use of the OPAC with those for the card catalogue. Firstly, searches by title have substantially increased, from a mere 8.3% in the card catalogue to 38.3% in the OPAC. Secondly, subject searches have dropped substantially, from 54.2% in the case of the card catalogue to a mere 30.7% (including KWIC searches) in the OPAC. However, the author approach has only marginally dropped (from 35.4% to 26.8%) from the card catalogue days to the OPAC days. On the contrary, subject searching constituted the majority (59%) in the CLR project (Matthews, et al., 1983). (See Table II.)

A follow-up observation of 51 subject searches made by end-users is depicted in Table III. The data revealed that nearly half of them met with failure. The remaining slightly more than half of the searches (52.9%) were considered reasonably successful in getting the desired results. Of those that met with failure, nearly one-fourth (23.5%) were abandoned on reaching failure. In another one-fourth (23.5%) the search strategy was changed; of these, 13.7% changed keywords. (See Table III.)

It has been very clear from the results of the past studies that the majority of end-users face problems with subject searching, and one of the major issues concerns the selection of appropriate standardized descriptors by end-users. It is also clear that users do not prefer searching by classification numbers and that there is no online help regarding the thesaurus and/or classification scheme. There is a need to create a link between classification numbers and the corresponding descriptors.
Further, too many or too few bibliographic records are linked to a subject heading, and too few subject headings are assigned per record (average 2); hence there is a need to review the frequency of assigned subject headings. Table IV depicts an analysis of the descriptors in 920 books from 20 sample class numbers at the ISRO Satellite Centre library. Surprisingly, the average number of descriptors assigned per book varied from as low as 1.39 in the case of safety-related books to a moderate 2.75 in the case of solar physics/solar system books, with an overall average of a meager 1.89. Further, the SD is also very low. In other words, much against the cry that more descriptors need to be assigned to books, the collection continues to be indexed with fewer keywords than desired. The sample books had 599 unique keywords assigned 1,737 times. The last column of Table IV depicts the ratio of the number of unique descriptors to the number of books in each subject (class no.). As this ratio increases, the precision of a single-descriptor search increases. Unfortunately, this ratio is not even one for most of the subjects, and the highest is just 1.63. (See Table IV.)

An analysis of the frequency of occurrence of descriptors in the entire OPAC revealed further surprises. The fifty-seven most frequently occurring descriptors, appearing 20,452 times in the OPAC, are listed in Table V. Most of the descriptors in Table V are very broad topics, and unless they are used in post-coordinated or Boolean searches, or in combination with other fields, they will retrieve too many records and become unmanageable for end-users. (See Table V.)

A frequency table of the descriptors used for indexing books is depicted in Table VI. The ISAC Library uses the NASA Thesaurus for indexing books, assigning up to six postable terms out of the 18,100 postable terms listed in the NASA Thesaurus (which has 41,300 entries in total). It is clear from the table that only 936 descriptors (postable terms) are assigned to more than ten books in a collection of about 39,000 books. In other words, as many as 13,600 descriptors out of the 18,100 postable terms (i.e., 75.2%) in the NASA Thesaurus remain unused. It is shocking that professional indexers have made so little use of a specialized thesaurus meant for space science and technology while indexing books in a space science and technology library. (See Table VI.)

While the most frequently used terms (as shown in Table V) cause problems of precision, the unused and least used descriptors (as shown in Table VI) cause problems of recall. As the descriptors are drawn from a specialized thesaurus meant for space science and technology, such infrequent use (including non-use) of most terms, alongside very frequent use of a small set of descriptors, appears to be the result of variation in indexing level over time and from indexer to indexer. The problem may not be uncommon in libraries, but the quality of indexing is a serious concern if subject searching is to be promoted among users. Past research has repeatedly identified failure to match the system's subject vocabulary, recall problems in subject browsing displays lacking a thesaurus lookup facility, and inadequate or missing cross references as serious drawbacks for subject searching. Accordingly, many have suggested allowing the use of natural language, providing additional search aids like search trees, pointing users from nonstandard keywords to descriptors, and developing ontological and other automatic categorization techniques.
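The counting behind Tables IV-VI is straightforward and can be made concrete with a minimal sketch. Everything below is an assumption for illustration: the record structure, the sample values, and the toy stand-in for the NASA Thesaurus. The real input would be an export of bibliographic records from the library system; this is not the software actually used at ISAC.

from collections import Counter

# Hypothetical input: each OPAC record reduced to its list of assigned
# descriptors (postable terms). The literals are stand-ins for real data.
records = [
    ["REMOTE SENSING", "IMAGE PROCESSING"],
    ["REMOTE SENSING"],
    ["SOLAR PHYSICS", "SUN"],
]

# How many books each descriptor is assigned to (basis of Tables V and VI).
frequency = Counter(term for descriptors in records for term in descriptors)
ranked = frequency.most_common()  # Table V: most-assigned descriptors first

# Unused postable terms (the last row of Table VI): thesaurus entries never
# assigned to any book. A tiny stand-in for the 18,100 NASA Thesaurus terms.
thesaurus = {"REMOTE SENSING", "IMAGE PROCESSING", "SOLAR PHYSICS", "SUN", "RADAR"}
unused = thesaurus - set(frequency)

# Last column of Table IV: unique descriptors per book within a class number;
# values above 1 mean a single-descriptor search is relatively precise.
ratio = len(frequency) / len(records)

print(ranked, unused, round(ratio, 2))

Applied to the published figures, the same ratio gives 599/920 = 0.65 overall and 39/24 = 1.63 for the solar physics class, matching the last column of Table IV.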
5. CONCLUSION

The search process in online catalogues has more or less remained the same as in the card catalogue, with increased access points and a variety of search features, but with increased complexity of the process. End-users are expected to have not only technical searching skills but also, in the case of subject searching, the conceptual and semantic knowledge relating to the query that is needed to articulate it. Both indexing quality and the under-use of subject access in the OPAC cause serious concern about the so-called `subject approach' to knowledge. Inconsistent indexing quality has made subject searching still more difficult and ineffective. In addition to end-user training and the other remedies suggested, a totally new look at interactive searching is required: drag-and-drop of text from hits, a `more like this' feature, online thesaurus lookup with a classification link, partial matching, automatic AND searching, etc. Among the many suggestions made to the vendor of the software is a provision for indicating the intended use/application and/or level of a book, such as beginner, general, specialized/advanced, review, etc., which often helps to improve the precision of the search and the utility of the hits in the OPAC. Lastly, adding intelligent components, like quality indications based on community ratings/use statistics and automatic updating, is necessary.

REFERENCES

Alzofon, Sammy R and Van Pulis, Noelle (1984). Patterns of searching and success rates in an online public access catalog. College & Research Libraries, March, 45(2), 110-115.
Banks, Julie (2000). Are transaction logs useful? A ten-year study. Journal of Southern Academic & Special Librarianship, February, Vol. 1, No. 3.
Borgman, Christine L. (1986). Why are online catalogs hard to use? Lessons learned from information retrieval studies. Journal of the American Society for Information Science, 37(6), 387-400.
Borgman, Christine L. (1996). Why are online catalogs still hard to use? Journal of the American Society for Information Science, 47(7), 493-503.
Drone, Jeanette M. (1984). A use study of the card catalogue in the University of Illinois Music Library. Library Resources and Technical Services, Vol. 28 No. 3, July/September, pp. 253.
Ferguson, Douglas et al (1982). The CLR public online catalog study: an overview. Information Technology & Libraries, June, 1(2), 84-97.
Gorman, Michael and Jami Hotsinpiller (1979). ISBD: Aid or barrier to understanding? College and Research Libraries, Vol. 40 No. 6, November, pp. 251.
Gouke, Mary Noel and Pease, Sue (1982). Title searches in an online catalog and a card catalog: a comparative study of patron success in two libraries. Journal of Academic Librarianship, July, 8(3), 137-143.
Hildreth, Charles R. (1985). Online Public Access Catalogs. In: Martha E. Williams (Ed.), Annual Review of Information Science and Technology (ARIST), NY: Knowledge Industry Publications, Vol. 20, pp. 232-285.
Hirst, S.J. (1998). Hyperlib deliverable 1.2: In-depth survey of OPAC usage. Part of the Hyperlib electronic document store, University of Antwerp - University of Loughborough. www.lia.ua.ac.be/MAN/P12/root.html
Kaske, Neal K. (1988a). The variability and intensity over time of subject searching in an online public access catalog. Information Technology & Libraries, 7(3), 273-287.
Kaske, Neal K. (1988b). The comparative study of subject searching in an OPAC among branch libraries of a university library system. Information Technology & Libraries, 7(4), 359-372.
Larson, Ray R. (1991). Between Scylla and Charybdis: Subject searching in the online catalog. In: Irene P. Godden (Ed.), Advances in Librarianship, Vol. 15, San Diego: Academic Press, 175-236.
Lesk, Michael (2003). Collecting for a digital library: size does matter. Information Management and Technology, 36(4), October-December 2003, 184-187.
Matthews, Joe (nd). Value of Library Services: A presentation to the Joint Chapter Professional Development Day. www.sla.org/chapter/cpnw/postings/0401Mathews.ppt
Matthews, Joseph R., Gary S. Lawrence and Douglas K. Ferguson (1983). Using online catalogs: A nationwide survey. A report of a study sponsored by the Council on Library Resources. In: Matthews, Joseph R. (Ed.), The Impact of Online Catalogs. New York: Neal-Schuman.
McCarthy, Cavan (1975). Colonial cataloguing. New Library World, Vol. 76 No. 897, March, pp. 55-56. (Quoted from Ken Jones, Conflict and Change in Library Organizations: People, Power and Service, London: Clive Bingley, 1984, p. 29.)
Norden, David J. and Lawrence, Gail Herndon (1981). Public terminal use in an online catalog: some preliminary results. College & Research Libraries, July, 42(2), 308-316.
O'Brien, A. (1994). Online catalogs: Enhancements and developments. In: Martha E. Williams (Ed.), Annual Review of Information Science and Technology, 29, 219-242. Medford, NJ: Learned Information.
Oduwole, A.A. et al. (2002). On-line public access catalogue (OPAC) use in Nigerian academic libraries: a case study from the University of Agriculture, Abeokuta. Library Herald, Vol. 40 No. 1, March, pp. 20-27.
Sridhar, M.S. (1982). A study of library visits and in-house use of library documents by Indian space technologists. Journal of Library and Information Science, Vol. 7 No. 2, December, pp. 146-158.
Sridhar, M.S. (1983). User participation in collection building in a special library: a case study. IASLIC Bulletin, Vol. 28 No. 3, September, pp. 117-122.
Sridhar, M.S. (1985). A case study of lent-out use of a special library. Library Science with a slant to Documentation, Vol. 22 No. 1, March, pp. 19-34.
Sridhar, M.S. (1986). Pattern of card catalogue consultation in a special library. IASLIC Bulletin, Vol. 31 No. 1, March, pp. 9-16.
Sridhar, M.S. (1988). Library-use index and library-interaction index as measures of effectiveness of a special library: a case study. Proceedings of the XXXIV All India Library Conference on Library and Information Services: Assessment and Effectiveness, Calcutta: ILA, pp. 449-465.
Sridhar, M.S. (1989). Information-seeking behaviour of the Indian space technologists. Library Science with a slant to Documentation and Information Studies, Vol. 26 No. 2, June, pp. 127-165.
Sridhar, M.S. (1989). Patterns of user-visit, movement and length of stay in a special library: a case study. Annals of Library Science and Documentation, Vol. 36 No. 4, pp. 134-138.
Sridhar, M.S. (1994). Non-users and non-use of libraries. Library Science with a slant to Documentation and Information Studies, Vol. 31 No. 3, September, pp. 115-128.
Sridhar, M.S. (1995a). Understanding the user - why, what and how? Library Science with a slant to Documentation and Information Studies, Vol. 32 No. 4, December, pp. 151-164.
Sridhar, M.S. (1995b). Information Behaviour of Scientists and Engineers. New Delhi: Concept Publishing Company.
Sridhar, M.S. (2002). Library Use and User Research: with Twenty Case Studies. New Delhi: Concept Publishing Company.
Sridhar, M.S. (2004). OPAC vs. card catalogue: a comparative study of user behaviour. The Electronic Library, Vol. 24 No. 2, Mar/Apr 2004 (in press).
Wilson, Patrick (1983). The catalog as access mechanism: background and concepts. Library Resources and Technical Services, Vol. 27 No. 1, Jan/Mar, pp. 6-16.
Yuan, W. (1997). End-user searching behavior in information retrieval: a longitudinal study. Journal of the American Society for Information Science, 48(3), 218-234.

TABLES AND FIGURE

Table I: Access point/approach

Type of search/access | OPAC (2002) No. | OPAC (2002) % | Card catalogue (1985) No. | Card catalogue (1985) %
Title | 150 | 38.3 | 8 | 8.3
Author | 105 | 26.8 | 34 | 35.4
Subject (keyword) | 90 | 23.0 | 52 | 54.2
Class No./Report No. | 5 | 1.3 | 2 | 2.1
Place of publication | 1 | 0.2 | NA | NA
Publisher | 1 | 0.2 | NA | NA
Word in title (KWIC) | 30 | 7.7 | NA | NA
Combination search | 10 | 2.5 | NA | NA
Total | 392 | 100 | 96 | 100
Key: NA, not applicable

Figure 1: Access point/approach, OPAC (2002) vs. card catalogue (1985) (bar chart of the percentages in Table I; graphic not reproduced)

Table II: Percentage of searches by author, title and subject

Access point/type of catalogue | OPAC (2002) | Card catalogue (1985)
Title | 38.3 | 8.3
Author | 26.8 | 35.4
Subject (+KWIC) | 30.7 | 54.2

Table III: Success or failure of subject searches

Outcome | No. | %
Successes | 27 | 52.9
Failures:
- Abandoned | 12 | 23.5
- Changed keywords | 7 | 13.7
- Changed search | 5 | 9.8
Subtotal (failures) | 24 | 47.1
Total | 51 |

Table IV: Sample class numbers checked for descriptors

Sl. No. | Class No. | Subject | No. of books | No. of descriptors | Average | SD | No. of unique descriptors | Ratio of unique descriptors to books
1 | 523.9 | The Sun, solar physics | 24 | 66 | 2.75 | 1.27 | 39 | 1.63
2 | 528.8 | Remote sensing | 79 | 156 | 1.97 | 1.21 | 50 | 0.63
3 | 537.8 | Electromagnetism, electromagnetic field, electrodynamics, Maxwell theory | 52 | 86 | 1.65 | 1.26 | 28 | 0.54
4 | 539.216.2 | Films, thin films | 24 | 35 | 1.46 | 1.00 | 14 | 0.58
5 | 614.8 | Accident prevention, protection, safety | 28 | 39 | 1.39 | 0.88 | 21 | 0.75
6 | 621.3.049.7 | Printed circuits and the like | 10 | 21 | 2.10 | 0.89 | 11 | 1.10
7 | 621.314.5 | Conversion of AC into DC and vice versa, convertors, invertors | 17 | 41 | 2.41 | 1.18 | 20 | 1.18
8 | 621.38.049.771.14 | Microprocessors | 243 | 409 | 1.68 | 1.10 | 68 | 0.28
9 | 621.37.04 | Microelectronics | 32 | 71 | 2.22 | 1.11 | 28 | 0.88
10 | 621.381.542 | Image analysis | 30 | 71 | 2.37 | 1.57 | 38 | 1.27
11 | 621.383.5 | Barrier layer photocells, photovoltaic cells, photodiodes, phototransistors | 15 | 23 | 1.53 | 0.65 | 6 | 0.40
12 | 621.396.67 | Aerial systems | 79 | 150 | 1.90 | 1.86 | 55 | 0.70
13 | 621.390.96 | Radar | 58 | 107 | 1.84 | 1.92 | 55 | 0.95
14 | 629.7.036.5 | Rocket propulsion | 8 | 18 | 2.25 | 1.44 | 13 | 1.63
15 | 629.73 | Aeronautical engineering | 25 | 48 | 1.92 | 1.27 | 28 | 1.12
16 | 629.785 | Space probes | 21 | 52 | 2.48 | 1.01 | 21 | 1.00
17 | 658.562 | Quality control | 65 | 110 | 1.69 | 0.58 | 22 | 0.34
18 | 681.3.02 | Design, construction, layout of DP systems | 77 | 177 | 2.30 | 1.01 | 62 | 0.81
19 | 681.3.06vhd | VHDL (computers) | 15 | 25 | 1.67 | 0.36 | 9 | 0.60
20 | 681.351 | Computer networks | 18 | 32 | 1.78 | 0.40 | 11 | 0.61
Total | | | 920 | 1737 | 1.89 | | 599 | 0.65

Table V: Ranked list of descriptors in books
Rank | Descriptor | No. of books
1 | Computer programming | 1127
2 | Computer networks | 748
3 | Signal processing | 745
4 | Software engineering | 610
5 | Integrated circuits | 577
6 | Programming languages | 534
7 | Control theory | 520
8 | Neural nets | 515
9 | Artificial intelligence | 510
10 | Communication networks | 470
11 | Aerospace engineering | 468
12 | Computers | 457
13 | Microprocessors | 433
14 | Remote sensing | 426
15 | Data processing | 418
15 | Image processing | 418
17 | Robotics | 404
17 | Telecommunication | 404
19 | Database management systems | 368
19 | Dictionaries | 368
20 | Indexes (documentation) | 349
21 | Computer aided design | 343
22 | Astronomy | 341
23 | India | 340
24 | Circuits | 329
24 | Electrical engineering | 329
25 | Very large scale integration | 328
26 | Communication | 324
26 | Microcomputers | 309
27 | Computer programs | 303
28 | Multimedia | 299
29 | Management | 285
30 | Antennas | 281
31 | Libraries | 279
31 | Wireless communication | 279
33 | Computer graphics | 271
34 | Industries | 266
35 | Quality control | 264
36 | Expert systems | 263
36 | Operating systems (computers) | 260
37 | Architecture (computers) | 259
38 | Internets | 257
39 | Heat transfer | 248
40 | Bibliographies | 247
41 | Manufacturing | 246
42 | C (programming language) | 245
43 | Robots | 238
44 | Medical science | 234
45 | Algorithms | 220
45 | Electronic equipment | 220
47 | Lasers | 217
47 | Reliability | 217
48 | C++ (programming language) | 213
49 | Databases | 207
49 | Microwaves | 211
50 | Propulsion | 206
51 | Object-oriented programming | 205
Total (57 descriptors) | | 20452

Table VI: Descriptors with the least number of records/books (frequency distribution of least-assigned descriptors)

No. of descriptors | No. of books (records)
1,539 | 1
658 | 2
367 | 3
281 | 4
196 | 5
159 | 6
122 | 7
93 | 8
70 | 9
75 | 10
Subtotal 3,560 |
936 | >10
13,604 (unused) | 0
Total 18,100 |

About the Author

Dr. M.S. Sridhar is a postgraduate in mathematics and business management and holds a doctorate in library and information science. He has been in the profession for the last 35 years. Since 1978 he has headed the Library and Documentation Division of the ISRO Satellite Centre, Bangalore. Earlier he worked in the libraries of the National Aeronautical Laboratory (Bangalore), the Indian Institute of Management (Bangalore) and the University of Mysore. Dr. Sridhar has published four books ('User research: a review of information-behaviour studies in science and technology', 'Problems of collection development in special libraries', 'Information behaviour of scientists and engineers' and 'Use and user research with twenty case studies') and 74 research papers, written 19 course materials for BLIS and MLIS, presented over 22 papers at conferences and seminars, and contributed 5 chapters to books.

work_5ohm53fwj5borbpqyuak3jn3bu ----

Sutton, Lynn, Bazirjian, Rosann, and Zerwas, Stephen. Library Service Perceptions: A Study of Two Universities. College and Research Libraries, v. 70(5), pp. 474-496. Made available courtesy of the American Library Association: http://crl.acrl.org/content/70/5/474.abstract

Library Service Perceptions: A Study of Two Universities

Lynn Sutton, Rosann Bazirjian, and Stephen Zerwas

Two academic libraries in North Carolina replicated the Perceptions of Libraries and Information Resources global survey. This paper examines whether student responses in this survey are similar to the 2005 OCLC study and whether they are similar to each other. The authors examine potential reasons for similarities and differences, including student body profile, institutional differences in library services, and demographic factors.
The findings indicate that local factors dramatically affect the responses and should drive local service decisions rather than relying on global aggregate data.

In 2005, OCLC (Online Computer Library Center, Inc.) published Perceptions of Libraries and Information Resources,[1] the results of a global survey designed to explore people's information-seeking behaviors and build a better understanding of the "library" brand. Later that year, a subset of the data was published as College Students' Perceptions of Libraries and Information Resources.[2] Immediately upon publication, academic library directors across the United States began to wonder how their students would compare to the international sample. In North Carolina, two library directors of neighboring academic institutions (for these purposes Institution A and Institution B) designed a study to replicate five main questions from the OCLC study to learn how their students' answers compared. Permission was given by the principal contributor of the Perceptions report to replicate this study.

Research Questions

The two directors sought to answer the following research questions:

Are the student responses of Institution A/B similar to the responses found in the OCLC study? OCLC's data came from 396 participants of the survey who self-identified as currently attending a post-secondary institution. Can findings from the OCLC study be applied to these institutions?

How do student responses from the two institutions compare to each other? Institutions A and B are very different in terms of size, student body, and academic programs. Would this result in substantially different responses to the survey items?

Are there demographic differences in student responses to the survey? Demographic data gathered included age, gender, residency on or off campus, and year of college. Would these characteristics differentiate the data?

Lynn Sutton is Dean of Z. Smith Reynolds Library at Wake Forest University; e-mail: suttonls@wfu.edu. Rosann Bazirjian is Dean of University Libraries at The University of North Carolina – Greensboro; e-mail: rvbazirj@uncg.edu. Stephen Zerwas is Director of Academic Assessment at The University of North Carolina – Greensboro; e-mail: sczerwas@uncg.edu.

Review of the Literature

A review of Library Literature revealed many journal articles that mention OCLC in relation to perceptions of library services since 2005. Of those, 20 actually cite the Perceptions of Libraries and Information Resources study. Of those 20, only one article refers to a survey that was done with results subsequently compared to OCLC's responses. That 2006 article by Carol Tenopir links OCLC survey responses to a recent survey she conducted of faculty and students at seven universities in the United States and Australia.[3] Ms. Tenopir offers conclusions regarding e-journal usage versus book usage. Her study indicates that e-collections are heavily used and that article readership is growing consistently. She looks primarily at e-journals and notes the discrepancy with the OCLC report and its statement that "books" are the first things that college students think of when they think about the library.
She suggests that the discrepancy between her responses and those of the OCLC study is because the faculty and students that she interviewed were from universities with "great electronic library collections." The OCLC population was different, as they surveyed the general public, including nonlibrary users. This article comes closest to replicating the OCLC research that we have conducted, but Ms. Tenopir focused on just one aspect.

Other articles focus on certain aspects of the OCLC survey results, which are mentioned here for the sake of completeness but are beyond the scope of this study. Most concentrated on the result that search engines were trusted and used much more often than library Web sites as a source of information retrieval. These articles were often about related topics that used this finding as an example. A 2006 article by Lesley Williams focuses on how to make e-content more visible.[4] Michelle Jeske's article (2006) is about how to grow digital collections.[5] An article about gaming, written in 2006 by Ameet Doshi, also cites the search engine finding of OCLC.[6] Daniel L. Walters, in 2006, challenged public librarians to use the survey findings to boost the quality of library Web sites.[7] Paul T. Jaeger, in 2007, focused his paper on trust and the values of librarianship and again cited OCLC's comments that search engines are trusted more than library Web sites.[8] Michael Casey and Michael Stephens, in 2008, used this as a call to librarians to use OCLC's search engine findings as a foundation for change.[9]

A number of the articles centered on another major finding in the OCLC Perceptions document, namely that most individuals think of books first when they think of libraries. Most of these articles used this finding to focus on branding. John Cell (2008) says that we need to find our "core of uniqueness."[10] Shu Liu (2008) says it is time to rejuvenate the library brand and make library Web sites more pertinent.[11] Nancy Stimson said that patrons believing that libraries are just about books should make us all step back and think.[12] Elizabeth Karle (2008) talks about creative programming ideas to rejuvenate the library brand.[13] Trudi Bellardo Hahn (2008) questioned whether or not the brand of books was really all that bad.[14] She feels that we will be in even more trouble if users stop thinking about libraries as books. Dick Kaser in 2006 also proposed that books hold value, so why is "books" a negative response?[15] Book publishers should be heartened by this response, and he is thrilled that books seem to have a lasting value in this day and age. Scott Condon, in 2006, advised us not to see the OCLC report as a death sentence for libraries.[16] He states: "As libraries adapt and evolve, let's make sure we do so in accord with our values and principles, rather than from fear, expediency, or speculative zeal."

Based on this literature search, the authors are confident that no other university libraries have produced a survey and study such as ours. Applying the OCLC findings to individual libraries is untried. Based on a review of the literature, we find that the generic OCLC survey responses are being used for local decision making at libraries across the country. This article will discuss whether or not the OCLC survey results are representative of the findings at two neighboring, yet significantly different institutions.
Sample

The sample frame consists of randomly selected students taken from two institutions in North Carolina. Students were contacted by e-mail and asked to participate in a Web-based survey that elicited information on their perceptions of the library. At least two reminders were sent to the students who received the surveys. Subjects were entered into drawings for $100 gift cards to the campus bookstore as incentives for their participation in the survey. The sample at Institution A consisted of 3,504 students. Eighteen e-mails were undeliverable, and 478 of the remaining students responded, for a response rate of 14.4 percent. The sample for Institution B was 4,972, with 27 undeliverable e-mails, resulting in 486 respondents, for a response rate of 9.8 percent. Institution A and Institution B are dramatically different universities. In addition to the comparison of a small, private institution (Institution A) to a mid-sized public institution (Institution B), as can be seen in Appendix A, there are substantial differences between the student populations of the two institutions.

The analysis used for this study was an a priori content analysis approach based on the coding categories published in the OCLC Perceptions of Libraries and Information Resources. Content analysis is a systematic technique for summarizing any form of communication into fewer elements and is used to identify themes or other characteristics of communication. Communication is analyzed, and codes are assigned to each content unit. The unit of analysis for this study was the complete response given by each subject for each open-ended question.

Analysis of the Data

Analysis of the data was performed using Roxanne Content Analyzer, a Microsoft Access application. The two analysts independently reviewed the content for each question and completed a preliminary analysis using the OCLC codes. Analysts were allowed to identify multiple codes for each subject response, since multiple themes were present in the subjects' responses. The analysts then compared their analyses and refined their analysis approach for any disagreements in coding. The analysts then recoded the content for each open-ended question using their refined understanding of the OCLC codes. Following analysis, the results were compared and reliability statistics were calculated. Reliability statistics ranged from 76.4 percent to 85.8 percent (a sketch of how such inter-coder agreement figures can be computed follows the first comparison below). Comparisons were made using institutional and demographic data, and comparisons were made with the original OCLC study.

Findings

Question One: "What do you feel is the main purpose of a library?"

OCLC Comparison: The merged (Institution A and B) responses to the first question differed significantly from the OCLC study. Fully 38 percent of Institution A/B students indicated that the building/environment was the main purpose of a library, with 35 percent responding that materials were the main purpose, and a close 34 percent saying that libraries are for research purposes. OCLC reported that approximately 49 percent[17] of the respondents said that information was the main purpose of a library. Books were cited by 32 percent of the respondents, and 20 percent replied that research was the main purpose. Surprisingly, compared to OCLC, only 26 percent of the Institution A/B students felt that information was the main purpose. Also, a mere 12 percent felt that the main purpose of a library was books.
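As flagged above, the following is a minimal sketch of one way inter-coder agreement percentages like those reported in the Analysis of the Data section can be computed. The article does not state its exact reliability formula, so the per-response set-overlap measure below, and the code sets used as data, are illustrative assumptions only.

# Hypothetical coding output: for each open-ended response, the set of
# OCLC codes each analyst assigned (multiple codes per response allowed).
coder1 = [{"books"}, {"building", "research"}, {"information"}]
coder2 = [{"books"}, {"building"}, {"information"}]

def percent_agreement(a, b):
    # Per-response overlap of the two code sets divided by their union,
    # averaged over all responses and expressed as a percentage.
    scores = [len(x & y) / len(x | y) for x, y in zip(a, b)]
    return 100 * sum(scores) / len(scores)

print(round(percent_agreement(coder1, coder2), 1))  # 83.3 for this toy data

Averaging the per-response overlap rewards partial agreement when analysts assign multiple codes to the same response; a stricter exact-match rule would yield lower figures.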
Findings

Question One: "What do you feel is the main purpose of a library?"

OCLC Comparison: The merged (Institution A and B) responses to the first question differed significantly from the OCLC study. Fully 38 percent of Institution A/B students indicated that the building/environment was the main purpose of a library, with 35 percent responding that materials were the main purpose, and a close 34 percent saying that libraries are for research purposes. OCLC reported that approximately 49 percent of the respondents said that information was the main purpose of a library.17 Books were cited by 32 percent of the respondents, and 20 percent replied that research was the main purpose. Surprisingly, compared to OCLC, only 26 percent of the Institution A/B students felt that information was the main purpose. Also, a mere 12 percent felt that the main purpose of a library was for books.

It is a real surprise to note that the largest response from the combined institutions was OCLC's lowest response (5%). Students provided both very positive and very critical comments regarding the building/environment of the two libraries. One student replied that the main purpose of a library is "to provide an environment geared toward studying." Another student said that a library is "a place to foster learning." One student claimed that the purpose of a library is to "provide an environment for self-study and reflection, while encouraging groups of people an opportunity for collaborative creation and research." The library as place is a very important concept for Institution A/B students.

Institutional Comparison: Although the combined Institution A/B response favored building/environment as the main purpose of a library, the institutional responses differed significantly. Nearly half (49%) of Institution A students listed building/environment first, with research and materials tied for second place at 36 percent. One response from an Institution A student expressed it this way: "Though one might argue that the library is a place to do research (and I do plenty there!), my initial associations are of a place that is quiet, distraction-free, and enables me to effectively get my work done." Only 27 percent of Institution B students felt that building/environment is the main purpose of a library, the fourth-rated response. The top response at Institution B was a tie between information and materials, both at 33 percent, followed closely by research at 32 percent. An Institution B student felt that the main purpose of a library is to serve as a research hub: "A library should be a hub for any scholarly work involving textual or audio/visual media research." In general, Institution B responses more closely paralleled OCLC; it was only when the strong 49 percent response from Institution A for building/environment combined with the 27 percent from Institution B that the overall response differed from OCLC. Reading, entertainment, and unknown/NA were at the bottom of the list for both institutions.

Demographic Comparison: Gender was not a differentiating factor on this question, as both male and female responses followed the same order of frequency as the combined response: building/environment (42%-M, 37%-F), followed by materials (33%-M, 35%-F), research (32%-M, 35%-F), information (25%-M, 26%-F), and books (11%-M, 12%-F). Those who were residents on campus ranked building/environment as their top response (51%), followed by research (38%), materials (36%), information (19%), and books (14%). Off-campus residents rated materials first with 34 percent, followed by research (32%), information (31%), and then building/environment (30%). Age and Year in School are linked variables, and both were differentiating factors in this question. Respondents 18 to 24 years old (roughly equivalent to undergraduates in their first through fourth/fifth years) favored building/environment (48%), research (36%), and materials (35%) in their responses, very similar to the overall combined response. Those respondents who were 25 to 64 years of age favored information (38%), materials (34%), and research (30%), closer to the OCLC response, though substituting materials for books. Master's students listed information as the main purpose of a library, while doctoral students felt materials were most important.
Professional students surprisingly returned to the undergraduate focus on building/environment as the main purpose of a library.

Question Two: "What is the first thing that you think of when you think of a library?"

OCLC Comparison: This is the trademark question that OCLC used to identify "books" as the Library brand. Overwhelmingly, the first response for OCLC survey respondents was books, with 70 percent giving that response. Building/environment was a distant second at 12 percent. The much-different top response for Institution A/B respondents was building/environment at 45 percent, with books a close second at 43 percent. The bottom-most response was the same for both surveys: reference came in at 1 percent for Institution A/B and 0.5 percent for OCLC. That low ranking is an important statement for libraries, and it offers librarians an opportunity to rethink the way reference services are provided. Comments from the Institution A/B students about the building/environment focused, over and over again, on the library as a quiet place to study. Typical responses were "a place to relax study and read," "a quiet place to think/study," and a personal favorite: "a place of mild climate where I can find adventures."

Institutional Comparison: Again, the Institution B response more closely paralleled the OCLC response, with books (45%) as the top answer, followed by building/environment (34%) and materials (17%). Typical responses from Institution B students, when asked the first thing they thought of, were "Many, many books" and "Tons of books." Institution A students strongly expressed their preference for building/environment (55%), followed by books (41%), research (16%), and materials (9%). A typical Institution A student said the first thing thought of is "A quiet place to think and study." Reading and reference were the bottom-most responses for students at both institutions, with information in the middle for both.

Demographic Comparison: Again, gender was not a factor in determining student responses, as both males and females answered the top-of-the-mind question in the following order: building/environment (46%-M, 45%-F), books (40%-M, 45%-F), research (16%-M, 16%-F). More than half (56%) of on-campus students listed building/environment as the first thing they thought of, followed by books at 44 percent and research at 16 percent. Off-campus students agreed with the OCLC respondents (though at a much lower rate) when they listed books first at 42 percent, followed by building/environment at 39 percent and research at 16 percent. Answers also differed by Age and Year in School, as 53 percent of 18- to 24-year-olds thought first of the building/environment, followed by books at 43 percent and then research at 15 percent. For ages 25 and greater, books took the number one position with 42 percent, followed by building/environment at 30 percent and materials at 21 percent. Year in School groupings provided very interesting data. Years One through Four/Five at the undergraduate level all named building/environment as the first thing they thought of, in percentages over 50 percent; but at the master's and doctoral degree level, this changed dramatically to books. As in question one, this changed again for students in professional schools, where 73 percent again chose building/environment as the first thing they thought of in a library.
Question Three: "Please describe your positive associations with the library."

OCLC Comparison: Once again, the top two answers for OCLC and Institution A/B were the same, but appeared in reverse order. The number one response for Institution A/B was facility/environment (47%),18 with products and offerings a close second (46%). OCLC reported products and offerings in first place (61%) and facility/environment a distant second at 13 percent. For Institution A/B, facility/environment responses related to the libraries providing a quiet environment, friendly and comfortable surroundings, and a pleasant work environment. OCLC comments in this category revolved around the same focus: quiet, clean, nice, and comfortable atmospheres. For products and offerings, Institution A/B students appreciate the many computers in the libraries and convenient computer access, books, and online access to resources and electronic journals. As for the OCLC responses in the products/offerings category, the majority had to do with books, followed by information; the concept of free information appeared in these responses. Computers and easy to find/access was the lowest reported response in the category, yet surprisingly it was one of the more popular responses with the Institution A/B students. The next two responses are the same for OCLC and Institution A/B. The third most popular response was staff (25% for Institution A/B and 9% for OCLC). Staff responses in both surveys revolved around helpful, friendly, and knowledgeable staff. At Institution A/B, customer/user services answers were also about helpful, friendly staff, ILL services, and the research help that both libraries provide. Customer/user services among the OCLC respondents centered on the practice of the library being open to the public, ILL (once again), the availability of an online catalog, and the ability of libraries to meet their needs.

Institutional Comparison: The first two responses for students at both institutions were the same, products and offerings and facility/environment, but they were reversed. Again, Institution B more closely replicated the OCLC response, with products and offerings at 49 percent and facility/environment in second place at 33 percent, though the OCLC percentages were much further apart at 61 percent and 13 percent. An Institution B student said, "I love the online access to databases and journals." Another said, "Being able to use the computer and having a great selection of books and a very nice staff with both the librarians and the security." At Institution A, students continued their strong preference for facility/environment with 62 percent and products and offerings at 43 percent. The attachment to the facility at Institution A is expressed by this student: "The library is gorgeous, in particular, the atrium. Its hours are long enough to allow for students to stay up to study/write papers/work on projects. During exams, the library had free coffee and also was open 24 hours every day. Perhaps the best part of that period, though, was the free food (either subway or pizza) they offered at 1 a.m. each night." The remaining three ranks were the same at both institutions: staff (28% at Institution B and 22% at Institution A), customer/user service (18% at Institution B and 19% at Institution A), and unknown factors.
Demographic Comparison: A slight difference was shown in gender responses, where females ranked products and offerings and facility/environment exactly the same at 46 percent, but males gave a slight preference to facility/environment at 51 percent over products and offerings at 47 percent. Staff and customer/user service are in third and fourth place for both sexes. On-campus residents preferred facility/environment (59%) over products and offerings (44%). Off-campus residents, who include more graduate students, prefer products and offerings over facility/environment, 48 percent to 40 percent. Staff and customer/user service are again in third and fourth place for both groups. In the closely linked Age and Year in School categories, 18- to 24-year-olds listed more positive associations with facility/environment (56%) than products and offerings (42%), as did students in the first four undergraduate years. More positive associations were found among 25- to 64-year-olds in products and offerings (54%) than facility/environment (31%), corresponding to the graduate student responses, with 53 percent of master's students and 59 percent of doctoral students preferring products and offerings. As in questions one and two, professional school students returned to the preference for facility/environment, by a 75 percent to 29 percent margin over products and offerings.

Question Four: "Please describe your negative associations with the library."

OCLC Comparison: The number one response for Institution A/B was facility/environment (51%). This was a real surprise, since it had prominently shown up as the number one response in the previous questions, all with a positive spin. For OCLC respondents, facility/environment was the second most popular response at 25 percent. For both sets of respondents, the responses tended to be the same: the buildings are too loud or too quiet, too crowded or too outdated, too big or too small, and confusing in layout. The second most popular response for Institution A/B is unknown (22%). Under this category, the authors counted all of the replies that said there were no negative associations with the library or that this question was n/a. The number one response for OCLC respondents, products and offerings (39%), was the number three response for Institution A/B students (14%). Books and computers were some of the items complained about in this category. For Institution A/B, there were comments about the computers always being full, that the libraries didn't have enough books on a particular topic, the printers didn't work, the library had outdated books, and that more full text was needed. For OCLC, the majority of responses within this category were about books and materials; computer complaints also appeared. Customer/user service was the fourth most popular response at Institution A/B (12%), and the third at OCLC (23%). Complaints about customer/user service at Institution A/B centered strongly on access services issues, including library fines, overdue notices, renewals, and recall policies. For OCLC, comments regarding customer/user service were also about fees and policies and stringent return dates, again an access services focus. However, OCLC respondents also commented on hours of operation, waiting in line too long, and lack of privacy issues. Finally, staff was the least reported response at Institution A/B, at 4 percent. It was also the least reported at OCLC, at 6 percent.
For both sets of survey responses, the few comments in this category were about unfriendly, unavailable, and not very helpful staff.

Institutional Comparison: For the first time, both universities showed the same rankings for a question, though the percentages varied. The most common negative association at each library was facility/environment. This may be attributed to the continued preoccupation with library as place at each institution. At Institution A, 61 percent of negative associations were with the facility/environment, and at Institution B it was 41 percent. At Institution A, where each student receives a laptop as part of tuition, the number one complaint was about a lack of electrical outlets. For example: "There is not enough space in the 24-hour room to really focus by myself and there are not enough electrical outlets in the other parts of the library to use my computer." Uncomfortable furniture was also a concern: "The furniture in the library is quite possibly the least comfortable furniture I've ever encountered. Librarians and those who go to libraries should not have to be subjected to such pain." At Institution B, students were frustrated with crowded computer labs and building conditions. One student said, "It takes too long to find something, it's a long way for me to get there, it is not in a very accessible place." Students at both libraries complained of the confusing arrangement of the building and stacks. A typical comment was, "They are confusing! I always struggle to get on the right floor. They never seem to be laid out well. Inevitably, it is hard to get from floor to floor as well." As previously stated, unknown was in second place at both libraries, attributed to the n/a answers from large numbers of students, which is a good thing. Products and offerings followed in third place, with 17 percent at Institution B and 12 percent at Institution A.

Demographic Comparison: This question alone had consensus agreement across all demographic factors. Facility/environment was the leading negative association for males (51%) and females (51%), on-campus (58%) and off-campus (47%) residents, 18- to 24-year-olds (57%) and 25- to 64-year-olds (40%), and all Years in College from freshman (59%) to doctoral students (37%). Library as place is very important to all student users in academic libraries, and they notice when the environment does not meet their needs. Complaints were registered about the confusing layout of both libraries: lack of 24-hour availability, lack of a coffee shop, uncomfortable furniture, dim lighting, lack of electrical outlets, lack of group study spaces, noise, temperature, and cleanliness.

Question Five: "If you could provide one piece of advice to your library, what would it be?"

For the fifth time, facility/environment was the Institution A/B combined first response, with 40 percent of the respondents' answers falling into this category. For OCLC, facility/environment was the second most popular response at 23 percent. For Institution A/B, comments in this category were about providing either quieter or more group study areas, lighting, the need for more outlets, the provision of more comfortable seating, and a desire for longer library hours. Products and offerings was the number one response from the OCLC surveys (27%) and the number two response from the Institution A/B surveys (31%); both ranked high in percentage.
Responses in this category from OCLC were in the areas of adding more to the collections, updating collections, computers, and online chat. Institution A/B responses also focused on the need for adding more books to the collection as well as online resources, along with a request to update the collections; online chat was not mentioned. Customer/user service ranked third for Institution A/B (14%) and third for OCLC (22%).

Institutional Comparison: Institution A students again responded with the most pieces of advice (50%) in the category of facility/environment, meaning all five questions at Institution A were focused on library as place. One student summed it up: "Students use the library primarily, when not conducting actual research, for homework and studying. The library is lacking in appropriate areas to do this. Increase lighting in the stacks, update the chairs/desks, and add more outlets. This would make the library much better for student use." Institution B students listed products and offerings first (35%), followed closely by facility/environment at 29 percent. One thoughtful student at Institution B said, "Keep thinking of ways to help students and faculty get at useful information. Now that electronic journals are prevalent, keep doing that, but work also on newer ideas, like how to get primary documents available online." In second place at Institution A were comments about products and offerings (27%). Tied for third and fourth place at Institution A were satisfaction and customer/user service, both at 11 percent. At Institution B, customer/user service came in third at 16 percent and satisfaction at 13 percent.

Demographic Comparison: Both males and females gave the most pieces of advice on facility/environment (42% to 39%). Next was the products and offerings category for both genders (32% for males and 31% for females), followed by satisfaction at 14 percent for males and customer/user service at 15 percent for females. Answers differed by residency status, however, as on-campus students gave the most advice by far on facility/environment (51%), followed by products and offerings at 27 percent and customer/user service at 12 percent. Off-campus students were very close in their first two answers, with products and offerings at 34 percent, followed by facility/environment at 33 percent and customer/user service at 14 percent. Consistent with previous questions, undergraduate students in their first through fifth years focused on facility/environment, ranging from 44 percent to 59 percent; but for master's and doctoral students, products and offerings became more important, at 42 percent and 52 percent respectively. Professional school students continued the fascinating trend of behaving more like undergraduate students in their focus on facility/environment (73%).

Answers to Research Questions

The first research question was: Are the student responses of Institution A/B similar to the responses found in the OCLC study? The answer is a resounding no. The top-ranked response was different in every category for OCLC and the combined Institution A/B results. Facility/environment was top-ranked for all five questions from the two institutions and never higher than second in any of the OCLC questions. This is a startling finding. OCLC had only a small sample of 396 survey participants who self-identified as currently attending a postsecondary institution.
These could have been students in community colleges, trade schools, liberal arts colleges, or research universities anywhere across the globe. Our combined study provided a total sample of 964 students: 478 from Institution A and 486 from Institution B. Something about these larger samples on individual campuses resulted in a much different response from the small but broad sample of the OCLC study. It is obvious that library as place is much more important to students on these two campuses than in the general OCLC findings. To learn why this might be, it is necessary to probe deeper into the responses.

The second research question was: How do student responses from the two institutions compare to each other? Keeping in mind that the numbers in the sample populations at each institution are nearly equal, the answer to that question is revealing. Institution A's students placed facility/environment in first place for every question. It was both their most positive and their most negative experience of the library. It was the main purpose and the first thing they thought of when they thought of the library. And it was the subject of most of the advice that they wanted to give about the library. At Institution B, facility/environment placed fourth, second, second, first, and second, respectively, in answers to the five questions. But when combined with the overwhelming emphasis on place at Institution A, it became the most prevalent combined answer in every category. Institution B responses much more closely resembled the OCLC set. In four of the five categories (all except negative associations), the top Institution B response was the same as the top OCLC response. What is it about these two neighboring sets of students that makes them answer so differently? It could be local factors such as the physical condition of each library (although both directors readily admit that each library is in serious need of updating and renovations). The composition of each student body is quite different. Institution A is an elite Top 30 school with average SAT scores that are 277 points higher than Institution B's. Yet it would be counterintuitive to reason that higher-qualified students seek only a place to study and don't value the materials, products, and services provided by the library. We need to look at demographics for other possible clues.

The third research question was: Do demographics matter? Here the answer must be a resounding yes, at least for some categories of demographics. Gender was the least differentiating factor, as males and females agreed on almost every question with a high degree of similarity in response percentages: 42% to 37% for main purpose; 46% to 45% for first thing thought of; 51% to 46% for positive associations; 51% to 51% for negative associations; and 42% to 39% for advice to the library.

There was much more variability by residency. Top responses of on-campus and off-campus students were different in four of the five questions. Only in their negative associations with the library did they agree that building/environment was foremost. In the other four questions, on-campus students continued to list building/environment as their top answer, but off-campus students had other priorities. For the main purpose of a library, their top three answers were materials (34%), research (32%), and information (31%); building/environment came in fourth with 30 percent.
The first thing off-campus students thought of when they thought of a library was books (42%), followed by building/environment (39%) and research (16%). Off-campus students had the most positive associations with products and offerings (48%), followed by facility/environment (40%) and staff (27%). Off-campus students provided more pieces of advice on products and offerings (34%) than on facility/environment (33%) or customer/user service (14%). Looking at the characteristics of on-campus students, they are more often in the first two years of their undergraduate careers and in the 18–24 age group. Most graduate students live off campus rather than on. It should be noted that a majority of students attending Institution B live off campus.

Age and Year in School are closely linked variables. 18- to 24-year-olds are most often undergraduate students, especially at Institution A. Both the 18–24 age category and Years One through Four in school show building/environment as the top-rated response at both schools to each question, replicating the total combined results for Institution A/B. Even in questions where the overall Institution B response was something other than building/environment (all questions except negative associations), when only Institution B undergraduates or students from 18 to 24 years of age were considered, the answer became place focused. Looking at demographic data for each school, 18- to 24-year-olds are 85 percent of all Institution A participants and 42 percent of all Institution B participants. Together, they are 63 percent of the total combined participants. Using the Year in School demographic, students in all undergraduate years are 73 percent of the Institution A respondents and 40 percent of all Institution B respondents. Given the much lower percentage of undergraduates at Institution B, it is evident why total responses begin to differ from the overwhelmingly undergraduate student body at Institution A and why they begin to more closely resemble the broader global sample of OCLC.

Given the demographic trends described above, it is reasonable to conclude that the driving force behind the place-centered answers of the combined response, and of Institution A in particular, is the age and year-in-school demographics. This validates perceptions that library staff have had for years, namely that undergraduates use the library most often to study. This phenomenon has driven the recent emphasis on library as place in the literature. There has been a recent boom in academic library renovation that is transforming academic libraries into inviting, comfortable places for individual and collaborative study, complete with coffee shops, soft seating, and places for group study.

Graduate students differ from undergraduates in their values; as their answers indicated, information is the main purpose of a library for master's students, and materials are the main purpose for doctoral students. Both master's and doctoral students say that the first thing they think of is books, ironically the same response as the broad, global OCLC study. Both master's and doctoral students say that products and offerings hold their most positive associations, although, interestingly, both go back to facility/environment in their negative associations.
Again, it is products and offerings that both master's and doctoral students have in mind when they offer advice to the library.

One of the most fascinating findings of this study is the phenomenon of professional school students, who very nearly replicate the answers of undergraduate students in their approach to the library. A total of 29 individuals identified themselves as professional school students in Law, Divinity, or Business at Institution A. Institution B did not offer the category of professional school students. From the qualitative answers to questions, it was evident that a large number of responses came from law school students at Institution A. Like undergraduates, their top answers to all five questions were building/environment. This could be a local phenomenon, since at Institution A law students are assigned permanent carrels within the library and become quite possessive about their space. Or it could be a more generalized phenomenon that law students, like undergraduates, study heavily out of textbooks and use the library for intense study and classroom preparation. Additional research is needed at other schools to test this finding.

Implications and Further Research

The major takeaway point for the authors is that libraries should not rely on the data presented in College Students' Perceptions of Libraries and Information Resources for making decisions in their local environments. Use local data for local decisions. At many conferences and workshops, presenters are informing their audiences that books are the first things that students think of when they think about libraries. Yet the aggregated survey results indicated that the building/environment is the first thing that is thought about for the two libraries in this study. The demography, makeup, and other local traits of Institution A are such that building/environment is the first library thought from their students. However, books were the first library thought from the students at Institution B, as discussed earlier. Their environment, range of services, and varied demography probably more closely resemble the respondents of the OCLC survey. The lower percentage of undergraduates at Institution B could explain why building/environment was not number one in most categories. In other categories, Institution B more closely resembled the OCLC responses than Institution A. Libraries should compare themselves to the demographic charts in the Appendix to see what their own students may be thinking, and then test those suppositions with a local study of their own.

Local factors are likely to have played a role in the answers given by students at each institution. At Institution A, the library enjoys a close relationship with its students. Nearly 75 percent of the student body lives on campus, and there are very few public spaces that students can use for study other than the library. A 1991 addition created a beautiful atrium space that is a campus favorite and accounts for many of the positive comments about the library as place. On the other hand, the original wing of the building has been largely untouched since it was built in 1956 and accounts for many of the frustrations students feel with uncomfortable furniture and the lack of electrical outlets.
Institution A was one of the first laptop campuses in the country and provides students with a new laptop and printer as freshmen and again in their junior year, all as part of their tuition. Thus, there are no issues with outdated or crowded computer labs, but the laptop environment creates a strong demand (and frustration) for electrical capability. Institution B underwent a major beautification effort from 2005 to 2007. Prior to that time, the main library had been described as "prison-like" by many students. The way the library used to look may be a major factor in why the building/environment was not the number one response for all questions except the "negative associations" question. In addition, at the time this survey was taken, in the spring of 2007, the library did not have an information commons, collaboratories, 24x5 space, or as many group spaces outfitted with comfortable furniture. The library currently has all of those features. As a result, the survey results could be very different if taken now.

The need for further research is clear. It would be useful to replicate this survey at the research library level, both with public and private members of the Association of Research Libraries (ARL). In addition, different types of libraries could benefit from conducting this survey, specifically special and public libraries. The authors suspect that the results of the survey taken at a public library might more closely resemble the OCLC survey results because of the range of services and demographics. The authors also strongly recommend that any libraries looking for data to renovate or upgrade local services consider conducting this survey. Special libraries, for example, may have a very different response in terms of the reference response, and this would surely impact any recommendations for change. Both institutions will be sharing these survey findings with their respective provosts, as the findings impact budget, service, and renovation decisions for the future.

Notes

1. OCLC, Perceptions of Libraries and Information Resources: A Report to the OCLC Membership (2005). Available online at www.oclc.org/reports/2005perceptions.htm. [Accessed 25 November 2008].

2. OCLC, College Students' Perceptions of Libraries and Information Resources: A Report to the OCLC Membership (2006). Available online at www.oclc.org/reports/perceptionscollege.htm. [Accessed 25 November 2008].

3. Carol Tenopir, "Perception of Library Value," Library Journal 131, no. 20 (2006): 36.

4. Lesley Williams, "Making 'E' Visible," Library Journal 131, no. 11 (2006): 40–43.

5. Michelle Jeske, "Who Knows What The Future Will Bring? Get Prepared!" Colorado Libraries 32, no. 2 (2006): 14–18.

6. Ameet Doshi, "How Gaming Could Improve Information Literacy," Computers in Libraries 26, no. 5 (2006): 14–17.

7. Daniel Walters, "Thoughts About Our Web Sites, Catalogs, and Databases," Public Libraries 45, no. 3 (2006): 7–9.

8. Paul T. Jaeger, "Public Libraries, Values, Trust, and E-Government," Information Technology and Libraries 26, no. 4 (2007): 34–43.

9. Michael Casey and Michael Stephens, "Insights from the Front Line," Library Journal 133, no. 3 (2008): 6–27.

10. John Cell, "CIP on the Moon," Library Journal Net-Connect (Jan. 15, 2008): 2–5.

11. Shu Liu, "Engaging Users: The Future of Academic Library Web Sites," College & Research Libraries 69, no. 1 (2008): 6–27.

12. Nancy F. Stimson, "Library Change as a Branding Opportunity: Connect, Reflect, Research, Discover," C&RL News 68, no. 11 (2007): 694–98.
13. Elizabeth M. Karle, "Invigorating the Academic Library Experience: Creative Programming Ideas," C&RL News 69, no. 3 (2008): 141–44.

14. Trudi Bellardo Hahn, "Mass Digitization: Implications for Preserving the Scholarly Record," Library Resources & Technical Services 52, no. 1 (2008): 18–26.

15. Dick Kaser, "Sanity Check," Information Today 23, no. 2 (2006): 16.

16. Scott Condon, "Adapt or Die," Alki 22, no. 1 (2006): 25–26.

17. The OCLC study did not report numerical findings, only graphical information with rough indications of percentage.

18. This paper follows the OCLC study, which changed terminology from building/environment in questions 1 and 2 to facility/environment in questions 3 to 5.

work_5olgmjdngfer3lv4wtbjf6xvby ----

The Technological Challenges of Digital Reference: An Overview

D-Lib Magazine, February 2003, Volume 9 Number 2, ISSN 1082-9873

Jeffrey T. Penka
OCLC Online Computer Library Center, Inc.

Abstract

Much has been written about the various tools for digital reference, technical issues associated with their implementation, and the potential for these tools to reach new patrons. In this article, the author focuses on the need to understand the technical environment within which digital reference occurs, from issues of patron definition and access to the role of cooperative relationships and networks in meeting the shared needs of librarians and patrons. The author provides an overview of today's reference environment, along with data and practical examples from services like QuestionPoint™ [1], the Library of Congress, and Ask Joan of Art®, to demonstrate the importance, for libraries engaged in digital reference, of understanding audiences, using technology appropriately, and working cooperatively.

Introduction

Significant change in reference librarianship had been brewing for some time before the introduction of the World Wide Web in 1995. The 1980s and early 1990s saw this change express itself in debates such as "mediated versus unmediated online searching," "access versus ownership," and "print versus electronic," and in professional concerns that gradually widened to include electronic licensing and consortial collection development. The Web introduced new possibilities and additional interactive technologies such as e-mail, chat, and instant messaging to the reference desk; however, the effort of keeping current with the pace of change in technology and tools can redirect focus from services and patrons to tools, and can make it harder to gather information, assess tools, and arrive at an informed decision. Within this context, the pace of change and new interactive technologies often dominate the discussion of digital reference, rather than the library's service goals and the appropriate roles technology plays in supporting those goals. This discussion of the technological challenges associated with digital reference therefore does not focus on which interactive technologies support the reference interview, but on the challenges libraries face in establishing and supporting an efficient, patron-focused digital reference service based on library values.
Gorman summarizes the eight central values of librarianship as stewardship, service, intellectual freedom, rationalism, literacy and learning, equity of access to recorded knowledge and information, privacy, and democracy [Gorman, 2000]. Against this backdrop, libraries encounter wave after wave of technological innovations, each offering new options, features, opportunities, and potential distractions. Libraries face the ongoing and sometimes paradoxical challenges of keeping up with these changes, implementing the new technologies, and maintaining a perspective on the technologies in relation to the libraries' work and core values. Janes sums up the challenge of conducting reference services in an increasingly digital environment in this way: "All professions and sectors must pay greater attention to how ever-rising connectivity and the digitization of resources are affecting their work, their professions, and the communities they serve" [Janes, 2002]. To this end, it becomes critical for libraries to understand the current technological landscape and to have an articulate vision of the customers or patrons they intend to serve. Without this clarity, technology, rather than vision and needs, may end up driving change.

Maturing of Digital Reference

When libraries first started providing digital reference services through the Internet in the mid 1990s, those services primarily consisted of e-mail addresses where patrons might submit a question and get an answer. Since then, libraries have begun to assess and adopt a variety of asynchronous and synchronous technologies, such as web forms, knowledge bases, and chat products, to help them provide services in the web environment. Many of these efforts could be classified as ad hoc, homegrown solutions, where libraries and organizations looked at the available technologies and cobbled together systems that met their local needs. Based on requests from libraries creating these types of solutions, software vendors who traditionally served other industries began to look at ways to retrofit and adapt their call-center products to the digital reference market. These efforts became more organized in the late 1990s with the introduction of solutions created specifically for libraries, such as those from Library Systems and Services, L.L.C. (LSSI) and 24/7 Reference.

Other developments demonstrated the maturation of digital reference as well. The Library of Congress' Collaborative Digital Reference Service (CDRS) pilot, for example, explored the growth of cooperative systems worldwide starting in 1998. In 2002, QuestionPoint, a collaborative effort of the Library of Congress (LC) and OCLC Online Computer Library Center, Inc. (OCLC) [1], became the next generation of the CDRS.

Several developments characterized the growth of digital reference. Dramatic growth occurred in the number and type of tools available to support digital reference services, and in products and services directed specifically at libraries. At the Virtual Reference Desk 2002 Conference, Milewski provided a comparison of 18 products and services currently used by libraries offering digital reference synchronously and asynchronously. Of those presented, six were specifically developed for or targeted directly to libraries, including QuestionPoint, LSSI, and 24/7 Reference [Milewski, 2002]. Two more characteristics of growth in digital reference were an increase in research and development in the area, and the development of standards for digital reference systems.
Recent research on librarian experiences and attitudes about digital reference [Janes, 2002], statistical and qualitative measurement of reference service [McClure et al., 2002], and the establishment of a research agenda at the Digital Reference Research Symposium [2] demonstrate this growing priority on reflection and assessment in digital reference. And in 2002, NISO initiated the AZ standards committee for networked reference.

QuestionPoint

In 2002 the Library of Congress (LC) and OCLC began exploring ways to take the pioneering work done by LC in the Collaborative Digital Reference Service (CDRS) project to the next stage. The notion behind CDRS was that creating online networks of libraries would combine the power of local collections and staff expertise with the diversity and availability of libraries and librarians throughout the world, 24 hours a day, 7 days a week [Kresh, 2000]. The CDRS pilot project eventually involved over 260 libraries of various types in the United States, Canada, the United Kingdom, Europe, and Asia. While LC explored the global potential for networked reference services, OCLC was working on pilot projects with several U.S. library consortia to develop tools with which librarians could establish a more effective online presence within their own communities and work cooperatively within their existing institutions and consortia. In June 2002, LC and OCLC introduced QuestionPoint, a cooperative digital reference service that evolved from CDRS and operates on a subscription basis. A QuestionPoint subscription includes:

- Access to a professional community of librarians working together to develop standards, best practices, and the QuestionPoint service itself, based on their experiences and needs
- An interface that enables libraries to offer online reference services locally and to refer questions to libraries locally, regionally, or globally
- Tools to support synchronous and asynchronous digital reference, including walk-up questions, e-mail, web-based forms, and live chat, with the ability for librarians to see and talk with patrons over the Internet
- The ability to route and track the status of questions, including system views for the patron, the librarian, and the administrator
- Local and global knowledge bases that store previously asked and answered questions for later retrieval and use as a reference resource
- Usage statistics and reports to help librarians implement and maintain QuestionPoint successfully in their libraries
- Integration with other virtual reference systems that participating libraries already use
- A customizable administrative module

QuestionPoint is hosted at OCLC, requiring only a Web browser for the patron and librarian. Six months after its introduction in June 2002, over 300 libraries are using QuestionPoint, including users in Australia, Canada, China, England, Germany, the Netherlands, Norway, and Scotland.

LSSI (Library Systems & Services, L.L.C.)

LSSI, a commercial library solutions company, has worked with a number of commercial software vendors to create various products and services for libraries interested in providing digital reference. It offers both software and a Web reference center for outsourced virtual reference. Subscribers to LSSI do not install any hardware or software locally; the servers supporting the system operate on an application service provider (ASP) model, and the library operator uses a Web browser to control the remote sessions [Breeding, 2001]. LSSI has also developed a fully functional 24x7 Web Reference Center.
The center operates out of LSSI's headquarters in Germantown, Maryland, and is staffed remotely by professional librarians trained in online reference. The reference staff provides everything from backup and overflow reference support and second-level and after-hours reference services to a fully staffed 24x7 online reference service. The librarians come from a variety of backgrounds, including special, academic, and public libraries, and are trained by LSSI to meet specific library requirements.

24/7 Reference

24/7 Reference is a cooperative project of the Metropolitan Cooperative Library System that is supported by U.S. government LSTA funding (administered by the California State Library). This service offers a suite of products and services that enables libraries to offer live online reference to their patrons. The 24/7 Reference team has assessed the needs of its membership and worked with various commercial software vendors and developers to build tools that support the goals of its libraries. In addition to furnishing tools that allow libraries to offer live reference to their patrons, the project also uses the cooperative efforts of its members to provide additional staffing.

Other Signs of Growth

The ever-increasing number of resources, as well as research focused on digital reference, also signifies a maturing domain. Resources such as Bernie Sloan's Digital Reference Pages [3], the VRD-sponsored DIG_REF Listserv [4], and various cooperative communities like the Global Reference Network associated with the Library of Congress and QuestionPoint [5] provide libraries with access to a host of resources to understand, define, and discuss the nature of digital reference in librarianship.

Understanding the Landscape

In the evolving digital reference landscape, tools and functionality play a supporting role to the goals of the libraries providing digital reference. It is by understanding and focusing on patron needs and library issues, rather than simply adopting the newest technology, that libraries can look holistically at their reference offerings and build adaptable, goal-oriented systems. The following section provides some perspective and data on potential digital reference users, as well as library issues that digital reference services might consider.

Patron Needs

In understanding a library's goals for digital reference, it is critical to define the target audience and understand the context and conditions of those using a digital reference service. By considering the end user's point of view, libraries can better shape technology systems and define their own service offerings more clearly.

Where Are the Users?

According to a September 2002 survey by Nua [Nua.com, September 2002], there are over 600 million users on the Internet (see Table 1).

Table 1. How Many Online (in Millions)

  World Total      605.60
  Africa             6.31
  Asia/Pacific     187.24
  Europe           190.91
  Middle East        5.12
  Canada & USA     182.67
  Latin America     33.35

These numbers provide a quantifiable view of the potential global audience for libraries providing services through the Internet, and they carry implications for library policy, digital reference practice, and research. Policy issues range from the languages served to patron authentication for various resources, while practice and research areas include the need to understand the technology used by these users, the nature of reference services across cultures, and the models for collaboration by libraries globally to meet the needs of this shared audience.
What Technologies Are Patrons Using?

Libraries must understand that cutting-edge, state-of-the-art technology may only be able to serve a small percentage of the Internet population. Some patrons pursue technologies with higher bandwidths and higher speeds, while others rely on older technologies. Thus, the chat rooms, listserv discussions, and instant messenger conversations integral to digital reference may not always work at optimal levels for all patrons. In December 2002, 46 million Americans, or 32 percent of total Internet users in the US, connected to the Internet via broadband [Nua.com, December 10, 2002]. However, there are only 18 million residential broadband subscribers in North America, representing roughly 11% of the total Internet users in the United States [Nua.com, December 4, 2002]. Only four percent of urban households in the UK are currently connected to broadband, a figure that is expected to rise to 10 percent by the beginning of 2004. Sweden has 25.7 percent of urban households connected to broadband, Belgium 19.4 percent, Austria 16.3 percent, and the Netherlands 11.6 percent [Nua.com, December 12, 2002].

These kinds of data provide a broad perspective on potential audiences. The same technologies that currently provide reference services can help to evaluate them by creating service records like transcripts and question histories, generating concrete satisfaction data and assessment criteria, tracking referral patterns, and connecting reference professionals with previously unavailable peers. By gathering and analyzing this data, digital reference systems administrators can evaluate their services according to their patrons' needs, rather than on other industry models or software functionality. The next section presents several examples of more specific investigation into the users of individual digital reference services.

Who Are the Patrons?

Developing a technology profile of target patrons must include consideration of the operating systems, browser types and versions, access speeds, and Internet service providers (e.g., AOL) they use. Libraries can use available technology like web form surveys and web server logs to gather more specific data about their current users and form a clearer picture of the audience they serve. In a discussion of service, Cox describes "Going Out to the User" as the reference librarian moving into the patron's own environment [Cox, 1996]. In the physical world, this requires libraries to identify their target audience and understand their environment, so that the library can establish services in such a way as to meet patrons at their point of need in their own environment. Digital reference provides the same opportunity; however, the definition of the patron's environment is more specific than "the Internet." When looking at access patterns, the library web site is a good place to start; however, it does not provide a complete picture of the patron's environment. In the same way that libraries have established a presence in unconventional locations like malls and grocery stores to meet their patrons at their point of need, meeting users at their point of need on the Internet will likely involve partnerships with web locations patrons already use, like government and community sites or destination portals such as America Online, Amazon.com™, and Google. Web logs can show where users come from to reach the library's web site.
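As a minimal sketch of this kind of log analysis, the following assumes an Apache/NCSA combined-format access log; the file name and the format assumption are illustrative, not a description of any particular library's server. It tallies the hosts that refer visitors to the library's site:

```python
# Sketch: tally referring domains from an Apache combined-format access log.
# The file name and the combined-format assumption are hypothetical.
import re
from collections import Counter
from urllib.parse import urlparse

# Combined-format lines end with: ... "referrer" "user-agent"
LINE = re.compile(r'"(?P<referrer>[^"]*)" "[^"]*"$')

referrers = Counter()
with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE.search(line.strip())
        if not match:
            continue
        referrer = match.group("referrer")
        if referrer and referrer != "-":
            referrers[urlparse(referrer).netloc] += 1

# The most frequent referring hosts suggest where patrons start their search.
for host, hits in referrers.most_common(10):
    print(f"{hits:6d}  {host}")
```

A library whose top referring hosts turn out to be a portal or a community site, rather than its own campus pages, has concrete evidence of where a partnership or link placement would meet patrons at their point of need.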
The following provides some information on usage patterns of some possible patrons, and examples of how libraries can use these types of statistics and partnership opportunities to better position their digital reference services to meet patron needs.

In December 2001, OCLC commissioned Harris Interactive to conduct a blind research web survey on the information habits of college students, to help inform how academic libraries deliver relevant services to their patrons. The study [OCLC, 2002] reported that "college students have confidence in their abilities to locate information for their study assignments. Three-out-of-four agree completely that they are successful at finding the information they need for courses and assignments, and seven-in-ten say they are successful at finding what they seek most of the time. The first-choice web resources for most of their assignments are search engines (such as Google or Alta Vista®), web portals (such as MSN®, AOL or Yahoo!™), and course-specific websites." The data offer libraries concrete information on students' habits and perceptions, as well as a baseline for future study on how academic libraries meet student information needs. Using statistical trends and feedback from the participants, the study suggests methods libraries can use to connect with students and increase visibility among web resources, including promotion, instruction, access positioning, and integration with other resources like existing course materials and student information portals.

When the Smithsonian American Art Museum's "Ask Joan of Art" reference service conducted an audience analysis in the mid 1990s, it determined that many of its potential users were on America Online™ (AOL). The reference service approached AOL about providing links to its services inside the service's user interface. AOL agreed, and links to the Ask Joan of Art reference service were established within AOL's Research and Learn Channel and as a resource link within the AOL@SCHOOL web portal. Through 2002, the Ask Joan of Art reference service continued to see approximately 40% of its traffic from AOL users. The service continues to seek out partnerships to meet patrons at their points of need, and it now provides access through a number of art-related information portals, as well as through the Smithsonian American Art Museum's homepage. By investigating usage patterns, libraries can better understand how to meet patrons' needs and establish strategies for service offerings.

Library Issues

As libraries grow these patron-oriented services, they will encounter issues of workload, efficiency, interoperability, and service quality. Understanding the digital reference workflow and the role of cooperation can clarify how technology can be used to assist with workload, efficiency, and quality service assessment. Technical and quality standards also play an important role in defining systems that support library needs for interoperability, cooperation, and quality.

Digital Reference Workflow

The question of how to build technology that supports the reference workflow presupposes that we have a clear understanding of the workflow. Much of the research to date on reference work has focused on the reference interview, or the discovery aspect of the workflow, which represents only the beginning of the process. A more complete look at reference workflow includes activities like question assignment, fulfillment, routing, question management, archiving, retrieval, assessment, evaluation, and reporting, as the sketch below illustrates.
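One way to see how these activities fit together is to model a question's life cycle explicitly. The states, fields, and transitions in the following sketch are illustrative assumptions for discussion, not any vendor's actual schema:

```python
# Illustrative model of a digital reference question's life cycle.
# States and fields are assumptions for discussion, not a vendor schema.
from dataclasses import dataclass, field
from enum import Enum, auto

class State(Enum):
    NEW = auto()        # received from patron (web form, e-mail, chat)
    ASSIGNED = auto()   # given to a staff member
    ROUTED = auto()     # referred to a partner or the global network
    ANSWERED = auto()   # response sent to the patron
    ARCHIVED = auto()   # stored in the knowledge base for reuse

@dataclass
class Question:
    text: str
    state: State = State.NEW
    history: list = field(default_factory=list)

    def transition(self, new_state: State, note: str = "") -> None:
        """Record the old and new states so the question can be tracked."""
        self.history.append((self.state, new_state, note))
        self.state = new_state

q = Question("Who designed the atrium of the main library?")
q.transition(State.ASSIGNED, "local reference desk")
q.transition(State.ANSWERED, "answered from campus archives")
q.transition(State.ARCHIVED, "added to local knowledge base")
```

Keeping the history of transitions is what makes the later workflow activities (assessment, evaluation, and reporting) possible at all.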
These components underlie the issues libraries encounter in providing digital reference. A full consideration of all these workflow components is beyond the scope of this paper; however, a brief discussion follows on the issues of cooperation, quality standards, and quality assessment.

The Role of Cooperation

Bunge and Ferguson assert that librarians must establish cooperative relationships with each other and with technologists in building systems that support their core values [Bunge & Ferguson, 1997]. Historically, cooperation and collaboration have played a significant role in technological and social advancement. Libraries have recognized this in their development of shared service values and through resource sharing in areas like interlibrary loan and consortial purchasing. In 1973, Kilgour stated that "Computerized cooperation opens up untrodden avenues of research and development and by making unnecessary the imposition of uniformity on library processes, the cooperation creates hitherto unexplored opportunities for intellectual development in the profession" [Kilgour, 1973]. Thirty years later, these principles still hold true.

In looking at cooperation in digital reference and how it is supported by technology, one must begin by understanding the various types of cooperation that exist. For the purpose of this discussion, five types have been identified: internal, informal, formal, affinity, and anonymous.

Internal: Internal cooperation occurs when library staff work together to solve a problem or meet a shared need. This may seem a simplified view; however, this model often represents the most frequently encountered form of cooperation within a library. If this type of cooperation is not recognized, the areas where technology can begin to support cooperation and collaboration will be neglected. Examples of this type of cooperation in digital reference might include transferring a chat session to a more appropriate subject expert or assigning a question to the staff member with the highest likelihood of responding within a given time period.

Informal: Many times, as reference professionals work to answer patron questions, they use other resources, including contacting knowledgeable individuals, that might not be publicly available or widely known. These informal resources could range from a little-known web resource to a friend who seems to know obscure facts.

Formal: Established consortia, or groups with some form of publicly known charge, are examples of formal cooperatives. Many of these relationships have been established to share resources and expertise and to increase purchasing power and efficiency. Within digital reference, these groups might help monitor a live reference queue, staff a central reference center or service for the group, or route questions and patrons based on expertise or coverage.

Affinity: Groups formed around subject areas, or around meeting a common need, that have no other type of formal arrangement represent affinity groups. These groups might start as ad hoc affinity groups and grow to become formal cooperatives over time.

Anonymous: Finally, anonymous cooperation occurs when a librarian can forward a query or patron to another library that is automatically selected based on a set of criteria like expertise or availability. In this model, the libraries may have no previous relationship and may simply have agreed to a common set of service terms through a referral service.
Planning and developing technology without understanding these levels of cooperation can lead to a myopic view of the role technology can play in supporting collaboration in digital reference. In an example that illustrates the levels of cooperation, QuestionPoint librarians can assign questions within known cooperative groups (formal or affinity) or route them to an automated algorithm to locate a "best match" among the participating libraries. QuestionPoint usage statistics from the first six months of service have shown that libraries use each of these levels of cooperation. With over 300 libraries participating, about 15% of the questions libraries manage within their QuestionPoint accounts are routed either to the global reference network anonymously or to a formal or affinity group partner. The remaining questions are answered by the original librarian or assigned internally within the library, examples of informal, internal, and formal cooperative networks. The use of QuestionPoint's anonymous and affinity networks has steadily increased; this type of cooperative activity grew by over 60% in the final three months of 2002. Technical Standards As libraries have become more automated and digitally based, technological systems and services permeate every part of reference workflow and interactions. Along with implementing, presenting, and integrating these systems, libraries face the challenge of maintaining an appropriate perspective on the technology and a focus on assessing and providing their services to a measurable level of quality. Both of these challenges point to the need for technical standards for interoperability and service quality standards or best practices. Coffman notes that virtual reference software must be compatible with thousands, perhaps even hundreds of thousands, of other resources [Coffman, 2001]. Such system interoperability requires open system architectures or established technical standards for exchanges between software packages and services. One of the primary bodies responsible for generating these types of technical standards is NISO, the National Information Standards Organization, a non-profit association accredited by the American National Standards Institute (ANSI). NISO "identifies, develops, maintains, and publishes technical standards to manage information in our changing and ever-more digital environment. NISO standards apply both traditional and new technologies to the full range of information-related needs, including retrieval, re-purposing, storage, metadata, and preservation" [NISO, 2002]. The NISO Standards Committee AZ on Networked Reference Services is charged with the development of a question processing transaction protocol for interchange between systems. The development of practical standards requires the involvement of both the library and technology communities and a commitment by the standards body to present experimental drafts for implementation and iteration. Organizations like QuestionPoint and 24/7 Reference support standards-based development and serve on the NISO AZ standards committee to help ensure that development adheres to implemented industry standards.
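To suggest what a question processing transaction might carry, the fragment below serializes a hypothetical referral message. The field names are invented for illustration and do not reproduce the NISO committee's draft schema; the point is only that two systems agreeing on such a format can exchange questions regardless of vendor.

```python
# A sketch of the kind of payload a question-interchange protocol might
# carry between two reference services. Field names are invented for
# illustration and do not reproduce the NISO AZ draft schema.
import json

transaction = {
    "transaction_type": "question-referral",
    "question_id": "Q-2003-0042",
    "from_institution": "Library A",
    "to_institution": "Library B",
    "patron": {"language": "en", "reply_channel": "email"},
    "question_text": "Looking for circulation statistics for U.S. "
                     "public libraries, 1990-2000.",
    "deadline": "2003-02-15",
}

# Any system that can parse the agreed-upon format can accept the referral,
# regardless of which vendor's software it runs.
print(json.dumps(transaction, indent=2))
```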
As the standards develop, cooperation between organizations on the committee provides a principal test bed for the protocols, impacting both how the standards for exchange between services will work and how the standards will be implemented. Interchange between systems doesn't always require implementation of standards; open architectures or known Application Program Interfaces (APIs) can also provide means for libraries to pull together various software and services into a cohesive offering for their patrons. For example, QuestionPoint provides an open API for libraries, library portals, OPACs, and other reference services to link into QuestionPoint's functionality while maintaining the look and feel of the application. This kind of open architecture helps libraries provide a consistent experience for their patrons while providing access to core services at the patron's point of need. As digital reference moves forward, standards and open systems will become increasingly important. Examples of other applications in the digital reference arena that require attention to standards include: record retrieval from knowledge bases, patron authentication, statistics, fulfillment, document delivery, and others still to be identified. Quality Standards and Best Practices Cooperation and interchange require trust. From a technical point of view, trust means a system will perform reliably based on a set of predefined criteria, hence the role of standards and agreed-upon architectures. Trust also plays a critical role when looking at human interaction or assessment of service quality; however, technology can really only play a supporting role by gathering data based on agreed-upon measures or assessing that data with defined criteria. Collaborative digital reference services must develop best practices and shared professional standards for quality of service to establish environments where trust can be built and established. McClure, Lankes, Gross, and Choltco-Devlin propose a series of quality standards that can be used to evaluate the quality of digital reference services: 1) courtesy of library staff; 2) accuracy of answer; 3) user satisfaction with the service; 4) rate of repeat users; 5) awareness that the service exists; 6) cost per digital reference transaction; 7) completion time; and 8) accessibility. Harnessing technology to automate the collection and analysis of accepted metrics will provide a common vocabulary within librarianship about the services provided [McClure et al., 2002]. These metrics can also help educators, researchers, and service providers identify technology, education, and research areas that would benefit libraries and ultimately benefit patrons as well in meeting the common goal of quality service. As a global cooperative with hundreds of participating libraries, QuestionPoint has established a variety of guidelines, policies, and practices to ensure quality and build trust within the global reference network. For example, each library that responds to a question through QuestionPoint takes responsibility for the accuracy of its response. In addition, a board composed of QuestionPoint members and OCLC and LC staff monitors the quality of the answers provided to patrons and maintains standards to ensure the quality of the global knowledge base. Peer monitoring is also used to help ensure service quality.
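Two of the McClure et al. measures above — cost per transaction and completion time — illustrate how readily such metrics can be automated once transactions are logged. The sketch below uses fabricated log records and cost figures purely as an example of the calculation.

```python
# Automating two of the quality measures above from a transaction log.
# The log records and cost figure are fabricated for illustration.
from datetime import datetime
from statistics import mean

transactions = [
    {"received": datetime(2003, 1, 6, 9, 0), "answered": datetime(2003, 1, 7, 14, 30)},
    {"received": datetime(2003, 1, 6, 11, 15), "answered": datetime(2003, 1, 6, 16, 45)},
    {"received": datetime(2003, 1, 7, 8, 20), "answered": datetime(2003, 1, 9, 10, 0)},
]
monthly_service_cost = 4_500.00  # staff time plus software, hypothetical

# Metric 6: cost per digital reference transaction.
cost_per_transaction = monthly_service_cost / len(transactions)

# Metric 7: mean completion time, in hours.
hours = [(t["answered"] - t["received"]).total_seconds() / 3600 for t in transactions]
print(f"cost per transaction: ${cost_per_transaction:,.2f}")
print(f"mean completion time: {mean(hours):.1f} hours")
```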
Implementing a Vision: The Work of Making Good Ideas Reality Tenopir and Ennis summarized a decade of change in reference this way: "Although the exact number of reference questions seems to be declining in most libraries, the nature of the questions and the modes used to receive and answer questions have increased in variety and complexity" [Tenopir & Ennis, 2002]. In this context, libraries must reflect on how they can address this perceived decline and the growing number of modes used to provide service. The following example shows how the Library of Congress, the world's largest library, has worked to understand and address patron needs and library issues. A team of project coordinators and librarians at the Library of Congress, under the direction of Diane Kresh, Director of Public Services, manages involvement in activities like QuestionPoint and NISO and applies these collaborative principles to define and support the shared vision of electronic reference for the library's twenty-two reading rooms. To form a clear picture of the patrons they wanted to serve, the team used an informal assessment including web statistics, patron feedback, and the shared experiences of over 200 library professionals working in the reading rooms and participating in the Collaborative Digital Reference Service. Through this assessment, the team concluded that patrons were often unaware of the services available; the services were often difficult to locate; and the interfaces and experiences for the patron were inconsistent. To better understand where their patrons were and what technology they were using, the team used web server logs to identify high-traffic areas and the browser technology in use. Based on this data, links to the "Ask a Librarian" service were placed directly on the Library of Congress homepage and in the highest-traffic locations on the Library's web site and affiliated sites. All these links take patrons to access points where they can select the type of reference they need. At any time, a patron is no more than two clicks from a reference librarian. The server logs and assessment also helped inform the technology profile and decisions about how to present the service (e.g., ADA compliance) and which technology to offer, like web forms and chat. To provide a consistent interface and experience for the patron, the reading rooms worked together to define their approach to meeting patron needs as a whole service rather than a collection of various services. In early 2002, the reading rooms established a uniform approach and created a web template for queries to provide a common interface and ensure that a uniform set of information was gathered from patrons across all the reference services, regardless of where the query originated. The team encountered issues of workflow, cooperation, and standards, both technical and quality, as the over 200 reference librarians in the twenty-two reading rooms worked together to define the service. The reading rooms implemented QuestionPoint as their reference management and workflow service, which allowed them to handle question assignment, referral to other reading rooms, interaction with patrons, question management, statistical reporting, knowledge base creation, and routing to QuestionPoint's Global Reference Network as required.
With a goal of maximizing research and reference service and minimizing the number of un-served patrons, the reading rooms formed internal and formal cooperative reference networks to address question referral between the reading rooms based on expertise and load. This use of collaboration extends to the reading rooms' participation in anonymous and formal cooperation in the QuestionPoint Global Reference Network, both for question submission and answering. An ongoing commitment to standards is illustrated by the Library of Congress' leadership role on the NISO AZ standards committee and its sponsorship of and participation in various quality studies and standards-setting activities. Through the implementation of this structured, patron- and mission-driven approach, reference traffic has increased dramatically. In the Main Reading Room alone, the number of reference queries received via e-mail grew from 280 in May 2002 to nearly 1200 in October 2002. Much of this traffic continues to come from web forms, suggesting the need to better understand how libraries provide reference, where patrons access it, and what is done to support the digital reference workflow, rather than focusing principally on the modes or channels in which it occurs. The assessment and planning accomplished by the Library of Congress team have produced manageable systems, efficient processes, and support enabling this kind of increase without the need to increase staffing. Ongoing assessment, revision, and growth are the final, and perhaps the most critical, components of the Library of Congress strategy. In its commitment to providing quality reference, the Library of Congress continues to work for innovation by: participating in evolving standards; constantly re-evaluating and challenging its own practices; assessing service statistics and trends; continuing to provide and evaluate new modes of reference like chat and voice over IP; participating in projects to evaluate new technology; shaping the development and management of the QuestionPoint service and Global Reference Network community; and working to stay current with the Library's patrons by understanding and anticipating their current and future needs. Summary Lankes noted that "the core question in today's emerging digital reference field is: how can organizations build and maintain reference services that mediate between a patron's information need and a collection of information via the Internet?" [Lankes, 2000]. Examples like the Library of Congress and Ask Joan of Art point out that when libraries define their users and identify where they are and how best to serve them, the mission and goals for the service drive technological need, development, and support. Moving forward, as libraries develop their digital reference services, technology will play a critical role in their ability to effectively identify and meet patrons' needs and efficiently address service growth and quality through issues of workflow, cooperation, assessment, and interoperability. References [Bunge & Ferguson] Bunge, C. A. & Ferguson, C. D. (1997, May). The Shape of Services to Come: Values-Based Reference Service for the Largely Digital Library. College & Research Libraries, 252-265. [Breeding] Breeding, M. (2001, April). Providing Virtual Reference Service. Information Today, 43. [Coffman] Coffman, S. (2001, September). We'll Take It From Here: Further Developments We'd Like to See in Virtual Reference Software. Information Technology and Libraries, 149-50. [Cox] Cox, S. (1996).
Rethinking reference models. Retrieved December 20, 2002, from . [Gorman] Gorman, M. (2000). Our Enduring Values: Librarianship in the 21st Century. Chicago and London: American Library Association. [Janes] Janes, J. (2002). Digital Reference: Reference Librarians' Experiences and Attitudes. Journal of the American Society for Information Science and Technology, 53 (7), 549-566. [Kilgour] Kilgour, F. (1973, March). Computer-Based Systems, A New Dimension to Library Cooperation. College & Research Libraries, 34 (2), 137-143. [Kresh] Kresh, D. (2000). Collaborative Digital Reference Service. In R. D. Lankes, J. W. Collins, & A. S. Kasowitz (Eds.), Digital Reference Service in the New Millennium: Planning, Management, and Evaluation (p. 64). New York: Neal-Schuman. [Lankes] Lankes, R. D. (2000). The Foundations of Digital Reference. In R. D. Lankes, J. W. Collins, & A. S. Kasowitz (Eds.), The New Library Series: Digital Reference Service in the New Millennium: Planning, Management, and Evaluation, Number 6 (p. 3). New York: Neal-Schuman. [McClure et al.] McClure, C. R., Lankes, R. D., Gross, M. & Choltco-Devlin, B. (2002). Statistics, Measures and Quality Standards for Assessing Digital Reference Library Services: Guidelines and Procedures. Syracuse, NY: Information Institute of Syracuse, School of Information Studies, Syracuse University. 61 pp. [Milewski] Milewski, S. (2002). An Evaluation and Comparison of Popular VRD Applications. Virtual Reference Desk Conference 2002. Retrieved December 20, 2002, from . [NISO] National Information Standards Organization (2002). About NISO - National Information Standards Organization (NISO). Retrieved December 20, 2002, from . [Nua.com, September 2002] Nua.com. Nua Internet How Many Online. Nua Internet Surveys. Retrieved December 20, 2002, from . [Nua.com, December 4, 2002] Nua.com. Kinetic Strategies: Broadband adoption rise in North America. Nua Internet Surveys. Retrieved December 20, 2002, from . [Nua.com, December 5, 2002] Nua.com. eTForecasts: Global Net population on the rise. Nua Internet Surveys. Retrieved December 20, 2002, from . [Nua.com, December 10, 2002] Nua.com. comScore Media Metrix: High-speed Net users doing more online. Nua Internet Surveys. Retrieved December 20, 2002, from . [Nua.com, December 12, 2002] Nua.com. Reed Electronics Research: UK still trailing behind in broadband take-up. Nua Internet Surveys. Retrieved December 20, 2002, from . [OCLC] OCLC (2002, June). OCLC White Paper on the Information Habits of College Students: How Academic Librarians Can Influence Students' Web-Based Information Choices. Retrieved December 20, 2002, from . [Tenopir & Ennis] Tenopir, C. & Ennis, L. (2002, Spring). A Decade of Digital Reference 1991-2001. Reference & User Services Quarterly, 41 (3), 264-73. Notes [1] OCLC is a registered trademark of OCLC Online Computer Library Center, Inc. QuestionPoint is a trademark of OCLC Online Computer Library Center, Inc. [2] Digital Research Symposium, . [3] Bernie Sloan's Digital Reference Pages . [4] VRD sponsored DIG_REF Listserv . [5] Global Reference Network . Copyright © OCLC Online Computer Library Center. Used with permission.
DOI: 10.1045/february2003-penka work_5qlu74psjnhoppwj5d6j7zkg6u ---- Utrecht University Repository: the evolution of the Igitur archive – a case-study Saskia Franken Igitur, Utrecht Publishing and Archiving Services, Utrecht University Library, Utrecht, The Netherlands Bas Savenije Utrecht University Library, Utrecht, The Netherlands, and Jennifer Smith Igitur, Utrecht Publishing and Archiving Services, Utrecht University Library, Utrecht, The Netherlands Abstract Purpose – The purpose of this paper is to outline the development of the institutional repository of Utrecht University, the Igitur Archive. The Utrecht repository is unique for several reasons: it was started several years before the international repository movement began; the repository is combined with an e-publishing service, Igitur, Utrecht Publishing and Archiving Services; and the repository is firmly embedded in the structural tasks of the university library. This case-study highlights the advantages and disadvantages of this particular situation. Design/methodology/approach – In order to give an outline of the evolution of the Igitur Archive, the paper uses information from policy papers and annual reports of the Utrecht University Library and Utrecht University, from the business plan of Igitur and from the proceedings of the Dutch DARE project. The findings are organised in four sections: the start of e-publishing in Utrecht; a section about Igitur; a section detailing the lessons learned; and finally a glance at the future. Findings – The conclusion is that the position of Utrecht as an "early adopter" in the Dutch repository movement has caused some delays, but that the combination of the repository with the additional e-publishing services has proved to be very fruitful. Igitur has developed a strong position, and both the e-publishing services and the repository have a sound base for further growth. Originality/value – This paper gives useful information to other university libraries that want to start a repository and an e-publishing service or that are already developing such services. Keywords Publishing, Information control, Communications, Archives management, The Netherlands Paper type Case study 1. The start of e-publishing in Utrecht Utrecht University is one of the largest universities in The Netherlands (28,000 students, 8,000 staff members, 7,500 academic publications). Existing since 1636, the university has developed into one of Europe's largest and most prominent institutes of research and education. Utrecht University Library (UBU) is the umbrella organisation of the various libraries within the university. The UBU collection is extremely varied, in line with the wide range of courses and research projects organised by the university. It contains approximately 4.5 million books and journal volumes. In addition, the UBU provides access to an increasing amount of electronic information, including approximately 7,000 periodicals that are available in full text. (OCLC Systems & Services: International Digital Library Perspectives, Vol. 23 No. 3, 2007, pp. 269-277, ISSN 1065-075X. © Emerald Group Publishing Limited. DOI 10.1108/10650750710776404.)
The ever-increasing possibilities of ICT have caused, and continue to cause, great changes in the library's products and services. Consequently, the UBU is continually engaged in research and development. Since 1990, scientific information has become increasingly available in digital formats. Research libraries have been forced to change from institutions that collect, store, and lend scientific material on paper, to institutions that provide access to digital information that to a large extent is not owned by the library: a change from collection to connection. Libraries have also been confronted with the so-called serials crisis: due to extreme price rises of scientific journals, libraries had to cancel subscriptions, which in turn caused additional price increases. This posed a severe threat to the accessibility of scientific information (Savenije, 2003). In the USA, several initiatives to improve accessibility have arisen: SPARC (an initiative to increase competition) and HighWire (to help society publishers make their journals digitally available). The Utrecht University Board of Directors and the management of the University Library were very aware of these developments. Therefore, funds were made available to allow the library to counter these problems, which resulted in four new projects in the late 1990s. 1.1 E-journals While the library management was thinking of a concrete action in tune with SPARC and HighWire, a Dutch professor in social medicine, Dr A.J.P. Schrijvers, contacted the University Librarian, Bas Savenije. Researchers in a new interdisciplinary medical field, integrated care, found it extremely difficult to get their articles published in the existing journals, so Professor Schrijvers asked the library for support in starting a new, preferably electronic, journal. Together with Delft University of Technology, Utrecht started the Roquade project, for which a special project team was established in the library consisting of a coordinator, several project leaders and a technical developer. The mission of this project was to enhance scientific communication in the interest of the scientific community by setting up an infrastructure for organising, co-ordinating, supporting and facilitating the digital publishing process. One of the first concrete results of Roquade in Utrecht was the International Journal of Integrated Care (IJIC), whose first issue was published online in 2000 (www.ijic.org). More journals were to follow (Vetscite, Ars Disputandi) and by the end of the Roquade project in 2002, Utrecht University Library was "publishing" five electronic journals and three other publication sites. 1.2 Electronic theses At that time, e-publishing was quite new and needed to be promoted. With this in mind, another activity was initiated: the library began offering PhD students the possibility to deliver their theses electronically for online publication. As an incentive, a small remuneration was available for PhD students who wanted to publish their theses electronically. By the end of 2002, almost one third of Utrecht's theses output was online. This was a positive result; however, most of the theses still remained available only on paper. Copyright issues proved to be the most important impediment to their online publication.
1.3 Dispute and DARE As more and more scientific output became available in digital form, the University Board of Directors decided that it should become a structural task of the library to archive digital research and teaching material. In 2001, the library started a pilot project in which Utrecht scholars were asked to deliver digital versions of their publications to the library in order to preserve them and make them freely available. Almost 1000 articles were entered in a database and presented on a demo website: Dispute. In fact, this was a repository avant la lettre. When the Open Archives Initiative started in 2002, the Dispute site was made OAI compliant. The Dutch Digital Academic Repository project (DARE) (www.darenet.nl/en/page/language.view/dare.start) was also launched by the SURF foundation in 2002 (see Figure 1). DARE is a joint initiative of all Dutch universities and the National Library of the Netherlands, the Royal Netherlands Academy of Arts and Sciences (KNAW) and the Netherlands Organisation for Scientific Research (NWO). Its aim is to store and give access to the full-text results of all Dutch research publications via a network of so-called "repositories". All the institutions involved perform these tasks in a similar way but retain responsibility for and control over their own data. As a result of the Dispute project, Utrecht was able to share specific experience with the other DARE project members on a variety of matters. For example, Utrecht had already developed its own metadata scheme, which was based on Qualified Dublin Core, and Utrecht was also able to contribute ideas in terms of the general strategy of DARE and DARE communication. Figure 1. At the same time, Utrecht profited from DARE. When Utrecht decided to switch from its own platform to the DSpace archival software, we were able to make use of the knowledge within the other universities that had already implemented DSpace. DARE also provided Utrecht with an additional stimulus to further develop our repository. For example, with the DARE initiative Cream of Science (Feijen and van der Kuil, 2005), each DARE partner selected ten of their prominent scientists and made their complete publication lists, with as much full text as possible, visible and digitally available through DAREnet. As a result of this initiative, approximately 4,000 publications were added to the Utrecht repository. In addition, other DARE partners have developed new services such as subject-specific websites (SCHOLAR's economic community website, a website aimed at bringing together researchers who work in the field of education and labour economics) and CoMa (a Copyright Management tool), which will be of great value to Utrecht.
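Although the paper does not reproduce Utrecht's metadata scheme, the sketch below suggests what a Qualified-Dublin-Core-style record and the corresponding OAI-PMH harvesting request might look like. The repository URL and record values are invented; only the OAI-PMH verb and its parameters (verb, identifier, metadataPrefix) reflect the actual protocol.

```python
# A sketch of a Qualified-Dublin-Core-style record and the kind of
# OAI-PMH request a harvester such as a service provider would use to
# fetch it. The base URL and record values are hypothetical.
from urllib.parse import urlencode

record = {
    "dc:title": "Libraries without resources: towards personal collections",
    "dc:creator": ["Savenije, J.S.M.", "Grygierczyk, N.J."],
    "dcterms:issued": "2001",   # a Qualified DC refinement of dc:date
    "dc:type": "article",
    "dc:identifier": "oai:example-repository.uu.nl:12345",
}

base_url = "http://example-repository.uu.nl/oai"  # hypothetical endpoint
params = {
    "verb": "GetRecord",
    "identifier": record["dc:identifier"],
    "metadataPrefix": "oai_dc",
}
print(f"{base_url}?{urlencode(params)}")
```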
1.4 Omega In 2001, the UBU began the development and implementation of a new search and retrieval system for digital scientific information, Omega. At present, more than 10 million journal articles are available via Omega (see: http://omega.library.uu.nl/seal/omegasearch.php?lan=en). In October 2006, the library introduced MyOmega, where the client can create his own "bookshelf" to put aside his favourite digital journals and store his search queries. The development of Omega was a strategic choice of Utrecht University Library in keeping with its vision that the main function of the library of the future will no longer be to provide a collection of information in the traditional sense of the term, but to provide optimal access to that collection. The library has the task to take care of tools, facilities and the infrastructure to provide access to electronic resources, irrespective of where the information is stored. This includes: an excellent search facility, personal alerting systems, personal home page facilities and facilities for the integration of information services in the user's primary processes (Savenije and Grygierczyk, 2001). Omega is also very important for the repository, as it is a "service provider/harvester" for the Igitur Archive. As such, it is an additional access point to the electronic resources stored in the repository. 2. Igitur, Utrecht Publishing and Archiving Services In 2003, Utrecht University Library was faced with a difficult situation. The Roquade project had left the library with a number of e-journals and e-publishing was becoming a going concern in the library, yet there was no specific department to which e-publishing was assigned as a structural task. The same situation existed for the repository. The library did have a department for Innovative Projects, and both the e-publishing project and the repository were still part of this department. At the same time, the library was confronted with severe budget cuts, which resulted in a reorganization. However, despite the budget cuts, the library was able to launch a dedicated unit responsible for e-publishing and archiving services: Igitur. The primary goal of Igitur is to improve and to increase access to scientific information (see Figure 2). With Igitur, Utrecht is one of the first academic institutions in Europe to give electronic publishing a structural place within the organization. Figure 2. Igitur has two main functions: (1) To further develop the Utrecht digital repository (the Igitur Archive), which is financed out of the library's budget. (2) To support scholars in their e-publishing endeavours, for example, creating electronic journals and other types of publication websites. The e-publishing infrastructure for these projects is financed out of the library's budget, but the marginal costs for other services must be recovered. The Igitur team consists of 5.8 fte. For technical assistance, long-term preservation, administration, etc., Igitur makes use of the library organization and technical infrastructure. In the last two years, Igitur has grown considerably: there are now ten electronic journals, and the repository has grown from 2,000 to 15,000 items in December 2006 (see Figure 3). The library considers the repository part of its basic services, so users do not have to pay for their use of the repository. However, clients who wish to set up and publish an electronic journal, for example, must have their own financing. In the future, Igitur's journals must be self-supporting. 3. Lessons learned • The combination of archiving (digital repository) and publishing services (e-journals, e-books) proved very fruitful. In the first place, the same technical skills are needed and can be interchanged easily; more importantly, as both services are strongly rooted in the Open Access movement, Igitur is in the position to create awareness. • Igitur started early with the institutional archive, and this has sometimes also proven to be a disadvantage. For example, the implementation of the new archival software, DSpace, was a huge conversion project in Utrecht.
Some Dutch universities, which started later, were able to start building their repositories and the underlying policy immediately, making use of new international insights and making maximal use of the DARE network. • As a part of the library, Igitur can profit from the diverse knowledge within the library and from the existing infrastructure. For example, when new information retrieval software (Autonomy) is adopted in Omega, this can also be used for Igitur journals. Another excellent example is the experience with user statistics in the library. The library has recently begun to keep user statistics for all public services, and by means of the same software and methods, the use of the Igitur Archive can be measured. A third example is the experience concerning linking. The Utrecht University Library has purchased a system, SFX, which makes it possible to make a direct connection from a literature reference to the online collection of the library. This means that users will get a link, referred to as a UBU-link, to the full-text article, including illustrations, graphs, etc., without further search if the university has a digital subscription to the journal in question, or to the catalogue if there is a printed version of the journal. UBU-link also offers the possibility for further search in other databases and on the Internet. In the future, UBU-link will also be implemented in the repository and in the Igitur journals. • Igitur has had the opportunity to experiment with new business models. For example, as a virtual community developed spontaneously in connection with the International Journal of Integrated Care, Igitur has tried to create a structure in which the community is responsible (also financially) for the publishing of IJIC. Nevertheless, it has proven to be quite difficult to implement an adequate business model for open access journals. Clients (editorial boards) are positive with regard to open access, but are always a bit unsure when it comes to finding the money to finance an open access journal. Editorial boards which already have a subscription-based paper version are also less willing to switch over to open access. In these cases, Igitur proposes an embargo period. For example, each new issue of LIBER Quarterly, the subscription-based Journal of European Research Libraries, is made publicly available after six months. Igitur also assists small publishers that do not have e-publishing expertise in setting up electronic versions of their journals. Figure 3. • The support of the University Board of Directors, especially with regard to archiving services, has been very valuable. Utrecht strongly supported the DARE program and in 2005, Utrecht University signed the Berlin Declaration. The repository is now the largest in the Netherlands. Until now, the growth has been primarily achieved through special singular projects such as the "Cream of Science" project within DARE. Igitur is currently busy making agreements with the various faculties and research groups within the university regarding the regular delivery of their scientific output to the Igitur Archive. However, the differences in publishing culture among the various disciplines are in fact quite large. The humanities have a culture of publishing monographs. E-publishing is not very popular yet; they are primarily interested in digitalisation projects. The Law faculty is extremely cautious with copyright issues and plagiarism.
In the faculty of medicine, scientists place a very high value on publishing in the big Elsevier journals, and they primarily make use of Medline and PubMed when performing information searches, so there is less need for a repository. For the time being, Utrecht has consciously chosen not to make the delivery of publications to the repository mandatory. The conviction exists that it is too soon for a mandate and that it is better to concentrate first on building volume in the repository and to present it primarily as a service to the scholars. Our first experiences with the mandatory digital delivery of theses (since February 2006) corroborate this idea. • There are strong signs that the open access movement is strengthening: more open access journals are emerging, particularly as part of the Public Library of Science (PLoS). In 2004, Springer launched the Open Choice program, where authors are offered the possibility to have their journal articles made available with full open access in exchange for payment of a basic fee. In 2006, Elsevier announced the conversion of six of its physics journals to hybrid open access journals, with 30 more Elsevier journals in various fields to follow. The quantity and popularity of repositories is also growing. Nevertheless, Igitur has observed that most scholars still feel dependent on their publishers and are uncertain and sometimes even afraid of offending them. They consider the publisher their friend who delivers a service to them. Authors do not always realize that many publishers have relaxed their policies and are willing to cooperate, and that they can take the initiative by not transferring their rights to the publisher. Even when they are aware, giving away their copyright is not always a big deal to them, because changing the licence means too much work. This shows that there is still much work to be done in creating awareness and changing the attitude of scholars. The recent experiences with the policy changes in delivering electronic theses at Utrecht University present a prime opportunity to attract attention. Quite a few scholars have protested against the mandate, and Utrecht University now plans to organize a public debate on the issue between advocates and opponents. • Utrecht University Library has given a great deal of attention to international developments and is actively taking part in international initiatives such as SPARC and the organisation of international conferences. Igitur has benefited greatly from its international networking activities (for example, in search of new e-publishing software, we came into contact with the Public Knowledge Project in Canada, which has developed OJS) and, of course, as already mentioned, from national cooperation (DARE). 4. The future 4.1 The library as partner in science The role of the library within the university is changing rapidly: the prominence of the library's physical collection is decreasing as the importance of information reference is increasing. Thus, the policy of the Utrecht University Library is to make as much information as possible accessible to its users by electronic means. Another trend is the continuous integration of library services in the primary processes of the university: education and research. In both the teaching and learning processes, it is becoming increasingly important for students to improve their skills in accessing and using scientific information.
Similar trends can be observed for research processes. Library services (information access as well as the tools to manage that information and to support the communication between researchers) must become part of the regular workflow of researchers. This new role of the library as a partner in science also has consequences for the services provided by Igitur: • Archiving. By providing free access to the scientific output of the university in the repository, the library facilitates scientific research, particularly when non-traditional, non-text digital objects, such as primary data and multimedia objects, are also added to the other information collected in the repository. There is a growing need for repositories for educational material, such as learning objects: basic electronic building blocks for e-learning, which can be combined and reused in different courses. Igitur has concrete plans to create these facilities in the coming years. • Publishing. The library will continue to support scientists with the setup and production of electronic journals. Igitur will use the open access model as much as possible for new products so that free access to scientific information is promoted. In this way, the library and Igitur offer assistance in scientific communication. 4.2 Professionalization of Igitur Igitur began as a project and has built up its knowledge by trial and error. Over the years, Igitur has grown, and it is definitely clear that there is a need for the services Igitur offers within the academic community. Igitur is now ready to enter the next phase in its development: professionalization. Last year, Igitur developed a business plan in which the plans for the next five years are described. In order for Igitur to survive and continue to expand, professionalization is necessary. Igitur has to invest in the publishing infrastructure to make further growth possible and to stay up to speed with international technical developments. The cooperation with the Public Knowledge Project/OJS, for example, is one step in this process. If Igitur is to increase the number of open-access journals, a feasible and sustainable business model is needed. Igitur should strive for more businesslike relationships with its clients, for example, by entering into service level agreements. With regard to the repository, the most important issues in its continued evolution are the acquisition of new content and ensuring the process of depositing publications is as easy as possible for the researchers by, for example, further integration within the research process (Gibbons, 2005). More content in the repository will contribute to the status of the repository and will stimulate its use, which will ultimately convince researchers that repositories improve the visibility of their publications. By continuing to develop in this way, Igitur intends to contribute even more to the overall goal of improving and creating access to scientific information. References Feijen, M. and van der Kuil, A. (2005), "A recipe for Cream of Science: special content recruitment for Dutch institutional repositories", Ariadne, Vol. 45, available at: www.ariadne.ac.uk/issue45/vanderkuil/ Gibbons, S. (2005), "Understanding faculty to improve content recruitment for institutional repositories", D-Lib Magazine, Vol. 11 No. 1, available at: http://www.dlib.org/dlib/january05/foster/01foster.html Savenije, J.S.M. (2003), "Recent developments in commercial scientific publishing: an economic and strategic analysis", DF Revy, Vol.
26 No. 8, pp. 220-7, available at: www.library.uu.nl/staff/savenije/publicaties/scientificjournalsrev.htm Savenije, J.S.M. and Grygierczyk, N.J. (2001), "Libraries without resources: towards personal collections", Collection Building, Vol. 20 No. 1, available at: www.library.uu.nl/staff/savenije/publicaties/jerusalem.htm About the authors Saskia Franken has been managing director of Igitur, Utrecht Publishing and Archiving Services, the e-publishing department of Utrecht University Library, since January 2003. Saskia studied History at Utrecht University. After completing her studies, she worked in various functions at Utrecht University, first at the Utrecht Faculty of Arts and, since 2001, at the University Library. Saskia Franken is the corresponding author and can be contacted at: s.franken@library.uu.nl Bas Savenije has been librarian of Utrecht University Library since 1994. Prior to that, he was Head of the Strategic Planning Department and Associate Director of Budgeting and Control of Utrecht University. Bas is Chairman of the Board of SPARC Europe and has published various articles about library and information science (see www.library.uu.nl/staff/savenije). Jennifer Smith is the marketing and communications advisor for Igitur, Utrecht Publishing and Archiving Services. Jennifer received her MBA from the Rotterdam School of Management in 2002 and since that time has held several marketing and communications positions within the Utrecht University Library. work_5s23svylk5et3kclee3dy4rdm4 ---- The application of function descriptors to the development of an information system typology sometimes incorrect ordering information. Most common was the exclusion of price or publisher, although these omissions did not seem to complicate verification unduly. Misspelled author names, one incorrect author, and errors in title citation were also discovered on the order form. The misspellings were obvious, but the correct forms of the names could not have been established easily without on-line access to the full bibliographic record in each case. The one incorrect author also appeared with an incomplete title citation, and it was not until the book was actually received at the library that the OCLC record could be classified as a "find." Serious, too, was the pair of orders which seemed to differ only in the government document number. It is quite clear that verification of titles through OCLC serves to eliminate duplicate orders, improve speed of acquisitions, and ensure the correct receipt of items. In this connection it is also worth noting that OCLC's use of truncated search keys greatly facilitates the process by minimizing the effect of all mistakes, certainly including those that may have been made during the preparation of order forms. Conclusion The results of this pilot study indicate that technical information centers and special libraries can benefit substantially from participation in OCLC. Moreover, it seems clear that some problem areas are truly universal in scope, and that organizations working in such areas should be free to cooperate through computer-based networks without regard for such relatively artificial constraints as provincial, state, and even international borders. References 1. Reid, M.T.
"Effectiveness of the OCLC Data Base for Acquisitions Verification." Journal of Academic Librarianship. 2:236-237; 1977. 2. Morita, I.T.; Gapen, D.K. "A Cost Analysis of the Ohio College Library Center On-Line Shared Cataloging System in the Ohio State University Libraries." Library Resources and Technical Services. 21:286-302; 1977. 3. Meyer, R.W.; Panetta, R. "Two Shared Cataloging Data Bases: A Comparison." College & Research Libraries. 38:19-24; 1977. 4. Plotnik, A. "OCLC for You-and Me?" American Libraries. 258-267; May 1976. 5. "The Impact of OCLC." American Libraries. 268-275; May 1976. 6. Markuson, B.E. "The Ohio College Library Center System: A Study of Factors Affecting the Adaptation of Libraries to On-Line Networks." Library Technology Reports. 11-132; January 1976. 7. Martin, S.K. Library Networks, 1976-77. White Plains, NY: Knowledge Industry Publications; 1976: 36. 8. Private communication with INCOLSA Staff, February 13, 1978. Errata The Application of Function Descriptors to the Development of an Information System Typology E. Burton Swanson Graduate School of Management, University of California, Los Angeles, CA 90024 [Article appeared in the Journal of the American Society for Information Science. 28:259-267 (1977)] The word "topology" was inadvertently inserted in place of "typology" in the title of the above paper on the lead page of the article and in the table of contents for that issue. work_5zlzszk6j5bktfgkdr22ujxtka ---- E-publishing in libraries: the [Digital] preservation imperative Heather Lea Moulaison and A.J. Million iSchool, University of Missouri, Columbia, Missouri, USA Abstract Purpose – This paper aims to, through an analysis of the current literature, explore the current state of the library e-publishing community and its approach to preservation. Libraries are increasingly proposing publishing services as part of their work with their communities, and recently, there has been a pronounced interest in providing electronic publishing (e-publishing) services. The library e-publishing community, however, has not systematically studied the need for the long-term preservation of the digital content they help create. Design/methodology/approach – Through a reflective analysis of the literature, this paper explores the context and the evolution of e-publishing as a trend that aligns with public library missions; in doing so, it also explores implications for digital preservation in the context of these new services and identifies gaps in the literature. Findings – Digital preservation is an important and worthwhile activity for library e-publishers; preservation of community-based author content cannot, however, be an afterthought and should be planned from the beginning. Future study should take into consideration the needs and expectations of community-based authors. Existing digital preservation guidelines also provide a point of reference for the community and researchers. Originality/value – This paper addresses the understudied area of the importance of digital preservation to library e-publishing. In doing so, it also investigates the role of the library in supporting community-based authors when e-publishing through the library.
Keywords Public libraries, Library publishing, Digital preservation, Community-based authors, Library e-publishing Paper type Conceptual paper Introduction Libraries are increasingly proposing publishing services as part of their work with their communities. Recently, there has also been a pronounced interest in providing electronic publishing (e-publishing) services (LaRue, 2012). E-publishing has the potential to provide a valuable service to community-based authors, who, in turn, enrich the community through their work. By sponsoring e-publication services, libraries are not only being faithful to their missions but also working to provide access to valuable material that, whatever the content or approach, is an important and irreplaceable part of the cultural heritage of the community. This conceptual paper argues that the digital preservation of unique, community-based content is also part of the mission of the library due to its interest in providing democratic access to content. Yet, cultural heritage content available in digital formats is vulnerable and, for this reason, must be a focus of particular attention. The library e-publishing community has not systematically addressed its role in the long-term preservation of digital content that it helps to create. Through an analysis of the current literature, this study demonstrates the need for reflective and systematic consideration for digital preservation in library e-publishing initiatives. We begin by exploring the context and the evolution of e-publishing as a trajectory for the library; in doing so, we also identify implications of these new services and gaps in the literature. Next, we describe the necessity of digital preservation, giving examples and identifying caveats. After a discussion based on the literature, we recommend both future courses of action for libraries to take to safeguard e-published content and future study to support that work. The library and democratic access to information Libraries have a mission to provide access to information to their user communities. Since the emergence of the World Wide Web, libraries have been challenged by the high prices publishers are charging for electronic materials. In academic libraries, an "increasing portion of a library collection [...] is now comprised of licensed, not owned, electronic materials" (Fenton, 2008, p. 32). One recent way to promote equitable access to a broad cross-section of content is for libraries to become publishers in an attempt to control costs and access. The 2013 Library Publishing Toolkit (Brown, 2013) provides a collection of in-depth chapters for libraries interested in publishing. Since the turn of the millennium, university libraries have primarily expressed an interest in self-publishing; public libraries, however, were first interested in self-publishing as early as the 1970s (Perkins, 1978). Although vanity presses may have wielded the fatal blow to academic careers in the past (Savage, 2008), the broad value of self-publishing is increasingly being acknowledged. The evolution of self-publishing As an alternative to traditional publishing, self-publishing in the form of subsidy or vanity presses has been active for a number of years. Printing a run of books through vanity presses is an expensive proposition, however, and the Web now permits other, less-expensive models for self-publishing.
Author services models, using print on demand, became popular in the early 2000s (Dilevko and Dali, 2006) and have provided an avenue for self-publishers to print smaller runs of books successfully, allowing authors to "dispense with publishers in the traditional sense and become their own publishers" (Jobson, 2003, p. 20). Even more economical up-front are services like Amazon's Kindle Direct Publishing (https://kdp.amazon.com), where e-books are published directly to the Amazon Kindle Store and authors retain 70 per cent of the royalties their books earn. These initiatives yield physical volumes that can be stored, under good conditions, for generations. Products of self-publishing have been generally considered to be inferior to traditionally published works due to the low quality of the content and the niche subject matter discussed (Dawson, 2008). The self-publishing model does not include some of the elements of traditional publishing, and, for example, would not necessarily support authors in the creation or editing of their work or in its subsequent marketing or distribution. Whereas traditional publishing vets authors, selecting only those with a proven track record and whose work is recognized to be of high quality, self-publishing has a low barrier to access. Huffman (2013) concedes that self-publishing may, under some circumstances, be like blogging or other Web-based publishing; the purpose of the publication, according to him, is what makes the difference. In terms of the subject matter, concern about the value of self-published books has also been voiced:
Digital preservation is an ongoing effort, not a one-time concern. For it to be effective into the future, there needs to be advance considerations for sustainability. Based on the review of the literature, two primary sustainability-related concerns are the following: (1) the required infrastructure in the library to manage the technology in terms of human resources and technological resources; and (2) the rights of community-based authors. We explore each below in turn, and then situate the importance of digital preservation of e-publications within the mission of the library. Digital preservation is complex, and a one-size-fts-all solution is not available due to the uniqueness of each organization, community of users, staff and technology. Although best practices have been developed around some of the routine aspects of digital preservation, including the selection of fle formats, other elements remain unexplored or understudied. As library e-publishers move forward, they will need to consider issues of sustainability that will drive both human resources and technological resources. Libraries will have to consider whether, for example, they will charge a fee for e-publishing initiatives. Fees could defray future preservation costs if the digital objects are selected to be maintained on-site, or could support outsourcing the digital preservation to a third-party vendor. Tennant, R. (2011), “The scam of edited collections”, TheDigital Shift, 19 December, available at: www.thedigitalshift.com/2011/12/roy-tennant-digital-libraries/the-scam-of-edited- collections/ (accessed 14 January 2014). Westney, L.C. (2007), “Intrinsic value and the permanent record: the preservation conundrum”, OCLCSystems&Services: InternationalDigitalLibraryPerspectives, Vol. 23 No. 1, pp. 5-12. Yakel, E. (2007), “Digital curation”, OCLC Systems & Services: International Digital Library Perspectives, Vol. 23 No. 4, pp. 335-340. Further reading Lefevre, J. and Huwe, T. (2013), “Digital publishing from the library: a new core competency”, Journal ofWebLibrarianship, Vol. 7 No. 2, pp. 190-214. Mid-Continent Public Library (2012), Mid-Continent Library Woodneath Campus: Campus Master Plan, available at: www.mymcpl.org/_uploaded_resources/Woodneath_Master_ Plan_-_Finala.pdf (accessed 14 January 2014). Corresponding author Heather Lea Moulaison can be contacted at: moulaisonhe@missouri.edu http://www.thedigitalshift.com/2011/12/roy-tennant-digital-libraries/the-scam-of-edited-collections/ http://www.thedigitalshift.com/2011/12/roy-tennant-digital-libraries/the-scam-of-edited-collections/ http://www.mymcpl.org/_uploaded_resources/Woodneath_Master_Plan_-_Finala.pdf http://www.mymcpl.org/_uploaded_resources/Woodneath_Master_Plan_-_Finala.pdf mailto:moulaisonhe@missouri.edu http://www.emeraldinsight.com/action/showLinks?system=10.1108%2F10650750710831466 http://www.emeraldinsight.com/action/showLinks?system=10.1108%2F10650750710720702 http://www.emeraldinsight.com/action/showLinks?crossref=10.1080%2F19322909.2013.780519 Introduction The library and democratic access to information The evolution of self-publishing Public library as e-publisher? 
work_6abyk2b6lrdfdo6o54tqjs6pp4 ----

Building Partnerships Among Social Science Researchers, Institution-based Repositories and Domain Specific Data Archives

Ann Green and Myron Gutmann1

Please do not cite or quote without permission. This paper is currently under review for publication.

Introduction

The digital library world has recently developed and debated both the concept and the implementation of institutional digital repositories. It has spent less time discussing discipline- or domain-specific digital repositories. Such repositories have been in existence for many years in the social sciences and have generated important lessons about the long-term preservation and sharing of academic work. The goal of this paper is to compare these two kinds of repositories and to suggest ways that they can help build partnerships among themselves and with the research community. All the parties share important goals and, by working together, can advance these goals. As authors, we emphasize the role of these repositories in social science research, and we bring our own experiences and perspectives in dealing with digital preservation of research products generated by social science. We have both been involved in the leadership (Gutmann as Director, and Green as Chair of the governing Council) of the Inter-university Consortium for Political and Social Research (ICPSR)[2], a consortium of 550 institutions worldwide that serves as the long-term steward and a primary channel for sharing a vast archive of social science data. At the same time, Green has worked extensively as a digital information specialist, while Gutmann continues to be an active researcher who has everyday contact with others in the social science fields (historical population studies) where he has worked for the past thirty years.
ICPSR and other data libraries and data archives (throughout the US and worldwide) provide essential social science infrastructure, including online access to key datasets, stewardship and preservation commitments for long-term access to data, training in statistical methodology, and a range of support services for and by an international network of researchers, students, data support professionals, and data providers. Because of our experience and the complexity of trying to deal with the international situation, we focus most of our attention on digital repository development in the United States.

Acknowledgements: We are especially grateful to Cole Whiteman for helping us think through the relationships explained in the paper, and then providing the illustrations, and to Ruth Shamraj for editorial consultation. We also wish to thank Chuck Humphrey, Julie Linden, Caroline Arms, Jonathan Crabtree, Nancy McGovern, Ron Jantz, Linda Detterman, Diane Geraci, and Robin Rice, who read the paper in draft form and helped us clarify and improve it.

We begin by describing the life cycle of social science research. Next we turn to some of the key elements of institutional versus domain-specific repositories, before concluding with recommendations about ways that researchers and those operating the two different kinds of digital repositories can forge partnerships.

The Social Science Research Life Cycle

Many authors have described a life cycle through which social science research proceeds, and the simplified model we propose here does not differ materially from that of others.[3] What this model allows us to do is to identify important stages in social science research, and by doing so point out times when the researcher may benefit from interacting with either institutional repositories or domain-specific repositories. We believe that constructive relationships between researchers and both types of repositories, as well as between the repositories themselves, will have genuinely valuable results for all concerned.

Figure 1. Social Science Research Life Cycle

• Discovery and Planning. The first step of the research process is one of discovery and planning. Through a theoretical and empirical perspective, the researcher seeks ways to enhance knowledge in her or his field. This is the stage where the researcher explores the possibility of using existing data (hence linking to archived data at the other end of the life cycle), and where she or he determines whether new data must be collected in order to best answer the scientific question. If the project requires external funding, the researcher writes a proposal for funds. Given requirements at the National Science Foundation[4] and the National Institutes of Health[5] for data sharing plans, this is also the stage where the researcher enters into discussions with digital repositories about steps required to ensure proper data sharing later in the project. Costs related to archiving and special considerations for data sharing, such as informed consent and confidentiality concerns, should also be featured in this stage.

• Initial Data Collection. Once the project has been designed and funding secured, it is time for the researcher to collect data. In a survey or an experiment, new data are collected from respondents or experimental subjects. In a project that makes use of previously collected data, those files containing the data are acquired and potentially reformatted or linked to other useful data.
This is also the stage at which essential data management strategies are formulated and instituted, including decisions about documentation content and formats. At the end of this step, a clean initial dataset is in place. To ensure that this dataset is available for her own or others' future research, the researcher should secure the long-term preservation of these preliminary data. This is also the time to inform the research community about the structure and aims of the project by providing high-level metadata that can be located within domain-specific research aids. The delivery of this high-level metadata should trigger a discussion between the researcher and a domain-specific repository about the format required for effective sharing of the data.

• Final Data Preparation and Analysis. The third stage in the process occurs when the researcher is performing final verification and modification of the data, undertaking analysis, and beginning to write results. At the end of this stage, the process of data preparation is complete and a copy of the final dataset should be preserved, if only at the researcher's institutional repository.

• Publication and Sharing. The fourth stage for the researcher consists of communicating research findings. When publications appear, the sharing of data and publications with the broader research community is triggered. By the time this stage is well underway, the intellectual productivity of the project is fully demonstrated, and the researcher would normally share her data with the community by making data and full metadata discoverable and available through a domain repository.

• Long-Term Management. During the final stage the repository community has two critical goals -- first to ensure that the data and other intellectual products are exposed for use and learning by others, and second to ensure long-term preservation. As more time passes, and the ability of the original researcher to manage any part of the relationship with secondary data users diminishes, more and more of the sharing and preservation activity resides with the repositories. Once exposed for secondary use, data that have reached the final stage of the research cycle become the seeds of new projects that begin with their own discovery and planning, thus beginning the cycle anew.

Digital Repository Types

The current digital repository landscape is made up of a blend of repository types. For the purposes of this discussion about social science data and digital repositories, we have grouped repositories into these two broad categories:

• Institutional digital repositories with no specific discipline focus. By definition, these are found at academic institutions, and have goals of preserving and making available some portion of the academic work of their students, faculty, and staff.

• Discipline or domain-specific data archives. All of these institutions currently share the attribute of focusing on data preservation and sharing. These include social science data libraries maintained by a single institution, broad based social science data archives such as ICPSR or the Roper Center for Public Opinion Research, and topic-specific domain-focused data libraries and archives, such as Harvard's Henry A. Murray Research Archive or Princeton's Cultural Policy and the Arts National Data Archive (CPANDA).[6]

These repositories vary widely in their missions and roles in supporting the different stages of the research life cycle.
Clarifying those missions and goals helps to understand the potential relationships, dependencies, and overlaps among repositories. It also helps to identify gaps and anticipate needs.

Institution-Based Repositories

The majority of institution-based repositories currently planned or already in operation emphasize the collection, cataloging, and sharing of journal-type articles and monographs. Their goal is to ensure that the scientific production of their faculties and scientific staffs is collected in a single location, one that will ensure their availability and bring credit to the institution. While not always explicitly stated, another repository goal is to reduce the institutional or community-wide cost of scientific dissemination by drawing that role back from scientific publishers and keeping it closer to the producers of scientific output. Moreover, many eprint digital repositories strive to supplement or even replace the peer-reviewed scholarly journal publication process (Cervone, 2004). Long-term sustainability is an important component of the institution-based repository movement: university libraries (where most are located) have a long record of sustainably preserving and delivering research materials. Institution-based repositories have emerged from financially stable and dependable organizations.

Most repositories hope to be more than a stockpile of eprints. They seek to provide safe harbors for a more inclusive interpretation of the intellectual output of local faculty-driven research and teaching, by including pre- and post-prints, working papers, research reports, datasets, course materials, and personal image collections, among other types of content. As with eprint collections, these repositories retain institutional identity, but may choose to limit access to specific sub-communities within the institution instead of public "webwide" open access. Institution-based repository collections primarily grow through voluntary deposit, but in some cases collections are developed by selection criteria through the assistance of professional librarians, such as the work done by the Harvard Science Libraries.[7] Domain-specific collections may be developed within the overall repository structure, but support services for data processing, metadata production, or analysis are not usually offered as part of the repository service. Where disciplinary focus exists, it is more likely that specialized units within institutional libraries are able to work with and support faculty in their areas of interest.

Whatever the collection policy they develop, institutional repositories attempt to limit the amount of effort that repository managers have to spend to acquire any single item by developing automated or nearly-automated deposit processes. Services are frequently quite basic and user-friendly: authors can put (place content in the repository), users can search and get, all within a framework of access control defined either by the author, the repository, or both. Intellectual property rights and copyright clearances are part of the deposit process, often negotiated at the level of the institution, department, or individual. With these procedures, faculty who contribute to the institutional repository upload content and define the high-level metadata describing their submission.
Little detailed metadata are called for, and validation of metadata and content is restricted to checking file formats and ensuring that all required high-level metadata fields have been provided. Knowing the author, affiliation, and title of the work, and employing a file format (ASCII text, Microsoft Word, Adobe PDF, etc.) that complies with policies adopted for the institutional repository, are often all that is required. Given this approach and their emphasis on capturing final or near-final forms of scholarly productivity, these repositories position themselves at or near the end of the scientific research life cycle. Their goal is less to partner with researchers or with domain-specific repositories throughout the research life cycle than it is to garner the value of the institution's productivity, to gather this productivity, and possibly to lower the local or community-wide cost of scholarly publications. The success of institutional repositories, when measured by the size and use of the collection, depends upon faculty participation and their willingness and ability to contribute digital materials into open access environments. Based upon data from a JISC-sponsored survey of authors, Stevan Harnad (2006) reports that unless deposits in repositories are mandated by institutions and funding agencies, submission rates will remain low.

Domain-Specific Digital Repositories

Domain-specific repositories share important goals with institution-based repositories, including the objective to provide access to research materials. At the same time, there are significant differences between the two types of digital repositories. Rather than focusing on publication-related materials from multiple subject areas within a single organization, domain-specific digital repositories hold collections of materials grouped by type, subject, or purpose and intrinsically support domain- or discipline-oriented research needs. Domain-specific digital repositories in the social sciences have a history of providing infrastructure for data sharing and strive to provide support throughout the data life cycle. These data archives hold the raw materials that faculty and students can reuse, repurpose, analyze, and recompile in teaching, learning, and research environments. Part of an international network of data archives and libraries, they share a mission to acquire and manage social science data (both quantitative and qualitative) and provide support for data users. The 2005 report from the U.S. National Science Board on Long-Lived Data Collections (Recommendation 3, p. 6) emphasizes the community proxy role that domain-specific repositories play in contrast to the heterogeneous roles of institutional repositories. Domain-specific repositories act and speak on behalf of their designated communities. As part of their key missions, they seek to know what the community wants and expects in terms of content, format, delivery options, support, and training. Domain-based digital repositories in the social sciences share with institution-based repositories the goal of contributing to a common infrastructure that seeks to blur boundaries that divide types of research sources, institutions, researchers, disciplinary domains, geographic borders, and funding constraints. This is in line with how faculty researchers think of their academic community orientation.
As Cliff Lynch (2003) has written, rather than having strictly an institutional orientation, faculty "often don't stay at a single institution for their entire career, and they frequently disregard institutional boundaries when collaborating with other scholars. Federation of institutional repositories may also subsume the development of arrangements that recognize and facilitate faculty mobility and cross-institutional collaborations."

Recently, the boundary has blurred between the domain-specific social science archive as data repository and other institutional repositories of published reports about data and research results. For many years, important single-source data disseminators such as the General Social Survey, the Panel Study on Income Dynamics, the Health and Retirement Survey, and the University of Minnesota's Integrated Public Use Microdata Series of Census data have maintained their own bibliographies of publications related to their data. Since the early 2000s, ICPSR has systematically compiled bibliographies of books, chapters, articles, and other publications that make use of the items in its collection; that bibliography now numbers over 40,000 items, with many of them available as links to full-text publications. This fuller integration of resources in the domain repositories has increasingly positioned them as research aids that are among the most commonly used by scholars. Combined with more general tools like Google Scholar, these combinations open up new seamless points of access for researchers.

The social science domain repositories have also moved toward partnership across archives, leading to new opportunities for improvements in research efficiency. Beginning in 2004, and as part of the Library of Congress's National Digital Information Infrastructure Preservation Program (NDIIPP), a group of the largest social science repositories came together to form the Data Preservation Alliance for the Social Sciences. This Alliance includes ICPSR, the Roper Center, the Odum Institute for Research in Social Science, the Murray Research Archive, the Harvard-MIT Data Center, and the US National Archives and Records Administration.[8] These collections contain data that are widely used by researchers around the world, connected both by the emerging DataPASS infrastructure and by well-used guides to data such as that prepared by the University of California, San Diego.[9] This parallels the European situation, where the Council of European Social Science Data Archives[10] maintains a common catalog and policies for cross-border data sharing.

Resource discovery in the social sciences now extends far beyond consulting a stand-alone research aid or search tool. We now see alliances across repositories and the emergence of a new set of tools allowing researchers to do complex and innovative searches to locate and explore data. The possibility of cross-database and cross-site searching has been made much easier by the emergence of a standard XML-based markup language for social science metadata called the Data Documentation Initiative, or DDI,[11] now in its third version. The existence of common metadata markup has enhanced the development of software tools for exploratory data search and analysis, including Berkeley's Survey Data and Analysis,[12] Harvard's Virtual Data Center,[13] which will drive the U.S. DataPASS partnership's common catalog, and the European Nesstar,[14] which drives the European common catalog.
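To give a flavor of what DDI markup involves, the short Python sketch below assembles a skeletal DDI-style codebook. The element names loosely follow the DDI Codebook vocabulary (codeBook, stdyDscr, dataDscr, var), but the study title, author, and variables are invented for illustration, and a conformant instance would declare the DDI namespace and validate against the published schema.

import xml.etree.ElementTree as ET

def make_codebook(title, author, variables):
    # Build a minimal DDI-style codebook: a study citation plus
    # one <var> element per (name, label) pair.
    codebook = ET.Element("codeBook")
    stdy = ET.SubElement(codebook, "stdyDscr")
    citation = ET.SubElement(stdy, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title
    rsp = ET.SubElement(citation, "rspStmt")
    ET.SubElement(rsp, "AuthEnty").text = author
    data_dscr = ET.SubElement(codebook, "dataDscr")
    for name, label in variables:
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
    return ET.tostring(codebook, encoding="unicode")

# Hypothetical study and variables, for illustration only.
print(make_codebook(
    "County Health Survey, 1999",
    "Example Research Group",
    [("age", "Age of respondent in years"),
     ("inc", "Total household income")]))

Tools such as SDA, the Virtual Data Center, and Nesstar operate over structured study descriptions of broadly this kind, which is what makes cross-site search and online analysis possible.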
These tools make it possible for a student or professional researcher to find data, examine them, and do varying kinds of analysis without having to download the data and load them into a software package like SPSS, Stata, or SAS.

Unlike social science digital repositories, most institution-based repositories are not oriented to deal with research data, even though data files can be deposited in them. Some have been set up to accept and actively pursue diverse collections, but are not charged with providing discipline-focused environments or statistical services for preparing or using the data (Lagoze et al., 2005). The practice of "self-archiving" at many institutional repositories has the potential to pose difficulties for effective long-term sharing and preservation of social science research data (Humphrey, 2005). In many cases, the depositing mechanisms have been made so user friendly and generic that they are inadequate for the demands of preparing and processing data for secondary use. Producing adequate data documentation demands significant amounts of time and labor that most researchers do not have. While work is under way to develop standards for archive-ready datasets, and while ICPSR and some of the other large data archives have publications and services designed to assist researchers in data preparation, in most cases dissemination-ready data are created through partnerships among data producers, data analysts, data archivists, and technicians. Moreover, these processes take place over several stages in the research life cycle, and not merely at the end.

Institutional repositories are also limited in their ability to preserve and manage social science data over time. The level of long-term support for different kinds of content is an important issue for potential depositors and users of social science data. The forty-plus year experience of social science data archiving reveals that, while some core formats for datasets have persisted over time, many formats have not (Green, Dionne, Dennis 1999). Even where formats have persisted, standards of metadata preparation and file organization have changed dramatically. As part of its commitment to keep data formats current, ICPSR has revised and reissued files from 250 older studies in the past two years alone; this is more than half the number of new studies it acquired during this period.
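One way to picture the format-management burden just described is as a triage pass over a repository's holdings. The Python sketch below is a minimal illustration: the tier assignments and file extensions are invented for the example, and a working archive would maintain its own format registry and policies (compare the Duke and MIT tier schemes discussed next).

import os

# Illustrative tiers only; a real repository would keep its own registry.
FULL_SUPPORT = {".csv", ".txt", ".xml"}   # preserve content and migrate forward
BIT_LEVEL    = {".sav", ".dta", ".por"}   # preserve the bits; future migration uncertain

def triage(filenames):
    # Sort dataset files into preservation tiers by file extension.
    plan = {"full": [], "bit-level": [], "review": []}
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()
        if ext in FULL_SUPPORT:
            plan["full"].append(name)
        elif ext in BIT_LEVEL:
            plan["bit-level"].append(name)
        else:
            plan["review"].append(name)   # flag for curatorial review
    return plan

print(triage(["survey1999.sav", "codebook.xml", "rawdata.bin"]))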
For now, repositories are more interested in collecting materials near the end of the research life cycle, employing an acquisition process that is simple and a preservation process that is not designed to support complex content. These limitations may decrease over time as the institution-based repository movement matures, but now is the time to begin developing successful partnerships with domain-specific depositories. Partnerships among social science researchers, institutional repositories, and social science data archives The complex landscape of institution-based repositories and domain-specific digital repositories can lead to lost data, missed opportunities, and competition for scarce resources. This network of repositories is confusing to most social science researchers and often can only be navigated by the most experienced professionals. We need to clarify intersecting nodes of interest, activity, and mission. Different and shared roles need to be developed to build partnerships between repositories that will lead to consistent support of data throughout the research process. We are not alone in making these assertions. The NSB report on Long-lived Data Collections (2005, p 18-21) urges all those involved in the world of data to “act collectively to pursue some of the higher level goals important to the entire field.” The 2006 report by the National Science Foundation’s Cyberinfrastructure Council (p. 20) also encourages “the establishment of strong, reciprocal, international, interagency and public-private partnerships’ to ensure the stewardship of valuable data assets. A series of questions arise: How can data archives, social science researchers, and institutional repositories best work together to improve the digital landscape supporting social science research? What tools and standards, policies, support, hand-offs are needed from one role to the next? What is being passed along from researcher to repository to archive, and what tools are needed to enhance those activities and improve data quality and access? What specific partnership arrangements could help channel academic resources into long-term access and preservation environments? In short, we advocate the creation of partnerships that support digital life-cycle management, policies, tools, and use of best practices and standards, and that include all 8 DRAFT 7/20/2006 potential stakeholders. We propose a layered approach with ongoing conversations between researchers, institutional repositories, and the domain-specific data archives. Partnership roles and activities during the research life cycle Social science research programs in the U.S. that produce digital output rarely receive funding for sharing or preserving data, descriptive research materials, or laboratory results, which are the “raw materials” of research projects. Most institutional repositories do not offer social science metadata production or management tools, nor do they have the resources to address issues of confidentiality or assist with preparing datasets that need to be anonymized and prepared for use by researchers outside the initial data production and research team. On the other hand, domain repositories can be physically remote from the research enterprise, and their standards and practices may be difficult for the individual researcher to learn about. 
The proximity of institutional repositories to principal investigators could put partners on the ground to provide essential help in moving social science data into safe repository environments. Partnerships among digital repositories will establish communication flows to make domain-specific support and expertise available during the full data life cycle. Early partnerships that match the skills and knowledge of the data producer (data production) with those of the repository (data life cycle management expertise and long-term curation planning) can have significant impacts: efforts made in the data production stages will reap long-term benefits in the publishing, reuse, and archiving stages. Informed selection of file formats and metadata standards at the creation of digital resources can increase both short- and long-term benefits. It is necessary to provide tools and processes that make best practices attractive and cost effective at the design and production phases of the data life cycle.

The next section examines how such partnerships might operate during the stages of the research life cycle. The figures for each section illustrate potential 'partnering moments' among stakeholders. The Researcher (or research group) is shown as part of the larger research community. Solid lines designate original flows of information; dashed lines designate secondary flows of information.

Discovery and Planning

Initial conversations regarding the agenda, outcomes, and support requirements of data intensive research should take place at the earliest stages of a research project. Information from those conversations can then be passed from principal investigators to the repository infrastructure layers. Research descriptions will serve to alert the research community about the project and also put a placeholder for the emerging research output in institution-based and domain-specific information systems. (See Figure 2.) We suggest that institutional repositories might explore the possibility of capturing contextual metadata (the documents and core materials for research projects) at this very early stage in the life cycle.

Figure 2. Research Descriptions are passed from Researcher to Repositories and on to the Research Community

Even at these early stages, research projects can benefit from conversations with repository experts about intellectual property issues, long-term digital preservation planning, access controls, confidentiality considerations, file format options, and metadata standards.[18] Grant proposals and subsequent fiscal planning ought to include resources in their budgets to cover the costs of data preparation and archiving. Articulating those requirements and strategies can trigger connections among researchers and repositories and demonstrate the utility of channels of support for principal investigators. (See Figure 3.) Researchers need to know that support, standards, advice, and tools are available for content management and metadata production at this first stage of the research life cycle.

Figure 3. Existing data are made available to Researchers from the Research Community and from Digital Repositories. Procedures for preserving and sharing data are passed from the Domain Repository to the Institutional Repository and Researcher.
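The research descriptions that Figures 2 and 3 show moving from researcher to repositories can be made concrete with a small sketch. The Python example below is only an illustration: the field names, the required-field list, and the project details are invented, not drawn from any published standard, but the completeness check mirrors the light-weight validation institutional repositories typically perform at deposit time.

# A minimal high-level research description of the kind a repository
# might capture at the planning stage. Field names are illustrative.
REQUIRED = ("title", "investigator", "institution", "abstract", "expected_formats")

description = {
    "title": "Example Panel Study of Rural Households",   # hypothetical project
    "investigator": "A. Researcher",
    "institution": "Example University",
    "abstract": "Five-wave panel survey of household economics.",
    "expected_formats": ["csv", "xml"],
}

missing = [field for field in REQUIRED if not description.get(field)]
if missing:
    raise ValueError("description incomplete, missing: %s" % missing)
print("description accepted; expose to the domain repository for discovery")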
Initial Data Collection

During the initial data collection and processing phase of the research cycle, researchers may choose to make a preliminary transfer of data and documentation into institution-based repositories for safe keeping and early data sharing among collaborators. Local institution-based repositories could function as research workspaces with backup services, data management, and processing support. High-level metadata (descriptions of the project goals, funding, methodologies, etc.) can be produced and passed from researchers to local repositories when initial data collection efforts get underway. Also, institutional repositories can expose information about the research project and its data collection activity to a federated repositories structure in standard formats, for example OAI-PMH[19] (a harvesting sketch appears at the end of this section).

Figure 4. Initial data and high-level metadata are passed from the Researcher to the Institutional Repository and Research Community. Initial high-level metadata are passed from the Researcher to the Institutional Repository, and on to the Domain Repository.

The transfers of data from researchers to local repositories make it possible to trigger a local discussion about data processing requirements and what the research team needs to know in order to deposit data in institutional repositories, with the later possibility of passing archive-ready versions of data to domain-specific repositories. These discussions could include relevant information about processing standards and archive-ready submission requirements. Domain-specific repositories could design and provide local repositories with guidelines and strategies regarding an initial data submission package and templates for "archive-ready" dataset production.[20] This is also a moment at which work on confidentiality issues should begin, with the support of domain-specific repositories.

Final Data Preparation and Analysis

As a research project moves into the final stages of data preparation and analysis, it is critical to provide tools and support for long-term sharing and archiving. This is the moment that researchers would benefit from mechanisms that:

1) Inform the social science community about data sharing opportunities and assistance with preparing and releasing an announcement of the characteristics of the initial dataset and metadata (instruments, methodologies, working papers, etc). This information could be pushed from institutional repositories into domain-specific repositories for integration into the domain-specific knowledge base.

2) Provide researchers with guidelines for data processing and metadata production, confidentiality review, and other requirements for later stages of metadata exposure and data sharing. (See Figure 5.)

Tools and services could become part of the support system to which institution-based repositories refer their constituents, thus providing a link between the developing domain-based services and on-the-ground researchers. According to a study at the University of California at Santa Barbara, faculty consistently reported having difficulties in managing research information. Since tools are regarded as extremely task specific and not terribly relevant outside a specific research field, faculty stressed the importance of getting the right tool to the right users at the right time (Pritchard, Carver, Anand, 2004).

Figure 5. Detailed procedures and assistance are provided by the Domain Repository to Researchers and Institutional Repositories.
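Before turning to the publication stage, the OAI-PMH hand-off mentioned under Initial Data Collection can be sketched concretely. The Python fragment below harvests Dublin Core titles from a hypothetical repository endpoint using only the standard library; the URL is a placeholder, and error handling and resumption-token paging are omitted for brevity.

# Minimal OAI-PMH harvest of Dublin Core titles from a repository.
from urllib.request import urlopen
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC  = "{http://purl.org/dc/elements/1.1/}"

# Placeholder endpoint; any OAI-PMH-compliant repository could stand in.
url = ("https://repository.example.edu/oai"
       "?verb=ListRecords&metadataPrefix=oai_dc")

tree = ET.parse(urlopen(url))
for record in tree.iter(OAI + "record"):
    title = record.find(".//" + DC + "title")
    if title is not None:
        print(title.text)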
Publication and Sharing

At the publication and sharing stage, the results of the research project have reached the point of full-scale sharing of "archive-ready" datasets. (See Figure 6.) Confidentiality review, metadata production, and integration into a domain-specific digital repository are fully under way. Of course, not all research projects will contribute their digital output beyond publications to the social science commons, and not all datasets are destined to be archived in perpetuity. It is clear, however, that in order to increase the participation rates by researchers and to fulfill the mandates of some funding agencies, new partnerships, support, and channels of communication can dramatically influence the processing of research data. Since these efforts can be labor-intensive and expensive, we propose that support, tools, expertise, and the use of standards be employed during the entire research life cycle so that the process of archiving data for reuse and long-term accessibility will be greatly enhanced.

Making archiving choices responsive to particular research projects is key to data sharing and preservation success. For example, ICPSR is developing flexible archiving options for active data collection projects in which ongoing data management and dissemination are important components of the research activity that make the transfer of data to ICPSR not immediately appropriate. They provide producers with five primary alternatives to the traditional archive model:

• Data preservation only
• Data preservation with delayed dissemination
• Restricted-use data
• Enclave release of data
• "Virtual" data archiving

This "virtual" data archiving option "offers [data] producers a selection of virtual options currently in use at ICPSR that can improve the visibility and usability of producer-disseminated data to users. The package of services includes union catalog listing, full-text linked bibliography, user registration and monitoring, and linked Web access. Thus, a data producer [or institution-based digital repository] can retain control of data dissemination but DSDR will provide access through the ICPSR search interface. Transitional Web pages are being developed that will standardize the linkage between ICPSR and the producers' data dissemination Web pages." (LeClere, 2006)
This includes research and development of proprietary file format migration issues, managing access changes over time, creating and maintaining bibliographic linkages from publications to datasets, and succession planning at institution levels. 15 DRAFT 7/20/2006 Figure 7. Access to and stewardship of data and metadata over the long-term are commitments made by the Domain Repository. Ongoing contributions of research and development of domain- specific tools and standards for creation and preservation are made available by the Domain Repository to Institutional Repositories and the Research Community. Conclusions and Recommendations We have asserted here that the next step in the evolution of digital repository strategies should be an explicit development of partnerships between researchers, institutional repositories, and domain-specific repositories. In order to articulate these ideas as clearly as possible, we have emphasized the scientific domain we know best, the quantitative social sciences. Many of its key features are transferable to other domains. By assigning these roles to these two types of repositories, and by understanding how they work within the research community, we believe that it will be much easier to succeed at the sharing and preserving of all forms of digital research materials. Our key message is that by visualizing the role of repositories explicitly in the life cycle of the social science research enterprise, the pathways to collaboration will be clear. These workings can be seen as a sequence of reciprocal information flows between parties to the process, triggers that signal that one party or another has a task to perform, and hand-offs of information from one party to another that take place at crucial moments. Providing an illustration of one such partnership, we show that when data 16 DRAFT 7/20/2006 collection is completed and a preliminary version of data is ready for transfer to a repository, high-level metadata are prepared and move from the researcher to the institutional repository (with the data), and then on to the domain repository (on their own). This transfer of metadata to the community becomes a signal that an important research project is underway, and it triggers a discussion between the researcher and the domain repository about the format in which the eventual, final metadata and data deposit should be made. This approach envisions both cooperation and specialization. The researcher produces the scientific product, both data and publications; the institutional repository has specialized knowledge of campus conditions and the opportunity to interact frequently with the researcher; and the domain-specific repository has specialized knowledge of data management approaches to data in a specific scientific field, for example, domain- specific metadata standards (the DDI in the case of the social sciences), as well as the ability to expose the research products to the field in a way that will have the greatest impact. Put another way, the researcher is the essential element that sets the whole process in action, with the domain repository facilitating all the elements of data-oriented scientific collaboration, and the institution-based repository facilitating transactions between the researcher and the domain-specific repository while gathering digital research outputs deemed of local value. 17 DRAFT 7/20/2006 References Burnhill, P., Geraci, D. and Rice, R. (2005), The Social Science of Data Sharing: Distilling Past Efforts. 
Available http://www.ukoln.ac.uk/events/pv- 2005/posters/burnhill-geraci-rice.pdf Cervone, H.F. (2004), “The Repository Adventure”, Library Journal, Vol 129 No10. Available http://www.libraryjournal.com/article/CA421033.html Crow, R. (2002), The Case for Institutional Repositories: A SPARC Position Paper. The Scholarly Publishing & Academic Resources Coalition. Available http://www.arl.org/sparc/IR/ir.html. DDI Alliance. (2004), DDI Version 3 Conceptual Model. Available http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf. Economic and Social Data Services (2005), Guide to Good Practice: Data Management. Available http://www.esds.ac.uk/support/guides/A4.pdf Economic and Social Data Services (2005), Guide to Good Practice: Micro Data Handling and Security. Available http://www.esds.ac.uk/news/microDataHandlingandSecurity.pdf Green, A. (2005), Review of Digital Repositories. Report to the Integrated Access Council, Yale University Library. Available http://www.library.yale.edu/iac/documents/DR_Review_final_27Sept05.pdf Green, A., Dionne, J. and Dennis, M. (1999), Preserving the Whole: A Two-Track Approach to Rescuing Social Science Data and Metadata. Washington: Council on Library and Information Resources. (CLIR Publications 83). Available http://www.clir.org/pubs/reports/pub83/contents.html Gutmann, M., Schurer, K., Donakowski, D. and Beedham, H. (2004), “The Selection, Appraisal and Retention of Social Science Data.” CODATA Data Science Journal 3, pp. 209-221. Available http://www.jstage.jst.go.jp/article/dsj/3/0/209/_pdf Harnad, S. (2006), Generic Rationale for University Open Access Self-Archiving Policy. Available http://eprints.ecs.soton.ac.uk/12078/01/genericSApolicy-linked.html Heery, R. and Powell, A (2006), Digital Repositories Roadmap: Looking Forward. Joint Information Systems Committee (JISC), Digital Repositories Programme. Available www.jisc.ac.uk/uploaded_documents/rep-roadmap-v15.doc Humphrey, C. (2006), e-Science and the Life Cycle of Research. Available http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc 18 DRAFT 7/20/2006 Humphrey, C. (2005), “The Preservation of Research Data in a Postmodern Culture.” IASSIST Quarterly, Vol 29 No 1, pp. 24-25. Available http://iassistdata.org/publications/iq/iq29/iqvol291humphrey.pdf Humphrey, C.K., Estabrooks, C.A., Norris, J.R., Smith, J.E. & Hesketh, K.L. (2000), “Archivist on board: Contributions to the research team.” Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, Vol 1 No 3. Available http://www.qualitative-research.net/fqs-texte/3-00/3-00humphreyetal-e.htm Inter-university Consortium for Political and Social Research (2005), Guide to Social Science Data Preparation and Archiving. Third Edition. Ann Arbor: Inter-university Consortium for Political and Social Research. Available http://www.icpsr.umich.edu/access/dataprep.pdf Jacobs, J., and Humphrey, C. (2004), “Preserving Research Data.” Communications of the ACM, Vol 47 No 9, pp. 27-29. Johnson, R.K. (2002), “Institutional Repositories: Partnering with Faculty,” DLib Magazine, Vol 8 No 11. Available http://www.dlib.org/dlib/november02/johnson/11johnson.html Joint Information Systems Committee (JISC), Digital Repositories Programme. (2005), Digital Repositories Review. Available http://www.jisc.ac.uk/uploaded_documents/digital-repositories-review-2005.pdf Digital Repositories Review: Annex 1: Focus Group Report. Available http://www.jisc.ac.uk/uploaded_documents/rep-review-Annex1-fg.pdf. 
Digital Repositories Review: Annex 2: JISC Digital Repositories Review Software Developers Survey. Available http://www.jisc.ac.uk/uploaded_documents/rep-review- Annex2-software.pdf. Digital Repositories Review: Annex 3: Repository Issues…from a Teaching and Learning Perspective. Available http://www.jisc.ac.uk/uploaded_documents/repos_issues_cetis_feb05.pdf Lagoze, C., Krafft, D.B., Payette, S. and Jesuroga, S. (2005), “What Is a Digital Library Anymore, Anyway? Beyond Search and Access in the NSDL.” DLib Magazine, Vol 11 No 11. Available http://www.dlib.org/dlib/november05/lagoze/11lagoze.html LeClere, F. B. (2006), “Data Sharing for Demographic Research at ICPSR.” ICPSR Bulletin, Vol 26 No 2, pp. 3-8. Available http://www.icpsr.umich.edu/org/publications/bulletin/2006-Q1.pdf Lynch, C. (2003), “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age.” ARL Bimonthly Report 226, pp. 1-7. Available http://www.arl.org/newsltr/226/ir.html. 19 DRAFT 7/20/2006 Lyon, L. (2003), “eBank UK: Building the links between research data, scholarly communication and learning”, Ariadne Issue 36. Available http://www.ariadne.ac.uk/issue36/lyon/intro.html Online Computer Library Center (OCLC) (2004), Research and Learning Landscape: Institutional Repositories, Scholarly Communication and Open Access, The 2003 OCLC Environmental Scan: Pattern Recognition: A Report to the OCLC Membership. Dublin, OH. Available http://www.oclc.org/reports/escan/research/repositories.htm Open Society Institute (2004), A Guide to Institutional Repository Software v 3.0. New York: Open Society Institute. Available http://www.soros.org/openaccess/software/. Pritchard, S. M., Carver, L. and Anand S. (2004), Collaboration for Knowledge Management and Campus Informatics. Available http://www.library.ucsb.edu/informatics/informatics/documents/UCSB_Campus_Informa tics_Project_Report.pdf Royal Statistical Society and the UK Data Archive. (2002), Preserving and Sharing Statistical Material. Colchester: UK Data Archive. Available http://www.data- archive.ac.uk/news/publications/PreservingSharing.pdf Rusbridge, C. (2005), Information Life Cycle and Curation. Available www.dcc.ac.uk/docs/dcc-life-cycle.ppt Swan, A. (2005), Open access self-archiving: An Introduction. JISC/ Key Perspectives Technical Report. Available http://eprints.ecs.soton.ac.uk/11006 United States. National Science Board. (2005), Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century. Pre-publication draft. Arlington, VA: National Science Foundation. Available http://www.nsf.gov/nsb/documents/2005/LLDDC_report.pdf. United States. National Science Foundation (2006), NSF’s Cyberinfrastructure Vision for 21st Century Discovery. Version 5.0. Available http://www.nsf.gov/od/oci/ci_v5.pdf 1 Ann Green is a consultant with Digital Life Cycle Research & Consulting, New Haven, CT (www.dlifecycle.net). Myron Gutmann is the Director of the Inter-university Consortium for Political and Social Research (ICPSR) and Professor of History at the University of Michigan. 2 http://www.icpsr.umich.edu 3 Other digital life cycle diagrams of interest can be found in ICPSR (2005), DDI Alliance (2004), UKDA/ESDS (2005, p. 8), Humphrey, C. (2006), Rusbridge, C. (2005), and Lyon, L. (2003). 4 http://www.nsf.gov/pubs/2001/gc101/gc101rev1.pdf, article 36. 
5 http://grants2.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
6 http://www.ropercenter.uconn.edu/, http://www.hmdc.harvard.edu/jsp/supportedgroups.jsp?id=6, http://www.cpanda.org/
7 http://library.physics.harvard.edu/dspace/index.jsp
8 http://www.icpsr.umich.edu/DataPASS, http://www.irss.unc.edu/odum/, http://www.hmdc.harvard.edu/jsp/index.jsp, http://www.archives.gov/
9 http://odwin.ucsd.edu/idata/
10 http://www.cessda.org
11 http://www.ddi.org
12 http://sda.berkeley.edu
13 http://www.thedata.org
14 http://www.nesstar.com
15 Two projects presented at the 2005 FEDORA User's Conference are investigating the use of FEDORA for preservation management. See: "Researching FEDORA's Ability to Serve as a Preservation System for Electronic University Records" – Dockins, R., Glick, K., Wilczek, E., and "Digital Preservation Using the FEDORA Framework" – Jantz, R. http://www.scc.rutgers.edu/fedora_conf_2005/program.html
16 http://www.lib.duke.edu/its/diglib/digarchive/custodianship.html
17 http://libraries.mit.edu/dspace-mit/build/policies/format.html
18 See Preserving and Sharing Statistical Material (Royal Statistical Society and the UK Data Archive) as an example of a document produced by a data archive to promote the need for preserving and disseminating statistical research output.
19 http://www.openarchives.org/
20 ICPSR's Guide to Social Science Data Preparation and Archiving and the ESDS publication Guide to Good Practice: Data Management provide guidance to depositors, and potential depositors, of data to ICPSR and to ESDS.
21 http://ahds.ac.uk/about/projects/sherpa-dp/
22 http://ahds.ac.uk/

work_6e6yng2mfbgg5c6miiaythmg4i ----

Research Article

Viewing the Field: A Literature Review and Survey of the Use of U.S. MARC AMC in U.S. Academic Archives

LYN M. MARTIN

Abstract: U.S. MARC AMC (MAchine-Readable Cataloging for Archives and Manuscript Control) has "come of age," taking its place in the mainstream of both archival and cataloging thinking, theory, and practice. The meteoric rise in the use of MARC AMC is evident in the statistics reported by the bibliographic utilities. The literature of MARC AMC, although extensive, has not been reviewed since 1989 and does not systematically document the use of the format in U.S. academic archives. This paper presents a review of that literature and reports the results of a 1992 survey of 200 archivists, representing 200 academic archives in the United States. These respondents were randomly selected from the Society of American Archivists' 1991 Directory of Individual Members; they cooperated in a survey examining the use of MARC AMC for cataloging archival and manuscript collections. This paper profiles the institutional use of MARC AMC, including the choice of a cataloging standard, such as Steven Henson's Archives, Personal Papers and Manuscripts, Second Edition (APPM, Second Edition) and Anglo-American Cataloging Rules, Second Edition, Revised (AACR2R), chapter 4. The paper concludes with an admonition for archivists and traditional catalogers to work collaboratively to catalog archival and manuscript collections.

About the author: Lyn Martin is currently senior assistant librarian and cataloger at the State University of New York College of Agriculture and Technology at Cobleskill, New York.
She was previously senior assistant librarian and monographic cataloger (specializing in archival collections and rare books) at the University Libraries, University at Albany, State University of New York, Albany, New York. The research for this paper was conducted as part of a 1991-92 University at Albany Faculty Research Award Program grant. Selected results of this research were presented in June 1993 at the State University of New York Librarians Conference at Binghamton University, State University of New York, Binghamton, New York.

U.S. MARC AMC (MAchine-Readable Cataloging for Archives and Manuscript Control) has taken its place in the mainstream of both archival and cataloging thinking, theory, and practice. More than two decades have passed since the MARC Manuscripts format was introduced by the Library of Congress (LC) and implemented by the Online Computer Library Center, Inc. (OCLC) in 1973. This paper presents a current review of the literature on MARC AMC and reports the results of a survey of its use in 200 U.S. academic archives.

MARC AMC Comes of Age

It has been a decade since the MARC AMC format was approved by the American Library Association's (ALA's) Committee on Representation in MAchine-Readable Form of Bibliographic Information (MARBI),1 and since Stephen L. Henson's Archives, Personal Papers and Manuscripts (APPM)2 was first published by LC. In addition, it has been ten years since OCLC and the Research Libraries Group's (RLG) Research Libraries Information Network (RLIN) implemented the MARC AMC format,3 and five years since Henson's Archives, Personal Papers, and Manuscripts: A Cataloging Manual for Archival Repositories, Historical Societies, and Manuscript Libraries, Second Edition (APPM, Second Edition)4 was endorsed as the standard for archival description by the Society of American Archivists (SAA) Council.5 Further, it has been nine years since Henson forecast:

The MARC Archival and Manuscripts Control . . . format has the potential to change the lives of archivists forever. The format provides a structure for description that is not only fully consistent with archival principles but also compatible with modern bibliographic description. Contemplating the possibilities for information sharing, automated union catalogs, network building, and computerized management is enough to make most archivists positively giddy. Not since the development of the acid-free folder has news this good broken upon the archival horizon. With this new freedom, however, there are new responsibilities.6

1 Working Group on Standards for Archival Description, "Archival Description Standards: Establishing a Process for Their Development and Implementation," American Archivist 52 (Fall 1989): 448.
2 Stephen L. Henson, Archives, Personal Papers and Manuscripts (Washington, D.C.: Library of Congress, Cataloging Distribution Service, 1983).
3 Working Group on Standards for Archival Description, "Archival Description Standards: Establishing a Process," 448.
4 Stephen L. Henson, Archives, Personal Papers, and Manuscripts: A Cataloging Manual for Archival Repositories, Historical Societies, and Manuscript Libraries, Second Edition (Chicago: Society of American Archivists, 1989).
And, now, in the mid 1990s, the "giddiness" Henson forecast is past and the "responsibilities" he anticipated are all too real. MARC AMC has come of age, taking its place (for good or for ill) in the mainstream of archival and cataloging thinking, theory, and practice. There is no doubt that MARC AMC seems firmly entrenched and that it is being used with increasing regularity, particularly in academic environments in the United States. In 1988 Henson reported that OCLC and RLIN databases combined held "almost 150,000 catalog records for man- 5Working Group on Standards for Archival De- scription, "Archival Description Standards: Establish- ing a Process," 449. 'Stephen L. Henson, "The Use of Standards in the Application of the AMC Format," American Archi- vist 49 (Winter 1986): 32. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.3.xu5345u722614jm 8 by C arnegie M ellon U niversity user on 06 A pril 2021 484 American Archivist/ Summer 1994 uscript and archival materials."7 More re- cently, OCLC alone reported the inclusion of 174,468 MARC AMC records in its da- tabase,8 and RLG reported that the "RLIN AMC file is richer than ever," containing "more than 383,000 records."9 The Literature of MARC AMC Both archivists and catalogers have held extensive discussions and debates about MARC AMC over the past two decades, and a wealth of material about the format has been published. However, there has been precious little survey work regarding the actual use of the format. The literature of MARC AMC had been well docu- mented by the Working Group on Stan- dards for Archival Description,10 but the 1989 bibliography (published in American Archivist) has never been updated. There has been a flurry of activity in the archival and library cataloging literature about MARC AMC since Michael Cook's Ar- chives and the Computer11 was first pub- lished in 1980 and since the first five MARC AMC articles were published in 1984.12 'Stephen L. Henson, "Squaring the Circle: The Reformation of Archival Description in AACR2," Li- brary Trends 36: 3 (Winter 1988): 539. 8SUNY/OCLC Network, "Bibliographic Records in the [OCLC] Online Union Catalog by Source of Cataloging," Status Line no. 57 (April 1993): 3. 'Research Libraries Group, "RLIN AMC File Richer Than Ever," The Research Libraries Group News no. 31 (Spring 1993): 10. '"Working Group on Standards for Archival De- scription, "Archival Description Standards: Establish- ing a Process," 494—502. "Michael Cook, Archives and the Computer (Lon- don: Butterworths, 1980). 12See Thomas Elton Brown, "The Society of American Archivists Confronts the Computer," American Archivist 47 (Fall 1984): 366-82; Michael J. Fox, "The Wisconsin Machine-Readable Records Project," American Archivist Al (Fall 1984): 429-31; Richard H. Lytle, "An Analysis of the Work of the National Information System's Taskforce," American Archivist 47 (Fall 1984): 357-65; William J. Maher, "Administering Archival Automation: Development of an In-House System," American Archivist 47 (Fall Most publications were concentrated in the seven-year period from 1984 to 1991. During the four-year period from 1980 to 1983, publications about MARC AMC were scarce. 
No articles appeared during that period, although the first three monographs on the format were introduced.13 It is interesting that the first (and sole) master's thesis on MARC AMC did not appear until 1991,14 and that by 1992 (following the 1984-91 peak period), publications had dwindled further, to just one article.15 As of August 1993, no further works on the format had been published (see figure 1).

The literature of MARC AMC addresses the full spectrum of issues related to the format. The history of MARC AMC is well documented by a large group of publications.16 An array of publications addresses the impact of the format on archival and cataloging education, theory, and practice.17 A host of publications considers "how-to" applications of MARC AMC, giving technical details and the implications of format integration.18 Another group addresses the various MARC AMC-related cataloging standards.19

13. See Cook, Archives and the Computer; H. Thomas Hickerson, Archives and Manuscripts: An Introduction to Automated Access (Chicago: Society of American Archivists, 1981); and Henson, Archives, Personal Papers, and Manuscripts (1983).
14. Sheila H. Martell, "Use of the MARC AMC Format by Archivists for Integration of Special Collections' Holdings into Bibliographic Databases and Networks," M.S.L.S. thesis, University of North Carolina at Chapel Hill, 1991.
15. Joan Warnow-Blewett, "Work to Internationalize Access to the Archives and Manuscripts of Physics and Allied Sciences," American Archivist 53 (Summer 1992): 484-89.
16. See David Bearman, Towards National Information Systems for Archives and Manuscript Repositories: The National Information Systems Task Force (NISTF) Papers, 1981-1984 (Chicago: Society of American Archivists, 1987); Robert D. Bohanan, "Developments and Options in Archival Automation," Journal of Educational Media and Library Sciences 25 (Autumn 1987): 1-21; Walt Crawford, MARC for Library Use: Understanding Integrated USMARC, Second Edition (Boston: G.K. Hall, 1989); Lytle, "An Analysis of the Work of the National Information System's Taskforce," 357-65; Martell, "Use of the MARC AMC Format"; Working Group on Standards for Archival Description, "Archival Description Standards: Establishing a Process," 431-537; and Working Group on Standards for Archival Description, "Standards for Archival Description," American Archivist 53 (Winter 1990): 22-108.
Cloud, "The Cost of Converting to MARC AMC: Some Early Obser- vations," Library Trends 35 (Winter 1988): 573-83; Donald L. DeWitt, "The Impact of the MARC AMC Format of Archival Education and Employment Dur- ing the 1980s," The Midwestern Archivist 16 (1991): 73-75; Anne J. Gilliland, "The Development of Automated Archival Systems: Planning and Manag- ing Change," Library Trends 36 (Winter 1988): 519- 37; Stephen E. Hannestad, "Clay Tablets to Micro Chips: The Evolution of Archival Practice into the Twenty-First Century," Library Hi Tech 9, no. 4 (1991): 75-96; Martell, "Use ofthe MARC AMC For- mat"; Richard V. Szary, "Information Systems for Libraries and Archives: Opportunity or Incompatibil- ity?" in Archives and Library Administration, edited by McCrank: 61-98; Sarah Tyacke, "Special Collec- tions in Research Libraries: Problems and Perspec- tives," Alexandria 2 (December 1990): 11-22; Lisa B. Weber, "Educating Archivists for Automation," Library Trends 36 (Winter 1988): 501-18; Working Group on Standards for Archival Description, "Stan- dards for Archival Description," and "Archival De- scription Standards: Establishing a Process." 18See William E. Brown, Jr. and Lofton Wilson, "The AMC Format: A Guide to the Implementation Process," Provenance 5 (Fall 1987): 27-36; Michael Cook, "The British Move Toward Standards of Ar- chival Description: The MAD Description Standard," American Archivist 53 (Winter 1990): 130-38; Mi- chael Cook, A MAD User Guide (Aldershot: Gower, 1989); Michael Cook and Kristina Gant, A Manual of Archival Description (London: Society of Archivists, 1985, 1986) 51; Michael Cook and Margaret Procter, A Manual of Archival DescriptionNew, 2nd Ed. (Al- dershot: Gower, 1989); Crawford, MARC for Library Use; Max J. Evans and Lisa B. Weber, MARC for Archives and Manuscripts: A Compendium of Prac- tice (Madison: State Historical Society of Wisconsin, group addresses the various MARC AMC- related cataloging standards.19 1985); Henson, Archives, Personal Papers and Man- uscripts (1983); Henson, Archives, Personal Papers and Manuscripts, (1989); Hickerson, Archives and Manuscripts"; Diana Madden "An Overview ofthe USMARC Archival and Manuscripts Control For- mat," S.A. Archives Journal 33 (1991): 47-59; Mar- ion Matters, Introduction to the USMARC Format for Archival and Manuscripts Control (Chicago: Society of American Archivists, 1990); Sally H. McCallum, "Format Integration: Handling the Additions and Subtractions," Information Technology and Libraries 9 (June 1990): 155-161; Frederic M. Miller, Arrang- ing and Describing Archives and Manuscripts (Chi- cago: Society of American Archivists, 1990); Katherine D. Morton, "The MARC Formats: An Overview," American Archivist 49 (Winter 1986): 21-30; Barbara Orbach, "So That Others May See: Tools for Cataloging Still Images," Cataloging & Classification Quarterly 11 nos. 3^t (1990): 163-91; Howard Pasternack, "Online Catalogs and the Ret- rospective Conversion of Special Collections," Rare Books & Manuscripts Librarianship 5, no. 2 (1990): 71-76; Nancy Ann Sahli, "Interpretation and Appli- cation of the AMC Format," American Archivist 49 (Winter 1986): 9-20; Nancy [Ann] Sahli, MARC for Archives and Manuscripts: The AMC Format (Chi- cago: Society of American Archivists, 1985); Richard P. Smiraglia, "New Promise for the Universal Con- trol of Recorded Knowledge," Cataloging & Classi- fication Quarterly 11, nos. 3-4 (1990): 1-15; David C. Sutton, "Full MARCs for Manuscripts," Cata- logue & Index nos. 
19. See David Bearman, "Description Standards: A Framework for Action," American Archivist 52 (Fall 1989): 514-19; Jean E. Dryden, "Dancing the Continental: Archival Descriptive Standards in Canada," American Archivist 53 (Winter 1990): 106-8; Michael J. Fox, "Descriptive Cataloging for Archival Materials," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 17-34; Henson, "Squaring the Circle," 539-52; Henson, "The Use of Standards in the Application of the AMC Format," 31-40; Marion Matters, "Reconciling Sibling Rivalry in the AACR2 'Family': The Potential for Agreement on Rules for Archival Description of All Types of Materials," American Archivist 53 (Winter 1990): 76-93; Edward Swanson, "Choice and Form of Access Points According to AACR2," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 35-61; Richard [V.] Szary, "Archival Description Standards: Scope and Criteria," American Archivist 52 (Fall 1989): 520-26; Sharon Gibbs Thibodeau, "Archival Arrangement and Description," in Managing Archives and Archival Institutions, edited by James Gregory Bradsher (Chicago: University of Chicago Press, 1989), 67-77; and Lisa B. Weber, "Archival Description Standards: Concepts, Principles, and Methodologies," American Archivist 52 (Fall 1989): 504-13.

[Figure 1. Number of MARC AMC Publications by Year. Bar chart covering 1980 through 1993 (1993 is a partial year only), with a Total bar; the y-axis counts the number of publications.]

In addition, 6 articles discuss the application of MARC AMC to specific material types;20 another group of 5 articles examines issues related to authority control, subject, and access;21 a larger group of 17 items addresses the impact of MARC AMC on the bibliographic utilities and networks, as well as on local and stand-alone systems;22 and 2 articles describe technical and telecommunications issues and problems related to the format.23

20. See James Corsaro, "Control of Cartographic Materials in Archives," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 213-228; Linda J. Evans and Maureen O'Brien Will, MARC for Archival Visual Materials: A Compendium of Practice (Chicago: Chicago Historical Society, 1988); Janet Gertz and Leon J. Stout, "The MARC Archival and Manuscripts Format: A New Direction in Cataloging," Cataloging & Classification Quarterly 9, no. 4 (1989): 5-25; Martha Hodges, "Using the MARC Format for Archives and Manuscripts Control to Catalog Published Microfilms of Manuscripts Collections," Microform Review 18 (Winter 1989): 29-31, 34-35; David H. Thomas, "Cataloging Sound Recordings Using Archival Methods," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 193-212; and Lisa B. Weber, "Describing Microforms and the MARC Formats," Archives and Museum Informatics 1 (Summer 1987): 9-13.
21. See David Bearman, "Authority Control Issues and Prospects," American Archivist 52 (Summer 1989): 286-99; Marion Matters, "Authority Work for Transitional Catalogs," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 91-115; Richard P. Smiraglia, "Subject Access to Archival Materials Using LCSH," Cataloging & Classification Quarterly 11, nos. 3-4 (1990): 63-90; Lisa B. Weber, "The 'Other' USMARC Formats: Authorities and Holdings: Do We Care to Be Partners in This Dance, Too?" American Archivist 53 (Winter 1990): 44-54; and Helena Zinkham, Patricia Cloud, and Hope Mayo, "Providing Access by Form of Material, Genre, and Physical Characteristics: Benefits and Techniques," American Archivist 52 (Summer 1989): 300-19.
Smiraglia, "Subject Access to Archival Materials Us- ing LCSH," Cataloging & Classification Quarterly 11, nos. 3-̂ 1 (1990): 63-90; Lisa B. Weber, "The 'Other' USMARC Formats: Authorities and Hold- ings: Do We Care to Be Partners in This Dance, Too?" American Archivist 53 (Winter 1990): 44-54; Helena Zinkham, Patricia Cloud, and Hope Mayo, "Providing Access by Form of Material, Genre, and Physical Characteristics: Benefits and Techniques," American Archivist 52 (Summer 1989): 300-19. 22See David Bearman, "Archival and Bibliographic Information Networks," Journal of Library Admin- istration 7, nos. 2-3 (Summer-Fall 1986): 99-110; David Bearman, "Archives and Manuscript Control with Bibliographic Utilities: Challenges and Oppor- tunities," American Archivist 52 (Winter 1989): 2 6 - 39; Michael Cook, Archives and Manuscripts Control: A MARC Format for Use with a Cooperative Online Database (Liverpool: Liverpool University, Archival Description Project, 1987); W. Theodore Diirr, "At the Creation: Chaos, Control, and Auto- mation—Commercial Software Development for Ar- and telecommunications issues and prob- lems related to the format.23 Although 9 articles do describe specific cataloging projects using MARC AMC,24 chives," Library Trends 36 (Winter 1988): 593-607; Matthew Benjamin Gilmore, "Increasing Access to Archival Records in Library Online Public Access Catalogs," Library Trends 36 (Winter 1988): 609-23; H. Thomas Hickerson, "Archival Information and the Role of the Bibliographic Networks," Library Trends 36 (Winter 1988): 553-71; H. Thomas Hickerson, "Standards for Archival Information Management Systems," American Archivist 53 (Winter 1990): 2 4 - 28; Frederick L. Honhart, "The Application of Mi- crocomputer-Based Local Systems with the MARC AMC Format," Library Trends 36: (Winter 1988): 585-92; Frederick L. Honhart, "MicroMARC:amc: A Case Study in the Development of an Automated Sys- tem," American Archivist 52 (Winter 1989): 80-86; Frederick L.] Honhart, "MicroMARCamc Version 2.11," OCLC Micro 6 (December 1990): 13; Maher, "Administering Archival Automation"; Lawrence J. McCrank, "The Impact of Automation: Integrating Archival and Bibliographic Systems," in Archives and Library Administration edited by McCrank: 61— 98; Kathleen D. Roe, "The Automation Odyssey: Li- brary and Archives System Design Considerations," Cataloging & Classification Quarterly 11, nos. 3—4 (1990): 145-162; Kathleen D. Roe, "From Archival Gothic to MARC Modern: Building Common Data Structures," American Archivist 53 (Winter 1990): 56-66; Tucker "The RLIN Implementation"; David Weinberg, "Automation in the Archives: RLIN and the Archives and Manuscript Control Format," Prov- enance 4 (Fall 1986): 12-31; Ronald J. Zboray, "dBase III Plus and the MARC AMC Format: Prob- lems and Possibilities," American Archivist 50 (Spring 1987): 210-25. 23See Jill M. Tatem, "Beyond USMARC AMC: The Context of a Data Exchange Format," Midwest- ern Archivist 14 no. 1 (1989): 39^47; Sharon Gibbs Thibodeau, "External Technical Standards for Data Contents and Data Values: Prospects for Adoption by the Archival Community," American Archivist 53 (Winter 1990): 94-105. 24See James M. Bower, "One-Stop Shopping: RLIN as a Union Catalog for Research Collections at the Getty Center," Library Trends 37 (Fall 1988): 252-62; James G. 
Carson, "American Medical As- sociation's Historical Health Fraud and Alternative Medicine Collection: An Integrated Approach to Au- tomated Collection Description," American Archivist 54 (Spring 1991): 184-91; Patricia D. Cloud, "Fitting In: The Automation of the Archives at Northwestern University," Provenance 5 (Fall 1987): 14-26; Leon- ard A. Coombs, "A New Access System for the Vat- ican Archives," American Archivist 52 (Fall 1989): 538—46; Fox, "The Wisconsin Machine-Readable Project"; Richard W. Hite and Daniel Linke, "Teaming Up with Technology: Team Processing," D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.3.xu5345u722614jm 8 by C arnegie M ellon U niversity user on 06 A pril 2021 488 American Archivist/Summer 1994 none of them specifically records the extent of use of MARC AMC by U.S. academic archives, or the choice of cataloging stan- dard. Just one article, by Avra Michael- son,25 researches actual use of MARC AMC. Michaelson reports the findings of a survey of 40 repositories in 1987. Her sur- vey was, however, limited to 40 RLIN-re- porting repositories, excluding all other RLIN-reporting repositories and all OCLC- and Washington Library Network-report- ing repositories. Conditions may well have changed since that study was completed. Despite the increased use and accepta- bility of MARC AMC by archivists and catalogers alike, several questions remain: 1. Is anyone (or everyone?) actually cata- loging archival collections and manu- scripts using the format? 2. What AMC cataloging standard is em- ployed? 3. Why was a particular cataloging stan- dard chosen? This study supplies the missing informa- tion regarding the use of MARC AMC by U.S. academic archives and the choice of AMC cataloging standard. Methodology and Sample Questionnaire In January 1992, a questionnaire was mailed to 200 archivists at 200 different academic archives in the United States, randomly selected from the SAA's 1991 Midwestern Archivist 15, no. 2 (1990): 91-97; Wil- liam M. Holmes, Edie Hedlin, and Thomas E. Weir, "MARC and Life Cycle Tracking at the National Ar- chives: Project Final Report," American Archivist 49 (Summer 1986): 305-09; Curtis D. Jerde, "Technical Processing of Popular Music at Tulane University Li- brary's Hogan Jazz Archive: The Rockefeller Pro- ject," Technical Services Quarterly 4 (Summer 1989): 53-60; Warnow-Blewett, "Work to Internationalize Access to the Archives and Manu- scripts of Physics and the Allied Sciences." 25Avra Michaelson, "Description and Reference in the Age of Automation," American Archivist 50 (Spring 1987): 192-208. Directory of Individual Members.26 (See the appendix at the end of this article for a sample of this questionnaire.) The survey was intentionally designed to be simple and direct. It posed a select group of per- tinent questions regarding the actual use of MARC AMC to catalog archival and man- uscript collections, including questions re- garding the choice of cataloging standard; the choice of archival and manuscript cat- alogers; the training of archival and man- uscript catalogers; and the use of MARC AMC in bibliographic utilities, local auto- mated systems, and stand-alone, turnkey MARC AMC systems. Survey results were tabulated and per- centages were calculated and rounded to one decimal place. Initially, tables, ranking the responses for each survey question from greatest to smallest, were created. These tables were later amalgamated, in- corporating several smaller groupings into larger ones. 
Response to the survey was excellent: 140 of the 200 archivists (70%) responded. The random selection of the sample population assisted in ensuring that the full range of academic archives (small to large) was incorporated in the study. To confirm that all sizes were represented, respondents were asked to list the approximate size of their respective collection, in either linear or cubic feet. The respondents reported collections in the full range of sizes. Of the 140 respondents, 105 (75%) described their collections as ranging from 25 to 61,000 linear feet; 30 (21.4%) described their collections as ranging from 60 to 75,000 cubic feet; and five (3.57%) reported that their archives were too newly created (and largely unsurveyed) to approximate the size. There was no correlation between the size of institution or the size of archival and manuscript holdings and the likelihood that the collections were cataloged using MARC AMC. Institutions using MARC AMC encompassed the full spectrum of institutional and archival/manuscript holdings sizes. Respondents were also asked if their collections included manuscripts, in addition to archival materials. All 140 (100%) noted that their collections included both manuscript and archival collections.

Expectations

The study began with seven expectations regarding the use of MARC AMC in U.S. academic archives:

Expectation 1: MARC AMC is used by the majority of U.S. academic archives. Based on the vast discussion of MARC AMC at archival- and library-related meetings and conferences and the considerable volume (and saturation) of the MARC AMC literature, it is expected that the use of MARC AMC has similarly reached the saturation point, with the majority of institutions cataloging their archival and manuscript collections using MARC AMC.

Expectation 2: APPM, Second Edition, is the cataloging standard of choice in use by the majority of U.S. academic archives. Based on SAA's support, the overwhelmingly positive discussion in the literature, and the obvious incorporation of archival practices in APPM, Second Edition, it is expected that the vast majority of institutions would choose that standard.

Expectation 3: The majority of U.S. academic archives enter MARC AMC records into the OCLC database. Although the RLIN database holds the largest number of MARC AMC records, a smaller number of institutions (i.e., the larger research institutions, which hold more extensive and complex collections) enter more records in RLIN.

Expectation 4: The majority of U.S. academic archives enter their MARC AMC records in a local automated system (in addition to a bibliographic utility), and these systems will vary widely. This expectation is based on the assumption that the entry of MARC AMC records parallels the entry of MARC records for traditional library formats, such as books and audiovisuals.

Expectation 5: Some U.S. academic archives enter their MARC AMC records in a stand-alone, turnkey system, such as MicroMARC:amc. Such turnkey systems, tailored to archival control requirements, boast a relatively low start-up cost and require far less training to operate effectively, as compared to most library automation systems.
Also, the detachment or loose affiliation of some academic archives from the institution's library, combined with the incompatibility of archival data with library bibliographic data, makes turnkey systems an attractive choice.

Expectation 6: In the majority of U.S. academic archives, archivists are primarily responsible for cataloging the archival and manuscript collections using MARC AMC. Based on the expectation that archivists were the primary catalogers, it is appropriate also to expect that the SAA workshops would be the training mode of choice.

Expectation 7: The majority of those using MARC AMC have received some special training, primarily from SAA-sponsored workshops. MARC AMC is a complex format, requiring some instruction (even on the part of seasoned archivists and catalogers) to employ it effectively and fully. SAA has continually sponsored the majority of, and the most in-depth, MARC AMC instructional sessions.

Results and Discussion

Use of MARC AMC. The survey results were interesting but, for the most part, not a surprise. The seven expectations proved to be close to the mark.

A majority of the 140 respondents (80, or 57.1%) reported cataloging their collections using MARC AMC (see table 1).

Table 1. Cataloging of Archival and Manuscript Collections by U.S. Academic Archives

Category: Percentage (Number)

MARC AMC (n = 140)
  Yes: 57.1 (80)
  No: 42.9 (60)

Bibliographic Utility or Cataloging System (n = 80)
  OCLC: 45.0 (36)
  RLIN: 25.0 (20)
  WLN: 3.8 (3)
  Local system (OPAC) with bibliographic utility: 62.5 (50)
  Bibliographic utility only: 11.3 (9)
  Stand-alone, turnkey MARC AMC system only: 21.2 (17)
  Stand-alone, turnkey system with bibliographic utility: 0.0 (0)
  Local system only: 5.0 (4)

Local System (OPAC) (n = 54)
  NOTIS: 33.3 (18)
  Institution's in-house system: 16.7 (9)
  CARL: 13.0 (7)
  III: 7.4 (4)
  VTLS: 7.4 (4)
  LIAS: 7.4 (4)
  PALS: 5.5 (3)
  multiLIS: 5.5 (3)
  GEAC GLIS: 1.9 (1)
  GEAC Advance: 1.9 (1)

Stand-Alone Turnkey MARC AMC System (n = 17)
  MicroMARC:amc: 70.6* (12)
  4th Dimension: 5.9* (1)
  Brian Cole Associates: 5.9* (1)
  Cuadra/Star: 5.9* (1)
  Filemaker Pro: 5.9* (1)
  GENCAT: 5.9* (1)

* +.1 due to rounding.

The remainder (60, or 42.9%) reported that their collections were not cataloged at all; that is, records for their collections were not included in a bibliographic utility or in a library printed card or automated catalog. Of these 60 respondents, 48 (80%) listed reasons why they did not catalog the collections. Forty-three (71.7%) stated they did not have enough money, staff, or time to do so, and 5 (8.3%) reported that, although they were not presently cataloging their collections, they were planning to do so in the near future.

All 80 respondents who stated they do use MARC AMC also reported that they catalog on line, using a bibliographic utility in combination with a local automated system, just a local automated system, or a stand-alone software package. Fifty-nine (73.8%) of the 80 reported cataloging on a bibliographic utility: 36 (45%) reported cataloging on OCLC; 20 (25%) reported cataloging on RLIN; and 3 (3.8%) reported cataloging on WLN.
Fifty (62.5%) reported cataloging on a bibliographic utility in conjunction with a local automated system; 9 (11.3%) reported cataloging only on a bibliographic utility; and 17 (21.2%) reported that they cataloged on a stand-alone, turnkey system.

Of the 80 respondents, the 54 (67.5%) who reported cataloging on a local system were asked to identify the system they used. Eighteen (33.3%) reported using NOTIS; 9 (16.7%) used an in-house system; 7 (13%) used CARL; 4 each (7.4%) used III, VTLS, or LIAS; 3 each (5.5%) used PALS or multiLIS; and 1 each (1.9%) used GEAC GLIS or GEAC Advance. Three (5.3%) failed to respond to this question.

The 17 respondents who reported cataloging by only a stand-alone, turnkey system were also asked to specify the system they used. The vast majority (12, or 70.6%) reported using MicroMARC:amc. The remaining 5 were evenly divided (1 each, or 5.9%) among 4th Dimension, Brian Cole Associates, Cuadra/Star, Filemaker Pro, and GENCAT.

Choice of Descriptive Cataloging Standard. The cataloger using MARC AMC for archival and manuscript collections has a choice of descriptive cataloging standards, including two primary choices: APPM, Second Edition, and Anglo-American Cataloging Rules, Second Edition, Revised (AACR2R), Chapter 4.27 As noted earlier, these choices have been hotly debated and discussed in depth in the literature.

The 80 respondents using MARC AMC to catalog their archival and manuscript collections were asked which archival cataloging standard they used and were given a choice of APPM, Second Edition; AACR2R, Chapter 4; LC's Descriptive Cataloging of Rare Books, Second Edition (DCRB, Second Edition);28 and "other." Respondents who selected the "other" category were asked to list the specific standard they used. More than half (50, or 62.5%) were using APPM, Second Edition, and 18 (22.5%) were using AACR2R, Chapter 4 (see table 2).

Table 2. Cataloging Standard Employed by MARC AMC Users (n = 80)

Standard: Percentage (Number)
  APPM, Second Edition: 62.5 (50)
  AACR2R, Chapter 4: 22.5 (18)
  Combination of APPM and AACR2R: 27.5 (22)
  Other institutional- or subject-specific: 15.0 (12)
  DCRB, Second Edition: 0.0 (0)

Twenty-two (27.5%) said they used a combination of APPM, Second Edition, and AACR2R, Chapter 4, noting specifically that they used the former for archival collections and the latter for manuscripts and manuscript collections. Twelve (15%) reported using "other" standards: 11 (91.7%) used an institution-specific standard and 1 (8.3%) used a subject-specific (i.e., medical) standard. No respondents reported using DCRB, Second Edition.

The 80 respondents using MARC AMC were also asked to record their reasons for selecting a specific cataloging standard. Sixty-eight (85%) responded to the question, and the responses centered on several key issues.

27. Michael Gorman and Paul W. Winkler, eds., Anglo-American Cataloging Rules, 2nd ed., 1988 rev. (Chicago: American Library Association, 1988).
28. Library of Congress, Office for Descriptive Cataloging Policy, and Association of College and Research Libraries, Rare Books and Manuscripts Section, Bibliographic Standards Committee, Descriptive Cataloging of Rare Books, 2nd ed. (Washington, D.C.: Library of Congress, Cataloging Distribution Service, 1991).
Thirty-nine (48.8%) of the 68 commented that APPM, Second Edition, is "the" standard of choice for archival collections. One elaborated, saying, "AACR2R is like using a sledgehammer to do eye surgery." Thirty-one (38.8%) of the 68 attributed their choice of APPM, Second Edition, to its being recommended by RLIN trainers, SAA, or other archivists; 4 (5%) said they chose it because it is more clearly presented than AACR2R, Chapter 4; 4 others (5%) used AACR2R, Chapter 4, only as a supplement to APPM, Second Edition; and 2 (2.5%) reported having previous experience with APPM, Second Edition.

Twenty-four (30%) of the 68 said they used AACR2R, Chapter 4, for manuscripts and manuscript collections because it was "the" standard for manuscripts; 8 (10%) noted that OCLC trainers had recommended using AACR2R, Chapter 4, for archival collections; another 8 (10%) noted that their cataloging department chose AACR2R, Chapter 4, as a standard for manuscript and archival collections; and 3 (3.8%) said they did not know there was another standard.

Archivist Versus Cataloger. Of the 80 respondents who used MARC AMC, the majority (48, or 60%) stated archivists and catalogers worked cooperatively to catalog the archival and manuscript collections (see table 3).

Table 3. Collection Catalogers and Training

Category: Percentage (Number)

Collection catalogers (n = 80)
  Archivists and catalogers working together: 60.0 (48)
  Archivists solo: 35.0 (28)
  Catalogers solo: 5.0 (4)

MARC AMC Training (n = 80)
  Yes: 72.5 (58)
  No: 27.5 (22)

Type of Special Training (n = 58)
  SAA MARC AMC workshops: 51.7 (30)
  RLIN training sessions: 24.1 (14)
  OCLC or OCLC network training sessions: 17.2 (10)
  Institution in-house training sessions: 17.2 (10)
  Graduate-level courses: 13.8 (8)
  National Archives training sessions: 10.3 (6)
  State historical societies workshops: 3.4 (2)
  Combination of the above categories: 41.4 (24)

More than one-third (28, or 35%) reported that archivists cataloged the collections solo, contrasted with only 4 (5%) who said catalogers did the task alone. Nearly three-quarters (58, or 72.5%) of those who used MARC AMC also reported that the archivists and catalogers who cataloged the collections had special MARC AMC cataloging training. Twenty-two (27.5%) said archivists and catalogers had no special training. The kind of training varied, and neither archivists nor catalogers were more likely to have had special training. Of the 58 respondents reporting special training, about half (30, or 51.7%) had attended SAA MARC AMC workshops; 14 (24.1%) had received RLIN training; 10 (17.2%) had received OCLC or OCLC network training; 10 (17.2%) had had in-house training; 8 (13.8%) reported taking graduate-level courses including MARC AMC training; 6 (10.3%) had attended National Archives workshops; and 2 (3.4%) had attended state historical society workshops. Twenty-four (41.4%) of the 58 reported that their training had combined two or more of the alternatives.

Respondent Comments. A free-text question gave respondents an opportunity to comment on the cataloging of archival collections. Of the 140 respondents, the vast majority (120, or 85.7%) did include free-text comments. These comments are perhaps the most intriguing and most telling portion of the survey, although they were not unexpected.
Many were similar; select, representative categories and specific comments follow.

• As might be expected, some (31, or 22.1%) of the 140 respondents included descriptions of the type and scope of their institution's archival and manuscript holdings.
• A surprising number (65, or 46.4%) specifically noted "Archivists and catalogers should work together." Seventy (50%) offered specific details regarding how archivists, special collections staff, and catalogers work cooperatively at their institutions.
• Of the 54 respondents who use MARC AMC in conjunction with a local system, 15 (27.9%) briefly described their local systems. Although such comments were expected, a few (3, or 5.6%) of the 54 focused on how well or how poorly MARC AMC records survived in the specific local-system environments.
• Some respondents used the opportunity to defend (or apologize for) their choice of a particular descriptive catalog standard. Two (2.5%) who reported using MARC AMC wrote, "I must use AACR2R, my on-line catalog won't accept AMC records"; one (1.3%) noted, "Our Cataloging Department made us use AACR2 and they wouldn't take no for an answer."
• More than half (36, or 60%) of the 60 respondents who reported not cataloging their collections using MARC AMC provided very telling comments regarding why they were not or could not catalog their collections. These comments seem to say more about the less-than-positive status of many archives and special collections within the larger context of their respective academic institutions than about the efficacy of MARC AMC. Representative comments were, on one extreme, mundane: "We're very small, no need to catalog with MARC AMC"; "We're a new archives, just getting started"; and "No cataloging yet, hope to in the future." On the other extreme were the not-quite sublime: "My institution bought software for a stand-alone system but failed to buy the hardware"; "We survive mostly with volunteers, not enough people or time to catalog"; "Inventorying the collections is a high priority, cataloging is not"; "Not enough money, time or staff to catalog"; "Our institution doesn't support the archives"; and "Formal training? Are you kidding? No money here for that!"

Full Circle: Expectations and Confirmations

The results of this study are not surprising and, for the most part, they confirm the seven expectations listed earlier.

Expectation 1: MARC AMC is used by the majority of U.S. academic archives. The survey results confirm this expectation. A majority of academic institutions, 80 (57.1%) of the 140 total respondents, are using MARC AMC to catalog their archival and manuscript collections. However, keeping in mind all the discussion and debate at archival and library conferences and in the MARC AMC literature, a higher percentage use of the format might have been expected. Comments garnered in the free-text sections of the survey also tend to confirm this expectation and to point toward even greater use of MARC AMC in the future. A need remains for greater education about the access and research benefits gained by using the format and by including MARC AMC records in national databases and local systems.

Expectation 2: APPM, Second Edition, is the cataloging standard of choice, in use by the majority of U.S. academic archives.
The survey results also confirmed this expectation, with a majority (50, or 62.5%) of the 80 respondents who use MARC AMC reporting that they also use APPM, Second Edition. It was not surprising to learn that nearly one-quarter (18, or 22.5%) of the 80 use AACR2R, Chapter 4, for manuscripts and manuscript collections, or that 22 (27.5%) of the 80 use a combination of APPM, Second Edition, and AACR2R, Chapter 4. The fact that any respondents commented that their respective cataloging departments insisted on the use of AACR2R, Chapter 4, even for archival collections, is disappointing. The cataloging community clearly requires more education about the nature of archival collections and about archival theory and practice. This finding also points out the need for (and the benefit of) having catalogers and archivists work together to catalog archival and manuscript collections. Despite the fact that most institutions chose APPM, Second Edition, as the cataloging standard, it would not be surprising to find that some institutions imposed an institution-specific standard or that certain subject disciplines (such as medicine) chose a subject-related standard.

Expectation 3: The majority of U.S. academic archives enter MARC AMC records into the OCLC database. The survey results did not clearly support this expectation. Of the 80 respondents who reported using MARC AMC, the greatest percentage (36, or 45.0%), although not the majority, did report using OCLC. The next greatest percentage (20, or 25%) is using RLIN, and a small percentage (3, or 3.8%) is using WLN.

Expectation 4: The majority of U.S. academic archives enter their MARC AMC records in a local automated system (in addition to a bibliographic utility), and these systems will vary widely. The survey results confirmed this expectation; all 80 respondents (100%) who reported using MARC AMC said they cataloged on line in some manner, using a bibliographic utility in combination with a local automated system, just a local automated system, or a stand-alone software package. Also as expected, respondents who used a local system were using a wide range of systems, including NOTIS, CARL, III, VTLS, LIAS, PALS, multiLIS, GEAC GLIS, and GEAC Advance.

Expectation 5: Some U.S. academic archives enter their MARC AMC records in a stand-alone, turnkey system, such as MicroMARC:amc. This expectation was also confirmed; 17 (21.2%) of the 80 respondents who used MARC AMC cataloged only on a stand-alone, turnkey system. MicroMARC:amc was the primary system of choice, and (also as expected) respondents who chose the stand-alone, turnkey systems did not routinely report their holdings to any bibliographic utility. It is disappointing that databases created with these systems have not been routinely incorporated in the on-line catalog of the home institution or in national bibliographic utilities. This possibility needs to be investigated and developed further by the various library and archival systems vendors, as well as by the bibliographic utilities.

Expectation 6: In the majority of U.S. academic archives, archivists are primarily responsible for cataloging the archival and manuscript collections using MARC AMC. Survey results regarding this expectation were surprising.
The majority (48, or 60%) of the 80 respondents who reported using MARC AMC also reported that archivists and catalogers worked cooperatively to catalog the archival and manuscript collections. A minority reported that archivists alone (28, or 35%) or catalogers alone (4, or 5%) cataloged the collections. This result is both encouraging and exciting. As the use of MARC AMC continues to grow, and as more and more MARC AMC records appear in local as well as in bibliographic utilities databases, it is imperative that archivists and catalogers work together to ensure that the most complete, accurate, and technically correct data and records are incorporated to give the researcher the best means of access. With cataloger and archivist each having different areas of expertise, it is not unreasonable to consider collaborating, rather than demanding that one or the other compensate and provide records that are less than meaningful or less than accurate.

Expectation 7: The majority of those using MARC AMC have received some special training, primarily from SAA-sponsored workshops. The survey results confirmed this expectation. Nearly three-quarters (58, or 72.5%) of the 80 respondents who use MARC AMC also reported that the archivists and catalogers who cataloged the collections had special MARC AMC training. Although the kind of training varied, the majority (30, or 51.7%) reported attendance at SAA AMC workshops. The wide range of opportunities for training proved gratifying, with RLIN, OCLC, OCLC networks, the National Archives, and state historical societies offering workshops. It is interesting that some respondents (8, or 13.8%) had received specific training in MARC AMC in graduate-level library science, information science, or archival courses.

Conclusion

These results are a confirmation that the "giddiness" Henson wrote about in 1986 is over and that the "responsibilities" he mentioned29 are very real. It is the very real responsibility of archivists and catalogers alike to provide the most accurate and comprehensive access to academic archival and manuscript collections in order to fulfill the needs of researchers. It is also the responsibility of archivists and catalogers to keep watch over the evolution of MARC AMC and to see that the format's proponents and reformers carefully take into account the equally evolving needs of the practitioners and researchers who on a continuing basis must deal with the format and the information it contains.

MARC AMC has indeed come of age and has entered the mainstream of archival and cataloging thinking, theory, and practice. Its potential is tremendous. However, with just over half of the 140 respondents to this survey (80, or 57.1%) using MARC AMC, it is clear that the format's potential has yet to be fully realized. It remains the responsibility of archivists and catalogers to make certain that MARC AMC does not fall short of this potential.

In 1990 Lisa Weber speculated: "In MARC for Library Use, Walt Crawford states that 'MARC is the single most important factor in the growth of library automation in the United States and other countries.' . . . While it is still too early to tell whether the MARC . . .
format will have the same impact in the archival community, it appears that some sort of revolution is in the making."30 Further, just two short years after his 1986 forecast, Henson more poignantly stated the following:

    Considering the many fundamental differences between archives and libraries and between bibliographic and archival description, it is difficult to wonder why archivists would willingly subject themselves to the bibliographic angst of reconciling their practices with AACR2. . . . However, the pressures and dawning realization of the 'information age' made this position increasingly untenable. The mistake made along the way was to assume that the common element in archival materials and books lay in their form—that is, 'words on pages.' However, archives and manuscripts are not basically bibliographic in nature and it was not until it was realized that the similarities between published and unpublished materials lay in their features as tools of information and research, that the benefit of their natural alliance could be exploited. The presence of thousands of APPM/AMC cataloging records in bibliographic networks is testimony to the truth of that alliance.31

MARC AMC is here to stay, and it is fully entrenched in archival theory and practice in the United States. The format's face may change over time, and its applications may become broader as format integration becomes a reality during the next few years; nonetheless, MARC AMC is here to stay. As a result, archivists and catalogers are obligated to continue to forge even stronger alliances. Working together, they must use and mold MARC AMC to their best advantage and to the format's greatest and fullest potential (that is, providing full and accurate records for researchers across local, state, national, and international boundaries).

MARC AMC also has no bounds; it has, instead, an inherent limitless and very powerful potential. If the traditional cataloging community, together with the archival communities, can continue to work collaboratively to exploit the full potential of this format, they can provide researchers with more than they ever thought existed and they can begin to take the format into untried territory. But, our "forecasts" do not yet project that far, for that is the topic of another paper.

29. Henson, "The Use of Standards in the Application of the AMC Format," 32.
30. Weber, "Record Formatting: MARC AMC," 117.
31. Henson, "Squaring the Circle," 551.

Appendix 1
Sample MARC AMC Cataloging Survey (Revised 1/30/92)

1. Approximately how large is your institution's archival and/or manuscript collections? (Specify linear or cubic feet.)
   Linear feet
   Cubic feet

2. Do your institution's collections include archival collections, manuscript collections, or both?
   Both archival and manuscript collections
   Archival collections only
   Manuscript collections only
   Other (specify: )

3. Does your institution catalog its collections using the MARC AMC format on OCLC, RLIN, WLN, a local system (OPAC), or a stand-alone turnkey MARC AMC system?
   No (Go to questions 7 and 8)
   Yes, on OCLC
   Yes, on RLIN
   Yes, on WLN
   Yes, on a local system (Specify: )
   Yes, on a stand-alone turnkey system (Specify: )

4. Which cataloging standard(s) is (are) used?
   APPM, Second Edition (Henson)
   AACR2R, Chapter 4
   DCRB, Second Edition
   Other (Specify: )

5. Briefly explain why the cataloging standard(s) listed in Question 4 was (were) chosen.

6. Who catalogs the collections?
   Archivists only
   Catalogers only
   Archivists and catalogers working together
   Other (Specify: )

7. Did the archivists/catalogers receive special MARC AMC format training?
   No
   Yes (Specify source: )

8. Other comments regarding cataloging of archival and manuscript collections using the MARC AMC format:

9. If you would like survey results prior to publication, please list your name and address or enclose a business card:

work_6flohhulhfci5dbzxurgsc2i64 ----

Discovery and Delivery: Making it Work for Users

Serials Librarian, 2009, Vol. 56, Issue 1-4, pp. 79-93. ISSN: 0361-526X (Print), 1541-1095 (Online). DOI: 10.1080/03615260802679127
http://www.taylorandfrancis.com/
http://www.tandfonline.com/toc/wser20/56/1-4
http://www.tandfonline.com/doi/abs/10.1080/03615260802679127
© 2009 Routledge, Taylor & Francis Group

Discovery and Delivery: Making it Work for Users

CAROL PITTS DIEDRICHS, Presenter

User expectations for complete and immediate discovery and delivery of information have been set by their experiences in the Web 2.0 world. Libraries must respond to the needs of those users whose needs can easily be met with Google-like discovery tools as well as those who require deeper access to our resources. What has happened to bring us to this time in the evolution of library collections and services? What characterizes user expectations and how are we fulfilling them today? What can we do to prepare for the future? Are we prepared for what is to come?

INTRODUCTION

User expectations for complete and immediate discovery and delivery of information have been set by their experiences in the Web 2.0 world. Libraries must respond to the needs of those users whose needs can easily be met with Google-like discovery tools as well as those who require deeper access to our resources. I would first like to talk about what characterizes user expectations today: specifically, exceptional service, convenient tools, and user recommendations and feedback.

One of my favorite examples of the kind of exceptional service that users have become accustomed to comes from Jenny Levine's blog, The Shifted Librarian:

    So I ordered it from Amazon and waited for it to arrive. And waited. And waited. And waited. After a couple of weeks, I decided to contact Amazon about it, so I started digging through the 'where's my stuff' screens to find a phone number to call. Eventually, I got to a screen that had a button on the right-hand side that said "telephone help" or something like that. I clicked on that button and a small window popped up. It asked for my phone number and for when I wanted them to call me. I gave them my number and chose the 'right now' option. Sure enough, my phone immediately started ringing! I was put into their automated voice tree and was able to get to a human being to resolve the problem. It's just the kind of thing I'd love to see from a library, both on its own web site and embedded in others' sites (local government pages, school pages, etc.).1
The second example is a personal one involving my hunt for slippers for my husband last Christmas. I had his shoe size but did not know how to relate that to the options available as small, medium, or large. So I clicked on "live help," had an e-mail exchange with someone at the company, and finished my transaction very quickly.

Our users have also become very accustomed to convenient tools. Joan Lippincott has written about the disconnects that millennial students have with today's library. They depend on Google or other search engines for discovery of information resources rather than consulting library Web pages, catalogs, and databases. These students often find library-sponsored resources difficult to figure out on their own, they are seldom interested in formal instruction in information literacy, and they prefer to use the simplistic but responsive Google.2

There are many, many examples of convenient tools available to students on their favorite websites. One example in the book world is from the publisher, HarperCollins, who includes an author tracker feature for forthcoming books by your favorite author. You can even add a widget for a particular book that keeps track of the time until the book is in stores. With this widget you can see how many days until the author's next book will be available.3

Users also expect to have access to value-added content on websites that range from Amazon to epinions.com. In addition, they expect to be able to add their own recommendations and feedback on virtually every interaction they have on the Internet. A good example is Amazon's "Search Inside the Book" feature, which includes details such as the front and back cover, an excerpt, and the table of contents, as well as a section called "customers who viewed this book also viewed," followed by a listing of books of possible interest.

So let me close this introduction with another personal example. Some of you know that when I am not doing library stuff, I am a very serious golfer. I recently was on a golf course here in Arizona and saw this very cute head cover of a cactus (head covers look like stuffed animals that protect the heads of your golf clubs). So, of course, I thought, I have to have one of those. One logical approach was to think about what stores on the Internet might have such a head cover, most likely a place like Golf Galaxy or Golfsmith. So I searched for those stores on the Internet; one of them had nine screens of head cover options, but no cactus head cover. So my next thought was Amazon; they sell everything, right? A search for cactus head cover on the Amazon site found twenty-four pages with 5,130 results. Finally I realized I could add a keyword, and so twenty minutes later I had located and purchased my head cover. This process was similar to finding something on a library website. But that is not what I did. Instead, I behaved like most of our users and put the words "cactus head cover" into Google. On the first page, the first four entries that resulted from my search led me quickly to a place where I could purchase the head cover. The elapsed time was two minutes. Essentially, I agree with Mike Eisenberg who said, "The simple but powerful 'Google search box' is a model for what we need in libraries—beyond federated search, this means one-step immediate access to the full text of library resources. We can claim success when people use the library search as readily, easily, and often as they do Google."4
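Convenience features of this kind are usually very small pieces of engineering. The HarperCollins-style countdown mentioned above, for instance, is nothing more than date arithmetic; the Python sketch below illustrates the idea, with a hypothetical release date rather than any real publication data.

```python
# Minimal sketch of a "days until the book is in stores" countdown of
# the kind described above. The release date used here is an invented
# placeholder, not real HarperCollins data.
from datetime import date

def days_until_release(release, today=None):
    """Return the number of whole days remaining before the release."""
    today = today or date.today()
    return max((release - today).days, 0)

remaining = days_until_release(date(2009, 6, 2), today=date(2009, 5, 1))
print(f"Only {remaining} days until the next book is in stores!")  # 32
```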
DISCOVERY—WHERE WE HAVE BEEN

So with those kinds of expectations before us, let's talk about discovery, starting with where we have been. This area is characterized by various layers of discovery, silos by format, a focus on tangible or purchased resources, chaos in keeping track of and distinguishing what we own, and forcing users to come to us.

The first layer of discovery is local, including the library's online catalog, your listing of electronic resources (databases and journals), and usually a separate database for digital content created by the library. Most library websites have individual silos by format or type of content. It is not uncommon for a user to search in one of our silos using keywords. But when their keyword fails to match anything, they get a small message that says "no matching records." No other self-respecting website would give them anything less than a "did you mean" option or at least a listing of where their keyword would have fallen in an alphabetical list of resources.

For most of us at this stage, the second layer of discovery is a statewide, consortial, or regional layer. Some systems like OhioLINK have a relatively simple single keystroke to transfer a search from the local catalog to the statewide catalog, but most users have to have a fair level of sophistication to understand and use this next layer of discovery. For most of us, OCLC's WorldCat serves as the national and international layer of discovery. Our most sophisticated users understand this layer and may use it as a starting point for their search, working their way down to the local level. Many faculty have come to understand that they could search WorldCat first and find a larger universe of materials and then request materials not owned by their library via interlibrary loan. Most of our focus with these layers of discovery has been on tangible or purchased resources and with making users come to the library to get what they need.

DISCOVERY—WHERE WE ARE NOW

So, where are we now with discovery? Many libraries have begun to improve discovery in our legacy systems and explore new options. North Carolina State University Libraries (NCSU) was one of the first to unveil a new approach to searching in their existing library catalog using Endeca software. Launched in January 2006, NCSU used an existing software product that had never been applied to the rich data in existing catalog records. This implementation began to look much more like popular websites rather than existing online catalog interfaces. For the first time, users could capitalize on embedded options in the record to refine their search by categories such as location, genre, format, language, material type, publication date, and popularity. Results could be sorted by publication date, relevance, call number, and title. The individual components of subject headings were available to narrow your results. For example, for a title with the Library of Congress Subject Heading "Political corruption–United States–History–20th century," the user could narrow search results by each of the elements. Features such as "did you mean?" when a potential typographical error was detected were available. The user could also easily see options such as "More titles like this" or "More by this author," with all of these features capitalizing on the existing descriptive cataloging information.
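It is worth pausing on how little new data this kind of interface requires. A pre-coordinated heading such as "Political corruption–United States–History–20th century" already carries its own facets: splitting it on the subdivision marks and counting the pieces yields clickable narrowing options. The sketch below illustrates that decomposition with invented sample records; it is not NCSU's actual Endeca configuration.

```python
# Rough sketch of deriving facets from existing subject headings:
# split each pre-coordinated LCSH string on its subdivision marks and
# count the resulting values. The records are invented examples, not
# data from any real catalog.
from collections import Counter

records = [
    {"title": "Scandal and Reform",
     "subjects": ["Political corruption--United States--History--20th century"]},
    {"title": "Machine Politics",
     "subjects": ["Political corruption--United States",
                  "Municipal government--United States--History"]},
]

facet_counts = Counter()
for record in records:
    for heading in record["subjects"]:
        # Each subdivision becomes an independent facet value that the
        # user can click to narrow the current result set.
        for facet in heading.split("--"):
            facet_counts[facet.strip()] += 1

for facet, count in facet_counts.most_common():
    print(f"{facet} ({count})")
# United States (3), Political corruption (2), History (2), ...
```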
Then our integrated library system vendors began to release new interfaces for their own systems, interfaces that were also marketed as available for addition to a competitor's online catalog. In his blog, Library Technology Guides, Marshall Breeding unveiled the new Primo interface from Ex Libris, the first time that Primo had been made available to the general public.5 Ex Libris has also incorporated Google's Book Search API so that it brings content back into the catalog from the "About the Book" pages in Google's Book Search.

The University of Kentucky (UK) Libraries contracted with Innovative Interfaces as a development partner for their Encore product. Encore too is designed to be independent of the underlying integrated library system, and UK's implementation places Encore as a discovery layer for its Voyager system. Encore includes faceted searching using the richness of the cataloging records and also adds graphic features such as book jacket displays for recently added materials. Encore uses tag clouds to provide the user with familiar options for enhanced subject access. Tag clouds are a great way to get the work of catalogers and technical services staff into the hands of users without having to teach them the nuances of the formal subject headings. All of these features and displays are dynamic, corresponding to the current search results. As those results change, so do these features. One of the most appealing features of this interface is that the user is never faced with a dead-end search. The Encore interface can be customized and branded so that the user is aware that the interface is provided by the University of Kentucky Libraries, and it also incorporates familiar tools such as "did you mean." When location or availability information is needed, the user is quickly linked to the exact search screen for his title in the Voyager system.6

There is at least one open source option in the area as well, VuFind, developed by Villanova University, described as follows:

    VuFind is a library resource portal designed and developed for libraries by libraries. The goal of VuFind is to enable your users to search and browse through all of your library's resources by replacing the traditional OPAC to include:
    • Catalog Records
    • Locally Cached Journals
    • Digital Library Items
    • Institutional Repository
    • Institutional Bibliography
    • Other Library Collections and Resources
    VuFind is completely modular so you can implement just the basic system, or all of the components. And since it's open source, you can modify the modules to best fit your need or you can add new modules to extend your resource offerings.7

Of course, VuFind has many of the same features available in commercial systems, including faceted browsing, live record status and location, the ability to customize and brand, and suggestions for similar resources. It also has the advantages and disadvantages of open source: no ongoing support fees, but a commitment to some level of programming and development.

Andrew Pace, one of the librarians responsible for the implementation of Endeca at NCSU, documents that he first began talking about these interfaces as nothing more than "lipstick on a pig" in 2002.8 Roy Tennant elaborates on this in a blog post titled "Lipstick on a Pig 2.0":

    In the past I've quoted my esteemed colleague at the North Carolina State University Library, Andrew Pace, calling minor library catalog improvements "lipstick on a pig." Sure, the pig may look a little bit better, but it's still a pig. The point of this is not to merely insult library catalogs, but to identify that in focusing on gloss instead of substance is to miss the real point. Our systems are more broken than that.9
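Whatever one makes of the "lipstick" critique, features such as tag clouds show how far existing cataloging data can be stretched with very modest code: a tag cloud is essentially a frequency count mapped onto font sizes. The sketch below is a generic illustration of that idea, not Innovative's actual Encore algorithm.

```python
# Generic sketch of tag-cloud weighting: scale each term's frequency
# in the current result set to a font size between a minimum and a
# maximum. Illustrative only; not Encore's actual implementation.
def tag_cloud(term_counts, min_px=12, max_px=32):
    """Return {term: font size in pixels}, scaled linearly by count."""
    lo, hi = min(term_counts.values()), max(term_counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when counts are equal
    return {
        term: round(min_px + (count - lo) * (max_px - min_px) / span)
        for term, count in term_counts.items()
    }

sizes = tag_cloud({"Kentucky": 40, "Horses": 12, "Bourbon": 8})
print(sizes)  # {'Kentucky': 32, 'Horses': 14, 'Bourbon': 12}
```

Because the counts are recomputed for each result set, the cloud changes as the search narrows, which is exactly the dynamic behavior described above.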
The point of this is not to merely insult library catalogs, but to identify that in focusing on gloss instead of substance is to miss the real point. Our systems are more broken than that. 9

I prefer to think of these interfaces as transitional. While we move ahead with development of better systems, an improved discovery layer is useful to our users and better than the status quo. In essence, old challenges linger, but discovery has improved.

The marketplace also now includes new options for improving discovery for users. OCLC has placed WorldCat on the open Web at WorldCat.org and made it freely available to the public. The website is devoted to searching WorldCat libraries and includes the entire WorldCat database. It is a permanent destination much like Amazon or eBay. OCLC also introduced a new search box that can be added to your library website or your Facebook account or blog (a sketch of the hand-off behind such a box appears at the end of this discussion). 10 The WorldCat.org website is clearly labeled—Find items in libraries near you. Once a user locates a title of interest in WorldCat.org and clicks on that title, they are presented with a variety of options for accessing that book, including purchase through Amazon. More importantly, the system assesses their location and presents local library options. For example, when I search a title, the system interprets my information as the University of Kentucky without my having to do anything and presents me with options for getting the book itself at UK or from the next closest libraries that own it. The system also links seamlessly to the user-initiated ILL system to allow me to place my own request for the item. The individual item page also gives the user a clear sense of the range of things he or she can do, such as:
• Get it (search my library, purchase from Amazon)
• Save it (add the page to my favorites or a list)
• Add to it (review the item or add to its public notes)
• Share it (add links in popular tagging tools such as Digg, MySpace, Facebook, and Del.icio.us)

On October 9, 2005, WorldCat.org debuted new options for users to contribute content to records. These social and personalization tools allow a user to:
• Build their own personalized lists of books, videos, and other library-owned items; lists can be public or private, and public lists can be searched and shared with friends
• Add their own ratings and reviews of an item, and contribute to collaboratively edited contextual notes
• Add library materials to their social bookmarks at sites such as Digg, Del.icio.us, and Facebook with the Add This widget available on item record pages 11

In addition, OCLC has entered the marketplace with another new option called WorldCat Local, 12 which debuted at the University of Washington in May 2007. WorldCat Local presents the entire WorldCat database to the user in lieu of the library's local catalog but presents results in order of the most accessible to the user. The first items displayed are available locally, followed by those shared consortially and then open access and other global collections. As a result, for the user it is much like searching Amazon but with a focus on materials that can be made available by the local library first. OCLC has also loaded article-level information into WorldCat so that the user no longer must search in individual silos for articles, books, or digital items. In the Ohio State University implementation of WorldCat Local, it is easy to find archival finding aids with links to full-text digital items as well.
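Whether reached through WorldCat.org itself, the embeddable search box, or a WorldCat Local view, the entry point is an ordinary Web URL, which is what makes embedding so easy. Below is a minimal sketch of the hand-off behind such a box; it assumes the /search?q= pattern used by the public site and is illustrative rather than OCLC's actual widget code:

    # Minimal, illustrative sketch of the hand-off behind an embedded
    # WorldCat.org search box; assumes the site's /search?q= URL pattern.
    from urllib.parse import urlencode

    def worldcat_search_url(keywords: str) -> str:
        return "http://www.worldcat.org/search?" + urlencode({"q": keywords})

    print(worldcat_search_url("klondike gold rush"))
    # http://www.worldcat.org/search?q=klondike+gold+rush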
When the user reaches the stage of accessing a particular item, the features from WorldCat.org mentioned previously are available. When searching the University of Washington catalog, the interface knew that I was not eligible to borrow materials directly from that catalog and seamlessly forwarded my query to the UK interlibrary lending page for user-initiated ILL requests. Susan Marcin and Peter Morris discussed the differences between the Encore discovery tool and WorldCat Local in a recent article:

However, at the heart of it, these [Encore and WorldCat Local] appear to be two vastly different products with different goals in mind. A library might not even need a native ILS interface if it used WorldCat Local. The WorldCat Local interface is fully developed—a library could abandon the ILS and rely on a basic (perhaps open source) inventory control system for circulation and acquisitions. Encore is not designed with this goal in mind. It is not meant to replace the native ILS but intended to enhance it. 13

You can read more about these various systems in the July/August 2007 issue of Library Technology Reports. 14

DISCOVERY—WHERE WE ARE GOING
So, where are we going? The Association of College & Research Libraries Environmental Scan 2007 15 refers to an article by Karen Markey that "asserts that OPACs lost the battle for user attention to Web search engines. While some authors discuss how [the] OPAC may be brought back, other authors doubt that it will make a comeback. Some of them claim that librarians have waited too long to respond to search engines." 16 Karen Schneider notes that "improving the catalog as a finding aid does not necessarily improve its ability to be a destination." 17 In short, we are moving toward services that bypass the library website by taking content to the user and seamlessly fulfilling their requests. Some of the services mentioned earlier are early examples of where we will likely be in the future.

In a March 2006 presentation at the University of Michigan Mass Digitization Conference, Daniel Greenstein reported observing students at the University of California, San Diego with two browsers open: one on Amazon, using its "Search Inside the Book" feature to find what they wanted, and the other on the library catalog, to locate the call number. 18 One question we need to ask ourselves is how this might change once more full text is freely available on the Web. Open WorldCat, OCLC's project to expose library content and integrate it with Web search engines and bibliographic and bookselling sites, stated the following benefit to libraries:

The result: OCLC member libraries are more visible on the Web, and their catalogs are more accessible from the sites where many people start their search for information. . . . "Opening" WorldCat records to the Web helps libraries and other institutions provide a fast, convenient service to current and potential users through familiar Web channels. Open WorldCat points more people—even those who do not typically visit libraries—to library collections as a first source of information. It promotes the value of libraries on a scale greater than any library or group could achieve alone. 19

When Open WorldCat first debuted in early 2006, traffic from people accessing WorldCat via the open Web quickly eclipsed user traffic through any other means. Google Scholar is another example of the integration of library information in tools on the open Web.
With a particular focus on books and journal articles of interest to scholars, users no longer need to navigate the various databases we provide for them but rather can search across the content in Google Scholar. When a user searches in Google Scholar and finds something of interest, one of the options available to them is Library Search. From that option the user quickly moves into the systems described earlier, which provide local library access to and information about the content. At the University of Kentucky we have enabled our SFX link resolver so that users in Google Scholar can find content that has been licensed by us for their use. Google Scholar eliminates the need for a user to understand the distinctions between databases. For example, in 2006, JSTOR announced that it had "entered into an arrangement enabling Google Scholar to index the archive. . . . Since 2001, while the overall number of Articles Viewed has increased more than sevenfold, links into JSTOR have increased by 32 times. [There was] a spike in the early part of this year, bringing in around 4 million links in only two months—apparently the result of Google's recently completed index of JSTOR." 20 As more publishers enable access to their content through Google Scholar, users will find what they need there and be able to quickly transition to the full-text content.

In another example, Microsoft "evangelist" Jon Udell developed his LibraryLookup Project, which created a bookmarklet that, clicked while the user is viewing a book on a site such as Amazon, opens the local library catalog in a second browser window to show whether the library owns the book. In further refinements, the library can install a script for its local catalog that will automatically insert a link in Amazon below the title of the book, indicating the status of the book in the user's library catalog. Clicking on the link brings the user to the library record for that book. Even if the book is checked out, the status in Amazon will read "due back at the [blank] Library on [due date]." 21

The Google Library Project, or Google Book Search, aims to digitize the library collections of its twenty-plus partners and make that full text available. Google makes its goals for this project clear:

The Library Project's aim is simple: make it easier for people to find relevant books—specifically, books they wouldn't find any other way such as those that are out of print—while carefully respecting authors' and publishers' copyrights. Our ultimate goal is to work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers discover new readers. 22

On their "About Google Book Search" page, Google makes it very easy and clear to the user what they can do once they find a book. Options range from buying the book, to searching to learn more about the book, to finding related information, to borrowing it from a library. 23 This page alone is an excellent example of how much better our competitors are at being useful to their users than we are. The reference pages for each book are also very rich with content.

Another example of exposing content is the work done by the University of Washington and others to incorporate their content in Wikipedia. If you do a search in Wikipedia for Klondike Gold Rush and scroll down to the external links, you will find an entry labeled University of Washington Libraries Exhibit. 24
Clicking this link takes you to photographs related to the Klondike Gold Rush that are available from the University of Washington Libraries. In addition, on this same page you will find a link to the Wikimedia Commons. Following this link takes you to a group of photos, most of which are in the public domain and posted by others. One of those was posted by the Library and Archives of Canada, an additional example of what libraries and archives are doing to take their content to where users are already searching for information. If you wonder about the value of adding content to Wikipedia, consider this fact: "Wikipedia . . . is now the seventeenth-most-popular site on the Internet, generating more traffic daily than MSNBC.com and the online versions of the New York Times and the Wall Street Journal combined." 25 In a 2007 study by the Pew Internet & American Life Project, "more than a third of American adult internet users (36%) consult the citizen-generated online encyclopedia Wikipedia. . . . And on a typical day in the winter of 2007, 8% of online Americans consulted Wikipedia. . . . Wikipedia is far more popular among the well-educated than it is among those with lower levels of education." 26

In a related example, the Library of Congress (LC) is attempting to engage users with its content by posting images from its collections in Flickr. 27 This project began with three thousand of LC's most popular photos being posted in January 2008. During the first week, there were more than 650,000 views of the images, comments were posted on more than 400 individual photos, and 1,200 photos were marked by people as favorites. 28 While reviewing these images in preparation for this presentation, I found an entry where a Flickr user had added a comment correcting information about the entry, citing an authoritative article in the New York Times digital archive. LC then responded, "Thanks for your information about the date. We'll add the information to the source data and reload the description." 29 Not only has the Library of Congress connected with its users in their environment, it has been able to add valuable context and detail to the metadata about these images.

LibraryThing is yet another example of new sites available to Internet users who love books. LibraryThing defines itself as "a site for book lovers. LibraryThing helps you create a library-quality catalog of your books. You can do all of them or just what you're reading now. And because everyone catalogs online, they also catalog together. LibraryThing connects people based on the books they share." 30 Many of the vital statistics about LibraryThing are available in its Zeitgeist Overview. As of July 16, 2008, more than 29 million books (3.46 million unique titles) had been cataloged on the site by more than 450,000 users, and more than 450,000 reviews had been added. 31 In essence, the site has formed a community around the love of books; it is not just about cataloging your collection. I have a LibraryThing account for one primary purpose. Rather than cataloging the books I own, I enter what I have been reading in order to provide a feed to my blog, which I use to provide a weekly update to my library staff about what is going on in the UK Libraries.

LibraryThing also has a section for libraries. For a moderate cost, you can export your library's list of books to LibraryThing. Then with a few lines of HTML, your catalog can be enhanced in three ways:
• Book recommendations. High-quality "recommended" or "similar" books, like reader's advisory that points to books available in your library.
• Tag-based discovery. LibraryThing for Libraries provides tag clouds for books, and tag-based search and discovery, drawing on the over 31 million tags added by LibraryThing members.
• Other editions and translations. Links to other editions and translations of a work. (This works much like the FRBR model.) 32

In these same FAQ sections of LibraryThing, you can take a tour and follow links to libraries that have actually implemented this product. 33 One good example is the Deschutes Public Library in central Oregon. 34 Not only has this library pulled in content from LibraryThing, it has also implemented the piece of software (an API) that lets its catalog link easily to content in Google Book Search. When a user searches the library catalog at the Deschutes Public Library and encounters a book with a match in Google Book Search, the screen displays a link labeled "preview this book at Google." As noted on the Google Blog, "This enables Deschutes readers to preview a book immediately via Google Book Search so that they can then make a better decision about whether they'd like to buy the book, borrow it from a library or whether this book wasn't really the book they were looking for." 35 And lest we think this is just a feature for public libraries, the preview feature has been implemented by the University of Texas Libraries as part of their partnership with Google to digitize their collection. When you click on a link in the catalog for a book under copyright, you are taken to a limited preview for the title with options such as contents, popular passages, subjects, and so on. This feature has also been added to Ex Libris' Primo interface.

One final example is Open Library, whose goal is to have one Web page for every book ever published. Open Library is a project of the Internet Archive and is partly funded by a grant from the California State Library. Everything about the project is open—open software, open data, open documentation, and a freely available website. The project is growing, as stated on its website: "To date, we have gathered about 30 million records (13.4 million are available through the site now), and more are on the way. We have built the database infrastructure and the wiki interface, and you can search millions of book records, narrow results by facet, and search across the full text of 230,000 scanned books." 36 Will this compete with WorldCat?

Let me close this section with a quote and admonition from Lorcan Dempsey: "Discovery happens everywhere and discovery without fulfillment disappoints." 37 In essence, we must help users discover information wherever they are, but discovery is not enough; it must be converted to fulfillment for the user.

WHAT REMAINS TO BE DONE? ARE WE PREPARED?
So, why should all of this matter to us? We must begin to work differently than in the past. Technical services and collection librarians have a rich set of skills to contribute in today's libraries, but those skills must evolve to include new approaches to enriching the content and experiences of our users. In December 2007, OCLC announced that it was conducting a pilot project to capture metadata from publishers and vendors further upstream in the publishing process.
This pilot flows out of the recent Library of Congress Report on the Future of Bibliographic Control, 38 which recommended that the library community make use of bibliographic information earlier in the supply chain. 39 This is where we are headed—less focus on touching every record and more focus on enrichment. Roy Tennant says it well: "this means forgetting about 'control' and getting good about 'enrichment.'" 40

We must also experiment and use new tools. For example, instead of building a Web page, the Cataloging Department at the University of Georgia used a Del.icio.us account to create a page of resources for library employees. This was a much simpler approach to bookmarking a set of heavily used sites than the complexity of designing a Web page. 41

How might we use the skills of cataloging to bring disparate special collections together? Increasingly we will need to bring cataloging and metadata skills to bear on unique special collections, including building context around collections. At the University of Kentucky Libraries, we have a wonderful oral history of Happy Chandler, who was the Governor of Kentucky (available in the Kentuckiana Digital Library, http://kdl.kyvl.org/). Happy Chandler also served as commissioner of baseball. How do we bring together this oral history with his biography on the National Baseball Hall of Fame site (http://web.baseballhalloffame.org/index.jsp)?

At the Orange County Public Library in Orlando, Florida, the online catalog and MARC records have been used in a variety of unconventional ways. A MARC record was created for the catalog to promote a fundraising event featuring author Carl Hiaasen. The record listed all of his titles in the 505 field and also contained a link that led to a Web page to purchase tickets for the event. The library also created MARC records in Spanish and English for their Live Homework Help site. The record had many subject headings for school subjects, such as math, with an 856 link to the homework help database. Other examples include enhanced records for language learning tools and career books. During the years that Oprah Winfrey was selecting books for her book club, public libraries had difficulty coping with the demand for the titles because they had little advance notice of the title to be selected. Orange County found an innovative way to deal with the problem by simply adding a record to their catalog that was titled "Oprah's New Pick." Library users could then place a hold request for the book long before the actual title was known. The library could assess the expected demand and order accordingly once the title was known. Orange County employs librarians, known as Digital Access Architects, who work full-time to deliver customer service through the use of technology. 42

What would happen if the library worked like NetFlix? "NetFlix is easy, personal, fast, and convenient. It assists users in finding titles they will not only enjoy but titles that they are probably very excited to find because they are surprised that they could be found or they've never heard of them before. Their choices are not limited to the blockbusters of the day. NetFlix makes it very easy for customers to borrow and return titles. NetFlix is to movies as libraries should be to books." 43 Unfortunately, libraries have not become more like NetFlix, but a book version of NetFlix, BookSwim, has entered the marketplace.
“BookSwim is the first online book rental library club lending you paperbacks and hardcovers netflix-style directly to your house without the need to purchase! Whether it’s New Releases, Bestsellers, or Classics, we’ve got over 200,000 titles to choose from, with free shipping both ways! Read your books as long as you want—no late fees! Even choose to purchase and keep the titles you love!” 44 And our users now have options that mean they may not need a library unless we learn to more effectively market and sell our services to users and provide them with the options they expect. CONCLUSION In conclusion, I have one piece of parting advice which I must credit to our OCLC colleague, Andrew Pace: Outside my window is one of the large ponds that dot the OCLC campus in Dublin. Slightly frozen, I saw several members of Dublin’s rather robust goose population crossing the thin ice covering the pond. Mind you, it wasn’t quite comical—it was actually done with as much grace as a goose can muster in such an exercise. . . . It dawned on me that these geese were not afraid because if the ice breaks, they can swim; and if the water is too cold, they can fly. Well, the metaphor for librarianship is almost too easy here. I would argue that sometimes fear of the cold water makes people forget they can fly. 45 So take some chances, I know you can swim and I think we can fly. NOTES 1. Jenny Levine, “Call Me.” The Shifted Librarian. July 10, 2005. http://www.theshiftedlibrarian. com/archives/2005/07/10/call_me.html (accessed July 18, 2008). 2. Joan K. Lippincott, “Net Generation Students and Libraries,” EDUCAUSE Review 40 (March/April 2005): 57, http://net.educause.edu/ir/library/pdf/ERM0523.pdf. 3. HarperCollins. 2008. http://www.harpercollins.com/ (accessed July 18, 2008). 4. Mike Eisenberg, “The Parallel Information Universe,” Library Journal 133 (May 1, 2008): 25, http://www.libraryjournal.com/article/CA6551184.html. 5. Marshall Breeding, “Primo now public at Vanderbilt University,” Library Technologies Guide. 20 August 2007. http://www.librarytechnology.org/blog.pl?ThreadID=15 (accessed July 18, 2008). 6. Innovative Interfaces Inc., “Encore.” Innovative Interfaces. 2008. http://www.iii.com/products/encore.shtml (accessed July 18, 2008). 7. Falvey Memorial Library, Villanova University. VuFind. http://vufind.org/ (accessed July 18, 2008). http://net.educause.edu/ir/library/pdf/ERM0523.pdf. http://www.harpercollins.com/ http://www.libraryjournal.com/article/CA6551184.html. http://www.librarytechnology.org/blog.pl?ThreadID=15 http://www.iii.com/products/encore.shtml http://vufind.org/ 8. Andrew Pace, Comment on “Looking Back, Looking Forward.” Hectic Pace. December 12, 2007. http://blogs.ala.org/pace.php?title=looking_back_looking_forward&more=1&c=1&tb=1 &pb=1 (accessed July 18, 2008). 9. Roy Tennant, “Lipstick on a Pig 2.0.” Tennant: Digital Libraries. May 4, 2007. http://www. libraryjournal.com/blog/1090000309/post/220009022.html (accessed July 18, 2008). 10. CLC, “WorldCat.org interface features.” Oclc.org. 2008. http://www.oclc.org/worldcatorg/ features/default.htm (accessed July 18, 2008). 11. OCLC, “WorldCat.org: Overview.” Oclc.org. 2008. http://www.oclc.org/worldcatorg/overview/default.htm (accessed July 18, 2008). 12. OCLC, “WorldCat Local.” Oclc.org. 2008. http://www.oclc.org/worldcatlocal/ (accessed July 18, 2008). 13. 
13. Susan Marcin and Peter Morris, "OPAC: Placing an Encore Front End onto a SirsiDynix ILS," Computers in Libraries 28 (May 2008): 9, http://www.iii.com/news/reprints/EncoreWithSirsiDynix.pdf.
14. Marshall Breeding, "Next-Generation Library Catalogs," Library Technology Reports 43 (July/August 2007): 1–44.
15. Association of College & Research Libraries, "ACRL Environmental Scan 2007," Acrl.org. January 2008. http://www.acrl.org/ala/acrl/acrlpubs/whitepapers/Environmental_Scan_2.pdf (accessed July 18, 2008).
16. Karen Markey, "The Online Library Catalog: Paradise Lost and Paradise Regained?" D-Lib Magazine 13 (January/February 2007). http://www.dlib.org/dlib/january07/markey/01markey.html (accessed July 18, 2008).
17. Karen Schneider, "Toward the Next Gen Catalog," ALA Techsource. October 13, 2006. http://www.alatechsource.org/blog/2006/10/toward-the-next-gen-catalog.html (accessed July 18, 2008).
18. Daniel Greenstein, "Webcast: Panel Session—Publishing," Scholarship and Libraries in Transition: A Dialogue about the Impacts of Mass Digitization Projects. The University of Michigan. March 10, 2006. http://www.lib.umich.edu/mdp/symposium/publishing.html (accessed July 18, 2008).
19. OCLC, "Open WorldCat Program," OCLC.org. http://chnm.gmu.edu/digitalhistory/links/cached/preserving/8_18a_worldcat.htm (accessed July 18, 2008).
20. Michael P. Spinella, "JSTOR: Past, Present, and Future," Journal of Library Administration 46 (July 2007): 55–78.
21. "The LibraryLookup Project," Infoworld.com. http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookup.html (accessed July 18, 2008).
22. Google, "Google Books Library Project—An Enhanced Card Catalog of the World's Books." Google Book Search. http://books.google.com/googlebooks/library.html (accessed July 18, 2008).
23. Google, "About Google Book Search." Google Book Search. http://books.google.com/intl/en/googlebooks/about.html (accessed July 18, 2008).
24. Wikipedia, "Klondike Gold Rush." Wikipedia. http://en.wikipedia.org/wiki/Klondike_gold_rush (accessed July 18, 2008).
25. Stacy Schiff, "Know It All: Can Wikipedia Conquer Expertise?" New Yorker 82 (July 31, 2006), http://www.newyorker.com/archive/2006/07/31/060731fa_fact (accessed July 18, 2008).
26. Lee Rainie and Bill Tancer, "Data Memo," Pew Internet & American Life Project. April 2007. http://www.pewinternet.org/pdfs/PIP_Wikipedia07.pdf (accessed July 18, 2008).
27. "The Library of Congress' photostream." Flickr. http://flickr.com/photos/library_of_congress/ (accessed July 18, 2008).
28. Matt Raymond, "Flickr Followup," Library of Congress Blog. January 18, 2008. http://www.loc.gov/blog/?p=237 (accessed July 18, 2008).
29. The Library of Congress. Comment. Flickr. http://flickr.com/photos/library_of_congress/2515743501/#DiscussPhoto (accessed July 18, 2008).
30. "What is LibraryThing?" LibraryThing. http://www.librarything.com/tour/ (accessed July 18, 2008).
31. "Zeitgeist Overview," LibraryThing. http://www.librarything.com/zeitgeist (accessed July 18, 2008).
32. "What is LibraryThing for Libraries?" LibraryThing. http://www.librarything.com/forlibraries/ (accessed July 18, 2008).
33. "LibraryThing for Libraries, FAQS: General." LibraryThing. http://www.librarything.com/forlibraries/about (accessed July 18, 2008).
34. Deschutes Public Library. http://www.dpls.lib.or.us/ (accessed July 18, 2008).
35. Frances Haugen and Matthew Gray, "Book Info Where You Need it, When You Need it," The Google Blog. March 13, 2008. http://googleblog.blogspot.com/2008/03/book-info-where-you-need-it-when-you.html (accessed July 18, 2008).
36. "About Us," The Open Library. http://openlibrary.org/about (accessed July 18, 2008).
37. Lorcan Dempsey, "Discover, Locate, … Vertical and Horizontal Integration," Lorcan Dempsey's Weblog on Libraries, Services and Networks. November 20, 2005. http://orweblog.oclc.org/archives/000865.html (accessed July 18, 2008).
38. Library of Congress Working Group on the Future of Bibliographic Control, "On the Record: Report of The Library of Congress Working Group on the Future of Bibliographic Control," Loc.gov. January 9, 2008. http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf (accessed July 18, 2008).
39. OCLC, "OCLC to Conduct New Cataloging and Metadata Project," Oclc.org. December 13, 2007. http://www.oclc.org/news/releases/200688.htm (accessed July 18, 2008).
40. Roy Tennant, "The Future of Descriptive Enrichment," Tennant: Digital Libraries. December 10, 2007. http://www.libraryjournal.com/blog/1090000309/post/1920018592.html (accessed July 18, 2008).
41. "Cataloging Resources's Bookmarks," Delicious.com. http://del.icio.us/catresources (accessed July 18, 2008).
42. Wendi Bost and Jamie Conklin, "Creating a One-Stop Shop: Using the Catalog to Market Collections and Services," Florida Libraries 49 (Fall 2006): 5–7.
43. Lori Bowen Ayre, "What if the Library Worked Like Netflix?" Mentat. November 30, 2006. http://www.galecia.com/weblog/mt/archives/2006_11.php (accessed July 18, 2008).
44. Bookswim, Inc., Bookswim.com. 2007–2008. http://www.bookswim.com/ (accessed July 16, 2008).
45. Andrew Pace, "Cold Goose." Hectic Pace. April 7, 2008. http://community.oclc.org/hecticpace/archive/2008/04/cold-goose.html (accessed July 16, 2008).
CONTRIBUTOR NOTE
Carol Pitts Diedrichs is the Dean of Libraries at the University of Kentucky.
Library Trends, v.54, no.1 2005

Practical Preservation: The PREMIS Experience
Priscilla Caplan and Rebecca Guenther

Abstract
In 2003 the Online Computer Library Center (OCLC) and Research Libraries Group (RLG) established an international working group to develop a common, implementable core set of metadata elements for digital preservation. Most published specifications for preservation-related metadata are either implementation specific or broadly theoretical. PREMIS (Preservation Metadata: Implementation Strategies) was charged to define a set of semantic units that are implementation independent, practically oriented, and likely to be needed by most preservation repositories. The semantic units will be represented in a data dictionary and in a METS-compatible XML schema. In the course of this work, the group also developed a glossary of terms and concepts, a data model, and a typology of relationships. Existing preservation repositories were surveyed about their architectural models and metadata practices, and some attempt was made to identify best practices. This article outlines the history and methods of the PREMIS Working Group and describes its deliverables. It explains major assumptions and decisions made by the group and examines some of the more difficult issues encountered.

Introduction
In 2003 the Online Computer Library Center (OCLC) and Research Libraries Group (RLG) established an international working group to develop a common, implementable core set of metadata elements for digital preservation. Most published specifications for preservation-related metadata are either implementation specific or broadly theoretical. PREMIS (Preservation Metadata: Implementation Strategies) was charged to define a set of metadata elements that are implementation independent, practically oriented, applicable to all types of materials, and likely to be needed by most preservation repositories. In addition, it aimed at establishing best practices for the implementation of preservation metadata.
The stated PREMIS objectives were to
• define an implementable set of "core" preservation metadata elements, with broad applicability within the digital preservation community;
• draft a data dictionary to support the core preservation metadata element set;
• examine and evaluate alternative strategies for the encoding, storage, and management of preservation metadata within a digital preservation system, as well as for the exchange of preservation metadata among systems;
• conduct pilot programs for testing the group's recommendations and best practices in a variety of systems settings;
• explore opportunities for the cooperative creation and sharing of preservation metadata.

It was intended that PREMIS would build on the earlier work of another initiative sponsored by OCLC and RLG, the Preservation Metadata Framework Working Group (OCLC, 2003). That group was convened in 2001–2002 to develop a framework outlining the types of information that should be associated with an archived digital object. Their report, A Metadata Framework to Support the Preservation of Digital Objects (OCLC/RLG, 2002), expanded the conceptual structure for the Open Archival Information System (OAIS) information model (Consultative Committee, 2002) and mapped preservation metadata elements to that conceptual structure. Although the framework proposed a list of metadata elements, it did not contain sufficient detail for an implementer to actually use the metadata in a preservation system without considerable further specifications. The PREMIS working group was established to take the previous group's work a step further: to develop a data dictionary of core metadata elements to be applied to archived objects, give guidance on the implementation of that metadata element set in preservation systems, and suggest best practice for populating those elements.

OCLC and RLG established the working group in 2003, chaired by Priscilla Caplan of the Florida Center for Library Automation and Rebecca Guenther of the Library of Congress. Because the charge was practical rather than theoretical, members were sought from institutions known to be running or developing preservation repository systems within the cultural heritage or information industry sectors. Conveners paid particular attention to diversity of stakeholders. The group consists of representatives from academic and national libraries, museums, and archives; governments; and commercial enterprises in six different countries. In addition, PREMIS includes an international advisory committee of experts periodically called upon to review progress and provide feedback.

In order to accomplish as much of the charge as possible in a reasonable timeframe, PREMIS divided into two subgroups with different deliverables and strategies. The Core Elements Subgroup took responsibility for drafting the "core" preservation metadata elements and supporting data dictionary. The Implementation Strategies Subgroup was responsible for examining alternative strategies for the encoding, storage, and management of preservation metadata within digital preservation systems and for developing pilot programs to test the group's recommendations in a variety of system settings.

The work of both subgroups was conducted almost entirely by weekly conference calls, which was a challenge given that the group members were from time zones ranging from the western United States to eastern Australia. Fortunately, only one person had to get up in the middle of the night to attend!
However, the sheer frequency of calls and the ambitious agenda created a sense of camaraderie among participants. Members quickly learned each other's voices and mastered use of a wiki (a Web site that allows any user to add and edit content) set up for their use by the University of Chicago. The Core Elements Subgroup also held two face-to-face meetings to expedite their work. The two meetings, one in San Diego in January 2004 and the other in Cambridge, Massachusetts, in August 2004, were highly productive and contributed to the sense of community among members.

One of the group's practices has been well received and might well be found useful by other initiatives. Every month a summary of each subgroup's activities is posted on the official Web site at http://www.oclc.org/research/projects/pmwg/. For example, the Core Elements update for September 2004 reads:

The group spent time discussing the differences between files and bitstreams and how the semantic units applied to them. It was proposed that there was a need for a new level called "filestreams." This also related to previous discussions about embedded files. The group continued its discussion of environment elements and whether this information is dependent on file format information. It continued to define what information is needed about the environment in order to render objects for the long term. Two new participants joined the group, one from DSpace and another from the Walt Disney Company. A workplan was developed to finish the data dictionary by December in anticipation of a final PREMIS report by the end of 2004.

Because of these updates, anyone interested in the PREMIS activity could follow the group's progress, see what issues were under discussion, and simply be assured the work group was working.

Implementation Strategies
The Implementation Strategies Subgroup was charged with examination and evaluation of alternative strategies for the encoding, storage, and management of preservation metadata within a digital preservation system. To find out how preservation repositories were actually implementing preservation metadata, the subgroup decided to survey repositories that were in operation or under development. Although their work was focused on metadata, the subgroup felt that the survey provided an opportunity to explore the state of the art in digital preservation generally, and questions were drafted to elicit information about policies, governance and funding, system architecture, and preservation strategies as well as metadata practices.

In November 2003 copies were sent by email to approximately seventy organizations thought to be active in or interested in digital preservation. The survey was also made available on the PREMIS Web site and announced on various discussion lists. By the end of March 2004, forty-eight survey responses were received from institutions developing or planning to develop a digital preservation repository. Sixteen of these respondents were contacted for more in-depth telephone interviews.

Although several institutions known to be developing digital preservation repository systems did not respond, the replies received appear to be reasonably representative of the state of the art in the winter of 2003–2004. Responses came from 13 countries and included 28 libraries, 7 archives, and 14 other types of organizations.
Among the respondents were 10 national libraries and 6 national archives, showing heavy involvement in digital preservation at the national level, particularly in Europe and Canada. Key findings are summarized in the report Implementing Preservation Repositories for Digital Materials: Current Practices and Emerging Trends in the Cultural Heritage Community (OCLC/RLG PREMIS Working Group, 2004), so they will not be repeated here. However, a few points are worth noting.

First, there is very little experience with digital preservation. Twenty-two respondents claimed to have a preservation repository in some stage of production (as opposed to planning, development, or alpha/beta testing). However, only half of these appeared to have implemented an active preservation strategy such as normalization, format migration, migration on demand, or emulation. This list included four national libraries/national archives and six institutions categorized as "other." None was an academic library. This finding must color all other results, including those pertaining to metadata. Whatever practices were reported on the survey, apart from these eleven institutions the results reflect repositories not yet in production or not yet implementing active preservation strategies. We do not have enough experience to determine whether the metadata these systems record or plan to record is adequate for the purpose.
Ini- tially the group felt that all core elements should be considered mandatory by defi nition, but some fl exibility crept in with the acknowledgement that some elements are more core than others, and even necessary information cannot always be provided. The Core Elements Subgroup then started analyzing the recommen- dations of the earlier Preservation Metadata Framework Working Group related to Preser vation Description Information. This included “digital provenance,” or the documentation of events associated with the digital objects. Those members of the subgroup from institutions actively run- ning or developing preservation repositories mapped the elements from the framework to what was used in their own systems. It became clear that the elements detailed in the previous work (which themselves had been caplan & guenther/practical preservation 116 library trends/summer 2005 mapped to the OAIS information model) did not always correspond to elements implemented in practice and did not give adequate guidance on how to use them. However, the exercise was useful in providing a com- mon denominator for diverse implementations; the group discussed each element in conference calls to see where there was commonality in usage. Elements that emerged as being widely used across implementations were considered the beginning of a core element list. The group made the decision that the data dictionary it was developing would be wholly implementation independent. That is, the core elements would defi ne information that a repository needed to know, regardless of how, or even whether, it was stored. For instance, for a given identifi er to be usable, it is necessary to know the identifi er scheme and the namespace in which it is unique. If a particular repository uses only one type of identifi er scheme, say one that is internally defi ned and assigned, the scheme can be assumed, and the repository would have no need to record it at all. The repository would, however, need to know this information and be able to supply it when exchanging metadata with other repositories. Because of the emphasis on the need to know rather than the need to record or represent in any particular way, the group preferred to use the term “semantic unit” (meaning an atom of meaning) rather than “metadata element.” The data dictionary therefore names and describes semantic units. After drafting a preliminary data dictionary for digital provenance in- formation, the group began to consider technical metadata, or detailed information about the physical characteristics of digital objects. The group realized that it did not have either the time or the expertise to tackle format- specifi c technical metadata for various types of digital fi les. By scoping the work to include only that metadata applicable to all (or at least most) digital formats, the group was able to limit the work to a reasonable set of semantic units and leave further development to format experts. The group compiled a list of potential semantic units based on specifi cations for the proposed Global Digital Format Registry (GDFR, n.d.) supplemented by data elements used in the repository systems of members’ institutions. Each element on the list was then discussed at some length, and those found to be both useful and broadly applicable were added to the data dictionary. Data Model One of the hardest issues to tackle was the development of an accept- able abstract data model. 
A valid criticism of the earlier framework was that the document recommended metadata elements pertaining to many different types of things while giving no guidance as to what type of thing they applied to. For example, “Resource Description” included the subele- ment “Existing metadata,” an example of which was “a MARC bibliographic record.” Bibliographic records usually describe intellectual entities, such as books, sound recordings, and Web sites. Another element, “File de- 117 scription” (defi ned as “technical specifi cations of the fi le(s) comprising a Content Data Object”), would appear to apply to individual digital fi les. A third element, “Size of object,” might be taken to apply to the total size of a complex object (for example, a book made up of many page images) or to a single stored fi le. The lack of specifi cs as to what level of granularity of an object the elements applied to made the document diffi cult to actually use in metadata implementations. The data model was intended to accomplish three purposes. First, it would force PREMIS members to be rigorous in their thinking in the de- velopment of the data dictionary. Second, it provided a structure for ar- ranging entries in the data dictionary. Third, it would help implementers of the data dictionary understand how to apply semantic units. The data model was not, however, meant to imply any particular implementation of the semantic units in the data dictionary. In the PREMIS data model there are fi ve types of entities: intellectual entities, objects, agents, rights, and events. Although it is possible these defi nitions will change before the fi nal report, these entities are currently defi ned as follows: • An event is an action that involves at least one object, agent, and/or rights entity. • An agent is an actor associated with preservation events in the life of an object. • A right is an assertion of one or more rights or permissions pertaining to an object. • An intellectual entity is a coherent set of content that is reasonably described as a unit, for example, a particular book, map, photograph, or database. • An object is one or more sequences of bits stored in the preservation repository. There are four subtypes of the object entity: fi le, fi lestream, bitstream, and representation. The most diffi cult part of the development of the data model has been to appropriately identify, name, and defi ne these subtypes. Defi nitions in this article are slightly less elaborate than those in the actual data model, but they communicate the concepts effectively. Of the fi ve entity types, fi le is perhaps the most intuitive, as our defi ni- tion resembles that of common usage: a named ordered sequence of zero or more bytes known to an operating system and accessible by applications. Every fi le has a fi le format, defi ned as a specifi c pre-established structure of a computer fi le that specifi es how data is organized. A fi le may contain zero or more bitstreams and zero or more fi lestreams. A “bitstream” is defi ned as data within a fi le that cannot be transformed into a stand-alone fi le without the addition of fi le structure (headers, etc.) and/or reformatting in order to comply with some particular fi le format. A “fi lestream” is a contiguous set of bits within a fi le that can be transformed caplan & guenther/practical preservation 118 into a stand-alone fi le conforming to some fi le format without adding information or reformatting the bitstream. 
An example of a bitstream is an image embedded within a PDF; an example of a fi lestream is a TIFF image within a TAR fi le. A “representation” is the set of fi les needed to provide a complete and reasonable rendition of an intellectual entity. It can be thought of as the digital embodiment of an intellectual entity. Preservation repositories never store intellectual entities, but they may store representation objects. As an example, the fi nal PREMIS report is an intellectual entity. There will probably be PDF and HTML versions posted on the Web; many readers will download their own copies, but all copies will have the same authors, title, and content. If the report were archived in a preservation repository, at least one representation would be stored. This might, for example, be a single, specifi c PDF fi le. The PDF fi le will doubtless contain embedded graphics for tables and charts, which would be bitstreams. If the HTML version were archived, the representation might consist of three or four fi les—the HTML fi le and several GIF images. Perhaps the repository will want to bundle these fi les together for storage by creating a TAR fi le. That TAR fi le would then have within it three or four fi lestreams, which could be extracted into fi les at some later time. These distinctions are important because different semantic units of metadata apply at different levels. The intellectual entity may have an ISBN or technical report number, but the representation does not. The represen- tation may have an identifi er known to the preservation repository, but the intellectual entity does not. The fi le will have a fi le name and fi le format, the fi lestream will have a fi le format but no fi le name, and the bitstream will have no fi le name or fi le format, although it may have other format characteristics such as color space. The PREMIS data dictionary attempts to defi ne core semantic units pertaining to all subtypes of objects and events. Intellectual entities and agents are not addressed in any detail because they have been the focus of other metadata schemes and they do not present unique requirements in the digital preservation context beyond the minimum needed to establish relationships between these and other types of entities. At the time of this writing, the group was still exploring the extent to which rights and/or permissions should be described. Relationships are the other important part of the data model. Entities can be related to entities of different types (for example, objects can be related to agents) and to entities of the same type (for example, objects can be related to other objects). Just as there may be core semantic units generally necessary in the majority of preservation repository applications, there are core relationships that most preservation repositories will need to record. The relationships between objects, agents, and events constitute digital library trends/summer 2005 119 provenance. As Clifford Lynch wrote in “Authenticity and Integrity in the Digital Environment”: Provenance, broadly speaking, is documentation about the origin, characteristics, and history of an object; its chain of custody; and its relationship to other objects. The fi nal point is particularly important. There are two ways to think about a digital object that is created by changing the format of an older object . . . 
We might think about a single object the provenance of which includes a particular transformation, or we might think about multiple objects that are related through provenance documentation. Thus, provenance is not simply metadata about an object—it can also be metadata that describe the relationships between objects (2000).

Objects and Events

Most of the semantic units in the data dictionary pertain to objects and events. Semantic units related to the object entity describe characteristics relevant to preservation management. It is assumed that data content objects are held in the preservation repository and that associated metadata may be held in the repository, in external systems, or in both. Data dictionary entries for objects indicate the level at which the semantic unit is applicable: representation, file, and/or bitstream. Filestream is considered equivalent to file for the purposes of applicability.

Semantic units associated with object entities include identifiers, location information, and technical characteristics. In anticipation of the development of format registries such as the proposed GDFR, the data dictionary also contains semantics for referencing format registry entries. Similarly, it provides for basic software and hardware environment information and anticipates adding references to future environment registries. Figures 1 and 2 provide examples of entries in the data dictionary. Figure 1 shows the definition of a "container" unit (fixity), which has no data itself but serves to group together three related semantic components (messageDigestAlgorithm, messageDigest, and messageDigestOriginator). Figure 2 shows the definition of one of these semantic components, messageDigestAlgorithm.

Events are actions that involve one or more objects and may be related to one or more agents. The PREMIS report states that whether or not a preservation repository records an event depends upon the importance of the event in the context of that repository. It recommends using the semantic units related to the Events entity when recording actions that modify objects. Other actions, such as the copying of an object for backup purposes, may be recorded in system logs or an audit trail but not necessarily as an event.

Most of the documentation about the digital provenance of objects is given in relation to events. Semantic units include event identifier, event type (for example, compression, migration, validation, etc.), event outcome, and event date/time. When properties of an object are the result of an event, this is considered event-related information, but in practice this may be recorded with the object or with the event. An example of a data dictionary entry for a semantic unit related to the Event entity is given in figure 3.

Figure 1. Data Dictionary Entry for Fixity

Semantic unit: fixity
Semantic components: messageDigestAlgorithm, messageDigest, messageDigestOriginator
Definition: Information used to verify whether an object has been altered in an undocumented or unauthorized way.
Data constraint: Container
Applicability (Representation / File / Bitstream): Not applicable (see usage note) / Applicable / Applicable (see usage note)
Repeatability (File / Bitstream): Repeatable / Repeatable
Obligation (File / Bitstream): Optional / Optional
Creation/Maintenance notes: Automatically calculated and recorded by repository.
Usage notes: To perform a fixity check, a message digest calculated at some earlier time is compared with a message digest calculated at a later time. If the digests are the same, the object was not altered in the interim. Recommended practice is to use two or more message digests calculated by different algorithms. The act of performing a fixity check and the date it occurred would be recorded as an Event. The result of the check would be recorded as the eventOutcome. Therefore, only the messageDigestAlgorithm and messageDigest need to be recorded as objectCharacteristics for future comparison. Representation level: It could be argued that if a representation consists of a single file, or if all the files comprised by a representation are combined (e.g., zipped) into a single file, then a fixity check could be performed on the representation. However, in both cases the fixity check is actually being performed on a file, which in this case happens to be coincidental with a representation. Bitstream level: Message digests can be computed for bitstreams although they are not as common as with files. For example, the JPX format, which is a JPEG2000 format, supports the inclusion of MD5 or SHA-1 message digests in internal metadata that was calculated on any range of bytes of the file. See "Fixity, integrity, authenticity," page 4-5.

Figure 2. Data Dictionary Entry for messageDigestAlgorithm

Semantic unit: messageDigestAlgorithm
Semantic components: None
Definition: The specific algorithm used to construct the message digest for the digital object.
Data constraint: Value should be taken from a controlled vocabulary.
Applicability (Representation / File / Bitstream): Not applicable / Applicable / Applicable
Examples: MD5, Adler-32, HAVAL, SHA-1, SHA-256, SHA-384, SHA-512, TIGER, WHIRLPOOL
Repeatability (File / Bitstream): Not repeatable / Not repeatable
Obligation (File / Bitstream): Mandatory / Mandatory
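The fixity check described in the usage notes above amounts to recomputing a digest and comparing it with the stored messageDigest. A minimal sketch using Python's standard hashlib follows; the file path and the stored digest value are illustrative.

import hashlib

def message_digest(path: str, algorithm: str = "sha1") -> str:
    # Compute a message digest ("hash") for a stored file.
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# messageDigestAlgorithm and messageDigest recorded at ingest (values illustrative).
stored_algorithm = "sha1"
stored_digest = "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"

# A later fixity check recomputes the digest; the comparison result would be
# recorded as the eventOutcome of a fixity check Event.
current_digest = message_digest("objects/report.pdf", stored_algorithm)
event_outcome = "pass" if current_digest == stored_digest else "fail"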
Figure 3. Data Dictionary Entry for eventType

Semantic unit: eventType
Semantic components: None
Definition: A categorization of the nature of the event.
Rationale: Categorizing events will aid the preservation repository in machine processing of event information, particularly in reporting.
Data constraint: Value should be taken from a controlled vocabulary.
Examples: E77 [a code used within a repository for a particular event type]; Ingest
Repeatability: Not repeatable
Obligation: Mandatory
Usage notes: Each repository should define its own controlled vocabulary of eventType values. A suggested starter list for consideration (see also the Glossary for more detailed definitions):
capture = the process whereby a repository actively obtains an object
compression = the process of coding data to save storage space or transmission time
deaccession = the process of removing an object from the inventory of a repository
decompression = the process of reversing the effects of compression
decryption = the process of converting encrypted data to plaintext
deletion = the process of removing an object from repository storage
digital signature validation = the process of determining that a decrypted digital signature matches an expected value
dissemination = the process of retrieving an object from repository storage and making it available to users
fixity check = the process of verifying that an object has not been changed in a given period
ingestion = the process of adding objects to a preservation repository
message digest calculation = the process by which a message digest ("hash") is created
migration = a transformation of an object creating a version in a more contemporary format
normalization = a transformation of an object creating a version more conducive to preservation
replication = the process of creating a copy of an object that is, bit-wise, identical to the original
validation = the process of comparing an object with a standard and noting compliance or exceptions
virus check = the process of scanning a file for malicious programs
The level of specificity in recording the type of event (e.g., whether the eventType indicates a transformation, a migration, or a particular method of migration) is implementation specific and will depend upon how reporting and processing is done. Recommended practice is to record detailed information about the event itself in eventDetail rather than using a very granular value for eventType.

PREMIS Report and Further Work

The final PREMIS report will go into greater detail about the findings of the working group and will present a completed data dictionary with examples. In addition, it will include a glossary, a description of the data model, discussions of some of the more difficult or controversial semantic units, and other related information. As of this writing, the working group was still conducting work by conference calls and the data dictionary was not yet completed. The target date for completion is December 2004.1

Although the data dictionary is intended to be implementation neutral, for information to be exchanged between repositories there must be some standard representation. The implementation survey showed wide use of METS among implementers. The METS initiative intends to draft PREMIS-based XML schemas suitable for use as extension schemas for the digital provenance metadata section (digiprovMD) and technical metadata section (techMD) of a METS document. The digiprovMD will be based on the events section of the data dictionary. The new techMD section will complement the other format-specific technical metadata sections and will include general technical metadata that applies regardless of file format. It will be necessary to reconcile existing format-specific extension schema with this new general one, since some data elements that apply regardless of file format will already be included in defined format-specific technical metadata extension schema (for example, MIX, the XML binding of the NISO/AIIM standard Z39.87, Technical Metadata for Digital Still Images) (National Information Standards Organization & AIIM International, 2002).

Opportunities for developing testbeds for implementing PREMIS-compliant metadata are currently under discussion, as are trials of the exchange of preservation metadata among repositories. It is unlikely that these will actually be implemented before the group is formally disbanded, so other mechanisms for continuing this work are being considered.
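As a rough illustration of the packaging just described, the sketch below builds a METS digiprovMD section wrapping an event; because the PREMIS extension schemas were still being drafted, the premis namespace URI here is a placeholder of ours, and MDTYPE is given as OTHER.

import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
PREMIS = "info:example/premis-draft"  # placeholder; the real schema was not yet published
ET.register_namespace("mets", METS)
ET.register_namespace("premis", PREMIS)

amd_sec = ET.Element(f"{{{METS}}}amdSec")
digiprov = ET.SubElement(amd_sec, f"{{{METS}}}digiprovMD", ID="DPMD1")
md_wrap = ET.SubElement(digiprov, f"{{{METS}}}mdWrap", MDTYPE="OTHER", OTHERMDTYPE="PREMIS")
xml_data = ET.SubElement(md_wrap, f"{{{METS}}}xmlData")

# Event semantic units drawn from the Events part of the data dictionary.
event = ET.SubElement(xml_data, f"{{{PREMIS}}}event")
ET.SubElement(event, f"{{{PREMIS}}}eventType").text = "migration"
ET.SubElement(event, f"{{{PREMIS}}}eventDateTime").text = "2004-12-01T09:30:00"
ET.SubElement(event, f"{{{PREMIS}}}eventOutcome").text = "success"

print(ET.tostring(amd_sec, encoding="unicode"))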
Mechanisms for supporting the adoption of PREMIS metadata, gathering feedback and evidence of practice, and maintaining the data dictionary over time will also be necessary. The PREMIS Web site should be consulted for the status of these and other related activities.

Note

1. Since this article was written, the PREMIS working group released Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group in May 2005. It is available from the PREMIS Web site at http://www.oclc.org/research/projects/pmwg/. The Web site for PREMIS maintenance activity is http://www.loc.gov/standards/premis/.

References

Consultative Committee for Space Data Systems. (2002). Reference model for an Open Archival Information System (OAIS) (CCSDS 650.0-B-1). Retrieved March 8, 2005, from http://ssdoo.gsfc.nasa.gov/nost/wwwclassic/documents/pdf/CCSDS-650.0-B-1.pdf.
Global Digital Format Registry (GDFR). (n.d.). Home page. Retrieved March 8, 2005, from http://hul.harvard.edu/gdfr/.
Library of Congress. (2005). Standards: Metadata encoding and transmission standard. Retrieved March 8, 2005, from http://www.loc.gov/standards/mets/.
Lynch, C. (2000). Authenticity and integrity in the digital environment: An exploratory analysis of the central role of trust. In Authenticity in a digital environment (Council on Library and Information Resources report). Retrieved March 8, 2005, from http://www.clir.org/pubs/reports/pub92/lynch.html.
National Information Standards Organization & AIIM International. (2002). Data dictionary: Technical metadata for digital still images (NISO Z39.87). Retrieved March 8, 2005, from http://www.niso.org/standards/resources/Z39_87_trial_use.pdf.
OCLC. (2002). OCLC digital archive system guides: Digital archive metadata elements. Retrieved March 8, 2005, from http://www.oclc.org/support/documentation/pdf/da_metadata_elements.pdf.
———. (2003). Preservation Metadata Framework Working Group. Retrieved March 8, 2005, from http://www.oclc.org/research/projects/pmwg/wg1.htm.
OCLC/RLG. (2002). Preservation metadata and the OAIS information model: A metadata framework to support the preservation of digital objects. Retrieved March 8, 2005, from http://www.oclc.org/research/projects/pmwg/pm_framework.pdf.
OCLC/RLG PREMIS Working Group. (2004). Implementing preservation repositories for digital materials: Current practices and emerging trends in the cultural heritage community. Retrieved March 8, 2005, from http://www.oclc.org/research/projects/pmwg/surveyreport.pdf.

Priscilla Caplan, Assistant Director for Digital Library Services, Florida Center for Library Automation, 5830 NW 39th Avenue, Gainesville FL 32606, pcaplan@ufl.edu, and Rebecca Guenther, Senior Networking and Standards Specialist, Library of Congress, 101 Independence Ave. SE, Washington, DC 20540-4402, rgue@loc.gov.

Priscilla Caplan is Assistant Director for Digital Library Services at the Florida Center for Library Automation, where she is managing a project to develop a digital preservation repository for the use of the public universities of Florida. She is the author of Metadata Fundamentals for All Librarians (ALA Editions, 2003) and numerous articles on digital preservation, metadata, reference linking, and standards for digital libraries. In addition to co-chairing the OCLC Working Group on Preservation Metadata: Implementation Strategies, she co-chairs the NISO/EDItEUR Joint Working Party on the Exchange of Serials Subscription Information.
Rebecca Guenther is Senior Networking and Standards Specialist in the Network Development and MARC Standards Office of the Library of Congress, in which she has worked since 1989. Previous positions included cataloger at the National Library of Medicine; cataloger in the Library of Congress' Shared Cataloging Division/German Language Section; and section head of the National Union Catalog Control Section/Catalog Management and Publication Division. Her current responsibilities include work on national and international information standards, including, among others, rotating chair of the ISO 639 Joint Advisory Committee on language codes and a member of the NISO Standards Development Committee. Rebecca has worked in the area of metadata since the early 1990s, including maintaining a number of crosswalks between various metadata schemes; participating in development of XML bibliographic descriptive schemas (MODS and MARCXML); serving as chair of the Dublin Core Libraries Working Group and as a member of the Dublin Core Usage Board; serving as a co-chair of PREMIS, an OCLC/RLG working group on preservation metadata implementation strategies; and participating in the Open Ebook Forum's Metadata and Identifiers Working Group, among others. She has published articles and made presentations widely on metadata and various standards-related efforts.

work_6krqymxnqffjfnmixnqhqm5ne4 ---- Dewey linked data: making connections with old friends and new acquaintances

Joan S. Mitchell, Michael Panzer

We address the history, use cases, and future plans associated with the availability of the Dewey Decimal Classification (DDC) system as linked data. Parts of the DDC have been available as linked data since 2009. Our initial offering included the DDC Summaries (the top three levels of the DDC) in eleven languages exposed as linked data in dewey.info, an experimental web service. In 2010, we extended the content of dewey.info1 by adding assignable numbers and captions from the Abridged Edition 14 data files in English, Italian, and Vietnamese. In mid-2012, we extended the content of dewey.info yet again by adding assignable numbers and captions from the schedules and geographic table in the latest full edition database, DDC 23. We will discuss the behind-the-scenes development and data transformation efforts that have supported these offerings, and then turn our attention to some uses of Dewey linked data plus future plans for Dewey linked data services.

1. http://dewey.info.

JLIS.it. Vol. 4, n. 1 (Gennaio/January 2013). DOI: 10.4403/jlis.it-5467

History

The history of Dewey linked data is an evolving story of opportunity and experimentation, with an eye toward usability and use of the data. In 2009, the DDC 22 Summaries, an authorized derivative work based on the top three levels of DDC 22, had already been translated into ten languages (more languages than the full edition of the DDC on which the data were based). We decided to experiment with making the DDC Summaries available as linked data in an experimental web service, dewey.info. Our initial design goals included:

• provide an actionable URI for every class;
• encode the classification semantics in RDF/SKOS;
• provide representations for machines and for humans;
• make the data usable under a widely understood license used in the Semantic Web community.
Publishing Dewey as linked data required development decisions on several different fronts. First of all, we had to develop a URI pattern that would support the identification of several different kinds of entities and relationships. The URIs had to act as dereferenceable identifiers that could deliver representations of the referenced resources in a RESTful manner. Each class had to be identified with a URI and the data had to be presented in a reusable way. In developing the URI pattern, we had to provide for the full complexity of the DDC at any time: identification of the scheme, parts of the scheme, edition, language, and time slice. Figure 1 shows the status of DDC 22 at the time of initial development of URIs for the DDC.

Figure 1: Versions of the DDC based on DDC 22.

DDC 22 was initially published in 2003; the various DDC 22 translations were published in 2005 (German), 2007 (French), 2009 (Italian), and 2011 (Swedish-English mixed version). Abridged Edition 14 (a logical abridgment of DDC 22) was published in 2004; translations followed in 2005 (French), 2006 (Italian and Vietnamese), and 2008 (Hebrew and Spanish). The DDC Summaries based on DDC 22 were published in English and ten other languages at the time of the introduction of dewey.info. Besides the DDC Summaries, figure 1 includes three other authorized derivative works based on DDC 22: 200 Religion Class (2004), an updated subset of DDC 22; Guide de la classification décimale de Dewey, a French-language customized abridgment of DDC 22; and DDC Sachgruppen, a German translation of selected DDC 22 top-level classes (including some below the three-digit level) developed for the primary use case of organizing the national bibliographies of Germany, Austria, and Switzerland (the four languages in the box on the right-hand side of figure 1 are translations of DDC Sachgruppen; all five language versions are used in the national bibliography of Switzerland).

Dewey.info includes representations for machines and humans; the latter is particularly important in order to illustrate the DDC data offerings to a wider community beyond traditional users of value vocabularies from the library community. The data in dewey.info are presented in human (XHTML+RDFa) and machine (RDF) versions (the machine version of dewey.info has three different RDF serializations: RDF/XML, Turtle, and JSON). The Dewey URIs have the following general pattern:

http://dewey.info/{object-collection}/{object}/{snapshot-collection}/{snapshot}/about

Specific documents have a variable resource name component and allow specification of content language and type (format):

http://dewey.info/{object-collection}/{object}/{snapshot-collection}/{snapshot}/{resource-name}.{language}.{content-type}

An object is a member of the DDC domain and part of an object collection. The object collection specifies the type of the object. The object collection is a mandatory component and can have one of the values "scheme," "table," "class," "manual," "index," "summary,"
and "id." A specific object from that collection follows if required. For example:

http://dewey.info/class/576.83/
http://dewey.info/scheme/
http://dewey.info/table/2/

A snapshot is used to refer to versions of objects at specific points in time. Snapshots can be part of a snapshot collection, e.g., "e22," referring to every concept version that is part of Edition 22 of the DDC. In the following examples, the first URI is an example of a snapshot, the second is an example of a snapshot collection, and the third is an example of a snapshot-collection/snapshot combination:

http://dewey.info/class/641/2009/
http://dewey.info/class/641/e22/
http://dewey.info/class/641/e23/2012-08/

Language and format are also accommodated in the URI:

http://dewey.info/class/641/about.it
http://dewey.info/class/641/about.rdf
http://dewey.info/class/641/about.it.html

While SKOS is often the RDF vocabulary of choice for representing controlled vocabularies on the Web, its initial development was largely informed by thesaurus-like knowledge structures. Panzer ("DDC, SKOS, and linked data on the web") and Panzer and Zeng ("Modeling Classification Systems in SKOS: Some Challenges and Best-practice Recommendations") have noted some of the challenges in representing classification data in SKOS. Since the initial DDC linked data offering did not include complicated note types and relationships between classes other than those expressed by the notational hierarchy, the shortcomings in SKOS noted elsewhere with respect to the representation of classification data did not pose a major roadblock in the exposure of the DDC 22 Summaries in dewey.info.

The query http://dewey.info/class/641/about.it.rdf delivers the following machine-actionable representation in RDF/SKOS, which focuses on presenting concept metadata together with number and caption information plus basic semantic relationships. Note that the two main entities retrieved are http://dewey.info/class/641/ and http://dewey.info/class/641/2007/02/about.it, connected through a dct:hasVersion relationship:

Listing 1: Example of concept metadata representation in RDF/SKOS.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dct="http://purl.org/dc/terms/">
  <skos:Concept rdf:about="http://dewey.info/class/641/">
    <skos:notation rdf:datatype="http://dewey.info/schema-terms/Notation">641</skos:notation>
    <dct:hasVersion rdf:resource="http://dewey.info/class/641/2007/02/about.it"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://dewey.info/class/641/2007/02/about.it">
    <skos:notation rdf:datatype="http://dewey.info/schema-terms/Notation">641</skos:notation>
    <skos:prefLabel xml:lang="it">Cibi e bevande</skos:prefLabel>
    <dct:publisher>OCLC Online Computer Library Center, Inc.</dct:publisher>
    <dct:language>it</dct:language>
    <dct:created>2000-01-01T00:00:00.0+01:00</dct:created>
    <dct:modified>2006-01-28T22:04:16.000+0100</dct:modified>
  </skos:Concept>
</rdf:RDF>

Finally we needed an appropriate license model. We make data on dewey.info available under a Creative Commons BY-NC-ND license.2 Licensing information is embedded in RDF and RDFa following the Creative Commons Rights Expression Language (ccREL) specification.3 In the RDF/SKOS extract above, the following licensing information is embedded in the RDF:

Listing 2: CC license embedded in RDF/SKOS

<rdf:Description rdf:about="http://dewey.info/class/641/2007/02/about.it"
                 xmlns:cc="http://creativecommons.org/ns#">
  <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc-nd/3.0/"/>
  <cc:attributionName>OCLC Online Computer Library Center, Inc.</cc:attributionName>
</rdf:Description>

2. http://creativecommons.org/licenses/by-nc-nd/3.0.
3. http://wiki.creativecommons.org/CcREL.
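Given the URI patterns above, a client can either name the language and format directly in the URI or rely on HTTP content negotiation to select a representation. A minimal sketch with the Python requests library follows; it assumes, as described above, that the service returns RDF/XML when that media type is requested.

import requests

# Ask for the machine representation of class 641 via content negotiation.
rdf_response = requests.get(
    "http://dewey.info/class/641/about",
    headers={"Accept": "application/rdf+xml"},
)
print(rdf_response.status_code, rdf_response.headers.get("Content-Type"))

# Or name the language and content type directly in the URI.
html_response = requests.get("http://dewey.info/class/641/about.it.html")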
A year after the initial offering, we extended the data available in dewey.info with the addition of assignable numbers and captions from Abridged Edition 14 in three languages (English, Italian, and Vietnamese). This extension added about 3500 additional records for each language to the data already available in dewey.info. While the DDC Summaries represented a broader set of languages than available in the full and abridged translations, the new abridged-edition offerings were a subset of the languages in which the edition had been translated. Why were English, Italian, and Vietnamese chosen? The simple answer was that each was available in the same proprietary format, ESS XML, for which we already had an RDF/SKOS transformation.

Parallel to the linked data work, the Dewey editorial team was making a major data transformation of another type—moving from the proprietary "ESS" format to one based on the MARC 21 Classification and Authority formats. In 2009, the DDC Summaries were transformed from ESS XML to RDF/SKOS; we used the same transformation to make the Abridged Edition 14 data available in dewey.info. In 2010, OCLC moved to a new underlying representation for the DDC, adopting one based on the MARC 21 formats for classification data (to represent class records) and authority data (to represent Relative Index and mapped terminologies associated with class records). At the same time, OCLC adopted MARCXML as the distribution and ingest format for DDC data across versions, and moved to a new data distribution and ingest model (previously, data transfers were handled at the individual file level over an ftp site). We made a decision to delay the distribution of additional DDC data in dewey.info until we could productionize the data transformation and distribution process operating on the new format and within the distribution environment. This meant taking the data encoded in MARCXML from the distribution server, applying the RDF/SKOS transformation stylesheet, and associating the result with a "subscription," automatically creating an Atom feed of data sets that a user agent (in this case, dewey.info) could pick up from the distribution server over a RESTful interface. A model of the process is shown in figure 2.

Figure 2: Dewey distribution environment.

We installed the pieces on the distribution server that would make this possible in May 2012. In mid-June 2012, we added assignable numbers and captions from the DDC 23 schedules to dewey.info; this addition of over 38,000 numbers increased the available Dewey linked data nearly tenfold. In August 2012, we further extended Dewey linked data by adding the assignable notation and captions from Table 2 (the Dewey geographic table).
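The transformation step shown in figure 2, applying the RDF/SKOS stylesheet to MARCXML pulled from the distribution server, could be scripted along the following lines. This is a sketch of ours only; the file names, including the stylesheet name, are illustrative, since the actual stylesheet is internal to OCLC.

from lxml import etree

# A MARCXML data set from the distribution server and the RDF/SKOS
# transformation stylesheet (both file names are illustrative).
marcxml = etree.parse("ddc23_schedules.marcxml")
transform = etree.XSLT(etree.parse("marcxml-to-skos.xsl"))

rdf_skos = transform(marcxml)

with open("ddc23_schedules.rdf", "wb") as out:
    out.write(etree.tostring(rdf_skos, pretty_print=True,
                             xml_declaration=True, encoding="UTF-8"))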
Next steps

Our next planned offering is the linking of "new acquaintance" GeoNames to Table 2 data. Because we want to manage all editorially curated data (including mappings) with the OCLC ESS system, this will require short-term and long-term changes to geographic data within the system. In order to allow the provision of geographic data on the class level, the Dewey editorial team developed MARC Proposal No. 2011-10,4 which was approved by MARBI in June 2011. The proposal defines new fields that allow for the storage and display of geographic codes in MARC classification records, thereby enabling the reuse of parts of the Relative Index links to GeoNames (generated by the matching algorithm) on the class level in applications downstream, e.g., in linked data representations of the DDC.

4. http://www.loc.gov/marc/marbi/2011/2011-10.html.

Use cases

In addition to linking plans, we report on use cases that facilitate machine-assisted categorization and support discovery in the Semantic Web environment. It is important to have use cases for Dewey linked data, and to solicit new use cases that might inform decisions about our data offering. Institutions such as Bibliothèque nationale de France, the British Library, and Deutsche Nationalbibliothek have made use of Dewey linked data in bibliographic records and authority files. FAO has linked AGROVOC to our data at a general level. We are also exploring links between the DDC and other value vocabularies such as VIAF, FAST, ISO 639-3 language codes, and MSC (Mathematics Subject Classification). Today, we would like to focus on three use cases: a caption service, the "old friend" of DDC synthesized number components associated with categorized content, and the "new acquaintance" of DDC-GeoNames links.

Caption service

Querying Dewey linked data

The first use case is a simple one: querying Dewey linked data by a Dewey number to have the associated caption delivered as an explanation of the number. For example, the query http://dewey.info/class/945.5/about will return information about class 945.5, including the captions "Regione della Toscana" and "Tuscany (Toscana) region." There are also two ways in which this data is made accessible to machines and can therefore be used in an automated way as part of a library catalog or other discovery tool. The HTML page for class 945.5 contains structured data in RDFa markup, which means that user agents will be able to distill caption information as regular RDF triples. Another very powerful and flexible way is directly accessing the triple store using the SPARQL endpoint.

Listing 3: Query that returns all distinct captions associated with class number 945.5

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?caption
WHERE {
  { GRAPH ?g {
      ?concept skos:notation "945.5"^^<http://dewey.info/schema-terms/Notation> ;
               skos:prefLabel ?caption
    }
  }
}

Note that the endpoint supports HTTP bindings of the SPARQL protocol, meaning that the endpoint serves as a general web service interface (in case the linked data presentation is not preferred).
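A query such as Listing 3 can therefore be issued programmatically over the endpoint's HTTP interface. The sketch below uses the Python SPARQLWrapper library; the endpoint URL is illustrative, since the service address is not given here, and the notation datatype follows the listing above.

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://dewey.info/sparql")  # endpoint URL is illustrative
endpoint.setQuery("""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?caption
WHERE {
  { GRAPH ?g {
      ?concept skos:notation "945.5"^^<http://dewey.info/schema-terms/Notation> ;
               skos:prefLabel ?caption
    }
  }
}
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["caption"]["value"])  # e.g., "Regione della Toscana"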
The underlying Dewey data includes the MARC 21 765 Synthesized Number Components field: 765 0# $b641.59 $z2$s 455 $u641.59455 By establishing a link between 641.59455 and T2—455 (represented as ””$z 2$s 455” in the 765 field and as ”2–455” in the URI string), it is possible to isolate the geographic facet and use it to foster alterna- tive approaches to discovery. The potential enhancements to such discovery is discussed in the next section. DDC-GeoNames links Linking Dewey data with GeoNames offers the opportunity to ex- tend the boundaries of categorization and discovery. Since GeoN- ames has emerged as not only the dominant source for geographic coordinates in the linked data space, but also as a leading provider of identifiers (URIs) for geographic entities, a GeoNames term can act as a general equivalent or a boundary object for data from dif- JLIS.it. Vol. 4, n. 1 (Gennaio/January 2013). Art. #5467 p. 190 JLIS.it. Vol. 4, n. 1 (Gennaio/January 2013) ferent domains that have never been directly mapped to each other. The linking of two concepts in different schemes or from different domains to the same GeoNames entity helps to establish a common ”aboutness” of these two terms. Figure 3 illustrates how a common link to a GeoNames term from a geographic class in dewey.info and from a New York Times subject heading for the same geographic area establishes a strong (albeit implicit and untyped) relationship between these two terms because both entities are ”about” the same city. Also, by extension it can be assumed that all articles and other resources indexed with the NYT heading should be discoverable by the DDC class, therefore adding to the amount of categorized content that can be retrieved by using this DDC number in a discovery interaction. Links to datasets like Figure 3: Links to GeoNarmes. GeoNames extend the boundaries of DDC classes on a conceptual level as well. Whereas a traditional mapping between KOS usually connects entities of the same type (e.g., concepts), linking in the sense of the Semantic Web can connect different kinds of named/i- JLIS.it. Vol. 4, n. 1 (Gennaio/January 2013). Art. #5467 p. 191 J.S. Mitchell, Dewey linked data dentified entities. While a mapping between concepts often operates with variations of semantic relationships traditionally employed by thesauri (e.g., broader/narrower, related, whole/part), linking of different types of entities requires a new set of relationships tailored to the domain model of the linked dataset or value vocabulary. In the case of GeoNames, in order to store the links in MARC, we have to use a traditional mapping relationship. However, in a linked data version, the SKOS mapping relationships (corresponding to traditional thesaurus relationships) cannot be used to link Dewey classes and GeoNames terms, because GeoNames URIs identify a gn:Feature, which is defined as ”a geographical object” and, being a subclass of http://schema.org/Place, as an entity with a ”physical extension.” In other words, GeoNames (like many other ontolo- gies) does not contain descriptions of or identifiers for concepts of places; it contains descriptions of and identifiers for the places them- selves. In such cases, a relationship like foaf:focus should be used, which ”relates a conceptualisation of something to the thing itself.” A GeoNames URI identifies a locality, not a concept of a locality. 
This operation effectively connects a Dewey concept with a different set of relationships, which can be used to present information seekers with compelling tools to identify and select geographic features for resource discovery. In essence, it opens up a new perspective or viewpoint on the arrangement of classes in Dewey.

Figure 4 shows in parallel two different kinds of neighborhoods applicable to T2—6626 Niger. The established Dewey "neighborhood" shows the class in the context of the DDC notational hierarchy. Linking this class to its corresponding GeoNames feature, however, allows for reusing GeoNames' gn:neighbour relationship and applying it directly to this Dewey class. The right-hand side shows the concept T2—6626 surrounded by features that neighbor the country in its foaf:focus in the physical world.

Figure 4: Two views of T2—6626 Niger.

Taking this one step further, linking all geographic Dewey concepts to GeoNames allows for an on-the-fly switching of the viewpoint as needed, effectively allowing for transforming the concepts temporarily into features, and, by using inherited properties like geographic coordinates, placing them on a map (figure 5).

Figure 5: Blending of Dewey viewpoint and geographic viewpoints.

Furthermore, DDC classes can utilize more than just relationships inherited from geographic features. The links allow also for a more expressive typing of related DDC entities and open the door to geospatial reasoning over the underlying DDC data. For example, usually it is not clear whether a Dewey number represents a country (or another type of entity). But in the above example, the "inherited" types allow for basic viewpoint-transgressing queries such as: "Display all Dewey numbers that represent countries that are adjacent to T2—6626." A sketch of such a query follows figure 6.

Figure 6 shows another example of transgressing viewpoints. Table 2 is mainly arranged by continents, which means that countries that span different continents are separated notationally, i.e., they don't occupy a contiguous span of Dewey numbers. This may even be true for cities in these countries, e.g., Istanbul in Turkey occupies subdivisions of both T2—4 and T2—5. While Dewey provides all necessary relationships in order to relate the European and Asian parts of Turkey, they are divided notationally, making it not a simple task for a discovery system to offer the user a compelling way of selecting subentities for retrieval. Using the inherited gn:neighbour relationship, however, makes it easy to display classes about the European part of Turkey (e.g., T2—49618, shown with its Relative Index terms in yellow) and the Asian part (e.g., T2—5632, shown with its Relative Index terms in green) together in a geobrowser like Google Earth using the geographic viewpoint.

Figure 6: Overlaying Dewey classes and Relative Index terms on a map using properties of linked entries.
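The adjacency query quoted above could be phrased roughly as follows, assuming, as this section does, that the triple store holds both the foaf:focus links and the GeoNames feature descriptions. The properties gn:neighbour and the country feature code gn:A.PCLI come from the GeoNames ontology, while the endpoint URL and notation datatype are illustrative, as before.

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://dewey.info/sparql")  # illustrative
endpoint.setQuery("""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX gn:   <http://www.geonames.org/ontology#>

# Dewey numbers whose focus is a country neighbouring the focus of T2--6626.
SELECT DISTINCT ?notation
WHERE {
  ?niger skos:notation "2--6626"^^<http://dewey.info/schema-terms/Notation> ;
         foaf:focus ?nigerFeature .
  ?nigerFeature gn:neighbour ?country .
  ?country gn:featureCode gn:A.PCLI .
  ?class foaf:focus ?country ;
         skos:notation ?notation .
}
""")
endpoint.setReturnFormat(JSON)
neighbours = endpoint.query().convert()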
Conclusion

The contents of dewey.info and links to Dewey data have evolved over time as we have taken advantage of various opportunities for experimentation. With each addition, we have considered possible use cases for the additional data. The following statement appears in the last paragraph of the final report of the W3C Linked Library Data Incubator Group (2011):

Linked data follows an open-world assumption: the assumption that data cannot generally be assumed to be complete and that, in principle, more data may become available for any given entity.

The schema-less RDF data model allows for a substantial degree of freedom (compared to the relational database paradigm) in leveraging existing data by enrichment and addition of new connections almost ad hoc. Our efforts to publish the DDC as a linked data value vocabulary have taken place in a rich and evolving Dewey ecosystem. Figure 7 shows the current state of translations and versions published, planned, or under way based on DDC 23 data; where known, expected publication dates are shown in parentheses. Figure 8 shows the current mappings and crosswalks between the DDC and other knowledge organization systems. We expect to continue extending linked DDC data within the rich environment described in figures 7 and 8 to meet use cases in categorization and discovery.

Figure 7: Editions and versions based on DDC 23.

Figure 8: Mappings and crosswalks to the DDC.

References

Panzer, Michael. "DDC, SKOS, and linked data on the web". Proc. of Everything Need Not Be Miscellaneous: Controlled Vocabularies and Classification in a Web World, Montréal, Canada, August 5, 2008. http://www.oclc.org/news/events/presentations/2008/ISKO/20080805-deweyskos-panzer.ppt.
Panzer, Michael and Marcia Lei Zeng. "Modeling Classification Systems in SKOS: Some Challenges and Best-practice Recommendations". Semantic interoperability of linked data: Proceedings of the International Conference on Dublin Core and Metadata Applications, Seoul, October 12-16, 2009. Ed. S. Oh, S. Sugimoto, and S. A. Sutton. Seoul: Dublin Core Metadata Initiative and National Library of Korea, 2009. 3-14. http://dcpapers.dublincore.org/ojs/pubs/article/view/9748.

JOAN S. MITCHELL, OCLC. mitchelj@oclc.org http://staff.oclc.org/~dewey/joan.htm
MICHAEL PANZER, OCLC. panzerm@oclc.org http://staff.oclc.org/~dewey/michael.htm

Mitchell, J.S., M. Panzer. "Dewey linked data: making connections with old friends and new acquaintances". JLIS.it. Vol. 4, n. 1 (Gennaio/January 2013): Art. #5467. DOI: 10.4403/jlis.it-5467. Web.

ABSTRACT: This paper explores the history, use cases, and future plans associated with availability of the Dewey Decimal Classification (DDC) system as linked data. Parts of the DDC system have been available as linked data since 2009. Initial efforts included the DDC Summaries in eleven languages exposed as linked data in dewey.info. In 2010, the content of dewey.info was further extended by the addition of assignable numbers and captions from the Abridged Edition 14 data files in English, Italian, and Vietnamese. During 2012, we will add assignable numbers and captions from the latest full edition database, DDC 23. In addition to the "old friends" of different Dewey language versions, institutions such as the British Library and Deutsche
Nationalbibliothek have made use of Dewey linked data in bibliographic records and authority files, and AGROVOC has linked to our data at a general level. We expect to extend our linked data network shortly to "new acquaintances" such as GeoNames, ISO 639-3 language codes, and Mathematics Subject Classification. In particular, the paper examines the linking process to GeoNames as an example of cross-domain vocabulary alignment. In addition to linking plans, the paper reports on use cases that facilitate machine-assisted categorization and support discovery in the semantic web environment.

KEYWORDS: DDC; Dewey linked data; Dewey Decimal Classification

Submitted: 2012-04-25 Accepted: 2012-08-31 Published: 2013-01-15

work_6mlyt6hwf5hwfm6f3havqo6k4q ---- Cataloging and Classification Quarterly, 2010, Vol. 48, No. 2-3, p. 221-236. ISSN: 0163-9374 (print) 1544-4554 (online). doi: 10.1080/01639370903535726. © 2010 Taylor & Francis Group

Program for Cooperative Cataloging: BIBCO Records: Analysis of Quality

MAGDA EL-SHERBINI
The Ohio State University

ABSTRACT

The Program for Cooperative Cataloging (PCC) is an international program that brings together libraries that wish to participate in the creation and sharing of bibliographic records. These high quality records can be used by any library around the world without additional modification or change. Members of the cooperative adhere to a set of standards and practices that help eliminate extensive editing of records by participant libraries, thus allowing libraries to reduce the cost of cataloging. Even though the records submitted to the Online Computer Library Center (OCLC) database by PCC member institutions adhere to the established standards, some libraries continue to verify the quality of the access points in these records. Many libraries outsource this process to outside vendors who automatically check these records against the Library of Congress (LC) Authority File. The purpose of this study is to examine the quality of the PCC records in light of the changes that were made by an authority control vendor. The author will analyze the changes made by the vendor to the PCC records and explain the reasons for those changes.

INTRODUCTION

The Library of Congress (LC) Program for Cooperative Cataloging (PCC) was established to improve the timely availability of bibliographic and authority records by cataloging more items, producing cataloging that is widely available for sharing and use by others, and performing cataloging in a more cost-effective manner. 1 PCC provides training to members of the cooperative who adhere to a set of standards and practices that help eliminate extensive editing of records by participant libraries, thus allowing libraries to reduce the cost of cataloging.
Even though the records submitted to the OCLC database by PCC member institutions adhere to the established standards, some libraries continue to verify the quality of the access points in these records. Many other libraries outsource this process to outside vendors who automatically check these records against the Library of Congress Authority File. The purpose of this study is to examine the quality of the PCC records in light of the changes that were made by an authority control vendor. The author will analyze the changes made by the vendor to the PCC records and explain the reasons for those changes.

BACKGROUND

PCC is made up of four components: NACO (Name Authority Cooperative Program), SACO (Subject Authority Cooperative Program), BIBCO (Monographic Bibliographic Record Cooperative Program), and CONSER (Cooperative Online Serials Program). Through these four programs, member institutions contribute bibliographic records that follow mutually accepted cataloging standards. Before joining each program, potential member institutions participate in PCC training in order to assure consistency and accuracy of bibliographic records that will be produced by them.

The Monographic Bibliographic Record Cooperative Program (BIBCO) is an important component of PCC. BIBCO members have the responsibility for contributing full or core level bibliographic records to the program. As part of this process, members have to provide "complete authority work (both descriptive and subject), a national level call number (such as LC classification or NLM classification), and at least one subject access point drawn from nationally recognized thesauri such as LCSH, MeSH, etc., as appropriate." 2

Records submitted to the OCLC database by BIBCO contributors follow established rules and standards for authority work. These records are downloaded into local systems and many libraries submit these records to an authority control vendor to check them against the Library of Congress Authority File. Libraries use a vendor service instead of having their own copy catalogers check each individual record. Libraries prepare a specific profile that is used by the vendor to check these records against the LC Authority File. The profile provides instruction to the vendor on what changes need to be made.

Currently, The Ohio State University Library (OSUL) is using Backstage Library Works (BSLW) as its authority control vendor. 3 BSLW provides name and subject authority control services based on the LC name and subject authority databases. The catalogers at OSUL do not check access points when they download records. However, they check the authority file when they are creating new records in the OCLC database. The OSUL policy is to depend on a commercial vendor to perform automated post-cataloging authority control without human intervention. This simplifies and accelerates the process of copy cataloging. At the end of every month, the Cataloging Department produces a file of all records created by cataloging staff (original cataloging) and by other library units (copy cataloging). This file of records is then sent to BSLW for automated authority control processing according to a specific profile (e.g., check all access points, punctuation, tags, indicators, and spelling). After BSLW checks the bibliographic records against the LC Authority File, they correct the records automatically and provide OSUL with reports on unmatched headings, unrecognized subfields, and possible invalid tags.
LITERATURE REVIEW

PCC initiatives have been well documented in the library literature. 4 The program Web site is an official source of information on PCC and its components (NACO, BIBCO, SACO, and CONSER). It includes documentation, as well as statistical and contact information. 5 General information about the program can be found in "Becoming an Authority on Authority Control: An Annotated Bibliography of Resources." 6 This bibliography includes monographs, articles and papers, electronic discussion groups, Web sites, training offered through NACO and SACO, and a summary of future trends in authority control.

Riemer and Morgenroth discussed the increasing importance and the value of cooperative cataloging for librarians and library administrators. Their research focused on East Asian collections and PCC. 7 Bowen also addressed the benefits and the cost effectiveness of the PCC core records by providing an explanation of the long-term value of the program. 8 The PCC program now includes participation by non-U.S. institutions, 9 either through an "individual membership" to PCC or as a group through a "funnel."

Shrinking resources and budget reductions are among the major problems facing libraries today. Cataloging is among the most affected areas. Authority control in particular is often considered a labor intensive and expensive operation. In order to continue providing quality bibliographic records and to reduce the cost of processing, the PCC core record concept was introduced. 10 The core record standard provides essential bibliographic elements based on acceptable standards that can be adapted without "modification" of the record at the local level. 11 The core record concept was later expanded and adapted to include non-monographic materials. The Core Standard for Rare Books was adopted in 1999, but was met with some resistance from the rare book cataloging community. An investigation of this response was researched and analyzed in "Evidence of Application of the DCRB Core Standard in WorldCat and RLIN." 12

Schuitema provided a lengthy introduction of the core bibliographic record and what it is, where the standard originates, and how the core level is different from the full level. 13 She also addressed the issues that are associated with the implementation of the core record and examined some of the reasons libraries are implementing the core standards. Czeck, Icenhower, and Kellsey identified significant differences between records cataloged using OCLC core standards and PCC full standards, particularly in the occurrences of specific name and subject access points. 14 This difference might have long term implications for user access and libraries should be alerted when they incorporate the core record in their copy cataloging procedures.

NACO, which was founded in 1979, has grown and expanded through the years and now includes international membership. In his article on the subject, Byrum pointed out that "The NACO model has changed over time to create more cost-effective and user-friendly policies and procedures to meet participants' needs. Increased recognition, especially by library administrators, of the value of authority control also encouraged NACO to flourish" (Abstract). 15 In his article, he explained membership requirements, benefits to the participants, as well as the role of the Library of Congress in providing training and documentation and participation in the program through a "Funnel."
Libraries that cannot join the NACO program directly have been creating NACO funnels to enable them to contribute records indirectly through another institution. Some of the reasons a library may not join the cooperative are a lack of cataloging expertise and resources or inability to meet the NACO minimum submission requirements. Larmore provided a step-by-step explanation of how to set up a NACO funnel among four academic and one state library in South Dakota. 16 As a result of changing the objectives of contributing records to the NACO program, the University of Florida Libraries increased productivity in this area. 17

Training catalogers on the NACO, BIBCO, SACO, and CONSER standards is essential to ensure the success of the program. Historical background on the PCC training and identifying the future needs in this area was discussed in "The Program for Cooperative Cataloging and Training for Catalogers." 18

Calhoun and Boissonnas discussed the advantage of libraries joining PCC and cataloging according to BIBCO standards. 19 In their discussion, they pointed out that PCC accomplishments included the establishment of shared standards for books, music, sound recordings, and audiovisual materials; simplifying and streamlining documentation; and implementing training programs. They pointed out that libraries should take advantage of the program and specifically emphasized the use of the core record concept that contains an accurate and standardized description and authorized access points. They also addressed the benefits of applying the core record in terms of cost effectiveness and enhanced user access.

The benefits of participating in PCC are numerous and recognized in the library community. A practical approach to reducing the cost of the creation of authoritative bibliographic records is to create a record based on acceptable standards once and share it several times. In her editorial column, Carter said: "Cooperative cataloging is a subject near and dear to my heart and one in which I fervently believe. This includes being a contributor to the collective databases of cataloging and not just a taker. During my years in technical service at the University of Pittsburgh I had the privilege of participating in CONSER policy development and supported the library's entry into NACO and Enhance." 20

Quality of the cataloging record and the criteria that are used to determine it were discussed extensively by Bade in his article "The Perfect Bibliographic Record: Platonic Ideal, Rhetorical Strategy or Nonsense?" 21 Bade dismissed the concept of the "perfect record" and recommended a more pragmatic approach to the problem that would concentrate on matching the individual needs of a particular library with the corresponding set of data elements in the bibliographic record that would satisfy that institution's needs.

METHODOLOGY

To examine the quality of the PCC records in light of the changes that were made by BSLW and to explain the reasons for those changes, the following steps were taken. The author examined the file of records that were produced by OSUL catalogers in April 2009. This file was sent to BSLW for authority processing. The file consisted of 7,787 records and included records that were either created in the OCLC database by OSUL staff or were downloaded from the OCLC database. Before sending this file to BSLW for post-authority control processing, the author used a Boolean search to separate the PCC records from non-PCC records. The "042" MARC 21 field was used to identify the PCC records; the result was a sample of 542 PCC records, about 7 percent of the records downloaded in April.
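The separation itself is simple to script. The following sketch with the Python pymarc library shows the 042-based test; it is illustrative only, since the author performed the separation with a Boolean search, and the file names here are invented.

from pymarc import MARCReader

# Split the monthly file into PCC and non-PCC records, keyed on the
# presence of an 042 (authentication code) field. File names are invented.
with open("april2009.mrc", "rb") as infile, \
     open("pcc_records.mrc", "wb") as pcc_out, \
     open("non_pcc_records.mrc", "wb") as other_out:
    for record in MARCReader(infile):
        if record.get_fields("042"):
            pcc_out.write(record.as_marc())
        else:
            other_out.write(record.as_marc())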
The "042" MARC 21 field was used to identify the PCC records. The result was a sample of 542 PCC records, about 7 percent of records downloaded in April. A printout was made of the PCC records, which were assigned a unique ID. The sample was then sent separately to BSLW for authority processing. The author did not distinguish between the PCC records created by OSUL staff and those created by other PCC participant libraries. After the completion of authority processing, BSLW created several statistical reports that provided detail about the changes made to the PCC records. These reports were based on OSUL local requirements as outlined in the vendor profile. Before loading these records into the catalog, a copy of each record was made and given a unique ID. The next step was to examine how many PCC records the vendor had corrected and which fields were changed. Criteria used to compare the records before and after authority processing were based on access points and fields of importance to OSUL. Series information was excluded and will need to be addressed in a separate study. These criteria are as follows: Numbers and Codes Library of Congress Control Number (010 field) International Standard Book Number (020 field) Library of Congress Call Number (050 field) Local Call Numbers (090 field) Main Entries Main Entry—Personal Name (100 field) Main Entry—Corporate Name (110 field) Main Entry—Meeting Name (111 field) Main Entry—Uniform Title (130 field) Title and Title-Related Fields Title Statement (for obvious misspellings) (245 field $a) Varying Form of Title (for obvious misspellings) (246 field) Subject Access Fields Subject Added Entry—Personal Name (600 field) Subject Added Entry—Corporate Name (610 field) Subject Added Entry—Meeting Name (611 field) Subject Added Entry—Uniform Title (630 field) Subject Added Entry—Topical Term (650 field) Subject Added Entry—Geographic Name (651 field) Index Term—Genre/Form (second indicator 0) (655 field) Added Entries Added Entry—Personal Name (700 field) Added Entry—Corporate Name (710 field) Added Entry—Meeting Name (711 field) Added Entry—Uniform Title (730 field) Criteria were then created to group the changes according to their importance to retrieval of records from the OSUL online catalog. Records were separated into two categories:  Minor changes that do not affect the retrieval of records—punctuation, diacritics, and spaces.  Major changes that impact the retrieval of records—incorrect indicators, incorrect or lack of subfield delimiters, incorrect tags, incorrect headings, and incorrect form of heading. For the purpose of this study, statistical analysis takes into consideration the number of occurrences of errors, and not the number of records affected by the errors. Hence, there could be more errors in a certain area than there are records in the sample. TABLE 1 Changes in Numbers and Codes Fields Numbers and Codes Fields Bibs with This Field PCC Bibs Changed Percent Changed LC Control Number (010) 273 272 99.6 ISBN (020) 1089 0 0 LC Call Number (050) 546 6 1.1 Locally Assigned LC-Type 123 0 0 Call Number (090) Total Numbers and Codes 2332 280 12 Fields The printouts of records that were made before authority processing were compared to records returned from the vendor. All the changes that were made by the vendor were recorded on the printout. To avoid searching the Authority File (AF) for all the access points to determine if they had already been established, the author used the report that was generated by BSLW. 
The BSLW report showed the headings that did not match, or had not yet been established in, the AF. These headings were then searched manually in the AF to determine why they failed to match. It was assumed that the original cataloger had checked the AF to verify or establish headings before creating the records in OCLC, as required by the PCC BIBCO standards.

RESULTS

Changes in Numbers and Codes Fields (010, 020, 050, and 090)

Table 1 presents the most frequent changes, which occurred in the LC Control Number (field 010) and the LC Call Number (field 050). Nearly all of the 010 fields were changed by the vendor to add a space between the subfield delimiter ($a) and the LC Control Number, in accordance with the OCLC Bibliographic Formats and Standards. 22 In the 050 field, six errors were corrected, all involving spaces. It should be noted that there were 129 records with an 090 field (locally assigned LC-type call number). OSUL staff assigned local call numbers in the 090 field to adjust the Cutter number in certain classes (e.g., M, N, and P); the call numbers in the existing 050 fields were left untouched. There were no changes in the International Standard Book Number (ISBN) (020 field). The changes in the Numbers and Codes fields were minor, because they did not affect the retrieval of records from the OSUL online catalog.

Changes in Main Entries (100, 110, 111, and 130 Fields)

Table 2 shows the changes in the Main Entry fields that occurred during authority control processing. The following is an analysis of the changes in each Main Entry field.

TABLE 2. Changes in Main Entry Fields (100, 110, 111, and 130 Fields)

Main Entry Field | Records with This Field | Total Match | Headings Changed/Non-Match | Percent Changed/Non-Match
Personal Name Main Entry (100) | 359 | 167 | 192 | 53
Corporate Name Main Entry (110) | 14 | 8 | 6 | 75
Meeting Name Main Entry (111) | 7 | 5 | 2 | 29
Uniform Title (130) | 1 | 0 | 1 | 100
Total Main Entry Fields | 381 | 180 | 201 | 54

CHANGES IN THE PERSONAL NAME MAIN ENTRY (100 FIELD)

About 53 percent (192 errors) of the Personal Name Main Entry (100 field) headings were changed during authority processing. These changes fall into two categories. Minor changes involved punctuation (153 errors), diacritics (18 errors), and spaces (11 errors). Major changes involved indicators (2 errors), tags (1 error), and subfields and subfield delimiters (3 errors). The largest number of changes was in punctuation (nearly 82 percent), where adding or deleting a period at the end of the heading was the usual issue. All of these changes were performed automatically, by comparing the headings in the bibliographic records to the headings in the AF. Although there was only a small number of changes in tags, indicators, and subfields and subfield delimiters, these changes were important for the proper indexing and retrieval of the records. The vendor reported four headings that did not match headings in the AF. Upon examination, it was determined that two of these Personal Name Main Entries (100 field) were in fact already in the AF: the authority records had been created and added to the AF after the bibliographic records were input into OCLC, so when the OSUL records were sent to the vendor for post-cataloging authority processing, these headings were not yet in the AF. The other two names had not been established in the AF.
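The automated corrections described above amount to matching each heading against the AF after the minor differences have been normalized away. The following is a minimal sketch of such a normalization, not BSLW's actual algorithm:

    # Collapse "minor" differences -- punctuation, diacritics, spacing --
    # so that only substantive mismatches against the Authority File remain.
    import string
    import unicodedata

    def normalize_heading(text: str) -> str:
        text = unicodedata.normalize("NFKD", text)                        # split accents
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        text = text.translate(str.maketrans("", "", string.punctuation))  # drop punctuation
        return " ".join(text.lower().split())                             # squeeze spaces

Under this normalization, "García Márquez, Gabriel." and "Garcia Marquez, Gabriel" reduce to the same string, so only substantive differences in a heading would be flagged for manual review.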
CHANGES IN THE CORPORATE NAME MAIN ENTRY (110 FIELD)

There were only fourteen fields in the sample set of records containing a 110 field. The vendor updated six of them, correcting spaces (1 error) and punctuation (5 errors). Again, these changes were considered minor and do not affect the retrieval of the records.

TABLE 3. Changes in Title Information Fields (245 and 246 Fields)

Title Information Field | Records with This Field | Total Match | Headings Changed/Non-Match | Percent Changed/Non-Match
Title (245) | 542 | 476 | 66 | 12.2
Other Title Information (246) | 125 | 125 | 0 | 0
Total Title Information Fields | 667 | 601 | 66 | 10

CHANGES IN THE MEETING NAME MAIN ENTRY (111 FIELD)

There were seven fields in the sample records for the Meeting Name Main Entry (111 field), and two errors were changed to correct punctuation.

CHANGES IN THE UNIFORM TITLES (130 FIELD)

There was one Uniform Title main entry (130 field) in the sample, which was reported by the vendor as a non-match. An examination of this heading revealed that it had not been established in the AF.

Changes in Title and Title-Related Fields (245 and 246 Fields)

An examination of the Title fields (245 field) determined that 66 titles (12 percent) were changed (see Table 3). Of those, 48 errors (the large majority) were changed to correct spaces, six were corrected for misspellings, eight were corrected for non-filing indicators, and four were changed in the subfields and subfield delimiters. These changes in the title field were all made by the vendor through an automated process, without human intervention. Searching these titles against the OCLC master records revealed that the records in OCLC remained incorrect. Although there were not many misspellings and non-filing indicators, these were important to correct because their accuracy has an impact on users' ability to search and retrieve records. There were no changes in the 246 field; all the title information in this field was correct.

Changes in the Subject Access Fields (600, 610, 611, 630, 650, 651, and 655 Fields)

Table 4 shows the changes in the Subject Access fields that were made during authority control processing. The following is an analysis of the changes in each Subject Access field.

TABLE 4. Changes in Subject Access Fields (600, 610, 611, 630, 650, 651, and 655 Fields)

Subject Access Field | Records with This Field | Total Match | Headings Changed/Non-Match | Percent Changed/Non-Match
Personal Name Subject Heading (600) | 117 | 77 | 40 | 34
Corporate Name Subject Headings (610) | 28 | 15 | 13 | 46
Meeting Name Subject Headings (611) | 0 | 0 | 0 | 0
Subject Heading Uniform Title (630) | 15 | 9 | 6 | 40
Topical Subject Headings (650) | 1422 | 629 | 793 | 56
Geographic Subject Headings (651) | 128 | 65 | 63 | 49
Genre Headings (655) | 42 | 17 | 25 | 60
Total Subject Access Fields | 1752 | 802 | 940 | 54

CHANGES IN THE PERSONAL NAME SUBJECT HEADING (600 FIELD)

There were 117 Personal Name Subject Headings in the sample records, and forty (34 percent) were updated as a result of authority processing. Most of these changes corrected punctuation (24 errors), subfields and subfield delimiters (1 error), and indicators (1 error), or added diacritics (12 errors). Although correcting punctuation and adding diacritics are considered minor changes, correcting indicators and subfields and subfield delimiters is important for the proper indexing and retrieval of records. Two fields were also reported by the vendor as non-matches; both were searched in the AF and found to be already established.
They were reported as non-matches because they contained a form subdivision; for the purposes of this study, however, they are not considered true errors.

CHANGES IN THE CORPORATE NAME SUBJECT HEADINGS (610 FIELD)

There were 28 fields containing Corporate Name Subject Headings (610 field) in the sample records; thirteen fields (46 percent) were changed to correct punctuation (11 errors) and diacritics (2 errors).

CHANGES IN THE SUBJECT HEADING UNIFORM TITLE (630 FIELD)

In comparing this field before and after authority processing, six fields out of fifteen (40 percent) were reported by the vendor as non-matching. Upon examination, it was determined that all six headings were in the same bibliographic record (OCLC #318988782). They were actually two different headings with multiple form subdivisions; again, these are not considered errors.

CHANGES IN THE TOPICAL SUBJECT HEADINGS (650 FIELD)

There were numerous changes in the 650 Topical Subject Heading field: 793 errors (56 percent) were corrected, involving spaces (255 errors), punctuation (447 errors), subfield delimiters changed from "x" to "v" and vice versa (59 errors), tags (14 errors), and indicators (11 errors). The changes to subfields and subfield delimiters, tags, and indicators were important, since they affect the meaning of the term and have an impact on users' ability to find the record. Seven fields were reported by the vendor as non-match headings in the AF. Of these, five were major errors as defined by the parameters of this study, including two MARC tagging errors and three errors in subject heading assignment.

CHANGES IN THE GEOGRAPHIC SUBJECT HEADINGS (651 FIELD)

There were 128 Geographic Subject Headings. Of this number, 63 fields were changed to correct punctuation (31 errors), spaces (27 errors), and subfields and subfield delimiters (7 errors).

CHANGES IN THE GENRE HEADINGS (655 FIELD)

There were 25 errors that were changed. In most cases, the indicators were changed from "7" to "0" (14 fields). Eleven headings (two of them repeated three times) were reported as non-matches by the vendor. In checking these headings in the AF, it was discovered that they had already been established, but two of them contained typos that had not been corrected by the cataloger, either locally or in the OCLC master record. 23

Changes in Added Entries Fields (700, 710, 711, and 730)

Table 5 provides information on the changes in Added Entries. The following is an analysis of the changes in each Added Entry field.

CHANGES IN THE PERSONAL NAME ADDED ENTRY (700 FIELD)

Of the 400 Personal Name Added Entry fields in the sample records, 182 (46 percent) were changed by the vendor. Most changes in this field involved punctuation (78 errors), diacritics (29 errors), spaces (63 errors), indicators (5 errors), and subfield delimiters (4 errors). The vendor reported three headings that did not match the AF. After searching these headings in the AF, it was determined that one already existed in the AF, one was not found, and the third was ambiguous.
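Several of the subject "non-match" reports above stem from form subdivisions coded $x (general subdivision) where $v (form subdivision) is expected, or vice versa. A sketch of that correction follows, assuming pymarc 5's Subfield pairs; the term set is a small illustrative sample, not LC's authoritative list:

    # Retag general subdivisions ($x) as form subdivisions ($v) when the
    # value is a known form term; the term set below is a tiny sample.
    from pymarc import Subfield

    FORM_TERMS = {"Periodicals", "Congresses", "Bibliography", "Dictionaries"}

    def fix_form_subdivisions(field) -> None:
        for i, sf in enumerate(field.subfields):
            if sf.code == "x" and sf.value.rstrip(". ") in FORM_TERMS:
                field.subfields[i] = Subfield(code="v", value=sf.value)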
TABLE 5. Changes in Added Entries Fields (700, 710, 711, and 730 Fields)

Added Entries Field | Records with This Field | Total Match | Headings Changed/Non-Match | Percent Changed/Non-Match
Personal Name Added Entry (700) | 400 | 218 | 182 | 46
Corporate Name Added Entry (710) | 128 | 32 | 96 | 75
Meeting Name Added Entry (711) | 1 | 1 | 0 | 0
Uniform Title Added Entry (730) | 4 | 0 | 4 | 100
Total Added Entries Fields | 533 | 251 | 282 | 53

CHANGES IN THE CORPORATE NAME ADDED ENTRY (710 FIELD)

There were 128 Corporate Name Added Entries (710 field) in the sample records, and about three quarters of these fields were changed. The changes were corrections to punctuation (65 errors), diacritics (22 errors), indicators (1 error), and subfields and subfield delimiters (2 errors). Six headings in this field were reported by the vendor as not matching the AF. Investigation determined that one heading was already in the AF and one had not yet been established; the remaining four were problem headings. 24

CHANGES IN THE MEETING NAME ADDED ENTRY (711 FIELD)

There was only one Meeting Name Added Entry, and it matched the AF.

CHANGES IN THE ADDED ENTRIES UNIFORM TITLES (730 FIELD)

There were a total of four Uniform Title added entries in the sample records, and the vendor reported all of them as non-matching headings. An examination revealed that none of them had been established in the AF. These headings should have been established as part of the PCC standard requirements.

DISCUSSION AND CONCLUSION

A detailed examination of the changes made by Backstage Library Works (BSLW) to the PCC records submitted by OSUL at the end of April 2009 revealed certain patterns of errors and omissions. In measuring and assessing the quality of the PCC records in terms of authority work and accuracy of information, it was determined that the PCC records contain errors ranging from simple to serious. Some of these, such as adding or deleting spaces (355 errors), diacritics (133 errors), or punctuation marks (1,098 errors), are merely cosmetic; they have little or no impact on the user's ability to search and find the record in the online catalog. Other changes, such as correcting tags (14 errors), indicators (29 errors), subfields and subfield delimiters (76 errors), and spelling (6 errors), do affect the ability to search and retrieve these records.

There were 381 fields among the sample records that contained Main Entries. Table 6 shows the distribution of the 201 errors (about 53 percent) corrected during authority processing. Most errors corrected by the vendor occurred in punctuation (80 percent), followed by diacritics (9 percent) and spaces (6 percent). A typical example involved adding or deleting a period at the end of the field, or deleting spaces; this type of error does not affect access to the record in the OSUL online catalog. There were relatively few errors in indicators, subfields, subfield codes, and tags. Although the number of errors in these areas was not large, they needed to be corrected because they affect the indexing and retrieval of records. Unfortunately, these records will remain incorrect in the OCLC database, but they will be corrected at those libraries that have post-cataloging processing done by a vendor service.
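The distributions reported in Tables 6 through 8 below are occurrence tallies of this kind. A minimal sketch of the computation (the change labels are hypothetical):

    # Tally error occurrences by type and express each as a percentage of
    # all changes in an area (Main Entries, Subject Access, Added Entries).
    from collections import Counter

    def distribution(changes: list[str]) -> dict[str, tuple[int, float]]:
        counts = Counter(changes)
        total = sum(counts.values())
        return {kind: (n, round(100 * n / total, 1))
                for kind, n in counts.most_common()}

    # e.g. distribution(["punctuation"] * 160 + ["space"] * 12 + ["diacritics"] * 18)
    # yields counts and percentages of the kind shown in Table 6.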
Five headings in the Main Entries areas were reported by the vendor as "not found in the Authority File" at the time of authority control processing. Two of these headings were found during this study, which indicates a lack of synchronization between the time the bibliographic record is created and the time the authority record is added to the AF. The problem may also result from the BIBCO cataloger not being a NACO member, and thus being unable to contribute authority records to NACO.

TABLE 6. Distribution of Changes in Main Entries

Type of Change | Number of Changes | Percent of Changes
Punctuation | 160 | 80
Space | 12 | 6
Diacritics | 18 | 9
Subfield and subfield delimiters | 3 | 1.5
Indicators | 2 | 1
Tag | 1 | 0.5
Non-matched headings | 5 | 2.5
Total changes | 201 | 100

TABLE 7. Distribution of Changes in the Subject Areas

Type of Change | Number of Changes | Percent of Changes
Punctuation | 515 | 55
Space | 280 | 30
Diacritics | 13 | 1.5
Subfield and subfield delimiters | 67 | 7
Indicators | 26 | 3
Tag | 13 | 1.4
Non-matched headings | 26 | 3
Total changes | 940 | 100

There were a total of 1,752 subject fields among the sample records, and about 54 percent (940 errors) were changed. Table 7 shows that the largest percentage of changes occurred in punctuation (55 percent), followed by spaces (30 percent) and diacritics (1.5 percent). The major problems in the subject area occurred in subfields and subfield delimiters (7 percent), followed by indicators (3 percent) and tags (about 1 percent). Although the number of major changes was relatively small, correcting these errors is important for the proper indexing of, and access to, the records in the OSUL online catalog. There were 26 headings (3 percent) reported as non-matched by the vendor, especially in the Personal Name Subject Headings, Subject Heading Uniform Titles, and Genre Headings. Problems that caused headings to be reported as non-matches included mis-tagging, free-floating subdivisions, form subdivisions ($v), mis-constructed subject headings, and failure to follow the cross-references. In some cases the heading had simply not been established in the Authority File at the time of authority control processing. These problems require skilled catalogers to investigate and correct them manually; under the OSUL profile with BSLW, the authority control vendor corrected only obvious errors that could be detected by the software.

In the Added Entries fields, 282 errors were changed by the vendor. Table 8 shows the distribution of these changes. Most errors occurred in punctuation (51 percent), followed by spaces (22 percent) and diacritics (18 percent). Indicators and subfields and subfield delimiters represent only two percent each, and the percentage of non-matched headings was very small: only four percent were not in the AF at the time of authority control processing.

Although the number of fields corrected by BSLW in the sample records was substantial, this study reveals that the majority of these corrections, those involving punctuation, diacritics, and spaces, would not affect the indexing or retrieval of the records in the OSUL online catalog. These errors were corrected by the vendor for the client institution, but they will not be corrected in the master records in OCLC. Errors of this type would not be significant enough to correct in-house if a vendor service were not used.
TABLE 8. Distribution of Changes in Added Entries

Type of Change | Number of Changes | Percent of Changes
Punctuation | 143 | 51
Space | 63 | 22
Diacritics | 51 | 18
Subfield and subfield delimiters | 6 | 2
Indicators | 6 | 2
Tag | 0 | 0
Non-matched headings | 13 | 4
Total changes | 282 | 100

There is a smaller subset of errors corrected by BSLW that is more important, because these errors affect access to and retrieval of records: errors in indicators, subfields and subfield delimiters, tags, spelling, and the form of subject headings. Although a total of 244 such errors were reported, this does not mean that an equal number of records was affected; there were instances of multiple errors corrected in a single record, and the total number of records affected is substantially smaller than the statistical tables indicate.

In conclusion, the quality of the PCC-produced bibliographic records is high, as defined by the parameters of this study. The vast majority of the errors noted in the statistical tables were not substantial, and the relatively small number of major errors occurred in subfields and subfield delimiters, indicators, and tags. The vendor service used by OSUL is good at identifying and correcting records that contain major errors affecting the accessibility of records in the online catalog; in the process, the vendor also identifies and corrects errors that have little or no bearing on retrieval. Most of the errors in the sample records occurred at the time of original production of the catalog record. As PCC continues to develop and grow its cooperative cataloging program, it could consider offering continuing education or training for the original catalogers involved in the program.

NOTES

1. PCC Web site, http://www.loc.gov/catdir/pcc/ (accessed November 23, 2009).
2. BIBCO Web site, http://www.loc.gov/catdir/pcc/bibco/bibcopara.html (accessed November 23, 2009).
3. Backstage Library Works Web site, http://www.bslw.com/authority_control.html (accessed November 23, 2009).
4. Carol Mandel, "Cooperative Cataloging: Models, Issues, Prospects," Advances in Librarianship 16 (1992): 33-82.
5. PCC Web site, http://www.loc.gov/catdir/pcc/ (accessed November 23, 2009).
6. Robert E. Wolverton, "Becoming an Authority on Authority Control: An Annotated Bibliography of Resources," Library Resources & Technical Services 50, no. 1 (2006): 31-41.
7. J. J. Riemer and K. Morgenroth, "Hang Together or Hang Separately: The Cooperative Authority Work Component of NACO," Cataloging & Classification Quarterly 17, no. 3-4 (1993): 127-161.
8. Jennifer B. Bowen, "Creating a Culture of Cooperation," Cataloging & Classification Quarterly 26, no. 3 (1998): 73-85.
9. Anthony R. D. Franks and Ana Cristan, "International Cooperation in the Program for Cooperative Cataloging: Present and Prospects," Cataloging & Classification Quarterly 30, no. 4 (2000): 37.
10. PCC Core Records Web site, http://www.loc.gov/catdir/pcc/bibco/coreintro.html (accessed November 23, 2009).
11. S. E. Thomas, "The Core Bibliographic Record and the Program for Cooperative Cataloging," Cataloging & Classification Quarterly 21, no. 3-4 (1996): 91-108.
12. Winslow Lundy, "Use and Perception of the DCRB Core Standard," Library Resources & Technical Services 47, no. 1 (2003): 16-27.
13. Joan E. Schuitema, "Demystifying Core Records in Today's Changing Catalogs," Cataloging & Classification Quarterly 26, no. 3 (1998): 57-71.
14. Rita L. H. Czeck,
Elizabeth Icenhower, and Charlene Kellsey, "PCC Core Records Versus PCC Full Records: Differences in Access?" Cataloging & Classification Quarterly 29, no. 3 (2000): 81-92.
15. John D. Byrum Jr., "NACO: A Cooperative Model for Building and Maintaining a Shared Name Authority Database," Cataloging & Classification Quarterly 38, no. 3-4 (2004): 237-249.
16. Dustin P. Larmore, "A New Kid on the Block: The Start of a NACO Funnel Project and What Is Needed to Start Your Own," Cataloging & Classification Quarterly 42, no. 2 (2006): 75-81.
17. Betsy Simpson and Priscilla Williams, "Growing a NACO Program: Ingredients for Success," Cataloging & Classification Quarterly 40, no. 1 (2005): 123-132.
18. Carol G. Hixson and William A. Garrison, "The Program for Cooperative Cataloging and Training for Catalogers," Cataloging & Classification Quarterly 34, no. 3 (2002): 355.
19. Karen S. Calhoun and Christian M. Boissonnas, "BIBCO: A Winning Proposition for Library Users and Staff," Library Acquisitions 22, no. 3 (1998): 251-255.
20. Ruth C. Carter, "Cooperative Cataloging," Cataloging & Classification Quarterly 30, no. 4 (2000): 1.
21. David Bade, "The Perfect Bibliographic Record: Platonic Ideal, Rhetorical Strategy or Nonsense?" Cataloging & Classification Quarterly 46, no. 1 (2008): 109-133.
22. OCLC Bibliographic Formats and Standards, LC Control Number, http://www.oclc.org/bibformats/en/0xx/010.shtm.
23. See "Christion fiction" in OCLC record number 10385060 and "Science ficton" in OCLC record number 6774710.
24. One heading, "Standard Publishing Company," has two possibilities: Standard Publishing Co. (Quincy, Mass.) and Standard Publishing Company (Cincinnati, Ohio). Based on the OCLC record cited, the Cincinnati, Ohio, entry is the correct one (OCLC record number 320363906). Another problem heading was "American Academy of Pediatrics. $b Section on Home Care": "American Academy of Pediatrics" is in the AF, but the $b "Section on Home Care" is not; there is, however, a $b "Section on Home Health Care" in the AF, so it is possible that the word "Health" was accidentally omitted from the heading. The third problem heading is "Universidad de los Andes (Bogota, Colombia). $b Facultad de Ciencias Sociales. $b Departamento de Lenguajes y Estudios Socioculturales"; the beginning of the heading was in the AF, but the final $b, "Departamento de Lenguajes y Estudios Socioculturales," was not (OCLC record number 220868279). The fourth heading, "Sbanxi Sheng (China). $b Guo tu zi yuan ting," had not been established in the AF, although "Shaanxi Sheng (China)" was found in the AF while browsing; this heading needs further investigation by an expert cataloger to determine the correct form of the name (OCLC record number 60635282).

Library Collections, Acquisitions, & Technical Services 24, no. 4 (2000): 443-458. ISSN 1464-9055. DOI: 10.1016/S1464-9055(00)00171-8

Making the connection between processing and access: do cataloging decisions affect user access?

Ruey L.
Rodman

Abstract

One function of a call number is to organize the library collection to promote browsability, either on the shelf or in an online catalog. This study, based on research done at the Ohio State University Libraries, examines the impact on the organization of a library collection if call numbers are not changed to fit into the shelf list sequence. Browsability was assessed by tracking how many screens away titles would appear from like items in the online public access catalog if call numbers supplied by a bibliographic utility were not changed. The study assesses whether forgoing call number review affects patrons' ability to find items.

1. Introduction

With the ever-increasing "information explosion," libraries face difficult decisions about purchasing, book processing, and space allocation. Libraries must seek ways to cut costs and increase efficiency, and one area under continuous scrutiny is book cataloging. A study by Magda El-Sherbini and John Stalker entitled "A Study of Cutter Number Adjustment at the Ohio State University Libraries" is one example of research on this time-consuming process. Their study examines "existing copy cataloging procedures to assess whether it was feasible to eliminate the review and adjustment of cutter numbers in producing copy cataloging records. A change in this procedure might reduce processing costs and improve productivity" [1]. As a public service librarian at the Ohio State University (OSU) Libraries, this author questioned whether a change of this type would affect the accessibility of materials. More specifically, this author wondered whether the non-adjustment of cutter numbers, or non-adherence to strict alphabetic order, would affect the "browsability" of materials in an online public access catalog (OPAC). This study takes the research of El-Sherbini and Stalker one step further: it compares the shelf-listed call number with the non-shelf-listed call number provided on a catalog record, and then attempts to assess the effect on the display of the title in an OPAC.
The call number file is called the shelf list because it is arranged in the order the items are found on the shelf. This file promotes browsability among items that are grouped together by subject through call number assignment. Classifying materials to permit effective browsing became more crucial with the rise of open access to materials by library patrons. As Osborn relates, “The provision of self-service on the part of readers grew out of conditions that were encountered for the first time in history in the 1820‟s when in the British Museum some 200 readers a day presented requests for materials and subjects which were beyond the capacity of the librarian-as-a-living-catalog to fill, for example a request to see all of the library‟s holdings of materials printed in France during the French Revolution or a request for information on new discoveries around the world or new developments in all fields of science” [2]. For the purpose of this study the following definitions will be used. 1. Class number: a system of alphas and numerics used to keep like items together by subject whether on the shelf, in a card catalog, or in an OPAC display. Part of the class number may be a cutter for subject, topic, or specific author 2. Class number change: an adjustment made to the cutter 3. Book number: the alpha and numerics used to order items by author or title within a class number 4. Shelf listing: the process of adjusting the book number to fit an item into an existing sequence of materials Throughout the study these definitions are used to differentiate between subject organization (class number portion) and alphabetic organization (book number portion). The phrases “shelf listing” and “call number assignment” both refer to the act of adjusting the book number. By eliminating shelf listing, the role of the call number functions more as a shelf position indicator and less as a means of keeping like items together. In many libraries the bar code, rather than the call number, is the unique number assigned to each item. With the use of bar codes it may no longer be critical to assign a unique call number to each item. Therefore, duplication of call numbers will be tracked as a possible factor that might have an impact on the library collection. 1.2. Literature survey A review of the literature did not reveal any research addressing the impact on call number display in an OPAC if shelf listing is eliminated in the cataloging process. Besides the El-Sherbini and Stalker article, Massey and Malinconico have also contributed research on cutting processing costs by accepting call numbers from records provided by a bibliographic utility. Massey and Malinconico reach a similar conclusion to El-Sherbini and Stalker: “the results of this study indicate that local shelf listing is not a cost-effective operation for the University of Alabama libraries . . . The small number of errors detected produced a small amount of shelflist disorder and would, therefore, be expected to have a low impact on the browsability of the collections” [3]. The University of Alabama Libraries is no longer shelf listing call numbers on provided copy. Many years ago at the OSU Libraries, shelf listing was suspended in all classes with the exception of classes M, N, and P. These exceptions were made in order to maintain the established single cutter for musicians, artists, and literary authors. This author examined other research areas that might influence a library to consider eliminating shelf listing as a part of book processing. 
The research can be categorized into three broad areas: 1) classification schemes in an online environment; 2) the quality of bibliographic records in the online databases of bibliographic utilities; and 3) catalog use and information-seeking behavior studies.

1.3. Classification schemes in an online environment

Many studies discuss the use of classification schemes as a means of improving access to items in an online environment. Most of these reports concern the enhancement of classification schemes through direct linking of class numbers to subject index files. Broadbent [4] highlights the issues by exploring whether an online catalog can function both as a dictionary catalog and as a classified catalog without requiring additional time or intellectual effort on the part of the cataloger. Drabenstott [5] discusses the importance of incorporating a classification scheme into the retrieval protocols of an online catalog, both to introduce a logical approach to subject searching and to increase the amount of information in subject indexes drawn from the subjects in bibliographic records. There is also research on multiple class number assignments in bibliographic records in an online environment. Huestis [6] describes the development of clusters of classification numbers in an index that are associated with bibliographic records and accessible in the online index-searching program. Past and present classification practices are summarized by Hill [7], who proposes that catalogers provide enhanced subject access through multiple classification numbers. Losee [8] examines the clustering of similar books provided by a classification system, studying the relationship between the relative shelf location of a book and the books that users choose to circulate, in order to inform the design of classification and clustering systems that support browsing in online public access catalogs and in full-text electronic storage and retrieval systems. Classification schemes used as independent online retrieval tools are also of interest. Cochrane and Markey [9] present research on data elements that have been enumerated for the purpose of constructing files of library classification records to assist in information retrieval. Williamson [10] addresses innovations in thesaurus design and standards to examine how classification structure can support information retrieval. Both of these articles conclude that an online classification index can aid retrieval, although research into its design, users, and expected results still needs to be done.

The research above implies that current classification practices alone are not an effective tool for the retrieval of information, or are not used to full advantage. In a 1986 survey of ARL libraries, seventy-seven libraries were still maintaining a card shelf list file [11]. The reasons given were that a true shelf list function was not available online, that parts of collections needed retrospective conversion, and that better browsing functions were needed in online systems. As Chan states, "Classification holds great promise for augmenting effectiveness in online retrieval. While certain characteristics of classification prevent its being a totally reliable retrieval tool by itself, it can be a useful supplementary device" [12].
Gertrude Koh [13] supports Chan's statement in her research on a "subject tool box," a combined system of subject headings and classification that would meet users' learning styles and vocabulary and assist in online "shelf browsing." By itself, classification may not be an effective retrieval protocol, but in combination with other search mechanisms it provides added value for user searching. The non-adjustment of call numbers can nevertheless be viewed as a possible development in the use of classification in online systems: if many libraries use the standard call numbers provided on records in a bibliographic utility, the development of classification schemes and their use as search tools may become more acceptable, and the results may be applied to many libraries rather than one library at a time.

1.4. Quality of records in bibliographic utilities

The second area of research examined for this study concerns the accuracy of copy provided by bibliographic utilities. These studies include all fields in the provided record, of which the call number is but one element. In 1987 at the Mann Library of Cornell University, Janet McCue and others found that in an analysis of cataloging copy from the RLIN database, 57.44% of a total of 85.3 changes were modifications to the classification number. The authors state, "The fact that one or more Mann catalogers changed the classification on 39 of 80 records (including 4 L[ibrary of] C[ongress records]) illustrates the latitude possible in determining classification" [14]. The authors do not define their use of the term classification, but one gets the sense from the content of the article that the term is applied to the class number portion of the call number. They recommend more in-depth training for copy catalogers on the choice and form of classification numbers.

Shared cataloging as accepted or applied by local libraries is of great interest to the library community. In a study of the accuracy of LC copy, Arlene Taylor and Charles Simpson [15] also included classification as an access point worth consideration. They found problems with call numbers in 4.3% of the Cataloging in Publication (CIP) sample and 5.5% of the non-CIP sample. The article does not present data on the types of problems found in the classification, but the problems are considered significant because classification is seen as a major access point.

There seems to be a general perception that classifying a document, or assigning just the class number portion, is a very individualized process; one classifier's subject analysis and classification assignment may differ from another's for the same item. The inconsistency of classification through subject analysis points to another possible weakness in the sharing of call numbers without adjustment: does class number assignment have an impact on a decision to accept call numbers without review? In an investigation of information retrieval from a classification of words used to group documents together, Jones states, "By this [a certain sort of classification] I mean a classification in which members of a class do not necessarily all share one or more common properties, and in which individual items can appear in more than one class. . . . This is a natural consequence of the fact that the documents in a collection, though they may be topically related, are not likely to be identical in both subject matter and vocabulary" [16].
Jones further discusses the difficulty of accurately and consistently assigning the correct identifier to similar documents in order to group them together for retrieval. Consistency in the use of any classification scheme seems to be somewhat problematic, and bibliographic classification differences among libraries may also be affected by the needs and expectations of each library. Current practice assigns one class number to an item based on the first subject heading. If classification is a subjective decision-making process, can we then assume that a general class indication of content or topic is acceptable, or that the call number is relegated to being just a shelf position indicator?

There are also surveys administered by bibliographic utilities to assess users' perceptions of record quality in their databases. In an article about a survey of records in the OCLC database, Davis [17] asks two questions concerning the seriousness of errors and the perception of how well existing programs address quality control needs. This research interest in shared records, by both users and providers, matters here because the present study investigates the acceptance of a provided field without review.

1.5. Catalog use studies

The last research area examined for this study is catalog use studies and information-seeking behaviors, where there is a wealth of research. The following passage from R. Hyman's introduction to his Access to Library Collections sums up the issues involved. "An investigation of any aspect of the direct shelf approach involves one immediately in a central problem which ramifies [i.e., divides], often unexpectedly, into almost every major concern, theoretical and practical, of librarianship. Thus, one may easily become entangled in: selection and acquisition policy . . . ; the function of cataloging, particularly of subject heading, vis-à-vis classification; general versus special classification schemes; documentation as related to librarianship; the utility of mnemonic and of expressive notation; bibliothecal as against bibliographical classification; the differing interpretations of the browsing concept (and of browsability) for research and for non-scholarly library use; how to determine and store less-used or obsolescent materials; the divergent philosophies on the desirable extent of readers' services and reference assistance; the worth and form of independent study in the library; the suitability of the LC Classification (LC) or of the Dewey Decimal Classification (DC) for various types and sizes of libraries–an issue often complicated by concomitant problems of reclassification; the encroachments on direct access resulting from increased use of microforms and from possible mechanized information storage and retrieval; the proper educational, social, or scholarly functions of libraries. Nor is this by any means a full listing of the threatening entanglements" [18].

Even though this statement was written in 1972, it seems to hold true today. When studying the organizational structure of a library's collection, or the direct shelf approach, all of the library's parts and activities come under scrutiny. The card file or online file is usually a patron's first point of contact when beginning to seek materials on a subject or to look for a specific item, and a change to just one of the available files could affect many aspects of how a library is organized and operates.
Catalog use studies investigate not only how information is organized and retrieved, but also the schemes used to organize information in the physical arrangement of the library as well as in online systems and their retrieval capabilities. A common conclusion in many reports is that, when seeking information, users do not start with the call number file. For the most part, users approach the search for information from a known-item point of view (an author or title search) or from a subject perspective [19-21]. After completing the search, they use call numbers to locate items on the shelf, and then browse nearby items for other appropriate titles. In other words, they use call numbers as pointers to the physical item; once they find the shelf area, they browse titles, not call numbers.

Although the following statement by Thomas Mann is not from a user study, it summarizes another aspect of patron behavior that influences search strategies. Mann's Principle of Least Effort "states that most researchers (even 'serious' scholars) will tend to choose easily available information sources, even when they are objectively of low quality, and further, will tend to be satisfied with whatever can be found easily in preference to pursuing high-quality sources whose use would require a greater expenditure of effort" [22]. In general, users want their information search to be quick and easy, with usable results limited in number.

Another common thread in user study reports is the classification scheme itself and how it is manifested in the physical arrangement of items in the library. The classification of the store of human knowledge is a very complex issue. As Langridge states, "In the bibliographic context, 'classification' is commonly taken to imply 'classification schemes'. These represent the fullest use of classificatory methods, but the term 'classification' by itself really means a way of thinking that pervades all subject work" [23]. A "way of thinking" is the crux of the issue facing libraries today: each user may have his or her own way of approaching a search for information, and it is difficult to assess how an "average user" thinks in order to design the best organizational scheme. The LC Classification schedules are very complex, and without some explanation patrons may not be able to use them. The full call number is used by patrons to locate an item on the shelf; only in its broadest sense (the class number alone) does "classification" assist the patron in browsing by linking like items together.

There is much activity, study, and discussion in the area of classification research. Classification schemes and call number assignments are revised to meet continuing changes in information, are examined in records found in bibliographic utilities, and are studied to determine their use by those searching for information. This study may raise more questions than it answers, but it is hoped that the research will shed some light on the non-adjustment of call numbers as a viable option for librarians to consider.

1.6. Research objectives and methodology

From home, office, or in the library, the online catalog serves as the initial point for users to begin their search for information. The research questions examined by this study are: Is it necessary to adjust the book number to maintain alphabetic order of items within a class? If not, how does this affect the call number display in an OPAC?
In other words, to what degree will the browsability of a collection in an online catalog change if call numbers are not shelf listed? The preponderance of the literature describes the need for research and development in the use and application of classification systems, and for more analysis of searching behaviors; no research has shown whether suspending strict alphabetic ordering affects browsing for information in an OPAC. Libraries might be able to abandon strict alphabetic order for speedier, more efficient processing of materials if browsability in the call number file is not greatly affected.

Data were collected on the call numbers in the bibliographic records, and on the edited, or shelf-listed, call numbers for comparison with the original call numbers provided by the bibliographic utility. The impact on the browsability of an item, that is, the effect on the display of the title in an OPAC, was then assessed. The initial results were also examined over three years to track any change in the OPAC display positions of the titles.

The data for this study were taken from items copy cataloged during three months in 1992 at the OSU Libraries, which use LC Classification. Data were compiled on books that received copy cataloging from bibliographic records found in the OCLC Union Catalog. These bibliographic records were considered acceptable if they included an LC Classification number and subject entries; records that did not have an LC call number were eliminated from the sample. Because this study is primarily concerned with the effect of shelf listing on the OPAC display of titles, no attempt was made to ascertain the correctness of the class assignment, and the class number on the bibliographic record as found in OCLC was assumed to be valid.

To provide a description of the overall sample as found in OCLC, the following data elements were tracked from the supplied copy:
1. cataloging input agency: LC or member institution
2. encoding level: blank (LC), I (member institution), 8 (CIP cataloging), and other (e.g., 5 for minimal-level cataloging)
3. bibliographic description: blank (non-ISBD), a (AACR2), I (ISBD form)
4. call number field tag: 050 (assigned by LC) or 090 (assigned by a member institution using the LC Classification scheme)

An analysis of the portions of the call numbers that were changed was also done, to identify the types of changes made for shelf listing purposes. The categories used to track the call number changes were:
1. classification (including a topical, geographic, or author cutter)
2. book number (the cutter used to alphabetize the item into the shelf list)
3. changes required by local practice (adding a date, adding a number one for an English translation, etc.)
4. no change required

In addition, it was noted whether an unchanged call number matched or duplicated a call number already in the call number file, and whether a changed call number was in literature (LC Classification P). In summary, to assess the browsability of like items in the OPAC, three basic steps were used to analyze the sample: 1) a description of the type of copy used in book processing; 2) a call number analysis to assess how many call numbers were changed; and 3) for the call numbers that were changed, a count of how many would have displayed one, two, or three or more screens away if they had not been changed.
The last step could be done only after step two, which eliminated those items whose call numbers had not been changed. Approximately 250-300 items in all formats were cataloged daily at the OSU Libraries. Every tenth title was selected for the sample, as representative of the approximately 12,000 to 15,000 items normally added to the collection every three months. The sample was selected according to the following conditions:
1. Only items that were copy cataloged were used; any items originally cataloged at the library were removed during the analysis of the overall sample.
2. Only monographs (including microforms) were used as source data.
3. Items cataloged with a locally constructed call number (not LC Classification) were eliminated.

The source data used in this study were three years old when analysis began, and approximately 200,000-225,000 items had been added to the online catalog since the sample items were cataloged. By counting lines in the online catalog display for the unedited call numbers, the estimate of the effect on browsability in the OPAC was therefore based on a period of three years.

The sample yielded a total of 1,130 titles. The analysis began with a brief description of the type of copy provided. The fields chosen to describe the sample were the cataloging source field, the encoding level field, the description field, and the call number field tag. The definitions for these fields were taken from Bibliographic Formats and Standards (1993, FF:3-75. 079-83), issued by OCLC Online Computer Library Center, Inc. In all cases the first part, subfield "a," of the 040 field was used to identify the original source of the cataloging data: the LC supplied 753 titles (66.6%) and OCLC member institutions supplied 377 titles (33.4%).

The encoding level field was examined next. Encoding level indicates the degree of completeness of the machine-readable record. The LC, National Library of Medicine, British Library, National Library of Canada, National Library of Australia, and National Series Data Program use blank and numeric codes in this field; member institutions use capital letter codes. Encoding level blank is defined as full-level cataloging; encoding level 8 is the code for prepublication-level cataloging, the Cataloging-in-Publication (CIP) program; and encoding level I indicates full-level cataloging input by an OCLC member institution. These codes were examined because they indicate full-level cataloging, which should include a complete call number. Other codes used in the field (e.g., 5 or M) usually indicate less than full-level cataloging and were grouped together into a category titled "other." The results for the encoding level field were: blank = 511 titles (45.2%); I = 337 titles (29.8%); 8 = 243 titles (21.5%); and other = 39 titles (3.4%).

The description field indicates whether the item was cataloged according to the provisions of the International Standard Bibliographic Description (ISBD). The three possible indicators for this field are: blank, indicating that the record is in non-ISBD form; "a," indicating that the record is in AACR2 form (Anglo-American Cataloging Rules, second edition); and "I," indicating that the record is in ISBD form and is known to be a non-AACR2 record.
The description codes concern the bibliographic description of the record's content; they do not imply that the choice and form of the headings used in the record follow AACR2 standards and rules. The results were: level a = 1,064 titles (94.2%); level blank = 38 titles (3.4%); and level I = 28 titles (2.5%).

Based on these three data elements (cataloging source, encoding level, and description), 66.6% of the copy used was provided by national libraries, and 75.0% was full-level cataloging. One-third of the sample (33.4%) was input by member institutions, of which 29.8% was full-level cataloging. Overall, 94.2% of the sample used in this study was input in AACR2 form; only 3.4% was less than full-level cataloging, and 5.9% of the records were in earlier forms of bibliographic description. To summarize, 96.5% of the sample (encoding levels blank, I, and 8) and 94.2% (description a) indicated usable, full-level available copy.

The call number field tag was the next element examined. Acceptable copy at the OSU Libraries is defined as having an LC call number, that is, field tag 050 or 090. The 050 tag is defined as a call number assigned by the LC, and the 090 tag as a call number based on the LC Classification schedules but assigned by an OCLC member institution. Records with neither tag were tracked under other tags (e.g., 060, 070, 082, 092) that are not used by the OSU Libraries for cataloging. The results were that 1,065 titles (94.2%) had 050 or 090 tags, and 65 titles (5.8%) had other or no call number tags. The 65 titles in the "other or not present" category were eliminated from further analysis, because this type of call number is always shelf listed and would not have been accepted without review. With their elimination, the sample size was reduced to 1,065 titles.

Another book processing requirement of the OSU Libraries is that selected copy must have valid LC subject headings (650 field tag). This category does not affect the study except that items without valid subject entries would be forwarded to the original cataloging section; it was counted to determine how many items would have been removed from processing without review. Of the 93 titles without subject entries, 67 were classed in literature, which does not require subject analysis, leaving only 26 titles with no subject entries. None of these titles were eliminated from the sample at this point, because they were processed using the call number found on the copy, although all of them required expert attention to other fields before cataloging was complete.

The last category used to define the sample answers the question: Was the record originally cataloged by the OSU Libraries? This question is important because it means that no bibliographic record was available in OCLC; a cataloger at the OSU Libraries would catalog the item, including a shelf-listed call number, before inputting it to the OCLC Union Catalog. This study examined only items that were copy cataloged using a bibliographic record already in OCLC, and forty-five titles were found to be originally cataloged by the OSU Libraries.

To summarize, the initial sample was reduced by the 65 titles that did not have an LC call number and by the 45 titles that were originally cataloged, leaving a total sample size of 1,020 titles. Of these, only the call number was examined further.
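The sample-selection tests described above can be expressed as a short filter. The sketch below uses the pymarc library and simplifies the literature test to a check of the call number's first letter; in the study itself, titles lacking subject entries were flagged for expert attention rather than dropped:

    # Keep records with usable copy: an LC call number (050 or 090) and
    # LC subject headings (650), except literature, which needs no subjects.
    from pymarc import Record

    def acceptable_copy(record: Record) -> bool:
        call_fields = record.get_fields("050", "090")
        if not call_fields:
            return False                      # no LC call number: reject
        is_literature = call_fields[0].value().startswith("P")
        has_subjects = bool(record.get_fields("650"))
        return has_subjects or is_literature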
The initial examination determined whether the call number on the bibliographic record was changed or accepted as found. Seven hundred ninety-six titles (78.0%) had call numbers that were accepted without revision during the copy cataloging process, so the sample for further call number analysis was reduced to 224 titles.

1.7. Results

For these 224 titles, three categories were tracked to identify which part of the call number was changed. First, it was noted whether the class number, which includes the author, geographical, or topical cutter, had been changed; this change was counted first, and as the only change, even if other parts of the call number were also changed. Second, the book number, which alphabetizes the title into the collection, was examined; this category was counted as the only change if it was the only element changed in the call number. Third, changes due to local practice were counted as the only change provided that the class and book numbers were not changed. Three local practices were included in this study: 1) adding a number one to the book number to indicate an English translation; 2) adding a cutter, Z8, to show literary criticism; and 3) adding a year to the call number. The results of the analysis of the parts of the changed call numbers can be seen in Table 1.

Table 1. Call number changes

By checking where the record would file if the call number had not been edited, and comparing that to where the record files with the edited call number, the "browsability," or how close together on the screen the two call numbers appear, can be estimated. The OPAC display of call numbers in OSU's Innovative Interfaces system shows eight call numbers per screen; when a call number is input that does not match an existing call number, it is displayed in the middle of the screen, with four call numbers above and four below. For this study, distances in call number lines were translated into OPAC screen displays as follows:
- 1-4 lines: the same screen
- 5-12 lines: one screen away
- 13-20 lines: two screens away
- 21-28 lines: three screens away
- 29 or more lines: more than four screens away

The results of the OPAC search on the unchanged call numbers, in relation to the shelf-listed call numbers, appear in Table 2. Note that the 187 items (83.5%) that were within twelve lines of the shelf-listed call number would probably have been found or seen by a patron following Mann's Principle of Least Effort; in essence, the change to the call number was relatively slight when position in the OPAC display was examined. That leaves 37 of the titles (16.6%) two or more screens away; for a user following the principle, these would result in missed or failed searches.

Table 2. OPAC display result for provided call number

In the OSU Libraries' OPAC, the call number does not have to be unique when a record is added to the database; the unique number for each item is the barcode. It is therefore technically possible for two different items to have the same call number and still be retrieved for circulation purposes, though it is not known whether this would confuse patrons in the OPAC display or at the shelf. Thus, an additional category was tracked to determine the percentage of duplicated call numbers if call numbers were accepted without review. It was also noted whether the titles were different or the same.
Of the 224 titles, eight (3.6%) duplicated an existing call number. In six of the eight (75%), the titles were different, meaning that the same call number had been assigned to two different items. Of the remaining two (25%), one matched a call number input to this OPAC by another library, and the other represented the second edition of a title and matched the call number used for the cataloged first edition. Since approximately 25% of this collection is in the literature classifications, two additional categories of information about the changed call numbers were tracked: 1) whether the item is literature, and 2) whether the call number change was made to keep literary authors together. Of the 71 changes made to the class number, 55 (77.5%) were classed in literature. If these adjustments had not been made, a "new" class number sequence would have been established for these authors, and their works would have been found in two shelf locations. The remaining 16 titles (22.5%) were not literature; upon review, the author determined that the class number portion was changed because of a topical or geographical cutter, and these changes were made to keep the same topics or geographical areas together in the same shelf location. After compiling the results of the first search in December 1995, the author intended to do a time series projection based on the results and to check the OPAC displays two more times. However, when the OPAC displays were examined in March 1997 and May 1998, no change had occurred in the display positions of the 224 titles. Either the size of the collection did not increase enough, or collecting in the subject areas of the 224 titles was not significant enough to change the OPAC display positions. Another possibility is that, since 1995, the unchanged call numbers would have compounded the out-of-sequence items. This aspect of the OPAC display results has not been tracked or factored into the results of this investigation.

2. Summary

Since the size of the library collection seems to have an effect on the OPAC displays, some overall projections might be made for one year of production against the size of the database. Of the original sample, 224 titles (21.9%) had a call number change. If these call numbers had not been changed, then 224 titles would not be in correct order in the OPAC display. However, 187 (83.5%) of the 224 titles appear on the same screen or one screen away and are considered easy to find if a search of the OPAC is done by call number. The remaining 37 titles (16.6%) would fall two or more screens away and are considered not easy to find. Note that 796 titles (78%) of the sample would fit into the collection without call number adjustment. Based on these results, the following projections can be made for one year of production. Approximately 45,200 monograph titles are added to OSU's collection in one year. Of these titles, 43,256 (90.2%) could be processed because there was available, acceptable copy. There would be approximately 9,473 (21.9%) call number changes; if these call numbers had not been changed, these titles would be out of order in the OPAC display. However, 7,909 (83.5%) of these titles would be on the same screen or one screen away from the shelf listed call number. This leaves 1,572 (16.6%) titles with unchanged call numbers that would be two or more screens away.
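The projection arithmetic above, and the line-to-screen translation it relies on, can be made explicit. The sketch below is illustrative only: it hard-codes the percentages reported in this study, its rounded outputs may differ from the quoted figures by a title or two, and it anticipates the three-year check against the 1995 database size discussed in the next paragraph.

```python
# Illustrative reconstruction of the study's projection arithmetic.
# All inputs are figures reported in the study; nothing here is new data.

def screens_away(lines: int) -> int:
    """Translate a distance in call number lines into OPAC screens away,
    following the study's rule for an eight-line call number display."""
    if lines <= 4:
        return 0          # same screen
    if lines <= 12:
        return 1          # one screen away
    if lines <= 20:
        return 2
    if lines <= 28:
        return 3
    return 4              # the study's "more than four screens away" bucket

assert screens_away(10) == 1 and screens_away(13) == 2

acceptable_copy = 43_256                  # titles per year with acceptable copy
changed = round(acceptable_copy * 0.219)  # 21.9% -> 9,473 call number changes
near = round(changed * 0.835)             # 83.5% -> 7,910 (study reports 7,909)
far = round(changed * 0.166)              # 16.6% -> 1,573 (study reports 1,572)

# Three years of production against the 1995 database of 2,865,000 titles,
# using the study's own per-year figure of 1,572:
three_year = 3 * 1_572
print(three_year)                             # 4,716 titles out of sequence
print(round(three_year / 2_865_000 * 100, 2)) # 0.16% of the database
```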
The first OPAC search was done in 1995, three years after the sample titles had been processed. The estimated size of the database at that time was 2,865,000 titles. Following the line of reasoning above, after three years of production there would be 4,716 titles (0.16% of the entire database) out of sequence by more than two screens in the OPAC display. Using 0.16% as the out-of-sequence percentage against yearly database growth, predictions can be made of the number of titles in the database that would be more than two screens away from a shelf listed call number. These results do not take into account any compounding that may occur because of the out-of-sequence items; this study has not examined whether compounding is a significant factor in increasing the number of out-of-sequence items over time. The literature titles are more of a problem if titles are out of sequence. With literature, it is the class number that becomes the key element in accepting call numbers without review. Only a cursory review of literature titles was done in this investigation. Of the 224 titles searched in the OPAC, 55 (24.6%) were literature titles with changed call numbers. Of these fifty-five titles, 53 (96.4%) had a change made to the author cutter, which is the element used to keep the works of an author together on the shelf. Without this call number adjustment, the works of an author would be shelved in two or more locations. This investigation did not review the literature titles further, but it would be interesting to note how far from the established class number an unchanged literature call number would fall, not only in the OPAC but also on the shelf.

3. Conclusions

The research question asked in this study is: to what degree will the browsability of a collection in an OPAC change if call numbers are not shelf listed? The results indicate that for this library's collection, after three years, 0.16% of total titles cataloged without call number review may not be easily found in the OPAC. This is not a large percentage, and therefore non-review of call numbers in cataloging would seem to be an acceptable decision for cutting costs and increasing productivity. However, this study raises serious questions that it does not answer, and more research is recommended. This research was limited to a call number search and the display results of titles in an OPAC. The decision on what would be "findable" was based on readings about user retrieval preferences: patrons do not like to retrieve too many titles for review, and they prefer a known-item or subject approach, so it is assumed that all titles would be retrieved by this type of search protocol no matter what the call number assignment. An important constraint of this research is that the OPAC results were not translated to actual shelf positions in the library. This decision was based on the assumption that, in this digital age, the user will search online and then use the call number to find the item on the shelf. Accepting call numbers without review may have one result in the OPAC display and an entirely different result when the actual shelf position of the item is examined. Assume that a user selects an item from a search of the OPAC, jots down the call number, and goes to the shelf to retrieve it. The item selected is one whose call number was not changed during processing, and it is found five shelf ranges away from like items in the collection.
Would the patron be satisfied with this result? Would the patron realize that more items exist but are not shelved in close proximity to the selected title? How does screen display position translate to actual shelf position? How are the items actually shelved in each library? In this OPAC, the call number sequence display is continuous no matter what the format or material type; if a library shelves formats separately (e.g., monographs in one area and serials in another), a shelf position examination might have very different results. By accepting call numbers as found on copy without review, how many classification sequences would actually be established for a given topic? The review of available copy established that 66.6% of the records used were provided by LC and 33.4% by member institutions. If a call number input by LC for a topic has a cutter of R66 and is accepted without review, and the library had already established this topic as R6, the result is that two sequences have been established for one topic. It is assumed that the LC class assignment will remain consistent. If member institution call numbers are accepted without review for the same topic, yet another cutter might be established. A library collection could therefore contain quite a few class sequences for items that are traditionally classed together. This could be a problem not only in browsing the OPAC but also in browsing the shelves. It leads one to question the extent to which a library's processing and maintenance policy extends to re-cataloging items to keep them together. Classification schemes by their very nature are under constant revision to codify new information and new research areas or to change established class number notations. Do libraries go back and adjust the class numbers of items when a change has been made to the scheme? It is assumed that they do not, because of limited resources. If they do not re-catalog because of schematic changes, would it be necessary to re-catalog items that are out of order because of processing choices? This brief discussion does not include all of the issues associated with this study. However, the study shows that approximately 78% of the sample's copy cataloged items fit into this library's collection without needing any call number adjustment. It also shows that 21.9% of processed items required a call number adjustment; for 83.5% of these, the adjustment was so slight that the unchanged call number would have displayed on the same screen or the next screen of the OPAC. This leaves 16.6% of the adjusted items out of sequence by two or more screens, which, when factored into the entire collection, results in 0.16% of total titles not being easily found in the OPAC. Taken by themselves, the statistics seem to make the proposition of processing items without call number review somewhat attractive; translated to the collection's physical arrangement, it may become less attractive. This author proposes that the size of the library collection does make a difference; a similar study on a small library collection would make an interesting comparison. In the virtual world, should the core method of systematic classification that organizes our collections be suspended? As LeBlanc states, "Will the access potential of the virtual library prove healthily cornucopian, or will the browsability of this new informational format permit the retrieval of only so much fodder from the cybernetic trough – enough to sustain users, but not enough to satisfy them" [24]?
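The two-sequences scenario above can be made concrete with a small, purely hypothetical sketch: the call numbers are invented, and the sketch simply groups items by their full topical cutter to show how one unreviewed R66 record sits apart from an established R6 sequence.

```python
from itertools import groupby

# Hypothetical shelflist for one topic: the library established cutter .R6,
# but one unreviewed LC record arrived with .R66 for the same topic.
shelflist = sorted([
    "QA76.9.R6 A5 1990",
    "QA76.9.R6 B7 1992",
    "QA76.9.R66 C3 1994",   # accepted without review
    "QA76.9.R6 D2 1995",
])

# Group by everything up to the book number: each group is one shelf sequence.
for cutter, items in groupby(shelflist, key=lambda cn: cn.split(" ")[0]):
    print(cutter, "->", list(items))

# The output shows two sequences (QA76.9.R6 and QA76.9.R66) for a single
# topic, so the works are split both in the OPAC display and on the shelf.
```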
The author hopes that this examination of call number assignment, and how it might be applied or not applied in processing, provides some new ideas or insights.

References

[1] El-Sherbini M, Stalker JC. A study of cutter number adjustments at the Ohio State University Libraries. Library Resources & Technical Services 40 (October 1996):320.
[2] Osborn AD. From Cutter and Dewey to Mortimer Taube and beyond: a complete century of change in cataloguing and classification. Cataloging & Classification Quarterly 12 (1991):36.
[3] Massey SA, Malinconico SM. Cutting cataloging costs: accepting LC classification call numbers from OCLC cataloging copy. Library Resources & Technical Services 41 (January 1997):38.
[4] Broadbent E. The online catalog: dictionary, classified, or both? Cataloging & Classification Quarterly 12 (1991):108.
[5] Drabenstott KM, et al. Analysis of a bibliographic database enhanced with a library classification. Library Resources & Technical Services 34 (April 1990):179.
[6] Hill JS. Online classification number access: some practical considerations. The Journal of Academic Librarianship 10 (March 1984):21.
[7] Huestis JC. Clustering LC classification numbers in an online catalog for improved browsability. Information Technology and Libraries 7 (December 1988):383.
[8] Losee RM. The relative shelf location of circulated books: a study of classification, users, and browsing. Library Resources & Technical Services 37 (April 1993):197-8.
[9] Cochrane PA, Markey K. Preparing for the use of classification in online cataloging systems and in online catalogs. Information Technology and Libraries 4 (June 1985):108-9.
[10] Williamson NJ. The role of classification in online systems. Cataloging & Classification Quarterly 10 (1989):99.
[11] Epple M, Ginder B. Online catalogs and shelflist files: a survey of ARL libraries. Information Technology and Libraries 6 (December 1987):294.
[12] Chan LM. Library of Congress class numbers in online catalog searching. RQ 28 (Summer 1989):536.
[13] Koh GS. Options in classification available through modern technology. Cataloging & Classification Quarterly 19 (1995):196.
[14] McCue J, Weiss PJ, Wilson M. An analysis of cataloging copy: Library of Congress vs. selected RLIN members. Library Resources & Technical Services 35 (1991):73.
[15] Taylor AG, Simpson CW. Accuracy of LC copy: a comparison between copy that began as CIP and other LC cataloging. Library Resources & Technical Services 30 (October/December 1986):377.
[16] Jones KS. Some thoughts on classification for retrieval. Journal of Documentation 26 (June 1970):91.
[17] Davis CC. Results of a survey on record quality in the OCLC database. Technical Services Quarterly 7 (1989):44.
[18] Hyman RJ. Access to library collections: an inquiry into the validity of the direct shelf approach, with special reference to browsing. Metuchen, NJ: Scarecrow Press, 1972, p. 2.
[19] Wallace PM. How do patrons search the online catalog when no one's looking? Transaction log analysis and implications for bibliographic instruction and system design. RQ 33 (1993):239.
[20] Tagliacozzo R, Rosenberg L, Kochen M. Access and recognition: from user's data to catalog entries. Journal of Documentation 26 (September 1970):248.
[21] Hancock M. Subject searching behavior at the library catalog and at the shelves: implications for online interactive catalogues. Journal of Documentation 43 (December 1987):306.
[22] Mann T. Library Research Models: A Guide to Classification, Cataloging, and Computers. New York: Oxford, 1993, p. 91.
[23] Langridge DW. Classification: Its Kinds, Elements, Systems, and Applications. London: Bowker-Saur, 1992.
[24] LeBlanc J. Classification and shelflisting as value added: some remarks on the relative worth and price of predictability, serendipity, and depth of access. Library Resources & Technical Services 39 (July 1995):302.

Are They Too Dynamic to Describe?

Bonnie Parks and Jian Wang, Presenters; Sarah John, Recorder
doi:10.1300/J123v48n03_01. © 2005 by the North American Serials Interest Group, Inc. All rights reserved. Co-published simultaneously in The Serials Librarian, Vol. 48, No. 3/4 (2005), pp. 237-241, and in Growth, Creativity, and Collaboration: Great Visions on a Great Lake, ed. Patricia Sheldahl French and Margaret Mering (The Haworth Information Press, 2005), pp. 237-241.

SUMMARY. Parks and Wang discussed the issues and challenges of cataloging integrating resources, emphasizing electronic resources such as Websites and databases. Using specific examples and citing cataloging rules and standards, the presenters looked at some of the problematic aspects of integrating resources and suggested ways in which catalogers can use their judgment to describe the features of these dynamic materials.

The 2002 revisions of the Anglo-American Cataloguing Rules, Second Edition (AACR2) brought great change to the worlds of serials and cataloging. Specifically, chapter 12 of AACR2 underwent its own title change, from "Serials" to "Continuing Resources"; the revised chapter encompasses serials as well as a new issuance of materials called
integrating resources, complete with a new bibliographic level and new entry conventions. The revisions led to updates in the Library of Congress Rule Interpretations (LCRIs) and new MARC 21 and OCLC standards for the descriptive cataloging of integrating resources. Integrating resources are materials with updates that do not remain discrete; in other words, the updates are not published separately as issues but become "integrated" into the whole work itself. They may be finite or continuing. Examples include loose-leaf publications, updating databases, and Websites. The presenters focused on electronic resources, since catalogers most frequently encounter this type of integrating resource. While integrating resources are grouped with serials in AACR2, in different institutions they may be assigned to serials or monograph catalogers as well as to electronic resource and metadata catalogers. The new rules in AACR2 are designed to give catalogers flexibility when cataloging integrating resources, and catalogers' judgment increasingly comes into play when describing this type of material. Parks and Wang referred to chapters 9, 12, and 21 of AACR2 as the most relevant to integrating resources: chapter 12 covers continuing resources; chapter 9 covers electronic resources (formerly called computer files); and chapter 21 discusses choice of access points and title changes. The chief source of information for integrating resources can now come from anywhere in the resource; catalogers should prefer the source that has the most complete information. The description of an integrating resource is based on its current iteration, except for the beginning date of publication, which should be based on the first iteration, if known. Title changes for integrating resources are treated differently from serials: rather than using successive entry rules, integrating entry rules call for the newest title to replace the old title in the 245 field. Former titles are retained in 247 and 547 fields; the 547 is a note field that explains complexities in title changes that cannot be expressed in 247 fields. When a cataloger believes a publication is an integrating resource, he or she should consult LCRI 1.0 to confirm that the item is not a monograph or a serial. If the item is determined to be an integrating resource, the rules of AACR2 chapter 12 should be used in conjunction with the chapters appropriate to the type of item being cataloged. For example, for a cartographic Website, the cataloger would consult chapter 3 of AACR2 along with chapters 9 and 12. New conventions for cataloging integrating resources also introduce numerous changes in MARC tagging. The largest bibliographic utility,
As a result, catalogers will be seeing monographic-like cataloging records with serial-like fields such as the 022 ISSN field, the frequency 310/312 fields, the unformatted 362 field and 580/76x-78x linking fields. The AACR2, 2002 revisions allows for linking between serials and integrat- ing resources, between monographs and integrating resources, and be- tween integrating resources. However, linking from monographs to monographs is still not allowed. Further examples of MARC coding included a look at the most com- mon fields in a serial 006, and a demonstration of a typical serial 006. A new code in the 006 under the fixed field element ‘S/L’ (entry conven- tion) was introduced in OCLC Technical Bulletin 247. A code value of ‘2’ means that the item has an integrated entry convention and follows the integrating entry rules discussed above. Electronic resources also need a 007 field to code for physical characteristics. If the item has a fixed field Type ‘a’ for language material, it will need an additional computer file 006 to code for the electronic resource aspects of the material. The speakers presented case studies of integrating resources. Audi- ence members were asked to decide on the appropriate fixed field cod- ing for each example. Participants examined two Websites to decide when to use fixed field Type ‘a’ for language material versus Type ‘m’ for computer files. Catalogers should code for content, rather than for- mat. A Website consisting of a bibliography of Asian studies materials was coded fixed field Type ‘a’ for language material. The Website for Encyclopedia Britannica, while considered a language material in print format, would get the fixed field Type ‘m’ code because the Website contains an online store, a currency converter and numerous other mul- timedia and online service aspects. The Barnes and Noble Website was easier to code because it clearly has many aspects of an online system/ service (fixed field Computer File ‘j’). The distinction between fixed field Type ‘a’ and ‘m’ can be tricky. Cataloger’s judgment does come into play when making a decision about Type. Type ‘m’ should be used in cases of doubt. These examples were deliberately ambiguous to show some of the possibilities and challenges of integrating resources, and why judgment is necessary. Tactics Sessions 239 The next set of case studies involved choosing the chief source of in- formation and the title proper. Since the rules instruct that the chief source is the resource itself, catalogers are given more flexibility when choosing a title proper. The best source should present the most com- plete information. Introductory words such as “Welcome to” should not be included in title fields (245). They can be included as a variant title, a 246 title added entry. Title added entries should be given to all variants of a title that are considered important. After the title proper is chosen, its source should be recorded in a 500 field. There was some discussion from the audience as to the wording of source of title notes (“title from home page,” “title from introductory screen,” etc.). Since no standard wording for these kinds of notes exists, participants were advised to use their own judgment. Parks and Wang also said to be careful when tran- scribing other title information (field 245, subfield b). Subtitles can change frequently in an online environment. 
The next set of case studies involved choosing the chief source of information and the title proper. Since the rules instruct that the chief source is the resource itself, catalogers are given more flexibility when choosing a title proper; the best source presents the most complete information. Introductory words such as "Welcome to" should not be included in the title field (245), but they can be recorded as a variant title in a 246 title added entry. Title added entries should be given for all variants of a title that are considered important. After the title proper is chosen, its source should be recorded in a 500 field. There was some discussion from the audience as to the wording of source of title notes ("title from home page," "title from introductory screen," etc.); since no standard wording for these notes exists, participants were advised to use their own judgment. Parks and Wang also counseled care when transcribing other title information (field 245, subfield b), because subtitles can change frequently in an online environment. One audience member mentioned that she views the source code of a Website and takes the title from the metatags, if they exist, as this is generally the most stable source for a title. The third set of case studies focused on who was publishing the material as opposed to who was hosting it on the Web. The copyright statement or the domain name can give clues as to who is hosting versus who is publishing a Website. If considered important, make a 500 note starting with "Hosted by" for the host, and include a 710 added entry. Once a publisher is established, the cataloger may need to go outside the source to find the place of publication; a cataloger-supplied place of publication should be enclosed in brackets in the 260 field, with a question mark if the information is uncertain. The speakers addressed title changes next. For integrating resources, title changes are handled by integrating entry, meaning the title proper (245 field) changes to match the current iteration, and former titles are recorded in the 247/547 fields. Unlike the 246 field, the 247 field does not allow catalogers to put detailed information in a subfield 'i'; catalogers should instead use the 547 note field to provide further details on integrating resource title changes. Dates may be added in 247 subfield 'f', with the exact timing of title changes if available; approximate date information should be enclosed in angle brackets. The presenters demonstrated the use of the Internet Archive's Wayback Machine (http://www.archive.org/), which can help determine previous versions and titles. When a title changes, catalogers also need to remember to include a 500 note with the latest date the Website was viewed. Final case studies centered on how to handle splits, merges, and title reformatting, with a brief explanation of the various linking fields available. The speakers reminded the audience to make sure that links are reciprocal between records: if there is a 776 field for the print version in the electronic resource record, there should be a 776 for the electronic version in the print record. The workshop concluded with an emphasis on the basic principles for cataloging integrating resources:
• Focus on the whole publication instead of one iteration
• Focus on identification rather than transcription

For additional guidance:

Anglo-American Cataloguing Rules. 2nd ed., 2002 revision. Chicago: American Library Association, 2002.
"Coding Practice for Integrating Resources," OCLC Technical Bulletin 247 (2002). http://www.oclc.org/support/documentation/worldcat/tb/247/ (1 July 2004).
CONSER Editing Guide. 1994 ed. Washington, D.C.: Library of Congress, Serial Record Division, 1994. Updated through update 15 (2003).
Hirons, Jean L., ed. CONSER Cataloging Manual. 2002 ed. Washington, D.C.: Library of Congress, Cataloging Distribution Service, 2002.
Library of Congress. Library of Congress Rule Interpretations. 2nd ed. Washington, D.C.: Cataloging Distribution Service, Library of Congress, 1990. Updated 2004, update 2.
Miller, Steve. SCCTP Integrating Resources Cataloging Workshop. Washington, D.C.: Cataloging Distribution Service, Library of Congress, 2003.
Weitz, Jay. Cataloging Electronic Resources: OCLC-MARC Coding Guidelines. Revised July 17, 2003. http://www.oclc.org/support/documentation/worldcat/cataloging/electronicresources/ (1 Jul. 2004).
CONTRIBUTORS' NOTES. Bonnie Parks is Serials and Electronic Resources Catalog Librarian at Oregon State University. Jian Wang is Serials Catalog Librarian and Serials/Documents Cataloging Coordinator at Portland State University. Sarah John is Electronic Resources Serials Librarian at the University of California, Davis.

Serial Conversations: An Interview with Diane Hillmann and Frieda Rosenberg

Jian Wang, Contributor; Bonnie Parks, Column Editor
doi:10.1016/j.serrev.2005.02.006

Jian Wang is Serials Cataloger, Branford P. Millar Library, Portland State University, Portland, OR 97207-1151, USA; e-mail: jian@pdx.edu.

OCLC recently announced a plan to implement MARC 21 Format for Holdings Data (MFHD) and invited holdings experts Frieda Rosenberg and Diane Hillmann to serve as advisors and to aid in the implementation process. In December 2004, Jian Wang interviewed Rosenberg and Hillmann. They discuss their longtime involvement with the holdings standard and provide interesting perspectives on the issues, challenges, and benefits for the constituencies (libraries, the serials community, system vendors, and bibliographic utilities) involved with and responsible for implementing and using MFHD. Serials Review 2005; xx:xxx-xxx.

The term "MARC 21 Format for Holdings Data" (MFHD) is no longer a strange name to most librarians, but how it is understood and practiced by the library community varies. To some, MFHD is the established holdings standard used by libraries in managing serial publications in a standardized and consistent manner. To others, it is still a vague concept with little application in local use. I was honored to be able to interview two well-known holdings experts, Diane Hillmann and Frieda Rosenberg, to discuss serials holdings related issues with a focus on MFHD. Diane Hillmann is the metadata specialist, National Science Digital Library, at Cornell University. Besides her expertise in metadata, she is also one of the pioneers in the development of holdings standards. Frieda Rosenberg is head of serials cataloging for the University of North Carolina (UNC) at Chapel Hill. She is also known as the "mother of serials holdings" because of her numerous workshops and publications in the field.

Professional Questions

Jian Wang (JW): What initially sparked your interest in serials holdings/holdings standards?

Diane Hillmann (DH): I was a law librarian at Cornell in technical services from 1977-1995, so I was interested in both serials and non-serials holdings. Law libraries traditionally have had the most creatively misbehaved publication patterns, and it was the law community that developed the understanding of "continuing resources" that eventually spread to other libraries.

Frieda Rosenberg (FR): Ironically, my interest began in the late seventies, when, after seven years as a paraprofessional turning out catalog cards by both typewriter and terminal keyboard, I moved to North Carolina, went to library school in Chapel Hill, and worked as a volunteer at the information desk in a local university library. The coordinator told me that for serials I should steer people toward the microfiche holdings list rather than to the card catalog.
I felt ambivalent about that (remembering previous efforts at producing cards!). I wondered what could be done, if holdings were all important, to bring cataloging and holdings together. As I finished my library degree in 1978 and actually got a job as a serials cataloger at the UNC Library (where the same separation prevailed), I noticed even more files of holdings: the Kardex, the binding records, the serials printout, the microfiche, and a separate card file called the Srec (serial record), and this was just the serials department's portion of all existing serials holdings files! As standards began to arrive in the next few years along with online catalogs, it began to dawn on me that the holdings needed for so many purposes would be more efficient in one place, but only if they were able to serve adequately for those purposes, and that was what standards, plus online access, could help to achieve.

JW: Frieda, you said, "Holdings are at the hub of library serials use and serials management, just as central as the bibliographic record."1 Why is that?

FR: Now I can say, "Together with the bibliographic record, holdings are at the hub of library serials use," because the resource is all the richer when the bibliographic and holdings records are finally united. But the experience that I described above showed me how important holdings were even in alphabetical title lists without a lot of bibliographic information. Our physical "com-fiche" (computer output microfiche) list was sent all over campus and the state. The reference desks in each branch, as well as in our main library, were extremely active users of the list. Serials management (check-in, binding, inventory, preservation, interlibrary loan, circulation, "hooks to holdings," or even manual notations of holdings in printed periodical indexes): all these processes involved holdings and contributed their own holdings data to the mix. In an integrated system they still do, though we still haven't managed to run all these operations off of one file.

JW: When was MFHD first introduced? What was the driver in the development of this holdings standard? Why has it taken so long for MFHD to be accepted in practice?

DH: The MARC Format for Holdings was developed in the mid-1980s. I wasn't involved with MARC development that early (I began in 1988), so I'm not entirely sure what the driving force behind the development was, but I suspect it was union listing. I believe that one reason it took so long for the holdings standard to be implemented in libraries was that it was designated as a "draft" for a long time, probably almost fifteen years, even though it was relatively stable long before that. Also, it was complex and heavily encoded, even for a MARC standard. Very few people understood its power and potential sufficiently to attempt to use it, and the library management system vendors were very reluctant to be at the bleeding edge of development. VTLS was the only integrated library system with full MFHD capability for many years, and consequently they contributed significantly to its development. We owe them a great deal.

FR: The MARC Format for Holdings Data began as a project within the Association of Southeastern Research Libraries (ASERL). Eight ASERL libraries began in the very early eighties to develop a way to communicate holdings data by computer. Eventually the Library of Congress commissioned them to develop their new standard as a MARC format, and it became the MARC Format for Holdings and Locations, later USMARC and finally the MARC 21 Format for Holdings Data. So, unlike the bibliographic standard, an LC development, and the holdings display standard, developed by ANSI Z39 subcommittees, MFHD was inspired and created through the efforts of libraries working together. Nonetheless, it has been slow in both development and implementation. The format got a reputation for difficulty, so much so that some features (such as expansion and compression) barely exist in the field even today. It is a standard for communication, so it cannot in and of itself guarantee standard data, though it certainly helps encourage it. All holdings standards were harder to implement than bibliographic standards because, in the minds of many, holdings are considered local data and thus up to each individual library, so that adoption of standards seems like a loss of local control. Furthermore, the sheer bulk of this free-form legacy data in large libraries, its existence at different levels of granularity, and its different forms suiting a variety of functions were all deterrents to standardization.

JW: What are the major reasons to implement MARC Holdings?

DH: I believe we're at a point where that question shouldn't need to be asked. Those of us old enough to remember when the bibliographic formats were new remember that there were similar questions asked before everyone fully understood how essential standard data were for libraries sharing data amongst themselves and investing heavily in their own data in an environment where systems change and data must migrate from one system to another. Unless you believe that there is some value in going it alone, and bowing out of the incredible data sharing infrastructure that makes libraries in this country a model of common sense collaboration, you need to implement MARC Holdings. Nobody can afford not to implement; that train has left the station.

FR: Using MARC Holdings makes more sense than ever now. It has been adopted by all the major integrated library systems and is about to be adopted by OCLC as the basis of its local data record. In some cases the ILS (integrated library system), or OCLC, will be able to map your data into MARC, so you will receive a database of holdings which will be compatible with new systems, new versions of your present system, and computers accessing your data remotely. What a great benefit! Acquiring or adding publication patterns enables you to predict serial receipts and saves you check-in labor. Both patterns and data save you costs, enable you to share records and acquire records from others, and then multiply these benefits across the library community as other libraries share information with you.

JW: What challenges or difficulties have libraries experienced in the actual implementation process?

DH: The biggest challenge has been the inclination of many library systems vendors to implement the standards in a proprietary way, emphasizing interfaces that protected library staffs from the horrors of encoding. Some sort of interface for check-in staff (who may be students or part-timers) is very necessary, but librarians and managers must understand what sits below those interfaces and be able to interact with and understand the coded data. I remember the days when systems developers were convinced that librarians would never be able to deal with numeric field tags and coded subfields. We think that's hilarious now, but it's essentially the attitude that is hindering the full implementation of MARC Holdings.

FR: Developing onsite knowledge of MFHD takes some time and effort. Leadership is needed in order to create the necessary training, and administrative support is essential for these priorities. If holdings work is shared among various groups of staff engaged in different activities, their buy-in and their training is a crucial foundation for the task of developing the holdings database. We'd like to be able to say that you are guaranteed smooth sailing once you have this database, but since systems vary widely in their accommodation of the format and its functionality, migration between systems may still offer some setbacks and risks. This is something we need to work on.

JW: What advice or suggestions would you offer libraries that are thinking about implementing MFHD? What should libraries consider before they make that decision?

DH: I think the question is not "if" but "when" and "how." CONSER provides great training for libraries in holdings, and good documentation. Librarians should approach this issue the same way they do anything else new: learn, plan, implement. There are libraries that have already done this and are happy to help others and to pass on their experience to new implementers. No rocket science here, really!

FR: Look at the considerations in the last paragraph: your library's human resources and administrative support for intensive training and the creation and maintenance of the data, at whatever detail you can manage. Look at your data, too: it is easiest to map if it is, or can be, delimited, categorized, and labeled; if it can't, it is apt to be still mappable to textual holdings. Develop as much expertise as you can. Visit other libraries. Read the literature; for instance, the NASIGuide to Holdings is complete and at this writing will soon be available from the NASIG (North American Serials Interest Group) Web site. If possible, attend one of the workshops available on MFHD. Include MFHD in the discussions with prospective vendors, and include specific detail in your query. For example: do you 1) support the current edition of MFHD for all types of material; 2) support encoding for base volumes, supplements, and indexes; 3) support the creation and maintenance of paired 853 and 863 fields; 4) support all subfields of the publication pattern data; and 5) allow receipt of materials according to input publication patterns? Ask for demonstrations of the features. Discuss your particular data with vendors. And when you finally choose one, test by submitting records for trial conversion.

JW: What do you see as the benefits of standardized holdings data for the serials community in a global environment?

DH: We have far more experience in this arena than many of our potential partners in the publishing and serials service industries, and I think we shouldn't be shy about sharing that experience. Standard data are something that libraries believe in fervently, and we've built up a significant economic infrastructure around the sharing of this data. I hear calls for "simplification" among some of these partners, and I find myself a bit mystified by some of this. Recall that it was not libraries that developed the complex publications that required complex standards to record; it was certainly publishers!
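The paired 853/863 fields mentioned above carry, respectively, the captions and publication pattern of a serial and the actual enumeration and chronology held. As a rough illustration (the indicator values, subfield choices, and data below are invented and simplified, not taken from any real record):

```python
# Illustrative shape of an MFHD caption/pattern field (853) and its matching
# enumeration/chronology field (863), linked through subfield $8.
pattern = "853 20 $8 1 $a v. $b no. $u 12 $v r $i (year) $j (month)"
holding = "863 40 $8 1.1 $a 12 $b 1-12 $i 2004 $j 01-12"

# Read together: volume 12, numbers 1-12 (2004, months 01-12), where the
# pattern says there are 12 issues per volume, issue numbering restarts with
# each volume, and the chronology captions (year)/(month) are not displayed.
# A system can compress this to "v.12 (2004)" for display and, with the
# pattern, predict the next expected issue for check-in.
```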
I think itTs also sometimes forgotten that standards like MFHD are designed for machine-to-machine communi- cation, not human-to-human. Computers deal with much more complex data than encoded holdings even before breakfast. FR: Accrual of benefits tends to be circular. As more libraries implement the standard, everything improves: the data, the standard itself, the implementation of the standard in systems in the market, and the availability of shared archives and templates in systems and utilities, enabling further rounds of improvement. JW: The CONSER Publication Pattern and Holdings Initiative was a major step forward in promoting the use of MFHD.2 WhatTs the idea behind this initiative? What challenges were involved in carrying out the experiment to add publication pattern data to CONSER records in OCLC? ARTICLE IN PRESS Parks / Serials Review xx (2005) xxx–xxx DH: I was there for that one so ITm happy to spill the beans. I attended some of the very early meetings back in the 1980s and early 1990s about sharing publication patterns. ITd gotten a bit frustrated by the lack of momentum in implementing the standard and had snapped at far too many people who opined that holdings were only local data, after all. I approached Jean Hirons on this issue, and there was a historic lunch at ALA at which Jean, Linda Miller, and I hatched the Publication Pattern Initiative. We wrote up a charge and got going convincing the rest of the serials community that the time was ripe for this kind of effort. Thankfully, Rich Greene at OCLC shared our vision and helped us figure out how to jump-start the effort, using local fields in CONSER records and a file of records from Harvard, and we were in business. FR: The Pattern and Holdings Initiative grew out of the realization that although a specific libraryTs holdings might be local data, looked at another way they were a subset of buniversalQ holdings, which were holdings as they came from the publisher. Diane Hillmann, who first suggested the project, wanted to harness these universal holdings, or publication history, for each title as 1) an archive of information for the larger world and 2) a database for all libraries to draw on for assessing their holdings, creating local holdings, and informing their users. Major challenges in designing the experi- ment were identifying which data, both retrospective and current, would be most useful in a shared database. For example, how important are patterns to retrospec- tive data? Deciding how to deal with limited space within the OCLC bibliographic record in the old platform was another challenge we no longer need to face. Jump-starting our work with a data load from the Harvard University Library database made the process much clearer and showed that the idea would really work. JW: What impact, if any, has the CONSER project had on libraries that are still not ready to implement MARC 21 for holdings? DH: I hope it has lit a warm little blaze under their desk chairs! Seriously, though, even libraries that knew they couldnTt implement right away have been instrumental in bringing some of the library system vendors around to fully implement MFHD, and we couldnTt have done it without their cooperation. FR: If Wen-ying Lu also is participating in this interview series, she will be the best person to answer this question! She and Paul Moeller recently conducted a survey on serial holdings which you may have seen on several discussion lists. 
They asked, among other questions, whether pattern fields now displaying at the bottom of a large number of OCLC serial records had at least attracted the notice of many libraries. The results of this survey should be out soon. The fact that two system vendors have developed loaders for the MARC data should definitely attract some of their reluctant customers, who may be considering predictive check-in and would benefit from some ready-made 4 patterns. We would also like to see more loaders developed rapidly. [Ed. note: By pure coincidence, their survey results are published in this issue of Serials Review]. JW: One of the goals of the CONSER project is to work with ILS vendors to develop systems that support MARC holdings. What is the current state of MFHD compliance by ILS vendors? DH: Much better than it was in the beginning. Some vendors made false starts, hoping to implement in ways that would give them competitive advantage and an easier interface, but most of them have come round to understanding that itTs the ability to exchange full standard data thatTs at the core of the effort, and sexy proprietary interfaces wonTt sell if they get in the way of that goal. FR: It is mixed but improving. It would be difficult for any vendor to keep up with the changes (really improve- ments) in the format designed to predict more serials accurately. Implementations are some years behind what the Format contains; however, we have to remember that the Format does not tell vendors how to implement its provisions. Instead, change happens as vendors are challenged to accommodate incoming MARC data. If the system isnTt adequate to handle it, the customer will probably not be satisfied with a bdown-migration.Q I think this, along with competition in general, is the greatest spur to better implementations. JW: Diane, you noted in a NASIG presentation that Z39.71 is to MFHD as AACR2 is to MARC biblio- graphic standards, which marvelously illustrates the two tracks of holdings standards.3 Could you elaborate a bit more on the relationships between Z39.71 and MFHD? How different is the current standard Z39.71 from the previous standards such as Z39.42, Z39.44, and Z39.57? DH: The earlier standards maintained a somewhat artificial separation between serials and non-serials, which were coming undone as MFHD was developing and digital resources finished the job. Z39.71 brought the serial and non-serial standards together into one standard. It is interesting to note that the late (and sorely missed) Ellen Rappaport, who was working for Albany Law School at the time, was co-chair of the NISO committee that developed the standard. She wrote an excellent summary of its history and highlights for her law library colleagues, available at http://www.aallnet. org/sis/tssis/tsll/26-0304/serliss.htm (accessed February 13, 2005). The Z39.71 standards contain most of the context and definitions crucial to understand MFHD, and the MARC standard provides the bpackagingQ that supports the sharing of holdings data created according to Z39.71. They are very intertwined at the conceptual level, certainly. JW: Frieda, you have played a key role in developing the CONSER Guidelines for Input of Captions/Patterns and Holdings Data, the Serials Holdings Workshop course materials for the Serials Cataloging Cooperative http://www.aallnet.org/sis/tssis/tsll/26-0304/serliss.htm ARTICLE IN PRESS Parks / Serials Review xx (2005) xxx–xxx Training Program, and Holdings Guidelines for NASIG. 
What are some of the issues that you have been dealing with when writing the documentation? How do you think librarians and library staff benefit from using these educational materials? FR: Each of those guides was designed for a different user group with different objectives. The guidelines are in the initiativeTs participant manual, and solely written for those who input 891 fields (embedding 853/863 fields, the basic bpaired fieldsQ of the MFHD) into OCLC bibliographic records. They would probably bewilder someone unfamiliar with the special aims of that project, and they leave out all sorts of information that would be necessary in creating local holdings, since the 891 fields are meant to contain buniversal holdingsQ or bpublication historyQ fields. The holdings workshop, within its time constraints, is designed to give an overview and introduction to the subject of local serial holdings, along with some concrete guidance to get people started creating holdings records. It does answer some bwhyQ questions and has appendices, which tackle a few subjects that the workshop canTt cover in depth. One of these appendices is a brief code-by-code hand- book also available on the Web (http://www.lib.unc. edu/cat/mfh/mfhhandbook.html, accessed February 13, 2005). The NASIGuide, which should be available by the time this issue is released, is a much more leisurely and in-depth survey of the MFHD. It tries to cover many more issues, such as migration and conversion of specific fields, than previous guides. Where interpreta- tions have differed in the past, the NASIGuide will discuss them at length and give the reason why one interpretation has prevailed or is favored. I hope that not only librarians and library staff, but also system vendors and bibliographic utilities can take advantage of any of these documents and feel on more solid ground in an arena of competing demands. JW: We know that OCLC is implementing MFHD; you both have been invited to serve as advisors to aid in the implementation. What sorts of results do you envision with this project? DH: I was very impressed with the group at OCLC that is working on their MFHD implementation. They went through the standards documentation with a fine- toothed comb and asked us a great number of really good questions. Their first task is translating their union list data, and I think theyTve found the right balance in approaching that task. FR: We are understandably elated by the whole idea of the LDR (OCLCTs Local Data Record) finally being MARC-based. We understand that OCLC is taking this step because they are receiving better data from many libraries and no longer find it acceptable to use only part of it. The most revolutionary benefit, however, will be that OCLC will convert non-MARC records (far more reasonably, we feel, than a library could do on its own) and the library will have the benefit of that MARC data for further use. Even libraries not intending to union list that data could have it processed for migration or other 5 purposes. It would be impossible to take such a giant step forward without the willing cooperation of our largest bibliographic utility, which also hosts the CONSER database and the Publication Pattern Initiative data. JW: Diane, you currently chair the CONSER Task Force to Explore the Use of a Universal Holdings Record. What is a universal holdings record? 
How is it different from bpublication history?Q Why do you think the concept of universal holdings is important in todayTs shared environment for holdings records? DH: In late summer 2001, Ellen Rappaport and I floated a short discussion paper beginning to define a universal holdings record, based on the notion that what was published for a title was important data bibliographically and should be represented in a hold- ings record (available at http://content.nsdl.org/dih1/ PubPatt/Universal_holdings_statement.html, accessed February 13, 2005). Once the Publication Pattern Initiative began, the Task Force to Explore the Use of a Universal Holdings Record was charged. One of our first tasks was to find a new name for the bthingQ we were talking about because apparently the one Ellen and I chose was confusing people. The task force finally settled on bpublication history recordQ after some discussion sessions with groups of interested librarians, and it seems to have stuck. But of course, the task force still has the old name! I think what confused people at first was this notion that holdings were institution-based, but the publica- tion history record is really part of the complete bibliographic description, conceptually speaking. But if you think about it, what it provides is a template against which holdings can be matched and compared. From that basis, a display relating holdings within an institution, among versions (digital, print, microform) can be constructed. With a publication history record with a currently maintained publication pattern, you also have the basis to exchange information on newly published or available issues and volumes, as well as almost enough detail to construct a standard citation for an article. It is a really powerful underpinning for many of the data exchange challenges weTre struggling with today, and the best thing is that increasing numbers of libraries are committing to using and maintaining it in common with others. We are building the same kind of shared environment that weTve had for almost forty years with bibliographic data, with the same strengths and infrastructure that did the job for us then. JW: The term bserial super recordQ came up at the 2004 ALA Annual Meeting last June. Could you tell me a bit more about this new record model? How does this type of record fit into the FRBR concept and how does it relate to the bholdings recordQ? DH: Frieda and I have been circulating a short paper on this for some time (see http://www.lib.unc.edu/cat/mfh/ serials_approach_frbr.pdf, accessed February 13, 2005), but this fall an article in LRTS by Kristin Antelman came which really supports our notion, with some http://www.lib.unc.edu/cat/mfh/mfhhandbook.html http://content.nsdl.org/dih1/PubPatt/Universal_holdings_statement.html http://www.lib.unc.edu/cat/mfh/serials_approach_frbr.pdf http://www.loc.gov/acq/conser/patthold.html http://www.nasig.org/newsletters/newsletters.2002/02sept/02sept_preconference.html ARTICLE IN PRESS Parks / Serials Review xx (2005) xxx–xxx excellent research and summarization of various approaches included.4 The bsuper-recordQ operates to a great extent as a FRBR work record in ways that make far more sense in a serials context than an authority record does. The best part of it is that most of the relationship links needed to support such an entity already exist in serials bibliographic records, which suggests that much of the work in creating these records, at least at first, could be done algorithmically. 
There are still a lot of critical questions to be answered, primarily concerning how these records fit into our current bibliographic uni- verse, how should they be distributed and maintained, etc. FR: Again, we are delighted that OCLC is also interested in the bsuper-record.Q The bsuper-recordQ actually stems from a concept first encountered in an article by Melissa Bernhardt (Beck) in Cataloging & Classification Quarterly in 1988.5 The article suggested utilizing the encoded control numbers within 780 and 785 linking fields in searches to create a tree display of related serial titles. Though the article did not discuss holdings in detail, it did suggest that some local holdings information be displayed along with the tree. When the Task Force on the Uses of a Publication History Record, chaired by Diane Hillmann, took up this idea, we used Melissa BeckTs concept along with Rahmatollah FattahiTs terminology of a bsuperQ work6 to collocate the related titles for successive entries (780/ 785) and simultaneous versions (776). The record might be a virtual record created on the fly by a looping search of the appropriate linking field control numbers and titles on each record, continuing until a match was found, and displaying the results in a variety of ways including graphical displays. Most important for the Initiative was the provision that the publication history record for all successive titles-a MFHD record showing a bperfectQ or complete set of volumes and issues-be constructed and displayed as a unit for each format. The concept might be further adapted to local holdings. More elaborate ideas, suggesting some different and more exhaustive ways of attaining this kind of collocation, are coming out of the FRBR task groups as they tackle serial relationships in their discussions. JW: Diane, you were one of the invited speakers for the 2005 ALA Midwinter Symposium on bCodified Inno- vations: Data Standards and Their Useful Applications,Q which brings together collective efforts from systems vendors, standards representatives, and librarians. What specific standards were discussed? What roles does each constituent play in implementing the standards? DH: I talked about some of the work weTve been discussing, and, in addition, there were discussions of ISSN (and other identification standards, as well as OpenURL), standards relevant to electronic resource management, ONIX, ISTC and dispatch data used by serials vendors and publishers. 6 JW: Frieda, you gave a workshop titled bDo Holdings Have a Future?Q several years ago.7 What is the future for holdings in your view? FR: I think it is only being realistic for even a die-hard cheerleader for holdings to say that once all present and past serial literature is digitized and readily available online at the issue and article level, local holdings-and surely the local catalog as well-will be only relics, replaced by newer systems of organization of informa- tion. Both digitization and the user flight from printed resources are already starting, but they are still gradual processes and reserved for institutions and libraries in the parts of the world that can afford the increased cost of digital materials. For a long time to come there wonTt be digital access to everything or the access wonTt be universal. If we abandon our stored treasury of information instead of finding ways to make it more accessible, we wonTt be fulfilling the libraryTs mission. JW: Is there anything else that I havenTt asked, but you would like to add? 
DH: I think it's important to stress how the work above fits into the larger picture. Libraries have an enviable tradition of metadata sharing, supported by a strong infrastructure. Building on that base, and moving, as libraries have always done, from the monographic to the serial (and beyond), I think we'll start to see the same kinds of standardization and normalization that we saw in the early days, as shared bibliographic data became the norm in libraries. CONSER was in the forefront of those efforts and continues to provide important leadership now. I remember well the grousing and grumbling of that era, as we moved towards a common understanding of our goals and realized some truly astounding efficiency in the process. We take all that for granted now, so these efforts to expand on that success seem new and different. We somehow need to reassert what we already know to be true: shared data built on standards is cheaper, better, and the only way to go!

FR: I'd like to expand on something related to your second question. That is the increased importance of local item information to online remote searching. Item information conveys the physical (or conceivably virtual) unit in which the sought piece is available. This information is being created separately from holdings and stored in many proprietary formats as textual strings. Transactional information, also proprietary, is added to the items to reveal the status of an item at a particular time. Communication and migration of this information is often problematic. I think that in an ideal library system, the summary holding, physical item information, and uncompressed issue information would be a view of one file obtained through automatic compression and expansion. That may no longer be possible. But since remote communication of information at the more granular level, along with its status, has proven important, what can we do to standardize it within library systems? . . . And I'd like to thank you for some interesting questions.

Notes

1. Frieda Rosenberg, "Do Holdings Have a Future?" Serials Librarian 36, no. 3–4 (1999): 529–539.
2. CONSER Publication Pattern Initiative, http://www.loc.gov/acq/conser/patthold.html (accessed February 14, 2005).
3. NASIG Newsletter 17, no. 3 (2002), http://www.nasig.org/newsletters/newsletters.2002/02sept/02sept_preconference.html (accessed February 14, 2005).
4. Kristin Antelman, "Identifying the Serial Work as a Bibliographic Entity," Library Resources & Technical Services 48, no. 4 (2004): 238.
5. Melissa Bernhardt, "Dealing with Serial Title Changes: Some Theoretical and Practical Considerations," Cataloging & Classification Quarterly 9, no. 2 (1988): 25–39.
6. Rahmatollah Fattahi, "Super Records: An Approach Towards the Description of Works Appearing in Various Manifestations," Library Review 45, no. 4 (1996): 19–29.
7. Rosenberg, "Do Holdings Have a Future?" Presentation at the 13th Annual North American Serials Interest Group Conference, Boulder, Colorado, June 18–21, 1998.

work_6x2ffmwjw5gaflcrumbtiz37ui ----
Liber Quarterly, Vol. 27, no. 1 (2017) 171–193 | e-ISSN: 2213-056X | DOI: 10.18352/lq.10211
This work is licensed under a Creative Commons Attribution 4.0 International License. Uopen Journals | http://liberquarterly.eu/

Defining National Solutions for Managing Book Collections and Improving Digital Access

Neil Grindley, Head of Resource Discovery, Jisc, Neil.Grindley@jisc.ac.uk, orcid.org/0000-0001-9808-3032
Paola Marchionni, Head of Digital Resources for Teaching, Learning and Research, Jisc, Paola.Marchionni@jisc.ac.uk, orcid.org/0000-0002-9544-5410

Abstract

In 2013 the National Monographs Strategy (NMS) project in the UK explored the potential for a national approach to the collection, preservation, supply and digitisation of scholarly monographs. The resulting NMS Roadmap recommended seven components, believed to be critical for the provision of a national monograph infrastructure. This paper discusses how Jisc prioritised three of the recommendations and started planning for the development of a National Bibliographic Knowledgebase (NBK) in association with key stakeholders representing UK Higher Education and the British Library. In parallel, Jisc also explored recommendations around a national digitisation strategy and national licensing approaches by establishing a 'Digital Access' strand of activities.

Key Words: monographs; collection management; books; ebooks; metadata; bibliographic data; digitisation; licensing; digital surrogates

1. Introduction

The work set out in this paper refers to activity over the last 2–3 years and sets out forward plans for building a new national service and designing digital access strategies. The organisation that is leading and managing the work is the UK charity, Jisc, which provides digital solutions for UK education and research [1]. The scope of the work originates from a Jisc-led initiative called The National Monograph Strategy (NMS), which convened a large group of relevant stakeholders from libraries and academia to examine and formulate recommendations for the UK academic sector in relation to monographs [2]. The NMS Roadmap (Showers, 2014) was published in September 2014 and described seven components that the group believed were critical for the provision of a national monograph infrastructure. They were as follows:

1. A national monograph knowledgebase
2. A national digitisation strategy
3. A 'systemic changes think tank' group
4. New business models for monograph publishing
5. A negotiated national licence for access to digital scholarly monographs
6. A shared monograph publishing platform
7. An impact metrics framework to demonstrate the value of monographs

One of the challenges of convening the original NMS discussion and of following it up is that it is very difficult to constrain a conversation around the concept of 'the monograph.' From whatever point you begin, the discussion quickly expands to include a very wide array of issues, including library collection management, metadata quality, diversity of formats, availability of digital surrogates, publishing processes and platforms, appropriate infrastructures, governance, trust, and so on and so forth.
However, from the present vantage point it is now much clearer to Jisc that some of the seven recommendations were a higher priority than others, and there is now much more clarity about what specific actions should be taken. This is partly due to an inevitable evolution over time in the user community's requirements; but it is also due to the work that has been done to turn a set of conceptual recommendations into an actionable plan that has solid stakeholder support and is able to justify the required levels of investment.

On consideration of an NMS phase 2 work plan, Jisc took an early decision to deprioritise two of the seven recommendations. The first of these was (3) the 'systemic changes think tank,' on the basis that such strategic thinking would take place in a devolved way across various stakeholder and governance groups as a matter of course. The other one was (7) the 'impact metrics framework,' due to the fact that other entities or collaborative partnerships (particularly those with better incentives and/or a clearer mandate to quantify the value and impact of research monograph publishing) would be better placed than Jisc to lead on such a topic [3]. Further discussion across teams at Jisc and ongoing work (Collins & Molloy, 2016) clarified that (4) 'New business models' could be subsumed into the Open Access Monograph work that was being taken forward by Jisc Collections in collaboration with initiatives such as OAPEN [4] and Knowledge Unlatched [5]. It was also established that (6) the 'shared monograph publishing platform' was still at an early conceptual phase and could be separated off and managed as an R&D initiative by staff in Jisc Futures [6].

2. Defining the Problem Statements

Despite shrinking the actionable NMS recommendations from seven down to five and then devolving responsibility for two more, defining the remaining activities into coherent strands of work was still a significant challenge. The problem statements that were eventually alighted upon were the product of much additional discussion between Jisc and relevant stakeholder groups throughout 2015. In support of these discussions, Jisc commissioned a consultation and path-finding report to consider the future of bibliographic data services in the UK. The resulting report, the Bibliographic Services Implications Study (Hammond, Kay, Schonfeld, & Stephens, 2015), contained an influential set of recommendations which crucially elicited the support of the leading academic library membership groups in the UK (RLUK [7] and SCONUL [8]). Building on this report and the broader consultation, it was feasible by early 2016 to present the required solutions as a response to two prioritised and broadly agreed assertions.

A: Libraries want to make data-driven decisions about the management of their print and digital book collections, but the data that is currently available does not allow them to do this with confidence.

B: Libraries want to ensure that researchers and learners have sustainable and convenient access to digital books, but it is currently not obvious what is available or what could readily be made available.

Boiling down the challenges into two broad categories was extremely useful in terms of being able to divide up responsibility and allow tangible progress to be made.
In general terms, statement A focuses on issues to do with metadata, metadata quality and the aggregation of metadata on a large scale; and statement B focuses on content and access to that content. A two-pronged strategy was agreed whereby the metadata issue would be addressed by the specification and development of a new service, the National Bibliographic Knowledgebase (NBK), and the content issues would be tackled by a series of actions characterised as addressing 'Digital Access.' The rest of this paper sets out what has been taken forward in those two areas of work and provides an update on the progress that has been made since the presentation at the LIBER conference in early July 2016.

3. The National Bibliographic Knowledgebase (NBK)

The Bibliographic Services Implications Study (Hammond et al., 2015) was published in September 2015 and set out much of the strategic and tactical framework for going into a new phase of Jisc service provision in the area of bibliographic data. A summary of the most pertinent recommendations from the study is as follows [9]:

1. The UK has a fundamental need for a new national-scale service to drive a range of required functions
2. The new service should consist of an aggregated database, and its management should be outsourced to an organisation that is capable of delivering the service as core business at scale
3. The primary focus of future effort should be on supporting UK academic libraries with collections management. Resource discovery and records delivery are of secondary importance
4. The data contributed to the new system must remain shareable and reusable by all contributing organisations and by other relevant organisations that support discovery and records delivery
5. The route to greater impact for contributed library data is through exposure to global search engines and other high-impact web-scale channels rather than through reliance on Jisc-funded discovery interfaces
6. The new system should combine knowledge about both print and digital publications for services to be efficient and effective

As of early 2017, it is now possible to describe progress against all of these objectives and to set out the ambitious plans that Jisc and its strategic partners have put in train to address what was set out in the National Monograph Strategy.

3.1. A New National Scale Service

The NBK will be a new service but will replicate components of existing services. It will supersede the current Copac and SUNCAT services that Jisc provides and will work at much greater scale, with more diverse data sources, more functionality and greater flexibility. The goal of the NBK is to help transform how libraries manage their collections, provide access to resources and collaborate with each other. It will provide a sustainable, fit-for-purpose, next-generation national data infrastructure that practically supports libraries in making the transition from a print-first to a digital-first paradigm.

It will build on and surpass the functionality of Copac, which is the nearest current equivalent service that Jisc provides and which currently aggregates data from around 90 libraries. The NBK will, over time, include catalogue data from more than 225 academic and specialist libraries, and by doing so it will more effectively support the management of library collections so that they are optimized for contemporary research and learning needs. By drilling down into the 'long tail' of holdings across the UK it will support the formulation of a more joined-up national strategy around the retention of print materials. It will aggregate bibliographic data with availability and usage data and will facilitate more efficient access to eBooks, digitised books and journals.
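One way to picture the 'long tail' use case is a toy aggregation exercise: given holdings contributed by many libraries, keyed on de-duplicated master records, find the titles held by only one institution, since these are the candidates that need most care in any national retention strategy. The holdings data and key format below are invented for illustration; they are not the NBK's actual data model.

```python
from collections import defaultdict

# Invented aggregated holdings: (contributing library, master-record key).
HOLDINGS = [
    ("lib-durham",  "isbn:9780000000002"),
    ("lib-york",    "isbn:9780000000002"),
    ("lib-glasgow", "isbn:9780000000019"),
    ("lib-york",    "isbn:9780000000026"),
]

def last_copies(holdings: list[tuple[str, str]]) -> dict[str, str]:
    """Return master-record keys held by exactly one library,
    mapped to that sole holding library."""
    libraries_by_key: dict[str, set[str]] = defaultdict(set)
    for library, key in holdings:
        libraries_by_key[key].add(library)
    return {key: next(iter(libs))
            for key, libs in libraries_by_key.items()
            if len(libs) == 1}

print(last_copies(HOLDINGS))
# {'isbn:9780000000019': 'lib-glasgow', 'isbn:9780000000026': 'lib-york'}
```

At national scale the hard part is the keying itself: the same book arrives from different libraries under variant records, which is why de-duplicated master records are central to the design.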
Another high-level objective of the NBK is to make a positive contribution to the overall quality of data that circulates around what might be referred to as a bibliographic data 'ecosystem.' The NBK will act as a positive agent of change in relation to the accuracy and effectiveness of metadata; the standards that are adhered to and promoted across the sector; and the development of a national approach to the use of authority controls and identifier frameworks in relation to bibliographic resources. There would seem to be broad agreement that the library sector probably needs to move on from a focus on MARC format data and legacy workflows; but there is also acknowledgement from bodies seeking to be progressive that practice within libraries is very slow to change.

"Librarians used to working with full MARC records may not easily grasp that a move to the more atomic level of individual statements will make possible innovation in areas like new services, localization, and distributed data improvement. Outside of libraries, these activities are building and taking shape, but most librarians aren't yet monitoring those activities, mostly because they have yet to appreciate the connection with the library world." (NISO, 2014)

The NBK will also support and facilitate the most unhindered flow of data possible in order to maximise the prospects of users encountering data that will lead them to library resources wherever they may be looking for it. This may be via a Google search; or within a commercial discovery system environment; or via another specialist library aggregator system.

3.2. Outsourced Service Management

Following an extensive procurement and competitive dialogue process, Jisc selected OCLC as their service provider and partner to build and deliver the NBK. OCLC are uniquely positioned to make library data globally available via their WorldCat service and to connect library data hubs at scale. They are a known quantity and already provide national and regional bibliographic infrastructure in a number of countries, including Australia, France, Germany, Switzerland and the Netherlands. They have also worked collaboratively with RLUK libraries in the UK to undertake analysis to explore the concept of the 'Collective Collection' (Malpas & Lavoie, 2016). Jisc has entered into a multi-year agreement with OCLC to work closely together to develop the solution that UK libraries need. Jisc will ensure that the service is owned and controlled by the community of libraries that contribute data to the aggregation and will share data management responsibility on the OCLC-provided CBS platform (Central Bibliographic System).
3.3. Focus on Collection Management Functionality

Collection management has emerged as a much higher priority for libraries over the past 3 or 4 years, and this has largely been driven by the need for universities to focus closely on managing the space they have available for learning and teaching purposes [10]. Libraries are now carefully considering their print monograph holdings in the same way that they do for journals, which have been under scrutiny for some years, and there is an urgent and well-defined requirement to provide an authoritative source of data that will support library decision-making to transfer, relegate or withdraw titles.

Jisc currently offers the Copac Collection Management Tool to UK HEIs (higher education institutions) and it supports a number of use cases (Jisc, n.d. b):

• Identifying last copies among titles considered for withdrawal
• Identifying collection strengths
• Deciding whether to conserve a book
• Reviewing a collection at the shelves
• Prioritising a collection or item(s) for digitisation
• Subject search – collection development and marketing

The NBK will replicate or exceed the functionality of the Copac Collections Management (CCM) tool, either by adopting the CCM toolset and integrating it with CBS, or by developing a native CBS tool.

3.4. Reusability of Data

The data that finds its way into the NBK will be managed and licensed so that wherever possible it will be available for discovery and re-use by other systems. As well as the CBS metadata management system, OCLC will provide the CBS publishing platform, which will contain the enriched, de-duplicated master records. This will provide a mechanism for the syndication of data, either freely or according to a fee model depending on circumstances and sustainability requirements.

Reuse of data will be provided via batch export of files on request; a scheduled 'push' mechanism; or a 'pull' service using OAI harvesting or other API mechanisms. It will be possible to select data using the indexes that have been defined on the CBS platform and the log files that are maintained for database updates. Common selections will be per library, group of libraries, dates, material types and subjects. For the creation, maintenance, scheduling and monitoring of export jobs, a web-based application will be used called CJM (CBS Jobs Management). With CJM, Jisc staff will be able to create, maintain and schedule jobs. Exports that need to be produced regularly can be automatically launched daily, weekly, or at other frequencies.

3.5. Exposure of Library Data to Global Search Engines

One of the gains that UK libraries can expect from a Jisc/OCLC collaboration is that NBK data will be visible in WorldCat. Through WorldCat, NBK data will gain greater visibility in OCLC discovery applications and in third-party applications which take part in the OCLC Web Syndication program. If, for any reason, libraries would prefer their data not to be published in WorldCat, then it will be possible to exclude records from that synchronisation process. In addition to OCLC Web Syndication, the CBS publishing platform enables website creation specifically for search engine crawlers that will give access to all records that should be made available for search engine harvesting. The data would be represented using schema.org mark-up, which is usually preferred by any such web service. In this special web site, filters for available records and represented data elements can be applied.
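As a hedged illustration of what crawler-facing schema.org mark-up for one aggregated book record might look like, the sketch below builds a JSON-LD description using standard schema.org Book properties; the identifiers, URLs and values are invented, and the NBK's actual output may differ.

```python
import json

# Invented record URI and values; only the schema.org property names
# (Book, Person, name, author, isbn, workExample, bookFormat) are standard.
record = {
    "@context": "https://schema.org",
    "@type": "Book",
    "@id": "https://example.org/nbk/record/123456",
    "name": "An Example Monograph",
    "author": {"@type": "Person", "name": "Jane Author"},
    "datePublished": "1998",
    "isbn": "9780000000002",
    "workExample": {
        "@type": "Book",
        "bookFormat": "https://schema.org/EBook",
        "url": "https://example.org/nbk/record/123456/ebook",
    },
}

# Serialized, this could be embedded in the crawler-facing page as a
# <script type="application/ld+json"> block.
print(json.dumps(record, indent=2))
```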
Data that is syndicated through WorldCat will be presented in web services (such as Google, Bing, Wikipedia, etc.) in a way that allows linking back to WorldCat. The user will be directed from the initial web site to the respective WorldCat entry for that title or author. The authority data will also be syndicated (the Wikipedia pages of many authors contain the VIAF number, which also makes it available on the Google knowledge card).

3.6. Knowledge about Print and Digital Publications

OCLC will load eBook vendor collections onto the NBK, and the expectation is that these will be regular and automated, using data feeds from eBook vendors according to agreements that are in development by Jisc in association with libraries and third-party organisations. The solution will be extended to eBook data that comes directly from libraries, thereby providing a platform for community-supported management of shared collections.

CBS functionality will include the ability to regularly test the availability of an eBook, identify 'broken' links, and log the results. Analysis of the causes of broken links will be undertaken, and batch change functions will be applied where it is possible to make corrections. CBS supports dynamic FRBR clustering, and this could be used for the creation of FRBR work records.
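The availability testing described above amounts, at its simplest, to periodically issuing lightweight HTTP requests against each stored access URL and logging the failures for analysis. The sketch below, using only the Python standard library, is our own minimal illustration, not the CBS implementation; the URLs are invented.

```python
import logging
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

# Invented sample of eBook access URLs drawn from aggregated records.
EBOOK_URLS = [
    "https://example.org/ebook/9780000000002",
    "https://example.org/ebook/9780000000019",
]

def link_is_live(url: str, timeout: float = 10.0) -> bool:
    """HEAD-request the URL; log and return False for broken links."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout):
            return True          # urlopen raises on 4xx/5xx responses
    except (HTTPError, URLError, TimeoutError) as exc:
        logging.warning("broken link %s (%s)", url, exc)
        return False

broken = [url for url in EBOOK_URLS if not link_is_live(url)]
logging.info("%d of %d links broken", len(broken), len(EBOOK_URLS))
```

A production checker would also need politeness delays, retries, and handling for publishers that block HEAD requests, which is one reason analysing the causes of failures matters.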
The NBK will seek to connect with international sources of eBook metadata/content wherever possible and integrate them as data sources. For example, both Jisc and OCLC have strong connections with HathiTrust, and the NBK will incorporate data and links to content from their openly available files.

The last of these six objectives is a critical goal for the NBK and is one of the key differentiators between the capability that current systems provide and the functionality that the NBK aims to deliver. Copac is primarily geared towards searching library holdings records to identify the location of print materials in libraries. The NBK intends to extend and expand that scope by sharpening the focus on the availability of e-resources, wherever versions and copies may be available for use in the UK. This whole area of work was the subject of a parallel activity within Jisc as the specification for the NBK was being considered and assembled. The next section describes the complementary analysis that was undertaken during 2015/2016 to identify and analyse institutional requirements for eBook resources and to get a clearer picture of demand and supply issues.

4. Digital Access

Over the last year or so, the Digital access strand of activities [11] has progressed in tandem with the development of the NBK, addressing two of the recommendations in the original NMS report: a digitisation strategy to support the building of a national digital research collection, and a national licence to support access to digital monographs through a negotiated national agreement. A key driver for both recommendations was the ambition to increase access to monographs in digital form, i.e. monographs that are not already currently available digitally, for the benefit of academics and students, and also to enable collections managers to make more informed collection management decisions based on information on what is available where, and in what format. Another aim the two recommendations shared was to ensure that any strategic approach to increasing access to monographs was evidence-based, and founded on analysis of requirements from institutions and their patrons.

As both recommendations focused on increasing digital access, the scope of our work was from the beginning focused on print collections, rather than born-digital monographs, and on "monographs" rather than textbooks. It soon became apparent, however, that the term "monographs" was too narrow for the purpose of our work at this initial stage, and also that the borderline between what constitutes a monograph or an academic book and a textbook is not always clear and self-evident. There is more on this below.

4.1. Engaging with the Community

If we were to tackle recommendations about a digitisation and a licensing strategy, the first questions the team was confronted with were: what should a strategy achieve? Where should we start from? What kind of titles should a digitisation and/or a licensing strategy focus on? What would be useful criteria to consider? Should we privilege Public Domain material? Make use of hard-won copyright exceptions such as the one on Orphan Works [12]? Prioritise out-of-commerce books? What about books still in copyright, given the difficulties of clearing rights for 20th- and 21st-century publications (Freire, Scipione, Muhr, & Juffinger, 2013)? What would a business model and a service model to support digitisation and licensing at scale look like? How should we balance the immediate needs of practitioners within libraries in satisfying the day-to-day demand of academics and students with more strategic and ambitious aspirations of the HE community? We needed to start with identifying the use cases, who needed what and why, and drill further into the problems that a digitisation and a licensing strategy needed to address.

We embarked on a series of preliminary informal conversations with librarians and collection managers from a range of different universities, who provided us with some initial insights into their problems with regard to the day-to-day management of book collections and the provision of digital resources to their users. We found that key issues were space and the need to weed collections based on usage levels. This is certainly not a new issue, as libraries have been struggling with the management of high-volume, low-usage books for years (Bracegirdle, 2012). Satisfying readers' growing expectations on quick and easy access to (digital) resources was also a recurrent concern. In the words of some of our interviewees, "it is an important issue, as being able to provide access to digital copies of content for reading lists is a key priority." and "… so much of our purchasing is based around reading lists.
So if we cannot provide a book digitally, then potentially a large cohort of students will be unable to access it in that format."

Many institutions have a "digital first" policy when acquiring new books, although print is still purchased when no suitable alternative is available. Time and budget are also major constraints, and book availability information is hard to find.

This preliminary research informed the design of what we called the Digital access pilots project. Following an open Call for Participation (CfP) [13], we enlisted 10 institutions, a mixture of research-intensive and specialist universities and teaching-and-learning-focused institutions, to help us define the problems and identify the type or categories of books that libraries most needed to provide access to in digital form, and why. At this stage, we didn't know whether demand might be primarily for out-of-copyright books, in-copyright, in- or out-of-commerce, or if there might be any publishers or subject disciplines emerging as most "in demand."

Guided by the informal conversations with senior librarians and practitioners, and discussions with the BIBDOG [14] group, we took the view that at this stage of the project we would have a more inclusive view of what constitutes a "monograph." The CfP adopted a broad definition:

At this stage we are adopting a broad definition in order to get more information about libraries' needs. We include most types of academic book but it must not be a core textbook. A core textbook is defined as something written specifically to serve the needs of students and lecturers following a course. There are no other restrictions since we would like to know what sort of books libraries would prioritise above others.

One of the key requirements for institutions to participate in this project was that they would provide us with a list of up to 100 titles of books each that the library had been requested to supply but which they couldn't fulfil, for whatever reason, so that we could have real data to work from. We supplied libraries with an initial template to gather the titles and assembled a list of over 1200 titles. However, it soon became apparent that there were going to be a number of challenges in analysing the data.

Libraries reported that the data they supplied for this pilot project had not, by necessity, been collected systematically; the bibliographic information wasn't always accurate in the way it was available to them, and a great deal of cleaning up and standardisation of the data had to be done by the team against the Nielsen BookData Online database (Nielsen, n.d.). This in itself posed its own challenges, as we found that the bibliographic information on the database was not always reliable or expressed in the way that was useful for our research.
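One small, concrete piece of the kind of standardisation involved is bringing every supplied ISBN onto a single canonical form before matching against an authority source. The sketch below is our own hedged illustration of that step, not the project's actual code; the sample ISBN is arbitrary.

```python
def isbn10_to_13(isbn10: str) -> str:
    """Normalise a (possibly hyphenated) ISBN-10 to its ISBN-13 form
    so that titles can be matched on one consistent key."""
    chars = [c for c in isbn10 if c.isdigit() or c in "xX"]
    core = "978" + "".join(chars[:9])      # drop the old check digit
    weights = (1, 3) * 6                   # ISBN-13 alternating weights
    total = sum(int(d) * w for d, w in zip(core, weights))
    check = (10 - total % 10) % 10
    return core + str(check)

print(isbn10_to_13("0-8389-0000-0"))       # -> 9780838900000
```

In practice the team would also have had to validate check digits, reconcile multiple editions and formats carrying different ISBNs, and fall back to fuzzy title/author matching where no identifier was supplied at all.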
This resulted in us working off two sets of data, a larger one (N=1216), which was the aggregation of titles origi- nally provided by the ten libraries, and a slightly smaller and “cleaner” one (N=1117), where we removed about 100 titles which we recognised as most cer- tainly textbooks – there may have been more.15 Despite the two data sets, how- ever, the overall picture of results did not change by any significant amount. 4.2. What we Found: A Bird’s Eye View A health warning before delving into the data: this is a sample of data gath- ered from 10 institutions and therefore is likely to contain a certain degree of bias and may not be representative. In addition, although best effort was applied to the checking and cleaning up of the bibliographic metadata through the Nielsen database, there may still be inconsistencies where infor- mation is simply not available or in cases where titles have changed their availability status over time. Our objective, however, was to look for high level patterns, rather than achieve complete accuracy, to see if there were some key messages emerging, and we feel that the project achieved that. 4.2.1. Main Availability Problems We asked libraries to tell us the main problems they had experienced in accessing the titles they submitted to us based on the following categories that we provided, as shown in Figure 1: Neil Grindley and Paola Marchionni Liber Quarterly Volume 27 Issue 1 2017 183 Fig. 1: Library categorised titles (N=1216). Key Package only – the title is only available as e-book as part of a larger package No instit – no institutional licence available Price – the price is too high OoP & Cpyrt – out of print and copyright Format – the type of digital format (pdf, epub, etc.) is unsuitable OoP, in cpyrt – out of print and in copyright >chapter – more than one chapter is needed from this book Outside CLA – the title is outside the current CLA licence Not available – no e-book is available at all Other – a different problem to the ones listed Not in UK – there is no e-book available in the UK By far the biggest problem was that the titles libraries wanted as an e-book were simply not available—or at least this was the libraries’ perception based on the availability information they had access to, typically through a select number of aggregators or book vendors they use as suppliers. This is an important point to note. As it turned out at the time of writing during phase Defining National Solutions for Managing Book Collections and Improving Digital Access 184 Liber Quarterly Volume 27 Issue 1 2017 2 of this study16, a number of titles may in fact be available as ebooks, for example as different editions. However, based on the information they had, as far as the librarians were concerned, they did not have knowledge of, or could access, those titles in a digital format. This resonates with other recent surveys on key concerns librarians have in relation to publishers’ provision of books/ebooks such as around publishers’ pricing strategy and libraries’ budgets, licensing models and accessibility (Folan & Grace, 2017). The other problems, although each much smaller, did account together for a large portion of the data, as the pie chart in Figure 2 below illustrates: 4.2.2. Reasons and Mode of Access When we asked libraries to tell us the reasons for wanting a given book, this was mostly to fulfil reading list requests (80%), as shown in Figure 3, while 17% were for research, and 3% for preservation purposes, accessibility or other purposes. 
Libraries also told us that they require a certain degree of scale in the number of concurrent users, as shown in Figure 4, with the great majority of titles requested being for between 5–50 concurrent users and greater than 50. Libraries also require flexibility with remote access from out of campus, within and outside of the UK. Two-thirds of the titles were needed for remote access in the UK and the rest of the world, and one third for remote access in the UK only.

Fig. 4: User access (N=1216).

The project's final report identifies a number of scenarios based on real library workflows, typical of situations that libraries find themselves in when trying to satisfy requests for books for reading list purposes or research projects, and the kind of barriers they come up against, as highlighted in Figure 1.

4.2.3. Availability and Status of Titles

We conducted further analysis on the smaller, "cleaner" data set (N=1117) to gain a better understanding of the availability and copyright status of the books and to see whether there were any publishers that might emerge as dominant. The chart in Figure 5 shows the distribution of titles by decade of publication, cross-analysed with the availability status.

Fig. 5: Titles by date and availability, excluding textbooks (N=1117).

One of the most interesting findings was that there was hardly any overlap of titles requested by the institutions: only six were requested by two institutions, and one by three institutions. However, as this was a skewed sample, a larger aggregation of titles at national level might reveal a different picture. Another finding that we hadn't anticipated was that most of the requested books were published from the 1960s onwards and therefore likely to still be in copyright; hardly any public domain book was included. Having checked the titles against the Nielsen database, we were also able to estimate with some degree of confidence whether the books were available as in-print only (no ebooks), out of commerce (i.e. no record or information existed in the Nielsen database), or were indeed "available as eBook, but" not in a way that was useful to libraries. Again, it is worth remembering that there may have been different editions/ISBNs of these titles available as ebooks, but based on the availability data that the libraries could check against, they didn't seem to be available.

Finally, we looked at the breakdown of publishers, as exemplified in Figure 6. Our estimate (after checking imprints and sales of companies as far as possible) is that the 1117 titles were published by 291 different publishers. A large proportion of just over 41% is accounted for by the top 10 publishers, each with more than 20 titles. Within the top 10, Taylor & Francis Ltd. and Penguin Random House accounted for 150 of the titles, 32%. A further 165 titles, 15%, can be accounted for by the next 13 publishers, each holding more than 10 titles per company. The remaining 268 publishers had ten or fewer titles each, 44%. Within this long tail of 268 publishers, 185 companies held one title only, and one was untraceable.

Fig. 6: Publishers breakdown (N=1117).
4.3. High Level Requirements

Despite the fact that libraries are on the whole able to satisfy the majority of book requests, the participating institutions stressed how critical it is for them to provide access to digital versions of books to ensure they meet the requirements and expectations of their staff and students, in particular for reading list purposes. One of our pilot institutions stressed how:

"The titles passed to you for investigation were the tip of the iceberg, sourced from reading lists submitted by our History Department. Doing a systematic trawl through reading lists from all departments would reveal a great many more titles where the demand for an e-book in recent years has gone unsatisfied."

In planning for a potential solution, or sets of solutions, to these problems, key requirements that have emerged from this project are the ability to:

1. aggregate at scale problem titles, mainly from libraries' reading lists
2. check the reliability of bibliographic data (publishers, ISBNs, etc.) from libraries against an authority source
3. obtain more permissive licences to produce digital versions of books to satisfy access needs
4. cater for a "long tail" pattern of requests from libraries
5. keep the cost of a digital copy to no more than a print copy, if available, max £100
6. deliver an on-demand service for digitisation/provision of digital copies. This seemed to be the most appropriate route to satisfy "just in time" requests, possibly through existing mechanisms such as the British Library document delivery service or services provided by universities that might have spare capacity in digitisation
7. create ebooks in an appropriate format: searchable PDF as a minimum, but EPUB or HTML5 are preferred. Accessibility for users with disabilities is still a big problem.

5. What we Learnt

When we embarked on the digital access project there were a number of unknowns in relation to the drivers for the demand for digital copies of books and the type of books in question. Some of our findings were not what we expected, and key learning points for us have been:

• for libraries, the highest priority is to resource reading list requests regardless of the type of book (monograph, novel, textbook, reference). We focused on the "monograph" in its broader sense ("academic" book as opposed to textbook), partly because of the nature of this work stemming from the NMS report, and partly because we anticipated the solutions to the problem being different for monographs/academic books and textbooks
• having access to reliable availability data with regard to which titles are available and in what format is a big challenge for libraries, coupled with information on who owns the rights to any work. Even finding the current publisher (as a proxy for rights holder) is not always straightforward, since availability fluctuates
• titles in demand tend to have been published in the last 20 years.
Demand from libraries and their patrons for e-books has increased, yet the largest category of problem was that no digital version was available, to the best of libraries' knowledge
• specialist institutions (particularly in the arts and humanities) seem to struggle more to fulfil reading list requests, probably due to the more niche type of publisher with whom these books are published
• even where e-books are available to libraries, they are frequently unsuitable to meet the needs of their patrons, are often too expensive, and follow unsatisfying licensing models

Some of the lessons learnt in relation to the NBK are very much tied in with the digital access work. As stated above, trying to distil the complexity of the issues down into manageable and actionable tasks has been a significant challenge, but it has become clear over time that it is possible and helpful to pursue the two strands of work in a semi-autonomous way. What is also clear, however, is that the NBK must not lose sight of the goal of better digital access in favour of concentrating on the monumental task of aggregating bibliographic data at the scale proposed. This is a phasing and planning issue for the implementation team but also an oversight and governance priority.

6. Next Steps

We have made substantial progress since the LIBER 2016 conference in taking forward the vision of the National Monographs Strategy roadmap and in response to the two problem statements on metadata and digital access. With regard to the digital access strand of the work, given that the great majority of "in demand" titles is strongly rooted in the in-copyright category (in or out of commerce), we are concentrating on exploring possible alternative licensing solutions with publishers that respond to the requirements of the Higher Education sector. At the time of writing, we have moved onto phase 2 of this work and are digging deeper into the findings of the first phase and consulting with publishers and relevant stakeholders in the UK HE library community. Early interim findings point to the complex issue of the need for libraries to have access to reliable availability data on what is available to them as ebooks, as the challenge of discovery, in this context, might create the perception of non-availability of digital copies when in fact they are available. Secondly, early conversations with publishers also point to the challenge of terminology in differentiating between monographs, academic books and textbooks, as these categorisations are not objective and fixed in time. A more useful criterion to adopt might be simply to refer to "reading list titles" rather than trying to classify them as academic books or textbooks. The second phase is due to terminate by July 2017 and we will disseminate the results as the work progresses.

At the same time, as Jisc and OCLC work together to build and deliver the NBK [17], we will ensure that the bibliographic records aggregated by this new service will, when possible, link to existing digitised copies of books which are available in the public domain, such as from repositories like HathiTrust.
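A hedged sketch of one way such links could be gathered: the HathiTrust Bibliographic API can be queried by standard identifier for the volumes attached to a record. The endpoint shape and response field names below follow our reading of that API's public documentation and should be verified before use; the OCLC number is a placeholder.

```python
import json
from urllib.request import urlopen

# Endpoint pattern per our reading of the HathiTrust Bibliographic API
# documentation (brief volume data, keyed by OCLC number); verify before use.
BIB_API = "https://catalog.hathitrust.org/api/volumes/brief/oclc/{}.json"

def open_copy_urls(oclc_number: str) -> list[str]:
    """Return URLs of volumes HathiTrust reports as full view,
    i.e. openly readable copies a record could link out to."""
    with urlopen(BIB_API.format(oclc_number)) as response:
        data = json.load(response)
    return [item["itemURL"]
            for item in data.get("items", [])
            if item.get("usRightsString") == "Full view"]

print(open_copy_urls("00000001"))   # placeholder OCLC number
```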
Building the NBK represents a substantial investment, and in the first instance Jisc will work with its funders and its strategic partners to commit the necessary resources to build the system and to establish workflows and processes. During this initial 'build' phase, discussions with the sector as represented by governance and user groups will ensure that the long-term sustainability model for the service is owned and supported by the UK stakeholder community. It is anticipated that a mix of core services and additional 'value-add' tools and service components will be built around the data, providing ways of designing and designating both cost recovery mechanisms and ways of generating income that can be re-invested into the service to keep it relevant and fit-for-purpose over time.

Specifying and procuring an NBK service delivery partner has been a lengthy and intensive process but really just represents the 'end of the beginning.' The practical work to ensure that the UK has the sort of infrastructure and capability originally set out in the National Monograph Strategy now begins in earnest.

References

Bracegirdle, S. (2012). Books to the ceiling, Books to the sky, My pile of books is a mile high. What do we do with all the unused books in the modern Research Library? Retrieved June 20, 2017, from https://rlukrrlm.wordpress.com/2012/10/26/unusedbooks/#_ftn1.

Collins, E., & Molloy, C. (2016). OAPEN-UK final report: A five year study into open access monograph publishing in the humanities and social sciences. Retrieved June 14, 2017, from http://oapen-uk.jiscebooks.org/files/2016/01/OAPEN-UK-final-report.pdf.

Folan, B., & Grace, C. (2017, May 19). Librarian messages to publishers. UKSG eNews 396, n.p. Retrieved June 20, 2017, from http://www.jisc-collections.ac.uk/UKSG/396/Librarian-messages-to-publishers/?n=121b8a3f-e721-4bc0-8354-7ad4d47f99de.

Freire, N., Scipione, G., Muhr, M., & Juffinger, A. (2013). Supporting rights clearance for digitisation projects with the ARROW service. LIBER Quarterly, 22(4), 265–284. https://doi.org/10.18352/lq.8101.

Hammond, M., Kay, D., Schonfeld, R., & Stephens, O. (2015). Bibliographic services implications study: Final report. Retrieved February 15, 2017, from http://repository.jisc.ac.uk/id/eprint/6550.

Jisc (n.d. a). Bibliographic data services and the national monograph strategy – Next steps. Retrieved February 15, 2017, from https://monographs.jiscinvolve.org/wp/files/2015/11/BiblioData_NMS_Next_Steps.pdf.

Jisc (n.d. b). CCM tools project: Use cases. Retrieved February 15, 2017, from https://ccm.copac.jisc.ac.uk/use-cases.

Malpas, C., & Lavoie, B. (2016). Strength in numbers: The Research Libraries UK (RLUK) collective collection. Dublin, Ohio: OCLC Research. Retrieved June 20, 2017, from http://www.oclc.org/content/dam/research/publications/2016/oclcresearch-strength-in-numbers-rluk-collective-collection-2016-a4.pdf.

Nielsen (n.d.). Nielsen book discovery – Providing comprehensive, enriched and timely bibliographic data. Retrieved February 15, 2017, from http://www.nielsenbookdata.co.uk/.
NISO (2014). Roadmap for the future of bibliographic exchange: Summary report. Retrieved June 20, 2017, from http://www.niso.org/apps/group_public/download.php/13327/NISO_14007BibliographicRoadmapDevelopmentDoc_FINAL4.pdf.

Showers, B. (2014). A national monograph strategy roadmap. London: Jisc. Retrieved February 15, 2017, from https://www.jisc.ac.uk/reports/a-national-monograph-strategy-roadmap.

Ward, V., & Colbron, K. (2016). Digital access solutions – Report on investigations for possible pilot studies. London: Jisc. Retrieved February 15, 2017, from http://repository.jisc.ac.uk/6562/.

Notes

1. Jisc website: https://www.jisc.ac.uk/.
2. https://monographs.jiscinvolve.org/wp/expert-advisory-panel-membership/.
3. It should be noted that various Jisc services (including the proposed National Bibliographic Knowledgebase) could play a significant future role in providing data or intelligence for monograph impact metrics, once a coordinated and coherent community approach has been proposed.
4. Open Access Publishing in European Networks: http://www.oapen.org/home.
5. Knowledge Unlatched: http://www.knowledgeunlatched.org/.
6. Jisc 'Futures' focuses on innovation and research & development: https://www.jisc.ac.uk/rd/how-we-innovate. The main thrust of the work described in this paper is the responsibility of the Jisc 'Digital Resources' directorate: https://www.jisc.ac.uk/content.
7. RLUK – Research Libraries UK: http://www.rluk.ac.uk/.
8. SCONUL – Society of College, National and University Libraries: http://www.sconul.ac.uk/.
9. A two-page summary of the report, including the 'Jisc response' and a 'next steps' section, is available at Jisc (n.d. a).
10. The SCONUL 2015 workshop, Space planning and the re-invention of the library, is instructive about the type of space planning that libraries are having to consider: https://www.sconul.ac.uk/sites/default/files/Spaceplanningandthere-inventionofthelibrary.pdf.
11. See the final report of phase one of the Digital access pilots (Ward & Colbron, 2016). Its accompanying data set is available at http://repository.jisc.ac.uk/6563/.
12. http://www.legislation.gov.uk/ukpga/1988/48/chapter/III/crossheading/orphan-works.
13. See the original Call for Participation at https://goo.gl/o8JznT. The ten participating institutions were: Durham University, Royal Conservatoire of Scotland, University of the Arts, University of East London, University of Glasgow, University of Manchester, University of Portsmouth, University of St Andrews, University of Sussex, University of York.
14. BIBDOG – The Bibliographic Data Oversight Group, convened by Jisc and consisting of representatives from Research Libraries UK (RLUK), the Society of College, National and University Libraries (SCONUL) and the British Library.
15. More information on this is contained in the Appendix of the final report, p. 32.
16. Following an open procurement process, Jisc has appointed Information Power (http://www.informationpower.co.uk/) to carry out phase 2 of this study.
17. See the Jisc-OCLC press release at https://www.jisc.ac.uk/news/new-uk-wide-service-will-transform-library-collaboration-03-feb-2017.
work_6xhtorsy7rgzjccq67vnpyu7ra ----

K. Shearer: Open is not enough! Sustainability, equality, and innovation in scholarly communication

Who is COAR?
• Over 100 members and partners from 35 countries in 5 continents
• Universities, libraries, government agencies, open access organizations, not-for-profit organizations, and platform developers
• Diverse perspectives that share a common vision

Contact Us
http://www.coar-repositories.org | Email: office@coar-repositories.org | Phone: +49 551 39 22215 | Fax: +49 551 39 5222 | Facebook: COAReV | Twitter: @COAR_eV

How to participate?
• Organizations can join COAR for €500 per year (about $600 US)
• Join as a single, consortial, or special member or partner
• Download the membership application (https://www.coar-repositories.org/about/join/become-a-member)

Major Activities
International voice: Raising the visibility of repository networks as key infrastructure for open science
Cultivating relationships: Supporting an international community of practice for repositories and open access
Building capacity: Advancing skills and competencies for repository and research data management
Alignment and interoperability: Building a global knowledge commons through harmonization of standards and practices
Adopting value-added services: Promoting the use of web-friendly technologies and new functionalities for repositories

Working for a sustainable, global knowledge commons based on a network of open access digital repositories

(1) Sustainability
Research, education and knowledge are critical for sustainable development. But our system for sharing and disseminating knowledge must also be sustainable.
The ridiculous $$$$ for scholarly journals. International journals: big deal lock-ins. (Slide from Stéphanie Gagnon, Université de Montréal Libraries, and thanks to Richard Dumont.)
Open access via Article Processing Charges? Jisc 2016: Average APC cost was about £1745 (~$2400 US). Published on May 9, 2016.

(2) Equality
Juan Pablo Alperin: http://jalperin.github.io/d3-cartogram/
Example: Chagas Disease. Number of publications: 3,011. Years: 2004–2013.
Example: Nepal. Nepalese research outputs, with major clusters. Image produced by Pitambar Gautam, Hokkaido University, Sapporo, Japan. Word maps created using VOSviewer, free software from Leiden University (van Eck & Waltman, 2010).
"Openness is not simply about gaining access to knowledge, but about the right to participate in the knowledge production process, driven by issues that are of local relevance, rather than research agendas set elsewhere or from the top down" – Leslie Chan

(3) Innovation
The application of better solutions that meet new requirements, unarticulated needs, or existing market needs.
350 years of the academic journal!
350 years of the journal, despite… Innovation in scholarly communication is stifled because of "perverse incentives": "The pressure to publish in 'luxury' journals encourages researchers to cut corners and pursue trendy fields of science instead of doing more important work." (Randy Schekman, University of California, Berkeley) The way we assess research contributions is too heavily dependent on publishing in the international journals.

The case of Chile
• Researchers that publish in a SciELO journal get 6 points towards promotion and tenure
• Researchers that publish in an "international journal" get 10 points towards promotion and tenure

The top five most prolific publishers account for more than 50% of all papers published in 2013. YES! > 1 billion EUR. Increasing horizontal and vertical integration; increasing publisher integration of the research lifecycle. [By Jeroen Bosman and Bianca Kramer, 101 Innovations in Scholarly Communication: https://101innovations.wordpress.com/workflows/] Example: Elsevier's services. Publishers are increasingly in control of scholarly infrastructure and why we should care: a case study of Elsevier, written by Alejandro Posada and George Chen, University of Toronto Scarborough.

Scholarly communications: strengthen and expand the institutional role in managing scholarly output. Our solution. Lorcan Dempsey (OCLC), 2012: Our environment has now changed. We live in an age of information abundance and transaction costs are reduced on the web. This makes the locally assembled collection less central. At the same time, institutions are generating new forms of data—research data, learning materials, preprints, videos, expertise profiles, etc.—which they wish to share with others.

Libraries as an open global platform: an idea that is not new, but whose time has come (MIT Future of Libraries Report, 2017). But… repository systems are using old technologies developed over 15 years ago that do not support the functionalities we need. And… in their current form, repositories only perpetuate the flawed system.

Next Generation Repositories Working Group (launched in April 2016): Eloy Rodrigues, chair (COAR, Portugal); Andrea Bollini (4Science, Italy); Alberto Cabezas (LA Referencia, Chile); Donatella Castelli (OpenAIRE/CNR, Italy); Les Carr (Southampton University, UK); Leslie Chan (University of Toronto at Scarborough, Canada); Chuck Humphrey (Portage, Canada); Rick Johnson (SHARE/University of Notre Dame, US); Petr Knoth (Open University, UK); Paolo Manghi (CNR, Italy); Lazarus Matizirofa (NRF, South Africa); Pandelis Perakakis (Open Scholar, Spain); Jochen Schirrwagen (University of Bielefeld, Germany); Daisy Selematsela (NRF, South Africa); Kathleen Shearer (COAR, Canada); Tim Smith (CERN, Switzerland); Herbert Van de Sompel (Los Alamos National Laboratory, US); Paul Walk (EDINA, UK); David Wilcox (Duraspace/Fedora, Canada); Kazu Yamaji (National Institute of Informatics, Japan). http://ngr.coar-repositories.org/

Vision: "to position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication, on top of which layers of value added services will be deployed, thereby transforming the system, making it more research-centric, open to and supportive of innovation, while also collectively managed by the scholarly community." http://ngr.coar-repositories.org/

Guiding principles
• Distribution of control
• Inclusiveness and diversity
• Public good
• Intelligent openness and accessibility
• Sustainability
• Interoperability
• Trust and privacy
Two critical aspects to this vision:
1. Common behaviors of repositories (interoperability)
2. Value added services on top of the resources in repositories

[By Petr Knoth, Open University, UK]

Key functionalities of a global repository-based network
• Preserves and provides access to a wide variety of research outputs
• Enables better discovery including batch, navigation and notification
• Will support research assessment including open peer review and standard usage metrics
• Provides the foundation for a transparent social network including annotation, notification feeds, and recommender systems

Beyond the journal: all valuable research contributions should be available and recognized. The NGR network enables open science! http://ngr.coar-repositories.org/

Next Generation Repositories: 11 behaviors
1. Exposing identifiers
2. Declaring licenses at the resource level
3. Discovery through navigation
4. Interacting with resources (annotation, commentary, and review)
5. Resource transfer
6. Batch discovery
7. Collecting and exposing activities
8. Identification of users
9. Authentication of users
10. Exposing standardized usage metrics
11. Preserving resources

Next Generation Repositories: technologies, standards and protocols
1. Activity Streams 2.0
2. COUNTER
3. Creative Commons licenses
4. ETag
5. HTTP Signatures
6. IPFS
7. IIIF – International Image Interoperability Framework
8. Linked Data Notifications
9. ORCID and other author IDs
10. OpenID Connect
11. ResourceSync
12. SUSHI
13. SWORD
14. Signposting
15. Sitemaps
16. Social network identities
17. Web Annotation Model and Protocol
18. WebID and WebID/TLS
19. WebSub
20. Webmention

• A snapshot of the current status of technology, standards and protocols available to support each behaviour.
• Focused on the generic technologies required by all repositories to support the adoption of common behaviours.

Implementation status: three key strategies
1. Implementing technologies and protocols into repository systems
2. Supporting the development of value added services
3. Ongoing monitoring of new technologies

(2) Research is global: we need interoperable hubs to support information exchange across repositories. Next generation repository networks or hubs: 14 repository networks meeting in Hamburg, May 14 & 15. (3) Monitoring of new technologies, standards and protocols: COAR Next Generation Repositories Editorial Group — Andrea Bollini, Kathleen Shearer, Rick Johnson, Herbert Van de Sompel, Paolo Manghi, Paul Walk, Petr Knoth, Kazu Yamaji, Eloy Rodrigues. (1) New technologies in repositories: already progress — many platforms are implementing our recommendations: OpenAIRE (Europe); National Institute of Informatics (NII), Japan; US Next Generation Repositories Implementers Group; CARL Open Repositories Working Group (Canada); meeting of open source repository platforms at Open Repositories 2018.

work_74ab2yhge5cjhe4dggsahlg5cy ----

310 American Archivist / Vol. 57 / Spring 1994

Research Article

The Epic Struggle: Subject Retrieval from Large Bibliographic Databases

HELEN R. TIBBO

Abstract: Archivists have talked at length about the virtue of contributing records to a national bibliographic utility to provide enhanced access to collections. There has been little discussion, however, of the difficulties of finding materials in such large database environments.
This article discusses a retrieval study that focused on collection-level archival records in the OCLC Online Union Catalog, made accessible through the EPIC search system. Data were also collected from the local OPAC at the University of North Carolina-Chapel Hill (UNC-CH) in which UNC-CH-produced OCLC records are loaded. The chief objective was to explore the retrieval environments in which a random sample of USMARC AMC records produced at UNC-Chapel Hill were found—specifically, to obtain a picture of the density of these databases in regard to each subject heading applied and, more generally, for each record. Key questions were (1) how many records would be retrieved for each subject heading attached to each of the records and (2) what was the nature of these subject headings vis-à-vis the number of hits associated with them. Findings show that large retrieval sets are a potential problem with national bibliographic utilities and that the local and national retrieval environments can vary greatly. The need for specificity in indexing is emphasized.

This article is based on a paper given at the Society of American Archivists' 1992 annual meeting in Montreal. OCLC supported this research. The author wishes to thank Patricia Haberkern, who did much of the searching.

About the author: Helen R. Tibbo is presently an assistant professor in the School of Information and Library Science at the University of North Carolina at Chapel Hill. She earned a B.A. in English from Bridgewater State College, an M.L.S. from Indiana University, an M.A. in American Studies from the University of Maryland, and a Ph.D. in Library and Information Science from Maryland as well. She teaches in the areas of reference, on-line information retrieval, and archival studies. Her primary research interests focus on optimizing information retrieval, particularly for information systems that support humanistic and archival research. She is a member of the Society of American Archivists, serving on its Editorial Board and as Chair of the Archival Educator's Roundtable, 1992-94.

ARCHIVISTS1 HAVE TALKED AT LENGTH about the virtue of contributing records to a national bibliographic utility such as the Online Computer Library Center (OCLC) or Research Libraries Information Network (RLIN) in order to enhance access to their collections.2 There has been little discussion, however, of the difficulties of finding materials in such large database environments.3 Ironically, electronic services such as OCLC and RLIN, which promise vastly improved access to archival collections on a nationwide or even international level

1. Archives and archivists are being used herein for convenience to indicate both institutional archives and manuscript repositories and archivists and manuscript curators, respectively, unless otherwise noted.
2. See for example David Bearman, "Archives and Manuscript Control with Bibliographic Utilities: Challenges and Opportunities," American Archivist 52 (Winter 1989): 26-39; David Bearman, Toward National Information Systems for Archives and Manuscript Repositories: The National Information Systems Task Force (NISTF) Papers, 1981-1984 (Chicago, Ill.: Society of American Archivists, 1987); Elaine D.
Engst, "Nationwide Access to Archival In- formation," Documentation Newsletter 10 (Spring 1984): 4-6; H. Thomas Hickerson, "Archival Information Exchange and the Role of Bibliographic Networks," Library Trends (Winter 1988): 553-71; H. Thomas Hickerson, "Expand Access to Archival Sources," Reference Librarian 13 (Fall 1985-Winter 1986): 195-99. James O'Toole has noted that "ar- chivists fulfill only half their responsibility to make records available if they sit and wait for users to come to them. Instead, archivists must be active in publi- cizing their holdings. This responsibility implies the necessity of sharing information about what is in each archives," Understanding Archives and Manuscripts (Chicago, 111.: Society of American Archivists, 1990), 67. 3Avra Michelson ("Description and Reference in the Age of Automation," American Archivist 50 [Spring 1987]: 192-203) has discussed the lack of consistency in archival descriptive practice, especially the assignment of subject headings for MARC AMC records and the implications for retrieval. Matthew Gilmore has noted that the requirement of most bib- liographic information systems to include at least one LCSH term in each MARC AMC record "means that archivists frequently must use a very general heading rather than the specific local thesauri," resulting in those materials "disappearing into a void." "Increas- ing Access to Archival Records in Library Online Public Access Catalogs," Library Trends 36 (Winter 1988): 610-11. over that possible in printed tools such as the National Union Catalog of Manuscript Materials (NUCMC), present enormous re- trieval problems themselves.4 As Lester Asheim has noted, "increasing the amount of information and speeding up access to it is more likely to result in information overload and entropy than it is to improve the receiver's ability to benefit from the in- formation."5 The user's goal is to find all relevant ma- terial and nothing more.6 As simple as this sounds, it is exceedingly difficult to accom- plish, whether the retrieval system is word of mouth, printed format, or an electronic database. As systems grow in size, com- plexity, and power, they become more in- clusive, but barriers to optimal retrieval effectiveness increase as well. This should not be surprising, as information retrieval power is never without its price. The larger and more heterogeneous the database, the more difficult it is to conduct subject or free-text searches effectively. Even known- item searches become slower and poten- tially more difficult as the search space in- creases. Lancaster and his associates observe that the on-line catalog has not improved sub- ject access but may have made the situation worse because it has led to the creation of much larger catalogs that represent the holdings of many libraries.7 Merging sev- eral catalogs into one, when each compo- nent catalog provides inadequate subject "Library of Congress, National Union Catalog of Manuscript Collections (Washington, D.C.: Library of Congress, 1962-). 'Lester Asheim, "Ortega Revisited," Library Quarterly 52 (July 1982): 215. 'Although it can be argued that a user might only want a subset of all potentially relevant materials, that subset becomes all the items that are situationally rel- evant for that particular individual at that time. T . W. Lancaster, Tschera H. Connell, Nancy Bishop, and Sherry McCowan, "Identifying Barriers to Effective Subject Access in Library Catalogs," Li- brary Resources and Technical Services 35 (October 1991): 388. 
access, exacerbates the problem, since the larger the catalog, the more discriminating must be the subject access points provided. In recent years, catalogs have grown much larger without any significant compensatory increase in their discriminating power. Chandra Prabha of OCLC calls large retrievals "a problem of the 1990s."8 She goes on to note that, "the problem of large retrievals is accentuated in an OPAC [on-line public access catalog] environment because a majority of users are occasional or casual users."9 With 30 million records, the OCLC Online Union Catalog (OUC) clearly poses a challenging retrieval environment. Representing archival collections so as to optimize subject retrieval from a large bibliographic utility such as OCLC can truly be an "epic" struggle.

Regardless of the type of material represented—be it books, serials, or archival collections—document retrieval in large bibliographic databases depends on well-constructed document representations or surrogates. The semantic condensation required to represent a 350-page book or a 50-box collection in a catalog entry, or an abstract, or even an archival inventory demands that more is left unsaid than recorded in these surrogates. In the process of semantic condensation, information is necessarily lost. This loss may seem unfortunate, but the remaining distillation, when well selected, becomes a more powerful retrieval tool than the full text of the original. A "good" surrogate eliminates "noisy" information that is found in all full texts and could cause an item to be retrieved when it should not be; a good surrogate also includes information that will facilitate its retrieval in response to appropriate queries.

It is the processor's job to create a surrogate, be it an archival finding aid or a USMARC AMC (Machine Readable Cataloging, Archives and Manuscript Control) record, that captures the most important material in the item represented in as succinct and specific a manner as possible. Of increasing importance in extremely large databases, the surrogate must not merely represent its parent document and/or collection, it must be able to distinguish it from a multitude of other very similar items.

The most subjective elements of MARC AMC records in bibliographic databases, yet certainly some of the most important regarding access, are the subject fields. Many of the other fields, such as collection title, extent, or location, are relatively straightforward.10 Collection titles can provide some manner of subject access, but for most researchers who want to find collections that contain materials related to a particular topic, a search of the 12 subject fields in a MARC AMC record will be appropriate.11

This article discusses a retrieval study that focused on collection-level archival records in the OCLC Online Union Catalog, made accessible through the EPIC search system. I also collected retrieval data from the local OPAC at the University of North Carolina-Chapel Hill (UNC-CH) in which OCLC records produced by UNC-CH are loaded. The chief objective was to explore the retrieval environments in which a random sample of MARC AMC records
produced at UNC-Chapel Hill were found—specifically, to obtain a picture of the density of these databases in regard to each subject heading applied and, more generally, for each record. Key questions were (1) how many records would be retrieved for each of the subject headings attached to each of the records and (2) what was the nature of these subject headings vis-à-vis the number of hits associated with them? I was particularly interested in seeing if the subject headings used at UNC-CH incurred an overwhelming number of postings in the national database and how this related to the number found in the UNC-CH OPAC. I also wanted to compare the number of postings for topical headings and personal names. This type of information is important in assessing how well a database is serving the research community because catalog persistence studies indicate that researchers, even in university settings, rarely are willing to look through hundreds of items in a catalog. Summarizing earlier OPAC studies, Ray Larson notes that users of on-line catalogs frequently find too many items or none at all.12 If subject headings applied to MARC AMC records incur hundreds of hits in OCLC, even if they work well in the contributing institution's local catalog, it is doubtful that researchers will find the records in the larger national bibliographic environment. To optimize the archival community's investment in providing national access to materials, archivists must explore these large retrieval environments and adjust cataloging and retrieval techniques appropriately.

The EPIC Service

OCLC's EPIC service is a commercially available interactive on-line searching service that provides access to several large databases.13 The database with which archivists are most concerned is the OCLC Online Union Catalog. If an archives sends MARC AMC records to OCLC, this is the database in which the records will appear. Currently, this database contains well over 30 million records representing information sources in a wide variety of materials and languages. It is growing at a rate of 2 million records per year, or 40,000 records per week. This is OCLC's original database, which library catalogers and interlibrary loan librarians have used for over 20 years for cooperative cataloging and for locating known items for interlibrary loan.

8. Chandra Prabha, "Managing Large Retrievals: A Problem of the 1990s?" in OPACs and Beyond, Proceedings of a Joint Meeting of the British Library, DBMIST, and OCLC, OCLC Online Computer Library Center, Inc., Dublin, Ohio, August 17-18, 1988 (Dublin, Ohio: OCLC, 1989), 33.
9. Prabha, "Managing Large Retrievals," 33-34.
10. Even with these fields there can be serious retrieval problems, as when institutions just use "Papers" as the full title for a collection.
11. For a detailed description of these fields, see Harriet Ostroff, "Subject Access to Archival and Manuscript Materials," American Archivist 53 (Winter 1990): 100-05. See also Online Computer Library Center, Archives and Manuscript Control Format, 2nd ed. with updates (Dublin, Ohio: OCLC, 1986).
12. Ray R. Larson, "Managing Information Overload in Online Catalog Subject Searching," Proceedings of the ASIS Annual Meeting, 1989 (Medford, N.J.: Learned Information, 1989), 129.
13. For more about EPIC, see Nita Dean, "EPIC: A New Frame of Reference for the OCLC Database," OCLC Newsletter (March-April 1991): 21; "The EPIC Service Is Introduced," OCLC Newsletter (January-February 1990): 10-16; and Laurie Whitcomb, "OCLC's EPIC System Offers a New Way to Search the OCLC Database," Online 14 (January 1990): 45-50.
The Library of Congress sends an average of 5,000 records per week to OCLC, with other OCLC member libraries contributing about 34,000.

Until the advent of the EPIC search service in 1990 and, more recently, FirstSearch,14 OCLC provided a search interface designed specifically for catalogers. The classic OCLC search protocol relies on the searcher having a book or other material in hand so that the author, title, publisher, and publishing date are known. The searcher enters parts of the title and the author's name so as to locate any existing cataloging records for that particular item. The system then retrieves any records that match the given known-item specifications. While the cataloger may have several variant records to look through, they will all be for the particular title in hand (different editions perhaps), or, if only author information is entered, they will all represent works by that individual. Despite the size of the database, a searcher can very quickly locate items via this system because all searches are based on specific, concrete information such as titles, authors' names, and International Standard Book Numbers. The OCLC Online Union Catalog has always held subject information in the form of subject headings (usually Library of Congress Subject Headings [LCSH]) for each record, but it was not until the development of the EPIC service that OCLC provided a means by which to do subject searching, thus using these existing access points.

The EPIC search service complements the original OCLC search engine by providing keyword, phrase, and subject searching. A searcher can use Boolean, proximity, and range searching features as well as truncation and index scanning.15 The EPIC command interface, the search language, is based on the NISO Common Command Language for Interactive Information Retrieval (Z39.58). The EPIC search interface is extremely powerful, but this does not mean that users will easily be able to produce good searches. The more simplistic FirstSearch system designed specifically for end users also presents serious retrieval problems because the main problems lie not with the searching front ends but with the OCLC OUC database itself. While this enormous database works extremely well for cataloging and interlibrary loan, where the searcher has a specific title or author in mind, it is a relatively unexplored morass for subject searching. The most evident problems revolve around the size of the database and the use of broad, precoordinated Library of Congress Subject Headings for postcoordinate retrieval.

14. According to OCLC, "FirstSearch is an interactive searching system for library patrons" that allows them "to search a variety of bibliographic databases. . . . By following on-screen instructions, patrons can search successfully without special training." Online Computer Library Center, Inc., The FirstSearch Catalog (Dublin, Ohio: OCLC, 1992), 1.
These problems are not restricted to archival searching and MARC AMC records; indeed, producing manageable and complete subject search results for monographs in such a system is potentially even more difficult.

In an effort to adapt LCSH terms for electronic retrieval, OCLC takes each subject heading assigned to a book, archival collection, or other material and breaks it apart. This is very useful as it eliminates the need for users to construct lengthy LCSH strings in order to do subject searches and allows more flexible searching.16 To retrieve items assigned the heading "North Carolina—History—Civil War, 1861-1865—Personal Narratives, Confederate," a searcher would enter a statement with the following elements in any order connected by the Boolean and: find su=(North Carolina and History and Civil War 1861-1865 and Personal Narratives Confederate). The su= tells EPIC to look only through subject headings but does not limit retrieval to only records with this particular subject heading string. For example, an item with the following combination of subject headings would also be retrieved: "United States—History—Civil War, 1861-1865—Personal Narratives, Confederate" and "North Carolina—Description and Travel and 19th Century." Unfortunately, there is no mechanism by which the searcher can just receive items with a particular subject heading string, nor can the searcher browse complete subject strings in the scan mode and see how many items are posted to each.

15. Index scanning does not work well with the subject fields, as the subject strings, common to LCSH, are broken into constituent parts and do not appear in any scannable index as complete strings.
16. Many individual library OPACs require users to enter full LCSH strings with correct syntax in order to retrieve items on a topic.

Information Overload

In general, the two primary purposes of subject control are (1) to allow the user to find material on a subject, and (2) to collocate a repository's materials on a subject at one point in the catalog, thus giving the user a summary of what is contained in that collection on the given topic. National union catalogs, such as OCLC's OUC, go one step further. Because the OCLC Union Catalog is a national database that employs LCSH, it collocates topical materials from around the country at each subject heading. Richard Smiraglia further notes that "when LCSH is used to supply subject headings for AMC formatted records, the archival materials will collocate with published materials on the same topic in an integrated bibliographic system (network or local), thus giving a user an opportunity to browse bibliographic records for both published works and primary source material under a topical heading."17

While this is theoretically a wonderful research opportunity that might well bring new researchers into archival repositories because they find archival materials next to books in the catalog, such collocation works best, or perhaps only works at all, with relatively small collections. OCLC's Union Catalog, with over 30 million records, hardly fits into the "relatively small" category.
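The postcoordinate decomposition described above also explains false coordinations: each of a query's terms may match a facet of some heading on a record without any single heading carrying the full string. The following minimal sketch illustrates that behavior; the two records are hypothetical, and EPIC's actual matching is word-level rather than whole-facet, so this only approximates the mechanism.

```python
# Sketch of EPIC-style postcoordinate matching over decomposed LCSH strings.
# Hypothetical records; facet-level matching approximates EPIC's behavior.

def facets(heading):
    """Split a precoordinated LCSH string into its constituent facets."""
    return {part.strip().lower() for part in heading.split("—")}

records = {
    "rec1": ["North Carolina—History—Civil War, 1861-1865—Personal Narratives, Confederate"],
    "rec2": ["United States—History—Civil War, 1861-1865—Personal Narratives, Confederate",
             "North Carolina—Description and Travel—19th Century"],
}

def su_search(terms, records):
    """find su=(...): every term must match a facet of *some* heading on the
    record, not necessarily the same heading, hence false coordinations."""
    pools = {rid: {f for h in hs for f in facets(h)} for rid, hs in records.items()}
    return [rid for rid, fs in pools.items() if all(t.lower() in fs for t in terms)]

# Both records are retrieved, although only rec1 bears the full heading string.
print(su_search(["North Carolina", "History", "Civil War, 1861-1865",
                 "Personal Narratives, Confederate"], records))
```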
If 15,000 records collocate at a subject heading, or Boolean combination of terms—not an unheard-of retrieval in EPIC—the chance that the researcher will view any one of the records is greatly diminished; indeed, it becomes a chance event dependent on when the search is done, when the record was entered into the database, and how users deal with information overload.

Researchers are not without resources to deal with information overload. Joel and Mary Jo Rudd list several ways in which library users turn a potential information overload into a manageable load.18 They explain that in addition to using Herbert Simon's principle of "satisficing" (acquiring a "satisfactory" subset of available information), researchers faced with cognitive and temporal limitations on information acquisition frequently just "skim off the top," looking only at the first few items they find in a catalog or on the shelves. Because most bibliographic databases present retrieval sets in last-in, first-out (LIFO) order, any given record collocated at a subject heading may fall victim to the "Andy Warhol" phenomenon, wherein each record is famous for its 15 minutes until it sinks into the morass of the database as newer records pile on top of it. The problem here, of course, is that the most appropriate records, particularly in fields such as history, where information does not go out of date quickly, may be at the bottom of the pile. Indexing consistency becomes important for only the most comprehensive searches and tenacious database searchers, but distinction drawn among items comes to the fore. Ortega y Gasset's 1934 definition of a librarian as "a filter interposed between man and the torrent of books" can now apply to the archivist and the on-line catalog or on-line bibliographic systems.19

Stephen E. Wiberley, Jr., Robert A. Daugherty, and James A. Danowski conducted a "users' persistence" study in 1987.20 They looked for what David Blair calls the "anticipated futility point."21 Blair defines this as the number of documents a researcher will be willing to begin to browse through. Karen Markey has called this user "perseverance."22 Wiberley and his colleagues adapted Blair's definition to the number of references in an on-line catalog that users were willing to scan in discretionary information-seeking situations. Subject searching fits into this discretionary type of information seeking in that the user never knows the extent of information available and thus feels no compulsion to search out a particular fact or title. Wiberley, Daugherty, and Danowski studied user persistence or perseverance with an academic library OPAC that contained more than 425,000 records. They studied user transaction logs and questionnaires.

17. Richard P. Smiraglia, "Subject Access to Archival Materials Using LCSH," in Describing Archival Materials: The Use of the MARC AMC Format, edited by Richard P. Smiraglia (New York: Haworth Press, 1990), 64.
18. Joel Rudd and Mary Jo Rudd, "Coping with Information Load: User Strategies and Implications for Librarians," College and Research Libraries 47 (May 1986): 315-22.
19. Jose Ortega y Gasset, The Mission of the Librarian, translated by James Lewis and Ray Carpenter (Boston: G.K. Hall, 1961).
The median response to the question "How many postings would you consider to be too many?" was fifteen. The transaction log data indicated a sharp drop-off in persistence with more than 30 postings and a great drop-off after sixty. More specifically, they found that while a majority of users "displays all general records for searches that retrieve between eleven and thirty postings, when searches retrieve more than thirty postings, a majority of users displays no records."23

20. Stephen E. Wiberley, Jr., Robert A. Daugherty, and James A. Danowski, "User Persistence in Scanning Postings of a Computer-Driven Information System: LCS," Library and Information Science Research 12 (October-December 1990): 341-53. See also Stephen E. Wiberley, Jr., and Robert A. Daugherty, "Users' Persistence in Scanning Lists of References," College and Research Libraries 49 (March 1988): 149-56.
21. David C. Blair, "Searching Biases in Large Interactive Document Retrieval Systems," Journal of the American Society for Information Science 31 (July 1980): 271.
22. Karen Markey, Subject Searching in Library Catalogs: Before and After the Introduction of Online Catalogs (Dublin, Ohio: OCLC, 1984), 67-71.
23. Wiberley, Daugherty, and Danowski, "User Persistence," 352.

OPAC users, such as those in the Wiberley, Daugherty, and Danowski study, may tolerate fewer citations than on-line-search service clients, who may turn to commercial on-line databases only when they want an exhaustive search. The searching literature and vendors such as DIALOG Information Services generally hold that very few on-line-search clients are willing to look through more than 100 citations, with many people willing to scan only 50 or fewer items. This information holds serious implications for archival researchers using on-line databases such as OCLC's OUC and locally or Internet-available library catalogs. To understand how best to represent documents or collections of materials in these contexts, we need first to explore these retrieval environments.

Methodology

In February 1992 I selected a random sample of 60 MARC AMC records representing collections held in UNC-CH's Southern Historical Collection from the OCLC Online Union Catalog. A graduate assistant searched the subject headings attached to each of these records in OCLC as well as in the university's on-line catalog in March 1992. For example, "Merchants—North Carolina—History—19th century" retrieved 67 items in the UNC-CH on-line catalog and 106 items in the OCLC OUC in March 1992. In August 1992 and June 1993 I again searched all headings in the on-line catalog, the entire OCLC database, and the manuscripts portion of the OCLC database. In comparing the data I discovered that because one record was such an outlier it distorted the picture for the mean number of hits per search term and per record. In this case, one heading—Sermons—received 54,904 hits in OCLC in August 1992. I eliminated this record from the sample, thus bringing the usable population to fifty-nine. I also discovered that the graduate student had
Median Number of Postings per Term EPIC—June 1993 EPIC—August 1992 EPIC—March 1992 EPIC/mss—June 1993 EPIC/mss—August 1992 Local Total—June 1993 Local Total—August 1992 Local Specific—June 1993 Local Specific—March 1992 Table 2. Mean Number of per Term per Record EPIC-^June 1993 EPIC—August 1992 EPIC—March 1992 EPIC/mss—June 1993 EPIC/mss—August 1992 Local Total-^June 1993 Local Total—August 1992 Local Specific—June 1993 Local Specific—March 1992 229 207 196 67 59 42 39 29 20 Postings 252 235 220 C D C O C O C O 45 41 29 20 searched the local records in a different manner, so that the local data for March 1992 are not able to be compared to the August 1992 results but are comparable to one set of the June 1993 findings. Findings Table 1 shows the mean number of hits or postings for the 519 subject headings as- sociated with the 59 records. The mean number of subject headings per record was 8.8, with a median of 8.0. The first EPIC search retrieved an average of 196 postings per heading. Only five months later this number rose by 11 points, and nine months after that it went up another 22 points. Keeping Wiberley, Daugherty, and Dan- owski's findings in mind, these results should be alarming. Even when the manuscript records are separated from the other materials in EPIC—June 1993 EPIC—August 1992 EPIC—March 1992 EPIC/mss-^June 1993 EPIC/mss—August 1992 Local Total-^lune 1993 Local Total—August 1992 Local Specific—June 1993 Local Specific—March 1992 Table 4. Median Number of per Term per Record EPIC-^June 1993 EPIC—August 1992 EPIC—March 1992 EPIC/mss-^June 1993 EPIC/mss—August 1992 Local Total—June 1993 Local Total—August 1992 Local Specific—June 1993 Local Specific—March 1992 101 93 79 46 43 26 24 21 14 Postings 128 120 105 44 40 27 24 21 16 OCLC, the average retrieval was 67. News is better for the local catalog, with an av- erage of 42 hits in June 1993 and 39 in August 1992. These numbers represent the total figure given for an entry such as "Virginia—Civil War." The UNC-CH catalog provides this figure for this term and all subdivisions, such as "Correspon- dence" or "Stores and supplies" before listing any brief titles on the screen. The searcher in March had gone to the second step of looking in the index—the actual list of subject headings used in the catalog— and had recorded the number of items spe- cifically attached to the broader term (e.g., "Virginia—History—Civil War, 1861- 1865"), but she did not include figures for any of the subtotals. Thus, the March fig- ure, which took more searching expertise to derive, is as conservative as possible and is still twenty. This number rose to 29 by June 1993. Table 2 provides data on the D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.f0650763x258t4p5 by C arnegie M ellon U niversity user on 06 A pril 2021 318 American Archivist / Spring 1994 Table 5. 9 16 20 20 46 67 80 110 125 150 Postings per 152 179 198 257 299 355 360 406 422 500 Record, EPIC, 525 536 552 646 651 652 658 668 735 810 June 1993 841 866 966 971 974 1,155 1,206 1,306 1,357 1,482 1,497 1,565 1,808 1,813 1,846 1,867 2,877 3,161 3,728 3,918 4,589 4,723 4,778 5,749 6,320 6,453 9,454 13,622 16,666 — Table 6. 
Mean Number of Posting per Record, EPIC, June 1993 2.25 4.00 5.00 5.30 7.67 15.20 15.23 15.71 16.00 16.75 25.00 27.18 31.25 33.83 41.67 42.83 44.75 46.72 50.18 51.43 52.50 52.75 67.50 76.57 83.17 96.60 107.67 116.69 123.50 128.33 129.14 130.40 131.60 144.33 147.00 150.75 156.50 162.75 169.63 177.50 186.57 194.20 205.10 243.50 259.00 265.44 302.58 314.87 319.67 327.79 338.91 351.22 489.75 526.67 567.58 668.00 859.46 1,290.60 4,166.50 — Table 7. 6 9 10 14 17 22 23 30 35 38 Posting per Record, 47 78 84 85 97 106 108 114 122 124 Local Total, 141 148 151 154 161 164 178 178 189 207 June 1993 217 224 254 296 299 311 344 346 348 348 352 406 425 450 465 496 532 699 762 829 955 961 1,022 1,097 1,122 1,133 1,207 1,233 1,700 — mean number of postings per term per rec- ord. The results are even worse with this method of calculation. The median number of postings per term (table 3) and per term per record (table 4) may represent a more accurate picture of the data due to a few extremely heavily posted terms that distorted the means. Al- though it contains lower figures, table 3 shows a 22-point increase in the OCLC fig- ures over the 14-month period. Enumerations of the number of postings per record and the mean number of post- ings per term per record show the range in postings (tables 5 through 8). Table 9, showing the greatest number of hits per term, indicates how useless a sub- ject heading can become in a database of 30 million records. "United States—His- tory—Revolution—1775-1783' ' retrieved 16,393 items in the June 1993 complete OCLC search and 1,628 items from the D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.f0650763x258t4p5 by C arnegie M ellon U niversity user on 06 A pril 2021 Subject Retrieval from Large Bibiolographic Databases 319 Table 8. Mean Number of Postings per Item per Record, Local Total, June 1993 1.50 1.80 2.50 4.25 4.67 4.70 5.00 5.00 5.75 7.08 7.60 8.31 8.40 9.64 11.50 12.00 12.13 13.50 14.83 15.40 16.44 18.50 19.00 19.50 20.33 20.67 22.00 24.86 25.43 27.13 27.18 30.20 32.00 34.56 35.42 36.29 38.83 40.25 46.50 51.38 55.11 56.25 57.33 59.63 60.94 63.50 68.13 69.60 70.40 70.50 76.00 81.20 86.50 86.82 92.11 109.73 120.13 224.20 425.00 — Table 9. Greatest Number of Post- ings Per Term Table 10. Terms with Only 1 Posting EPIC-^June 1993 EPIC—August 1992 EPIC—March 1992 16,393* 15,641* 15,001* EPIC/mss^June 1993 2,213** EPIC/mss—August 1992 1,797** Local Total—June 1993 1,628* Local Total—August 1992 1,438* Local Specific—June 1993 1i012 t Local Specific—March 1992 962+ *United States—History—Revolution—1775 -1783 **North Carolina—History World War, 1914-1918—France UNC-CH catalog. "North Carolina—His- tory" retrieved 2,213 just from the manuscripts in OCLC. The number of headings incurring only one hit—the vast majority of these being personal names—indicates that the remain- ing topical subject headings received more postings on average than shown above (see table 10). The picture becomes bleaker and bleaker for the use of topical subject head- ings in such a large database, especially when we realize that many of the headings analyzed are used predominately by archi- vists, and archivists have been contributing to OCLC only for a few years! Even the UNC-CH catalog now averages over 60 postings for the multiple-hit headings (ta- ble 11). 
These figures would be even higher if subject headings with two and three hits (still mostly names) were added to those with just one.

Table 12 shows the number of hits on individuals' names and the average number of postings on the names as a whole. There were 108 individuals' names included as subject access points in these records. In comparison, entries for 29 families, such as "Rogers," "Smith," and "Erwin," yielded many more postings, particularly in the OCLC OUC (table 13).

Table 12. Total and Average Number of Postings for Individuals' Names
Date of Search | Number of Postings | Average Number of Postings
EPIC—June 1993 | 450 | 4.2
EPIC—August 1992 | 432 | 4.0
EPIC—March 1992 | 408 | 3.8
EPIC/mss—June 1993 | 189 | 1.8
EPIC/mss—August 1992 | 186 | 1.7
Local Total—June 1993 | 225 | 2.1
Local Total—August 1992 | 214 | 2.0
Local Specific—June 1993 | 200 | 1.9
Local Specific—March 1992 | 183 | 1.7

Table 13. Total and Average Number of Postings for Family Names
Date of Search | Number of Postings | Average Number of Postings
EPIC—June 1993 | 2,963 | 102.0
EPIC—August 1992 | 2,684 | 92.5
EPIC—March 1992 | 2,621 | 90.4
EPIC/mss—June 1993 | 590 | 20.4
EPIC/mss—August 1992 | 277 | 9.6
Local Total—June 1993 | 165 | 5.7
Local Total—August 1992 | 149 | 5.1
Local Specific—June 1993 | 139 | 4.8
Local Specific—March 1992 | 112 | 3.9

Discussion

What do these data tell archivists, who both create records for national databases and help researchers locate materials in them? First and foremost, it is important to realize that what may work locally will not necessarily work in a 30-million-record database. This is not to say that the use of such databases for national access is not a good idea. Rather, archivists have to understand the nature of the environments into which they are sending their records and do all they can to help them compete. In database terms this means providing access points that will help the records to be retrieved and read when they are relevant items. Both local and national concerns must be balanced. A "good" record is a little bit like the proverbial good child: It should speak only when spoken to—that is, present itself for retrieval when it is relevant to a researcher's needs, but otherwise be silent. As with children, it is often difficult to make bibliographic records behave.

To extend the analogy, most child experts will tell you that environment, as well as genetics or specific parental teachings, plays a role in how children behave. Such is the case with bibliographic records. A bibliographic record that does not use standardized subject access terms may never be found in a national database. Such practice will lead to low-recall searches. At the same time, a seemingly excellent record with standardized subject headings that represents a collection very well may find itself buried in other seemingly excellent records if there is much material on that topic in a large database. In this scenario, document discrimination and search precision become overriding concerns. The record and its creator must adapt to this environment or risk oblivion. The same record may work well "at home," where there are relatively few items on this topic in the on-line catalog.
Conversely, the local catalog may require augmented local subject headings that make sense in that environment. Not only must archivists consider collection and user characteristics in providing subject access, they must also consider the environment into which the records will be sent. This may mean sending one record off to a national utility while placing another record, perhaps with local subject headings and location information, in the home OPAC. There is no reason, other than additional processing costs, why the two records must be identical.

Representing Archival Collections

Most important in making records behave in any bibliographic environment is the archivist's responsibility for capturing the key concepts of the materials in their finding aids. As David Bearman argues, consistency of topical headings is not so important if we provide very good searching tools such as switching vocabularies and "intelligent" front ends (and that is a big "if").24 Selecting and representing key concepts are highly subjective and difficult tasks, and those selected will not always fit the needs and visions of future users. This work will never be scientific, but it will always be important, just as archival processing has been important in the past. The great service here is to reduce the bulk of information to be searched in a meaningful and rational manner, keeping in mind that it is better to do this work now than to wait for perfection that will never come. Representing materials completely and succinctly, while differentiating them from a multitude of similar documents, lies at the heart of any information storage and retrieval system. As with the MARC AMC format, archivists now need to focus on the types of subjects to be documented. They need to build a subject access framework to identify what subjects in archival collections should be represented in subject indexing, as Bearman and others have pointed out.25

24. David Bearman, "Authority Control Issues and Prospects," American Archivist 52 (Summer 1989): 288.
"Bearman, "Authority Control," 286-99; Helena Zinkham, Patricia D. Cloud, and Hope Mayo, "Pro- viding Access by Form of Material, Genre, and Phys- ical Characteristics: Benefits and Techniques," American Archivist 52 (Summer 1989): 300-19. He- len Tibbo extends this notion to a framework for ab- stracting in Abstracting, Information Retrieval and the Humanities: Providing Access to Historical Literature (Chicago: American Library Association, 1993). D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.57.2.f0650763x258t4p5 by C arnegie M ellon U niversity user on 06 A pril 2021 322 American Archivist / Spring 1994 Broad, undifferentiated topical headings, common to LCSH, do not appear to work well for retrieval from large electronic da- tabases. If repositories collecting in similar areas work together on authority lists, ap- propriate index terms, and user thesauri, these efforts could increase the consistency with which the institutions with key col- lections represent their materials without sacrificing necessary specificity to a mon- olithic indexing language. This would also allow archivists to retain much of their "rugged individualism"26 while cooperat- ing with related institutions. Archivists could then coordinate and disseminate such vocabularies nationally. Avra Michelson sent a common descrip- tion of an archival collection to several re- positories and discovered a total lack of consistency in descriptive practice, espe- cially in the assignment of subject head- ings.27 While no conclusions regarding indexing consistency can be drawn from the present study, it is clear that archivists from across the United States are applying the same subject headings, even quite lengthy and complicated LCSH strings, to hundreds and thousands of records. In many cases they use terms that librarians also select quite frequently. We do not know if archivists are consistently applying these terms to the same concepts, but we do know that large numbers of postings are accruing at certain topical headings, even when these are delimited by geographic lo- cation and date. Because archivists in dif- ferent institutions never index the same collections, more context-sensitive studies of indexing consistency may be necessary if we are to judge accurately the extent of indexing consistency. What is clear is that better document discrimination, possibly 26Janet Gertz and Leon I Stout, "The MARC Ar- chival and Manuscripts Control (AMC) Format: A New Direction in Cataloging," Cataloging and Clas- sification Quarterly 9 (1989): 5. "Michelson, "Description and Reference." through more specific, appropriate, and ex- haustive indexing languages, is necessary as databases continue to grow. Another representation issue is deter- mining the best archival level on which to base MARC AMC records. It is important to recognize that these records serve to de- scribe collections in only a minimal fash- ion. The primary function of MARC AMC records is to lead the searcher to a finding aid, which in turn documents and describes in detail the collection and parts thereof.28 As such, MARC AMC records cannot fully describe a collection, nor should they. Hav- ing said this, I should add that it is prob- ably best to provide collection-level access in MARC AMC records, as the introduc- tory information in an inventory serves as an umbrella for the series and folder de- scriptions. Certain situations, however, can make the creation of just collection-level records arbitrary. 
A particular series, or even an individual item, may outweigh the value of the rest of the collection. If this is the case, and if the general terms that best describe the collection as a whole do not provide optimal specificity for the important part of the collection, a separate MARC AMC record would help to facilitate access. Such a record, however, would have to lead the researcher to the collection-level record or provide enough provenancial context so that the researcher could locate the collection.

In OCLC, with a limited number of subject headings per record, the archivist frequently cannot assign enough headings to index appropriately both the entire collection and its significant parts. In RLIN, where any number of headings can be used, excessively long lists and long records can discourage researchers from looking at the items they retrieve. In both cases, separate records that provide access to the collection as a whole and to specific parts would provide better access to this material than does one inadequate or overly long record. If different names, organizations, or institutions are prominent in various series of a collection it may be a good idea to make linked MARC AMC records for each relevant series and to index these with the prominent names. Subject access common to all series in a collection should be kept with the main record so as not to replicate the topical headings for the collection several times within the database. This is not to say that all series, folders, or items need to be represented just because a few are deemed to be important. What might appear to be uneven representation of the collection in terms of a finding aid could provide optimal access to key elements. In this way, cataloging and access become intricately tied to appraisal. It is important to remember that in a database all records, whether they represent important or relatively insignificant materials, become equal in the retrieval game. Responsible appraisal of what should be represented in the database becomes a powerful retrieval tool.

As Bearman notes, a large number of subject headings per record gives that record a better chance to be retrieved.29 When repeated by everyone, the practice of applying more and more subject headings will serve primarily to increase the size of the database and will result in overwhelmingly large retrieval sets and long records. This is already the case in RLIN, where records may go on for 12 screens and have over 200 subject headings.30 The best policy is to select important material and represent it accurately and precisely. As with appraisal, selection is critical.

29. Bearman, "Authority Control," 289.
30. Kathleen Roe discussed the problems related to lengthy RLIN records at the 1992 SAA Annual Meeting in Montreal in a paper titled "Autonomy vs. Community: Life in an Archives Database Commune."
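The linked-record arrangement described above can be made concrete with a small sketch. This is a hypothetical illustration, not the MARC AMC encoding itself: the record identifiers, field names, and link mechanism are invented for the example, and in practice the linkage would be carried in the appropriate MARC linking fields.

```python
# Hypothetical sketch of collection- and series-level records with links.
# Shared topical headings live on the collection record; series records
# carry only the names prominent in that series, avoiding duplicate postings.

collection = {
    "id": "coll-001",                      # invented identifier
    "title": "Example Family Papers",      # hypothetical collection
    "subjects": ["Merchants—North Carolina—History—19th century"],
    "series": ["ser-001", "ser-002"],      # links down to series records
}

series_records = [
    {"id": "ser-001", "collection": "coll-001",   # link back to the parent
     "names": ["Doe, Jane"], "subjects": []},
    {"id": "ser-002", "collection": "coll-001",
     "names": ["Example Mercantile Co."], "subjects": []},
]

def subject_postings(heading, collection, series_records):
    """Count the records that would post to a topical heading; the heading
    posts once, at the collection level, rather than once per series."""
    hits = [r for r in [collection, *series_records] if heading in r["subjects"]]
    return len(hits)

print(subject_postings("Merchants—North Carolina—History—19th century",
                       collection, series_records))  # prints 1, not 3
```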
It is irresponsible to "pollute" a retrieval environment with extraneous or repetitive postings to terms just to increase the likelihood that a given record will be retrieved. We do not want to clog up our databases any more than our shelving or backlog areas.

Retrieving Archival Materials

Reference archivists must become expert searchers of national bibliographic systems and on-line catalogs that are available on the Internet if they are to provide their clients with the highest possible level of service. Since both OCLC and RLIN must be employed for comprehensive searches, and since many archival researchers want high-recall searches,31 archivists must become well versed in both systems. This means becoming familiar with the searching languages and capabilities and, more important, with basic information-retrieval principles and strategies. Today's electronic information-retrieval systems are deceptively easy to use, so much so that even the novice searcher can find something on most topics. At the same time, it is often very difficult to do a good search that optimizes recall and precision. This is particularly true in large databases. Archivists must be prepared to do searches for clients and to assist clients in conducting their own searches. Indeed, there is a large role for user education, particularly with CD-ROM products and Internet-available on-line catalogs. Searching guides and instructional classes will become necessary if clients are to do their own searching.

31. Mary Jo Pugh, "The Illusion of Omniscience: Subject Access and the Reference Archivist," American Archivist 45 (Winter 1982): 33-44.

Archivists must not only learn how best to apply subject headings, they must also turn this knowledge into searching expertise. Librarians are coming to see the difficulty of using a precoordinate indexing language, such as LCSH, for postcoordinate retrieval, and hopefully there will be significant improvements in future LCSH versions in the age of OPACs. OCLC has recognized the precoordinate problem, and thus breaks each LCSH heading and subheading apart to facilitate more flexible retrieval. Because of this, the searcher does not have to worry about matching the syntax of lengthy LCSH strings in EPIC, although this may still be the case with on-line catalogs available locally and on the Internet. Reference archivists must become skilled in searching all of these tools. They must know how to construct LCSH strings for searching OPACs and realize that the breadth of many LCSH terms, even when combined with other terms and delimiters in the OCLC or RLIN OUC, may prohibit precise retrieval.

When searching individual OPACs via the Internet, it is important to remember Avra Michelson's study. Archivists tend to use different terms (even when restricted to a controlled vocabulary) to describe the same things. Thus, when searching someone else's catalog, we should remember that it is important to use a number of synonymous search terms to ensure high recall (if that is the objective). It is always easier to search our own catalog wherein we know the terms local staff members tend to use over and over.
It is always easier to search our own catalog, wherein we know the terms local staff members tend to use over and over. It would be a great service to the field if institutions with like collections cooperated in building "common term" lists and then made these available to other institutions and clients, complete with examples on how to make searches as specific as possible. These could even be mounted on Internet gopher servers for easy access.

Headings divided by geographic and temporal elements—facets found to be important to historians' information-seeking methodologies—work well to distinguish items that are topically related.32 Jackie Dooley also notes the importance of space and time delimiters for providing more refined subject access.33 Such delimiters, however, provide only a partial answer. As can be seen from examples given above, even when subject headings contain locales and date ranges, a large number of records may be retrieved, and thus the actual topical subject terms must also be specific. Conversely, many items may be omitted from date- or place-restricted retrieval sets if processors failed to include all possible specific delimiters and subheadings. When a collection covers several geographical areas and years, processors may be forced to use broader terms because they are restricted in the number of more specific designations they can make. Reference archivists should advise clients searching OCLC or similar databases to use geographical and temporal elements in search strategies, but clients should also be aware that many relevant records will not be retrieved with these limitations. Processors must assign geographic and temporal subheadings to LCSH when these notions are central to the collection being represented, and reference archivists must explain the realities and limitations of database searching to clients.

32. Tibbo, Abstracting, Information Retrieval, and the Humanities.
33. Jackie Dooley, "Subject Indexing in Context," American Archivist 55 (Spring 1992): 348.

If only primary materials are desired, limiting a search to the manuscripts segment of the OCLC database seems a good strategy to limit set size. Examination of records in the larger OCLC sets, however, reveals that many archival materials have been entered in the MARC book format. Thus, searches restricted to manuscripts will not retrieve all relevant items. Furthermore, such searches will not collocate published and unpublished sources, which may be what the researcher wants. This strategy should be used quite carefully and explained to the client.

Subdivision by form is a useful retrieval strategy, but headings such as "Sermons" or "Diaries" by themselves get lost in the shuffle. It is very important in large databases to combine form headings with other topical, temporal, or geopolitical headings. Along this line, headings such as "Brown Family," while they may work in our local catalogs where there is only one Brown family, produce quite undesirable results in a national catalog. Ideally, each subject heading is supposed to denote only one concept. Although there may be linkages among the over 200 records in OCLC with the heading "Brown Family," in many cases individual families that are in no way related are represented.
This indicates a total lack of authority control and results in excessive postings because separate concepts (different families) are represented by the same term.34 Searchers should usually try to limit queries with family names to particular geographic locations. Entry of specific personal or corporate names, which can be expected to have very few hits even in large databases, seems to be one way to provide specific access without running the risk of unwieldy retrieval sets. Without time-consuming name authority work, however, names may provide only partial access to relevant materials. Fortunately, searchers may be able to overcome many variations in names with truncation and other search tactics, but total pseudonyms will remain invisible to a searcher unless a link is made in the database.35 The primary drawback to retrieval by personal name is that the researcher must know the key players in the area being studied before finding the material. While names and institutions provide a type of subject access, they augment rather than replace topical access points.

34. There are two ways in which authority control (i.e., use of a controlled vocabulary) can be violated: (1) the same concept can be represented by different terms, and (2) different concepts can be represented by the same term. The former case is most often considered, but the latter may be more difficult to overcome from a retrieval point of view, particularly when large numbers of records are retrieved.
35. Actually, a sophisticated search system would be able to retrieve pseudonyms of any name entered without the searcher ever being worried with the matter, if so programmed. This is not the reality of major search services today, and the upkeep cost of such a service makes it unlikely in the near future.

The Future

At this time, we just do not know enough about how researchers attempt to look for archival materials in national databases or in local on-line catalogs. This information should drive the design of our information systems and our document representations. In its absence, the cardinal rule of indexing—"Index at the most specific level possible"—should always apply, but this edict is often ambiguous. Even more problematic is the searcher's analog: "Search at the most specific level possible." Richard Pearce-Moses raised valuable questions in this regard in a posting to the ARCHIVES listserv in December 1992:

    Fixing up LCSH and MARC may be the last steps we should be worrying about. Maybe we need to define some common research strategies based on patron needs: What are patrons asking of our materials? and What tools do we need to match our material to those requests?36

36. Richard Pearce-Moses, "LCSH—Summation and Opinions—Sources—1992," ARCHIVES listserv (15 December 1992).

In addition to user studies, much more research into the nature of retrieval from large bibliographic databases is needed. This work would benefit all players in the information community, as most databases are growing at an alarming rate. Retrieval studies comparing OCLC, RLIN, and Internet-available OPACs are also needed. Because the RLIN record structure is more felicitous to archival information, most archivists believe it is the information system of choice for archival materials.
Only research will substantiate this belief. If researchers know which repositories hold the materials they want, searching individual catalogs via the Internet may produce the best retrieval results once users deal with all the OPAC search variations. This approach is the electronic equivalent to writing individual archivists to see what their collections hold in a given area. Many interesting studies wait to be conducted.

In this day of information gluttony and those surfeited years that surely lie ahead, responsible appraisal and provision of access to significant materials are central to the archivist's function. We know we cannot save everything. Now we must learn that only a portion of what we do save will merit specialized avenues of access. If we do not practice such restraint and temperance, the national bibliographic databases will grow to useless proportions and our processing backlogs will overwhelm us. We need to represent those materials deemed worthy with as much specificity as possible to stem the tide against the meaninglessness of massive retrievals from electronic systems. As noted earlier, catalogs need to describe works and collections while distinguishing them from a myriad of others. To achieve the former without the latter will produce databases that are both enormous and brutally random. They will become the archivist's, and the librarian's, Moby Dick: an obsession to maintain with an overwhelming whiteness and lack of meaning and direction. Lester Asheim has observed that "the rich store of information to which librarians can now provide access has a tremendous potential for good—to the individual and to the society." He continues by noting that, "as collectors, librarians have contributed to the information overload which inhibits rather than promotes achievement of the goal we had in view." He asks librarians if they do not "have an obligation now to provide a solution to the problem [that they] have helped to create."37 Is it not time that archivists started to face the problem of information overload and stopped being lulled into a false sense of security offered by national databases and the allure of superficial subject access?

Some call for scrapping the information systems we now have and starting over, but this will not solve all the problems. There will never be a "perfect" information storage and retrieval system for archival materials, even if archivists design it from scratch specifically to meet their needs, because language and the human mind are the real problems. Subject retrieval—or for that matter, any form of text representation—will never be perfect. Archivists must recognize this and move forward, balancing local and national needs and building systems that are useful and possible. In the long run, there is no substitute for well-selected index terms that represent the primary aspects of a collection. This is never easy, but the less effort put into representing materials in a database, the more difficult retrieval will be. Archivists must decide on which side of the retrieval equation they wish the effort and cost to fall.

37. Asheim, "Ortega Revisited," 225-26.

work_74pehc5vizeali4znlgyfooosm ---- LCSH, FAST AND DELICIOUS: normalized vocabularies and new ways of subject cataloging
* mmontalvo@uprrp.edu Received: 3/03/2010; 2nd revision: 21/05/2010; accepted: 10/06/2010.

MONTALVO MONTALVO, M. LCSH, FAST and Delicious: normalized vocabularies and new ways of subject cataloging [LCSH, FAST Y DELICIOUS: vocabularios normalizados y nuevas formas de catalogación temática]. Anales de Documentación, 2011, vol. 14, no. 1. Available at: <...>.

LCSH, FAST AND DELICIOUS: NORMALIZED VOCABULARIES AND NEW WAYS OF SUBJECT CATALOGING

Marilyn Montalvo Montalvo*
Sistema de Bibliotecas. Recinto de Río Piedras. Universidad de Puerto Rico.

Abstract: Controlled vocabularies have been the key pieces of bibliographic research, but some people believe that subject headings organized in horizontal strings are difficult to understand and use. This situation has favored the development of FAST, a deconstructed version of LCSH. At the same time, thanks to the folksonomies revealed by the social networks that are being developed on the Internet, people are organizing information according to their preferences and sharing their subject headings or tags with others. Delicious, a system of open social bookmarks, which allows bookmarks to be tagged and shared, is one of the tools which allow us to discover the behavior of users in their search for information. In this work we explore the structure of the headings chosen by users to tag the bookmarks they store in Delicious and the usefulness of this tool in the development of controlled headings that approach users' preferences.
Keywords: LCSH; FAST; Delicious; controlled vocabularies; subject headings; tags; folksonomies.

1. INTRODUCTION

Normalized, standardized or controlled vocabularies have been, until now, the key pieces of bibliographic research.1 From the moment the quantity of manuscripts and publications ceased to be manageable through the preparation of name and title catalogs, various taxonomies, general, specialized, simple or complex, have guided librarians and users in the search for information. Thanks to controlled vocabularies, diverse information resources can be grouped under a single concept or proper name in order to reduce the search options in catalogs and indexes and to facilitate information retrieval.
The configuration of such vocabularies draws on, among other disciplines, lexicography, semantics, lexicology, morphology, syntax, taxonomy and logic. The assignment of controlled terms is subjective. In some cases, popular terms or scientific terms may be selected, depending on the type of user they are aimed at. Social, cultural and political points of view also carry considerable weight in the organization of information. Moreover, although normalized vocabularies aspire to a high level of uniformity, the optimal level is rarely reached, and it must be admitted that no subject index or catalog is as internally consistent as one would wish (OCLC, 2007, p. 21). The chosen headings, known as authorities, taken from dictionaries, thesauri and the cataloged works themselves, may be general and cover several aspects of a topic, reducing the options the user will have when searching for information, or they may be specific, with the purpose of locating subtopics, increasing both the search options and the entries in the resulting list of subject headings. In turn, the list of subject headings may form part of a pre-established structure or be built as new headings become necessary. The option selected will affect the disciplines covered, the balance of topical coverage, and updating.

The meaning of words, simple or complex, number, homonymy and synonymy are all aspects involved in the selection, development and disambiguation of subject headings. The construction of headings also involves the order of their constituent elements, whether natural or inverted, and the type of noun phrase, as well as the specification of the topic through the use of parentheses or subdivisions. The chronological and geographic placement of a topic, as well as its format and genre, likewise help to frame the description of human activity in its various contexts. One of the greatest strengths of controlled vocabularies is their capacity for synthesis and for gathering synonymous terms under a single, univocal heading to which the terms not chosen as the authority are referred. Another of their strengths is the capacity to express complex concepts by building hierarchically organized strings of headings. In an enumerative subject scheme, the vocabulary contains previously constructed headings to describe both simple and complex topics. In a synthetic scheme, by contrast, only some of the possible topics appear in the list of headings, and the rest are constructed by rules. If 'precoordination' is chosen, headings are built horizontally, becoming more specific through the assignment of subdivisions. 'Postcoordination', by contrast, produces multiple headings that are combined vertically. All the components involved in the structure of a list of subject headings must be developed in a way that allows users to understand and anticipate the headings used in the catalog or index consulted. At the same time, they must allow the cataloging agency to establish rules that promote the normalization of the headings used.
This is the great challenge of controlled vocabularies.

2. LIBRARY OF CONGRESS SUBJECT HEADINGS

The Library of Congress Subject Headings (LCSH), the product of a long library tradition, constitute the most extensively used list of subject headings in the West, both in their original form and in numerous adaptations. The principles that have guided their development are the user as the focal point, general usage, and specificity. In addition to these basic principles, the Library of Congress (LC) has based the development of its headings on the concepts of 'literary warrant', uniformity and uniqueness of headings, internal consistency, stability, direct entry, 'precoordination' and 'postcoordination' (Chan, 2005, p. 34-38). The LCSH stand out because they:

• form the most extensive controlled vocabulary developed in the West;
• possess great lexical and thematic richness;
• provide control of synonymy and homonymy;
• contain hierarchical and associative references;
• are used extensively by other institutions, national and international;
• have numerous translations and adaptations;
• have a documented history;
• have the support of LC.

The structure of LCSH is mainly precoordinated. This is one of the aspects of heading construction that LC defends most strongly. Its librarians maintain that the precoordinated, or horizontal, structure makes it possible to rank topics hierarchically and establish relationships between them, provides a standardized order, and improves the relevance of results (Library of Congress, 2007). It is true that, when users had to consult printed catalogs and indexes, they had to try to understand the logic of the structure of 'precoordinated' headings in order to reach the information. However, little is known about how successful their subject searches were. What is known is that many of the librarians providing public services greeted with great relief and satisfaction the keyword search option offered by online catalogs, because it allowed them to use more natural language and to combine search terms in a 'postcoordinated' way, joining various simple headings. The fact is that, despite their advantages, normalized subject searches require users, as well as librarians without cataloging experience, to familiarize themselves with the syntax chosen by the catalogers and to interpret the hierarchical order of headings and their corresponding subdivisions. One must ask, then, whether the 'precoordinated' structure, however logical, truly serves the user's convenience. According to some authors, users have not managed to assimilate the hierarchical structure of LCSH. Cochrane had already argued that the heading strings formulated by LCSH needed to be permutable, because most users and some catalogers could not follow the logic of heading construction (Cochrane, 1986, p. 62). According to more recent observations, half of the searches made in online catalogs return no results at all (Ballard, 1998, p. 58). Chan finds LCSH very cumbersome, with so many complicated rules for forming strings of headings (Chan, 2005, p. 13).
Likewise, in a study involving 288 American children and adults, carried out to determine the extent to which they understood subdivided headings, only 31% of the children and 39% of the adults interpreted the headings correctly (Drabenstott; Simcox and Fenton, 1999). In a similar investigation, involving 137 public-services librarians and 135 technical-services librarians, 52% of the first group and 55% of the second interpreted the meaning of the headings correctly (Drabenstott; Simcox and Williams, 1999). These results led the authors to suggest, as measures to simplify LCSH syntax, that the order of the headings be standardized or that very long headings be deconstructed. Although keyword searching and the development of new generations of online catalogs and FRBR-adapted interfaces are helping users carry out broader or narrower searches without necessarily having to know the hierarchy of the subject headings, most headings are still constructed horizontally.

3. FAST (FACETED APPLICATION OF SUBJECT TERMINOLOGY)

At present, the development and use of controlled vocabularies are framed, in the first place, by the changes undergone by the international cataloging principles, the functional requirements for bibliographic records, and national cataloging rules. Secondly, they are strongly conditioned by the development of information resources in different formats and by the exponential growth of electronic resources, characterized by their eclectic, unstable and elusive nature. Thirdly, they are subject to the demands of users, who call for ever faster information retrieval systems, and to the need to incorporate metadata schemes into the resources available on the Internet to speed up their discovery. Finally, they face both the shortage of cataloging librarians, owing to the reduction of the curriculum devoted to cataloger training in library schools, and the budgetary difficulties that affect all libraries.

In this context, the Online Computer Library Center (OCLC) began exploring in 1998 the development of a subject heading system that could be used in the metadata records designed by the Dublin Core Metadata Initiative for the description and indexing of electronic resources. OCLC determined that, for a list of subject headings to be suitable for providing access to electronic resources, it should:

• have a structure that is simple to assign, use and maintain;
• be manageable by people who are not catalogers and in settings other than libraries;
• provide optimal access points;
• be flexible and compatible with different disciplines and databases;
• be compatible with MARC, Dublin Core and other bibliographic description schemes;
• be easy to maintain and manage from a technological point of view.

The development of a controlled vocabulary of this nature could be achieved in two ways: by creating one or by using an existing list.
The product of OCLC's research has been FAST (Faceted Application of Subject Terminology), a normalized vocabulary, extracted from LCSH, which categorizes headings by various 'facets'.2 FAST, which adopts a modular format in which each facet has discrete groups of headings, contains features of both 'precoordination' and 'postcoordination'. It is 'precoordinated' because it uses headings with subdivisions, as long as they belong to the same facet. It is 'postcoordinated' because, in cases where complex headings have not been developed or where the headings belong to different facets, subdivisions are not used; additional headings are established instead. FAST currently has an authority database (OCLC, 2008), a preliminary manual (OCLC, 2007) and a more extensive text (Chan and O'Neill, 2010). The database contains two files holding the headings of each facet: on the one hand Topic,3 Place, Time, Person, Corporate body, Uniform title and Event,4 and on the other Form or genre (see Table I). The headings included in the database respond to the concept of 'literary warrant', since they incorporate all headings that have been used at least once as subjects in WorldCat. The major change FAST proposes is the organization of headings in vertical form, which makes it possible to simplify LCSH syntax and reduce the number of unique headings without sacrificing their lexical richness (OCLC, 2007).

TOPIC: Global warming--Health aspects
PLACE: Colombia--Bogotá--Ciudad Bolívar
TIME: 2004-2008
PERSON: Juan Carlos I, King of Spain, 1938-
CORPORATE BODY: Biblioteca Nacional de México
UNIFORM TITLE: Cantar de mío Cid
EVENT: Revolution (Dominican Republic : 1973)
FORM OR GENRE: Biography--Dictionaries

Table I. The 'facets' of FAST.

Some studies have been carried out to test the viability of FAST beyond its use as part of the metadata of DCMI records. Mitchell and Hsieh-Yee have shown that the headings of Ulrich's Periodicals Directory can be replaced by FAST headings (2007). By contrast, the analysis of 5,000 LCSH headings converted to FAST, carried out by the Subject Analysis Committee of the Association for Library Collections and Technical Services (ALCTS), determined that, although most of the subject headings reflected the topic of the cataloged resource, some headings, once deconstructed, could lose the specific meaning they had when 'precoordinated' (Quiang, 2008). In a sampling carried out to find out what percentage of the 'precoordinated' headings assigned to various image collections of the Sistema de Bibliotecas of the Universidad de Puerto Rico5 could be converted into headings taken from FAST, it was found that all could be adapted with relative ease and that, in three of the five collections analyzed, most of the resulting headings were simple headings, which in the long run would yield an economy of terms. Although the sample used was small, the results support the viability of using FAST to simplify heading syntax.
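Before turning to the worked example in Figure 1, the faceting idea itself can be sketched in a few lines. The fragment below is a minimal illustration only, assuming a toy facet lookup table; in reality the assignment is made against the FAST authority file, not a hard-coded dictionary.

    # Toy sketch of FAST-style faceting of precoordinated LCSH strings. The
    # facet lookup below is an illustrative assumption, not the FAST authority file.
    FACET_OF = {
        "treasure-trove": "Topic",
        "caribbean area": "Geographic",
        "west indies": "Geographic",
        "spanish main": "Geographic",
        "maps": "Form/Genre",
        "early works to 1800": "Form/Genre",
    }

    def to_fast(lcsh_heading):
        # Split a horizontal LCSH string and redistribute each part to its facet.
        facets = {}
        for part in lcsh_heading.split("--"):
            clean = part.strip().rstrip(".")
            label = FACET_OF.get(clean.lower(), "Topic")
            facets.setdefault(label, set()).add(clean)
        return facets

    # Deconstructing several redundant precoordinated strings yields one small,
    # de-duplicated set of facet values (compare Figure 1 below).
    merged = {}
    for h in ["Treasure-trove--Caribbean Area--Maps--Early works to 1800.",
              "Treasure-trove--West Indies--Maps--Early works to 1800."]:
        for facet, values in to_fast(h).items():
            merged.setdefault(facet, set()).update(values)
    print(merged)

Deconstructed this way, the redundant precoordinated strings collapse into a handful of facet values, which is precisely the economy of terms observed in the sampling described above.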
A worked example, taken from the Rare Maps Collection, shows how FAST eliminated the redundancy of the LC headings within a single record and how the difficulties posed by 'precoordination' were overcome by regrouping the headings according to their various 'facets' (see Figure 1).

TITLE: A map of the West-Indies or the islands of America in the North Sea; with ye adjacent countries; explaning [sic] what belongs to Spain, England, France, Holland and c. also ye trade winds, and ye several tracts made by ye galeons and flota from place to place. According to ye newest and most exact observations by Herman Moll, geographer.

HEADINGS WITH LCSH STRUCTURE:
Treasure-trove -- Caribbean Area -- Maps -- Early works to 1800.
Treasure-trove -- West Indies -- Maps -- Early works to 1800.
Treasure-trove -- Spanish Main -- Maps -- Early works to 1800.
Caribbean Area -- Maps -- Early works to 1800.
West Indies -- Maps -- Early works to 1800.
Spanish Main -- Maps -- Early works to 1800.

HEADINGS WITH FAST STRUCTURE:
Topical: Treasure-troves
Geographic: South America--Spanish Main; Caribbean Area; West Indies
Form: Maps; Early works to 1800*
*Also used as a common subdivision.

Figure 1.

FAST represents a different vision and a collaborative effort between LC and OCLC to make the search for normalized subject headings more approachable and to bring catalogs closer to users' preferences. Although it was initially conceived for use with DCMI records, it has shown that it can simplify and speed up the subject cataloging of all kinds of formats. While LC has decided not to adopt the scheme,6 OCLC has its support for continuing to update the database,7 so FAST constitutes an option for libraries and information centers wishing to explore the potential of a normalized vocabulary that is built vertically.

4. FOLKSONOMIES AND DELICIOUS

LCSH and other similar lists of subject headings, as well as FAST, are examples of the strategies used by specialists in the organization of information to enable users to search effectively and systematically. At the same time, and in parallel, people have been organizing the information they find according to their own particular preferences. At the end of the 1990s, the subjects or 'tags' assigned by users, which had remained in the privacy of their files and computers, began to be shared, thanks to social networks. This form of informal social classification has been called folk classification, social tagging, and folksonomy, whence the Spanish terms taxonomía popular, marcadores sociales and folksonomías.

    Folksonomy is the result of personal free tagging of information and objects (anything with a URL) for one's own retrieval. The tagging is done in a social environment (usually shared and open to others). Folksonomy is created from the act of tagging by the person consuming the information (Vander Wal, 2007).

Numerous tools now exist on the Internet that allow people to formulate subjects at various points in their interaction with the Web and to share them with everyone who visits their pages. It appears that "the Babel of Internet content has found a new way to bring order to its ocean: folksonomies, a new paradigm of information classification that allows Internet users freely to create tags to categorize all kinds of content" (Estalella, 2005). By allowing users to classify information, these tools yield important information about users' behavior when searching for information (West, 2007; Seoane, 2007) and about their language preferences (McElfresh, 2008). Evidently, the vast majority of users do not worry about solving problems of homonymy or synonymy, gender, number, the order of elements, the language of the tags, hierarchy, grammatical categories and other features essential to vocabulary normalization. For this reason, the technologies that promote the spread of social bookmarking are often regarded as anti-authoritarian technologies that reject the controls traditionally exercised by librarians (Gilmour and Stickland, 2009).
One may also ask, though, to what extent users are able systematically to retrieve the information they tag in this way. On the other hand, some librarians who provide direct public services argue that the structure of online catalogs, databases and even portals remains very rigid, in contrast with the simple interface of Google that users find so attractive, and they consider that folksonomies allow users to take part in the organization of information resources, a task that used to be the exclusive province of cataloging librarians. Moreover, folksonomies let them do so quickly and dynamically (Rethlefsen, 2007). The function of libraries is, decidedly, to facilitate the search for information. That search can be frustrating when the user browses the Internet unsystematically. Therefore, just as normalized vocabularies are adopted to organize information resources in catalogs and indexes, other options can be explored to help both librarians and users organize the free electronic resources they come across on the Web.

One of the social tools that makes it possible to explore folksonomies is Delicious. Delicious, which has more than 5 million users and more than 180 million unique links, is a free social bookmarking system in which anyone can create a virtual space to save, describe, tag, organize, manage and share any kind of electronic link from any computer. The tagging structure in Delicious is based on the concepts of Bookmark, Tag, and Bundle (a grouping of tags). The tool allows the development of personal folksonomies, simpler or more complex, based on users' preferences. People can assign as many tags to their links as they wish and regroup them hierarchically. The system offers complete freedom to select, describe, tag and edit the information of the links included. Changes are updated in real time, and the list of links can be arranged in chronological or alphabetical order. As tags are added, the system builds an automatic thesaurus that avoids having to retype tags that have been used before. Its flexibility makes it possible to build 'precoordinated' or 'postcoordinated' vocabularies, normalized or unnormalized.

The Delicious template shows the user the 10 most-used tags, the bundles, and all the tags created, with the number of links under each one. From a personalized page, one can see the links selected by other people and their tags. By creating Networks, people can connect their pages with those of others who have gathered links of interest to them; these pages appear as Fans of the chosen pages. Delicious also lets its users subscribe to the subjects that interest them in order to review periodically the links that other people are selecting. The system is very stable and allows links to be exported with all their data as a safety measure. It also has a forum for sharing experiences, questions and suggestions, and offers the option of requesting technical support.
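The Bookmark, Tag and Bundle structure and the automatic tag thesaurus just described can be summarized as a simple data model. The sketch below is an illustrative reconstruction, not Delicious's actual implementation or API; the class name, methods and sample bookmark are invented for the example.

    # Illustrative data model for the Bookmark / Tag / Bundle structure described
    # above; an invented reconstruction, not Delicious's actual implementation.
    from collections import defaultdict

    class TagStore:
        def __init__(self):
            self.bookmarks = []                 # (url, description, tags)
            self.bundles = defaultdict(set)     # bundle name -> tags grouped under it
            self.tag_counts = defaultdict(int)  # doubles as the automatic thesaurus

        def add_bookmark(self, url, description, tags):
            self.bookmarks.append((url, description, list(tags)))
            for tag in tags:
                self.tag_counts[tag] += 1       # reused tags need not be retyped

        def suggest(self, prefix):
            # Automatic-thesaurus behaviour: offer previously used tags as you type.
            return sorted(t for t in self.tag_counts if t.startswith(prefix))

        def top_tags(self, n=10):
            # The Delicious template shows the user's most-used tags.
            return sorted(self.tag_counts, key=self.tag_counts.get, reverse=True)[:n]

    store = TagStore()
    store.add_bookmark("http://www.rae.es", "Royal Spanish Academy",
                       ["lengua", "diccionarios"])
    store.bundles["Lenguas"].update(["lengua", "diccionarios"])
    print(store.suggest("dic"))  # -> ['diccionarios']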
It allows various subjects to be searched, alone or in combination, in order to capture new links for the established tags. On accessing the page, one can see the date on which links were added and how many other people have selected the same link. In addition, the link to a page created in Delicious can be placed on other Internet pages so that different users can access its content. Delicious allows librarians, accustomed to determining the structure and content of the taxonomies used to retrieve information, to compare those structures with the ones users choose freely. It also allows them to organize the links they wish to capture and retrieve systematically by developing simple structures that users can employ, both to search the library's Delicious and to get ideas on how to improve the systematic retrieval of the links they save and tag in different social tools. Owing to its usefulness, some librarians are beginning to include it in their training classes (Rethlefsen, 2007).

Bearing in mind that one of the difficulties users face when consulting controlled vocabularies is their 'precoordination', it was decided to examine a sample of the links saved on various Delicious pages. In a first sampling, three links chosen by users were examined (one in Spanish, one in English and one in French); three hundred of the headings used to describe them were chosen at random, and the proportion of simple headings to complex or subdivided headings was determined. The analysis of nine hundred headings yielded 89.89% simple headings (Montalvo, 2008). A year later, ten links in Spanish were selected and one hundred headings of each were analyzed at random. Once again, 89% of the headings were simple, revealing a clear preference for discrete units over headings built horizontally (see Table II).

Page | Tags examined | Simple tags | Complex or subdivided tags | % simple tags
EUMEDNET: Biblioteca virtual y enciclopedia de las ciencias sociales, económicas y jurídicas | 100 | 87 | 13 | 87%
Diccionario panhispánico de dudas (2005) | 100 | 96 | 4 | 96%
Internet invisible | 100 | 89 | 11 | 89%
Redalyc: Red de revistas científicas de América Latina y el Caribe, España y Portugal | 100 | 77 | 23 | 77%
BBC en español | 100 | 96 | 4 | 96%
EFE | 100 | 86 | 14 | 86%
Tesis doctorales en red (TDR) | 100 | 87 | 13 | 87%
Biblioteca Virtual Miguel de Cervantes | 100 | 83 | 17 | 83%
Dialnet | 100 | 90 | 10 | 90%
Comunidades de wikis libres para aprender | 100 | 94 | 6 | 94%
Total | 1,000 | 885 | 115 | 89%

Table II. Simple and complex or subdivided 'tags' in a Delicious sample.

5. THE DELICIOUS OF THE SISTEMA DE BIBLIOTECAS OF THE UPRRP

The Sistema de Bibliotecas of the Río Piedras campus of the Universidad de Puerto Rico has an extensive collection of subscription databases and other licensed electronic resources. It also, however, recommends the use of various electronic resources freely available on the Internet. Initially, all these resources were included in a single list on the SB Portal, but that required very restrictive selection criteria. Moreover, users, seeing them in the same place, assumed that the library had some control over how they functioned.
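As a rough illustration of how a sample like the one in Table II might be classified, the fragment below counts simple versus complex tags. The criterion used, treating any tag containing whitespace or a subdivision mark as complex, is an assumption made for the sketch; the study does not publish its exact rule, and the sample tags shown are invented.

    # Illustrative classifier for simple vs. complex tag counts as in Table II.
    # The rule below (whitespace or "--" marks a complex tag) is an assumed
    # stand-in; the study's exact criterion is not published.
    def is_complex(tag):
        return (" " in tag.strip()) or ("--" in tag)

    def simple_share(tags):
        simple = sum(1 for t in tags if not is_complex(t))
        return 100.0 * simple / len(tags)

    sample = ["referencia", "bases de datos", "revistas", "acceso abierto", "tesis"]
    print(f"{simple_share(sample):.0f}% simple tags")  # -> 60% simple tags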
For these reasons, after evaluating Delicious, it was decided to adopt it as a tool for developing a collection of free electronic resources (see Figure 2). The page is restricted to databases, indexes and reference works, multidisciplinary in content, published mostly by governments and institutions, and showing signs of stability.

Figure 2. [Screenshot of the SB's Delicious page; image not reproduced in this text version.]

In developing a Delicious for the SB, a synthetic scheme was chosen, because a pre-established basic structure was wanted that would make it possible to observe the growth of links in each discipline and preserve the collection's multidisciplinary character. A new structure could have been created, or an existing one used. Since most of the SB libraries use the Dewey classification system, subject groupings (Bundles) were established following the general divisions of the Dewey classification, with slight modifications. From this basic scheme, tags have been assigned freely. No pre-established list is used, except the thesaurus generated as tags are created. Each time a link is chosen, it is immediately assigned all the necessary specific subjects, in Spanish, except in cases where no term exists in that language. Simple 'postcoordinated' headings predominate. It is sometimes necessary to use complex headings to name set phrases or proper names, but no subdivisions are added. Each heading created is assigned to a bundle, establishing whole/part relationships. Topical, geographic, name and form headings are assigned. A heading has also been created to set apart the links whose content is in Spanish. The tags are reviewed and normalized periodically, a task made easier by the automatic thesaurus. Care is also taken that the number of headings does not grow excessively; synonyms are avoided, and an easy-to-remember nomenclature is maintained. The SB Delicious currently has 235 tags, of which 80% are simple headings. The remaining 20% are complex headings, of which a third are proper names. The number of tags has been stabilizing, since they tend to recur, thanks to normalization. A total of 298 links have been selected, a number that varies constantly, as new links are added periodically and inactive ones are removed. Since its creation in April 2008, more than 95% of the selected links have remained active. Users reach the Delicious from the SB's web page and blog, and they can also add it to their own Delicious pages and to other social networks if they wish. Although it was decided to establish 'postcoordinated' headings not tied to any pre-established list, in an attempt to resemble the way users choose their own tags, various subject schemes can be used in Delicious, especially if they are agile and have a simple syntax, like that which characterizes FAST. The decision will depend on the degree of normalization desired and on the preference for a hierarchical or a vertical structure.

6. CONCLUSIONS

Controlled vocabularies have the strengths needed to remain the key pieces of bibliographic research, as LCSH and other similar lists of subject headings continue to demonstrate.
Nevertheless, 'precoordination', one of their great strengths when it comes to ranking headings hierarchically, is also one of the aspects least understood by users. FAST, initially developed to supply the subject headings that would form part of DCMI metadata records, overcomes this difficulty by allowing the construction of 'postcoordinated' headings. It also simplifies the syntax and speeds up cataloging, while preserving the lexical richness and logistical support of LCSH. Opinions are divided as to the loss of specificity that deconstructing the headings may cause. Even so, FAST is a viable option for libraries and information centers wishing to explore the potential of a normalized vocabulary built vertically. In parallel, folksonomies have given users total freedom to tag the information they find on the Internet, but the lack of normalization can be an obstacle to the systematic retrieval of that information. Therefore, just as normalized vocabularies are adopted to organize information resources in catalogs and indexes, other options can be incorporated to help both librarians and users organize the free electronic resources they find on the Web. One of the social tools that reveals folksonomies is Delicious, a free social bookmarking system in which anyone can create a virtual space to save, describe, tag, organize, manage and share any kind of electronic link from any computer. The analysis of a sample of the tags chosen by users to organize their links in Delicious points to a preference for simple headings and for 'postcoordination'. From a librarian's point of view, Delicious is very useful for developing flexible and scalable link indexes that make it possible to organize many electronic resources that would otherwise probably be forgotten in some bookmark file. Delicious is also a tool that can help weigh the advantages and disadvantages of vocabularies, controlled and uncontrolled, 'precoordinated' and 'postcoordinated'; show users other ways of organizing their folksonomies; and provide the new generations of professionals devoted to in-person and virtual reference services with simple but useful ways of guaranteeing the systematic retrieval of information.

NOTES

1. The Spanish verbs 'normalizar' and 'tipificar', from which the adjectives normalizado and tipificado derive, describe the process by which the entries representing different concepts are artificially chosen. We have chosen the anglicism 'controlado' because of its widespread use and because one of its senses also includes the concept of 'regularization'.
2. The term 'faceta' is a literal translation of facet, defined by the Oxford English Dictionary as the different categories or classes into which something can be classified simultaneously. The OED attributes its origin to S.R. Ranganathan's classification system.
3. We have translated 'Topics' as 'Asuntos'. In the FAST manual, topical headings are subdivided into: Concepts, Objects, Events, Form/Genre as subjects, Animals, Imaginary Persons, Places, etc., and Geologic Periods.
4. In the FAST manual, Events form part of the topical subject headings. In LCSH, some chronological subdivisions contain the name of the event. FAST separates the topical and chronological components because they belong to two different 'facets'. In cases where the name of the event is generic, the place and date are specified by means of a delimiter.
5. The subject description of the image collections is not uniform: some have been cataloged using LCSH (in English); others contain freely assigned headings in Spanish.
6. At the V Encuentro Internacional de Catalogadores, held in Santo Domingo, Dominican Republic, on 27-29 October 2009, Barbara Tillett stated that LC would not adopt FAST in its cataloging.
7. In an e-mail sent by Edward O'Neill to the author in December 2009, he stated that LC and OCLC will continue to support the development and updating of FAST.

BIBLIOGRAPHY

BALLARD, T. Keyword/subject: finding a middle path. Information Today, 1998, vol. 15, no. 6, p. 58.
CHAN, L.M. Library of Congress subject headings: principles and application. Westport, Conn.: Libraries Unlimited, 2005.
CHAN, L.M. and O'NEILL, E. FAST: Faceted Application of Subject Terminology: principles and application. Westport, Conn.: Libraries Unlimited, 2010.
ESTALELLA, A. La folksonomía emerge como sistema para clasificar contenidos en colaboración. El País, 8 September 2005. Available at: <http://adolfoestalella.googlepages.com/050908_Folksonomias.pdf> [Accessed: 13 September 2009].
GILMOUR, R. and STICKLAND, J. Social bookmarking for library services: bibliographic access through Delicious. College & Research Libraries News, 2009, vol. 70, no. 4, p. 234-237.
LIBRARY OF CONGRESS. Library of Congress subject headings: pre- vs. post-coordination and related issues report. Washington, D.C.: Library of Congress, 2007. Available at: <http://www.loc.gov/catdir/cpso/pre_vs_post.pdf> [Accessed: 20 October 2009].
MCELFRESH, L.K. Folksonomies and the future of subject cataloging. Technicalities, 2008, no. 28, p. 2.
MONTALVO MONTALVO, M. Vocabularios controlados y FAST: la asignación de materias en el siglo XXI. In: Martínez Arellano, Filiberto Felipe, comp. III Encuentro de catalogación y metadatos: memoria. UNAM: México, 2009, p. 41-64.
OCLC. FAST (Faceted application of subject terminology): applications guide and documentation, version 9. OCLC: Dublin, Ohio, 2007. Available at: <http://www.oclc.org/info/research/wip/fast/manual-20070112.pdf> [Accessed: 9 August 2009].
RETHLEFSEN, M. Tags make libraries Del.icio.us: social bookmarking and tagging boost participation. Library Journal, 2007, vol. 132, no. 15, p. 26-28.
SEOANE, C. Flexibilidad de las folksonomías. Anuario ThinkEPI, 2007, no. 1, p. 74-75.
VANDER WAL, T. Folksonomy, 2007. Available at: <http://vanderwal.net/folksonomy.html> [Accessed: 20 September 2009].
WEST, J. Subject headings 2.0: folksonomies and tags. Library Media Connection, 2007, vol. 25, no. 7, p. 58-59.
work_74wnkrnkfvgh5iapkcadf2vxae ---- This is an accepted author manuscript of the following output: Nicholson, D., & Macgregor, G. (2003). NOF-Digi: putting UK culture online. OCLC Systems and Services, 19(3), 96-99. DOI: 10.1108/10650750310490298

Distributed Digital Libraries

'NOF-Digi': Putting UK Culture Online

Dennis Nicholson and George Macgregor

The authors
Dennis Nicholson is Director of the Centre for Digital Library Research, Strathclyde University, Glasgow, Scotland. George Macgregor is a Researcher at the Centre.

Keywords
Distributed digital libraries, digitisation programmes, UK, NOF-Digi

Abstract
This article describes a major digitisation programme aimed at improving online access to UK cultural resources from Britain's museums, libraries and galleries for lifelong learners and others. The programme is supported by lottery funding of £50m and provides free access to important areas of the country's diverse cultural, artistic, and community resources. The article describes the programme, highlights some of the projects, and looks at areas where improvements to programme coordination might have been made. At time of writing, most of the projects are still in progress.

Electronic Access
The research register for this journal is available at http://www.emeraldinsight.com/researchregisters
The current issue and full text archive of this journal is available at http://www.emeraldinsight.com/1065-075X.htm

NOF, NOF-Digi, the National Grid for Learning, and the People's Network

The New Opportunities Fund (NOF) (see http://www.nof.org.uk/) awards grants to education, health and environment projects in the UK, distributing Lottery funds to support a range of worthwhile initiatives. One of its programmes, NOF-Digitise - launched in August 1999 and popularly known as 'NOF-Digi' - aims to create innovative online resources of benefit to every UK citizen, bringing together over 500 partner organisations to create support for lifelong learning under the broad themes of citizenship, re-skilling, and cultural enrichment. A budget of £50m supports digitisation initiatives offering content from a wide variety of sources, ranging from major collections such as the Science Museum and the National Libraries, through regional 'sense of place' collections, to material in community museums and voluntary organisations. The material being digitised encompasses 'text, drawings, photos, maps, film and sound recordings and much more' and is particularly aimed at schools through the National Grid for Learning (NGfL) and at public library users through the People's Network (People's Network, 1998). More information on these latter initiatives can be found at http://www.ngfl.gov.uk/ and http://www.peoplesnetwork.gov.uk/

The Enrich UK Portal

At time of writing, these projects are still ongoing. There is, however, already a good deal of material available, albeit mainly in embryonic form. There are a variety of access routes to the material. Obviously, it is accessible via individual project websites and even (sometimes) websites highlighting particular local collections within bigger projects.
It is also available through the websites of consortia combining a range of projects, and through sites offering English, Irish, Scottish, and Welsh perspectives on the material. For easy access to all of the 150-plus projects funded, however, the best approach is to use NOF's own 'EnrichUK' portal (see http://www.enrichuk.net/). Here you will find access to a range of treasures covering a wide diversity of subjects, including (to name only a few):
• The heritage, community history and culture of Staffordshire
• Trials which took place at the Old Bailey (London's central criminal court) between 1674 and 1834
• Scottish traditional music and dance (James Scott Skinner)
• Parliamentary Reports from the Victorian and other eras
• UK Flora and Fauna
• Royal Shakespeare Company pictures and exhibitions
• The Industrial Revolution
• The History of Glasgow
• The story of Huntley & Palmers, world famous biscuit company
• Caribbean, Irish, Jewish and South Asian migrations to England
• The Cotton Industry
Projects: A Snapshot
It is, of course, impossible here to give more than a limited glimpse into the breadth and depth of materials being made available through NOF-Digi. The few projects described briefly below are presented as representative of the many excellent initiatives under development within the programme. It is not a list of the only projects worthy of note. That having been said, they do, hopefully, help to add flavour to what might otherwise be a rather dry and dusty account:
British Pathe
British Pathe is one of the oldest and most notable media companies in the world, producing famous bi-weekly newsreels and Cinemagazines from 1902 onwards. When British Pathe ended production of its newsreels in 1970, it had accumulated over 3,500 hours of filmed history amounting to over 90,000 individual items. These have now formed the basis of the BritishPathe.com (see http://www.britishpathe.com/) web site offering access to 3,500 hours of video footage and over 90,000 web pages. Video clips can now be downloaded and viewed free. Powerful searching and browsing tools are available, but the sheer size of the collection is such that a 'Lucky Dip' facility is offered to provide users with a random selection of clips, which can then be previewed or downloaded. By late spring 2003, BritishPathe.com also hopes to offer over 12 million JPEG stills.
Applause Southwest
The Applause Southwest (see http://www.applausesw.org) archive contains material pertaining to theatre and theatrical arts in the South West of Britain. The fully searchable archive allows users to view records and search the archive for digitised historical playbills or posters from 1780 until the present day, view historical archives and objects that have never before been seen by the public, and learn of theatrical developments in Plymouth and the surrounding area since the mid 18th century. Users can also delve into digitised images, histories and 3D virtual reconstructions of theatres, many long since demolished.
I Dig Sheffield
I Dig Sheffield (http://www.idigsheffield.org.uk) provides an online guide to archaeology from Sheffield and the Peak District. More than 400 objects excavated from over 30 digs in South Yorkshire and Derbyshire, many too fragile to be displayed at the city's museum, have been preserved, then photographed and mounted on the web site.
Archaeological finds can be browsed by category, including areas such as dress and accessories, food and farming, or conflict and war; alternatively the collection can be searched by keyword, by time period, by find material and by other associated characteristics.
The Union Makes Us Strong: TUC History Online
TUC History Online (see http://www.unionhistory.info) makes accessible many of the unique collections held in the Trades Union Congress Library. Trade unions have been instrumental in influencing economic, social and political developments in the UK. However, as is noted on the web site, 'much of their history is at present unknown and inaccessible to the public'. TUC History Online aims to correct this by providing access to a dynamic set of new resources culled from books, pamphlets, union publications, ephemera and documents pertaining to industrial relations and working conditions held by the library. Although the majority of resources, at present, relate to the Match Workers strike in 1888 and a 150-year labour history timeline, five new learning resources will be released in phases throughout 2003 and will include collections relating to the General Strike, The Ragged Trousered Philanthropists and numerous TUC reports.
Gathering the Jewels
Gathering the Jewels (see http://www.gtj.org.uk/) brings together a unique collection of Welsh cultural resources including historic letters, paintings, documents, artefacts and photographs, many of them exceptional, amassed from libraries, museums and archives all over Wales. With individual items enhanced by annotation (available in both English and Welsh) and contextualised within themed super and sub collections, Gathering the Jewels provides search and browse tools capable of tapping into a wealth of material pertaining to the natural, political, economic and social history of Wales. Whilst still in development, Gathering the Jewels predicts that the collection will contain over 20,000 images by May 2003 and will become a national learning resource.
Act of Union
The Act of Union Virtual Library (see http://www.actofunion.ac.uk/) is a digital resource relating to the 1801 Act of Union between Ireland and Britain and makes accessible hundreds of digitised pamphlets, newspapers, parliamentary papers and manuscript material culled from specialist collections in various public institutions in Belfast, all searchable bibliographically, and displayed in such a way as to facilitate browsing. Since many of the pamphlets and parliamentary papers have not only been digitised but also had their contents text keyed to offset the inconsistent typeface quality, free text searching is also provided, constituting a resource of 'inestimable use for scholars'.
Conclusion: Problems and Issues
There are, undoubtedly, many positive things to be said about NOF-Digi additional to those already made above. In a column written for professionals active in the field under scrutiny, however, there is, perhaps, more value to be had by examining some of the areas where improvements might have been made - in the interests of ensuring that mistakes (if this is what they are; oversights might be an equally valid term) are not repeated in later or other programmes. At time of writing, with most of the projects still a long way from completion, it is too early to be comprehensive in this respect. As far as we are aware, no detailed survey has been undertaken as yet, and this does not claim to be one.
Hopefully, however, it is close to being the next best thing - an informed view of concerns being expressed by professionals working in a range of NOF-Digi projects to an organisation (the CDLR) that is involved in two projects itself and is advising some others. Speaking from this perspective, we are aware of the following points being made by participants in the programme:
• Poor balance between expenditure on content and expenditure on metadata creation. There is some concern that many projects underestimated the time, effort and expertise required to create the metadata needed to adequately describe digitised materials for retrieval and comprehension. This concern has been reported to NOF by at least one group of projects, but we are not aware of any action having been taken to date. The problem has at least two sources. On the one hand, there has been a failure to recognise that creating metadata for something like a digitised photograph is more difficult and time-consuming than doing so for a book or a similar electronic resource - in particular, a photograph usually has nothing even remotely like a title page for the cataloguers to base their work on. On the other, there is still a general failure to recognise the increasing importance of professional quality interoperable metadata for finding and identifying appropriate resources amongst the ever-growing volume of material available over the Internet. When will we learn that a valuable resource is only valuable if those who need it can find it, and that adequate expenditure on metadata is not a drain on resources available for content but essential expenditure if the true value of content is to be realised?
• Failure to recognise the importance of metadata content standards. These are, of course, essential if interoperability across collections is to be ensured. Ensuring that all projects offer at least the DC core fields is of limited value if there is no agreement as to how authors or place names or photograph captions should be constructed and encoded (a small illustration follows this list). NOF, to its credit, realised from the start that the ability to deliver a fully networked information environment for learning and cultural enrichment would be difficult to achieve without the use of numerous technical standards and guidelines, and had an extensive technical standards document covering issues of preservation, interoperability, accessibility, metadata, collection management and security drawn up and made available (UKOLN & Resource, 2003). However, the document does not cover the issue of metadata content standards and it is certain that there will be deficiencies in interoperability across the NOF-Digi environment as a result. A related concern is the area of subject description. Not only has there been no guidance offered in this area - a significant oversight when almost every project is offering subject-based access - but even the Enrich UK site has found it easier to invent a subject scheme rather than adopt a standard scheme. Some groups within the total programme are attempting to address the issue, but again there is little likelihood of a standard subject approach across the whole environment being possible.
• Insufficient consultation with participants. With hindsight, it would seem sensible to have had in place mechanisms that would allow participants to interact more helpfully with those managing the programme. Each of the two problems mentioned above might have been tackled early on if such mechanisms (e.g. regular meetings with NOF Project staff) had been in place. Both were identified and 'telegraphed' to NOF at an early stage but there appears to have been no adequate mechanism in place for identifying and tackling problems as they arose. A lack of consultation has also been blamed for other difficulties. One example here is the failure of the portal to highlight important sub-collections otherwise 'hidden' under entries for consortia (e.g. there are collections covering topics like 'Red Clydeside', 'Springburn Community Museum' and 'Witchcraft in Ayrshire', to name but a few, hidden under a single entry for the Resources for Learning in Scotland project). Another is the feeling - shared by most projects - that the promotional programmes were pushed on projects too early in their development.
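To make the content-standards point concrete, here is a minimal sketch - our own illustration, not anything drawn from the NOF guidelines - of two hypothetical Dublin Core records whose 'creator' values name the same person in different forms. The records, the name forms and the normalisation rule are invented for illustration; only the Dublin Core element names are real.

```python
# Illustrative only: two hypothetical Dublin Core records describing material
# about the same person, encoded without a shared content standard.
record_a = {
    "title": "Portrait of James Scott Skinner",
    "creator": "Skinner, James Scott",   # inverted, catalogue-style form
    "coverage": "Aberdeen (Scotland)",
}
record_b = {
    "title": "J. Scott Skinner playing at Banchory",
    "creator": "James Scott Skinner",    # direct order, uncontrolled form
    "coverage": "Aberdeen",
}

def same_creator(a: dict, b: dict) -> bool:
    """Naive cross-collection match: fails when name forms differ."""
    return a["creator"] == b["creator"]

def normalised_name(name: str) -> str:
    """Toy normalisation reducing both forms to 'surname, forenames'.
    A real content standard (e.g. an agreed name-authority rule) would
    specify this once, programme-wide, instead of per project."""
    if "," in name:
        return name.lower()
    parts = name.split()
    return f"{parts[-1]}, {' '.join(parts[:-1])}".lower()

print(same_creator(record_a, record_b))                  # False
print(normalised_name(record_a["creator"]) ==
      normalised_name(record_b["creator"]))              # True
```

Agreeing such rules once, across the whole programme, is precisely what a metadata content standard does; without one, each project's records remain internally consistent but collectively unsearchable.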
None of these points should be taken as detracting from the generally positive contribution of NOF-Digi and the projects it supports to the delivery of an enriched understanding of UK cultures and sub-cultures both at home and internationally, nor from the resulting enhancement of the web presence of the cultures of England, Ireland, Scotland and Wales generally. Hopefully, however, they can and will be taken as pointers to areas where all of those involved in digitisation programmes and projects should and could improve performance in future.
Reminder
As I think I've made clear, I am keen to interact with readers of this column, so please feel free to contact me. My email address is d.m.nicholson@strath.ac.uk
References
People's Network. (1998), "New Library: The People's Network". London: Library and Information Commission. Available at: http://www.ukoln.ac.uk/services/lic/newlibrary/full.html
UKOLN & Resource. (2003), "NOF-Digitise: Technical Standards and Guidelines". London: People's Network. Available at: http://www.peoplesnetwork.gov.uk/content/technical.asp

work_764lpsotsjexdc45kqkzcm4n24 ----

by Norm Medeiros
Coordinator for Bibliographic and Digital Services
Haverford College
Haverford, PA
Those who can, teach: an interview with Jane Greenberg
___________________________________________________________________________________________________
{A published version of this article appears in the 20:4 (2004) issue of OCLC Systems & Services.}
"Emotive language has a tendency to increase and multiply according to some law of rhetoric. The end result is the generation of more intense heat and less clear light." - Thomas Wassmer
ABSTRACT
This article features an interview with Jane Greenberg, Associate Professor in the School of Information and Library Science, University of North Carolina at Chapel Hill. Ms. Greenberg discusses metadata education, her research projects, and the future of the Semantic Web. She describes the Metadata Generation Research and Automatic Metadata Generation Applications projects, the ways library school curricula have changed and will likely change in the near future, and the influence Dublin Core has had on her career.
KEYWORDS
Jane Greenberg; library and information science education; metadata; Metadata Generation Research project; MGR project; Automatic Metadata Generation Applications project; AMeGA project; Semantic Web; Dublin Core Metadata Initiative; DCMI
The most influential teacher I ever had was Thomas Wassmer, an abrasive Jesuit who was unafraid to address difficult questions in his writings or in his classrooms. I first experienced Dr.
Wassmer as a second semester freshman at the University of Massachusetts Dartmouth while attempting to enroll in his introductory philosophy course. When I asked if he’d sign me into his course, Dr. Wassmer looked down in a suspicious pose, with wide blue eyes and electrified white hair, and said that he would grant my request so long as I understood I would be worked hard and would contribute greatly to class discussion. It didn’t take long for me to realize how serious he was. Dr. Wassmer engaged his students, asking us to proclaim and defend our positions on charged social issues. All opinions were challenged, but none mocked, providing they were defended in an intellectually rigorous manner. On occasions when I’d seek his help prior to our 8:00am start time, he’d look up from his New York Times, reading glasses perched near the end of his rosaceous nose, and with the energy of someone who’d been awake for hours, bellow, “Good morning Medeiros.” He’d follow this drill sergeant welcome by recounting a story from the morning paper, a tale always more stimulating than the philosophical question that had prompted my visit. Soon I utilized these unofficial office hours merely to talk about life, which I think Dr. Wassmer knew and perhaps even enjoyed. Students in the University of North Carolina’s School of Information and Library Science are equally fortunate to have the opportunity to experience a similarly challenging and dynamic teacher. Jane Greenberg, Ph.D., is Associate Professor with tenure in UNC’s highly acclaimed graduate program. In a time when library school curricula are under attack, Ms. Greenberg stands in sharp contrast as an exemplar of what the best educators can be: innovators, mentors, motivators. In a mere five years at UNC, Ms. Greenberg has proven to be among her nation’s leading library school metadata educators. Recently I had a chance to talk with Ms. Greenberg about her teaching, research, and professional goals. NM: According to your vitae, your first library position was as an assistant in the Rush Rhees Fine Arts Library at the University of Rochester. What was that job like? JG: It was fun. There was a great learning atmosphere. I had wonderful supervisors. I still remember Katie Kinsky, and the Head Librarian, Stephanie Frontz, and how they explained the way each new responsibility I was assigned fit into the larger library entity. I did normal things that library clerks do: circulation desk, card filing, shelving, and so forth—the online catalog was just coming up then. I remember learning to bar code books in order to get them ready for OPAC- supported circulation. We had very nice office parties that helped with the learning process too! NM: Did you imagine 15 years later you would be teaching future library professionals? JG: No. I originally planned on linking my interests in art history and law, and becoming an art attorney and working in the art market. I was in college in the mid-1980’s. The art market was booming during this time, and one summer I interned at Citibank’s Art Advisory Service in Manhattan. Our division helped top tier clients build comprehensive art collections. It was fascinating, and I thought this was the direction I would go post-college. I thought about teaching for the first time when I was finishing my MLS at Columbia University and Richard Smiraglia, with whom I had taken two bibliographic control courses, suggested I should think about pursuing a doctorate in library science after a few years of working in the field. 
NM: When you arrived in Chapel Hill in 1999, metadata was still a buzzword. The Resource Description Framework (RDF) specifications had just been released by the World Wide Web Consortium, and the Santa Fe convention, from which would be born the Open Archives Initiative, was about to take place. How did these forces influence your teaching and research?
JG: The word "metadata" actually influenced my teaching a little earlier, during the last year and one-half of my doctoral studies (1997-1998) at the University of Pittsburgh. During this period, I was learning about the Dublin Core, the Encoded Archival Description (EAD), and the Text Encoding Initiative (TEI). In my last year in Pittsburgh, I taught several metadata workshops for the University's School of Information Sciences, PALINET, and AMIGOS. This experience set me on a path to focus my teaching and research efforts in the area of metadata. When I arrived in Chapel Hill, I built upon these workshops to design and offer a course entitled "Metadata Architectures and Applications." RDF, and later the Open Archives Initiative, were incorporated into the class, which covered a range of metadata issues and schemas. UNC's School of Information and Library Science is highly integrated, offering master's degrees in both library and information science. The first time I offered the metadata class, attendance was fairly evenly split between information and library science students. The healthy mix of students from both programs has continued over the last five years.
Developments underlying the Dublin Core have had a profound impact on my research. In 1998, while I was in the final stages of my dissertation research, the Dublin Core metadata standard was gaining support from an array of communities striving to facilitate resource discovery of Web resources. A significant aspect of the Dublin Core is that it was developed for resource authors; that is, it was to be simple enough so that resource authors could create metadata. When I learned about this goal, I immediately saw questions urgently requiring investigation about who should create metadata (resource author, cataloger, volunteers, etc.) and the best means of metadata production (automatic and/or human-oriented processes). These questions have served as a teaching focus and are central to the Metadata Generation Research (MGR) project, which I founded four years ago. While Dublin Core developments sparked my research efforts in this area, my interests extend to many other metadata domains.
NM: The work of your advisees at UNC is impressive. Their master's papers are interesting and timely, and not surprisingly, representative of areas in which your own research is focused. How do you extract such excellence from your students?
JG: First, let me say thank you for the compliment about my advisees. I continue to be impressed by UNC's students and the master's papers they complete across the board, with all of our faculty. Not every LIS program requires a master's paper. This requirement at UNC is demanding for both students and faculty, but the outcome is rewarding and speaks to the excellence that you note. Students engage in and learn, first-hand, about the research process; they become better consumers of research; and they make a contribution to our field. Students wanting to complete a master's paper in the area of organizing information/metadata often find their way to me.
I'm certainly not the only one to advise master's research in this area, but I am always happy to work with a student who has an interesting research question, and it's exciting when they find a link to my research. In the latter case, students frequently join the Metadata Project's team and conduct research that directly relates to the project. It's rewarding when students realize they have contributed to a research project's progress. The reward is amplified when they learn that their work is to be published, or they are listed as a co-author on a team publication. I believe such experiences inspire students to continue to contribute to the field as professionals after they graduate. In terms of extracting excellence, I can simply reiterate we have excellent students at UNC, and they take their master's paper work seriously. My colleagues and I value this experience as much as the students do.
NM: Much has been written and talked about regarding the demise of library school curricula, yet your teaching stands in direct contrast to the general sentiment that today's programs are not sufficiently preparatory or rigorous. What approaches in your teaching have helped you become a successful instructor?
JG: I stress the importance of theory and research, and integrate these tenets with practice. My master's degree education and experience as an information professional have been important influences on my teaching approach. Columbia University, where I earned my MLS, emphasized theoretical underpinnings of library and information science. Later, as a working professional, this understanding of theory helped me to articulate cataloging ideas and evaluate practices. This background has helped with formulating approaches focusing students' thinking on understanding what one is doing, and why. I emphasize exploration of these questions in my teaching because I believe they are paramount to producing superior information professionals.
NM: Over the next ten years, how do you see library school programs changing?
JG: The answer I give today may change over time. Currently, I see the library science programs changing in two ways, which I think will continue over the next ten years. First, programs are changing to serve the evolving library environment, which is now physical and digital, local and networked. Second, programs are increasingly linking with different disciplines to serve an array of information environments beyond the library (e.g., archives; medical, nursing, and bio-informatics; commerce; museums; scientific research centers, educational enterprises, and more). UNC, for example, has joint degrees with a number of programs on campus, such as Business Administration, Nursing, and Public Health, which may, I think, be indicative of change (information about these joint degrees is available at ). It seems to me that the size of library science programs and faculty expertise will determine the different disciplines or domains with which library science programs partner. Many disciplines need core library science functions to operate successfully in today's information world, and I think that library schools are, at least at the moment, valued programs on campuses. I hope it stays this way!
NM: You are principal investigator of the Metadata Generation Research (MGR) project, an innovative and much-needed study designed to investigate the integration of human- and machine-created metadata -- work that facilitates the Semantic Web.
How did you get involved in this work, and how is it proceeding?
JG: As indicated before, the development of the Dublin Core has had an important impact on my research, and made evident to me the need to investigate questions about "who" and "how" metadata should be created. When I arrived at UNC during the spring of 1999, I set out to find a partner and examine these metadata questions. Here, I have to thank former SILS Dean, Joanne Marshall, for putting Davenport ("Dav") Robertson, Library Director, National Institute of Environmental Health Sciences (NIEHS), in touch with me when his organization began their metadata initiative. When we first met, I advised his team on some basic metadata issues concerning the Dublin Core, and then convinced him and Ellen Leadem, Technical Services Librarian (NIEHS), of the need to study the metadata questions underlying the MGR project. I am grateful to Dav and Ellen for this research partnership, and the many students that have contributed to the research project.
[Photo: Jane's baby, Jonathan Bierck, at two months, with the Dublin Core Conference Proceedings - the youngest Dublin Core attendee! Photograph by Stu Weibel, OCLC.]
The MGR project has been funded by Microsoft Research, OCLC, and UNC's University Research Council, and we have disseminated our findings through a number of publications. Our immediate project is complete, although there is still some final research reporting to come. An aspect of the MGR project is continuing through the AMeGA (Automatic Metadata Generation Applications) project -- the goal of which is to identify and recommend functionalities for applications supporting automatic metadata generation in the library/bibliographic control community. The project is being conducted in connection with Section 4.2 of the Library of Congress' Bibliographic Control Action Plan, which is providing leadership to libraries and other information centers in this new millennium. (Information on both the MGR project and the AMeGA project can be found at .)
NM: You edited the April/May 2003 issue of Bulletin of the American Society for Information Science and Technology, an issue devoted to the Semantic Web. You concluded your editorial by saying, "The Semantic Web is an engaging territory to explore and cultivate" (Greenberg, 2003). What will it take to conquer this territory?
JG: Further collaboration and coordination is needed among a range of disciplines. People in library and information science need to work with people in computer science, psychology, linguistics, and other disciplines if we are to have a functional Semantic Web. The World Wide Web Consortium (W3C) is a powerful organization, and stands behind the development of the Semantic Web. This organization has the capability to provide needed management, but there must be buy-in from multiple parties. I recall commenting about the notion of "old wine in a new bottle" in the Semantic Web piece you reference here, and saying that this metaphor is not really true because today we have an unprecedented information infrastructure defined by the Web. I still stand behind this statement. The Semantic Web is likely not for every information source or repository, but the potential of "Semantic Web-like" operations or communities to help solve problems is exciting and I believe worth striving for. I'm an optimist, and I believe something good will come out of greater collaboration in striving for something like the Semantic Web.
I already see this happening as communities like the Dublin Core Metadata Initiative bring together people from many information sectors.
NM: You've accomplished much during your brief tenure at UNC. What's next for you professionally?
JG: My immediate plan is to continue current research activities, and share findings from the Metadata Generation Research and AMeGA projects. A more long-term goal is to develop the SILS Center for Metadata Research at UNC, and muster support for research on metadata creation, ontologies, content management, and other related issues. And, another goal in the coming months is to teach my 10½-month-old child to say metadata!

work_7fu3zmatojht3cyx3jwljhgv7m ----

El Derecho a la Verdad en el Ámbito Iberoamericano
Ius Humani | revista de derecho
Volume 4, biennium 2014-2015. Published in December 2015. Biannual frequency. ISSN 1390-440X.
Ius Humani, Revista de derecho is a forum open to researchers from all over the world, in all languages, publishing original studies on the rights of the human being (natural, human or constitutional) and on the most effective procedures for their protection, from the philosophical perspective as well as from that of the highest norms of the legal order. The print version of the journal (ISSN 1390-440X) appears twice a year and is printed at the end of the period. The digital version (ISSN 1390-7794) operates as a continuous publication: contributions are published as soon as they are approved. The journal is indexed in multiple systems such as LATINDEX, DRJI, OCLC WorldCat Digital Collection Gateway, MIAR, SAIF, Global Impact Factor, Infobase Index, ULRICHS, JournalTOCs, UIFactor, OAJI, e-Revistas, Google Scholar, DOAJ, I2OR, SJournal Index, EBSCO Legal Source, Academic Search Premier, ERIH Plus, HeinOnline, Dialnet, vLex and in many other catalogues and portals (COPAC, SUDOC, JournalGuide, etc.).
http://www.uhemisferios.edu.ec/revistadederecho/index.php/iushumani
For exchanges and subscriptions write to: revista@derecho.uhemisferios.edu.ec
Ius Humani. Revista de derecho. Facultad de Ciencias Jurídicas y Políticas, Universidad de los Hemisferios. www.uhemisferios.edu.ec/revistadederecho
PUBLISHING: Servicio de Publicaciones de la Universidad de los Hemisferios (SPUH), revista@derecho.uhemisferios.edu.ec
Address: Universidad de los Hemisferios / Paseo de la Universidad N° 300 y Juan Díaz (Urbanización Iñaquito Alto) / Quito - Ecuador. Postal code: EC170135. ISSN: 1390-440X. Quito - Ecuador. Print run: 300 copies.
TEXT REVISION AND PROOFREADING: Esteban Cajiao Brito
LAYOUT AND TYPESETTING: Isabel María Salgado, Mario De la Cruz
Ius Humani | revista de derecho. Universidad de los Hemisferios, Quito - Ecuador
Rector: Alejandro Ribadeneira
Dean of the Academic Unit of Legal and Political Sciences: Dr. René Bedón Garzón
Journal Director: Dr. Juan Carlos Riofrío Martínez-Villalba, Universidad de los Hemisferios
Scientific Committee: Mons. Antonio Arregui Yarza, Conferencia Episcopal Ecuatoriana (Ecuador); Dr. Pedro Rivas Palá, U. de La Coruña (España), U. Austral (Buenos Aires, Arg.); Dr. Hernán Olano García, Universidad de la Sabana (Bogotá, Colombia); Dr. Carlos Hakansson Nieto, Universidad de Piura (Perú); Dr. Hernán Pérez Loose, Universidad Católica Santiago de Guayaquil (Ecuador); Dr. Luis Castillo Córdova, Universidad de Piura (Perú)
Editorial Committee: Dr. Julián Mora Aliseda, Universidad de Extremadura (Cáceres, España); Dr. Alfredo Larrea Falcony, Universidad de los Hemisferios (Quito, Ecuador); Dr.
Juan Cianciardo, Universidad Austral (Buenos Aires, Argentina); Dr. Jaime Flor Rubianes, Pontificia Universidad Católica del Ecuador (Quito); Dr. Josemaría Vásquez, IDE Business School (Quito, Ecuador); Dr. Eduardo Puente, Corp. de Estudios y Publicaciones CEP (Quito, Ecuador); Mgr. María Teresa Riofrío M.-V., Centro Univ. Villanueva (Madrid, España); Dr. Marcelo Marín Sevilla, Universidad de los Hemisferios (Quito, Ecuador); Dr. Edgardo Falconi Palacios, Universidad Central del Ecuador (Quito); Dr. Javier Hernando Masdeu, Centro Universitario Villanueva (Madrid, España)
Contents
"JUAN LARREA HOLGUÍN" PRIZE
La intangibilidad de las acciones privadas de las personas / The right to the true in Iberoamerica. Mauricio Maldonado Muñoz. 9-48
ARTICLES
Bioetica giudiziaria in Italia: note critiche su una sentenza recente in tema di protezione della vita prenatale / Judicial Bioethics in Italy: Critical Notes on a Recent Judgment Concerning Protection of Prenatal Life. Claudio Sartea. 49-76
El derecho de propiedad privada y libertad económica. Algunos elementos legales, filosóficos y económicos para una teoría general / The right to private property and economic freedom. Some legal, philosophical and economic elements for a general theory. Santiago M. Castro Videla, Santiago Maqueda Fourcade. 77-113
La relación entre moral y derecho en el paleopositivismo y el positivismo jurídico. Aportes para una crónica / The Relation between Morality and Law in "Paleopositivism" and Ius Positivism. Contributions to a Chronicle. Jorge Guillermo Portela. 115-156
Sembrar Derechos. Reconfiguración del trabajador rural como sujeto de derecho, en los procesos de integración regional y universal / To sow rights. Reconfiguration of the Argentine rural worker, as a subject of regional and universal right. Daniela Verónica Sánchez Enrique. 157-192
La protección jurídica del medio ambiente en la jurisprudencia de la Corte Interamericana de Derechos Humanos / The environmental protection in the jurisprudence of the Inter-American Court of Human Rights. Valerio de Oliveira Mazzuoli, Gustavo de Faria Moreira Teixeira. 193-226
Neoconstitucionalismo negativo y neoconstitucionalismo positivo / Negative and positive neo-constitutionalism. Giovanni Battista Ratti. 227-261
Transformaciones judiciales en el Ecuador: El equilibrio de poderes visto a través del análisis de redes sociales / Judicial Transformations in Ecuador: The Balance of Power Seen through the Analysis of Social Networks. Efrén Ernesto Guerrero. 263-297
La silla vacía y el dilema de la participación ciudadana en el Ecuador / The empty chair and the citizen participation's dilemma in Ecuador. José Luis Castro Montero. 299-330
REVIEWS
La nozione di autorità. Suggestioni da Alexandre Kojève / The Notion of Authority. Suggestions by Alexandre Kojève. Chiara Ariano. 331-347

work_7gdidgajzzfp5ca4bsjc3fwzui ----

14 August 1981, Volume 213, Number 4509
AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE
Science serves its readers as a forum for the presentation and discussion of important issues related to the advancement of science, including the presentation of minority or conflicting points of view, rather than by publishing only material on which a consensus has been reached. Accordingly, all articles published in Science - including editorials, news and comment, and book reviews - are signed and reflect the individual views of the authors and not official points of view adopted by the AAAS or the institutions with which the authors are affiliated.
Editorial Board
1981: PETER BELL, BRYCE CRAWFORD, JR., E. PETER GEIDUSCHEK, EMIL W. HAURY, SALLY GREGORY KOHLSTEDT, MANCUR OLSON, PETER H. RAVEN, WILLIAM P. SLICHTER, FREDERIC G. WORDEN
1982: WILLIAM ESTES, CLEMENT L. MARKERT, JOHN R. PIERCE, BRYANT W. ROSSITER, VERA C. RUBIN, MAXINE F. SINGER, PAUL E. WAGGONER, ALEXANDER ZUCKER
Publisher: WILLIAM D. CAREY; Associate Publisher: ROBERT V. ORMES
Editor: PHILIP H. ABELSON
Editorial Staff: Assistant Managing Editor: JOHN E. RINGLE; Production Editor: ELLEN E. MURPHY; Business Manager: HANS NUSSBAUM; News Editor: BARBARA J. CULLITON
News and Comment: WILLIAM J. BROAD, LUTHER J. CARTER, CONSTANCE HOLDEN, ELIOT MARSHALL, COLIN NORMAN, R. JEFFREY SMITH, MARJORIE SUN, NICHOLAS WADE, JOHN WALSH
Research News: RICHARD A. KERR, GINA BARI KOLATA, ROGER LEWIN, JEAN L. MARX, THOMAS H. MAUGH II, ARTHUR L. ROBINSON, M. MITCHELL WALDROP
Administrative Assistant, News: SCHERRAINE MACK; Editorial Assistants, News: FANNIE GROOM, CASSANDRA WATTS
Senior Editors: ELEANORE BUTZ, MARY DORFMAN, RUTH KULSTAD; Associate Editors: SYLVIA EBERHART, CAITILIN GORDON, LOIS SCHMITT; Assistant Editors: MARTHA COLLINS, STEPHEN KEPPLE, EDITH MEYERS
Book Reviews: KATHERINE LIVINGSTON, Editor; LINDA HEISERMAN, JANET KEGG; Letters: CHRISTINE GILBERT; Copy Editor: ISABELLA BOULDIN
Production: NANCY HARTNAGEL, JOHN BAKER; ROSE LOWERY; HOLLY BISHOP, ELEANOR WARNER; JEAN ROCKWOOD, LEAH RYAN, SHARON RYAN, ROBIN WHYTE
Covers, Reprints, and Permissions: GRAYCE FINGER, Editor; GERALDINE CRUMP, CORRINE HARRIS; Guide to Scientific Instruments: RICHARD G. SOMMER; Assistants to the Editors: SUSAN ELLIOTT, DIANE HOLLAND
Membership Recruitment: GWENDOLYN HUDDLE; Member and Subscription Records: ANN RAGLAND
EDITORIAL CORRESPONDENCE: 1515 Massachusetts Ave., NW, Washington, D.C. 20005. Area code 202. General Editorial Office, 467-4350; Book Reviews, 467-4367; Guide to Scientific Instruments, 467-4480; News and Comment, 467-4430; Reprints and Permissions, 467-4483; Research News, 467-4321. Cable: Advancesci, Washington. For "Information for Contributors," write to the editorial office or see page xi, Science, 27 March 1981.
BUSINESS CORRESPONDENCE: Area Code 202. Membership and Subscriptions: 467-4417.
Advertising Representatives: Director: EARL J. SCHERAGO; Production Manager: GINA REILLY; Advertising Sales Manager: RICHARD L. CHARLES; Marketing Manager: HERBERT L. BURKLUND
Sales: NEW YORK, N.Y. 10036: Steve Hamburger, 1515 Broadway (212-730-1050); SCOTCH PLAINS, N.J. 07076: C. Richard Callis, 12 Unami Lane (201-889-4873); CHICAGO, ILL. 60611: Jack Ryan, Room 2107, 919 N. Michigan Ave. (312-337-4973); BEVERLY HILLS, CALIF. 90211: Winn Nance, 111 N. La Cienega Blvd. (213-657-2772); DORSET, VT. 05251: Fred W. Dieffenbach, Kent Hill Rd. (802-867-5581).
ADVERTISING CORRESPONDENCE: Tenth floor, 1515 Broadway, New York, N.Y. 10036. Phone: 212-730-1050.
Prospects for Research Libraries
The quality of the science and technology collections in America's university research libraries is deteriorating under the onslaught of stable or diminishing acquisitions budgets coupled with double-digit inflation. Over the past several years, almost all research libraries have been forced to reduce their book purchases and subscription lists to journals and other serial publications. Domestic book prices increased 3.5-fold and journal prices 3.3-fold during the past 10 years, while the median budget for Association of Research Library members increased only 1.7-fold. The effect of modest budget increases is evidenced by the change in the median number of gross volumes added to the member libraries over the same 10-year period: 94,314 in 1969 to 1970 and 67,742 in 1979 to 1980.
Research libraries' traditional goals of local self-sufficiency and development of in-depth collections in all areas of active research can no longer be considered realistic. Instead, collection policy now reflects the needs of today's programs only. Collecting in areas of peripheral research interest is a luxury most libraries can ill afford. The long-term implication of current policy is not attractive. With materials acquired principally in areas of immediate interest, libraries will lack the breadth to accommodate new or changing research directions. Collections will exhibit discontinuities as areas of current interest flourish and those of former interest wither. For the library user it will mean fewer books and journals locally available for browsing - a popular information-gathering habit of many researchers.
Journal titles that are prime candidates for cancellation are less-used or foreign-language titles. With most libraries in similar straits, the same titles may be chosen for cancellation across the country, leading to the virtual disappearance of current subscriptions to certain titles, such as foreign-language specialty journals. Another problem is the inevitable increase in subscription prices as production costs are distributed over fewer subscribers.
Increased interlibrary borrowing is a possible solution to satisfy local needs, but the system as currently conducted has problems. It tends to be costly, and the wait involved means decreased productivity and can cause loss of project momentum. Most large libraries have noted an increase in interlibrary loan traffic. For example, in-state borrowing from Southern Illinois University at Carbondale has about doubled in the past 4 years. There are several reasons for this. Rising journal prices have caused many individual scientists and small academic libraries to pare their subscription lists, and both groups are relying on using or borrowing material from research libraries to satisfy their needs. On-line bibliographic searching has contributed to increased demands for interlibrary loans as computer-based systems identify obscure but pertinent sources of information.
The most practical solution to the library budget crunch is the adoption by libraries of computer technology to assist the development of resource-sharing systems. But development of faster, more efficient delivery systems is essential to their success. Library computer networks are still in the early stages of implementation. Four networks, the Research Libraries Group, the Washington Library Network, the University of Toronto Library Automation System, and OCLC, are in the process of consolidating their positions within the American library community. The first step toward network resource sharing was taken in 1979, when OCLC initiated its interlibrary loan subsystem. To date, the networks have emphasized services such as shared cataloging over resource sharing.
Computerized book holdings lists are commonly available, but the programming to integrate journals and serials into the systems is inadequate. Incompatibilities between computing systems also limit communication and cooperation between networks. All in all, it appears that the future of the research library lies in interlibrary cooperation mediated by computerization of library routines. Thus equipped, we should be better able to match the user and the information with a minimum of wasted time and resources. - GEORGE BLACK, Morris Library, Southern Illinois University, Carbondale 62901
Black, G. Prospects for research libraries. Science, 14 August 1981, 213(4509), 715. DOI: 10.1126/science.7256274. Copyright © 1981 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

work_7hgodaq6cfhiljcdokth32ekye ----

[PDF] Designing for Uncertainty: Three Approaches | Semantic Scholar
DOI: 10.1016/J.ACALIB.2006.12.005. Corpus ID: 51814614.
@article{Bennett2007DesigningFU,
  title={Designing for Uncertainty: Three Approaches},
  author={Scott Bennett},
  journal={The Journal of Academic Librarianship},
  year={2007},
  volume={33},
  pages={165-179}
}
Scott Bennett. Published 2007 in The Journal of Academic Librarianship.
Abstract: Higher education wishes to get long life and good returns on its investment in learning spaces. Doing this has become difficult because rapid changes in information technology have created fundamental uncertainties about the future in which capital investments must deliver value. Three approaches to designing for this uncertainty are described using data from recent surveys. Many of these data are related to the National Survey of Student Engagement; for another essay that employs NSSE…
[Figures from the paper: figures 1, 2, 4, 5, 6, 8, 9, 10, and 13 (9 figures and tables in all).]
40 Citations (the first ten citing papers, as shown on the page):
First Questions for Designing Higher Education Learning Spaces. Scott Bennett (2007)
Organized Spontaneity: The Learning Commons. M. Stark, S. Samson (2010)
Models of learning space: integrating research on space, place and learning in higher education. R. Ellis, P. Goodyear (2016)
Campus Cultures Fostering Information Literacy. Scott Bennett (2007)
Space Assessment as a Venue for Defining the Academic Library. D. Nitecki (The Library Quarterly, 2011)
A Study Exploring Learners' Informal Learning Space Behaviors, Attitudes, and Preferences. D. Harrop, B. Turpin (2013)
Before There Was a Place Called Library - Library Space as an Invisible Factor Affecting Students' Learning. Pei-chun Lin, Kuan-nien Chen, Sung-Shan Chang (2010)
A Position of Strength: The Value of Evidence and Change Management in Master Plan Development. Rachel Sarjeant-Jenkins (2018)
Space and consequences: The influence of learning spaces on student development and communication. Caroline S. Parsons (2015)
Learning Spaces as Social Capital. Paul Temple (2011)
(first 10 of 40 citations shown on the page)
References (showing 1-10 of 21):
First Questions for Designing Higher Education Learning Spaces. Scott Bennett (2007)
The Role of the Academic Library in Promoting Student Engagement in Learning. George D. Kuh, Robert M. Gonyea (2003)
The future of the learning space: Breaking out of the box. P. D. Long, S. Ehrmann (2005)
Collaborative Learning: Higher Education, Interdependence, and the Authority of Knowledge. Kenneth A. Bruffee (1993)
From Teaching to Learning - A New Paradigm For Undergraduate Education. R. Barr, J. Tagg (1995)
If you build it they will come: spaces, values and services in the digital era. M. Afifi, Deborah A. Holmes-Wong, Shahla Behavar, X. Liu (1997)
"Facework": a new role for the next generation of library-based information technology centers. C. Hughes, D. Morris (1999)
Conceptualizing an information commons. Donald Beagle (1999)
JSTOR: A History. Roger C. Schonfeld (2003)
Memory palace, place of refuge, Coney Island of the mind: the evolving roles of the library in the late 20th century. C. Hartman (2000)
work_7opwamy6nvfz3d72xhsoz2hqva ----

Metadata2020 for LIBER (Rachael)

Slide 1 (LIBER 2017, #Metadata2020):
• Advocacy campaign for richer metadata
• A cross-community collaboration
• Vision to create common understanding
• Express why metadata is so important
• Shared messaging & educational resources

Slide 2: "Most people wouldn't think: 'Well, if we can fix this metadata we can find a cure for a terrible illness.' If we can find a way to connect those dots, that would be huge. Nobody is asking: 'What is the cost to society?'" @Metadata2020

Slide 3: The problem
• Authors want increased visibility
• Researchers need easier reproducibility
• Funders and institutions are looking for better performance data; and
• Publishers and service-providers need to demonstrate value with increased usage
• We share some metadata, but there is more to be done
• None of this is possible when there are gaps in metadata. And everyone suffers as a consequence.

Slide 4: It's not just us!

Slide 5: It's not just us! (con't)
Key findings include:
Metadata is a top priority: Metadata ranked as the highest priority for publishers across all verticals (4.6 out of 5), but also represented the largest gap in current organizational ability (2 out of 5). Determined to overcome key challenges and make strategic investments to accelerate their progress, 90 percent of all publishers are planning to invest in metadata over the next three years.
Discoverability is a close second: Publishers ranked discoverability as the second most important transformation element (4.5 out of 5) and felt that current abilities were the highest in this category (2.5 out of 5). Roughly 30 percent of publishers reported recent efforts in platform, widget and partner services, with an additional 30 percent actively reviewing new tools to help end users discover content.
http://www.copyright.com/new-report-digital-transformation-publishing-reveals-sluggish-progress-25-percent-publishers-see-ahead-industry-peers/

Slide 6: Starting Point
We want to facilitate the collaboration of all involved in scholarly communications to consistently improve metadata to enhance discoverability, encourage new services, and create efficiencies, with the ultimate goal of accelerating scholarly research.

Slide 7: Steering group
Cameron Neylon, Curtin Univ / FORCE11; Caroline Sutton, Co-Action/Informa; Dario Taraborelli, Wikimedia; Ed Pentz, Crossref; Eva Mendez, UC3M / OSPP / DCMI; Genevieve Early, Taylor & Francis; Ginny Hendricks, Crossref; John Chodacki, California Digital Library; Juan Pablo Alperin, PKP; Kristen Ratan, Coko Foundation; Laure Haak / Alice Meadows, ORCID; Mark Patterson, eLife; Mike Taylor, Digital Science; Natalia Manola, OpenAIRE; Patricia Cruse / Laura Rueda, DataCite; Roy Tennant, OCLC; Scott Plutchak, Univ Alabama; Stefanie Haustein, Univ Montreal; Steve Byford, JISC
• Facilitate communication between the stakeholders to encourage collaboration. • Equip all stakeholders with tools and information. Goals #Metadata2020 9 #Metadata2020 Aren’t we all metadata librarians? 10 Librarians make for natural metadata facilitators: • Scholcomms librarians working with researchers • Collaborations between (actual) metadata librarians & publishers’ production offices 11 Librarians make effective metadata ambassadors: • Catalogs & discovery systems=drivers of usage • Don’t let ‘em forget it! • Sharing expertise, e.g. at community meetings 12 • How can we help the researcher to understand the needs for better metadata and make supplying it easier? • How can we encourage collaboration to share better metadata? • How do we make better use of what we have? Can we strike a balance between consistency and flexibility? • What lessons can we learn from other industries? • Are you willing to sacrifice completeness for detail? Insights/questions Q Do you have stories to share? https://metadata2020.org https://metadata2020.org • Contribute your stories and perspectives • Give us your attention - volunteer to advocate • Follow us on twitter @Metadata2020 • Email info@metadata2020.org to stay in touch Thank you mailto:info@metadata2020.org work_7r56ecgz2zeazbuj3zxgue6pcq ---- 02 J4_200800098_R.hwp ISO 14721 OAIS 참조모형을 활용한 웹 아카이빙의 메타데이터 구조 요소 정의 651 ISO 14721 OAIS 참조모형을 활용한 웹 아카이빙의 메타데이터 구조 요소 정의 오 상 훈 † ․최 선 †† 요 약 본 연구에서는 웹 아카이빙에서 가치 있는 웹 자원의 수집, 리 보존을 해 요구되는 메타데이터의 구조를 설계하고 요소를 정의하 다. 본 연구를 해 국립 앙도서 ‘OASIS’등의 웹 아카이빙에서 수집 자원의 장기 보존을 해 활용되는 메타데이터를 조사하고, 웹 아카이 빙의 각 로세스 단계별 요구사항 웹 자원의 특성을 분석하 으며, 특히 장기 보존을 한 아카이빙의 개념 틀을 제공하는 ISO 14721 OAIS 참조모형을 기반으로 제안하 다. 한 웹 아카이빙 간의 자원 공유를 한 메타데이터의 상호운용성을 고려하 다. 그 결과 본 연구에 서는 웹 아카이빙에서 자원의 체계 이고 효율 인 수집, 리, 운 보존을 한 설명 , 구조 , 리 그리고 보존 유형의 4개 메타데 이터 구조를 설계하고 28개의 필수 메타데이터 요소를 정의하 다. 키워드 : 디지털 자원, 온라인 디지털 자원, 웹아카이빙, 메타데이터, 보존 메타데이터, OAIS 참조모형 A study on Designing Metadata Structure and Element on Web Archiving based on the ISO 14721 OAIS Reference Model Oh Sang Hoon † ․Choi Young Sun †† ABSTRACT This study is to develope the structures and the elements of the metadata for harvesting, management and preservation of a valuable web resources in the web archiving. For this study, we investigated the available metadata in the web archiving and surveyed the requirements of web archiving process. And we analyzed the characteristics of web resources. Also, this study was used a based on the ISO 14721 OAIS Reference Model. Finally, to share the metadata elements among the web archiving system, this study considered the interoperability for the exchange of the metadata. Based on the result, this study designed four structures of the metadata and defined the 28 core metadata elements for the web archiving. Keywords : Digital Resources, Online Digital Resources, Web Arhiving, Metadata, Preservation Metadata, OAIS Reference Model 1. 서 론 1) 1.1 연구배경 오늘날 정보통신기술의 속한 발달은 통 인 지식정보 자원의 생산․수집․ 리․축 방식에 변화를 가져왔다. 가치 있는 지 자원들이 다양한 형태의 디지털 자원으로 생산되 고 있으며, 기존 자원들도 디지털로 재생산 되고 있다. 그러 ※ 본 논문은 2007년도 국립 앙도서 「OASIS 표 화연구」의 지원으로 수행됨. †정 회 원:(사)한국디지털콘텐츠산업 회 사무국장 ††정 회 원:(사)한국디지털콘텐츠산업 회 연구원 논문 수:2008년 10월 16일 수 정 일:1차 2009년 6월 5일, 2차 2009년 7월 29일 심사완료:2009년 8월 10일 나 디지털 자원들은 태생 으로 물리 인 형태가 없으며, 가변 인 성질을 갖고 있어 빠르게 소멸될 수도 있다는 문 제 이 있다. 이에 2003년 10월 유네스코에서는 이러한 디 지털 자원의 수집과 보존을 해 인터넷 지식자원의 보존 이용에 한 내용을“디지털 유산 보존 헌장”에서 천명하 으며[1], 세계 각국에서는 웹 아카이빙을 운 하며 온라인 자원을 수집하고, 보존하기 한 노력을 하고 있다. 
the National Library of Korea has been pursuing the OASIS (Online Archiving & Searching Internet Sources; hereafter OASIS) project since 2001 in order to collect and preserve valuable Internet materials (digital resources) at the national level [2].
Web archiving aims to collect and manage effectively the authentic resources that so easily disappear from the Internet, to preserve them over the long term, and to deliver them to future generations. Metadata for electronic and networked information has therefore become necessary in order to describe digital resources, manage them systematically, and provide users with efficient retrieval. Web resources in particular are expressed through diverse means and in diverse kinds, so managing them systematically and efficiently requires management elements specialized for each medium and type. In addition, a web archive must not only collect and manage resources but also present an overall metadata structure and set of elements covering the functions of each preservation stage, so that today's valuable resources can be delivered to future users.
This paper therefore develops a metadata structure and elements that can satisfy the stage-by-stage and function-by-function requirements for managing and preserving the main resources collected by the OASIS web archive: web sites, web documents, and individual web resources (files).

1.2 Research Purpose and Method
This study sets out a systematic management scheme, for the agents responsible for preserving internal and external digital resources in future web archiving, covering the systematic collection, management, storage, and service of the target resources. To this end we analyzed the current state of the OASIS project and surveyed and analyzed representative overseas web-archiving cases. Based on the results, and in order to develop metadata applicable to the OASIS web archive, we applied structures and elements that follow the stage-by-stage and function-by-function processes of web archiving presented in the ISO 14721 OAIS reference model.

2. Case Studies of Web-Archiving Metadata
2.1 Metadata Overview
Metadata is data about data: data that is not the actual content itself but holds various kinds of information about it. It generally has two functions. The first is to support information retrieval: it provides refined information that helps identify and use a resource, and it adds value to the resource, making it useful for retrieval. The second is to describe resources, categorize them, and record their history in order to manage them systematically and support their effective use [3].
Metadata for web archiving additionally requires a function for supporting preservation information. Preservation information must include all information about the whole life cycle of a resource: not only information about the data object itself but also information on the software and hardware required to manage the resource and on the administrators involved in managing it. That is, all information related to preservation activities must be expressed concretely. Metadata playing this role is called preservation metadata [4].
In web archiving, preservation metadata is defined as "the knowledge and information needed to maintain viability (the capacity of electronic records to remain usable even as environments change), renderability (the capacity to be processed and displayed according to the needs of users or administrators), and understandability (the capacity to identify the information recorded by administrators)" over the long term [5].

2.2 Case Analysis of Web-Archiving Metadata
2.2.1 Dublin Core [6]
Dublin Core metadata was agreed at a workshop held in Dublin in 1995 by OCLC and the NCSA (National Center for Supercomputing Applications), with the aim of maintaining data compatibility and prescribing the set of data elements needed to describe networked resources so that those resources can be retrieved quickly. The Dublin Core Metadata Element Set, Version 1.1 was published and was adopted as ISO Standard 15836:2003 (February 2003) and as ANSI/NISO Standard Z39.85-2007 (May 2007). Dublin Core is the international standard metadata used widely in developing metadata standards; it is composed of universal, simple elements and is used by experts and non-experts alike. On the principles of intrinsicness, extensibility, syntax independence, optionality, repeatability, and modifiability, it proposes 15 metadata elements. In particular, it became the baseline for metadata development when digital libraries and early web archives collected and managed resources.

2.2.2 OASIS [2]
OASIS is Korea's representative web archive. Beginning with the "Online works collection and preservation system ISP and pilot system development" in November 2001, it completed a project to expand and improve the web-based OASIS system in 2005 and has offered a public service through the OASIS homepage since 2006. To describe collected resources, OASIS uses the Dublin Core metadata elements, the international standard description format for digital information resources, together with additional elements selected for managing OASIS collections. The metadata is designed so that it can be extended, changed, and revised according to changes in OASIS project policy, systems, and workflows. The OASIS metadata consists of a total of 32 metadata elements (including sub-elements), built on the 15 Dublin Core elements with internal management elements added.

2.2.3 OCLC Digital Archive Metadata [7]
OCLC (Online Computer Library Center), which has pursued continuing research on digital archives, defined metadata elements for a digital archive in its digital archive system guide. These include elements based on Dublin Core together with elements for the management and technical description of resources; the management elements needed to operate a web archive are expressed in detail, and even the aspects of serving users are considered. After 2002, in April 2003 the elements covering resource creation and ingest were deleted, and events covering the resource life cycle and resource relationship information were added; in May 2004 elements for archive operation were incorporated, resulting in a proposal of 34 metadata elements.
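To make the pattern just described concrete, the following is a minimal sketch, not taken from the OASIS or OCLC specifications, of a harvested-resource record that combines the 15 Dublin Core elements with a few internal management fields of the kind these projects add. The field names under `admin`, and the example values, are illustrative assumptions; only the Dublin Core element names are real.

```python
# Minimal sketch: a web-archive record built on the 15 Dublin Core elements
# plus hypothetical internal management fields (names are illustrative).

DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

def new_record(**dc_values) -> dict:
    """Create a record with every Dublin Core element present (None if
    unknown), rejecting keys outside the element set."""
    unknown = set(dc_values) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    record = {element: dc_values.get(element) for element in DC_ELEMENTS}
    # Hypothetical management extension of the kind OASIS/OCLC add.
    record["admin"] = {
        "harvest_date": None,   # when the crawler captured the resource
        "harvest_url": None,    # seed URL actually fetched
        "checksum": None,       # fixity value computed at ingest
        "storage_path": None,   # location in the archival store
    }
    return record

site = new_record(
    title="Example ministry home page",
    identifier="http://www.example.go.kr/",
    type="web site",
    format="text/html",
    language="ko",
)
site["admin"]["harvest_date"] = "2009-08-10"
```

The design choice the cases above share is visible here: the descriptive core stays interoperable across archives because it is pure Dublin Core, while each archive's operational needs live in a separate, extensible management block.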
WARP metadata consists of descriptive, structural, and administrative metadata; preservation metadata elements are not included.

2.2.5 NLA (National Library of Australia) [9]

The National Library of Australia developed 15 basic elements for web archiving, but when its archive management system failed to yield metadata elements that supported the preservation of diverse digital information resources, it went on to develop preservation metadata of its own.

2.2.6 EVA Project [10]

The EVA project, conducted by librarians of Helsinki University Library, Finland's national library, together with publishers and expert groups, forms part of the Finnish Ministry of Education's Information Society Strategy Program. EVA experiments with methods for capturing, registering, preserving, and providing access to online documents published officially or freely on the Finnish Internet. To describe collected resources, EVA used the 15 Nordic Dublin Core elements, based on Dublin Core.

2.3 Standardization cases for web archiving

2.3.1 The ISO 14721 OAIS Reference Model

Established as ISO 14721 in 2002, this international standard provides the conceptual and structural framework for an archive: a digital-object preservation system intended to preserve digital information over long periods and provide continuing access to it. It underlies virtually all digital preservation institutions and projects now in progress. OAIS defines six functions for a digital archive: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access. The model is shown in (Figure 1) [11].

(Figure 1) OAIS Reference Model

2.3.2 NEDLIB preservation metadata

The NEDLIB (Networked European Deposit Library) project, which ran from 1998 to 2001, pursued the long-term preservation of electronic publications and brought together the national libraries of seven European countries (the Netherlands, Finland, France, Italy, Norway, Portugal, and Switzerland), three online publishers (Elsevier, Kluwer, and Springer Verlag), and the National Archives of the Netherlands. NEDLIB's preservation metadata was likewise developed on the basis of the OAIS Reference Model, and its results in turn influenced the completion of that model. Unlike the CEDARS, NLA, and OCLC/RLG preservation metadata, which were developed with the twin aims of long-term preservation and access, NEDLIB developed its preservation metadata with a strict focus on preservation alone [12].

2.3.3 OCLC/RLG preservation metadata

In June 2002, OCLC/RLG developed a metadata framework for preserving digital objects. Based on the OAIS Reference Model, the framework presents a comprehensive, high-level description of the information types that make up preservation metadata. It also assembled a "prototype" of preservation metadata elements, forming the agreed basis for developing a formal preservation metadata specification [13].

2.3.4 PREMIS Data Dictionary [14]

In June 2003, OCLC/RLG formed the PREMIS working group of thirty experts from university libraries, national libraries, museums and archives, government agencies, and industry in six countries to work on an implementation strategy for preservation metadata. The group's purposes were, first, to identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata and, second, to define guidelines and recommendations for managing and using an implementable core of preservation metadata. The outcome was the PREMIS Data Dictionary, which organizes the PREMIS data model and its semantic units. The data model describes entities, entity properties, and the relationships among entities. Its five entities (Intellectual Entities, Objects, Events, Rights, and Agents) are the objects of digital preservation described by preservation metadata, and entity properties are expressed as semantic units. To ease the logical organization of PREMIS metadata elements, a simple model consisting of these five entity types was developed, as shown in (Figure 2).

(Figure 2) PREMIS data model

2.4 Issues from the metadata case analysis

The representative overseas web archiving metadata cases above all collect resources with web crawlers or robots, and all have developed their metadata elements on the basis of Dublin Core for managing and retrieving automatically collected metadata and electronic resources. (Table 1) compares the structure and elements of each case, dividing the metadata by area and element into content, structural, rights, administrative, and technical information. WARP subdivides title and identifier information in detail, while OCLC subdivides the information for managing resources within a web archiving system and adds the technical information essential for resource management.

(Table 1) Comparison of web archiving metadata

Where early web archiving metadata concentrated on collecting the diverse and vast body of existing resources, recent development adds to the collection and management elements those needed for the long-term preservation of digital resources. The ISO 14721 OAIS Reference Model, the fruit of overseas preservation metadata projects, offers an ideal model starting from the conceptual framework an archive should have, but it diverges from real conditions in actual use: because it supplies only a top-level, all-encompassing structure, or simple, minimal mandatory elements, web archives that must manage detailed, concrete resource information continually require new metadata suited to practical use. OASIS, for its part, does not reflect stage-by-stage, function-by-function web archiving requirements and is limited in how it describes and expresses multimedia resources; and although it is a web archive meant for long-term preservation, it concentrates first on collecting resources rather than preserving them, so elements supporting digital preservation functions are in demand. The issue this raises is that web archives such as OASIS should develop and use metadata elements oriented toward the long-term preservation of websites and web resources and toward management aligned with the necessary work processes.

3. Developing a Metadata Structure and Elements for Web Archiving

3.1 Definition and characteristics of web archiving resources

3.1.1 Definition of web archiving resources

Digital resources are resources produced on a computer (born digital) or reproduced digitally (digitized), and include resources accessible via the Internet (online) and resources stored on digital media (offline).
OASIS, which collects resources by web crawler, defines them as follows in its Guidelines for Selecting Online Digital Resources [2]:

• Digital versions of publications whose electronic content can be downloaded over the Internet and viewed on information and communication terminals such as personal and portable computers
• Computer files containing electronically distributed text, sound, moving images, and the like, downloadable over the Internet and readable on terminals such as personal computers, notebook computers, and PDAs
• Websites, web pages, document files (pdf, hwp, doc, txt, etc.), images, video, music, and compressed files

3.1.2 Characteristics of web archiving resources

This paper describes the metadata requirements for web archiving through three perspectives on the characteristics of web resources.

First, content: digital resources, above all born-digital ones, can be created and put into circulation online easily and freely by anyone, so formal verification is often skipped, which limits their reliability and authority.

Second, management: the same digital resource can be produced in multiple formats, and a single digital resource may be composed of several format types. Digital resources are also easier to store, download, copy, and move than printed materials. And whereas printed resources seldom change physical form or vanish, digital resources can be lost or damaged through momentary malfunctions or errors in the devices and applications that handle them, or through version upgrades, so care is required in managing them [15].

Third, use: digital resources can be reached anywhere and at any time there is an online connection. Using them, however, requires the application program for the medium carrying the information and the hardware to run it. And because copying and moving them over networks is easy, illegal copying and distribution that infringe the rights of copyright holders occur constantly.

3.2 Developing web archiving metadata

To develop metadata for the web archiving of digital resources, this paper analyzed the metadata cases related to web archiving: the continuously studied Dublin Core, the OCLC/RLG, NEDLIB, and NLA preservation metadata, OCLC's web archiving metadata, and the recently completed PREMIS Data Dictionary (1.1), which may be called the culmination of preservation metadata. From these it derived the development goals and principles below.

3.2.1 Metadata development goals

First, structure and detail the metadata elements by medium and type of digital resource so they can be applied to diverse digital resources. Second, organize linkage information that reflects the characteristics of web resources. Third, develop preservation elements that can support preservation strategies and plans in web archiving. Fourth, present a metadata structure that secures interoperability for future metadata exchange and resource sharing among archiving institutions.

3.2.2 Metadata development principles and verification

First, comply with the ISO 14721 OAIS Reference Model. The model supplies the highest-level conceptual framework for every process that occurs in web archiving, and web archiving projects and institutions worldwide, whether operating or in preparation, strive to apply and comply with its criteria [3]. We therefore develop the metadata structure and elements from the information required at each stage of the archiving procedure in the model.

Second, apply standard or authoritative metadata. Such metadata has been developed through long discussion and investigation led by experts and leading projects in the United States, Europe, and elsewhere, and many projects and institutions preparing, planning, or operating web archives derive their own elements by referring to the structures and elements these provide [16]. This paper accordingly secures structures and elements that take account of the diverse requirements and characteristics of the metadata used in international archiving projects and national archives.

3.2.3 Designing the web archiving metadata structure

Identifying, managing, and preserving resources that exist in a web archive as bitstreams and files requires metadata recording many kinds of information about them; depending on the information held, it can be classed as descriptive, structural, or administrative metadata [17]. This paper presents a metadata structure that reflects the information required at each functional stage of archiving in the OAIS Reference Model: pre-ingest/ingest, storage, data management, preservation, access/service, and archive operation.

First, pre-ingest/ingest. In this process resources are selected and acquired for the archive, and above all information about the digital object itself and information for confirming its authenticity and integrity are provided. Resource information includes bibliographic description such as title, subject, and language. Verifying authenticity and integrity calls for rights information such as creators and rights relationships, technical information covering the collected resource's files, unique identifiers (URL, ISBN, etc.), and information about the resource's structure and composition.

Second, storage. Storage adds the collected resource to the repository (servers) for long-term preservation; it is the stage at which data is actually committed to the archive. It requires a final check of the resource (error checking), the location of the storage space, and management information about the repository.

Third, data management. This supplies information about the stage in which an ingested resource is being managed in the archive, that is, the data's current state (for example, format changes or storage-medium changes).

Fourth, preservation. Preservation presupposes an archive-level preservation strategy, and it must gather far more than the data itself: detailed information on where the resource came from, when and how its authenticity was confirmed, whether it connects to other archived resources, and what processes it passed through from creation to preservation. At this stage, information on preservation strategies and plans must be kept, and environmental information must be maintained continuously as technology changes.

Fifth, access/service. Access may be provided by an administrator, or requested by a user, as when the archive supplies an OAIS DIP (Dissemination Information Package).
In the former case it suffices, as with ordinary retrieval, to select a few elements such as the resource's unique identifier or management information; in the latter, the archive announces to users what information it can provide.

Sixth, archive operation. Operation covers information handled across the whole archive: information for managing digital resources systematically; information tied to archive policy (for example, whether a resource may be served, internal policy making, and policy support for users and providers); information about the staff who manage archived resources; and system environment settings such as hardware and software.

To structure the metadata, the information required at each of the six functional stages is summarized in (Table 2). Some metadata appears only at a particular stage (for example, administrator information in operation, or history information in data management), while some is required in common across the archive (for example, technical information at the pre-ingest, ingest, and storage stages).

(Table 2) Required information by functional stage

| Functional stage  | Required information                                                       |
| Pre-ingest/Ingest | Descriptive, structural, technical, administrative, and rights information |
| Storage           | Technical and authentication information                                   |
| Data management   | Administrative and history information                                     |
| Preservation      | Provenance, authentication, fixity, and reference information              |
| Access            | Identification and administrative information                              |
| Operation         | Administrator, archive-specific, and policy information                    |

(Figure 3) structures the metadata according to the content expressed by the principal information of each functional stage in (Table 2). Preservation information, usually classed under administrative metadata, is structured and detailed in web archiving to include provenance, authentication, fixity, and reference information. This paper therefore divides the metadata, by the information each archiving stage requires, into descriptive metadata (descriptive and identification information), structural metadata (structural information), administrative metadata (management, administrator, technical, policy, and rights information), and preservation metadata (reference, authentication, context, and provenance information).

(Figure 3) Metadata types by archiving stage

Preservation metadata is separated from administrative metadata because, in a web archive such as the present OASIS, where preservation policies and plans have not yet been established, preservation-related information (context, reference, provenance, and authentication information) needs to be managed separately. Building on the current OASIS metadata and the case analysis above, we propose a structure of four sections holding nine kinds of metadata information, shown in (Figure 4), and (Table 3) sets out the content of those nine kinds of information with examples.

(Figure 4) OASIS metadata structure

(Table 3) Metadata content and examples

| Section                 | Information    | Content                                                                                        | Examples                                                               |
| Descriptive metadata    | Descriptive    | Basic, largely bibliographic information that uniquely identifies the resource                 | Title, subject, language                                               |
| Structural metadata     | Structural     | Information on the resource's structure and composition by means of expression                 | Document, image, text, video / website, serial, electronic book        |
| Administrative metadata | Management     | Information required to manage ingested resources and operate the archive                      | Archive registration number, group number, dates, administrator, events |
|                         | Rights         | Information on the persons, groups, and institutions holding intellectual property rights      | Creator, publisher, IP rights information, COI information             |
|                         | Technical      | Information on the technical environment (OS, software, hardware) needed to use the resource   | Operating system (e.g., Windows XP), file size, file format, software  |
| Preservation metadata   | Provenance     | Information on the source of the resource                                                      | Bibliographic information, metadata information                        |
|                         | Reference      | History of the resource's content information                                                  | Pre-ingest/ingest information, origin information                      |
|                         | Context        | Why the content was produced and how it relates to other content                               | Resource relationship information                                      |
|                         | Authentication | Information confirming and guaranteeing that the resource is authentic                         | Authenticity verification                                              |

3.2.4 Defining the web archiving metadata elements

On the basis of the OASIS metadata structure and development principles proposed above, this paper extracted a total of 29 metadata elements and, through structuring and detailing, selected sub-elements for them. (Tables 4) through (7) present, for the metadata elements of each section, the Category, Element, Sub-element, Origin, Definition, and Cardinality.
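Before the element tables, the sketch below shows one way a single harvested resource might be serialized under the proposed four sections. The section and element names follow (Figure 4) and (Tables 4) through (7), but the XML encoding itself, the attribute layout, and all values are assumptions of this illustration only; the paper defines element semantics, not a serialization.

    <webArchiveMetadata>
      <descriptive>
        <title>Annual Policy Report</title>
        <subject collection="Government" classificationNumber="350"/>
        <language>ko</language>
      </descriptive>
      <structural>
        <objectType>Text</objectType>
        <objectGenre>Web Resource (Individual)</objectGenre>
        <objectIdentifier type="Original URL">http://example.org/report.pdf</objectIdentifier>
      </structural>
      <administrative>
        <creator><name>Example Agency</name></creator>
        <date event="Harvested">2009-07-29</date>
        <file><name>report.pdf</name><size>524288</size></file>
        <objectFormat name="PDF" value="1.4"/>
      </administrative>
      <preservation>
        <resourceDescription>Harvested from the agency website</resourceDescription>
        <relationships>Part of the 2009 policy report collection</relationships>
        <objectAuthentication type="checksum" date="2009-07-29" result="pass"/>
      </preservation>
    </webArchiveMetadata>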
(Table 4) Descriptive metadata elements

| Category    | Element     | Sub-elements                      | Origin | Definition                                                                     | Cardinality           |
| Descriptive | Title       | -                                 | DC     | The name representing the digital resource                                     | Mandatory, repeatable |
| information | Subject     | Collection; Classification number | DC     | The subject of the resource, expressed by archive collection and class number  | Mandatory             |
|             | Description | -                                 | DC     | Summary or supplementary information describing the resource                   | Optional, repeatable  |
|             | Source      | -                                 | DC     | The original source of the resource                                            | Optional, repeatable  |
|             | Language    | -                                 | DC     | The language in which the resource is written                                  | Optional, repeatable  |
|             | Coverage    | -                                 | DC     | The period and place to which the resource applies                             | Optional, repeatable  |

(Table 5) Structural metadata elements

| Category    | Element           | Sub-elements                                                                                                                                                                                                                                                                                                     | Origin | Definition                                                                        | Cardinality           |
| Structural  | Object Type       | Text (format & version, compression); Image (format & version, resolution, dimensions, color, orientation, compression); Video (format & version, frame dimensions, duration, frame rate, compression, encoding structure, sound); Audio (format & version, resolution, duration, bitrate, compression, track & type); Multimedia | NLA    | The type of means expressing the resource: text, image, video, audio, multimedia | Mandatory, repeatable |
| information | Object Genre      | Web Site; Web Resource (Individual, Group)                                                                                                                                                                                                                                                                       | -      | Information on websites and web resources by resource type                       | Mandatory, repeatable |
|             | Object Identifier | Original URL; Source URL; Harvest URL                                                                                                                                                                                                                                                                            | OCLC   | The resource's unique identifier                                                  | Mandatory, repeatable |

(Table 6) Administrative metadata elements

| Category    | Element                            | Sub-elements                                                                                | Origin      | Definition                                                                                           | Cardinality                                                |
| Rights      | Creator                            | Name; Contact (telephone number, address, e-mail)                                           | DC, OASIS   | Name and contact details of the person or body that created the resource                             | Name: mandatory, repeatable; Contact: optional, repeatable |
| information | Publisher                          | Name; Locator                                                                               | DC          | The person or body that published the resource, and the place of publication                         | Name: mandatory; Locator: optional                         |
|             | Right                              | Agent (personal, agencies); Right Period (start date, end date); Copyright Agreement Number | DC, OASIS   | The person or body holding rights in the resource, the rights period, and the copyright agreement    | Mandatory, repeatable                                      |
|             | COI                                | COI identifier number                                                                       | OASIS       | The COI identifier linked to the resource                                                            | Mandatory                                                  |
| Management  | Date                               | Created; Issued; Harvested; Ingested; Modified                                              | DC          | Dates of events in the resource's life cycle                                                         | Mandatory (Modified: repeatable)                           |
| information | Digital Archiving Save File Number | Object Number; Group Number                                                                 | OCLC        | Accession numbers: the object's ingest number and the number of the resource containing it           | Mandatory                                                  |
|             | Service Level                      | -                                                                                           | OCLC        | Whether the resource is served internally or externally                                              | Mandatory                                                  |
|             | Management Person                  | Registration person; Modification person                                                    | OASIS       | Staff involved in managing and cataloging the resource within the archive                            | Mandatory, repeatable                                      |
|             | Event                              | Harvest; Ingest; Archive; Modification                                                      | OCLC        | The resource's status at each archiving stage                                                        | Mandatory, repeatable                                      |
| Technical   | File                               | Name; Size                                                                                  | OASIS, OCLC | Information about the file in which the resource is stored                                           | Mandatory                                                  |
| information | Object Format                      | Name; Value                                                                                 | NEDLIB      | File format information                                                                              | Mandatory                                                  |
|             | Operating System                   | Name; Value                                                                                 | NEDLIB      | The operating system required or recommended for using the resource                                  | Mandatory, repeatable                                      |
|             | Application                        | Name; Value                                                                                 | NEDLIB      | The application program required to use the resource                                                 | Mandatory, repeatable                                      |
(Table 7) Preservation metadata elements

| Category       | Element                                        | Sub-elements                                                                      | Origin   | Definition                                                                          | Cardinality           |
| Provenance     | Resource Description                           | -                                                                                 | CEDARS   | Description of the resource's provenance                                            | Mandatory, repeatable |
| Reference      | Origin; Pre-ingest; Ingest; Archival Retention | Designation; Procedure; Date; Responsible agency; Outcome; Note; Next occurrence  | OCLC/RLG | Information on the resource's creation, pre-ingest, harvest, and archival retention | Mandatory             |
| Context        | Relationships                                  | -                                                                                 | OCLC/RLG | Information relating the resource to other resources                                | Mandatory, repeatable |
| Authentication | Object Authentication                          | Type; Procedure; Date; Result                                                     | OCLC/RLG | Information on verifying the resource's authenticity                                | Mandatory, repeatable |

4. Conclusion and Suggestions

This paper developed the structure and working elements of Korean web archiving metadata by applying the ISO 14721 OAIS Reference Model. First, information about resources in each medium by which digital resources for web archiving are expressed (documents, images, moving images, and sound) was structured and detailed as structural metadata elements describing the resource's structural information. Second, reflecting the characteristics of web resources, the linkage information for websites and web resources was made concrete. Third, the core preservation metadata elements needed to support long-term preservation of digital resources were added. In line with the development principles and criteria, the result is a proposed metadata structure of four sections (descriptive, structural, administrative, and preservation) holding nine categories of information, together with 29 metadata elements providing unit information under each section and the mandatory sub-elements of each element.

In future work, preservation metadata development will need preservation strategies and plans prepared according to the policy directions of each web archive, including the National Library of Korea's OASIS. The present preservation metadata elements extract only the basic elements required to support long-term preservation; should preservation information be strengthened, the preservation metadata section could be recast as preservation metadata in the broad sense, taking in parts of the structural, technical, and administrative information. Research is therefore needed on concrete metadata elements based on stage-by-stage preservation strategies and plans for digital resources.

References

[1] Suh Hye-ran, "A Plan for a Digital Legal Deposit System," Forum for the Preservation of Digital Heritage, 2004.
[2] OASIS homepage: http://www.oasis.go.kr
[3] National Library of Korea, "OASIS Standardization Study," 2003.
[4] Deborah Woodyard, "Preservation Metadata," OCLC/SCURL New Directions in Metadata, Edinburgh, 15-16 August 2002.
[5] Kim Hee-jung, "A Study on the Application of the OAIS Reference Model for Electronic Archiving," doctoral dissertation, Department of Library and Information Science, Yonsei University, 2003.
[6] DCMI: http://dublincore.org/
[7] OCLC, "Digital Archiving Metadata Elements," Dublin, Ohio, 2004.
[8] WARP: http://warp.ndl.go.jp
[9] NLA: http://www.nla.gov.au/
[10] Kristi Lounamaa, "EVA: The Acquisition and Archiving of Electronic Network Publications in Finland," Tietolinja News, 1999.
[11] Lee So-yeon, "Standardization of Digital Archiving and the OAIS Reference Model," Journal of Information Management 33(3), pp. 45-68, 2002.
[12] NEDLIB: http://nedlib.kb.nl/
[13] OCLC/RLG, A Metadata Framework to Support the Preservation of Digital Objects, 2002.
[14] PREMIS Working Group, "Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group," 2005.
[15] Kim Tae-soo, Understanding Cataloging, Seoul: Korean Library Association, 2000.
[16] OCLC/RLG, "Implementing Preservation Repositories for Digital Materials: Current Practice and Emerging Trends in the Cultural Heritage Community," 2004.
[17] NISO, "Understanding Metadata," 2004.

Oh Sang Hoon (oshosh24@gmail.com) received a BS in Computer Science from Hankuk University of Foreign Studies in 1988, an MS in Applied Computing from its Graduate School of Management Information in 1990, and a PhD in information retrieval and natural language processing from the School of Information and Communication Engineering, Chungnam National University, in 2006. He was a researcher at the Korea Institute of Science and Technology Information from 1994 to 2000 and a director at the Korea Database Promotion Center from 2000 to 2001, and has been Secretary General of the Korea Digital Contents Industry Association since 2001. His interests include digital archiving, digital content distribution and protection, metadata, and information retrieval.

Choi Young Sun (ming279@gmail.com) received a BA from the Division of Political Science and Law, Sookmyung Women's University, in 2003 and has been a researcher at the Korea Digital Contents Industry Association since 2007. Her interests include digital archiving and metadata.
work_7rnuvezwujcapafonrchttlppi ---- TSQ 23(4) Prepublication.vp

Cleanup of NetLibrary Cataloging Records: A Methodical Front-End Process

Elaine Sanchez
Leslie Fatout
Aleene Howser
Charles Vance

ABSTRACT. Electronic resources, and ebooks in particular, have become a very important source of information for library patrons. When our library was given access to more than 20,000 ebooks, we were faced with bibliographic records of unknown quality. To provide high-quality records in a timely manner, we identified as many potential problems as we could, worked with reference staff to create the best PAC displays, and created efficient record-editing methods to address these issues prior to loading the records in our database.
This article documents that process and describes the MarcEdit, Word, and Excel strategies used to methodically correct and improve these records. It also offers practical solutions and procedures for database maintenance and quality control for NetLibrary or any outsourced cataloging records. The future of ebooks and other related cataloging issues, including authority control, are also discussed as points that remain to be addressed.

Elaine Sanchez is Monographs Cataloging Librarian/Unit Head; Leslie Fatout is Library Systems Coordinator/Circulation Librarian; Aleene Howser is Head Monographs Cataloging Assistant; Charles Vance is Database Management Services Librarian/Unit Head, all at Alkek Library, Texas State University-San Marcos, 601 University Dr., San Marcos, TX 78666-4604.

Technical Services Quarterly, Vol. 23(4) 2006. doi:10.1300/J124v23n04_04

KEYWORDS. NetLibrary, cataloging, procedures, cleanup, ebook, database maintenance, front-end, process, quality control, edit, outsourcing, online catalog, metadata, record loading, bibliographic, MarcEdit, Microsoft Word, Microsoft Excel, macro, spreadsheet

INTRODUCTION

NetLibrary (www.netLibrary.com), a division of OCLC, is the major provider of electronic books (ebooks) to the library community,[2] with more than 60,000 ebooks from more than 400 publishers, covering all subject areas, and serving more than 6,700 libraries.[3] Ebooks are published works such as research materials, reference books, and textbooks that have been converted into digital format for electronic distribution. They are an important supplement to print materials, as they provide instant access for patrons in remote locations.[4] Ebooks offer other advantages such as full-text searching; instant linking to related resources; no risk of theft, damage, or loss; potential savings in processing costs; and no physical space requirements.[5] Furthermore, while the debate rages about the future roles of printed books versus electronic resources, patrons are growing increasingly more reliant on electronic resources for their information needs. Given this fact, it is imperative that the library offer access to ebooks, ejournals, and all other remote electronic information.

In 2002, through TexShare (www.texshare.edu), Texas' statewide resource-sharing program, a NetLibrary collection of more than 20,000 electronic books was made available to member institutions. The ebooks were a welcome addition at a time when enrollment at Texas State University-San Marcos (www.txstate.edu) was growing rapidly (from 22,471 in 2000 to 26,375 in 2003) and there was a strong emphasis on using technology to improve and expand access.
However, only the most curious and adventuresome library users were aware of the ebook collection because the catalog, beyond which many users never venture, did not disclose it. Therefore, upon learning that TexShare offered NetLibrary MARC records to its member libraries at no cost, our university librarian initiated a high-priority project to load bibliographic records for the ebook collection into our catalog.

In the summer of 2003, reference librarians met with the system librarian and catalogers to plan for the addition of ebook records to the public access catalog. Reference staff wanted to ensure that these cataloging records would clearly distinguish electronic from print resources and provide simple, direct access to the ebooks.

Cataloging staff were concerned about the quality of the NetLibrary records; this would be the first time that bibliographic records which we had not cataloged would be loaded into our database. Our cataloging standards are high, with full encoding level, AACR2, ISBD, careful authority control, consistent series authority work, and complete classification and subject access. We wondered how NetLibrary records would compare to our own, and how they would affect the integrity of our carefully constructed bibliographic and authority databases.

We found the richest source of information to be Autocat, where many problems were listed and discussed. We reviewed published literature that covered the cataloging of ebooks and discussed which fields and standard values to include, but that literature did not at the time address cataloging problems in the set of NetLibrary records we were going to acquire. At the same time, the system librarian communicated with the Data Research Associates (DRA) Classic user group to inquire about system-specific problems and solutions other sites might offer. Points that were mentioned included dealing with the lack of physical items, hard copy records for the same titles, and variations in cataloging rules and practice. There are a couple of programs that can create or massage pre-existing MARC data; however, several of the users recommended Terry Reese's MarcEdit program as a gem of a tool which made batch editing records a breeze. Web site citations that describe the other programs are included in the bibliography at the end of this article.

Learning about the problems within the records, we knew we had to find a way to get them corrected. How would we deal with this task? Would there be a simple way to identify the different types of problems? Who would be responsible for finding and correcting the problems? It made sense to approach the cleanup systematically instead of dealing with records on an "as-found" basis. First, having the records isolated from the entire database allowed the task to proceed more quickly and efficiently. Second, doing most of the cleanup work before the records were loaded made them more accessible and useful to the patrons as soon as they appeared in the catalog. Third, the tools available with our current system (DRA Classic) are limited in scope and functionality; it was more efficient to use programs that had greater search, sort, replace, and edit capabilities: MarcEdit, Microsoft Word, and Microsoft Excel. We have succeeded in creating a methodical front-end process that raises the quality of the NetLibrary records, as nearly as possible, to our own local standards.
A link to detailed procedures appears in the notes at the end of this article.

SURVEY OF THE LITERATURE

A survey of the literature was performed on OCLC, the Internet, and in the Library Literature & Information Science full-text database regarding the quality and cleanup of NetLibrary records. The results were
• some information, articles, and Web sites on MarcEdit and two other MARC-editing software programs
• a small number of functional applications of these utilities in use by libraries, explaining in broad terms how certain features could be used to perform various types of edits, such as globally editing the 049 field
• a few articles that had brief mentions of non-traditional software applications used to perform editing prior to loading MARC records in catalogs.

We found no article or information on the overall process of using non-traditional, non-ILS-supplied editing utilities to correct MARC records prior to loading them in the catalog. Neither was there any information on the actual procedures for these workflows, nor any discussion of the quality controls and editing standards required to bring the records up to the quality level of records already existing in the catalog. This article presents the entire process, as experienced at Texas State University-San Marcos, which can be emulated for any outsourced MARC record cleanup project.

PROBLEMS AND RAMIFICATIONS

Reference staff were guardedly positive about adding bibliographic records for NetLibrary titles to the catalog, provided the following issues were addressed:
• The OPAC display must be patron-friendly and unambiguous.
• The link to the online titles must be as simple and direct as possible.
• Users must be able to search by keyword and limit the result to ebooks.
• Limitations in our system caused series with initial articles not to link with their authorized headings. • NetLibrary records with call number formatting problems; we used the 050 and 090 section of OCLC’s Bibliographic Format and Standards as examples. We also wanted to learn what other cataloging agencies had found. We searched Autocat archives to see what messages had been posted on the topic of NetLibrary records. There were disturbing problems with records created in 2001: • Duplicate records • 7xx fields stripped from records • Acceptable subject headings stripped from the full OCLC record, sometimes leaving only one subject heading Sanchez et al. 55 • MESH subjects retained, but LCSH deleted • Errors in 245s; non-English language cataloging description • Main entry vs. title entry errors, etc. Cataloging discussion lists detailed NetLibrary record problems in 2002 including the following: • Treatment of single serial issues and single volumes of multipart monographs that varies from our local practice • Records without call numbers • Print format had accompanying CD or software and a note or 300 $e accompanying material text reflecting this, and the NetLibrary record retained this note • Duplicate call numbers for records in classed-together series • The possibility of 126 duplicate records in the NetLibrary set released by SOLINET • Authority headings of all types conflicting with OCLC and LC authority files. Since 2002, the NetLibrary records have been corrected and redis- tributed to address the 2001 problems and some of the 2002 problems. Later records, made available in 2003 and 2004, demonstrate continued improvements in quality. Duplicate records are rare, and there are fewer records lacking call numbers. However, the following problems still remain: • Formatting errors in call numbers • Treatment of single serial issues and single volumes of multipart monographs differs from our local treatment of print serial and monographic set equivalents • Print format had accompanying CD or software and the 300 field $e accompanying material text was retained in the electronic format record • Duplicate call numbers for records in classed-together series • Authority conflicts with OCLC, LC, or local authority files. In our review of NetLibrary records we have identified continuing problems (see Figure 1). We are working on our processes to detect and fix these errors before they get into our database where they are harder to find and fix. It is important to address these errors because they create barriers to consistent and correct access. 56 TECHNICAL SERVICES QUARTERLY Sanchez et al. 57 FIGURE 1. NetLibrary Problems and Ramifications Figure 1 lists only the basic ramifications of the NetLibrary errors we identified. While some of these errors are not as critical, those that im- pair patron identification and use of desired materials are critical. AUTHORITY CONTROL In addition to the bibliographic description issues, authority control was a particular area of concern. Large batches of records were being loaded into our database with no systematic method to verify headings. Had the records been cataloged locally, we could have felt secure that between the efforts of our staff and the maintenance procedures we have in place, the headings entered would be up to our standards. In this case, however, we were dealing with authority records of unknown quality. The DRA system provides us with reports of headings (“index dumps”) that have been loaded into the database but not authorized. 
We use these reports to perform routine authority control, and with some minor adjustments to our normal procedures, we determined that they could be used for the NetLibrary record loads as well. The index dumps are set to run at the end of the work week to catch the headings entered during the course of normal cataloging. Because the NetLibrary records are loaded when other cataloging work is not be- ing done, we can run a special dump to isolate the NetLibrary records and their corresponding unauthorized headings. This special index dump created a large set of authority headings which were imported into an Excel spreadsheet. Headings were then di- vided among cataloging units and are being searched in our catalog and OCLC’s authority file for any necessary conflict correction, export of authority records from OCLC, or creation of local authority records. SOLUTIONS IDENTIFIED Our system librarian headed up the NetLibrary record load project. In February 2003, she initiated the record-review process by asking cata- logers what kinds of problems they found with these records and how the records should be edited to best serve the patrons and match the quality of the records in our existing database. Working with printouts of some of these records, we were able to determine how they were cata- loged, which fields and texts were used, what types of errors we could 58 TECHNICAL SERVICES QUARTERLY identify, and how they should be modified. By June, the following solu- tions and standard texts had been identified (see Figure 2). We also noted that these records were in OCLC-MARC format, not MARC21, as there were certain variable and fixed fields in these records Sanchez et al. 59 FIGURE 2. Solution and Standard Text with Rationale that were not included in MARC 21. Also, we observed that the 049 field for these records would have the holding code of the cataloging agency IKMN, rather than our own TXIM holding code, which we anticipated would cause no problems. The pros and cons of having holdings for ebooks were also discussed, and expediency required loading the bibliographic records without including a holding record for each title to contain call number and barcode information. The absence of holdings records causes the text “The library currently has no holdings for this title” to display in our Web catalog; Reference staff were concerned this could confuse pa- trons. Our system allows for customization of this message, but to date it has not caused problems. We have agreed that we will revisit the item record and holdings issue in the future, as our ejournals have holdings records, but our ebooks do not. Our serials cataloging unit had been cataloging ejournals since 1998 and already had established cataloging parameters. We reviewed these procedures and determined that their policies did not relate as closely to ebook cataloging needs as we had thought. 
Their electronic resource cataloging procedures did, however, reinforce the decisions we had already made: • Add text after the call number to alert patrons that the title is an ebook • Modify the text “Bibliographic record display” in the 856 $3 field to “Online version.” Finally, we determined parameters for the bibliographic load process: • Splitting the large file of records into sets of more manageable size so that pre-editing could be done more easily • Running the keyword index program on the split files of records rather than all at once, so that keyword indexing proceeded more quickly • Trying to use the load program to identify duplicate print titles for their NetLibrary counterparts (unfortunately, our bibliographic load program lacked this capability) • Tracking the database control numbers for each file in case we needed to isolate these records later for global updating. 60 TECHNICAL SERVICES QUARTERLY TOOLS AND STRATEGIES MarcEdit MarcEdit (http://oregonstate.edu/~reeset/marcedit/html/) is a free MARC-editing utility developed by Terry Reese, Assistant Librarian at Oregon State University’s Map and Aerial Photography Library. It in- cludes a tool that “breaks” MARC records into an easily readable, tagged text file, and another which restores broken records to MARC format. It also includes a powerful editor which provides the ability to find and replace text; edit fields, subfields, and indicators; and count the fields and subfields in a file of records. The system librarian downloaded MarcEdit and experimented with a sample of the NetLibrary records to familiarize herself with the pro- gram’s capabilities and limitations. She then met with cataloging staff that had been reviewing the records and identifying problems. It was clear that MarcEdit would be effective in fixing several of the problems. First, MarcBreaker was used to convert the records to display as tagged text (see Figure 3). With the records broken, various MarcEdit editing tools were used to fix several of the problems which had previously been identified (see Figure 4). Field Count MarcEdit includes a tool to count the fields and subfields in a file. This proved useful in determining various problems that affected access and accurate description, and certain descriptive cataloging practices that we do not use, including • call numbers lacking $b • 050 call numbers with multiple $a’s • records without call numbers • 245 fields with $n and $p that need review and revision of dupli- cate call numbers, along with other problems • 300 fields with $e’s indicating accompanying material, usually CDs or software • 440s with initial articles, which our DRA system does not link to the authorized series heading • 653 and 655, which we do not use in our online catalog • 6xx fields with indicators other than 0 or 1, as we use only LCSH Sanchez et al. 61 http://oregonstate.edu/~reeset/marcedit/html/ • 7xx fields with $e relator terms, which we do not use and which conflict with our authority records. The field count report was also very useful because it provided a sure method to review the contents of the MARC record tags and subfield 62 TECHNICAL SERVICES QUARTERLY FIGURE 3. Pre-Cleanup MARC Record Converted to Tagged Text (.mrk Format) codes for any other unidentified problems. For example, we found that we could use the overall number of certain required fields, such as the 049, to learn the exact number of records in the batch. 
We could then compare this number to other fields that should have the same number, such as 050/090, to determine if we had records lacking a call number. Figure 5 shows selections from a MarcEdit Field Count report and its usefulness in identifying problems in the content of the records. Many of these specific problem records can then be isolated by using the spreadsheet strategies described in the next section. Microsoft Word Macros A different solution was sought for a second group of problems because we were unable to fix them using MarcEdit (solutions may be Sanchez et al. 63 FIGURE 4. MarcEdit Tools and Fixing Bibliographic Record Problems FIGURE 5. MarcEdit Field Count Report Examples available in new versions of the software). Because the records were converted to display as text using MarcBreaker, Microsoft Word mac- ros were developed to handle this group of problems (see Figure 6). Microsoft Word Find and Replace The next task presented a challenge that sent the system librarian into Word’s online help. Our local cataloging policy dictates removal of 6XX fields with a second indicator of anything but 0 or 1, excluding 690. Word’s find/replace function, with the wildcard option, provided the solution (see Figure 7): Find what: =6[!9][0-9] ?[!01]*= Replace with: = Spreadsheet Strategies Importing MARC records into a spreadsheet enabled grouping fields to examine their contents for error identification. We initially arrived at the idea of using a spreadsheet to identify a record lacking a known field, the 050, and then later found it useful for pinpointing several other problems. The steps are outlined below: 1. Load the file of tagged records into an Excel spreadsheet. 2. Use Excel functions to number the lines and the records. 3. Sort the records to bring like tag numbers together. 4. Then: (a) visually inspect for missing or incorrect data; (b) visually inspect for missing sequential record numbers; or (c) select a group of records and perform a search for text within the selected group. This was an invaluable cleanup method, unavailable in our system’s traditional database maintenance programs. There were many other ar- eas for which we were able to use the spreadsheet. Depending on the type of error we found, we had two cleanup options: 1. Immediately edit the text-file copy of the records with the identified corrections. These were types of problems that were relatively straightforward, and required little or no cataloging judgment (see Figure 8). 64 TECHNICAL SERVICES QUARTERLY Sanchez et al. 65 FIGURE 6. Microsoft Word Macros and Fixing Bibliographic Record Problems 2. For errors which were more complex or required additional cata- loging tools, we extracted subsets of the spreadsheets containing those records. These we saved and printed for correction after the records were loaded into our catalog (see Figure 9). 66 TECHNICAL SERVICES QUARTERLY FIGURE 8. Spreadsheet Examples: Problems to be Corrected Before Records are Loaded FIGURE 7. Explanation of Word Find/Replace Using Wildcards TRANSFORMING CLEANUP TASKS INTO A METHODICAL FRONT-END PROCESS Library staff had already determined the specific edits that were needed and had the tools to make the changes, namely • MarcEdit to identify field and subfield anomalies and perform global edits • Word macros and find/replace editing tools to retrieve problem texts and perform global edits • Excel spreadsheets to perform data sorts which identified and grouped other anomalies and problems. 
With these tools, we had the ability to upgrade all NetLibrary cataloging records to our standards before we loaded them into our Sanchez et al. 67 FIGURE 9. Spreadsheet Examples: Problems to be Corrected After Records are Loaded bibliographic database. This was a breakthrough in our method of bibliographic record cleanup, which had previously been done after the records were in our catalog. Because these were new bibliographic record cleanup processes that used new editing tools, we must • Establish file-naming conventions and report parameters • Set up workflows and specific tasks for staff performing the work • Create procedures that detail a step-by-step approach to editing and revision tasks. Cataloging staff from the monographs cataloging unit and the database management services librarian created new workflows and correspond- ing documentation. As we created a cleanup process and procedure, we tried it out on a copy of our existing set of NetLibrary records, using the new cleanup tools and honing the procedure as necessary until its steps were correct and in the correct order. The result is a methodical front-end record cleanup process that is efficient, robust, and effective. After the cleanup procedures were completed and tested and revision steps documented, we began the work of implementing our newly es- tablished cleanup methodology on the actual NetLibrary records. The system librarian, who introduced us to the new tools, rejoined us at this point to boost confidence and provide insight as we put them to use. FUTURE CONSIDERATIONS While we have made every effort to identify and correct as many er- rors as possible before loading the records, it is likely that we will con- tinue to encounter new and different problems. As we do so, we will look for ways to incorporate new solutions into our pre-load cleanup procedures. There are also future considerations regarding NetLibrary records for which we are unable to provide definitive solutions; these include the following: General Issues • Permanence of collection and future viability of NetLibrary itself; NetLibrary has already been in financial trouble but was saved by OCLC. • Will TexShare funding be continued for NetLibrary titles? 68 TECHNICAL SERVICES QUARTERLY • In the mix of ebook providers (Project Gutenberg, Million Book Project, Internet Archive, etc.), what will NetLibrary’s role be in the future of electronic resource dissemination? • How will our arrangement with TexShare be affected by Baker & Taylor’s partnership with OCLC to provide NetLibrary titles? Technical Issues • Ongoing authority issues: Outsourcing vs. in-house cleanup • Item records for ebook titles: Yes or no? • Can this cleanup process be used for other outsourcing projects? • Quality of future NetLibrary records: Certain types of problems appear to have been corrected in recent batches; will this trend continue? • Will there be a mechanism put in place to allow error-reporting to the agency who catalogs NetLibrary records? • Monitor MarcEdit for functionality enhancements, and identify other potentially useful software or strategies • How to handle the relationship between print and ebook manifes- tations of the same title • Will this front-end cleanup process of vendor-supplied biblio- graphic records become a regular database maintenance function? • Will metadata description of ebooks assume a larger role in the future, perhaps replacing MARC as a communication format? 
With this project we have achieved a high degree of quality control over cataloging records from one specific source of electronic re- sources. However, with the proliferation of ebook sources that use very basic cataloging or none at all, we will face larger issues of how, or if, we can continue to provide consistent, quality cataloging and authority control for these titles. If some entity does not provide cataloging for the universe of ebooks, will other methods such as basic Internet search en- gines be sufficient to provide access? CONCLUSION Cataloging, reference, and the system librarian worked together as key players in the NetLibrary record load process. Our desire to have front-end quality control over the vendor-supplied records required that we look outside of the traditional database maintenance tools available in our integrated online system. Our literature survey revealed no com- Sanchez et al. 69 prehensive information available on the process that we envisioned. However, our system librarian found the tools and initiated the process. MarcEdit, Word, and Excel were identified as the software applications that would fill this role. They have given us more flexibility and power in our database maintenance work than we had ever imagined possible. We will continue to use them in the future in order to assure the quality of any other vendor-supplied records before we load them into our catalog. NOTES 1. These procedures are employed in the Alkek Library Cataloging Department of Texas State University-San Marcos, to perform cleanup of NetLibrary records prior to loading them into the database. They encompass a variety of tasks that assure the qual- ity of NetLibrary records and provide a clear and consistent OPAC display. http:// www.library.txstate.edu/cat/netlibrary/procedures/index.htm 2. OCLC PICA: NetLibrary Ebooks Available. http://oclcpica.org/?id=1012& ln=uk 3. DA Information Services–Electronic Media: To Sample Some eBooks, Go to NetLibrary. http://www.dadirect.com/Emedia/emediatitle1.asp?id=3 4. FAQs: NetLibrary: MINITEX eBooks Collection: CPERS: Programs and Services: MINITEX. http://www.minitex.umn.edu/ebook/netlib/faq.asp 5. Hyatt, Shirley, “netLibrary,” Ariadne, Oct. 10, 2002. http://www.ariadne.ac.uk/ issue33/netlibrary/ BIBLIOGRAPHY MarcEdit Bigwood, David, “MarcEdit,” Catalogablog, Jan. 11, 2005. http://catalogablog. blogspot.com/2005/01/marcedit.html Kentucky State University Libraries, Department Manual: Using MarcEdit to Edit Large Numbers of Bib Records. http://www.lib.ksu.edu/depts/techserv/manual/ general/marcedit.html MarcEdit Homepage: Your Complete Free MARC Software. http://oregonstate. edu/~reeset/marcedit/html/ Palermo, Natalie, Using the MarcEdit Program, LOUIS Users Conference 2002, Loui- siana State University. http://www.nsula.edu/watson_library/acrl/Using%20the %20MarcEdit%20Program.ppt Other MARC-Editing Tools MITINET/marc Library Services: MARC Magician. http://www.mitinet.com/ Products/ p_cleanup.htm The next link takes you to search results for the MARC subsection of Perl scripts within CPAN.ORG, Comprehensive Perl Archiving Network. 
http://search.cpan.org/ search?query=marc&mode=all 70 TECHNICAL SERVICES QUARTERLY http://www.library.txstate.edu/cat/netlibrary/procedures/index.htm http://oclcpica.org/?id=1012& http://www.dadirect.com/Emedia/emediatitle1.asp?id=3 http://www.minitex.umn.edu/ebook/netlib/faq.asp http://www.ariadne.ac.uk/ http://catalogablog http://www.lib.ksu.edu/depts/techserv/manual/ http://oregonstate http://www.nsula.edu/watson_library/acrl/Using%20the http://www.mitinet.com/ http://search.cpan.org/ work_7srvr5b4o5dgzmrihfs7fto5nq ---- Microsoft Word - electronicmonograph *updated links December 2015, July 2019 Standards for Cataloging Electronic Monographs Database Management and Standards Committee June 2010 revised February 2012, August 2019 For bibliographic records for OhioLINK e‐books and other electronic monographs, OhioLINK requires full level cataloging. OhioLINK prefers: • OCLC records • Provider‐neutral cataloging practice For MARC field‐by‐field standards, please refer to Provider‐Neutral E‐Monograph MARC Record Guide, pp. 4‐8. Look for this Guide from http://www.loc.gov/aba/pcc/bibco/index.html Vendors: Please refer to MARC Record Guide for Monograph Aggregator Vendors http://www.loc.gov/aba/pcc/scs/documents/PN‐RDA‐Combined.docx The following requirements are specific for the OhioLINK environment: MARC Field Required Content or “Exact Phrase” Requirement for records created by OhioLINK members Requirement for records supplied by vendors Encoding Level I – full level from OCLC members or blank – full level (LC/NLM) M R 001 OCLC control number M R* 050/060/082/086 Classification numbers strongly encouraged R R 506 “Available to OhioLINK Libraries” M O 6XX #0 Library of Congress Subject Headings A A 6XX #2 Medical Subject Headings R O 710 2# “Ohio Library and Information Network” M M 856u Unique URL to connect to e‐book. For e‐books loaded on an OhioLINK server, this is an ohiolink.edu URL. M M 8563 “[Vendor/Collection name]” M M 856z “Connect to resource” M M * Vendors not using OCLC as the primary cataloging source should create control numbers that are unique. The vendor‐created unique control numbers must have an alphabetic prefix and/or suffix in order to not conflict with OCLC record numbers in the OhioLINK state‐wide system. M = Mandatory A = Mandatory if Applicable R = Required if Available O = Optional http://www.loc.gov/aba/pcc/bibco/index.html http://www.loc.gov/aba/pcc/scs/documents/PN-RDA-Combined.docx work_7zrisikd4jgrjijkgrwncyna7q ---- Digitizing Oral History: Can You Hear the Difference? OCLC Systems & Services: International digital library perspectives Digitizing Oral History: Can You Hear the Difference? Anthony Cocciolo Article information: To cite this document: Anthony Cocciolo , (2015),"Digitizing Oral History: Can You Hear the Difference?", OCLC Systems & Services: International digital library perspectives, Vol. 31 Iss 3 pp. - Permanent link to this document: http://dx.doi.org/10.1108/OCLC-03-2014-0019 Downloaded on: 30 June 2015, At: 07:46 (PT) References: this document contains references to 0 other documents. 
work_7zrisikd4jgrjijkgrwncyna7q ---- Digitizing Oral History: Can You Hear the Difference?

OCLC Systems & Services: International digital library perspectives, Vol. 31, Iss. 3
Digitizing Oral History: Can You Hear the Difference?
Anthony Cocciolo
Permanent link to this document: http://dx.doi.org/10.1108/OCLC-03-2014-0019

Introduction

For the last several years, my students from {remove name of institution for review purposes} MSLIS program have engaged in the digitization of oral histories ({references removed for review purposes}). Typically, the digitization activity is part of a larger effort to make oral histories available over the web, and usually involves students deploying a content management system, designing the front-end website, assigning metadata, and working within a rights framework. The oral histories come from a variety of archival institutions in the New York City area, including the Lesbian Herstory Archives, the Archives of the American Jewish Joint Distribution Committee, the Archives of the Center for Puerto Rican Studies at Hunter College, and the Archives of the American Field Service. These oral histories are most often contained on magnetic audiocassette, a once ubiquitous format now increasingly obscure.

As semesters progressed, I upgraded the digitization lab equipment to better adhere to professional audio digitization and archiving practices. Particularly salient practices are captured in IASA-TC 04: Guidelines on the Production and Preservation of Digital Audio Objects (2009), as well as the work of Casey and Gordon (2007). For example, each audio digitization workstation within the classroom (there are four in total) was upgraded to include a high-quality analog-to-digital converter (the Benchmark ADC1 USB[1]). These digital converters allowed for the creation of audio files at the rate and bit depth recommended by audio archivists: 24 bits stored 96,000 times per second (or 96 kHz). Other upgrades include the setup of dual sets of headphones per workstation so that students could verify that the audio content being played back corresponded exactly with what was being digitized.

[1] http://www.benchmarkmedia.com/adc/adc1-usb
For example, the digitization of one side of a tape in CD quality format usually results in files around 465 MB, and at archival quality around 1.5 GB. This increase in file size was manageable, but I did worry about the ability of the archival institutions I was partnering with—especially ones with nothing other than grassroots support—to maintain digital copies of audio tapes of roughly 3 GB each over the long term. Was the quality of the digital reproductions worth the tripled file size? To research this question, I turned the issue over to my class by posing the following research questions:

RQ1 - Can MSLIS students discern the difference between oral histories digitized at archival quality (96 kHz/24-bit) versus CD quality (44.1 kHz/16-bit)?

RQ2 - Additionally, how important do they believe this difference is?

I chose my students as the research subjects not only for convenience's sake, but more importantly for two reasons. First, a majority of the students are under age thirty, which means they are not as subject to loss of hearing as older adults. There is well-documented evidence that adults lose their ability to hear higher-frequency sounds as they age, which will be discussed more thoroughly in the literature review. The second reason they were chosen is that, as emerging librarians, archivists, and information professionals who have voluntarily chosen to take a course on digital archives, they are more committed to the preservation of historic material than the average adult. Hence, they are more likely to expend effort in deciding what is best for both the collection and the archival institution being served. Before the research questions are directly addressed, relevant literature related to the digitization of oral history, as well as psychoacoustics, will be introduced. This will be followed by the study methodology, results, and conclusions.

Literature Review

Oral History and Digitization

Oral history "collects memories and personal commentaries of historical significance through recorded interviews," which then get "transcribed, summarized or indexed and then placed in a library or archives" (Ritchie, 2003, p. 19). Frisch (1990) observes that oral history is "a powerful tool for discovering, exploring and evaluating the nature of the process of historical memory—how people make sense of their past, how they connect individual experience and its social contexts and how the past becomes part of the present, and how people use it to interpret their lives and the world around them" (p. 188).

For libraries and archives, the faithful, accurate and authentic reproduction of oral histories and sound recordings is particularly important because these aspects could influence the meanings perceived by future researchers. For example, if a researcher has reason to believe that a recording is not faithful to the original source (e.g., Why did the sound cut out? What might be missing?), she may lose trust in the source and decide not to use it in her research. Similarly, it could influence a researcher's interpretations of the persons encoded on the recording (e.g., Why does his voice sound so different from other recordings? Is that really him speaking? Was he really that gregarious? Or dull? Why did people think she was such a great singer?).
Thus, much of the value of a primary source such as a sound recording derives from its integrity, which also influences the researcher's interpretations of the subjects found in a recording.

Through the 1990s, recording oral histories on magnetic audiocassette tape was fairly standard practice, as evidenced by the array of archival institutions that hold oral histories in this format (e.g., Weig et al., 2007). Today, analog recording technology is considered obsolete and has been replaced by digital technology for both production and preservation (Alten, 2011; Casey and Gordon, 2007). It should be acknowledged that there are individuals who record new music on analog equipment for aesthetic purposes (Rudser, 2011), although there are no known examples of individuals continuing to use this technology for recording oral histories. With respect to new oral histories, recordings are most often captured using digital technology, such as audio or video recorders with solid-state memory cards. For older oral histories contained on an analog medium, the recommended best practice is to transfer the recordings to a digital format and preserve the original carrier (Casey and Gordon, 2007). Original carriers are best stored in cool and dry environments, and the digital files are best stored in trustworthy digital repositories (Casey and Gordon, 2007; RLG and National Archives, 2008).

Audio archivists have developed well-accepted practices for the digitization of sound recordings, best captured in IASA-TC 04: Guidelines on the Production and Preservation of Digital Audio Objects (IASA, 2009), as well as the work of Casey and Gordon (2007). However, relatively little research has been conducted that illustrates how average users perceive the difference between reproductions created using archival audio standards versus lesser standards. Related research has explored minimum standards for digitizing speech-based recordings on audiocassette (e.g., 48 kHz/24-bit), which could be useful for smaller organizations where the cost of digitization and file maintenance is too great (Jackson, 2013). To address the issue of what individuals hear, relevant research from psychoacoustics will be introduced.

Psychoacoustics

Psychoacoustics research aims to "determine the relation between the physical stimuli (sounds) and the sensations produced in the listener" (Plack, 2005, p. 4). With respect to psychoacoustics and sound reproduction systems, one may assume that humans would prefer the highest-fidelity—or most accurate—reproduction. However, past research has demonstrated that listeners do not necessarily prefer listening to the highest-fidelity sound recordings. As early as 1956, Kirk demonstrated that learning and past listening experiences help define listening preferences. In studying 210 college students, he found that the "average college student prefers music and speech reproduced over a restricted frequency range rather than an unrestricted frequency range" (p. 1113). This research indicates that particular sonic profiles may become associated with positive emotions produced not only by the sounds themselves but by the related listening experiences (e.g., pleasurable moments with friends), causing listeners to prefer these sonic profiles over sonic profiles that offer higher fidelity.
This phenomenon has been attributed to the preference among some contemporary college-age students for sound recordings subjected to lossy compression, where a perceptual sound encoder (such as the MP3 encoder) removes sounds that humans should not be able to hear (e.g., when one sound masks another) (Sterne, 2012). Newspapers and magazines have picked up related research from music scholars such as Jonathan Berger, which has resulted in headlines such as "Young music fans deaf to iPod's limitations" and "Are iPods killing music perception?" (Ahmed and Burgess, 2009; LeFevre, 2009). However, this point continues to be debated, as new research suggests that teenagers may prefer higher-fidelity recordings (Olive, 2011). Linking favorable experiences to a particular sonic profile could explain the preference some have for analog recordings on magnetic tapes or vinyl records. Despite the observation that analog recordings have a lower signal-to-noise ratio than digital recordings (e.g., tape hiss or the crackle of a vinyl record), some listeners continue to prefer these sonic profiles (e.g., Felten, 2012).

In addition to learning experiences that determine listening preferences, other factors, such as physiological factors, contribute to what individuals hear. As mentioned earlier, one particularly salient aspect is age: hearing loss is "a very common problem affecting older adults" (Cruickshanks et al., 1998, p. 879). Alten (2011) notes that as "gradual deterioration of the auditory nerve endings occurs with aging, it usually results in a gradual loss of hearing first in the mid-high-frequency range, at around 3,000 to 6,000 Hz, then in the lower-pitched sound" (p. 17). Thus, after age 55 there is an accelerating rate of loss (Patterson et al., 1982). Listener age has also been shown to affect the ability to hear speech in noisy environments (Dubno et al., 1984). In sum, the research from psychoacoustics indicates that there is not a single or ideal way to reproduce sound, and reproduction preferences are in part determined by past experiences and learning, the content itself, and physiology.

Methodology

Study participants listened to three sets of oral histories clipped to two minutes in duration. The clips were created from oral histories that were being digitized in class during the Fall 2013 and Spring 2014 semesters across five courses taught by the researcher (three sections of LIS 665 Projects in Digital Archives, and two sections of LIS 668 Projects in Moving Image and Sound Archiving). The researcher created these clips by playing back the first two minutes of an audiocassette tape and digitizing at 96 kHz/24-bit. The tape player used is an Alesis Tape2USB connected to a Benchmark ADC1 USB analog-to-digital converter. The file is saved as a 24-bit WAV file using the open-source software program Audacity (http://audacity.sourceforge.net/). A copy of the WAV file was down-sampled to 44.1 kHz/16-bit using the Windows program r8brain (http://www.voxengo.com/product/r8brain/), with the conversion option for "Highest Quality" down-sample conversion.
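As an aside, the same down-conversion step can be sketched in a few lines of Python. This is only an illustration of the step, not the r8brain workflow actually used here, and the polyphase resampler below is not r8brain's algorithm:

import soundfile as sf
from scipy.signal import resample_poly

# Read the 24-bit archival master; soundfile returns float samples plus the rate.
data, rate = sf.read('master_96k_24bit.wav')    # rate == 96000

# 44100/96000 reduces to 147/320: polyphase resampling along the time axis.
down = resample_poly(data, 147, 320, axis=0)

# Write the CD-quality derivative as 16-bit PCM at 44.1 kHz.
sf.write('derivative_44k_16bit.wav', down, 44100, subtype='PCM_16')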
The oral histories used to create the two-minute clips include:

1) An interview by dance critic Barbara Newman with dance choreographer Mark Morris from August 15, 1996. This tape is part of a personal collection of interviews with dancers and choreographers that formed the basis of her book, Grace Under Pressure: Passing Dance Through Time. File size information: archival: 70 MB; CD quality: 21.4 MB.

2) An interview by Elizabeth Kennedy with oral history subjects Bobbi and Terri (fictitious names) for the Buffalo Women's Oral History Project. These oral histories formed the ethnographic dataset for Kennedy and Davis' study of lesbian women in Buffalo from the 1930s-1950s, resulting in the seminal LGBT studies text Boots of Leather, Slippers of Gold: The History of a Lesbian Community. This interview was from September 25, 1982. This tape is held at the Lesbian Herstory Archives in Brooklyn, NY. File size information: archival: 69.5 MB; CD quality: 21.3 MB.

3) An interview with Del Martin and Phyllis Lyon from May 9, 1987. Martin and Lyon formed the first lesbian political and social organization in the United States in 1955. The group was formed in San Francisco and named the Daughters of Bilitis. This tape is also held at the Lesbian Herstory Archives in Brooklyn, NY. File size information: archival: 69.2 MB; CD quality: 21.2 MB.

The two versions of each recording were randomly assigned the letter "A" or "B," and students were asked to play back both "A" and "B" and decide which was the "archival quality" (24-bit/96 kHz) file and which was the "CD quality" (16-bit/44.1 kHz) file. All six files were placed on the desktop of a Windows computer and grouped together visually. Each workstation in the computer lab was connected to a pair of inexpensive JVC ear bud headphones that had been cleaned by the researcher using an anti-bacterial cloth. The same brand and model of headphones were used by all participants to eliminate any differences introduced by varieties of headphones. The students were asked not to inspect any of the metadata related to the files, such as file size, which would give away which recording was which. The students were also told that they could re-listen to the recordings, rewind, and compare as much as needed to reach their determination. The students then filled out the survey included in the appendix and returned it to the researcher. The research was conducted across five class sessions in October 2013 and February 2014. The six files used for this study are available for listening and download ({URL removed for review purposes}).

The participants in the study were graduate students in an MSLIS program. 53 individuals participated in this study, with an average age of 30.2 (standard deviation of 7.9). The oldest participant was 58 years old and the youngest was 23 years old. 79% of participants were female, and 21% male.

Results

RQ1 - Can MSLIS students discern the difference between oral histories digitized at archival quality (96 kHz/24-bit) versus CD quality (44.1 kHz/16-bit)?

Based on the responses of 53 MSLIS students, Table 1 reveals that, on average, students could discern which was the archival quality versus the CD quality recording less than half of the time. However, this was dependent on the actual recording: Test A had over sixty percent correct identification, while B and C were only correctly identified around 35 percent of the time.
Table 1. Percentage of sound recordings correctly identified as being the "archival" digitization versus the "CD-quality" digitization

Test | % of participants who correctly identified the archival quality digitization
A | 62.3%
B | 34.0%
C | 37.7%
Overall | 44.7%

RQ2 - Additionally, how important do they believe this difference is?

Participants were asked: given the sound test they had just listened to, and if they were digitizing an important oral history collection, how important is the difference between CD quality (16-bit/44.1 kHz) and archival quality (24-bit/96 kHz) digitized sound? They answered this question on the scale (0 = Not at all important, 1 = A little bit important, 2 = Important, 3 = Very important). The mean response was 1.3 (standard deviation of 0.78), with the most frequent response being "A little bit important" (31 individuals marked this response).

The researcher opened up a discussion after the surveys were submitted about how important the difference in quality was. One student mentioned that there was a slight difference in what you can hear in terms of the background noise between the archival and CD-quality versions. The archival version made clearer the commotion in the background; however, she noted there was no difference in what could be heard from the primary speakers in the oral histories. She said that if the purpose of an oral history is to record the vocalized memories of the speaker, and the background noise is not the primary concern (as opposed to another kind of recording that might try to capture the noise of a landscape), then the quality difference was of little importance. Another student mentioned that the sound of the tape hiss was slightly different between the two recordings; however, this was of little importance to her.

Discussion and Limitations

Results from this study reveal that MSLIS students have difficulty discerning the difference between oral histories digitized at archival quality (24-bit/96 kHz) and CD quality (16-bit/44.1 kHz). However, this can vary to some extent based on the actual content of the recording. After completing this discernment test, the students most often rated the difference between the two formats as being "a little bit important." However, there was a minority of students—16 in total—who thought the difference was important or very important.

One limitation of this study was that the students all used inexpensive ear bud headphones in a classroom, within a less than ideal sound digitization environment. Higher-quality headphones that encompass the entire ear could possibly reveal more details. And although the classroom was relatively quiet during the playback of the tests, some background noise is always present in the classroom's Manhattan-based environment (e.g., street noise, noise from floors above and below). With respect to playback environment, Casey and Gordon (2007) recommend that "preservation transfer work is best undertaken in a studio designed as a critical listening space" (p. 10). At a minimum, the playback "studio must at least be free from ambient noise, it must be removed from other work areas and traffic, and its acoustic weaknesses should be well understood" (p. 10). Although the facilities for the sound test and the inexpensive headphones were chosen because they were the only resources available to the researcher, they are appropriate choices because they are not dissimilar to the environment where most researchers will listen to sound recordings.
For example, inexpensive headphones are in wide use, such as the white ear buds that come pre-packaged with the Apple iPhone, and library reading rooms and research spaces are filled with ambient noise. A final limitation is that the tape deck used to play back the original media is a consumer-grade tape deck of recent vintage. Most audio archivists recommend using cleaned and restored professional-grade equipment (Casey and Gordon, 2007; Jackson, 2013). For example, consumer-grade equipment does not allow adjustment of the azimuth, which is the angle at which the record/playback head contacts the tape. Unfortunately, this equipment can only be bought used, and it is difficult to purchase such equipment within the confines of contemporary higher education purchasing practices, which prefer to purchase new equipment from select retailers and shy away from allowing purchases from less well-worn paths.

Conclusion and Implications

A major implication of this study is that most untrained listeners may not be able to tell the difference between archival quality and CD quality digitization. If MSLIS students—who as part of their coursework have received training in best practices in audio digitization—conclude on average that the difference is "a little bit important," it is likely that the general public may feel similarly or even more strongly. For archival institutions, this could complicate justifying the additional digitization expense as well as the increased file size. However, this is not to suggest that archivists abandon well-established sound digitization practices that produce results that audio archivists (and those able to hear fine-grain audio differences) find superior. Rather, it does imply that additional work may be needed to train listeners to discern these fine-grain differences and appreciate the highest-fidelity replication of original audio recordings. This is no easy task; however, some listening education could help. For example, a series of exercises could be designed where important details that can only be revealed through higher-fidelity recordings—and are masked in lesser-quality recordings—could help make the point of maintaining as much of the original audio content through digitization as possible.

References

Alten, S. R. (2011), Audio in Media, 9th Edition, Wadsworth, Boston, MA.

Ahmed, M. and Burgess, K. (2009), "Young Music Fans Deaf to iPod's Limitations", London Times Online, 5 March, available at: http://www.thetimes.co.uk/tto/technology/gadgets/article1860325.ece (accessed 7 December 2013).

Casey, M. and Gordon, B. (2007), Sound Directions: Best Practices for Audio Preservation, Harvard University and Indiana University, Cambridge, MA and Bloomington, IN.

Cruickshanks, K. J., Wiley, T. L., Tweed, T. S., Klein, B. E. K., Klein, R., Mares-Perlman, J. A. and Nondahl, D. M. (1998), "Prevalence of Hearing Loss in Older Adults in Beaver Dam, Wisconsin: The Epidemiology of Hearing Loss Study", American Journal of Epidemiology, Vol. 148 No. 9, pp. 879-886.

Dubno, J. R., Dirks, D. D. and Morgan, D. E. (1984), "Effects of age and mild hearing loss on speech recognition in noise", Journal of the Acoustical Society of America, Vol. 76 No. 1, pp. 87-96.

Felten, E. (2012), "It's Alive! Vinyl Makes a Comeback", Wall Street Journal, 27 January, available at: http://on.wsj.com/x5SrGy (accessed 22 January 2014).
Frisch, M. (1990), A Shared Authority: Essays on the Craft and Meaning of Oral and Public History, State University of New York Press, Albany, NY.

IASA Technical Committee (2009), Guidelines on the Production and Preservation of Digital Audio Objects, 2nd Edition, edited by Kevin Bradley, available at: http://www.iasa-web.org/tc04/audio-preservation (accessed 22 January 2014).

Jackson, D. J. (2013), "Defining Minimum Standards for the Digitization of Speech Recordings on Audio Compact Cassettes", Preservation, Digital Technology & Culture, Vol. 42 No. 2, pp. 87-98.

LeFevre, T. (2009), "Are iPods killing music perception?", GizMag, 18 March, available at: http://www.gizmag.com/ipods-killing-music/11236/ (accessed 7 December 2013).

Kirk, R. E. (1956), "Learning, a Major Factor Influencing Preferences for High-Fidelity Reproducing System", Journal of the Acoustical Society of America, Vol. 28 No. 6, pp. 1113-1116.

Olive, S. (2011), "Some New Evidence That Teenagers May Prefer Accurate Sound Reproduction", conference paper presented at the Audio Engineering Society Convention, 20-23 October, New York, NY.

Patterson, R. D., Nimmo-Smith, I., Weber, D. L. and Milroy, R. (1982), "The deterioration of hearing with age: Frequency selectivity, the critical ratio, the audiogram, and speech threshold", Journal of the Acoustical Society of America, Vol. 72 No. 6, pp. 1788-1803.

Plack, C. J. (2005), The Sense of Hearing, Psychology Press, New York, NY.

Research Libraries Group and National Archives and Records Administration (2008), Trustworthy Repositories Audit and Certification: Criteria and Checklist, available at: http://wiki.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/trac.pdf (accessed 22 January 2014).

Ritchie, D. A. (2003), Doing Oral History: A Practical Guide, 2nd Edition, Oxford University Press, New York, NY.

Rudser, L. (2011), "Miss the Hiss? Fanatics Flip for Tunes on Cassette Tapes", Wall Street Journal, 21 October, available at: http://on.wsj.com/nMyBZB (accessed 15 January 2013).

Sterne, J. (2012), MP3: The Meaning of a Format, Duke University Press, Durham, NC.

Weig, E., Terry, K. and Lybarger, K. (2007), "Large Scale Digitization of Oral History: A Case Study", D-Lib Magazine, Vol. 13 No. 5/6, available at: http://www.dlib.org/dlib/may07/weig/05weig.html (accessed 7 December 2013).

Biographical Details

Anthony Cocciolo is an Assistant Professor at Pratt Institute School of Information and Library Science, where his research and teaching are in the areas of digital archives, moving image and sound archives, and digital libraries. He completed his doctorate in the Communication, Computing, Technology in Education program at Teachers College, Columbia University. Prior to Pratt, he was the Head of Technology for the Gottesman Libraries at Teachers College, Columbia University.
Appendix

Survey: Digitizing Oral Histories: Can you hear the difference?

Test 1: Which sound file is the archival quality (24-bit/96 kHz) sound file (as opposed to the CD quality (16-bit/44.1 kHz))? (Circle one answer)
A   B   I can't tell the difference

Test 2: Which sound file is the archival quality (24-bit/96 kHz) sound file (as opposed to the CD quality (16-bit/44.1 kHz))? (Circle one answer)
A   B   I can't tell the difference

Test 3: Which sound file is the archival quality (24-bit/96 kHz) sound file (as opposed to the CD quality (16-bit/44.1 kHz))? (Circle one answer)
A   B   I can't tell the difference

Given this sound test, and if you were digitizing an important oral history collection, how important is the difference between CD quality (16-bit/44.1 kHz) and archival quality (24-bit/96 kHz) digitized sound? (Circle one answer)
Not at all important   A little bit important   Important   Very important

What year were you born in? 19________

What is your gender? (Circle answer) Male   Female

work_a3e5bua4tfbfxcqncqjhdpe5ku ----

Qualitative and Quantitative Methods in Libraries (QQML) 6: 355-370, 2017
Received: 21.5.2017; Accepted: 2.9.2017
ISSN 2241-1925 © ISAST

Identity and Access Management for Libraries

Qiang Jin (Senior Coordinating Cataloger and Authority Control Team Leader, University Library, University of Illinois at Urbana-Champaign, USA)
Deren Kudeki (Visiting Research Programmer, School of Information Science, University of Illinois at Urbana-Champaign, USA)

Abstract. Linked open data will change libraries in a dramatic way. It will redesign metadata and display metadata on the web. In order to prepare for linked open data, libraries will gradually transition from authority control creating text strings to identity and access management using identifiers to select a single identity. The key step in moving to linked open data for identity and access management in libraries is to correct thousands of incorrect name and subject access points in our online catalogs. This article describes a case study of the process of cleaning up unauthorized access points for personal and corporate names in the University of Illinois Library online catalog. The authors hope that this article will help readers at other libraries prepare for the linked open data environment.

Keywords: Authority maintenance, controlled vocabularies, identity and access management, linked open data

1. Introduction

Linked Data uses the Web to connect data, information, and knowledge on the Semantic Web using URIs and RDF. (1) Linked Data provides library users with searching across a vast range of local and remote content through a single point of entry, with a comprehensive index into a library's collection. Authority control is the area of the Linked Data transition that has caused the most concern.
(2) It is critical when we group works by authors with URIs in the linked data environment. Unfortunately, thousands of incorrect personal names, corporate names, and subjects in our online catalogs hinder users from finding library resources.

Authority control is the process of selecting one form of name or title from among available choices and recording it, the alternatives, and the data sources used in the process. It is essential for effective retrieval of resources. It provides consistency in the form of access points used to identify persons, families, corporate bodies, and works. (3) Authority control is central to the organization of information. Authority control has gone through many changes over the last hundred years or so. One major new concept in authority control in libraries emerged in 2010 after IFLA created Functional Requirements for Authority Data (FRAD): A Conceptual Model (4), and Functional Requirements for Subject Authority Data (FRSAD): A Conceptual Model (5). These two models analyze the entities person, family, corporate body, work, expression, manifestation, item, concept, object, event, and place, and their relationships. They describe entities of highest significance, attributes of each entity, and relationships among entities in regard to user needs. FRAD helps catalogers rethink how catalogs should function and establish standards. In 2010, Resource Description & Access (RDA) (6), a new international cataloging code, was released, adopting FRAD and FRSAD.

Two major new concepts have emerged in libraries in recent years. One is the creation of the Bibliographic Framework Initiative (BIBFRAME), and the other is Schema.org. In 2012, the Library of Congress (LC) released the BIBFRAME model, a linked data alternative to MARC developed by Zepheira, a data management company. BIBFRAME is expressed in the Resource Description Framework (RDF). (7) It serves as a general model for expressing and connecting bibliographic data. It is set to replace the MARC 21 standards, and to use linked data principles to make library data discoverable on the Web while preserving a robust data exchange that supports resource sharing and resource discovery. In 2016, LC put out BIBFRAME model and vocabulary 2.0. The other major new concept is Schema.org, an initiative launched in 2011 by Bing, Google and Yahoo to create and support a common set of schemas for structured data markup on web pages. In 2012, OCLC took the first step toward adding linked data to WorldCat by appending Schema.org descriptive markup to WorldCat.org pages, making rich library bibliographic and authority data available on the Web. (8)

2. Background Information

The University of Illinois at Urbana-Champaign in the United States includes about 44,087 undergraduate and graduate students and 2,548 faculty. The University of Illinois at Urbana-Champaign (UIUC) Library is one of the largest libraries in North America. Its online catalog holds more than thirteen million volumes and 24 million items and materials in all formats, languages, and subjects, including 9 million microforms, 120,000 serials, 148,000 audio recordings, over 930,000 audiovisual materials, over 280,000 electronic books, 12,000 films, and 650,000 maps. (9) Although many large research libraries routinely do authority maintenance work, the UIUC Library has never done any systematic authority work for the last several decades.
In order to prepare to move to the linked data environment, in 2015 the UIUC Library decided to do authority maintenance work locally, due to a limited budget, to correct an estimated one million-plus incorrect personal names, corporate names, geographic names, series titles, and Library of Congress subject headings in our online catalog. Because of budget cuts over the last several years, library staff cutbacks, and the expanding need for professional librarians to work on digital collections, the resources for dealing with controlled vocabularies have become more constrained. Cleaning controlled vocabularies is clearly critical to the success of linked data, since controlled vocabularies are the basis for the URIs that create linkages. The goal is to provide enhanced discovery of library data, bringing together comprehensive collections of content with indexes for deeper searches across millions of unique descriptive data components for books, images, microforms, etc.

In 2015, a small team was formed at the UIUC Library, including one tenured librarian, one academic hourly, and two graduate assistants, and started working on authority maintenance. The team discussed and decided to begin with fixing personal and corporate names. Even though the Library has over 13 million volumes, there are actually around 8 million bibliographic records in our online catalog. The team ran reports using SQL queries (10) to find "see" references for personal and corporate names in our bibliographic records, because a major part of our problem with personal and corporate names belongs to this category. We separated the results of the queries into csv files holding up to 5,000 broken names each to run through the script we created locally. We solved various issues with the script before we were able to run all the files successfully. For each "see" personal or corporate name in our bibliographic records, the script looks for an authorized access point in the Library of Congress Authority File and in WorldCat, trying to find matches. If a "see" reference finds an authorized access point and WorldCat also lists that access point, we consider it a match. If no good access point is found, or WorldCat does not list the access point, we save that name for future human intervention. Out of nearly 8 million bibliographic records in our online catalog, we corrected around 300,000 personal and corporate names successfully by machine, but we still have over 100,000 personal and corporate names that need people to go over them one at a time, checking the Library of Congress Authority File and correcting them in our online catalog. Our precaution is necessary because our catalog is comprehensive.
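In outline, the script applies a two-source agreement test. A minimal sketch (the function and argument names are placeholders, not the script's actual code):

def resolve(see_reference, lc_lookup, worldcat_names):
    # lc_lookup returns the authorized access point for a "see" reference (or None);
    # worldcat_names holds the access points listed in the matching WorldCat record.
    authorized = lc_lookup(see_reference)
    if authorized is not None and authorized in worldcat_names:
        return authorized      # both sources agree: safe to correct by machine
    return None                # no match, or no confirmation: queue for human review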
3. Literature Review

Many experts in the cataloging field have stated the importance of authority control for decades. Michael Gorman indicates in his paper that authority control is central and vital to the activities we call cataloging. (11) Tillett says that authority control is necessary for meeting the catalog's objectives of enabling users to find the works of an author and to collocate all works of a personal or corporate body. (12) Hillman, Marker, and Brady mention that the basic goals of a controlled vocabulary are to "eliminate or reduce ambiguity; control the use of synonyms; establish formal relationships among terms and test and validate terms." (13)

For decades, many libraries have either done authority maintenance work in-house or hired commercial vendors for it. Some libraries have done in-house authority maintenance work because of the small scale of their catalogs, or because of their budgets. For example, the Wichita State University Libraries did authority maintenance work locally. They designed their workflow and redesigned their cataloging staff structure for authority maintenance work. At the end of the project, they felt that they needed the continuing support of their library administration's commitment to allocate more staff in order to continue their authority maintenance work. (14) Commercial vendors usually offer several types of authority control services, including authority work after cataloging bibliographic records, retrospective cleanup to supply authority control after, or as part of, a conversion project, and ongoing authority control for libraries. (15) Vendors have not only helped libraries do authority work for their print collections for decades; they have also started to do authority work for library digital collections in non-MARC formats. In 2013, Backstage Library Works did authority work for the University of Utah Library's digital collection in non-MARC form. Their project demonstrates that it is possible to complete major updates to records in order to bring them in line with the authorized terms in commonly used controlled vocabularies without a large amount of manual work. (16)

4. Initial Resources

The goal of this project is to replace variant name access points in our bibliographic records with their authorized forms. We are able to detect variant name access points through an SQL query that selects name access points that have been labeled as "s" in our database, which means the access point is a "see" reference with linked bibliographic records. While this provides a large number of access points to fix, it is worth noting that only unauthorized name access points that have been labeled as "see" references are being addressed by this process. If a name access point is not authorized and not labeled as a "see" reference, it will be ignored entirely. The reason we chose "see" references with linked bibliographic records is that a big percentage of our incorrect personal names, corporate names, and subject headings belong to this category, which lists name access points from our catalog that match references in authority records based on the text string only. The query we used is from the Consortium of Academic and Research Libraries in Illinois (CARLI), since we are part of the consortium, which consists of 134 member libraries in Illinois.
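The CARLI query itself is not reproduced in this paper, and ours was run in SQL Developer rather than from code. Purely to illustrate the selection logic, an equivalent pull could look like this (the table and column names are hypothetical and do not reflect Voyager's actual schema):

import csv
import cx_Oracle  # Voyager runs on Oracle; any DB-API driver would work the same way

# Hypothetical schema: a heading table whose reference_type flag is 's' for
# "see" references that still have linked bibliographic records.
SQL = """
SELECT h.display_heading, b.bib_id, b.oclc_number
FROM   heading h JOIN bib_heading b ON b.heading_id = h.heading_id
WHERE  h.reference_type = 's'
AND    b.bib_id BETWEEN :lo AND :hi
"""

connection = cx_Oracle.connect('user/password@voyagerdb')   # placeholder credentials
cursor = connection.cursor()
cursor.execute(SQL, lo=1, hi=5000)
with open('see_refs_00001_05000.csv', 'w', newline='') as out:
    csv.writer(out).writerows(cursor)    # one csv file per block of records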
This query was used to determine the scope of this project by running it on 8 blocks of 1,000 bibliographic records, to sample the more than 8 million bibliographic records in our online catalog. We found that there were 1,251 "see" references across the 8,000 bibliographic records examined. Using this rate, we estimate that there should be roughly 1 million "see" references in our database, all of which need to be examined for this project. There are five different kinds of access points, each of which may have its own intricacies for finding the correct authorized form. The table below shows what share of the "see" references each group contains.

Group | Share of "see" References
Name-title | 3.6%
Subject | 45.7%
Title | 3.7%
Name (corporate) | 22.1%
Name (personal) | 25%

Because of the large amount of data involved, and the special considerations needed for each group, the team decided to develop our tools and processes for authority maintenance around one specific group, while also building a general structure that can be used in the future when the other groups are given specific attention. We chose to focus our efforts on fixing personal names, because the specific considerations needed for finding the appropriate authorized access point for any given problematic personal name access point are relatively simple. Also, since personal names make up about a quarter of the problematic access points we can detect, it is easy to produce a large sample to work on, and fixing personal names fixes a good portion of all the problematic access points. Once we finished development on fixing personal names, we were able to expand to fixing corporate names relatively quickly, thanks to the similarities between personal and corporate names and a code structure built to support the addition of modules for fixing the other groups. We chose to develop our tools in Python due to the large number of third-party libraries available. Thanks to third-party libraries, we are able to easily establish Z39.50 connections and convert MARC-8 records into Unicode. We chose to run our queries in SQL Developer because of its speed.

5. Approach

The general process we developed to update unauthorized name access points splits into three distinct steps. The first step is querying our database for the unauthorized name access points in our bibliographic records. The second step is processing the results of that query. The final step is uploading the changes found in the processing. The specific implementation of this process for personal and corporate names is as follows:

5.1. The Query

We run a query in SQL Developer over a given range that returns a list of unauthorized personal and corporate names in bibliographic records in that range that are recognized as "see" variants, rather than authorized names. For each of these problematic names, the query returns several important pieces of information. It gives us the unauthorized name, complete with all associated subfields. It gives us the bibliographic id (BIBID) number, which is the unique identifying number of the bibliographic record containing the name in our database. And finally, it gives us the OCLC number of that same bibliographic record. (17)

5.2. Processing the Query Results

The query results are run through a Python script that searches for the authorized name that best fits the problematic name from the record. If an authorized name cannot be found, the problematic name is added to a list of other unresolved names that are meant to be reviewed by humans. If an authorized name can be found, the correction is applied to the bibliographic record, which is added to a master collection of corrected records.

To begin the processing of the query results, the problematic names are all read into the script and grouped by BIBID. Each full problematic record is then retrieved from the database, one at a time, using a Z39.50 request. Each problematic name from the query is then matched with a personal or corporate name field in the bibliographic record. This match is made by calculating the Levenshtein distance between each name from the query and each name in the record, and associating the pairs of names with the smallest calculated difference.
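A condensed sketch of those two steps, grouping the exported rows and pairing each problematic name with its closest name field (the tag list and variable names are illustrative, and record is assumed to be a pymarc Record object):

from collections import defaultdict
import Levenshtein   # the python-Levenshtein package

def group_by_bibid(rows):
    # rows: (unauthorized_name, bib_id, oclc_number) tuples from the query export.
    grouped = defaultdict(list)
    for name, bib_id, oclc_number in rows:
        grouped[bib_id].append((name, oclc_number))
    return grouped

def closest_name_field(query_name, record):
    # Pair the problematic name with the personal/corporate name field whose
    # text is the smallest edit distance away (the tag list is illustrative).
    fields = record.get_fields('100', '110', '700', '710')
    return min(fields, key=lambda f: Levenshtein.distance(query_name, f.value()))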
Once all the problematic names have been found in the record, each name is processed individually to find the authority record that best matches it. The first step of this process is to call a web application program interface (API) with the selected problematic name. For personal names, the Virtual International Authority File (VIAF) AutoSuggest API is called, and for corporate names the VIAF SRU Search API is called. The API returns a list of suggested authorities in VIAF, which is then reduced to a list of authorities that are listed as either personal or corporate names and have a Library of Congress Control Number. The list of Library of Congress Control Numbers (LCCNs) is used to retrieve the Library of Congress (LC) authority record for each name through a series of Z39.50 queries. For each authority record, the Levenshtein distance is calculated between the problematic name and all the versions of the name listed in the record. The smallest Levenshtein distance is found across all suggested authority records, and if that Levenshtein distance is small enough, the authorized name in that record is selected as the solution for the problematic name being examined. If the smallest Levenshtein distance found is not small enough, the problematic name, along with the authorized name with the smallest Levenshtein distance, is placed in a list of names that should be assessed by a human being.
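A sketch of the suggestion-and-selection step for personal names. The AutoSuggest JSON keys used here (result, nametype, lc) are assumptions about the service's response format, and fetch_authority_forms stands in for the Z39.50 retrieval of the 100/400 name strings from the LC record:

import requests
import Levenshtein

MAX_DISTANCE = 2   # the threshold we eventually settled on (see Section 6)

def best_lc_match(problem_name, fetch_authority_forms):
    # Personal names only; the corporate-name path goes through VIAF's SRU
    # Search API instead.
    r = requests.get('http://viaf.org/viaf/AutoSuggest',
                     params={'query': problem_name}, timeout=30)
    suggestions = r.json().get('result') or []
    best_lccn, best_distance = None, MAX_DISTANCE + 1
    for s in suggestions:
        if s.get('nametype') != 'personal' or not s.get('lc'):
            continue   # keep only personal-name suggestions that carry an LCCN
        for form in fetch_authority_forms(s['lc']):
            d = Levenshtein.distance(problem_name, form)
            if d < best_distance:
                best_lccn, best_distance = s['lc'], d
    return best_lccn   # None when nothing came within the distance threshold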
Once the best guess has been selected from VIAF's list of authority records, the selection needs to be independently verified, because some of the suggestions that come out of this process are incorrect, but very similar to the problematic name in question. To do this, the OCLC number from the bibliographic record is used to retrieve OCLC's version of the bibliographic record. Each of the names in the OCLC record is compared to the solution that has been selected. If any of the names in the OCLC record contains an exact match for all of the information in the selected name, that is considered independent confirmation. If the OCLC record fails to confirm the selected name, the problematic name along with our selection is placed in a list of names that should be assessed by a human being. Otherwise, the name selected by the VIAF API and the Levenshtein distance calculation has been confirmed by the OCLC record, and is now considered safe to upload to the database as a correction.

Once a name is to be uploaded, the problematic name is removed from the bibliographic record that was retrieved at the beginning of the process and replaced with the authorized name that has been selected. Once all the problematic names in a record have been processed, if any of them have been replaced, the updated record is written to a collection of updated records. When the script has finished running, this collection of records needs to be uploaded to the database to apply the changes.

5.3. Uploading the Changes

We are able to apply the changes to our bibliographic records one at a time by importing the collection of records that have been updated into the Voyager Client and manually overwriting each existing record with its revised version. Since around 300,000 bibliographic records need to be updated, we are now waiting to talk to the system services people in our consortium and ask them to upload these changes in bulk. Our authority maintenance work will also help the other 133 academic and research libraries in our consortium.
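For the bulk route, the collection of corrected records can be serialized to a single file for handoff. A sketch using pymarc (the variable updated_records is illustrative; the paper does not name its data structures):

from pymarc import MARCWriter

updated_records = []   # filled with corrected pymarc Record objects by the script

with open('corrected_bibs.mrc', 'wb') as out:
    writer = MARCWriter(out)
    for record in updated_records:
        writer.write(record)   # one binary MARC record after another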
Because of this we send multiple queries to AutoSuggest until results are returned, each query including a different subfield, Qualitative and Quantitative Methods in Libraries (QQML) 6: 355-370, 2017 363 and on with just the name. All of this adds up to the potential for multiple queries being sent for a single name until we get some result to examine. For example, for the problematic access point “aŚāhajī, cKing of Tanjore, dfl. 1684-1712.” our first two queries to VIAF contain the name and date information (http://www.viaf.org/viaf/AutoSuggest?query=Śāhajī,+fl+1684- 1712 and http://www.viaf.org/viaf/AutoSuggest?query=Sahaji,+fl+1684-1712) return no results. Our third query which includes the name and title information (http://www.viaf.org/viaf/AutoSuggest?query=Śāhajī,+King+of+Tanjore,) returns one unique personal name, which is then selected as our solution for this name. In contrast, a query that combines all three subfields (http://www.viaf.org/viaf/AutoSuggest?query=Śāhajī,+fl+1684- 1712+King+of+Tanjore,) returns no suggestions. Once AutoSuggest has given us results, we need to decide if any of the authorities that suggests are what is meant by the unauthorized access points we’re looking at. During development we noticed that in some of the authorities that AutoSuggest returned, the name from the query is listed as an associate, for example a search for “Robert Craft” would return the Library of Congress control number (LCCN) for “Igor Stravinsky,” which lists “Craft” as an associate. To avoid these cases, and find the best case, we began checking the 100 and 400 fields of the authority record against the unauthorized name. This successfully excluded the obviously wrong cases, but it also excluded some cases that were actually correct, but had minor differences in punctuation, where the unauthorized name may have an extra period or comma that the 100 or 400 field did not. Examples of such cases are “Stephenson, Andrew G.,” which is the name listed in our bibliographic record while the closest match that could be found is “Stephenson, Andrew G.”, and “Goldschmidt, Jenny ELisabeth.” from our record vs “Goldschmidt, Jenny Elisabeth” as the closest match. 700 10 Stephenson, Andrew G. 400 1 Stephenson, Andrew G. 100 1 Mohammadou, Eldridge 400 1 Mohamadou, Eldridge This, along with a case where the unauthorized name had a pipe character it should not have, written in MARCXML as Cockrell, W. D. |q (William D.), made it clear that it may be impossible to find an exact match for the unauthorized name we are looking at, even if we find the correct authority record. The pipe character showed that we could not simply exclude periods or commas, as the variations may be more unpredictable than that. Instead, our solution was to look through all the 100 and 400 fields from the suggested authority records and determine how similar each of them is to the unauthorized name we are trying to fix. We do this by calculating the Levenshtein distance, also known as the edit distance, to determine the fewest number of changes it takes to change the unauthorized name to the field we are assigning a similarity score to. Few changes mean the names are already quite similar. At this point we just need to Qiang Jin and Deren Kudeki 364 concern ourselves with the authority record with the smallest Levenshtein distance calculated. If that distance is too big, we can determine that none of the suggested authorities were similar enough to be considered good matches. 
This is where we can adjust how conservative or aggressive we want our changes to be. A larger maximum allowed Levenshtein distance will return more changes, but runs the risk of selecting more bad solutions. A smaller maximum will mean fewer changes, but those changes will be more likely to be correct. We wanted to be fairly conservative for this project and found that a Levenshtein distance of 2 was the highest where we were satisfied with the suggestions. One of the hurdles in this process was how to retrieve the LC authority records we’ve been discussing. We get the LCCN for the authority record from AutoSuggest, but it was not immediately clear what the best service was to retrieve the full authority record. The easiest and most accessible service was using the LCCN Permalink service to retrieve the authority as a MARCXML file. LCCN permalinks are URLs for LC bibliographic records and authority records. This is a well-documented service and consists of simply adding the LCCN to the end of a standard URL, and sending an HTTP request with that URL. The problem is that the Library of Congress limits access to this service to one request every six seconds, which is too slow for a project on this scale. The Library of Congress allows more frequent access through their Z39.50 service, but that service isn’t as well documented or straightforward. We eventually took the time to learn how to send requests over Z39.50 because the speed was so important. We did this by importing a Python library called PyZ3950 that allows us to make Z39.50 requests, and send queries formatted as “@attr 1=9 [LCCN]”, which gets us the authority record we’re looking for. There were a few issues concerning problems with character encoding that arose. First, the results of our SQL query returned characters outside the standard ASCII character set as question marks. This means that a name with unusual diacritics would have a question mark in the middle of it, or that a name in full Hebrew script would be entirely made up of question marks with the exception of any date information. For example the name listed as “|a בלבן, ”??in our records was listed as “a????????, ??????????,d1944 ”.־d 1944| ,אברהם in the query results we were reading from. Eventually it became clear that the names in the second case are always in the 880 field, which should actually be labeled as a “see” reference, and we simply need to ignore those names. To get around the first case, when we retrieve the full record from our database, we match the unauthorized names with the 100 and 700 fields in the bibliographic record, again calculating the Levenshtein distance, to find the name in the record. At first these full bibliographic records that we imported had character encoding issues of their own. Diacritics were being applied to the character before they should have been, the diacritics themselves were wrong, and 880 fields were being filled with gibberish instead of question marks. For example the name “|a לחיה אל בין : |b עוז עמוס של ביצירתו עיון / |c בלבן אברהם.” would be incorrectly written in MARCXML as: (2aio `l lgid :(B (2rieo aivixze yl rneq ref /(B Qualitative and Quantitative Methods in Libraries (QQML) 6: 355-370, 2017 365 (2`axdm alao.(B All this led to the conclusion that the bibliographic records were not encoded in UTF-8. 
There were a few issues concerning character encoding that arose. First, the results of our SQL query returned characters outside the standard ASCII character set as question marks. This means that a name with unusual diacritics would have a question mark in the middle of it, or that a name in full Hebrew script would be entirely made up of question marks, with the exception of any date information. For example, the name listed as "|a בלבן, אברהם, |d 1944-" in our records was listed as "a????????, ??????????,d1944." in the query results we were reading from. Eventually it became clear that the names in the second case are always in the 880 field, which should actually be labeled as a "see" reference, and we simply need to ignore those names. To get around the first case, when we retrieve the full record from our database, we match the unauthorized names with the 100 and 700 fields in the bibliographic record, again calculating the Levenshtein distance, to find the name in the record.

At first these full bibliographic records that we imported had character encoding issues of their own. Diacritics were being applied to the character before the one they should have been applied to, the diacritics themselves were wrong, and 880 fields were being filled with gibberish instead of question marks. For example the name "|a בין אל לחיה : |b עיון ביצירתו של עמוס עוז / |c אברהם בלבן." would be incorrectly written in MARCXML as:

(2aio `l lgid :(B
(2rieo aivixze yl rneq ref /(B
(2`axdm alao.(B

All this led to the conclusion that the bibliographic records were not encoded in UTF-8. The fact that the diacritics were coming before the character they should have been applied to, as opposed to following the character as is the case in UTF-8, eventually led to the realization that these records were encoded in MARC-8. To decode the MARC-8 characters, we now retrieve the bibliographic record as an .mrc file and load that into the pymarc library in Python, which has the built-in ability to decode MARC-8. Once these problems were solved, the names from the bibliographic record were in good shape to be used in the query sent to AutoSuggest.
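A sketch of that decoding step with pymarc (the file name is illustrative):

from pymarc import MARCReader

# pymarc converts MARC-8 to Unicode when to_unicode=True.
with open('problem_bibs.mrc', 'rb') as fh:
    for record in MARCReader(fh, to_unicode=True):
        for field in record.get_fields('100', '700'):
            print(field.value())   # names now carry correctly placed diacritics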
For a second set of about 1,000 unauthorized access points, we assigned expected solutions to all the access points before we ran the code, and after the code ran we compared the results. The solution the algorithm chose was wrong 1.3% of the time, but none of those cases were pushed as final solutions; instead the algorithm detected a problem with all of these answers and simply listed them as best guesses for names for which it could not find an acceptable solution. In 2.8% of cases the solutions selected by the human and the program were different, but both were valid. In other words, these were cases where multiple authority records existed for the same person and the human and computer selected different records. Here the human tended to choose authority records whose 670 fields cited the title of the work in question, or a title that appeared to be in the same field as the work in question, while the algorithm did not take the 670 field into consideration. In 1.8% of cases the algorithm chose correctly and the human chose incorrectly. In the remaining 94% of cases the human and computer agreed on the solution. The main takeaway from these results is that our algorithm is roughly on par with a human, but it is much faster and can catch its own mistakes, and we can still improve performance by checking the 670 field.

Overall, when we run the code across all three samples of training data, it looks at 2,412 unauthorized personal names and finds what it considers an acceptable solution for 1,915, or 79.4%, of the access points. The remaining 20.6% of access points are not changed by the algorithm and require a human to examine and fix. The margin of error for these numbers is ±2%. At the time of writing, we have found that 2 of the written solutions are wrong, which is 0.08% of the unauthorized names examined.

We compared the performance of our code to MarcEdit's Validate Headings tool, which also tries to automatically fix unauthorized access points. Both tools were run on a sample of 705 unauthorized personal names, and we compared the quality of the results. In 542 cases, or 77% of the time, our tool and MarcEdit came to the same conclusion: either they chose the same solution, or neither was able to come up with one. In four cases (1%) the two tools came up with different solutions and both were wrong. In 13 cases (2%) the two tools came up with different solutions, but there was not enough information to determine which solution was better. In 100 cases (14%) the two tools came up with different solutions and our code produced the correct one; the reverse case, where the two tools disagree and MarcEdit produced the correct solution, happened in 46 cases, or 7% of the time.

In addition to coming up with more good solutions, our code has a major advantage over MarcEdit. Whenever MarcEdit finds a solution, it writes that solution to the collection of records being processed, regardless of the quality of the result. Our code will only write a result to the records if that result can be found in the OCLC version of the bibliographic record being fixed.
What this means is that every time we found that MarcEdit had made a mistake, that mistake was written to the same record as the other solutions. But every time we found that our code had arrived at an incorrect solution, the code itself had also determined that its solution was not good enough; it wrote the solution to a spreadsheet for humans to look at and did not write it to the collection of records to be updated. So MarcEdit produces fewer correct solutions and generates new incorrect data, while our code produces more correct solutions and avoids pushing errors into the records. For these reasons we feel that our code does a better job of fixing unauthorized access points than MarcEdit's Validate Headings tool.

Because of our methods, there are limits to how many unauthorized access points we can fix. First, because of how the query is structured, we only look at names that are labeled as "see" references; any access point that is unauthorized but not so labeled will not be collected by the query. While our code for processing the query results will work with a spreadsheet of non-see-reference unauthorized names, as long as the spreadsheet is formatted correctly, the code is largely untested on this sort of data, so it is not clear how good the results would be. Second, so far the tool has been developed and tested only for personal and corporate names. The code is designed so that other access point types such as subjects, series, titles, and name-titles can be addressed in the future, but these methods have not been tested on anything but personal and corporate names, and nothing has been written to address issues specific to other access point types. Third, there is an upper bound on the names we can fix, imposed by the double check against the OCLC records: because we ignore results that do not agree with the OCLC record, we cannot fix a name that is not already accurate in OCLC. One way around this limitation would be to find a second independent source to check both our result and OCLC's record against; if any two of the three sources agreed, that answer could be selected as the solution.

Finally, the speed of our process is limited by the number of external calls our code makes. Every record requires two external calls: one to access our local record and one to access the OCLC version of the record. Every name that is processed calls the AutoSuggest API between 1 and 16 times, depending on whether any usable suggestions are returned. Every suggestion from AutoSuggest results in a Z39.50 call to retrieve the authority record from the Library of Congress; the number of calls varies with the AutoSuggest results, but there are typically only a few suggestions. The calls to AutoSuggest have the greatest impact on the speed of this process. It will realistically be called all 16 times for any name that VIAF does not have, or does not have an LCCN for, and both cases make up a small but non-trivial fraction of our dataset. In addition, we have observed AutoSuggest to have the slowest response time of all the services we call in this process. There is no way to avoid the worst-case scenario for AutoSuggest, because a slightly different query has the potential to return relevant results even when other queries have failed. (A sketch of this try-each-variant loop follows.)
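The retry behaviour just described can be sketched as a small loop, reusing best_autosuggest_match from the earlier sketch. The particular name reformulations below are hypothetical stand-ins, since the text does not enumerate its 16 query variants; only the pattern of trying reformulations until one yields a usable LCCN comes from the text.

```python
def name_variants(name):
    """Yield reformulations of a heading to try against AutoSuggest.

    Illustrative guesses only; the paper does not list its 16 variants.
    """
    seen = set()
    for candidate in (name,
                      name.rstrip('.,'),                # trailing punctuation
                      name.split('|d')[0].strip(' ,'),  # drop a date subfield
                      ' '.join(name.split())):          # collapse whitespace
        if candidate and candidate not in seen:
            seen.add(candidate)
            yield candidate


def find_lccn(name, max_distance=2):
    """Try each variant in turn; stop at the first usable suggestion."""
    for variant in name_variants(name):
        lccn = best_autosuggest_match(variant, max_distance)
        if lccn:
            return lccn   # first variant with an acceptable match wins
    return None           # every variant failed: the 16-call worst case
```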
The best way to speed up processing an entire database with this tool would be to split the data that needs to be processed into smaller chunks and to run a few of these chunks in parallel on different machines at the same time. This would have to be limited to only a few machines at a time, to avoid overloading the services this tool uses, but processing the data across a few machines at once would divide the total amount of time needed to process it all.

Bib Record No.   Total     Automatically Changed   Need Manual Correction
0 - 1 million     79,091    63,783                  15,308
1 - 2 million     77,073    60,869                  16,204
2 - 3 million     65,512    50,427                  15,085
3 - 4 million     62,124    46,572                  15,552
4 - 5 million     51,012    34,000                  17,012
5 - 6 million     27,994    17,788                  10,206
6 - 7 million     34,971    23,184                  11,787
7 - 8 million     12,378     3,696                   8,682
Total            400,500   295,500 (73.8%)         105,000 (26.2%)

At the time of writing (April 2017), we ran the query to find all unauthorized personal and corporate names in our online catalog and discovered that around 400,500 of them need to be corrected. After running the script in multiple batches of 5,000 over those 400,000 personal and corporate names, we have successfully corrected around 300,000 personal and corporate names by machine. That leaves over 100,000 personal and corporate names to check by hand; we believe many of them are in fact already correct and were set aside by our conservative calculations.

The automatically changed records are in XML format, so we need to convert the XML to MARC using MarcEdit and then import the MARC files into Voyager to update the records. Given the large number of auto-corrected records, our team hopes that our consortium will help set up a profile to finish the work. The records that need manual checking and correction are in CSV format; catalogers familiar with multiple foreign languages may be hired to finish this work by exploring the LC authority files and OCLC records. In the meantime, we are working to create a different script to correct subject access points in our online catalog. In the future, we plan to use the script and the workflow for continuous authority maintenance, since incorrect names and subjects appear in our online catalog every day.

8. Conclusion
A critical part of linked library data lies in the establishment of its backbone: authority data. Our script has addressed around 400,000 personal and corporate names in our online catalog, correcting not only names in Western European languages but also names with diacritics and in non-Roman Unicode scripts. We hope that our script and workflow for fixing unauthorized access points in bibliographic records can help other libraries as they prepare to migrate to the linked open data environment.
work_a5mvw326zzg2laahq4vzwr27pi ----

Türk Kütüphaneciliği 22, 2 (2008), 235-239

Haberler / News

Dizin / Index
• M. Kemal Sevgisunar receives his doctorate
• ÜNAK'08, "Information: Difference and Awareness"
• ÜNAK-OCLC Consortium and Training Meeting
• Interlibrary Information and Document Management and Cooperation among the Balkan Countries
• ÜNAK Proceedings Database
• Books rank 235th on Turkey's list of basic needs!

M. Kemal Sevgisunar receives his doctorate
M. Kemal Sevgisunar received the title of Doctor (Dr.) from the Department of Information and Records Management at Ankara University's Institute of Social Sciences, with a thesis titled "Political Developments in Turkey and the Effects of Ideological Approaches on the Field of Information and Records Management" ("Türkiye'de Siyasi Gelişmeler ve İdeolojik Yaklaşımların Bilgi ve Belge Yönetimi Alanına Etkileri").
Examining the political interplay that has centred on the National Library, the General Directorate of Libraries and Publications, the Library of the Turkish Grand National Assembly (TBMM), the Information and Records Management departments, the State Archives, and the Turkish Librarians' Association, the thesis is the first scholarly study in this area and also offers a concise political panorama reaching back to the last decades of the Ottoman Empire. We thank, first and foremost, M. Kemal Sevgisunar, who carried out the doctoral study, along with his thesis advisor Assoc. Prof. Dr. Nazlı Alkan and everyone else who contributed, and we wish them continued success.

ÜNAK-OCLC Consortium and Training Meeting
The ÜNAK-OCLC Consortium and Training Meeting was held on 8-9 May 2008 at the Söğütözü campus of TOBB University of Economics and Technology. The sessions, which featured presentations by Russ Hunt, Vivien Cook, Catherine Bonser and Christien Negrel, were reported to have been a success.

ÜNAK Proceedings Database
ÜNAK (the Association of University and Research Librarians), founded in 1991 and one of the profession's distinguished civil society organisations, has added new items to its record of successful work. The ÜNAK Proceedings Database, a Turkish-English Subject Headings List and a Dictionary of Librarianship Terms are a few of them. We congratulate ÜNAK, whose ongoing and completed projects enrich our profession, and wish it continued success.

ÜNAK'08, "Information: Difference and Awareness"
ÜNAK'08, "Information: Difference and Awareness", will take place on 9-11 October 2008 in Istanbul, on the Yaşar University campus. The announced conference topics are knowledge management, the information economy, information systems, information networks, information security, the marketing of information, information centres and their services, collection management, bibliographic control, user services, reference services, serials, indexing, electronic content management, e-books and e-journals, civil society organisations, consortia, open access to academic information, search engines, new-generation information services, Web 2.0, social networks, Web 3.0, semantic web sites, publishing and e-publishing, archives and archival systems, corporate information and records management, digital archives, intellectual capital rights, innovation, the EU process, education, continuing education, distance education, and information literacy. We hope the event goes well and congratulate everyone involved in organising it.

Interlibrary Information and Document Management and Cooperation among the Balkan Countries
On 5-7 June 2008, the symposium "Interlibrary Information and Document Management and Cooperation among the Balkan Countries" was held in Edirne on the initiative of the Trakya University Library and Documentation Department. It drew 245 participants from 13 countries, 10 of them in the Balkans (Bulgaria, Romania, Bosnia-Herzegovina, Albania, Russia, Serbia, England, Ireland, Greece, the Czech Republic and Kosovo), and 50 papers and 17 posters were presented. Opening speeches were given by Rector Prof. Dr. Enver Duran, Director General of Libraries and Publications Assoc. Prof. Dr. Ahmet Arı, and Department Head Ender Bilar, and the meeting closed successfully with the publication of a final declaration. We congratulate everyone involved, above all Ender Bilar.

Books rank 235th on Turkey's list of basic needs!
According to a report by the research unit of the Independent Educators' Union (BES), the books read in Turkey mostly deal with politics, love and sexuality. The Turkish public watches an average of five hours of television a day but devotes only six hours a year to reading books.

Turkey has fallen behind most African countries in book reading. In Japan 14 percent of the population reads regularly, in the United States 12 percent, and in England and France 21 percent, while in Turkey only one person in ten thousand reads books. In Azerbaijan, with a population of just 7 million, a book is printed in an average run of 100,000 copies; in Turkey, with 71 million people, the figure stays around 2,000 to 3,000. In the United Nations Human Development Report, Turkey ranks 86th in book reading among the world's countries, in the company of Libya, Tanzania, Congo and Armenia.

A Japanese reader gets through an average of 25 books a year, a Swiss reader 10 and a French reader 7, while a Turk reads just one book in ten years. Only about 70,000 people in Turkey have a settled reading habit. For the time a person in Turkey gives to reading, a Norwegian gives 300 times as much, an American 210 times, and a Briton or a Japanese 87 times; even the world average is three times ours. A United Nations study of spending on books found that a Norwegian spends 137 USD a year, a German 122 USD, a Belgian 100 USD, an Australian 100 USD and a South Korean 39 USD; the world average is 1.30 USD, while a Turk spends only 0.45 USD.

The USA publishes 72,000 books a year, Russia 58,000, Japan 42,000, France 27,000 and Turkey 7,000. Magazine readership in Turkey stands at 4 percent, television viewing at 95 percent. In England, The Sun, an ordinary daily newspaper, sells as many copies as the combined circulation of all of Turkey's newspapers, and 85 percent of Turkish newspaper readers look only at the sports and celebrity pages.

Turkey has 1,412 libraries, but only 400 of them meet international library standards; the libraries hold 12,221,392 books, have 254,007 registered members, and purchase just 13,862 books.

According to the study "Turkey's Reading Habit", books rank 235th among the things people in Turkey feel they need. Only 19 percent of students in Turkey own more than 25 books, and since the internet arrived in libraries, only 8 percent of those who visit them go to read books.

The five most-printed Turkish books are Keloğlan folk tales, Nasrettin Hoca jokes, books with sexual content, Black Sea (Karadeniz) jokes, and religious instruction (ilmihal) handbooks; the most-printed foreign titles are La Fontaine's fables, Aesop's fables, Andersen's fairy tales, Heart (Çocuk Kalbi) and, again, books on sexual themes.

To help children acquire the reading habit, BES researchers offer families this advice: read to your children from infancy and give them books on special occasions; reward them for every book they read and share with you; and introduce them to books suited to their age and character. Stressing that reading is crucial to children's cognitive and language development, the researchers noted that students do not read even their textbooks, saying: "Young people do not even watch serious television programmes.
Turkish youth, who cannot name the famous works of famous authors, know the profiles and love lives of footballers and models by heart. According to TurkStat figures, 12.5 percent of the population outside compulsory school age cannot read or write; people call newspapers and books 'lie machines' and look only at their pictures, and books and newspapers are used mostly for lighting stoves, covering tables and making paper cones," they said.

BES President Gürkan Avcı attributed the Turkish public's place near the bottom of the world rankings for love of books and the reading habit to the misguided policies of the last quarter century. After 1980, he argued, reading was presented to the public as harmful, and in that period people who read and thought were portrayed as traitors and troublemakers. Because of national education policies steered from the 1950s onward by US and EU experts, a generation was created that does not criticise, tries to become part of the system, and is shaped by popular culture. A society that has not acquired the habit of reading books and newspapers neither questions its rulers nor shows the will to choose good administrators and quality politicians.

"Our people care more about watching television and learning the private lives of a few singers and footballers and the gossip of talk shows; they will not even watch serious television programmes, let alone read books. The policies that were followed created a young generation that knows footballers' and models' profiles and partners by heart. Young people do not read, yet they have opinions on everything, and that is why our country has been like a fire zone for years. Because we do not read we are ignorant, and because we do not read we cannot choose quality politicians and capable administrators," he said.

work_a6bbbsvcsjbh3gg72goakusucy ----

Klump, J and Huber, R 2017 20 Years of Persistent Identifiers – Which Systems are Here to Stay? Data Science Journal, 16: 9, pp. 1–7, DOI: https://doi.org/10.5334/dsj-2017-009

REVIEW

20 Years of Persistent Identifiers – Which Systems are Here to Stay?
Jens Klump (CSIRO, Mineral Resources, Perth, AU) and Robert Huber (MARUM, University of Bremen, Bremen, DE)
Corresponding author: Jens Klump (jens.klump@csiro.au)

Web-based persistent identifiers have been around for more than 20 years, a period long enough for us to start observing patterns of success and failure. Persistent identifiers were invented to address challenges arising from the distributed and disorganised nature of the internet, which often resulted in URLs to internet endpoints becoming invalid. Over the years several different persistent identifier systems have been applied to the identification of research data, not all with the same level of success in terms of uptake and sustainability. We investigate the uptake of persistent identifier systems and discuss the factors that might determine the stability and longevity of these systems. Persistent identifiers have become essential elements of global research data infrastructures. Understanding the factors that influence the stability and longevity of persistent identifier systems will help us guide the future development of this important element of research data infrastructures and will make it easier to adapt to future technological and organisational changes.
Keywords: persistent identifiers; semantic web; research data repositories

Introduction
Web-based persistent identifiers have been around for more than 20 years, a period long enough for us to start observing patterns of success and failure. Persistent identifiers were invented to address challenges arising from the distributed and disorganised nature of the internet, which not only allowed new technologies to emerge but also made it difficult to maintain a persistent record of science (Dellavalle et al. 2003; Lawrence et al. 2001). This phenomenon, also dubbed "link rot", affects all digital resources on the web, including research data (Vines et al. 2014).

It has been argued that "link rot" can be avoided by careful management of web servers to keep URLs stable over a long time, a principle called "Cool URIs" (Berners-Lee 1998). For semantic applications the use of Cool URIs has been proposed, and it has been questioned whether DOIs are necessary in a world of Cool URIs (Bazzanella, Bortoli, and Bouquet 2013). "Pretty much the only good reason for a document to disappear from the Web is that the company which owned the domain name went out of business or can no longer afford to keep the server running." (Berners-Lee 1998). Unforeseen by Berners-Lee, a few years after his statement the "dot-com bubble" burst and many companies went out of business, leaving many web domains orphaned. Other companies were acquired and merged into existing entities, again sometimes losing their original web domains.

One way of addressing the root problem of the persistence of locators on the web was the introduction of persistent identifiers, which separated the identity of an object from its location on the web (see e.g. Arms 1995; Lawrence et al. 2001; Lynch 1997). Adding a system to ensure global uniqueness makes persistent identifiers a tool that allows unambiguous identification of resources on the net. The expectations were that persistent identifiers would lead to greater accessibility, transparency and reproducibility of research results. The discussion of PIDs vs. Cool URIs in Bazzanella, Bortoli, and Bouquet (2013) shows that persistent access to web resources is not merely a technical question but rather a "social contract" that needs to be entered into by the stakeholders aiming to maintain persistent references to objects on the web.

In this paper we review 20 years of persistent identifier practice and the uptake of different persistent identifier systems. In a series of case studies we characterise well-known persistent identifier systems, assess their successes and failures, and extract what can be learned from these examples.

Uptake of Persistent Identifiers
One way to assess the success of particular identifier systems is to survey their adoption by research data repositories. This might seem like a straightforward approach, but it turns out to be difficult to define a measure for the uptake of persistent identifier systems because the sizes and granularities of research data repositories vary by orders of magnitude. Our analysis of the uptake of persistent identifier systems by research data repositories is based on data from the Registry of Research Data Repositories (re3data.org).
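A tally of identifier types from such an excerpt can be sketched in a few lines. The pidSystem property name follows the re3data metadata schema (Rücknagel et al. 2015), but the line-per-record JSON layout assumed here is our invention, not a documented export format:

```python
from collections import Counter
import json


def count_pid_systems(path):
    """Tally persistent identifier systems across a re3data excerpt.

    Assumes one JSON object per line, each carrying a 'pidSystem' list
    per repository; the file layout is an assumption for illustration.
    """
    with_pid = 0
    tally = Counter()
    with open(path) as f:
        for line in f:
            repo = json.loads(line)
            systems = [s for s in (repo.get('pidSystem') or []) if s != 'none']
            if systems:
                with_pid += 1
            tally.update(systems)   # a repository may use several systems
    return with_pid, tally
```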
re3data.org is a global registry of research data repositories that covers all academic disciplines. The registry arose from two separate projects, re3data.org (Pampel et al. 2013) and DataBib (Witt and Giarlo 2012), and is now managed by DataCite. The total sample we obtained from re3data.org in December 2015 listed 1,381 repositories. Of this total, a subset of 475 repositories used some type of persistent identifier; note that some repositories use more than one type. Figure 1 summarises the persistent identifier types used by repositories listed in the re3data.org database.

The focus of re3data.org is on research data repositories, and despite its size the re3data.org registry does not claim global coverage; still, its catalogue can be considered a representative overview. Not covered by re3data.org are collections of research specimens such as herbaria or cultural artefacts, which might also use persistent identifiers for the identification of items in their collections. Furthermore, identifiers are also used outside of research, for example in the identification of companies on stock exchanges. In this paper we discuss the use of persistent identifiers only in the context of research data and the record of science.

In their standardised descriptions of research data repositories, re3data.org distinguishes between only a few identifier systems (Rücknagel et al. 2015). Digital Object Identifiers (DOI) are clearly the most widely adopted persistent identifier in research data repository systems. Figure 2 differentiates the persistent identifiers used by three kinds of repositories: disciplinary repositories, institutional repositories, and repositories that fall into neither of the two former categories. Remarkable is the relatively frequent use of "other" persistent identifier systems, not differentiated in the re3data.org description, by disciplinary repositories. This points to an important role for discipline-specific identifiers.

Figure 1: Number of repositories using a particular type of persistent identifier. A total of 457 out of 1381 repositories (status of 14 Dec 2015) use some sort of persistent identifier. Some repositories use more than one type of persistent identifier.

Figure 2: Types of repositories using a particular type of persistent identifier. Note that some repositories use more than one type of persistent identifier and are defined as both disciplinary and institutional. "Other" PID seem to be important in disciplinary repositories, pointing to particular disciplinary groups of practice. Handle PID (non-DOI) are used by many institutional repositories, possibly due to use of Handle in repository software and early adoption.

Again, DOIs are by far the most used persistent identifiers. "Other" types of identifiers seem to play an important role in disciplinary repositories, and 57 of these repositories serve the life sciences; this indicates a special role that "other" identifier systems play in particular disciplines. Institutional repositories frequently use identifiers based on the Handle system, which may be due to the fact that some institutional repository software, such as DSpace (Smith et al. 2003), uses Handle-based identifiers to identify objects in its holdings.

Years of Crisis
While persistent identifiers are being adopted and implemented by a growing number of data archives, not every PID system has enjoyed a success story in recent years. A few PID systems have experienced problems severe enough to lead to a temporary shutdown of some of their core services, which in turn left PIDs orphaned, unresolvable or unmanaged.
As mentioned above, the years 2015 and 2016 turned out to be years of crisis for some persistent identifier systems, in particular for Persistent URLs (PURL) and Life Science Identifiers (LSID). While PURL seems to have gained a new lease on life by transferring to a new organisational and technical base, the future of LSID as a resolvable persistent identifier seems uncertain.

PURL was introduced by the Online Computer Library Center, Inc. (OCLC) as a bridging technology to prepare for the introduction of Uniform Resource Names (URN). PURL implements the URI concept and thus does not separate the identifier from the resolving mechanism. PURL has no single global resolving mechanism, and PURL resolvers do not communicate with each other to share resolving information the way DNS or Handle servers do. For most of its history PURL had little social infrastructure and formal governance. In 2014 OCLC withdrew its institutional support and the future of PURL became unclear, while PURL experienced severe technical problems for some time and the system was put into a 'read-only' maintenance mode (Baker 2015). In September 2016, OCLC and the Internet Archive announced that the URL redirection service on which PURL is based will in future be operated by the Internet Archive (OCLC 2016). This move brought PURL back from the brink of extinction. In December 2015, a total of 16 research data repositories in re3data.org were listed as using PURL, and only a few of them used PURL exclusively. Using Google Scholar as a search engine, we estimate that about 16,400 PURL identifiers are used in the entire scholarly record indexed by Google Scholar; of these, fewer than 5,000 seem to identify digital objects like data, and most seem to identify semantic concepts.
After two months without a central resolving system, a resolver has been made available at http://www.lsid.info. However, the discussion is ongoing and significant parts of the biodiversity informatics community recommend switching from LSID to cool URI (Guralnick et al. 2015). Using Google Scholar as a search engine we estimated that about 14,000 LSIDs have been used in the sci- entific literature. Which are here to stay? At the same time as criteria for trusted repositories were developed (Dobratz et al. 2009; Sesink, van Horik, and Harmsen 2008), similar efforts looked at criteria for trustworthy persistent identifier systems. Most notable are the criteria for trusted persistent identifier systems developed by Bütikofer (2009) in the context of the German nestor research programme on long-term preservation, and the review of persistent identifier systems as tools for science by Duerr et al. (2011). While the criteria of Bütikofer emphasize technical and organisational criteria, the review of Duerr et al. focuses more on usability of identifier systems as part of the academic record. Even though the authors come to different conclusions about which systems are likely to persist, both recognise the importance of organisational sustainability. If organisational stability is the Achil- les Heel of persistent identifier systems, are there ways we can achieve better sustainability of PID systems? A first step towards better sustainability of PID systems would be more transparency. This should include all aspects of a PID system, technical documentation, policies, governance and in particular the data and metadata which are necessary to resolve a PID. Today, most of discussions related to the status of PID systems are hidden in online discussion fora and email lists and is only rarely made public. It is entirely unsatisfactory to publicly and officially promote a PID system while exit strategies are being discussed in the background or services are silently ceased. This needs to change, and clearly a more participatory attitude and proactive communication strategy would be beneficial for all PID systems stakeholders. A set of criteria, analogous to the criteria for the description of research data repositories published by re3data.org (Rücknagel et al. 2015), would help with the evaluation of PID systems. In conjunction with the re3data.org criteria, they would also help to identify weaknesses in the shared responsibility of the data provider and the operator of the PID resolver service for a reliable resolution of identifiers to web endpoints. The large number of discipline specific resolver systems tells us that there may be very specific needs in the governance of a PID system that are not met by the generic services. Here it is necessary to have a close look at the value proposition of a particular PID system and the services it provides. In addition to organisational criteria, the value proposition of a particular PID system also asks us to evaluate its technical basis and alternative technical solutions. As we have seen from the LSID example, the seeming simplicity of Cool URIs is still attractive. The HTTP protocol is simple and universally available but the risks of “link rot” have not gone away. Other interesting technical alternatives are based on peer-to-peer networking technologies such as Blockchain (Bolikowski, Nowiński, and Sylwestrzak 2015) or MagnetLinks (Golodoniuc, Car, and Klump, this volume). 
Peer-to-peer technologies would allow a “Devolution of Power” and community-based backup strategies for PID resolution. Coming back to the question of the value proposition of PID systems, do we need PID resolvers? Yes, because we do not yet live in a semantic web world where linked data graphs would lead us to resources as proposed by Sachs and Finin (2010). As an interim solution data providers should take advantage of avail- able web search engines and make their data holdings discoverable. A possible approach would be to use mainstream web technologies as a supplement and potential fall-back solution to PID systems. Candidate technologies are microformats or JSON-LD which are suited to expose both, metadata as well as potentially http://www.lsid.info http://www.re3data.org http://www.re3data.org Klump and Huber: 20 Years of Persistent Identifiers – Which Systems are Here to Stay? Art. 9, page 5 of 7 multiple identifiers associated to a digital object. Complementary sitemaps or catalogue services catalogued in a publicly available registry such as the GEOSS CSR could enable the implementation of common, generic resolution services. There is, however, the other value proposition of PID, the persistent identification of elements of the record of science. Properly identifying these elements in a way that can be consumed by human and machine clients alike, and maintaining the persistence of objects and identifier resolution, is not a purely technical problem but is maintained through a social contract. The stability of this social contract, together with a sus- tainable and adaptable technological base, will determine the sustainability and resilience of a PID system. It is tempting to assume that a social contract becomes increasingly binding as user community relying on a PID particular system grows. With the examples discussed in this paper we show that this is most likely an illusion. The DOI system, which is arguably the most successful PID system today, has a strong commercial backing while minor systems such as URN and ARK have the backing of national libraries. It might be a bitter pill to swallow for some members in the research data community wary of all things commercial, but busi- ness models are essential aspects of PID systems – sustainable PID systems do not come for free. Acknowledgements The authors would like to thank the Registry of Research Data Repositories (re3data.org) for providing an excerpt of the re3data.org database. We wish to acknowledge the European Commission for their funding of the projects ENVRIplus (Reference number: 654182) and THOR (Reference number 654039), as well as funding by the German Research Foundation (DFG) of the project GFBio. We also thank the reviewers for their constructive comments that helped to improve this manuscript. Competing Interests The authors have no competing interests to declare. About the Authors Jens Klump is a geochemist by training and OCE Science Leader Earth Science Informatics in CSIRO Min- eral Resources. Jens’ field of research is data intensive science. Research topics in this field are numeri- cal methods in minerals exploration, virtual research environments, remotely operated instruments, high performance and cloud computing, and the development of system solutions for geoscience projects. In his previous position at the German Research Centre for Geosciences in Potsdam he was involved in the development of the publication and citation of research data through Digital Object Identifiers. 
This project sparked further work on research data infrastructures, including the publication and curation of scientific software and reproducible research. Robert Huber is a geologist and information specialist holding a PhD in Marine Geology. He has worked for several years as information system architect for the aerospace industry and the renewable energy indus- try. Since 2002 he is employed at the Centre for Marine Environmental Sciences (MARUM) at the University Bremen and responsible for projects in scientific data management and IT development especially in the fields of ontology development, marine observatory networks and biodiversity in the PANGAEA working group. References Arms, W Y 1995 Key Concepts in the Architecture of the Digital Library. D-Lib Magazine, July. https://www. cnri.dlib/july95-arms. Baker, T 2015 The Future of PURLs. JISC DC Architecture. Retrieved from: https://www.jiscmail.ac.uk/cgi- bin/webadmin?A2=ind1511&L=DC-ARCHITECTURE&F=&S=&P=3711. Bazzanella, B, Bortoli, S and Bouquet, P 2013 Can Persistent Identifiers Be Cool? International Journal of Digital Curation, 8(1): 14–28. DOI: https://doi.org/10.2218/ijdc.v8i1.246 Berners-Lee, T 1998 Cool URIs Don’t Change. Cambridge, MA: World Wide Web Consortium (W3C). Retrieved from: http://www.w3.org/Provider/Style/URI. Bolikowski, Ł, Nowiński, A and Sylwestrzak, W 2015 A System for Distributed Minting and Manage- ment of Persistent Identifiers. International Journal of Digital Curation, 10(1): 280–86. DOI: https://doi. org/10.2218/ijdc.v10i1.368 Bütikofer, N 2009 Catalogue of Criteria for Assessing the Trustworthiness of PI Systems. 13. Nestor-Materialien. Göttingen, Germany: Niedersächsische Staats und Universitätsbibliothek Göttingen. Retrieved from: http://nbn-resolving.de/urn:nbn:de:0008-20080710227. http://www.re3data.org/ http://www.re3data.org/ https://www.cnri.dlib/july95-arms https://www.cnri.dlib/july95-arms https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1511&L=DC-ARCHITECTURE&F=&S=&P=3711 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1511&L=DC-ARCHITECTURE&F=&S=&P=3711 https://doi.org/10.2218/ijdc.v8i1.246 http://www.w3.org/Provider/Style/URI https://doi.org/10.2218/ijdc.v10i1.368 https://doi.org/10.2218/ijdc.v10i1.368 http://nbn-resolving.de/urn:nbn:de:0008-20080710227 Klump and Huber: 20 Years of Persistent Identifiers – Which Systems are Here to Stay?Art. 9, page 6 of 7 Dellavalle, R P, Hester, E J, Heilig, L F, Drake, A L, Kuntzman, J W, Graber, M and Schilling, L M 2003 Going, Going, Gone: Lost Internet References. Science, 302(5646): 787–88. DOI: https://doi. org/10.1126/science.1088234 Dobratz, S, Hänger, A, Huth, K, Kaiser, M, Keitel, C, Klump, J, Rödig, P, et al. 2009 Catalogue of Criteria for Trusted Digital Repositories. 8. Nestor Materials. Frankfurt (Main), Germany: Deutsche Nationalbib- liothek. Retrieved from: http://nbn-resolving.de/urn:nbn:de:0008-2010030806. Duerr, R E, Downs, R R, Tilmes, C, Barkstrom, B, Lenhardt, W C, Glassy, J, Bermudez, L E and Slaughter, P 2011 On the Utility of Identification Schemes for Digital Earth Science Data: An Assessment and Recommendations. Earth Science Informatics, 4(3): 139–60. DOI: https://doi.org/10.1007/ s12145-011-0083-6 Golodoniuc, P, Car, N J and Klump, J subm. Distributed Persistent Identifiers System Design. Data Science Journal. Guralnick, R P, Cellinese, N, Deck, J, Pyle, R L, Kunze, J, Penev, L, Walls, R, et al. 2015 Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data. ZooKeys, 494(April): 133–54. 
There is, however, the other value proposition of PIDs: the persistent identification of elements of the record of science. Properly identifying these elements in a way that can be consumed by human and machine clients alike, and maintaining the persistence of objects and of identifier resolution, is not a purely technical problem but is sustained through a social contract. The stability of this social contract, together with a sustainable and adaptable technological base, will determine the sustainability and resilience of a PID system. It is tempting to assume that a social contract becomes increasingly binding as the user community relying on a particular PID system grows. With the examples discussed in this paper we show that this is most likely an illusion. The DOI system, which is arguably the most successful PID system today, has strong commercial backing, while minor systems such as URN and ARK have the backing of national libraries. It might be a bitter pill to swallow for some members of the research data community wary of all things commercial, but business models are essential aspects of PID systems: sustainable PID systems do not come for free.

Acknowledgements
The authors would like to thank the Registry of Research Data Repositories (re3data.org) for providing an excerpt of the re3data.org database. We wish to acknowledge the European Commission for their funding of the projects ENVRIplus (Reference number: 654182) and THOR (Reference number: 654039), as well as funding by the German Research Foundation (DFG) of the project GFBio. We also thank the reviewers for their constructive comments that helped to improve this manuscript.

Competing Interests
The authors have no competing interests to declare.

About the Authors
Jens Klump is a geochemist by training and OCE Science Leader Earth Science Informatics in CSIRO Mineral Resources. Jens' field of research is data intensive science. Research topics in this field are numerical methods in minerals exploration, virtual research environments, remotely operated instruments, high performance and cloud computing, and the development of system solutions for geoscience projects. In his previous position at the German Research Centre for Geosciences in Potsdam he was involved in the development of the publication and citation of research data through Digital Object Identifiers. This project sparked further work on research data infrastructures, including the publication and curation of scientific software and reproducible research.

Robert Huber is a geologist and information specialist holding a PhD in Marine Geology. He has worked for several years as an information system architect for the aerospace industry and the renewable energy industry. Since 2002 he has been employed at the Centre for Marine Environmental Sciences (MARUM) at the University of Bremen, where he is responsible for projects in scientific data management and IT development, especially in the fields of ontology development, marine observatory networks and biodiversity, in the PANGAEA working group.

References
Arms, W Y 1995 Key Concepts in the Architecture of the Digital Library. D-Lib Magazine, July. https://www.cnri.dlib/july95-arms
Baker, T 2015 The Future of PURLs. JISC DC Architecture. Retrieved from: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1511&L=DC-ARCHITECTURE&F=&S=&P=3711
Bazzanella, B, Bortoli, S and Bouquet, P 2013 Can Persistent Identifiers Be Cool? International Journal of Digital Curation, 8(1): 14–28. DOI: https://doi.org/10.2218/ijdc.v8i1.246
Berners-Lee, T 1998 Cool URIs Don't Change. Cambridge, MA: World Wide Web Consortium (W3C). Retrieved from: http://www.w3.org/Provider/Style/URI
Bolikowski, Ł, Nowiński, A and Sylwestrzak, W 2015 A System for Distributed Minting and Management of Persistent Identifiers. International Journal of Digital Curation, 10(1): 280–86. DOI: https://doi.org/10.2218/ijdc.v10i1.368
Bütikofer, N 2009 Catalogue of Criteria for Assessing the Trustworthiness of PI Systems. nestor-Materialien 13. Göttingen, Germany: Niedersächsische Staats- und Universitätsbibliothek Göttingen. Retrieved from: http://nbn-resolving.de/urn:nbn:de:0008-20080710227
Dellavalle, R P, Hester, E J, Heilig, L F, Drake, A L, Kuntzman, J W, Graber, M and Schilling, L M 2003 Going, Going, Gone: Lost Internet References. Science, 302(5646): 787–88. DOI: https://doi.org/10.1126/science.1088234
Dobratz, S, Hänger, A, Huth, K, Kaiser, M, Keitel, C, Klump, J, Rödig, P, et al. 2009 Catalogue of Criteria for Trusted Digital Repositories. nestor Materials 8. Frankfurt (Main), Germany: Deutsche Nationalbibliothek. Retrieved from: http://nbn-resolving.de/urn:nbn:de:0008-2010030806
Duerr, R E, Downs, R R, Tilmes, C, Barkstrom, B, Lenhardt, W C, Glassy, J, Bermudez, L E and Slaughter, P 2011 On the Utility of Identification Schemes for Digital Earth Science Data: An Assessment and Recommendations. Earth Science Informatics, 4(3): 139–60. DOI: https://doi.org/10.1007/s12145-011-0083-6
Golodoniuc, P, Car, N J and Klump, J subm. Distributed Persistent Identifiers System Design. Data Science Journal.
Guralnick, R P, Cellinese, N, Deck, J, Pyle, R L, Kunze, J, Penev, L, Walls, R, et al. 2015 Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data. ZooKeys, 494(April): 133–54. DOI: https://doi.org/10.3897/zookeys.494.9352
Hyam, R 2015 Taxa, Taxon Names and Globally Unique Identifiers in Perspective. In: Watson, M F, Lyal, C and Pendry, C (Eds.) Descriptive Taxonomy: The Foundation of Biodiversity Research. Cambridge, United Kingdom: Cambridge University Press, pp. 260–71. DOI: https://doi.org/10.1017/CBO9781139028004.026
Lawrence, S, Coetzee, F, Glover, E, Pennock, D, Flake, G, Nielsen, F, Krovetz, R, Kruger, A and Giles, L 2001 Persistence of Web References in Scientific Research. IEEE Computer, 34(2): 26–31. DOI: https://doi.org/10.1109/2.901164
Lynch, C 1997 (October) Identifiers and Their Role in Networked Information Applications. ARL: A Bimonthly Newsletter of Research Library Issues and Actions. Retrieved from: http://www.arl.org/newsltr/194/identifier.html
Page, R 2016 Surfacing the Deep Data of Taxonomy. ZooKeys, 550(January): 247–60. DOI: https://doi.org/10.3897/zookeys.550.9293
Pampel, H, Vierkant, P, Scholze, F, Bertelmann, R, Kindling, M, Klump, J, Goebelbecker, H-J, Gundlach, J, Schirmbacher, P and Dierolf, U 2013 Making Research Data Repositories Visible: The re3data.org Registry. PLoS ONE, 8(11): e78080. DOI: https://doi.org/10.1371/journal.pone.0078080
Rücknagel, J, Vierkant, P, Ulrich, R, Kloska, G, Schnepf, E, Fichtmüller, D, Reuter, E, et al. 2015 Metadata Schema for the Description of Research Data Repositories. Version 3.0. Potsdam, Germany: German Research Centre for Geosciences. DOI: https://doi.org/10.2312/re3.008
Sachs, J and Finin, T 2010 What Does It Mean for a URI to Resolve? In: Proceedings of the AAAI Spring Symposium on Linked Data Meets Artificial Intelligence, 3. Stanford, CA: AAAI Press. Retrieved from: http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/view/1178
Sesink, L, van Horik, R and Harmsen, H 2008 Data Seal of Approval. Den Haag, The Netherlands: Data Archiving and Networked Services (DANS). Retrieved from: http://www.datasealofapproval.org/
Smith, M, Barton, M, Branschofsky, M, McClellan, G, Walker, J H, Bass, M, Stuve, D and Tansley, R 2003 DSpace: An Open Source Dynamic Digital Repository. D-Lib Magazine, 9(1). DOI: https://doi.org/10.1045/january2003-smith
TDWG 2016 (September 28) Decide about Fate of lsid.tdwg.org. GitHub: TDWG/Infrastructure. Retrieved from: https://github.com/tdwg/infrastructure/issues/60
Vines, T H, Albert, A Y K, Andrew, R L, Débarre, F, Bock, D G, Franklin, M T, Gilbert, K J, Moore, J-S, Renaut, S and Rennison, D J 2014 The Availability of Research Data Declines Rapidly with Article Age. Current Biology, 24(1): 94–97. DOI: https://doi.org/10.1016/j.cub.2013.11.014
Witt, M and Giarlo, M 2012 Databib. Libraries Faculty and Staff Presentations, January. Retrieved from: http://docs.lib.purdue.edu/lib_fspres/1
work_agqqmbkdpvakdgrdtwny7a5hja ----

An Interview with Steve Shadle
Emily McElroy, Column Editor
with a contribution from Bonnie Parks

Bonnie Parks interviewed Steve Shadle, serials cataloger for University of Washington Libraries, in August 2002. In this interview Shadle provides a cataloger's perspective on the challenges he and other serials catalogers face in the organization and management of electronic and print serial titles. Serials Review 2002; 28:321–326. © 2002 Elsevier Science Inc. All Rights Reserved.

McElroy is Serials and Electronic Collections Librarian, Loyola University Health Sciences Library, Maywood, IL 60153; e-mail: emcelro@lumc.edu. Parks is Serials Unit Head, Oregon State University, Corvallis, OR 97330; e-mail: bonnie.parks@orst.edu.

Steve Shadle is a serials cataloger for the University of Washington Libraries. He has given several workshops and presentations on serials and electronic resources cataloging at various conferences, including the American Library Association (ALA) and North American Serials Interest Group (NASIG) annual meetings. He is active in NASIG, ALA, and the Association for Library Collections and Technical Services (ALCTS), and serves on the editorial board of Serials Review. Shadle has penned a number of articles that have been published in several library science journals, including The Serials Librarian and Serials Review.
Recently he and Les Hawkins (Library of Congress) co-authored the instructor and trainee manuals for the Serials Cooperative Cataloging Training Program's (SCCTP) Electronic Serials Cataloging workshop. Currently he is at work on a book about cataloging electronic resources. Shadle has also been asked to develop a one-credit course on e-serials for the University of Washington's Information School.

Interview

Bonnie Parks (BP): Before we begin, why don't you tell me a little bit about your background and what made you decide to choose serials cataloging over other aspects of librarianship.

Steve Shadle (SS): When I was in library school at the University of Washington (UW), I didn't know what I wanted to do. Becoming a systems librarian was a definite possibility, as I had done a small amount of database development when I worked at King County Library System. I was really intrigued with the possibilities of systems applications in the library environment. While in library school, I enjoyed the cataloging courses I took from Ellen Soper. I think I was one of the few nerds who enjoyed cataloging . . . what's that saying from the Marines? The Few, The Proud, The Anal-Retentive? Dr. Soper's retired now.

BP: Right. She retired the year before I started at the UW. In fact, you taught the basic cataloging course to my class.

SS: Oh, that's right. I had only been back in Seattle for about six months when I taught that class. The library school hadn't yet hired Dr. Soper's replacement and was desperate to have the class taught. So your class was my learning experience. I apologize for that. Prior to library school, I'd done some clerical work at King County Library System in cataloging, collection development, and reference. I worked mostly part-time while I was getting my undergraduate degree. The librarians there (especially in Reference and Collection Development) were very good about involving staff in all the work of the department, so I felt I had a pretty good idea of what was involved in public library work. After I graduated, I kept working at King County while I was trying to figure out what I wanted to be when I grew up. I wasn't interested in getting a master's degree in library science because of what I perceived as the stigmas associated with the profession (low salaries, pink collar, social service work, etc.). After about a year of unsuccessfully trying to find entry-level systems processing jobs, I started questioning these perceptions and realized, after reading a book titled Do What You Love, The Money Will Follow,1 that I really should consider going to library school. I was also struggling with coming out of the closet about the same time and making the decision to accept my sexuality. Accepting my career choice went hand-in-hand in a weird sort of way (my coming out as a librarian, as it were). This connection was only reinforced by the fact that my first ALA conference was in San Francisco during Pride weekend. My first job out of library school was at the U.S. Agency for International Development (USAID) in Washington, D.C., where I was a technical services librarian.
It was a small federal agency library employing three librarians and three technicians. We didn't do extensive research, only quick reference; there was a separate research staff to do in-depth analysis. It was a good learning experience, since I was doing a variety of things and the subject matter was interdisciplinary because of the varied nature of foreign assistance projects. I was always learning new things. Because I was working for a contractor, the pay and stability weren't great. Eventually I realized that I wasn't cut out to work in a special library. When I was in library school, I had interviewed for an internship at the Library of Congress (LC). I saw the opportunities at that library and what kind of institution it was. When I was hired at USAID, I had it in the back of my head that I still wanted to work at LC. When I decided to leave USAID, I started applying for any positions at LC that I was more or less qualified for. The first position I was offered was an ISSN cataloger at the National Serials Data Program (NSDP).2 It was only after I was trained at NSDP that I really discovered how interesting serials cataloging was. I was being trained right at the time that the CONSER Cataloging Manual was being developed,3 so Pamela Simpson, another recent NSDP hire, and I were guinea pigs for the manual. At LC, I really discovered how interesting serials cataloging was.

BP: What was it that you found interesting about serials cataloging?

SS: I think the reasons I prefer cataloging serials to monographs are the variety of materials and presentations and the amount of judgment that is required. As Pamela Simpson once told me, cataloging a serial is like studying a gazelle as it bounds across the savannah; cataloging a monograph is like doing an autopsy. I used this analogy once at a basic serials cataloging workshop, and I'll never use it again. I think I alienated a few people. First, serials tend to have non-standard presentations, so there's judgment involved there. And then, because of their changing nature, there's more judgment involved in describing how a serial changes. Oftentimes you have to guess what a serial might do in the future based on the facts in hand and, these days, an e-mail to the publisher. Identifying relationships between the title in your hand and other serial titles adds another aspect to the work. I feel especially fortunate to have been trained at NSDP because of the variety of serials one catalogs there. Every serial title published in the United States that is assigned an ISSN is cataloged by NSDP. The variety of materials I saw on a daily basis was greater than anything that would be held in one individual library, including the Library of Congress. NSDP cataloged it all, everything from Hustler Busty Beauties to academic journals to grange newsletters to popular magazines. I even helped catalog a serial t-shirt: each issue had a short story printed on it. Like all serials, it changed its title, from Tee Shorts to The Cotton Quarterly. It was in NSDP that I also started cataloging computer files. At that point it was mostly diskettes, but by the time I left in early 1995 there were enough e-serials being published that there was talk of establishing conventional practices beyond what little was specified in AACR.

BP: So you didn't have any serials experience when you took the job?

SS: No, I didn't have any serials experience other than serials acquisitions when I was at USAID.
Ellen Soper taught a serials class at UW that I hadn't taken. Serials were not on my radar when I was in library school. If someone had told me in library school that I would end up as a serials cataloger, I'd be rolling on the floor. I'm not sure what Julia Blixrud and Regina Reynolds saw in my interview, but apparently they liked something since they hired me. I think the reason they feel comfortable hiring people without serials experience is that the training at NSDP is very thorough. What's most important to them is that they hire people who will be good catalogers. NSDP quickly provides them with serials experience. NSDP is sort of a cataloging boot camp in some ways. You're always thinking about production because a lot of those publishers want their ISSN yesterday and you've got this wide variety of serials to catalog and everything you do is reviewed for at least six months, if not a year. I had many conversations with my reviser (thank you, Les Hawkins, for your patience), and it was in those conversations that I really began to understand what aspects of serials cataloging came from the rules and what was cataloger's judgment.

BP: And now you're a serials cataloger at the UW. Why an academic library?

SS: Serials cataloging in a large, academic library was a natural progression from ISSN cataloging. At the time I left NSDP, ISSN catalogers only did original descriptive cataloging. If they were working with copy, they would verify subject headings, but they wouldn't assign subject headings or do authority work. When you're expected to assign an average of two ISSN an hour, you don't have time for the niceties of subject headings. Besides, NSDP's work really isn't that of a library. NSDP is part of an international serials registration system assigning ISSN as a standard identifier. I understand that NSDP catalogers are now actually cataloging for the Library of Congress collection and are doing full-level cataloging. Cataloging for a large academic library was a natural next step in my career.

BP: What are some of the biggest challenges you encounter when cataloging serial titles?

SS: I think the most challenging thing really is electronic serials, not only in terms of cataloging, but also in terms of resource and collection management. I think the most common cataloging challenge is that there are no standard presentations of bibliographic information. Often within a particular package or aggregation you might have a standard way of presenting information, so that if you are actually creating records for an entire package, after you've done the first few, then the rest of them are like, "I know where to get this information." But across the board there are so few standards.

Cataloging serials generally is difficult because the presentations are inconsistent, but they are even more so with e-serials. It's also more difficult to navigate an electronic serial. With a print serial you can flip through the pages to find the information you need. With an e-serial, you have to click, click, click everywhere to find the information necessary for cataloging. I think, in that respect, it's a little more challenging and a lot more time consuming.
Also, and I say this a lot in workshops, one of the problems we have that really hasn't been dealt with very well in terms of the Anglo-American Cataloguing Rules (AACR2) Chapter 12 changes is the fact that with tangible serials (print serials or any other tangible formats) each issue has to self-identify for the whole system to work. In order for the publishers to know what they're sending out, in order for the receipt staff to know what they're receiving, in order for the acquisitions staff to know what they've ordered and need to claim, in order for a user to find a citation, and in order for the binding people to know what they have, every issue needs to self-identify. So you're going to have some kind of a chief source that the cataloger can use on every issue.

Well, you don't have to do that in the electronic world. In the electronic world, what people care about is the article, the content. It's like the publisher has cut out the articles of all of the issues and they've put them in one place on the Web space, and then they've taken one copy of the cover, masthead and the editorial information and they've put that in another place on the Web space, so what happens is that we can't as easily rely on the issue to identify the bibliographic information that is used for identification and description. In successive entry cataloging we have to rely on individual issues because we compare from issue to issue to issue to identify where a title change or a major change happens. If the information is no longer on each issue, this absence really presents a challenge. Another challenge is the link maintenance issue—the location (URL) of the resource changes.

BP: That's a problem faced by anyone who catalogs Web resources. I understand you're involved in a CONSER project involving the use of PURLs (Persistent Uniform Resource Locators) to assist in maintaining links for freely-available e-serials.

SS: Right, the CONSER PURL Pilot. Funny you should focus on the PURL project. Our contribution has been pretty minimal. We've only contributed forty-six PURLs to date. So to make up for it, we've hosted the project discussion list. It is my little contribution, plus it was a good learning experience. Stephanie Sheppard, another serials librarian at the UW, and I are the only catalogers who are assigning PURLs here at the UW. PURLs can also be assigned to resources currently cataloged as monographs, but our e-resource catalogers have been so busy that they haven't had time to do more than read up on the project. We haven't assigned PURLs for more e-serials because we've been spending our copious spare time on figuring out how to incorporate Serials Solutions data into our workflows and how to integrate that with Innovative Interfaces' new WebBridge and e-serials holdings software. We haven't had a lot of new non-commercial titles to catalog recently and I haven't had the time to retrospectively identify e-serials that can use PURLs.

BP: Who else was involved in the pilot project and how does it work?

SS: University of California at Los Angeles (UCLA) was the primary contributor of PURLs (nearly 900 of 2047). Valerie Bross at UCLA has done an absolutely fantastic job of managing the project. The process isn't difficult. It is basically a two-step process of searching the PURL database to confirm your URL hasn't already been assigned a PURL, then filling out a simple form to assign a PURL for that URL.
Until PURLs can be automatically generated from an OCLC Passport or Connexion session, it is an extra step in the process (one which I think is well worth the benefit). For the thirty-four PURLs we're responsible for, maintenance has been minor, only one or two titles at most on the weekly link maintenance report. More information about the project is available from the CONSER home page.[4]

BP: This sounds similar to the process used by the Government Printing Office (GPO).

SS: It's exactly what GPO is doing; only the records are directly maintained by a single organization like GPO. Because GPO has a mandate for cataloging U.S. government publications, it makes sense for them to be responsible for that area of the world. And in some respects it probably makes sense for CONSER to be responsible for maintaining e-serial access since there is an organizational commitment to describe and provide access to serials. As you know, the scope of the CONSER PURL Pilot is freely-available e-serials which are not U.S. government publications. The GPO PURL server controls U.S. government publications.

BP: But it sounds like a project with a great deal of potential, especially in a cooperative environment.

SS: I definitely believe that a cooperative PURL maintenance program is one effective solution as long as the database is large enough to be seen as a useful resource, for example, one with the Program for Cooperative Cataloging (PCC). Because PURLs are becoming more commonplace, I've seen a number of institutional-based PURL servers, which in some respects defeats the purpose of having a PURL server. Unless there is some coordination between PURL servers or perhaps some very clear, commonly acknowledged scopes among PURL servers (i.e., GPO covers U.S. governmental publications), it seems like there is the potential for redundancy among PURL servers.

If you're going to have a PURL project that a lot of people are going to take advantage of, it's got to be cooperative, it's got to be opened up. Like with the success of OCLC, if you do have people who are willing to contribute and maintain, there is some real potential. I still don't think it's a final solution because you still have to have a person find out that a link is broken and maintain it. However, instead of being maintained in thousands of library catalogs and Web pages around the world, it's only maintained in one place. There's a lot of savings there, but it still requires a person to go in and identify that the link is broken.

BP: So is CONSER considering designating someone as a link maintainer?

SS: In the CONSER project, the institution that makes the original PURL assignment is responsible for maintaining the PURL in the resolution table. OCLC runs link validation software on a weekly basis. The institutional coordinator then gets weekly error reports showing not just 404 errors, but a number of conditions, including the number of redirects. So if the original URL is now a redirect, you get notified. The final report available through the CONSER home page gives a good idea of the scope in terms of the number of PURLs needing maintenance and other statistics.

BP: You mentioned earlier that the serials department at UW is looking into ways to incorporate Serials Solutions data into the workflow. Many libraries are struggling with ways to integrate e-journals and aggregators into their OPACs, and many are coming up with homegrown programs. Did you consider creating something locally?
SS: We've already created something locally in that we've integrated e-journals and aggregators into our WebPAC, more or less successfully. Links for titles in smaller packages (usually less than a couple hundred) are entered and maintained by our serials acquisition staff. Catalog records for the individual titles from our two largest aggregations (ProQuest and LexisNexis) are obtained from a third party and periodically machine-loaded. We've also created an SQL database of catalog records for all our electronic resources. This database, called the Digital Registry, is spun off from our MARC database on a daily basis and forms the basis for many of our Web-based services, including the e-journals and database pages and many of our subject-based pages. The service that we're not currently providing is a link management service that takes OpenURL data (mostly from our bibliographic databases) and returns library services to the user (i.e., a list of services such as "Full-text from ScienceDirect," "Catalog search for this title," "ILL request"). In order to do article-level linking using OpenURL, we need to have holdings data in a formatted form for our e-serials, something we haven't tracked to date. Serials Solutions will be providing us with two sets of data: holdings data for as many packages as we can get from them, and full MARC records for titles from three aggregator databases, ProQuest, LexisNexis and Expanded Academic Index. We're then planning on using this data in conjunction with Innovative Interfaces' WebBridge software to provide additional library services.

BP: What convinced you to give a commercial product, in this case Serials Solutions, a try?

SS: In our case, it was a need to improve user service by providing better, more customized access to our full-text collections. Our systems office had discussed developing link management software in-house, but with the number of packages and publishers we get full text from, the problem was unmanageable. Now that OpenURL is becoming a more commonly used standard, we decided to try Innovative Interfaces' WebBridge software so that we would have a compatible commercial product (and a single data store) to support some of these services. As I mentioned, the specific data we were looking for to support the link management software was holdings data and MARC records for the aggregator databases.

BP: By now we've all heard about the upcoming AACR2 revisions, specifically those involving Chapter 12, the chapter that deals with serials. What are some of these revisions, and how do they address some of the cataloging challenges that you mentioned earlier?

SS: The main revision is the change of scope of Chapter 12 to "Continuing Resources," which now includes not only serials but also a new type of material called "Integrating Resources." These are defined as resources that are added to or changed by updates that do not remain discrete and are integrated into the resource. Websites, loose-leaf publications, and databases are examples of integrating resources. Until now, there were supplementary rule sets, like Adele Hallam's guidelines for print loose-leafs.[5] Nancy Olson's manual never really addressed describing how an electronic resource changed over time.[6] I think one of the great things about the Chapter 12 revision is that catalogers will be able to describe consistently how a Website or a database changes over time.
Other significant revisions that affect serials catalogers are the changes to the title change rules. Instead of referring to "title changes," catalogers will be referring to "major changes" and "minor changes." This is the terminology used by the ISSN network to identify whether a change is significant enough to create a new record, thus a major change. Serials catalogers have previously called any change significant enough to create a new record a "title change" even if the change was not in the title, but in the corporate body main entry. Then there are many title changes that aren't significant enough to be a "title change" but are considered title variants. No wonder non-catalogers find serials cataloging so confusing. I'm hopeful that the change in terminology will make our conversations a little clearer. In addition to the terminology, the actual title change rules have changed significantly so that many fewer changes in the title will be considered major changes and so fewer successive entry serial records will be created in the future.

To tell you the truth, I'm not sure that the change in rules will help with the cataloging challenges I talked about earlier. We'll still have inconsistent and ambiguous presentation of bibliographic information on electronic resources, and catalogers will still have to use a lot of judgment in creating original records for electronic serials. There are some e-serial specific guidelines for title changes that will definitely help in determining whether a particular e-serial title change is major or minor. I think the main benefits are that catalogers now have rules governing how to describe the changing aspects of non-serially published resources and that we'll see more consistent cataloging.

BP: What kind of an impact, if any, do you see these changes having on users?

SS: Records will include a lot more dates (date viewed, last update, etc.) so that a user will have a better sense of how current a particular record is. Better access will also be provided since earlier access points (earlier titles, authors, issuing bodies) will be retained in the catalog records. For example, if an online database changes its name, the earlier name will be retained in the record (MARC field 247) so searchers will be able to find the earlier title in the catalog.

BP: What about the impact on catalogers and serials staff? I imagine there are some retraining issues that need to be addressed.

SS: Not only retraining issues, but also some organizational issues. Since databases, Websites and loose-leafs are cataloged using Chapter 12, who catalogs them? Where do the divisional or personnel lines go? I might be wrong on this, but I think the e-resources cataloging community, as a whole, was not really aware of the extent of the changes to Chapter 12, and there might be a small amount of panic out there when individual catalogers start seeing catalog records that include notes and tagging conventions that are different from what they're used to seeing. These changes won't affect all e-resources. Static resources (individual documents, monographs) will still be cataloged using Chapter 9. When the resource changes somehow, the cataloger will then need to consult Chapter 12 in addition to Chapter 9.

SCCTP and the PCC are developing a one-day workshop on cataloging integrating resources. I understand that the release date will be sometime in April 2003.
In the meantime, various organizations will be sponsoring programs on the rule changes. ALCTS will be holding a series of institutes on the revisions, and Jean Hirons' (LC) overview of the changes is available from the CONSER home page. My advice to electronic resource catalogers is to keep your ears open for any continuing education opportunities that might be available and take a look at Jean's overview to get an idea of what to expect.

BP: Let's talk a little more about the e-serials course with which you're currently involved.[7]

SS: The e-serials course was released in April 2002, and I think that four workshops have been given since its release. I just gave it for the first time a couple of days ago at OCLC Western Service Center in Lacey, Washington.

BP: How was the workshop received?

SS: We had twelve people, and it worked really well. The evaluations were very positive. The course is very information-dense. There's some information about different approaches to providing access to aggregations through the library catalog and also how to provide access outside the catalog. There's a little tutorial on OpenURL using an SFX example because that was the only widely available commercial product six months ago. I'm hopeful that it will get people who have never had to deal with these issues before pointed in the right direction. And there's a bibliography where they can find out more about various e-serial products and projects.

BP: This is the workshop that you co-developed with Les Hawkins from LC?

SS: Right, with Les Hawkins. Jean Hirons had identified a number of people who could be workshop developers for either the e-serials course or the advanced serials course, which was being developed at the same time. Because of my background, she felt it would be more appropriate for me to work on the e-serials course. I'm one of probably a couple dozen "experts" in the field. It's really kind of funny because these days I really have to try to make time for cataloging e-serials because we don't do a whole lot of original cataloging of e-serials. Most of the titles we get (and I think this is true for a lot of people) are part of a package or are commercially published so there's cataloging copy in OCLC for most of our new titles. I know that there are people like David Van Hoy from MIT, Renette Davis from University of Chicago, Naomi Young from University of Florida, Becky Culbertson from University of California at San Diego, and Valerie Bross (UCLA) who do a lot more cataloging and are really much more the experts on this stuff than I am. I just happened to be the one to write a couple of articles and get involved with the training issues, so my name is out there more.

BP: I understand that the course is aimed at those who have serials cataloging experience.

SS: Yes. The course is geared towards people who have serials cataloging experience but not necessarily a lot of experience with e-serials. I really like the fact that it doesn't just focus on creating original records, but it also addresses other, related issues.

BP: What are some of these common issues?

SS: Some of the issues covered include how to provide access both within and without the catalog, how to handle a variety of change situations such as title, URL, or format changes, and what is a serial versus an integrating resource.

BP: A final question for you.
As online resources become more commonplace in libraries, how do you see the role of serials catalogers evolving to meet the future challenges these resources present?

SS: Geez, you save the easy questions for the end! As we use more commercial tools and products, I think serials catalogers might become the "serial metadata experts," being part of a team that manages serial data that can come from a variety of sources or that is used in a variety of services. It all depends on the library. At our institution, we've definitely seen a decrease in print serial workflow that is as much due to serials cancellation projects as to the format. Or put another way, it's cheaper for us to buy a title in electronic form as part of a package than individually in print, so we'll cancel the print. However, we have a pretty sizable microfilm backlog, so I think someone will still be creating serial catalog records at the University of Washington for quite a while.

Notes

[1] Marcia Sinetar, Do What You Love, the Money Will Follow: Discovering Your Right Livelihood. New York: Paulist Press, 1987.
[2] National Serials Data Program. http://www.loc.gov/issn (17 August 2002).
[3] CONSER Cataloging Manual (CCM). Washington, DC: Serial Record Division, Library of Congress, distributed by the Cataloging Distribution Service, 1998.
[4] CONSER PURL Pilot. http://www.loc.gov/acq/conser/purl/documentation.htm (17 August 2002).
[5] Adele Hallam, Cataloging Rules for the Description of Looseleaf Publications with Special Emphasis on Legal Materials, 2nd ed. Washington, DC: Library of Congress, 1989.
[6] Nancy B. Olson, Cataloging Internet Resources: A Manual and Practical Guide, 2nd ed. Dublin, OH: OCLC, 1997.
[7] Serials Cooperative Cataloging Training Program (SCCTP). The materials for the SCCTP e-serials workshop have been translated into Chinese and Spanish. More information about the course is available at http://www.loc.gov/acq/conser/scctp/home.html (19 August 2002).

work_ak5jigojane2zjdpjwse6pbcme ----

Contextualization of topics
Browsing through the universe of bibliographic information

Rob Koopman · Shenghui Wang · Andrea Scharnhorst

Abstract This paper describes how semantic indexing can help to generate a contextual overview of topics and visually compare clusters of articles. The method was originally developed for an innovative information exploration tool, called Ariadne, which operates on bibliographic databases with tens of millions of records [18]. In this paper, the method behind Ariadne is further developed and applied to the research question of the special issue "Same data, different results" – the better understanding of topic (re-)construction by different bibliometric approaches. For the case of the Astro dataset of 111,616 articles in astronomy and astrophysics, a new instantiation of the interactive exploring tool, LittleAriadne, has been created. This paper contributes to the overall challenge to delineate and define topics in two different ways. First, we produce two clustering solutions based on vector representations of articles in a lexical space. These vectors are built on semantic indexing of entities associated with those articles. Second, we discuss how LittleAriadne can be used to browse through the network of topical terms, authors, journals, citations and various cluster solutions of the Astro dataset.
More specifically, we treat the assignment of an article to the different clustering solutions as an additional element of its bibliographic record. Keeping the principle of semantic indexing on the level of such an extended list of entities of the bibliographic record, LittleAriadne in turn provides a visualization of the context of a specific clustering solution. It also conveys the similarity of article clusters produced by different algorithms, hence representing a complementary approach to other possible means of comparison.

Keywords: Random projection · clustering · visualization · topical modelling · interactive search interface · semantic map · knowledge map

R. Koopman, OCLC Research, Schipholweg 99, Leiden, The Netherlands. Tel.: +31 71 524 6500. E-mail: rob.koopman@oclc.org
S. Wang, OCLC Research, Schipholweg 99, Leiden, The Netherlands. Tel.: +31 71 524 6500. E-mail: shenghui.wang@oclc.org
A. Scharnhorst, DANS-KNAW, Anna van Saksenlaan 51, The Hague, The Netherlands. Tel.: +31 70 349 4450. E-mail: andrea.scharnhorst@dans.knaw.nl

arXiv:1702.08210v1 [cs.DL] 27 Feb 2017

1 Introduction

What is the essence, or the boundary, of a scientific field? How can a topic be defined? Those questions are at the heart of bibliometrics. They are equally relevant for indexing, cataloguing and consequently information retrieval [24]. Rigour and stability in bibliometrically defining boundaries of a field are important for research evaluation and consequently the distribution of funding. But, for information retrieval - next to accuracy - serendipity, broad coverage and associations to other fields are of equal importance. If researchers seek information about a certain topic outside of their areas of expertise, their information needs can be quite different from those in a bibliometric context. Among the many possible hits for a search query, they may want to know which are core works (articles, books) and which are rather peripheral. They may want to use different rankings [25], get some additional context information about authors or journals, or see other closely related vocabulary or works associated with a search term. On the whole, they would have less need to define a topic and a field in a bijective, univocal way. Such a possibility to contextualize is not only important for term-based queries. It also holds for groups of query terms, or for the exploration of sets of documents, produced by different clustering algorithms. Contextualisation is the main motivation behind this paper.

If we talk of contextualisation we still stay in the realm of bibliographic information. That is, we rely on information about authors, journals, words, references as hidden in the entirety of the set of all bibliographic records. Decades of bibliometrics research have produced many different approaches to cluster documents, or more specifically, articles. They often focus on one entity of the bibliographic record. To give one example, articles and terms within those articles (in title, abstract and/or full text) form a bipartite network. From this network we can either build a network of related terms (co-word analysis) or a network of related articles (based on shared words). The first method, sometimes called lexical [20], has been applied in scientometrics to produce so-called topical or semantic maps. The same exercise can be applied to authors and articles, authors and words [21], and in effect to each element of the bibliographic record for an article [13].
If we extend the bibliographic record of an article with the list of references contained by this article, we enter the area of citation analysis. Here, the following methods are widely used: direct citations, bibliographic coupling and co-citation maps. Hybrid methods combine citation and lexical analysis (e.g., [14, 37]). We would like to note here that in an earlier comparison of citation- and word-based mapping approaches Zitt et al. ([38]) underline the differences both signals carry in terms of what aspect of scientific practice they represent. We come back to this in the next paragraph. Formally speaking, the majority of studies apply one method and often display unipartite networks. Sometimes analysis and visualization of multi-partite networks can be found [32].

Each network representation of articles captures some aspect of connectivity and structure which can be found in published work. Co-authorship networks shed light on the social dimension of knowledge production, the so-called Invisible College [9, 23]. Citation relations are interpreted as traces of flows of knowledge [28, 30]. By using different bibliographic elements, we obtain different models for, or representations of, a field or topic; i.e. as a conceptual, cognitive unit; as a community of practice; or as institutionalized in journals. One could also say that choosing what to measure affects the representation of a field or topic. Another source of variety beyond differences arising from choice of representations is how to analyze those representations. Fortunately, network analysis provides several classical methods to choose from, including clustering and clique analysis. However, clusters can be defined in different ways, and some clustering algorithms can be computationally expensive when used on large or complex networks. Consequently, we find different solutions for the same algorithm (if parameters in the algorithm are changed) and different solutions for different algorithms. One could call this an effect of the choice of instrument for the measurement, or how to measure. Using an ideal-typical workflow, these points of choice have been further detailed and discussed in another paper of this special issue ([33]). The variability in each of the stages of the workflow results in ambiguity, and, if not articulated, makes it even harder to reproduce results. Overall, moments of choice add an uncertainty margin to the results [19, 27]. Last but not least, we can ask ourselves whether clear delineations exist between topics in practice. Often in the sciences very different topics are still related to each other. There exist unsharp boundaries and almost invisible long threads in the fabric of science [5], which might inhibit the finding of a contradiction-free solution in the form of a unique set of disjunct clusters.

There is a seeming paradox between the fact that experts often can rather clearly identify what belongs to their field or a certain topic, and that it is so hard to quantitatively represent this with bibliometric methods. However, a closer look into science history and science and technology studies reveals that even among experts opinions regarding subject matter classification or topic identification might vary. What belongs to a field and what not is as much an epistemic question as an object of social negotiations.
Moreover, the boundaries of a field change over time, and even a defined canon or body of knowledge determining the essence of a field or a topic can still be controversial or subject to change [8]. Defining a topic requires a trade-off between accepting the natural ambiguity of what a topic is and the necessity to define a topic for purposes of education, knowledge acquisition, and evaluation. Since different perspectives serve different purposes, there is also a need to preserve the diversity and ambiguity described earlier. Having said this, for the sake of scientific reasoning it is equally necessary to be able to further specify the validity and appropriateness of different methods for defining topics and fields [11].

This paper contributes to this sorting-out-process in several ways. All are driven by the motivation to provide a better understanding of the topic re-construction results by providing context: context of the topics themselves by using a lexical approach and all elements of the bibliographic record to delineate topics; and context for different solutions in the (re-)construction of topics. We first introduce the method of semantic indexing, by which each bibliographic record is decomposed and a vector representation for each of its entities in a lexical space is built, resulting in a so-called semantic matrix. This approach is conceptually closer to classical information retrieval techniques based on Salton's vector space model [29] than to the usual bibliometrical mapping techniques. In particular, it is similar to Latent Semantic Indexing or Latent Semantic Analysis. In the specific case of the Astro dataset, we extend the bibliographic record with information on cluster assignments provided by different clustering solutions. For the purpose of a delineation of topics based on clustering of articles, we reconstruct a semantic matrix for articles based on the semantic indexing of their individual entities. Secondly, based on this second matrix, we produce our own clustering solutions (detailed in [36]) by applying two different clustering algorithms. Third, we present an interactive visual interface called LittleAriadne that displays the context around those extracted entities. The interface responds to a search query with a network visualization of the most related terms, authors, journals, citations and cluster IDs. The query can consist of words or author names, but also clustering solutions. The displayed nodes or entities around a query term represent, to a certain extent, the context of the query in a lexical, semantic space.

In what follows, we address the following research questions:

Q1 How does the Ariadne algorithm, originally developed for a large corpus which contains tens of millions of articles, work on a much smaller, field-specific dataset? How can we relate the produced contexts to domain knowledge retrieved from other information services?

Q2 Can we use LittleAriadne to compare different cluster assignments of papers, by treating those cluster assignments as additional entities? What can we learn about the topical nature of these clusters when exploring them visually?

Concerning the last question, we restrict this paper to a description of the approach LittleAriadne offers, and we provide some illustrations.
A more detailed discussion of the results of this comparison has been taken up as part of the comparison paper of this special issue [33], which on the whole addresses different analytic methods and visual means to compare different clustering solutions.

2 Data

The Astro dataset used in this paper contains documents published in the period 2003–2010 in 59 astrophysical journals.[1] Originally, these documents had been downloaded from the Web of Science in the context of a German-funded research project called "Measuring Diversity of Research," conducted at the Humboldt-University Berlin from 2009 to 2012. Based on institutional access to the Web of Science, we worked on the same dataset. Starting with 120,007 records in total, 111,616 records of the document types Article, Letter and Proceedings Paper have been treated with different clustering methods (see the other contributions to this special issue). Different clustering solutions have been shared, and eventually a selection of solutions for the comparison has been defined. In our paper we used clustering solutions from CWTS-C5 (c) [31], UMSI0 (u) [34], HU-DC (hd) [12], STS-RG (sr) [6], ECOOM-BC13 (eb), ECOOM-NLP11 (en) (both [10]) and two of our own: OCLC-31 (ok) and OCLC-Louvain (ol) [36].

[1] For details of the data collection and cleaning process leading to the commonly used Astro dataset see [33].

The CWTS-C5 and UMSI0 solutions are generated by two different methods, Infomap and the Smart Local Moving Algorithm (SLMA) respectively, applied on the same direct citation network of articles. The two ECOOM clustering solutions are generated by applying the Louvain method to find communities among bibliographically coupled articles, where ECOOM-NLP11 also incorporates the keywords information. The STS-RG clusters are generated by first projecting the relatively small Astro dataset onto the full Scopus database. After the full Scopus articles are clustered using SLMA on the direct citation network, the cluster assignments of Astro articles are collected. The HU-DC clusters are the only overlapping clusters, generated by a memetic-type algorithm designed for the extraction of overlapping, poly-hierarchical topics in the scientific literature. Each article is assigned to a HU-DC cluster with a confidence value. We only took those assignments with a confidence value higher than 0.5. More detailed accounts of these clustering solutions can be found in [33].

Table 1 shows their labels later used in the interface, and how many clusters each solution produced. All the clustering solutions are based on the full dataset. However, each article is not necessarily guaranteed to have a cluster assignment in every clustering solution (see the papers about the clustering solutions for further details). The last column in Table 1 shows how many articles of the original dataset are covered by different solutions.

Table 1 Statistics of clustering solutions generated by different methods

  Cluster label   Solution        #Clusters   Coverage
  c               CWTS-C5         22          91%
  u               UMSI0           22          91%
  ok              OCLC-31         31          100%
  ol              OCLC-Louvain    32          100%
  sr              STS-RG          556         96%
  eb              ECOOM-BC13      13          97%
  en              ECOOM-NLP11     11          98%
  hd              HU-DC           113         91%
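The #Clusters and Coverage columns of Table 1 are simple aggregates over the shared assignment data. The following is a minimal sketch of how such a table could be recomputed; the (article_id, solution, cluster_id) tuple layout and the variable names are illustrative assumptions, not the actual format of the shared dataset.

```python
from collections import defaultdict

# Hypothetical input: one (article_id, solution_label, cluster_id) tuple per
# assignment of an article to a cluster; the layout is illustrative only.
assignments = [
    ("ISI:000276828000006", "c", "c 19"),
    ("ISI:000276828000006", "ok", "ok 18"),
    # ... one tuple per article/solution pair for the whole dataset
]
n_articles = 111616  # size of the Astro dataset

clusters = defaultdict(set)  # solution -> distinct cluster IDs
covered = defaultdict(set)   # solution -> articles with at least one assignment

for article, solution, cluster in assignments:
    clusters[solution].add(cluster)
    covered[solution].add(article)

for solution in sorted(clusters):
    print(solution,
          len(clusters[solution]),                       # '#Clusters' column
          f"{len(covered[solution]) / n_articles:.0%}")  # 'Coverage' column
```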
3 Method

3.1 Building semantic representations for entities

The Ariadne algorithm was originally developed on top of the article database ArticleFirst of OCLC [18]. The interface, accessible at http://thoth.pica.nl/relate, allows users to visually and interactively browse through 35 thousand journals, 3 million authors, and 1 million topical terms associated with 65 million articles. The Ariadne pipeline consists of two steps: an offline procedure for semantic indexing and an online interactive visualization of the context of search queries. We applied the same method to the Astro dataset and built an instantiation, named LittleAriadne, accessible at http://thoth.pica.nl/astro/relate.

To describe our method we give an example of an article from the Astro dataset in Table 2. We list all the fields of this bibliographic record that we used for LittleAriadne. We include the following types of entities for semantic indexing: authors, journals (ISSN), subjects, citations, topical terms, MAI-UAT thesaurus terms and cluster IDs (see Table 1).

Table 2 An article from the Astro dataset

  Article ID: ISI:000276828000006
  Title: On the Mass Transfer Rate in SS Cyg
  Abstract: The mass transfer rate in SS Cyg at quiescence, estimated from the observed luminosity of the hot spot, is log M-tr = 16.8 +/- 0.3. This is safely below the critical mass transfer rates of log M-crit = 18.1 (corresponding to log T-crit(0) = 3.88) or log M-crit = 17.2 (corresponding to the "revised" value of log T-crit(0) = 3.65). The mass transfer rate during outbursts is strongly enhanced.
  Author: [author:smak j]
  ISSN: [issn:0001-5237]
  Subject: [subject:accretion, accretion disks] [subject:cataclysmic variables] [subject:disc instability model] [subject:dwarf novae] [subject:novae, cataclysmic variables] [subject:outbursts] [subject:parameters] [subject:stars] [subject:stars dwarf novae] [subject:stars individual ss cyg] [subject:state] [subject:superoutbursts]
  Citation: [citation:bitner ma, 2007, astrophys j 1, v662, p564] [citation:bruch a, 1994, astron astrophys sup, v104, p79] [citation:buatmenard v, 2001, astron astrophys, v369, p925] [citation:hameury jm, 1998, mon not r astron soc, v298, p1048] [citation:harrison te, 1999, astrophys j 2, v515, l93] [citation:kjuikchieva d, 1998, a as, v262, p53] [citation:kraft rp, 1969, apj, v158, p589] [citation:kurucz rl, 1993, cd rom] [citation:lasota jp, 2001, new astron rev, v45, p449] [citation:paczynski b, 1980, acta astron, v30, p127] [citation:schreiber mr, 2002, astron astrophys, v382, p124] [citation:schreiber mr, 2007, astron astrophys, v473, p897, doi 10.1051/0004-6361:20078146] [citation:smak j, 1996, acta astronom, v46, p377] [citation:smak j, 2002, acta astronom, v52, p429] [citation:smak j, 2004, acta astronom, v54, p221] [citation:smak j, 2008, acta astronom, v58, p55] [citation:smak ji, 2001, acta astronom, v51, p279] [citation:tutukov av, 1985, pisma astron zh, v11, p123] [citation:tutukov av, 1985, sov astron lett+, v11, p52] [citation:voloshina ib, 2000, astron rep+, v44, p89] [citation:voloshina ib, 2000, astron zh, v77, p109]
  Topical terms: mass transfer; transfer rate; ss; cyg; quiescence; estimated; observed; luminosity; hot spot; log; tr; safely; critical; crit; corresponding; revised; value; outbursts; strongly; enhanced
  UAT terms: [uat:stellar phenomena]; [uat:mass transfer]; [uat:optical bursts]
  Cluster ID: [cluster:c 19] [cluster:u 16] [cluster:ok 18] [cluster:ol 23] [cluster:sr 17] [cluster:eb 1] [cluster:en 1] [cluster:hd 1] [cluster:hd 18] [cluster:hd 48]
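To make the deconstruction step concrete, the record of Table 2 can be held as a flat list of typed entity strings, reusing the bracketed syntax of the search field. This is only a sketch of one plausible in-memory layout (with lists shortened for brevity); the paper does not specify Ariadne's actual internal representation.

```python
# The record of Table 2, reduced to typed entity strings in the bracketed
# syntax of the LittleAriadne search field (lists shortened for brevity).
record = {
    "id": "ISI:000276828000006",
    "entities": [
        "[author:smak j]",
        "[issn:0001-5237]",
        "[subject:dwarf novae]",
        "[citation:smak j, 2002, acta astronom, v52, p429]",
        "[uat:mass transfer]",
        "[cluster:c 19]",
        "[cluster:ok 18]",
    ],
    # topical terms extracted from the title and abstract
    "topical_terms": ["mass transfer", "transfer rate", "quiescence",
                      "luminosity", "hot spot", "outbursts"],
}

def entity_type(entity: str) -> str:
    """Return the type prefix of a bracketed entity, e.g. 'author'."""
    return entity.strip("[]").split(":", 1)[0]

print(sorted({entity_type(e) for e in record["entities"]}))
# ['author', 'citation', 'cluster', 'issn', 'subject', 'uat']
```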
For the Astro dataset, we extended the original Ariadne algorithm [17] by adding citations as additional entities. In the short paper about the OCLC clustering solutions [36] we applied clustering to different variants of the vector representation of articles, including variants with and without citations. We reported there about the effect of adding citations to vector representations of articles on clustering.

In Table 2 we display the author name (and other entities) in a syntax (indicated by square brackets) that can immediately be used in the search field of the interface. Each author name is treated as a separate entity. The next type of entity is the journal identified by its ISSN number. One can search for a single journal using its ISSN number. In the visual interface, the ISSN numbers are replaced by the journal name, which is used as the label for a journal node. The next type of entities are so-called subjects. Those subjects originate from the fields "Author Keywords" and "Keywords Plus" of the original Web of Science records. Citations, references in the article, are considered as a type of entity too. Here, we use the standardized abbreviated citations in the Web of Science database. We remark that we do not apply any form of disambiguation – neither for the author names nor for the citations. Topical terms, such as "mass transfer" and "quiescence" in our example, are single words or two-word phrases extracted from titles and abstracts of all documents in the dataset. A multi-lingual stop-word list was used to remove unimportant words, and mutual information was used to generate two-word phrases. Only words and phrases which occur more than a certain threshold value were kept. The next type of entity is a set of Unified Astronomy Thesaurus (UAT)[2] terms which were assigned by the Data Harmony's Machine Aided Indexer (M.A.I.).[3] Please refer to [7] for more details about the thesaurus and the indexing procedure. The last type of entity we add to each of the articles (specific for LittleAriadne) is the collection of cluster IDs corresponding to the clusters to which the article was assigned by the various clustering algorithms. For example, the article in Table 2 has been assigned to clusters "c 19" (produced by CWTS-C5) and "u 16" (produced by UMSI0), and so on. In other words, we treat the cluster assignments of articles as if they were classification numbers or additional subject headings. Table 3 lists the total number of different types of entities found in the Astro dataset.

Table 3 Entities in LittleAriadne

  Journals        59
  Authors         55,607
  Topical terms   60,501
  Subjects        41,945
  Citations       386,217
  UAT terms       1534
  Cluster IDs     610
  Total           546,473

To summarize, we deconstruct each bibliographic record, extract a number of entities, and add some more (the cluster IDs and the topical terms). Next, we construct for each of these entities a vector in a word space built from topical terms and subject terms. We assume that the context of all entities is captured by their vectors in this space. Figure 1 gives a schematic representation of these vectors, which form the matrix C. All types of entities – topical term, subject, author, citation, cluster ID and journal – form the rows of the matrix, and their components (all topical terms and subjects) the columns. The values of the vector components are the frequencies of the co-occurrence of an entity and a specific word in the whole dataset. That is, we count how many articles contain both an entity and a certain topical term or subject.
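A minimal sketch of this counting step is given below, assuming each article is available as a list of its entities plus its words (topical terms and subjects); the plain nested dictionaries stand in for the sparse matrix representation used in practice.

```python
from collections import defaultdict

# Hypothetical mini-corpus: each article lists its entities and its words
# (topical terms plus subjects), mirroring the record layout sketched above.
articles = [
    {"entities": ["[author:smak j]", "[issn:0001-5237]", "[cluster:c 19]"],
     "words": ["mass transfer", "quiescence", "dwarf novae"]},
    {"entities": ["[issn:0001-5237]"],
     "words": ["quiescence", "accretion"]},
]

# C[entity][word] = number of articles containing both the entity and the word.
C = defaultdict(lambda: defaultdict(int))
for article in articles:
    for entity in article["entities"]:
        for word in set(article["words"]):
            C[entity][word] += 1

print(dict(C["[issn:0001-5237]"]))
# e.g. {'quiescence': 2, 'mass transfer': 1, 'dwarf novae': 1, 'accretion': 1}
```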
Matrix C expresses the semantics of all entities in terms of their context. Such context is then used in a computation of their similarity/relatedness. Each vector can be seen as the lexical profile of a particular entity. A high cosine similarity value between two entities indicates a large overlap of the contexts of these two entities – in other words, a high similarity between them. This is different from measuring their direct co-occurrence.

Fig. 1 Dimension reduction using Random Projection

For LittleAriadne, the matrix C has roughly 546K × 102K elements, and is sparse and expensive for computation. To make the algorithm scale and to produce a responsive online visual interface, we applied the method of Random Projection [1, 15] to reduce the dimensionality of the matrix. As shown in Figure 1, we multiply C with a 102K × 600 matrix of randomly distributed –1 and 1, with half-half probabilities.[4] This way, the original 546K × 102K matrix C is reduced to a Semantic Matrix C′ of the size of 546K × 600. Still, each row vector represents the semantics of an entity. It has been discussed elsewhere [2] that with the method of Random Projection, similar to other dimension reduction methods, essential properties of the original vector space are preserved, and thus entities with a similar profile in the high-dimensional space still have a similar profile in the reduced space. A big advantage of Random Projection is that the computation is significantly less expensive than other methods, e.g., Principal Component Analysis [2]. Actually, Random Projection is often suggested as a way of speeding up Latent Semantic Indexing (LSI) [26], and Ariadne is similar to LSI in some ways. LSI starts from a weighted term-document matrix, where each row represents the lexical profile of a document in a word space. In Ariadne, however, the unit of analysis is not the document. Instead, each entity of the bibliographic record is subject to a lexical profile. We explain in the next section that, by aggregating over all entities belonging to one article, one can construct a vector representation for the article that represents its semantics and is suitable for further clustering processes (for more details please consult [36]).

[2] http://astrothesaurus.org/
[3] http://www.dataharmony.com/services-view/mai/
[4] More efficient random projections are available. This version is more conservative and also computationally easier.
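The reduction and the cosine-based ranking can be sketched with NumPy as follows. The ±1 projection matrix and the cosine ranking follow the description above; the toy dimensions are stand-ins for the real 546K × 102K matrix and the target dimension of 600.

```python
import numpy as np

rng = np.random.default_rng(42)

n_entities, n_words, k = 6, 10, 4  # real system: ~546K entities, ~102K words, k = 600

# Toy co-occurrence matrix C (rows: entities, columns: words).
C = rng.poisson(1.0, size=(n_entities, n_words)).astype(float)

# Random projection: multiply C by a matrix of -1/+1 drawn with equal probability.
R = rng.choice([-1.0, 1.0], size=(n_words, k))
C_prime = C @ R  # semantic matrix C', shape (n_entities, k)

def top_related(query_index: int, top_n: int = 3):
    """Rank all other entities by cosine similarity to the query entity."""
    norms = np.linalg.norm(C_prime, axis=1)
    norms[norms == 0] = 1.0  # guard against empty profiles
    sims = (C_prime @ C_prime[query_index]) / (norms * norms[query_index])
    order = np.argsort(-sims)
    return [(int(i), float(sims[i])) for i in order if i != query_index][:top_n]

print(top_related(0))  # indices and similarities of the three nearest entities
```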
With the Matrix C′, the interactive visual interface dynamically computes the most related entities (i.e., ranked by cosine similarity) to a search query. After irrelevant entities have been filtered out by removing entities with a high Mahalanobis distance [22] to the query, the remaining entities and the query node are positioned in 2D so that the distance between nodes preserves the corresponding distance in the high-dimensional space as much as possible. We use a spring-like force-directed graph drawing algorithm for the positioning of the nodes. Designed as an experimental, explorative tool, no other optimisation of the network layout is applied. In the on-line interface, it is possible to zoom into the visualization, to change the size of the labels (font slider) as well as the number of entities displayed (show slider). For the figures in the paper, we used snapshots, in which node labels might overlap. Therefore, we provide links to the corresponding interactive display for each of the figures. In the end, with its most related entities, the context of a query term can be effectively presented [18]. For LittleAriadne we extended the usual Ariadne interface with different lists of the most related entities, organized by type. This information is given below the network visualization.

3.2 From a semantic matrix of entities to a semantic matrix for articles

The Ariadne interface provides context around entities, but does not produce article clusters directly. In other words, articles contribute to the context of the entities associated with them, but their own semantics need to be reconstructed before we can apply clustering methods to identify article clusters. We describe the OCLC clustering workflow elsewhere [36], but here we would like to explain the preparatory work for it.

The first step is to create a vector representation of each article. For each article, we look up all entities associated with this article in the Semantic Matrix C′. We purposefully leave out the cluster IDs, because we want to construct our own clustering later independently, i.e., without already including information about clustering solutions of other teams. For each article we obtain a set of vectors. For our article example in Table 2 we have 55 entities. The set of vectors for this article entails one vector representing the single author of this article, 12 vectors for the subjects, one vector for the journal, 21 vectors for the citations and 20 vectors for topical terms. Each article is represented by a unique set of vectors. The size of the set can vary, but each of the vectors inside of a set has the same length, namely 600. For each article we compute the weighted average of its constituent vectors as its semantic representation. Each entity is weighted by its inverse document frequency to the third power; therefore, frequent entities are heavily penalized to have little contribution to the resulting representation of the article. In the end, each article is represented by a vector of 600 dimensions which becomes a row in a new matrix M with the size of 111,616 × 600. Note that since articles are represented as a vector in the same space where other entities are also represented, it is now possible to compute the relatedness between entities and articles! Therefore in the online interface, we can present the articles most related to a query.

To group these 111,616 articles into meaningful clusters, we apply standard clustering methods to M. As a first choice, the K-Means clustering algorithm results in 31 clusters. As detailed in [36], with k = 31, the resulting 31 clusters perform the best according to a pseudo-ground-truth built from the consensus of CWTS-C5, UMSI0, STS-RG and ECOOM-BC13. With this clustering solution the whole dataset is partitioned pretty evenly: the average size is 3600 ± 1371, and the largest cluster contains 6292 articles and the smallest 1627 articles. We also apply a network-based clustering method: the Louvain community detection algorithm. To avoid high computational cost, we first calculate for each article the top 40 most related articles, i.e., those with the highest cosine similarity. This results in a new adjacency matrix M′ between articles, representing an article similarity network where the nodes are articles and the links indicate that the connected articles are very similar. We set the threshold for the cosine similarity at 0.6 to reduce links with low similarity values. A standard Louvain community detection algorithm [3] is applied to this network, producing 32 partitions, i.e., 32 clusters. Compared to the 31 K-Means clusters, these 32 Louvain clusters vary more in terms of cluster size, with the largest cluster containing 9464 articles while the smallest contains 86. The Normalized Mutual Information [35] between these two solutions is 0.68, indicating that they are highly similar to each other yet different enough to be studied further. More details can be found in [36].
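The construction of M and the two clustering runs can be sketched as follows. The IDF-cubed weighting, the top-40 similarity network with a 0.6 threshold, and the NMI comparison follow the text above (with k scaled down); the synthetic data and the use of scikit-learn and networkx (whose louvain_communities requires a recent version) are implementation choices of this illustration only.

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
n_articles, k_dim = 200, 16  # real system: 111,616 articles, 600 dimensions

# Synthetic stand-in for the entity vectors: 8 hidden topics, 6 entities each;
# every article draws its 5 entities from the pool of a single topic.
topic_of = rng.integers(0, 8, size=n_articles)
centers = rng.normal(scale=4.0, size=(8, k_dim))
entity_vectors = np.repeat(centers, 6, axis=0) + rng.normal(size=(48, k_dim))
article_entities = [rng.integers(t * 6, t * 6 + 6, size=5) for t in topic_of]

# Inverse document frequency of each entity, cubed, as the averaging weight.
df = np.bincount(np.concatenate(article_entities), minlength=48) + 1
weights = np.log(n_articles / df) ** 3

# Matrix M: one idf^3-weighted average vector per article.
M = np.array([
    np.average(entity_vectors[ents], axis=0, weights=weights[ents])
    for ents in article_entities
])

kmeans_labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(M)

# Louvain on the top-40 similarity network, keeping links with cosine >= 0.6.
sims = cosine_similarity(M)
G = nx.Graph()
G.add_nodes_from(range(n_articles))
for i in range(n_articles):
    for j in np.argsort(-sims[i])[1:41]:  # skip the article itself
        if sims[i, j] >= 0.6:
            G.add_edge(i, int(j))

louvain_labels = np.empty(n_articles, dtype=int)
for c, members in enumerate(nx.community.louvain_communities(G, seed=0)):
    louvain_labels[list(members)] = c

print(normalized_mutual_info_score(kmeans_labels, louvain_labels))
```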
4 Experiments and results

To answer the two research questions listed in the introduction, we conducted the following experiments:

Experiment 1. We implemented LittleAriadne as an information retrieval tool. We searched with query terms, inspected and navigated through the resulting network visualization.

Experiment 2. We visually observed and compared different clustering solutions.

4.1 Experiment 1 – Navigate through networked information

We implemented LittleAriadne, which allows users to browse the context of the 546K entities associated with 111K articles in the dataset. If the search query refers to an entity that exists in the semantic matrix, LittleAriadne will return, by default, the top 40 most related entities, which could be topical terms, authors, subjects, citations or clusters. If there are multiple known entities in the search query, a weighted average of the vectors of the individual entities is used to calculate similarities (the same way an article vector is constructed). If the search query does not contain any known entities, a blank page is returned, as there is no information about this query.

Figure 2 gives a contextual view of "gamma ray."[5] The search query refers to a known topical term "gamma ray," and it is therefore displayed as a red node in the network visualization. The top 40 most related entities are shown as nodes, with the top 5 connected by the red links. The different colours reflect their types, e.g., topical terms, subjects, authors, or clusters. Each of these 40 entities is further connected to its top 5 most related entities among the rest of the entities in the visualization, with the condition that the cosine similarity is not below 0.6.

[5] Available at http://thoth.pica.nl/astro/relate?input=gamma+ray

Fig. 2 The contextual view of the query term "gamma ray"

A thicker link means the two linked entities are mutually related, i.e., they are among each other's top 5 lists. The colour of a link is taken from the node where the link originates. If the link is mutual and the two linked entities are of different types, one of the entity colours is chosen. The displayed entities often automatically form groups depending on their relatedness to each other, whereby more related entities are positioned closer to each other. Each group potentially represents a different aspect related to the query term. The size of a node is proportional to the logarithm of its frequency of occurrences in the whole dataset. The absolute number of occurrences appears when hovering the mouse cursor over the node. Because different statistical methods are at the core of the Ariadne algorithm, this number gives an indication of the reliability of the suggested position and links.
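The construction of such a context view can be sketched as follows: take the top 40 entities around the query, link each to its top 5 most related entities among those shown (cosine ≥ 0.6), and mark mutual links, which are drawn thicker. This is a sketch only; the Mahalanobis filtering step and the 2D layout are omitted for brevity, and the random input data is a stand-in for C′.

```python
import numpy as np

def cosine_matrix(V):
    """Pairwise cosine similarities between the rows of V."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    U = V / norms
    return U @ U.T

def context_view(C_prime, query_idx, show=40, fanout=5, threshold=0.6):
    """Nodes and links of a LittleAriadne-style context view around a query."""
    sims = cosine_matrix(C_prime)[query_idx]
    # the `show` entities most related to the query, excluding the query itself
    shown = [i for i in np.argsort(-sims) if i != query_idx][:show]
    sub = cosine_matrix(C_prime[shown])
    top = {}
    for a, i in enumerate(shown):
        order = [b for b in np.argsort(-sub[a]) if b != a and sub[a, b] >= threshold]
        top[i] = [shown[b] for b in order[:fanout]]
    links = []
    for i, neighbours in top.items():
        for j in neighbours:
            mutual = i in top.get(j, [])  # mutual links are drawn thicker
            links.append((i, j, mutual))
    return shown, links

rng = np.random.default_rng(1)
C_prime = rng.normal(size=(100, 8))   # stand-in for the semantic matrix
shown, links = context_view(C_prime, query_idx=0)
print(len(shown), len(links))         # random data may yield few links
```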
In Figure 2, there are four clusters from OCLC-31, ECOOM-BC13, ECOOM-NLP11, and CWTS. The ECOOM-BC13 cluster eb 8 and ECOOM-NLP11 cluster en 4 are directly linked to "gamma ray," suggesting that these two clusters are probably about gamma rays. It is not surprising that they are very close to each other, because they contain 7560 and 5720 articles respectively but share 3603 articles. At the lower part, the OCLC-31 cluster ok 21 and the CWTS cluster c 15 are also pretty close to our search term. They contain 1849 and 3182 articles respectively and share 1721 articles in common, which makes them close to each other in the visualization. By looking at the topical terms and subjects around these clusters, we can have a rough idea of their differences. Although they are all about "gamma ray," clusters eb 8 and en 4 are probably more about "radiation mechanisms," "very high energy," and "observations," while clusters ok 21 and c 15 seem to focus more on "afterglows," "prompt emission," and "fireball." Such observations will invite users to explore these clusters or subjects further.

Each node is clickable, which leads to another visualization of the context of the selected node. If one is interested in cluster ok 21, for instance, after clicking the node, a contextual view of cluster ok 21 is presented,[6] as shown in Figure 3.

Fig. 3 The contextual view of cluster ok 21

This context view provides a good indication about the content of the articles grouped together in this cluster. In the context view of cluster ok 21 we see again the cluster c 15, which was already near to ok 21 in the context view of "gamma ray." But the two ECOOM clusters, eb 8 and en 4, that are also in the context of "gamma ray" are not visible any more. Instead, we find two more similar clusters, u 11 and ol 9. That means that, even though the clusters ok 21 and eb 8 are among the top 40 entities that are related to "gamma ray," they are still different in terms of their content. This can be confirmed by looking at their labels in Table 4.[7]

Table 4 Labels of clusters similar to ok 21 and to "gamma ray"

  Cluster ID   Size   Cluster labels
  ok 21        1849   grb, ray burst, gamma ray, afterglow, bursts grbs, swift, prompt emission, prompt, fireball, batse
  c 15         3182   grb, ray bursts, gamma ray, afterglow, bursts grbs, sn, explosion, swift, type ia, supernova sn
  ol 9         2895   grb, ray bursts, gamma ray, afterglow, bursts grbs, sn, type ia, swift, explosion, ia supernovae
  u 11         2051   grb, ray bursts, gamma ray, afterglow, bursts grbs, sn, explosion, type ia, swift, supernova
  eb 8         7560   gamma ray, pulsar, ray bursts, grb, bursts grbs, high energy, jet, radio, psr, synchrotron
  en 4         5720   gamma ray, grb, ray bursts, cosmic ray, high energy, bursts grbs, afterglow, swift, tev, tev gamma

[6] Available at http://thoth.pica.nl/astro/relate?input=[cluster:ok%2021]
[7] More details about cluster labelling can be found in [16].

As mentioned before, in the interface one can also further refine the display. For instance, one can choose the number of nodes to be shown or decide to limit the display to only authors, journals, topical terms, subjects, citations or clusters. The former can be done by the slider show or by editing the URL string directly.
A display with only one type of entity enables us to see the context filtered along one perspective (lexical, journals, authors, subjects), and is often useful. For example, Figure 4 shows at least three separate groups of authors who are most related to “subject:hubble diagram.”8

8 Available at http://thoth.pica.nl/astro/relate?input=%5Bsubject%3Ahubble+diagram%5D&type=2

Fig. 4 The authors who are the most related to “subject:hubble diagram”

At any point of the exploration, one can see the most related entities, grouped by their types and listed at the bottom of the interface. The first category shown is the related titles, the titles of the articles most relevant to a search query. Due to license restrictions, we cannot make the whole bibliography available, but when clicking on a title, one actually sees the context of that article. Not only titles can be clicked through; all entities in the lower part are also clickable, and such an action leads to another contextual view of the selected entity.

At the top of the interface, under the search box, we find further hyperlinks behind the labels exact search and context search. Clicking on these hyperlinks automatically sends queries to other information spaces such as Google, Google Scholar, Wikipedia, and WorldCat. For exact search, the same query text is used. For context search, the system generates a selection among all topical terms related to the original query term and sends this selection as a string of terms (with the Boolean AND operation) to the information spaces behind the hyperlinks. This option offers users a potential way to retrieve related literature or web resources from a broader perspective. In turn, it also enables the user to better understand the entity-based context view provided by Ariadne.

Let us now come back to our first research question: how does the Ariadne algorithm work on a much smaller, field-specific dataset? The interface shows that the original Ariadne algorithm works well on the small Astro dataset. Not surprisingly, compared with our exploration in the much bigger and more general ArticleFirst dataset, we find more consistent representations; that is, specific vocabulary is displayed, which can be cross-checked in Wikipedia, Google or Google Scholar. On the other hand, different corpora introduce different contexts for entities. For example, “young” in ArticleFirst9 is associated with adults and 30 years old, while in LittleAriadne it is immediately related to young stars which are merely 5 or 10 million years old.10 Also, the larger number of topical terms in the larger database leads to a situation where almost every query term produces a response. In LittleAriadne, searches for, e.g., a writer such as Jane Austen retrieve nothing. Not surprisingly, for domain-specific entities, LittleAriadne tends to provide more accurate context. A more thorough evaluation needs to be based, as for any other topical mapping, on a discussion with domain experts.

9 Available at http://thoth.pica.nl/relate?input=young
10 Available at http://thoth.pica.nl/astro/relate?input=young
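The context-search hand-off described above can be illustrated as follows; this is our own sketch, the term-selection step is simplified to taking the first few related terms, and the outbound endpoints are ordinary public search URLs.

from urllib.parse import quote_plus

def context_search_urls(related_terms, limit=4):
    """Join the most related topical terms with Boolean AND and build
    outbound queries for the external information spaces."""
    query = quote_plus(" AND ".join(related_terms[:limit]))
    return {
        "Google":         "https://www.google.com/search?q=" + query,
        "Google Scholar": "https://scholar.google.com/scholar?q=" + query,
        "Wikipedia":      "https://en.wikipedia.org/w/index.php?search=" + query,
        "WorldCat":       "https://www.worldcat.org/search?q=" + query,
    }

# e.g. context_search_urls(["gamma ray", "afterglow", "prompt emission"])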
4.2 Experiment 2 – Comparing clustering solutions

In LittleAriadne we extended the interface with the goal of observing and comparing clustering solutions visually. As discussed in Section 3.1, cluster assignments are treated in the same way as other entities associated with articles, such as topical terms, authors, etc. Each cluster ID is therefore represented in the same space and visualized in the same way. In the interface, when we use a search term, for example “[cluster:c],” and tick the scan option, the interface scans all the entities in the semantic matrix which start with, in this case, “cluster:c,” and then effectively selects and visualizes all CWTS-C5 clusters.11 This way, we can easily see the distribution of a single clustering solution. Note that in this scanning visualization, any cluster which contains fewer than 100 articles is not shown.

11 This scan option is applicable to any other type of entity; for example, one can see all subjects which start with “quantum” by using “subject:quantum” as the search term and doing the scan.

Figure 5 shows the individual distributions of clusters from all eight clustering solutions. When two clusters have a relatively high mutual similarity, there is a link between them. It is not surprising to see that the HU-DC clusters are highly connected, as they are overlapping and form a poly-hierarchy. Compared to the CWTS-C5, UMSI and two ECOOM solutions, the STS-RG and the two OCLC solutions have more cluster-cluster links. This suggests that these clusters overlap more in terms of their direct vocabularies and the indirect vocabularies associated with their authors, journals and citations.

If we scan two or more cluster entities, such as “[cluster:c][cluster:ok],” we put two clustering solutions on the same visualization so that they can be compared visually. In Figure 6 (a) we see the high similarity between clusters from CWTS-C5 and those from OCLC-31.12 CWTS-C5 has 22 clusters while OCLC-31 has 31 clusters. Each CWTS-C5 cluster is accompanied by one or more OCLC clusters. This indicates that where the solutions differ, it is probably a matter of granularity rather than any fundamental issue. Figure 6 (b) shows two other sets of clusters that partially agree with each other but clearly differ in their capacity to identify different clusters.13

12 Available at http://thoth.pica.nl/astro/relate?input=%5Bcluster%3Ac%5D%5Bcluster%3Aok%5D&type=S&show=500
13 Available at http://thoth.pica.nl/astro/relate?input=%5Bcluster%3Au%5D%5Bcluster%3Asr%5D&type=S&show=500

Figure 7 (a) shows all the cluster entities from all eight clustering solutions.14 The STS and HU solutions have hundreds of clusters, which makes the visualization rather cluttered. Figure 7 (b) shows only the solutions from CWTS, UMSI, OCLC and ECOOM, whose numbers of clusters are comparable.15

14 Available at http://thoth.pica.nl/astro/relate?input=%5Bcluster%3Ac%5D%5Bcluster%3Au%5D%5Bcluster%3Aok%5D%5Bcluster%3Aol%5D%5Bcluster%3Aeb%5D%5Bcluster%3Aen%5D%5Bcluster%3Asr%5D%5Bcluster%3Ahd%5D&type=S&show=500
15 Available at http://thoth.pica.nl/astro/relate?input=%5Bcluster%3Ac%5D%5Bcluster%3Au%5D%5Bcluster%3Aok%5D%5Bcluster%3Aol%5D%5Bcluster%3Aeb%5D%5Bcluster%3Aen%5D&type=S&show=500

Concerning our second research question - can we use LittleAriadne to compare clustering solutions visually? - we can give a positive answer. It is not easy, however, to see from LittleAriadne why some clusters are similar and others are not. The visualization functions as a macroscope [4] and provides a general overview of all the clustering solutions, which helps to guide further investigation. It is not conclusive, but a useful heuristic device. For example, from Figure 7, and especially Figure 7 (b), it is clear that there are “clusters of clusters.” That is, some clusters are detected by all of these different methods. In the future we may investigate these clusters of clusters more closely and perhaps discover that different solutions identify some of the same topics. We continue the discussion of the use of visual analytics to compare clustering solutions in the paper by Velden et al. [33].
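The scan-and-compare mechanism lends itself to a compact sketch. The following illustration is our own, with assumed data structures: a dict of entity vectors keyed by IDs such as "cluster:c 15" and a dict of cluster sizes; the 0.6 link threshold is carried over from the article views as an assumption, since the text only says links are drawn for "relatively high" mutual similarity.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def scan_clusters(entity_vectors, prefixes, cluster_sizes, min_size=100, threshold=0.6):
    """Collect all cluster entities matching the prefixes and return the
    pairs whose mutual similarity is high enough to draw a link."""
    ids = [eid for eid in entity_vectors
           if any(eid.startswith(p) for p in prefixes)
           and cluster_sizes.get(eid, 0) >= min_size]   # hide tiny clusters
    sim = cosine_similarity(np.array([entity_vectors[eid] for eid in ids]))
    links = [(ids[i], ids[j], float(sim[i, j]))
             for i in range(len(ids)) for j in range(i + 1, len(ids))
             if sim[i, j] >= threshold]
    return ids, links

Calling scan_clusters(vectors, ["cluster:c", "cluster:ok"], sizes) would reproduce the kind of two-solution overlay shown in Figure 6 (a).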
Fig. 5 The distribution of clusters: (a) CWTS-C5, (b) UMSI0, (c) OCLC-31, (d) OCLC-Louvain, (e) ECOOM-BC13, (f) ECOOM-NLP11, (g) STS-RG, (h) HU-DC

Fig. 6 Visual comparison of clustering solutions: (a) highly similar clustering solutions, (b) clustering solutions with different focuses

Fig. 7 Visual comparison of clustering solutions: (a) all clustering solutions, (b) clusters from CWTS, UMSI, OCLC and ECOOM

5 Conclusion

We present a method, implemented in an interface, that allows browsing through the context of entities such as topical terms, authors, journals, subjects and citations associated with a set of articles. With the LittleAriadne interface, one can navigate visually and interactively through the context of entities in the dataset, seamlessly travelling between authors, journals, topical terms, subjects, citations and cluster IDs, as well as consult external open information spaces for further contextualization. In this paper we particularly explored the usefulness of the method for the problem of topic delineation addressed in this special issue.
LittleAriadne treats cluster assignments from different solutions as additional special entities. This way we provide a contextual view of clusters as well, which is beneficial for users who are interested in travelling seamlessly between different types of entities and the related cluster assignments generated by different solutions.

We also contributed two clustering solutions built on the vector representation of articles, which differs from the solutions provided by other methods. We start by including references and treating them as entities with a certain lexical or semantic profile. In essence, we start from a multipartite network of papers, cited sources, terms, authors, subjects, etc., and focus on similarity in a high-dimensional space. Our clusters are comparable to the other solutions yet have their own characteristics. Please see [33, 36] for more details.

We demonstrated that we can use LittleAriadne to compare different clustering solutions visually and generate a wider overview. This has the potential to be complementary to any other method of cluster comparison. We hope that this interactive tool supports discussion about different clustering algorithms and helps to find the right interpretation of clusters.

We have plans to further develop the Ariadne algorithm. The algorithm is general enough to incorporate additional types of entities into the semantic matrix. Which entities we can add very much depends on the information in the original dataset or database. In the future, we plan to add publishers, conferences, etc., with the aim of providing a richer contextualization of the entities typically found in a scholarly publication. We also plan to elaborate the links to the articles that contribute to a contextual visualization, thus strengthening the usefulness of Ariadne not only for the associative exploration of contexts, similar to scrolling through a systematic catalogue, but also as a direct tool for document retrieval.

In this context we plan to further compare LittleAriadne and Ariadne. As mentioned before, the corpora matter when talking about the context of entities. The advantage of LittleAriadne is the confinement of the dataset to one scientific discipline or field and the topics within it. By continuing such experiments, we hope also to learn more about the relationship between the genericity and specificity of contexts, and how that can best be addressed in information retrieval.

Acknowledgement

Part of this work has been funded by the COST Action TD1210 Knowescape and the FP7 Project ImpactEV. We would like to thank the internal reviewers Frank Havemann and Bart Thijs, as well as the anonymous external referees, for their valuable comments and suggestions. We would also like to thank Jochen Gläser, William Harvey and Jean Godby for comments on the text.

References

1. Achlioptas, D.: Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences 66(4), 671–687 (2003). DOI 10.1016/S0022-0000(03)00025-4
2. Bingham, E., Mannila, H.: Random projection in dimensionality reduction: Applications to image and text data. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’01, pp. 245–250. ACM, New York, NY, USA (2001). DOI 10.1145/502512.502546
3. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment (2008). P10008 (12pp)
4. Börner, K.: Plug-and-play macroscopes. Communications of the ACM 54(3), 60–69 (2011)
5. Boyack, K., Klavans, R.: Weaving the fabric of science. In: K. Börner, E.F. Hardy (eds.) 6th Iteration (2009): Science Maps for Scholars. Places & Spaces: Mapping Science (2010)
6. Boyack, K.W.: Investigating the effect of global data on topic detection. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2297-y
7. Boyack, K.W.: Thesaurus-based methods for mapping contents of publication sets. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2304-3
8. Galison, P.: Image and logic: A material culture of microphysics. University of Chicago Press (1997)
9. Glänzel, W., Schubert, A.: Analysing scientific networks through co-authorship. In: H.F. Moed, W. Glänzel, U. Schmoch (eds.) Handbook of quantitative science and technology research, pp. 257–276. Springer (2004). DOI 10.1007/1-4020-2755-9_12
10. Glänzel, W., Thijs, B.: Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017)
11. Gläser, J., Glänzel, W., Scharnhorst, A.: Introduction to the special issue “Same data, different results?”. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2296-z
12. Havemann, F., Gläser, J., Heinz, M.: Memetic search for overlapping topics. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2302-5
13. Havemann, F., Scharnhorst, A.: Bibliometric networks. CoRR abs/1212.5211 (2012). URL http://arxiv.org/abs/1212.5211
14. Janssens, F., Zhang, L., Moor, B.D., Glänzel, W.: Hybrid clustering for validation and improvement of subject-classification schemes. Information Processing & Management 45(6), 683–702 (2009). DOI 10.1016/j.ipm.2009.06.003
15. Johnson, W., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics 26, 189–206 (1984)
16. Koopman, R., Wang, S.: Mutual information based labelling and comparing clusters. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2305-2
17. Koopman, R., Wang, S., Scharnhorst, A.: Contextualization of topics - browsing through terms, authors, journals and cluster allocations. In: A.A. Salah, Y. Tonta, A.A.A. Salah, C.R. Sugimoto, U. Al (eds.) Proceedings of ISSI 2015 Istanbul: 15th International Society of Scientometrics and Informetrics Conference, Istanbul, Turkey, 29 June to 3 July, 2015. Bogaziçi University Printhouse (2015). URL http://www.issi2015.org/files/downloads/all-papers/1042.pdf
18. Koopman, R., Wang, S., Scharnhorst, A., Englebienne, G.: Ariadne’s thread: Interactive navigation in a world of networked information. In: B. Begole, J. Kim, K. Inkpen, W. Woo (eds.) Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems, CHI 2015 Extended Abstracts, Seoul, Republic of Korea, April 18-23, 2015, pp. 1833–1838. ACM (2015). DOI 10.1145/2702613.2732781
19. Kouw, M., Heuvel, C.V.d., Scharnhorst, A.: Exploring uncertainty in knowledge representations: Classifications, simulations, and models of the world. In: P. Wouters, A. Beaulieu, A. Scharnhorst, S. Wyatt (eds.) Virtual Knowledge. Experimenting in the Humanities and the Social Sciences, pp. 89–126. MIT Press, Cambridge, Mass. (2013)
20. Leydesdorff, L., Welbers, K.: The semantic mapping of words and co-words in contexts. Journal of Informetrics 5(3), 469–475 (2011). DOI 10.1016/j.joi.2011.01.008
21. Lu, K., Wolfram, D.: Measuring author research relatedness: A comparison of word-based, topic-based, and author cocitation approaches. Journal of the American Society for Information Science and Technology 63(10), 1973–1986 (2012). DOI 10.1002/asi.22628
22. Mahalanobis, P.C.: On the generalised distance in statistics. Proceedings National Institute of Science, India 2(1), 49–55 (1936)
23. Mali, F., Kronegger, L., Doreian, P., Ferligoj, A.: Dynamic scientific co-authorship networks. In: A. Scharnhorst, K. Börner, P. van den Besselaar (eds.) Models of Science Dynamics, Understanding Complex Systems, pp. 195–232. Springer Berlin Heidelberg (2012). DOI 10.1007/978-3-642-23068-4_6
24. Mayr, P., Scharnhorst, A.: Scientometrics and information retrieval: weak-links revitalized. Scientometrics 102(3), 2193–2199 (2015). DOI 10.1007/s11192-014-1484-3
25. Mutschke, P., Mayr, P.: Science models for search: a study on combining scholarly information retrieval and scientometrics. Scientometrics pp. 1–23 (2014). DOI 10.1007/s11192-014-1485-2
26. Papadimitriou, C.H., Raghavan, P., Tamaki, H., Vempala, S.: Latent semantic indexing: A probabilistic analysis. Journal of Computer and System Sciences 61(2), 217–235 (2000). DOI 10.1006/jcss.2000.1711
27. Petersen, A.: Simulating nature: A philosophical study of computer-simulation uncertainties and their role in climate science and policy advice. Het Spinhuis, Apeldoorn (2006)
28. Radicchi, F., Fortunato, S., Vespignani, A.: Citation networks. In: A. Scharnhorst, K. Börner, P. Besselaar (eds.) Models of Science Dynamics, Understanding Complex Systems, vol. 69, chap. 7, pp. 233–257. Springer Berlin / Heidelberg, Berlin, Heidelberg (2012). DOI 10.1007/978-3-642-23068-4_7
29. Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York, NY, USA (1986)
30. de Solla Price, D.J.: Networks of scientific papers. Science 149(3683), 510–515 (1965). DOI 10.1126/science.149.3683.510
31. Van Eck, N.J., Waltman, L.: Citation-based clustering of publications. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2300-7
32. Van Heur, B., Leydesdorff, L., Wyatt, S.: Turning to ontology in STS? Turning to STS through “ontology”. Social Studies of Science 43(3), 341–362 (2013). DOI 10.1177/030631271245814
33. Velden, T., Boyack, K., van Eck, N., Glänzel, W., Gläser, J., Havemann, F., Heinz, M., Koopman, R., Scharnhorst, A., Thijs, B., Wang, S.: Comparison of topic extraction approaches and their results. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2306-1
34. Velden, T., Yan, S., Lagoze, C.: Mapping the cognitive structure of astrophysics by Infomap. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2299-9
35. Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research 11, 2837–2854 (2010)
36. Wang, S., Koopman, R.: Clustering articles based on semantic similarity. In: J. Gläser, A. Scharnhorst, W. Glänzel (eds.) Same data – different results? Towards a comparative approach to the identification of thematic structures in science, Special Issue of Scientometrics (2017). DOI 10.1007/s11192-017-2298-x
37. Zitt, M., Bassecoulard, E.: Delineating complex scientific fields by an hybrid lexical-citation method: An application to nanosciences. Information Processing & Management 42(6), 1513–1531 (2006). DOI 10.1016/j.ipm.2006.03.016. Special Issue on Informetrics
38. Zitt, M., Lelu, A., Bassecoulard, E.: Hybrid citation-word representations in science mapping: Portolan charts of research fields? Journal of the American Society for Information Science and Technology 62(1), 19–39 (2011)

Collaborative Batch Creation for Open Access E-Books: A Case Study

Philip Young, Rebecca Culbertson, Kelley McGrath

Abstract

When the National Academies Press announced that more than 4,000 electronic books would be made freely available for download, many academic libraries expressed interest in obtaining MARC records for them. Using cataloging listservs, volunteers were recruited for a project to identify and upgrade bibliographic records for aggregation into a batch that could be easily loaded into catalogs. Project organization, documentation, quality control measures, and problems are described, as well as processes for adding new titles. The project’s implications for future efforts are assessed, as are the numerous challenges for network-level cataloging.

Introduction

In June 2011, the National Academies Press (NAP) issued a press release announcing that portable document format (PDF) versions of their books could be freely downloaded from their website.1 While about 65 percent had previously been available for download, and almost all were available to read online through a web reader, over 4,000 books were now downloadable, as would be most books issued in the future. NAP books are primarily reports from scientific panels on a variety of topics, and are often ordered in print by academic libraries. A link to the online version is frequently added to the print record in either the library catalog or in OCLC. A library’s catalog web page for a book can also include a link to the e-book via a link resolver or JavaScript that uses elements of the MARC record to create a search. However, these methods depend on the presence of the print version in the catalog, and most libraries do not have all of the available titles.
The announcement by NAP presented an opportunity to fill those gaps and add nearly all of the content in electronic form to a library’s catalog. In several academic libraries, collection development librarians expressed interest in providing access to the e-books in their catalogs. A week after the press release, a catalog librarian made an inquiry on the Batch cataloging listserv2 to discover whether records were available for the newly accessible e-books. An OCLC collection set could be purchased for 2,580 books published through 2008, but the records were for the print version with a link added. In 2008 the library contributing these print records suspended cataloging records for this set because they were changing in their local catalog to the separate record technique and could no longer support adding links to print records. At this time, most libraries use separate records for print and electronic versions of monographs. One respondent to the inquiry contacted NAP and was told that MARC bibliographic records were not available from them, though records could be ordered through NetLibrary or Ebrary. Neither vendor answered inquiries about the availability and cost of a NAP record set. A third vendor was discovered but offered only a subset of the available titles.

Soon, another listserv respondent offered to organize a project to create a list of OCLC record numbers for the NAP e-books that any library could batch search and download. The record batch would therefore be free (that is, no cost above the OCLC subscription), weeded of the duplicates that plague e-book cataloging, and there would be an opportunity to upgrade the records. Doing so at the network level would save individual libraries from the work of batch searching and weeding, as well as the quality control often necessary after loading records into the local catalog. While lacking an explicit cost, the record batch would depend on time volunteered by catalog librarians to create it.

Libraries in general add fewer online open access resources to their catalogs than might be expected. The most commonly added materials in this category are government documents. Most libraries receive government document records in batches through a service such as Marcive or OCLC. Many libraries load records for e-journals in the Directory of Open Access Journals, especially since the records are often available from vendors who provide journal records for electronic resource management systems (ERMs). Some libraries also add selected websites to their catalogs. Although there are many freely available online e-books, these are not often added to library catalogs in large numbers. One reason for this is the lack of organized record batches that could be quickly loaded into the catalog. This is in contrast to vendors, who often provide MARC bibliographic records as an inducement for customers to buy a particular collection of e-books.

However, there are potentially many advantages to including open access e-books in library catalogs. In addition to expanding a library’s collection, open access e-books can also be used as a collection weeding tool3 and libraries will likely want to provide access to the increasing number of open access textbooks.4 Unlike some very large open access e-book collections, the size of the NAP collection is small enough that collaborative batch creation seemed a reasonable, attainable goal.
The creation of a curated record batch would ensure record quality and reduce the burden on any individual library wishing to provide local access to this collection. Additionally, the collection consists of recent and ongoing scholarship, whereas larger freely available collections tend to be dominated by older public domain content. Since all libraries have access to the content, the NAP e-books and similar collections seem ideally suited for large-scale cooperative cataloging of record batches.

Literature Review

Academic consortia most frequently report collaborative work on record sets, usually involving quality control of vendor records. Cary and Ogburn describe the origins of what may be the earliest effort, involving a group of Virginia academic libraries with access to the same content.5 In an attempt to avoid duplication of work, they contacted other consortia, but no other shared cataloging agreements were discovered. Their first project involved a set of vendor records that were improved and shared via file transfer protocol (FTP). Catalog librarians in the consortium differed in skill and experience, and the project revealed “a significant need for training and help in interpreting and applying cataloging rules and standards.” Shieh, Summers, and Day subsequently provided a more detailed account of the same project, including difficulties in loading, record quality, and authority work.6 They note, “further research is needed on administrative implications of cooperative cataloging in consortia, addressing equitable allocation of personnel, scheduling in conjunction with local projects, and cost/benefit for participating institutions.”7

Martin and Mundle relate a consortial effort to improve vendor records through communication rather than through record editing.8 Record problems were reported on a discussion list by libraries in the consortium, then aggregated and forwarded to the vendor. They found that communicating with each other and the vendor to improve records before receipt (by reviewing sample record sets) was the best way to ensure quality metadata, assisted by the added influence of the consortium as opposed to a single library. Contrary to the pre-distribution quality control employed by the NAP project, the authors suggest that libraries may best serve their users by “working to improve accuracy, completeness, and discoverability after access has been established.”9

Chew and Braxton describe an Illinois consortium using a shared system and its effort to establish consortial standards for cataloging electronic resources.10 Among the problems mentioned are vendor restrictions on record sharing and the importance of record identifiers, particularly for vendor records. Preston focuses on how cooperative e-book cataloging work was “organized, negotiated, and divided among project participants” in an Ohio consortium.11 Work on specific record sets was negotiated by members at bimonthly meetings, and was largely dependent on the skills needed. Issues of fairness can arise when only a small minority contributes but all benefit.

Cataloging work can be distributed in a variety of ways. A post on the blog All Things Cataloged described a method used by the Bavarian library network, in which for one year one library ‘adopts’ one e-book package, taking responsibility for improving that package’s metadata (which includes adding subject headings, doing authority work and, where possible, linking print version and e-book).
These automatic and manual improvements are then shared cooperatively.12 Such consortial efforts, however, are rarely shared on a wider scale.

Little research has occurred about record batches for open access e-books. Beall discusses loading a record set for a very large collection of open access e-books (Mbooks, now HathiTrust) into the catalog.13 Records were stripped of metadata due to the requirements of OCLC’s member agreement, made available via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard, then crosswalked back into MARC using the editing software MarcEdit. Records were then improved using global update in the catalog. Despite low metadata quality, the author felt that providing access to the content through the catalog was far more important than record accuracy and completeness.

The Cooperative Online Serials group (CONSER) completed the first year of an Open Access Journal Project begun in April 2010 to catalog any e-journals in the Directory of Open Access Journals (DOAJ) lacking a CONSER record.14 The project recognizes the increasing importance of open access resources as libraries undergo journal cancellation projects and provide support for open access publishing initiatives. Journals in DOAJ must meet certain criteria, such as scholarly content, peer review, and assignment of an ISSN. The DOAJ project is similar to the NAP project in size, collection growth, and multidisciplinary content. Records already exist for most of the content, and the project “decreases duplicative cataloging efforts.” Titles were assigned by cataloging expertise (e.g., language or subject knowledge), and a frequently asked questions (FAQ) page on the project website is provided to assist participants.15 This project was so successful that CONSER libraries signed up for a second round of cataloging new DOAJ titles.

Hellman points to several large open access e-book collections and notes that libraries have not done a good job of including them in their catalogs.16 One effort singled out is the University of Pennsylvania’s Online Books Page, edited by John Mark Ockerbloom.17 Over a million open access e-books are indexed, including large collections such as HathiTrust and Project Gutenberg. The metadata needs for these collections are sometimes great, and “libraries can make significant contributions, especially when they work cooperatively.”18

Project Organization

Once catalogers were in agreement about the significance of the project, attention quickly turned to implementation. Considerable work was required to provide participants with a spreadsheet of the NAP titles. An initial title list was established based on a coverage load for SFX (3,860 titles through some time in 2008). SFX, an Ex Libris product, is best known as an OpenURL link resolver, but it also contains a knowledgebase. The SFX spreadsheet included only minimal metadata: title proper, provider name, and URL. The NAP identifier was extracted from the URL, and student labor was used to fill out the spreadsheet with ISBNs for searching and title availability status. NAP identifiers do not appear to be issued sequentially, and there are unused numbers in the sequence. Newer titles were identified by student workers who checked a range of NAP identifiers on the NAP website.
They began checking with a number that was deemed low enough to have sufficient overlap with the identifiers provided by SFX not to miss anything, and stopped when they encountered a long series of unused numbers. All titles were sorted into the following categories:

1. Available in PDF and assigned an ISBN
2. Available in PDF but not assigned an ISBN
3. Available only in HTML/openbook format
4. Forthcoming and prepublication titles
5. Various categories for which a free e-book is not available

This information was imported into Microsoft Access, which was used to track the status of the project, sort titles into categories, and generate Excel spreadsheets for participants.

Early in the project, there were about eight participants, but when progress was slower than hoped for, a second call for volunteers was made on the Batch and Autocat listservs. The number of participants then swelled to about twenty, though the amount and quality of work varied by participant. Project documentation was prepared by the organizer and evolved through four versions as participants added suggestions (see appendix). Other guidance provided by the organizer included a procedure for batch searching on OCLC Connexion, and directions for using a macro that converted e-book records to the provider-neutral standard.19

The project had no explicit decision-making process, and the project direction depended on e-mail feedback to the organizer’s questions. One discussion centered on whether the URL should be standardized, and if so, which form of it should be used. A wide variety of NAP links have been attached to OCLC records (see table 1). A standard URL would be easier to manipulate in MarcEdit or other batch editing programs. Some forms of the URL lead to NAP’s web reader, where it is less clear that a downloadable PDF is also available. These forms were rejected. A search on OCLC determined the form that was used with the greatest frequency, and this became the standard.

Table 1. Examples of NAP link variation in OCLC

nap.edu/catalog.php?record_id=9999 | General e-book page (selected)
books.nap.edu/catalog.php?record_id=9999 | General e-book page
nap.edu/catalog/9999.html | General e-book page
nap.edu/books/NI000136/html/ | HTML/openbook version
nap.edu/books/030907603X/html/ | HTML/openbook version
nap.edu/openbook.php?record_id=9999 | HTML/openbook version
nap.edu/catalog.php?record_id=9999#toc | HTML version table of contents

Project Workflow

Upon receiving a spreadsheet of 50 titles, participants were to carry out several tasks. One of the most important was identifying the best record for a title and recording its OCLC record number. At the project’s end, this would enable any library to import a text file of all the record numbers into Connexion’s batch searching module to retrieve a record batch that could then be exported to the library’s catalog. The high number of duplicate records made selection difficult. A few participants reported duplicates to OCLC, but this was a very time-consuming process.

Once a record was selected, participants verified that the URL for the NAP version was present in the agreed-upon form and working properly. Other edits ensured that any e-ISBNs, where present, were recorded in the MARC field 020 $a, and that any print ISBNs were in a 020 $z. Participants also checked that headings were authorized, in order to save work at the end of the project.
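The URL and ISBN checks described above were performed in Connexion, but the same review could be scripted over an exported file of MARC records. What follows is a hedged sketch using the pymarc library, not the project's actual macro; the standard URL string is the assumed agreed-upon form from Table 1.

from pymarc import MARCReader

STANDARD = "http://www.nap.edu/catalog.php?record_id="   # assumed standard form

def review_batch(path):
    """Flag records whose 856 lacks the standard NAP link, and surface 020
    fields carrying only $z (print ISBNs) so e-ISBN placement can be reviewed."""
    flagged = []
    with open(path, "rb") as fh:
        for record in MARCReader(fh):
            ocn = record["001"].value() if record["001"] else "?"
            urls = [u for f in record.get_fields("856")
                      for u in f.get_subfields("u")]
            if not any(u.startswith(STANDARD) for u in urls):
                flagged.append((ocn, "NAP URL missing or nonstandard", urls))
            for f in record.get_fields("020"):
                if f.get_subfields("z") and not f.get_subfields("a"):
                    # only print ISBNs present; confirm no e-ISBN was omitted
                    flagged.append((ocn, "020 has $z only", f.get_subfields("z")))
    return flagged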
Some edits were optional, such as making a record provider-neutral or adding Medical Subject Headings (MeSH) and National Library of Medicine (NLM) classification. Macros were available for automating provider-neutral record conversion, and for deriving an original e-book record from a print version. While PDF download does require registration on the NAP site (at a minimum, users must provide an e-mail address), no note about this was added to records. Individual libraries may choose to add a note, if desired, after records are retrieved. Participants were also asked to note special situations or problems.

Some OCLC records needing editing were Program for Cooperative Cataloging (PCC) records, and non-PCC libraries were restricted from editing them. The PCC status of records was noted so that they could be gathered and sent to a PCC participant for editing. A recent change in OCLC policy allows those with a Name Authority Cooperative (NACO) authorization to edit PCC records; this might have improved project workflow had the change taken place before the project began. In a very few cases, an e-book record for a title did not exist, and an original record was created. Name headings (usually for committees or conferences) without authority records (those that could not be “controlled” in OCLC) were also noted for NACO participants to address later.

When all of the spreadsheets had been returned to the organizer, it became apparent that the varying skill levels of the participants had resulted in quality control problems. A follow-up project was begun using selected volunteers. Record numbers were searched in batch mode in OCLC Connexion and sent to the local save file, where a macro was used to identify record errors and anomalies for cleanup. The macro focused on identifying information that would affect the usability of the records, such as lack of a NAP URL in the agreed-upon form and the presence of uncontrolled headings that might not be supported by authority records. Many of the name headings had authority records, but a few participants had not been familiar with the “control” function in OCLC, which links a heading to its authority record. Some participants did not understand that the instructions from the first part of the project asked them to report headings that could not be controlled in the notes column of the spreadsheet so that they could be followed up on. In addition, there were many headings of the form

710 2_ $a National Research Council (U.S.). $b Committee on Fire Research.

where the base part of the heading in $a could be controlled, but the whole heading was not controllable. Some participants did not understand that these should be reported as problems. Because the Connexion macro command CS.IsHeadingControlled identifies these partially controlled headings as controlled, they could not be identified during the second stage.

Although a goal of this project was to support all headings with authority records and control the headings in Connexion, this goal was not met. This is primarily because not all of the uncontrolled headings were flagged at the point where a person was looking at the record, and it is not currently possible to identify them retroactively by automated means. A second obstacle to our goal of comprehensive authority control is that records in OCLC Connexion are not static. Even if a record was complete and accurate at the time one of the participants last edited it, record quality may be enhanced or degraded before a library retrieves it.
Due to record merging, unauthorized headings were introduced into some of the project’s previously cleaned-up records. Completed reviews were reported to the organizer, along with any problems encountered.

A second follow-up project addressed titles that were part of multi-volume sets or had been cataloged as serials in print. While it is common for multi-volume sets to be cataloged as individual volumes as e-books, a decision was made to use set records where the Library of Congress had done so for the print version. However, for cases where the print version was on a serial record, such as Biographical Memoirs, cataloging as individual volumes was thought to be more practical. Authority record creation was also completed during this stage.

OCLC record numbers were then compiled into a text file and uploaded to the web. A separate text file was made available for multi-volume set records. Availability of the files was first announced to project participants for testing before wider distribution. The records downloaded from OCLC based on this list will need some editing by each library for loading into the local catalog. At a minimum, non-NAP URLs present on these provider-neutral records will need to be removed, and information needed for the loading process will have to be added. This takes little time with tools such as MarcEdit, but it is an extra step that will have to be performed; a scripted version of the cleanup is sketched below. The record batch will also be available as a WorldCat collection set available to both OCLC and non-OCLC libraries, though at a cost. Collection set records are pre-processed by OCLC, so if they are properly set up, the records can be loaded immediately. Usage data from text file downloads and WorldCat Collection Set purchases could give an indication of the project’s usefulness.
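For libraries doing this cleanup outside MarcEdit, the removal of non-NAP links could be scripted in the same way; a sketch with pymarc, with hypothetical file names.

from pymarc import MARCReader

def strip_non_nap_urls(in_path, out_path):
    """Copy a batch of records, dropping 856 fields that do not point at nap.edu."""
    with open(in_path, "rb") as fh, open(out_path, "wb") as out:
        for record in MARCReader(fh):
            for field in record.get_fields("856"):
                if not any("nap.edu" in u for u in field.get_subfields("u")):
                    record.remove_field(field)
            out.write(record.as_marc())

# strip_non_nap_urls("nap_batch.mrc", "nap_batch_clean.mrc")

Local load-table information (location codes, proxy prefixes, and the like) would still be added afterward in whatever form the library's ILS expects.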
Future Plans

Plans for updating the record batch are ongoing. According to NAP, about 200 new titles are issued per year. Options for keeping up with new titles include the NAP weekly e-mail newsletter,20 the new books web page21 (although it only lists new books for the last 30 days), and possibly a vendor knowledgebase. Volunteers could be assigned to titles on a monthly basis. However, different skills are required for creating new records (i.e., “original” cataloging). Deriving new records from the print version would not be possible in most cases, since the online version usually precedes the print version. Therefore the pool of qualified volunteers will likely be smaller. Rather than create new records, another option would be to periodically search OCLC for new records entered by others, and add those OCLC record numbers to an updated text file.

An additional problem is that many of the new titles are issued in a prepublication version before being replaced with the published version. This process can take a few months. Should titles be cataloged as prepublication versions, or should catalogers wait for the final publication version? Advantages of the former are that catalogers would be providing timelier metadata for available content that will likely change little. The vast majority of the description, including the URL, would remain the same once a publication version was issued, although NAP does appear to use different ISBNs for the prepublication and published versions. However, records would need to be marked in some way (either through a MARC field or a list kept by the project) so they could be finalized against the publication version, with the “description based on” language and physical description changed. While print prepublication versions continue to exist, online prepublications often do not. While a prepublication PDF could be downloaded by a library, this seems unlikely. If records are not updated, then they will describe a manifestation that no longer exists in its online form, and another record for the final version could be created. NAP notifications for replacement of the prepublication version with the published version could be used to update the record. Closer collaboration with NAP could also help in distinguishing titles with and without PDF versions, and in identifying any removed items.

Discussion

The NAP batch creation project was far more time-consuming than expected, and placed inordinate demands on the organizer. The project began in June 2011, and the OCLC numbers for PDF e-books with ISBNs were distributed in February 2012. Work is ongoing on the NAP e-books for PDFs without ISBNs. It has not yet been decided if the project will incorporate e-books available only in HTML format.

Except for very recent releases, there were e-book records available in WorldCat for all the NAP PDF e-books with ISBNs. The larger problem was sorting through multiple records to identify the best record. The project did not offer explicit guidelines for selection of the best record, although these were implied in the instructions (see appendix), and participants were expected to have sufficient expertise to evaluate records. In e-mails among the volunteers, selecting the record with the most holdings was frequently suggested. As long as a record met the criteria in the instructions, or was upgraded to that standard, record choice was not crucial.

In contrast, a significant proportion of the NAP e-books without ISBNs lack e-book or even print records in WorldCat. Even where records exist, they are often of lesser quality. Completing the process of identifying and upgrading or creating these records requires a different and more advanced skill set than the initial part of the project, and it is not clear that the current pool of volunteers has the necessary resources or is willing to make the required time commitment.

The project proceeded more slowly than anticipated, in part because the organizer lacked time to devote to the project at key points and became a bottleneck in the workflow. Likewise, some participants did not complete as many batches, nor work as quickly, as would have been necessary for a more timely finish. This reflects the reality that many catalog librarians have extensive demands on their time and that this is a volunteer project added to participants' regular duties. Other factors delaying completion were the relatively low number of volunteers and the need for error checking. The second review of the records identified errors made by a few volunteers without the appropriate skills, as well as record problems that are inevitably and inadvertently missed in a project on this scale.

It is difficult, if not impossible, to anticipate the problems encountered in such a project, and some policy changes were made after work had begun. Therefore the editing done by participants was not always consistent from beginning to end. Consistency was also affected by making some edits optional, such as adding the MeSH and NLM classification desired by some participants.

Most batch creation projects will likely require too much work for a single organizer. Duties should be well distributed in order to prevent overload and project bottlenecks.
A model such as the CONSER project for the DOAJ could be implemented by the PCC for open access e-book record sets such as those included in the new Directory of Open Access Books,22 though many skilled catalogers not at PCC institutions would be excluded.

Documentation proved hard to write, and it was even more difficult to get a disparate group of participants to follow it. The project revealed wide variation in the skill and knowledge of its volunteers. No assumptions should be made in this respect, particularly given the lack of an accountability mechanism. This has been a long-standing problem that was noted in the earliest consortial effort in collaborative batch cataloging.23 Though crowdsourcing batch creation through a global cataloging network has tremendous potential, it is difficult to ensure that quality work will result, even when specific guidance is provided. One potential way to improve the existing project directions would be to include more “before and after” examples of editing records, including screenshots. It might also be possible to create a macro for participants to run after editing a record that would alert them immediately to areas of the record needing attention. Of course, this is limited to the sorts of errors that are amenable to automated identification. It would also be wise, although more time-intensive, to initially distribute a few records to participants as a test to see if any cataloging misunderstandings exist, which would enable faster feedback. Spreadsheets would be distributed only after participants had demonstrated the ability to meet the project's standards for record quality. Alternatively, if batch e-book projects were taken on by the PCC, organizational responsibilities could be distributed, and well-trained participants would be ensured.

The ever-changing nature of OCLC’s bibliographic database presents the practical problem of maintaining record consistency. In many of the consortial projects described in the literature review, edits were made to records received from vendors and then distributed directly to consortial members. In this scenario, the quality of the distributed records can be closely controlled. For our project, since the current OCLC record use policy24 does not allow the redistribution of records, we are limited to distributing OCLC record numbers that OCLC members can re-search to obtain the records. The updating of and building upon existing metadata in OCLC records is usually a positive development, for example when different subject vocabularies or genre headings are added to records. However, the extensive record merging taking place in OCLC frequently changes record content, and sometimes for the worse. In addition to the introduction of unauthorized headings mentioned earlier, the wide variety of URLs used in OCLC records to access NAP e-books means that the links probably will not remain standardized.

The pre-distribution quality control employed by this project differs from common practice, where records are improved locally, whether before or after record load.25
One challenge for the local editing model has been lack of a method to batch upload record improvements to the network level once they have been distributed to individual libraries. Locally, efficient batch editing can take place via MarcEdit or global update in an individual catalog, but transferring these improvements to OCLC would require editing individual records one at a time. There is a conflict between the need to quickly load records for immediate access to electronic resources, and the desire to reduce duplication of effort in the editing and authority control of the records. While network-level cataloging work remains the exception rather than the rule, encouraging recent steps have been taken by OCLC toward that ideal. These include the expansion of the pilot Expert Community Experiment into a permanent program, the extension of PCC record editing privileges to NACO members, an algorithm to programmatically perform heading control on new and existing records, and WorldCat Local. At the same time, network-level cataloging is hindered by OCLC’s record use policy and the proliferation of other record sources. It is also difficult or impossible in OCLC’s Connexion interface to do the kinds of efficient batch editing of records supported by MarcEdit and many local ILSs. Newer bibliographic utilities such as SkyRiver and Biblios.net freely share records, but OCLC, MARC record services, and some vendors place restrictions on record sharing. Cataloging may become more network-like yet remain in restricted silos. Truly network-level cataloging will require freely sharable records. Future Research Possibilities for new metadata elements emerged during this project. A code for the identification of open access resources would serve two purposes for libraries. First, it would enable catalogers to find and gather resources for adding to the catalog. Second, if the code was part of the catalog display, education and advocacy about open access in 11 academic libraries might be furthered. Also, a metadata element to indicate publication status could help solve the problem with prepublication versions in this project. This indicator could alert a cataloger that the record description needed updating when the final publication version was available. The result would be faster delivery of electronic content to catalog users. Finally, a metadata element to describe e-book formats is badly needed. The term “e-book” has been used for a wide variety of online textual material, but users need to know whether it is downloadable (and if so in what format) or only readable in a web browser (thus requiring an internet connection). Further work is needed to ensure metadata consistency between formats. In the course of upgrading NAP e-book records, participants often consulted the print record. Print records sometimes contained desirable metadata not present on the e-book record, such as MeSH and NLM classification. Conversely, e-book records often contained contents notes and summaries not present on the print record. It was also more common for e- book records to link to the print version (using the MARC 776 tag) than vice versa. The two formats occasionally presented conflicting metadata in the form of a differing main entry or call number. This metadata divergence for identical content does not help the catalog user. Implementation of FRBR-aware catalogs, in which metadata at the work level can be applied to all formats, may solve this problem. 
FRBR may also help in situations where multi-volume sets were cataloged on a single record in print, but each volume on a separate record as an e-book. The case for open metadata should be made more clearly and forcefully. Freely sharable metadata will mitigate the effects of the competing MARC record silos that are developing. Open metadata may be a requirement for the linked data environment the library world is currently exploring,26 but there is no need to wait until then.
Conclusion
The NAP e-book project represents a unique and successful collaboration resulting in a batch of over 3,500 records available for loading into local catalogs. No other cataloging project, to our knowledge, has been accomplished with such a wide variety of volunteers. While this aspect of the project hindered consistency, future projects can implement suggested controls. The CONSER project for DOAJ could serve as an organizational model. As open access resources increase in numbers and prominence, libraries will need to devote greater attention to metadata for them. Elimination of duplicative record editing is badly needed. Lacking a mechanism to upload a batch of corrected records, this project employed pre-distribution quality control at the individual record level. This quality control was affected by the skills of the volunteers as well as the dynamic nature of a large bibliographic database. Open metadata is needed if network-level cataloging is to be realized. Despite the problems encountered in this project, organized batch creation projects are an effective way to provide access to important collections.
Notes
1. “The National Academies Press Makes All PDF Books Free To Download; More Than 4,000 Titles Now Available Free To All Readers,” accessed February 28, 2012, http://www8.nationalacademies.org/onpinews/newsitem.aspx?RecordID=06022011.
2. Batch listserv, http://listserv.vt.edu/cgi-bin/wa?A0=BATCH.
3. Kirstin Steele, "Free electronic books and weeding," Bottom Line: Managing Library Finances 24(3) (2011): 160-161, http://dx.doi.org/10.1108/08880451111185982.
4. Steven Ovadia, "Open-Access Electronic Textbooks: An Overview," Behavioral & Social Sciences Librarian 30(1) (2011): 52-56, http://dx.doi.org/10.1080/01639269.2011.546767.
5. Karen Cary and Joyce L. Ogburn, “Developing a Consortial Approach to Cataloging and Intellectual Access,” Library Collections, Acquisitions, & Technical Services 24 (2000): 45-51, http://dx.doi.org/10.1016/S1464-9055(99)00095-0.
6. Jackie Shieh, Ed Summers, and Elaine Day, “A Consortial Approach to Cooperative Cataloging and Authority Control: The Virtual Library of Virginia (VIVA) Experience,” Resource Sharing & Information Networks 16(1) (2002): 33-52, http://dx.doi.org/10.1300/J121v16n01_04.
7. Shieh, Summers, and Day, “A Consortial Approach,” p. 48.
8. Kristin E. Martin and Kavita Mundle, “Cataloging E-books and Vendor Records: A Case Study at the University of Illinois at Chicago,” Library Resources & Technical Services 54(4) (2010): 227-237, http://alcts.metapress.com/content/h1455767637633x8/.
9. Martin and Mundle, “Cataloging E-books,” p. 235.
10. Chiat Naun Chew and Susan M. Braxton, “Developing Recommendations for Consortial Cataloging of Electronic Resources: Lessons Learned,” Library Collections, Acquisitions, & Technical Services 29 (2005): 307-325, http://dx.doi.org/10.1016/j.lcats.2005.08.005.
11. Carrie A. Preston, “Cooperative E-Book Cataloging in the OhioLINK Library Consortium,” Cataloging & Classification Quarterly 49 (2011): 257-276, http://dx.doi.org/10.1080/01639374.2011.571147.
12. All Things Cataloged, “Publisher e-book metadata,” (July 7, 2011), accessed February 28, 2012, https://allthingscataloged.wordpress.com/2011/07/07/publisher-e-book-metadata/.
13. Jeffrey Beall, “Free Books: Loading Brief MARC Records for Open-Access Books in an Academic Library Catalog,” Cataloging & Classification Quarterly 47 (2009): 452-463, http://dx.doi.org/10.1080/01639370902870215.
14. CONSER, “Cooperative Open Access Journal Project Planning Group Report, April 30, 2010,” http://www.loc.gov/acq/conser/Open-Access-Report.pdf.
15. CONSER, “Open Access Journal Project FAQ,” updated August 26, 2010, http://www.loc.gov/acq/conser/Open-Access-FAQ.html.
16. E.S. Hellman, “Open Access E-books,” The No Shelf Required Guide to E-book Purchasing, Library Technology Reports 47(8) (2011): 18-27, http://alatechsource.metapress.com/content/r7u235k327mm3q3h/.
17. “The Online Books Page,” edited by John Mark Ockerbloom, accessed February 28, 2012, http://onlinebooks.library.upenn.edu.
18. Hellman, “Open Access E-Books,” p. 24.
19. Program for Cooperative Cataloging, Provider-Neutral E-Monograph MARC Record Guide (prepared by Becky Culbertson, Yael Mandelstam, George Prager, includes revisions to September 2011), accessed February 7, 2012, http://www.loc.gov/catdir/pcc/bibco/PN_Guide_20110915.pdf.
20. “Subscribe to the NAP Newsletter,” The National Academies Press, accessed March 6, 2012, http://www.nap.edu/updates/index.html.
21. “New Releases,” The National Academies Press, accessed March 6, 2012, http://www.nap.edu/new.html.
22. “A New Service for Open Access Monographs: the Directory of Open Access Books,” Open Access Publishing in European Networks, accessed March 5, 2012, http://project.oapen.org/index.php/news/46-doab-press-release.
23. Cary and Ogburn, “Developing a Consortial Approach,” p. 50.
24. “WorldCat Record Use Policy,” OCLC, accessed February 28, 2012, http://www.oclc.org/worldcat/recorduse/default.htm.
25. Elaine Sanchez, Leslie Fatout, Aleene Howser, and Charles Vance, “Cleanup of NetLibrary Cataloging Records: A Methodical Front-End Process,” Technical Services Quarterly 23(4) (2006): 51-71, http://dx.doi.org/10.1300/J124v23n04_04.
26. Raymond Bérard, “Free Library Data?,” Liber Quarterly 20(3/4) (2011): 321-331, http://liber.library.uu.nl/publish/articles/000512/article.pdf.
Appendix
National Academies Press Free Ebook Project
Goals
• Update OCLC ebook master record for free NAP ebooks to include NAP URL in the form http://www.nap.edu/catalog.php?record_id=?????
• Perform basic quality control on the master record
• Compile a list of OCLC numbers that can be used by participants and others to batch search, attach holdings and load the complete set of records into local catalogs
Initial set up
A list of NAP title IDs, titles, and (where available) ISBNs has been prepared covering the free NAP ebooks. Spreadsheets have been prepared that include batch search strings and 856 fields that can be cut and pasted into Connexion records.
Perform batch search on assigned record range
Each participant in this project will receive one or more spreadsheets with a list of titles to search and upgrade. The spreadsheet has columns with various search strategies that can be used for batch or individual searches and a column with URLs for pasting into Connexion. The columns that can be used for batch searches are: ISBN, ISBN limited to records held by Ebrary, and title keyword combined with “National Academ*” as publisher (to pick up National Academy or National Academies), both as a plain search and as a search limited to records held by Ebrary. Each search is limited to mt:cai (cataloged as internet resource) and ll:eng (for English language records). To use for batch searching, select and copy the cells from the column you intend to use and paste the results in a Notepad text file. Separate instructions are provided for using the text file with the selected searches to generate a batch search in Connexion if you are not familiar with this process. Searches from any of these columns can also be copied and pasted into the command line search box and run individually.
Select record to use and make the following edits
If more than one record is retrieved, select the best record. Once you have selected a record to use, make the following edits. Try to select a record that has an LC call number and LC subject headings that appear to be based on the print record.
If the record does not have an LC call number and LC subject headings, add them if possible. (Note: I am finding some records with 588 Description based on print record with no 776 and no print record that I can find in OCLC; these probably should not be cataloged as “based on print record.”)
Choose a record for an online item (no print records with URLs)
Encoding level I or L if possible (upgrade if you feel comfortable)
Check and fix if needed:
008/23/Form = o
006 = m\\\\\\\\d
007 = cr [leave any additional codes if already on the record, but it is not necessary to add them unless it is your local practice]
020: add $z in front of any print ISBNs
050 _4: this is preferred to the 090 for LC call numbers not assigned by LC; ditto for the 060 _4
245 |h = [electronic resource] [if an AACR2 record; there is at least one RDA record in this set]
Control headings if possible
Provider-Neutral tidbits: If you are cataloging as P-N, then you will either be using 500 Title from PDF t.p. (National Academies Press, viewed July 1, 2011) OR 588 Description based on print version record. In either case, with P-N, delete the 538 Mode of access note. [NOTE: The 500 field in the sentence above has now been changed to: “...then you will either be using 588 Description based on online resource; title from PDF t.p. (National Academies Press, viewed...”]
Make sure there are LCSH subject headings in the record; it would be nice to add NLM/MeSH headings if they are available and you have time
Delete any institution-specific or proxied 856 fields
856 40: remember that the second indicator is zero
Add the National Academies Press catalog record URL with $3 for National Academies Press (can copy from spreadsheet; it might be a good idea to make it the first URL). Do not include $z. The URL should be in the form: http://www.nap.edu/catalog.php?record_id=12815
A good practice would be to make this the first URL (since it will be accessible to all users) and to delete other URLs going to the NAP site (so that we have consistent results for manipulation with MarcEdit)
If the record is not provider-neutral, you may choose to make it into a provider-neutral record, but this is not required.
**IMPORTANT** Click on the URL in the 856 and make sure it works.
After making changes, replace the master record.
Update spreadsheet and return to coordinator
Update your copy of the spreadsheet with the relevant OCLC numbers. Just insert the plain OCLC numbers; it is not necessary to add any prefixes such as * or #. Add any questions or concerns or describe any unusual situations in the notes column. For example, it would be useful to note RDA records in the notes column. Return your completed spreadsheet to the coordinator by email. This will be used to compile a list of OCLC numbers that will be distributed to participants and posted publicly.
work_ayjz4dv5pnfubpmozdhjbvst7m ---- Accueil - Institut National de Recherche en Agriculture, Alimentation et Environnement
The Archive Ouverte d'INRAE (HAL INRAE) is intended for the deposit and consultation of the scientific works of the Institut national de recherche pour l'agriculture, l'alimentation et l'environnement. Its objective: to open science across all of its research themes.
work_azqoqjn72rh2deyqipbhwzpgla ---- Title Page
Article: Scalable Decision Support for Digital Preservation
*This version is a voluntary deposit by the author. The publisher's version is available at: http://dx.doi.org/10.1108/OCLC-06-2014-0025
Author Details
Author 1: Christoph Becker, Faculty of Information, University of Toronto, Toronto, Canada
Author 2: Luis Faria, KEEP Solutions, Braga, Portugal
Author 3: Kresimir Duretec, Information and Software Engineering Group, Vienna University of Technology, Vienna, Austria
Acknowledgments: Part of this work was supported by the European Union in the 7th Framework Program, IST, through the SCAPE project, Contract 270137.
Abstract
Purpose – Preservation environments such as repositories need scalable and context-aware preservation planning and monitoring capabilities to ensure continued accessibility of content over time. This article identifies a number of gaps in the systems and mechanisms currently available, and presents a new, innovative architecture for scalable decision making and control in such environments.
Design/methodology/approach – The paper illustrates the state of the art in preservation planning and monitoring, highlights the key challenges faced by repositories in providing scalable decision making and monitoring facilities, and presents the contributions of the SCAPE Planning and Watch suite to provide such capabilities.
Findings – The presented architecture makes preservation planning and monitoring context-aware through a semantic representation of key organizational factors, and integrates this with a business intelligence system that collects and reasons upon preservation-relevant information.
Research limitations/implications – The architecture has been implemented in the SCAPE Planning and Watch suite. Integration with repositories and external information sources provides powerful preservation capabilities that can be freely integrated with virtually any repository.
Practical implications – The open nature of the software suite enables stewardship organizations to integrate the components with their own preservation environments and to contribute to the ongoing improvement of the systems.
Originality/value – The paper reports on innovative research and development to provide preservation capabilities. The results enable proactive, continuous preservation management through a context-aware planning and monitoring cycle integrated with operational systems.
Keywords: Repositories, preservation planning, preservation watch, monitoring, scalability, digital libraries
Scalable Decision Support for Digital Preservation
Christoph Becker, Kresimir Duretec & Luis Faria
1. Introduction
Digital preservation aims at keeping digital information authentic, understandable, and usable over long periods of time and across ever-changing social and technical environments (Rothenberg 1995; Garret & Waters 1996; Hedstrom 1998). The challenge of keeping digital artifacts accessible and usable while assuring their authenticity surfaces in a multitude of domains and organizational contexts. While digital longevity as a challenge is increasingly encountered in domains as diverse as high-energy physics and electronic arts, the repository is still the prototypical scenario where the concern of longevity is of paramount importance, and libraries continue to play a strong role in the preservation community. Repository systems are increasingly made fit for actively managing content over the long run so that they can provide authentic access even after the original creation context, both technical and social, is no longer available. In this process, they have to address two conflicting requirements: the need for trust, a fundamental principle that is indispensable in the quest for long-term delivery of authentic information, and the need for scalability, arising from the ever-rising volumes of digital artifacts deemed worthy of keeping.
Systems that address aspects of preservation include repository software, tools for identification and characterization of digital artifacts, tools for preservation actions such as migration and emulation, systems to address aspects of analysis and monitoring, and preservation planning. It is understood today that automating most aspects of an operational preservation system is a crucial step to enable the scalability required for achieving longevity of digital content on the scales of tomorrow. Such automation is required within components of a complex system, but also needs to address systems integration, information gathering, and ultimately, decision support. The core capabilities that an organization needs to establish cover
● preservation operations, i.e. preservation actions such as emulation, virtualization and migration of digital objects to formats in which they can be accessed by the users, but also object-level characterization, quality assurance and metadata management;
● preservation planning, i.e. the creation, ongoing management and revision of operational action plans prescribing the actions and operations to be carried out as means to effectively safeguard, protect and sustain digital artifacts authentically and ensuring that the means to access them are available to the designated community; and
● monitoring as a sine qua non of the very idea of longevity: most of the risks that need to be mitigated to achieve longevity stem from the tendency of aspects in the socio-technical environment to evolve and sometimes change radically. Without the capability to sustain a continued awareness of a preservation system and its environment, preservation will not achieve its ultimate goal for long. Monitoring focuses on analyzing information gathered from different sources, both internal and external to the organization, to ensure that the organization stays on track in meeting its preservation objectives (Becker, Duretec, et al. 2012).
Such awareness needs to be based on a solid understanding of organizational policies, which provide the context for preservation.
In general terms, it can be said that policies "guide, shape and control" decisions taken within the organization to achieve long-term goals (Object Management Group 2008; Kulovits et al. 2013b). Monitoring, policy making and decision making processes are guided by information on a variety of aspects ranging from file format risks to user community trends, regulations, and experience shared by other organizations. Sources that provide this kind of information include online registries and catalogues for software and formats, or technology watch reports of recognized organizations. These are increasingly available online, but the variety of structures, semantics and formats has so far prohibited truly scalable approaches to utilizing the knowledge gathered in such sources for effective decision support (Becker, Duretec, et al. 2012).
However, the key challenge confronting institutions worldwide is precisely to enable digital preservation systems to scale cost-efficiently and effectively in times where content production is soaring, but budgets are not always commensurate with the volume of content in need of safeguarding. Recent advances in using paradigms such as MapReduce (Dean & Ghemawat 2004) to apply distributed data-intensive computing techniques to the content processing tasks that arise in repositories show a promising step forward for those aspects that are inherently automated in nature. But ultimately, for a preservation system to be truly scalable as a whole, each process and component involved needs to provide scalability, including business intelligence and decision making. Here, the decision points where responsible stakeholders set directions and resolve the tradeoff conflicts that inevitably arise need to be isolated and well supported.
Planning and monitoring as key functions in preservation systems have received considerable attention in recent years. The preservation planning tool Plato has shown how trustworthy decisions can be achieved (Becker et al. 2009). Its application in operational practice has advanced the community's understanding of the key decision factors that need to be considered (Becker & Rauber 2011a), and case studies have provided estimates of the effort required to create a preservation plan (Kulovits et al. 2013a). Finally, the systematic quantitative assessment of preservation cases can provide a roadmap for automation efforts by prioritizing those aspects that occur most frequently and have the strongest impact (Becker, Kraxner, et al. 2013). However, creating a preservation plan in many cases is still a complex and effort-intensive task, since many of the required activities have to be carried out manually. It is difficult for organizations to share their experience in a way that can be actively monitored by automated agents and effectively used by others on any scale. Automated monitoring in most cases is restricted to the state of internal storage and processing systems, with little linking to preservation goals and strategies and scarce support for continuously monitoring how the activities in a repository and its overall state match the evolving environment. Finally, integrating whatever solution components an organization chooses to adopt with the existing technical and social environment is difficult, and integration of this context with strategies and operations is challenging (Becker & Rauber 2011c).
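To make the appeal of such data-parallel processing concrete, the following sketch simulates the MapReduce pattern for one typical repository task: aggregating a format distribution from per-object characterization records. This is a minimal illustration in plain Python, not the actual SCAPE implementation (which runs such jobs on a distributed platform); the record fields are assumptions.

```python
# Minimal simulation of the MapReduce pattern for a repository-scale
# characterization task: counting format occurrences across objects.
from itertools import groupby
from operator import itemgetter

objects = [  # stand-in for characterization output, one record per object
    {"id": "obj-1", "format": "fmt/18", "size": 40500},   # e.g. PDF 1.4
    {"id": "obj-2", "format": "fmt/44", "size": 612300},  # e.g. JPEG
    {"id": "obj-3", "format": "fmt/18", "size": 88100},
]

def map_phase(record):
    """Emit (key, value) pairs; here: one count per format identifier."""
    yield (record["format"], 1)

def reduce_phase(key, values):
    """Aggregate all values emitted for one key."""
    return (key, sum(values))

# Shuffle/sort: group intermediate pairs by key, as a MapReduce runtime would.
intermediate = sorted(
    (pair for rec in objects for pair in map_phase(rec)),
    key=itemgetter(0),
)
profile = dict(
    reduce_phase(key, (v for _, v in group))
    for key, group in groupby(intermediate, key=itemgetter(0))
)
print(profile)  # {'fmt/18': 2, 'fmt/44': 1}
```

Because both phases are side-effect free, the map work can be partitioned across machines and the reduce work across keys, which is what makes the pattern attractive for the content processing tasks mentioned above.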
This article presents an innovative architecture for scalable decision making and control in preservation environments, implemented and evaluated in the real world. The SCAPE Planning and Watch suite builds on the preservation planning tool Plato and is designed to address the challenges outlined above. It makes preservation planning and monitoring context-aware through a semantic representation of key organizational factors, and integrates this with a sophisticated new business intelligence tool that collects and reasons upon preservation-relevant information. Integration with repositories and external information sources provides powerful preservation capabilities that can be freely integrated with virtually any repository or content management system. The new system provides substantial capabilities for large-scale risk diagnosis and semi-automated, scalable decision making and control of preservation functions in repositories. Well-defined interfaces allow a flexible integration with diverse institutional environments. The free and open nature of the tool suite further encourages global take-up in the repository communities.
The article synthesizes and extends a series of articles reporting on partial solution blocks to this overarching challenge (Becker & Rauber 2011c; Becker, Duretec, et al. 2012; Faria et al. 2012; Petrov & Becker 2012; Kulovits et al. 2013a; Kulovits et al. 2013b; Faria et al. 2013; Kraxner et al. 2013). Besides pulling together the compound vision and value proposition of the integrated systems and providing additional insight into the design goals and objectives, we conduct an extensive evaluation based on a controlled case study, outline the interfaces of the ecosystem to enable integration with arbitrary repository environments, discuss implications for systems integration, and assess the improvement of the provided system over the state of the art in terms of efficiency, effectiveness, and trustworthiness.
The article is structured as follows. The next section illustrates the state of the art in preservation planning and monitoring and highlights the key challenges faced by repositories in providing scalable decision making and monitoring facilities. Section 3 presents the key goals of our work and the main conceptual solution components that are developed to address the identified challenge. We outline common vocabularies and those aspects of design pertaining to future extensions of the resulting preservation ecosystem. Section 4 presents the suite of automated tools designed and developed to improve decision support and control in real-world settings. Becker et al. (2015) discuss the improvements of the presented work and identified limitations, based on a quantitative and qualitative evaluation of the advance over the state of the art, including a case study with a national library.
2. Digital preservation: Background and challenges
2.1 Digital preservation and repositories
The existing support for active, continued preservation in the context of digital repositories can be divided into several broad areas: repository software, tools for identifying and characterizing digital objects, tools for preservation actions (migration and emulation) and quality assurance, and systems for preservation monitoring and preservation planning.
The main goal of repository software is to provide capabilities for storing any type of content and accompanying metadata, managing those data, and providing search and retrieval options to the user community.
Rapidly growing demands for storing digital material are shifting development trends towards more scalable solutions, aiming to provide fully scalable systems capable of managing millions of objects. Even though it is a very important aspect, scalability is only one of the dimensions that need to be effectively addressed in digital repositories. Many repositories are looking for ways to endow their systems with the capabilities to ensure continued access to digital content beyond the original creation contexts. Replacing an entire existing repository system is rarely a preferred option. It may not be affordable or sustainable, but it will also often not solve the problem, since the organizational side of the problem needs to be addressed as well and preservation is inherently continuous in nature. In a recent survey, a majority of organizations looking for preservation solutions stated that they are looking for mix-and-match solution components that can be flexibly integrated and support a stepwise, evolutionary approach to improving their systems and capabilities. There is a strong preference in the community for open-source components with well-defined interfaces, fitting the approach preferred by most organizations (Sinclair et al. 2009).
The importance of file properties and file formats in digital preservation has resulted in broad research and the development of tools and methods for file analysis and diagnosis. According to their functionality, such tools can be divided into three categories: identification (identifying the format of a file), validation (checking the file's conformance with the format specification) and characterization (extracting object properties) (Abrams 2004). Probably the best known identification tool is the Unix command file. Further examples are the National Archives' DROIDi tool and its siblings such as fidoii. Characterization tools differ in performance characteristics as well as in feature and format coverage. Some of the most used and cited examples are the JSTOR/Harvard Object Validation Environment JHoveiii and its successor JHove2, and the eXtensible Characterization Languages (XCL) (Thaller 2009). Apache Tikaiv combines fast performance with a coverage that extends beyond mere identification to the extraction of various object features. Acknowledging that a single tool cannot cover all formats and the entire feature space, the File Information Tool Set (FITS)v combines other identification and characterization tools such as DROID and JHove and harmonizes their output, covering more file formats and yielding a richer feature space as a result.
Some efforts have been reported on aggregating and analyzing such file statistics for preservation purposes. Most approaches and tools demonstrated thus far focus solely on format identification (Knijff & Wilson 2011; Hutchins 2012). Brody et al. (2008) describe PRONOM-ROAR, an aggregation of format identification distributions across repositories. Today, automatic characterization and metadata extraction is supported by numerous tools. The SCAPE project is packaging such components into discoverable workflows, thereby providing the possibility to automatically discover, install and run those toolsvi. This encompasses migration actions, characterization components, and quality assurance.
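As a minimal illustration of how such identification tools are typically wrapped and their results aggregated, the following sketch invokes the Unix file command over a collection and counts the reported MIME types. It is a simplified stand-in written in the spirit of the tools above: a production workflow would use DROID, fido or FITS and record PRONOM identifiers and richer features. The directory path is a placeholder.

```python
# Wrap an identification tool (here: the Unix `file` command) and aggregate
# its per-file results into a format distribution for a whole collection.
import subprocess
from collections import Counter
from pathlib import Path

def identify(path: Path) -> str:
    """Return the MIME type reported by `file --mime-type -b <path>`."""
    result = subprocess.run(
        ["file", "--mime-type", "-b", str(path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def identify_collection(root: str) -> Counter:
    """Identify every file under `root` and count MIME type occurrences."""
    counts = Counter()
    for p in Path(root).rglob("*"):
        if p.is_file():
            counts[identify(p)] += 1
    return counts

if __name__ == "__main__":
    # "./content" is a hypothetical collection directory.
    for mime, n in identify_collection("./content").most_common():
        print(f"{n:8d}  {mime}")
```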
Quality assurance refers to the ability to deliver accurate measures about the quality of digital objects, in particular to ensure that preservation actions have not damaged the authenticity of an object's performance (Heslop et al. 2002). The SCAPE project is addressing that question by providing a number of tools for image, audio, video and web quality assurance (Pehlivan et al. 2013). Furthermore, it is packaging those components into discoverable workflows, thereby providing the facilities to discover and invoke these tools.
The metadata collected by different identification and characterization tools helps with managing objects more efficiently and effectively. The real power of such data becomes especially visible when visual aggregation and analysis methods are used. Jackson (2012) presented a longitudinal analysis of format identification over time in the UK web. Even though this study used only identification data, the resulting statistics on the evolution of different formats over time can yield significant insights into format usage trends and obsolescence. There is a clear need for a broad, systematic approach to large-scale, feature-rich content analysis to support business intelligence methods in extracting important knowledge about the content and formats stored in a repository.
This is also a key enabler for successful preservation planning, one of the six functional entities specified in the OAIS model (CCSDS 2002). Its goals are to provide functions for monitoring internal and external environments and to provide preservation plans and recommendations that will ensure information accessibility over a longer period of time. Viewed as an organizational capability, the two main sub-capabilities are Operational Preservation Planning and Monitoring (Antunes et al. 2011). The planning tool Plato (Becker et al. 2009) is up to now the best known implementation of an operational preservation planning method. It provides a 14-step workflow that guides a preservation planner in making decisions about the actions performed on digital content. The result of a planning process is a trustworthy and well documented recommendation which identifies the optimal action from a defined set of alternatives according to specified objectives and requirements. These plans are not strategic plans guiding the organization's processes and activities, but operational specifications for particular actions to be carried out, with exact directives on how they shall be carried out. Even though Plato offers a great deal of automation, some steps in the workflow require significant manual work. Kulovits et al. (2009) showed that in 2009, a typical use case involved several people for about a week, including a planning expert to coach them.
Preservation monitoring shows a comparable gap in automated tool support. Current activities usually result in technical reports, such as (Lawrence et al. 2000), DigiCULTvii and Digital Preservation Coalition periodic reportsviii, or file format and tool registries (PRONOMix, the Global Digital Format Registry (GDFR)x, the Unified Digital Format Registry (UDFR)xi, the P2 registryxii, and others). Technical reports function on a principle of periodically publishing documents about available formats and tools. They are meant for human reading and support no automation. Registries such as PRONOM are shared and potentially large, but very often do not provide in-depth information.
They have difficulties in ensuring enough community contributions and, where those contributions exist, they are often sparse and dispersed across different registries. Moderation of such contributions through a closed, centralized system has proven notoriously difficult, which has led to increasing calls for a more open ecology of information sources (Becker & Rauber 2011a; Pennock et al. 2012)xiii.
An early attempt to demonstrate automation in preservation monitoring was PANIC (Hunter & Choudhury 2006). The goal was to provide a system that would periodically combine the metadata from repositories with the information captured from software and format registries in order to detect potential preservation risks and provide recommendations for possible solutions. The initiative to develop an Automatic Obsolescence Notification Service (AONS) (Pearson 2007) aimed at providing a service that would automatically monitor the status of file formats in a digital repository against format risks collected in external registries. Unfortunately, the dependency on external format registries to provide information for a wider range of file formats was a limitation for AONS, which caused it to monitor only a limited amount of information.
Preservation actions need to be carefully chosen and deployed to ensure they in fact address real issues and provide effective and efficient solutions. There is an increasing awareness and understanding of the interplay of preservation goals and strategies, tools and systems, and digital preservation policies. Policies are often thought to provide the context of the activities and processes an organization executes to achieve its goals, and hence the context for the preservation planning and monitoring processes described. Yet, in digital preservation, the term "policies" is used ambiguously; often, it is associated with mission statements and high-level strategic documents (Becker & Rauber 2011c). Representing these in formal models would lead to only limited benefit for systems automation and scalability, since they are intended for humans. On the other hand, models exist for general machine-level policies and business policies. However, a deep domain understanding is required to bring clarity into the different levels and dimensions at hand. This should be based on an analysis of the relevant drivers and constraints of preservation. A driver in this sense is an "external or internal condition that motivates the organization to define its goals" (Object Management Group 2010), while a constraint is an "external factor that prevents an organization from pursuing particular approaches to meet its goals" (Object Management Group 2010).
Common examples of preservation policies are on the level of statements in TRAC (OCLC and CRL 2007), ISO 16363 (ISO 2010), or statements in Beagrie et al. (2008). These are well known, but their impact is not always well understood, and operations based on them can be quite complex to implement. Moreover, there is no recognized model for formalizing preservation policies in a standard way. Providing such context for preservation planning, monitoring and operations, however, is key to successful preservation. So far, context has been provided implicitly as part of decision making, adding a burden on decision makers and threatening the quality and transparency of planning and actions. These policies correspond to what the OMG standards call "business policies".
The OMG has been active in modeling and standardizing this concept for many years and has produced in particular two valuable standards: the Business Motivation Model (Object Management Group 2010) and the Semantics of Business Vocabulary and Business Rules (SBVR) (Object Management Group 2008). According to these, policies are non-enforceable elements of governance that guide, shape and control the strategies and tactics of an organization. An element of governance is an "element of guidance that is concerned with directly controlling, influencing, or regulating the actions of an enterprise and the people in it". Enforceable means that "violations of the element of governance can be detected without the need for additional interpretation of the element of governance" (Object Management Group 2008).
There are various levels of policy statements required in a digital preservation environment. While the DP community has specified criteria catalogs for trustworthy preservation systems, these fail to separate concerns and distinguish between strategic goals, operational objectives and constraints, and internal process metrics. The relationship between these is often vague. Compliance monitoring in operational preservation systems is restricted to generic operations and does not align well with the business objectives of providing understandable and authentic access to information artifacts. The lack of clarity, separation of concerns, formalism and standardization in regulations for DP compliance means that operationalizing such compliance catalogs is very difficult, and verification of compliance is manual and either limited to abstract high-level checks on a system's design or inherently subjective.
2.2 On trust and scalability
Preservation planning methods and tools such as Plato have evolved considerably from their origins (Strodl et al. 2006). It is worth recalling here the two fundamental dimensions along which such evolution could take place – dimensions set by the decision space in which these methods are designed to operate. The key requirements, not at all compatible at first sight, are trust and scalability.
Trust, a requirement hardly disputed, mandates organizations to strive for transparency, accountability, and traceability, traits evidently recommended by standards such as the Repository Audit and Certification checklist (ISO 2010). Achieving trust requires a carefully designed environment that promotes transparency of decisions, ensures full traceability of decision chains, and supports full accountability. Scalability, on the other hand, is mandated by the sheer volumes of content pouring into repositories in the real world, and calls for automation, reduced interaction, simplified decisions, and the removal of human intervention wherever possible. Scalability calls for automated actions applied in standardized ways with minimized human intervention. Trust, on the other hand, mandates that any automated action is fully validated prior to execution, providing an assessment trail against the objectives specified by the organization that is supported by real-world evidence. Preservation planning methods and tools such as Plato have come a long way along the path of trustworthy decision making, but by the very nature of the task have difficulties in making progress on the dimension of scalability.
Considerable effort is commonly required for taking trustworthy decisions, as well as for creating, structuring and analyzing the underlying information that is the input for the decision making process. Until now, this has often meant that organizations fail to move from hardly trustworthy ad-hoc decision making to fully trustworthy, well-documented preservation planning (Becker & Rauber 2011c; Kulovits et al. 2013a).
2.3 Challenges and goals
The preservation of digital content requires that continuous monitoring, planning and the execution of corrective actions work together towards keeping the content authentic and understandable for the user community and compatible with the external environment and restrictions. However, many institutions carry out these processes in a manual and ad-hoc way, completely detached from the content lifecycle and without well-defined points of interoperability. This limits the ability to integrate and scale preservation processes in order to cope with the escalating growth of content volume and heterogeneity, and it undermines the capacity of institutions to provide continued access to digital content and preserve its authenticity. We observe that there are a number of gaps in the means currently available to institutions:
1. Business intelligence mechanisms are missing that address the specific needs of preservation over time and enable organizations to monitor the compliance of their activities to goals and objectives as well as risks and opportunities. Similarly, organizations lack the scalable tools to create feature-rich profiles of their holdings to support this monitoring and analysis process. There are no accepted ways to address the need for continuous awareness of a multitude of key factors prone to change, including user communities, available software technology, costs, and risks, to provide a unified view on the alignment of an organization's operations to goals and needs. While the community is eager to share the burden and promote collaboration, it is notoriously difficult for organizations to effectively do so.
2. Knowledge sharing and discovery at scale is not widely practiced, since there is no common language, no effective model, and little clarity as to what exactly can and should be shared and how. Hence, sharing is practiced on an ad-hoc and peer-to-peer basis, with little scalable value for the wider community.
3. Decision making efficiency needs to be improved without sacrificing transparency and trustworthiness. This requires not only more efficient mechanisms built into decision making tools, but also a more explicit awareness of an organization's context.
4. Preservation policies are a key factor to achieve this and have been notoriously difficult to pin down. In this context, it is important to understand policies as 'elements of guidance that shape, guide and control' (Object Management Group 2008) the activities of an organization, so that the core aspects can be formalized and understood by decision support and business intelligence systems.
5. Systems integration, finally, is chronically difficult and only successful where modular components with clearly defined purpose and well-specified interfaces are provided in the place of monolithic, custom-built solutions.
It becomes clear that establishing such capabilities cannot simply be solved by introducing a new software tool, but requires careful consideration of the socio-technical dimensions of the design problem.
Designing a set of means to address these issues requires a solid understanding of socio-technical environments and a flexible suite of methods and tools that can be customized, integrated and deployed in a real-world context to address the issues pertaining to a particular situation. The following section will discuss each of the design challenges in turn and derive a set of overarching design goals. Based on these, we will present the main concepts and solution components that form the main contribution of our work and discuss how they can be used in isolation or conjunction to improve the state of the art in scalable decision making and control.
3. Scalable, context-aware Preservation Planning and Watch
3.1 Overview
Based on the observations outlined above, this section derives a number of design goals to be addressed in order to enable scalable decision making and control for information longevity, while further advancing the progress made on the path of trust, in a form that can make substantial real-world impact for a variety of organizations. Based on a new perspective that emphasizes the continuous nature of preservation, we describe an architectural design for trustworthy and scalable preservation planning and watch. Section 4 discusses the implementation of the architecture in the SCAPE Planning and Watch suite.
3.2 Design goals
Systematic analysis of digital object sets is a critical step towards preservation operations and a fundamental enabler for successful preservation planning: without a full understanding of the properties and peculiarities of the content at hand, informed decisions and effective actions cannot be taken. While large-scale format identification has been in focus for a while and tools for in-depth feature extraction exist, little work has been shown that combines in-depth analysis and large-scale aggregation into content profiles that are rich in information content and large in size.
G1: Provide a scalable mechanism to create and monitor large and rich content profiles.
For successful preservation operations, a preservation system needs to be capable of monitoring compliance of preservation operations to specifications, alignment of these operations with the organization's preservation objectives, and associated risks and opportunities that arise over time. Achieving such a business intelligence capability for preservation requires linking a number of diverse information sources and specifying complex conditions. Doing this automatically in an integrated system should yield tremendous benefits in scalability and enable sharing of preservation information, in particular risks and opportunities.
G2: Enable monitoring of operational compliance, risks and opportunities.
The preservation planning framework and tool Plato provide a well-known and solid approach to creating preservation plans. However, a preservation plan in Plato 3 is constructed largely manually, which involves substantial effort. This effort is spent in analyzing and describing the key properties of the content that the plan is created for; identifying, formulating and formalizing requirements; discovering and evaluating applicable actions; taking a decision on the recommended steps and activities; and initiating deployment and execution of the preservation plan. When automating such steps, trustworthiness must not be sacrificed for efficiency.
Still, the efficiency of planning needs to be improved to the point where creating and revising operational plans becomes an affordable, routine part of the work of organizations responsible for safeguarding content, and is understood well enough that it can potentially be offered as a service.
G3: Improve efficiency of trustworthy preservation planning.
For decision support and monitoring systems to be truly useful, they need to be aware of the context in which they are operating. That includes an awareness of the organizational setting and the state of the repository so that they can assess risks and identify issues that need intervention, but it extends to an awareness of the world outside the repository to ensure these systems can provide this assessment also with respect to the larger context in which the repository operates. So far, it has been very difficult to make the organizational context known to the systems in a way that enables them to act upon it. The planning tool Plato 3, for example, requires the decision makers to model their goals and objectives in a tree structure; but it is not directly aware of other organizations' goals and objectives. Similarly, the context awareness of systems such as PANIC is very limited. Most importantly, hence, preservation systems need to be endowed with an awareness of the context in which they shall keep content alive. This includes the organizational goals and objectives, constraints, and directives that shape and control the preservation operations of a repository. Such an awareness of the context requires a formalized representation of organizational constraints and objectives and a controlled vocabulary for representing the key entities of the domain. Given the evolutionary nature of the world in which preservation has to operate, such a vocabulary needs to be permanent, modular and extensible.
G4: Make the systems aware of their context.
Preservation planning focuses on the creation of preservation plans; preservation watch focuses on gathering and analyzing information; operations focus on the actual processing of data and metadata. These methods and tools will in general be deployed in conjunction with a repository environment. This requires open interfaces and demonstrated integration patterns in order to be useful in practice. We hence need a system architecture that is based on open interfaces, well-understood components and processes, open data and standard vocabularies, but is also able to be mixed and matched, extended, and supportive of evolution over time. Components in an open preservation ecosystem need to use standards and appeal beyond digital preservation to enable growth and community participation. They should be built around a simple core, with the goal to connect and enable rather than impose and restrict. The preservation community is painfully aware how important sustainable evolution is for their systems, as emphasized by a recent discussionxiv. Correspondingly, the ecosystem in question should be built with sustainability in mind.
G5: Design for loosely-coupled preservation ecosystems.
Clearly, addressing the sum of these goals requires a view on the preservation environment that focuses on the continuous, evolving nature of information longevity as a sustained capability rather than a one-time activity. The following section presents such a view, focusing on the preservation lifecycle and its key components.
3.3 The preservation lifecycle
Figure 1: Digital preservation lifecycle
Figure 1 shows a view on the key elements in a preservation environment that relates the key processes required to successfully sustain content over time to each other. The preservation lifecycle naturally starts with the repository and its environment and evolves around the continuous alignment of preservation activities to the policies and goals of the organization.
The Repository is an instance of a system which contains the digital content and may comprise processes such as ingest, access, storage and metadata management. The Repository may be as simple as a shared folder with files that represent the content, or as complex as dedicated systems such as DSpacexv, Eprintsxvi and RODAxvii. The Repository refers not only to the generic software system but also to its instantiation within an institution, related to an institutional purpose guided and constrained by policies that define objectives and restrictions for its content and procedures. In the context of this article, the preservation policies drive how the Repository must align to its context, environment and users, guiding digital preservation processes such as Watch, Planning and Operations.
The alignment of the content and the activities of a repository to its context, environment and users is constantly monitored by Watch to detect preservation risks that may threaten continuous and authentic access to the content. This starts by obtaining an understanding of what content the repository holds and what the specific characteristics of this content are. This process is supported by the characterization of content and allows a content owner to be aware of volumes, characteristics, format distributions, and specific peculiarities such as digital rights management issues and complex content elements. The characterization process feeds the aggregated set of key characteristics of the monitored content, i.e. the content profile, into the Watch process. This is depicted as the 'monitored content' in Figure 1. Repository events such as ingest or download of content are monitored by the Watch process, as they can be useful for tracking producer and consumer trends and can uncover preservation risks. The Watch process cross-relates the information that comes from internal content characterization and repository events with the institutional policies and the external information about the technological, economic, social and political environment of the repository, allowing for the identification of preservation risks and opportunities. For example, checking the conformance of content with the owner's expectations or policies, identifying format or technological obsolescence in content, or comparing the content profile with other repositories can reveal possible preservation risks, but also opportunities for actions and possibilities to improve efficiency or effectiveness. These possible risks and opportunities should be analyzed by Planning to devise a suitable response.
The Planning process carefully examines the risks or opportunities, considering the institution's goals, objectives and constraints. It evaluates and compares possible alternatives and produces an action plan that defines which operations should be implemented and which service levels have been agreed on, and documents the reasoning that supports this decision (Becker et al. 2009).
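The interplay just described can be made concrete with a small sketch: a monitoring condition cross-relates a content profile with a policy constraint, and a detected risk is handed to planning, which responds with an action plan. All data structures and names below are hypothetical simplifications rather than the actual SCAPE interfaces; in reality, the plan would be produced by the evidence-based Plato workflow rather than a single function.

```python
# Illustrative sketch of the Watch-to-Planning hand-over in the lifecycle.
content_profile = {"fmt/18": 120_000, "x-fmt/111": 3_400}  # format -> count

policy = {  # a machine-checkable constraint derived from preservation policy
    "allowed_formats": {"fmt/18", "fmt/44"},
}

def detect_risks(profile, policy):
    """Flag every format in the holdings that violates the policy."""
    return [
        {"risk": "format not allowed by policy", "format": fmt, "objects": n}
        for fmt, n in profile.items()
        if fmt not in policy["allowed_formats"]
    ]

def plan_for(risk):
    """Stand-in for the planning process: decide on a mitigating action and
    document the trigger, so the decision remains traceable."""
    return {
        "trigger": risk,
        "action": "migrate",                # the chosen alternative
        "target_format": "fmt/18",          # the decision outcome
        "quality_assurance": ["compare significant properties"],
    }

for risk in detect_risks(content_profile, policy):
    action_plan = plan_for(risk)
    print(action_plan)
```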
The resulting action plan is deployed to the Operations process, which orchestrates the execution of the necessary actions on the repository content, if necessary in a large-scale distributed fashion, and integrates the results back into the repository. These operations can include characterization, quality assurance, migration and emulation, metadata management, and reporting. The Operations process should provide information about executed actions, such as quality assurance measurements, to the Watch process to ensure that the results conform to the expectations set out in the action plan. All the conditions about internal and external information considered as decision factors by Planning should be continuously monitored, so that the organization knows whether active plans remain aligned and valid over time. Once a condition is detected that may invalidate a plan, Planning should be called upon to re-evaluate the plan.

This perspective on digital preservation as a set of processes or capabilities that interact with each other to achieve the digital preservation objectives has evolved considerably over the last decade. From the oft-cited standard model in the domain, the OAIS (CCSDS 2002), which emphasizes a functional decomposition of the elements in an archive, the perspective evolved to the capability-based view of the SHAMAN Reference Architecture (Antunes et al. 2011), which grounded the model strongly in Enterprise Architecture foundations and thus integrated the domain knowledge of preservation with a holistic view of the organizational dimensions. However, neither presents a specific view on how these processes can align with each other in practice, allowing the flow of information from one process to the next. The streamlined view illustrated in Figure 1 forms a lifecycle that ensures digital content in repositories is continuously adapted to the environment, the target users and institutional policies.

Considering the above, it becomes clear that the optimization of efficiency (whether of performance, cost or effort) must occur not only within each process, and not only in the scalable processing of data, but also at the integration points between the processes and in the decision functions themselves, so that the whole preservation lifecycle becomes efficient and sustainable. Finally, many of the activities in these processes require sophisticated tool support to be applicable in a real-world environment.

3.4 An architecture for loosely-coupled preservation systems

Achieving a full preservation lifecycle requires a set of components that implement the digital preservation processes and interoperate with each other in an open and scalable architecture. Figure 2 shows the set of components that are required to support and partially automate the processes necessary to sustain the preservation lifecycle. These need to be designed to be modular and flexible, have clearly distinguished functionalities, and fit the technical specifications of the institutional context.

Figure 2: Overall architecture of scalable planning and watch

The Content profiler has the function of aggregating and analyzing content characteristics and producing a well-specified content profile that provides a meaningful and useful summary of the relevant aspects of the content. This component has to cope with large amounts of data and support the watch and planning components by summarizing the important aspects into a content profile, exposed via the Content profile interface.
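To make the notion of a content profile concrete, the following sketch shows one possible shape for the summary such a component could expose. The field names and the summary method are illustrative assumptions, not the actual profile format used by the suite (discussed in Section 4.2); the example numbers echo the collection shown later in Figure 8.

```python
from dataclasses import dataclass, field

@dataclass
class ContentProfile:
    """Illustrative shape of a content profile; field names are assumptions."""
    object_count: int
    total_bytes: int
    format_distribution: dict                          # e.g. {"image/tiff": 38000}
    peculiarities: list = field(default_factory=list)  # e.g. DRM, complex objects
    samples: list = field(default_factory=list)        # representative objects

    def summary(self) -> str:
        top = max(self.format_distribution, key=self.format_distribution.get)
        return f"{self.object_count} objects, dominant format: {top}"

profile = ContentProfile(
    object_count=42_000,
    total_bytes=23 * 1024**3,
    format_distribution={"image/tiff": 38_000, "application/pdf": 4_000},
    peculiarities=["DRM-protected PDFs"],
)
print(profile.summary())
```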
The Watch component has the function of collecting the content profile and other aspects in order to provide the business intelligence functionality necessary for monitoring and alignment. By gathering information from diverse sources relevant to preservation, it enables the organization to monitor compliance, risks and opportunities, based on monitoring conditions that can be specified in the corresponding interface. For example, it provides the means to answer questions such as "How many organizations have content in format X?" or "Which software components have been tested successfully in analyzing objects in format Y?" (Becker, Duretec, et al. 2012). The component should be able to raise events when specified conditions are met. Interested clients provide a Notification interface to receive such events.

The Planning component is hence informed about conditions that require mitigation. Its key function is to support the creation, revision and deployment of trustworthy and actionable preservation plans. To achieve this, it needs to retrieve the complete content profile, potentially access data from the repository to obtain sample sets to experiment with for evaluation purposes, and use the Plan Management interface to initiate the execution of the actions specified in the plan.

A Repository should be able to integrate with this preservation lifecycle architecture by implementing a set of interfaces xviii: Data Management enables basic operations on the data held by the repository to ensure controlled access; Event Reporting ensures that Watch can be informed about the status of operations and repository activities (Becker, Duretec, et al. 2012); and Plan Management provides the facilities to create and update preservation plans and initiate their deployment. To coordinate the sometimes complex activities and processes that are executed by Operations, a Workflow engine can be used to execute the preservation action plan, i.e. the set of all actions and quality assurance tasks that compose the execution of a plan on the content. The Data Management interface can also be used to merge the results of executing the action plan back into the repository.

Interoperability between components is achieved via well-defined interfaces that decouple clients from the specific implementation of each component and allow the reuse, replacement and easier maintenance of each component. The interfaces are open, in order to support different component implementations, in particular different repository implementations; a minimal sketch is given below. A key goal of such open interfaces is to enable continuous growth of the systems through community participation. However, standardization in this area needs to go one step further and support semantic interoperability of the components. Components need to be aware of the context they are operating in, and this context needs to be communicated and mapped between components. Information exchanged between these components needs to be opened up to the community to build synergies, enable knowledge discovery, and move from static to dynamically growing information sources. The next section describes the mechanisms designed to support this.
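Before turning to those mechanisms, here is the minimal sketch of the open interfaces announced above, written as Python protocol classes. The method names and signatures are assumptions chosen to mirror the interfaces named in this section; they are not the normative SCAPE APIs.

```python
from typing import Protocol

class ContentProfileSource(Protocol):
    """Exposes the aggregated content profile of a collection."""
    def content_profile(self, collection_id: str) -> dict: ...

class Notification(Protocol):
    """Implemented by clients that want to receive Watch events."""
    def notify(self, event: dict) -> None: ...

class DataManagement(Protocol):
    """Controlled access to the data held by the repository."""
    def retrieve(self, entity_id: str) -> bytes: ...
    def update(self, entity_id: str, data: bytes) -> None: ...

class EventReporting(Protocol):
    """Lets Watch learn about operations and repository activities."""
    def events(self, since: str) -> list: ...

class PlanManagement(Protocol):
    """Creation, update and deployment of preservation plans."""
    def deploy_plan(self, plan: bytes) -> str: ...
    def set_plan_state(self, plan_id: str, state: str) -> None: ...
```

Any repository or tool that satisfies these protocols could, in principle, be swapped into the ecosystem without changes to the other components, which is precisely the loose coupling argued for in goal G5.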
3.5 Policies as basis for preservation management

When endowing the components of a context-aware planning and watch system, as envisioned here, with an awareness of organizational context to create "policy-driven planning and watch", the idea cannot be that entirely non-enforceable elements drive operations automatically, since the result would be random behavior. Instead, the idea is to relate non-enforceable high-level policies to practicable policies that are machine-understandable, but usually not specific enough to directly drive operations. The control of operations is then the responsibility of preservation planning, which creates enforceable preservation plans based on practicable policies. Corresponding to the observation that policies "guide, shape, and control" the activities of an organization (Object Management Group 2010), we distinguish between the following levels.

Guidance policies are non-enforceable governance statements that reside on the strategic (governance) level and often relate several high-level aspects of governance to each other. For example, they express value propositions to key stakeholders, commit to high-level functional strategies, define key performance indicators to be met, or express a commitment to comply with a regulatory standard. These policies are expressed in natural language and need to be interpreted by human decision makers. Automated reasoning on them is generally not feasible: the aspects to be included in such policy statements can be standardized and identified, but the statements themselves can often not feasibly be expressed in machine language to a meaningful extent. In the preservation domain, typical examples can be seen in current regulatory compliance statements (ISO 2010), but also in preservation business policies (Beagrie et al. 2008).

Control policies, on the other hand, are "practicable elements of governance that relate to clearly identified entities in a specified domain model ... [and] constitute quantified, precise statements of facts, constraints, objectives, directives or rules about these entities and their properties" (Kulovits et al. 2013b). Practicable means that a statement is "sufficiently detailed and precise that a person who knows the element of guidance can apply it effectively and consistently in relevant circumstances to know what behavior is acceptable or not, or how something is understood" (Object Management Group 2008). Such policies can be fully represented in a machine-understandable model, but they are often not directly actionable, in the sense that it does not make sense to enforce them in isolation: the exact enactment will depend on the context and on the relation of multiple control policies. For example, multiple control policies may be defined in isolation and contradict each other. The resolution of this contradiction in the decision-making process (preservation planning) leads to a specified set of rules in the plan. This rule set is then actionable and enforceable. Some control policies will, on the other hand, be enforceable in principle. For example, constraints about the data formats to be produced by conversion processes can be enforced automatically in a straightforward way. Control policies are practicable in the sense of the SBVR, but generally have to be specified by human decision makers in policy specification processes that refer to the guidance policies and take into account the drivers and constraints of the organization.
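To illustrate what a machine-understandable control policy can look like, the sketch below encodes the constraint "no compression allowed" (an example used again in Section 4.7) as RDF and queries it mechanically. It uses the third-party rdflib package; the class and property names placed under the vocabulary namespace are invented for this illustration, since the actual published vocabulary (introduced in Section 3.5) defines its own terms.

```python
from rdflib import Graph

# The namespace below is the vocabulary base published by the authors; the
# class and property names placed under it are invented for this example.
policy_ttl = """
@prefix cp: <http://purl.org/DP/control-policy#> .

<http://example.org/policy/no-compression> a cp:ControlPolicy ;
    cp:constrainsProperty cp:compressionScheme ;
    cp:modality cp:mustBe ;
    cp:value "none" .
"""

g = Graph()
g.parse(data=policy_ttl, format="turtle")

# A practicable statement can be queried and checked mechanically:
query = """
PREFIX cp: <http://purl.org/DP/control-policy#>
SELECT ?prop ?val
WHERE { ?p a cp:ControlPolicy ; cp:constrainsProperty ?prop ; cp:value ?val . }
"""
for prop, val in g.query(query):
    print(f"constraint: {prop} must be {val}")
```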
Such policy specification processes can be standardized to a degree similar to standard business processes: the typical inputs and outputs, as well as the stakeholders who are responsible, accountable, consulted and informed, can be specified. Yet, the way in which these policies are managed should not be prescribed to any particular organization. By applying these levels, non-enforceable high-level policies can be related to practicable, machine-understandable policies; preservation planning then creates enforceable preservation plans based on them. These preservation plans correspond to business rules. We note that if control policies are specified in a formal model, it should be possible to check instances of that model against formal constraints.

Figure 3: Digital preservation policies need a well-defined domain model (Kulovits et al. 2013b)

An institution's specific policies should thus be specified following a well-defined vocabulary. In order to make such policies meaningful, a core set of domain elements has to be identified and named, so that the properties of these concepts can be referred to, represented and measured. This is illustrated in Figure 3 and Figure 4.

Ultimately, a preservation case arises, in analogy to a business case, from the identified value of a set of digital artifacts for a specified, more or less well-defined set of users, called the user community. A preservation case hence concerns identified content and identified users and specifies the goals that should be achieved by preservation. In practice, the level of detail known in each specific instance about the users' goals and means will vary greatly; but where there is no identified potential value in preserving a set of digital artifacts, it will likely be discarded. The scope of the preservation case thus corresponds closely to the statements of "preservation intent" discussed by Webb et al. (2013).

In order to successfully preserve objects for a set of users, i.e. to address a preservation case, goals will be identified and made explicit by specifying objectives. These are more explicit than a general preservation intent and express the general goals for effective and efficient continued access to the intellectual content of the digital artifacts in precise statements. The objectives specify desirable properties of the objects with regard to authenticity, formats, and other aspects of representation (such as compression, codecs, or encryption); desired properties of the formats in which such objects shall be represented; desired properties of the preservation operations carried out to achieve the preservation goals, in particular the preservation actions to be applied (such as a preferred strategy of migration or an upper limit on costs); and access goals derived from knowledge about the user community.

It can be seen that the core focus of this model is on continued accessibility and understandability on a logical level, emphasizing the continued alignment that is at the heart of preservation rather than the mere conservation of the bitstreams themselves, which is seen as a necessary precondition to be addressed independently. It is only through access (of whatever form) that preservation results in value; and it is only through a continued process that such understandability and access can be assured.
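Because objectives are precise statements about measurable properties, conformance can be evaluated mechanically. A minimal sketch, with illustrative property names and thresholds:

```python
# Illustrative only: evaluating measured properties against stated objectives.
objectives = {
    "format":              lambda v: v in {"image/tiff", "application/pdf"},
    "compression":         lambda v: v == "none",       # "no compression allowed"
    "cost_per_object_eur": lambda v: v <= 0.05,         # upper limit on action costs
}

def violations(measures: dict) -> list:
    """Return the objectives that a measured object or action fails to meet."""
    return [name for name, ok in objectives.items()
            if name in measures and not ok(measures[name])]

print(violations({"format": "image/tiff", "compression": "lzw"}))
# -> ['compression']
```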
Making the aspects that should be aligned explicit and measurable is the first step towards intelligent detection and reaction. Correspondingly, the core set of control policy elements is shown in Figure 4, taken from (Kulovits et al. 2013b), which describes the controlled vocabularies in more detail.

Figure 4: The core set of elements in the vocabulary (Kulovits et al. 2013b)

The ontology of core control policies and the ontology of the domain elements referenced in these statements are permanently accessible at http://purl.org/DP/control-policy and http://purl.org/DP/quality. While a detailed discussion of these elements is beyond the scope of this article, the next sections will show how they enable the components of the implemented software suite to sustain an awareness of an organization's objectives and constraints and to monitor the alignment of operations with the preservation goals.

3.6 A preservation ecosystem

The standardization of the policy vocabulary and the domain model allows us to envision a digital preservation ecosystem that brings together the Organization, the Community environment, the Solution components and the Decision support and control tools that make up the loosely-coupled system presented in Section 3.4. The vocabularies allow all of these entities to share a common language and to interoperate. Figure 5 illustrates how the digital preservation vocabulary connects the ecosystem domains:

Figure 5: A common language connects the domains of the preservation ecosystem

● Organization. An organization has digital content and internal goals regarding its purpose and delivery, which influence decisions on how to curate, preserve and reuse the content over time. People acting on behalf of the organization manage information systems and define policies that guide and constrain the selection and design of the operations to be executed to preserve the content. The formulation of policy instances for the organization can follow a vocabulary that is widely understood by the other parts of the ecosystem.

● Community environment. Other organizations with particular concerns, not necessarily the preservation of content, develop and populate systems that support various aspects of preservation directly or indirectly. These systems contain essential information on aspects relevant to preservation. The main building blocks in this domain include technical registries such as PRONOM, but increasingly extend to environments that did not originally emerge within digital preservation, such as the workflow sharing platform myExperiment xix or public open source software repositories such as GitHub xx.

● Solution components comprise the services and tools, platforms and infrastructure components that support the operations necessary to address organizations' needs. These components are the pieces that need to be put together to address the organization's objectives in specific cases in cost-efficient ways. They must be selected considering the organization's policies (criteria and constraints) that define the requirements for a solution. The main types of solution components include software tools for file format identification, feature extraction, validation, migration, emulation and quality assurance. Solution components in this domain are generally developed, maintained, and distributed by commercial or noncommercial solution providers trying to meet market needs.
Many of them are in fact created by members of the preservation community xxi.

● Decision support and control, finally, brings together the methods and systems that support the organization in choosing from the solution domain those elements that best fit its policies and goals, ensure most effectively that the content remains usable for the community, and support the organization in the continued task of monitoring.

Each software system requires information about certain domain entities. For example, content profiling needs to describe the objects it analyzes, and preservation tools need to report measures. Planning needs to discover preservation actions, evaluate them, and describe plans. Watch needs to collect measures on all these entities, detect conditions, and observe events. Finally, decision makers need to describe their goals and objectives in a way understandable by the systems, so that decision support can provide customized advice that befits their specific policies and constraints.

The next section outlines how this ecosystem has been implemented and instantiated and shows the preservation lifecycle in action within the ecosystem. We will outline the solution architecture and discuss the specific components of the architecture in turn, and then return to the preservation lifecycle and how the ecosystem increasingly supports scalable, context-aware preservation planning and monitoring and its integration into repository environments and the community.

4. The SCAPE Planning and Watch Suite

4.1 Overall solution architecture

The architecture outlined above has been implemented as a publicly available set of components and API specifications that can be freely integrated with any repository system. The suite of components aims to provide the tool support necessary to enable organizations to advance from largely isolated, ad-hoc decisions and actions reacting to preservation incidents to well-supported and well-documented, yet scalable and efficient preservation management. The following sections describe each of the key building blocks of this tool suite, focusing on the core design goals and features and pointing to references for further in-depth information.

Note that the design is not limited to large-scale environments, but understands scalability as general flexibility with a focus on efficiency and automation. This is relevant in two ways: first, the tools do not require a large-scale infrastructure, but are able to leverage one when present; second, providing a loosely-coupled set of modular components enables organizations to adopt the suite incrementally, without large upfront investments.

Figure 6 depicts the SCAPE software components supporting the preservation lifecycle and implementing the components and interfaces described above. The next sections describe each of these components in turn.

Figure 6: SCAPE software components supporting the preservation lifecycle

4.2 C3PO: Scalable Content analysis

Recent advancements in tool development for file analysis have resulted in a number of tools covering different functionality such as identification, validation and characterization. A crucial challenge presented by those tools is the variance of coverage in terms of file formats supported and features extracted. The characterization tool FITS addresses the problem of coverage by combining the outputs of different identification and characterization tools in one descriptor.
This enables a rich characterization of a single file using only one tool, which in turn runs the appropriate identification and characterization tools on the content and normalizes their outputs into a well-defined XML format. While this comes at a performance cost, it is currently the only method that provides reasonable coverage of the feature space, covering a variety of identification measures, such as the PRONOM format ID and MIME types, as well as in-depth feature extraction supported by an array of tools xxii.

Using the output of characterization tools such as FITS and Apache Tika, the tool Clever Crafty Content Profiling of Objects (C3PO) xxiii enables a detailed content analysis of large-scale collections (Petrov & Becker 2012). Figure 7 provides a high-level overview of the process, which produces a detailed content profile describing the key distribution characteristics of a set of objects. The process starts with running identification and characterization tools on a set of content. The metadata produced by those tools is collected and stored by C3PO, which currently supports the metadata schemas of FITS and Apache Tika. Support for other characterization tool output formats can easily be added by extending the highly modular architecture, which enables the integration of additional adaptors for other metadata formats and gathering strategies.

Using multiple metadata extraction tools on the same content will often result in conflicts, i.e. a state where two tools provide different values for the same feature. A common example is the file format, when two tools assign different format identifiers to the same file, either because of different interpretation logics or simply because they have different ways of representing the same format. C3PO offers the possibility to add rules that resolve such conflicts. These rules can range from simple conditions declaring that two identifiers represent the same format, to complex rules prioritizing certain tools or deriving values based on the presence of other features.

Figure 7: The key steps of content profiling

The architecture of C3PO decouples the persistence layer so that a variety of engines can be used. The default database provides strong scalability support by using the open-source, highly scalable MongoDB, which natively supports sharding (Plugge et al. 2010) and map-reduce (Dean & Ghemawat 2004). This also enables users to run their own analytics on this platform, similar to the built-in queries that are readily supported through a web user interface. These standard analytical queries calculate a range of statistics, from the size of the collection to the triangular distributions of all numerical features and a histogram of all non-numerical features in the collection. The set of these statistics is the heart of the content profile.

In addition to its processing platform, C3PO offers a simple web interface which allows dynamic exploration of the content. Part of it is shown in Figure 8, displaying a collection with about 42 thousand objects and an overall size of approximately 23 GB. Additional diagrams show the distribution of MIME types and formats. The user can create additional diagrams for any feature present in the set in order to visualize key aspects of the property sets. Advanced filtering techniques enable exploring the content in more detail.
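Under the hood, analytical queries of this kind can run natively on the database engine. A minimal sketch of a format-distribution query using the third-party pymongo driver, assuming a local MongoDB instance and an illustrative per-object document shape (C3PO's actual schema differs):

```python
from pymongo import MongoClient

# Assumed per-object document shape (one document per characterized object):
#   {"_id": ..., "format": "image/tiff", "size": 123456, "valid": True}
db = MongoClient("localhost", 27017)["c3po"]

format_histogram = db.objects.aggregate([
    {"$group": {"_id": "$format",
                "count": {"$sum": 1},
                "bytes": {"$sum": "$size"}}},
    {"$sort": {"count": -1}},
])

for row in format_histogram:
    print(row["_id"], row["count"], row["bytes"])
```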
By clicking on a bar representing a certain format in the format distribution diagram, for instance, the user filters down to the corresponding object set and sees details about that part of the collection only. This enables a straightforward drill-down analysis to see, for instance, how many of a set of TIFF files are valid or how many have a certain compression type.

Figure 8: C3PO visualizing a content profile

While C3PO can readily be used independently, it integrates with the remaining two components in the planning and watch suite, Scout and Plato. The integration with Scout offers the possibility to monitor the feature distributions of any number of collections over time. By creating a historic profile of a collection, its growth and the changes in the distributions of key aspects such as formats can be revealed over time. The integration with Plato relies on exporting the profile of a whole collection, or a subset of it, in a well-defined content profile format xxiv. This profile identifies and describes the set of objects contained and provides a statistical summary of file format identification and the important features extracted. Plato understands this profile and uses it to obtain statistics about the content set for which a plan is being created. Finally, this profile can contain a set of objects that are seen as representative of the entire set, to enable controlled experimentation on a realistic subset instead of the entire set of objects. This can increase the reliability of the sample selection and provide a substantial speedup, since without such heuristics, samples have to be selected by hand, a tedious and error-prone process (Becker & Rauber 2011c). As with the other modules, the heuristics used to select samples from this multidimensional view of the content set are flexible and configurable, and additional algorithms for sample selection can be added easily.

4.3 Scout: Scalable Monitoring

Scout xxv is an automated preservation monitoring service which supports the scalable preservation planning process by collecting and analyzing information about the preservation environment, pulling together information from heterogeneous sources and providing coherent, unified access to it. It addresses the need to combine an awareness of the internal state of an organization and its systems (internal monitoring) with an awareness of the environment in the widest sense (external monitoring), to enable a continued assessment of the alignment between the two (Faria et al. 2012).

The information is collected by implementing different source adaptors, as illustrated in Figure 9. Scout places no restrictions on the types of data that can be collected: it is built to collect a variety of data from sources such as format and tool registries, repositories, and policies. It already implements source adaptors for the PRONOM registry, content profiles from C3PO, repository events (ingest, access, and migration), policies, and other specific sources. The combination of content profiles from C3PO with repository events from the Report API provides a complete overview of the current content in a repository and shows how the overall set of content is evolving.

Figure 9: Scout information flows from sources to users

Continuous automated rendering experiments xxvi can be used to track the ability of viewing environments to display content and to verify whether the rendering corresponds to the original performance (Law et al. 2012).
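A minimal sketch of the adaptor idea follows; the class names and the record shape are assumptions for illustration and do not reproduce Scout's actual adaptor API.

```python
import json
import urllib.request

class SourceAdaptor:
    """Base class for pluggable information sources (illustrative, not Scout's API)."""
    def fetch(self) -> list:
        raise NotImplementedError

class RegistryAdaptor(SourceAdaptor):
    """Pulls records from a registry that exposes JSON over HTTP (URL hypothetical)."""
    def __init__(self, url: str):
        self.url = url

    def fetch(self) -> list:
        with urllib.request.urlopen(self.url) as resp:
            return json.load(resp)

class ContentProfileAdaptor(SourceAdaptor):
    """Turns an already-loaded content profile dict into knowledge-base records."""
    def __init__(self, profile: dict):
        self.profile = profile

    def fetch(self) -> list:
        return [{"source": "c3po", "measure": k, "value": v}
                for k, v in self.profile.items()]
```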
Once information is collected, it is saved in a formally specified and normalized manner to the knowledge base (Faria et al. 2012). Built upon linked data principles, the knowledge base supports reasoning on and analysis of the collected data using standard mechanisms such as SPARQL xxvii. Such queries provide the mechanism for automatic change detection: by registering an interest in a watch condition associated with such a query, the results will be monitored periodically, and when the condition is met, a notification is sent to the user. Conditions can cover arbitrary circumstances of relevance in the known domain, ranging from checks on content validity and profile conformance to certain constraints, to the question whether any new tools are available to measure a certain property in electronic documents, or whether a quality assurance tool that is in use for validating the authenticity of converted images is still considered reliable by the community. Upon receiving a notification, the user can initiate additional actions, such as preservation planning, to address any risks that have surfaced or to take advantage of opportunities that have been detected.

Scout has a simple web interface which supports management tasks such as adding new adaptors and triggers, and browsing the collected data. This includes dynamically generated visualizations of data over time. By operating over a longer period, Scout is expected to accumulate a valuable collection of historical data. Figure 10 shows an example of the evolution of file formats through time. The graph is based on an analysis of approximately 1.4 million files gathered between December 2008 and December 2012 by the Internet Memory Foundation xxviii. Additional content sets that are gathered for historical analysis and shared publication include a set of over 400 million web resources collected in the Danish web archive over almost a decade and characterized using FITS xxix.

Figure 10: A format distribution of 1.4M files from the Internet Memory archive

Other specific adaptors demonstrate the capacity of Scout to incorporate new information and identify new preservation risks. Faria et al. (2013) describe a case study that demonstrates how to use information extraction technologies on crawled web content to extract specific domain facts, such as publisher-journal relationships, and to integrate them with Scout for monitoring producers in journal repositories. Another specific adaptor feeds large-scale experiments on the renderability analysis of web pages into the knowledge base. Here, image snapshots of pages from web archives are taken with different web browsers, and the results are compared with image quality assurance tools. Expanding the comparison with structural information from the web pages, and cross-relating it with content profiles of the resources used by each page, will give further insight into which formats, and which of their features, affect the renderability of pages in modern web browsers.

4.4 Plato: Scalable Decision making

Upon discovery of a risk or a misalignment between the organization's content and actions and its objectives, a plan is needed to resolve the detected problem and improve the robustness of the repository against preservation threats. Creating such a plan is supported by the publicly available open-source planning tool Plato, which implements the preservation planning method described in detail in (Becker et al. 2009).
Figure 11: The planning workflow

The tool guides decision makers through a structured planning workflow and supports them in producing an actionable preservation plan for a defined set of objects, based on a thorough, goal-oriented, evidence-based evaluation of the potential actions that could be applied. Controlled experimentation on real sample content is at the heart of the four-phase workflow shown in Figure 11: testing the candidate actions on real-world content greatly increases the trust that stakeholders put into the actions to be taken and ensures that the chosen steps are not simply taken from elsewhere and applied blindly, but will be effective and fit for the specific situation (Becker & Rauber 2011c).

1. Define requirements: In the first phase, the context of planning is documented, and decision criteria are specified that can be used to find the optimal preservation action. The specification starts with high-level goals and breaks them down into quantifiable criteria. The resulting objective tree provides the evaluation mechanism for choosing from the candidate preservation actions. To enable this, the set of objects to preserve is profiled, and sample elements are selected that will be used in controlled experimentation.

2. Evaluate alternatives: In an experimentation step, empirical evidence is gathered about all potential candidate solutions by applying each to the selected sample content. The results are evaluated against the decision criteria specified in the objective tree.

3. Analyze results: For each decision criterion, a utility function is defined to allow comparison across different criteria and their measures. This utility function maps all measures to a uniform score that can be aggregated. Relative weights model the preferences of the stakeholders on each level of the goal hierarchy (a minimal sketch of this aggregation is given below). An in-depth visual and quantitative analysis of the resulting scores of the candidates leads to a well-informed recommendation of one alternative.

4. Build preservation plan: In this final phase, the concrete plan for action is defined. This includes an accurate and understandable description of which action is to be executed on which objects and how, and specifies the quality assurance measures to be taken along with the action to ensure that the results are verified and correspond to the expected outcomes. Responsibilities and procedures for plan execution are defined. The finished preservation plan drives the activities in Operations and Watch and will be re-evaluated over time.

Figure 12: Plato visualizing criteria statistics from its knowledge base (Becker, Kraxner, et al. 2013)

Plato has been used for operational preservation planning in different scenarios in recent years. The Bavarian State Library, for example, evaluated the migration options for one of its largest collections of scanned images of 16th-century books (Kulovits et al. 2009). A detailed discussion of this and several other case studies is given in (Becker & Rauber 2011b). At that point, creating a preservation plan was still an effort-intensive and complex task, since many of the required activities had to be carried out manually for each plan. However, the collected set of real-world cases enabled a systematic analysis of the variety of decision factors and a systematic categorization and formalization of the criteria used for decision making (Becker & Rauber 2011a; Kulovits et al. 2013b). Figure 12 shows Plato visualizing aggregated decision criteria collected in the knowledge base.
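The following is the minimal sketch of the aggregation in phase 3 referred to above: each criterion maps its measured value to a uniform utility score, and the weighted utilities are summed per candidate. The criterion names imitate the style of the measurement vocabulary but are assumptions, as are the utility functions and weights.

```python
# Each criterion: a utility function mapping its measure to [0, 5]
# plus a relative weight; the weights on one level sum to 1.
criteria = {
    "object/compression_lossless": (lambda v: 5.0 if v else 0.0,      0.5),
    "action/throughput_mb_per_s":  (lambda v: min(v, 10) / 2.0,      0.3),
    "action/cost_per_object_eur":  (lambda v: max(0.0, 5 - 100 * v), 0.2),
}

def score(measures: dict) -> float:
    """Weighted sum of per-criterion utilities for one candidate action."""
    return sum(w * u(measures[name]) for name, (u, w) in criteria.items())

candidates = {
    "migrate-to-tiff": {"object/compression_lossless": True,
                        "action/throughput_mb_per_s": 4.0,
                        "action/cost_per_object_eur": 0.01},
    "migrate-to-jp2":  {"object/compression_lossless": True,
                        "action/throughput_mb_per_s": 7.0,
                        "action/cost_per_object_eur": 0.02},
}
for name, measures in candidates.items():
    print(name, round(score(measures), 2))
```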
This body of collected knowledge is increasingly supporting Plato in becoming context-aware and in automating many of the steps that previously prohibited large-scale, policy-driven preservation planning (Kraxner et al. 2013; Kulovits et al. 2013b).

Figure 13: Preservation operations are composed of multiple components (Kulovits et al. 2013b)

As part of the tool suite presented here, Plato has been integrated with Scout, C3PO, and an online catalogue of preservation components published as reusable, semantically annotated workflows on myExperiment xxx. An actionable preservation plan can contain a number of complex automated processing steps of different kinds of operations, linked through a pipeline of inputs and outputs that is best represented as a workflow, as shown in Figure 13. Specifying such a workflow in a standard manner, as opposed to a textual operations manual, greatly reduces the risk of operational errors and streamlines deployment. The integration of Plato with the Taverna workflow engine provides such possibilities (Kraxner et al. 2013).

Plato is furthermore endowed with an awareness of the control policies that encompass the objectives and constraints to be followed. This understanding of the drivers and constraints of an organization is provided by an awareness of the semantic policy model, which can be shared across the members of the same organization. This removes much of the burden of contextual factors needing clarification, which previously accounted for much of the difficulty in starting a planning process (Becker & Rauber 2011c; Kulovits et al. 2009). Together, this removes much of the effort required for preservation planning: the institutional context is provided and documented by a semantic model; content statistics, samples and technical descriptors are provided by the content profile; and the available actions that can mitigate risks such as obsolescence can be discovered on myExperiment. Finally, executable workflows can be deployed to the repository, removing the risks of misunderstandings and misconfigurations and easing the burden of running operations in accordance with specifications (Kraxner et al. 2013).

This awareness, combined with the integration with an open and growing experiment-sharing platform and an open controlled vocabulary, provides the basis for the continued improvement of operations over time, as organizations can build on each other's work, show quantitative improvement of new solution components over those previously available, and discover which solution components are needed most urgently. As an example, consider the need to verify the quality of migration processes with respect to content authenticity: when converting even seemingly simple artifacts such as digital photographs, many conversion components introduce subtle errors by omitting embedded metadata, misinterpreting white balance and color settings, or using lossy compression methods where none was expected. Automated means are required to validate each conversion (Bauer & Becker 2011), but developing these is a heavy burden for each organization on its own. Instead, by showing that certain quality checks are required by multiple scenarios, efforts can be shared and focused on those aspects that are most frequent and, at the same time, critical for decision makers. The visual analysis shown in Figure 12 supports this by visualizing the quantified impact of each decision criterion and computing aggregated impact factors for arbitrary sets of criteria and preservation plans (Becker, Kraxner, et al. 2013).
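To make the composition idea of Figure 13 concrete, the sketch below pipes an object through a plan represented as a list of components, each returning transformed data plus quality assurance measures. The component functions are placeholders invented for this example, not entries from the myExperiment catalogue.

```python
def run_plan(obj: bytes, components: list) -> dict:
    """Pipe one object through the plan's components, collecting QA measures."""
    record = {"measures": {}}
    data = obj
    for component in components:
        data, measures = component(data)
        record["measures"].update(measures)
    record["result"] = data
    return record

def migrate_tiff_to_jp2(data):
    converted = data  # placeholder: a real plan would invoke the migration tool
    return converted, {"migration/status": "success"}

def compare_images(data):
    return data, {"qa/ssim": 0.998}  # placeholder for a real image comparison

plan = [migrate_tiff_to_jp2, compare_images]
print(run_plan(b"...image bytes...", plan)["measures"])
```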
4.5 Repository

The repository is defined here as the system that contains and manages content, providing ingest and access features. A repository may be as simple as a shared folder with files that represent the content, or as complex as dedicated systems such as DSpace or RODA. There are many different types and implementations of repositories, each with different features and a focus on the needs of different types of institutions. Endowing a repository with digital preservation features should therefore be independent of the repository type and implementation. To achieve the integration with the tools described above, which effectively support the digital preservation processes, a set of repository integration APIs is defined: the Data Connector API, the Report API and the Plan Management API.

Data Connector API

The Data Connector API is an interface that allows access to and modification of content in the repository. Defined as a RESTful web service (Fielding 2000), it contains methods to

● retrieve intellectual entities, metadata, representations, files and named bit streams;
● ingest an intellectual entity (synchronously or asynchronously);
● update an intellectual entity, a representation or a file; and
● search intellectual entities, representations or files using the Search/Retrieval via URL protocol xxxi.

The SCAPE Digital Object Model defines how to represent the intellectual entities, metadata, representations, files and named bit streams listed above. It defines a METS xxxii profile that uses PREMIS xxxiii to specify the technical metadata, the rights associated with the object, and the digital provenance metadata. The Data Connector API specification and the SCAPE Digital Object Model are available xxxiv, and API reference implementations are provided by RODA and Fedora Commons 4.

Report API

The Report API is an interface that provides access to repository events such as

● ingest started or finished,
● descriptive metadata viewed or downloaded,
● representation viewed or downloaded, or
● preservation plan executed.

The Report API is defined as an OAI-PMH xxxv provider that uses PREMIS metadata to describe the repository events. The PREMIS Agent defines who triggered the event, the PREMIS Date/Time defines when the event occurred, and the PREMIS Details describe what happened. The OAI-PMH protocol allows harvesting all events and filtering them by date and type of event. A Scout Report API adaptor harvests all events and creates aggregations of them xxxvi. The Report API specification is available xxxvii, a reference implementation is available in RODA xxxviii, and a Fedora Commons reference implementation is being developed.

Plan Management API

This interface provides the facilities to deploy and manage preservation plans in the repository. Defined as a RESTful web service, it contains methods to

● search and retrieve plans,
● deploy a new plan,
● retrieve or add a plan execution state (e.g. in progress, success or fail), and
● enable and disable a preservation plan.

The implementation of the Plan Management API (called the Plan Management Component) can use a workflow engine such as Taverna, which understands the workflow language in which the action plan is defined, to execute the workflow and run its preservation actions and quality assurance components. Finally, the Plan Management Component can use the Data Connector API to merge the results of a preservation action, such as migration, back into the repository.
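A minimal sketch of a client driving two of these RESTful interfaces with the third-party requests library follows; the base URL and endpoint paths are assumptions derived from the operations listed above, not the normative API specification.

```python
import requests

BASE = "https://repository.example.org/scape"  # hypothetical deployment URL

# Data Connector API: retrieve an intellectual entity (path is illustrative).
entity = requests.get(f"{BASE}/entity/urn:example:42")
entity.raise_for_status()

# Plan Management API: deploy a plan, then record its execution state.
with open("plan.xml", "rb") as f:
    deployed = requests.post(f"{BASE}/plan", data=f,
                             headers={"Content-Type": "application/xml"})
plan_uri = deployed.headers.get("Location", f"{BASE}/plan/1")
requests.post(f"{plan_uri}/execution-state", json={"state": "SUCCESS"})
```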
The Plan Management API specification is available online xxxix, and API reference implementations are being developed by RODA and Fedora Commons 4.

4.6 Workflow Engine

Any complex set of operations, such as those outlined in Section 4.4, benefits from a workflow environment that supports coordinated execution on large amounts of content. The system design separates out the implementation-level detail of such a workflow engine to enable the integration of different platforms. However, strong tool support is available based on an integration of the workflow engine Taverna and the workflow sharing platform myExperiment, where fully annotated solution components can be published for sharing and discovery. Operational preservation plans can be created and specified as Taverna workflows and published using semantic annotations that follow the controlled vocabularies described above (Kraxner et al. 2013). Components published using this ontology can be discovered automatically and monitored for specific properties in Scout. The aggregated experience collected on their behavior can support the early selection and recommendation of likely fits in the planning process. An organization that wishes to use a different workflow engine, however, can replace Taverna with a platform of its own choice.

4.7 Automating the preservation lifecycle

To illustrate how the presented suite of tools can support the preservation lifecycle, consider the following scenario. An institution has a repository with content and policies in place. These policies might not be formalized, and some may be documented only implicitly, but they represent the intentions of the organization and should guide and constrain all preservation processes. A first step therefore requires the institution to define and formalize the purpose of the content and the digital preservation requirements associated with it. These requirements start with a high-level definition of the mission and objectives, such as "long-term availability and authenticity", which must iteratively be related to low-level requirements concerning tangible and measurable facts, such as "no compression allowed". These more specific requirements, i.e. control policies, should be defined using the SCAPE policy model.

The second step is to run characterization tools and C3PO on the repository content and to configure the Scout adaptors for repository integration, which use C3PO and the Report API. Scout will then be able to constantly monitor the characteristics of the content and the repository events of importance for digital preservation. Scout provides the facility to upload the policies defined in the SCAPE policy model and to activate a set of triggers; it will then notify its users when policy conformance is not fulfilled. These triggers may need external information monitored by Scout, such as the contents of format and tool registries, different classes of experiments, and even manually inserted human knowledge. Scout may, for example, detect that some content uses compression, in violation of a defined policy, and hence send an email notification to the Planner.

The third step is to decide which actions should be taken to mitigate the detected problem. The Planner can use Plato to create a well-described and traceable preservation plan that addresses the detected preservation risk.
By knowing the defined preservation policies, Plato can pre-fill many of the necessary contextual pieces of required information, supporting the reuse of the institution's definition of objectives and greatly reducing the time needed to create a preservation plan (Kulovits et al. 2013a). Furthermore, Plato can automatically find and retrieve solution alternatives by connecting to the myExperiment preservation components catalogue, and it can automatically conduct experiments on all alternatives discovered there, applying them to the set of sample objects. The analysis of the results is partially supported by quality assurance tools that evaluate the behavior of each alternative against the case requirements, enabling the decision maker to discover the best solution.

The fourth step is to deploy the preservation plan to the repository via the Plan Management API. The Plan Management Component of the repository can use a workflow engine to execute the preservation action, including the quality assurance steps, and use the Data Connector API to merge the action results back into the repository. The results of the quality assurance steps are sent to Scout via the Report API, so that Scout can monitor whether the action performed as expected.

Finally, the preservation plan contains triggers to be installed in Scout to automatically monitor whether the assumptions made in the decision-making step remain true. If the action plan does not execute as expected, or if the preservation plan needs to be reviewed because the policies or the environment have changed, the Planner is again notified to re-evaluate the preservation plan, starting the cycle again.

5. Summary

Digital preservation is the set of activities and processes required to ensure continued, authentic access to digital content over time. Providing such information longevity across changing socio-technical environments poses a number of challenges, in particular in the light of recently rising content volumes. Scalability in handling large amounts of data can be achieved with state-of-the-art technologies commonly used in the cloud. Additionally, scalable monitoring and decision making are required to support automated, large-scale operations of systems and tools. Scaling up decision making, policy definition, and the processes for monitoring and action requires a set of techniques that include scalable in-depth content analysis, intelligent information gathering, and efficient multi-criteria decision support. But it also requires loosely-coupled systems that are able to interact with each other and with the wider preservation context and are capable of evolving over time, as well as a set of common vocabularies that can be used to publish and discover knowledge about the evolving preservation ecosystems.

This article presented the SCAPE Planning and Watch suite, a new, innovative system for scalable decision making and control in preservation environments. The Planning and Watch suite builds on Plato and extends it into a loosely-coupled, extensible preservation planning and monitoring system that can be integrated with virtually any repository and content management system through open and standardized interfaces.
While each of the components can be used and integrated independently of the others, this article focused on the compound value contribution of the set of systems and showed how the resulting SCAPE ecosystem can support organizations in managing their holdings more effectively, using policy-driven monitoring and well-supported decision making to provide scalable decision making and control capabilities in support of digital preservation objectives. In Becker et al. (2015), we will conduct a systematic assessment of the system based on the design goals outlined in this article, discussing the improvements achieved by the presented work and its identified limitations on the basis of a quantitative and qualitative evaluation that includes a case study with a national library.

Acknowledgements

Part of this work was supported by the European Union in the 7th Framework Programme, IST, through the SCAPE project, Contract 270137.

References

Abrams, S.L. (2004), "The role of format in digital preservation", VINE: The Journal of Information and Knowledge Management Systems, Volume 34, Number 2, pp. 49-55.

Antunes, G. and Borbinha, J. and Barateiro, J. and Becker, C. and Proenca, D. and Vieira, R. (2011), "SHAMAN Reference Architecture", version 3.0, SHAMAN project report.

Bauer, S. and Becker, C. (2011), "Automated Preservation: The Case of Digital Raw Photographs", in Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation, Proceedings of the 13th International Conference on Asia-Pacific Digital Libraries (ICADL 2011), Beijing, China, Springer-Verlag.

Beagrie, N. and Semple, N. and Williams, P. and Wright, R. (2008), "Digital Preservation Policies Study Part 1: Final Report", HEFCE.

Becker, C. and Kraxner, M. and Plangg, M. and Rauber, A. (2013), "Improving decision support for software component selection through systematic cross-referencing and analysis of multiple decision criteria", in Proceedings of the 46th Hawaii International Conference on System Sciences (HICSS), 2013, Maui, USA, pp. 1193-1202.

Becker, C. and Duretec, K. and Petrov, P. and Faria, L. and Ferreira, M. and Ramalho, J.C. (2012), "Preservation Watch: What to monitor and how", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.

Becker, C. and Duretec, K. and Faria, L. (2015), "Scalable Decision Support for Digital Preservation: An Assessment", to appear in OCLC Systems & Services, Volume 31, Number 1.

Becker, C. and Kulovits, H. and Guttenbrunner, M. and Strodl, S. and Rauber, A. and Hofman, H. (2009), "Systematic planning for digital preservation: evaluating potential strategies and building preservation plans", International Journal on Digital Libraries, Volume 10, Issue 4, pp. 133-157.

Becker, C. and Rauber, A. (2011a), "Decision criteria in digital preservation: What to measure and how", Journal of the American Society for Information Science and Technology, Volume 62, Issue 6, pp. 1009-1028.

Becker, C. and Rauber, A. (2011b), "Four cases, three solutions: Preservation plans for images", technical report, Vienna University of Technology, Vienna, Austria.

Becker, C. and Rauber, A. (2011c), "Preservation Decisions: Terms and Conditions Apply. Challenges, Misperceptions and Lessons Learned in Preservation Planning", in Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2011, Ottawa, Canada, pp. 67-76.

Brody, T. and Carr, L. and Hey, J. and Brown, A.
and Hitchcock, S. (2007), "PRONOM-ROAR: Adding Format Profiles to a Repository Registry to inform Preservation Services", The International Journal of Digital Curation (IJDC), Volume 2, Issue 2, pp. 3-19.

CCSDS (2002), "Reference Model for an Open Archival Information System (OAIS)", retrieved from http://public.ccsds.org/publications/archive/650x0b1.pdf.

Dean, J. and Ghemawat, S. (2004), "MapReduce: simplified data processing on large clusters", in Proceedings of the 6th Conference on Symposium on Operating System Design & Implementation, Berkeley, USA.

Faria, L. and Akbik, A. and Sierman, B. and Ras, M. and Ferreira, M. and Ramalho, J.C. (2013), "Automatic Preservation Watch using Information Extraction on the Web", in Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES) 2013, Lisbon, Portugal.

Faria, L. and Petrov, P. and Duretec, K. and Becker, C. and Ferreira, M. and Ramalho, J.C. (2012), "Design and architecture of a novel preservation watch system", in The Outreach of Digital Libraries: A Globalized Resource Network, Proceedings of the 14th International Conference on Asia-Pacific Digital Libraries (ICADL) 2012, Taipei, Taiwan, pp. 168-178.

Fielding, R.T. (2000), "Architectural Styles and the Design of Network-based Software Architectures", doctoral dissertation, University of California, Irvine.

Garrett, J. and Waters, D. (1996), "Preserving digital information: Report of the task force on archiving digital information", The Commission on Preservation and Access and RLG.

Hedstrom, M. (1998), "Digital Preservation: A time bomb for digital libraries", Computers and the Humanities, Volume 31, Issue 3, pp. 189-202.

Heslop, H. and Davis, S. and Wilson, A. (2002), "An approach to the preservation of digital records", green paper, National Archives of Australia, retrieved from http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm16-47161.pdf.

Hunter, J. and Choudhury, S. (2006), "PANIC: an integrated approach to the preservation of composite digital objects using Semantic Web services", International Journal on Digital Libraries (IJDL), Volume 6, Issue 2, pp. 174-183.

Hutchins, M. (2012), "Testing software tools of potential interest for digital preservation activities at the National Library of Australia", technical report, National Library of Australia.

ISO (2010), "Space data and information transfer systems - Audit and certification of trustworthy digital repositories (ISO/DIS 16363)", International Organization for Standardization.

Jackson, A. (2012), "Formats over time: Exploring UK web history", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.

Knijff, J. and Wilson, C. (2011), "Evaluation of characterization tools", technical report, retrieved from http://www.scape-project.eu/wp-content/uploads/2012/01/SCAPE_PC_WP1_identification21092011.pdf.

Kraxner, M. and Plangg, M. and Duretec, K. and Becker, C. and Faria, L. (2013), "The SCAPE Planning and Watch suite", in Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES) 2013, Lisbon, Portugal.

Kulovits, H. and Rauber, A. and Kugler, A. and Brantl, M. and Beinert, T. and Schoger, A. (2009), "From TIFF to JPEG2000? Preservation Planning at the Bavarian State Library Using a Collection of Digitized 16th Century Printings", D-Lib Magazine, Volume 15, Number 11/12.

Kulovits, H. and Becker, C. and Andersen, B.
(2013a), "Scalable preservation decisions: A controlled case study", in Proceedings of Archiving 2013, Washington, D.C., USA, pp. 167-172.

Kulovits, H. and Kraxner, M. and Plangg, M. and Becker, C. and Bechhofer, S. (2013b), "Open Preservation Data: Controlled vocabularies and ontologies for preservation ecosystems", in Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES) 2013, Lisbon, Portugal.

Law, M.T. and Thome, N. and Gançarski, S. and Cord, M. (2012), "Structural and visual comparisons for web page archiving", in Proceedings of the 2012 ACM Symposium on Document Engineering (DocEng '12), New York, NY, USA, pp. 117-120.

Lawrence, G.W. and Kehoe, W. and Kenney, A.R. and Rieger, O.Y. and Walters, W. (2000), "Risk Management of Digital Information: A File Format Investigation", Council on Library and Information Resources, Washington, D.C.

Object Management Group (2010), "Business Motivation Model 1.1".

Object Management Group (2008), "Semantics of Business Vocabulary and Business Rules (SBVR)", Version 1.0.

OCLC and CRL (2007), "Trustworthy Repositories Audit & Certification: Criteria and Checklist".

Pearson, D. (2007), "AONS II: continuing the trend towards preservation software 'Nirvana'", in Proceedings of the 4th International Conference on Preservation of Digital Objects (iPRES) 2007, Beijing, China.

Pehlivan, Z. (2013), "Quality Assurance Workflow, Release 2 + Release Report", technical report, retrieved from http://www.scape-project.eu/wp-content/uploads/2013/06/SCAPE_D11.2_UPMC_V1.0.pdf.

Pennock, M. and Jackson, A. and Wheatley, P. (2012), "CRISP: Crowdsourcing Representation Information to Support Preservation", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.

Petrov, P. and Becker, C. (2012), "Large-scale content profiling for preservation analysis", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.

Plugge, E. and Hawkins, T. and Membrey, P. (2010), "The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing", Apress, USA.

Rothenberg, J. (1995), "Ensuring the longevity of digital documents", Scientific American, Volume 272, Number 1, pp. 42-47.

Sinclair, P. and Billenness, C. and Duckworth, J. and Farquhar, A. and Humphreys, J. and Jardine, L. (2009), "Are you Ready? Assessing Whether Organisations are Prepared for Digital Preservation", in Proceedings of the 6th International Conference on Preservation of Digital Objects (iPRES) 2009, San Francisco, USA, pp. 174-181.

Strodl, S. and Rauber, A. and Rauch, C. and Hofman, H. and Debole, F. and Amato, G. (2006), "The DELOS Testbed for Choosing a Digital Preservation Strategy", in Digital Libraries: Achievements, Challenges and Opportunities, Proceedings of the 9th International Conference on Asia-Pacific Digital Libraries (ICADL), 2006, Kyoto, Japan, Springer-Verlag, pp. 323-332.

Thaller, M. (2009), "The eXtensible Characterisation Languages - XCL", Verlag Dr. Kovac, Hamburg, Germany.

Webb, C. and Pearson, D. and Koerbin, P. (2013), "'Oh, you wanted us to preserve that?!' Statements of Preservation Intent for the National Library of Australia's Digital Collections", D-Lib Magazine, Volume 19, Number 1/2.
1. Introduction
2. Digital preservation: Background and challenges
  2.1 Digital preservation and repositories
  2.2 On trust and scalability
  2.3 Challenges and goals
3. Scalable, context-aware Preservation Planning and Watch
  3.1 Overview
  3.2 Design goals
  3.3 The preservation lifecycle
  3.4 An architecture for loosely-coupled preservation systems
  3.5 Policies as basis for preservation management
  3.6 A preservation ecosystem
4. The SCAPE Planning and Watch Suite
  4.1 Overall solution architecture
  4.2 C3PO: Scalable Content analysis
  4.3 Scout: Scalable Monitoring
  4.4 Plato: Scalable Decision making
  4.5 Repository
    Data Connector API
    Report API
    Plan Management API
  4.6 Workflow Engine
  4.7 Automating the preservation lifecycle
5. Summary
Acknowledgements
References

work_b5pffivo5fbijj2bqqhnyjxwcu ---- International Payment: Methods to Consider

Christine Robben
Cherié L. Weible

ABSTRACT. As libraries gain access to more online databases, library patrons gain more access to obscure citations. Consequently, Interlibrary Loan Departments, once deciding to participate in International Loans, must then decide how they will pay and bill their international partners. This article describes the options available for international payment and billing, including how to use reciprocal agreements, standard invoicing, pre-paid requests, deposit accounts, OCLC ILL Fee Management, International Reply Coupons, and the IFLA Voucher Scheme to obtain and pay for ILL transactions.

KEYWORDS. Interlibrary loan, international, payment methods, reciprocal agreements, invoicing, IFM, ILL fee management, IFLA Voucher Scheme, IRC

Christine Robben is Interlibrary Loan Reference Librarian, Miller Nichols Library, University of Missouri-Kansas City, 800 East 51st Street, Kansas City, MO 64110 (E-mail: robbenc@umkc.edu). Cherié L. Weible is Assistant Information Resource Retrieval Center Librarian, University of Illinois at Urbana-Champaign, 128 Main Library, MC-522, 1408 West Gregory, Urbana, IL 61801 (E-mail: cweible@uiuc.edu).

Journal of Interlibrary Loan, Document Delivery & Information Supply, Vol. 12(3) 2002. © 2002 by The Haworth Press, Inc. All rights reserved.

Do you have a patron who orders esoteric materials?
What does the Interlibrary Loan Department do when the requested material is only available from an international lender? Even if the material is supplied, how will you pay for it once it is received? What is the currency in the country of the supplying library? And after you find out, how does your library pay for an invoice in a foreign currency?

Do not add stress to the already hectic world of Interlibrary Loan. Before ordering another international request, spend a little time becoming familiar with the payment possibilities available for International Interlibrary Loan. Your library may already participate in a method for International Payment! Methods of payment for International Interlibrary Loan can range from reciprocal agreements to a voucher scheme. This article discusses how to use reciprocal agreements, the IFLA Voucher Scheme, standard invoicing, pre-paid requests, deposit accounts, OCLC ILL Fee Management, and International Reply Coupons to obtain and pay for ILL transactions.

RECIPROCAL AGREEMENTS

In her article "Library to Library," Mary Jackson states that the "operational definition" of a reciprocal agreement "may read as an informal agreement made between two ILL managers in order to avoid ILL charges" (67). What benefits does a reciprocal agreement have for International Interlibrary Loan? The answer is many of the same benefits reciprocal agreements provide with libraries only 100 miles away. Establishing reciprocal agreements gives the borrowing library familiarity with the lending library. In addition, the borrowing library will know the cost of a request prior to placing an order.

To establish which libraries you would like to approach for a reciprocal agreement, you will need to decide from whom you frequently borrow. You will also need to determine what "frequently" means for your library. For some libraries, requesting three items a month from the same supplier is frequent. For larger research institutions, requesting three items from the same supplier happens every week. If the library you approach agrees to establish a reciprocal agreement, your benefits will depend upon the agreement. Many reciprocal agreements include the free loan of books and the free supply of articles up to a certain number of pages. Depending on the established agreement, the benefits to your library in an international agreement include the ability "to avoid paying lender fees, to control the amount the borrower pays in borrowing charges, and to minimize or eliminate the problems associated with processing invoices and issuing payment" (Jackson 67). Also, if your library agrees on free loans and copies, you will not have to fret over the conversion factor when handling international currency.

IFLA VOUCHER SCHEME

The International Federation of Library Associations and Institutions (IFLA) Office for Universal Availability of Publications (UAP) has established the IFLA Voucher Scheme to make payment for loans and photocopy requests easier for International Interlibrary Loan. The IFLA Voucher Scheme was launched to by-pass what the UAP identified as common difficulties associated with international payment. These difficulties are "lack of access to hard currency, high banking charges for both the supplying and requesting library, currency exchange difficulties, and high administration costs in issuing and processing invoices" (Press Release 1). The actual vouchers are plastic cards (see Figure 1).
FIGURE 1. Front of a Full IFLA Voucher

There are green vouchers equal to eight dollars and red vouchers equal to four dollars. The UAP Office recommends that an eight-dollar voucher pay for one transaction. However, the four-dollar vouchers are available for libraries that feel the supply of their materials is worth more than the eight-dollar voucher. Libraries are still able to establish their own fees for lending. For example, if you request a 28-page article from a library, you could be charged at a different rate than for a fifteen-page article. By using the IFLA vouchers, your library can avoid the costs of invoicing and the hassle of currency conversion. If your library tends to be a net lender, the IFLA vouchers can be collected and redeemed for their face value. Any library may join the IFLA Voucher Scheme by purchasing the vouchers. Over 60 countries and approximately 800 libraries participate in the Voucher Scheme (Gould).

STANDARD INVOICING

As with all interlibrary loaned materials, invoicing is appropriate for international ILL. For the borrower, the difficulties with standard invoicing include the written language and the rate of currency exchange. One can usually tackle the written language easily with the aid of a dictionary, but the rate of exchange could be more challenging. Often the detail of paying the invoice is passed on to an accounting or billing office. If the lending library will accept credit card payment, that may be the easiest method of payment for your library, since paying with a credit card will allow the borrowing library to receive a bill in its home currency from the credit card company.

As a lender, if you are sending an invoice to an overseas library, it is necessary to send an invoice that is easy to read. Gretchen Hallerberg outlines eleven key concepts needed to produce clear invoices in her article "How to Prepare Effective Interlibrary Loan Invoices." The steps applying to international invoicing are summarized as follows:

1. Know that your invoice may not be accompanying the returning payment.
2. Have a unique invoice number and date on the invoice.
3. Generic cover memos tend to get separated from attached papers.
4. Include the borrowing library's complete address.
5. Invoices for small amounts of money may delay payment.
6. Include your complete address.
7. The "remit address" should be similar to the supplying library's address.
8. Include a complete telephone number.
9. List acceptable payment methods.
10. The printed invoice should be clear with dark text.
11. Small invoices are easier to lose (22-3).

DEPOSIT ACCOUNTS AND PRE-PAID REQUESTS

Deposit Accounts are usually an option for payment to libraries that do not participate in IFM or the IFLA Voucher Scheme. As Virginia Boucher states in her Interlibrary Loan Practices Handbook, by using a deposit account, ". . . payment of individual invoices can then be avoided" (117). Deposit Accounts are convenient when numerous transactions take place with one foreign supplier. To create a deposit account, you must contact the lending library and decide if the account is feasible for each library and what the charges per transaction will be. Once the account is established, the lending library will withdraw the costs from the account as a request is filled.
After the initial account has been established, replenishing the account with additional funds as needed becomes easy.

ILL FEE MANAGEMENT (IFM)

The OCLC ILL Fee Management Service allows the library to pay for and receive payment through your OCLC invoice. Once you have joined the IFM service, you need to update the MAXCOST field in the OCLC request to an IFM amount. Both a dollar amount (with or without a dollar sign [$]) and the IFM code must be included in your MAXCOST field for OCLC to recognize the transaction as one affecting IFM. For example, both "$10IFM" and "10ifm" are acceptable, but "ifm" is not. According to OCLC's IFM online guide Handling Lending Charges Through OCLC Invoices, ILL Fee Management works if:

• Both borrower and lender have entered valid ILL Fee Management statements.
• The LENDING CHARGES amount does not exceed the MAXCOST amount.
• The status of the request is updated to RECEIVED by the borrower.

With IFM, OCLC keeps track of the amount libraries may owe each other. To illustrate, if you receive a request from an international supplier with a MAXCOST of $20IFM, you can answer yes to that request, charging no more than $20IFM. Through IFM, OCLC will also track whether you order a book from the same international library and how much it charged your library. If you charged the international library $20IFM and the international library charged you $15IFM, the international library will find a debit of five dollars on its monthly Network or OCLC bill.

The monthly Network or OCLC bill is where you will see the change from participating in the IFM service. Like the international library in the example above, you may see a debit for that month. Or, if your library is a net lender, you may find a credit to your account, which could decrease your overall OCLC bill. The other item you may find on your bill is the administrative fee, which OCLC charges for completed borrowing transactions. If you need additional information about participating in the OCLC IFM service, inquire at your area OCLC office.
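Both IFM rules above — the MAXCOST field only counts when it carries an amount plus the IFM code, and mutual charges are netted out on the monthly bill — can be sketched in a few lines. The following Python fragment is an illustration only, not OCLC code; the function names and the settlement convention shown are invented for the example.

```python
import re

# Accepts "$10IFM" or "10ifm" (an amount plus the IFM code); rejects a bare "ifm".
MAXCOST_IFM = re.compile(r"^\$?(\d+(?:\.\d{1,2})?)\s*IFM$", re.IGNORECASE)

def parse_maxcost(field):
    """Return the dollar amount if the MAXCOST field is IFM-eligible, else None."""
    match = MAXCOST_IFM.match(field.strip())
    return float(match.group(1)) if match else None

def monthly_balance(we_charged_them, they_charged_us):
    """Net position with one partner: positive = credit on our bill, negative = debit."""
    return we_charged_them - they_charged_us

assert parse_maxcost("$10IFM") == 10.0   # acceptable
assert parse_maxcost("10ifm") == 10.0    # acceptable
assert parse_maxcost("ifm") is None      # no amount, so not recognized as IFM

# The worked example above: we charged the partner $20IFM and it charged
# us $15IFM, so the partner sees a five-dollar debit (a credit on our bill).
print(monthly_balance(20.00, 15.00))     # 5.0
```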
INTERNATIONAL REPLY COUPONS (IRCs)

International Reply Coupons may be purchased from post offices in countries that are members of the Universal Postal Union (see Figure 2). (For a list of participating countries, visit the UPU web site.) IRCs allow borrowers to pay for international interlibrary loans by attaching a coupon to the request or invoice. The USPS web site states in Publication 51 that the IRC is "equivalent in value to the destination country's minimum postage rate for an unregistered airmail letter. The purchase price is $1.75 per coupon." If your library begins to accumulate IRCs from international libraries, you may redeem them at your local Post Office for postage stamps. Generally, the exchange of IRCs for international interlibrary loans is dependent on the agreement between the borrowing and lending library. As with any loan, some libraries will charge more and, therefore, require that more coupons be sent as payment.

FIGURE 2. Front of an International Reply Coupon

CONCLUSION

Ordering an international interlibrary loan should not add stress to the workload of ILL departments. If the options for international payment have been considered before placing an order, you may be able to avoid many of the challenges that arise when paying an international lender. The options discussed in this article–the IFLA Voucher Scheme, reciprocal agreements, standard invoicing, pre-paid requests, deposit accounts, OCLC ILL Fee Management, and International Reply Coupons–were intended to illustrate how an Interlibrary Loan Department can prepare for International lending and borrowing by establishing a method of payment, and ultimately better serve library patrons.

REFERENCES

Boucher, Virginia. Interlibrary Loan Practices Handbook. Chicago: American Library Association, 1996.
Gould, Sara. "Paying for International ILL Transactions: The IFLA Voucher Scheme Past, Present, and Future." Access to Anything–Anytime–Anyplace: Challenges in Global Resource Sharing. ALA Annual Meeting. San Francisco, CA. 17 June 2001.
Hallerberg, Gretchen A. "How to Prepare Effective Interlibrary Loan Invoices." Journal of Interlibrary Loan, Document Delivery & Information Supply 9.1 (1998): 21-5.
Jackson, Mary E. "Library to Library: Reciprocal Agreements." Wilson Library Bulletin, March (1994): 67-68, 134.
Online Computer Library Center. OCLC ILL Fee Management Service. "Handling Lending Charges through OCLC Invoices." May 2001.
Press Release. "The IFLA Voucher Payment Scheme." Journal of Interlibrary Loan, Document Delivery & Information Supply 6.2 (1995): 1-3.
United States Postal Service. Publication 51. January 2001. Page 14.
Universal Postal Union. July 2001. http://www.upu.int/

work_bdgwamujvbh65lg7utzomgwzym ---- Cataloguing the World Wide Web: CORC at Edinburgh University

Background

Having successfully operated the WorldCat i co-operative cataloguing model for many years, OCLC ii decided to explore using the same concept to ease the burden and duplication of effort of individual libraries organising, describing and presenting web resources for their users. However, rather than encourage libraries to add web resources to WorldCat, OCLC decided to design a system which would incorporate the benefits of co-operative cataloguing with an automatic metadata harvesting or extraction tool and the flexibility of being able to map between multiple metadata formats.
The Co-operative Online Resource Catalogue (CORC) iii was thus conceived as a co-operative system which would also help to automate the record creation process and allow libraries to utilise different metadata schemes to suit their requirements. OCLC sent out a call for participants to test the system in late 1998 and released the first version of CORC early in 1999. Records were initially loaded into the CORC database from OCLC's InterCat iv and NetFirst v databases. Participants were asked to commit 0.5 FTE to using the system to search for, edit and add new resources. They were also encouraged to provide active feedback, which was used to develop the system through regular releases over the period of the project. OCLC also hosted a number of participants' meetings to gather further ideas about, and problems encountered with, the system. During this period CORC was made available free of charge to all participating institutions.

How CORC works

CORC provides a 'button' link which can be dragged from the system and dropped directly into the user's personal toolbar. The user simply clicks on this button when viewing a site of interest in the web browser, and CORC automatically starts the record creation process by harvesting metadata from the HTML metatags. The user is taken directly into CORC and presented with a 'raw' metadata record for the chosen web site. Alternatively, it is possible to enter CORC through its homepage, choose the record creation area and paste in the URL of the site. Multiple records can be created by this method: the user specifies the URL of a site and also the number of links on that site that they would like to create records for. CORC then creates basic records for these sites and informs the user when they are available for editing.

Once inside the record creation process, the editor has access to the record and the web site it describes, which is displayed in a lower frame of the page. This permits the editor to refer to the web site and edit the record accordingly. If preferred, the whole screen can be dedicated to the record and the web site viewed in a separate browser window. The automatically generated fields can be edited using the normal cut-and-paste functions, and other fields are deleted and added as required with the click of a button. In response to concern from cataloguers, CORC added alternative editing modes in addition to its original template options, which require browser interaction every time a field is added or deleted. Cataloguers felt this made the editing process unacceptably slow, so a text area was developed which allowed users to edit records with fewer browser interactions, speeding up the process. Sets of constant data can be created, stored and used when editing a group of records displaying significant similarities. Constant data can be added automatically at the initial record creation stage by CORC, or at any later stage by the cataloguer from a list of actions on a drop-down menu.

CORC also trialled the WebDewey vi product during the project phase of CORC. This automatically creates Dewey classification numbers for web sites, some of which are then linked to appropriate Library of Congress Subject Headings. Unfortunately, experience at Edinburgh suggested that many of the class numbers generated were very wide of the mark, and as the Library had recently moved over to the Library of Congress classification scheme, it was decided not to proceed with this tool when it became a separate chargeable product.
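The kind of 'raw' record CORC builds from HTML metatags can be imitated in a few lines. The Python sketch below is not CORC's code — the article does not describe its internals — but it shows the general technique: walk the page, collect the <title> and any <meta name/content> pairs (including Dublin Core tags such as DC.Creator), and use them to seed a draft record for the cataloguer to edit. The sample page is invented.

```python
from html.parser import HTMLParser

class MetaHarvester(HTMLParser):
    """Collects the <title> text and <meta name="..." content="..."> pairs."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.record = {}  # draft metadata record: field name -> value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and "name" in attrs and "content" in attrs:
            # Keeps Dublin Core tags (DC.*) and plain HTML tags alike
            self.record.setdefault(attrs["name"].lower(), attrs["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.record.setdefault("title", data.strip())

page = """<html><head>
<title>SELLIC Project Home Page</title>
<meta name="DC.Creator" content="Edinburgh University Library">
<meta name="description" content="Science and Engineering Library project pages.">
</head><body>...</body></html>"""

harvester = MetaHarvester()
harvester.feed(page)
print(harvester.record)
# {'title': 'SELLIC Project Home Page',
#  'dc.creator': 'Edinburgh University Library',
#  'description': 'Science and Engineering Library project pages.'}
```

A page with rich metatags yields a usable draft; a page with none yields an almost empty record, which is why the 'raw' records described above still need substantial editing.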
Authority Control

CORC offers cataloguers the option to control all the headings in a record in one step or on an individual basis. Headings are checked with an authority control button, which either accepts the heading as valid or opens an authority box suggesting alternatives and allowing the cataloguer to search the OCLC authority files. Valid headings become highlighted links which lead to the authority records. It is also possible to search and browse the authority indexes for suitable headings.

Saving and Exporting Records

Cataloguers can either save records into the main Resource Catalogue, where they will be accessible to all CORC users, or they can choose to keep them private by using the Savefile area. CORC provides each institution with a Savefile area which can be used in several ways. It can store new records which are still works in progress. Once new records are completed, they can be moved into the main database, making them accessible to all participants; indeed, users must move new records into the public domain before they can be exported. The Savefile can also be used to store records which have been copied or cloned from the main database for the purpose of local editing before records are exported. CORC also allows some restricted editing of 'master' records on the main database so that inaccuracies, such as invalid URLs, can be corrected. CORC automatically validates records and warns cataloguers of any MARC formatting errors before they are submitted to the main database or exported to another system. Records can be exported singly or in a batch from CORC in either MARC or Dublin Core HTML or RDF formats.

CORC has also introduced a link checking service. Users can check periodically on CORC for a list of invalid or redirected URLs the system has discovered. Users are only informed about URLs on records they have either created or have requested to be informed about – such as records they have exported. It is then up to the user to correct the URL on CORC and their local system as necessary.

Metadata Formats

CORC supports both MARC21 and Dublin Core formats. Work at Edinburgh University has concentrated mainly on using MARC so that records for web resources can be imported into the Library's Endeavor Voyager catalogue. CORC provides context-sensitive help and support with MARC editing, with links from each MARC tag and links via the home page to other information sources such as Nancy Olson's Guide to Cataloguing Internet Resources vii. Dublin Core users have the option of following 'simple' Dublin Core or using qualifiers as developed by the DCMI viii. CORC has also developed additional qualifiers which reflect library users' experience with MARC and the desire to be able to map easily between the two formats. Examples of CORC qualifiers include the addition of Personal, Corporate and Conference with the Contributor element, and the qualifier 'Is Part of Series' with the Relation element. CORC also provides online help with editing in Dublin Core. It is possible to move between the two formats at any time when viewing or editing a record. Not all information will be visible in both formats, as some fields in one format do not have an equivalent in the other; however, no information should be lost in the mapping, and all will reappear if the record is subsequently viewed in the original format. Mapping from MARC to Dublin Core tends to lead to a more satisfactory result than the reverse action, due to the strict rules and guidelines associated with MARC.
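The mapping behaviour just described can be pictured with a small crosswalk table. The Python sketch below follows a subset of the published Library of Congress MARC-to-Dublin-Core crosswalk, using the CORC-style Personal/Corporate qualifiers on Contributor mentioned above; it is an illustration, not CORC's internal mapping table, and the helper function is invented. Note how unmapped tags are carried along rather than discarded, mirroring the round-trip behaviour described above.

```python
# Illustrative subset of a MARC -> Dublin Core crosswalk (the full
# Library of Congress crosswalk covers many more fields and subfields).
MARC_TO_DC = {
    "245": "Title",
    "100": "Contributor.Personal",    # CORC-style qualifier on Contributor
    "110": "Contributor.Corporate",
    "260": "Publisher",               # subfields $a$b; $c would map to Date
    "650": "Subject",
    "856": "Identifier",              # electronic location (URL)
}

def marc_to_dc(fields):
    """fields: list of (MARC tag, value) pairs.
    Returns (Dublin Core elements, unmapped fields). Unmapped fields are
    retained so nothing is lost if the record is viewed as MARC again."""
    dc, unmapped = [], []
    for tag, value in fields:
        element = MARC_TO_DC.get(tag)
        (dc if element else unmapped).append((element or tag, value))
    return dc, unmapped

dc, rest = marc_to_dc([
    ("245", "Cataloguing the World Wide Web"),
    ("100", "Mulligan, Zena"),
    ("856", "http://www.oclc.org/oclc/corc/"),
    ("300", "1 online resource"),  # physical description: no DC target here
])
print(dc)    # [('Title', ...), ('Contributor.Personal', ...), ('Identifier', ...)]
print(rest)  # [('300', '1 online resource')]
```

The asymmetry the article notes falls out of the table: every MARC tag has at most one Dublin Core target, but a bare Dublin Core element does not say which of several candidate MARC fields it came from, so the reverse mapping is lossier.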
Pathfinders

In addition to the record creation side of the system, CORC also has another related strand in the form of subject bibliographies or 'pathfinders'. CORC aims to help libraries by automating the process of creating subject gateways to both digital and physical resources and providing access to its database of web resources. Benefits also include the link maintenance mentioned earlier and the dynamic search feature, which allows pathfinders to be updated automatically as new resources are added to the main catalogue. As with resource records, libraries can also benefit from co-operative effort, as it is possible to copy, edit and export pathfinders, saving the time and effort of creating resource pages from scratch. Pathfinders are a valuable function for libraries wishing to set up pages of resource links within specific subject domains quickly. In practice, however, particularly in academic libraries, it is likely that subject librarians would find their value to lie mainly in their use as an alerting tool for new web resources in a domain, which can then prompt further editing of already existing pages produced locally. Pathfinders have yet to be used at Edinburgh, but they are currently being evaluated as a potential tool for updating and enhancing the Library's current subject web pages.

Future of CORC

CORC was launched as a fully chargeable product in July 2000, following a similar usage and pricing structure to WorldCat. Both services have been integrated, with all records submitted to the main catalogue in CORC simultaneously added to WorldCat, and any records containing 856 links saved to WorldCat uploaded to CORC on a daily basis. WorldCat can be searched from CORC, but it is not possible to search for 'web only' resources from WorldCat. CORC is not a static product, and OCLC is still looking at ways of improving the service, with enhancements added to each new release. Areas of potential future development include the incorporation of other metadata formats such as the IMS ix metadata standard. It is also hoped that CORC's functionality might be extended to aid with the problems associated with digital preservation - in particular the issue of archiving web sites.

CORC at Edinburgh University

Through the Science and Engineering Library and Learning Information Centre (SELLIC) project, Edinburgh University Library joined the CORC project in October 1999 after the appointment of a Metadata Editor. Transatlantic training was arranged by way of an audio conference, and the Metadata Editor was introduced to the system. The initial training was followed up with a short period of time using the CORC practice system before the Metadata Editor started to add resources to the main database. CORC has been used primarily to create MARC records for electronic resources, including departmental web sites, electronic abstracting and indexing databases, and web resources recommended by academics on course web sites. The Library is also currently considering changing its current electronic journal policy of simply adding 856 links to records for print journals to one of creating separate records for electronic journals. CORC would be useful here as a repository of records which could be easily edited and imported into the Library's catalogue. The Library is also looking at ways of encouraging academic staff to recommend web sites for the catalogue in much the same way as they recommend print resources.
The University's Web Editor is developing a simple web form which will allow academics simply to paste in a URL; the URL will then be sent to a database of URLs, recreating for the world of the web the familiar cataloguing backlog for the Metadata Editor to work from. Finally, the Library would like to encourage the incorporation of good quality metadata in the University's own web pages, enriching them and making them potentially more useful to the science and engineering communities worldwide. CORC is one potential tool for creating Dublin Core HTML metatags which could be easily added into the HTML source for these pages.

Conclusion

The University of Edinburgh's experience with the metadata side of CORC has been largely positive. Access to an expanding database of good quality records for web resources saves the Library time and effort in creating records from scratch. The system saves on some original cataloguing with the automatic creation of a basic record, and allows the cataloguer conveniently to view the web site and record all on one screen. It provides good online help for editing in both MARC and Dublin Core, automatically maps between the two formats, links to OCLC authority files and provides URL checking. In this respect it is a useful tool which allows the Library to describe, present and maintain selected web resources for its users. On the downside, however, the raw metadata records are very basic and require a considerable amount of editing to bring up to an acceptable standard; the editing process tends to be somewhat slower than with a traditional library system cataloguing module, due to the reliance on the web and the variable speed of browser interactions; and finally, there is still a considerable degree of US bias with regard to the resources available on the system. We would wish to see this addressed through much greater adoption of the service by European academic libraries.

Zena Mulligan, SELLIC Metadata Editor, Edinburgh University Library
John MacColl, SELLIC Director & Sub-Librarian, Online Services, Edinburgh University Library

i OCLC WorldCat: http://www.oclc.org/oclc/menu/colpro.htm
ii OCLC: http://www.oclc.org/home/
iii OCLC Co-operative Online Resource Catalogue: http://www.oclc.org/oclc/corc/ (last accessed 04.04.01)
iv OCLC InterCat project: http://www.oclc.org/oclc/research/projects/intercat.htm
v OCLC NetFirst: http://www.oclc.org/oclc/netfirst/
vi OCLC Dewey Decimal Classification: WebDewey in CORC: http://www.oclc.org/oclc/fp/products/webdeweyincorc/webdeweyincorc.htm
vii Olson, Nancy, ed. Cataloguing Internet Resources: A Manual and Practical Guide, 2nd ed. OCLC, 1997: http://www.oclc.org/oclc/man/9256cat/toc.htm
viii Dublin Core Metadata Initiative: http://dublincore.org/
ix IMS Global Learning Consortium, Inc.: http://www.imsproject.org/

work_beqchdaomfcsdonudxsqfbbrgm ---- Practices and Patterns: the Convergence of Repository and CRIS Functions

Open Repositories 2019 * Hamburg, Germany * 11 June 2019

Rebecca Bryant, PhD, Senior Program Officer, OCLC Research
bryantr@oclc.org @RebeccaBryant18 https://orcid.org/0000-0002-2753-3881
Pablo de Castro, Open Access Advocacy Librarian, University of Strathclyde; euroCRIS Secretary
pablo.de-castro@strath.ac.uk @pcastromartin https://orcid.org/0000-0001-6300-1033
Dr. Annette Dortmund, Research Consultant, OCLC
Annette.Dortmund@oclc.org @libsum https://orcid.org/0000-0003-1588-9749
Michele Mennielli, International Membership and Partnership Manager, DuraSpace
mmennielli@duraspace.org @micmenn https://orcid.org/0000-0002-4968-906X

Today's presentation
• Introduction to OCLC Research, euroCRIS, and our collaborative research
• The background: evolution of repository-CRIS convergence
• Survey findings

OCLC Research
• Devoted to challenges facing libraries and archives since 1978
• Community resource for shared Research and Development (R&D)
• Engagement with OCLC members and the community around shared concerns
• Learn more: oc.lc/research, the Hangingtogether.org blog, oc.lc/rim

euroCRIS
An international not-for-profit association founded in 2002 to bring together experts on research information in general and research information systems (CRIS) in particular
>200 Members | 45 Countries | 15 Strategic Partners

Lígia Maria Ribeiro, lmr@fe.up.pt, Universidade do Porto – FEUP & EUNIS
Pablo de Castro, pablo.decastro@kb.nl, Stichting LIBER & euroCRIS
Michele Mennielli, m.mennielli@cineca.it, CINECA & EUNIS & euroCRIS
http://www.eunis.org/blog/2016/03/01/crisir-survey-report/
http://www.eunis.org/wp-content/uploads/2016/03/cris-report-ED.pdf

Building on previous research
Practices and Patterns in Research Information Management: Findings from a Global Survey (oc.lc/rimsurvey)
Rebecca Bryant, PI, OCLC Research
Pablo de Castro, Strathclyde University and euroCRIS
Anna Clements, University of St. Andrews and euroCRIS
Annette Dortmund, OCLC EMEA
Jan Fransen, University of Minnesota, Twin Cities
Michele Mennielli, DuraSpace and euroCRIS
Plus a number of valuable collaborators at OCLC

Repository/CRIS interoperability: background
• Designed as a mechanism to get repositories populated
• Original workflow: CRIS metadata records -> repositories
http://www.rsp.ac.uk/events/repositories-and-cris-systems-working-smartly-together/

CRISes and OARs
• CRISes and OARs developed substantially separately
• OARs primarily developed as discovery tools
  - Public facing
  - Generally populated via individual deposit
  - Development usually led by library staff
• CRISes mainly used as (data) management tools
  - Used within institutions, with little or no public interface
  - Often built to integrate with other systems to allow import of data
  - Most implementations led by institution's research offices
• Considerable overlap in the data that is collected by each
Slide by Mark Cox (KCL & euroCRIS Board, July 2011), http://www.rsp.ac.uk/documents/get-uploaded-file/?file=rspnottingham190711_embed.ppt

Integration of CRISes and OARs
• Increasing realisation in institutions that there is benefit from streamlining OAR and CRIS development
• Several potential ways of doing this:
  - Can use a CRIS for gathering all research related information, including research output, and feed/link to repository
  - Can expand repository to be able to cover a wider range of research related information
  - Can use portal functionality of sophisticated CRIS systems to mimic repository functionality
Slide by Mark Cox (KCL & euroCRIS Board, July 2011)

"Several potential ways of doing this": IR-as-a-CRIS
• Eprints-based IR-as-a-CRIS model operating at eg the University of Glasgow Enlighten repository, http://eprints.gla.ac.uk/
• Collection of additional info on funding sources requires connection to (interoperability with) project database
http://eprints.gla.ac.uk/186403/

An early attempt at describing the CRIS/IR landscape (2014)
Interoperability
• CRIS and IR Integration
• CRIS-as-IR
• IR-as-CRIS

Research assessment exercises: identified in the OCLC/euroCRIS survey as one of the main drivers for CRIS implementation worldwide.
Elusive landscape to pin down due to its swiftly evolving character, esp w/ the widespread arrival of commercial CRISs to the UK in the wake of REF2014.
Sense of technologies, workflows and institutional units at odds with each other (while warnings were issued on the 'false dichotomy' between IRs and CRISs, http://www.ukcorr.org/2012/09/26/three-perennial-questions/).
IT'S IMPORTANT TO KEEP COLLECTING UP-TO-DATE SNAPSHOTS OF THIS SWIFTLY EVOLVING LANDSCAPE
Commercial CRISs start trying to operate as repositories – which they were not originally designed to do.

OCLC/euroCRIS Survey: Methodology and promotion
• Online survey data collection: Oct 2017 – Jan 2018 (English and Spanish versions)
• Convenience sample (report is explanatory and descriptive in nature)
• Survey promotion through: OCLC and euroCRIS communications channels and events worldwide; communications by CRIS vendors and user communities; listservs, social media, and announcements to research & library organizations

The Surveys
• 2016 EUNIS-euroCRIS survey: 20 countries, 84 institutions, 70% with a CRIS
• 2017-18 OCLC/euroCRIS survey: 44 countries, 381 institutions, 58% with a CRIS

How many systems?
• In the 18% of the cases where both CRIS and IR systems are available, a single software platform is used for both

CRIS-IR Interoperability: Content and Functions

Internal Systems Interoperability
Which of the following internal systems interoperate with your RIM System(s)?
• Institutional Repository: 63%
• Library Management System: 8%
• Financial: 36%
• HR: 68%

External Systems Interoperability
The two surveys tracked different external systems interoperability.

Technologies

Fast forward to June 10th: OR19 WS on repo/CRIS interoperability/integration
• Attempt to capture an up-to-date snapshot of the evolved repository/CRIS co-existence & merging:
  - Case study for Haplo-based integration where repo & CRIS are one and the same (open source) platform
  - Seamless Worktribe/DSpace integrations
  - Case study for Pure-based interoperability
  - Best practice (integrated) DSpace-CRIS implementation at UnityFVG
• More importantly: the application of integrated/interoperable systems to areas like Open Science implementation, reporting for research assessment, and research data management surpasses the functionality achieved by repository-only or CRIS-only solutions

References
Bryant, Rebecca, Anna Clements, Carol Feltes, David Groenewegen, Simon Huggard, Holly Mercer, Roxanne Missingham, Maliaca Oxnam, Anne Rauh and John Wright. 2017. Research Information Management: Defining RIM and the Library's Role. Dublin, OH: OCLC Research. doi:10.25333/C3NK88
Bryant, Rebecca, Annette Dortmund, and Constance Malpas. 2017. Convenience and Compliance: Case Studies on Persistent Identifiers in European Research Information. Dublin, Ohio: OCLC Research. doi:10.25333/C32K7M
Bryant, Rebecca, Anna Clements, Pablo de Castro, Joanne Cantrell, Annette Dortmund, Jan Fransen, Peggy Gallagher, and Michele Mennielli. 2018. Practices and Patterns in Research Information Management: Findings from a Global Survey. Dublin, OH: OCLC Research. https://doi.org/10.25333/BGFG-D241
Bryant, Rebecca, Anna Clements, Pablo de Castro, Joanne Cantrell, Annette Dortmund, Jan Fransen, Peggy Gallagher, and Michele Mennielli. 2018. Data Set: Practices and Patterns in Research Information Management: Findings from a Global Survey. Dublin, OH: OCLC Research. https://doi.org/10.25333/QXR6-D439
Carr, Leslie. 2010. EPrints: a hybrid CRIS/repository. Workshop on CRIS, CERIF and Institutional Repositories, Italy, 10-11 May 2010. 2 pp. https://eprints.soton.ac.uk/271048/
de Castro, Pablo, Kathleen Shearer, and Friedrich Summann. 2014. "The Gradual Merging of Repository and CRIS Solutions to Meet Institutional Research Information Management Requirements." Procedia Computer Science 33: 39-46. https://doi.org/10.1016/j.procs.2014.06.007
Ribeiro, Lígia, Pablo De Castro, and Michele Mennielli. 2016. "EUNIS-euroCRIS Joint Survey on CRIS and IR." http://www.eurocris.org/news/cris-ir-survey-report

Discussion
oc.lc/rimsurvey

work_blynnbwkmrghpn7agu2hebqx6m ---- How One Part-Time Library Staff Member Can Provide Interlibrary Loan Service

Emily Knox

To cite this article: Emily Knox (2007) How One Part-Time Library Staff Member Can Provide Interlibrary Loan Service, Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 17:4, 87-94, DOI: 10.1300/J474v17n04_10
ABSTRACT. Interlibrary loan is an integral part of library service, but it is a time and resource consuming endeavor. This article presents several time and cost-saving policies and procedures that will provide interlibrary loan service in very small interlibrary loan departments. doi:10.1300/J474v17n04_10

KEYWORDS. Interlibrary loan, small staff, policies, procedures

Emily Knox is Associate Director and Reference Librarian, St. Mark's Library, General Theological Seminary, 175 Ninth Avenue, New York, NY 10011 (E-mail: Knox@GTS.EDU).

Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 17(4) 2007. © 2007 by The Haworth Press, Inc. All rights reserved.

HOW ONE PART-TIME LIBRARY STAFF MEMBER CAN PROVIDE INTERLIBRARY LOAN SERVICE

Interlibrary Loan is one of the most important services a library can offer. Small libraries have more of a need for interlibrary loan, especially on the borrowing side, than larger ones. They usually do not have the staff and resources for a full-fledged interlibrary loan department. Although the amount of time and attention interlibrary loan takes is discouraging, both lending and borrowing services are essential. In order to offer interlibrary loan without overwhelming a small staff, it is essential to use various time-saving measures.

This article will describe policies and procedures that are used in the St. Mark's Library, as well as others that were gathered in response to an E-mail solicitation to several listservs. St. Mark's Library is a net lender with approximately 400 lending requests (half of them filled) and 45 borrowing requests per year. St. Mark's does not use any specialized software such as Clio, ILLiad, or Ariel. Almost all interlibrary loan transactions are conducted through OCLC Resource Sharing. St. Mark's Library also accepts requests by fax or mail.

DEFINITION OF AN "ON-THE-FLY" INTERLIBRARY LOAN DEPARTMENT

There are several ways to characterize this particular type of interlibrary loan department. For convenience, this paper uses the term "On-the-Fly" to describe an interlibrary loan department where one person conducts all interlibrary loan activities but has other primary duties. The term is not used disparagingly, but indicates that interlibrary loan tasks are completed in a "get it done" manner. An "On-the-Fly" department is different from a one-person interlibrary loan department. A one-person department implies that interlibrary loan is the staff member's primary duty, and interlibrary loan takes up the bulk of his or her working hours.

RESOURCES IN INTERLIBRARY LOAN AND DOCUMENT DELIVERY FOR BEGINNERS

Most interlibrary loan literature is written for large interlibrary loan departments.
Although this can help an "On-the-Fly" interlibrary loan librarian, it takes a while to choose which information and new ideas are applicable to a particular institution. For example, quite a bit of interlibrary loan literature discusses new software and equipment, but it is often impossible for a small department to justify the expense of such products.

There are two books that can be useful to even the smallest of departments. Virginia Boucher's Interlibrary Loan Practices Handbook, though dated, introduces interlibrary borrowing and lending. The "On-the-Fly" staff member can review the suggested policies and procedures and decide what is best. Mary Jackson's Assessing Interlibrary Loan/DD Services: New Cost-Effective Alternatives, although clearly geared toward large research institutions, also offers cost-saving procedures that can be used in any library.

It is important for an "On-the-Fly" department staff to be aware of what is available in new technology even if they decide not to use it. If there is time, subscribing to interlibrary loan journals and listservs can be invaluable. Communications from OCLC regional service providers can also help staff members keep track of changes to OCLC Resource Sharing. OCLC offices distribute newsletters, offer training in resource sharing, and employ experts who can help you or refer you to someone knowledgeable in the field.

By e-mailing listservs, one can learn how small libraries all over the country handle interlibrary loan. Methods range from the archaic (doing everything on paper) to cutting edge. Many of the libraries that replied to the survey train student workers or volunteers to perform interlibrary loan tasks. It is a way to get interlibrary loan tasks accomplished if it becomes too overwhelming for one person.

GOOD POLICIES

All interlibrary loan departments, even the smallest, should take Leslie Morris' admonitions in "Why Your Library Should Have an Interlibrary Loan Policy and What Should Be Included" to heart. Most of the time-saving measures detailed below are policy decisions. In many libraries, interlibrary loan policies and procedures are often inherited and may be out of date. Taking the time to evaluate current policies carefully will save time.

My non-scientific survey of "On-the-Fly" departments indicates that some institutions are not taking advantage of changes in technology that could greatly reduce the amount of time spent on interlibrary loan. It may have made sense ten years ago to print everything and keep a notebook, but new technology allows most transactions to take place electronically. Keeping a paper trail can consume staff time and resources without providing any tangible benefits.

The less time one has to devote to interlibrary loan, the more stringent policies may need to be. While an institution's own patrons come first and all borrowing requests should be researched and processed, lending
Assessing Interli- brary loan/DD Services: New Cost-Effective Alternatives states that user-initiated interlibrary loan is a cost-effective borrowing method. Al- though St. Mark’s Library, nor any of the libraries that responded to the survey, use it. This is probably because it takes quite a bit of time to complete the initial setup. However, since it has been proven to be ef- fective, “On-the-Fly” departments should consider implementing user- initiated borrowing. When Even in tiny departments, borrowing requests should be processed as soon as possible. This may seem obvious but when there are no paper requests, a staff member must log into the interlibrary loan management software almost every day to check for pending borrowing requests. It is easy to forget to check when it is not a “processing day.” Establishing a time to check for requests every day (e.g., right after lunch) is a good way to make sure that the task is completed. What Libraries with a small staff can also maintain a policy of not borrow- ing materials that are available at the local public library or at libraries with reciprocal borrowing privileges. Although patrons prefer to have the materials they need delivered to them, this policy can greatly reduce the number of borrowing requests. Fees Library budgets are tight and interlibrary borrowing can be expen- sive. Charging patrons interlibrary loan borrowing fees can help, but 90 Journal of Interlibrary Loan, Document Delivery & Electronic Reserve only if too much time and effort are not expended trying to collect the fees. A few of the surveyed libraries mentioned that they simply add the borrowing fee to the patrons’ library record. This eliminates the need to generate paper invoices for patrons and saves processing time. Other in- stitutions simply do not charge for interlibrary loan service because it eliminates record keeping and collection duties. Many libraries try to borrow from institutions that do not charge fees because of either consortia agreements or institutional policies. If a charge is assessed, the interlibrary loan fee management (IFM) sys- tem makes keeping track of these fees much easier. (IFM is discussed later.) Keeping Track Keeping track of borrowed material is probably one of the most vex- ing problems in interlibrary loan. Some institutions add an interlibrary loan item to patron records so that the material can be tracked with other items borrowed from the library. This can be a workable solution if add- ing dummy items to your circulation system is not too onerous. If you do not have a system that automatically prints out straps or la- bels, coming up with visual cues for interlibrary loan books can take some ingenuity. Handwriting “ILL” and the due date on removable la- bels that are placed on borrowed books is one solution. It is important to be aware, however, that these labels can be expensive and some lenders do not want them used on their books. Running a “borrowing library” search on OCLC Resource Sharing every so often will help remind the staff member of what is currently borrowed. If an item is due in the near future, the staff member can send a reminder E-mail to the patron. Communicating with patrons via E-mail reduces the need for more time-consuming phone calls or mailed notices. IDEAS FOR LENDING Interlibrary lending is the area where an “On-the-Fly” department can save the most time. 
Remember that not all requests must be filled, since other libraries with the same material and more people on staff can also fill the request. This may seem obvious, but librarianship is a service profession. Most librarians do everything they can to get a patron what they need, even if that patron is halfway across the country. It is important to remember, however, that interlibrary loan only functions when libraries are willing to lend materials, and when each institution participates in the system fully and fairly.

Who and When

Always triage your requests. This can be difficult. How do you decide that one request is more important than another? One way is to establish a hierarchy for requestors. Lending requests from consortia and affiliated libraries are always completed; others are moved further down the queue. Some libraries complete incoming requests on particular days of the week. On slow days, however, check the pending queue to see if there are any requests that can be filled quickly or to see if your institution is the only one listed in the queue.

What

In order to save time and resources, it makes sense to consider not lending any special collection material or theses because of their rarity and the increased need for careful packaging, tracking, etc. Sending only circulating books and articles reduces the amount of preparation time required for each item. Carefully wrapping interlibrary loan items is recommended in the Interlibrary Loan Practices Handbook (Boucher 1997, 60), but it seems excessive for a $20 book to be sent in a padded envelope. Instead, eliminate the need for extra wrapping by having a policy of only sending books that are robust enough to withstand the shipping process.

Fees

OCLC's IFM system has eliminated the need for invoices. "If an institution does not use IFM, and they charge for borrowing, we will not lend to that library." This might seem overly harsh, but it takes more resources to create, print, and mail an invoice than a $15 fee recovers.

Keeping Track

Some libraries create an individual patron record in their circulation system for each borrowing library. However, it is easier to keep track
Going paperless means using as many electronic interlibrary loan resources as possible. "On-the-Fly" departments cannot afford software, such as Ariel, for electronic transmission of articles, but a decent scanner is all that is needed to send articles to a patron's e-mail address or to a borrowing library. ILLiad's Odyssey program, which allows electronic article delivery between Odyssey libraries, is now available, free, to all libraries for downloading.

GETTING INTERLIBRARY LOAN DONE

The above suggestions may seem counterintuitive to some of the fundamental philosophies of librarianship. Instead of saving the patron's time, they save staff time. The ideas will allow any library, no matter how understaffed, to begin to provide expected and valuable interlibrary loan service to patrons.

BIBLIOGRAPHY

Boucher, Virginia. Interlibrary Loan Practices Handbook. Chicago and London: American Library Association, 1997.

Jackson, Mary, Bruce Kingma and Tom Delaney. Assessing Interlibrary Loan and Document Delivery Services: New Cost-Effective Alternatives. Washington, DC: Association of Research Libraries, 2004.

Morris, Leslie. "Why Your Library Should Have an Interlibrary Loan Policy and What Should Be Included." Journal of Interlibrary Loan, Document Delivery and Electronic Reserves 15:4 (2005): 1-7. doi:10.1300/J474v17n04_10

APPENDIX

E-mail Questionnaire

In May 2006, I sent an initial e-mail to NewLib, ATLANtis, GSLIS (UIUC) Discussions and GSLIS (UIUC) Alumni to find libraries that had an interlibrary loan department that matched my definition of "On-the-Fly" libraries. The following questions were sent to all who responded:

1. How many hours a week do you spend on interlibrary loan? Is the work done on certain days or throughout the week?
2. How do you manage incoming and outgoing requests? Do you use computer software to do interlibrary loan?
3. If you charge for the service, how do you keep track of the invoices and money?
4. Is someone else in your library trained to do interlibrary loan if you are away?

work_bnofrdlstnbd3acu3jfsga4fyu ----

The Impact of Approval Plans on Acquisitions Operations and Work Flow

By: Rosann Bazirjian

Bazirjian, Rosann. "The Impact of Approval Plans on Acquisitions Operations and Work Flow." The Acquisitions Librarian 8(16): 29-35 (1996). [Also published in Approval Plans: Issues and Innovations (edited by John Sandy), pp. 29-35, Haworth Press, New York (1996).]

Introduction

Many papers about approval plans address their effect on collection development and subject bibliographers. We are told that approval plans provide systematic coverage of profiled subjects, and ensure that libraries will not miss newly published titles. We also are informed that approval plans allow our subject bibliographers and/or faculty to more closely evaluate monographs so that they can make wise purchase selections. In addition, they free up our professionals from the time-consuming work of title-by-title selection. All of these statements are true. Approval plans have made a world of difference in the area of collection development. Few papers on approval plans, however, address the impact they have on our daily acquisitions operations and work flow. This paper will focus on their impact on acquisitions procedures and the personnel who process them.

Approval Plan Popularity

It is clear that approval plans have been and continue to be extremely popular.
In fact, in a 1988 survey of the Association of Research Libraries, over 90% of the respondents claimed that they used approval plans.1 Even back in 1968, Peter Spyers-Duran, who organized the first conference on this new phenomenon, bravely stated that "approval gathering plans are here to stay."2 These days, one is hard-pressed to find a university library not using approval plans to one extent or another.

Benefits of Approval Plans

Papers which have focused on approval plans from the acquisitions perspective have cited savings in staff time as one of the most important benefits for acquisitions. Back in 1971, one such article claimed that a "well-managed approval plan can save at the minimum one full-time position, with significantly higher savings possible depending upon variances in internal procedures."3 In 1979, Cargill and Alley were citing a savings in time and labor as rationale for the use of approval plans.4 In addition, the 1988 report of the Association of Research Libraries stated that "savings in staff time"5 was the most common reason cited for having an approval plan. Today, as the demand upon libraries continues to increase and budgets continue to decrease, approval plans are often seen as a way of coping without increasing staff.6 Acquisitions librarians see approval plans as a way of reducing the number of firm orders which must be processed on a daily basis,7 thus absorbing some of the workload from our overburdened staffs. As our libraries continue to downsize, approval plans can be seen as a way of coping with reduced staffing levels. Thus, papers which have dealt with approval plans and acquisitions departments, for the most part, cite savings in staff time as one of the most beneficial reasons for setting up a profile. Perhaps it is time to say that this is not necessarily true. Approval plan processing in acquisitions can be extremely labor intensive, as well as disruptive to work flow.

Negative Impact on Acquisitions

Time Intensive

Is it possible to say that approval plans are labor intensive when we are supposed to be utilizing them because they save us time? The answer is yes. First, we must consider the extra time spent in monitoring the plan. Depending on the institution, monitoring the plan can involve either professional or clerical staff. As was eloquently stated by Robert Nardini, even "the most discriminating selectors cause the most work for acquisition."8 A bibliographer may ask acquisitions personnel to verify why a particular title has arrived at the library and perhaps to what portion of the profile it applies. In addition, and along these same lines, even before a firm order is placed, the library with an approval plan might need to think in terms of the profile before simply placing that order. For example, if a library has a university press profile, and a bibliographer submits an order for a university press title, various online databases and/or microfiche need to be checked first to verify whether or not that title might automatically be arriving. Again, this involves greater labor effort and is time intensive. Not only is monitoring of the plan time intensive, but so are the inevitable book returns. Searchers at Syracuse University Library found that many duplicates are received with an approval plan.9 That could very well be due to procedures as they are implemented, but regardless of the reason why this happens, returns are time consuming and costly.
Our staff must alter the invoices, cull out the titles to be returned, and prepare financial calculations: all labor-intensive activities. Although the author cannot cite specific statistics, it is felt that the proportion of firm order returns compared to approval returns is marginal. At times, manual files need to be kept and consulted. If a library chooses not to establish records in their online system for a title selected from form selection, for example, searchers may be required to sift through an alphabetized file of all titles ordered on approval to avoid duplication. Once the book is received, it may then be necessary to once again review the file to remove the slip.

Labor Intensive

Magrill and Corbin's suspicion that staff time saved at one point of the work flow is merely transferred to another10 is absolutely correct. The selection time saved for collection development librarians is simply transferred to the extra time it takes to process these titles in the acquisitions department. As Martin Warzala states, "by adding labor intensity to library processes associated with approvals . . . the client is defeating part of the purpose of approval service."11 Approval plans often involve exceptions to routine work procedures, and it is those exceptions which make the processing of approval plans labor intensive. Joe Barker states that rather than approval plans resulting in a reduction of staff at Berkeley, "approval plans result in a shifting of work from one area to another."12 He adds that, as an example, their receiving unit took on more approval plan receipts, selector review shelves, more returns, and more disruptive checking and creating of records on receipt.13 The library at Syracuse University is in the practice of returning hardbound books received on the approval plan when a subject bibliographer has determined that he/she would like the less expensive paper edition. Again, this is another exception to handle. It involves a return as well as placement of a firm order for the desired paper edition. Because it is out of routine, work flow is affected and extra labor is spent.

Disruptive to Workflow

Approval plans are disruptive to acquisitions workflow. As stated, they tend to be the exception rather than the norm, and with exceptions one tends to associate the problems.14 Usually, we need to maintain separate files. We also tend to write special handling procedures. These exceptions to the workflow require more complex procedures in order to process approval plans effectively. Approval plans prescribe "a more complex set of acquisitions practices than would be needed if everything were ordered using one method."15 One staff member at Syracuse University stated that "everything stops"16 when the approval shipment arrives. Firm orders are pushed aside as approval titles are given priority due to their nature and the necessary review process. We need to provide special viewing areas and set up review schedules. Our work flows "must accommodate the needs and schedules of selectors visiting the approval review shelf,"17 and that is disruptive. For libraries receiving a very large number of approval titles on a weekly basis, both "physical and staffing problems"18 can result when attempting to display the titles for review and schedule their removal. Also, the constant reminders to bibliographers to review these shelves are disruptive as well.

Possible Alternatives

What can be done to make approval plan processing less problematic to acquisitions workflow and procedure?
It is important to view approval plans in conjunction with other acquisitions procedures rather than as a separate entity. The fewer the exceptions to the workflow, the better. Do not view approval plans in isolation from other acquisitions functions. As Axford says, "this is analogous to designing a powerful new automobile engine without facing up to the necessity of also redesigning the extra drive train to achieve the desired level of performance."19 Supervisors need to make certain that they are constantly examining the workflow and not making exceptions to procedures. It is important to involve the staff who will be processing the material in all procedural decisions. Find out what can be handled in the least disruptive way from those directly involved in the process. Every effort needs to be made to streamline processing as much as possible.

We must also keep in mind that technology is constantly changing. Back in 1987, the survey of the Association of Research Libraries noted that "the effect of automation on approval plans is not yet very great."20 However, the report continued to say that "advances in the automation of acquisitions processes may change the way approval plans are handled in the future. Direct electronic transmission of bibliographic files from the vendor to the library may make it possible for libraries to do title-by-title review."21 Not only is this occurring now, but our staffs are able to toggle between bibliographic utilities, local library management systems, and internet resources on one personal computer. This is certainly a help when trying to process our approval plans more efficiently. It is important to make use of the new technology in order to make approval plan processing less tedious. Manual "on order" files should be reflected online; software such as Blackwell North America's New Titles Online (NTO) should be readily available and consulted. The searcher should be able to toggle between NTO and their online system as well as their bibliographic utility, once again in an effort to streamline workflow.

Of course, the advent of PromptCat is certain to change things even further. With this service, when a book vendor sends a new approval title to a particular library, the vendor will also inform OCLC of the transaction. OCLC will then automatically add the library's holding symbol to the corresponding OCLC record and transfer the record to the library's own online system. This product is designed to "increase efficiency in technical processing."22 PromptCat attempts to streamline acquisitions and cataloging "with minimal intervention by library staff."23 During testing of PromptCat at Michigan State University, it was reported that staff time was saved "due to efficient processing and reduced editing time."24 As Marda Johnson from OCLC states, "you can shape PromptCat to your library's workflow"25 by selecting various processing options which meet your library's specific needs. Again, this is the key to the efficient integration of approval plans in technical services: consider work flow and staff when processing approval plans, and make sure procedures are streamlined as much as possible. It will never be possible to treat approval plans the same way we treat firm orders, but the less disruptive we make procedures, and the more we try to conform to the work flow in place, the more our approval plan will work for us rather than fight us.
If our acquisitions department procedures are efficient, and our approval plan processes are well thought-out and constantly examined, we can minimize the disruption to work flow, and perhaps, just perhaps, make our approval plans work for us.

References

1. Robert F. Nardini, "Approval Plans: Politics and Performance," College and Research Libraries, 54 (5) (September, 1993), p. 418.
2. ibid., p. 417.
3. H. William Axford, "Economics of a Domestic Approval Plan," College and Research Libraries, 32 (5) (September, 1971), p. 371.
4. Martin Warzala, "Evolution of Approval Services," Library Trends, 42 (3) (Winter, 1994), p. 515.
5. op. cit., Nardini, p. 419.
6. ibid., p. 419.
7. ibid., p. 419.
8. ibid., p. 419.
9. Meeting, Searching Section of the Bibliographic Services Department, Syracuse University Library, May 8, 1995.
10. Mary Rose Magrill and John Corbin, Acquisitions Management and Collection Development in Libraries, 2nd edition, Chicago: American Library Association, 1989, p. 126.
11. op. cit., Warzala, p. 518.
12. Joseph W. Barker, "Vendor Studies Redux: Evaluating the Approval Plan Option From Within," Library Acquisitions: Practice and Theory, 13 (2) (1989), p. 136.
13. ibid., p. 136.
work_bzhjv2biz5b2rkrpcuhpmbjb3i ----

OCLC Micro. Vol. 8, no. 5, p. 18-23, 1992. ISSN: 8756-5196. doi: 10.1108/EUM0000000003692. © MCB UP Ltd. http://www.emeraldinsight.com

The Need for Funded Research

Tschera Harkness Connell is an assistant professor for the School of Library and Information Science, Columbus Program, at Kent State University.

Part I: Barriers to Effective Subject Access in Library Catalogs

The first part of this article is a brief summary of an OCLC-funded project, "Identifying Barriers to Effective Subject Access in Library Catalogs," in which I participated. The project was under the leadership of Professor F.W. Lancaster of the University of Illinois at Urbana-Champaign. A more complete report of the project is published elsewhere (Lancaster, Connell, Bishop, and McCowan, 1991).

Purpose

The purpose of the study was to determine the probability that a skilled catalog user would retrieve "the best" materials represented in the catalog on some subject and, if they are unable to retrieve the best materials, to determine what changes would be needed to ensure that future catalogs would allow the user to retrieve more of the better materials.

Background

In most studies on how to improve subject searching in online catalogs, success is measured in terms of whether the user is able to match subject terminology with the terminology of the catalog, or whether or not the user selects an item or items from among the items retrieved from the match. Such definitions of success do not consider whether or not the user has located anything useful. They do not address the issue of whether the user can locate what is in some sense best (i.e., the most complete, the most up-to-date, or the most authoritative). Traditionally, the purpose of the catalog is not prescriptive. In fact, one of the purposes of the catalog is to present all the related works in the collection. However, it is the assumption of this study that users want to be able to locate what in some sense is the best. "Best" in this study is defined as "recommended." We examined whether or not skilled users would retrieve books that appeared on lists, compiled by specialists, of recommended readings in various subject areas.

Methodology

Fifty-one bibliographies on a wide range of topics were assembled. The lists were obtained from faculty and from recommended readings appearing in recently published articles in encyclopedias or encyclopedic dictionaries. The sample of topics used in the study was determined by the availability of faculty lists and fairly recent (1983-1989), specialized bibliographies containing significant numbers of items likely to appear in the catalog of a research library. For each bibliography, the following steps were taken:

1. Journal articles were eliminated, since traditionally these have not appeared in library catalogs.
2. A search on the topic of each bibliography was performed in the "full" online catalog of the University of Illinois (FBR) by two members of the team who were familiar with FBR. These two members performed all the searches. The searches were performed on the basis of the title of the article or bibliography only.
The searchers did not see the bibliography until after the search was completed.
3. Items in the bibliography not retrieved by the subject search, and then subsequently determined not to be owned by the University of Illinois, were eliminated.
4. Items not retrieved by the original subject search were gathered and examined to determine why items presumed to be relevant to a particular topic were not retrieved in the original subject search. An evaluation was made to determine how the search strategy or characteristics of the catalog would have to be changed to allow these to be retrieved.

Results

The results of the fifty-one searches varied from eight cases having 100 percent recall to two searches with zero recall. The fifty-one bibliographies collectively contained 607 items included within FBR, and of these, 327 were retrieved in the subject searches. If we simply average these numbers (327/607), we get an average recall of 53.9 percent. This result is probably higher than what most users would achieve. The searchers were instructed to search broadly, which means that the searchers used all seemingly relevant terms that they identified in LCSH, at any level of specificity, and not just the broadest applicable subject headings. Thus, a search on Pre-Columbian religions included terms related to specific religions as well as the more general terms. The searchers were further instructed to give no concern for the precision of the search. For example, to get a high recall on the Gumbel distribution, which relates to the statistics of extremes, the searchers used broad terms such as Mathematical statistics and Stochastic processes, which retrieve records for more than 1,200 items. This same situation applies to other searches. Therefore, while recall was high in a few of the fifty-one searches, these results would not be achieved under real-life conditions because a library user would just not be willing to look through hundreds of records to find a handful of items. High recall and high precision occurred only in situations where the subject of the search coincided closely with a subject heading or headings. For example, the search on the image of women in the Bible achieved 75 percent recall on the single heading Women in the Bible and could have achieved 100 percent recall by the use of the additional term Woman (Theology). Such close matches between subject heading and topic were rare.

The main purpose of the study was to determine what might be done to library catalogs to make them more effective tools for subject access. With this in mind, items that were not retrieved by the initial subject search were examined in order to determine what changes in the catalog and/or indexing policy would be necessary to make it possible to retrieve the recommended items (see Table 1). In looking at the subject headings assigned to the items, it was determined that had we searched on closely related headings, the recall ratio would have increased to 378/607 (62.3 percent), an improvement of a little more than 8 percentage points. For example, had we searched on the heading Glossolalia for the topic "spirit possession" or the heading Poverty--Government policy--United States for the topic "hunger and malnutrition in the U.S.," recall would have slightly improved. Of course, this is theoretical improvement based entirely on hindsight. If the searches had been broadened to include titles and other information, little improvement in recall would have occurred.
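To make the arithmetic behind these recall figures explicit (all of the numbers below are taken directly from the study as reported above), the overall recall and the hindsight improvement work out as follows:

$$\text{recall} = \frac{\text{recommended items retrieved}}{\text{recommended items held in FBR}} = \frac{327}{607} \approx 0.539 = 53.9\%$$

$$\text{recall}_{\text{related headings}} = \frac{327 + 51}{607} = \frac{378}{607} \approx 0.623 = 62.3\%, \qquad 607 - 378 = 229 \text{ items still unretrieved.}$$

That remainder of 229 items is the figure discussed next.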
Only ten of the 229 items not retrieved by subject headings would have been retrieved. That extending a search from subject headings to titles or subtitles has minimal effect on recall suggests that the subject headings assigned are very "close" to the terminology of the titles. The results shown in Table 1 might suggest that the problems of subject access in library catalogs could largely be solved if the text of contents pages and/or indexes were stored in a form suitable for searching. Nothing could be farther from the truth. Many searches on extended records would retrieve thousands of items rather than the hundreds that were retrieved in many of the searches on existing records alone. Only in the case of an atypically specific search, involving a rather rare word or name (such as Gumbel), might the enhanced record improve search results.

Implications

In this study comprehensive searches were performed in order to determine to what extent the items considered important by the expert could be retrieved by the persistent and diligent searcher. Some of the failures were due to factors other than indexing policy and catalog design. Examination of recommended items showed that a significant number of the readings recommended by experts are relevant by analogy only. Although these items may be important to the topic, it is difficult to see how they could have been retrieved by any likely search approach. For this reason alone, if one wants to know the best things to read on some topic, there is no substitute for consulting an expert, either directly or indirectly (e.g., through an expert-compiled bibliography). What is discouraging about the results of this study is that items clearly relevant to the topic are often difficult to retrieve. Library catalogs as now designed permit only superficial subject searches. Lack of specificity in subject headings, coupled with the fact that the catalog provides access only at the level of the complete bibliographic item, makes it virtually impossible to achieve high recall at an acceptable level of precision. The ability to search on other parts of the bibliographic record (e.g., keywords in title), and even to access a greatly enhanced record (e.g., containing contents pages or book indexes), does not improve performance as much as one might expect. The library catalog, as it now exists, may provide adequate subject access for a small collection, but it is inadequate for a large, multidisciplinary library. Despite popular belief, the transformation of the card catalog into an online database has not significantly improved subject access. Indeed, it may have made the situation worse because it has led to the creation of much larger catalogs that represent the holdings of many libraries. The present study shows solutions must be sought in providing ways for the user to browse large groups of retrieved bibliographic records, to discriminate among the records retrieved, and ultimately to choose from among the items themselves. Partial solutions to the problems may involve the adoption of detailed analytical subject cataloging and the development of finer-tuned vocabulary control. Users need ways to explore categories of related headings without having explicitly to think up or key in all relationships.
We must continue our research efforts to explore both conceptual and mechanical solutions to the difficulties of achieving both high recall and precision in the online catalog.

Part 2: An Educator's View of Research

The second part of this article gives one educator's view of research. This view is described in the context of the education and research roles of the library and information science educator.

The University Research Environment

Although research can be triggered under a fruit tree with little more than an inquiring mind and a falling apple, promoting research requires creating an environment that nurtures ideas and encourages participation. Ideas for research abound,1 and collegial support within the university provides the natural forum for the discussion and nurturing of ideas. Participation is encouraged with the provision of the tools and resources that enable faculty to accomplish research. Adjusting faculty workloads through the use of research assistants, laboratory monitors, and adjunct faculty, plus minimizing committee work, are some of the ways that administrators have traditionally encouraged faculty participation in research. The building of library research collections has been one way of providing tools. However, universities typically do not do as well in the provision of scientific equipment, computers, or software. Because of limited funding, universities often are unable to keep up with industry in terms of state-of-the-art technology. While it is possible for industry to pass the cost of upgrades on to customers, academic departments within universities are more limited in their funding options. This cost limitation puts university researchers at a disadvantage. Inadequate funding may limit the questions that can be asked, or at least the ways in which the questions can be examined. Inadequate funding also limits the training that can be passed on to students. In the fields of library and information science, technology is critical. Our research agendas call for examination into issues such as access to information, the information needs of users, the uses of information (U.S. Department of Education, Office of Educational Research and Improvement, 1988, pp. 16-19), the impact of technology on research, and preservation studies (Lynch and Young, 1990, pp. 235-240). All of these require interdisciplinary approaches (for which the university provides an excellent setting) and state-of-the-art technology (for which the university is hard pressed to keep up). As university budgets continue to shrink, the funds to pursue these questions will need to come from outside the university.

Teaching Students Research Skills

Performing research is just one aspect of the educator's role; teaching students is another. For many of us, it may well be that our most enduring contribution will be the impact that we make on our students. In this context of teaching, let me describe what I consider to be a somewhat schizophrenic role of the library and information science educator. Most of our students are master's students, at the beginning of their professional careers. As educators, it is crucial that we enable these students to be intelligent consumers of research. We must help students learn to evaluate research well enough to be able to incorporate useful findings into their thinking and into their decision making. It is even more important that we help our students develop an enthusiasm for research.
Basic research skills will aid them in problem solving, marketing, and planning. However, inculcating enthusiasm is very difficult to do. Certainly it helps if we are enthusiastic about our own research. But the reality is that for many students the immediate concern is how to get a job and how to perform on that job. I teach in the area of technical services. In cataloging courses, the student preoccupation with the immediate translates to more of an interest in "how to" than "why." Yet, if we do not address the broad issues of information management, our students are not prepared to evaluate new approaches to organizing information for access. If we do not help students understand why particular solutions have been adopted in the past, they are less likely to understand the complexities of the issues that new technologies present. Finding the proper balance between theory and practice is often difficult. The ultimate goal must be to teach students the skills and attitudes needed for self-directed inquiry. Teaching the skills and attitudes for self-direction is the means for unifying the theory/practice schizophrenia. These are the skills and attitudes that are essential to a competent practicing professional information specialist. These are also the skills and attitudes that are essential to persons involved in research.

The tension between practice and theory is usually not a problem with doctoral students. The chances are good that our doctoral students already have a commitment to research by the time they enter the Ph.D. program. Many are established in the field before they begin the degree. Because of this, their needs are similar to those of faculty, but just a step removed. They need nurturing, time, and tools. In addition, they need opportunities to work with experienced faculty engaged in research. Obviously, one of the ways to get students (at any level) excited about research is to get them involved in the process: to provide for them a positive research experience. This can be partially accomplished through assignments and research papers. It can also be accomplished, at least on a limited scale, by involving students in faculty research projects. The OCLC-funded project that I participated in involved a senior faculty member and three students. Two of the students were master's degree students working as graduate assistants. I was a doctoral student, participating out of interest. I found the interaction of the group process invigorating. It seems to me that the process has a number of advantages. For the student, the group process provides a protective environment yet at the same time an intellectually challenging one. For the advancement of certain kinds of knowledge, the group process may be necessary due to the complexity of the issues under examination. The interdisciplinary nature of many of the questions facing library and information science researchers requires a variety of skills not likely to be present in a single person.

Much of OCLC's research has led to products or services that have greatly altered the ways that libraries provide their services.

OCLC

Since 1986 OCLC, through its Library School Research Equipment Support Program and later its successor, the Library and Information Science Research Grant Program, has funded approximately thirty such projects.
The grant program "assists schools of library and information science to conduct high-quality technical research" by funding release time from teaching for the principal investigator, research assistants, travel, equipment, and other project-related expenditures (OCLC, 1988, p. 41). Applicants for the program must explain the significance of the research and why the research is innovative, and suggest future directions the research might take. From an examination of the variety of projects (both basic and applied) that have been funded over the years, it is apparent that this grant opportunity is a pretty open invitation: projects need not be tied to OCLC products, services, or research agendas. OCLC is to be commended for that.

Open-ended funding opportunities are rare. We need more such funding. We need funding that can be used for basic research. For too long we have depended upon tradition within the field and commercial vendors outside the field to define our services. To take an example from recent history, during the early stages of the online catalog we were experimenting on our users with very rudimentary and inefficient automated catalogs. Since then, researchers have begun to look at the organization and retrieval of knowledge more broadly in terms of how knowledge is created, stored, and used. However, more exploration is needed on how people seek information. More testing is needed to see how proposed solutions work in different environments (e.g., disciplines, institutions, cultures). It is the basic research that will give us the depth of understanding needed for future, viable applied research. It will be a strong foundation of basic research that will eventually enable us to design and build the tools that provide access to ideas, whether the ideas are accessed through people, through print, or through processors.

Over the years, OCLC has been heavily involved in research. OCLC itself is a product of research. For example, cataloging workflow studies have led to products that have streamlined cataloging operations and have changed the day-to-day operations in technical services. More recently, OCLC research into using the online union catalog for collection evaluation has led to a new service, the OCLC/Amigos collection analysis system. However, OCLC has also been heavily involved in basic research. Research into retrieval techniques, document structures, and interface design and management may (and probably will) result in products, but for the present, research in these areas increases our basic understanding of the interactive effects of language and text, and of presentation and retrieval techniques.

In the 1991 summary of OCLC's strategic plan, OCLC states that its strategy "is to return to the basics, to fundamentals": to emphasize librarianship, collaboration, and cooperation, and to build upon its strengths in cataloging, resource sharing, and excellent user service (OCLC, 1991, p. 12). Library and information science faculty have similar interests in building foundations. I would encourage library and information science educators and OCLC to look for ways of increasing opportunities for OCLC/faculty collaboration. Perhaps schools of library and information science and OCLC could co-sponsor summer research institutes for groups of researchers to explore predefined topics of mutual interest. These institutes could be designed to promote interdisciplinary research. They could be designed to promote practitioner/educator collaboration.
If held at OCLC, these institutes could offer technical support not possible at most schools of library and information science. Such institutes would also provide a means for technology transfer back to the universities. Without the opportunity to use and evaluate state-of-the-art technology, it is difficult for faculty to know what is needed or to know what to ask for. Another idea for cooperation would be to use schools of library and information science as test sites for developing OCLC products. This would be a way of increasing the research involvement of new recruits to the field. There are many ways to take advantage of our shared research interests. I would encourage library and information science educators and OCLC to continue to explore ways of doing so.

Note

1. For examples in library and information science see "Research Questions of Interest to ARL" in U.S. Department of Education, Office of Educational Research and Improvement, 1988, Rethinking the Library in the Information Age, Vol. 1; "Research Questions of Interest to CLR" in Lynch and Young, eds., 1990, Academic Libraries: Research Perspectives; and "ACRL Research Agenda" in College and Research Libraries News, 51 (April 1990): 317-319.

References

ACRL Research Agenda. (1990). College & Research Libraries News, 51, 317-319.
Lancaster, F.W., Connell, T.H., Bishop, N., & McCowan, S. (1991). "Identifying Barriers to Effective Subject Access in Library Catalogs." Library Resources & Technical Services, 35, 377-392.
Lynch, M. J. & Young, A., Eds. (1990). Academic Libraries: Research Perspectives. Chicago: American Library Association.
OCLC. (1988). Annual Review of OCLC Research, July 1987-June 1988. Dublin, Ohio: OCLC Online Computer Library Center.
OCLC. (1991). Journey to the 21st Century: A Summary of OCLC's Strategic Plan. Dublin, Ohio: OCLC Online Computer Library Center.
U.S. Department of Education. Office of Educational Research and Improvement. (1988). Rethinking the Library in the Information Age (Vol. 1). Washington, DC: U.S. Government Printing Office.
work_bpacbmsecba7paq4k3clw5ahse ----

Türk Kütüphaneciliği 23, 3 (2009), 448-488

Hakemli Yazılar / Refereed Papers

Elektronik Yayıncılığın Ortak Koleksiyon Geliştirme ve Kütüphane Konsorsiyumlarına Etkileri ve Türkiye'deki Uygulamalar

The Effects of Electronic Publishing on Co-operative Collection Development and Library Consortia and the Applications in Turkey

Mehmet Toplu *

* Assist. Prof. Dr., Gazi Üniversitesi, Enformatik Bölümü. e-mail: mtoplu@gazi.edu.tr
Keywords: co-operative collection development; library consortium; resource sharing; electronic publishing; collection management Giriş Enformasyon merkezlerinin gelişiminde, kullanıcıların nitelik ve talepleri, bütçe olanakları, bilginin basım ve yayımında kullanılan teknolojiler ile bütün bunlara bağlı Elektronik Yayıncılığın Ortak Koleksiyon Geliştirme ve Kütüphane Konsorsiyumlarına Etkileri ve Türkiye'deki Uygulamalar The Effects of Electronic Publishing on Co-operative Collection Development and Library Consortia and the Applications in Turkey | 450 olarak yürütülen politika ve uygulamalar önemli derecede belirleyici olmuştur. Kullanıcıların nitelikleri ve bilgi edinme amaçları, enformasyon merkezlerinin hangi yapı içerisinde gelişeceğini ortaya koyarken, koleksiyonun gelişimini de tanımlamaktadır. Kullanıcıların talepleri, koleksiyonun gelişiminde temel dayanak noktasını oluştururken, aynı zamanda enformasyon merkezlerinin bu beklentiler doğrultusunda belirli alanlarda uzmanlaşmasına ve hizmetlerini biçimlendirmelerine katkı sağlamaktadır. Koleksiyonun gelişiminde kullanıcı talepleri kadar etkili olan, hatta zaman zaman onun önüne de geçebilen bir başka faktör, bilgi yayım araçları ve yöntemleridir. Bilginin insan eli ile yazılıp çoğaltıldığı dönemlerde temel yaklaşım, koleksiyonu korumak üzerine odaklanırken, basım teknolojilerinin gelişimi bunu tümden değiştirmiş, kullanımı ve yararlanmayı ön plana çıkarmıştır. Ayrıca basılı ve kâğıda dayalı (kitap, süreli yayın vb.) bilgi kaynaklarının etkili olduğu dönemlerde, enformasyon merkezi odaklı ve kurumsal temelli koleksiyon gelişimi ve hizmet sunumu güçlenerek varlığını devam ettirmiştir. Elektronik yayıncılığın gelişimi ve yaygınlaşması, koleksiyon geliştirme politikalarını ve buna bağlı olarak enformasyon hizmetleri ile ilgili bütün yapı ve kurumların yeniden şekillenmesini gerekli kılmıştır. Bu gelişmeler, enformasyon merkezlerinin tanımlanmasında, kurumsal ve mekânsal algılamalardan ziyade, hizmet odaklı bakış açısını ön plana çıkarmıştır. Elektronik yayıncılık asıl önemli etkisini, koleksiyonun gelişimi ve buna bağımlı hizmet sunumunda göstermiştir. Yeni koşullarda, basılı kaynaklar döneminde bir zorunluluk olan mekânsal bağımlılıklar etkinliğini yitirmeye başlarken, enformasyon merkezleri bilgi kaynakları için herhangi bir depoya gereksinim duymadan çok daha fazla bilgiyi daha etkin bir biçimde kullanıcıya eriştirebilir hale gelmiştir. Bu gelişmeler, kullanıcıların enformasyon merkezlerine olan bağımlılığını ortadan kaldırmış, internet bağlantısının ve elektronik yayınlara erişim için gerekli iznin bulunduğu her ortamda bilgiye erişilebilir kılmıştır. Ayrıca elektronik yayıncılık sadece metinsel değil, aynı zamanda çoklu ortam olarak adlandırılan görsel işitsel her türlü enformasyon kaynağını aynı kanallarla erişilebilir hale getirmiştir. Kullanıcı talepleri ve bilginin yayım araçlarındaki değişiklikler yanında, enformasyon merkezlerinin sahip oldukları bütçe olanakları koleksiyon gelişimini etkileyen bir başka önemli faktördür. Enformasyon merkezleri ancak sahip oldukları 451 | Hakemli Yazılar / Refereed Articles Mehmet Toplu bütçe olanakları çerçevesinde kullanıcı taleplerini karşılayabilmektedir. Özellikle bilimsel çalışmalarda ve buna paralel olarak yayın artışında meydana gelen gelişmeler, enformasyon merkezlerini kullanıcı taleplerini tek başlarına karşılayamaz duruma getirmiş ve bu durumda onları yeni arayışlara itmiştir. 
Information centers that take the development of a consistent and reliable collection appropriate to their service goals as their basic aim plan their collection development programs not only to meet the urgent needs of the period they are in, but in a way that covers the whole process. In addition, the depth and quality of the collection, the education and development of the personnel who will create it, and work directed at the use of resources1 all have their place within collection development practice. Until the second half of the 20th century, information centers generally acted alone in collection development and tried to meet user demands with their own means.

1 Harrod's Librarians' Glossary and Reference Book (2000). Compiled by R. Prytherch; 9th ed.; Aldershot: Gower.

The developments noted above, and the increases in information production and, correspondingly, in the number of publications, made acting together with other stakeholders a necessity for healthy collection development. Although institutional policies and practices were never abandoned, the idea of acting jointly with other information centers and benefiting from their possibilities began to be discussed intensively at regional, national, and even international scales, and within this framework new policies, practices, and concepts came to form the basis of professional practice. Many concepts, such as co-operative collection development, interlibrary co-operation, consortium formation, and resource sharing, began to take an active place as basic determining elements in information services.

Co-operative collection development in the periods when printed publishing was dominant, and the approaches, policies, and practices in this field

As information centers proved insufficient on their own for collection development in the face of the growth in publishing, user demands, and limited budget possibilities, the concept of "co-operative collection development" became one of the most discussed topics in professional circles, and policies and practices requiring joint action came to the fore in place of institution-scale approaches. Co-operative collection development is an agreement in which two or more libraries, instead of developing their collections separately according to local needs, aim to develop the collection as an integrated whole among the participants and to open it to the access of the users of all participating libraries; it rests on the joint use of financial and administrative strengths within the framework of resource sharing.2 Such moves toward acting together naturally cause the individual policies and practices of the institutions to change. This not only changes institutional collection development policy but can, from time to time, also provoke negative reactions from users. For within the framework of a co-operative collection development policy, information centers put into practice a resource-sharing program "that, as the result of a formal or informal agreement, requires the joint use of collections, facilities, and expertise among the participating libraries."3 As a result, information centers mutually take on a number of responsibilities and obligations.
The purpose of all the policies developed in this field is to ensure that participating information centers carry out effectively the obligations determined within the framework of the agreements, and to minimize the extent to which the other partner institutions are affected by possible adverse developments.

2 ODLIS - Online Dictionary for Library and Information Science, by Joan M. Reitz. Retrieved 8 July 2009 from http://lu.com/odlis/search.cfm.
3 ODLIS...

An understanding of co-operation among information centers that aimed basically not at co-operative collection development but at making more effective use of existing resources emerged in the United States (US) in the last quarter of the 19th century. In 1885 E. Mac R. B. Downs and in 1886 Melvil Dewey published articles in Library Journal on library co-operation and the joint use of library collections (Pathak, 2004, p. 228). Thereafter, interest in interlibrary co-operation continued to grow, and in the 1880s the American Library Association (ALA) formed a "Co-operation Committee" and prepared a report. At the beginning of the 1900s, the Library of Congress (LC) began to produce the catalog card system distributed to participating libraries, and later the LC subject headings (Bostick, 2001, p. 128; Holley, 1975, p. 293). These efforts began to increase the efficiency of services and, correspondingly, user demand. Work in the field of co-operation began to expand toward other areas of information services, such as the storage of information resources, the distribution of duplicate publications, the reproduction of newspapers and other research sources, the coordination of research materials within areas of responsibility, the development of special collections, the organization of regional document and newspaper centers, and the development of the integrated systems required for users to benefit from interlibrary lending and document delivery services (Scigliano, 2002, p. 393; Kittel, 1975, p. 246). These efforts, first initiated at the national level in the US, later began to acquire an international dimension and played a pioneering role in the development of standards in the field of information services.

Infrastructure work aimed mostly at the joint use of the collections that information centers already possessed formed the first step of co-operation. This work also enabled the development of the conceptual structure that would allow information centers to use their resources jointly, and the spread of such practices. As a result of all this, co-operation came to encompass joint arrangements made among libraries to carry out many functions, such as interlibrary lending, co-operation-based collection management, the sharing of stocks, co-operative cataloging, automation facilities, network access, staff training, and lobbying.4

4 ODLIS...

Especially after the Second World War, countries began to give greater importance to scientific work in order to gain economic and technological superiority over one another. As a result, significant increases occurred in the growth of scientific publishing and in the demand for these publications. Moreover, the demand for information was not confined to the scientific field; all areas of social life entered into a knowledge-based form of organization.
In other words, how much individuals know, and how well they can transfer this knowledge to their activities, became an important element of their ability to participate effectively in social life. This naturally increased the demand for information services and, in this context, for information institutions.

Faced with the growth both in publications and in the demand for them, information centers began to discover the inadequacy of the services they had previously provided. They realized that in co-operative work the joint use of existing resources alone was not sufficient, and that acting together on the development of the collection had also become a necessity. As a result, the concepts of library partnerships and consortia began to be discussed in the field of information services. Partnerships formed by information centers to provide co-operation in the sharing of resources and costs, in order to carry out mutual assistance and functions,5 were treated in the glossary of library terms prepared by ALA in 1943 within the framework of the concept of co-operative book selection (Holley, 2003, p. 698).

5 Harrod's Librarians'...
6 Harrod's Librarians'...

Especially from the 1960s onward, the formation of consortia, rather than partnerships, can be seen to have come to the fore (Giordano, 2003, p. 1613). The concept of the consortium, as union and partnership, is generally surrounded in the principles of librarianship by the concepts of co-operation, coordination, and working together (Pathak, 2004, p. 228; Bashirullah, 2006, p. 104). In consortia, which are established through formal agreements by at least two independent libraries and/or library systems and which aim at resource sharing, the fields of activity vary according to the nature of the consortium, but they generally include services such as co-operative collection development, education and training, preservation, central services, network connectivity for library automation services, system support, the exchange of views, the administrative support required for cataloging, interlibrary lending, the creation of union lists, and joint purchasing. Securing discounts on resource purchases by negotiating with content providers is also among the basic aims of a consortium.6 A consortium requires a form of organization that entails library co-operation for mutual benefit, and it leads member libraries in resource sharing (Alberico, 2002, p. 63; Dougherty, 1988, p. 289). In addition, some information centers turn to consortium formation in order to share the risk of large-scale projects that they could not undertake alone (Hirshon, 1999, p. 125).

The wide-ranging service areas that consortia have opened up have naturally brought with them changes in the policies that member libraries pursue institutionally, as well as service practices that are based on mutual interest but are, to the same degree, mutually dependent. In an environment of continuous social and technological change, economic pressures make taking part in consortia a necessity for information centers that wish to offer better and more varied services. For this reason, partnership strategies constitute crucial points of library policy (Ching, 2003, p. 304).
In the period when printed resources dominated information services, the highest priorities among the aims and functions of consortia described above were cooperative collection development and, within that framework, resource sharing. The basic aim of cooperative collection development is to ensure that a resource to be acquired under the consortium is purchased by only one member and that the users of the other centers can also benefit from it (Chan, 2002, p. 14). In this way, information centers can both use their limited budgets more productively and meet more of their users' demands.

Naturally, such an arrangement also brings obligations. Above all, information centers cannot cancel at will the subscriptions to publications they have undertaken to acquire under the consortium, without consulting the other members and obtaining their consent. In addition, all member institutions must build their information infrastructures, the necessary bibliographic control tools, and their technological investments within the standards and definitions set by the consortium. If an effective and standardized infrastructure cannot be established, access to information resources will run into serious difficulties and the expected benefit of the consortium will fall short. Major responsibilities also fall to the consortium's management, which must prepare guides, set road maps, and create every kind of training environment needed on matters such as establishing standards, defining infrastructure requirements, creating bibliographic control tools among member institutions, interlibrary lending, how payments are to be made, and what sanctions will be imposed on those who break the rules.

Cooperative collection development within a consortium can also create problems for information centers. Above all, they may be wary about the publications whose subscriptions they will have to cancel under the consortium: they fear the reaction of their own users, and they may doubt how well the institution that is to purchase a publication will fulfil that obligation. Moreover, whereas users could previously obtain a given publication from their own institution more reliably and quickly, under this arrangement they must wait for the document to arrive from another center, and additional steps, such as applying to interlibrary document delivery services, enter the picture. Information centers therefore view the cancellation of heavily used core serials with suspicion and prefer to apply such measures to less-used publications. Consortia change the collection development culture at the institutional level, and the relationships of collection managers/selectors with their users are affected by the new structure; this is one of the significant problems a partnership faces (Dannelly, 1998, p. 39).
Users will not be pleased to be told that a publication they request cannot be acquired because it has already been purchased by another center within the consortium, and this will strain the positive bilateral relationships built up earlier.

As the consortium concept spread and found application in many countries internationally, debate over how consortia should be formed intensified and various differences emerged. Some consortia preferred a geographically based organizational model (regional, national, and so on), while others organized around subjects such as medicine or engineering, or by type of information center, such as public, university, or research libraries (Shachaf, 2003, p. 94). Naturally, countries' geographic size, the level of development of their information centers, and differences in the services offered largely determine these organizational models. Still other consortia prefer a service-based structure: within this framework, large-scale consortia deal with computerized operations covering all technical services, while small ones concern themselves with user services and day-to-day problems. Limited-purpose consortia are active in areas such as cooperation in special subject fields, interlibrary lending, or reference network operations (Bostick, 2001, p. 128).

Looking at how the consortia developed in the print era operated, it is clear, first of all, that services had to be run in dependence on the information centers themselves: services rested on the collections of the member centers, and their effective use was directly tied to the structures in place there. A second important element is that access to the resources of member centers requires a strong information infrastructure. However strong a cooperative collection development program may be, if the bibliographic control tools providing access to the resources of member centers cannot be created, one cannot speak of a healthy consortium. This is precisely why, in the 1960s and 1970s, when consortia were spreading, emphasis was placed not only on cooperative collection development but also on cataloging, the creation of union lists, and the conversion of bibliographic records to electronic form. The conversion of library catalogs and serials lists to electronic form, online access to catalogs, and their spread with the development of the internet (Dannelly, 1998, p. 39) greatly eased consortium work, allowed services to run faster and more efficiently, and even contributed to the spread of this form of organization.

In this period, consortia were an important tool for making the best use of limited budgets. From the second half of the 1980s in particular, publication prices rose far faster than general inflation. In the US, for example, the consumer price index rose by only 57% between 1986 and 1990, while journal subscription prices rose by 192% (Alberico, 2002, p. 64). Likewise, as a result of rising publication prices, the number of journal subscriptions in Germany fell by 15% between 1989 and 1998, from 95,000 to 81,000, while expenditures over the same period nevertheless rose by 63%, from DM 19.6 million to DM 31.9 million; in 1999 subscriptions to more than 2,500 journals, with a combined subscription cost of DM 4 million, were cancelled (Reinhardt, 2001, p. 68).
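As a quick consistency check of the German figures just cited, the rounded percentages follow directly from the reported numbers (the arithmetic is ours, not an additional datum from the cited sources):

\[
% subscriptions: 95,000 -> 81,000; expenditures: DM 19.6M -> 31.9M (figures from the text)
\frac{95{,}000-81{,}000}{95{,}000}\approx 0.147\approx 15\%,
\qquad
\frac{31.9-19.6}{19.6}\approx 0.628\approx 63\%
\]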
Information centers were able to respond to these adverse developments by acting together: through cooperative collection development and resource sharing programs they pooled their budgets and, in effect, enlarged them (Balas, 1998, p. 42).

These efforts toward cooperative collection development, and librarians' need to take the resources of many information centers into account when meeting user demands, brought the concept of collection management onto the agenda in the same period. Collection management, starting from the basic principles of the field, aims at the organization and preservation of information centers' collections, the identification of users' needs for their priority purposes, and the determination of alternative means of resource provision.7 Its functions grow further when information services are run within a multi-center system. Matters such as the development and formation of the collection, the preparation of the necessary policies, the training of bibliographers, and the analysis, use, and preservation of the collection are handled within the framework of collection management (Branin, 1998, pp. 4-5). Alongside the need to take other partner centers into account in running services, the development of electronic publishing and the delivery of requested information in that medium further increased the importance of collection management. While continuing to use their existing systems effectively, information centers are also taking important steps toward access to information in electronic form. This creates a dual structure in the storage, organization, and retrieval of information, covering both printed and electronic resources.

7 Harrod's Librarians'…

Although cooperative collection development and resource sharing within consortial structures are, in theory, extremely attractive as a way of offering users more information, they bring certain problems at the implementation stage. Cooperative collection development consumes a great deal of time and effort, its operations are costly, and access to resources across partners requires a strong infrastructure. Deciding in which areas each member institution will specialize in collection development is another significant problem (Wylie, 1999, p. 27). Member institutions also cannot always contribute equally to activities and expenditures, yet still expect to benefit fully from the services, which causes friction among the partners (Holley, 1998, pp. 22-25). Questions such as how the consortium's management will be structured, under which institution it will operate, who will serve in its management and how they will be chosen, how tasks will be divided among members, and where the funds for expenditures will come from are all of considerable importance in organizations of this kind.
Many of the principles and rules laid down for the healthy operation of a consortium cannot always be applied, because information centers are anxious to protect their strong institutional identities. The unease of the parent bodies to which information centers belong over matters such as covering the resulting costs also threatens consortial formations. To resolve these problems, some countries have favored organization under public institutions: consortia operate under the supervision of the Ministry of Education in China, of the national academic computing center in Israel, and of the National Library in Australia (Shachaf, 2003, p. 97).

The effects of electronic publishing on cooperative collection development and consortia

In the second half of the 1990s in particular, significant changes took place in information centers' collection development policies and in their efforts to form consortia for this purpose. As noted earlier, the steep price increases of scientific publications in this period forced information centers to cancel subscriptions to many printed publications. That printed serial prices rose faster than inflation at the very time electronic publishing was rapidly spreading and being offered as an alternative naturally raises the question of whether this amounted to creating pressure for acceptance of the new medium: for the new information environment to be adopted quickly, many technological transformations of the existing system had to be achieved and librarians' attention drawn in that direction. Indeed, electronic publishing was embraced very quickly by librarians, information centers, and users, and almost everything in information services, from collection development to consortial organization, began to be adapted to the new system.

In information services, information and communication technologies were first used to convert bibliographic control tools to electronic form. From the 1960s onward, library catalogs, serials lists, and bibliographic sources such as Index Medicus, Engineering Index, and Chemical Abstracts began to be published and used electronically (Lancaster, 1982, pp. 54-55). As a result of this work, MEDLARS built the first computer-assisted information retrieval system and opened its online version to use in 1971, while DIALOG offered the first online commercial database. In 1983 the American Chemical Society provided access to the electronic full texts of its journals (Tonta, 1997, p. 305); by 1993 more than 3,000 journals were accessible electronically (Thornton, 2009, p. 844); and by 2001, 93% of information was being produced in electronic form (Oğuz, 2006, p. 56). Almost all information sources, from scientific serials to books and from theses to patents and standards, are now prepared as electronic full text and delivered in that form. Alongside electronic publishing itself, the development of the internet naturally played a major role in this.
Whereas in the 1960s and 1970s connections to remote electronic databases such as Dialog and LexisNexis were made over telephone lines (Cotter, 2005, p. 23), from the second half of the 1990s the internet became the principal means of access to information in electronic form. By ending the centuries-old dominance of print media in the publication, storage, and retrieval of information and information sources, information and communication technologies naturally changed every policy, practice, and organizational form in information services. As early as 1986, the effects of electronic publishing on scientific journals and on the existing system were being debated (Butler, 1986, p. 48).

The first change came in the publication of information and, with it, in its storage and retrieval. The practices of the paper-based era, which to some extent made dependence on information centers for the storage of information sources unavoidable, began to disappear with electronic publishing. Publishers began to make every kind of information source they produced electronically accessible from their own websites, from the systems of the purchasing centers, or via a third-party server.

This change in the storage and retrieval of information naturally also changed how it is marketed and subscribed to. In the print era, information centers preferred to subscribe to publications favored by users and generally recognized in international scholarly circles, shaping their collection development accordingly, while publishers and intermediary firms built their systems on marketing publications one by one. With electronic publishing, publishers instead offer their output as a package, covering either everything they publish or particular subject groupings. This has also become the system librarians prefer, for it freed them from the old title-by-title subscription model, which demanded considerable time and labor, and gave users access to many publications they could not previously reach (Chan, 2002, p. 15).

These developments also changed the aims and functions of cooperative collection development policies and of consortia. In the print era, the basic aim of a cooperative collection development program was to prevent duplication by having a scientific publication purchased by a single center; in the new system this outlook has been almost entirely abandoned. The new goal of consortia is to focus on how large a discount can be obtained on the "scientific publication packages" offered to them by publishers and other electronic publication vendors. Consortia now center their resource sharing programs on pricing, replacing the "one publication, one center" principle of cooperative collection development with the principle that the discount granted depends on how many centers buy the offered package. The new principles include delivering more information sources to more users, democratically, in the digital environment.
The sharing of and access to digital information constitutes the first characteristic of the new consortium concept, while elements such as electronic document delivery and the creation of online catalogs, which make institutional collections usable, constitute the second. These developments look attractive to all parties. Librarians, freed from the laborious procedures of earlier collection development policies, have gained the ability to offer more publications to their users with little need for their traditional intermediary role; indeed, although they expend less effort than in the classic collection development era, they earn more appreciation from their users. Users, for their part, enjoy being able to reach more information from the office, from home, and elsewhere via their computers, without visiting the information center they belong to. Where they could once obtain only the bibliographic data of information sources, they can now obtain the full texts, images, and sound of articles, books, patents, standards, and more.

For publishers and other vendors the situation is rather different. Although consortia were at first perceived by publishers as a nuisance, publishers have come to prefer the new system because it lets them deal with fewer people in the marketing process and see the profile of their customers more clearly (Chan, 2002, p. 16). They can market their publications as a package, without negotiating separately with each consortium member and without much effort. They have also gained the chance to market, within these packages, publications that attracted little demand in the print era. All publications can thus reach customers more easily; in other words, even scientific publications that were previously little in demand are now more readily accessible to users. Considering that in the print era access to publications other than those generally recognized in scholarly circles was almost impossible, especially in countries with limited budgets, the advantages offered by electronic publishing become clearer.

These developments have also transformed the rationale for forming consortia. In place of multi-purpose functionality covering cooperative collection development, cataloging, standards setting, document delivery, lending, and resource sharing, a structure operating on the basic principle of purchasing has come to the fore; as a result, many consortia formed today are described as "buying clubs" (Hirshon, 1999, p. 393). Access to more publications at relatively lower cost under license agreements with publishers and their marketing companies, and the absence of any physical storage problem, appeal to librarians and users alike.
However, the fact that license agreements are generally concluded on a leasing basis, and/or are preferred by information centers because they are cheaper, and that retrospective access to the full texts of publications under such agreements is limited (to, say, 5 or 10 years), will bring many problems in the future. When an agreement with a publisher ends, for example, users will find themselves unable to access, the next day, publications they could reach the day before. Publications accessible a year earlier may likewise become inaccessible later because of the year limit, even if the agreement with the publisher and/or marketing company continues. Similarly, scientific journals that change publishers are transferred into different publication packages, creating problems for retrospective access. Given that information centers subscribe to (in effect, lease) databases almost every year on the terms of the preceding period, even more problems with retrospective access may be encountered in the future. Subscriptions to printed resources were ownership-based: every disposition over a purchased publication belonged to the information center, and retrospective access remained possible even after a subscription was cancelled.

In the era of electronic publishing, the foremost reason consortia are preferred in the collection development process is the bargaining to be done over product pricing. Publishers holding many scientific works, serial, monographic, and otherwise, occupy a powerful monopoly position vis-à-vis information centers and consortia: because of the nature of scientific publications, each is unique and cannot be substituted by other products, which strengthens publishers' position and bargaining power. To enter the electronic publishing market, publishers abandoned their traditional print investments and infrastructures in favor of new technological investments, while aiming at the same time to preserve their previous revenues. Information centers and consortia, for their part, strive to provide their users with more information for less money (Alberico, 2002, p. 70). For this reason, particularly in the electronic era, information centers began to attach greater importance to consortia in order to hold their own against publishers and create a stronger bargaining environment. Nor did they confine consortium building to the local scale: to develop a stronger institutional structure, they founded the International Coalition of Library Consortia (ICOLC) in 1997 (Pathak, 2004, p. 229). About 150 consortia formed in many countries belong to this coalition (International Coalition of Library Consortia-ICOLC), a telling indication of the scale of relations between publishers and/or vendors and consortia.

It would be misleading, however, to attribute the rapid spread of consortia in the electronic era, especially in developing countries, solely to their increased bargaining power. In the traditional publishing era, information centers had to expend far more labor and make far greater contributions to take part in consortia.
This was because consortia and their member centers were obliged to ensure effective access to the jointly created collection and a greater degree of resource sharing. Since today's consortia are organized mainly around "purchasing" and "bargaining power", their membership numbers carry great weight. The fact that information centers can buy publications more cheaply without much effort or contribution naturally makes consortium organization all the more attractive.

With electronic publishing, one widespread public perception is that information resources are now owned and accessed more cheaply than in the past. The fact that users can reach more information sources far more quickly than before, and that many people can even use the same publication at the same time, plays an important role in the growth of this perception (Moothart, 1995, p. 62). The point to dwell on here is whether, in acquiring electronic publications, the purchase model, which makes retrospective archiving possible, or license agreements covering a specific period are preferred. Elements such as how the products will be made available, how many users may use a database simultaneously, and what restrictions will apply to retrospective use and full-text access are also important (Cotter, 2005, p. 26).

Publishers and/or marketing organizations offer consortia and information centers many options under different pricing models; the fees and the nature of use vary with the options chosen. Universities' numbers of academic staff and students, and the size of their libraries, are further factors affecting pricing: as user numbers rise, so do the prices in license agreements. For example, a university library in Turkey paid US$34,000 under an annual license agreement, within ANKOS, for the Ebrary database of 34,000 electronic books, whereas under a full purchase model (content purchase) it paid €242,542 for Springer's 9,000 books from 2005-2007 and €76,194 for its 4,983 books from 2008. As these figures show, leasing and content purchase are priced differently, the content purchase model costing more than leasing (a rough per-title comparison is worked out at the end of this section). In addition, because leasing-based license agreements provide access to many books, they offer some information centers significant advantages in enlarging their shrinking collections (Walters, 2006, p. 25).

Beyond all these changes in collection development and related practices, electronic publishing has forced many different issues to be taken up afresh, from copyright to human resources of a different caliber, from the retrieval, publication, and storage of information to countries' information infrastructures. In the new process, elements such as the protection of copyright and the assurance of information security have become far more important in agreements for the acquisition of electronic publications.
To prevent improper use of information in electronic form and to ensure information security, provisions are written into the agreements, placing mutual obligations, responsibilities, and sanctions on the parties. An information access environment wholly dependent on technology also raises the problem of qualified personnel specialized in these areas. Where individuals once used basic information storage and retrieval tools, today they use far more complex information management techniques (Stern, 2003, p. 1138). While information professionals acquired their expertise through years of practice and passed it on to colleagues, the new information access environment demands much more technological specialist knowledge. This naturally increases the responsibilities of consortia and, to some extent, of content providers; in consortium building, elements such as the sharing of experts and the training of staff and users come to the fore (Bhattacharya, 2004, p. 166).

One important issue here is countries' technological infrastructure. However hard information centers and their organizations work on access to electronic information, little can be achieved if the country lacks a strong telecommunications infrastructure capable of providing reliable internet connections. The rapid spread of multimedia tools and the audiovisual materials that accompany them makes this infrastructure more important still. This, naturally, is not a problem that information centers and their organizations can overcome by themselves; it can be solved only with the strong support of governments and investment on a national scale (Kenan, 1975, p. 186).

While electronic publishing and digital libraries are sweeping away old practices in the acquisition, storage, organization, and dissemination of information (Gerenimo, 2005, p. 426), they are also contributing more to new research and spreading lifelong learning (Greenstein, 2000, p. 291). All these developments raise the question of what librarians' roles should be under present conditions and what qualifications they should possess (Tenopir, 2003, p. 615). All these changes in information services are prompting a re-examination of every form of organization, consortia included. What a consortium's mission and vision will be, who its members will be, and how tasks will be shared among them are weighty questions; whether license agreements with content providers will be concluded on behalf of all members or signed separately with each member is another matter to be weighed (Hirshon, 2001, p. 152). All of this shows that each formation creates its own conditions, that there are many options for action, and that the aim is to choose the one best suited to the situation at hand.
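To make the contrast between the leasing and content purchase models concrete, a rough per-title cost can be derived from the Ebrary and Springer figures quoted above. This is our own back-of-the-envelope arithmetic, illustrative only, since coverage, usage rights, and archival terms differ between the two models:

\[
% lease: US$34,000 per year for 34,000 titles (figures from the text)
\text{Lease (Ebrary): } \frac{\$34{,}000/\text{year}}{34{,}000\ \text{titles}} = \$1.00\ \text{per title per year, recurring}
\]
\[
% purchase: one-time payments for 9,000 and 4,983 titles (figures from the text)
\text{Purchase (Springer): } \frac{\text{€}242{,}542}{9{,}000}\approx \text{€}26.9,\qquad \frac{\text{€}76{,}194}{4{,}983}\approx \text{€}15.3\ \text{per title, one-time}
\]

On these figures, a purchased title costs roughly 15 to 27 times one year's lease of a title, but the payment is made once and the content is retained, whereas lease payments recur and access ends with the agreement, which is precisely the trade-off discussed above.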
Cooperative collection development and consortium work in Turkey

For the period when print publishing dominated in Turkey, it is almost impossible to speak of the cooperative collection development work that emerged in many developed countries, or of the consortial organizations formed within that framework. Indeed, it would not be wrong to say that the concept never even reached the agenda of those working in information services. The activities that could be counted as cooperation were extremely few, and they were the product of the devoted efforts of particular institutions rather than of any collective endeavor. For example, the union catalog work on serials conducted in the 1970s to identify the scientific serials held in Turkey and the information centers holding them was carried out entirely on the initiative of TÜBİTAK/TÜRDOK. Planned to cover every province and region, the project could not be completed to the intended scope because a sufficient climate of cooperation never formed: the Ankara and Istanbul union catalogs of serials were prepared and made public, the Izmir union catalog of serials was prepared but never published, and work on the other provinces and regions was never begun (Toplu, 1991, pp. 244-247).

Although organized neither around cooperative collection development principles nor as a consortium, the Council of Higher Education (YÖK) Documentation and International Information Scanning Center was founded in 1984 to reduce university libraries' serials expenditures and to support academic work (Tuncer, 1986, p. 34). Intended to relieve university libraries, which devoted the bulk of their budgets to serials, of this burden and to let them allocate more funds to books and audiovisual materials, the center steadily expanded its subscriptions to international scientific serials, reaching roughly 9,500 by 1991. In that year, of the 29 university libraries in Turkey, only the METU Library, with 1,943 subscriptions, and the Boğaziçi University Library, with 1,547, stood out; of the remaining 27, five subscribed to between 1,000 and 1,459 serials, six to between 500 and 883, and sixteen to between 34 and 488 (Toplu, 1992, p. 94). Considering, moreover, that a significant share of these journals were Turkish serials, the YÖK Documentation Center's subscription to nearly 9,500 international scientific serials in 1991 was a major step.

Throughout its active years, 1984-1996, the YÖK Documentation Center provided important nationwide information search and document delivery services to Turkish scientists (Tuncer, 1988, p. 58). Yet its work was never transformed into cooperative collection development or consortial formations, nor did it help prevent duplication in university libraries' serial subscriptions. A study by Tonta (1999, p. 500), for example, highlighted substantial duplication in the 1997 subscriptions of the National Academic Network and Information Center (ULAKBİM) and the Bilkent, METU, and Hacettepe university libraries, all a short distance from one another in Ankara, to 30 journals priced above five thousand dollars each.
ULAKBİM subscribed to all 30 of the journals in question, METU to 21, Hacettepe to 18, and Bilkent University to 8, at a cost of thousands of US dollars. There was likewise much duplication among information centers in the purchase of secondary sources such as Index Medicus and the Science Citation Index (SCI).

With the economic crisis of the early 1990s, the YÖK Documentation Center's serial subscriptions were partly interrupted in 1992 and 1994 and wholly interrupted in 1995, and the institution began to lose its function. As a result, in 1996 all of its services and resources except the Thesis Unit were transferred to TÜBİTAK and merged with the information services already run there. Reorganized under the name ULAKBİM, this new body, bringing together the knowledge, experience, and services of the two institutions, went on to make important contributions to information services in Turkey in document delivery, in access to electronic information, and in building the infrastructure for it. ULAKBİM's establishment of the National Academic Network (ULAKNET) in 1997 was one of the key building blocks for access to electronic information: ULAKNET created a computer network linking universities, research centers, and other research institutions cooperating with universities (content-providing public bodies), providing internet access at the national and international level (Dünden Bugüne ULAKNET). After the founding of ULAKBİM and ULAKNET, access to information in electronic form spread rapidly in Turkey, preparing the ground for consortial formations.

The first attempts to form a consortium in Turkey were likewise initiated by ULAKBİM. In 1997, contact was made with the publisher Academic Press (AP), and its IDEAL (International Digital Electronic Access Library) database, containing the articles of the 174 journals it published, was opened over the internet, on a trial basis, to the ULAKBİM, Bilkent, METU, and Hacettepe libraries. To bring a consortium into being and to raise awareness among potential partner institutions, ULAKBİM organized a meeting on "Cooperation in the Use of Electronic Information Resources" in Ankara on 14 November 1997, attended by 115 university vice-rectors responsible for libraries together with heads of library and documentation departments. The meeting produced a consensus that faculty and researchers should make effective use of information in electronic form and that ULAKBİM should lead this work, but the initiative failed for want of sufficient economic and institutional support (Tonta, 2001, pp. 295-296). ULAKBİM's lack of economic strength, administrative setbacks, the absence of a legal organizational structure to run the work, and the insufficient knowledge and awareness of the intended partner institutions all played major parts in the failure of the consortium initiative.
A further factor working against the initiative was that the rules imposed by Turkey's fiscal system on institutional expenditures did not permit joint payments or transfers of funds of this kind.

Anadolu Üniversite Kütüphaneleri Konsorsiyumu (ANKOS)

Although the consortium initiative launched by ULAKBİM failed, the knowledge and awareness it generated made it possible, shortly afterwards, to lay the foundation of a new formation. It is quite telling that this initiative was started by the very institutions that had been expected to take part in the consortium ULAKBİM had earlier tried, and failed, to create. Begun in 1999 with four university libraries signing an agreement drawn up by EBSCO, the effort became official in 2000 when seven university libraries, together with ULAKBİM, subscribed to three databases, and it took the name Anadolu Üniversite Kütüphaneleri Konsorsiyumu (ANKOS, the Anatolian University Libraries Consortium) (Karasözen, 2004, p. 402). ANKOS defined its aims as ensuring the rational development of the e-books and databases subscribed to or purchased; setting common policy among members on matters such as meeting the educational and research needs of member institutions' users and acquiring more resources; providing continuous training and staff exchange among member centers; cooperating with other consortia, institutes, and organizations around the world on copyright, license agreements, and the like; and undertaking the initiatives needed for the development of academic libraries in Turkey. On this basis it made remarkable progress in as little as ten years (Table 1):

(Table 1): Growth in ANKOS Membership and Database Numbers

Year                  2001  2002  2003  2004  2005  2006  2007  2008  2009
Number of members       39    58    78    80    82    86    86    90   108
Databases                9    15    24    30    34    38    45    47    63
Total subscriptions    129   235   402   564   723   841   992   909  1234

(Erdoğan, 2009, p. 78; Çukadar, 2009)8

8 The table was compiled from the relevant works of Erdoğan and Çukadar cited as its sources.

ANKOS has no legal basis or income of its own, meets its expenses through donations from the publishers and/or content providers with which it signs license agreements, and runs entirely on a voluntary footing (Karasözen, 2009). As Table 1 shows, its membership rose steadily from 39 in 2001 to 108 in 2009. The universities newly founded during the period played an important part in this continuous growth: the number of universities in Turkey rose from 76 in 2001 to 139 in 2009. ANKOS's greatest benefit is without doubt that it has made the great majority of university libraries consortium members, and the number will grow further in the coming years as the libraries of newly founded universities join. The 2009 membership profile comprises 67 state university libraries; 33 foundation university libraries, 5 of them in the Turkish Republic of Northern Cyprus; and 10 public institutions and information centers such as ULAKBİM and the Turkish Atomic Energy Authority.
ANKOS's real achievement is without doubt that its license agreements have increased the nationwide spread of, and access to, information in electronic form. As Table 1 shows, ANKOS had license agreements covering 9 databases in 2001; the number rose year after year to 34 in 2005, 45 in 2007, and 63 in 2009 (although with 4 of these no license agreement was actually concluded in 2009). The growth over the years in members and in licensed databases also increased the number of databases subscribed to and accessible nationwide. As Table 1 shows, in 2001, 39 consortium members held a total of 129 subscriptions to 9 databases; the corresponding figures were 82 members, 34 databases, and 723 subscriptions in 2005, and 110 members, 63 databases, and 1,234 subscriptions in 2009. A further positive development is the steady rise in the average number of databases to which information centers subscribe: from an average of 4 databases per center in 2002, the figure rose to 9 in 2005 and 11 in 2009 (see the arithmetic check after Table 2 below).

Before the consortium was formed, access to scientific publications was nearly impossible for the great majority of scientists and researchers living outside the big cities of Ankara, Istanbul, and Izmir; even researchers in those three better-resourced cities could reach only a limited number of scientific publications by international standards. By concluding consortial license agreements with content providers, many information centers in Turkey succeeded in delivering to their users thousands of scientific publications, something unimaginable in the print-dominated era. These developments also took the accessibility of scientific information in Turkey out of a narrow circle and spread it nationwide, greatly reducing scientists' dependence on the information centers of the big cities. Many newly founded university and research libraries, in particular, became able, without great effort and in proportion to their economic means, to deliver to their users the same volume of scientific information as partners established long before them.

The results of the steady growth in licensed databases, and of the consequent spread of access to scientific information, can be seen in the numbers of full-text articles downloaded. As Table 2 shows, users at consortium members downloaded 1,402,490 full-text articles from their subscribed databases in 2001; the figure rose to 9,542,769 in 2005 and 12,191,096 in 2007, before falling back by roughly a million, to 11,207,856, in 2008.9 As the figures indicate, the number of full-text articles downloaded by users rose roughly nine-fold over the eight-year period, showing how rapidly access to information in electronic form has spread in Turkey.

9 The reason is that databases subscribed to by nearly all information centers, such as ScienceDirect, IEEE, and Web of Science, fell outside the scope of the ANKOS Consortium. As discussed below, these databases are opened for use, under the TÜBİTAK EKUAL agreements, to all information centers that request them, so their usage statistics are counted separately from those of the ANKOS databases.
10 The table was compiled from the relevant works of Erdoğan and Çukadar cited as its sources.

(Table 2): Number of Full-Text Articles Downloaded, by Year

Year       2001       2002       2003       2004       2005       2006        2007        2008
Downloads  1,402,490  2,263,851  5,686,213  8,246,653  9,542,769  10,483,372  12,191,096  11,207,856

(Erdoğan, 2009, p. 383; Çukadar, 2009)10

Considering that a significant share of provincial university libraries in particular lack sufficient qualified staff, ANKOS's efforts toward access to information in electronic form, and the success achieved as a result, are extremely important. In an organization with no legal basis or financial resources of its own, run entirely on a voluntary footing, the selection of the databases to be acquired, their presentation to members, and their separate pricing and invoicing for each information center according to the size of its user population must also be counted toward that success. As Karasözen (2002), who played an important role in ANKOS's founding and chaired it for a period, has noted, the fact that ANKOS is "a model based on learning by doing", and that Turkey had no prior practice whatsoever in this field, makes the organization all the more significant.

The steps taken toward institutionalization have played no small part in this success. To run its activities more effectively, ANKOS formed working groups on License Agreements, Public Relations, Usage Statistics, Database Evaluation, Open Access and Institutional Repositories, and Institutionalization and Cooperation. At the annual meetings, these groups present to participants both ANKOS's activities and international developments in their respective areas, and information center managers and other participants try to carry the knowledge and experience gained from these meetings and presentations back to their own institutions. The product presentations given at ANKOS's annual meetings are also very useful for informing the managers and staff of information centers. Moreover, the fact that people from different information centers serve on ANKOS's governing bodies, in its working groups, and as database coordinators strengthens the consortium's work while helping to share the accumulated know-how among institutions and to deepen cooperation.

Alongside all these advantages for access to electronic information in Turkey, ANKOS, judged against consortium formation and cooperative collection development practice, has also brought certain problems. Above all, its main field of activity is the joint purchase of electronic publications and/or databases in order to secure discounts. This has produced a structure of the kind that spread rapidly worldwide with electronic publishing, one that bears the name of consortium but is more often described as a "buying club".
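The per-member averages and the growth factor cited above can be recovered directly from Tables 1 and 2. The rounding below is our own arithmetic on the tabulated figures (the 2009 average uses the 108 members shown in Table 1):

\[
% average subscriptions per member, from Table 1
\frac{235}{58}\approx 4\ (2002),\qquad \frac{723}{82}\approx 9\ (2005),\qquad \frac{1{,}234}{108}\approx 11\ (2009)
\]
\[
% growth in full-text downloads, from Table 2
\frac{12{,}191{,}096}{1{,}402{,}490}\approx 8.7\ (2001\text{--}2007),\qquad \frac{11{,}207{,}856}{1{,}402{,}490}\approx 8.0\ (2001\text{--}2008)
\]

Both ratios are consistent with the roughly nine-fold increase reported above.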
Yet, as noted above, consortia engage in activities covering many core service areas, such as resource sharing among member institutions, cooperative collection development, interlibrary lending, and the creation of union catalogs. The adoption of "joint purchasing" as the basic principle of ANKOS's activities, and the neglect of the other consortium functions listed above, naturally prevent it from developing in a healthier way. In a study on the subject by Akbaytürk (2003, p. 253), about 50% of the members responding to the survey gave as their reason for joining ANKOS the reduction of duplication in journal subscriptions across the country; yet ANKOS has never managed to put such work on its agenda. In 2006, ANKOS did launch an interlibrary document delivery service among its members, the Kütüphanelerarası İşbirliği Takip Sistemi (KİTS, Interlibrary Cooperation Tracking System). However, because the work is run entirely on a voluntary basis and the necessary infrastructure has not been built, no healthy, consortium-based structure has emerged. It is quite telling that this system has no connection whatsoever with the Joint Document Delivery Service (OBES) launched in the last months of 1999 by the ULAKBİM, Gazi, Hacettepe, and Middle East Technical University libraries, each of them an ANKOS member (Toplu, 2009, p. 108). Nor has any effort or cooperation emerged to develop the union catalog prepared under OBES, which covers the serials of those libraries together with Bilkent and Başkent universities. Likewise, the National Union Catalog (TOKAT) project launched by ULAKBİM in 2008 has never appeared on ANKOS's agenda.

The ÜNAK-OCLC Consortium

In Turkey, the ÜNAK-OCLC consortium was launched at almost the same time as ANKOS was taking shape. Although the Association of University and Research Librarians (ÜNAK) had first introduced the Online Computer Library Center (OCLC) at a seminar at Boğaziçi University in 1993, Turkey's network infrastructure at the time did not allow these services to be run electronically, so no consortium could be formed. The initiative was revived in 2000, and the ÜNAK-OCLC Consortium was founded with the participation of 15 university and research libraries (Günden, 2001, p. 111). The consortium operates around three product groups: FirstSearch, a web-based online reference service containing 11 databases that allows major bibliographic and full-text databases and reference sources to be searched through a single interface; NetLibrary, covering the e-book collections of more than 400 publishers; and WorldCat-Connexion, based on the sharing of the MARC records of information sources. As of 2009, the ÜNAK-OCLC structure had 15 members in the FirstSearch consortium, 5 in NetLibrary, and 19 in WorldCat (ÜNAK-OCLC Konsorsiyumu). Its annual meetings take up matters such as pricing, the introduction of new products, and database use, along with the problems encountered and proposed solutions.
There are several differences between ANKOS and the ÜNAK-OCLC Consortium, both in how they were formed and in how they conduct their activities. Above all, ANKOS operates with no legal or institutional structure, entirely on a voluntary basis and with a management body elected within that framework, whereas the ÜNAK-OCLC Consortium operates under, and through, the institutional structure of ÜNAK, a professional association. Furthermore, while ANKOS negotiates individually with each firm with which a license agreement is to be signed, setting prices that way, in the ÜNAK-OCLC consortium the license agreement is concluded with OCLC alone. Another important difference lies in how license agreements and invoices are arranged: under ANKOS license agreements, invoicing takes place between the firms concerned and the information centers, whereas in the ÜNAK-OCLC Consortium it is arranged between ÜNAK and the members, something ÜNAK can do because it is an association with a commercial enterprise arm.

An element of the ÜNAK-OCLC Consortium that deserves particular attention is what the WorldCat license agreement makes possible. WorldCat enables record exchange in cataloging and the transfer into the system of the records produced locally by participating libraries, thereby leading the way toward standardization in the creation of bibliographic records and in cataloging and classification. The system can also be used effectively in building union catalogs and in interlibrary lending and document delivery services. Given that Turkey has seen no healthy development toward a union catalog or a national system, the WorldCat consortium could lead such work; ULAKBİM could even cooperate with the ÜNAK-OCLC Consortium to put its own work in these areas on a sounder footing. And since the institutions belonging to the ÜNAK-OCLC Consortium are also members of both ANKOS and TÜBİTAK EKUAL, a three-way effort in this direction could become possible given strong will. Such a formation would also allow interlibrary document delivery and lending services to be run in a healthy environment.

TÜBİTAK EKUAL - Elektronik Kaynaklar Ulusal Akademik Lisansı

Alongside the two consortial formations described above, the Elektronik Kaynaklar Ulusal Akademik Lisansı (EKUAL, National Academic License for Electronic Resources) project, launched by ULAKBİM in 2006, has taken on important functions in access to information in electronic form and in its nationwide spread. By a decision taken in 2005, the TÜBİTAK Science Board resolved to implement the project through ULAKBİM from 2006 onward, with the aims of invigorating academic knowledge production, spreading information services nationwide, and creating equality of opportunity among researchers in access to scientific information. The project serves the academic staff and students of all universities in Turkey and the TRNC, of the Police Academy, and of the military academies and war colleges, all of them institutions awarding at least a bachelor's degree. By a further decision taken in 2006, it was expanded to include the Training and Research Hospitals of the Ministry of Health (TÜBİTAK EKUAL).
With the project's implementation, databases in heavy demand that had previously been covered by ANKOS license agreements, such as EBSCOhost, IEEE, Web of Science, OVID, ScienceDirect, and Taylor and Francis, were subscribed to by ULAKBİM on behalf of the institutions concerned and opened for use under EKUAL. University and research libraries could thus give their users access to these databases without paying any fee, while also gaining great advantages in collection development. Meanwhile, demand for the databases previously subscribed to by information centers under ANKOS agreements kept rising as new universities were founded. In 2006, 42 information centers subscribed to EBSCOhost under ANKOS license agreements; by 2009, ULAKBİM subscribed to it on behalf of some 141 universities and 55 training and research hospitals. Over the same years, subscriptions to IEEE rose from 24 to 141, to Web of Science from 59 to 141, to OVID from 28 to 56, to ScienceDirect from 62 to 104, and to Taylor and Francis from 36 to 141. As these figures show, ULAKBİM continually expanded its database subscriptions in line with demand under the EKUAL project. As of 2009, ULAKBİM maintained subscriptions to 8 electronic databases on behalf of the universities, the war colleges and military academies, and the Police Academy.

Another important strand of ULAKBİM's work under EKUAL is the extension of electronic database use to the Ministry of Health Training and Research Hospitals. This initiative matters greatly for spreading access to scientific information, previously confined to scientific research institutions and universities. Its significance becomes clearer when one considers that access to scientific information in Turkey, and the organizations built around it, remained largely limited to universities, and that no effective service structure in this direction had been developed in institutions such as the Ministry of Health. Spreading access to scientific information in this way also has a positive effect on researchers' motivation to conduct and publish scientific research; and it is vital for a healthy society that researchers and practitioners in medicine follow developments in their fields and carry them into professional practice. In 2009, ULAKBİM maintained subscriptions to 5 electronic databases under this part of the project (Sağlık Bakanlığı Eğitim ve Araştırma Hastaneleri).

While EKUAL provides scientists and researchers with significant advantages in access to scientific information, ULAKBİM also subscribes, beyond the EKUAL databases, to nearly 70 electronic information sources to meet the information needs of researchers in TÜBİTAK's research units and institutes (Veri Tabanları). Through its document delivery services, ULAKBİM makes these databases available to scientists and researchers throughout Turkey as well.
Beyond the access it provides at the national level, ULAKBİM is also working on the retrospective archiving of databases. Within this scope, the ScienceDirect, IEEE, and Institute of Physics (IOP) databases are being archived retrospectively. The importance of retrospective archiving becomes clearer when one considers that license agreements are made to cover only recent years, such as the last 5-10 years, so that scientists and researchers may lose access to electronic resources they could still reach a year, or even a day, earlier. For the ANKOS and ÜNAK-OCLC consortia, such work is nearly impossible, owing to their structures, economic strength, legal status, and similar factors. ULAKBİM's EKUAL project is extremely important for the information centers with limited economic means that benefit from it. This becomes clearer when one looks at the funds allocated from the state budget to Turkey's public university libraries for acquisitions. In 2009, 47 state universities were allocated between 50,000 and 250,000 TL for acquisitions, 26 between 251,000 and 500,000 TL, 9 between 501,000 and 1,000,000 TL, and 7 between 1,000,001 and 1,500,000 TL. Beyond these, Hacettepe University Library was allocated 1,600,000 TL, the Gazi and Marmara University Libraries 2,000,000 TL each, Boğaziçi University Library 2,900,000 TL, the İTÜ Library 3,200,000 TL, and the ODTÜ Library 4,500,000 TL.11 In other words, only thirteen of Turkey's public university libraries have acquisition budgets of one million TL or more, and only five have budgets of two million TL or more. Given that the great majority of information centers in Turkey, public university libraries above all, lack adequate economic resources, the importance of ULAKBİM's EKUAL project is all the greater. It enables information centers to use their insufficient budgets more efficiently and to pursue sounder collection development policies.

11 2009 Bütçe Kanunu. T. C. Resmi Gazete, 31.12.2008, Sayı 27097 (Mükerrer).

Alongside the advantages the project provides, concerns about how long it can be sustained must also be kept in view. Whether the project will endure, and whether the continuity of the database license agreements can be maintained, remain serious open questions. Indeed, the cancellation of the license agreements for databases covered by the project, such as EngineeringVillage2, BMJ Online Journal, and JCR, has heightened these concerns. Doubts grow further when one recalls that ULAKBİM's institutional predecessors, TÜBİTAK/TÜRDOK and the YÖK Dokümantasyon ve Uluslararası Bilgi Tarama Merkezi, could not sustain themselves institutionally and after a time became unable to fulfill their missions.
Information centers that have improved their collections under the TÜBİTAK EKUAL project will find themselves confronting their users, and even the senior administrations they report to, if the project ceases to function. The projects ULAKBİM is carrying out to provide bibliographic control of, and access to, Turkey's national store of knowledge, namely the National Union Catalog (TO-KAT), joint document supply (OBES), and the national database projects, are also extremely important in this respect. So far, roughly twenty information centers have contributed their records, as institutions, to the union catalog now being built. However, no standard structure was defined together with the stakeholders when the program was being set up. Moreover, data are loaded into the union catalog exactly as they arrive from the contributing institutions, so no standard structure has emerged. There are also uncertainties over such questions as how institutions will feed new data into the system and who will weed out entries for publications whose records have been changed or which have been withdrawn from collections. Furthermore, given that the number of institutions contributing bibliographic records to TO-KAT falls short of what is desired, it can be said that many information centers remain indifferent to the project and/or fail to give it adequate support. The Joint Document Supply Project (OBES) that ULAKBİM launched in the final months of 1999 together with the Hacettepe, Gazi, and ODTÜ libraries, and the union catalog of periodicals created under it,12 remained confined to the founding institutions and never grew into a project of national scale. Indeed, the ODTÜ Library left the project in 2007 (Toplu, 2009, pp. 100-101).

12 The union catalog of periodicals also includes the records of Bilkent and Başkent Universities.

In Turkey, the ANKOS and ÜNAK-OCLC consortial formations and the work carried out by ULAKBİM have taken on important functions in providing access to scientific information and spreading it on a national scale. These efforts have enabled information to spread, indeed to become socialized, to a degree unimaginable in the era of print publishing. Roughly 100,500 faculty members and 807,500 students at the universities (2008-2009 Öğretim Yılı Yükseköğretim İstatistikleri Kitabı), together with all the scientists and researchers at institutions such as TÜBİTAK, the Turkish Atomic Energy Authority, GATA, the Ministry of Health's training and research hospitals, and the Central Bank, are now in a position to reach scientific information in electronic form. Indeed, considering that many universities are short of faculty, that the necessary research infrastructures have not been built, and that students lack the foreign-language skills to follow these publications, one could even say that the information system is not being used effectively. For all these positive developments to gain greater effect, information centers and their staff must develop their service structures and policies accordingly. Information service units will contribute to the country's development to the extent that they support its research capacity, continuously improving their services and offering them in a spirit of sharing and cooperation, so that the steadily enriched and comprehensive range of electronic resources can be exploited as effectively as possible (Aslan, 2006). The more information centers and their staff act with this awareness, the better they will be able to explain their reason for existence to society.
Conclusion and Recommendations

Although not yet under the rubric of cooperative collection development, efforts toward interlibrary cooperation, which would form the basis of resource sharing, emerged at the end of the 19th century. In that period, information centers pursued their collection development policies institutionally and independently of one another, but user demand led them to need the collections of other institutions. Questions about which collections other information centers held, and whether their own users could draw on them, formed the basis of thinking about cooperation and at the same time set off work on building the information infrastructure it would require. The creation of union catalogs and of the standards they require, together with rules for conducting interlibrary lending services, were among the first issues taken up. Especially after the Second World War, economic, technological, and other competition among countries gave scientific research great momentum, increased the demand for information, and drove the rapid growth of the publishing sector. These developments pushed information centers toward new approaches and laid the ground for consortium formations based on cooperative collection development and greater resource sharing, bringing shared obligations and responsibilities with them. Developments in information and communication technology took on important functions in areas such as the creation of union catalogs and the bibliographic control of information. Toward the end of the 20th century, as electronic publishing spread rapidly, the concept of the consortium drifted away from such core practices as cooperative collection development and resource sharing and came to be framed around joint purchasing. Thus the great majority of consortia turned into buying clubs, or new formations emerged in that mold. While electronic publishing and consortial formations created important opportunities for access to and diffusion of information, they also caused resources in print and in local collections to be neglected. In developing countries in particular, where no sound information infrastructure had previously been built, the easy and more current access that electronic publishing brought led to the large-scale neglect of information in print and in local collections. By contrast, most developed countries continue to use both the old and the new systems effectively, thanks to the strong information infrastructures they built in earlier cooperation and consortium work. Consortium work in Turkey began at the start of the 21st century, at a time when electronic publishing was increasing its weight in information services, and has largely operated as a "buying club." In the ANKOS and ÜNAK-OCLC consortia, cooperative collection development among information centers has had almost no place. Resource sharing, for its part, has found application only at the level of price discounts on purchased products.
In other words, the work of traditional consortia, building union catalogs, setting professional standards, running interlibrary lending and document supply services, and establishing a strong information infrastructure among member centers, has not gained traction in Turkey's consortium practice. The lending and document supply service that ANKOS launched in the mid-2000s under the name KİTS is run entirely on a voluntary basis, with no infrastructure prepared for it. Moreover, while consortium practice in developed countries treats collection development and interlibrary document supply and lending as two inseparable elements, no such understanding has sufficiently developed in Turkey. Yet alongside the problems noted above, these consortium efforts have taken on important functions in providing access to scientific information in electronic form and spreading it nationally. Thanks to these consortial formations, many information centers have made accessible to their users scientific information they could never have owned in print. Thanks to consortial work, a scientist anywhere in Turkey can reach the core scientific literature locally, without traveling to the information centers of the big cities, which enjoy comparatively greater means. From the mid-2000s onward, with the project ULAKBİM launched under EKUAL, information centers gained free access to the essential and most heavily demanded full-text and bibliographic databases, on which they had previously spent the greater part of their budgets. ULAKBİM's extension of the project to cover scientists at the Ministry of Health's training and research hospitals further widened national access to scientific information. Yet doubts about whether the continuity of these ULAKBİM services can be assured have still not been dispelled. Indeed, the cancellation, within as short a span as five years, of subscriptions to some of the databases opened to use under EKUAL license agreements has fed those doubts. Weighing all these problems and opportunities, it is fair to conclude that the existing system must be improved if Turkey is to build a sound information infrastructure and service network. Above all, continuous and secure funding must be provided for the databases that ULAKBİM has licensed and opened to nationwide use under EKUAL. To this end, bodies such as the Higher Education Council (the universities' umbrella institution), TÜBİTAK, the Ministry of Health, the State Planning Organization (DPT), and the Ministry of Finance should work together to create the economic support that will keep these database subscriptions continuous. All of the institutions and organizations that benefit from the databases should cooperate with ULAKBİM and press for this. In doing so, however, there must be no move to cut the public funds allocated to university and research libraries.
Both the consortia's license agreements and those made under EKUAL are directed mostly at access to current scientific information, while the retrospective archiving of databases is neglected. Continued access to e-journals is never guaranteed. For this reason, many countries are working on securing permanent access to and archiving of e-journals. In Turkey, by contrast, budget constraints lead consortia and information centers to focus their license agreements mostly on access to current information and to disregard retrospective archiving altogether. Given the structure of the consortia and the economic strength of the information centers, they also lack the means to undertake such work. Only ULAKBİM is concluding license agreements for the retrospective archiving of the ScienceDirect, IEEE, and IOP databases. Within this framework, EKUAL members and the consortia should decide which databases to prioritize for retrospective archiving and should develop policies that guide and support ULAKBİM on this issue. Furthermore, because license agreements for current access cover fixed periods such as 5-10 years, subscriptions made in later years can render previously accessible information inaccessible. License agreements with vendors should therefore provide that sources once accessible, but excluded from a new subscription by period restrictions, can be obtained more cheaply through document supply, and such provisions should be secured in the agreements. The work carried out in Turkey on document supply under OBES and KİTS should be re-evaluated, and work should begin, with TO-KAT also taken into account, toward building a national system. For such work to start, all institutions must contribute jointly and stand ready to cooperate. Although information centers in Turkey subscribe to many electronic databases and periodicals, whether through consortia or through their own efforts, there is no sound union catalog of periodicals showing where a given publication can be found. ULAKBİM, ANKOS, and the ÜNAK-OCLC consortium should therefore conduct a joint effort, within a cooperative framework, that carries certain binding commitments. The national union catalog project (TO-KAT) being run by ULAKBİM should be taken up afresh with the contributions of all stakeholder institutions and given a sounder structure. Information centers in Turkey should run their collection development programs jointly, within a framework of cooperative collection development and interlibrary lending and document supply services, and the consortia should support these formations. In particular, information centers with a degree of economic strength should turn to different resources rather than buying the same databases, making more information accessible at the national scale. The aim should not be merely to provide information services by exploiting information systems created by others; effort should also go into developing the national infrastructure. Only in this way can an effective and sound information service be created at the national scale.

References

2008-2009 Öğretim Yılı Yükseköğretim İstatistikleri Kitabı.
Retrieved 28.07.2009 from http://www.osym.gov.tr/BelgeGoster.aspx?F6E10F8892433CFFD4AF1EF75F7A7968B40CE59E171C629F
Alberico, R. (2002). Academic library consortia in transition. New Directions for Higher Education, 120, 63-72.
Anadolu Üniversite Kütüphaneleri Konsorsiyumu. Retrieved 15.07.2007 from http://www.ankos.gen.tr/
Akbaytürk, T. (2003). Türkiye'deki konsorsiyumların kütüphanelerde satın alma üzerine etkisi. Türk Kütüphaneciliği, 17 (3), 247-262.
Aslan, S. (A.). (2006). Elektronik kaynaklara erişim modelleri. Retrieved 28.07.2007 from http://ab.org.tr/ab07/sunum/73.ppt#256,1
Balas, J. (1998). Library consortia in the brave new online world. Computers in Libraries, April, 42-44. http://www.infotoday.com
Bashirullah, A., & Jayaro, X. (2006). Consortium: a solution to academic library in Venezuela. Library Collections, Acquisitions & Technical Services, 30, 102-107.
Bhattacharya, P. (2004). Advances in digital library initiatives: a developing country perspective. The International Information & Library Review, 36, 165-175.
Bostick, S. L. (2001). The history and development of academic library consortia in the United States: an overview. The Journal of Academic Librarianship, 27 (1), 128-130.
Branin, J. J. (1998). Shifting boundaries: managing research library collections at the beginning of the twenty-first century. In D. B. Simpson (Ed.), Cooperative Collection Development: Significant Trends and Issues (pp. 1-18). New York: The Haworth Press.
Butler, B. (1986). Scholarly journals, electronic publishing, and library networks: from 1986 to 2000. Serials Review, Summer and Fall, 47-52.
Chan, G., & Ferguson, A. W. (2002). Digital library consortia in the 21st century: the Hong Kong JULAC case. Collection Management, 27 (3/4), 13-27.
Ching, S. H., Poon, P. W. T., & Huang, K. L. (2003). Managing the effectiveness of the library consortium: a core values perspective on Taiwan e-book net. The Journal of Academic Librarianship, 29 (5), 304-315.
Cotter, G., Carroll, B., Hodge, G., & Japzon, A. (2005). Electronic collection management and electronic information services. Information Services & Use, 25 (1), 23-34.
Çukadar, S. (2009). Sayılarla üniversiteler, veritabanları ve ANKOS. IX. ANKOS Yıllık Toplantısı, 24-26 Nisan 2009, İnönü Üniversitesi. Retrieved 24.07.2009 from http://www.ankos.gen.tr/2009/TR/Program.html
Dannelly, G. N. (1998). The Center for Research Libraries and cooperative collection development: partnership in progress. In D. B. Simpson (Ed.), Cooperative Collection Development: Significant Trends and Issues (pp. 37-45). New York: The Haworth Press.
Dougherty, R. M. (1988). A conceptual framework for organizing resource sharing and shared collection development programs. The Journal of Academic Librarianship, 14 (5), 287-291.
Dünden Bugüne ULAKNET. Retrieved 13.06.2009 from http://www.ulakbim.gov.tr/hakkimizda/tarihce/ulaknet/dunbugun.uhtml
Erdoğan, P., & Karasözen, B. (2009). Portrait of a consortium: ANKOS (Anatolian University Libraries Consortium). The Journal of Academic Librarianship, 35 (4), 377-385.
Gerenimo, V. A., & Aragon, C. (2005). Resource sharing in university libraries: a tool for information interchange. Library Collections, Acquisitions, & Technical Services, 29, 425-432.
Giardano, T. (2003). Library consortia in Western Europe. In Encyclopedia of Library and Information Science (pp. 1613-1619). New York: Marcel Dekker.
Greenstein, D. (2000). Digital libraries and their challenges. Library Trends, 49 (2), 290-303.
Günden, A. (2001). ÜNAK/OCLC konsorsiyum çalışmaları. Bilgi Dünyası, 2 (1), 106-119.
Hirshon, A. (2001). International library consortia: positive starts, promising futures. Journal of Library Administration, 35 (1/2), 147-166.
Hirshon, A. (1999). Libraries, consortia, and change management. The Journal of Academic Librarianship, 25 (2), 124-126.
Holley, E. G. (1975). The role of professional associations in a network of library activity. Library Trends, 24 (2), 293-306.
Holley, R. P. (2003). Cooperative collection development. In Encyclopedia of Library and Information Science (pp. 698-708). New York: Marcel Dekker.
Holley, R. (1998). Cooperative collection development: yesterday, today, and tomorrow. In D. B. Simpson (Ed.), Cooperative Collection Development: Significant Trends and Issues (pp. 19-36). New York: The Haworth Press.
International Coalition of Library Consortia (ICOLC). Retrieved 15.07.2009 from http://www.library.yale.edu/consortia/
Karasozen, B., & Lindley, J. A. (2004). The impact of ANKOS: consortium development in Turkey. The Journal of Academic Librarianship, 30 (5), 402-409.
Karasözen, B. (2002). Kütüphane hizmetlerinde işbirliği ve ortaklıklar: ANKOS. Elektronik Gelişmeler Işığında Araştırma Kütüphaneleri Sempozyumu, 24-26 Ekim 2002, Abant İzzet Baysal Üniversitesi, Bolu. Retrieved 24.02.2009 from http://www.ankos.gen.tr/index.php?option=com_content&task=view&id=69&Itemid=1
Karasözen, B., & Lindley, J. A. (n.d.). ANKOS: Türkiye'de konsorsiyum gelişimi. Retrieved 24.02.2009 from http://www.ankos.gen.tr/index.php?option=com_content&task=view&id=67&Itemid=1
Kenan, B. R. (1975). The politics of technological forces in library cooperation. Library Trends, 24 (2), 183-190.
Kittel, D. A. (1975). Trends in state library cooperation. Library Trends, 24 (2), 245-255.
Lancaster, F. W. (1982). Libraries and Librarians in an Age of Electronics. Arlington, Va.: Information Resources Press.
Moothart, T. (1995). Migration to electronic distribution through OCLC's electronic journals online. Serials Review, Winter, 61-65.
Oğuz, E. S. (2006). Web arşivleme yaklaşımları ve örneklerle web arşivleri. Retrieved 14.01.2009 from http://kaynak.unag.org.tr/bildiri/unak06/u06.8.pdf
Pathak, S. K., & Deshpande, N. (2004). Importance of consortia in developing countries - an Indian scenario. The International Information & Library Review, 36, 227-231.
Reinhardt, W., & Boekhorst, P. T. (2001). Library consortia in Germany. Liber Quarterly, 11, 67-79.
Sağlık Bakanlığı Eğitim ve Araştırma Hastaneleri. Retrieved 28.07.2009 from http://www.ulakbim.gov.tr/cabim/ekual/veritabani/hast_vt.uhtml
Scigliano, M. (2002). Consortium purchase: case study for a cost-benefit analysis. The Journal of Academic Librarianship, 28 (6), 393-399.
Shachaf, P. (2003). Nationwide library consortia life cycle. Libri, 53, 94-102.
Stern, D. (2003). New knowledge management systems: the implications for data discovery, collection development, and the changing role of the librarian. Journal of the American Society for Information Science and Technology, 54 (12), 1138-1140.
Tenopir, C. (2003). Electronic publishing: research issues for academic librarians and users. Library Trends, 51 (4), 614-635.
Thornton, G. A. (2000). Impact of electronic resources on collection development, the roles of librarians, and library consortia. Library Trends, 48 (4), 842-856.
Tonta, Y. (1997). Elektronik yayıncılık, bilimsel iletişim ve kütüphaneler. Türk Kütüphaneciliği, 11 (4), 305-314.
Tonta, Y. (2001). Collection development of electronic information resources in Turkish university libraries. Library Collections, Acquisitions, & Technical Services, 25, 291-298.
Tonta, Y. (1999). Kütüphanelerarası işbirliğinin neresindeyiz? In Bilginin Serüveni: Dünü, Bugünü, Yarını... Türk Kütüphaneciler Derneği'nin Kuruluşunun 50. Yılı Uluslararası Sempozyum Bildirileri, 17-21 Kasım 1999 (pp. 493-514). Ankara: TKD.
Toplu, M. (2009). Belge sağlama hizmetlerinin gelişimi ve Türkiye perspektifi. Türk Kütüphaneciliği, 23 (1), 83-118.
Toplu, M. (1992). Üniversite kütüphanelerinin bilimsel araştırmadaki işlevi ve Türkiye gerçeği. Türk Kütüphaneciliği, 6 (2), 89-107.
Toplu, M. (1991). Ulusal Bilgi Ağları ve Türkiye. Unpublished master's thesis, Ankara Üniversitesi Sosyal Bilimler Enstitüsü, Ankara.
Tuncer, N. (1988). Belge sağlayan kuruluşlar: YÖK Dokümantasyon Merkezi ve BLDSC. Türk Kütüphaneciliği, 2 (2), 51-60.
Tuncer, N. (1986). Yükseköğretim Kurulu Dokümantasyon ve Uluslararası Bilgi Tarama Merkezi. Yükseköğretim Bülteni, 1, 33-35.
TÜBİTAK EKUAL. Retrieved 29.07.2009 from http://www.ulakbim.gov.tr/cabim/ekual/
Ulusal Toplu Katalog. Retrieved 28.07.2009 from http://www.toplukatalog.gov.tr/index.php?cwid=8
ÜNAK-OCLC Konsorsiyumu. Retrieved 28.07.2009 from http://www.unak.org.tr/unakoclc/
Veri Tabanları. Retrieved 28.07.2009 from http://www.ulakbim.gov.tr/cabim/vt/?filter%5Bkeyword%5D=&filter%5Baccessibility%5D=2&filter%5Btype%5D=&Submit=Ara
Walters, W. H. (2006). Should libraries acquire books that are widely held elsewhere? A brief investigation with implications for consortial book selection. Bulletin of the American Society for Information Science and Technology, February/March, 25-27.
Wylie, N. R., & Yeager, T. L. (1999). Library cooperation. New Directions for Higher Education, 106, 27-35.

Summary

Consortia, among the most debated issues in librarianship in the 1960s, contributed substantially to cooperative collection development and resource sharing between information centers. From the late 1990s, as electronic publishing gained weight in information services, consortia spread rapidly and concentrated on the joint purchase of the same databases. Consortium activity in Turkey began in the early 21st century, when electronic publishing was becoming effective in information services, and has operated largely in the manner of a purchasing club. The ANKOS (Anatolian University Libraries Consortium) and ÜNAK (University and Research Librarianship Association)-OCLC (Online Computer Library Center) consortia do not involve cooperative collection development between information centers, and resource sharing is applied only as discounts on the prices of purchased products. These consortium activities are nevertheless important for access to scientific information in electronic form and for its spread at the national level. Thanks to the consortia, many information centers have given their users access to information they could never have obtained through print media. A scientist in any part of Turkey can now reach the core scientific literature locally, without traveling to the information centers of the large cities. In addition, through the national license agreement project initiated by ULAKBİM under TÜBİTAK EKUAL, information centers gained access to the essential and most heavily requested databases, on which they had previously spent most of their budgets. ULAKBİM's expansion of this project to scientists in the training and research hospitals affiliated with the Ministry of Health further extended the spread of scientific information at the national level. Despite these developments in access to information in early 21st-century Turkey, some fundamental problems persist. Because the consortia are formed on a voluntary basis and have no legal or economic power, they can operate only as purchasing clubs. They therefore cannot take effective steps on such basic issues as the retrospective archiving of electronic databases and the creation of union catalogs. On the retrospective archiving of electronic resources in Turkey, ULAKBİM is carrying out work covering the ScienceDirect, IEEE, and IOP databases. In addition, in 1999 ULAKBİM launched the Joint Document Supply Project (OBES) in coordination with the libraries of Gazi, Hacettepe, and Middle East Technical Universities in Ankara; within the same effort, a union catalog of periodicals, including the holdings of Bilkent and Başkent Universities alongside the founding information centers, was brought into use. As of 2008, ULAKBİM also put into practice the National Union Catalogue Project (TO-KAT), covering the bibliographic records of all information centers. This work has not been adequately supported by the consortia and information centers in Turkey. There is, moreover, no standard procedure for sustaining the process, and ULAKBİM has not been able to coordinate effectively with the consortia and the other institutions involved.
In 2006, ANKOS initiated document supply services under the name "Interlibrary Coordination Monitoring System (KITS)." However, this activity runs on a voluntary basis, without any infrastructure. Had ANKOS supported the OBES service previously initiated by its members and contributed to its nationwide extension, a sounder structure could have developed. Building resource sharing, collection development, interlibrary lending, and document supply services, as well as a sound information infrastructure in Turkey, will require more cooperative and mutually supportive work by the consortia, whose memberships overlap, and by ULAKBİM.

work_c3npyy26xbblfkoybd5ha5p4sq ---- The Online Computer Library Center's Open WorldCat Program

Chip Nilges

Library Trends, Vol. 54, No. 3, Winter 2006 ("Library Resource Sharing Networks," edited by Peter Webster), pp. 430-447. © 2006 The Board of Trustees, University of Illinois.

Abstract

This article describes the Online Computer Library Center's (OCLC) Open WorldCat program. WorldCat is a worldwide union catalog created and maintained collectively by more than 9,000 member institutions. Open WorldCat seeks to make library collections and services visible and available through popular search engines such as Yahoo! and Google and other heavily used sites on the open Web. In this capacity, Open WorldCat provides an important central connection between the shared information of the library network and the Web. The article describes the history and rationale of the project; explains how Open WorldCat works for information seekers, participating libraries, and partners; and reports on what OCLC has learned from the program to date.

Introduction

Today's Web users expect information at their fingertips, regardless of where they are searching. Libraries can meet this expectation only by reaching further into the network of information resources that their patrons use and delivering content and services to users at the point of need. Satisfying patron expectations means reaching beyond the library portal and into the commercial search engines, vertical information portals, and e-commerce sites that have become such an integral part of patron workflow.

The Online Computer Library Center's (OCLC) Open WorldCat program is one approach to integrating access to library collections and services into the "flows" of Web users. WorldCat is a worldwide union catalog created and maintained collectively by more than 9,000 member institutions.
With more than 60 million online records representing almost 1 billion items held by member institutions, it is the largest and most comprehensive database of its kind. Open WorldCat seeks to make library collections and services visible and available through popular search engines such as Yahoo! and Google and other heavily used sites on the open Web. In this capacity, Open WorldCat provides an important central connection between the shared information of the library network and the Web. Through Open WorldCat, OCLC partners with search engines and other Web sites to link from their search results to a "find in a library" service managed by OCLC and powered by the WorldCat database. The "find in a library" service provides the user with a list of nearby libraries with holdings in WorldCat. OCLC also manages a registry of Online Public Access Catalogue (OPAC) links for its member libraries, which are used to take the user to the record describing the item of interest in the OPAC of choice. A number of other services are available from the Open WorldCat interface, including such IP authenticated services as access to link resolvers, virtual reference services, e-books, and other digital licensed content. This article describes the Open WorldCat program, including the project history and rationale; how it works for users, libraries, and partners; results to date; lessons learned; and future plans.

Project History

The genesis of Open WorldCat was OCLC's 2000 strategic plan, "Extending the OCLC Cooperative," which charted a course for the evolution of WorldCat into a "globally networked, and globally available information resource" (OCLC, 2000, p. 12). The plan, developed by OCLC leadership and staff in 1999 and vetted extensively by OCLC's board members, had as one of its key tenets the notion of "weaving libraries into the web" by making WorldCat openly accessible "in many versions from many paths: through individual library portals. . . . And through information partner portals (e.g., through database aggregators, Web search engines, and Web portals)" (OCLC, 2000, p. 12). The report elaborates on the concept of open access to WorldCat: "Information partners, including database aggregators, Web search engines, and Web portals, will use Extended WorldCat Discovery and Navigation services as an ingredient to build enriched access to information. With this cooperation, libraries will have a method to include library collections in the mix of Web pages and commercial content offered to library users" (OCLC, 2000, p. 28). This quote encapsulates two key drivers of the Open WorldCat project. The first was the notion of broadening access to library collections by integrating them into the open Web resources most heavily used by information seekers, regardless of the provider (library, .org, commercial site). The second is the notion of tackling this effort through a cooperative approach, in which WorldCat is used as a directory and brokering service, or a "switch," that alerts the Web searcher to the availability of library materials and then connects the user to those materials.
Research

Following the publication of the strategic plan, OCLC undertook research in three areas to vet the concept of Open WorldCat: (1) research with potential users of the service, to test the value proposition of finding library collections and their location on the open Web; (2) research with OCLC member libraries, to test the value proposition of exposing their collections through popular search engines as a way of extending their reach; and (3) research with potential partners, to test the value proposition of enhancing their services by integrating metadata describing library collections and a service for connecting their users to local library catalogs and portals for service. This research took place in 2001 and the first half of 2002.

Research with Students

A key component of our research with potential users of the service focused on college students. We focused on these users because we knew that students were increasingly using Web search engines and other Web sites as a starting point for research assignments. We wanted to assess the value that these users might place on searching collections of nearby libraries as part of their broader Web searching. We commissioned Harris Interactive to conduct an online survey of over 1,000 college students in the autumn of 2001 (OCLC, 2002). The survey concluded that students in this group were likely to start their research online—in fact, 96 percent reported they begin their research for assignments with Web search engines. At the same time, nine out of ten respondents claimed to use traditional print library resources at least some of the time, including print journals as well as books. Respondents were also shown a mock up of an integrated search of library collections through a major search engine, with records describing items held by libraries and links to local library catalogs. Fifty-three percent reported that they would use such an option to search library collections through search engines at least on a monthly basis. Forty-seven percent said they would use the library locator feature to find a nearby library that has a book they want, and 45 percent said they would go to the library in person to get a book found this way. A significant number of respondents (37 percent) said they would travel to another library to get a book they found this way. Other studies confirmed the importance of the Web as a research tool for students. Chief among these was a study commissioned by the Pew Internet and American Life Project (2002), "The Internet Goes to College." This study reported that the strong majority of college Internet users say the Internet "has had a positive impact on their college academic experience," and 73 percent of respondents reported that they use the Internet more than the library for research. These kinds of results supported our belief that students were indeed moving their research activity to the open Web and, in particular, to popular search engines. It also suggested that the library could offer these students value in this new research flow.
Our belief was that Open WorldCat would help libraries by making them accessible on the open Web, which would help them to reach an audience that was clearly shifting its research activity to nonlibrary portals of various kinds. We also believed that there was particular value in an orga- nization such as OCLC undertaking this project because it would be possible for OCLC to develop a shared infrastructure that many thousands of li braries could use to expose their collections in multiple open Web sites without any additional work on the part of the library. This research took place in 2001–2002 and included a survey of members, a series of four discussions with library directors and staff in different parts of the country, extensive discussions with advisory committees and OCLC Members Council interest groups, and briefi ngs/discussions with the OCLC Board of Trustees. The member sur vey took place in the winter of 2002 and included 194 libraries that use OCLC services. Fifty-eight percent of those surveyed agreed completely and 26 percent agreed somewhat to the following state- ment: “My library, its collections, and its services should be visible to any Web user regardless of where they reside.” Those surveyed were also asked how likely it would be to enable links from search engines and Web book vendors to their collections through WorldCat. Forty-nine percent of re- spondents indicated that they were very or somewhat likely to enable links from search engines to their collections, and 36 percent said they were very or somewhat likely to enable links from Web book vendors. While the results did not indicate that a majority of member libraries would enable links, we considered this a good result, given that the concept had not been described in detail. Also, for each type of link, there was also a relatively high “neutral” result (28 percent for search engines and 21 percent for Web book vendors), suggesting that the strong majority were neutral or positive at this very early stage in the project. In short, OCLC members were supportive of the notion of broad access to their collections and, like their users, were beginning to think of search engines as appropriate access points to their collections. It was clear from these results that additional research was warranted. One of the many face-to-face discussions with OCLC members took place with an ad hoc advisory group that met in Chicago on September 24–25, 2002. The group included leaders from academic and public libraries, statewide and regional library consortia, and the OCLC Board of Trustees. These experts were shown early prototypes of the system and were presented with a straw-man ser vice model and business model for the service. They were asked what they felt the value of the service was to OCLC member libraries, what they believed the service must include on day one, who they believed to be the target audiences for the service, how nilges/oclc’s open worldcat program 434 library trends/winter 2006 it should be positioned, who OCLC should partner with, and a variety of other questions along these lines. The general recommendations of this group included a strong endorse- ment of the project. At the same time, the group was specifi c and clear that the service must meet a number of key objectives when released and that this project, if completed, would only mark the beginning of what OCLC needed to do to help its member libraries reach their users on the open Web. 
Some specific recommendations from this group included the following:

• Fulfillment services of some sort must be included in version one of the service—at minimum, the ability to find a nearby library with the item and link to the OPAC
• The service must be designed for end-users—students and public library patrons—and OCLC should continue research with end-users
• The service must include a critical mass of affiliates (including an "anchor" site such as Google)
• It must include all WorldCat bibliographic records and a critical associated mass of library holdings
• Informational materials to help libraries market the service and justify the service to decision makers (for example, city councils, provosts, etc.) must be included

Later in 2002 OCLC also conducted a series of four focus groups at the offices of four OCLC regional networks. These focus groups were attended by library directors and key library staff from OCLC member libraries served by these networks. The idea received support in these discussions, and participants offered important suggestions and articulated key concerns that had a direct impact on the development of the service.

Research with Partners

In addition to testing with potential end-users and the OCLC member libraries whose collections would be exposed in this new way, we of course needed to test the value proposition of the service with potential partners. That value proposition was, we felt, clear: that search engines and other kinds of sites on the open Web—such as book vendor sites—would see value in providing access, from within their sites, to a directory of the combined collections of thousands of libraries. To test this proposition, in the late summer of 2001 we developed a prototype system that would accept simple queries (for example, ISBN, title/author) and return a Web page showing bibliographic information about the item, as well as a service that would allow the user to enter a postal code, state name, or country name and return a list of libraries near them that held the item they had found, based on holdings in WorldCat.
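To make the shape of that prototype concrete, the following is a minimal, hypothetical sketch (not OCLC's implementation) of its two operations: resolving a simple known-item query, here by ISBN, and returning the symbols of the libraries that hold the item. Every record, symbol, and number in it is invented for illustration.

```python
# Minimal sketch of a known-item lookup against a toy "union catalog."
# All data and library symbols below are invented; this is not OCLC's code.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BibRecord:
    oclc_number: str                 # invented identifier
    title: str
    author: str
    holding_symbols: List[str] = field(default_factory=list)


# Toy stand-in for the WorldCat index, keyed by normalized ISBN.
UNION_CATALOG: Dict[str, BibRecord] = {
    "0385504209": BibRecord("00000001", "The Da Vinci code", "Dan Brown",
                            holding_symbols=["AAA", "BBB", "CCC"]),
}


def find_item(isbn: str) -> Optional[BibRecord]:
    """Resolve a simple known-item query, as the 2001 prototype did for
    searches referred from partner book sites."""
    return UNION_CATALOG.get(isbn.replace("-", ""))


if __name__ == "__main__":
    record = find_item("0-385-50420-9")
    if record:
        print(record.title, "- held by:", ", ".join(record.holding_symbols))
```

In the actual pilot the location step came next: the user supplied a postal code, state, or country, and the service filtered and ranked the holding libraries; a fuller sketch of that step appears later, alongside the discussion of the geo-location algorithm.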
Additionally, other organizations had also begun to experiment with open access models for similar kinds of resources. Chief among these was the Research Libraries Group (RLG), which in October 2003 announced the RedLightGreen project, through which it made the RLG database available and searchable on the open Web. Encouraged by what we had learned in our initial pilot, we contacted Google in May about the possibility of providing access to a set of WorldCat records that would contain pointers to a “fi nd in a library” service residing in Dublin, Ohio. This service, the second generation of the pilot service described above, would perform essentially the same function: enable a user to enter location information and fi nd nearby libraries that held an item in their collection. But it would be supported by a much more robust technical infrastructure and a more complete set of links to library OPACs. We proposed releasing to Google a set of records representing the 2 million most widely held items in WorldCat in order to maximize the possibility that a user fi nding one of them could also fi nd a nearby library for service. We proposed to release to Google a subset of Machine-Readable Cataloguing (MARC) data fi elds for these records. Google was enthusiastic about the project and signed an agreement with OCLC in summer of 2003 to pilot the service in its main index. This pilot began in December 2003, when WorldCat records fi rst began appearing in Google.com. In January 2004 Yahoo! also became interested in the project and made the same set of 2 million records available from Yahoo.com. At the time of writing, OCLC’s partnerships with both Google and Ya- hoo! have been positive for OCLC member libraries, for users of Google nilges/oclc’s open worldcat program 436 and Yahoo!, and for OCLC itself. Traffi c on partner sites, including the book sites mentioned above (with the addition of Biblio.com), Google and Yahoo!, and www.BookPage.com, has grown to almost 9 million referrals a month (see Figure 1), signifi cantly expanding access to the collections of OCLC member libraries. In addition, links to Open WorldCat have expanded to 3.4 million re- cords in Yahoo! and the Google main index (Google.com), and Open WorldCat has been featured in Google Scholar (ww.scholar.google.com). (Google, in fact, has har vested the entire WorldCat database for use in Scholar.) In providing this expanded record set, we have sought to begin to address the issue of providing users of Open WorldCat with access to the complete list of library locations for items they fi nd. This expanded record set represents the 3 million most widely held items from a version of WorldCat against which the OCLC Offi ce of Research’s Functional Requirements for Bibliographic Records (FRBR) algorithm has been applied. As a result of this process, these records represent the most widely held manifestations of the 3 million most widely held works in WorldCat. We have also begun to expand what is available to include 400,000 of the least widely held items in the database, that is, the items held by a single library. Google Scholar is notable in that it has signaled a clear shift in the ap- proach of major search engines toward a more refi ned and comprehensive approach to providing access to scholarly/research information. 
Yahoo!’s beta of its “Mindset” service, which allows the user to specify the intent of a search (commercial to informational), is a different approach that also serves the goal of providing access to a more information-rich search experi- ence for students, researchers, and information professionals. Because of the affi liate relationships that characterize Web search, the number of sites providing access to WorldCat content has grown substan- tially in the past year. Today, over 800 different Web sites link to the Open WorldCat “fi nd in a library” service each month, and this number continues to grow. These sites include non-U.S. versions of partner sites, such as Ya- hoo! Mexico, Singapore, and Canada; sites that access content from Google, Yahoo!, or both (Alta Vista, Dogpile, etc.); and sites that have embedded links to particular Open WorldCat records. How Open WorldCat Works Open WorldCat includes service components for users, member libraries, and partners. These are described briefl y below. User Services Access Points Users can access Open WorldCat through partner sites (http://www.oclc.org/worldcat/open/partnersites/default.htm) and fol- low links in these sites to OCLC member libraries for service. In addition, OCLC and its partners have published a number of Open WorldCat search library trends/winter 2006 437 tools that can be used to access Open WorldCat records directly from within partner sites. These tools, available from OCLC’s Web site at http://www.oclc .org/worldcat/open/searchtools/default.htm, include a cobranded Yahoo! toolbar that includes a search capability limited to WorldCat records indexed by Yahoo!, a link to WorldCat records from within the Google toolbar using Google’s “auto-link” capability, and Firefox extensions that allow a user of that Web browser to search the WorldCat records in Google or Yahoo! directly. Additionally, in summer 2005 we will publish a series of lightweight search tools and Web services to make it easy for libraries and other partners to embed searches to Open WorldCat within their local services. User Experience The Open WorldCat user experience today is consistent with the pilot system, though it has been enhanced steadily to improve ac- cess to more of WorldCat and to more library services. A sample search will show the user’s current workfl ow and also provide a baseline for describing known issues and how the program works for participating libraries, as well as plans for enhancing the service. In the example in Figure 2, the user has entered the keyword search “Shelby Foote writer’s life” on the main search page in Google. A keyword search on “Shelby Foote” would have retrieved the same item as approxi- Figure 1. Monthly Accesses of Open WorldCat nilges/oclc’s open worldcat program Fi gu re 2 . R es u lt s o f a G o o gl e K ey w o rd S ea rc h 438 library trends/winter 2006 mately the twenty-fi fth result on the page, and a title phrase search of “A writer’s life” would have brought up the same result as approximately the twentieth result on the page. (I will have more to say regarding page rank- ing and user search characteristics below.) Every Open WorldCat record available through the Google and Yahoo! index is prefaced with the phrase “Find in a Library,” as part of OCLC’s effort to build the library brand within general Web search tools. The meta- data in the “snippet” in Figure 2 is culled from the MARC record fi elds that we provide search partners. 
Users coming to Open WorldCat from a book vendor site, such as Alibris, or from a site that links to Open WorldCat from citations that it creates (such as Google Scholar or Bookpage.com) will not see a snippet formatted like the one in this example. Those users will see a link such as Scholar's "Library Search" or "Find in a WorldCat Library." From the "snippet" in a results set, the user will link to the "Find in a Library" page shown in Figure 3. (Here again, featuring the library brand is intentional.)

Figure 3. "Find in a Library" Page

In addition to the ability to "Find Libraries with Item," this page leverages the metadata in WorldCat records by providing hot links on author name, title, and WorldCat subject headings. These links will execute a search for WorldCat records on the highlighted term against the search engine the user has come from. From the Shelby Foote record, for instance, clicking on the subject link "Southern States—Historiography" produces a list of seventy-eight titles from Open WorldCat that have been indexed in Google. These subject links are heavily used, which is not surprising, given that most users find Open WorldCat records in search engines as a result of a subject search rather than a known-item search. Many WorldCat records also contain an "Other Editions" link, which a user can follow to a list of all of the versions (in Functional Requirements for Bibliographic Records [FRBR] terms, manifestations) of the work they have found. From the Open WorldCat record describing The Da Vinci Code, for instance, a user has direct access to all of the manifestations of this work via the "other editions" link (see Figure 4). Following this link retrieves a list of manifestations, including the large print edition, various sound recordings, translations, the movie, etc. (see Figure 5).

Figure 4. "Other Editions" Link
Figure 5. "Other Editions" Results Set

In the summer of 2005 we will fully integrate access to manifestations into the primary "Find in a Library" page by consolidating all holdings and subject headings and representing all manifestation types on this top-level page. This use of FRBR, as well as the subject linking shown above, are examples of the value of a structured approach to metadata.
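OCLC's FRBR algorithm is not published in this article, so the following is a deliberately simplified, hypothetical sketch of the idea behind the "Other Editions" consolidation: cluster manifestation records under a normalized author/title work key, then let the most widely held manifestation represent the work. The sample records are invented.

```python
# Simplified FRBR-style grouping: manifestations clustered by a work key.
from collections import defaultdict
import re


def work_key(author: str, title: str) -> str:
    """Crude normalization; the real algorithm uses much richer evidence."""
    def norm(s: str) -> str:
        return re.sub(r"[^a-z0-9 ]", "", s.lower()).strip()
    return norm(author) + "/" + norm(title)


def group_manifestations(records):
    """records: dicts with author, title, format, and a holdings count."""
    works = defaultdict(list)
    for rec in records:
        works[work_key(rec["author"], rec["title"])].append(rec)
    # Represent each work by its most widely held manifestation.
    return {key: max(recs, key=lambda r: r["holdings"])
            for key, recs in works.items()}


records = [  # invented sample data
    {"author": "Dan Brown", "title": "The Da Vinci Code",
     "format": "Book", "holdings": 4200},
    {"author": "Dan Brown", "title": "The Da Vinci code",
     "format": "Large print", "holdings": 310},
    {"author": "Dan Brown", "title": "The Da Vinci Code",
     "format": "Sound recording", "holdings": 650},
]
print(group_manifestations(records))
```

All three sample records collapse to one work, and the print book, as the most widely held manifestation, would stand for it on a results page.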
As of this writing we have assembled a directory of 6,700 links to library catalogs, and approximately 65 percent of these will take the user directly to the page in the library's OPAC corresponding to the item found via Open WorldCat, using an ISBN, ISSN, or an OCLC number. We are actively harvesting and maintaining OPAC links, as well as links to OpenURL resolvers, library information pages, and library "Ask a" services. This "registry" component of Open WorldCat is a lynchpin of the service and an area that we will continue to invest in.

Figure 3. "Find in a Library" Page

Figure 4. "Other Editions" Link

Figure 5. "Other Editions" Results Set

Figure 6. List of Libraries That Hold the Item

Users of the service who are coming from an IP address that OCLC recognizes are also able to access services that the library has registered with OCLC. In Figure 7, these links appear in the gray box on the left. Today, these include OpenURL resolvers, links to patron-initiated interlibrary loan (ILL), and links to other reference services (provided by OCLC and a variety of vendors). Approximately 15 percent of all links to the "Find in a Library" interface come from users whose IP address is recognized by OCLC.

User Behavior

We track a variety of user activity measures, which provide some insight into user behavior and guide enhancements to the service. In addition, we capture and analyze qualitative feedback through a comments link on the "Find in a Library" page. Users most often access Open WorldCat via a simple keyword search (generally a subject search) of two to four terms. A recent one-day sample of searches that linked to Open WorldCat records included sixteen subject searches and four known-item searches in the top twenty searches for the day (see Table 1 for details). A 6,000-search sample showed that the average number of search terms was 2.38, and the Open WorldCat record was, on average, approximately the sixth item displayed in the Yahoo! search results. At the same time, there is also significant linking activity from results found below item ten on an average results set, suggesting that Open WorldCat does serve a constituency of more determined researchers who tend to dig deeper into results sets.

Table 1. Top Twenty Searches Leading to Open WorldCat Records

  Rank  Search term                                                        Hits on May 20, 2005
   1    Denmark history                                                    38
   2    0028 9604 (Newsweek) and WorldCat                                  36
   3    Violence in the Workplace Prevention site (worldcatlibraries.org)  27
   4    Teaching high school English                                       24
   5    Find in a Library Da Vinci Code                                    24
   6    0226103897 (Chicago Manual of Style) and WorldCat                  22
   7    Fuel injection                                                     20
   8    Medical                                                            17
   9    WorldCat                                                           15
  10    Lsyvygotskii site (worldcatlibraries.org)                          14
  11    Slideboard                                                         13
  12    Henry Frederick Prince of Wales site (worldcatlibraries.org)       13
  13    9960340112 ("A brief illustrated guide to understanding Islam") and WorldCat  12
  14    worldcatlibraries.org                                              12
  15    Site www.worldcatlibraries.org--foxz1                              12
  16    Levente szasz site (worldcatlibraries.org)                         11
  17    Hannah Arendt                                                      11
  18    Greeting card thesis                                               11
  19    Wendyl Marshall William Beaudine: From Silents to Television       10
  20    Cooperative Learning                                               9

We do not know how frequently users who see "find in a library" links on a partner site choose those links and click through to the "Find in a Library" service. Users click on another link 15–20 percent of the time after landing on a "Find in a Library" page. Most often, they follow a subject link to another list of items. They click off to a library service of some sort (an OPAC, for instance) approximately 4–6 percent of the time after landing on a "Find in a Library" page. When they click to a library service, they go to an OPAC or library information page approximately 80 percent of the time. Some users come to the "Find in a Library" page from an IP address that OCLC recognizes as valid for service. These users can choose from a number of services, ranging from direct links to full text, to OpenURL resolvers, to patron ILL or access to an e-book, depending on what their library has enabled. In April of 2005 users followed IP authenticated links approximately 22,000 times. Thirty-seven percent went to the library OPAC, 36 percent to FirstSearch, and 24 percent to an OpenURL resolver; less than 1 percent (approximately 500) were ILL requests.
Because these links are enabled by libraries and displayed only from authenticated IP addresses, it is very difficult to generalize about user preferences or traffic patterns from these numbers. It is clear, however, that users exercise the options presented.

Figure 7. Link to Library OPAC

In addition to measuring system activity, we have also evaluated qualitative feedback. Figure 8 summarizes a sample of 192 comments submitted by users of Open WorldCat in the late autumn of 2004. The comments were analyzed by staff in OCLC's corporate marketing area and grouped into the categories shown. A few of these areas reflect the relative newness of the service and relate to users and library staff praising the service and/or raising questions regarding their collections appearing or failing to appear in a search engine. Encouragingly, we received a relatively high percentage of testimonials from happy end-users who had discovered the service. "Find libraries with item issue" and "Library holdings issue," for instance, together comprised over 20 percent of comments. Most of these came from library staff who did not know that only a subset of WorldCat records had been indexed by Google and Yahoo!, or who were asking questions about whether or not holdings had been set for their collection on a particular item. These kinds of questions, while important, were not surprising at that point in the project.

Other kinds of comments pointed the project in new directions. The largest category of questions, Reference, consisted of users who submitted what constituted a reference question through the comments box. As a direct result of this phenomenon, we have begun routing reference questions we receive to OCLC's 24/7 reference service and will integrate access to library "Ask a" and virtual reference into Open WorldCat in the summer of 2005. Equally illuminating were comments regarding bibliographic issues with records and buying the items found through Open WorldCat. In response to the former, we are partnering with the OCLC Office of Research this summer to pilot a "meta-wiki" service through Open WorldCat that will give users the ability to contribute reviews, tables of contents, and notes regarding Open WorldCat records. In response to the latter, we plan to pilot a "buy it" link from Open WorldCat to determine the demand among users of Open WorldCat for purchasing the items they find in the service. The "buy it" option is also a way to test alternative funding models for WorldCat: proceeds from sales will be shared with OCLC member libraries directly.
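Returning to the catalog-link registry described earlier: the sketch below illustrates how a lookup against such a registry might fall back from an item-level deep link (keyed on ISBN, ISSN, or OCLC number) to a plain catalog home page when no usable template or identifier is available. The template syntax, field names, and URLs are all invented for illustration and are not OCLC's actual implementation.

    # Hypothetical registry entries mapping a library to its catalog links.
    # "deep_link" is a URL template; None means only a home page is known.
    REGISTRY = {
        "chicago-public": {
            "opac_home": "https://catalog.cpl.example.org/",
            "deep_link": "https://catalog.cpl.example.org/search?isbn={isbn}",
        },
        "small-town": {
            "opac_home": "https://opac.stl.example.org/",
            "deep_link": None,
        },
    }

    def opac_link(library_id, isbn=None, issn=None, oclc_number=None):
        """Return the most specific catalog URL available for this library:
        a deep link to the item when the registry has a usable template and
        a matching identifier, otherwise the catalog home page."""
        entry = REGISTRY[library_id]
        template = entry["deep_link"]
        if template and isbn and "{isbn}" in template:
            return template.format(isbn=isbn)
        # Analogous branches for ISSN- and OCLC-number-based templates
        # would go here; they are omitted to keep the sketch short.
        return entry["opac_home"]

    print(opac_link("chicago-public", isbn="0385504209"))  # item-level deep link
    print(opac_link("small-town", isbn="0385504209"))      # falls back to home page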
Library Services

Libraries participate in Open WorldCat by setting their holdings in WorldCat and configuring their Open WorldCat profile. Libraries set holdings by cataloging with OCLC or by batch loading holdings directly into WorldCat. WorldCat includes holdings for approximately 12,000 institutions. Configuration options for Open WorldCat include links to local services (OPAC, OpenURL resolver, "Ask a" service) and display preferences (for example, the name of the library to display in Open WorldCat). Libraries can also enable authenticated links and set the IP address ranges from which these links should display in Open WorldCat; a sketch of the kind of information such a profile carries appears at the end of this section. Configuration options for Open WorldCat are available from http://www.oclc.org/worldcat/open/default.htm. From this page, libraries with holdings in Open WorldCat also have the option to opt out of the service and have their holdings indicators removed from the "Find in a Library" service. To date, only approximately 150 libraries have exercised this option.

Figure 8. Open WorldCat Feedback Comments

In January 2005 we began providing libraries with usage statistics for Open WorldCat that indicate the number of links from each partner site to their local site for service. This service, as well as the promotional materials we have developed, is intended to help member libraries promote Open WorldCat to their patrons and funding bodies and to show one way that they are seeking to meet their users at the point of need.
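As promised above, here is an illustrative sketch of the kind of information an Open WorldCat library profile evidently carries, assembled from the options just described. The structure and field names are assumptions made for illustration, not OCLC's actual configuration schema.

    # Illustrative library profile; every field name here is hypothetical.
    library_profile = {
        "display_name": "Chicago Public Library",   # name shown in Open WorldCat
        "opac_link": "https://catalog.cpl.example.org/search?isbn={isbn}",
        "openurl_resolver": "https://resolver.cpl.example.org/openurl",
        "ask_a_service": "https://www.cpl.example.org/ask",
        "authenticated_links_enabled": True,
        # Links such as patron ILL or full text display only to users
        # coming from these IP ranges (RFC 5737 documentation addresses).
        "ip_ranges": ["192.0.2.0/24", "198.51.100.0/24"],
        "opted_out": False,  # libraries may opt out of "Find in a Library"
    }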
Partner Services

Open WorldCat also includes a variety of partner services. As mentioned above, the program includes a linking program through which OCLC provides partners with partial WorldCat records, as well as a program through which OCLC will accept known-item queries sent by partners to the "Find in a Library" service. Among current partners, two obtain metadata from OCLC (Google and Yahoo!); the rest send queries using a predefined syntax from metadata in their catalogs. OCLC also manages a version of Open WorldCat, called the WorldCat Partner Program, for sites that license content to libraries (http://www.oclc.org/vendors/worldcatpartners/default.htm). Through this program, partners can link into WorldCat and FirstSearch in a variety of ways. A large component of partner services is OCLC's partner development and partner relations activities. As partner services evolve and change, so must Open WorldCat. Developing new methods of access (for example, Web services), maintaining and managing contacts within partner sites, working with partners to deploy Open WorldCat within new partner services, and managing data feeds and placement in partner sites are significant, ongoing activities that OCLC performs on behalf of its member libraries.

Conclusions

It is important to note that Open WorldCat is just one facet of a broader effort to provide open access to WorldCat. In addition to this program, OCLC offers its members a union catalog service, called the WorldCat group catalog, that provides library consortia with a publicly accessible catalog of their consortia holdings that is a customized view of WorldCat. There are currently more than fifteen group catalogs available on the Web (http://www.oclc.org/groupservices/access/default.htm). The OCLC Office of Research has also made a variety of views of WorldCat publicly accessible, including a fiction view and a "top 1000" view (http://www.oclc.org/research/researchworks/default.htm).

Open WorldCat is only a starting point for this broader effort. Over the coming year we expect the model to evolve dramatically, both by design and in response to the rapidly changing information environment. In addition to those already mentioned, our planned enhancements include an OpenURL registry and gateway that will enable us to redirect Web surfers to appropriate OpenURL resolvers. We also are actively pursuing new partners and plan to announce recent signings in the coming weeks and months. We continue to be interested in bringing more of WorldCat out into the open, in particular the millions of uniquely held items it describes. And we are always looking to expand and improve the interface, the fulfillment options we can support, and the quality of the user experience. Finally, we are looking hard at simplifying and streamlining our services for enabling partners, whether members, other .orgs, or .coms, to integrate whatever components of WorldCat they wish to use into their applications.

We also expect to continue grappling with known issues. Page rank, for instance, is and will continue to be one of the biggest challenges facing Open WorldCat and other services that seek to integrate content into users' research flow/workflow in popular search engines. Specialized views, such as Google Scholar and Yahoo! Mindset, offer help for the specialized audiences that will likely use these tools, but more general audiences will need direct access.

Underlying this work is the understanding that the nature of search and, more broadly, the discovery-to-delivery chain for libraries and other information providers is fundamentally shifting, and that WorldCat must shift with it. WorldCat must evolve from a monolithic reference database, designed primarily for use in private networks by information professionals and researchers, to a search service that combines vertical search, syndicated search, and Web services and is distributed across private and public networks. It is difficult to say today where this understanding will lead us, but it is easy to see that we must move, quickly, in the direction of broader and broader access options, and better and better methods for locating and getting the item, if we are to serve the needs of our members and their patrons.

Chip Nilges is Executive Director of the WorldCat Content and Global Access Division of the Online Computer Library Center, Inc. (OCLC). An OCLC employee since 1994, Chip has held a variety of positions in product management at OCLC, including product manager of OCLC's electronic journals service, Electronic Collections Online, and product manager of FirstSearch, OCLC's online reference service. In 1999 Chip was part of the team that formulated OCLC's 2000 strategic plan.
Promoted to director of new product planning following the development of that plan, he led the product teams that launched OCLC's virtual reference service, QuestionPoint, and Open WorldCat, which makes library resources available from nonlibrary Web sites. Chip has presented widely on these projects and has published a number of articles on electronic journals. Chip holds an MBA in marketing and an MA in literature, both from Ohio State University.

work_c47njftnsveffprbbouahlsspm ----

Library Management, 2002, Vol. 23 Iss: 6/7, pp. 325-329. ISSN: 0143-5124. DOI: 10.1108/01435120210432291. © 2002 MCB UP Ltd.

Outsourcing of Slavic cataloguing at the Ohio State University libraries: evaluation and cost analysis
Magda El-Sherbini

Abstract
Examines the outsourcing of Slavic original cataloguing at the Ohio State University libraries. It includes: the rationale for doing so; the evaluation and cost analysis; the advantages and disadvantages; and what we have learned.

Introduction
An earlier paper provided a case study of copy cataloguers and their changing roles at the Ohio State University Library (El-Sherbini, 2001). This paper was developed from a presentation by the author at the Ohio Library Council Annual Meeting held in Columbus, Ohio in November 2000.

In 1994, the Ohio State University libraries conducted a study on the viability of outsourcing the original cataloguing of Slavic books. The results of this study were analysed in an article entitled "Contract cataloging: a pilot project for outsourcing Slavic books" (El-Sherbini, 1995). At that time the Slavic backlog consisted of approximately 25,000 titles in all formats. Based on this analytic study, which proved at that time that outsourcing was more cost effective than hiring an MLS cataloguer, the decision was made to contract out the original cataloguing of this Slavic backlog. After six years of the Slavic original cataloguing being undertaken by the vendor, it is now time to evaluate the process to determine:
• if it is still cost effective;
• if there are other options available;
• if there is new in-house funding for cataloguing;
• if the library is not satisfied with the vendor service;
• if the library has developed in-house expertise; and
• if the library has reorganised its operations and found a way to utilize expertise from other departments.

Hence, this paper examines the outsourcing of Slavic original cataloguing at the Ohio State University libraries. It reviews:
• the rationale for so doing;
• the evaluation;
• the advantages and disadvantages; and
• what we have learned.

Rationale
In a previous study (El-Sherbini, 1995), the rationale for contracting out the Slavic original cataloguing was summarised in the following points:
• resignation of the Slavic languages original cataloguer in April 1993;
• a backlog of Slavic materials, about 25,000 titles in various Slavic languages and in various formats;
• budget uncertainties; and
• change to a new in-house system (OSCAR).

After studying several methods of obtaining original cataloguing records, the decision was made to conduct a pilot project and send 100 Slavic titles (monographic materials only) to OCLC TechPro.
The goals of this pilot project were to test the quality of records obtained from a vendor, and to compare the cost of cataloguing in-house versus outsourcing. The study proved that the quality of records obtained from the vendor was acceptable and the price was reasonable. This step was followed by writing the actual contract specifications and starting to send 65 monographic titles to OCLC to be catalogued originally.

Evaluation and cost analysis of the contract
Statistical information
After five years of contracting out the Slavic original cataloguing, it is now time to reassess and evaluate this process. In order to do the assessment, some statistical information is needed. The Slavic backlog was divided into a historic backlog and a new receipts backlog, in addition to the current new receipts (acquisition) materials.

(1) Historic backlog information from November 1994 to September 2000:
• Number of titles in the historic backlog in 1994 = 25,000 titles.
• Number of titles catalogued by OCLC TechPro 1994-2000 = 2,508 titles (original cataloguing).
• Number of titles catalogued by OSU staff from the historic backlog = 18,706 titles (copy and original cataloguing). (This cataloguing was done by two Slavic GAs and student assistants.)
• Total catalogued titles from the historic backlog = 21,214 titles.
• Remaining uncatalogued titles from the historic backlog = 3,786 titles.

(2) New receipts backlog catalogued by OSU staff in 1994-2000 = 5,000 titles.

(3) Current new receipts (acquisition) catalogued by OSU staff in 1994-2000 = 11,246 titles. (This cataloguing is done by the two Slavic GAs and student assistants.)
• Total titles catalogued in-house from November 1994 to September 2000 = 18,706 + 5,000 + 11,246 = 34,952 titles (copy and original cataloguing).
• Total titles catalogued by OSU staff and TechPro from November 1994 to September 2000 = 37,460 (copy and original).

The following is a description of the catalogued titles by material type:
• book: 35,707;
• serial: 958;
• manuscript: 201;
• music score: 200;
• printed map: 192;
• A/V: 17;
• music record: 182;
• computer file: one;
• OSU thesis: one; and
• non-music record: one.

Cost analysis
In this section, the author will discuss the cost of cataloguing using the vendor versus the cost of cataloguing in-house. Some of the figures, such as the support cost, will be estimated. The cost of using in-house staff is based on the monthly and/or annual salary with benefits.

OCLC TechPro cost analysis
• Cost of the 2,508 catalogued titles = $76,140.65.
• Support costs[1] = $15,831.
• Total cataloguing cost = OCLC TechPro cost + support cost = $76,140.65 + $15,831 = $91,971.65.
• Cost per title by OCLC TechPro = $91,971.65 divided by 2,508 titles = $36.67 per title (the average for original cataloguing).

Cataloguing cost in-house
• Total titles catalogued in-house from November 1994 to September 2000 = 34,952 copy and original (about 20 per cent original = 6,990 original titles, and 80 per cent copy = 27,962 titles).
• Staff and student costs for cataloguing these materials[2,3] = $210,993.46.
• Original cataloguing credit = 6,967 titles x $4 (average) = $27,868.
• Net cost = $210,993.46 - $27,868 = $183,125.46.
• Cost per title (copy and original) = $183,125.46 divided by 34,952 titles = $5.23.
• To calculate the cost per original title, an assumption was made that about 65 per cent of the staff time ($183,125.46) was spent on this function: $119,035.49 divided by 6,990 titles = $17.03.
• To calculate the cost per copy-catalogued title, an assumption was made that about 35 per cent of the staff time ($183,125.46) was spent on this function: $64,089.97 divided by 27,962 titles = $2.29.

Conclusion of the cost analysis
From the above cost analysis you will note that in 2000 the cost of doing original cataloguing in-house was $17.03 per title and the cost of contract cataloguing was $36.67 per title. In 1994 the cost of undertaking the original cataloguing in-house was $56.32 per title and for contract cataloguing it was $34.71 (El-Sherbini, 1995). The reasons for lowering the cost of cataloguing in-house were:
• Shifting original cataloguing responsibilities to staff, GAs and student assistants. This enabled the department to focus more on the cataloguing and eliminated a great deal of the time that the professional cataloguers were spending on other professional activities (e.g. committees, publications, national involvement, etc.).
• Streamlining the workflow by eliminating redundancies such as the first and second searches.
• Eliminating the complexity of the workflow, and the difficulty of moving materials from one room to another, and from one person to another. This was achieved by consolidating the cataloguing function in one room and creating teamwork to handle the cataloguing operations from A-Z. The teamwork eliminated several steps, and reduced time wasted in terms of problem-solving and answering questions.
• Eliminating the backlog. The items are handled once and never returned to the backlog for someone else to do the work.
• Increasing the number of workstations. Everyone in the cataloguing department now has his/her own workstation. This eliminated the time spent waiting for a free terminal.
• Cataloguing documentation is now available on the Web, which makes it more efficient and up-to-date.
• Moving a technical services IT person in-house made it possible for troubleshooting to happen immediately.

The vendor price for cataloguing remains high because once you start at a certain price per title, you will not be able to lower this price. Instead, every year the price increases slightly. Cataloguing in-house still requires extensive training, especially with students and graduate student assistants (GSAs).

Advantages and disadvantages of contract cataloguing
Advantages
(1) Outsourcing is used as a means to reduce backlogs, increase productivity, and allow for shifts in staff.
(2) Outsourcing is used to gain expertise in foreign languages that is not available from the local staff.
(3) It opens our eyes to other methods of cataloguing in-house, for example:
• reducing redundancy in handling each item;
• simplifying the workflow;
• using paraprofessionals in cataloguing; and
• using graduate students and student assistants in cataloguing.
(4) The vendor's focus is only on cataloguing: there is no involvement in submitting name authority records to NACO, and less time is spent on searching bibliographical sources beyond the OCLC database to support the form of heading.
(5) The vendors have greater flexibility in moving personnel according to their needs.
(6) Compared to libraries' workflow, their workflow is more efficient. Since they must keep current with their workflow because of specific deadlines for their customers, they do not encounter the problems and expenses involved in managing a backlog.
(7) Keeping current with, and distributing information about, cataloguing rules takes less time with fewer people.
(8) Some original cataloguing is done by experienced paraprofessionals who generally are employed at salaries that are lower than those of professionals.
(9) The vendors mostly catalogue according to the customer's specifications; hence, no time is spent in negotiating changes in procedures or in decision-making.

Disadvantages
(1) Outsourcing has proven not to be cost effective.
(2) The quality of cataloguing can vary from time to time, based on whom the vendor is hiring and their level of expertise.
(3) Hiring graduate students and student assistants to do cataloguing in-house can be time consuming and problematic, since it does not provide stability, and training can be a major issue. In this case, contract cataloguing can be a good solution; alternatively, hiring permanent staff to do cataloguing in-house could be more economical and more effective.
(4) It is very difficult for the vendor to keep to the cataloguing quota as stated in the agreement. This can cause a problem in terms of controlling the backlog and planning when it will be eliminated.
(5) The outsourcing price can rise, and there is no flexibility in reducing the price unless you change vendors.
(6) Sometimes, a great deal of time is spent in-house to fix or solve cataloguing problems.
(7) Communicating problems to the vendor can be time consuming.

What we have learned and the conclusion
Contracting can be viewed as a remote extension of technical services and sometimes is the only option. It might not solve all of the problems and it can be time consuming and frustrating. Outsourcing, however, is not a threat to professional cataloguers. If you do it right, outsourcing might be rewarding for the library.

Notes
1. See details of support costs in Appendix 1.
2. See details of staff costs in Appendix 2.
3. Data used to calculate the costs of using in-house staff are based on actual staffing, which fluctuated throughout the five-year period.

References
El-Sherbini, M. (1995), "Contract cataloging: a pilot project for outsourcing Slavic books", Cataloging & Classification Quarterly, Vol. 20 No. 3, pp. 57, 64-6.
El-Sherbini, M. (2001), "Copy cataloguers and their changing roles at the Ohio State University Library: a case study", Library Management, Vol. 22 No. 1 and 2, pp. 80-5.

Appendix 1
OCLC support cost
Retrieving books from the backlog: 1 hour of GAA time = 1 x $9.80 = $9.80.
Searching 65 titles in OCLC: 5 hours of GAA time = 5 x $9.80 = $49.
Cost for in-house searching: 65 titles x 30 cents = $19.50.
Searching the local system to discharge books from the backlog, charge to OCLC TechPro and create the inventory lists: 3 hours of student time = 3 x $5 = $15.
Packing the books in boxes: 2 hours of student time = 2 x $5 = $10.
Sending the books to OCLC TechPro = $15 UPS.
Reviewing the returned books against the inventory list and checking the books in the OSU local system to make sure that every book had been returned and was in the system: 4 hours of student time = 4 x $5 = $20.
Spot-checking: 3 hours by the Slavic GAA = 3 x $9.80 = $29.40.
Managing the project, solving problems, and answering questions (locally or with OCLC TechPro): 5 hours of faculty time = 5 x $19.23 = $96.15.
Total support cost per month = $263.85.
Total support cost for five years: $263.85 x 60 months = $15,831.
Appendix 2
Staff cost for cataloguing these materials[3]:
Graduate student assistants (GAA) = 2 x $10,200 a year = $20,400 x 5 years = $102,000.
Student assistant = 40 hours a week x $5 average per hour = $200 x 52 weeks = $10,400 x 3 years = $31,200.
Slavic staff = $90,847.80 for three years, of which 70 per cent was devoted to cataloguing Slavic = $63,593.46.
Slavic staff = $21,000 for one year, of which 20 per cent was devoted to cataloguing Slavic = $4,200.
Faculty supervisor = average salary over 5 years = $200,000, of which 5 per cent was spent on this project = $10,000. (This includes problem solving and training GAs and student assistants, in addition to catalogue maintenance.)

work_c4cgkllrbbeftnjexxe5x6zhkq ----

Using the OAI-PMH ... Differently

D-Lib Magazine, July/August 2003, Volume 9 Number 7/8. ISSN 1082-9873.

Herbert Van de Sompel, Digital Library Research and Prototyping, Los Alamos National Laboratory
Jeffrey A. Young, OCLC Office of Research
Thomas B. Hickey, OCLC Office of Research

Abstract
The Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH) was created to facilitate discovery of distributed resources. The OAI-PMH achieves this by providing a simple, yet powerful framework for metadata harvesting. Harvesters can incrementally gather records contained in OAI-PMH repositories and use them to create services covering the content of several repositories. The OAI-PMH has been widely accepted, and until recently, it has mainly been applied to make Dublin Core metadata about scholarly objects contained in distributed repositories searchable through a single user interface. This article describes innovative applications of the OAI-PMH that we have introduced in recent projects. In these projects, OAI-PMH concepts such as resource and metadata format have been interpreted in novel ways. The result of doing so illustrates the usefulness of the OAI-PMH beyond the typical resource discovery using Dublin Core metadata. Also, through the inclusion of XSL^1 stylesheets in protocol responses, OAI-PMH repositories have been directly overlaid with an interface that allows users to navigate the contained metadata by means of a Web browser. In addition, through the introduction of PURL^2 partial redirects, complex OAI-PMH protocol requests have been turned into simple URIs that can more easily be published and used in downstream applications.

Introduction
It comes as no surprise that most current implementations of OAI-PMH [1] repositories mainly make descriptive metadata about resources harvestable. This emphasis on descriptive metadata has its origin in the early motivations of the OAI^3 (Open Archives Initiative) effort that focused on making distributed resources discoverable. The OAI-PMH facilitates this by providing a simple, yet powerful framework for metadata harvesting that allows harvesters to gather metadata held by different repositories into a central location and to make it searchable there.
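The mechanics of such harvesting are compact enough to sketch. The following minimal illustration performs date-sensitive, incremental harvesting against an OAI-PMH v2.0 repository; it assumes nothing beyond the standard protocol (the ListRecords verb, the optional from argument, and resumption tokens), and the commented example uses a baseURL that appears later in this article.

    from urllib.request import urlopen
    from urllib.parse import urlencode
    import xml.etree.ElementTree as ET

    OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

    def harvest(base_url, metadata_prefix="oai_dc", from_date=None):
        """Incrementally harvest records, following resumptionTokens.
        Passing from_date (YYYY-MM-DD) limits the harvest to records
        created or changed since the previous harvest."""
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        while True:
            with urlopen(base_url + "?" + urlencode(params)) as response:
                tree = ET.parse(response)
            for record in tree.iter(OAI_NS + "record"):
                yield record
            token = tree.find(".//" + OAI_NS + "resumptionToken")
            if token is None or not (token.text or "").strip():
                break
            params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

    # Example: gather everything changed since the previous run.
    # for rec in harvest("http://alcme.oclc.org/xtcat/servlet/OAIHandler",
    #                    from_date="2003-06-01"):
    #     ...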
Initially, the descriptive metadata provided by OAI-PMH repositories was to a large extent limited to the mandatory unqualified Dublin Core, but an evolution towards the provision of more extensive descriptive metadata, such as MARC21, is becoming apparent. Providing extensive descriptive metadata is possible in the OAI-PMH thanks to the notion of parallel metadata formats, which enables repositories to expose metadata about the same resource in multiple metadata formats. Creative interpretation of what actually constitutes a resource about which an OAI-PMH repository holds metadata, and of the nature of metadata formats used in the OAI-PMH, has led to suggestions that the protocol could be quite useful beyond the typical resource discovery using Dublin Core metadata and could reach into the realm of state maintenance in distributed systems [2, 3, 4]. As a matter of fact, metadata records in the OAI-PMH are any data that can be validated against a W3C^4 XML^5 Schema^6. Therefore, the OAI-PMH can be a medium for incremental, date-sensitive exchange of any form of semi-structured data.

In the Section below entitled "Unconventional OAI-PMH resources and metadata formats", three creative uses of the OAI-PMH notions of resource and metadata format are described in the context of the application in which they are used. The Section entitled "A user interface for OAI-PMH repositories" describes an approach to overlay OAI-PMH repositories with an interface allowing users to directly navigate the repository content. The Section entitled "PURLs for simple access to OAI-PMH records" describes an approach that uses PURL (Persistent URL^7) partial redirects to create simple URIs that lead to records in OAI-PMH repositories. These URIs are easier to publish and use in downstream applications than their corresponding OAI-PMH protocol requests.

Unconventional OAI-PMH resources and metadata formats
In this Section, three examples are given of less conventional OAI-PMH repositories that make creative use of the OAI-PMH notions of resource and metadata format. The repositories are described in the following order: the GSAFD^8 Thesaurus, the Digital Library Usage Logs, and the OpenURL^9 Registry.

The GSAFD Thesaurus
Libraries put a great deal of effort into creating, maintaining, and using thesauri to improve the recall and precision of database searches. In the GSAFD Thesaurus project, the OCLC Office of Research attempts to determine the value of cross-thesaurus linking and improved thesaurus access. The desired enhanced thesaurus services are intended for both machine and human use. As the basis of this effort, the GSAFD Thesaurus is stored as an OAI-PMH repository.

The GSAFD Thesaurus records, which are available in MARC 21^10 format, were downloaded from the American Library Association site [5]. The records were enriched with 7XX fields where a GSAFD term mapped to a term in the Library of Congress Subject Heading (LCSH) file. The exact nature of this mapping process, as well as an evaluation of its added value, is beyond the scope of this article.
The records were then converted to the MARC XML format [6] and subsequently stored in an OAI-PMH repository. The resources about which the OAI-PMH repository exposes metadata are the concepts represented by thesaurus terms. In the repository, an OAI-PMH item exists per thesaurus term, and its OAI-PMH identifier is the actual thesaurus term. Three OAI-PMH metadata formats are available per OAI-PMH item:
• oai_dc: Used to dynamically disseminate unqualified Dublin Core metadata describing the enriched thesaurus record.
• marc21: Used to contain the enriched MARC 21 thesaurus records in the MARC XML format.
• z39_19: Used to dynamically disseminate a user-friendly Z39.19 format [7] representation of the enriched thesaurus records.

This approach allows the GSAFD Thesaurus to be simultaneously accessed in three modes, all based on the OAI-PMH protocol:
• Interaction by users via a Web browser. At the time of writing, this mode of interaction can be experienced at OAI-PMH baseURL http://alcme.oclc.org/gsafd/OAIHandler?verb=ListIdentifiers&metadataPrefix=z39_19. Table 1 shows a Z39.19 record in its XML form. Details on how direct user-interaction with an OAI-PMH repository can be implemented are provided in the Section entitled "A user interface for OAI-PMH repositories".
• Interaction by machines through OAI-PMH-based Web Services mechanisms.
• Interaction by OAI-PMH harvesters aiming at recurrently gathering thesaurus records, and updates thereof, for use elsewhere.

  Adventure fiction
    Use for works characterized by an emphasis on physical and often violent
    action, exotic locales, and danger, generally with little character
    development.
    UF Adventure stories
    UF Swashbucklers
    UF Thrillers
    NT Picaresque literature
    NT Robinsonades
    NT Sea stories
    NT Western stories
    MT Adventure and adventurers-Fiction
    MT Adventure stories

Table 1: A Z39.19 thesaurus record from the GSAFD Repository. This record shows the relationships between terms using standard Z39.19 labels. The "MT" label is an extension to the Z39.19 standard to indicate terms that are mapped to a different thesaurus. The scheme attribute indicates that the first maps to a term in the LCSH (juvenile) thesaurus and the second maps to a term in the standard LCSH thesaurus. The controlNumber attribute indicates the unique ID within the external thesaurus.

As a result, by storing the GSAFD Thesaurus as an OAI-PMH repository, its content becomes an integral part of the Web infrastructure, where it can be seamlessly used by both human and machine using standard Web tools.

Digital Library Usage Logs
In a recent collaboration between Old Dominion University and the Los Alamos National Laboratory (LANL) aimed at the deployment of recommender systems, logs that describe the usage of the LANL Digital Library are exposed through the OAI-PMH. The Digital Library Usage Log repositories currently cannot be publicly harvested. Because of the specific aim of the project, only actions by which users express a preference for a specific document are selected and ingested into a relational database. In the current set-up, preference is measured implicitly [8], and a log-entry is typically created when a user clicks an OpenURL [9] provided for a document available via the LANL Digital Library services. The content of the database populated by events that express user preferences is exposed as two interlinked OAI-PMH repositories:
• The User Repository: The resources about which this repository exposes metadata are the users of the LANL Digital Library.
In this repository, an OAI-PMH identifier exists per user accessing the Digital Library. In principle, two parallel metadata formats are available for each OAI-PMH identifier: a Dublin Core record describing the user and a User-log record that lists all relevant actions by the user. An individual entry in a User-log record will, amongst other things, list the unique identifier of the document about which the user expressed a preference, as well as the datetime on which this happened. For reasons of privacy, the OAI-PMH identifiers used in this repository do not reveal the real identity of the user, and the Dublin Core record describing the user is actually not exposed. This raises questions about the compliance of the User Repository with the OAI-PMH, which mandates support of unqualified Dublin Core. It also illustrates why the OAI Technical Committee failed to reach a consensus quickly on the question of whether or not to keep Dublin Core mandatory in version 2.0 of the protocol.
• The Document Repository: The resources about which this repository exposes metadata are the documents for which users have expressed a preference. In this repository, an OAI-PMH identifier exists per document for which a preference has been expressed. Two parallel metadata formats are available for each OAI-PMH identifier: a Dublin Core record describing the document itself and a Document-log record that lists all expressions of preference for the actual document by users. An individual entry in a Document-log record will, amongst other things, list the unique identifier of the user that expressed a preference for the document, as well as the datetime on which this occurred.

Table 2 shows a brief Document-log record from this repository. To support better readability, XML namespace information is omitted and information is not encoded. The record describes the document and the two agents that expressed a preference for it:

  document: ori:doi:10.1006/geno.1999.5925
  agent:    ori:oai:lib.lanl.gov:agent:IP:128.165.10.178
  agent:    ori:oai:lib.lanl.gov:agent:IP:128.165.44.22
Recommendations will be accessible by querying the application using an XML ContextObject as specified the Draft NISO OpenURL Standard. The OpenURL Registry The upcoming NISO OpenURL Standard is a so-called "framework standard". Its nature is inspired by the Bison-Futé model [10] and extends the usability of OpenURL's context-sensitive services concept [10, 11, 12, 13] beyond the scholarly domain in which OpenURL originated. It does so by specifying a framework that enables communities to define and implement their own OpenURL-based service environment. The approach builds on a Registry that is introduced to contain the explicit definitions for core components of the OpenURL Framework as registered by communities. Such core components include, amongst others, Namespaces of Identifiers that can be used to identify resources, Metadata Formats that can be used to describe resources, ContextObject Formats that can be used to express the payload of an OpenURL using a well-defined syntax, and Community Profiles that list the actual choices a Community makes from Registry entries when it actually deploys its own OpenURL environment. To bootstrap deployment of the new specification in the original OpenURL community, many initial Registry entries provided by the NISO AX Committee12 are relevant for the purpose of open linking in the scholarly information environment. For example: The registered Namespaces of Identifiers include the DOI13 Namespace, the Namespace of PubMed identifiers, and the Namespace of OCLC WorldCat numbers. Several registered Metadata Formats have tags that closely resemble those used in OpenURL 0.1 [9], but MARCXML [6] and Unqualified Dublin Core [14] have also been registered. These Metadata Formats are either defined by means of an XHTML14 document derived from a well-defined Template [15], or by means of a W3C XML Schema. Two ContextObject Formats that can be used by several Communities to express the payload of an OpenURL have been defined. The first is a Key/Encoded-Value Format [16] that — like OpenURL 0.1 — expresses the OpenURL payload as a list of ampersand-delimited key/value pairs. Its definition is based on the aforementioned XHTML Template. The second is the XML Format [17] in which the OpenURL payload is expressed as an XML instance document, the format of which is defined by means of a W3C XML Schema. Early in the NISO AX Standardization effort, it had been suggested that making the OpenURL Registry OAI-PMH conformant could lead to an environment in which OpenURL Resolvers could easily remain synchronized with the definitions contained in the Registry by regularly polling for updates and harvesting them whenever required [3]. Although the nature of the content of the Registry has significantly evolved since then, the idea of creating an OAI-PMH compliant Registry has been used for the Registry for Trial Use of the Draft NISO OpenURL Standard. The nature of the OAI-PMH Repository that holds the Registry entries is described below. In order to avoid confusion between OAI-PMH and OpenURL terminology the following convention is used in this description: an OAI-PMH term is followed by [OAI], and an OpenURL term is followed by [OURL]. The resources [OAI] about which the repository [OAI] contains metadata [OAI] are the concrete entries for core components of the OpenURL Framework that are registered by Communities in order to be able to deploy their OpenURL environment. 
For example, a resource [OAI] can be the DOI Namespace [OURL], an XML Metadata Format [OURL] to describe book-like objects, or the Key/Encoded-Value ContextObject Format [OURL] used to express the payload of an OpenURL. Each such registered item receives an Identifier [OURL] at registration, and this becomes an identifier [OAI] for the item in the OAI-PMH repository. Currently, the OAI-PMH repository supports three metadata formats [OAI] with the following metadataPrefixes [OAI] and characteristics: oai_dc: Dublin Core is used to describe registered items. A Dublin Core record exists per registered item. mtx: This metadata format [OAI] is defined by the W3C XML Schema that defines XHTML. The metadata format [OAI] is actually further restricted by the XHTML Template that is used for defining Key/Encoded-Value Metadata Formats [OURL] for the OpenURL Framework. An MTX record [OAI] exists for all registered Key/Encoded-Value Metadata Formats [OURL], and the record [OAI] content is the actual XHTML-based definition of the Metadata Format [OURL]. xsd: This metadata format [OAI] is defined by the W3C XML Schema that defines W3C XML Schema. An xsd record [OAI] exists for all registered XML Metadata Formats [OURL], and the record [OAI] content is the actual W3C XML Schema that defines the Metadata Format [OURL]. If required for the purpose of registration, the repository can support additional metadata formats [OAI], as long as they can be defined by means of W3C XML Schema. For example, it is anticipated that Community Profiles listing Registry choices will be registered and will be unambiguously expressed as well-formed XML instance documents that validate against a special-purpose W3C XML Schema. If this happens, the repository can support a fourth metadata format [OAI] to accommodate such Community Profile documents. In addition to the described interpretations of the OAI-PMH notions of resource and metadata formats, the repository also builds on the OAI-PMH notion of sets. In the OAI-PMH, repositories can optionally implement sets, which group contained items into hierarchical subdivisions of the repository. The repository used for the OpenURL Registry implements a set structure in which every set refers to a core component of the OpenURL Framework. As such, there are, for example, sets to contain registered Namespaces of Identifiers, a set to contain registered Character Encodings, etc. The described approach to the deployment of the OpenURL Registry does effectively facilitate a straightforward synchronization of information that is essential to the functioning of the OpenURL Framework between the Registry and OpenURL Resolvers. But, as will be shown in the next Section, it also enables the creation of a straightforward interface that allow users to navigate the Registry content in a meaningful manner, by the sole use of OAI-PMH requests. The OpenURL Registry repository can be harvested at OAI-PMH baseURL http://alcme.oclc.org/openurl/servlet/OAIHandler. A user interface for OAI-PMH repositories The OAI-PMH was designed to facilitate incremental harvesting of metadata contained in a repository by robots; so far, its uses have largely been restricted to that application area. However, it is also possible to explore the content of repositories from a user interface that only uses OAI-PMH requests as its navigation mechanism. As will be shown, this approach can be highly attractive for certain repositories. 
In order to implement direct and meaningful access from a Web browser to the content of an OAI-PMH repository, a reference to an XSLT stylesheet is introduced in OAI-PMH protocol requests. Doing so, a protocol response looks as shown in Table 3. 2002-02-08T12:00:01Z  ... Table 3: An OAI-PMH protocol response with embedded reference to an XSLT stylesheet. When such a response is sent to an automated process such as an OAI-PMH harvester, the stylesheet reference will be ignored and the XML will be processed directly. However, a Web browser receiving the response will use the stylesheet reference to render the response into HTML in the manner specified by the stylesheet. As such, it is possible to create a browser-based user interface to interact directly with an OAI-PMH repository by merely clicking OAI-PMH requests provided in the interface, and by receiving OAI-PMH responses rendered by means of a specified stylesheet. Similarly (perhaps even more generally useful), an OAI service provider can directly issue a GetRecord request to a record's home repository on behalf of the user so that the home repository has control of the transformation/display/branding of the records that users see. The capabilities of the user interface using this method are rather limited because of the limitations of using only OAI-PMH verbs. However, for simple applications or for the navigation of small repositories, the approach can be quite useful as is illustrated in the following examples. Figure 1 shows the result of issuing the OAI-PMH GetRecord request to obtain a Dublin Core record from the XTCAT Experimental Thesis Catalog. The response, which includes a stylesheet reference, is rendered by the browser. No further external mediation is required for displaying the metadata contained in the response to users. In addition, several navigational links are provided in the interface that are OAI-PMH requests. Figure 1: An OAI-PMH response to the GetRecord request http://alcme.oclc.org/xtcat/servlet/OAIHandler? verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:xtcat.oclc.org:OCLCNo/ocm00006585 Figure 2 shows how the OAI-PMH ListIdentifiers verb is used to render terms from the small GSAFD Thesaurus in a user interface. In the GSAFD Thesaurus, terms are treated as OAI-PMH identifiers. As can be seen, the interface allows for further exploration of each thesaurus term. This is achieved by hyperlinking each term with an OAI-PMH GetRecord request for the Z39.19 [7] metadata that describes the term. This approach would not lead to an interesting user interface if the OAI-PMH identifiers were not meaningful in themselves — for example, they could be numbers sequentially assigned to thesaurus terms instead of the terms themselves. In such cases, a ListRecords request could be substituted for ListIdentifiers and the meaningful thesaurus terms could be extracted from the appropriate metadata tag. Figure 2: An OAI-PMH response to the ListIdentifiers request http://alcme.oclc.org/gsafd/OAIHandler? verb=ListIdentifiers&metadataPrefix=Z39_19 Figure 3 shows how the OAI-PMH ListSets request is used in the OpenURL Registry to display a list of the core components of the OpenURL Framework. Again, the interface allows for further exploration of the Registry by means of OAI-PMH requests. For example, the hyperlink provided with each set name is an OAI-PMH ListRecords request for all items in the set that have Dublin Core metadata. 
Because all entries in the OpenURL Registry have Dublin Core metadata, the result will be a list describing each item of the specified set, i.e., of the specified core component of the OpenURL Framework. Figure 3: An OAI-PMH response to the ListIdentifiers request http://alcme.oclc.org/openurl/servlet/OAIHandler? verb=ListSets Finally, Figure 4 shows the usage of the OAI-PMH ListMetadataFormats request to allow users of the OpenURL Registry to navigate to either of the two currently existing types of definitions for OpenURL ContextObject Formats or Metadata Formats — namely, the Key/Encoded-Value Formats defined by means of the XHTML Template, or the XML Format defined by means of XML Schema. For each definition type, a metadata format [OAI] is available in the repository. The hyperlink provided with each of the listed types is an OAI-PMH ListRecord request for Dublin Core metadata of all Format definitions of the specified type, i.e., with the metadata format [OAI]. Figure 4: An OAI-PMH response to the ListIdentifiers request http://alcme.oclc.org/openurl/servlet/OAIHandler? verb=ListMetadataFormats PURLs for simple access to OAI-PMH records OAI-PMH identifiers uniquely identify items in OAI-PMH repositories. They are resolved through the use of the identifier itself, along with an identifier of a metadata format available for that item. The resolution occurs through the submission of rather lengthy OAI-PMH GetRecord requests (e.g., http://alcme.oclc.org/xtcat/servlet/OAIHandler?verb= GetRecord&metadataPrefix=oai_dc&identifier=oai:xtcat.oclc.org:OCLCNo/ocm00006585). In the above, it has been shown that the OAI-PMH can be useful for direct human interaction with repositories. Hence it seems sensible to construct "cool URLs" [18] to resolve OAI-PMH identifiers, because they are easier for humans to use. PURLs [19] are a method for creating and maintaining URLs for digital collections. PURLs do this by offering a level of indirection to URLs that enables a collection owner to change the URL for objects in the collection while maintaining a Persistent URL for publication and access. The PURL system also includes the ability to do "partial redirects" in which only part of the PURL is used for the indirection to an actual URL. This turns out to be an effective technique for creating a name gateway to turn complex OAI-PMH GetRecord requests into cool URLs15. The proposed scheme [20] for creating OAI-PMH GetRecord PURLs is: "http://purl.org/oai/" repository-identifier "/" metadataPrefix "/" local-identifier For OAI-PMH repositories where the identifiers conform to the oai-identifier schema [21], the repository-identifier in the PURL should match the repository-identifier embedded in the oai-identifier. For example, the XTCat [22] Repository has oai-identifiers of the form oai:xtcat.oclc.org:OCLCNo/ocm00006585. The repository-identifier is therefore xtcat.oclc.org and the local-identifier for this particular item is OCLCNo/ocm00006585. Following the proposed PURL scheme, the corresponding cool URL is http://purl.org/oai/xtcat.oclc.org/oai_dc/OCLCNo/ocm00006585, which resolves to the oai_dc GetRecord response shown earlier. 
Such resolution is achieved by creating a PURL partial redirect of the form: "http://purl.org/oai/" repository-identifier "/" metadataPrefix "/" which will be mapped to the destination: baseURL "?verb=GetRecord &metadataPrefix=" metadataPrefix "&identifier=oai:" repository-identifier ":" Examples: /oai/xtcat.oclc.org/oai_dc/ -> http://alcme.oclc.org/xtcat/servlet/OAIHandler ?verb=GetRecord &metadataPrefix=oai_dc &identifier=oai:xtcat.oclc.org: /oai/registry.openurl.info/oai_dc/ -> http://www.openurl.info/registry/servlet/OAIHandler ?verb=GetRecord &metadataPrefix=oai_dc &identifier= Note, however, that the OpenURL Registry used in the latter example doesn't use identifiers that conform to the oai-identifier scheme, so the entire identifier must be appended to the PURL rather than a parsed local-identifier. Appending a local-identifier to a PURL partial redirect has the effect of appending it to the PURL partial redirect's destination, thus completing the identifier parameter in the OAI-PMH GetRecord request. The described technique makes publishing of OAI-PMH GetRecord requests in downstream applications easier and makes handling the requests by humans more straightforward. Conclusions This article has introduced some novel ways to use the OAI-PMH. It has been shown that, through the creative interpretation of the OAI-PMH notions of resource and metadata format, repositories with rather unconventional content, such as Digital Library usage logs, can be deployed. These applications further strengthen the suggestion that the OAI-PMH can effectively be used as a mechanism to maintain state in distributed systems. It has also been shown that simple user interfaces can be implemented by the mere use of OAI-PMH requests and responses that include stylesheet references. For certain applications, such as the OpenURL Registry, the interfaces that can be created in this manner seem to be quite adequate, and hence the proposed approach is attractive if only because of the simplicity of its implementation. The availability of an increasing amount of records in OAI-PMH repositories generates the need to be able to reference such records in downstream applications, through URIs16 that are simpler to publish and use than the OAI-PMH HTTP GET requests used to harvest them from repositories. This article has shown that PURL partial redirects can be used to that end. Acknowledgments The authors would like to acknowledge the work Patrick Hochstenbach (Los Alamos National Laboratory), and Johan Bollen (Old Dominion University) on the Digital Library Usage Log repository, as well as the work of Phil Norman (OCLC) on the OpenURL Registry. Many thanks also to Patrick Hochstenbach, Carl Lagoze, and Michael Nelson for their feedback on the draft of this article. Notes 1. XSL - Extensible Stylesheet Language, . 2. PURL - Persistent URL, . 3. OAI - Open Archives Initiative, . 4. W3C - World Wide Web Consortium, . 5. XML - Extensible Markup Language, . 6. XML schema, . 7. URL - Uniform Resource Locator, . 8. GSAFD - Guidelines on Subject Access to Individual Works of Fiction, Drama, Etc., . 9. OpenURL - NISO AX Committee. 2003. The OpenURL Framework for Context-Sensitive Services, Draft Standard. . 10. MARC 21 - MAchine-Readable Cataloging (21 refers to the 21st century), . 11. XLink. . 12. NISO AX - National Information Standards Organization, . 13. DOI - Digital Object Identifier, . 14. XHTML - Extensible HyperText Markup Language, . 15. Cool URLs, "Cool URIs don't change," . 16. 
References

[1] Lagoze, Carl, Herbert Van de Sompel, Michael Nelson, and Simeon Warner. 2002. The Open Archives Initiative Protocol for Metadata Harvesting - Version 2.0.
[2] Van de Sompel, Herbert. 2000. Closing Keynote Address for the Task Force Meeting of the Coalition for Networked Information, San Antonio TX, Fall 2000.
[3] Van de Sompel, Herbert and Donna Bergmark. 2002. A distributed registry for OpenURL metadata schemas with an OAI-PMH conformant central repository. IEEE Proceedings of the 2002 International Conference on Parallel Processing Workshops, 18-21 August 2002, Vancouver CA, pp. 469-472.
[4] Nelson, Michael. 2002. Service Providers: Future Perspectives. Presentation at the 2nd Workshop on the Open Archives Initiative, Geneva, Switzerland, October 2002.
[5] Association for Library Collections & Technical Services. 2003. ALA | MARC 21 Authority Records for GSAFD Genre Terms.
[6] MARCXML.
[7] National Information Standards Organization. 1993. Guidelines for the Construction, Format, and Management of Monolingual Thesauri.
[8] Claypool, Mark, Phong Le, Makoto Wased, and David Brown. 2001. Implicit Interest Indicators. Proceedings of the International Conference on Intelligent User Interfaces, January 14-17 2001, Santa Fe, NM, pp. 33-40.
[9] Van de Sompel, Herbert, Patrick Hochstenbach and Oren Beit-Arie. 2000. OpenURL Syntax Description.
[10] Van de Sompel, Herbert and Oren Beit-Arie. 2001. Generalizing the OpenURL Framework beyond References to Scholarly Works: The Bison-Futé Model. D-Lib Magazine 7(7/8).
[11] Van de Sompel, Herbert and Patrick Hochstenbach. 1999. Reference Linking in a Hybrid Library Environment. Part 1: Frameworks for Linking. D-Lib Magazine 5(4).
[12] Van de Sompel, Herbert and Patrick Hochstenbach. 1999. Reference Linking in a Hybrid Library Environment. Part 2: SFX, a Generic Linking Solution. D-Lib Magazine 5(4).
[13] Van de Sompel, Herbert and Patrick Hochstenbach. 1999. Reference Linking in a Hybrid Library Environment. Part 3: Generalizing the SFX Solution in the "SFX@Ghent & SFX@LANL" experiment. D-Lib Magazine 5(10).
[14] Johnston, Pete. 2002. Unqualified Dublin Core XML Schema for OAI-PMH.
[15] NISO Committee AX. 2003. The Z39.88-2003 Matrix Constraint Language.
[16] NISO Committee AX. 2003. The Key/Encoded-Value Physical Representation.
[17] NISO Committee AX. 2003. The XML Physical Representation.
[18] Berners-Lee, Tim. 1998. Hypertext Style: Cool URIs don't change.
[19] OCLC. 2003. Persistent URL Home Page.
[20] Powell, Andy, Jeffrey A. Young, Thomas B. Hickey. In press.
[21] Lagoze, Carl, Herbert Van de Sompel, Michael Nelson, and Simeon Warner. 2002. Implementation Guidelines for the Open Archives Initiative Protocol for Metadata Harvesting: Specification and XML Schema for the OAI Identifier Format.
[22] OCLC. 2003. XTCat - Experimental Thesis Catalog.

Copyright © Herbert Van de Sompel, Jeffrey A. Young, and Thomas B. Hickey
DOI: 10.1045/july2003-young

work_cabyg7z32regheuhkbk5den7eu ----

This is an author produced version of a paper published by Interlending & Document Supply. This paper has been peer-reviewed but does not include the final publisher proof-corrections or journal pagination.

Ylva Gavel
Bringing the national interlending system into the local document supply process – a Swedish case study.
Interlending & Document Supply 2015 43(2):104-109
DOI: http://dx.doi.org/10.1108/ILDS-12-2014-0060

This article is © Emerald Group Publishing and permission has been granted for this version to appear here: http://hdl.handle.net/10616/44766. Emerald does not grant permission for this article to be further copied/distributed or hosted elsewhere without the express permission from Emerald Group Publishing Limited.

Bringing the National Interlending System into the Local Document Supply Process – a Swedish Case Study

Ylva Gavel
Karolinska Institutet University Library, Stockholm, Sweden
Ylva.Gavel@ki.se

Abstract

Purpose - The purpose of this paper is to describe how systems automating the local document supply process (such as integrated library systems and ILL management systems) can be integrated with systems automating regional document requesting (interlending). This is illustrated with a case study of DocFlow, an ILL management system developed in-house at Karolinska Institutet, and its integration with Libris, the national interlending system in Sweden.

Design/methodology/approach - The present paper describes how system integration between Libris and DocFlow was accomplished in practice. It also discusses various aspects of integration between systems offering automation of document supply.

Findings - Integration between local document supply workflows and regional document request flows may involve techniques such as import of outgoing and incoming interlending requests, synchronization of status values between systems, exchange of messages between systems, and quick links to the native interfaces of external systems.

Practical implications - The paper brings up various aspects to consider when developing or procuring a system for the local management of ILL workflows.

Originality/value - The paper may provide a deeper understanding of system integration as it applies to the document supply process.

Keywords: Interlending, document supply, workflow management, library automation, system integration, Sweden

Paper type: Case study

Introduction

Library work is about providing access to documents - returnables (typically books), non-returnables (typically journal articles or book chapters), web pages, etc. Document requests are handled in various local workflows. A document request from a patron may incur ILL. ILL and related processes are also known as "interlibrary loan", "interlending", "resource sharing", "document supply", or "document delivery" (although the latter term is often limited to the somewhat narrower scope of delivery of non-returnables). In addition to outgoing ILL requests submitted on behalf of patrons, the library may handle incoming ILL requests from other libraries. The ILL request flows between libraries take place at a regional level (typically national, but sometimes global). In all, document supply involves at least four document request flows (local requests for returnables, local requests for non-returnables, regional requests for returnables, and regional requests for non-returnables). Various systems offer automation of the steps from discovery to delivery of a document. In order for the process of document supply to proceed as smoothly as possible, a request should preferably move seamlessly between the systems involved.
Not all products claiming to provide automation of ILL actually support all the workflows and request flows associated with the ILL process. For example, the ILL module of the integrated library system (ILS) may not be able to handle requests for non-returnables and/or incoming ILL requests from other libraries. The present paper presents a Swedish case where a local ILL management system (DocFlow) was integrated with the national interlending system (Libris).

Local ILL Workflows

Many steps in the workflows associated with document requests can be automated. Systems offering support for local ILL workflows are often referred to as ILL management systems (although they may also offer automation of other aspects of document supply, such as local requests for articles from journals held locally). Typical examples include well-known off-the-shelf products such as ILLiad, Relais and Clio - see (Jackson, 2000), (Knox, 2010), (Gavel and Hedlund, 2008) and (Breeding, 2013d) for reviews. Many ILSs come with ILL modules offering varying degrees of support for the local ILL workflows. Some libraries develop home-grown solutions for ILL management. In fact, ILLiad began as one (Kriz et al., 1998). ILL often involves circulation (checking out books to be lent to other libraries and keeping track of books obtained from other libraries) or document delivery. The term "document delivery" usually refers to the delivery of non-returnables such as journal articles. Circulation is typically handled by the ILS. Many ILL management systems come with document delivery modules, but document delivery can also be handled by standalone products such as Ariel, Prospero and Odyssey (Weible, 2004), (Hosburgh and Okamoto, 2010). Document delivery software typically supports electronic delivery, but due to restrictions imposed by copyright law and license agreements for electronic media the library may still be obliged to transmit some documents on paper. The medium of the document delivered does not necessarily reflect the originating medium.

Regional ILL Request Flows

The ILL process involves regional request flows occurring when libraries turn to external suppliers in order to obtain documents on behalf of patrons. The external suppliers are typically other libraries, but also commercial services such as Infotrieve. There are different national models for the handling of regional document request flows. The UK ILL scene is dominated by a single supplier (i.e., the BLDSC) (Lowery, 2006). Countries such as Sweden have a national interlending system. In still other countries, there are several coexisting regional ILL networks. The library will typically use the regional interlending system(s) for lending (supplying documents to other libraries) as well as borrowing (requesting documents from other libraries on behalf of patrons). See Figure 1. A library may have to rely on more than one interface in order to submit requests to its favourite suppliers. Automation of the regional workflows is offered by systems such as union catalogues and interlending systems. Interlending systems offer capabilities such as routing of requests between libraries connected to the system. An interlending system usually relies on a union catalogue in order to suggest potential suppliers for a document. Few off-the-shelf products support the automation of regional ILL workflows. Examples include VDX (Farrelly and Reid, 2003) (Braun et al., 2006) and consortial borrowing solutions such as INN-Reach (Breeding, 2013c).
Interlending systems such as OCLC (Jordan, 2009), DOCLINE (Collins, 2007) and RapidILL (Delaney and Richins, 2012) are widely used. Many countries have developed their own national interlending systems in order to meet local policies and needs. Examples in the Scandinavian countries include Libris in Sweden (Olsson, 1996), (Thomas, 2012), (Sagnert, 2008), (Lindström and Malmsten, 2008), DanBib in Denmark (Andresen and Brink, 2011), and NOSP in Norway (Gauslå, 2006). The distinction between interlending systems supporting regional request flows and local ILL management systems is not always entirely clear-cut. Some of the regional systems also offer support for local workflows. For example, OCLC/WorldCat is bundled with products such as ILLiad and WorldShare ILL (OCLC, 2002), (Breeding, 2013d).

The Need for System Integration

During its life cycle, a document request may flow through several systems (Breeding, 2013b). For instance, requests must make it between systems supporting ILL workflows at a local level and systems supporting ILL request flows at a regional level: document requests from local patrons incur interlending requests at a regional level, and interlending requests from other libraries need to be handled locally. A possible scenario might be as follows (see Figure 2): A patron finds a document of interest in a database or discovery tool and brings up the menu of a link resolver such as SFX. No full text is available, so the link resolver menu displays a pre-populated ILL request form. The ILL request from the patron is handled in the local ILL management system. This system, in turn, relays the request to a regional interlending system. If the request is filled, the document is sent to the requesting library rather than to the patron who requested it initially. If it is a journal article, it may be delivered through the document delivery module of the local ILL management system. Delivery of a book may involve the creation of a circ-on-the-fly record in the ILS so as to allow the book to be checked out by the patron. Without system integration, information about the request will have to be re-keyed whenever it passes the boundary between one system and another. Fortunately, there are international standards in place to allow systems involved in the ILL process to talk to each other. Standards providing interoperability between systems involved in ILL include OpenURL (linking), Z39.50 (search/retrieval), ISO ILL (interlending), and NCIP (circulation) (Breeding, 2013a), (Nye, 2004), (Needleman, 2012), (Andresen, 2013), (MacKeigan, 2014), (ISO, 2014), (Carlson, 2008), (Needleman et al., 2001). Standards compliance enables off-the-shelf compatibility between systems. For example, if a regional interlending system is compliant with ISO ILL, it can be expected to integrate seamlessly with local ILL management systems supporting the same standard. Successful cases of integration have been reported for ISO ILL compliant systems such as VDX, ILLiad and Relais (Farrelly and Reid, 2003), (OCLC, 2002), (Breeding, 2013a), (MacKeigan, 2014), (Jilovsky and Howells, 2012), (Moreno and Xu, 2010), (Irwin, 2009), (Nyqvist, 2008), (Hanington and Reid, 2010), (McGillivray et al., 2009). Unfortunately, although it has been around for a long time, the ISO ILL standard has not been consistently adopted (Jackson, 2005), (Breeding, 2013a), (Balnaves, 2013), (MacKeigan, 2014).
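To make the linking step in the scenario above concrete, the sketch below assembles an OpenURL 0.1-style source link of the kind a database can send to a link resolver such as SFX; the same key/value pairs can then be reused to pre-populate an ILL request form. The resolver address and the citation values are invented for illustration; only the parameter names come from the OpenURL 0.1 syntax.

    from urllib.parse import urlencode

    # Hypothetical base URL of a library's link resolver.
    RESOLVER = "https://resolver.example.edu/sfx"

    citation = {
        "genre": "article",   # OpenURL 0.1 genre value
        "issn": "0264-1615",  # journal the patron is after (illustrative)
        "volume": "43", "issue": "2", "spage": "104", "date": "2015",
        "atitle": "Bringing the national interlending system into the local document supply process",
        "sid": "somedb:websearch",  # identifier of the referring source
    }

    openurl = RESOLVER + "?" + urlencode(citation)
    print(openurl)  # the link the patron clicks to reach the resolver menu

If the resolver finds no full text for the citation, it can hand the very same citation fields on to the local ILL management system, which is what spares the patron from re-keying them.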
A new version of the ISO ILL standard, which may remedy some of the shortcomings of previous versions, is on its way (ISO, 2014). Many regional interlending systems offer integration via their own proprietary APIs or technologies such as structured email messages (Jordan, 2009), (Rodríguez-Gairín and Somoza-Fernández, 2014), (Gavel and Hedlund, 2008), (Gould, 2000). In Sweden, various local library automation systems integrate with the APIs of the national Libris interlending system (see below). An API (Application Programming Interface) is basically a piece of software allowing external systems to speak to a system according to a predefined set of rules. Unlike the output of end user interfaces, the output of APIs is mainly for machine consumption.

A Swedish Case of System Integration – Libris and DocFlow

The Swedish ILL Scene

In Sweden, the Libris systems have been offering library automation since the 1970s. The national union catalogue has an interlending module that relays ILL requests between libraries. The Libris systems are maintained by the Swedish National Library (also known as the Royal Library). The union catalogue contains bibliographic records of books and journals held by university and research libraries, and also some public libraries. Most Swedish university libraries rely primarily on Libris for their ILL needs. Some libraries (particularly in the STM sector) also make heavy use of Subito. The Libris interlending module addresses the task of directing an ILL request to a library owning the item requested. The holdings data in the union catalogue makes it possible to identify potential suppliers automatically. The support for local ILL management is rather limited, however. During the 1990s the university library of Karolinska Institutet (KIB) developed SAGA, a system offering automation of various aspects of document supply (Gavel and Hedlund, 2008). Other libraries could subscribe to the SAGA services, and several major university libraries were connected. The SAGA libraries formed a collaboration network where ideas and resources were shared. Apart from SAGA, Swedish libraries have been using systems such as the ILL modules of ILSs and the Swedish FFB system for their local ILL management needs. Although encouraging for the developers, the wide adoption of SAGA was a cause for some concern. For KIB it entailed a role as a provider of ILL infrastructure, a commitment that was perhaps outside the normal scope of university library operations. Facing the upcoming retirement of a member of the SAGA team, KIB initiated an internal review of the SAGA services in 2011. It was decided to phase out the external SAGA services and focus on internal user needs. SAGA closed down in 2014. Several SAGA libraries have developed their own in-house systems for local ILL management. The libraries were able to set up their new systems at relatively short notice. Possibly, the firm understanding of ILL workflow management gained through participation in the SAGA community contributed to the successful implementation of the new ILL management systems. Systems launched at former SAGA libraries include BasILL (Lund University), Viola (Stockholm University) and EBBA (Umeå University). KIB itself replaced SAGA with DocFlow, a system with essentially the same features as SAGA. DocFlow handles document requests from local patrons as well as from other libraries. It provides workflow management with work queues corresponding to stages in the management of a document request.
The interface supports tasks such as verification of incoming requests, generation of messages to requesters, generation of pickup slips, location of potential suppliers for ILL requests, monitoring of outstanding ILL requests, document delivery, statistics, and billing. Unlike SAGA, DocFlow is only for internal use by KIB.

Integration with Libris

The national Libris system can communicate with other systems. The initial version of SAGA was integrated with Libris via SMTP: email messages with ILL requests deriving from Libris were parsed automatically by a program at the SAGA server and saved in the SAGA database. Subsequent versions of Libris have offered HTTP-based data export. In this case, the local ILL management system connects to an export server at Libris in order to retrieve data on ILL requests. Since HTTP is the protocol of the Web, the requests at the export server can also be brought up in a Web browser for review (although in a format primarily meant for machine consumption). The functionality for machine-to-machine communication offered by Libris made it possible to integrate Libris with SAGA. SAGA could import requests from other Libris libraries automatically. Also, it could import requests placed in Libris on behalf of patrons. In the Libris interlending module, messaging between libraries is implemented as status changes. Examples of statuses that a request can assume during its life cycle include "Outstanding", "Forwarded", "Read", "May Reserve", "Reserved", "Not Fulfilled", and "Delivered". A new API with richer options for system integration was launched by Libris in 2013. The SAGA libraries were invited to beta test the new features. Integration through the new API was implemented in DocFlow. With the new API, the primary format for data exchange is JSON (JavaScript Object Notation, a machine-readable format that is still reasonably human-readable). External systems are allowed to import data on incoming (lending) requests as well as outgoing (borrowing) requests. Also, it is possible to update certain status values and submit responses to requests. The responses, in turn, result in status changes in the Libris system.

Integration of Lending Requests

Lending is the process of supplying documents to other libraries (Hilyer, 2006b). In the context of ILL, the word "lending" may refer to non-returnables (such as journal articles) as well as books - a terminology that may be confusing to non-librarians. Many lending requests received by KIB derive from the national Libris system. Lending requests from Libris are synchronized with DocFlow via the Libris API (see Figure 3). The lending requests from Libris are imported automatically by DocFlow and appear in the work queue for incoming requests. DocFlow automatically checks the holdings of incoming requests in the ILS and the electronic journal list. The lending request workflows supported by DocFlow are mainly related to pickup and document delivery. Books have to be checked out using the ILS. Upon importing a Libris lending request, DocFlow sets the status in Libris to "Read". In some cases KIB is unable to supply the item requested. In the DocFlow interface, there is a button for submitting responses to Libris requests. The button performs an API-based status update in the background. A negative response moves the request to the next supplier in Libris (if any). If the item is a book that is out on loan, KIB may offer the requesting library the option to place a hold.
The requesting library, in turn, may request a hold via the Libris interface. This results in a status change in Libris that is imported by DocFlow. Requests for holds appear in the work queue for incoming requests. The document delivery module of DocFlow supports the delivery of copies of journal articles and book chapters (in many cases on paper, due to restrictions imposed by licenses and copyright law). When the status in DocFlow is set to "Delivered", an API call in the background ensures that the status change is transmitted to the Libris system. In addition to Libris requests, lending requests from the Scandinavian interlending systems DanBib, Bibsys and NOSP are also imported automatically to DocFlow. In this case, the data derives from email messages that are parsed automatically.

Integration of Borrowing Requests

Requests from local patrons are submitted to DocFlow via a web form. This form can be used in standalone mode, but it is also integrated with several databases and discovery tools via the OpenURL-based SFX menu. The local requests deriving from the form often incur ILL requests. Borrowing is the process of submitting ILL requests to external suppliers (typically other libraries) on behalf of patrons (Hilyer, 2006a). In the context of ILL, the word "borrowing" may refer to non-returnables (such as journal articles) as well as books - again a terminology that may be confusing to non-librarians. The borrowing requests are placed in systems such as Libris, Subito, DOCLINE, DanBib, Bibsys and NOSP. When a request is received by DocFlow, the holdings for the document requested are checked automatically in various supplier systems if there is an ISSN or ISBN. Otherwise, a search in a potential supplier system has to be performed by the librarian. DocFlow has quick links to several supplier systems where a search is generated automatically based on the bibliographic information available in DocFlow. Upon retrieval of a bibliographic record for the document requested by the patron, a borrowing request can be placed via the native interface of the supplier. DocFlow supports automatic synchronization of borrowing requests with Libris via the Libris API (see Figure 4). However, the outgoing borrowing requests are not pushed into the Libris interlending system. Instead, they are entered via the native Libris interface and imported back to DocFlow afterwards. This is because the Libris interface is needed in order to tie the request to a bibliographic record in the union catalogue. The holdings information in the union catalogue makes it possible to generate a lender string consisting of sigel codes (Swedish library symbols) for potential supplier libraries. The quick links in DocFlow simplify the process of placing requests. For Libris, there is a link to the union catalogue search interface as well as a link directly to the ILL form of the interlending module. The ILL form is pre-populated automatically with data from DocFlow. Import of Libris borrowing requests is performed at regular intervals so as to allow monitoring of status changes of outstanding requests. The borrowing requests imported from Libris are linked to the initial patron requests in DocFlow via the DocFlow order number. When a borrowing request from Libris is ingested by DocFlow, the status of the corresponding DocFlow request is updated to "Order Placed".
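A minimal sketch of such a polling loop is given below. The endpoint, parameter names and JSON field names are invented for illustration - the real Libris API is not documented here - and the small dictionary stands in for DocFlow's request store; the mapping of supplier responses onto DocFlow statuses follows in the next paragraph.

    import requests  # third-party HTTP client, assumed available

    EXPORT_URL = "https://libris.example.org/api/requests"  # hypothetical
    STATUS_MAP = {  # Libris response status -> DocFlow status
        "Not Fulfilled": "Negative Response",
        "May Reserve": "May Reserve",
    }

    def sync_borrowing_requests(sigel, api_key, orders):
        """Poll outgoing (borrowing) requests and mirror their progress locally.

        `orders` stands in for DocFlow's request store, keyed by order number.
        """
        resp = requests.get(EXPORT_URL,
                            params={"library": sigel, "direction": "outgoing"},
                            headers={"X-Api-Key": api_key},
                            timeout=30)
        resp.raise_for_status()
        for req in resp.json().get("requests", []):
            order = orders.get(req.get("local_order_number"))
            if order is None:
                continue  # not a request placed through DocFlow
            order["status"] = "Order Placed"
            response_status = req.get("response_status")
            if response_status in STATUS_MAP:
                order["status"] = STATUS_MAP[response_status]

    # Example: one outstanding DocFlow request awaiting news from Libris.
    orders = {"DF-2014-0042": {"status": "New"}}
    # sync_borrowing_requests("Xx", "secret-key", orders)  # "Xx" = a sigel code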
If a response has been submitted in Libris by the supplier library, the status of the DocFlow request is updated to "Negative Response" (the request has not been fulfilled) or "May Reserve" (the book requested is out on loan, but it is possible to place a hold). If a borrowing request is placed in a system other than Libris (such as Subito), this has to be recorded manually. In this case, the name of the supplier system and the order number of the order placed are entered via the DocFlow interface.

Discussion

The ILL management process may involve several workflows, request flows, systems and document types. A library procuring software for the automation of ILL should consider carefully exactly which workflows and request flows need to be automated. The present case study illustrates some points to consider. ILL takes place at a local level (e.g., management in a local ILL management system) as well as a regional level (e.g., in consortial borrowing networks, national interlending systems and global systems such as OCLC). At KIB, the local workflows handled in DocFlow are seamlessly integrated with the national request flows of the Libris system. The technical solution relies on the APIs of the Libris system, but also on other approaches such as quick links. The integration keeps re-keying of information between DocFlow and Libris to a minimum. Integration with regional interlending systems such as Libris may rely on ISO ILL or proprietary standards. The Libris API supports data import and submission of messages and status changes for requests in the interlending module. Although the API is not ISO ILL compliant, the statuses it supports show some similarities to those of version 3 of the ISO ILL standard. An approach with some similarities to the integration between DocFlow and Libris is described in (Rodríguez-Gairín and Somoza-Fernández, 2014). In this case, the local ILL management system GTBib-SOD was integrated with OCLC WorldShare via SOAP-based web services offered by OCLC. An older case study discussing the considerations when integrating the Clio ILL management system with OCLC is presented in (Natale, 1999). The ILLiad ILL management system is integrated with OCLC in a similar fashion (OCLC, 2002), (Hilyer, 2006b), (Hilyer, 2006a). DocFlow supports the handling of incoming (lending) requests as well as outgoing (borrowing) requests. Not all ILL management systems support both. Due to the need to link the borrowing request to the originating request and monitor its progress in the remote system, system integration of borrowing requests may be somewhat more elaborate than that of lending requests. Integration between the major systems in the ILL process is essential, since any manual synchronization between systems may be labor-intensive and possibly error-prone. In addition to the regional interlending system, systems such as the ILS are candidates for integration with the local ILL management system. Libraries procuring a system involved in any step of the ILL process should check carefully with potential vendors exactly how it may be integrated with the other systems involved. Also, the overall system architecture should be considered. For example, the ILS may be integrated either with the local ILL management system or with the regional interlending system. DocFlow supports the handling of returnables as well as non-returnables. Not all ILL management systems support all document types.
However, many steps in the ILL process (such as verification of requests and location of potential suppliers) are essentially the same for all document types and should preferably be handled in the same system. The APIs offered by Libris allow smooth integration with local workflows involving exchange of document requests between Swedish libraries. For a medical library like KIB, a similar integration with Subito would be valuable. Still, there will always be supplier systems that cannot be perfectly integrated with the local ILL management system, either because of technical limitations or because the request volumes are not high enough to justify the programming effort. The system architecture of the local ILL management system must allow for that (e.g., by allowing manual recording of borrowing requests submitted).

References

Andresen, L. (2013), "New Interlibrary Loan Standard", Trends & Issues in Library Technology, Vol. 2 No. 1, available at: http://www.ifla.org/files/assets/information-technology/tilt_v2_i1.pdf (accessed 22 December 2014).
Andresen, L. and Brink, H. (2011), "Document supply in Denmark", Interlending & Document Supply, Vol. 39 No. 4, pp. 176-185.
Balnaves, E. (2013), "Editorial: Vale ISO ILL?", Trends & Issues in Library Technology, Vol. 2 No. 1, available at: http://www.ifla.org/files/assets/information-technology/tilt_v2_i1.pdf (accessed 22 December 2014).
Braun, P., Hörnig, L. and Visser, F. (2006), "A new approach towards a national inter-library loan system in the Netherlands: introducing VDX", Interlending & Document Supply, Vol. 34 No. 4, pp. 152-159.
Breeding, M. (2013a), "Interoperability and Standards", Library Technology Reports, Vol. 49 No. 1, pp. 12-15.
Breeding, M. (2013b), "Introduction to Resource Sharing", Library Technology Reports, Vol. 49 No. 1, pp. 5-11.
Breeding, M. (2013c), "The Orbis Cascade Alliance: Strategic Collaboration among Diverse Academic Institutions", Library Technology Reports, Vol. 49 No. 1, pp. 30-31.
Breeding, M. (2013d), "Products and Services", Library Technology Reports, Vol. 49 No. 1, pp. 16-29.
Carlson, A. (2008), "Acronyms Gone Wild! ILL Flirts with NCIP", Resource Sharing & Information Networks, Vol. 19 No. 1-2, pp. 71-75.
Collins, M. E. (2007), "DOCLINE", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 17 No. 3, pp. 15-28.
Delaney, T. G. and Richins, M. (2012), "RapidILL: an enhanced, low cost and low impact solution to interlending", Interlending & Document Supply, Vol. 40 No. 1, pp. 12-18.
Farrelly, J. and Reid, D. (2003), "Interlending and document supply: international perspectives in a New Zealand context", Interlending & Document Supply, Vol. 31 No. 4, pp. 228-236.
Gauslå, A. (2006), "The Norwegian "Bibliotek" database in a Nordic ILDS perspective", Interlending & Document Supply, Vol. 34 No. 2, pp. 57-59.
Gavel, Y. and Hedlund, L. O. (2008), "Managing document supply: a SAGA come true in Sweden", Interlending & Document Supply, Vol. 36 No. 1, pp. 30-36.
Gould, S. (2000), "Sending ILL requests by e-mail: a discussion and IFLA guidelines", Interlending & Document Supply, Vol. 28 No. 2, pp. 73-78.
Hanington, D. and Reid, D. (2010), "Now we're getting somewhere - adventures in trans Tasman interlending", Interlending & Document Supply, Vol. 38 No. 2, pp. 76-81.
Hilyer, L. A. (2006a), "Borrowing", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 16 No. 1/2, pp. 17-39.
(2006b), "Lending", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 16 No. 1/2, pp. 41-51. Hosburgh, N. and Okamoto, K. (2010), "Electronic Document Delivery: A survey of the Landscape and Horizon", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 20 No. 4, pp. 233-252. Irwin, J. (2009), "Reaping the harvest: end-user access and staff savings at the University of Auckland, New Zealand", Interlending & Document Supply, Vol. 37 No. 2, pp. 76-78. ISO (2014), "ISO 18626 Interlibrary Loan Transactions", available at: http://illtransactions.org/ (accessed 18 Jan 2015). Jackson, M. (2000), "Interlibrary Loan and Resource Sharing Products: An Overview of Current Features and Functionality", Library Technology Reports, Vol. 36 No. 6, pp. 5-225. Jackson, M. (2005), "When a good standard development process fails", Interlending & Document Supply, Vol. 33 No. 1, pp. 53-55. Jilovsky, C. and Howells, S. (2012), "Light at the end of the tunnel: transitioning from one interlending system to another", Interlending & Document Supply, Vol. 40 No. 1, pp. 19-25. Jordan, J. (2009), "OCLC 1998-2008: Weaving Libraries into the Web", Journal of Library Administration, Vol. 49 No. 7, pp. 727-762. Knox, E. (2010), "ILL and Document Delivery Technology Systems", in Document Delivery and interlibrary loan on a shoestring, Neal-Schuman Publishers, New York, pp. 137-158. http://illtransactions.org/ Kriz, H. M., Glover, J. M. and Ford, K. C. (1998), "ILLiad: Customer-Focused Interlibrary Loan Automation", Journal of Interlibrary Loan, Document Delivery & Information Supply, Vol. 8 No. 4, pp. 31-47. Lindström, H. and Malmsten, M. (2008), "User-centred design and the next generation OPAC - a perfect match?", paper presented at 32nd ELAG Library Systems Seminar, 2008, Wageningen, available at: http://library.wur.nl/elag2008/presentations/Lindstrom_Malmsten.pdf (accessed 23 Dec 2014). Lowery, B. (2006), "The British Library and document supply services", in Bradford, J. and Brine, J. (eds.) Interlending and Document Supply in Britain Today, pp. 15-24. MacKeigan, C. (2014), "The future of interoperability for ILL and resource sharing", Interlending & Document Supply, Vol. 42 No. 2/3, pp. 105-108. McGillivray, S., Greenberg, A., Fraser, L. and Cheung, O. (2009), "Key factors for consortial success: realizing a shared vision for interlibrary loan in a consortium of Canadian libraries", Interlending & Document Supply, Vol. 37 No. 1, pp. 11-19. Moreno, M. and Xu, A. (2010), "The National Library of Australia's document supply service: a brief overview", Interlending & Document Supply, Vol. 38 No. 1, pp. 4-11. Natale, J. A. (1999), "Using Clio 1.2 with the ILL Microenhancer for Windows", Journal of Interlibrary Loan, Document Delivery & Information Supply, Vol. 9 No. 3, pp. 25-51. Needleman, M., Bodfish, J., O'Brien, J., Rush, J. E. and Stevens, P. (2001), "The NISO circulation interchange protocol (NCIP) - an XML based standard", Library Hi Tech, Vol. 19 No. 3, pp. 223-230. Needleman, M. H. (2012), "An Update on New NISO and ISO Initiatives", Serials Review, Vol. 38 No., pp. 66-67. Nye, J. B. (2004), "Recent Developments in Standards for Resource Sharing", Journal of Library Administration, Vol. 40 No. 1-2, pp. 89-106. Nyqvist, C. (2008), "Interlibrary Loan Interoperability Experiment: A Nontechnical View", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 18 No. 4, pp. 417-424. 
OCLC (2002), "OCLC ILLiad - Supercharging Interlibrary Loan", OCLC Newsletter, No. 255.
Olsson, L. (1996), "Knowledge Organization as a National Information System Project: the Shaping of LIBRIS", Swedish Library Research, No. 2/3, pp. 57-64.
Rodríguez-Gairín, J.-M. and Somoza-Fernández, M. (2014), "Web services to link interlibrary software with OCLC WorldShare", Library Hi Tech, Vol. 32 No. 3, pp. 483-494.
Sagnert, B. (2008), "The Swedish LIBRIS system offers new web facilities for searching and ILL to librarians and the general public", Interlending & Document Supply, Vol. 36 No. 1, pp. 37-42.
Thomas, B. (ed.) (2012), Att bryta ny marc : LIBRIS 40 : en jubileumsskrift, Kungliga biblioteket, Stockholm.
Weible, C. L. (2004), "Selecting Electronic Document Delivery Options to Provide Quality Service", Journal of Library Administration, Vol. 41 No. 3, pp. 531-540.

About the author

Ylva Gavel is a systems developer at the university library of Karolinska Institutet. She holds a degree in Engineering Physics and a PhD degree in Theoretical Physics from the Royal Institute of Technology in Sweden. Recent development work involves projects in the fields of resource sharing, information retrieval and bibliometrics. Ylva can be contacted at: Ylva.Gavel@ki.se

Figure Legends

Figure 1 ILL borrowing (requesting) and lending (supplying) at local and regional levels
Figure 2 The life cycle of a document request
Figure 3 Synchronization of incoming lending requests
Figure 4 Synchronization of outgoing borrowing requests

work_cagkoadbt5fjlmdz3mgfhppijm ----

Library Management, 2001, vol. 22, no. 1/2, p. 80-85. ISSN: 0143-5124. DOI: 10.1108/01435120110358970. © 2001 MCB University Press.

Copy cataloguers and their changing roles at the Ohio State University Library: a case study

Magda El-Sherbini

Abstract

The Ohio State University Libraries began the process of restructuring the Technical Services Department in early 1995. Changes that were introduced as a result of this process had a profound effect on the roles of professional and para-professional staff in the organisation. The author of this paper will outline the process of revamping the organisational structure of a large department in an academic library and will discuss the impact it had on the traditional roles assigned to librarians and staff in such a setting. The study will concentrate on copy cataloguers and their changing role in Technical Services.
Introduction

It would not be an exaggeration to state at the outset that the restructuring of the cataloging department at the Ohio State University Libraries has had a profound effect on the roles and responsibilities of librarians and technical staff in that organisation. Changes resulting from the restructuring process helped redefine the role of original and copy cataloguers as well as their support staff. In this paper I will outline the reorganisation process and discuss the impact it had on traditional staff roles. The idea to reorganise technical services evolved over time, and began to take final shape around 1990. The process was initiated in response to the growing demands placed on the cataloging department and technical services. Restructuring emerged as the most effective method of maximising staff resources and providing internal opportunities for staff development and advancement. The idea of redefining traditional roles of librarians and staff was articulated in an article written in 1991 by Jennifer Younger (Younger, 1991), former director for technical services at OSU, who wrote:

Designing, creating and coordinating a bibliographic access system that will continue into the next century requires a team approach to utilize the expertise of both professionals and para-professionals.

Ideas explored in this paper led to the initial planning and implementation of the restructuring scheme at OSUL. The author of the present article held positions of head of original cataloging and later head of the cataloging department, and worked with the director for technical services and head of the acquisitions department on planning and implementation of these changes. Although not all the details of the outcome were clear at the beginning of the process, the directions set early on enabled us to succeed in achieving general objectives and goals. In addition to the long-range goals of maximising staff resources, there were also short-term objectives that we wanted to achieve in the process. These included:

• streamlining the workflow;
• increasing productivity of cataloguing in general and copy cataloguing in particular; and
• redirecting higher expertise to other functions of cataloguing.

In order to discuss the reorganisation process and the resulting changes in the roles of copy cataloguers at OSU, it will be helpful to present some background information. At the time when the restructuring decisions were being considered, the cataloging department consisted of approximately 60 professional librarians and technical staff. They were organised in seven sections:

(1) search and support staff;
(2) copy cataloguing;
(3) original cataloguing;
(4) maintenance;
(5) authority;
(6) special collections; and
(7) serials.

Each section generally represented various functions of technical processing assigned to the department. At the end of the five-year restructuring process, the department employed 18 professional librarians and technical staff and ten half-time student assistants. Work is now being conducted in six sections that are organised along subject or format lines. The department retained all the cataloguing functions except for simple copy cataloguing, which was reassigned to the acquisitions department, and the special collections section, which became an independent department. Changes in the role of copy cataloguers can best be seen when viewed against the process of restructuring of the original cataloguing, copy cataloguing and search sections of the cataloging department.
The original cataloging section

In the early 1990s some questions were raised about the role professional cataloguers played in the overall library operations. OSU librarians, who have faculty status and are in tenure track positions, were burdened with a wide range of academic, administrative and management responsibilities. At the same time they bore the primary responsibility for original cataloguing of all library materials. As the library found itself in the position of having to reallocate existing resources and seek new solutions, due in part at least to the shrinking library resources, the role of the cataloguing professional became part of the overall discussion. It was evident that professionally trained and experienced cataloguers constituted an important asset to the library. The administration had hoped to make better use of the acquired skills of professional cataloguers by redefining their primary areas of responsibility, placing the emphasis on more challenging and demanding management and training roles, and reducing their responsibility for direct title-by-title cataloguing. This became a viable option and came into focus partly in view of emerging alternatives to traditional methods of providing original cataloguing records. Options such as contracting out became available and gained a certain degree of acceptance in the library community at that time. The original cataloging section (OCS) at OSU consisted largely of professional librarians who were responsible for original cataloguing. Many of them were recruited to perform original cataloguing of foreign language materials. Their responsibility was limited largely to original cataloguing and various functions outside the department, related to the promotion and tenure commitments associated with the faculty rank. Primary cataloguing responsibilities of the OCS included:

• original cataloguing;
• complex copy cataloguing: K, M, L levels (which in some cases require extensive editing and enhancing of the record); brief records in OCLC (which require upgrading the record to a full bibliographic record); assigning call numbers and subject headings (which need a strong subject and expertise background).

Prior to the participation in the organisational restructuring, the author of this paper conducted two studies that explored alternative methods of original cataloguing (El-Sherbini, 1992, 1995). These included co-operative cataloguing, the use of temporary student help and contract cataloguing. Results of these studies suggested that there were other, viable ways to perform original cataloguing. The possibility of using alternative ways to do cataloguing also opened the door to new initiatives. Transformation of the original cataloging section began with the creation of a new high-level staff position to perform original cataloguing of western language materials. This was the first staff position at the OSUL with the responsibility for original cataloguing. The position was filled by copy cataloguers.
Besides their cataloguing responsibility, they also managed the workflow, and hired and trained graduate assistants. They were also involved in solving complex technical problems (such as call numbers, subject headings, resolving conflicts in the database and others), and negotiating contracts with vendors. The OCS became a section with a combination of staff and professional librarians, where staff and student assistants performed original cataloguing, with librarians devoting more time to training and supervising functions. This restructuring of the original cataloging section had a positive impact on productivity. The section was able to eliminate the longstanding backlogs in Arabic and western language materials as well as theses. The section kept current with the cataloguing of newly received materials. The success of this experiment led to the introduction of this management model in other sections of the library.

The search section

In the next phase of restructuring, a review of the search section was undertaken. Initially, the section consisted of an office manager, two classified staff and several student assistants. All were responsible for receiving books from the acquisitions department, searching the OCLC database, matching records and making printouts for copy cataloguers to use in cataloguing, and sorting materials by the type of records found (e.g. LC vs. member copy). Members of the section also created brief records for those titles not found in OCLC and entered these records into the OSUL system. In addition, they were responsible for organising and managing the backlogs. Section workflow and procedures were examined in preparation for possible restructuring and the following observations were made:

• There were redundancies in operations between the acquisitions department and the search section.
• The cost of searching was very high.
• Records found in the OCLC database were not used for copy cataloguing.
• There were substantial delays in processing, because most of the books that did not have copy were placed in the backlog for another search after six months.

To remedy the situation and introduce a more effective way to move materials through the system, it was decided to restructure the workflow and eliminate the unnecessary processing steps. This resulted in the merging of the search section with the copy cataloguing section. Most of the staff of the section joined the copy cataloguing section, while the head of the search section was reassigned to the original cataloguing section to perform original cataloguing. The revised workflow allowed for the elimination of some of the redundancies in the workflow and resulted in reducing the time required to complete material processing. Merging of the two sections allowed the department to make better use of existing expertise and provide opportunities for staff advancement.

Special collections

During the reorganisation, the special collections section became an independent department. One copy cataloguer was reassigned there to perform copy cataloguing, while another copy cataloguer moved to special collections to fill the vacant LA2 position. These two positions were lost to the cataloging department. The two staff members continued to perform duties similar to those which were part of their area of responsibility prior to the move.

Copy cataloguing

At the start of the process, the copy cataloguing section consisted of civil service staff members (para-professionals) of various ranks.
A professional librarian managed the section. The staff were responsible for very simple copy cataloguing, adapting copy from the OCLC database. After all the changes in the department, the copy cataloging section ceased to exist and copy cataloguers were reassigned to the newly created sections, where they perform various functions ranging from original cataloguing to searching and processing of incoming materials. Several factors affected copy cataloguing functions and copy cataloguing roles at the OSUL:

• A study of the feasibility of using OCLC's new PromptCat product showed that many of the approval plan western language materials have good copy in the OCLC database. Processing of these materials was moved to acquisitions, where it was done at the time of receipt.
• After reassigning some functions of the copy cataloging section to the acquisitions department, the copy cataloging section was disbanded.
• Classified staff at the LA1 level - the higher level copy cataloguers - were trained to perform some high level copy work which was previously handled by original cataloguers. This included assigning call numbers based on existing subject headings, cataloguing analytics and some problem solving.
• Some members of the search section started to perform simple copy cataloguing functions previously assigned to copy cataloguers only.
• Vacant professional librarian positions in the cataloging department were not reopened. It was determined that original cataloguing could be done by higher-level staff. A number of former copy cataloguers filled these vacant positions and were assigned responsibility for original cataloguing.
• Creation of new high-level staff positions in the cataloging department and the special collections department encouraged some copy cataloguers to apply for these positions, and their vacant positions in copy cataloging were eliminated.
• Graduate student assistants were hired to perform various functions, which included everything from searching to producing records for foreign language materials.

All of the above factors contributed in various degrees to the emergence of a new role for copy cataloguers. Many copy cataloguers assumed responsibilities formerly assigned only to the professionals. These responsibilities include:

• descriptive cataloguing;
• subject analysis;
• call number assignment;
• original cataloguing of literary works;
• original cataloguing of related editions;
• creating authority records for contribution to NACO;
• problem solving (example: fixing call number problems);
• training graduate student assistants;
• work with graduate student assistants with language expertise, to create original cataloguing records;
• managing and organising special projects;
• supervising student assistants.
Those copy cataloguers who assumed these new responsibilities moved to new positions in the civil service classification and their job titles changed from "copy cataloguer" to "cataloguer". Some former copy cataloguers are now working in the acquisitions department with the receiving staff. After initial training in the new department they are performing the following tasks: • searching titles in OSCAR when the piece arrives and reviewing against the existing bib and order records; • searching new titles in OCLC to locate matching records for downloading and processing. Conclusion OSUL has always employed many qualified and experienced staff members and librarians. The library management team recognised that there were opportunities to utilise available skills more effectively. This was accomplished by reassigning responsibilities. For copy cataloguers, it meant assuming responsibility for higher-level work (cataloguing). For professional librarians it meant moving away from cataloguing to management, training and other responsibilities. Massive restructuring of the cataloging department at the OSUL had a profound effect on the classified staff involved in performing copy cataloguing. It provided the copy cataloguers with new challenges and opportunities to assume responsibilities previously reserved for professional librarians. Those copy cataloguers who accepted the challenge moved into positions of higher responsibility and rank. They now perform original cataloguing (after position reclassification) and complex copy cataloguing as well as supervise student assistants. Those copy cataloguers who moved to the acquisitions department were given the opportunity to perform some of the same functions. In addition, they were offered training in new areas. This enabled them to perform processing functions related to acquisitions work. As a result of this experiment at OSUL, copy-cataloguing positions as they existed formerly have changed significantly. Training played a major role in the reorganisation process. It was handled internally by professional librarians who conducted most of the training sessions. The cataloging department created training guidelines for this purpose. Online tutorials were made available as part of the training package. The training program was gradual as each step was introduced according to the degree of difficulty. It started with assigning call numbers, followed by assigning subject headings, MARC formats, descriptive cataloguing, and finally full descriptive cataloguing. Training of this magnitude requires a minimum of six months. Staff reactions to the changes in the department were predictably varied. In the initial stages of the reorganisation process, there was a great deal of uncertainty and an overall lack of confidence in the process and its outcome. This is not surprising, given the scope of the restructuring and the nature of the changes being introduced. Such reactions can be attributed to the difficulty of articulating results of a long-range complex process such as this. As the first positive results began to emerge, both the librarians and staff began to see benefits of the process and accepted the concept. This came very gradually, and in some cases required special reassurances and a lot of individual support. I must emphasise that the OSU reorganisation included all operations in the cataloging department and was not limited only to the copy cataloging section. 
Since the changes were so broad in scope, staff reaction should be viewed in that context. At the very end of the process, when concrete benefits became apparent, most of the staff dramatically changed their attitudes and in general seemed very happy with the new arrangement and their new roles. It is necessary to emphasise the importance of clear and consistent communication throughout the whole process. Keeping people well informed of what is being planned and what changes are taking place helps a great deal. Throughout the process, staff were informed about its objectives and progress. This was achieved mainly through staff meetings. This reorganisation helped to streamline the workflow, eliminate redundancies, improve productivity and simplify the administrative structure in the department, and it was cost effective. The final result is a better use of individuals and their particular sets of skills and experience. Most of the goals established at the outset of the process have been achieved. Professional librarians and para-professional staff are now working together in teams where they can make better use of their experience and expertise. Improvements in the organisational structure of Technical Services led to better co-operation, as the cataloging and acquisitions departments are working together on technical service issues. Productivity has increased dramatically and many of the cataloguing backlogs have been eliminated or substantially reduced.

References

El-Sherbini, M. (1992), "Cataloging alternatives: an investigation of contract cataloging, cooperative cataloging, and the use of temporary help", Cataloging & Classification Quarterly, Vol. 15 No. 4, pp. 67-88.
El-Sherbini, M. (1995), "Contract cataloging: a pilot project for outsourcing Slavic books at the State University Libraries", Cataloging & Classification Quarterly, Vol. 20 No. 3, pp. 57-73.
Younger, J. (1991), "The role of librarians in bibliographic access services in the 1990s", Journal of Library Administration, Vol. 15 No. 1-2, pp. 125-50.

work_ce5isw63ejb5hnc3kidzjflaji ----

This is an Accepted Manuscript of an article published by Taylor & Francis Group in Cataloging and Classification Quarterly on May 14, 2019, available online: http://www.tandfonline.com/10.1080/01639374.2019.1602091
© 2019 Taylor & Francis Group. Personal use of this material is permitted. Permission from Taylor & Francis Group must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
To cite this article: Nurhak Tuncer & Reed David (2019): The Cataloging of Self-Published Items, Cataloging & Classification Quarterly, DOI: 10.1080/01639374.2019.1602091

The Cataloging of Self-Published Items

NURHAK TUNCER
Elizabeth City State University, Elizabeth City, North Carolina, USA

REED DAVID
Washington State University, Pullman, Washington, USA

Abstract: This article presents the results of a survey conducted in the fall of 2015 of librarians who are cataloging self-published items.
The survey was conducted in response to the growing popularity of self-publishing and the increasing prevalence of self-published items in libraries. Survey respondents were asked to describe how they are cataloging these items and provide representative examples of the records they have created. An analysis of both the survey responses and the records is presented, followed by suggestions for best practices for cataloging these items.

KEYWORDS: Self-publishing, cataloging, surveys, cataloging research, cataloging best practices

Acknowledgments: Karen Snow, Jean Harden, Kevin Kishimoto, Mark Scharff, Beth Iseminger

INTRODUCTION

From books to musical scores to audio and video recordings, self-publishing has exploded in popularity in recent years and self-published items have begun to make their way into all types of libraries. Cataloging these items can be challenging, though, usually because they tend to have vague or incomplete publication information on them. Such items usually need either original cataloging or highly complex copy cataloging and generally require at least some outside research in order to be properly cataloged. In this article, these challenges are presented and discussed in detail after careful analyses of a survey and the MARC records that were submitted by the survey respondents.

The authors began working together on this topic in March 2015. They realized that most of what little has been written in the library literature on self-publishing is from the collection-management perspective, with almost nothing available from the cataloging perspective. To address the challenges posed by self-published items, the authors conducted a survey of primarily U.S. librarians to analyze how they are cataloging these items. Launched on September 14, 2015 and closed on December 31, 2015, the survey asked catalogers to describe how they are cataloging self-published items and provide representative examples of the records they have created. The authors analyzed both the survey responses and the records to determine what the current trends are in the cataloging of these items. They presented their analyses at various professional conferences between October 2015 and August 2016.1

The goal of the authors’ work is descriptive and not prescriptive. They want to learn more about how self-published items are being cataloged in libraries today, get the library community talking about this topic, and shed light on areas that may need further research. It is hoped that this article will make clear that the library community needs to have a conversation about cataloging self-published items. The authors would like to see this conversation lead to the creation of best practices for cataloging self-published items, so that they may be as visible and accessible as possible for library users.

THE DEFINITION OF SELF-PUBLISHING

Defining self-publishing is important for catalogers as well as for the authors of this article, since it defines the scope of this research project. It is definitely more than a simple dictionary definition; Merriam-Webster defines the word self-publish as “to publish (a book) using the author's own resources.”2 At the very least, the portion in parentheses should read “(a book or other media),” given that items in all formats can be self-published. Even with this addition, this is a vague definition. After careful observation, though, the authors of this article have agreed on a fuller definition of self-published items.
They define them as items for which the creators had control over most or all of the many stages of the publication process. These stages begin with content creation and editing, continue with designing the items’ packages and marketing the items (frequently on the website of the creator or the self-publishing firm), and end with the production and distribution of the items. Not all of the stages have to be completed by the creator in order for the item to be self-published, but a majority of them should be. For instance, the creator can do everything else themselves, but may leave the production up to a print-on-demand firm, the distribution up to a commercial distributor, or both. In this instance, since a majority of the stages of the publication process were completed by the creator, the authors would define the item as self-published.

Self-published items are frequently born digital and may only be available online via print-on-demand through a commercial distributor or on the creator’s website. However, one should keep in mind that not all print-on-demand items are self-published, nor are all self-published items print-on-demand. Print-on-demand as an action itself is certainly a topic related to self-publishing, because print-on-demand is sometimes the last stage of the self-publication process. However, the authors of this article are mainly interested in how all of the stages of the self-publication process are reflected in the library catalog, not just the last stage.

The definition of self-publishing is important for catalogers for several reasons. If a cataloger knows who did what to bring a self-published item into existence, or at least has an idea of how much control the creator had over the self-publication process, one can catalog the item more effectively. Who is responsible for which stages of the self-publication process can vary between different self-publishing firms and sometimes even between different versions or copies of an item. It strongly affects how publication information is presented on the item, which in turn affects how that information (provenance, publication data, anything that improves the item’s visibility) would be recorded in a catalog record. In general, no matter how one defines it, self-publishing is becoming a fact of life for libraries and librarians, as each day more creators want to self-publish and more self-published items are making their way into libraries.

LITERATURE REVIEW

There is a growing body of work in the library community on self-publishing. Most of what has been done so far has been focused on the topic from the collection-management perspective, dealing with questions of how to find, acquire, and preserve such items. The literature about how to catalog self-published items has been sparse. Despite this situation, there are a number of articles and conference presentations that can be used as a foundation to build on this topic.

Michael Saffle wrote an article on self-publishing in musicology that also offered a brief overview of the history of self-publishing in general. He mentioned a number of famous works that were self-published by their creators. The specific works he mentioned include John Milton’s book Areopagitica (1644), Upton Sinclair’s book Money Writes!
(1924), Carl Philipp Emanuel Bach’s Kenner und Liebhaber sonatas (1779-1787), and Georg Philipp Telemann’s Six Trios of 1718.3 His article is a reminder that, as popular as it is today, self-publishing is not a new phenomenon and has existed in various forms for centuries. In recent decades, though, technological advancements have drastically changed self-publishing and libraries need to pay closer attention to it.

Juris Dilevko and Keren Dali took a closer look at self-publishing than Saffle did, examining the manner in which libraries are collecting self-published books. Like Saffle, they gave a list of famous authors who self-published, a list that includes William Blake, Elizabeth Barrett Browning, Willa Cather, W. E. B. DuBois, Benjamin Franklin, Nathaniel Hawthorne, Beatrix Potter, Mark Twain, Walt Whitman, and Virginia Woolf. Their method involved searching OCLC WorldCat for books published by seven of the best-known self-publishing firms. The specific areas of the catalog records they considered included firm names, subject headings, and the number and types of libraries that have holdings on the records.4

Jana Bradley, Bruce Fulton, and Marlene Helm studied the cataloging of self-published items. They searched WorldCat for self-published books, much as Dilevko and Dali did, albeit on a larger scale, as they considered a randomly selected sample of the output of ninety-three self-publishing firms and investigated the results from a number of angles. One of the angles they considered was cataloging; they looked at the presence or absence of ISBNs, encoding levels, and near-matches.5 Their article is significant because it lists self-publishing firms, observes the items they publish, and examines some missing publication information on these items. The authors of this article also analyzed the WorldCat records gathered in their survey to identify the patterns in these records.

Robert P. Holley provided an overview of the state of bibliographic control, including cataloging, of self-published items. His book chapter emphasizes sources of bibliographic control other than individual catalogers creating traditional records, including the Library of Congress Cataloging Distribution Service, distributors’ websites, and catalog records created by distributors. In particular, he describes an announcement by Smashwords and OverDrive, distributors of e-books whose products include self-published e-books, that they will provide MARC records for their e-books as “[t]he most exciting news for improved bibliographic control of self-published works[.]”6 Left unsaid are the quality of these records, as well as how much cleanup they may require from individual catalogers.

Librarians are starting to bring creators of self-published items into the conversation. The Reference and User Services Association (RUSA) created an online Library Publishing Toolkit to support libraries that engage in publishing initiatives.7 Heather Moulaison Sandy discussed a self-publishing initiative at a public library in suburban Kansas City, Missouri.8 Both the RUSA toolkit and Sandy’s article focus on how to support content creators. This is an idea whose time has come, as more and more libraries aim towards educating content creators in copyrighting, publishing, designing, and marketing their items. The involvement of libraries in this process can also lead to better cataloging of these items.

The contemporary self-publishing phenomenon includes formats other than printed books.
One such format is the zine, which is defined by Merriam-Webster as “a noncommercial often homemade or online publication usually devoted to specialized and often unconventional subject matter.”9 Zine creators generally complete every stage of the publication process themselves, making zines self-published. Heidy Berthoud has written about the creation of the zine collection at Vassar College, including its cataloging. Zines present a variety of cataloging challenges, from lacking International Standard Serial Numbers to not having basic bibliographic information to occasionally beginning as monographs before becoming serials. Librarians who catalog zines are working on a metadata standard and union catalog for them in order to improve their cataloging.10 Zines are not the only format of self-published items that would benefit from specialized cataloging rules.

Kent Underwood wrote an article on self-published music scores that deals with them solely from the collection-management perspective. Contemporary composers are increasingly publishing their own scores and making them available through their personal websites, often as PDFs or via print-on-demand. Underwood is particularly concerned with the impermanence of these websites, which can cause libraries to miss opportunities to preserve these scores and to document contemporary musical culture. He is also concerned with figuring out how to find these scores, since these composers do not publish through well-known vendors that libraries use. Therefore, it is a challenge for libraries to obtain these items.11 Underwood’s research shows that libraries need to reconsider and change their collection management strategies in order to include such scores as well.

However, scores are not the only music format that is self-published. Self-published recordings are becoming increasingly popular, particularly in music genres such as jazz, popular music, and folk music. Recordings in these genres tend to be the primary document, since scores are usually absent and recordings are often the only way of manifesting the works. Catalogers working with recordings in these genres must keep in mind that the recording and anything associated with it, such as CD liner notes, may be the sole source of information they can use for transcribing data and creating bibliographic records for these works.

Music catalogers are dealing with self-published items and topics closely related to them as well, as shown in two recent presentations at music library conferences. Charles Peters discussed the workflows used by his library and other libraries for music scores that are available only as PDFs, most of which are self-published. From selection and acquisition to cataloging, preservation, and circulation, these workflows differ drastically from those for scores published in print. He intends to conduct a survey on this topic and create a final report for use by libraries developing their own workflows for such scores.12 Anne Adams and Morris Levy presented on the cataloging of print-on-demand music scores, which, as mentioned above, is related to self-publishing. In their presentation, they explain how they believe such music scores should be cataloged, using records they have created as case studies on this topic and offering them as templates for best practices.13 Their work supports the authors’ assertion that these items are challenging to catalog, particularly when one is recording the publication information.
Peters’s, Adams’s, and Levy’s presentations focus on the last stage of the self-publication process in general and print-on-demand in particular. They feature the very first examples of case studies in music and could be used as a foundation for best practices for self-published items and pave the way to better cataloging practices for all formats of such items.

The articles and presentations described in this literature review show that self-publishing has implications for all types of libraries, not only in terms of cataloging, but also in acquiring and preserving self-published materials and finding ways to support content creators in publishing their own materials. The authors of this article hope to contribute to the evolving library literature on this topic with the analysis of their survey on the cataloging of such materials. This survey shows that catalogers are making a sincere effort to catalog self-published items, and the results represent a collective response and voice of these catalogers. The patterns found in the survey results may lead catalogers to find better ways to catalog such items.

METHODOLOGY

The authors conducted a survey of primarily U.S. librarians to analyze how they are cataloging self-published items. They created the survey with the survey program Qualtrics. The final version of the survey was launched after getting approval and passing the institutional review board (IRB) requirements in order to follow the ethics protocols and protect the confidentiality of individuals who took the survey. It consisted of twelve quantitative and qualitative questions, all of which had text boxes for survey respondents to provide additional comments. The survey was launched on September 14, 2015, closed on December 31, 2015, and completed by 403 people. Once it was open, the survey was distributed over six electronic discussion lists and through four Facebook groups. The lists used were the lists of the Music Library Association, Music OCLC Users Group, and Online Audiovisual Catalogers, the Association for Library Collections and Technical Services’ Central list, and the lists RDA-L and Autocat. The Facebook groups used were Library Think Tank - #ALATT (then known as ALA Think Tank), Music Librarians, RDA Cafe, and Troublesome Catalogers and Magical Metadata Fairies. Once the survey was closed, the results were exported into a spreadsheet and analyzed by the authors. A separate spreadsheet was created to analyze the data from the survey’s eleventh question, in which the authors analyzed WorldCat records that were submitted by survey respondents.
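As an illustration of the tabulation step just described, the short sketch below shows one way exported responses could be tallied into the response/percentage breakdowns that appear in the tables that follow. This is not the authors' actual code; it assumes a hypothetical CSV export named survey.csv with an invented column name (library_type), and uses the Python pandas library.

import pandas as pd

# Hypothetical tabulation of one survey question, in the spirit of tables 1-10.
# "survey.csv" and the column name "library_type" are invented for illustration.
responses = pd.read_csv("survey.csv")
counts = responses["library_type"].value_counts()
percentages = (counts / counts.sum() * 100).round().astype(int)
summary = pd.DataFrame({"Response": counts, "Percentage": percentages})
print(summary)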
SURVEY RESULTS

Introductory Questions

The survey began with three questions about the respondents’ work in general and their work with self-published items in particular. These questions were asked in order to give the authors more ways to analyze the survey results. The first question asked the respondents to identify the type of library where they work. As shown in table 1, “academic” was the top response, although all types of libraries had some responses. This is a reminder that looking for better ways to catalog self-published items is every library’s concern.

Table 1. 1. What type of library do you currently work for?
Answer | Response | Percentage
Academic | 181 | 45%
Public | 138 | 34%
Special | 26 | 6%
Archives | 3 | 1%
School | 5 | 1%
Other (please specify) | 48 | 12%
Total | 401 | 100%

The second question concerned whether respondents catalog self-published items in their local or consortial databases or in a database, such as WorldCat, shared beyond a consortium. It was asked to determine how accessible their records are. More than a quarter of respondents are only cataloging these items locally (table 2).

Table 2. 2. Do you catalog self-published items locally/consortially or in a database (e.g. OCLC) shared beyond a consortium?
Answer | Response | Percentage
Local/consortial | 109 | 27%
Shared beyond a consortium | 129 | 32%
Both | 152 | 38%
Other (please specify) | 11 | 3%
Total | 401 | 100%

The third question was about formats. The authors wanted to know what formats of self-published items are being cataloged in addition to books, so that later in their research they could focus on each format in more detail. In this question, “Books” was the top response (table 3), but other formats had strong showings as well, including music formats. “Sound recordings” was the second-most popular response, with “Musical scores” not far behind. Indeed, every format received its fair share of answers, which underlines the point from the first question, namely that self-published items are every library’s concern.

Table 3. 3. What format(s) of self-published items do you catalog? (check all that apply)
Answer | Response | Percentage
Books | 379 | 94%
Musical scores | 64 | 16%
Sound recordings | 153 | 38%
Video recordings | 146 | 36%
Electronic resources | 79 | 20%
Serials | 61 | 15%
Other (please specify) | 23 | 6%

Publication Information Questions

The next three questions asked respondents how they record publication information for items from self-publishing firms. Each question corresponded to one of the common subfields seen in the MARC 260 and 264 fields. The first two questions included answers that involved entering information in brackets, which is the common thing to do when the information comes from outside of the item being cataloged or when the information is not taken from the preferred source of information.

The first of these questions was about how catalogers record the publisher, information that is generally found in MARC 260/264 subfield b. A cataloger could consider either the firm or the creator to be the publisher of a self-published item based on the information found on the item itself or from outside research. With this in mind, the authors came up with a number of possible scenarios, some with the creator, some with the firm, some with brackets, some without. As shown in table 4, the most popular answer was “Other (please specify),” which shows a lack of clarity in this area. The authors of this article wanted to know how catalogers usually handled the absence of publisher information. Do they prefer to put the creator name or the firm name in the publication area or just put “Publisher not identified” in brackets? The survey results show that very few respondents are routinely using “Publisher not identified” in brackets. A majority of the catalogers were not leaving this area blank; instead they chose between the creator or the publishing firm as the publisher. The authors analyzed the MARC records collected in the survey’s eleventh question and compared them to the responses to this question. They have seen that the best examples of records have both the creator and the firm names transcribed in the record.

Table 4. 4. How do you record the publisher?
Answer | Response | Percentage
Firm name, in brackets | 42 | 11%
Creator name, in brackets | 70 | 18%
[Publisher not identified] | 39 | 10%
Firm name, not in brackets | 113 | 28%
Other (please specify) | 133 | 34%
Total | 397 | 100%

This question had a text box for additional thoughts on this topic, shown below, and those thoughts were similarly varied. The creator name, both in brackets and not, was used in place of the publisher very frequently.

Question 4: Selected text box responses
“If the publisher information isn't there, I use my best judgment AND brackets.”
“Usually creator name, unless creator name is unclear; then Firm name in brackets.”
“Not easy. Usually use the author since they probably paid to have it printed.”
“I consider firms like Createspace to be publishers.”
“If the firm name is prominent, I use that. If not, I use creator name.”

The second of these three questions concerned recording the place of publication (MARC 260/264 subfield a). The goal of asking this question was to see whether the firm location or where the author lives could be substituted for the publisher’s location. Responses to this question paralleled responses to the previous question to a significant extent in terms of having the same problem of lacking publication information on the items. “Firm's location, not in brackets” and “Where the author lives, in brackets” had strong showings, but again, “Other (please specify)” was the top response (table 5).

Table 5. 5. How do you record the place of publication?
Answer | Response | Percentage
Firm's location, in brackets | 64 | 16%
Where the author lives, in brackets | 65 | 16%
[Place of publication not identified] | 57 | 14%
Firm's location, not in brackets | 81 | 21%
Other (please specify) | 127 | 32%
Total | 394 | 100%

As with the previous question, the text box answers to this question, shown below, were highly varied. Catalogers are struggling with the question of whether or not to use the printing location in this field, since the publication location is frequently missing but the printing location may be available.

Question 5: Selected text box responses
“If a place is indicated on the item, I record it; if not, I conjecture at least to the country. I absolutely never use [Place of publication not identified].”
“The place has been confusing. I started using the printing location in brackets. But I have since decided [United States] would be better.”
“Never use [place of publication not identified] if it can be avoided.”

The last of these three questions pertains to recording the date of publication (MARC 260/264 subfield c). The authors asked this question because, in their own work, they observed that the publication date of these items is usually not clear, mainly because an actual publication date is absent most of the time, leaving them to use a copyright or printing date instead. They wanted to know how other librarians handled this situation. The survey responses also show that the publication date is not entirely clear on such items (table 6).

Table 6. 6. When recording the date of publication, is that date usually clear or unclear?
Answer | Response | Percentage
Clear | 212 | 55%
Unclear | 171 | 45%
Total | 383 | 100%

This question’s text box responses, shown below, align with the survey responses. Some catalogers resort to an outside source, such as Amazon, to answer this, but more used the copyright or printing date to infer a publication date.

Question 6: Selected text box responses
“I tend to use outside resources, such as Amazon, to identify a date of publication.”
“There is usually at least a copyright date from which to infer a publication date.”
“Since we are not using RDA, we include only the copyright date (when given), not the date of publication.”
“Sometimes, I infer the date of printing (left corner at the bottom of the last page) as date of publication. If there is a copyright date available, I use that date, instead.”
“I usually need to infer a publication date from the date of manufacture or copyright date--when I do so, I record the date of publication in brackets.”
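Taken together, questions 4 through 6 describe a recurring pattern: a bracketed, cataloger-supplied place, a publisher statement naming the creator or the firm, and a publication date inferred from a copyright date. The sketch below shows how that pattern might look when building a record programmatically; it is not from the survey, the item and its values are invented, and it assumes the Python pymarc library with the 4.x-style subfield lists.

from pymarc import Record, Field

record = Record()
# 264 _1: publication statement for an invented self-published book.
# Place and date are cataloger-supplied, hence the brackets; the creator
# ("J. Doe", a placeholder) is recorded as the publisher.
record.add_field(Field(
    tag="264", indicators=[" ", "1"],
    subfields=["a", "[United States?] :",
               "b", "J. Doe,",
               "c", "[2016]"]))
# 264 _4: the copyright date from which the publication date was inferred.
record.add_field(Field(
    tag="264", indicators=[" ", "4"],
    subfields=["c", "©2016"]))

for field in record.get_fields("264"):
    print(field)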
Questions Related to Cataloger's Judgment

The next four questions concerned cataloger’s judgment, not in the traditional sense of the phrase, but rather in terms of the judgment calls that catalogers have to make when they have no clear guidelines for cataloging self-published items. While the inconsistency of the data on these items makes the use of cataloger’s judgment unavoidable, again, clearer guidelines would help. Such guidelines would probably reduce the need for cataloger’s judgment.

In the authors’ cataloging experience, self-published items generally need original cataloging and require more outside searching in order to be properly cataloged. Therefore, the authors wanted to know if these items require more effort to catalog. They asked a question about this situation to see what other people thought about it, as well as to look for correlations between this question and the next question, which pertains to how often these items need original cataloging. As shown in table 7, although “the same effort” was the most popular answer, it is important to note that one-third of the survey respondents indicated that it takes more effort, which strongly suggests that it is more difficult to catalog these items compared to items from traditional publishers. The text box responses, shown below, also show a variety of required effort levels.

Table 7. 7. How much effort do you put into cataloging self-published items compared to items from traditional publishers?
Answer | Response | Percentage
Less effort | 31 | 8%
The same effort | 236 | 59%
More effort | 134 | 33%
Total | 401 | 100%

Question 7: Selected text box responses
“Getting to be the same amount of effort as I become more firm in my decisions”
“The same effort ends up requiring more time, because the information isn't as available.”
“If the author is from our community, we make more effort.”
“More effort because they almost always need original cataloging.”
“Effort will vary from item to item depending on the nature of the material.”

The authors have observed that self-published items usually need original cataloging. Were other catalogers making this observation as well? They asked a question to find out. As shown in table 8, they were. “Mostly original records” was the top response by a significant margin. The selected text box responses, shown below, backed up the answers to the survey question.

Table 8. 8. How often do you create original records for self-published items?
Answer | Response | Percentage
Mostly original records | 253 | 63%
Some original records | 117 | 29%
Very few original records and mostly copy records | 30 | 7%
No original records | 1 | 0%
Total | 401 | 100%

Question 8: Selected text box responses
“It's very rare for us to find existing records for these items.”
“Most of our self pubs are local, so this involves original cataloging.”
“Almost always original records.”
“Other than original, I tend to update/enhance existing brief vendor records for self-published items (very minimal information - often just ISBN and title etc.”

The authors have noticed that self-published items can be of varying quality. John Luther Adams’s Pulitzer Prize-winning orchestral work Become Ocean was self-published by his Taiga Press.14 At the opposite end of the spectrum, someone could self-publish something as ordinary as, for instance, a compendium of entries in their own blog. Are other catalogers looking at the quality of these items? If so, are they questioning whether or not these items belong in their libraries? The survey’s ninth question was asked to find out. Just under half of the respondents, it turns out, consider whether or not the items should be added to the catalog (table 9), even if their role in the library is limited to cataloging.

Table 9. 9. When cataloging self-published items, do you ever consider whether or not they should be added to the catalog?
Answer | Response | Percentage
No | 203 | 51%
Yes, but I keep my concerns to myself | 76 | 19%
Yes, and I report items that shouldn't be added, but they get added anyway | 17 | 4%
Yes, and I report items that shouldn't be added, and they do not get added at least some of the time | 101 | 25%
Total | 397 | 100%

The text box responses, shown below, offered further insights. While some catalogers specifically do not do collection development, such as the first one quoted below, others have some level of responsibility for making collection-development decisions, such as the next two.

Question 9: Selected text box responses
“I'm just a cataloger. I don't do collection development or management.”
“I am chief policy maker for music items.”
“I also oversee collection development…”
“It is often difficult for our selection staff to know an item is self-published before it arrives. Our book vendors do not always make it clear on their websites.”
“I keep a list of online reprint self-publishers to avoid.”
“Faculty requests always override our quality concerns.”

Adding a note pertaining to an item's self-published status is a way that catalogers could clearly mark a record as being for a self-published item. Sometimes the publication information on the item has a written statement which indicates that it is self-published. The authors wanted to know if catalogers are transcribing that information when present on the item or adding it even if it is not on the item. While a majority of survey respondents do not choose to add additional notes regarding an item's publication status (table 10), about one-third of them always or sometimes add such notes.

Table 10. 10. Do you ever add any notes pertaining to an item's self-published status?
Answer | Response | Percentage
Always | 27 | 7%
Sometimes | 100 | 25%
Never | 273 | 68%
Total | 400 | 100%

Based on the text box responses, shown below, catalogers who do create notes pertaining to an item's self-published status do so in a fairly straightforward manner. They usually put the term “self-published” in a note when they see that exact wording on the item.
Question 10: Selected text box responses
“Generally, if item is explicitly identified as self-published we quote that identification directly.”
“I have a stack of self-published books waiting to be cataloged now. I am considering adding a 500 note saying, ‘This book is self-published’ just so that information is there. I don't see any reason not to add it.”
“Add a note about it being self-published.”
“Self-published”
“I add a "local author" note to the item.”

Submitted MARC Record Analysis & Comments Question

The survey's eleventh question is its most important, as it compares existing catalog records with the survey responses. The question read, “If you catalog self-published items in OCLC, please provide OCLC numbers for representative examples of the records you create.” OCLC numbers for 357 MARC records were submitted and a wide variety of information from the records was entered into a spreadsheet, including format, publication firm when present, and type of library, to name three. The result of this process was a rich data set that can be analyzed in many ways. As an example, almost all of the records examined were created by either academic or public libraries (table 11a), which is consistent with the survey's first question.

Table 11a. Records by the type of library that created them
Type of Libraries | Response
Academic | 203
Public | 100
Special | 54
Total OCLC Records Examined | 357

The authors compared the data that they received from the submitted OCLC records with the data they gathered from two of the other survey questions. From their analysis of the OCLC records that the survey respondents submitted, although firm name appears much more often in the records' publisher area (MARC 260/264 subfield b), one-third of submitted records show that creator name is recorded as the publisher (table 11b), similar to the survey's fourth question. It is important to see that the creator of the works is being recorded in the publication area, perhaps because catalogers believe the creator has more control over the self-publication process, as explained earlier in the authors' version of defining self-publishing.

Table 11b. Records by how the publisher was recorded
Publication Area | Response
Firm Name (with brackets or without) | 221
Creator's Name (with brackets or without) | 111
Both | 20
Blank / [Publisher not identified] | 25
Total OCLC Records Examined | 357

The authors also compared the dates of the records that they analyzed with the survey's sixth question. Like the sixth question, their analysis of the records showed that the dates on these items tend to be unclear. Out of 357 records, 272 of the records had a publication date (table 11c). 187 of the 272 records had a date in brackets, meaning that they were either inferred from another sort of date, such as a copyright date, or established by a cataloger's outside research. For that matter, at least some of the 85 publication dates that are not in brackets may be duplicated copyright dates as well. The 85 records that had no publication date almost always had copyright, distribution, production, or manufacture dates, but no publication date.

Table 11c. Records by how the publication date was recorded
Publication Date | Response
Publication Date With Brackets | 187
Publication Date Without Brackets | 85
No Publication Date | 85
Total OCLC Records Examined | 357

The authors searched two major name authority databases, the Virtual International Authority File (VIAF) and the Library of Congress Name Authority File (LCNAF), for the creators of the self-published items (table 11d). Although the majority of the survey respondents were from the United States, some of the international self-publishing authors that they cataloged might exist only in VIAF and not in LCNAF. Therefore, both databases were searched in order to be safe. A significant majority of the names were in neither of the databases. This is a problem, since it makes these creators less visible and could lead to a single creator's name being recorded in different ways by different libraries. The absence of creator names in these two major authority databases could also mean that authority records for these names have only been created in local library databases and not uploaded to these major databases. Whether these names are only important to local communities or to the entire world, the authors believe that they should be made more visible to everyone. This is one of many aspects of this topic that need to be considered in more depth by better cataloging guidelines.

Table 11d. Records by author presence in authority files
Name Analysis | Response
Neither VIAF nor LCNAF | 200
VIAF & LCNAF | 147
LCNAF | 2
VIAF | 3
No Name | 5
Total OCLC Records Examined | 357
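A check like the one behind table 11d can be approximated against the public suggestion endpoints of both files. The sketch below is not the authors' procedure; it assumes the VIAF AutoSuggest and id.loc.gov OpenSearch suggest services as publicly documented, simplifies their response shapes, and uses an invented placeholder name.

import requests

def in_viaf(name):
    # VIAF AutoSuggest returns JSON with a "result" list, or null when no match.
    resp = requests.get("https://viaf.org/viaf/AutoSuggest",
                        params={"query": name}, timeout=10)
    return bool(resp.json().get("result"))

def in_lcnaf(name):
    # id.loc.gov suggest returns an OpenSearch array:
    # [query, [labels], [descriptions], [uris]]
    resp = requests.get("https://id.loc.gov/authorities/names/suggest/",
                        params={"q": name}, timeout=10)
    return bool(resp.json()[1])

name = "Doe, Jane"  # placeholder name, not from the study
print(name, "- VIAF:", in_viaf(name), "- LCNAF:", in_lcnaf(name))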
Two of the MARC records submitted by the survey respondents illustrate the variety of ways in which people are cataloging self-published items, using their best judgments to make these items as visible as they can. (The creators of these records gave permission to use their records in this article.) The first record is for a CD by the Portland, Oregon-based rock musician Ryan VanDordrecht (table 11e).15 It is important to note that VanDordrecht is transcribed as the publisher in the first MARC 264 field, which clearly indicates that he is a self-publisher. Also, the publication location is a guess by the cataloger based on where VanDordrecht lives and is put in brackets, which is a much better practice than recording “Place of publication not identified” in brackets. VanDordrecht has relationship designators as the composer and performer in his 100 field, which opens up the possibility that he could have a third one as the “self-publisher” since he also has that role. Creating “self-publisher” as a Resource Description and Access (RDA) relationship designator opens up a debate as to how useful it would be for the end user or for anyone keeping track of self-published materials in their catalog. There is also a streaming version of the album, which is available at VanDordrecht's website.16 It is possible to have a note about the availability of the streaming version in the catalog record as well, but this is another debate. Overall, this record shows good practices to make the item more visible.

Table 11e. Excerpt from a record for a self-published CD
100 1_ |a VanDordrecht, Ryan, |e composer, |e performer.
245 10 |a Beast of love / |c Ryan VanDordrecht.
264 _1 |a [Portland, Ore.] : |b Ryan VanDordrecht, |c [2014]
264 _4 |c ℗2014
300 __ |a 1 audio disc : |b CD audio, digital ; |c 4 3/4 in.
500 __ |a Compact disc.
500 __ |a Title from container.
511 0_ |a Performed by Ryan VanDordrecht, Rian Lewis, Brian Koch, Troy Walstead, and Jen Dashney.
505 00 |t Hard lover -- |t Great American life -- |t I ain’t coming home tonight -- |t Wild ones -- |t You got Travelin’ on -- |t Hard lover -- |t One more cigarette -- |t Last one to know.
650 _0 |a Rock music |y 2011-2020.
650 _0 |a Rock music |z Oregon |z Portland.

The second record that is shown as an example is for a book by best-selling authors Sarina Bowen17 and Elle Kennedy (table 11f).18 There are three 264 fields in the record. As in the previous record example, this record also makes it clear that Bowen and Kennedy are the self-publishers, separately transcribed in two |b subfields in the first 264. The second 264 credits the firm (in this case, CreateSpace) as the manufacturer, with a third 264 for the copyright date. The dates are all the same in the three 264s, and the publication date is inferred from the copyright date. In addition, the first 264 field gives the publication location as where the authors live, in brackets with a question mark, which shows the cataloger's effort to record something meaningful in this space rather than leaving it blank. All of these are good practices which make the item more visible and accessible. The roles of the authors and firm are further explained in the first 500 field, with an additional 500 field that explains the second 264 field. It is important to note that the first 500 field also indicates that the book is self-published. The second 500 indicates that the manufacture date and place are taken from the end of the book, which is an unusual place that is not generally considered as a source of information. Overall, both records show catalogers making their best efforts to transcribe the information available in unusual places for publication information.

Table 11f. Excerpt from a record for a self-published book
100 _1 |a Bowen, Sarina, |e author.
245 10 |a Him / |c Sarina Bowen, Elle Kennedy.
264 _1 |a [United States?] : |b S. Bowen, |b E. Kennedy, |c [2015]
264 _2 |a Lexington, KY : |b [Manufactured by CreateSpace], |c 2015.
264 _4 |c ©2015
300 __ |a 352 pages ; |c 21 cm
336 __ |a text |b txt |2 rdacontent
337 __ |a unmediated |b n |2 rdamedia
338 __ |a volume |b nc |2 rdacarrier
500 __ |a Work self-published by authors using CreateSpace.
500 __ |a Place and date of manufacture taken from end of work and may differ between printings.

The survey's twelfth, and final, question read, “Please provide any further comments or concerns that you have about cataloging self-published items here.” A small sampling of what the respondents had to say is shown below. The authors of this article were particularly pleased to see opinions such as the one from the third person quoted here.

Question 12: Selected responses
“I should add often we catalog self-published local authors (books) or music groups (cds) to support local authors/musicians in our community. We often add copies of these local items to our special collections room as well as circulating copies.”
“We have some very rare and unusual recordings done by people who went on to be famous as musicians or in other fields.”
“I am glad to see that you are researching this topic, which I think needs to be tackled on a much wider scale throughout the cataloguing community.
Together, we should make up suggested guidelines for templates or more efficient ways of cataloging these items.”
“We urge the “publishers” to provide better information and create a better product.”
“Too many self-published works are lacking the information necessary to properly catalog them…”

DISCUSSION & SUGGESTIONS FOR BEST PRACTICES

To quote the third person quoted in the preceding question, the authors' work as a whole shows that “more efficient ways of cataloging [self-published] items” are urgently needed. The WorldCat records and the survey results show that catalogers have been making a sincere effort to catalog these items, although the lack of information on these items makes these efforts challenging. The records and results also show that the cataloging of these items has been inconsistent. As these items make their way into libraries in increasing numbers, the need for best practices to catalog them increases as well. The records and survey results, particularly the records excerpted in tables 11e and 11f, show patterns that the authors believe can be the foundation of better ways of cataloging these items. With that in mind, they have a number of suggestions for what best practices for cataloging self-published items should address.

Catalogers and other librarians are not the only people who should help to shape those best practices. Creators are involved in their items' publication process and could create better bibliographic information that catalogers need. Public and academic libraries frequently host self-publishing initiatives and reach out to creators of self-published items in their communities. Public-services librarians could work with catalogers in order to inform and educate these creators about how to create more useful data for cataloging. This would make these items much easier to catalog and more visible and accessible to everyone, including potential buyers. Therefore, when constructing best practices for the cataloging of self-published items, the ways in which creators are publishing and distributing them and the ways in which library users would like to access them should be considered as well.

The survey's first three questions bring up a number of related issues. While the first question makes it clear that self-published items are every type of library's concern, best practices should consider that the cataloging of self-published items could differ in various types of libraries, as, for instance, public libraries may need to handle them differently from academic libraries. Similarly, the third question shows that libraries are collecting self-published items in a variety of formats, making it necessary for best practices to address the differences between formats. The second question suggests that some libraries are only cataloging self-published items locally and not sharing them in databases such as WorldCat. This is concerning, as it limits these items' accessibility, not only for library users, but for other catalogers who may be seeking common ground for creating guidelines for cataloging self-published items.

As shown in the survey's fourth question, the question of what to record in the publisher area in records for self-published items is a complicated one. How much publishing responsibility does the creator take for a self-published item, and how much does the firm take? Should the creator's name be given preference, or the firm name, or should both names be routinely recorded in the publication area?
For the most part, catalogers are making an effort to put something meaningful in that space, an effort that, like cataloging these items in shared databases, helps make the items visible. This takes more time than putting “Publisher not identified” in brackets would, as it requires doing at least a little outside research. However, catalogers are still struggling to decide between recording the creator or the firm as the publisher, which shows that there is a need for clearer guidelines. Recording both the creator name and firm name is possible, particularly for catalogers using RDA. The authors have seen that the best records tend to have both names. The record shown in table 11f is a good example of this practice. Whether they follow this practice or not, catalogers should consider what would best help library users find these items.

Recording the place of publication for self-published items is similarly complicated. To a significant extent, this is a matter of cataloger's judgment due to the inconsistency or absence of information on such items. As with the publisher, though, catalogers are making an effort to put something meaningful in this space by guessing and using brackets, which improves the items' visibility. Putting “Place of publication not identified” in brackets is clearly the last resort. Catalogers should remember that library users will most likely benefit if there is at least the country name (based on the creator's nationality or residence) or printing location in brackets. The record shown in table 11f illustrates this.

The survey results indicate that publication dates for self-published items need to be addressed as well, as they are usually absent from the item. In their analysis of the submitted records for items in all formats, the authors have seen that there is not always a publication date on an item, but there is almost always a production, manufacture, distribution, or copyright date. The text box responses to the sixth question show that catalogers are frequently inferring a publication date from the copyright date when the latter is present. In fact, doing this is clearly suggested and underlined in RDA, according to an LC-PCC Policy Statement on the topic.19 Both the text box responses and the submitted records show that, again, there is a high effort from catalogers to put some sort of date in the publication area instead of putting “Date of publication not identified” in brackets. This effort can involve using printing dates from the bottom of the last page of a self-published book, which is an unusual place to get such information. The responses to the sixth question show that clearer guidelines would help here as well, especially when an item is print-on-demand and only has a printing date. The record shown in table 11f features a publication date inferred from a copyright date, as well as a manufacture date.

The survey's seventh question asks whether catalogers put more or less effort into cataloging self-published items than they do into items from traditional publishers. It is important to note that less effort should not be perceived as the item in question having less value, although there is a correlation between them. Some of the survey respondents indicated in the seventh question's text box responses that they spent more time on items that are more beneficial to their community.
Based on these responses, as well as the answers to the question itself, these items may receive more effort and can take more time since they lack complete publication information and usually need original cataloging. Whether they question the value of these items or not, catalogers are putting in the time and effort to make these items visible and accessible. On the other hand, the lack of guidelines for how to catalog such items also slows down catalogers, resulting in these items needing quite a bit of time and effort. As the first quote shown above mentioned, though, it gets easier as one gains more experience, and having clearer guidelines would make it easier as well.

The survey's eighth question dealt with how often catalogers have to create original records for self-published items. Between this question and the previous one, it is clear that self-published items require more work compared to items from traditional publishers, work that catalogers are putting in to make these items as visible as possible. Sharing of all records created for self-published items, as discussed in the survey's second question, would be particularly helpful, as any sort of original cataloging can be highly time-consuming. Better guidelines for cataloging these items would help as well, as they would with all aspects of this topic.

While the research behind this article is primarily focused on cataloging, the authors are also interested in the role that catalogers play, or could play, in collection development. The survey's ninth question shows that, due to their varying quality, self-published items may blur the line between cataloging and collection development. As one of the text box responses to this question points out, collection-management personnel might not always catch that an item is self-published, an area in which catalogers can help. The provenance or purpose of these self-published items sometimes plays a role as these items take their places in libraries. Sometimes, for local authors such as students, professors, or members of the community, there is a policy to catalog them regardless of the quality of the item. In this case, catalogers can indicate the item's provenance in the record or add note fields indicating that there is no bibliography or index to give a clue to library users about how scholarly these items are.

As described in the survey's tenth question, clearly marking a record as being for a self-published item is a useful practice that could benefit library users. For instance, it is very important for musicians to know who published the score to a work they are studying or performing, as well as which edition of the score they have. Scholars in other fields have similar concerns and would find it useful to know if the item was published as a print-on-demand item and which printing company printed it. If catalogers mark records as being for self-published items, when and how should they do so? According to the text box responses for the tenth question, catalogers sometimes put the term “self-published” in a note when they see it on the item. It is possible to add terms related to self-publishing (such as “Print-on-demand items” or “Self-published items”) to a controlled vocabulary, such as the Library of Congress Genre/Form Terms. Whether an additional note is created or a controlled-vocabulary term is used, the way in which a record is marked as being for a self-published item could be standardized, which would be useful to scholars.
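As a sketch of the marking practices just described, the fragment below adds a status note and a genre/form-style term with pymarc (again assuming the 4.x-style subfield lists). “Self-published items” is not an established Library of Congress Genre/Form term; it is an invented example of the kind of controlled term the authors contemplate, so the source code is given as "local".

from pymarc import Record, Field

record = Record()
# 500: a free-text note flagging the item's self-published status,
# echoing the wording catalogers reported transcribing from the item.
record.add_field(Field(
    tag="500", indicators=[" ", " "],
    subfields=["a", "Work self-published by the author."]))
# 655 _7: a hypothetical genre/form-style term; "local" marks it as a
# locally defined vocabulary rather than an established LCGFT term.
record.add_field(Field(
    tag="655", indicators=[" ", "7"],
    subfields=["a", "Self-published items.", "2", "local"]))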
These items tend to exist in a variety of formats, editions, and printings. Books and scores can have both print and electronic versions, albums can be released both on CD and as digital downloads, and anything that is print-on-demand can have several different printing dates. Should catalogers create separate records for each of these, or should they create one record for a given item? If they create separate records, Linked Data might be the way to connect all of these versions to each other. If they create one record, the record shown in table 11f might be a model for how to do so.

In addition to the matters described earlier in this section, best practices for cataloging self-published items need to consider other matters that were not discussed in this article. Within the records themselves, these include, but are not limited to, the presence or absence of ISBNs, MARC encoding levels, and the absence of summary notes (MARC field 520), which impedes classification and the assignment of subject headings. More broadly, one could also consider why self-published authors, as shown in table 11d, frequently do not have national authority records for their names; it could be due to a lack of training, or it could be due to local policies to not create such records. One can also gain further insight into self-published items by taking a closer look at the actions of self-publishing creators. Each of these matters could be the topic of an additional research project as well. The authors strongly encourage further research in these areas.

Today, libraries are becoming part of the self-publishing scene, not only by acquiring self-published items, but by supporting content creators with self-publishing initiatives. Catalogers are part of this scene as well, since they are responsible for making these items visible to library users. The goal of the authors of this article is to point out that the self-publishing phenomenon needs further research, especially from the cataloger's perspective. They hope that this article was able to serve as a voice for the catalogers who took this survey and submitted their records for analysis. The authors' survey results, record analysis, and literature review all show that it is time for catalogers to approach this issue more closely and look at the specific ways in which people are cataloging self-published items. Perhaps more conversation on this topic can lead to the creation of best practices for cataloging these items, so that they may be as visible and accessible as possible for library users.

NOTES

1. The conferences where this project was presented were the Music Library Association's Midwest Chapter's annual meeting in October 2015, the American Library Association's Midwinter Meeting in January 2016, the Music OCLC Users Group's annual meeting in March 2016, the Music Library Association's annual meeting in March 2016, the International Association of Music Libraries' annual congress in July 2016, and the International Federation of Library Associations' World Library and Information Congress in August 2016.
2. “Self-publish | Definition of Self-publish by Merriam-Webster,” Merriam-Webster Incorporated, accessed April 10, 2017, https://www.merriam-webster.com/dictionary/self-publish.
3. Michael Saffle, “Self-publishing and Musicology: Historical Perspectives, Problems, and Possibilities.” Notes 66, no. 4 (June 2010): 726-738, http://dx.doi.org/10.1353/not.0.0376.
4. Juris Dilevko and Keren Dali, “The Self-Publishing Phenomenon and Libraries.” Library & Information Science Research 28, no. 2 (2006): 208-234, http://dx.doi.org/10.1016/j.lisr.2006.03.003.
5. Jana Bradley, Bruce Fulton, and Marlene Helm, “Self-Published Books: An Empirical 'Snapshot.'” Library Quarterly 82, no. 2 (2012): 107-140, Library, Information Science & Technology Abstracts, EBSCOhost.
6. Robert P. Holley, “Self-Publishing and Bibliographic Control,” in Self-Publishing and Collection Development: Opportunities and Challenges for Libraries, ed. Robert P. Holley (West Lafayette, Indiana: Purdue University Press, 2015), 113-123.
7. “RUSA Library Publishing Toolkit - A Service of the Reference and User Services Association (RUSA),” Reference and User Services Association, accessed March 24, 2018, https://rusapubtools.wordpress.com/.
8. Heather Moulaison Sandy, “The Role of Public Libraries in Self-Publishing: Investigating Author and Librarian Perspectives,” Journal of Library Administration 56, no. 8 (2016): 893-912, http://dx.doi.org/10.1080/01930826.2015.1130541.
9. “Zine | Definition of Zine by Merriam-Webster,” Merriam-Webster Incorporated, accessed February 14, 2019, https://www.merriam-webster.com/dictionary/zine.
10. Heidy Berthoud, “Going to New Sources: Zines at the Vassar College Library,” Serials Librarian 72, nos. 1-4 (2017): 49–56, http://dx.doi.org/10.1080/0361526X.2017.1320867.
11. Kent Underwood, “Scores, Libraries, and Web-Based, Self-Publishing Composers.” Notes 73, no. 2 (December 2016): 205-240, General OneFile, Gale.
12. Charles Peters, “Acquiring New Music from Unconventional Sources: PDF Copies in the Library,” paper presented at the International Association of Music Libraries Annual Congress, Riga, Latvia, June 2017.
13. Anne Adams and Morris Levy, “Cataloging Scores in an Age of Print on Demand,” paper presented at the Music OCLC Users Group Annual Meeting, Orlando, FL, February 2017.
14. “2014 Pulitzer Prize Winners & Finalists - The Pulitzer Prizes,” Columbia University, accessed June 18, 2018, http://www.pulitzer.org/prize-winners-by-year/2014.
15. “Bio -- Ryan VanDordrecht,” Ryan VanDordrecht, accessed June 18, 2018, http://www.ryanvandordrecht.com/bio/.
16. “Beast of Love -- Ryan VanDordrecht,” Ryan VanDordrecht, accessed June 18, 2018, http://www.ryanvandordrecht.com/music/.
17. “Bio -- Sarina Bowen,” Sarina Bowen, accessed June 18, 2018, https://www.sarinabowen.com/bio/.
18. “About Elle | Elle Kennedy,” Elle Kennedy, accessed June 18, 2018, https://www.ellekennedy.com/about-elle/.
19. “LC-PCC PS for 2.8.6.6,” RDA Toolkit, accessed August 29, 2017, http://access.rdatoolkit.org/document.php?id=lcpschp2&target=lcps2-1467#lcps2-1467
work_ciahwgaclffrdficfkdf5ik2bi ---- Contemporary Journal of African Studies 2019; 6 (1): 114-137
https://dx.doi.org/10.4314/contjas.v6i1.7 ISSN 2343-6530 © 2019 The Author(s)
Open Access article distributed under the terms of the Creative Commons License [CC BY-NC-ND 4.0] http://creativecommons.org/licenses/by-nc-nd/4.0

Re-fashioning African Studies in an Information Technology Driven World for Africa's Transformation

Joseph Octavius Akolgo
PhD Candidate, Institute of African Studies, University of Ghana
Author's email: octaviusakolgo@gmail.com

Abstract

The African Studies programme, launched in the University of Ghana by Ghana's first president, was for "students to know and understand their roots, inherited past traditions, norms and lore (and to) re-define the African personality" and for the "inculcation of time honoured African values of truthfulness, humanness, rectitude and honour …and ultimately ensure a more just and orderly African society" (Sackey 2014: 225). These and other principles constitute some of the cardinal goals of the programme in both public and private universities in Ghana. Considering tertiary education as both a public and a private enterprise, this paper seeks to enrich the discourse on African Studies by taking a retrospective look at the subject and investigating university students' perceptions of the discipline across publicly and privately funded spheres. Adopting a qualitative approach, the paper interviewed students on the relevance of the discipline in a contemporary information technology driven world. The outcome of such interrogation was that African Studies is even more relevant in the era of globalization than it might have been in immediate post-independence Africa. It concludes by unraveling how the discipline can be re-fashioned for Africa's transformation.

Keywords: African Studies, Africa's transformation, private universities, public universities, information technology driven world
Joseph Octavius Akolgo (octaviusakolgo@gmail.com) is a PhD candidate at the Institute of African Studies, University of Ghana. He holds an MPhil in African Studies and a BA (Hons.) in Political Science and Theatre Arts from UG, Legon. He teaches African Studies at Valley View University, Techiman Campus, and tutors at Alliance Francaise Accra, both on a part-time basis. His research interests are in youth, rural livelihoods, land and issues of development. His goal is to develop a professional career in academia, using research, teaching and public service to effect change in the lives of individuals and communities.

The function of education is to teach one to think intensively and to think critically. Intelligence plus character - that is the goal of true education. – Martin Luther King, Jr.

Introduction

"Education is the most powerful weapon which can be used to change the world," said Nelson Mandela inspirationally. This powerful weapon is most often directed towards the young, who are often regarded as 'tomorrow's leaders'; logically so, because the future and continuity of any animal group depend on its young. The decisive role of education in preserving the lives of a society's members and maintaining its social structure has similarly been emphasized by Rodney (1972).
Hence, the kind of knowledge education offers African youth is critical, as it is important to the society in general. One wonders whether the education given to the youth is inimical to the advancement of African ideals and values, on the one hand, or whether it supports, promotes and consolidates non-African ideals and values, on the other. How education reflects the values embraced by a group of people is also crucial to the development of communities of all kinds. It is because of the importance of values to a given society that it has been argued that "education is not just the inculcation of facts as knowledge but a set of values that in turn appraise the knowledge being acquired" (Nyamnjoh 2004: 162). Thus, when the values are unsuitable for progress, "the knowledge acquired is rendered irrelevant and becomes merely cosmetic" (ibid).

Education as a value-enriching mechanism is critical to the kinds of knowledge needed to socialize the young. In the context of Africa, African-centred knowledge would aptly meet this requirement. Unquestionably, one of the key areas of study that serves as a fulcrum for African-centred knowledge production is African Studies. Naturally, therefore, the 2nd Kwame Nkrumah Pan-African Intellectual & Cultural Festival conference sought "to examine and critically investigate the role of African centred education and knowledge production for shaping the development agenda." African Studies would help students appreciate the values Nyamnjoh (2004) advocates, the same values projected by post-independence African universities.

Ghana's first president, Kwame Nkrumah, had conceived of the value of African-centred education many decades ago, having foreseen the relevance of knowledge production to Africa's transformation, achievable through African Studies specifically and education generally. Therefore, in launching the African Studies programme in the University of Ghana in 1963, he advocated unambiguously for "students to know and understand their roots, inherited past traditions, norms and lore (and to) re-define the African personality" (Sackey 2014: 225). Students were also admonished to embrace the "inculcation of time honoured African values of truthfulness, humanness, rectitude and honour (as well as) redefine youth immorality and indiscipline and ultimately ensure a more just and orderly African society" (ibid). These and other tenets constitute some of the purposes of the African Studies programme in universities in Ghana.

This paper is, therefore, about re-fashioning African Studies in an information technology driven world for Africa's transformation. It seeks to answer the following questions: What has been the aim of African Studies from its inception after the Second World War and, also, as envisioned by Kwame Nkrumah? In the midst of a highly privatized tertiary educational system and an IT driven world, what are students' perceptions of African Studies? How should African Studies be refashioned in an information technology driven world for Africa's transformation? It is the intention of the paper to enrich the discourse on how African Studies should be refashioned to counteract the forces of "domination" and "marginalization" in a highly globalized and information technology driven world.

Methodology

The study population was drawn from one public and three private universities, two of which were faith-based and the other secular.
The main reason was to find out how the religious or foundational philosophy of the respective institutions could affect students' perceptions of the discipline. The sample was taken from the University of Ghana; Valley View University; Islamic University College, Ajinringano, East Legon; and Ashesi University, Brekusu near Aburi in the Eastern Region. The rationale for picking these institutions was manifold. Firstly, it was for convenience and cost-effectiveness, as they were within the researcher's reach. Secondly, in Ghana, the private sector has relatively recently become a funder of tertiary education, which is essentially considered "as a private good, a commodity to be bought and sold in an artificially constructed education market driven by the forces of supply and demand" (Kelsey 1998: 52). Although operating under strictly market principles, private universities are enjoined by the National Accreditation Board to have African Studies as a compulsory course on their curriculum. The addition of private institutions was to satisfy the public-private dichotomy. Another consideration was to achieve a good mix of religious and secular institutions, since the tertiary educational scene has become a highly competitive zone offering a variety of products. The field is a panorama of all kinds of institutions "that are public and private; secular and religious; comprehensive and specialized; large and small; transnational and parochial; research intensive and vocational... and other defining social markers..." (Zeleza 2009: 111).

Hence, in selecting the sample, this variegated and multi-dimensional educational landscape needed to be seriously considered so as to ensure that all 'blocs' were catered for. This position rests on a perception that African Studies is a non-employable area. Sackey (2014) reveals that the "public has ridiculed African Studies" and argues that students are expected to learn "proper" disciplines that would fetch them employment after their studies, and not remain liabilities on their parents and the society at large (Sackey 2014: 253). Students therefore pursue business-related and other courses, as these are perceived to be employable programmes; hence the need to find out how all students see African Studies, irrespective of their status as public or private students or their course affiliation. The main reason for stating the distinction between fee-paying and non-fee-paying students is to find out whether one's status as fee-paying or non-fee-paying could have any effect on how one perceives African Studies. This is very necessary, as there are preferences for courses that are classified as "proper", with African Studies not favourably placed on that list. It was also to assess how the institutions organize the African Studies programme, as that organization could play a role in determining students' perceptions.

Participants were purposively sampled, not necessarily by reason of 'specialist or in-depth knowledge' of the subject but on the basis of fulfilling the various categories of students described above. A qualitative approach was used to allow students to respond freely to the research questions and share their personal perspectives without restrictions. A semi-structured interview guide was designed around a 'teaser' into African Studies: what students' response was when they heard of "Africa or Africans".
Interviews were conducted with four students from Ashesi University, six from Valley View University (three from Oyibi and three from the Techiman Campus) and four from Islamic University College. Informal discussions on the subject were also held with two UGRC 220 tutorial groups, made up of thirteen and eleven members respectively, at the University of Ghana.1 Data was collected between November 2016 and March 2017. The number of respondents from the various universities was chosen for two reasons. Firstly, it was largely determined by financial resource limitations, which constrained travelling to the institutions for many days,2 and by time constraints, largely on the part of the respondents. Appointments were made with many students from each university; however, only the respective numbers listed faithfully honoured the appointments. Secondly, the numbers provided per institution adequately supplied responses that met the expectations of the researcher on the issues being researched. Even though more students were interviewed from each of the universities, they added no new information to the themes of the research instruments beyond what the listed respondents had already given. Thus a point of saturation was reached for the research (Creswell, 2012).

1 This was when the researcher began the experiential learning project, a requirement of the PhD programme, which ran from November 2016 until the experiential learning report was presented in April 2017. The students were sometimes in groups as they waited for their Teaching Assistant.
2 I traveled to Ashesi on three different occasions, the first to make enquiries and appointments, and visited the campus on two more occasions. The road was not motorable, and this increased the cost of travelling to campus. It was similar for the other institutions. The students also did not have enough time to interact fully with the researcher, which limited the numbers. Later there was the need to visit the Techiman Campus, because it has large numbers of sandwich students, who are mostly teachers. Cost therefore restricted the choice of numbers.

The paper has three sections. The first is devoted to a review of literature, the aim of which is to address the goals of African Studies as an academic discipline. The second discusses the perceptions of African Studies held by private and public tertiary students of the four universities. The final section attempts to suggest how African Studies may, or should, be refashioned in an information technology driven world for Africa's transformation.

Review of Literature

Historically, African Studies developed outside Africa, not within it, and as a study of Africa, but not by Africans (Mamdani, 1998). The contact with and study of Africa has its genesis in the fifteenth century, when Europeans began to explore the world and engage in studies of "primitive" societies out of curiosity, hardly related to rigorous scientific research.3 African Studies in its more contemporary origins is tied to the period after 1945, in the overall context of the onset of the process of decolonization and the rise of the East-West Cold War (Olukoshi, 2006; Melber, 2009; UNESCO Courier, 1967). The aims of embarking on African Studies from the precolonial to the post-colonial epochs are not obvious, but knowledge production and the diversity of its uses are reasonable and noble goals for involvement in the venture.

3 See content.inflibnet.ac.in/data-server/eacharya...304/2/.../304-2-ET-V1-S
Colonial Era

Prior to colonialism, Europe had trade relations with Africa, including the obnoxious Trans-Atlantic slave trade. Shortly before Africa was formally partitioned, "a new interest" emerged which spurred Europeans to exceed the trade courtesies they enjoyed by directly mingling in the socio-economic life of Africans. This strategy was used to learn more about the peoples and resources available in the interior of the continent. They then determined which crops were to be given production priority. Missionaries also proselytized Africans to Christianity (Ajaye, 1989). At this juncture, there was no explicitly stated goal for Europeans learning about Africa, nor was the study classically situated in the academic realm.

The aim of studying Africa at the dawn of, and during, colonialism was, logically, to systematically replace African attitudes, tastes, education systems, institutions and belief systems, among others, with European systems and institutions. A speech by Thomas Babington Macaulay to the British parliament in 1835, after touring and learning about India, provides a lead. The imperial politician argued that to be able to break a nation with a strong cultural and spiritual heritage, strategically, one needed to replace her educational system. He contended that if the colonized "think that all that is foreign and English is good and greater than their own they will lose their self-esteem, their national culture and they will become what we want them, a truly dominated nation" (https://robertlindsay.wordpress.com/2012/12/01/lord-macaulays-speech-to-british-parliament-1835/). This scheme apparently worked effectively in many parts of Africa, as even today many Africans, including some Ghanaians, have a strong taste for all things Western (Falola 2003: 91), and adherents of African traditional religions are outnumbered by those of non-African beliefs (GSS, 2013: 62). Although this writer has no knowledge of an existing colonial official policy document backing this statement, it makes sense to believe that colonial acculturation policy came to be anchored in the thinking of Macaulay.

Objectives of African Studies in the Post-colonial Era

After World War II, African Studies was employed chiefly as a medium for understanding Africa and its people, not for epistemological purposes from the Western perspective. African Studies, as de Haan (2010: 99) has described, "colluded in an all-encompassing civilization mission where it was necessary to know and understand Africa better." This stance is ideologically and racially motivated, considered "a cultural product of the West ...perceived as politically passive", and therefore does not offer a significantly meaningful contribution to the study of the continent (Abrahamsen 2003: 190).

The ideological partiality of postcolonial theory is heavily criticized by Afrocentric writers. Mlambo (2006) argues that "western science, capitalism and social science and other (forms of) knowledge and practices not only lead to the domination of the African continent by the West" but have also led to "Africa's marginalization in the world in terms of economic development including the continent's capacity to participate fully in the global knowledge community" (p. 161).
Contrary views maintain that African Studies is meant to produce knowledge about Africa, since it is the least studied, and it is similarly associated with undergraduate education generally (Hodder-Williams, 1986). The postcolonial goal was primarily "to decolonize" the discipline. African intelligentsia rediscovered and rewrote their peoples' histories and humanity, previously seized and denied by Europe (Zeleza, 2009: 116). An audacious effort was made to elevate discourses in African Studies by producing "alternative, sometimes radical, narratives of African history and development" meant "to challenge received wisdom about the continent's past" (Olukoshi, 2009: 537). African Studies was to counteract the misrepresentation of western scholarship that portrays Africa and its people as an "other", a "unique phenomenon", "a dark continent", a "continent whose history only began with the arrival of the European" (ibid). The goal was to counteract the denigration and to unveil "the African personality" as an equal global citizen.

Nkrumah's Vision of African Studies4

4 These quotes are taken from Nkrumah's inaugural speech, titled "The African Genius," at the opening of the Institute of African Studies.

Nkrumah's vision and expectation of African Studies was embedded in his inaugural speech at the Institute of African Studies of the University of Ghana on 25th October 1963, which comprehensively lays out the objectives and scope of African Studies. The IAS and academics were admonished "to study Africa, in the widest possible sense … in all its complexity and diversity, and its underlying unity", as well as "…the history, culture and institutions, languages and arts … in new African-centred ways". The discipline is to "re-assess and assert the glories and achievements of our African past and inspire our generation, and succeeding generations, with a vision of a better future". The youth can only be inspired by what has been achieved; the mistakes made by earlier generations must serve as a guiding principle for shaping a progressive future.

African Studies should inculcate a sense of certain core values in students, with morality being key. Morality may embrace those values that are acceptable in our communities, for instance being each other's keeper, respect for the elderly, hard work, trustworthiness, etc. Gyekye (2013) explains that "the morality established and maintained by our culture was a social morality, the kind of ethic that emphasizes concern for the wellbeing of every member of the society" (p. 173). This morality is embedded in the social ethic of the society and requires that "each member of the community acknowledges common values, obligations and understandings and each person feels commitments to the community expressed through desire and willingness to advance its interests" (ibid).

Nkrumah saw tertiary education as the "kind …which will produce devoted men and women with imagination and ideas, who, by their life and actions, can inspire people to look forward to a great future". Nkrumah probably was referring to the Ghanaian and, in general, the African situation.
He urged that "our aim must be to create a society that is not static but dynamic, a society in which equal opportunities are assured for all", and that educational administrators must "remember that as the aims and needs of our society change … our educational institutions must be adjusted and adapted to reflect this change." University curricula must adjust to changing societal needs. To Nkrumah, education was meant to make a person more open-minded, with critical judgment skills, as it "consists not only in the sum of what a man knows, or the skill with which he can put this to his own advantage" but "must also be measured in terms of the soundness of his judgment of people and things, and in his power to understand and appreciate the needs of his fellow men, and to be of service to them". Additionally, education ought to produce people who have empathy and sympathy and who genuinely make an effort to ensure community-wide welfare. The educated person should be so sensitive to the conditions around them that their chief endeavour would be to improve these conditions for the good of all.

The goal of education, as outlined by Nkrumah, is very apt for any society, but more so for the communal African society. His idea of education, particularly at the tertiary level, is one that reflects the ubuntu philosophy: I am because you are, and you are because I am. Nkrumah expected that beneficiaries of education would use its fruits for the communal good. It may further be expounded that Nkrumah's exhortation was to build on the general aim of African 'traditional' education, which is "based on the socio-cultural and economic features shared by the various communities" (Nsamenang and Tchombe, 2011: 24). Traditional education's main aims, among others, have been "to create unity and consensus in society... to inculcate feelings of group supremacy and communal living and to prepare the young for adult roles and status" (ibid). Tertiary education, in this scheme of things, should be a continuum of this life process among members of the society, irrespective of one's level of western educational attainment.

In summary, Nkrumah envisaged African Studies as an emancipatory mission for those bound by the "mental slavery" of the western educational system. Very critical in Nkrumah's vision of African Studies is for the student "to be deeply rooted in his culture and exhibit high moral standards acceptable in the African community" (emphasis added). Another objective that cannot be overlooked is that African Studies is to equip the student to be aware of how colonialism studied African institutions; this knowledge should prepare and empower the student intellectually to deflect the maneuverings of neocolonialism and other forms of domination.

Perceptions of Tertiary Students of African Studies

The rationale for attempting to capture the aims of African Studies as envisioned by Nkrumah is to evaluate whether or not what he saw many years ago as the role of African-centred education through African Studies has changed with globalization and information technology. As will be discussed below, the findings among students of all the categories described in the methodology section reveal a general unanimity about the relevance of African Studies.
Before proceeding with the above endeavour, it is necessary to define 'perception' as used in the context of the paper. It has been argued that "Our contact with the world is through perception" (Wade 2005: vii). Perception is considered a process by which individuals organize and interpret their sensory impressions in order to give meaning to their environment. People's behaviour is based on their perception of what reality is. It is also seen as a complex phenomenon that provides an input to higher-order processes such as making a choice (Hoffman et al., 2015). Perception therefore plays a role in our life choices and influences attitudes towards an individual's undertakings. The term is not used in an exclusively philosophical sense but has been given its everyday meaning: the way something (in this case African Studies) is regarded, understood or interpreted. The Oxford Advanced Learner's Dictionary definition of perception as "an idea, a belief or an image you have as a result of how you see or understand something" is also significant in the use of the term in this paper.

The interest in this project began when the writer started organizing tutorials in the University of Ghana and later taught African Studies at the Valley View University, Techiman Campus. The question arose, "If you hear [the words] Africa or Africans, what comes to your mind?" This question became a standard 'entry point' of the African Studies tutorials and lectures and was used to assess what students' perceptions of the continent and its people were.

Analysis of the Results

The responses varied but were mostly negative. Prominent was the view of the ''black man with black sense'', the idea that the African cannot do anything positive or that Africans think negatively of themselves and their continent. Another perspective was that Africans have a "pull him down" (PhD) attitude, interpreted to mean that individuals within some African communities are not inspired or enthused to see one of their own progressing or "doing well" in business or any endeavour of life. Rather, detractors make efforts to "pull him (such a person) down". Some respondents claimed that such negative efforts are normally advanced in the spiritual realm: victims are pursued by detractors visiting spiritualists ("witch doctors") to drag the 'progressive person' down. This belief, though difficult to pursue empirically, persisted among most students who were interviewed. Others claimed that there could be smearing of the successful person's image by subtle or blatant negative propaganda, such as spreading false information. Such propaganda could be crafted to suggest that the person's success is achieved through unorthodox means. Related were the responses of "black man with black sense" and "African mentality". The former was suggested to mean that some people did not have expectations of positive prospects for their neighbours. Respondents claimed that there is often scheming and machination to undo each other, even among brothers or blood relations. The latter was not explained entirely differently. These statements could not be based on empirical examples, as no specific communities were cited.
The assertions more likely reflected discourses on the media landscape, as such unsubstantiated allegations are often made on radio and television programmes and social media platforms. Other responses were that "Africans (leaders) are corrupt and power drunk," that "Africa has bad or visionless leaders" and that some of these leaders use their ethnic groups to stay in power in perpetuity. Some students mentioned that "backwardness is the hallmark of Africa." Other negatively perceived views included beliefs and practices in witchcraft; that Africa has a dependency syndrome, a perpetual over-reliance on Western countries (this response was mostly a refrain); that Africa is a continent with resources but no skill and interest to develop them using our (Africans') own intelligence; and that Africans are people with negative cultural practices.

There were also positive responses that recurred with every year group. These included "Africa is a continent with rich culture and a lot of natural and human resources," "Africans are religious people who observe high morals," "Africa has a large population" (though this was not given any qualification), "Africa contributes a lot of raw materials to the world," "Africa has beautiful culture" and so on. The unfavourable responses gave the impression that some Ghanaian tertiary students perceived their continent negatively prior to undertaking the mandatory African Studies course. I therefore decided to find out what students' perceptions, particularly of the relevance of African Studies as a discipline, were after they had undertaken the lectures in the discipline. Even though the idea was to find out students' perceptions of African Studies, the researcher allowed other opinions which bordered on their perceptions of the continent as well. The analysis, based on the responses, has been classified under three themes, namely Culture and History; Leadership and Governance and the African Identity; and, finally, Africa's Contribution in the Global Community.

Culture and History

One striking response repeated throughout the research was students' surprise or shock when the question of the "relevance" of African Studies was asked. It was as if the questioner was not from academia. This reaction emerged among both direct-fee-paying students, whether from public or private institutions, and 'non-fee-paying students'.5 Not only were the responses from participants emphatic about the relevance of African Studies, they were also positive about its centrality for the future of the youth of Ghana. A student from the University of Ghana said:

We didn't know anything about Afro Studies initially and just thought we would use it to meet the requirement, but from the first three 'intro lectures', it is one of the most crucial courses on campus that can help to understand who we are as a people. It will help we, the younger generations, to know their past, help future leaders to know the sacrifices their forefathers made and to practice good democracy.

5 Every tertiary student pays a fee of some sort; hence this category refers to students whose fees are somewhat subsidized.

In the writer's days as an undergraduate student, African Studies was meant to satisfy the requirement for the award of the degree by 'just' passing it. Some students did not attend African Studies lectures until a few weeks or days before the examination.
At a higher level, it is not uncommon to hear some social science professors and lecturers make demeaning remarks about the methodological approach of African Studies, because qualitative method is perceived as not rigorous enough. This is often demoralizing, particularly because these kinds of comments come from persons who are supposed to be well informed and should be educating the ordinary person on the street.

Participants were of the view that African Studies exposes students to the culture and history of Africa, and this was perceived as important. A student from the VVU Techiman Campus said:

African Studies is a tool for achieving tolerance to any culture in our society where one person will not look down on the culture of others. It makes learners know more about the cultures of others.

This response is significant in that there are misunderstandings and conflicts among people because of ignorance of each other's culture. It suggests that some students most probably understand the uniqueness of culture and would, all things being equal, respect each other's culture, especially since Ghana, like the rest of Africa, is multicultural. One would expect those who respond this way to respect the culture of any community s/he is to serve in any capacity.

Some participants also saw culture as "the backbone of communities", while others saw it as a source of identity. Even in the midst of globalization, it is culture that distinguishes one group from another. The response was summed up thus:

Everyone is equal in the world, so it is our dressing, the songs that we sing, our dances that will make us uniquely Ghanaian. When one watches the opening ceremony during the Olympics, you would normally identify each country by its dressing. It is also good to see Ghanaians in some traditional attire, usually kente. Some aspects of our culture can also bring us foreign exchange, so African Studies is very relevant in this respect.

Not all respondents wholeheartedly embraced the relevance of culture and African Studies. Students of theology and those with religious sympathies tended to perceive African Studies as projecting ideals that are not 'edifying to the soul' and therefore retrogressive in the spiritual sense. A Level 400 Religious Studies student from a private university expresses this view:

In some situations, African culture does not promote ideals that edify the soul. Some cultural practices in Ghana do not conform to the scriptural ways of worship. African Studies does not condemn such acts but seems to 'condone' such practices in the name of culture. How can such negative 'cultures' help develop the continent?

Though this argument is flawed, as outmoded cultural practices are not glorified in African Studies, very 'conservative' religious adherents hold such an opinion. Negative cultural practices were also highlighted, and the perception of some participants was that students who had undertaken the course, and who understood the uniqueness of culture and how to negotiate the transformation of outmoded cultural practices, should help influence the modification of negative ones. Teachers who work in rural communities with NGOs and other advocacy groups were called upon to execute this task.
Belief in witchcraft, female genital mutilation, bad widowhood rites, and negative exotic marriage practices, such as levirate marriages, child marriages and marriage to deities like the Trokosi6 system, among others, were seen as backward in the information technology world. Archaic cultural practices paint "a very ugly image of Ghana and Africa in the eyes of the rest of the world", a sandwich student observed. Education (i.e. sensitization) was suggested as the means to end most of these negative cultural practices, and students who had read African Studies were seen as better positioned to help in that regard.

6 A participant mentioned this as an example in the Volta Region. The author does not know much about this system and is not in a position to declare it negative, as most of what is said about it comes from the media and NGOs. These groups might not give a holistic assessment of the practice. However, the media landscape presents Trokosi as a very retrogressive cultural practice which makes it possible for young girls to be engaged to deities. It is said the priests eventually have children with the children so betrothed to the deities.

African Studies was also perceived as a subject that taught African history (incidentally, the past was perceived negatively) and governance systems: "Afro Studies helps us to have awareness about our past so as to guide us (to) establish good identities or personalities as Africans". The discourse relative to this point was that leadership has failed to learn from the mistakes of the African past. Other opinions bordered on a globalized system that dictated the pace, which Africa could only follow without choice, rendering national political leadership, in particular, 'obedient' followers.

African Contribution to and in the Global Community

The massive contribution of the continent's raw materials to the global economy was a matter of interest and disappointment to the students. The reality that most of the continent's raw materials are exported to the industrialized Western countries for processing serves as a double loss for the continent and its youth. While most African leaders lament the lack of employment openings for the continent's youth, they are, at the same time, busily and unaccountably exporting youth employment opportunities to others. A student from the University of Ghana argued that it was not raw materials that were being exported, but employment prospects, expended on the youth of industrialised countries to the detriment of the teeming youthful population of the continent. He contended that the "refusal" of African leaders to pursue a 'drastic' industrialization drive for the continent was self-enslavement and a betrayal of the youth and other sections of African society. Contributing raw materials to the global community, although positive, represents a negative contribution, as it compromises Africa's youth employment and the continent's self-reliance.

One positive reflection of Africa in the global community that was highlighted was the exemplary leadership demonstrated by Kofi Annan as the former UN Secretary-General. Similarly positive images of Africa are the contributions to global sports and entertainment by men and women from the continent, both at home and abroad. The puzzle has always been why Africans within the continent are not as influential in the sports field as Africans in the diaspora.
This is similarly reflected in Africans who have contributed to inventions while outside the continent. The research revealed respondents' conclusions that there was a need for further research into why this situation exists, beyond issues of sports infrastructure and academic facilities.

Leadership and Governance

These are very important issues to the tertiary students who participated in my research. Most of the responses held that there is no future for the continent if leadership remains visionless, full of empty promises and divisive. A number of respondents expressed disdain for leaders who are ever ready to enrich family, friends, cronies and the praise singers of leadership. One student remarked:

Our governance systems have problems continent-wide, there are problems with parties in power, there are problems with parties waiting for power. African Studies is perhaps the way out: if we the younger generations will take the course more seriously and attempt to put into practice some of the things suggested in writings in African Studies, and learn from the history of the great leaders of Africa, there will be hope for our country.

Other respondents similarly expressed grave concerns about the absence of reliable electoral systems that could be insulated from the manipulation of incumbent political parties. Other views on leadership and governance yearned for a situation where African countries could run elections without funding from external sources; this would reflect that the continent is truly independent and free from external or neocolonialist manipulation. The inability of African politicians to think and act 'independently' was attributed to the reliance on western countries to fund every project, including elections. Another response relating to governance suggested African countries should have working institutions grounded in the rule of law.

The media is perceived as extremely partisan and unable to serve its watchdog role effectively. Most students gave examples from Ghana where, depending on the leanings of the media house, opinions are presented as facts, and "some media groups think for the public". Further, respondents charged that the raw facts are not presented to listeners or readers, but rather conclusions or insinuations intended to bait or direct them; thus the listener or reader is made to feel s/he cannot decide what the truth is but must be guided by the 'truth' as presented by the media house. The research participants thus concluded that the media was not very helpful in ensuring good governance, since it is a vibrant and principled media that is a recipe for good governance.

Some of the respondents were concerned about the battered image of Africa. These concerns centred on negative things about the continent, such as being the hub of poverty, a centre of ethnic conflicts, 'patrons' of corruption, victims of communicable diseases such as HIV/AIDS, and, in general, a continent of squalor. Others were also worried about the racial abuse of the continent's sports personalities, who have been denigrated even on the field of play. Students perceive that in the globalized world there is still the need for a positive African identity, and African Studies helps one to understand why Africans have a battered or negative image and are stereotyped and grossly misrepresented, either intentionally or through ignorance.
The fact that over the years nothing seems to be working positively for Africa was also expressed strongly. Issues such as the slow pace of advancement of liberal democracy, which has been accepted as the medium of governance; the unimpressive economic performance of African countries across time and space; violent conflicts and the springing up of vigilante and terrorist groups; and the rising, seemingly unsolvable unemployment crisis, among others, are serious concerns for some students. Suggestions were made that solutions to some of these retrogressive developments could be found using African Studies lectures as platforms.

Other Perceptions

Some students, though in complete agreement with the relevance of the area, expressed opinions somewhat deviating from the above. A female Accounting student from Valley View University commented as follows:

African Studies is too bookish and you must be very interested in reading to enjoy it. Those of us who do calculations always do not have some of its elements in our course area. So when the lecturer leaves the class, it becomes very difficult. It would have been good to have the course designed into some aspects of the courses we all pursue. It is difficult to identify African Studies in some courses.

Another observation related to courses with practical components offered by some universities. For instance, in the University of Ghana, courses like Music and Dance Studies have practical elements. A participant observed that even though African Studies is relevant, the practical aspects end with the course on the university campus after the student has written and passed the examination. The participant doubted the long-term relevance of such electives, particularly after school, when there are no notes or reading material to which one can refer.

The above observations may not be inappropriate, as they re-emphasize the call that the university must adapt to the changing needs of society. For instance, how certain Western-modelled courses should be designed in the African Studies programme to address such concerns is worth exploring. It is equally important to bring home to students that the field must not necessarily be designed to accommodate all courses, but that such courses could also be designed using Afrocentric approaches which can address the concerns raised. There is also the need for methodological innovations in some of the electives, taking into consideration how students may refer to them in future, after school.

What about employment?

The likelihood that a course of study would ensure employment was also raised by some respondents, as already hinted at by Sackey (2014). A respondent who expressed no qualms about the relevance of African Studies in raising consciousness of African values, morality and aspirations, as envisaged by Nkrumah, was, however, pessimistic about the course's prospects of providing "rewarding employment". "Most of us are trying to have the degree because of getting a good job", he explained, "but African studies is like humanities and social science courses; I know most students who are from the humanities do not get fulfilling jobs". This concern is a concrete and realistic one. The current outcry of every parent and student, in or out of the university, particularly in Ghana, is employment.
A desperate mother in the South African movie Sarafina said her children "do not eat glory". Knowledge acquired from a course, and its relevance, remain 'fantasy and fairy tales' if the graduate remains jobless. As the film's character cited above implied, the student cannot eat, wear, or be accommodated by knowledge if that knowledge fails to earn him/her a decent job. The employment question is not just about African Studies courses but about the entire educational system and its connection to the needs of the economies of most African countries. Most of the universities are still fixated on course content, while employers want resourceful graduates who are problem solvers. Hence, Nkrumah's admonition that the ever-changing nature of "our society" requires dynamic, forward-looking educational institutions that can keep pace with a rapidly changing society must be seriously considered. African Studies institutions and centres may also need to re-design the courses of the discipline around the objectives of traditional education in an information communication technology age. Such an innovation would assuage the uncertainties confronting students in relation to employment. Be that as it may, it must, however, be understood that "Not all university programmes are professional oriented" (Sackey, 2014: 260). There are professional as well as non-professional disciplines, and they are complementary. Some courses offer broad knowledge to advance and prepare students for life; African Studies is one such course, and it could provide students solutions to the employment problem.

Discipline not taken seriously

A different perspective presented on the relevance of African Studies is the lack of seriousness on the part of some students. One student was clear on the relevance of the discipline but observed that some students in his institution "did not take the course serious" [sic]. In that university, courses in the discipline attracted one credit hour per semester. Students were unclear as to whether or not it was truly counted in the calculation of the overall cumulative grade point average (CGPA). The respondent stated that since students "believed that it was not calculated for our overall CGPA, nothing drastic will happen if they didn't take the courses serious" [sic]. The respondent also stated that they had heard some time ago that the course was to be removed from the university's curriculum. This 'rumour' has lingered for some time and encouraged apathy and a lack of commitment from some students.

The discipline suffers a similar fate in one of the sampled institutions,7 but in a different form. In that institution, students read African Studies as an aspect of another course.8 While the university discussed in the first instance had clear courses and course codes for African Studies, the latter had none. The organization of African Studies in the format described is merely to fulfil the policy of African Studies being made compulsory by the National Council on Tertiary Education. The idea of submerging African Studies within other courses brings to mind the genesis of the discipline in the University of Ghana in the late 1940s. Professor Kofi Abrefa Busia had been appointed as a lecturer in African Studies, tasked with developing the programme. Confronted with a lack of personnel and other problems the school faced, he converted African Studies to a Department of Sociology instead.

7 These institutions in the sample are not named because it might dent their business image, since they are private.
8 This was the case as at the time of this project; it is unknown whether it will continue to be so.
Professor Busia is reported to have said, "African Studies? We are all in Africa so studying Africa…" (Allman, 2013: 184, 185). This argument was later used by other liberal lecturers, who contended that "since the university is an African university it presupposes that its orientation in every department should be automatically African" (Sackey, 2014: 251). While no evidence has been found to link current private institutions' amalgamating and submerging of African Studies into other courses to what manifested in the University of Ghana in the formative stage of the discipline, these institutions have apparently reactivated these historical antecedents.

Thus, a 'power struggle' or 'feud' ensued between Eurocentric social science courses and African Studies as a discipline. "African Studies was ridiculed and dismissed as a legitimate academic pursuit" by some Western academicians and African liberalist academics (Sackey, 2014: 250). The course was thought undeserving of a place on the University of Ghana curriculum (Sackey, 2014: 250, 259). African Studies courses were designated dondology (studies in drumming and dancing) when the School of Performing Arts was still an integral part of the Institute of African Studies.

Some universities see the compulsory programme of study for all university students as being rammed down their throats by the National Council on Tertiary Education. The irony in the instance of these institutions is that while fee-paying students recognize the relevance of the discipline and have no concerns about earning more credit hours from its courses, it is the institutions that probably seek to minimize cost by either curtailing the number of credit hours or amalgamating the discipline into different course areas just to fulfil the National Council on Tertiary Education policy. Some private institutions are only legally bound to run African Studies as a discipline and do not see it as a necessary area for the pursuit of knowledge and an avenue for inculcating the youth with African values:

To equip students with indigenous knowledge of Ghana and Africa broadly defined to include inherited ideas, beliefs, values, legends, mythology, institutions and practices, science and technology. The goal is to nurture in the youth of Ghana and Africa the desire and the skills to fashion home-grown solutions to Africa's problems.

The overall controversy of these debates and the lack of unanimity on the Afrocentric epistemology of African Studies continue to be the reason for the discipline being ridiculed and students "not taking the course seriously".

Evaluation of the responses

The overall evaluation of the responses offered by students re-affirms the vision Nkrumah had of African Studies in his 1963 inaugural speech.
African Studies institutes and centres have significant roles to play, but, more importantly, the universities themselves have to give prominence to African Studies, as it is the only course that teaches and acquaints students with their African identities as well as their cultural roots. African Studies is even more important today than it was in the post-independence era, because Africans have carved a niche in every sphere of life, from the political to the socio-economic, and from the scientific to the technological, among others. It is therefore important that students are made aware of these developments, to ensure that they are not global spectators but see themselves as contributors to the wider human community.

Refashioning African Studies for Africa's Transformation

In suggesting how African Studies should be refashioned for Africa's transformation, it is worth satisfactorily answering the question, "what is the rationale for African Studies in the postcolonial context?" (Olukoshi, 2006). Or, better still, what is the rationale for African Studies in the ICT age? Is the discipline for image reconstruction, ideological rebuttals, cultural revolution, epistemological reasons, etc.? It is also important to take on board the kind of development that is being sought in the transformation, as advocated by the Kwame Nkrumah intellectual conference. Would the development being pursued be at the benevolence and generosity of another, or a self-reliant one? Is political leadership sincere in the development agenda?

The emancipatory mission of African Studies, which Ghana's first President, Kwame Nkrumah, unambiguously and succinctly articulated when he urged the participants in 1963 to produce genuine knowledge about Africa through scientific and academic rigour to promote Africa's development and transformation in response to the thematic concerns of the time, is a parallel reason for revolutionizing African Studies and knowledge production to answer to a new mission and vision. What new mission is being embarked upon? In addressing what African Studies ought to be doing, there is a need to take stock of why knowledge was produced in African Studies during the colonial and postcolonial epochs. In retrospect, it has to be appreciated that knowledge production in African Studies has long been "extraverted, i.e. externally oriented, intended to meet the theoretical needs of our Western counterparts and answer the questions they pose" about Africa and Africans (Hountondji, 2009). In this sense, it may be laudable to suggest that the discipline should be 'domesticated'; that is, African scholars should make African Studies "an autonomous, self-reliant process of knowledge production and capitalization that enables us (as Africans) to answer our own questions and meet both the intellectual and the material needs of African societies", to borrow from Hountondji (2009). A wider project of "knowing oneself in order to transform" is necessary for the development we seek. This "wider project" would then be that African scholars involved in African Studies should "develop first and foremost an Africa-based tradition of knowledge in all disciplines, a tradition where questions are initiated and research agendas set out directly or indirectly by African societies themselves" (ibid) to address questions and issues pertaining to the continent.
This strategy would unearth African solutions to African problems rather than repeating models from the West. In the period immediately after independence, African Studies concentrated on delineating Africa and espousing its values for the world to know us better. This agenda, though important, significantly led to the neglect of "intra-African cross-national learning." This situation has made mainstream African Studies constitute itself into a tool for others to master Africa (Olukoshi, 2006: 539). In refashioning African Studies for Africa's transformation, this trend has to change. The youth must be encouraged to learn about intra-African issues and to cross-fertilize ideas, not only by attending conferences physically but by making effective use of information communication technology to interact directly. For instance, using Skype or video conferencing, why can students in the University of Ghana, the University of Ibadan, the University of South Africa, the University of Cairo and the University of the West Indies not discuss a common African problem or issue and proffer solutions to a commonly identified subject? In an ICT-driven world, social media can and should be used effectively to link up African youth from various parts of the world to engage in discourses that would promote socio-economic, political and other topical issues concerning Africans. As African youth in different parts of the world, they can categorise and share experiences such as their aspirations and fears, and the opportunities and obstacles, and how these can be achieved and surmounted respectively. The use of these media must, however, be managed innovatively by African Studies centres and institutes so as to avoid infiltration by radical fundamentalist groups that may use African youth to embark on unwanted social upheavals. These institutions should provide an academic framework to define the scope and ensure the sanity of the discourses, to guarantee intellectual benefits to the youth, as individuals, and to their communities.

To refashion African Studies for Africa's transformation in an ICT-driven world, it would be appropriate to adopt critical thinking and problem-solving approaches to African Studies. This strategy would be useful in tackling notorious subjects like corruption, over-reliance on foreign aid, a taste for foreign things, bad leadership, and so on. Professors, senior scholars and others with experience, though not necessarily in academia, could fashion a regulatory framework in which students are encouraged to employ critical thinking and creative ways of addressing these entrenched ills in African societies. Again, ICT could be used for global outreach, soliciting contributions of ideas on all shades of human endeavour from the African community. In an information technology driven world it is more critical now than ever before to use documentaries and cinema/films to advantage. Africa's history should be re-interpreted and presented on YouTube and other streaming media heavily patronised by the youth, for their comprehensive education on the historical achievements as well as the failures of the continent. These productions can also be used as teasers to motivate students into reading the seminal works in African Studies.
African Studies fashioned for an IT-driven world would make it easier to reach African youth with intellectual information that would educate them about the realities of the continent's progress and failings, rather than leaving them to rely on information from other sources that may not be authentic or that may be misrepresentations.

Another area that may help in refashioning African Studies for Africa's reconstruction and transformation is reading material on African Studies. There are more readings on Africa outside the continent than within it. Most people do not encounter Africanist writings on Africa until they enter higher institutions of learning, which does not make the situation easier, because critical books on Africa and Africans are scarce. African Studies centres and institutes should intensify efforts to increase the accessibility of reading material that is beyond the reach of most African youth. These should be put on the web so that students can access this information easily and without drudgery.

Another way of refashioning African Studies for Africa's transformation is to move to the lower rungs of the educational ladder and institute the teaching and learning of African Studies at all levels of education, from primary to university level, with suitable subjects for the various stages. Catch them young, as it is often said, must be an implementable aphorism. One of the ways of refashioning African Studies is to begin from the colleges of education where Africa's teachers are trained, so that they take responsibility for the upbringing of younger children through an Afrocentric epistemology. It appears that Western epistemology extols virtues of the West that downplay and overshadow African-centred values. For instance, some students were unacquainted and unfamiliar with African 'indigenous' science and technology activities in their communities. It was only when things such as herbal preparation, blacksmithing, brewing, and spinning cotton wool into thread were pointed out to them that they began to have a deeper appreciation of science and technology in their communities. Most Ghanaian, and I dare suggest African, tertiary students lack fundamental knowledge about their countries and their continent, Africa.

Conclusion

Students perceive African Studies to be relevant in the contemporary educational curriculum of tertiary institutions, irrespective of their status as either fee-paying or publicly sponsored students. The religious or philosophical foundation of the students' institution played no role in how African Studies was perceived. Concerns raised by students with respect to certain courses need thorough scrutiny and appropriate action to meet the expectations of the majority of students. Africa's reconstruction and transformation hinge on African Studies. There must be commitments from African states, particularly financial and infrastructural, to facilitate the programme as a field of study. Educational policy makers must be flexible, but Africanists (institutions and individuals) need to push them to act more responsively. Bureaucrats, like the political leadership, seem visionless and contradictory, hence the need for constant pressure to be applied. Institutions like the National Accreditation Board and the National Council for Tertiary Education, among others, need very strong collaboration and 'orientation' from the Institute of African Studies, Legon.
It is observed that personnel in these institutions have probably not read the mandate given by Ghana's president to the Institute. This is because there are some tertiary institutions in Ghana that do not have typical African Studies courses but use other liberal studies courses as proxies. Supervision is lax, and tutors who are not certificated in African Studies are the people handling African Studies courses. It is only collaboration and advocacy by the Institute of African Studies, Legon and the departments of African Studies in other universities that can turn things around for the better.

African youth need to be taught the true history of Africa, without distortions. It is not wrong to be defensive with those who misrepresent, stereotype or denigrate the African, on the one hand. On the other, it is appropriate to be candid about African history and the continent's current situation, and to present both to the student. This will help the youth decipher truth from falsehood. As the Jamaican reggae star Robert Nesta Marley urged in his lyrics, we must therefore "open our eyes and look within", knowing ourselves in order to transform. I am mindful that these proposals will not altogether be easy, but where there is a will there is a way. Africa and its youth will be the winners ultimately.

References

Abrahamsen, R. (2003). African Studies and the Postcolonial Challenge. African Affairs, Vol. 102, No. 407, 189-210.
Allman, J. (2013). Kwame Nkrumah, African Studies, and the Politics of Knowledge Production in the Black Star of Africa. The International Journal of African Historical Studies, Vol. 46, No. 2, 181-203.
Alpers, E. A. (2002). What Is African Studies? Some Reflections, Identifying New Directions for African Studies. African Issues, Vol. 30, No. 2, 11-18.
Creswell, J. W. (2012). Educational Research: Planning, Conducting and Evaluating Quantitative and Qualitative Research. Boston: Pearson Education, Inc.
de Haan, L. J. (2010). Perspectives on African Studies and Development in Sub-Saharan Africa. Africa Spectrum, Vol. 45, No. 1, 95-116.
Dike, O. K. (1967, June). The Scientific Study of Africa's History. The UNESCO Courier, 9-13.
Falola, T. (2003). Power of African Cultures. Rochester: University of Rochester Press.
Ghana Statistical Service (2013). 2010 Population & Housing Census National Analytical Report. Accra: Government of Ghana.
Gyekye, K. (2013). Philosophy, Culture and Vision: African Perspectives. Accra: Sub-Saharan Publishers.
Hodder-Williams, R. (1986). African Studies: Back to the Future. African Affairs, Vol. 85, No. 341, 593-604.
Hountondji, P. J. (2009). Knowledge of Africa, Knowledge by Africans: Two Perspectives on African Studies. RCCS Annual Review [Online]. URL: http://rccsar.revues.org/174; DOI: 10.4000/rccsar.174.
Kelsey, J. (1998). Privatizing the Universities. Journal of Law and Society, Vol. 25, No. 1 (Transformative Visions of Legal Education), 51-70.
Mamdani, M. (1998, April 22). Is African Studies to be turned into a new home for Bantu education at UCT?
Text of remarks at the Seminar on the Africa Core of the Foundation Course for the Faculty of Social Sciences and Humanities, University of Cape Town. Cape Town, South Africa.
Melber, H. (2009). The Relevance of African Studies. Stichproben. Wiener Zeitschrift für kritische Afrikastudien, Nr. 16/2009, 9, 183-200.
Mlambo, A. S. (2006). Western Social Sciences and Africa: The Domination and Marginalization of a Continent. African Sociological Review, 10(1), 161-179.
Nkrumah, K. (1963, October 25). The African Genius: Speech by Dr. Kwame Nkrumah, President of Ghana, at the opening of the Institute of African Studies at the University of Ghana, Legon, Accra, Ghana.
Nsamenang, A. B. (2011). Handbook of African Educational Theories and Practices: A Generative Teacher Education Curriculum. Bamenda, North West Region (Cameroon): Human Development Resource Centre (HDRC).
Nyamnjoh, F. B. (2004). A Relevant Education for African Development: Some Epistemological Considerations. Africa Development, Vol. XXIX, No. 1, 161-184.
Olukoshi, A. (2006). African Scholars and African Studies. Development in Practice, Vol. 16, No. 6, 533-544.
Rodney, W. (1973). How Europe Underdeveloped Africa. Washington, D.C.: Howard University Press.
Sackey, B. M. (2014). African Studies: Evolution, Challenges, and Prospects. In S. Agyei-Mensah et al. (Eds.), Changing Perspectives (pp. 239-262). Dordrecht: Springer Science+Business Media.
Wade, N. J. (2005). Perception and Illusion: Historical Perspectives. Springer Science + Business Media, Inc.
Zeleza, P. T. (2009). African Studies and Universities since Independence. Transition, No. 101 (Looking Ahead), 110-135.

work_cl3okrpirjd2tigoevsgoc66sq ----

Interlending Trending: A Look Ahead from Atop the Data Pile
Dennis Massie, OCLC Research

Note: This is a pre-print version of a paper forthcoming in Interlending & Document Supply. Please cite the published version; a suggested citation appears below. Correspondence about the article may be sent to massied@oclc.org.

Abstract

This paper explores five forces likely to significantly affect interlending operations in the near term: 1) the transition from print to electronic resources; 2) management of legacy print collections; 3) mass digitization projects; 4) competition from other information providers; and 5) copyright. This paper provides a unique look at forces that are shaping the future of global ILL activities by using data from authoritative sources to illustrate the effects these forces are having and will continue to have on libraries and ILL operations. It predicts that most libraries will be slow to divest themselves of print monographs on a large scale; libraries will continue to build new offsite storage facilities but put more thought into their contents; increased discoverability of digitized texts and greater copyright restrictions will drive users to print; librarians will make gray areas of copyright law work for them instead of against them; publishers, librarians, authors, lawyers, and scholars will find a responsible and fair solution to providing digital access to 'orphan' works; and ILL will persist as a core operation for nearly all libraries.

© 2012 OCLC Online Computer Library, Inc.
6565 Kilgour Place, Dublin, Ohio 43017-3395 USA
http://www.oclc.org/

Reuse of this document is permitted consistent with the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 (USA) license (CC-BY-NC-SA): http://creativecommons.org/licenses/by-nc-sa/3.0/.

Suggested citation: Massie, Dennis. 2012. "Interlending Trending: A Look Ahead from Atop the Data Pile." Interlending & Document Supply, 40, 2: 125-130. Pre-print available online at: http://www.oclc.org/research/publications/library/2012/Massie-IDS.pdf.

Introduction

In 2001, I was fortunate enough to attend the 7th IFLA Interlending and Document Supply (ILDS) conference in Ljubljana, where there was a lot of talk about whether ILL had any future. Leaf through the conference program and you'll find presentations such as "When Resource Sharing Used to Work" and "E-Books, the Demise of ILL," and – my personal favorite – a panel discussion about publishers selling articles directly to end users called "Two's Company, Three's a Crowd: Who Needs a Librarian?" [1] There was so much negative talk that I was tempted to change the title of my own presentation to "ILL: Let's Call the Whole Thing Off," which I intended to sing to the tune of that old "you say to-may-to, I say to-mah-to" song.

You say meta-day-ta, I say meta-dah-ta…
You say final user, I say end user…
Digitize! Copyright! Publishers! Compromise!
Let's call the whole thing off…

A decade later, ILL activity among Association of Research Libraries members is up 20%, from 3.8 million filled requests in 1999 to 4.5 million in 2009. [2] Filled requests on the OCLC ILL system are up 17%, from 8.7 million in 2001 to 10.2 million in 2010. [3] Obviously, the idea of libraries sharing collections with each other is still very much alive.

Five Forces, Five Years

I have identified a list of five forces that I believe will affect the course of resource sharing operations during the next five years. While the list is certainly not comprehensive, these forces at least lend themselves quite nicely to numerical analysis. By examining hard data, I will forecast how those forces will affect our behavior as resource sharing professionals. How will we be going about our business five years from now, and what will be the nature of that business? The forces under examination are:

1. P to E transition (Print to Electronic)
2. Managing legacy print collections
3. Implications of mass digitization projects
4. Competition
5. Copyright

P to E Transition (Print to Electronic)

What does the data tell us about the ongoing transition in our libraries – and indeed in our society – from mostly print to mostly electronic? For one thing, it tells us that the transition has already happened with periodicals, scholarly and otherwise. We've got them scanned, backed up, archived, and stored away in dark caves, some never to be seen by human eyes again. Finally, librarians are feeling that it's safe to divest ourselves of the print counterparts to all these electronic journals. There are 7,160,028 articles available in JSTOR as of February 20, 2012. [4] In addition, there are now half a million pre-1923 articles available even to nonsubscribers (JSTOR, 2011).
In most fields, even the humanities, scholars are going digital because of convenience and lower costs to libraries. But it would be premature to assume that the transition to electronic journals is complete. Three years ago, Constance Malpas attempted to identify all peer reviewed scholarly journals currently published and came up with around 27,000 titles. [5] Moreover, an astonishing 36%, or 10,000 titles, mostly in the humanities, are available only in print. These are not oddball special interest magazines like "Weekend Cannibal". They are peer reviewed scholarly journals. These findings suggest that not everything will be digitized; not even all scholarly research will be digitized. Perhaps as much as one third of the scholarly literature will remain available only in print. (And who are you going to call when you need an article from one of those print-only titles for your next research project? ILL, that's who.)

There is much more to be said about the transition from print to electronic. For example, the percentage of the library budget that Association of Research Libraries members spent on licensed resources more than doubled in a 7-year period, from 25% in 2002 to 56% in 2009 (Kyrillidou and Morris, 2011). But it's not just libraries that are transitioning from P to E, and the transition is not limited to periodicals. A June article on PewInternet.org cited a survey done by Princeton Survey Research Associates International, which showed that Kindle ownership among US adults doubled from 6% in November 2010 to 12% in May 2011, a pretty steep climb in just six months and climbing twice as fast as tablet computers (Purcell, 2011).

A couple of numbers involving e-books are also quite telling. The Internet Archive, through its Open Library project, is making 85,000 e-books available for lending among a consortium of 150 libraries which contributed scans of digitized books to Open Library. The user must be physically present at the library to borrow the e-book, but then can take it anywhere on a Kindle or other device for the term of the loan. Some of the material is in copyright but out of print. Individual libraries, not the Internet Archive, determine the lending rights for in-copyright materials they've contributed. Brewster Kahle of the Internet Archive hopes publishers will sell e-books to libraries instead of licensing them, get their money, then move on to the next title – the way it's always been done with print books (Rapp, 2011).

OCLC is partnering with the Ingram Content Group to offer e-books from the MyiLibrary collection for e-loan through WorldCat Resource Sharing. 50,000 titles from major publishers such as Wiley have been loaded into WorldCat under the owning symbol IDILL. Loans cost 15% of the publisher's purchase price, payable via OCLC's ILL Fee Management (IFM) service, and the user has access to the book for nine days (OCLC, 2011a). These are two examples of how e-books are now being loaned via ILL. This is an important beginning.

The first thing that comes to mind when resource sharing practitioners consider the transition from print to electronic is whether the licenses our administrators sign will allow us to lend those items via ILL.
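As a quick illustration of the economics of that OCLC/Ingram model, here is a minimal sketch in Python. This is my own arithmetic, not OCLC's: the $80 list price is a hypothetical example, while the 15% fee and nine-day loan period come from the announcement cited above.

    # Illustrative arithmetic for the MyiLibrary e-loan model described above.
    # Assumption: a hypothetical $80 publisher list price; the 15% per-loan
    # fee is the figure cited in the text (OCLC, 2011a).
    list_price = 80.00
    fee_per_loan = 0.15 * list_price             # cost of one nine-day loan
    breakeven_loans = list_price / fee_per_loan  # always 1 / 0.15, about 6.7
    print(f"Fee per loan: ${fee_per_loan:.2f}")
    print(f"Loans that equal the purchase price: {breakeven_loans:.1f}")

In other words, under this model roughly seven loans of a title cost as much as buying it outright, whatever the list price happens to be.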
But there is another important implication of this shift in the nature of our collections that goes unremarked upon. Soon there will be fewer libraries where administrators care about print in the same way that they have in the past. Fewer libraries will continue to operate as if they have a mandate to preserve print resources. Libraries that will continue to see preservation of print materials as part of their mission will need to self-identify sometime in the near future. In fact, this has already started to happen with efforts like the Western Regional Storage Trust (WEST) Project [6] where shared print archives are linking up to form a network and making their preservation commitments widely known. Libraries can't all go E and forget about P at the same time, or essential research materials will be lost forever.

Managing Legacy Print Collections

The second of the five forces influencing ILL is the management of legacy print collections. There are already nearly a billion print volumes in North American college and university libraries – wow! – and we're adding another 25 million each year (Payne, 2007), many of which are going directly into storage. That is quite a legacy we're leaving, especially since so many institutions are storing the same things. Lizanne Payne conducted a study of storage facilities in North America for OCLC Research and found that about 70 million print volumes are stored in 68 offsite storage facilities across the United States. Most of those facilities are full and many organizations are considering building new storage facilities or adding pods onto existing facilities. The cost of doing so is between $3 and $4 per volume, a very serious investment when you consider that such facilities are usually built to hold about 1.5 million volumes (Payne, 2007).

One might think it would be cheaper to leave those print volumes on campus, but Paul Courant determined that it costs $4.26 per year to keep a printed volume on the shelf vs. $0.86 per year to store it in high-density offsite storage (Courant and Nielsen, 2010). Another study found the price of keeping a monograph in a library over the course of that book's lifetime is a cool 718% of the original purchase price (Lawrence et al, 2001). Other formats are not nearly so expensive to keep as monographs, but the price tag is still hefty.

What are the implications of all these numbers for resource sharing librarians? It's obvious that the stock we rely upon to fill ILL requests is an expensive pet to keep around. Filling most ILL requests is fairly easy at the moment because you can always find somebody that has what your patron needs on the shelf. But will that still be the case when administrators start looking for less costly alternatives to storing all these print volumes? Not only are print collections huge and expensive, gobbling up resources desperately needed for other things, but they are barely even used. Study after study shows this to be true. A 10-year study of the ILL and circulation activity of 89 OhioLink libraries found that a mere 6.5% of the aggregate OhioLink collection accounted for 80% of the use (OhioLINK et al, 2011). Since the system averaged four copies of each and every volume, there was plenty of "nothing" to "not circulate" (O'Neill, 2009).
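To make those per-volume figures concrete, here is a minimal back-of-the-envelope sketch in Python. The annual costs are Courant and Nielsen's as cited above; the 25-year horizon and the one-time $3.50 construction cost per stored volume are my own illustrative assumptions, not figures from the studies.

    # Rough comparison of open-stacks vs. high-density storage costs per volume.
    # $4.26 and $0.86 per year come from the text (Courant and Nielsen, 2010);
    # the 25-year horizon and $3.50 build cost are illustrative assumptions.
    years = 25
    open_stacks = 4.26 * years              # $106.50 over 25 years
    high_density = 0.86 * years + 3.50      # $25.00, including build cost
    gap_per_volume = open_stacks - high_density
    print(f"Open stacks:  ${open_stacks:.2f} per volume")
    print(f"High density: ${high_density:.2f} per volume")
    # Scaled to a typical 1.5-million-volume facility:
    print(f"25-year difference for 1.5M volumes: ${gap_per_volume * 1_500_000:,.0f}")

On those assumptions the gap works out to roughly $80 per volume over 25 years, which is why the economics of offsite storage keep winning the argument even before anyone asks how often the volumes circulate.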
The University of California system and others with a long history of using offsite storage show that 1% to 2% of stored collections are paged each year (Payne, 2007). We have no idea what titles are in shared storage because, with a few exceptions, the records for those items are loaded into WorldCat under the owning library's symbol, not the storage facility's. Because most materials in high-density shelving are arranged by size, there's about a 0% chance any of that material can ever be weeded. So all that print that's widely duplicated and not being used is most likely there to stay.

If all that stuff in storage is likely to stay there because of the difficulty in weeding material shelved by size, maybe we should start thinking of such collections as de facto archives. It may be worth finding out what is stored in those archives and seeing if all the storage facilities across the system could be leveraged as a collective asset to all libraries. Of such modest thoughts are mighty research projects born.

One such project became known as the Cloud Library Project (Malpas, 2011). It explored the idea that a shared print storage facility, such as the 8.5 million volume ReCap (jointly run by Columbia University, New York Public Library, and Princeton University), might provide the basis for a third-party library such as New York University (NYU) to consider divesting itself of significant portions of its print collection, assuming the third party library could buy subscription access to what is stored in ReCap. For this to work, the collections stored in ReCap must sufficiently mirror NYU's print collection. We also compared the print ReCap collections and NYU's print collections with the HathiTrust corpus, the idea being that HathiTrust would provide preservation assurances for the materials, while ReCap would provide physical access to users.

We expected to find substantial overlap among the NYU, ReCap, and HathiTrust holdings. After all, everyone collects and stores the same things. Right? Well, not exactly. While there was a lot of overlap between NYU's print volumes and HathiTrust, only 10% of those titles were also held in ReCap. The availability of print back-up was considered essential to support any divestment of print collections by the third party library. The good news is that by including the print holdings of not just ReCap but also of all campus libraries of all institutions that own ReCap, overlap with NYU's print holdings represented in HathiTrust rose to a remarkable 90% (Malpas, 2011).

What are we to make of this? It's simple. We had been hoping that a library like NYU could divest itself of print collections by partnering with a single print storage facility that mirrored its collection and would provide its users access to the materials formerly owned by NYU. That didn't pan out. But by expanding the replacement collection to include the full collections – not just stored stuff but everything – owned by three top-tier ARL libraries, duplication with NYU approached 100%. The next questions to be explored were as exciting as they were obvious:

• What if a network of print archives existed?
• With a business model in place so libraries could subscribe to just-in-case access to print materials?
• How many supplier archives would be needed?
• Could other libraries divest themselves of low-use print?

And, of course, there is a final question of particular interest to resource sharing librarians:

• What effect would this model have on ILL?

We added the collections of five major storage facilities to the mix. The Center for Research Libraries had the least amount of overlap with HathiTrust – less than 5% – but this is not surprising since CRL collects the kind of low-use ephemeral material that would only be found in a place like CRL. ReCap, with its 8.5 million volumes, yielded an overlap with HathiTrust that approaches 20%. The combined holdings of the University of California's Southern and Northern Regional Library Facilities duplicates 45% of HathiTrust's holdings. And Library of Congress duplicates almost 60% of the digitized material represented in HathiTrust. Combined, all five storage facilities could meet 75% of the print access need for items preserved digitally in HathiTrust (Malpas, 2011).

So, to return one last time to our original supposition, having a third party library establish a partnership with a single print storage facility would in most cases not provide adequate coverage to allow that third party library to winnow its own low-use print titles. But by adding just a few more storage facilities, the print duplication of the massive HathiTrust digital corpus grew to nearly 75%. Clearly, a network of linked shared print facilities would alter the equation for the better. But important questions remain. Would such a network really enable other libraries to discard print? And is there a business model that would make sense for both consumers and suppliers that would allow this to happen? Finally, what effect would outsourcing a significant portion of your low-use print collection have on your ILL traffic? Your own library might have little or no print collection, so you'd be borrowing from networked storage facilities frequently.

Implications of Mass Digitization Projects

Mass digitization projects such as HathiTrust constitute the third of our five forces because of their implications for the way library collections are built, managed and shared. One thing you can safely say about the HathiTrust is: "Wow, it's big!" According to the HathiTrust web site, over 9.9 million volumes, or 5 million book titles, have been digitized already, including almost 3.5 billion pages. [7] HathiTrust also includes a quarter of a million serial titles, and it's getting even bigger. With more partners joining, the rate of growth is skyrocketing. HathiTrust grew to 6 million volumes in just twenty months. Current projections are that it will equal the size of Harvard University collections (16 million volumes) by 2013, and surpass the Library of Congress (30 million volumes) by 2020 (Malpas, 2011).

Aside from its sheer size, one of the most interesting and important aspects of HathiTrust is the copyright status of the digitized items. It includes over 2.5 million public domain volumes, many of which happen to be US government documents. Overall, 27% of its collection is in the public domain and now freely available. Within HathiTrust, nearly three-quarters or 7 million titles are covered by copyright, searchable by many, but viewable in their entirety by very few. [8]
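Underneath the storage-facility coverage percentages above sits a simple set computation: union the candidate print collections, then intersect with the digitized corpus. The Python sketch below is my own simplification, not the Cloud Library Project's actual record-matching methodology (which was far more careful); the file names and match keys are hypothetical, and the files are assumed to be non-empty.

    # Minimal sketch of the coverage computation described above: what share
    # of a digitized corpus is duplicated in print by each storage facility,
    # and by all of them combined. File names and identifiers are hypothetical.
    def load_ids(path):
        """Read one match key (e.g., an OCLC number) per line into a set."""
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}

    hathi = load_ids("hathitrust.txt")
    facility_files = ["crl.txt", "recap.txt", "srlf.txt", "nrlf.txt", "lc.txt"]

    combined = set()
    for path in facility_files:
        ids = load_ids(path)
        combined |= ids
        # per-facility duplication of the digitized corpus, as in the text
        print(f"{path}: {100 * len(ids & hathi) / len(hathi):.0f}% of corpus")

    print(f"Combined: {100 * len(combined & hathi) / len(hathi):.0f}% of corpus")

Adding a facility can only grow the union, which is why coverage climbed from under 5% for CRL alone toward 75% once all five facilities were pooled.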
Estimates of the number of orphan works – works published in years you'd think would be covered by copyright protection, but where the rights holders or the copyright status of the item is difficult to determine – run as low as 2 million and as high as 5 million titles. A sample done by the University of Michigan copyright office found that 45% of 100,000 titles checked were orphan works (MLibrary News, 2011). Applied to the roughly 10 million volumes digitized so far, that rate projects to nearly 4.5 million potential orphan works in the HathiTrust.

The University of Michigan and a number of other HathiTrust partners (University of California, University of Wisconsin, Indiana University, and Cornell University) have established a rigorous method for checking the copyright status of suspected orphan works, and plan to release such works for full access to their users as titles are cleared. The first batch of 27 orphan works by 27 French, Russian, and American authors is about to be released. A quarter of a million researchers will have full access (Young, 2011).

For daring to push the orphan works envelope, several authors' associations have filed suit against these institutions (Authors Guild, 2011a). Rights holders want to put a stop to this. One author said the HathiTrust takedown policy for presumed orphan works is like requiring a homeowner to notify the cat burglar ahead of time that he does not wish for his home to be ransacked. An Authors Guild blogger tracked down the agent for one orphan work's author. It turns out the writer had written a book upon which an Elvis Presley movie had been based, and he's just signed a contract with a publisher to release an e-version of yet another of his out-of-print works (Authors Guild, 2011b). Obviously, this is going to get ugly.

If HathiTrust prevails and all the orphan works eventually are made available to researchers at the schools involved, that means nearly 7 million full-text volumes would become available to a quarter of a million students. Of course, that's not the same as releasing all these works to all Internet users, but from a librarian's perspective it's a step in the right direction.

The news gets better. There are nearly 5 million records in WorldCat for items represented in HathiTrust, with new records being added every day, and OCLC has just reached an agreement with HathiTrust to provide full-text indexing of the entire digitized corpus within WorldCat (OCLC, 2011b). Researchers will soon enjoy the benefit of an almost unimaginably powerful new tool. I believe this indexing will drive more researchers to use the print versions of works, after they find what they need via HathiTrust or WorldCat but can't view the material. A couple of studies, one at Michigan and another at Columbia University, have shown no evidence that being part of the digitized corpus results in greater use of its analog form. Helen Look examined University of Michigan students' use of the top 500 titles from HathiTrust and, although students racked up nearly a million page views within these 500 works, only 2.2% of them circulated that year. Nearly 40% had never circulated (Look, 2010). However, students in the study had full access to the digitized versions, so why would they need print? It's when folks can discover text but not access it that we'll see them driven to print.
Competition

The fourth of our five forces is competition, by which I mean competition for the attention share of information seekers. Competition can be bad if users bypass libraries in their information gathering and rely on inferior sources, which can have a negative impact on library funding. However, competition might also benefit scholarship, as more sources of information are brought together in one place. Some competition threatens to put us out of work. Some helps us do our jobs better.

I've chosen three competitors for profiling. One was named in Perceptions of Libraries as the point where 84% of researchers start their quest for information. Not a single person surveyed reported starting their search on a library web site (OCLC, 2010). Our competitor? The Internet, of course.

This next is an example of what can be a good, even helpful competitor. If an entity such as OAIster adds 25 million records for open access digital resources to the place where I search for books and journals, that entity is my friend. [9] The same goes for ArchiveGrid, which adds a million records to the world's biggest bibliographic database. [10] This competitor would be aggregators.

Our third competitor puts up some really big numbers each year, such as 8 billion – that's dollars of revenue from US scholarly publishing. Two billion of that comes from subscriptions sold overseas. No wonder this competitor – publishers, obviously – doesn't want us to have the right to supply copies of articles to libraries in other countries (Association of American Publishers, n.d.). My feeling is that if publishers spent as much time providing convenient, affordable end-user access to their resources as they do trying to make ILL lending from e-journals completely onerous, they could double their take and we'd be happy for our patrons who were being served so well. But don't hold your breath. We'll be printing and rescanning e-journal articles for a while more, I'm afraid.

Copyright

The last of our five forces is the really big train hurtling down the track: copyright. As we learned from that Cloud Library Project described above, each of the top-tier Association of Research Libraries members owns about 3.5 million print volumes and there is typically a 30% overlap between that ARL library and HathiTrust (Malpas, 2011). For example, over 700,000 titles were held by both NYU (print) and HathiTrust (bytes). If copyright permitted and NYU could provide full access to all 700,000 titles, NYU could divest itself of the corresponding print volumes and save 55,000 assignable square feet of space and as much as $3 million in cost avoidance for storing those volumes (Malpas, 2011) – roughly what Courant's $4.26 per volume per year implies for 700,000 volumes. The space saved would be more than twice the size of a typical learning commons. If every library could do that, libraries and scholarship would be transformed. ILL might finally go out of business.

But that's not likely to happen anytime soon. The Google Book settlement – as imperfect as it was, still our best chance to make that massive digitized corpus widely available – is pretty much dead. New copyright legislation may come along and ease the problem, or help librarians and publishers find some common ground.
And the Chicago Cubs, who last won the World Series just four years before the Titanic sank, could become world champions again, too. Librarians can push the envelope and take any privileges not expressly forbidden by copyright law until they're made to stop, what my colleagues and I call "running until tackled." The University of Michigan's orphan works gambit is an excellent example of this. Brewster Kahle and the Internet Archive amassing their own print repository of books may be another, as one suspects that part of their strategy is to posit some relationship between what they can do with their digital copy if they also happen to own a print copy (Wohlsen, 2011). I certainly hope so.

Predictions and Conclusions

Now it's time to make a few modest predictions about the way these five forces will actually affect the way we do resource sharing in the next five years.

First, libraries will not rush into unloading print. We've already seen how perfect circumstances had to be before libraries jettisoned print back files of JSTOR titles. The rest will be harder and take longer. Some libraries in extreme need are already withdrawing print but others won't do so as long as there is any good reason not to. Print will stay, accessible through ILL.

"Managed scarcity" will happen within the next five years only on a small scale. We simply don't have enough experience to reduce collections to the minimal number of copies needed to support the entire system. It's too uncertain how increased demand for fewer copies will affect the condition of the physical item. It is clear that if we throw stuff away and then want it back, it's going to be just too darned bad.

Groups like the Western Regional Storage Trust will do a great job of sharing preservation commitments with others. Various libraries will assume responsibility for different parts of the collective collection. What will continue to elude us is a business model that would allow consumers to ditch print and buy subscription access to stored materials. I think that's a shame and a golden opportunity lost, or at least deferred.

I'm convinced that having full-text indexing of HathiTrust in WorldCat, right next to that "find it in a library" button, is going to drive users to seek print copies of works in the digitized corpus. It's certainly how I behave, and I'm not all that peculiar.

Institutions will continue to build print repositories, shared and otherwise, but will be extremely careful about what they place there. You won't see duplicates or JSTOR titles. More thought will be given to subject matter and past patterns of use. Filling these facilities will become quite a deliberate process.

Librarians will let go of old fears and boundaries and push for as much access as the law will allow, and make gray areas work for them instead of against them. I have a (perhaps irrational) belief that "orphan works" is an area where publishers, librarians, authors, lawyers and scholars can come together and find a sensible solution that is fair to all parties while promoting research and creation of new knowledge. This seems doable, and the potential impact could be massive.

My last prediction is that the Cubs will finally win the World Series – certainly before ILL as we know it ever has reason to go out of business.
But that could still lie a long way into the future.

ILL Word Cloud

Finally, I made a word cloud out of the titles and abstracts of papers presented at the 12th IFLA Interlending and Document Supply Conference. It contains ideas and issues discussed in the conference presentations. No matter how closely I look, I can't find a discouraging word about ILL. No "demise." No "stop." I found the word "barriers" in the upper right, but just above it are the words "allow lending across." I love the fact that "sharing" is almost as big as "libraries." If I have one regret, it's that "user" isn't a little bigger. But it's pretty big. Certainly bigger than "barriers."

We've gotten very good at what we do. The ideas expressed throughout this conference prove that we're all striving to be better still. Our libraries and our users need that from us. Five years from now, they're still going to need us. The data pile says as much. And I'm going to go out on a limb here and guarantee it.

Notes

1. 7th Interlending and Document Supply International Conference, 1-5 October 2001, Ljubljana, Slovenia. "Providing access through co-operation". Official program.
2. Data obtained from ARL Statistics 1998-1999 and ARL Statistics 2008-2009, Association of Research Libraries, Washington, D.C., available at www.arl.org/bm~doc/1998-99arlstats.pdf and www.arl.org/bm~doc/arlstat09.pdf, respectively.
3. Data obtained from OCLC Annual Report 2000-2001, available online at http://library.oclc.org/cdm4/item_viewer.php?CISOROOT=/p15003coll7&CISOPTR=30&CISOBOX=1&REC=7, and OCLC Annual Report 2009-2010, OCLC, Inc., Dublin, Ohio.
4. "JSTOR by the numbers", JSTOR Web site, http://about.jstor.org/about-us/jstor-numbers. Accessed December 7, 2011.
5. Interview with Constance Malpas, September 7, 2011. For background information see www.oclc.org/research/activities/policy/default.htm.
6. See the Western Regional Storage Trust web site for more information on the WEST Project, www.cdlib.org/west.
7. Data obtained from HathiTrust web site at www.hathitrust.org. Accessed December 8, 2011.
8. Data obtained from HathiTrust web site at www.hathitrust.org. Accessed December 8, 2011.
9. For additional information on the OAIster database, see the web site at www.oclc.org/oaister/.
10. Additional information on ArchiveGrid is available at www.oclc.org/research/activities/archivegrid/default.htm.

References

Association of American Publishers (n.d.), "Learn more about scholarly information", available at: http://publishers.org/psp/learnmore/ (accessed December 8, 2011).
Authors Guild (2011a), "Authors Guild, Australian Society of Authors, Quebec Writers Union sue five U.S. universities", 12 September, available at: www.authorsguild.org/advocacy/articles/authors-3.html.
Authors Guild (2011b), "Found one! We reunite an author with an 'orphaned work'", body of article and comment by Stephen Bell, 12 September, available at: www.authorsguild.org/advocacy/articles/found-one--we-re-unite-an.html.
Courant, P. N. and Nielsen, M. "B." (2010), "On the cost of keeping a book", The idea of order: transforming research collections for 21st century scholarship, CLIR Publication 147, Council on Library and Information Resources, Washington, D.C., pp. 81-105.
JSTOR (2011), "JSTOR – free access to early journal content, and serving 'unaffiliated' users", [press release] 7 September 2011, http://about.jstor.org/news-events/news/jstor%E2%80%93free-access-early-journal-content (accessed December 7, 2011).
Kyrillidou, M. and Morris, S. (2011), ARL Statistics 2008-2009, Association of Research Libraries, Washington, DC, available at: www.arl.org/bm~doc/arlstat09.pdf.
Lawrence, S. R., Connaway, L. and Brigham, K. H. (2001), "Life cycle costs of library collections: creation of effective performance and cost metrics for library resources", College and Research Libraries, Vol. 62 No. 6, pp. 541-553.
Look, H. (2010), "Mass digitization: analyzing online versus print usage at a large academic research library", poster presented at Association of Research Libraries Leadership and Career Development, 26 June 2010, Washington, D.C., available at: www.arl.org/bm~doc/LookPoster.pdf.
Malpas, C. (2011), Cloud-sourcing research collections: managing print in the mass-digitized library environment, OCLC Research, Dublin, Ohio, available at: www.oclc.org/research/publications/library/2011/2011-01.pdf.
MLibrary News (2011), "MLibrary launches projects to identify orphan works", 16 May, available at: www.lib.umich.edu/marketing-and-communications/news/mlibrary-launches-project-identify-orphan-works.
OCLC, Inc. (2010), Perceptions of libraries 2010: context and community, OCLC, Dublin, Ohio, available at: www.oclc.org/reports/2010perceptions.htm.
OCLC, Inc. (2011a), "OCLC and Ingram to offer new option for access to e-books", [press release] 11 April 2011, www.oclc.org/news/releases/2011/201116.htm (accessed December 7, 2011).
OCLC, Inc. (2011b), "HathiTrust full-text index to be integrated into OCLC services, making content from this important collection easily discoverable", [press release] 7 September 2011, available at: www.oclc.es/news/releases/2011/201150.htm.
OhioLINK Collection Building Task Force, Gammon, J. and O'Neill, E. T. (2011), OhioLINK-OCLC collection and circulation analysis project 2011, OCLC Research, Dublin, Ohio, available at: www.oclc.org/research/publications/library/2011/2011-06r.htm.
O'Neill, E. T. (2009), "OhioLink collection analysis project: preliminary analysis", presented at RLG Programs Annual Partnership Meeting, Philadelphia, Pennsylvania, 2 June 2009, available at: www.oclc.org/research/events/2009-06-02i.pdf.
Payne, L. (2007), Library storage facilities and the future of print collections in North America, report commissioned by OCLC Programs and Research, Dublin, Ohio, available at: www.oclc.org/research/publications/library/2007/2007-01.pdf.
Purcell, K. (2011), E-reader ownership doubles in six months: adoption rate of e-readers surges ahead of tablet computers, Pew Internet & American Life Project, Washington, D.C., available at: www.pewinternet.org/~/media/Files/Reports/2011/PIP_eReader_Tablet.pdf.
Rapp, D. (2011), "Internet Archive tests new ebook lending waters: in-library, and license-free", LibraryJournal.com, 2 March, available at: www.libraryjournal.com/lj/home/889508-264/internet_archive_tests_new_ebook.html.csp (accessed December 7, 2011).
Wohlsen, M. (2011), "Brewster Kahle, Internet archivist, seeks 1 of every book ever written", Huffington Post, 1 August, available at: www.huffingtonpost.com/2011/08/01/internet-archivist-seeks-_n_914860.html (accessed December 8, 2011).
Young, J. R. (2011), "U of Michigan tests murky waters of copyright law by offering digital access to some 'orphan' books", Chronicle of Higher Education, 23 June, available at: chronicle.com/blogs/wiredcampus/u-of-michigan-tests-murky-waters-of-copyright-law-by-offering-digital-access-to-some-orphan-books/31946.

About the Author

Dennis Massie is Program Officer at OCLC Research. He can be reached at massied@oclc.org.

work_cq7ojjaofzd5dga7x5vngqvfxi ----

doi:10.1016/j.serrev.2005.02.006

Serial Conversations
An Interview with Diane Hillmann and Frieda Rosenberg
Jian Wang, Contributor; Bonnie Parks, Column Editor

Wang is Serials Cataloger, Branford P. Millar Library, Portland State University, Portland, OR 97207-1151, USA; e-mail: jian@pdx.edu.
OCLC recently announced a plan to implement MARC 21 Format for Holdings Data (MFHD) and invited holdings experts Frieda Rosenberg and Diane Hillmann to serve as advisors and to aid in the implementation process. In December 2004, Jian Wang interviewed Rosenberg and Hillmann. They discuss their longtime involvement with the holdings standard and provide interesting perspectives on the issues, challenges, and benefits for the constituencies (libraries, the serials community, system vendors, and bibliographic utilities) involved with and responsible for implementing and using MFHD. Serials Review 2005; xx:xxx-xxx.

The term "MARC 21 Format for Holdings Data" (MFHD) is no longer a strange name to most librarians, but how it is understood and practiced by the library community varies. To some, MFHD is the established holdings standard used by libraries in managing serial publications in a standardized and consistent manner. To others, it is still a vague concept with little application in local use. I was honored to be able to interview two well-known holdings experts, Diane Hillmann and Frieda Rosenberg, to discuss serials holdings related issues with a focus on MFHD. Diane Hillmann is the metadata specialist, National Science Digital Library at Cornell University. Besides her expertise in metadata, she is also one of the pioneers in the development of holdings standards. Frieda Rosenberg is head of serials cataloging for the University of North Carolina (UNC) at Chapel Hill. She is also known as the "mother of serials holdings" because of her numerous workshops and publications in the field.

Professional Questions

Jian Wang (JW): What initially sparked your interest in serials holdings/holdings standards?

Diane Hillmann (DH): I was a law librarian at Cornell in technical services from 1977-1995, so I was interested in both serials and non-serials holdings. Law libraries traditionally have had the most creatively misbehaved publication patterns, and it was the law community that developed the understanding of "continuing resources" that eventually spread to other libraries.

Frieda Rosenberg (FR): Ironically, my interest began in the late seventies, when, after seven years as a paraprofessional turning out catalog cards by both typewriter and terminal keyboard, I moved to North Carolina, went to library school in Chapel Hill, and worked as a volunteer at the information desk in a local university library. The coordinator told me that for serials I should steer people toward the microfiche holdings list rather than to the card catalog. I felt ambivalent about that (remembering previous efforts at producing cards!). I wondered what could be done, if holdings were all important, to bring cataloging and holdings together. As I finished my library degree in 1978 and actually got a job as a serials cataloger at the UNC Library (where the same separation prevailed), I noticed even more files of holdings: the Kardex, the binding records, the serials printout, the microfiche and a separate card file called the Srec (serial record) - and this was just the serials department's portion of all existing serials holdings files!
As standards began to arrive in the next few years along with online catalogs, it began to dawn on me that the holdings needed for so many purposes would be more efficient in one place, but only if they were able to serve adequately for those purposes - and that was what standards, plus online access, could help to achieve.

JW: Frieda, you said, "Holdings are at the hub of library serials use and serials management, just as central as the bibliographic record."1 Why is that?

FR: Now I can say, "Together with the bibliographic record, holdings are at the hub of library serials use," because the resource is all the richer when the bibliographic and holdings records are finally united. But the experience that I described above showed me how important holdings were even in alphabetical title lists without a lot of bibliographic information. Our physical "com-fiche" (computer output microfiche) list was sent all over campus and the state. The reference desks in each branch, as well as in our main library, were extremely active users of the list. Serials management - check-in, binding, inventory, preservation, interlibrary loan, circulation, "hooks to holdings" or even manual notations of holdings in printed periodical indexes - all these processes involved holdings and contributed their own holdings data to the mix. In an integrated system they still do, though we still haven't managed to run all these operations off of one file.

JW: When was MFHD first introduced? What was the driver in the development of this holdings standard? Why has it taken so long for MFHD to be accepted in practice?

DH: The MARC Format for Holdings was developed in the mid-1980s. I wasn't involved with MARC development that early (I began in 1988), so I'm not entirely sure what the driving force behind the development was, but I suspect it was union listing. I believe that one reason it took so long for the holdings standard to be implemented in libraries was that it was designated as a "draft" for a long time, probably almost fifteen years, even though it was relatively stable long before that. Also, it was complex and heavily encoded, even for a MARC standard. Very few people understood its power and potential sufficiently to attempt to use it, and the library management system vendors were very reluctant to be at the bleeding edge of development. VTLS was the only integrated library system with full MFHD capability for many years, and consequently they contributed significantly to its development. We owe them a great deal.

FR: The MARC Format for Holdings Data began as a project within the Association of Southeastern Research Libraries (ASERL). Eight ASERL libraries began in the very early eighties to develop a way to communicate holdings data by computer. Eventually the Library of Congress commissioned them to develop their new standard as a MARC Format, and it became the MARC Format for Holdings and Locations, later USMARC and finally the MARC 21 Format for Holdings Data. So, unlike the bibliographic standard, an LC development, and the holdings display standard, developed by ANSI Z39 subcommittees, MFHD was inspired and created through the efforts of libraries working together. Nonetheless, it has been slow in both development and implementation. The format got a reputation for difficulty, so much so that some features (such as expansion and compression) barely exist in the field even today.
It is a standard for communication, so it cannot in and of itself guarantee standard data, though it certainly helps encourage it. All holdings standards were harder to implement than bibliographic standards because, in the minds of many, holdings are considered local data and thus up to each individual library, so that adoption of standards seems like a loss of local control. Furthermore, the sheer bulk of this free-form, legacy data in large libraries, its existence at different levels of granularity and in different forms suiting a variety of functions were all deterrents to standardization.

JW: What are the major reasons to implement MARC Holdings?

DH: I believe we're at a point where that question shouldn't need to be asked. Those of us old enough to remember when the bibliographic formats were new remember that there were similar questions asked before everyone fully understood how essential standard data were for libraries sharing data amongst themselves and investing heavily in their own data in an environment where systems change and data must migrate from one system to another. Unless you believe that there is some value in going it alone, and bowing out of the incredible data sharing infrastructure that makes libraries in this country a model of common sense collaboration, you need to implement MARC Holdings. Nobody can afford not to implement—that train has left the station.

FR: Using MARC Holdings makes more sense than ever now. It has been adopted by all the major integrated library systems and is about to be adopted by OCLC as the basis of its local data record. In some cases the ILS (integrated library system), or OCLC, will be able to map your data into MARC, so you will receive a database of holdings which will be compatible with new systems, new versions of your present system, and computers accessing your data remotely. What a great benefit! Acquiring or adding publication patterns enables you to predict serial receipts and saves you check-in labor. Both patterns and data save you costs, enable you to share records and acquire records from others, and then multiply these benefits across the library community as other libraries share information with you.

JW: What challenges or difficulties have libraries experienced in the actual implementation process?

DH: The biggest challenge has been the inclination of many library systems vendors to implement the standards in a proprietary way, emphasizing interfaces that protected library staffs from the horrors of encoding. Some sort of interface for check-in staff (who may be students or part-timers) is very necessary, but librarians and managers must understand what sits below those interfaces and be able to interact with and understand the coded data. I remember the days when systems developers were convinced that librarians would never be able to deal with numeric field tags and coded subfields. We think that's hilarious now, but it's essentially the attitude that is hindering the full implementation of MARC Holdings.

FR: Developing onsite knowledge of MFHD takes some time and effort. Leadership is needed in order to create the necessary training. Administrative support is essential for these priorities. If holdings work is shared among various groups of staff engaged in different activities, their buy-in and their training is a crucial foundation for the task of developing the holdings database.
We'd like to be able to say that you are guaranteed smooth sailing once you have this database, but since systems vary widely in their accommodation of the format and its functionality, migration between systems may still offer some setbacks and risks. This is something we need to work on.

JW: What advice or suggestions would you offer libraries that are thinking about implementing MFHD? What should libraries consider before they make that decision?

DH: I think the question is not "if" but "when" and "how." CONSER provides great training for libraries in holdings, and good documentation. Librarians should approach this issue the same way they do anything else new: learn, plan, implement. There are libraries that have already done this and are happy to help others and to pass on their experience to new implementers. No rocket science here, really!

FR: Look at the considerations in the last paragraph: your library's human resources and administrative support for intensive training and the creation and maintenance of the data, at whatever detail you can manage. Look at your data, too: it is easiest to map if it is or can be delimited, categorized, and labeled. If it can't, it's apt to be still mappable to textual holdings. Develop as much expertise as you can. Visit other libraries. Read the literature; for instance, the NASIGuide to Holdings is complete and at this writing will soon be available from the NASIG (North American Serials Interest Group) Web site. If possible, attend one of the workshops available on MFHD. Include MFHD in the discussions with prospective vendors and include specific detail in your query. For example: do you (1) support the current edition of MFHD for all types of material; (2) support encoding for base volumes, supplements, and indexes; (3) support the creation and maintenance of paired 853 and 863 fields; (4) support all subfields of the publication pattern data; and (5) allow receipt of materials according to input publication patterns? Ask for demonstrations of the features. Discuss your particular data with vendors. And when you finally choose one, test by submitting records for trial conversion.

JW: What do you see as the benefits of standardized holdings data for the serials community in a global environment?

DH: We have far more experience in this arena than many of our potential partners in the publishing and serials service industries, and I think we shouldn't be shy about sharing that experience. Standard data are something that libraries believe in fervently, and we've built up a significant economic infrastructure around the sharing of this data. I hear calls for "simplification" among some of these partners, and I find myself a bit mystified by some of this. Recall that it was not libraries that developed the complex publications that required complex standards to record, it was certainly publishers! I think it's also sometimes forgotten that standards like MFHD are designed for machine-to-machine communication, not human-to-human. Computers deal with much more complex data than encoded holdings even before breakfast.

FR: Accrual of benefits tends to be circular. As more libraries implement the standard, everything improves: the data, the standard itself, the implementation of the standard in systems in the market, and the availability of shared archives and templates in systems and utilities, enabling further rounds of improvement.
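[Ed. note: To make the "paired fields" vocabulary in Rosenberg's vendor checklist concrete, here is a simplified, hypothetical MFHD fragment. The field pairing follows the standard, but the indicators, captions, and values are invented for illustration. An 853 field carries the captions and the publication pattern, and is linked by subfield $8 to an 863 field carrying the enumeration and chronology actually held:

853 20 $8 1 $a v. $b no. $u 4 $v r $i (year) $j (season)
863 40 $8 1.1 $a 1-15 $i 1990-2004

Read together, the pair says that issues are captioned "v." and "no.," that four numbers make up a volume and numbering restarts with each volume, and that chronology is recorded as year and season; the 863 then records a compressed run of v. 1 (1990) through v. 15 (2004). A system that supports all subfields of the publication pattern data, as the checklist asks bidders to confirm, can use the 853 to predict the next expected issue at check-in.]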
JW: The CONSER Publication Pattern and Holdings Initiative was a major step forward in promoting the use of MFHD.2 What's the idea behind this initiative? What challenges were involved in carrying out the experiment to add publication pattern data to CONSER records in OCLC?

DH: I was there for that one so I'm happy to spill the beans. I attended some of the very early meetings back in the 1980s and early 1990s about sharing publication patterns. I'd gotten a bit frustrated by the lack of momentum in implementing the standard and had snapped at far too many people who opined that holdings were only local data, after all. I approached Jean Hirons on this issue, and there was a historic lunch at ALA at which Jean, Linda Miller, and I hatched the Publication Pattern Initiative. We wrote up a charge and got going convincing the rest of the serials community that the time was ripe for this kind of effort. Thankfully, Rich Greene at OCLC shared our vision and helped us figure out how to jump-start the effort, using local fields in CONSER records and a file of records from Harvard, and we were in business.

FR: The Pattern and Holdings Initiative grew out of the realization that although a specific library's holdings might be local data, looked at another way they were a subset of "universal" holdings, which were holdings as they came from the publisher. Diane Hillmann, who first suggested the project, wanted to harness these universal holdings, or publication history, for each title as (1) an archive of information for the larger world and (2) a database for all libraries to draw on for assessing their holdings, creating local holdings, and informing their users. Major challenges in designing the experiment were identifying which data, both retrospective and current, would be most useful in a shared database. For example, how important are patterns to retrospective data? Deciding how to deal with limited space within the OCLC bibliographic record in the old platform was another challenge we no longer need to face. Jump-starting our work with a data load from the Harvard University Library database made the process much clearer and showed that the idea would really work.

JW: What impact, if any, has the CONSER project had on libraries that are still not ready to implement MARC 21 for holdings?

DH: I hope it has lit a warm little blaze under their desk chairs! Seriously, though, even libraries that knew they couldn't implement right away have been instrumental in bringing some of the library system vendors around to fully implement MFHD, and we couldn't have done it without their cooperation.

FR: If Wen-ying Lu also is participating in this interview series, she will be the best person to answer this question! She and Paul Moeller recently conducted a survey on serial holdings which you may have seen on several discussion lists. They asked, among other questions, whether pattern fields now displaying at the bottom of a large number of OCLC serial records had at least attracted the notice of many libraries. The results of this survey should be out soon. The fact that two system vendors have developed loaders for the MARC data should definitely attract some of their reluctant customers, who may be considering predictive check-in and would benefit from some ready-made patterns. We would also like to see more loaders developed rapidly. [Ed. note: By pure coincidence, their survey results are published in this issue of Serials Review.]
JW: One of the goals of the CONSER project is to work with ILS vendors to develop systems that support MARC holdings. What is the current state of MFHD compliance by ILS vendors?

DH: Much better than it was in the beginning. Some vendors made false starts, hoping to implement in ways that would give them competitive advantage and an easier interface, but most of them have come round to understanding that it's the ability to exchange full standard data that's at the core of the effort, and sexy proprietary interfaces won't sell if they get in the way of that goal.

FR: It is mixed but improving. It would be difficult for any vendor to keep up with the changes (really improvements) in the format designed to predict more serials accurately. Implementations are some years behind what the format contains; however, we have to remember that the format does not tell vendors how to implement its provisions. Instead, change happens as vendors are challenged to accommodate incoming MARC data. If the system isn't adequate to handle it, the customer will probably not be satisfied with a "down-migration." I think this, along with competition in general, is the greatest spur to better implementations.

JW: Diane, you noted in a NASIG presentation that Z39.71 is to MFHD as AACR2 is to MARC bibliographic standards, which marvelously illustrates the two tracks of holdings standards.3 Could you elaborate a bit more on the relationships between Z39.71 and MFHD? How different is the current standard Z39.71 from the previous standards such as Z39.42, Z39.44, and Z39.57?

DH: The earlier standards maintained a somewhat artificial separation between serials and non-serials, which was coming undone as MFHD was developing, and digital resources finished the job. Z39.71 brought the serial and non-serial standards together into one standard. It is interesting to note that the late (and sorely missed) Ellen Rappaport, who was working for Albany Law School at the time, was co-chair of the NISO committee that developed the standard. She wrote an excellent summary of its history and highlights for her law library colleagues, available at http://www.aallnet.org/sis/tssis/tsll/26-0304/serliss.htm (accessed February 13, 2005). The Z39.71 standard contains most of the context and definitions crucial to understanding MFHD, and the MARC standard provides the "packaging" that supports the sharing of holdings data created according to Z39.71. They are very intertwined at the conceptual level, certainly.

JW: Frieda, you have played a key role in developing the CONSER Guidelines for Input of Captions/Patterns and Holdings Data, the Serials Holdings Workshop course materials for the Serials Cataloging Cooperative Training Program, and Holdings Guidelines for NASIG. What are some of the issues that you have been dealing with when writing the documentation? How do you think librarians and library staff benefit from using these educational materials?

FR: Each of those guides was designed for a different user group with different objectives. The guidelines are in the initiative's participant manual, and solely written for those who input 891 fields (embedding 853/863 fields, the basic "paired fields" of the MFHD) into OCLC bibliographic records.
They would probably bewilder someone unfamiliar with the special aims of that project, and they leave out all sorts of information that would be necessary in creating local holdings, since the 891 fields are meant to contain "universal holdings" or "publication history" fields. The holdings workshop, within its time constraints, is designed to give an overview and introduction to the subject of local serial holdings, along with some concrete guidance to get people started creating holdings records. It does answer some "why" questions and has appendices, which tackle a few subjects that the workshop can't cover in depth. One of these appendices is a brief code-by-code handbook also available on the Web (http://www.lib.unc.edu/cat/mfh/mfhhandbook.html, accessed February 13, 2005). The NASIGuide, which should be available by the time this issue is released, is a much more leisurely and in-depth survey of the MFHD. It tries to cover many more issues, such as migration and conversion of specific fields, than previous guides. Where interpretations have differed in the past, the NASIGuide will discuss them at length and give the reason why one interpretation has prevailed or is favored. I hope that not only librarians and library staff, but also system vendors and bibliographic utilities can take advantage of any of these documents and feel on more solid ground in an arena of competing demands.

JW: We know that OCLC is implementing MFHD; you both have been invited to serve as advisors to aid in the implementation. What sorts of results do you envision with this project?

DH: I was very impressed with the group at OCLC that is working on their MFHD implementation. They went through the standards documentation with a fine-toothed comb and asked us a great number of really good questions. Their first task is translating their union list data, and I think they've found the right balance in approaching that task.

FR: We are understandably elated by the whole idea of the LDR (OCLC's Local Data Record) finally being MARC-based. We understand that OCLC is taking this step because they are receiving better data from many libraries and no longer find it acceptable to use only part of it. The most revolutionary benefit, however, will be that OCLC will convert non-MARC records (far more reasonably, we feel, than a library could do on its own) and the library will have the benefit of that MARC data for further use. Even libraries not intending to union list that data could have it processed for migration or other purposes. It would be impossible to take such a giant step forward without the willing cooperation of our largest bibliographic utility, which also hosts the CONSER database and the Publication Pattern Initiative data.

JW: Diane, you currently chair the CONSER Task Force to Explore the Use of a Universal Holdings Record. What is a universal holdings record? How is it different from "publication history"? Why do you think the concept of universal holdings is important in today's shared environment for holdings records?

DH: In late summer 2001, Ellen Rappaport and I floated a short discussion paper beginning to define a universal holdings record, based on the notion that what was published for a title was important data bibliographically and should be represented in a holdings record (available at http://content.nsdl.org/dih1/PubPatt/Universal_holdings_statement.html, accessed February 13, 2005).
Once the Publication Pattern Initiative began, the Task Force to Explore the Use of a Universal Holdings Record was charged. One of our first tasks was to find a new name for the "thing" we were talking about because apparently the one Ellen and I chose was confusing people. The task force finally settled on "publication history record" after some discussion sessions with groups of interested librarians, and it seems to have stuck. But of course, the task force still has the old name! I think what confused people at first was this notion that holdings were institution-based, but the publication history record is really part of the complete bibliographic description, conceptually speaking. But if you think about it, what it provides is a template against which holdings can be matched and compared. From that basis, a display relating holdings within an institution, among versions (digital, print, microform) can be constructed. With a publication history record with a currently maintained publication pattern, you also have the basis to exchange information on newly published or available issues and volumes, as well as almost enough detail to construct a standard citation for an article. It is a really powerful underpinning for many of the data exchange challenges we're struggling with today, and the best thing is that increasing numbers of libraries are committing to using and maintaining it in common with others. We are building the same kind of shared environment that we've had for almost forty years with bibliographic data, with the same strengths and infrastructure that did the job for us then.

JW: The term "serial super record" came up at the 2004 ALA Annual Meeting last June. Could you tell me a bit more about this new record model? How does this type of record fit into the FRBR concept and how does it relate to the "holdings record"?

DH: Frieda and I have been circulating a short paper on this for some time (see http://www.lib.unc.edu/cat/mfh/serials_approach_frbr.pdf, accessed February 13, 2005), but this fall an article in LRTS by Kristin Antelman came out that really supports our notion, with some excellent research and summarization of various approaches included.4 The "super-record" operates to a great extent as a FRBR work record in ways that make far more sense in a serials context than an authority record does. The best part of it is that most of the relationship links needed to support such an entity already exist in serials bibliographic records, which suggests that much of the work in creating these records, at least at first, could be done algorithmically. There are still a lot of critical questions to be answered, primarily concerning how these records fit into our current bibliographic universe, how they should be distributed and maintained, etc.
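[Ed. note: Hillmann's remark that much of this work "could be done algorithmically" can be illustrated with a minimal sketch. Rosenberg describes the mechanism below: a looping search that follows the control numbers encoded in 780/785 (preceding/succeeding title) linking fields until the whole family of related titles has been collected. The Python below only illustrates that loop; fetch_record and the record structure are hypothetical, not part of any actual MARC toolkit:

def collect_title_family(start_ctrl_no, fetch_record):
    # fetch_record(ctrl_no) is assumed to return None, or a dict like
    # {"title": "...", "linked_ctrl_nos": [...]}, where linked_ctrl_nos
    # holds the control numbers found in the record's 780/785 fields
    # (776 fields could be included to pull in simultaneous versions).
    family = {}                      # ctrl_no -> title
    to_visit = [start_ctrl_no]
    while to_visit:                  # loop until no unvisited links remain
        ctrl_no = to_visit.pop()
        if ctrl_no in family:        # already seen; guards against cycles
            continue
        record = fetch_record(ctrl_no)
        if record is None:           # link target absent from the database
            continue
        family[ctrl_no] = record["title"]
        to_visit.extend(record["linked_ctrl_nos"])
    return family

The resulting set of related titles could then drive the tree display, or the virtual "super-record" built on the fly, that the task force envisioned.]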
FR: Again, we are delighted that OCLC is also interested in the "super-record." The "super-record" actually stems from a concept first encountered in an article by Melissa Bernhardt (Beck) in Cataloging & Classification Quarterly in 1988.5 The article suggested utilizing the encoded control numbers within 780 and 785 linking fields in searches to create a tree display of related serial titles. Though the article did not discuss holdings in detail, it did suggest that some local holdings information be displayed along with the tree. When the Task Force on the Uses of a Publication History Record, chaired by Diane Hillmann, took up this idea, we used Melissa Beck's concept along with Rahmatollah Fattahi's terminology of a "super" work6 to collocate the related titles for successive entries (780/785) and simultaneous versions (776). The record might be a virtual record created on the fly by a looping search of the appropriate linking field control numbers and titles on each record, continuing until a match was found, and displaying the results in a variety of ways, including graphical displays. Most important for the Initiative was the provision that the publication history record for all successive titles (a MFHD record showing a "perfect" or complete set of volumes and issues) be constructed and displayed as a unit for each format. The concept might be further adapted to local holdings. More elaborate ideas, suggesting some different and more exhaustive ways of attaining this kind of collocation, are coming out of the FRBR task groups as they tackle serial relationships in their discussions.

JW: Diane, you were one of the invited speakers for the 2005 ALA Midwinter Symposium on "Codified Innovations: Data Standards and Their Useful Applications," which brings together collective efforts from systems vendors, standards representatives, and librarians. What specific standards were discussed? What roles does each constituent play in implementing the standards?

DH: I talked about some of the work we've been discussing, and, in addition, there were discussions of ISSN (and other identification standards, as well as OpenURL), standards relevant to electronic resource management, ONIX, ISTC, and dispatch data used by serials vendors and publishers.

JW: Frieda, you gave a workshop titled "Do Holdings Have a Future?" several years ago.7 What is the future for holdings in your view?

FR: I think it is only being realistic for even a die-hard cheerleader for holdings to say that once all present and past serial literature is digitized and readily available online at the issue and article level, local holdings (and surely the local catalog as well) will be only relics, replaced by newer systems of organization of information. Both digitization and the user flight from printed resources are already starting, but they are still gradual processes and reserved for institutions and libraries in the parts of the world that can afford the increased cost of digital materials. For a long time to come there won't be digital access to everything, or the access won't be universal. If we abandon our stored treasury of information instead of finding ways to make it more accessible, we won't be fulfilling the library's mission.

JW: Is there anything else that I haven't asked, but you would like to add?

DH: I think it's important to stress how the work above fits into the larger picture. Libraries have an enviable tradition of metadata sharing, supported by a strong infrastructure.
Building on that base, and moving, as libraries have always done, from the monographic to the serial (and beyond), I think we'll start to see the same kinds of standardization and normalization that we saw in the early days, as shared bibliographic data became the norm in libraries. CONSER was in the forefront of those efforts and continues to provide important leadership now. I remember well the grousing and grumbling of that era, as we moved towards a common understanding of our goals and realized some truly astounding efficiency in the process. We take all that for granted now, so these efforts to expand on that success seem new and different. We somehow need to reassert what we already know to be true—shared data built on standards is cheaper, better, and the only way to go!

FR: I'd like to expand on something related to your second question. That is the increased importance of local item information to online remote searching. Item information conveys the physical (or conceivably virtual) unit in which the sought piece is available. This information is being created separately from holdings and stored in many proprietary formats as textual strings. Transactional information, also proprietary, is added to the items to reveal the status of an item at a particular time. Communication and migration of this information is often problematic. I think that in an ideal library system, the summary holding, physical item information, and uncompressed issue information would be a view of one file obtained through automatic compression and expansion. That may no longer be possible. But since remote communication of information at the more granular level, along with its status, has proven important, what can we do to standardize it within library systems? And I'd like to thank you for some interesting questions.

Notes

1. Frieda Rosenberg, "Do Holdings Have a Future?" Serials Librarian 36, no. 3-4 (1999): 529-539.
2. CONSER Publication Pattern Initiative, http://www.loc.gov/acq/conser/patthold.html (accessed February 14, 2005).
3. NASIG Newsletter 17, no. 3 (2002), http://www.nasig.org/newsletters/newsletters.2002/02sept/02sept_preconference.html (accessed February 14, 2005).
4. Kristin Antelman, "Identifying the Serial Work as a Bibliographic Entity," Library Resources & Technical Services 48, no. 4 (2004): 238.
5. Melissa Bernhardt, "Dealing with Serial Title Changes: Some Theoretical and Practical Considerations," Cataloging & Classification Quarterly 9, no. 2 (1988): 25-39.
6. Rahmatollah Fattahi, "Super Records: An Approach Towards the Description of Works Appearing in Various Manifestations," Library Review 45, no. 4 (1996): 19-29.
7. Rosenberg, "Do Holdings Have a Future?" Presentation at the 13th Annual North American Serials Interest Group Conference, Boulder, Colorado, June 18-21, 1998.

work_bjndyhgmhvabldt2hs6ztz4644 ----

A DESCRIPTIVE STUDY OF STATE-WIDE BIBLIOGRAPHIC DATABASES

STAN GARDNER, B.A., A.M.L.S.

APPROVED: Major Professor; Committee Member; Committee Member; Committee Member; Committee Member; Dean of the College of Library and Information Sciences; Dean of the Robert B. Toulouse School of Graduate Studies

Gardner, Stan, A Descriptive Study of Statewide Bibliographic Databases. Doctor of Philosophy in Library and Information Science, August, 1992, 360 pages, 24 tables, 2 figures, bibliography, 69 titles.
This dissertation has compiled information about statewide bibliographic databases: their format, their cost, the number of titles and records, how they are being used, what kinds of libraries are using such databases in each state, and the effectiveness of those databases. General information about twenty-eight states' bibliographic databases is included in this dissertation. Users in thirteen states responded to a questionnaire surveying the effectiveness of the statewide database in their state. The costs to the individual states vary from zero, where all costs are covered by local funds or Library Services and Construction Act funds, up to 4.4 million dollars. The increase in usage of interlibrary loan is detailed and explained. There has never been an evaluation of the effectiveness of a statewide bibliographic database. This is a descriptive study of statewide bibliographic databases. No other such study appears in library and information science indexes.

A DESCRIPTIVE STUDY OF STATEWIDE BIBLIOGRAPHIC DATABASES

DISSERTATION

Presented to the Graduate Council of the University of North Texas in Partial Fulfillment of the Requirements For the Degree of

DOCTOR OF PHILOSOPHY

By Stan Gardner, B.A., A.M.L.S.

Denton, Texas
August, 1992

Copyright by Stan Gardner 1992

ACKNOWLEDGEMENTS

This study would not have been possible without the support of my wife Katherine G. Ellerton, my parents Mr. and Mrs. C.H. Gardner, and my faculty advisor Dr. Herman Totten. My appreciation to them for the help they have given me.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1: INTRODUCTION TO THE STUDY
Background Information; Statement of the Problem; Purpose of the Study; Significance of the Study; Limitations of the Study; Scope; Research Questions; ENDNOTES

CHAPTER 2: REVIEW OF THE LITERATURE
Alaska; Colorado; Connecticut; Delaware; Georgia; Iowa; Kansas; Louisiana; Maine; Maryland; Mississippi; Missouri; Nebraska; Nevada; New Jersey; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Dakota; Tennessee; Virginia; West Virginia; Wisconsin; Regional Databases; ENDNOTES

CHAPTER 3: METHODOLOGY
Instrumentation; Data Collection; Analysis of Data; Definition of Terms; ENDNOTES

CHAPTER 4: ANALYSIS OF THE DATA
Response (Title of Respondent; Type of Library; Size of Collections; Type of Uses of the Statewide Database; Amount of Time Spent Daily on the State-wide Database; Amount of Time Spent Daily on ILL; Staff Using Database; Dedicated Equipment; Public Access?; Why Not Provide Public Access?; Hardware Problems; Software Problems; Training - Offered and Attended; Training - Adequate or Need Additional Training?; Importance / Quality / Usefulness; Increases or Decreases of Service; ILLs Being Verified Using the State-wide Database; Methods of ILL - Before and After the State-wide Database; ILL Volume Prior to and After the State-wide Database; Helpful Features of Database; Improvements Needed; Provides Needed Information; Priority of State-wide Automation; Selected Comments from Respondents); State Libraries Responses (Selected Responses of State Libraries); ENDNOTES

CHAPTER 5: SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS
Size of Libraries, Effect of Use of Statewide Database; Format of the Statewide Database; Use of Statewide Databases by Libraries; Variety of Uses of the Statewide Database; Communication Methods; Opinions on the Standard Features of Systems; Effect of Statewide Database on Resource Sharing; Strengths and Weaknesses of Statewide Databases; Factors to Consider in Selecting a Statewide Database Vendor; Significance of This Study; Recommendations for Further Study; ENDNOTES

APPENDIX A: STATEWIDE BIBLIOGRAPHIC HOLDINGS DATABASE ASSESSMENT QUESTIONNAIRE
APPENDIX B: QUESTIONS ASKED IN SURVEY SENT TO STATE LIBRARY AUTOMATION OFFICERS
APPENDIX C: USER RESPONSES TO QUESTIONNAIRE BY STATE (Alaska, Connecticut, Delaware, Iowa, Maryland, Missouri, Pennsylvania, North Dakota, Ohio, South Dakota, Tennessee, Wisconsin, West Virginia)
APPENDIX D: RESPONSES FROM STATE LIBRARIES
APPENDIX E: VENDORS' RESPONSES TO MISSOURI'S RFP: AUTO-GRAPHICS, BRODART COMPANY, AND LIBRARY CORPORATION (Proposal and Pricing from Brodart Company; Proposal and Pricing from Auto-Graphics; Proposal and Pricing from Library Corporation)
APPENDIX F: EXAMPLE OF A COST ANALYSIS OF STATEWIDE DATABASES BY FORMAT - MICROFICHE, CD-ROM, ON-LINE, AND OCLC (Cost of State Database on Microfiche; Cost of State Database Online Using OCLC; Cost of State Database Online Using Brodart, Inc.)

BIBLIOGRAPHY

LIST OF TABLES
Tables 1-25; Tables C26-C321 (Appendix C user responses, grouped by state: C26-C48, C49-C70, C71-C93, C94-C116, C117-C139, C140-C161, C162-C183, C184-C206, C207-C229, C230-C252, C253-C275, C276-C298, C299-C321)

LIST OF FIGURES
Figure 1. Vendors of Library Catalogs on CD-ROM

CHAPTER 1
INTRODUCTION TO THE STUDY

For several decades libraries have been concerned with sharing resources between them. The primary method for doing this has been Interlibrary Loan (ILL). In order to borrow materials effectively, libraries need to know what other libraries have. Many methods have been used to identify materials, such as union lists between cooperating libraries and compiled automated bibliographic databases extending over political and geographical regions. Two common forms of bibliographic databases are the On-line Computer Library Center (OCLC) and a state-wide database developed by individual states. There has never been an evaluation of the effectiveness of a state-wide bibliographic database prior to this study. In fact, there has not been even a simple compilation of state-wide bibliographic databases. There are no entries in the library and information science indexes about states that have developed state-wide bibliographic databases.

This study compiled information about state-wide bibliographic databases, their format, their cost, how they are being used, and what kinds of libraries are using the database in each state that has a state-wide database, and asked users of those bibliographic databases if they were effective. Appendix A and B contain copies of the questionnaires used to gather this information. Appendix C gives a compilation of the responses of the individual users in each state. Appendix D shows the responses of the various state libraries.

Background Information

During the past decade, many states have experimented with the development of state-wide bibliographic databases. A state-wide bibliographic database is defined as a file of machine-readable bibliographic records that is a comprehensive source of the bibliographic holdings of libraries within the political and geographical boundaries of a state.1

Illinois and West Virginia started early with state-wide bibliographic databases by creating interfacing on-line systems. These databases included records from public, college, and special libraries, and were accessible to users in libraries and others with microcomputers and modems.
Eighteen states creating such state-wide bibliographic holdings databases during the past few years have been utilizing a more recently created format, that is, Compact Disc - Read Only Memory (CD-ROM) technology. Only eight vendors at this time offer CD-ROM public access catalogs2 or Compact Disc - Public Access Catalogs (CD-PACs).

Figure 1. Vendors of Library Catalogs on CD-ROM (Company - CD-PAC)
Auto-Graphics - Impact
Brodart - LePac
Gaylord Co. - SuperCat
General Research Corp. - LaserGuide
Library Corp. - Intelligent Catalog
Library Systems & Services - LOANet
Marcive - Marcive/PAC
Utlas Int. - CD-CAT

* Note: Gaylord and LSSI split and created two separate CD products in 1989. In 1991, Follett and LSSI contracted with each other to develop and market LOANet.

However, Marcive and Utlas have never successfully bid for a statewide database contract, as reported by the twenty-eight state libraries responding to the questionnaire used in this study. Brodart, Inc. introduced the first CD-PAC in the summer of 1985. Brodart's "LePac" system and Auto-Graphics' "Impact" are used most often, with seven states using Brodart and four states using Auto-Graphics, out of the eighteen states that currently have CD-PACs.

Methods of providing access to a state-wide bibliographic database include on-line systems, microforms, and CD-ROM optical discs. Magnetic tape and magnetic disk may be used in the future, but are not currently used by any state as a means of access to a state-wide bibliographic database.

Microforms are considered the least desirable form of a state-wide bibliographic database. They can provide the same information at a fraction of the cost, but there is a major disadvantage: the search capability of microforms leaves a great deal to be desired. Microforms are sequential in nature, so that one has to go through many pages in order to arrive at the specific page needed. Microforms are an extension of printed catalogs; a user physically has to handle the plastic film to find the specific range in author or title, depending on how the microform is printed. It is impossible to access multiple records automatically by searching key terms.

On-line systems and CD-ROM share many of the same advantages in retrieval of bibliographic information. On-line systems have an advantage in that information is updated continually, not in batch mode over a period of months. Illinois is an example of a state that has an on-line system. The major argument against using an on-line system is cost, i.e., telecommunications, equipment, and personnel. Part of this research studied the difference between formats and showed the extent that cost factors play in states selecting their delivery system. A secondary disadvantage of an on-line system is that when the phone lines are down
They feel that their money was spent to purchase materials for their patrons and other libraries should do the same.3 Statement of the Problem No evaluation of state-wide bibliographic databases exists. There is nothing in print on which format (microforms, CD-ROM, on-line systems, etc.) were selected by the states that have state-wide bibliographic databases, nor the criteria for the selection of a specific format in each state with a state-wide bibliographic database. Currently, there is nothing published which lists the states that have 6 developed state-wide bibliographic databases. Purpose of the Study The purpose of this study was to conduct an assessment of state-wide bibliographic databases and to report the impact of their usage based upon the information supplied from a sample of the librarians in each state who use the databases. In order to accomplish this a description was made of each state's database and its configuration. The description consisted of the number of libraries included in each state's database, the organization of the data, and the types of data included. In addition, samples of inter- library loan statistics were collected from each state that uses such a system. Significance of the Study Many states feel that sharing resources is important, and state-wide bibliographic databases are a way to accomplish this goal. They feel that sharing resources is important because in today's world it is almost impossible to provide the information requested by a library's various patrons due to the tremendous increase in information 7 available world wide. 4 An evaluation of present state- wide bibliographic databases, since those states that are creating a bibliographic database will expend great amounts of money, time, and effort, is needed. This dissertation is a bench-mark to those states considering creating a state-wide bibliographic database. Those states that currently have such a database will have access to information about other state's bibliographic databases. It can bring attention to aspects of the various databases that may require reevaluation and it also can become a planning tool for improvements. This could be used as part of an interactive dialogue between the state library and the individual libraries using the databases. Data concerning the databases illustrates where perceived problems exist, and could be used by public libraries and the state libraries making decisions regarding the development of state-wide bibliographic databases. This study also included what some vendors of bibliographic databases currently offer in the way of services, backup, and sophistication of retrieval software. Information regarding the impact of such a database on library services in states currently using a state-wide bibliographic database will be useful in determining what formats other states may wish to pursue. 8 Limitations of the Study This study will look only at those states that have bibliographic records in a state-wide bibliographic database. Each state that has a state-wide bibliographic database was asked to supply a random list of library addresses with a contact person who currently uses the database. This random selection of libraries was sent a survey form. The study was limited by the number, style, and accuracy of the responses of those surveys returned. 
Scope The scope of this dissertation was intended to study only those states that have developed a state-wide bibliographic database and where publicly funded libraries are eligible to participate in using the database. Research Questions In order to develop this dissertation, the author addressed these research questions: 1. How are state-wide bibliographic databases used by libraries in each state? (i.e. developing automation 9 for individual libraries, Interlibrary Loan, Optical Public Access Catalogs (OPAC), Cataloging, etc.) 2. What is the impact of a state-wide bibliographic database on resource sharing in each state? 3. What are the strengths and weaknesses of a state-wide bibliographic database? 4. What are the factors that state libraries should consider when selecting a state-wide bibliographic database vendor? 5. What is a way currently being used to select a vendor's product? ENDNOTES 1. Glazer, F. J. "That bibliographic highway in the sky." Library Journal 110, no. 2 (February 1, 1985): 64-67. 2. Bills, L. G., and Helgerson, L. S. , "CD-ROM Public access catalogs: Database creation and maintenance." Library Hi Tech 6, no. 1 (1988) : 67-86. 3. Budd, John, Steven Zink, and Jeanne Voyles. "How Much Will It Cost? Predictable Pricing of ILL Services: An Investigation and a Proposal." RQ 31 (Fall 1991): 70-74. 4. Beaton, Barbara. "Interlibrary Loan Training and Continuing Education Model Statement of Objectives." Q 31 (winter, 1991): 177-184. 10 CHAPTER 2 REVIEW OF THE LITERATURE A search of library literature relevant to the development of state-wide bibliographic databases indicates that little information has been published in this area. Many studies have been published on interlibrary loan systems and their effectiveness, but not relating to a state-wide bibliographic database. A number of states have looked at the possibilities of creating a machine readable database of library holdings, usually within an overall plan for library automation. An on-line search of the ERIC database and a manual search of Library Literature resulted in relevant articles and ERIC research reports. None of the citations found in Dissertation Abstracts and only two items in Library and Information Science Abstracts (LISA) pertained to state-wide or regional bibliographic databases. Many states have considered developing a state-wide bibliographic database in some form. Currently 28 states have produced state-wide bibliographic databases, 18 of those are on CD-ROM. In addition, a number of states have regional databases on CD-ROM or are considering developing 11 12 state-wide bibliographic databases. Twelve states have on- line state-wide bibliographic databases and six still use microforms as their format of choice. The six states that provide microforms also provide either on-line or CD-ROM systems at an additional cost to the local library.** The goals of most of these state-wide bibliographic database projects include at least one of these three goals: (1) to promote resource sharing among libraries within each state; (2) to encourage use of automation on the local level; and (3) to improve the accuracy of bibliographic records created by the individual libraries. To determine the degree to which these goals have been accomplished was a major part of this researcher's effort. The following include those states actively using state-wide bibliographic databases and a brief comment on each. 
Alaska: One of the six states that provide multi-format databases, it can be accessed via CD-ROM, Microfiche, and on-line. The vendor is WLN, all types of libraries use the database, but expenses are shared between federal and local funding sources. The database contains 2.2 million holding records and approximately 1 million titles as of the spring ** State library survey forms compiled by Stan Gardner 1991 and 1992. 13 of 1992. The database was first accessible to libraries in 1985. The primary purpose of the database is resource sharing, the secondary purpose is cataloging. There are 20 libraries contributing records to the statewide database. 1 Colorado: In 1992 the Colorado legislators approved the creation of the "Colorado SuperNet.1" A system of individual library catalogs with a single menu that would be accessible on-line via the InterNet. The number of records and titles that are on this system have not yet been compiled, since it is just now in development. This is an extension of the Colorado Academic Research Library (CARL) system to all libraries in Colorado.2 Connecticut: A long planned project starting with CD-ROM test discs in 1988 and 1989 to the 1990 system of 3 discs supplied by Auto-Graphics, containing 2.04 million titles and 9.6 million holdings. Two hundred and seven libraries are in this project, which has as its primary goal to provide a public access catalog to all the libraries in Connecticut. 3 Delaware: In October of 1990 the first CD-ROM disc was produced and consisted of records from 50 public, academic, private 14 school, and special libraries. There were 386,153 titles on the first disc produced by Brodart. The primary purpose of this database is to improve interlibrary loan. 4 Georgia: This is another of the six states that have multiple formats. Serials are on-line through OCLC serials sub- system. The rest of their database is on Microfiche. This database includes 14 million holdings with 7.8 million titles. Expenses are covered by a combination of federal and local funds. The OCLC serials sub-system was first established in 1988. The primary use of the database is for resource sharing.5 Iowa: Iowa produced the first state-wide bibliographic database on CD-ROM. In 1986 the Iowa state library distributed 2 CD-ROM discs containing their state-wide bibliographic database. The vendor was a small company in Colorado called Blue Bear, Inc.. The Iowa state library originally had planned on developing a COM (Computer On Microfiche) type database, but after talking to the Blue Bear staff they decided on the CD-ROM format. The database was developed as a resource sharing tool, and started with only 32 libraries that had OCLC tapes available. Currently the Iowa database contains 1,5 million records, almost 5 15 million holdings on three discs. The database was and is a LSCA project. In 1991 the Iowa State Library distributed a Request for Proposal, looking for a new vendor for the production of the database, since Blue Bear has decided not to continue in this type of business. 6 Library Corporation was accepted as Iowa's vendor, and will distribute the new database in the summer of 1992.7 Kansas: In 1988 Brodart, Inc. produced the state-wide bibliographic database using Library Service and Construction Act (LSCA) funds. It is used as a tool to support resource sharing (ILL), and consists of approximately 2 million records on two discs. 8 It is provided on both CD-ROM and on Microfiche to those libraries who request it that way. 
9 Louisiana: Started in the 1960's as a Union List which did not contain full bibliographic information, it developed into a statewide database on microfilm. In 1987 LSSI produced the database on 12" videodiscs, and in 1989 the system changed to CD-ROM. There are 1.4 million titles, 4,685,721 holdings on 2 discs, consisting of 53 public libraries, 3 academic libraries, and the State library participating. The primary goal is resource sharing, secondary goals include 16 verification of data, and cataloging of materials. Funding consists of a combination of federal (LSCA), state, and local moneys. 1 0 Maine: Maine produced its first state-wide holdings catalog on CD-ROM in December, 1988 using Auto-Graphics as the vendor. Their three goals are: (a) to facilitate resource sharing; (b) to assist libraries in converting their holdings to machine-readable form by matching against MaineCat and (c) to provide computer-based access to local holdings for a library's own walk-in users (public access). MaineCat has school, public and academic libraries involved. It includes 200 libraries, with 2.5 million holdings and 1.1 million titles. Maine is unusual in the sense that this project has used state funds completely, and no federal funds have been allocated in either its creation or maintenance." In 1991 they published a RFP for a new vendor. Library Corporation received the bid and will distribute the new database in 1992. Maryland: Like many states, Maryland had been working on a state- wide bibliographic database using microfilm since 1975. In 1988 this database was converted to 2 CD-ROM discs using Auto-Graphics software. They have also established an on- 17 line system. However, there are some major defects in the on-line system, such as no Boolean searching. Currently there are 135 public, academic, school and special libraries contributing 2.6 million titles and 6.5 million plus records. The primary goal of this database is to support resource sharing (ILL) throughout the state. 12 Mississippi: Started in 1979, the Mississippi Union Catalog consisted of 40 public libraries using microfilm. In 1985 LSSI produced the database on 12" videodiscs, and in 1987 converted them to CD-ROM. There are currently 700,000 titles and 3 million plus holdings on a single disc, from 243 public libraries, the state library, other state agencies, and serials holdings from all of the community colleges in the state. The primary goal of the statewide database is for resource sharing, a secondary use is cataloging. Funding is shared between federal, state, and local resources. A microform version of the statewide database is still available upon request. 13 Missouri: The idea for a state-wide bibliographic database in Missouri was under consideration in the late 1970s. The need for libraries to share their resources and to take maximum advantage of computer and communications technology 18 led the Missouri State Library to commission a study to investigate the possibilities. Published originally in December 1978 and somewhat revised in January 1979, this report focused on the plans to improve library service in the state and to make it more feasible for libraries to implement new technology. The number one priority recommended was that Missouri "establish a state-wide bibliographic database of library records." 14 In 1987 a contract was signed with Brodart, Inc. to produce a CD-PAC of the machine-readable records available from all types of libraries. 
The Missouri State Library secured funding through the LSCA to furnish public libraries throughout the state with the hardware and software to create machine readable records of their collections using the Bibliofile system. Brodart processed records from the Online Computer Library Center (OCLC) records, and other proprietary systems already in existence. In October 1988 the CD-PAC was distributed to 216 Missouri libraries participating in the project. The project goals were twofold: first to encourage development of machine readable records to promote local automation of library services and secondly, to encourage interlibrary cooperation and resource sharing. 15 Currently there are 3.5 million titles, and 9 million holdings on four discs. Missouri has also produced three additional discs. One contains the Union list of serials 19 and newspapers for Missouri libraries, the second is an Author/Title index showing where that record can be found on the original four master discs, and the third is a supplemental disc produced six months after the original master discs.16 Nebraska: The on-line system used in Nebraska is OCLC, there are 4 million records loaded into the database, it is not known how many of these are unique. One hundred and thirty-five libraries use OCLC, and all of the cost is borne by the local library.17 Nevada: In 1988 Nevada contracted with General Research Corporation to produce a state-wide bibliographic database using "LaserGuide. " Over seventy libraries participated including public, academic, and special libraries. The startup database contained approximately 1.2 million holdings. 18 New Jersey: The state of New Jersey was considering the development of state-wide bibliographic databases by 1980. The Computer Application Task Force of New Jersey listed the "creation of a state-wide bibliographic database and standards for 20 machine-readable records and the creation of a state-wide union catalog" among a list of recommendations.19 Since then some of the libraries have gone together to produce a regional database on CD-ROM, but have not yet developed such a database state-wide. North Dakota: Using the University system as the contractor, North Dakota on-line users can also connect with South Dakota's and Minnesota's databases. The software used is UNISYS/PAL, the database contains 793,721 titles and 1,166,086 holding records. Established in 1989, funding comes from a combination of state and local moneys. The primary purpose of the database is resource sharing. Twenty-two libraries currently use the system.2 0 Ohio: In November, 1990 the first CD-ROM disc consisted of records from 24 public libraries and holding 343,055 titles and 654,734 holding records. The vendor for the database is Library Corporation. The primary goal in establishing this state-wide bibliographic database is to provide expanded resources for users through interlibrary loan. The Ohio state library is selling the Ohio Shared Catalog CD-ROM for $255.00 each.21 Ohio is also working on an on-line system using the University of Miami in Oxford Ohio, as the hub of 21 the system. Currently (1992) they are planning on linking thirteen university libraries, two private university libraries, two medical college libraries and the state library of Ohio. 22 Oklahoma: In 1991 the Oklahoma State Legislature appropriated $350,000.00 to the Oklahoma Department of Libraries to administer a CD-ROM bibliographic catalog project. The target date for completion of the first CD-ROM disc is spring, 1992. 
It is expected to combine bibliographic catalogs of approximately 300 public, school, academic, and special libraries. Projections show that the disc(s) should contain approximately 7 million records.23

Oregon: The statewide database contains only serials holdings information.24

Pennsylvania: In 1984 Pennsylvania used LSCA funding to start the development of a statewide bibliographic database. State funding continued the project after the second year. The project was designed to provide a public access catalog, not just a reference tool, and the database includes all types of libraries: public, academic, special, and school. All records in the database were built directly from shelf list cards. The first CD-PAC was distributed in Pennsylvania in the fall of 1986. There are 1,050 libraries participating in the statewide database, which contains 2.6 million titles and 12.8 million holding records. Brodart, Inc. is the vendor.25

Rhode Island: There are 45 libraries that provide the basis of records in the Rhode Island statewide bibliographic database. The first CD-ROM database was delivered in July 1990; it now has 367,562 titles and 1.3 million holdings. This project is funded by a local private foundation. There is also an on-line version of the database accessible to those libraries that wish to use it. The primary purpose of the database is to provide public access catalogs to participating libraries. Auto-Graphics is the vendor for the CD-ROM database.26

South Dakota: Established in 1987, this on-line database uses the UNISYS/PALS software system. There are 184 libraries using the system, which contains 1.1 million titles and 2 million records. The cost of the project is shared among federal, state, and local resources.27

Tennessee: In 1989 the Tennessee State Library and Archives began the process of developing TELINET, a statewide library database. TELINET currently includes records of the bibliographic holdings of the State Library and Archives, the Public Library of Nashville and Davidson County, the Knox County Public Library, the Chattanooga/Hamilton County Library, the Memphis/Shelby County Library and Information Center, the Tennessee Union List of Serials, and the multi-county regional libraries in the state. This encompasses 1.25 million titles on two CD-ROM discs and uses Auto-Graphics as the vendor. The main purpose of the system is resource sharing (ILL). It is paid for using LSCA funds.28

Virginia: The State Library of Virginia produced its first CD-ROM statewide bibliographic database in 1988. Currently this database contains 4 million records and has used Brodart's LePac software in the past. The Virginia State Library is now dropping its CD-ROM version of the database and developing an on-line database using its own computer and the Virginia Tech Library System (VTLS).29

West Virginia: The VTLS on-line database is used by 111 libraries of all types and contains 1.3 million titles and 3 million records. Funding comes from a combination of federal, state, and local moneys. It was first started in 1983 and has had very little upgrading since then.30

Wisconsin: In 1983 the Wisconsin Department of Public Instruction, Division of Library Services, started the Wisconsin Catalog (WISCAT) as a project to develop a statewide resource-sharing tool and a statewide bibliographic database.
At first, WISCAT was a microfiche catalog; in 1987 the state's Council on Library and Network Development recommended phasing out production of the microfiche format and producing a CD-ROM database. At that time an on-line system was considered, but the funding required to maintain such a database, and access to it, was not considered feasible. Currently WISCAT has 1,000 libraries involved, including public, academic, school, and special libraries. There are 4.15 million titles and 21 million holdings in the database, located on 5 different CD-ROM discs.31 The vendor producing the database and retrieval software is Brodart.32

Regional Databases: Portions of California, Washington, Montana, and Idaho have developed regional databases, but not individual statewide databases. WLN provides both a regional CD-ROM database and an on-line database for libraries in Washington, Alaska, Montana, and Idaho.33 The WLN database contains 435 libraries, 3.3 million holdings, and 1.3 million titles, and is housed on 4 CD-ROMs. LaserCat, WLN's retrieval software, is primarily concerned with providing resource sharing for regional libraries.34

ENDNOTES

1. Williams, Lynne, Automation Librarian, Alaska State Library. Letter to [Stan Gardner, Jefferson City, MO], November 1991.

2. Fayad, Susan, Senior Consultant, Network Development, Colorado State Library. Phone interview [with Stan Gardner, Jefferson City, MO], February 1992.

3. Uricchio, William, Michelle Duffy, and Roberta Depp. "From Amoeba to ReQuest: A History and Case Study of Connecticut's CD-ROM Based Statewide Database." Library Hi Tech 8, no. 2 (1990): 7-21.

4. Sloan, Tom W., Deputy Director, Delaware Division of Libraries. Letter to [Stan Gardner, Jefferson City, MO], October 1990.

5. Ostendorf, JoEllen, Interlibrary Cooperation, Division of Public Library Services for the State of Georgia. Phone interview [with Stan Gardner, Jefferson City, MO], March 1992.

6. Cates, Dan, Network Coordinator, Iowa State Library. Phone interview [with Stan Gardner, Jefferson City, MO], April 25, 1991.

7. Cates, Dan, Network Coordinator, Iowa State Library. Phone interview [with Stan Gardner, Jefferson City, MO], March 1992.

8. Moeller, Ronda, Coordinator, Kansas Union Catalog, Kansas State Library. Phone interview [with Stan Gardner, Jefferson City, MO], March 21, 1991.

9. Moeller, Ronda, Coordinator, Kansas Union Catalog, Kansas State Library. Phone interview [with Stan Gardner, Jefferson City, MO], February 1992.

10. Ferguson, Bobby, Louisiana State Library. Phone interview [with Stan Gardner, Jefferson City, MO], August 7, 1991.

11. Beiser, Karl, Library Systems Coordinator, Maine Department of Educational and Cultural Services. Letter to [Stan Gardner, Jefferson City, MO], August 10, 1990.

12. Smith, Barbara G., Chief, State Library Network and Information Services Section, Maryland State Department of Education, Division of Library Development and Services. Letter to [Stan Gardner, Jefferson City, MO], September 8, 1990.

13. Smith, Sharman, Director of Library Services, Mississippi Library Commission. Phone interview [with Stan Gardner, Jefferson City, MO], August 7, 1991.

14. Becker, J., and R. M. Hayes. A Statewide Data Base of Bibliographic Records for Missouri Libraries. (Los Angeles: Becker and Hayes, 1979), 56.

15. Missouri State Library, records and files dated from 1987 to 1991.

16. Missouri State Library. Unpublished papers and reports, Spring 1992.
17. Mundell, Jacqueline, Network Services Librarian, Nebraska Library Commission. Survey form from Stan Gardner, completed and returned December 1991.

18. "Nevada Installs CD-ROM Catalog." Wilson Library Bulletin 62, no. 3 (1988): 14.

19. New Jersey Computer Applications Task Force. A Report of the Computer Applications Task Force. Trenton, NJ: New Jersey State Library, 1980. ERIC, ED 234 766.

20. Slater, Frank, Librarian, North Dakota State Library. Survey form from Stan Gardner, completed and returned December 1991.

21. Ohio State Library. "Ohio Shared Catalog CD-ROM Available." The State Library of Ohio News 249, no. 7 (Columbus, Ohio: Ohio State Library, March 1991).

22. Sessions, Judith, Hwa-Wei Lee, and Stacey Kimmel. "OhioLINK: Technology and Teamwork Transforming Ohio Libraries." Wilson Library Bulletin, June 1992, 43-45.

23. ODL Source: A Newsletter Published by the Oklahoma Department of Libraries 16, nos. 7-8 (July/August 1991): 6.

24. Scheppke, Jim, State Data Coordinator, Oregon State Library. Phone interview [with Stan Gardner, Jefferson City, MO], March 1992.

25. Goodlin, Margaret, School Library and Educational Media Supervisor, State Library of Pennsylvania. Letter to [Stan Gardner, Jefferson City, MO], August 14, 1990.

26. Frechette, Dorothy B., Deputy Director, Rhode Island Department of State Library Services. Letter to [Stan Gardner, Jefferson City, MO], August 10, 1990.

28. Herrick, Jacci, Information Services Coordinator, Tennessee State Library. Letter to [Stan Gardner, Jefferson City, MO], October 4, 1990.

29. Wilson, Ashby, Director, Automated Systems and Networking Division, Virginia State Library and Archives. Phone interview [with Stan Gardner, Jefferson City, MO], April 1991.

30. Prosser, Judith, Interlibrary Cooperation Librarian, West Virginia Library Commission. Survey form from Stan Gardner, completed and returned December 1991.

31. Drew, Sally, Director, Bureau for Interlibrary Loan & Resource Sharing, Wisconsin State Library. Letter to [Stan Gardner, Jefferson City, MO], August 1990.

32. Wisconsin Council on Library and Network Development. Automating Wisconsin Libraries. (Madison, WI: Wisconsin State Department of Public Instruction, Division of Library Services, 1987). ERIC, ED 922 479.

33. Staffeldt, Darlene, Information Resources Director, Montana State Library. Letter to [Stan Gardner, Jefferson City, MO], September 11, 1990.

34. Griffin, David, Information Officer, Western Library Network. Letter to [Stan Gardner, Jefferson City, MO], August 24, 1990.

CHAPTER 3

METHODOLOGY

Data was gathered in two ways. First, a survey form was sent to all 50 state libraries, followed by a phone interview with the State Library Automation Officer in each state that reported having a statewide bibliographic database. Second, a survey was sent to approximately 25% of the libraries in each state that use the statewide bibliographic database. Seven hundred and fifty surveys were sent; these libraries were randomly selected by the individual state libraries. Seventeen state libraries responded to the survey form.

The state library is the coordinator of the statewide bibliographic database and is usually the agency that pays for developing and maintaining most statewide databases. In addition, each state library maintains the files, the "Request for Bid" used to select the vendor, and data on the individual libraries that use the statewide database (circulation, collection, population, ILL transactions, staff, etc.).

The survey was sent randomly to 25% of the libraries in each state that participate in a statewide bibliographic database. The addresses of these libraries were requested from each individual state library during the phone interview and by a follow-up letter. By surveying 25% of each state's libraries involved in a statewide bibliographic database, the proportions represented should have been equal. However, due to the responses, some of the smaller states had a higher representation than larger states. Each survey form was marked with a code to identify the state and type of library of the respondent.
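To make the sampling procedure concrete, the short Python sketch below illustrates drawing a 25% random sample of participating libraries from each state, as described above. It is purely illustrative: the state names and library lists are hypothetical placeholders, since the actual mailing lists were supplied by the individual state libraries.

    import random

    # Hypothetical stand-ins for the per-state lists of participating libraries.
    libraries_by_state = {
        "Missouri": ["MO library %d" % i for i in range(1, 217)],
        "Maine": ["ME library %d" % i for i in range(1, 201)],
    }

    def sample_quarter(libraries, rng):
        # Draw 25% of a state's participating libraries at random.
        k = max(1, round(len(libraries) * 0.25))
        return rng.sample(libraries, k)

    rng = random.Random(1992)
    mailing_list = {state: sample_quarter(libs, rng)
                    for state, libs in libraries_by_state.items()}

    for state, chosen in mailing_list.items():
        print(state, len(chosen), "surveys to mail")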
Instrumentation

The researcher prepared a questionnaire addressing the general research questions identified in Chapter 1. The questionnaires were printed on pastel-colored paper; research supports the idea that pastel-colored questionnaires receive a higher response rate than those on plain white paper.1 The survey instrument was first pilot-tested on selected libraries of differing types within Missouri, and revisions were made before it was distributed.

Data Collection

This researcher used both a survey instrument and a phone interview with each state library automation officer as the primary means of collecting data. Secondary sources of data included reports and files developed or generated by individual state libraries. A third source of data came from library reference tools.

Analysis of Data

Surveying a randomly selected 25% of libraries participating in the use of a statewide database provided the pattern of basic use. In addition, by using reports compiled by state libraries and comparing past reports to current data, the information obtained was used to develop a database of changes in patterns of usage and resources since the beginning of the statewide bibliographic database in each state. Other data gathered by the survey instrument were: type of library; size of library collection; average daily use of the statewide database in minutes; type of staff using the statewide database; method of interlibrary loan request (i.e., mail, OCLC, ALANET, state, local, or regional library networks, telephone, fax, etc.); and number of incoming and outgoing interlibrary loan requests before and after implementation of the statewide database.

Appendix A contains a sample of the survey form sent to individual libraries, and Appendix B contains a list of the questions each State Library Automation Officer was asked. Appendix C contains a compilation of each state library's report, and Appendix D contains the responses of the users from the survey found in Appendix A.

Definition of Terms

In order to be consistent and to avoid a conflict of definitions, the following terms are defined.

ALANET: A telecommunications network operated by the American Library Association. It ceased to exist as of February 1992.

CD-PAC: Compact Disc Public Access Catalog; a catalog containing the bibliographic data of one or more libraries on CD-ROM.

CD-ROM: Compact Disc Read-Only Memory; an information storage device in which information is stored digitally on a laser optical disc and decoded with software through a computer.

Interlibrary Loan: A request from one library to another library to provide a particular item or photocopy.

OPAC: Online Public Access Catalog; a computer-based library catalog that allows users to access bibliographic information by themselves via computer terminals.2
OCLC: On-Line Computer Library Center. OCLC has become the single largest bibliographic database in the U.S. offering bibliographic services to libraries. It is normally considered the primary cataloging tool, or the interlibrary loan (Group Access) communications tool, for libraries.

Resources: The collections, staff, and facilities available to a library. When speaking of sharing resources, the term usually refers to materials in the collection that could be loaned to another library.

State Library: The library designated by each state government to disseminate and regulate Library Services and Construction Act funds. Most state libraries also coordinate statewide library activities, provide specialized library service to state government, and provide other services based upon the needs of that state.

State-wide bibliographic database: A file of machine-readable bibliographic records which is intended to be a comprehensive source of the bibliographic holdings of libraries within a state.

ENDNOTES

1. Borg, W. E., and M. D. Gall. Educational Research: An Introduction. 4th ed. (New York: Longman, 1983), 422.

2. Berger, Carol A. Library Lingo: A Glossary of Library Terms for Non-Librarians. 2nd ed. Wheaton, Ill.: C. Berger and Company, 1990.

CHAPTER 4

ANALYSIS OF THE DATA

Response

Questionnaires were mailed to 750 libraries in 15 states based on the mailing lists provided by each state library; this is approximately 25% of the libraries that use the statewide database in each state. A total of 325 questionnaires were returned, representing a 43% return. Thirty-six of those returned were marked "do not use" or "do not wish to respond"; these were not included in the analysis. Libraries in four states did not return a large enough number of surveys to be considered valid: there were only one or two responses, which did not reflect the users of a statewide database for the entire state. This resulted in a 38% usable response rate. There were sufficient responses from 13 states to be considered valid, where the response was 10% or more of the surveys sent out. Because of the number of surveys returned, a follow-up letter was not sent to those libraries not responding.

The questionnaire was completed by personnel with 16 different job titles. Some individuals did not respond to this question (1%), and some put their name instead of a title (9%). The largest number of respondents identified themselves as "Librarian" or "Director," 51% of the total. "Interlibrary Loan Librarians" or "Assistant ILL" completed 6% of the questionnaires, and "Reference Librarians" submitted responses for 8%. Other personnel completing questionnaires included Media Specialists (4%), Assistant Librarians (7%), Technical Services Librarians (4%), Adult Services Librarians (5%), and others as indicated in Table 1.

Because some questions were not completed by all of the respondents, the analysis of each question was calculated using the total number of responses for that question. Therefore, the total number of responses will vary from question to question and from table to table. Some states had CD-ROM databases, some had microfiche, and some were on-line; some states had two or even three of these formats in use at the same time. This also caused varied responses to the questionnaire. The individual states' responses can be found in Appendix D.
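The response figures reported above can be retraced with a few lines of arithmetic. The sketch below is illustrative only, using just the numbers stated in the text, and shows how the 43% gross return and the roughly 38% usable response rate are derived.

    surveys_sent = 750
    returned = 325
    unusable = 36   # marked "do not use" or "do not wish to respond"

    gross_rate = returned / surveys_sent   # 325/750 = 0.433, the 43% return
    usable = returned - unusable           # 289 responses analyzed
    net_rate = usable / surveys_sent       # 289/750 = 0.385, reported as 38%

    print("gross return rate: %.1f%%" % (gross_rate * 100))   # 43.3%
    print("usable responses:", usable)                        # 289
    print("net response rate: %.1f%%" % (net_rate * 100))     # 38.5%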
TABLE 1
Title of Respondent

Title                                                       n      %
Assistant/Associate Director                               19     7%
Assistant ILL                                               4     1%
Bibliographic Specialist                                    2     1%
Computer Manager, Coordinator                               3     1%
Coordinator, Adult Services                                13     5%
Director, Head Librarian, Library Manager                 147    51%
Extension Librarian                                         1     0%
Head, Collection Development                                1     0%
Head, Reference Services                                    6     2%
Head, Technical Services, Cataloging                       11     4%
ILL Coordinator, Supervisor, Head, etc.                    15     5%
Library Clerk                                               2     1%
Library Tech                                                2     1%
Media Specialist, LRC Specialist, Information Specialist   10     4%
Name of individual rather than title                       26     9%
Reference Librarian                                        18     6%
System Operator                                             2     1%
No Response                                                 4     1%
Total                                                     286   100%

Tables 2 and 3 indicate the total response to the questionnaire by type of library and size of library collection.

TABLE 2
Type of Library

Type         Number   Percentage
Public          130          45%
Academic         90          31%
School           48          17%
Special          21           7%
Total           289         100%

The largest number of respondents were from public libraries, representing 45% of the total. Academic libraries were second with 31% of the total. In many states school and special libraries are not included among statewide database users; in this survey, however, schools represented 17% of the respondents, while special libraries accounted for 7% of all respondents. The school libraries from Pennsylvania skew the representation nationwide, but that is simply because not many other states include school libraries in their state-wide bibliographic databases.

TABLE 3
Size of Collections

Collection            n      %
Under 25,000         88    30%
25,001 - 50,000      74    26%
50,001 - 100,000     47    16%
100,001 - 250,000    45    16%
Over 250,000         27     9%
Unknown size          8     3%
Total               289   100%

Libraries with collections under 25,000 volumes accounted for almost one-third of the respondents. Responses from libraries with fewer than 50,000 volumes accounted for 56% of the respondents, while libraries with collections over 250,000 volumes made up only 9% of the total.

TABLE 4
Type of Uses of the Statewide Database

Description                   Number       %
Interlibrary loan                289   39.6%
Public access                    107   14.7%
Backup                            55    7.5%
Cataloging/Acquisitions          175   24.0%
Collection development            84   11.5%
Other                             19    2.6%
Total                            729   99.9%

Question 4 of the survey asked the respondents to check all of the various ways in which they were using the statewide database. Five choices were given, and a sixth was open-ended so that respondents could enter any other use for the database. Table 4 illustrates the responses. As one would expect of a statewide database designed to encourage resource sharing, the primary use was interlibrary loan (40%); 289 of the respondents use the database for interlibrary loan purposes. More than 24% of the users verify cataloging or acquisitions data with the database. Fifteen percent use the database as a public access catalog. Just over 11% view the database as an aid to collection development. Two percent responded in the "other" category, indicating that the database was used as a reference tool for the public, students, and faculty to find what other materials were available throughout the state.

TABLE 5
Amount of Time Spent Daily on the Statewide Database

Minutes              n      %
0 or no response    38    13%
Less than 10        18     6%
10 - 19             27     9%
20 - 29             16     5%
30 - 44             45    15%
45 - 59              6     2%
60 - 119            42    14%
120 - 179           31    11%
180 - 239           20     7%
240 - 299            8     3%
300+                36    12%
Other                5     2%
Total              292    99%

Thirty-eight libraries did not respond to question number 5.
Eighteen libraries used the statewide database for less than 10 minutes daily, 27 from 10 to 19 minutes, 16 from 20 to 29 minutes, 45 from 30 to 44 minutes, 6 from 45 to 59 minutes, 42 from 60 to 119 minutes, 31 from 120 to 179 minutes, 20 from 180 to 239 minutes, and 8 from 240 to 299 minutes; 36 indicated they used the database for over 300 minutes a day. Some of these indicated that their on-line system was available via dial-up access 24 hours a day. In replying to question number 6, the least amount of time reported was twice a month. Forty users indicated that the database was being used more than five hours a day. Forty users did not respond. Seventeen percent used the database for an hour each day.

TABLE 6
Amount of Time Spent Daily on ILL

Minutes              n       %
0 or no response    40   13.9%
Less than 10        11    3.8%
10 - 19             19    6.6%
20 - 29             14    4.9%
30 - 44             35   12.2%
45 - 59              9    3.1%
60 - 119            50   17.4%
120 - 179           35   12.2%
180 - 239           13    4.5%
240 - 299           18    6.3%
300+                40   13.9%
Other                4    1.4%
Total              288  100.2%

TABLE 7
Staff Using Database

Staff                        n       %
Interlibrary Loan          219   20.3%
Reference                  532   49.4%
Technical Services         115   10.7%
Director                   155   14.4%
Extension Services staff    20    1.9%
Other                       33    3.1%
No Response                  4    0.4%
Total                    1,078  100.2%

Table 7 profiles the personnel who use the statewide database. Multiple answers were common, which is why there are 1,078 separate entries. It was to be expected that for databases designed to facilitate interlibrary loan, the personnel most frequently reported as users would be interlibrary loan staff. However, this did not hold true. Reference staff accounted for 49% of the use, versus 20% for interlibrary loan staff. They are followed by library directors (14%) and technical services staff (10%). Among other personnel listed were students, faculty, and secretaries. Extension services staff used the databases less than 2% of the time.

TABLE 8
Dedicated Equipment

Response        n       %
No response    11    3.8%
Yes           189   65.9%
No             87   30.3%
Total         287  100.0%

Eleven libraries did not respond to question number 8, while 66% indicated that, yes, the workstation was dedicated to the statewide database. Thirty percent replied that they used the equipment for other purposes besides the statewide database.

TABLE 9
Public Access?

Response        n       %
No response     3    1.0%
Yes           143   49.5%
No            143   49.5%
Total         289  100.0%

Question 9 asked if the public had access to the statewide database. Three libraries did not respond; 143 replied that the public did not have access, and 143 replied that the public did. When asked why they did not allow the public to use the statewide database, 56% indicated that they did not have enough equipment, 24% said that they did not have room for public terminals, and 12% indicated that the database was available for staff only. This is reflected by individual states like Ohio, where the libraries have to pay for their CD-ROM discs. Also, some states, especially states with on-line systems, indicated that the software was not user-friendly and patrons could not use the database without a librarian assisting them.

TABLE 10
Why Not Provide Public Access?

Response                  n       %
No interest               3    2.4%
No equipment             70   56.0%
Difficulty of use         3    2.4%
Staff use only           15   12.0%
No room                  31   24.8%
Microfiche only           1    0.8%
Used as a toy             1    0.8%
No CD-ROM extensions      1    0.8%
Total                   125  100.0%

Equipment failure has not been a major problem; only 14% indicated that they had equipment problems.
Most of those failures were communications problems for on-line systems or disc failures for CD-ROM systems. Some of the "equipment" problems were really a lack of trained staff who knew how to set up and operate the database.

TABLE 11
Hardware Problems

Response        n      %
No response     9     3%
Yes            41    14%
No            237    83%
Total         287   100%

Of the 14% who had problems with software, some were actually hardware-related problems; some did not have staff with the computer skills to set up and operate the database; several wanted to do things their software was not programmed to do; and some problems were related to not understanding the manuals and help screens. One respondent indicated that they were never able to put the microfiche in the correct way to be able to read it.

TABLE 12
Software Problems

Response           n      %
No response       13     4%
Yes               47    14%
No               222    65%
Not applicable    59    17%
Total            341   100%

Three states did not offer statewide training, but some of the users received training; the respondents in those states did not explain how they received it. Even in those states that did offer training, time has passed since it was offered, and new people have taken jobs without having had access to that training.

TABLE 13
Training: Offered and Attended

Response       State Offered   Attended      %
No response          8              7        2%
Yes                245            237       83%
No                  53             42       15%
Total              306            286      100%

In question 15 (Table 14), over 25% indicated that they needed additional training in order to make effective use of the statewide database. Appendix D has a breakdown of the individual states' responses.

TABLE 14
Training: Adequate or Need Additional Training?

                Adequate training     Need additional training
Response           n        %               n        %
No response       51      18%               9       3%
Yes              214      74%              73      25%
No                23       8%             206      72%
Total            288     100%             288     100%

TABLE 15
Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)

Question & Descriptor                    1          2          3          4          5         NR      Total
16. Browse (author, title,
    or subject searches)             76 (26%)   94 (32%)   64 (22%)   19 (7%)    16 (6%)   21 (7%)     290
17. Express (advanced level
    of searching)                    77 (27%)   79 (28%)   61 (22%)   14 (5%)    16 (6%)   35 (12%)    282
18. Boolean                          35 (12%)   59 (20%)   70 (24%)   37 (13%)   23 (8%)   64 (22%)    288
19. Keyword                          75 (26%)   64 (22%)   74 (26%)   25 (9%)    16 (6%)   32 (11%)    286
20. Wildcard                         36 (13%)   47 (17%)   73 (26%)   24 (8%)    26 (9%)   77 (27%)    283
21. Ease of use (searching)          69 (24%)  119 (42%)   61 (22%)   11 (4%)     9 (3%)   14 (5%)     283
22. Speed                            40 (14%)   72 (25%)   75 (26%)   52 (18%)   23 (8%)   26 (9%)     288
23. Directions                       68 (24%)   99 (34%)   78 (27%)   21 (7%)     9 (3%)   12 (4%)     287
24. Manual                           33 (12%)   65 (23%)  100 (35%)   27 (10%)   22 (8%)   37 (13%)    284
25. Screens                          67 (23%)  109 (38%)   81 (28%)   10 (3%)     5 (2%)   16 (6%)     288
26. Changing discs                   24 (8%)    60 (21%)   93 (32%)   18 (6%)    16 (6%)   79 (27%)    290
Total per category                  600 (19%)  867 (27%)  830 (26%)  258 (8%)   181 (6%)  413 (13%)
Average per category                    55         79         75         23         16        38

Almost all of the CD-ROM databases have at least two modes of searching. The terms used here come from Brodart's LePac software for the standard (Browse) and advanced (Express) search modes, since Brodart holds the seven largest statewide database contracts. Browse searching allows a single search by author, title, or subject, much like searching a card catalog. The Express mode is a somewhat more sophisticated method of searching, permitting the user to search multiple fields simultaneously. Keyword searching is available using the "Anyword" field, and both Boolean logic and truncated searches may be performed.
Because many users search in the Express mode and yet never utilize these specialized search strategies, questions 18, 19, and 20 addressed each feature separately. Some on-line systems, like Maryland's, have no Boolean logic searching available.

Eighty percent of the respondents rated the Browse search mode as average or above average. Seventy-seven percent rated the Express search mode as average or above average.

Boolean searching is performed in the LePac system using the Express mode. A string of terms in a search field assumes that "and" logic should be applied. Terms inserted within parenthesis marks are searched with "or" logic. Terms entered with a tilde (~) between words are searched with "not" logic. Question 18 asked the users to rate the Boolean search capabilities of their statewide database. Twenty-two percent of the users did not respond to this question. The significantly lower response rate on Boolean searching suggests that a substantial number of users are unfamiliar with Boolean search logic and therefore do not use this search strategy. However, as mentioned before, some databases do not even offer Boolean searching as an option. Several users asked "what is Boolean?" on their surveys. With 22% not responding to this question, it appears that additional training is badly needed in this area. Of those who use Boolean logic, 24% rated their software as average, and 32% rated their software above average or excellent. Twenty-one percent rated their software as below average or poor.

Question 19 required the respondent to assess the "Keyword" search strategy. Seventy-four percent considered it average or above average, 15% considered it below average or poor, while 11% did not respond to the question.

Question 20 asked for an assessment of the "Wildcard" or "Truncated" search strategy. With LePac this requires the user to insert an asterisk (*) to the right of a minimum of the first three letters of a search term. All terms with the corresponding first letters are retrieved. To perform an embedded-character truncated search, a question mark (?) is inserted within a search term; a question mark may be inserted for each unknown letter of the term. For example, the search for wom?n will locate both "woman" and "women." Seventy-seven users did not respond to this question. This suggests, as with Boolean searching, that they are unfamiliar with the truncated search strategies and have not utilized the reference manual for self-instruction in these capabilities of the system. Fifty-six percent of those who did rate the truncated search strategy considered it average, above average, or excellent. Because approximately one-fourth of the users did not respond to these search strategies, it may be deduced that these are areas requiring additional instruction so that the search capabilities of the statewide database are used to maximum advantage.
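The Express-mode conventions just described (a space between terms implies "and," parentheses imply "or," a tilde implies "not," an asterisk truncates a term after at least three letters, and a question mark stands for one unknown character) can be modeled in a few lines of code. The following Python sketch is an approximation written for this review, not LePac's actual implementation; the function names and the treatment of the tilde as a term prefix are assumptions made for illustration.

    import re

    def term_to_regex(term):
        # Translate a LePac-style term into a regular expression:
        # "*" becomes zero-or-more word characters, "?" exactly one.
        pattern = re.escape(term).replace(r"\*", r"\w*").replace(r"\?", r"\w")
        return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)

    def matches(field_text, query):
        # "Or" logic: any one term inside a parenthesized group may match.
        for group in re.findall(r"\(([^)]*)\)", query):
            if not any(term_to_regex(t).search(field_text) for t in group.split()):
                return False
        remainder = re.sub(r"\([^)]*\)", " ", query)
        for term in remainder.split():
            if term.startswith("~"):
                # "Not" logic: the record fails if the term is present.
                if term_to_regex(term[1:]).search(field_text):
                    return False
            elif not term_to_regex(term).search(field_text):
                return False   # implicit "and" logic between terms
        return True

    print(matches("Women in American libraries", "wom?n librar*"))   # True
    print(matches("Women in American libraries", "wom?n ~librar*"))  # False

Run against the wom?n example from the text, the pattern matches both "woman" and "women," which is the embedded-character truncation behavior described above.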
Question 21 asked about the general ease of searching the database. Eighty-four percent considered the database easy to use. Since this compares very closely to the percentage who have had training and feel that they need no additional training, it can be deduced that this response is based on their previous training and the amount of time spent becoming familiar with the database.

Question 22 asked about the speed of the database. Twenty-six percent considered it average, 39% rated it as above average or excellent, while 26% considered it below average. CD-ROM searching, while much faster than manual methods, is considerably slower than on-line searching. The responses here are mixed together, but those states having databases on CD-ROM gave this a much lower satisfaction rating than those using an on-line system. Those using microforms were uniformly unhappy with the manual searching capabilities of their database. This also reflects a growing awareness of changing technology: the computers of today are considerably faster than the computers of four or five years ago, and the users want to utilize that improvement.

Fifty-eight percent of the users considered the on-screen directions to be above average, 27% considered them average, and 10% considered them below average. In question 24 the users were not as kind in rating the reference manual. Thirty-five percent considered it only average, 35% considered it above average, and 18% considered it below average. This also reflects some states that do not have a reference manual at all, which is what most of the 13% who did not respond indicated.

Question 25 asked the user to rate the readability of the database user screens. Eighty-nine percent rated them as average, above average, or excellent. Overwhelmingly, the users were in agreement that the design and text of the screens were of high quality.

Question 26 refers to the CD-ROM systems that require a physical change from one disc to another to access different parts of the bibliographic database. Missouri found that the addition of a single author/title index disc helped, but did not entirely solve this problem.

TABLE 16
Increases or Decreases of Service (1 = increased, 5 = decreased)

Question & Descriptor             1          2           3          4          5         NR      Total
27. ILL incoming              73 (25%)  103 (36%)    80 (28%)   10 (3%)    3 (1%)    18 (6%)     287
28. ILL outgoing              62 (22%)   99 (35%)    94 (33%)   13 (5%)    3 (1%)    14 (5%)     285
29. Fill rate                 41 (14%)  107 (38%)    99 (35%)   14 (5%)    3 (1%)    20 (7%)     284
30. Blind searches received   10 (4%)    24 (8%)    122 (43%)   40 (14%)  19 (7%)    71 (25%)    286

Question 27 asked about the impact on resource sharing via incoming ILL requests. Twenty-five percent indicated that the database had greatly increased their incoming ILL requests, and 36% indicated that it had significantly increased them, while 28% showed no change and 4% indicated a decrease in incoming ILL requests.

Question 28 asked about the impact on outgoing ILL requests. Ninety-four respondents, or 33%, replied that the database had no impact on their requests. But 35% responded that it had produced a significant increase, and 22% responded that it had greatly increased their outgoing ILL requests. Only 6% indicated that it had decreased their outgoing ILL requests.

Question 29 asked about the impact on the fill rate of ILL requests. Thirty-five percent indicated that the database had no impact. Thirty-eight percent indicated that it had significantly increased their fill rate, and 14% indicated that it had greatly increased their fill rates. Again, 6% indicated that it had decreased their fill rates.

Question 30 asked about blind search requests. Twenty-five percent did not respond to this question, leading one to believe that it was not understood by many of the respondents; in fact, one respondent wrote on the questionnaire, "what is a blind search?" Forty-three percent indicated that the database had no impact on their receiving blind requests, while 21% indicated that it had reduced them. Twelve percent indicated that it had increased their blind requests.
Question 31 (Table 17) asked for the approximate percentage of interlibrary loan requests that were being verified using the statewide database.

TABLE 17
ILLs Being Verified Using the Statewide Database

        NR     0-25%    26-50%    51-75%    76-100%    Total
#       40       38        38        59        112       287
%      14%      13%       13%       21%        39%      100%

It appears that the users of statewide databases are successful in verifying most of their interlibrary loan requests with a search in the database. Over 39% of the respondents reported that the success rate of verification was between 76% and 100%. Only 13% indicated that they verified one quarter or fewer of their ILL requests using the statewide database, while another 34% indicated verifying between 26% and 75% of their ILL requests.

Question 32 asked the types of methods used to request ILL prior to implementation of the statewide database. The three methods most used were U.S. Mail (31%), networks (23%), and phone (21%). OCLC came in at 10%, mainly from the larger public and academic libraries. Four percent of the respondents indicated that they had no ILL service before the statewide database.

TABLE 18
Methods of ILL Before and After the Statewide Database

           OCLC    Mail   ALANET   Networks   Phone    Fax   Other   No Service   Total
Prior #      62     196       10        145     131     47      22           24     637
Prior %     10%     31%       2%        23%     21%     7%      3%           4%    101%
After #     112     197       51        131     134    126      30           12     793
After %     14%     25%       6%        17%     17%    16%      4%           2%    101%
Change      +4%     -6%      +4%        -6%     -4%    +9%     +1%          -2%

Question 33 asked about the methods used for ILL requests after implementation of the statewide database. The mail (25%), networks (17%), and phone (17%) decreased significantly. At the same time, OCLC (14%), ALANET (6%), and fax (16%) showed significant increases in usage.

In 1991 Missouri dropped ALANET as its state ILL communication system. At that time Missouri's libraries were one-third of the total users of ALANET and paid over $70,000.00 a year for the service. Four months after Missouri canceled the contract, ALANET closed down.

TABLE 19
ILL Volume Prior to and After the Statewide Database

   Prior to database       Requests           After database
  incoming   outgoing      per month        incoming   outgoing
        31         21      No response            37         33
        51         51      No service             32         23
        65         67      <10                    59         54
        30         29      10-20                  47         57
        23         34      21-44                  25         29
        15         18      45-75                  21         26
         6          6      76-100                  9         13
        18         28      101-350                27         31
         6          2      351-499                 6          4
         8          3      500-1000                9          8
        12          8      1001+                  16         11

To determine what effect, if any, the statewide database has had on the volume of interlibrary loan requests, the respondents were asked to provide statistics on average monthly incoming and outgoing ILL requests both before and after implementing the statewide database. Table 19 shows the results: an increase in ILL usage in both incoming and outgoing requests. Many libraries said that the database did not significantly change their ILL requests; the change seems to be that there are now more libraries using an ILL system than before the implementation of the statewide database.
TABLE 20 Helpful features of database Response n % No Response 72 18% Automation Plans 5 1% CataLoging 22 6% Item Status 7 2% Ease of use 33 8% ILL 19 5% Location tooL 84 21% Browse mode 2 1% Searching 79 20% Verification 28 7% All formats are available 1 0 Reference use 16 4 Collection DeveLopment 5 1 Magazine Index - Author\Title 27 7 Index ._400_101% 400 101% Eighteen percent did not respond to the question posed in Table 20, while 21% indicated that the statewide database was most helpful as a location tool. This was closely followed (20%) by those using it to search bibliographic records. 59 TABLE 21 Improvements needed _ n_% Responses: n% No Responses 71 20% Authority control - cataloging - Acquisitions 15 4% Electronic delivery - full text - E-MaiL 7 2% Circulation Procedures - item Location 4 1% CumuLative printing of screens or search 6 2% ILL Policies, manuals, on-Line system 9 3% Indexes to manuals, on screen instructions - 4 1% Help Screens Errors - duplicate records - multiple titles - 57 16% Cleanup database Need more libraries inputing records 27 8% Periodicals 11 3% Update more often & consistently 42 12% Searching - 30 8% Public access software 6 2% Conmiunications 3 1% Speed 29 8% Changing Discs - where applicable 11 3% Refusal to loan materials 5 1% Statistics 3 1% For CD-ROM - Division of database other than 3 1% by date Change to CD-ROM 3 1% Get every one on-line 6 2% Education 3 1% 346 355 101% 98 Twenty percent of the users didn't respond to this question. Sixteen percent indicated that the improvement most needed was cleaning up the database and getting rid of the duplicate records and multiple titles. Twelve percent 60 indicated that an improvement needed was consistent updating of the database. Other concerns reflect the variation of each state's database. For example, Pennsylvania has a problem with some libraries refusing to loan materials outside their local area. Other concerns are listed in Table 21. TABLE 22 Provides needed information Responses: n No Response 53 19% No 17 6% Yes 216 76% 286 101% Seventy-six percent of the respondents indicate that the statewide database does meet their needs. 61 TABLE 23 Priority of state-wide automation Responses: n _ % No Responses 112 25.6% Accuracy in Database 21 4.8% Automation Services to all Libraries 35 8.0% Continuing Education 19 4.4% Continue with current projects 19 4.4% Full text deliver 12 2.8% Funding 36 8.2% Improve ILL delivery system 17 3.9% Keep)Database updated 15 3.4% Make system easer to use 13 3.0% Retrospective Conversion 27 6.2% Verification & Holding info. 1 0.2% Statewide database 34 7.8% Statewide electronic mail system 20 4.6% Circulation software & hardware 18 4.1% 1 don't understand what Priority means? 1 0.2% Statewide Borrowing agreement 20 4.6% Switch to OCLC 3 0.7% Vendor - Change 2 0.5% Last copy center - out of print materials 2 0.5% Electronic directory of libraries 2 0.5% Database management -tong range planning 7 1.6% Coordination lead by the state - don't install 1 0.2% & abandon 427.0 437 100.2% Sub- Sub- 97.9 Totals Totals Totals 437 100.2% Table 23 indicates again the variety of the concerns in the 12 states surveyed. However, it is a sad commentary on librarianship reading one users response that, "I don't understand what Priority means?" 
TABLE 24
Selected Comments from Respondents

Response                                                   n       %
No response                                              256   84.77%
Include all libraries in state                             2    0.66%
No way to cancel a request                                 1    0.33%
No serials holdings request                                1    0.33%
Not open to public                                         1    0.33%
Decrease paperwork                                         1    0.33%
Use statewide database in reference services               5    1.66%
Reimbursement for ILL net lenders                          9    2.98%
More training needed in automation                         2    0.66%
Centralized billing for ILL                                2    0.66%
Great if automated                                         1    0.33%
Our library does not provide ILL service                   1    0.33%
This is our main source of information about
  other libraries                                          1    0.33%
Funding is needed for private libraries                    1    0.33%
If materials cost less than $20, should not loan           1    0.33%
Statewide library card                                     2    0.66%
Get it on-line                                             1    0.33%
Three methods of access: on-line, CD-ROM,
  and microfiche                                           1    0.33%
We're 50 years behind the times                            1    0.33%
No school bibliographic records in the database            1    0.33%
Looking forward to getting new vendor                      2    0.66%
Need cataloging tool                                       1    0.33%
Most libraries use/prefer CD-ROM over microfiche           2    0.66%
Include all libraries in state                             1    0.33%
It is expensive                                            1    0.33%
State Library does an excellent job                        1    0.33%
Has greatly increased ILL from small libraries
  with no additional funding                               1    0.33%
Need more statewide cooperation                            2    0.66%
Total                                                    302     100%

Table 24 presents a general question, intended to see if the questions in the survey were understood and to catch anything that might be unique to a specific state. The majority of respondents did not reply to this question (84%). Of those who did, the responses were very interesting. They included ideas ranging from establishing centralized billing for ILL services, to using the statewide database in reference services, to needing a cataloging tool. This reflects the diversity of needs among libraries in the states surveyed.

State Libraries' Responses

Responses from individual states are compiled in Appendix C. Many states would not provide the cost of their statewide database. For those states that did, the sum total amounts to $7,629,082.

Wisconsin provided a cost analysis of how it determined the cost of each format of a statewide database; this can be found in Appendix F. Wisconsin found that, assuming everything from startup cost to distribution, microfiche would cost $548,019.00, OCLC would cost $5,029,354.00, and CD-ROM would cost $377,019.00. The state also looked at the possibility of using an on-line vendor (not OCLC); the projected cost was $6,207,397.00.

TABLE 25
Selected Responses of State Libraries

State           Libraries       Titles       Holdings           Cost
Alabama                40    2,770,704      5,700,000
Alaska                 20    1,000,000      2,200,000        $43,000
Colorado              165                                    $75,000
Connecticut           207    2,040,090      9,613,923       $700,000
Delaware               50      386,153
Georgia               182    7,800,000     14,000,000        $80,000
Illinois              375    7,700,000     21,400,000     $4,400,000
Iowa                  540    1,500,000      5,000,000
Indiana                90                  11,000,000
Kansas                300    2,000,000      7,000,000
Louisiana              60    1,400,000      4,685,000
Maine                 200    1,200,000      3,000,000       $100,000
Maryland              135    2,600,000      6,500,000       $325,000
Mississippi            55      597,714      2,040,057        $50,000
Missouri              216    3,500,000      9,000,000       $178,000
Nebraska              135                   4,000,000
Nevada                 70                   1,200,000
North Dakota           22      793,741      1,166,086       $233,217
Ohio                   24      343,055        654,734        $29,000
Oklahoma              435    2,085,750      4,000,000
Oregon                165      100,000        250,000        $12,000
Sub-totals          3,486   37,817,207    112,409,800     $6,225,217
Pennsylvania        1,050    2,600,000     12,800,000
Rhode Island           45      367,562      1,300,000       $100,000
South Dakota          184    1,124,255      2,030,385       $446,846
Tennessee              82      800,000                      $180,000
Virginia               88                   4,000,000
West Virginia         111    1,293,000      3,000,000       $300,000
Wisconsin           1,020    4,153,805     21,000,000       $377,019
Totals              6,066   48,155,829    156,540,185     $7,629,082
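As a check on the column arithmetic in Table 25, the figures can be totaled programmatically. The Python sketch below is illustrative only; just the first few states are shown, with None marking cells for which a state reported no figure, and the remaining rows through Oregon (the group covered by the sub-totals line) would be entered the same way.

    rows = [
        # (state, libraries, titles, holdings, cost)
        ("Alabama", 40, 2770704, 5700000, None),
        ("Alaska", 20, 1000000, 2200000, 43000),
        ("Colorado", 165, None, None, 75000),
        ("Connecticut", 207, 2040090, 9613923, 700000),
        # ... remaining states through Oregon, entered from Table 25 ...
    ]

    def column_total(rows, index):
        # Sum a column, skipping the cells with no reported figure.
        return sum(row[index] for row in rows if row[index] is not None)

    print("libraries:", column_total(rows, 1))   # 3,486 once all rows are entered
    print("titles:   ", column_total(rows, 2))   # 37,817,207
    print("holdings: ", column_total(rows, 3))   # 112,409,800
    print("cost:     ", column_total(rows, 4))   # 6,225,217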
According to the American Library Directory (1991-92),1 there are 31,127 libraries of all types, excluding branches and other service centers, in the United States. The number of libraries using statewide databases as reported by the various state libraries totals 5,011. Statewide databases contain a total of 46,070,079 titles and 156,540,185 holdings records.

ENDNOTES

1. Simon, Peter, et al. American Library Directory 1991-92. 44th ed. New Providence, New Jersey: R. R. Bowker, 1991.

CHAPTER 5

SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS

This chapter presents a summary of the major findings, implications, and conclusions of this study, along with suggestions for further research.

As previously pointed out in Chapter 2, relatively little published information exists concerning the implementation of statewide bibliographic databases. While a review of the literature indicated that the possibilities for developing statewide databases are under consideration in various states across the nation, many do not have any research to support their decision-making processes.

Two previous studies attempted to evaluate a single CD-ROM database and give some information about the feasibility of statewide databases. The first, in Illinois (1987), studied a version of the CD-ROM database using Brodart's LePac software. The database was critiqued by a random sample of patrons, by members of a University of Illinois library and information science class, and by the library staff at four participating libraries.1 In the second study, using the LePac software in Pennsylvania, Epler and Cassell2 indicated that interlibrary loan transactions in Pennsylvania increased by an average of 68% in the first year after the introduction of ACCESS Pennsylvania. Because many of the libraries in Pennsylvania were using the database as a public access catalog, a 300-500% increase in circulation was also reported. Among academic libraries, the database was viewed as an important public relations and outreach service.

Another reason for this high percentage of ILL can be found by looking at the types of libraries using the statewide database. Pennsylvania has made a tremendous effort to include school libraries in the statewide database. Before the statewide database, school libraries had no access to a bibliographic database or a statewide union list. They had no computerized system available, and most of the school libraries offered no ILL service prior to the statewide database. In this respect Pennsylvania is not unusual: in most states school libraries do not have access to bibliographic databases, and in those states where school libraries are included within the database the same holds true.

Size of Libraries and Effect on Use of the Statewide Database

Thirty percent of the libraries responding to this survey had fewer than 25,000 volumes in their collections.
Format of the Statewide Database Three basic formats are used in statewide databases: Microfiche; CD-ROM; and On-line. The greatest difficulty found in microfiche is that it is a manual searching database. The CD-ROM database systems were the most cost effective. The majority of these databases were paid for by the state at no cost to the individual library other than the staff to maintain it, and the equipment to display it. It's searching capability was liked by the majority of respondents who used CD-ROM, and it was found to be useful in cataloging and reference searching as well as ILL. It's biggest drawback was in the update schedule, the number of duplicate records, and the number of discs that needed to be 70 changed. Iowa was one of the states who had a very weak software retrieval program and had decided to change vendors. As a result of this their responses were very negative. The on-line systems were rated from great enthusiasm, to deep despair. Maryland's on-line system does not allow for Boolean searching, keyword searching, or wildcard searching. Needless to say, the people of Maryland were not happy with their database. North and South Dakota were very happy with their system and said great things about it. Each state's database has different advantages and weaknesses. Use of Statewide Databases by Libraries It is evident that the statewide databases are being used by libraries. The extent of the use varies depending upon a number of factors. Both size and type of library appeared to have a bearing on the degree to which the statewide databases are being used. Smaller libraries which did not have a wealth of other resources such as affiliation with a bibliographic utility for cataloging and/or interlibrary loan welcomed the statewide database as a much needed tool for providing service to their patrons. Public libraries comprise the majority of users (45%). 71 Many states allow school libraries to participate, but no funds are available to assist them in retrospective conversion of their bibliographic records. Most states require that a library supply its bibliographic records in machine readable form before allowing them to use a state supported ILL system. This has resulted in a few school libraries participating in statewide database projects. School libraries account for 17% of the respondents, the majority from to Pennsylvania and Wisconsin, who have made an effort to include school libraries as a part of their statewide database. Many academic libraries use OCLC and do not wish to duplicate efforts in searching and responding to ILL requests from other sources. Variety of Uses of the Statewide Database The primary use of the system is for resource sharing. Almost 40% of the respondents were using their statewide database for interlibrary loan. However, the majority of staff that use the statewide database identified themselves as reference staff, not interlibrary loan staff. A possible explanation could be that smaller libraries do not have the personnel to separate jobs, and the reference staff also handle all interlibrary loan requests. 72 The second major use of the system was as a cataloging and/or acquisitions verification tool. More than 24% of the respondents used the system for this purpose. Approximately 15% of the users reported that the database was used as a public access resource. This may resolve the question of why reference staff use it so much; they help the patrons to use it. 
Eleven percent of the respondents indicated that they use the database as a collection development tool. While there is no simple answer to the amount of time spent using the database, the majority of users appear to use the system between an hour and an hour daily. Communication Methods The U.S. Mail is still the system most libraries use, but that use is declining from 31% prior to the statewide database to 25% after implementation of the statewide database. State, local, or regional networks, the phone, and facsimile, are among the next level of communications for ILL requests. OCLC is slowly gaining ground, but it is a slow growth (10% up to 14%). The advent of telefacsimile (fax) could have an impact because fax machines were not readily available to libraries when most state-wide database projects were started. 73 Opinions on the Standard Features of Systems There was virtually no difference between the standard search capabilities and the advance search capabilities in user satisfaction. Both the browse and express searching were scored almost identical to each other. The boolean, keyword, and wildcard searching levels however, were rated very differently. This seems to be because some systems do not have the capability to perform these searching strategies. Almost all users were favorable to the screen design of their various systems. While not being overly generous with excellent ratings, a clear majority of users agreed the clarity of on-screen directions, the readability of the screens, and the general ease in using the system were above average. This was not true when it came to the reference manual. While a majority found it acceptable, many responded by saying "We have never gotten a reference manual." This researcher concluded that users were very positive in their assessment of the mechanics of using the system, and in having the system available to them. 74 Effect of Statewide Database on Resource Sharing The effect of the statewide database, while not dramatic, does show a definite increase in use of ILL; incoming, outgoing, and fill rates. Looking at the volume of ILL transactions in Table 19 found in Chapter 4, we can see a definite increase in ILL since the implementation of the statewide databases. However, this can be explained more simply by looking at the number of libraries that did not offer any ILL service prior to the implementation of the statewide database. Strengths and Weaknesses of Statewide Databases It is impossible to identify all of the strengths and weaknesses of all of the statewide databases due to their complexity, variances, and differences in format. However, all the users agree that the statewide database has changed the way that they operate portions of their library. From cataloging, to the reference desk, to the circulation desk, to the cataloger, changes have occurred in how these various departments offered services. Simply being a tool that provides holdings information is a strength of each statewide database. While each state believes that its system could be improved, the respondents communicated their approval that the statewide database 75 exists and can provide the basis for continued growth of resource sharing in their state. Weaknesses were mentioned in great detail, but they were different for each state. Many users indicated that they would prefer a communications network that is linked to the database directly so that information does not have to be re-keyed to request materials. 
A database is only as good as the information it contains; therefore, users want a "clean" database without duplicate records, typos, and poor-quality cataloging.3 This is perceived as a weakness of almost all the statewide database systems. Interestingly enough, the Library of Congress's catalog and OCLC also have problems with keeping a clean database.4

Factors to Consider in Selecting a Statewide Database Vendor

Appendix E contains condensed responses to the Missouri RFP from Auto-Graphics, Brodart Company, and Library Corporation. Appendix F contains an example of a cost analysis of the different formats of a statewide database. The information here and the responses found in Appendix D suggest the following questions to ask in preparing to select a vendor for a statewide database.

1. How much money do you have, and how stable is that funding over several years?

2. Will the local libraries be expected to purchase equipment, or will the State Library provide grants for equipment?

3. How current do you want the database to be? If it must be constantly updated, the system must be on-line. If quarterly, semi-annual, or annual updates are acceptable, then CD-ROM is the best solution.

4. What is the purpose of the statewide database? If ILL, then either on-line or CD-ROM is preferred. Do you want an electronic ILL request system as part of the statewide database?

5. Does your state have a flat-rate telecommunications system maintained by the state? If so, this will eliminate the greatest cost of an on-line system.

6. On-line systems, if not doing item tracing, can be distributed to several regional libraries instead of having to centralize everything.

7. If on-line, can the software system handle magazine indexes and full-text journal articles? This is desired by many librarians today.

8. What kind of support staff does the State Library have to maintain the system?

9. How many libraries already have bibliographic databases of their local holdings? One possibility is to consider a consortial database like the Colorado Alliance of Research Libraries (CARL) rather than an integrated database.

10. What kinds of libraries are to be included?

In looking at a statewide database, remember that cost is only one significant consideration in selecting a vendor. Helgerson (1987) provides a comprehensive report on how to select a CD-ROM public access system; her information is still quite useful.5

Significance of This Study

This study has compiled information about statewide bibliographic databases that has never previously been published. It is significant and has made contributions to the field of library and information science. In this study can be found:

1. the strengths and weaknesses of existing statewide bibliographic databases;
2. the software currently being used;
3. the states that have statewide bibliographic databases;
4. the types of libraries included in those databases;
5. their purposes;
6. their costs;
7. their formats;
8. the number of titles and records in each database; and
9. the contact person in each state with responsibility for that statewide database.

Libraries now have information that can help them select a format and a vendor for a bibliographic database. This study can be a guide to libraries establishing their own database, or it can be used to re-evaluate a state's existing bibliographic database. It includes many factors to consider before starting to develop a statewide bibliographic database.
Recommendations for Further Study

The area not studied in as great detail as desired was the cost of a statewide database. Many State Libraries either did not have a good understanding of the actual cost of their statewide database or were reluctant to reveal the information.

Six states had multiple formats of databases. Those formats (microfiche, CD-ROM, and on-line) should be compared separately. Including them all together was like mixing apples and oranges: many of the problems of one format did not exist in another, and many things simply were not comparable. The responses from those states with multiple formats were difficult to interpret, since it was difficult to determine which format was being evaluated.

An interesting line of research would be to find out why some databases are still produced on microforms, since the Wisconsin study6 clearly shows that microfiche is more expensive to produce than CD-ROM if you are starting with no database and no equipment.

The National Research and Education Network (NREN) act opens up the possibility of having all statewide databases available nationwide at a very low cost. This could mean the death of the bibliographic utility companies, unless they can adapt to the changing technology and telecommunications that are now available to many libraries.

ENDNOTES

1. Watson, P. K. "CD-ROM catalogs: Evaluating LePac and looking ahead." Online 11, no. 5 (1987): 74-80.
2. Epler, D., and R. E. Cassell. "Access Pennsylvania: A CD-ROM database project." Library Hi Tech 5, no. 3 (1987): 81-92.
3. Flanders, Bruce. "Library Automation News and Analysis." Kansas Libraries (June 1991): 6.
4. Beall, Jeffrey. "AL Aside - Ideas: The dirty database test." American Libraries (March 1991): 197.
5. Helgerson, L. W. "Acquiring a CD-ROM Public Access Catalog System, Part 1: The bottom line may not be the top priority." Library Hi Tech 5, no. 3 (Fall 1987): 49-75.
6. Wisconsin Council on Library and Network Development. Automating Wisconsin Libraries. Madison, WI: Wisconsin State Department of Public Instruction, Division of Library Services, 1987.

APPENDIX A
STATEWIDE BIBLIOGRAPHIC HOLDINGS DATABASE ASSESSMENT QUESTIONNAIRE

Please respond to the following questions about the library in which you work and the use of the statewide bibliographic holdings database. Check or circle the appropriate reply.

Title of person completing the questionnaire: ___

1. Type of library: (a) ___ Public (b) ___ Academic (c) ___ School (d) ___ Special

2. Size of library collection: (a) ___ Under 25,000 volumes (b) ___ 25,001 - 50,000 (c) ___ 50,001 - 100,000 (d) ___ 100,001 - 250,000 (e) ___ Over 250,000

3. Annual circulation of this collection: ___

4. The statewide bibliographic holdings database is used for (check all that apply): (a) ___ Interlibrary loan (b) ___ Public access catalog (c) ___ Back-up catalog for local system (d) ___ Cataloging/acquisitions verification tool (e) ___ Collection development aid (f) ___ Other (please specify)

5. Amount of time spent daily using the statewide database: ___ (minutes/hours)

6. Amount of time spent daily on interlibrary loan processes: ___ (minutes/hours)

7. Library personnel who use the statewide database: (a) ___ Interlibrary loan staff (b) ___ Reference staff (c) ___ Technical services staff (d) ___ Library director (e) ___ Extension services staff (f) ___ Other (please specify)

8. Is the statewide database loaded on equipment dedicated to its use? (a) ___ Yes (b) ___ No

9. Do library patrons use the statewide database?
(a) ___ Yes (b) ___ No. If no, please give reason (e.g., used only in technical services, no room in public area, afraid of damage, etc.)

10. Has hardware (equipment failure or incompatibility) been a problem in using the statewide database? (a) ___ Yes (b) ___ No. If yes, please describe.

11. Has software (the retrieval system) been a problem in using the statewide database? (a) ___ Yes (b) ___ No (c) ___ Not applicable. If yes, please describe.

12. Did the State Library offer special training workshops before disseminating the statewide database? (a) ___ Yes (b) ___ No

13. Did you or someone from your library participate in a training session prior to implementation of the statewide database? (a) ___ Yes (b) ___ No

14. If yes, was the training session adequate for efficient use of the statewide database? (a) ___ Yes (b) ___ No

15. Do you or your staff feel the need for additional training? (a) ___ Yes (b) ___ No

On a scale of 1 to 5, rate the relative importance / quality / usefulness of the following (1 = Excellent, 3 = Average, 5 = Poor; circle one: 1 2 3 4 5):

16. Browse mode of searching
17. Express mode of searching
18. Boolean searching
19. "Anyword" or "keyword" searching
20. Truncated searching (wildcard "*" or "?")
21. General ease of searching
22. Response time
23. Clarity of on-screen directions
24. Information in the reference manual
25. Readability of the database user screens
26. Procedures for changing discs (if necessary)

On a scale of 1 to 5, with 1 being the greatest increase, rate the increase or decrease of the following (1 = Increased greatly, 3 = Same, 5 = Decreased greatly; circle one: 1 2 3 4 5):

27. Since the implementation of the statewide database, incoming interlibrary loan requests have:
28. Outgoing interlibrary loan requests have:
29. The fill rate (the percentage of interlibrary loan requests successfully completed) since the implementation of the statewide database has:
30. The number of blind search requests received (excluding any agreements the library has with other libraries to accept blind searches) has:

31. The approximate percentage of interlibrary loan requests verified via the statewide database is: (a) ___ 0-25% (b) ___ 26-50% (c) ___ 51-75% (d) ___ 76-100%

32. The method(s) used to transmit interlibrary loan requests prior to the statewide database was (check all that apply): (a) ___ OCLC (b) ___ U.S. Mail (c) ___ ALANET (d) ___ Local or regional network (e) ___ Phone (f) ___ Fax (g) ___ Other (please specify)

33. The method(s) used to transmit interlibrary loan requests after implementation of the statewide database is (check all that apply): (a) ___ OCLC (b) ___ U.S. Mail (c) ___ ALANET (d) ___ Local or regional network (e) ___ Phone (f) ___ Fax (g) ___ Other (please specify)

Please answer the following statistical questions to the best of your ability.

34. On the average, annual interlibrary loan requests prior to the statewide database were approximately: incoming ___ outgoing ___ (number of items; incoming means requests received from other libraries).

35. On the average, annual interlibrary loan requests since implementing the statewide database are approximately: incoming ___ outgoing ___

36. List those features of the statewide database that are especially helpful.

37. List those features of the statewide database that are in need of improvement.

38. Does the statewide database provide the needed information to find library materials? YES NO
39. What do you feel should be the first priority of statewide automation?

40. Please make any comments about the statewide database that were not already addressed above.

APPENDIX B
QUESTIONS ASKED IN SURVEY SENT TO STATE LIBRARY AUTOMATION OFFICERS

If your state has a statewide bibliographic database project, please return this survey to Stan Gardner, 4417 Stringtown Rd., Lohman, MO 65053.

1. Number of libraries involved in the state-wide bibliographic database project?

2. Types of libraries involved in the state-wide bibliographic database project: (a) ___ Public (b) ___ Academic (c) ___ School (d) ___ Special

3. What format has been selected to disseminate the state-wide bibliographic database? (a) ___ CD-ROM (b) ___ Microform (c) ___ On-line (d) ___ Print (e) ___ Other (please explain)

4. Current number of holdings and titles in the state-wide bibliographic database: holdings ___ unique titles ___

5. Vendor who maintains and produces the state-wide bibliographic database? (Or is this done "in-house"?)

6. Primary goal of the state-wide bibliographic database: (a) interlibrary loan - resource sharing (b) statewide automation development (c) local library automation development (d) public access catalogs (e) cataloging (f) other

7. What is the annual cost of the project?

8. What is the source of the funds for the project? (a) LSCA (b) State (c) Local (d) Private (e) combination of the above

9. What year was the statewide bibliographic database first produced?

10. Who is responsible for the ongoing maintenance and development of the state-wide bibliographic database? Name ___ Title ___

11. Are serials and/or audio-visual materials included in the state-wide database? (a) ___ Yes (b) ___ No

12. What is the current number of interlibrary loan requests in your state? Incoming ___ Outgoing ___ (Special libraries? Academic libraries? Public libraries? School libraries?)

13. What was the number of interlibrary loan requests in your state before implementing the statewide database? Incoming ___ Outgoing ___ (Special libraries? Academic libraries? Public libraries? School libraries?)

14. Is USMARC Communications II the protocol used in the database? If not, what is the protocol?

15. What has been the greatest value of having a statewide database in your state?

APPENDIX C
USER RESPONSES TO QUESTIONNAIRE BY STATE

Responses from Alaska: Tables C26 to C48

TABLE C26. Question 1, title of respondent (Alaska, n = 10): Computer Manager/Coordinator 1 (10%); Director, Head Librarian, Library Manager 6 (60%); Head, Collection Development 1 (10%); Head, Technical Services/Cataloging 2 (20%); all other title categories 0 (0%).
TABLE C27. Respondents by type of library (Alaska): Public 5 (50%); Academic 2 (20%); School 1 (10%); Special 2 (20%). Total 10.

TABLE C28. Size of collections (Alaska): Under 25,000 2 (20%); 25,001-50,000 2 (20%); 50,001-100,000 1 (10%); 100,001-250,000 2 (20%); Over 250,000 3 (30%). Total 10.

TABLE C29. Uses of the statewide database, question 4 (Alaska): Interlibrary loan 9 (26%); Reference staff 5 (15%); Backup 3 (9%); Cataloging/acquisitions 10 (29%); Collection development 6 (18%); Other 1 (3%). Total mentions 34.

TABLE C30. Questions 5 & 6, time spent daily (Alaska; statewide database / interlibrary loan): 0 or no response 1 / 1; 30-44 min 0 / 1; 60-119 min 0 / 2; 120-179 min 0 / 2; 180-239 min 3 / 0; 300+ min 6 / 4; all other ranges 0 / 0. Totals 10 / 10.

TABLE C31. Question 7, staff using the database (Alaska): Interlibrary loan 10 (25.6%); Reference 9 (23.1%); Technical services 10 (25.6%); Director 5 (12.8%); Extension services 4 (10.3%); Other 1 (2.6%). Total mentions 39.

TABLE C32. Question 8, dedicated equipment (Alaska): No response 1 (10%); Yes 7 (70%); No 2 (20%). Total 10.

TABLE C33. Question 9, public access (Alaska): No response 1 (10%); Yes 7 (70%); No 2 (20%). Total 10.

TABLE C34. Question 9a, reasons for no public access (Alaska): No response 8 (80%); No equipment 1 (10%); Used as a toy 1 (10%). Total 10.

TABLE C35. Question 10, hardware problems (Alaska): No response 1 (10%); Yes 0 (0%); No 9 (90%). Total 10.

TABLE C36. Question 11, software problems (Alaska): No response 1 (5%); Yes 1 (5%); No 8 (40%); Not applicable 10 (50%). Total 20.

TABLE C37. Questions 12 & 13, training (Alaska): State offered training - no response 1, yes 4, no 5; attended training - yes 8 (80%), no 2 (20%). Totals 10 / 10.

TABLE C38. Questions 14 & 15, training (Alaska): Training adequate - no response 3 (30%), yes 7 (70%), no 0 (0%); need additional training - no response 1 (10%), yes 5 (50%), no 4 (40%). Totals 10 / 10.

TABLE C39. Alaska, relative importance / quality / usefulness ratings (counts for ratings 1/2/3/4/5, 1 = Excellent, 5 = Poor, followed by no response; n = 10 per item). 16. Browse 1/5/2/0/0, NR 2. 17. Express 1/3/3/0/0, NR 3. 18. Boolean 1/2/1/2/1, NR 3. 19. Keyword 2/4/2/0/0, NR 2. 20. Wildcard 2/2/2/1/0, NR 3. 21. Searching 2/3/1/2/0, NR 2. 22. Speed 3/2/3/0/0, NR 2. 23. Directions 2/2/0/3/1, NR 2. 24. Manual 3/2/1/0/2, NR 2. 25. Screens 2/2/3/0/1, NR 2. 26. Changing discs 1/3/3/0/0, NR 3. Totals per rating category: 20 (18%), 30 (27%), 21 (19%), 8 (7%), 5 (5%), NR 26 (24%).

IMPACT OF THE STATEWIDE DATABASE ON RESOURCE SHARING

TABLE C40. Alaska, questions 27-30, increases or decreases of service (counts for ratings 1/2/3/4/5, 1 = increased greatly, 5 = decreased greatly, followed by no response; n = 10). 27. ILL incoming 3/6/0/0/0, NR 1. 28. ILL outgoing 2/7/0/0/0, NR 1. 29. Fill rate 2/5/2/0/0, NR 1. 30. Blind searches received 1/1/3/1/1, NR 3.

TABLE C41. Question 31, percentage of ILL requests verified via the database (Alaska): No response 1 (10%); 0-25% 1 (10%); 26-50% 1 (10%); 51-75% 2 (20%); 76-100% 5 (50%). Total 10.

TABLE C42. Alaska, questions 32 & 33, methods of transmitting ILL requests (mentions, prior / after): OCLC 1 / 4; U.S. Mail 9 / 7; ALANET 1 / 1; networks 3 / 4; phone 3 / 3; fax 2 / 4; other 2 / 6. Total mentions 21 / 29.

TABLE C43. Alaska, questions 34 & 35, annual ILL requests (libraries per volume range; prior incoming/outgoing, then after incoming/outgoing): No response 2/2, 2/3; under 10: 2/0, 0/0; 10-20: 2/1, 2/1; 21-44: 1/2, 0/1; 45-75: 0/0, 1/1; 76-100: 0/2, 1/0; 101-350: 2/3, 3/2; 351-500: 0/0, 0/1; 500-1000: 1/0, 1/1. Total responses 10/10, 10/10. Reported percentage increase of ILL: 10% incoming, 10% outgoing.

TABLE C44. Question 36, especially helpful features (Alaska): No response 3 (9%); Ease of use 1 (3%); Location tool 5 (16%); Reference use 1 (3%); Searching 2 (6%). Total 12.

TABLE C45. Question 37, improvements needed (Alaska): No response 3 (23%); Get all libraries on-line or CD-ROM 2 (15%); Need more libraries inputting records 1 (8%); Searching - save search terms and queue between discs 3 (23%); Statistics 1 (8%); Updating more often and consistently 3 (23%). Total 13.

TABLE C46. Question 38, does the database meet users' needs? (Alaska): No response 2 (20%); No 0 (0%); Yes 8 (80%). Total 10.

TABLE C47. Question 39, first priority of statewide automation (Alaska): No response 4 (33.3%); Accuracy in database 1 (8.3%); Automation services to all libraries 1 (8.3%); Improve ILL delivery system 1 (8.3%); Retrospective conversion 2 (16.7%); Statewide borrowing agreement 1 (8.3%); Statewide database 1 (8.3%); Database management / long-range planning 1 (8.3%). Total 12.

TABLE C48. Question 40, comments (Alaska): No response 8 (100%).

Responses from Connecticut: Tables C49 to C70

TABLE C49. Question 1, title of respondent (Connecticut, n = 24): Assistant/Associate Director 1 (4%); Computer Manager/Coordinator 1 (4%); Coordinator, Adult Services 1 (4%); Director, Head Librarian, Library Manager 12 (50%); Extension Librarian 1 (4%); Head, Reference Services 1 (4%); Name 3 (13%); Reference Librarian 3 (13%); No response 1 (4%).
TABLE C50. Respondents by type of library (Connecticut): Public 16 (67%); Academic 8 (33%); School 0; Special 0. Total 24.

TABLE C51. Size of collections (Connecticut): Under 25,000 2 (8%); 25,001-50,000 12 (50%); 50,001-100,000 4 (17%); 100,001-250,000 3 (13%); Over 250,000 2 (8%); Not responsive 1 (4%). Total 24.

TABLE C52. Uses of the statewide database, question 4 (Connecticut): Interlibrary loan 22 (34%); Public access 16 (25%); Backup 7 (11%); Cataloging/acquisitions 10 (15%); Collection development 6 (9%); Reference/other 4 (6%). Total mentions 65.

TABLE C53. Questions 5 & 6, time spent daily (Connecticut; database / ILL): 0 or no response 4 / 6; 10-19 min 2 / 0; 20-29 min 0 / 1; 30-44 min 3 / 4; 60-119 min 4 / 4; 120-179 min 3 / 3; 180-239 min 3 / 1; 240-299 min 1 / 1; 300+ min 4 / 4. Totals 24 / 24.

TABLE C54. Question 7, staff using the database (Connecticut): Interlibrary loan 21 (31.8%); Reference 18 (27.3%); Technical services 13 (19.7%); Director 12 (18.2%); Other 1 (1.5%); No response 1 (1.5%). Total mentions 66.

TABLE C55. Question 8, dedicated equipment (Connecticut): No response 2 (8.3%); Yes 20 (83.3%); No 2 (8.3%). Total 24.

TABLE C56. Question 9, public access (Connecticut): Yes 22 (92%); No 2 (8%). Total 24.

TABLE C57. Question 9a, reasons for no public access (Connecticut): No response 23 (96%); No room 1 (4%). Total 24.

TABLE C57. Question 10, hardware problems (Connecticut): No response 1 (4%); Yes 2 (8%); No 21 (88%). Total 24.

TABLE C58. Question 11, software problems (Connecticut): No response 2 (4.4%); Yes 1 (2.2%); No 20 (43.5%); Not applicable 23 (50.0%). Total 46.

TABLE C59. Questions 12 & 13, training (Connecticut): State offered training - no response 2, yes 21, no 1; attended training - yes 22 (92%), no 1 (4%). Totals 24 / 24.

TABLE C60. Questions 14 & 15, training (Connecticut): Training adequate - no response 2 (8%), yes 21 (88%), no 1 (4%); need additional training - no response 2 (8%), yes 3 (13%), no 19 (79%). Totals 24 / 24.

TABLE C61. Connecticut, importance / quality / usefulness ratings (counts 1/2/3/4/5, 1 = Excellent, 5 = Poor, then NR; n = 24). 16. Browse 8/9/4/1/0, NR 2. 17. Express 6/7/3/0/1, NR 7. 18. Boolean 3/2/4/4/2, NR 9. 19. Keyword 8/5/5/1/1, NR 4. 20. Wildcard 4/2/5/1/3, NR 9. 21. Searching 11/9/3/0/0, NR 1. 22. Speed 9/10/3/1/0, NR 1. 23. Directions 14/6/3/0/0, NR 1. 24. Manual 4/5/7/1/2, NR 5. 25. Screens 8/9/5/0/0, NR 2. 26. Changing discs 4/4/5/0/0, NR 11. Totals per rating category: 79 (30%), 68 (26%), 47 (18%), 9 (3%), 9 (3%), NR 52 (20%).

TABLE C62. Connecticut, questions 27-30 (counts 1/2/3/4/5, then NR; n = 24). 27. ILL incoming 7/6/8/1/2, NR 0. 28. ILL outgoing 8/7/8/1/0, NR 0. 29. Fill rate 3/5/15/1/0, NR 0. 30. Blind searches received 1/1/14/0/0, NR 8.

TABLE C63. Question 31 (Connecticut): No response 4 (17%); 0-25% 6 (25%); 26-50% 3 (13%); 51-75% 5 (21%); 76-100% 6 (25%). Total 24.

TABLE C64. Connecticut, questions 32 & 33, methods of transmitting ILL requests (mentions, prior / after): OCLC 9 / 9; U.S. Mail 12 / 11; networks 14 / 11; phone 15 / 13; fax 7 / 14; other 3 / 3; no response 0 / 1. Total mentions 60 / 62.

TABLE C65. Connecticut, questions 34 & 35, annual ILL requests (prior incoming/outgoing, then after incoming/outgoing): No response 2/2, 2/2; under 10: 4/2, 2/1; 10-20: 5/3, 4/2; 21-44: 2/7, 3/5; 45-75: 1/4, 2/4; 76-100: 4/0, 2/2; 101-350: 2/4, 5/6; 351-500: 1/0, 1/0; 500-1000: 1/0, 2/1; 1001+: 2/2, 1/1. Total responses 24/24, 24/24.

TABLE C66. Question 36, especially helpful features (Connecticut): No response 3 (10%); Automation plans 1 (3%); Cataloging 1 (3%); Collection development 1 (3%); Ease of use 3 (10%); ILL printed forms 2 (7%); Location tool 12 (40%); Searching 6 (20%); Verification 1 (3%). Total 30.

TABLE C67. Question 37, improvements needed (Connecticut): No response 5 (15.6%); Authority control 1 (3.1%); Cleanup 2 (6.3%); Errors - duplicate records, multiple titles 6 (18.8%); Need more libraries inputting records 8 (25.0%); Add periodicals 2 (6.3%); Manual of ILL policies 2 (6.3%); Add Boolean searching 3 (9.4%); Updating more often and consistently 3 (9.4%). Total 32.

TABLE C68. Question 38, does the database meet users' needs? (Connecticut): No response 1 (4%); No 2 (8%); Yes 21 (88%). Total 24.

TABLE C69. Question 39, first priority of statewide automation (Connecticut): No response 5 (11%); Automation services to all libraries 6 (13%); Continuing education 1 (2%); Continue with current projects 1 (2%); Full-text delivery 1 (2%); Funding 3 (7%); Improve ILL delivery system 1 (2%); Keep database updated 2 (4%); Retrospective conversion 3 (7%); Statewide borrowing agreement 1 (2%); Statewide database 1 (2%); Statewide electronic mail system 5 (11%); Establish statewide circulation system 3 (7%); Database management / long-range planning 3 (7%). Total 45.

TABLE C70. Question 40, comments (Connecticut): No response 18 (67%); Use statewide database in reference services 1 (4%); Reimbursement for ILL net lenders 4 (15%); This is our main source of information about other libraries 1 (4%); Statewide library card 2 (7%); Get it on-line 1 (4%). Total 27.

Responses from Delaware: Tables C71 to C93

TABLE C71. Question 1, title of respondent (Delaware, n = 12): Director, Head Librarian 8 (67%); Head, Reference Services 1 (8%); Assistant/Associate Director 1 (8%); Reference Librarian 1 (8%); Bibliographic Specialist 1 (8%).
TABLE C72. Respondents by type of library (Delaware): Public 9 (75%); Academic 2 (17%); School 0; Special 1 (8%). Total 12.

TABLE C73. Size of collections (Delaware): Under 25,000 6 (50%); 25,001-50,000 3 (25%); 50,001-100,000 0; 100,001-250,000 3 (25%); Over 250,000 0. Total 12.

TABLE C74. Uses of the statewide database, question 4 (Delaware): Interlibrary loan 12 (60.0%); Public access 5 (25.0%); Backup 2 (10.0%); Cataloging/acquisitions 1 (5.0%). Total mentions 20.

TABLE C75. Questions 5 & 6, time spent daily (Delaware; database / ILL): 0 or no response 2 / 1; 20-29 min 1 / 0; 30-44 min 5 / 1; 60-119 min 3 / 5; 120-179 min 0 / 2; 180-239 min 0 / 1; 240-299 min 0 / 1; 300+ min 0 / 1; other 1 / 0. Totals 12 / 12.

TABLE C76. Question 7, staff using the database (Delaware): Interlibrary loan 11 (42%); Reference 7 (27%); Technical services 1 (4%); Director 4 (15%); Extension services 1 (4%); Other 2 (8%). Total mentions 26.

TABLE C77. Question 8, dedicated equipment (Delaware): No response 2 (17%); Yes 2 (17%); No 8 (67%). Total 12.

TABLE C78. Question 9, public access (Delaware): Yes 9 (75%); No 3 (25%). Total 12.

TABLE C79. Question 9a, reasons for no public access (Delaware): No response 9 (75%); No equipment 1 (8%); No room 2 (17%). Total 12.

TABLE C80. Question 10, hardware problems (Delaware): Yes 2 (17%); No 10 (83%). Total 12.

TABLE C81. Question 11, software problems (Delaware): No response 2 (17%); Yes 1 (8%); No 9 (75%). Total 12.

TABLE C82. Questions 12 & 13, training (Delaware): State offered training - yes 9, no 3; attended training - yes 10 (83%), no 2 (17%). Totals 12 / 12.

TABLE C83. Questions 14 & 15, training (Delaware): Training adequate - no response 2 (17%), yes 7 (58%), no 3 (25%); need additional training - no response 1 (8%), yes 3 (25%), no 8 (67%). Totals 12 / 12.

TABLE C84. Delaware, importance / quality / usefulness ratings (counts 1/2/3/4/5, then NR; n = 12). 16. Browse 3/3/2/2/1, NR 1. 17. Express 3/3/4/1/1, NR 0. 18. Boolean 1/1/4/3/1, NR 2. 19. Keyword 2/4/4/2/0, NR 0. 20. Wildcard 2/0/2/5/0, NR 3. 21. Searching 1/7/3/1/0, NR 0. 22. Speed 3/2/5/1/0, NR 1. 23. Directions 3/8/0/1/0, NR 0. 24. Manual 1/3/4/1/0, NR 3. 25. Screens 4/5/2/0/1, NR 0. 26. Changing discs 1/0/3/0/0, NR 8. Totals per rating category: 24 (18%), 36 (27%), 33 (25%), 17 (13%), 4 (3%), NR 18 (14%).

TABLE C85. Delaware, questions 27-30 (counts 1/2/3/4/5, then NR; n = 12). 27. ILL incoming 0/8/4/0/0, NR 0. 28. ILL outgoing 0/7/4/0/1, NR 0. 29. Fill rate 0/3/5/3/1, NR 0. 30. Blind searches received 0/3/2/3/0, NR 4.

TABLE C86. Question 31 (Delaware): No response 1 (8%); 0-25% 0; 26-50% 1 (8%); 51-75% 5 (42%); 76-100% 5 (42%). Total 12.

TABLE C87. Delaware, questions 32 & 33, methods of transmitting ILL requests (mentions, prior / after): OCLC 0 / 1; U.S. Mail 6 / 7; networks 9 / 10; phone 7 / 6; fax 0 / 7; other 1 / 1; no service 1 / 0. Total mentions 24 / 32.

TABLE C88. Delaware, questions 34 & 35, annual ILL requests (prior incoming/outgoing, then after incoming/outgoing): No response 3/3, 1/1; under 10: 4/2, 3/1; 10-20: 0/2, 2/3; 21-44: 4/3, 2/3; 45-75: 0/1, 3/1; 76-100: 0/0, 0/2; 101-350: 0/1, 1/1; 351-500: 1/0, 0/0. Total responses 12/12, 12/12. Reported percentage increase of ILL: 10% incoming, 10% outgoing.

TABLE C89. Question 36, especially helpful features (Delaware): No response 4 (36%); Location tool 3 (27%); Searching 4 (36%). Total 11.

TABLE C90. Question 37, improvements needed (Delaware): No response 1 (6%); Authority control 1 (6%); Cleanup 1 (6%); E-mail 2 (13%); Errors - duplicate records, multiple titles 1 (6%); Need more libraries inputting records 2 (13%); Add periodicals 1 (6%); Updating more often and consistently 7 (44%). Total 16.

TABLE C91. Question 38, does the database meet users' needs? (Delaware): No 1 (8%); Yes 11 (92%). Total 12.

TABLE C92. Question 39, first priority of statewide automation (Delaware): Accuracy in database 2 (10%); Automation services to all libraries 1 (5%); Continuing education 2 (10%); Full-text delivery 1 (5%); Improve ILL delivery system 2 (10%); Keep database updated 3 (15%); Retrospective conversion 2 (10%); Statewide database 2 (10%); Statewide electronic mail system 2 (10%); Establish statewide circulation system 3 (15%). Total 20.

TABLE C93. Question 40, comments (Delaware): No response 11 (85%); "50 years behind the times" 1 (8%); No school bibliographic records in database 1 (8%). Total 13.

Responses from Iowa: Tables C94 to C116

TABLE C94. Question 1, title of respondent (Iowa, n = 16): Director, Head Librarian 9 (56%); Coordinator, Adult Services 1 (6%); Media Specialist/LRC Specialist 2 (13%); Assistant/Associate Director 2 (13%); Name 1 (6%); Computer Manager/Coordinator 1 (6%).
TABLE C95. Respondents by type of library (Iowa): Public 9 (60%); Academic 1 (7%); School 5 (33%); Special 2 (13%).

TABLE C96. Size of collections (Iowa): Under 25,000 11 (65%); 25,001-50,000 2 (12%); 50,001-100,000 1 (6%); Over 250,000 2 (12%); Not responsive 1 (6%). Total 17.

TABLE C97. Uses of the statewide database, question 4 (Iowa): Interlibrary loan 16 (59.3%); Public access 6 (22.2%); Backup 1 (3.7%); Cataloging/acquisitions 2 (7.4%); Collection development 2 (7.4%). Total mentions 27.

TABLE C98. Questions 5 & 6, time spent daily (Iowa; database / ILL): 0 or no response 3 / 3; 10-19 min 4 / 4; 20-29 min 1 / 1; 30-44 min 6 / 1; 60-119 min 1 / 6; 120-179 min 1 / 1; 240-299 min 0 / 1; 300+ min 0 / 1; other 1 / 1. Totals 17 / 19.

TABLE C99. Question 7, staff using the database (Iowa): Interlibrary loan 9 (30%); Reference 4 (13%); Technical services 3 (10%); Director 11 (37%); Other 3 (10%). Total mentions 30.

TABLE C100. Question 8, dedicated equipment (Iowa): No response 1 (6%); Yes 11 (65%); No 5 (29%). Total 17.

TABLE C101. Question 9, public access (Iowa): Yes 11 (65%); No 6 (35%). Total 17.

TABLE C102. Question 9a, reasons for no public access (Iowa): No response 11 (61%); No interest 1 (6%); No equipment 5 (28%); No room 1 (6%). Total 18.

TABLE C103. Question 10, hardware problems (Iowa): Yes 2 (12%); No 15 (88%). Total 17.

TABLE C104. Question 11, software problems (Iowa): No response 1 (6%); Yes 8 (47%); No 8 (47%). Total 17.

TABLE C105. Questions 12 & 13, training (Iowa): State offered training - yes 12, no 5; attended training - yes 14 (82%), no 3 (18%). Totals 17 / 17.

TABLE C106. Questions 14 & 15, training (Iowa): Training adequate - no response 1 (6%), yes 14 (82%), no 2 (12%); need additional training - yes 5 (29%), no 12 (71%). Totals 17 / 17.

TABLE C107. Iowa, importance / quality / usefulness ratings (counts 1/2/3/4/5, then NR; n = 17 unless noted). 16. Browse 0/2/4/1/8, NR 2. 17. Express 0/1/7/1/6, NR 2. 18. Boolean 0/2/1/0/7, NR 6 (n = 16). 19. Keyword 1/1/2/2/7, NR 4. 20. Wildcard 0/0/1/1/9, NR 6. 21. Searching 0/4/3/5/5, NR 0. 22. Speed 1/4/9/1/2, NR 0. 23. Directions 1/6/6/1/3, NR 0. 24. Manual 2/2/4/0/6, NR 3. 25. Screens 4/3/7/2/1, NR 0. 26. Changing discs 3/5/2/0/1, NR 6. Totals per rating category: 12 (7%), 30 (16%), 46 (25%), 14 (8%), 55 (30%), NR 29 (16%).

TABLE C108. Iowa, questions 27-30 (counts 1/2/3/4/5, then NR; n = 17). 27. ILL incoming 5/5/4/1/1, NR 1. 28. ILL outgoing 3/4/8/0/0, NR 2. 29. Fill rate 4/6/4/1/0, NR 2. 30. Blind searches received 1/2/6/2/0, NR 6.

TABLE C109. Question 31 (Iowa): No response 2 (12%); 0-25% 2 (12%); 26-50% 3 (18%); 51-75% 5 (29%); 76-100% 5 (29%). Total 17.

TABLE C110. Iowa, questions 32 & 33, methods of transmitting ILL requests (mentions, prior / after): OCLC 4 / 4; U.S. Mail 8 / 3; networks 6 / 11; phone 8 / 6; fax 5 / 12; other 2 / 2; no service 3 / 0. Total mentions 36 / 38.

TABLE C111. Iowa, questions 34 & 35, annual ILL requests (prior incoming/outgoing, then after incoming/outgoing): No response 6/6, 5/6; under 10: 11/9, 4/5; 10-20: 0/1, 4/5; 21-44: 1/0, 1/1; 45-75: 0/0, 1/0; 101-350: 1/1, 1/0; 1001+: 0/0, 1/0. Total responses 19/17, 17/17. Reported percentage increase of ILL: 9% incoming, 10% outgoing.

TABLE C112. Question 36, especially helpful features (Iowa): No response 9 (56%); Location tool 7 (44%). Total 16.

TABLE C113. Question 37, improvements needed (Iowa): No response 8 (38%); Authority control 1 (5%); Cleanup 1 (5%); Cumulative printing of screens or searches 1 (5%); E-mail 1 (5%); Need more libraries inputting records 2 (10%); Add periodicals 1 (5%); Searching - save search terms and queue between discs 3 (14%); Speed 2 (10%); Updating more often and consistently 1 (5%). Total 21.

TABLE C114. Question 38, does the database meet users' needs? (Iowa): No response 5 (29%); No 6 (35%); Yes 6 (35%). Total 17.

TABLE C115. Question 39, first priority of statewide automation (Iowa): No response 7 (33%); Accuracy in database 2 (10%); Automation services to all libraries 1 (5%); Continuing education 1 (5%); Continue with current projects 1 (5%); Funding 1 (5%); Improve ILL delivery system 1 (5%); Keep database updated 1 (5%); Make system easier to use 4 (19%); Statewide database 2 (10%). Total 21.

TABLE C116. Question 40, comments (Iowa): No response 14 (82%); Look forward to getting new vendor 2 (12%); Need cataloging tool 1 (6%). Total 17.

Responses from Maryland: Tables C117 to C139

TABLE C117. Question 1, title of respondent (Maryland, n = 13): Director, Head Librarian 6 (46%); ILL Coordinator/Supervisor/Head 2 (15%); Head, Reference Services 1 (8%); Assistant/Associate Director 3 (23%); Bibliographic Specialist 1 (8%).
TABLE C118. Respondents by type of library (Maryland): Public 9 (56%); Academic 3 (19%); School 2 (13%); Special 2 (13%). Total 16.

TABLE C119. Size of collections (Maryland): Under 25,000 2 (13%); 25,001-50,000 2 (13%); 50,001-100,000 6 (38%); 100,001-250,000 2 (13%); Over 250,000 4 (25%). Total 16.

TABLE C120. Uses of the statewide database, question 4 (Maryland): Interlibrary loan 16 (35.6%); Public access 4 (8.9%); Backup 7 (15.6%); Cataloging/acquisitions 11 (24.4%); Collection development 7 (15.6%). Total mentions 45.

TABLE C121. Questions 5 & 6, time spent daily (Maryland; database / ILL): 0 or no response 4 / 1; 30-44 min 3 / 1; 45-59 min 0 / 1; 60-119 min 1 / 1; 120-179 min 1 / 1; 180-239 min 2 / 1; 240-299 min 2 / 2; 300+ min 3 / 8. Totals 16 / 16.

TABLE C122. Question 7, staff using the database (Maryland): Interlibrary loan 13 (27%); Reference 15 (31%); Technical services 7 (15%); Director 6 (13%); Extension services 5 (10%); Other 2 (4%). Total mentions 48.

TABLE C123. Question 8, dedicated equipment (Maryland): Yes 11 (69%); No 5 (31%). Total 16.

TABLE C124. Question 9, public access (Maryland): Yes 15 (94%); No 1 (6%). Total 16.

TABLE C125. Question 9a, reasons for no public access (Maryland): No response 15 (94%); No CD-ROM extensions 1 (6%). Total 16.

TABLE C126. Question 10, hardware problems (Maryland): Yes 3 (19%); No 13 (81%). Total 16.

TABLE C127. Question 11, software problems (Maryland): Yes 3 (19%); No 13 (81%). Total 16.

TABLE C128. Questions 12 & 13, training (Maryland): State offered training - no response 2, yes 9, no 5 (total 16); attended training - yes 11 (73%), no 4 (27%) (total 15).

TABLE C129. Questions 14 & 15, training (Maryland): Training adequate - no response 4 (25%), yes 11 (69%), no 1 (6%); need additional training - yes 3 (19%), no 13 (81%). Totals 16 / 16.

TABLE C130. Maryland, importance / quality / usefulness ratings (counts 1/2/3/4/5, then NR; per-item totals as printed in parentheses). 16. Browse 2/5/5/2/0, NR 2 (16). 17. Express 2/3/4/0/0, NR 2 (11). 18. Boolean 0/2/1/3/3, NR 2 (11). 19. Keyword 0/4/8/2/0, NR 1 (15). 20. Wildcard 0/1/3/0/4, NR 4 (12). 21. Searching 3/7/4/0/0, NR 2 (16). 22. Speed 3/5/3/4/0, NR 1 (16). 23. Directions 3/9/3/0/0, NR 1 (16). 24. Manual 0/5/5/1/1, NR 1 (13). 25. Screens 2/11/3/0/0, NR 1 (17). 26. Changing discs 0/4/6/1/0, NR 1 (12). Totals per rating category: 15 (9%), 56 (34%), 45 (29%), 13 (9%), 8 (6%), NR 18 (12%).

TABLE C131. Maryland, questions 27-30 (counts 1/2/3/4/5, then NR). 27. ILL incoming 4/4/7/1/0, NR 0 (n = 16). 28. ILL outgoing 4/6/6/0/0, NR 0 (n = 16). 29. Fill rate 0/3/10/1/1, NR 0 (n = 15). 30. Blind searches received 0/1/7/3/0, NR 3 (n = 14).

TABLE C132. Question 31 (Maryland): 26-50% 3 (19%); 51-75% 5 (31%); 76-100% 8 (50%). Total 16.

TABLE C133. Maryland, questions 32 & 33, methods of transmitting ILL requests (mentions, prior / after): OCLC 3 / 3; U.S. Mail 7 / 6; networks 8 / 7; phone 8 / 8; fax 3 / 9; other 7 / 6; no service 0 / 1. Total mentions 36 / 40.

TABLE C134. Maryland, questions 34 & 35, annual ILL requests (prior incoming/outgoing, then after incoming/outgoing): No response 5/5, 3/3; under 10: 4/4, 1/1; 10-20: 0/0, 1/1; 21-44: 0/2, 1/0; 45-75: 1/0, 2/1; 76-100: 1/0, 0/2; 101-350: 1/2, 2/2; 351-499: 1/1, 1/0; 500-1000: 0/1, 1/2; 1001+: 3/1, 4/4. Total responses 16/16, 16/16. Reported percentage increase of ILL: 10% incoming, 10% outgoing.

TABLE C135. Question 36, especially helpful features (Maryland): No response 3 (12%); Automation plans 2 (8%); Cataloging 1 (4%); Collection development 1 (4%); Ease of use 6 (24%); ILL forms 2 (8%); Location tool 3 (12%); Reference use 2 (8%); Searching 5 (20%). Total 25.

TABLE C136. Question 37, improvements needed (Maryland): No response 4 (13%); Authority control 1 (3%); Add Boolean and keyword searching to software 7 (23%); Get everybody on-line 3 (10%); Item location 2 (6%); Put limits on ILL materials loaned 2 (6%); Searching - save search terms and queue between discs 5 (16%); Updating CD-ROM versions more often and consistently 7 (23%). Total 31.

TABLE C137. Question 38, does the database meet users' needs? (Maryland): No response 5 (31%); No 0; Yes 11 (69%). Total 16.

TABLE C138. Question 39, first priority of statewide automation (Maryland): No response 2 (11%); Accuracy in database 2 (11%); Automation services to all libraries 1 (5%); Continuing education 1 (5%); Funding 4 (21%); Improve ILL delivery system 2 (11%); Retrospective conversion 1 (5%); Statewide borrowing agreement 1 (5%); Statewide database 1 (5%); Statewide electronic mail system 1 (5%); Establish statewide circulation system 3 (16%). Total 19.

TABLE C139. Question 40, comments (Maryland): No response 11 (79%); Database in three formats 2 (14%); Most libraries use CD-ROM 1 (7%). Total 14.

Responses from Missouri: Tables C140 to C161

TABLE C140. Respondents by type of library (Missouri): Academic 29 (31%); Public 60 (64%); School 2 (2%); Special 3 (3%). Total 94.

TABLE C141. Size of collections (Missouri): Under 25,000 23 (24.5%); 25,001-50,000 30 (32%); 50,001-100,000 20 (21.5%); 100,001-250,000 15 (16%); Over 250,000 5 (5%). Total 94.

TABLE C142. Uses of the statewide database, question 4 (Missouri): Backup 9 (5.6%); Cataloging/acquisitions 38 (23.8%); Collection development 14 (8.8%); Interlibrary loan 92 (57.5%); Public access 5 (3.1%); Reference 2 (1.3%). Total mentions 160.

TABLE C143. Question 9, public access (Missouri): No response 2 (2%); Yes 14 (14%); No 80 (84%). Total 94.

TABLE C144. Public use of the statewide database, question 9 (Missouri): Public did use 12 (7%); public did not use 80 (49%); afraid of damage 8 (5%); no room 19 (12%); no equipment 45 (27%). Total mentions 164.

TABLE C145. Questions 5 & 6, time spent daily (Missouri; database / ILL): 0 or no response 14 / 11; less than 10 min 11 / 7; 10-19 min 15 / 7; 20-29 min 9 / 7; 30 min 14 / 14; 45 min 3 / 4; 60 min 19 / 16; 120 min 2 / 8; 180 min 3 / 4; 240 min 2 / 5; 300 min 1 / 10; other 1 / 1. Totals 94 / 94.

TABLE C146. Question 7, staff using the database (Missouri): No response 2 (1%); Interlibrary loan 70 (38%); Director 49 (26%); Assistant director 2 (1%); Reference 31 (17%); Technical services 27 (15%); Other 5 (3%). Total mentions 186.

TABLE C147. Question 8, dedicated equipment (Missouri): No response 3 (3%); Yes 54 (58%); No 37 (39%). Total 94.

TABLE C148. Question 10, hardware problems (Missouri): No response 3 (3%); Yes 8 (9%); No 83 (88%). Total 94.

TABLE C149. Question 11, software problems (Missouri): No response 5 (5%); Yes 20 (21%); No 69 (74%). Total 94.

TABLE C150. Questions 12 & 13, training (Missouri): State offered training - yes 94; attended training - no response 4 (4%), yes 75 (80%), no 15 (16%). Totals 94 / 94.

TABLE C151. Questions 14 & 15, training (Missouri): Training adequate - no response 21 (22%), yes 66 (71%), no 7 (7%); need additional training - no response 4 (4%), yes 14 (15%), no 76 (81%). Totals 94 / 94.

TABLE C152. Missouri, importance / quality / usefulness ratings (counts 1/2/3/4/5, then NR; n = 94). 16. Browse 29/27/23/5/4, NR 6. 17. Express 17/33/23/11/5, NR 5. 18. Boolean 2/19/30/9/4, NR 30. 19. Keyword 11/22/32/11/6, NR 12. 20. Wildcard 1/17/25/11/6, NR 34. 21. Searching 15/37/32/3/3, NR 4. 22. Speed 4/4/21/31/18, NR 16. 23. Directions 19/28/35/7/2, NR 3. 24. Manual 8/18/40/14/3, NR 11. 25. Screens 19/33/32/6/1, NR 3. 26. Changing discs 5/18/42/11/15, NR 3. Totals per rating category: 130 (13%), 256 (25%), 335 (32%), 119 (12%), 67 (6%), NR 127 (12%).
11 143 TABLE C138 Maryland Question # 39 Priority Responses: n % No Responses 2 11% Accuracy in Database 2 11% Automation Services to all Libraries 1 5% Continuing Education 1 5% Continue with current projects 0 0% Full text deliver 0 0% Funding 4 21% Improve ILL delivery system 2 11% Keep Database updated 0 0% Make system easer to use 0 0% Retrospective Conversion 1 5% Statewide Borrowing Agreement 1 5% Statewide database 1 5% Statewide electronic mail system 1 5% Establish statewide circulation system 3 16% Totals 19 100% Table C139 Maryland Question # 40 Comments Responses: n % No Responses 11 79% Database in three formats 2 14% Most libraries use CD-ROM 1 7% 14 100% 144 Responses from Missouri: Tables C140 Table C140 to C161 Respondents to Questionnaire by TYPE of Library - Missouri Type Number Percentage Academic 29 31% Public 60 64% School 2 2% Special 3 3% Totals: 94 100% Table C141 Size of Collections Responses: n % of users Under 25,000 23 24.5% 25,001 - 50,000 30 32% 50,001 - 100,000 20 21.5% 100,001 - 250,000 15 16% Over 250,000 5 5% Total 94 100% Table C142 Uses of the Statewide database - Question #4 Description Number Backup 9 5.6% Cataloging / Acquisitions 38 23.8% Collection Development 14 8.8% Interlibrary loan 92 57.5% Public Access 5 3.1% Reference 2 1.3% 160 100% Table C143 Question # 9 - Public Access? Responses n % No response 2 2% Yes 14 14% No 80 84% Total 94 100% 145 Table C144 Public use of the Statewide database - Question # 9 Responses: n % PubLic did use. 12I 7% Public did not use 80 49%_ Afraid of damage 8 5% No Room 19 12% No equipment 45127% 164 100% Table C145 Questions #5 & 6 - Amount of tim spend daily on: Statewide Database Interlibrary loan Minutes n % n % 0 or no response 14 15% 11 11.7% Less than 10 11 12% 7 7.5% 10 to 19 15 16% 7 7.5% 20 to 29 9 10% 7 7.5% 30 14 15% 14 14.9% 45 3 3% 4 4.3% 60 19 20% 16 17.0% 120 2 2% 8 8.5% 180 3 3% 4 4.3% 240 2 2% 5 5.3% 300 1 1% 10 10.6% Other 1 1% 1 1.1% Total 94 J1 100% 94 100% 146 Table C146 Question # 7 - Type of staff using database Staff n % No Response 2 1% Interlibrary loan 70 38% Director 49 26% Assistant Director 2 1% Reference 31 17% Technical Services 27 15% Other 5 3% Total 186 100% Table C147 Question # 8 - dedicated equipment Responses n No response 3 3% Yes 54 58% No 37 39% Total 94 100% Table C148 Question # 10 - Hardware Responses n__ No response 3 3% Yes 8 9% No 83 88% Total 94 100% Table C149 Question # 11 - Software Responses n % No response 5 5% Yes 20 21% No 69 74% Total 94 100% 147 Table C150 Question # 12 & 13 - Training Responses n - State n - Attended % Training _I _-II No response 0 4 4% Yes 94 75 80% No 0 15 16% TotaL 94 94 100% Table C151 Questions 14 & 15 - Training Responses n - Adequate training % n - need % Training No response 21 22% 4 4% Yes 66 71% 14 15% No 7 7% 76 81% Total 94 100% 94 100% 148 Table C152 Importance / Quality / Usefulness 1 = Excelent, 5 = Poor Question # & Descriptor 1 2 3 4 5 NR Totals Across 16. Browse - Author, 29 27 23 5 4 6 94 Title, or Subject Searches. % 31% 29% 24% 5% 4% 6% 100% 17. Express - Advanced 17 33 23 11 5 5 94 level of searching. % 18% 35% 24% 12% 5% 5% 100% 18. Boolean 2 19 30 9 4 30 94 % 2% 20% 32% 10% 4% 32% 100% 19. Keyword 11 22 32 11 6 12 94 % 12% 23% 34% 12% 6% 13% 100% 20. WiLdcard 1 17 25 11 6 34 94 % 1% 18% 27% 12% 6% 36% 100% 21. Searching 15 37 32 3 3 4 94 % 16% 39% 34% 3% 3% 4% 100% 22. Speed 4 4 21 31 18 16 94 % 4% 4% 22% 33% 19% 17% 100% 23. Directions 19 28 35 7 2 3 94 % 20% 30% 37% 7% 2% 3% 100% 24. 
Manual 8 18 40 14 3 11 94 %_9% 19% 43% 15% 3% 12% 100% 25. Screens 19 33 32 6 1 3 94 % 20% 35% 34% 6% 1% 3% 100% 26. Changing Discs 5 18 42 11 15 3 94 % 5% 19% 45% 12% 16% 3% 100% Total # per category 130 256 335 119 67 127 94 Average Percentage of 13% 25% 32% 12% 6% 12% each category Average of each 12 23 30 11 6 12 category 149 Table C153 Questions 27-30, Increases or decreases of service. 1 = increased, 5 decreased. Question # & 1 2 3 4 5 NR Totals 27. ILL 10 33 41 6 0 4 94 incoming % 11% 35% 44% 6% 0% 4% 100% 28. ILL 9 36 40 5 0 4 94 outgoing % 10% 38% 43% 5% 0% 4% 100% 29. Fill Rate 14 38 35 3 0 4 94 % 15% 40% 37% 3% 0% 4% 100% 30. Blind 3 6 42 16 9 18 94 Searches received % 3% 6% 45% 17% 10% 19% 100% Table C154 Question # 31 NR 0-25% 26-50% 51-75% 76-100% Total # 5 17 13 15 44 94 5% 18% 14% 16% 47% 100% Table C155 Questions # 32 & 33, Methods of ILL OCLC Mail ALANET Netwo Phone Fax Other None Total rks # 32. Prior 21 75 8 57 40 5 1 1 208 % 10% 36% 4% 27% 19% 2% 0% 0% 100% 33. After 67 72 50 22 37 25 3 1 277 24% 26% 18% 8% 13% 9% 1 0% 100% Percentage 3% -1% 6% -39% -1% 5% 3% 0% 0% Increase/decrease 150 Table C156 Questions $ 34 & 35 Prior to database Descriptor After database incoming outgoing incoming outgoing 40 32 No Service 24 17 43% 34% Percentage of respondents 26% 18% giving no ILL service. Percent Decrease of no 60% 53% service 1 1 No Response 10 0 21 26 <10 32 29 12 11 10-20 7 23 7 9 21-44 7 7 5 6 45-75 2 7 0 2 76-100 0 3 3 5 101-350 6 5 1 1 351-500 1 0 10 500-1000 2 2 3 _ 11001+ 3 1 54 62Total Responses 70 77 Percentage increase / 13% 13% Decrease of ILL Table C157 Question # 36 Response n % No Response 26 23% Author/Title index 22 19% Cataloging 4 4% Ease of use 7 6% Location tool 18 16% Reference use 5 4% Searching 27 24% SeriaLs 4 4% 113 100% 151 Table C158 Improvements needed Question # 37 Responses: n % No Responses 26 24% Authority control 4 4% Changing Discs 9 9% Errors - duplicate records - multiple titles 14 13% Exiting 3 2.5% Help Screens 2 2% Need more libraries inputting records 5 5% Periodical disc 2 2% Searching 9 8.5% Software 5 5% Speed 16 15% Updating more often & consistently 10 10% 105 100% Table C159 Question # 38 Meet the users needs? Responses: n % No Response 14 15% No 5 5% Yes 75 80% 94 100% 152 Table C160 Question # 39 Priority Responses: n % No Responses 45 33% Accuracy in Database 4 3% Automation Services to all Libraries 8 6% Continuing Education 8 6% Continue with current projects 8 6% Fax machines in every Library 2 1% Full text online databases 4 3% Funding 8 6% Improve ILL delivery system 3 2% Keep Database updated 4 3% Make system easer to use 5 4% More consultants 2 1% On-Line systems 6 4% Retrospective Conversion 9 7% Statewide Borrowing Agreement 2 1% Statewide database 8 6% Statewide electronic mail system 3 2% Switch to OCLC 3 2% Vendor - change 1 1% Out of print materials 1 1% Directory of libraries 1 1% Establish statewide circuLation system 1 1% Sub-totals 129 94 TotaLs 136 100% 153 Table C161 Question # 40 Comments Responses: n % No Responses 90 93% Use Statewide database in Reference Services 2 2% Centralized billing for ILL 2 2% Reimbursement for ILL net lenders 1 1% More training is needed in automation 2 2% 97 100% 154 Responses from Pennsylvania: Tables C162 to C183. 
Table C162 Respondents to Questionnaire by TYPE of Library - Pennsylvania Type Number Percentage Academic 2 4% Public 10 21% School 34 72% Special 1 2% Totals: 47 100% Table C163 Size of Cotlections Responses: n % of users Under 25,000 32 68% 25,001 - 50,000 11 23% 50,001 - 100,000 1 2% 100,001 - 250,000 1 2% Over 250,000 0 0% Not Responsive 2 4% Total 47 100% Table C164 Uses of the Statewide database - Question #4 Description Number Backup 11 8% Cataloging / Acquisitions 35 27% Collection Development 17 13% Interlibrary loan 45 34% Public Access 21 16% Reference - Other 3 2% 132 100% 155 Table C165 Questions #5 & 6 - Amount of time spend daiLy on: Statewide Database InterLibrary Loan Minutes n % n % 0 or no response 1 2% 3 6% Less than10 0 0% 0 0% 10 to19 2 4% 7 15% 20 to 29 3 6% 4 9% 30 - 44 4 9% 9 19% 45 - 59 2 4% 3 6% 60 - 119 9 19% 10 21% 120 - 179 14 30% 6 13% 180 - 239 4 9% 0 0% 240 - 299 1 2% 2 4% 300 + 5 11% 1 2% Other 2 4% 2 4% TotaL 47 100% 47 99% Table C166 Question # 7 - Type of staff using database Staff n % No Response 0 0% Intertibrary Loan 28 29% Director 32 34% Assistant Director 0 0% Reference 14 15% TechnicaL Services 8 8% Other 13 14% Total 95 100% 156 Table C167 Question # 8 - dedicated equipment Responses n % No response 1 2% Yes 25 53% No 21 45% Total 47 100% Table C168 Question # 9 - Public Access? Responses n % No response 0 0% Yes 40 85% No 7 15% Total 47 100% Table C169 No PubLic Access - Why Question 9A Responses n % No Response 0 0% No Interest 1 14% No Equipment 3 43% No Room 3 43% TotaL 7 100% Table C170 Question # 10 - Hardware Responses n % No response 1 2% Yes 8 17% No 38 81% Total 47 100% 157 Table C171 Question_#_11 - Software Responses n % No response 1 2% Yes 2 4% No 44 94% TotaL 47 100% Table C172 Question # 12 & 13 - Training Responses n - State n - Attended % Training No response 0 1 2% Yes 47 46 98% No 0 0 0% Total 47 47 100% Table C173 Questions 14 & 15 - Training Responses n - Adequate training % n - need % Training _ No response 1 2% 0 0% Yes 44 94% 10 21% No 2 4% 37 79% TotaL 47 100% 47 100% 158 Table C174 Importance / Quality / Usefulness 1 = Excellent, 5 = Poor Question # & Descriptor 1 2 3 4 5 NR Totals Across 16. Browse - Author, 15 17 13 1 1 0 47 Title, or Subject Searches. %32% 36% 28% 2% 2% 0% 100% 17. Express - Advanced 31 12 2 1 0 1 47 level of searching. %66% 26% 4% 2% 0% 2% 100% 18. Boolean 15 14 14 1 0 3 47 %_32% 30% 30% 2% 0% 6% 100% 19. Keyword 25 14 6 2 0 0 47 %_53% 30% 13% 4% 0% 0% 100% 20. WiIdcard 13 13 14 2 1 4 47 % 28% 28% 30% 4% 2% 9% 101% 21. Searching 20 22 4 0 0 1 47 %_43% 47% 9% 0% 0% 2% 101% 22. Speed 5 15 17 6 3 1 47 % 11% 32% 36% 13% 6% 2% 100% 23. Directions 14 18 12 2 0 1 47 %_30% 38% 26% 4% 0% 2% 100% 24. Manual 9 21 15 0 0 2 47 %_19% 45% 32% 0% 0% 4% 100% 25. Screens 15 24 7 0 0 1 47 % 32% 51% 15% 0% 0% 2% 100% 26. Changing Discs 7 17 15 6 0 2 47 % 15% 36% 32% 13% 0% 4% 100% Total # per category 169 187 119 21 5 16 47 Average Percentage of 33% 36% 23% 4% 1% 3% 100% each category Average of each 15 17 11 2 0 1 category 159 Table C175 Questions 27-30, Increases or decreases of service. 1 = increased, decreased. Question#& 1 2 3 4 5 NR Totals Descriptor Across 27. ILL 25 15 4 1 0 2 47 incoming %_53% 32% 9% 2% 0% 4% 100% 28. ILL 22 13 8 2 0 2 47 outgoing %_47% 28% 17% 4% 0% 4% 100% 29. Fill Rate 12 18 9 2 1 5 47 %_26% 38% 19% 4% 2% 11% 100% 30. 
Blind 3 3 18 7 6 10 47 Searches received % 6% 6% 38% 15% 13% 21% 99% Table C176 Question # 31 NR 0-25% 26-50% 51-75% 76-100% Total # 10 4 1 7 25 47 21% 9% 2% 15% 53% 100% Table C177 Questions # 32 & 33, Methods of ILL OCLC Mail ALANET Netwo Phone Fax Other Total rks 32. Prior 4 26 0 16 16 6 13 81 %_5% 32% 0% 20% 20% 7% 16% 100% 33. After 3 41 0 26 25 28 3 126 %_2% 33% 0% 21% 20% 22% 2% 100% Percentage 1% -2% 0% -2% -2% 5% 0% 64% Increase/decrease 160 Table C178 Questions_$_34_&_35 Prior to database Descriptor After database incoming outgoing incoming outgoing 24 21 No Service 2 4 51% 45% Percentage of respondents 2% 4% giving noILL service. Percent Decrease of no -83% -19% service 4 4 No Response 6 4 12 14 <10 8 10 2 6 10-20 21 12 2 3 21-44 5 9 0 0 45-75 3 5 0 0 76-100 1 0 1 1 101-350 0 2 0 0 351-500 0 0 1 0 500-1000 1 0 0 0 1001+ 0 0 22 28 Total Responses 45 42 Percentage increase / 20% 15% Decrease of ILL Table C179 Question # 36 Response n % No Response 7 8% Automation Plans 1 1% CataLoging 6 7% ColLection Development 1 1% Ease of use 5 6% ILL Printed Forms 9 10% Location tooL 21 24% Reference use 1 1% Searching 15 17% Verification 22 25% 88 100% 161 Table C180 Improvements needed Question # 37 Responses: n % No Responses 9 13% Authority controL 2 3% Changing Discs 1 1% CLeanup 6 9% CumuLative printing of screens or search 3 5% Division of database other than by dates 3 4% E-MaiL 2 3% Errors - dupLicate records - muLtipLe titLes 15 22% Need more Libraries inputting records 3 5% Periodical add 2 3% RefusaL to Loan materiaLs 5 7% Searching - save search terms & que between 3 4% discs Speed 11 16% Updating more often & consistentLy 3 5% 68 100% Table CI81 Question # 38 Meet the users needs? Responses: n % No Response 10 21% No 0 %0 Yes 37 %79 47 100% 162 Table C182 Question # 39 Priority Responses: n % No Responses 14 18% Accuracy in Database 4 5% Automation Services to alL Libraries 2 3% Continuing Education 1 1% Continue with current projects 4 5% Full text deLiver 4 5% Funding 14 18% Improve ILL delivery system 2 3% Keep Database updated 3 4% Make system easer to use 2 3% Retrospective Conversion 6 8% Statewide Borrowing Agreement 8 10% Statewide database 4 5% Statewide electronic mail system 6 8% Establish statewide circulation system 3 4% TotaLs 77 100% Table C183 Question # 40 Comments Responses: n % No Responses 36 88% Use Statewide database in Reference Services 1 2% Reimbursement for ILL net tenders 4 10% 41 100% 163 Responses from North Dakota: Tables C184 to C206. TABLE C184 Title: Question # 1: Title of Respondent - N.D. n _% Assistant - Associate Director 1 9% Assistant ILL 0 0% Bibliographic Specialist 0 0% Computer Manager, Coordinator 0 0% Coordinator Adult Services 0 0% Director, Head Librarian, Library Manager 6 55% Extension Librarian 0 0% Head, Collection Development 0 0% Head, Reference Services 0 0% Head, Technical Services, Cataloging 0 0% ILL Coordinator, Supervisor, Head, etc. 0 0% Library Clerk 0 0% Library Tech 0 0% Media Specialist, LRC Specialist, Information Specialist 0 0% Name 1 90 Reference Librarian 3 27% System Operator 0 0% No Response 0 0% 11 100% 164 TABLE C185 Type Number Percentage Public 2 18% Academic 6 55% School 0 0% Special 3 27% Totals: 11 100% TABLE C186 Size of Collections - N.D. Responses: n % of users Under 25,000 3 27.3% 25,001 - 50,000 1 9.1% 50,001 - 100,000 3 27.3% 100,001 - 250,000 2 18.2% over 250,000 2 18.2% Not Responsive 0 0.0% Total 11 100.1% TABLE C187 Uses of the Statewide database - Question f4-NI.D. 
Description Number 0% Interlibrary loan 10 27% Public Access 10 27% Backup 2 5% Cataloging / Acquisitions 9 24% Collection Development 5 14% Reference - Other 1 3% 371100% Res ondents to Questionnaire by TYPE of Library - N.D. 165 TABLE C188 Questions #5 & 6 - N.D. - Amount of time spend daily on: Statewide databasee [ InterLibrary Loan Minutes n % n % 0 or no response 2 18% 2 20.0% Less than 10 0 0% 0 0.0% 10 to 19 1 9% 0 0.0% 20 to 29 1 9% 0 0.0% 30 - 44 1 9% 0 0.0% 45 - 59 0 0% 0 0.0% 60 - 119 0 0% 0 0.0% 120 - 179 1 9% 1 10.0% 180 - 239 0 0% 10.0% 240 - 299 0 0% 3 30.0% 300 + 5 45% 3 30.0% Other 0 0% 0 0.0% TotaL 11 99%10 100.0% TABLE C189 Question # 7 - Type of staff using database-NI.D. Staff n % InterLibrary Loan 11 25.0% Reference 10 22.7% Technical Services 9 20.5% Director 9 20.5% Extension Services staff 3 6.8% Other 2 4.6% No Response 0 0.0% Total 44 100.1% 166 TABLE C190 Question # 8 - dedicated equipment - N.D. Responses n % No response 0 0.0% Yes 8 72.7% No 3 27.3% Total 11 100.0% TABLE C191 Question # 9 - Public Access? - N.D. Responses n % No response 0 0% Yes 8 73% No 3 27% Total 11 100% TABLE C192 No Public Access - Why - N.D. Question_9A Responses n % No Response 9 82% No Interest 0 0% No Equipment 2 18% Difficulty of use 0 0 Staff use onLy 0 0 No Room 0 0% Total 11 100% TABLE C193 Question # 10 - Hardware - N.D. Responses n % No response 1 9% Yes 3 27% No 7 64% TotaL 11 100% 167 TABLE C194 Question # 11 - Software - N.D. Responses n % No response 0 0% Yes 0 0% No 9 50% Not Applicable 9 50% Total 18 100% TABLE C195 Question # 12 & 13 - Training - N.D. Responses n - State n - Attended % Training No response 1 0 %10 Yes 4 8 73 No 6 3 %27 Total 11 11 100% TABLE C196 Questions 14 & 15 - Training - N.D. Responses n - Adequate training % n - need % Training No response 5 45% 1 9% Yes 5 45% 6 55% No 1 9% 4 36% Total 11 99% 11 100% 168 TABLE C197 N.D. Importance / Quality / Usefulness 1 = Excellent, 5 = Poor____' Question # & Descriptor 1 2 3 4 5 NR Totals ____ I_ I IAcross 16. Browse - Author, 5 3 2 1 0 0 11 Title, or Subject Searches. % 45% 27% 18% 9% 0% 0% 99% 17. Express - Advanced 4 2 2 0 1 2 11 level of searching. % 36% 18% 18% 0% 9% 18% 99% 18. Booean 5 2 1 1 0 2 11 % 45% 18% 9% 9% 0% 18% 99% 19. Keyword 6 1 1 0 1 2 11 %_55% 9% 9% 0% 9% 18% 100% 20. Witdcard 5 3 0 0 1 2 11 %45% 27% 0% 0% 9% 18% 99% 21. Searching 6 2 2 0 1 0 11 % 55% 18% 18% 0% 9% 0% 100% 22. Speed 4 5 1 1 0 0 11 %_36% 45% 9% 9% 0% 0% 99% 23. Directions 3 5 1 1 1 0 11 %_27% 45% 9% 9% 9% 0% 99% 24. Manual 2 2 2 3 0 2 11 % 18% 18% 18% 27% 0% 18% 99% 25. Screens 3 5 1 0 1 1 11 %27% 45% 9% 0% 9% 9% 99% 26. Changing Discs 0 0 0 0 0 11 11 % 0% 0% 0% 0% 0% 100% 100% Total # per category 43 30 13 7 6 22 11 Average Percentage of 35% 25% 11% 6% 5% 18% 100% each category Average of each 4 3 1 1 1 2 category 169 TABLE C198 N.D. Questions 27-30, Increases or decreases of service. 1 = increased, 5 decreased. Question # & 1 2 3 4 5 NR Totals Descriptor Across 27. ILL 7 1 2 0 0 1 11 incoming % 64% 9% 18% 0% 0% 9% 100% 28. ILL 3 3 3 2 0 0 11 outgoing % 27% 27% 27% 18% 0% 0% 99% 29. Fill Rate 2 2 5 1 0 1 11 % 18% 18% 45% 9% 0% 9% 99% 30. Blind 0 1 5 3 0 2 11 Searches received % 0% 9% 45% 27% 0% 18% 99% TABLE C199 Question # 31 - N.D. NR 0-25% 26-50% 51-75% 76-100% Total # 4 2 5 0 0 11 36% 18% 45% 0% 0% 99% TABLE C200 N.D. Questions # 32 & 33, Methods of ILL OCLC Mail ALANET Netwo Phone Fax Other NR Total rks# 32. Prior 6 8 0 10 6 4 0 0 34 % 18% 24% 0% 29% 18% 12% 0% 0% 101% 33. 
  33. After   6 (18%)   6 (18%)   0 (0%)   10 (29%)   6 (18%)   5 (15%)   1 (3%)   0 (0%)   34 (101%)
  Increase/decrease: 1%, -1%, 0%, -1%, -1%, 1%, 0%, 0% (total 100%)

TABLE C201  Questions #34 & 35 - N.D.
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Response       2               2               2               2
  <10               0               0               0               0
  10-20             1               0               0               0
  21-44             0               2               1               0
  45-75             2               0               0               1
  76-100            0               1               0               1
  101-350           5               5               6               6
  351-500           0               0               1               0
  500-1000          1               1               0               0
  1001+             0               0               1               1
  Total Responses   11              11              11              11
  Percentage increase/decrease of ILL: incoming 10%, outgoing 10%

TABLE C202  Question #36 - N.D.
  Response                 n    %
  No Response              3    18%
  Automation Plans         0    0%
  Cataloging               3    18%
  Collection Development   1    6%
  Ease of use              2    12%
  ILL Printed Forms        2    12%
  Location tool            1    6%
  Reference use            2    12%
  Searching                2    12%
  Verification             1    6%
  Total                    17   102%

TABLE C203  N.D. Improvements needed, Question #37
  Responses                                      n    %
  No Responses                                   3    21%
  Authority control                              0    0%
  Acquisitions                                   2    14%
  Circulation procedures                         2    14%
  Cumulative printing of screens or search       1    7%
  Division of database other than by dates       0    0%
  Indexes to manuals, on-screen instructions     3    21%
  Errors - duplicate records - multiple titles   0    0%
  Need more libraries inputting records          1    7%
  Periodicals add                                1    7%
  Refusal to loan materials                      0    0%
  Searching                                      1    7%
  Speed                                          0    0%
  Updating more often & consistently             0    0%
  Total                                          14   98%

TABLE C204  Question #38 - N.D. - Meet the users' needs?
  No Response   2    18%
  No            2    18%
  Yes           7    64%
  Total         11   100%

TABLE C205  N.D. Question #39 - Priority
  Responses                                   n    %
  No Responses                                4    31%
  Accuracy in database                        0    0%
  Automation services to all libraries        1    8%
  Continuing education                        1    8%
  Continue with current projects              0    0%
  Full text delivery                          0    0%
  Funding                                     0    0%
  Improve ILL delivery system                 1    8%
  Keep database updated                       0    0%
  Make system easier to use                   0    0%
  Retrospective conversion                    1    8%
  Statewide borrowing agreement               1    8%
  Statewide database                          3    23%
  Statewide electronic mail system            0    0%
  Establish statewide circulation system      1    8%
  "I don't understand what Priority means?"   0    0%
  Database management - long range planning   0    0%
  Totals                                      13   102%

TABLE C206  N.D. Question #40 - Comments
  Responses                                                       n    %
  No Responses                                                    10   91%
  Use statewide database in reference services                    0    0%
  Funding for private libraries                                   1    9%
  If materials cost less than $20, should not loan                0    0%
  Our library does not provide ILL services                       0    0%
  This is our main source of information about other libraries   0    0%
  Total                                                           11   100%

Responses from Ohio: Tables C207 to C229.

TABLE C207  Question #1: Title of Respondent - Ohio
  Title                                                      n    %
  Assistant - Associate Director                             0    0%
  Assistant ILL                                              0    0%
  Bibliographic Specialist                                   0    0%
  Computer Manager, Coordinator                              0    0%
  Coordinator Adult Services                                 1    10%
  Director, Head Librarian, Library Manager                  7    70%
  Extension Librarian                                        0    0%
  Head, Collection Development                               0    0%
  Head, Reference Services                                   0    0%
  Head, Technical Services, Cataloging                       1    10%
  ILL Coordinator, Supervisor, Head, etc.                    1    10%
  Library Clerk                                              0    0%
  Library Tech                                               0    0%
  Media Specialist, LRC Specialist, Information Specialist   0    0%
  Name                                                       0    0%
  Reference Librarian                                        0    0%
  System Operator                                            0    0%
  No Response                                                0    0%
  Total                                                      10   100%

TABLE C208  Respondents to Questionnaire by TYPE of Library - Ohio
  Public     9    90%
  Academic   0    0%
  School     0    0%
  Special    1    10%
  Totals     10   100%

TABLE C209  Size of Collections - Ohio
  Under 25,000        0    0.0%
  25,001 - 50,000     5    50.0%
  50,001 - 100,000    2    20.0%
  100,001 - 250,000   3    30.0%
  Over 250,000        0    0.0%
  Not Responsive      0    0.0%
  Total               10   100.0%

TABLE C210  Uses of the Statewide Database - Question #4 - Ohio
  Interlibrary loan           9    50%
  Public access               2    11%
  Backup                      1    6%
  Cataloging / Acquisitions   5    28%
  Collection development      1    6%
  Reference - other           0    0%
  Total                       18   101%

TABLE C211  Questions #5 & 6 - Ohio - Amount of time spent daily on:
  Minutes            Statewide database   Interlibrary loan
  0 or no response   1 (10%)              1 (10.0%)
  Less than 10       1 (10%)              2 (20.0%)
  10 to 19           2 (20%)              0 (0.0%)
  20 to 29           0 (0%)               0 (0.0%)
  30 - 44            3 (30%)              1 (10.0%)
  45 - 59            0 (0%)               0 (0.0%)
  60 - 119           1 (10%)              1 (10.0%)
  120 - 179          1 (10%)              4 (40.0%)
  180 - 239          0 (0%)               1 (10.0%)
  240 - 299          1 (10%)              0 (0.0%)
  300 +              0 (0%)               0 (0.0%)
  Other              0 (0%)               0 (0.0%)
  Total              10 (100%)            10 (100.0%)

TABLE C212  Question #7 - Type of staff using database - Ohio
  Interlibrary loan          8    32.0%
  Reference                  2    8.0%
  Technical services         7    28.0%
  Director                   5    20.0%
  Extension services staff   1    4.0%
  Other                      1    4.0%
  No Response                1    4.0%
  Total                      25   100.0%

TABLE C213  Question #8 - Dedicated equipment - Ohio
  No response   0    0.0%
  Yes           1    10.0%
  No            9    90.0%
  Total         10   100.0%

TABLE C214  Question #9 - Public Access? - Ohio
  No response   0    0%
  Yes           0    0%
  No            10   100%
  Total         10   100%

TABLE C215  No Public Access - Why - Ohio (Question 9A)
  No Response         3    30%
  No Interest         1    10%
  No Equipment        2    20%
  Difficulty of use   1    10%
  Staff use only      0    0%
  No Room             3    30%
  Total               10   100%

TABLE C216  Question #10 - Hardware - Ohio
  No response   0    0%
  Yes           2    20%
  No            8    80%
  Total         10   100%

TABLE C217  Question #11 - Software - Ohio
  No response      0    0%
  Yes              3    15%
  No               7    35%
  Not Applicable   10   50%
  Total            20   100%

TABLE C218  Questions #12 & 13 - Training - Ohio
  Responses     n - State Training   n - Attended   %
  No response   0                    0              0%
  Yes           10                   8              80%
  No            0                    2              20%
  Total         10                   10             100%

TABLE C219  Questions 14 & 15 - Training - Ohio
  Responses     n - Adequate training   %      n - Need training   %
  No response   3                       27%    0                   0%
  Yes           6                       55%    6                   55%
  No            2                       18%    5                   45%
  Total         11                      100%   11                  100%

TABLE C220  Ohio Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)
  Question                                    1        2        3        4        5        NR       Total
  16. Browse - Author, Title, or Subject      4 (40%)  2 (20%)  2 (20%)  1 (10%)  0 (0%)   1 (10%)  10 (100%)
  17. Express - Advanced level of searching   6 (60%)  1 (10%)  2 (20%)  0 (0%)   0 (0%)   1 (10%)  10 (100%)
  18. Boolean                                 1 (10%)  3 (30%)  2 (20%)  1 (10%)  1 (10%)  2 (20%)  10 (100%)
  19. Keyword                                 3 (30%)  1 (10%)  4 (40%)  1 (10%)  0 (0%)   1 (10%)  10 (100%)
  20. Wildcard                                1 (10%)  1 (10%)  6 (60%)  0 (0%)   0 (0%)   2 (20%)  10 (100%)
  21. Searching                               2 (20%)  5 (50%)  2 (20%)  0 (0%)   0 (0%)   1 (10%)  10 (100%)
  22. Speed                                   1 (10%)  4 (40%)  4 (40%)  0 (0%)   0 (0%)   1 (10%)  10 (100%)
  23. Directions                              2 (20%)  1 (10%)  5 (50%)  1 (10%)  0 (0%)   1 (10%)  10 (100%)
  24. Manual                                  1 (10%)  2 (20%)  4 (40%)  2 (20%)  0 (0%)   1 (10%)  10 (100%)
  25. Screens                                 3 (30%)  2 (20%)  4 (40%)  0 (0%)   0 (0%)   1 (10%)  10 (100%)
  26. Changing Discs                          1 (10%)  0 (0%)   2 (20%)  0 (0%)   0 (0%)   7 (70%)  10 (100%)
  Total per category                          25       22       37       6        1        19
  Average percentage                          23%      20%      34%      5%       1%       17%
  Average per category                        2        2        3        1        0        2

TABLE C221  Ohio Questions 27-30, Increases or decreases of service (1 = increased, 5 = decreased)
  Question                      1        2        3        4        5       NR       Total
  27. ILL incoming              1 (10%)  6 (60%)  2 (20%)  0 (0%)   0 (0%)  1 (10%)  10 (100%)
  28. ILL outgoing              0 (0%)   3 (30%)  4 (40%)  2 (20%)  0 (0%)  1 (10%)  10 (100%)
  29. Fill rate                 0 (0%)   4 (40%)  4 (40%)  1 (10%)  0 (0%)  1 (10%)  10 (100%)
  30. Blind searches received   0 (0%)   0 (0%)   6 (55%)  1 (9%)   0 (0%)  4 (36%)  11 (100%)

TABLE C222  Question #31 - Ohio
  NR: 3 (30%) | 0-25%: 4 (40%) | 26-50%: 0 (0%) | 51-75%: 3 (30%) | 76-100%: 0 (0%) | Total: 10 (100%)

TABLE C223  Ohio Questions #32 & 33, Methods of ILL
  Method      OCLC     Mail      ALANET   Networks   Phone     Fax       Other    NR       Total
  32. Prior   0 (0%)   6 (26%)   0 (0%)   4 (17%)    4 (17%)   6 (26%)   1 (4%)   2 (9%)   23 (99%)
  33. After   0 (0%)   7 (30%)   0 (0%)   4 (17%)    3 (13%)   6 (26%)   1 (4%)   2 (9%)   23 (99%)
  Increase/decrease: ??, -1%, ??, -1%, -1%, 1%, 1%, 100% (total 100%)

TABLE C224  Questions #34 & 35 - Ohio
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Response       1               1               1               1
  <10               1               4               1               1
  10-20             3               1               3               5
  21-44             1               2               2               1
  45-75             2               2               1               1
  76-100            0               0               1               1
  101-350           2               0               1               0
  351-500           0               0               0               0
  500-1000          0               0               0               0
  1001+             0               0               0               0
  Total Responses   10              10              10              10
  Percentage increase/decrease of ILL: incoming 10%, outgoing 10%

TABLE C225  Question #36 - Ohio
  No Response              3    23%
  Automation Plans         1    8%
  Cataloging               2    15%
  Collection Development   0    0%
  Ease of use              2    15%
  ILL Printed Forms        1    8%
  Location tool            1    8%
  Reference use            0    0%
  Searching                3    23%
  Verification             0    0%
  Total                    13   100%

TABLE C226  Ohio Improvements needed, Question #37
  No Responses                                   2   22%
  Authority control                              1   11%
  Changing discs - too many discs                0   0%
  Cleanup                                        1   11%
  Cumulative printing of screens or search       1   11%
  Division of database other than by dates       0   0%
  E-mail                                         0   0%
  Errors - duplicate records - multiple titles   0   0%
  Need more libraries inputting records          3   33%
  Periodicals add                                0   0%
  Refusal to loan materials                      0   0%
  Searching                                      1   11%
  Speed                                          0   0%
  Updating more often & consistently             0   0%
  Total                                          9   99%

TABLE C227  Question #38 - Ohio - Meet the users' needs?
  No Response   3   33%
  No            0   0%
  Yes           6   67%
  Total         9   100%

TABLE C228  Ohio Question #39 - Priority
  No Responses                                2    15%
  Accuracy in database                        2    15%
  Automation services to all libraries        1    8%
  Continuing education                        0    0%
  Continue with current projects              3    23%
  Full text delivery                          0    0%
  Funding                                     0    0%
  Improve ILL delivery system                 0    0%
  Keep database updated                       0    0%
  Make system easier to use                   0    0%
  Retrospective conversion                    1    8%
  Statewide borrowing agreement               0    0%
  Statewide database                          3    23%
  Statewide electronic mail system            0    0%
  Establish statewide circulation system      0    0%
  "I don't understand what Priority means?"   0    0%
  Database management - long range planning   1    8%
  Totals                                      13   100%

TABLE C229  Ohio Question #40 - Comments
  No Responses                                                    8    80%
  Use statewide database in reference services                    0    0%
  Reimbursement for ILL net lenders                               0    0%
  If materials cost less than $20, should not loan                1    10%
  Our library does not provide ILL services                       1    10%
  This is our main source of information about other libraries   0    0%
  Total                                                           10   100%

Responses from South Dakota: Tables C230 to C252.

TABLE C230  Question #1: Title of Respondent - S.D.
  Title                                                      n    %
  Assistant - Associate Director                             1    9%
  Assistant ILL                                              0    0%
  Bibliographic Specialist                                   0    0%
  Computer Manager, Coordinator                              0    0%
  Coordinator Adult Services                                 0    0%
  Director, Head Librarian, Library Manager                  6    55%
  Extension Librarian                                        0    0%
  Head, Collection Development                               0    0%
  Head, Reference Services                                   2    18%
  Head, Technical Services, Cataloging                       1    9%
  ILL Coordinator, Supervisor, Head, etc.                    0    0%
  Library Clerk                                              0    0%
  Library Tech                                               0    0%
  Media Specialist, LRC Specialist, Information Specialist   1    9%
  Name                                                       0    0%
  Reference Librarian                                        0    0%
  System Operator                                            0    0%
  No Response                                                0    0%
  Total                                                      11   100%

TABLE C231  Respondents to Questionnaire by TYPE of Library - S.D.
  Public     7    64%
  Academic   3    27%
  School     1    9%
  Special    0    0%
  Totals     11   100%

TABLE C232  Size of Collections - S.D.
  Under 25,000        1    9.1%
  25,001 - 50,000     1    9.1%
  50,001 - 100,000    6    54.6%
  100,001 - 250,000   2    18.2%
  Over 250,000        1    9.1%
  Not Responsive      0    0.0%
  Total               11   100.1%

TABLE C233  Uses of the Statewide Database - Question #4 - S.D.
  Interlibrary loan           11   28%
  Public access               9    23%
  Backup                      0    0%
  Cataloging / Acquisitions   8    20%
  Collection development      8    20%
  Reference - other           4    10%
  Total                       40   101%

TABLE C234  Questions #5 & 6 - S.D. - Amount of time spent daily on:
  Minutes            Statewide database   Interlibrary loan
  0 or no response   3 (27%)              1 (9.1%)
  Less than 10       0 (0%)               0 (0.0%)
  10 to 19           0 (0%)               0 (0.0%)
  20 to 29           0 (0%)               0 (0.0%)
  30 - 44            0 (0%)               0 (0.0%)
  45 - 59            0 (0%)               1 (9.1%)
  60 - 119           1 (9%)               1 (9.1%)
  120 - 179          2 (18%)              3 (27.3%)
  180 - 239          1 (9%)               2 (18.2%)
  240 - 299          0 (0%)               0 (0.0%)
  300 +              4 (36%)              3 (27.3%)
  Other              0 (0%)               0 (0.0%)
  Total              11 (99%)             11 (100.1%)

TABLE C235  Question #7 - Type of staff using database - S.D.
  Interlibrary loan          10   22.2%
  Reference                  10   22.2%
  Technical services         9    20.0%
  Director                   10   22.2%
  Extension services staff   3    6.7%
  Other                      3    6.7%
  No Response                0    0.0%
  Total                      45   100.0%

TABLE C236  Question #8 - Dedicated equipment - S.D.
  No response   1    9.1%
  Yes           9    81.8%
  No            1    9.1%
  Total         11   100.0%

TABLE C237  Question #9 - Public Access? - S.D.
  No response   0    0%
  Yes           8    73%
  No            3    27%
  Total         11   100%

TABLE C238  No Public Access - Why - S.D. (Question 9A)
  No Response                    8    73%
  No Interest                    0    0%
  Dial access only - equipment   3    27%
  Difficulty of use              0    0%
  Staff use only                 0    0%
  No Room                        0    0%
  Total                          11   100%

TABLE C239  Question #10 - Hardware - S.D.
  No response   0    0%
  Yes           3    27%
  No            8    73%
  Total         11   100%

TABLE C240  Question #11 - Software - S.D.
  No response      0    0%
  Yes              0    0%
  No               11   100%
  Not Applicable   0    0%
  Total            11   100%

TABLE C241  Questions #12 & 13 - Training - S.D.
  Responses     n - State Training   n - Attended   %
  No response   0                    0              0%
  Yes           11                   11             100%
  No            0                    0              0%
  Total         11                   11             100%

TABLE C242  Questions 14 & 15 - Training - S.D.
  Responses     n - Adequate training   %      n - Need training   %
  No response   0                       0%     0                   0%
  Yes           10                      91%    8                   73%
  No            1                       9%     3                   27%
  Total         11                      100%   11                  100%

TABLE C243  S.D. Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)
  Question                                    1        2        3        4        5        NR         Total
  16. Browse - Author, Title, or Subject      2 (18%)  6 (55%)  1 (9%)   1 (9%)   0 (0%)   1 (9%)     11 (100%)
  17. Express - Advanced level of searching   1 (9%)   4 (36%)  1 (9%)   0 (0%)   0 (0%)   5 (45%)    11 (99%)
  18. Boolean                                 2 (18%)  5 (45%)  2 (18%)  1 (9%)   0 (0%)   1 (9%)     11 (99%)
  19. Keyword                                 7 (64%)  2 (18%)  1 (9%)   0 (0%)   0 (0%)   1 (9%)     11 (100%)
  20. Wildcard                                4 (36%)  3 (27%)  2 (18%)  1 (9%)   0 (0%)   1 (9%)     11 (99%)
  21. Searching                               4 (36%)  4 (36%)  2 (18%)  0 (0%)   0 (0%)   1 (9%)     11 (99%)
  22. Speed                                   1 (9%)   6 (55%)  3 (27%)  0 (0%)   0 (0%)   1 (9%)     11 (100%)
  23. Directions                              1 (9%)   3 (27%)  4 (36%)  2 (18%)  0 (0%)   1 (9%)     11 (99%)
  24. Manual                                  1 (9%)   1 (9%)   4 (36%)  1 (9%)   2 (18%)  2 (18%)    11 (99%)
  25. Screens                                 0 (0%)   5 (45%)  4 (36%)  1 (9%)   0 (0%)   1 (9%)     11 (99%)
  26. Changing Discs                          0 (0%)   0 (0%)   0 (0%)   0 (0%)   0 (0%)   11 (100%)  11 (100%)
  Total per category                          23       39       24       7        2        26
  Average percentage                          19%      32%      20%      6%       2%       21%
  Average per category                        2        4        2        1        0        2

TABLE C244  S.D. Questions 27-30, Increases or decreases of service (1 = increased, 5 = decreased)
  Question                      1        2        3        4       5       NR       Total
  27. ILL incoming              6 (55%)  5 (45%)  0 (0%)   0 (0%)  0 (0%)  0 (0%)   11 (100%)
  28. ILL outgoing              5 (45%)  4 (36%)  0 (0%)   1 (9%)  0 (0%)  1 (9%)   11 (99%)
  29. Fill rate                 2 (18%)  7 (64%)  2 (18%)  0 (0%)  0 (0%)  0 (0%)   11 (100%)
  30. Blind searches received   0 (0%)   1 (9%)   4 (36%)  1 (9%)  0 (0%)  5 (45%)  11 (99%)

TABLE C245  Question #31 - S.D.
  NR: 0 (0%) | 0-25%: 1 (9%) | 26-50%: 3 (27%) | 51-75%: 6 (55%) | 76-100%: 1 (9%) | Total: 11 (100%)

TABLE C246  S.D. Questions #32 & 33, Methods of ILL
  Method      OCLC      Mail       ALANET   Networks   Phone     Fax       Other    NR       Total
  32. Prior   7 (22%)   10 (31%)   0 (0%)   4 (13%)    8 (25%)   3 (9%)    0 (0%)   0 (0%)   32 (100%)
  33. After   7 (18%)   11 (29%)   0 (0%)   7 (18%)    7 (18%)   6 (16%)   0 (0%)   0 (0%)   38 (99%)
  Increase/decrease: 1%, -1%, 0%, -2%, -1%, 2%, 0%, 0% (total 84%)

TABLE C247  Questions #34 & 35 - S.D.
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Response       4               5               2               3
  <10               1               0               2               1
  10-20             0               1               1               0
  21-44             1               1               0               0
  45-75             3               2               2               1
  76-100            0               0               1               0
  101-350           1               1               1               3
  351-500           0               0               1               1
  500-1000          0               0               0               1
  1001+             1               1               1               1
  Total Responses   11              11              11              11
  Percentage increase/decrease of ILL: incoming 10%, outgoing 10%

TABLE C248  Question #36 - S.D.
  No Response        2    10%
  Automation Plans   0    0%
  Cataloging         2    10%
  Item Status        2    10%
  Ease of use        1    5%
  ILL                2    10%
  Location tool      3    15%
  Reference use      2    10%
  Searching          5    25%
  Magazine index     1    5%
  Total              20   100%

TABLE C249  S.D. Improvements needed, Question #37
  No Responses                                   3    27%
  Authority control                              0    0%
  Acquisitions                                   0    0%
  Circulation procedures                         0    0%
  Cumulative printing of screens or search       0    0%
  Education - more training                      1    9%
  Indexes to manuals, on-screen instructions     1    9%
  Errors - duplicate records - multiple titles   0    0%
  Need more libraries inputting records          1    9%
  Periodicals add                                2    18%
  Refusal to loan materials                      0    0%
  Searching                                      1    9%
  Add personnel to system operations             1    9%
  Increase full text                             1    9%
  Total                                          11   99%

TABLE C250  Question #38 - S.D. - Meet the users' needs?
  No Response   3    27%
  No            0    0%
  Yes           8    73%
  Total         11   100%

TABLE C251  S.D. Question #39 - Priority
  No Responses                                3    23%
  Accuracy in database                        1    8%
  Automation services to all libraries        2    15%
  Continuing education                        1    8%
  Continue with current projects              0    0%
  Full text delivery                          0    0%
  Funding                                     0    0%
  Improve ILL delivery system                 0    0%
  Keep database updated                       0    0%
  Make system easier to use                   1    8%
  Retrospective conversion                    0    0%
  Statewide borrowing agreement               1    8%
  Statewide database - funding                4    31%
  Statewide electronic mail system            0    0%
  Establish statewide circulation system      0    0%
  "I don't understand what Priority means?"   0    0%
  Database management - long range planning   0    0%
  Totals                                      13   101%

TABLE C252  S.D. Question #40 - Comments
  No Responses                                                                         7    64%
  Include all libraries in state                                                       1    9%
  It is expensive                                                                      1    9%
  If materials cost less than $20, should not loan                                     0    0%
  State Library does an excellent job                                                  1    9%
  This is our main source of information about other libraries - helps resource
  sharing                                                                              1    9%
  Total                                                                                11   100%

Responses from Tennessee: Tables C253 to C275.

TABLE C253  Question #1: Title of Respondent - Tenn.
  Title                                                      n    %
  Assistant - Associate Director                             2    29%
  Assistant ILL                                              1    14%
  Bibliographic Specialist                                   0    0%
  Computer Manager, Coordinator                              0    0%
  Coordinator Adult Services                                 0    0%
  Director, Head Librarian, Library Manager                  1    14%
  Extension Librarian                                        0    0%
  Head, Collection Development                               0    0%
  Head, Reference Services                                   0    0%
  Head, Technical Services, Cataloging                       0    0%
  ILL Coordinator, Supervisor, Head, etc.                    1    14%
  Library Clerk                                              0    0%
  Library Tech                                               2    29%
  Media Specialist, LRC Specialist, Information Specialist   0    0%
  Name                                                       0    0%
  Reference Librarian                                        0    0%
  System Operator                                            0    0%
  No Response                                                0    0%
  Total                                                      7    100%

TABLE C254  Respondents to Questionnaire by TYPE of Library - Tenn.
  Public     5   56%
  Academic   0   0%
  School     0   0%
  Special    4   44%
  Totals     9   100%

TABLE C255  Size of Collections - Tenn.
  Under 25,000        1   11.1%
  25,001 - 50,000     0   0.0%
  50,001 - 100,000    0   0.0%
  100,001 - 250,000   5   55.6%
  Over 250,000        3   33.3%
  Not Responsive      0   0.0%
  Total               9   100.0%

TABLE C256  Uses of the Statewide Database - Question #4 - Tenn.
  Interlibrary loan           3   38%
  Public access               0   0%
  Backup                      0   0%
  Cataloging / Acquisitions   4   50%
  Collection development      0   0%
  Reference - other           1   13%
  Total                       8   101%

TABLE C257  Questions #5 & 6 - Tenn. - Amount of time spent daily on:
  Minutes            Statewide database   Interlibrary loan
  0 or no response   2 (29%)              4 (57.1%)
  Less than 10       0 (0%)               0 (0.0%)
  10 to 19           0 (0%)               0 (0.0%)
  20 to 29           0 (0%)               0 (0.0%)
  30 - 44            0 (0%)               0 (0.0%)
  45 - 59            0 (0%)               0 (0.0%)
  60 - 119           0 (0%)               0 (0.0%)
  120 - 179          1 (14%)              1 (14.3%)
  180 - 239          1 (14%)              0 (0.0%)
  240 - 299          1 (14%)              1 (14.3%)
  300 +              2 (29%)              1 (14.3%)
  Other              0 (0%)               0 (0.0%)
  Total              7 (100%)             7 (100.0%)

TABLE C258  Question #7 - Type of staff using database - Tenn.
  Interlibrary loan          3   42.9%
  Reference                  0   0.0%
  Technical services         4   57.1%
  Director                   0   0.0%
  Extension services staff   0   0.0%
  Other                      0   0.0%
  No Response                0   0.0%
  Total                      7   100.0%

TABLE C259  Question #8 - Dedicated equipment - Tenn.
  No response   0   0.0%
  Yes           7   100.0%
  No            0   0.0%
  Total         7   100.0%

TABLE C260  Question #9 - Public Access? - Tenn.
  No response   0   0%
  Yes           0   0%
  No            7   100%
  Total         7   100%

TABLE C261  No Public Access - Why - Tenn. (Question 9A)
  No Response                    1   14%
  No Interest                    0   0%
  Dial access only - equipment   0   0%
  Difficulty of use              0   0%
  Staff use only                 6   86%
  No Room                        0   0%
  Total                          7   100%

TABLE C262  Question #10 - Hardware - Tenn.
  No response   0   0%
  Yes           4   57%
  No            3   43%
  Total         7   100%

TABLE C263  Question #11 - Software - Tenn.
  No response      0   0%
  Yes              4   57%
  No               3   43%
  Not Applicable   0   0%
  Total            7   100%

TABLE C264  Questions #12 & 13 - Training - Tenn.
  Responses     n - State Training   n - Attended   %
  No response   0                    0              0%
  Yes           7                    7              100%
  No            0                    0              0%
  Total         7                    7              100%

TABLE C265  Questions 14 & 15 - Training - Tenn.
  Responses     n - Adequate training   %      n - Need training   %
  No response   0                       0%     0                   0%
  Yes           7                       100%   2                   29%
  No            0                       0%     5                   71%
  Total         7                       100%   7                   100%

TABLE C266  Tenn. Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)
  Question                                    1        2        3        4        5        NR       Total
  16. Browse - Author, Title, or Subject      1 (14%)  1 (14%)  3 (43%)  0 (0%)   2 (29%)  0 (0%)   7 (100%)
  17. Express - Advanced level of searching   1 (14%)  2 (29%)  3 (43%)  0 (0%)   0 (0%)   1 (14%)  7 (100%)
  18. Boolean                                 2 (29%)  0 (0%)   3 (43%)  1 (14%)  1 (14%)  0 (0%)   7 (100%)
  19. Keyword                                 1 (14%)  1 (14%)  3 (43%)  1 (14%)  0 (0%)   1 (14%)  7 (99%)
  20. Wildcard                                0 (0%)   2 (29%)  3 (43%)  0 (0%)   1 (14%)  1 (14%)  7 (100%)
  21. Searching                               1 (14%)  3 (43%)  3 (43%)  0 (0%)   0 (0%)   0 (0%)   7 (100%)
  22. Speed                                   2 (29%)  2 (29%)  2 (29%)  1 (14%)  0 (0%)   0 (0%)   7 (101%)
  23. Directions                              1 (14%)  1 (14%)  4 (57%)  0 (0%)   1 (14%)  0 (0%)   7 (99%)
  24. Manual                                  0 (0%)   0 (0%)   4 (57%)  0 (0%)   3 (43%)  0 (0%)   7 (100%)
  25. Screens                                 0 (0%)   0 (0%)   5 (71%)  1 (14%)  0 (0%)   1 (14%)  7 (99%)
  26. Changing Discs                          0 (0%)   0 (0%)   5 (71%)  0 (0%)   0 (0%)   2 (29%)  7 (100%)
  Total per category                          9        12       38       4        8        6
  Average percentage                          12%      16%      49%      5%       10%      8%
  Average per category                        1        1        3        0        1        1

TABLE C267  Tenn. Questions 27-30, Increases or decreases of service (1 = increased, 5 = decreased)
  Question                      1       2        3        4        5       NR       Total
  27. ILL incoming              0 (0%)  0 (0%)   3 (43%)  0 (0%)   0 (0%)  4 (57%)  7 (100%)
  28. ILL outgoing              0 (0%)  0 (0%)   3 (43%)  0 (0%)   0 (0%)  4 (57%)  7 (100%)
  29. Fill rate                 0 (0%)  1 (14%)  2 (29%)  0 (0%)   0 (0%)  4 (57%)  7 (100%)
  30. Blind searches received   0 (0%)  0 (0%)   1 (14%)  1 (14%)  0 (0%)  5 (71%)  7 (99%)

TABLE C268  Question #31 - Tenn.
  NR: 5 (71%) | 0-25%: 0 (0%) | 26-50%: 1 (14%) | 51-75%: 1 (14%) | 76-100%: 0 (0%) | Total: 7 (99%)

TABLE C269  Tenn. Questions #32 & 33, Methods of ILL
  Method      OCLC      Mail      ALANET   Networks   Phone     Fax       Other    NR        Total
  32. Prior   3 (18%)   3 (18%)   0 (0%)   2 (12%)    4 (24%)   1 (6%)    1 (6%)   3 (18%)   17 (102%)
  33. After   3 (18%)   3 (18%)   0 (0%)   3 (18%)    3 (18%)   2 (12%)   0 (0%)   3 (18%)   17 (102%)
  Increase/decrease: 1%, -1%, ??, -2%, -1%, 2%, 0%, 100% (total 100%)

TABLE C270  Questions #34 & 35 - Tenn.
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Response       5               5               5               5
  <10               0               0               0               0
  10-20             0               0               0               0
  21-44             0               0               0               0
  45-75             0               0               0               0
  76-100            0               0               0               0
  101-350           0               1               0               1
  351-500           1               0               1               0
  500-1000          0               1               0               1
  1001+             1               0               1               0
  Total Responses   7               7               7               7
  Percentage increase/decrease of ILL: incoming 10%, outgoing 10%

TABLE C271  Question #36 - Tenn.
  No Response                 1    8%
  All formats are available   1    8%
  Cataloging                  0    0%
  Item Status                 3    23%
  Ease of use                 1    8%
  ILL                         1    8%
  Location tool               2    15%
  Browse mode                 2    15%
  Searching                   1    8%
  Verification                1    8%
  Total                       13   101%

TABLE C272  Tenn. Improvements needed, Question #37
  No Responses                                   2   22%
  Authority control - cataloging                 1   11%
  Electronic delivery - full text                0   0%
  Circulation procedures                         1   11%
  Cumulative printing of screens or search       0   0%
  ILL                                            1   11%
  Indexes to manuals, on-screen instructions     0   0%
  Errors - duplicate records - multiple titles   0   0%
  Need more libraries inputting records          0   0%
  Periodicals spotty                             1   11%
  Update software                                2   22%
  Searching                                      0   0%
  Public access software                         0   0%
  Communications                                 1   11%
  Total                                          9   99%

TABLE C273  Question #38 - Tenn. - Meet the users' needs?
  No Response   2   29%
  No            0   0%
  Yes           5   71%
  Total         7   100%

TABLE C274  Tenn. Question #39 - Priority
  No Responses                                2   29%
  Accuracy in database                        1   14%
  Automation services to all libraries        0   0%
  Continuing education                        0   0%
  Continue with current projects              0   0%
  Full text delivery                          0   0%
  Funding                                     0   0%
  Improve ILL delivery system                 0   0%
  Keep database updated                       0   0%
  Make system easier to use                   0   0%
  Retrospective conversion                    1   14%
  Verification & holdings information         1   14%
  Statewide database                          0   0%
  Statewide electronic mail system            0   0%
  Circulation software & hardware             1   14%
  "I don't understand what Priority means?"   0   0%
  Cataloging of materials into database       1   14%
  Totals                                      7   99%

TABLE C275  Tenn. Question #40 - Comments
  No Responses                     5   63%
  Include all libraries in state   0   0%
  No way to cancel a request       1   13%
  No serial holdings request       1   13%
  Not open to public               1   13%
  Decrease paperwork               0   0%
  Total                            8   102%

Responses from Wisconsin: Tables C276 to C298.

TABLE C276  Question #1: Title of Respondent - Wisconsin
  Title                                                      n    %
  Assistant - Associate Director                             0    0%
  Assistant ILL                                              0    0%
  Bibliographic Specialist                                   0    0%
  Computer Manager, Coordinator                              0    0%
  Coordinator Adult Services                                 1    5%
  Director, Head Librarian, Library Manager                  8    42%
  Extension Librarian                                        0    0%
  Head, Collection Development                               0    0%
  Head, Reference Services                                   0    0%
  Head, Technical Services, Cataloging                       0    0%
  ILL Coordinator, Supervisor, Head, etc.                    3    16%
  Library Clerk                                              0    0%
  Library Tech                                               0    0%
  Media Specialist, LRC Specialist, Information Specialist   4    21%
  Name                                                       2    11%
  Reference Librarian                                        1    5%
  System Operator                                            0    0%
  No Response                                                0    0%
  Total                                                      19   100%

TABLE C277  Respondents to Questionnaire by TYPE of Library - Wisconsin
  Academic   3    16%
  Public     11   58%
  School     3    16%
  Special    2    11%
  Totals     19   100%

TABLE C278  Size of Collections - Wisconsin
  Under 25,000        5    26.3%
  25,001 - 50,000     3    15.8%
  50,001 - 100,000    1    5.3%
  100,001 - 250,000   4    21.1%
  Over 250,000        4    21.1%
  Not Responsive      2    10.5%
  Total               19   100.1%

TABLE C279  Uses of the Statewide Database - Question #4 - Wisconsin
  Interlibrary loan           45    34%
  Public access               21    16%
  Backup                      11    8%
  Cataloging / Acquisitions   35    27%
  Collection development      17    13%
  Reference - other           3     2%
  Total                       132   100%

TABLE C280  Questions #5 & 6 - Wisconsin - Amount of time spent daily on:
  Minutes            Statewide database   Interlibrary loan
  0 or no response   1 (5%)               3 (15.8%)
  Less than 10       1 (5%)               1 (5.3%)
  10 to 19           0 (0%)               0 (0.0%)
  20 to 29           1 (5%)               1 (5.3%)
  30 - 44            6 (32%)              2 (10.5%)
  45 - 59            0 (0%)               0 (0.0%)
  60 - 119           1 (5%)               3 (15.8%)
  120 - 179          3 (16%)              2 (10.5%)
  180 - 239          3 (16%)              2 (10.5%)
  240 - 299          0 (0%)               2 (10.5%)
  300 +              3 (16%)              3 (15.8%)
  Other              0 (0%)               0 (0.0%)
  Total              19 (100%)            19 (100.0%)

TABLE C281  Question #7 - Type of staff using database - Wisconsin
  Interlibrary loan                  16   32.7%
  Reference                          11   22.5%
  Technical services                 10   20.4%
  Director                           10   20.4%
  Extension services staff           2    4.1%
  Other                              0    0.0%
  Students, faculty of institution   0    0.0%
  No Response                        0    0.0%
  Total                              49   100.1%

TABLE C282  Question #8 - Dedicated equipment - Wisconsin
  No response   0    0.0%
  Yes           14   73.7%
  No            5    26.3%
  Total         19   100.0%

TABLE C283  Question #9 - Public Access? - Wisconsin
  No response   0    0%
  Yes           7    37%
  No            12   63%
  Total         19   100%

TABLE C284  No Public Access - Why - Wisconsin (Question 9A)
  No Response       7    35%
  No Interest       0    0%
  No Equipment      7    35%
  Microfiche only   1    5%
  Staff use only    4    20%
  No Room           1    5%
  Total             20   100%

TABLE C285  Question #10 - Hardware - Wisconsin
  No response   1    5%
  Yes           3    16%
  No            15   79%
  Total         19   100%

TABLE C286  Question #11 - Software - Wisconsin
  No response      0    0%
  Yes              2    6%
  No               15   44%
  Not Applicable   17   50%
  Total            34   100%

TABLE C287  Questions #12 & 13 - Training - Wisconsin
  Responses     n - State Training   n - Attended   %
  No response   1                    0              0%
  Yes           16                   15             79%
  No            2                    4              21%
  Total         19                   19             100%

TABLE C288  Questions 14 & 15 - Training - Wisconsin
  Responses     n - Adequate training   %      n - Need training   %
  No response   3                       16%    0                   0%
  Yes           14                      74%    6                   32%
  No            2                       11%    13                  68%
  Total         19                      101%   19                  100%

TABLE C289  Wisconsin Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)
  Question                                    1        2         3         4        5        NR       Total
  16. Browse - Author, Title, or Subject      5 (26%)  10 (53%)  1 (5%)    1 (5%)   0 (0%)   2 (11%)  19 (100%)
  17. Express - Advanced level of searching   3 (16%)  4 (21%)   6 (32%)   2 (11%)  0 (0%)   4 (21%)  19 (101%)
  18. Boolean                                 1 (6%)   5 (28%)   5 (28%)   3 (17%)  2 (11%)  2 (11%)  18 (101%)
  19. Keyword                                 7 (37%)  3 (16%)   4 (21%)   3 (16%)  0 (0%)   2 (11%)  19 (101%)
  20. Wildcard                                2 (11%)  1 (6%)    9 (50%)   2 (11%)  0 (0%)   4 (22%)  18 (100%)
  21. Searching                               2 (11%)  13 (68%)  2 (11%)   0 (0%)   0 (0%)   2 (11%)  19 (101%)
  22. Speed                                   2 (11%)  6 (32%)   3 (16%)   6 (32%)  0 (0%)   2 (11%)  19 (102%)
  23. Directions                              4 (21%)  8 (42%)   2 (11%)   3 (16%)  0 (0%)   2 (11%)  19 (101%)
  24. Manual                                  2 (11%)  4 (21%)   5 (26%)   3 (16%)  3 (16%)  2 (11%)  19 (101%)
  25. Screens                                 4 (21%)  7 (37%)   6 (32%)   0 (0%)   0 (0%)   2 (11%)  19 (101%)
  26. Changing Discs                          1 (5%)   2 (11%)   10 (53%)  0 (0%)   0 (0%)   6 (32%)  19 (101%)
  Total per category                          33       63        53        23       5        30
  Average percentage                          16%      30%       26%       11%      2%       15%
  Average per category                        3        6         5         2        0        3

TABLE C290  Wisconsin Questions 27-30, Increases or decreases of service (1 = increased, 5 = decreased)
  Question                      1        2         3        4        5        NR       Total
  27. ILL incoming              4 (21%)  11 (58%)  2 (11%)  0 (0%)   0 (0%)   2 (11%)  19 (101%)
  28. ILL outgoing              3 (16%)  8 (42%)   6 (32%)  0 (0%)   0 (0%)   2 (11%)  19 (101%)
  29. Fill rate                 1 (5%)   11 (58%)  4 (21%)  2 (11%)  0 (0%)   1 (5%)   19 (100%)
  30. Blind searches received   1 (5%)   3 (16%)   8 (42%)  2 (11%)  3 (16%)  2 (11%)  19 (101%)

TABLE C291  Question #31 - Wisconsin
  NR: 5 (26%) | 0-25%: 1 (5%) | 26-50%: 3 (16%) | 51-75%: 2 (11%) | 76-100%: 8 (42%) | Total: 19 (100%)

TABLE C292  Wisconsin Questions #32 & 33, Methods of ILL
  Method      OCLC     Mail       ALANET   Networks   Phone      Fax      Other    NR       Total
  32. Prior   3 (6%)   17 (36%)   0 (0%)   11 (23%)   8 (17%)    3 (6%)   4 (9%)   1 (2%)   47 (99%)
  33. After   5 (9%)   14 (26%)   0 (0%)   13 (25%)   11 (21%)   5 (9%)   4 (8%)   1 (2%)   53 (100%)
  Increase/decrease: 2%, -1%, ??, -1%, -1%, 2%, 1%, 100% (total 89%)

TABLE C293  Questions #34 & 35 - Wisconsin
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Service        2               0               0               0
  (Percentage of respondents giving no ILL service: 11% and 0% prior; 0% and 0% after.
   Percent decrease of no service: incoming -200%, outgoing 0%.)
  No Response       3               3               3               3
  <10               4               4               3               1
  10-20             3               1               1               4
  21-44             3               2               2               1
  45-75             0               2               1               4
  76-100            1               1               3               0
  101-350           0               4               1               3
  351-500           0               0               0               2
  500-1000          3               0               1               0
  1001+             2               2               4               2
  Total Responses   19              19              19              20
  Percentage increase/decrease of ILL: incoming 10%, outgoing 11%

TABLE C294  Question #36 - Wisconsin
  No Response              4    13%
  Automation Plans         0    0%
  Cataloging               2    6%
  Collection Development   1    3%
  Ease of use              5    16%
  ILL Printed Forms        0    0%
  Location tool            6    19%
  Reference use            3    9%
  Searching                9    28%
  Verification             2    6%
  Total                    32   100%

TABLE C295  Wisconsin Improvements needed, Question #37
  No Responses                                   4    19%
  Authority control                              2    10%
  Changing discs - too many discs                1    5%
  Cleanup                                        3    14%
  Errors - duplicate records - multiple titles   7    33%
  Periodicals add                                1    5%
  Updating more often & consistently             3    14%
  Total                                          21   100%

TABLE C296  Question #38 - Wisconsin - Meet the users' needs?
  No Response   3    16%
  No            1    5%
  Yes           15   79%
  Total         19   100%

TABLE C297  Wisconsin Question #39 - Priority
  No Responses                                5    13%
  Accuracy in database                        2    5%
  Automation services to all libraries        8    21%
  Continuing education                        2    5%
  Continue with current projects              2    5%
  Full text delivery                          1    3%
  Funding                                     1    3%
  Improve ILL delivery system                 2    5%
  Keep database updated                       2    5%
  Make system easier to use                   0    0%
  Retrospective conversion                    1    3%
  Statewide borrowing agreement               5    13%
  Statewide database                          0    0%
  Statewide electronic mail system            2    5%
  Establish statewide circulation system      2    5%
  "I don't understand what Priority means?"   1    3%
  Database management - long range planning   2    5%
  Totals                                      38   100%

TABLE C298  Wisconsin Question #40 - Comments
  No Responses                                                    36   82%
  Use statewide database in reference services                    1    2%
  Reimbursement for ILL net lenders                               4    9%
  Great if automated                                              1    2%
  Our library does not provide ILL services                       1    2%
  This is our main source of information about other libraries   1    2%
  Total                                                           44   99%

Responses from West Virginia: Tables C299 to C321.

TABLE C299  Question #1: Title of Respondent - W.V.
  Title                                                      n    %
  Assistant - Associate Director                             0    0%
  Assistant ILL                                              0    0%
  Bibliographic Specialist                                   0    0%
  Computer Manager, Coordinator                              0    0%
  Coordinator Adult Services                                 0    0%
  Director, Head Librarian, Library Manager                  2    22%
  Extension Librarian                                        0    0%
  Head, Collection Development                               0    0%
  Head, Reference Services                                   0    0%
  Head, Technical Services, Cataloging                       3    33%
  ILL Coordinator, Supervisor, Head, etc.                    0    0%
  Library Clerk                                              0    0%
  Library Tech                                               0    0%
  Media Specialist, LRC Specialist, Information Specialist   0    0%
  Name                                                       0    0%
  Reference Librarian                                        1    11%
  System Operator                                            2    22%
  No Response                                                1    11%
  Total                                                      9    99%

TABLE C300  Respondents to Questionnaire by TYPE of Library - W.V.
  Public     9   100%
  Academic   0   0%
  School     0   0%
  Special    0   0%
  Totals     9   100%

TABLE C301  Size of Collections - W.V.
  Under 25,000        0   0.0%
  25,001 - 50,000     2   22.2%
  50,001 - 100,000    2   22.2%
  100,001 - 250,000   3   33.3%
  Over 250,000        1   11.1%
  Not Responsive      1   11.1%
  Total               9   99.9%

TABLE C302  Uses of the Statewide Database - Question #4 - W.V.
  Interlibrary loan           9    43%
  Public access               3    14%
  Backup                      1    5%
  Cataloging / Acquisitions   7    33%
  Collection development      1    5%
  Reference - other           0    0%
  Total                       21   100%

TABLE C303  Questions #5 & 6 - W.V. - Amount of time spent daily on:
  Minutes            Statewide database   Interlibrary loan
  0 or no response   0 (0%)               3 (33.3%)
  Less than 10       1 (10%)              1 (11.1%)
  10 to 19           1 (10%)              1 (11.1%)
  20 to 29           0 (0%)               0 (0.0%)
  30 - 44            0 (0%)               1 (11.1%)
  45 - 59            1 (10%)              0 (0.0%)
  60 - 119           2 (20%)              1 (11.1%)
  120 - 179          2 (20%)              1 (11.1%)
  180 - 239          0 (0%)               0 (0.0%)
  240 - 299          0 (0%)               0 (0.0%)
  300 +              3 (30%)              1 (11.1%)
  Other              0 (0%)               0 (0.0%)
  Total              10 (100%)            9 (99.9%)

TABLE C304  Question #7 - Type of staff using database - W.V.
  Interlibrary loan          9    37.5%
  Reference                  5    20.8%
  Technical services         7    29.2%
  Director                   2    8.3%
  Extension services staff   1    4.2%
  Other                      0    0.0%
  No Response                0    0.0%
  Total                      24   100.0%

TABLE C305  Question #8 - Dedicated equipment - W.V.
  No response   1   11.1%
  Yes           5   55.6%
  No            3   33.3%
  Total         9   100.0%

TABLE C306  Question #9 - Public Access? - W.V.
  No response   0   0%
  Yes           2   22%
  No            7   78%
  Total         9   100%

TABLE C307  No Public Access - Why - W.V. (Question 9A)
  No Response                    1    10%
  No Interest                    0    0%
  Dial access only - equipment   1    10%
  Difficulty of use              2    20%
  Staff use only                 5    50%
  No Room                        1    10%
  Total                          10   100%

TABLE C308  Question #10 - Hardware - W.V.
  No response   1   11%
  Yes           1   11%
  No            7   78%
  Total         9   100%

TABLE C309  Question #11 - Software - W.V.
  No response      1   11%
  Yes              2   22%
  No               6   67%
  Not Applicable   0   0%
  Total            9   100%

TABLE C310  Questions #12 & 13 - Training - W.V.
  Responses     n - State Training   n - Attended   %
  No response   1                    1              11%
  Yes           1                    2              22%
  No            7                    6              67%
  Total         9                    9              100%

TABLE C311  Questions 14 & 15 - Training - W.V.
  Responses     n - Adequate training   %      n - Need training   %
  No response   6                       67%    0                   0%
  Yes           2                       22%    2                   22%
  No            1                       11%    7                   78%
  Total         9                       100%   9                   100%

TABLE C312  W.V. Importance / Quality / Usefulness (1 = Excellent, 5 = Poor)
  Question                                    1        2        3        4        5        NR       Total
  16. Browse - Author, Title, or Subject      1 (11%)  2 (22%)  2 (22%)  2 (22%)  0 (0%)   2 (22%)  9 (99%)
  17. Express - Advanced level of searching   2 (22%)  4 (44%)  1 (11%)  0 (0%)   0 (0%)   2 (22%)  9 (99%)
  18. Boolean                                 2 (22%)  2 (22%)  2 (22%)  0 (0%)   1 (11%)  2 (22%)  9 (99%)
  19. Keyword                                 2 (22%)  2 (22%)  2 (22%)  0 (0%)   1 (11%)  2 (22%)  9 (99%)
  20. Wildcard                                2 (22%)  1 (11%)  1 (11%)  0 (0%)   1 (11%)  4 (44%)  9 (99%)
  21. Searching                               2 (22%)  3 (33%)  4 (44%)  0 (0%)   0 (0%)   0 (0%)   9 (99%)
  22. Speed                                   2 (22%)  6 (67%)  1 (11%)  0 (0%)   0 (0%)   0 (0%)   9 (100%)
  23. Directions                              1 (11%)  4 (44%)  3 (33%)  0 (0%)   1 (11%)  0 (0%)   9 (99%)
  24. Manual                                  0 (0%)   0 (0%)   5 (56%)  1 (11%)  0 (0%)   3 (33%)  9 (100%)
  25. Screens                                 3 (33%)  3 (33%)  2 (22%)  0 (0%)   0 (0%)   1 (11%)  9 (99%)
  26. Changing Discs                          1 (11%)  0 (0%)   0 (0%)   0 (0%)   0 (0%)   8 (89%)  9 (100%)
  Total per category                          18       27       23       3        4        24
  Average percentage                          18%      27%      23%      3%       4%       24%
  Average per category                        2        2        2        0        0        2

TABLE C313  W.V. Questions 27-30, Increases or decreases of service (1 = increased, 5 = decreased)
  Question                      1        2        3        4        5       NR       Total
  27. ILL incoming              1 (11%)  3 (33%)  3 (33%)  0 (0%)   0 (0%)  2 (22%)  9 (99%)
  28. ILL outgoing              3 (33%)  1 (11%)  4 (44%)  0 (0%)   0 (0%)  1 (11%)  9 (99%)
  29. Fill rate                 1 (11%)  4 (44%)  2 (22%)  1 (11%)  0 (0%)  1 (11%)  9 (99%)
  30. Blind searches received   0 (0%)   2 (22%)  6 (67%)  0 (0%)   0 (0%)  1 (11%)  9 (100%)

TABLE C314  Question #31 - W.V.
  NR: 0 (0%) | 0-25%: 0 (0%) | 26-50%: 1 (11%) | 51-75%: 3 (33%) | 76-100%: 5 (56%) | Total: 9 (100%)

TABLE C315  W.V. Questions #32 & 33, Methods of ILL
  Method      OCLC     Mail      ALANET   Networks   Phone     Fax       Other    NR       Total
  32. Prior   1 (6%)   9 (50%)   1 (6%)   1 (6%)     4 (22%)   2 (11%)   0 (0%)   0 (0%)   18 (101%)
  33. After   0 (0%)   9 (45%)   0 (0%)   3 (15%)    5 (25%)   3 (15%)   0 (0%)   0 (0%)   20 (100%)
  Increase/decrease: 0%, -1%, 0%, -3%, -1%, 2%, ??, ?? (total 90%)

TABLE C316  Questions #34 & 35 - W.V.
  Volume            Prior incoming  Prior outgoing  After incoming  After outgoing
  No Response       3               2               0               0
  <10               1               2               3               4
  10-20             2               2               1               1
  21-44             1               1               1               1
  45-75             1               1               3               0
  76-100            0               0               0               2
  101-350           0               0               0               0
  351-500           1               0               0               0
  500-1000          0               0               1               0
  1001+             0               1               0               1
  Total Responses   9               9               9               9
  Percentage increase/decrease of ILL: incoming 10%, outgoing 10%

TABLE C317  Question #36 - W.V.
  No Response        4    40%
  Automation Plans   0    0%
  Cataloging         1    10%
  Item Status        2    20%
  Ease of use        0    0%
  ILL                0    0%
  Location tool      2    20%
  Reference use      0    0%
  Searching          0    0%
  Verification       1    10%
  Total              10   100%

TABLE C318  W.V. Improvements needed, Question #37
  No Responses                                   1    6%
  Authority control - cataloging                 1    6%
  Electronic delivery - full text                2    13%
  Circulation procedures                         1    6%
  Cumulative printing of screens or search       0    0%
  ILL should be on-line                          4    25%
  Indexes to manuals, on-screen instructions     0    0%
  Errors - duplicate records - multiple titles   0    0%
  Need more libraries inputting records          1    6%
  Periodicals add                                0    0%
  Update software                                3    19%
  Searching                                      1    6%
  Public access software                         1    6%
  Change to CD-ROM                               1    6%
  Total                                          16   99%

TABLE C319  Question #38 - W.V. - Meet the users' needs?
  No Response   3   33%
  No            0   0%
  Yes           6   67%
  Total         9   100%

TABLE C320  W.V. Question #39 - Priority
  No Responses                                          2    15%
  Accuracy in database                                  0    0%
  Automation services to all libraries                  2    15%
  Continuing education                                  1    8%
  Continue with current projects                        0    0%
  Full text delivery                                    1    8%
  Funding                                               1    8%
  Improve ILL delivery system                           2    15%
  Keep database updated                                 0    0%
  Make system easier to use                             1    8%
  Retrospective conversion                              0    0%
  Statewide borrowing agreement                         0    0%
  Statewide database - funding                          0    0%
  Statewide electronic mail system                      1    8%
  Establish statewide circulation system                1    8%
  "I don't understand what Priority means?"             0    0%
  Coordination led by the state; don't install and
  abandon                                               1    8%
  Totals                                                13   101%

TABLE C321  W.V. Question #40 - Comments
  No Responses                                            2   25%
  Include all libraries in state                          1   13%
  Greatly increased ILL from small lib. with no funding   1   13%
  Need more statewide cooperation                         2   25%
  Reduce cost                                             1   13%
  Decrease paperwork                                      1   13%
  Total                                                   8   102%

APPENDIX D

RESPONSES FROM STATE LIBRARIES

[The state library response tables in this appendix could not be recovered from the scanned source.]
APPENDIX E

VENDORS' RESPONSES TO MISSOURI'S RFP: AUTO-GRAPHICS, BRODART COMPANY, AND LIBRARY CORPORATION

Request for Proposal - State of Missouri CD-ROM Statewide Database

References to attachments or appendices are directions in the original documents; those attachments are not included in this document. This document is a direct copy of the response to an RFP from the Missouri State Library by Brodart Company.

Proposal from Brodart Company

PART ONE - INTRODUCTION AND GENERAL INFORMATION

Introduction

1.1 Noted. Brodart Automation has thoroughly reviewed the terms and conditions set forth in the RFP.

Organization

2.1 Noted. Brodart has reviewed the proposal submission requirements detailed in "PART FOUR - PROPOSAL SUBMISSION INFORMATION" on page 22. Our proposal complies with the organizational requirements recommended. Supplemental information has been provided in "ATTACHMENTS" on page 40.

Background information. Subparagraphs 3.1 through 3.5 in this section have been reviewed and are noted.

PART TWO - SCOPE OF WORK

1. GENERAL REQUIREMENTS

Brodart Automation is proposing our database creation and maintenance service and our Le Pac(R) public access catalog as the continuing solution to the state's ongoing CD-ROM public access catalog needs. As the current vendor of this product, the state will continue to benefit from Brodart's familiarity with your needs and requirements. Brodart is proposing to produce your catalog exactly according to the specifications in the RFP.

We feel that one of the most important advantages we can offer the state is our data compression technology. With a catalog the size of the state of Missouri's, data compression is of key importance. As a leader in the creation of CD-ROM databases, we have developed techniques which maximize the storage capacity of the compact disc. By fully exploiting the CD's capacity, we are able to keep the number of discs required for the catalog to a minimum, reducing both production and hardware costs to the state.
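The connection between compression and cost can be made concrete with a rough back-of-the-envelope calculation. Every figure below is an illustrative assumption (disc capacity, title count, bytes per record); neither the RFP nor the proposal states them:

    # Rough, illustrative estimate of how record compression affects the
    # number of CD-ROM discs a union catalog needs. All constants are
    # hypothetical, not Brodart's or Missouri's actual numbers.
    import math

    CD_CAPACITY_BYTES = 650 * 1024 * 1024   # nominal ~650 MB per disc
    TITLES            = 1_600_000           # assumed statewide title count
    RAW_RECORD_BYTES  = 900                 # assumed avg record + index overhead

    def discs_needed(bytes_per_record: float) -> int:
        return math.ceil(TITLES * bytes_per_record / CD_CAPACITY_BYTES)

    for ratio in (1.0, 0.6, 0.4):           # none, 40%, and 60% compression savings
        print(f"compression ratio {ratio:.1f}: {discs_needed(RAW_RECORD_BYTES * ratio)} disc(s)")

Under these assumptions the catalog shrinks from three discs to one as compression improves; the one-million-titles-per-disc claim made in section 5.4 below would correspond to roughly 650 bytes per title after compression.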
The management of an extremely large database with multiple update input sources requires a vendor with extensive experience in the management of large databases. Brodart is perhaps the most experienced vendor in the area of database management. Currently, we have over 1,200 separate programs designed to efficiently manipulate bibliographic data for the creation of precise library automation products. Over the many years we have been providing this service to libraries, we have successfully manipulated bibliographic data from a wide variety of sources. With this experience behind us, we do not anticipate any problems maintaining a high-quality database for the state of Missouri.

We are proposing to provide the state with the following products and services:

Creation of statewide database: Brodart proposes to continue to provide the state with the overall creation of the database. We will continue to apply our sophisticated file and data manipulation programs to the state's variety of input sources to create a fully merged and deduplicated file, which will result in the production of a "clean" CD-ROM product for the state. Additionally, the application of our automated authority control processing will ensure the continued accuracy and currency of the name and subject headings in the database.

CD-ROM creation: Brodart will continue to prepare the file for mastering onto the CD-ROM disc. Brodart will then create and deliver the discs in the appropriate number of copies at the production intervals requested by the state in the RFP.

CD-ROM search software: Brodart will continue to provide our Le Pac search software to be used by the state's libraries in conjunction with the CD-ROM catalog. We have reviewed the specifications discussed in the RFP, and essentially they mirror the product you are receiving today. The current users of the catalog are familiar with Le Pac's ease of operation and powerful searching features. The product we are providing to you today is a direct result of input received not only from the state of Missouri's library staff, but from thousands of Le Pac users throughout the country.

In summary, by selecting Brodart Automation as the continuing source for the production of the state's CD-ROM union file, you continue to assure library patrons throughout the state of the finest quality public access catalog available at the best possible price. We look forward to continuing our relationship with the state's libraries as we continue to serve your automation needs.

3.1 Brodart will provide the first edition of the catalog within the time frames discussed in the RFP. Subsequent editions of the catalog will be delivered by October 1, with a supplemental catalog produced and delivered by April 1 of each year. Provided we receive the appropriate product profiles and input data within the established time frames, we do not anticipate any difficulties with "on time" product delivery.

3.1.1 Brodart will provide the demonstration catalog on CD-ROM as requested by the state.

3.2 Brodart will continue to produce the statewide database from the bibliographic input sources discussed in the RFP.

3.2.1 Brodart is the current producer of the MARC tapes and will use them for production of the state's database and subsequent CD-ROM catalog.

3.2.2 Brodart has processed the input sources discussed in this section for production of the state's catalog.

3.2.3 Brodart has reviewed the additional possible input sources listed in Exhibit A, Pricing Page, Sections 3.1 through 3.11. We are able to process any tape input source in true LC MARC II Communications format. Although we are able to process this data for inclusion in the database, we strongly recommend that the state review the quality of all input sources prior to inclusion in the catalog. A union database seeks to combine all duplicate records into one "master" record with each contributing library's holdings information appended to the record. During the "deduplication" of the file, records of questionable quality, regardless of source, often do not deduplicate, even though they are in fact additional copies of the same record. These "dirty" records clutter the union file, increase catalog production costs, and generally do not conform to established MARC standards. Our experience has shown that occasionally some records are not of a quality suitable for inclusion in a quality state catalog. We are willing to work with the state to determine the suitability of including records from various input sources on a case-by-case basis. Brodart can accept and process any diskette input that is in MicroLIF or US MARC MicroLIF protocol for inclusion in the catalog.

3.3 Brodart will perform the deduplication processing according to the hierarchy delineated in sections 3.3.1 (OCLC), 3.3.2 (UTLAS), 3.3.3 (Bibliofile), and 3.3.4.
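The merge-and-deduplicate step described in 3.2.3 and 3.3 can be sketched in outline. The match key and source ranking below are illustrative assumptions; Brodart's actual matching rules are proprietary and are not given in the proposal:

    # Minimal sketch of union-catalog deduplication with a source hierarchy.
    SOURCE_RANK = {"OCLC": 0, "UTLAS": 1, "Bibliofile": 2, "local": 3}

    def match_key(rec):
        """Crude duplicate-detection key: normalized title/author/year.
        Real match logic also uses LCCN, ISBN, and fuzzier comparisons."""
        norm = lambda s: "".join(ch for ch in s.lower() if ch.isalnum())
        return (norm(rec["title"]), norm(rec.get("author", "")), rec.get("year"))

    def deduplicate(records):
        masters = {}
        for rec in records:
            key = match_key(rec)
            if key not in masters:
                masters[key] = dict(rec, holdings=set(rec["holdings"]))
            else:
                master = masters[key]
                # Prefer the higher-ranked source's record as the master copy...
                if SOURCE_RANK[rec["source"]] < SOURCE_RANK[master["source"]]:
                    master.update(dict(rec, holdings=master["holdings"]))
                # ...but always union every contributor's holdings symbols.
                master["holdings"] |= set(rec["holdings"])
        return list(masters.values())

    recs = [
        {"source": "local", "title": "Huckleberry Finn", "author": "Twain",
         "year": 1884, "holdings": ["MO-A"]},
        {"source": "OCLC",  "title": "Huckleberry Finn", "author": "Twain",
         "year": 1884, "holdings": ["MO-B"]},
    ]
    print(deduplicate(recs))  # one master record with holdings {'MO-A', 'MO-B'}

The "dirty record" problem the proposal warns about is visible here: a record whose title or author is keyed differently produces a different match key, survives as a second "master," and clutters the union file.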
3.4 Brodart will create the database according to the hierarchy specified by the state.

3.5 Brodart is currently maintaining the Missouri Union List of Serials (MULSP) as a separate file.

3.5.1 Noted. Brodart is thoroughly familiar with the tag/subfield structure of the MULSP database and has successfully manipulated it for inclusion in the state's current Le Pac CD-ROM catalog.

3.5.2 Noted. Brodart is familiar with the file structure.

3.6 Brodart will assign a unique control number to all records in the database. For OCLC records, the OCLC number will be retained.

3.7 All tags on the master file will be retained.

4. AUTHORITY CONTROL

4.1 Before CD mastering, Brodart will apply automated LC authority control processing to the database as indicated in the RFP. A discussion of Brodart's automated authority control processing procedure is provided at Attachment A.

4.2 The new records added to the file will be authorized with the entire file prior to the production of the cumulative catalog.

4.3 Part of our standard authority control processing is the generation of appropriate cross references for display in the catalog. Le Pac will automatically display the appropriate "SEE" and "SEE ALSO" references. Users may then select and be taken directly to those particular catalog entries.

4.4 The catalog will be updated with new or changed LC subject headings and cross references.
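A minimal sketch of what automated authority control of this kind does, assuming a simple lookup-table authority file (the real LC authority file and Brodart's processing are far more elaborate):

    # Replace variant headings with the authorized form and emit the
    # cross references ("SEE" for variants, "SEE ALSO" for related terms).
    AUTHORITY = {
        # variant heading           -> authorized heading (illustrative entries)
        "Clemens, Samuel Langhorne": "Twain, Mark, 1835-1910",
    }
    SEE_ALSO = {
        "Steamboats": ["River boats"],   # related, not variant, headings
    }

    def authorize(heading):
        authorized = AUTHORITY.get(heading, heading)
        refs = []
        if authorized != heading:
            refs.append(f"{heading}  SEE  {authorized}")
        for related in SEE_ALSO.get(authorized, []):
            refs.append(f"{authorized}  SEE ALSO  {related}")
        return authorized, refs

    print(authorize("Clemens, Samuel Langhorne"))
    # ('Twain, Mark, 1835-1910', ['Clemens, Samuel Langhorne  SEE  Twain, Mark, 1835-1910'])

Running every heading in the file through such a table is what keeps name and subject headings consistent across records contributed by hundreds of libraries, and the generated references are what Le Pac displays per 4.3.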
5. CD-ROM DISC CREATION

5.1 Brodart will perform the necessary premastering and mastering of the database and then transfer the data to CD-ROM disc.

5.1.1 Brodart will produce the copies (400) as requested in the RFP. Additional copies can also be produced and delivered as may be desired by the state.

5.2 The recommended drive for use with the Le Pac catalog is the Hitachi CD-ROM drive. Drives from Philips and Sony are also known to be compatible. MS-DOS Extensions will be required for use with non-Hitachi drives.

5.3 Le Pac conforms to High Sierra Group (ISO 9660) standards for format, volume, and file structure.

5.4 Brodart's sophisticated data compression technology allows us to most fully use the tremendous storage capacity of the CD-ROM disc. We have successfully processed one million (1,000,000) titles on just one compact disc. When dealing with a bibliographic database the size of the state of Missouri's, the vendor's ability to fully utilize the storage capacity of the CD-ROM disc is of key importance to the cost of the project. Additional discs escalate costs in both the areas of disc replication and the requirement for additional CD-ROM drives. Brodart's ability to contain more titles per disc than any other CD-ROM vendor gives the state a considerable cost savings.

Brodart can process the catalog among the discs in any of several different "split" methodologies. Brodart is confident that the state's catalog will "fit" on only five CDs, including the state's supplement file and the index file disc currently required to tell users the correct disc to use for each search entered. Essentially, the file can be mastered in one of two ways.

Option 1: Brodart can master the file as one file spanning multiple CDs. The search software would be set to search the entire file simultaneously as one (1) file. Through this method, the user is not required to switch or swap discs, and the requirement for an index CD is eliminated. The clear disadvantage is the requirement for all workstations to be equipped with multiple chained CD-ROM drives. This alternative would increase the hardware required for the system.

Option 2: Brodart can continue to master the file under the state's current disc swap methodology as specified in the RFP. Although this methodology requires that users swap discs, the need for additional CD-ROM drives is eliminated. This is the current arrangement of the Missouri catalog. With this option the catalog would be spanned across the discs as follows:

o Monographs - 2 discs
o Serials File - 1 disc
o Supplement File - 1 disc
o Short Author/Title Index - 1 disc

There are other possibilities for the efficient "split" of the data available to the state. Brodart will be happy to discuss any other possibilities with the state should you desire.

5.5 The ability to concurrently chain CD-ROM drives together is a function of hardware. Currently, the Hitachi 3600 series of drives can be chained up to eight (8) concurrent drives.

5.6 The state is the sole owner of its data and the CD-ROM discs purchased; therefore, the state retains ownership of older versions of the catalog.

5.7 The Le Pac catalog produced will be in compliance with ISO Standard 9660 for format, volume, and file structure. Should this specification change at any time during the contract period, Brodart will notify the state prior to product delivery.

5.8 Brodart's current production of the catalog has the index file (short author and title) contained on one CD-ROM disc, as specified in the RFP. Brodart will continue to produce this index on one disc, if required by the state.
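The short author/title index disc of 5.8 exists to answer one question: which disc holds a given search key? Assuming each monograph disc covers a contiguous alphabetical range (the actual split is not specified in the proposal), that lookup reduces to a binary search over the ranges' starting keys, sketched below with hypothetical labels:

    # Sketch of the lookup an index disc enables: map a search key to the
    # disc holding that alphabetical range. Ranges here are illustrative.
    import bisect

    DISC_STARTS = ["a", "m"]                      # first key on each disc, sorted
    DISC_LABELS = ["Monographs 1", "Monographs 2"]

    def disc_for(search_key: str) -> str:
        i = bisect.bisect_right(DISC_STARTS, search_key.lower()) - 1
        return DISC_LABELS[max(i, 0)]

    print(disc_for("Dickens"))   # -> Monographs 1
    print(disc_for("Twain"))     # -> Monographs 2

Under Option 2, the software performs this lookup, prompts the user to insert the named disc, and (per 6.1.2 below) replays the saved search once the disc is mounted; under Option 1 the lookup disappears because all discs are online at once.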
6. SEARCH SOFTWARE REQUIREMENTS

6.1 Search Software Capabilities

6.1.1 Le Pac is menu driven, and each menu provides the user with the full range of options available at that point in operation.

6.1.2 When a Le Pac catalog is mastered to operate in a multiple disc swapping configuration, each search is saved in memory while the user inserts the proper disc. The search is then automatically executed on the "correct" disc without the need to re-enter the search criteria.

6.1.3 The system functions with all versions of MS-DOS Extensions.

6.2

6.2.1 Le Pac functions in two modes, Browse Access and Express Access. More experienced searchers often prefer Express Access, with its ability to combine search criteria across multiple author, title and subject fields. More casual searchers often prefer Browse Access for its ability to take the user directly to the alphabetical point in the catalog most closely matching the search criteria entered.

6.2.2 Le Pac allows users to select a display format (public access, ILL, full MARC or reference desk). The ability to create, display and print bibliographies is in development and is scheduled for release with the next edition of the software (Fall 1991).

6.3 Brodart will provide a statewide, unlimited number of workstation licenses to the state for the cost of seven thousand dollars ($7,000.00) per year. This includes the Le Pac Public Access license and the Le Pac Professional licenses. Complete information detailing the Le Pac Professional options has been provided at Attachment B. The Le Pac Professional options included for this price are:

o Interlibrary Loan (print or download to disk version)
o Bibliographic Maintenance
o Holdings Update
o Save

Please see "EXHIBIT A - PRICING PAGE" on page 36 for complete pricing information.

6.4 Brodart will provide the software on 5 1/4" floppy diskettes.

7. SPECIFIC SEARCH SOFTWARE REQUIREMENTS

7.1 With Le Pac, any indexed field can be searched. Searching is keyworded through use of the "ANYWORD" field, and searches (excluding number searches) are left-to-right, direct order searches. The index fields requested in Section 7.1, items a. through d., can all be selected as index points. Brodart has provided one mastering cost for any fields the state may wish to choose as index points. There is no charge for additional index points. Please see "EXHIBIT A - PRICING PAGE" on page 36 for complete pricing information.

7.2

7.2.1 Searches can be limited by publication date, material format and language.

7.2.2 Searches can be terminated at any time, and the user can be returned to the main menu.

7.2.3 Currently, Le Pac can print screens. Enhanced record printing capabilities, including the capability to print bibliographies, are in development and will be available with the upcoming release of the software in the Fall of 1991.

7.2.4 After a search has been entered, a brief title screen displays search results. The user then selects from this list.

7.3

7.3.1 With Le Pac, context-sensitive "HELP" is available to users at any point in operation through use of the "F/1" key. A full help menu is available to the user through a single keystroke.

7.3.2 Le Pac provides error messages to users instructing them as to actions that may be taken at any given point in operation.

7.3.3 Le Pac is not case sensitive.

7.3.4 When multiple search terms are entered in any one search field, Le Pac ignores extra blank spaces between the terms.

7.3.5 Le Pac provides a "brief record" screen which lists the search results. The user, through use of the light bar, may select a particular title, and Le Pac will then display the record. Four display formats are available, with a short public access display containing holdings data as the default. The user may also select to have the record display in full MARC format, interlibrary loan format or reference desk format.

7.3.6 The product currently produced by Brodart for the state displays sorted holdings data as a four character display. In the future, if the state desires, Brodart will expand holdings data to display a five character code. This will require special programming.
A recent Le Pac enhancement allows supplemental bibliographic data to be created on a local 242 hard disk drive, In both cases, the search software access both the CD and the hard disk drive seamlessly and simultaneously. The same key functions are used. 7.7 7.7.1 Users can select from the multiple title screen through use of the light bar and "ENTER" key a title and then display additional information on that title, including complete holdings data and call number. 7.7.2 User can "step back" through previous searches through use of the "F/8" recall keys. 7.7.3 Express Access allows users to create combination searches with entries in the "Author", "Title", "Subject", "Anyword" and "Location" fields in combination. 7.7.4 Le Pac allows for specific phrase searching. User can accomplish this by enclosing the search terms in quotation ("") marks. 7.7.5 In Browse, truncation is implicit after the entry of as few as one character. In Express Access users can enter a "wildcard" character indicated by the asterisks (*). Such as: comput* and retrieve all the entries containing, computer, computing, etc. 7.8 243 7.8.1 Le Pac displays the percentage of the catalog searched. There are no plans to display the number of records retrieved. 7.8.2 In Browse Access, the user is taken directly to the point in the catalog most closely matching the search criteria entered regardless of whether it is an exact match or not. 7.8.3 Le Pac offers a Browse Access capability. 7.8.5 User can step back through previous searches and modify them. 7.8.6 Simple queries take approximately 1-2 seconds. A sample Le Pac Response Time Test is provided at Attachment C. To a great extent, response time is a function of the hardware used. 7.9 The state may add additional indexes by notifying Brodart thirty (30) days prior to data input cut-off date. 7.10 These fields will be added upon request at no additional cost. INTERFACE TO OTHER FUNCTIONS 244 8.1. 8.1.1 Selected records can be copied to a hard or floppy disk in MARC format. 8.1.2 Search results can be saved into an ASCII file. 8.1.3 Exit to DOS can be accomplished through use of the ALT/X key combination. 8.2 Catalog cards can be printed locally using the Le Pac Professional options. 9. TRAINING AND DOCUMENTATION 9.1 A comprehensive reference manual will be provided with each set of Le Pac discs and software. 9.2 Brodart will provide the training sessions as requested in the RFP. We will provide the training as requested in the RFP at no additional charge. Training above that specified in the RFP will be charged at the rate of four hundred dollars ($400.00) per day plus expenses. Please see "EXHIBIT A - PRICING PAGE" on page 36 for pricing information. 9.3 Product enhancements will be provided at no extra charge as they become available. 245 10. DISTRIBUTION 10.1 As with all Brodart customers, the state is the sole owner of its database. Brodart will create the spin off products for individual libraries upon request. Please see "EXHIBIT A - PRICING PAGE" on page 36 for pricing information. 10.2 Brodart will deliver the tapes as requested in the RFP. Pricing information for this service has been provided in "EXHIBIT A -PRICING PAGE" on page 36. 11. STATEWIDE DATABASE MAINTENANCE 11.1 Le Pac Professional options will provide the state with an efficient cost effective method of performing maintenance on the database. Brodart will process the updates for inclusion in the next edition of the catalog. 
Optionally, the state can continue to perform database maintenance as it has in the past. Brodart will process the transactions for inclusion in the next product.

11.2 Brodart will provide 9-track tape copies of individual libraries' databases upon request. Please see "EXHIBIT A - PRICING PAGE" on page 36 for pricing information.

11.3 Missouri libraries can process additions, changes, and deletions to the catalog on floppy diskette. The diskettes can be forwarded to Brodart for inclusion in the next edition of the catalog. Brodart has also provided information detailing our InterActive Access System (IAS), our online bibliographic utility. IAS will provide the state with a real-time method of performing database maintenance in an on-line environment. The state of Kansas, also a Brodart customer, is currently using IAS to perform extensive database clean-up work. We will provide pricing for this option upon request. A full description of IAS has been provided at Attachment D. Included with this description is a copy of the Spring 1991 issue of InterAction. This issue features an article by Bruce Flanders, Director of Technology, Kansas State Library, detailing the Kansas State Library's use of the IAS system for its database maintenance and clean-up needs.

12. ADDITIONAL REQUIREMENTS

12.1 Brodart's CD-ROM Le Pac customer list is most extensive. We have provided a list of customers closely representing the state of Missouri's catalog in size and composition at Attachment E.

12.2 Brodart will correct any and all software errors or "bugs" should they occur.

12.3 Brodart has been a leader in the library community for over fifty (50) years. A brief history of our overall library experience and our extensive experience in the library automation market has been provided at Attachment F. Brodart's experience in the creation of CD-ROM public access catalogs is unmatched. In 1985, Brodart was the first company to provide this technology to libraries.

12.4 Brodart has provided pricing for this service in "EXHIBIT A - PRICING PAGE" on page 36. The pricing provided assumes our standard specifications and standard collections.

12.5 Upon termination of the contract, Brodart will provide the state with a copy of the database at no cost.

13. LIQUIDATED DAMAGES

13.1 Noted.

13.2 Noted.

PART THREE - GENERAL CONTRACTUAL REQUIREMENTS

Except as otherwise noted in "PART TWO - SCOPE OF WORK" on page 2, Brodart will comply with the contractual stipulations of this section. Pricing has been provided in "EXHIBIT A - PRICING PAGE" on page 36.

PART FOUR - PROPOSAL SUBMISSION INFORMATION

1. Submission of Proposals

1.1 Noted. Brodart has completed the required forms and they have been signed as indicated.

1.1.1 The original Form P-92 has been signed and is included in a sealed envelope in the front of this proposal.

1.1.2 Noted.

1.1.3 Noted.

1.2 Noted.

1.3 Noted. Brodart has organized this proposal to mirror the format in which it was originally forwarded to us. We have used the state's numbering and naming conventions. Cross references have been provided. Additional information has been provided by way of attachments and has been so identified.

1.3.1 Noted. Each section has followed the state's recommended format. With the exception of supplemental material identified as attachments, the information has been grouped according to the state's organizational conventions.

1.3.2 Noted. The originally signed Form P-92 is provided in a sealed envelope at the front of this proposal.
1.4 The requested bid bond has been provided in a sealed envelope at the front of this proposal. Only an original has been provided.

2. CLARIFICATION OF REQUIREMENTS

2.1 Noted.
2.2 Noted.
2.3 Noted.
2.4 Noted.
2.5 Noted.
2.6 Noted.

3. EVALUATION PROCESS

3.1 Noted.
3.2 Noted. Brodart will attend a question and answer period if required by the state.
3.3 Noted.
3.4 Noted.
3.5 Noted.

4. CONTRACT AWARD

4.1 Noted.
4.2 Noted.

5. PRICING

5.1 Unit prices ONLY have been provided in "EXHIBIT A - PRICING PAGE" on page 36. Pricing extensions, if required, will be provided upon request, provided the quantities required are specified.

5.2 Noted.
5.3 Noted.
5.4 Noted.

5.4.1 Noted. As previously stated, Brodart has provided unit pricing only. Extensions, if required, will be provided upon request and receipt of the required quantity information.

5.4.2 Noted.
5.4.3 Noted.
5.4.4 Noted. The special programming fee quoted in "EXHIBIT A - PRICING PAGE" on page 36 is for additional special programming.
5.4.5 Noted.

6. OFFEROR'S EXPERIENCE AND RELIABILITY

6.1 Brodart has provided name, address and contact person information, along with a detailed project summary, for some of our larger customers whose requirements closely match those of the state of Missouri. These references and project summaries can be found at Attachment E.

6.2 The information requested in this section can be found at Attachment E, Project Summaries. Responses to items 6.2.1, 6.2.2 and 6.2.3 have also been included in this attachment.

6.3 Brodart's financial data is provided at Attachment G.

6.4 A sample Le Pac response time test was conducted and the results can be found at Attachment C. Response time is, to a great extent, a function of the hardware used.

6.6 A sample of the Missouri database currently produced by Brodart has been provided at Attachment H. Please note: only one copy of this sample has been provided with the original response. Additional samples, if required, will be provided upon request.

7. EXPERTISE OF OFFEROR'S PERSONNEL

7.1 Please see Attachment I for resumes of the personnel that will be assigned to manage all aspects of the production of the state's CD-ROM database.

7.2 For matters of a contractual nature, the state may contact:

Mr. Ron Van Fleet
Brodart Automation
500 Arch Street
Williamsport, PA 17705
1-800-233-8467, ext. 640

For matters of a technical or service related nature, the state may contact:

Ms. Linda Craner
Brodart Automation
500 Arch Street
Williamsport, PA 17705
1-800-233-8467, ext. 640

Mr. Van Fleet's and Ms. Craner's resumes have been provided at Attachment I.

7.3 Brodart has also provided information regarding other personnel that will be assigned to the project in Exhibit C.

7.4 Brodart does not require additional staff members to accomplish this project according to specifications. Should additional staff be required, Brodart will provide information detailing their backgrounds and experience upon request.

7.5 Nubro Inc. is a General Partner of Brodart Co. Nubro Inc.'s Corporate Income Tax ID number is 12950250. Brodart Co. (partner) Sales Tax ID number is 11964928. Under its current organization, Brodart is authorized to conduct business in the state of Missouri.

8. PROPOSED METHOD OF PERFORMANCE

8.1 Brodart's proposed method of performance has been detailed in "PART TWO - SCOPE OF WORK" on page 2. As requested, that information has not been repeated in this section.
8.2 A narrative description of the method in which Brodart proposes to satisfy the requirements of the RFP has been provided in "1. GENERAL REQUIREMENTS" on page 2. As requested, that information has not been repeated in this section.

8.2.1 HARDWARE COMPATIBILITY

a. Le Pac is essentially compatible with any IBM PC or true compatible with 640K RAM, a single accessible drive and a CD-ROM drive. As the current provider of the state's catalog, we are aware of the kinds and types of hardware currently in use in the state's libraries. This hardware is known to be compatible. The catalog we are proposing in this RFP will be compatible with these existing workstations.

b. For continued basic Le Pac operation, no additional hardware will be required (except for the addition of new Le Pac sites in the state). For the Le Pac Professional options we are proposing, a hard disk drive will be required. Hard disk drives are available from Brodart Automation at the following prices:

LE PAC HARDWARE (Memorex Telex Model 7045)

Configuration:
Processor: 80286
Accessible drive: 3.5" 1.44 MB (5 1/4" at same)
Video adaptor: VGA
Parallel ports: 1
Serial ports: 2
8 bit slots: 1
16 bit slots: 5
Keyboard w/custom keycaps: 1
MS-DOS: 1
MS-DOS extensions: 1

MTC MODEL 7045: $1,515.00/unit (note 4)

MAINTENANCE
On-site: $275.00/year
Depot: $250.00/year

ADD-ON PRICING
MONITORS
VGA Color: $405.00
VGA Black & White: $155.00
FLOPPY DRIVES
360 KB 5 1/4": $125.00
1.2 MB 5 1/4": $150.00
CD-ROM DRIVES (note 3)
Internal: $530.00
External: $590.00
HARD DRIVES (note 6)
20 MB Hard Drive: $465.00
40 MB Hard Drive: $465.00
PRINTER
Dot Matrix (model 1173): $483.00

Compatible hardware can also be purchased from a local hardware dealer, if desired.

3. Hitachi is the recommended drive; however, drives from Philips and Sony are known to be compatible. MS-DOS extensions will be required for non-Hitachi drives.
4. Basic unit does NOT include drive and monitor.
5. On-site maintenance is only available to libraries within a 74 mile radius of a Memorex Telex Service Center.
6. A hard disk drive will be required for the Le Pac Professional options.

5.4 Sophisticated data compression technology allows us to most fully use the tremendous storage capacity of the CD-ROM disc. We have successfully processed one million (1,000,000) titles on just one compact disc. When dealing with a bibliographic database the size of the state of Missouri's, the vendor's ability to fully utilize the storage capacity of the CD-ROM disc is of key importance to the cost of the project. Additional discs escalate costs in both the areas of disc replication and the requirement for additional CD-ROM drives. Brodart's ability to contain more titles per disc than any other CD-ROM vendor gives the state a considerable cost savings. Brodart can split the catalog among the discs in any of several different "split" methodologies. Brodart is confident that the state's catalog will "fit" on only six CDs, including the state's supplement file and the index file disc currently required to tell users the correct disc to use for each search entered. Essentially, the file can be mastered in one of two ways.

Option 1. Brodart can master the file as one file spanning multiple CDs. The search software would be set to search the entire file simultaneously as one (1) file. Through this method, the user is not required to switch or swap discs, and the requirement for an index CD is eliminated. The clear disadvantage is the requirement for all workstations to be equipped with multiple chained CD-ROM drives. This alternative would increase the hardware required for the system.

Option 2. Brodart can continue to master the file under the state's current disc swap methodology as specified in the RFP. Although this methodology requires that users swap discs, the need for additional CD-ROM drives is eliminated. With this option the catalog would be spanned across the discs as follows (this is the current arrangement of the Missouri catalog):

o Records - 4 discs
o Serials (MULSP) File - 1 disc
o Short Author, Title Index - 1 disc

There are other possibilities for the efficient "split" of the data available to the state. Brodart will be happy to discuss any other possibilities with the state should you desire.
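Under the disc-swap methodology of Option 2, the short author/title index disc tells the user which catalog disc to load for a given entry. The following C sketch illustrates the idea only; the split points, structure and lookup routine are invented for the example and are not Brodart's mastering software.

    /* Illustrative sketch, not Brodart's implementation: a master index
       maps a search key to the catalog disc that holds it.  The alphabetic
       split points are invented for the example. */
    #include <stdio.h>
    #include <string.h>

    struct disc_range {
        const char *first_key;   /* first index entry mastered on this disc */
        int         disc_no;
    };

    static const struct disc_range split[] = {
        { "A", 1 }, { "E", 2 }, { "M", 3 }, { "S", 4 },
    };

    static int disc_for(const char *key)
    {
        int n = sizeof split / sizeof split[0];
        int d = split[0].disc_no;
        for (int i = 0; i < n; i++)
            if (strcmp(key, split[i].first_key) >= 0)
                d = split[i].disc_no;     /* last range whose start <= key */
        return d;
    }

    int main(void)
    {
        printf("\"Orwell\" -> disc %d\n", disc_for("Orwell"));   /* disc 3 */
        printf("\"Brodart\" -> disc %d\n", disc_for("Brodart")); /* disc 1 */
        return 0;
    }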
8.2.2 In our many years of processing bibliographic input, we have always been able to process MARC formatted data. Brodart will be happy to analyze any input source tapes the state may be considering for inclusion in the catalog to help ensure a continued quality display of the data in the catalog. We have provided information detailing our bibliographic data processing procedures in general, and a section describing our current handling of the Missouri file. This information can be found in Attachment J. Generally speaking, Brodart will require ninety (90) days lead time to perform proper analysis of all new input sources from the state.

8.2.3 AUTHORITY CONTROL

Brodart has provided information detailing our authority control processing procedures at Attachment A.

8.2.4 CD-ROM DISC CREATION

a. The Le Pac disc conforms to High Sierra Group (ISO 9660) standards for volume and file structure.

b. Brodart is confident the entire catalog will fit on only six CDs. This will include the master index disc required by the RFP, four discs for records, and the disc required for the serials (MULSP) file.

c. Brodart data compression techniques allow us to fully use the tremendous storage capacity of the CD-ROM disc. We have included over one million records on a single CD-ROM disc. To our knowledge, this amount is approximately 200,000 to 400,000 more records than can be processed by other vendors. It must be understood, however, that the number of records that can be contained on a single disc is, to a great extent, dependent upon the size, composition and holdings data of the records and of the file itself. Our record count for the number of records on a disc is based on complete, full MARC records with holdings data included. To our knowledge, we can contain more records on a single disc than any other vendor.

8.2.5 SEARCH SOFTWARE

a. A copy of our standard Master Service Agreement (MSA) has been provided at Attachment K.

b. Information detailing our planned enhancements to our product line has been provided at Attachment L.

c. With Le Pac's Express Access "ANYWORD" feature, any information in the record that has been identified as an index point can be used as a search qualifier. Pricing information has been provided in "EXHIBIT A - PRICING PAGE" on page 36. This pricing includes all fields currently indexed, plus those mentioned in the RFP.

d. Sample error messages have been provided in Attachment M.

e. Detailed information regarding Le Pac's Multi-Level Location Searching feature has been provided in Attachment N.

f. A list of Le Pac stoplisted words has been provided at Attachment O.

g. The Le Pac stoplisted words (Attachment O) are not searched if entered alone as search criteria. However, if entered as part of a specific phrase search, such as "Of mice and men", they will be searched as part of the entire search criteria.
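The stoplist behavior described in item g can be sketched as follows. This is an illustrative C fragment only, not the Le Pac source; the sample stoplist here is invented, and the full list is the one provided at Attachment O.

    /* Illustrative sketch: a stoplisted word entered alone is rejected,
       but inside a quoted phrase it is passed through to the phrase
       search.  The stoplist is a small invented sample. */
    #include <stdio.h>
    #include <string.h>
    #include <strings.h>   /* strcasecmp (POSIX) */

    static const char *stoplist[] = { "a", "an", "and", "of", "the", NULL };

    static int is_stopword(const char *term)
    {
        for (int i = 0; stoplist[i]; i++)
            if (strcasecmp(term, stoplist[i]) == 0)
                return 1;
        return 0;
    }

    /* Decide whether the entry should be searched at all. */
    static int searchable(const char *entry)
    {
        if (entry[0] == '"')          /* phrase search: stopwords participate */
            return 1;
        return !is_stopword(entry);   /* single term: drop it if stoplisted */
    }

    int main(void)
    {
        printf("%d\n", searchable("of"));                  /* 0: skipped   */
        printf("%d\n", searchable("\"Of mice and men\"")); /* 1: searched  */
        printf("%d\n", searchable("mice"));                /* 1            */
        return 0;
    }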
8.2.6 INTERFACE TO OTHER FUNCTIONS

a. Le Pac can access a printer to print screens and other information.

b. Currently, records can be copied to a hard or floppy disk as an ASCII file and can then be edited with a text editor. The upcoming new release of the Le Pac software (Fall 1991) will include the ability to print bibliographies.

c. Brodart is also proposing the Precision One product line as a future option for the state to consider. These very economical CD-ROM products give libraries unparalleled power and performance for performing both retrospective conversions and keeping the catalog current. We are offering these products to the state at the following prices:

Precision One: $________/copy. Includes search software and a CD-ROM disc which contains 1 million of the titles most frequently held by school and public libraries.

Precision One Current: $450.00/copy. Includes search software and one CD per month for twelve months, containing the previous two years' worth of LC cataloging, Brodart's original cataloging and a comprehensive video collection.

If both are purchased together: $650.00.

Le Pac is fully compatible with both these products. Detailed information about the Precision One family of products has been provided at Attachment P.

8.2.7

a. A copy of the Le Pac and Le Pac Professional reference manuals has been provided at Attachment Q.

b. The Le Pac software is written in the "C" programming language.

8.3

8.3.1 SEARCH SOFTWARE

a. Le Pac is equipped with a local information editor which allows individual libraries to design local information screens. Many Le Pac customers use this feature to display special instructions, local notes and information, and items of special interest. Additionally, local sites are able to:

o Set Multi-Level Location Searching defaults
o Adjust screen reset time
o Choose the display format (public access, reference desk, ILL, or full MARC)

b. The brief record screen (multiple title screen) displays author, title, subject, and publication data.

c. In Express Access, any combination of author, title, subject, "anyword" and location is available.

d. When multiple "hits" are indicated, Le Pac displays the percentage of the database searched.

e. Le Pac's Browse Access allows users to enter traditional search criteria (either author, title or subject); the user is then taken directly to the alphabetical point in the catalog most nearly matching the criteria entered. The user may then browse the short title list and make a selection by moving the light bar (up and down arrow keys) to the desired item. By depressing the "enter" key the user is then taken to the desired record.

f. With Le Pac, previous searches can be recalled and modified without the need to reenter the entire search.

g. With Le Pac, any field may be indexed and searched. The state may make these determinations by completing the required items in the Le Pac Product Profile Form. A sample of this form has been provided at Attachment R.

h. The 260 $b Publisher, 505 $a Contents note, and 074 $a GPO item number fields can all be selected as index points and searched. Current users who have selected these fields as index points do not report any significant impact on disc storage capacity or erosion of response time.

i. The Le Pac "ANYWORD" feature allows for virtually unlimited search capabilities. Additional search features, as they are developed, will be made available to the state.

j. Search results may be sent to a printer or saved to a disk. The ability to produce cards and labels is also a function of the Le Pac Professional. (Detailed information about this product has been provided at Attachment B. Brodart is proposing this product in conjunction with our Le Pac catalog. There is no additional charge for the inclusion of the Le Pac Professional software and the license fee.) Please see "EXHIBIT A - PRICING PAGE" on page 36 for pricing information.
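Items 8.1.1, 8.1.2 and j above describe saving retrieved records to disk. As a simple illustration (not the shipped Le Pac code; the record, field layout and file name are invented for the example), a record held as tagged fields can be written out as an ASCII listing:

    /* Illustrative sketch only: writing a retrieved record to disk as a
       readable ASCII listing.  The fields below are invented. */
    #include <stdio.h>

    struct field { const char *tag; const char *data; };

    static const struct field rec[] = {
        { "100", "Orwell, George." },
        { "245", "Nineteen eighty-four /" },
        { "260", "New York : Harcourt," },  /* 260 $b carries the publisher */
        { NULL, NULL },
    };

    /* Write the record as plain ASCII, one tag per line. */
    static int save_ascii(const char *path)
    {
        FILE *f = fopen(path, "w");
        if (!f) return -1;
        for (int i = 0; rec[i].tag; i++)
            fprintf(f, "%s  %s\n", rec[i].tag, rec[i].data);
        return fclose(f);
    }

    int main(void)
    {
        if (save_ascii("record.txt") != 0)
            perror("record.txt");
        return 0;
    }

A MARC-format download would write the same fields in the MARC communications structure instead of the listing shown here.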
Le Pac "ANYWORD" feature, allows for virtually unlimited search capabilities. Additional search features, as they are developed, will be made available to the state. j. Search results may be sent to a printer, or saved to a disk. The ability to produce cards and labels is also a function of the Le Pac Professional. (Detailed information about this product has been provided at Attachment B. Brodart is proposing this product in conjunction with our Le Pac Catalog. There is no additional charge for the inclusion of the Le Pac Professional software and the license fee.) Please see "EXHIBIT A - PRICING PAGE" on page 36 for pricing information. 260 INTERFACE TO OTHER FUNCTIONS a. When DOS is resident on the library's PC (Le Pac NOS operating system employed), exit to DOS is achieved through use of the ALT/X key combination at the opening Le Pac Screen.' b. Both the Le Pac Professional (Attachment B) and our Precision One Products (Attachment P) have the ability to print cards and labels. 8.4 Noted The autoboot version of the software takes users directly in and out of the Le Pac system. 8.5 A step-by-step description of tasks and events has been provided at Exhibit D. 8.5.1 Exhibit D, "Schedule of Events" has also been completed. 8.6 An organizational chart, depicting staffing and appropriate lines of authority has been provided at Attachment S. 8.3.2 [261] IS? NO. 52011 g Page 21 of 33 The of ferar shall provide the following information for services providedwith the terms and conditions specified herein.raol Costsassociated withrequired shall be included in the following prices. It accordance providingth A Annual Edition of the Statewide Database: The offerer shall provide a total pricefor the annual edition of the statewide database. The total price shall includeall costs for the creaftion of the statewide database based on 3.5 aillionbibliographic records, authority osmtrol, producing the aster CS40d *iscs, tooopies of the cD-ton product, two magnetic 1600 bpi ASCII tape Copies, providing;400 copies of the software doceuentation/user amnal, software license, training,ec The offeror shall provide a price for aCh asditoial espy of the CMoproduct and software doceentation/user naval in excess ef 400 copies Theofferor shall Alsoprovide a price per bibliographic record a excess of 3*5bibliograpfticrecords, hr e of fror shall provide firm, fixed prices for theOriginal Contract Period and mazia prices for each extension period. Annual Edition of the Statewide Database: Original Contract Period: First Extension Period: Second Extension Period: Third Extension Period: Based on: 160,650.00 total3.5 million record *..1530n.. total:s,575, 000 records*Jiia2ZW-2-- total 3,650,000 record lma t,0L-totai.3,325.000 rsee..- CDwROM Product and So! tware DocumientacionmUser Mnuatl Lin eces of AAssumes 6 disc set. Additional discs are $15.00 per disc/copya. Original Contract Period:$2-10 per c1b. First Extension Period: per c c. Second Extension Period:$ - per ed. Third Extension Period: $ rVG 'Sibliographic Record in excess of 3.5 million bibliographic records: 00 copies. epy spy 'SPY a. Original Contract Period: per recordb. First Extension periods per r$perarcorde. Second Extension period: per recordd. Third Extension period: per record *Includes everything listed i A boy per record5. Statevide Database Seppleent: The of ferer shallow eattal price for thestatewide database supplement. 
The total price shall include all costs for the creation of the statewide database supplement, producing the master CD-ROM discs, 400 copies of the CD-ROM product, two magnetic 1600 bpi ASCII tape copies, etc. The offeror shall also provide a price for each copy of the CD-ROM product provided in excess of 400 copies. The offeror shall provide firm, fixed prices for the Original Contract Period and maximum prices for each extension period.

Statewide Database Supplement (based on an average 50,000 title supplement; the number of discs required will change with a growth rate of 71,000 unique titles per year):
a. Original Contract Period: $________ total
b. First Extension Period: $________ total
c. Second Extension Period: $________ total
d. Third Extension Period: $________ total

CD-ROM Product in excess of 400 copies (based on a 250,000 title supplement):
a. Original Contract Period: $________ per copy
b. First Extension Period: $________ per copy
c. Second Extension Period: $________ per copy
d. Third Extension Period: $________ per copy

C. Customized Changes: The offeror shall provide a price per hour for providing customized changes in the search software pursuant to the state agency's request. The offeror shall provide a firm, fixed price for the Original Contract Period and a maximum price for each extension period.
a. Original Contract Period: $________ per hour programming
b. First Extension Period: $________ per hour programming
c. Second Extension Period: $________ per hour programming
d. Third Extension Period: $________ per hour programming

D. Spinoff Product: The offeror shall provide a price per record for the creation of a spinoff product on a CD-ROM disc and a 9-track tape, together with a price per CD-ROM disc and per 9-track tape. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period. NOTE: The extraction fee below is waived for the production of Brodart products.

CD-ROM Disc (all prices are plus extraction fee, $.002 on the file extracted from):
a. Original Contract Period: $.005 per record; $________ per disc
b. First Extension Period: $________ per record; $________ per disc
c. Second Extension Period: $________ per record; $________ per disc
d. Third Extension Period: $________ per record; $________ per disc

9-Track Tape ($250.00 per 250,000 titles plus $25.00/reel):
a. Original Contract Period: $________ per record; $________ per tape
b. First Extension Period: $________ per record; $________ per tape
c. Second Extension Period: $________ per record; $________ per tape
d. Third Extension Period: $________ per record; $________ per tape

Shelflist: If proposed, the offeror must provide a price per record for the creation of a machine readable catalog record from a printed shelflist in USMARC format. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period. (Assuming standard specifications and standard collections.)
a. Original Contract Period: $.52 per record
b. First Extension Period: $________ per record
c. Second Extension Period: $________ per record
d. Third Extension Period: $________ per record

The offeror must provide a total price per library for any additional hardware needed to operate the search software. The total price shall include the cost of the equipment and installation. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period. (Assumes libraries currently using workstations as discussed in Part Two, Scope of Work.)
a. Original Contract Period: N/A
b. First Extension Period: $________
c. Second Extension Period: $________
d. Third Extension Period: $________

The firm, fixed prices stated above are provided in accordance with the terms and conditions of RFP No. 5201148.

AUTHORIZED SIGNATURE                         DATE: January 28, 1992

EXHIBIT B - PRICE ANALYSIS

Annual Edition of the Statewide Database:
1. Creation of the Statewide Database: No Charge
2. Authority Control: $.0075/title
3. Producing the Master CD-ROM Disc: $.023/title
4. 400 Copies of the CD-ROM Product: $15.00/disc/copy
5. Two magnetic 1600 bpi ASCII tape copies: $250.00/250,000 titles + $25.00/reel
6. 400 Copies of the Software Documentation/User Manual: Included
7. Other - Statewide Software Licensing Fee: $7,100.00/yr
TOTAL: See price quoted for item 00001 on the Pricing Page.

Statewide Database Supplement:
1. Creation of the Statewide Database Supplement: No Charge
2. Producing the Master CD-ROM Disc: $.023/title
3. 400 Copies of the CD-ROM Product: $15.00/disc/copy
4. Two magnetic 1600 bpi ASCII tape copies: $250.00/250,000 titles + $25.00/reel
TOTAL: See price quoted for item 00003 on the Pricing Page.

AUTHORIZED SIGNATURE                         DATE: January 28, 1992

Proposal from Auto-Graphics

PART ONE - INTRODUCTION AND GENERAL INFORMATION

1. INTRODUCTION

1.1 This document constitutes a request for competitive, sealed proposals from qualified individuals and organizations to provide services in accordance with the terms and conditions set forth herein.

2. ORGANIZATION

2.1 This document, referred to as a Request for Proposal (RFP), has been divided into the following parts for the convenience of the offeror:

2.1.1 Part One - General Information
2.1.2 Part Two - Scope of Work
2.1.3 Part Three - General Contractual Requirements
2.1.4 Part Four - Proposal Submission Information
2.1.5 Part Five - Exhibits

3. BACKGROUND INFORMATION

3.1 In order to enhance resource sharing and library automation at local, regional, and statewide levels, the Missouri State Library has contracted for the creation of a statewide database of bibliographic holdings and records and search software on CD-ROM discs. The CD-ROM discs contain records and holdings symbols for libraries throughout the state of Missouri. The project links libraries of all sizes and types and ensures that all Missouri residents have access to the materials and information they need.

3.2 Currently, the statewide database contains approximately 3.5 million unique records and 8 million holdings.

3.3 Currently, the records and holdings are on 9-track tape in MARC (Machine Readable Cataloging) format from Brodart Automation. Approximately 85 OCLC (Online Computer Library Center) and 100 non-OCLC three character symbols are on the tapes. The records and holdings from other libraries come from a variety of cataloging sources.

3.4 Over 185 public and academic libraries in Missouri have purchased CD-ROM disc players. Approximately 110 public libraries purchased Epson Equity I+ microcomputers and Hitachi 1503S CD-ROM disc players; a number of academic libraries also bought Hitachi 1503S CD-ROM disc players. Seventy academic libraries also acquired Bibliofile to help them convert local records to machine readable form.

3.5 The primary purpose of the statewide database was as a locator for interlibrary loan. The statewide database enables libraries to do interlibrary loan and provides access to collections across the state which have not been available previously.
However, usage reports suggest that the statewide database has been used increasingly for reference and cataloging purposes, including the generation of MARC records for internal automation efforts.

This document constitutes a request for sealed proposals, including prices, from qualified individuals and organizations to furnish those services and/or items as described herein. Proposals must be mailed to the Division of Purchasing, P.O. Box 809, Jefferson City, Missouri 65102, or hand-carried to its offices in Room 580, Harry S. Truman Building, Jefferson City, Missouri.

NOTES ON PRICING PAGES

A-G understands the State's requirement for firm, fixed prices, and has quoted prices on this basis. However, insofar as these prices are based upon the information provided in the State's RFP, we reserve the right to apply the same unit prices quoted to such orders as may exceed the quantities described in the RFP, or to quote additional prices to cover variations in processing not originally requested. Alternatively, A-G will accept the State's prior written instruction to limit processing to quantities originally forecast, thus avoiding the application of any additional charges.

1. Annual Edition of the Statewide Database

A. Original Contract Period

1.1 Creation of the statewide database, based on data preparation for 185 libraries @ $50.00 = $9,250.00
1.2 Authority control (including validation and replacement, and catalog cross-references for names and subjects), based on 3,500,000 existing database records + 500,000 added input records = 4,000,000 records @ $0.008 = $32,000.00
1.3 Producing the CD-ROM master discs (including data compression, indexing, premastering, and mastering), based on 4,000,000 records @ $0.0095 = $38,000.00
1.4 Copies of the CD-ROM product, assuming 600 sets of 4 discs @ $15.00/disc = $36,000.00
1.5 Two magnetic 1600 bpi ASCII tape copies, based on 4,000,000 records @ $0.0005 x 2 sets = $4,000.00
1.6 Providing copies of the software documentation/user manual, based on an annual statewide system license covering the Patron, Expert, Research, Location Scoping, System Administration, and Catalog Maintenance modules, plus initial provision of and updates for one user manual per site. Covers use of and support for software by any library within the state. Annual license fee due upon delivery of initial catalog and at each contract renewal: $29,250.00

Total firm, fixed price for original contract period: $148,500.00
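The line items above can be checked arithmetically. The following short C snippet simply re-adds the figures quoted in this section; it introduces no new pricing assumptions.

    /* Arithmetic check of the original contract period quoted above; the
       figures are copied from this pricing section, not recomputed. */
    #include <stdio.h>

    int main(void)
    {
        double items[] = { 9250.00,    /* 1.1: 185 libraries x $50.00        */
                           32000.00,   /* 1.2: 4,000,000 records x $0.008    */
                           38000.00,   /* 1.3: 4,000,000 records x $0.0095   */
                           36000.00,   /* 1.4: 600 sets x 4 discs x $15.00   */
                           4000.00,    /* 1.5: 4,000,000 x $0.0005 x 2 sets  */
                           29250.00 }; /* 1.6: annual statewide license      */
        double total = 0;
        for (int i = 0; i < 6; i++)
            total += items[i];
        printf("Original contract period total: $%.2f\n", total); /* 148500.00 */
        return 0;
    }

The same check confirms the extension-period totals of $138,750.00, $144,000.00 and $158,250.00 quoted below.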
B. First Extension Period (7/1/92 - 6/30/93)

1.1 Creation of the statewide database (i.e., maintenance) for 185 libraries @ $50.00 = $9,250.00
1.2 Authority control (including validation and replacement, plus generation of catalog cross-references for names and subjects), based on 1,000,000 additional input records @ $0.008 = $8,000.00
1.3 Producing the CD-ROM master discs (including data compression, indexing, premastering, and mastering), based on 4,500,000 records @ $0.0095 = $42,750.00
1.4 Copies of the CD-ROM product, assuming 600 sets of 5 discs @ $15.00/disc = $45,000.00
1.5 Two magnetic 1600 bpi ASCII tape copies, based on 4,500,000 records @ $0.0005 x 2 sets = $4,500.00
1.6 Annual statewide system license and documentation (terms as under A, item 1.6) = $29,250.00

Maximum price for first extension period: $138,750.00

C. Second Extension Period (7/1/93 - 6/30/94)

1.1 Creation of the statewide database (i.e., maintenance) for 185 libraries @ $50.00 = $9,250.00
1.2 Authority control (as above), based on 1,000,000 additional input records @ $0.008 = $8,000.00
1.3 Producing the CD-ROM master discs (as above), based on 5,000,000 records @ $0.0095 = $47,500.00
1.4 Copies of the CD-ROM product, assuming 600 sets of 5 discs @ $15.00/disc = $45,000.00
1.5 Two magnetic 1600 bpi ASCII tape copies, based on 5,000,000 records @ $0.0005 x 2 sets = $5,000.00
1.6 Annual statewide system license and documentation (terms as under A, item 1.6) = $29,250.00

Maximum price for second extension period: $144,000.00

D. Third Extension Period (7/1/94 - 6/30/95)

1.1 Creation of the statewide database (i.e., maintenance) for 185 libraries @ $50.00 = $9,250.00
1.2 Authority control (as above), based on 1,000,000 additional input records @ $0.008 = $8,000.00
1.3 Producing the CD-ROM master discs (as above), based on 5,500,000 records @ $0.0095 = $52,250.00
1.4 Copies of the CD-ROM product, assuming 600 sets of 6 discs @ $15.00/disc = $54,000.00
1.5 Two magnetic 1600 bpi ASCII tape copies, based on 5,500,000 records @ $0.0005 x 2 sets = $5,500.00
1.6 Annual statewide system license and documentation (terms as under A, item 1.6) = $29,250.00

Maximum price for third extension period: $158,250.00

II. Statewide Database Supplement

A. Original Contract Period and Extension Periods

1.1 Creation of the statewide database supplement, including authority control processing of added records: $0.0175 per supplement record
1.2 Producing the CD-ROM master discs (including indexing, premastering, and mastering): $1,950.00 per supplement edition
1.4 Copies of the CD-ROM product, assuming 600 single disc supplements @ $15.00 = $9,000.00
1.5 Two magnetic 1600 bpi ASCII tape copies, based on 500,000 records @ $0.0005 x 2 sets = $500.00

III. Data Input from Sources Defined on Pricing Pages

3.1 Tape: Prices shown apply to input records received for processing into the catalog. Prices assume records contain the minimum data needed to a) establish location code and local call number, and b) compare data against existing database records for merging purposes.

3.2 Disk: A handling charge of $25.00 per diskette will be applied to input received on floppy diskette.
Otherwise, charges for input on magnetic tape are identical to charges for input on MS-DOS diskettes.

4. Programming

Programming will be charged at the fixed rate quoted. Changes requested in addition to the specifications contained in the present RFP are subject to negotiation and scheduling.

5. Spinoff Products

5.1 CD-ROM disc: Prices shown are intended to be identical to prices quoted for publishing the statewide CD-ROM supplement.

5.2 9-track tape: A minimum charge of $975.00 will be applied once to cover multiple tape sets when sets are ordered concurrently.

6. Conversion

Additional charges may apply for additional customer requested keying, e.g., for copy-level data such as bar code numbers. Prices shown cover input of converted records to the statewide database. If separate output products are requested, the spinoff charges quoted above will apply.

PROPOSED METHOD OF PERFORMANCE

A. Response to Specifications in Scope of Work

This section of our proposal responds point-by-point to the specifications under "Scope of Work" (RFP Part Two), and also includes the information requested under RFP items 8.2 and 8.3. A sequential narrative outlining the steps involved in producing the statewide database on CD-ROM follows this section.

1. GENERAL REQUIREMENTS

Having created statewide databases on CD-ROM for Connecticut, Maryland, Tennessee, and other states, provinces, and large consortia, A-G understands the general requirements of the RFP. For this project, we propose to:

a. Accept a copy of the current Missouri database of some 3.5 million unique titles provided by Brodart, and reformat and index this file for CD-ROM publication.
b. Reformat and merge an additional 500,000 titles provided by participating libraries from various cataloging sources.
c. Apply standard authority control processing to the merged database.
d. Publish the merged database on a set of 4-6 CD-ROM discs and produce 400-600 copies of the CDs, software and documentation for use by participating libraries.
e. Produce annual catalogs and interim supplements on CD in the same manner, processing some 1 million titles each year for this purpose. The statewide database will be maintained by A-G, such that only new input will need to be processed for each update.

2. HARDWARE COMPATIBILITY

All of the equipment described is known to be compatible with our software and CDs. IMPACT requires a fully IBM-compatible 8088, 80286, or 80386 PC with:

o 640K RAM
o At least one 1.2MB or higher capacity floppy disk drive
o MS-DOS version 3.2 or higher
o Standard keyboard, preferably with the ten function keys on the left side
o Monochrome or color monitor with graphics card
o Minimum of one CD-ROM drive; the system supports access to multiple drives. Any model equipped with MS-DOS CD-ROM Extensions version 2.0 or higher can be used.

Most libraries are currently purchasing 80286 machines with a 40MB hard disk, since this configuration represents a better value, dollar-for-dollar, than the older XT-level machines. However, the basic system runs fine (though not as fast) on an 8088 machine with a high-density floppy drive, and hundreds of libraries continue to use the catalog on the original XT platform. If existing equipment is to be used, some modifications may be necessary, depending upon the current configurations in use.
Assuming that the existing PC is itself compatible, necessary modifications might typically include upgrading RAM from 512K to 640K or 1MB, replacing a 360K floppy drive with a high density drive, adding one or more CD-ROM drives and/or installing the MS-DOS Extensions. Optional upgrades might include the addition of a printer, modem, hard disk or second floppy drive, or a VGA color monitor.

Systems known to be incompatible include certain high-end IBM PS/2 models (PS/2 model 70 and above). Also, neither Apple nor Macintosh equipment will support our software, which requires a fully IBM PC-compatible machine. Hitachi, Sony, Amdek, Toshiba, NEC, and Philips CD-ROM drives are all known to be compatible, given the use of the Microsoft Extensions. We have no information about the compatibility of models offered by Pioneer and DENON, but assume these would be compatible if a High Sierra driver is available. We understand that the Philips model CM155 does not support such a driver and would therefore not support the system.

For libraries wishing to purchase new equipment, we recommend the following configuration, and will guarantee a purchase price of less than $2,000 per unit for orders placed through the State Library during the initial contract period. This price includes a one-year warranty. Extended service contracts are available at an annual cost equal to 10% of the original purchase price.

Recommended 80286 IMPACT Cataloging Station*

o 80286 12MHz CPU with 1MB RAM
o One internal Hitachi CD-ROM drive
o 40MB 28ms hard disk drive
o One 5.25" or 3.5" high density floppy disk drive
o 12" monochrome monitor with graphics card**
o 101 keyboard with function keys on left
o 200 watt power supply
o One parallel and two serial ports
o Front security panel with standard IMPACT signage***
o All cabling and connectors
o MS-DOS version 3.3 and Extensions
o Shipping to individual site
o One year warranty

* Dial access would require the addition of a standard modem and off-the-shelf telecommunications software.
** A VGA monitor with card can be substituted for this monitor at time of order at an additional charge.
*** Each catalog station's front panel provides complete protection for internal CD-ROM and floppy disk drives, preventing unauthorized access or tampering. Security panels can be removed by authorized staff using the custom tool provided, and can be exchanged from unit to unit as needed, leaving no marks when removed. The security panel also serves as the nameplate for each catalog station, clearly identifying each unit as a public access catalog. Custom signage, including library logos, further description of the catalog, or other information, is also available.

3. STATEWIDE DATABASE CREATION

3.1 A-G will agree to provide the first edition of the statewide database on CD within four months of the award of the contract, subject to the requirements for receipt of files and specifications described below and in the schedule of project phases following this section. We can produce future editions of the catalog and supplement by the dates indicated under the same terms.

3.1.1 A-G will produce the demonstration database within 30 days of receipt of the Brodart USMARC tapes and final processing specifications. Please note that, according to the terms specified by the State, half of the time allowed for production of the catalog will be exhausted by the time the demonstration database is received by the State.
Thus, it will not be possible to address any changes desired as a result of the State's review of the demonstration database without affecting the schedule. A-G will not be responsible for delays resulting from changes initiated by the State after processing has begun.

3.2 As a condition of our agreement with the State, A-G will require files and file specifications for each data source to be included. A-G will provide profile forms and answer any questions libraries may have regarding the information required, but the State will be ultimately responsible for ensuring that all data is received by A-G prior to the input cutoff for each catalog or supplement.

3.2.1 We understand this to mean that the Brodart tapes will be received in readable condition at A-G no later than thirty days after the date of contract award. In order to ensure that these tapes can be processed immediately, we will require a record count, a list of location codes, and specifications detailing the format of location/call number and record control number data no later than 10 days following award of the contract. Failure of the data on the delivered tapes to correspond to the specifications shall be grounds for renegotiating the schedule.

3.2.2 The same conditions noted above apply to files provided from the additional sources listed under this item.

3.2.3 Data from the optional sources listed on the Pricing Page can be processed under the same terms noted above, with the following exception: inclusion of non-MARC files, and of files in which location/call number data and MARC data are not part of the same bibliographic record (e.g., separate item and MARC files), will require renegotiation of the schedule.

3.3 (Including 3.3.1 - 3.3.4.) A-G will deduplicate records from the additional sources specified against the Brodart tapes, and retain the preferred version from among multiple occurrences according to the hierarchy outlined here. We assume that the present Brodart database has been merged and that the preferred version has been kept, and will not attempt further internal deduplication within this file. Upon receipt, the Brodart file will be indexed by OCLC control number (field 001) and LCCN (field 010) so that additional files can be merged on this basis. If the State prefers a more exacting match, selected data from other MARC fields can be used to validate LCCN matches. Deduplication by text matching is not included in the present proposal but can be ordered as an additional service if desired. (Please refer to Attachment V, section B.)

3.4 A-G's union database system (see Attachment V, section C) supports the hierarchy described, to the extent that this is supported by the data in each input file or record. For example, records must contain data on source of cataloging (MARC field 040) and date of cataloging (MARC field 005, or the file presented in chronological sequence) in order to be considered in this hierarchy. In order to be properly consolidated during this process, holdings data must first be standardized from the various formats in which it may currently appear into a common format (A-G uses 949 $1 location code, $a local call number). A-G will need completed profile forms for each data source in order to reformat holdings data properly. (See Attachment IV for sample profile forms.)
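The merge step described in 3.3 and 3.4 can be sketched in C as follows. This is an illustration only, not A-G's production system: the record structure, the preference test (the more recently catalogued version wins) and the holdings consolidation are simplified assumptions, and the sample data are invented.

    /* Illustrative sketch: match an incoming record against the base file
       on OCLC number (field 001) or LCCN (field 010), keep the preferred
       version, and consolidate holdings. */
    #include <stdio.h>
    #include <string.h>

    struct rec {
        char oclc[16];     /* field 001, empty if none    */
        char lccn[16];     /* field 010, empty if none    */
        int  cat_date;     /* field 005 as YYYYMMDD       */
        char holdings[64]; /* consolidated location codes */
    };

    static int same_title(const struct rec *a, const struct rec *b)
    {
        if (a->oclc[0] && strcmp(a->oclc, b->oclc) == 0) return 1;
        return a->lccn[0] && strcmp(a->lccn, b->lccn) == 0;
    }

    /* Keep the more recently catalogued version; merge holdings either way. */
    static void merge(struct rec *base, const struct rec *in)
    {
        size_t cap;
        if (in->cat_date > base->cat_date) {
            char saved[64];
            strcpy(saved, base->holdings);
            *base = *in;
            cap = sizeof base->holdings - strlen(base->holdings) - 1;
            strncat(base->holdings, saved, cap);
        } else {
            cap = sizeof base->holdings - strlen(base->holdings) - 1;
            strncat(base->holdings, in->holdings, cap);
        }
    }

    int main(void)
    {
        struct rec base = { "12345", "", 19890101, "MoJ " };
        struct rec in   = { "12345", "", 19910601, "MoK " };
        if (same_title(&base, &in))
            merge(&base, &in);
        printf("kept %d, holdings: %s\n", base.cat_date, base.holdings);
        return 0;                /* kept 19910601, holdings: MoK MoJ */
    }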
3.5 A-G will publish the MULSP database as a separate file to be included on one of the statewide database CDs. A-G will provide software to support a function key enabling users to "toggle" between the main catalog and the MULSP serials file. The MULSP catalog will be separately indexed and can support different display and scoping options from the main catalog. Also, if summary holdings data is to be displayed, we recommend that the State consider using the optional Holdings Display feature designed for display of the more extensive holdings data associated with serials. Please refer to Attachment III, page 123.

3.5.1 These fields can be indexed for retrieval per the standard index arrangement described in Attachment X. All fields listed can be displayed, if desired. We do not find any indication in your specification as to the location of the holdings data within the MULSP records, and will need further specifications on this file if we are to proceed.

3.6 Our general policy is to use the OCLC number as the record control number where present, and to assign a sequential number in a higher range for non-OCLC records; we propose to follow this procedure here.

3.7 Generally, all 6XX fields are retained in the database, regardless of indicator. If the State wishes to have certain 6XX fields dropped on the basis of tagging or indicator value, to save space or for any other reason, this would need to be specified.

4. AUTHORITY CONTROL

4.1 Our proposal includes validation and replacement of the entire cumulated statewide database against the complete and up-to-date LC name and subject authority files, plus generation of see and see also references. Validation and replacement processing is described in detail in Attachment VI, section C. The generation and operation of see and see also references are described in Attachment VII, section B. All fields listed in this item of the RFP are addressed, as shown in the validation matrix appended to Attachment VI.

4.2 Our proposal includes the validation and replacement processing described above for the entire initial database, and for all new records added in subsequent updates.

4.3 See and see also references will be freshly generated for the entire file following each update. Cross-references are not invisible to the user, since this would not allow users to choose from among multiple references from the same term. Instead, the cross reference is shown to the user, and the user can press "Enter" to show the choices under the referenced term or, in the case of multiple terms, move the cursor to select the term desired and then press "Enter".

4.4 We have assumed that validation and replacement processing would be applied to the initial database, and to all records later added to the database in the course of the contract. If the State is satisfied with the authority control of the Brodart database, we would be prepared to apply validation and replacement processing only to the new records added for the initial catalog, thus reducing the cost of the initial catalog by an estimated $24,000. Alternatively, the State could request re-authorization of the entire database at any time at the same per-record price.
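Validation and replacement, as described in section 4, can be pictured with a small C sketch. It is illustrative only, not A-G's processing: the authority table, its entries and the see-reference output are invented for the example.

    /* Illustrative sketch: a heading is looked up in an authority table;
       an obsolete form is replaced by the established form, and a "see"
       reference from the old term is generated for the catalog. */
    #include <stdio.h>
    #include <string.h>

    struct authority {
        const char *variant;      /* form that may appear in input records */
        const char *established;  /* LC-established heading                */
    };

    static const struct authority auth[] = {
        { "Twain, Mark, 1835-1910", "Twain, Mark, 1835-1910" }, /* valid   */
        { "Clemens, Samuel",        "Twain, Mark, 1835-1910" }, /* replace */
    };

    static const char *validate(const char *heading)
    {
        int n = sizeof auth / sizeof auth[0];
        for (int i = 0; i < n; i++)
            if (strcmp(heading, auth[i].variant) == 0) {
                if (strcmp(auth[i].variant, auth[i].established) != 0)
                    printf("see reference: %s -> %s\n",
                           auth[i].variant, auth[i].established);
                return auth[i].established;  /* replacement */
            }
        return heading;  /* unmatched headings pass through for review */
    }

    int main(void)
    {
        printf("validated: %s\n", validate("Clemens, Samuel"));
        return 0;
    }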
5. CD-ROM DISC CREATION

5.1 The authorized statewide database will be indexed and published on CD-ROM discs, as required.

5.1.1 Our proposal includes pricing based on a minimum of 400 sets of discs. The State may order any quantity in excess of this figure at the unit prices quoted. We have used a figure of 600 sets in calculating costs in the notes accompanying the Pricing Pages. In the event that all sets are not ordered at the same time, the State should be aware that the minimum order that can be processed at these prices is 100 sets.

5.2 All of the CD-ROM drives mentioned are known to be compatible. Please refer to item 2, above.

5.3 All IMPACT CDs are produced in the High Sierra (ISO 9660) format for CD volume and file structure, and require the use of the MS-DOS Extensions.

5.4 A database totalling 4 million records could only fit on one 640MB CD if each record, with indexes, cross-references, and holdings, averaged less than 160 characters. Since the average length of a generic MARC record is 600-800 characters, this would imply unacceptably drastic reductions in the amount of data that could be stored or indexed. Given the fact that the present statewide database resides on six CDs, we are sure that the State is aware of the issues regarding the storage limits of the CD-ROM medium. We expect that the use of data compression will reduce the size of the MARC record portion of the catalog by about 60%, allowing about 920,000 full MARC records with indexing and cross references to fit on each CD. Depending on whether the MULSP serials file is reflected in the totals provided, and on the number of unique records ultimately included in the initial catalog, we expect that the initial catalog will require four discs, which would grow to five or six discs as the database continues to expand in subsequent updates. This storage requirement could be arranged in several ways:

o as 4 or 5 separate discs, split by date or some other characteristic, each searchable on one drive, requiring disc swapping;
o as one transparently searchable 4 or 5 disc set requiring an equivalent number of drives;
o as two separately searchable two-disc sets, split by date, each requiring two CD-ROM drives, possibly with a fifth disc containing the MULSP file and non-print items.

5.5 The system will support transparent use across multiple drives, up to the limits of the user hardware. Depending upon the CD-ROM card used, some machines will support up to four drives, while others may support up to eight.

5.6 A-G will not require the return of previous editions.

5.7 We would expect to establish the format of the data with the State before the time of delivery, but can certainly agree to this requirement, provided that no proprietary information is required.

5.8 As we understand it, this requirement would apply to the first and third scenarios listed under 5.4, above. Actually, such a disc would be unnecessary in the third scenario, since the user would need to make two searches (the author/title index and the CD with the actual record) in any event. If the first scenario were to be selected, A-G is willing to negotiate developing such an index disc in a subsequent contract year and at additional cost to the State.
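The storage arithmetic in 5.4 can be made explicit. In the following C sketch, the 640MB capacity, the 600-800 byte generic MARC record and the 60% compression figure are taken from this section; the per-record allowance for indexes and cross-references is our assumption, chosen to show how a figure of roughly 920,000 records per disc arises.

    /* Worked version of the capacity arithmetic in 5.4.  The index
       allowance of 416 bytes per record is an assumption, not a figure
       from the proposal. */
    #include <stdio.h>

    int main(void)
    {
        double disc_bytes   = 640e6;        /* one 640MB CD-ROM            */
        double record_bytes = 700.0;        /* mid-range generic MARC      */
        double compressed   = record_bytes * 0.40; /* 60% cut -> 280 bytes */
        double index_bytes  = 416.0;        /* assumed indexes + refs      */

        /* With no compression or indexing overhead removed, 4,000,000
           records on one disc would allow only: */
        printf("bytes/record for 4M on one disc: %.0f\n",
               disc_bytes / 4e6);                       /* 160 */

        printf("records per disc with compression: %.0f\n",
               disc_bytes / (compressed + index_bytes)); /* about 919,500 */
        return 0;
    }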
6.2.1 The function key-driven approach obviates the need for two modes of access by enabling rapid, single keystroke progression through all system menus and eliminating redundant and unnecessary displays at all stages of a search. Logically obvious choices are made automatically by the system, not required of the user. While system operation is simplified for all users, system capabilities are diversified so that novice users are led naturally toward the least complex modes of searching, while experienced users may skip to more advanced techniques. 6.2.2 The System Administration model supports local profiling of record displays. Please refer to Attachment III, section 6.2. 6.3 A-G has quoted a license fee for an annual, renewable statewide license that would cover use of the software proposed by any library within the State. 6.4 Search software will be provided on either 5.25" or 3.5" diskettes, as preferred by the State. 7. SPECIFIC SEARCH SOFTWARE REQUIREMENTS 7.1 (Items a - d). All fields listed, except SuDoc number, are currently supported and will be indexed in the statewide database. Please refer to Attachment X. SuDoc number access is supported in our GDCS catalog of GPO materials, and can be 280 added to the statewide database at the programming charges listed. 7.2 See items below. 7.2.1 The search qualification options described are supported by the Research Level software module (see Attachment IX), but will require an additional index that would increase the storage requirement for the catalog and possibly the number of discs required. 7.2.2 The system provides an "Escape Search" function meeting this requirement. 7.2.3 A variety of save and print functions are supported; see Attachment III, section 5. 7.2.4 Any item may be displayed in greater detail by highlighting the item and pressing "Enter". 7.3 See items below. 7.3.1 Context-sensitive help menus are available from any screen, and can be edited by library staff at the local site or by a global instruction from the State to include the level of assistance thought to be required. 7.3.2 Error messages are provided as specified. 7.3.3 Case distinctions are ignored as required here. 7.3.4 Spacing distinctions beyond a single space are ignored as required here. 7.3.5 The system actually supports four levels, which meet the specifications described here. Please refer to Attachment VIII, section B. 7.3.6 Holdings will be sorted alphanumerical by four character code, as specified. The System Administration software also supports a feature allowing holdings to be grouped into up to nine separate alphabets, if preferred. Each alphabet can also be displayed with an assigned, separate label. This may address the State's desire to identify and sort holdings according to the 281 size of the library, thus obviating the need for a five-character code. However, if preferred, the State can designate five- character codes for display purposes at any time. 7.3.7 Multiple screens are supported as specified, both for single records and for record lists. "More" and "End" messages are displayed to inform the user that more record/list is available or that the end of the record/list has been reached. 7.3.8 Implied "and" is assumed in all multi-word keyword searches, unless explicitly overridden using the Boolean capabilities of the Research Level software. 
Explicit "and", "or", and "not" operators can be invoked at this level, although this would require an additional index that would increase the storage requirement for the catalog and possibly the number of discs required. 7.3.9 Any alphanumeric string may be searched, as specified. 7.3.10 Local call numbers are displayed in association with the library holdings code, as specified. 7.4 Location Scoping is supported, as required here. Please refer to Attachment IX, section C, for details of the operation of this feature. 7.5 The search software will not interfere with the normal operation of any of the telecommunications packages or other software listed. 7.6 Transitions between the CD and the magnetic disks are transparent, as specified. 7.7 See items below. 7.7.1 Please refer to 7.2.4, above. 7.7.2 The "Prior Step" function key allows users to step back through any search to its origin. 282 7.7.3 Author, title, and subject entries can be browsed in a combined dictionary index at the Patron Level. Combined Boolean keyword access is also supported, but will require additional CD-ROM storage. 7.7.4 Keyword searching will identify multiple words wherever they occur within a field. 7.7.5 Truncated searching is supported at the Research Level. Additional indexing and CD- ROM storage may be required. 7.8 See items below. 7.8.1 The system reports the number of matches, as specified. 7.8.2 The browse mode supports near-matching, as specified. 7.8.3 Browsing is supported as specified. Please refer to Attachment IX, section A. 7.8.4 (No specification.) 7.8.5 Users can return to the current search entered and modify its terms without rekeying the entire search entry. However, prior searches cannot be retrieved once a new search has been entered. 7.8.6 Response time will vary with the type of search and the equipment used, but average response times for simple searches will generally average 3 - 5 seconds. 7.9 Fields may be added to the indexing arrangement described in Attachment X upon request. A-G reserves the right to negotiate costs and scheduling for requests that would necessitate different screen designs, additional function keys, or different types of searches (i.e., other than browsing, keyword, Boolean, number). 7.10 Please refer to note above. 8. INTERFACE TO OTHER FUNCTIONS 8.1 See items below. 283 8.1.1 MARC records can be downloaded using the "CHOOSE" function. Please refer to Attachment III, section 5.4. 8.1.2 Choice of MARC or ASCII text format is supported; see Attachment III, section 5.4. 8.1.3 The system can be set up either to inhibit or enable users to exit to DOS using the "Escape" key. 8.2 The "CHOOSE" function allows the system to interface with external card production software programs such as UltraCard/MARC and our own IMPACT/Slims small library management system. However, these programs are not included in the present proposal. 9. TRAINING AND DOCUMENTATION 9.1 The User Guide included as Attachment III will be provided with each set of CD-ROM discs initially provided. Updates to this documentation will be provided automatically as changes are released. 9.2 Two training sessions will be provided as required. Please refer to Attachment XI for additional information on training. 9.3 Enhancements to the software modules proposed will be provided with each new catalog edition as they are developed and released. 10. DISTRIBUTION 10.1 A-G takes no exception to this section and will provide spinoff products upon request at the prices quoted. 
10.2 A-G takes no exception to this section and will provide the tape copies as specified. 1600 bpi tapes will be provided if required, although we suggest that the State consider whether 6250 bpi tapes or 8mm cartridge tapes might not be provided as an alternative.

11. STATEWIDE DATABASE MAINTENANCE

11.1 A-G takes no exception to this specification. Please refer to Attachment V, sections A.3 and E, for details of union database maintenance procedures and options.

11.2 A-G will provide tape spinoff products as specified, upon request.

11.3 Our proposal includes a Catalog Maintenance module that will allow authorized users to create transactions to add, change, or delete their holdings within the union data base. These transactions are written to a floppy disk and sent to A-G for batch application to the data base in the scheduled update cycle. Libraries using the catalog as a resource for retrospective conversion can either create holdings transactions to add their holdings to the data base and then arrange to have a complete file of their holdings extracted at a later date, or download edited MARC records for immediate use, or both. Smaller public libraries using both methods to convert their collections have reported match rates of up to 90% against IMPACT catalogs containing comparable numbers of records. Please refer to Attachment III, section 7, for a more detailed description of the procedures used within this module.

12. ADDITIONAL REQUIREMENTS

12.1 Please refer to the customer list included as Attachment I.

12.2 Error correction is included as part of the ongoing software license and support fee. Please refer to Attachment XIV for a copy of the standard license terms and a description of our policy on correction of data errors.

12.3 A-G has provided CD-ROM catalog services for libraries since 1987. Please refer to Attachment I for the customer list.

12.4 A-G will provide the spinoff tape products specified upon request.

12.5 A-G accepts this specification as stated.

13. LIQUIDATED DAMAGES

13.1 A-G accepts this specification, subject to the terms for receipt of files and project specifications described in section 3, above, and with the provision that A-G will not be held responsible for this penalty in the event of delays occasioned by the State, acts of God, or any other forces beyond our control.

13.2 A-G accepts this specification on the same terms as the item above, and with the added provision that this specification may be renegotiated in the event that the State introduces further specifications for the format of these tapes beyond those described in the RFP.

B. Additional Information Requested under RFP Sections 8.2 - 8.3

8.2.1 HARDWARE COMPATIBILITY: Covered under section A.2.

8.2.2 STATEWIDE DATABASE CREATION: Covered under section A.3 and Attachment V.

8.2.3 AUTHORITY CONTROL: Covered under section A.4 and Attachment VI.

8.2.4 CD-ROM DISC CREATION: Covered under section A.5.

8.2.5 SEARCH SOFTWARE

a. Please refer to Attachment XIV.

b. Our development schedule is driven by customer requests, and as such we have no scheduled dates for future system developments. Generically applicable software enhancements resulting from customer requests are made available to all users of the particular system modules affected as they are developed and released. A recent example is a utility program allowing users to extract MARC database subsets by holding code in batch mode directly from a CD-ROM union catalog. Software enhancements are distributed as new catalog editions are produced and delivered.
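A subset-extraction utility of the kind just described amounts to a filter over the union catalog: keep each record whose holdings include the requested code. The C sketch below is illustrative only, not the A-G utility; the record layout, titles, and holding codes are hypothetical stand-ins for data that would really be read from the CD-ROM.

    #include <stdio.h>
    #include <string.h>

    struct record {
        const char *title;
        const char *holdings[4];   /* four-character holding codes */
    };

    int main(void)
    {
        /* Hypothetical union catalog records. */
        struct record db[] = {
            { "Gone with the wind", { "STLM", "KCPL" } },
            { "Huckleberry Finn",   { "SPRG" } },
            { "Missouri history",   { "STLM" } },
        };
        const char *want = "STLM";   /* hypothetical code to extract */

        /* Batch pass over the database: emit matching records. */
        for (size_t r = 0; r < sizeof db / sizeof db[0]; r++)
            for (int h = 0; h < 4 && db[r].holdings[h] != NULL; h++)
                if (strcmp(db[r].holdings[h], want) == 0) {
                    printf("extract: %s\n", db[r].title);
                    break;
                }
        return 0;
    }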
c. These (plus location scoping) are the only qualifiers currently supported.

d. Please refer to the sample CDs included. Initiating a keyword search on a term not in the database will produce a "No matches found..." message.

e. Customized scoping for each library (actually, each PC) is available. Please refer to Attachment IX, section C.

f. Please refer to Attachment VIII, section D.5, for a list of stopwords.

g. See the item above. Normally, even single-character "words", e.g., author initials, are searchable. For a catalog this size, it may be necessary to limit indexing of words that appear in tens of thousands of entries.

8.2.6 INTERFACE TO OTHER FUNCTIONS

a. Covered under section A.7.2.3.

b. Covered under section A.8.1.2.

c. The search software is designed to operate IMPACT catalog CDs, and is not itself compatible with any other CD-ROM databases.

8.2.7 TRAINING AND DOCUMENTATION

a. Please refer to Attachment III.

b. Software is written in "C".

8.2.8 STATEWIDE DATABASE MAINTENANCE: Covered in section A.11.3.

8.3 SEARCH SOFTWARE

8.3.1 See items below.

a. Please refer to section 6 of the User Guide included in Attachment III. The software profile included in Attachment IV also provides a summary of the profiling options available.

b. The brief, or "four-up", screen display shows title, author, date, and call number for up to four matched records per screen. Selection of any of these records produces a further labelled display which is entirely profileable by the local user. Please refer to Attachment III, section 6.

c. Any combination of the indexed fields listed in Attachment X (except control numbers) can be searched. As indicated elsewhere, inclusion of this index may increase CD storage requirements.

d. The number of matches is specified up to 9999, and higher in Research Level searching.

e. Please refer to Attachment IX, section A.

f. Covered under section A.7.8.5.

g. Covered under section A.7.9. Additions should be requested in writing well before the next scheduled cutoff date to allow for programming and testing.

h. None of these fields should have any effect on response time, or any significant effect on CD storage, although it is possible that they could tip the balance in a case where all current discs were very close to being full; i.e., within 10 MB per index.

i. Indexing for local call number browsing is available as an option, but has not been proposed due to the amount of CD storage consumed.

j. Search results can be downloaded using the "CHOOSE" function in MARC or ASCII text formats and transferred to other systems and programs for external applications.

8.3.2 INTERFACE TO OTHER FUNCTIONS

a. Covered in section A.8.1.3.

b. Covered in section A.8.2.

C. Outline of Project Phases

A-G projects the following sequence of events and project phases following receipt of our proposal, assuming we are awarded a contract to produce and maintain the statewide CD-ROM catalog. An estimate of the time required for each phase is included, indicating those areas where completion of the project phase would be dependent upon actions or decisions to be taken by the State. We have tentatively scheduled production resources within a November - March time frame, based on our expectation that the contract would be awarded in October. These dates can be adjusted if the State requires more time to submit input files or review project specifications.
However, we reserve the right to renegotiate the schedule in this event, to allow for other projects that may be in production within a later time frame. It should be understood that our ability to conform to this, or any set of dates proposed, is dependent on the finalization of project specifications with the State, and the receipt of the input data necessary for the initial catalog. In order to deliver the initial catalog within 4 months of award, we propose the following schedule of project phases.

Proposal Evaluation (October): Following receipt of our proposal, we will be pleased to answer any questions the State may have, to discuss alternative project scenarios, or to provide any additional information we can that may be helpful.

Contract Award (Assume November 1)

Data Profiling and Receipt (November): During this phase, A-G will expect to receive the completed profile forms previously distributed, along with the actual input files to be used in assembling the initial catalog data base. While Auto-Graphics will begin data preparation (the next phase) for individual files as they are received, it should be understood that the project cannot advance so long as we are lacking files and/or profile information. For this reason, we will need to establish a mutually agreeable cut-off date, beyond which we would proceed without any input files not yet received or profiled. We are willing to hold this project phase open for as long as necessary, although this would delay the projected delivery date for the catalog. The following schedule assumes that all profiles and data files will have been received on or before November 29, 1991.

Data Preparation (December 2 - 20): During this phase we will standardize the location/call number data in the various input files to a common format that will support both their retention through the deduplication process and the IMPACT system's location scoping feature. Error listings will be generated for records found to be unprocessable based on the profiles provided; e.g., records with location codes not listed in the profile forms completed by the library, or lacking the field or fields from which location or call number information was supposed to have been taken. These printouts will be returned to the contributing library for review and resolution prior to the next edition of the catalog.

Demonstration Database Production (December 2 - 20): As soon as a suitable subset of the data base has been prepared, A-G will produce a small sample catalog on floppy disk or CD-ROM for use by the State as an advance demonstration of the system to be provided. This file will be delivered to the State by December 20, unless prior dates have been adjusted by the State.

Data Base Consolidation (December 21 - January 15): A-G will index the current Brodart database and match in records from additional sources (profiled and delivered in time for the catalog cutoff) to create a unified catalog data base consisting of unique master records, with all applicable local holdings data cumulated to the master version of each record. The resulting file will be ready for CD-ROM premastering; i.e., indexing and cross reference generation.

Authority Control Processing (January 16 - 31): A-G will process the cumulated master file against the complete and up-to-date LC name and subject authority files, applying automatic global changes resulting from a match with LC 4XX headings. A separate process will be used to generate cross references for each CD-ROM disc.
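The automatic global change described here is the classic authority "flip": a heading that matches an LC 4XX (see-from) form is replaced by the authorized 1XX form. The C sketch below is illustrative only, not A-G's processing; the authority entries are hypothetical.

    #include <stdio.h>
    #include <string.h>

    struct authority {
        const char *see_from;     /* 4XX (see-from) heading   */
        const char *authorized;   /* 1XX (authorized) heading */
    };

    /* Return the authorized form of a heading, flipping it if it
       matches a see-from reference; otherwise leave it unchanged. */
    static const char *flip(const char *heading,
                            const struct authority *auth, int n)
    {
        for (int i = 0; i < n; i++)
            if (strcmp(heading, auth[i].see_from) == 0)
                return auth[i].authorized;
        return heading;
    }

    int main(void)
    {
        struct authority lc[] = {   /* hypothetical entry */
            { "Clemens, Samuel Langhorne", "Twain, Mark, 1835-1910" },
        };
        printf("%s\n", flip("Clemens, Samuel Langhorne", lc, 1));
        printf("%s\n", flip("Twain, Mark, 1835-1910", lc, 1));
        return 0;
    }

A production run would of course work against the full LC authority files and record each change for the cross-reference generation step.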
CD-ROM Catalog Production (February 1 - 15): During this phase we will divide the file according to the CD-ROM storage option selected by the State and generate indexes. After premastering is completed, Auto-Graphics' project manager will review the premastered file on our CD-ROM publisher, using the actual software to be provided to the State. This quality control check verifies that all access points and displays conform to project specifications. We will need a final order for the number of CD-ROM disc sets to be produced at this time. The State may wish to order extra sets for backup and new participants now, since an additional service charge will apply to re-orders. Also, we will need to have at this time a final order for the number and configuration of software units to be provided with the initial catalog. These will be configured and copied while the CD-ROM discs are being mastered and replicated.

CD-ROM Mastering and Quality Control (February 16 - 28): After verification, premastered tapes are sent to our subcontractor for mastering and replication. A final quality control check is also performed when the replicated discs are returned.

CD-ROM Catalog Delivery (by March 1): All copies of the CD-ROM catalog discs, software, documentation, and project statistics will be delivered to the State by this date, subject to the terms of our proposal in section A.3, above.

Training (March): Training dates will be scheduled according to a schedule negotiated with the State. We have found that two days of training is generally sufficient to provide library systems of a similar size with a base of trained individuals who can serve as an ongoing resource for other participants and staff members.

Delivery of Database Copies (by March 31, or within 30 days of catalog delivery): A-G will deliver the two sets of database tapes required by this date.

Ongoing Database Updates: Once the original catalog has been delivered and accepted, Auto-Graphics will provide services to maintain the union data base, software, and, optionally, equipment purchased from us. The data base can be maintained and expanded by: 1) applying MARC transaction tapes provided by members, consisting of records added, changed, or deleted since the original data cutoff; 2) merging in complete MARC data bases for new participants; or 3) applying holdings transactions created using the optional Catalog Maintenance module to add, delete, or change holdings on existing data base records. Transactions provided by any of these means will be applied to the existing data base as part of an update cycle leading up to the publication of the annual catalog or supplement. Subsequent catalogs and supplements will follow a production cycle similar to that outlined above.

D. Other Information

A summary of the schedule outlined above (RFP Exhibit D) follows this section. Please refer to Attachment XV for a general A-G organization chart.

EXHIBIT A PRICING PAGE (CONTINUED)

CD-ROM Product in excess of 400 copies:
a. Original Contract Period: $ [illegible] per copy
b. First Extension Period: $ [illegible] per copy
c. Second Extension Period: $ [illegible] per copy
d. Third Extension Period: $ [illegible] per copy

C. Customized Changes: The offeror shall provide a price per hour for providing customized changes in the search software pursuant to the state agency's request.
The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: $75.00 per hour
b. First Extension Period: $ [illegible] per hour
c. Second Extension Period: $75.00 per hour
d. Third Extension Period: $71.00 per hour

D. Spinoff Product: The offeror shall provide a price per record for the creation of a spinoff product on a CD-ROM disc and a 9 Track Tape. The offeror shall provide a price per CD-ROM Disc and per 9 Track Tape. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.

CD-ROM Disc (minimum $500.00/catalog):
a. Original Contract Period: $0.0175 per record; $25.00 per disc
b. First Extension Period: $0.0175 per record; $25.00 per disc
c. Second Extension Period: $0.0175 per record; $25.00 per disc
d. Third Extension Period: $0.0175 per record; $25.00 per disc

9 Track Tape (minimum $500.00/tape copy):
a. Original Contract Period: $0.0025 per record; $25.00 per tape
b. First Extension Period: $0.0025 per record; $25.00 per tape
c. Second Extension Period: $0.0025 per record; $25.00 per tape
d. Third Extension Period: $0.0025 per record; $25.00 per tape

E. Shelflist: If proposed, the offeror must provide a price per record for the creation of a machine readable catalog record from printed shelflist in USMARC format. The offeror shall provide a firm fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: $0.45 per record
b. First Extension Period: $0.475 per record
c. Second Extension Period: $0.50 per record
d. Third Extension Period: $ [illegible] per record

F. The offeror must provide a total price per library for any additional hardware needed to operate the search software. The total price shall include the cost of the equipment and installation. The offeror shall provide a firm fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: $1,095.00
b. First Extension Period: $1,095.00
c. Second Extension Period: $1,095.00
d. Third Extension Period: $1,095.00

The firm, fixed prices stated above are provided in accordance with the terms and conditions of RFP B201148.

EXHIBIT B PRICE ANALYSIS

Annual Edition of the Statewide Database
1. Creation of the Statewide Database: No charge
2. Authority Control: $3,750.00
3. Producing the Master CD-ROM Disc: $52,500.00
4. 400 Copies of the CD-ROM Product: $30,000.00
5. Two magnetic 1600 bpi ASCII tape copies: $3,500.00
6. 400 Copies of the Software Documentation/User Manual: No charge
7. List Other: System software license: $24,000.00; Performance bond: No charge; Training (as required in RFP): No charge
TOTAL (See price quoted for 00001 on the Pricing Page): $112,750.00

Statewide Database Supplement
1. Creation of the Statewide Database: $3,750.00
2. Producing the Master CD-ROM Disc: $5,000.00
3. 400 Copies of the CD-ROM Product: $6,000.00
4. Two magnetic 1600 bpi ASCII tape copies: $250.00
TOTAL (See price quoted for 0003 on the Pricing Page): $15,000.00

AUTHORIZED SIGNATURE / DATE: February 7, 1992

Proposal from The Library Corporation
4. PROPOSED METHOD OF PERFORMANCE

The Library Corporation is submitting the following proposal in response to your Request for Proposal for a Missouri State Library CD-ROM Statewide Catalog. Throughout the years The Library Corporation (TLC) has developed the most user friendly, yet sophisticated, library automation tools available. Our automated modules include BiblioFile Cataloging, BiblioFile Circulation, BiblioFile Public Access Catalog, and BiblioFile Acquisitions. Other services include database processing, CD-ROM mastering, special software development, retrospective conversion, and more. Future developments include Serials Control and Interlibrary Loan. All software programs for TLC are written in the "C" programming language. All software programs operate in the MS-DOS operating environment on IBM compatible personal computers. The network software is Novell and also operates in the MS-DOS operating environment. BiblioFile utilizes the full MARC record structure format and will accept many other vendors' MARC records, providing the records are in MARC II communications format. BiblioFile Public Access Catalog (PAC), the proposed software for the Missouri State Library statewide catalog, gives your patrons and staff access to your collection through hundreds of access points. Ease of use is the key to any public access catalog, and there is no catalog that is more friendly to use than TLC PAC. In this section of our Proposal, The Library Corporation is addressing the conditions listed in item 4 on pages 23-26, section 8, PROPOSED METHOD OF PERFORMANCE, of the RFP.

1. Proposals will be evaluated based on the offeror's distinctive plan for performing the requirements of the RFP. Since the evaluators have already read the Scope of Work as described in the RFP, it is not necessary for the offeror to repeat the exact RFP language, or to present a paraphrased version, as an original idea for a technical approach.

The Library Corporation has read and understands the Scope of Work as described in the RFP. Our approach is presented in the following sections.

2. The offeror MUST submit a written narrative which demonstrates the method or manner in which the offeror proposes to satisfy the requirements of the Scope of Work. The language of the narrative should be straightforward and limited to facts, solutions to problems, and plans of proposed action.

The Library Corporation believes strongly that a successful CD-ROM union catalog of the type envisioned for the State Library requires a commitment from both parties to maximum advance planning during the pre-mastering stages and continued dialogue during subsequent use of the system. Meeting the schedule dictated by the completion dates will require a mutual adherence to the implementation plan developed prior to the signing of the contract. The State Library's primary responsibility is to get the data to The Library Corporation in an expeditious manner. The second responsibility of the State Library is to answer any questions and give approval of tests and samples within a reasonable time frame. TLC's responsibilities are to provide the State Library with a concise, easily understood picture of how the bibliographic processing will be carried out and the time frame in which the numerous elements of the entire job will be completed. For a sample Project Planning and Implementation schedule please see the following section.
As part of the project, and to provide tools to clarify technical discussions, TLC will provide a sample catalog to the State Library. Pre-mastering, mastering, and production of the CD-ROM union catalog will require approximately three weeks after approval of the sample.

Implementation

Receipt of the data is the critical factor upon which all successful implementation schedules are determined. In addition, the consistency of the data and the timeliness of review and revision will also impact the schedule. With these factors in mind we present the following target schedule.

I. Within thirty days of receipt of all data from the Missouri State Library, TLC will present to the State Library the full analysis of the data, which will include without limitation:

A. Provide for each input source, i.e. institution:
1. List of holding locations and collections occurring on input tape.
2. Count of number of occurrences of each holding location and collection. (A sketch of this kind of tally follows item III below.)
3. Count of number of records without a useable holding location or collection.

B. Provide for each library employing input stamps:
1. List of input stamps extracted for each holding location.
2. Count of number of occurrences of each input stamp extracted for each holding location.

II. Within thirty days of the State of Missouri's clarification and return of the analysis, TLC will merge all the data into a single file for the preparation of the CD and deliver to the State Library a sample of the merged file sufficient to permit the State of Missouri to verify that merging and call number generation have been performed satisfactorily, and an analysis of the merged file, which will include without limitation:

A. Sample of call numbers generated, to include:
1. For each holding collection and location, a sample of at least 25 call numbers generated, preferably distributed through the input file.
2. For each library using automatic stamps, a sample of at least 25 call numbers generated for each automatic stamp, preferably distributed through the input file.
3. For each library using input stamps, a sample of at least 25 call numbers generated for each input stamp, preferably distributed through the input file.
4. For each library using automatic oversize stamps, a sample of at least 25 call numbers generated for each automatic oversize location for each holding location and collection, preferably distributed through the input file.

B. Provide for each input source a total number of records without call number data.

C. Provide for each library employing automatic stamps:
1. List of automatic stamps generated.
2. Count of number of occurrences of each automatic stamp generated.

D. Provide for each library generating automatic oversize designations:
1. List of oversize designations.
2. Count of number of occurrences of each oversize category for each holding location and collection.

III. Within sixty days of the State of Missouri's acceptance of the merged file, TLC will deliver to the State Library the completed CD-ROM database, which shall comply fully with the specifications set forth in the State of Missouri RFP and the TLC Response.
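The counts promised in items I and II above are simple frequency tallies over the input records. The following C fragment is an illustrative sketch only, not TLC's analysis programs; the stream of location codes is hypothetical.

    #include <stdio.h>
    #include <string.h>

    #define MAXLOC 100

    int main(void)
    {
        /* Hypothetical holding-location codes read from input records. */
        const char *input[] = { "MAIN", "WEST", "MAIN", "MAIN", "EAST", "WEST" };
        const char *loc[MAXLOC];
        int count[MAXLOC], nloc = 0;

        for (size_t i = 0; i < sizeof input / sizeof input[0]; i++) {
            int j;
            for (j = 0; j < nloc; j++)
                if (strcmp(loc[j], input[i]) == 0) { count[j]++; break; }
            if (j == nloc && nloc < MAXLOC) {   /* first occurrence */
                loc[nloc] = input[i];
                count[nloc++] = 1;
            }
        }

        /* Report: one line per holding location, with its count. */
        for (int j = 0; j < nloc; j++)
            printf("%-6s %d\n", loc[j], count[j]);
        return 0;
    }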
Test Phase

As a standard, integral component of TLC's CD union catalog production procedures, the library receives and approves a sample catalog prior to mastering. The sample will be produced based on the specifications determined during the advance planning discussions between the State of Missouri and TLC. Everything possible is done in the planning and specification-setting stage to minimize the chance that the sample catalog will contain any surprises. The Library Corporation works closely with your staff to develop reasonable turnaround times to review and approve the sample. In addition, the review of samples is included in the implementation schedule presented above.

Bibliographic Data Processing

A detailed review of the steps taken in bibliographic data processing is included in the Project Planning and Implementation sections above. The first step in the bibliographic data processing is to "lay down" the data on the TLC implementation system. At this stage, preliminary analysis of the various files is made. Questions about the database, such as record count, missing fields, unreadable records, etc., are brought to the attention of the State of Missouri and resolved. During this period reports are generated on the database's overall record count, record structure, holdings symbols, call number structure, and any data variations. Results of the analysis are sent to the library. Based on the library's response to these review materials, TLC would then begin programmer customization of the database targeted toward production of a 10,000 record sample Union Catalog. The sample database, with the necessary evaluation hardware and software, would be forwarded to the library staff for review. Additional samples may be forwarded to the library based on any corrections cited as necessary by the library staff. Upon final approval of the PAC sample, TLC then begins the production run of the library's full database and mastering of the CD-ROM Union Catalog.

Authority Control

The Library Corporation understands the necessity for authority control is purely a local decision. Authority control processing is included as part of the pre-mastering data processing work done by The Library Corporation. The Library Corporation will run your records against the latest Library of Congress Name and Subject Authority files. "See" and "see also" references are created and all records are deblinded. Standard services include:

Provide cross-references: The Library Corporation provides valid cross references ("SEE" and "SEE ALSO") to the correct form of a name or subject.

Deblind entries: The database is "de-blinded", eliminating cross references from subjects or names which are not contained in a bibliographic record.

Additional authority control services are also based on the use of the Library of Congress Name and Subject authority files. The Library Corporation will run your database against the Library of Congress Name and Subject authority files and provide the following:

Flip headings: the authority data is flipped from the authority record to the bibliographic record if headings do not match.

Exception List: preparation of a "no match" list can be provided.

For additional information on the Authority Control Process, please refer to Appendix 4.

2.1 HARDWARE COMPATIBILITY

a. The offeror MUST list the hardware, including personal computers, microcomputers, hard disks, printers, etc., compatible with the proposed search software.

The standard hardware configuration recommended for BiblioFile PAC is an IBM PC 286 compatible computer with an internal Hitachi CD-ROM drive, 40 MB hard disk drive, graphics adaptor, floppy drive, generic keyboard, and monochrome monitor. Any standard CD-ROM drive that is compatible with and accepts Microsoft Extensions will operate BiblioFile PAC; however, sound will be supported only on Hitachi drives.
TLC strongly recommends and endorses the Hitachi CD-ROM drive. These drives are available from The Library Corporation. The Library Corporation recommends monochrome monitors; however, a color version of the software is under development. You may use the union catalog with a color monitor after turning off the graphics capability. From past experience we know of several compatible printers, such as the Star Thermal Silent Printer and the IBM Thermal printer. In addition, any standard dot matrix serial or parallel printer, such as the Okidata, Epson, and IBM Proprinter, is compatible with the Intelligent Catalog. Any Epson printer will work with the software, as will the Okidata printers in the Epson emulation mode. In fact, it has been our experience that printers that accept an 80 character carriage width and allow an ASCII dump are compatible with the software. We have also been successful with other printers that allow an Epson emulation. Laser printers in general are not recommended with the Intelligent Catalog.

b. The offeror MUST list the specific equipment needed for the proposed search software. In addition, the offeror MUST state the cost and cost of the maintenance of such equipment.

BiblioFile PAC is available through The Library Corporation as software or a turnkey system. The public access catalog is a complete turnkey system with your library's database on CD-ROM.

The Intelligent Catalog, including an IBM PC compatible computer (a PC AT 286 is standard, 386 is optional) with a built-in CD-ROM drive, 40 MB hard drive, graphics adaptor, and floppy drive, plus a monochrome monitor, color-coded keyboard, and audio capabilities supported by a telephone handset and headphones: $2,470
Full PAC support, including full hardware replacement, software support, updates of your library's catalog on CD-ROM, and unlimited access to TLC's toll-free support line: $595/year
Optional: PC AT 386 computer: $300
Handcrafted wooden cabinet, available in two heights and in a variety of finishes to suit your library's needs and decor: $500
Color-coded keyboard: $150
Hewlett Packard ThinkJet, includes tractor feed, cable, and first year support: $550
Annual support after first year: $165
Hitachi CD-ROM drive, includes interface card and cable: $680
Annual hardware support: $120

2.2 STATEWIDE DATABASE CREATION: Other than the Brodart MARC tapes, OCLC, UTLAS, BiblioFile, LaserQuest, and MULSP, the offeror SHALL identify any additional cataloging source(s) for the source data acceptable by the offeror for entry into the statewide database. The offeror SHALL provide such information on Exhibit A. For each type of cataloging source, the offeror SHALL indicate the amount of lead time needed to enter such into the statewide database in order to complete the statewide database within the time frame specified herein.

The Library Corporation has extensive experience reading and processing machine readable records in MARC II communications format, as well as several other formats. TLC has worked with data from many different vendors, including tapes from Auto-Graphics, Brodart, CLSI, DRA, EBCDIC, Geac, LSSI, Marcive, OCLC, RLIN, NOTIS and Utlas, as well as floppy diskettes from BiblioFile, LaserQuest and SuperCat, and even Circ Plus circulation databases. In addition, we frequently process records that are in IPF (internal processing format) standard, in MicroLIF, in NOTIS MARC, RLIN, CAN MARC, and other variations of USMARC. TLC can convert the IPF of records from CLSI and Follett.
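Every MARC II communications-format reader works from the same record skeleton: a 24-byte leader, a directory of 12-character entries (3-digit tag, 4-digit field length, 5-digit starting position), and then the field data. The C fragment below is an illustrative sketch of walking such a directory; it is not TLC's loader, and the directory contents are hypothetical.

    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical MARC directory for a record with three fields;
           the directory ends at the field terminator (hex 1E). */
        const char dir[] =
            "008004100000"   /* tag 008, 41 bytes, at offset 0  */
            "245002900041"   /* tag 245, 29 bytes, at offset 41 */
            "650002100070"   /* tag 650, 21 bytes, at offset 70 */
            "\x1e";

        for (const char *p = dir; *p != '\x1e'; p += 12) {
            int tag, len, start;
            sscanf(p, "%3d%4d%5d", &tag, &len, &start);
            printf("tag %03d: %d bytes at offset %d\n", tag, len, start);
        }
        return 0;
    }

Reading a vendor's tape then reduces to locating each record (the leader's first five digits give the record length), walking its directory, and mapping the fields into the target structure.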
We are also capable of tailoring raw non-MARC and pseudo-MARC records from several vendors and converting them into enriched MARC records. The Library Corporation requires no additional lead time to enter these sources into the statewide database.

2.3 AUTHORITY CONTROL: The offeror MUST explain how new or changed subject headings and cross references will be processed.

The Library Corporation understands the necessity for authority control is purely a local decision. Authority control processing is included as part of the pre-mastering data processing work done by The Library Corporation. The Library Corporation will run your records against the latest Library of Congress Name and Subject Authority files. "See" and "see also" references are created and all records are deblinded. Standard services include:

Provide cross-references: The Library Corporation provides valid cross references ("SEE" and "SEE ALSO") to the correct form of a name or subject.

Deblind entries: The database is "de-blinded", eliminating cross references from subjects or names which are not contained in a bibliographic record.

Additional authority control services are also based on the use of the Library of Congress Name and Subject authority files. The Library Corporation will run your database against the Library of Congress Name and Subject authority files and provide the following:

Flip headings: the authority data is flipped from the authority record to the bibliographic record if headings do not match.

Exception Listing: preparation of a "no match" list can be provided.

For additional information on the Authority Control Process, please refer to Appendix 4.

2.4 CD-ROM DISC CREATION

a. The offeror MUST indicate whether the CD-ROM disc conforms to High Sierra Group (ISO 9660) standards.

The Library Corporation's CD-ROM union catalog projects in general, and the Union Catalog produced for the Missouri State Library in particular, will adhere to the High Sierra Group (ISO 9660) standards.

b. The offeror MUST determine whether multiple CD-ROM discs are needed for the statewide database. If the statewide database would require two or more CD-ROM discs, the offeror SHALL explain the reason for such and indicate how the proposed search software will handle multiple CD-ROM discs in searching.

Multiple CD-ROM discs will be needed for the statewide database. The Library Corporation currently supports large databases such as the 1.6 million record database at the Rochester Regional Library Council and a 1.3 million record experimental project done with the New York Public Library. This database currently resides on three CD-ROM discs. The number of discs required for the Missouri project can only be determined upon examination of the database. For additional information, please see the response to "c." below. Each CD-ROM disc will have its own index to the items held on each individual CD. In addition, TLC will provide a separate CD-ROM disc with an index of the holdings of all the CD-ROM discs, so that a user can enter a title or author to determine on which CD-ROM disc the full record is held. Another option would be to produce multiple disc systems, which would require more than one CD-ROM drive but would not require the separate author/title search.

c. The offeror MUST indicate the approximate number of records which will fit on one CD-ROM disc.

Typically, each CD-ROM disc produced by The Library Corporation includes up to 600,000 records, depending on the size of each record and the number of holdings.
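As a rough, hypothetical check (not a figure from the proposal itself): at up to 600,000 records per disc, the 3.5 million bibliographic records specified in the RFP would come to 3,500,000 / 600,000 = 5.8, i.e. at least six discs, before allowing for holdings density and index size, which is consistent with the vendor's statement that multiple discs will be needed.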
An examination of the Missouri database by TLC Technical Services would be necessary to determine the number of discs required for the project.

2.5 SEARCH SOFTWARE

a. The offeror MUST provide a copy of the standard maintenance agreement covering the performance of the proposed search software.

The Library Corporation does not require a signed contract to do business with a library; however, TLC is happy to review a contract proposed by your library. We feel this policy is in your best interest and allows you to "make the rules" by which we serve your library. We are happy to work with you to create a mutually agreeable contract if you so desire. The same is true for ongoing maintenance of hardware and software. The prices found in the cost section include the first year's maintenance. After the first year you have the choice of renewing total system support on an annual basis. Simply stated, The Library Corporation provides absolutely all support of the system software, hardware, updates, enhancements, and unlimited access to our toll-free hotline for one price.

b. The offeror MUST provide written details regarding anticipated upgrades to the proposed search software including features, projected delivery date, and procedures for updating the current system.

Information is becoming the world's most valuable commodity. The Library Corporation is committed to providing librarians and their patrons with the tools and support to gain fast, easy access to the world's store of knowledge. To achieve this goal, TLC unleashes creative minds to exploit technology to the limit and to provide unparalleled service to librarians. Library automation does not stand still. Many new advances are happening every day and the future of library automation is bright. Every technological advance is being developed for only one reason: to help answer the needs of your library. In this advancement there will be companies that survive, companies that thrive, and companies that die. The Library Corporation will be one of the companies that thrive. We have made sure of this by dedicating one third of our staff to research and development. This group is made up of some of the world's most intelligent programmers. It is this same staff that has repeatedly introduced new innovations that have become the standards by which other systems are measured. All BiblioFile systems are provided with the appropriate system documentation. This documentation is thorough and provides step-by-step instructions in guiding librarians through the software. Documentation and release notes are provided to all users when changes or modifications to the software are made.

c. Other than publication date, format of material by type, and language, the offeror SHALL specify any other qualifiers which the user can use to limit searches.

The Limit Search function available in the Intelligent Catalog helps patrons narrow searches in Find Anything or View Catalog. Limit searches can be performed by catalog entry type and branch library locations, as well as publication date, material type (media type), and language. Catalog entry type searches contain searches such as authors only or subjects only, or a patron can specify any combination of entry types. As an alternative, View Catalog is designed for patrons who know specifically what they are looking for. From the first screen of View Catalog a patron can narrow a search to a particular index or a combination of indexes.
This mode of searching assumes the user has had some experience with database searching. Sophisticated patrons can choose not to follow the FIND ANYTHING search route, and go directly to a specific search argument: author, title, subject, or any combination of the three, and limit searches by language, media type, year range and library. You can qualify a search by publication year or range of years. The following options are available (a parsing sketch appears after item e below):

ALL          all years
1978         only 1978
1973-1978    1973 through 1978
1975-        1975 and after
-1982        1982 and before

With branch library locations, particular library branches or groups of libraries in your system can be limited. Each individual library can predetermine which branch or libraries patrons can search on a particular catalog station. When limitations are set, the occurrence list in a search will show which items can be found in the selected branches. The library may use the powerful scoping feature to limit by individual library, by type of library, by library system, and by geographic region. If the librarian allows it in the configuration, patrons can use their own Limit Search definitions along with branch scoping. For example, if a patron always goes to Branch "A," the patron can Limit Search to Branch "A" and it will always be included in searches, no matter which branch scoping level the patron may choose.

d. The offeror MUST provide a sample of error messages used in the system.

Error messages appear in BiblioFile PAC when an inappropriate key is pressed. For example, while in the Find Anything mode, the F10 key is pressed. The PAC will prompt you with "You do not have any items saved to print. Please press ESC to continue". This error message occurs when you do not have any items saved for printing. Context-sensitive and self-explanatory help messages are always available in BiblioFile PAC. BiblioFile PAC has 107 help screens to date. As new functions are added, the appropriate help messages are included. Upon request we will provide a printout of these help screens. BiblioFile PAC help screens are context sensitive and can be locally edited by the library staff. Help screens are prompted by pressing the help key or are automatically displayed after a pre-set number of seconds of keyboard inactivity. This "time out" mechanism is locally configurable. Examples of help screens appear throughout the BiblioFile PAC handbook. Each BiblioFile PAC screen also contains a second level of help at the bottom of each screen. This level of help is displayed in reverse highlighted video and is also context sensitive.

e. The offeror MUST provide information on the scoping capabilities of the system, and advise whether customized scoping for each library is available.

Each library may use the powerful scoping feature, defining up to 99 different levels, to limit by individual library, by type of library, by library system, and by geographic region. If the librarian allows it in the configuration, patrons can use their own Limit Search definitions along with branch scoping. For example, if a patron always goes to Branch "A," the patron can Limit Search to Branch "A" and it will always be included in searches, no matter which branch scoping level the patron may choose. To begin branch scoping, the patron simply presses a function key.
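The year qualifiers listed above map naturally onto an inclusive [low, high] range. The following C fragment is an illustrative sketch only, not the BiblioFile PAC parser:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    /* Interpret a publication-year qualifier as a range [lo, hi]. */
    static void parse_years(const char *q, int *lo, int *hi)
    {
        const char *dash = strchr(q, '-');
        *lo = 0;
        *hi = 9999;
        if (strcmp(q, "ALL") == 0) return;             /* "ALL"       */
        if (dash == NULL)      *lo = *hi = atoi(q);    /* "1978"      */
        else if (dash == q)    *hi = atoi(q + 1);      /* "-1982"     */
        else if (dash[1] == 0) *lo = atoi(q);          /* "1975-"     */
        else { *lo = atoi(q); *hi = atoi(dash + 1); }  /* "1973-1978" */
    }

    int main(void)
    {
        const char *tests[] = { "ALL", "1978", "1973-1978", "1975-", "-1982" };
        for (int i = 0; i < 5; i++) {
            int lo, hi;
            parse_years(tests[i], &lo, &hi);
            printf("%-10s -> %d..%d\n", tests[i], lo, hi);
        }
        return 0;
    }

Once the range is known, the qualifier simply filters matched records on the publication year carried in each record.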
f. The offeror MUST provide a list of stopwords.

The following stopwords are currently used during the BiblioFile PAC indexing process and are not searchable by the user: AND, BUT, FOR, FROM, TO, THE, WITH. In addition to this list, all one and two-letter words are stopwords except the following: CD, DR, ED, FE, GO, I, ID, II, IV, IX, ME, OF, ST, TV, U2, US, V, VD, VI, X, XI, XX. TLC's implementation staff will work with the Missouri State Library project administrator on adding any additional stopwords to this list as required. (A filtering sketch in C appears after section 2.7 below.)

g. The offeror MUST indicate any terms which are not indexed and searchable; for example, two letter words at the beginning of a title.

Please see Appendix 5 for a list of standard MARC fields searchable with the PAC software. The Library Corporation is happy to discuss with the State Library any other MARC fields that it wishes to be indexed.

2.6 INTERFACE TO OTHER FUNCTIONS

a. The offeror MUST indicate whether the user can access the printer in order to print screens, lists, and other information from the CD-ROM disc, screen, hard disk, and floppy disk.

BiblioFile PAC software allows each library to set up print limitations. In the configuration option of each IC, the librarian selects the screens which will allow users to print to floppy disk. The Intelligent Catalog allows users to print the following screens:
- Maps
- Catalog heading screen: hit lists in Find Anything, View Catalog, and non-fiction Get Advice
- Multiple title screen
- Shelflist screen; single item-level display
- Bulletin Board
- User notes
- User log
- MARC records to diskette

b. In addition to the MARC format, the offeror MUST specify what other formats, if any, are available to copy records from the CD-ROM disc onto hard or floppy disks.

All fields of a MARC record can be displayed with BiblioFile PAC. Several ways to display and print the resulting list are available: full MARC record, full labelled display, full card display, brief labelled, and brief card. At any time during searches, the librarian or patron can change the display format or print format of catalog entries. The librarian selects the default display and print formats in the configuration.

c. The offeror MUST list and provide information on the other CD-ROM products, if any, with which the search software is compatible.

An exciting option, soon to be available, is a merged periodical index/monograph CD. This merged database may be searched with the same powerful Intelligent Catalog search techniques and will result in "hit" lists of both monographs and periodical articles. The cost of this merged database will depend on the information supplied by the index vendor, such as the number of years required and the type of index. With the addition of the periodical resources, the database will expand accordingly. We currently have prototype arrangements with various suppliers and will be happy to work with the library in this area. The library will be responsible for negotiating a separate arrangement with the index vendor for a tape subscription, with the tapes being sent to The Library Corporation for mastering onto the CD.

2.7 TRAINING AND DOCUMENTATION

a. The offeror MUST submit one copy of the search software/documentation user manual.

All BiblioFile systems are provided with the appropriate system documentation. This documentation is thorough and provides step-by-step instructions in guiding librarians through the software. Documentation and release notes are provided to all users when changes or modifications to the software are made. A PAC handbook is included as part of this Proposal.

b. The offeror MUST specify the language in which the search software is written.

All software programs for The Library Corporation are written in the "C" programming language.
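Since the vendor names "C" as the implementation language, the stopword rules quoted in item f above can be sketched in it directly. This fragment is illustrative only, not TLC's indexing code; it assumes words have already been uppercased during indexing.

    #include <stdio.h>
    #include <string.h>

    static const char *stops[] = { "AND", "BUT", "FOR", "FROM",
                                   "TO", "THE", "WITH" };
    static const char *keep2[] = { "CD","DR","ED","FE","GO","I","ID","II",
                                   "IV","IX","ME","OF","ST","TV","U2","US",
                                   "V","VD","VI","X","XI","XX" };

    static int in_list(const char *w, const char **list, int n)
    {
        for (int i = 0; i < n; i++)
            if (strcmp(w, list[i]) == 0) return 1;
        return 0;
    }

    /* Return 1 if an uppercased word should be indexed. */
    static int indexable(const char *w)
    {
        if (in_list(w, stops, 7)) return 0;            /* listed stopword */
        if (strlen(w) <= 2 && !in_list(w, keep2, 22))  /* short word rule */
            return 0;
        return 1;
    }

    int main(void)
    {
        const char *title[] = { "THE", "HISTORY", "OF", "US", "AT", "WAR" };
        for (int i = 0; i < 6; i++)
            printf("%-8s %s\n", title[i],
                   indexable(title[i]) ? "index" : "stop");
        return 0;
    }

Here "OF" and "US" survive because they are on the exception list, while "THE" (a stopword) and "AT" (a two-letter word not on the list) are dropped.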
2.8 STATEWIDE DATABASE MAINTENANCE: The offeror SHALL describe the processing sequence for adding, deleting, and replacing records in the statewide database.

The Library Corporation would first work with the individual library to determine its cataloging practices. We would then take each individual archival tape and treat it according to the library's operational specifications. Once this step is completed, we will match it against the Missouri state database by the points specified by the client; for example, 25 characters of 245a, 25 characters of 245b, sections of tag 260c, tag 260b. We can match against bibliographic level or bibliographic type to make sure they haven't used a monographic record for AV, etc. We will operationally merge records on the characters that are present in the data. We can utilize any information that is present for matching purposes. (A match-key sketch follows this section.)

The IC Edit utility enables you to download and edit MARC records you find on your Intelligent Catalog CD-ROM database. This utility will enable you to transmit changes, such as holdings code information, to The Library Corporation for inclusion in your next CD-ROM database. It would be installed on your Public Access Catalog workstation. BiblioFile Cataloging is the recommended method of keeping your database up to date. The Library Corporation has extensive experience reading and processing machine readable records in MARC II communications format, as well as in several other formats. TLC has worked with data from many different vendors. Updates may be provided in magnetic form from a variety of sources such as those listed in Section 4.2.2 of our Proposal.
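Matching on fixed slices of fields, as described above, comes down to building a normalized key per record and comparing keys. The C fragment below is an illustrative sketch only, not TLC's merge programs; the subfield values and slice lengths are hypothetical examples of the scheme the text describes.

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    /* Append up to n characters of src to key, keeping only
       alphanumerics and folding to upper case. */
    static void put(char *key, const char *src, size_t n)
    {
        size_t len = strlen(key);
        for (size_t i = 0; src[i] != '\0' && i < n; i++)
            if (isalnum((unsigned char)src[i]))
                key[len++] = (char)toupper((unsigned char)src[i]);
        key[len] = '\0';
    }

    int main(void)
    {
        char key[128] = "";
        /* Hypothetical subfields from one bibliographic record. */
        put(key, "Gone with the wind /", 25);   /* 245a, first 25 chars */
        put(key, "", 25);                       /* 245b (empty here)    */
        put(key, "Macmillan,", 10);             /* section of tag 260b  */
        put(key, "c1936.", 6);                  /* section of tag 260c  */
        printf("match key: %s\n", key);
        return 0;
    }

Two records whose keys agree (and whose bibliographic level and type also agree) would be treated as candidates for merging; everything else passes through as a new master record.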
3. In addition, the offeror should provide the following information:

3.1 SEARCH SOFTWARE

a. The offeror should indicate which parts of the system are flexible for individual library control.

You can customize BiblioFile Public Access Catalogs to meet the needs of your patrons through configuration options. Configuration options can be reached only by a special key combination and password. It is recommended that the password be changed often to prevent tampering and to protect your stations. Following are some features and functions that are configurable:
- Time intervals for automatic display of help and catalog restart
- Limit searches to particular branch libraries
- Change tags and labels in screen displays
- Turn compact disc sound ON or OFF
- Change librarian password
- Set the library name which appears on printed lists of items
- Change format of the display of multi-branch, multi-call number locations
- Define printing options available on each station
- Set Circulation link parameters

The configuration also offers utility functions to help you maintain and use your catalogs. You can maintain the library event calendar, transfer configuration changes, edit help screens, format floppy diskettes, set branch scoping levels, and more.

b. The offeror should specify the information provided on the brief screen display; for example, author, title, publisher, date, edition, and system ID number.

Any field within the MARC record can be displayed at any point within a display format. This is a configurable option controlled locally by the library staff. It is our experience that the local call number is usually displayed at the bottom of a record with a blank line between it and the other data. This allows the call number and branch location to stand out within a record. At any time during searches, the librarian or patron can change the display format or print format of catalog entries. The librarian selects the default display and print formats in the configuration. Options for display formats are: full labelled format, brief labelled format, brief card format, card image format, and MARC record format. The Change Display feature allows the patron to customize the record display during a search by pressing a single key. The two brief screen formats supported by the BiblioFile PAC software are brief card format and brief labelled format. A description of each follows:

Example of brief labelled format:
Title: Gone with the wind by Margaret Mitchell.
Publisher: Garden City, N.Y. : International Collectors Library, c1936.
Collation: 689 p. 22 cm.
Location: SOUTH REGIONAL: 813.5 M6826

Example of brief card display:
Gone with the wind by Margaret Mitchell. Garden City, N.Y. : International Collectors Library, c1936.
SOUTH REGIONAL: 813.5 M6826

c. The offeror should list the combination searches the search software can support.

The View Catalog search mode of BiblioFile PAC supports the following field combinations: Search all entries, Subjects only, Subjects and Titles, Subjects and Authors, Titles only, Titles & Authors, and Authors only.

d. The offeror should specify whether the number of matches is specified in the event of multiple records.

BiblioFile PAC provides the number of matches in the event of multiple records.

e. If proposed, the offeror should describe the browsing capabilities.

Patrons can "browse the shelves" before going to the stacks. Press the right and left arrow keys to see catalog entries for books shelved next to the one selected. The catalog displays items with the next sequential call number, by Dewey or LC classification number, depending on the scheme used in your library. These searches are displayed and can be printed in any of the following formats: brief labelled, full labelled, brief card, card image, or MARC record.

f. The offeror should specify whether the user has the capability to modify a previous search.

Any time during a search, the patron can press the "UNDO" key to return to a previous search. The user may then modify the search. The catalog offers users more help in finding additional subjects and authors in a nonfiction search through the Get Advice function. With the Get Advice feature, patrons can ask for alternative search paths. This help is available to the user who has saved one or more items, as the suggestions are based on previous searching activity.

g. The offeror should provide information on the process for adding a searchable field to a future database project.

All fields of a MARC record can be searched with BiblioFile PAC, as listed in Appendix 5. The Library Corporation would be happy to discuss with the library any other MARC fields to be indexed.

h. The offeror should provide information on the amount of CD-ROM disc space required to store and index the following optional searchable fields and the impact on search time and response time of the added fields: Publisher (260 $b), Contents note (505 $a), GPO item number (074 $a).

The contents note field (505 $a) and GPO item number (074 $a) are currently supported by BiblioFile PAC. The publisher (260 $b) field is not currently supported. An analysis of the database by TLC Technical Services would be necessary to determine the impact on search time of the added fields.
Factors in this analysis include the number of fields to be indexed and the holdings information.

i. The offeror should specify what additional searching features are available.

Again, all fields of a MARC record can be searched with BiblioFile PAC. Please refer to Appendix 5 for a list of fields indexed in BiblioFile PAC. BibCat combines the best features of the Intelligent Catalog's Find Anything and View Catalog searching modes into one smooth searching function. BibCat also offers a subject approach, like Browse Topics. On-screen prompts help the inexperienced user get started, and serve as a reminder to more experienced patrons. Many users prefer the combined dictionary and all-word searching. BibCat goes one step beyond Find Anything. If nothing is found in a search, the catalog presents a list of entries nearest your search argument.

Two main search modes, Find Anything and View Catalog, are markedly different in their sophistication. The search mode that sets TLC apart is the Intelligent Catalog's Find Anything. Find Anything is a keyword search mode which assumes the patron has never used a computerized catalog. Prior to the beginning of a search the screen asks "What would you like to find in the catalog?" As soon as the patron begins to type, a dictionary of words appears on the right hand side of the screen. This dictionary is designed to help patrons with spelling. It begins a search across all indexes (unless the patron has specified a particular index) and alerts the patron as to the number of "hits" it finds. The patron is then instructed to press Enter to initiate the search. As an alternative, View Catalog is designed for patrons who know specifically what they are looking for. From the first screen of View Catalog a patron can narrow a search to a particular index or a combination of indexes. This mode of searching assumes the user has had some experience with database searching. Sophisticated patrons can choose not to follow the Find Anything search route, and go directly to a specific search argument: author, title, subject, or any combination of the three, and limit searches by language, media type, year range and library.

Searches are easy to retrace. The catalog keeps track of search paths and, with a single keystroke (the UNDO function), permits the patron to return to the previous screen. Patrons can save individual items to review or print later, as well as save items from a multiple title list. Each time a patron saves an item, the screen displays the total number of items saved. Up to 200 items can be saved for later printing.

Patrons can Browse Topics and go directly to subject areas of the catalog, without first typing a word search. The initial screen presents a list of general subjects, based on the broad breakdowns in the LC or Dewey classification. Patrons continue to select further subdivisions until the shelf level is reached. Then the patron can browse other books right or left on the shelf, just as in a word or phrase search.

BiblioFile PAC software fully supports Boolean searching. In the Find Anything search mode "and", "or" and "not" arguments are supported. In fact, when you enter more than one word, and the words do not appear as a phrase, the PAC software performs an automatic "and" search. The View Catalog search mode allows the following field combinations: Search all entries, Subjects only, Subjects and Titles, Subjects and Authors, Titles only, Titles & Authors, and Authors only.
j. The offeror should indicate what can be done with search results once they are obtained; for example, download, print in bibliographies, etc.

Patrons can print the catalog entry for any item by pressing the Print Items key. A menu offers these choices: Print only the current item, Arrange all of the saved items before printing, Print the items in the order in which they were saved. Patrons can produce a sorted bibliography of catalog selections by use of a function key. The following options are available: By library shelf number, By date of publication, Alphabetically by Author/Title, Alphabetically by Title. Patrons can also produce a sorted bibliography of catalog selections; several sorting options are available. The format of the printed items is controlled by the Change Display option. It can vary from an abbreviated entry to a full MARC record, depending on what the patron needs. User print privileges are configurable. The librarian can turn off any or all of the printing capabilities on any station. To help patrons keep track of a search's progress or review words already searched, the Intelligent Catalog automatically saves a log of search paths. A patron can review the log anytime by pressing a function key. A patron's log can accumulate up to 200 lines of information. The patron may print the log by pressing a function key. Each time a patron views a record in the catalog, the shelf status of the item is automatically displayed if the library has BiblioFile Circulation linked with their PAC. The Library Corporation will also be happy to discuss linking with other vendors' circulation systems to provide shelf status.

3.2 INTERFACE TO OTHER FUNCTIONS

a. The offeror should explain how the process of exiting the search software and returning to DOS will be done.

The process of exiting BiblioFile PAC and returning to DOS is accomplished through configuration options. These configuration options are reached by a special key combination and password, to prevent tampering with the catalog. The password can be changed by the librarian as often as desired to protect the stations. Once the correct password has been entered, the Master Menu is displayed. From this point, one can exit to DOS by selecting the option from the Menu.

b. If the system will allow the user to print catalog cards from the CD-ROM disc, the offeror should explain how this printing is accomplished.

The Library Corporation presents two methods of using data on the CD for local card production. Each approach requires the use of BiblioFile Cataloging, which provides the flexibility of printing cards according to the library's specifications. First, MARC records from the BiblioFile PAC station are saved to a floppy diskette and then imported into BiblioFile Cataloging for editing and printing. With the second approach, The Library Corporation provides each library with a local disc, as well as the union CD. This local disc is a duplicate of the union database and has been reindexed for use with BiblioFile Cataloging software. Immediate editing for card printing capabilities is made available via this local disc. Please refer to the attached brochure for a description and pricing of BiblioFile Cataloging. Also enclosed (with the brochure) is a blue flier describing a special subscription offer for BiblioFile Cataloging.
RFP NO. 3201148, Page 28 of 33. The Library Corporation.

EXHIBIT A: PRICING PAGE

The offeror shall provide the following information for services provided in accordance with the terms and conditions specified herein. All costs associated with providing the required services shall be included in the following prices.

A. Annual Edition of the Statewide Database: The offeror shall provide a total price for the annual edition of the statewide database. The total price shall include all costs for the creation of the statewide database based on 3.5 million bibliographic records, authority control, producing the master CD-ROM discs, 400 copies of the CD-ROM product, two magnetic 1600 bpi ASCII tape copies, providing 400 copies of the software documentation/user manual, software license, training, etc. The offeror shall provide a price for each additional copy of the CD-ROM product and software documentation/user manual in excess of 400 copies. The offeror shall also provide a price per bibliographic record in excess of 3.5 million bibliographic records. The offeror shall provide firm, fixed prices for the Original Contract Period and maximum prices for each extension period.

Annual Edition of the Statewide Database:
a. Original Contract Period: $119,000 total
b. First Extension Period: $124,000 total
c. Second Extension Period: $129,000 total
d. Third Extension Period: $134,000 total

CD-ROM Product and Software Documentation/User Manual in excess of 400 copies:
a. Original Contract Period: $210 per copy
b. First Extension Period: $[illegible] per copy
c. Second Extension Period: $[illegible] per copy
d. Third Extension Period: $240 per copy

Bibliographic Record in excess of 3.5 million bibliographic records:
a. Original Contract Period: $0 per record
b. First Extension Period: $0 per record
c. Second Extension Period: $[illegible] per record
d. Third Extension Period: $[illegible] per record

B. Statewide Database Supplement: The offeror shall provide a total price for the statewide database supplement. The total price shall include all costs for the creation of the statewide database supplement, producing the master CD-ROM discs, 400 copies of the CD-ROM product, two magnetic 1600 bpi ASCII tape copies, etc. The offeror shall also provide a price for each copy of the CD-ROM product provided in excess of 400 copies. The offeror shall provide firm, fixed prices for the Original Contract Period and maximum prices for each extension period.

Statewide Database Supplement:
a. Original Contract Period: $80,000 total
b. First Extension Period: $84,000 total
c. Second Extension Period: $88,000 total
d. Third Extension Period: $[illegible] total

[AUTHORIZED SIGNATURE] January 30, 1992 (DATE)

RFP NO. 3201148, Page 29 of 33. The Library Corporation.

EXHIBIT A: PRICING PAGE, CONTINUED

CD-ROM Product in excess of 400 copies:
a. Original Contract Period: $200 per copy
b. First Extension Period: $210 per copy
c. Second Extension Period: $[illegible] per copy
d. Third Extension Period: $230 per copy

C. Customized Changes: The offeror shall provide a price per hour for providing customized changes in the search software pursuant to the state agency's request. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: $100 per hour
b. First Extension Period: $100 per hour
c. Second Extension Period: $[illegible] per hour
d. Third Extension Period: $[illegible] per hour
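The Exhibit A price structure composes a flat base price with per-unit charges above the included quantities. A minimal Python sketch using the legible Original Contract Period figures for the annual edition ($119,000 base including 400 copies and 3.5 million records, $210 per additional copy, $0 per additional record); the function name is hypothetical and the sketch is illustrative, not a quotation of the contract.

    # Sketch of how the Exhibit A annual-edition price composes.
    def annual_edition_price(copies, records,
                             base=119_000,          # flat price; includes 400 copies
                             per_extra_copy=210,    # Original Contract Period rate
                             per_extra_record=0.0): # $0 per record over 3.5 million
        extra_copies = max(0, copies - 400)
        extra_records = max(0, records - 3_500_000)
        return base + extra_copies * per_extra_copy + extra_records * per_extra_record

    # 20 extra copies: 119,000 + 20 * 210 = 123,200
    print(annual_edition_price(copies=420, records=3_600_000))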
D. Spinoff Product: The offeror shall provide a price per record for the creation of a spinoff product on a CD-ROM disc and a 9-track tape. The offeror shall provide a price per CD-ROM disc and per 9-track tape. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.

CD-ROM Disc (minimum $250; maximum $2,500, dependent on the number of discs):
a. Original Contract Period: $0.01 per record; $[illegible] per disc
b. First Extension Period: same as a. per record; $[illegible] per disc
c. Second Extension Period: same as a. per record; $[illegible] per disc
d. Third Extension Period: same as a. per record; same per disc

9 Track Tape (minimum $250; per-record and per-tape rates partly illegible):
a. Original Contract Period: $[illegible] per record
b. First Extension Period: same as a. per record
c. Second Extension Period: same as a. per record
d. Third Extension Period: same as a. per record

E. Shelflist: If proposed, the offeror must provide a price per record for the creation of a machine-readable catalog record from a printed shelflist in USMARC format. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: $[illegible] per record
b. First Extension Period: $.55 per record
c. Second Extension Period: $.60 per record
d. Third Extension Period: $.65 per record

F. The offeror must provide a total price per library for any additional hardware needed to operate the search software. The total price shall include the cost of the equipment and installation. The offeror shall provide a firm, fixed price for the original contract period and a maximum price for each extension period.
a. Original Contract Period: N/A
b. First Extension Period: N/A
c. Second Extension Period: N/A
d. Third Extension Period: N/A

The firm, fixed prices stated above are provided in accordance with the terms and conditions of RFP NO. 3201148.

The Library Corporation

EXHIBIT B: PRICE ANALYSIS

Annual Edition of the Statewide Database:
1. Creation of the Statewide Database ($0.01/record): $35,000
2. Authority Control: $0
3. Producing the Master CD-ROM Disc: $0
4. 400 Copies of the CD-ROM Product ($200 each): $80,000
5. Two magnetic 1600 bpi ASCII tape copies: $0
6. 400 Copies of the Software Documentation/User Manual ($10 each): $4,000
7. Other - Training: not required, but if desired by the Missouri State Library it is available at $300 per day plus expenses
TOTAL (see price quoted for item 0001 on the Pricing Page): $119,000

Statewide Database Supplement:
1. Creation of the Statewide Database: $0
2. Producing the Master CD-ROM Disc: $0
3. 400 Copies of the CD-ROM Product ($200 each): $80,000
4. Two magnetic 1600 bpi ASCII tape copies: $0
TOTAL (see price quoted for item 0003 on the Pricing Page): $80,000

[AUTHORIZED SIGNATURE] January 30, 1992

RFP NO. 3201148, Page 30 of 33

APPENDIX F: EXAMPLE OF A COST ANALYSIS OF STATEWIDE DATABASES BY FORMAT - MICROFICHE, CD-ROM, ON-LINE, AND OCLC

Cost of State Database on Microfiche

[Table, heavily garbled in the source. Columns: Unit Cost; OCLC Users (number of units, cost); MITINET/WISCAT Users (number of units, cost); Total Cost for both. Line items cover one-time equipment (microfiche readers at $150, none purchased), annual equipment maintenance (OCLC M300 terminals, modems and leased lines, system service fees, dial access), production of records (current cataloging at $1.39 prime time and $1.17 non-prime time per OCLC transaction, MITINET MARC fiche at $90 per year, retrospective conversion, Microcon at $0.34 per record), database maintenance (unique MARC records added at $0.07 each), software development, administration, training, supplies, and microfiche production (masters at $0.032 per title, $86,400; copies at $554 per set of fiche, $277,000 for 500 sets). Legible grand totals: one-time equipment $0; annual OCLC-user costs $1,053,180; annual Brodart/other costs $73,200 (OCLC users) and $474,819 (MITINET/WISCAT users); state microfiche project total $1,126,380 plus $474,819 = $1,601,199. A "Products for Libraries" section lists per-title prices for tape, microfiche, microfilm, and CD-ROM subsets, and WISCAT tapeload charges ($0.12 per title for initial OCLC-number records, $0.20 per title for initial unique records).]

Notes for Costs of Database on Microfiche

All costs are based on actual costs of the current project.

Participation: OCLC users include 100 libraries which currently have online access. Processing center users are listed under WISCAT/MITINET for all costs besides cataloging because this is the current format used to supply them with bibliographic and holdings information. This distinction is not important for this scenario, but this assignment is consistent with that used in the other scenarios. There are 400 WISCAT/MITINET users.

Equipment: A microfiche reader is needed to use the WISCAT microfiche. It is assumed that all current users of WISCAT have this equipment.

Production of records: OCLC libraries contribute records through use of the OCLC system, and these records are added to the database by processing OCLC archival tapes.
Other libraries add records through use of MITINET or through tapeload of records from other automated systems. Costs of adding both OCLC and MITINET transactions to the database are listed under database maintenance. Production of records using either method also involves labor costs which are not listed here. The cost of computer transactions and other items are listed in this budget. OCLC costs are based on actual OCLC usage for 1985-86, and prices are for 1986-87. While OCLC costs are usually paid for locally, the costs are included here to show all costs associated with the project. The MARC fiche allow MITINET libraries to use bibliographic records in the Library of Congress MARC file which are not on the WISCAT database. Currently libraries share use of the MARC fiche; therefore, fiche are only bought for half of the libraries participating. Costs for retrospective conversion including labor have been kept as a result of tracking LSCA projects. Use of OCLC has averaged $.66 per transaction and use of MITINET has averaged $.36 per transaction when labor is taken into account.

Database maintenance: While bibliographic records can be added from a variety of sources, only the addition of unique records incurs a cost. Once a record is in the database, there is no charge to add holdings from another library or to make changes to that record.

Local products: Local products can be created after records are extracted from the database. Extraction costs are not charged if local CD-ROM or COM products are produced. There is an extraction charge for tape products if the database(s) to be extracted do not equal 100,000 titles. Support for smaller database extraction is included under MITINET since these libraries are most likely to have small databases. Both OCLC and MITINET libraries can make extractions. Unit costs for products vary depending on the number of titles included. For example, a small library with 2,500 titles will pay $.07 per title for a microfiche or CD-ROM master and $.00053 per title for microfiche copies. A large library with over 500,000 titles will pay $.04 per title for a master and $.000031 per title for microfiche copies. CD-ROM copies are based on disc price ($15 per disc) rather than title costs.
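Since total conversion cost scales linearly with transactions, the two per-transaction averages quoted in the notes above can be compared directly. A minimal Python sketch; the transaction count is an illustrative input, not a figure from the analysis.

    # Sketch comparing retrospective-conversion costs using the quoted
    # per-transaction averages (labor included).
    RATES = {"OCLC": 0.66, "MITINET": 0.36}  # dollars per transaction

    def conversion_cost(transactions, method):
        return transactions * RATES[method]

    for method in RATES:
        # e.g. converting 100,000 titles
        print(method, f"${conversion_cost(100_000, method):,.2f}")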
Notes on Purpose

* Development of an interlibrary loan tool for verification of specific titles and library holdings. The microfiche is a very useful tool for interlibrary loan. The database includes records from a variety of types and sizes of libraries. Full bibliographic records are available to aid in identification of different editions and formats. Information on nearly 3 million titles and 10 million holdings is available. The weakness of this format for interlibrary loan is that the material cannot be kept up-to-date instantaneously. Normally a database of this size would not be entirely updated more than annually. It would be possible to produce supplements.

* Development of a reference tool for verifying available information on specific subjects and verifying complex bibliographic citations. The microfiche can be used for this purpose. Subject access is available, since a separate subject section has been created.

* Development of a database which could be used by libraries to create machine-readable bibliographic records for use in local and area level automation projects. The database from which the microfiche is created allows for records to be contributed from a variety of sources. Records can be extracted from this database for a single library or a group of libraries. Subsets of the database can be produced in tape, COM, or CD-ROM format. Statewide prices have been or could be negotiated for any of the above formats. The bibliographic records extracted will be the master records in the database and will not contain each library's bibliographic variations. The detailed holdings statement will contain each library's variations.

* Development of a tool which could be used as a guide for selecting miscellaneous pieces of cataloging information, such as call numbers, subject headings, correct main entries, catalog card filing rules, and other information. The bibliographic records on the microfiche would contain all of the above information.

* Development of a catalog which could be used by local libraries as a backup to local online circulation systems or library catalogs. Libraries currently use the microfiche to locate titles in their collections when online systems are not operating. Purchasing copies of the statewide fiche is often more cost effective than creating a local fiche.

* Development of a tool which could be used as a primary source of current cataloging information. The WISCAT microfiche does not serve as an efficient means of providing current cataloging. The database will not be up-to-date unless supplements are produced. The information may be useful at the time the database is produced but will become decreasingly so as time passes. Libraries can create current machine-readable catalog records using the LC MARC fiche and MITINET/retro but cannot produce cards in this process. Cards can be produced using MITINET/marc and ULTRACARD MARC on an IBM-PC. However, it will not be cost effective to produce all records in this fashion, and it may not provide satisfactory input into the database, as duplicate records could be created and go undetected.

Other Comments

The WISCAT tapes received from Brodart could be loaded into OCLC at a cost of $324,000. OCLC would read each record on the tape and set a three-letter code for each library listed. It is not clear how these records would be updated on OCLC if holdings changed. OCLC cannot currently process MITINET transactions. Detailed holdings information (call number, copies, etc.) would not be entered into OCLC. If the tapes are loaded into OCLC, only OCLC libraries would have access to the records via OCLC. In this case, OCLC libraries might not need a copy of the CD-ROM equipment or a copy of the CD-ROM disks.

Cost of State Database Online Using OCLC

[Table, heavily garbled in the source. Columns: Unit Cost; OCLC Users (number of units, cost); MITINET/WISCAT Users (number of units, cost); Total Cost for both. Line items cover one-time equipment (leased-line M300 microcomputers at $3,015, printers, dialup microcomputers, modems, software, OCLC profiling), annual equipment maintenance and system service fees, production of records (current cataloging at $1.39 prime time and $1.17 non-prime time per transaction, catalog cards at $0.054 each, retrospective conversion, Microcon at $0.34 per record), online access (leased lines at $1,680 per year, dialup passwords at $248 per year, catalog and search charges at $6.99 per hour, per-search transaction charges), administration (basic service fee $50), initial training at $1,000 per MITINET/WISCAT library ($400,000 for 400 libraries), on-going support at 11.7%, and archive tapes at tiered per-record charges. Legible grand totals: one-time equipment $0 (OCLC users) and $946,138.75 (MITINET/WISCAT users); total annual $1,634,334.91 plus $3,395,019.39 = $5,029,354.30; online access project total $1,634,334.91 plus $4,341,158.14 = $5,975,493.05. A "Products for Libraries" section lists charges for archival and deduped tapes, microfiche and CD-ROM subsets, interlibrary loan transactions ($0.99 per produce or referral, $0.15 per holdings display, $0.20 lending credit), serials union list holdings, Microenhancer software ($275), subject searching via BRS ($56 per connect hour, $0.14 per citation), and WISCAT tapeload.]

Notes for Costs of OCLC Online Database

Costs for OCLC libraries are based on the number of transactions or units used or in place in 1985-86 times the costs per transaction or unit for 1986-87. Costs for MITINET libraries are based on unit costs multiplied by the estimated number of units in the cost scenario document.

Participation: OCLC library costs are based on the current level of equipment and activity. MITINET/WISCAT library costs are based on a single terminal per library. Costs are based on 85 OCLC libraries and 25 MITINET/WISCAT libraries using leased lines and 15 OCLC libraries and 375 MITINET/WISCAT libraries using dialup lines.
OCLC processing center libraries receive cataloging through the processing center, but have online access for searching.

Equipment: OCLC libraries use M300 terminals (microcomputers) or older model terminals as already installed. Dialup users use IBM-PC equipment at state contract costs. This equipment includes a standard IBM-PC (256K) with monitor, keyboard, cables, printer, and modem. Apple or IBM-PC compatible equipment would be cheaper. Many libraries already have equipment which could be used; however, costs are figured as if all libraries bought equipment for this purpose. All MITINET libraries already have Apple or IBM terminals which could be used for this purpose if the level of searching does not interfere with other services.

Production of records and data input: All production of records would be accomplished online via the OCLC database. The 65 OCLC processing center libraries receive current cataloging and retrospective conversion services through the processing center library, and the costs are included in the OCLC column for cataloging.

Database maintenance: Ongoing database maintenance for OCLC is built into the production-of-records costs incurred by each library when a record is used for the first time.

Telecommunications: It is assumed that libraries using the dialup connection to catalog will spend 26 hours per month using OCLC. Processing center libraries which do not catalog or do retrospective conversion will spend 4 hours per month using OCLC. Subject searching is not included, as this cannot be done on OCLC.

Administration/training/consultation/support: The category for on-going support covers costs for all of the above items and is put under training because this is the predominant purpose. This cost is figured as a percentage of costs associated with annual equipment maintenance, production of records, and searching costs. Telecommunications, equipment costs, training, and other items are not included.

Products from the whole database: At the present time, OCLC is developing two types of CD-ROM products. The first is a reference CD-ROM; the second is a cataloging CD-ROM which has an online connection for batch uploading of records. Both products will contain portions of the OCLC database. No cost, production schedule, or specific product description information is yet available. It is not known whether OCLC will be able to produce custom CD-ROMs from individual library or statewide databases even if all holdings are in OCLC.

Local products: OCLC produces archival tapes which contain a copy of each record created each time the system is used. These tapes contain duplicate records for the same title and must be "deduped" prior to being used in any automated system. They are also in OCLC MARC format rather than LC MARC format. There is often an added cost to carry out this process prior to loading a record into a local system. OCLC does not produce microfilm or microfiche from the bibliographic database. Microfiche can be produced from the serials union list only. CD-ROM products are not currently produced for individual libraries or groups of libraries. Tapes are not produced for customized output which can be loaded into other vendors' systems or microcomputer systems. Libraries can contract with other vendors to process OCLC archival tapes and produce microfilm, microfiche, CD-ROM, or customized tape products. Each library would have to do this individually, as this process would not be covered by a statewide contract under this scenario.
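As the support note explains, the 11.7% on-going support rate shown in the cost tables applies only to selected cost categories. A minimal Python sketch of that formula; the category amounts in the example call are placeholders, not figures from the analysis.

    # Sketch of the on-going support formula: 11.7% of equipment maintenance,
    # production of records, and searching costs only.
    SUPPORT_RATE = 0.117

    def ongoing_support(equipment_maintenance, production_of_records, searching):
        # Telecommunications, equipment purchase, and training are excluded.
        return SUPPORT_RATE * (equipment_maintenance + production_of_records + searching)

    print(f"${ongoing_support(100_000, 500_000, 50_000):,.2f}")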
Notes on Purposes

* Development of an interlibrary loan tool for verification of specific titles and library holdings. The OCLC database contains 14 million records with holdings for libraries throughout the country. A directory contains interlibrary loan policies for the libraries with holdings in the database. An online interlibrary loan system allows for the completion of both verification and request transmission processes. Costs for verification of requests are included under searching and display-holdings charges. Costs of using the interlibrary loan subsystem are not included in this analysis.

* Development of a reference tool for verifying available information on specific subjects and verifying complex bibliographic citations. The database contains full bibliographic records which can be used for verification of complex citations. Since the database is very large and up-to-date, most citations are likely to be found. Subject access is not available on the OCLC online system. Subject access to a portion of the OCLC file, which may not contain all the holdings of any individual library, is available through BRS. The costs of searching BRS are not included in this analysis.

* Development of a database which could be used by libraries to create machine-readable bibliographic records for use in local and area level automation projects.
Although OCLC could be used as backup to local online crculation systems or online catalogs, its value for this purpose is limited because the database contains only master records. Call numbers and other local modifications are not shown on the online system. Also only three-letter symbols are shown on the online system so internal four-letter code information is not available online. * Development of a catalog which could be used by library users to supplement local library catalogs. Some OCLC libraries have OCLC terminals in their public access a:eas for staff and patron use. e Development of a tool which could be used as a primary source of current cataloging information. The primary purpose of the OCLC system is shared cataloging. Catalogng is the foundation upon which all the other features of the system are built. Through the cataloging process. a database for verification and interlibrary loan is created. Libraries may make modifications to records in the database and these modifications are kept on the archival tapes. .81- [329] [3303 Cost of State Database Online Using Brodart OCLC LSERS NLTINETWISC AT USE RS TOT AL COST Number Number UNIT COST ofUn; s Cost fit. Cst ONE TIME EQUIPMENT STARTUP COSTS Leased tine users Technical services terminal $2,384.00 159 S379056.00 25 $59,600.00 3438656 00 Printer and adapter $1,400 00 85 $119,000.00 25 $35.00000 $54,000 00 Cluster adapter (multple term. 3500.00 47 $23,500.00 0 $0.00 $23.500 00 Modem $4,450.00 85 $378.250.00 26 $115700 00 $493950 00 Software 30.00 85 $0.00 25 50.00 50 00 Installation $680.00 85 $57.800.00 25 $17,000.00 $74800.00 Dialup users Microcomputer $1,405.17 15 $21,077.55 375 $526.938.75 $548,016.30 Printer/cables $358.00 15 $5.370.00 375 $134,250.00 $139.620 00 Modem $371.00 15 $5,565.00 375 5139.12500 $144,690.00 Software $150.00 15 $2,250.00 375 S56250.00 $58,50000 ANNUAL COSTS Equipment Maintenance Technical services terminal/year $312.00 159 $49,608.00 25 $7,800.00 $57.408 00 Printer $252.00 85 $21420.00 25 $6,300 00 $27,720.00 Cluster adapter/year $120.00 47 $5,640.00 0 50.00 $5,640.00 Modem leased linet/year $360.00 85 $30,600.00 25 $9,000.00 $39,600.00 Software $0.00 85 $0.00 25 $0.00 $0.00 Production of Records tLibrary) Current cataloging Transactions $0.00 496,099 $0.00 837,500 $0.00 $0.00 Catalog cards 50.04 4.706,756 $188.270 24 4,187,500 $167.500.00 $355,770.24 Storage/year 30.005 496.099 $2,480.50 837.500 $4,187 50 $6,668 00 Retrospective conversion Transactions $0.00 679,355 $0.00 1,005,000 $000 $0.00 Storage/year $0.005 679,355 $3,396.78 1,005.000 $5,025.00 $8,42 78 Database Maintenance Add unique records $0.07 240,000 $16,800.00 50,000 $3,500.00 $20,300.00 Add non-MARC records S0 20 25,000 $5.000.00 $5,000 00 Updatechange records $0.00 Delete records $0.00 Correct errors 50.00 Delete duplicate records LCCN consolidation/indesee 50.005 $20,400.00 $20.400.00 GPO $2,000.00 $2000.00 $4.000.00 Storage costs/month $10.000.00 $120,000.00 S120,000 00 Extraction support/annual $6,000.00 $6.000 00 56,000 00 -82- [331] Cost of State Database Online Using Brodart !continued) I I OCLC USERSOCLCb USERS Number MITINETqWISCAT tSERS Number UNIT COST ofUnits Cost of 5 Cost Onhne Access Telecommumccions Leased line $14,400.00 $14.400 00 Main dropper month $1.200-00 Multi-drop lineswper month $330.00 85 $336,600.00 25 $99,000.00 $435,600.00 Port accaess/per month $250.00 85 $255,000.00 25 $75.000.00 $330,00000 Dialup Tymnet ports/per 8/month $1,600.00 15 $36,000.00 375 $900,000.00 $936.000,00 Logon/port access/month 
$182.00 15 532,760.00 375 $819,000.00 $851,760,00 Phone charges $7.00 15 $34,020.00 375 $850-500.00 $884,520-00 Transaction Costs $0.00 1,029,020 $0.00 621.000 $000 $0.00 Administration Salary & f.b. (db managersiyear $37,850.47 $37,850.47 $37,850 47 LTE quality control $8,802.59 $8 802.59 $8,802.59 Training/Consultation Salary & f.b. (trainer/consVyear $26.516.64 $26.516.64 $26,516.64 Training from Brodart $500.00 $1,000.00 $3,000.00 $4,000 00 Other (Annual) Supplies $4,100.00 $4,100.00 $4,100-00 Travel $10,000.00 $10,000.00 $10.000.00 Statistics $500.00 $500.00 $500.00 Products from Whole Database Tapes $1,200.00 $1,200.00 $1,20000 TOTAL (ONETIME EQUIPMENT) $ 957,606.00 $1,027,613.75 $1,985,219.75 TOTAL ANNUAL ONLINE ACCESS PROJECT TOTAL $1,973.201.51 $4,234,195.95 $6.207,397.46 PRODUCTS FOR LIBRARIES Products from Database Subset Tapes (archival) None Tapes deduped) per titlei.005 Microfiche Varies CD-ROM Varies Local database storage per title/.005 -83- TOTAL COST [3321 Notes for Costs of Brodart Online Database All costs are oased on estimates made by Brodart. Actual costs would :e ootasned through a bid process and might well be less than listed here ?articipapion: it is assumed that new equipment would be purchased for all libraries. Costs are provided based on all OCLC Libraries having the number of terminals they now have and WISCAT/MITINET lbrar- es having a single terminal or microcomputer for use of the system. It is assumed that all libraries would have online access to the database either through use of leased lines or dialup lines. Costs are based on 85 OCLC libraries and 25 MITINET/WISCAT libraries using leased lines and 15 OCLC libraries and 375 MITINET/ WISCAT libraries using dialup lines. Equipment: The equipment used by libraries with leased lines includes: a Telex terminal and printer and a 9600 baud modem. One extra modem is needed for San Diego. Installation includes equipment and phone line installation. IBM 3276 terminals may also be used. Terminals with security features for public and patron use are available at approximately the same price. The technical services equipment would allow libraries to search the database and input data into the database once authorization to do so is given. Microcomputers can- not now be used on leased lines, but Brodart is working on this capability. Dialup users may use Apple or IBM computers and telecommunica- tions software which emulates an IBM 3270 terminal Crosstalk and Apple Access are recommended and it is not now known whether PC- Talk or ASCII Express will also work). The cost includes a standard IBM-PC I256K) with monitor, keyboard, cables, printer and modem. Apple or IBM-PC compatible equipment would be cheaper. Many li- braries already have equipment which could be used; however, costs are figured as if all libraries bought equipment for this purpose. All MITINET libraries already have Apple or IBM terminals which could be used for this purpose if the level of searching does not inter- fere with other services. Production of records and data input Libraries could have the capability of adding or updating records directly into the database. There are several reasons this may not be desirable from the point of view of the library or the state. The data- base contains a master record. and it may not be desirable from a quality control standpoint to give all users the authorization to change that record n the database directly. Brodart would create a workspace for records which are cataloged or changed. 
It is assumed that all libraries will catalog on the system in this scenario. Libraries which use Brodart for cataloging and want to save local variations in the bibliographic record, must set up a separate database with Brodart. Local database storage costs are 5.005 per record. Catalog cards cost 5.04 per record. MITINETWISCAT users' transactions are figured on the basis of 400 libraries cataloging 2500 titles a year and doing 3000 retrospec- tive conversions a year total for 198546 divided by 400i. It is assumed that all libraries would catalog using this system. Data)ase maintenance: Whle babiographic records can be added f:n man anrares. or; the addition of unique records :ncurs a ccst Once a record s n the database. there IS no charge to add hod.c-gs fr moanotner ersrv or to make changes to :nat record. The cost ' zne sad.t-on of .iniue cataloging records ., sted under dataoase mn rte.rnace rather than cataloging. Telecommunications Leased line costs are based on estimates of the cost of lnes From AT&T. Actual line costs per library could vary depending on the lo- cation of the library A average costs were used based on estimates for the entire state. Line costs might be less f the state contracted for leased line use as a part of the telephone contract, but this is not yet possible. A trunk line is necessary from Wisconsin probably in La Crosse)ito San Diego where the computer and database are located. Dialup use does not incur phone line charges as Brodart uses an $00 number for this purpose. Telephone charges are included in the $25 connect time cost. It is assumed that libraries will use 27 hours per month at $25 per hour. At 15 hours a month. Brodart recommends using a leased line as this appears to be the breakeven point. Training: It is assumed in this scenario that Brodart would hold 6 workshops around the state for training. It is assumed that DLS staff would be hired to provide training as well. Local products: Local products can be created after records are extracted from the database. Extraction costs are not charged if local CD-ROM or COM products are produced. There is an extraction charge for tape pro- ducts if the databases) to be extracted do not equal 100.000 titles. Support for smaller database extraction is included under MITINET since these libraries are most likely to have small databases. Both OCLC and MITINET libraries can make extractions. Unit costs for products vary depending on the number of titles included. For example, a small library with 2,500 titles will pay 5.07 per title for a microfiche or CD-ROM master and 1.00053 per title for microfiche copies. A large library with over 500.000 titles will pay 5.04 per title for a master and 5.000031 per title for microfiche copies. CD-ROM copies are based on disc $15 per disc) rather than title costs. Notes on Purposes e Development of an interlibrary loan tool for verification of specific titles and library holdings. The database would contain over 2.7 million bibliographic records and over 10 million library holdings in Wisconsin. This database would be updated frequently and be more up-to-date than the WISCAT microfiche and probably more up-to-date than a potential CD-ROM product. The database contains all four letter OCLC codes including internal library codes. Experienced OCLC users would find this useful. Non- -84- [333] OCLC users may find it confusing as no translation of library names is used ason WISC AT. 
The software does allow transmission of interibrary loan requests to other libraries using the system. a Development of a reference tool for verifying available informa. tion on specific subjects and verifying complex bibliographic cite- ions. The database contains full bibliographic records which can be used for verification of complex citations. In addition the search straw. tegies are flexible and powerful. Subject access is available through searching subject headings or by key word searching. e Development of a database which could be used by libraries to create machine-readable bibliographic records for use in local and area level automation projects. Bibliographic records and holdings can be extracted from the data- base. Libraries can do retrospective conversion by searching the database nine. Unique records can be added to the database using the cataloging/maantenance function and records can also be modi- fled. Use of the cataloging/maintenance function requires know. ledge of MARC fields and tags. The staff in many non-OCLC libraries are not currently familiar with MARC. and this would require extensive training to assure the records would be created properly. If the proper information is not entered in each MARC field, the machine-readable records will not process correctly in a future automated system. 0 Development of a tool which could be used as a guide for selecting miscellaneous pieces of cataloging information: such as call num- bars, subject headings. correct main entries, catalog card filing rules, and other information. The database contains all of the above information and could be used for this purpose. e Development of a tool for use in selection of materials for library colecuons. The database would contain bibliographic records and holdings of 500 or more Wisconsin libraries and would be a very useful guide to determine whether or not items should be purchased depending on esumated use of the item. a Development of a catalog which could be used by local Libraries as a backup to local online circulaUon systems or library catalogs. The Brodart software was designed specifically to be used as an online catalog for staff or patron searching. It is easy to use and has a number of fairly powerful searching capabilities. Hardware and software security features are available for staff or patron searching. It would be very useful as a backup to an online circulation system or catalog for finding bibliographic informauon. It would not keep track of circulation information. Only one user could use a single work staton at a time. Depending on the frequency of use, more than one terminal might be needed for patron use. a Development of a catalog which could be used by library users to supplement local library catalogs. The software was soectfcslv designed for >nine ca.acg use It . possible to estr:ct searhes tOOny the Ildings its ij-g:e rary so thi the state .nor' ast c"d :e used as a xa. .brar :italog e Development ofa tool whichh couid be used is I or fry source of currentt ctaiging riorratcn This software is not designed primarily for :sig'rg Cur-erry libraries can print shelf st cards on site. arid Brodart hes 'Ate solty to produce full sets of catalog cards as an oftie Service Use of the system for cataloging requires knowledge of M ARC fieLds and tags Libraries which use the dataase for cataloging purposes and wish to keep local variations in the bibliograpnic records wouldd riced to set up separate databases with Brodart. 
In this scenario, it is assumed that catalogingt information would bekept online for one year only Libraries could also keep their ensure database online, but the cost would be much nreater each year. Llese they plan to use their individual database onlne, it e assumed that transactions would be stored on tape after a year or loaded into a local automation system on a regular schedule. Interfaces may be available between Brodart and some circulaun system vendors. Brodart would need to update the database from the transactions created in a master work~fle or the local databases. It is not clear how frequently this would be done. Separate databases are neces- sary to allow libranes to preserve local cataloging variations. Other Comments Since this scenario assumes that all current OCLC users would use this system. there is a large one-time equipment cost to replace all OCLC terminals and equipment. It is unlikely that this scenano would ever be implemented as outlined here. Many OCLC users would not want to change systems, and it would not be adv antageous for all to do so. The OCLC database would contain many more records than the WISCAT database ever will contain and libraries will get a higher hit rate against that database. This scenario includes costs for two services (interlibrary loan transmission and subject searching) which are not in the OCLC scenario costs. These services are included here, because they are included in the base costs of the service and there are not additional transaction costs associated with them. Brodart currently does not have a system this large in operation. The costs as presented here, however, provide a conceptual view of the unit costs and the effect of applying them to a specified number of libraries. -85- [334] Cost of State Database On Co impact Disc OCLC USERS MNTINET-WISCAT USERS TOTA L C OS-, Number Nxmber %rbot UNITCOST of urts Cost ofT'nltis Cost ONE TIME EQUIPMENT STARTUP COSTS CO ROM players t4) $2.700 000 .400 51080.000 000 S1,350.000 000 Microcomputer $1,200 000 100 $120,000.000 400 $480,000 000 $600.000 000 PintermcableS $358.000 35,8000000 400 5143.200.000 $179,000 000 Software $0000 100 $0000 400 $0.000 50000 ANNUAL COSTS Equipment Maintenance/Other CD-ROM Player/Microcomputar/year $200.000 100 $20,000 000 400 ' 580,000.000 $100,000.000 Software $0000 100 0.000 400 1 0.000 $0000 OCLC M.300 maintenance/year $432.000 68 329,376.000 $29,376.000 549,140.000 Terminal maintenance/year $540.000 91 549,140.000 . Modem (leased line Vyvar $780.000 72 556,160.0001$56.160,000 System service feesyear $336.000 152 551,072.000 %5t,072.000 Dial accsspasiword/year $248.000 48 $11,904.000 I$11,904.000 Dial acceas/cataloging/lrs. 
59 600 2,964 528,464.40028,454.400 Basic service fee $50.000 84 j 4,200.000 $4,200.000 11.7% 5106,894.836 I:--06,894.836On-gOing SUPPOr% Production of Records (Library) Current cataloging OCLC Prnme time Non.prime time Credits MITINET MARC fiche/year Supplement/year Retrospective conversion OCLC Prime urns Non-prme time Microcon MITINET GPO Databese Maintenance Add unique MARC records Add non-MARC records Update/change records Delete records Correct errors Delete duplicate records LCCN consolidation/indetes Extraction support/year $1.390 $1.170 (50.500) $90.000 595.000 $1.170 50.300 50.340 50.000 263,200 S365,848.000 147,069 $172,070.730 85,830 (S42.915,000) 6,151 37.196.670 377.767 $113,330.100 295,437 1100,448.580 $0.070 $0.200 50.000 $0.000 50.000 50.000 50.005 $6,000.000 240,000 $16,800-000 $365,848.000 $172,070.730 542915-000) 200 $18,000.000 $18,000.000 200 519,000.000 519,000,000 $7,196.670 $113,330.100 $100,448.580 50,000 50,000 $2,000.000 $4.000.000 $3,500.000 $5,000.000 $20,400.000 $6,000.000 50,000 25.000 $20,300.000 $5,000.000 $0.000 $0.000 $0.000 50.000 $20.400.000 6,000.000 I I___I_______ I ______________ _________ -86- I .' -.00 [335] Cost of State Database On Compact Disc (continued) OCLC USE RS MITINETWISCATUSERS TOTAL COST Number Number for'both UNITTCOST of Units Cost of Units Cost Users Onilne Access Telecommunications $0.000 30 000 Leased line Dialup Transaction costs $0.000 $0000 Software Development/Maintenance Annual salary (programmer) $29,465 856 $29,465 856 $29.465.856 Administrauon Annual salary (database manager) $18,925.235 $18,925-235 $18,925.235 LTE recordkeeping $8,802.590 $8,802.590 $8,802.590 Training/Consultation Annual salary (database manager) $18.925.235 318,925.235 $18.925.235 Other tAnnual) Supplies $4,100.000 $4,100.000 4,100000 Archival tapes $1.000.000 $1,000,000 $1,000-000 MACC transmission $6,000.000 $6,000,000 $6,000.000 Travel $5,000.000 $5,000.000 $5,000,000 Statistics $500,000 $500.000 $500.000 Products from Whole Database Tapes $1,200.000 $1,200.000 $1,200.000 CD-ROM Msster/titlelcopy $0.032 2,700,000 $86,400.000 86.400.000 Copiesiper set of disks 160.000 100 $6,000.000 400 $24,000.000 $30,000.000 TOTAL (ONE TIME EQUIPMENT) $ 425,800.000 $1,703.200.000 $2,129,000.000 TOTAL ANNUAL OCLC $1,053,180.316 $1,053,180.316 TOTAL ANNUALEBRODART/OTHER S 19,800.000 $ 357,218.917 S 377,018.917 STATE CO MPACT DISC PROJECT TOTAL $1.498,780.316 $2,060,418.917 $3,559,199.233 PRODUCTS FOR UBRARIES Products from Database Subset Tapes (archival) Tapes (deduped) Microfiche Microfilm CD-ROM IISCAT Tapeload lnitia/OCLC number Initiai/unique Annual None per titlei.005 Varies Varies Varies $0.120 2,100,000 10.200 600,000 Combination 315,000 $252.000.000 $120,000.000 $41.000.000 -87- %otas for Costs of Compact Disc Dasabase All costs are based on esUmates made by Brodart. Actual costs would be obtained through a bid process and might well be less than listed here Parucipation: OCLC users include 100 libraries which currently have online access. Processing center users are listed under WISCAT/lITINET for all costs besides catalogng because this is the current format used to supply them with bibliographic and holdings information. This distinction is not important for this scenario, but this asigment is consistent with that used in the other scenarios. There are 400 W1SCAT/MITINET users. Equipment; IBM-PC mcrocoesputrs and the four CD-ROM players ar pinced as they would be purchased through Brodart. The printer and cables are quoted at state purchasmng prices. 
The microcomputer is a 512K computer with one disk drive and a full keyboard. The price is based on purchase of 100 or more units. The price of both microcomputers and CD-ROM players is expected to decrease. It is also possible to use IBM or compatible equipment purchased through the state contract, but this does not appear to be cheaper at this time.

Production of records: The bibliographic database used to produce the CD-ROM is the same as the one used to produce the WISCAT microfiche. OCLC libraries contribute records through use of the OCLC system, and these records are added to the database by processing OCLC archival tapes. Other libraries add records through use of MITINET or through tapeload of records from other automated systems. Costs of adding both OCLC and MITINET transactions to the database are listed under database maintenance. Production of records using either method also involves labor costs which are not listed here. The cost of computer transactions and other items are listed in this budget. OCLC costs are based on actual OCLC usage for 1985-86, and prices are for 1986-87. While OCLC costs are usually paid for locally, the costs are included here to show all costs associated with the project. The MARC fiche allow MITINET libraries to use bibliographic records in the Library of Congress MARC file which are not on the WISCAT database. Currently libraries share use of the MARC fiche; therefore, fiche are only bought for half of the libraries participating. Costs for retrospective conversion including labor have been kept as a result of tracking LSCA projects. Use of OCLC has averaged $.66 per transaction and use of MITINET has averaged $.36 per transaction when labor is taken into account.

Database maintenance: While bibliographic records can be added from a variety of sources, only the addition of unique records incurs a cost. Once a record is in the database, there is no charge to add holdings from another library or to make changes to that record. Libraries using CD-ROM receive many of the benefits of using an online system, such as online searching capabilities. However, there are no transaction costs for searching once the product is created. Since it is a fixed medium, it also cannot be kept up to date in an online mode. There are no telecommunications costs.

Local products: Local products can be created after records are extracted from the database. Extraction costs are not charged if local CD-ROM or COM products are produced. There is an extraction charge for tape products if the database(s) to be extracted do not equal 100,000 titles. Support for smaller database extraction is included under MITINET since these libraries are most likely to have small databases. Both OCLC and MITINET libraries can make extractions. Unit costs for products vary depending on the number of titles included. For example, a small library with 2,500 titles will pay $.07 per title for a microfiche or CD-ROM master and $.00053 per title for microfiche copies. A large library with over 500,000 titles will pay $.04 per title for a master and $.000031 per title for microfiche copies. CD-ROM copies are based on disc price ($15 per disc) rather than title costs.
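The size-dependent product pricing above is quoted at only two points (2,500 titles and 500,000 or more titles), so any schedule between them is an assumption. A minimal Python sketch that treats 500,000 titles as the cutoff; the function names are hypothetical.

    # Sketch of the per-title product pricing quoted above; the tier cutoff
    # between the two quoted points is an assumption.
    def master_and_copy_rates(titles):
        if titles >= 500_000:
            return 0.04, 0.000031   # per-title master, per-title fiche copy
        return 0.07, 0.00053        # small-library rates

    def product_cost(titles, fiche_copies=0, cdrom_copies=0):
        master, copy = master_and_copy_rates(titles)
        return (titles * master
                + titles * copy * fiche_copies
                + 15.0 * cdrom_copies)  # CD-ROM copies are priced per disc

    # Small library, one master plus 10 fiche copies:
    # 2,500 * $.07 + 2,500 * $.00053 * 10 = $188.25
    print(f"${product_cost(2_500, fiche_copies=10):,.2f}")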
Notes on Purposes

* Development of an interlibrary loan tool for verification of specific titles and library holdings. The CD-ROM would be a very useful tool for interlibrary loan. The database includes records from a variety of types and sizes of libraries. Library staff can update the records in the library and provide DLS with update transactions. The CD-ROM format allows more flexible searching patterns than the microfiche version of the catalog. Author, title, truncated, and keyword searching techniques are possible. Full bibliographic records are available to aid in identification of different editions and formats. Information on nearly 3 million titles and 10 million holdings is available. Once a record has been identified on the CD-ROM, it will be possible to write that record to a disk. These records can then be sent to the bulletin board system. The weakness of this format for interlibrary loan is that the information cannot be kept up to date instantaneously. Supplement disks can be produced periodically, and the entire database can be updated periodically. Normally a database of this size would not be entirely updated more than annually. It would be possible to produce supplements.

* Development of a reference tool for verifying available information on specific subjects and verifying complex bibliographic citations. The CD-ROM database can be used for this purpose. Subject access can be obtained by searching subject heading information or by using the keyword searching capabilities.

* Development of a database which could be used by libraries to create machine-readable bibliographic records for use in local and area level automation projects. The database from which the CD-ROM product is created allows for records to be contributed from a variety of sources. Records can be extracted from this database for a single library or a group of libraries. Subsets of the database can be produced in tape, COM, or CD-ROM format. Statewide prices have been or could be negotiated for any of the above formats. The bibliographic records extracted will be the master records in the database and will not contain each library's bibliographic variations. The holdings statement will contain each library's variations.

* Development of a tool which could be used as a guide for selecting miscellaneous pieces of cataloging information, such as call numbers, subject headings, correct main entries, catalog card filing rules, and other information. The bibliographic records on the CD-ROM would contain all of the above information.

* Development of a catalog which could be used by local libraries as a backup to local online circulation systems or library catalogs. The CD-ROM workstation could serve as a workstation which could be used when an online system is not operating. The user would also be able to search in an online environment. The database would not be as up-to-date as the online catalog or circulation system, and a means of supplementing this information might be necessary. However, a substantial portion of the information would be available. It is possible to limit searches to only the holdings of a single library, so it would not be necessary for a patron or staff member to look at the state holdings unless this was judged to be desirable. Circulation information would not be available.

* Development of a tool which could be used by library users to supplement local library catalogs. The CD-ROM format can be used as an online catalog for in-house patron or staff use. The software has been specifically developed for this use. Searching techniques are flexible and easy to use. It is possible to search on a single library name or on a systemwide basis as well as a statewide basis.

* Development of a tool which could be used as a primary source of current cataloging information.
This system as costed out here will not serve as an efficient means of providing current cataloging in the traditional sense. The database will not be up-to-date unless supplements are produced. The information may be useful at the time the database is produced but will become decreasingly so as time passes. The software does not currently have the capability of printing catalog cards. There are two ways in which libraries could supplement this system to provide cataloging services. The BiblioFile software and disks can be operated on the same equipment as a CD-ROM version of WISCAT; libraries could subscribe to that system to obtain current cataloging. Libraries can create current machine-readable catalog records using the LC MARC fiche and MITINET/retro but cannot produce cards in this process. Cards can be produced using MITINET/marc and ULTRACARD MARC on an IBM-PC. However, it will not be cost effective to produce all records in this fashion, and it may not provide satisfactory input into the database, as duplicate records could be created and go undetected.

Other Comments

This option requires a large one-time investment in equipment. Once this investment is made, the on-going annual costs are less than those for the current microfiche project. The OCLC costs listed are those which are paid for by local libraries to obtain the services of OCLC, which libraries would continue to incur regardless of the existence of this project. The WISCAT tapes received from Brodart could be loaded into OCLC at a cost of $324,000. OCLC would read each record on the tape and set a three-letter code for each library listed. It is not clear how these records would be updated on OCLC if holdings changed. OCLC cannot currently process MITINET transactions. Detailed holdings information (call number, copies, etc.) would not be entered into OCLC. If the tapes are loaded into OCLC, only OCLC libraries would have access to the records via OCLC. In this case, OCLC libraries might not need a copy of the CD-ROM equipment or a copy of the CD-ROM disks.
"MITINET/retro in Wisconsin libraries." Information Technology and Libraries 3 (1984): 267-292. Borg, W.E., and Gall, M. D., Educational research: An introduction. 4th ed. New York: Longman, 1983. Budd, John, Steven Zink, and Jeanne Voyles. "How Much Will It Cost? Predictable Pricing of ILL Services: An Investigation and a Proposal." RQ 31 (Fall 1991): 70- 74. Cassell, R. E. "Pennsylvania's CD-ROM state-wide union catalog," in SCIL: The Second Annual Software! Computer/Database Conference and Exposition for Libraries and Information Managers Conference Proceedings, ed. N. M. Nelson. Westport, CT: Meckler, 1987: 34-35. Cates, Dan, Network Coordinator, Iowa State Library, phone interview [with Stan Gardner, Jefferson City, Mo], April 25, 1991. Cates, Dan, Network Coordinator, Iowa State Library, phone interview [with Stan Gardner, Jefferson City, Mo], March, 1992. Clark, Katie, "Comparisons of online and CD-ROM databases: Content and Retrieval Differences." Online/CD-ROM'90 Conference Proceedings. Weston, CT: Online, Inc. (1990): 36-39. Davis, W. P. "Missouri libraries move into CD-ROM world." Show-Me Libraries 39, no.3 (1987): 4-6. DeWath, N. V., and Palmour, V. E., Missouri state-wide bibliographic data base survey. Rockville, MD: King Research, Inc., 1980. ERIC, ED 195 228. Drew, Sally, Director, Bureau for Interlibrary Loan & Resource Sharing, Wisconsin State Library, letter to [Stan Gardner, Jefferson City, Mo] August, 1990. Epler, D., and R. E. Cassell. "Access Pennsylvania: A CD-ROM database project." Library Hi Tech 5, no. 3 (1987): 81-92. Epler, D. M. "Networking in Pennsylvania: Technology and the school library media center." Library Trends 37, no. 2 (1988): 43-55. 340 Fayad, Susan, Senior Consultant, Network Development, Colorado State Library, phone interview [with Stan Gardner, Jefferson City, Mo], February, 1992. Flanders, Bruce. "Library Automation News and Analysis" Kansas Libraries (June 1991): 6. Frechette, Dorothy B., Deputy Director, Rhode Island Department of State Library Services, letter to [Stan Gardner, Jefferson City, Mo], August 10, 1990. Gatcheff, V. "LePac technologies tie the keystone state together." Library Trends 37 (1987): 89-92. Glazer, F. J. "That bibliographic highway in the sky." Library Journal 110, no. 2 (1985): 64-67. Goodlin, Margaret, School Library and Educational Media Supervisor, State Library of Pennsylvania, letter to [Stan Gardner, Jefferson City, Mo], August 14, 1990. Griffin, David, Information Officer, WLN, letter to [Stan Gardner, Jefferson City, Mo], August 24,1990. Helgerson, L. W., "Acquiring a CD-ROM Public Access Catalog System Part 1: The Bottom Line May Not Be The Top Priority." Library Hi Tech 19, vol. 5, no. 3 (Fall 1987): 49-75. Herrick, Jacci, Information Services Coordinator, Tennessee State Library, letter to [Stan Gardner, Jefferson City, Mo], October 4th, 1990. Kolbe, Jane, State Librarian, South Dakota State Library, Survey form from Stan Gardner, completed and returned December, 1991. Lambert, Steve, and Suzanne Ropiequet. CD-ROM: The new papyrus: the current and future state of the art. Redmond, Washington: Microsoft, 1986. Logsdon, L. "Brodart named vendor for state-wide database." Show-Me Libraries 39, no. 5 (1988): 4-5. "MainCat bill passes." Library Journal 112, no. 2 (1987): 20. "Maine approves state-wide catalog." Wilson Library Bulletin 62, no. 1 (1987): 10. 341 "MaineCat fact sheet." Nelson, N. M.,, Editor. 
SCIL: The Second Annual Software/Computer/Database Conference and Exposition for Libraries and Information Managers Conference Proceedings. Wesport, CT: Meckler; 1987. Mischo, Lare. "The Alice-B Information Retrieval (IR) System: A Locally Developed Library System at Tacoma Public Library". Library Hi Tech 29, no. 8(1) (1990): 7-20. "Missouri libraries outfitted with CD-ROM." Wilson Library Bulletin 62, no. 3 (1987): 15. Missouri State Library, records and files dated from 1987 to 1991. Moeller, Ronda, Coordinator Kansas Union Catalog, Kansas State Library, phone interview [with Stan Gardner, Jefferson City, Mo], March 21, 1991. Moeller, Ronda, Coordinator Kansas Union Catalog, Kansas State Library, phone interview [with Stan Gardner, Jefferson City, Mo], February, 1992. Moore, B. "An Introduction to CD-ROM technology." Show-Me Libraries 38, no. 11 (1987): 12-13. Mundell, Jacqueline, Network Services Librarian, Nebraska Library Commission, Survey form from Stan Gardner, completed and returned December, 1991. "Nevada installs CD-ROM catalog." Wilson Library Bulletin 62, no. 3 (1988): 14. New Jersey Computer Applications Task Force. A report of the Computer Applications Task Force. Trenton, NJ: New Jersey State library; 1980. ERIC, Ed 234 766. New York State Library. Libraries & technology: A strategic plan for library resource sharing in New York. New York: New York State Library; 1987. ERIC, ED 286 523. Niemeyer, Mollie D. MCAT, The Missouri Statewide Bibliographic Database: An Assessment. Master's Thesis, Central Missouri State University, 1989. Ohio State Library. "Ohio Shared Catalog CD-ROM Available." The State Library of Ohio News. Columbus, Ohio: Ohio State Library. 249, no. 1 (March, 1991): 12. 342 Ostendorf, JoEllen, Interlibrary Cooperation, Division of Public Library Services for the State of Georgia, phone interview [with Stan Gardner, jefferson City, Mo) March, 1992. Palmour, V.E., and DeWath, N.V. Missouri state-wide bibliographic data base survey. Rockville, MD: King Research, Inc., 1980. ERIC, ED 195 228) Prosser, Judith, Interlibrary Cooperation Librarian, West Virginia Library Commission, Survey form from Stan Gardner, completed and returned December, 1991. Scheppke, Jim, State Data Coordinator, Oregon State Library, phone interview [with Stan Gardner, Jefferson City, Mo] March 1992. Sessions, Judity, Hwa-Wei Lee, and Stacey Kimmel. "OhioLink: Technology and Teamwork Transforming Ohio Libraries. Wilson Library Bulletin 66, no. 10 (June 1992): 43-45. Slater, Frank, Librarian, North Dakota State Library, Survey form from Stan Gardner, completed and returned December, 1991. Sloan, Tom W., Deputy Director, Delaware Division of Libraries, letter to [Stan Gardner, Jefferson City, Mo], October, 1990. Smith, Barbara G., Chief, State Library Network and Information Services Section of the Maryland State Department of Education, Division of Library Development and Services, letter to [Stan Gardner, Jefferson City, Mo], September 8, 1990. Smith, Frederick E., and Messmer, George E. J., "The State-wide Automation Planning Process in New York." Library Hi Tech 26, no.7(2) (1989): 85-89. Smith, L. C., "Questions and answers: Strategies for using the electronic reference collection," in Impact on resource sharing and reference work. Urbana-Champaign, IL: University of Illinois Graduate School of Library and Information Science, 1990. Staffeldt, Darlene, Information Resources Director, Montana State Library, letter to [Stan Gardner, Jefferson City, Mo), September 11, 1990. 
work_cntmvbxh35gsxc6w3vxufkaeei ---- Museum Data Exchange: Learning How to Share

Final Report to The Andrew W. Mellon Foundation

Günter Waibel, Ralph LeVan, Bruce Washburn
OCLC Research

A publication of OCLC Research

© 2010 OCLC Online Computer Library Center, Inc. All rights reserved. February 2010
OCLC Research, Dublin, Ohio 43017 USA. www.oclc.org
ISBN: 1-55653-424-8 (978-1-55653-424-9). OCLC (WorldCat): 503338562

Please direct correspondence to: Günter Waibel, Program Officer, waibelg@oclc.org

Suggested citation: Waibel, Günter, Ralph LeVan and Bruce Washburn. 2010. Museum Data Exchange: Learning How to Share. Report produced by OCLC Research. Published online at: www.oclc.org/research/publications/library/2010/2010-02.pdf.

Contents

Executive Summary
Introduction: Data Sharing in Fits and Starts
Early Reception of CDWA Lite XML
Grant Overview
Phase 1: Creating Tools for Data Sharing
  COBOAT and OAICatMuseum 1.0: Features and Functionality
  Implementing and Refining the Suite of Tools
Phase 2: Creating a Research Aggregation
  Legal Agreements
  Harvesting Records
  Preparing for Data Analysis
  Exposing the Research Aggregation to Participants
Phase 3: Analysis of the Research Aggregation
  Getting Familiar with the Data
  Conformance to CDWA Lite, Part 1: Cardinality
  Excursion: the Default COBOAT Mapping
  Conformance to CDWA Lite, Part 2: Controlled Vocabularies
  Economically Adding Value: Controlling More Terms
  Connections: Data Values Used Across the Aggregation
  Enhancement: Automated Creation of Semantic Metadata Using OpenCalais™
  A Note about Record Identifiers
  Patricia Harpring's CCO Analysis
  Third Party Data Analysis
Compelling Applications for Data Exchange Capacity
Conclusion: Policy Challenges Remain
Appendix A: Project Participants
Appendix B: Project Related URLs for Tools and Documents
Bibliography
Notes

Figures

Figure 1. Draft system architecture for a CDWA Lite XML data extraction tool
Figure 2. Block diagram of COBOAT, its modules and configuration files
Figure 3. Excerpt from a report detailing all data values for objectWorkType from a single contributor
Figure 4. Excerpt of a report detailing all units of information containing a data value across the research aggregation
Figure 5. Screenshot of the no-frills search interface to the MDE research aggregation
Figure 6. Records contributed by MDE participants
Figure 7. Use of CDWA Lite elements and attributes in the context of all possible units of information
Figure 8. Use of possible CDWA Lite elements and attributes across contributing institutions, take 1
Figure 9. Use of possible CDWA Lite elements and attributes across contributing institutions, take 2
Figure 10. Any use of CDWA Lite required / highly recommended elements
Figure 11. Any use of CDWA Lite required / highly recommended elements
Figure 12. Use of CDWA Lite required / highly recommended elements by percentage
Figure 13. Match rate of required / highly recommended elements to applicable controlled vocabularies
Figure 14. Top 100 objectWorkTypes and their corresponding records for the Metropolitan Museum of Art
Figure 15. Top 100 nameCreators and their corresponding records for the Harvard Art Museum
Figure 16. Most widely shared values across the aggregation for nameCreator, nationalityCreator, roleCreator and objectWorkType
Figure 17. nationalityCreator: relating records, institutions, and unique values
Figure 18. objectWorkType: relating records, institutions, and unique values
Figure 19. Screenshot of a search result from the research aggregation
Figure 20. objectWorkType spreadsheet (excerpt) for CCO analysis, including evaluation comments
Figure 21. Overall scores from CCO evaluation—each bar represents a museum

Executive Summary

The Museum Data Exchange, funded by The Andrew W. Mellon Foundation, brought together a group of nine museums and OCLC Research to create tools for data sharing, build a research aggregation and analyze the aggregation. The project established infrastructure for standards-based metadata exchange for the museum community and modeled data sharing behavior among participating institutions.

Tools

The tools created by the project allow museums to share standards-based data using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
• COBOAT allows museums to extract Categories for the Description of Works of Art (CDWA) Lite XML out of collections management systems.
• OAICatMuseum 1.0 makes the data harvestable via OAI-PMH.

COBOAT's default configuration targets Gallery Systems' TMS, but can be adjusted to work with other vendor-based or homegrown database systems. Both tools are a free download from http://www.oclc.org/research/activities/museumdata/. Configuration files adapting COBOAT to different systems can be shared at http://sites.google.com/site/museumdataexchange/. For more detail, see Phase 1: Creating Tools for Data Sharing.

Data Harvesting and Analysis

Harvesting data from nine museums, the project brought together 887,572 records in a non-public research aggregation, which participants had access to via a simple search interface. The analysis showed the following:

• for CDWA Lite required and highly recommended data elements, 7 out of 17 elements are used in 90% of the contributed records
• the match rate against applicable Getty vocabularies for objectWorkType, nameCreator and roleCreator is approximately 40%
• the top 100 objectWorkType and nameCreator values represent 99% and 49% of all aggregation records respectively.

Significant improvements in the aggregation could be achieved by revisiting data mappings to allow for a more complete representation of the underlying museum data. Focusing on the top 100 most frequently occurring values for key elements will impact a high number of corresponding records, and would be low-hanging fruit for data clean-up activities. For further analysis, the research aggregation will be available to third-party researchers under the terms of the original agreements with participating museums. For more detail, see Phase 2: Creating a Research Aggregation and Phase 3: Analysis of the Research Aggregation.

Impact

In its relatively short life span to date, the project's suite of tools has catalyzed several data sharing activities among project participants and other museums:

• The Minneapolis Institute of Arts uses the tools in a production environment to contribute data to ArtsConnectEd, an aggregation for K-12 educators.
• The Yale University Art Gallery and the Yale Center for British Art use the tools to share data with a campus-wide cross-search, and to contribute to a central digital asset management system.
• The Harvard Art Museum and the Princeton University Art Museum are actively exploring OAI harvesting with ARTstor. (Three additional participants have signaled that this would be a likely use for their OAI infrastructure as well.)

Participating vendors contributed to the museum community's ability to share:

• Gallery Systems extended COBOAT for EmbARK, demonstrating the extensibility of the MDE approach.
• Selago Design created custom CDWA Lite functionality for MIMSY XG, freely available to customers as part of their OAI tools.

An increasing number of projects and systems using CDWA Lite / OAI-PMH as a component (for example OMEKA, Steve: The museum social tagging project, and CONA™) can be seen as a leading indicator of the future need for data sharing tools like the ones created as part of the Museum Data Exchange.
When there are applications for sharing data which directly support the museum mission, more data is shared, and museum policies evolve. Conversely, when more data is shared, more such compelling applications emerge. For more detail, see Compelling Applications for Data Exchange Capacity.

Introduction: Data Sharing in Fits and Starts

Digital systems and the idea of aggregating museum data have a longer history than the availability of integrated access to museum resources in the present would suggest. As early as 1969, a newly formed consortium of 25 US art museums called the Museum Computer Network (MCN) and its commercial partner IBM declared, "We must create a single information system which embraces all museum holdings in the United States" (IBM et al. 1968). In collaboration with New York University, and funded by the New York Council of the Arts and the Old Dominion Foundation, MCN created a "data bank" (Ellin 1968, 79) which eventually held cataloging information for objects from many members of the New York-centric consortium, including the Frick Collection, the Brooklyn Museum, the Solomon R. Guggenheim Museum, the Metropolitan Museum of Art, the Museum of Modern Art, the National Gallery of Art and the New York Historical Society (Parry 2007). However, using electronic systems with an eye towards data sharing was a tough sell even back in the day: when Everett Ellin, one of the chief visionaries behind the project and then Assistant Director at the Guggenheim, first shared this dream with his Director, he remembers being told: "Everett, we have more important things to do at the Guggenheim" (Kirwin 2004). The end of the tale also sounds eerily familiar to contemporary ears: "The original grant funding for the MCN pilot project ended in 1970. Of the original fifteen partners, only the Metropolitan Museum and the Museum of Modern Art continued to catalog their collections using computerized methods and their own operating funds." (Misunas et al.)

Today, the museum community arguably is not significantly closer to a "single information system" than 40 years ago. As Nicholas Crofts aptly summarizes in the context of universal access to cultural heritage: "We may be nearly there, but we have been 'nearly there' for an awfully long time." (Crofts 2008, 2) Not for lack of trying, however, as a non-exhaustive selection of strategies and experiments to standardize museum data exchange in the US highlights:

• The AMICO Library of digital resources from museums (conceived in 1997, a full year before eXtensible Markup Language (XML) became a W3C recommendation) created a data format consisting of a field prefix (such as OTY for Object Type) and the field delimiter "}~" to exchange information (AMICO).
• In 1999, a consortium of California institutions (MOAC) implemented a mark-up standard from the archival community (Encoded Archival Description, or EAD) to bring their resources into an existing state-wide aggregation of library special collections and archival content.
• Between 1998 and 2003, the CIMI consortium launched a range of projects exploring data standards and protocols for exchange, including Z39.50, Dublin Core and the UK standard SPECTRUM.
All of these initiatives had merit in their particular historical context as well as a heyday of adoption, yet none of these strategies achieved consensus and widespread use over the long term. The most contemporary entry in the history of museum data sharing is Categories for the Description of Works of Art (CDWA) Lite XML (Getty Trust 2006). In 2005, the Getty and ARTstor created this XML schema "to describe core records for works of art and material culture" that is "intended for contribution to union catalogs and other repositories using the Open Archives Initiative (OAI) harvesting protocol" (Getty Research Institute n.d.). Arguably, this is the most comprehensive and sophisticated attempt yet to create consensus in the museum community about how to share data. The complete CDWA Lite data sharing strategy comprises:

• A data structure (CDWA) expressed in a data format (CDWA Lite XML)
• A data content standard (Cataloging Cultural Objects—CCO)
• A data transfer mechanism (Open Archives Initiative Protocol for Metadata Harvesting—OAI-PMH)

What follows is a brief example of how these different specifications work hand in hand to establish standards-based, shareable data:

• CDWA, a data field and structure specification, defines a discrete unit of information such as "Creation Date" with sub-categories for "Earliest Date" and "Latest Date."
• CCO, a data content standard, specifies the rules for formatting a date as "Late 14th century" for display and using an ISO 8601 format "1375/1399" for machine indexing.
• CDWA Lite XML, a data format, allows the encoding of all this information, as shown in the code snippet below:

  <cdwalite:displayCreationDate>Late 14th century</cdwalite:displayCreationDate>
  <cdwalite:indexingDatesWrap>
    <cdwalite:indexingDatesSet>
      <cdwalite:earliestDate>1375</cdwalite:earliestDate>
      <cdwalite:latestDate>1399</cdwalite:latestDate>
    </cdwalite:indexingDatesSet>
  </cdwalite:indexingDatesWrap>

• OAI-PMH, a data exchange standard, allows sharing the resulting record. The protocol supports machine-to-machine communication about collections of records, including retrieval from a content provider's server by an OAI-PMH harvester. It also supports synchronizing local updates with the remote harvester as the museum data evolves (Elings and Waibel 2007).
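In practice, a harvester drives this exchange with plain HTTP requests. The two requests below are illustrative: the host name and record identifier are invented, and the metadata prefix under which a repository offers CDWA Lite can vary by configuration, but the verbs and parameters themselves are defined by the OAI-PMH specification.

  http://museum.example.org/oai?verb=ListRecords&metadataPrefix=cdwalite&from=2009-06-01
  http://museum.example.org/oai?verb=GetRecord&metadataPrefix=cdwalite&identifier=oai:museum.example.org:12345

The from argument is what enables the synchronization behavior described above: a harvester that remembers the date of its last visit can ask for only those records added or changed since then, instead of re-harvesting an entire collection.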
The Museum Data Exchange (MDE) project outlined in this paper attempts to lower the barrier for adoption of this data sharing strategy by providing free tools to create and share CDWA Lite XML descriptions, and helps model data exchange with nine participating museums. The activities were generously funded by The Andrew W. Mellon Foundation, and supported by OCLC Research in collaboration with museum participants from the RLG Partnership. The project's premise: while technological hurdles are by no means the only obstacle in the way of more ubiquitous data sharing, having a no-cost infrastructure to create standards-based descriptions should free institutions to debate the thorny policy questions which ultimately underlie the 40-year history of fits and starts in museum data sharing.

Early Reception of CDWA Lite XML

The launch of CDWA Lite XML was officially announced at the MCN annual conference in Boston on November 5, 2005. The following two data points help illuminate its reception by the community.

A small survey among ten prominent museums from the RLG Partnership (seven from the United States, two from the United Kingdom, one from Canada) conducted by Günter Waibel approximately six months after the initial launch of CDWA Lite XML showed that:

• Capabilities for exporting standards-based data of any kind (including CDWA Lite XML) are non-existent.
• Policy issues are a major obstacle to providing access to high-quality digital images. No museum provides free access to publication-quality digital images of artworks in the public domain without requiring a license (one museum has plans), while nine museums license publication-quality digital images for a fee.
• A limited amount of data sharing already happens, primarily with subscription-based resources. While eight museums provide access to digital images on their Web site, four museums contribute to licensed aggregations such as ARTstor or CAMIO, and two contribute to non-licensed aggregations such as state-wide or national projects.

Approximately 18 months after the launch of CDWA Lite XML, the newly minted CDWA Lite Advisory Committee¹ surveys the cultural heritage community writ large to gauge the impact of CDWA Lite, and finds the following:

• CDWA Lite XML garners great interest: 144 respondents (50.7% from the museum community) start the 22-question survey.
• Even among the self-selecting group of those taking the survey, few have the experience to complete it: only the first three questions have responses from a majority of respondents, while the numbers drop precipitously once questions presuppose basic working knowledge of CDWA Lite. Only 22 individuals complete the survey.

Given this backdrop, an RLG Programs/OCLC working group called "Museum Collection Sharing" (OCLC Research n.d.c), inaugurated in May 2006, sought to support increased use of the fledgling CDWA Lite strategy by providing a forum for museum professionals to share information and collaborate on implementation solutions. The group identified the following hurdles for getting museum data into a shareable format:

• The complexities of mapping data in collections management systems to CDWA Lite
• The absence of mechanisms to export data out of collections management systems and transform it into CDWA Lite XML
• The complexities of configuring and running an OAI-PMH data content provider

Circumstances made the creation of an OAI-PMH data content provider which "speaks" CDWA Lite XML the lowest-hanging fruit on the list. In their proof-of-concept project with ARTstor, the Getty had implemented a modified version of OAICat, an open source OAI data provider originally written by Jeff Young (OCLC Research). In collaboration with the working group and supported by Jeff, OCLC Research released a CDWA Lite enabled version of OAICat (OAICatMuseumBETA) in the fall of 2007. Unfortunately, parallel investigations into widely applicable mechanisms to create CDWA Lite XML records did not immediately bear fruit. For example, the working group discussed the possible application of OCLC's Schema Transformation technology (OCLC Research n.d.b) with Jean Godby (OCLC Research) and explored Crystal Reports, a report writing program bundled with many collections management systems, to output CDWA Lite XML. However, the release of OAICatMuseumBETA provided the impetus for funding from The Andrew W. Mellon Foundation to remedy a situation in which museums on the working group had a tool to serve CDWA Lite XML records, yet had no capacity to create these records to begin with.

Grant Overview

The grant proposal funded by The Andrew W. Mellon Foundation in December 2007 with $145,000² contained the following consecutive phases, which will also structure the rest of this paper.

Phase 1: Creation of a Batch Export Capability

The grant proposed to make a collaborative investment in a shared solution for generating CDWA Lite XML, rather than many isolated local investments with little community-wide impact. Grant participants aimed to leverage the experience some institutions on the Museum Collection Sharing working group had gained from exploring local solutions to create a common solution. The Yale University Art Gallery, for example, had started developing a command-line tool using customizable SQL files which create database tables corresponding to CDWA Lite; the Metropolitan Museum of Art was working with ARTstor on a CDWA Lite / OAI data-transfer solution as part of the Images for Academic Publishing (IAP) program. To keep the grant manageable and within budget, we limited our investigation to an export mechanism for Gallery Systems' TMS, the predominant database among the museums in the Collection Sharing cohort.

Museum partners: Harvard Art Museum (originally Museum of Fine Arts, Boston; the grant migrated with staff from the MFA to Harvard early in the project); Metropolitan Museum of Art; National Gallery of Art; Princeton University Art Museum; Yale University Art Gallery.

Phase 2: Model Data Exchange Processes through the Creation of a Research Aggregation

The grant proposed to model data exchange processes among museum participants in a low-stakes environment by creating a non-public aggregation with data contributions utilizing the tools created in Phase 1, plus additional participants using alternative mechanisms. The grant purposefully limited data sharing to records only—including digital images would have put an additional strain on the harvesting process, and added little value to the predominant use of the aggregation for data analysis (see Phase 3 below).

Museum partners: all named in Phase 1, plus the Cleveland Museum of Art and the Victoria & Albert Museum (both contributing through a pre-existing export mechanism); in the process of the grant, data sets from the Minneapolis Institute of Arts and the National Gallery of Canada were also added.

Phase 3: Analysis of the Research Aggregation

The grant proposed to surface the characteristics of the research aggregation, both its potential utility and limitations, through a data analysis performed by OCLC Research. The CDWA Lite / OAI strategy had been expressly created to support large-scale aggregation—however, would the museum data transported by these means actually come together in a meaningful way? A minimal interface to the research aggregation would make cross-collection searching available to museum participants.

Museum partners: all nine institutions named under Phase 1 and Phase 2.

All individuals who had a significant role in the activities surrounding the grant are acknowledged in Appendix A: Project Participants.
Phase 1: Creating Tools for Data Sharing

The first face-to-face project meeting at the Metropolitan Museum of Art in January 2008 resulted in the following draft system architecture for a data extraction tool, which distilled our far-ranging discussions around the required functionality into a single graphic. This quote, like Figure 1 taken from the original meeting minutes, explains the envisioned flow of the data:

"The Data Extraction Tool obtains data from the Source Database through application of SQL based mapping profiles. It will store the resulting output in a new and separate CDWA Lite Work Database that resides on a server behind the institution's firewall. The Work Database provides an efficient means of representing the data defined by the CDWA Lite standard and the OAI header. A Database Publishing Tool will be configurable to push data across the firewall to the Public OAI CDWA Lite XML Database. In addition, the tool will be capable of publishing CDWA Lite XML records with an OAI wrapper to the Public OAI CDWA Lite XML File System, or CDWA Lite XML records with or without an OAI wrapper to an Internal CDWA Lite XML File System. Either the public File System or XML Database could be accessed by an OAI repository to respond to HTTP queries from the Web."

While Figure 1 and its description hint at the emerging complexity of the grant's endeavor, some of the devils are still hiding in the details. For example, even a tool providing a solution solely for TMS needs to support significant variability in the source data model: it needs to adapt to a variety of different product versions of TMS used by different project participants, as well as different implementations of the same product version by different project participants. In addition, the tool needs to adapt to a variety of different practices within an institution: the Metropolitan Museum, for example, is running twenty installations of TMS controlled by different departments, while for other participants, a single instance of TMS within a museum contains considerable variability because different departments use that single instance according to different guidelines.

Figure 1. Draft system architecture for a CDWA Lite XML data extraction tool

Supporting crucial OAI-PMH features created additional requirements for the tool: it needs to keep track of updates to the TMS source data so it only regenerates CDWA Lite XML for updated records, and is capable of communicating these updates through OAI-PMH. In addition, the tool needs to be able to mark records as belonging to an OAI-PMH set so museums can create differently scoped packages of metadata for different harvesters.

In short, our first project meeting surfaced a mismatch between required features, timeline and budget for Phase 1 of the grant. In addition, the meeting exposed tension between the open source requirement of the grant and official policies at the majority of participating museums, which did not have resources for open source development, and supported Microsoft Windows exclusively. While everybody around the table wanted to create an open source solution, lack of support for open source within the group constituted a serious risk factor for successful implementation. Apparently, others shared the concern that overall requirements, timeline and budget for the project were out of sync.
The response from open source developers in the museum community who received our RFP was tepid, and only one party wanted to discuss details. Ben Rubinstein, Technical Director at Cognitive Applications Inc. (Cogapp), a UK consulting firm with a long track record of compelling museum work, presented us with an intriguing solution to our conundrum. As a by-product of many museum contracts which required accessing and processing data from collections management systems, Cogapp had developed a system called COBOAT (Collections Online Back Office Administration Tool). As part of our project, Ben proposed, Cogapp would make a fee-free, closed-source version of COBOAT available, while creating an open-source plug-in module which trained the tool to convert data into CDWA Lite XML. Leveraging an existing tool allowed the project to stay within budget limits; creating the open-source plug-in with grant money allowed us to stay within our funding mandate; and the overall package would be a good fit for the Microsoft Windows platforms commonly supported at most project participant sites. After review with project participants, Cogapp was awarded the contract to create the batch export capability envisioned by the grant.

COBOAT and OAICatMuseum 1.0: Features and Functionality

The suite of tools which emerged as part of the MDE project includes both COBOAT and an updated version of OAICatMuseum.

COBOAT is a metadata publishing tool developed by Cogapp that transfers information between databases (such as collections management systems) and different formats. As implemented in this project, COBOAT allows museums to extract CDWA Lite XML out of Gallery Systems' TMS. With configuration files, COBOAT can be adjusted for extraction from different vendor-based or homegrown database systems, or locally divergent implementations of the same collections management system. COBOAT software is available under a fee-free license for the purposes of publishing a CDWA Lite repository of collections information at http://www.oclc.org/research/activities/coboat/.

OAICatMuseum 1.0 is an OAI-PMH data content provider supporting CDWA Lite XML which allows museums to publish the data extracted with COBOAT. While COBOAT and OAICatMuseum can be used separately, they do make a handsome pair: COBOAT creates a MySQL database containing the CDWA Lite XML records, which OAICatMuseum makes available to harvesters. The software upgrade from BETA to 1.0 was created by Bruce Washburn in consultation with Jeff Young (both OCLC Research). OAICatMuseum 1.0 is available under an open source license at http://www.oclc.org/research/activities/oaicatmuseum/.

More details: COBOAT's default configuration files make a best-guess effort to run a complete output job, which includes retrieving data from TMS, transforming it into CDWA Lite records, and outputting them to a MySQL database that can become a data source for OAICatMuseum. While the configuration files have evolved through the experience of museum participants, new implementations will likely require modifications to adapt to local practice. The first unpolished export of a handful of sample XML records helps pinpoint areas for improvement.
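For orientation, the skeleton of one such exported record looks roughly like the following. The element names come from the CDWA Lite schema, but the sketch is abridged and illustrative: the values are invented, the namespace declaration is elided, and the exact nesting of wraps and sets should be checked against the published schema.

  <cdwalite:cdwalite>
    <cdwalite:descriptiveMetadata>
      <cdwalite:objectWorkTypeWrap>
        <cdwalite:objectWorkType>painting</cdwalite:objectWorkType>
      </cdwalite:objectWorkTypeWrap>
      <cdwalite:titleWrap>
        <cdwalite:titleSet>
          <cdwalite:title>View of a Harbor</cdwalite:title>
        </cdwalite:titleSet>
      </cdwalite:titleWrap>
      <cdwalite:displayCreator>Unknown artist, Dutch, 17th century</cdwalite:displayCreator>
      <cdwalite:displayCreationDate>17th century</cdwalite:displayCreationDate>
    </cdwalite:descriptiveMetadata>
    <cdwalite:administrativeMetadata>
      <cdwalite:recordWrap>
        <cdwalite:recordID cdwalite:type="localID">1984.123</cdwalite:recordID>
      </cdwalite:recordWrap>
    </cdwalite:administrativeMetadata>
  </cdwalite:cdwalite>

Comparing a handful of such records against the fields curators actually maintain in TMS is usually the quickest way to see where the default mapping needs local adjustment.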
COBOAT can be run across all TMS records, or a predefined subset; in addition, it keeps track of changes to the data source, and outputs updated modification dates for edited records to OAICatMuseum. (In this way, the suite of tools allows a data harvester to request only records which have been updated, instead of re-harvesting a complete set.) Based on an "OAISet" marker in TMS object packages, COBOAT also generates data about OAI sets. (This allows a museum to expose differently scoped sets of data for harvesting.) Beyond CDWA Lite, OAICatMuseum also offers Dublin Core for harvesting, as mandated by the OAI-PMH specification. The application creates Dublin Core from the CDWA Lite source data on the fly via a stylesheet.

The following diagram provides an overview of the modules contained in COBOAT, and the configuration files which instruct different processes.

Figure 2. Block diagram of COBOAT, its modules and configuration files

COBOAT consists of a total of five modules. Each of these modules can be customized through configuration files or scripts. Primary to extracting and transforming data to CDWA Lite XML, as well as adapting COBOAT to different databases or database instances, are the following three modules:

• Retrieve module: extracts data out of a database (by default, TMS). The retrieve configuration file (XML) determines which data to grab, and creates a series of text files from the database tables.
• Processing module or plug-in: performs the transformation to CDWA Lite XML. Two configuration files are used for this procedure: the 1st pass data loader script (XML) assembles data arrays, while the 2nd pass renderer script (Smarty³) turns the data arrays into CDWA Lite XML.
• Build module: the build script (XML) outputs the data to a simple MySQL database (used by OAICatMuseum).

While the MDE project exclusively implemented COBOAT with TMS, it can be extended to other database systems. With the appropriately tailored configuration files, COBOAT can retrieve data from Oracle, MySQL, Microsoft Access, FileMaker, Valentina or PostgreSQL databases, as well as any ODBC data source (such as Microsoft SQL Server). Gallery Systems has tested the flexibility of COBOAT by running it against its EmbARK product, which is based on the 4th Dimension database system and has a different data structure from TMS. Slight modifications of COBOAT were required to support data extraction from tables and fields with an initial underscore in their names. According to Robb Detlefs (Director of West Coast Operations and Strategic Initiatives at Gallery Systems), the COBOAT configuration files were easily adapted to extract and transform the data, and the renderer script provided sufficient means for applying logic to the data from the original system. Several EmbARK clients are expected to implement COBOAT in the near future.

The MDE project has set up a Web site at http://sites.google.com/site/museumdataexchange/ where configuration files for COBOAT can be discussed and shared.
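To make the division of labor among those configuration files concrete, the fragment below sketches the kind of mapping a retrieve configuration encodes: which table to read, which columns to pull, and where to write the intermediate output. The fragment is hypothetical; the element and attribute names, and the table and column names, are invented for illustration, and the authoritative syntax is described in COBOAT's own documentation.

  <!-- hypothetical retrieve-configuration fragment; not COBOAT's actual syntax -->
  <retrieve>
    <table name="objects" output="objects.txt">
      <sql>SELECT ObjectID, ObjectNumber, Title, DateBegin, DateEnd FROM Objects</sql>
    </table>
  </retrieve>

Because the logic lives in configuration rather than code, adapting the extraction to another database (or to a divergent TMS implementation) means editing fragments like this one rather than rebuilding the tool.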
The configuration files shared there could either represent extensions to different database systems, or tweaks of default files adapting them to a particular instance of an already covered database. The EmbARK configuration files are available at this site.

Implementing and Refining the Suite of Tools

In order to extend the pre-existing COBOAT application to include CDWA Lite XML capability and arrive at a default TMS configuration for the tool, Cogapp built a first instance of the new processing plug-in and tested it with two museum participants. The Metropolitan Museum of Art, with twenty stand-alone instances of TMS and upwards of 300K records one of the project's most complex cases, became the first implementer. In parallel, Cogapp worked with the Princeton University Art Museum—as a smaller institution with fairly limited technical support, this museum represented the other end of the spectrum. Once the entire suite of tools, including OAICatMuseum 1.0, had been implemented at both of these institutions with considerable support from Cogapp and OCLC Research, the remaining museums faced the task of installing the applications as if they had simply downloaded them from the Internet, with no initial support other than the manuals.

In addition to the five museum participants named in the grant, the Minneapolis Institute of Arts added yet another layer of testing for the suite of tools. To support data sharing between the Institute and the Walker Art Center as part of the ongoing redesign of ArtsConnectEd (Minneapolis Institute of Arts and Walker Art Center n.d.), the Institute of Arts installed a pre-release version of COBOAT / OAICatMuseum in a production environment (Dowden et al. 2009). The information architecture of ArtsConnectEd revolves around OAI harvesting of CDWA Lite records from each of the contributors, and the MDE tools matched the Institute's needs for a readily implementable CDWA Lite / OAI solution.

When all five Phase 1 museums plus the Minneapolis Institute of Arts had tried their hands at implementing the tools, they found COBOAT eminently suitable to the task of transforming their collection data into CDWA Lite XML. Those with slightly higher technical proficiency tended to find the tool easier to use than those with less technical support. The in-depth documentation for COBOAT won universal acclaim, and the additional high-level Quick Start Guide museums wanted to see is now part of the tool's download. Multiple museum representatives commented that matching up a desired effect on the output with the appropriate configuration file seemed like one of the biggest hurdles to jump. The considerable flexibility built into COBOAT clearly has its learning curve, and rewards those who make the time to familiarize themselves with the possibilities. On the other hand, museums also commented on the instant gratification of executing the initial default export, which invariably produced very encouraging, if not perfect, results. An e-mail on the project list read: "After just the initial run of this app I think I might be able to say I'm a 'CogApp Fanboy.' Got any t-shirts?"

Some figures courtesy of the Harvard Art Museum exemplify resource needs and runtime for COBOAT.
While there are many variables impacting runtime, the entire process of retrieving, transforming and loading 236,466 records took 85 minutes at Harvard. The raw retrieved data required approximately 80MB of space, while the temporary data created by the processing module occupied 0.9GB, and the final MySQL production database 2.7GB.

The museums who had implemented OAICatMuseum pronounced it a solid team player, with the only caveat being that memory allocations had to be monitored carefully, especially for harvests upwards of 50K records. Increasing the Java Virtual Machine (JVM) memory allocation from 512MB (the default) to 1GB enabled larger harvests. At over 110K records, the Metropolitan's dataset constituted the largest gathered with OAICatMuseum as part of this project. While there are many variables influencing the duration of a harvest, the Metropolitan OAI data transfer took about 24 hours.
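Because OAICatMuseum is deployed as a Java servlet application, the heap ceiling is raised where the servlet container is launched rather than in the application itself. Under Apache Tomcat, for example, one common way to do this is to set the container's startup options (shown below with the 1GB allocation that enabled the project's larger harvests; the exact mechanism varies by container, version and platform):

  export CATALINA_OPTS="-Xmx1024m"

Here -Xmx is the standard JVM flag for maximum heap size, and CATALINA_OPTS is the conventional environment variable Tomcat reads at startup.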
• The National Gallery of Canada, like the Minneapolis Institute of Arts, joined the project as a non-funded partner once shared interests emerged. The Gallery’s vendor Selago Design crucially enabled their participation by prototyping a CDWA Lite / OAI capacity in their MIMSY XG 4 collections management system, for which the MDE harvest constituted the first test. A little bit more detail on the MIMSY XG solution: according to James Starrit (Manager of Web Development, Selago Design), a MIMSY OAI-PMH tool set already existed when Gayle Silverman (Director of Community Relations, Selago Design) approached the National Gallery about participating in the MDE project with Selago’s support. This OAI provider was developed by Selago using PHP (with OCI8/Oracle extensions), and can be used with unqualified Dublin Core, and extended to other standards via templates. To support CDWA Lite, the appropriate mappings for the National Gallery of Canada's bilingual dataset had to be created. As a result of this work, CDWA Lite is now included in the OAI tool set, and available to any MIMSY XG user. Of all nine institutions whose records the project acquired, OAI-PMH was the transfer mechanism of choice in six cases, with four institutions using OAICatMuseum, and two an alternate OAI data content provider (Cleveland and the National Gallery of Canada). Two additional institutions wanted to employ OAICatMuseum, yet found their attempts thwarted. Policy reasons disallowed opening a port for the harvest at one museum; at another institution, project participants and OCLC Research ran out of time diagnosing a technical issue, and the museum contributed MySQL dump files from the COBOAT-created database instead. And last but not least, the Victoria & Albert Museum simply ftp’d their records. Once a set of institutional records was acquired, OCLC Research performed an initial XML schema validation as a first health-check for the data. For two data contributors, all records validated. Among the other contributors, the health-check surfaced a range of issues: • Element sequencing: valid elements were supplied, but not in the order defined by the CDWA Lite XML schema • Incorrect paths: for example, missing a “Wrap” or “Set” element in the XML path • Missing namespaces: for example, type attributes not preceded by “cdwalite:” • Missing required elements: for example, recordID not provided inside recordWrap • Invalid Unicode characters: Unicode characters 0x07 and 0x18 found in some records, preventing validation The two validating record sets came from institutions using COBOAT who had not tweaked the default configuration files. Many of the element sequencing, incorrect path and missing namespace issues were introduced in those portions of the output which had been edited. While OCLC Research harvested data at least twice from each contributor, mostly to provide an opportunity to rectify schema validation errors, downstream processes and tools (described below) were flexible enough to also handle non-valid records. Two lessons from harvesting the nine collections stand out: • First, OAI-PMH as a tool is ill-matched to the task of large one-time data transfers, compared to an ftp or rsync transfer of records, or an e-mail of mySQL dumps. Data providers reap the Museum Data Exchange: Learning How to Share www.oclc.org/research/publications/library/2010/2010-02.pdf February 2010 Waibel, et. 
al., for OCLC Research Page 19 benefit of the protocol predominantly through its long-term use, when additions and updates to a data set can be effectively and automatically communicated to harvesters. Within the confines of the MDE project, however, OAI remained the preferred mode of data transfer, since the grant set an explicit goal of taking institutions through an OAI process. • Second, the relatively high rate of schema validation errors after harvest leads to the conclusion that validation was not part of the process on the contributor end. Ideally, schema validation would have happened before data contribution. Validation provides the contributor with important evidence about potential mapping problems, as well as other issues in the data; in this way, it becomes one of the safeguards for circulating records which best reflect the museums data. Preparing for Data Analysis To prepare the harvested data for analysis, as well as to provide access to the museum data, OCLC Research harnessed the Pears database engine,5 which Ralph LeVan (OCLC Research) outfitted with new reporting capabilities, and an array of pre-existing and custom-written tools. Pears ingests structured data, in this case XML, and creates a list of all data elements or attributes (referred to as “units of information” from here on out) which contain a data value. The database then builds indexes for the data values of each of these units of information. The values themselves remain unchanged, except that they are shifted to lower case during index building. For data analysis, these indexes provide a basis for grouping values from a specific unit of information for a single contributor, or the aggregate collection; as well as grouping the units of information themselves via tagpaths across the array of contributors. Figure 3 shows screen-shots of sample reports from Pears in spreadsheet format, which should help bring these abstract concepts to life. Column A provides counts for the number of occurrences for each data value. (Note that only 26 of the 121 values for objectWorkType are shown.) Just at a simple intuitive level, this report provides some valuable information: first of all, it demonstrates that objectWorkTypes for this contributor were limited to a finite number of 121; it shows that a number of terms were concatenated, probably in the process of exporting the data, to create values such as “drawing-watercolor”; at first glance, only these concatenated values seem to duplicate other entries (such as “drawing”). The occurrence data indicates a high concentration of objects sharing the same objectWorkType, with numbers quickly falling below a 1K count. Museum Data Exchange: Learning How to Share www.oclc.org/research/publications/library/2010/2010-02.pdf February 2010 Waibel, et. al., for OCLC Research Page 20 Figure 3. Excerpt from a report detailing all data values for objectWorkType from a single contributor Figure 4 shows an excerpt of a report which groups units of information from all contributors. Even in the impressionistic form of a screenshot, this report provides some valuable first impressions of the data. 
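Contributors can run that check themselves before exposing records for harvest, and any schema-aware XML tool will do. With the widely available xmllint, for instance, validating a file of exported records against a local copy of the CDWA Lite XSD (file names here are illustrative) is a one-line operation:

  xmllint --noout --schema CDWALite.xsd exported-records.xml

Run as part of every export, a check like this would catch sequencing, path and namespace problems of the kind listed above before the records ever leave the museum.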
Figure 4 shows an excerpt of a report which groups units of information from all contributors. Even in the impressionistic form of a screenshot, this report provides some valuable first impressions of the data. Data paths (column A) which have long lists of institutions associated with them (column C) represent units of information which are used by many contributors; occurrence counts in column B, when mentally held against the approximately 900K total records of the research aggregation, complement the first observation by providing a first impression of the pervasiveness with which certain units of information are used. Outlier data paths which have only a single institution associated with them often betray an incorrect path, as confirmed by schema validation results. While OCLC Research held fast to a principle of not manipulating the source data provided by museums, the single instance of data clean-up performed prior to analysis consisted in mapping stray data paths to their correct place.

Figure 4. Excerpt of a report detailing all units of information containing a data value across the research aggregation

As already noted, Pears and its reporting capabilities were tweaked and extended for this project in many small as well as significant ways. For example, Ralph LeVan added a new application to the OCLC Research array of Pears tools which facilitated the comparison between museum source data values and controlled vocabularies. The process for comparing values to vocabularies was as follows: initially, each Getty vocabulary (AAT, ULAN, TGN) was transformed into a Pears database with an exposed SRW/U (Search/Retrieve via the Web or URL) interface. An application walked down the sorted list of data values in an institutional index, and compared them with the preferred and non-preferred terms in the controlled vocabulary, in both instances using the SRW/U interface to Pears as its conduit. The resulting report gave a count of the number of matching terms, as well as how many were found in the preferred and non-preferred indexes of the controlled vocabulary. A separate report listed the 100 most frequently occurring terms in the institutional database index, indicated whether they matched as a preferred or non-preferred term, and provided the term identifiers of the matching Getty vocabulary terms.

In another investigation, Pears learned a new trick: evaluating the interconnectedness of descriptive terms across the nine contributors. Since the analysis aimed to appraise the utility of aggregating CDWA Lite records, OCLC Research wanted to establish which values used by one contributor could be found in other contributors' datasets. An SRU client walked down an index from one institution and used the words found in that index to search the aggregate database, looking for matches. The resulting report used the 100 most frequently occurring terms of each contributing institution and counted how many times these terms occurred in each of the remaining museum datasets.
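The mechanics of such an SRU-based pass can be sketched as follows; the endpoint URL, the index names "preferred" and "nonpreferred", and the CQL relation are illustrative assumptions, not the actual Pears configuration.

    # A sketch of matching institutional values against a vocabulary exposed
    # via SRU; endpoint and index names are hypothetical.
    import requests
    from lxml import etree

    SRU_ENDPOINT = "http://localhost:8080/aat/sru"   # hypothetical Pears SRW/U port
    SRW_NS = "{http://www.loc.gov/zing/srw/}"

    def hits(index, term):
        params = {
            "operation": "searchRetrieve",
            "version": "1.1",
            "query": f'{index} exact "{term}"',
            "maximumRecords": "0",                   # we only need the hit count
        }
        doc = etree.fromstring(requests.get(SRU_ENDPOINT, params=params, timeout=30).content)
        return int(doc.findtext(f"{SRW_NS}numberOfRecords") or 0)

    museum_values = ["etching", "drawing-watercolor", "albumen print"]  # sample index walk
    tally = {"preferred": 0, "non-preferred": 0, "no match": 0}
    for term in museum_values:
        if hits("preferred", term):
            tally["preferred"] += 1
        elif hits("nonpreferred", term):
            tally["non-preferred"] += 1
        else:
            tally["no match"] += 1
    print(tally)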
Exposing the Research Aggregation to Participants

In addition to supporting data preparation for analysis, Pears also provided a low-overhead mechanism for making the individual databases as well as the research aggregation of all nine contributors available to project participants. By simply adding a stylesheet, the SRW/U-enabled Pears turns into a database with searchable and browseable indexes (see Figure 5).

Figure 5. Screenshot of the no-frills search interface to the MDE research aggregation

Project participants had password-protected access to both the aggregate and the individual datasets.

Phase 3: Analysis of the Research Aggregation

Before OCLC Research started the data analysis process, museum participants formulated the questions which they would like to ask of their institutional and collective data. Some of their questions about the institutional data sets were: Are the required CDWA Lite fields present? What is the state of compliance with CCO? Are the same terms used consistently to denote the same concepts? How are controlled vocabularies used? About the aggregate data set, museum participants wanted to know: Do queries across the research aggregation return meaningful results? How is my cataloging different from the other institutions' cataloging? Which CDWA Lite fields are used by all institutions? How does the lack of subject data impact the research aggregation? For both the institutional data sets and the aggregate data set, participants evidently assumed that there would be room for improvement, because in either instance, they wanted to hear recommendations for how the performance of the data could be enhanced.

These questions were formalized and expanded upon in a methodology6 to guide the overall analysis efforts. This methodology grouped questions into two sections: a Metrics section which contained questions with objective, factual answers, and an Evaluation section which contained questions that are by their nature more subjective. The Metrics section asked questions about Conformance (does the data conform to what the applicable standards, CDWA Lite and CCO, stipulate?), as well as Connections (what overt relationships between records does the data support?). The Connections questions in essence tried to triangulate the elusive question of interoperability. The Evaluation section asked questions about Suitability (how well do the records support search, retrieval and aggregation?), as well as Enhancement (how can the suitability for search, retrieval and aggregation be improved?).

The methodology was not intended to be a definitive checklist of all questions the project intended to plumb. It laid out the realm of possibilities, and allowed OCLC Research to discuss which questions it could tackle given expertise, available tools and time constraints. Most questions from the Metrics section lent themselves to machine analysis, and have been answered. OCLC Research itself predominantly worked on questions regarding CDWA Lite conformance, while an analysis of CCO compliance was outsourced to Patricia Harpring and Antonio Beecroft (see Patricia Harpring's CCO Analysis on page 38). Most questions pertaining to Evaluation, however, were beyond the reach of our project. Questions about Suitability especially require more foundational research before they become tractable to data analysis. Unless credible and deep data about how users search museum data becomes available, any question about Suitability in turn begs the question: suitable in which context, for whom, to do what?
As Jennifer Trant summarized in an introductory blog to a study of search logs at the Guggenheim Museum, "[W]e know almost nothing about what searchers of museum collections really do. [I] couldn't find a single serious [information retrieval] study in the museum domain." (Trant 2007). The Searching Museum Collections project, organized by Susan Chun (Consultant), Rob Stein (Indianapolis Museum of Art) and Christine Kuan (ARTstor), may provide some of the missing data points: the project proposes to analyze search logs of museums and data aggregators, including ARTstor logs, to answer questions about user behavior (Searching Museum Collections n.d.).

Getting Familiar with the Data

Overall, a total of 887,572 records were contributed by nine museums in Phase 2 of the grant (as shown in Figure 6). Six out of nine museums contributed all accessioned objects in their database at the time of harvest. Of the remaining three, one chose a subset of all materials published on their Web site, while two made decisions based on the perceived state of cataloging. Among those providing subsets of data, the approximate percentages range from one third to a little over 50% of the data in their collections management system.

Figure 6. Records contributed by MDE participants

Figure 7 represents the elements and attributes found in the aggregation, placed in the context of all 131 possible CDWA Lite units of information.7 It shows that few of the available units of information were consistently and widely used, some were used a little, and many were not used at all.

Figure 7. Use of CDWA Lite elements and attributes in the context of all possible units of information

Another take on the distribution of element/attribute use across the aggregation: Figure 8 shows the number of contributors that made any use of a unit of information. A relatively small number (10 out of 131, or 7.6%) of elements/attributes are used at least once by all nine museums. These units of information are:

• displayCreationDate (CDWA Lite Required)
• displayMaterialsTech (CDWA Lite Required)
• displayMeasurements (CDWA Lite Highly Recommended)
• earliestDate (CDWA Lite Required)
• latestDate (CDWA Lite Required)
• locationName (CDWA Lite Required)
• "type" attribute on locationName (Attribute)
• nameCreator (CDWA Lite Required)
• nationalityCreator (CDWA Lite Highly Recommended)
• title (CDWA Lite Required)

Figure 8. Use of possible CDWA Lite elements and attributes across contributing institutions, take 1

A final look at the distribution of use for the totality of all CDWA Lite elements/attributes (as shown in Figure 9) makes it easy to see how many units of information are not used at all (approximately 54%) and how many are used at least once by all contributors (approximately 8%).

Figure 9. Use of possible CDWA Lite elements and attributes across contributing institutions, take 2 (54% not used; 11% used by one contributor; 8% used by all)
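The Figure 8 and Figure 9 tallies reduce to a simple computation once per-contributor element lists exist; in the sketch below, the abbreviated unit list and the per-museum sets are stand-ins for the real Pears reports.

    # A sketch of the Figures 8/9 tallies: for each CDWA Lite unit of
    # information, count how many contributors used it at least once.
    from collections import Counter

    ALL_UNITS = ["title", "nameCreator", "subjectTerm"]   # 131 units in the real schema
    units_by_museum = {                                   # stand-in for the Pears reports
        "Museum A": {"title", "nameCreator"},
        "Museum B": {"title"},
    }

    usage = Counter()
    for units in units_by_museum.values():
        usage.update(units)

    n_museums = len(units_by_museum)
    not_used = [u for u in ALL_UNITS if usage[u] == 0]
    used_by_all = [u for u in ALL_UNITS if usage[u] == n_museums]
    print(f"{len(not_used)}/{len(ALL_UNITS)} units never used "
          f"({100 * len(not_used) / len(ALL_UNITS):.0f}%)")
    print(f"{len(used_by_all)}/{len(ALL_UNITS)} units used by all contributors")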
Conformance to CDWA Lite, Part 1: Cardinality

The CDWA Lite specification calls 12 data elements "required" and 5 elements "highly recommended" (see Figure 10). The specification's authors deem these elements particularly important both for the description of a piece of artwork and for its indexing and retrieval. In theory, the required elements are mandatory for schema validation;8 in practice, the vast majority of records which did not contain a data value in a required element still passed the schema validation test, since declaring the data element itself suffices for validation.

Figure 10. Any use of CDWA Lite required / highly recommended elements

As would be expected, use of CDWA Lite required / highly recommended elements shows a lot more density than use of the schema overall (see Figure 8). The counts shown in Figure 10 reflect the number of contributors who used these elements at least once. 9 out of 17 required / highly recommended elements are used by all contributors. A conspicuous outlier is subjectTerm (more on that later). However, a graph of the institutions which provided values in these elements for all of their records gives a different view of how comprehensively these required / highly recommended elements were utilized. locationName emerges as the only element consistently present in all records across all nine contributors. A little over 50% (9 of the 17) required / highly recommended elements occur consistently in only three or fewer museum contributors. Almost 50% (8 of the 17) required / highly recommended elements occur consistently in five or more contributors.

Figure 11. Consistent use of CDWA Lite required / highly recommended elements

The more realistic middle ground between the overly optimistic Figure 10 and the overly pessimistic Figure 11 is a graph of the percentage of records in the research aggregation that have values in the required / highly recommended elements (Figure 12). Discounting the outlier subjectTerm, the consistency with which these elements occur is greater than 65% overall. For 7 of 17 elements, consistency is above 90%.

Figure 12. Use of CDWA Lite required / highly recommended elements by percentage
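A per-element fill rate of the kind charted in Figure 12 can be computed directly from the records; in the sketch below, the dictionary-per-record representation and the abbreviated element list are assumptions.

    # A sketch of the Figure 12 computation: the percentage of records that
    # carry a value in each required / highly recommended element.
    REQUIRED = ["title", "nameCreator", "roleCreator", "locationName",
                "earliestDate", "latestDate", "objectWorkType", "subjectTerm"]

    def fill_rates(records):
        filled = {name: 0 for name in REQUIRED}
        for record in records:                 # record: element -> value(s)
            for name in REQUIRED:
                if record.get(name):           # element present with a non-empty value
                    filled[name] += 1
        return {name: 100.0 * n / len(records) for name, n in filled.items()}

    sample = [
        {"title": "Water Lilies", "nameCreator": "Monet, Claude", "locationName": "X"},
        {"title": "Untitled", "locationName": "Harvard Art Museum"},
    ]
    for name, pct in sorted(fill_rates(sample).items(), key=lambda kv: -kv[1]):
        print(f"{name:16s} {pct:5.1f}%")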
Excursion: the Default COBOAT Mapping

At this point, a short break from figures and a disclaimer about the data is in order. Project participants submitted data to the research aggregation as part of an abstract exercise. The project parameters made no demand of them other than to make CDWA Lite records available. Consequently, fields which may very well be present in source systems remained unpopulated in the submitted CDWA Lite records. Where OCLC Research simply accepted the data contribution because of the time constraints of the grant project, a real-life aggregator may very well have gone back to negotiate for further data values deemed crucial to a specific service.

For the 6 of 9 contributors using COBOAT, the default mapping provided with the application heavily influenced their data contribution. However, this default mapping covered all elements which museum participants had provided to Cogapp as part of their TMS to CDWA Lite mapping documents, and in that way, the defaults do represent a consensus of which units of information the museums considered important or unimportant. The default COBOAT mapping uses 32 units of information of the 131 defined in the CDWA Lite schema. All CDWA Lite required / recommended elements and attributes are in the default mapping, except for subjectTerm. (subjectTerm did not appear on any of the mapping documents.)9 In hindsight, it would have been beneficial to consciously reflect on those choices as a group, and to ponder the impact of the default mapping on the outcome of the data analysis. (This would likely have surfaced the absence of subjectTerm, and perhaps the lack of any termSource attributes to declare controlled vocabularies.) On the other hand, COBOAT participants did have the option of adding further data elements and attributes, of which they made limited use: four out of six COBOAT contributors used, in various combinations, 11 additional elements and attributes.

Conformance to CDWA Lite, Part 2: Controlled Vocabularies

CDWA Lite, as well as its attendant data content standard CCO, recommends the use of controlled vocabularies for 13 data elements, six of which are required / highly recommended:

• objectWorkType
• nameCreator
• nationalityCreator
• roleCreator
• locationName
• subjectTerm

None of the contributing museums had marked the use of controlled vocabularies on any of the six elements in question. (As noted above, the "termSource" attribute was not part of the COBOAT default export.) To get a sense of the deliberate or incidental use of values from controlled vocabularies, OCLC Research created a list of the top 100 most frequently used terms for each participating museum, deduplicated the lists (identical terms were sometimes frequently used at multiple museums), and then matched the remaining terms to a controlled vocabulary source recommended by CDWA Lite. This data matching exploration highlights whether connections with an applicable thesaurus are possible without expertise-intensive and costly processing of data.

Matches shown in Figure 13 are exact matches achieved without any manipulation of the source data, and pertain to the top 100 values for any given element from all contributing institutions. (The numbers along the x-axis, above the vocabulary acronym, give the total count of the top 100 values. For example, objectWorkType is represented by 577 top-100 deduplicated values from the eight institutions which contributed values for this element.) In some instances, higher match rates might have been achieved by post-processing, for example by splitting concatenated data values museums had contributed. Some values matched on multiple entries in their corresponding controlled vocabulary (more details below). Figure 13 includes the multi-matching values in the percentage counts.

For some of the data elements shown in Figure 13, the results of matching against controlled vocabularies are more indicative than for others. The issues encountered in matching values from museum contributors to controlled vocabularies were semantic mismatches (false hits), matches prevented by concatenated or deviantly structured data, and multiple matches.
These caveats make all matches on TGN summarized in Figure 13 (subjectTerm in AAT and TGN; nationalityCreator in TGN; locationName in TGN) somewhat questionable. Reasonably indicative, however, are the matches of nameCreator in ULAN, objectWorkType in AAT, and roleCreator in AAT.

Figure 13. Match rate of required / highly recommended elements to applicable controlled vocabularies, showing preferred and non-preferred matches for locationName in TGN (379 top-100 values), nameCreator in ULAN (838), nationalityCreator in TGN (519), objectWorkType in AAT (577), roleCreator in AAT (232), and subjectTerm in AAT and TGN (200 each). (TGN = The Getty Thesaurus of Geographic Names®, ULAN = Union List of Artist Names®, AAT = Art & Architecture Thesaurus®)

What follows is a more in-depth discussion of each attempt at matching.

objectWorkType and AAT
• 8 out of 9 institutions contributed objectWorkType data values. The count of deduplicated top-100 objectWorkTypes is 577 across the eight contributing institutions. (As would be expected, for some institutions, the sum total of all their objectWorkTypes is less than 100.)
• 41% of objectWorkTypes (236 out of 577) match on an AAT term, with 98 matching on a preferred term and 138 matching on a non-preferred term. 8% of these 41% represent terms which match on more than one AAT term.

nameCreator and ULAN
• All nine institutions contributed nameCreator data values. The count of deduplicated top-100 nameCreators is 838 across the nine contributing institutions. 37% of nameCreators (314 of the 838) match on a ULAN term, with 213 matching on a preferred term and 101 matching on a non-preferred term. 1% of these 37% represent terms which match on more than one ULAN term.
• A small disclaimer: for two contributors, large numbers of names did not match because they are inverted and missing a comma, such as Warhol Andy, Adams Ansel, Whistler James Mcneill, Saint Laurent Yves, etc. Had these names, which do exist in ULAN, matched, a higher overall match rate would have been the result.

roleCreator and AAT
• 7 out of 9 institutions contributed roleCreator data values. The count of deduplicated top-100 roleCreators is 232 across the seven contributing institutions. (As would be expected, for most institutions, the sum total of all their roleCreators is less than 100.)
• 41% of roleCreators (95 of the 232) match on an AAT term, with 9 matching on a preferred term and 86 matching on a non-preferred term. 7% of these 41% represent terms which match on more than one AAT term.

subjectTerm and TGN, AAT
• subjectTerm was only used by two institutions, and therefore does not constitute a compelling sample. In addition, matching subject terms on TGN produced many hits which were indeed a letter-for-letter equivalent to the museum data, but where the intended semantic concept was not a place name: the subjectTerm "commerce", for example, matched on nine place names in TGN (inhabited places in Alabama, Georgia, Michigan, Mississippi, Missouri, Oklahoma and Tennessee), yet referred predominantly to a collection of photographs taken during the Great Depression with captions such as "Untitled (Sig. Klein Fat Men's Shop, 52 Third Avenue, New York City)".
locationName and TGN
• Ironically, data values which (at least in appearance) mimicked the hierarchical style of the thesaurus (such as "north America, american southwest, united states, new mexico, acoma pueblo") did not match on their entry in TGN. While they would provide a human user with unambiguous information about the place in question, for a machine match, "Acoma Pueblo" unadorned would have made the connection in our test. On the other hand, single values like "florence" often produced multiple hits: Florence, Italy, or which of the 42 inhabited places called "Florence" in the United States?

nationalityCreator and TGN
• In many instances, the data for nationalityCreator contained concatenated strings, such as "belgium, brussels, 18th century" or "american, born england," which could not be matched on TGN without further processing of the source data.

In summary, the vocabulary matching exercise indicates that even knowing the source thesaurus would have been only marginally helpful in extending the museum data with the rich information available in thesauri. Performing some processing on the museum data could have created higher match rates. However, the value of using controlled vocabularies for search optimization or data enrichment can only be fully realized if the termsourceID is captured alongside termSource to establish a firm lock on the appropriate vocabulary term.

Economically Adding Value: Controlling More Terms

The high record count with which many individual data values on the top-100 lists are associated suggests opportunities for adding value to the data by controlling a relatively low number of terms with impact on a relatively high number of records per data set. Consider the example for objectWorkType values in records from the Metropolitan Museum of Art depicted in Figure 14:

• The top 100 objectWorkTypes represent 99% of objectWorkType values in all 112,000 Metropolitan records.
• The top 100 objectWorkTypes matching on either a preferred or a non-preferred AAT term represent 27 matches, which is equal to 73% of all Metropolitan records. (8 of these 27 matches contain terms which match on more than one AAT entry.)
• The top 100 objectWorkTypes not matching on any AAT term represent 73 values, which is equal to 26% of all Metropolitan records.

In other words, by tending to 73 objectWorkType values and disambiguating an additional 8, the Metropolitan could extend objectWorkType control to 99% of all 112,000 Metropolitan Museum records.

Figure 14. Top 100 objectWorkTypes and their corresponding records for the Metropolitan Museum of Art
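The arithmetic behind this "low-hanging fruit" argument is easy to operationalize; the sketch below, with made-up counts and a made-up matched set, ranks unmatched top-100 values by the number of records each would bring under control.

    # A sketch of prioritizing clean-up: list unmatched top-100 values in
    # descending order of record impact. Counts and matches are illustrative.
    from collections import Counter

    object_work_types = Counter({"print": 40000, "photograph": 31000,
                                 "drawing-watercolor": 9000, "albumen print": 4000})
    matched_in_aat = {"print", "photograph"}     # from the vocabulary-matching pass

    queue = [(count, value)
             for value, count in object_work_types.most_common(100)
             if value not in matched_in_aat]
    for count, value in sorted(queue, reverse=True):
        print(f"{count:>7}  {value}")            # fix these first for the biggest payoff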
These numbers by and large hold true for objectWorkType values across the aggregation:

• The top 100 objectWorkTypes for all 8 contributors combined represent 94% of objectWorkTypes in all 847,000 records.
• The top 100 objectWorkTypes for all 8 contributors combined matching on either a preferred or a non-preferred AAT term represent 236 values, which is equal to 64% of all aggregate records. (46 of these 236 matches contain terms which match on more than one AAT entry.)
• The top 100 objectWorkTypes for all 8 contributors combined not matching on any AAT term represent 341 values, which is equal to 30% of all aggregate records.

In other words, by tending to 341 objectWorkTypes and disambiguating an additional 46, control for objectWorkType could be extended to 94% of all 847,000 records in the aggregate collection.

A data element like objectWorkType would be expected to produce favorable numbers in this type of analysis: by its very nature, objectWorkType contains a relatively low number of values which presumably reappear across many records in a collection. For nameCreator, a data element which has a relatively high number of values across aggregation records, one would expect a less impressive result from focusing on top-100 terms. Consider the example depicted in Figure 15 from the Harvard Art Museum:

• The top 100 nameCreators represent 50% of nameCreators in all 236,000 Harvard records.
• The top 100 nameCreators matching on either a preferred or a non-preferred ULAN term represent 49 matches, which represent 25% of all Harvard records. (2 of the top 49 matches contain terms which match on more than one ULAN entry.)
• The top 100 nameCreators not matching on any ULAN term represent 51 values, which represent 26% of all Harvard records.

In other words, by tending to 51 nameCreator values and disambiguating an additional two, Harvard could extend nameCreator control to 50% of all 236,000 Harvard records. While the overall percentages for nameCreator are necessarily lower than for objectWorkType, doubling the rate of control still constitutes a formidable result.

Figure 15. Top 100 nameCreators and their corresponding records for the Harvard Art Museum

Again, the comparison statistics across the aggregation:

• The top 100 nameCreators for all 9 contributors combined represent 49% of nameCreators in all 888,000 aggregation records.
• The top 100 nameCreators for all nine contributors combined matching on either a preferred or a non-preferred ULAN term represent 314 values, which represent 17% of all aggregate records. (Eight of these 314 matches contain terms which match on more than one ULAN entry.)
• The top 100 nameCreators for all nine contributors combined not matching on any ULAN term represent 524 values, which is equal to 31% of all aggregate records.

In other words, by tending to 524 nameCreators and disambiguating an additional eight, control for nameCreator could be extended to 49% of all 888,000 aggregation records.

Connections: Data Values Used Across the Aggregation

Apart from evaluating conformance to CDWA Lite and vocabulary use, OCLC Research at least dipped its toe into the murky waters of testing for interoperability. By asking questions about how consistently data values appeared across the nine contributors to the aggregation, some first impressions of cohesion can be triangulated. This investigation concentrated on a select number of data elements which are required / highly recommended by CDWA Lite, widely used by contributors, and presumably of prominent use for searching and browse lists. For these elements, the top 100 values for each museum were cross-checked against the other contributors to establish how many institutions use the same value, and with what frequency (i.e., in how many records). Figure 16 provides a small sample from the resulting spreadsheets.
nameCreator
  Value                Contributors   Records
  parmigianino         8              783
  raphael              8              1127
  abbott, berenice     6              572
  albers, josef        6              564
  beuys, joseph        6              975
  blake, william       6              1054
  bonnard, pierre      6              623
  bourne, samuel       6              806
  brandt, bill         6              611
  chagall, marc        6              2207

nationalityCreator
  Value                Contributors   Records
  american             9              248206
  australian           9              1578
  austrian             9              2443
  belgian              9              1057
  brazilian            9              221
  british              9              50924
  canadian             9              15776
  chinese              9              2905
  cuban                9              188
  danish               9              591

roleCreator
  Value                Contributors   Records
  artist               6              475747
  engraver             6              6465
  printer              6              14009
  publisher            6              25535
  designer             5              19130
  editor               5              1704
  etcher               5              1822
  lithographer         5              1496
  painter              5              1376
  architect            4              667

objectWorkType
  Value                Contributors   Records
  sculpture            8              10369
  print                6              175588
  photograph           6              86017
  drawing              6              58140
  painting             6              15837
  furniture            6              2206
  book                 5              15644
  paintings            5              9612
  ceramic              5              5562
  textiles             5              3898
  textile              5              1775
  portfolio            5              861
  calligraphy          5              822
  glass                5              800
  manuscript           5              534
  poster               4              3493
  metalwork            4              2826
  plate                4              1851
  costume              4              956
  album                4              823
  wallpaper            4              500
  sketchbook           4              356
  frame                4              297
  collage              4              132
  mosaic               4              116
  prints               3              8814
  negative             3              7235
  photographs          3              7081
  jewelry              3              3662
  vase                 3              2663
  dish                 3              2440
  bowl                 3              2328
  tile                 3              1707
  ring                 3              1279
  medal                3              1277
  jug                  3              1271

Figure 16. Most widely shared values across the aggregation for nameCreator, nationalityCreator, roleCreator and objectWorkType

With a small amount of additional processing, the spreadsheets underlying these figures allow statements about how many values in a specific data element are shared across how many institutions, and how many records these values represent. Figures 17 and 18 explore what can be learned about the aggregation's cohesiveness by looking at nationalityCreator and objectWorkType.

Figure 17. nationalityCreator: relating records, institutions and unique values

For nationalityCreator, the distribution of unique values and associated records across the nine participating institutions matches what one would expect from browsing the data, given the preponderance of nationalityCreator values from a small set of countries. A relatively small number of unique values (28) are present in the data from all 9 participants, and correlate to a large number of records (554K). Additionally, one would expect to see many unique values represented in the data shared by one or two institutions, with correspondingly low numbers of associated records, for those nationalities that are less common. Sure enough, 408 nationalityCreator values are held by any two or a single institution, representing 26K records, or 2.9% of the entire aggregate. Given this appraisal, nationalityCreator values seem to form a coherent set of data values.
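The cross-checks behind Figures 16 and 17 reduce to grouping each value by the set of contributors that use it; the sketch below uses stand-in per-museum counts.

    # A sketch of the Figure 16/17 cross-check: for each value, how many
    # contributors use it and how many records it covers in the aggregation.
    from collections import Counter, defaultdict

    counts_by_museum = {                              # stand-in per-museum indexes
        "Museum A": Counter({"american": 51000, "british": 9000}),
        "Museum B": Counter({"american": 32000, "danish": 120}),
    }

    contributors = defaultdict(set)
    records = Counter()
    for museum, counts in counts_by_museum.items():
        for value, n in counts.items():
            contributors[value].add(museum)
            records[value] += n

    for value in sorted(contributors, key=lambda v: (-len(contributors[v]), -records[v])):
        print(f"{value:10s} {len(contributors[value])} contributors, {records[value]:>6} records")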
Figure 18. objectWorkType: relating records, institutions and unique values

For objectWorkType, the distribution of unique values and associated records across the nine participating institutions shows a less coherent mix. (In part, this can be attributed to small differences in values preventing a match, such as the singular and plural forms of a term evident for objectWorkType in Figure 18: print(s), photograph(s), painting(s), textile(s), etc.) Since only eight institutions contributed objectWorkType data, no unique objectWorkType values are represented across all contributors. Only one value is found in the data of eight contributors. The first spike in the graph occurs at five values found in the data of six museums. Though those five values account for a significant number of records (338K, or 38% of the aggregate), one would have expected the numbers for both widely shared values and corresponding record counts to be higher. Moving on to the spike at values present in a single museum's data, it is hard to conceive that the high number of unique values (404) representing a high number of records (273K, or 30% of the aggregate) accurately reflects the underlying collections. The large number of unique objectWorkTypes suggests an opportunity to reduce noise in the data via more rigorous application of controlled vocabulary (according to Figure 13, the current match rate to AAT is 41%), which would produce a set of unique values that would be more sensible to browse.

Enhancement: Automated Creation of Semantic Metadata Using OpenCalais™

The analysis project made a small foray into exploring automatic ways of enhancing the museum records by exposing a few select records from the MDE aggregation to the OpenCalais Web Service (Thomson Reuters n.d.). The OpenCalais Web Service processes text into semantic metadata, i.e., it locates entities (people, places, products, etc.), facts (John Doe works for Acme Corporation) and events (Jane Doe was appointed as a Board member of Acme Corporation). As the examples in parentheses, which come from the OpenCalais FAQ, indicate, the Web Service is mainly oriented towards commercial data, but cultural institutions like the Powerhouse Museum (Chan 2008) have also explored its potential.

The results of applying OpenCalais to select MDE records suggest that records with unstructured narrative description especially will benefit, sometimes quite significantly, but even less completely described records can benefit by adding more semantic value to certain elements (e.g., parsing a location name into city, state/province, and country names) and by finding additional names for groups, people, and events within notes and other CDWA Lite elements. OpenCalais also showed surprising skill at transforming certain strings into categories and values. For example, it was able to generate the category and value "Currency:pence" from the string "this chair which cost 16/6 (82p)."

Here are the CDWA Lite access points for an MDE record with a moderate level of structured text, and no unstructured or narrative description:

objectWorkType: Photograph
title: Claude Monet
nameCreator: Larchman, Harry
roleCreator: Artist
nameCreator: Monet, Claude
roleCreator: Portrait sitter
earliestDate: 1900
latestDate: 1909
locationName: The Metropolitan Museum of Art, New York, NY, USA

When OpenCalais processes these access points as a single string of terms:

Photograph; Claude Monet; Larchman, Harry; Artist; Monet, Claude; Portrait sitter; 1900; 1909; The Metropolitan Museum of Art, New York, NY, USA
it finds the following matching categories and values:

City: Art
Country: United States
Facility: The Metropolitan Museum of Art
Person: Claude Monet
Position: Artist
ProvinceOrState: New York, United States

OpenCalais also returns a confidence level for these assertions, not shown here, that could help a system demote the "city of Art" it has identified. But if the MDE terms are plugged into a template that emulates wall label text, for example:

In this photograph created during the years 1900-1909, the artist Harry Larchman has portrayed the subject Claude Monet. The photograph is in the collection of the Metropolitan Museum of Art, New York, NY, USA.

then OpenCalais finds more concepts:

City: Art
Country: United States
Facility: Metropolitan Museum of Art
Person: Claude Monet
Person: Harry Larchman
Position: artist
Province or State: New York, United States
Generic Relations: portray, Harry Larchman, Claude Monet
Person Career: Harry Larchman, artist, professional, current

This relatively simple step of providing a template of narrative structure around specific data element values from the CDWA Lite record source may be an important consideration in any projects that attempt to further enrich or extend the data using tools that look for semantic value within sentence structure, such as OpenCalais.
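The templating step itself is trivial; the sketch below shows one way to phrase it, with CDWA Lite-style field names and a stubbed-out call standing in for the actual OpenCalais request.

    # A sketch of wrapping CDWA Lite values in wall-label prose before entity
    # extraction; extract_entities is a stand-in for the real service call.
    def wall_label(rec):
        work = rec["objectWorkType"].lower()
        return (f"In this {work} created during the years "
                f"{rec['earliestDate']}-{rec['latestDate']}, the "
                f"{rec['roleCreator'].lower()} {rec['nameCreator']} has portrayed "
                f"the subject {rec['title']}. The {work} is in the collection of "
                f"{rec['locationName']}.")

    record = {"objectWorkType": "Photograph", "title": "Claude Monet",
              "nameCreator": "Harry Larchman", "roleCreator": "Artist",
              "earliestDate": "1900", "latestDate": "1909",
              "locationName": "the Metropolitan Museum of Art, New York, NY, USA"}

    narrative = wall_label(record)
    # entities = extract_entities(narrative)   # hypothetical entity-extraction call
    print(narrative)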
A Note About Record Identifiers

Persistent and unique record identifiers are essential for supporting linking and retrieval, as well as data management of records across systems. Without a reliable identifier it is difficult or impossible to match incoming records for adds, updates, and deletes, or to link to a specific record. In other words, a reliable identifier unlocks one of the chief benefits of using OAI-PMH for data sharing: the ability to keep a data contribution to a third party current with changes in the local museum system. There are four places in the OAI-PMH and CDWA Lite data where unique record identifiers may be supplied:

• recordID: CDWA Lite Required, CCO Required. "A unique record identification in the contributor's (local) system"10
• recordInfoID: Not required or recommended. "Unique ID of the metadata. Record Info ID has the same definition as Record ID but out of the context of the original local system, such as a persistent identifier or an oai identifier"
• workID: CDWA Lite Highly Recommended, CCO Required. "Any unique numeric or alphanumeric identifier(s) assigned to a work by a repository"
• OAI-PMH Identifier: OAI Required. Schema, repository, and item id, e.g., oai:artsmia.org:10507

Among the data contributors, six of nine used recordID and recordInfoID, eight of nine used workID, and some used both. (Both were defined in the default COBOAT mapping.) All contributors used at least one of the CDWA Lite identifiers. The OAI-PMH identifier is required by the protocol, and when available, can help disambiguate record identifiers that would otherwise not be unique across repositories. As noted earlier, six of nine museums used OAI to contribute data. If relied upon, the OAI identifier needs to be static (changing repository IDs adds volatility) and available along with the CDWA Lite data in whatever system incorporates the data. When identifiers are supplied that are not unique across an aggregation, disambiguation problems develop. As depicted in Figure 19, the identifier "1953.155" is used by two different contributors for two different works.

Figure 19. Screenshot of a search result from the research aggregation

From a data aggregator's point of view, the key concern is not which data element is used to provide an identifier, but that the same element be used consistently by all contributors. Given its required nature, the recordID element seems like a good candidate for consensus. Aggregators can supplement it with information about the contributor to make it unique across the collection (e.g., by using the OAI identifier, or by following its conventions in case not all contributors use OAI), which will help ensure efficient and reliable record processing and retrieval.
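One low-cost implementation of that recommendation is to namespace each local recordID with a per-contributor qualifier in the style of OAI identifiers; a sketch follows, with illustrative repository names.

    # A sketch of aggregator-side identifier disambiguation: qualify local
    # recordIDs with a contributor namespace, following OAI identifier style.
    def aggregate_id(repository, record_id):
        return f"oai:{repository}:{record_id}"

    incoming = [("museum-a.example.org", "1953.155"),
                ("museum-b.example.org", "1953.155")]   # same local ID, two works

    seen = {}
    for repo, local_id in incoming:
        qualified = aggregate_id(repo, local_id)
        assert qualified not in seen, f"collision on {qualified}"
        seen[qualified] = (repo, local_id)
    print(sorted(seen))   # two distinct records despite the shared local ID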
Patricia Harpring's CCO Analysis

The OCLC Research data analysis largely focused on testing for conformance against stipulations made by the CDWA Lite specification. The rules outlined in CDWA Lite conform to the much more comprehensive set of guidelines laid out by its companion data content standard CCO, as well as the full CDWA online (Getty Trust, J. Paul 2009). However, a more rigorous evaluation of values supplied by contributors against CCO did not lend itself to the kind of machine-processed matching within OCLC Research's skills, and called for a deep familiarity with the data content standard. OCLC Research contracted with CCO co-author Patricia Harpring, supported by Antonio Beecroft (both from the Getty Research Institute), to spend some of their weekend and vacation time evaluating the CCO-ness of the data.

As a basis for this additional analysis, OCLC Research provided Patricia with Pears reports for the 20 data elements which either CDWA Lite or CCO mark as required / highly recommended. Each of the elements was represented by its top 100 most frequently occurring values. Patricia drew up scoring principles for the spreadsheets, which she and Antonio used to evaluate the top 20 values for each element from the 9 contributors (see Figure 20).

Figure 20. objectWorkType spreadsheet (excerpt) for CCO analysis, including evaluation comments

Patricia and Antonio subtracted from an institution's score in particular for missing data, multiple terms in a single element, data mismatches (i.e., roleCreator containing attributionQualifier information), uncontrolled terms, as well as the title "Untitled" (CCO requires a descriptive title) and the displayCreationDate "Undated" (CCO requires an approximate date).

Figure 21 provides the overall scoring pattern for the core elements under examination, and gives the impression that overall, the MDE contributors exhibited considerable conformance to CCO. In Patricia's own words: "The nine sets of data analyzed for compliance in this study scored quite well. Many of the points deducted in scoring were due to mapping and parsing issues that could be easily corrected. The most frequent issues concerned having multiple terms in one field or missing data that is 1) probably actually available in the institution's local data base (e.g., a missing Work ID) or 2) could be filled using suggested default values (e.g., globally supply "artist" or "maker" for missing Creator Role)." (Harpring 2009)

Figure 21. Overall scores from CCO evaluation (each bar represents a museum)

More details on the scoring criteria themselves, as well as a brief discussion of analysis results, can be found in Patricia's document, "Museum Data Exchange CCO Evaluation: Criteria for Scoring" (Harpring 2009).

Third Party Data Analysis

The agreements with participating museums described in the section "Legal agreements" include a provision which allows a third-party researcher to take possession of the data under the terms of the original letters of agreement, analyze the data, and publish findings. From the outset, OCLC Research realized that it could contribute some knowledge about the characteristics of the data, but that other entities with different tools, interests, and (perhaps) contextualizing data could bring additional findings to light. Initially, some of the museums themselves had also expressed an interest in taking a methodical look at the aggregate data. (See OCLC Research n.d.d. for more information on third party analysis.)

Compelling Applications for Data Exchange Capacity

The design of the MDE project allowed participants to experiment and gain experience with sharing data without necessarily having settled the policy issues the museum community at large is still grappling with. While this approach by and large succeeded (witness the creation of tools, the sharing of data, the lessons in the data analysis), the absence of real-life requirements, a real-life audience and real-life applications for the data made it difficult for the museums to calibrate their data for submission, and for OCLC Research to evaluate it for suitability.

To survive and thrive, museum data sharing at participating institutions will need to outgrow the sandbox, become sanctioned by policies, and be applied in service of goals supporting the museum mission. Needless to say, sharing data is not a goal unto itself, but an activity which needs to drive a process or application of genuine interest to an individual museum. As a natural by-product of the grant work, participants and project followers surfaced and discussed potential applications for CDWA Lite / OAI-PMH in a museum setting, and the project invited representatives from OMEKA, ARTstor, ArtsConnectEd and CHIN, as well as Gallery Systems and Selago Design, to participate in our final meeting at the Metropolitan Museum of Art on July 27, 2009. In the meantime, some of the museums have already put their new sharing infrastructure to work in production settings; others are actively exploring their options. Below are brief sketches of the institutional goals a CDWA Lite / OAI-PMH capacity could support; all of them build on the same basic harvesting step, sketched after this paragraph.
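A minimal OAI-PMH ListRecords loop is shown below; the endpoint URL is a placeholder for any given OAICatMuseum installation, and the "cdwalite" metadata prefix is an assumption rather than a confirmed configuration value.

    # A minimal OAI-PMH harvesting loop; endpoint and metadata prefix are
    # placeholders, not values from the MDE project.
    import requests
    from lxml import etree

    OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
    ENDPOINT = "http://museum.example.org/oaicatmuseum/OAIHandler"

    params = {"verb": "ListRecords", "metadataPrefix": "cdwalite"}
    while True:
        response = requests.get(ENDPOINT, params=params, timeout=60)
        doc = etree.fromstring(response.content)
        for record in doc.iter(f"{OAI_NS}record"):
            print("harvested", record.findtext(f"{OAI_NS}header/{OAI_NS}identifier"))
        # the protocol pages large result sets via resumption tokens
        token = doc.findtext(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
        if not token:
            break
        params = {"verb": "ListRecords", "resumptionToken": token}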
Goal: Create an exhibition Web site, or publish an entire collection online
• OMEKA is a free, open-source, Web-based publishing platform for collections created by the Center for History and New Media, George Mason University. For a museum to take advantage of this tool, a core set of data has to migrate from the local collections management system into the OMEKA platform. A museum can use COBOAT and OAICatMuseum to create an OAI data content provider, and OMEKA's OAI-PMH Harvester plug-in (George Mason University n.d.) to ingest and update the data. At least one of the MDE project participants supported the test of this plug-in by providing access to their OAI-PMH installation, and others may follow.

Goal: Add a tagging feature to your online collection
• Steve: The museum social tagging project is a collaboration of museum professionals exploring the benefits of social tagging for cultural collections. As part of its research, Steve has created software for a hosted tagging solution. For a museum to take advantage of this tool, its data has to be loaded into the tagging application. The Steve tagger can harvest OAI-PMH repositories of data, and accepts data in both CDWA Lite and Dublin Core. The project team envisions that a future local version of the tagger will contain the same ingest functionality.

Goal: Disseminate authoritative descriptive records of museum objects
• The Cultural Objects Name Authority (CONA™), a new Getty vocabulary of brief authoritative records for works of art and architecture, is slated to be available for contributions in 2011. For museums, contribution to CONA ensures that records of their works as represented in visual resources or library collections are authoritative. Although CONA is an authority, not a full-blown database of object information, it complies with the cataloging rules for adequate minimal records described in CDWA and CCO. As Patricia Harpring outlined during our final meeting, the vocabulary editorial team will accept contributions in CDWA Lite XML or in the larger CONA contribution XML format.

Goal: Expose collections for K-12 educators, students and scholars
• ArtsConnectEd is an interactive Web site that provides access to works of art and educational resources from the Minneapolis Institute of Arts and the Walker Art Center. The Institute and the Walker pool their resources using a CDWA Lite / OAI-PMH infrastructure, and the Institute of Arts has successfully implemented the MDE tools to contribute to this aggregation. At the final face-to-face MDE meeting, both ArtsConnectEd representatives and grant participants speculated about whether the resource could grow to include additional contributors. Robin Dowden (Walker) demonstrated a private research prototype site including the MDE datasets from the National Gallery of Canada and the Harvard Art Museum, as well as records from the Brooklyn Museum (6K) accessed via an API. These three new datasets had been loaded into ArtsConnectEd within 72 hours by Nate Solas (Walker), and even without any smoothing around the edges, the five-institution version of ArtsConnectEd provided a solid experience: "CDWA Lite format and indexing is very good at first glance," as Robin observed.

Goal: Expose collections to higher education
• As one of the original co-creators of CDWA Lite, ARTstor welcomes contributions in CDWA Lite. During our final face-to-face project meeting, Bill Ying and Christine Kuan (both ARTstor) emphasized that ARTstor is eager to create relationships with data contributors in which repeat OAI harvesting to support updating and adding to the data becomes a matter of routine.
ARTstor currently counts 80 international museums among its contributors, yet only a very small minority of them have contributed data via OAI. As an outcome of the MDE project, both the Harvard Art Museum and the Princeton University Art Museum are actively exploring OAI harvesting with ARTstor, while three additional participants have signaled that this would be a likely use for their OAI infrastructure as well.

Goal: Effectively expose the collective collection of a university campus (collections from libraries, archives and museums)
• At Yale University, both the Yale University Art Gallery (grant participant) and the Yale Center for British Art (an avid follower of the project) are implementing COBOAT / OAICatMuseum with the goal of using this capacity for a variety of university-wide initiatives. The museums contribute data to a cross-collection search effort via OAI-PMH (Princeton has similar ambitions), and the Yale museums will also use the same set-up to sync data with OpenText's Artesia, a university-wide digital assets management system. In addition, the Art Gallery proposes to use CDWA Lite XML records to share data with the recipients of traveling collections.

Goal: Aggregate collection data for national projects
• The Canadian Heritage Information Network (CHIN) is currently redeveloping Artefacts Canada (http://www.chin.gc.ca/English/Artefacts_Canada/), a resource of more than 3 million object records and 580,000 images from hundreds of museums across the country. So far, the resource grows largely via contributions of tab-delimited files and spreadsheets representing museum data, and CHIN would like to explore other mechanisms for aggregation. Corina MacDonald (CHIN), who attended the MDE final project meeting, speculated that a test bed of large Canadian institutions using COBOAT / OAICatMuseum might provide lessons for a way forward.
• Another example of a national aggregation project, this time from the UK: a venture of the Collections Trust, the Museums, Libraries and Archives Council (MLA), the European Commission and technical partner Knowledge Integration Ltd, Culture Grid pulls together information from UK library, archive and museum databases, and then opens up this content to media partners such as Google and the BBC to ensure that it is available to as wide an audience as possible (Collections Trust n.d.). To drive data into the Culture Grid, the Collections Trust has created an SDK for collections management system providers, which allows them to easily integrate functionality for offering up structured DC via an OAI-PMH data content provider. While this effort is not built around CDWA Lite XML, the general strategy of opening up collections by providing a low-barrier export mechanism is parallel and complementary to the MDE work.

Conclusion: Policy Challenges Remain

In his insightful article "Digital Assets and Digital Burdens: Obstacles to the Dream of Universal Access," already cited in the introduction, Nicholas Crofts (2008, 2) provides a list of false premises for data sharing:

i. Adapting to new technology is the major obstacle to achieving universal access
ii. The corpus of existing digital documentation is suitable for wide-scale diffusion
iii. Memory institutions want to make their digital materials freely available

Ironically, these premises can be viewed as structuring the MDE project. The MDE grant posited that a joint investment in shareable tools (cf. i) might help force the policy question of how openly to disseminate data (cf. iii), while also allowing an investigation into how suitable museum descriptions are for aggregation (cf. ii). At the end of the day, however, there is no disagreement with Crofts's position: ultimately, policy decisions allow data sharing technology to be harnessed, or create the impetus to upgrade descriptive records. In the case of the MDE museums, all of them had enough institutional will towards data sharing to participate in this project. As a result of this project, some have already used their new capacity for data exchange to drive mission-critical projects.

A quick recap of the most significant developments catalyzed by the MDE tools: the Minneapolis Institute of Arts uses the MDE tools to contribute data to ArtsConnectEd; the Yale University Art Gallery and the Yale Center for British Art use the tools to share data with a campus-wide cross-search, and to contribute to a central digital asset management system; the Harvard Art Museum and the Princeton University Art Museum are actively exploring OAI harvesting with ARTstor, while three additional participants have signaled that this would be a likely use for their OAI infrastructure as well. Obviously, it is too early to judge the ultimate impact of making the MDE suite of tools available, yet these developments are promising.

While the data analysis efforts detailed in this paper cannot be viewed as a conclusive measure of the fitness of museum descriptions, they ultimately leave a positive impression: the analysis shows good adherence to applicable standards, as well as reasonable cohesion. Where there is room for improvement, some fairly straightforward remedies can be employed. Significant improvements in the aggregation could be achieved by revisiting data mappings to allow for a more complete representation of the underlying museum data. Focusing on the top 100 most frequently occurring values for key elements will impact a high number of corresponding records, and would be low-hanging fruit for data clean-up activities. Museums engaging in data exchange will learn new ways to adapt and improve their data output every time they share, and the MDE experiment was just the first step on that journey.

At the end of the day, the willingness of museums to share data more widely is tied to compelling applications for that shared data. When there are applications for sharing data which directly support the museum mission, more data is shared. When more data is shared, more such compelling applications emerge. This chicken-and-egg conundrum presents a challenge to both museum policy makers and those wishing to aggregate data. The list of aggregators, platforms, projects and products provided in the previous chapter which support data exchange using CDWA Lite / OAI provides hope that these compelling applications will move museum policy discussions forward.

In the summation of his paper, Nicholas Crofts lays out what is at stake: "[O]ther organisations and individuals are actively engaged in producing attractive digital content and making it widely available.
Universal access to cultural heritage will likely soon become a reality, but museums may be losing their role as key players." (Crofts 2008, 13) No matter which museum you represent, a search on Flickr® for your institution's name viscerally confirms the validity of this prediction.

It seems appropriate to close this paper with the words of a man who fought this same policy battle 40 years ago. While the 1960s were a different time indeed, the arguments sound quite familiar. In an oral history interview from 2004, here is how Everett Ellin remembers making his case for the digital museum and shared data:

"So museums should know how to reduce all records, all registrar records, records of accessions, to a digital file, and each file is kept in an archive, a digital archive, and we tie all the archives together by a computer network. We take all these archives and we link them up, and then when a technology comes that I know is certain that will let you take reasonably good photos digitally, then we will make digital files of those photos and we will put that in a separate part of the same archive—I have that language from 1966 in print—and we will begin to get into the electronic age. And we or you will be ready for the day when you see what I mean about that you are a medium, and that you have to stand toe to toe with mass media, because it's going to be a battle of images inevitably—inevitably." (Kirwin, 2004)

Appendix A. Project Participants

The following individuals played a major role in the success of the Museum Data Exchange project by contributing their expertise, perspective and time.

Grant-funded museum participants:
Andrea Bour, Doug Hiwiller, Holly Witchey (Cleveland Museum of Art)
Jeff Steward (Harvard Art Museum)
Piotr Adamczyk, Michael Jenkins, Shyam Oberoi (Metropolitan Museum of Art)
Peter Dueker (National Gallery of Art)
Cathryn Goodwin (Princeton University Art Museum)
Alexander Macfie, Alan Seal (Victoria & Albert Museum)
Ariana French, Thomas Raich, Tim Speevack (Yale University Art Gallery)

Additional museum participants:
Andrew David, Michael Dust, Jim Ockuly (Minneapolis Institute of Arts)
Sonya Dumais, Greg Spurgeon (National Gallery of Canada)

Additional contributors:
Christine Kuan, William Ying (ARTstor)
Corina MacDonald, Anne-Marie Millner (Canadian Heritage Information Network)
James Safley, Tom Scheinfeldt (Center for History and New Media, George Mason University)
Nick Poole (Collections Trust)
Patricia Harpring and Antonio Beecroft (Consultants)
Robb Detlefs (Gallery Systems)
Scott Sayre (Sandbox Studios)
Gayle Silverman, James Starrit (Selago Design)
Robin Dowden, Nate Solas (Walker Art Center)

Cogapp:
Ben Rubinstein, Stephen Norris, Mat Walker

OCLC Research:
Ralph LeVan, Günter Waibel, Bruce Washburn, Jeff Young
Appendix B. Outputs of the Museum Data Exchange Activity

Tools:
COBOAT: http://www.oclc.org/research/activities/coboat/
OAICatMuseum 1.0: http://www.oclc.org/research/activities/oaicatmuseum/
Connecting with other users of COBOAT and OAICatMuseum, and exchanging COBOAT application profiles for different databases: http://sites.google.com/site/museumdataexchange/
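The report does not include sample client code, but the harvesting workflow these tools enable is easy to sketch: COBOAT extracts records from a collection management system, and OAICatMuseum then exposes them through the OAI-PMH protocol. The snippet below is a minimal, hypothetical harvesting client; the endpoint URL is invented, and the "cdwalite" metadata prefix is an assumption, since every OAI-PMH provider is only guaranteed to support "oai_dc".

# Minimal OAI-PMH harvesting sketch (hypothetical endpoint; not from the report).
# Uses only the standard ListRecords verb and resumption-token paging.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "http://museum.example.org/oaicatmuseum/OAIHandler"  # hypothetical

def list_records(base_url, prefix="oai_dc"):
    """Yield <record> elements, following OAI-PMH resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            tree = ET.parse(resp).getroot()
        for record in tree.iter(OAI + "record"):
            yield record
        token = tree.find(f"{OAI}ListRecords/{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no more pages
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

for rec in list_records(BASE_URL):
    print(rec.findtext(OAI + "header/" + OAI + "identifier"))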
Documents:

Patricia Harpring, Criteria for Scoring (CCO Evaluation). A document which outlines general findings from Patricia Harpring's CCO analysis, and her methodology for evaluating CDWA Lite against CCO. http://www.oclc.org/research/activities/museumdata/scoring-criteria.pdf

MDE Analysis Methodology. A document which outlines the array of possible analysis questions the project surfaced. http://www.oclc.org/research/activities/museumdata/methodology.pdf

CDWA Lite, CCO, COBOAT Mapping. A spreadsheet listing all content-bearing data elements and attributes defined by CDWA Lite, plus mappings to CCO. The document also indicates which of these data elements are part of the COBOAT default mapping. http://www.oclc.org/research/activities/museumdata/mapping.xls

All of these documents are available from http://www.oclc.org/research/activities/museumdata/default.htm

Bibliography

Chan, Sebastian. 2008. "OPAC2.0 – OpenCalais meets our museum collection / auto-tagging and semantic parsing of collection data." Fresh + New(er) (31 March). A Powerhouse Museum blog. http://www.powerhousemuseum.com/dmsblog/index.php/2008/03/31/opac20-opencalais-meets-our-museum-collection-auto-tagging-and-semantic-parsing-of-collection-data/

Crofts, Nicholas. 2008. Digital assets and digital burdens: obstacles to the dream of universal access. Paper presented at the annual conference of the International Documentation Committee of the International Council of Museums (CIDOC), September 15-18, in Athens, Greece. http://cidoc.mediahost.org/content/archive/cidoc2008/Documents/papers/drfile.2008-06-72.pdf [Conference program available from: http://cidoc.mediahost.org/content/archive/cidoc2008/EN/site/Home/t_section.html]

Collections Trust. n.d. "Culture Grid." http://www.collectionstrust.org.uk/culturegrid

Dowden, R., and S. Sayre. 2009. "Tear Down the Walls: The Redesign of ArtsConnectEd." In J. Trant and D. Bearman (eds). Museums and the Web 2009: Proceedings. Toronto: Archives & Museum Informatics. Published March 31. http://www.archimuse.com/mw2009/papers/dowden/dowden.html

Elings, M.W. and Günter Waibel. 2007. Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums. First Monday 12,3. http://firstmonday.org/issues/issue12_3/elings/index.html or http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1628/1543

Ellin, Everett. 1968. An international survey of museum computer activity. Computers and the Humanities 3,2: 65-86.

George Mason University. n.d. "Plugins/OaipmhHarvester." Omeka. Center for History and New Media. http://omeka.org/codex/Plugins/OaipmhHarvester

Getty Trust, J. Paul. 2009. Categories for the Description of Works of Art. Ed. Murtha Baca and Patricia Harpring (rev. 9 June by Patricia Harpring). http://www.getty.edu/research/conducting_research/standards/cdwa/

———. n.d. Categories for the Description of Works of Art Lite. Getty Research Institute. http://www.getty.edu/research/conducting_research/standards/cdwa/cdwalite.html

Getty Trust, J. Paul, and College Art Association. 2006. CDWA Lite: Specification for an XML Schema for Contributing Records via the OAI Harvesting Protocol. http://www.getty.edu/research/conducting_research/standards/cdwa/cdwalite.pdf

Harpring, Patricia. 2009. Museum Data Exchange: CCO Evaluation Criteria for Scoring. OCLC. http://www.oclc.org/research/activities/museumdata/scoring-criteria.pdf

IBM Federal Systems Division and Everett Ellin. 1969. An Information System for American Museums: a report prepared for the Museum Computer Network. Gaithersburg MD: International Business Machines Corporation. Smithsonian Institution Archives, RU 7432 / Box 19.

Kirwin, Liza. 2004. Oral history interview with Everett Ellin, 2004 Apr. 27-28. Archives of American Art, Smithsonian Institution. http://aaa.si.edu/collections/oralhistories/transcripts/ellin04.htm

Minneapolis Institute of Arts and Walker Art Center. n.d. ArtsConnectEd. http://www.artsconnected.org/

Misunas, Marla and Richard Urban. n.d. A Brief History of the Museum Computer Network. Museum Computer Network. http://www.mcn.edu/about/index.asp?subkey=1942

New Digital Group, Inc. n.d. Smarty: Template Engine. http://www.smarty.net/copyright.php

OCLC Research. n.d.a. CDWA Lite, CCO, COBOAT Mapping. An output of the Museum Data Exchange activity. http://www.oclc.org/research/activities/museumdata/mapping.xls

———. n.d.b. Metadata Schema Transformation Services. http://www.oclc.org/research/activities/schematrans/default.htm

———. n.d.c. Museum Collections Sharing Group. http://www.oclc.org/research/activities/museumdata/museumcollwg.htm

———. n.d.d. Museum Data Exchange. http://www.oclc.org/research/activities/museumdata/default.htm

Parry, Ross. 2007. Recoding the museum: digital heritage and the technologies of change. Museum meanings. London: Routledge.

Searching Museum Collections. n.d. Searching Museum Collections: A Research Project. https://museumsearch.pbworks.com/
OpenSiteSearch Community. n.d. OpenSiteSearch. http://opensitesearch.sourceforge.net/

Thomson Reuters. n.d. OpenCalais. http://www.opencalais.com/

Trant, Jennifer. 2007. "Searching Museum Collections on-line – what do people really do?" jtrant's blog. Archives & Museum Informatics (January 1). http://conference.archimuse.com/node/7424

Notes

1 The original members of the committee: Nancy Allen (ARTstor), Erin Coburn (J. Paul Getty Museum), Ken Hamma (Getty Trust), Michael Jenkins (Metropolitan Museum of Art), Nick Pool (MDA), Jenn Riley (Indiana University), Günter Waibel (OCLC Research)

2 Grant funds were used exclusively to off-set museum costs; to pay an external contractor for the creation of the data extraction tool; to pay an external contractor for a piece of data analysis; and to pay for travel to face-to-face project meetings. OCLC contributions for project management and data analysis were in-kind.

3 Smarty is a templating language; see New Digital Group, Inc. n.d.

4 MIMSY XG was previously owned by Willoughby, and was acquired by Selago Design in 2009.

5 Available as Open Source through the OpenSiteSearch project at SourceForge (OpenSiteSearch Community n.d.).

6 Available from OCLC Research n.d.d.

7 Of these 131 units of information-bearing data content, 67 are data elements, and 64 are attributes. For a detailed view of CDWA Lite's units of information-bearing data content, as well as a mapping to CCO, please see OCLC Research n.d.a.

8 There is a slight discrepancy between the schema and the documentation: in the schema, recordID and recordType are only required if their wrapper element (recordWrap) is present; the documentation, however, calls both data elements "required."

9 A detailed mapping of CDWA Lite to the COBOAT default mapping can be found in OCLC Research n.d.a.

10 All quotes in this block refer to Getty Trust, J. Paul 2006.

work_cs7atd4evzannojext4x3d3yxu ---- URI FAQs

URI FAQs
PCC URI Task Group on URIs in MARC
September 26, 2018

1. What is a URI?
2. What forms can a URI take?
3. What is an IRI?
4. What is a real world object?
5. Which URIs apply to linked data?
6. Why when I use a URI in a browser, does it send me to a different link?
7. What if I just want to add a link to a web page (e.g., an author's website)?
8. Will my ILS accept URIs?
9. Do any vendors provide URIs?
10. What MARC fields and subfields can URIs be added to?
11. What is the relationship of a URI in $0 to its MARC field and component subfields?
12. What is the difference between URIs in $0 and $1?
13. Why are skos:Concepts not considered Real World Objects (RWOs) with respect to $0 and $1?
14. Shouldn't the URIs in $0 and $1 be coordinated? Doesn't that create extra work?
15. Which URI sources should I use in my cataloging?
16. Is there a limit to the number of URIs I can use in one field?
17. Can I put URIs in name authority records?
18. Can I put URIs in bibliographic records in Connexion?
19. If I have the choice, is it preferable to put URIs in bibliographic or authority records?
20. When can/should I use the new field 758?
21. Why were linking entry fields (76X-78X) not included in the task force proposals?
22. How to formulate and obtain a linked data URI for a resource?
23. Are validators available for dereferenceable URIs?
24. What tools are available?
25. Where can I find training resources on URIs and linked data?
26. Are RDF URIs sensitive to use of http versus https?
27. Is permalink different from a canonical URI?

1. What is a URI?

Wikipedia defines a uniform resource identifier (URI) as follows: a string of characters that identifies a resource. A URI can be specified in the form of a URL or a URN. More information from Wikipedia: https://en.wikipedia.org/wiki/Uniform_Resource_Identifier. W3C defines a URI simply as an ASCII string used to identify things on the Semantic Web.

For cataloging professionals, a URI tends to be an HTTP uniform resource identifier. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, buildings, works of art, rivers, and books in a library can all be resources. Abstract concepts can also be resources. Other terms used for resources are entity and thing.

An entity may have an identifier established in an authority database, such as the Library of Congress/NACO Authority File via the LC Linked Data Service (http://id.loc.gov), or by a service that creates identifiers such as Wikidata (http://wikidata.org) or BBC Things (http://www.bbc.co.uk/things/). An identifier, constructed with a Web service protocol as a prefix, e.g. http://, is referred to as an HTTP URI. In the Resource Description Framework environment, an HTTP URI is a dereferenceable URI that facilitates operations from machine to machine.

2. What forms can a URI take?

URIs can be classified either as Uniform Resource Locators (URLs) or Uniform Resource Names (URNs). In addition to identifying a resource, URLs provide a means of locating the resource by describing its primary access mechanism, e.g. http:// or ftp:// or mailto:, etc. URNs uniquely identify a resource, but do not necessarily specify its location or how to access it.

Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. There are dozens of schemes, but the most common for library applications are http, https, ftp, and mailto. The scheme name is always followed by a colon. Some URI schemes, such as http and ftp, are associated with network protocols.

Examples of URIs include:
http://isni.org/isni/0000000034980992
http://viaf.org/viaf/130909670
https://doi.org/10.1037/arc0000014
ftp://ftp.fao.org/docrep/fao/005/H5365E/H5365E00.pdf
mailto:John.Doe@example.com
telnet://192.0.2.16:80/
urn:oasis:names:specification:docbook:dtd:xml:4.1.2
urn:isbn:0-679-73669-7
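A quick way to see the scheme/remainder anatomy described above is Python's standard urllib.parse module. The snippet below is a small illustration of ours, not part of the FAQ, splitting a few of the example URIs into their components.

# Small illustration: splitting the example URIs into scheme and remainder.
from urllib.parse import urlparse

examples = [
    "http://isni.org/isni/0000000034980992",
    "https://doi.org/10.1037/arc0000014",
    "mailto:John.Doe@example.com",
    "urn:isbn:0-679-73669-7",
]

for uri in examples:
    parts = urlparse(uri)
    # Every URI has a scheme; only "locator" schemes populate the network location.
    print(f"{parts.scheme:7} netloc={parts.netloc or '-':12} rest={parts.path}")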
3. What is an IRI?

IRI stands for Internationalized Resource Identifier; it is an extension of the URI scheme and is defined in RFC 3987 [1]. Whilst URIs contain characters from a subset of the ASCII character set, IRIs may contain the full range of Unicode characters. IRIs are of benefit to institutions wishing to mint persistent identifiers in a variety of scripts. However, they are more susceptible to IDN homograph attack [2]. Additionally, support for IRIs among the various Semantic Web technology tools is still uneven. [3]

Further information about IRIs is available here: http://www.w3.org/TR/rdf11-concepts/#section-IRIs

[1] http://www.ietf.org/rfc/rfc3987.txt
[2] https://en.wikipedia.org/wiki/IDN_homograph_attack
[3] http://svn.aksw.org/papers/2010/ISWC_I18n/public.pdf

4. What is a real world object?

A real world object is an entity, such as a person, place, etc. It can be actual or conceptual. It is often referred to as a Thing. W3C's document, Cool URIs (https://www.w3.org/TR/cooluris/), states a convention for distinguishing a Thing and documents about the Thing (e.g. a webpage or authority data).

5. Which URIs apply to linked data?

When considering the use of URIs in the context of linked data, the question isn't so much which URIs apply to linked data, but what the function of a URI is in that context. Very broadly, URIs in a linked data context help to establish knowledge about an object.
This knowledge may be in the form of relationships, or concepts, or facts; the most important aspect of the URI, though, is that this information is developed to be consumed by machines, not people. One of the common misconceptions when assigning data to a $0 of a MARC record is that any URI that points to information about the object is a valid one, i.e., that this field enables catalogers to provide users with more information about a term, a concept, a person, etc. And indirectly, it does, but not in a way that the public directly consumes. Linked data URIs create the bridges that allow systems to share and understand information, and in this context, only URIs that point to machine-actionable data should be utilized.

6. Why when I use a URI in a browser, does it send me to a different link?

This is called dereferencing, i.e. retrieving a representation of the resource identified by the URI. If the semantic web data is published according to best linked data practices, the URI identifying the Thing is different from the URI identifying the Web document describing the Thing. For example, http://sws.geonames.org/6252001/ identifies the United States; once in a browser, this redirects to http://www.geonames.org/6252001/united-states.html, the URL for the Web document describing the United States. For this reason, you should not assume that the URL you see in the browser address window is the one that you should use in your bibliographic or authority record. For further information on dereferencing see the W3C document Cool URIs for the Semantic Web: http://www.w3.org/TR/cooluris/
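The redirect chain is easy to observe programmatically. The sketch below (ours, not the Task Group's) uses the third-party requests library to fetch the GeoNames Thing URI from the example above and print where it lands; the exact hops depend on how the host is configured at the time you run it.

# Observing dereferencing: fetch a Thing URI and report the redirect chain.
# Uses the third-party "requests" library; the hops shown depend on the host.
import requests

thing_uri = "http://sws.geonames.org/6252001/"  # identifies the United States

resp = requests.get(thing_uri, allow_redirects=True, timeout=30)
for hop in resp.history:
    print(hop.status_code, hop.url)
print(resp.status_code, resp.url, "<- document describing the Thing")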
7. What if I just want to add a link to a web page (e.g., an author's website)?

To point to a web address (as opposed to an identifier) that provides further information about an entity, such as the website of an author, use the 856 field with first indicator 4 (for http address), and second indicator 2 (for related resource). The URL goes in the $u. Subfield $3 or $z may be used to describe the resource being pointed to.

Examples:
856 42 $3 Author's website $u http://stephenking.com/
856 42 $u http://margaretatwood.ca/ $z Connect to author's website

8. Will my ILS accept URIs?

In many cases an ILS will accept URIs, but it would be prudent to exercise caution and carefully work through configuration and testing. In some cases an ILS may make little or no use of URIs, and in others their use may cause issues for display and/or indexing of headings. Impact on other services through the cataloguing workflow should also be considered. For instance, at present records with a $1 cannot be uploaded to OCLC since it is not yet configured as a valid field in their systems.

9. Do any vendors provide URIs?

Vendors such as MARCIVE and Backstage Library Works provide URIs for MARC bibliographic and authority data alongside work on authorities. Casalini is also working on the provision of URIs for MARC data through its SHARE VDE project.

10. What MARC fields and subfields can URIs be added to?

Subfields $0 and $1 are established in numerous fields in all of the MARC formats. $0 contains an "Authority record control number or standard number." This number may be a URI. The recently approved $1 has been designed to hold the URIs of RWOs (Real World Objects). Please see question 12 for further information about the difference between URIs in $0 and $1. The relator code in $4 was redefined to host a URI for a relationship. These subfields should not be confused with $u (Uniform Resource Identifier), which should only be used to record document web addresses or URLs. Subfield $4 in numerous fields of the bibliographic and authority formats can hold a URI for relationships between agents and works, expressions, manifestations, and items, or for relationships between works, expressions, manifestations, and items. In the authority format, field 024 can also hold URIs.

Some examples:
Bibliographic Format

100 1# $a Stravinsky, Igor, $d 1882-1971, $e composer. $4 http://id.loc.gov/vocabulary/relators/cmp $0 http://id.loc.gov/authorities/names/n79070061
257 ## $a Korea (South) $2 naf $0 http://id.loc.gov/authorities/names/n79126802 $1 http://vocab.getty.edu/tgn/7000299-place
336 ## $a text $b txt $2 rdacontent $0 http://id.loc.gov/vocabulary/contentTypes/txt
336 ## $a text $2 rdaco $0 http://rdaregistry.info/termList/RDAContentType/1020
370 ## $g London (England) $2 naf $1 http://www.wikidata.org/entity/Q84
380 ## $a Novels $2 lcgft $0 http://id.loc.gov/authorities/genreForms/gf2015026020 $1 http://www.wikidata.org/entity/Q8261
385 ## $a Children $2 lcdgt $0 http://id.loc.gov/authorities/demographicTerms/dg2015060010
610 20 $a Harvard University $x Students $v Yearbooks. $0 http://id.loc.gov/authorities/subjects/sh85059205
650 12 $a Arthritis $x diagnosis. $1 http://id.nlm.nih.gov/mesh/D001168Q000175
655 #7 $a Picture books. $2 lcgft $0 http://id.loc.gov/authorities/genreForms/gf2016026096 $1 http://dbpedia.org/resource/Picture_book
700 1# $4 http://rdaregistry.info/Elements/w/P10129 $i Motion picture adaptation of (work): $a Austen, Jane, $d 1775-1817. $t Lady Susan. $1 http://viaf.org/viaf/183486135
711 2# $a Olympic Winter Games $n (21st : $d 2010 : $c Vancouver, B.C.) $0 http://id.loc.gov/authorities/names/n2006017550 $1 http://dbpedia.org/resource/2010_Winter_Olympics
Authority Format

024 7# $a http://isni.org/isni/0000000122802598 $2 uri
024 7# $a http://id.worldcat.org/fast/1789938 $2 uri
024 7# $a http://www.wikidata.org/entity/Q913 $2 uri
370 ## $e Zagreb (Croatia) $2 naf $1 http://sws.geonames.org/3186886/ $1 http://vocab.getty.edu/tgn/7015558-place
372 ## $a Figure skating $2 lcsh $0 http://id.loc.gov/authorities/subjects/sh2005002252 $1 http://dbpedia.org/resource/Figure_skating
375 ## $a Males $2 lcdgt $0 http://id.loc.gov/authorities/demographicTerms/dg2015060003
377 ## $a fre $0 http://id.loc.gov/vocabulary/languages/fre
380 ## $a Novels $2 lcgft $0 http://id.loc.gov/authorities/genreForms/gf2015026020 $1 http://www.wikidata.org/entity/Q8261
500 1# $4 http://rdaregistry.info/Elements/w/P10203 $i Screenwriter: $a Kushner, Tony $w r $0 https://www.idref.fr/034453245 $1 http://dbpedia.org/resource/Tony_Kushner $1 http://www.wikidata.org/entity/Q704433 $1 http://www.bbc.co.uk/things/68b04078-0443-4a61-96f2-4bdab1cdc163
530 #0 $4 http://rdaregistry.info/Elements/w/P10226 $i Continuation of (work): $a Environmental and natural resources series $w r $0 http://id.loc.gov/authorities/names/no2003004696

11. What is the relationship of a URI in $0 to its MARC field and component subfields?

The subfields that correspond to the object designated by the URI in $0 (or $1) vary from one field to another. Because of its complex history, MARC is simply not consistent about this. The PCC URI group is drafting a set of tables to spell out the significant subfields for each of the more commonly used MARC fields. (See the Task Group's April 15, 2017 report, page 5: https://www.loc.gov/aba/pcc/bibframe/TaskGroups/PCC_URI_TG_20170415_Report.pdf#page=5)

12. What is the difference between URIs in $0 and $1?

Differences in Definition: $0 reflects the library community's longstanding commitment to controlled headings and the sources that have established them, while $1 points to factual descriptions of entities.

• According to Library of Congress documentation, $0 contains "the system control number of the related authority or classification record, or a standard identifier such as an International Standard Name Identifier (ISNI)." The control number can appear as an identifier, or as a token in a URI that resolves to a description, whose purpose is to cite the source where an authoritative heading used elsewhere in the field has been established.
The description features information about the heading, such as its provenance, revision history, or representations in multiple languages or scripts.

• The newly defined $1 is a place for catalogers or automated processes to insert URIs that identify real-world objects that the field is about and resolve to machine-understandable RDF descriptions, such as persons, places, and organizations that have names such as New York, Albert Einstein, or Microsoft. The descriptions feature biographical details, geospatial coordinates, domains of influence, and photos or other images. $1 defines a class of URIs that are specialized for linked data applications such as clustering, fact extraction, disambiguation, and identity resolution.

Differences in Usage: In linked-data terms, $1 is an open-world solution to identity management, while $0 is primarily about the library community's management of headings and is relatively closed.

• $0 contains a pointer to the source of authority control for a heading, while $1 points to the real-world object described in the field. Otherwise, there is no special dependency between $1 and other subfields in the same field. In particular, $1 is not defined as a source for an authority-controlled heading.

• $0 typically contains information published or endorsed by standards bodies in the library community, while the contents of $1 carry no such presumption. When adding a URI to $1, the cataloger is adding a crucial source of identifying information that may come from a library-community resource such as VIAF, or a third-party resource known only to specialized domains, such as performing arts or scientific sub-specialties.

Differences in structure: $0 contains a mixture of legacy and semantic-web encodings, while $1 contains URIs that have been formulated according to linked-data conventions.

• The $0 may either contain a control number, or a URI containing the control number as a token. For example, a $a field containing the string "Lennon, John 1940-1980" may contain a $0 subfield with the LCNAF control number n 80017868. Alternatively, the $0 may contain the URI http://id.loc.gov/authorities/names/n80017868. Both point to essentially the same information in a variety of forms, such as a human-readable HTML page or a machine-understandable RDF encoding. Of course, the identifier predates the semantic web, so it may also identify a unique record in a database or paper copy of the authority file.

• The $1 contains a URI that conforms to linked data conventions. It is globally unique, persistent, and resolves to an RDF-encoded description of a real-world object. Examples include http://www.wikidata.org/entity/Q1203 and http://viaf.org/viaf/196844.

• In theory, the differences between $0 and $1 URIs are detectable by URI validators such as Vapour. But in practice, automatic detection is challenging because the Web protocols and the implementation of URIs have changed over time. To address this problem, the PCC-URI task group has published the Formulating URIs document cited above, which identifies the relevant syntax patterns for resources most likely to be consulted by the library community. Longer term, we anticipate that improved data models and software tools will automate much of the task of constructing the appropriate URIs for a given MARC or other resource-description context.

• Link to Vapour: http://linkeddata.uriburner.com:8000/
13. Why are skos:Concepts not considered Real World Objects (RWOs) with respect to $0 and $1?

Simple Knowledge Organization System (SKOS) "is an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading systems and taxonomies within the framework of the Semantic Web." (https://www.w3.org/2004/02/skos/intro)

skos:Concepts, the central class in SKOS, are used to build entries within a particular Knowledge Organization Scheme. The concept works as a proxy for a thing in the real world, and it can have statements about it that do not apply to the RWO, e.g. versioning information for the term, or what scheme the concept is in, neither of which is true about the RWO. See also section 3.1, Mapping Concept Schemes, in the SKOS Primer (https://www.w3.org/TR/skos-primer/), specifically the language on skos:exactMatch and owl:sameAs.

The semantics of skos:Concept are that concepts exist within a particular vocabulary, and they have assertions within that particular vocabulary. We would not say that two skos:Concepts are owl:sameAs each other. They are not, in the same way that the skos:Concept http://rdaregistry.info/termList/RDAColourContent/1002 is not owl:sameAs http://vocab.getty.edu/aat/300137660. They may be a skos:exactMatch or skos:closeMatch, or they might have the same foaf:focus, but they are not themselves the same thing.
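As an illustration of that distinction (ours, not the Task Group's), the rdflib sketch below asserts a skos:exactMatch between the two concept URIs named above, plus a foaf:focus pointing from a concept to an RWO. The Wikidata entity chosen as the focus is only a stand-in for whatever RWO the concepts actually proxy.

# Sketch of the mapping vocabulary discussed above, using the rdflib library.
# skos:exactMatch links two concepts; foaf:focus points a concept at an RWO.
# The Wikidata entity below is a stand-in, not an assertion from the FAQ.
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import SKOS

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
rda_colour = URIRef("http://rdaregistry.info/termList/RDAColourContent/1002")
aat_colour = URIRef("http://vocab.getty.edu/aat/300137660")
rwo_stand_in = URIRef("http://www.wikidata.org/entity/Q1075")  # stand-in RWO

g.add((rda_colour, SKOS.exactMatch, aat_colour))  # the concepts match...
g.add((rda_colour, FOAF.focus, rwo_stand_in))     # ...and proxy an RWO
# Note: no owl:sameAs between the two concepts -- they remain distinct things.

print(g.serialize(format="turtle"))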
14. Shouldn't the URIs in $0 and $1 be coordinated? Doesn't that create extra work?

One comment on the MARC Proposal 2017-08 expressed concern that $1 introduces maintenance problems because it must be kept in sync with $0. But this is primarily a consequence of the Library of Congress implementation of 'Authority' and 'RWO' URIs, which are derived by partitioning a source authority record into two sets of statements that must be reassembled in some circumstances. But when the contents of $1 is a URI maintained outside the library community, there is no formal dependency between the data models of the library authority file mentioned in $0 and the $1 resource.

15. Which URI sources should I use in my cataloging?

Please watch for an upcoming best practice document for common URI vocabularies in MARC.

16. Is there a limit to the number of URIs I can use in one field?

Strictly speaking, the MARC definitions place no constraints on the number of URIs or their source. But using URIs from different sources creates conceptual and practical problems that it would be best to avoid. Best practices will be provided by PCC during 2018.

17. Can I put URIs in name authority records?

Currently URIs can be given in note fields (e.g. 670) and in the 024 field. See the NACO 024 Best Practices Guidelines (https://www.loc.gov/aba/pcc/naco/documents/NACO-024-Best-Practices.pdf). URIs cannot as yet be given elsewhere in the authority record, for example not in 5XX or 3XX fields. The PCC URIs in MARC Pilot is exploring and identifying best practices for the recording of URIs in other MARC authority fields.

18. Can I put URIs in bibliographic records in Connexion?

URIs can be added to bibliographic records in Connexion, but there are issues surrounding it, e.g., how the use of URIs relates to functionality for controlling headings. OCLC's handling of $0 and $1 for controlled headings is described below.

OCLC-MARC Format Update 2017, described in Technical Bulletin 267 (https://www.oclc.org/support/services/worldcat/documentation/tb/267.en.html), includes the redefinition of subfields $0 and $4 to include URIs that are in the form of a Web retrieval protocol. Such URIs may now be included in any bibliographic field for which $0 or $4 is authorized. In OCLC-MARC Format Update 2018, described in Technical Bulletin 268 (https://help.oclc.org/WorldCat/Cataloging_documentation/Technical_Bulletins/268), OCLC announced the availability of $1 to accommodate RWO URIs. Note that in fields that can be controlled by OCLC Connexion software (1XX, 6XX, 7XX, 8XX), $0 is removed when the field is controlled. However, $1 is retained.
$4 http://rdaregistry.info/Elements/a/P50195 $4 http://id.loc.gov/vocabulary/relators/aut 257 ## United States $a Great Britain $2 naf $0 http://id.loc.gov/authorities/names/n78095330 $0 http://id.loc.gov/authorities/names/n79023147 336 ## text $b txt $2 rdacontent $0 http://id.loc.gov/vocabulary/contentTypes/txt 337 ## computer $b c $2 rdamedia $0 http://id.loc.gov/vocabulary/mediaTypes/c 338 ## online resource $b cr $2 rdacarrier $0 http://id.loc.gov/vocabulary/carriers/cr 344 ## digital $2 rdatr $0 http://rdaregistry.info/termList/typeRec/1002 344 ## $b optical $2 rdarm $0 http://rdaregistry.info/termList/recMedium/1003 344 ## $g surround $2 rdacpc $0 http://rdaregistry.info/termList/configPlayback/1004 346 ## Laser optical $2 rdavf $0 http://rdaregistry.info/termList/videoFormat/1009 346 ## $b NTSC $2 rdabs $0 http://rdaregistry.info/termList/broadcastStand/1002 347 ## video file $2 rdaft $0 http://rdaregistry.info/termList/fileType/1006 347 ## $e region 1 $2 rdare $0 http://rdaregistry.info/termList/RDARegionalEncoding/1002 347 ## text file $2 rdaft $0 http://rdaregistry.info/termList/fileType/1002 382 01 piano $0 http://id.loc.gov/authorities/performanceMediums/mp2013015550 $n 1 $s 1 $2 lcmpt 386 ## $4 http://id.loc.gov/vocabulary/relators/fmd $i Film director: $a Mexicans $2 lcdgt $0 http://id.loc.gov/authorities/demographicTerms/dg2015060329 386 ## $4 http://id.loc.gov/vocabulary/relators/fmd $i Film director: $a Men $2 lcdgt $0 http://id.loc.gov/authorities/demographicTerms/dg2015060359 700 1# Cuarón, Alfonso, $e film director, $e screenwriter, $e film producer, $e film editor. $4 http://id.loc.gov/vocabulary/relators/fmd $4 http://id.loc.gov/vocabulary/relators/aus $4 http://id.loc.gov/vocabulary/relators/fmp $4 http://id.loc.gov/vocabulary/relators/flm 700 1# Cuarón, Jonás, $e screenwriter. $4 http://id.loc.gov/vocabulary/relators/aus 700 1# Heyman, David, $d 1961- $e film producer. $4 http://id.loc.gov/vocabulary/relators/fmp PCC URIs in MARC Task Group 11 http://id.loc.gov/vocabulary/relators/fmp http://id.loc.gov/vocabulary/relators/aus http://id.loc.gov/vocabulary/relators/flm http://id.loc.gov/vocabulary/relators/fmp http://id.loc.gov/vocabulary/relators/aus http://id.loc.gov/vocabulary/relators/fmd http://id.loc.gov/authorities/demographicTerms/dg2015060359 http://id.loc.gov/vocabulary/relators/fmd http://id.loc.gov/authorities/demographicTerms/dg2015060329 http://id.loc.gov/vocabulary/relators/fmd http://id.loc.gov/authorities/performanceMediums/mp2013015550 http://rdaregistry.info/termList/fileType/1002 http://rdaregistry.info/termList/RDARegionalEncoding/1002 http://rdaregistry.info/termList/fileType/1006 http://rdaregistry.info/termList/broadcastStand/1002 http://rdaregistry.info/termList/videoFormat/1009 http://rdaregistry.info/termList/configPlayback/1004 http://rdaregistry.info/termList/recMedium/1003 http://rdaregistry.info/termList/typeRec/1002 http://id.loc.gov/vocabulary/carriers/cr http://id.loc.gov/vocabulary/mediaTypes/c http://id.loc.gov/vocabulary/contentTypes/txt http://id.loc.gov/authorities/names/n79023147 http://id.loc.gov/authorities/names/n78095330 http://id.loc.gov/vocabulary/relators/aut http://rdaregistry.info/Elements/a/P50195 710 2# IFLA FRBR Review Group. $b Consolidation Editorial Group, $e editor. $4 http://rdaregistry.info/Elements/e/P20048 $4 http://id.loc.gov/vocabulary/relators/edt 780 00 $4 http://rdaregistry.info/Elements/w/P10226 $t Environment (Rivonia, South Africa) $x 2219-8199 $w (DLC) 2010252067 $w (OCoLC)530178144 19. 
19. If I have the choice, is it preferable to put URIs in bibliographic or authority records?

URIs can have value in both bibliographic and authority records. There are some kinds of data, and not only URIs, that can more logically and non-redundantly be provided in authorities, but the fact is that we operate in a mixed environment where a clean separation is not made. Currently URIs are for the most part not approved for use in NACO authorities. The Task Group hopes to address this restriction.

20. When can/should I use the new field 758?

The 758 field became part of the MARC specification in December 2017 and was implemented by OCLC in September 2018. PCC has not yet issued best practices for use of this field; these can be expected in late 2018 or early 2019.

21. Why were linking entry fields (76X-78X) not included in the task force proposals?

76X-78X linking entry fields tend to be associated with instance or manifestation data. While there is no reason instances or manifestations should not have RDF representations that could be linked in $0 (or $1), it is not clear that stable sources exist yet for these data. The task group therefore gave enhancements to these fields a lower priority than its other proposals. In addition, the 758 field has been defined to accommodate instance entities and predicates if required. Although the task group expects to give less emphasis to MARC proposals in its third year, it is open to use cases that may justify proposals affecting the 76X-78X fields.

22. How to formulate and obtain a linked data URI for a resource?

Ideally it is best to acquire URIs through automated processes such as SPARQL queries or via lookup tools built into metadata editors. This is not always possible at present. Catalogers wanting to add properly formed and coded URIs to their records should consult the PCC Formulating URIs document (https://www.loc.gov/aba/pcc/bibframe/TaskGroups/formulate_obtain_URI_guide.pdf).

23. Are validators available for dereferenceable URIs?

There are validators which check whether semantic web data is correctly published according to current best linked data practices; in particular they check whether the URI tested identifies an entity, i.e. a RWO or a Web document describing the entity. At the time of writing, we are aware of the following validators: the Vapour validator (http://linkeddata.uriburner.com:8000/vapour) and Vafu (http://vafu.redlink.io/).

24. What tools are available?

Libraries adding URIs to their catalogs have used tools such as MARCEdit MARCNext, LOD/OpenRefine, and custom scripts (utilizing SPARQL), as well as working directly in SPARQL for querying endpoints to enrich data with a URI (or an IRI).
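As a concrete illustration of the SPARQL lookups mentioned above (our sketch, not a PCC recommendation), the query below asks the public Wikidata endpoint for the entity carrying a given VIAF identifier, one common way to derive a candidate $1 value. The VIAF number shown is just an example.

# Looking up a candidate $1 (RWO) URI on the public Wikidata SPARQL endpoint.
# wdt:P214 is Wikidata's "VIAF ID" property; the VIAF number is an example value.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
viaf_id = "102333412"  # example VIAF identifier

query = f'SELECT ?item WHERE {{ ?item wdt:P214 "{viaf_id}" }} LIMIT 5'
resp = requests.get(ENDPOINT, params={"query": query, "format": "json"},
                    headers={"User-Agent": "uri-faq-example/0.1"}, timeout=60)
resp.raise_for_status()

for row in resp.json()["results"]["bindings"]:
    print(row["item"]["value"])  # e.g. http://www.wikidata.org/entity/Q...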
25. Where can I find training resources on URIs and linked data?

The Linked Data Exploratorium (http://explore.dublincore.net/explore-learning-resources-by-competency/) contains a great number of useful training resources related to linked data in general and URIs specifically.

26. Are RDF URIs sensitive to use of http versus https?

The Web community has made a push toward more secure delivery of Web documents in the last decade using the HTTPS protocol. For human readability an RDF URI may resolve or redirect to a web page/document displaying information about the resource it identifies, but the RDF URI itself does not represent that web page/document. For example, an RDF URI from Wikidata, e.g. http://www.wikidata.org/entity/Q36322, may trigger an entity 303 redirect from the server to various outputs:

1) A generic document after machine content-negotiation (can default to json syntax): https://www.wikidata.org/wiki/Special:EntityData/Q36322
2) An RDF turtle document: https://www.wikidata.org/wiki/Special:EntityData/Q36322.ttl
3) An HTML page: https://www.wikidata.org/wiki/Q36322

Notice the returned URIs are all in secure protocol format, https://, because they represent web documents, while the RDF URI is http://. Additionally, the paths in the respective URIs are not exactly the same. These differences are subtle. Since an RDF URI is not a Web address, but rather an identifier, it should not need to change from HTTP protocol to HTTPS. Ideally, an RDF URI published as http://www.wikidata.org/entity/Q36322 should not be re-used or re-stated elsewhere as https://www.wikidata.org/entity/Q36322.

That said, whether or not it makes a difference to use http: or https: in an RDF URI may ultimately depend on the host server. The host server may be set up to seamlessly resolve http: to https: and vice versa, in which case it may not make a difference in how it resolves, but it may make a difference to a SPARQL query. In addition, if the server is not set up to resolve one to the other, then using http: or https: will make a difference in how the RDF URI resolves, as well as to machine querying. Therefore, it's best to re-state an RDF URI exactly as it is published by the host, and for the host not to change RDF URIs from http: to https:.
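The three outputs listed above can be reproduced with a short content-negotiation experiment. The sketch below (ours) requests the same Wikidata entity URI with different Accept headers and prints the document URL each request settles on.

# Reproducing question 26's example: one entity URI, three document forms,
# selected by the Accept header. Uses the third-party "requests" library.
import requests

ENTITY_URI = "http://www.wikidata.org/entity/Q36322"

for accept in ("text/turtle", "application/json", "text/html"):
    resp = requests.get(ENTITY_URI, headers={"Accept": accept}, timeout=30)
    print(f"{accept:17} -> {resp.url}")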
27. Is permalink different from a canonical URI?

A persistent URL that takes a user to a Web document is called a permalink. A host may declare a URI to be canonical, that is, the URI preferred by the host and tagged as canonical for content negotiation. Canonicalization of a URI by a host allows content negotiation between machines and allows search engine optimization (SEO) to index the link preferred by the host for displaying Web content.

work_d47y5gjw6raozbtrbmhl5fa4ty ---- Microsoft Word - OTDCF_v24no2.doc

by Norm Medeiros
Associate Librarian of the College
Haverford College
Haverford, PA

Screw Cap or Cork? Keeping Tags Fresh (and Related Matters)

{A published version of this article appears in the 24:2 (2008) issue of OCLC Systems & Services.}

ABSTRACT

This article comments on the excitement caused by the release of "On the Record," the final report of the Working Group on the Future of Bibliographic Control. The article notes the challenge of maintaining user-supplied tags in the absence of an agency responsible for their upkeep. It also refers to the chaos emerging from the convergence of enriched catalogs, WorldCat Local, and federated tools, all of which are vying for library search.

KEYWORDS

cataloging; social tagging; tags; enriched catalogs

"Subject analysis – including analyzing content and creating and applying subject headings and classification numbers – is a core function of cataloging; although expensive, it is nonetheless critical."1

The Library of Congress (LC) Working Group on the Future of Bibliographic Control released its final report on 9 January 2008. The Working Group (WG) was convened by Deanna Marcum, LC's Associate Librarian for Library Services, and charged with:

• Presenting findings on how bibliographic control and other descriptive practices can effectively support management of and access to library materials in the evolving information and technology environment;
• Recommending ways in which the library community can collectively move toward achieving this vision;
• Advising the Library of Congress on its role and priorities.

Despite a lack of controversial recommendations by the WG – the exception being suspension of work on RDA – the report has caused a stir in libraryland.2 It reminds me of the commotion that ensued following the release of George Mitchell's report on the illegal use of steroids in Major League Baseball (MLB) just a month earlier. I happened to be home during the airing on C-SPAN of day two of the congressional hearing, which featured MLB Commissioner Bud Selig and MLB Players Association head Donald Fehr. Not surprisingly, their testimony was combative and accusatory.
As I watched them spar, I thought how useful it would be to the library community if the House Committee on Oversight and Government Reform would hold a hearing on the Working Group's report. Day one could feature WG co-chairs Olivia Madison and Brian Schottlaender, who would articulate thoughtful responses to questions posed by the committee members. The dialogue would be cordial, and Chairman Waxman would conclude the hearing by thanking Ms. Madison and Mr. Schottlaender, as he did Senator Mitchell, for leading such a thorough investigation. Day two would be the main event, featuring Deanna Marcum, commissioner of the Working Group report and change agent, against Michael Gorman, the staunch defender of complex and exhaustive cataloging. Michael Buffer could introduce them as they enter the room, followed by his trademark, "Let's get ready to rumble!" Now that would be "must see TV," not to mention a practical way to solve our differences.

On the Record holds few surprises. It is consistent with recommendations made by Karen Calhoun in a previous LC-commissioned paper, with a notable exception pertaining to Library of Congress Subject Headings (LCSH).3 Calhoun recommended the dismantling of LCSH in favor of explorations into automated subject analysis, while the WG sees value in the continued use of LCSH, albeit using a faceted approach. In an attempt to maximize productivity of subject terms, the WG recommends LC and the Program for Cooperative Cataloging (PCC) find ways for additional libraries to create and maintain authority records. It's interesting to ponder how maintenance of LCSH might come to bear on tags, user-created subject terms popular on sites such as LibraryThing, Del.icio.us, and Flickr. The WG promoted incorporating tags into the library catalog, and indeed such terms can aid in discovery by providing a vernacular that may not otherwise be contained within the bibliographic description of the item, especially given the time lag between the common usage of a term and the appearance of that term as an LCSH heading or cross-reference. The question has moved in my view from one of whether such tags offer bibliographic enrichment, to how these tags will be maintained throughout the years, or as Joyce Ogburn puts it, "how tags will age."4

TAG MAINTENANCE

Despite numerous problems, including ambiguity, polysemy, and synonymy, tags have transitioned from the novel to the mainstream.5 Little attention, however, has been given to long-term tag maintenance. If libraries generally adopt user tags in the catalog, what happens to retrieval via these terms as their meaning changes with time? As Mary Ellen Bates cautions, "No one's considering 'Is this how we'll refer to this issue in 2 years?'"6 We can't expect users who contribute tags to be mindful of the consequences of their choices, but if we are opening our catalogs to community influence, then libraries should consider how to prevent these terms from going stale. Could this class of subject terms undergo authority control? The LC Working Group noted the need for better collaboration in creating and maintaining authority data. As time goes by such collaboration may need to be focused on this new and popular descriptive element.

BIBLIOGRAPHIC ENRICHMENT

Libraries have tended to equate bibliographic control with the production of metadata for use solely within the library catalog. This narrow focus is no longer suitable in an environment wherein data from diverse sources are used to create new and interesting information views.
Library data must be usable outside of the catalog, and the catalog must be able to ingest or interact with records from sources outside of the library cataloging workflow. The tightly controlled consistency designed into library standards thus far is unlikely to be realized or sustained in the future, even within the local environment.7

It's fascinating to watch the development of enriched or "next generation" catalogs. The field of available products is growing, most recently with the addition of Villanova University's "VUFind", an open source application that seeks to be a portal for an institution's locally-created metadata, including but not limited to the bibliographic records contained within its library catalog. On the other end of this spectrum is WorldCat Local, offering the immensity of the WorldCat database, along with shared collections and open access materials. Although WorldCat Local offers branding and the ability to prioritize results based on availability at the local institution, it is diametrically opposed to the next generation catalogs, which are customized to serve the needs of a well-defined user population. Somewhere along this spectrum, or more accurately, matrix, exist federated search products, such as WebFeat and Ex Libris' MetaLib. And let's not forget Google, whose Scholar tool may be the best federated search service available. The chaos emerging from the convergence of enriched catalogs, WorldCat Local, and federated tools, coupled with the commotion caused by "On the Record" and its yet-to-be-determined aftermath, should make for a memorable year.

REFERENCES

1. Working Group on the Future of Bibliographic Control (2008). On the Record: Report of the Library of Congress Working Group on the Future of Bibliographic Control. Available: http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf (Accessed: 14 January 2008).
2. A recent example is Foster, Andrea L. & Jennifer Howard (2008). "Library of Congress Report Urges Libraries to Update Cataloging Strategies," Chronicle of Higher Education 54(21): A11.
3. Calhoun, Karen (2006). The Changing Nature of the Catalog and its Integration with Other Discovery Tools. Available: http://www.loc.gov/catdir/calhoun-report-final.pdf (Accessed: 25 January 2008).
4. Ogburn, Director of the J. Willard Marriott Library at the University of Utah, referred to the aging of tags during her presentation at the 2008 ALCTS Symposium, "Risk and Entrepreneurship in Libraries: Seizing Opportunities for Change," held 11 January 2008 in Philadelphia, PA.
5. Spiteri, Louise F. (2007). "The Structure and Form of Folksonomy Tags: The Road to the Public Library Catalog," Information Technology and Libraries 26(3): 13-25.
6. Bates, Mary Ellen (2006). "Tag: You're It!," Online 30(1): 64.
7. Working Group on the Future of Bibliographic Control (2008): 31.

work_d5fqxkasr5edzi3vqielcndska ---- 061-080 완-조재인.hwp

A Study on Book Metadata Creation and Distribution on the Supply Chain
Jane Cho (조재인)*

Table of Contents:
1. Introduction
2. Supply chain actors and book metadata
3. Analysis of related metadata standards and interoperability in the publishing and library communities
4. Toward efficient creation and distribution of book metadata
 4.1 Analysis of book metadata flows on the supply chain
 4.2 An efficient creation and distribution model
5. Conclusion
 5.1 Summary
 5.2 Suggestions

Abstract (translated from the Korean): The publishing community has recently come to recognize that metadata use is a critical factor in users' purchase decisions, and has accordingly become interested in efficient data creation and quality maintenance, as well as in standards and exchange systems across the supply chain. The library community, pursuing economic efficiency in cataloging, has likewise sought models that simplify the work by drawing on data sources closer to the original source of information. This study examines the flows of book metadata that share a common origin but are used in the publishing and library communities in different formats and standards, and surveys common issues and the potential for interoperation.
1. Introduction

The publishing community produces metadata for distribution and sales and shares it with partners on the supply chain1) through business processes that run from wholesalers to bookstores. The library community, for its part, creates metadata for discovery, identification, and holdings, and shares or exchanges it in a flow that runs from national libraries through bibliographic utilities to individual libraries. Book metadata thus has a single point of origin, but it is produced and evolves in different standards and formats, according to distinct purposes, along the publishing and library supply chains.

In the online environment, publishers can now easily expose to users metadata that had previously been used only internally for business purposes. Moreover, having recognized that metadata use is a critical factor in users' buying decisions, they have become interested in efficient data creation and quality maintenance and in standards and exchange systems for the supply chain. In the library community, meanwhile, as the importance of the library catalog began to decline with the rise of search engines, an overhaul of overall processes has been demanded to increase speed and cut costs through copy cataloging and automated processing. Accordingly, libraries have sought not only to reduce costs through cooperative cataloging and outsourcing, but also to explore models that simplify cataloging by drawing on upstream metadata from the publishing sector. In that context, the Library of Congress (LC) announced a plan to create descriptive cataloging by ingesting publishing-sector metadata, and OCLC (Online Computer Library Center), prompted by its "A Symposium for Publishers and Librarians,"2) has also begun to explore a new paradigm for book metadata creation and distribution.

This study synthesizes recent discussions of book metadata to review the common needs and duplicated efforts of the publishing and library communities and the possibilities for interoperation, and on that basis explores the feasibility of introducing a more evolved mechanism into the book metadata life cycle. First, it analyzes metadata distribution trends and related issues from the standpoint of the stakeholders on the supply chain. Second, it examines the parallel data standards applied on both the publishing and library sides and their interoperability. Third, it identifies the characteristics of the current flow of book metadata and considers how the two sides can minimize duplicated effort and operate metadata to mutual advantage.

2. Supply-Chain Stakeholders and Book Metadata

NISO (National Information Standards Organization) and OCLC classify the stakeholders in book metadata as publishers, book distributors, bookstores, metadata vendors, portals, national libraries, and local libraries. Drawing on the NISO/OCLC study (NISO and OCLC 2009), materials related to "A Symposium for Publishers and Librarians," and the Book Trade Promotion Center (http://www.booktrade.or.kr/), this chapter summarizes metadata distribution trends and related issues from the standpoint of each stakeholder.

1) "Supply chain" refers to the flow of information, funds, and knowledge among all trading partners, from suppliers to customers.
2) A Symposium for Publishers and Librarians. [online].

2.1 Publishers

Large U.S. publishers manage the metadata generated in the production process in the ONIX (ONline Information eXchange) format, but many small and medium-sized publishers still adopt their own formats, including Excel spreadsheets; in some cases the print catalogs produced for book promotion and new-title announcements serve in place of metadata. This situation is even more typical in Korea, where one-person publishers who handle planning, editing, and sales on their own account for roughly a quarter of the industry. In the past, publishers managed data only internally through their own systems, so there was no demand for sharing and exchange among business partners; as demand arose for computerized distribution across the supply chain and more efficient data management, however, a standard data format became necessary. Furthermore, as the online environment lets publishers skip intermediate distribution steps and deal directly with users, they have also become interested in metadata creation and in maintaining its quality across the supply chain. Against this backdrop, BISG (The Book Industry Study Group) has recently begun operating a process that evaluates ONIX files and identifies publishers who meet standards for data quality and timeliness.
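To make the preceding discussion concrete, the sketch below shows what a radically stripped-down ONIX product record looks like and how its basic bibliographic elements can be read programmatically. This is an illustrative fragment only: the element names follow the ONIX 2.1 reference tags, but a real ONIX message carries a message header and many more mandatory elements, and the ISBN, title, and contributor used here are invented.

```python
# A minimal, illustrative ONIX-style product record. Element names follow
# the ONIX 2.1 reference tags, but this is not a complete, valid message;
# the ISBN, title, and contributor are invented for the example.
import xml.etree.ElementTree as ET

onix_fragment = """
<Product>
  <RecordReference>com.example.9788912345678</RecordReference>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>  <!-- code 15 = ISBN-13 -->
    <IDValue>9788912345678</IDValue>
  </ProductIdentifier>
  <Title>
    <TitleType>01</TitleType>          <!-- code 01 = distinctive title -->
    <TitleText>An Example Title</TitleText>
  </Title>
  <Contributor>
    <ContributorRole>A01</ContributorRole>  <!-- code A01 = author -->
    <PersonName>Hong Gildong</PersonName>
  </Contributor>
</Product>
"""

product = ET.fromstring(onix_fragment)
print(product.findtext("ProductIdentifier/IDValue"))  # 9788912345678
print(product.findtext("Title/TitleText"))            # An Example Title
print(product.findtext("Contributor/PersonName"))     # Hong Gildong
```

Even this tiny fragment shows why a standard format matters to the supply chain: any partner can extract the same elements without bilateral agreements about spreadsheet columns.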
2.2 Book Distributors

The distributors Baker & Taylor and Ingram report that the number of titles they handle grows by about 10% a year. While roughly 200,000 titles are actually added, the appearance of digital formats and other modes of realization means that a single content item is now typically issued in two or three media. Distributors usually prefer publishers' metadata over CIP (Cataloging in Publication) data, but they also participate in the Library of Congress's PCC (Program for Cooperative Cataloging) for the sake of libraries that want MARC data delivered at the point of acquisition. In Korea, representative distributors such as Bookxen, Songin Books, and the Korea Publishing Cooperative create ONIX, jointly with the Book Trade Promotion Center, for the books received into their own warehouses. Unlike the overseas case, where ONIX arises as a by-product of the publishing process, in Korea the first ONIX is thus created at the distribution stage.

2.3 Bookstores

Barnes & Noble, a leading brick-and-mortar bookseller, regards high-quality descriptive metadata as directly tied to sales. Kyobo Book Centre, Korea's largest bookstore, likewise judges good metadata to be a source of marketing competitiveness and has therefore taken a somewhat restrictive position on sharing its data without charge. To improve the quality, accuracy, and timeliness of descriptive data, Barnes & Noble requires 44 mandatory data elements from publishers and distributors. These cover most of the elements BISG defines, but beyond descriptive elements, volatile elements that change with logistics, such as stock status, distributor lists, and price, are also regarded as very important; in that respect, EDI (Electronic Data Interchange) is occasionally judged a superior structure to ONIX. For large online bookstores, which gather data from many different sources, widely accepted standards or best practices become a key factor in deciding data maintenance criteria.

2.4 Metadata Vendors

The NISO report defines as metadata vendors those organizations that collect, enhance, and redistribute data: Bowker, Nielsen Book, the bibliographic utility BDS (Bibliographic Data Services), and OCLC (NISO and OCLC 2009). In Korea, the Book Trade Promotion Center, which creates ONIX and distributes it in the formats each sector needs, and KERIS (Korea Education and Research Information Service), which provides bibliographic utility services, play this role. Vendors collect data in a variety of formats, enhance it according to particular standards, and redistribute it as MARC or ONIX in XML to distributors, retailers, and libraries. For organizations whose role is to collect, process, and redistribute data, interoperability is critical: about 50% of what they collect arrives as ONIX, 45% in other digital formats including Excel, and the remaining 5% in print. Bowker and Nielsen Book enhance data not only through authority control of titles and author names but also by adding cover images, tables of contents, author biographies, book awards, reading levels, user reviews and ratings, and recommendations. They also register ISBNs and assign publisher prefixes, and together with CISAC (International Confederation of Societies of Authors and Composers) and IFRRO (The International Federation of Reproduction Rights Organizations) they helped establish the ISTC (International Standard Text Code), laying a foundation for more efficient rights management. BDS, which handles the UK's outsourced CIP work, produces some 75,000 pre-publication records and also converts ONIX to MARC 21 for distribution, while OCLC, which collects 60,000-120,000 bibliographic records each year, plans through its Next Generation Cataloging pilot3) to generate enhanced ONIX and feed it back to the publishing sector. Metadata vendors thus play an important intermediary role in data quality and distribution within the supply chain. In Korea, as noted above, the Book Trade Promotion Center creates ONIX for new books received into Bookxen's warehouse and supplies it to bookstores and distributors; it also converts the data to MARC and supplies it through various channels to the National Library of Korea, KERIS, and frontline libraries.

3) Next Generation Cataloging. [online].

2.5 Portals

Google, which was once sued by the publishing industry and authors' guilds over copyright infringement, recently agreed after long negotiation to operate the Book Rights Registry for rights payments. Through the registry it manages a database of rights holders, identifies and locates them, and coordinates payments. Identifying authors and works, determining relationships among works, and identifying and managing series and multi-volume works have therefore become live issues. To address them, Google has recently placed new emphasis on metadata management and is investing heavily in book and work identification, for example by developing algorithms to identify related works. In Korea, Naver's book service promotes publishers' new titles and provides a direct channel to buyers. It aims to be a comprehensive book portal in name and fact, integrating major domestic online bookstores with union catalogs of libraries of every type, so the identification of related works and the maintenance of data quality are issues to watch.

2.6 National Libraries

The Library of Congress, which develops standards for data exchange and policies for bibliographic control, commissioned R2 Consulting to analyze the North American MARC record marketplace, and the resulting study pointed to unnecessary duplication in the industry (R2 Consulting LLC 2009). The economics of cataloging has become a central topic in the library world; in this context LC promotes the importance of cooperative cataloging and authority work through the PCC and strongly emphasizes that the diverse standards used in the book industry, in rights management, and in the library and information field must interoperate. National libraries are also conducting considerable research on data sharing with the publishing sector, notably projects to extract core elements from the XML formats of electronic publications to build ONIX records, and to extract metadata from XML or text-based PDF for use in the MODS (Metadata Object Description Schema) format together with METS (Metadata Encoding and Transmission Standard). A national library also creates CIP from the information it receives from publishers and provides book vendors with lists of forthcoming titles. After publication, these records are corrected for title, size, paging, and the like and upgraded into full catalog records, to which intellectual work such as subject analysis and authority control is added before they are redistributed to the library community through various channels.
2.7 Local Libraries

According to the NISO and OCLC report (2009), frontline libraries perform original cataloging for less than 30% of their titles. Yet many downloaded records are still reported to be modified to fit local cataloging practice: 80% of 350,000 records distributed by LC, and 55% of 260,000 records created by the British Library, were upgraded by local libraries. Even when a national library creates and distributes bibliographic data of the highest quality, a great deal of revision and updating still occurs at the local level. Meanwhile, under the influence of electronic journals and e-books, many users now reach the page they want directly through full-text searching, as in Google, rather than through the catalog, raising questions about the efficiency of the staff and budget devoted to cataloging.

Table 1. Supply-chain stakeholders and book metadata

- Publishers (formats: ONIX [ca. 50% in the U.S.], Excel, print catalogs, in-house formats): Large U.S. publishers use ONIX, but most adopt their own formats including Excel; most Korean publishers do not create ONIX. As publishers increasingly build supplementary information (images, tables of contents, reviews) and promote it on their own sites, they have come to recognize the need for metadata creation and maintenance.
- Distributors (e.g., Baker & Taylor, Ingram; Bookxen, Korea Publishing Cooperative; formats: ONIX, in-house formats, MARC): Publisher data is preferred over CIP data; because libraries want MARC at the point of acquisition, many vendors participate in LC's PCC; in Korea the first ONIX is created at the distribution stage.
- Bookstores (e.g., Amazon, Barnes & Noble; Kyobo Book Centre, YES24; formats: ONIX, EDI, in-house formats, MARC): Descriptive metadata is seen as directly tied to sales; beyond the elements BISG defines, Barnes & Noble emphasizes volatile data such as stock status, distributor lists, and price, and stresses timely updating for sales.
- Metadata vendors (e.g., Bowker, Nielsen Book, BDS, OCLC; Book Trade Promotion Center, KERIS; formats: ONIX, MARC, in-house formats): Enhance data through authority control and add chapter information, cover images, tables of contents, bestseller citations, author biographies, book awards, reading levels, user reviews and ratings, and recommendations; key intermediaries for quality and distribution within the supply chain.
- Portals (e.g., Google Books, Naver Books; formats: ONIX, MARC): Google has digitized and opened millions of books and operates the Book Rights Registry for rights payments, concentrating on identifying authors and works and the relationships among works; it accepts both ONIX and MARC. Naver's book service supports publishers in promoting and directly selling new titles and collects and integrates data from bookstores and libraries toward Korea's largest book portal.
- National libraries (e.g., LC, National Library of Korea; format: MARC [CIP]): Create CIP from publisher-supplied information and provide it to libraries; supply book vendors with forthcoming-title lists three to six months before publication; after publication, upgrade the records (title, size, paging) and add subject analysis and authority control before redistributing them to the library community through various channels.
- Local libraries (individual libraries; format: MARC): Original cataloging for under 30% of titles; substantial additional local work even on downloaded records; accelerating debate over the validity and economics of the library catalog.

2.8 Book Metadata Distribution Trends and Issues

The distribution trends described above can be summarized as follows. First, as demand surges for computerized supply-chain distribution and efficient data management, publishers have become keenly interested in standard data creation and quality maintenance. Second, the growing share of the online market has made descriptive metadata even more important within the publishing sector, including bookstores. Third, as single works are now issued in multiple media, distributors too must process ever larger volumes of metadata and face the problems of effectively identifying related works and managing rights efficiently. Fourth, the library community still invests heavy time and effort in descriptive cataloging, so alternatives that improve its economic effectiveness are urgently needed.

Because book metadata is used in different formats and with different content by publishers, distributors, bookstores, and libraries along the supply chain, the time and cost invested redundantly in its creation and quality maintenance have become points of contention. A representative of the Hachette Book Group has stressed that publishers hold the most accurate information about their books and commented that distributors, bookstores, and libraries need to find ways to exploit this upstream metadata efficiently. In the same vein, Suzanne Kemperman of NetLibrary observed that users want the same search experience whether at Amazon or in a library, so discovery is an important concept shared by both communities, and the publishing and library sectors should link and jointly exploit their bibliographic data; it was likewise argued that the time has come for the library community as a whole to stop original cataloging and to pursue the economics of the catalog by using upstream metadata (OCLC 2009a).

Meanwhile, as single content items come to be exploited in many forms, from e-books to films, rights management has come to the fore, and with it the importance of identifying the various manifestations of the same work and of identifying authors accurately. Other open questions in both the book industry and the library field include identifying series and their parts and deciding the unit to which an ISBN is assigned. Because the two sides have adopted not only different metadata formats, MARC and ONIX, but also different standards for identifying authors and works, a joint response drawing on the expertise of both is increasingly discussed as necessary.

3. Parallel Metadata Standards and Interoperability in the Publishing and Library Communities

The publishing and library communities each define standards so that data can flow smoothly among business partners and related organizations; as noted, the two sides have adopted formats that are similar yet distinct. This chapter examines several of these systems for their similarities, differences, and interoperability.

3.1 ONIX / MARC

ONIX is a schema of code values using an XML message structure, employed as the standard for exchanging data in the book industry.
Version 1 was released in January 2001 to represent book-industry product information in electronic form and to support e-commerce. Applicable to any content that can be traded online, ONIX carries catalog information, author information, book information, publisher information, and distribution information. ONIX travels from publishers to wholesalers and metadata vendors along various paths, and after publication, updated reviews, prices, and status information are redistributed to partners as ONIX feeds. The recently released ONIX 3.0 is said to handle digital content more effectively, to support structural linking of the various manifestations of the same work, and to handle serials more efficiently.4)

On the library side, the corresponding metadata format, MARC, is first born as CIP data based on pre-publication information. The publisher sends proofs or an electronic version of the title page, colophon, index, and so on to the national library, which produces the CIP and returns it for inclusion in the published work. In the United States about ten large publishers submit some 55,000 records in ONIX form, but ONIX is not used directly in CIP creation, and the controlled headings and classification numbers the national library creates do not flow back into publishers' ONIX streams.

At bottom, the two standards differ greatly in structure and semantics. MARC does not handle the details of distribution and sales; ONIX does not cover how data is used, updated, and managed by its recipients, and it lacks anything like MARC's concept of access points. These differences make a perfect mapping between the two impossible in practice, and some data loss in mapping is unavoidable (Godby 2010). Nevertheless, LC and OCLC have attempted ONIX-MARC mappings for acquiring bibliographic information, and in Korea the Book Trade Promotion Center uses ONIX to generate MARC (Book Trade Promotion Center 2006).

3.2 BISAC / LCSH, DDC

BISAC (Book Industry Standards and Communications) subject headings are used throughout the publishing supply chain to search by subject in retrieval systems and to arrange books. A code is expressed in nine alphanumeric characters, organized into 52 major sections such as Computers, Fiction, and History.5) For example, general African history is expressed as HIS001000, read as HISTORY / AFRICA / GENERAL. Where LCSH comprises some 300,000 terms, BISAC has only about 3,000, and it is judged more intuitive and user-friendly; a growing number of public libraries are reportedly adopting BISAC in place of DDC. The Maricopa County Library District established through focus-group interviews that DDC classification was unfamiliar to its users and that an easier, bookstore-like scheme was needed, and adopted BISAC accordingly (Oder 2007). OCLC has recently gone so far as to undertake a mapping between BISAC and DDC, and the applicability of BISAC in libraries is under active discussion (Mitchell 2010).

4) EDItEUR. [online].
5) BISAC. [online].
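As a small illustration of how compact the BISAC notation is, the sketch below splits a code into its three-letter section prefix and six-digit detail number and resolves the literal heading from a lookup table. Only the single example cited above is included here; the full heading list, roughly 3,000 entries, is maintained by BISG.

```python
# Illustrative only: interpret a BISAC subject code such as "HIS001000".
# The lookup table holds just the example from the text; the complete
# heading list is maintained by BISG.
BISAC_HEADINGS = {
    "HIS001000": "HISTORY / Africa / General",
}

def parse_bisac(code: str) -> tuple[str, str, str]:
    """Split a nine-character BISAC code into its section prefix and
    numeric detail, and resolve the literal heading if known."""
    section, detail = code[:3], code[3:]  # e.g. "HIS", "001000"
    return section, detail, BISAC_HEADINGS.get(code, "unknown heading")

print(parse_bisac("HIS001000"))
# ('HIS', '001000', 'HISTORY / Africa / General')
```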
3.3 ISNI / NACO, VIAF

The International Standard Name Identifier (ISNI)6) is an ISO standard (ISO 27729) used for name identification in the publishing, management, and content-distribution supply chains of the media content industries. ISNI provides a tool for disambiguating names that could be confused, and this information is used to link related systems. Because it was not developed to provide comprehensive information about individuals directly, it gives the various supply-chain partners a structure for exchanging information without exposing personal data: it consists of the minimal metadata needed to distinguish an individual, with other information managed in databases under restricted access. Operationally, ISNI distinguishes a Registration Authority (RA), which creates and maintains the reference database, and Registration Agencies (RAGs), which provide ISNI services to users; the RAG role is filled by CISAC, IFRRO, the IPDA (International Performers' Database Association), and Bowker, as well as OCLC.

[Figure 1. The ISNI operating structure (source: www.isni.org)]
[Figure 2. The roles of the RA and RAGs (source: www.isni.org)]

The library community, for its part, maintains NACO (the name authority program component of the PCC) and VIAF for authority control and name identification. VIAF (The Virtual International Authority File), a cooperative project of the Library of Congress, the German National Library, and the National Library of France, virtually combines the three institutions' authority files under a single name authority. VIAF holds authority records comprising more than ten million personal names and their variant forms, and when a NACO member adds a local library's name record, VIAF is updated automatically.

The library community developed the International Standard Authority Data Number (ISADN) in 1984, but it has since defined only FRAR (Functional Requirements for Authority Records) as the functional requirements for authority records and has not updated a name identification system. Instead, it is worth noting that MARBI (MARC Advisory Committee 2010) has recently proposed supplementing name identification by adding the publishing industry's ISNI to bibliographic and authority records, as below:

100 1# $aRendell, Ruth,$d1930-$0ISNI 8462 8328 5653 6435

6) ISNI. [online].

3.4 ISTC / FRBR

The International Standard Text Code (ISO 21047)7) is a concept devised to identify a single original work that exists in different manifestations across publishers, booksellers, and rights-management systems. The standard was officially published in May 2009 and implementation began at the International ISTC Agency. The basic syntax of an ISTC consists of 16 digits and letters, comprising a registration element, a year element, a work element, and a check digit. An ISTC registration agency assigns the number at the request of authors and rights representatives; for works whose copyright has expired, a national library may also request assignment to support fair use.

The library community's FRBR, meanwhile, is a new conceptual model that reinterprets the bibliographic universe in terms of entities and relationships, with a hierarchy of work, expression, manifestation, and item. FRBR is regarded as so important in the library community that it underlies next-generation cataloging efforts such as the ICP and RDA. Since the model rests on the idea that one work can be expressed or produced in multiple expressions and manifestations, its most important task is grouping the various related materials at the level of the same work. OCLC has developed a work-set algorithm that clusters records derived from the same work in its bibliographic database and operates the xISBN service, which makes FRBR work-set information usable on the web. Discussion of whether to map the ISTC to the FRBR work level or the expression level remains unsettled; since the ISTC is an identifier for text-based works, MARBI (MARC Advisory Committee 2010) has recently proposed including the identifier in the MARC 21 format for fiction, as below:

024 7# $aISTC 0A3 2009 012445C9 B$2istc [ISTC for the work "Winter in Madrid"]

7) ISTC. [online].
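The sketch below splits the ISTC quoted in the MARC example above into the four elements named in the standard. The element lengths used here (a 3-character registration element, 4-digit year element, 8-character work element, and 1 check character) are inferred from the 16-character example in the text; the check-character computation defined in ISO 21047 is not reproduced.

```python
# Illustrative only: split an ISTC into the four elements named in
# ISO 21047. Element lengths (3 + 4 + 8 + 1 = 16) follow the example
# quoted in the text; check-character validation is omitted.
def split_istc(istc: str) -> dict[str, str]:
    s = istc.replace(" ", "").replace("-", "").upper()
    if len(s) != 16:
        raise ValueError("an ISTC has 16 hexadecimal characters")
    return {
        "registration_element": s[0:3],    # e.g. "0A3"
        "year_element":         s[3:7],    # e.g. "2009"
        "work_element":         s[7:15],   # e.g. "012445C9"
        "check_character":      s[15],     # e.g. "B"
    }

print(split_istc("0A3 2009 012445C9 B"))
```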
3.5 Interoperability of the Parallel Standards

As the foregoing shows, the two sides manage book metadata with parallel standards: (1) as metadata formats for discovery and identification, libraries use MARC, while publishers use ONIX for discovery, distribution, and sales; (2) for subject classification, libraries use LCSH or DDC, publishers BISAC or BIC; (3) libraries identify authors through authority control, while publishers use ISNI; and (4) to control the various manifestations derived from the same work, publishers use the ISTC (International Standard Text Code) while libraries use the FRBR model. These standards were developed for different ends: book sales and distribution on one side, library holdings and user access on the other. The former focus on distribution and marketing information such as price and status and on more efficient rights management; the latter emphasize discovery, access, and the identification of related works. The two systems therefore differ greatly in structure and semantics, and perfect compatibility is judged impossible. Notably, however, each side's standard is already being used in a limited way to supplement the other's gaps, and there is active discussion of how accumulated expertise and know-how might be adopted to develop the standards more efficiently.

4. Toward Efficient Creation and Distribution of Book Metadata

The preceding chapters analyzed the distribution trends of book metadata, drew out the related issues, and examined the parallel standards the two sides have adopted. Book metadata is used in the formats and content each participant needs along the supply chain, so the time and cost invested redundantly in creation and quality maintenance are at issue; and as the one-source multi-use trend publishes single content items in many forms, identifying the various manifestations of the same work and identifying authors accurately are also at issue. Having long run separate standards, the two sides have begun to feel the need to interoperate for economy and to respond jointly to the complex problems they face. Among these issues, this chapter focuses on improving the efficiency of creation and quality maintenance and considers a new mechanism applicable to both sides.

4.1 Analysis of Book Metadata Flow on the Supply Chain

Since the efficiency of creation and quality maintenance is the crux of the debate, the flow of book metadata needs first to be viewed at the level of the whole supply chain so that its characteristics and problems can be specified. Drawing on an OCLC study (OCLC 2009b), samples of ONIX and CIP data, and the sites of the supply-chain participants (publishers, distributors, bookstores), the flow can be reconstructed as in Figure 3 in four stages: (1) pre-publication, (2) post-publication distribution, (3) sales, and (4) library holdings and services. The principal participants, data formats, main elements, and characteristics of each stage are summarized below.

(1) Pre-publication stage
- Principal participants: publishers, national library
- Data formats: ONIX, CIP, other formats
- Main elements created: publication/printing information; basic bibliographic information; subject analysis and authority data (CIP)
- (a) Before publication, basic bibliographic information is created along with publication and printing information; but since title, subtitle, size, and the like change after publication, the bibliographic information at this stage is highly fluid. Some publishers manage metadata in ONIX, but most small publishers manage data in their own formats. (b) Data a publisher submits to the national library may be augmented with subject headings and classification numbers to become CIP data; in Korea, however, where CIP applications covered only 7-13% of all publications as of 2008 (Kim 2009), the existence of pre-publication metadata cannot be taken for granted.

(2) Post-publication distribution stage
- Principal participants: distributors, metadata vendors
- Data formats: ONIX, MARC, other formats
- Main elements created: status and distribution information; supplemented descriptive information such as reading levels, book awards, images, and tables of contents
- (a) Once a book is published, physical details such as paging and size and bibliographic data such as title and subtitle become fixed. (b) On receipt of the books, wholesalers and distributors verify the ONIX and update distribution information such as price and status; they may also build MARC data for supplying libraries. (c) Metadata vendors enhance data quality by adding authority control, subject analysis, chapter information, images, tables of contents, author biographies, and award information, and distribute it in various formats to bookstores and libraries. In Korea, the Book Trade Promotion Center creates the first ONIX metadata at this stage and distributes it to related firms and libraries as new-title information.

(3) Sales stage
- Principal participants: bookstores, users
- Data formats: ONIX, MARC, EDI, other formats
- Main elements created: distributor and price information; user-generated information such as reviews and ratings
- (a) Bookstores add new data to the metadata received from wholesalers, publishers, and vendors, including user-generated metadata such as reviews and ratings that buyers can use in purchase decisions. (b) Volatile data elements such as stock status, distributor lists, and price are updated at this stage.

(4) Library holdings and service stage
- Principal participants: bibliographic utilities, libraries, users
- Data format: MARC
- Main elements created: descriptive information, access points, local holdings information
- (a) Libraries use CIP data, or the new-title information provided by wholesalers and vendors, in acquisition decisions. (b) Once acquisition is complete, the data passes to the processing unit and local cataloging, including holdings information, is completed. (c) MARC supplied by wholesalers or obtained from bibliographic utilities is used, but (d) much time and effort may still go into entering local information and into revision and supplementation.

[Figure 3. Metadata flow on the supply chain]

Viewed across the whole supply chain, book metadata evolves through a life cycle running from the pre-publication stage to the user. Basic bibliographic information and publication/printing details are created before publication; varied descriptive information can be added at the distribution stage; and at the sales stage, where users are met directly, information useful for buyers' choices and for marketing decisions can be generated. Flows such as the national library's CIP and distributors' MARC deliveries carry data between the publishing and library communities and can reduce unnecessary rebuilding; systematizing these characteristics of the flow would allow book metadata to be created and distributed efficiently.

Despite these characteristics, however, each participant on the supply chain builds book metadata in its own format and content for its own purposes and uses it piecemeal. The elements enriched over the life cycle are not shared effectively among participants but are redundantly built or re-created, producing wasted effort. Distributors, vendors, and online bookstores each commit substantial staff to keeping marketing and management information current, while libraries spend unnecessary budget and effort buying data from wholesalers or vendors that they could obtain free from bibliographic utilities, or assigning separate cataloging staff to create it anew. Nor are the elements each side works hard to enrich sufficiently shared across the divide: the elements libraries enhance through intellectual work such as subject analysis and authority control do not actively flow into the publishing sector, and meaningful elements created or enhanced on the publishing supply chain, such as detailed publication/printing information, user reviews and ratings, author career details, book awards, and reading levels, do not flow smoothly into the library community.

4.2 Proposals for Efficient Creation and Distribution

As described above, book metadata shares a single source but is used in different formats according to each side's purposes, and the elements enriched by each community's efforts are not mutually shared. The library and publishing communities both recognize the possibility of, and the need for, interoperation, and want MARC and ONIX to exchange what each lacks in a timely way; more fundamentally, they hope that high-quality bibliographic data can be created from the pre-publication stage, shared by both communities, and allowed to evolve over the life cycle. With that in mind, this section considers several approaches.

4.2.1 A Mutual-Enrichment Mechanism for ONIX and KORMARC

Metadata circulating in the publishing sector and library catalog data should be able to make up for each other's gaps. Publishing-sector metadata can carry, beyond detailed publication/printing information, a variety of information over the life cycle that libraries find hard to obtain; conversely, library catalog data can carry information hard to obtain in the publishing sector, produced through subject analysis and authority control. Bibliographic elements can be enriched still further by mining the accumulated store of bibliographic data: if, for example, FRBR work-set clustering is used to extract from the accumulated database the bibliographic records related to a new book's original work, detailed information on the original work's title, author, and subject, and information on the various related works derived from the same original, can be obtained mechanically and used to advantage by both sides.

In Korea, the bibliographic elements of ONIX created at the distribution stage are converted to KORMARC and distributed to the library community through bibliographic utilities. By revising the conversion algorithm used there, a new mechanism could be introduced in which ONIX and KORMARC enrich each other. Figure 4 is a conceptual diagram of the mechanism proposed in this paper; it is explained in more detail below, and a schematic sketch of the three-case logic follows at the end of this section.

[Figure 4. Conceptual diagram of the ONIX-KORMARC mutual-enrichment mechanism]

First, as the top of the diagram shows, the bibliographic elements of ONIX created at the distribution stage are currently converted to MARC through an ONIX-to-KORMARC algorithm and distributed as new-title information to bibliographic utilities and the rest of the library community.

Second, simple ONIX-KORMARC mapping can lose a variety of elements that libraries have not traditionally handled but whose importance has recently been stressed, and structural problems such as differing levels of description make some mappings difficult. For example, the "contributor name/type" composite defined in ONIX PR.8 has sub-elements for position, affiliation, the contributor's career, the contributor's background, and so on, but the difference in descriptive granularity from KORMARC makes these elements hard to accommodate in conversion. The same holds for the "audience" composite in PR.14, whose sub-elements include audience age, audience description (e.g., computer novices), audience range, and the precision of that range, yet which is not easily converted for the same reason. The ONIX conversion algorithm needs to be supplemented to accommodate such elements, and in some cases structural revision of KORMARC may also be needed to accept them.

Third, mapping KORMARC and ONIX for mutual enrichment yields three possible cases: ONIX exists with no matching MARC; ONIX and MARC both exist; and KORMARC exists but ONIX has not yet been created. (1) In the first case, ONIX for a new title has been created but KORMARC has not. Here the bibliographic elements of ONIX (ISBN, series, title and statement of responsibility, author, edition, publication information, audience, etc.) can generate KORMARC automatically, and that data can serve the library community as new-title information. (2) Where both exist, a mutual-enrichment algorithm can enrich KORMARC with ONIX-specific elements (author career, author biography, audience age, reviews, etc.) and simultaneously enrich ONIX with KORMARC-specific elements (subject headings, classification numbers, related-work information, etc.). The algorithm can detect elements missing from the record being mapped and fill them in, or replace weaker values with the better side's elements. (3) Finally, KORMARC is sometimes created before ONIX: because Korea's first ONIX is built at the distribution stage, a library that buys a new title directly from the publisher may occasionally create KORMARC first. In that case, the bibliographic elements of the existing KORMARC (ISBN, title and statement of responsibility, subject analysis, authority control) can generate ONIX automatically, and that data can serve as the basis for building a complete ONIX record.

Fourth, a single content item is now reproduced across media as books, comics, films, and so on, and even a newly published book may be an adaptation, enlargement, or translation of an existing work. OCLC applies this idea through the FRBR algorithm in its Next Generation Cataloging pilot. In the same way, if FRBR clusters are generated in UNICAT, Korea's largest bibliographic database, bibliographic information common to a target book's various related works (subject headings, classification numbers, original title, original author, etc.) can be extracted mechanically to raise the quality of new-title bibliographic data; thus even in case (1), where no matching KORMARC exists, the bibliographic elements of ONIX can be enriched from FRBR cluster information.
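The three-case decision logic proposed above can be summarized in a few lines of schematic Python. The field names (`author_bio`, `subject_headings`, and so on) stand in for the actual ONIX and KORMARC elements, and the two conversion helpers are placeholders for a full crosswalk such as the Book Trade Promotion Center's KORMARC-ONIX conversion table; none of this is a working implementation.

```python
# Schematic sketch of the three-case mutual-enrichment logic described
# above. Records are modeled as plain dicts; field names are placeholders
# for the real ONIX / KORMARC elements.

ONIX_ONLY = ["author_bio", "author_career", "audience_age", "reviews"]
MARC_ONLY = ["subject_headings", "class_number", "related_works"]

def onix_to_marc(onix: dict) -> dict:
    return dict(onix)  # placeholder for a real ONIX -> KORMARC crosswalk

def marc_to_onix(marc: dict) -> dict:
    return dict(marc)  # placeholder for a real KORMARC -> ONIX crosswalk

def reconcile(onix: dict | None, marc: dict | None) -> tuple[dict, dict]:
    if onix is not None and marc is None:
        # Case 1: ONIX exists, KORMARC does not -> derive a provisional
        # KORMARC record to circulate as new-title information.
        return onix, onix_to_marc(onix)
    if onix is not None and marc is not None:
        # Case 2: both exist -> enrich each side with the elements that
        # only the other side carries.
        for f in MARC_ONLY:
            if marc.get(f) and not onix.get(f):
                onix[f] = marc[f]
        for f in ONIX_ONLY:
            if onix.get(f) and not marc.get(f):
                marc[f] = onix[f]
        return onix, marc
    if marc is not None:
        # Case 3: KORMARC was created first (e.g. a direct purchase from
        # the publisher) -> seed a skeleton ONIX record from it.
        return marc_to_onix(marc), marc
    raise ValueError("at least one record must be supplied")
```

In practice, case 1 would also consult FRBR cluster information (the fourth point above) so that even a brand-new title inherits subject and original-work data from its related works.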
4.2.2 Creating and Sharing High-Quality Bibliographic Data by Revitalizing CIP

If the ONIX-MARC mutual-enrichment mechanism described above is a compensating measure for a domestic environment in which CIP is not yet well established, revitalizing CIP is the more fundamental way for the publishing and library communities to create and share high-quality bibliographic data from the pre-publication stage. CIP can minimize the duplicated effort of publishers, distributors, and bookstores in creating bibliographic data along the supply chain, while at the same time improving the economics of library work through copy cataloging. Alongside the mutual-enrichment mechanism, then, ways of encouraging publishers to apply for CIP also deserve attention. If publishers that supply CIP could promote forthcoming books in a timely way and secure sales channels, participation would rise naturally (Kim 2009); a partnership with the publisher new-title registration service of "Naver Books," Korea's largest book portal with links to sales systems, could be expected to help considerably in raising CIP participation.

4.2.3 A Mechanism That Evolves over the Life Cycle

As noted above, book metadata accumulates along the supply chain over time not only business-related information such as a book's status, stock, and price but also user-perspective information such as ratings, reviews, and tags. Such information can serve the publishing side as business decision data and the library side as information that helps users judge and choose. For data to be maintained consistently and to evolve over the life cycle, the publishing sector must first adopt internal standards, and a mechanism must also be devised by which additions and updates flow back through the supply chain. It is worth considering an arrangement in which data enriched via ONIX feeds is automatically fed back toward the source, passed on to bibliographic systems such as union catalogs for linked updating, and fed onward to the local libraries that have downloaded it.

5. Conclusion

5.1 Summary

First, this study analyzed metadata distribution trends and related issues from the standpoint of supply-chain stakeholders. As single works are issued in multiple media and formats, publishers are being asked for richer descriptive information, and with the growth of the online market, metadata has become a key element of publishing marketing. The library community, pursuing the economic effectiveness of cataloging, is seeking models that simplify cataloging using data sources close to the original source; and with the one-source multi-use trend, identifying the various manifestations of the same work and identifying authors accurately have become issues for both sides. Against this background, calls have recently been raised for interoperation for economy and for a joint response to the complex shared problems.

Second, it examined the parallel data standards applied in the publishing and library communities. Publishers use ONIX, BISAC/BIC, ISNI, and the ISTC for sales and distribution; libraries use MARC, LCSH/DDC, authority control, and FRBR for holdings and user access. The two systems differ greatly in structure and semantics, so perfect compatibility is impossible, but it is judged that mutual supplementation of each other's gaps would allow more efficient operation. Creating quality descriptive metadata that supports users' choices and purchase decisions, and identifying related works and authors, are common tasks facing both standards and urgently require a joint response built on the expertise accumulated on each side.

Third, it identified the characteristics and problems of the flow of book metadata circulating in the two communities. Current book metadata is used in the formats and content each participant needs, from the pre-publication stage through to the user, by publishers, distributors, bookstores, and libraries. Although it evolves through complex exchange over the life cycle, it is also redundantly built or re-created by individual participants, producing wasted effort, and the elements enriched over the life cycle are not shared effectively among them.

Fourth, building on this analysis, the study explored introducing a more evolved mechanism into the book metadata life cycle, proposing, as described in the suggestions below, three measures by which the two sides can minimize duplicated effort and operate metadata to mutual advantage.

5.2 Suggestions

If, through a new mechanism, the library and publishing communities can exchange in a timely way what MARC and ONIX each lack, libraries can secure prompt, accurate descriptive information and thereby improve the economics of the catalog, while publishers can promptly obtain metadata enriched by subject analysis and authority control and deliver quality data quickly to seller systems. This study proposed several measures toward a new system for creating and distributing book metadata. First, an algorithm should be introduced that maps ONIX and MARC data to each other and makes up each side's deficits: a new framework should be devised through which the detailed publication/printing information and the marketing information such as author biographies and reviews that MARC lacks, and the subject analysis, authority control, and related-record information that ONIX lacks, are mutually supplied for timely use by both sides.
Second, the CIP program should be revitalized so that pre-publication book information flows smoothly into the library community and bibliographic information enriched by libraries' intellectual work is in turn readily used in the publishing sector. Third, a mechanism should be devised by which the varied information added and updated over the book metadata life cycle is fed back through the supply chain via ONIX feeds, and automatically fed to the libraries already using downloaded copies.

References

[1] Kim, Sun-Ae. 2009. "A review of Korea's CIP program." 46th KLA General Conference, Seminar 3: Public Library Catalogs, Fast and Accurate. [online]. [cited 2009.12.1]. (In Korean.)
[2] Jang, Ji-Suk. 2009. "Improving the quality of public library catalogs: focusing on CIP." 46th KLA General Conference, Seminar 3: Public Library Catalogs, Fast and Accurate. [online]. [cited 2009.12.1]. (In Korean.)
[3] Book Trade Promotion Center. 2006. KORMARC-ONIX conversion table. [online]. [cited 2009.10.1]. (In Korean.)
[4] National Diet Library (Japan). 2007. "Libraries that do not adopt the Dewey Decimal Classification become a topic of debate." Current Awareness-E, No. 111. [online]. [cited 2009.10.1]. (In Japanese.)
[5] A Symposium for Publishers and Librarians. [online].
[6] BISAC. [online].
[7] EDItEUR. [online].
[8] Godby, Carol Jean. 2010. "Mapping ONIX to MARC." [online]. [cited 2010.4.10].
[9] ISNI. [online].
[10] ISTC. [online].
[11] Mitchell, Joan S. 2010. "BISAC-DDC Mappings." ALA Midwinter Meeting, Boston, January 16, 2010. [online]. [cited 2010.3.1].
[12] MARC Advisory Committee. 2010. "MARC Discussion Paper No. 2010-DP03." [online]. [cited 2010.3.10].
[13] NISO and OCLC. 2009. "Streamlining Book Metadata Workflow." [online]. [cited 2009.12.1].
[14] Oder, Norman. 2007. "Behind the Maricopa County Library District's Dewey-less Plan." Library Journal, May 31, 2007. [online]. [cited 2009.10.1].
[15] OCLC. 2009a. "Report on OCLC's Symposium for Publishers and Libraries." [online]. [cited 2010.3.3].
[16] OCLC. 2009b. "From ONIX to MARC and Back Again: New Frontiers in Metadata Creation at OCLC." ALA Midwinter, January 25, 2009. [online]. [cited 2010.3.13].
[17] R2 Consulting LLC. 2009. "Study of the North American MARC Records Marketplace." [online]. [cited 2009.12.30].
[18] Working Group on the Future of Bibliographic Control. 2008. "On the Record: Report of The Library of Congress Working Group on the Future of Bibliographic Control." [online]. [cited 2009.11.1].

----

Like a Snowball Gathering Speed: Development of ASERL's Print Journal Retention Program

DIANE BRUXVOORT, University of Florida Libraries, Gainesville, Florida
JOHN E. BURGER, Association of Southeastern Research Libraries, Durham, North Carolina
LYNN SORENSEN SUTTON, Z Smith Reynolds Library, Wake Forest University, Winston-Salem, North Carolina

The Association of Southeastern Research Libraries (ASERL) has instituted a distributed print archive program to share the costs and effort of long-term retention of print journals.
Beginning with a review of the history of the project, the authors address the voluntary nature of participation in the program, methodologies for identifying journals to be retained, as well as methodologies for tagging retained journals so that an accurate record of current decisions is maintained over the twenty-five year length of the commitment. Policies and procedures continue to evolve as participation increases and lessons are learned with each step.

KEYWORDS: print repository, distributed print archive, print journal retention, ASERL, cooperative collection management, off-site storage

Address correspondence to Lynn Sutton, Dean, Z Smith Reynolds Library, Wake Forest University, Winston-Salem, NC 27109. E-mail: suttonls@wfu.edu

HISTORY

The Association of Southeastern Research Libraries (ASERL) was founded in April 1956 by the deans/directors of 32 research libraries in the Southeastern U.S. to provide a forum for sharing knowledge and best practices about emerging issues in the profession. Since that time, ASERL has been a leader in forging important partnerships and other activities to sustain research libraries. In the 1960s, ASERL was an important player in the establishment of faculty status for research librarians. In the 1970s, ASERL members founded SOLINET to serve the needs arising from the newly-created centralized cataloging services brokered by OCLC. In the 1990s, ASERL members helped create some of the first group purchasing/licensing activities, a hallmark of library cooperation. More recently, ASERL has created a wide variety of programs to meet member needs, including active professional development and networking activities, expanded resource sharing services, and cooperative collection management programming.

In September 2000, ASERL surveyed its members to determine the need for off-site storage. At the time, on average, ASERL members each had need to store 93,000 volumes immediately, and each expected to need room to store an average of 300,000 volumes by 2005. Discussions with non-ASERL colleagues concerning development of a regional, shared storage facility showed many other libraries needed additional storage space as well. However, at the time there were significant reservations from state-supported libraries concerning sending their materials out of state. Further, the idea of large-scale weeding and discarding met great resistance because of the potential impact on comparative statistics and rankings among libraries. It was during these discussions in 2001 that Paul Willis, then Dean of Libraries at University of Kentucky, suggested that a virtual storage concept might make more sense than building a shared regional facility. He believed that new technologies, linked catalogs, and improved delivery systems made such a system possible. ASERL provides a shared catalog and a delivery system for many of its member libraries, making this a possibility -- at least in theory.

In 2002, at an ARL/OCLC Institute workshop on library architecture, there was discussion that very few (if any) libraries will ever weed the items held in remote storage facilities -- these items are a de facto "bank" of permanently stored materials. This led to many discussions about how a virtual storage collection could be created based on this premise, often championed by Paul Gherman, who was University Librarian at Vanderbilt University at the time.
In 2003, the concept was also discussed and identified as an Action Item at the CRL conference on Preserving America's Print Resources (PAPR). That same year CLIR published Developing Print Repositories: Models for Shared Preservation and Access, a report on regional repositories as a cost effective solution for collections management. One goal was to "determine how, and to what degree, various consortia and university systems are using repositories to move beyond the immediate goal of providing cost-effective collection storage and delivery and to begin to cooperatively manage and preserve their research collections" (Reilly and DesRosiers 2003, 2). There has been international interest in this concept as well: in 2004 it was discussed at the IFLA-sponsored Second International Conference on Repository Libraries in Kuopio, Finland, among attendees from 15 countries. Much of this discussion focused on monographs, as there was concern at the time that the labor required to identify complete journal runs and page-level verification would be too onerous to undertake.

In 2004 ASERL commissioned a study by OCLC to determine the level of overlap in monograph titles held at nine storage facilities owned by ASERL libraries. Of the 2.3 million volumes stored at these facilities, only 15 items were held at all nine sites. Given the premise that these collections would be permanent, the lack of widespread overlap in these collections pointed to a rich source that could be used as a "bank" for use by ASERL libraries seeking to weed locally-held circulating collections.

In December 2004, Google Library/Book Search was announced and the notion of large-scale digitization of legacy print collections became much closer to reality. Additionally, the importance of volume counts in library rankings was evolving. Today, research libraries are more focused on special and unique materials rather than simple quantitative data. With the advent of Google Scholar and the emergence of e-readers and mobile computing in the latter part of the decade, library users began to expect electronic access to the overwhelming majority of materials.

As the decade progressed, interest in how to deal with legacy print collections continued to increase. In collections journals, authors looked at Approaches to the Storage of Low Use and Last Copy Research Materials (O'Connor and Jilovsky 2008), and put forward suggestions on Developing Criteria for the Withdrawal of Print Content Available Online (Bracke and Martin 2005). O'Connor and Jilovsky outline ASERL's concept of a "virtual storage collection to… assist with the identification of last copies and the wider availability of low-use materials" (2008, 123), but the concept had yet to come to fruition. The literature at the time also began to reflect joint storage projects. Paul Genoni (2008) describes the efforts in Australia to reach a national solution. With three regional repositories in place, the national repository concept struggled toward implementation. "There are many reasons to believe that Australia research libraries and communities would benefit substantially from a national print repository. It will only be possible, however, with the right structure for leadership, coordination and advocacy" (Genoni 2008, 251).
The title of O'Connor and Smith's 2008 article put Ohio's repository needs in perspective: Ohio Regional Depositories: Moving from Warehousing Separate Collections to Servicing Shared Collections (2008). Finally, in 2009, DiBiase and Watson reported on the Orbis Cascade Alliance distributed print repository: "If a consortium could identify "archival" copies of journal runs that would be kept at member libraries (and eventually at a shared storage facility) in perpetuity, then many libraries could exercise discretion to safely withdraw duplicate print runs to make room for new materials" (22).

The Ithaka report, What to Withdraw: Print Collections Management in the Wake of Digitization (Schonfeld and Housewright 2009), pointed to a new methodology that calculated real-world risks for reducing the number of duplicative print collections. ASERL members quickly realized the far-reaching potential of the Ithaka report and officially endorsed it at their Fall 2009 Membership Meeting. This was a move to emphasize ASERL's support for transitioning from a reliance on print to an environment of electronic delivery for most users, with the ability to access original print materials when needed. As these changes were occurring, the Andrew W. Mellon Foundation provided funding for the WEST Project in 2009, launching a new model for sharing legacy print resources across a broad geographic area. It was becoming clear that most library users increasingly expected electronic access to materials -- both newly published, born-digital materials and digitized versions of older, printed materials. Many librarians were also becoming more comfortable with "Just In Time" models of collection management -- relying on shared access to rare materials just in time to meet a user's need -- rather than a "Just In Case" model that requires creating vast local collections just in case a user might someday want them.

In April 2009, ASERL members convened a special session in Williamsburg, Virginia to revisit the much-discussed notion of a shared virtual storage collection. It was at this meeting that the new emphasis on rare and unique items in library collections raised issues about relying on a few copies of a specific monograph -- would the editions be the same? Might there be unique qualities such as marginalia that would make weeding monographs an unwise decision? These questions remain largely unanswered. Moreover, as library space pressures continue to mount, the opportunities for sharing bound journal collections were increasingly apparent: many journal back files are becoming available online, greatly reducing the need for print access. And in general, journals don't have the issues that can arise with monographs, such as varying imprints and editions that might make one copy more interesting or unique than others. Plus, there was significant space to be gained -- for most longstanding journals, an agreement to retain a single journal title for use by others could free up considerable shelf space across the consortium, whereas an agreement to retain a monograph had much less potential for space savings. Thus, an important decision was made: ASERL would re-focus its shared virtual collection on retaining print journals rather than monographs.
OTHER REGIONAL MODELS

The ASERL Board of Directors charged the Shared Storage Study Group (SSSG) to examine the policies created by other research libraries and consortia and recommend a course of action for ASERL members to collaborate on sharing legacy print journal collections. Fortunately, the Center for Research Libraries (CRL) has played a national leadership role in facilitating communication among regional consortia that are either actively running or currently considering shared print archives. SSSG investigated the current status/best practices of the following models:

WEST
- Western Regional Storage Trust, run by California Digital Library
- Funded by three-year Mellon grant
- Phase One goal of 150,000 volumes from 8,000 journals
- 25 year commitment with review every 5 years
- Access through existing interlibrary loan channels

Orbis/Cascade
- Regional distributed print repository with Memorandum of Understanding
- JSTOR and American Chemical Society titles in print archive
- Group membership in WEST

United Kingdom Research Reserve
- Partnership between British Library and Higher Education
- At least 3 copies of low use journals maintained within UK
- Access through British Library Document Supply Service

FORMING A POLICY PROPOSAL

SSSG members were determined to create a proposal that would be as simple as possible and reduce barriers to participation. In any collaborative process, inevitable barriers arise that may challenge, slow down, or even kill a project, and this project was no exception. As SSSG members (a mix of deans/directors and collections/technical services librarians) deliberated, several issues arose.

Free Riders

Several institutions worried that some libraries would take advantage of collections held by others and weed their own lesser-used volumes but not offer any titles to the group. This was eventually resolved through acceptance of this as a possibility, but with a determination that it would not be allowed to stop progress.

Cost Allocation

Some wondered if those who were storing large amounts of material should be compensated by those who were storing less or otherwise benefiting from the retention commitments. It was decided that participating libraries were making retention decisions largely based on their own patron needs, so it was deemed fair that each library should be responsible for their own storage costs. In the end, participants agreed that the costs and complexities of any compensation model were greater than any likely returns.

Access

Consideration was given as to whether participants in the program should be given priority for access to collections held by other participants. It was decided that it would not be worth disrupting existing interlibrary loan/document delivery networks that are working well. This was the same conclusion reached by the WEST group.

MEMORANDUM OF UNDERSTANDING

The proposed agreement was less than two pages written in simple terms – a considerable achievement in a world of long-winded policies, licensing terms, and other agreements. The final agreement (Appendix A) was modified only slightly to incorporate changes that allowed
Governance was given over to a Steering Committee with representatives from each participating library. The duration of the Agreement (and thus the institutional commitment) was set for 25 years, similar to WEST, but providing an opt-out clause with 24 months‟ notice and an overall program review in 2020 and 2030. Verification was mandated at the volume level, rather than page level. Flexibility in housing arrangements was built in, with transparent risk factors identified for remote storage facilities, locked/secured stacks, and open stacks with corresponding access risk factors for non- circulating, building use only and circulating statuses. One of the most political decisions was whose signature was needed on the agreement. Some deans/directors felt comfortable signing the long term agreement. Other institutions had guidelines that required a chancellor‟s or president‟s signature. In the end, who signed the agreement was left to local discretion. The agreement was ratified by unanimous vote at ASERL‟s Spring 2011 Membership Meeting in Nashville, Tennessee. TITLE INCLUSION STRATEGY With the Agreement and a Steering Committee in place, the next step was for each library to nominate titles for inclusion. Many potential models were considered. Even though the Ithaka report (Schonfeld and Housewright 2009) specifically advised that additional local print copies of JSTOR titles were not needed, some libraries felt that it would be prudent to have a JSTOR archive in the Southeast. Others favored a publisher-based strategy for a systematic approach. Others felt that “low-hanging fruit” at each institution was as much as the group could hope for. ASERL‟s Print Journal Retention Program 10 At this point, ASERL deans/directors were growing anxious and wanted to see significant progress. Because of the existence of other cooperative projects and the long-standing history of trust among ASERL members, the predominant feeling was that even with a certain amount of risk and ambiguity, the project needed to move forward. But who would be first? The logjam was broken when the members of the Triangle Research Library Network (TRLN, consisting of the University of North Carolina-Chapel Hill, Duke University, North Carolina State University, and North Carolina Central University) committed all the journal titles from their own local Cooperative Print Retention program to the larger ASERL initiative. The number of committed titles grew rapidly from 300 to 1,000 to 2,000. Connections were also made between ASERL and the National Library of Medicine to coordinate with NLM‟s regional efforts to retain the top 250 core journal titles in the health sciences. In the end, the ASERL agreement to retain legacy print journal collections allows participating libraries to select titles they wish to retain based on local needs and interests. As representatives from participating institutions began to discuss practical logistics for identifying, compiling and collating titles to be held for the distributed print archive, it became clear that internal methodologies for decision-making on title identification would be as distributed as the archive, and much discussion would be needed to come to agreement on logistics for compiling and collating the lists of titles. 
Each ASERL member institution has their own unique set of circumstances driving their participation in the archive: the closing of a branch, the opportunity to move print volumes to a new storage facility, a storage facility that is filling up too quickly, the need to replace stacks with user seating, etc. Since the needs are different, it follows that the decision-making processes will be different. The most common factor in identifying titles for the archive is working with titles that are currently under review for another reason. Institutions wish to handle materials once, making decisions on a title's disposition and moving on to the next project. An institution will typically bring a set of titles under consideration, most often either those housed in a physical location that is being reconsidered, or those compiled electronically in a database, to collection development librarians or subject liaisons for decisions on whether individual titles should be kept or discarded; and if kept, whether they should be tagged for inclusion in the distributed print archive.

Factors considered in deciding if a title should be included in the print archive include the completeness of a title run, the uniqueness of a title, inclusion of a title in an electronic journal package to which the library subscribes, and the advice of the relevant subject librarian as to continued usage of the title by the students and faculty of the institution. Complete or nearly complete title runs are favored for inclusion in the archive as they decrease the need for filling gaps in the run from other institutions' collections. Unique titles, or those journals not held by a large number of libraries, increase the diversity and depth of the archive, actually increasing their effectiveness through a wider distribution. Subject librarians also help identify titles that are core to the teaching or research needs of the students and faculty of the institution and are most likely to be used in print.

COLLECTION AND COLLATION OF TITLES

As participating institutions began to identify titles, the list of logistical questions grew. How will the titles be initially collected? How will the titles be collated? If spreadsheets are used for collection, are they sufficient for collation, or does a database need to be built? Is there an open source database that would serve for collation? What information needs to be included? Must everyone include the same information or can/should there be optional fields? Will participating institutions work directly in the collated title list, or will ASERL staff collate based on institutional spreadsheets? Can the work be as distributed as the archive? How will duplication be handled? Must titles identified be retained in closed stacks or a storage facility? May volumes held in open stacks be included in the archive? The list of questions seemed daunting at times.

While some answers are known, others are still evolving. Through trial and error and a series of conference calls with the program's Steering Committee, procedures began to develop. A spreadsheet with requested (not yet required) fields was developed and institutions began to identify and nominate titles for inclusion in the archive. Populated spreadsheets are sent to ASERL and collated into the archive master title list, which currently exists as a spreadsheet.
Open source software is being considered for compilation of a database, but no decision on particular software has been made at the writing of this article. With thousands of titles already listed and many more to come, the spreadsheet will become increasingly difficult to maintain and manipulate. As a result, participants have agreed that a database is the preferred long term method for storage of and access to the title list. Librarians and staff at participating institutions work to provide accurate information in a standard format to facilitate the work of ASERL staff in collating the titles.

The Steering Committee agreed that duplication of titles is acceptable within the archive. When identifying titles to be added to the distributed print archive, participants can review the title list to see if a title is already included. However, they may choose to include a title already listed without needing to justify their decision. Reasons for doing so may include a more complete run, the need to have a print copy close at hand, doubt as to whether the other institution will be able to maintain the title, or the level of security identified for the title at another institution. Decisions on inclusion are strictly up to the participating institution. This decision may come up for discussion again as the archive evolves, but since the archive is in a distributed model, duplication becomes much less of an issue than it would in a physical archive.

When identifying titles for retention in the distributed print archive, participating institutions must identify for each title whether the physical volumes are held in high density remote storage, locked/secured stacks, or open stacks. Any of the three is permissible, but the location must be disclosed since the level of security of the materials will affect decisions made by other participating institutions. Inclusion of materials in open stacks is atypical for an archive, but this option allows institutions to participate who do not have access to a secure storage facility or closed stacks. Participants must also identify the current circulation status of the volumes to indicate levels of potential risk. Identified categories are Circulating, Non-Circulating, and Building Use Only.

THE RECORD

Conversations continue on what additional information needs to be included in the record for each title, with enough decisions made to proceed. Each record includes: institution name, OCLC symbol, title, print ISSN, OCLC number, contributions, gaps, total volumes, retention note, type of inventory (physical or bibliographic), circulation status, archival status, and risk level. A few other fields were included initially -- 'subject' and 'average cubic feet' -- but were later removed by the Steering Committee. A variety of small, but important decisions needed to be made for many of the included fields. For example, librarians are quite adept at remembering to leave off initial articles in journal titles, but not as likely to remember to do this for foreign language titles. The contributions are being left open-ended, since a closing date for the archive has not been chosen, but this may change in the future.
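A minimal sketch of what one row of the master title list might look like as a structured record, using the fields enumerated above. The field names, types, and example values are this sketch's own illustrative rendering, not ASERL's actual spreadsheet headings.

```python
# Illustrative model of one row in the master title list; names, types,
# and example values are invented for this sketch.
from dataclasses import dataclass

@dataclass
class RetainedTitle:
    institution: str         # participating library
    oclc_symbol: str
    title: str               # initial articles omitted (incl. foreign titles)
    print_issn: str
    oclc_number: str
    contributions: str       # holdings held; left open-ended for now
    gaps: str                # explicit listing of missing volumes
    total_volumes: int
    retention_note: str      # mirrored locally in a MARC 583 field
    inventory_type: str      # "physical" or "bibliographic"
    circulation_status: str  # Circulating / Non-Circulating / Building Use Only
    archival_status: str
    risk_level: str          # derived from housing + circulation status

row = RetainedTitle(
    institution="Example University", oclc_symbol="EXU",
    title="Journal of Example Studies", print_issn="0000-0000",
    oclc_number="12345678", contributions="v.1 (1950)-v.60 (2010)",
    gaps="v.12 (1961)", total_volumes=59,
    retention_note="Retained through 2035 per ASERL agreement",
    inventory_type="bibliographic", circulation_status="Building Use Only",
    archival_status="committed", risk_level="Low - Moderate Risk",
)
```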
Participating institutions will include a retention note in a 583 field in their home library catalogs to indicate their compliance in the record, but most institutions have chosen to postpone implementation until the metadata standard being developed by CRL and OCLC is available for their use. To expedite the opportunity to fill the gaps in the future, the Steering Committee decided, after much discussion, to include an explicit listing of gaps for each title, rather than leaving them to be figured out from the contributions field. The University of Florida has begun to develop a disposition database to be used for filling gaps, building on a similar tool they developed to facilitate the disposition of federal documents (another collection management program offered by ASERL). Questions remain as to whether volumes identified to fill gaps will remain at the home library or will be transferred to the library that is retaining the majority of the volumes for the title.

ASSESSMENT

The plan for assessing the utility and value of this project continues to evolve. ASERL is a small organization with limited staff resources, so a full-scale, statistically-valid evaluation is not realistic. Instead, ASERL expects to rely on anecdotal and simple survey information from members regarding the usefulness of the program. For example, at least one ASERL library is already considering withdrawing some locally-held print journal titles based on items that will be retained by another library under this agreement. ASERL will periodically track this type of information, and perform simple calculations using the number of volumes retained to estimate potential space savings for libraries.
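ASERL has not published the formula behind such calculations, but a back-of-the-envelope estimate of the kind described above might simply divide withdrawable duplicate volumes by an assumed shelving density. Both numbers in the sketch below are invented for illustration.

```python
# Back-of-the-envelope estimate of shelf space freed when duplicates of
# journal runs retained elsewhere are withdrawn. The shelving density is
# an assumption for illustration only.
VOLUMES_PER_SHELF_FOOT = 7  # assumed average for bound journal volumes

def shelf_feet_freed(duplicate_volumes_withdrawn: int) -> float:
    return duplicate_volumes_withdrawn / VOLUMES_PER_SHELF_FOOT

print(f"{shelf_feet_freed(2000):.0f} shelf feet")  # ~286 for 2,000 volumes
```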
EXPECTATIONS FOR THE FUTURE

ASERL's 2010-2013 Strategic Plan focuses heavily on cooperative programming such as the Print Journal Retention Program. ASERL members enjoy a high level of trust that enables creative partnerships to be built, fostering new services even during this time of significant technological and financial change. The organization will continue to develop activities to help member libraries find their desired level of redundancy for goods and services. For example, the Print Journal Retention Program comports well with ASERL's Cooperative Federal Depository Program, which seeks to improve the corpus of federal documents held by libraries across the region. Similarly, many ASERL members are Land Grant institutions, and share a mission to support the development of agriculture within their states. These libraries are developing ways to collaborate on sharing print journals and government documents related to agriculture. ASERL members are also considering ways to share technology services -- many of which have a high level of redundancy across ASERL's membership -- and, on the other end of the spectrum, to examine options for sharing the costs and workload for infrequently-used services, such as non-English cataloging. ASERL will also continue to seek new partners -- either through adding libraries as new members, or via strategic partnership agreements with other library consortia -- to help ensure that research libraries remain vital, important centers to support research, teaching and learning on their campuses and within the communities they serve.

APPENDIX A: ASERL'S COOPERATIVE JOURNAL RETENTION POLICY

ASERL Collaborative Journal Retention Program Agreement -- Approved April 2011

Introduction

ASERL libraries seek new options for sharing the costs and effort of long-term retention of print journals. The policies contained in this document have been reviewed and approved by the ASERL Board of Directors and all participating ASERL libraries. The following agreement provides assurance that the journals designated under this agreement will be retained and available for research purposes as long as the need reasonably exists, thereby allowing participating ASERL libraries to consider withdrawing duplicates of said items from their campus collections, and to rely with confidence on access to the retained copies.

1. Governance
1.1. The program will be governed by a Steering Committee consisting of one representative of each participating library and a liaison from the ASERL Board of Directors. Each participating library director will designate the Steering Committee member. The ASERL Executive Director shall be an ex officio member of the committee and shall be non-voting except to decide any tie votes.

2. Duration of Agreement, Discontinuance of Participation
2.1. This agreement shall be in effect through December 31, 2035, upon which time this agreement may be renewed as desired by participating libraries. This agreement will be reviewed in 2020 and 2030 to ensure it continues to provide value to participants.
2.2. Any modification, amendments or other changes to this agreement must be approved by a 2/3 majority vote of the Steering Committee and a review of the ASERL Board.
2.3. A participating library may opt to discontinue their participation in this agreement at any time without penalty, but must provide written notice to the Steering Committee a minimum of 24 months prior to withdrawing from the agreement.

3. Selection and Identification of Retained Materials
3.1. This agreement is designed primarily for storing low use print journals.
3.2. Materials will be selected for retention based on the completeness of the journal set and their quality/condition.
3.3. Participating libraries shall note the retention status of designated items within their local catalogs and/or other collection management systems, as deemed appropriate by the Steering Committee.
3.4. ASERL shall maintain a free and publicly accessible list describing the journals retained under this agreement, as deemed appropriate by the Steering Committee.
3.5. The participating library shall maintain all of the designated journals in their original, artifactual form whenever possible. If necessary because of damage to or loss of the original of any of the materials, a hard copy facsimile may be used to fill in gaps.

4. Retention Facilities
4.1. Items that are to be retained under this agreement will be housed in one of the following types of facilities:
- High Density Remote Storage Facility: an environmentally controlled, secured facility that is not open for public browsing
- Locked / Secured Stacks: on-site access that is not open for public browsing
- Open Stacks: open for public browsing

5. Ownership and Maintenance of Retained Materials
5.1. The ownership of materials designated for retention under this agreement shall remain the property of the library that originally purchased the item(s).
The library that agrees to retain a set of journals will verify the degree of completeness of the set to the volume level.
5.2. Upon agreeing to retain a set of journals, the retaining library will visually inspect each volume to ensure its serviceable condition. Serviceable condition will be defined as physically usable. Materials infested by mold or otherwise in a state of obvious deterioration will not be accepted for retention.
5.3. Should a participating library be unwilling or unable to retain a set of journals that were designated as part of this agreement, that library must provide 12 months' written notice to ASERL and offer to transfer ownership of said journals to another ASERL library for retention under this agreement.

6. Operational Costs
6.1. All costs and workload for staffing and maintaining the facilities and retained materials will be borne by the library that undertakes the agreement.

7. Duplicate Materials
7.1. Any ASERL library may at its discretion retain duplicates of items retained under this agreement by other members of ASERL. No ASERL library will be required to discard any materials.

8. Circulation
8.1. Access to the contents of retained journals will be through electronic or paper duplication, or on-site access to specified items at the contributing library's discretion.
8.2. The current circulation status of contributed titles must be accurately reported to indicate levels of risk. Levels of potential risk are defined in the table below:

                     High Density Remote     Locked /                Open Stacks
                     Storage Facility        Secured Stacks
Non-Circulating      Lowest Risk             Low Risk                Moderate Risk
Building Use Only    Low Risk                Low - Moderate Risk     Moderate - High Risk
Circulating          Moderate Risk           Moderate - High Risk    Highest Risk

9. Lost or Damaged Materials
9.1. In the event of loss, damage or deterioration, the participating library shall use reasonable efforts to promptly obtain replacement copies of any of the retained items. Original artifactual copies are always preferred, but facsimiles are acceptable when necessary.

APPENDIX B
ASERL LIBRARIES PARTICIPATING IN THE COLLABORATIVE JOURNAL RETENTION PROGRAM AGREEMENT (AS OF JANUARY 2012)

Auburn University
Clemson University
Duke University
East Carolina University
Georgia Institute of Technology
Louisiana State University
Mississippi State University
Tulane University
University of North Carolina at Chapel Hill
University of North Carolina at Greensboro
University of Alabama
University of Florida
University of Kentucky
University of Louisville
University of Memphis
University of Mississippi
University of Tennessee
University of Virginia
Vanderbilt University
Virginia Commonwealth University
Virginia Tech
Wake Forest University
College of William & Mary
work_dbge7dlezrfdzonpysgtloxy2a ---- UC San Diego Previously Published Works

Title: One for Nine Ten: Cataloging for Consortia Collections, a UC model
Permalink: https://escholarship.org/uc/item/26p2d48s
Journal: Cataloging & Classification Quarterly, 56(2-3)
ISSN: 1544-4554
Authors: Chin, Renee; Deng, Shi; Culbertson, Rebecca; et al.
Publication Date: 2017-11-21
Peer reviewed

One for Nine Ten: Cataloging for Consortia Collections, a UC model1

Renee Chin, Rebecca Culbertson, Shi Deng, Kathleen Garvey-Clasby, Bie-hwa Ma, Donal O'Sullivan, Annie Ross

Abstract: In January 2000, the University of California created the Shared Cataloging Program (SCP). Based at the University of California, San Diego, the SCP is a "centralized cataloging model" for the California Digital Library consortium collections.
This article will take an evolutionary look at the perpetual challenges of sustaining a consortial cataloging model and highlight the efforts of the SCP in its ongoing quest to eliminate redundant cataloging effort through a centralized, optimized operation.

Keywords: Shared Cataloging Program (SCP), California Digital Library (CDL), consortial cataloging, centralized cataloging, batch cataloging, file distribution, Chinese e-resources cataloging, URL maintenance

The Version of Record of this manuscript has been published and is available in Cataloging & Classification Quarterly, November 21, 2017, 56:2-3, 188-213, DOI: 10.1080/01639374.2017.1388895. To link to this article: https://doi.org/10.1080/01639374.2017.1388895

1. The authors dedicate this article to the memory of Valerie Bross. Valerie was involved in the early planning stages of the SCP and the SCP Steering Committee (now SCP-AC). She served on the SCP-AC from 2008-2016 and played a key role in the development of UC systemwide cataloging policies and practices. Additionally, she co-founded the CONSER PURL project (2001) and helped implement the UC CONSER Funnel (2006), both of which continue to benefit the UC and the SCP.

Introduction

Over 200 library consortia worldwide are members of the International Coalition of Library Consortia (ICOLC), an informal organization whose mission is to share information about matters of mutual interest such as new online resources, pricing practices, and sharing of resources where possible between libraries. The majority of these consortia focus on academic libraries. One hundred twenty (59%) are from North America.1 There are many different models that consortia have used for cataloging. One lasting model is the Shared Cataloging Program (SCP), the centralized cataloging agency employed by the California Digital Library (CDL) for the ten campus libraries in the University of California (UC). This article continues where the French, Culbertson, and Hsiung article, "One for Nine: the Shared Cataloging Program of the California Digital Library," left off in 2002 and will focus on the SCP's commitment to cost-effectiveness for cataloging and maintenance tasks.

In October 1997, the UC Task Force on Electronic Resources (TFER) established a set of guiding principles that led to the creation of the SCP. These are still followed today:
• Emphasize ease of use for catalog users
• Expand access to the maximum while minimizing cost
• Conform to national cataloging policies
• Define a cataloging approach for electronic resources that works for the Melvyl® Union catalog but does not compromise local catalog integrity.2 (See the section on Cataloging Policies below.)

Established in January 2000, the SCP is a California Digital Library (CDL) funded program that began cataloging consortially licensed e-journals for the nine, now ten, University of California campus libraries. In 2003, the program expanded exponentially when it incorporated the cataloging of e-books and open access e-journals.
The SCP's main responsibilities are to provide the timely representation of CDL-licensed materials or bibliographer-selected materials in UC campus integrated library systems and the UC union catalog, Melvyl®; to maintain the currency of subscriber and coverage data; to eliminate the redundancy of cataloging efforts among the UC campuses; and to keep URLs current through the use of persistent identifiers.3

This article will focus on the approaches that have contributed to the SCP's longevity amidst an environment of ever-growing availability of e-resources, platform and URL changes, title deletions, title transference between packages, overlap of titles in multiple packages, changes in technology, and diminishing resources. Many of the future issues and challenges predicted by French, Culbertson, and Hsiung for the SCP in 2002 have come to pass. Over the years, the SCP has adapted to change with innovation and collaboration, meeting the demand for processing and cataloging materials beyond e-journals to include not only resources of different bibliographic formats (e-books, conference proceedings, databases, audio, video) but also varying licensing models (open access resources, government documents, and Demand-Driven Acquisitions (DDA)), and even language-specific cataloging such as Chinese e-resources. Due to an increasing need to catalog Chinese e-resources, the SCP hired a full-time Chinese language cataloger in 2007. This article would be incomplete without a discussion of the key factors that sustain a consortial cataloging program: lean organizational structure and staffing, carefully constructed cataloging policies, and the ability to adapt quickly.

Organizational structure

Effective communication is paramount to consortial cataloging. The SCP manages much of its work through remote collaborations occurring on a daily basis involving stakeholders at many levels (campus, local, statewide, national, and international). These collaborations require a myriad of communication methods spanning several time zones: within records, face-to-face, individual phone calls, conference calls (both online and phone), email, attendance at professional conferences, serving on national and international committees, and shared tools such as the CDL HelpLine, listservs, websites, blogs, and wikis. To understand the vital role that communication plays in everyday SCP work, it is necessary to explain the organizational structure by which the SCP functions.

The SCP is based at UC San Diego but reports to the California Digital Library (CDL) in Oakland, California. Organizationally, the SCP is part of the CDL Collection Development and Management Program, led by Ivy Anderson, the director and coordinator of the shared library collections on behalf of the ten University of California campuses. This program acquires scholarly content, manages UC's mass digitization efforts, organizes and supports shared physical library collections, negotiates and licenses shared digital materials for the UC libraries, and provides and maintains catalog records for materials licensed by the CDL on behalf of the UC libraries.4 On systemwide issues, the SCP receives direction and guidance from the CDL Joint Steering Committee for Shared Collections (JSC) and advice from the SCP Advisory Committee (SCP-AC).
The JSC advises the CDL Collection Development and Management Program Director on cataloging priorities and other decisions concerning systemwide shared collections. The JSC membership is comprised of representatives from the ten UC campuses, mainly Associate University Librarians and/or Collection Coordinators for collection management. To help the JSC make informed decisions, the SCP provides timetables and feedback on the feasibility of every cataloging priority decision. (See the section on Cataloging Priorities for more details.) In addition to advising the SCP on systemwide reactions to proposed cataloging policy and practice changes and related system issues, including the ILS, the SCP-AC offers continued guidance on cataloging issues. Its membership is comprised of catalogers, one from each of the ten campuses.

To prepare records for access and distribution, the SCP works in close partnership with other groups within the CDL Collection Development and Management Program, particularly CDL Acquisitions (CDLA). The SCP is also part of the CDL Licensed Content Group and the CDL Electronic Resource Team (ERT). Serving on these two groups keeps the SCP informed during the various stages of the contracting process (proposal, negotiation, and licensing), which enables the SCP staff to research and plan in advance the most effective way to process an incoming e-resource package. The SCP may also work directly with vendors, particularly Chinese vendors, during the contracting process to acquire MARC records with specific metadata requirements and accurate title lists with persistent URLs. After a package launch, the SCP collaborates with CDLA, the ERT, and/or the SCP-AC to troubleshoot any potential or existing access problems discovered during the cataloging process related to cataloging, file distribution and processing, and proxy.

Lastly, the SCP is unique in that its base of operation is in a metadata services department located in a large academic library on the UC San Diego campus. The SCP staff work closely with UC San Diego staff to take advantage of their expertise and locally hosted training events (e.g., RDA and NACO training), to participate directly in decision-making that would impact both the SCP and UC San Diego (e.g., sharing the same ILS), and to influence and drive UC and national cataloging policy and procedure to benefit UC and beyond. The original cost-saving motivations behind the creation of a centralized cataloging agency using "an existing cataloging unit"5 spawned the ideal environment for the SCP to grow and evolve over the years. Overall, the SCP's symbiotic relationship with UC San Diego has played a major, though indirect, role in the SCP's continued sustainability and success.

Figure 1. Shared Cataloging Program (SCP) organization flowchart.

Staff Funding Models

The SCP operational workflows and organizational structure developed during the pilot project for record creation, maintenance, and distribution have remained essentially the same while e-resource collections continue to grow exponentially in proportion to staffing. See Figure 2 below.

Figure 2. Exponential growth of unique titles6 compared to the unchanged number of SCP staff.

The SCP started out with one FTE in 2000. Today, the SCP operation comprises 4.60 FTE, including a 0.3 FTE manager. There are two different funding models for the SCP FTE.
All the SCP FTE are funded by the CDL except for the Chinese language cataloger position. The Chinese language cataloger position was initially created in 2007 and funded by the CDL as a temporary measure to meet the request of the UC East Asian Bibliographers Group to catalog Chinese e-resources. After the first year, the position was jointly funded by nine UC campuses and the CDL7 based on the CDL East Asian cost share model for collections. The position became permanently funded in 2014 and continues to remain jointly funded. Where there is a need and demand, pooling funds from the ten UC campuses has proven to be a viable cooperative strategy for the greater good.

Cataloging Policies

From the beginning, the SCP's mission has been to provide a centralized approach to record cataloging, maintenance, and distribution with an emphasis on eliminating redundancy of cataloging efforts to achieve time and cost savings for all involved. One early challenge was the promotion and establishment of standards that would maximize access for Melvyl® users while recognizing the strengths and limitations of local systems and avoiding requirements that would compromise their functionality.8 This goal was achieved and continues to be achieved through agreement by the SCP-AC on UC policy and cataloging standards that eliminate the need for any customization or accommodation of the various campus ILSs and local practices.9 The campuses accepted and agreed to the following three cataloging policies as a means to carry out the aforementioned mission:

1. Adoption of a standard for cataloging electronic resources: To facilitate and support full record cataloging and ongoing maintenance, the SCP adopted the single record approach to promote consistent record merging and enable the campuses to load SCP records into their local catalogs with little human intervention. This policy was later amended to apply to e-journals only. The SCP adopted the separate record approach for cataloging e-books in 2003, as the scale of the manual work required by the single record approach became untenable. The separate record approach also made it possible to utilize automated techniques to expedite the availability of the resources and to support maintenance efforts of campuses choosing to withdraw print collections in favor of online ones.

2. The use of persistent URLs: With a centralized link resolver, URL maintenance is performed only once (by the SCP) instead of ten times, thus relieving the campuses of the major burden of local URL maintenance. See the section on The Challenge of Persistent URLs and URL maintenance below.

3. Distribution of entire MARC records: To save time and further reduce manual efforts in the area of record maintenance, the SCP distributes the entire OCLC MARC record to the campuses whenever there is a change to a record (e.g., title changes, URL fixes, and coverage updates). The OCLC control number is the load match point, and a 599 MARC field with a standardized note is added to the record to indicate whether the record contains an addition, change, or deletion in description. The campuses handle each category of change as permitted by their workflows.

Cataloging Priorities

When the SCP was established in 2000, its primary goal was to provide bibliographic access to consortially licensed e-journals for the UC campuses.
As the campuses grew to increasingly depend on the SCP, the program expanded to include online databases, e-books (including online audio and video), online government documents, open access resources, Chinese e-resources, and most recently DDA e-resources. The rapidly growing collections generated an increasing demand to provide timely access to all types of e-resources, regardless of licensing model, adequate resources, or adequate staffing.

The SCP cataloging priorities were initially determined almost exclusively by the SCP staff, with limited input from the CDL and the other campuses. In response to ever-increasing workloads caused by a sudden explosion of e-book packages and requests to catalog other types of e-resources during FY 2007-2008, the CDL established a more formal process to prioritize cataloging needs with the goal of balancing workloads and preventing unnecessary cataloging delays. To facilitate this process, the SCP submits a quarterly report to the JSC which outlines the SCP's proposed cataloging priorities for the coming quarter. The JSC sends the report to various UC Common Knowledge Groups (formerly Bibliographer Groups), compiling feedback used to approve and adjust priorities as needed and requested.

Two categories of priorities are included in the report: standing and project-oriented. The standing priorities are general in nature and do not necessarily target specific e-resource packages. Reviewed annually by the JSC, the standing priorities are overarching and defined by categories such as licensed materials, new additions to existing licensed packages, open access resources, and ongoing maintenance of existing e-journal packages (transfers, content changes, etc.). The project-oriented priorities are specific in nature and may be triggered by newly licensed packages or significant backlogs of existing packages. Project-oriented priorities may also be identified by the SCP staff, or requested by the campuses in response to internal or external factors. Project-oriented priorities are not fixed and may get bumped and moved around depending on the standing priorities or in response to changes in other packages.10

The general principles of the standing cataloging priorities set by the JSC dictate what gets cataloged first: licensed materials (databases, e-journals, e-books, in that order) before open access materials, current/new titles before older titles, and "first in, first out" when there are competing needs within the same category.11 In daily practice the priorities driving the work are a combination of standing and project-oriented ones. Cataloging work benefiting greater numbers of campuses tends to get a higher priority.12

E-journals cataloging

Licensed e-journals were the first category of e-resource cataloged by the SCP, and e-journals in general continue to receive the highest cataloging priority. Cataloging is provided for standard subscription packages, titles in aggregator databases, and requested open access e-journals. In addition to providing records for new e-journal titles, the SCP maintains and distributes updates to e-journal records for changes such as coverage updates and the addition of new providers. The process for cataloging and providing records for e-journals has changed little since the SCP's inception.
E-journals are still cataloged using the single record approach, and package-specific metadata13 is supplied automatically to each record during the process of cataloging. In addition to new cataloging, the SCP also performs bibliographic maintenance on existing records. E-journals maintenance constitutes the majority of the SCP e-journals cataloging workload. It started out as a manual process, became a cooperative effort with the establishment of the UC CONSER Funnel in 2006 (see the section on Collaboration at UC and Beyond below), and recently gained an automated component.

In spring 2014, the SCP began using the Daily Update Report service offered by the OCLC WorldShare Collection Manager to maximize and facilitate e-journals maintenance and updates to CONSER and other serial records. Based on a specific profile, the SCP receives a daily file of records with additions, updates, or deletions to specific MARC fields in the master record as well as changes to the encoding level, date/publication status, the 040 field (to include 040 $e rda), and the OCLC control number. Control number and date/publication status changes are processed manually. Control number changes require redistributing the records to the campuses with a 599 DELete note for the record with the old OCLC control number and a 599 NEW note for the record with the new control number. Date/publication status changes usually require adjusting or closing out the holdings and title change processing. Using the OCLC WorldShare Collection Manager and title change lists from the aggregators (especially EBSCO), the SCP staff can update e-journal records with efficiency and accuracy, often before these changes appear in other sources such as the SFX link resolver. Other e-journal changes are processed in batch, and entire MARC records are overlaid and re-sent to the campuses in the weekly SCP files. Furthermore, after each batch overlay, the SCP staff check for any local records with an active date status that does not match a current ceased date status (usually status changes prior to 2014), catching older title changes that fell through the cracks due to former, less reliable manual processes.
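These control-number and 599-note conventions lend themselves to scripting. Below is a minimal sketch, not the SCP's actual tooling, of how a daily-update file might be triaged with pymarc (5.x); the file names, the ISSN-keyed lookup of previously distributed numbers, and the note layout are illustrative assumptions, since the sketch has no access to the report's own change flags.

```python
# Sketch: triage an OCLC daily-update file into NEW/UPD/DEL distributions.
from datetime import date
from pymarc import MARCReader, MARCWriter, Field, Record, Subfield

TODAY = date.today().strftime("%y%m%d")

def note(status):
    """Build the local 599 processing note, e.g. 599 __ $a UPD $c 170328."""
    return Field(tag="599", indicators=[" ", " "],
                 subfields=[Subfield("a", status), Subfield("c", TODAY)])

# Hypothetical lookup: ISSN -> OCLC control number previously distributed.
previously_distributed = {"1234-5678": "000000001"}  # placeholder data

with open("daily_updates.mrc", "rb") as fh, \
     open("to_distribute.mrc", "wb") as outfh:
    out = MARCWriter(outfh)
    for rec in MARCReader(fh):
        f022 = rec.get_fields("022")
        issn = f022[0]["a"] if f022 else None
        new_num = rec["001"].data
        old_num = previously_distributed.get(issn)
        if old_num and old_num != new_num:
            # Control number changed (e.g. records merged): send a DEL stub
            # under the old number plus the full record flagged NEW.
            stub = Record()
            stub.add_field(Field(tag="001", data=old_num))
            stub.add_field(note("DEL"))
            out.write(stub)
            rec.add_field(note("NEW"))
        else:
            # Everything else is a plain overlay of the entire record.
            rec.add_field(note("UPD"))
        out.write(rec)
```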
E-books cataloging and batch processing

In terms of evolution, the history of SCP cataloging can be characterized as steady and consistent for the first eight years, with both e-journals and e-books being cataloged one by one, using the single record approach. The increasing availability of e-books, open access resources, and DDA programs forced the SCP to adapt and reinvent its workflows, leading to the development and exploration of a diversity of cataloging techniques.

In the early 2000s, e-journals cataloging took center stage amidst a burgeoning e-journal publishing industry. At that time, e-book publishing had not yet taken off. As a result, e-book records were not yet readily available in cataloging utilities such as OCLC, so manually checking every detail (e.g., matching ISBNs, URL verification) for e-books was still manageable and desirable. This comfortable cataloging pace and process incontrovertibly changed in FY 2008-2009, with the simultaneous occurrence of several events: the SCP began cataloging and distributing records for online California state documents,14 the loss of 1.35 FTE, a sudden deluge of e-books, and the implementation of the provider-neutral record for electronic monographs. These events, along with changed cataloging priorities, were the catalysts that precipitated the innovation and creativity behind the batch processing techniques that have become one of the mainstays of SCP e-book cataloging.

As early as 2006, the SCP began exploring new OCLC search commands/indices and algorithms to achieve accurate record identification and retrieval en masse. Search strategies were very important in early batch techniques predating the provider-neutral record for monographs. Before the OCLC WorldShare Knowledge Base, records in collections may have been available in OCLC, but identification and retrieval were time-consuming to perform on a one-by-one basis. For over a decade, the SCP had an active partnership with OCLC to produce records for OCLC WorldCat Collection Sets. (See also the section on Collaboration at UC and Beyond.) Using search retrieval techniques, the SCP identified and cataloged records that OCLC then packaged as collections to make available to the larger library community. Adding to search strategies, the SCP staff created complex OCLC macros using the OCLC Macro Language (OML) to manipulate and customize metadata and batch export records for distribution, automating much of the SCP's cataloging workflows from the basic to the advanced. By 2008, these techniques paved the way for the SCP's first successful experiment in batch cataloging: the creation of a workflow to harvest and distribute current online California state documents to the UC campuses.15 Using the original guiding principles of the SCP as its underpinnings, the SCP came up with an improved process to continue the timely representation of CDL-licensed materials or bibliographer-selected materials in UC campus integrated library systems and the UC union catalog, Melvyl®,16 striving to do more with less.

In August 2009, the implementation of the provider-neutral record for monographs broke new ground for innovation in e-books cataloging. Given the difficulties and time-consuming nature of cataloging and identifying separate records for each provider, the original standard of one-by-one cataloging became unsustainable with the sudden influx of thousands of e-books. The provider-neutral record for monographs, however, finally made it possible to take advantage of batch cataloging techniques by facilitating the merging of duplicate records, thereby promoting a more consistent standard for cataloging e-books. In just six years, after a reorganization of the SCP responsibilities and priorities and the formation of a working group to create and evaluate procedures for every publisher/vendor, the SCP progressed from cataloging a couple of thousand e-books a year to over 50,000 e-books a year. The SCP's first successful batch processing project was documented in a NASIG poster session entitled "Weapons of Mass Distribution: Cataloging with Deadly Efficiency."17 Using a combination of vendor-supplied metadata and title lists, customized OCLC search algorithms, complex OCLC macros, and the new addition of the all-important MARC editing utility, MarcEdit, the SCP staff derived original English-language e-resource records from equivalent manifestation OCLC records (all formats and languages) using looping macros that were so accurate and efficient that little post-cataloging was needed prior to record distribution.
These macros and other batch processing tools improved upon the ones previously developed for processing the online California state documents and effectively reduced the e-book cataloging workload from thirty hours per week to a mere three hours per month.

To fully understand and appreciate the significance and magnitude of the SCP's batch processing methods, it is important to highlight a few key points. The SCP successfully adapted and employed batch processing techniques at a time when vendor records were not available. In the past few years, the advent and availability of vendor records have given libraries the ability to load thousands of records unchecked into their catalogs with a promise of efficiency and accuracy. The lure of using vendor records has not escaped the SCP's attention. The SCP did utilize proprietary vendor records for at least one early e-book collection, and free vendor records have been used to derive OCLC records for a couple of other collections. The SCP continually and proactively explores the use of vendor records but finds, in most cases, that it is necessary to catalog and distribute OCLC records to the greatest extent possible. Since 2009, the UC Libraries discovery platform and union catalog, Melvyl®, has been powered by OCLC WorldCat Local. As a result, the importance of using OCLC records for record distribution and re-distribution to the campuses is ten-fold and has significant implications for the future.

The main problem with using vendor records is discovery. Static vendor records are ideal for individual local catalogs because they can be loaded and forgotten. At the consortial level, with a union catalog powered by OCLC WorldCat Local, using vendor records is disadvantageous on several levels. Since only OCLC records are discoverable in Melvyl®, UC inter-library loan services and campuses using Melvyl® exclusively as their discovery system would not have access to resources through vendor records. Libraries would be forced to use individual library catalogs as silos to identify which campuses owned a particular title. Vendor records can also be costly and proprietary, which limits their flexibility and restricts their use (e.g., they cannot be uploaded into OCLC). Unlike static vendor records, OCLC records are continuously updated, and OCLC participants can use WorldCat Updates offered through OCLC's WorldShare Collection Manager to set up specific profiles to automatically receive record updates and changes. OCLC records are maintained in real time in the union catalog, and the SCP distributes the record updates to the campuses for their local catalogs. Even though the direct use of vendor records is problematic, the SCP catalogers do take advantage of them (and other sources such as the OCLC WorldShare KB) as tools for extracting pertinent metadata to complement and supplement existing SCP batch processing tools and techniques. For example, the SCP staff often use MarcEdit to extract a list of ISBNs or ISSNs from vendor records (in the absence of publisher title lists) to batch search records in OCLC.
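As an illustration of that kind of extraction, the short sketch below does with pymarc what the text describes doing with MarcEdit: it pulls ISBNs out of a vendor file and writes one search key per line for a batch search. The file names are placeholders, and the cleanup is deliberately simplistic.

```python
# Sketch: harvest ISBNs from vendor records to drive an OCLC batch search.
from pymarc import MARCReader

isbns = set()
with open("vendor_records.mrc", "rb") as fh:
    for rec in MARCReader(fh):
        for field in rec.get_fields("020"):            # ISBN fields
            for value in field.get_subfields("a", "z"):
                if value.strip():
                    # Keep the number itself, dropping qualifiers
                    # like "(pbk.)" that follow a space.
                    isbns.add(value.split()[0])

# One key per line, ready to paste into a batch search in the
# cataloging utility (e.g. Connexion searching on the ISBN index).
with open("batch_search.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(sorted(isbns)))
```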
The decision to use and distribute OCLC records exclusively is a testament to the foresight and forward-thinking minds of Crystal Graham and Patricia French (then at UC San Diego and UC Davis, respectively), as well as others, architects of the initial cataloging policies in 1997 to eliminate the need for any customization or accommodation of the various campus ILSs and local practices.18 Metadata staff spend the majority of their working hours designing, maintaining, and upholding consistent cataloging and bibliographic description standards for a good reason. Vendor records lack the consistency and standards compliance that are necessary for ease of identification, searching, access, maintenance, and description (see the section on Chinese cataloging below for the problems caused by inconsistent metadata and standards). Since OCLC is the UC's exclusive and preferred cataloging utility, using OCLC records allows the SCP to maintain a consistency and standard that becomes even more important in the face of ILS changes, record migration, a new discovery system, and the implementation of new concepts such as BIBFRAME and linked data. This consistency will promote the interoperability and flexibility of converting, storing, and indexing SCP-managed metadata in a new system. These policies were not developed in a vacuum -- it is essential for the campuses to be able to rely on the consistency and standard of quality expected of SCP-distributed consortial records in order to keep moving forward.

Figure 3. FY 2015-2016 records distributed by campus. (UCB = UC Berkeley, UCD = UC Davis, UCI = UC Irvine, UCLA = UC Los Angeles, UCM = UC Merced, UCR = UC Riverside, UCSB = UC Santa Barbara, UCSC = UC Santa Cruz, UCSD = UC San Diego, UCSF = UC San Francisco)

Though the majority of e-books are processed in batch, a few select titles and packages may be cataloged manually if original cataloging is required or value-added metadata is requested. There is, of course, give and take with the batch processing of records. The SCP staff don't have the time to look at each record, check every URL, and ensure that every record is perfect before it is distributed to the UC campuses. The SCP consists of 4.60 FTE doing the collective work of ten cataloging agencies. The SCP continues to explore strategies for applying batch processing techniques in all areas, including e-journals and maintenance.

Beginning with FY 2015-2016, the SCP began processing the OCLC daily reports for monographs. Monographs present unique maintenance challenges. Due to the implementation of the provider-neutral record for monographs, the OCLC community merging project, and the vast number of duplicate e-monograph records being created and uploaded by vendors, there are often a large number of control number changes due to record merging. Because the OCLC control number is the load match point, the SCP does not process all these changes: doing so would require redistributing two sets of records (DELetes and NEW titles) to the campuses. The SCP is investigating the use of batch processing techniques to make this maintenance process more manageable, pending approval from the campuses. Other types of updates, such as upgraded K-level records containing desired added subject headings, are automatically distributed to the campuses. Overall, using the OCLC daily reports has helped the SCP deliver better-quality records to the ten UC campuses on a weekly basis.
The total number of all records distributed to the UC campuses in the last five fiscal years has vastly increased due to processing the OCLC daily updates for both e-journals (2014) and monographs (2015). When Medical Subject Headings (MeSH) were deconstructed in 2016, the daily update files were helpful in identifying and updating the thousands of monograph and e-journal records that were changed.

Cataloging Chinese E-resources: A New SCP Venture

In fall 2007, the SCP embarked on a new cataloging venture and hired a full-time Chinese language cataloger to address the growing plethora of CDL-licensed Chinese e-resources. Though the Chinese digital publishing industry produces content at astronomical rates, Chinese e-resource cataloging is still given low priority or is nonexistent in North American libraries. As a result, very few Chinese e-resource records exist in cataloging utilities such as OCLC and in vendor knowledge bases. Misconceptions about the need for cataloging Chinese e-resources have also been problematic, with many libraries accepting vendor records, promoting the keyword searching capabilities of databases, and believing (erroneously) that discovery layers19 are the answer to accessing all e-resources. The typical cataloging practice for East Asian e-resources involves library systems staff without language expertise loading vendor- or knowledge-base-supplied records and metadata into local catalogs. Unfortunately, the low priority given to Chinese e-resources in general, combined with a historical lack of staffing and blind faith in vendor records, has had a snowball effect on the entire Chinese digital publishing industry, leading to rampant inconsistent metadata practices.

The SCP discovered early on that Chinese e-resources share the same challenges and issues that all e-resources do, and more. Chinese e-resources cataloging suffers from substandard and unreliable vendor-supplied metadata,20 inconsistent description standards, inconsistent Pinyin word division standards, erroneous transcriptions (simplified vs. traditional characters), duplicate records (the result of bad metadata and content presentation issues), hybrid records (Chinese notes in records described in English), and non-persistent linking, with an added layer of politics and cultural differences. These everyday issues become overwhelming and magnified because they often present themselves together at the same time -- it's never just one problem. Poor quality metadata and inconsistent standards and practices render metadata conversion more difficult, cause linking errors and failed searches, and further propagate inaccurate and duplicate bibliographic records, all of which brings the cataloging process to a grinding halt.

To make use of metadata in Chinese vendor records, the SCP must apply complex batch processing techniques to first clean up the bad metadata. At the most basic level, the scripts in a set of Chinese vendor records must be compared and verified against the character forms (traditional or simplified) on the resources themselves and corrected as needed to ensure proper identification and valid search results (a sketch of this kind of check appears below). Romanized forms require yet additional manual manipulation to standardize the metadata. The perpetuation of inconsistent standards can detrimentally affect something as basic as a title search, affirming the importance of accurate metadata and the need to maintain consistency and standards for records. The cleaned-up metadata is used to search copy or create original records in batch for upload into OCLC, where they are available for discovery by the international library community and the UC campuses via the UC union catalog, Melvyl®.
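A first-pass script check of the sort just described can be illustrated in a few lines. This is a toy sketch, not the SCP's actual workflow: the five character pairs stand in for a full simplified/traditional conversion table, and the title and declared form are placeholder inputs.

```python
# Illustrative sketch of a simplified/traditional script check.
SIMP_TO_TRAD = {"国": "國", "图": "圖", "书": "書", "汉": "漢", "馆": "館"}
SIMPLIFIED = set(SIMP_TO_TRAD)
TRADITIONAL = set(SIMP_TO_TRAD.values())

def script_mismatches(title: str, declared: str) -> list[str]:
    """Return characters whose script contradicts the declared form
    ('traditional' or 'simplified') used on the resource itself."""
    wrong = SIMPLIFIED if declared == "traditional" else TRADITIONAL
    return [ch for ch in title if ch in wrong]

# A record transcribed for a traditional-character resource but containing
# simplified forms would be flagged for correction before batch upload:
print(script_mismatches("中国图书馆", declared="traditional"))
# -> ['国', '图', '书', '馆']
```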
The cleaned-up metadata is used to search copy or create original records in batch for upload into OCLC where they are available for discovery by the international library community and the UC campuses via the UC Union Catalog, Melvyl®. The SCP recognized that cooperative cataloging and collaboration would be the gateway through which to start remedying the colossal problem of Chinese e-resources cataloging. Chinese language catalogers could lead the way in bridging the gap between East Asian studies librarians, vendors, systems, link service providers, publishers, bibliographic One for Ten: The UC Shared Cataloging Program 18 service agencies, discovery services, and other stakeholders in the supply chain. The SCP has also taken a proactive role in establishing the CEAL Task Force on Metadata Standards and Best Practices for East Asian Electronic Resources (CEAL ERMB Task Force), a national initiative for working with Chinese vendors to improve their metadata and promote best practices with the goal to identify and improve the presentation and metadata issues that affect e-resources cataloging efficiency. The CEAL ERMB Task Force helped launch the China-U.S. Translation and Research Collaboration Project on Electronic Resource Standards and Recommended Practices in the United States, an international project to translate and promote NISO standards and best practices for cataloging Chinese e-resources into Chinese for international constituents. Opening the channels of communication was the first step towards establishing trust between libraries and vendors to ensure that business activities would be mutually beneficial to all stakeholders. Efforts are starting to pay off as UC librarians have succeeded in including metadata standards requirements for two Chinese DDA licensing agreements21 and in conveying and promoting the importance of including title changes in title lists. The largest Chinese journal vendor recently agreed to provide this service, ending a decade long issue of contention between UC librarians and Chinese vendors. Hopefully, with the largest Chinese vendor leading the way, smaller vendors will follow suit. For further discussion on the SCP Chinese cataloging collaborations, see the section on Collaborations at UC and Beyond below. Weekly file maintenance and distribution The SCP distributes records on a weekly basis to the ten UC campuses. The SCP record distribution process was first documented in 2002,22 and while the fundamental elements of it have remained fairly stable since then, there have been a few adjustments. The SCP records live in UC San Diego’s local integrated library system (ILS) and are differentiated from UC San Diego’s records primarily by a unique location code. The first major One for Ten: The UC Shared Cataloging Program 19 change was the addition of e-books to the e-journals workload which necessitated a split in files based on format, increasing the number of files from nine (later ten) files to 20 files per week. In the early days of the SCP, most records were “cloned” one-by-one from existing UC San Diego records, but with the employment of batch processing tools, most records are now loaded twice into UC San Diego’s local catalog: once for UC San Diego (if they are a participant in a package) and once for the SCP, with subsequent record updates occurring one-by-one or in batch overlay. 
The entire bibliographic MARC record is still distributed whenever it is newly added, changed or deleted, with a MARC field 599 communicating the nature of the record being delivered. Receiving libraries can sort and organize the records based on the note in the 599 field: 599 NEW—Newly distributed records. Example: 599__NEW $c 170328 599 UPD—Changes to already distributed records. Example: 599__UPD $b cat $c 170328 (indicates a bibliographic change to the record) 599 DEL—Deletion of a record. Example: 599__DEL $c 170328. Other special local fields that are important in SCP records include: 793—Package title hook (local). Example: 793 0_Cambridge online journals or 793 0_ MIT Press online monographs 920—Participating campus fields. Example: 920 UCB Example of a SCP MARC record with 599, 793 and 920 fields: One for Ten: The UC Shared Cataloging Program 20 Figure 4. Sample SCP MARC record with 599, 793, and 920 fields. At the start of each week, the SCP file processor searches and gathers all SCP bibliographic records flagged with 599 fields from the prior week. Separate files of MARC records are created for SCP monographs and e-journals using saved batch searches. The SCP monographs and e-journals files receive several quality control checks (such as records that lack a URL in the 856 field or a SCP-approved title hook), and are corrected accordingly. The monographs and e-journals files are then uploaded to the SCP file server where they are separated into a relevant monographic and serial file for each campus based on the 920 fields in the records. Consequently, the campuses only receive files of titles for which they are entitled, indicated by the presence of at least one 856 field in the MARC record. Each campus then accesses the SCP file server at UC San Diego to obtain their own files. The distribution process represents a consistent workload for the SCP and the campuses. With increasing workloads that must be absorbed by existing staff, the SCP continually explores ways to make not only cataloging but also record distribution more time- One for Ten: The UC Shared Cataloging Program 21 saving and cost-efficient. In July 2011, the UC Systemwide Operations and Planning Advisory Group (SOPAG) established the Power of Three (POT) working groups that were charged with assessing and formulating systemwide actions and policies for implementation of the Next Generation Technical Services Initiative.23 The POT5 group was charged with the responsibility to “maximize the effectiveness of Shared Cataloging” and its first task was to “assess the benefits and risks of stopping the distribution of (SCP) bibliographic records to the ten campuses for their local OPACs.” Would the elimination of the record distribution workload result in SCP/campus cost and/or staffing savings sufficient to offset, if any, negative impacts of the SCP not distributing the records to the campuses? To accomplish this, POT5 and the UC Cataloging and Metadata Common Interest Group (CAMCIG) investigated the following two issues: 1) Determine the staffing and service impacts on UC libraries if the SCP discontinued record distribution to the campuses, and if campuses chose to add the records themselves, and 2) Determine the economic impact of the current record distribution process to both the SCP and the campuses, and identify alternative methods for record distribution and loading, and their cost. 
The distribution process represents a consistent workload for the SCP and the campuses. With increasing workloads that must be absorbed by existing staff, the SCP continually explores ways to make not only cataloging but also record distribution more time-saving and cost-efficient. In July 2011, the UC Systemwide Operations and Planning Advisory Group (SOPAG) established the Power of Three (POT) working groups, which were charged with assessing and formulating systemwide actions and policies for implementation of the Next Generation Technical Services Initiative.23 The POT5 group was charged with the responsibility to "maximize the effectiveness of Shared Cataloging," and its first task was to "assess the benefits and risks of stopping the distribution of (SCP) bibliographic records to the ten campuses for their local OPACs." Would the elimination of the record distribution workload result in SCP/campus cost and/or staffing savings sufficient to offset any negative impacts of the SCP not distributing the records to the campuses? To answer this, POT5 and the UC Cataloging and Metadata Common Interest Group (CAMCIG) investigated two issues: 1) determine the staffing and service impacts on UC libraries if the SCP discontinued record distribution to the campuses, and if campuses chose to add the records themselves; and 2) determine the economic impact of the current record distribution process on both the SCP and the campuses, and identify alternative methods for record distribution and loading, and their cost.

POT5 surveyed four UC campuses to find out how they used SCP records and the impacts on library services if SCP record distribution were eliminated. Survey results revealed that SCP records were heavily used for reference and instruction, collection development, interlibrary loan, acquisitions, cataloging, and circulation activities. Public services used SCP records as sources for persistent URLs to add to LibGuides and other instruction materials and to assist with collection development (to verify whether titles were owned before placing orders or to make preservation decisions about withdrawing print collections). Inter-library loan departments relied heavily on SCP records, and their elimination would have detrimental impacts on ILL revenue, staff time (redundant searching in multiple systems), and workload (unnecessary borrowing requests and orders from within and outside UC). Acquisitions departments used SCP records for creating lists to track titles moving in and out of CDL-licensed packages and to help identify print titles for cancellation. In addition to the above, eliminating the distribution of SCP records would create work redundancies for the campuses, inhibit user discovery of and access to CDL e-resources, and reduce the ability of libraries to deliver quality user services.

The economic ramifications of eliminating SCP record distribution were equally compelling for maintaining the status quo. Findings revealed that the total cost of distributing a single record to the ten campuses was slightly less than 50 cents per record. For FY 2010-2011, the total cost of distributing SCP records to all campuses was $31,887 (the SCP cost was $5,162, and the remaining $26,725 share was distributed amongst the campuses). The total FTE involved in the process for all campuses was 0.642 FTE for distribution and loading of records. CAMCIG further examined the costs and benefits of using other methods for the campuses to acquire, load, and access SCP records: the UC San Diego catalog via Z39.50, OCLC, and Melvyl®. It determined that none of these options was feasible or cost-effective, and the cleanup efforts and costs associated with changing the process would be absorbed by the individual campuses, making these options even less desirable. By the conclusion of the study, none of the UC campuses could identify benefits to the elimination of record distribution by the SCP. Given the low cost and high quality of the SCP's records and file distribution method, the unacceptable alternative of using Melvyl® as the sole discovery system for CDL e-resources, and the lack of cost benefit or work efficiencies associated with alternative methods, CAMCIG recommended continuing the SCP's current operation for record distribution.24

The SCP has continued to search for alternatives to record distribution, in particular the OCLC WorldShare Collection Manager and OCLC Knowledge Base (OCLC KB), since OCLC WorldCat is the backend database that feeds Melvyl®. In 2014, the OCLC WorldShare Collection Manager offered improvements and features not previously available. The SCP started experimenting with the idea of using the OCLC KB for record distribution and evaluated its feasibility based on the following questions: Does the OCLC KB include records for titles in
If not, what is the feasibility of adding records for titles in CDL packages not represented in the OCLC KB? Can this process be automated or does it require manual processing? Time-wise, how would this process compare with current SCP record distribution? Can package activation be done at the package level for CDL selected titles or will the titles need to be manually activated? Do all the titles in the OCLC KB have OCLC WorldCat record numbers? What if the OCLC numbers in the OCLC KB do not match the control numbers in existing SCP records? Is there a way to set differentiated OCLC holdings for CDL licensed and selected open access resources at the campus and consortial levels? What are the workload and cost implications of switching to and using the OCLC KB for SCP record distribution? Would the CDL and the UC be able to support the additional cost(s)? After a lengthy review, the SCP determined that the OCLC KB was not yet a viable alternative to the SCP record distribution process. The longevity of the SCP record distribution process is another example of the foresight of the founders of the SCP who hit upon a fairly simple solution of consortial record distribution that (with a few tweaks) has persisted in light of staffing, economic, and technological changes. The Challenge of Persistent URLs and URL maintenance • 2000—Implementation of the CDL PID server, supported by open-source OCLC PURL software. • May 2001--CONSER PURL pilot project proposed for cooperative maintenance of URLs (also known as BibPURLs) for open access resources on OCLC records maintained by CONSER institutions. • July 25, 2002—Implementation of the CONSER PURL server with participation extended to BIBCO members: http://bibpurl.oclc.org/.25 • October 3, 2005—Established linking guidelines for SCP cataloged e-resources.26 http://bibpurl.oclc.org/ One for Ten: The UC Shared Cataloging Program 24 • July 1, 2006—Implementation of ExLibris SFX link resolver at CDL. The SCP staff began using SFX OpenURLs for linking licensed e-journals.27 • 2007--Introduction of PURLZ (Zepheira), the new replacement for the OCLC PURL software. Attempts to migrate PID data and upgrade to PURLZ unsuccessful. • July 2008—The SCP began cataloging and distributing records for online California state documents encouraging extensive use of the CONSER PURL server.28 • 2009—E-book publishing takes off and for the first time in its history, CDL licensed more e-books than e-journals. • July-October 2013--Brian Riley, programmer at CDL, worked with the SCP to design a custom in-house link resolver to replace the aging PID server. • November 2013—Implementation of the new CDL PID server at http://uclibs.org/. • December 2015—Established linking guidelines for SCP cataloged licensed e-books promoting use of SFX OpenURLs, DOIs, ARKS, persistent publisher provided URLs, or PIDs.29 • November 2016—The SCP began cataloging open access e-book collections One of the hallmarks of the SCP is the link resolution and maintenance service it provides to the UC campus libraries for UC selected e-resources. 
Since the implementation of the CDL PID server in 2000, the SCP’s commitment to URL persistence has been constant and vigilant, its direction guided by several distinct events: The implementation of the ExLibris SFX link resolver, a CDL proposal to include the cataloging and distribution of open access resource records in 2006,30 the introduction and influx of e-books in 2009, an aging PID server (2007- 2013), and the gradual emergence of new and better persistent identifiers. The implementation of the ExLibris SFX link resolver in 2006 meant that the majority of e- journals could be linked through SFX with OpenURLs that pointed to the UC-wide SFX Knowledgebase instead of being redirected through the CDL PID server.31 The SCP could rely on publishers to maintain metadata through ExLibris. With the implementation of the CONSER http://uclibs.org/ One for Ten: The UC Shared Cataloging Program 25 PURL server in 2002,32 the SCP was well poised to utilize CONSER PURLs (or BibPURLs) for linking open access resources when they became part of the SCP workflow in 2006. The ExLibris SFX link resolver and the CONSER PURL project expanded and distributed the responsibility of cooperative URL maintenance beyond the SCP level. Around 2009, the launch of e-books (tens of thousands) necessitated not only new ways of providing access to these resources (batch processing) but also more efficient ways of maintaining the URLs. The number of e-resources in the CDL collection was growing as quickly as the CDL PID server was aging—it was clear that the PID server was becoming less capable of handling the growing data. Attempts to migrate the data and upgrade the software were unsuccessful but new and better types of persistent identifiers began emerging and proved to be more reliable and less prone to breakage: OpenURLs, DOIs, ARKS, permalinks, etc. While the CDL sought a replacement for the PID server, these new identifiers gave the SCP additional persistent linking options that supported and further expanded its original mission of providing URL persistence and cooperative URL maintenance. The original CDL PID server was redesigned and recreated in 2013 to continue maintenance of legacy data, support open access resources and licensed e-books, and serve as a last resort for persistence when no other options are available. The longevity of both the CDL PID server and the CONSER PURL server supports the potential and long-term sustainability of cooperative link maintenance. BibPURLs continue to be the best option for maintaining URL persistence for open access resources, since they are more susceptible to change than any other type of e-resource. For licensed resources, publishers have been making a more conscientious effort to provide persistent linking capabilities for their resources. Today, the SCP uses a variety of link resolution services and persistent identifiers to ensure URL persistence for their cataloged resources, compared to its initial exclusive reliance on the CDL PID server. Additionally, cooperative maintenance of URLs beyond the SCP level has led to less redundant, more cost-effective link resolution services, relieving one of the major burdens of consortial and e-resource cataloging. 
Collaboration at UC and beyond

Conceived as a cooperative venture, the SCP strives to incorporate the cooperative model in all its activities to eliminate redundancies and improve efficiency, from cooperative URL maintenance and cooperative maintenance of shared metadata (OCLC records) to cooperative cataloging, planning, and training. The SCP is constantly collaborating with other institutions, both within UC and beyond. One UC-wide collaborative project is the UC CONSER Funnel, which was established in 2006 to foster collaboration among e-journals catalogers at the UC campuses.33 The initiative took advantage of existing UC CONSER participation and expertise to promote and provide a support network encouraging the maintenance and growth of the CONSER database through cooperative cataloging activities. The Funnel was so successful in its first two years that serials catalogers from the Getty Research Institute and the California State Library joined the effort in 2008, expanding the benefits of the Funnel outside UC. Cooperative maintenance and cataloging by UC staff contribute to the international cataloging world (through the CONSER database) and, on a more local level, to the rest of UC through SCP record distribution. Today, catalogers from the various UC campuses and California institutions are still benefitting from the Funnel, with e-journals training and guidance provided to new serials catalogers. These cooperative efforts contribute to and uphold the high quality expected of CONSER and SCP records.

Other cooperative and collaborative efforts include pilot cataloging projects between the SCP and institutions outside the UC. Since 2010, the SCP has participated in the CONSER Cooperative Open Access Journal Project (OAJ), a national cooperative cataloging project aimed at increasing the coverage of e-journal titles, especially open access titles, in the CONSER database.34 In 2011, the SCP initiated a pilot project with the University of Maryland to catalog the IEEE conference proceedings as an OCLC WorldCat Collection set. Still in effect today, the project was inaugurated in earnest in December 2011 with the sharing of procedures, search strategies, and macros. By January 2012, both parties had agreed on a process to share the cataloging workload by dividing new titles into ongoing biweekly lists and providing each other with the OCLC numbers of cataloged records every month. This partnership continued despite the elimination of the OCLC WorldCat Collection sets, and its efforts were not hindered by the implementation of RDA. The strong collaboration and communication between the SCP and the University of Maryland allowed them to coordinate and work these changes into their existing procedures and macros without a compromise in quality or a delay in cataloging. In 2013, the SCP partnered with the University of Oregon (and several other libraries) to provide high quality cataloging records through the OCLC WorldCat Collection sets for the National Academies Press (NAP) materials.35 After the demise of the OCLC WorldCat Collection sets (2013), the SCP set up a global collection in the OCLC KB and still contributes new original records to the KB for the NAP resources on a weekly basis. On the international level, the SCP is especially proactive and has partnered with many institutions on cataloging Chinese e-resources.
The SCP and the University of Hong Kong are the only two institutions among CEAL member libraries that have staff resources dedicated to the cataloging of Chinese e-resources. They partnered with the goal of combining staff resources to do more with less: collectively identifying common issues (e.g., e-resource presentation and metadata issues that affect cataloging efficiency), collaboratively resolving problems, and organizing as a group to communicate and cooperate effectively with vendors and publishers to improve metadata practices. In 2014, the SCP staff led the CEAL ERMB Task Force in conducting a survey among CEAL libraries to identify the Chinese collections that would most benefit from a cooperative cataloging project, provided that three or more libraries expressed interest in the collaboration. Following the survey, the SCP staff, on behalf of the CEAL ERMB Task Force, launched and established cataloging projects for the following three collections:36 1) Dacheng old periodical full-text database (大成老旧刊全文数据库, 7,500 titles) (cataloging partners: University of Hong Kong, Stanford University, and University of Washington); 2) Chinese periodical full-text database (1911-1949) (民国时期期刊全文数据库, 25,000 titles) (cataloging partners: University of Michigan, Stanford University, and University of Washington); and 3) China Academic Journals (CAJ, 中国期刊全文数据库, 12,000 titles) (cataloging partners: Columbia University, University of Maryland, Claremont College, Yale University, Cornell University, and University of Hong Kong). The initial cataloging for these projects was modeled after a well-established cooperative cataloging model, the CONSER Cooperative Open Access Journal Project (OAJ). Record maintenance and updates for these collections are made via OCLC WorldShare Collection Manager.

UC Model: Cost and Benefits

Two guiding principles of the SCP are ease of use for catalog users and maximizing access while minimizing cost. The idea of "one for ten" cataloging was conceived to reduce cataloging redundancies, multiplying the savings in staff time and money across the ten campuses. Over the past six years, the CDL and UC Libraries conducted several studies substantiating the value of the SCP records and their impact on resource discovery and access by the ten campuses. In FY 2010-2011, the CDL conducted an internal cataloging cost analysis study to compare the cataloging cost of SCP cataloged e-books and campus cataloged print books from the same vendor. The cost of an SCP record ranged from 17 cents for batch cataloging to $21.44 for original full level cataloging. The cost of a campus record ranged from $2.92 for copy cataloging to $25 for original full level cataloging. The calculations were based on staffing levels, cataloging time, and the number of titles cataloged. Over the 28-month period of the study, the SCP cataloged 25,668 titles at a total cost of $78,523, an average of $3.06 per title, saving the campuses at least $223,005 during that period. Likewise, a study on the cost of SCP record distribution revealed modest cost savings. In May 2011, the UC Libraries Springer e-Book pilot project usage survey report corroborated that e-book users were most likely to discover e-books through 1) the library catalog, 2) a general Internet search engine, or 3) the library website.
In Figure 5 below, e-book discovery is broken down by Springer e-book users versus general academic e-book users. Based on 2,569 responses from all ten campuses, 60% of Springer e-book users and 53% of general (non-Springer) academic e-book users were most likely to discover and access e-books through library catalogs, rather than through search engines such as Google, which accounted for 33% and 43% of users, respectively. See Figure 5 below, which is Figure 23 in the report.37

Figure 5. UC 2011 survey on methods for discovering and accessing e-books (taken from Figure 23 of the report).

In fall 2015, the CDL and UC Santa Cruz jointly conducted a focus group to gather graduate students' responses to the question, "For your last scholarly book use, where did you get the book of your chosen format?" Seven out of thirty-one students (23%) indicated that they used e-books for their last book use. Four out of those seven students (57%) indicated that they found and accessed the e-books through their library online catalog.38 Although this does not establish a direct correlation between SCP cataloging and e-book online discovery and retrieval, it does suggest that local library online catalogs are still heavily used and valued by students for digital resource discovery and retrieval, which supports the importance and relevance of SCP cataloging and record distribution. The importance and relevance of SCP record distribution to the campuses for loading into local campus ILSs had previously been substantiated in the 2011 study carried out by POT5 and CAMCIG (see the section on Record Distribution), so these responses were not surprising.

In 2010, the CDL witnessed a noticeable spike in usage data for CDL licensed e-books. Around this time, the SCP was fully engaged in batch processing techniques, making thousands of e-books available to UC users in a short time. Figure 6 compares the monthly cataloging and usage data of two e-book packages in 2014-2016.39 The usage for package B (~488,000) is so high that the steady usage curve for package A (~1,500) lies along the zero line. Figure 7 compares trends in the cataloging and usage data of CDL licensed e-books in 2011-2016.40 Overall, the data suggest that increased cataloging drives usage. Although more comprehensive research is needed to substantiate this observation, there is little doubt that SCP cataloging plays a direct and valuable role in supporting increased discovery and access of CDL selected e-resources for the UC community.

Figure 6. Trends in cataloging and usage of e-books, 2014-2016.

Figure 7. SCP cataloged e-books and usage, 2007-2016.

UC Model: Challenges and Strategies

As stated by French, Culbertson, and Hsiung in 2002, "the success of a shared cataloging program is measured by the extent to which it meets the needs of its participants."41 After seventeen years, the SCP has upheld its mission to provide the timely and economical representation of CDL selected e-resources in Melvyl® according to national cataloging standards. Due in part to the SCP's cataloging efforts, CDL collection usage has quadrupled in the last five years, illustrating steady and increased support of UC faculty and students.
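Before turning to the program's outlook, the arithmetic behind the FY 2010-2011 cost study quoted above can be made explicit. The sketch below simply recomputes the reported average from the study's own figures; the savings figure is quoted rather than re-derived, since the study's per-campus workload assumptions are not reproduced here.

# Back-of-envelope check of the FY 2010-2011 cost study figures
# reported above. All inputs are the study's published numbers.
titles_cataloged = 25_668      # titles cataloged by the SCP in 28 months
total_cost = 78_523.00         # total SCP cataloging cost, USD
reported_savings = 223_005.00  # minimum savings to the campuses, as reported

average_cost = total_cost / titles_cataloged
print(f"Average SCP cost per title: ${average_cost:.2f}")  # ~$3.06, as reported

# The reported savings imply this much avoided cost per SCP title,
# summed across the ten campuses:
print(f"Reported savings per title: ${reported_savings / titles_cataloged:.2f}")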
Going forward, the key to the SCP's continued sustainability will be the maintenance and cultivation of partnerships and collaborations with the UC campuses, other consortia and libraries, aggregators, publishers, and vendors. To succeed, all stakeholders will need good communication and persistence to continue developing innovative solutions and standards that all can follow. Close partnerships will also be necessary for meeting local changes on the horizon, such as the implementation of a new ILS that will be shared by the SCP and its participants. The SCP will continue to be on the lookout for advances in systems and tools that are economically feasible, scalable, and able to support cost-efficient cataloging. The experience of the "One for Ten" Shared Cataloging Program demonstrates that a centralized cataloging model for e-resources can be sustainable and cost-effective in a complex library consortium environment.

Acknowledgements

The authors wish to give a special acknowledgement to Adolfo R. Tarango, head of the SCP from August 2001 to March 2016. Adolfo played a vital role in leading and shaping the SCP to meet the needs and demands of the consistently evolving electronic resources landscape. A significant amount of content for this article was gleaned from the many CDL and/or SCP reports and studies he authored or co-authored. Additionally, he generously shared his time and provided his institutional knowledge about SCP policies and practices, without which this article would not have been possible. Adolfo is currently Head of Technical Services at the University of British Columbia Library.

Notes

1 Celeste Feather, "The International Coalition of Library Consortia: origins, contributions and path forward," Insights 28, no. 3 (2015): 89-93, accessed March 10, 2016, doi:10.1629/uksg.260.
2 Task Force on Electronic Resources, "Report to the University of California Heads of Technical Services," March 25, 1998, available on the Internet Archive, accessed August 8, 2017, https://web.archive.org/web/20070208163957/http://tpot.ucsd.edu/Cataloging/HotsElectronic/tfer.html.
3 University of California, California Digital Library, "Organization, Priorities, & Strategies," n.d., accessed April 10, 2017, http://www.cdlib.org/services/collections/scp/organization/.
4 University of California, California Digital Library, "Collection Development and Management," n.d., accessed April 10, 2017, http://www.cdlib.org/services/collections.
5 Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12, doi:10.1016/S0098-7913(01)00169-1.
6 "Unique title" refers to a title cataloged using "a single provider-neutral record that incorporates all specific package and other local information on one record," according to the PCC Provider-Neutral E-Resource MARC Records Guide.
7 Under the cost-share model for funding the Chinese-language cataloger position, UC San Francisco, which lacks an East Asian Studies program, opted out.
8 Task Force on Electronic Resources, "Report to the University of California Heads of Technical Services," March 25, 1998, available on the Internet Archive, accessed August 8, 2017, https://web.archive.org/web/20070208163957/http://tpot.ucsd.edu/Cataloging/HotsElectronic/tfer.html.
9 Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12, doi:10.1016/S0098-7913(01)00169-1.
10 Julia Kochi, Armanda Barone, Adolfo R. Tarango, and Lucia Orlando, "POT 5 Report and Recommendations on Deliverable 1: Assess the benefits and risks of stopping the distribution of bibliographic records to the ten campuses for their local OPACs," September 20, 2012, accessed April 3, 2017, http://libraries.universityofcalifornia.edu/groups/files/ngts/docs/pots/pot5_deliverable_1.pdf.
11 Joint Steering Committee on Shared Collections, "SCP Cataloging Priorities," CDL Shared Cataloging Program website, October 11, 2016, accessed April 13, 2017, http://www.cdlib.org/services/collections/scp/organization/Priorities.html.
12 CDL Shared Cataloging Program, "SCP Cataloging Priorities," CDL Shared Cataloging Program website, accessed September 8, 2017, http://www.cdlib.org/services/collections/scp/organization/Priorities.html.
13 CDL Shared Cataloging Program, "E-Resources Tracking: CDL Licensed Electronic Resources," CDL Shared Cataloging Program website, accessed April 13, 2017, http://www.cdlib.org/services/collections/scp/tracking/eresourcestracking.html.
14 Rebecca Culbertson, Annelise Sklar, and Donal O'Sullivan, "Bountiful Harvest: Batch Searching and Distribution of Electronic State Document MARC Records," Dttp: Documents to the People 36, no. 4 (Winter 2008): 9-11.
15 Ibid.
16 CDL Shared Cataloging Program, "Organization, Priorities, & Strategies," accessed April 19, 2017, http://www.cdlib.org/services/collections/scp/organization/.
17 Donal O'Sullivan, Rebecca Culbertson, and Adolfo Tarango, "Poster Sessions. Weapons of Mass Distribution: Cataloging with Deadly Efficiency," Serials Librarian 64 (2013): 308, doi:10.1080/0361526X.2013.760300.
18 Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12, doi:10.1016/S0098-7913(01)00169-1.
19 Bie-hwa Ma, "Collaborative e-collection management: CEAL cooperative cataloging project for e-collections" (conference presentation, Committee on Technical Processing, Council on East Asian Libraries annual conference, Toronto, Canada, March 14, 2017), accessed April 13, 2017,
http://www.eastasianlib.org/ctp/Meetings/2017/Collaborative%20e-collection%20management_2017_03-14.pptx.
20 For a more in-depth discussion of Chinese vendor-supplied metadata and examples, see Connie Lam, "Chinese e-resource metadata problems that cause access issues" (conference presentation, Committee on Technical Processing, Council on East Asian Libraries annual conference, Philadelphia, PA, March 25, 2014), accessed March 16, 2016, http://www.eastasianlib.org/ctp/Workshops/2014/CEAL_ERMB_Connie_rev.pptx, and Bie-hwa Ma, "Strengthening the Chinese electronic resources supply chain with standards and best practices" (conference presentation, Committee on Technical Processing, Council on East Asian Libraries annual conference, Toronto, Canada, March 14, 2012), accessed March 16, 2017, http://www.eastasianlib.org/ctp/Meetings/2012/Ma_StrengtheningChineseE.
21 Bie-hwa Ma, Shi Deng, and Susan Xue, "Leveraging NISO standards and best practices to improve discovery and access of digital resources," International Journal of Librarianship (2017): 51-66.
22 Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12, doi:10.1016/S0098-7913(01)00169-1.
23 Next-Generation Technical Services (NGTS), 2011-2013, was an initiative developed by the University Librarians and SOPAG to redesign technical services workflows across the full range of library formats in order to take advantage of new system-wide capabilities and tools, minimize redundant activities, improve efficiency, and foster innovation in collection development and management to the benefit of UC library users.
24 Armanda Barone, Adolfo Tarango, Julia Kochi, and Lucia Orlando, "Evaluation of SCP Decision-Making Process for Cataloging Priorities, Final Report," September 20, 2012, accessed April 3, 2017, http://libraries.universityofcalifornia.edu/groups/files/ngts/docs/pots/pot5_scp_deliv3_final.pdf.
25 Valerie Bross, "The PCC/CONSER PURL Project: Improving Access to Free Resources," Serials Librarian 45, no. 1 (2003): 19-26, doi:10.1300/J123v45n01_02.
26 UC Link Resolver Planning Group, "UC Link Resolver General Principles & Detailed Linking Guidelines," SCP Cataloging & Linking Guidelines web page, October 3, 2005, accessed March 23, 2017, http://www.cdlib.org/services/collections/scp/docs/UClinkresolverguidelines.doc.
27 Shared Cataloging Program, "Linking Guidelines for CDL Licensed eJournals," SCP Cataloging & Linking Guidelines web page, April 10, 2009, accessed March 23, 2017, http://www.cdlib.org/services/collections/scp/docs/SFXlicensed.pdf.
28 Rebecca Culbertson, Annelise Sklar, and Donal O'Sullivan, "Bountiful Harvest: Batch Searching and Distribution of Electronic State Document MARC Records," Dttp: Documents to the People 36, no. 4 (Winter 2008): 9-11.
29 Shared Cataloging Program, "Linking Guidelines for CDL Licensed eMonographs," SCP Cataloging & Linking Guidelines web page, December 9, 2015, accessed March 23, 2017, http://www.cdlib.org/services/collections/scp/guidelines/856URLguidelinesforEmonographs.pdf.
30 California Digital Library, "Open Access Resources at the UC Libraries," California Digital Library website, June 9, 2006, accessed March 23, 2017, http://www.cdlib.org/services/collections/openaccess.html.
31 UC Link Resolver Planning Group, "UC Link Resolver General Principles & Detailed Linking Guidelines," SCP Cataloging & Linking Guidelines web page, October 3, 2005, accessed March 23, 2017, http://www.cdlib.org/services/collections/scp/docs/UClinkresolverguidelines.doc.
32 Valerie Bross, "The PCC/CONSER PURL Project: Improving Access to Free Resources," Serials Librarian 45, no. 1 (2003): 19-26, doi:10.1300/J123v45n01_02.
33 Valerie Bross, "Doing More with More: The UC CONSER Funnel Experience," Cataloging & Classification Quarterly 48, no. 2-3 (2010): 153-160, doi:10.1080/01639370903535676.
34 Open Access Journal Project, "Cooperative Open Access Journal Project Report," PCC CONSER website (Library of Congress), April 30, 2010, accessed April 20, 2017, https://www.loc.gov/aba/pcc/conser/issues/Open-Access-Report.pdf.
35 Philip Young, Rebecca Culbertson, and Kelley McGrath, "Collaborative batch creation for open access e-books: a case study," Cataloging & Classification Quarterly 51, no. 1-3 (2012): 102-117, doi:10.1080/01639374.2012.719075.
36 For more information about these projects, see Bie-hwa Ma, "Collaborative e-collection management: CEAL cooperative cataloging project for e-collections" (conference presentation, Committee on Technical Processing, Council on East Asian Libraries annual conference, Toronto, Canada, March 14, 2017), accessed April 13, 2017, http://www.eastasianlib.org/ctp/Meetings/2017/Collaborative%20e-collection%20management_2017_03-14.pptx.
37 Chan Li, Felicia Poe, Michele Potter, Brian Quigley, and Jacqueline Wilson, UC Libraries Academic e-Book Usage Survey: Springer e-Book Pilot Project, May 2011, accessed April 3, 2017, http://www.cdlib.org/services/uxdesign/docs/2011/academic_ebook_usage_survey.pdf.
38 CDL and UC Santa Cruz focus group data were provided by Chan Li, CDL Sr. Data Analyst.
39 The usage data trends in Figure 6 were provided by Nga Ong, CDL Library Data and Services Analyst.
40 The usage data and the chart of trends were provided and prepared by Chan Li, CDL Sr. Data Analyst.
41 Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for nine: the shared cataloging program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12, doi:10.1016/S0098-7913(01)00169-1.
work_dg5cwof455dhtfs6oizmizutze ----

Those who don't look don't find: Disciplinary considerations in repository advocacy

Danny Kingsley
Centre for the Public Awareness of Science
Australian National University
Acton ACT 0200
danny.kingsley@anu.edu.au

This paper has been accepted with minor editorial changes for OCLC Systems and Services: International Digital Library Perspective (OSS:IDLP), Vol 24, No 24, early 2009. This will be a special issue related to the areas of open access and scholarly communications (including scholarly communication case studies).

Abstract

Purpose of this paper
By describing some of the often ignored aspects of repository advocacy, such as disciplinary differences and how these might affect the adoption of a particular institutional repository, this paper aims to offer practical guidance to repository managers and those responsible for open access and repository policy.

Design/methodology/approach
The argument uses examples from an empirical study of 43 in-depth interviews of academic staff in three disciplines, Chemistry, Computer Science and Sociology, at two Australian universities. The interviewees discussed their interaction with the literature as an author, a reader and a reviewer.

Findings
Disciplines are markedly different from one another in terms of their subject matter, the speed of publication, information seeking behaviour and social norms. These all have a bearing on the likelihood that a given group will adopt deposit into an institutional repository as part of its regular work practice.

Practical implications
It is important to decide the purpose of the institutional repository before embarking on an advocacy program. By mapping empirical findings against both diffusion of innovations theory and writings on disciplinary differences, this paper shows that repository advocacy addressing the university academic population as a single unit is unlikely to be successful. Rather, advocacy and implementation of a repository must consider the information seeking behaviour and social norms of each discipline in question.

What is original/value of paper
The consideration of disciplinary differences in relation to repository advocacy has only begun to be explored in the literature.

Introduction

The widespread uptake of the internet in the scholarly world over the last 15 years offers opportunities to reform the long-standing scholarly communication system.
Repositories have been mooted as a way to achieve open access, amongst other possible uses, but to date, particularly in institutional repositories, deposit of material has been slow. This paper looks at the challenges facing digital repositories in facilitating open access and how they are changing scholarly communication. In such a discussion it is necessary to explore what repositories are intended for. This depends not only on the type of repository in question, but also on whether the end-user is an institution, an academic researcher, a practitioner or the general public. Given that the work of academic researchers is usually the intended content of these digital repositories, we take an in-depth look at the work practices of these researchers to identify the barriers to a more general embrace of repositories as part of the scholarly communication process. This paper examines the introduction of repositories into the academic environment in terms of diffusion of innovations theory before discussing disciplinary differences and how they affect whether certain academic groups accept repositories. Specifically, the information-seeking behaviour within disciplines has a direct bearing on the likelihood that a given group will accept repositories as they are currently structured. Throughout, examples are given from a research project into three disciplines based at two universities in Australia. The conclusion makes recommendations to institutional repository managers for achieving a more enthusiastic uptake of their repository.

The open access argument

The open access movement has been active for over a decade. Broadly advocating that peer-reviewed scholarly material should be freely available on the internet at the time of publication, the movement originally developed from a reaction to the scholarly 'crisis' of the 1990s, when journal prices skyrocketed (Harnad, 2003). Exact definitions of what constitutes open access have since been determined (Open Society Institute, 2002; Max Planck Institute, 2003). There are currently two main ways to achieve open access: open access publishing, and using a digital repository to deposit the author's version of an article at the time of submission or publication. These are referred to as the 'gold' and 'green' roads to open access respectively (Harnad et al., 2004). Open access publishing has historically been in specifically created open access journals, such as PLoS Biology, or in journals that have moved from a subscription-based model to an open access model, such as the Medical Journal of Australia. Generally open access journals are funded either through a pay-on-acceptance charge (sometimes inaccurately referred to as author charges) or through scholarly association membership fees. It should be noted, however, that most open access journals will waive the charge for authors who are unable to pay. In the last two years, the 'hybrid model' has become increasingly popular with publishers, who offer authors the opportunity to have their article freely available at the time of publication for a fee. Some of these journals 'anticipate' that the subscription cost of the journal will be reduced according to the number of open access articles that appear in the issues (Suber, 2006).
This paper is concerned with the second, 'green' method of achieving open access: making authors' versions of articles available online. This can be through an author's own website, although repositories are generally considered to be more 'robust' and searchable because they comply with the Open Archives Initiative (OAI) protocol[i], which provides interoperable standards for searching repositories.

Types of repositories

The recent widespread uptake of repositories in institutions (van Westrienen & Lynch, 2005) has largely been due to the availability of open-source software that offers institutions the ability to build a repository 'out of the box'. The most widely used repository platforms worldwide are ePrints[ii] and DSpace[iii]. The release of ePrints by Southampton University in 2001, and of DSpace, jointly developed by MIT and Hewlett Packard, in 2002, allowed any group or institution to build a digital repository at minimal cost. There are other open-source platforms, and some service providers offer proprietary software platforms, but these will not be discussed here in any depth. The intended purpose of a repository not only determines the usefulness of a given repository platform as a way of achieving open access, but also affects how easy the repository is to use and what tools are available to the repository manager to encourage repository use. EPrints and DSpace have been designed with different purposes in mind. The goal of ePrints was to allow for the deposit of author pre- or post-prints to facilitate open access to the material without the reader having to pay a subscription fee. DSpace has a wider remit than ePrints, archiving a range of digital content including images, datasets and other forms of scholarly output (Nixon, 2003).

There is a distinction between institutional repositories and subject-based repositories that is relevant to the discussion here. In the former, the policies on the selection and retention of material, as well as the general scope and organization of the repository, are determined by the institution. This stands in contrast to the discipline- or subject-based repository, where depositing policies are determined by the research communities. These often develop in an 'organic' manner in response to a specific need in a discipline (Chan, 2004). Attitudinal research on law and economics academics has indicated a preference for subject-based repositories over an interdisciplinary archive (Pelizzari, 2003). As a demonstration of this preference, it is instructive to look at the participation levels of three subject-based repositories. The most obvious example is arXiv[iv], which was developed in 1991 as an archive for preprints in physics by Paul Ginsparg and hosted at the Los Alamos National Laboratory. Now hosted at Cornell University, it has expanded in scope to include astronomy, computer science, mathematics and other areas. According to its site, arXiv, at the time of writing, offers: "open access to 451,387 e-prints in Physics, Mathematics, Computer Science, Quantitative Biology and Statistics". RePEc, Research Papers in Economics[v], is a repository disseminating research in economics, where participants can deposit material through their own institutional repository or directly to the repository.
RePEc's website states that it holds 222,000 working papers, 316,000 journal articles, 1,500 software components, and numerous listings for books and chapters, as well as author and institutional contact and publication information. In the biological and life sciences, PubMed Central[vi], run by the US National Institutes of Health (NIH), is a free digital archive of the journal literature. Begun in 2000, the archive holds digitised versions of articles dating back to the 1800s as well as new material added daily. The archive holds approximately 650,000 items, with most of the recent content added by researchers who have been funded by the NIH.

Institutional repositories, by contrast, have not enjoyed this kind of success. OpenDOAR[vii] is a website listing and providing information on over 1,000 academic research repositories. A cursory glance shows that in Australia, institutional repositories contain between a handful and several thousand items, with the larger numbers often representing collections of images, or metadata items without the full access version of a paper attached. This low participation rate in institutional repositories is reflected worldwide (Ware, 2004a; Pelizzari, 2003; Allen, 2005). Even at Cornell University, the home of arXiv, academic deposits into the institutional DSpace repository have been low, with faculty indicating that those using a subject archive found it fulfilled their needs, making the institutional repository redundant (Davis & Connolly, 2007).

What are repositories for?

Variously, repositories have been mooted as: a simple way of achieving open access without changing the scholarly communication system or threatening publishers' livelihoods (Harnad, 2003), a method of streamlining university administration systems, a way to assist with academic workflows (Foster & Gibbons, 2005), or a tool with which to fundamentally change the whole scholarly communication system (Crow, 2002; Brown, Griffiths, & Rascoff, 2007). Given that digital libraries will generally be hosting institutional repositories, this paper will now focus on them rather than subject-based repositories. One of the reasons for the low participation rate in institutional repositories is an issue of purpose. Currently, of the above potential uses of institutional repositories, only the institutional goals of creating a university administration system and a digital library are being achieved. Certainly the large numbers of image and video items in institutional repositories indicate there was a need for this type of facility not previously being met. There are, however, serious long-term problems of sustainability for digital repositories (Bellekom, 2004). This is also an issue for data repositories (Buchhorn & McNamara, 2006), a topic for discussion elsewhere.

There is no doubt that institutional repositories are potentially a very useful tool for many aspects of an institution's administration, from offering a method for collating all the output from an institution, to reporting to funding bodies. In some respects, it is not surprising that institutional repositories benefit the institution. Certainly the nomenclature has indicated to the academic community that the repository is designed to support and highlight the achievements of the institution rather than provide any benefit to them (Foster & Gibbons, 2005).
But the issues are more complex than a matter of terminology. Using institutional repositories as a method of achieving open access has, to date, been only partially successful. While it is extremely difficult to quantify not only the number of items freely available in repositories but also the number of articles produced in a given year (Tenopir, 2004), a widely mooted figure is that approximately 15% of published articles are available in open access form in repositories (Sale, 2005). The broader question of whether repositories are reforming the scholarly communication landscape is well beyond the scope of this article. Suffice it to say that while arguments abound that the days of the scholarly journal are limited (the subject of a previous paper: Kingsley, 2007a), the scholarly communication system is currently deeply embedded in the reward system used in academia (Steele, Butler, & Kingsley, 2006), and until this changes there is unlikely to be a revolution. The practice of putting authors' versions of papers into repositories, despite concerns on behalf of publishers, has so far had little impact on subscription rates. Looking at the arXiv example, this highly successful and, in the relevant disciplines, almost universally used repository has been shown to have had no effect on the subscription rates of the journals publishing the final versions of the papers appearing in the repository (Beckett & Inger, 2006). Let us turn our attention to why repositories are having the successes and failures that they are. In doing so, the argument will now draw on research that has looked at implementing new ideas into groups of people, described as the diffusion of innovations.

Diffusing repositories into the academic community

In 1962, Everett M Rogers wrote a book, Diffusion of Innovations, outlining a new theory of how innovations come to be accepted by groups of people. The 5th edition was published in 2003. Institutional repositories clearly represent a new innovation, defined by Rogers as "an idea, practice, or object that is perceived as new by an individual" (Rogers, 2003, p.12). The diffusion process is concerned with communication of a new idea to members of a social system, described as "a set of interrelated units…engaged in joint problem solving to accomplish a common goal" (p.23). The implementation of repositories into the academic community fits neatly into these definitions, and this section of the paper will discuss insights from diffusion theory that may help guide those responsible for repository advocacy.

One reason why subject-based repositories have enjoyed relative success may be their focus on a particular discipline. Years of research into diffusion of innovations have demonstrated that diffusions are more successful if managed as a decentralised system, where the participants can make decisions about the diffusion process and create and share information with one another to reach a mutual understanding. Decentralised systems are likely to fit more closely with users' needs and problems. By contrast, institutional repositories are, by definition, 'centralised' systems, where the decisions about the innovation itself and the diffusion of the innovation are imposed from an external source: the university administration.
In these instances the innovation is diffused as a "uniform package to potential adopters who accept or reject the innovation. The individual adopter of the innovation is thought of as a relatively passive accepter" (Rogers, 2003, p.395). To simplify Rogers' argument, an innovation is more likely to be adopted if the adopter perceives the innovation to be more advantageous than the idea or process it supersedes, if it is consistent with the existing values, past experiences and needs of the adopter, if it is not perceived to be difficult to understand and use, if it can be experimented with, and if the results of the innovation are visible to others. Institutional repositories face difficulties on all these counts to varying extents. Issues of perceived complexity and demonstrability depend partially on the software platform the repository is built on and how the institution has customised its own repository. For example, ePrints offers download statistics for individual papers and a simple-to-use deposit interface. DSpace has partially attempted to address the disciplinary difference issue by structuring the repository so it reflects 'communities' within the university; these can be mapped to the departments of the university. In a university or other institutional environment, it is fair to say that generally the repository has been developed with the institutional structure in mind (Chan, 2004). Often, attempts to encourage repository use have involved university-wide strategies, such as mining personal websites for material academics are already putting online, finding out which journals allow deposit of post-prints and approaching authors who have published in them, or determining which OA journals people have published in (Mackie, 2004). But these suggestions, while likely to be effective in the initial goal of partially filling the repository, are heavily reliant on having a centralised person or system in the institution to manage this ingestion. These methods are unlikely to spontaneously encourage widespread use by the academic community itself, not least because the academic community is not homogeneous, a point discussed in greater depth below.

There are several commonly encountered problems with the adoption of repositories by academics in universities. One is a matter of language. For example, the expression 'post-print' is one used widely within the open access community and within library circles. The general academic community, however, is not familiar with this term, so using the expression 'the final corrected post peer review draft version' is far more effective (Callan, 2007). Another barrier to adoption generally experienced across most disciplines is a simple technological issue: the format of the item being deposited. There are problems with using proprietary software for items being deposited into a repository for what is intended to be long-term storage (Barnes, 2006b). Issues such as Microsoft Office Word 2007 not being backward compatible with previous versions of Word illustrate the difficulties of using this software in a long-term storage capacity. A simple way of addressing this is to ask authors to convert their documents to PDF before depositing them.
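A minimal sketch of what that conversion step could look like in an automated deposit workflow is given below. It assumes LibreOffice is installed and uses its standard headless converter; this is an illustration of the idea rather than the tool being developed in Barnes (2006a), and the file names are hypothetical.

# Sketch: convert an author's submitted document to PDF before deposit.
# Assumes LibreOffice ("soffice") is on the PATH; file names are examples.
import subprocess
from pathlib import Path

def convert_to_pdf(source: Path, out_dir: Path) -> Path:
    out_dir.mkdir(parents=True, exist_ok=True)
    # LibreOffice in headless mode converts common office formats to PDF.
    subprocess.run(
        ["soffice", "--headless", "--convert-to", "pdf",
         "--outdir", str(out_dir), str(source)],
        check=True,
    )
    return out_dir / (source.stem + ".pdf")

if __name__ == "__main__":
    print(convert_to_pdf(Path("postprint_draft.doc"), Path("deposit")))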
However, while such a conversion may seem a simple instruction to some people, it can cause difficulties within the general population, who may not be as computer literate as assumed and may not have access to the appropriate software. An open-access software program is currently being developed to automate this step (Barnes, 2006a), but until it is operational and deployed, the alternative is to provide a staff member to assist with the conversion and depositing process. The third issue is one of copyright. While many publishers do allow archiving of pre- and/or post-prints, and there is a website academics and administrators can use to determine publisher copyright policies[viii], most academics are unaware of it. This reflects the broader phenomenon that awareness of the copyright status of published work varies in the academic community, as does willingness to comply. Checking the copyright status of articles is time consuming and potentially confusing for academics, and is more efficiently dealt with at an administration level (Callan, 2007; Mackie, 2004).

While the above challenges are experienced across the board in academic environments, it would be foolhardy to think of the research community as a homogeneous group. The difficulty with developing diffusion policy within an institution is that the 'existing values, past experiences and needs' of academics change according to the discipline. Rather than a single social system, academics consist of a series of small, disparate groups with distinct differences. It is for this reason that a uniform advocacy or 'roll-out' program for a given institutional repository is unlikely to succeed. With this in mind we now turn our attention to disciplinary differences and how they might affect the adoption of repositories.

The disciplinary difference issue

To say that disciplines differ from one another is a truism; however, the extent to which they differ, not only between disciplines but also within them, is the subject of this section of the paper. In order to illustrate some of the propositions put forward here, examples will be given from interviews conducted as part of a research project into the barriers to the uptake of open access in Australia. A total of 43 in-depth interviews were conducted at two Australian universities, the Australian National University and the University of New South Wales, from October 2006 to March 2007, with academics in the fields of Chemistry, Sociology and Computer Science. The semi-structured interviews discussed the behaviour of the researcher as a reader, a writer and a reviewer of articles, as well as canvassing views on open access and attitudes to their institutional repository. After analysis of the transcripts, two interviews were conducted as triangulation at Queensland University of Technology (QUT), with the repository manager and the Deputy Vice-Chancellor of Technology, Information and Learning Support. QUT was chosen because it is the only university in Australia with a mandate to deposit scholarly output into the institutional repository (QUT, 2004). The full methodology of the research project is detailed elsewhere (Kingsley, 2007b). In choosing the three disciplines for interview, the initial consideration was the way the disciplines publish their work. Chemistry, representing a hard science, traditionally publishes in peer-reviewed articles in journals.
Sociology, while also publishing in this manner, has a tradition of publishing books or monographs as well, while Computer Science primarily uses conference proceedings for peer-reviewed communication. Publishing output, however, is only one manifestation of the fundamental differences between disciplines, and results from the general 'speed' of the endeavour in question. Fast-moving research with many people working on similar topics is described as urban, using the analogy of urban life (Becher & Trowler, 2001). High-energy physics and computer science are obvious examples, but the current race for priority in stem cell research identifies this as an urban area. Urban areas of research need a fast form of communication, and the development of repositories like arXiv was merely an electronic extension of an already thriving pre-print culture (Hagstrom, 1970). This hectic pace demands more informal forms of communication. Crane observed as early as 1972 that physicists ranked informal sources of information, such as conversation and correspondence, more highly than chemists did. Of the three disciplines interviewed, computer science is the fastest moving, and the use of conferences as a method of communicating ideas is the most efficient in this context. Sociology, by contrast, fits squarely in the category of rural research, where an individual researcher may be the only person worldwide working on a given topic. Books are an appropriate format for publication in this context. Many of the people interviewed in sociology described delays in journal article publication of two years; in one case an interviewee had been waiting for publication of a book chapter for nine years (although this is not typical, it does illustrate how protracted the process can be). These general time frames have been reported elsewhere (Becher & Trowler, 2001). Chemistry falls in the middle, with academics in different sub-disciplines reporting a range of publication times.

While academics can generally be described as people who work with ideas, the nature of the particular intellectual tasks on which specific groups are engaged determines to some extent their 'culture'. The divide between disciplines is not limited to the subject being explored. It extends to all aspects of the research endeavour: the language used, the methods of communication and the sources of information, to name a few. Returning to the likelihood of adoption of a new technology such as a repository, the level of engagement a particular group will have with a technology will be partially determined by its current work practices, and these differ from discipline to discipline. Disciplines themselves are hard to define, but to be admitted to membership of a section of the academic profession "involves not only a sufficient level of technical proficiency in one's intellectual trade but also a proper measure of loyalty to one's collegial group and of adherence to its norms" (Becher & Trowler, 2001, p.47). Identifying differences between disciplines may not be enough to determine successful ways of implementing repository use, as disciplines themselves encompass a series of sub-specialisms. Many of the computer scientists spoken to made the comment that they were 'unusual' because they 'straddled' another area.
While these areas were all different from one another, the trend of 'straddling' appeared to be almost universal, and certainly within the cohort of computer scientists/engineers interviewed it would be difficult to identify a 'typical' or representative one. This observation has been made elsewhere: "There is no single method of enquiry, no standard verification procedure, no definitive set of concepts that uniquely characterizes each particular discipline. It is in some contexts more meaningful to speak about the identifiable and coherent properties of subsidiary areas within one disciplinary domain or another" (Becher & Trowler, 2001, p.65). Generally the academic population is unaware of how other disciplines function: "academics seem to be surprisingly hazy in characterising other people's subjects of study, and their stereotypes of both subjects and practitioners are in general neither particularly perceptive nor particularly illuminating" (Becher, 1981, p.110). It can be argued that the university administration is similarly hindered in its understanding of the myriad work practices and social norms in disciplines. One work practice of relevance in this debate is information-seeking.

Desperately seeking information

The way a given group of researchers searches for information has a great bearing on their attitude towards the perceived usefulness of institutional repositories. Generally speaking, researchers undertake two kinds of searching of the literature, broad and specific (Back, 1962). While the term 'keeping up with the literature' might be considered quaint in some disciplines and irrelevant in others, it is still a practice undertaken in defined areas such as chemistry, although techniques have changed with the advent of the internet: "I used to on Friday morning check all the journals. In the old days we would go to the library". Several interviewees made similar comments. Now these searches are conducted electronically: "I get abstracts of journals sent – keeping up with it all is hard. I am on email lists…I look at journals online."

More commonly, researchers will be looking at a specific topic, because they are reviewing a paper and wish to ensure that the topic has not been covered elsewhere, or because they are writing a paper on the topic and need to ensure that they have seen or are aware of all other work in the area. It is this latter type of searching that is of most relevance to this paper. By looking at the specific tools different groups of researchers use to find information, clues can be found as to the usefulness or not of a repository to that group. Taking chemistry as the first example, those interviewed indicated that they use a series of tools including SciFinder, Thomson Scientific's Web of Science and Chemical Society Abstracts. There was not a great reliance on Google as a search engine, with a preference for databases.
The chemists, when asked about whether they would place material in a repository, made comments such as: "I as a user would like something that's searchable not just for an institution but across all institutions"; "My view is it would just get buried, people wouldn't look for it." Of course, the idea of the repository is that the searcher does not need to go to the institutional web page: they can use a search engine such as Google or OAIster and find the paper, almost without knowing they have found their goal through a repository. But this misperception that items in a repository would not be found by other people reflects the way chemists currently search for information: going to the database where information is housed rather than conducting general searches.

The sociologists in the sample used a wide range of tools to help them with their information search. In keeping with the rural nature of the endeavour, the concept of 'keeping up with the literature' was not adhered to, as there is not necessarily a specific literature in a given area of enquiry: "I am a bit of a generalist in my approach. What it gets down to is largely a matter of accident."; "The ideas are interdisciplinary, the field is so broad I don't worry about covering it".

In computer science, far more so than the other disciplines interviewed, it is common practice to have a personal website with all published papers listed on that site. In many cases there is a version of the paper attached to that listing. This practice reflects the ecological approach advocated by Gandel, Katz, and Metros (2004), who suggest personal digital repositories that can then be collated. When asked about the copyright status of those papers, the interviewees indicated that either they thought they had permission, or they were not concerned about potential repercussions from the publishers: "I haven't asked permission [to put pdfs on my site] but I have had no problems"; "I don't worry about copyright policies"; "All my stuff on the web probably contravenes the lettering of copyright… Publishers aren't bothered about you putting up papers on your website as long as that's all".

Putting the copyright implications of this practice to one side, having information available in this form means that, without exception, the computer scientists spoken to used Google as a search tool amongst other methods. The subject of their searches was a person rather than a topic, and the first place to look was an individual's website where the relevant paper (or one that was close enough) could be downloaded. This last situation is an interesting conundrum for an advocate of an institutional repository. Those researchers who put their papers on personal websites are already practising open access. All the material they use is available freely online via a Google search. Using personal websites might not address some of the sustainability issues that repository developers are trying to resolve, but in a fast-moving discipline most material is out of date very quickly, so this is not necessarily a priority: "Because I am researching the web – it's changing everyday. If my results are not out in one year … it will go nowhere"; "Computing moves so fast".
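The chemists' worry that deposited items would "just get buried" deserves a technical aside: an OAI-compliant repository exposes its metadata for harvesting, which is how services such as OAIster (and general search engines) pick deposits up without the searcher ever visiting the institutional page. A minimal sketch of such a harvesting request follows; the endpoint below is hypothetical, while the verb and the oai_dc metadata format are standard OAI-PMH.

# Sketch: harvest Dublin Core records from an OAI-PMH repository.
# The base URL is hypothetical; the request format is standard OAI-PMH.
from urllib.request import urlopen
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.edu/oai"
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
with urlopen(f"{BASE_URL}?{query}") as response:
    tree = ET.parse(response)

# Each record carries a header (identifier, datestamp) plus Dublin Core
# metadata, which is what aggregators index for discovery.
for record in tree.iter(f"{OAI}record"):
    identifier = record.findtext(f"{OAI}header/{OAI}identifier")
    title = record.findtext(f".//{DC}title")
    print(identifier, "-", title)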
There are evidently, in some cases, serious copyright issues with this practice that should probably be addressed for the researchers; but if the institution's focus is on achieving open access, then energy would be better spent, in the case of computer science at least, addressing the copyright problem rather than trying to encourage those academics to alter their behaviour and use the institution's repository.
The mandate argument
When encouraging self-archiving, subject-based repositories have a great advantage over institutional ones: "it seems there is a direct correlation between willingness to self-archive and the existence of subject-based repositories. Most of the academic units that have a high percentage of self-archiving scholars already have well-established subject repositories set up in that area" (Andrew, 2003). One way of enforcing an increase in use of institutional archives is to mandate deposit into them. Several open access advocates have called for mandates to encourage repository use (Harnad, 2006; Sale, 2007). In theory, this is supported by attitudinal studies showing that 80% of academics would willingly place their work into a repository if required to do so (Swan & Brown, 2004). But as the QUT experience shows, mandating alone is not the solution. Care must be taken to address both the broader and discipline-specific issues when rolling out the repository (Cochrane & Callan, 2007; Allen, 2005).
Open access as a sales pitch
There is a distinction between an individual's attitude towards an idea and their behaviour towards it, and open access is an example of this. Those interviewed generally expressed open access sentiments, suggesting the results of science should be freely available:
"Research is pretty meaningless if you can't communicate it. The whole purpose of research rests on disseminating the research"
"I believe work should be published. We are financed by the tax payer, it should be in the public domain".
"What's science for if you don't have things available?".
However, when asked about changing their behaviour, such as using a repository, there was less enthusiasm, and in some cases antipathy, towards the suggestion:
"I can't see the point of putting thesis on Digital Thesis when I have a copy on my own website"
"I don't know what benefit it is for me, it sounds like more work to do it"
"I don't see any harm in depositing in a IR, but don't see any use in it either"
"I have a concern about plagiarism"
"There are all sorts of copyright restrictions".
Certainly other studies have shown that, in theory, academics support open access (Swan & Brown, 2004), but their practice does not bear this out when looking at what scholarly output is available in an open access format worldwide. This apparent dichotomy could be for several reasons, not least the method of the introduction of the technology, discussed in this paper. Another compelling reason for resisting changes to their current work practices is that scholarly publishing is tied to the reward system in academia, and any change to the practice potentially jeopardises the academic's standing (Steele, Butler, & Kingsley, 2006; Bjork, 2004; Harley, Earl-Novell, Arter, Lawrence, & King, 2007).
In addition, concerns have been expressed about the potential clinical consequences of publishing non-peer-reviewed articles in chemistry and biomedicine, as well as fear of plagiarism in some humanities areas (Ware, 2004b).
The open access message is not necessarily a good 'selling point' to academics as a reason to put material into a repository. Researchers based in institutions in first world countries already have 'open access' to much of what they require because their institution subscribes to it. Access is not necessarily an issue for them. In the two (well-resourced) universities where interviews were conducted, the only access issue expressed was by some of the sociologists, who found they needed to buy their own books. Generally, however, this seemed to be accepted by the interviewees. Having books available in open access form is not what is being discussed here, so for the purposes of this paper, the academics who are being asked to make their work freely available are unlikely to have trouble themselves obtaining the material they use.
Open access encompasses more than simply scholarly communication, which implies communication between scholars, potentially a very private conversation. Academic endeavour is in many ways a social activity (Crane, 1972). Particularly in the sciences, research builds upon itself as researchers report small steps in the movement towards an answer to a large problem that many people are working on. Newton's famous quote, 'if I have seen further it is by standing on the shoulders of giants', is a lyrical description of this phenomenon. In order for this progression to occur, it is essential for researchers to communicate their findings to one another. Traditionally this has been by publishing articles in peer-reviewed journals, but as communication channels have improved, some disciplines have adopted faster, more informal methods of communication. Specifically, the introduction of the internet (which, it can be argued, represents a seismic shift in communication on the order of that of the printing press) has allowed for new types of communication previously unimagined. These 'Web 2.0' techniques, such as blogs, wikis, and Skype (to mention a few), are being adopted by many of the computer scientists interviewed.
A few good friends
Generally academic circles are very small, with an immediate group of approximately 5-20 people. A larger group of interested researchers might encompass about 200, but that is the extent of people who would have a direct research interest in an individual's work (Becher & Trowler, 2001). The intimate nature of these groups means researchers are known to one another:
"I follow …leads given by people I know. I rely on personal networks"
"I know most of the people active in my field, they send me their work. About 12-20 people".
If we consider the small size of the membership not only of one discipline but of the sub-speciality that makes up a particular individual's inner circle, the likelihood is that these people are not working in the same institution; indeed many of the people interviewed discussed their collaborators overseas. Given the requirements in a university environment of covering a broad range of topics for undergraduate teaching, it is not surprising that many academics find their research colleagues outside their own institution (Becher, 1981; Foster & Gibbons, 2005).
Academics need to communicate and share thoughts with their small inner circle, and using a tool developed by the institution is unlikely to be the first choice. Considering the small size of the intended audience of a particular piece of work, it is not surprising that many scholarly papers are never cited. A core of approximately 2,000 journals now accounts for 95% of cited articles (Steele, Butler, & Kingsley, 2006). Even allowing that citation counts are a blunt way to determine how many people read a paper, the academic audience for scholarly papers is not huge.
But if we move from scholarly communication and turn to open access, the audience becomes considerably broader. There is a large literature demonstrating that making articles open access increases the citations for those papers (Hitchcock, 2006). Apart from researchers in the third world, there is a wide audience for scholarly output including practitioners, such as teachers, nurses, doctors, medical and scientific lawyers and accountants, who work in fields that benefit from research but are usually not in a workplace that subscribes to the relevant journals. Field researchers – for private organizations and for government departments – are similarly disadvantaged. These are the beneficiaries of having material available as open access. It follows, then, that institutions with these cohorts might be at an advantage when implementing their repository. Indeed, this was suggested as a possible reason for QUT's relative success in repository uptake (Callan, 2007). It is perhaps surprising, then, that the author who consistently heads the Top 50 Authors list in QUT ePrints is researching and publishing in chemistry. With over 61,000 downloads of his papers in the previous year as at November 2007, a possible explanation for this extraordinary interest is that some of the work is in the area of environmental chemistry – another area where there are many field practitioners not tied to institutions.
This current wider audience for scholarly articles does not necessarily translate into quantifiable 'points' for the researcher in the form of citations. The open access argument will only tie back into reform of the scholarly communication situation if it reflects the reward system. If the way 'success' or 'impact' is measured changes (such as a count of downloads of material, for example), then the arguments for making material open access will become considerably more compelling for the academic.
Conclusion
A repository manager, faced with the challenge of encouraging repository use, must consider several aspects. While touting the repository as a means to achieve open access may appeal to some academics, the more pressing issues of disciplinary norms and their expected reporting behaviours will take precedence. Addressing these concerns will be the first step in successful repository advocacy. However, advocacy alone will not always translate into action by the academic community, and consideration of disciplinary differences when offering reasons and methods for using the repository will ensure a much smoother transition. Steps such as simplifying the process, offering assistance with the more technical aspects of depositing papers, and having a person available on the telephone rather than an email enquiry have all been shown to increase enthusiasm for the repository (Foster & Gibbons, 2005).
Adding benefits such as an individual researcher page, or tying the process into already existing administration to avoid repeated reporting, will encourage take-up of the system because it offers a benefit to the researcher.
When developing a university policy on open access and/or institutional repository use, the existing behaviours of the academic community expected to use it need to be considered. If the purpose of the repository is to achieve open access for the university or institutional output, then those disciplines where open access is already being practiced should be a low priority. Those disciplines unlikely to use repositories to find information will need to be given other reasons why their work should be made freely available. The concerns of researchers in other disciplines, such as about plagiarism, need to be taken seriously and addressed. Repositories are unlikely to solve scholarly communication issues in the short term. If open access is a priority, even with a strong open access policy, a mandate and staff dedicated to the process, the initial increase in open access to the institution's output is likely to be slow. If, on the other hand, the purpose of the repository is an administrative one, to assist the institution with reporting for funding, or as a showcase of the university output, then the onus of spending time adding to the repository and maintaining it should fall squarely on the shoulders of the university administration and not the academic community. It is a simple question – who benefits?
Bibliography
Allen, J. (2005). Interdisciplinary differences in attitudes towards deposit in institutional repositories. Retrieved 20 January, 2006, from http://eprints.rclis.org/archive/00005180/
Andrew, T. (2003). Trends in Self-Posting of Research Material Online by Academic Staff. Ariadne(37).
Back, K. W. (1962). The behaviour of scientists: Communication and creativity. Sociological Inquiry, 32, 82-87.
Barnes, I. (2006a, 2 February). Integrating the Repository with Academic Workflow. Paper presented at Open Repositories 2006, Sydney University.
Barnes, I. (2006b, July 2006). Preservation of Word-Processing Documents. Retrieved 30 September, 2006, from http://www.apsr.edu.au/publications/preservation_of_word_processing_documents.html
Becher, T. (1981). Towards a Definition of Disciplinary Cultures. Studies in Higher Education, 6(2), 109-122.
Becher, T., & Trowler, P. R. (2001). Academic Tribes and Territories (2nd ed.). The Society for Research into Higher Education & Open University Press.
Beckett, C., & Inger, S. (2006). Self-Archiving and Journal Subscriptions: Co-existence or Competition? An International Survey of Librarians' Preferences. London: Publishing Research Consortium.
Bellekom, C. (2004). Building preservation functionality in a digital archive: the National Library of the Netherlands. Learned Publishing, 17(4), 275-280.
Bjork, B.-C. (2004). Open access to scientific publications - an analysis of the barriers to change? Information Research: an international electronic journal, 9(2).
Brown, L., Griffiths, R., & Rascoff, M. (2007, 26 July). University Publishing in a Digital Age. Ithaka Report. Retrieved 28 November, 2007, from http://www.ithaka.org/strategic-services/Ithaka%20University%20Publishing%20Report.pdf
Buchhorn, M., & McNamara, P. (2006, September). Australian eResearch Sustainability Survey. Retrieved 28 November, 2007, from http://dspace.anu.edu.au/handle/1885/44304
Callan, P. (2007). Interview with D. Kingsley at QUT, Brisbane.
Chan, L. (2004). Supporting and Enhancing Scholarship in the Digital Age: The Role of Open-Access Institutional Repositories. Canadian Journal of Communication, 29, 277-300.
Cochrane, T., & Callan, P. (2007). Making a Difference: Implementing the eprints mandate at QUT. International Digital Library Perspectives, 23(3), 262-268.
Crane, D. (1972). Invisible Colleges: Diffusion of Knowledge in Scientific Communities. Chicago: The University of Chicago Press.
Crow, R. (2002). The Case for Institutional Repositories: A SPARC Position Paper. Washington: The Scholarly Publishing & Academic Resources Coalition.
Davis, P. M., & Connolly, M. J. L. (2007). Institutional Repositories: Evaluating the Reasons for Non-use of Cornell University's Installation of DSpace. D-Lib Magazine, 13(3/4).
Foster, N. F., & Gibbons, S. (2005). Understanding Faculty to Improve Content Recruitment for Institutional Repositories. D-Lib Magazine, 11(1).
Gandel, P. B., Katz, R. N., & Metros, S. E. (2004, March/April). "The Weariness of the Flesh": Reflections on the Life of the Mind in an Era of Abundance. Educause Review, 40-51.
Hagstrom, W. O. (1970). Factors Related to the Use of Different Modes of Publishing Research in Four Scientific Fields. In C. E. Nelson & D. K. Pollock (Eds.), Communication Among Scientists and Engineers (pp. 85-124). Lexington: Heath Lexington Books.
Harley, D., Earl-Novell, S., Arter, J., Lawrence, S., & King, C. J. (2007). The influence of academic values on scholarly publication and communication practices. Journal of Electronic Publishing, 10(2).
Harnad, S. (2003). For Whom the Gate Tolls? How and Why to Free the Refereed Research Literature Online Through Author/Institution Self-Archiving, Now. In D. Law & J. Andrews (Eds.), Digital Libraries: Policy Planning and Practice. Ashgate Publishing.
Harnad, S. (2006). FRPAA and paying publishers to self archive. Retrieved 28 November, 2007, from http://www.library.yale.edu/~llicense/ListArchives/0606/msg00165.html
Harnad, S., Brody, T., Vallieres, F., Carr, L., Hitchcock, S., Gingras, Y., et al. (2004). The green and gold roads to Open Access. Nature Web Focus: Access to the Literature. Retrieved 3 March, 2005, from http://www.nature.com/nature/focus/accessdebate/21.html
Hitchcock, S. (2006). The effect of open access and downloads ('hits') on citation impact: a bibliography of studies. OpCit Project. Retrieved 4 September, 2006, from http://opcit.eprints.org/oacitation-biblio.html
Kingsley, D. (2007a). The journal is dead, long live the journal. On the Horizon, 15(4), 211-221.
Kingsley, D. (2007b). The one that got away? Why changed reporting requirements will work against open access in Australia. First Monday (submitted October 2007).
Mackie, M. (2004). Filling Institutional Repositories: Practical strategies from the DAEDALUS Project. Ariadne(39).
Max Planck Institute. (2003, 20-22 October). Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities. Retrieved 28 November, 2007, from http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html
Nixon, W. (2003). DAEDALUS: Initial experiences with EPrints and DSpace at the University of Glasgow. Ariadne(37).
Open Society Institute. (2002). Budapest Open Access Initiative. Budapest, Hungary.
Pelizzari, E. (2003). Academic staff use, perception and expectations about Open-access archives: A survey of the Social Science Sector at Brescia University. Retrieved 24 February, 2005, from http://eprints.rclis.org/archive/00000737/01/Academic_staff_perception_about_Open_archives.htm
QUT. (2004). Policy F/1.3 E-print repository for research output at QUT. Retrieved 25 March, from http://www.mopp.qut.edu.au/F/F_01_03.jsp
Rogers, E. M. (2003). Diffusion of Innovations (5th ed.). New York: The Free Press.
Sale, A. (2005). The impact of mandatory policies on ETD acquisition. Retrieved 13 February, 2006, from http://eprints.comp.utas.edu.au:81/archive/00000222/
Sale, A. (2007). The Patchwork Mandate. D-Lib Magazine, 13(1/2).
Steele, C., Butler, L., & Kingsley, D. (2006). The Publishing Imperative: the pervasive influence of publication metrics. Learned Publishing, 19(4), 277-290.
Suber, P. (2006). Nine questions for hybrid journal programs. SPARC Open Access Newsletter(101).
Swan, A., & Brown, S. (2004). Authors and open access publishing. Learned Publishing, 17(3), 219-224.
Tenopir, C. (2004, 1 February). Online Scholarly Journals: How many? Library Journal.com.
van Westrienen, G., & Lynch, C. A. (2005). Academic Institutional Repositories: Deployment status in 13 nations as of mid 2005. D-Lib Magazine, 11(9).
Ware, M. (2004a). Institutional repositories and scholarly publishing. Learned Publishing, 17(2), 115-124.
Ware, M. (2004b). Universities' own electronic repositories yet to impact on Open Access. Nature Web Focus: Access to the literature. Retrieved 9 February, 2005, from http://www.nature.com/nature/focus/accessdebate/4.html
i http://www.openarchives.org/
ii http://www.eprints.org/
iii http://www.dspace.org/
iv http://arxiv.org
v http://repec.org/
vi http://www.pubmedcentral.nih.gov/
vii http://www.opendoar.org/
viii http://www.sherpa.ac.uk/romeo.php
work_dijyreduyfd5dl5qvlyxprkc2q ----
Collaborative Initiatives in Error Handling and Bibliographic Maintenance: Use of Electronic Distribution Lists and Related Resources
IAN FAIRCLOUGH
Fenwick Library, George Mason University, Fairfax, Virginia, USA
This is an electronic version of an article published in Cataloging & Classification Quarterly v. 51 (2013) no. 1-3. Cataloging & Classification Quarterly is available online at http://www.tandfonline.com/doi/abs/10.1080/01639374.2012.719074.
ABSTRACT
Over the past decade, people working collaboratively have created several electronic distribution lists, each dedicated to notification about a specific issue in error handling for bibliographic and authority records, and other aspects of catalog maintenance. Librarians and others concerned for the accuracy of classification numbers, established headings, and series data can communicate with each other via these lists and related projects. This article documents their history and role in cataloging operations. Subscription information and frequently-used abbreviations are provided in Appendixes.
This study was supported by an award of five days of research leave from George Mason University Libraries. The author gratefully acknowledges the kindness of John Zenelis, University Librarian, and the Libraries' Professional Development Committee (Laura Jenemann, Chair) in granting this award.
INTRODUCTION
A keyword search in an online catalog fails because of a typographical error in a bibliographic record. A library patron browsing the collection misses a book on the topic of interest because it is misshelved, not by an inattentive page, but because of a mistranscribed digit in the call number. A student who wants a list of all books by a particular author gets titles not only by that person but also others by a namesake with whom the sought person has become confused. A researcher attempting to identify titles in a scholarly series fails to find all but a few, on account of inconsistent application of headings among the various agencies from which the records in a local catalog were imported without examination.
These situations have in common the theme that an error has happened in the cataloging and classification process. Often, typographical errors are considered "minor", and perhaps this claim is valid when evaluating the skills of someone being trained in applying cataloging rules to library materials. But in the context of an online catalog, what one person might consider to be a "minor" typo can result in a user's failed search: hardly a minor matter. Contemporary tools such as Web search engines, when faced with nonstandard data, display a "Did you mean?" message with one or more plausible alternatives. Automated spellchecking routines can "correct" data with or without human supervision of each instance. But this practice sometimes results in inappropriate changes, in cases where the supposedly incorrect data is in fact what was intended. Tools such as these cannot perform the task of repairing the damage done by inadequate representation of resources. Human intervention is required. This article describes some of the tools available to librarians for error handling so that they can attend to situations in their local databases and concurrently assist others to do likewise.
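To make the "Did you mean?" behavior concrete, the following is a minimal sketch in Python using the standard difflib module. The vocabulary is invented for illustration; the essential point, as noted above, is that such a routine can only suggest alternatives, leaving the decision to a person.

    import difflib

    # A stand-in for an index of keywords drawn from bibliographic records.
    index_terms = ["management", "microbiology", "medieval", "measurement"]

    query = "managment"  # a keyword search doomed to fail against this index
    suggestions = difflib.get_close_matches(query, index_terms, n=3, cutoff=0.8)

    if query in index_terms:
        print("Found:", query)
    elif suggestions:
        print("Did you mean:", ", ".join(suggestions))
    else:
        print("No matches found.")

Note that the routine above would happily "suggest" a correction for a rare but perfectly valid word; that is precisely why automated replacement, as opposed to suggestion, is dangerous.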
In a statistically-based test, Beall and co-author Karen Kafadar examined the extent to which libraries deriving their cataloging from master records in OCLC's WorldCat database had corrected any errors in those records when adding them to their local database.4 Beall and Kafadar's research drew its examples from the Ballard List in a randomized process, thereby reciprocating Ballard's initial use of the "Dirty Database" test. 4 The relevance of Beall's article lies in his findings: Thirty to forty percent of typographical errors in records used for copy cataloging go uncorrected. Beall urges OCLC and other agencies to "redouble their commitment to eliminating typographical errors and develop more sophisticated algorithms to detect and eliminate the errors"5. Whether or not Beall's exhortation was heeded, much work is still needed to reduce, if it is not feasible to entirely eliminate, typos and other errors in bibliographic records. Asking suppliers of records to perform this task, rather than distributing records containing errors that have to be cleaned up by their customers, reduces the workload of all recipients, thereby saving them valuable time that can be spent on other activity. But if those agencies do not do so, the burden falls on individual librarians working with their local collections. Such librarians often work collaboratively, as described below, but not all database recipients may receive corrections. The articles by Ballard, and  by  Beall  and  Kafadar  recount the situation for which electronic mail became the platform for several projects dedicated to the elimination of errors in bibliographic records. A related concern also covered in these projects was improving the quality of bibliographic data that was not technically erroneous at the time of cataloging, but that requires local attention on the part of librarians because of obsolescence or another situation. Bernie Sloan comprehensively discussed the use of e-mail lists in librarianship.6 One concern is terminology: in common parlance such lists are often called "listservs." Sloan cautions: "listserv" should not be used generically to describe electronic discussion lists, as the term is a registered trademark licensed to L-Soft International, Inc."7. He recommends simply using "list". Two further observations to complement Sloan's remarks: 5 (1) Using the term "listserv" to refer to all lists regardless can lead to the mistaken impression that all such lists operate similarly, whereas rival software does not necessarily have the same features as Listserv®. (2) Distribution rather than discussion is a more comprehensive term to interpose, thus: "electronic distribution list". This comment is borne out as Sloan identifies two types of lists: An announcement list serves a function much like that of a newsletter. The communication is one-way. Subscribers cannot post to the list. The list owner or moderator generally is the only person authorized to post items to the list. Subscribers play a passive role, simply receiving information. A discussion list is an interactive forum for communication. Subscribers may post an e- mail message to the list, with all subscribers receiving a copy of the message. Other subscribers may choose to respond to the initial message, with their e-mail reply also being distributed to all list subscribers. To these types, I have added a third: a notification list. This type resembles an announcement list, but with some differences. 
Rather than just one authorized person, other list subscribers may post. Discussion is not encouraged, but where appropriate it is permitted: for example, when a contributor requests assistance with resolving a problem. Often an initial question results in off-list correspondence, with the resolution of the matter in a subsequent post to the list. This releases other list readers from having to follow intervening stages of the discussion, allowing them to focus instead on the resolution and its application to their own work locally. With discussion lists, on the other hand, it is preferred that all correspondence be made on the list for all to see. 6 These characterizations are broad generalizations, and do not apply in every instance. Whatever the type, it is important that contributions be made to a list on a fairly regular basis, lest it should become inactive. Another distinction Sloan makes is between "public" and "private" lists, public lists being open to all, while private lists have restrictions on membership. To which I add: with Listserv®, "private" has a specific technical meaning for list operation. "Private" means that only subscribers can send a message to the list, but it does not necessarily restrict who may subscribe. Lists are further distinguished as "moderated" (someone serves as "editor"--again, a Listserv® technical term--to review all messages prior to distribution) and "unmoderated": messages are forwarded without review. To complement Sloan's comments on list moderation: a Listserv® list can be set so that all persons writing to the list must themselves confirm that they intend to do so, a technique to cut down on unwanted messages ("spam"). This initial confirmation can be followed by moderator approval, for double confirmation that the message is pertinent to the list's purpose. Archives are often a feature of list structure. Sloan says, "A good archive will allow you to search for past messages on a given topic, written by a specific person, etc."8 and that unarchived lists are rare. Listserv® as well as rival products mostly allow all messages posted to a list to be stored for future reference. Further comments on Sloan's remarks: archiving does not guarantee a literal rendition of all that was said on a list, for under certain circumstances the archives can be altered. It is also technically possible for a message to be present in the archives without actually having been distributed to subscribers (and with Listserv®, some lists are specifically set up for this purpose). Another means of accessing Listserv® archives is via third- party software such as gmane.com. 7 Sloan's article continues with observations about list etiquette; passive members ("lurkers") versus the active ones who make the list work and provide its personality; a statistical study of list membership and participation (one half of one percent of the number of posters accounted for 22 percent of the total messages sent); the fact that some participants were professionally active before lists existed while others have never known life without them; and the time when commercial activity on the Internet was proscribed (hardly imaginable today, but with a lasting negativity toward vendor participation on some lists). In concluding remarks: Sloan says, "Posting to a list may bring you fame, or it may bring you notoriety. Active participation in a library list allows you to make an impression like nothing else can. Make sure it’s a good impression." 
Sloan's article is recommended reading for all wanting a general background in list usage and participation. The lists described in this article all have the characteristic of being intended for notification9: an aspect rarely found among other lists, and which distinguishes them as a group. All of them are collaborative endeavors. From the beginning, they were set up as cooperative projects, with several people participating. In their ongoing status, users have both active and passive roles with respect to list participation, as Sloan describes, but only with respect to whether or not they actually contribute to the list. Participation in the projects that the lists support happens as a person reads a list message and investigates the local situation, taking remedial action if necessary. Thus project participation extends beyond actively contributing to the list itself, as "passive" list subscribers follow up on posts. Brief articles announcing the start of these projects were issued in TechKNOW, a publication of the Ohio Library Council's Technical Services Division. 8 INITIATIVES IN ERROR HANDLING The first paragraph of this article gave a description of various types of situations that can result in the failure of an online catalog to assist the user as required. Not all of the situations are errors in quite the same sense, but they must be addressed if the catalog is to function properly. Perhaps some readers of this article believe that the solution to error handling is 100% accuracy during the cataloging process. If feasible, this truly would avoid the problem in most, but not all, of the situations addressed in this article. Nevertheless the fact remains, as Ballard has pointed out: the presence of typographical errors in library databases is a universal problem. Complete accuracy is an unrealistic expectation, and is not feasible. Instead, this article describes initiatives that have been taken to remedy erroneous situations rather than to prevent them. Participation in the projects can result in wholesale improvement of the quality of a local bibliographic database. But improvement will not happen automatically. Human intervention is required. The initiatives and the situations they address can be grouped into three types: (1) Typographical errors, addressed via the Typo of the Day for Librarians Blog; (2) Classification errors, notified via the electronic lists DEWEYERROR and LCCERROR; (3) Access issues, the subject of the electronic lists PERSNAME-L, SERIES-L and SACOLIST. INITIATIVES ADDRESSING TYPOGRAPHICAL ERRORS In his article, Ballard kindly credits this author with asking the initial question that gave rise to the Typo of the Day for Librarians Blog. The blog's introductory statement reads, "We are a group of librarians from all over the world with a common interest - keeping our online catalogs free of errors." This current collaborative project is built on the foundation of 9 Typographical Errors in Library Databases, to which numerous people, notably Tina Gunther and Phalbe Henriksen, have been contributing for many years. The intention of this project is that, by correcting instances of that particular error on that day, databases worldwide can be rid of it: an ambitious goal, undoubtedly, but one which is achieved in part in each location where action is taken. One person from each institution can monitor and act upon blog posts. The blog has blossomed in a way that has sparked the imagination of numerous participants. 
The daily posts are replicated via an automatic retransmission on AUTOCAT, a general-purpose discussion list for cataloging. The blog is accompanied by a Wiki, used for development of the program of blog posts and as an index to record what typos have already been announced. The LIBTYPOS discussion list (not hosted by Listserv®, but a Google group) allows communication among participants and others interested. Several contributors, notably Carol Reid (New York State Library) have kept the project going. Anyone wishing to participate in writing the daily blog posts is invited to join in the activity.10 The work of correcting typos cannot be automated without danger of introducing an error where one did not previously exist. If a system has a global change function, it is best ignored for this purpose. The word Grammer, for example, featured in Beall's Dirty Database Test, is also a correctly spelled personal name. As such, it is also within scope for the PERSNAME-L list, discussed below. INITIATIVES ADDRESSING CLASSIFICATION ERRORS As with typographical errors, mistakes in call numbers sometimes happen, either from data being mistranscribed, or from the misapplication of classification schemes. When a call 10 number has an error, library users browsing the shelves can miss the materials for which they are looking. Collection development librarians reviewing the materials on the shelf will perhaps wonder why a title that is seemingly out of place is present. Even in a closed-stack environment, a call number error is a problem for a reference librarian attempting to assist with locating materials on behalf of a scholar who has sought assistance.11 Errors in Dewey Decimal Classification (DDC) numbers came to my attention in such frequency that I took action to assist all concerned. The following account describes the circumstances surrounding the establishment of the DEWEYERROR list, and its earlier manifestation, the Dewey Error Notification List. Before starting work at Marion Public Library (MPL) in Marion, Ohio on May 30, 2001 I wondered why a public library serving a community of some 64,000 residents had a need for a second M.L.S.-degreed professional cataloger. But this need soon became apparent as the extent of catalog and collection maintenance for the existing collection was revealed. I worked mostly with copy cataloging from records found in OCLC's WorldCat database. Additional work was often necessary to correct erroneous data. It seemed counter-productive to make corrections locally without also addressing the problem in the source. Thus it was not long before I began sending reports of errors to OCLC, for records found in WorldCat, and to the Library of Congress (LC), for versions of those records found in LC's own online catalog. This included DDC numbers assigned by LC's Decimal Classification Division (DCD). In giving the MPL nonfiction collection a thorough review, I examined how DDC had been applied, evaluating sections of the collection in tandem with materials selectors, who would advise on priorities and deacquisition items prior to reclassification. Extensive local practices also existed, for which I provided documentation and made additions. Although I insured to the best of my ability that DDC numbers in WorldCat master records were formulated properly, a 11 different number was often assigned locally. Thus investigation of the appropriateness for local use of the DDC number given in the WorldCat master record required a two-stage process: 1. 
determining that the number given in the master was strictly correct, and 2. entailing steps in local modification. Consequently I evaluated the accuracy as well as appropriateness for local purposes of DDC numbers in records for all incoming nonfiction titles. In the course of this work, records failed the verification process in the first stage sufficiently often to arouse concern about the overall accuracy of DDC numbers in libraries in general. The practice of examining WorldCat member copy records to determine whether a DDC number is correct or appropriate locally is quite common among libraries. But this practice is not the case for bibliographic records originating with LC. It is widely assumed that LC records will be free of error. That is the case for all who routinely assign them to staff charged with accepting the LC record without question. It was the process of first finding out what the DDC number was supposed to be, working strictly as instructed in the schedules and tables, that revealed that errors were indeed to be found in DDC numbers in LC records. Because of the potential benefit to other users, shortly after starting at MPL I began to communicate about suspected errors in LC catalog records. I exchanged e-mail messages with staffers at LC's Decimal Classification Division (DCD). Occasionally a DCD staff person responded with an informative correction to my report, but mostly they confirmed my suspicions and thanked me. I quickly learned that LC staff like to be informed of errors in LC's own catalog12, but are not necessarily concerned with errors in a version of that record found elsewhere, such as in WorldCat. LC staff have no control over other versions of the record once it has been promulgated to other agencies, and often other data (some of it erroneous) is introduced elsewhere. 12 Other librarians came to my attention for posting messages about erroneous records on electronic distribution lists such as AUTOCAT and OCLC-Cat, and gradually the idea formed of separating off these reports into a forum expressly dedicated to communication among all those concerned. A general forum for all kinds of errors seemed too large a project. But one could perhaps address one specific type of error, and DDC numbers seemed to be a logical choice for the project that evolved, first as the Dewey Error Notification List (DENL) and later as DEWEYERROR. DEWEYERROR In spring 2002 I set up the Dewey Error Notification List using Microsoft Outlook. I maintained this list privately on my work computer, adding people to a group of e-mail addresses as they expressed interest, and receiving and forwarding messages from participants. About thirty people participated, many of whom learned about the project from an announcement on AUTOCAT. The number of participants grew to around seventy in the first two years. The list functioned entirely through action on my part in each individual case, though my actions mostly followed those of other contributors. Those who sent reports to LC would include me as a recipient, and I forwarded the reports to the group, in addition to reports of my own. Other people, regardless of whether they actively checked errors or not, received these reports and, to the best of my knowledge, acted upon them. Thus there were two groups for whom DENL was appropriate: 1) Those who routinely accept DDC numbers from LC catalog records without checking them; 2) Those who not only check the DDC numbers, but also advise LC when they suspect a number is in error. 
13 This process for forwarding messages was a labor-intensive one. The effort required, plus technical difficulties in maintaining the list using Microsoft Outlook, prompted me to seek assistance. These difficulties entailed handling messages that could not be delivered because of an incorrect address, the non-response of the recipient's host server, and so forth. Furthermore, no facility was available that would create archives of messages contributed. But a Listserv® list can have all its posts archived, and will thus preserve a record of all messages for future use and reference--provided that they are not adjusted (which technically can and does happen). At MPL I did not have access to Listserv® software, but I hoped that someone who did (most probably at a university) might offer to help. My wish was fulfilled when Margaret Maurer (Kent State University) suggested that the Technical Services Division (TSD) of the Ohio Library Council (OLC) could assist with migration of the list to a Listserv® platform. In summer 2004 the council met and readily agreed to assist with this project. The name DEWEYERROR was decided upon as the most suitable name. Since the word Dewey is a registered trademark held by OCLC, an Action Council member (Laura Salmon) sought permission for use of this name, which OCLC kindly granted. Since LC records are in the public domain no authorization is required for their reproduction. But in order to insure that no conflict of interest would arise with LC practices, I contacted DCD to advise them of the new list. Dennis McGovern (then Chief of DCD), spoke kindly of the new venture both at its outset and subsequently, even at one point contacting me for assistance with statistical data on the number of records reported. On October 27, 2004 DEWEYERROR commenced operation, hosted at Kent State University (KSU), with myself, Maurer, and Sevim McCutcheon (also of KSU) serving as co- listowners, plus an initial 77 subscriptions. TSD coordinator Bonnie Doepker (Dayton Metro 14 Library) and I announced the new list in TechKNOW. 13 We also sent announcements to various electronic lists, and the number of subscriptions grew to around 300. Eventually McCutcheon took over as leader, and Tom Adamich became the third listowner when I ceased working with DDC in January 2008. DEWEYERROR was set up under close guidelines for contribution of messages. As with the other lists that are the subject of this article, all messages are reviewed, so no post goes to the readers without first being "approved" by a list owner. The list owner does not, however, check the contents of a message for accuracy. Rather, the message is scrutinized for evidence that LC has already been informed of the suspected error. Most messages contained both LC and DEWEYERROR as recipients, so could easily be identified. But in cases where it was not evident (as when the report to LC was made independently of DEWEYERROR), a listowner would write to the sender to check whether LC had been informed, and if not, to request that the writer tell LC directly. Although LC staff wish to be advised about suspected errors, they have already set up channels of communication by which to receive reports. It was not intended for DEWEYERROR to replace or even complement those channels, although LC staffers are welcome to subscribe and read messages. DEWEYERROR is stricter in adherence to policy than was DENL, because of the special nature of the agreements under which DEWEYERROR was set up. 
With DENL, I sometimes distributed messages identified as "not exactly wrong" (and therefore not sent to LC), but which reflected a local practice, thinking that some subscribers might wish to use a similar number locally -- particularly if their collection was of similar size and nature to MPL's. But such messages were not in scope for DEWEYERROR. 15 Example of a Typical DEWEYERROR Report Barbara Thiesen (Bethel College, North Newton, Kansas) has provided a model example of a DEWEYERROR post in this recent one14: LCCN: 2010033097 Author: Phillips, Carl Title: Double Shadow Dewey number in record is 813.54. This is a book of poems, so the Dewey number should be 811.54. This message has also been submitted to LC. The message contains a proper identification of the bibliographic record in question, a statement of the problem, and its resolution. The final statement assures the listowner of compliance with the policy of reporting to LC. Since many new subscribers had not seen messages sent via DENL, I began to resend the ones that would qualify under the new DEWEYERROR guidelines, identifying them as reposts with an initial phrase beginning Previously reported. Bryan Baldus (Quality Books, Inc.), who contributes frequently to DEWEYERROR, had maintained a file of all posts to DENL, which came in very useful as we identified which messages to forward. Eventually McCutcheon took over the task of forwarding the messages. A search of the DEWEYERROR archives conducted on November 27, 2011 using the phrase previously reported retrieved numerous such messages, the most recently posted having the date June 8, 2010. It was one that Carl Cording (College of Saint Rose) originally sent on March 22, 2004. One might wonder why a message from six years earlier documenting a DDC error might be deemed of current interest. But for any library, if an item is still misplaced locally after several years because of an erroneous number, the matter can be of concern. By fall 2006 I had become a member of OLC TSD's Action Council, serving (eventually as coordinator) through 2008. We reviewed the status of DEWEYERROR, and decided that 16 further sponsorship was unnecessary. The list was functioning well without any intervention or assistance on TSD's part. Therefore the official connection was dropped, leaving DEWEYERROR in the hands of its listowners for routine operation. Provided that this status continues, and one should bear this provision in mind, DEWEYERROR can continue to operate as under its current arrangements. At present, it has about one post per week: not a heavy flow of traffic, but sufficient to maintain list operation and avoid the danger of becoming inactive. LCCERROR If DDC numbers in LC records have errors, one might think that Library of Congress Classification (LCC) numbers would also have them. It is however more tricky to determine cases of error with LCC than with DDC. And while working with DDC I had no business investigating LCC numbers. It did cross my mind that a companion list to DEWEYERROR, doing for LCC what DEWEYERROR does for DDC, might serve a useful purpose. After moving in 2008 to my current position at George Mason University, an LCC-classed library, and having indeed encountered cases of erroneous LCC numbers, it became practical to consider starting LCCERROR. Roman Panchyshyn (Kent State University) and Sevim McCutcheon kindly agreed to host the list at KSU. Once technical specifications and list policy were established, the list commenced operation on October 31, 2010. 
It currently has 71 subscribers, far fewer than the other lists. It might seem that the two lists, each addressing one of the major classification schemes in use in libraries worldwide, would have almost identical functions. DEWEYERROR and LCCERROR do share a lot in common in their scope and purpose. But the circumstances of the two classification schemes are different in subtle ways that significantly affect list operation. In 17 part this is from their history, in part from characteristics of the schemes, as well as their ownership and the circumstances in which call numbers are applied to library materials. Practically speaking: the LCC scheme does not apply just to the classification portion of a call number, but also to the shelflisting. In a bibliographic record created by LC, both DDC and LCC numbers can be included, but only the LCC number is carried out to an exact shelf location. Furthermore, libraries other than LC that use LCC create numbers for local use and in the process can integrate them with existing numbers in LC's catalog, in anticipation that other libraries subsequently encountering the number they have created will use that number. Thus the widespread application of LCC within WorldCat falls within the scope of concern of LCCERROR, whereas DEWEYERROR is restricted to numbers found within the LC catalog. The expectation of exactitude applies with LCC numbers regardless of origin; it does not apply with DDC numbers in OCLC member copy. Therefore LCCERROR was set up to include notification of suspected errors in LCC numbers in all sources, not just records originating with LC. No requirement exists that the report must also be submitted to LC. Indeed, in cases of non- LC records, no such report should be sent. As with DEWEYERROR, I was concerned that LCCERROR would cause no misunderstandings or inconvenience with LC. So I wrote to Dr. Barbara Tillett (Chief, LC Policy & Standards Division) to assure her that steps were in place to prevent messages that should properly be sent to LC from being sent instead to LCCERROR. She kindly responded: "Would it be possible for your service to send a notification directly to Mary Kay Pietris [LC Cataloging Policy Specialist] ... as our point person on LCC corrections? We have similar arrangements for personal names and some other alerts, and it really helps us respond more quickly to fix our data. As I am sure you are aware, LCC has evolved greatly over time so what 18 may be considered an 'error' in classification today was not when the book was originally cataloged. The JX schedule, which is now the JZ and KZ schedules is a good example of this. While some libraries might want to reclass their books, I'm sure many others, like LC, don't have the time and manpower. Perhaps on your list you could remind everyone of that and on the LCCERROR homepage. Catalogers should know this about LCC, but sometimes when you're looking at bibliographic records individually you lose sight of the big picture. Thanks for helping us improve the quality of our data."15 I was glad to comply with these requests, and the LCCERROR list thus does have a reader from LC in Ms. Pietris. And in her mention of personal names perhaps Dr. Tillett is referring to PERSNAME-L, which has long had subscribers among the LC staff. In posts to LCCERROR, I differentiate carefully between those commenting on records contributed by LC (which I also communicate directly to LC) and those on records originating elsewhere (and therefore are not reported to LC). 
Example of a Typical LCCERROR Report
Jay Shorten (University of Oklahoma) contributed the following:
LCCN 2011006448. The lesson of Carl Schmitt, Expanded ed., is not complete yet, but has an 050 00 of JC263.S34 M44514 2011. This number is not correct; it should be JC263.S34 M44513 2011 to match its previous edition, LCCN 98023580. (LC has changed their call number.)16
This post exemplifies use of LCCERROR to notify about a changed book number. One might question whether such a change is necessary. That question is addressed by the statement that LC changed their number: in other words, it was important enough to LC to make the change. Policy at individual libraries may differ on this matter.
If one considers the number of subscribers and messages as a measure of success, LCCERROR has not been as successful as DEWEYERROR. Several months after the start of operations, I asked Bryan Baldus for his ideas about why LCCERROR doesn't get as many posts as DEWEYERROR. Baldus, who contributes to both lists, answered: "Part of the reason could be that many just accept LCCs as they are, without question. Another, that it's more difficult to see the problem with an LCC than in a Dewey, since they are more difficult to quickly parse. After ten plus years working with them, I barely have any memorized, while with Dewey I can usually look at the number and determine what it means; can see standard subdivisions that are or aren't what they should be."17 The mnemonic aspect of DDC is well known, if only to DDC classifiers, whereas LCC lacks such features.
The underusage of LCCERROR makes me wonder whether the time and effort spent on maintaining it is worthwhile. As the originator, I have a nurturing attitude toward the list's continuance. Practically, it would be better to close the list down than to allow it to become inactive. By alerting the cataloging community to this service, I hope that more people will begin to take advantage of it, make contributions themselves, and ensure that LCCERROR will continue to grow for as long as there is a need for it.
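Baldus's point that LCC numbers are "more difficult to quickly parse" can be illustrated with a regular expression. The Python pattern below is a sketch that handles shapes like the Shorten example only; real LCC call numbers admit many more variations.

    import re

    PATTERN = re.compile(
        r"^(?P<letters>[A-Z]{1,3})"          # class letters, e.g. JC
        r"(?P<number>\d+(?:\.\d+)?)"         # class number, e.g. 263
        r"(?P<cutters>(?:\s*\.?[A-Z]\d+)+)"  # one or more cutters, e.g. .S34 M44513
        r"\s*(?P<year>\d{4})?$"              # optional year of publication
    )

    m = PATTERN.match("JC263.S34 M44513 2011")
    if m:
        print(m.group("letters"), m.group("number"),
              m.group("cutters").strip(), m.group("year"))

Even this modest dissection has no DDC-style mnemonic shortcut: nothing in the string itself signals, for example, a standard subdivision, which is Baldus's point.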
Thus when LC decided in 2006 to no longer provide an added entry for a bibliographic series, but simply to use field 490 to record the series data as found, they were in effect implementing a local practice, "local" in this context referring to LC. And in so doing LC followed a MARC coding practice already observed elsewhere. If the previous sentences were made in a post to a discussion list rather than in a published article, an outburst of responses would likely ensue! But this observation is made here, not to provoke controversy, but in order to document the background against which PERSNAME-L and SERIES-L came into existence as notification lists. These lists can differ from DEWEYERROR and LCCERROR in what is considered to be an error. In some situations no error existed at the time that the original cataloging was done. In many cases, changes in headings that were correct when assigned subsequently took place for legitimate reasons. Some agencies can receive notification of the change via a third party, but for others, particularly those for whom such services are prohibitive, either for reasons of 21 expense or because of the staff time involved in attending to the reports, the PERSNAME-L and SERIES-L lists can come in handy. PERSNAME-L PERSNAME-L originated largely through the efforts of Jay Shorten to communicate information specific to personal name headings via a dedicated list. For some time, Shorten had posted notices to AUTOCAT announcing the creation of a name authority record (NAR), the modification of a name heading, or some other issue concerning a personal name in bibliographic and authority records. My initial role was to ask for his agreement to collaborate on this project. I also provided documentation for policies and procedures and wrote the welcome message. Shorten set up the list, and PERSNAME-L started operations on July 30, 2007. Whether fully detailed or a brief comment, one of the PERSNAME-L listowners approves for distribution all posts that fall within the list's scope of interest. My active role as a listowner ended in January 2008. At that time Roger Miller (Public Library of Cincinnati and Hamilton County) and Wanda Gunther (University of North Carolina) became listowners and began to share with Shorten, on a rotating basis, the responsibility of approving messages.18 Most posts to PERSNAME-L consist of reports of a changed name heading. Some subscribers including myself have on occasion asked if another subscriber has information about a person they are trying to contact for the purpose of establishing an authorized heading, or whether an existing NAR represents the individual whose book is being cataloged. In this respect, the more people participating in PERSNAME-L, the more likely it is that someone will actually know the individual being investigated, or perhaps will work at the same institution and 22 have access that is not available to outsiders. Thus PERSNAME-L exemplifies a form of "crowd-sourcing". A Notification on PERSNAME-L Sometimes a cataloger from an institution that is not a participant of the Name Authorities Cooperative Project (NACO) seeks help from a NACO member in getting a NAR created, so as to document research, provide a reference, or to record other pertinent information for those interested. Such work is illustrated by the following exchange. Rich Aldred (Haverford College) wrote: "My system reported to me that Napoleon, Art (trumpeter, cornetist, leader, and writer) is a 400 to Sudhalter, Richard M. 
However, there was a movie director, producer, writer Art Napoleon. There is no authority record for him, but, according to the International Movie Database (IMDb), his dates are 1920-2003, so I added that to our catalog record, and the few OCLC records for his works."19 To which Deborah Tomaras (New York Public Library) kindly responded: "I've made the authority record for director Art Napoleon (ARN 8934581). I'll control all the headings with his name on them."20 Note that Tomaras took the additional step of controlling the headings in bibliographic records, a feature of OCLC Connexion functionality. Such work, which goes beyond the scope of PERSNAME-L activity, greatly assists all future users of the records involved. Thus NACO participants can and do engage in the kind of differentiation work that lends itself to reporting about changed headings for individuals. Such changes form a substantial portion of the messages that have been contributed to PERSNAME-L since its inception. PERSNAME-L has no formal connection with the NACO project, although it receives occasional mention on PCCLIST, the Listserv® list associated with (and restricted to participants in) the Program for Cooperative Cataloging (PCC), of which NACO is a component. Not all situations necessarily receive all recommended attention, however. On OCLC-Cat a correspondent inquired about an apparent mismatch in a WorldCat master record between a heading and the NAR to which it was controlled.21 That message prompted me to alert PERSNAME-L readers to a change in death dates for author Thomas G. Frothingham from 1937 to 1945, as documented in the third 670 field of the NAR (n 90606903). Another correspondent wrote pointing out that the first 670 field gave a usage different from that on which the heading had been based (middle initial G. vs. middle name Goddard), and that therefore the heading in the NAR should be revised.22 As of February 20, 2012 no such revision has taken place. Many catalogers will find themselves in the position of knowing that edits to records are appropriate, but will not make them for various reasons.23 I corrected the dates locally, in the interests of our catalog users. But owing to conflicting work responsibilities, I did not go further. The question of the middle name versus the initial was in my judgment not of sufficient concern to justify the additional task of retrieving our local item so that I could have it in hand while editing the NAR. The data that concern catalog users are not factually incorrect. In considering what action to take, the accuracy of the data plus the needs of the user are the principal criteria. PERSNAME-L has also given rise to, or incorporated, related projects. An example is a set of data files from Gary Strawn (Northwestern University). His message "List of Newly-differentiated Headings from LC Names ..."24 drew my attention to this resource, which Strawn created independently of the PERSNAME-L project. It is updated weekly and has proved a valuable source for headings that had recently been created with qualifying information that differentiated one person from another namesake. Strawn says: "The reports of newly-differentiated headings are indeed kept up to date by weekly postings.
It may be of interest to note that these are now accompanied by weekly listings of several other kinds of changes: authority records for undifferentiated names whose 'author of' 670s have changed; authority records deleted; authority records whose 1XX field has changed (available both as text and in MARC format)".25 The file of newly-differentiated headings formed the basis of a local project. With assistance from a student employee, I identified headings of local concern and investigated whether an update was required. In searching locally for the undifferentiated heading, sometimes I would find the one referenced in the file. In other instances, I would find yet another namesake who was unqualified locally, but is now represented by a differentiated heading in the WorldCat master record. The procedure for each heading under investigation in the local catalog entailed: 1) Perform a "browse by author" (in some cases, also by subject) search. 2) If an undifferentiated heading is found, display the bibliographic record and search in WorldCat for the current version of the master record. 3) If the heading in the master record has qualifying information, import the NAR to the local database and edit the local heading to match the master record. 4) Send e-mail to PERSNAME-L advising about the changed heading. An important caveat: one should not assume that a heading in a WorldCat master record is necessarily correct. In my posts to PERSNAME-L I avoid where possible statements concerning the accuracy of found data, preferring merely to point out the discrepancy. Instances have occurred where a local record has the correct heading and the master record is in error. Sometimes a local heading contains a qualifier, but the NAR is unqualified. No standardization of message format is required, and contributors are free to write messages in any style they choose. For many messages, I use a template stating the data found locally plus that found in a WorldCat master record, with the associated NAR included in full. The posts that are the most useful for PERSNAME-L are those that can be acted upon. Such posts typically report situations that have already been resolved. Among the issues entailed in posting a message to a list such as PERSNAME-L are the following: • The likelihood that the message will interest most subscribers • The amount of detail to be included • The probability that the information presented is accurate. Sometimes a group of PERSNAME-L contributors writes several messages in succession with information on various individuals, in a chain reaction resulting in multiple disambiguations. One contributor, Stephen Arnold (University of Oxford), takes the additional step of researching the likelihood that a heading he is documenting will be found in other catalogs, checking it first against a set of other catalogs and thereby increasing the likelihood that someone looking for that heading locally will find it.26 Thus messages from Arnold are particularly likely to result in updates to local records. On September 7, 2011 I asked PERSNAME-L readers for their opinion about including OCLC holdings in posts, so that readers can see whether their own institution will be one affected by the information in the message. Formerly, displaying holdings for WorldCat records incurred a transaction cost, but since that no longer applies, the principal expense entailed is that of the time taken to do so.
For titles with many hundreds of holdings, including them in an e-mail in this way would not necessarily help, but for titles with few holdings it might save readers from following up unnecessarily on headings their catalogs are unlikely to contain. Opinions varied as to how useful this information would be. It is technically feasible, but not in great demand. Moreover, other bibliographic records can represent other editions of the same work with different holdings, and it proved impractical to include all holdings. One reader commented that it is worthwhile for him to check each heading brought to his attention. The catalog might not have that particular heading, but in many cases that of a namesake is affected. In September 2010 I asked subscribers to PERSNAME-L for feedback in connection with a presentation on related topics given at Library Research Seminar V (LRS-V), held in October 2010 at the University of Maryland, College Park. The purpose of this request was to illustrate how PERSNAME-L is used in people's work. Here are some of the testimonials provided at that time: "As I receive messages I … determine what changes need to be made … an invaluable cataloging tool." Elizabeth Heffington (Lipscomb University).27 "It's far more manageable than the LC Authority records feed, and altogether a very useful item." Teague Allen (California Institute of Technology).28 "When I receive a message … I check our database and correct bibs (merging, change dates, initials, etc) then I check the LC auth list and import the corrected auth record. I have been able to clean up and correct many headings using the PERSNAME-L." Barbara Stampfl (Polk County Library Cooperative, Florida).29 "I have, on occasion, been able to provide information … that one of my list-reading colleagues can use to create an authority heading that the rest of us can then use." Dennis Reynolds (Madison Public Library, Wisconsin).30 "I go through each message and check our database heading by heading. It is time consuming but quite worth while. When it comes to undifferentiated headings and headings used in old records I was able to update those headings in our database and make sure the vendor includes us the corresponding authority records in our next batch… The other great thing … experienced NACO catalogers. I learned a lot by reading how they investigated to break conflicts. …" Tzu-Jing Kao (Multnomah County Library, Oregon).31 "… these posts often led to a discovery of multiple people referenced on a single local name heading. … I hate to do major work on records with only my library being the beneficiary. It can be time consuming to determine whether or not NARs are duplicates or not, and to spend this time is hard to justify if only my library has the information." Jason LeMay (Gwinnett County Public Library, Georgia).32 "I use it to disseminate some info that we've garnered for our own purposes but which may be useful for others and may not otherwise be readily available to them. … any credit Oxford builds up … is amply repaid. … Roger Miller was invaluably helpful over a tricky series … which I'm sure was in part because Cincinnati has benefitted from Oxford posts." Stephen Arnold (Oxford University).33 I asked a similar question on AUTOCAT and, in addition to comments like those above, also requested and received some responses from individuals telling why they do not participate in PERSNAME-L.
Most telling was the following, from a person who requested not to be identified: "I do feel that I'm doing some really good work that would come in handy to others, but, alas, I have been told explicitly that our mission is not to take extra time for the larger cataloging community. I just wish I could spend more time going back and fixing our own messes." The number of positive contributions received outweighed the negatives. Nevertheless, the negative contribution is well grounded. It is not only an economic issue but a philosophical one. That is to say, a cataloging agency might not have the financial resources to be able to participate in PERSNAME-L. But that is a separate issue from whether one should or should not take the extra steps required to participate in the projects described. Such agencies do receive some of the benefits the projects confer, because they will encounter and profit from records on which other agencies have worked to provide corrections and so forth. The following comment from Anthony Franks (formerly Cooperative Program Section Head at LC) is a resounding endorsement of our collaborative activity: "Do keep up, however, this list strikes me as a far more efficient mechanism than waiting for LC or OCLC update reports."34 As of this writing PERSNAME-L has about 350 subscriptions. Posts to the list occur at a rate of two or three per working day (the archives for October 2011 have 51 entries). It has significantly altered the bibliographic landscape, enabling wide-scale disambiguation both in local catalogs and in national databases. In the event that name headings in bibliographic records are eventually linked, not to authority records (as with OCLC Connexion's "control headings" functionality) but to other resources, PERSNAME-L will have greatly assisted that development. Envisioned are links from personal name headings in bibliographic records to: biographical and bibliographical tools; dictionary and encyclopedia articles; personal web pages and social media sites. Linking to such resources requires accurate data in bibliographic records. In particular, the currently still sanctioned practice of allowing a NAR to represent first one person, then subsequently (through undifferentiation and redifferentiation) allowing that same NAR to represent a different person, must be deprecated.

SERIES-L

SERIES-L, dedicated to issues with bibliographic series, started operation in 2009, with Wayne Sanders (University of Missouri-Columbia), Kathleen Schweitzberger (University of Missouri-Kansas City) and myself as co-listowners. The host site is the University of Missouri-Columbia Listserv® installation. SERIES-L was intended primarily as a notification list, but actually the phrase "action list" was used in the initial publicity and list documentation, in part as a gentle hint to users that posts upon which a user could not act were to be avoided in favor of those giving directions for immediate database maintenance. But in the event, this phrase might have discouraged people from posting. It was my expectation that SERIES-L would be comparable in performance to PERSNAME-L, with regular posts from a band of dedicated contributors. But in comparison, postings were few and far between. Nevertheless, the list unexpectedly found a special niche. Roger Miller and Bryan Baldus effectively made SERIES-L viable by contributing lists of SARs, primarily for adult fiction and children's literature, and often having a heading entered under a personal name.
In fact Miller mentioned that he had been in discussion with Margaret Maurer with a view to setting up a list expressly dedicated to such SARs. The appearance of SERIES-L fulfilled that purpose, and the contributions of Miller and Baldus have been the mainstay of the list's existence. Joan Condell (Dallas Public Library) wrote, "I work at a public library, and series, especially kids/YA series, are very important to the public services librarians. … I found a lot of series that needed controlling."35 It had also been my expectation that academic librarians would take advantage of SERIES-L to collaborate on getting their indexes of series holdings in order. One instance which was awaiting my attention at the time of list set-up was Biblioteca de autores españoles. The concern with this publication was its complicated volume numbering, with several subseries interspersed amidst the general enumeration, requiring special description in analytic bibliographic records and their associated holdings. The details of this project were more complex than could be accommodated in a message to SERIES-L, so instead I offered to send the documentation (kept locally in a Microsoft Excel file) to anyone wishing it. In the event, one or two people wrote, a disappointing but understandable rate of response. Perhaps other libraries already had these volumes in order and required no further attention. But a more probable explanation is that this series was not a priority for catalog editing and maintenance at the time that the list posting appeared. One consolation is the fact that the post remains in the list archives and can be retrieved, eventually, by anyone knowing that this source is available. One might think that, in the wake of LC's decision not to trace series, SERIES-L would serve as a means whereby those librarians who continued to trace series in accordance with existing and new SARs would take up the opportunity to communicate with each other. But despite the outcry of protest against that decision, SERIES-L has not served in this way. Since at the inception of SERIES-L it was made plain that discussion of the LC decision would be inappropriate, perhaps people shied away from the list for that reason: although it was discussion of the decision itself, not dealing with its consequences, that was out of scope. Why has more interest not been shown in SERIES-L, compared with PERSNAME-L? Any number of reasons might account for the low response. In order for an electronic distribution list to continue in service, a regular flow of posts is required, lest people simply forget about its existence. A recent post of mine to OCLC-Cat listed all the lists described in this article, and although announcements have been made for all these lists, plus occasional reminders, one correspondent replied to OCLC-Cat that he was unaware of them. With DEWEYERROR, I used the previously reported messages not only to bring those messages to more people's attention, but also to generate a regular flow of postings in order to keep the list in people's active memory. For this reason, although discussion was not encouraged on SERIES-L, I have learned not to attempt to quash it when it arises. It is better for a list to have a regular flow of posts, some of which might not qualify as "actionable," than for the list to wither through inaction. Thus SERIES-L has somewhat more posts that qualify as discussion than was originally intended.
SACOLIST

This list is one with which I have had no direct involvement beyond subscribing and reading the messages posted. It is dedicated to issues about LC Subject Headings (LCSH) and shares many characteristics with the lists already described. It differs from the other lists and projects described in this article in that LC hosts and administers it directly. SACOLIST has its own web page, which states: "The listserv may also be used as a vehicle to foster discussions on the construction, use, and application of subject headings. Questions posted may be answered by any list member and not necessarily by staff from the Coop Team or CPSO."36 Applicable here are the comments from Sloan's article, reviewed above, concerning the "personality" of the list. It has functioned mostly for announcements, notably of LC's Subject Editorial Review Meeting, and provides valuable feedback concerning the meeting's rationale in cases where a proposed subject heading was not approved. Recently, list subscribers have responded critically to some of the announced decisions, giving the list more the character of discussion than of notification. The web page designates it as The SACO Listserv, and it is associated primarily with the Subject Authority Cooperative Program. Subscribers might thereby conclude that posts should be related to that program. But the web page clearly indicates that a broader scope is intended. Notification of errors in the manner and practice of the lists discussed above has not occurred--yet. It remains for people to make use of the opportunity afforded.

ISSUES AND CONCERNS

All of the projects described in this article are collaborative in nature. Yet with the exception of SACOLIST, none of them has required the approval of a professional body. Otherwise, the only involvement of any such body was the sponsorship mentioned in connection with the establishment of DEWEYERROR, a sponsorship that was discontinued after two years, leaving the list to continue to operate without any noticeable change. My role has been that of an initiator, but without the cooperation of all those mentioned--and many others, to whom I apologize for not having named them individually, they are simply too numerous--these projects would never have come to fruition. People continue to post to general discussion lists, such as AUTOCAT and OCLC-Cat, on matters that are within the scope of the notification lists. They are entitled to do so, for the messages are also in scope for the general lists, and it is entirely a matter for the individual to decide in which forum to post a notice. Perhaps those who post elsewhere do so because of the greater number of subscriptions that the general lists have (each has several thousand, compared with a few hundred at most for the notification lists). Or perhaps people are simply not mindful of the existence of the pertinent notification list. When feasible, if I see someone posting elsewhere, I write to advise them of the notification list's existence. But this is a time-consuming activity, and does not always get the desired results. Another option is to repost the information. This can be done by asking the writer of the message for permission to forward it. When doing so, you can also ask whether the person would prefer to send it on themselves. Such messages do not always get responses, however. Reposting messages without permission is to be avoided, since it is a generally deprecated practice.
Therefore when cooperation is not forthcoming, I write a fresh message to the list on which it belongs, mentioning that it was occasioned by the previous message, and giving the name of its author and list.

POSSIBLE FUTURE PROJECTS

A vision for a future notification list dedicated to a particular aspect of cooperative quality control arises once people have identified the concern which it will address. Some readers might have guessed that, if a list dedicated to personal names has prospered, a companion one for corporate names can be established. "CORPNAME-L" is indeed a possibility. Here is why such a list does not yet exist. First, no one has yet started one, a simple enough reason. Anyone can set up a list if they so choose, but a host site is a prerequisite, as are co-listowners. Although technically an individual acting alone can get a Listserv® list set up and serve as the sole listowner, doing so is contrary to the collaborative spirit that has been expressed in the existing projects, and runs the risk of a domineering personality unduly affecting the nature of the list, as well as trouble if the person stops performing the tasks necessary to ensure its ongoing viability. When setting up lists I have sought a different host site for each, with two or three people as co-listowners. Hopefully doing so avoids the consequences that would follow if one Listserv® site hosting all the lists should become dysfunctional, thereby rendering them all inoperable simultaneously. Having multiple listowners also shares the responsibility among several colleagues, and allows for a succession of leadership in administering the list. A more substantial reason why CORPNAME-L does not yet exist is that corporate names present different challenges and opportunities than personal names. Corporate bodies change their names for different reasons and at a different rate, with earlier and later headings to be accounted for in NARs. Some of them have hierarchical structures, which must be accommodated according to the rules for subordinate bodies. Even the type of publication with which they are associated is different: corporate headings are more likely than personal names to be associated with serials, and in many libraries serials cataloging has traditionally been the domain of a specialist. Hopefully, CORPNAME-L will eventually be realized: any reader with a mind to participate in it is invited to take the initiative. It is also hoped that the list will be successful. A listowner sometimes needs to take steps to ensure that it does not become inactive. Nothing is worse for a list than to set it up, announce it, have a group of people subscribe, and then for no posts to appear. Effectively, such a list is a "silence" list, not qualifying under any of the categories described above. It is understandable that a once thriving list might become inactive: all it takes is for those making posts to stop doing so, without people stepping in to take their place. But to set up a list that from the start is inactive is to be avoided. Furthermore, the Listserv® manager at a host site can notice that a list is inactive and take steps to discontinue it, along with the archival record. To prevent possible dysfunctionality of this nature, a prospective listowner of a new list is advised to have a store of situations in reserve so as to provide substance for a number of messages to get the list started.
Eventually, a "critical mass" of list contributors can emerge, assuring the list's viability. Finally: it is not known what the "tolerance level" for these projects is. Several people participate in more than one, or indeed all of them, and one does not wish to set up a situation in which someone finds the number of projects to be especially burdensome. For this reason, it is recommended that monitoring each project be a responsibility shared among several co-workers.

CONCLUSION

The projects discussed in this article constitute a radically different approach to cooperative quality control than existed prior to the electronic era. Communication via electronic mail allows notification to large groups of people. On some discussion lists for cataloging, notably AUTOCAT and OCLC-Cat, people ask for help in doing their jobs, and receive it; in the process, most other subscribers read the messages, and the aggregate amount of time expended is considerable. The dedication of PERSNAME-L and the other lists to notification about specific situations entails a much closer correspondence between the time spent reading a message and the benefit received. There are many ways in which staff, technical services managers, administrators, and other organizations can work collaboratively to improve the records that are contributed to shared databases. It would be advantageous to all stakeholders if employees were encouraged by managers and administrators to contribute to these collaborative efforts, which have the potential for far-reaching positive effects. The role of cooperative organizations such as OCLC, and of vendors that supply bibliographic records to libraries, should also not be overlooked. Working together, all of these stakeholders can improve the effectiveness of bibliographic control and increase library efficiencies.

APPENDIX A

Contact and subscription information for the discussed Listserv® e-mail lists. In some cases, registration with the host Listserv® site might be required. Each list accepts the standard Listserv® subscription command, illustrated below.
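For readers unfamiliar with Listserv® commands, the following is a minimal example (the subscriber name is, of course, hypothetical). Send a plain-text message to the subscription address given for the list, leaving the subject line blank and placing a single command in the body:

    SUBSCRIBE DEWEYERROR Jane Doe

The same pattern, with the appropriate list name and subscription address, works for LCCERROR, PERSNAME-L, and SERIES-L; to leave a list, substitute the command SIGNOFF followed by the list name.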
DEWEYERROR
Subscription address: listserv@listserv.kent.edu
Address for posting messages: deweyerror@listserv.kent.edu
Address for contacting listowners: deweyerror-request@listserv.kent.edu
Web page for subscription: https://listserv.kent.edu/cgi-bin/wa.exe?SUBED1=DEWEYERROR
Web page for archives search: https://listserv.kent.edu/cgi-bin/wa.exe?A0=DEWEYERROR

LCCERROR
Subscription address: listserv@listserv.kent.edu
Address for posting messages: lccerror@listserv.kent.edu
Address for contacting listowners: lccerror-request@listserv.kent.edu
Web page for subscription: https://listserv.kent.edu/cgi-bin/wa.exe?SUBED1=LCCERROR
Web page for archives search: https://listserv.kent.edu/cgi-bin/wa.exe?A0=LCCERROR

PERSNAME-L
Subscription address: listserv@lists.ou.edu
Address for posting messages: persname-l@lists.ou.edu
Address for contacting listowners: persname-l-request@lists.ou.edu
Web page for subscription: http://lists.ou.edu/cgi-bin/wa?SUBED1=PERSNAME-L
Web page for archives search: https://lists.ou.edu/cgi-bin/wa?A0=persname-l

SERIES-L
Subscription address: listserv@po.missouri.edu
Address for posting messages: series-l@po.missouri.edu
Address for contacting listowners: series-l-request@po.missouri.edu
Web page for subscription: https://po.missouri.edu/cgi-bin/wa?SUBED1=SERIES-L
Web page for archives search: https://po.missouri.edu/cgi-bin/wa?S1=SERIES-L

The SACO Listserv (sacolist@loc.gov)
To subscribe, etc., follow the instructions on the SACO Listserv web page, http://www.loc.gov/aba/pcc/saco/sacolist.html.

APPENDIX B: ABBREVIATIONS

AACR2 Anglo-American Cataloging Rules, 2nd edition
ARN Authority Record Number (OCLC)
DENL Dewey Error Notification List
DDC Dewey Decimal Classification
DCD Decimal Classification Division (Library of Congress)
KSU Kent State University
LC Library of Congress
LCC Library of Congress Classification
LCSH Library of Congress Subject Headings
MARC Machine Readable Cataloging
MPL Marion Public Library (Marion, Ohio)
NACO Name Authorities Cooperative Project
NAR Name Authority Record
OCLC Online Computer Library Center
OLC Ohio Library Council
PCC Program for Cooperative Cataloging
RDA Resource Description and Access
SACO Subject Authority Cooperative Program
SAR Series Authority Record
TSD Technical Services Division (Ohio Library Council)

NOTES

1 Terry Ballard, "Systematic Identification of Typographical Errors in Library Catalogs," Cataloging & Classification Quarterly 46, no. 1 (2008): 27-33.
2 "AL ASIDE--IDEAS," American Libraries 22, no. 3 (March 1991): 197.
3 "Typographical Errors in Library Databases" is available online at http://librarytypos.blogspot.com/.
4 Jeffrey Beall and Karen Kafadar, "The Effectiveness of Copy Cataloging at Eliminating Typographical Errors in Shared Bibliographic Records," Library Resources & Technical Services 48, no. 2 (2004): 92-101.
5 Ibid., 97.
6 Bernie Sloan, "Electronic Discussion Lists," Journal of Library Administration 44, no. 3-4 (2006): 203-225.
7 Ibid., 204.
8 Ibid., 210.
9 The remarks in this paragraph, and elsewhere in this article, might not apply exactly to SACOLIST, "The SACO Listserv", discussed below, for which the author had no responsibility, and which merits consideration because it falls within the scope of concern.
10 An index to past postings, with sign-up instructions for contributing, is available at http://libtypos.pbworks.com/w/page/17113321/FrontPage.
11 Thomas Mann, Reference Librarian at LC, has described such activity in his essay "What is distinctive about the Library of Congress in both its collections and its means of access to them, and the reasons LC needs to maintain classified shelving of books onsite, and a way to deal effectively with the problem of 'books on the floor'," paper prepared for AFSCME 2910, The Library of Congress Professional Guild, November 6, 2009. http://www.guild2910.org/Future%20of%20Cataloging/LCdistinctive.pdf.
12 LC's catalog is available online at http://catalog.loc.gov.
13 Bonnie Doepker and Ian Fairclough, "DEWEYERROR Discussion List has a New Home -- with OLC," TechKNOW 11, no. 1 (March 2005): 1-2. This announcement and those of other lists described in this article are available at http://www.library.kent.edu/page/11234.
14 Barbara Thiesen, e-mail message to DEWEYERROR, January 6, 2012. https://listserv.kent.edu/cgi-bin/wa.exe?A0=DEWEYERROR
15 Barbara Tillett, e-mail message to author, November 9, 2010.
16 Jay Shorten, e-mail message to LCCERROR, November 1, 2011. https://listserv.kent.edu/cgi-bin/wa.exe?A0=LCCERROR
17 Bryan Baldus, e-mail message to author, April 1, 2011.
18 Miller retired in April 2012 and no longer serves as a listowner.
19 Rich Aldred, e-mail message to PERSNAME-L, August 4, 2011. https://lists.ou.edu/cgi-bin/wa?A0=persname-l
20 Deborah Tomaras, e-mail message to PERSNAME-L, August 4, 2011. Tomaras is now a cataloger at South Portland (Maine) Public Library.
21 Adger Williams, e-mail message to OCLC-Cat, October 5, 2011. http://listserv.oclc.org/archives/oclc-cat.html
22 Jack Hall, e-mail message to PERSNAME-L, October 5, 2011.
23 In several workshops and conference presentations with Brenda Block (Manager, OCLC Quality Control Section) I have reviewed numerous reasons why people do not correct errors -- as well as reasons why they should.
24 Gary Strawn, e-mail message to PERSNAME-L, November 6, 2008.
25 Gary Strawn, e-mail message to author, December 18, 2011. This file and others are available online at http://files.library.northwestern.edu/public/AuthLoadReport/Names/.
26 Stephen Arnold, e-mail message to author, May 9, 2012.
27 Elizabeth Heffington, e-mail message to author, September 28, 2010.
28 Teague Allen, e-mail message to author, September 29, 2010.
29 Barbara Stampfl, e-mail message to author, September 28, 2010.
30 Dennis Reynolds, e-mail message to author, September 28, 2010.
31 Tzu-Jing Kao, e-mail message to author, October 5, 2010.
32 Jason LeMay, e-mail message to author, September 28, 2010.
33 Stephen Arnold, e-mail message to author, September 28, 2010.
34 Anthony Franks, e-mail message to PERSNAME-L, May 20, 2010.
35 Joan P. Condell, e-mail message to author, April 2, 2009. Quoted in Ian Fairclough, "SERIES-L: A New Tool for Cooperative Quality Control," TechKNOW 15, no. 1 (June 2009): 11. http://www.library.kent.edu/files/TechKNOW_June_2009.pdf.
36 "About the SACO Listserv (sacolist@loc.gov)," http://www.loc.gov/aba/pcc/saco/sacolist.html

work_dimer3kuyndgbbmpzw7ktqf4sy ----

FOLIA LINGUISTICA
ACTA SOCIETATIS LINGUISTICAE EUROPAEAE
2021 · VOLUME 55 · ISSUE 1

EDITOR-IN-CHIEF
Olga Fischer, University of Amsterdam, The Netherlands. E-mail: O.C.M.Fischer@uva.nl

EDITORIAL ASSISTANT
Sune Gregersen, University of Amsterdam, The Netherlands. E-mail: s.gregersen@uva.nl

REVIEW EDITOR
Javier Pérez-Guerra, University of Vigo, Spain

PAST EDITORS
Vol. 1 (1967): Ronald A. Crossland
Vols. 2-14 (1968-1980): Peter Hartmann
Vols. 15-39 (1981-2005): Wolfgang U. Dressler
Vols. 40-47(1) (2006-2013): Teresa Fanego
Vols. 47(1)-52(1) (2013-2018): Hubert Cuyckens

EDITORIAL BOARD
Enoch Aboh, University of Amsterdam, The Netherlands
John Ole Askedal, University of Oslo, Norway
Kasper Boye, University of Copenhagen, Denmark
Sonia Cristofaro, University of Pavia, Italy
María Josep Cuenca, University of Valencia, Spain
Hubert Cuyckens, University of Leuven, Belgium
Katarzyna Dziubalska-Kołaczyk, University of Poznań, Poland
Teresa Fanego, University of Santiago de Compostela
Jeffrey Heath, University of Michigan, USA
Gunther Kaltenböck, University of Vienna, Austria
Belén Méndez-Naya, University of Santiago de Compostela, Spain
Amina Mettouchi, Ecole Pratique des Hautes Etudes, France
Heiko Narrog, Tohoku University, Japan
Martine Robbeets, Max Planck Institute for the Science of Human History, Jena
Jean-Christophe Verstraete, University of Leuven, Belgium
Letizia Vezzosi, University of Perugia, Italy

SOCIETAS LINGUISTICA EUROPAEA
Founded in 1966 for the fostering, in European countries and elsewhere, of the scientific and scholarly study of language in all its aspects. http://www.societaslinguistica.eu/
At the 53rd Annual Meeting, the following officers were elected. The date in brackets shows the year in which the current office term ends.
President: Johannes Kabatek, Universität Zürich (2021)
Vice-President: Teresa Fanego, Universidade de Santiago de Compostela (2021)
President-Elect: Magdalena Wrembel, Adam Mickiewicz University, Poznań (2021)
Treasurer: Lachlan Mackenzie, VU Amsterdam (2023)
Secretary: Bert Cornillie, KU Leuven (2021)
Editor FL: Olga Fischer, Universiteit van Amsterdam (2023)
Editor FLH: Maria Napoli, Università degli studi del Piemonte Orientale (2023)
Conference Manager: Olga Spevak, Université de Toulouse 2 (2022)
Executive Committee: the above, plus Annemarie Verkerk, Universität des Saarlandes (2021); Francesco Gardani, Universität Zürich (2021); Bridget Drinka, University of Texas, San Antonio (2022); Eva Schultze-Berndt, University of Manchester (2022); Iker Salaberri, University of the Basque Country (2023); Rik van Gijn, Universiteit Leiden (2023)
Scientific Committee: Nikolaos Lavidas, National and Kapodistrian University of Athens (2021); Eitan Grossman, Hebrew University of Jerusalem (2022); Francesca Di Garbo, Stockholm University (2023); Frans Hinskens, Meertens Instituut Amsterdam/Radboud Universiteit Nijmegen (2024); Karolina Grzech, Stockholm University (2025)
Nominating Committee: Chair: Krzysztof Stroński, Adam Mickiewicz University, Poznań (2021); Anna Cardinaletti, Università Ca' Foscari Venezia (2022); Åshild Næss, University of Oslo (2023); Francisco Gonzálvez-García, Universidad de Almería (2024); Yvonne Treis, CNRS LLACAN Villejuif, Paris (2025)
To join SLE and receive access to FOLIA LINGUISTICA please consult the SLE website. Dues for personal membership of the society are €30.00 per year, €15 for students, and €10 for members from economically challenged countries. For institutional subscriptions please contact De Gruyter at orders@degruyter.com.
ABSTRACTED/INDEXED IN Baidu Scholar · BDSL Bibliographie der deutschen Sprach- und Literaturwissenschaft · BLL Bibliographie Linguistischer Literatur · CNKI Scholar (China National Knowledge Infrastructure) · CNPIEC: cnpLINKer · Dimensions · EBSCO (relevant databases) · EBSCO Discovery Service · ERIH PLUS (European Reference Index for the Humanities and Social Sciences) · Gale/Cengage · Genamics JournalSeek · Germanistik · Google Scholar · IBR (International Bibliography of Reviews of Scholarly Literature in the Humanities and Social Sciences) · IBZ (International Bibliography of Periodical Literature in the Humanities and Social Sciences) · J-Gate · Journal Citation Reports/Social Sciences Edition · JournalGuide · JournalTOCs · KESLI-NDSL (Korean National Discovery for Science Leaders) · Linguistic Bibliography · Linguistics Abstracts Online · Microsoft Academic · MLA International Bibliography · MyScienceWork · Naver Academic · Naviga (Softweco) · Norwegian Register for Scientific Journals, Series and Publishers · OLC Linguistik · Primo Central (ExLibris) · ProQuest (relevant databases) · PsycINFO · PSYNDEX · Publons · QOAM (Quality Open Access Market) · ReadCube · SCImago (SJR) · SCOPUS · Semantic Scholar · Sherpa/RoMEO · Summon (ProQuest) · TDNet · Ulrich's Periodicals Directory/ulrichsweb · WanFang Data · Web of Science: Arts & Humanities Citation Index; Current Contents/Arts & Humanities; Current Contents/Social and Behavioral Sciences; Social Sciences Citation Index · WorldCat (OCLC)

The publisher, together with the authors and editors, has taken great pains to ensure that all information presented in this work (programs, applications, amounts, dosages, etc.) reflects the standard of knowledge at the time of publication. Despite careful manuscript preparation and proof correction, errors can nevertheless occur. Authors, editors and publisher disclaim all responsibility for any errors or omissions or liability for the results obtained from use of the information, or parts thereof, contained in this work.

ISSN 0165-4004 ∙ e-ISSN 1614-7308

All information regarding notes for contributors, subscriptions, open access, back volumes and orders is available online at http://www.degruyter.com/view/j/flin

RESPONSIBLE EDITOR Olga Fischer, University of Amsterdam, Department of English, Spuistraat 134, 1012VB Amsterdam, the Netherlands. E-mail: O.C.M.Fischer@uva.nl
JOURNAL MANAGER Katharina Kaupen, De Gruyter, Genthiner Straße 13, 10785 Berlin, Germany, Tel: +49 (0)30 260 05-423, Fax: +49 (0)30 260 05-250. E-mail: katharina.kaupen@degruyter.com
RESPONSIBLE FOR ADVERTISEMENTS Katharina Kaupen, De Gruyter, Genthiner Straße 13, 10785 Berlin, Germany, Tel.: +49 (0)30 260 05-170. E-mail: anzeigen@degruyter.com
TYPESETTING TNQ Technologies, Chennai, India
PRINTING Franz X. Stückle Druck und Verlag e.K., Ettenheim
© 2021 Walter de Gruyter GmbH, Berlin / Boston
Contents

Articles
Timur Maisak, Endoclitics in Andi, 1
Lilián Guerrero, When-clauses and temporal meanings across languages, 35
Yunfan Lai, The complexity and history of verb-stem ablauting patterns in Siyuewu Khroskyabs, 75
Irina Burukina, Profile of reflexives in Hill Mari, 127
Hussein Al-Bataineh, Alternations of classificatory verb stems in Tłı̨chǫ Yatıì: a cognitive semantic account, 163
Carlota de Benito Moreno, Is there really an aspectual se in Spanish?, 195
Timothy Osborne, Adjectives as roots of nominal groups: the big mess construction in dependency grammar, 231

Book Reviews
Chen Ou, review of Peppina Po-lun Lee: Focus manifestation in Mandarin Chinese and Cantonese: A comparative perspective, 265
Evie Coussé, review of Johanita Kirsten: Written Afrikaans since standardization: A century of change, 271
Grażyna Kiliańska-Przybyło and Monika Grotek, review of Kathy Conklin, Ana Pellicer-Sánchez and Gareth Carrol: Eye-tracking: A guide for applied linguistics research, 275
Hanna Jaeger, review of Nick Palfreyman: Variation in Indonesian Sign Language: A typological and sociolinguistic analysis, 281
Jane Hodson, review of Urszula Clark: Staging language: Place and identity in the enactment, performance and representation of regional dialects, 285
Minchen Wang, review of Scott F. Kiesling: Language, gender and sexuality: An introduction, 289
Stavros Assimakopoulos, review of Jacques Moeschler: Non-lexical pragmatics: Time, causality and logical words, 295
Marwan Jarrah, review of Ahmad Alqassas: A multi-locus analysis of Arabic negation: Micro-variation in Southern Levantine, Gulf and Standard Arabic, 301
Yong Zhou and Hui Xu, review of Ljiljana Progovac: A critical introduction to language evolution: Current controversies and future prospects, 307

work_dirfzc7jezhfnodhb3oxyh3tae ----

DOI: 10.6245/JLIS.2017.432/739

The 4th CALA Outstanding Library Leadership Award in Memory of Dr.
Margaret Chang Fung

The CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung was established in memory of Professor Margaret Chang Fung. Each year the award's selection committee chooses a leader who has made distinguished contributions to the library and information field, in order to honor and carry forward the profession's spirit of social contribution and service. The fourth recipient is Dr. Hwa-Wei Lee, who received the award in 2016. Dr. Lee's leadership style is marked by love of others as oneself, humility, pragmatic energy, and innovative problem solving, and he is the first retired university library dean in the world to have a library building of the university he served named in his honor. Dr. Lee's career and acceptance remarks are well worth reading in detail. This issue presents an overview of the award, a profile of the recipient, and Dr. Lee's acceptance remarks:
• Overview of the CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung
• Profile of the recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung
• Acceptance remarks of Dr. Hwa-Wei Lee, recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung: Following the Steps of a Giant Library Leader: Dr. Margaret Chang Fung

Overview of the CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung

In Memory of Professor Margaret Chang Fung

Professor Margaret Chang Fung (1934-2010) was a life member of the Chinese American Librarians Association (CALA), a distinguished library director, educator, and scholar, and one of the outstanding library leaders of her era. She taught at National Taiwan University, National Taiwan Normal University, National Chengchi University, Tamkang University, and Fu Jen Catholic University, nurturing generations of students, and served as director of the National Taiwan Normal University Library, where she oversaw the construction of a new library building and contributed greatly. She also served as a member of the Examination Yuan; during her twelve years in that office she took a deep interest in the examination and employment of library personnel, devoting her energies to directing and planning the civil service examination system and its various examinations, and recruiting talent for the nation. Professor Fung cared deeply about international affairs and spared no effort in promoting international cultural exchange and the development of librarianship; she was one of the founding contributors of the Taipei chapter of the Association for Information Science and Technology (ASIS&T). During her term as president of the Library Association of China in 1998, she worked on many fronts for the passage of the Library Law, bringing legislation that had been delayed for decades to completion and realizing a long-held wish of the library community; her contribution was remarkable. She also raised millions in funds for the Association, establishing a financial foundation that enabled it to carry out its functions and provide more services. Professor Fung was not only an important library director but also an educator and an advocate, and accordingly became one of the few recipients of the CALA Distinguished Service Award.

The Outstanding Library Leadership Award in Memory of Professor Margaret Chang Fung

To commemorate Professor Fung's distinguished contributions, her family and the Chinese American Librarians Association jointly established the Dr. Margaret Chang Fung Memorial Fund and the Outstanding Library Leadership Award in 2012, in keeping with the shared mission and goals of Professor Fung and the Association. Professor Fung contributed greatly to the library field and was devoted to the public good, giving of her own means for the common benefit; the achievements, contributions, leadership, and spirit of dedication of her career command admiration, and she may truly be called a model for library professionals.

Guidelines for Nomination

The CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung is directed at Chinese library and information professionals, honoring each year a Chinese American library leader with outstanding leadership and achievements in the field of library and information services. Distinguished contributions and service to the advancement of Chinese American librarianship receive special attention from the award committee. The award is permanent and one of the highest honors conferred by the Association; a recipient may receive it only once in a lifetime. Candidates are evaluated equally regardless of age, gender, religious belief, sexual orientation, or length of professional service. Candidates must be members of the Chinese American Librarians Association. The recipient receives a check for US$1,000 from the Association and is honored publicly at the CALA awards banquet held during that year's American Library Association (ALA) Annual Conference. The recipient may also publish an original paper, in English or Chinese, or reflections on receiving the award, in the Journal of Library and Information Science, the semiannual journal published by National Taiwan Normal University. Nomination documents must be submitted in English and include the following:
• A formal nomination letter
• 3-5 supporting letters
• The nominee's curriculum vitae
• The nominee's publications (no more than 5)
• The nominee's relevant awards and honors
• Other supporting documents

CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung: An Overview

About Dr. Margaret Chang Fung

Dr. Margaret Chang Fung (1934-2010) was a Chinese American Librarians Association (CALA) life member, a distinguished library director, educator, scholar, and one of the most prominent library leaders around the world. She made enormous achievements and contributions to librarianship, including pioneering work in Chinese library automation, library education, international library cooperation and exchange, library and information acts, policies and standards, management, organization, and many other fields. She was a great library director, educator, and advocate in Taiwan and one of the CALA Distinguished Service Award recipients. Dr. Fung's substantial, lasting achievements, contributions, leadership, and dedication to the library profession are admirable and impressive. She is an exemplary role model for librarians in general, and Chinese American librarians in particular.

The Award

In order to honor Dr. Fung's distinguished leadership and dedicated career, the Fung family and the Chinese American Librarians Association have agreed to create the Dr. Margaret Chang Fung Memorial Fund and establish the Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung in CALA, consistent with the mission and goals of the Association.

Guidelines for Nomination

The Outstanding Library Leadership Award in Memory of Dr.
Margaret Chang Fung is given annually to a Chinese American library and information professional who has consistently demonstrated outstanding leadership and achievement in library and information services at the national and/or international level. Distinguished contributions and services to the advancement of Chinese-American librarianship will receive special consideration from the Awards Committee. This Award is permanent and one of the highest recognitions given by the Association. No individual will receive the award more than once. Candidates will all be evaluated equally, regardless of age, gender, religion, sexual orientation, or length of service to the profession. Candidates must be CALA members. The award consists of a $1,000 check. The 2016 Award recipient will be honored at the 2016 CALA Awards Banquet in June 2016 during the ALA Annual Conference. The recipient of the "Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung" is encouraged to submit an original paper in English or Chinese for publication in the joint NTNU/CALA Journal of Library and Information Science. Nomination packages must be submitted in English and must include:
• A formal nomination letter from the nominator
• 3-5 supporting letters
• Candidate's curriculum vitae
• Samples of publications (no more than 5)
• List of relevant awards and honors received
• Other supporting documentation

Profile of Dr. Hwa-Wei Lee, Recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung

The recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung is Dr. Hwa-Wei Lee. His professional career embodies what the memorial award stands for: a library director, educator, innovator, scholar, and government official who has upheld the honor and values of librarianship, and an exemplary leader of the library and information field.

Dr. Lee graduated from National Taiwan Normal University in 1954 with a bachelor's degree in education, earned a master's degree in library science from Carnegie Mellon University and the University of Pittsburgh in 1961, received a doctorate in education and library science from the University of Pittsburgh in 1965, and was awarded an honorary doctorate by Ohio University in 2012.

Dr. Lee is a library administrator of outstanding accomplishment. Over half a century of work in university libraries and the Library of Congress, he collaborated with faculty, administrators, legislators, and communities to create valuable services, with remarkable achievements, innovative management, and distinguished contributions. The institutions he served include: chief librarian and associate professor, Edinboro University of Pennsylvania (July 1966 to 1968); director of the Library and Information Center, Asian Institute of Technology, Bangkok, Thailand, at the invitation of the U.S. Agency for International Development (August 1968 to June 1975); associate director of libraries and professor of library science, Colorado State University (July 1975 to July 1978); dean of libraries, Ohio University (August 1978 to August 1999); dean of libraries emeritus, Ohio University (September 1999 to the present); Fulbright senior specialist, Department of Library and Information Science, Chiang Mai University, Thailand (September to October 2001); consultant and distinguished visiting scholar, OCLC (2000 to 2003); chief of the Asian Division, Library of Congress (February 2003 to March 2008); evaluator for a U.S. federal IMLS-funded project (October 2008 to September 2012); and other leadership and advisory roles in the United States and around the world. Dr. Lee is among the most highly respected and influential leaders among library colleagues everywhere, an internationally renowned library leader whose long record of achievement and contribution has earned him numerous awards, honors, and recognitions.

Dr. Lee's achievements and contributions stand out especially in leadership, management, service, global library cooperation, training, the application of new technologies, and fundraising. He served as dean of libraries at Ohio University, the first Chinese American to head the main library system of an American research university. Moreover, in 1999 Ohio University named its newly built library annex the Hwa-Wei Lee Library Annex and renovated the first floor of its main library as the Hwa-Wei Lee Center for International Collections, as a tribute upon Dr. Lee's retirement; this is a rare honor. After retiring from Ohio University, Dr. Lee was appointed chief of the Asian Division of the Library of Congress in 2003, the first and only Chinese American formally to hold that position. In addition, he raised major gifts for the Ohio University Libraries, the Library of Congress, and the Chinese American Librarians Association, and in 2015 he received the Melvil Dewey Medal, the highest honor conferred by the American Library Association, the world's largest library professional organization. Dr. Lee's success stories in library leadership, administration, service, management, training, and the application of new technologies exemplify the world's visionary leaders; he is a model among models in this professional field. These are the very values that the Dr. Margaret Chang Fung award advocates.

Amid a busy career, Dr. Lee has maintained an excellent publication record, including more than 60 journal articles and 9 monographs. Two recent masterworks have inspired and enlightened colleagues around the world:

The Collected Works of Hwa-Wei Lee 李華偉文集 (Guangzhou: Sun Yat-sen University Press, 2011). 2 vols., 1565 p. (圖書館學家文庫 Library of Library Scientists). In Chinese and English.

The Sage in the Cathedral of Books: the Distinguished Chinese American Library Professional Dr. Hwa-Wei Lee = 書籍殿堂的智者:傑出圖書館學家李華偉傳. In traditional Chinese, simplified Chinese, and English.
These two books open readers' horizons, allowing them to learn and benefit from Dr. Lee's life, talents, experience, wisdom, knowledge, leadership, management, vision, and way of treating people and handling affairs. Dr. Lee's willingness to share his professional expertise and his valuable life experience reflects his love for librarianship and its professionals.

Dr. Lee's far-sighted leadership and innovative management of libraries, his guidance and mentoring of library professionals at home and abroad, his direction in the use of information technology and related policy, his fundraising and growth of external grants, and his long-standing contributions to international library cooperation have made him a model for the profession. Congratulations, Dr. Hwa-Wei Lee! And thank you for the diligence and wisdom of your working life, which have made you an exemplary figure in this field and a great encouragement to the development of the library profession.

Dr. Hwa-Wei Lee, Recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung

The award recipient for 2016 is Dr. Hwa-Wei Lee, whose professional career has reflected the merits and values honored in Dr. Margaret Chang Fung's distinguished library career: a dedicated library director, educator, innovator, scholar, government official, and, above all, an exemplary leader. Dr. Lee is a model leader for his superlative achievements in leadership, management, services, national and global library cooperation, training, application of new technologies, and fundraising. He is an internationally acclaimed library icon whose lifelong achievements and contributions have earned him countless awards, honors, and recognitions both in the United States and abroad. Dr. Lee is also recognized as one of America's most celebrated library leaders. In appreciation of Dr. Lee's immense contributions to Ohio University, the University in August 1999 named a new library building the Hwa-Wei Lee Library Annex and the first floor of the Main Library the Hwa-Wei Lee Center for International Collections, in his honor when he retired from the University in 1999. This is among the highest tributes one can receive in the library world, and his success stories are exemplary role models for us. Dr. Lee demonstrates the kind of visionary leadership in library management, training, and the application of new technologies to library operations and services that Dr. Margaret Chang Fung advocated. Dr. Lee received his B.Ed. from National Taiwan Normal University in 1954; M.Ed. from the University of Pittsburgh in 1959; M.L.S. from Carnegie-Mellon University and the University of Pittsburgh in 1961; Ph.D. (Education and Library Science) from the University of Pittsburgh in 1965; and Honorary Doctor of Letters from Ohio University in 2012. With a successful career spanning half a century of leadership in prestigious academic libraries and a national library, Dr. Lee has a proven ability to work collegially with faculty, staff, administrators, legislators, and communities to advance the institutions he has served. He is a results-oriented administrator of excellent caliber with a remarkable track record of superlative achievements, innovative management, and distinguished contributions, including: Chief Librarian and Associate Professor, Edinboro University of Pennsylvania, 7/1966-7/1968; Director of Library and Information Center, Asian Institute of Technology, Bangkok, Thailand (invited by the U.S.
Agency for International Development), 8/1968-6/1975; Associate Director of Libraries and Professor of Library Administration, Colorado State University, 7/1975-7/1978; Dean of University Libraries, Ohio University, 8/1978-8/1999; Dean of Libraries Emeritus, Ohio University, 9/1999-present; Fulbright Senior Specialist, Department of Library and Information Science, Chiang Mai University, Chiang Mai, Thailand, 9/2001-10/2001; OCLC Consultant and Distinguished Visiting Scholar, 2000-2003; Chief, Asian Division, Library of Congress, 2/2003-3/2008; Project Evaluator, IMLS-funded China-U.S. Library Collaboration Project, 10/2008-9/2012; and other national and international leadership and consultation positions. Dr. Lee is one of the most influential leaders in the field and is highly respected by colleagues around the world. Throughout his amazing career, Dr. Lee has maintained an outstanding publication record, including over 60 journal articles and nine monographs. Two masterful monographs have been published recently which have inspired and enlightened peers around the world. They are:
• The Collected Works of Hwa-Wei Lee 李華偉文集. (Guangzhou: Sun Yat-sen University Press, 2011). 2 vols. 1565 p. (圖書館學家文庫 Library of Library Scientists). In Chinese and English.
• The Sage in the Cathedral of Books: the Distinguished Chinese American Library Professional Dr. Hwa-Wei Lee = 書籍殿堂的智者︰傑出美籍華裔圖書館學家李華偉. (Guilin: Guangxi Normal University Press Group, 2011). 305 p. In Chinese and English.
These two books provide rich information and resources for readers to learn and benefit from Dr. Lee's life, abilities, experience, intelligence, knowledge, leadership, management, vision, talents, and much more. Dr. Lee's willingness to share his expertise in publications exemplifies his care for library professionals. Dr. Lee's major merits can be identified in at least the following areas: leadership and innovative management; creative training; national and international guidance on information technologies and policies; and distinguished leadership in fundraising and in increasing external grants. Dr. Lee's significant and lasting contributions to the library profession, community, education, research, and publications, and his effective collaboration with national and international library executives on cooperative library initiatives, have become a legacy in the profession. Congratulations, Dr. Hwa-Wei Lee! Thank you for your diligent work and encouraging stories; you are an exemplary role model for the profession.
Acceptance Remarks of Dr. Hwa-Wei Lee, Recipient of the 4th CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung

Following the Steps of a Giant Library Leader: Dr. Margaret Chang Fung

Hwa-Wei Lee

It is a great honor to receive the 2016 CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung. The late Dr. Fung (1934-2010) was a recognized model of outstanding library leadership in Taiwan, the United States, and beyond. The annual memorial award that her family established at CALA in 2012 is a most fitting tribute to her.

I was fortunate to know Dr. Fung for nearly half a century, beginning when she studied and worked in the United States and continuing through her distinguished career in Taiwan, which included library administration, teaching, research, publishing, and government service as a member of the Examination Yuan. During her term as president of the Library Association of China in Taiwan, she played a key role in the drafting and passage of Taiwan's Library Law. She was also a pioneer of Chinese library automation and a tireless advocate for women's rights in Taiwan.

Our professional paths crossed many times through international conferences and professional activities. I had the honor, at her invitation in 1985, of assisting in organizing a special session at the 45th Annual Conference of the American Society for Information Science, held in Columbus, Ohio. I still remember the frequent morning phone calls from her to check the details of the conference arrangements. Her attention to detail made a deep impression on me. The more we were in contact, the more my respect grew for her vision, her leadership, and her insistence on excellence.

Over my 50-year library career, from serving as a student library assistant and librarian trainee during my graduate studies to becoming a professional librarian at the University of Pittsburgh Library; heading technical services at the Duquesne University Library; serving first as assistant head and then as head librarian at Edinboro State University of Pennsylvania; going to Bangkok, Thailand, under the sponsorship of the U.S. Agency for International Development to direct the Library and Information Center of the Asian Institute of Technology; then being appointed associate director of libraries at Colorado State University and dean of libraries at Ohio University; being chosen as a distinguished visiting scholar at OCLC; and serving as chief of the Asian Division at the Library of Congress, I had many outstanding role models at different stages of that career, and Dr. Fung was among the foremost of them.

In my library career I have also received countless honors and recognitions, including the Outstanding Administrator Award from Ohio University; the Ohio Librarian of the Year Award from the Ohio Library Association; the John Ames Humphry Award for contributions to international librarianship from the American Library Association; the naming, upon my first retirement, of a newly built library annex at Ohio University as the Hwa-Wei Lee Library Annex in recognition of my contributions to the university's libraries; my induction into the Ohio Librarians' Hall of Fame by the Ohio Library Council; an honorary doctorate from Ohio University; and, not long ago, the Melvil Dewey Medal, the highest honor conferred by the American Library Association. Yet I regard the CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung as having a very special meaning for me, because Dr. Fung was an exceptional leader whom I knew personally and admired greatly. I am also honored to follow the three distinguished previous recipients of this award: Dr. Tze-Chung Li, former dean of the Graduate School of Library and Information Science at Dominican University and one of the founders of CALA; Professor Cheng-Ku Wang, former director general of the National Central Library in Taiwan; and Dr. Joyce Chao-Chen Chen, dean of academic affairs and professor of library and information studies at National Taiwan Normal University. All of them are truly outstanding library leaders who have made rich and far-reaching contributions to librarianship. They have set a high standard for me and for future recipients, encouraging us to strive for it and to surpass it.

It is with deep gratitude, humility, and pride that I accept this award.

Following the Steps of a Giant Library Leader: Dr. Margaret Chang Fung

Hwa-Wei Lee

It is a great honor to be the recipient of the 2016 CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung. The late Dr. Fung (1934-2010) is a recognized role model for outstanding library leaders in Taiwan, the U.S., and beyond. It is a fitting tribute to her that her family established this annual memorial award at CALA in 2012. I had the privilege of knowing Dr. Fung for almost half a century, from the time she studied and worked in the U.S. and later during her outstanding career in Taiwan, including in library administration, teaching, research, publications, and government work on the civil service examination. In her role as president of the Library Association of China in Taiwan, she was instrumental in the drafting and passage of the Library Law in Taiwan. In addition, she was a pioneer in Chinese library automation and a tireless advocate for women's rights in Taiwan. Our professional paths crossed many times through international conferences and programs. It was a privilege that in 1985 I was invited by her to assist in organizing a special session at the 45th Annual Conference of the American Society for Information Science held in Columbus, Ohio. I still remember the many phone calls from her in the morning hours to check on the details of the conference planning. Her attention to detail made a great impression on me. The more we contacted each other, the more my respect grew for her vision, leadership, and dedication to excellence.
Throughout my 50 years of library career, beginning as a student library assistant, a librarian trainee, and then a beginning professional librarian at the University of Pittsburgh Library during my graduate studies; then head of technical services at Duquesne University Library; assistant head and then head librarian at Edinboro State University of Pennsylvania; director of the library and information center at the Asian Institute of Technology in Bangkok, Thailand, under the sponsorship of the U.S. Agency for International Development; associate director of libraries at Colorado State University; dean of libraries at Ohio University; visiting distinguished scholar at OCLC; and finally chief of the Asian Division at the Library of Congress, I had many role models at different stages of that career, and Dr. Fung was an important one of them.

Of the many honors and recognitions I received during my library career, including the Outstanding Administrator's Award from Ohio University; the Ohio Librarian of the Year award from the Ohio Library Association; the John Ames Humphrey Award for Contributions to International Librarianship from the American Library Association; the naming of a library building as the Hwa-Wei Lee Library Annex by Ohio University upon my first retirement; induction into the Ohio Librarians' Hall of Fame by the Ohio Library Council; the conferring of an Honorary Doctor of Letters degree by Ohio University; and the Melvil Dewey Medal awarded by the American Library Association, I consider the CALA Outstanding Library Leadership Award in Memory of Dr. Margaret Chang Fung to have a very special meaning for me, because Dr. Fung was a leader I knew personally and had always greatly admired. In addition, I feel honored to join the outstanding past recipients of this award: Dr. Tze-Chung Li, former Dean of the Graduate Library School at Dominican University and a founder of CALA; Dr. Cheng-Ku Wang, former Director General of the National Central Library in Taiwan; and Dr. Joyce Chao-Chen Chen, Dean of Academic Affairs and Professor of Library and Information Science, National Taiwan Normal University. All of them are truly outstanding library leaders with illustrious careers and far-reaching contributions. They set a high bar of achievement for me and for future recipients to aspire to and surpass.

It is with profound gratitude that I accept the Award, with a sense of both humility and pride.

work_dmerjm5znba7pchjf5qoqow6ym ----

An Analysis on Librarian Competencies and Job Type in the Organization of Information

Ji-Won Lee (이지원)**

ABSTRACT

This study analyzed the competencies and job types required of librarians in the organization of information, in order to identify the staffing demands of the library field and to suggest directions for improving education in this area. An examination of 298 job announcements from U.S. libraries confirmed that demand for the traditional qualifications continues, and that requirements concerning new standards and software related to electronic resources have been added in response to the changing environment.
The job types divided broadly into a cluster reflecting traditional information organization and a cluster reflecting electronic resources and information technology. For education in the organization of information, the study proposes expanding content related to electronic resources and information technology, balancing theory and practice, and adopting new teaching methods.

Keywords: organization of information, librarian competencies, job announcement, LIS education, electronic resources, information technology (정보조직, 사서직 역량, 채용공고, 문헌정보학 교육, 전자자원, 정보기술)

* This study was supported by a 2010 research grant from Daegu Catholic University.
** Full-time Lecturer, Daegu Catholic University (jiwon@cu.ac.kr)
Received: August 16, 2011; first review: August 16, 2011; accepted: September 12, 2011.
Journal of the Korean Society for Information Management, 28(3): 47-64, 2011. [http://dx.doi.org/10.3743/KOSIM.2011.28.3.047]

1. Introduction

1.1 Purpose and Need for the Study

In libraries and information centers, the organization of information is the essential process that effectively connects the diverse information resources collected with their users. The study of information organization, which covers subject analysis, classification, and the creation and management of catalogs and other secondary information sources for a wide range of materials, has been a core area of library and information science education since the discipline's beginnings.

Like other areas, information organization has been directly and indirectly affected by changes inside and outside the library. The development of computers and telecommunications brought fundamental changes to traditional practice, and the growth of networks and the shift to the web environment produced enormous quantities of digital information, making its organization a new challenge. Outside Korea, job titles for this work now include not only the traditional cataloger but also metadata librarian and electronic resources librarian. As the influence of the internet and the web grew, so did a sense of crisis about the role of libraries and their catalogs, which in turn prompted much reflection on, and interest in, reform and change in the field. Out of this current of change have come new principles and standards such as FRBR (Functional Requirements for Bibliographic Records), the Statement of International Cataloguing Principles, and RDA (Resource Description and Access).

Keeping pace with these changes in the field requires corresponding improvement in the education of the professionals who perform information organization work. Abroad, research on this education has continued steadily, covering curricula, teaching practice and methods, and analyses of field demands; in Korea, only a handful of studies appeared in the mid-1990s and in recent years.

This study analyzes the competencies and job types of information organization librarians in order to understand the kinds of staff the field is seeking and to explore ways to improve education in information organization. Job announcements from U.S. libraries from 2007 through June 2011 were used as the information source. Job announcements reflect the field's demands for librarians' duties and qualifications well; in Korea, however, too few announcements are produced, and they are almost never posted by functional specialty such as information organization, which makes them difficult to use as a source. Although U.S. and Korean libraries differ in scale and subject coverage, given the development path of Korean libraries, it seems reasonable to use the present state of U.S. libraries to anticipate coming changes in Korea.

1.2 Research Method

To analyze the competencies and job types of information organization librarians, this study used literature review, content analysis, and bibliometric analysis. The literature review covered studies of information organization education and of the competencies and qualifications of information organization librarians. For the content analysis and bibliometric analysis, 298 job announcements related to information organization were collected; the data were constructed and analyzed as follows.

1.2.1 Data Collection and Construction

Three sources were used for data collection. First, announcements related to information organization published from 2007 through June 2011 in American Libraries, the journal of the American Library Association. After review, duplicate postings and postings whose details were available only on the hiring institution's website were excluded. Second, postings from 2007 through June 2011 on AUTOCAT, the leading listserv in the information organization field, retrieved with job-announcement keywords (job announcement, job posting, position announcement, position posting); again, duplicates, postings without details in the message itself, and postings unrelated to information organization were excluded. Third, data retrieved (May 11-July 22, 2011) from the ALA JobList,1) limited to the job categories Cataloging/Bibliographic Control and Technical Services; postings unrelated to information organization were excluded, accessible postings linked from institutional websites were included, and a total of 22 items were collected from this source.

1.2.2 Data Analysis Methods

To examine the competencies required of information organization librarians, the responsibilities (or duties) and requirements stated in the announcements were divided into six categories: education, theoretical knowledge, cataloging abilities, foreign language abilities, information technology abilities, and interpersonal and other work abilities; theoretical knowledge and cataloging abilities were each further subdivided. To analyze job types, a bibliometric method was used: the responsibilities in the announcements were automatically indexed, and the extracted index terms served as the basis for clustering the announcements. Following the method and procedure used by 이재윤 et al. (2007), the analysis proceeded through automatic indexing and stopword removal, term weighting, and multi-stage clustering.
Stopwords and terms without topical content were removed from the terms describing the responsibilities. Term weights were set with the log TF-IDF formula, the product of within-document term frequency and inverse document frequency (정영미 2005); cosine similarity was applied to compute the similarity between term vectors; and Ward's clustering algorithm was then used to generate the clusters. First, to partition the 298 announcements in a reasonably balanced way, they were clustered into 30 clusters averaging about 10 announcements each. Seven clusters with fewer than five members were merged with their most similar clusters, yielding 23 small clusters. Next, to form higher-level types for detailed analysis, additional multi-stage clustering based on the similarity of the 23 clusters, as in 이재윤 et al. (2007), produced 9 medium clusters and 2 large clusters. Representative subject terms were then selected for each cluster to characterize the types: log TF-IDF weights were computed for each term in each announcement, the weights of identical terms were summed across the announcements in each cluster, and the highest-scoring terms were selected.

1) The ALA JobList displays only the most recent two months of postings.
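The paper names the formulas but does not show code; the following is a minimal sketch of such a pipeline in Python, assuming NumPy, SciPy, and scikit-learn are available. The three sample duty strings, the log1p variant of the log TF weight, and the cluster count are illustrative assumptions, not the author's actual data or implementation.

```python
# Sketch of the announcement-clustering pipeline described above:
# log TF-IDF weighting, cosine similarity, and Ward clustering.
# The sample "duty" strings are invented stand-ins for the indexed
# responsibility statements.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from scipy.cluster.hierarchy import linkage, fcluster

duties = [
    "original cataloging monographs MARC OCLC authority",
    "metadata digital repository institutional university",
    "serials electronic resources license access e-journal",
]

vec = CountVectorizer()
tf = vec.fit_transform(duties).toarray().astype(float)
n_docs = tf.shape[0]
df = (tf > 0).sum(axis=0)                      # document frequency per term
weights = np.log1p(tf) * np.log(n_docs / df)   # one common log TF x IDF variant

# Cosine similarity between announcement vectors
norms = np.linalg.norm(weights, axis=1, keepdims=True)
unit = weights / np.where(norms == 0.0, 1.0, norms)
similarity = unit @ unit.T                     # cosine of each pair of announcements

# Ward clustering; Euclidean distance on the unit vectors is a monotone
# transform of cosine dissimilarity, so the tree respects the similarities.
tree = linkage(unit, method="ward")
labels = fcluster(tree, t=2, criterion="maxclust")

# Representative terms: sum each term's log TF-IDF weight within a cluster
terms = vec.get_feature_names_out()
for c in sorted(set(labels)):
    totals = weights[labels == c].sum(axis=0)
    print(c, [terms[i] for i in np.argsort(totals)[::-1][:5]])
```

In the study itself the announcements were first cut into 30 clusters, undersized clusters were merged, and the 23 survivors were re-clustered into 9 medium and 2 large clusters; the same linkage tree can simply be cut at several levels to mimic that multi-stage grouping.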
2. Related Studies

Outside Korea, research on information organization education has continued steadily: studies of curricula (course types and counts, courses actually offered, offering cycles), studies of teaching practice and methods both in general and in subareas such as subject analysis and authority control, and studies that chart directions for education by analyzing the competencies and qualifications of information organization librarians. In Korea, only a few studies appeared in the mid-1990s and recently. This section reviews the Korean studies and recent international studies in two groups: those centered on LIS education, and those analyzing librarian competencies and qualifications through job announcements or surveys.

Joudrey (2002; 2008) conducted longitudinal studies of information organization curricula at ALA-accredited LIS graduate schools in the United States and Canada, identifying the types and numbers of courses and the courses actually offered, and comparing the changes between the two studies.

Davis (2008) likewise surveyed cataloging-related curricula at ALA-accredited LIS graduate schools and compared the results with earlier studies. Dividing cataloging-related curricula into eight types, the study examined course types and counts, courses actually offered, and offering cycles, finding an average of 3.79 courses offered, with introduction to information organization, basic cataloging, and indexing/abstracting offered most often.

조재인 (2010) synthesized research forecasting future cataloging work and the new roles and competencies of catalogers, and proposed directions for next-generation cataloging education: in initial librarian education, courses should convey the basic concepts of organizing information for resource description and discovery and the mission of information services so that graduates can apply them in the field, while continuing education for practicing librarians should help them understand new ways of viewing the bibliographic universe and apply the changed library environment to their work.

노지현 (2011) diagnosed the weight and main content of information organization courses in Korea and, drawing on the goals, basic directions, and main content of information organization education in the United States, suggested directions for Korean education: university education should center on principles and be harmoniously linked with practical training in the field, and LIS academia, the library community, and the library association should cooperate on librarian education.

The methods most often used to examine the field's demands for librarian competencies and qualifications are the analysis of job announcements in print or online media and surveys.

정연경 (1997; 1999) examined the roles and qualifications of catalog librarians through content analysis of 292 U.S. academic library cataloger announcements from 1990-1997 and 200 cataloger announcements from 1996-1999. The qualifications identified were an ALA-accredited master's degree, knowledge of traditional classification and cataloging tools, subject background, library work experience, library automation systems and computer skills, and foreign language abilities, together with personal and work attributes such as supervisory and management ability, communication skills, analytical and problem-solving ability, adaptability to a changing environment, and professional activities and continuing education.

Albitz (2002) analyzed 101 announcements for academic library electronic resources librarians from 1996-2001. The positions were split roughly evenly between public services and technical services, and roughly 40% each required no professional experience or one to three years of experience. The duties, in order of frequency, were electronic resource management, reference, bibliographic instruction, web design, collection development, library automation/technology support, staff training, cataloging, and serials work.

Hall-Ellis (2006; 2008) synthesized three of her own studies, combining 355 cataloger announcements from 2000-2005 with a 2006 survey of 289 respondents on foreign language abilities, and organized the competencies catalogers need into five areas, analyzed by library type and by position level (entry and managerial): education, theoretical knowledge, cataloging abilities, communication skills, and interpersonal skills. Theoretical knowledge was subdivided into cataloging tools (cataloging rules) and bibliographic description; cataloging abilities into descriptive cataloging, authority control, classification systems, subject analysis, cooperative programs, and foreign language abilities; and interpersonal skills included leadership, supervision, and training.

Park and Lu (2009) analyzed the roles and competencies of metadata professionals in 107 announcements posted to the AUTOCAT listserv from 2003-2006. The most common titles were Metadata Librarian and Catalog/Cataloging & Metadata Librarian. The duties, in order, were metadata creation; management work such as coordination, supervision, and planning; cataloging; electronic resource management; and keeping abreast of current trends. The qualifications and skills required were, in order, interpersonal communication skills including collaborative communication, knowledge of cataloging and classification standards, metadata knowledge and skills, and electronic/digital resource management.

Han and Hswe (2010) likewise analyzed 86 metadata librarian announcements from 2000-2008 and 85 announcements for traditional catalog librarians. The required qualifications mentioned most often for metadata librarians were metadata-related knowledge and experience (metadata schemas, current trends, and new standards), followed by cataloging experience and knowledge and by demands for metadata schemas including traditional cataloging and classification standards. Comparing the two groups, knowledge of MARC held a similar and growing share in both, while the largest difference between them was knowledge of new technologies.

박옥남 (2011) surveyed 37 public and academic library catalogers in the Daejeon and Chungnam region about the items required of technical services librarians, whether their university coursework and practice in cataloging and classification had been helpful, and what should be improved in future cataloging and classification courses. For cataloging and classification knowledge and metadata formats, the most frequent answers concerned Korean cataloging rules, classification schemes, and KORMARC, while the importance of other standard metadata was rated low. The material types one should be able to catalog were, in order, monographs, continuing resources, and electronic resources; for computer and web skills, most respondents answered that an expert level, able to use both basic and advanced functions, was needed. Among personal and management abilities, analytical problem-solving and the ability to work independently were rated most important. Suggested improvements for cataloging instruction included more emphasis on practice and on cataloging diverse material types such as non-book materials.

3. Analysis of Qualifications and Competencies of Information Organization Librarians

3.1 Overview of the Data

The 298 announcements collected from the three sources break down as follows. The counts by source and year are shown in Table 1. The number from American Libraries, published in print, declined sharply over the period; in cost and time, online recruiting appears to have become the more effective medium.

<Table 1> Announcements by source and year
Source              2007  2008  2009  2010  2011  Total
American Libraries    23    10     4     0     0     37
AUTOCAT               59    63    46    53    18    239
ALA JobList            0     0     0     0    22     22
Total                 82    73    50    53    40    298

By library type (Table 2), academic libraries accounted for most of the announcements (66.8%), followed by public libraries (8.1%), government libraries (5.4%), corporate libraries (4.4%), research libraries (3.4%), and other institutions (12.1%). In the yearly counts, the figures in brackets are announcements placed by staffing agencies on behalf of the institution: 15 of the 16 government library announcements and 11 of the 13 corporate library announcements were placed by staffing agencies. "Other" includes museums, school libraries, library consortia such as OCLC, library system vendors, and staffing-agency announcements whose library type could not be identified.

<Table 2> Announcements by library type and year (brackets: placed by staffing agencies)
Type         2007   2008    2009     2010    2011   Total (%)
Academic       61    49[1]   29[3]   28[1]   32     199 (66.8%)
Public          8     8[3]    3[1]    2       3      24 (8.1%)
Government      0     4[3]    3[3]    8[8]    1[1]   16 (5.4%)
Corporate       2     3[3]    3[3]    4[4]    1[1]   13 (4.4%)
Research        6     0       0       3       1      10 (3.4%)
Other           5     9[2]   12[10]   8[2]    2      36 (12.1%)
Total          82    73[12]  50[20]  53[15]  40[2]  298 (100%)

By position level, 85 announcements (28.5%) were managerial and 213 (71.5%) were general librarian positions. By job title, 119 (39.9%) contained only catalog(er); 63 (21.1%) contained metadata alone or together with catalog(er); 22 (7.4%) contained electronic resources; and 94 (31.5%) carried other titles, including technical services alone or titles tied to a specific material type, such as serials librarian.

3.2 Competencies Required of Information Organization Librarians in the Announcements

Drawing on earlier studies and on the announcements themselves, the competencies related to information organization were divided into six categories: education, theoretical knowledge, cataloging abilities, foreign language abilities, information technology abilities, and interpersonal and work abilities; with theoretical knowledge, cataloging abilities, and interpersonal and work abilities subdivided, eleven subcategories were analyzed in all. Within each category, items stated among the responsibilities or as required were counted as required, and items stated as desired or additional were counted as desired.

3.2.1 Educational Requirements

For education, 249 announcements (83.6%) required an ALA-accredited master's degree in library and information science (MLS, MLIS). Only 22 (7.4%) asked for two or more master's degrees. The 68 (22.8%) in the "other" column of Table 3 are cases requiring an LIS master's degree without mention of ALA accreditation, or only a bachelor's degree, whether without a specified subject or in a particular field required by the institution (humanities, social sciences, art history, music, and so on).

<Table 3> Educational requirements
            ALA-accredited master's   Additional master's   Other
Required            243                        4              68
Desired               6                       18
Total               249                       22

3.2.2 Theoretical Knowledge: Cataloging Rules

For knowledge of cataloging rules, the basic tools of catalog creation, 120 announcements (40.3%) fell under the general heading of overall knowledge of cataloging rules or a period of cataloging experience. Among specific rules, AACR2 was named in 148 announcements (49.7%) and the LCRI (Library of Congress Rule Interpretations) in 38 (12.8%). The recently published RDA was requested in 20 (6.7%); though not a large figure, it shows how quickly job announcements reflect environmental change in information organization (see Table 4).

<Table 4> Theoretical knowledge: cataloging rules
            General   AACR2   LCRI   RDA
Required       111      134     35    15
Desired          9       14      3     5
Total          120      148     38    20

3.2.3 Theoretical Knowledge: Metadata Formats

For knowledge of metadata formats, likewise basic tools of catalog creation, 144 announcements (48.3%) fell under the general heading of overall knowledge, knowledge of metadata, knowledge of non-MARC formats, or a period of cataloging experience. Among specific formats, MARC was requested overwhelmingly most often (164 announcements, 55%), followed by DC (69, 23.2%), EAD (47, 15.8%), MODS (34, 11.4%), and METS (33, 11.1%).
Other formats mentioned included VRA Core, FGDC, and PREMIS (see Table 5).

<Table 5> Theoretical knowledge: metadata formats
            General   MARC   DC   EAD   MODS   METS   Other
Required       128     147    54    40     29     28     36
Desired         16      18    15     7      5      5
Total          144     165    69    47     34     33

3.2.4 Cataloging Abilities: Descriptive Cataloging Experience and Skills

For descriptive cataloging experience and skills, which bear directly on actual catalog creation, 49 announcements (16.4%) asked generally for cataloging experience. Original cataloging experience was explicitly named in 157 (52.7%) and copy cataloging in 51 (17.1%); experience or ability in complex copy cataloging, reviewing and correcting records created by staff, appeared in 49 (16.4%). The "other" category covers descriptive cataloging of special material types such as music, maps, theses and dissertations, and audiovisual materials (see Table 6).

<Table 6> Cataloging abilities: descriptive cataloging experience and skills
            General   Original   Copy   Complex copy   Other
Required        45        108      21        16           5
Desired          4         49      30        33
Total           49        157      51        49

3.2.5 Cataloging Abilities: Authority Control

Authority control is the most specialized and time-consuming part of cataloging work. Responsibilities, knowledge, or experience in authority control were required in 58 announcements and desired in 8, for a total of 66 (22.1%).

3.2.6 Cataloging Abilities: Classification

For classification, knowledge of classification systems or classification experience stated only in general terms was very rare, presumably because most U.S. libraries use a limited set of schemes, such as LCC and DDC, and announcements usually name the scheme the institution uses. Demand for LCC was overwhelming, which likely reflects the fact that 67% of the announcements came from academic libraries, a great many of which use LCC. Experience with or knowledge of LCC was required in 110 announcements (36.9%), including 85 (42.7%) of the 199 academic libraries. DDC appeared in only 34 (11.4%), a large gap from LCC. Other schemes included the National Library of Medicine Classification (NLMC) and the Superintendent of Documents Classification (see Table 7).

<Table 7> Cataloging abilities: classification
            General   LCC   DDC   Other
Required         7     103    24     11
Desired          1       7    10
Total            8     110    34

3.2.7 Cataloging Abilities: Subject Analysis

Knowledge of and experience with the Library of Congress Subject Headings (LCSH) for assigning subject headings were required in 119 announcements and desired in 11, for a total of 130 (43.6%). Another 19 (6.4%) asked for general knowledge of subject analysis or for experience with other subject heading lists such as MeSH or thesauri such as the AAT.

3.2.8 Cataloging Abilities: Cooperative Programs

Cooperative cataloging has been standard practice in most libraries since OCLC's founding in the 1960s. Experience with OCLC shared cataloging was requested overwhelmingly most often, in 133 announcements (44.6%), while general mentions of cooperative programs or bibliographic utilities were very rare; since OCLC, which absorbed utilities such as WLN and RLIN, is now so dominant, announcements typically name OCLC explicitly. Other cooperative experience requested involved the Program for Cooperative Cataloging (PCC) run by the Library of Congress and its component programs: NACO (Name Authority Cooperative Program), CONSER (Cooperative Online Serials Program), SACO (Subject Authority Cooperative Program), and BIBCO (Monographic Bibliographic Record Program). The "other" category covers knowledge of or experience with particular national or regional cooperatives such as SUDOC and OhioLINK. Overall, 152 announcements (51%) asked for at least one cooperative program (see Table 8).

3.2.9 Foreign Language Abilities

For the foreign language abilities needed to organize resources in many languages, at least one foreign language was required in 40 announcements and desired in 45, a total of 85 (28.5%); two or more foreign languages were required in 5 and desired in 38, a total of 43 (14.4%). Spanish, French, and German were most common, with Asian languages, Latin, Turkish, and others also appearing.

3.2.10 Information Technology Abilities

The use of information technology in information organization is universal; aside from systems and technology positions, it is the most demanded of any library area (최상희 2008). General statements, such as overall knowledge of information technology or computer skills, appeared in 43 announcements (14.4%). Experience with library software was the most frequent of all requirements in the announcements (205, 68.8%). Within library software, the most common demand was for integrated library systems (ILS), often naming particular systems such as Voyager, Millennium, or Aleph. Also included were software related to electronic resources (ERMS, link resolvers, metasearch systems, and the like), cataloging tools (Cataloger's Desktop and Classification Web, MarcEdit, and others), and digital collection building and management software such as CONTENTdm. Web technologies, such as HTML, XML, and XSLT, web design, and web authoring tools such as Dreamweaver, were requested in 37 announcements (12.4%), and database skills such as Oracle and MySQL in 34 (11.4%). The "other" category (62, 20.8%) includes office software such as MS Office, basic programming ability, and software specific to particular institutions (see Table 9).
<Table 8> Cataloging abilities: cooperative programs
            General   OCLC   NACO   CONSER   SACO   BIBCO   Other
Required         6      112     11       9      3       2      10
Desired          1       21     18       4     10      10
Total            7      133     29      13     13      12

<Table 9> Information technology abilities
            General   Library S/W   Web tech   DB   Other
Required        35         132           22     26     62
Desired          8          87           15      8
Total           43         205*          37     34
* Excluding 14 announcements counted in both the required and desired rows.

3.2.11 Interpersonal and Other Work Abilities

Among interpersonal and other work abilities, communication skills were demanded most, in 150 announcements (50.3%), almost all as requirements. Leadership/supervisory ability was requested in 127 (42.6%), nearly half; the ability to work smoothly in a team and to collaborate with other departments in 99 (33.2%); keeping abreast of current trends in libraries and information organization in 69 (23.2%); and the ability to educate and train supervised staff in 55 (18.5%). A user-centered service attitude was requested in 31 (10.4%); while not a high figure, the fact that such an attitude is requested at all in information organization, traditionally an area without direct user contact, reflects the changing character of the work (see Table 10).

<Table 10> Interpersonal and work abilities
            Communication   Leadership/supervision   Collaboration   Current awareness   Training   User-centered
Required          143                 102                   94                62              44           31
Desired             7                  25                    5                 7              11            0
Total             150                 127                   99                69              55           31

Beyond these, announcements asked for the ability to work independently, adaptability to a changing environment, problem-solving ability, creativity, and continuous self-development through continuing education. Research and scholarly activity were also requested; because librarians at U.S. academic libraries often hold tenured faculty status, research output and participation in scholarly societies are required to maintain it. Comparing managerial and general positions on these abilities (Figure 1), managerial positions showed higher proportions on every item except collaboration; apart from leadership/supervision, however, the differences were small, and for leadership/supervision the managerial demand was more than double. The training proportion was expected to differ more than it did, presumably because even general librarians often carry substantial responsibility for training support staff and student assistants.

<Figure 1> Proportions of interpersonal and work abilities requested, by position level

4. Analysis of Job Types of Information Organization Librarians

To examine the job types in the announcements in finer detail, term clustering was used to generate clusters at three levels, large, medium, and small, as described in section 1.2.2, and the representative subject terms were reviewed to characterize each level. Of the two large clusters (LC), LC1 held 174 announcements, with top representative terms metadata, MARC, bibliographic, authority, original, and copy; LC2 held 124, with top terms including access, resource, digital, electronic, technology, and serial. By these terms, LC1 reflects traditional information organization, while LC2 is strongly associated with electronic resources, digital resources, and information technology. The inclusion of serial in LC2 is consistent with electronic journals being the archetypal electronic resource. It is notable that metadata, which one might expect in LC2, emerged instead as a representative term of LC1: in U.S. libraries, metadata has evidently become a general term in information organization, and basic knowledge of it is needed from an integrated perspective on existing resources even in positions that do not handle electronic resources.

The sizes and representative terms of the nine medium clusters (MC) are shown in Table 11.

<Table 11> Job type medium clusters: sizes and representative terms
LC1 (174)   MC1  55   metadata, database, authority, university, MARC
            MC4  37   copy, original, monography, head, complex
            MC7  34   MARC, OCLC, authority, clerk
            MC8  24   reference, copy, material, complex, original
            MC9  24   book, processing, shelving, archive, technician
LC2 (124)   MC2  43   serial, service, manage, technical, acquisition
            MC3  44   system, technology, manage, oversee, leadership
            MC5  23   digital, metadata, repository, institutional, university
            MC6  14   license, electronic, resource, access, e-journal

Looking at the five medium clusters in LC1: MC1, the largest, had metadata, database, authority control, university, and MARC as its terms; although databases are associated with information technology, library catalogs have long been automated and managed in machine-readable form, so its placement in LC1 is natural. MC4's terms show it to be the cluster most closely tied to traditional cataloging. MC7 and MC8 likewise carried many traditional cataloging terms; OCLC was extracted for MC7, and on checking the qualifications, 68% of its announcements required OCLC shared cataloging experience, more than 20 points above the 45% average. In MC8, reference stood out: titles such as Cataloging/Reference Librarian appeared, and nearly 50% of the announcements included providing reference service among the duties. MC9's terms related to physical processing and library support staff; announcements from a particular staffing agency, which accounts for most postings for temporary and support positions centered on physical processing, made up 58% of the cluster.

Among the four medium clusters in LC2: MC2 carried terms for serials, technical services, and managerial roles. MC3's terms related to managerial roles and information technology. MC5's representative terms included digital, repository, and university; of its 23 announcements, 20 (87%) were from academic libraries, far above the 67% average.
MC6's terms reflect the character of electronic resources most clearly.

The sizes and terms of the 23 small clusters are shown in Table 12.

<Table 12> Job type small clusters: sizes and representative terms
LC1  MC1   SC1   16   metadata, database, head, leadership
           SC5   17   subject, metadata, bibliographic
           SC11  14   metadata, MARC, standard
           SC13   8   quality, database, authority, metadata
     MC4   SC4   21   coordinator, complex, head, copy
           SC19  16   copy, print, original, MARC
     MC7   SC12  18   control, authority, metadata
           SC20  11   manuscript, subject, MARC, curator
           SC23   5   transcribe, clerk, card
     MC8   SC16  17   reference, public, instruction, complex
           SC18   7   copy, complex, director
     MC9   SC17  11   book, backlog, assist, card
           SC21   7   process, technician
           SC22   6   archive, shelving, process
LC2  MC2   SC2    9   serial, binding, journal, electronic
           SC7   20   service, technical, acquisition, collection
           SC8   14   lead, strategy, plan, budget
     MC3   SC3   11   university, leadership, profession, oversee
           SC6   26   function, technology, oversee, director, plan
           SC14   7   manage, system, administrator, staff
     MC5   SC9   11   metadata, digital, university, harvest
           SC15  12   digital, repository, institutional, technology
     MC6   SC10  14   license, electronic, resource, access, e-journal

Looking at the small clusters within each medium cluster: within MC1, SC1 carried terms for metadata, databases, and managerial roles, and its share of managerial positions, 44%, was far above the 29% average. SC5 was distinguished by the term subject; checking educational requirements and language abilities showed that more than 40% of its announcements required a specific subject background or two or more foreign languages. SC11 had no especially distinctive terms, and SC13 carried database and authority control terms. SC4 and SC19 in MC4 likewise had terms strongly tied to traditional cataloging, with SC4 also reflecting its 48% share of managerial positions. Within MC7, whose terms were OCLC, authority, and clerk, SC12 included authority control, required by 50% of the announcements in the cluster; SC20 included manuscripts, curator, and, like SC5, subject: a Shakespeare special library with the character of a museum accounted for nearly 30% of its hiring institutions, and more than 50% of its announcements required a specific subject background or two or more foreign languages. All five announcements in SC23 were staffing-agency postings for clerks on projects converting card catalogs to MARC, and its terms reflect this. In MC8, whose terms were reference and complex (copy cataloging beyond simple copy work), SC16 and SC18 shared the term complex; 65% of SC16's announcements included providing reference service among the duties, while SC18 showed managerial characteristics. The three clusters in MC9, with their physical processing and support staff terms, were all dominated by staffing-agency postings: SC17 held many announcements for temporary staff to catalog backlogs, and half of SC22's titles related to records management, with physical processing terms extracted.

Among the three small clusters in MC2, with its serials and technical services terms, SC2 carried serials and journal terms and SC7 terms for technical services, acquisitions, and collection development. In SC8 only managerial terms ranked high; checking the titles showed that about 80% included technical services and continuing resources. The small clusters in MC3, with its managerial and information technology terms, showed similar terms; SC3 consisted entirely of academic libraries, reflected in the high-ranking term university, while SC6 and SC14 were split into separate clusters despite equivalent meaning because different terms were used. In MC5, with digital and repository as its representative terms, SC9 and SC15 yielded terms related to metadata, digital resources, and institutional repositories, and SC15, like SC3, consisted entirely of academic libraries. SC10's character is the same as that of MC6 described above.

5. Conclusions and Suggestions

The development of computers and telecommunications since the mid-twentieth century fundamentally changed traditional information organization, and the advent of the internet and the rapid shift to the web environment have produced enormous quantities of electronic resources and digital information whose organization poses a new challenge. It has accordingly been argued that, to respond effectively to these environmental changes, the role of information organization librarians must expand and they must be equipped with more diverse knowledge and competencies.

This study used content analysis to examine how far such claims are reflected in recent U.S. library job announcements, and cluster analysis to examine the detailed job types. Across the eleven subcategories of qualifications and competencies, the traditionally cited tools for cataloging, classification, and subject analysis, AACR2, LCRI, LCC, DDC, and LCSH, and the MARC format still carried great weight. At the same time, announcements also asked for understanding of bibliographic control models and new standards such as FRBR and RDA, and of the various metadata formats for describing electronic resources. More than half requested experience with cooperative cataloging, and about 30% requested at least one foreign language. Among information technology abilities, experience with library software was the most frequent of all requirements (68.8%), with integrated library systems requested most often within it.
It is noteworthy that there was considerable demand for the various kinds of software related to electronic resources (ERMS, link resolvers, metasearch systems, and so on), reflecting both the sharply growing share of electronic resources in library collections and the fact that, for electronic resources, user access is considered along with metadata creation. Interpersonal and other work abilities also covered a wide range of demands, with communication skills the most frequent, in more than half of the announcements. Demand for leadership/supervisory ability appeared, as expected, in more than 70% of managerial announcements. A user-centered service attitude was requested in about 10%; while not high, this reflects a change in the character of information organization work, which traditionally involved no direct contact with users. The job types in the announcements divided broadly into those reflecting traditional information organization and those reflecting electronic resources and information technology. The inclusion of metadata among the terms representing traditional information organization suggests that, in the United States, metadata has become a general term in the field. The subclusters could be characterized by position level, library type, qualifications, and posting source.

That education for information organization professionals must keep changing and developing to build the competencies the field demands needs no further argument. On the basis of these findings, the following directions are proposed for improving information organization education in Korea.

First, content related to electronic resources should be expanded and treated from a perspective integrated with print resources. In U.S. libraries, the organization of electronic resources constitutes a major job type in its own right. In Korea, however, information organization courses often treat electronic resources only in passing, as part of non-book materials, and in the field the departments responsible for information organization have not embraced electronic resources actively, which limits their ability to link them with existing resources and provide users with broad access (이지원 2011). The continued growth of electronic resources is self-evident, and the importance of organizing them deserves corresponding emphasis.

Second, information organization education should balance theory and practice. Change in the environment of information organization will continue and accelerate. A solid foundation must therefore be built in the basics: the fundamental concepts of information organization; the importance of its role in the cycle from the creation of information resources to their use, and its relation to other work; and the basic principles and functions of subject analysis, classification, and cataloging. Effective teaching methods for theory deserve attention, but stressing theory alone to students without field experience limits the educational effect. Students can, of course, receive more practical training from senior librarians once in the field, but in the Korean library environment it is difficult to give new librarians adequate training while they carry out their duties, and such training risks emphasizing fragmentary, routine skills. Sufficient practice in university education should therefore reinforce the effect of theoretical instruction, so that only brief training is needed on the job.

Third, the active use of, and connection with, information technology in information organization education should be expanded. This is closely related to education and practice with electronic resources, and it is also necessary if information organization is to respond actively to the continuously evolving web environment. Practical measures include enabling practice in environments similar to library systems, having students directly find and examine the present state of library catalogs as they change with the web, and introducing the basic functions and use cases of various library-related software.

Finally, the curriculum should reflect the realities of the library field as much as possible and provide opportunities for indirect experience. Conventional teaching methods and classroom instruction alone are insufficient here. Field practicums and library volunteer work can let students experience the need for interpersonal and other work abilities at first hand, and the adoption of new teaching methods such as problem-based learning (PBL) can develop problem-solving ability, analytical and integrative thinking, and communication skills, and foster awareness of the need for collaboration.

It is hoped that this study will help to re-illuminate the role and importance of information organization work, strengthen the professionalism of information organization librarians, and contribute to improving information organization education in a desirable direction.

References

노지현. 2011. 한국의 자료조직 교육에 대한 진단과 방향 모색. 한국도서관․정보학회지, 42(1): 225-245.
박옥남. 2011. 정리사서 전문성 재고에 관한 연구. 한국비블리아학회지, 22(1): 95-116.
이재윤, 문주영, 김희정. 2007. 텍스트 마이닝을 이용한 국내 기록관리학 분야 지적구조 분석. 한국문헌정보학회지, 41(1): 345-372.
이지원. 2011. 대학도서관 전자자원 메타데이터 실태 분석. 정보관리학회지, 28(1): 221-235.
정연경. 1997. 대학도서관 목록사서의 역할 및 자격요건에 관한 연구. 정보관리학회지, 14(2): 143-163.
정연경. 1999. 목록사서직의 자격요건과 목록교육의 방향: 1990년대 중반 이후 미국을 중심으로. 제6회 한국정보관리학회 학술대회, 1999년 8월 18일-19일. 서울: 연세대학교.
정영미. 2005. 정보검색연구. 서울: 구미무역.
조재인. 2010. 차세대 목록 교육의 방향성에 관한 연구. 한국도서관․정보학회지, 41(2): 127-145.
최상희. 2008. 구인광고에 나타난 정보기술관련 사서직 자격요건 분석. 한국도서관․정보학회지, 39(1): 339-354.
Albitz, Rebecca S. 2002. "Electronic resource librarians in academic libraries: A position announcement analysis, 1996-2001." Portal, 2(4): 589-600.
Davis, Jane M. 2008. "A survey of cataloging education: Are library schools listening?" Cataloging & Classification Quarterly, 46(2): 182-200.
Joudrey, Daniel N. 2002. "A new look at US graduate courses in bibliographic control." Cataloging & Classification Quarterly, 34(1/2): 59-101.
Joudrey, Daniel N. 2008. "Another look at graduate education for cataloging and the organization of information." Cataloging & Classification Quarterly, 46(2): 137-181.
Hall-Ellis, Sylvia D. 2006.
"Cataloging electronic resources and metadata: Employers' expectations as reflected in American Libraries and AUTOCAT, 2000-2005." Journal of Education for Library & Information Science, 47(1): 38-51.
Hall-Ellis, Sylvia D. 2008. "Cataloger competencies … What do employers require?" Cataloging & Classification Quarterly, 46(3): 305-330.
Han, Myung-Ja and Patricia Hswe. 2010. "The evolving role of the metadata librarian." Library Resources & Technical Services, 54(3): 129-141.
Park, Jung-ran and Caimei Lu. 2009. "Metadata professionals: Roles and competencies as reflected in job announcements, 2003-2006." Cataloging & Classification Quarterly, 47(2): 145-160.

work_dodgvf7r7nfpthtiv6r564zcji ----

Building an institutional repository: sharing experiences at the HKUST Library

Ki-Tat LAM and Diana L. H. CHAN

The authors: Ki-Tat Lam is the Head of Library Systems at HKUST Library. Diana L. H. Chan was the Head of Reference at HKUST Library and is now the Associate Librarian of Public Services at the City University of Hong Kong.

Keywords: Institutional repositories, open access, self-archiving rights, content recruitment, DSpace, digital libraries, HKUST

Abstract
Purpose - To document HKUST's experiences in developing its Institutional Repository and to highlight its programming developments in full-text linking and indexing and in cross-institutional searching.
Design/methodology/approach - This paper describes how HKUST Library planned and set up its Institutional Repository, how it acquired and processed the scholarly output, and what procedures and guidelines were established. It also discusses some new developments in systems, including the implementation of OpenURL linking from the pre-published version in the Repository to the published sources; the partnership with Scirus to enable full-text searching; and the development of a cross-searching platform for institutional repositories in Hong Kong.
Findings - It illustrates what policy issues should be adopted and why, including paper versioning, authority control, and withdrawal of items. It discusses what proactive approaches should be adopted to harvest research output. It also shows how programming work can provide usage data, facilitate searching, and publicize the repository so that scholarly output becomes more accessible to the research community.
Practical implications - Provides a very useful case study for other academic libraries that want to develop their own institutional repositories.
Originality/value - HKUST is an early implementer of institutional repositories in Asia, and its unique experience in policy issues, harvesting content, standardization, software customization, and measures adopted to enhance global access will be useful to similar institutions.

Introduction

The Hong Kong University of Science and Technology (HKUST) is a young institution, opened in October 1991. It offers taught and research programs in science, engineering, business, humanities, and social science, with 430 full-time faculty members, 5,600 undergraduates, and 3,200 postgraduates. Despite its short history, HKUST has rapidly evolved into a world-class institution, and was ranked number 43 in the world by The Times Higher Education Supplement in 2005.
The HKUST Library has been engaged in library digitization projects since its foundation 15 years ago, including the early project on Course Reserve Imaging in 1993 and the CJK (Chinese, Japanese, Korean) capable systems for the Digital University Archives and Electronic Theses in 1997. The experience gained through these projects facilitated a smooth creation of its Institutional Repository. The Library showed its early support for the open access concept by joining SPARC in 2001, and in November 2002, Kimberly Douglas, the University Librarian of the California Institute of Technology, was invited to the Library to give a staff development workshop on e-prints, OAI (the Open Archives Initiative) and institutional repositories. The Library decided to build the HKUST Institutional Repository after the workshop, aiming to create a permanent record of the institution's scholarly output in digital format and to make the Repository globally and openly accessible.

The HKUST Institutional Repository (see Figure 1) was launched in February 2003 with its first batch of 105 computer science technical reports. It had grown to 2,369 documents from 42 academic departments by September 2006, holding preprints, technical reports, working papers, conference papers, journal articles, presentations, book chapters, patents, and PhD theses. They are mainly PDF files, with some PowerPoint and program files. These scholarly works were accessed 69,000 times, excluding robots, in the period from September 2005 to August 2006; a research study on how Hong Kong Chinese students learned English was downloaded 800 times in just one month.

Figure 1 Home page of the HKUST Institutional Repository (http://library.ust.hk/repository/)

The HKUST Institutional Repository was reported in a number of conference presentations (Chan 2004a; Chan 2004b; Lam 2004; Lam 2006) and journal articles
The task force’s charge was to identify the issues involved in creating the IR, to evaluate and select the software for hosting the Repository, and to develop action plans. Findings and recommendations were reported to the Library Administration Committee, the main decision-making body, for approval. A number of policy issues were resolved during the early stage of planning. For example:  Make the IR totally open and accessible to the world. If a faculty member wishes to restrict access, then the document will not be accepted.  The IR is a deposit of research documents, not merely an index with links to external sources; if the Library does not have the right to deposit the full-text papers, they cannot be included in the IR.  Undertake retrospective work to include documents previously published in addition to the current ones.  Do not include ephemeral materials such as faculty-produced course notes, popular works or feature columns from newspapers, but would limit the coverage to published material and grey literature only.  Allow authors to submit documents online and they will sign a permission agreement, granting the Library non-exclusive distribution rights.  Adopt Adobe’s PDF format as the default document format.  Build a single database and not to have multiple databases such as the model adopted by California Institute of Technology. Selecting IR Software While there are many options for selecting IR software and hosting services today, the choices were extremely limited in the year of 2002. Like many of its digital library projects, the Library decided to use open source software for its IR. The main advantage of open source software was that it provided flexibility for local 4 customization and feature enhancements. The significant software cost savings was also a consideration, as the Library did not receive extra funding for the IR project. The task force decided to focus on open source software that supported OAI-PMH (Open Access Inititative - Protocol for Metadata Harvesting). Two such IR software programs were evaluated, namely EPrints and DSpace. EPrints was developed by the University of Southampton and was widely used by IR implementers in 2002. DSpace was jointly developed by MIT Libraries and Hewlett-Packard Company, and began its first release on Sourceforge at the time of the Library’s evaluation. DSpace was developed with experience gained from EPrints, but with a clever move from the Perl programming language to Java and Servlet. And at that time, it also had better Unicode support, which was essential to the Repository that would contain Chinese materials. With the above consideration, the Library decided to adopt DSpace. Once DSpace was selected, the Library began to develop an initial prototype, using the 105 working papers from the Department of Computer Science, which were freely available on their website in postscript format. During this prototyping, a number of design issues were resolved, including how to organize the documents by departments and by document types, and what fields were required in the metadata. Staffing As there was no extra funding and manpower provided for the project, the Library relied on existing library staff to create and maintain the Repository. In addition to the initial planning and system setup done by the task force, an IR Team of eight reference librarians and five data entry staff was established to handle on-going work such as faculty liaison, document acquisition and processing, and the actual data input tasks. 
It was estimated that about 350 man-days for librarians and another 350 man- days for support staff were spent in the first three years of the project. It was found that the time and efforts taken to acquire and process the content so as to achieve a critical mass were quite substantial. Institutions that are interested in creating IRs should be aware of the staffing implications and should request sufficient funding for the project. Organizing the Repository In DSpace, a repository is made up of a hierarchy of communities, collections, items, metadata and bitstreams. A document is represented by an item, which contains metadata, i.e., a description of the document and a bundle of bitstreams, such as PDF and PowerPoint files that hold the actual content of the document. Items are held in collections, which are further grouped under communities. Documents in the HKUST Institutional Repository are organized by academic departments (communities), and within a department, they are further grouped according to the document types (collections), such as conference papers and journal articles. As of September 2006, the Repository has 2,369 documents in 42 communities and 139 collections. As expected, disciplines which have an established tradition of sharing preprints and working papers, such as computer science and engineering, are ranked as top contributors to the Repository (see Table 1). Conference papers, journal articles, preprints, working papers and doctoral theses constitute the major document types held in the Repository (see Table 2). 5 Table 1 Top 10 contributing departments (September 2006) Academic Departments and Centers Size Percentage Computer Science 478 20.2% Electrical and Electronic Engineering 324 13.7% Mechanical Engineering 164 6.9% Marketing 154 6.5% Mathematics 130 5.5% Physics 126 5.3% Chemistry 106 4.5% Social Science 106 4.5% Biology 84 3.5% Language Center 84 3.5% Others 613 25.9% Total 2369 100.0% Table 2 Total number of documents by document types (September 2006) Document Types Size Percentage Conference papers 636 26.8% Working papers, technical papers, research reports, preprints 549 23.2% Journal articles 537 22.7% Doctoral theses 473 20.0% Presentations 70 3.0% Patents 58 2.4% Book chapters 38 1.6% Others 8 0.3% Total 2369 100.0% The metadata of the documents is encoded in qualified Dublin Core schema. DSpace’s default DC registry was followed, except for a locally defined qualifier openurl for the element identifier. The purpose of defining this local field will be discussed later in this paper. Document Submission and Processing To make the document submission as simple and effortless as possible, the Library decided to develop its own Faculty Submission Form, a web-based interface outside of the DSpace workflow. Faculty members are only required to input minimal bibliographic data, such as title, author and citation source when submitting the actual files to the server. The Form contains a Non-Exclusive Distribution License (Figure 2). They need to check the “I Agree” box to grant permission to the Library. The IR Team then verify and enhance the metadata, ascertain publishers’ policies to avoid depositing the wrong versions, harvest and convert the files to PDF format as needed, and add the documents to the Repository. A web-based Add Item program was also developed for the IR Team so that they can integrate these document submission and processing tasks seamlessly with DSpace. 6 Figure 2. 
Non-Exclusive Distribution License in the Faculty Submission Form Harvesting Research Output While some faculty members and researchers had the initiative to submit their documents via the submission form, most of them were not responsive at all. Therefore, the Library had to take a more proactive approach to discover and harvest research output for the Repository. For example, the IR Team had:  Visited faculty members’ personal and departmental websites as well as the websites of the research centers and institutes on campus to harvest full-text research papers and publications posted on the web.  Surveyed academic departments to harvest collections of working papers and technical reports.  Searched the library catalog to identify proceedings of conferences held at HKUST.  Scanned through boxes of pre-published research papers held in the University Archives.  Searched electronic databases and open access sources such as Web of Science and DOAJ to identify papers published by the HKUST researchers.  Contacted individual faculty members to ask for their complete publication lists and their full-text documents. In most of the above cases, the IR Team had to contact the original authors to obtain permissions before loading the harvested documents to the Repository. And if the electronic version was unavailable, the paper document would be digitized. The HKUST Electronic Theses database was built a few years earlier than the IR. Not all these theses are open access. The Library decided to include only those PhD theses with author permission in the IR so that all of them would be openly accessible. Instead of depositing a second copy to the IR, only metadata was created, together with a link to retrieve the full-text from the Electronic Theses database. HKUST faculty members need to report annually to the Research Output Collection System. The Library asked the office in-charge of this system to include a checkbox in the submission page to indicate the reporters’ willingness to deposit their publications into the Repository. If the box is checked, an email containing the 7 citations will be sent to the IR Team for follow up actions. This automatic alert mechanism has enabled us to harvest research output on an annual basis. Publishers’ Policies and Deposit Guidelines Verifying and selecting a version of the document for depositing to the Repository is far from a trivial task. While more and more publishers nowadays have their self- archiving policies clearly spelt out on their websites, this was not the case in 2003. Project RoMEO, which provides a directory of publisher self-archiving policies, was just launched in 2002. The IR Team surveyed the publisher’s policies for the journal articles via SHERPA/RoMEO and publisher websites. Findings were recorded in the IR Staff Working Manual for easy future reference. The list currently contains more than 60 publisher policies (see Figure 3). Publishers’ policies can be summarized as follows:  no archiving allowed  allow pre-refereed version only  allow post-referred version only  allow pre- and post- refereed versions  allow publisher’s version  allow all versions  not specified Figure 3. List of publishers’ policies collected by the HKUST Library, showing links to special notes and acknowledgment text, and links to publishers’ website If the policy is unknown, the IR Team will write to the publishers for clarification and ask them for the archiving permission. 
And if the Library’s version is not usable, the IR Team will contact the authors to ask for an acceptable version. 8 The Library also encourages authors to negotiate with publishers so as to retain their self-archiving rights and the rights for personal educational use. They should also avoid granting an exclusive long-term license that extends beyond first publication. Versioning It is essential that users know whether the version deposited is the published version or not. To avoid confusion, a watermark “This is the Pre-Published version” is stamped on the first page of the document if it is a pre-refereed or post-refereed version. A note “pre-published version” is also displayed in the “Files in This Item” box of the item record display page (Figure 4). A piece of scholarly work may undergo several rounds of revisions. Authors are encouraged to submit revised versions as separate documents. They can also replace the previous versions as long as they are not published items. By doing so, the revised version will bear the same identifier (handle) number as the previous one. Author Name Authority Control Authors may have their works published under different names. It is essential to perform some form of authority control for consistency. Policies for entering the names of HKUST researchers were established. Name authority records for some authors are readily available from the library catalog. For those without record, several university publications such as the Academic Calendar, Faculty’s Profile and Communications Directory are consulted. In a few cases, emails were sent to the authors to seek their preferences on the names used, especially in the case of maiden and married names, names in Chinese, and the inclusion of Christian names. For Chinese documents, the English name of a HKUST author will be taken from the title page, should it appear in English or bilingually in English and Chinese. If HKUST authors do not provide their English names in their Chinese documents, the IR Team will look up their English names and add them to the metadata. For bilingual names of the same author, the Chinese name will be entered in parentheses after the English one, e.g., “Chan, Diana L. H. (陳麗霞)”. Some documents in the Repository were jointly written with non-HKUST authors. Since it is difficult to identify these non- HKUST affiliated authors, the Library decided not to perform authority check on them. Subject Keywords When authors submit records, they can supply three to eight keywords or phrases for indexing. If the subject field is not filled, the IR Team will extract the keywords from the abstracts. English keywords are used for Chinese documents as well. The Library decided not to use thesauri or LC subject headings in assigning subjects. Withdrawal of Items from the Repository At the specific request from authors, documents in the Repository can be permanently removed. To retain the historical record, such transactions will be noted in the metadata record. Since these documents may have been cited by others, the system will supply a "tombstone" when they are requested. A withdrawal statement will be displayed in place of the view document link. 9 Programming Efforts The advantages of using open source software such as DSpace as the platform for the Institutional Repository became more apparent when the needs and requests to customize the software flooded in. 
Apart from the Faculty Submission Form and the Add Item Form as mentioned in the previous section, the following are other customizations that are worth mentioning. Linking to the Published Version Some publishers only allow institutions to archive the pre-published version. From time to time, the Library receives authors’ feedback that they prefer users to read the published version rather than the pre-published version archived in the IR. One can easily resolve this problem by adding the direct URL link to point users to the published version residing on an aggregator’s or the publisher’s website to which the institution has a subscription. The Library objected to this approach because such links would become broken due to subscription changes. After much study, the Library decided to implement an OpenURL linking mechanism on DSpace so that users can be dynamically redirected to library-subscribed resources that host the published version. To enable this linking feature, the metadata (item record) must contain the OpenURL string. This is made possible by using a locally defined Dublin Core field identifier.openurl. DSpace’s item record display page was modified to enable a link-resolver button when the item contains an OpenURL (see Figure 4). Program was developed to query OCLC’s OpenURL Resolver Registry on-the-fly while displaying the button. By doing so, users will see their own institution’s link-resolver, such as WebBridge for HKUST. When users click on the button, the link-resolver will try its best to identify external sources that contain the published version. Constructing OpenURL manually is extremely painful. To automate this process, a web-based program was developed. It can intelligently parse the title, constributor.author and the identifier.citation fields to obtain most of the required key- value pairs for constructing the OpenURL. As the journal’s ISSN is not available in the Repository’s metadata, the program searches the library catalog to obtain the numbers. With this program, the OpenURL string can be quickly created within a few mouse clicks. Figure 5 shows the web interface for this highly user-friendly OpenURL Builder program. 10 Figure 4. A pre-published document record, showing versioning information, with WebBridge link to the published sources Figure 5. OpenURL Builder - automating the construction of the OpenURL Usage Statistics and Top 20 Most Accessed Documents In-house usage analyzing programs were developed to supplement the usage reports that come with DSpace. They are based on the web access logs captured by the server Check Library catalog and auto- insert the ISSNs to the form Click this button to create this OpenURL fragment Click this link to test the OpenURL Click on this image to launch an OpenURL link-resolver to locate the published version Document deposited in the Repository is a pre-published version Watermark 11 when users issue requests to download the bitstreams, e.g., the PDF files. The Repository is open to the world and allows visits from search engines, robots and OAI harvesters. As a result, it receives tens of thousands of web requests per day. Program was developed to enable the Library to know how many times the IR documents were downloaded by “real” users, excluding robot accesses. This figure was updated monthly to the Repository home page. Another customized program was the monthly listings of the Top 20 most accessed documents. 
It is interesting to analyze these Top 20 lists as they give a good account of documents, topics and authors that users are most interested in. Such information is useful for IR promotion. For example, the Library wrote to the authors in the lists to inform them about the high usage of their papers. The IR Team also showed the lists to the faculty members during departmental visits. While the majority of the documents are from the academic departments, it is worth mentioning that a number of documents authored by the HKUST Language Center made their way into the lists, together with the ones HKUST Library wrote on institutional repository and virtual reference. CJK Search and Display In the early versions of DSpace, there were problems on searching and displaying Chinese characters. The authors managed to fix these problems by revising and replacing some of the DSpace source codes. While some of these problems were eventually fixed in DSpace’s later versions, the timing of fixing them was critical to the Library’s IR software selection. Had they not been fixed during software evaluation, the Library would not have selected DSpace. Thanks to open source, one could dig into the source codes and fix problems quickly. The main CJK problem was attributed to the use of the CJK-illegible string tokenizer. DSpace is Unicode capable, meaning that it supports data and strings in multiple scripts, including CJK. However, like many other non-Roman scripts, the way Chinese strings are sorted, indexed and searched can be quite different from that for English. Global software developers should be aware of these differences in order to avoid problems similar to the ones encountered with DSpace. Enhancing Global Access It is essential to publicize an institutional repository so that the research output can be made known to the world. In addition to making the Repository readily available and openly accessible, the Library has implemented the following measures to allow search engines, agents and harvesters around the world to discover documents in the Repository. OAI-PMH Compliance OAI-PMH (Open Access Initiative – Protocol for Metadata Harvesting) is a protocol that allows metadata to be easily harvested by computer programs. Like other IR systems, DSpace is OAI-PMH compliant. It is useful to register the OAI Base Path of the IR to various OAI registries, such as the ROAR (Registry of Open Access Repositories) maintained by EPrints. OAI harvesters can then follow this registered path and retrieve the metadata for their own searching and indexing services. At least two well known services, namely OAIster and Scirus, are constantly harvesting HKUST’s IR metadata via this protocol. 12 Indexed by OAIster OAIster is a project of the University of Michigan Digital Library Production Service. By using OAI protocol, it has collected almost 10 million of metadata records of academically-oriented digital resources from 680 institutions around the world. The Library contacted OAIster in June 2003 and since then the Repository records are included in OAIster. HKUST research output is therefore available to all OAIster users via its one-stop searching interface. Full-text Searching on Scirus Scirus is Elsevier’s free search engine for scientific information. In addition to web pages, it also harvests content from selective institutional repositories. In November 2005, Scirus proposed to index the HKUST Institutional Repository. 
The project involved building the mechanism to harvest the content of the Repository, indexing both the metadata and full-text of the documents, making them searchable on the Scirus platform, and integrating the Scirus search form within the Repository home page. This feature was rolled out in May 2006. Thanks to Scirus, the Library is able to offer full-text searching external to DSpace as well as to open up the content to a larger scientific research community. Crawling by Google and Yahoo Robots from search engines are allowed to visit and crawl the web pages of the Repository. By enabling robot access, HKUST’s research output is readily available via popular search engines such as Google and Yahoo, as well as their subsets, such as Google Scholar. The following story, as told by the Library’s reference librarians, shows the effectiveness of using these search engines to discover documents in the Repository: “Once, we received an email from someone in the U.K. who wanted to contact the author of a PhD thesis. It turned out that the requestor was the father of a son suffering from a type of cancer called Ewing Sarcoma. He discovered the thesis on the IR via the web. We acted as the intermediary and passed his enquiry to the author concerned.” (Kwok, Chan, Wong 2006) Searching with SRW/U SRW/U (Search and Retrieval for the Web, or by URL) is a protocol for searching heterogeneous databases using XML and HTTP. It retains the core functionality of Z39.50 but in the form of web services. With SRW/U, search service providers can broadcast a search to various institutional repositories and deliver the search results in their own GUI interface. To allow such federated searching, the Library implemented the SRW/U layer to the Repository in October 2004, based on OCLC’s SRW/U open source software. The HKIR Experiment Other universities in Hong Kong have also started to build their institutional repositories. There is an emerging need to share IR experiences among them and to collaborate. One of the possibilities is to develop a union repository for scholarly output in Hong Kong. To demonstrate the feasibility of such collaboration and to study the issues involved, the Library developed an experimental system called HKIR (Hong Kong Institutional Repositories) in February 2006. The system is powered by the DSpace software, with OCLC’s OAIHarvester2 software for harvesting OAI metadata. As of September 2006, six collections of ETD and institutional repositories 13 from five Hong Kong universities were created, allowing cross-searching of local scholarly output. A number of issues were identified during the study. Many of them are related to the standardization of metadata description among institutions. These include standardization in author names, subject analysis, document types and metadata schema (Figure 6). Figure 6. Two records of the same article in HKIR (http://lbapps.ust.hk/hkir/), showing different metadata description Another problem related to OAI harvesting was also identified during the study. While DSpace uses qualified Dublin Core as the metadata schema, OAI’s default metadata format oai_dc requires unqualified Dublin Core. As a result, metadata that contains qualifiers, such as identifier.citation, would become identifier after the OAI harvesting. Unless there is revision from the oai_dc schema, local institutions will be required to use a HKIR defined metadata format. 
Conclusions

The open access movement began with the establishment of SPARC to address market dysfunctions in scholarly publishing, followed by the formation of the OAI to promote author self-archiving and interoperable standards. After almost a decade of hard work, the authors see a number of good converging signs: publishers are more supportive of open access, the number of open access journals continues to grow, research funding bodies understand and better embrace open access, and, more importantly, institutional repositories are flourishing to preserve scholarly output and make it openly accessible.

Installing IR software such as DSpace is straightforward, but tailoring the software and setting up policies and procedures to make it work effectively in one's institutional environment are uphill tasks. Even more difficult is the effort needed to recruit content. IR providers need to continue to educate researchers about the IR and encourage them to deposit their research in the Repository. They also need to campaign for government support. While more and more institutions in Asia are beginning to develop their own repositories, the authors see the need for experience sharing, collaboration and standardization. HKUST is an early implementer of institutional repositories in Asia, and its unique experience will be useful to similar institutions in this region.

References

Chan, D. (2004a). "Managing the challenges: acquiring content for the HKUST Institutional Repository", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/1973 (accessed September 28, 2006).

Chan, D. (2004b). "Strategies for acquiring content: experiences at HKUST", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/1974 (accessed September 28, 2006).

Chan, D., Kwok, C. and Yip, S. (2005). "Changing roles of reference librarians: the case of HKUST Institutional Repository", Reference Services Review, v. 33, no. 3, pp. 268-282, available at http://hdl.handle.net/1783.1/2039 (accessed September 28, 2006).

Kwok, C., Chan, D. and Wong, G. (2006). "From idea to reality: building the HKUST Institutional Repository", University Library Journal, v. 10, no. 1, March, available at http://hdl.handle.net/1783.1/2528 (accessed September 28, 2006).

Lam, K.T. (2004). "DSpace in action: implementing the HKUST Institutional Repository system", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/2023 (accessed September 28, 2006).

Lam, K.T. (2006).
"Exploring IR technologies", Workshop on Managing Scholarly Assets in Institutional Repositories: Sharing Experiences Among JULAC Libraries, Hong Kong, February 24, 2006, the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/2501 (accessed September 28, 2006). work_dunyieqx5redllqiy3yr6crw6y ---- PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [Lukishova,] On: 28 January 2009 Access details: Access Details: [subscription number 908220816] Publisher Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Modern Optics Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713191304 Organic photonic bandgap microcavities doped with semiconductor nanocrystals for room-temperature on-demand single-photon sources Svetlana G. Lukishova a; Luke J. Bissell a; Vinod M. Menon b; Nikesh Valappil b; Megan A. Hahn c; Chris M. Evans c; Brandon Zimmerman a; Todd D. Krauss c; C. R. Stroud Jr a; Robert W. Boyd a a The Institute of Optics, University of Rochester, Rochester, NY, USA b Department of Physics, Queens College-CUNY, Flushing, NY, USA c Department of Chemistry, University of Rochester, Rochester, NY, USA First Published on: 23 January 2009 To cite this Article Lukishova, Svetlana G., Bissell, Luke J., Menon, Vinod M., Valappil, Nikesh, Hahn, Megan A., Evans, Chris M., Zimmerman, Brandon, Krauss, Todd D., Stroud Jr, C. R. and Boyd, Robert W.(2009)'Organic photonic bandgap microcavities doped with semiconductor nanocrystals for room-temperature on-demand single-photon sources',Journal of Modern Optics, To link to this Article: DOI: 10.1080/09500340802410106 URL: http://dx.doi.org/10.1080/09500340802410106 Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material. http://www.informaworld.com/smpp/title~content=t713191304 http://dx.doi.org/10.1080/09500340802410106 http://www.informaworld.com/terms-and-conditions-of-access.pdf Journal of Modern Optics 2009, 1–8, iFirst Organic photonic bandgap microcavities doped with semiconductor nanocrystals for room-temperature on-demand single-photon sources Svetlana G. Lukishova a*, Luke J. Bissell a , Vinod M. Menon b , Nikesh Valappil b , Megan A. Hahn c , Chris M. Evans c , Brandon Zimmerman a , Todd D. Krauss c , C.R. Stroud Jr a and Robert W. 
Boyd (a)

(a) The Institute of Optics, University of Rochester, Rochester, NY, USA; (b) Department of Physics, Queens College – CUNY, Flushing, NY, USA; (c) Department of Chemistry, University of Rochester, Rochester, NY, USA
*Corresponding author. Email: sluk@lle.rochester.edu

(Received 7 February 2008; final version received 14 August 2008)

We report the first experimental observation of fluorescence from single semiconductor nanocrystals (colloidal quantum dots) in microcavities. In these room-temperature experiments we observed photon antibunching from single CdSe nanocrystals doped into a chiral one-dimensional photonic bandgap liquid-crystal microcavity. The chirality resulted in high-purity, circular polarization of definite handedness of the emitted single photons. We also report the fabrication of chiral microcavities for telecom wavelengths doped with PbSe nanocrystals, as well as a solution-processed-polymer microcavity with a defect layer doped with CdSe nanocrystals between two distributed Bragg reflectors. These systems, with their low host fluorescence background, are attractive for on-demand single-photon sources for quantum information and communication.

Keywords: single-photon sources; colloidal quantum dots; microcavities

1. Introduction

The development of on-demand single-photon sources (SPSs) with photons exhibiting antibunching has recently been of significant interest for their applications in quantum cryptography [1-4]. A desirable feature for a SPS is photon polarization, since in the case of single photons with definite polarization, the quantum cryptography system's efficiency will be twice that of an unpolarized SPS.

Room-temperature SPSs based on colloidal semiconductor quantum-dot (QD) fluorescence [5,6] are very promising because of higher QD photostability at room temperature than that of conventional dyes, and relatively high quantum yield (up to ~100%). Recently, electrically driven light emission from a single colloidal QD at room temperature was obtained [7], opening up the possibility for electrical pumping of an SPS on demand, based on colloidal QDs. At the same time, colloidal QDs can be dissolved/dispersed in various hosts, e.g. photonic crystal microcavities, from solution. Placed into a microcavity environment, the QD spontaneous emission rate can be enhanced via the Purcell effect [4]. The cavity geometry can also influence QD polarization.

In spite of the well-developed microcavity fabrication techniques [8], and the wide use of these techniques for heterostructured QDs operating at cryogenic temperatures, e.g. [4,9], there are only a few reports of doping colloidal QDs into microcavities. For instance, Poitras et al. [10] demonstrated spontaneous emission enhancement by a factor of 2.7 from CdSe QDs embedded in a half-wavelength one-dimensional cavity sandwiched between two distributed Bragg reflectors (DBR). The DBRs were prepared using sputter deposition of TiO2-SiO2 quarter-wavelength-thick layers. Kahl et al. [11] started with the same approach, but also used a focused ion beam to etch micropillar microcavities of both round and elliptical cross-section from a planar cavity. Martiradonna et al. [12] combined the DBR technique, prepared by e-beam evaporation of TiO2-SiO2 layers, with imprint lithography of a defect layer. Lodahl et al. [13] embedded CdSe QDs into a titania inverse opal photonic crystal and observed both inhibition and enhancement of decay rates. Fushman et al. [14] mapped cavity resonances of PbS QDs in an AlGaAs membrane using QD fluorescence. Wu et al.
[15] achieved efficient coupling of 1.5-µm emission from PbSe QDs to a Si-based photonic-crystal-membrane microcavity with a Purcell factor of 35. In [16], Bose et al. reported weak coupling of PbS QDs with the same Si-based microcavity. In addition, Hoogland et al. [17] reported PbS QD 1.5-µm lasing of whispering gallery modes in a microcapillary resonator, and Barrelet et al. [18] studied CdS QDs in nanowire one-dimensional photonic crystal structures.

It should be noted that no data exist in the literature on the fluorescence of room-temperature, single colloidal QDs embedded in the microcavity environment, nor on preserving fluorescence antibunching of single colloidal QDs in the cavity materials. All results on SPSs based on QDs in microcavities, see, e.g. [4,9], are reported for heterostructured QDs at cryogenic temperatures.

We describe here the results of doping single colloidal semiconductor QDs in two types of organic 1-D photonic bandgap microcavities which permit us to use single colloidal QDs directly from a solution: (1) a chiral microcavity made of monomeric and/or oligomeric planar-aligned cholesteric liquid crystals (CLCs) [19-21] and (2) a polymer microcavity with a defect layer between two DBRs [22-23]. These two types of microcavity are robust and simple in fabrication. We report the first observation of single-emitter circularly polarized fluorescence of definite handedness. The polarization selectivity is produced by a chiral microcavity. We also observed fluorescence antibunching from a colloidal CdSe QD in this microcavity.

Earlier, we achieved both fluorescence antibunching [19-21] and definite linear polarization of single dye molecule fluorescence at room temperature in planar-aligned nematic liquid crystal (LC) hosts. This method was applied to anisotropic dye molecules which can be aligned by rod-like nematic LC molecules [21,24]. A chiral microcavity provides circularly polarized fluorescence for emitters even without dipole moments. Such a microcavity can be prepared from CLCs for any optical wavelength. We also present here chiral microcavities for PbSe QDs with a fluorescence maximum at 1.5 µm, prepared both from monomeric (fluid-like) and oligomeric glassy CLCs. By imaging the fluorescence of QDs in a 1.5-µm microcavity as well as in a polymer DBR microcavity with a defect layer, we also demonstrate their potential for SPS device applications.

Section 2 of this paper describes the experimental setup for fluorescence imaging, antibunching and polarization measurements. Section 3 is devoted to chiral microcavities made of CLCs doped with single colloidal CdSe, CdSeTe or PbSe QDs. It contains details on sample preparation, photonic-bandgap transmission curves matched with QD fluorescence at 580 nm, 700 nm and 1.5 µm, circularly polarized fluorescence, and antibunching measurements. Section 4 describes the preparation of a polymer microcavity with a defect layer doped with single CdSe QDs between the DBRs, and single-QD fluorescence imaging in this structure.

2. Experimental setup

The experimental setup consists of a home-built confocal fluorescence microscope based on a Nikon TE2000-U inverted microscope with several output ports. Figure 1 shows the abbreviated schematics of our experiment for fluorescence imaging and antibunching measurements (a), and polarization and spectral measurements (b).

Figure 1. Schematics of the experimental setup for (a) fluorescence imaging and antibunching measurements and (b) spectral and polarization measurements. Abbreviations: neutral density (ND); single-photon counting avalanche photodiode module (APD); beamsplitter (BS). (The color version of this figure is included in the online version of the journal.)

We excite our samples with 76 MHz repetition-rate, 6 ps pulse-duration, 532-nm light from a Lynx mode-locked laser (Time-Bandwidth Products Inc.). To obtain a diffraction-limited spot on the sample, the excitation beam is expanded and collimated by a telescopic system with a spatial filter. The samples are placed in the focal plane of a 1.3-numerical
Figure 1 shows the abbreviated schematics of our experiment for fluorescence imaging and anti- bunching measurements (a) and polarization and spectral measurement (b). We excite our samples with 76 MHz repetition-rate, 6 ps pulse duration, 532-nm light from a Lynx mode- locked laser (Time-Bandwidth Products Inc.). To obtain a diffraction-limited spot on the sample, the excitation beam is expanded and collimated by a telescopic system with a spatial filter. The samples are placed in the focal plane of a 1.3-numerical A P D 1 APD 2 50/50 BS Dichroic mirror Objective lens Single emitters in host (a) (b) ND filters Filters Microscope cover slips Pump Variable delay TimeHarp 200 PC Dichroic mirror Objective lens Single emitters in host ND filters Filters Microscope cover slips Pump λ/4 Spectrometer PC Coupling lens Fiber Glan Thompson Polarizer Figure 1. Schematics of experimental setup for fluorescence imaging and antibunching measurements (a); spectral and polarization measurements. (b) We used the following abbreviations: neutral density (ND); single-photon counting avalanche photodiode module (APD); beamsplitter (BS). (The color version of this figure is included in the online version of the journal.) 2 S.G. Lukishova et al. D o w n l o a d e d B y : [ L u k i s h o v a , ] A t : 0 4 : 0 4 2 8 J a n u a r y 2 0 0 9 aperture, oil-immersion microscope objective used in confocal reflection mode. In focus, the intensities used are of the order of several kW cm �2 . Residual transmitted excitation light is removed by a dichroic mirror and a combination of two interference filters yielding a combined rejection of nine orders of magnitude at 532 nm. The sample’s holder is attached to a piezoelectric, XY translation stage providing a raster scan of the sample through an area up to 50 mm�50 mm. The following diagnostics are placed in the separate output ports: (1) A Hanbury Brown Twiss interferometer con- sisting of a 50/50 beamsplitter cube and two cooled, Si single photon counting avalanche photodiode modules (APDs) SPCM AQR-14 (Perkin Elmer). The time interval between two consecutively detected photons in separate arms is measured by a TimeHarp 200 time correlated single-photon counting card using a conventional start-stop protocol. (2) Electron multiplying, cooled CCD-camera iXon DV 887 ECS-BV (Andor Technologies). (3) Fiber-optical spectrometer (Ocean Optics). The method of defining the dissymmetry of circular polarization is described in [25]. For circular- polarization measurements both an achromatic quarter waveplate and a Glan Thompson linear polarizer on rotating mounts are placed in the spectrometer port in front of the spectrometer. An area with several single QDs is selected by an EM-CCD camera and/or APD-detectors with a raster-scan of the sample. It is imaged to the spectrometer input fiber. The spectra are recorded with a 6-s accumulation time with background subtraction. CdSe/ZnS core/shell QDs were obtained commercially from Evident Technologies (maximum fluorescence wavelength �o¼620 nm) or were synthesized according to published methods (�o¼580 nm) [26,27]. CdSeTe QDs (�o¼700 nm) were obtained commercially from Invitrogen. PbSe/C18H34O2 QDs (�o¼1.5 mm) contain- ing PbSe QDs of smaller-size (�o�900 nm) were synthesized according to variations of literature methods [28,29]. 3. Chiral microcavity made of LCs doped with single colloidal QDs 3.1. 
3. Chiral microcavity made of LCs doped with single colloidal QDs

3.1. CLC sample preparation

In a planar-aligned CLC, the rod-shaped anisotropic molecules with small chiral 'tails' form a periodic helical structure with pitch p [30]. For sufficiently thick CLC layers, the reflectance of normally incident, circularly polarized light with the same handedness as the CLC structure is nearly 100% within a band centered at λc = nav·p. The bandwidth is approximately Δλ = λc·Δn/nav, where nav is the average of the ordinary (no) and extraordinary (ne) refractive indices of the medium, nav = (no + ne)/2, and Δn = ne − no. This periodic structure can also be viewed as a 1-D photonic crystal, with a bandgap within which propagation of light is forbidden. For emitters located within this structure, the spontaneous emission rate is suppressed within the spectral stopband and enhanced near the band edge [31,32]. Lasing experiments in dye-doped CLC structures with high dopant concentration [32] confirmed that the best condition for coupling is when the dopant fluorescence maximum is at a band edge of the CLC selective transmission curve.

For sample preparation we use two types of LCs: (1) monomeric mixtures of the low-molecular-weight E7 nematic-LC blend with the chiral additive CB15, and (2) oligomer CLC (OCLC) powders [19-20]. E7 and CB15 are fluids at room temperature. Both materials were supplied by EM Industries. We filtered the E7 and CB15 to remove fluorescent contaminants. Powder OCLCs were supplied by Wacker GmbH.

For the development of CLC hosts which form a chiral photonic bandgap tuned to the QD fluorescence band, two main aspects are important: (1) properly choosing the concentration (or ratio) of the different LC components and (2) providing planar alignment of the CLC. For the monomeric mixtures, the stopband position λc of the photonic bandgap is defined roughly by C = nav/(λc·HTP), where C is the weight concentration of CB15 in the CB15/E7 mixture, nav ≈ 1.6 for this mixture, and HTP ≈ 7.3 µm^-1 is the helical twisting power of the chiral additive in the nematic liquid crystal. The actual stopband position relative to the fluorescence maximum of the QD was further defined empirically by obtaining selective transmission curves of different samples using a spectrophotometer. After monomeric CLC preparation, a QD solution of ~1 nM concentration was mixed with the monomeric CLC and the solvent was evaporated.

For the OCLC powders there is no such simple relation between the concentration and λc. We found the right ratio R of components only empirically, by mixing different ratios of two OCLCs with different λc (1.17 µm and 2.15 µm) dissolved in a solvent. By evaporating the solvent using a procedure of heating this solution in a vacuum inside a rotating retort, we obtained a new powder oligomer with an intermediate λc.

After that, the monomeric CLC doped with QDs is placed between two cover glass slips and planar-aligned through uni-directional mechanical motion between the two slides. For planar alignment of the OCLC, a cover slip with Wacker powder is placed on a hot plate and melted at ~120 °C. The second slip is used to shear the melted oligomer at this temperature. After the alignment the sample is slowly cooled into the glassy state, preserving CLC order and planar alignment [20]. For single-molecule fluorescence experiments, Wacker powders need to be purified.
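A worked check of the monomeric concentration rule (our arithmetic, not the authors', using the stated n_av ≈ 1.6 and HTP ≈ 7.3 µm^-1): to center the stopband near λ_c ≈ 0.60 µm, so that the 580 nm CdSe emission sits near the short-wavelength band edge, requires

C = \frac{n_{\mathrm{av}}}{\lambda_c \, \mathrm{HTP}} = \frac{1.6}{0.60\,\mu\mathrm{m} \times 7.3\,\mu\mathrm{m}^{-1}} \approx 0.365,

i.e., a CB15 weight fraction of roughly 36-37%, matching the 36.6% mixture described in Section 3.2 below.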
For further details of CLC doping and sample preparation from both monomeric and oligomeric LCs with different photonic bandgaps, see [19,20].

3.2. 1-D photonic bandgap transmission curves

By properly choosing the concentration of different LC monomers and/or Wacker glassy oligomers and providing planar alignment of the LCs, we developed CLC chiral 1-D photonic bandgap structures with different stopband positions doped with single QDs (either CdSe, CdSeTe or PbSe). In both cases of monomeric and oligomeric liquid crystals, increasing the concentration (or ratio) of a component with higher HTP shifts the position of the stopband toward shorter wavelengths. The error in defining weight concentrations can smear this effect for mixtures with concentrations close to one another. The stopband positions are tuned to the QD fluorescence bands for the visible (Figure 2) and ~1.5 µm (Figure 3).

Figure 2 shows the selective transmission of two monomeric 1-D chiral photonic bandgap structures with 36.6% (a) and 36.0% (b) weight concentrations of the chiral additive CB15 in E7/CB15 mixtures. It also shows the fluorescence spectra of the CdSe (a) and CdSeTe (b) QDs, with the centers of the fluorescence peaks near 580 and 700 nm.

Figure 2. Selective transmission of two monomeric chiral photonic bandgap CLC hosts for right-handed circularly polarized light and the fluorescence spectrum of the CdSe (a) and CdSeTe (b) QDs. (The color version of this figure is included in the online version of the journal.)

Figure 3. PbSe QD fluorescence spectrum and selective transmission of chiral photonic bandgap cholesteric microcavities for the 1.5 µm telecom wavelength for unpolarized light [for circularly polarized light the minimum transmission value will be smaller than 5% (see Figure 2, where measurements were made with a linear polarizer and achromatic quarter waveplate)]. (a) For monomeric liquid crystals with different concentrations C of the chiral additive CB15 in the CB15/E7 mixture (Ca = 16.3%, Cb = 16.0%, Cc = 14.4%); (b) for Wacker chiral oligomeric powder mixtures with different ratios R of the λc = 1.17 µm powder to the λc = 2.15 µm powder (Ra = 1:0, Rb = 1:0.24, Rc = 1:1). (The color version of this figure is included in the online version of the journal.)

Figure 3 shows spectral transmission curves for several prepared photonic bandgap structures with the band edge at 1.5 µm made of monomeric CLC (left) and OCLC (right). The fluorescence spectrum of a PbSe QD solution at high QD concentration is depicted in both figures.
3.3. Circularly polarized fluorescence and antibunching in a chiral microcavity

Figure 4(a) shows several single CdSe/ZnS QD fluorescence images in a CLC host with the same QD fluorescence spectrum as presented in Figure 2(a). The spectral transmission of this CLC photonic bandgap structure is similar to the spectral transmission curve also shown in Figure 2(a). Note that the dark horizontal stripes in the pattern of separate QDs are the result of single-QD blinking, which is a characteristic property of single-QD fluorescence. The raster scan area is 15 µm x 15 µm. Figure 4(b) shows single PbSe QD fluorescence images in a CLC photonic bandgap structure with the spectral transmission curve (a) in Figure 3(a). We used the solution containing both 1.5 µm emitting QDs and a small concentration of 900 nm emitting QDs as a result of synthesis. Using Si APDs we are only able to record QD fluorescence from the smaller-size QDs with λ0 ≈ 900 nm.

Figure 5(b) shows emission spectra for CdSe/ZnS QDs in a chiral CLC cavity for right-handed (black line) and left-handed (red line) circular polarizations. The spectral transmission of the cavity is presented in Figure 2(a). The degree of circular polarization is measured by the dissymmetry factor g_e [25,33]:

g_e = \frac{2(I_L - I_R)}{I_L + I_R},   (1)

where I_L and I_R are the intensities of the left-handed and right-handed circular polarizations. At 580 nm, g_e = −1.6. For unpolarized light g_e = 0.

The fluorescence spectrum of the same CLC microcavity without QDs is depicted in Figure 5(a). The nature of the spectral peaks which were observed in the CLC cavity without QDs is unclear. It is not a microcavity effect, because we observed the same features from unaligned CLC without a microcavity. It can be attributed to some impurities which we did not remove during the LC purification procedure.

We also illuminated a single CdSe QD in the CLC host and measured the fluoresced photon statistics under saturation conditions. Figure 6(a) presents the g(2)(t) histogram at different interphoton times t. The value of g(2)(0) is 0.76 ± 0.04. One sees that the peak at zero interphoton time is clearly smaller than any of the other peaks, which shows the antibunching property. This antibunching histogram can be improved by using QDs which fluoresce outside the fluorescence spectrum of the CLC shown in Figure 5(a). One can easily see a host fluorescence peak at the fluorescence maximum of the selected QD. At wavelengths larger than ~700 nm, no host background is observed.

Figure 4. Typical confocal fluorescence images of single CdSe QDs in a CLC host (a) (15 µm x 15 µm scan), and single PbSe QDs in a CLC host (25 µm x 25 µm scan) (b). (The color version of this figure is included in the online version of the journal.)

Figure 5. Fluorescence spectrum of the monomeric CLC host without QDs (a) and of CdSe QDs in a similar CLC host (Figure 2(a)) for two different circular polarizations of single photons (b): black line, right-handed; red line, left-handed. (The color version of this figure is included in the online version of the journal.)
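To put the measured dissymmetry in more intuitive terms (our rearrangement of Equation (1)), g_e = −1.6 corresponds to the intensity ratio

\frac{I_R}{I_L} = \frac{2 - g_e}{2 + g_e} = \frac{2 + 1.6}{2 - 1.6} = 9,

i.e., the dominant circular component is nine times stronger than the opposite one (|g_e| = 2 would correspond to perfectly circular emission).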
We doped CdSeTe QDs with λ0 = 700 nm into the CLC host with the stopband shown in Figure 2(b), and, when illuminating a single QD, obtained antibunching with g(2)(0) ≈ 0 (Figure 6(b)). Note that the QD fluorescence maximum is outside of the CLC background spectrum. This shows that excluding the CLC background helps to obtain better antibunching. The fitted curve g(2)(t) = 1 − (1/N)e^(−t/τ) [34] gives g(2)(0) = 0.001 ± 0.034, with a fluorescence lifetime τ ≈ 15 ns and with the number N of illuminated emitters equal to 1. This quantum dot has τ larger than the time between two laser pulses, so we cannot observe fluorescence excited by the separate laser pulses as in Figure 6(a).

Figure 6. Histograms of coincidence counts of single-QD fluorescence in a CLC host under pulsed excitation. The dip at zero interphoton time indicates antibunching. (a) For the CdSe QD of Figure 2(a), with λ0 inside the liquid crystal background; (b) for the CdSeTe QD of Figure 2(b), with λ0 outside the liquid crystal background. (The color version of this figure is included in the online version of the journal.)

Estimation of the efficiency P of on-demand polarized antibunched photon emission into the collecting objective showed P ≈ 10%, with the second-order correlation function g(2)(0) = 0.8 measured from the antibunching histogram of Figure 6(a), using 5.2 µW excitation power. We define P from the following equation:

[1 − g^{(2)}(0)]·N_out = N_incQD·η·ξ·Q_APD·G·P,   (2)

where N_out = 2 x 10^5 counts s^-1 is the photon count rate measured by the APDs and N_incQD = N_inc·σ_abs = (I/hν)·σ_abs = 7.17 x 10^6 photons s^-1 is the number of photons incident on the quantum dot per second. Here I is the measured incident intensity in the focal area of the sample (I = 2.23 kW cm^-2), hν = 3.73 x 10^-19 J for 532-nm light, and σ_abs = 1.2 x 10^-15 cm^2 is the absorption cross-section of the quantum dot at 532 nm taken from the measurements of Leatherdale et al. [35]. The parameters of Equation (2) defining losses in the microscope detection system are as follows: η = 0.49 is the measured transmission of all interference and colored glass filters in front of the APDs (preventing cross-talk between them), ξ = 0.48 is the measured transmission of the objective, microscope optics, imaging lenses and nonpolarizing beamsplitter, and Q_APD = 0.58 is the quantum efficiency of the APD at 580 nm as quoted by the vendor. The parameter G = 0.4 is the measured CdSe/ZnS QD quantum yield. The value of P characterizes the cavity (collection efficiency from the source into the collecting objective), and the value of GP characterizes both the cavity and the fluorescent emitter together. In our measurements for the on-demand polarized antibunched photon source, P ≈ 10% and GP ≈ 4%. The value of GP can be increased to 10% by using a quantum dot with G ≈ 1 (see, e.g., the experimental paper [36]).
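Substituting the quoted values back into Equation (2) reproduces the stated collection efficiency (our arithmetic):

P = \frac{[1 - g^{(2)}(0)]\,N_{\mathrm{out}}}{N_{\mathrm{incQD}}\,\eta\,\xi\,Q_{\mathrm{APD}}\,G} = \frac{0.2 \times 2\times 10^{5}}{7.17\times 10^{6} \times 0.49 \times 0.48 \times 0.58 \times 0.4} \approx 0.10,

where the factor 0.2 is 1 − g^{(2)}(0) with g^{(2)}(0) = 0.8.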
The 1-D microcavity is formed by sandwiching a �/nPVK thickness PVK defect layer doped with single QDs between two such DBRs. The top and bottom DBRs both comprise 10 periods. QDs are mixed with a defect layer PVK polymer solution and spin coated 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 −80 −60 −40 −20 0 20 Interphoton times (ns) 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1(a) (b) −80 −60 −40 −20 0 20 Interphoton times (ns) g (2 ) ( t) g (2 ) ( t) Figure 6. Histograms of coincidence counts of single-QD fluorescence in a CLC host under pulsed excitation. The dip at zero interphoton time indicates antibunching. (a) For the CdSe QD of Figure 2(a) with �0 inside the liquid crystal background; (b) for CdSeTe QD of Figure 2(b) with �0 outside the liquid crystal background. (The color version of this figure is included in the online version of the journal.) 6 S.G. Lukishova et al. D o w n l o a d e d B y : [ L u k i s h o v a , ] A t : 0 4 : 0 4 2 8 J a n u a r y 2 0 0 9 onto the bottom DBR structure (QD concentration in solution is �7.7 nM). Figure 7(a) shows the reflectivity of the whole structure (DBRs with a defect layer between them) with the cavity mode at �620 nm. The quality factor (Q) of such a microcavity was found to be �40. The theoretical estimate of Q for the polymer distributed Bragg mirrors to form a microcavity structure is within the same order of magnitude (�60). This discrepancy is attributed to lower reflectivities of the actually fabricated DBR mirrors than anticipated by the theory using the ideal mirrors. The inset of Figure 7(a), shows the normalized reflectivity spectra of the DBR with 10 periods without any defect layer. Fluorescence imaging of single QDs in a DBR structure with a defect layer is shown in Figure 7(b). This raster scan image shows blinking of single QDs (bright horizontal stripes) and a low host fluorescence background, making these structures suitable for single photon source applications. It was reported earlier for a high concentration of CdSe/ZnS QDs [22,23] that the photoluminescence decay time � of the QDs in a microcavity was �150 ps. For QDs on a bare glass substrate or embedded in PVK without a bandgap structure � was found to be �1000 ps and �400 ps, respectively [22,23]. It should be noted that this type of microcavity provides 1-D light confinement so that light can propagate in a lateral direction. Preparation of a micropillar microcavity from the current 1-D microcavity using focused ion beam etching is in progress. Etching with a similar 1-D microcavity was carried out in [11]. An elliptical cross-section micro- pillar will provide linear polarized emission of desired direction from single colloidal QDs. In contrast to [10], in which DBRs and a defect layer doped with colloidal QDs between them are prepared by sputtering alternating SiO2 and TiO2 layers, the process described here is more simple and robust and compatible with the standard solution processing technique of colloidal QDs. The technique of sputtering and/or thermal evaporation require multiple deposition systems – one for colloidal quantum dots and another for DBRs. In addition, these techniques have detrimental effects on the optical properties of the QDs due to the introduction of surface defects. 5. Conclusion Single semiconductor nanocrystal (colloidal QD) fluorescence in microcavities was studied for the first time. We report the first observation of single-emitter circularly polarized fluorescence of definite handedness due to microcavity chirality. 
5. Conclusion

Single semiconductor nanocrystal (colloidal QD) fluorescence in microcavities was studied for the first time. We report the first observation of single-emitter circularly polarized fluorescence of definite handedness due to microcavity chirality. The chiral microcavities were prepared by a simple method of planar alignment of cholesteric liquid crystals. Antibunching experiments show that the fluorescence background of the medium is low, so that antibunching of QD fluorescence is preserved. 1-D chiral photonic bandgap structures possess an advantage over conventional 1-D photonic bandgap technologies. Because the refractive index n varies gradually rather than abruptly in chiral structures, there are no losses into the waveguide modes which arise from total internal reflection at the border between two consecutive layers with different n. Therefore, it is not necessary to make a micropost structure as in [11] to reject 'leaky' modes in a lateral direction. In addition, liquid crystal microcavities are wavelength-tunable by changing the temperature and applied electric field.

Single-QD fluorescence in a polymer microcavity with a defect layer doped with single QDs between DBRs was studied as well. Such easily prepared and robust organic microcavities using low-fluorescence-background materials can be incorporated in visible and telecom SPSs.

Our next steps will be to increase the efficiency of our on-demand polarized antibunched photon source. We will do this by: (1) selecting QDs with G ≈ 100%, λ0 outside the host fluorescence background, and τ shorter than several ns, and (2) providing strong coupling between a single QD and a cavity by refining the cavity preparation technique. We also plan to demonstrate fluorescence antibunching at the telecom wavelength.

Acknowledgements

The University of Rochester authors acknowledge support by the NSF Awards ECS-0420888 and EHR-0633621. L.J. Bissell thanks the Air Force for a SMART fellowship. The authors thank A. Lieb and L. Novotny for advice and help, Z. Shi and H. Shin for assistance, and J. Dowling for providing better understanding of emitter fluorescence in CLC photonic bandgap structures. The Queens College authors' work was supported partly by the Army Research Office, Short Term Analytical Service Grant at Queens College – CUNY.

References

[1] New J. Phys., special issue, Focus on Single Photons on Demand, 2004, 6.
[2] Kumar, P.; Kwiat, P.; Migdall, A.; Nam, S.W.; Vuckovic, J.; Wong, F.N.C. Quantum Information Processing 2004, 3, 215-231.
[3] Lounis, B.; Orrit, M. Rep. Progr. Phys. 2005, 68, 1129-1179.
[4] Yamamoto, Y.; Santori, Ch.; Vuckovic, J.; Fattal, D.; Waks, E.; Diamanti, E. Progr. Informatics 2005, 1, 5-37.
[5] Lounis, B.; Bechtel, H.A.; Gerion, D.; Alivisatos, P.; Moerner, W.E. Chem. Phys. Lett. 2000, 329, 399-404.
[6] Messin, G.; Hermier, J.P.; Giacobino, E.; Desbiolles, P.; Dahan, M. Opt. Lett. 2001, 26, 1891-1893.
[7] Huang, H.; Dorn, A.; Bulovic, V.; Bawendi, M. Appl. Phys. Lett. 2007, 90, 023110.
[8] Vahala, K.J. Nature 2003, 424, 839-846.
[9] Englund, D.; Fattal, D.; Waks, E.; Solomon, G.; Zhang, B.; Nakaoka, T.; Arakawa, Y.; Yamamoto, Y.; Vuckovic, J. Phys. Rev. Lett. 2005, 95, 013904.
[10] Poitras, C.B.; Lipson, M.; Du, H.; Hahn, M.A.; Krauss, T.D. Appl. Phys. Lett. 2003, 82, 4032-4034.
[11] Kahl, M.; Thomay, T.; Kohnle, V.; Beha, K.; Merlein, J.; Hagner, M.; Halm, A.; Ziegler, J.; Nann, T.; Fedutik, Y.; Woggon, U.; Artemyev, M.; Pérez-Willard, F.; Leitenstorfer, A.; Bratschitsch, R. Nano Lett. 2007, 7, 2897-2900.
[12] Martiradonna, L.; De Giorgi, M.; Troisi, L.; Carbone, L.; Gigli, G.; Cingolani, R.; De Vittorio, M. International Conference on Transparent Optical Networks (ICTON) 2006, 2, 64-67.
[13] Lodahl, P.; Floris van Driel, A.; Nikolaev, I.S.; Irman, A.; Overgaag, K.; Vanmaekelbergh, D.; Vos, W.L. Nature 2004, 430, 654-657.
[14] Fushman, I.; Englund, D.; Vuckovic, J. Appl. Phys. Lett. 2005, 87, 241102.
[15] Wu, Z.; Mi, Z.; Bhattacharya, P.; Zhu, T.; Xu, J. Appl. Phys. Lett. 2007, 90, 171105.
[16] Bose, R.; Yang, X.; Chatterjee, R.; Gao, J.; Wong, C.W. Appl. Phys. Lett. 2007, 90, 111117.
[17] Hoogland, S.; Sukhovatkin, V.; Howard, I.; Cauchi, S.; Levina, L.; Sargent, E.H. Opt. Express 2006, 14, 3273-3281.
[18] Barrelet, C.J.; Bao, J.; Loncar, M.; Park, H.-G.; Capasso, F.; Lieber, C.M. Nano Lett. 2006, 6, 11-15.
[19] Lukishova, S.G.; Schmid, A.W.; McNamara, A.J.; Boyd, R.W.; Stroud Jr, C.R. IEEE J. Selected Topics in Quantum Electron. 2003, 9, 1512-1517.
[20] Lukishova, S.G.; Schmid, A.W.; Supranowitz, C.M.; Lippa, N.; McNamara, A.J.; Boyd, R.W.; Stroud Jr, C.R. J. Mod. Opt. 2004, 51, 1535-1547.
[21] Lukishova, S.G.; Schmid, A.W.; Knox, R.P.; Freivald, P.; McNamara, A.; Boyd, R.W.; Stroud Jr, C.R.; Marshall, K.L. Molec. Cryst. Liq. Cryst. 2006, 454, 403-416.
[22] Valappil, N.V.; Zeylikovich, I.; Gayen, T.; Das, B.B.; Alfano, R.R.; Menon, V.M. MRS Fall Meeting, Boston, 2006, Paper no. M10.1.
[23] Valappil, N.; Luberto, M.; Menon, V.M.; Zeylikovich, I.; Gayen, T.K.; Franco, J.; Das, B.B.; Alfano, R.R. Photon. Nanostruct. 2007, 5, 184-188.
[24] Lukishova, S.G.; Schmid, A.W.; Knox, R.; Freivald, P.; Bissell, L.; Boyd, R.W.; Stroud, C.R. J. Mod. Opt. 2007, 54, 417-429.
[25] Shi, H.; Conger, B.M.; Katsis, D.; Chen, S.H. Liquid Crystals 1998, 24, 163-172.
[26] Murray, C.B.; Norris, D.J.; Bawendi, M.G. J. Am. Chem. Soc. 1993, 115, 8706-8715.
[27] Qu, L.; Peng, Z.A.; Peng, X. Nano Lett. 2001, 1, 333-337.
[28] Murray, C.B.; Sun, S.; Gaschler, W.; Doyle, H.; Betley, T.A.; Kagan, C.R. IBM J. Res. Devel. 2001, 45, 47-55.
[29] Du, H.; Chen, C.; Krishnan, R.; Krauss, T.D.; Harbold, J.M.; Wise, F.W.; Thomas, M.G.; Silcox, J. Nano Lett. 2002, 2, 1321-1324.
[30] Chandrasekhar, S. Liquid Crystals; Cambridge University Press: London, 1977.
[31] Dowling, J.P.; Scalora, M.; Bloemer, M.J.; Bowden, C.M. J. Appl. Phys. 1994, 75, 1896-1899.
[32] Kopp, V.I.; Fan, B.; Vithana, H.K.M.; Genack, A.Z. Opt. Lett. 1998, 23, 1707-1709.
[33] Chen, S.H.; Katsis, D.; Schmid, A.W.; Mastrangelo, J.C.; Tsutsui, T.; Blanton, T.N. Nature 1999, 397, 506-508.
[34] Hollars, C.W.; Lane, S.M.; Huser, T. Chem. Phys. Lett. 2003, 370, 393-398.
[35] Leatherdale, C.A.; Woo, W.-K.; Mikulec, F.V.; Bawendi, M.G. J. Phys. Chem. B 2002, 106, 7619-7622.
[36] Yao, J.; Larson, D.R.; Vishwasrao, H.D.; Zipfel, W.R.; Webb, W.W. Proc. Natl. Acad. Sci. 2005, 102, 14284-14289.
work_dvvcp2w6ubasrihlphrb3kaghy ----

COAR 2019 - Lyon, France

Open Content in or Beyond the Repository Ecosystem: findings from the 2018 OCLC Open Content Survey

Titia van der Werf, Senior Program Officer, OCLC Research

The international context: a global network of libraries (as of 30 December 2018):
- Americas: 10,060 members in 23 countries
- EMEA: 6,050 members in 78 countries
- Asia Pacific: 1,472 members in 20 countries
Scale and accelerate library learning, innovation and collaboration.

OCLC Global Council Program Committee: Debbie Schachter, Chair (ARC); Rupert Schaab (EMEA); Tuba Akbaytürk (EMEA); Kuang-hua Chen (APRC)

Defining the scope
- Not only OA - also other freely available online open content
- Acknowledging the "continuum of openness"
- All library types
- All library services
- All over the world

14 categories of open content services
1. Institutional repository
2. Supporting authors/researchers/teachers
3. Advocacy and policies
4. Publishing
5. Data services
6. Bibliometrics
7. Selecting open content NOT managed by my library
8. Supporting users/instruction/literacy
9. Promoting the discovery of open content
10. Digitizing collections
11. Digital Collections Library
12. Born-digital (legal) deposit/Web-archiving
13. Deep interactions with open content
14. Assessment

SURVEY FINDINGS

Responses by Library Type: University & Research 72%; Vocational and Other Education 8%; Public 8%; Special 7%; National 2%; Other 3%. Total: 705 responses from 82 countries. Research & University: 511 responses from 69 countries.

Chart: OCLC's role in support of libraries' open content activities (shown twice in the original deck; percentages of respondents per service category):

Category (n) | OCLC supports my library's efforts | I see a role for OCLC to support | I do not see a role for OCLC to support
Publishing (n=191) | 7% | 38% | 55%
Supporting users/instructing/digital literacy programs (n=266) | 6% | 40% | 54%
Supporting authors/researchers/teachers (n=252) | 8% | 40% | 53%
Institutional repository (n=274) | 10% | 39% | 50%
Advocacy and policies (n=223) | 5% | 46% | 49%
Data services (n=170) | 10% | 41% | 49%
Digitizing collections (n=237) | 8% | 45% | 46%
Bibliometrics (n=138) | 7% | 49% | 44%
Born-digital (legal) deposit/Web-archive (n=126) | 7% | 50% | 43%
Assessment (n=137) | 9% | 53% | 38%
Selecting open content not managed by the library (n=229) | 17% | 51% | 32%
Digital Collections Library (n=227) | 12% | 56% | 32%
Deep interactions with open content (n=109) | 10% | 63% | 27%
Promoting the discovery of open content (n=268) | 19% | 59% | 21%

TENTATIVE CONCLUSIONS FROM THE SURVEY FINDINGS

Tentative conclusions
- Libraries are mostly invested in Open Content
activities relating to:
  - Research support
  - Digital Libraries
  where they are more confident to achieve impact.
- Open Content activities relating to Discovery seem to be suffering from a lack of resources, unclear funding and priorities.

THANK YOU
Titia van der Werf
This work is licensed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0/

work_e5xvewn6qfhh7iwkbwmjxqw374 ----

SAINS TANAH - Journal of Soil Science and Agroclimatology, 15(2), 2018
Available online at: https://jurnal.uns.ac.id/tanah
STJSSA, p-ISSN 1412-3606, e-ISSN 2356-1424, parent DOI: 10.15608/stjssa

EDITORIAL TEAM

EDITOR IN CHIEF
Dr. Sudadi, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia; Scopus Author ID: 56530847500, h-index: 1

ASSOCIATE EDITOR
Dr. Komariah, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia; Scopus Author ID: 48661102400, h-index: 1
Dr. Dwi Priyo Ariyanto, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia

EDITORIAL BOARD
Prof. Masateru Senge, Gifu University, Gifu, Japan; Scopus Author ID: 9845622100, h-index: 3
Prof. Dr. Jusop Shamshuddin, Universiti Putra Malaysia, Malaysia; Scopus Author ID: 6601937415, h-index: 15
Prof. Vita Ratri Cahyani, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia; Scopus Author ID: 6507383077, h-index: 8
Prof. Dedik Budianta, Soil Science Department, Faculty of Agriculture, Sriwijaya University, Indonesia; Scopus Author ID: 6506086436
Prof. Irwan Sukri Banuwa, University of Lampung, Indonesia; Scopus Author ID: 6507383077
Dr. Susilo Hambeg Poromarto, Faculty of Agriculture, Universitas Sebelas Maret, Surakarta, Indonesia; Scopus Author ID: 6507601450, h-index: 2
Dr. Supyani, Faculty of Agriculture, Universitas Sebelas Maret, Surakarta, Indonesia; Scopus Author ID: 6507892928, h-index: 3
Dr. Supriyadi, Indonesian Society for Microbiology (PERMI), Indonesia; Scopus Author ID: 56544601700
Dr. Sulakhudin, Department of Soil Science, Faculty of Agriculture, University of Tanjungpura, Indonesia; Scopus Author ID: 56888768200
Dr. Benito Heru Purwanto, Faculty of Agriculture, Gadjah Mada University, Indonesia; Scopus Author ID: 6507171824, h-index: 4
Dr. Dwi Setyawan, Universitas Sriwijaya, Palembang, Indonesia; Scopus Author ID: 36704620200
Dr. Jauhari Syamsiyah, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia; Scopus Author ID: 57192415870
Dr. MMA Retno Rosariastuti, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia

ASSISTANT EDITOR
Sidik Pramono, Faculty of Agriculture, Universitas Sebelas Maret, Indonesia

EDITORIAL OFFICE
Department of Soil Science, Faculty of Agriculture, Universitas Sebelas Maret, Jl. Ir. Sutami 36A Kentingan Surakarta, Jawa Tengah 57126, Indonesia. Phone/Fax: +62-271-632477; Email: sainstanah@uns.ac.id, sainstanahuns@gmail.com

BANK
BNI 46 Cab. Sebelas Maret Surakarta; Acc: SUDADI; Acc. No.
0448109689

AIMS AND SCOPE

SAINS TANAH publishes the results of research and study in soil science and agroclimatology and related fields, including:
- Soil physics and conservation
- Soil chemistry and fertility
- Soil biology and biotechnology
- Clay mineralogy
- Plant nutrient
- Pedogenesis
- Geology and mineralogy
- Soil survey and classification
- Soil reclamation and remediation
- Agroclimatology
- Environment

INDEXING AND ABSTRACTING

Sains Tanah has been registered in the OAI database. This journal is indexed by:
- Google Scholar: http://scholar.google.co.id/citations?user=WhtN6IR7DMQC
- Indonesian Publication Index: http://portalgaruda.org/?ref=browse&mod=viewjournal&journal=5909
- Directory of Open Access Journals: https://doaj.org/toc/2356-1424
- SHERPA RoMEO self-archiving policy: http://www.sherpa.ac.uk/romeo/issn/1412-3606/
- Bielefeld Academic Search Engine: http://www.base-search.net/Search/Results?q=dccoll:ftunimssurakarta&refid=dclink
- OCLC WorldCat: http://www.worldcat.org/search?q=http%3A%2F%2Fjurnal.fp.uns.ac.id%2Findex.php%2Ftanah%2Foai+tanah&qt=results_page
- ASEAN Citation Index (ACI): https://www.asean-cites.org/
- PKP Index: https://index.pkp.sfu.ca/index.php/browse/index/2250
- SINTA (Science and Technology Index): http://sinta.ristekdikti.go.id/journals/detail/?id=1067

SAINS TANAH has been ACCREDITED with "B" grade by the Ministry of Research, Technology and Higher Education (RistekDikti) of the Republic of Indonesia in Director Decree No. 51/E/KPT/2017, December 4, 2017, effective until 2022.

TABLE OF CONTENTS

1. Editorial Team ... i
2. Editorial Office ... i
3. Bank ... i
4. Aim and Scope ... ii
5. Indexing and Abstracting ... ii
6.
Table of Contents ... iii

RESEARCH
7. STUDYING THE SOLUBILITY, AVAILABILITY, AND UPTAKE OF SILICON (SI) FROM SOME ORE MINERALS IN SANDY SOIL (Rama T. Rashad and Rashad A. Hussien) ... 69-82
8. ENHANCING CHROMIUM PHYTOSTABILIZATION USING CHELATOR (Agrobacterium sp. I26 AND MANURE) TO SUPPORT GROWTH AND QUALITY OF RICE (Oryza sativa L.) (Riani Dwi Utari, Mohammad Masykuri, and Retno Rosariastuti) ... 83-92
9. RATOON SYSTEMS IN TIDAL LOWLAND: STUDY OF GROUND WATER DYNAMICS AND THE CHANGE OF NUTRIENT STATUS ON RICE GROWTH (Momon Sodik Imanudin, Bakri, and Raina Jelita) ... 93-103
10. NUTRIENT RELEASE PERFORMANCE OF STARCH COATED NPK FERTILIZERS AND THEIR EFFECTS ON CORN GROWTH (Nur Izza Faiqotul Himmah, Gunawan Djajakirana, and Darmawan) ... 104-114
11. THE EFFECTS OF INORGANIC FERTILIZER AND MINERAL LEUCITE RESIDUES ON K UPTAKE AND MAIZE YIELDS (Zea mays L.) IN OXISOLS (Sri Hartati, Slamet Minardi, Wiwik Hartatik, and Isna Luthfa Haniati) ... 115-122
12. LOCAL AIR AND SOIL TEMPERATURE MODELING USING HIMAWARI 8 SATELLITE IMAGERY (Adhia Azhar Fauzan, Komariah, Sumani, Dwi Priyo Ariyanto, and Tuban Wiyoso) ... 123-133

SHORT COMMUNICATION
13. SOIL CARBON TRANSITIONS SUPPORTING CLIMATE CHANGE MITIGATION (Kurniatun Hairiah) ... 134-139

14. Cover Page and Author Guidelines ... App. 1
15. Publishing Ethical Statement ... App. 6
16. Competing Interests Form ... App. 8
17. Declaration of Competing Interests ... App. 9
18. Publication Ethics and Malpractice Statement ... App. 10
19. Guidelines for Reviewers ... App. 13
20. Acknowledgement to Reviewers in this Issue ... App. 15
21. Submission Information ... App. 16

work_e67zygzkazbapihtm5pf7riyqu ----

Journal of Educational Media & Library Sciences (教育資料與圖書館學), http://joemls.tku.edu.tw, Vol. 57, no.
1 (2020): 35-72. DOI: 10.6120/JoEMLS.202003_57(1).0045.RS.AM

MARC21鏈結資料化的轉變與應用
A Study on MARC21 Transformation and Application for Linked Data

Ya-Ning Chen (陳亞寧) [a]*, Dar-maw Wen (温達茂) [b]

[a] Associate Professor, Department of Information and Library Science, Tamkang University
[b] Chief Knowledge Officer, Flysheet Technologies Co., Ltd. (飛資得系統科技股份有限公司)
* Principal and corresponding author: arthur@gms.tku.edu.tw

Research Article. Received 2019/08/24; revised 2020/01/14; accepted 2020/01/15. The authors agree that readers of this journal may use this article under a Creative Commons BY-NC 4.0 (Attribution-NonCommercial) International license.

Abstract

MARC has long been an important information-exchange standard for the library and information (LIS) community. Its outdated format, however, and the fact that it is neither well known nor used outside LIS, have come to hinder MARC's application. With the advance of the Semantic Web, linked data technology has been regarded by the LIS community as a new method for deconstructing bibliographic information. In view of this, it is worth re-examining in what way MARC can be extended to linked data and with what benefits. First, taking 2006, the year linked data was proposed, as the baseline, this article analyzes the content of the relevant MARC proposals and discussion papers and the corresponding linked-data responses. Second, two MARC bibliographic records and one example from a MARC proposal paper are selected as eight use cases, into which the BIBFRAME and RDA bibliographic ontologies are introduced, in order to demonstrate and explain how MARC is extended to linked data. The results show that MARC has successfully incorporated the Resource Description Framework model and structure, and that it serves as the LIS community's linked-data exchange standard. Finally, related issues, such as the bibliographic entities defined in the MARC proposal papers, are discussed.

Keywords: MARC, Linked data, BIBFRAME, RDA ontology, RDFization

Introduction

The library and information (hereafter "LIS") community has long adopted the MAchine-Readable Catalog (MARC) format as its international standard for information organization, enabling information exchange among different integrated library systems and thus achieving information sharing. However, as information has become networked and digitized, web search engines have become the principal tools for finding digital information on the World Wide Web. Because of its outdated format, MARC exists only within library-oriented systems; to fields outside LIS, MARC is both unfamiliar and unused, which makes the format appear highly idiosyncratic (uniqueness). Even though a few integrated library systems can expose MARC information for harvesting by web search engines, most systems that manage bibliographic information in MARC remain apart from the World Wide Web and the reach of web search engines, forming what has been called an information silo (Lagace, 2014). Meanwhile, in 2006 Berners-Lee (2006) proposed the concept of linked data (LD) and its design principles, turning the existing web of documents into a web of data: an open web space in which every item of data is named with a Uniform Resource Identifier (URI), and data from different sources are linked through the identification of shared URIs. The rise of LD has attracted research and applications from many fields. As of March 2019, the Linked Open Data Cloud divides LD into nine categories, such as cross domain, with a further bibliographic subdivision under the publications category (McCrae, 2019). This means that bibliographic information about publications already occupies a place in the existing LD landscape, which has further led the LIS community to consider how to adopt LD concepts and related techniques to transform existing MARC21 information into LD, so that it becomes part of the Semantic Web and the application of existing LIS information can be extended.

From the viewpoint of data design, LD differs from MARC in that its principal design philosophy is data centric (Di Noia et al., 2016), and it takes the Resource Description Framework (RDF) as its data model. According to the official documents issued by the World Wide Web Consortium (W3C), one of the keys to LD is the adoption of a specific ontology as the basis of data modeling, in order to establish the relationships among different data or information objects (Hyland et al., 2014; Hyland & Villazón-Terrazas, 2011), with the principle of reusing, as far as possible, the concepts, vocabularies and relationships of existing ontologies to present the results of data modeling (Villazón-Terrazas et al., 2011). Within the Semantic Web, Berners-Lee et al. (2001) regard the ontology as one of its essential components: a document or file that correctly defines the relationships among vocabularies. The LIS community already has several conceptual models, such as the Functional Requirements for Bibliographic Records (FRBR), the Library Reference Model (LRM) and the Bibliographic Framework (BIBFRAME). Although FRBR is only a conceptual model, in practice it has long been treated as a bibliographic ontology and applied to LD data modeling; cases such as the National Library and Archive of IRAN (NLAI; Eslami & Vaghefzadeh, 2013), the Biblioteca Nacional de España (BNE; Vila-Suero & Gómez-Pérez, 2013; Vila-Suero et al., 2012) and the Bibliothèque nationale de France (BNF, 2018) all adopted the three FRBR groups as their bibliographic ontology. The early RDA ontology incorporated the FRBR and Functional Requirements for Authority Data (FRAD) conceptual models, and, in step with the development of the RDA Registry (RDAR), FRBR and FRAD have been converted, in accordance with the ontology definition of Berners-Lee et al. (2001), into ontology-compliant classes and property relationships named with URIs. With the launch of the RDA 3R Project (RDA Toolkit Restructure and Redesign Project), RDAR has gradually incorporated LRM
LD平台,係以BIBFRAME本體為LD資料模式(Casalini, 2017),提供LD驅動 式(LD driven)目錄,以及相關視覺化呈現與查詢等功能。 在圖資界中,有些實際案例已大量批次將MARC資訊LD化,包括大英圖 書館(British Library,簡稱BL;Deliot, 2014; Deliot et al., 2016)、瑞典國家聯合 目錄(LIBrary Information System,簡稱LIBRIS;Malmsten, 2008, 2009)、BNE (Santos et al., 2015; Vila-Suero & Gómez-Pérez, 2013; Vila-Suero et al., 2012)、 BNF(Simon et al., 2013; Wenz, 2013)、美國內華達大學圖書館(University Libraries, University of Nevada; Lampert & Southwick, 2013; Southwick, 2015)與 伊利諾香檳分校(University of Illinois at Urbana-Champaign; Cole et al., 2013)等。 然而,以BL、BNE、BNF、德國國家圖書館(Deutsche National Bibliothek, 簡稱DNB)等16個案例為個案研究分析中,Chen(2017)發現15個研究個案 同時採取2個以上本體進行LD資料模式化作業外,也各自發展所屬的LD資 料模式。誠如Suominen與Hyvönen(2017)的研究結果指出,由於每一圖資界 LD個案的資料模式不同,除了產生不一致的問題外,更重要的是陷入另外一 種LD資訊孤島的現象,反而阻礙圖資界彼此間LD的再利用(reuse)、相容性 (compatibility)與互操作性(interoperability)。 就實際作業現況而言,MARC仍是現今多數圖書館自動化系統的主要處理 對象,藉以組織各式資訊。現今圖資界正處於OCLC Research Library Partnership 所稱的「MARC與LD的複合式環境」(a hybrid MARC-linked data environment; Smith-Yoshimura, 2018b),亦即同時面對LD與既有MARC記錄(legacy MARC records)共同存在的事實。如同參與Linked Data for Libraries(LD4L)計畫的史丹 福大學圖書館(Stanford University Libraries)一份簡報內容指出: ⋯ o Almost all of our processing systems are rooted in MARC o Our ILS is rooted in MARC o Any change to that basic environment will be very expensive o And we probably don’t want to change the entire environment, some things are probably done fine in a MARC based relational database, so we will need some sort of hybrid http://joemls.tku.edu.tw 38 教育資料與圖書館學 57 : 1 (2020) [圖書館自動化系統仍根植於MARC,改變此種環境的代價極高,我 們不可能改變整個環境,有些事務仍然可以在關聯資料庫的MARC順 利運作,因此我們需要某種複合式作業]。(Schreur, 2015, Slide 19) 另外,一如Cole等(2013, p. 172)所言:「All of these libraries have one thing in common: they publish their catalog records as LOD and use them in discovery services」[對所有圖書館而言,除了以LD方式發布目錄資訊外,同時也導入 LD作為探索服務之用]。這也與OCLC兩次的LD調查報告結果相符,就是多數 機構實施LD的主要目的之一在於引入外部LD資源(resources)提供機構本身的 使用者利用(Smith-Yoshimura, 2016, 2018a)。換言之,圖資界導入LD的主要目 的除了將MARC轉成LD予以對外發布成為語意網的一部分外,更重要的是導 入LD的聚合功能(aggregation),引入外部LD資源,提供使用者的LD驅動式 資源探索服務。綜合上述探討,MARC除了在原有圖書館自動化系統中滿足各 類文獻的資訊組織作業需求外,能否因應LD時勢需求而有所適當調整,同時 容許採用圖資界現有的書目本體(如前述BIBFRAME與RDA本體)及其詞彙, 達成一致性的LD資料模式,促成圖資界彼此間的LD共享與再利用外,也能提 供使用者LD驅動式資源探索服務等目的,則是現今圖資界在邁向LD前,必須 對MARC的轉變有所了解,更是值得深入探討的一項研究議題。 二、文獻探討 有關 MARC 的調整事宜,係由 MARC 諮詢委員會(MARC Advisory Committee,簡稱MAC)向MARC指導委員會(MARC Steering Group)1提出所 謂的MARC提案(MARC proposal)或討論文件(discussion paper),作為修訂 MARC的主要審查文件(Library of Congress [LC], 2019a)。一旦審核通過後, 依據MARC提案文件內容正式調整MARC的相關結構與內容。由於LD於2006 年提出,本文以2006年為起始點,回溯有關LD議題的MARC提案與討論文件 為範圍,探討MARC因應LD所調整的相關結構與內容之用,除非2006年以後 的MARC文件提及2006年前的相關文獻,則不在此限,亦即編號MARC 98-10 提案文件(詳表1至表2及相關內容說明)。此外,由於MARC提案與討論文件 皆以某一議題為主要討論重點,通常最新文件且獲通過者作為修訂MARC的主 要依據,以整體考量MARC的調整需求。2 因而,本文採取主題方式,整合相 關文件一起探討,而不依據每一文件逐一討論,避免以偏概全。 1 目前MARC指導委員會由LC、加拿大國家圖書館暨檔案館(Library and Archives Canada)、 BL與DNB共同組成(LC, 2019a)。 2 事實上,LC所公告的MARC相關文件僅標示出相關文件的編號,並未明確標示取代哪些文件。http://joemls.tku.edu.tw 39陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 ㈠ 標示FRBR第一群組內及第二群組內之兩兩關係 在編號MARC 2009-06/1提案(MARC Proposal 2009-06/1: Accommodating Relationship Designators for RDA Appendix J and K in MARC 21 Bibliographic and Authority Formats)文件(LC, 2009)中,主要目的在於MARC21書目資料與權 威資料格式中標示RDA附錄J與K,亦即FRBR第一群組內及第二群組內之兩 兩關係,且獲通過。主要調整內容如下: l 增加$4與$i至MARC21書目資料格式的欄號76X-78X,及增加$i至 MARC21書目資料格式的欄號X00、X10、X11與X30-78X,說明FRBR 第一群組內之兩兩相互關係。 l 增加$i至MARC21權威資料格式的欄號5XX,以說明FRBR第二群組內 之兩兩相互關係。 l 更改 MARC21 書目資料格式欄號 787 名稱為「其他關係」(Other Relationship Entry)。 ㈡ 增加國際標準名稱識別碼(International Standard Name Identifier, ISNI)的標示 在編號MARC 2010-06提案(Proposal No. 
2010-06: Encoding the International Standard Name Identifier (ISNI) in the MARC 21 Bibliographic and Authority Formats)文件(LC, 2010)中,主要目的在於$0可以著錄ISNI,且該文件已通 過。增加ISNI至MARC21的主要涵蓋範圍如下: l MARC21書目資料格式:100、110、111、600、610、611、700、710與 711。 l MARC21權威資料格式:024、100、110、111、150、151、500、510、 511、550、551、700、710、711、750與751。 ㈢ $0權威記錄控制號或標準號(Authority Record Control Number Or Standard Number)與$1實際的世界物件(Real World Object, RWO)URI(RWO URI) 有關LD的URI方面,共有八份文件探討此一議題(請詳表1)。原始$0 在編號MARC 98-10提案文件(LC, 1998)中,定義為「記錄控制號」(record control number),至編號MARC 2015-07提案文件中,名稱則更改為「權威記錄 控制號或標準號」,同時可以用URI方式標示外,也以圓括弧方式帶出URI類 型的前導用語,如URI與ISNI(LC, 2015)。至編號MARC2016-DP18討論文件 中,則擴大應用至MARC館藏資料格式(holdings format),以及去除圓括弧與 前導用語兩項建議列入提案作為進一步評估審核(LC, 2016b)。直至編號MARC 2017-08提案文件審核公告後,除了通過去除圓括弧與前導用語的建議內容, 還包括新增$1,以標示LD的RWO URI外,應用範圍也擴展至五種MARC格式 http://joemls.tku.edu.tw 40 教育資料與圖書館學 57 : 1 (2020) (LC, 2017e)。在使用方式上,$0與$1可擇一使用,或同時使用。若以LD觀 點而言,$0與$1等同於RDF資料模中三位元的「物件」(object),可直接使用 URI進行標示,其中$0用於描述LD權威記錄的URI(如LC提供各項的LD資源), 而$1則是用於標示真實世界存在物件的URI。換言之,經由$0與$1著錄URI, 將原有MARC記錄鏈結至現有的LD資源。若依據編號MARC 2017-06提案、 編號MARC 2017-08提案與編號MARC 2019-03提案文件內容,$0與$1可應用 在MARC21書目、權威、館藏、分類(classification)與社群資訊(community information)格式的相關欄號如下: l MARC21書目資料格式:033、034、043、100、110、111、130、240、 257、336、337、338、340、344、345、346、347、348、370、377、 380、381、382、385、386、388、518、567、600、610、611、630、 647、648、650、651、654、655、656、657、662、700、710、711、 751、752、753、754、800、810、811、830、880、883、885 l MARC21權威資料格式:024、034、043、336、348、260、360、368、 370、372、373、374、376、377、380、381、382、385、386、388、 500、510、511、530、548、550、551、555、562、580、581、582、 585、672、673、682、700、710、711、730、747、748、750、751、 755、762、780、781、782、785、880、883、885 l MARC21館藏資料格式:337、338、347、561、883 l MARC21分類資料格式:034、043、700、710、711、730、748、750、 751、754、880、883 表1 有關$0與$1的MARC21文件與狀態 文件編號 文件名稱 狀態 Proposal No. 98-10 Definition of Subfield $0 for Record Control Number in the 7XX Fields in the USMARC Classification and Community Information Formats (LC, 1998). 通過 Proposal No. 2015-07 Extending the Use of Subfield $0 (Authority record control number or standard number) to Encompass Content, Media and Carrier Type (LC, 2015). 通過 Discussion Paper No. 2016-06 Define Subfield $2 and Subfield $0 in Field 753 of the MARC 21 Bibliographic Format (LC, 2016a). 轉為 提案 Discussion Paper No. 2016-18 Redefining Subfield $0 to Remove the Use of Parenthetical Prefix “(uri)” in the MARC 21 Authority, Bibliographic, and Holdings Formats (LC, 2016b). 轉為 提案 Discussion Paper No. 2016-19 Adding Subfield $0 to Fields 257 and 377 in the MARC 21 Bibliographic Format and Field 377 in the MARC 21 Authority Format (LC, 2016c). 轉為 提案 Proposal No. 2017-06 Adding Subfields $b, $2, and $0 to Field 567 in the MARC 21 Bibliographic Format (LC, 2017d). 通過 Proposal No. 2017-08 Use of Subfields $0 and $1 to Capture Uniform Resource Identifiers (URIs) in the MARC 21 Formats (LC, 2017e). 通過 Proposal No. 2019-03 Defining Subfields $0 and $1 to Capture URIs in Field 024 of the MARC 21 Authority Format (LC, 2019c). 
通過

• MARC21社群資訊格式:043、100、110、111、600、610、611、630、648、650、651、654、656、657、700、710、711、730、880、883

㈣ $4關係(Relationship)

有關LD的語意關係方面,共有五份文件探討此一議題(請詳表2)。雖然MARC21已新增了$0與$1作為著錄URI之用,促成原有MARC記錄與某一外部LD資源的URI鏈結,但是MARC記錄與特定LD URI兩者之間的語意關係仍未予以標示清楚。原來$4在MARC21書目資料格式的名稱為「著作職責或著作方式」(relator code),可與$e(relator term)同時著錄或擇一著錄,主要用於標示FRBR第一群組與第二群組間的資源責任關係。自2017年3月21日的編號MARC 2017-01提案文件公告後,$4同時可應用在MARC21書目資料與權威資料格式的相關欄號外,且名稱更改為「關係」。在使用方式上,有時$4與$e可相互搭配使用,有時$4也可與$i(relationship information)一起使用,而$e與$i則分別以文字說明$4所標示的關係資訊,$4則可直接以URI方式標示(LC, 2017a)。因此,自2017年3月以後,$4的語意與功能作用已明顯改變,等同於RDF三位元的「述語」(predicate),作為鏈結主詞(subject)與物件兩者間關係及其關係意義之用。以編號MARC 2017-01提案文件的範例為例,245$a的題名視為RDF主詞,經由視為RDF述語的$4直接著錄LC LDS的URI(http://id.loc.gov/vocabulary/relators/edt),同時也使用$e著錄文字內容為編輯者(editor),補充說明$4的URI語意識別碼意義為編輯者,而$0則視為RDF物件,可使用LC LDS URI(http://id.loc.gov/authorities/names/n80145489)代表原來$a的作者名稱。原編號MARC 2017-01提案文件內的列舉範例如下所示(LC, 2017a),其RDF化的示意程式碼請見表2之後:

245 00 $aReligion, learning and science in the 'Abbasid period / $cedited by M. J. L. Young.
700 1# $aYoung, M. J. L. $0http://id.loc.gov/authorities/names/n80145489 $eeditor $4http://id.loc.gov/vocabulary/relators/edt

就LD化程度而言,$4補足了原有$0與$1只標示URI,但缺乏兩個LD物件或URI之間的語意關係,或缺乏此筆MARC記錄與外部LD物件或URI之間的語意關係。在MARC相關提案文件內容中(如編號MARC 2018-FT01提案),列舉RDAR內RDA本體的屬性關係(property)作為$4的範例,而SHARE-VDE平台中,則著錄BIBFRAME的屬性關係在$4。換言之,圖資界現有BIBFRAME與RDA書目本體所定義類別(class)間的屬性關係,皆可著錄在$4,以標示書目本體不同類別兩兩之間的關係。以MARC21書目資料而言,欄號245$a被視為RDF三位元的主詞,含有某一$4的欄號為RDF三位元的物件,再以$4建立LD主詞與物件間的關係。依據編號MARC 2017-01提案、編號MARC 2017-02提案、編號MARC 2017-03提案與編號MARC 2018-FT01提案文件公告內容,$4可應用在MARC21書目資料與權威資料格式的相關欄號如下:

• MARC21書目資料格式:100、110、111、370、386、600、610、611、630、650、651、654、662、700、710、711、720、730、751、760、762、765、767、770、772、773、774、775、776、777、780、785、786、787
• MARC21權威資料格式:370、371、386、400、410、411、430、448、450、451、455、462、480、481、482、485、500、510、511、530、548、550、551、555、562、580、581、582、585、700、710、711、730、748、750、751、755、762、780、781、782、785、788

表2 $4的MARC21文件與狀態
文件編號 文件名稱 狀態
Discussion Paper No. 2016-DP21 Defining Subfields $e and $4 in Field 752 of the MARC 21 Bibliographic Format (LC, 2016d). 轉為提案
Proposal No. 2017-01 Redefining Subfield $4 to Encompass URIs for Relationships in the MARC 21 Authority and Bibliographic Formats (LC, 2017a). 通過
Proposal No. 2017-02 Defining New Subfields $i, $3, and $4 in Field 370 of the MARC 21 Bibliographic and Authority Formats (LC, 2017b). 通過
Proposal No. 2017-03 Defining New Subfields $i and $4 in Field 386 of the MARC 21 Bibliographic and Authority Formats (LC, 2017c). 通過
Proposal No. 2018-FT01 Adding Subfield $4 to Field 730 in the MARC 21 Bibliographic Format (LC, 2018b). 通過
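以下為筆者補充的示意程式碼草稿(非提案文件內容),說明上述MARC 2017-01提案700欄範例如何依$0與$4轉換為一組RDF三位元;其中代表245$a書目實體的ex:record1為假設性URI,並假設可使用Python的rdflib套件:

from rdflib import Graph, Namespace, URIRef

EX = Namespace("http://example.org/bib/")  # hypothetical namespace for the local record
g = Graph()

subject = EX["record1"]  # stands in for the bibliographic entity of 245 $a
predicate = URIRef("http://id.loc.gov/vocabulary/relators/edt")  # from $4 (editor)
obj = URIRef("http://id.loc.gov/authorities/names/n80145489")    # from $0 (Young, M. J. L.)

g.add((subject, predicate, obj))  # one RDF triple derived from field 700
print(g.serialize(format="turtle"))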
㈤ $2名稱(Name)與題名(Title)的來源標示

MARC21除了通過採用$0、$1與$4著錄或標示LD的URI外,也曾在編號MARC 2018-DP07討論(Designating Sources for Names in the MARC 21 Bibliographic Format; LC, 2018a)文件提出增加$2標示URI的來源名稱,當時未獲通過,但改為列入提案文件,作為進一步評估。直至編號MARC 2019-02提案(Defining Source for Names and Titles in the MARC 21 Bibliographic Format; LC, 2019b)文件提出且獲通過後,$2可用來清楚標示URI的來源名稱,如ISNI、VIAF與Wikidata等,也取代前述編號MARC 2015-07提案文件以圓括弧方式帶出URI類型前導用語的著錄方式建議。$2著錄範圍僅限於書目記錄格式的100、110、111、130、240、700、710、711、730、758、800、810、811與830(LC, 2019b)。

㈥ 定義MARC21書目資料格式的欄號758資源識別碼(Resource Identifier)

編號MARC 2017-09提案文件已獲通過,文件建議新增欄號758用以記載書目記錄所描述的資源對象或相關資源,不限於FRBR第一群組的作品、內容版本、載體版本或單件,但不用於特定的內容標準或資料模式(LC, 2017f)。

綜合上述討論,可明顯發現MARC21為了因應LD的趨勢發展,已在結構與內容方面作了調整,主要包括六個分欄(即$0、$1、$2、$4、$e與$i)與一個欄號(即758),而包含前述六個分欄的MARC21書目資料與權威資料格式等欄位請參照附錄一與附錄二(註3)。儘管MARC21已調整相關措施以反映LD需求,然而如何應用上述MARC21的LD策略化結構與內容,且實際導入BIBFRAME或RDA書目本體至現有的MARC記錄,以及可能產生的效益,則是本文所擬探究的研究議題。

三、研究範圍與研究方法

為了實證前述MARC21的LD化策略與相關結構內容應用,首先本文將上一節文獻探討所歸納的MARC21相關結構與內容進行RDF化(RDFization),亦即所謂RDF三位元化(RDF's triplification)。由於MARC21的LD化範圍以書目資料與權威資料居多數,同時此兩種格式也是圖資界最常使用的標準格式,因此本文僅以MARC21書目資料與權威資料兩種格式為研究範圍。依照前述RDF的主詞、述語與物件三位元的結構,分別將LD化的MARC21書目資料與權威資料兩種格式相關欄號與分欄予以RDF化,以符合RDF的主詞、述語與物件三位元。在MARC21書目資料格式方面,欄號245分欄a(Tag 245 $a)視為RDF三位元的主詞,$4視為RDF三位元的述語,而包含前述$4的某一欄號視為RDF三位元的物件(請詳圖1a上方所示)。在MARC21權威資料格式方面,欄號1XX分欄a視為RDF三位元的主詞,$4視為RDF三位元的述語,包含$4的某一欄號視為RDF三位元的物件(請詳圖1a下方所示)。反之,若書目資料格式欄號245或權威資料格式欄號1XX分欄a視為RDF三位元的物件,$4仍視為RDF三位元的述語,包含$4的某一欄號視為RDF三位元的主詞(請詳圖1b所示)。再者,本文選擇BIBFRAME及RDA本體等兩種書目本體為實作對象,採用前述MARC為LD新增的欄號758與六個分欄著錄BIBFRAME與RDA書目本體型的LD實例,而MARC記錄則分別取自密西根大學圖書館(University of Michigan Ann Arbor Library)與賓州大學圖書館(University of Pennsylvania Libraries)共2筆書目記錄(請詳附錄三),以及MARC提案文件內的實例,且採取使用個案(use case)方式解說與驗證MARC21的LD化實際情形。最後,為能呈現MARC記錄轉變為LD後的結果,除了使用個案三外,本文的每一使用個案皆提供表格,說明導入BIBFRAME與RDA書目本體後的調整內容及所屬RDF示意圖(請參見表3)。

註3:依據上述MARC有關LD欄號與分析,本文在2019年11月18日上網逐一查核現有MARC21書目資料與權威資料格式及其LD相關欄號與分欄(https://www.loc.gov/marc/bibliographic/與https://www.loc.gov/marc/authority/),結果請詳附錄一與附錄二。

四、研究結果:MARC的LD使用個案分析與實徵證明

本節內容以前述MARC提案與討論文件所歸納的結果(包括可以應用$0、$1、$2、$4、$e與$i的欄號及欄號758),同時導入BIBFRAME與RDA等兩種書目本體的URI與相關LD URI資源,採取八個使用個案實徵證明MARC的LD策略化結構與內容的應用方式,並以使用個案一、個案二與個案五說明LD聚合效益等項目為主要探討重點。

㈠ 使用個案一:書目實體與作者關係

以原始MARC記錄而言,著錄範圍限於中文版傲慢與偏見(Pride and prejudice)此小說的書目相關資訊為主。若採取所謂的LD豐富化(enrichment)作業程序(註4),且以BIBFRAME本體為依據,增加使用$4,以標示欄號100與245$a書目實體(bibliographic entity)之間的資源責任關係為「代理者」(即agent,http://id.loc.gov/ontologies/bibframe/agent),且主要作者為「Austen, Jane, 1775-1817」,並在欄號100的$0與$1分別著錄虛擬國際權威檔(Virtual International Authority File, VIAF)與DBpedia提供的URI,作為LD外部資源鏈結之用,且以$2標示URI的來源。再者,從RDA書目本體觀點而言,仍可沿用$4,但資源責任關係改換為「作者代理者」(即has author agent,http://rdaregistry.info/Elements/w/P10061),且沿用VIAF與DBpedia提供的URI作為LD的外部資源鏈結(請參見表4)。

註4:所謂的豐富化作業係指現有記錄經由鏈結至權威檔或外部LD資源,增加原有記錄的功能,以促進使用者發現新的資訊與資源(Possemato, 2018)。

圖1 MARC21書目資料與權威資料兩種格式相關LD化欄號與分欄的RDF三位元轉換概念圖

表4 書目實體與作者關係
MARC案例 MARC21的RDF三位元標示方式:書目實體與作者
原始MARC記錄 100 1 # $aAusten, Jane,$d1775-1817. 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi].
BIBFRAME的資料模式個案 100 1 # $aAusten, Jane,$d1775-1817. $4http://id.loc.gov/ontologies/bibframe/agent (bf:agent) $1http://dbpedia.org/page/Jane_Austen $2DBpedia $0http://viaf.org/viaf/102333412 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi].
表3 使用個案表格的欄位說明 MARC案例 MARAC21的RDF三位元標示方式:書目實體與作者 原始MARC記錄 依本文附錄三研究樣本MARC書目資料格式欄號245,或權 威資料格式欄號110為列舉範例,再依使用個案性質選擇相 關欄號作為基礎範例。如劃一題名,包括原始MARC書目 記錄欄號240與245等兩項資料。 BIBFRAME的資 料模式個案 以上述原始MARC記錄範例為基礎,採用$0、$1著錄URI 外,並在$4加入BIBFRAME本體屬性關係的URI,以建立 MARC記錄中之RDF主詞與物件的鏈結關係。 應用的 BIBFRAME類別 與屬性關係 以BIBFRAME本體為依據,呈現上述「BIBFRAME的資料 模式個案」結果的RDF三位元(RDF triple statement),格 式為「主詞→述語→物件」,其中主詞與物件皆英文首字大 寫,述語則英文首字小寫,且述語以單向箭號代表主詞與 物件間的語意關係與方向。 BIBFRAME實例 的RDF示意圖 以RDF三位元方式呈現上述「BIBFRAME的資料模式個案」 結果的示意圖。 RDA本體的資料 模式個案 以上述原始MARC記錄範例為基礎,採用$0、$1著錄URI 外,並在$4加入RDA本體屬性關係的URI,以建立MARC 記錄中之RDF主詞與物件的鏈結關係。 應用的RDA本體 類別與屬性關係 以RDA本體為依據,呈現上述「RDA本體的資料模式個案」 結果的RDF三位元陳述,格式為「主詞→述語→物件」,其 中主詞與物件皆英文首字大寫,述語則英文首字小寫,且 述語以單向箭號代表主詞與物件間的語意關係與方向。 RDA本體實例的 RDF示意圖 以RDF三位元方式呈現上述「RDA本體的資料模式個案」結 果的示意圖。 DBpedia與VIAF 的LD聚合示意圖 只應用在使用個案一,說明使用個案一在鏈結外部URI資 源後,所產生的LD聚合效益。 http://joemls.tku.edu.tw 46 教育資料與圖書館學 57 : 1 (2020) MARC案例 MARAC21的RDF三位元標示方式:書目實體與作者 應用的BIBFRAME 類別與屬性關係 Work→agent→Person BIBFRAME實例 的RDF示意圖 http://viaf.org/viaf/ 102333412 bf:agent http://dbpedia.org/page/ Jane_Austen bf:agent Tag245$a RDA本體的資料 模式個案 100 1 # $aAusten, Jane,$d1775-1817. $4http://rdaregistry.info/Elements/w/#P10061 (rdaw:P10061,has author agent) $1http://dbpedia.org/page/Jane_Austen $2DBpedia $0http://viaf.org/viaf/102333412 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 應用的RDA本體 類別與屬性關係 Work→has author agent→Person RDA本體實例的 RDF示意圖 Tag245$a rdaw:P10061 http://dbpedia.org/page/ Jane_Austen http://viaf.org/viaf/ 102333412 rdaw:P10061 DBpedia與VIAF 的LD聚合示意圖5 http://dbpedia.org/page/ Jane_Austen Tag245$a bf:agent http://viaf.org/viaf/ 102333412 bf:agent http://viaf.org/viaf/ 4220155466472402160005 http://d-nb.info/standards/elementset/ gnd#familialRelationship Variants of ʻ Jane Austenʼ in Dbpedia 1…N Works of Jane Austen in Dbpedia 1…M owl:sameAs is dbo:author of 5另一方面,經過豐富化作業後,除了原來MARC記錄中的「珍.奧斯汀」 (Jane Austen)主要著者款目已鏈結至DBpedia與VIAF的URI外,也代表此筆 MARC記錄經由上述兩個URI達成某種程度上的資料聚合。具體而言,經由 DBpedia的URI(http://dbpedia.org/page/Jane_Austen)鏈結,已聚合了「珍.奧 斯汀」不同語文的著者名稱外,也包括了「珍.奧斯汀」的不同英文作品。若從 5 本文僅以B I B F R A M E為範例說明,而R D A本體則可依此類推。另外,限於篇幅,本文在 R D F示意圖中,解說經由D B p e d i a的L D聚合效益時,僅以概念式圖解示例(即1⋯N與1⋯ M),而非逐一圖解說明。 http://joemls.tku.edu.tw 47陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 VIAF的(http://viaf.org/viaf/102333412)URI鏈結,除了各國語文的著者名稱外, 還可經由下列DNB提供的URI鏈結至「珍.奧斯汀」的家族成員,亦即「珍. 奧斯汀」的第五位姪女「Caroline Jane Knight」。VIAF的「Austen, Jane, 1775- 1817.」的記錄如下所示: Austen, Jane, 1775-1817. 
Permalink: http://viaf.org/viaf/102333412 500 1 _ $aKnight, Caroline Jane (http://viaf.org/viaf/4220155466472402160005) $4bezf $4http://d-nb.info/standards/elementset/gnd#familialRelationship $eBeziehung familiaer ㈡ 使用個案二:書目實體與作品關係 在此一使用個案中,主要是針對書目實體與作品間關係進行標示,亦 即劃一題名的作品關係。在原始MARC記錄中,並未標示任何關係。若改採 BIBFRAME與RDA書目本體,本文除了使用$4分別標引作品關係外,另外選 擇了SHARE-VDE與OCLC作品識別碼(Work ID)作為外部LD資源鏈結(請 參見表5)。以SHARE-VDE的作品識別碼為例,此一識別碼聚合了美國杜克 大學圖書館(Duke University Libraries)、紐約大學圖書館(New York University Libraries)、史丹佛大學圖書館、芝加哥大學圖書館(University of Chicago Library)、密西根大學圖書館、賓州大學圖書館、耶魯大學圖書館(Yale University Library),及加拿大亞伯達大學(University of Alberta Libraries)等有 關英文版傲慢與偏見(Pride and prejudice)作品館藏(請詳圖2)。換言之,經由 SHARE-VDE的作品URI達成虛擬式聯合目錄的功能。相同地,OCLC作品識 別碼提供WorldCat相關作品與人名(如作品的編輯者)。 ㈢ 使用個案三:書目實體與出版者關係 以MARC21現況而言,$0、$1與$4並未定義在欄號260之內。以SHARE- VDE實例而言,採用了$9標示大陸拼音的「志文出版社」(Zhi wen chu ban she)。就MARC21而言,仍然是有效的,因為屬於所謂的「自由使用型的分欄」 (local subfield)。相對而言,在MARC21尚未將$0、$1與$4加入欄號260內之 前,上述SHARE-VDE是一種折衷方式,利用$9達成外部鏈結資源的鏈結。原 則上,BIBFRAME與RDA仍無法經由MARC21欄號260的$1與$4分別合法建 立所屬的「出版者」(Publisher)6與「出版社代理者」(has publisher agent),以標 6 在BIBFRAME中,類別名稱為「出版」(Publication),標籤名稱(label)則為「出版者」 (Publisher),本文在此處使用後者以利說明屬性關係,請詳http://id.loc.gov/ontologies/ bibframe/Publication。 http://joemls.tku.edu.tw 48 教育資料與圖書館學 57 : 1 (2020) 示欄號245與260之間的出版關係。上述SHARE-VDE個案提供欄號260自由使 用型分欄相關資料如下: 245 10$601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 260 $603$aTaibei Shi :$bZhi wen chu ban she,$c1992. $9http://share-vde.org/sharevde/rdfBibframe/Publisher/269614 ㈣ 使用個案四:書目實體與內容、媒體與載體關係 相同的,原始MARC記錄中,分別採取$2加以說明關係類型,$a以文字 說明關係類型的意義,$b以代碼標示關係類型的意義。若改採MARC21的$0 與$4兩個分欄,除了上述$2、$a與$b作法外,額外以$4與$0方式加入符合 表5 書目實體與作品關係 MARC案例 MARAC21的RDF三位元標示方式:書目實體與作品 原始MARC記錄 240 1 0 $aPride and prejudice.$lChinese 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. BIBFRAME的資 料模式個案 240 1 0 $aPride and prejudice.$lChinese $4http://id.loc.gov/ontologies/bibframe/instanceOf (bf:instanceOf) $0http://share-vde.org/sharevde/docBibframe/Work/139617-12 $2share-vde $0http://worldcat.org/entity/work/id/1881837462 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 應用的BIBFRAME 類別與屬性關係 Instance→instanceOf→Work BIBFRAME實例 的RDF示意圖 http://worldcat.org/ entity/work/id/ 1881837462 bf:instanceOf http://share-vde.org/ sharevde/docBibframe/ Work/139617-12 bf:instanceOf Tag245$a RDA本體的資料 模式個案 240 1 0 $aPride and prejudice.$lChinese $4http://rdaregistry.info/Elements/m/P30135 (rdam: P30135,has work manifested) $0http://share-vde.org/sharevde/docBibframe/Work/139617-12 $2share-vde $0http://worldcat.org/entity/work/id/1881837462 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 應用的RDA本體 類別與屬性關係 Manifestation→has work manifested→Work RDA本體實例的 RDF示意圖 http://worldcat.org/ entity/work/id/ 1881837462 rdam:P30135 http://share-vde.org/ sharevde/docBibframe/ Work/139617-12 rdam:P30135 Tag245$a http://joemls.tku.edu.tw 49陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 RDF語法的述語與物件,明確建立書目實體(即245$a)有關內容(content)、媒 體(media)與載體(carrier)等關係及其意義外,並以外部鏈結資源的方式標示 關係類型;而RDA 本體依此類推,分別在$4標示內容、媒體與載體等關係及 其意義(請參見表6)。 表6 書目實體與內容、媒體與載體關係 MARC案例 MARAC21的RDF三位元標示方式:內容、媒體與載體 原始MARC記錄 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 245 1 0 $601$a傲慢與偏見 /$c 珍・奧斯汀著 ; [夏穎慧譯]. 336 # # $atext$btxt$2rdacontent 337 # # $aunmediated$bn$2rdamedia 338 # # $avolume$bnc$2rdacarrier BIBFRAME的資 料模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 
336 # # $atext$btxt$2rdacontent $4http://id.loc.gov/ontologies/bibframe/content (bf: content) $0http://id.loc.gov/vocabulary/contentTypes/txt 337 # # $aunmediated$bn$2rdamedia $4http://id.loc.gov/ontologies/bibframe/media (bf: media) $0http://id.loc.gov/vocabulary/mediaTypes/n 338 # # $avolume$bnc$2rdacarrier $4http://id.loc.gov/ontologies/bibframe/carrier (bf: carrier) $0http://id.loc.gov/vocabulary/carriers/nc 應用的BIBFRAME 類別與屬性關係 Work→content→Content Instance→media→Media Instance→carrier→Carrier 圖2 經由SHARE-VDE作品URI提供虛擬式聯合目錄 資料來源: 畫面擷取自SHARE-VDE. (n.d.). http://share-vde.org/sharevde/docBibframe/Work/139617-12。 http://joemls.tku.edu.tw 50 教育資料與圖書館學 57 : 1 (2020) MARC案例 MARAC21的RDF三位元標示方式:內容、媒體與載體 BIBFRAME實例 的RDF示意圖 http://id.loc.gov/ vocabulary/carriers/nc bf:carrier http://id.loc.gov/ vocabulary/contentTypes/ txt bf:content Tag245$a http://id.loc.gov/ vocabulary/mediaTypes/n bf:media RDA本體的資料 模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 336 # # $atext$btxt$2rdacontent $4https://www.rdaregistry.info/Elements/e/P20001 (rdae: P20001,has content type) $0http://rdaregistry.info/termList/RDAContentType/1020 337 # # $aunmediated$bn$2rdamedia $4https://www.rdaregistry.info/Elements/m/P30002 (rdam: P30002,has media type) $0http://rdaregistry.info/termList/RDAMediaType/1007 338 # # $avolume$bnc$2rdacarrier $4https://www.rdaregistry.info/Elements/m/P30001 (radm: P30001,has carrier type) $0http://rdaregistry.info/termList/RDACarrierType/1049 應用的RDA本體 類別與屬性關係 Expression→has content type→literal or URI Instance→has media type→literal or URI Instance→has carrier type→literal or URI RDA本體實例的 RDF示意圖 http://rdaregistry.info/ termList/ RDACarrierType/1049 rdam:P30001 http://rdaregistry.info/ termList/ RDAContentType/1020 rdae:P20001 Tag245$a http://rdaregistry.info/ termList/ RDAMediaType/1007 rdam:P30002 ㈤ 使用個案五:書目實體與譯者關係 在原始MARC記錄中,係為「Pride and prejudice」的傳統中文版(traditional Chinese)譯本,譯者為「夏穎慧」(Xia, Yinghui)。由於在SHARE-VDE、ISNI、 VIAF與LC LDS皆無上述譯者的URI,反而在OCLC WorldCat Identities與國家 圖書館鏈結資源平台能查得上述譯者所屬URI。依循MARC21的$4與$0的作 法,本文額外以$e加註文字說明譯者的身份別,同時建立關係與外部鏈結資源 的物件,並以BIBFRAME與RDA兩種書目本體方式標示,結果如表7所示。 其中在OCLC WorldCat Identities的「夏穎慧」所屬URI資訊下,已聚合上述譯者http://joemls.tku.edu.tw 51陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 有關「珍.奧斯汀」(Jane Austen)的中譯作品等相關資訊。 ㈥ 使用個案六:書目實體與主題關係 在附錄三的第二筆原始MARC記錄中,皆有兩個以上的主題,本文只以一 個主題為例說明。在採用BIBFRAME時,除了以$4標示主題關係外,同時也 以$0加註外部鏈結資源的URI,達成符合RDF三位元的語法結構,而RDA本 體亦依此類推予以標註(請參見表8)。 表7 書目實體與譯者關係 MARC案例 MARAC21的RDF三位元標示方式:書目實體與譯者 原始MARC記錄 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 245 1 0 10$601$a傲慢與偏見 /$c 珍・奧斯汀著 ; [夏穎慧譯]. 700 1 # $605$aXia, Yinghui. BIBFRAME的資 料模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 245 1 0 $601$a傲慢與偏見 /$c 珍・奧斯汀著 ; [夏穎慧譯]. 700 1 # $605$aXia, Yinghui.$etranslator $4http://id.loc.gov/ontologies/bibframe/agent (bf:agent) $0http://worldcat.org/identities/np-xia,%20yinghui/ $2worldcatidentities $0http://catld.ncl.edu.tw/authority/AC000064697 應用的BIBFRAME 類別與屬性關係 Work→agent→Agent BIBFRAME實例 的RDF示意圖 http://catld.ncl.edu.tw/ authority/AC000064697 bf:agent http://worldcat.org/ identities/np- xia,%20yinghui/ bf:agent Tag245$a RDA本體的資料 模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 245 1 0 $601$a傲慢與偏見 /$c 珍・奧斯汀著 ; [夏穎慧譯]. 
700 1 # $605$aXia, Yinghui.$etranslator $4https://www.rdaregistry.info/Elements/e/P20037 (rdae:P20037,has translator agent) $0http://worldcat.org/identities/np-xia,%20yinghui/ $2worldcatidentities $0http://catld.ncl.edu.tw/authority/AC000064697 應用的RDA本體 類別與屬性關係 Expression→has translator agent→Person RDA本體實例的 RDF示意圖 http://catld.ncl.edu.tw/ authority/AC000064697 rdae:P20037 http://worldcat.org/ identities/np- xia,%20yinghui/ rdae:P20037 Tag245$a http://joemls.tku.edu.tw 52 教育資料與圖書館學 57 : 1 (2020) 表8 書目實體與主題關係 MARC案例 MARAC21的RDF三位元標示方式:書目實體與主題 原始MARC記錄 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 650 # 0 $aSocial classes$vFiction. BIBFRAME的資 料模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 650 # 0 $aSocial classes$vFiction. $4http://id.loc.gov/ontologies/bibframe/subject (bf:subject) $0http://id.loc.gov/authorities/subjects/sh2008111427 $2lcnaf 應用的BIBFRAME 類別與屬性關係 Work→subject→Subject BIBFRAME實例 的RDF示意圖 http://id.loc.gov/ authorities/subjects/ sh2008111427 Tag245$a bf:subject RDA本體的資料 模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 650 # 0 $aSocial classes$vFiction. $4https://www.rdaregistry.info/Elements/w/P10256 (rdaw: P10256,has subject) $0http://id.loc.gov/authorities/subjects/sh2008111427 $2lcnaf 應用的RDA本體 類別與屬性關係 Work→has subject→Subject RDA本體實例的 RDF示意圖 http://id.loc.gov/ authorities/subjects/ sh2008111427 Tag245$a rdaw: P10256 ㈦ 使用個案七:書目實體與實例(instance)/載體版本關係 依據MARC21對欄號758的定義,主要在記載書目實體所描述的資源或相 關資源,可將OCLC WorldCat的書目記錄視為相關資源,並以BIBFRAME的 「有實例」(hasInstance)標示兩者關係,而RDA本體則以「相關載體版本」(has related manifestation of manifestation)標示兩者關係(請參見表9)。 表9 書目實體與實例/載體版本關係 MARC案例 MARAC21的RDF三位元標示方式:書目實體與實例 原始MARC記錄 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. BIBFRAME的資 料模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 758 # # $1http://worldcat.org/oclc/213888776 應用的BIBFRAME 類別與屬性關係 Instance→hasInstance→Instance BIBFRAME實例 的RDF示意圖 Tag245$a http://worldcat.org/oclc/ 213888776 bf:hasInstance RDA本體的資料 模式個案 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 758 # # $4http://www.rdaregistry.info/Elements/m/P30048 (rdam:P30048,has related manifestation of manifestation) $1http://worldcat.org/oclc/213888776 http://joemls.tku.edu.tw 53陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 MARC案例 MARAC21的RDF三位元標示方式:書目實體與實例 應用的RDA本體 類別與屬性關係 Manifestation→has related manifestation of manifestation→Manifestation RDA本體實例的 RDF示意圖 Tag245$a http://worldcat.org/oclc/ 213888776 rdam:P30048 ㈧ 使用個案八:個人與機構間關係 依據MARC提案2017-01編號的實例中(LC, 2017a),$0、$1、$4與$i亦可 使用在MARC權威資料格式,藉以標引個人、家族與機構等兩兩之間的關係。 在表10案例中,則是先使用$i以文字說明「貝聿銘」(Pei, I.M., 1917-)此作者 係為「貝聿銘建築師事務所」(I.M. Pei Associates)的創辦人(founder)關係後, 再利用$4導入RDA本體的屬性關係URI,以標示個人與機構之間的關係,同時 以$0著錄LCLDS的URI,以串連至「貝聿銘」LD化個人權威記錄。 表10 個人與機構間關係 MARC案例 MARAC21的RDF三位元標示方式:個人與機構 Proposal No. 2017-01 110 2 # $a I.M. Pei Associates 500 1 # $wr $ifounder: $4http://www.rdaregistry.info/Elements/a/P50029 (rdaa: P50029,has founding person of corporate body) $aPei, I. M. 
$d1917- $0http://id.loc.gov/authorities/names/n79065003
應用的RDA本體類別與屬性關係 Corporate Body→has founding person of corporate body→Person
RDA本體實例的RDF示意圖 Tag110$a →rdaa:P50029→ http://id.loc.gov/authorities/names/n79065003

五、討 論

㈠ MARC記錄的LD內增豐富化與LD外部資源聚合

經由$0、$1與$4的豐富化作業程序,MARC記錄已增加了LD資源(即$0或$1)與語意關係(即$4)等URI,藉以將既有MARC記錄等不同類型的資訊與現有LD網路空間建立鏈結,使得MARC達成兩種具體效益。首先,將外部LD資源導入現有MARC記錄之內,使LD成為MARC記錄內部書目資訊的一部分,豐富了原有MARC記錄內容。再者,更重要的是,這些豐富化後的URI將MARC展延至現有LD網路空間,且經由相同的外部LD資源URI,無形地聚合相同URI不同來源的LD外部資源(如前述使用個案一、個案二與個案五)。除了可自動形成聯合目錄與類似Google知識圖譜(knowledge graph)功能外,經由LD關係提供脈絡化資訊及其功能導航(contextual information and navigation functionality),也可促進LRM之探索型(explore)使用者任務的達成。另外,只要$0或$1使用到鏈結資料中心(linked data hub)的URI(如VIAF或ISNI),則有助於圖資界MARC資訊被其他領域應用的機會。

㈡ MARC21既是圖資界傳統目錄資訊的交換標準,也是圖資LD交換標準

經由上述使用個案的實證後,發現MARC21的$0與$1可直接著錄URI,達成LD外部資源的鏈結。再者,在MARC21增加$4的前提下,LD鏈結關係的意義是可被明確著錄的。因而從前述使用個案可發現一筆記錄(書目或權威)能著錄平台內外的URI;換言之,同一資訊平台內部LD資源相互鏈結外,也可與外部LD資源建立鏈結關係。MARC21此種LD策略性調整,除了有助於內外部LD資源的鏈結外,可更加明確標示鏈結關係的意義;除了有利於LD圖書館自動化系統開發外,更有利於使用者界面的脈絡化資訊導引與呈現。另一方面,從前述使用個案也可發現MARC21已融合了符合RDF三位元化的要求。因而,MARC21除了可持續作為圖資界以記錄為單位的資訊交換標準外,亦可作為以LD資料為單位的LD化圖書資訊的交換標準與著錄格式。

㈢ MARC21已成為書目本體的資料容器(data container),也是具體落實書目本體的載體

經由上述使用個案的探討,可以發現本文已採用MARC21的$0標示書目實體與劃一題名作品關係(即前述Pride and prejudice用SHARE-VDE與WorldCat作品URI標示),採用$0與$1標示作者(即前述Austen, Jane, 1775-1817用DBpedia與VIAF的URI標示),及採用$4著錄BIBFRAME與RDA書目本體的屬性關係,以標示RDF主詞與物件間的述語關係與意義等,皆完全符合RDF三位元物件的LD資源鏈結,以及採用欄號758鏈結OCLC WorldCat書目記錄URI達成建立書目實體與實例/載體版本間關係。換言之,MARC21透過$0、$1、$4與欄號758的方式,已能將BIBFRAME與RDA書目本體之資料模式化所定義的類別與屬性關係予以著錄與標示。從此觀點而言,MARC21經過LD策略化調整的功能結構與內容後,已可完全容納BIBFRAME與RDA書目本體內容外,更是不同圖書館自動化系統間的LD交換共享載體。如果未來RDAR內容能順利完全轉變成LRM,MARC21仍然可無礙地著錄、標示與承載LRM此一書目本體的內容。另外,由於MARC21的LD化,屆時亦有利於後設資料(metadata)型的數據分析與探勘。此外,採取此種方式也有別於前述採取大量批次的圖資LD個案(如BL、BNE、BNF與DNB等),主要差異有二:首先,圖書館可選擇使用BIBFRAME或RDA本體,再搭配應用MARC21為LD增加的分欄與欄號達成LD化,而不是採取兩種以上的本體,達成資料模式與屬性關係的一致化,避免陷入前述Suominen與Hyvönen(2017)指出的LD孤島。第二,轉化MARC為LD的方式相形簡單,只須熟悉一種書目本體,而無須熟悉兩種以上的本體。以BL的英國國家書目(British National Bibliography,簡稱BNB)為例,依據Chen(2017)的分析,BNB至少採用了Bibliographic Ontology、DC、FOAF、Event Ontology、ISBD、OWL與SKOS等本體。雖然MARC此種方式有其優點,但也有缺點,即是未完全遵循原有BIBFRAME與RDA本體有關類別與屬性關係的使用原則(請詳㈤「MARC21的LD化書目實體與書目本體的應用方式」相關探討)。

㈣ MARC21的RDF化結構的應用方式:單向或雙向

在MARC書目資料格式中,可發現MARC21對於RDF三位元的應用方式採取圖1a的方式,亦即以欄號245為RDF主詞,其他欄號為RDF物件,採用$4作為RDF述語以建立鏈結關係。同樣地,在MARC權威資料格式中,可發現MARC21對於RDF三位元的應用方式也是採取圖1a的方式,亦即以欄號1XX為RDF主詞,欄號5XX為RDF物件,兩者間以欄號5XX的$4為RDF述語加以鏈結關係化。未來圖1b是否可應用於書目資料格式與權威資料格式中,促使MARC21的LD策略化成為雙向式應用方法,則有待觀察。

㈤ MARC21的LD化書目實體與書目本體的應用方式

由前述使用個案,可得知目前有關LD的MARC21文件皆將欄號245的$a視為書目實體。如果依照LC公告的MARC21轉換至BIBFRAME文件(MARC 21 to BIBFRAME 2.0 conversion specifications; LC, 2019d)與MARC轉換至FRBR文件(Mapping of MARC data elements to FRBR and AACR; Network Development and MARC Standards Office, 2006)等兩份文件,分別將欄號245的$a視為BIBFRAME的實例與RDA本體(或FRBR)的載體版本。然而,從前述使用個案可發現書目實體有時是作品(例如使用個案六的主題關係),有時是內容版本(例如使用個案五的譯者關係),有時是實例或載體版本(例如使用個案四的媒體與載體關係)。換言之,MARC21對於書目實體給予相當高度的彈性化,對欄號245的$a並未有一致與明確的定義。再者,從前述使用個案可發現MARC21採取最終端的單一化RDF三位元方式標示主詞、物件及其關係的述語,亦即只採用一組RDF三位元陳述。然而,無論BIBFRAME或RDA書目本體皆有一定的應用原則,所有個案不可能只採用一組RDF三位元陳述。以前述個案三的出版者關係為例,如果是BIBFRAME,RDF的三位元陳述如下所示—「Instance(即欄號245$a題名) – provisionActivity – ProvisionActivity – agent – http://share-vde.org/sharevde/rdfBibframe/Publisher/269614」。如果改以RDA本體,由於欄號245$a的中譯題名是屬於內容版本,所以RDF的三位元陳述如下所示—「Expression(即欄號245$a題名) – has manifestation of expression – Manifestation(即欄號245$a題名) – has publisher agent – http://share-vde.org/sharevde/rdfBibframe/Publisher/269614」。由前述討論,意謂著LC必須提出MARC的LD化最佳範例(best practices)的使用指引文件,引導圖資界使用MARC21的處理方式,才能與現有書目本體的語意關係與知識邏輯相互調和,否則就各行其事,最後仍會形成不一致的現象。一旦不一致情形出現,有可能減損原來本體達成知識結構的展現與關係推理等功能,乃至於降低本體型後設資料的數據分析。
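為具體對照上述㈤所指「完整書目本體關係需兩組以上RDF三位元陳述,而MARC僅以單一三位元標示」的現象,以下為筆者補充的示意草稿(非作者原實作),沿用前述使用個案一與個案二已出現的SHARE-VDE作品URI、VIAF著者URI,以及RDA本體屬性rdam:P30135(has work manifested)與rdaw:P10061(has author agent);其中ex:manifestation1為假設性URI,並假設可使用Python的rdflib套件:

from rdflib import Graph, Namespace, URIRef

RDAW = Namespace("http://rdaregistry.info/Elements/w/")
RDAM = Namespace("http://rdaregistry.info/Elements/m/")
EX = Namespace("http://example.org/bib/")  # hypothetical namespace for the local record

g = Graph()
manifestation = EX["manifestation1"]  # stands in for the 245 $a bibliographic entity
work = URIRef("http://share-vde.org/sharevde/docBibframe/Work/139617-12")  # work URI from use case 2
author = URIRef("http://viaf.org/viaf/102333412")  # author URI from use case 1

# A complete RDA chain takes two triples, whereas the MARC shortcut records only one:
g.add((manifestation, RDAM["P30135"], work))  # rdam:P30135, has work manifested
g.add((work, RDAW["P10061"], author))         # rdaw:P10061, has author agent
print(g.serialize(format="turtle"))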
六、結 語

從MARC的討論與提案文件的探討,已可明顯發現MARC已將LD的RDF三位元陳述語法融入。MARC可經由豐富化作業程序增加相關外部LD資源URI的鏈結後,亦達成了LD化的資料聚合,擴展MARC記錄成為現有LD網路空間的一部分。另外,經由本文導入BIBFRAME與RDA書目本體及其相關使用個案實徵研究後,MARC的LD策略化結構與內容調整,已將MARC提升兼具國際化目錄資訊交換標準格式外,也可作為圖資界LD交換標準;除了同時可容納BIBFRAME與RDA書目本體外,未來是否可擴展至不同學科領域LD本體的標示與著錄,則待進一步研究。

誌 謝

本文部分成果係由科技部105年度專題研究計畫經費補助(計畫編號MOST 105-2410-H-032-057),在此一併致謝。

參考文獻

Berners-Lee, T. (2006). Linked data: Design issue. https://www.w3.org/DesignIssues/LinkedData.html
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 35-43.
Bibliothèque Nationale de France. (2018). Open data. data.bnf.fr. https://data.bnf.fr/en/opendata
Casalini, M. (2017, August 15-17). BIBFRAME and Linked Data practices for the stewardship of research knowledge [Paper presentation]. IFLA Satellite Meeting 2017: Digital Humanities, Berlin, Germany. https://dh-libraries.sciencesconf.org/132918/document
Chen, Y.-N. (2017). A review of practices for transforming library legacy records into linked open data. In E. Garoufallou, S. Virkus, R. Siatri, & D. Koutsomiha (Eds.), Metadata and semantic research: 11th International Conference, MTSR 2017, Tallinn, Estonia, November 28 – December 1, 2017, Proceedings (pp. 123-133). Springer. https://doi.org/10.1007/978-3-319-70863-8_12
Cole, T. W., Han, M.-J., Weathers, W. F., & Joyner, E. (2013). Library MARC records into linked open data: Challenges and opportunities. Journal of Library Metadata, 13(2-3), 163-196. https://doi.org/10.1080/19386389.2013.826074
Deliot, C. (2014). Publishing the British National Bibliography as linked open data. Catalogue & Index, 174, 13-18. http://www.bl.uk/bibliographic/pdfs/publishing_bnb_as_lod.pdf
Deliot, C., Wilson, N., Costabello, L., & Vandenbussche, P.-Y. (2016, October 13-16). The British National Bibliography: Who uses our linked data? [Paper presentation]. International Conference on Dublin Core and Metadata Applications 2016, Copenhagen, Denmark. http://dcpapers.dublincore.org/pubs/article/download/3820/2005
Di Noia, T., Ragone, A., Maurino, A., Mongiello, M., Marzoccca, M. P., Cultrera, G., & Bruno, M. P. (2016). Linking data in digital libraries: The case of Puglia Digital Library. In A. Adamou, E. Daga, & L. Isaksen (Eds.), Proceedings of the 1st Workshop on Humanities in the Semantic Web co-located with 13th ESWC Conference 2016 (pp. 27-38). CEUR-WS. http://ceur-ws.org/Vol-1608/paper-05.pdf
Eslami, S., & Vaghefzadeh, M. H. (2013, August 17-23). Publishing Persian linked data of national library and archive of Iran [Paper presentation]. IFLA World Library and Information Congress: 79th IFLA General Conference and Assembly, Singapore. http://library.ifla.org/193/1/222-eslami-en.pdf
Hyland, B., Atemezing, G. A., & Villazón-Terrazas, B. (2014). Best practices for publishing linked data. W3C. https://dvcs.w3.org/hg/gld/raw-file/cb6dde2928e7/bp/index.html
Hyland, B., & Villazón-Terrazas, B. (2011, March 15). Linked data cookbook. W3C. https://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
Lagace, N. (2014). Pre-standards initiatives: Bibliographic roadmap and altmetrics. Information Standard Quarterly, 26(3), 23-26. https://doi.org/10.3789/isqv26no3.2014.06
Lampert, C.
K., & Southwick, S. B. (2013). Leading to linking: Introducing linked data to academic library digital collections. Journal of Library Metadata, 13(2-3), 230-253. https://doi.org/10.1080/19386389.2013.826095 Library of Congress. (1998). MARC proposal no. 1998-10. Library of Congress. https://www. loc.gov/marc/marbi/1998/98-10.html Library of Congress. (2009). MARC proposal no. 2009-06/1. https://www.loc.gov/marc/ marbi/2009/2009-06-1.html Library of Congress. (2010). MARC proposal no. 2010-06. https://www.loc.gov/marc/ marbi/2010/2010-06.html Library of Congress. (2015). MARC proposal no. 2015-07. https://www.loc.gov/marc/ mac/2015/2015-07.html Library of Congress. (2016a). MARC discussion paper no. 2016-DP06. https://www.loc.gov/ marc/mac/2016/2016-dp06.html Library of Congress. (2016b). MARC discussion paper no. 2016-DP18. https://www.loc.gov/ marc/mac/2016/2016-dp18.html Library of Congress. (2016c). MARC discussion paper no. 2016-DP19. https://www.loc.gov/ marc/mac/2016/2016-dp19.html Library of Congress. (2016d). MARC discussion paper no. 2016-DP21. https://www.loc.gov/ marc/mac/2016/2016-dp21.html http://joemls.tku.edu.tw 58 教育資料與圖書館學 57 : 1 (2020) Library of Congress. (2017a). MARC proposal no. 2017-01. https://www.loc.gov/marc/ mac/2017/2017-01.html Library of Congress. (2017b). MARC proposal no. 2017-02. https://www.loc.gov/marc/ mac/2017/2017-02.html Library of Congress. (2017c). MARC proposal no. 2017-03. https://www.loc.gov/marc/ mac/2017/2017-03.html Library of Congress. (2017d). MARC proposal no. 2017-06. https://www.loc.gov/marc/ mac/2017/2017-06.html Library of Congress. (2017e). MARC proposal no. 2017-08. https://www.loc.gov/marc/ mac/2017/2017-08.html Library of Congress. (2017f). MARC proposal no. 2017-09. https://www.loc.gov/marc/ mac/2017/2017-09.html Library of Congress. (2018a). MARC discussion paper no. 2018-DP07. https://www.loc.gov/ marc/mac/2018/2018-dp07.html Library of Congress. (2018b). MARC proposal no. 2018-FT01. https://www.loc.gov/marc/ mac/2018/2018-ft01.html Library of Congress. (2019a). MARC advisory committee. https://www.loc.gov/marc/mac/ advisory.html Library of Congress. (2019b). MARC proposal no. 2019-02. https://www.loc.gov/marc/ mac/2019/2019-02.html Library of Congress. (2019c). MARC proposal no. 2019-03. https://www.loc.gov/marc/ mac/2019/2019-03.html Library of Congress. (2019d). MARC 21 to BIBFRAME 2.0 conversion specifications. https:// www.loc.gov/bibframe/mtbf/ Linked Data for Production. (2017). LD4P grant proposal. https://wiki.duraspace.org/display/ LD4P/LD4P+Grant+Proposal Malmsten, M. (2008, September 22-26). Making a library catalogue part of the semantic web [Paper presentation]. International Conference on Dublin Core and Metadata Applications 2008, Berlin, Germany. http://dcpapers.dublincore.org/pubs/article/view/927/923 Malmsten, M. (2009). Exposing library data as linked data. In Proceedings of IFLA WLIC 2009. http://disi.unitn.it/~bernardi/Courses/DL/Slides_10_11/linked_data_libraries.pdf McCrae, J. P. (2019). The linked open data cloud: Subclouds by domain. https://lod-cloud.net/#about Network Development and MARC Standards Office. (2006). Mapping of MARC data elements to FRBR and AACR. In Functional Analysis of the MARC 21 Bibliographic and Holdings Formats (Rev. ed.). http://www.loc.gov/marc/marc-functional-analysis/source/table3.pdf Penn Libraries. (n.d.). Jane Austen’s Pride and prejudice / edited by Claudia L. Johnson, Susan J. Wolfson. https://franklin.library.upenn.edu/catalog/FRANKLIN_9939511983503681 Possemato, T. 
(2018). How RDA is essential in the reconciliation and conversion processes for quality Linked Data. JLIS.it, 9(1), 48-60. https://doi.org/10.4403/jlis.it-12447 RDA Steering Committee. (2019). Frequently Asked Questions. RDA Registry. https://www. rdaregistry.info/rgFAQ http://joemls.tku.edu.tw 59陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 Santos, R., Manchado, A., & Vila-Suero, D. (2015, August 15-21). Datos.bne.es: A LOD service and a FRBR-modelled access into the library collections [Paper presentation]. IFLA World Library and Information Congress: 81st IFLA General Conference and Assembly. Cape Town, South Africa. http://library.ifla.org/1085/1/207-santos-en.pdf Schreur, P. (2015). Implications of a linked data transition: Stanford University’s projects and plans. http://www.lib.berkeley.edu/catalog_dept/sites/drupal7.lib.berkeley.edu.catalog_ dept/files/Implications%20of%20a%20Linked%20Data%20Transition.docx SHARE-VDE. (n.d.). http://share-vde.org/sharevde/docBibframe/Work/139617-12 Simon, A., Wenz, R., Michel, V., & Di Mascio, A. (2013). Publishing bibliographic records on the web of data: Opportunities for the BnF (French National Library). In P. Cimiano, O. Corcho, V. Presutti, L. Hollink, & S. Rudolph (Eds.), The Semantic Web: Semantics and Big Data. ESWC 2013: 10th International Conference, ESWC 2013, Montpellier, France, May 26-30, 2013. Proceedings (pp. 563-577). Springer. https://doi.org/10.1007/978-3- 642-38288-8_38 Smith-Yoshimura, K. (2016). Analysis of international linked data survey for implementers. D-Lib Magazine, 22(7-8). https://doi.org/10.1045/july2016-smith-yoshimura Smith-Yoshimura, K. (2018a). Analysis of 2018 international linked data survey for implementer. Code4lib Journal, 42. https://journal.code4lib.org/articles/13867 Smith-Yoshimura, K. (2018b). What metadata managers expect from and value about the research library partnership. Hanging Together. http://hangingtogether.org/?p=6683 Southwick, S. B. (2015). A guide for transforming digital collections metadata into linked data using open source technologies. Journal of Library Metadata, 15(1), 1-35. https://doi.org /10.1080/19386389.2015.1007009 Suominen, O., & Hyvönen, N. (2017). From MARC silos to linked data silos? O-Bib. Das Offene Bibliotheksjournal, 4(2), 1-13. https://doi.org/10.5282/o-bib/2017H2S1-13 University of Michigan Library. (2018). Ao man yu pian jian / Zhen, Aositing zhu ; [Xia Yinghui yi].傲慢與偏見 / 珍.奧斯汀著 ; [夏穎慧譯]. https://search.lib.umich.edu/ catalog/record/014616392?query=Ao+man+yu+pian+jian+Xia+Yinghui+yi&library=U- M+Ann+Arbor+Libraries Vila-Suero, D., & Gómez-Pérez, A. (2013). datos.bne.es and MARiMbA: An insight into library linked data. Library Hi Tech, 31(4), 575-601. https://doi.org/10.1108/LHT-03-2013-0031 Vila-Suero, D., Villazón-Terrazas, B., & Gómez-Pérez, A. (2012). datos.bne.es: A library linked data dataset. Semantic Web, 4(3), 307-313. https://doi.org/10.3233/SW-120094 Villazón-Terrazas, Vilches-Blázquez, L. M., C orcho, O ., & G ómez-Pérez. (2011). Methodological guidelines for publishing government linked data. In D. Wood (Ed.), Linking government data (pp. 27-49). Springer. https://doi.org/10.1007/978-1-4614-1767-5_2 Wenz, R. (2013). Linked open data for new library services: The example of data.bnf.fr. JLIS.it, 4(1), 403-415. 
https://doi.org/10.4403/jlis.it-5509 陳亞寧 0000-0001-7598-1139 温達茂 0000-0003-1525-4815http://joemls.tku.edu.tw 60 教育資料與圖書館學 57 : 1 (2020) 附錄一 MARC21書目資料格式有關LD的 相關欄號、欄位名稱與分欄對照表 欄號 欄 位 名 稱 分 欄 $0 $1 $2 $4 $e $i 033 Date/Time and Place of an Event ◎ ◎ 034 Coded Cartographic Mathematical Data ◎ ◎ 043 Geographic Area Code ◎ ◎ 050 Library of Congress Call Number ◎ ◎ 052 Geographic Classification ◎ ◎ 055 Classification Numbers Assigned in Canada ◎ ◎ 060 National Library of Medicine Call Number ◎ ◎ 070 National Agricultural Library Call Number ◎ ◎ 080 Universal Decimal Classification Number ◎ ◎ 084 Other Classification Number ◎ ◎ 085 Synthesized Classification Number Components ◎ ◎ 086 Government Document Classification Number ◎ ◎ 100 Main Entry-Personal Name ◎ ◎ ◎ ◎ ◎ 110 Main Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ 111 Main Entry-Meeting Name ◎ ◎ ◎ ◎ 130 Main Entry-Uniform Title ◎ ◎ ◎ 240 Uniform Title ◎ ◎ ◎ 251 Version Information ◎ ◎ 257 Country of Producing Entity ◎ ◎ 336 Content Type ◎ ◎ 337 Media Type ◎ ◎ 338 Carrier Type ◎ ◎ 340 Physical Medium ◎ ◎ 344 Sound Characteristics ◎ ◎ 345 Projection Characteristics of Moving Image ◎ ◎ 346 Video Characteristics ◎ ◎ 347 Digital File Characteristics ◎ ◎ 348 Format of Notated Music ◎ ◎ 370 Associated Place ◎ ◎ ◎ ◎ 377 Associated Language ◎ ◎ 380 Form of Work ◎ ◎ 381 Other Distinguishing Characteristics of Work or Expression ◎ ◎ 382 Number of ensembles of the same type ◎ ◎ 385 Audience Characteristics ◎ ◎ 386 Creator/Contributor Characteristics ◎ ◎ ◎ ◎ 388 Time Period of Creation ◎ ◎ 518 Date/Time and Place of an Event Note ◎ ◎ 567 Methodology Note ◎ ◎ 600 Subject Added Entry-Personal Name ◎ ◎ ◎ ◎ ◎ 610 Subject Added Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ 611 Subject Added Entry-Meeting Name ◎ ◎ ◎ ◎ 630 Subject Added Entry-Uniform Title ◎ ◎ ◎ ◎ ◎ 647 Subject Added Entry-Named Event ◎ ◎ ◎ http://joemls.tku.edu.tw 61陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 欄號 欄 位 名 稱 分 欄 $0 $1 $2 $4 $e $i 648 Subject Added Entry-Chronological Term ◎ ◎ ◎ 650 Subject Added Entry-Topical Term ◎ ◎ ◎ ◎ ◎ 651 Subject Added Entry-Geographic Name ◎ ◎ ◎ ◎ ◎ 654 Subject Added Entry-Faceted Topical Terms ◎ ◎ ◎ ◎ ◎ 655 Index Term-Genre/Form ◎ ◎ ◎ 656 Index Term-Occupation ◎ ◎ ◎ 657 Index Term-Function ◎ ◎ ◎ 662 Subject Added Entry-Hierarchical Place Nam ◎ ◎ ◎ ◎ ◎ 700 Added Entry-Personal Name ◎ ◎ ◎ ◎ ◎ ◎ 710 Added Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ ◎ 711 Added Entry-Meeting Name ◎ ◎ ◎ ◎ ◎ 720 Added Entry-Uncontrolled Name ◎ ◎ 730 Added Entry-Uniform Title ◎ ◎ ◎ ◎ ◎ 751 Added Entry-Geographic Name ◎ ◎ ◎ ◎ ◎ 752 Added Entry-Hierarchical Place Name ◎ ◎ ◎ ◎ ◎ 753 System Details Access to Computer File ◎ ◎ 754 Added Entry-Taxonomic Identification ◎ ◎ 758 Resource Identifie ◎ ◎ ◎ ◎ ◎ 760 Main Series Entry ◎ ◎ 762 Subseries Entry ◎ ◎ 765 Original Language Entry ◎ ◎ 767 Translation Entry ◎ ◎ 770 Supplement/Special Issue Entry ◎ ◎ 772 Supplement Parent Entry ◎ ◎ 773 Host Item Entry ◎ ◎ 774 Constituent Unit Entry ◎ ◎ 775 Other Edition Entr ◎ ◎ 776 Additional Physical Form Entry ◎ ◎ 777 Issued With Entry ◎ ◎ 780 Preceding Entry ◎ ◎ 785 Succeeding Entry ◎ ◎ 786 Data Source Entry ◎ ◎ 787 Other Relationship Entry ◎ ◎ 800 Series Added Entry-Personal Name ◎ ◎ ◎ ◎ ◎ 810 Series Added Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ 811 Series Added Entry-Meeting Name ◎ ◎ ◎ ◎ 830 Series Added Entry-Uniform Title ◎ ◎ ◎ 883 Machine-generated Metadata Provenance ◎ ◎ 885 Matching Information ◎ ◎ 註: 本文最後上網查證日期為2019年11月18日。分欄名稱分別是$0-權威記錄控制號或 標準號(authority record control number or standard number)、$1-實際的世界物件(Real World Object,RWO)URI(RWO URI)、$2-標目或用語來源(source 
of heading or term)、$4-關係(relationship)、$e-著作職責用語(relator term)與$i-關係資訊 (relationship information)。 http://joemls.tku.edu.tw 62 教育資料與圖書館學 57 : 1 (2020) 附錄二 MARC21權威資料格式有關LD的 相關欄號、欄位名稱與分欄對照表 欄號 欄 位 名 稱 分 欄 $0 $1 $2 $4 $e $i 024 Other Standard Identifier (R) ◎ ◎ ◎ 034 Coded Cartographic Mathematical Data (R) ◎ ◎ 043 Geographic Area Code ◎ ◎ 050 Library of Congress Call Number ◎ ◎ 052 Geographic Classification ◎ ◎ 055 Library and Archives Canada Call Number ◎ ◎ 060 National Library of Medicine Call Number ◎ ◎ 065 Other Classification Number ◎ ◎ 070 National Agricultural Library Call Number ◎ ◎ 075 Type of Entity ◎ ◎ 080 Universal Decimal Classification Number ◎ ◎ 087 Government Document Classification Number ◎ ◎ 260 Complex See Reference-Subject ◎ ◎ 336 Content Type ◎ ◎ 348 Format of Notated Music ◎ ◎ 360 Complex See Also Reference-Subject ◎ ◎ 368 Other Attributes of Person or Corporate Body ◎ ◎ 370 Associated Place ◎ ◎ ◎ ◎ 372 Field of Activity ◎ ◎ 373 Associated Group ◎ ◎ 374 Occupation ◎ ◎ 376 Family Information ◎ ◎ 377 Associated Language ◎ ◎ 380 Form of Work ◎ ◎ 381 Other Distinguishing Characteristics of Work or Expression ◎ ◎ 382 Medium of Performance ◎ ◎ 385 Audience Characteristics ◎ ◎ 386 Creator/Contributor Characteristics ◎ ◎ ◎ ◎ 388 Time Period of Creation ◎ ◎ 400 See From Tracing-Personal Name ◎ ◎ ◎ 410 See From Tracing-Corporate Name ◎ ◎ ◎ 411 See From Tracing-Meeting Name ◎ ◎ 430 See From Tracing-Uniform Title ◎ ◎ 448 See From Tracing-Chronological Term ◎ ◎ 450 See From Tracing-Topical Term ◎ ◎ 451 See From Tracing-Geographic Name ◎ ◎ 455 See From Tracing-Genre/Form Term ◎ ◎ 462 See From Tracing-Medium of Performance Term ◎ ◎ 480 See From Tracing-General Subdivision ◎ ◎ 481 See From Tracing-Geographic Subdivision ◎ ◎ 482 See From Tracing-Chronological Subdivision ◎ ◎ 485 See From Tracing-Form Subdivision ◎ ◎ 500 See Also From Tracing-Personal Name ◎ ◎ ◎ ◎ ◎ http://joemls.tku.edu.tw 63陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 欄號 欄 位 名 稱 分 欄 $0 $1 $2 $4 $e $i 510 See Also From Tracing-Corporate Name ◎ ◎ ◎ ◎ ◎ 511 See Also From Tracing-Meeting Name ◎ ◎ ◎ ◎ 530 See Also From Tracing-Uniform Title ◎ ◎ ◎ ◎ 547 See Also From Tracing-Named Event ◎ ◎ ◎ ◎ 548 See Also From Tracing-Chronological Term ◎ ◎ ◎ ◎ 550 See Also From Tracing-Topical Term ◎ ◎ ◎ ◎ 551 See Also From Tracing-Geographic Name ◎ ◎ ◎ ◎ 555 See Also From Tracing-Genre/Form Term ◎ ◎ ◎ ◎ 562 See Also From Tracing-Medium of Performance Term ◎ ◎ ◎ ◎ 580 See Also From Tracing-General Subdivision ◎ ◎ ◎ ◎ 581 See Also From Tracing-Geographic Subdivision ◎ ◎ ◎ ◎ 582 S e e A l s o F r o m T r a c i n g - C h r o n o l o g i c a l Subdivision ◎ ◎ ◎ ◎ 585 See Also From Tracing-Form Subdivision ◎ ◎ ◎ ◎ 672 Title Related to the Entity ◎ ◎ 673 Title Not Related to the Entity ◎ ◎ 700 Established Heading Linking Entry-Personal Name ◎ ◎ ◎ ◎ ◎ ◎ 710 Added Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ ◎ 711 Established Heading Linking Entry-Corporate Name ◎ ◎ ◎ ◎ ◎ 730 Established Heading Linking Entry-Uniform Title ◎ ◎ ◎ ◎ ◎ 747 Established Heading Linking Entry-Named Even ◎ ◎ ◎ ◎ ◎ 748 Established Heading Linking Entry-Chronological Term ◎ ◎ ◎ ◎ ◎ 750 Established Heading Linking Entry-Topical Term ◎ ◎ ◎ ◎ ◎ 751 Established Heading Linking Entry-Geographic Name ◎ ◎ ◎ ◎ ◎ 755 Established Heading Linking Entry-Genre/Form Term ◎ ◎ ◎ ◎ ◎ 762 Established Heading Linking Entry-Medium of Performance Term ◎ ◎ ◎ ◎ ◎ 780 Subdivision Linking Entry-General Subdivision ◎ ◎ ◎ ◎ ◎ 781 S u b d i v i s i o n L i n k i n g E n t r y - G e o g r a p h i c Subdivision ◎ ◎ ◎ ◎ ◎ 782 S u b 
d i v i s i o n L i n k i n g E n t r y - C h r o n o l o g i c a l Subdivision ◎ ◎ ◎ ◎ ◎ 785 Subdivision Linking Entry-Form Subdivisio ◎ ◎ ◎ ◎ ◎ 788 Complex Linking Entry Data ◎ ◎ 883 Machine-generated Metadata Provenance ◎ ◎ 885 Matching Information ◎ ◎ 註: 本文最後上網查證日期為2019年11月18日。分欄名稱分別是$0-權威記錄控制號或 標準號(authority record control number or standard number)、$1-實際的世界物件(Real World Object,RWO)URI(RWO URI)、$2-標目或用語來源(source of heading or term)、$4-關係(relationship)、$e-著作職責用語(relator term)與$i-關係資訊 (relationship information)。 http://joemls.tku.edu.tw 64 教育資料與圖書館學 57 : 1 (2020) 附錄三 2筆MARC記錄 記錄範例1 LEADER 01330cam^a22003977a^4500 001 014616392 005 20160519094310.0 008 981203s1992^^^^ch^af^^^^^^^^^000^1^chi^d 020 $a9575453395 020 $a9789575453398 035 $a(OCoLC)213888776 035 $a(OCoLC)ocn213888776 040 $aCUT$beng$cCUT$dOCLCG$dOCLCO$dOCLCQ 041 1 $achi$heng 049 $aEYMG 066 $c$1 099 $aPR 4034 .P75 C5 1992 100 1 $aAusten, Jane,$d1775-1817. 240 1 0 $aPride and prejudice.$lChinese 245 1 0 $601$aAo man yu pian jian /$cZhen, Aositing zhu ; [Xia Yinghui yi]. 245 1 0 $601$a 傲慢與偏見 /$c 珍・奧斯汀著 ; [ 夏穎慧譯 ]. 250 $602$aZai ban. 250 $602$a 再版 . 260 $603$aTaibei Shi :$bZhi wen chu ban she,$c1992. 260 $603$a 台北市 :$b 志文出版社 ,$c 1992. 300 $a2, 428 pages, [4] pages of plates :$billustrations, portraits ;$c20 cm. 336 $atext$btxt$2rdacontent 337 $aunmediated$bn$2rdamedia 338 $avolume$bnc$2rdacarrier 490 0 $604$aXin chao shi jie ming zhu ;$v7 490 0 $604$a 新潮世界名著 ;|v 7 700 1 $605$aXia, Yinghui. 700 1 $605|a 夏穎慧 . 資料來源:University of Michigan Library. (2018). Ao man yu pian jian / Zhen, Aositing zhu ; [Xia Yinghui yi].傲慢與偏見 / 珍・奧斯汀 著 ; [夏穎慧譯]. https://search.lib.umich.edu/catalog/record/0146 16392?query=Ao+man+yu+pian+jian+Xia+Yinghui+yi&library=U- M+Ann+Arbor+Libraries。 http://joemls.tku.edu.tw 65陳亞寧、温達茂:MARC21鏈結資料化的轉變與應用 記錄範例2 LEADER 01368cam a2200421 a 4500 001 9939511983503681 005 20180817000027.0 008 020813s2003 nyuab b 000 1 eng 010 $a 2002030162 020 $a0321105079 (pbk.) 035 $a(OCoLC)ocm50477169 035 $a(OCoLC)50477169 035 $a3951198 035 $a(PU)3951198-penndb-Voyager 040 $aDLC$cDLC$dC#P$dBAKER 043 $ae-uk-en$0http://id.loc.gov/vocabulary/geographicAreas/e-uk- en$2marcgac 049 $aPAUU 050 0 0 $aPR4034$b.P7 2003 082 0 0 $a823/.7$221 100 1 $aAusten, Jane,$d1775-1817. 240 1 0 $aPride and prejudice 245 1 0 $aJane Austen’s Pride and prejudice /$cedited by Claudia L. Johnson, Susan J. Wolfson. 260 $aNew York :$bLongman,$cc2003. 300 $axxxv, 459 p. :$bill., map ;$c21 cm. 440 2 $aA Longman cultural edition 504 $aIncludes bibliographical references (p. 455-459). 600 1 0 $aAusten, Jane,$d1775-1817.$tPride and prejudice. 650 0 $aSocial classes$vFiction. 650 0 $aYoung women$vFiction. 650 0 $aCourtship$vFiction. 650 0 $aSisters$vFiction. 651 0 $aEngland$vFiction. 655 7 $aDomestic fiction.$2lcsh 655 7 $aLove stories.$2gsafd 700 1 $aJohnson, Claudia L. 700 1 $aWolfson, Susan J.,$d1948- 938 $aBaker & Taylor$bBKTY$c8.60$d8.60$i0321105079$n0004069995 $sactive 994 $aC0$bPAU 資料來源:Penn Libraries. (n.d.). Jane Austen’s Pride and prejudice / edited by Claudia L. Johnson, Susan J. Wolfson. https://franklin.library.upenn. edu/catalog/FRANKLIN_9939511983503681。 http://joemls.tku.edu.tw Journal of Educational Media & Library Sciences 57 : 1 (2020) : 35-72 DOI:10.6120/JoEMLS.202003_57(1).0045.RS.AM R es ea rc h A rt ic le A Study on MARC21 Transformation and Application for Linked Data Ya-Ning Chena* Dar-maw Wenb Abstract MARC has been accepted as a standard format for information interchange in libraries for decades. 
Owing to its outdated format, MARC is little known and unused outside of libraries. Moving into the era of the semantic web, the technology of linked data (LD) is regarded as a new approach for libraries to deconstruct library bibliographic data (LBD) into LD. It is therefore worth examining what approach has been adopted to extend MARC into LD, and what its potential benefits are. This study analyzed MARC proposals and discussion papers related to LD as a basis to investigate what changes have been approved for MARC since the LD initiative in 2006. Furthermore, eight use cases, selected from two MARC records and an instance from one MARC proposal, were employed to demonstrate how these MARC changes transform MARC-based LBD into LD in practice, combining classes and properties of the BIBFRAME and RDA bibliographic ontologies. Consequently, the study reveals that RDF triplification has been successfully integrated into MARC. MARC is therefore not only a standard for the communication and representation of bibliographic and related information, but also one for LD in libraries. Issues related to the fundamental definition of the bibliographic entity in the MARC proposals for LD are also discussed.

Keywords: MARC, Linked data, BIBFRAME, RDA ontology, RDFization

a Associate Professor, Department of Information and Library Science, Tamkang University, New Taipei City, Taiwan
b Chief Knowledge Officer, Flysheet Technologies Co., Ltd., Taipei, Taiwan
* To whom all correspondence should be addressed. E-mail: arthur@gms.tku.edu.tw
The Author acknowledges that the Article is distributed under a Creative Commons CC BY-NC 4.0.

SUMMARY

Introduction

MAchine Readable Cataloging (MARC) has been adopted as an international standard for information organization, especially for exchanging and sharing information between library automated systems. As information becomes increasingly networked and digitized, search engines have become an essential tool for finding networked information resources on the Internet. Owing to its outdated format, MARC is not known in non-library domains and sectors. Most MARC-based information is embedded in proprietary library automated systems and exists as an information silo, isolated from the coverage of search engines (Lagace, 2014). On the other hand, Linked Data (LD), initiated by Tim Berners-Lee (2006), has been used as an approach to transform a web of documents into a web of data through URI naming and linking with related resources in an open networked environment. According to the survey of the Linked Open Data Cloud, "bibliography of publications" is one of its categories, which shows the significance of library bibliographic information in the LD domain. LD has thus gained attention from libraries seeking to transform legacy library data into LD and to explore its potential applications through the adoption of LD-related technologies and tools. LD is fundamentally data centric in its design (Di Noia et al., 2016). One of the key points of LD is to employ an ontology as the basis for data modeling, to delineate the relationships between individual LD resources (Hyland et al., 2014; Hyland & Villazón-Terrazas, 2011). Implementers are encouraged to reuse existing authoritative vocabularies in widespread usage to describe common types of data (Villazón-Terrazas et al., 2011).
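To make the vocabulary-reuse principle above concrete, the following minimal sketch (our illustration, not part of the original study) models one bibliographic statement in RDF by reusing the BIBFRAME vocabulary and a VIAF authority URI quoted in the article's use cases; the local URI ex:work1 is a hypothetical placeholder, and the Python rdflib library is assumed:

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDFS

BF = Namespace("http://id.loc.gov/ontologies/bibframe/")  # reused, existing vocabulary
EX = Namespace("http://example.org/")                      # hypothetical local namespace

g = Graph()
g.bind("bf", BF)
work = EX["work1"]                               # placeholder URI for the described resource
viaf = URIRef("http://viaf.org/viaf/102333412")  # authority URI for Austen, Jane, 1775-1817

g.add((work, BF.agent, viaf))                    # reuse bf:agent instead of a home-grown property
g.add((viaf, RDFS.label, Literal("Austen, Jane, 1775-1817")))  # label kept from MARC field 100
print(g.serialize(format="turtle"))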
Although the Functional Requirements for Bibliographic Records (FRBR) and the Bibliographic Framework (BIBFRAME) are conceptual models, in practice they are treated as ontologies by libraries. For example, the National Library and Archive of IRAN (NLAI; Eslami & Vaghefzadeh, 2013), the Biblioteca Nacional de España (BNE; Vila-Suero & Gómez-Pérez, 2013; Vila-Suero et al., 2012) and the Bibliothèque nationale de France (2018) have used FRBR as an ontology for LD transformation, whereas the Linked Data for Production (LD4P) cases have employed BIBFRAME as an ontology to address issues related to LD transformation. Furthermore, the vocabularies and relationships of BIBFRAME and the RDA ontology have been assigned URIs maintained by the LC and the RDA Registry, respectively. Both bibliographic ontologies therefore conform to the requirements for an ontology defined by Berners-Lee et al. (2001) for the semantic web.

There is no doubt that MARC is still employed to organize information in many library automated systems around the world. As a matter of fact, libraries face hybrid requirements for MARC and LD at the same time: they must not only transform MARC into LD, but also bring external LD resources into library automated systems to move users' information navigation toward LD-driven resource discovery. It is therefore of interest to know what changes have been made to MARC, and how they apply in practice, in light of these hybrid requirements for the inclusion of LD.

Literature Review

In total, 18 MARC documents (14 proposals and four discussion papers) published since the term LD was coined in 2006 were selected to investigate the revisions made to MARC for LD applications, covering subfields $0, $1, $2, $4, $e, $i, and tag 758. Furthermore, in this study we checked against two online documents, the MARC21 Format for Bibliographic Data (MFBD) and the MARC21 Format for Authority Data (MFAD), to collate the MARC subfields and tags related to LD applications.

Methodology

First, MFBD and MFAD were selected as target subjects to examine how MARC implements the related LD subfields and tags in practice. Then RDF triplification was performed for MARC. In other words, subfield a of tag 245 in MFBD and subfield a of tag 110 in MFAD were regarded as the subject of RDF, $4 was regarded as the predicate of RDF, and $0 or $1 in both MFBD and MFAD were regarded as the object of RDF. Conversely, $0 or $1 in both MFBD and MFAD could be regarded as the subject of RDF, subfield a of tag 245 in MFBD and subfield a of tag 110 in MFAD as the object of RDF, and $4 still as the predicate of RDF (a minimal code sketch of this triplification rule is given at the end of this section). Third, vocabularies defined by BIBFRAME and the RDA ontology were used as RDF predicates when transforming MARC to LD. Eight use cases, derived from two MFBD records offered by the WebPACs of the University of Michigan Ann Arbor Library and the University of Pennsylvania Libraries, as well as from instances of the MARC documents addressed in the literature review section, were employed to investigate in detail how $0, $1, $2, $4, $e, $i and tag 758 were used to extend MARC to LD. The eight use cases covered the following relationships: authorship, work's uniform title, publisher, content/media/carrier, translator, subject, instance/manifestation, and organization and individual person.
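As an illustration of the triplification rule just described, the sketch below (our addition, with hypothetical example.org identifiers) maps one $0/$4-enriched field to RDF statements; the subfield values are taken from the translator use case (field 700, Xia, Yinghui) reported in this article, and the field is assumed to be already parsed into (code, value) pairs:

SUBJECT_URI = "http://example.org/bib/record1"  # hypothetical URI standing in for 245 $a

# Field 700 from the translator use case, parsed as (subfield code, value) pairs.
field_700 = [
    ("a", "Xia, Yinghui."),
    ("e", "translator"),
    ("4", "https://www.rdaregistry.info/Elements/e/P20037"),  # rdae:P20037, has translator agent
    ("0", "http://worldcat.org/identities/np-xia,%20yinghui/"),
    ("0", "http://catld.ncl.edu.tw/authority/AC000064697"),
]

def triplify(subject_uri, subfields):
    # Subject = the bibliographic entity; predicate = each $4; object = each $0 or $1.
    predicates = [v for c, v in subfields if c == "4"]
    objects = [v for c, v in subfields if c in ("0", "1")]
    return [f"<{subject_uri}> <{p}> <{o}> ." for p in predicates for o in objects]

for triple in triplify(SUBJECT_URI, field_700):
    print(triple)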
Lastly, each use case was provided with a summary table to illustrate the distinction between the original MARC record and the RDFized MARC instance using the vocabularies of the selected bibliographic ontologies (i.e., BIBFRAME and the RDA ontology), in accordance with the RDF triple statement and its RDF graph.

Discussion

MARC is addressed from the following perspectives:
• In terms of LD linkage, MARC records can be internally enriched so as to aggregate external LD resources.
• In terms of information exchange, MARC21 is not only a format for information interchange and sharing, but also an exchange format for sharing MARC-based LD information between library automated systems.
• In terms of the application of ontology, MARC21 has become a data container for bibliographic ontologies (such as BIBFRAME and the RDA ontology), and is also a carrier that reifies bibliographic ontologies in practice.
• In terms of use cases, one of the RDF triplification approaches was used by MARC; that is, subfield a of tag 245 in MFBD and subfield a of tag 110 in MFAD are regarded as the subject of RDF, and $0 or $1 in both MFBD and MFAD as the object of RDF. Conversely, it will be worth knowing whether the opposite RDF triplification approach and syntax (i.e., $0 or $1 in both MFBD and MFAD regarded as RDF's subject, and subfield a of tag 245 in MFBD and subfield a of tag 110 in MFAD as RDF's object) is a workable approach for MARC in the future.
• According to the examination of the eight use cases in this study, the "bibliographic entity" of subfield a of tag 245 in MFBD has stood for various entities, including work and instance in BIBFRAME, or work, expression and manifestation in the RDA ontology. This reveals the need for a consistent definition of subfield a of tag 245 in MFBD when libraries adopt the LD-related MARC subfields and tags. In terms of the structure of BIBFRAME and the RDA ontology, it often takes two or more RDF triple statements to complete the semantic relationships between two individual LD resources. As the eight use cases illustrate, MARC employs a single RDF triple statement to delineate the semantic relationships rather than a complete set of RDF triples, for example for the relationships between BIBFRAME's instance/RDA's manifestation and the publisher. Indeed, a practical guideline is needed to direct libraries in selecting the appropriate BIBFRAME or RDA vocabularies to build up the semantic relationships between LD resources.

Conclusion

According to the analysis of the MARC proposals and discussion papers focused on LD and the eight use cases, it can be seen that the related MARC subfields and tags have been revised to integrate the RDF data model and syntax. Thus, external LD resources can be aggregated into MARC by enrichment. Furthermore, MARC is not only an international format for sharing bibliographic information, but also a container for exchanging MARC-based LD information in libraries. It would be interesting to know whether the RDF-based MARC subfields and tags will be applied to other ontologies in addition to BIBFRAME and the RDA ontology.

ROMANIZED & TRANSLATED REFERENCE FOR ORIGINAL TEXT

Berners-Lee, T. (2006). Linked data: Design issue. https://www.w3.org/DesignIssues/LinkedData.html
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 35-43.
Bibliothèque Nationale de France. (2018). Open data. data.bnf.fr.
Conclusion

According to an analysis of the MARC proposals and discussion papers focused on LD, and the eight use cases, it can be seen that related MARC subfields and tags have been revised to integrate the RDF data model and syntax. Thus external LD resources can be aggregated into MARC by enrichment. Furthermore, MARC is not only an international format for sharing bibliographic information, but also a container for exchanging MARC-based LD information in libraries. It would be interesting to know whether RDF-based MARC subfields and tags will be applied to ontologies other than BIBFRAME and the RDA ontology.

ROMANIZED & TRANSLATED REFERENCE FOR ORIGINAL TEXT

Berners-Lee, T. (2006). Linked data: Design issues. https://www.w3.org/DesignIssues/LinkedData.html
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 35-43.
Bibliothèque Nationale de France. (2018). Open data. https://data.bnf.fr/en/opendata
Casalini, M. (2017, August 15-17). BIBFRAME and Linked Data practices for the stewardship of research knowledge [Paper presentation]. IFLA Satellite Meeting 2017: Digital Humanities, Berlin, Germany. https://dh-libraries.sciencesconf.org/132918/document
Chen, Y.-N. (2017). A review of practices for transforming library legacy records into linked open data. In E. Garoufallou, S. Virkus, R. Siatri, & D. Koutsomiha (Eds.), Metadata and semantic research: 11th International Conference, MTSR 2017, Tallinn, Estonia, November 28 - December 1, 2017, Proceedings (pp. 123-133). Springer. https://doi.org/10.1007/978-3-319-70863-8_12
Cole, T. W., Han, M.-J., Weathers, W. F., & Joyner, E. (2013). Library MARC records into linked open data: Challenges and opportunities. Journal of Library Metadata, 13(2-3), 163-196. https://doi.org/10.1080/19386389.2013.826074
Deliot, C. (2014). Publishing the British National Bibliography as linked open data. Catalogue & Index, 174, 13-18. http://www.bl.uk/bibliographic/pdfs/publishing_bnb_as_lod.pdf
Deliot, C., Wilson, N., Costabello, L., & Vandenbussche, P.-Y. (2016, October 13-16). The British National Bibliography: Who uses our linked data? [Paper presentation]. International Conference on Dublin Core and Metadata Applications 2016, Copenhagen, Denmark. http://dcpapers.dublincore.org/pubs/article/download/3820/2005
Di Noia, T., Ragone, A., Maurino, A., Mongiello, M., Marzocca, M. P., Cultrera, G., & Bruno, M. P. (2016). Linking data in digital libraries: The case of Puglia Digital Library. In A. Adamou, E. Daga, & L. Isaksen (Eds.), Proceedings of the 1st Workshop on Humanities in the Semantic Web co-located with 13th ESWC Conference 2016 (pp. 27-38). CEUR-WS. http://ceur-ws.org/Vol-1608/paper-05.pdf
Eslami, S., & Vaghefzadeh, M. H. (2013, August 17-23). Publishing Persian linked data of national library and archive of Iran [Paper presentation]. IFLA World Library and Information Congress: 79th IFLA General Conference and Assembly, Singapore. http://library.ifla.org/193/1/222-eslami-en.pdf
Hyland, B., Atemezing, G. A., & Villazón-Terrazas, B. (2014). Best practices for publishing linked data. W3C. https://dvcs.w3.org/hg/gld/raw-file/cb6dde2928e7/bp/index.html
Hyland, B., & Villazón-Terrazas, B. (2011, 15 March). Linked data cookbook. W3C. https://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
Lagace, N. (2014). Pre-standards initiatives: Bibliographic roadmap and altmetrics. Information Standards Quarterly, 26(3), 23-26. https://doi.org/10.3789/isqv26no3.2014.06
Lampert, C. K., & Southwick, S. B. (2013). Leading to linking: Introducing linked data to academic library digital collections. Journal of Library Metadata, 13(2-3), 230-253. https://doi.org/10.1080/19386389.2013.826095
Library of Congress. (1998). MARC proposal no. 1998-10. https://www.loc.gov/marc/marbi/1998/98-10.html
Library of Congress. (2009). MARC proposal no. 2009-06/1. https://www.loc.gov/marc/marbi/2009/2009-06-1.html
Library of Congress. (2010). MARC proposal no. 2010-06. https://www.loc.gov/marc/marbi/2010/2010-06.html
Library of Congress. (2015). MARC proposal no. 2015-07. https://www.loc.gov/marc/mac/2015/2015-07.html
Library of Congress. (2016a). MARC discussion paper no. 2016-DP06. https://www.loc.gov/marc/mac/2016/2016-dp06.html
Library of Congress. (2016b). MARC discussion paper no. 2016-DP18. https://www.loc.gov/marc/mac/2016/2016-dp18.html
Library of Congress. (2016c). MARC discussion paper no. 2016-DP19. https://www.loc.gov/marc/mac/2016/2016-dp19.html
Library of Congress. (2016d). MARC discussion paper no. 2016-DP21. https://www.loc.gov/marc/mac/2016/2016-dp21.html
Library of Congress. (2017a). MARC proposal no. 2017-01. https://www.loc.gov/marc/mac/2017/2017-01.html
Library of Congress. (2017b). MARC proposal no. 2017-02. https://www.loc.gov/marc/mac/2017/2017-02.html
Library of Congress. (2017c). MARC proposal no. 2017-03. https://www.loc.gov/marc/mac/2017/2017-03.html
Library of Congress. (2017d). MARC proposal no. 2017-06. https://www.loc.gov/marc/mac/2017/2017-06.html
Library of Congress. (2017e). MARC proposal no. 2017-08. https://www.loc.gov/marc/mac/2017/2017-08.html
Library of Congress. (2017f). MARC proposal no. 2017-09. https://www.loc.gov/marc/mac/2017/2017-09.html
Library of Congress. (2018a). MARC discussion paper no. 2018-DP07. https://www.loc.gov/marc/mac/2018/2018-dp07.html
Library of Congress. (2018b). MARC proposal no. 2018-FT01. https://www.loc.gov/marc/mac/2018/2018-ft01.html
Library of Congress. (2019a). MARC advisory committee. https://www.loc.gov/marc/mac/advisory.html
Library of Congress. (2019b). MARC proposal no. 2019-02. https://www.loc.gov/marc/mac/2019/2019-02.html
Library of Congress. (2019c). MARC proposal no. 2019-03. https://www.loc.gov/marc/mac/2019/2019-03.html
Library of Congress. (2019d). MARC 21 to BIBFRAME 2.0 conversion specifications. https://www.loc.gov/bibframe/mtbf/
Linked Data for Production. (2017). LD4P grant proposal. https://wiki.duraspace.org/display/LD4P/LD4P+Grant+Proposal
Malmsten, M. (2008, September 22-26). Making a library catalogue part of the semantic web [Paper presentation]. International Conference on Dublin Core and Metadata Applications 2008, Berlin, Germany. http://dcpapers.dublincore.org/pubs/article/view/927/923
Malmsten, M. (2009). Exposing library data as linked data. In Proceedings of IFLA WLIC 2009. http://disi.unitn.it/~bernardi/Courses/DL/Slides_10_11/linked_data_libraries.pdf
McCrae, J. P. (2019). The linked open data cloud: Subclouds by domain. https://lod-cloud.net/#about
Network Development and MARC Standards Office. (2006). Mapping of MARC data elements to FRBR and AACR. In Functional Analysis of the MARC 21 Bibliographic and Holdings Formats (Rev. ed.). http://www.loc.gov/marc/marc-functional-analysis/source/table3.pdf
Penn Libraries. (n.d.). Jane Austen's Pride and prejudice / edited by Claudia L. Johnson, Susan J. Wolfson. https://franklin.library.upenn.edu/catalog/FRANKLIN_9939511983503681
Possemato, T. (2018). How RDA is essential in the reconciliation and conversion processes for quality Linked Data. JLIS.it, 9(1), 48-60. https://doi.org/10.4403/jlis.it-12447
RDA Steering Committee. (2019). Frequently Asked Questions. RDA Registry. https://www.rdaregistry.info/rgFAQ
Santos, R., Manchado, A., & Vila-Suero, D. (2015, August 15-21). Datos.bne.es: A LOD service and a FRBR-modelled access into the library collections [Paper presentation]. IFLA World Library and Information Congress: 81st IFLA General Conference and Assembly, Cape Town, South Africa. http://library.ifla.org/1085/1/207-santos-en.pdf
Schreur, P. (2015). Implications of a linked data transition: Stanford University's projects and plans. http://www.lib.berkeley.edu/catalog_dept/sites/drupal7.lib.berkeley.edu.catalog_dept/files/Implications%20of%20a%20Linked%20Data%20Transition.docx
SHARE-VDE. (n.d.). http://share-vde.org/sharevde/docBibframe/Work/139617-12
Simon, A., Wenz, R., Michel, V., & Di Mascio, A. (2013). Publishing bibliographic records on the web of data: Opportunities for the BnF (French National Library). In P. Cimiano, O. Corcho, V. Presutti, L. Hollink, & S. Rudolph (Eds.), The Semantic Web: Semantics and Big Data. ESWC 2013: 10th International Conference, ESWC 2013, Montpellier, France, May 26-30, 2013, Proceedings (pp. 563-577). Springer. https://doi.org/10.1007/978-3-642-38288-8_38
Smith-Yoshimura, K. (2016). Analysis of international linked data survey for implementers. D-Lib Magazine, 22(7-8). https://doi.org/10.1045/july2016-smith-yoshimura
Smith-Yoshimura, K. (2018a). Analysis of 2018 international linked data survey for implementers. Code4lib Journal, 42. https://journal.code4lib.org/articles/13867
Smith-Yoshimura, K. (2018b). What metadata managers expect from and value about the research library partnership. Hanging Together. http://hangingtogether.org/?p=6683
Southwick, S. B. (2015). A guide for transforming digital collections metadata into linked data using open source technologies. Journal of Library Metadata, 15(1), 1-35. https://doi.org/10.1080/19386389.2015.1007009
Suominen, O., & Hyvönen, N. (2017). From MARC silos to linked data silos? O-Bib. Das Offene Bibliotheksjournal, 4(2), 1-13. https://doi.org/10.5282/o-bib/2017H2S1-13
University of Michigan Library. (2018). Ao man yu pian jian / Zhen, Aositing zhu ; [Xia Yinghui yi].傲慢與偏見 / 珍.奧斯汀著 ; [夏穎慧譯]. https://search.lib.umich.edu/catalog/record/014616392?query=Ao+man+yu+pian+jian+Xia+Yinghui+yi&library=U-M+Ann+Arbor+Libraries
Vila-Suero, D., & Gómez-Pérez, A. (2013). datos.bne.es and MARiMbA: An insight into library linked data. Library Hi Tech, 31(4), 575-601. https://doi.org/10.1108/LHT-03-2013-0031
Vila-Suero, D., Villazón-Terrazas, B., & Gómez-Pérez, A. (2012). datos.bne.es: A library linked data dataset. Semantic Web, 4(3), 307-313. https://doi.org/10.3233/SW-120094
Villazón-Terrazas, B., Vilches-Blázquez, L. M., Corcho, O., & Gómez-Pérez, A. (2011). Methodological guidelines for publishing government linked data. In D. Wood (Ed.), Linking government data (pp. 27-49). Springer. https://doi.org/10.1007/978-1-4614-1767-5_2
Wenz, R. (2013). Linked open data for new library services: The example of data.bnf.fr. JLIS.it, 4(1), 403-415. https://doi.org/10.4403/jlis.it-5509

Ya-Ning Chen 0000-0001-7598-1139
Dar-maw Wen 0000-0003-1525-4815

Journal of Educational Media & Library Sciences, Vol. 57, no.
1 (2020): 35-72.

work_egk7ytcxafai3lhf45eprwe2ya ---- Genes and Immunity (2000) 1, 169. © 2000 Macmillan Publishers Ltd. All rights reserved 1466-4879/00 $15.00 www.nature.com/gene

Publisher's announcement

Macmillan Publishers Ltd is pleased to be able to announce the creation of a new company. Nature Publishing Group brings together Nature, the Nature monthly titles and the journals formerly published by Stockton Press. Stockton Press becomes the Specialist Journals division of Nature Publishing Group. The new company will be a partner of the scientific community and will be an innovative, responsive and visible presence in scientific and medical publishing. Nature Publishing Group will use its unique strengths, skills and global perspective to meet the demands of a rapidly changing and challenging publishing environment. The Group's publications are known for delivering high-quality, high-impact content, fair pricing, rapid publication, global marketing and a substantial presence on the Internet. These elements are the key to excellence in selecting, editing, enhancing and delivering scientific information in the future. As a company, we have three core values: quality, service and visibility. These values are set to benefit all our customers (authors, readers, librarians, societies, companies and others), thus building strong publishing relationships.

Genes and Immunity

Genes and Immunity is now part of the Specialist Journals division of Nature Publishing Group. It will be marketed and sold from our offices in New York, Tokyo, London and Basingstoke. Within the electronic environment, Genes and Immunity will benefit from a substantial investment in innovative online publishing systems, offering global access, intelligent searches and other essential functions. Librarians will be able to provide their readers with print and online versions of Genes and Immunity through a variety of services including OCLC, Ingenta (linking to the BIDS service), SwetsNet, Ebsco, Dawson's InfoQuest and Adonis. At a time when the basis of traditional journal publishing is undergoing significant changes, Nature Publishing Group aims to support the scientific and medical community's needs for high-quality publication services which provide rapid and easy access to the best of biomedical and clinical results.
Jayne Marks, Publishing Director, Specialist Journals, Nature Publishing Group

work_eh6ygj4refcinklq7pquozqieq ---- An interactive reading environment for online scholarly journals: The Open Journal Systems Reading Tools

Rick Kopak and Chia-Ning Chiang
Published in OCLC Systems & Services, 25 (2): 114-124.

Introduction

Open Journal Systems (OJS) is the result of a major research and development effort carried out under the auspices of the Public Knowledge Project (PKP), founded in 1998 by John Willinsky. OJS was developed to facilitate greater access to scholarly research by providing an open source platform for the production and distribution of the main coin of the academic research process, the scholarly journal article. As a production system, OJS enables and supports work processes at every stage of the overall publication process, from initial submission to final publication. In support of PKP's central goal of increasing access to public knowledge, OJS is freely available, is locally installed and controlled, and is available in more than ten languages through the larger efforts of the world-wide OJS community. From the system user's point of view, OJS supports activities for three primary roles: editors, authors, and readers. OJS management tools are based largely on the workflow found in typical journal publishing environments. Editors can easily set up the system to meet their local requirements and carry out editorial duties efficiently, e.g. accepting and tracking submissions, coordinating peer review, automatic notification of overdue reviews, and template-based responses for email communications to reviewers and authors. Authors can upload files directly to the journal's OJS installation, enter metadata for OAI indexing (to enable resource discovery), track the submission process, accept reviews, and resubmit revised copies (Willinsky, 2005). The third major role that OJS supports is that of the reader of the journal's published articles. As a web-based system delivering both HTML and Portable Document Format versions of articles, OJS provides the reader with all of the convenience and efficiencies of distributed electronic document systems, e.g. anytime/anywhere access, cut and paste, and search. But carried along with this sense of convenience is the suspicion that once readers receive an article at their computers, they will still choose to print it out before reading it (Schilit et al., 1999). In an effort to further engage the reader at the interface, and to improve the overall reading environment, a set of Reading Tools were (and continue to be) developed and integrated into OJS. Building upon the exemplary models offered by Highwire Press, PubMed, and others, the goal of the initial stages of tool development was to leverage existing online resources and to provide a functionality that would increase the general level of engagement with the materials "without adding significantly to the journal's cost or the editor's workload" (Willinsky, 2005). More recently, development efforts have focused on creating annotation and linking components as part of the general suite of Reading Tools, with the goal of facilitating even more interaction with journal content. This article reviews the development and specific purposes of the Reading Tools and outlines plans for future development.
Reading Tools

An important initiative within OJS is the development of a set of Reading Tools with the purpose of enhancing the online reading experience and improving the level of critical engagement with the content of the journal articles published within the system. Critical engagement is understood in this context as the interplay between information as encountered and the analysis and use of that information. Typically, critical engagement involves aspects of meaning making and comprehension, and can be signified by recognition of nuance in information presented, the ability to draw important distinctions between competing perspectives and positions, and the ability to examine and interpret evidence, cause and effect, and so on (Monroe, 2003; Salvo, 2002). Furthermore, critical engagement is viewed as a product of an "active reading" strategy that, in its ideal form, integrates the critical, interpretive, and creative aspects of information use. Stated differently, active reading "is the combination of reading with critical thinking and learning, and is a fundamental part of education and knowledge work" (Schilit, Golovchinsky, and Price, 1998). As noted by Schilit et al. (1998), finding related resources and moving between them is an important aspect of active reading, but "[t]his activity of finding related materials while reading is often disconnected from the main activity. Typical information retrieval interfaces force users to suspend their reading activity to identify and then to retrieve related documents" [p. 251]. A major design focus in the development of Reading Tools is to keep focus on the article while at the same time enabling guided navigation to related materials that elaborate on the context of the article. As such, an attempt has been made to move from simple information retrieval toward information processing and elaboration of context.

Reading Tool Components

The Reading Tools themselves are represented as a set of grouped links on the right-hand edge of the OJS journal HTML view. Figure 1 shows the Reading Tools in relation to a journal article presented with the default OJS stylesheet.

Structural Items

The Reading Tools are divided into two major components. The first set, immediately under the author's name in Figure 1, represents access to information and services that are more structural than topical or domain specific. "Abstract", for example, provides the abstract in a separate window for those cases in which the abstract is not shown as part of the article itself. "Review policy" provides information about the journal's policies for receiving and reviewing submissions, e.g. whether there is an open submission policy, whether the journal is peer-reviewed, and whether the journal is indexed by a database utility. Readers, for example, may wish to know whether the article is peer reviewed. "How to cite item" enables the export of basic citation data to popular bibliographic management software (e.g. EndNote, ProCite, and Reference Manager) and aids the reader in easily collecting citation data in the early stages of information seeking. "Indexing metadata" includes fifteen descriptive elements (based on Dublin Core; see Willinsky, 2005, p. 515). This is useful to readers in providing additional information about the article, but also enables effective harvesting, and as a consequence maximizes resource discovery.
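As a rough sketch of how this indexing metadata can be consumed downstream (our illustration, not part of the Reading Tools themselves), a harvester can request the Dublin Core records an OJS journal exposes over OAI-PMH. The base URL below is a hypothetical placeholder; the exact endpoint path depends on the installation.

```python
# A minimal OAI-PMH harvest of the Dublin Core metadata exposed by an OJS
# journal. BASE_URL is an assumed placeholder endpoint.
import requests
import xml.etree.ElementTree as ET

BASE_URL = "https://example.org/index.php/myjournal/oai"  # placeholder
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

resp = requests.get(
    BASE_URL,
    params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
    timeout=30,
)
root = ET.fromstring(resp.content)

# Print the title and creators of each harvested record
for rec in root.iter(OAI + "record"):
    title = rec.find(".//" + DC + "title")
    creators = [c.text for c in rec.iter(DC + "creator")]
    if title is not None:
        print(title.text, "-", "; ".join(filter(None, creators)))
```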
Figure 1. HTML View of Journal Article and Associated Reading Tools

"Supplementary files" include downloadable supporting materials associated with the article, such as research instruments and datasets or spreadsheets. "About the author", "Notify colleague", and "Email the author" are mechanisms for connecting readers and authors. As it suggests, "About the author" provides useful information about the author and informs the reader of competencies and areas of specialization. "Notify colleague" provides an email form with the title of the article in the subject line, while "Email the author" provides the same functionality but with the author's email address inserted. "Look up terms" enables the reader to easily look for definitions of terms within the article itself. The reader can do so either by double-clicking on a term within the article text, or by engaging the link and entering keywords directly into the search table. Once initiated, the reader also has the ability to select from a number of open access dictionaries and encyclopedias in which to search for the definition.

Related Items

The OJS administrative system organizes Reading Tools into 'Disciplines'. Within each discipline the Reading Tools break down into 'Related Items', where each related item is associated with specific 'Resources'. Journal managers and editors can create and modify Disciplines, Related Items, and the Resources searched according to their requirements. Currently, Reading Tools offer Related Item sets for nineteen different domains, e.g. education, biology, business, economics. As such, resources are generally domain specific and seek to accommodate the information needs of readers within a particular discipline. The categorization of resources is designed to enhance material mastery. Material mastery skills, as defined by Covi (1999), are developed and maintained through information use and reflect differences in disciplinary search strategies, disciplinary materials, and field integration. Resources may also reflect other important differences that are not necessarily domain specific. For example, the National Library of Australia recently reported on their customization of OJS Reading Tools (Graham, 2007). The Infrastructure Group of the NLA reviewed the standard set of Reading Tools delivered with the OJS software and identified additional resources that focused more specifically on Australian interests. Primarily, though, Related Items enable discovery of resources in the domain area (discipline) of the journal [1]. Table 1 provides examples of frequently used contexts (Related Item elements) found in many OJS journals. Although the elements are functionally equivalent across various domains, the specific resources that they lead to are domain specific. For example, "Related Studies" provides a functionally equivalent search, i.e. it will search databases, bibliographies etc., for studies related to the current article, but the resources searched are specific to the domain of the journal being used. In education the "Related Studies" context would include a search of the Education Resources Information Center (ERIC), while in a computer science journal it would search, for example, the HCI Bibliography. Beyond this basic set of Related Items, the editors of journals may choose to add additional contexts to enable direct access to resources that are specific to the topical focus of the journal. For example, OJS provides a Related Item set for astrophysics.
Using this set, the editor can add the specific context 'Astro Data', which provides access to data at the UK Astronomy Data Centre and NASA's High Energy Astrophysics Science Archive Research Center (HEASARC). Editors can also create their own contexts, adding to them relevant resources to be searched.

Table 1. Reading Tool Related Items and Examples of Context Elements

• Definitions (Look up Terms): Look up definitions of terms in the article by double-clicking on a term in the article or by typing it in. These resources have been selected because of their relevance and open accessibility. Databases searched: Columbia Encyclopedia; Google Definition; Webster Online
• Author's work: Search other articles by the author in related databases. Databases searched: BBC Learning; High Beam Research; RAND Research
• Government Policy: Search government information from official government sites. Databases searched: Government of Canada; FirstGov (US); Directgov (UK); Europa
• Book: Search whole or partially available books that are freely readable over the Internet. Databases searched: Google Print; Online Books Page; Books-On-Line; Universal Library
• Book reviews: Search free online book reviews. Databases searched: H-Net Reviews; New York Review of Books; CM: Canadian Review of Materials; Leonardo Digital Book Reviews
• Dissertations: Search open access dissertations. Databases searched: CRL Foreign Doctoral Dissertations Databases; Dissertation.com; Networked Digital Library of Theses and Dissertations Union Catalog (NDLTD); Scirus ETD Search
• Online forums: Search online meeting or discussion information. Databases searched: H-Net, Humanities & Social Sciences Online; Liszt; MInd: the Meetings Index
• Quotations: Search open-access quotable quotes. Databases searched: Bartlett's Familiar Quotations; Quotations Page
• Pay-per-view: Search articles or book chapters available through pay-per-view services for individual scholars. Databases searched: Ingenta; Ebrary; Questia; Wiley InterScience Pay-per-view Service
• Related studies: Search research information from related resources. Databases searched: BBC Learning; High Beam Research; RAND Research
• Media Reports: Search major news websites or aggregated news websites for news reports worldwide. Databases searched: Google News; Globe and Mail; People's Daily; National Public Radio; New York Times; The Japan Times Online; The Moscow Times; Washington Post; Newsdirectory
• Web search: Broaden searches by using major search engines. Databases searched: Google; Google Scholar; Clusty the Clustering Engine; Vivisimo

Annotation and Linking Tools

Annotation has long been recognized as a fundamental component of active reading strategies. Adler (1940), writing about books, stated that writing in the margins, between the lines, and on the back and front covers (a distinct form of annotation) is indispensable to reading. "[R]eading, if it is active, is thinking, and thinking tends to express itself in words, spoken or written. The marked book is usually the thought-through book" [p. 11]. Reading scholarly journal articles is also viewed as a highly directed and interactive activity (Adler et al., 1998; Bishop, 1998; Marshall et al., 1999; Kaplan & Chisik, 2005). Readers actively pursue meaning, carrying on a mental dialogue with the writer. Previous research on the annotation habits of students and professionals demonstrated that readers' annotations are also highly goal-oriented (Wolfe, 2000; Marshall et al., 1999; Marshall & Brush, 2004).
Research also illustrates the importance of annotations in writing while reading, including discussion of use and form (Marshall & Shipman, 1997; Marshall, 1998), the influence of annotation on readers (O'Hara et al., 1998), and their role in scholarly communication (Furuta & Urbina, 2002). Schilit et al. (1998) enumerate three specific advantages of the direct annotation of documents that reflect an active reading strategy:
• Convenience: Since annotations are directly integrated with the reading material, writers do not have to swap to a different tool to make statements about the content. Annotating the document in situ does not interrupt the flow of reading.
• Immersion in document context: Annotations contain more information because they occur within the "context" of a document (rather than as isolated information objects).
• Visual search: Annotations stand out visually in a document, allowing readers to easily scan for them.
Toward the goal of further enabling an active reading strategy, we have created a prototype annotation and linking tool that is currently being tested with users (Kopak & Chiang, 2007). Figure 2 shows the annotation component in the context of an OJS HTML journal article.

Figure 2. OJS with Annotation Tool

The annotation area is currently represented as marginalia. To create an annotation, the reader highlights the article text that will correspond to the annotation and clicks on a vertical bar that activates a text box in the margin next to the area to be annotated. The reader can then enter free-form text, or cut and paste from other parts of the article, or from other sources altogether. For example, a reader may want to look up a definition of a term in the text using the existing "Look up Terms" facility, cut the desired definition that is returned, and paste the definition into the annotation space. The linking component builds on previous research (Kopak, 1999; 2000; 2002) in which a set of link types was developed to describe the functional relationship between nodes of information identified within a text or, more importantly, between texts. A link type in this instance is a label that describes the nature of the relationship between the two nodes of information, e.g. 'illustrates', 'defines', 'exemplifies'.

Figure 3. OJS Journal Article with Link Type Pull Down Menu (no additional Reading Tools)

In the existing prototype, a reader can highlight a word, phrase, sentence, etc. in a given article within an OJS installation and create a hypertext link to a paragraph in another journal article by first locating, then clicking on, the corresponding paragraph at the target end of the link. A reader can also annotate the link via the feature described above, or select from the pre-existing set of link types arrayed in a pull-down menu in the annotation margin (see Figure 3). The goal in developing the linking and typing tools is to instantiate a hypertext component into OJS that would take advantage of linking's ability to provide an additional form of annotation. If one of the functions of annotation is to provide a means to connect a related piece of information to a content segment in context, we can accomplish this as well by linking directly to an existing piece of information rather than creating a new one. Additionally, we sought to enable readers to add value to the link by attributing the functional nature of the relationship between the content segment and the linked item. We did so by enabling the easy application of link types characterizing these kinds of relationships (i.e., the linked item defines, illustrates, compares, etc., the content segment).
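A minimal sketch of how such a typed, annotated link might be represented follows. The class, field names, and the small type vocabulary here are our assumptions for illustration; the prototype's actual schema is not given in the article.

```python
# An illustrative data model for a typed hypertext link between two OJS
# articles. All names here are assumptions, not the prototype's real schema.
from dataclasses import dataclass
from typing import Optional

LINK_TYPES = {"illustrates", "defines", "exemplifies", "compares"}

@dataclass
class TypedLink:
    source_article: str              # URL of the article being read
    source_span: tuple[int, int]     # character offsets of the highlighted text
    target_article: str              # URL of the linked article
    target_paragraph: int            # paragraph clicked at the target end
    link_type: Optional[str] = None  # functional relationship, if assigned
    note: Optional[str] = None       # free-form annotation attached to the link

    def __post_init__(self):
        if self.link_type is not None and self.link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.link_type}")

# Example: a reader links a highlighted phrase to a defining paragraph elsewhere
link = TypedLink(
    source_article="https://example.org/ojs/article/view/12",
    source_span=(1042, 1097),
    target_article="https://example.org/ojs/article/view/7",
    target_paragraph=4,
    link_type="defines",
)
```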
Furthermore, the linking tool enables readers to leverage the functional relationships to increase the level of coherence between text segments of journal articles being read and annotated within the system. Subsequently, readers become authors by creating functionally meaningful relationships between these fragments. Movement through the articles within the system can be recalled and retraced, with navigation guided by the purpose and role that associated information has in facilitating comprehension.

Future Work

User testing of the Reading Tools continues (Willinsky, 2004; Siemens et al., 2006) and results are useful and encouraging in suggesting new ways of providing these kinds of resources to readers. PKP has also developed a Web-based XML service enabling easy conversion of submissions into a standard, well-structured layout format. In addition to simplifying work processes, XML offers the potential of an increased ability for readers to interact with journal documents at the interface. For example, a reader could select different views of an article depending on the stage of the information seeking process at which they find themselves, choosing perhaps to display only the abstract, conclusion, and bibliography in the early stages of the research process. An important area for future development concerns the role of social interaction in an environment like that offered by OJS. The annotation prototype largely enables annotation and linking for personal use. We are currently carrying out a study that investigates the role of social input to content description and navigation via the annotation component. Specifically, the investigation seeks to identify the kinds of annotations that readers would and would not find useful in an environment where one could contribute their annotations (or a selection of them) to a public space. Future possibilities for the linking component include investigation of the social construction of hypertexts. Interest here lies in the ability to construct navigable planes through an information space represented by a collection of journal articles provided in a system like OJS. Readers, for example, could choose to publish the links resulting from their interactions with a particular OJS corpus. Commonly identified links would form the basis of a publicly available network of paths. In every way, future research and development efforts around Reading Tools will continue to support the goal of providing a rich, interactive information environment for readers.

Endnotes

[1] Choosing good sources is a powerful means of setting readers off in the right direction. It was important, therefore, to choose databases for Related Items searches based on specific criteria. Extant research does set out some criteria for selection of Internet resources (Zhang, 2001). Based on this research, eight essential selection criteria for resource evaluation were used to assess resources for the Reading Tools. Resources had to be easily accessible, of scholarly content, open access, of high credibility, maintain a significant collection size, be sustainable, facilitate material mastery, and provide diversity.

Bibliography

Adler, A., Gujar, A., Harrison, B., O'Hara, K., and Sellen, A. (1998) "A diary study of work-related reading: Design implications for digital reading devices," Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '98), Los Angeles, pp. 241-248.
Adler, M. (1940) "How to mark a book", Saturday Review of Literature, July 6, pp. 11-12.
Bishop, A. (1998) "Digital libraries and knowledge disaggregation: The use of journal article components," Proceedings of the Third ACM Conference on Digital Libraries, Pittsburgh, pp. 29-39.
Covi, L. (1999) "Material mastery: Situating digital library use in university research practices", Information Processing & Management, Vol 35 No 3, pp. 293-316.
Furuta, R., and Urbina, E. (2002) "On the characteristics of scholarly annotations," Proceedings of the Thirteenth ACM Conference on Hypertext and Hypermedia (Hypertext 2002), pp. 78-79. Available http://www.csdl.tamu.edu/cervantes/pubs/HT02cp.pdf
Graham, S. (2007) "The National Library of Australia: Open access to Open Publish", First International PKP Scholarly Publishing Conference, Simon Fraser University, Vancouver, BC, Canada. Available http://ocs.sfu.ca/pkp2007/viewabstract.php?id=1
Kaplan, N., and Chisik, Y. (2005) "Reading alone together: Creating sociable digital library books," Proceedings of the 2005 Conference on Interaction Design and Children (Boulder), pp. 88-94.
(1998) “A diary study of work– related reading: Design implications for digital reading devices,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’98), Los Angeles, pp. 241–248. Adler, M. (1940) “How to mark a book”, Saturday Review of Literature, July 6, pp. 11-12. Bishop, A. (1998) “Digital libraries and knowledge disaggregation: The use of journal article components,” Proceedings of the Third ACM Conference on Digital Libraries, Pittsburgh, pp. 29–39. Covi, L. (1999) “Material mastery: Situating digital library use in university research practices”, Information Processing & Management, Vol 35 No 3, pp. 293-316. Furuta, R., and Urbina, E. (2002) “On the characteristics of scholarly annotations,” Proceedings of the Thirteenth ACM Conference on Hypertext and Hypermedia (Hypertext 2002), pp. 78–79. Available http://www.csdl.tamu.edu/cervantes/pubs/HT02cp.pdf Graham, S. (2007) “The National Library of Australia: Open access to Open Publish”, First International PKP Scholarly Publishing Conference, Simon Fraser University, Vancouver, BC, Canada. Available http://ocs.sfu.ca/pkp2007/viewabstract.php?id=1 Kaplan, N., and Chisik, Y. (2005) “Reading alone together: Creating sociable digital library books,” Proceeding of the 2005 Conference on Interaction Design and Children (Boulder), pp. 88–94. http://www.csdl.tamu.edu/cervantes/pubs/HT02cp.pdf http://www.csdl.tamu.edu/cervantes/pubs/HT02cp.pdf http://ocs.sfu.ca/pkp2007/viewabstract.php?id=1 http://ocs.sfu.ca/pkp2007/viewabstract.php?id=1 Kopak, R., and Chiang, C. (2007) "Annotating and linking in the Open Journal Systems", First Monday, Vol 12 No 10. Available (1 October 2007) http://www.uic.edu/htbin/cgiwrap/bin/ ojs/index.php/fm/article/view/1961/1838 Kopak, R. (2002) “Link typing in hypertext: Defining Conceptual Attributes.” Proceedings of the Canadian Association for Information Science, Toronto, ON, pp. 215-222. Kopak, R. (2000) A Taxonomy of Link Types for Use in Hypertext. Unpublished Ph.D. dissertation, University of Toronto. Kopak, R. (1999) “Functional link typing in hypertext,” ACM Computing Surveys, Vol 31 No 4, pp.16-22. Marshall, C. (1998) “Toward an ecology of hypertext annotation,” Proceedings of Ninth ACM Conference on Hypertext and Hypermedia (Pittsburgh), pp. 40–49. Available http:// csdl.tamu.edu/~marshall/ht98-final.pdf Marshall, C., and Brush, A. (2004) “Exploring the relationship between personal and public annotations,” Proceedings of the Joint Conference on Digital Libraries - JCDL ’07 (Tuscon), pp. 349–357. Available http://csdl.tamu.edu/~marshall/112-marshall.pdf Marshall, C., Price, M., Golovchinsky, G., and Schilit, B. (1999) “Introducing a digital library reading appliance into a reading group,” Proceedings of ACM Digital Libraries (Berkeley), pp. 77–84. Available http://csdl.tamu.edu/~marshall/ht97.pdf Marshall, C., and Shipman, F. (1997). “Effects of hypertext technology on the practice of information triage,” Proceedings of the ACM Hypertext ’97 Conference (Southampton), pp. 124–133. Available http://csdl.tamu.edu/~marshall/ht97.pdf Monroe, B. (2003) “Fostering critical engagement in online discussions: The Washington State University study”, Washington Center for Improving the Quality of Undergraduate Education Newsletter, Fall, pp. 31-33. O’Hara, K., Smith, F., Newman, W., and Sellen, A. (1998) “Student readers’ use of library documents: Implications for library technologies,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’98), Los Angeles), pp. 
Salvo, M. (2002) "Critical engagement with technology in the computer classroom", Technical Communication Quarterly, Vol 11 No 3, pp. 317-337.
Schilit, B., Golovchinsky, G., and Price, M. (1998) "Beyond paper: Supporting active reading with free form digital ink annotations", Proceedings of the CHI 98 Conference on Human Factors in Computing Systems, April, pp. 249-256.
Schilit, B., Price, M., Golovchinsky, G., Tanaka, K., and Marshall, C. (1999) "As we may read: The reading appliance revolution", Computer, Vol 32 No 1, pp. 65-73.
Shipman, F., Price, M., Marshall, C., and Golovchinsky, G. (2004) "Identifying useful passages in documents based on annotation patterns," Proceedings of the 7th European Conference on Research and Advanced Technology for Digital Libraries (Trondheim), pp. 101-112. Available http://www.fxpal.com/publications/FXPAL-PR-03-216.pdf
Siemens, R., Willinsky, J., and Blake, A. (2006) A Study of Professional Reading Tools for Computing Humanists. Electronic Textual Cultures Laboratory, University of Victoria. Available http://etcl-dev.uvic.ca/public/pkp_report (retrieved November 15, 2007)
Willinsky, J. (2005) "Open Journal Systems: An example of open source software for journal management and publishing", Library Hi Tech, Vol. 23 No 4, pp. 504-519.
Willinsky, J. (2004) "As open access is public access, can journals help policymakers read research?", Canadian Journal of Communication, Vol 29 No 3&4, pp. 381-401.
Wolfe, J. (2000) "Effects of annotations on student readers and writers," Proceedings of the Fifth ACM Conference on Digital Libraries (San Antonio), pp. 19-26.
Zhang, Y. (2001) "Scholarly use of Internet-based electronic resources", Journal of the American Society for Information Science and Technology, Vol 52 No 8, pp. 628-654.
work_elfyjvf3zzaozkxp7ke5x3oham ---- This may be the author's version of a work that was submitted/accepted for publication in the following source: Cochrane, Thomas & Callan, Paula (2007) Making a Difference: Implementing the Eprints Mandate at QUT. OCLC Systems and Services, 23(3), pp. 262-268. This file was downloaded from: https://eprints.qut.edu.au/6916/

© Copyright 2007 Emerald Group Publishing. Reproduced in accordance with the copyright policy of the publisher.

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as submitted for peer review or as accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source. https://doi.org/10.1108/10650750710776396

This is the manuscript version of a paper published as Cochrane, Tom G. and Callan, Paula A. (2007) Making a Difference: Implementing the eprints mandate at QUT. International Digital Library Perspectives 23(3): pp. 262-268. Copyright 2007 Emerald.

Making a Difference: Implementing the eprints mandate at QUT

Professor Tom Cochrane, Deputy Vice Chancellor (Technology Information & Learning Support), Queensland University of Technology
Paula Callan, eResearch Access Coordinator, Queensland University of Technology

Abstract

Classification: Case study
Purpose: The purpose of this article is to describe the impact of a university-wide deposit mandate on the self-archiving practices of academics and to show how a mandate can make a positive difference.
Methodology/Approach: The article explains the genesis of the eprints mandate at QUT and outlines the response of the academics to the endorsement of the policy. The implementation of the mandate is then examined in detail, including discussion and evaluation of specific implementation strategies and practices.
Findings: The experience of Queensland University of Technology suggests that a university-wide eprints mandate definitely increases the rate of self-archiving. Cultural and organisational change takes time, even with a mandate. Advocacy initiatives and implementation strategies have to be aligned with the current skills and needs of the researchers. For a self-deposit system to be successful, the barriers need to be as low as possible.
Practical implications: Institutional repository administrators should consider creating a scaffolded deposit system that is fast, intuitive and requires only basic technology skills. The efforts of early adopters should be recognised as publicly as possible. Evidence of success is the best form of persuasion.
Keywords: Open access; Self-archiving; Mandate; Eprints; Institutional repository; QUT

Making a Difference: Implementing the eprints mandate at QUT

Queensland University of Technology (QUT) in Brisbane, Australia, is one of Australia's largest universities, with 40,000 students and 3,500 staff. It had its antecedents in an institute of technology in which the main disciplines were science, engineering, information technology, architecture, design, health sciences, business and law. As it grew, large discipline groupings in education, justice studies, human services, humanities and, later, creative industries were added. This discipline profile meant that library investment was significantly skewed towards science and technology, and within that, a heavy proportion of the expenditure was on the periodical literature. It is well documented that these areas showed some of the highest inflation rates in terms of procurement cost for university use. For Australian universities, this problem was exacerbated by exchange rate issues between Australia and its main sources of scholarly material, Europe and North America. Being a university of technology, QUT has always been keen to support the development of pervasive and convenient digitally-based services for its students and staff. It saw, in the rise of opportunities in the digital age, a chance to greatly improve the reach of its own academic staff and researchers as they strived to achieve "stretch" targets to improve their own research output. In recent years QUT research has grown (by one measure, research income) at 25% per year. Early in the development of these digital opportunities, there was considerable speculation in the relevant literature about the breakdown of journal formats, about scholarly communication transactions being developed at the article level, and about much greater sharing of scholarly work at an earlier stage in its gestation (Smith, 1999). But any system-wide investigation of the possibilities always foundered on the rocks of vaguely formulated copyright questions and challenges. At the same time, evidence emerged indicating that some academic journal publishers were moving to new business models (Gannon, 2000). All this has been amply documented in the literature.

Developing a mandating policy

At QUT, as a result of these general influences, considerable attention was being paid to a rising debate led by research communities about the need for better access and availability of research outputs at either discipline or institutional levels. Early work in North America and the UK in particular was tracked with great interest (Harnad, 2000). By 2002, the maturity of the discussion globally, and the development of practical approaches to institutional repositories focussed on research output, had reached a stage where the debate could be taken into various academic and research groups within QUT. To the proposition that QUT might develop an institutional server, and that the institution should develop a policy of requiring deposit, there was surprisingly little resistance. In many cases, academic authors were already putting PDF versions of their own material on their personal websites. In other cases, notably the Faculty of Science, the general mission of providing better access to our institution's output had general approval. In terms of policy support, significant senior officers within the University were comfortable with the development of a proposal.
Accordingly, a policy process was taken through University Research Committee and subsequently endorsed at University Academic Board in September 2003 ("E-print repository for research output at QUT," 2003). The mandate was thus now a part of University governance. The next question was how to actually advance this mandate in a way which would foster cooperation and was aligned with the interests of researchers and academics. This meant not using the mandate as a blunt instrument, but instead finding a way to support the process.

Implementation

In June 2003, while the mandate was being discussed and debated, funding was allocated for the appointment of an appropriately skilled person, a significant proportion of whose time could be dedicated to the task of populating an eprint server at QUT. A project officer was appointed and work commenced immediately on the installation of repository software (EPrints). A reference group, which included researchers, librarians and a representative from the University's Office of Research, was consulted at various times during the early stages of the project and, as a result, a few changes were made to the default configuration of the metadata fields to accommodate perceived local needs. For example, the default LCSH subject-heading list was replaced with the Australian Standard Research Classification (ASRC) codes (Australian Bureau of Statistics, 1998). These codes are used by Australian researchers to classify their work when reporting their outputs to the government as part of the annual Higher Education Research Data Collection (HERDC). As this was to be a 'self-deposit' system, it seemed appropriate to use a classification system with which the researchers would be familiar. The repository was given a name, QUT ePrints, and launched at a formal event in November 2003. Personalised invitations were sent to all Deans of Faculty, Directors of Research Centres, Heads of School and high-profile researchers. Brochures promoting QUT ePrints were distributed widely across the University and articles about the benefits of open access were published in the campus media. Needless to say, the repository was not immediately inundated with deposits, so, in order to gather some initial content for the repository, the QUT web-site was scanned for sources of "low-hanging fruit". This turned up a series of working papers and a number of conference papers. After contacting the authors for permission, these papers were deposited by the Project Officer. The authors were all very happy for their work to be included in the repository, especially if someone else was depositing it for them. While this was a useful start, it was not the prime target material: peer reviewed journal articles. The QUT Office of Research was happy to work collaboratively with the Library and provided a spreadsheet containing details of all peer reviewed journal articles and conference papers authored by QUT researchers in the previous two years. The information was drawn from a database maintained for the annual HERDC reports. By sorting the spreadsheets by author, it was possible to identify the researchers who published most frequently.
Emails were sent to these researchers, explaining the rationale for self-archiving, gently reminding them that it was now University policy that copies of their publications should be located in QUT ePrints, and inviting them to attend one of the regular workshops that were being run to help researchers learn how to deposit their papers. Some responded by asking for more information, others agreed that it sounded like a great concept, and most promised to deposit their papers as soon as they had some time. Unfortunately, very few followed through and actually deposited the papers. Regular information sessions and workshops were included in the Library's calendar of information literacy classes, but most workshop attendees were postgraduate students who were enthusiastic about the concept but often had no publications to deposit. It was clear that, without some gentle pressure from above, it would be a long time before self-archiving would spread beyond a handful of enthusiastic early adopters. It was time to play the "policy card". Emails were sent to all Heads of School (academic departments) asking if they would like a speaker from the Library to come to their next staff meeting to explain "the implications of the new eprint repository policy". Mentioning the policy resulted in a much higher acceptance rate than would have been the case for an offer to come and talk about a new library service. In the week following each presentation, the Project Officer and the relevant Faculty liaison librarian would arrange a hands-on deposit workshop for the group. These group-specific workshops were significantly better attended than the earlier general sessions had been. Administrative assistants from the academic departments were also invited to attend the workshops so they could become a local source of advice and expertise. Some academic departments employed research assistants to help with the deposit of publications. It is highly unlikely that this would have happened in the absence of the mandate. By the end of the first twelve months, 425 documents had been deposited in the repository, but still only a small proportion had been deposited by the authors themselves. Further investigation was needed to identify the factors inhibiting the uptake of self-archiving. A series of phone calls and meetings with some of the researchers who had not followed through on their stated intentions to deposit their publications provided the Project Officer with some answers. Lack of time was the reason given most frequently. However, other factors generally emerged from these discussions. Chief amongst them was the fact that they had found the deposit process to be quite complex and time consuming. At this time, the researchers were asked to deposit the postprint (manuscript) version of their work as a PDF file. Detailed instructions had been provided on how to convert MS Word documents to PDF but this, it seemed, only added to the perceived complexity. Most of the researchers also expressed serious concerns about the copyright issues associated with self-archiving. What would be the consequences if they got it wrong? Some researchers had established good relationships with the editors of their preferred journals and did not want to jeopardize their good standing.
At this time, the deposit guide advised the researchers that they should check their publisher's policy on self-archiving by reading the terms in the publication agreement they had signed or by consulting the SHERPA (http://www.sherpa.ac.uk/romeo.php) list of publisher policies. This had seemed like sensible advice at the time, but it was perceived as a major hurdle by most researchers. When discussing open access, the researchers readily agreed that, in principle, the entire corpus of peer-reviewed literature should be accessible to all would-be readers, not just those who could afford to pay for it. However, as they did not personally experience any difficulty accessing the literature they needed, perhaps the message did not resonate with them. On reflection, it became clear that the focus of the message should have been on how self-archiving would be beneficial to their research projects and to them personally and professionally. That is, by self-archiving they would be increasing the visibility and accessibility of their work, thereby maximising the impact of their research (as measured by citations); it could also save them time and possibly lead to new networking opportunities. Taking on board the information gleaned from these discussions, new strategies were devised to change the self-archiving experience into one that would be less time-consuming and more useful to the researchers. The first step was to simplify the deposit process by accepting files in MS Word format. The Library incorporated the file conversion into the document review process. The next step was to alleviate the widespread anxiety about copyright by announcing that the Library would manage the rights-checking for all journal articles and would enable a level of access consistent with the publisher's stance on the dissemination of the postprint version by authors. Checking the publisher's policy on the SHERPA list during the metadata review stage only took a couple of extra minutes. Where the publisher's policy is unknown, the Library sends an email to the publisher (or editor) requesting information about the rights retained by authors in their standard publication agreement. The same email also requests permission to include the work in QUT ePrints in case it turns out that all rights are retained by the publisher. A number of email templates were created for this purpose to minimise the workload. The information received is recorded in a database for future use. Where the publisher requires an embargo period to be observed, this is managed by the Library. Where open access cannot be enabled immediately, access to the full-text manuscript file is blocked by the Library but the metadata is still accessible. The link in the eprint record to the journal or the publisher's web-site will still facilitate access to the paper. The eprint record also provides author contact details.
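The access decision just described can be summarised in a short sketch. This is an illustration of the workflow, not QUT's actual code; the policy fields and the month-based embargo arithmetic are simplifying assumptions.

```python
# An illustrative sketch of the Library's access decision after rights-checking.
# PublisherPolicy and its fields are assumptions made for this example.
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class PublisherPolicy:
    allows_postprint: bool   # may the accepted manuscript be archived?
    embargo_months: int = 0  # publisher-requested delay, if any

def access_level(policy: Optional[PublisherPolicy], published: date) -> str:
    """Decide how the full-text file should be exposed in the repository."""
    if policy is None:
        return "restricted"      # policy unknown: block the file, email the publisher
    if not policy.allows_postprint:
        return "metadata-only"   # record stays visible, file stays blocked
    lift = published + timedelta(days=30 * policy.embargo_months)
    return "open" if date.today() >= lift else f"embargoed until {lift}"

# Example: a 12-month embargo on an article published 1 October 2006
print(access_level(PublisherPolicy(True, 12), date(2006, 10, 1)))
```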
By 'bookmarking' their own eprints page, they would also have ready access to their manuscript files and, in most cases, access to the published versions of these documents via the links to the journals. Finally, to provide additional positive reinforcement, a download statistics feature was added to the QUT ePrints website. The scripts for this feature were developed by Kingsley Gurney, a colleague from the University of Queensland. This enhancement has proved very popular with the researchers.
Embedding self-archiving into research workflows
Some of the researchers with prolific publishing outputs, who had been emailed earlier, were now visited by the Project Officer and shown the simplified deposit process. Because it can be difficult to track all of one's publications for the once-a-year HERDC report, it was suggested that self-archiving the manuscripts to QUT ePrints, as soon as they were accepted for publication, could help them to manage this information efficiently. One of the researchers visited at this time, Professor Ray Frost from the School of Physical and Chemical Sciences, now deposits 4-5 new papers each month. The twelve-month download count for his papers is now in excess of 25,000. Consequently, Professor Frost is now an enthusiastic advocate for the repository and has encouraged many of his colleagues and all of his postgraduate students to take advantage of the eprints edge. When a researcher's download total reaches a major 'milestone' (e.g., 10,000), a congratulatory email is sent to the relevant Faculty mailing list (which will include the researcher's supervisor). This is an effective promotional strategy as it captures attention and often results in a flurry of enquiries. Reference to the eprint policy has been embedded into a number of research reporting processes at QUT. For example, in the guidelines for the Teaching & Learning Development Small Grant Scheme, applicants are advised that all publications arising from grant projects must be deposited in the eprint repository in accordance with University policy. This promotes awareness of the eprint policy and encourages academics to regard self-archiving as part of normal research practice.
Public policy commentary
In October 2006, a report to the Australian government found that up to $628 million a year in economic and social benefits to the nation might be realised by making Australian research results freely available. This was reported in The Australian (Lane, 2006). The same article continued with a particular mention of the QUT mandate, and included a quote from Professor Ray Frost: "Ray Frost, from QUT's school of physical and chemical sciences, said the ePrints repository gave him a new global readership. His papers were downloaded on average 2080 times a month" (Lane, 2006, p. 21). Public attention to these innovations seems to be increasing. There is significant emphasis in some jurisdictions in Australia, and elsewhere, on the argument that publicly funded research must be freely available wherever possible. QUT's experience has confirmed that our researchers get greater visibility and, clearly in consequence, some greater degree of impact, and that their research is simply more accessible.
Future plans
The SHERPA list of publisher policies is an invaluable tool, used internationally to facilitate rights-checking. If standard copyright licences that reserve some rights were to become the norm, this step could become unnecessary.
However, in the meantime, there is a need for more information about the current policies of Australasian publishers. Fortunately, the Open Access to Knowledge (OAK) Law Project, which is based at QUT, has recently received a grant from the Australian Government to study publication agreements used by Australian researchers. The proposed 'OAK List' will be compiled in association with the UK-based SHERPA Project. The OAK project team will also be investigating new forms of publication agreements designed to facilitate open access to research articles. The recently released OAK Law Project report provides more details of the proposed objectives (Fitzgerald, Fitzgerald, Perry, Kiel-Chisholm, Driscoll, Thampapillai and Coates, 2006). QUT Library will be an active participant in this phase of the OAK Law Project, with two new project staff located within the Library. The OAK project staff will be consulting with Australian eprint repository coordinators and other relevant stakeholders. The QUT eprint repository, QUT ePrints, will soon be moving to the ARROW repository platform: a combination of the Fedora repository software with VITAL as the interface layer. Once the ARROW software is in place, QUT Library will develop and implement new services related to the storage and dissemination of a wider range of digital objects. However, the original vision, greater reach of and access to QUT research, will remain an undiluted continuing objective.
Conclusions
Having a mandate that has been implemented sensitively, plus a simple, low-stress deposit process, has proved to be a winning combination for QUT. The deposit rate increased significantly in the second and third years of operation and there are now nearly 4000 items in the repository. The vast majority (nearly 90%) include a full-text document that is openly accessible and, for many of the others, open access will be enabled once the publisher-requested embargo period has elapsed. The best indicator of the success of the QUT strategy is the proportion of the institution's publication output that is being self-archived. In 2005, QUT researchers published approximately 1200 peer-reviewed journal articles and conference papers. So far, 637 of these have been deposited in QUT ePrints by the authors: a "capture" rate of over 50%. This figure is likely to rise because 2005 publications were still being deposited at the time of writing (October 2006). The mandate has been instrumental in this success and, to date, it has not met with any fierce resistance. The story may have been different had the policy been wielded as a stick with explicit penalties for non-compliance from the beginning. Instead it has been used as a gentle lever to encourage participation. By making it a policy, the University sends the message that self-archiving is a worthwhile and valued practice that deserves to be a high priority. Once researchers begin experiencing the many benefits that flow from having their work in an open access repository, they tend to become enthusiastic participants. It is QUT's experience that having a policy can certainly make a difference in terms of getting researchers to take the first step.
References:
Australian Bureau of Statistics. (1998). 1297.0 - Australian Standard Research Classification (ASRC). http://www.abs.gov.au/ausstats/abs@.nsf/66f306f503e529a5ca25697e0017661f/2d3b6b2b68a6834fca25697e0018fb2d!OpenDocument
"E-print repository for research output at QUT" (2003).
In Manual of Policies and Procedures, Policy F/1.3: Queensland University of Technology, Brisbane, Australia. http://www.mopp.qut.edu.au/F/F_01_03.html.
Fitzgerald, B., A. Fitzgerald, M. Perry, S. Kiel-Chisholm, E. Driscoll, D. Thampapillai and J. Coates. (2006). OAK Law Project Report Number 1: Creating a legal framework for copyright management of open access within the Australian academic and research sector. http://www.oaklaw.qut.edu.au/
Gannon, F. (2000). "World wide wisdom: Electronic publishing is moving ahead". EMBO Reports, Vol 1 No 1: pp 9-10. http://www.nature.com/cgi-taf/DynaPage.taf?file=/embor/journal/v1/n1/full/embor618.html.
Harnad, S. (2000). "E-knowledge: freeing the refereed journal corpus online". Computer Law & Security Report, Vol 16 No 2: pp 78-87. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.scinejm.htm.
Lane, B. (2006). "Benefits of free access". The Australian, 18th October, p 21. http://www.theaustralian.news.com.au/printpage/0,5942,20599488,00.html
Smith, J. W. T. (1999). "The deconstructed journal - a new model for academic publishing". Learned Publishing, Vol 12 No 2: pp 79-91. http://library.kent.ac.uk/library/papers/jwts/d-journal.htm.

----

Why So Many Repositories? Examining the Limitations and Possibilities of the Institutional Repositories Landscape
Authors: Kenning Arlitsch & Carl Grant
This is an Accepted Manuscript of an article published in Journal of Library Administration on March 27, 2018, available online: http://www.tandfonline.com/10.1080/01930826.2018.1436778
Arlitsch, K., & Grant, C. (2018). Why So Many Repositories? Examining the Limitations and Possibilities of the Institutional Repositories Landscape. Journal of Library Administration, 58(3), 264-281. doi:10.1080/01930826.2018.1436778
Made available through Montana State University's ScholarWorks: http://scholarworks.montana.edu/
© Kenning Arlitsch and Carl Grant
Address correspondence to Kenning Arlitsch, Dean of the Library, Montana State University, P.O. Box 173320, Bozeman, MT 59717-3320, USA.
E-mail: kenning.arlitsch@montana.edu
Column Title: posIT
Column Editor: Kenning Arlitsch, Dean of the Library, Montana State University, Bozeman, MT, kenning.arlitsch@montana.edu
This JLA column posits that academic libraries and their services are dominated by information technologies, and that the success of librarians and professional staff is contingent on their ability to thrive in this technology-rich environment. The column will appear in odd-numbered issues of the journal, and will delve into all aspects of library-related information technologies and knowledge management used to connect users to information resources, including data preparation, discovery, delivery and preservation. Prospective authors are invited to submit articles for this column to the editor at kenning.arlitsch@montana.edu
Why So Many Repositories? Examining the limitations and possibilities of the IR landscape
KENNING ARLITSCH, Dean of the Library, Montana State University, Bozeman, MT, USA
CARL GRANT, Associate Dean, Knowledge Services & Chief Technology Officer, University of Oklahoma Libraries, Norman, OK, USA
Keywords: Institutional repositories; digital repositories; IT management; library administration; open access; collective action; network effect; enterprise institutional repositories
Abstract
Academic libraries fail to take advantage of the network effect because they manage too many digital repositories locally. While this argument applies to all manner of digital repositories, this article examines the fragmented environment of institutional repositories, in which effort and costs are duplicated, numerous software platforms and versions are managed simultaneously, metadata are applied inconsistently, users are served poorly, and libraries are unable to take advantage of collective data about content and users. In the meantime, commercial IR vendors and academic social networks have shown much greater success with cloud-based models. Collectively, the library profession has enough funding to create a national-level IR, but it lacks the willingness to abandon local control.
Introduction
Most industries have learned to leverage the power of networks, but libraries still struggle to afford similar advantage to the realm of institutional repositories (IR). The theory of the "network effect" holds that a product gains value much faster as more people use it, and it is characterized by the use of cloud-based infrastructure and services. But by continuing to manage thousands of locally-installed repositories, libraries and their users have failed to benefit from this network effect. "Many institutions describe a situation where they have as many as five different platforms (and perhaps as many as 20 or more actual independent instances of one of these multiple platforms) that have characteristics of IR" (Coalition for Networked Information, 2017). Academic libraries have helped to create a wealth of open access scholarly content, but it is a wealth that is difficult to count, analyze, and understand in terms of how it benefits users and the academy. The dispersed model facilitates local control, but it creates a variety of problems that include siloed content, duplication of effort, inconsistent application of metadata standards, discovery deficiencies, and increased strain on scarce resources. When characterizing the current condition of IR, some authors have also cited the lack of local grassroots support, poor usability, low usage, high cost, fragmented control, and distorted markets (Van de Velde, 2017).
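One common formalization of the network effect (not invoked in this article, but useful as a rough illustration) is Metcalfe's law, under which a network's potential value grows with the number of possible pairwise connections, n(n-1)/2. A minimal Python sketch, using the worldwide repository estimate developed later in this article and a hypothetical per-repository user base, shows the scale of what fragmentation forfeits:

```python
# Illustrative only: Metcalfe-style count of potential pairwise linkages.
# The per-silo user count is a hypothetical assumption, not a measured figure.
def potential_links(n: int) -> int:
    """Number of distinct pairs among n participants."""
    return n * (n - 1) // 2

silo_count = 3500        # rough worldwide IR estimate cited later in the article
users_per_silo = 100     # hypothetical, for illustration only

fragmented = silo_count * potential_links(users_per_silo)
pooled = potential_links(silo_count * users_per_silo)

print(f"fragmented silos:       {fragmented:,} potential links")   # 17,325,000
print(f"single shared platform: {pooled:,} potential links")       # 61,249,825,000
```

The absolute figures mean nothing in themselves; the point is the several-orders-of-magnitude gap between fragmented and pooled participation.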
The proliferation of individual repositories means that standards are often implemented a little differently from one repository to another, requiring crosswalks that lead to loss of granularity when attempts are made to aggregate content. Aggregators, such as Google Scholar, for IR, or the Digital Public Library of America (DPLA), for cultural heritage repositories, often struggle to harvest and normalize metadata from disparate repositories where "standards" were applied inconsistently. Meanwhile, commercial platforms like Digital Commons, from bepress, have used the power of the network to limit such inconsistencies, which advantages them and their customers. Academic social networks such as Academia.edu and ResearchGate, whose models of operation have leveraged the power of the network from their inception, have fared much better than library-managed IR, when measured by submission participation, use, and bulk of content. Library Systems Platforms, such as Alma from Ex Libris, or WorldShare, from OCLC, also capitalize on this network effect and have shown their worth by rapidly acquiring share in that market. The problem is not money, as David Lewis has recently pointed out in his proposal for a 2.5% commitment toward collective open access, open source, and open data efforts. Collectively, U.S. academic libraries control budgets of approximately $7 billion (D. W. Lewis, 2017), but we lack the will to pool our resources and act collectively, on this and other costly concerns, such as the continually escalating prices of scholarly journals (Wenzler, 2017).
Numbers and Growth
Tracking the number of IR worldwide can be difficult, as the few registries that exist tend to rely on self-registration. However, by examining those registries and the professional literature, a reasonable estimate of the number of IR can be calculated and their growth mapped. Although generic digital repositories had existed for some time, repositories devoted to collecting the intellectual output of universities began to appear in the early years of this century, and by mid-2005 a survey found 305 IR in 13 countries, excluding the U.S. (Van Westrienen & Lynch, 2005). A similar survey taken in that same year, of the then-124 members of the Coalition for Networked Information (CNI), found that 40% had implemented an IR, while 88% of the rest had plans to do so (Lynch & Lippincott, 2005). A survey from the IMLS-funded MIRACLE Project cast a wider net to include directors and senior administrators from a variety of academic libraries in the spring of 2006, and found that only 11% of its 446 respondents reported implementing an IR, while 53% had done no planning that would take them in that direction (Markey & Council on Library and Information Resources, 2007). Based on those studies, it is fair to say that the number of IR was low and growth was slow until around 2005. However, a boom in IR growth was about to occur, as a survey of repositories registered with the Directory of Open Access Repositories (OpenDOAR) showed that "the total number of repositories in OpenDOAR grew from 128 in December 2005 to 2,253 in December 2012 … [representing] a 1660% increase during the period" (Pinfield et al., 2014). A recent report on next-generation repositories refers to "the globally-distributed network of more than 3000 repositories" (Confederation of Open Access Repositories, 2017), a number that seems conservative according to the current listings in OpenDOAR of 3,448 repositories (University of Nottingham, 2017).
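Returning briefly to the crosswalk problem described at the opening of this discussion: a minimal, hypothetical Python sketch (the field variants and mapping are invented for illustration and describe no particular repository) shows how an aggregator's flattening crosswalk silently discards locally recorded granularity:

```python
# Two hypothetical records from repositories that both "use Dublin Core",
# but apply it with different field names and different granularity.
repo_a = {"dc.creator": "Smith, J.", "dc.date.issued": "2016-03-01",
          "dc.date.available": "2016-05-01"}
repo_b = {"dc.contributor.author": "Jones, K.", "dc.date": "2016"}

# An aggregator's crosswalk must map everything onto one flat target schema.
CROSSWALK = {
    "dc.creator": "author", "dc.contributor.author": "author",
    "dc.date.issued": "date", "dc.date.available": "date", "dc.date": "date",
}

def normalize(record: dict) -> dict:
    out = {}
    for field, value in record.items():
        target = CROSSWALK.get(field)
        if target and target not in out:  # later variants are silently dropped
            out[target] = value
    return out

print(normalize(repo_a))  # {'author': 'Smith, J.', 'date': '2016-03-01'}
# The issued/available distinction is gone: granularity lost in aggregation.
```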
Numbers in the Registry of Open Access Repositories (ROAR) are even higher, listing 4,585 results on its website (University of Southampton, 2017). ROAR's numbers, however, are potentially inflated as a result of its automated harvesting model, which "tends to pick up a significant number of invalid sites, including experimental implementations that have few records or those with metadata-only entries" (Pinfield et al., 2014). Repository66, a "mashup of data from ROAR and OpenDOAR," had the number at 3,045 repositories in April 2014, which was apparently the last time the data were harvested (S. Lewis, 2014). The DuraSpace Registry lists 2,288 public repositories, but that number includes only DSpace, Fedora, and VIVO repositories (over 2,000 of those are DSpace repositories), and there is little reason for non-DuraSpace members to list themselves in this registry (DuraSpace, 2017a). Numerous other repository platforms that appear in OpenDOAR and ROAR, including Digital Commons and ePrints, do not appear in the DuraSpace Registry. Based on these surveys and registries, it seems fair to state a conservative estimate of the number of open access IR at 3,000-3,500 worldwide at the end of 2017. It is quite possible that the actual number is higher.
Management
There are positive aspects to distributed repositories. Local implementation may be desirable, for reasons of flexibility, control, and customization, and "distributed networks are more sustainable and at less risk to buy-out or failure" (Confederation of Open Access Repositories, 2017). However, there is little direct discussion in the literature about the drawbacks of managing many instances of repositories; most of the literature focuses on content recruitment, staffing, or metadata issues (Moulaison Sandy & Dykas, 2016; Palavitsinis, Manouselis, & Sánchez-Alonso, 2017; Park, 2009; Wu, 2015; etc.) rather than the challenges posed by managing multiple repositories. A 2013 survey examines costs associated with IR and the additional services offered by some, but it is inconclusive due to a low survey response rate. However, the authors do raise concerns about the model of dispersed repositories in their conclusion, noting: "…it remains to be seen if individual institutional management of a repository is the most efficient and effective means of operation. A question that should be asked of the users of repositories is whether their needs are met by the dispersed model of repositories that exists at the present time, or if some kind of unification (or at least unified search and retrieval capability) would be more useful" (Burns, Lana, & Budd, 2013). Seaman lists a variety of other challenges faced by IR: insufficient awareness and stakeholder engagement; content recruitment; perceptions of intellectual property loss; and submission difficulty. He also examines the a priori assumption in many libraries that IR were necessary and would succeed, which usually resulted in an absence of formal needs assessments prior to implementation (Seaman, 2017, p. 34). Indirectly, Seaman raises the advantages of the network effect by observing the success of "academic social networks like ResearchGate and Academia.edu" (Seaman, 2017, p. 37), noting that they currently contain approximately 100 million (ResearchGate, 2017) and 18 million (Academia.edu, 2017) publications, respectively.
It is important to acknowledge that the success of ResearchGate is tempered by a pending publishers lawsuit that seeks removal of up to 7 million articles that may be in breach of copyright (Van Noorden, 2017), and that Elsevier issued "2800 [takedown] notices to Academia.edu in 2013, but did not take the site to court" (Chawla, 2017). These actions reveal a strength of library-run IR, where the copyright status of publications is generally vetted before a paper is uploaded. Still, the papers subject to removal from ResearchGate and Academia.edu because of potential copyright breaches are small percentages of their respective totals. It is also useful to note that the decentralized nature of open access IR worldwide makes it virtually impossible for libraries to produce the total number of publications they contain, a figure that is so easily revealed by ResearchGate and Academia.edu. Further evidence of the popularity and reach of academic social networks comes from a study that found ResearchGate contained more than half of the articles published by scholars at top Spanish universities, while only 11% of those same articles were available in the institutional repositories of the authors' home institutions (Borrego, 2017). Borrego concludes that "IR may be fulfilling a role by helping universities to showcase their research activity and give an account of their research performance, but they are hardly playing a role in providing OA to content published in subscription journals." Of course, the task of managing repositories requires a resource precious to all libraries: staff labor. While the numbers can vary wildly, "many institutions spread the work over two main posts: a Repository Manager… a Repository Administrator..." (Wickham, 2011). The authors of this article note that the average FTE devoted to repository management between their two institutions alone is approximately 5 FTE. Multiplying this by the 3,500 repository count from above (which, again, could be low) provides a measure of the massive amount of resources our profession is allocating to repositories (8 hrs/day * 22 days/month * 12 months * 5 FTE * 3,500 institutions = 36,960,000 person hrs/year). Of course, this number needs further study to identify what portion of labor is being devoted to supporting unique customizations compared to product maintenance, such as upgrading software versions, fixing bugs, and security enhancements. If we could re-allocate these latter staffing resources to developing shared user-facing solutions, we could see some sizable boosts in functionality. In the current environment, the profession is wasting massive amounts of resources in duplicating shared IR infrastructure work.
Software Versions
The variance in software platform versions implemented in the community is dramatic. For example, the DuraSpace Registry listed 2,004 repositories that were running the DSpace platform as of December 2017. Of these, there is a wide distribution of versions, with almost 800 (nearly 40%) of the known versions showing 1.8 or earlier, representing a span of software releases dating back nearly 16 years (see Figure 1). The latest release of DSpace is version 6.x, and all DSpace versions prior to 5.x are unsupported as of January 2018 (DuraSpace, 2017b).
Figure 1: Screen capture from the DuraSpace Registry taken 2017-12-10, showing the number of registered repositories (in parentheses) running each version of DSpace.
The negative implications of this situation for the profession cannot be overstated.
With each subsequent release of a new version of the software, these sites become more entrenched in the past as upgrading becomes more and more difficult. In addition, these sites increasingly put themselves at risk of security and ransomware attacks, both of which contain the potential to compromise the trust required by the community of users who invest their intellectual property in these very repositories. The community of developers is also spread more thinly as a result of trying to provide critical security upgrades for multiple versions of the software. By any reasonable measure, this is a misuse of valuable resources that are largely focused on simply maintaining the status quo.
User Perspectives
Perhaps the biggest question to ask when considering the value in the current environment of multiple repositories is "how does this condition serve the user?" In answering this question, we can consider two types of users: those engaged in the discovery process, and those whom we hope will self-archive their intellectual output in a repository. From the perspective of those engaged in discovery, there is little value in trying to search repositories individually. Far more effective is the use of aggregators like Google Scholar (GS), whose academically-inclined users are exactly the audience valued by IR managers, and which previous research has demonstrated to be the source of 48%-66% of referrals to the IR content it has successfully harvested and indexed (OBrien et al., 2016). But previous research has also demonstrated that many IR struggle with the harvesting and indexing requirements of GS, resulting in IR content being poorly represented in its index (Arlitsch & O'Brien, 2012) and thus diminishing the value of the aggregation. In the domain of self-archiving, Poynder notes that "...author self-archiving has remained a minority sport, with researchers reluctant to take on the task of depositing their papers in their institutional repository. Where deposit does take place, it is invariably hard-pressed intermediaries who do the work" (Poynder, 2016). It is interesting to note that MIT recently stated: "MIT remains a leader in open access, with 44 percent of faculty journal articles published since the adoption of the policy freely available to the world" (MIT Libraries, 2017). While we don't dispute the statement, we will note that if capturing 44% - i.e., less than one half of faculty journal articles - makes one the leader (and it does, because most institutions see lower numbers), then clearly we all have much work left to do. A recent CNI presentation outlined some of the challenges users face (including researchers, VPRs, and provosts) in dealing with self-archiving and other IR services (Grant, Soderdahl, & Beit-Arie, 2017). Included were:
1. Researchers
a. Depositing in multiple repositories
b. Entering metadata
c. Maintaining multiple "profiles"
d. Managing requests for copies of data associated with research
e. Generating research data management plans
2. Vice Presidents of Research
a. Minimizing workflows for researchers so they can focus on research
b. Helping research and researchers gain exposure and press coverage
c. Needing to raise the profile of the institution
d. Helping researchers comply with the data management plan milestones
e. Generating reports on research data use and reuse
3. Provost
a. Increasing faculty impact
b. Helping with researcher retention
c. Increasing community impact and understanding of the value of research
d. Increasing the brand value of the institution (CNI Meeting, Fall 2017, Project Briefings)
These are sizeable challenges but not insurmountable, provided we come to the realization that allocating resources as we currently do won't take us there. IR are clearly not working very well for users, either on the discovery end or the submission end. It's time for a change.
Looking Forward
The rate of technological change makes it incredibly difficult to predict, with any accuracy, where we'll be 5-10 years from today. In his recent book, Thank You For Being Late (Friedman, 2017), Thomas Friedman noted that it was just over ten years ago that the following technologies were introduced:
● iPhone (2007)
● Hadoop (big data)
● GitHub
● Facebook (Sept 2006)
● Twitter
● Google bought YouTube (2006)
● Android
● Kindle
● IBM's Watson
● Intel's high-k-metal gate microchips
The rate of technological change really hasn't slowed since that time, nor is it likely to slow in the future. One thing that is clear is that this rate of change quickly obsoletes many existing technologies. The implications of the rate of change for scholarly communication will continue to be large, and cause one to wonder if Van de Velde was correct when he surmised: "The Institutional Repository is obsolete. Its flawed foundation cannot be repaired. The IR must be phased out and replaced with viable alternatives" (Van de Velde, 2017). Of course, the definition of viable alternatives covers a great many possibilities and far fewer agreements on which one is correct. However, most discussions on this topic focus on newer, more comprehensive versions of the major IR software platforms, such as DSpace or Fedora Commons. Others focus on the need for enterprise-level, cloud-based solutions that offer massive scalability, multi-tenant software, network effects, and more. Beyond this level, others are pushing into entirely different, innovative, but as yet untested ideas like those discussed by Herbert Van de Sompel in his closing plenary of the fall 2017 CNI meeting, during the Paul Evan Peters Award & Lecture, titled "Scholarly Communication: Deconstruct and Decentralize. A Bold Speculation" (Van de Sompel, 2017).
Changing the Mindset
Fifteen years ago, many academic libraries were still managing their own email servers. Today, email servers are rare in libraries; most of us use cloud-based email and calendaring services from Microsoft or Google. The idea of libraries running their own email servers now rightfully seems ludicrous, but at the time, abandoning them was often a cause of anger and argument, and some libraries gave them up only reluctantly. Currently, the idea of a national-level institutional repository is ruled out almost reflexively. "No monolithic system could ever work" is a common response whenever someone is bold enough to suggest it. But why? Why has a monolithic system worked so well for ResearchGate and Academia.edu? Why did it work so well for Digital Commons, whose customers have long been more confident that their content was indexed in Google Scholar than most distributed DSpace repositories could be? Why has it worked for HathiTrust, whose collection of digitized books surpassed 16 million volumes in 2017 (Furlough, 2017), and whose sheer mass has created exciting text-mining possibilities in one of the largest literary datasets ever produced (HathiTrust Research Center, 2016)?
A new web service called the Repository Analytics & Metrics Portal (RAMP) was developed in 2017 by Montana State University and its partners at OCLC Research, the University of New Mexico, and the Association of Research Libraries (OBrien, Arlitsch, Mixter, Wheeler, & Sterman, 2017). The purpose of RAMP is to accurately count file downloads from IR, and it works with almost any platform. Aside from providing good reporting numbers, RAMP's greater value to the community is in the dataset it is accumulating from the nearly 30 repositories that are currently subscribed. The dataset provides non-personally identifiable information about search engine queries that surfaced repository content, and it includes the URL or handle of each file that was downloaded from a given repository, meaning that all the metadata for each publication could be subsequently mined from each repository. The research value of this dataset is enormous and unique. It could, for the first time, provide a profile of the content available in IR internationally; it could be used to conduct a gap analysis to determine what users are searching for versus what they're finding; and an analysis of metadata could be conducted across repositories which, using machine learning techniques, could reconcile local metadata to national standards. The possibilities are endless. In numerous presentations around the nation where the team has introduced RAMP, one of the most frequent questions we have fielded is whether the code will be made openly available. The technical answer is yes, it could be. There is nothing secret or commercially valuable about the code itself. But the question once again reveals the mindset of defaulting to local implementation and local control, while completely missing the value of the centrally-accumulated dataset. Again, ResearchGate, Academia.edu, and bepress figured out long ago the value of a dataset that could be generated from a centralized infrastructure. We are approaching a crossroad where technological capability, increasing budgetary restraints, and sustainability concerns are converging to give critical mass to the idea of a collective infrastructure. David Lewis's 2.5% Commitment proposal provides potential funding for just such an idea. At the moment, we see two simultaneous paths:
1. Existing or new organizations take on this challenge, which will take a great deal of time, resources, and energy; and
2. Using technology adoption life cycle analysis to create a group of innovators to lead the way.
We believe these two things need to happen in parallel, as one is a longer-term solution and the other is a shorter-term solution, and clearly we need both. We'll deal with the technology adoption life cycle analysis (the shorter-term solution) in a later section, and here we'll deal with the organizational topic. Currently, we see two non-profit, membership organizations that seem appropriate for dealing with the challenges: OCLC and DuraSpace. Both already have repository products in this space and both are already running hosted (Software-as-a-Service) offerings of their repository software. Yet both have challenges to overcome before they could take on what we're describing in this paper.
What's in the Way?
Collaborations that reflect our weaknesses rather than strengths
As a profession, librarians excel at the creation of collaborative organizations and/or efforts. We're a group that dreams big, but we don't seem to execute as well as we dream.
How many of the press releases issued at the launch of collaborative efforts actually achieve the intended goals? Perhaps the marketing value of the press release is adequate return, but one wonders if some level of accountability for achieving the stated goals needs to be applied. Perhaps we should more actively challenge our existing collaborative organizations to step up to these challenges. OCLC is a large, community-owned and governed organization. Yet it isn't actively leading these conversations. Perhaps the leadership doesn't see the need? Or perhaps the governance structure doesn't allow for the need to be addressed? One has to marvel at the description of the governance structure [1] on the OCLC website:

OCLC governance organization     Number of positions
Executive Management Team        11
Board of Trustees                15
Global Council of Delegates      48

A total of 74 positions govern an organization generating $208M in annual operating revenues. While this is certainly "community governance," one must observe that the 2017 Annual Report notes: "OCLC has operated at a loss due to restrained price increases combined with heavy strategic investment into new services, as well as technology upgrades, facility renovations, and a staff resource realignment." [2] Operating at a loss is one possible explanation for OCLC's limited leadership in this area, but limited agility may be another reason. Perhaps, as "community members," we need to push this organization to address the critical need for IR reform.
[1] https://www.oclc.org/en/about/leadership.html?cmpid=md_ab_leadership
[2] https://www.oclc.org/en/annual-report/2017/home.html
A much younger organization than OCLC, DuraSpace represents a community of users and is focused on a relatively narrow suite of open access repository platforms and storage. That narrow focus is a strength compared to OCLC's array of products and services, but whereas OCLC has become a large, global organization, DuraSpace is small and its personnel are limited. Were an initiative like the 2.5% Commitment to become a reality, funds could be funneled to DuraSpace to manage a single platform for all its subscribers. Simultaneously, its members would need to support a move away from the current two versions (XMLUI/JSPUI) of its DSpace platform to a single, shared platform, ensuring that money is being applied for maximum benefit to all. As a profession, we need to analyze what is working and what isn't, learn from it, and apply the lessons to an organization that addresses the need for a next-generation service for open, scholarly communications.
Collaborative governance models/agreements that lack accountability
Many new collaborative or organizational models are, quite appropriately, based on signed agreements between the participants. These agreements can range from a Memorandum of Understanding (MOU) to a full contract, and templates for both can readily be found on the web. These agreements help to ensure the organization's intent carries across time and personnel changes, as well as to clarify what happens when something goes wrong. Generally, the parties will agree that if done properly, these agreements get dropped in a file and stay there. They're typically brought out only if something has gone awry. However, there are lessons we need to apply to these documents to ensure they deliver the intended results. As is well documented in the library literature, successful collaboration contains the following elements:
1. Clear and shared goals
2. Risks and rewards are real and shared
3. The time period for achievement is clearly defined
4. Everything written down
Accountability is a crucial part of that list. Those who work in academia are well aware that the best of plans are subject to buffeting by changes in university administration; local, state, or national politics; and, of course, the ever-present question of funding availability (Grant, 2010). For example, one of the scenarios that could arise is when a university administrator instructs the library dean to change direction in a way that requires the library to redirect resources previously committed to a collaborative effort. If the collaboration agreement is well written, the library should be obligated to cover the cost of another collaborative organization being found and/or hired to handle the work to which the library had previously committed. If this wording were in place, it would provide protection to: 1) position the library dean to negotiate with the administrator to cover the cost of the impact of their new idea; and 2) ensure that the work component gets done so the collaborative effort can still succeed as originally planned. In such a scenario, collaborations would be more likely to succeed since accountability would be maintained for all concerned.
What can we do now?
The scenarios envisioned above, if implemented properly, will take years of concentrated, collaborative effort. The needed governance models will require intricate and complicated discussions and a lot of compromise. The funding models such as envisioned by Lewis, if they can be agreed upon by librarians, will then be complicated by compliance needs at the local, national, and international legal levels. MOUs or contracts will need to be developed, vetted, and approved. People will need to be hired, trained, and put into place to manage the new collaborative organization. Of course, the problem with most collaborative approaches, as alluded to above, is that the core idea can die the proverbial death-by-a-thousand-cuts, whether through compromises to the idea or the length of time passed. The authors wonder if, in fact, the right question has even been asked at the starting line. Perhaps the question that needs to be asked is: How does the profession of librarianship move from the identification of a clear, wide professional need (an idea) to the realization of a working solution to address that need? The problem is really not unique to this profession, and many publications have been written on this subject. One that has endured the test of time and seen multiple editions as a result is Moore's Crossing the Chasm, which describes a "technology adoption life cycle that shows the stages of idea adoption, starting with those who create the idea (innovators), to those that become the early adopters and then on through the early majority, late majority and finally, the laggards." He developed a chart (Figure 2) that shows the well-known bell curve, applied here with the Innovators at the far left, the Laggards at the far right, and the majority in the large hump in the middle. The 'chasm' comes into play between the early adopters and the early majority, in that many products don't successfully navigate that transition and fall into oblivion. The book's focus is analyzing what is required for a successful transition.
Figure 2: Technology adoption life cycle. Adapted from File:DiffusionOfInnovation.png, Wikipedia (CC BY 2.5).
There is a great deal of wisdom in this work for us to apply to the challenges being analyzed above and in trying to determine what we can do to meet them today and in the long term. One of the most important lessons, for the purpose of this article, is to realize that major new ideas are created by a small group of "innovators" and "early adopters" and not by the early to late majority or laggards. As such, an approach we should consider is to identify who falls into the Innovators category, and then focus on enabling and empowering them to have access to the resources needed to bring their ideas at least to a working conceptual model. Examples of these types of people have been named throughout this work. Perhaps, as we look at creating a new organization or investing in an existing one to take on these challenges, this is a model we should utilize. The technical infrastructure needed for any of the solutions discussed is available today. Lewis has shown that we have the money, collectively, so what remains is the willpower. Multiple vendors now offer sophisticated, secure, agile, and scalable cloud-based hardware hosting platforms (IaaS - Infrastructure-as-a-Service) at very reasonable costs. Many libraries run their repositories entirely on such cloud-based services and have found it superior in nearly every way to locally run hardware. Notre Dame University, for example, made this decision at the university level for the majority of their IT services (including the Libraries), noting: "It's really one of our core IT strategies," says Michael Chapple, senior director of IT Service Delivery at the University. "It's the biggest technology change we've made in the last 10 years. We're reinventing the way we provide IT service to the campus." (Butler, 2015). There are two important caveats that should be noted here: 1) the impact of the recent decision to end Net Neutrality could have severe consequences for cloud-based systems in higher education, and thus the impact will have to be monitored closely; this decision was made within the last two weeks of the writing of this article, and thus it's too early to predict its effects; and 2) it bears repeating that we're only talking here about the IaaS (Infrastructure-as-a-Service) layer, and not the SaaS (Software-as-a-Service) total stack. Even then, it's not necessarily an enterprise solution, because often all that is being done is moving the repository stacks into the cloud, but still with no network effect. For that to happen, the software must become true multi-tenant software, wherein the same software instance supports essentially all users. That is when the true economies of scale begin to apply: when a new version is released, the one instance is updated, and therefore all users connected to that instance instantly see the new functionality.
Summary
The proliferation of locally-run institutional repositories in academic libraries has created, in our estimation, problems that far outweigh the benefits. Of course, there are advantages to local control, but the thousands of extant repositories do not serve our users well, and they massively duplicate labor and other costs. The desire for local control has led to a fragmented landscape that is not sustainable, and in which we are unable to generate and analyze potentially powerful collective data about the content of our repositories and the interaction of users.
There is tremendous value in considering the possibility of consolidated data. All content, metadata, and use analytics are considered data in this paradigm, and the service and research possibilities it opens are endless. The artificial barriers of individual repositories must be eliminated to achieve this vision. We must prepare for the rapidly evolving technological landscape needed to support scholarly communication. Agility is essential. Teams, organizations, infrastructure, and software tools must be prepared for this environment. Most importantly, we must determine a way forward that enables the innovators in our profession to rapidly build working models for deployment and refinement. IR were born from a dream of open access to the intellectual output of the world's research. Libraries have made an extraordinary effort that has accumulated a massive amount of content, and that content must be protected and migrated to a more collective and useful platform. It is time to acknowledge that we have come up short, but we may yet achieve our dream if we can change course and re-architect to a large-scale, shared infrastructure that better serves our users.
Acknowledgement
Some of the thinking and research in this article was informed by previous funding from the Institute of Museum and Library Services. The authors would also like to thank Doralyn Rossmann for her review of the final draft.
References
Academia.edu. (2017). About Academia.edu. Retrieved December 10, 2017, from https://www.academia.edu/about
Arlitsch, K., & O'Brien, P. S. (2012). Invisible institutional repositories: Addressing the low indexing ratios of IRs in Google Scholar. Library Hi Tech, 30(1), 60–81.
Borrego, Á. (2017). Institutional repositories versus ResearchGate: The depositing habits of Spanish researchers. Learned Publishing, 30(3), 185–192. https://doi.org/10.1002/leap.1099
Burns, C. S., Lana, A., & Budd, J. M. (2013). Institutional repositories: Exploration of costs and value. D-Lib Magazine, 19(1/2). https://doi.org/10.1045/january2013-burns
Butler, B. (2015, December 14). How Notre Dame is going all in with Amazon's cloud. Network World. Retrieved from https://www.networkworld.com/article/3014599/cloud-computing/how-notre-dame-is-going-all-in-with-amazon-s-cloud.html
Chawla, D. (2017). Publishers take ResearchGate to court, alleging massive copyright infringement. Science. https://doi.org/10.1126/science.aaq1560
Coalition for Networked Information. (2017). Rethinking institutional repositories: Report of a CNI executive roundtable. Washington, D.C. Retrieved from https://www.cni.org/wp-content/uploads/2017/05/CNI-rethinking-irs-exec-rndtbl.report.S17.v1.pdf
Confederation of Open Access Repositories. (2017). Next generation repositories: Behaviours and technical recommendations of the COAR Next Generation Repositories Working Group (p. 32). Retrieved from https://www.coar-repositories.org/files/NGR-Final-Formatted-Report-cc.pdf
DuraSpace. (2017a). DuraSpace Registry [Non-profit]. Retrieved December 8, 2017, from http://www.duraspace.org/registry
DuraSpace. (2017b, September 20). All documentation - DSpace documentation [DuraSpace Wiki]. Retrieved January 14, 2018, from https://wiki.duraspace.org/display/DSDOC/All+Documentation
Friedman, T. L. (2017). Thank you for being late: An optimist's guide to thriving in the age of accelerations (Reprint edition). New York: Picador.
Furlough, M. (2017, December 11).
HathiTrust has reached 16 million volumes! Retrieved January 9, 2018, from https://www.hathitrust.org/16-million-volumes
Grant, C. (2010). A partnership for creating successful partnerships. Information Technology and Libraries, 29(1), 5–7.
Grant, C., Soderdahl, P., & Beit-Arie, O. (2017, December). Research outputs: You want me to do what?!? Finding a way forward for librarians, researchers and other stakeholders. PowerPoint presented at the CNI Fall Member Meeting, Washington, D.C. Retrieved from https://www.cni.org/topics/user-services/research-outputs-you-want-me-to-do-what-finding-a-way-forward-for-librarians-researchers-and-other-stakeholders
HathiTrust Research Center. (2016, May 5). One of the world's largest digital libraries opens doors to text mining scholars. Retrieved January 9, 2018, from https://www.hathitrust.org/one-of-worlds-largest-digital-libraries-opens-doors-to-text-mining-scholars
Lewis, D. W. (2017, September 11). The 2.5% commitment. Indianapolis, Ind. Retrieved from http://hdl.handle.net/1805/14063
Lewis, S. (2014, April 20). Repository maps. Retrieved November 29, 2017, from maps.repository66.org
Lynch, C. A., & Lippincott, J. K. (2005). Institutional repository deployment in the United States as of early 2005. D-Lib Magazine, 11(09). https://doi.org/10.1045/september2005-lynch
Markey, K., & Council on Library and Information Resources (Eds.). (2007). Census of institutional repositories in the United States: MIRACLE Project research findings. Washington, D.C.: Council on Library and Information Resources.
MIT Libraries. (2017, December 30). MIT convenes ad hoc task force on open access to Institute's research. Retrieved December 30, 2017, from http://news.mit.edu/2017/mit-convenes-ad-hoc-task-force-open-access-research-0707
Moulaison Sandy, H., & Dykas, F. (2016). High-quality metadata and repository staffing: Perceptions of United States–based OpenDOAR participants. Cataloging & Classification Quarterly, 54(2), 101–116. https://doi.org/10.1080/01639374.2015.1116480
OBrien, P., Arlitsch, K., Mixter, J., Wheeler, J., & Sterman, L. B. (2017). RAMP – the Repository Analytics and Metrics Portal: A prototype web service that accurately counts item downloads from institutional repositories. Library Hi Tech, 35(1), 144–158. https://doi.org/10.1108/LHT-11-2016-0122
OBrien, P., Arlitsch, K., Sterman, L., Mixter, J., Wheeler, J., & Borda, S. (2016). Undercounting file downloads from institutional repositories. Journal of Library Administration, 56(7), 854–874. https://doi.org/10.1080/01930826.2016.1216224
Palavitsinis, N., Manouselis, N., & Sánchez-Alonso, S. (2017). Metadata and quality in digital repositories and libraries from 1995 to 2015: A literature analysis and classification. International Information & Library Review, 49(3), 176–186. https://doi.org/10.1080/10572317.2016.1278194
Park, J.-R. (2009). Metadata quality in digital repositories: A survey of the current state of the art. Cataloging & Classification Quarterly, 47(3–4), 213–228. https://doi.org/10.1080/01639370902737240
Pinfield, S., Salter, J., Bath, P. A., Hubbard, B., Millington, P., Anders, J. H. S., & Hussain, A. (2014). Open-access repositories worldwide, 2005-2012: Past growth, current characteristics, and future possibilities. Journal of the Association for Information Science and Technology, 65(12), 2404–2421. https://doi.org/10.1002/asi.23131
Poynder, R. (2016, September 22).
Q&A with CNI's Clifford Lynch: Time to re-think the institutional repository? Retrieved January 10, 2018, from https://poynder.blogspot.co.uk/2016/09/q-with-cnis-clifford-lynch-time-to-re_22.html
ResearchGate. (2017). About us fact sheet. ResearchGate. Retrieved from https://www.researchgate.net/press
Seaman, D. M. (2017). Leading across boundaries: Collaborative leadership and the institutional repository in research universities and liberal arts colleges (Dissertation). Simmons College School of Library and Information Science, Boston, MA. Retrieved from https://search.proquest.com/docview/1958944695?accountid=28148
University of Nottingham. (2017, November 23). Search or browse for repositories. Retrieved from http://www.opendoar.org/find.php?format=charts
University of Southampton. (2017, November). Welcome to the Registry of Open Access Repositories. Retrieved November 24, 2017, from http://roar.eprints.org
Van de Sompel, H. (2017, December). Scholarly communication: Deconstruct and decentralize. Presented at the Coalition for Networked Information fall member meeting, Washington, D.C. Retrieved from https://www.slideshare.net/hvdsomp/paul-evan-peters-lecture
Van de Velde, E. (2017, December 28). Let IR RIP [Non-profit]. Retrieved December 28, 2017, from http://scitechsociety.blogspot.com/2016/07/let-ir-rip.html
Van Noorden, R. (2017). Publishers threaten to remove millions of papers from ResearchGate. Nature. https://doi.org/10.1038/nature.2017.22793
Van Westrienen, G., & Lynch, C. A. (2005). Academic institutional repositories: Deployment status in 13 nations as of mid 2005. D-Lib Magazine, 11(09). https://doi.org/10.1045/september2005-westrienen
Wenzler, J. (2017). Scholarly communication and the dilemma of collective action: Why academic journals cost too much. College & Research Libraries, 78(2), 183–200. https://doi.org/10.5860/crl.78.2.16581
Wickham, J. (2011, October 5). Institutional repositories: Staff and skills set. University of Nottingham. Retrieved from http://www.rsp.ac.uk/documents/Repository_Staff_and_Skills_Set_2011.pdf
Wu, M. (2015). The future of institutional repositories at small academic institutions: Analysis and insights. D-Lib Magazine, 21(9/10). https://doi.org/10.1045/september2015-wu

----

Journal of Educational Media & Library Sciences (教育資料與圖書館學), Vol. 51, no. 3 (Spring 2014): 391-410. DOI: 10.6120/JoEMLS.2014.513/0616.RS.AM. http://joemls.tku.edu.tw
Investigating the Quality and Maintenance Issues of Bibliographic Records Provided by the e-Book Supply Chain: Using the Operations of the Taiwan Academic E-Book & Database Consortium as an Example (由供應鏈提供電子書書目紀錄品質與維護問題之探討:以臺灣學術電子書暨資料庫聯盟的運作為例)
Chao-Chen Chen (陳昭珍), Professor. E-mail: joycechaochen@gmail.com
Abstract
Adopting bibliographic records obtained from the supply chain has become an important trend in the organization of e-book information in libraries. But where do vendors' bibliographic records come from, what is their quality, how do libraries process these records, and are libraries satisfied with them? All of these questions merit investigation. This study first sampled 1,080 bibliographic records from 29 e-book products and checked their quality with MarcEdit, finding that roughly 14% of the records had problems. Second, the study interviewed 12 vendors about the sources of their records, finding that most Western-language records are copied from OCLC, that some are supplied by the original publisher and then outsourced to cataloging companies for further processing, and that some are cataloged by the Taiwanese vendors themselves. Finally, the study surveyed the member libraries of the Taiwan Academic E-Book Consortium on their satisfaction with vendor-supplied records, finding that most libraries import the records into their OPACs and are broadly satisfied with them, though many concerns remain about the accuracy of the records.
Keywords: e-books, supply chain, information organization, bibliographic records, MARC records
Introduction
In 2005, Deputy Librarian of Congress Deanna B.
Marcum remarked in a speech that the Library of Congress spends forty-four million US dollars a year on cataloging. In a Google era in which libraries already provide large quantities of electronic resources and users are accustomed to turning to digital information first and searching by keyword, how exactly should libraries catalog? Does the library catalog need to change? These are questions the library community must face together (Marcum, 2005). The Library of Congress subsequently established a committee, chaired by Karen Calhoun, to examine directions and approaches for changing the library catalog; its 2006 report, "The Changing Nature of the Catalog and Its Integration with Other Discovery Tools," recommended that libraries expand their adoption of bibliographic records obtained from the "supply chain" in order to build bibliographic products more efficiently (Calhoun, 2006).
[Research article. Author: Professor, Graduate Institute of Library and Information Studies, National Taiwan Normal University. Corresponding author: joycechaochen@gmail.com. Received 2014/02/11; revised 2014/04/23; accepted 2014/04/24.]
Since 2001, university libraries in Taiwan have purchased e-books either through consortial joint procurement or on their own. The largest such effort is the Taiwan Academic E-Book Consortium, formed in 2007 with Ministry of Education funding by more than 90 universities; it purchases roughly NT$200 million worth of Western-language e-books each year and has accumulated more than 60,000 titles. Since 2011 the consortium has also purchased Chinese e-books, and at least 15 vendors have supplied e-books (臺灣學術電子書暨資料庫聯盟推動小組, 2013). Just as Karen Calhoun recommended, the consortium requires suppliers to provide MARC bibliographic records for the e-books it purchases. The consortium has conducted a large-scale, in-depth survey of e-book use every two years (陳昭珍、詹麗萍、謝文真、陳雪華, 2011); however, the sources and quality of supplier-provided bibliographic records, how libraries process those records, and whether libraries are satisfied with them have not yet been studied in depth, and these are the issues this study set out to examine.
2. Literature Analysis
(1) The predicament of library information organization in the digital age
Information organization has long been a core course in library and information science and a core competency of librarians. In the print era, humanity's publications depended mainly on libraries for information organization and bibliographic control. As technology advanced, library technical services transformed continuously; MARC records, union catalogs, and online public access catalogs are all results of libraries making good use of information technology. Digital publishing, however, has changed not only the form of the catalog but also the way libraries organize information.
Geh (1993) predicted that future e-books would be cataloged automatically upon entering a library's collection, would use new semantics, would automatically generate cross-reference links with other resources on the shelf, and would replace the old shelving procedures. Oddy (1996) pointed out that the increasingly prominent electronic document should carry a standard set of source information on which catalogers could rely when compiling bibliographic data, and that every electronic document should have its own catalog record. Dorner (2000) argued that the role of the 21st-century cataloger is shaped mainly by two factors: the digitization of information and changes in information standards. Holding both print materials and digital resources means that catalogers must not only continue to organize traditional resources but also take part in developing access mechanisms for digital resources.
In a 2005 speech, Deputy Librarian of Congress Deanna B. Marcum noted that the Library of Congress spends forty-four million US dollars a year on cataloging and asked how libraries should catalog in a Google era in which users put digital content first and search by keyword, a question the library community must face collectively. She further asked: if we have already built digital content, do we still need detailed cataloging, and could Google simply serve as the catalog? Is the current library catalog suitable for, and satisfying to, users? Amazon's catalog offers not only basic author, title, and publisher data but also excerpts from the text, which both support keyword searching and let readers "Look inside the book," stimulating the desire to purchase. She also argued that when most teachers and students rarely use the library catalog and have turned to other search tools, the library catalog must change (Marcum, 2005). On the basis of these arguments, the Library of Congress carried out two studies of the library catalog and issued important reports. The first, a 2006 study led by Karen Calhoun, analyzed the current state of the library catalog, explored the feasibility of revitalizing it, and considered future directions for development (Calhoun, 2006). The second, a 2008 study, reported that users can no longer obtain the information they want from library online catalogs; the current information environment, information content, and users themselves have all changed, so libraries must change their service models, and the library catalog even more so. The researchers also held that the library catalog has lost its unique advantages, that the cost-effectiveness of traditional cataloging work is being challenged, and that the catalog's only remaining function is to support collection inventory, so that the library's print holdings can be retrieved and delivered into users' hands (Working Group on the Future of Bibliographic Control, Library of Congress, 2008).
In 2006 the Library of Congress formed a working group to examine bibliographic control in the 21st century, which presented its research report in 2008. The group held that future bibliographic control would develop in a cooperative, decentralized, international, and networked direction. Cooperation means that libraries' bibliographic control must be carried out together with publishers, platform suppliers, and users; electronic resources come from many sources and their content changes rapidly, so their bibliographic data will be dynamic rather than static. The report also proposed five directions of change for library information organization:
1. Through cooperation and sharing, expand the use of bibliographic records obtained from the "supply chain" to speed the efficient production of bibliographic products.
2. Libraries should invest their resources and effort in high-value activities, especially exposing little-used and unique materials to more users in order to expand the building of knowledge.
3. Position technology for the future. The WWW is our technology platform and the most suitable platform for standard data delivery; we should recognize that the users of bibliographic control are not only "people": "system programs" also interact with bibliographic control in various ways.
4. Position our community for the future. Users' evaluations of a work should be added to that work's descriptive data. The report also recognized the importance of the FRBR concept of the "work," which can relate associated works to one another.
5. Through education and assessment, strengthen the library profession so that the best decisions can be made for the present and the future.
The Digital Library Federation (DLF) ILS Discovery Interfaces Task Force, established in 2007, surveyed existing library catalogs and listed several major problems:
1. Existing systems were designed to manage print collections and holdings and are not well suited to managing digital resources;
2. Current OPACs cannot support multiple kinds of metadata and lack support for FRBR;
3. OPACs can only search subscribed holdings;
4. OPAC interfaces are relatively hard to use and difficult to serve as a platform for other search tools.
5. Exploratory searching in OPACs is difficult, and OPACs usually lack basic functions such as spelling correction and relevance ranking, doing little to encourage browsing or serendipitous discovery.
6. Users who do not know a resource's exact name or the filing conventions cannot retrieve existing bibliographic records.

(2) Online e-books and library catalogs
After 2000, e-books made up a growing share of library collections, and many university libraries even bought them preferentially. How, then, should libraries organize e-books for user retrieval? Of the 21 medical libraries MacCall surveyed, 19 offered a title-level list on their websites and 20 had created bibliographic records (MacCall, 2006). Analyzing members of the Association of Research Libraries (ARL), Dinkelman and Stacy-Bates (2007) found that 56% provided a separate e-book search system on their web pages; they recommended that libraries build one-step e-book searching on the web and provide subject classification as well. Hutton searched for ten e-books through the e-book web pages and OPACs of ten academic libraries: nine could be found through the library web pages, but only three through the OPACs (Hutton, 2008). These studies show that many libraries have not added e-book records to their catalogs, while the studies of Dillon (2001), Gibbons (2001), and Langston (2003) all demonstrate that adding e-book records to the catalog helps users find and use the e-books.

In 2009 the University of Illinois library, a member of the Committee on Institutional Cooperation (CIC), worked with Center for Library Initiatives (CLI) member libraries to improve e-book record quality by checking the records supplied for Springer titles and itemizing the problems. Springer's records were supplied by the Ingram company, and three kinds of problems were found. The first were access issues: records lacking full-text link information or containing dead links, and Ingram's inability to supply records for older titles. The second were load issues: the libraries' Voyager automation systems combine the MARC 001 and 003 fields into an 035 system identifier, and duplicated numbers caused records to be overlaid or to fail to load. The third concerned record quality; such problems do not prevent linking or loading, but they confuse users (Martin & Mundle, 2010).

(3) E-book metadata formats
Mark Majurey, digital development director at Taylor & Francis, has argued that the e-book changes not the form of the book but the way it is delivered; once delivery changes, inventory management changes with it, and delivering book content to users with new technology brings new business models. Doing all of this well requires accurate and complete bibliographic data, that is, metadata, as essential infrastructure. Without metadata, customers cannot know what books exist, in what formats, through what channels to obtain them, or with what software to read them; publishers cannot manage their assets, process transactions, or obtain accurate sales reports; and libraries can neither support user searching nor obtain accurate usage statistics (Majurey, 2009).

Each link in the e-book supply chain may use a different metadata format: publishers generally use ONIX as their exchange format, libraries use MARC, some distributors define their own XML DTDs, and the epub mobile e-book format uses Dublin Core. In short, there are four main e-book metadata formats:

1. epub e-books use Dublin Core. The Open eBook Forum (OeBF) is the international trade and standards body for e-books. To open the e-book market and allow e-books to move between reading systems in a standard format, it defined the Open eBook Publication Structure (OeBPS) as the standard for describing e-book content. In September 2007 the OeBF was renamed the International Digital Publishing Forum (IDPF), and e-books conforming to IDPF standards are called epub e-books. epub bibliographic data is expressed in XML and consists of two kinds of child elements: Dublin Core fields, and extension elements with which producers can define their own tags for information the DC elements cannot express (International Digital Publishing Forum, 2011).

2. Publishers and vendors use ONIX. Online Information eXchange (ONIX) is a network data-exchange standard developed for books. Its purpose is to advance the electronic commerce of books by giving booksellers rich, standardized product information to meet the varied bibliographic exchange needs of wholesalers and retailers.

3. Libraries use MARC. MARC is the descriptive format libraries have long used; its descriptions are rich and complete, and records are exchanged between library automation systems in ISO 2709. MARC schemas designed in XML exist but are not widely used, and for publishers MARC lacks the transaction information ONIX provides.

4. The Open Publishing Distribution System (OPDS). OPDS is a catalog format and digital content system designed and published jointly by the open e-book community and the Internet Archive. An OPDS catalog is a union catalog built on the Atom and HTTP standards, and small OPDS catalogs can be aggregated into larger ones. OPDS metadata consists mainly of Atom elements and DCMI Metadata Terms; where the two overlap, the Atom element takes precedence (Open Publication Distribution System, 2011). For libraries of any size, an OPDS catalog lets users reach the library's e-books, and other libraries' e-books, without visiting the library's web pages. California has built the Open Library on OPDS, and it already offers readers more than a million e-books.
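To make the first of these formats concrete, the sketch below (not part of the study; the bibliographic values are invented) parses the Dublin Core portion of an epub package metadata block in Python, using the dc-metadata/x-metadata split described above:

    # A minimal, hypothetical sketch of reading the Dublin Core part of an
    # epub (OPF) metadata block; the sample values are invented.
    import xml.etree.ElementTree as ET

    OPF_SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc-metadata>
        <dc:title>An Example Monograph</dc:title>
        <dc:creator>Chen, Example</dc:creator>
        <dc:identifier id="bookid">urn:isbn:9789860000000</dc:identifier>
        <dc:language>zh-TW</dc:language>
      </dc-metadata>
      <x-metadata>
        <meta name="price" content="USD 30.00"/>
      </x-metadata>
    </metadata>"""

    DC = "{http://purl.org/dc/elements/1.1/}"

    root = ET.fromstring(OPF_SAMPLE)
    # Dublin Core elements carry the core description of the e-book ...
    for elem in root.iter():
        if elem.tag.startswith(DC):
            print(elem.tag[len(DC):], "=", elem.text)
    # ... while <x-metadata> holds locally defined extension elements.
    for meta in root.iter("meta"):
        print("extension:", meta.get("name"), "=", meta.get("content"))

The same two-layer pattern (a standard core plus free extensions) is what lets epub metadata stay interoperable while still accommodating publisher-specific information.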
III. Research Goals and Methods
(1) Goals
Specifically, the main goals of this study are:
1. To investigate the sources and the quality of the bibliographic records that e-book publishers or platform vendors supply to libraries.
2. To understand libraries' satisfaction with vendor-supplied e-book records and the problems of processing and maintaining those records.

(2) Methods
The study used three methods:
1. Content analysis. Using the MarcEdit program, the study sampled and analyzed the quality of the Chinese- and Western-language e-book records supplied by vendors to the Taiwan Academic E-Book & Database Consortium between 2008 and 2012. In that period the consortium purchased 29 products from 15 vendors, and vendors were required to supply a MARC record for every e-book purchased.
2. Interviews, to learn the sources of vendors' Chinese and English records and their quality-control practices.
3. A questionnaire survey of consortium member libraries: are they satisfied with vendor-supplied records, how do they process them, and what do they suggest to vendors?

IV. Results
(1) Quality of vendor-supplied e-book records
The study first used MarcEdit to sample and analyze the quality of the consortium vendors' Chinese and Western e-book records, as follows.

1. Sampling. Records were sampled per vendor using a random-number table: 10 records where a product had up to 300 records, 20 where it had 301-600, and 30 where it had more than 600, for a total of 1,080 records. Table 1 shows the number checked for each product; samples were drawn in each year of purchase, 2008-2012.

Table 1. Records checked per product, 2008-2012 (number of records)
SpringerLink 150 | Informa Healthcare 10 | Karger Books 10 | Ovid Medical Books 50 | ANA e-books 10 | HyWeb (凌網) 10 | Elsevier 100 | ebrary 80 | SIAM 20 | Columbia Univ. Press 30 | ABC-CLIO & Greenwood 80 | IOS 40 | Gale E-Reference 20 | Taylor & Francis 30 | Oxford 150 | Cambridge (Collections Online and Books Online) 60 | McGraw-Hill eBooks 40 | Palgrave eBooks 60 | L&B Chinese e-book collection 20 | McGraw-Hill traditional Chinese collection 10 | Emerald 30 | EBSCO ebooks (formerly NetLibrary) 50 | AiritiBooks Chinese e-books 10 | SAGE Reference 10. Total: 1,080.

2. Results of the record checks. Of the 1,080 records drawn from the 29 products purchased in 2008-2012, 151 (about 14%) were found to have problems, including the following:
(1) In field 008 the form-of-item code for an electronic resource should be "s," but many records lacked it, probably because the publisher or vendor downloaded records for the print book without changing the form of item to "s."
(2) When the e-book and the print book have different ISBNs, both should be recorded in the MARC record for retrieval, but in many records the 020 ISBN data was incomplete even though the publisher's website carried the full ISBNs.
(3) The title and statement of responsibility should be a required field, since readers search by title, yet some records had no 245 field.
(4) The 041 language code was not recorded.
(5) The publisher's e-book page carried an edition statement, but the record had no 250 edition statement.
(6) The publication year in field 260 differed from the publication year on the e-book's web page.
(7) Series were recorded in 440, where the current rules prescribe 490.
(8) There was no 300 physical description.
(9) The 856 URL link was broken.

The detailed problems and counts appear in Table 2.

Table 2. Analysis of the sampled e-book records
No. | MARC field | Records | Findings
1 | 008 fixed-length field | 34 | Most often the form of item was not recorded, or not coded "s" as required; also 300 physical descriptions and 504 bibliography notes inconsistent with 008
2 | 020 ISBN | 21 | E-book web page carries print and e-book ISBNs, but the MARC record is incomplete
3 | 029 (undefined) | 4 | Field 029 is undefined, yet sampled records carried data in it
4 | 040 cataloging source | 5 | 040 cataloging source not recorded
5 | 041 language code | 11 | 041 language code not recorded
6 | 049 (undefined) | 8 | Field 049 is undefined, yet sampled records carried data in it
7 | 245 title and statement of responsibility | 8 | Mostly missing authors; also incomplete titles, records matched to the wrong book, and missing 245 fields
8 | 250 edition | 6 | No 250 edition statement in the sampled records
9 | 260 publication | 19 | Publication year in the record differs from that of the e-book
10 | 300 physical description | 13 | 300 missing, or inconsistent with the 008 fixed field
11 | 490 series | 17 | Series recorded in 440 (old MARC rule) and not updated to 490
12 | 504 bibliography note | 1 | 504 note inconsistent with the 008 field
13 | 700 added author | 1 | Author's name misspelled
14 | 856 URL | 3 | 856 URL link broken
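The study performed these checks with MarcEdit. Purely as an illustration (not the study's actual procedure), a few of the Table 2 conditions can be approximated in Python with the pymarc library; the input file name is hypothetical:

    # Illustrative only: approximate a few of the Table 2 checks with pymarc
    # (pip install pymarc). "ebooks.mrc" is a hypothetical vendor file.
    from pymarc import MARCReader

    problems = []
    with open("ebooks.mrc", "rb") as fh:
        for i, record in enumerate(MARCReader(fh)):
            if record is None:          # skip anything the reader could not parse
                problems.append((i, "record could not be parsed"))
                continue
            f008 = record["008"]
            # 008/23 is the form-of-item byte for books; electronic resources
            # should carry "s" (later practice also allows "o" for online).
            if f008 is None or len(f008.data) < 24 or f008.data[23] != "s":
                problems.append((i, "008 form of item is not coded 's'"))
            if record["245"] is None:
                problems.append((i, "no 245 title statement"))
            if not record.get_fields("020"):
                problems.append((i, "no 020 ISBN"))
            if record.get_fields("440"):
                problems.append((i, "series in obsolete 440 instead of 490"))
            if not record.get_fields("856"):
                problems.append((i, "no 856 electronic location"))

    for rec_no, msg in problems:
        print(f"record {rec_no}: {msg}")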
(2) Interview findings on vendors' record sources and quality control
When a library buys print books it can inspect and accept the physical volumes directly; when it buys e-books, all it can verify are the bibliographic records and the correctness of their links, so for e-books the record's importance is self-evident. The bibliographic data determines whether the e-books the consortium purchases can be retrieved correctly, which raises several questions: where do the records come from, how can the records and the e-books they link to be verified effectively, and do a record and its linked e-book describe the same book?

The consortium has purchased e-books from 15 vendors; because three made no further sales to it, 12 vendors were interviewed about the sources of their records and the ways they manage and control record quality. The findings fall under record sources, ISBN handling, system connectivity, and archival full-text files.

1. Sources of e-book records. The consortium has a MARC record specification and requires each vendor to supply records in the MARC formats its member libraries use. Sources differ by vendor and product: many Western-language vendors copy records directly from OCLC and pass the downloads to consortium libraries; some records are supplied by the original publisher and then outsourced to Taiwanese cataloging companies for processing; and some, for both Western and Chinese e-books, are cataloged by the Taiwanese vendors themselves. Table 3 lists each vendor's sources.

Table 3. Sources of each vendor's bibliographic records
No. | Vendor | Product | Original MARC source | CMARC conversion
1 | SpringerLink | SpringerLink 2008-2013 | SpringerLink cataloging department | 綠保公司
2 | Elsevier | Elsevier SDOS | OCLC | National Taiwan University Library
3 | Oxford | Oxford Scholarship Online | Oxford cataloging department | National Taiwan University Library
4 | Emerald | Emerald | Backstage Library Works | 年豐 cataloging company
5 | EBSCO Publishing | EBSCO ebooks (formerly NetLibrary) | OCLC | 文崗資訊
6 | 飛資得醫學 | Ovid LWW; Karger; Informa; ANA | Purchased from OCLC, 2009-2011; original cataloging outsourced to 文華 from 2012 | 文華 cataloging company
7 | 飛資得知識 | SAGE Reference | 文華 cataloging company | 文華 cataloging company
  |  | CRCnetBASE | OCLC (under 智泉's agency) | 朱江 cataloging company (under 智泉's agency)
  |  | World Scientific | 文華 cataloging company | 文華 cataloging company
8 | 智泉國際 | ebrary; Columbia Univ. Press; SIAM | OCLC | 朱江 cataloging company
9 | 文道國際 | ABC-CLIO & Greenwood | OCLC | 文景 cataloging staff
  |  | IOS | 文景 cataloging staff | 文景 cataloging staff
  |  | Gale; Taylor & Francis; InfoSci | Supplied by the publisher | 文景 cataloging staff
10 | 碩亞數碼 | Cambridge Collections Online; Cambridge Books Online | Publisher-outsourced and self-cataloged | 碩睿
  |  | Palgrave Connect eBooks | OCLC | 碩睿
  |  | McGraw-Hill (Chinese and Western); Library & Book (L&B) | 碩睿 | 碩睿
11 | 凌網科技 (HyWeb) | Chinese e-books | 年豐 cataloging company | 年豐 cataloging company
12 | 華藝數位 (Airiti) | Chinese e-books | Cataloged in-house | Cataloged in-house

2. ISBN handling. The ISBN is a book's most important identifier: through it, duplicates can be checked, record accuracy verified, and record-to-e-book consistency confirmed. The interviews showed that vendors handle ISBNs inconsistently, chiefly as follows:
(1) The print and electronic book share one ISBN (e.g., Elsevier, Ovid, ANA); in this case either version is easy to find, whether in user searches or in library duplicate checks.
(2) The print and electronic book have different ISBNs (e.g., Oxford, Karger, Informa); duplicate checks then fail to find the other version.
(3) Print and e-book share an ISBN in principle, but a print-on-demand version receives its own POD ISBN, as when SpringerLink titles sold through Amazon carry a POD ISBN.
(4) The same book published in different countries has different ISBNs. Whenever one book has several ISBNs, retrieval and duplicate checking are error-prone; applying for a separate ISBN in each country suited print publishing but is ill-suited to borderless digital publishing.
(5) ISBNs are applied for mainly by the publisher; when a book is hosted on a platform, the platform vendor does not apply for a separate eISBN for it.
(6) Some vendors, such as HyWeb, run a conversion check between 10-digit and 13-digit ISBNs to ensure that no duplicates slip through.
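The kind of 10-to-13-digit conversion check mentioned in point (6) can be sketched as follows; this is an illustration, not the vendor's actual implementation:

    # A sketch of a 10-to-13-digit ISBN conversion-and-deduplication check;
    # not the vendor's actual implementation.
    def isbn10_to_isbn13(isbn10: str) -> str:
        """Convert a 10-digit ISBN to its 13-digit (978-prefixed) form."""
        core = "978" + isbn10.replace("-", "")[:9]   # drop the old check digit
        total = sum((1 if i % 2 == 0 else 3) * int(d)
                    for i, d in enumerate(core))
        check = (10 - total % 10) % 10
        return core + str(check)

    def deduplicate(isbns):
        """Collapse a mixed list of ISBN-10/13 strings to unique ISBN-13 keys."""
        seen = set()
        for raw in isbns:
            digits = raw.replace("-", "").upper()
            key = isbn10_to_isbn13(digits) if len(digits) == 10 else digits
            if key in seen:
                print("duplicate:", raw)
            seen.add(key)
        return seen

    # Example: the same title supplied once as ISBN-10 and once as ISBN-13.
    deduplicate(["0-306-40615-2", "9780306406157"])

Normalizing every incoming number to ISBN-13 before comparison is what makes the duplicate check reliable across records that use the two lengths interchangeably.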
3. System connectivity.
(1) The vendors' systems are now generally stable, and when connection problems occur, vendors resolve them quickly.
(2) A few e-books become unusable some years after purchase because of licensing: SpringerLink's connectivity is good, but titles have been taken down over licensing. The platform vendor HyWeb responded that it contracts with publishers so that titles whose rights expire are removed from the platform, but books already sold are not recalled, keeping their links live.
(3) Distributors cannot detect the state of system connections automatically; most learn of problems from user reports, though some vendors check connectivity manually on a schedule.
(4) Likewise, no vendor has an automatic mechanism for detecting that an individual e-book's link is wrong or dead; these problems, too, come to light through user reports.

4. Archival full-text files. The consortium requires vendors to deliver the raw full-text files of purchased e-books for preservation, a so-called dark archive, on the principle of one file per book, named by ISBN, in PDF, unencrypted. Compliance varies:
(1) Several files per book, e.g., a twelve-chapter book delivered as twelve files.
(2) Mismatches between the vendor's record and the delivered file, which are not the same book.
(3) Encrypted files that the consortium cannot open.
(4) Files in XML or HTML rather than the required PDF.
(5) Blank discs containing no files at all.
(6) Platform vendors unable to supply the files, offering only a third party's letter of guarantee.

(3) Library satisfaction with vendor records and record handling
The study also surveyed the 95 member libraries of the consortium by questionnaire on their satisfaction with vendor-supplied records and on how each library processes them. Of 95 questionnaires sent, 95 were returned, a 100% response rate. The questions covered the MARC format each library uses, whether vendor records meet the library's needs, whether e-book records are loaded into the catalog, whether the library offers e-book search systems besides the OPAC, the library's suggestions about vendor records, and its views on the consortium's integrated e-book search system. The results follow.

1. Respondents. The questionnaire was e-mailed directly to the staff who handle consortium e-book purchases; because libraries differ in size and organization, the responding units differ as well, with technical-services units predominating (Table 4).

Table 4. Responding units
Acquisitions and cataloging: 41 (43.2%) | Collection/resource management, acquisition, and services: 18 (18.9%) | Technical services: 7 (7.4%) | Reader services: 6 (6.3%) | Reference: 6 (6.3%) | Other: 17 (17.9%) | Total: 95 (100%)

2. MARC formats used. Practice varies: 33 libraries use two formats (24 use CMARC with MARC21; 9 use CMARC with USMARC), and 62 use a single format (24 CMARC only; 38 MARC21 only). See Table 5.

Table 5. MARC formats used by consortium member libraries
CMARC only: 24 (25.26%) | MARC21 only: 38 (40.00%) | CMARC and MARC21: 24 (25.26%) | CMARC and USMARC: 9 (9.47%) | Total: 95 (100%)

3. Whether vendor records meet library needs. Eighty-seven libraries (92% of members) found the vendors' records generally adequate. Tamkang University Library commented that "even when format problems keep records from loading, our own staff can resolve them," while National Chung Cheng University Library noted that "the Chinese e-book records are too brief; we hope they can be more complete." Eight libraries (8%) found the records inadequate (Table 6), mainly because: (1) six libraries reported that vendors had not corrected details of the records, so the records could not be loaded into the automation system; and (2) two libraries found that the supplied MARC format did not match what the library required.

Table 6. Do vendor records meet member libraries' needs?
Yes: 87 (92%) | No: 8 (8%) | Total: 95

4. Whether e-book records are loaded into the catalog. Ninety-one libraries (96%) have loaded the vendors' e-book records into their catalogs; three (3%) have not, and one (1%) did not answer (Table 7).

Table 7. Have member libraries loaded the records into the OPAC?
Yes: 91 (96%) | No: 3 (3%) | No answer: 1 (1%) | Total: 95 (100%)

The three libraries that have not loaded the records cited: (1) files received but not yet loaded; (2) an automation system that cannot accept MARC records; (3) records that change so often that no staff are available to maintain them.

5. E-book search systems besides the OPAC. Thirty-eight libraries (40%) offer no search tool besides the automation system; 57 (60%) offer another e-book search system, and for 40 of these (70%) it is the library's integrated e-resource search system.

6. Libraries' suggestions on vendor records. Thirty-four libraries (34%) answered this open question. The main suggestions were: (1) 29 libraries want vendors to correct the details of their records and strive for accuracy; (2) three want the number of MARC records supplied to match the number the consortium purchased; (3) three want complete classification numbers and subject data; and (4) one library each suggested: do not mis-enter link information; supply records that conform to the format; supply RDA-format records for Western books; check duplicates conscientiously; and supply the 801 holdings source field. Some of these demands are reasonable; others need further review of their reasonableness.

V. Conclusions and Recommendations
(1) Conclusions
This study used three methods, examining the records themselves, the vendors who supply them, and the libraries that use them, to learn where vendor-supplied e-book records come from, what their quality is, whether libraries are satisfied with them, and how libraries process them. The combined results support the following conclusions.

1. Record sources: in the e-book era, libraries have fully adopted bibliographic data from the "supply chain," and vendors supply records in the Chinese and Western MARC formats each library requires. Yet most supply-chain data still comes from the union catalogs that libraries themselves built: most vendors' records come from OCLC, and some vendors further process records to meet the consortium's specification.
2. Record quality: vendor-supplied records save libraries cataloging time and labor, but the MarcEdit sample found about 14% of records problematic, no small share. Wrong 008 type coding was most common and missing 245 title/author statements most serious; chaotic ISBN practice, incomplete records, and erroneous or non-conforming raw full-text files were also frequent. The questionnaire showed most libraries consider vendor records adequate, though 29 libraries want the details corrected for accuracy.
3. Vendors' control of link accuracy: vendors mostly have no automatic check of system connectivity or of individual e-book links, relying almost entirely on users to report problems.
4. MARC formats in the consortium: member libraries use MARC21, USMARC, and CMARC; 62 use a single format and 33 use different formats for Chinese and Western materials. Multiple formats raise vendors' costs of file conversion and data checking.
5. Satisfaction: 92% of member libraries find vendor records generally adequate, indicating broad satisfaction.
6. Record handling: 96% of member libraries have loaded the e-book records into their catalogs; 4% have not loaded them into their automation systems.
7. Access paths: besides the OPAC, 60% of libraries give users other ways to find e-books, most often an integrated e-resource search system (such as MUSE or Primo), then the consortium-developed integrated e-book search system; some libraries maintain dedicated e-book web pages.
8. Suggestions to vendors: libraries ask vendors to improve record accuracy, include classification numbers and subject headings, match record counts to purchase counts, provide RDA-format records, check duplicates, and supply the 801 holdings field.

(2) Recommendations
In light of these results, the study offers three recommendations:
1. E-book records matter far more than records did in the print era: poor, erroneous, or wrongly linked data means paying without receiving the goods. Ensuring correct links and the long-term usability of e-books is the great challenge of e-book acquisition, and solving it requires an integrated system that handles e-book selection, acquisition, receipt verification, record management, and raw full-text archiving.
2. Vendor-supplied records do save cataloging labor and time, but libraries must load the data completely and correctly into their automation systems and keep the links permanently correct. Library automation systems should check link validity automatically and report failed links to the library and the platform vendor.
3. E-books depend on the supply chain for records, yet vendors are not cataloging experts. The library community therefore needs a responsible body to set bibliographic standards for e-books (including ISBN handling principles) and to review how reasonable its demands on vendors are, for instance whether supplying complete data such as classification numbers and subject headings is wholly the vendor's responsibility. Vendors, for their part, should employ library and information science graduates to do the cataloging, so that accurate records reach libraries.

Acknowledgments
The author thanks students 孟君 and 鄭宇涵 for their assistance, and 陳敏珍, head of the acquisitions and cataloging section of the National Taiwan Normal University Library, and assistant 楊雪子, for their generous help.

References
Calhoun, K. (2006). The changing nature of the catalog and its integration with other discovery tools: Prepared for the Library of Congress. Ithaca, NY: Cornell University Library. Retrieved from http://www.loc.gov/catdir/calhoun-report-final.pdf
Chen, C.-C., Chen, L.-P., Hsieh, W.-J., & Chen, H.-H. (2011). A survey of e-book usage among university library patrons in Taiwan [in Chinese]. Journal of Digital Library Forum, 5, 29-40.
Dillon, D. (2001). E-books: The University of Texas experience, part 2. Library Hi Tech, 19(4), 350-362. doi:10.1108/EUM0000000006540
Dinkelman, A., & Stacy-Bates, K. (2007). Accessing e-books through academic library web sites. College & Research Libraries, 68(1), 45-58.
Dorner, D. (2000). Cataloging in the 21st century, part 2: Digitization and information standards. Library Collections, Acquisitions, & Technical Services, 24(1), 73-87. doi:10.1016/S1464-9055(99)00099-8
Geh, H.-P. (1993). Trends in librarianship towards a united Europe [in Japanese]. Journal of Information Processing and Management, 35(10), 857-869. doi:10.1241/johokanri.35.857
Gibbons, S. (2001). netLibrary eBook usage at the University of Rochester Libraries: Version 2. Rochester, NY: University of Rochester Libraries. Retrieved from http://www.lib.rochester.edu/main/ebooks/analysis.pdf
Hutton, J. (2008). Academic libraries as digital gateways: Linking students to the burgeoning wealth of open online collections. Journal of Library Administration, 48(3/4), 495-507. doi:10.1080/01930820802289615
International Digital Publishing Forum. (2011). EPUB. Retrieved from http://idpf.org/epub
Langston, M. (2003). The California State University e-book pilot project: Implications for cooperative collection development. Library Collections, Acquisitions, & Technical Services, 27(1), 19-32.
MacCall, S. L. (2006). Online medical books: Their availability and an assessment of how health sciences libraries provide access on their public websites. Journal of the Medical Library Association, 94(1), 75-80.
Majurey, M. (2009, October). Dealing with eBook metadata. Paper presented at the EDItEUR 31st International Supply Chain Seminar, Frankfurt, Germany. Retrieved from http://www.editeur.org/files/Events pdfs/Supply chain presentations/Dealing with eBook metadata article.pdf
Marcum, D. B. (2005, January). The future of cataloging. Address at the EBSCO Leadership Seminar, Boston, MA. Retrieved from http://loc.gov/library/reports/CatalogingSpeech.pdf
Martin, K. E., & Mundle, K. (2010). Cataloging e-books and vendor records: A case study at the University of Illinois at Chicago. Library Resources & Technical Services, 54(4), 227-237.
Oddy, P. (1996). Future libraries, future catalogues. London, UK: Library Association.
Open Publication Distribution System. (2011). Open Publication Distribution System: Official specification & blog. Retrieved from http://opds-spec.org/
Taiwan Academic E-Books & Database Consortium. (2013). Sustainable development plan for Taiwan academic electronic resources: Final report of a 2012 Ministry of Education commissioned project [in Chinese]. Taipei: National Taiwan Normal University Library.
Working Group on the Future of Bibliographic Control, Library of Congress. (2008). On the record: Report of the Library of Congress Working Group on the Future of Bibliographic Control. Washington, DC: Library of Congress. Retrieved from http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf

Chao-Chen Chen, ORCID 0000-0002-4042-0822

SUMMARY
Deanna B. Marcum (2005) remarked in a speech that "the Library of Congress spends $44 million on cataloging every year.
We live in an era in which large quantities of digital resources are provided and users are accustomed to turning first to digital information and to searching by keyword in Google. In such a time, how should libraries do their cataloging? Do library catalogs need to change? This is a critical issue that people in the library field must face together." Since Marcum's speech, the Library of Congress has set up a committee, directed by Karen Calhoun, to study ways of changing the library catalog. The committee then published a report, "Changing Nature of the Catalog and Its Integration with Other Discovery Tools," in which it suggested that libraries expand their use of bibliographic records from the publisher supply chain to make the building of catalogs more efficient (Calhoun, 2006).

Applying the "supply chain" concept to bibliographic records to increase cataloging efficiency has become a trend since libraries began purchasing e-books. Since 2007 the libraries of more than 90 universities in Taiwan have collaborated to form the Taiwan Academic E-book and Database Consortium for purchasing e-books in Chinese and Western languages, with more than 15 e-book vendors participating (Taiwan Academic E-Books & Database Consortium, 2013). When considering which items to purchase, the consortium asks the vendors to provide MARC bibliographic records. Few studies, however, have investigated the related issues: the sources and quality of these records, how libraries deal with them, and whether libraries are satisfied with them. This study therefore analyzes these questions.

Research Goals and Methods
There are two research goals:
1. To investigate the sources and quality of bibliographic records provided by e-book vendors.
2. To understand whether libraries are satisfied with those e-book bibliographic records, and how libraries handle and maintain them.

Three research methods were adopted:
1. The MarcEdit software program was used to analyze the quality of a sample of the records vendors supplied to the Taiwan Academic E-book and Database Consortium during 2008-2012, when the consortium purchased 29 products from 15 vendors. Records were chosen by simple random sampling within strata: 10 records from products with up to 300 records, 20 from products with 301-600 records, and 30 from products with more than 600 records, for 1,080 sampled records in all.
2. Interviews were used to investigate the sources of vendor-supplied records and vendors' quality control.
3. Questionnaires asked (1) whether member libraries of the consortium are satisfied with the bibliographic records vendors provide, (2) how libraries handle those records, and (3) what suggestions the libraries have for vendors.
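The stratified draw in method 1 can be illustrated with a short sketch; the product names and record counts below are invented, and the study itself used a random-number table rather than code:

    # Hedged illustration of the sampling rule described above (counts are
    # invented; the study used a random-number table, not this code).
    import random

    def sample_size(n_records: int) -> int:
        if n_records <= 300:
            return 10
        if n_records <= 600:
            return 20
        return 30

    products = {"Product A": 150, "Product B": 450, "Product C": 900}
    for name, n in products.items():
        k = sample_size(n)
        picked = random.sample(range(1, n + 1), k)   # record numbers to pull
        print(name, "->", k, "records, e.g.", sorted(picked)[:5])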
Research Findings
1. Sources of bibliographic records. Most vendor-supplied records are copied from the union catalogs that libraries themselves have built; records for books in English come mostly from OCLC. Some vendors modify the records to meet the requirements of the consortium.
2. Quality of bibliographic records. After analysis with the MarcEdit software program, 14% of the sampled bibliographic records were found to have errors. The field with the most errors was the 008 fixed-length data elements; the most serious fault was the absence of the 245 title statement. Other common errors included inconsistent ISBN practice, incomplete records, inaccurate raw full-text files, and non-conforming formats. The surveys showed that most libraries considered the vendors' records to meet their requirements, though 29 libraries wanted the vendors to correct the details of the records and verify their accuracy.
3. Vendors' control of e-book link accuracy. Most vendors have no automatic checking mechanism for whether the system connection is working or whether every e-book link is accurate; most rely on users reporting errors or problems.
4. MARC systems adopted by the consortium. The MARC systems adopted by the libraries include MARC 21, USMARC, and CMARC. Sixty-two libraries use a single MARC format; 33 use different formats for Chinese and Western-language books. When libraries adopt different MARC formats, the vendors' costs of converting files and verifying records increase.
5. Satisfaction with bibliographic records. Ninety-two percent of member libraries considered the records to meet their requirements, suggesting a high degree of satisfaction.
6. Handling of bibliographic records. Ninety-six percent of member libraries loaded the e-book records into their library catalogs, but 4% did not include them in their library automation systems.
7. Library users' access to e-book searching. In addition to the OPAC, 60% of libraries provide other routes to e-books, most often an integrated e-resource search system (such as MUSE or Primo), followed by the integrated e-book search system developed by the consortium; some libraries have also built dedicated e-book web pages.
8. Libraries' suggestions for vendors. Libraries suggested that vendors enhance the accuracy of bibliographic records, apply call numbers and subject headings, check the number of records against the number of purchased items, verify duplicates, and provide RDA-format records and the 801 holdings field.

Suggestions
Based on the research findings above, three suggestions were made:
1. In purchasing e-books, it is a major challenge to ensure the accuracy of bibliographic records and to keep the e-books usable over time. Addressing accuracy and consistency requires an integrated system for handling the selection, purchase, and verification of e-books, for managing bibliographic records, and for archiving full-text files.
2. Libraries need to import complete and accurate bibliographic records into their automation systems and ensure that the records' links remain correct. Vendors and library automation systems should have mechanisms for auto-checking the links and for reporting failed records to libraries and vendors.
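A minimal sketch of the kind of link audit recommended in suggestion 2 follows. The URL would come from each record's 856 $u subfield; the example address is hypothetical, and a production checker would add retries, rate limiting, and proxy handling:

    # A minimal sketch of the link-audit mechanism recommended above; the URL
    # is hypothetical. Real systems would add retries and rate limiting.
    import urllib.request
    import urllib.error

    def check_url(url: str, timeout: float = 10.0) -> str:
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return "ok" if resp.status < 400 else f"HTTP {resp.status}"
        except urllib.error.HTTPError as exc:
            return f"HTTP {exc.code}"
        except (urllib.error.URLError, TimeoutError) as exc:
            return f"unreachable ({exc})"

    dead = []
    for url in ["https://ebooks.example.org/book/9789860000000"]:
        status = check_url(url)
        if status != "ok":
            dead.append((url, status))   # report these back to the vendor

    print(dead)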
3. E-book vendors are not cataloging experts, so people in the library field should set standards and regulations for e-book cataloging and make reasonable requirements for e-book vendors to follow. To enhance the quality of bibliographic records, vendors should also recruit cataloging experts trained by graduate schools and departments of library and information science.
work_f3e6maxb7bcffhgaygckc476im ----

Library Management, 2007, Vol. 28, No. 6/7, pp. 366-378. ISSN 0143-5124. doi:10.1108/01435120710774503
© 2007 Emerald Group Publishing Limited

Assessing trends to cultivate new thinking in academic libraries
Sally A. Rogers
The Ohio State University Libraries, Columbus, Ohio, USA

Abstract
Purpose - The purpose of this paper is to present an organized view of current trends affecting academic libraries that one research library developed to encourage new thinking; this view could assist others seeking to help their organizations think differently about the future of information access and management.
Design/methodology/approach - One strategy for identifying important trends using a small number of key resources is highlighted in the paper. A snapshot of the many trends affecting academic libraries is categorized to show interrelationships and to provide specific examples along with a general overview. Included is a brief description of how the snapshot was used by one library.
Findings - The paper finds that rapid and far-reaching change is challenging libraries to think very differently, to act much more quickly, and to set trends rather than merely react to them. Assessing trends can help libraries foster organizational change through exposure to new ideas and see where new partnerships and areas of expertise must be developed to meet new needs.
Practical implications - The snapshot became the basis for two library-wide events at Ohio State that better positioned attendees to inform and to accommodate decisions about service priorities, personnel and budget requests.
Originality/value - This paper organizes many diverse trends into a general overview to make interrelationships and implications more understandable to those unlikely to develop such a view on their own - for example: university personnel outside the library, middle managers and those they supervise within the library, students of library and information management.
Keywords Academic libraries, Change management, Information management
Paper type Viewpoint

Introduction
Those in leadership positions in higher education and in academic libraries face a significant challenge as they try to envision the future with some degree of accuracy in order to make good decisions about service priorities, resource allocations, and organizational structures.
Visibility into the future is so limited that it is a challenge to predict what will be expected of these organizations even two or three years from now. If visibility is limited for those in upper level leadership positions, it could be non-existent for those at other levels in their organizations if regular exposure to new perspectives and ideas has not been a priority. The following article presents a view of current trends in academic libraries that was developed at The Ohio State University Libraries (OSUL) to encourage new thinking to inform decisions about future directions. Because it was a challenge to show in some coherent fashion how the many key trends affecting academic libraries relate to one another as the basis for a library-wide discussion, the resulting view is being shared in the hope that it might assist others looking for meaningful ways to help their organizations think about the future.

Related resources
Just as trends abound, so does the information about them; that is part of the problem - how to keep up with reading it and how to tie it all together into something that makes sense and is usable as a basis for decision making. Many strategies are possible. One is to track a few carefully selected resources that one is confident will provide or lead to information on the most important trends. A few such resources that were used to create the snapshot of trends at OSU are cited here as examples. Tracking current trends requires using resources that report information while it is indeed current. It is helpful to identify others who have demonstrated an ability to assess trends and to set appropriate directions in the arenas that impact academic libraries in order to benefit from their thinking. The semi-annual meetings of the Coalition for Networked Information (CNI) Task Force offer both very current information and the perspectives of CNI's executive director Clifford Lynch, whose knowledgeable insights are well worth hearing on a regular basis. His "meeting roadmap" and the project briefings for the spring and fall Task Force meetings are posted on the CNI web site (CNI, n.d.) only a few weeks ahead of time, ensuring their currency. The CNI-ANNOUNCE electronic forum gives subscribers invaluable updates on key developments and reports as well as announcing various conference opportunities throughout the year. An archive of the forum is available. D-Lib Magazine has as its goal "timely and efficient information exchange for the digital library community" (D-Lib, n.d.). Its 11 (electronic only) issues per year include articles on current topics, as well as current awareness and event links. Some articles are solicited, and many are written by leaders of key initiatives in the field. OCLC also offers timely information on current issues through its newsletter, now called NextSpace, and the OCLC Symposium held at semi-annual conferences of the American Library Association (ALA). Symposium presentations, such as the one held in January 2006 entitled "Rebranding an Industry: Extreme Makeover," are available on the OCLC web site (OCLC, n.d.).
OCLC offers podcasts, RSS feeds, and weblogs, such as the one by Lorcan Dempsey, their chief strategist and vice president for Research, who regularly shares visionary thinking through his blog and many other venues. Several significant reports on the current information environment and perceptions of actual and potential library users have been issued by OCLC in the past few years. Cathy De Rosa, vice president of Marketing and Library Services for OCLC, was a principal contributor to these reports; and she has given many excellent presentations sharing important perspectives on their contents (De Rosa, 2004, 2005). EDUCAUSE (n.d.) offers information about technology trends in higher education through its conferences and publications, such as EDUCAUSE Review and EDUCAUSE Quarterly, both of which are available on the organization's web site. The May/June 2006 issue of EDUCAUSE Review includes a message from the executive team indicating the organization is expanding its focus to look at campus issues and "grand challenges," not just IT issues (Hawkins et al., 2006). There are many other resources that are important sources of information on the latest trends affecting academic libraries. One could cite, for example, The Chronicle of Higher Education, The New York Times, First Monday, Wired Magazine, Information Today, conference proceedings and webcasts on current topics, reports of recently funded research projects and the outcomes of that research. Whatever sources are used, the critical factor is to look beyond the library to see it within the context of what is happening in the academy, in industry, in government, and in society. For those who regularly read the above-mentioned sources, the snapshot of current trends that follows will not be surprising. Its intended value lies in the gathering and organization of trends into something that hopefully makes sense to those who might not be consulting such resources routinely or who might not have had the time to analyze and to synthesize the information.

Snapshot of trends

New models for content management
A trend introduced in the past several years has been new models for content management; for example, institutional and other repositories that use open source platforms like DSpace (n.d.) and Fedora (n.d.); course management systems that now also can serve as digital content repositories; and systems that support the creation and/or management of peer reviewed e-journals such as bepress (n.d.) and the open source DPubS (n.d.) software being developed by the Cornell University Libraries in collaboration with the Pennsylvania State University Libraries and Press. The impetus has been the recognition that digital content constitutes a valuable asset that should be managed better than it has been. OSU is exploring the relationship of its DSpace repository (the OSU Knowledge Bank) to the statewide Fedora repository (the Digital Resource Commons) being built by the OhioLINK (n.d.) consortium and the learning object repository that is a part of Desire2Learn (Desire2Learn.com, n.d.), the newly implemented course management system (called Carmen at OSU).
This trend offers new opportunities for libraries, both in terms of content production and content management, because much of the content is outside the realm of what libraries traditionally collect, organize, and deliver. Because libraries are investing their resources in producing and gathering content in addition to purchasing it, the development of collection policies for digital and repository initiatives would seem advisable. The OSU Libraries' Collections Advisory Council has been consulted about proposed digital projects, but no formal collections policy for digital initiatives has been written. New models for the creation and dissemination of scholarship should help to advance the movement to create change in scholarly communication. Libraries have supported this movement and the new open access journals that have resulted from it. They have encouraged faculty to take a stand against exorbitant journal price increases in their disciplines. But many challenges remain in this arena. Marianne Gaunt, a speaker at the 2005 American Library Association (ALA) annual conference, questioned whether the journal creation process of peer review, editorial services, distribution, and archiving should be unbundled in a new business model. Dan Greenstein, who spoke at the same session, noted that the most important aspect is the identification of quality. If a new model can be designed to do that, the current system of journal publishing should be changeable.

New levels of granularity
Management of content at finer levels of granularity is possible with some of the new options described above, and the current trend is to focus on the content itself, not on the containers in which it comes. A thought-provoking report on this topic was issued by OCLC Online Computer Library Center in 2004. It states:

Content is no longer format-dependent and users are not dependent on traditional distribution channels for access to content. This is true both in the realms of scholarly communication and popular materials. For libraries and content sellers, this means the processes of acquisition, organization and delivery of content need to change to accommodate the expectations of our communities (OCLC Online Computer Library Center, 2004, p. 2).

Nancy Davenport gave an example of this trend when she spoke at the 2005 ALA annual conference. She mentioned that a couple of academic libraries cancelled subscriptions to ancillary titles and put half of the money they saved aside to buy articles from those titles as needed. They found that they needed to spend only about half of what they put aside.

New roles and opportunities
The new models for content management also offer end-users in various communities the options of submitting content and metadata themselves and of deciding what content to include in their collections. If end-users enthusiastically embraced these new models, one might question whether there will be a significant role for libraries in managing the non-traditional content. But the trend to date has been for end-users to resist taking time away from their primary scholarly pursuits such as research to digitize, to submit, and to describe their output for the new repositories.
However, the fact that the option exists for the end-user to do functions similar to those that libraries have done as stewards of print and electronic resources creates the following opportunities for libraries:
• To play a new role as facilitators in making end-user participation as easy as possible.
• To partner with end-users to manage their content (by offering a digitization service or a metadata service, for example).
• To advise on the development of tools that simplify the process of content creation and dissemination for the end-user, recognizing that tools and applications have become a primary technology development focus (whereas hardware was the focus in the past).

One motivator for faculty to be interested in institutional repositories is the emphasis that federal agencies have started to place on preservation of digital content created with their grant funding. Faculty are looking to the library for assistance in addressing this preservation aspect when preparing grant proposals. In general, attention to digital preservation is increasing in conjunction with heightened awareness of both the value and the vulnerability of digital content. Mechanisms are needed to ensure authenticity and integrity of content not only when it is first created, but also over time. Libraries, as creators, sponsors, and stewards of digital content, must be thinking at the outset how they will migrate and preserve it on an ongoing basis. These are areas that must be given more attention, particularly given the rising number of computer security threats. Cornell University Library offers a workshop on digital preservation management. The workshop web site includes a tutorial (Cornell University Library, n.d.) containing an informative timeline that presents milestones in digital technology and preservation, including major preservation initiatives that are currently underway.

New scale
Another recent trend in the area of content management is mass digitization on a scale and at a pace that previously has seemed unachievable. Google's announcement in fall 2004 of plans to digitize all or part of the collections of five libraries for their Google Print project (now called Google Book Search) garnered considerable interest as well as concern about possible copyright violations. Regardless of the outcome of this particular initiative, or others such as the digital content archive being built by the Open Content Alliance (n.d.), what is significant is that mass digitization on the scale proposed by Google is now plausible. Technological advances have allowed scanning to be done more quickly and with less human intervention. Digital storage costs have declined significantly. And perhaps most importantly, there are players willing to take the risks and to invest to make it happen. These convergence factors have set the stage for new opportunities and partnerships for libraries.

New access options
In parallel to the attempts to take advantage of the wealth of information in library print collections through mass digitization, there are efforts underway to leverage the rich store of library-created metadata through harvesting by the major internet search engines. For example, Google and Yahoo! have harvested selected fields from records in OCLC's WorldCat database so that library resources can be retrieved in response to a general internet search. Such responses are flagged as "find in a library."
OCLC provides institutions with statistics on the amount of web traffic going to library resources from this Open WorldCat program. The statistics for OSU show an average of 1,022 public (i.e. unauthenticated) accesses of the catalog, library information, or Ask-a-Librarian options each month from January to December 2005. Internet users who otherwise might not have found OSU's library resources are being led to those resources through Open WorldCat. The role of the local library catalog relative to the various other new access options remains a question, however. Putting library catalog records on the open internet has led people to start thinking differently about local OPACs, including questioning whether they will be needed in the future. Dissatisfied with current OPAC functionality, some libraries have purchased new search interfaces that work in conjunction with, but do not replace, their existing library catalogs. For example, North Carolina State University (NCSU Libraries, n.d.) implemented Endeca's search technology in January 2006. AquaBrowser (n.d.) is another interface option offering advanced searching capabilities. In addition, University of Rochester's River Campus Libraries received a grant from the Andrew W. Mellon Foundation in April 2006 to explore requirements for a new open-source online system known as eXtensible Catalog (XC) (Dickman, 2006). New interfaces can make it much easier for users to take advantage of the richness of the MARC metadata in the library catalog to refine their searches - but the users still have to know that the catalog exists and where to find it in order to use the new functionality. New access options also are raising questions related to management of the local catalog. For example, library personnel working on an oral history project for OSU's Knowledge Bank created links from related catalog records to the content added to the repository. They also requested system changes when they identified problems with the way the metadata displayed when searching for the oral histories in the Knowledge Bank. It seemed perfectly natural to make these access improvements because library personnel were conducting the project. However, had a non-library community been responsible for inputting their own content and metadata, it is quite possible that the library would not have known of the display problem or of the relationships of repository content to items in the library catalog. The control that libraries typically have maintained over their catalogs may be an unrealistic goal for the repository model that is intended to support an expanded universe of contributors. Further, given the increasing number of access options that are not controlled by the library, what is the relative value of continuing to exert high levels of control over what goes into the catalog? Also, does it make sense to try to make the non-catalog options search and display like the catalog - or are they different models with different purposes? Libraries cannot effectively answer these questions through introspection. They must engage in ongoing dialogues with those they wish to serve.
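The WorldCat harvesting described above was a bespoke arrangement between OCLC and the search engines, but the standard mechanism of this period for exposing catalog and repository metadata to outside harvesters was the OAI-PMH protocol. The sketch below is a hedged illustration of a minimal Dublin Core harvest; the endpoint address is hypothetical:

    # Hedged sketch of a minimal OAI-PMH harvest of Dublin Core records;
    # the endpoint URL is hypothetical.
    import urllib.request
    import xml.etree.ElementTree as ET

    BASE = "https://repository.example.edu/oai"   # hypothetical endpoint
    NS = {
        "oai": "http://www.openarchives.org/OAI/2.0/",
        "dc": "http://purl.org/dc/elements/1.1/",
    }

    url = BASE + "?verb=ListRecords&metadataPrefix=oai_dc"
    with urllib.request.urlopen(url) as resp:
        tree = ET.parse(resp)

    for record in tree.iter("{http://www.openarchives.org/OAI/2.0/}record"):
        title = record.find(".//dc:title", NS)
        ident = record.find(".//dc:identifier", NS)
        print(title.text if title is not None else "(no title)",
              "->", ident.text if ident is not None else "(no identifier)")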
New design principles for libraries
Scott Bennett, a library space consultant, has written and lectured convincingly about the need for library spaces to be designed to support learning more directly (Bennett, 2003). For example, users want more areas to study and to work in groups. The theme of the July/August 2005 issue of EDUCAUSE Review is learning space design. One of the articles summarizes characteristics of the well-designed classroom of the future as follows (Long and Ehrmann, 2005):
• designed for people, not for ephemeral technologies;
• optimized for certain learning activities, not just stuffed with technology;
• enables technologies to be brought to the space, rather than having them built into the space;
• allows invisible technology and flexible use;
• emphasizes soft spaces;
• useful across the 24-hour day; and
• "zoned" for sound and activity.

If academic libraries are to support learning more directly, then perhaps the well-designed library of the future should have these same characteristics. The article also describes a fascinating vision of what the authors call "situated computing" where instructions are embedded in the physical space to tell devices within that space how they should be configured. For example, a faculty member could use his or her course schedule to create an event profile for a particular class session, indicating any technology support needs. When the faculty member enters the classroom for the session, the building network reads a radio frequency identification (RFID) tag on his or her ID, retrieves the event profile, and activates the appropriate support devices according to the preferences specified in the profile. When students enter the classroom, their IDs can register their presence with the network and information for the class can be transferred to their preferred workspaces, which might be handheld devices. Those trying to envision what academic library spaces should look like in the future might find it helpful to keep this type of visionary thinking about classrooms in mind.

New mobility
It is obvious from simple observation that the use of mobile communications and computing devices is increasing. A logical assumption is that this trend can be expected to have a larger impact on libraries in the future. An article in the May/June 2005 EDUCAUSE Review on "Enabling mobile learning" quotes Penny Wilson of Macromedia, who has described mobile wireless devices as "tools of mass disruption" because of the innovations in learning technologies that they are expected to spark (Wagner, 2005). In July 2005, it was announced that a new URL suffix - .mobi - had been defined for use by sites that specifically format their content for display on the small screens of cell phones and other internet-capable handheld devices. The initial domain name registration opened in May 2006. Mobile phone companies asked for the new domain name and are encouraging its use. Presumably, the more internet content that is available for cell phones and the easier it is to access, the more interest there will be in phones with greater functionality, allowing the companies to expand their markets (Reardon, 2006). Libraries have been offering online reference for some time, but now they are experimenting with additional and possibly more convenient ways that this could be done. Instant messaging (IM) is one popular option, and Short Message Service (SMS) could be another. A university library in Australia is offering an "SMS a Query" service to allow librarians to receive short text messages of up to 160 characters any time from anywhere using cell phones (OCLC Online Computer Library Center, 2006). One company, Altarama, also in Australia, markets software to support delivery of reference services via SMS. In the USA, the library at Southeastern Louisiana University offers a text message reference service (Hines, 2005).
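Services like "SMS a Query" must live within the 160-character single-message limit just mentioned. As a sketch only (no particular service's software is implied; whole-word splitting is omitted for brevity), one simple way to fit a longer reference reply into numbered segments:

    # Sketch only: split a reference reply into numbered 160-character SMS
    # segments; no particular vendor's software is implied.
    def to_sms_segments(reply: str, limit: int = 160) -> list[str]:
        header = len("(9/9) ")          # reserve room for a "(n/m) " prefix
        body = limit - header
        chunks = [reply[i:i + body] for i in range(0, len(reply), body)]
        return [f"({i}/{len(chunks)}) {c}" for i, c in enumerate(chunks, 1)]

    for segment in to_sms_segments("The item you asked about is on course "
                                   "reserve at the main desk until Friday; "
                                   "bring your student ID to check it out "
                                   "for two hours at a time."):
        print(len(segment), segment)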
New influences and expectations
Gaming also is going mobile. A 2005 issue of the OCLC Newsletter compared "gamers" and "boomers" and talked about implications for libraries. One article quoted Marc Prensky on the impact of gaming:

Today's average college grads have spent less than 5,000 hours of their lives reading, but over 10,000 hours playing video games. Today's students think and process information fundamentally differently from their predecessors (Beck, 2005).

The article also notes that the author, Beck, found that web sites with a game component capture and hold people's attention better than any other. Based on a survey of more than 2,000 professionals and hundreds of interviews, Beck is convinced that video games will have a significant impact on our entire culture. Blogs, wikis, podcasting and the like are having an impact as well. The March/April 2006 issue of EDUCAUSE Review includes an in-depth description of the proliferation of Web 2.0 services and tools that support social networking. The article also explores the pedagogical implications of Web 2.0 (Alexander, 2006). For example, these technologies can enable student group learning as well as collaborative research by faculty; and lectures available as podcasts from the class wiki can make the learning process more mobile. But many wikis and blogs are not scholarly enterprises. They offer easy ways to self publish on the web and, as a result, the amount of amateur digital content that is available is growing. Svoboda (2006) notes in the May 2006 IEEE Spectrum Online:

As the first-ever major reference work with a democratic premise - that anyone can contribute an article or edit an entry - Wikipedia has generated shared scholarly efforts to rival those of any literary or philosophical movement in history. Its signature strength, however, is also its greatest vulnerability. User-generated articles are often inaccurate or irrelevant, and vandals ... are a constant threat.

In fact even Wikipedia's founder, Jimmy Wales, warns college students not to use Wikipedia for class projects or serious research (The Chronicle of Higher Education, 2006). Joan Frye Williams, a librarian and consultant on information technology planning, spoke at the 2005 ALA Annual Conference about the surprising level of trust that people are placing in peer-supplied information. She noted that the library has lost its premier position as a trusted source. OCLC's 2006 report (De Rosa et al., 2006, p. 6-4) on college students' perceptions supports this view:

College students trust information they get from libraries, and they trust the information they get from search engines. The survey revealed that they trust them almost equally, which suggests that libraries have no monopoly on the provision of information.

With exposure to many different options for information gathering, users are developing expectations for accessing library resources that are shaped by the general internet. In the past, even if it was difficult or time-consuming to access information provided by the library, users would do it because they had no other choice. Now, in the minds of many, they do have choices; and they are choosing ease of use and convenience even when a somewhat more difficult option would yield better results. Therefore, in order to make sure that valuable content and services are used, libraries need to give as much attention to convenience and ease of use as to ensuring that resources are of high quality.
Where students are concerned, the need to teach them how to learn, not just how to use library resources, is probably as great as it has ever been. However, librarians working toward this goal need to be sure to teach techniques that really will be used. If the pathways to the library's riches are too convoluted, more energy should be focused on building simpler pathways than on giving better navigation instructions. The implication is that libraries should be interacting more with those that they want to serve in a way that will enhance their understanding of current and future user needs.

New partnerships
One way for libraries to assess how well they are supporting learning and research and what useful new services they might provide is to work even more closely with academic departments. Collection development librarians traditionally have interacted with teaching faculty to address the collections needs of academic programs. Initiatives such as institutional repositories call for new partnerships centered on creation, delivery, and preservation of digital content and metadata. These areas also represent new service opportunities for the library. For example, James Mullins, dean of the Purdue University Libraries, is encouraging the Libraries' faculty to collaborate on interdisciplinary sponsored research initiatives with colleagues in colleges and schools throughout the university (Purdue University Libraries, 2005). Through extensive conversations with campus faculty, Mullins has found that the researchers have many needs that can be met by the type of expertise offered by librarians, such as the ability to organize and manage large data sets. Several sponsored research proposals naming Purdue librarians as part of the research team have been funded (Mullins and Brandt, 2006).

New organizations
The summer 2004 issue of Library Trends is devoted to organizational development as practiced in libraries. The issue editors, Denise Stephens and Keith Russell, provide an extensive review of the literature on organizational development, change, and leadership in several disciplines. In their article (Stephens and Russell, 2004, p. 240), they note:

The library community is well aware of the impacts of rapidly changing information technology, evolving user expectations and information-seeking behaviors, and changes in information publishing and dissemination. It is unclear, however, whether awareness of these driving environmental issues equals understanding and whether the knowledge of these issues is applied to planning and implementation of change in library organizations.

One of the changes that is taking place is for student assistants to be given different types of responsibilities. Libraries such as Georgia Tech have found that users respond very well to being helped by their peers. They are using student assistants as an interface for student users of their facilities and services, particularly where computer and multimedia support are concerned. At OSU, an innovative Peer Library Tutor program was implemented in 2005 to train students to assist their peers with research and use of library resources. The pilot program, developed by Katharine Webb, was highly successful and resulted in plans for expansion to other areas of the library in 2006. Another trend is for academic libraries to define new positions to manage scholarly communication issues. For example, OSU has an experimental Rights Management Coordinator position that is responsible for providing leadership in this area.
Redefining responsibilities for existing personnel is a trend as well. At OSU, some members of Technical Services now have a role in seeking copyright permissions for faculty and in promoting rights awareness on campus. Others are assisting with the development of a campus-wide expertise system called OSU:Pro and are working with campus units on submission of content to the Knowledge Bank. Because the information environment has changed significantly, traditional library organizational structures do not necessarily fit the work that needs to be done now. At OSU, there has been a recent shift to cross-disciplinary management of common public service functions (e.g., circulation, reference, management of the collection) by a team of coordinators reporting to an assistant director. Previously, all functions within a discipline were managed by the subject matter expert. The goal has been to allow subject experts to devote less time to operational issues and more time to new responsibilities requiring their scholarly expertise.

Using the trends snapshot

The foregoing snapshot of trends affecting academic libraries was the focus of two open meetings to which all OSU library personnel were invited. The first event was an overview presentation (by the author) that grouped more than 20 individual trends into four categories, namely: (1) content management; (2) changing uses and users of libraries; (3) outreach, teaching, and learning; and (4) changing personnel patterns. The purpose of the overview was to provide some structure and a sense of relationship among the various trends that might not be readily apparent. It also connected the trends to activities underway at OSU or at other universities to ground them in reality while also looking to the future. The second event was a half-day in-service session away from the library. Attendees were divided randomly into groups and asked to rate seven trends (some from each major category) according to the impact and effort involved with taking action. Facilitated discussion of the outcome followed, with attendees sharing the reasoning behind their ratings. Then each group brainstormed one of its trends to generate ideas on specific actions the library should take in response. The trends events gave personnel throughout the OSU Libraries an opportunity to think and to interact with one another in new ways. The overview established some common ground and a current context. The group exercises encouraged different perspectives on possible actions and priorities. Together, the events put attendees in a better position to inform and to accommodate subsequent decisions about service priorities, positions to be filled, and budget requests.

Conclusion

Rapid and far-reaching change has become the norm for the environment in which academic libraries operate, necessitating that library personnel think very differently and act much more quickly. These challenges are significant, especially for large research libraries where the size of the organization often impedes nimbleness in responding to current trends. Indeed, a big part of the struggle is for libraries to be proactive and trend setting rather than merely reactive to trends imposed upon them. This cannot be done without cultivating and maintaining a world view that looks well beyond the library. Whether the focus is content, services, outreach, or personnel, libraries cannot succeed by working in isolation.
They must evaluate, obtain, and support products from more and more vendors whose primary clients are not libraries; participate in development and support of technology solutions with members of open-source communities; partner with other campus units to deliver coherent enterprise-wide information services through architectures that simplify discovery and navigation for an increasingly mobile population; develop new relationships with knowledge seekers to understand and meet their changing needs; consult experts in other professions for guidance on design of facilities and services; recognize and manage the influence of new government policies and legislation; and collaborate creatively to bring needed new skill sets into their organizations. All of this must be done with the expectation that budgets for libraries, universities, and their consortia are likely to be stable (at best) or decreasing.

The financial challenges are significant. To compete successfully for limited funds, libraries must demonstrate excellence and value in a way that is recognized, not only by those distributing the funds, but also by those who are fellow competitors for them. There is work to be done in this arena. The value of what libraries offer is not as clearly recognizable as it once was because the uniqueness associated with library offerings has diminished. The future does not hinge on our processes or on our technologies, but on our ability to build new supportive relationships for libraries. This may require establishing many individual relationships between library and non-library personnel to build mutual understandings of needs and expertise to serve as a foundation for new organizational relationships. It will also require that the library personnel bring something seen as important and needed to the table. Monitoring current trends is essential to help libraries identify opportunities to build new relationships and to strengthen and grow their expertise accordingly. Involving the entire organization in this process is beneficial because it solicits the widest range of perspectives and also fosters essential change through exposure to new ideas. Examining current trends gives the future some shape, even in the face of great uncertainty, and allows people to envision something of value in what lies ahead rather than seeing only what they must leave behind.

References

Alexander, B. (2006), "Web 2.0: a new wave of innovation for teaching and learning?", EDUCAUSE Review, Vol. 41 No. 2, March/April, pp. 33-44, available at: www.educause.edu/ir/library/pdf/erm0621.pdf (accessed July 9, 2006).
AquaBrowser (n.d.), "AquaBrowser Library", available at: www.aquabrowser.com/ (accessed January 25, 2007).
Beck, J. (2005), "Staying in the game!", OCLC Newsletter, No. 267, January/February/March, p. 9, available at: www.oclc.org/news/publications/newsletters/oclc/2005/267/downloads/stayinginthegame.pdf (accessed August 4, 2006).
Bennett, S. (2003), Libraries Designed for Learning, Council on Library and Information Resources, Washington, DC, available at: www.clir.org/pubs/reports/pub122/contents.html (accessed July 7, 2006).
bepress (n.d.), "The Berkeley Electronic Press™", available at: www.bepress.com/index.html (accessed January 25, 2007).
(The) Chronicle of Higher Education (2006), "The Wired Campus: Wikipedia founder discourages academic use of his creation", June 12, available at: http://chronicle.com/wiredcampus/article/1328/wikipedia-founder-discourages-academic-use-of-his-creation (accessed January 25, 2007).
CNI (n.d.), "Coalition for Networked Information", available at: www.cni.org (accessed July 28, 2006).
Cornell University Library (n.d.), "Timeline: digital technology and preservation", available at: www.library.cornell.edu/iris/tutorial/dpm/timeline/index.html (accessed January 25, 2007).
De Rosa, C., Dempsey, L. and Wilson, A. (2004), The 2003 OCLC Environmental Scan: Pattern Recognition, OCLC, Dublin, OH, available at: www.oclc.org/reports/escan/default.htm (accessed July 29, 2006).
De Rosa, C., Cantrell, J., Hawk, J. and Wilson, A. (2006), College Students' Perceptions of Libraries and Information Resources, OCLC Online Computer Library Center, Dublin, OH, available at: www.oclc.org/reports/perceptionscollege.htm (accessed July 29, 2006).
De Rosa, C., Cantrell, J., Cellentani, D., Hawk, J., Jenkins, L. and Wilson, A. (2005), Perceptions of Libraries and Information Resources, OCLC, Dublin, OH, available at: www.oclc.org/reports/2005perceptions.htm (accessed July 29, 2006).
Desire2Learn.com (n.d.), "Desire2Learn innovative learning technology", available at: www.desire2learn.com/ (accessed January 25, 2007).
Dickman, S. (2006), "Mellon grant funds planning analysis for future online services", University of Rochester News, April 14, available at: www.rochester.edu/news/show.php?id=2518 (accessed January 25, 2007).
D-Lib (n.d.), "About D-Lib Magazine", available at: www.dlib.org/about.html (accessed July 28, 2006).
DPubS (n.d.), "DPubS Digital Publishing System", available at: http://dpubs.org/ (accessed January 25, 2007).
DSpace (n.d.), "DSpace", available at: www.dspace.org/ (accessed January 25, 2007).
EDUCAUSE (n.d.), "Welcome to EDUCAUSE", available at: www.educause.edu (accessed July 28, 2006).
Fedora (n.d.), "fedora", available at: www.fedora.info/ (accessed January 25, 2007).
Hawkins, B.L., Golden, C., Luker, M.A., Barone, C.A., Katz, R.N. and Oblinger, D.G. (2006), "A message from the EDUCAUSE executive team", EDUCAUSE Review, Vol. 41 No. 3, May/June, p. 4, available at: www.educause.edu/apps/er/erm06/erm06312.asp (accessed July 7, 2006).
Hines, S. (2005), "Text messaging/SMS reference services", InfoIssues, November, available at: www.lib.umt.edu/services/infoissues/archive/nov2005.htm (accessed July 9, 2006).
Long, P.D. and Ehrmann, S.C. (2005), "Future of the learning space: breaking out of the box", EDUCAUSE Review, Vol. 40 No. 4, July/August, pp. 42-58, available at: www.educause.edu/ir/library/pdf/erm0542.pdf (accessed July 7, 2006).
Mullins, J.L. and Brandt, D.S. (2006), "Building an Interdisciplinary Research Program", videotape of presentation March 30, Ohio State University Libraries, Columbus, OH, available at: http://hdl.handle.net/1811/6122 (accessed July 9, 2006).
NCSU Libraries (n.d.), "Endeca at the NCSU Libraries", available at: www.lib.ncsu.edu/endeca/ (accessed January 25, 2007).
OCLC (n.d.), "OCLC Online Computer Library Center", available at: www.oclc.org (accessed July 28, 2006).
OCLC Online Computer Library Center (2004), 2004 Information Format Trends: Content, Not Containers, OCLC, Dublin, OH, available at: http://www5.oclc.org/downloads/community/2004infotrends_content.pdf (accessed July 28, 2006).
OCLC Online Computer Library Center (2006), "Send a librarian an SMS", OCLC Newsletter, January/February/March, p. 12, available at: www.oclc.org/news/publications/newsletters/oclc/2005/267/downloads/n267.pdf (accessed July 9, 2006).
OhioLINK (n.d.), "Ohio Library and Information Network", available at: www.ohiolink.edu/ (accessed January 25, 2007).
Open Content Alliance (n.d.), "Open Content Alliance", available at: www.opencontentalliance.org/ (accessed July 7, 2006).
Purdue University Libraries (2005), "Dean of libraries: profile", available at: www.lib.purdue.edu/admin/deansbio.html (accessed January 25, 2007).
Reardon, M. (2006), "'Dot-mobi' domain for mobile devices hits the web", CNET News.com, May 23, available at: http://news.com.com/Dot-mobi+domain+for+mobile+devices+hits+the+Web/2100-1039_3-6075779.html (accessed January 25, 2007).
Stephens, D. and Russell, K. (2004), "Organizational development, leadership, change, and the future of libraries", Library Trends, Vol. 53 No. 1, Summer, pp. 238-57.
Svoboda, E. (2006), "One-click content, no guarantees", IEEE Spectrum Online, May, available at: www.spectrum.ieee.org/may06/3412/2 (accessed July 29, 2006).
Wagner, E.D. (2005), "Enabling mobile learning", EDUCAUSE Review, Vol. 40, May/June, pp. 41-52, available at: www.educause.edu/ir/library/pdf/erm0532.pdf (accessed July 7, 2006).

Further reading
Google (2004), "Google checks out library books", available at: www.google.com/press/pressrel/print_library.html (accessed July 7, 2006).

Corresponding author
Sally A. Rogers

work_f6n3a263ofdv7ewtbwwm7nh4ve ---- Recommended Best Practices for Digital Image Capture of Musical Scores

Jenn Riley
Ichiro Fujinaga

The authors
Jenn Riley is Digital Media Specialist at Digital Library Program, Indiana University, Bloomington, Indiana, USA. Ichiro Fujinaga is Assistant Professor of Music Technology at Faculty of Music, McGill University, Montréal, Québec, Canada.

Keywords
Best practice, digitization, musical scores

Abstract
Musical scores, as complex visual articles with small details, are difficult to digitally capture and deliver well. All capture decisions should be made with a clear idea of the purpose of the resulting digital images and must be flexible enough to fulfill unanticipated future uses. Best practices for detail and color capture are presented for creating an archival image containing all relevant data from the print source, based on commonly defined purposes of digital capture.
Options and recommendations for file formats for archival storage, web delivery and printing of musical materials are presented.

Introduction

Libraries and archives embarking on digital imaging projects today have a great deal more guidance for decision-making than they did just a few years ago. Standards and best practices for many types of originals have emerged, from the early NARA (National Archives and Records Administration) Guidelines (Puglia and Roginski, 1998), Cornell University's Digital Imaging for Libraries and Archives (Kenney and Chapman, 1996), its successor Moving Theory into Practice (Kenney and Rieger, 2000), and the Library of Congress' documentation for the American Memory project (Fleischhauer, 1998; Library of Congress, 2000) to the Arts and Humanities Data Service's Guides to Good Practice series (Arts and Humanities Data Service, 2002) and the NINCH (National Initiative for a Networked Cultural Heritage) Guidelines (NINCH, 2002). These standards and best practices documents take a wide variety of approaches, from prescriptive lists of appropriate resolutions and bit depths for various formats to explanations of decision-making processes to determine specifications individually for each item to be digitized. They cover many formats of originals, but tend to focus on photographic and printed textual materials. Much of the information in these guidelines can be transferred to the digital capture of musical scores. However, musical notation has a much greater need for accurate detail capture. Staff and ledger lines, dots and bars are all very small details, and any loss of detail results in a significant loss of meaning. This paper will present some best practice guidelines for decision-making for digital image capture of musical scores.

Defining the purpose of scanning

Before any decisions regarding capture specifications can be made, the purpose of the imaging project must be clearly defined. Is the musical score important as a historical artifact, or is only the musical content within worth preserving? Manuscripts, rare materials, and those with annotations by a collector are examples of scores that would require artifactual treatment. Mass-printed publications now in poor condition may be candidates for content-only capture. As the capture of paper watermarks has been covered elsewhere (Edge, 2001; Kenney and Rieger, 2000; Stewart, Scharf and Arney, 1995; Wenger et al., 1995), it will not be covered here. Note that not all materials are good candidates for digital imaging: rare and fragile materials might best be captured for preservation on medium-format color film, such as Ilford's Ilfochrome Micrographic, which has an estimated 300-year life expectancy. While we cannot anticipate all future uses of our digital images, our digitization decisions must be made to ensure that master images are flexible enough for a variety of uses. For musical scores, master images should at the very least support the creation of derivative versions for web delivery, printing and Optical Music Recognition (OMR).

Master file specifications

Resolution

Scanning resolution must be set to capture all important detail from the original. One method of determining this resolution is to determine the minimum scanning resolution based on the stroke width of the smallest detail (Kenney and Rieger, 2000, pp. 46-47). For musical notation, this smallest detail is generally the white space between beams (see Figure 1).
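The arithmetic behind this method is simple enough to sketch in code. The following Python fragment is a minimal illustration only, not part of the cited guidelines: the function name is ours, and the default of 3 pixels per detail anticipates the OMR requirement discussed in the next paragraph:

    def min_scan_dpi(detail_size_mm, pixels_per_detail=3):
        """Minimum scanning resolution (in dpi) needed to render the smallest
        meaningful detail, e.g. the white space between beams, with a given
        number of pixels across it."""
        detail_size_in = detail_size_mm / 25.4  # convert millimetres to inches
        return pixels_per_detail / detail_size_in

    # A 0.127 mm (0.005 in) gap covered by 3 pixels requires 3 / 0.005,
    # i.e. roughly 600 dpi:
    print(min_scan_dpi(0.127))  # ~600.0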
While Kenney advocates capturing the smallest detail with 2 pixels for adequate reproduction of the stroke with a grayscale scan, 3 pixels per detail is required for successful OMR with the forthcoming Gamera software (MacMillan et al., 2001).

Figure 1. An example of very small spacing between beams (scanned at 600dpi).

However, details in musical notation are consistently smaller than 1mm and are difficult to measure accurately without specialized equipment. Also, since printing sizes of musical notation are not consistent between different publications, this method would have to be applied individually to each piece of music to be scanned. Because of these problems, for most projects it would be more appropriate to simply capture all images at the same resolution. Our tests have found that 600dpi is a sufficient resolution to capture all significant detail for most musical notation, as seen in Figure 2, where the 600dpi scan more adequately renders the ledger line and the sharp sign. This resolution will capture detail as small as 0.005in (0.127mm) with the required 3 pixels. For larger printed notation, 300dpi may be sufficient. Our preliminary studies show that resolutions above 600dpi generally do not offer much advantage for the purpose of web viewing, printing, or OMR. This is true even in the case of miniature scores as shown in Figure 3, where there is an improvement from 300dpi to 600dpi but there are no clear improvements in the 1200dpi or 1600dpi scans. Grayscale versions of these sample images and some others are available online.

Figure 2. Small detail scanned at (a) 300dpi, (b) 600dpi.

Figure 3. Miniature score scanned at (a) 300dpi, (b) 600dpi, (c) 1200dpi, (d) 1600dpi.

Color Reproduction and Bit Depth

Musical notation must be captured in grayscale, as 1-bit (bitonal) scanning is generally not adequate to capture all important detail. (Online samples comparing bitonal and grayscale detail reproduction illustrate the difference.) If color is used on the page in a meaningful way, such as on sheet music covers, color scanning should be used. Grayscale and color scanning should both use at least 8 bits per channel, and higher bit depths may be appropriate for some uses. In order to preserve this full color range, any image manipulations done according to the guidelines below should be performed in the scanning software at the time of capture, not after capture with an image-editing application. It is important to understand that image manipulations done to the master file, including straightening, reduce the amount of data present in the master image. They should be done only to achieve the goal of the master image: to reproduce an artifact or to maximize capture of its musical content. Before doing any image adjustments, the imaging system must be set up properly to ensure that the scanner accurately sees the color of the printed original and the image displayed on the monitor accurately represents the data in the image. The scanner and monitor should both be characterized and managed via International Color Consortium (ICC) profiles. Operating-system level color management exists both for Macintosh in the form of ColorSync and for Windows 98, 2000, ME, and XP in the form of Image Color Management (ICM). Locally-created ICC profiles, such as those created with Monaco Systems' software for each device, are preferable to generic profiles for a specific model of scanner or monitor.
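For readers who want to see what a profile-based transform looks like in practice, here is a minimal sketch using the Pillow imaging library's ImageCms module in Python. It is purely illustrative and not part of these guidelines: the file names are hypothetical placeholders, and, as recommended above, production color conversions belong in the scanning software at capture time rather than in post-processing:

    from PIL import Image, ImageCms

    # Hypothetical profiles: a locally-created profile characterizing the
    # scanner, and a standard working-space profile.
    scanner_profile = ImageCms.getOpenProfile("my_scanner.icc")
    working_profile = ImageCms.getOpenProfile("sRGB.icc")

    # Assumes a color master, e.g. a sheet music cover, in RGB mode.
    image = Image.open("score_cover_master.tif")

    # Map pixel values from the scanner's color space into the working space,
    # so the stored numbers render consistently on any color-managed device.
    converted = ImageCms.profileToProfile(image, scanner_profile, working_profile)
    converted.save("score_cover_converted.tif")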
Once a system is properly calibrated, it should capture reasonably color-accurate versions of the original printed materials. If the purpose of the imaging project is to capture the artifact as it exists today, no corrections should be made to the master images. Every effort should be made to ensure pages are straight during capture, as rotating them in image-editing software can result in a loss of detail. If capture of the musical content rather than visual content has been determined as the purpose of the scan, the contrast between the musical notation and background of the page should be maximized. A well-contrasted page will have completely filled-in note heads, solid staff lines, and clean white space between staff lines when viewed at 100% magnification in image editing software.

Master File Formats

Uncompressed TIFF is generally suggested as the most appropriate file format for master files (Fleischhauer, 1998; Puglia and Roginski, 1998). However, TIFF is not a true, but instead a de facto, standard. The PNG (Portable Network Graphics) format may be an emerging replacement for TIFF for this purpose (Roelofs, 1999). PNG has the technical capabilities to store all relevant information captured according to these guidelines. It can use lossless compression, and produces significantly smaller files than uncompressed TIFF files and various JPEG lossless compression schemes (Santa-Cruz et al., 2000). Most archival imaging projects, however, still use TIFF as the master file format, and it may be some time before it is clear whether the digital library community as a whole accepts PNG as a master file format.

Storage of Master Files

Proper storage of master files is perhaps the most difficult aspect of managing a digital imaging project. One possible system allowing for multiple copies of master and derivative files on a variety of media is described online. However, even basic configurations such as this one will not be available to many smaller institutions without sufficient technical support embarking on digital projects. Storage of master files on optical media such as CD-R and DVD-R is a short-term solution and should be supplemented by efforts to increase access to long-term data storage.

Web Delivery File Formats

Regardless of whether master files are captured as artifacts or just for content, methods for delivering the images via the web are the same. At first glance, there appear to be an extremely large number of file format options for web delivery. Using an open format is not as important for delivery images as it is for master images. However, some choices are better than others and the final decision regarding web delivery format should take into account three major considerations: availability of web viewers for the format, support for multi-page images, and file size. Table 1 sorts possible delivery formats according to the first two of these criteria.

Table 1. Comparison of some web-deliverable image formats, by availability of web viewer and support for multi-page images:
- Viewable in all graphical web browsers, no multi-page support: JPEG, GIF, PNG*
- Requires a common plug-in, multi-page supported: PDF
- Requires an uncommon plug-in, multi-page supported: TIFF, DjVu
- Requires an uncommon plug-in, no multi-page support: JPEG2000

*PNG has a reputation for poor web browser support, but current support problems are with advanced PNG functionality, namely, alpha transparency. Simple page images in PNG format will display properly in recent versions of all major web browsers, including Netscape and Internet Explorer versions 4 and above, on all major platforms.
Usage of pre-version 4 browsers is now at less than 1%, according to Jupitermedia's The Counter. Unfortunately, there is not a multi-page image format with native support in the mainstream graphical web browsers. While tools for viewing PDF files are fairly widespread and easily available, the PDF format was not designed for efficient compression of scanned images. Converting score images to PDF at acceptable resolutions for screen viewing and printing, even for short pieces, will generally result in prohibitively large file sizes. Other formats, such as DjVu (Bottou et al., 1998) and JPEG2000 (Santa-Cruz et al., 2000) hold promise for more efficient web delivery in the future but are not currently widespread enough to be appropriate for delivery to a wide audience. Instead, single-page image formats should be used together with some sort of "page-turning" mechanism in the user interface. To accomplish this, metadata describing the structural relationship of page images to one another must be stored, for example, in the METS schema, and used to generate HTML code for navigation within the score. The choice between JPEG, GIF, and PNG is affected partially by file size, which is a function of the pixel dimensions of the display image.

Pixel Dimensions

The size of score images for screen display depends on the size and type of your original and the characteristics of your users. Most standards and best practices for web delivery of digital images focus on determining a fixed set of pixel dimensions for images, balancing the amount of detail presented with the need to fit an image on a user's screen. However, for musical notation, the readability of the page and the level of detail presented are essential, and thus are more important than making an entire score page visible at a glance. Downsizing master score image files to 100-200 dpi from their original page size should result in screen-readable images from most sizes of originals; sample images available online demonstrate this. As these images show, for all but the smallest printed notation, web-deliverable images can be created that show all necessary detail without requiring horizontal scrolling. However, vertical scrolling will be required at many screen resolutions. At these sizes, there is very little visual difference between grayscale JPEG, GIF, and PNG files of musical score pages. JPEG files are preferable to GIF files for two reasons. We have found that for grayscale notation pages, JPEG images of score pages at medium-high to high quality tend to be smaller than GIF files, and do not show obvious compression artifacts at these sizes. Scores with large printing can be compressed more heavily, down to what many define as "medium" quality (e.g. 50% in utilities such as ImageMagick and GraphicConverter, or level 6 in Adobe Photoshop). For color images, GIF files are unsuitable because the GIF format is limited to an 8-bit palette, which can result in unacceptable color-shifting. PNG offers an advantage over JPEG in that it can use lossless compression. We have found PNG files for web delivery of scores to be smaller than high-quality JPEGs but larger than medium-high quality JPEGs. Some average file sizes for the different formats can be found in Table 2, below.
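A downsizing step along these lines can be sketched with the Pillow library in Python. This is a minimal illustration only, not the authors' workflow: the file names, the 150 dpi target, and the JPEG quality value are assumptions chosen from within the ranges recommended above, and an 8-bit grayscale master is assumed:

    from PIL import Image

    MASTER_DPI = 600  # capture resolution of the grayscale master
    WEB_DPI = 150     # display target within the 100-200 dpi range above

    master = Image.open("score_page_master.tif")

    # Scale the pixel dimensions in proportion to the change in resolution.
    scale = WEB_DPI / MASTER_DPI
    web_size = (round(master.width * scale), round(master.height * scale))
    web_image = master.resize(web_size, Image.LANCZOS)

    # Medium-high JPEG quality keeps the notation readable at a modest size;
    # lossless PNG is the alternative when compression artifacts are unwanted.
    web_image.save("score_page_web.jpg", quality=75)
    web_image.save("score_page_web.png")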
Table 2. Representative file sizes for web-deliverable images from a 9" x 12" original:
- GIF: 598K at 200dpi, 389K at 150dpi, 216K at 100dpi
- PNG: 500K at 200dpi, 326K at 150dpi, 180K at 100dpi
- JPEG high quality: 647K at 200dpi, 421K at 150dpi, 280K at 100dpi
- JPEG medium-high quality: 411K at 200dpi, 268K at 150dpi, 137K at 100dpi
- JPEG medium quality: 332K at 200dpi, 215K at 150dpi, 111K at 100dpi

For some collections it may be appropriate to provide thumbnail-sized images for browsing. While thumbnails of notation pages would generally not be very useful, thumbnail browsing of sheet music covers may be desirable. Images downsized to 5-25 dpi from their original page size should produce thumbnail-sized images. The compression method should be either JPEG (medium to high quality) or PNG.

Processing Filters

While no image processing should be done on the master files, it may be appropriate when creating derivatives for web display in order to increase their readability. Depending on the size and quality of the original, sharpening, deskewing and thresholding filters may be appropriate for use when creating web-deliverable images of musical scores.

Printing

Printing is a much greater need for digitized musical score collections than for many other formats. While it may not be important to be able to print colored covers or pages from original manuscripts, score pages intended for use for practice or performance will need print capability. While the exact best file format for print versions of score images may vary between user populations, generally score images for printing on laser printers are best presented as bitonal files at 250–400dpi, depending on the original print size (examples are available online). At lower resolutions, bitonal PNG files on average are smaller, while at higher resolutions, Group 4 compressed TIFF files on average are smaller, as shown in Table 3.

Table 3. PNG and Group 4 compressed TIFF file size comparison for bitonal images:
- 800dpi: PNG 329 KB, TIFF (Group 4) 192 KB
- 400dpi: PNG 183 KB, TIFF (Group 4) 146 KB
- 250dpi: PNG 90 KB, TIFF (Group 4) 96 KB
- 200dpi: PNG 64 KB, TIFF (Group 4) 71 KB
- 100dpi: PNG 25 KB, TIFF (Group 4) 38 KB

Files intended for printing must be easily downloaded by users. The TIFF format allows multi-page files, which would eliminate the need for bundling single image files using a utility like ZIP for Windows or TAR for Unix-based systems. However, many TIFF viewers cannot display multi-page TIFF files.

Conclusion

Digital imaging standards and best practices can be applied to the digitization of musical scores, when used with a full understanding of the decision-making processes behind their recommendations. A well-designed digital imaging process with appropriate quality control mechanisms can result in flexible master files from which successful OMR can be done, and web-viewable and print-quality images can be created.

Acknowledgments

Ichiro Fujinaga would like to thank Michael Droettboom, Karl MacMillan, and Asha Srinivasan for their help in the preparation of this paper. This research is funded in part by the NSF's DLI-2 initiative (#981743), IMLS National Leadership Grant, and support from the Levy Family.

References

Arts and Humanities Data Service (2002), "Guides to Good Practice in the Creation and Use of Digital Resources", Available: http://www.ahds.ac.uk/guides.htm (Accessed: 2002, December 18).
Bottou, L., Haffner, P., Howard, P.G., Simard, P., Bengio, Y. and Le Cun, Y. (1998), "High quality document image compression with DjVu", Journal of Electronic Imaging, vol. 7, no. 3, pp. 410-425.
Edge, D. (2001), "The digital imaging of watermarks", Computing in Musicology, vol. 12, pp. 261-274.
Fleischhauer, C.
(1998, July 13), "Digital formats for content reproductions", (Library of Congress), Available: http://memory.loc.gov/ammem/formats.html (Accessed: 2002, December 20).
Kenney, A. and Chapman, S. (1996), Digital Imaging for Libraries and Archives, Cornell University Library, Ithaca, NY.
Kenney, A. and Rieger, O. (2000), Moving Theory into Practice, Research Libraries Group, Mountain View, California.
Library of Congress (2000, April 14), "Building Digital Collections: Technical Information and Background Papers", Available: http://memory.loc.gov/ammem/ftpfiles.html (Accessed: 2002, December 17).
MacMillan, K., Droettboom, M. and Fujinaga, I. (2001), "Gamera: A structured document recognition application development environment", Proceedings of the 2nd Annual International Symposium on Music Information Retrieval, Oct. 15-17 2001, Bloomington, Indiana, pp. 15-16.
"NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials" (2002, October), Available: http://www.nyu.edu/its/humanities/ninchguide/index.html (Accessed: 2002, December 17).
Puglia, S. and Roginski, B. (1998, January), "NARA guidelines for digitizing archival materials for electronic access", (National Archives and Records Administration), Available: http://www.archives.gov/research_room/arc/arc_info/guidelines_for_digitizing_archival_materials.html (Accessed: 2002, December 15).
Roelofs, G. (1999), PNG: The Definitive Guide, O'Reilly, Sebastopol, CA.
Santa-Cruz, D., Ebrahimi, T., Askelöf, J., Larsson, M. and Christopoulos, C.A. (2000), "JPEG 2000 still image coding versus other standards", in Proceedings of the SPIE's 45th annual meeting, Applications of Digital Image Processing XXIII, vol. 4115, pp. 446-454.
Stewart, D., Scharf, R.A. and Arney, J.S. (1995), "Techniques for digital image capture of watermarks", Journal of Imaging Science and Technology, vol. 39, no. 3, pp. 261-267.
Wenger, E., Karnaukhov, V., Haidinger, A. and Merzlyakov, N. (1995), "Image analysis for dating of old manuscript", in Chin, R.T., Ip, H.S., Naiman, A.C. and Pong, T.-C. (Eds.), Image Analysis Applications and Computer Graphics, Lecture Notes in Computer Science 1024, Springer, Berlin.

work_fcf5rtc7vjcdjfsx4rrbeh57di ---- Building Institutional Repository Infrastructure in Regional Australia

Caroline Drury has qualifications in arts, linguistics and information technology. She is currently based at the University of Southern Queensland, Australia and is Senior Technical Officer on the RUBRIC Project – a Department of Education, Science and Training (DEST) initiative to encourage the use of institutional repository systems in small and regional universities in Australia and New Zealand.

Introduction

The RUBRIC (Regional Universities Building Research Infrastructure Collaboratively) Project began in late 2005 and has been tasked with meeting the needs of smaller and regional universities in Australia and New Zealand in developing sustainable infrastructure for the deployment of institutional repositories. To achieve this task the RUBRIC project is working with five core partners in the evaluation, testing, development and deployment of institutional repository infrastructure. In working with the core project partners, the FOSTER (Facilitated Online Sources Toolkit for Establishing Repositories) toolkit is being developed.
The development of FOSTER, its anticipated use and outcomes thus far are the focus of this paper.

The need for FOSTER

Early in the project, the RUBRIC team recognized, based on previous experience, the need for a resource which could assist an institution in establishing institutional repository infrastructure. A sizable task for any institution, the establishment of such a repository is especially daunting to small and regional universities, who do not typically have the resources to spare for such a project. RUBRIC also found that some institutions were not familiar with institutional repositories and their use. This led to some of the partners needing a business case to put to the institution as to why that institution needed an institutional repository. Circumstances such as IT department restructuring and implementation of new learning management systems caused confusion as to the need for yet another system. RUBRIC recognized that these types of difficulties would not be unique and that the resources involved in building such a business case would be useful to the broader community. RUBRIC decided that a collection of resources such as the ones needed for a business case would be not only useful, but of great importance to a fledgling institutional repository. Not only the resources, though, but also the mentoring in using these resources would be extremely important. This is particularly the case for smaller and regional universities, for whom the project is a greater risk. Some of the key difficulties faced by regional and smaller universities may include "reduced capacity for risk-taking with experimental or immature technologies; limited infrastructure resources; limited staffing resources [...]" (Lowe, 2006). However, there are also some advantages that smaller and regional universities may have over larger institutions, including "greater agility for adopting innovations with less bureaucracy; well integrated intra-institutional infrastructure which facilitates the delivery of projects that require a range of skills from different departments; [...] close links between research, learning and teaching as the practical application and dissemination of research [being] of major importance to regional universities..." (Lowe, 2006). The development of the FOSTER concept is being designed to assist the RUBRIC Project partner institutions in understanding not only the types of software packages available for this purpose, but also what is required to implement a successful institutional repository and what is needed to ensure sustainability beyond the project.

FOSTER Development

The FOSTER (Facilitated Online Sources Toolkit for Establishing Repositories) toolkit is being developed in response to a need, identified by the RUBRIC Project in early 2006, for a centralized, accurate collection of resources to assist in the establishment of institutional repositories. In particular, the focus is upon the establishment of institutional repositories in regional and smaller universities.

FOSTER Information Days

Keeping in mind the primary aim of mentoring partners, FOSTER Information Days were created as a means by which RUBRIC Central team members and project partners could join together to share information and experiences. These days have allowed the group dynamic to bring out issues important to the partners, and not surprisingly, there are similar issues for each university.
By identifying these issues and establishing their priority in such a project, work on resourcing them is able to progress. The Information Days have allowed such discussions to take place, and following the Information Day, issues discussed are written into the RUBRIC wiki. Another feature of the Information Days has been the development of group trust between RUBRIC Central team members and the project partners. The development of this trust has been of significance in the progression of the project. By providing the opportunity for face-to-face interaction, the partner project managers are able to discuss their issues in an informal setting. The face-to-face interaction also facilitates brainstorming sessions, in which RUBRIC Central team members and partners are able to collaboratively decide on priorities and discuss issues. The FOSTER Information Days have also provided an opportunity for RUBRIC Central to discover where strengths and weaknesses in the current levels of knowledge exist. Each partner institution brings different experiences to the project and the provision of face-to-face interaction provides an open forum for these experiences to emerge. Through this, RUBRIC Central has been able to find out which areas of institutional repository infrastructure development need the most focus, particularly for smaller and regional universities.

Collaboration

Since the early days of the FOSTER concept, collaboration with other more experienced bodies has been greatly valued. Efforts were made to talk to other parties who have been involved in institutional repository infrastructure development for a while. By discussing history, difficulties, and future plans with these other parties, RUBRIC was able to create a bigger picture for the smaller and regional universities who are the project partners. Being able to note the difficulties that may have been faced by other universities is of great value to a fledgling project. A common theme that emerged during talks with more experienced parties was the lack of documentation of best practice in the development of repository infrastructure within Australia. This lack means that smaller and regional universities are placed in the difficult position of not necessarily knowing which way is the best way for that individual institution to implement an institutional repository, and yet also not having the resources to spare to find out. Another aspect of the collaboration undertaken was a review of the previous experience of the RUBRIC team members. Most of this experience came from three members of the team who had previously been involved in establishing an institutional repository for the University of Southern Queensland. This project had been carried out over a period of 18 months and much had been learnt about the difficulties of implementing such a system at a small regional university. The final aspect of collaboration has been discussions with the project partners. Most of the partners do not yet have an institutional repository at their university, so the information gleaned from these partners has included expectations, perceived needs and wants, and a review of the resource restrictions within which they are able to operate. As all the project partners are regional universities, the resources available to them are of a similar nature; however, there are also unique needs for each institution which need to be taken into account.
FOSTER Tools

In developing the concept of FOSTER, RUBRIC is making extensive use of various communication methods which will eventually feed into the final Toolkit. The primary tools being utilised include a wiki, a subversion repository, tagging, and teleconferencing.

[Figure: the FOSTER Toolkit and the inputs that feed it, namely the wiki, FOSTER Information Days, weekly teleconferences, work ticketing, shared experiences, focus topics, previous experience of RUBRIC Central, del.icio.us tagging, blogging, and the subversion repository.]

Using a wiki has emerged as a useful way in which to not only gather feedback, but also to further discussions. The wiki is also another way in which RUBRIC Central is able to mentor project partners in the establishment of institutional repository infrastructure. RUBRIC Central is able to direct and suggest discussions and topics which emerge from the wiki at the same time as encouraging participation by partners. The informal structure and writing style of the wiki mean that basic concepts and topics are able to develop relatively freely and with input from all. The wiki provides an environment in which the basics of the FOSTER Toolkit can grow throughout the course of the project. The wiki also provides a test bed for issues involved in the implementation of this type of infrastructure, and will enable the longer term product of the FOSTER Toolkit to be a more refined product, and better written. By using the wiki in this way, the best sources of the information gathered will be selected for the final product.

RUBRIC uses a subversion repository to house all documentation. An emphasis has been placed on documenting all possible procedures and guidelines produced by the project, in order to provide maximum assistance to project partners. The subversion repository allows RUBRIC Central team members and project partners to have centralised, versioned document storage. Although access to the subversion repository is currently limited to RUBRIC associated parties, it also contains a public section for technical reports which are freely available.

RUBRIC also makes use of social tagging by using del.icio.us (2006). Several main RUBRIC tags have been created and all project participants are encouraged to tag resources they find useful and informative with an appropriate RUBRIC tag. These tags can then be monitored by RSS feeds or similar, allowing all interested parties to share in the discovery of these resources (a minimal monitoring sketch appears at the end of this section).

Teleconferences are another important tool in use by the RUBRIC Project. The teleconferences are held weekly, or more frequently as the need arises. These allow project partners to stay in regular contact with RUBRIC Central when distance does not permit face-to-face meetings, and issues facing partners are regularly discussed and monitored. Monitoring of these issues, and the tasks involved in solving them, is done using a work ticketing system, Trac (2006). RUBRIC Central team members may be assigned tickets involving further research into an issue, or troubleshooting of a system. Each of these tickets can then be tracked by interested project partners to the point of resolution. This allows transparent and interactive communication, while also providing historical documentation as to the work being carried out by RUBRIC Central.
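As an illustration of the tag-monitoring workflow mentioned above, a feed for a shared tag could be polled programmatically. The following Python sketch is hypothetical rather than taken from the project's documentation: the feed URL pattern follows del.icio.us conventions of the time, and the tag name stands in for one of the project's shared tags:

    import feedparser  # a common library for reading RSS/Atom feeds

    # Hypothetical feed URL for a shared project tag.
    FEED_URL = "http://del.icio.us/rss/tag/rubric"

    def latest_tagged_resources(feed_url=FEED_URL, limit=10):
        """Return (title, link) pairs for the most recently bookmarked
        resources carrying the project tag."""
        feed = feedparser.parse(feed_url)
        return [(entry.title, entry.link) for entry in feed.entries[:limit]]

    if __name__ == "__main__":
        for title, link in latest_tagged_resources():
            print(title)
            print("  " + link)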
Intended Use of FOSTER

The final FOSTER toolkit is anticipated to be used primarily by regional and smaller universities, but may also be of use to larger institutions who are undertaking the establishment of institutional repository infrastructure. The FOSTER toolkit will have a primarily Australian and New Zealand focus; however, it may be of use to the broader international community. The collection of resources that it is anticipated will eventually make up the Toolkit will provide a valuable and succinct resource for anyone undertaking a similar project.

Ongoing development of FOSTER

The information sources which are being used in the development of the FOSTER toolkit are being gathered from across the globe. By utilising resources from such a broad pool, the FOSTER toolkit will be a valuable resource for the intended audience of project partners and other small and / or regional universities in Australia and New Zealand. The development of these resources into the final Toolkit will commence closer to the end of the project, when it is anticipated that the collection will be gathered together as a complete resource. The categorisation of FOSTER will be based upon current perceived needs of the RUBRIC project partners along with emerging best practice requirements. It is anticipated that these categories will alter throughout the development of FOSTER, as the needs of project partners, and others, change. Feedback is seen as an important part of the development of FOSTER, primarily because of the value of others' experiences. The utilisation of several tools by which feedback can be created and received provides methods for gathering feedback. By using the wiki, the teleconferences, and face-to-face interaction, RUBRIC is able to regularly gather feedback from project partners. As the development of FOSTER progresses, feedback will continue to be sought from partners as to their perceptions of the usefulness and relevance of emerging topics. The continuation of the community of trust built up between project partners and the RUBRIC Central team will continue to be important. This includes open discussion at FOSTER Information Days and during teleconferences, as well as allowing for editing of and adding to other people's wiki pages. The emergence of important issues within each institution will continue to provide direction for the development of the FOSTER concept.

Difficulties faced in developing FOSTER

While developing the idea of the FOSTER concept was relatively simple, the resourcing of it has proven a challenge. The FOSTER concept was not in the original RUBRIC Project bid documentation and so was not planned from the outset of the project. The emergence of a number of issues, coupled with the desire to archive and make publicly available any solutions the project found, fuelled the beginnings of the concept. This also meant, however, that the development of the concept needed to be resourced, including finding staff. Once the realisation came as to the potential impact of this resource, it was important to ensure the continued development. Staff on the RUBRIC Central team have in some cases had to expand on their normal duties to assist with the development. Research on the issues often requires extensive expenditure of effort and time, causing the need for staff to be solely dedicated to this task.
While some information has been found as part of the normal progression of the project, there has also been the need to have someone available to research and document this information in an appropriate manner. Another difficulty faced is the continued development of the tools being used. The use of the wiki and the del.icio.us tagging was not expected to be such a success and was therefore planned for use on a smaller scale. While the larger scale use is still conducive to the FOSTER concept development, RUBRIC Central can now see in hindsight that planning for their use on a larger scale would have provided greater flexibility for the future. At this half-way point in the project, it is possible that a re-think of the structure and use of these resources may provide this flexibility. It is anticipated that the move from wiki- and discussion-based development to the more static Toolkit will require a different approach for the construction of the Toolkit. The end product FOSTER Toolkit will have a more structured format and will not provide for commenting mechanisms as are being used currently. Using this
While it is anticipated that the Toolkit will be a static resource, the dynamic nature of these feeder resources enables thorough documentation of experiences and knowledge gained during the life of the project. The varying rates of progress at each institution have revealed another benefit of the wiki, in that the information there can be built on at different times by different people, from the varying perspectives. All these things come together to allow the wiki to develop as an important tool in the development of the FOSTER concept. Future directions The FOSTER toolkit is not a static resource, and accordingly may never reach a fully 'completed' state. As the area of institutional repositories continues to develop, and further generations of appropriate solutions emerge, the FOSTER concept and later, Toolkit, will also continue to develop. 8 Local circumstances and changes to these will continue to influence the development of the FOSTER concept. Possible changes in national reporting workflows, such as the proposed Research Quality Framework (2006), will affect the outcomes of FOSTER. The use of a wiki, subversion repository, tagging and teleconferences will continue to enable thorough investigation into useful resources which will be included as part of the Toolkit. Further developments in the areas of research and institutional repositories will continue to be monitored, particularly in regards to their effects on the development of the final Toolkit. The development of the FOSTER concept will continue to assist RUBRIC Central in its role as mentor to project partners. The various stages of development of FOSTER are providing indications of the need for such a collective resource as this. The need for a centralised resource which is particularly focused on smaller and regional universities is essential. As the concept continues to develop, the resources feeding into it will also continue developing, allowing for the establishment of a thorough and effective tool for use by smaller and regional universities within Australia. Building Institutional Repository Infrastructure in Regional Australia 9 References del.icio.us Web Site. Viewed October 2, 2006, http://del.icio.us Lowe, D (2006) “Implementing institutional Repositories in Regional Australia”, ALIA Click06 Conference, pp. 5-6. Available http://conferences.alia.org.au/alia2006/ Research Quality Framework. Viewed October 2, 2006, http://www.dest.gov.au/sectors/research_sector/policies_issues_reviews/key_issues/re search_quality_framework RUBRIC Web Site. Viewed October 2, 2006, http://www.rubric.edu.au Trac Web Site. 
Viewed October 2, 2006, http://trac.edgewall.org/

work_fk7l6oe2prcpfmldcye5hshmre ---- Molecular Psychiatry (2000) 5, 3. © 2000 Macmillan Publishers Ltd. All rights reserved 1359–4184/00 $15.00 www.nature.com/mp

Publisher's Announcement

Macmillan Publishers Ltd is pleased to be able to announce the creation of a new company. Nature Publishing Group brings together Nature, the Nature monthly titles and the journals formerly published by Stockton Press. Stockton Press becomes the Specialist Journals division of Nature Publishing Group. The new company will be a partner of the scientific community and will be an innovative, responsive and visible presence in scientific and medical publishing. Nature Publishing Group will use its unique strengths, skills and global perspective to meet the demand of a rapidly changing and challenging publishing environment. The Group's publications are known for delivering high-quality, high-impact content, fair pricing, rapid publication, global marketing and a substantial presence on the Internet. These elements are the key to excellence in selecting, editing, enhancing and delivering scientific information in the future. As a company, we have three core values: quality, service and visibility. These values are set to benefit all our customers (authors, readers, librarians, societies, companies and others), thus building strong publishing relationships.

Molecular Psychiatry

Molecular Psychiatry is now part of the Specialist Journals division of Nature Publishing Group. It will be marketed and sold from our offices in New York, Tokyo, London and Basingstoke. Within the electronic environment, Molecular Psychiatry will benefit from a substantial investment in innovative online publishing systems, offering global access, intelligent searches and other essential functions.
Librarians will be able to provide their readers with print and online versions of Molecular Psychiatry through a variety of services including OCLC, Ingenta (linking to the BIDS service), SwetsNet, Ebsco, Dawson's InfoQuest and Adonis. At a time when the basis of traditional journal publishing is undergoing significant changes, Nature Publishing Group aims to support the scientific and medical community's needs for high quality publication services which provide rapid and easy access to the best of biomedical and clinical results.

Jayne Marks
Publishing Director
Specialist Journals, Nature Publishing Group

work_fmfa74xxoveu7e44vt5n34f744 ----

PubMed Central Archiving: A Major Milestone for Current Developments in Nutrition

Jack Odle
North Carolina State University, Raleigh, NC

© 2018 Odle. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits noncommercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. Manuscript received August 17, 2018. Revision accepted August 20, 2018. Published online August 23, 2018.

Dear Nutrition Community,

I am excited to announce that Current Developments in Nutrition (CDN) has been approved for inclusion in PubMed Central. This milestone was achieved at a record pace following our application in January and is highly attributable to the sterling reputations of the ASN and Oxford University Press, who partner in the publication of this fully open-access nutrition journal that was launched in January 2017. This rapidly achieved milestone codifies the health and vitality of the Journal. To date, CDN has published 83 manuscripts. Most of these have already been transferred to PubMed Central, and the remainder should be posted within 2 mo. As of 1 July 2018, citations for CDN articles also now appear in PubMed.

Maximizing the Discoverability of Your Publications in CDN

Table 1 provides a list of indexing agencies in which CDN has been registered, as well as a number of social media outlets collated in Altmetric scores that are tallied for CDN articles. The importance of this indexing cannot be overstated. The impact of a published paper begins with its discoverability by those interested in the research. These popular search engines provide a powerful means of discovery. Given the full open-access publication model of CDN, authors can further highlight and immediately distribute their publications through social media outlets. With this indexing and distribution via social media, authors of papers in CDN can be assured of maximum discoverability of their publications. As papers in CDN are discovered, read, and eventually cited, we look expectantly for our first impact factor as early as 2019.

Maximizing the Efficiency of Publication in CDN

The operational motto among the editors and staff within CDN is to "publish quality nutrition research, quickly." Although we continue to endure the growing pains of being a start-up journal, we have implemented editorial procedures to maximize efficiency.
These include a simple means to transfer manuscripts among our family of journals (The American Journal of Clinical Nutrition, The Journal of Nutrition, Advances in Nutrition, and CDN) with or without any accompanying reviews. Such porting expedites decisions by the receiving journal. Furthermore, we strive for a "one-revision" workflow wherein authors are guided and expected to complete revisions with one iteration. Although this remains an aspirational goal, our first-year production statistics confirm our publication efficiency. The number of days from submission to acceptance averaged 90 d, and total time to final publication averaged 120 d. Such rates are impressively quick for a quality start-up journal. For more information regarding the attributes of publishing in CDN, please see https://academic.oup.com/cdn/pages/cdn_publishing.

TABLE 1 Maximum discoverability of papers in CDN(1)

Indexing agencies        Social media outlets(2)
PubMed Central           Twitter
Google Scholar           Facebook
CABI                     Nutrilink
EBSCO                    Kudos
IFIS (FSTA database)     Reddit
OCLC                     Press releases
ProQuest/ExLibris        Blog posts
TDNet                    Pinterest
AGRICOLA                 LinkedIn
YEWNO                    Google+
Web of Science           Faculty1000
Current Contents         Policy documents
DOAJ

(1) AGRICOLA, Agricultural Online Access; CABI, Centre for Agriculture and Biosciences International; CDN, Current Developments in Nutrition; DOAJ, Directory of Open Access Journals; EBSCO, Elton B. Stephens Co.; FSTA, Food Science and Technology Abstracts; IFIS, International Food Information Service; OCLC, Online Computer Library Center; TDNet, Teldan Network.
(2) Current Altmetrics data show that CDN articles have been cited in 140 news stories within 10 countries and have been the subject of 1825 Twitter feeds spanning 45 countries as well as 81 Facebook posts within 6 countries.

Gratitude and Congratulations

Successfully launching a new open-access journal such as CDN during this brisk digital era poses a formidable challenge. When competition is so high, it is natural and even expected that "loyalties" can become thin, diluted, and dispersed. Despite this backdrop, the global nutrition community has been very supportive. Whether through submission of quality manuscripts, assistance with peer review, or through reading and citation of CDN papers, the nutrition community has helped us reach this important milestone. The dedication and commitment of the Deputy and Academic Editors serving CDN and the ASN, Oxford University Press, and Kaufman-Wills-Fusting staff have been more than amazing. I feel privileged to work with such a talented team. In closing, I extend my congratulations to everyone in the nutrition community who has contributed and who shares in the achievement of this milestone for CDN.
Respectfully yours,

Jack Odle, PhD
Editor-in-Chief
Current Developments in Nutrition

work_fntxa4twe5bdxiqsqlvd7bepnq ----

D-Lib Magazine
July/August 2016
Volume 22, Number 7/8

Analysis of International Linked Data Survey for Implementers

Karen Smith-Yoshimura
OCLC Research
smithyok@oclc.org
DOI: 10.1045/july2016-smith-yoshimura

Abstract

The International Linked Data Survey for Implementers conducted by OCLC Research in 2014 and 2015 attracted responses from 90 institutions in 20 countries. This analysis of the 112 linked data projects or services described by the 2015 respondents — those that publish linked data, consume linked data, or both — indicates that most are primarily experimental in nature. This article provides an overview of the linked data projects or services institutions or organizations have implemented or are implementing, what data they publish and consume, the reasons respondents give for implementing linked data and the barriers encountered. Relatively small projects are emerging and provide a window into future opportunities. Applications that take advantage of linked data resources are currently few and are yet to demonstrate value over existing systems.

1 Introduction

The impetus for an "International Linked Data Survey for Implementers" was discussions with OCLC Research Library Partner(1) metadata managers who were aware of some linked data projects or services but felt there must be more "out there" that they should know about. The survey instrument was designed in consultation with OCLC colleagues and a few OCLC Research Library Partners and beta tested by several linked data implementers. We conducted the initial survey in July and August 2014, distributing the link to the survey on multiple listservs and on Twitter. The target audience was those who had already implemented a linked data project or service, or were in the process of doing so. Questions were asked both about publishing linked data and consuming linked data. The results were published in a series of posts on the OCLC Research blog, HangingTogether.org.(2) While the initial survey results received a generally appreciative response from readers, some noted and regretted the absence of several prominent linked data efforts in Europe. To address these gaps, we repeated the survey on 1 June 2015, with responses due by 31 July 2015.

2 Overview

Seventy-one institutions responded to the 2015 survey, compared to 48 in 2014. Twenty-nine responded to both, for a total of 90 responding institutions in 20 countries (see the list appended at the end of this article). Respondents from the US accounted for 43% of the total with 39 responses, followed by Spain (10), the United Kingdom (9), The Netherlands (6), Norway (4), and Canada (3). We received 2 responses each from Australia, France, Germany, Italy and Switzerland, and 1 response each from Austria, the Czech Republic, Hungary, Ireland, Japan, Malaysia, Portugal, Singapore and Sweden. We were successful in recruiting responses from national libraries, with 14 national libraries (20% of the total) reporting they have implemented or are implementing linked data projects or services (compared to 4 in 2014).
Academic libraries represented 31% of the 2015 respondents (23), followed by 9 multi-institutional networks (14%), 7 government (10%), 6 scholarly projects — hosted by a single responding institution but involving multiple institutions around a particular subject or theme (8%), 5 public libraries (6%), 3 museums (4%), 3 societies (4%) and one publisher. Examples from these sectors are presented in Section 5.

The 71 institutions responding to the 2015 survey reported a total of 168 linked data projects/services, of which 112 were described by respondents at various levels of detail. Two-thirds of the described linked data projects/services are in production, of which 61% have been in production for more than two years. This represents a doubling of the number of "mature" projects/services in production reported in 2014. Ten of the projects are "private", for that institution's use only.

How long linked data project/service in production    2015   2014
Not yet in production                                   37     27
Less than one year                                      19     13
More than one year, less than two years                 10     12
More than two years                                     46     24
Total projects/services described                      112     76

Table 1: Survey Responses on Time Linked Data Projects Have Been in Production

External groups involved: Thirty-five of the projects did not involve any external groups or organizations, but most (69%) did involve some level of external collaboration. In order of frequency named, the external collaborators were: other universities or research institutions; other libraries or archives; external consultants or developers; part of a national collaboration; other consortium members; a systems vendor; part of an international collaboration; a corporation or company; part of a state- or province-wide initiative; part of a discipline-specific collaboration; a scholarly society; a foundation.

Staffing: Almost all of the institutions that have implemented or are implementing linked data projects/services have added linked data to the responsibilities of current staff (98); only 14 have not. Twenty have staff dedicated to linked data projects (twelve of them in conjunction with adding linked data to the responsibilities of current staff). Four are adding or have added new staff with linked data expertise; thirteen are adding or have added temporary staff with linked data expertise; and seventeen are hiring or have hired external consultants with linked data expertise.

Funding: Twenty-five of the projects received grant funding to implement linked data; nearly three-quarters (82) are funded by the library/archive and/or the parent institution. Five linked data projects received funding support from partner institutions; one received corporate funding and one was privately funded.

As many projects are still at an early stage (pre-implementation or early implementation), relatively few respondents were prepared to assess their projects' success. Forty-six reported that their linked data project or service was successful or "mostly" successful. Comments on measuring success clustered around:

Data re-use: One expectation of linked data is that it will enable broader use of local data by more communities. Although one respondent noted that other websites on campus consumed their data and that different data resources were successfully integrated, a couple noted the difficulty of ascertaining how much of their data is being disseminated and re-used.
Increased discoverability: One goal for publishing linked data is to increase the discoverability of the institution's resources. One pointed to high search engine ranking for its collections' rare content as a success metric.

New knowledge creation: Some see repositioning library knowledge work and providing access to its resources through the semantic Web as a network activity enriching researchers' understanding. One pointed to better support of multilingualism by fetching multilingual labels from linked data vocabularies.

Thought leadership: The linked data efforts demonstrated that the institution was taking the initiative in laying the groundwork for a future, different environment. The metrics used include good publicity and feedback among library linked data communities, demonstrating linked data possibilities and the influence on others' linked data projects.

Preparation for the semantic Web: A couple noted the benefits of preparing their existing metadata and facilitating metadata remediation for the semantic Web — even without any tangible improvements yet — and that it is a worthwhile investment of resources.

Operational success: Metrics used include working well with other services that use the data and inspiring others to contribute content. One noted that publishing digital collections as linked data improved discoverability and connections with related data sets, and another that crawlers know what to do with their linked data.

Organizational development: Even absent metrics demonstrating linked data's value to others, linked data projects still provide professional development for staff.

Organizational transformation: Changing catalog data management from MARC-based records to RDF modeling and linked data principles will require new workflows across the library community, now hampered by data duplication. Linked data can give non-librarian content specialists more control over the authorities they use.

In both the 2014 and 2015 surveys, most projects/services both consume and publish linked data. Relatively few only publish linked data.

How linked data is used     2015 survey   2014 survey
Consume linked data              38            25
Publish linked data              10             4
Both consume & publish           64            47

Table 2: Survey Responses on How Linked Data Is Used

3 What and Why Linked Data Is Published

Given the relatively large representation of libraries among respondents, it is no surprise that bibliographic and authority data are the most common types of data published (56 and 45 responses respectively), with descriptive metadata a close third (43). Other types of data published as linked data were: ontologies/vocabularies (30), digital collections (26), geographic data (18), datasets (16), data about museum objects (10), encoded archival descriptions (5), organizational data (5) and data about researchers or library staff (2).

Most of the linked data datasets are small. Of the 67 responses reporting their datasets' size, 39 were less than 10 million triples and 19 were more than 100 million triples. Only three reported sizes of over 1 billion triples: the North Rhine-Westphalian Library Service Center (1-5 billion), the Norwegian University of Science and Technology's "various" linked data projects totaling 15 billion triples, and OCLC's WorldCat Linked Data (15 billion triples).

Comparing the 2014 and 2015 survey results, the key motivations to publish linked data appear unchanged. While the number of responses increased due to greater survey participation, the relative rank of motivations remained the same, as shown in Table 3.

Chief Motivations for Publishing Linked Data                                                          2015   2014
Expose data to a larger audience on the Web.                                                           91%    88%
Demonstrate what could be done with datasets as linked data.                                           80%    80%
Heard about linked data and wanted to try it out by exposing some local data as linked data.           58%    41%
Explore whether publishing data as linked data will improve Search Engine Optimization (SEO) for local resources.   38%    18%

Table 3: Chief Motivations for Publishing Linked Data

A few responded that their administrations had specifically requested that they expose their data as linked data. The British Library noted that their linked data services were part of the UK Government's Public Sector Initiative. Other reasons written in included:

- The need to publish linked data in order to consume it and re-use it in future projects
- Maximize interoperability and reusability of the data
- Testing BIBFRAME(3) and schema.org
- A project requirement
- Provide stable, integrated, normalized data on research activities across the institution

Most projects/services that have been implemented received an average of fewer than 1,000 requests a day over the previous six months. The seven most heavily used linked data datasets, as measured by average number of requests a day, with over 100,000 requests per day, are:

- Europeana, which aggregates metadata for digital objects from museums, archives and audiovisual archives across Europe
- The Getty Vocabularies: Art & Architecture Thesaurus (AAT), Getty Thesaurus of Geographic Place Names (TGN) and Union List of Artist Names (ULAN)
- Library of Congress' Linked Data Service with over 50 vocabularies
- National Diet Library's NDL Search, providing access to bibliographic data from Japanese libraries, archives, museums and academic research institutions
- North Rhine-Westphalian Library Service Center's Linked Open Data service, providing access to bibliographic resources, libraries and related organizations and authority data
- OCLC's WorldCat Linked Data, a catalog of over 370 million bibliographic records made experimentally available in linked data form
- OCLC's Virtual International Authority File (VIAF), an aggregation of over 40 authority files from different countries and regions

Another six linked data datasets receive between 10,000 and 50,000 requests a day:

- American Numismatic Society's nomisma Thesaurus of numismatic concepts
- Bibliothèque nationale de France's data.bnf.fr, providing access to the BnF's collections and providing a hub among different resources
- British Library's British National Bibliography
- National Diet Library's Authority Data
- OCLC's WorldCat Works, the high-level description of a resource common to all editions of a work
- OCLC's FAST (Faceted Application of Subject Headings), a faceted subject heading schema derived from Library of Congress' subject headings

Linked data datasets use a variety of RDF vocabularies and ontologies, and most use multiple ones.
The vocabularies and ontologies respondents named, in the order of frequency cited, are:

- Simple Knowledge Organization System (skos)
- Friend of a Friend (foaf)
- DCMI Metadata Terms (dcterms)
- Dublin Core Metadata Element Set (dce)
- Schema.org vocabulary (schema)
- The Bibliographic Ontology (bibo)
- Local Vocabulary VOCABS
- rda
- Europeana Data Model vocabulary (edm)
- ISBD elements (isbd)
- WGS84 Geo Positioning (geo)
- BIBFRAME Vocabulary (bf)
- Expression of Core FRBR Concepts in RDF (frbr)
- The Event Ontology (event)
- Metadata Authority Description Schema (mads)
- CIDOC Conceptual Reference Model (crm)
- BIO: A vocabulary for biographical information (bio)
- MODS RDF Ontology
- The OAI ORE terms vocabulary (ore)
- Core Organization Ontology (org)
- DPLA Metadata Application Profile (MAP)
- FRBR-aligned Bibliographic Ontology (fabio)
- Library extension of schema.org (lib)
- Music Ontology (mo)
- VIVO Core Ontology (vivo)
- Archival collections ontology (arch)
- ISNI (International Standard Name Identifier — ontology yet to be published)
- RDA Group 2 Elements (rdag2)
- British Library Terms RDF schema (blt)
- Data Catalog Vocabulary (dcat)
- EAC-CPF Descriptions Ontology for Linked Archival Data (eac-cpf)
- Nomisma Ontology
- Radatana
- Review Vocabulary (rev)

Licenses:(4) Twenty-six projects/services do not announce any explicit license; an equal number apply CC0 1.0 Universal, the most common license used. Other licenses that respondents use, in order of frequency, are:

- Open Data Commons Attribution (ODC-BY)
- Open Data Commons Open Database License (ODC-ODbl)
- Public Domain Dedication and License or PPDL
- Creative Commons Attribution-NonCommercial-NoDerivatives (BY-NC-ND)
- French Government's Open Licence (similar to ODC-BY)
- Creative Commons Attribution 3.0
- Creative Commons Attribution ShareAlike 3.0
- Creative Commons Attribution 4.0 International
- Creative Commons Attribution-NonCommercial ShareAlike 3.0 Unported License
- Creative Commons Attribution-NonCommercial ShareAlike 4.0 International (BY-NC-SA)
- Open Data Commons Attribution-ShareAlike Community Norms

Accessibility: Of the 74 projects or services that publish linked data, 19 do not currently make their data accessible outside their institution. Most of the 74% that do, offer multiple methods. Web pages are the most common method, followed by content negotiation, file dumps, SPARQL endpoint, SPARQL editor and applications. The most common serialization of linked data used is RDF/XML. Other serializations, used less often, in order of frequency cited, are: Turtle, JSON-LD, N-Triples, RDFa Core, RDF/JSON, Notation3 and N-Quads.
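To make these serialization options concrete, the snippets below express one small description twice, first in Turtle and then in equivalent JSON-LD. The record, the names and the example.org URIs are invented for illustration; only the dcterms and foaf namespaces are taken from the vocabulary list above.

    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix foaf:    <http://xmlns.com/foaf/0.1/> .

    # A hypothetical bibliographic resource (example.org is a placeholder)
    <http://example.org/resource/123>
        dcterms:title   "A Sample Title" ;
        dcterms:creator <http://example.org/person/jane-doe> .

    <http://example.org/person/jane-doe> foaf:name "Jane Doe" .

The same three triples in JSON-LD:

    {
      "@context": {
        "dcterms": "http://purl.org/dc/terms/",
        "foaf": "http://xmlns.com/foaf/0.1/"
      },
      "@id": "http://example.org/resource/123",
      "dcterms:title": "A Sample Title",
      "dcterms:creator": {
        "@id": "http://example.org/person/jane-doe",
        "foaf:name": "Jane Doe"
      }
    }

Because every serialization encodes the same graph, a publisher can offer several of them from one store and let content negotiation pick the form a consumer asks for.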
Technologies: The technologies used by respondents to publish linked data are diverse, and most used multiple technologies. Table 4 lists the technologies used in order of frequency.

No. of projects   Technologies used (in order of frequency)
10 or more        SPARQL, Java, XSLT, Zorba
2-9               Solr, Virtuoso Universal Server (provides SPARQL endpoint), Google Refine, Jena Applications, RDF Store, Drupal 7, Python, Apache Fuseki, ElasticSearch, Perl, Metafacture, DIGIBIB, MongoDB, 4store, Apache Marmotta, BlazeGraph, GraphDB (formerly OWLIM by Ontotext Software), Hydra, Numishare (built on Orbeon, eXist and Solr), Rails
1                 Apache Tomcat, Cubicweb, Django, dotNetRDF, Fedora Commons, Hbase/Hadoop, JAX-RS, Joomla, LibHub, Mapping Memory Mapper (3M), MARC Report and MARC Global (from The MARC of Quality), MySQL, Node.js, Oracle, PoolParty, Protégé, Pubby, r2rml-parses, Ruby, Ruby Virtuoso triplestore, Sesame, skosmos, Wordpress

Table 4: Technologies Used for Publishing Linked Data

Barriers: The primary barriers to publishing linked data cited by respondents, in order of the most cited, are:

- Steep learning curve for staff
- Inconsistency of legacy data
- Selecting appropriate ontologies to represent our data
- Establishing the links
- Little documentation or advice on how to build the systems
- Lack of tools
- Immature software
- Ascertaining who owns the data

Several also noted as additional barriers restrictive licenses, insufficient resources, data sets too large to publish as a whole (and difficult for others to consume), insufficient institutional support and adapting the current infrastructure to linked data technologies.

4 What and Why Linked Data Is Consumed

Shown below are the linked data resources most consumed by the 2015 survey respondents (those consumed by 12 or more projects) in order of the most frequently cited. (Asterisks denote resources from institutions that are survey respondents.)

- Virtual International Authority File (VIAF)*
- DBpedia
- GeoNames
- id.loc.gov*
- Resources the respondents convert to linked data themselves
- Getty's Art and Architecture Thesaurus*
- FAST Linked Data*
- WorldCat.org*
- data.bnf.fr*
- Deutsche Nationalbibliothek Linked Data Services*

These could be considered successful publishers of linked data by the degree to which others consume the data provided. Library respondents who have implemented projects consuming linked data from other sources generally chose sources in the library domain to consume rather than expanding their universe to non-library sources, with DBpedia and GeoNames being the two exceptions.

Other linked data sources consumed by at least four projects or services, in order of frequency cited: WorldCat.org Works, ISNI (International Standard Name Identifier), Europeana, Lexvo, DPLA (Digital Public Library of America), Wikidata, ORCID (Open Researcher and Contributor ID), AGROVOC (United Nations Food and Agriculture Organization), Hispana, Nomisma.org and Pleiades Gazetteer of Ancient Places.

Although the numbers differed between the 2014 and 2015 surveys, the primary reasons for institutions to consume linked data had similar rankings, as shown in Table 5.

Chief Motivations for Consuming Linked Data                                                            2015   2014
Provide local users with a richer experience.                                                           76%    73%
Enhance local data by consuming linked data from other sources.                                         74%    77%
More effective internal metadata management.                                                            48%    33%
Greater accuracy and scope in local search results.                                                     40%    25%
Explore whether consuming linked data from external sources will improve Search Engine Optimization (SEO) for local resources.   28%    25%
Experiment with combining different types of data into a single triple store.                           25%    31%
Heard about linked data and wanted to try it out by using linked data resources.                        25%    27%

Table 5: Chief Motivations for Consuming Linked Data
Barriers: The primary barriers to consuming linked data cited by respondents, in order of the most cited:

- Matching, disambiguating and aligning source data and linked data resources
- Mapping of vocabulary
- What's published as linked data is not always reusable or lacks URIs
- Lack of authority control
- Datasets not being updated
- Size of RDF dumps
- Understanding how data is structured before using it
- Volatility of the data format of dumps
- Lack of tools
- Unstable endpoints
- Difficulty of getting other institutions to do their own harmonization between objects and concepts
- Service reliability
- Difficulty of disambiguating terms across different languages

Other barriers written in included: licenses more restrictive than ODC-BY; institutions viewing linked data as research projects rather than infrastructure; an insufficient number of linked data datasets of local interest; API limits; and insufficient resources to incorporate consumed linked data into routine workflows.

5 Examples in Production

The spreadsheet(5) containing all responses to both the 2014 and 2015 surveys includes the links to the 75 linked data projects or services in production reported in 2015. A few examples from the different types of respondents are described here.

5.1 National Libraries

Most of the 16 national library linked data projects or services in production provide access to their bibliographic and authority records as linked data. Three national libraries' linked data datasets are among the top 12 consumed by all respondents: id.loc.gov (Library of Congress), data.bnf.fr (Bibliothèque nationale de France) and the DNB Linked Data Service (German National Library).

The British Library was one of the first to make its national bibliography available as linked open data, exposing it in bulk. It is considered successful as it has been selected for the UK National Information Infrastructure and its data model has been influential. Entities include links to both ISNI (International Standard Name Identifier) and VIAF identifiers. The end of the HTML page for an entity shows the SPARQL query used to retrieve the result, which people can modify and re-run, as illustrated below:

Figure 1: The British National Bibliography's SPARQL Query Viewer
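As a rough sketch of the kind of query such a page might expose, the following SPARQL looks up books by ISBN using the dcterms and bibo vocabularies named earlier in this article. The ISBN value and the exact graph shape are illustrative assumptions, not necessarily the BNB's actual data model.

    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX bibo:    <http://purl.org/ontology/bibo/>

    # Find the titles and creators of books carrying a given ISBN.
    # The ISBN value here is a placeholder.
    SELECT ?book ?title ?creator
    WHERE {
      ?book bibo:isbn13 "9780000000000" ;
            dcterms:title ?title ;
            dcterms:creator ?creator .
    }
    LIMIT 10

Exposing the query alongside the result is a small but effective teaching device: a visitor can change the ISBN or add a triple pattern and immediately see how the underlying graph answers.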
The German National Library described four projects that publish linked data: the German National Bibliography, the German Integrated Authority File (GND), its BIBFRAME prototype, and Entity Facts, which aggregates information about entities from various sources in a data model facilitating reuse by developers without domain knowledge. Its Web interface provides information in German and English (see Figure 2).

Figure 2: Screen shot of Example from German National Library's Entity Facts

The National Diet Library reported on 5 projects in the 2015 survey that publish linked data: bibliographic data, authority data, the International Standard Identifier for Libraries and Related Organizations (ISIL) for Japan, an aggregation of resources related to the Great East Japan earthquake of 2011 and the Nippon Decimal Classification (the Japanese standard classification system), currently being converted into linked data.

5.2 Networks

Digital Public Library of America provides access to public domain and openly licensed content held by archives, libraries, museums, and other cultural heritage institutions in the United States.(6) Europeana aggregates metadata for digital objects from libraries, museums, and archives across Europe. Its Europeana data model (EDM) is based on semantic web principles. It fetches multilingual labels from linked data vocabularies.

The North Rhine-Westphalian Library Service Center (hbz) publishes one of the largest linked data datasets (1-5 billion triples). Its linked open data API provides access to 20 million bibliographic records and 45 million holdings from the hbz union catalog; to authority data from the German Integrated Authority File (GND); and to library address data from the German International Standard Identifier for Libraries and Related Organizations (ISIL) registry.

OCLC has published well over 20 billion RDF triples extracted from MARC records and library authority files. This constitutes the largest library aggregation of linked data resources in the world. Three of these linked data sources (FAST, VIAF and WorldCat) were among the top 10 linked data sources consumed by the 2015 survey respondents.

5.3 Academic Libraries

Most academic library respondents' linked data projects are experimental in nature. North Carolina State University's Organization Name Linked Data database includes links created by its Acquisitions and Discovery staff to descriptions of the same organization in other linked data sources, including the Virtual International Authority File (VIAF), the Library of Congress Name Authority File (LCNAF), DBpedia, Freebase, and International Standard Name Identifier (ISNI).

5.4 Public Libraries

Few public libraries responded to the survey, and only two have projects or services in production. Arapahoe Library District is an early adopter of the LibHub initiative(7) to make its catalog Web accessible. The Oslo Public Library converted its MARC catalog to RDF linked data enriched with information harvested from external sources and constructed with SPARQL update queries. Its collection of book reviews written by Norwegian libraries links to the bibliographic data.

5.5 Museums

Few museums responded to the survey. The British Museum's Semantic Web Collection Online is organized around the CIDOC Conceptual Reference Model to harmonize its data with that from other cultural heritage organizations.

5.6 Scholarly projects

Dalhousie University's Institute for Big Data Analytics hosts the multidisciplinary and multinational Muninn Project, aggregating data about World War I in archives around the world. It extracts data from digitized documents and converts it into structured databases that can support further research. It also hosts the Canadian Writing Research Collaboratory, an online infrastructure project to investigate links between writers, texts, places, groups, policies, and events.

The Pratt Institute's Linked Jazz project applies Linked Open Data technologies to digital heritage materials and explores the implications of linked data in the user experience. It exposes relationships between musicians and enables jazz enthusiasts to make more connections. It generates triples from the content of interview transcripts from five jazz archives — from the data rather than converting existing metadata. The first phase of the project was funded through an OCLC/ALISE Library and Information Science Research grant.(8)

Figure 3: Screen shot of a Linked Jazz display

Nomisma is a collaborative, international project among cultural heritage institutions hosted by the American Numismatic Society providing a linked open data thesaurus of numismatic terms and identifiers.
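As a rough illustration of what an entry in such a thesaurus might look like, the sketch below models a single numismatic concept in SKOS, the most frequently cited vocabulary in this survey. The URI, labels and hierarchy are invented, not actual Nomisma data.

    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .

    # A hypothetical thesaurus concept (example.org is a placeholder).
    <http://example.org/id/denarius>
        a skos:Concept ;
        skos:prefLabel  "denarius"@en ;
        skos:definition "A silver coin of ancient Rome."@en ;
        skos:broader    <http://example.org/id/coin> .

Publishing concepts this way gives every term a stable URI that other datasets, such as museum object records, can point at instead of repeating free-text labels.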
5.7 Publishers

Springer is the only publisher to respond to our survey. It is making data about scientific conferences available as Linked Open Data to make information about publications, authors, topics and conferences easier to explore and to facilitate analysis of the productivity and impact of authors, research institutions and conferences.

6 Conclusion

The responses to the 2015 survey may be considered a snapshot of a linked data environment still evolving. We have a partial view of the landscape, as our analysis is confined to those who responded to the survey, primarily from the library domain. Many of the projects described are experimental in nature, as indicated by the wide range of ontologies and technologies used. Linked data projects have focused on the data. Developing linked data services that provide new or better value to communities than the current architecture will require developing applications taking advantage of multiple linked data sources and integrating datasets from different domains.

Several noted that linked data enabled them to overcome silos of data within their own institution. A number of the projects and services described are multi-institutional and collaborative. Such collaboration may help alleviate lack of resources at the institutional level while providing a means to scale learning where local expertise is scarce. Respondents noted the value of the learning experiences provided by working with linked data as preparation for a more "web aware" environment. Educational motivation may be the primary driver for institutional involvement.

Respondents offered advice for others considering a linked data project, including:

- Focus on what you want to achieve, not technical stuff.
- Model data that solves your use cases.
- Strive for long-term data reconciliation and consolidation.
- Add distinctive value: build on what you have that others don't.
- Pick a problem you can solve.
- Involve your institution/community.
- Have a good understanding of linked data structure, available ontologies and your own data.
- Consume your own published data.
- Consider legal issues from the beginning.
- Read as widely as possible and consult community experts.
- Start now! Just do it!

Acknowledgements

The author would like to acknowledge the contributions of OCLC colleagues Jean Godby, Constance Malpas and Jeff Young to this article.

References

(1) The OCLC Research Library Partnership allies like-minded research libraries in twelve countries, providing a venue to undertake cooperative actions that benefit scholars and researchers everywhere.
(2) The results of the 2014 survey were posted on HangingTogether.org between 28 August 2014 and 8 September 2014: Linked Data Survey results 1 — Who's doing it; Linked Data Survey results 2 — Examples in production; Linked Data Survey results 3 — Why and what institutions are consuming; Linked Data Survey results 4 — Why and what institutions are publishing; Linked Data Survey results 5 — Technical details; Linked Data Survey results 6 — Advice from the implementers.
(3) BIBFRAME is the Bibliographic Framework Initiative launched by the Library of Congress to provide a foundation for the future of bibliographic description in the broader networked world.
(4) The W3C provides an overview of licensing for linked open data. See also Open Data Commons (ODC) licenses and Creative Commons (CC) licenses.
(5) Spreadsheets with the complete responses to both the 2014 and 2015 International Linked Data Survey for Implementers are publicly available from OCLC here.
(6) See the current list of the DPLA's partners.
(7) The Libhub initiative's goal is to raise the visibility of libraries on the Web by "exploring the promise of BIBFRAME and Linked Data."
(8) OCLC Research and the Association for Library and Information Science Education (ALISE) annually collaborate to offer grant awards up to $15,000 to support one-year research projects that offer innovative approaches to integrate new technologies and that conduct research contributing to a better understanding of the information environment and user behaviors. For more information, see: http://www.oclc.org/research/grants.html.

Appendix: Institutions Responding to International Linked Data Survey for Implementers

Institution | Country | 2015 survey | 2014 survey
ABES | France | X |
Agencia Española de Cooperación Internacional para el Desarrollo (AECID) | Spain | X |
American Antiquarian Society | USA | | X
American Numismatic Society | USA | X | X
Anythink Libraries | USA | X |
Arapahoe Library District | USA | X |
Archaeology Data Service (UK) | UK | | X
Biblioteca della Camera dei deputati (Italy) | Italy | X | X
Biblioteca. Real Academia Nacional de Medicina | Spain | X |
Biblioteca Valenciana Nicolau Primitiu | Spain | X |
Biblioteca Virtual de Derecho Aragonés | Spain | X |
Bibliothèque nationale de France | France | X |
BIBSYS NTNU (Norwegian University of Science and Technology) University Library | Norway | X | X
Big Data Institute | Canada | X |
British Library | UK | X | X
British Museum | UK | X | X
Carleton College | USA | X | X
Charles University in Prague | Czech Republic | | X
Chemical Heritage Foundation | USA | X |
Colorado College | USA | X | X
Colorado State University | USA | X | X
Columbia University | USA | X |
Consejería de Educación, Cultura y Deportes, Gobierno de Castilla-La Mancha, España | Spain | X |
Consorci de Serveis Universitaris de Catalunya | Spain | X |
Cornell University | USA | X | X
Dartmouth College | USA | X |
Data Archiving and Networked Services, Royal Netherlands Academy of Arts and Sciences | The Netherlands | | X
Digital Public Library of America | USA | X | X
Diputación de Málaga. Cultura y Deportes. Biblioteca Cánovas del Castillo | Spain | X |
Europeana Foundation | The Netherlands | X | X
Evansville Vanderburgh Public Library | USA | X |
Fundación Ignacio Larramendi (Spain) | Spain | X | X
German National Library | Germany | X |
Goldsmiths' College | UK | | X
Haute école de gestion de Genève (SwissBib) | Switzerland | X |
J. Paul Getty Trust | USA | X |
Koninklijke Bibliotheek | The Netherlands | X |
Laurentian University | Canada | X |
Library of Congress | USA | X | X
Ministry of Defense (Spain) | Spain | X |
Minnesota Historical Society | USA | X | X
Missoula Public Library | USA | | X
National Diet Library | Japan | X |
National Library Board (NLB) of Singapore | Singapore | | X
National Library of Malaysia | Malaysia | X |
National Library of Medicine | USA | X | X
National Library of Portugal | Portugal | X |
National Library of Spain | Spain | X |
National Library of Sweden | Sweden | X |
National Library of Wales | UK | X |
National Széchényi Library | Hungary | X |
New York Public Library | USA | X |
New York University | USA | X |
North Carolina State University Libraries | USA | X | X
North Rhine-Westphalian Library Service Center | Germany | X |
NTNU (Norwegian University of Science and Technology) University Library | Norway | X | X
OCLC | USA | X | X
Ohio State University | USA | X |
Oslo Public Library | Norway | X | X
Pratt Institute | USA | X |
Public Record Office, Victoria | Australia | | X
Queen's University Library | Australia | | X
RERO — Library Network of Western Switzerland | Switzerland | X |
Research Libraries UK | UK | | X
Smithsonian | USA | X | X
Springer | USA | X | X
Stanford University | USA | X | X
Stichting Bibliotheek.nl | The Netherlands | | X
The European Library | The Netherlands | X | X
Tresoar (Leeuwarden — The Netherlands) | The Netherlands | | X
Università degli Studi Roma TRE | Italy | X |
University College Dublin | Ireland | | X
University College London (UCL) | UK | | X
University of Alberta Libraries | Canada | X | X
University of Applied Sciences St. Poelten | Austria | X |
University of Bergen Library | Norway | X | X
University of British Columbia | Canada | | X
University of California-Irvine | USA | X | X
University of Illinois at Urbana-Champaign | USA | | X
University of Liverpool | UK | X |
University of Nevada, Las Vegas | USA | X |
University of North Texas | USA | | X
University of Oxford | UK | | X
University of Pennsylvania Libraries | USA | X | X
University of Tennessee, Knoxville | USA | X |
University of Texas at Austin | USA | X | X
Villanova University | USA | X |
Wellcome Library | UK | X |
Western Michigan University | USA | X | X
Yale Center for British Art | USA | | X

About the Author

Karen Smith-Yoshimura is Senior Program Officer at OCLC and works with research institutions affiliated with the transnational OCLC Research Library Partnership on issues related to creating and managing metadata. She is based in the OCLC Research office in San Mateo, CA.

Copyright © 2016 OCLC Online Computer Library Center, Inc.

work_fqrvp3gh7ngr7m2p2mtfhb23pq ----

A Narrative History of Resource Sharing in the State of Maryland

Andrea Japzon

Abstract

The evolution of statewide resource sharing and reciprocal borrowing for Maryland public libraries is discussed. Beginning in the 1950s, the Enoch Pratt Free Library assumed responsibility for filling interlibrary loan requests for the state due to the size of its collection. In 1971, Pratt became the State Library Resource Center and its interlibrary loan responsibilities became formalized. Through a series of technological advancements in library catalogs and interlibrary loan systems, Maryland has arrived at the MARINA system to facilitate sharing resources throughout the state. The state has a long-standing philosophy of cooperation, which makes the MARINA endeavor possible.

In 1971, the Enoch Pratt Free Library in Baltimore, Maryland became the State Library Resource Center. As the Pratt Library has the largest collection in the state, the library has always responded to requests from around the state for interlibrary loan (ILL).
In the 1950s, Pratt received state and city funding for its ILL efforts. As the population of Maryland grew and the tax base increased, the need for library service increased as well. The demand for collection support from Pratt increased, and deposit collections were distributed around the state. Throughout the 1960s, state funding to Pratt increased to support the increased demand. In 1971, a law was passed making Pratt legally responsible for its role in state resource sharing. The law also helped to ensure that financial resources would be made available for technological capital improvements. The Assistant Director for Public Services at Pratt, Patricia Wallace, who was a key player in the technological development of resource sharing in the state of Maryland, describes the progress of interlibrary loan as trains running along different tracks avoiding collision. Trains progressed along the tracks of population growth, public library expansion, technological advances, and library policy/philosophy evolution. Fortunately for the state of Maryland, all trains have run fairly evenly with one another, with technology keeping pace with the resource sharing initiatives of the state.

MARINA, the current resource sharing system in Maryland, owes its success more to a philosophy of equity of access than to technological developments. Within Maryland, the public library systems operate without regard for county borders. Property taxes do not dictate public library patronage. The arrangement in the state allows for the fairest distribution of funding from federal and state agencies. A cooperative borrowing agreement among Maryland public libraries has been in place since 1968. The agreement is a one-page document signed by all the public library directors. The agreement states that all public libraries are available for use by all Maryland residents, with universal check out and return. The cooperative borrowing agreement is the work of a true library visionary, Netti Taylor. She was the state librarian in Maryland for three decades starting in the 1960s and provided the leadership that is responsible for the current state of cooperation within the state. She established solid and lasting support for resource sharing. Technology has caught up with her vision, and Maryland now provides one of the most successful statewide resource sharing endeavors in the country.

Technological Advances in Resource Sharing

With a well-established philosophy of resource sharing, the technology train was working to bring Taylor's vision to life. Pratt began receiving interlibrary loan requests in the 1950s and 1960s via the telephone and through U.S. Mail. In 1978, Pratt was the only library in the state with OCLC capability. At this time, only a few catalogs were available publicly. The Pratt staff had to consult numerous shelves of book catalogs for each county until a request was found. Through the vendor, Autographics, Pratt was able to obtain a union catalog for the state on microfilm and then eventually on microfiche. By 1975, requests were sent via Teletype. Patricia Wallace was responsible for traveling the state and training individuals in the use of the Teletype. The 1980s brought the use of the CD-ROM for a statewide union catalog. The state's entire collection was easily transported on a two-CD set, and libraries were reluctant to let it go. However, the real-time benefits of online records were appealing enough to retire the CDs.
In 1985, Autographics designed Milnet, an automated interlibrary loan system, to provide computer links between participating libraries, access to the online union catalog, and an electronic mail system. This system facilitated interlibrary loan for six public library systems, three regional libraries, three academic libraries, and the State Library Resource Center. This system allowed, for the first time, overnight electronic transmission of ILL requests and put an end to paper looping. This went a long way to decreasing turnaround time.

In the early 1990s, and in the same spirit of cooperative borrowing, librarians from around the state developed the Sailor network (named for a famous Chesapeake Bay Retriever). The first goal of the Sailor network, offering free Internet services to Maryland residents, was achieved by 1994. The second goal, developing an interlibrary loan system that worked with the Sailor server, was achieved in 1995 with SAILS, the Sailor Automated Interlibrary System. The software product that facilitated SAILS was created by CPS Systems, Inc. and employed telnet technology. The SAILS system could integrate the creation of an ILL request with a search of the union catalog that contained current availability information. Also, capturing bibliographic data for requests was possible, along with the tracking of requests.

In the late 1990s, the DYNIX Corporation purchased the SAILS product from CPS Systems. DYNIX released the product URSA, Universal Resource Sharing Application, and SAILS changed to MARINA. Initially, MARINA stood for Maryland Anything, Anytime, Anywhere Resource In Network Application. The URSA product is a Z39.50-based application that allows all 18 catalogs from the 23 Maryland public libraries to be searched at once. The URSA product has greatly enhanced resource sharing in the state, as it has automated many of the tasks of interlibrary loan and introduced patron-placed requests. Patrons access the union catalog directly via the Web from anywhere, submit requests electronically, make journal article requests, review requests in progress, and determine availability for direct pick-up of items. As the URSA product is mapped directly to the libraries' circulation systems, many steps that previously required mediation are automated: holds placed, items checked out and in, creation of short bibliographic records for circulation of items at borrowing libraries. Advances in technology, an emphasis on access over ownership, and the progressive decentralization of lending from the Pratt/State Library Resource Center led to a reduction of interlibrary loan staff.

MILO, Maryland Interlibrary Loan Organization

In 1960, the County Services department was formed at Pratt. The library was under contract with the Maryland State Department of Education to provide services to the public libraries in the state. That year, 4 staff filled 1,782 requests. By 1970, 19 staff processed 54,536 requests. In 1976, the name of the department changed to MILO to reflect the complexity of the growing network of libraries that participated in resource sharing through the newly sanctioned State Library Resource Center. In 1981, due to the advances in technology, the same 19 staff processed 117,944 requests. After 1990, requests processed by the MILO staff for the state reached their peak of 125,803 and have declined since then due to the decentralization of lending throughout the state and the spread of OCLC use to other libraries.
For the past ten years, the staff has been reduced to half of its former levels, with 9.5 staff members. With the increase in technology, the need for staff decreases. Current ILL processes are both streamlined and automated, allowing for maximal efficiency. The MILO department is part of a library system that is unlike most other cooperative endeavors. For example, the Tampa Bay Library Consortium, North Bay Cooperative Library System, and the Detroit Area Library Network are all separate agencies that support cooperative borrowing within a consortium. As Pratt is also the State Library Resource Center, the management of the state's resource sharing network is through the MILO department. The MILO department serves many functions, including ILL services for Pratt patrons. With the help of 125 in-state referral agencies, MILO provides approximately 1,000 individual library agencies with interlibrary loan service. MILO brokers interlibrary loan requests for any library in the state as part of its statewide responsibility. Another major service provided by MILO is transshipping. MILO is the hub for six library delivery systems that deliver to public, academic, and school libraries throughout the state. For fiscal year 2003, MILO staff transshipped over 600,000 items.

MARINA

MARINA is the latest stage in the development of resource sharing throughout the state. A total of 99 libraries use MARINA. Eighteen library catalogs are profiled through the URSA product. Libraries with profiled catalogs function both as lenders and borrowers. Currently, only public library catalogs are profiled; however, academic, school, and special libraries can make requests through the MARINA network. Requests through MARINA are preferred over other requesting venues such as faxing or through OCLC. Requests received through MARINA are less expensive and require less work on the part of the staff, as all ILL requests are maintained through the URSA product. Currently, MILO staff has to maintain requests in both URSA and OCLC. However, it is now possible to link URSA and OCLC directly. In addition to saving time and money through maintaining only one resource sharing system, the direct connection will automatically transfer unfilled MARINA requests to OCLC. The MILO department is in the process of implementing the direct connection.

Perhaps the greatest convenience of MARINA to ILL staff and to library users is that ILL staff are not required to place requests. Patrons can place unmediated requests via the Web. If a user's local system does not have a certain title, the user can access the MARINA system at http://www.sailor.lib.md.us/m/marina/. After the user logs on to the system using his/her library card number, the user can search all profiled catalogs at once, check the availability of the title, and place a request. Figure 1 shows the steady increase of interlibrary loan requests made via MARINA. The totals do not include OCLC transactions. Additionally, the MILO staff processed approximately 20,000 transactions through OCLC during the last fiscal year. Requests are sent to all libraries profiled to lend on the MARINA network. This has contributed to the decentralization of requests away from MILO and has spread the filling of requests around the state. The URSA product automatically places holds on requested items and generates a list of titles requested for each lending library on a daily basis. The decentralization of filling requests has led to a reduction in turnaround time. The wait time for a typical MARINA request is only two days.
The wait time for a typical MARINA request is only two days. [This space left blank intentionally] http://www.sailor.lib.md.us/m/marina/� Page 5 Fig. 1 Total MARINA requests for fiscal years 1999 through 2003 Future of resource sharing in Maryland Ultimately, the goal is for the MARINA network to grow to include a greater number and types of libraries. Academic, school, and special libraries currently borrow through MARINA, and in the future, hopefully these types of libraries will help expand the state's resource sharing mission by becoming lenders. Promotional efforts are under review with the Maryland Division of Library Development and Services to raise awareness of both the MARINA system and the cooperative borrowing agreement within the state. With heightened visibility, the significance of the state's efforts to share resources will be impressed upon library users and library institutions alike For Maryland ILL practitioners, assessing the impact of technology on request fill rates is an important next step in the development of statewide resource sharing. Many want to examine other aspects of collection sharing that still need improvement, for example, collection control and inventory practices. Like all ILL practitioners, Maryland ILL practitioners are interested in discovering and removing the barriers to a 99.9 percent fill rate. 0 20,000 40,000 60,000 80,000 100,000 120,000 FY 03 FY 02 FY 01 FY 00 FY 99 work_fvejaloxojaq3pj5jfolh5blay ---- jhh$$$p418 Journal of Human Hypertension (2000) 14, 5  2000 Macmillan Publishers Ltd All rights reserved 0950-9240/00 $15.00 www.nature.com/jhh Publisher’s announcement Macmillan Publishers Ltd is pleased to be able to announce the creation of a new company. Nature Publishing Group brings together Nature, the Nature monthly titles and the journals formerly published by Stockton Press. Stockton Press becomes the Specialist Journals division of Nature Publishing Group. Nature Publishing Group will use its unique strengths, skills and global perspective to meet the demand of a rapidly changing and challenging pub- lishing environment. The Group’s publications are known for delivering high-quality, high-impact content, fair pricing, rapid publication, global marketing and a substantial presence on the internet. These elements are the key to excellence in selecting, editing, enhancing and delivering scientific information in the future. As a company, we have three core values: quality, service and visibility. These values are set to benefit all our customers – authors, readers, librarians, societies, companies and others – thus building strong publishing relationships. Journal of Human Hypertension Journal of Human Hypertension is now part of the Specialist Journals division of Nature Publishing Group. It will be marketed and sold from our offices in New York, Tokyo, London and Basingstoke. Within the elec- tronic environment, Journal of Human Hypertension will benefit from a substantial investment in innovative online publishing systems, offering global access, intelligent searches and other essential functions. Librarians will be able to provide their readers with print and online versions of Jour- nal of Human Hypertension through a variety of services including OCLC, Ingenta (linking to the BIDS service), SwetsNet, Ebsco, Dawson’s Info- Quest and Adonis. 
At a time when the basis of traditional journal publishing is undergoing significant changes, Nature Publishing Group aims to support the scientific and medical community's needs for high quality publication services which provide rapid and easy access to the best of biomedical and clinical results.

Jayne Marks
Publishing Director
Specialist Journals, Nature Publishing Group

work_fx34tla4enhr3a3igkgbwuqajq ----

D-Lib Magazine
September 2005
Volume 11 Number 9
ISSN 1082-9873

Parallel Text Searching on a Beowulf Cluster using SRW

Ralph R. LeVan, OCLC Office of Research
Thomas B. Hickey, OCLC Office of Research
Jenny Toves, OCLC Office of Research

Introduction

While the news is full of reports of the success of the Internet search engines at searching billions of web pages at prices so low that they can afford to give the searching away for free, such affordable searching is not common in the rest of the world. What searching is available in the rest of the world is not scalable, not cheap or not fast. Often it suffers from a combination of those flaws. This article describes our experience building a scalable, relatively inexpensive, and fast searching framework that demonstrated 172 searches per second on a database of 50 million records. The article should be of interest to anyone seeking an inexpensive, open source, text-searching framework that scales to extremely large databases. The technology described uses the SRW (Search/Retrieve Web) service in a manner nearly identical to federated searching in the metasearch community and should be of interest to anyone doing federated searching.

What problem were we trying to solve?

OCLC [1] maintains a database (WorldCat [2]) of 50 million bibliographic records that grows at a rate of 3 million records per year. OCLC provides a search service (FirstSearch [3]) that makes available both browser and Z39.50 [4] access to the WorldCat database. Typical search loads average 5 searches per second with peak loads at 16 searches per second. This service currently runs on IBM WinterHawk computers using Oracle Text [5] for searching. We wanted to know if it was technically feasible to run such a service on cheaper hardware using open source software. The goal was to demonstrate 100 searches per second on this framework.

We chose to partition our large database into a number of smaller databases and use federated searching techniques to search across the databases, with a Beowulf cluster [6] as our hardware platform. A Beowulf cluster is a collection of commodity CPUs connected by a high-speed network switch, running an open source operating system. (We were interested in Beowulf cluster technology for a range of experiments within OCLC Research, not just for this one experiment.) Our cluster consists of 24 nodes with 2 Intel Xeon 2.8 GHz CPUs per node and 4GB of memory. The root node has 130GB of disk shared via NFS [7]. The 23 application nodes have 80GB of disk per node. The network switch was a 48-port Cisco Catalyst 4850 switch with all internal ports at gigabit speed. Each node has 2 gigabit Ethernet connections. The cluster runs the Linux [8] operating system and the Rocks [9] cluster management software. Clusters like this now cost less than $100,000.
The partitioning scheme
The cluster configuration (1 root node and 23 application nodes) led to our partitioning scheme. Since each application node had 2 processors and the processors were running hyperthreading (theoretically supporting 2 processes per CPU), we concluded that we could run 4 processes on each application node. We wanted to reserve one process for hypothetical redundancy planning, so that left 3 processes per node or 69 processes for the entire cluster. Since each partition was to get its own searching process, the database was divided into 69 partitions. When we return results from a search, we order the results by the popularity of the items described by the bibliographic records in the database. We are able to assign a popularity score to the items because we know which of our member libraries hold those items. (We call the score the "holdings count".) We did not want to have to sort the search results across the 69 partitions, so we chose to partition the database by the popularity of the items, putting the most popular items into the first partitions, and so on. With that scheme, we only need to sort the item records within the partition and can have the sorting done on the application nodes, instead of having to collect all the results on the root node and doing the sorting there. 50M records divided across 69 partitions leads to a partition size of approximately 725K records. The most popular 725K items are held by 264 or more libraries. (The most popular item, Time Magazine, is held by 6,349 libraries.) The next partition consists of records for items held by 140-263 libraries. The partitioning continues in this manner until we get to the 1.2M items held by seven libraries. This number is much larger than the 725K records we had planned for each partition. When we get to items held by a single library, we find nearly 21M records. The large partitions are further subpartitioned in pieces of equal size near the 725K record goal. The subpartitioning scheme is arbitrary, based on internal record number. The 69 partitions are sequentially assigned to the application nodes, three partitions per node.

Problems with disk architecture and partitioning
In the best of all worlds, we would have moved the complete 41GB of MARC-21 data and 4GB of associated holdings information onto the Beowulf cluster and would have done the partitioning there. Unfortunately, there are problems with trying to move that much data in one transaction. Expecting the applications responsible for moving the data to perform flawlessly over the necessary hours is unreasonable; something – either hardware or software – is likely to fail. Using zip technology to compress the data reduces the transmission time, but zip has problems with files that large. Since the files needed to be broken down in chunks smaller than 2GB for zip, we decided to do the partitioning on the data's host computer and then zip the partitions. Once the data was on our Beowulf cluster, we could begin experimenting with the multi-processing features of the system. Would it be better to have the root node unzip the 69 files, or have the application nodes each unzip their three files? It took the root node 29 minutes to unzip the 69 files. Next, we had the application nodes unzip the files where they resided on the root node. After 2 hours, we cancelled the jobs. Finally, we copied the zipped files to the disks attached to the application nodes and had the application nodes unzip them there.
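The winning arrangement (local copies, unzipped locally) is easy to reproduce. The sketch below is our own illustration rather than code from the project: it shows how each application node might unzip its three local partition archives in parallel, with hypothetical file paths and archive names.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Runs on one application node: unzips its three partition archives in parallel. */
public class NodeUnzip {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical local copies of this node's three partition archives.
        List<String> archives = List.of(
                "/state/partition1/part01.zip",
                "/state/partition1/part02.zip",
                "/state/partition1/part03.zip");
        ExecutorService pool = Executors.newFixedThreadPool(archives.size());
        for (String zip : archives) {
            pool.submit(() -> {
                try {
                    // Shell out to the system unzip: -o overwrites, -d sets the target directory.
                    Process p = new ProcessBuilder("unzip", "-o", zip, "-d", "/state/partition1/db")
                            .inheritIO().start();
                    System.out.println(zip + " exited with " + p.waitFor());
                } catch (IOException | InterruptedException e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(2, TimeUnit.HOURS);
    }
}
```

Run once per node against local disk, this is the arrangement whose timings are reported next.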
It took 14 minutes to copy the files sequentially and 9 minutes to unzip them in parallel. We tried having the application nodes copy the files in parallel; this took 55 minutes. Clearly, disk contention is a problem on the NFS-mounted disks on the root node for this type of operation.

The database technology
OCLC Research has developed open source database software that consists of two modules: the search engine (Gwen [10]) and the database building software (Pears [11]). Gwen and Pears have been used by OCLC Research for several years, and they are also used in some of OCLC's commercial products. Pears, running under a different search engine, is also used by the Open Source SiteSearch [12] community. This software has been used to support monolithic versions of the WorldCat database and should be more than adequate for searching a database one sixty-ninth that size. The question remained as to what federating technology to use. OCLC Research has considerable experience with Z39.50, which has been in use for nearly a decade to support federated searching in the Library community. But, a Web-Service-based alternative to Z39.50 has recently been developed: SRW [13] (Search and Retrieve on the Web). Our experience with both protocols leads us to believe that SRW is much easier to understand and implement than Z39.50, but retains all the functionality necessary to support federated searching. Our open source implementation of an SRW server [14] includes an abstract database interface with implementations for both Pears and Gwen, and for Jakarta Lucene [15] as used by the open source digital archive, DSpace [16]. Our SRW server is built using the Apache Axis [17] Web Service toolkit, and we run it under the Apache Tomcat [18] servlet engine. Pears, Gwen and the SRW server are all 100% Java.

The searching architecture
The partitions are made searchable via the SRW and SRU (Search and Retrieve URL) search protocols. SRW is a Web Service based on SOAP [19] with functionality closely based on that provided by the Search and Present services of Z39.50. SRU provides the same functionality as SRW, but with a REST [20] model instead of SOAP. With SRU, all the request parameters, including the query, are embedded in the URL. Our SRW/U service is implemented using the Apache Axis toolkit and runs under Apache Tomcat. To provide the service, there is a single copy of Tomcat running on each application node. The SRW/U service is configured to know about the three partitions on each node. Each partition is searchable at a different base URL (e.g., the first partition on application node 0-0 had a base URL of http://app-0-0:8080/SRW/Search/Partition1). We considered running a separate Tomcat server for each partition, but decided that Tomcat would run each search of each partition on a separate thread and that Linux would see to it that the threads got spread across the available CPU's. We used two sets of searches for our testing. The full set of searches was extracted from our logs from one day's searching of our WorldCat database in our FirstSearch service. The logs were filtered to remove date range and truncated searches. (The filtering was done to support other searching experiments, and we didn't think it would affect our results with the experiment described in this article.) The second set of searches consisted of 1,000 searches randomly selected from the 37,000 searches in the full set. We wrote a client to read the searches from a file and send the queries to the partitions.
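Before the client experiments, it may help to see what a single-partition probe looks like. The sketch below is our illustration, not OCLC's client: the parameters shown are the standard SRU searchRetrieve parameters, while the example query and the string-scanning shortcut (anticipating the SRU client described later) are assumptions.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sends one CQL query to one partition via SRU and extracts the postings count. */
public class SruSearch {
    public static int postings(String baseUrl, String cql) throws Exception {
        // SRU embeds all request parameters, including the query, in the URL.
        String url = baseUrl + "?operation=searchRetrieve&version=1.1&maximumRecords=0&query="
                + URLEncoder.encode(cql, StandardCharsets.UTF_8);
        StringBuilder xml = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(url).openStream(), StandardCharsets.UTF_8))) {
            for (String line; (line = in.readLine()) != null; ) xml.append(line);
        }
        // Scan for the count rather than parsing the XML; prefix-agnostic on purpose.
        int i = xml.indexOf("numberOfRecords>") + "numberOfRecords>".length();
        return Integer.parseInt(xml.substring(i, xml.indexOf("</", i)).trim());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical query against the first partition on node 0-0.
        System.out.println(postings("http://app-0-0:8080/SRW/Search/Partition1", "dc.title = dinosaur"));
    }
}
```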
The actual sending and receiving of messages to a particular partition happened on separate threads. We would extract the document counts from the responses from each partition. Those counts would be summed and reported in the client's log. A sampling of those counts was manually checked for accuracy, and the results of subsequent runs of the client were compared with the validated results.

An SRW client
The first client architecture was quite trivial. A search was read from an input file and passed to each of 69 threads that would process the search for one of the 69 partitions. Each thread would generate an SRW request using code generated by the Apache Axis SOAP toolkit from the SRW WSDL (Web Services Description Language). The threads then extracted the document count from the response. The counts from the threads were summed and the total count was reported. Average response time for the 1,000-search test was 437ms, or slightly more than 2 searches per second. This was far from our 100 searches per second goal. We used two tools during our testing. The first tool was the client software itself, which reported each query, search results, and the response time for the search. In addition, the client software reported the average response time for the entire run, and the fastest and slowest searches. The second tool was the Ganglia Cluster Toolkit [21]. This toolkit generates a dynamic web page that allows the user to monitor the activity of the cluster. While it can produce detailed information on the memory, disk, network and CPU activity of each node, and for the cluster as a whole, we primarily used their cluster summary page. On that page, a graph of overall activity over the last hour for each node was provided. The background color for the graph also indicated the immediate activity for the node, with colors ranging from blue (barely active) to red (very active). While running the SRW client test, we noticed that the root node for the cluster, where the client was running, was red on the Ganglia page, and all the search nodes were blue. This indicated to us that the client software might be the bottleneck preventing faster searches. We understood that the process of serializing Java objects into XML and back that the Apache Axis toolkit performed was not an inexpensive one. When we also considered that the client was doing this 69 times for each search, we weren't surprised that the SRW client had become a bottleneck.

An SRU client
We next modified the client to use SRU instead of SRW. This simply entailed appending a few SRU parameters to the same base URL that we had used for SRW and then appending the query. The response was still the same XML record that was returned by SRW, but instead of parsing it, we just scanned it for the string marking the postings count and extracted that count from the record. This work was still being done in 69 threads, one for each partition of the database. Response time dropped by a factor of 10, to 40ms per search, but the Ganglia page still indicated that the client was a bottleneck.

SRW revisited
We then made a final attempt to use SRW. Instead of using the Apache Axis toolkit to encode and decode the SOAP messages, the messages were constructed by hand and sent to the server. The responses were scanned for the postings count as in the SRU client. Response time had dropped to 46ms per search – much better, but still not nearly as good as when we used the simpler SRU code. We made no further attempts to improve the SRW code.
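A hand-built SRW request of the kind described can be as simple as the following sketch. Again this is our reconstruction under assumptions: the element names follow the SRW 1.1 conventions, and a production version would XML-escape the query before embedding it.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hand-built SRW searchRetrieve: POST a literal SOAP envelope and scan the reply. */
public class HandBuiltSrw {
    public static int postings(String baseUrl, String cql) throws Exception {
        String envelope =
              "<SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'>"
            + "<SOAP-ENV:Body>"
            + "<srw:searchRetrieveRequest xmlns:srw='http://www.loc.gov/zing/srw/'>"
            + "<srw:version>1.1</srw:version>"
            + "<srw:query>" + cql + "</srw:query>"   // real code must XML-escape the query
            + "<srw:maximumRecords>0</srw:maximumRecords>"
            + "</srw:searchRetrieveRequest></SOAP-ENV:Body></SOAP-ENV:Envelope>";
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(envelope.getBytes(StandardCharsets.UTF_8));
        }
        String reply = new String(conn.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        // Same postings-count scan as the SRU client.
        int i = reply.indexOf("numberOfRecords>") + "numberOfRecords>".length();
        return Integer.parseInt(reply.substring(i, reply.indexOf("</", i)).trim());
    }
}
```

Skipping Axis serialization on the client is what buys the drop from 437ms to 46ms; the remaining gap to SRU is the cost of building and shipping the envelope itself.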
23 databases are better than 69
The Ganglia page seemed to indicate that the client itself was the reason faster searching wasn't possible. We decided to reduce the number of threads that the client needed from 69 to 23, which reduced the work of the client by two-thirds. To compensate for this change in the client, small database federators were created for the application nodes. A single SRW database was created that acted as a gateway to the three local databases and aggregated the results of the searches on those partitions. Using the new database architecture, SRU client searches dropped from 40ms to 14ms per search. The hand-built SRW client went from 46ms to 30ms per search. Finally, the original SRW client went from 445ms to 164ms per search. 14ms per search results in an overall throughput of 71 searches per second, which was not yet at our goal of 100 searches per second. In addition, while some of the search nodes were showing as yellow on the Ganglia page, the root node was still red, and most of the application nodes were only green. Clearly, the client continued to do too much of the work, and not enough was being done by the application nodes.

Moving the aggregation to the application nodes
To address the problem of the client doing too much work, we created a new kind of SRW database. This database aggregated the results of searches sent to remote databases, unlike the previous aggregating database that searched local databases. This new database solved two problems: when run on the root node, it provided a single database for web browsers to use to search the entire WorldCat database with 15ms response time; when run on the application nodes, it provided 23 possible servers to which a client might send WorldCat searches. We then created a new client to take advantage of these new databases. The client still read searches from a file, but it gave those searches to an SRU client running on a thread that sent the search to a remote WorldCat database. The main client did not wait for the response from the SRU client. Instead, it read another search and waited for an available SRU client to give the search to. Overall searching throughput was measured by the main client by recording the start and end times for the run and dividing that into the number of searches performed, resulting in an overall number of searches per second. Experiments were performed with differing numbers of SRU clients. The complete results are displayed in Figure 1 below. The significant result was the achievement of an overall search throughput of 172 searches per second. During the tests that produced this final result, the Ganglia page showed all the nodes red.

[Figure 1: Results of experiment searching 23 databases.]

Status of development
As of this writing, there are no plans to use this code in production at OCLC. As a result, there are a number of features left undeveloped in this framework. No provision was made for Fault Tolerance. Fault Tolerance can be achieved by duplicating the partitions on multiple nodes and configuring 23-way aggregators that use the different combinations of partitions. While record retrieval across the federated database is complete, sorting and ranking remains to be done. This work can be primarily done at the partition level and the aggregators can simply merge the results. While the system is incomplete, all the open source code described in this article, including the fully functional SRW server and databases, and the Pears/Gwen database, are available from OCLC Research [22].
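The final client design described above (a dispatcher feeding a fixed pool of SRU workers, one per aggregator) can be sketched as follows. This is our reconstruction, not the project's code: the node names, input file, and reuse of the SruSearch helper from the earlier sketch are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Dispatcher: hands each query to whichever of the 23 SRU workers is free. */
public class ThroughputClient {
    private static final String STOP = "\u0000stop"; // poison pill to end the workers

    public static void main(String[] args) throws Exception {
        List<String> searches = Files.readAllLines(Path.of("searches.txt")); // hypothetical input file
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);
        int workers = 23; // one per application node running a remote-aggregating WorldCat database
        Thread[] pool = new Thread[workers];
        long start = System.currentTimeMillis();
        for (int i = 0; i < workers; i++) {
            String base = "http://app-0-" + i + ":8080/SRW/Search/WorldCat"; // hypothetical base URL
            pool[i] = new Thread(() -> {
                try {
                    for (String q; !(q = queue.take()).equals(STOP); )
                        SruSearch.postings(base, q); // single-database client from the earlier sketch
                } catch (Exception e) { e.printStackTrace(); }
            });
            pool[i].start();
        }
        for (String q : searches) queue.put(q);        // the main thread only dispatches
        for (int i = 0; i < workers; i++) queue.put(STOP);
        for (Thread t : pool) t.join();
        long ms = System.currentTimeMillis() - start;
        System.out.printf("%d searches in %d ms = %.1f searches/second%n",
                searches.size(), ms, 1000.0 * searches.size() / ms);
    }
}
```

Measuring wall-clock time over the whole run, as here, is how an overall searches-per-second figure of the kind reported (172) is derived.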
Conclusions
There is reason to be concerned about the efficiency of SRW and SOAP-based Web Services as opposed to SRU and REST-style services, at least in high-throughput multi-threaded clients. The goal of this project was to demonstrate 100 searches per second on a large database using relatively inexpensive hardware. In the end, we demonstrated 172 searches per second. The framework scales easily by adding more nodes to the system.

References
1. OCLC Online Computer Library Center, Inc.
2. WorldCat.
3. FirstSearch.
4. Z39.50.
5. Oracle Text.
6. Beowulf cluster.
7. NFS.
8. Linux.
9. Rocks.
10. Gwen.
11. Pears.
12. Open Source SiteSearch.
13. SRW.
14. SRW server.
15. Jakarta Lucene.
16. DSpace.
17. Apache Axis.
18. Apache Tomcat.
19. SOAP.
20. REST.
21. Ganglia Cluster Toolkit.
22. Open Source from OCLC Research.

Copyright © 2005 OCLC Online Computer Library Center, Inc. doi:10.1045/september2005-levan

work_fxbrbfsczfetvh2d6xc2mxc5wq ---- The American Archivist Vol. 79, No. 2 (Fall/Winter 2016)

interesting and useful to read about research from the perspective of a "digital native"—someone who had no recall of doing research without first checking online. The authors also little acknowledge that many people—especially those with a passion project rather than a professional mandate—rely on the digital world because they cannot afford the time or money to take research trips such as the ones described in Curiosity's Cats. As an archivist who does reference, I appreciate that this book is not meant to be instructional—it is not about how to provide good assistance—but rather shows a user-side view of the joys and frustrations of trying to find information, of suspecting that a fact is floating out there that you have not quite managed to pin down yet. Archivists can learn lessons from it, particularly about how researchers look to us for guidance but can easily lose faith in our abilities. It is also instructive to remember that displaying enthusiasm for a researcher's work can help him or her along. Perhaps most important, Curiosity's Cats underscores the relevance of archives. Caryn Radick, Special Collections and University Archives, Rutgers University

The Evolving Scholarly Record. By Brian Lavoie et al. Dublin, Ohio: OCLC Research, 2014. 25 pp. Freely available at http://www.oclc.org/content/dam/research/publications/library/2014/oclcresearch-evolving-scholarly-record-2014-5-a4.pdf. ISBN 1-55653-476-0 (978-1-55653-476-8).
Stewardship of the Evolving Scholarly Record: From the Invisible Hand to Conscious Coordination. By Brian Lavoie and Constance Malpas. Dublin, Ohio: OCLC Research, 2015. 33 pp. Freely available at http://www.oclc.org/content/dam/research/publications/2015/oclcresearch-esr-stewardship-2015.pdf. ISBN 1-55653-498-1 (978-1-55653-498-0).

In the predigital era, archivists may have felt alone on an island advocating for the preservation of the cultural record. Archival programs were often marginal to the mission of the institutions they were charged with documenting. In recent years, however, the increase in born-digital content has come together conveniently with users' growing demands for ways to leverage data sets and other products that aid in scholarship and research.
Librarians who are liaisons to faculty, metadata specialists, computer programmers, and others are now as anxious about the sustainability of unpublished or semipublished records as archivists have long been. It's become a crowded island. In reaction to, and to help shape discussions about, this recent focus on a scholarly record that is virtually all born digital, OCLC Research has published two reports that archivists would do well to note. The Evolving Scholarly Record (2014) and the subsequently disseminated Stewardship of the Evolving Scholarly Record: From the Invisible Hand to Conscious Coordination (2015) provide a starting point for all kinds of information professionals to conceptualize what comprises the scholarly record and what the challenges are in capturing, managing, and providing access to it. Unfortunately, while archivists will be very familiar with the concepts these two reports introduce, they do not acknowledge the archival tradition that could inform discussions about stewardship of this content. The Evolving Scholarly Record opens by neatly describing the sea change we're experiencing. While in the past, "the scholarly record was largely defined by the formally published monographic and journal literatures," this definition has grown to include the "raw materials" of the scholarly process, including data sets, blog posts, and other unpublished or semipublished materials (2014, pp. 6, 11). Those tasked with capturing the scholarly record confront not just a proliferation of formats, but also unclear ownership of these materials; custody spread across multiple, often proprietary, platforms; and the sheer volume of materials being produced all frustrate easy solutions to effective stewardship. If the content itself does not pose enough of a challenge, demand from scholars to steward it has intensified, as end users sense the value of these materials despite their "weightlessness" (an adjective that OCLC cleverly borrows from economic theory) (2015, p. 5). To help its readers better understand all the forces at play when stewarding the scholarly record, OCLC has created a very useful graphical model (2014, p. 10). Words will not do it justice, but in short, "process," or the inputs that create "outcomes," cause the "aftermath" of discussion and further research. Credit goes to OCLC for attempting to make concrete its concept by listing examples of formats for each component of its framework; examples of "process" materials include data sets and methods that inform a project; "outcomes" can include print or e-journals; "aftermath" may reflect blog posts discussing the scholarly outcome and even more research. The Evolving Scholarly Record even goes so far as to provide some examples of instantiations of this content: ArXiv.org, Dryad, a particular researcher's blog, and so forth. Perhaps unintentionally, the examples reveal the report's limited understanding of what the scholarly record might be, as all of these materials are at least semipublished and, crucially, reflect intent from the creator to disseminate.
What might the scholarly record include, then, if not just the content types that OCLC lists? Stewards of the scholarly record would do well to look at past literature on documenting science, such as Woolgar and Latour's 1979 book, Laboratory Life: The Construction of Scientific Facts.1 The research team's landmark study was to create an anthropology of the scientific method, shadowing scientists at the Salk Institute to observe closely their daily activities. Woolgar and Latour found that science does not unfold in a purely rational, logical manner, but instead is heavily informed by unintentional compromises that are often unconscious to the researcher. Laboratory Life is essential reading for all information professionals serious about documenting the scholarly record comprehensively and inclusively.2 Woolgar and Latour would likely disagree with the implication in the OCLC reports that the creators of science can define, and therefore limit, what is the scholarly record. (The authors might also find helpful their colleague Jackie Dooley's 2015 report for OCLC Research, The Archival Advantage: Integrating Archival Expertise into Management of Born-digital Library Materials, which explores the myriad skills archivists bring to stewardship of born-digital content.3) Another very useful diagram that The Evolving Scholarly Record introduces is the "Stakeholder Ecosystem," which outlines the four main activities surrounding production of this newly inclusive scholarly record: create, use, collect, and fix (2014, p. 16). When understood as sequential activities, this model will remind archivists of the records life-cycle concept, in which a record is created ("create"), used by someone ("use"), acquired by an archives once it is no longer actively used ("collect"), and preserved by the archives in perpetuity ("fix"). "Collect" and "fix" are clearly activities that stewards of the scholarly record would prioritize; for a scholarly record to be used, it must first be captured and preserved in a way that will maximize its use. "Create" and "use" are activities prioritized by end users. The authors note that evolutions in scholarly production can lead to one or more of these activities being bypassed; for example, a scholar may create and use materials without making arrangements to have them collected or fixed; a third-party system may collect without fixing (i.e., taking appropriate preservation actions on) the record; and other scenarios. The challenge for stewards of the scholarly record is to determine how they can regularly integrate themselves into these activities so that the record can be preserved for future use. As discussed earlier, both The Evolving Scholarly Record and Stewardship of the Evolving Scholarly Record consider a variety of challenges facing stewards of the scholarly record, most of which will be very familiar to the archives community. In addition to the proliferation of formats and the abundance of volumes mentioned before, the authors reference "channels that bypass formal publication venues" and the need to capture this content, the acknowledgment that the
"aftermath" phase often includes people who are not the creator(s) of the initial outcome and the accidental or intentional circumvention of the "fix" and "collect" stages so essential for sustaining the scholarly record (2015, p. 10). Perhaps the most curious assertion in both reports is the claim that a distinction must be made between the "scholarly" record and the "cultural" record. The Evolving Scholarly Record notes this without explanation, so one is left to conjecture what the authors mean by this point. Perhaps they are arguing that some products of the scholarly process are important but do not merit inclusion in what suddenly is presented as a narrow definition of the scholarly record. More justification would need to be provided for followers of Woolgar and Latour, who would likely say that all products of the scholarly process are important to analyzing and understanding the conditions under which scholarly outcomes come to be. The Evolving Scholarly Record and Stewardship of the Evolving Scholarly Record are likely to elicit ambivalent responses from the archives community. On the one hand, the valuable service OCLC has provided in summarizing and deepening understanding about a major issue of interest to the field should not be ignored. The authors' diagrams are clear, inclusive as intended to be applied "across domains," and instructive (2014, p. 6). The enumeration of types of scholarly records in The Evolving Scholarly Record further helps to solidify understanding of the conceptual model to the point where a reader might be substantively prepared to take action on stewardship in a meaningful way. Stewardship of the Evolving Scholarly Record expands on the concepts the previous publication only touched, detailing real-world challenges such as the creation of service-level agreements, division of responsibilities in cooperative arrangements, and the development of "robust trust networks" to give the public confidence in the institutions taking on stewardship duties (2015, p. 26). On the other hand, in the end, it is troubling to encounter a failure to acknowledge the centuries-old body of knowledge that archivists bring to this issue. Anxious statements that "choices will have to be made" would be readily informed by our tradition of appraisal (2014, p. 22). "New" models seem not-so-new when compared to the records life cycle. And statements like "the "paper trail" of science will be captured in ways it never has been before" will undoubtedly raise eyebrows from archivists who have been preserving the records of scientists for at least fifty years (one can start with books on documenting science by Maynard Brichford and Joan Haas et al. to get a sense of the discussions that came long before conversations about "big data") (2014, p. 11).4 Archivists are no longer alone on an island, but they risk being cast out to sea if they do not find ways to play a central role in discussions about the stewardship of the scholarly record. Jordon Steele, Johns Hopkins University

Notes
1 Bruno Latour and Steve Woolgar, Laboratory Life: The Construction of Scientific Facts (Princeton, N.J.: Princeton University Press, 1986).
2 Christopher J.
Prom gave a succinct overview of the volume in a 2013 "Digital Dialogues" presentation at the Maryland Institute for Technology in the Humanities (MITH). See "Documenting Science in the Digital Age: What's the Same and What's Different," Maryland Institute for Technology in the Humanities, http://mith.umd.edu/podcasts/chris-prom-documenting-science-digital-age-whats-whats-different.
3 Jackie Dooley, The Archival Advantage: Integrating Archival Expertise into Management of Born-digital Library Materials (Dublin, Ohio: OCLC Research, 2015), http://www.oclc.org/content/dam/research/publications/2015/oclcresearch-archival-advantage-2015.pdf.
4 Maynard Brichford, Scientific and Technological Documentation: Archival Evaluation and Processing of University Records Relating to Science and Technology (Urbana: University of Illinois at Urbana-Champaign, 1969); Joan K. Haas, Helen Willa Samuels, and Barbara Trippel Simmons, Appraising the Records of Modern Science and Technology (Cambridge, Mass.: MIT, 1985).

Dissonant Archives: Contemporary Visual Culture and Contested Narratives in the Middle East. Edited by Anthony Downey. London: I.B. Tauris and Co. Ltd, 2015. 469 pp. Softcover. $28.00. Illustrations (some color). ISBN 978-1-78453-411-0.

The late Egyptian writer Naguib Mahfouz (1911–2006) asserted that you can tell whether a man is clever by his answers and wise by his questions. In Dissonant Archives: Contemporary Visual Culture and Contested Narratives in the Middle East, academic, writer, and editor Anthony Downey presents the writings, interviews, and original artwork of acclaimed academics, curators, activists, filmmakers, and artists. By turns clever and at all points wise, these practitioners have produced work that not only creatively engages the heterogeneity of archived cultural production across the Arab world, but also astutely posits important questions for archival science. These sage queries oblige archivists to reconsider their professional practices (p. 14). To illustrate, are archivists open to the dissonant revelations about their profession created by artists whose artistic practice produces work imbued with suppositional visions of the future and explores alternative, interrogative, or even fictional forms of the athenaeum? Alternatively, why have contemporary artists developed a dominant aesthetic strategy committed to working with archives? In seventeen thought-provoking essays and two large inserts featuring artwork created by artists who utilized archival materials from Iraq, Israel, Lebanon, Afghanistan, Tunisia, Algeria, Morocco, Syria, Jordan, Turkey, Egypt, Pakistan, and Palestine, Downey endeavors to show how contemporary artists attempt to provide astute answers to the

work_fxppfrjivbakdl3qxkra5cyeui ---- EPrints makes its mark
Nigel Stanger (nstanger@infoscience.otago.ac.nz) and Graham McGregor (gmcgregor@business.otago.ac.nz), University of Otago, PO Box 56, Dunedin 9054, New Zealand

Abstract
Purpose — To report on the impact and cost/benefit of implementing three EPrints digital repositories at the University of Otago, and to encourage others to follow suit.
Design/methodology/approach — Three repositories were successfully implemented at the University of Otago using existing commodity hardware and free open source software. The first pilot repository was implemented within ten days, and is now a fully-functional system that is being championed for institutional-wide use by the University Library. The other two repositories emerged from different community needs. One is academic, concerned with collecting and researching indigenous content; the other is designed to preserve and manage collective memory and heritage content for a small rural community.
Findings — Digital repositories can:
• be established quickly and effectively with surprisingly few resources;
• readily incorporate any kind of extant digital content, or non-digital material that is converted to electronic form;
• meet multifarious needs, from academic institutions seeking to enhance research visibility and impact, to individuals and small communities collecting and preserving their unique memory and heritage records; and
• establish connectivity with the global community from the moment they go live.
Practical implications — The technology and global support community have matured to a state where a fully-featured repository can be quickly and easily implemented.
Originality/value — This article describes the short history, development and impact of the first live repositories of their kind in New Zealand. Their utility and implications for the unique communities that have given rise to them are also explored, by way of encouraging others to take up the digital challenge.
Article Type: Case study
Keyword(s): Digital institutional repositories; Repository implementation; Community repositories; GNU EPrints.

1 Introduction
Digital institutional repositories have become a hot topic in recent years, and many institutions worldwide are now actively implementing them. This article discusses how low-cost yet fully functional digital institutional repositories (IRs) can be set up in a very short time frame. The authors reflect on the lessons learned while implementing three different repositories at the University of Otago, and discuss some new and exciting applications of digital repositories arising from these. The authors also suggest some best practices for implementing an IR and discuss issues that must be considered when moving from a small-scale pilot implementation to a full roll-out. Interest in institutional repositories at the University of Otago was sparked by the release of the New Zealand Digital Strategy by the New Zealand government in May 2005. The strategy aims to ensure that "New Zealand is a world leader in using information and technology to realize our economic, environmental, social and cultural goals" (New Zealand Government, 2005). In parallel with this, the National Library of New Zealand set up an expert working party with representatives from across the research sector to investigate the feasibility of establishing a national institutional repository for New Zealand's research outputs (Rankin, 2005). The National Library is fostering a work program to improve access to New Zealand's research outputs, by collaborating with institutions to stimulate the set-up of research repositories. In May 2005, two senior University of Otago staff undertook a study tour of digital challenges facing universities in the United States. Their report provided the impetus for the first IR pilot in Otago's School of Business.
Project work began on November 7 2005, with the following goals (Stanger and McGregor, 2006):
• To establish a proof of concept demonstrator for storing and providing open access to digital research publications in the School of Business.
• To evaluate the potential of the demonstrator for adoption by the wider University of Otago research community.
• To connect the School of Business with the global research community, in line with the feasibility study and recommended actions for a national repositories framework (Rankin, 2005).
This article discusses how three different repositories were implemented from scratch, the issues that arose during implementation and the process that has led to their subsequent development and use.

2 EPrints Otago
The GNU EPrints repository management software was chosen for the pilot repository because it was widely used, well-supported, inexpensive and would not lock the School of Business into specific technologies or vendors (Sale, 2005). The development team also had prior experience with the software. A rapid prototyping methodology was adopted, emphasizing quick releases of visible results with multiple iterations, in order to create interest in the project at an early stage, and enable a positive feedback cycle. A sandbox was used to test entries and entry formats before the material went live. Tools, techniques, development tasks and other relevant issues were documented on an ongoing basis using a private wiki. The pilot implementation was completed within ten days of assembling the project team, with most of this time spent tweaking the look and feel of the web site and collecting content (Stanger and McGregor, 2006). This outcome was made possible by establishing a very clear brief to "prove the concept", rather than taking on a large scale project involving many different disciplines, researchers and research outputs from the outset. Early decisions were made to restrict the content and content domain for the pilot, in order to speed the collection process and minimize requirements "creep". Meetings were kept to a minimum and policy and procedural issues that required institutional decisions were noted as work progressed, rather than tackled head on. The project was widely publicized within the School and Heads of Departments were consulted to ensure top-level buy-in. This approach produced immediate results and the repository was quickly populated with a range of working/discussion papers, conference items, journal articles and theses. There was no cost associated with the GNU EPrints software or its associated online community, and from a technical point of view the project was wonderfully straightforward. The School of Business repository (http://eprints.otago.ac.nz/) was deployed on a spare mid-range server running FreeBSD, which meant that hardware and software costs were essentially nil. In other words, if there happens to be some spare hardware lying around, an initial repository can be set up very cheaply, and expanded later. A minimalist approach was taken with regard to gathering content; partly because of the prototypical nature of the project, and partly because material in the hand is worth more than promises by authors to supply content at some indeterminate future date. New publications are always being created, and content acquisition is a moving target that has to be effectively managed. Once basic content acquisition and data entry protocols were put in place, an incremental methodology was adopted.
Content was limited to voluntary contributions in PDF format from colleagues in the School of Business, but with no constraint on the type of output. As of November 30 2006, the repository contains 409 documents covering a wide range of topics and document types, with new content being continually acquired. It is remarkable what can be achieved by a small, dedicated, knowledgeable and enthusiastic implementation team. As with any project, the right mix of technical and project management skills is crucial in making things happen. The project team comprised the School's Research Development Coordinator (project management and evangelism), an Information Science lecturer (software implementation), the School's IT manager (hardware and deployment) and two senior students (research, content acquisition and data entry). Oversight was provided by a standing committee comprising representatives from Information Technology Services, the University Library and the School of Business.

3 Impact of the pilot
Traffic and downloads were generated from the moment the system went live, and the Tasmania statistics package (Sale and McGee, 2006) that sits alongside the repository became an object of fascination in its own right. The initial response to the pilot repository seemed spectacular, with nearly 19,000 downloads recorded within the first three months from eighty different countries. This level of traffic excited considerable interest from both inside and outside the University. However, while the repository had indeed been accessed from eighty countries, it was salutary to discover that the download rates were in fact over-inflated by a factor of about five. This was due to an undocumented assumption in the Tasmania statistics software (Sale and McGee, 2006) that resulted in hits being counted multiple times if statistics were gathered more often than once per day. The lesson here is to always be wary of computers bearing wonderful news! Despite the downward adjustment to overall download rates, there is still ongoing healthy interest in the repository, as shown in Figure 1.

[Figure 1: Total monthly hit rates (bar chart, left axis) and number of items (line chart, right axis) for the Otago School of Business repository, up to November 30 2006.]

Interestingly, the repository experiences many more abstract views than full text downloads. An informal analysis of hit rates across eight other repositories that generate similar statistics, shows that some experience the same pattern as Otago, while others experience more downloads than abstract views. Further investigation is needed to determine why this variation occurs. Otago's rate of traffic growth has also been compared with the repositories mentioned above. Figure 2 indicates that traffic to the Otago repository grew much more rapidly during its early months than for any of the other eight repositories investigated, including some that are much older and larger (see Table I). This may be a consequence of growing public awareness of digital repositories, or there may be other factors involved. A research project is currently under way to investigate possible reasons for this finding.
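The over-counting pitfall noted above generalizes: if a statistics job adds a day's running total to its tally every time it runs, running it more than once per day multiplies that day's hits. A minimal illustration follows, with a hypothetical log format; this is our sketch, not the Tasmania package's actual code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Idempotent daily hit counting: rebuild totals from the raw log on every run. */
public class HitCounter {
    public static Map<String, Integer> totalsByDay(List<String> logLines) {
        Map<String, Integer> totals = new HashMap<>();
        for (String line : logLines) {
            String day = line.substring(0, 10); // assumes lines start "YYYY-MM-DD ..."
            totals.merge(day, 1, Integer::sum);
        }
        return totals; // same log in, same totals out, however often the job runs
    }

    public static void main(String[] args) {
        List<String> log = List.of("2006-03-01 GET /274/", "2006-03-01 GET /274/", "2006-03-02 GET /15/");
        System.out.println(totalsByDay(log)); // {2006-03-01=2, 2006-03-02=1}
        // Accumulating these results across repeated runs within a day would
        // multiply the counts: exactly the inflation observed above.
    }
}
```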
An exciting outcome of the pilot has been the ability to make available material that might otherwise be difficult or impossible to access, and thus increase the likelihood of it being cited (Harnad, 2005; Hajjem, Harnad and Gingras, 2005). For example, Figure 3 shows that nearly three-quarters of the items in the Otago repository are items that might not otherwise be readily accessible, such as theses, dissertations, and departmental working/discussion papers.

[Figure 2: Comparison of traffic growth across nine EPrints repositories, as of November 30 2006. (The different line styles are used only to distinguish the lines; they have no other significance.)]

Table I: Details of repositories compared in Figure 2, as of November 30 2006.
Repository | Age in months | Num. items
dLIST (University of Arizona, U.S.A.) | ≈ 54 | 843
E-LIS (CILEA, Italy) | ≈ 48 | 4638
University of Melbourne (Australia) | ≈ 53 | 1479
University of Nottingham (U.K.) | ≈ 65 | 242
University of Otago/Cardrona | 6.5 | 17
University of Otago/School of Business | 12.5 | 409
University of Otago/Te Tumu | 7 | 31
Rhodes University (South Africa) | ≈ 21 | 385
University of Tasmania (Australia) | ≈ 26 | 355

[Figure 3: Types of item in the Otago School of Business repository, November 30 2006: Working/discussion (40.8%), Thesis/dissertation (33.0%), Conference (19.6%), Journal (5.6%), Other (1.0%).]

Indeed, the top ten downloaded items as of November 30 2006 comprise four departmental working papers, two conference papers, two research reports, one journal paper and one PhD thesis. The full text of these items is also readily searchable by major Internet search engines such as Google (Sale, 2006), often within only a few days of being deposited. The pilot was not only technologically successful, but also generated much local and national interest. Consequently, after a mere six months, the pilot became the official repository for Otago's School of Business. It has also been adopted as a model with potential for roll-out across the entire University. As there are four academic Divisions at Otago (of which the School of Business is one), a federated model of repositories is envisaged that would be centrally linked and managed by the University Library. Having proved the concept, it has been (and is) relatively simple to develop other repositories with similar speed. The key is having an experienced team and a highly focused project management plan.

4 EPrints Te Tumu
The success of the pilot excited considerable interest throughout the University community. In early 2006, Te Tumu, Otago's School of Māori, Pacific and Indigenous Studies, expressed an interest in implementing a repository for their specific needs. They were particularly interested in the use of a digital repository as a means of disseminating their research and other work, as there are relatively few "official" outlets for their discipline. In addition to the usual items found in most typical IRs, Te Tumu wished to store multimedia items such as images of traditional crafts and artwork, and video clips of performances. This was simply a matter of adding appropriate item types to the EPrints metadata configuration and creating corresponding templates. Drawing on experience from the pilot, the Te Tumu repository (http://eprintstetumu.otago.ac.nz/) was implemented in less than a month, and was officially launched on May 3 2006, making it the first repository for indigenous studies in New Zealand. Interest in the repository is evident with almost 4,500 downloads from 70 different countries during its first seven months. The repository currently contains 31 items, including articles, theses, images and video clips.

5 Issues to consider
5.1 Copyright
Copyright is an issue that needs to be faced, although concerns that are voiced tend to be perceived rather than actual problems (eprints.org, 2005; Sale, 2006). A substantial fraction of the material loaded into the Otago repositories comprised departmental working or discussion papers, for which permission to publish online had already been granted. Items with uncertain copyright status had full text access restricted until their status was confirmed. The SHERPA web site (http://www.sherpa.ac.uk/) was a valuable resource for ascertaining journal copyright agreements.

5.2 Data standards
The New Zealand Digital Strategy proposes the long term goal of linking all New Zealand repositories to share information and avoid isolated "silos of knowledge", where each institution has little idea of what is happening elsewhere (New Zealand Government, 2005). It is therefore imperative that open standards such as the Dublin Core Metadata Initiative (http://www.dublincore.org/) be applied for both data and metadata. Dublin Core is natively supported by EPrints, and also by many library cataloging systems.

5.3 Data entry
Data entry may often be carried out by people who are not specifically trained for the task (such as document authors), so it is essential to have well-defined and widely publicized processes and standards for data entry. EPrints allows the data entry process to be heavily customized to the needs of an individual repository. A final editorial verification is also essential to check the quality of the data entered and to ensure that the item is suitable for inclusion in the repository.

5.4 Content acquisition
The key issue regarding acquisition of material is whether self-archiving should be compulsory (top-down) or voluntary (bottom-up). Sale (2005; 2006) argues that a compulsory policy is much more effective for growing a repository, as illustrated by the growth rates of repositories at the Queensland University of Technology (compulsory, high growth) and the University of Queensland (voluntary, low growth). Compulsory archiving policies are often driven by the need to capture information for research evaluation and funding purposes, but run the risk that authors may react negatively to such a requirement. Swan and Brown (2004) surveyed 157 authors who did not self-archive and found that 69% of them would willingly deposit their articles in an open access repository if required to do so. A more recent study increased this figure to 81% (Swan, 2006). Another issue is when authors should deposit new content into a repository. In particular, should pre-prints of submitted papers be immediately deposited, or should authors wait until the paper has been accepted for publication? There are valid arguments for both positions, but in the case of highly popular repositories, waiting for acceptance may prove to be a "safer" option. In March 2006, the authors submitted an article to a journal, and concurrently deposited a pre-print (Stanger and McGregor, 2006) into the pilot repository. The pre-print quickly became the most popular download from the repository, with 625 downloads in only three weeks. The journal subsequently rejected the article on the basis that the material had already been widely disseminated and was therefore no longer topical.

5.5 Types of content
Decisions about the types of material that should be archived (e.g., working papers, theses, lecture material, multimedia files) are also key, as is the question of what historical material to include. Indeed, this has proved to be one of the most challenging issues faced at Otago, since there can be considerable cost associated with scanning to convert non-digitized work into digital format. There are also associated practical and logistical issues. The value of a repository depends on the number of authors contributing (Rankin, 2005). Ready targets for inclusion are outputs that would otherwise have only limited availability, such as departmental working and discussion papers, and theses and dissertations. The latter in particular are often very difficult to obtain from outside the institution that published them, yet paradoxically, they are often the easiest to obtain for the purposes of populating an IR, because there is a lower likelihood of copyright issues, and departments often have copies of the documents to hand. Extant and already bound material requires page-by-page scanning, which can be a long and arduous process. While a number of robotic scanners are available, these are likely to be out of the financial reach of most institutions. The content focus at Otago has thus moved towards the development of a mandatory policy that requires all student theses and dissertations to be submitted in both hard and electronic copy.

6 Looking ahead
An exciting consequence of the School of Business repository has been an approach from various communities throughout New Zealand to help set up repositories of heritage material relating to their community. The first of these was Cardrona, a small rural Central Otago community with a long and varied history. The Cardrona Community Repository (http://cardrona.eprints.otago.ac.nz/) was launched on May 17 2006, and is the first community repository in New Zealand. Digital repositories offer communities a wonderful opportunity to preserve their historical and cultural heritage, and to disseminate it to a much wider audience than normally possible. It can also provide a sense of focus for the community, especially in cases like Cardrona, where the population is quite small and somewhat geographically dispersed. This information can be of academic use too, such as in a recent study that used community historical information to document the long-term effects of climate change (Hopkin, 2006; Miller-Rushing, Primack, Primack and Mukunda, 2006). The Otago team is also playing a significant role in the Open Access Repositories in New Zealand (OARiNZ) project (http://www.oarinz.ac.nz/). This is a government-funded project to develop a national infrastructure connecting all of New Zealand's digital research repositories. Work is currently under way at Otago on an easy-to-use installer and configurator for EPrints repositories, in order to encourage wider adoption of these technologies.

7 Conclusion
Experience at Otago demonstrates that in an increasingly digital world, digital repositories are a necessary and welcome means of archiving and making accessible electronic content of all kinds. Global connectedness between scholars and communities at the touch of a keyboard is not a clichéd dream, but a reality. The technology has matured to the point where a basic repository can be set up with a very moderate level of technical expertise. Even setting up a heavily customized repository can be achieved in a matter of days rather than weeks, if a dedicated and knowledgeable team is created and given focused, achievable and bounded goals. Software costs are essentially nil, hardware costs are minimal, and there is a hugely supportive and generous worldwide community of scholars who are willing to share their technical knowledge and expertise at no cost. On the non-technical side, there are now sufficient repository implementations around the world that IRs are becoming less of a novelty and more an integral tool for researchers, librarians and archivists alike. While Otago is yet to adopt an institution-wide repository, there is little doubt that the progress made to date with its three different thrusts has generated widespread interest locally, nationally and globally. In a purely academic context, the tension between traditional (journal based) scholarship and publishing, and digital (repository based) scholarship and publishing has yet to play itself out. The authors' experience with community preservation and heritage groups, on the other hand, suggests that given appropriate access to the technology, the content flood gates will truly open. The imprint of EPrints at Otago has not only made its mark, it has stimulated a renaissance-like enthusiasm for making available knowledge and ideas and history and scholarship that might otherwise remain hidden or inaccessible. The added value is that the required institutional or community investment, both time and money, in developing a digital repository seems rather trivial. The authors suggest that prospective repository developers "hit the ground running" and welcome contact from anyone who needs help to do so!

Acknowledgments
The authors would like to thank Professor Arthur Sale of the University of Tasmania, Eve Young of the University of Melbourne and Stevan Harnad of the University of Southampton for their enthusiastic assistance and support. The authors are also indebted to project Research Assistants Monica Ballantine and Jeremy Johnston for their considerable expertise and enthusiasm, and to School IT Manager Brent Jones for deploying and maintaining the repository server. A final acknowledgement must go to Te Tumu and the Cardrona community for the wonderful opportunities that they have provided.

References
eprints.org (2005), 'Self-archiving FAQ', web page. Accessed on September 26 2006. http://www.eprints.org/openaccess/self-faq/
Hajjem, C., Harnad, S. and Gingras, Y. (2005), 'Ten-year cross-disciplinary comparison of the growth of open access and how it increases research citation impact', IEEE Data Engineering Bulletin 28(4), 39-46.
Harnad, S. (2005), Australia is not maximising the return on its research investment, in C. Steele, ed., '19th Roundtable of the National Scholarly Communications Forum', The Australian Academy of the Humanities, Sydney, Australia. http://eprints.utas.edu.au/204/
Hopkin, M. (2006), 'Family albums highlight climate change', web article, news@nature.com. Accessed on 27 September 2006. http://www.nature.com/news/2006/060807/full/060807-10.html
Miller-Rushing, A. J., Primack, R. B., Primack, D. and Mukunda, S. (2006), 'Photographs and herbarium specimens as tools to document phenological changes in response to global warming', American Journal of Botany 93(11), 1667-1674.
New Zealand Government (2005), 'The Digital Strategy: Creating our Digital Future', policy document, New Zealand Government. http://www.digitalstrategy.govt.nz/
Rankin, J. (2005), 'Institutional Repositories for the Research Sector', feasibility study, National Library of New Zealand. http://wiki.tertiary.govt.nz/~InstitutionalRepositories/Main/ReportOfFindings
Sale, A. (2005), 'The key things to know', presented at the New Zealand Institutional Repository Workshop, Wellington, New Zealand. http://eprints.utas.edu.au/223/
Sale, A. (2006), Researchers and institutional repositories, in N. Jacobs, ed., 'Open Access: Key Strategic, Technical and Economic Aspects', Chandos Publishing, Oxford, UK, chapter 9, pp. 87-100. http://eprints.utas.edu.au/257/
Sale, A. and McGee, C. (2006), 'Tasmania Statistics Software', University of Tasmania. Accessed on September 26 2006. http://eprints.utas.edu.au/262/
Stanger, N. and McGregor, G. (2006), Hitting the ground running: Building New Zealand's first publicly available institutional repository, Discussion Paper 2006/07, Department of Information Science, University of Otago, Dunedin, New Zealand. http://eprints.otago.ac.nz/274/
Swan, A. (2006), The culture of Open Access: researchers' views and responses, in N. Jacobs, ed., 'Open Access: Key Strategic, Technical and Economic Aspects', Chandos Publishing, Oxford, UK, chapter 7, pp. 52-59. http://eprints.ecs.soton.ac.uk/12428/
Swan, A. and Brown, S. (2004), 'Authors and open access publishing', Learned Publishing 17(3), 219-224.

About the authors
Dr. Nigel Stanger is a lecturer in the Department of Information Science at the University of Otago School of Business, where he has taught in the areas of systems analysis and database systems since 1989. He has active research interests in digital repositories, distributed and web database systems, XML technologies, physical database design and database performance. He was the project lead and programmer for the School of Business EPrints repository, which he continues to maintain and enhance. He is also heavily involved in projects to increase the uptake of digital repository technology within New Zealand, and is a key member of the Open Access Repositories in New Zealand (OARiNZ) project.
Dr. Graham McGregor is the Research Development Coordinator for the University of Otago School of Business. He is an experienced tertiary academic and manager, who has held senior positions in both the polytechnic and university sectors in New Zealand, and worked as an independent consultant. As an academic, he largely published in the field of sociolinguistics. He has also joint authored work on ICT pedagogy and practice and has written reports for several New Zealand government agencies. His current role is to stimulate and coordinate research development activities across New Zealand's business and academic communities. He was instrumental in launching the School of Business EPrints repository.

work_fyc7jqdd2jbqjavekcljq3cxua ---- Türk Kütüphaneciliği 24, 3 (2010), 526-532. Opinion Papers (Görüşler). Değişim ve Bibliyografik Denetim [Development and Bibliographic Control]. Gülbün Baydur, Prof. Dr., Hacettepe University, Faculty of Letters, Department of Information Management.
e-mail: gulbun@hacettepe.edu.tr

Abstract
Over the last two decades, bibliographic organization has been changing and diversifying rapidly through its interaction with environmental, technical and social change. Libraries have had to move beyond their established principles and systems into new environments: the materials to be organized have diversified, users have changed, and the means of information retrieval have multiplied. Although cooperative arrangements may appear to have increased routine work, the professional qualifications sought in the organization of information have diversified and changed, as job advertisements for "cataloging and metadata services" illustrate. The qualifications sought include, for example, cataloging and metadata rules (FRBR, RDA), standards (MARC, Dublin Core), DDC, LC, LCSH, electronic content management, vision, and communication with colleagues in order to keep up with developments. University academic programs must make room for this differentiation quickly, while remaining in contact with practitioners in the field. The future of such rapid change can be foreseen only in outline.

Keywords: change (development); bibliographic control

The 20th Century
Throughout the 20th century, cataloging and classification services in libraries could be sustained on internationally usable standards, on rules largely based on those standards, and on principles and systems that had been developed since the 19th century. Up to the present, libraries' organization of their collections has been able to answer whether a given publication exists and, if so, where it is (Cutter, 1904). Catalogs provided access by author (person, corporate body and title), and subject access was also widely provided through controlled vocabularies (subject heading lists).

In the past century, especially after the Second World War, the number and the types of publications grew rapidly: microforms, audiovisual materials, output from central computers, and so on. Despite the different publication types, the revision of the cataloging rules (AACR2 in place of AACR) and of the classification schemes allowed traditional library practices to continue. With online technology, centralized cataloging and classification services became standardized and could also be carried out in the electronic environment. The MARC projects were the field's first standardization applications, first of mechanization and later of automation (Gorman, 2002a, pp. 4-5).

Library catalogs, as throughout history, have been created in media resembling the library materials themselves: card catalogs, COM, OPACs, and so on. The 20th century was, on the other hand, a period in which word-based subject access became widespread in libraries. LCSH came into wide use, particularly in research libraries, and special term lists and thesauri (for example, MeSH) also appeared. Controlled vocabularies of this kind were used in printed indexes as well. Access through library catalogs remained limited to collections of printed publications. Even so, the organization of collections stood in the foreground as the sine qua non of 20th-century librarianship. The cataloging rules, international standard structures and classification systems created and developed for these services, and the practices based on them, made systematic information retrieval in libraries possible, and a professionalism resting on established principles gained weight. The fact that bibliographic records could now be created in the electronic environment gave standardization in the field new importance. In the last decade of the century, the ability to control information on the Web with metadata showed librarianship keeping pace with the times; indeed, metadata standard structures (Dublin Core) were created and developed through the joint work of technicians and librarians. Commercial abstracting publications and indexes outside the library, and the online databases, are as bibliographic metadata not very different from library catalogs, and they complement library resources. Libraries cover larger units (such as books), while the databases cover smaller but analytical resources, such as articles, whose content is analyzed in greater depth. Because the latter assign multiple subject descriptors and permit free-text and Boolean searching, they make retrieval, for now, more productive for the user (Ruschoff, 2010, pp. 62-63): retrieval counts are far higher than those obtained with the controlled languages of library catalogs (cataloging rules, subject heading lists, and the like).

The First Decade of the 21st Century
In the last twenty years, change in the organization of information has accelerated. In the 21st century, cataloging practices that, as in the 20th, rest on an infrastructure with at least two centuries of history behind them are being renewed and transformed with every passing day. The change rests largely on electronic publishing and the new communication technologies. With the 21st century we have entered a period of intensive research born of the need to build new structures for bibliographic control (Storey, 2010). Together with bibliographic control, the profession has strengthened its interdisciplinary character: librarians have had to work with different groups in research projects, and especially in the development of standards and in efficient information management they have come to be accepted as indispensable members of such groups (Gilchrist, 2002, p. 16). Users' means of access and the tools (environments) they use have changed; being able to reach resources from a personal computer without going to the library is the familiar example. Among the projects based on search engines is the "Book library" effort of the popular search engine Google, which is described as aiming to let publishers or libraries provide access to books transferred to the electronic environment (Google, 2010).

Within cooperative cataloging services in the electronic environment, the program OCLC has begun to run under the name WorldCat Cooperative Cataloging is an innovation. It can be read as a comprehensive return of the old union catalog: the attempt to create global bibliographic control, a global catalog, is itself a change, and WorldCat may be the starting point of developments in the near future (Turner, 2010, pp. 271-278).

Environmental change and changing user behavior have led librarians to restructure their professional knowledge for the organization of new environments. Work has accordingly concentrated on new rules rather than on an AACR3; Resource Description and Access (RDA) is the newest example (Joint Steering, 2010), and Functional Requirements for Bibliographic Records (FRBR) is likewise a product of such research. Subject access, one route to the information controlled by records, has in recent years, together with the ontologies and taxonomies now defined in the field (Gilchrist, 2002), put controlled vocabularies (most commonly, classification schemes) to the test for automatic indexing. New indexing languages are being designed in thesaurus form; the Genre/Form Thesaurus built out of LCSH is one example (LC, 2010). All these and similar efforts show that we are living through the necessity of providing systematic access to meaningful but scattered information in the Web environment.

The Relationship between Theory and Practice
The requirements posted in 2010 for recruiting staff concerned with organization in library and information services show the practical consequences of the last twenty years of change (job announcements: ALA... 2010; Australia... 2010; Canadian... 2010). Vacant positions are at times defined as "catalog and metadata specialist." The prerequisites, weighted equally, are acquired "education" and "experience." The qualifications sought in candidates are:
• cataloging and metadata
• library systems
• practice with the current cataloging and metadata rules (AACR2, FRBR, the Resource Description and Access code (RDA))
• standards (MARC, Dublin Core)
• classification systems (Dewey, LC)
• subject heading lists (LCSH)
• online copy cataloging (LC/OCLC cataloging tools)
Additional qualifications commonly sought are:
• content management of printed and digital materials
• having vision in cataloging services.
As can be seen, a transformation is under way in the qualifications of the personnel who can be assigned to what is now called "Catalog and Metadata Services" (DelBosque, 2009, pp. 1-2). Despite the decrease observed in the number of professionals working in this area as copy cataloging has spread with centralized cataloging, the qualities sought in professionals have diversified and multiplied.

It is impossible for such rapid change and transformation in practice not to affect education programs; education cannot be expected to drift far from its market. Indeed, academic curricula too are differentiating and diversifying in course content and in the balance of theory and practice, and efforts are being made to give students a changing vision (Gorman, 2002; Hopkins, 1989; Hudon, 2010; Joudrey, 2002; Spillane, 1999). Environmental, social and technological change should shape the programs of the future (Williamson, 2010). As educators, however, we sometimes worry that, because of universities' internal structures, delays in bringing new programs to life will leave us behind the pace of change.

With the change of the last twenty years, the terminology of bibliographic control is also rapidly diversifying and being renewed; the term metadata, the catalog of the Web, is one example.

Conclusion
A change is being experienced in cataloging; for some, 2010 is the "year of cataloging research" (Roeder, 2010), and research is indeed continuing. In today's flux it is hard to make predictions one by one, but a few things can be said in the light of the developments outlined above. In the near future, new paths to information retrieval will have to be created. Accordingly, it is impossible not to agree with the view that cataloging will continue but will change function (Gorman, 2002a, p. 5). User studies are important here: as is well known, for whom the organization is done has always determined indexing, and today the behavior of database users comes to the fore. In this respect, the organization of information and retrieval will remain fields of work that must be thought of together. Social, environmental and technological development will create change in the methods and apparatus of indexing and, to a degree, development in its principles (Williamson, 2010, p. 12). Change will be experienced above all in descriptive and subject cataloging. The professional in the organization of information (cataloging and classification, indexing, abstracting) must command knowledge of the changing user, the document, indexing languages and the accessible technology, and this knowledge must be maintained dynamically in professional life as well as acquired through education. The dynamism of the profession also directs us to keep cooperation between theory and practice alive with up-to-date knowledge.

References
(ALCTS/CETRC), LC and Catholic University of America Preconference. (2008). What they don't teach in library schools: Competencies, education, and employer expectations for a career in cataloging. Retrieved July 15, 2010 from http://www.loc.gov/catdir/cpso/careercat.html
Cutter, Charles A. (1904). Rules for a dictionary catalog. 4th ed., rewritten. Washington, D.C.
DelBosque, D. and Cory, L. (2009). A chance of storm: New librarians navigating technology tempests. Technical Services Quarterly 26(4): 261-286.
Gilchrist, A. (2002). Thesauri, taxonomies and ontologies: An etymological note. Journal of Documentation 59(1): 7-18.
Google book library project (2010). Retrieved July 26, 2010 from http://books.google.com/library.html
Gorman, M. (2002a). Why teach cataloging and classification? Cataloging and Classification Quarterly 34(1/2): 1-13.
Gorman, M. (2002). From card catalog to WebPACS. Retrieved July 15, 2010 from http://www.loc.gov/catdir/bibcontrol/gorman_paper.html
Hopkins, J. (1989). Classification and cataloging education. The Bookmark (Spring): 179-182.
Hudon, M. (2010). Teaching classification, 1990-2010. Cataloging and Classification Quarterly 48(1): 46-82. Retrieved June 23, 2010 from http://dx.doi.org/10.1080
Job announcements: ALA JobLIST. Retrieved June 23, 2010 from http://joblist.ala.org
Job announcements: Australian Library and Information Association. Retrieved June 23, 2010 from http://www.alia.org.au/employment/vacancies/listing.html/?ID=1697
Job announcements: Canadian Library Association. Retrieved June 23, 2010 from http://www.cla.ca/AM/Templete
Joint Steering Committee for Development of RDA (2010).
RDA: Resource Description and Access. Retrieved July 26, 2010 from http://www.rda-jsc.org/rdafaq.html
Joudrey, D. N. (2002). A new look at US graduate courses in bibliographic control. Cataloging and Classification Quarterly 34(1-2): 59-101.
LC. (2010). Library of Congress to formally separate LC Genre/Form Thesaurus from LCSH. Retrieved January 17, 2010 from http://www.loc.gov/catdir/cpso/genreformthesaurus.html
Roeder, R. (2010). Guest editorial: A year of cataloging research. Library Resources and Technical Services 54(1): 2-3.
Ruschoff, C. (2010). Guest editorial: New areas for cataloging research. Library Resources and Technical Services 54(2): 62-63.
Spillane, J. L. (1999). Comparison of required introductory cataloging courses, 1986-1998. Library Resources and Technical Services 43(4): 223-224.
Storey, T. (2010). OCLC launches innovation laboratory. NextSpace 15: 20.
Turner, A. H. (2010). OCLC WorldCat as a cooperative catalog. Cataloging and Classification Quarterly 48(2-3): 271-278.
Williamson, N. J. (2010). Is there a catalog in your future? Access to information in the year 2006. Cataloging and Classification Quarterly 48(1): 10-25.

Journal of the Korean Biblia Society for Library and Information Science (한국비블리아학회지), 26(1): 35-50, 2015. [http://dx.doi.org/10.14699/kbiblia.2015.26.1.035]

Exploratory Study on the Activity about Utilization and Contribution to the Union Catalog*
Jane Cho (조재인)**

* This study was supported by a 2014 Incheon National University research grant.
** Associate Professor, Department of Library and Information Science, Incheon National University (chojane123@naver.com). Received: February 22, 2015; Reviewed: March 3, 2015; Accepted: March 13, 2015.

Abstract
For a union catalog to flourish, the sense of community and spirit of cooperation of the libraries participating in the bibliographic network matter most, but appropriate compensation for contribution can also motivate participation. This study therefore reviews the contribution-compensation arrangements of overseas union catalogs and offers an exploratory analysis of the contribution activities and degree of utilization of the libraries participating in the Korean university library union catalog, as basic data for framing a compensation policy. Concretely: first, descriptive statistics are used to survey the overall state of contribution and utilization; second, Pearson correlation analysis examines what relationships hold between contribution activities and degree of utilization; third, hierarchical clustering typifies the participating institutions, showing the size of the contributing group, whether a group of special contributors exists, and which libraries only draw benefits or need to be drawn into active participation.

Keywords: union catalog (union database); contribution-compensation policy; UNICAT; OCLC

1. Introduction
1.1 Background
A union catalog is built and operated so that libraries can share their holdings through a bibliographic network and carry out cataloging work jointly.
Participating libraries download the bibliographic data built in the union catalog and use it for their own cataloging, while sending data on their own holdings to the union catalog, thereby building the location information needed for interlibrary loan and document delivery. They also contribute to quality by sending new bibliographic data to the union catalog so that other libraries can download it, and by correcting and supplementing incomplete records already in the union catalog.

OCLC's (Online Computer Library Center) WorldCat, begun in the United States some forty years ago and now joined by 72,000 libraries, holds more than 300 million bibliographic records, making it the most extensive union catalog in the world. National union catalog databases have also been built in Australia, Japan and elsewhere, and interlibrary loan and cooperative cataloging services are run on top of them. In Korea, a nationwide university-based union catalog database was built in the late 1990s; as of October 2014, 738 institutions participate, and the accumulated data amount to 10 million bibliographic records and 50.94 million holdings records.

A bibliographic network is maintained and operated on a sense of community and a spirit of cooperation, so every participating institution should see itself as a member of the community and pursue harmonious development through voluntary cooperation. In bibliographic networks, however, a small number of contributing libraries coexist with libraries that one-sidedly take benefits, and this has become a problem. In Korea, more than half of the participating institutions take no part in record construction and only download bibliographic records; in Japan, 50% of all bibliographic records are created by just 20 libraries, and this serious polarization has been pointed out. Some overseas bibliographic networks therefore recognize and appropriately reward the contributions of active libraries, and operate or plan various forms of contribution-compensation systems in order to motivate continued creation of new records and maintenance of quality.

Against this background, and on the premise that a contribution-compensation system could also be discussed for the Korean university library union catalog to encourage new record creation and quality improvement, this study empirically examines the contribution activities and degree of utilization of the participating libraries and typifies them.

1.2 Purpose of the Study
The purposes of this study are, concretely, as follows.

First, to survey the overall state of the contribution activities and degree of utilization of the libraries participating in the university library union catalog: basic statistics and frequency analyses show how many libraries actually used the union catalog, and how many contributed and in what ways.

Second, to examine, through the Pearson correlation coefficient, the relationship between contribution activities such as bibliographic construction, holdings construction and bibliographic quality improvement on the one hand and the degree of utilization on the other.

Third, to typify the participating institutions through hierarchical cluster analysis and to characterize the clusters with descriptive statistics and a one-way analysis of variance (ANOVA): not only the size of the contributing group and whether a group of special contributors exists, but also the sizes of the benefiting group and of the group whose participation needs to be encouraged.

2. Status of the Union Catalog and Contribution-Compensation Systems
2.1 Status and Characteristics of the Korean University Library Union Catalog
The Korean university library union catalog has entered a stable phase thanks to a large-scale database clean-up based on visual inspection, the introduction of a membership grade system, and a standing quality-verification regime. Participation has grown steadily, and the catalog has recently become a huge bibliographic network encompassing 736 institutions. The union catalog is linked with the national bibliography of the National Library of Korea and with new-title records from the Publication Industry Promotion Agency of Korea, improving the cataloging efficiency of front-line libraries, and it is the backbone of interlibrary loan and document copy services (Cho 2012).

Bibliographic records have accumulated steadily, by 200,000-400,000 per year, reaching about 10 million as of the end of August 2014, with holdings records amounting to 50.94 million (RISS Unicat Homepage 2014). Usage runs to 4-6 million searches and 0.8-1.1 million downloads per year (Chang 2014).

As of October 2014, most of the cooperative cataloging member libraries are four-year university libraries (306), but 129 two-year college libraries also participate, along with 301 special and public libraries. Members fall into three grades, A, B and C, and membership is free at every grade. A-grade members build both new bibliographic records and holdings records in the union catalog; they may upload records one by one online in real time, or send them in batch after accumulating them for a period. Of the 736 institutions, 176 are registered as A-grade members, and no special reward is provided for uploading data. B-grade members, 247 libraries, build only holdings records in the union catalog; a library must join at grade B or above to take part in the interlibrary loan service. C-grade members, the most numerous at 313 institutions, may search and download data for copy cataloging without any particular obligations or fees.

2.2 Contribution-Compensation Systems of Overseas Union Catalog Services
Representative overseas union catalog services are OCLC's WorldCat, NII's (National Institute of Informatics) NACSIS-CAT, and the National Library of Australia's ANBD (Australian National Bibliographic Database). WorldCat (www.worldcat.org) is the most extensive union catalog database, with 72,000 participating libraries worldwide and more than 300 million bibliographic records. NACSIS-CAT (ci.nii.ac.jp/books) is Japan's largest union catalog, in which, as of 2014, 1,259
호주국립도서 은 종에 따라 조 씩 다른 멤버쉽 비용을 부과하고 있는데, 학 도서 의 경우, 개별 도서 의 총 산 비 일정 비율이 회비로 요구되고 있다. 그러나 걷어진 회 비는 종합목록 운 과 연구 개발 등에 투입될 뿐, 기여 활동에 한 보상으로는 지출되고 있지 않 다. 호주 국립도서 은 데이터를 업로드 하는 기 여 행 는 순수한 사 정신으로 이루어지고 있 을 뿐, 기여 활동에 한 보상은 제공하지 않고 있다고 말하고 있다(Library Austrailia 2013). 마지막으로 OCLC는 호주국립도서 과 같 이 요 을 징수하고 있으나 이런 방식으로 형 성된 재원의 많은 부분이 기여에 한 보상으 로 제공되고 있다. OCLC는 2012년에 $22.4 million을 기여 도서 에 보상했고, 2011년에는 $19.8 million, 2010년에는 $19.6 million을 각 각 보상해 왔다. 기여의 유형은 서지 코드 생 성, 서지 코드 수정, 소장 코드 제거와 그 밖 에 상호 차를 통한 자료 제공 등이 포함되는데, 체 보상액의 46%인 $10,349가 서지 코드 작성에, 33%인 $7,302가 코드 수정 소장 정보 구축에 제공되어져 왔다. 체 회원 기 40%의 기 이 기여에 한 보상 을 제공받 은 셈이며, 2012년에는 0.5%인 100개 도서 이 OCLC에 지불하는 연간 이용료보다 보상 을 더 많이 받아갔다(OCLC 2013). 2014년 6월부 터는 Global Advisory Group on Credits의 제 안에 의해 보상제도 운 방식을 트랜젝션 방식 에서 정액제(flat-rate) 방식으로 환함으로써 보다 극 으로 참여 도서 들의 기여를 보상 할 계획이라고 한다(Goodson 2015). 이와 같이 세계 으로 가장 큰 향력을 가진 OCLC는 거둬드린 수익의 반 가까이를 보상 제도로 환 원하고 있다. OCLC는 보상 자체가 공유와 기 여를 한 정신에 앞서진 않지만, 도서 들에게 동기 부여가 되고 있을 뿐 아니라, 참여 도서 입장에서 이용료를 감할 수 있다고 말하고 있 다(OCLC 2013). 3. 연구 방법 본 연구는 종합목록 회원 기 의 참여 행태 를 이해하기 하여 2014년 10월에 학술정보통 계시스템(rinfo.mest.go.kr)에 가장 최근 데이 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 39 터로 구축된 종합목록 공동활용 통계를 수집하 다. 종합목록 공동 활용 통계의 조사 년도는 2013년이며, 자료기 일은 2012년 3월 1일이 다. 수집된 데이터는 ‘기여 활동’과 ‘이용 정도’ 로 구분할 수 있으며, ‘기여 활동’은 서지구축, 소장구축, 서지품질제고 건수로 설정하고, ‘이 용 정도’는 검색 건수로 정의하 다. 서지 구축 은 온라인 는 배치로 학도서 이 업로드한 데이터가 복 알고리즘과 서지교체 알고리즘 에 의해 종합목록상에 신규 서지로 채택된 경우 를 의미하며, 소장구축은 자 이 소장하고 있는 도서에 한 정보를 종합목록에 업로드함으로 써 상호 차를 해 필요한 소재 정보로 표시 되는 경우를 의미한다. 한편, 서지품질제고건수 는 서지수정, 서지삭제, 서지통합 건수를 모두 합산한 데이터로 사서가 온라인으로 종합목록 상에 존재하는 서지의 품질을 제고하거나 사서 가 발견한 복 서지를 통합하는 순수한 기여 활동을 의미한다. 수집된 데이터는 다음과 같은 차를 통해 분 석하 으며, 분석 도구는 SPSS 21을 활용하 다. 첫 번째, 참여 도서 의 반 인 황을 살 펴보고, 서지구축, 소장구축, 서지품질 제고, 검 색 건수로 구분하여 기 통계와 빈도분석을 실 시해 반 인 실태를 악해 본다. 두 번째, ‘기여 활동’ 간의 상 성을 악하기 하여 서지구축과 소장구축, 서지구축과 서 지품질제고, 그리고 소장구축과 서지품질제고간 의 계를 피어슨 상 계수(Pearson Correlation Coefficient)를 통해 확인해 본다. 한편, ‘기여 활동’과 ‘이용 정도’간의 계를 악하기 하 여 서지구축과 검색, 소장구축과 검색, 그리고 서지품질제고와 검색간의 피어슨 상 계수도 도출해 본다. 세 번째, 공동목록 참여 기 을 유형화하기 하여 계층 군집분석을 수행한다. 덴드로그 램(Dendrogram)을 산출해 군집의 숫자를 결 정하고 결정된 군집간의 차이를 악하기 하 여 기술통계 분석과 일원분산분석(ANOVA: One-way Analysis of Variance)을 실시해 특 징을 규명해 본다. 네 번째, 기여 활동 이용 정도, 양측의 상 계, 그리고 도출된 군집의 특징을 기반으로 기여보상제도 운 방향을 시사해 본다. 한편, 본 연구에 사용된 데이터의 기 시 이 2012년 3월 1일이므로, 본 연구에서 제시할 기여 이용 정도, 그리고 군집의 특성과 규모는 해당 연도에 한정됨을 밝힌다. 따라서 연도별로 회원 기 의 기여 활동과 활용 정도에 따라 군집의 특 성과 규모에 차이가 나타날 수 있다. 4. 분석 결과 4.1 기여 활동 이용 정도 분석 본 장에서는 종합목록 회원 도서 의 기여 활동 이용도를 반 으로 살펴 본다. 첫 번째, 종합목록 회원 도서 의 반 인 참여 행태 분석을 하여 기 통계 분석을 실 시한 결과는 <표 1>과 같이 나타났다. 회원으로 가입된 도서 실제 데이터가 존재하는 도 서 은 체 회원 기 355개뿐으로 나타났 으며 나머지 도서 은 회원으로 가입은 했으나 시스템에 속조차 하지 않은 것으로 나타났다. 서지구축 평균은 기 당 1,498건, 소장구축 평 40 한국비블리아학회지 제26권 제1호 2015 기여활동 활용 서지구축 소장구축 품질기여 검색건수 N 유효 355 355 355 355 결측 0 0 0 0 평균 1,498.1972 10,027.7718 22.9634 30,131.3042 최소값 .00 .00 .00 .00 최 값 52,245.00 185,356.00 2,442.00 574,443.00 합계 531,860.00 3,559,859.00 8,152.00 10,696,613.00 <표 1> 기여 이용 정도에 한 기 데이터(데이터가 존재하는 355개 도서 기 ) 균은 10,027건으로 나타났으며, 서지수정, 삭제, 통합과 같은 서지품질제고 건수의 평균은 22건 에 불과하 다. 반면 검색 건수의 평균은 30,131 건으로 나타나, 서지구축, 소장구축, 서지품질 기여 건수에 각각 20배, 3배, 1,500배에 달하는 것으로 나타났다. 두 번째, 종합목록 ‘기여 활동’과 ‘이용도’에 있어 참여 도서 의 분포를 좀 더 자세히 살펴 보기 하여 빈도 분석을 수행한 결과는 <표 2> 와 같이 나타났다. 실제 참여 도서 인 355개 도서 을 기 으로 분석해 볼 때, 서지구축의 경우, 70%에 해당하는 도서 이 한 건의 데이 터도 구축하지 않은 것으로 나타났으며, 나머 지 28.5%가 10,000건 미만에 그치는 수치를 보 여, 서지구축 기여 도서 의 숫자는 제한 인 것으로 악할 수 있었다. 소장구축의 경우는 62%에 해당하는 도서 이 한 건도 데이터를 구축하지 않았으며, 33.4%가 50,000건 미만인 것으로 나타났다. 그러나 50,000만 건 이상을 구축한 도서 도 7.8%나 존재해, 서지구축 기 여도서 보다는 많은 규모를 나타내고 있었다. 
서지품질제고에 있어서는 86%의 도서 이 아 무런 실 을 가지고 있지 않았으며, 0.6%에 해 당하는 2개 도서 만이 두드러지게 높은 기여 를 한 것으로 나타났다. 한편, ‘이용도’를 의미 하는 지표인 검색의 경우, ‘기여 활동’을 의미하 는 다른 지표들과 달리 단지 19%의 도서 만 이 기록이 없는 것으로 나타났다. 다수를 차 지하는 도서 이 100,000만 건 주변에 분포되어 있었으며, 100,000만 건 이상인 도서 도 7.2% 나 되는 것으로 나타났다. 세 번째, 의 빈도 분포표에서 나타난 특징을 시각화하여 기여 활동과 이용 정도를 좀 더 쉽게 비교하기 하여 산 도를 도출한 결과 <그림 1> 기여 활동 서지구축 비율 빈도 0 70.1 249 1-10,000 28.5 95 10,000-20,000 0.9 3 20,000-30,000 1.5 5 30,000-40,000 0.6 2 50,000- 0.3 1 <표 2> 기여 활용 행태에 한 빈도 분석 결과 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 41 서지구축 소장구축 서지품질제고 검색 <그림 1> 기여 활동 이용 정도 통계에 근거한 도서 산 도 기여 활동 소장구축 비율 빈도 0 62 217 1-50,000 33.4 112 50,000-100,000 6.6 22 100,000-1,500,000 0.6 2 150,000- 0.6 2 서지품질제고활동 비율 빈도 0 86.0 307 0-500 12.0 46 500-1,000 0.0 0 1,000-1,500 0.0 0 1,500-2,000 0.0 0 2,000- 0.6 2 활용 정동 검색 비율 빈도 0 19.0 69 0-100,000 72.0 258 100,000-200,000 5.6 20 200,000-300,000 0.8 3 300,000-400,000 0.8 3 42 한국비블리아학회지 제26권 제1호 2015 과 같이 나타났다. 서지구축의 경우, 부분의 이 하단에 집해 있는데 반해, 소장구축은 좀 더 많은 이 단까지 고루 분포하고 있는 것을 확인할 수 있다. 이는 소수의 도서 만이 서지 구축에 기여하고 있으나, 소장정보 구축 에는 상 으로 다수의 도서 이 참여하고 있 음을 설명한다. 한편, 서지품질제고의 경우, 부분의 이 ‘0’ 주변에 분포하며, 단 두 개의 만이 높은 곳에 분포해 있다. 이는 극소수의 도서 만이 기여하고 있으며 부분의 도서 은 실 이 없음을 나타내고 있다. 마지막으로 검색에서는 많은 도서 이 일정 범주 내에 조 하게 분포하는 것을 확인할 수 있다. 서지나 소장구축, 서지품질제고의 경우 0값에 수많은 들이 모여 마치 굵은 선을 형성하고 있는데 반해, 검색 산 도에서는 0에서 10만 건 사이에 들이 조 하게 분포함으로써, 많은 수의 도 서 이 종합목록을 통해 일정 정도의 혜택을 받아가고 있음을 나타내고 있다. 에서 분석한 결과를 요약해 보면 다음과 같이 정리할 수 있겠다. 회원 등 제에 의한 개 별 회원의 데이터 구축에 한 의무 조건에도 불구하고, 실제 참여 도서 의 60% 이상이 서 지구축과 소장구축 실 이 부재하 다. 더욱이 서지품질제고 활동에 있어서는 80% 이상의 도 서 이 실 을 가지고 있지 않아, 소수의 도서 만이 종합목록에 기여하고 있는 것으로 악 되었다. 반면 다수 도서 이 0에서 10만 사이 의 검색 실 을 가지고 있어, 기여보다는 검색 을 해 종합목록에 참여하고 있는 것으로 악할 수 있겠다. 4.2 각 참여 활동 간의 상 성 분석 본 장에서는 피어슨 상 분석을 통해 참여 활 동간의 계를 악하 다. 먼 ‘기여 활동’을 의미하는 지표인 서지구축, 소장구축, 서지품질 제고간의 상 성을 분석하 으며, 더불어 ‘기여 활동’과 ‘이용 정도’간의 상 성 측정을 해 서 지구축과 검색, 소장구축과 검색, 그리고 서지 품질제고와 검색간의 상 도를 측정하 다. 첫 번째, ‘기여 활동’ 간의 상 성을 분석한 결과 <표 3>과 같은 결과가 도출되었다. 먼 , 서지구축과 소장구축 간에는 강한 양의 상 성 (0.902)이 있는 것으로 나타났다. 이는 소장도 서가 많은 학도서 에서 작성한 서지가 종합 목록 상에 자주 신규 서지로 채택되고 있음을 의미할 수 있겠다. 다시 말해 장서량이 많은 도 서 이 종합목록 상에 존재하지 않는 희귀 도서 를 많이 보유하고 있을 뿐 아니라, 신간 도서를 다른 도서 보다 먼 수서하여 서지 코드를 작성한 후 업로드했을 가능성이 크다는 것이다. 한 일반 으로 이러한 도서 에서 작성된 서 지의 품질이 우수하여, 서지 교체 매카니즘에 의해 기존 서지를 체하고 있는 것으로 추정해 볼 수 있겠다. 한편, 서지구축과 서지품질제고, 그리고 소장구축과 서지품질제고 간에는 각각 0.405, 0.423의 상 으로 높지 않은 수치가 나 타났다. 서지 소장 구축은 일정기간 동안 자 에 된 코드가 배치로 송부되어, 자동 처리되지만, 서지품질제고는 사서가 온라인으 로 속해 직 한건 한건 데이터를 수정해 품 질을 제고하는 행 로, 자 의 소장 도서가 많 다고 해서 품질 개선을 한 노력을 더 많이 기 울이고 있는 것은 아님을 나타내고 있다. 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 43 서지구축 소장구축 서지구축 Pearson 상 계수 1 .902(**) 유의확률 (양쪽) .000 N 355 355 소장구축 Pearson 상 계수 .902(**) 1 유의확률 (양쪽) .000 N 355 355 서지구축 서지품질제고활동 서지구축 Pearson 상 계수 1 .405(**) 유의확률 (양쪽) .000 N 355 355 품질기여 Pearson 상 계수 .405(**) 1 유의확률 (양쪽) .000 N 355 355 서지품질제고활동 소장구축 서지구축 Pearson 상 계수 1 .423(**) 유의확률 (양쪽) .000 N 355 355 품질기여 Pearson 상 계수 .423(**) 1 유의확률 (양쪽) .000 N 355 355 ** P < 0.01 <표 3> 기여 활동간의 상 성 두 번째 ‘기여 활동’과 ‘이용 정도’간의 피어 슨 상 분석 수행 결과는 <표 4>와 같이 나타났 다. 소장구축건수와 검색간의 상 계수는 0.713 으로 나타나 높은 상 성이 있는 것으로 분석 되었다. <그림 2>의 산 도에서 나타나는 바와 같이 들이 양의 상 계를 보이며 퍼져 있 어, 소장 도서가 많은 학도서 이 그만큼 많 이 이용하고 있음을 설명하고 있다. 소장 도서 가 많은 도서 의 목록 작성량이 그 지 않은 도서 보다 많아, 종합목록을 검색하는 빈도가 상 으로 높기 때문에 이러한 상 성을 보이 는 것으로 추정된다. 그러나 서지구축과 검색 간에는 피어슨 상 계수가 0.563으로 나타나 상 으로 약한 상 성을 보 는데, 이는 신 간 도서나 희귀 도서를 많이 보유하거나 우수 한 품질의 서지를 제공해 기존 종합목록 서지 를 체한 도서 들이 종합목록에 의존하는 정 도가 상 으로 조한 상을 설명할 수 있 겠다. 신간 도서나 희귀 도서는 카피카탈로깅 이 가능한 서지가 종합목록 상에 존재하지 않 을 가능성이 높고, 고품질의 서지를 작성하는 도서 은 카피 카탈로깅보다 오리지 카탈로 깅(Original Cataloging)을 선호하기 때문으로 추정된다. 
한편, 서지품질제고건수와 검색건수 간의 상 성을 분석한 결과에서는 0.285정도의 낮은 상 계수를 나타냈다. 이는 종합목록을 많이 검색한다고 해서 서지데이터 품질 제고를 해 44 한국비블리아학회지 제26권 제1호 2015 서지구축 검색건수 서지구축 Pearson 상 계수 1 .563(**) 유의확률 (양쪽) .000 N 355 355 검색건수 Pearson 상 계수 .563(**) 1 유의확률 (양쪽) .000 N 355 355 검색건수 소장구축 검색건수 Pearson 상 계수 1 .713(**) 유의확률 (양쪽) .000 N 355 355 소장구축 Pearson 상 계수 .713(**) 1 유의확률 (양쪽) .000 N 355 355 품질기여 검색건수 품질기여 Pearson 상 계수 1 .285(**) 유의확률 (양쪽) .000 N 355 355 검색건수 Pearson 상 계수 .285(**) 1 유의확률 (양쪽) .000 N 355 355 ** P < 0.01 <표 4> 기여와 이용 정도 간의 상 성 <그림 2> 기여 활용 통계에 근거한 도서 산 도 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 45 노력하고 있다고 말할 수 없음을 의미한다. 다 시 말해, 서지품질제고 실 이 높은 도서 은 데이터의 이용량과 상 없이 순수한 사 정신 을 가지고 기여하고 있다고 설명해 볼 수 있을 것이다. 에서 ‘기여 활동’간, 그리고 ‘기여 활동’과 ‘이용 정도’간의 상 분석을 통해 드러난 결과 를 요약하면 다음과 같이 기술할 수 있겠다. 첫 번째, 소장도서가 많은 학도서 의 서지가 종합목록상에 자주 신규 서지로 채택되고 있다. 두 번째, 소장도서가 많거나 신규 서지로 많이 채택된 기 이 특별히 서지품질제고를 한 노 력을 더 많이 기울이고 있는 것은 아니다. 세 번째, 소장도서가 많은 도서 이 그 만큼 종합 목록 서지데이터를 더 많이 활용하고 있으나, 그 서지가 종합목록상에 많이 채택된 도서 은 오리지 카탈로깅 비 이 상 으로 높 아, 종합목록 이용도가 다소 떨어진다고 추정 해 볼 수 있겠다. 네 번째, 검색을 많이 한다고 해서 종합목록 서지품질제고에 기여하고 있는 것은 아니다. 다시 말해 서지품질제고에 기여 하고 있는 도서 은 이용량과 무 하게 순수하 게 기여하고 있는 것이다. 4.3 군집의 형성 특징 분석 계층 군집분석은 가까이에 있는 상들로 부터 노드를 결합해 감으로써 트리모양의 계층 을 형성해 가는 방법으로 덴드로그램을 통해 군 집이 형성되어 가는 과정을 정확히 악할 수 있 다. 여기에서는 서지구축, 소장구축, 서지품질제 고, 검색 건수를 Z값으로 표 화하고 심 군 집화 방식(Centroid Method)을 이용하여 군집 을 형성하 다. 그 결과, 다음과 같이 총 4개의 군집이 형성되었다. 군집 1에는 344개 다수의 학이 포함되었으며, 군집 2, 3, 4는 각각 2개, 7개, 2개 학도서 이 포함되는 것으로 나타났 다. 각 군집의 특성과 차이를 악하기 하여 <표 5>와 같이 기술 통계 분석을 병행하 다. <그림 3> 종합목록 기여 활용 행태에 따른 덴드로그램 <표 5>의 서지 구축 평균을 보면, G3이 30,338건 으로 가장 높게 나타났고, G4가 21,984건으로 그 다음으로 높게 나타났다. G2는 5,227건, G1 은 770건으로 하게 낮은 수치를 보 다. 소 장 구축에 있어서도 G3이 116,399건으로 가장 높게 나타났지만, 108,933건인 G4와 큰 차이를 보이지는 않는다. 반면, G2는 77,780건, G1은 6,894건으로 조한 수치를 나타냈다. 서지품 질제고건수에 있어서는 G4가 압도 으로 많은 2,291건을 보 고, G3은 118건, G2와 G1은 각 46 한국비블리아학회지 제26권 제1호 2015 N 평균 최소값 최 값 서지구축 G1 344 770.5 0.0 12778.0 G2 2 5227.0 4206.0 6248.0 G3 7 30338.4 21443.0 52245.0 G4 2 21984.0 10149.0 33819.0 합계 355 1498.1 0.0 52245.0 소장구축 G1 344 6894.2 0.0 92846.0 G2 2 77780.5 75990.0 79571.0 G3 7 116399.0 85389.0 185356.0 G4 2 108933.0 67251.0 150615.0 합계 355 10027.8 0.0 185356.0 검색건수 G1 344 23446.8 0.0 320322.0 G2 2 496591.0 418739.0 574443.0 G3 7 180563.6 126047.0 338843.0 G4 2 186890.0 143967.0 229813.0 합계 355 30131.3 0.0 574443.0 품질기여 G1 344 7.9 0.0 462.0 G2 2 6.5 1.0 12.0 G3 7 118.3 0.0 316.0 G4 2 2291.0 2140.0 2442.0 합계 355 23.0 0.0 2442.0 <표 5> 각 군집의 기여 활동 이용 정도에 한 기술 분석 결과 자유도 F 유의확률 서지구축 3 343.352 .000 소장구축 3 140.172 .000 검색건수 3 100.828 .000 품질기여 3 1609.748 .000 <표 6> 각 군집간의 기여 활용 행태 차이 악을 한 일원분산분석(ANOVA)결과 각 10건 미만으로 매우 낮게 나타났다. 이 수치 를 통하여 4개 그룹의 특성을 정리하면 다음과 같이 기술할 수 있겠다. G1은 기여 활동과 이 용 모두 조한 그룹이며, G2는 이용은 많이 했으나, 기여도는 그리 높지 않은 그룹으로 특 징지을 수 있겠다. 한편, G3은 극 으로 기여 하고 있을 뿐 아니라, 이용도 많이 하는 그룹으 로 설명할 수 있으며, G4는 극 으로 기여하 고 이용할 뿐 아니라, 종합목록 품질 제고를 해 노력하는 그룹으로 설명할 수 있을 것이다. 한편, 각 변수에 있어, 집단 간의 차이가 통계 으로도 유의미한지 살펴보기 하여 추가 으로 일원분산분석을 실시하 다. 분산분석 결 과 <표 6>과 같이 검정통계량 F값의 유의확률 이 모두 .000으로 나타나, 모든 변수에 있어 그 룹 간에 차이가 의미 있는 것으로 나타났다. 따 라서 기술 통계 분석을 통해 비교한 평균 차이 가 의미 있는 집단 간의 차이임이 입증되었다. 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 47 5. 논 의 본 연구는 학별 종합목록 참여 통계를 기 반으로 ‘기여 활동’ ‘이용 정도’를 살펴보고 각 행태간의 상 성을 분석하 다. 한 군집 분석을 통해 참여 도서 을 유형화하 다. 첫 번째, 종합목록 ‘기여 활동’과 ‘이용 정도’ 를 의미하는 각 지표별로 기 통계 빈도 분 석을 수행한 결과 다음과 같은 사실을 발견하 다. 서지 구축과 소장 구축은 실제 참여 기 을 기 으로 할 때, 60% 이상이 실 을 가지고 있지 않았으며, 서지품질제고 활동의 경우 참 여 기 의 80% 이상이 실 을 가지고 있지 않 았다. 그러나 검색 실 은 많은 도서 이 고르 게 가지고 있어, 종합목록 기여 활동은 소수 도 서 에 의해 이루어지고 있으며, 부분이 혜 택만을 취하고 있는 것으로 단되었다. 따라 서 종합목록 기여 활동에 참여하는 소수의 도 서 에 한 보상을 고려해 볼 수 있을 것이다. 
두 번째, ‘기여 활동’과 ‘이용 정도’에 한 상 성을 분석한 결과 다음과 같은 사실을 발견 하 다. 서지 소장구축 활동이 극 일수 록 검색 건수가 높게 나타났다. 따라서 기여 활 동에 극 인 도서 이 이용도 극 으로 하 고 있는 것으로 단되었다. 그러나 참여 도서 서지가 종합목록에 신규로 채택되거나 고품질의 서지를 업로드 해 서지 교체가 이루 어진 도서 의 검색 건수는 소장정보만 구축된 도서 보다 상 으로 조하게 나타났다. 한편, 서지품질제고와 검색 건수간의 상 성은 매우 낮은 것으로 나타나, 서지품질제고는 이 용과 무 한 순수한 기여 활동으로 단되었다. 따라서 기여보상제도를 운 한다면, 서지품질 제고에 한 기여를 가장 높게 평가하고 소장 <그림 4> 군집별 략맵 48 한국비블리아학회지 제26권 제1호 2015 구축보다는 서지구축에 한 기여를 좀 더 인 정하는 것이 바람직할 것으로 보여진다. 세 번째, 군집분석 결과, 종합목록 참여 기 은 모두 4개의 군집으로 클러스터링되었다. 첫 번째 군집은 기여 활동과 이용이 모두 조한 군집(G1)으로 형성되었고, 두 번째 군집은 기 여 활동은 조하나 이용도가 매우 높은 군집 (G2)으로 형성되었다. 세 번째 군집은 기여 활 동과 이용도가 모두 높은 군집(G3)으로 형성 되었으며, 네 번째 군집은 기여 활동과 이용도 가 높을 뿐 아니라, 서지품질제고 활동이 특별 히 높은 군집(G4)으로 형성되었다. 이들 군집 의 특징을 도식화해 설명하기 하여, 기여 활 동을 합산하여 Y축으로, 검색 건수를 X축으로 지정하고 서지품질제고에 10배의 가 치를 두 어 산 도를 산출하 다. <그림 4>의 산 도를 보면, 높은 기여도를 보인 G3과 G4는 상단 간 부분에 치하고 있으며, 기여 활동과 이용 모두 조한 G1은 좌측 하단, 검색 건수가 특 별이 높게 나타난 G2는 우측 단부에 치하 는 것으로 표시되었다. 산 도에서 상단에 치한 군집 3-4는 높은 기여도를 보인 집단으로 보상을 고려할 수 있 겠으며, 지속 인 서지 구축과 품질제고 활동 을 독려해야 할 것이다. 특히 서지품질제고를 통해 극 인 기여 활동을 수행한 군집 4는 특 수 공헌을 인정할 필요가 있을 것이다. 한편, 두 드러지는 이용도를 보인 그룹 2는 상황에 따라 가의 요구를 고려할 수 있겠으며, 기여와 활 용 면에서 모두 소극 인 그룹 1은 일단 그 요 인을 악하고, 활성화를 해 홍보와 유인책 을 마련할 필요가 있을 것이다. 6. 결 론 본 연구에서는 종합목록 참여 도서 의 ‘기 여 활동’과 ‘이용 정도’를 탐사 으로 분석하여 기여 집단의 규모와 특징, 특수 공헌 집단의 존 재 여부, 수혜 집단과 참여 유도가 필요한 집단 의 략을 살펴 보았다. 분석 결과, 종합목록 참여 도서 의 기여 활 동은 매우 소수의 도서 에 집 되어 있는 것 을 확인할 수 있었다. 부분의 도서 이 혜택 만을 취하고 있어, 기여 도서 에 한 보상이 필요하다고 단되었다. 특히 상 분석 결과 이용량과는 상 없이 순수한 기여 활동이 보여 지는 서지품질제고에 한 기여를 가장 높게 인정하고 그 다음 서지구축, 소장구축 순으로 기여 정도를 인정하는 것이 바람직할 것으로 단되었다. 한편, 군집분석을 통해 참여도서 을 유형화한 결과 총 4개의 군집이 생성되었다. 기여가 두드러진 군집, 기여와 이용이 모두 극 인 군집, 이용이 두드러진 군집과 기여와 이용 모두 조한 군집으로 구분되었다. 따라 서 군집의 규모와 특성을 이해하고 그에 따라 보상을 통한 지속 인 서지 구축과 품질 제고 독려, 가의 요구, 참여 유도 등 차별화된 응 략이 필요할 것으로 보여진다. 본 연구는 기 데이터만을 제공하고 있어, 실제 운 을 해서는 각 기여 활동 이용 정 도에 따른 보상 방식의 결정이 필요하겠다. 한 재원 마련 가능성 등이 검토되어야 할 것이 다. OCLC는 보상제도의 재원을 마련하기 하 여 서비스를 유료화하고 보상액을 이용료로부 터 차감하는 방식을 채택하고 있다. 그러나 우 리는 일단 유료화하지 않으면서 기여를 보상해 학도서 의 종합목록 기여 활동 이용 정도에 한 탐사 연구 49 수 있는 방안의 모색과 그에 앞서 기여보상 에 한 참여 기 의 의견 수렴 등이 선행되어 야 할 것이다. 참 고 문 헌 國立情報學硏究所. 2009. 次世代目録所在情報サービスの在り方について [online]. [cited 2014.3.26]. . 장 연. 2014. 2014년도 상반기 종합목록운 원회 [online]. [cited 2014.10.1]. . 장 연. 2012. 2012년도 상반기 KERIS 종합목록 운 원회 [online]. [cited 2014.10.1]. . 조재인. 2012. 한국과 일본의 학 학술정보 공유 유통 체계 비교 연구. ꡔ한국도서 정보학회지ꡕ, 43(4): 23-45. 佐藤義則. 2011. これからのNACSIS-CAT/ILLの運用体制について [online]. [cited 2014.3.26]. . Goodson L. 2015. Question about enhance program. 3 March 2015. [cited 2015.3.3]. Personal Communication. OCLC. 2013. Final Report Global Advisory Group on Credits and Incentives [online]. [cited 2014.3.26]. . RISS Unicat Hompage. [cited 2014.10.5]. . Library Austrailia. 2013. Response to your Libraries Australia enquiry, 8 July 2013. [cited 2013.7.8]. Personal Communication. •국문 참고자료의 영어 표기 (English translation / romanization of references originally written in Korean) Chang, K. Y. 2014. UNICAT Management Committee the first half year of 2014 [online]. [cited 2014.10.1]. . 50 한국비블리아학회지 제26권 제1호 2015 Chang, K. Y. 2012. UNICAT Management Committee the first half year of 2012 [online]. [cited 2014.10.1]. . Cho, J. 2012. “A Comparative Study of Academic Resource Sharing and Service System Between Korea and Japan.” Journal of Korea Library and Information Science Society, 43(4): 23-45. work_g3gv4wxkcvc3bb3yvggbyd43bm ---- Global Collaboration and the Future of the OCLC Cooperative James G. Neal portal: Libraries and the Academy, Volume 7, Number 3, July 2007, pp. 
The late newscaster Charles Kuralt once noted that, thanks to the interstate highway system in the United States, one can travel from New York to San Francisco and see absolutely nothing. The technology and information infrastructure upon which we rely is necessary but insufficient. Our users tell us, often very clearly, what they want: more and better content, more and better access, convenience, new capabilities, cost moderation if not reduction, personal control, and enhanced individual and organizational productivity. Libraries of all types continue to advance core roles in information acquisition, synthesis, navigation, dissemination, interpretation, understanding, and archiving. But the focus on get, organize, find, deliver, answer, learn, and preserve is being extended as libraries assume new and often schizophrenic roles as consumers, aggregators, pub- lishers, educators, research and development organizations, entrepreneurs, and policy advocates. Libraries of all types are being challenged to manage shifting values and to respond to critical trends. It was the former CEO of OCLC, K. Wayne Smith, who provided me with the best definition of trends. In 1996, there were 4,963 Elvis impersonators work- ing in the United States; and, by 2006, that number had increased to 27,206; and, if that trend continues, by 2016, one out of three people in this audience will be singing Hound Dog for a living. We face heightened levels of accountability and new measures of success. We en- counter new pressures for market penetration and diversification. We need to align more rigorously our resources with our priorities and focus less on strategic planning and more on strategic action. We need to focus more on risk capital, competition, business planning, and sustainability in moving from concept to customer. There is an ex- pectation of rigorous resource attraction and not just effective resource allocation as our administrative mandate. There is an expanding requirement for customization and for attention to individual needs and preferences. We need to respond to the mantra of self-service, the ATM expectations that our users bring to all service interactions. We note a wave of mutability, of constant change, of hybrid structures and approaches. Libraries of all types will be LEGACY, responsible for centuries of societal records in all formats. We will be INFRASTRUCTURE, the essential combination of space, We need to align more rigorously our resources with our priorities and focus less on strategic plan- ning and more on strategic action. James G. Neal 265 technology, systems and expertise—what I have come to call our façade, our trompe l’oeil. We will be REPOSITORY, guaranteeing the long-term availability and usability of the intellectual and cultural output. We will be PORTAL, serving as sophisticated and intelligent gateways to expanding multimedia, interactive content, and tools. We will be ENTERPRISE, leveraging our assets, advancing innovation, and building new markets and capacities. What does all this mean for the OCLC cooperative? As the late Ken Kesey, author of One Flew Over the Cuckoo’s Nest, once commented in an interview, “You can count the seeds in the apple, but you can’t count the apples in the seed.” Let’s see if I can plant some ideas that might bear some provocative fruits. What do I want OCLC to watch and to observe with more intensity? Allow me to cite eight examples: 1. 
I want OCLC to watch the transformation of the cooperation to competition continuum in the library community. We have advanced an aura of profes- sional “kumbaya” when, in fact, there is an expanding struggle for collections, staff, donors, grants, and visibility, for example. We also recognize that among the reasons for the RLG/ OCLC combination was the enhanced power achieved through new aggregation and scale and the ability to compete with more capital and agility with the new players in the information marketplace. We also need to be sensitive to perceptions of monopoly, often misdirected in my view, as we face a persistent market condition in which there are fewer providers of products and services key to the library community. 2. I want OCLC to watch the very schizophrenic organizational frameworks and structures in which libraries are evolving. Organizational charts present, at best, the “current lie” and belie the rampant shifts and informal structures that explain how priorities are established, decisions are made, resources are allocated, and power is wielded. We are increasingly integrating centralized planning and resource distribution systems with loosely coupled consultative structures with extra-institutional ventures with entrepreneurial enterprises and maverick units. How will this fluidity and vitality contribute to productivity and success, and how will it affect the working relationship between OCLC and libraries? 3. I want OCLC to watch the expanding anxiety over workforce development in libraries, the alignment of supply and need. This includes new thinking about the role and substance of professional education, our recruitment and employment strategies, our shallow commitment to staff development and lifelong learning, and our new approaches to staff retention. I have recently teased out the concept of the “raised by wolves” feral professional in our libraries. We are bringing in new librarians with diverse and non- We have advanced an aura of profes- sional “kumbaya” when, in fact, there is an expanding struggle for collections, staff, donors, grants, and visibility. Global Collaboration and the Future of the OCLC Cooperative266 MLS academic credentials. We are implementing a wide range of new and non-librarian professional assignments, now approaching 50 percent of our professional staffs in some libraries. And we continue to see formerly professional roles assumed by support staff and students. What will be the socialization implications? What will be the impact on values, outlooks, styles, and expectations? When OCLC talks to libraries, who will be on the other end of those conversations? 4. I want OCLC to watch the new visibility and renewed vigor around standards development. NISO is revitalizing the standards conceptualization, consultative, and deployment lifecycle. The goal must be standards that are transparent, open, impartial, relevant, consensus-based, performance- based, coherent, built on due process, timely, and committed to certification and ongoing registry and maintenance. Standards need to solve the right problems. But standards work must also embrace education and training, promotion and publicity, test beds, conformance monitoring, and, what I call, standards “lite” activities—such as white papers, technical reports, and best practices. Will OCLC play an expanded role in our community’s work on standards? 5. I want OCLC to watch the expanding calls for more rigorous accountability and assessment. 
This is a product of institutional expectations and g o v e r n m e n t a n d f u n d i n g agency mandates. We need effective and widely embraced measures of user satisfaction, market penetration, success, impact, cost effectiveness, and usability. How do we know if libraries meet and exceed these various evaluative tests? Too often it is as if there were three kinds of people in our profession—those who can count and those who can’t. Will OCLC play a role in library appraisal? 6. I want OCLC to watch developments around cyber infrastructure, and around text and data mining tools and capabilities. When researchers map the universe, monitor the environment, or investigate the gene, for example, they build massive data sets, much of it unstructured and often multimedia. They want tools for location, extraction, distribution, collaboration, visualization, and simulation. More and more, researchers from all fields and disciplines want to be able to search for words or phrases, to establish meanings and patterns, to link objects. They need an open-text mining interface and protocol. Will libraries play a role in building the technologies and capacities to support these arenas and add value to the processes? Or is this a market that will be dominated by the publisher and software industries? 7. I want OCLC to watch the convergence, the intersection among libraries, archives, museums, and other cultural organizations. This is part of the RLG Programs agenda—managing the collective collection, renovating I want OCLC to watch the expanding calls for more rigorous accountability and assessment. James G. Neal 267 descriptive and organizing practices, and getting more effectively from discovery to delivery to use. There is much to be learned and shared across these communities, and OCLC can create the commons for interaction and collaboration. 8. I want OCLC to watch the extraordinary innovations and experimentation around emerging technologies and to map relevant capacities in the hands of library users. We can all generate our lists, but think of the obvious—the hand-held devices and the social networking tools. Think of the less obvious—real-time speech recognition, vision systems, and intelligent robots. Where do libraries fit into the venture enterprises being launched in our universities and our communities? How do libraries break away from the mindset that quality equals content, when we so clearly understand that quality equals content plus functionality, the imbedded and integrated tools and services? What do I want OCLC to feel, to sense with more passion? Allow me to cite six examples: 1. I want OCLC to feel the new spirit of globalization, a topic you are actively discussing at this Members Council. To steal from Socrates, I am a citizen, not of Ohio, not of the United States, but of the whole world. How does OCLC build a global commonwealth that reaches beyond shared bibliographic records, beyond international collections, beyond international customers and users, beyond differences in language, in standards, in laws, and in cultural traditions across east and west, and across north and south? Let us remember what Ghandi said when he was asked, “What do you think of Western civilization?” and his response, “My, what a wonderful idea.” Poverty does not equal naïveté or a lack of understanding and sophistication. 2. I want OCLC to feel a renewed sense of partnership and collaboration with the global library community. 
We understand the business relationship that is at the core, but as Harvard researcher Rosabeth Kanter tells us, partnerships “must yield benefits for the partners…[a sense of] creating new value together…[and of] exchange, getting something back for what you put in.”1 We in the United States tend to take a narrow and more opportunistic view of relationships, evaluating them strictly in financial terms, frequently neglecting the political, cultural, and social aspects of partnerships. In collaborative ventures, we need to know ourselves and our industry, embrace the importance of personal chemistry and of the need for compatibility on strategy and values. We need strategic integration, tactical integration, operational integration, interpersonal integration, and cultural integration, as stressed by Kanter. These must be part of OCLC’s relationships with its communities. 3. I want OCLC to feel an expanded sense of social responsibility. IFLA’s core values perhaps state it best: “The belief that people, communities, and Global Collaboration and the Future of the OCLC Cooperative268 organizations need universal and equitable access to information, ideas, and works of imagination for their social, educational, cultural, democratic, and economic well being.”2 Does this mean that OCLC needs to think about and act on matters of poverty, health, oppression, literacy, and censorship? I would say YES, and let’s work together on the hows. Remember that every snowflake in an avalanche pleads not guilty. 4. I want OCLC to feel our institutional and community goals and not just library goals. How can OCLC embrace in its mission such things as the success of graduates, faculty and teacher productivity, administrative performance, community vitality, business advancement, and quality lives. Cost-effective access to worldwide information is necessary but insufficient. 5. I want OCLC to feel the tension in the library community between dissonance and harmony, between anxiety and complacency, between disruption and unity, and between chaos and order. I want OCLC to tap into and to embrace these emotions and this psychology, to recognize but also build on these conditions. Don’t fight them or avoid them. Put them on the OCLC couch and probe them, and make them work for the collective enterprise. 6. I want OCLC to feel the importance of the value proposition—what customers and members see as the utility and merit of investing in, working with, and associating with the organization. Can OCLC continue to differentiate its products and services from the offerings of competitors? Can benefits to our communities be clearly tracked and delineated? What difference does OCLC make? What do I want OCLC to do, to work on with more attention and investment? Al- low me to cite 10 examples: 1. I want OCLC to get more involved in the creative application of new media and digital technologies to all levels of teaching and learning. How can we enhance the student and teacher experience? How can we more effectively integrate the library into the online learning environment and the course management system? Our teaching and learning systems need content creation, storage and management of complex learning materials, sophisticated search and query capabilities, distribution and access tools, and new approaches to rights management. Can OCLC make a difference and introduce a sound business venture into this arena? 2. I want OCLC to expand its capacity for assisting libraries with their digital preservation and archiving needs. 
Yes, OCLC has invested substantially in this area, but the task is extraordinary, and we do not have clarity in our I want OCLC to feel the im- portance of the value propo- sition—what customers and members see as the utility and merit of investing in, working with, and associating with the organization. James G. Neal 269 community around the technical issues, the economic challenges, and the policy framework. We need to continue to protect analog information as we also preserve converted and born digital content. Libraries are committed to the persistence, stewardship, integrity, and protection of our information assets. However, we are well short of dealing effectively with the dynamic, multimedia, ephemeral, and vulnerable conditions for digital and network resources. 3. I want OCLC to advance the open revolution. Listen to our rhetoric: open source, open standards, open design, open architecture, open courseware, open knowledge, open archives, open access, and so on. Can OCLC share the value of openness and help the library community confront the barriers of market, technologies, laws, and traditional behaviors and norms? 4. I want OCLC to support more systematically the repository movement, the increasing tendency to deposit works and content in multiple places while it may also move through traditional publishing channels. We have discipline, institutional, consortium, academic unit, personal, community, and national repositories. How will the growth in the scope, rigor, complexity, and diversity of content repositories reshape the nature of collections, the integrity of sources, and the work of libraries? What will be the impact on discovery and aggregation? 5. I want OCLC to partner with libraries in defining and participating in a more rigorous entrepreneurial capacity. Libraries and the organizations that work with them must become more interested in leveraging their assets, their space, content, services, technologies, and expertise. Libraries are seeking new customers and markets for their products. This urge comes under the mandates of finances, competition, and prestige. Business development, however, requires risk capital, rigorous planning, market analysis, and cultural and legal firewalls between the commercial library and the operating library. We must also ask if e-commerce is a valuable source of revenue and reputation enhancement or a slippery and naïve slope to expensive and diverting competition. OCLC and library entrepreneurial partnerships can help us to answer these questions. Remember—it may be the early bird that gets the worm, but it is the second mouse that gets the cheese. 6. I want OCLC to help the library community to establish a research and development agenda and capacity. Librarianship is largely an information- poor information profession. We have never effectively committed to solving real problems in real situations with well-designed studies and carefully analyzed data. We generally make decisions through intuition and by the proverbial seats of our pants. An R&D capacity would enable the creation of new knowledge to be shared and used, position the library as a laboratory for experimentation, serve as a magnet for new staff and skills, offer opportunities for capitalization and technology transfer, provide support Librarianship is largely an information-poor information profession. 
for decision-making, foster a culture of risk-taking, and attract federal, foundation, and corporate investment. Does OCLC have a fundamental stake in this capacity of enabling faster movement from concept to prototype to testing to market in partnership with libraries?

7. I want OCLC to participate in and not just observe the transformation of scholarly communication. As we have advanced over three decades from the library serials crisis to the scholarly publishing revolution to the open access wars, the library community has advanced a consistent agenda: a competitive market, easy distribution and use of information, innovative applications of technology, quality assurance, and permanent archiving. We need to break away from the dysfunctional publishing marketplace, introduce new capitalistic and socialist publishing models, and advance system transforming tactics. OCLC can and should engage and make a difference by participating in the evolving scholarly communication arena.

8. I want OCLC to invest in leadership development for libraries. The key success criteria for senior leadership assignments have been transformed by an extraordinary convergence of new pressures and issues facing libraries. Concurrent with the growing demand for talented individuals to tackle these challenges, the desire to take on leadership roles seems to wane as potential leaders encounter the turbulence, stress, and demands bundled up in these jobs. Demographics demonstrate the aging of the library population, particularly directors, and document that retirements will cause an increased number of openings over the next decade, but without a generation of leaders eager and ready to assume these positions. We need OCLC to see its success as linked to the health of libraries and to the ability to recruit knowledgeable, accomplished, and savvy individuals to leadership roles.

9. I want OCLC to engage in national and global information policy matters. I understand the lobbying limitations, and I appreciate the urge to stand neutral on such matters. But we need OCLC's expertise and clout at the policy tables. Only a small number of libraries and librarians have stepped up to deal with the legislative and legal challenges. The policy issues are wide and complex: intellectual freedom, privacy, civil liberties, network development, telecommunications, government information, appropriations, copyright; and I could go on. The ability of libraries to serve their users is dramatically affected by developments in these areas. OCLC should care and find a legal way to get substantively involved. We are losing, but librarians are still active in the policy and political trenches.

10. I want OCLC to help us to conceive and define what we now refer to as "services to the network" or the post-integrated library management system. How can we best sustain the backroom operations while building a new approach to access and services, to a sophisticated array of searching, analysis, and communication tools that extend the ability of our users to work creatively and productively in and through our libraries?
What is the balance of payments between the OCLC networks and the mother ship? Are networks tools of exportation, distributed service agencies, or agents of collaboration? In any relationship, as Churchill once pointed out, to improve is to change; to be perfect is to change often. In a novel by Salman Rushdie, the Water Genie speaks about the Ocean of the Stream of Stories: "And because the stories were held in fluid form, they retained the ability to change, to become new versions of themselves, to join up with other stories, and so become yet other stories."3 Such is the history and future of the successful saga of OCLC.

James G. Neal is vice president for information services and university librarian, Columbia University, New York, New York; he may be contacted via e-mail at: jneal@columbia.edu.

Notes
1. Rosabeth Moss Kanter, "Collaborative Advantage: The Art of Alliance," Harvard Business Review (July/August 1994): 97.
2. International Federation of Library Associations and Institutions, IFLANET, "More About IFLA: Core Values Statement," International Federation of Library Associations and Institutions, http://www.ifla.org/III/intro00.htm (accessed April 23, 2007).
3. Salman Rushdie, Haroun and the Sea of Stories (London: Granta Books, 1990), 73.

work_g45rfctfabcqbeesf6ncc5hxva ---- Editorial

S. M. Balaji
Published online: 18 December 2010
© Association of Oral and Maxillofacial Surgeons of India 2010

Warm regards to all. My term of editorship has come to an end, and I wish to take stock of the development that has been achieved during this term. As of mid-November 2010, the journal had received 246 manuscripts for the year. This indicates that the journal is well received and much sought after for publication. A prime achievement of the editorial team during this tenure was arranging for online viewing and online publication even before the journal is printed. Customized login IDs and passwords have been despatched to all members of AOMSI as per the mailing list addresses; many members have received their passwords and have logged in. The editorial and peer review process of JMOSI has been completely digitized and is available on the website, which can be accessed worldwide.

Of the 246 submissions received in 2010 (up to November 15th), 45 (18.3%) are under technical check, while 144 (58.53%) were accepted after a transparent review process and 27 (10.8%) were rejected for various reasons. The acceptance rate rose to 58% during this period of editorship. The technical check currently requires 5.6 days. In 2010, the editorial team requested 1,442 reviews from 488 potential reviewers; the average number of reviews per reviewer was 3, and a review took approximately 12.9 days to complete. The mean time to a decision was 53.8 days, and the mean turnaround time of an article was about 7.7 weeks. Some 144 letters of acceptance were issued by the editorial team based on the reviewer comments. At present, about 190 articles are queued up for publication. Of the 246 submissions, 45 (18.3%) were international submissions from 16 countries, including Japan, Brazil, Iran, Turkey, South Africa, Nigeria, Egypt, the UK, the USA, and several others.
Original research from international submissions has been considered solely for the purpose of increasing the quantity of original research and citations and of meeting the standards of PubMed indexing, so as to get JMOSI indexed. Visibility and access of the articles improved in 2010. Our journal continued to be recognized by several international indices, including OCLC and Summon by Serials Solutions, and all of our published articles were visible in Google Scholar. The publication's internal workflow was suitably modified in June-July of this year to suit the PubMed acceptance protocol; this was one of the much-needed herculean tasks the editorial team undertook to index the journal in the PubMed database.

JMOSI has 475 institutions across the globe accessing the journal through various means. Of these 475 institutions, 305 are in Australia and Asia and 165 in Europe. For the first time our own journal has reached the libraries of Europe and the Americas. This attests to the increasing quality of manuscripts submitted to our journal. The journal that was a financially unstable venture for decades has now reached a stage where it generates revenue in spite of several expenditures.

The present editorial team has made the following achievements during its term: global acceptance of AOMSI and penetration of the name of JMOSI into international libraries; indexing of the journal in Google Scholar, OCLC, and Summon by Serials Solutions; quality and appearance of the journal brought up to international standards; internal workflow made compatible with the PubMed indexing requirements; Indian and international submissions on the increase; a widened subscription base with increased international subscriptions; more quality advertisements; and, most importantly, revenue being generated from the journal.

I take this opportunity to thank all the faculty and researchers for considering JMOSI for publishing their work, the sponsors for advertisements, and the reviewers for their support. I would also like to thank co-publisher Springer India for their support in making JMOSI a truly world-class journal, and the entire editorial team for their support during this tenure.

S. M. Balaji (corresponding author), No. 30, Kavinagar Bharathi Dasan Road, Teynampet, Chennai 600 018, Tamilnadu, India. e-mail: smbalaji@gmail.com

J. Maxillofac. Oral Surg. (Sept-Dec 2010) 9(4):320-321. DOI 10.1007/s12663-010-0142-4
work_gcvvd2mytrgdjpvnitrwmieuke ---- Editorial Message: IJFS Journal Information and Best Paper Award

Shun Feng Su • Jin-Tsong Jeng
Published online: 23 August 2016
© Taiwan Fuzzy Systems Association and Springer-Verlag Berlin Heidelberg 2016

1 Journal Information

The International Journal of Fuzzy Systems (IJFS) is an official journal of the Taiwan Fuzzy Systems Association (TFSA). Since its launch in 1999, IJFS has served as a popular forum for researchers who are interested in all kinds of fuzzy-related research. We look forward to receiving your contributions and bringing your high-quality work to our readers. IJFS was originally published by TFSA; since 2015, in order to reach a wider audience, IJFS has been published by Springer and is included in the SpringerLink database. At present, IJFS is abstracted and indexed in Science Citation Index Expanded (SciSearch), Journal Citation Reports/Science Edition, SCOPUS, INSPEC, Google Scholar, EBSCO, CSA, CSA Environmental Sciences, Earthquake Engineering Abstracts, EI-Compendex, Food Science and Technology Abstracts, Mathematical Reviews, OCLC, SCImago, and Summon by ProQuest. Due to a high volume of paper submissions recently, starting from 2016 IJFS will have 6 issues per year.

1.1 Aims and Scope

IJFS will consider high-quality papers that deal with theories, designs, and applications of fuzzy systems, soft computing systems, grey systems, and extension theory systems, ranging from hardware to software. Survey and expository submissions are also welcome.
1.2 Get Subscription

• Online subscription, valid for one calendar year,
• Immediate content access via SpringerLink,
• 1 volume with 6 issues per year,
• Subscription will auto-renew for another year unless duly terminated.

For the submission system, please go to the new website provided by Springer to submit your paper: https://www.editorialmanager.com/ijfs/. In order to encourage authors to submit high-quality papers to IJFS and to promote the journal, TFSA has set up the Best Paper Award for IJFS. The related information about the Best Paper Award is given below.

2 Best Paper Award

The IJFS Best Paper Award will be given every year to an outstanding paper in the field of fuzzy systems. The paper must have been published in IJFS and must present innovative approaches for solving complex problems in the theory, design, and application of fuzzy systems, soft computing systems, grey systems, and extension theory systems, from hardware to software. The winner will be awarded 300 US dollars and a certificate. Funding for this award is provided by TFSA. The awards ceremony will be held at the annual International Conference on Fuzzy Theory and Its Applications (iFUZZY), hosted by TFSA. Candidate papers for the IJFS Best Paper Award are nominated by the Associate Editors of IJFS from among the papers published in the previous year. Nominations are due by 31 March each year (for 2016, the deadline was postponed to 31 May).

Corresponding author: Shun Feng Su (sfsu@mail.ntust.edu.tw), National Taiwan University of Science and Technology, Taipei, Taiwan. Jin-Tsong Jeng, National Formosa University, Huwei, Taiwan.

Int. J. Fuzzy Syst. (2016) 18(5):731. DOI 10.1007/s40815-016-0244-3

work_gg6k44gm5vbn5mgocwsvndaz4a ---- An Analysis on the Distribution of Books on Korea in WorldCat: With a Focus on Biographies for Juvenile Readers*

Cheong-Ok Yoon (윤정옥)**

Contents: 1. Introduction; 2. Literature Review; 3. Distribution of Bibliographic Records of Biographies on Korea; 4. Analysis of Bibliographic Records of Biographies for Juvenile Readers; 5. Problems Observed in the WorldCat Analysis; 6. Conclusion

ABSTRACT The purpose of this study is to review the current status of disseminating knowledge on Korean people by analyzing biographies for juvenile and general audiences with the subject term 'Korea' in WorldCat. Languages and topics of 15,007 bibliographic records and topic facets, biographees, and holding libraries in the U.S. of 487 biographies for juvenile audiences were analyzed. Major findings are as follows: 1) 30 English biographies are held by many libraries in the U.S.
and the most popular subjects are Kim Il-song and Kim Jong-il, 2) 457 Korean biographies are not held by many libraries, and King Sejong and the General Yi Sun-sin are the most popular subjects, 3) many Korean biographies are picture books for the very young readers with the focus on Korean folklores and anecdotal biographies, and 4) there are some errors in topic facets, dispersion of biographees, and inaccurate holding lists of bibliographic records. Therefore, there seems to be little to read with interest and to promote the diffusion of knowledge on Korean people through libraries.

KEYWORDS: WorldCat, Biographies of Koreans, Knowledge Diffusion, Analysis of Bibliographic Records, Holding Libraries

* This study was supported by a research grant (special research project) from the Institute of Korean Culture, Cheongju University, for the 2015-2016 academic year.
** Professor, Department of Library and Information Science, Cheongju University (jade@cju.ac.kr)
Received: October 22, 2015; first review: October 22, 2015; accepted: November 3, 2015.
Journal of the Korean Society for Library and Information Science, 49(4): 221-239, 2015. [http://dx.doi.org/10.4275/KSLIS.2015.49.4.221]

1. Introduction

1.1 Purpose and Necessity of the Study

As of September 2015, OCLC's WorldCat contains 347,400,683 bibliographic records, and its holdings amount to 1,308,883,155 items (OCLC 2015a). Considering that in June 2013 it held 250,021,271 bibliographic records for the materials of some 72,000 libraries in about 170 countries (OCLC 2013a), nearly 100 million bibliographic records were added in barely two years. In fact, a new bibliographic record is added to WorldCat every 10 seconds. Vast in scale and growing rapidly, WorldCat has become a tool that reflects, virtually in real time, the holdings of major libraries around the world.

Accordingly, in a series of studies (Dickey 2011; O'Neill, Connaway and Dickey 2008; Lavoie, Connaway and O'Neill 2007), OCLC Research has used the WorldCat bibliographic database as a source for data mining, attempting such analyses as measuring cultural diversity on the basis of national book-publishing data, analyzing the audience levels of books, and tracing the growth of electronic resources. These attempts are in line with earlier studies (Yoon 2012; 2013), which argued that, by analyzing the bibliographic records in WorldCat, one can grasp the state of the cultural, historical, and social knowledge embedded in library collections on a given subject. Building on that work, the present study attends to WorldCat's value as an integrated subject-related information source and as a tool for the diffusion of knowledge, and examines, with a focus on biography, the prospects for diffusing knowledge about Korean people.

A biography is defined as "a record of a person's life" and "a branch of literature concerned with people's lives" (Harrod 1984). A well-written biography, in particular, is held to possess three essential elements: history, the person, and literary artistry (Sutherland and Arbuthnot 1986, 440). Through a biography, readers come to know a person and can go on to speak of many facets of the era, history, and culture in which that person lived. If libraries hold many biographies of Koreans, readers may acquire not only knowledge of individual Koreans but, potentially, knowledge of Korean history, culture, and society as well.

On this premise, this study examined how many bibliographic records of biographies of people related to Korea exist in WorldCat, which persons have been treated as their subjects, and how many libraries hold these books. In particular, by analyzing the languages, topics, biographees, and U.S. holding-library lists of biographies for juvenile readers, it sought to assess the current state of, and the prospects for, the diffusion of knowledge about Korean people.

1.2 Method and Scope of the Study

This study proceeded in the following steps, combining quantitative analysis and content analysis:

(1) A search was conducted in WorldCat's Advanced Search, limited to 'Subject: Korea', 'Content: Biography', and 'Format: Book'.
(2) The distribution and content of the retrieved bibliographic records were analyzed across three facets: Topic, Language, and Audience.
(3) The analysis then proceeded through three stages: topic analysis of biographies for general and juvenile audiences; topic analysis of juvenile biographies by language; and analysis of biographees and U.S. holding libraries. Specifically: 1) topics, languages, and audience levels: cross-tables of the data grouped by facet; 2) biographees: personal-name subject headings extracted from the Subject field of each OCLC bibliographic record; 3) holding libraries: for each OCLC record, the Holdings List limited to the United States was extracted and organized in Excel. (A sketch of how step 2 might be carried out over exported records is shown below.)
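Since the study itself worked through WorldCat's public web interface, the following is only a hypothetical sketch of how step 2 could be automated over a batch of exported MARC21 records with the pymarc library; the file name and the choice of subfields are assumptions of this illustration, not details taken from the study.

```python
# Hypothetical sketch of methodology step 2: tally personal-name subject
# headings (MARC field 600) across a file of exported MARC21 records.
# 'korea_biographies.mrc' is an assumed file name; the study itself used
# WorldCat's web interface rather than code like this.
from collections import Counter
from pymarc import MARCReader

biographees = Counter()
with open("korea_biographies.mrc", "rb") as fh:
    for record in MARCReader(fh):
        if record is None:          # skip records pymarc could not parse
            continue
        for field in record.get_fields("600"):   # personal name as subject
            name = " ".join(field.get_subfields("a", "b", "c", "d"))
            if name:
                biographees[name.strip()] += 1

# Print the twenty most frequent biographees, most common first.
for name, count in biographees.most_common(20):
    print(f"{count:4d}  {name}")
```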
2. Literature Review

Research related to WorldCat has largely been carried out by OCLC Research, OCLC's in-house research arm. In a project conducted by Dickey (2011), WorldCat data were used to profile collections by country, and on that basis the book-publishing output of six countries, including Bolivia, Chile, and Germany, was analyzed. Dickey defined data mining as "the computer-assisted analysis of databases and other large data sets to reveal new information extracted from aggregations of data," and emphasized in particular the potential value of the WorldCat database as an "aggregate collection" of bibliographic data. The project confirmed that WorldCat data can be used not only to inform decision-making in collection and service development but also as a data source for extended approaches to retrieval and access.

O'Neill, Connaway and Dickey (2008) argued that, because WorldCat is a source that identifies books and their holding libraries, holdings information can be used to infer the audience level and type for which a given item is intended. The audience level, they maintained, can be measured quantitatively from the types of libraries that selected the item, and the resulting data can be used to improve retrieval, analyze collections, and enhance readers' advisory services. They assigned weights to the 'Audience Level' codes of the MARC bibliographic 008 field, applied further weights by library type, and on that basis ranked the audience levels of a sample of 30 monographs. The researchers stressed that such information could be put to use in tools such as WorldCat's FictionFinder, WorldCat Identities, and DeweyBrowser; as of 2015, these data are in fact served on each 'WorldCat Identities page' of the experimental WorldCat Identities Network (OCLC 2015e).

Lavoie, Connaway and O'Neill (2007) applied algorithmic criteria for identifying digital materials in MARC21 records and examined the scale, types, characteristics, and holding patterns of the digital materials cataloged in WorldCat, as well as trends in the cataloging of digital materials over time. They identified electronic resources in MARC21 bibliographic data using three criteria: Type of Record = computer file (when byte 6 of the leader is "m"), Form of Item = electronic (when byte 23 or 29 of the 008 field is "s"), and General Material Designation (GMD) = electronic resource, supplemented by data in the 006, 007, 533, and 856 fields, and from these derived the aggregate characteristics of the electronic resources. They were able to identify the ten most widely held digital resources in WorldCat as of July 2005, among them Bipolar Disorders: A Guide to Helping Children & Adolescents (M. Waltz) (1,340 holdings).

Leung, Chan and Song (2006) analyzed WorldCat bibliographic records to trace publishing trends, in languages other than Chinese, for materials on traditional Chinese medicine. The study found that publishing on the subject has grown since the 1970s, with acupuncture treated especially often. Analyzing the publication records, the researchers confirmed that English-language materials account for the largest share of the subject and that awareness of Chinese medicine is gradually expanding worldwide.

Building on this line of work, OCLC Research has recently performed record exploration and data mining on the WorldCat bibliographic database to extract a variety of information, attempting to "pick new and interesting subjects and release the unique things that can be gathered from WorldCat and its data, to spark curiosity" (OCLC Research 2015a). As part of this effort, since 2014 it has analyzed WorldCat around one topic each month and published ranked topical lists. As of September 2015, seven lists had been released, including "Action & Adventure Movies of 70s, 80s, 90s and Today," and twelve were released in 2014, including "Top 10 Love Stories in Libraries" and "Top 10 Scottish Works." For example, in May 2015, to mark Inventors' Month, it released "Top 25 Inventors Found in Libraries," naming as the most popular inventors the 25 persons most frequently found in the books and films held by libraries worldwide (OCLC Research 2015a). The list includes Benjamin Franklin (1706-1790), Alexander Graham Bell (1847-1922), and Leonardo da Vinci (1452-1519); their names were extracted from the subset of records containing the FAST (Faceted Application of Subject Terminology) heading "Inventors." Personal names used as subjects (the 600 field) were then extracted from those records and ranked by the holdings of OCLC member libraries. This is possible on the premise that WorldCat can be used "to represent a vast portion of the scholarly and cultural record" (OCLC Research 2015b).

Yoon (2012) argued that the distribution of bibliographic records on a given subject in a catalog affects the potential for knowledge diffusion, and analyzed the distribution of records related to Korea and Japan in WorldCat. As libraries worldwide are now connected through international union catalogs such as WorldCat, sharing not only bibliographic information but, with growing frequency, their resources through interlibrary loan, the study argued that the catalog becomes a starting point for the exploration of knowledge and that library collections represent its potential diffusion; it did not, however, reach an empirical analysis of these correlations. In another study, Yoon (2013) examined the languages, genres, and topical characteristics of juvenile books on Korea and Japan in WorldCat and analyzed their holding libraries. A quantitative analysis of the records of juvenile books published between 1993 and 2012, and of the holdings, languages, genres, and subject headings of books published in 1997, showed that the books on Korea were concentrated in Korean-language texts, skewed toward books for preschool and lower-elementary readers, and biased in genre and topic toward folktales and anecdotal biographies. The study concluded that the prospects were limited for such books to arouse interest in, or diffuse knowledge about, Korea.
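The record-level rule described by Lavoie, Connaway and O'Neill above can be restated compactly in code. The sketch below is a simplification covering only their published leader and 008 criteria (their full method also drew on the 006, 007, 533, and 856 fields); pymarc and the sample file name are assumptions of this illustration, not part of their study.

```python
# A simplified restatement of the identification criteria of Lavoie,
# Connaway and O'Neill (2007) for digital materials in MARC21 records:
# leader byte 6 = "m" (computer file), or 008 byte 23 or 29 = "s"
# (Form of Item = electronic). Their full method also used the 006, 007,
# 533 and 856 fields, which this sketch omits.
from pymarc import MARCReader

def looks_digital(record):
    leader_type = str(record.leader)[6]          # Type of Record
    fields_008 = record.get_fields("008")
    fixed = fields_008[0].data if fields_008 else ""
    form_of_item = ""
    if len(fixed) > 23:
        form_of_item += fixed[23]                # books, serials, etc.
    if len(fixed) > 29:
        form_of_item += fixed[29]                # maps, visual materials
    return leader_type == "m" or "s" in form_of_item

with open("worldcat_sample.mrc", "rb") as fh:    # assumed sample file
    digital = [r for r in MARCReader(fh) if r and looks_digital(r)]

print(len(digital), "records flagged as digital materials")
```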
3. Distribution of Bibliographic Records of Biographies on Korea

This section analyzes the bibliographic records of biographies related to Korea in WorldCat. The distribution of all such materials by language, topic, and year was reviewed, with the audience divided into general (non-juvenile) and juvenile readers. For data extraction, bibliographic records containing Korea in the subject (su:) were searched in WorldCat between September 13 and 25, 2015, and the results were tabulated by format, language, genre, and topic facets.

Of the 373,987 records in WorldCat with Korea in the subject, 238,872 (63.9%) are records of books, and 15,007 (6.3%) are records of biographies. Records of juvenile books number 6,876 (2.9% of the total), of which 918 are biographies (13.4% of the juvenile books).

Figure 1 shows the distribution, across 28 topics such as Agriculture, Anthropology, and Art & Architecture, of the 15,007 biography records: 14,089 (93.9%) for general audiences and 918 (6.1%) for juvenile audiences. The figure adopts the English names of the topic facets as used in WorldCat, and shows only the overall pattern of distribution, without the counts or proportions of records per topic.

Figure 1. Distribution of bibliographic records of biographies for general and juvenile audiences by topic [2015.9.15.] (figure not preserved in this copy)

In both general and juvenile biographies, the topic 'History & Auxiliary Sciences' (hereafter 'History') holds an overwhelmingly large share. Among biographies for general audiences, records under History number 5,590, or 44.5%; these are followed by 'Philosophy & Religion' with 1,721 records (13.7%), 'Art & Architecture' with 881 (7.0%), and 'Business & Economics' with 825 (6.6%). Among juvenile biographies, History again accounts for the largest share with 301 records (52.3%), followed by 'Languages, Linguistics, & Literatures' with 69 (12.0%), 'Art & Architecture' with 57 (9.9%), and 'Philosophy & Religion' with 27 (4.7%); the remaining topics have very few, scattered records.

Across all 238,872 records of books on Korea, the topic History accounts for 48,277 records, or 20.2%. Compared with this, the share of History among both general and juvenile biographies is more than twice as large. This is to be expected, since biographies deal mainly with historical figures, and even persons with clear specialties or subject areas, such as musicians, scientists, and geographers, are generally placed under History.

4. Analysis of Bibliographic Records of Biographies for Juvenile Readers

4.1 Distribution of Holding Libraries by Language

Of the 918 records of juvenile biographies related to Korea in WorldCat, 817 (89.0%) describe books in Korean and 33 (3.6%) books in English; the remaining 68 (7.4%) cover books in Chinese, Vietnamese, Japanese, Swedish, and unidentified languages. From the 850 records of Korean and English books, duplicates were removed, and the 487 records carrying topic facets, and thus the 487 books they describe, were taken as the object of analysis, as shown in Table 1. Accordingly, the discussion below refers to counts of records or of books as the context requires.

Table 1 shows the distribution of U.S. holding libraries for the 457 Korean biographies and the 30 English biographies.

Table 1. Distribution of holding libraries for Korean and English biographies

  Korean biographies (457 books)        English biographies (30 books)
  Holding libraries | Books | Share     Holding libraries | Books | Share
  10 or more        |  11   |  2.4%     100 or more        |   9   | 29.0%
  9                 |   3   |  0.7%     70-99              |   3   |  9.7%
  7                 |   4   |  0.9%     50-69              |   2   |  6.5%
  6                 |   4   |  0.9%     30-49              |   3   |  9.7%
  5                 |  18   |  3.9%     10-29              |   4   | 12.9%
  4                 |  26   |  5.7%     7                  |   3   |  9.7%
  3                 |  37   |  8.1%     6                  |   5   | 16.1%
  2                 |  92   | 20.1%     5                  |   1   |  3.2%
  1                 | 131   | 28.7%     2                  |   1   |  3.2%
  0                 | 131   | 28.7%
  Total: 950 holding libraries          Total: 2,737 holding libraries

The U.S. holdings of the Korean-language and English-language books differ starkly. The 30 English biographies are held by 2,737 libraries in all, the 457 Korean biographies by 950; the 487 books together are held by 3,687 U.S. libraries. By simple arithmetic, an English book is held by an average of 91 libraries and a Korean book by an average of 2.1, but the gap between the languages is so wide that the average for Korean books is close to meaningless: as Table 1 shows, most are held by only one or two libraries, or by none at all.

Of the 457 Korean biographies, only 11 (2.4%) are held by ten or more libraries. 131 books (28.7%) are held by a single library and 92 (20.1%) by two, so that nearly half, 223 books (48.8%), are held by only one or two libraries in the United States. A further 131 books (28.7%) have no U.S. holding library at all; some of these are held in other countries such as Australia, Canada, and New Zealand, though this was not verified title by title.

The English biographies, though only 30 in number, include 9 books (29.0%) held by 100 or more libraries, some of them by several hundred. Ten books (33.3%) are held by ten or fewer libraries: three books by seven libraries, five by six, and one book each by five and by two libraries.

That English books on Korea for juvenile readers are held far more widely than Korean ones was also confirmed in an earlier study (Yoon 2013). By restricting the analysis to the biography genre, the present study reconfirms that, on the whole, U.S. library collections show little interest in books on Korean figures, and that the chances for knowledge of Koreans to be spread through books are slim.
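A distribution like Table 1 can be rebuilt mechanically once per-book holding counts are in hand. The following is a hypothetical reconstruction of that tabulation step; the small holdings dictionary merely stands in for the per-book counts the study compiled from WorldCat holdings lists in Excel.

```python
# Hypothetical reconstruction of the Table 1 tabulation: bucket each book's
# U.S. holding-library count. The 'holdings' dictionary is illustrative;
# in the study the counts came from WorldCat holdings lists kept in Excel.
from collections import Counter

holdings = {"book-001": 46, "book-002": 0, "book-003": 2}  # ... 457 entries

def bucket(count):
    return "10 or more" if count >= 10 else str(count)

distribution = Counter(bucket(c) for c in holdings.values())
total = len(holdings)

for label in ["10 or more"] + [str(n) for n in range(9, -1, -1)]:
    n = distribution.get(label, 0)
    print(f"{label:>10}: {n:4d} books ({n / total:6.1%})")
```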
4.2 Biographees and Holding Libraries of English Biographies

Since a biography is a work about a person, who the biographee is matters. Table 2 lists, in order of the number of U.S. holding libraries, the top 20 of the 30 English juvenile biographies and their biographees.

Table 2. Biographees and holding libraries of English biographies (N=20)

Title | Biographee | U.S. holding libraries
Walvoord, L., & Shute, L. (1989). We adopted you, Benjamin Koo. Niles, Ill: A. Whitman. | Koo (adopted boy) | 538
Stewart, M. (2000). Se Ri Pak: Driven to win. Brookfield, Conn: Millbrook Press. | Pak Se-ri | 388
Benson, S., & Raffaelle, G.-A. (2002). Korean War: Biographies. Detroit: UXL. | Korean War | 319
Behnke, A. (2008). Kim Jong Il's North Korea. Minneapolis, MN: Twenty-First Century Books. | Kim Jong-il | 221
Fenton, J. (1988). All the wrong places. New York: Atlantic Monthly Press. | Fenton (war correspondent) | 201
Sohn, W. T. (2003). Kim il sung and Korea's struggle: An unconventional firsthand history. Jefferson, N.C: McFarland. | Kim Il-sung | 155
Ingram, S. (2004). Kim Il Sung. San Diego: Blackbirch Press/Thomson/Gale. | Kim Il-sung | 131
Koestler-Grack, R. A. (2004). Kim Il Sung and Kim Jong Il. Philadelphia: Chelsea House. | Kim Il-sung, Kim Jong-il | 116
Piddock, C. (2007). North Korea. Milwaukee, WI: World Almanac Library. | North Korea | 105
Hart, J. (2008). Kim Jong Il: Leader of North Korea. New York: Rosen Pub. | Kim Jong-il | 90
Wyborny, S. (2009). Kim Jong Il. Detroit: Lucent Books. | Kim Jong-il | 82
Tieck, S. (2014). PSY. Minneapolis, Minnesota: ABDO Publishing Company. | Psy | 64
Werstein, I., & Papin, J. (1969). The trespassers; Korea, June 1871. New York: Dutton. | Korean-American conflict of 1871 | 57
Goldstein, N. (1999). Kim Dae-jung. Philadelphia: Chelsea House Publishers. | Kim Dae-jung | 49
Peterson, T. (2003). Korean Americans. Chicago, Ill: Heinemann Library. | immigrant families | 48
Aldridge, R. (2009). Ban Ki-Moon. New York: Chelsea House Publishers. | Ban Ki-moon | 35
Yi, Y., & Kim, Y. (1982). The hero of the Sal River battle. Seoul, Korea: Daihak Pub. Co. | Ŭlchi Mundŏk | 23
Yi, Y., & Kim, Y. (1982). Dazzling Pulguk Temple: The story of Kim Tae-song. Seoul, Korea: Daihak Pub. Co. | Kim Tae-sŏng | 22
Sheafer, S. A. (2008). President Roh Moo Hyun. Broomall, Pa: Chelsea House. | Roh Moo-hyun | 20
Chang, C., Sŏ, C., & Han, S. (2011). The patriotic story of martyr Gwansun Yu. Sŏul, Korea: Ungjin Chuniŏ. | Yu Gwan-sun | 15

The English biography held by the most U.S. libraries is We adopted you, Benjamin Koo by Linda Walvoord Girard and Linda Shute (Niles, Ill: A. Whitman, 1989). This book, the most widely held of all juvenile biographies related to Korea in any language, exists in seven editions (a 1989 printing, an electronic book, a braille edition, a 1999 printing, a 2014 electronic book, and others), held by 538 libraries in all. It is a 34-page picture book for the lower elementary grades, with subject headings including:

∙Andrews, Benjamin Koo -- Juvenile literature.
∙Adopted children -- Korea (South) -- Biography -- Juvenile literature.
∙Intercountry adoption -- United States -- Biography -- Juvenile literature.

According to its bibliographic record (OCLC No. 18383061), the book is one in which "nine-year-old Benjamin Koo Andrews, adopted from Korea as an infant, explains what it is like to grow up adopted from another country." The book is focused on adoption rather than on Korea or Koreans, but it may well shape the first impressions, or the rudimentary knowledge, of lower-elementary readers about Korea.

The Korea-related biography held by the second largest number of libraries is M. Stewart's Se Ri Pak: Driven to win (Brookfield, Conn: Millbrook Press, 2000) (OCLC No. 41572595), about the golfer Pak Se-ri. This too is a 48-page picture book at the elementary level, held in its print and electronic editions by 388 libraries.

Of the English biographies in Table 2, three treat Kim Jong-il, the late leader of North Korea, three treat Kim Il-sung, and one treats both, making these two the most frequent biographees in English juvenile biographies. All of these books were issued by U.S. publishers and are held by many libraries, from over 80 to more than 200. They appear here because a WorldCat search on the subject Korea retrieves books related to both Korea (South) and Korea (North).

Among English books on figures of South Korea, the most widely held after Se Ri Pak is S. Tieck's PSY (Minneapolis, Minnesota: ABDO Publishing Company, 2014), an elementary-level picture book about the singer Psy, held by 64 libraries. Next is N. Goldstein's Kim Dae-jung (Philadelphia: Chelsea House Publishers, 1999), on President Kim Dae-jung, held by 49 libraries. There is also one English biography each of UN Secretary-General Ban Ki-moon, President Roh Moo-hyun, and the independence activist Yu Gwan-sun. For premodern figures, there is one English biography each of Ŭlchi Mundŏk and Kim Tae-sŏng, published in Korea and held by about twenty libraries each.

The remaining 10 of the 30 English biographies are each held by fewer than ten libraries. Among them are four picture books by E. B. Adams retelling the stories of Kim Kyŏng-mun, King Munmu, Ich'adon, and Queen Sŏndŏk from Iryŏn's Samguk yusa. These carry the subject heading 'Folklore-Korea-Juvenile literature' and are simply folktales about persons, hard to regard as biographies proper; nothing in the record of, for example, the Kim Kyŏng-mun story The three good events: How a young boy became king in Silla (OCLC No. 18783299) marks it as a biography.

Another book is problematic in a different way. Leonardo da Vinci: Leonardo di ser Piero da Vinci (Seoul, South Korea: Y. kids, 2008) (OCLC No. 732322964) is a comic about the Italian Leonardo da Vinci, retrieved as a Korea-related biography only because its subject headings include 'Graphic novels-Korea (South)' and 'Korea (South)'. Five of the 30 English biographies (16.6%) were problematic in such ways, but they have been retained here to illustrate the limits of WorldCat retrieval.

4.3 Biographees and Holding Libraries of Korean Biographies

Table 3 lists, in order of the number of U.S. holding libraries, the top 20 of the 457 Korean juvenile biographies and their biographees.

Table 3. Biographees and holding libraries of Korean biographies: top 20

Title | Biographee | U.S. holding libraries
Kim Ku (1994). Paekpŏm ilchi [The diary of Paekpŏm]. Seoul: Kyerim mun'go. | Kim Ku | 46
Yi Sun-sin (2006). Nanjung ilgi [War diary], ed. Ch'oe Ch'ang-suk. Seoul: Taesŏng. | Yi Sun-sin | 40
Yi Myo-sin, ed. (2006). Mongmin simsŏ. Seoul: Taesŏng. | Chŏng Yag-yong | 36
Chang Ch'ae-hun, ed. (2000). Mongmin simsŏ. Seoul: Kkumdongsan. | Chŏng Yag-yong | 36
Yi Chŏng-ŭn & Institute of Korean Independence Movement Studies (2004). Yu Gwan-sun: A life like a flame. Ch'ŏnan: Independence Hall of Korea. | Yu Gwan-sun | 22
Hong Yang-ho (1985). Haedong myŏngjang-jŏn [Lives of famous Korean generals]. Seoul: Minjok Munhwa Ch'ujinhoe. | generals | 16
Ch'oe Sŏng-su (2007). Leading figures of our history. Seoul: Bookpia. | various | 13
Kim [?]-bae (1987). A thousand leagues for the people. Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 11
Yu Hŭi-[?] (1985). Among the people. Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 11
Song Min-ju (2001). I too once came in first. Seoul: Piryongso. | Song Min-ju (student) | 10
Yi Sang-sŏk (2010). Even shortcomings can be a strength. Seoul: Yangch'ŏlbuk. | Yi Sang-sŏk (teacher) | 10
Pyŏn Un-[?] (1985). Memoirs of the anti-Japanese guerrilla fighters. Pyongyang: Kŭmsŏng Youth Publishing House. | anti-Japanese guerrillas | 9
Chosŏn must be known (1983). Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 9
In the love of a true parent (1982). Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 9
O Ch'an-hong (1988). A lifetime on the road of struggle. Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung, Ch'oe Hyŏn | 7
Man'gyŏngdae under a rainbow (1982). Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 7
Open the new era of Juche (1984). Pyongyang: Kŭmsŏng Youth Publishing House. | Kim Il-sung | 6
Yu Chi-[?] (2009). The ŭigwe, treasures of the Chosŏn royal house. Seoul: Totobook. | King Chŏngjo | 6
Kim Mo-se (1987). 134 great Koreans told through anecdotes. Seoul: Minsŏ. | great figures | 6
Kim Mo-se (1990). Kkalkkal chosang-jŏn [Merry tales of our ancestors]. Seoul: Yerimdang. | (unspecified) | 5

(Bracketed [?] marks syllables lost in this copy of the text.)

The Korean biography held by the most U.S. libraries is Kim Ku's Paekpŏm ilchi (Seoul: Kyerim mun'go, 1994), held by 46 libraries in all. It is followed by Yi Sun-sin's Nanjung ilgi edited by Ch'oe Ch'ang-suk (Seoul: Taesŏng, 2006), held by 40 libraries, and by the two editions of Chŏng Yag-yong's Mongmin simsŏ, edited by Yi Myo-sin (Seoul: Taesŏng, 2006) and by Chang Ch'ae-hun (Seoul: Kkumdongsan, 2000), each held by 36 libraries.

A noteworthy fact is that 8 of the 20 books in Table 3 were published in North Korea, and that 7 of those take Kim Il-sung as their subject. The Kim Il-sung biographies A thousand leagues for the people and Among the people, both issued by the Kŭmsŏng Youth Publishing House in Pyongyang, are each held by 11 libraries; two further titles are held by 9 libraries each, two by 7, and one by 5. Memoirs of the anti-Japanese guerrilla fighters (Pyongyang: Kŭmsŏng Youth Publishing House, 1985), on the guerrillas who fought for independence under Japanese rule, is likewise a North Korean publication and ranks high in U.S. holdings.

Some of the holding-library lists attached to the Korean biographies in Table 3 are problematic, because the list attached to a particular book does not always contain only the libraries holding that book. For example, the U.S. holdings list of the Nanjung ilgi (Seoul: Taesŏng, 2006) names 40 libraries, but clicking through to individual libraries shows them holding other editions of the Nanjung ilgi: Cornell University Library links to No Sŭng-sŏk's translation (Seoul: Dong-A Ilbo, 2005), and Brown University Library to Hŏ Kyŏng-jin's translation (Seoul: Chungang Books, 2014).

This confusion of holdings lists also shows in the fact that the two distinct books, Yi Myo-sin's Mongmin simsŏ (Seoul: Taesŏng, 2006) and Chang Ch'ae-hun's Mongmin simsŏ (Seoul: Kkumdongsan, 2000), are each reported as held by 36 libraries. The holdings lists of the two books, headed "Displaying libraries 1-6 out of 36 for all 62 editions," are identical, and the first entry, Arizona State University Libraries, links to the record of the annotated Mongmin simsŏ of the Tasan Research Society (Seoul: Ch'angjak kwa Pip'yŏngsa, 2003). In other words, the holdings list attached to a given book may not list the libraries holding that book; the problem arises when, as with the Nanjung ilgi or the Mongmin simsŏ, many translations or versions have been issued by different publishers. It may be that these holdings lists are organized around the work rather than the individual item. The rankings by holding libraries in Table 3 therefore call for some caution.

The holdings lists of the English biographies in Table 2 do not have this problem. For example, Alison Behnke's Kim Jong Il's North Korea (Minneapolis, MN: Twenty-First Century Books, 2008) is held by 221 libraries in all; checking through 'View all editions and formats' in the record shows one 2008 print edition, one 2008 electronic book, and revised electronic editions of 2012 and 2014, and the holdings list includes any library holding any of the four. Similarly, Sohn Won Tai's Kim il sung and Korea's struggle: An unconventional firsthand history (Jefferson, N.C: McFarland, 2003) is held in five editions by 155 libraries: a library is included if it holds any of the four 2003 print editions (OCLC Nos. 491048058, 237812579, 249140791, 51914243) or the electronic book (OCLC No. 606953271).

4.4 Topic Distribution of Korean Biographies

Of the 457 Korean juvenile biographies, 429 (93.8%) specify an individual, a group, or a theme as their subject, while 28 (6.2%), such as Kkalkkal chosang-jŏn (Seoul: Yerimdang, 1990) in Table 3, do not clearly specify whom the biography treats. The subjects treated number 181, and as Table 4 shows, they fall into three types: 154 individuals (85.1%), such as Kim Ku, King Sejong, and Yi Sun-sin; 22 groups (12.2%), such as the anti-Japanese guerrillas, famous people, scientists, and independence activists; and 5 themes (2.8%): tree worship, the Samguk sagi, national security, sijo poetry, and success.

Table 4. Subjects of Korean biographies (N=181)

Individuals (154 persons, 412 books):
  King Sejong, Yi Sun-sin: 2 subjects, 17 books each (34 books)
  Kim Ku: 11 books; Kim Il-sung: 10 books
  Sin Saimdang, Kim Yu-sin: 2 subjects, 9 books each (18)
  An Chung-gŭn, King Kwanggaet'o, Chŏng Yag-yong: 3 subjects, 8 each (24)
  Kim Chŏng-ho, Sin Ch'ae-ho, Wang Kŏn, Yu Gwan-sun, Chang Po-go: 5 subjects, 7 each (35)
  An Ch'ang-ho, Yun Sŏk-chung: 2 subjects, 6 each (12)
  Kang Kam-ch'an, Kim Hong-do, Wŏnhyo, Ŭlchi Mundŏk, Yi Sŏng-gye, Chu Si-gyŏng, Han Sŏk-pong, Hŏ Chun: 8 subjects, 5 each (40)
  King Tongmyŏng, Yu Il-han, Yi Chung-sŏp, Chang Yŏng-sil, Hwang Hŭi: 5 subjects, 4 each (20)
  Kyebaek, Kwŏn Yul, and others: 13 subjects, 3 each (39)
  U Chang-ch'un, Yi T'ae-yŏng, Chang Ki-ryŏ, and others: 27 subjects, 2 each (54)
  Ch'a Pŏm-gŭn, Hwang U-sŏk, Yi Hwi-so, and others: 85 subjects, 1 each (85)
Groups (22 groups, 42 books):
  anti-Japanese guerrillas: 16 books
  scientists, great writers, famous people, women, hwarang: 5 groups, 2 each (10)
  film directors, orphans, independence activists, cultural figures, the six martyred ministers, prodigies, teenagers, children, artists, kings, athletes, generals, ministers of state, military comfort women, explorers, painters: 16 groups, 1 each (16)
Themes (5 themes, 5 books):
  tree worship, the Samguk sagi, sijo, national security, success: 1 each
Total: 181 subjects, 429 books

Across individuals and groups alike, the persons most often taken as subjects are King Sejong and Yi Sun-sin, each treated in 17 books. Yet U.S. libraries do not hold many of these books. The Sejong biographies are held by 26 libraries in all: two books, among them King Sejong's childhood (Seoul: King Sejong Memorial Society, 1984), are held by 4 libraries each, ten others by one or two libraries, and five, though recorded in WorldCat, by no U.S. library at all. The Yi Sun-sin biographies are held by 58 libraries in all: the Nanjung ilgi edited by Ch'oe Ch'ang-suk (Seoul: Taesŏng, 2006) is reported in 40 libraries, two books, including Yi Sun-sin (Seoul: Dong-A, 1987), in 4 each, and six in only one or two; eight, including Yi Sun-sin (Seoul: Mun'gongsa, 1992), have no holding library.

Kim Ku is the subject of 11 biographies held by 58 libraries in all. As noted above, the Paekpŏm ilchi is held by 46 libraries, the most for any single Korean title, while four other books on Kim Ku are held by one or two libraries each and four, including Kim Ku, great star of the independence movement (Seoul: Minsŏ, 1988), by none. Kim Il-sung is the subject of 10 biographies held by 68 libraries in all, two held by 11 libraries each and two by 9, so that, for Korean-language books, the average holdings per book are comparatively high.

As noted, more than half of the 154 individuals, 85 persons (55.2%), are each treated in a single biography. Twenty of these books, among them The story of Kim Sun-kwon, the corn doctor (Seoul: Uri Kyoyuk, 2000) and Sŏk Chu-myŏng's butterfly story (Paju: Bookstar, 2011), are held by a single U.S. library.

Among the 22 groups, the anti-Japanese guerrillas loom largest, treated in 16 books, all published in North Korea. The Memoirs of the anti-Japanese guerrilla fighters (Pyongyang: Kŭmsŏng Youth Publishing House, 1998) is held by 9 U.S. libraries, six other titles by two to six libraries each, and one by none. Five groups, scientists, great writers, famous people, women, and the hwarang, are each the subject of two biographies, and the remaining groups, such as film directors, orphans, and independence activists, of one each.

The five themes, tree worship, success, and the rest, are each treated in one biography. Of these, The experts of failure: great stories of failure told by the mentors of our time (Seoul: Saemt'ŏ, 2012) is held by 4 U.S. libraries; the national-security title The shining security consciousness in my heart (Seoul: National Police Agency, 2010) and Eight sijo that carry history (Seoul: Ŏrini Chakka Chŏngsin, 2008) by 2 each; and the remaining two by one library each.

4.5 Characteristics of the Biographees of Korean Biographies

The 154 individuals treated in the Korean juvenile biographies were examined by period of activity and by gender. Their distribution by period is shown in Figure 2.

Figure 2. Distribution of individual biographees by period (N=154) (figure not preserved in this copy)

Contemporary figures, such as Ch'u Song-ung and Pak Su-gŭn, are the most numerous at 53 (34.4%); 33 (21.4%), such as Chŏng Yag-yong and Yi Sun-sin, belong to the Chosŏn period; and 27 (17.5%), such as Sin Ch'ae-ho and Sŏ Chae-p'il, to the modern period including the Japanese colonial era. Figures of Silla, such as Queen Sŏndŏk and Kim Tae-sŏng, number 19 (12.3%); Koryŏ and Koguryŏ account for 7 each (4.5%); and Paekche, with Ado hwasang and others, for 5 (3.2%). Kojosŏn, Parhae, and one further early period account for one figure each (0.6%). The weight of modern and contemporary figures is overwhelming.

Of the 154 individuals, 10 (6.4%) are women: the straw-culture advocate In Pyŏng-sŏn, along with Sin Saimdang, Yu Gwan-sun, Lady Hyegyŏng, Hwang Hye-sŏng, Kim Hwal-lan, Na Hye-sŏk, Yi Chŏng-[?], Song Min-ju, and Cho Hwa-sun. Sin Saimdang (9 books) and Yu Gwan-sun (6) are the most frequent female subjects; apart from the culinary scholar Hwang Hye-sŏng, the subject of two books including Hwang Hye-sŏng: she brought our cuisine to the world (Seoul: Ch'ŏngŏram Media, 2012), the other women are each treated in a single book. Not all are celebrated names: Cho Hwa-sun is a labor activist, and Song Min-ju a student who published her diary as a book. Yi Chŏng-[?], meanwhile, is the subject of Chang Chae-gap's picture book The identity of the white tiger (Pyongyang: Kŭmsŏng Youth Publishing House, 2012) (OCLC No. 845102830), held by four U.S. libraries, and it is hard to make out what she did.

At the group level, there are two biographies of women as such, Letters from Korean women's history: from Grandmother Mago to Anti-Miss Korea (Seoul: Ch'aekkwa Hamkke Ŏrini, 2009) and I will live with my head held high: the women who shook Korean history (Seoul: P'urŭn Namu, 2005), as well as one on the 'comfort women' conscripted under Japanese rule, The winter that has not ended (Paju: Pori, 2010), held by 3 U.S. libraries.

That modern and contemporary figures dominate the Korean juvenile biographies in WorldCat, or that women figure so little, may reflect the interests of the holding libraries. Before drawing that conclusion, however, one would also need to analyze Korean biography publishing itself, to see whether readable biographies of Korean women, or of premodern figures, worth adding to library collections, are being published in sufficient numbers.

5. Problems Observed in the WorldCat Analysis

This study reviewed the bibliographic records of biographies related to Korea in WorldCat and examined, for the juvenile biographies in particular, the distribution of their subjects and their U.S. holding libraries. The findings may be summarized as follows:

First, WorldCat contains 238,872 records of books with Korea in the subject. Of these, 15,007 are records of biographies: 14,089 for general audiences and 918 for juvenile audiences. Analysis of the 28 topic facets showed the topic History dominant, at 44.5% of the general biographies and 52.3% of the juvenile ones.

Second, the 457 Korean-language books open to topic-facet analysis are held by 950 U.S. libraries and the 30 English-language books by 2,737, for a total of 3,687. An English book is held by an average of 91.2 libraries, whereas nearly half of the Korean books, 223 (48.8%), are held by only one or two libraries, and 131 have no U.S. holding library.

Third, the book held by the most U.S. libraries is We adopted you, Benjamin Koo, on an adopted child (538 libraries); among Korean books it is Kim Ku's Paekpŏm ilchi (46). The holdings lists of some Korean books, however, appear to be organized around the work rather than the individual book, and call for caution.

Fourth, the subjects of the 457 Korean biographies comprise 154 individuals, 22 groups, and 5 themes, 181 in all. King Sejong, Yi Sun-sin, the anti-Japanese guerrillas, Kim Ku, and Kim Il-sung each appear in ten or more books, while 85 persons, among them Ch'a Pŏm-gŭn and Yi Hwi-so, are treated in a single book each. In the 30 English biographies, Kim Il-sung and Kim Jong-il are treated in four books each, while 17 persons, including Fenton and Pak Se-ri, and five other subjects, such as the Korean War and the Korean-American conflict of 1871, appear in one book each. Biographies treating a diversity of Korean figures in any depth are scarcely to be seen.

These findings could be established by fairly simple quantitative analysis of the records. In analyzing the topic facets, biographees, and holding libraries of the books they describe, however, the following problems were observed:

First, books on the same person were assigned different topic facets. Of the 181 subjects of the Korean biographies, the biographies of 30 persons or groups carry dispersed topic facets. The 17 books on King Sejong are scattered across three topics: Languages, Linguistics, & Literatures (2 books), Political Science (1), and History (14). The 17 books on Yi Sun-sin are scattered across four: Languages, Linguistics, & Literatures (1), Political Science (1), Library & Information Science, Generalities & Reference (1), and History (14). The 8 books on Chŏng Yag-yong are spread over five topics: Political Science (1), Library & Information Science, Generalities & Reference (1), Languages, Linguistics, & Literatures (1), History (1), and Philosophy & Religion (4).

To take one example, among the books on the military commander Kang Kam-ch'an, Kang Kam-ch'an (Seoul: Dong-A, 1987) is classed under Languages, Linguistics, & Literatures, General Kang Kam-ch'an (Seoul: Chungang Munhwasa, 1977) under Anthropology, and Kang Kam-ch'an (Seoul: Kyemongsa, 1986) under Political Science, while Kang Kam-ch'an (Seoul: Hongmun Sŏgwan, 1997) and Kang Kam-ch'an, Ch'oe Yŏng (Seoul: Samik, 1996) sit under History: biographies of one man dispersed across four topics including History.

Nor is the dispersion confined to figures treated in many books. Of the two biographies of the lawyer Yi T'ae-yŏng, Yi T'ae-yŏng (Seoul: Woongjin, 2005) is classed under Law and Yi T'ae-yŏng (Seoul: Dongsŏ Munhwasa, 1984) under Sociology; of the two books on Hwang Hye-sŏng, one is under Engineering & Technology and the other under Languages, Linguistics, & Literatures. The dispersion may arise because the libraries contributing records to WorldCat treat biography differently. But since many of these books are elementary-level picture books, often for preschool or lower-elementary readers, and hardly full-fledged biographies of specialists in a given field, such topical dispersion is inappropriate.

Second, there were cases of outright misclassification, which account for part of the dispersion just described. The record (OCLC No. 23108812) of Han Sŏk-pong (Seoul: Yerimdang, 1990), a biography of the calligrapher, carries the heading 'Calligraphers-Korea-Biography-Juvenile literature', yet sits under Business & Economics with the subtopic 'Writings'. The record (OCLC No. 895326133) of Kim T'aek-chin (Seoul: Tasan Ŏrini, 2014), on the head of NCSOFT, carries 'Electronic games industry--Korea (South)--Biography--Juvenile literature', yet is placed under Physical Education with the subtopic 'Games and Amusements'. The record (OCLC No. 27788730) of Yi Sun-sin (Seoul: Woongjin, 1987) carries 'Korea--History--Japanese Invasions, 1592-1598--Juvenile literature' but sits under Languages, Linguistics, & Literatures without a subtopic; and the record (OCLC No. 22895277) of Kang Kam-ch'an (Seoul: Dong-A, 1987), with the headings 'Generals-Korea-Biography-Juvenile literature' and 'Korea-History--Juvenile literature', sits under Languages, Linguistics, & Literatures with the subtopic 'American Literature'. Not only the topic facets, then, but also the subtopic classes beneath them are sometimes in error.

Third, there are biographies of problematic figures. This is less a problem of WorldCat than of individual library collections. Hwang Woo-suk's dream: strike the rock with an egg! Move heaven! Make Korea the center of the world and the salvation of humankind! (Seoul: Dongsŏ Munhwasa, 2005) (OCLC No. 62903352), a biography of the discredited Dr. Hwang Woo-suk, is classed under Life Sciences with headings including 'Geneticists-Korea (South)--Biography' and 'Stem cells-Transplantation-Research--Korea (South)--Juvenile literature'. It is troubling that a biography of a man who distorted the truth sits in library collections as the only reading available to a user curious about a Korean scientist. A biography extolling a person whose conduct was later judged improper, written before the truth emerged, poses the same problem as titles such as Kim Sŏng-su (Seoul: Dongsŏ Munhwasa, 1984) and Dr. Kim Hwal-lan, who lived by faith (Seoul: Boisŭsa, 1982), biographies of figures prominent in the press, politics, and education but identified as pro-Japanese collaborators. Rather than portraying a life with its merits and faults honestly, such books reflect what the historian Jeon U-yong has criticized as the Korean biography market's habit of manufacturing a one-size-fits-all 'model of the great person': "extraordinary gifts evident from childhood, an indomitable will overcoming every adversity, a flawless character in which defects are hard to find" (The Hankyoreh 2015). For library users abroad who have few other readings on Korea and Koreans to choose from, they may well convey a biased knowledge.

Fourth, there is the problem of inaccurate holdings lists. As seen with the Paekpŏm ilchi, the Nanjung ilgi, and the Mongmin simsŏ, the holdings list attached to the record of a particular book may cover the holdings of the work rather than of that book. How often books and works are mixed in this way in other holdings lists is hard to determine. In the WorldCat Identities Network, for instance, the identity record for Kim Ku lists the Paekpŏm ilchi among 'the most widely held works about Kim Ku' (OCLC 2015f); it links to the record (OCLC No. 50944888) of To Chin-sun's annotated Paekpŏm ilchi (Seoul: Tolbegae, 2002), and all 25 U.S. libraries in its holdings list do hold that very book. But several of the lists examined in this study clearly confuse items and works, and further analysis is needed.

6. Conclusion

WorldCat is itself an "aggregate collection," but virtual collections can also be composed within it around any of the subjects it records. On the evidence reviewed here, one can speak broadly of an aggregate collection of biographies related to Korea, and more narrowly of an aggregate collection of such books held in U.S. libraries; through such collections, knowledge of Korean people, culture, and history might spread. But this aggregate collection is an accumulation of data on books chosen according to the policies and interests of individual libraries, so neither the books nor the bibliographic records describing them are of uniform quality. Hence the errors and confusions observed in this study in the topics, biographees, and holdings lists of the records of biographies related to Korea.

The OCLC studies cited above could freely analyze the internal data of the WorldCat bibliographic database, making many kinds of research possible. The present study, by contrast, is limited to the external data of the bibliographic records as exposed to WorldCat users. It has gone no further than describing the problems observable within the reach of direct inspection, without establishing their causes or seeking remedies. Deeper analysis of these problems, and proposals for their improvement, remain as tasks for future research.

References

[1] Yoon, Cheong-Ok. 2012. "A Discourse on the Role of Library Catalogs as a Tool for Knowledge Distribution." Journal of Korean Library and Information Science Society, 43(1): 123-141.
[2] Yoon, Cheong-Ok. 2013. "An Analysis on the Juvenile Books on Korea and Japan in the WorldCat." Journal of the Korean Society for Library and Information Science, 47(3): 5-23.
[3] "Jeon U-yong's Things That Made the Modern Era: Biographies of Great People." 2015. The Hankyoreh. October 13. p.29.
[4] Dickey, T. J. 2011. "Books as Expressions of Global Cultural Diversity." Library Resources & Technical Services, 55(3): 148-162.
[5] Harrod, L. M. 1984. Harrod's Librarians' Glossary of Terms Used in Librarianship, Documentation and the Book Crafts, and Reference Book. 5th ed. Farnham, UK: Gower Publishing Company.
[6] Lavoie, B. F., Connaway, L. S. and O'Neill, E. T. 2007. "Mapping WorldCat's Digital Landscape." Library Resources & Technical Services, 51(2): 106-115.
[7] Leung, S., Chan, K. and Song, L. 2006. "Publishing Trends in Chinese Medicine and Related Subjects Documented in WorldCat." Health Information & Libraries Journal, 23(1): 13-22.
[8] Online Computer Library Center. 2015a. WorldCat Facts and Statistics. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 9. 3.]
[9] Online Computer Library Center. 2015b. Directory of OCLC Members. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 9. 3.]
[10] Online Computer Library Center. 2015c. Refining a Search. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 10. 1.]
[11] Online Computer Library Center. 2015d. Enriching WorldCat with FAST. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 10. 11.]
[12] Online Computer Library Center. 2015e. WorldCat Identities Network. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 10. 21.]
[13] Online Computer Library Center. 2015f. 金九 1876-1949. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 10. 13.]
[14] Online Computer Library Center. 2013a. WorldCat Facts and Statistics. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 6. 8.]
[15] Online Computer Library Center. 2013b. OCLC WorldCat. Dublin, USA: Online Computer Library Center. [online] [cited 2013. 6. 8.]
[16] Online Computer Library Center. 2013c. WorldCat Database Reaches 2 Billion Holdings. Dublin, USA: Online Computer Library Center. [online] [cited 2013. 6. 8.]
[17] OCLC Research. 2015a. What in the WorldCat? Dublin, USA: Online Computer Library Center. [online] [cited 2015. 9. 25.]
[18] OCLC Research. 2015b. Top 25 Inventors Found in Libraries: Can You Name Their Most Famous Inventions? Dublin, USA: Online Computer Library Center. [online] [cited 2015. 9. 25.]
[19] OCLC Research. 2015c. Faceted Application of Subject Terminology. Dublin, USA: Online Computer Library Center. [online] [cited 2015. 9. 25.]
[20] O'Neill, E. T., Connaway, L. S. and Dickey, T. J. 2008. "Estimating the Audience Level for Library Resources." Journal of the American Society for Information Science & Technology, 59(13): 2042-2050.
[21] Sutherland, Z. and Arbuthnot, M. H. 1986. Children and Books. 7th ed. New Jersey, USA: Allyn & Bacon.

work_glzzjlmdcbbtfajjlkwokwlzpq ---- Library of Congress controlled vocabularies and their application to the Semantic Web

By Corey A. Harper and Barbara B. Tillett

SUMMARY: This article discusses how various controlled vocabularies, classification schemes and thesauri can serve as some of the building blocks of the Semantic Web. These vocabularies have been developed over the course of decades, and can be put to great use in the development of robust web services and Semantic Web technologies. The article covers how initial collaboration between the Semantic Web, library and metadata communities is creating partnerships to complete work in this area. It then discusses some core principles of authority control before talking more specifically about subject and genre vocabularies and name authority. It is hoped that future systems for internationally shared authority data will link the world's authority data from trusted sources to benefit users worldwide. Finally, the article looks at how encoding and markup of vocabularies can help ensure compatibility with the current and future state of Semantic Web development and provides examples of how this work can help improve the findability and navigation of information on the World Wide Web.

KEYWORDS: Controlled vocabularies, Semantic Web building blocks, authority control

1 Introduction: Library of Congress Tools and Launching the Semantic Web

An essential process is the joining together of subcultures when a wider common language is needed. Often two groups independently develop very similar concepts, and describing the relation between them brings great benefits… The Semantic Web, in naming every concept simply by a URI, lets anyone express new concepts that they invent with minimal effort. Its unifying logical language will enable these concepts to be progressively linked into a universal Web.

Thus concludes Tim Berners-Lee's seminal 2001 Scientific American article on the Semantic Web (Berners-Lee, Hendler & Lassila, 2001). The concepts established here are strong ones, and the vision of a thoroughly interconnected Web of data that these concepts suggest could prove a catalyst for the way we research, develop, interact with, and build upon ideas, culture and knowledge.
The idea presented here, of independent groups working with similar concepts, is applicable to the very technologies that can serve as the Semantic Web's underpinnings. The Semantic Web communities and library communities have both been working toward the same set of goals: naming concepts, naming entities, and bringing different forms of those names together. Semantic Web efforts toward this end are relatively new, whereas libraries have been doing work in this area for hundreds of years. The tools and vocabularies developed in libraries, particularly those developed by the Library of Congress, are sophisticated and advanced. When translated into Semantic Web technologies, they will help to realize Berners-Lee's vision.

Semantic Web technologies are now, in their own right, starting to reach a state of maturity. Berners-Lee, the director of the World Wide Web Consortium (W3C), and Eric Miller, W3C Semantic Web Activity Lead, frequently describe these technologies in terms of the Semantic Web Stack (see Figure 1). Many of the components depicted as layers in the Semantic Web Stack are already in place, although not nearly as widely implemented as most Semantic Web proselytizers would like. Development on the various levels depicted in this graphic has been a long time in the making. In a recent interview with Andrew Updegrove (2005) in the Consortium Standards Bulletin, Berners-Lee identifies one cause of the slow rate of development, implying that each layer is dependent on the layers below it: "We were asked to hold up the query and rules work because people didn't want to start on it until the ontology work had finished, so for some we were in danger of going too fast".

Figure 1: W3C Semantic Web Stack. Taken from W3C website, licensed under CC Attribution 2.5 License. (figure not preserved in this copy)

As the technologies represented by the Semantic Web Stack continue to mature, there is a tremendous potential for the library community to play a significant role in realizing Berners-Lee's vision. In a presentation at Dublin Core 2004 (Miller, 2004, Slide 26), as well as in a number of other presentations over the years, Eric Miller implored the library community to become active in Semantic Web development. Miller outlines the role of libraries in the Semantic Web as follows:

• "Exposing collections – use Semantic Web technologies to make content available
• Web'ifying Thesaurus / Mappings / Services
• Sharing lessons learned
• Persistence"

While all of these roles are significant, the idea of moving thesauri, controlled vocabularies, and related services into formats that are better able to work with other Web services and software applications is particularly important. Converting these tools and vocabularies to Semantic Web standards, such as the Web Ontology Language (OWL), will provide limitless potential for putting them to use in myriad new ways. This will enable the integration of research functionality – such as searching and browsing diverse resources, verifying the identity of a particular resource's author, or browsing sets of topics related to a particular concept – into all sorts of tools, from online reference sources and library catalogs to authoring tools like those found in Microsoft's Office Suite. Miller (2004, Slide 27) also emphasizes the role that libraries can play in helping to realize the trust layer in the Semantic Web Stack, stating, "Libraries have long standing trusted position that is applicable on the Web".
The Semantic Web has a lot to gain by recruiting libraries and librarians and involving them in the development process. The W3C's stated mission is "to lead the World Wide Web to its full potential by developing protocols and guidelines that ensure long-term growth for the Web". This focus on protocols and guidelines helps explain why the Semantic Web Stack includes little to no mention of content. For example, it includes an ontology layer, which is primarily represented by OWL (the Web Ontology Language), a specification for adding Semantic Web enabling functionality to existing ontologies. More recently, another Semantic Web technology, Simple Knowledge Organization System (SKOS) Core, has been designed for the encoding of the contents of thesauri. The W3C's emphasis has been on how to encode ontologies, which fits with its stated mission. The source of ontologies and vocabularies is outside the scope of the W3C's concerns, although the usefulness of such ontologies is certainly dependent on their validity and trustworthiness. This is where Miller's thoughts on the role of libraries seem most relevant. Libraries have a long-standing history of developing, implementing, and providing tools and services that make use of numerous controlled vocabularies. Presumably, part of the process of "web'ifying thesaurus, mappings and services" involves converting existing tools into Semantic Web standards, such as OWL and SKOS. Miller and other vocabulary experts recognize that this would be of tremendous value to Semantic Web initiatives. Taking these steps would reduce the need for Semantic Web development to revisit decisions made over the centuries that libraries have been organizing and describing content, which ties into the related idea of "sharing lessons learned".

2 Incremental Progress – Initial Developments

Progress on bridging the gap between the Semantic Web community and the library community has been underway for some time. A variety of projects are in progress, or completed, that will help to bring more of the tools that libraries develop into the Semantic Web and more general Web services spheres. Many of these projects are being developed within the Dublin Core Metadata Initiative (DCMI), which draws heavily on both the Semantic Web and library communities, as well as a variety of other information architecture and metadata communities.

One example of such collaboration is the expression of a sub-set of MARC Relator Terms (Network Development and MARC Standards Office – Library of Congress, 2006) in RDF for use as refinements of the DC contributor element. Relator terms allow a cataloger to specify the role that an individual played in the creation of a resource, such as illustrator, calligrapher, or editor. Using some of these terms as refinements of contributor allows the expression of much more specific relationships between individuals and the resources they create. An example of the use of MARC Relator Terms, both in a MARC record and in Dublin Core, can be seen in Figure 2. Figure 2a depicts a mnemonic MARC record with personal name added entries for correspondents. The 700 tags at the end of this record each include a subfield e, which contains a MARC Relator term identifying the role of these individuals.
The corresponding Dublin Core record in 2b represents this same information using the 'marcrel' namespace with the Relator code 'CRP', which corresponds to the term 'correspondent.' The concept of author added entry, represented by MARC tag 700 in 2a, is implicit in the Dublin Core example because the Relator terms are all element refinements of dc:contributor.

Figure 2a: Relator Terms in MARC (shown in MARC tag 700, subfield e). (figure not preserved in this copy)

Figure 2b: MARC Relator Terms in DC (shown in the <marcrel:CRP> tag). (figure not preserved in this copy)

Another example of collaboration can be seen in the ongoing work to bind the Metadata Object Description Schema (MODS) metadata element set (Library of Congress, 2006, June 13) to RDF and to the DC Abstract Model, so that MODS terms, or alternately new DC properties derived from or related to these MODS terms, could be available to Dublin Core Application Profiles (Heery & Patel, 2000). Similar collaborations between the Institute of Electrical and Electronics Engineers Learning Object Metadata group (IEEE-LOM) (IEEE Learning Technology Standards Committee, 2002) and DCMI resulted in an RDF binding of IEEE-LOM, which essentially serves to make IEEE-LOM metadata statements and records usable within the context of the Semantic Web. A summary of this process was reported at the Ariadne Conference in 2003 (Nilsson, Palmer & Brace, 2003, November). This presentation resulted in a joint task force between the IEEE and DCMI communities to formally map the IEEE-LOM schema to the DCMI Abstract Model as well. Progress on this process can be tracked on the DC Education Working Group's Wiki (Joint DCMI/IEEE LTSC Taskforce, 2006, March).

However, both of these examples apply to metadata elements and resource descriptions themselves. Incorporating elements used in library resource descriptions into the sets of resource properties that are used to enable the Semantic Web is a large step, and enables library metadata to interoperate with Dublin Core and other RDF-encoded metadata. Authority records, library thesauri, and library controlled vocabularies, if converted into formats that support Semantic Web technologies, have an even greater potential for revolutionizing the way users – and machines – interact with information on the Internet.

3 Authority Control – Core Principles

The benefits and virtues of authority control have been debated and restated for decades. When we apply authority control, we are reminded how it brings precision to searches, how the syndetic structure of references enables navigation and provides explanations for variations and inconsistencies, how the controlled forms of names, titles, and subjects help collocate works in displays, and how we can actually link to the authorized forms of names, titles, and subjects that are used in various tools, like directories, biographies, abstracting and indexing services, and so on. We can use the linking capability to include library catalogs in the mix of various tools that are available on the Web. In order to enable these capabilities in a Web-based environment, it will be necessary to move the records that facilitate linking and aggregating in the library into more universal and generalized encodings. The Library of Congress is taking steps in this direction, particularly through work on MARCXML for Authority Records (Library of Congress, 2005, April 22) and the Metadata Authority Description Schema (MADS) (Library of Congress, 2005, December 14), along with crosswalks to go between them. A next logical step is to begin work on translating this data into the Resource Description Framework (RDF), which is not a trivial task.

4 Subject & Genre Vocabularies for the Semantic Web

There are a vast number of controlled vocabularies for various forms of subject access to library materials. Some of these vocabularies are classification schemes, such as Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC). Others are controlled lists of subject headings or terms, which adhere to national and international guidelines for thesaurus construction, like the Library of Congress Subject Headings (LCSH), the two Thesauri for Graphic Materials: Subject Terms (TGM I) and Genre & Physical Characteristic Terms (TGM II), Guidelines on Subject Access to Individual Works of Fiction, Drama, Etc. (GSAFD), and the Ethnographic Thesaurus. Some of the subject thesauri published by the Getty, such as the Getty Thesaurus of Geographic Names (TGN) and the Art and Architecture Thesaurus (AAT), are truly hierarchical thesauri.
The Library of Congress is taking steps in this direction, particularly through work on MARCXML for Authority Records (Library of Congress, 2005, April 22) and the Metadata Authority Description Schema (MADS) (Library of Congress, 2005, December 14), along with crosswalks to go between them. A next logical step is to begin work on translating this data into the Resource Description Framework (RDF), which is not a trivial task.

4 Subject & Genre Vocabularies for the Semantic Web

There are a vast number of controlled vocabularies for various forms of subject access to library materials. Some of these vocabularies are classification schemes, such as Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC). Others are controlled lists of subject headings or terms, which adhere to national and international guidelines for thesaurus construction, like the Library of Congress Subject Headings (LCSH), the two Thesauri for Graphic Materials: Subject Terms (TGM I) and Genre & Physical Characteristic Terms (TGM II), Guidelines on Subject Access to Individual Works of Fiction, Drama, Etc. (GSAFD), and the Ethnographic Thesaurus. Some of the subject thesauri published by The Getty, such as The Getty Thesaurus of Geographic Names (TGN) and The Art and Architecture Thesaurus (AAT), are truly hierarchical thesauri.

Many of these controlled vocabularies are registered as DCMI Encoding Schemes in the DCMI Terms namespace. Registered Encoding Schemes qualifying the DC Subject element include (DCMI Usage Board, 2005, January 10):

• Dewey Decimal Classification (DDC)
• Library of Congress Classification (LCC)
• Library of Congress Subject Headings (LCSH)
• Medical Subject Headings (MeSH)
• National Library of Medicine Classification (NLM)
• The Getty Thesaurus of Geographic Names (TGN)
• Universal Decimal Classification (UDC)

However, when these vocabularies are used as values of DC Subject in metadata records outside of the library context, the syndetic structure of the source vocabulary is all but lost. There is no way for search tools or other applications to make use of information about related terms and variant forms as part of the entry vocabulary. In some cases, item metadata for Dublin Core records is included within a larger system that relies heavily on MARC data, such as OCLC WorldCat. When this is true, many of the Library of Congress controlled vocabularies are indirectly linked to these Dublin Core records by virtue of the availability of MARC authority records for the controlled vocabularies in those systems. Systems outside of these application environments need to be able to retrieve and translate or otherwise make use of MARC authority record data to make effective use of the hierarchies, equivalency relationships, and structures in the content of controlled vocabularies. These vocabularies present tremendous potential for improving access to web resources and Semantic Web data, as well as enhancing networked applications. Search engine results could be dramatically improved, both in terms of precision and recall. Different subject vocabularies covering the same concept space could be merged together or associated, providing an environment where differences in terminology between different communities would provide less of a barrier to effective browsing of resources. Front-end interfaces could be built for a variety of online reference tools that take advantage of the rich structure of relationships between topics that is provided by controlled vocabularies.
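The gap between naming a scheme and carrying its structure can be seen in miniature below. This is a hedged sketch, again using rdflib: the record URI is invented, and the idiom shown (typing the subject value with the dcterms:LCSH encoding scheme) is one common pattern rather than a mandated one.

from rdflib import BNode, Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

DCTERMS = Namespace("http://purl.org/dc/terms/")

g = Graph()
record = URIRef("http://example.org/records/42")  # invented record URI

# The subject value carries its encoding scheme as an rdf:type, telling
# consumers which controlled vocabulary the string comes from.
subject = BNode()
g.add((record, DCTERMS.subject, subject))
g.add((subject, RDF.type, DCTERMS.LCSH))
g.add((subject, RDF.value, Literal("Semantic Web")))

print(g.serialize(format="turtle"))

Even with the scheme identified, only the label travels with the record; the broader, narrower, and variant terms stay behind in the source vocabulary, which is exactly the loss described above.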
In some regards, libraries are only now realizing the full potential of catalog records to provide innovative and new browsing interfaces. An example can be seen in North Carolina State University Libraries' new, Endeca-powered catalog interface (North Carolina State University [NCSU] Libraries, 2006), which is presently built entirely on the structure of bibliographic records. This new catalog interface allows users to refine a search by navigating through record clusters that share a particular property. Drawing on the idea of faceted classification, clusters can be grouped by subtopic, genre, language, time period, geographic region, and in a variety of other ways. These facets are derived from information in various access points, subject headings and subdivisions in a particular bibliographic result set. It is easy to envision a similar technique being used to broaden searches and even to present initial browse interfaces to specific collections of information resources. When interfaces start to leverage the power of authority control in new and interesting ways, the benefit to users will be immense.

Another example of initiatives to leverage authority control is OCLC Research's Terminology Services project to "offer accessible, modular, Web-based terminology services" (OCLC Research, 2006). The Terminologies Pilot Project in October 2005 explored techniques for encoding sample vocabularies, means of mapping between them to help users identify relationships, and methods for incorporating the resulting services into other applications and tools. These services are the beginning of a rich tapestry of semantics that can be delivered within a user's current context, whatever that context is. In some cases, the services may be entirely carried out by server-side applications. Almost any application that serves content dynamically could include a navigation system that draws on the hierarchies and relationships between terms in LCSH, and could provide search interfaces that draw on term equivalencies to retrieve a broader set of resources when doing keyword searches. Additionally, there could be services that exist within client-side applications, pulling vocabulary structures from the network and integrating them into authoring tools. One such service component prototyped for the OCLC pilot uses the Microsoft Office 2003 Research Services Pane to access genre terms. "… if a college student wishes to categorize a reading list of fiction titles based on genre, he could copy the titles into a Microsoft Excel 2003 workbook, open the Research services pane, send a search to the OCLC Research GSAFD vocabulary service, and then place the results into his document" (Vizine-Goetz, 2004). This provides the user with the ability to browse and search genre categorizations without having to leave the application in which the search results will be used. Additionally, the more sources of data are made available in this way, the more automated the process can be. If the Research Services Pane had access to bibliographic records, genre or subject categorization could be fully automated and available at the touch of a button. A cataloger could use a similar tool to suggest appropriate terms from a controlled vocabulary, opening up considerable cost-saving opportunities. Similar applications could be developed as browser plug-ins or extensions.
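The client side of such a lookup can be very thin. The sketch below is purely illustrative: the endpoint, query parameter, and JSON response shape are all invented, not those of the OCLC pilot, but it shows how little code an authoring tool needs in order to offer vocabulary suggestions.

import requests  # third-party HTTP library

SERVICE = "https://vocab.example.org/suggest"  # invented endpoint

def suggest_terms(candidate: str) -> list[str]:
    """Ask a terminology service for preferred and related terms
    matching a free-text string."""
    reply = requests.get(SERVICE, params={"q": candidate}, timeout=10)
    reply.raise_for_status()
    data = reply.json()  # assumed shape: {"preferred": "...", "related": [...]}
    return [data["preferred"], *data.get("related", [])]

# An authoring tool could call this as the user types:
for term in suggest_terms("detective stories"):
    print(term)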
A sidebar application could be built for Firefox that could harness the power of controlled vocabularies when browsing Web resources that provide some degree of keyword tagging or folksonomy support. Imagine browsing a resource like Flickr (Flickr, 2006), and being able to query LCSH for relationships that may be defined between subject terms through the tags labeling the subject of a particular photographic image and tags for various related concepts. The sidebar could include hyper-linked broader, narrower, equivalent, and associative terms that would pull together additional photographs tagged with those related terms. In a different context, such an application could attempt to scan HTML source code for word frequency and try to guess the primary topics, returning related terms from a controlled vocabulary, and perhaps linking to search engine results for the concepts represented. Additionally, a similar service could be provided using a word highlighted by the user. The possibilities are endless.

5 Name Authority for the Semantic Web

The benefits of authority control described above – search precision, more powerful navigation, collocation, and linking between various tools and resources – apply to metadata about the creators of resources as well as to subject access. The library community is well positioned to play a significant role in these developments. Libraries have been dealing with identification, disambiguation, and collocation of names of content creators since the beginning of cataloging. The different forms of name used by the same creator in various print publications and other types of resources have always led to some degree of difficulty in grouping works together. The syndetic structure of name authority files has proven itself a very useful tool to help collocate works by an author regardless of the form of name on a particular item. The proliferation of resources on the Web extends the scope of the collocation problem. Initiatives are appearing in the Web community to help provide better mechanisms for identifying persons, families, and corporate entities that have a role with respect to information resources. InterParty is a European Commission-funded project exploring the interoperation of "party identifiers," which would provide standard identification numbers to serve the same purpose that authorized forms of names serve in library applications: to help collocate and disambiguate individual content creators (Information Society Technologies, 2003). More recently, some of the InterParty members and others submitted a similar proposal for International Standard Party Identifiers (ISPI) as an ISO standard (Lloret & Piat, 2006). More directly related to Semantic Web development is the Friend of a Friend (FOAF) project. FOAF is about "creating a Web of machine-readable homepages describing people, the links between them and the things they create and do" (The Friend of a Friend [foaf] Project, n.d.). Finding ways to integrate these initiatives with existing mechanisms for name authority control in libraries can help to bring library catalogues into the mix of tools available on the Web. Additionally, the availability of library authority data in a more Web-friendly format has the potential to positively influence the organization of the broad spectrum of Web content already available. The development of a virtual international authority file (VIAF) has been a key idea moving forward this initiative.
A Virtual International Authority File (VIAF)

Presently, authority files are maintained and developed by a large range of national bibliographic agencies. To make the most of this potential, it is useful to first integrate the somewhat disparate sources of authority data that exist even within the library community. These agencies develop files that are generally focused on the creators of content relevant to a particular national and cultural identity. As geographic boundaries become more and more porous, and culture becomes much more international, there is increasing overlap between the authority files maintained by these agencies. The concept of a Virtual International Authority File (VIAF) has been discussed since the 1970s within the International Federation of Library Associations and Institutions (IFLA). Initially, IFLA envisioned a single shared file; more recently the concept has evolved into one of linking existing national and regional authority files. The primary objective of this vision is to facilitate sharing the workload and reducing cataloging costs within the library community. The community is expanding, especially in Europe, where libraries are viewed as one of many "memory institutions", along with archives, museums, and rights management agencies. Ideally, authority files can be freely shared among all of these communities. A shared file would reduce the cost of doing authority work by avoiding repetition of effort while combining the various forms of names that are particular to resources published within the context of a particular region, culture, or nation. Combining or clustering the forms of name will result in a much richer set of authority information, enabling users to access information in the language, script, and form they prefer. Additionally, a single international authority file system will be far more useful when integrated into various Web retrieval tools and Web content descriptions. Such a tool could be used by a wide range of Web systems to improve the precision of users' searches and to provide the user's preferred display of the language and script of names. Authority records are used to collocate resources that utilize varying forms of name, but collocation does not need to dictate the script or language used by the end-user display. Figure 3 shows how a single entity – whether it is a concept, person, place, or thing – has a variety of labels that identify it in different languages and scripts. Traditionally, library systems have relied on a preferred label for all entities, although the preference would vary depending on the geographic context of the system. Merging or linking records that utilize different languages and scripts allows the end-user to select the language and script used to display information about entities irrespective of the system's default preference. This is appropriate in a truly global Web, where geographic and national boundaries are considerably less significant.

Figure 3: One Entity – Many Labels [2]

A variety of projects have sought to address this language challenge in recent years, exploring mechanisms to combine individual authority files. One such project, the European Commission-funded AUTHOR project, converted a sample of authority records from five European bibliographic agencies in France, England, Belgium, Spain, and Portugal into the UNIMARC format and made them available as a searchable file.
"The challenge was that each library has its own language, cataloging rules, bibliographic record format and local system for its online authority file" (Tillett, 2001). Combining these records into a single UNIMARC file required a large amount of record normalization. No attempt was made to link the records for the same entity. More recently, OCLC Online Computer Library Center Inc., the Library of Congress, and Die Deutsche Bibliothek (DDB) began a joint project to test the idea of a VIAF. OCLC used matching algorithms to link the name authority records of these two national bibliographic agencies and built a server to store the combined records. Additional phases of the project will involve ongoing maintenance to update the central file when either source is updated and possibly the development of a multilingual end-user interface (Morris, 2003). Following the evaluation of the project, the addition of new partners will be explored, particularly those potential partners with non-roman authority records.

6 Leveraging a VIAF on the Semantic Web

When combined with developments in the broader metadata, Web-design, and Semantic Web communities, the power and utility of a VIAF outside of libraries becomes clear. Authority record data can be associated more easily with a variety of Web resources, allowing users and potentially machines to immediately start to evaluate the information they are looking at. A quick search of bibliographic data related to a given resource author allows the retrieval of her dissertation, which could be mined for data about the degree-granting institution. Other bibliographic records could be retrieved to help evaluate the original Web resource, and related works could offer pathways to additional relevant resources. As other resources start including metadata that uses identifiers or headings to link to a VIAF, the opportunity to connect more interesting bits of information can add significant value to any Web-based information resource. Wikipedia entries, journal articles, Who's Who biographical info, an individual's blog, their homepage, or the homepage of their place of work can all be interconnected, as well as linked to journal articles, bibliographic records in catalogs and in e-commerce sites, and a variety of other scholarly resources. These interconnections have extensive implications for research. Once there is a corpus of biographical information combined into a data store that is connected to authority data (as well as associated bibliographic data), the information can be used to make inferences about any document, article, Web page, or blog entry that turns up when searching for information. For example, imagine a blog post that includes information about its author's identity. This information could be referenced against available biographical data and used to make inferences about the veracity and objectivity of the post's content. If the author were affiliated with the Recording Industry Association of America (RIAA), a trade group representing the U.S. recording industry, or the Electronic Frontier Foundation (EFF), a non-profit legal organization focused on defending "digital rights", heavily involved in fighting bad uses of Digital Rights Management technology, and opposing limitations on fair use, the statement is likely to be much less objective than content posted by a Harvard law professor.
While an agent or search tool couldn't necessarily flag such resources as potentially biased without additional information about the affiliate organization, it would still be very useful to present these additional facts to the user when returning search results. The metadata community has a number of initiatives underway for describing people, both as agents of resource creation and for the utility of describing relationships between people, describing connections between people and organizations, and for capturing contact information and other descriptive information about individuals and groups. Examples of such initiatives include the Dublin Core Agents Working Group's work on defining a metadata standard for agent description and identification. Ultimately, this work should result in the development of an Application Profile for agent description. Also, work has been completed on "Reference Models for Digital Libraries: Actors and Roles" within the DELOS/NSF Working Group. This work culminated in a Final Report, issued in July 2003. Interestingly, this work goes well beyond the scope of authors and content creators; instead the DELOS/NSF model categorizes Actors as Users, Professionals, and Agents. In this context, agents are the traditional content creators that help populate a digital library and professionals are the developers of the digital library itself and the providers of digital library services. The scope of their work may be much deeper than is necessarily relevant to discussions of authority control on the Semantic Web, although the models used and conclusions drawn may prove to play an important role in future discussions about authority control for names. Another initiative, the Friend Of A Friend (FOAF) project described earlier, is of particular use to the Semantic Web community, because it was conceived with the Semantic Web in mind and is built upon the RDF data model. FOAF expresses identity through any property-value pair, allowing you to aggregate data about individuals using any unique property, such as an email address or the URL for a home page. FOAF is primarily designed for community building, but when the possible privacy issues of sharing personal data are resolved there is much potential for FOAF to help aggregate public information about individuals. FOAF could be used to aggregate all sorts of resources, both by and about individuals – again, resources such as Wikipedia entries, journal articles, Who's Who biographical info, the individual's homepage, their blog, or their place of work. This information could be glommed into a program like Piggy Bank (Simile, 2006) – a Firefox extension for viewing, collecting and merging RDF-encoded data, developed by the Simile (Semantic Interoperability of Metadata and Information in unLike Environments) project – or any other Semantic Web-enabled tool, and processed along with other local or remote data stores. For example, a local data store of vCards and/or FOAF data might include private data, such as phone number, calendaring system, and email address. The very presence of that particular identifier in a local store of RDF data about people might change the context of any information interaction with a Wikipedia or Who's Who entry. Using any "unique" property to identify entities will certainly help to aggregate most of them, but it would miss entities that are not described using a particular piece of identifying information.
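The mechanics of that kind of aggregation are straightforward once the data is in RDF. A minimal sketch, using two tiny invented FOAF descriptions that happen to share an email address (all data here is made up):

from rdflib import Graph
from rdflib.namespace import FOAF

# Two FOAF descriptions of the same (invented) person, as they might
# be harvested from a homepage and a weblog.
HOMEPAGE = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
[] a foaf:Person ; foaf:name "Alice Example" ;
   foaf:mbox <mailto:alice@example.org> .
"""
WEBLOG = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
[] a foaf:Person ; foaf:name "A. Example" ;
   foaf:mbox <mailto:alice@example.org> .
"""

g = Graph()
g.parse(data=HOMEPAGE, format="turtle")
g.parse(data=WEBLOG, format="turtle")

# foaf:mbox is meant to identify exactly one person, so descriptions
# sharing an mbox can be treated as descriptions of the same individual.
by_mbox = {}
for person, mbox in g.subject_objects(FOAF.mbox):
    by_mbox.setdefault(mbox, set()).add(person)

for mbox, nodes in by_mbox.items():
    if len(nodes) > 1:
        print(f"{mbox} links {len(nodes)} descriptions of the same person")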
Inferences about which descriptions represent the same entity will only go so far in establishing a positive match. The ability to completely aggregate such data becomes even more powerful when the identifying properties can be referenced against the VIAF to determine alternate forms of name and representations in alternate scripts. This allows a query of available RDF data to be much more comprehensive when deciding what pieces of data should be aggregated.

7 Markup and Encoding of Authority Data

One of the most valuable activities that libraries and librarians can engage in, both to help realize the Semantic Web and to increase the findability of electronic resources in general, is the process of creating versions of vocabularies in machine-readable format. Thesauri, authority files, classification schemes, and subject heading lists – collectively referred to as Knowledge Organization Systems – have enormous potential for enhancing the discoverability and organization of resources in a networked environment. The potential only increases when such systems are provided in formats designed for emerging Web technology, such as OWL and SKOS – two of the ontology schemas of the Semantic Web. Knowledge organization systems can enhance the digital library in a number of ways:

"They can be used to connect a digital library resource to a related resource. The related information may reside within the KOS itself or the KOS may be used as an intermediary file to retrieve the key needed to access it in another resource. A KOS can make digital library materials accessible to disparate communities. This may be done by providing alternate subject access, by adding access by different modes, by providing multilingual access, and by using the KOS to support free text searching." (Hodge, 2000)

The perceived benefits of knowledge organization systems listed above are of particular importance, and apply to networked resources beyond the scope of digital library materials. The availability of machine-readable representations of various thesauri and other controlled vocabularies enables more effective search and retrieval, better browse functionality and general organization of materials online, and the automatic creation of context-sensitive linkages between available resources and data sets. Additionally, and most importantly for Semantic Web development, providing controlled vocabularies outside of traditional library systems enables a variety of applications to more effectively merge and manipulate data and information from disparate sources. There are many steps that need to be taken to realize this set of goals. Firstly, the library and Semantic Web communities must agree on how best to encode these vocabularies. Many of the subject schemes listed above already exist in one or more machine-readable representations. Much of LCSH exists in MARC records in library catalogs and the databases of cataloging services such as OCLC's WorldCat and RLIN21. All of the LC controlled vocabularies (LC classification, LCSH subject authority, and the name authority records) are available as complete files of MARC 21 or MARCXML formatted authority records through the Cataloging Distribution Service of the Library of Congress (2005) (free test files are also available). The Getty (2003) provides licensed access to its vocabularies in three formats: XML, relational tables and MARC, and provides sample data from each vocabulary for free.
At first glance, these formats do not appear to be of much use in the context of RDF, but the standardization and global use of MARC make it possible to convert these into RDF-friendly data. One approach is the creation of URIs to identify each terminal node in the source XML structure as a unique concept that can be used as a property in the RDF and DC Abstract Model sense. This approach has the potential to retain as much of the detail available in the source format as possible, but may prove unnecessary and undesirable due to the complexity of the resultant sets of properties. On the other hand, the information from MARC records may in fact be useful for automatic processes and machine activities. The richness of MARC authority data, and the time and effort invested in developing, encoding, and sharing this data, provides a unique and powerful set of vocabularies. However, it remains to be seen whether the complexity of the resulting XML records would be an impediment to interoperability. An alternative approach involves simply cross-walking the XML data into an already defined RDF-friendly form. Along the way, detail about relationships between terms will likely be lost, but the end product will probably be much simpler to work with. One possible target RDF vocabulary, the Simple Knowledge Organization System (SKOS) Core (W3C Semantic Web Activity, 2004, February), has much potential in this context. SKOS Core provides a model for expressing the structure of what it refers to as a 'Concept Scheme'. "Thesauri, classification schemes, subject heading lists, taxonomies, terminologies, glossaries and other types of controlled vocabulary are all examples of concept schemes" (Miles, Mathews, Wilson & Brickley, 2005, September). SKOS Core provides a means of expressing most of the semantic relationships included in most library subject vocabularies. For example, "prefLabel" and "altLabel" represent "use" and "use for" references, while "broader" and "narrower" are used to identify hierarchical relationships. SKOS allows for the creation of new labels and the encoding of more specific types of relationships as well. Additionally, SKOS provides mechanisms for various types of notes – scopeNote, definition, example and note. The SKOS Core community has drafted documentation on "Publishing a Thesaurus on the Semantic Web" (Miles, 2005, May), which provides a guide for using SKOS to both describe and encode vocabularies in RDF. Converting large controlled vocabularies into RDF data is certainly a good way to get sample data sets to use to build prototype services. However, the long-term maintenance of such data stores may be problematic. As changes are made to the source vocabulary, those changes need to be propagated through to all the various formats that the vocabulary is made available in. In the context of SKOS, this is a manageable task. SKOS extensions have been proposed to allow for versioning and the tracking of changes made to controlled vocabularies over time (Tennis, 2005). However, in cases where vocabularies are likely to be managed in a variety of different formats, the SKOS extensions are less helpful. Another approach is to harvest the source data in its native format and translate it into a variety of output formats, either as nightly batch processes or on-the-fly as data is requested. Progress is being made in this area, such as OCLC's work on exposing vocabularies in a variety of delivery formats, including MARCXML and SKOS (Dempsey et al., 2005).
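The SKOS constructs just listed map naturally onto a thesaurus entry. A minimal sketch, with an invented concept URI and invented headings standing in for a real scheme's data:

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/scheme/")  # invented scheme namespace

g = Graph()
g.bind("skos", SKOS)

cookery = EX.cookery
g.add((cookery, RDF.type, SKOS.Concept))
g.add((cookery, SKOS.prefLabel, Literal("Cookery", lang="en")))
g.add((cookery, SKOS.altLabel, Literal("Cooking", lang="en")))  # a "use for" reference
g.add((cookery, SKOS.broader, EX.home_economics))               # hierarchical link
g.add((cookery, SKOS.scopeNote, Literal("Works about preparing food.", lang="en")))

print(g.serialize(format="turtle"))

Serialized as Turtle, the entry stays human-readable while remaining processable by any RDF tool, which is much of SKOS's appeal as a crosswalk target.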
Whether the data store is MADS, MARC, RDF, or some arbitrary flavor of XML, it can be transformed into another format either on-the-fly or as a batch process. A centralized name authority database could be created from the national authority files available in various library communities and stored as normalized MARC 21. If this data store is deemed useful to the Friend of a Friend (FOAF) community, the data can be turned into RDF. Similarly, Web services like Flickr could convert and make use of Library of Congress Subject Headings to augment both the searching and development of their folksonomies.

8 Conclusion

Berners-Lee suggested that, "The vast bulk of data to be on the Semantic Web is already sitting in databases … all that is needed [is] to write an adapter to convert a particular format into RDF and all the content in that format is available" (Updegrove, 2005). The data, metadata, and thesauri available in various library databases and systems present a unique opportunity to take a large step forward in the development of the Semantic Web. The realization of the Semantic Web vision, which isn't too far from Berners-Lee's original vision for the World Wide Web itself, involves a remarkably broad set of goals. Part of the Semantic Web vision is about aiding resource discovery by creating tools to help searchers refine and develop their searches, and to aid in the navigation of search results. These improvements will be augmented by the improved metadata that will result from making these same tools and vocabularies available to resource authors. Information professionals are too few in number to describe and catalog all of the Web's resources. Resource authors will have to play an active role in describing the materials they publish, perhaps having their descriptions refined and further developed by automated processes and by information professionals. Research in this area has been taking place in a variety of author communities, including scientific, government, and educational institutions (Greenberg & Robertson, 2002). Such collaborative efforts need not be limited to authors and metadata professionals. Other domain experts can add further descriptive information through the process of tagging and reviewing materials. This marriage of folksonomy and controlled vocabularies would serve as a step towards what Peter Morville has elegantly referred to as "the Sociosemantic Web" (Morville, 2005). Another large part of the Semantic Web vision is about enabling "agents" or systems to insert a searcher's/user's individual context or perspective into a search for information. This necessarily involves interacting with the elements that make up that context, such as schedules, contacts, group membership, profession, role, interests, hobbies, location, etc. Systems can then be developed that "understand" the searcher's needs, based on who the searcher is and the searcher's "context" or demographics. Developing this kind of machine understanding involves encoding the vast wealth of information available electronically in such a way that it can be negotiated according to a searcher's individual "context".
Even if privacy issues hinder our ability to automate the incorporation of personal information into the "context" of a specific information interaction, there is still a tremendous amount of value in making external information sources more readily accessible for machine processing, and in making the information more interoperable, easier to interpret, and ultimately combined and used in novel and interesting ways. More importantly, even if it is possible to automate the user side of this process, there will undoubtedly be a user base that chooses not to trust these context-dependent decisions to the Semantic Web "agents" described in Berners-Lee's writings. In either of these two scenarios, the tools that support Semantic Web technology will still make most searchers' experiences more pleasant and much less frustrating.

Acknowledgements

The authors are indebted to many people for acting as sounding boards and for their feedback and editorial input, and would particularly like to thank Lori Robare and Jane Greenberg.

NOTES

1 MARC record shown in MarcEdit, freely distributed MARC record editing software. Available online at: http://oregonstate.edu/~reeset/marcedit/html/

2 Figure from Tillett, Barbara B. "Authority Control: State of the Art and New Perspectives," co-published simultaneously in Cataloging & Classification Quarterly, v. 38, no. 3/4, 2004, p. 23-41; and Authority Control in Organizing and Accessing Information: Definition and International Experience (ed.: Arlene G. Taylor and Barbara B. Tillett). Haworth Press, 2004, p. 23-41 (figure on p. 34).

REFERENCES

Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The Semantic Web [Electronic version]. Scientific American, 284(5), 34-43. Retrieved April 15, 2002, from: http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21

Cataloging Distribution Service – Library of Congress. (2005). MARC Distribution Services: Your Source for Machine Readable Cataloging Records via FTP. Retrieved March 22, 2006, from: http://www.loc.gov/cds/mds.html

DCMI Usage Board. (2005, January 10). DCMI Metadata Terms. Retrieved March 22, 2006, from: http://dublincore.org/documents/dcmi-terms/

Dempsey, L., Childress, E., Godby, C.J., Hickey, T.B., Houghton, A., Vizine-Goetz, D., & Young, J. (2005). Metadata switch: thinking about some metadata management and knowledge organization issues in the changing research and learning landscape. Forthcoming in LITA guide to e-scholarship (working title), ed. Debra Shapiro. Retrieved April 15, 2006, from: http://www.oclc.org/research/publications/archive/2004/dempsey-mslitaguide.pdf

Flickr. (2006). Popular Tags on Flickr Photo Sharing. Retrieved April 1, 2006, from: http://www.flickr.com/photos/tags/

The Friend of a Friend (foaf) Project. (n.d.). Retrieved April 12, 2006, from: http://www.foaf-project.org/

The Getty. (n.d.). Obtain the Getty Vocabularies. Retrieved March 22, 2006, from: http://www.getty.edu/research/conducting_research/vocabularies/license.html

Greenberg, J. & Robertson, D.W. (2002). Semantic Web Construction: An Inquiry of Authors' Views on Collaborative Metadata Generation.
In: Metadata for e-Communities: Supporting Diversity and Convergence. Proceedings of the International Conference on Dublin Core and Metadata for e-Communities, 2002, Florence, Italy, October 13-17. Retrieved April 15, 2006, from: http://www.bncf.net/dc2002/program/ft/paper5.pdf

Heery, R. & Patel, M. (2000). Application profiles: mixing and matching metadata schemas. Ariadne, 25. Retrieved April 24, 2006, from: http://www.ariadne.ac.uk/issue25/app-profiles/

Hodge, G. (2000). Systems of Knowledge Organization for Digital Libraries. The Digital Library Federation. Retrieved April 12, 2006, from: http://www.clir.org/pubs/reports/pub91/contents.html

IEEE Learning Technology Standards Committee. (2002, July). Draft Standard for Learning Object Metadata. Retrieved March 22, 2006, from: http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf

Information Society Technologies. (2003). InterParty. Retrieved April 12, 2006, from: http://www.interparty.org/

Joint DCMI/IEEE LTSC Taskforce. (2006, March). DCMI Education Working Group Wiki. Retrieved March 22, 2006, from: http://dublincore.org/educationwiki/DCMIIEEELTSCTaskforce

Library of Congress. (2005, April 22). MARCXML – MARC 21 XML Schema: Official Web Site. Retrieved April 15, 2006, from: http://www.loc.gov/standards/marcxml/ (includes test authority data for Classification, Names, and Subjects).

Library of Congress. (2005, December 14). MADS – Metadata Authority Description Schema: Official Web Site. Retrieved April 15, 2006, from: http://www.loc.gov/standards/mads/ (includes MARCXML Authorities to MADS crosswalk).

Library of Congress. (2006, June 13). MODS: Metadata Object Description Schema. Retrieved June 30, 2006, from: http://www.loc.gov/standards/mods/

Lloret, R., & Piat, S. (2006). Outline for ISO Standard ISPI (International Standard Party Identifier). Retrieved April 12, 2006, from: http://www.collectionscanada.ca/iso/tc46sc9/docs/sc9n429.pdf

Miles, A. (2005, May). Quick Guide to Publishing a Thesaurus on the Semantic Web: W3C Working Draft 17 May 2005. Retrieved April 15, 2006, from: http://www.w3.org/TR/2005/WD-swbp-thesaurus-pubguide-20050517/

Miles, A., Mathews, B., Wilson, M., & Brickley, D. (2005, September). SKOS Core: Simple Knowledge Organisation for the Web. In: Proceedings of the International Conference on Dublin Core and Metadata Applications, Madrid, Spain, 12-15 September 2005, p. 5-13. Retrieved April 15, 2006, from: http://www.slais.ubc.ca/PEOPLE/faculty/tennis-p/dcpapers/paper01.pdf

Miller, E. (2004, October). The Semantic Web and Digital Libraries. Keynote presentation from The International Conference on Dublin Core and Metadata Applications, 2004, Shanghai, China, 11-14 October 2004. PowerPoint presentation retrieved April 1, 2006, from: http://dc2004.library.sh.cn/english/prog/ppt/talk.ppt

Morris, S. (2003, September). Virtual International Authority [press release].
Retrieved April 15, 2006, from: http://www.loc.gov/loc/lcib/0309/authority.html

Morville, P. (2005). Ambient Findability. Sebastopol, CA: O'Reilly.

Network Development and MARC Standards Office – Library of Congress. (2006, June 23). MARC Code Lists for Relators, Sources, Description Conventions. Retrieved June 30, 2006, from: http://www.loc.gov/marc/relators/

Nilsson, M., Palmer, M. & Brase, J. (2003, November). The LOM RDF binding – principles and implementation. Paper presented at The 3rd Annual Ariadne Conference, Leuven, Belgium. Retrieved March 22, 2006, from: http://rubens.cs.kuleuven.ac.be/ariadne/CONF2003/papers/MIK2003.pdf

North Carolina State University (NCSU) Libraries. (2006). Endeca at the NCSU Libraries. Retrieved March 6, 2006, from: http://www.lib.ncsu.edu/endeca/

OCLC Research. (2006). Terminology Services. Retrieved March 22, 2006, from: http://www.oclc.org/research/projects/termservices/

Simile. (2006). Piggy Bank. Retrieved March 22, 2006, from: http://simile.mit.edu/piggy-bank/

Tennis, J. (2005). SKOS and the Ontogenesis of Vocabularies. In: Proceedings of the International Conference on Dublin Core and Metadata Applications, Madrid, Spain, 12-15 September 2005. Retrieved April 15, 2006, from: http://purl.org/dcpapers/2005/Paper33

Tillett, B. (2001). Authority control on the Web. In: Proceedings of the Bicentennial Conference on Bibliographic Control for the New Millennium: Confronting the Challenges of Networked Resources and the Web, Washington, D.C., November 15-17, 2000. Sponsored by the Library of Congress Cataloging Directorate. Edited by Ann M. Sandberg-Fox. Washington, D.C.: Library of Congress, Cataloging Distribution Service, p. 207-220. Retrieved April 15, 2006, from: http://www.loc.gov/catdir/bibcontrol/tillet_paper.html

Updegrove, A. (2005, June). The Semantic Web: An Interview with Tim Berners-Lee. Consortium Standards Bulletin, 5(6). Retrieved February 9, 2006, from: http://www.consortiuminfo.org/bulletins/semanticweb.php

Vizine-Goetz, D. (2004). Terminology services: Making knowledge organization schemes more accessible to people and computers. OCLC Newsletter, 266. Retrieved March 22, 2006, from: http://www.oclc.org/news/publications/newsletters/oclc/2004/266/

W3C Semantic Web Activity. (2004, February). Simple Knowledge Organisation System (SKOS).
Retrieved March 22, 2006, from: http://www.w3.org/2004/02/skos/

Recent developments in Remote Document Supply (RDS) in the UK – 3

Stephen Prowse, Kings College London

British Library to pull out of document supply

You read it here first. Purely for the sake of artistic and dramatic licence I've omitted the question mark that should rightly accompany that heading. But even with the question mark firmly in place it still comes as a shock, doesn't it? Can you imagine life without the Document Supply Centre? Can you think the unthinkable? Why should the BL pull out and where would that leave RDS? Rather like a science fiction dystopia, I've tried to imagine what such a post-apocalyptic world would look like and what form RDS might take. Assuming, that is, that RDS would survive the fallout. This article will attempt to show why the BL may be on the verge of abandoning document supply and what could fill some of the huge gap that would be left.

Seven minutes to midnight

We can think of the likelihood of a post-BL document supply world in the same terms as the Doomsday Clock positing the likelihood of nuclear Armageddon – the nearer the clock hands are to midnight, the closer the reality. Perhaps we'll start it at seven minutes to and then adjust in future articles? Perhaps a clock could adorn the cover of this esteemed journal? It could be argued that trends have been pushing the BL towards an exit for a while – the relatively swift and ongoing collapse of the domestic RDS market, for example. But the idea was first publicly mooted or threatened (take your pick) at a seminar jointly organised by the BL and CURL on 5th December 2006 at the BL in London. Presentations from the event can still be found on the CURL website [1]. This event brought together all those with a stake or an interest in the proposed UK Research Reserve (UKRR), a collaborative store of little-used journals and monographs. Librarians are notoriously loath to completely discard items, preferring to hang on to them in case of future need. Sooner or later this creates a storage problem as space runs out. Acquiring extra space is often problematic and expensive. What is to be done? Moving to e-only preserves access to the content and frees up space but can't be wholly trusted, so print needs to be held somewhere – just in case.
Co-operating with other libraries, HE institutions can transfer print holdings to an off-site storage depot and, once an agreed number of copies have been retained, can dispose of the rest. This is the theory underpinning the UKRR. The UKRR is a co-operative that will eventually invite institutions to become partners or subscribers. The first phase involves the following institutions working with the BL – Imperial College (lead site), and the universities of Birmingham, Cardiff, Liverpool, St Andrews, and Southampton. Research has shown that the BL already holds most of the stock that libraries would classify as low use and seek to discard – an 80% overlap of journals held by the BL and CURL libraries has been identified. Additional retention copies (meaning a minimum of three in total) would be required to placate fears of stock accidentally being destroyed. It is not felt that extra building will be necessary – stock will be accommodated at BLDSC and at designated sites by encouraging some institutions to hold on to their volumes so that others can discard. SCONUL will be the broker in negotiations as to who will be asked to retain copies. The first phase began in January 2007 with 17.3 km of low-use journals being identified among the partners for storage/disposal. If the BL already holds most of these volumes, and there is a need to ensure that two more copies are kept, it will be interesting to see how much of the 17.3 km will actually make it to disposal. I expect that libraries will be asked to hold on to much of the material that they would like to send for disposal. In fact, at a subsequent CURL members' meeting in April 2007, Imperial disclosed that 30 out of 1,300 metres of stock selected for de-duplication had been sent to the BL. This represents only 2.3% being offloaded. Once participation widens there will be increased scope for disposal, but I can't see the partner institutions creating much space until that happens. Should the UKRR really take off then there may be a need for more building space to accommodate stock, although the BL now has a new, high-density storage facility at Boston Spa.

The business model behind the UKRR will mark a change in the way remote document supply is offered to HE institutions and could determine the future of the service. Instead of the current transaction-based model, the new model will be subscription-based and will comprise two elements – 1) a charge to cover the cost of BL storage and 2) a charge according to usage (a toy illustration of this two-part tariff follows below). Institutions that don't subscribe, including commercial organisations, will be charged premium rates. The theory is that costs will not exceed those currently sustained for document supply. Assuming funding is provided for Phase 2, we will see the roll-out of this new model after June 2008.

Advocacy will be crucial to the success of the UKRR. The original study reported widespread buy-in to the idea but will that translate into subscriptions? Many libraries will already be undertaking disposal programmes, particularly those with more confidence in and/or subscriptions to e-repositories such as Portico. Will anyone really want access to that material once it's out of sight (and out of mind)? If requests remain low (and decline further) and take-up isn't great then that could spell the end as far as the BL and RDS go. The editor has already commented on the apparent lack of commitment to RDS from the chief executive of the BL (McGrath, 2006).
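Returning to the proposed charging model for a moment, here is a toy comparison with the current transaction-based approach. All figures are invented purely to show the shape of the two models; no actual charges had been published at the time of writing.

# All figures below are invented purely to show the shape of the two models.
STORAGE_CHARGE = 2000.0  # hypothetical flat annual storage element (GBP)
USAGE_RATE = 4.0         # hypothetical per-request element for subscribers (GBP)
TRANSACTION_RATE = 7.5   # hypothetical current per-request charge (GBP)

def subscription_cost(requests: int) -> float:
    """Proposed UKRR-style model: storage element plus usage element."""
    return STORAGE_CHARGE + USAGE_RATE * requests

def transaction_cost(requests: int) -> float:
    """Current model: each request charged individually."""
    return TRANSACTION_RATE * requests

for n in (100, 500, 2000):
    print(f"{n:>5} requests: subscription £{subscription_cost(n):,.0f}, "
          f"transactions £{transaction_cost(n):,.0f}")

On figures like these, a low-requesting library subsidises the scheme while a heavy requester gains, which is one reason advocacy and take-up matter so much to the model.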
This year the BL faces potentially very damaging cutbacks from the 2007 Spending Review, with threats to reading rooms, opening hours and collections, together with the possible need to introduce admissions charges. RDS wasn't mentioned as a possible target for cutbacks, but then the BL will want to see how the UKRR and the new model fare. Tough financial targets in the future coupled with low use/low take-up could lead to a time when the BL announces enough is enough. I'm sure that scenario has been considered by a new group of senior managers set up within the British Library – the Document Supply Futures Group. Not surprisingly, little is known about this Group (again this was something divulged at the CURL December presentation) but if it's looking at all possible futures then it must also be considering no future. The group is headed by Steve Morris, the BL's Director of Finance and Corporate Services. McGrath reported in his paper just quoted that senior figures were seriously considering the future of document supply in 2001. Whatever comes from this Group's deliberations, the present tangible outcome is a commitment to the UKRR. We'll see where that goes – the clock's ticking.

Alternative universes

If ever we're left adrift in RDS without the BL then what are the alternatives? One option is to go it alone and request from whoever will supply. For this a good union catalogue will be a fundamental requirement. COPAC has had a facelift and, as with other search tools, the Google effect can be seen in the immediate presentation of a simplified 'quick search' screen. Expansion is taking place, with the catalogues of libraries outside of CURL also being added, e.g. the National Art Library at the Victoria and Albert Museum already added and the Cathedral Libraries' Catalogue forthcoming. The national libraries of both Wales and Scotland are on COPAC, as is Trinity College Dublin. The National Library of Scotland has always had an active ILL unit, although this is far too small to take on too many requests. Further development of COPAC has seen support for OpenURLs, so that users are linked to document supply services at their home libraries (a sketch of such a link appears below).

CISTI is the Canadian document supplier that would welcome more UK customers. However, it should be remembered that the BL acts as a backup supplier for CISTI, so without them CISTI could only play a minor role. A new service for 2007 is the supply of ebooks. For US$25 users can access the ebook online for 30 days, after which the entitlement expires. As far as I'm aware this is the first solution aimed at tackling the problem of RDS in ebooks. Ejournal licences have become less restrictive and usually allow libraries to print an article from an ejournal and then send it to another library. This obviously isn't an option for ebooks, and neither can libraries download and pass on the whole thing or permit access, so the CISTI solution is an attractive option.

Of course, a major undertaking of the BL is to act as a banker on behalf of libraries for the supply of requests. Libraries quote their customer numbers on requests to each other and charges can then be debited and credited to suppliers once suppliers inform the BL (via an online form or by sending a spreadsheet). IFLA vouchers can act as currency but these are paper-based rather than electronic, even though an e-voucher has long been desired and projects have looked at producing one.
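The OpenURL linking mentioned above deserves a brief concrete illustration. In the sketch below the resolver address is invented, and the key-value pairs follow the general pattern of the original OpenURL 0.1 draft rather than any particular resolver's requirements; the citation values are placeholders.

from urllib.parse import urlencode

RESOLVER = "https://resolver.example.ac.uk/openurl"  # invented resolver address

# Placeholder citation metadata in OpenURL 0.1-style key-value pairs.
citation = {
    "genre": "article",
    "issn": "1234-5678",  # placeholder
    "date": "2007",
    "volume": "35",
    "spage": "26",
    "aulast": "Smith",
}

# The user's own resolver decides whether to offer the subscribed copy,
# an open access version, or a document supply request form.
print(f"{RESOLVER}?{urlencode(citation)}")

The point of the pattern is that the same citation link works for every user, while the destination is decided by each user's home library.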
Realistically, survivors in a post-BL document supply world would need to band together with like-minded others to form strong consortia and reap the benefits of membership of a large group. Effectively that boils down to two options – joining up with OCLC or with Talis.

Despatches from the Unity front – no sign of thaw in the new cold war

OCLC and Talis have both, naturally enough, been promoting their distinct approaches to a national union catalogue and an RDS network that can operate on the back of that, while firing an occasional blast into the opposing camp. Barely was the ink dry on the contract between The Combined Regions (TCR) and OCLC for the production of UnityUK than opening salvos were being exchanged. At the time Talis had announced they were going ahead with their own union catalogue and RDS system. Complaints relating to data handover and data quality were being lobbed Talis' way. Since then news items on UnityUK have appeared regularly in CILIP's 'Library + Information Update' (Anon, 2006) along with a stream of letters (Chad, Froud, Graham, Green, Hendrix, McCall, 2006), including one from a Talis user and two from senior Talis staff, bemoaning the situation and seeking 'a unified approach'. Talis' position is that they would like to enable interoperability between Talis Source and OCLC for both the union catalogue and the RDS system. I'm sure TCR's position is that OCLC won the tender to provide a union catalogue and RDS services, while Talis didn't bid, and they are happy to press on without Talis, thank you very much. An article by Rob Froud, chair of TCR, in a previous issue of ILDS (Froud, 2006b), providing some history and an update on progress, was met with a counter-blast from Talis' Dr Paul Miller in the Talis Source blog (Miller, 2007). A particular bone of contention was the decision taken by Rob Froud to withdraw a number of TCR libraries' holdings from the Talis Source union catalogue. Not an especially surprising move given the circumstances, but neutrals should note that libraries can contribute holdings records freely to both. However, access to the union catalogue will only be free with Talis. This free access for contributors has seen more FE and HE libraries joining Talis Source. It's interesting comparing membership lists. While there isn't great overlap between the two, there is a significant minority of public library authorities who are members of both. Will this continue, and, if so, for how long?

UnityUK and Talis Source have staked their claims to be the pre-eminent union catalogue and RDS network on their respective websites [2, 3]. UnityUK have this to say – "In 2007, the combined UnityUK and LinkUK services will bring together 87% of public libraries in Great Britain, Jersey and Guernsey in to one national resource sharing service." They show their extent of local authority coverage with the following membership figures:

• 97% County Councils
• 97% London Boroughs
• 97% Metropolitan authorities
• 75% Unitary authorities

Meanwhile, Talis Source announces itself as "the largest union catalogue in the UK comprising 26 million catalogue items and 55 million holdings from over 200 institutions." (April 25th, 2007)

No more ISO ILL for NLM

In January 2007 the National Library of Medicine (NLM) in the U.S. said that it would no longer accept ILL requests into its DOCLINE system via ISO ILL.
The reasons cited were poor take-up (only three libraries were using it), and the drain on resources from having to test separately with every supplier and every institution that wanted to use it. The protocol itself is quite long but implementers do not have to implement every item – they can select. This meant, however, that each implementer had to test with NLM even if they were using one (out of only four) of the systems suitable for use. The time and effort required to support ISO ILL was too much and so the NLM pulled the plug. This raises a number of questions about the use of ISO ILL and its future. It doesn't seem to be well used in the U.S., e.g. OCLC's Resource Sharing website lists nearly four times as many Japanese libraries using it compared to those in the U.S., and the British Library hasn't developed its own ISO ILL gateway since that came on stream. That gateway is of course run on VDX. On the other hand, ISO ILL is used in VDX-based consortia in the UK (UnityUK), the Netherlands, Australia and New Zealand. Quite where all this leaves ISO ILL I don't know, but I wouldn't be too optimistic about its prospects.

Big deals – unpicking the unused from the unsubscribed

Statistics on ejournal usage have moved on apace since publishers committed themselves to achieving COUNTER compliance in their reports. By creating a common standard, COUNTER reports from one publisher can be meaningfully compared with those of another, knowing that both treat data in the same way. SUSHI takes that a step further by consolidating reports from several publishers into one to provide easy comparisons and show usage across platforms. These can be accessed via Electronic Resource Management Systems (ERMS) or by subscribing to a service such as ScholarlyStats. By utilising such tools, analysis of these statistics will become increasingly sophisticated, but I suspect that for the moment it remains at a somewhat elementary level. After all, who has the time to look much beyond full-text downloads and what titles are or are not being used?

The Evidence Base team at the University of Central England have been running a project involving 14 HE institutions that looks at their usage of ejournals, specifically big deals. Libraries are given reports on their usage of ejournals within selected deals and how these rate for value etc. Furthermore, libraries can compare their use with use made at other libraries in the project. At King's we have received a number of reports, including our use of Blackwell's STM collection in 2004-05, ScienceDirect in 2004-05 and Project Muse in 2005 (Conyers, 2006-07). The Blackwell's report runs to 22 pages and provides a wealth of detail. Some key findings are highlighted:

• 19% increase in usage from 2004 to 2005
• 91% of requests come directly from the publisher's website, compared to 9% through Ingenta
• The average number of requests per FTE user was 6.7 in 2004 and 8.4 in 2005
• 50% of titles in the STM deal were used 100 times or more and 96% of total requests were generated by these titles
• 62% of high-priced titles in the deal (£400 and over) were used 100 times or more. Higher-priced titles were used more frequently than those with a low price (under £200)
• 78% of subscribed titles and 39% of unsubscribed titles were used 100 times or more
• 62 titles (14% of total) received nil or low use (under 5 requests) in 2005.
22 of these (35%) were unpriced titles not fully available within the deal and a further 18 (29%) were low price (under £200)
• The average number of requests per title in 2005 was 369; average requests for a subscribed title were 860 and for an unsubscribed title 186
• The heaviest used title was the Journal of Advanced Nursing, which recorded 15,049 requests in 2005 and 13,840 in 2004
So the report confirms that heavy use is made of titles in the deal, that practically all use is concentrated on half the titles (although practically every title gets some use), and that it is the expensive titles that are most used, but also that unsubscribed titles can attract heavy use. Furthermore, in discussing costs the report finds that the average cost of a request to a subscribed title was 84p in 2005, and just 16p to an unsubscribed title. Pretty good value when all is said and done. The second report confirms much of what the first found. I’ll focus on two of the deals – ScienceDirect (SD) and Project Muse (PM) – as the first is our biggest deal (and will be the case for other libraries too) and PM has a humanities focus which provides a nice contrast. In SD 35% of titles were used 100 times or more, in PM 15%. SD had 2% of titles with nil use*, PM 4% (*nil use doesn’t include ‘unpriced’ titles with limited availability). SD had 80% of subscribed titles used 100 times or more and 27% of unsubscribed titles; for PM the figures were 36% and 9% respectively. This reflects the relative importance of ejournals to users in STM and Humanities fields but also shows how much users gain from a big deal like SD. The average cost for a request to a subscribed SD title was £1.12 and only 2p for an unsubscribed title. One of the arguments against big deals is that you are buying content that you don’t really need – a lot of filler is thrown in with the good stuff. While not totally dispelling that presumption, research such as that produced by Evidence Base can counter that argument somewhat and certainly puts a lot more flesh on bare bones. If you choose carefully which deals you sign up to, then your users can make good use of this extra content. At the time of writing (June) Evidence Base were recruiting institutions for a second round of the project. A report from Content Complete (the ejournals negotiation agent for FE, HE and the Research Councils) outlined what they discovered from trials involving five publishers and ten HE institutions that took place between January and December 2006 (Content Complete Ltd, 2007). The idea behind the trials was to look at alternative models to the traditional big deal, and in particular to focus on unsubscribed or non-core content and acquiring this via pay per view (PPV). Although the common idea of PPV as a user-led activity was quickly dropped as impractical, a cheaper download cost per article was agreed for all but one of the publishers instead. PPV was then considered in the context of two models – one where unsubscribed content is charged per downloaded article, and the second also with a download charge per article, but this time, should downloading reach a certain threshold, PPV would convert to a subscription and there would be no further download charges. This second option appears more attractive to librarians at first glance as it puts a ceiling on usage, and therefore cost per title, but costs could still mount up considerably if the library saw heavy usage across a wide range of unsubscribed content and was forced into taking further subscriptions. The report highlights a number of problems to do with accurately measuring downloads, such as the need to discount articles that are freely available, to not count twice those that are looked at in both HTML and PDF, and to include those downloaded via intermediaries’ gateways. Ultimately these problems proved too much of a technical and administrative difficulty to overcome during the trials for both publishers and librarians. Such problems are likely to continue for some time, although one imagines, given sufficient incentive, they could be overcome with automation and developments to COUNTER and SUSHI. However, would the incentive exist? For the trials also found that the PPV models didn’t compare too well against the traditional big deals in terms of management, and in almost all cases ended up more expensive.
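The two charging models lend themselves to a concrete comparison. The sketch below is a hypothetical illustration only: the per-article fee and the subscription price are invented figures rather than numbers from the trials, and the function names are mine.

```python
# Hypothetical comparison of the two pay-per-view (PPV) models described
# above. All prices are invented for illustration; the trials' actual
# figures were not published at this level of detail.

def ppv_uncapped(downloads: int, fee: float) -> float:
    """Model 1: every download of an unsubscribed article is charged."""
    return downloads * fee

def ppv_capped(downloads: int, fee: float, subscription: float) -> float:
    """Model 2: download charges accrue until they reach the title's
    subscription price, at which point PPV converts to a subscription
    and further downloads are free."""
    return min(downloads * fee, subscription)

if __name__ == "__main__":
    FEE, SUBSCRIPTION = 5.00, 400.00  # invented: £5/article, £400/title
    for n in (10, 50, 80, 200):
        print(f"{n:>3} downloads: uncapped £{ppv_uncapped(n, FEE):8.2f}, "
              f"capped £{ppv_capped(n, FEE, SUBSCRIPTION):8.2f}")
    # The ceiling applies per title: a library drawing heavily on many
    # unsubscribed titles still accumulates the capped cost for each one.
```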
Updates

In Recent Developments… 2 I reported on the RDS proposal for the NHS in England. There’s been some progress on this but there’s still quite a way to go. A list of options has been trimmed to five to undergo cost-benefit analysis before deciding on an eventual winner. The options range from doing little or nothing, to improving direct access to content, to using a vendor’s RDS system, to outsourcing. Building a search engine across catalogues or developing a national union catalogue were the rejected options. It won’t be until November that the preferred option is chosen; then, should procurement prove necessary, that will take until September 2008, with implementation following early in 2009 (ILDS Task Group, 2007). There have been two significant developments on open access (OA). Firstly, the UK version of PubMed Central launched in January 2007. Like the original U.S. version, this will be a permanent archive of freely available articles from biomedical and life sciences journals. Although initially set up as a mirror service, the UK version has 307 such journals at the time of writing (June 2007) against 334 in the U.S. version. We can expect future developments to favour UK and European resources. The UK version is supported by a number of organisations – the British Library, the European Bioinformatics Institute and Manchester University are the suppliers, while a number of organisations including the Wellcome Trust provide funding. Secondly, for researchers who do not have access to an institutional or subject repository, JISC is now offering a service called the Depot, where peer-reviewed papers can be deposited. The Depot is not intended as a long-term repository but rather more of a stop-gap until more become available.

eTheses – a long time coming

Of course, repositories don’t just have to be homes for journal articles; they can contain a lot more. The possibility of institutions holding their own theses in electronic form has been mooted since the early to mid nineties. Early projects often had a Scottish base and had wider dissemination of research material as a key factor in their raison d’être. An important group looking into the subject was the University Theses On-line Group (UTOG), chaired by Fred Friend. A survey they undertook showed how important theses were to those who consulted them, that authors would be happy to see their own theses more widely consulted, and that most theses were being produced in electronic form and should therefore be easily adapted to storage in an electronic form (Roberts, 1997).
One of the members of the UTOG, the Robert Gordon University, subsequently led a smaller group to look at etheses production, submission, management and access. The recommendations from that group led to the EThOS (Electronic Theses Online Service) project, which in turn is in the process of establishing itself as a service. From that service researchers will be able to freely access theses online, while deposit can be made directly into EThOS or by harvesting from institutional repositories. Digitisation of older theses can also be undertaken by the British Library as part of the service. Around the peak of BLDSC’s RDS operations in 1996-97, over 11,000 theses were supplied as loans, with more than 3,000 also being sold as copies (Smith, 1997).

Final point

With UK PubMed Central and EThOS the British Library will be making material freely available that would previously have had to be obtained via RDS. That seems to be the way that much RDS has been going. Previously it was quite expensive, took a while and had to be done via an intermediary; increasingly the documents traditionally obtained via RDS are free and available directly to users immediately. It’s an interesting turnaround, isn’t it?

Notes
1. BL & CURL presentations on the UKRR from the December 2006 meeting can be found at http://www.curl.ac.uk/projects/CollaborativeStorageEventDec06.htm
2. TCR/UnityUK: http://tcr.futurate.net/index.html
3. Talis Source: http://www.talis.com/source/

References
Anon. (2006), “Will UnityUK bring ILL harmony?”, Library + Information Update, Vol. 5 No. 5, p. 4.
Anon. (2006), “OCLC Pica/FDI and Talis set out their stalls”, Library + Information Update, Vol. 5 No. 5, p. 4.
Chad, K. (2006), “Removing barriers to create national catalogue”, Library + Information Update, Vol. 5 No. 7-8, p. 24.
Content Complete Ltd (2007), JISC business models trials: a report for JISC Collections and the Journals Working Group, available at http://www.jisc-collections.ac.uk/media/documents/jisc_collections/business%20models%20trials%20report%20public%20version%207%206%2007.pdf (Accessed 28th June 2007).
Conyers, A. (2006-2007), Analysis of usage statistics, Evidence Base, UCE, Birmingham, unpublished reports.
Froud, R. (2006), “Small price to pay for a proper inter-library lending system”, Library + Information Update, Vol. 5 No. 7-8, p. 25.
Froud, R. (2006b), “Unity reaps rewards: an integrated UK ILL and resource discovery solution for libraries”, Interlending & Document Supply, Vol. 34 No. 4, pp. 164-166.
Graham, S. (2006), “We want a unified approach to inter-library lending”, Library + Information Update, Vol. 5 No. 9, p. 25.
Green, S. (2006), “Make Unity UK freely available to boost demand”, Library + Information Update, Vol. 5 No. 6, p. 24.
Hendrix, F. (2006), “Struggle for national union catalogue”, Library + Information Update, Vol. 5 No. 6, p. 26.
ILDS Task Group (2007), Strategic business case for interlending and document supply (ILDS) in the NHS in England: recap and update on short listing of options, unpublished report.
McCall, C. (2006), “Seeking a unified approach to inter-library lending”, Library + Information Update, Vol. 5 No. 10, p. 21.
McGrath, M. (2006), “Our digital world and the important influences on document supply”, Interlending & Document Supply, Vol. 34 No. 4, pp. 171-176.
Miller, P. (2007), “Unity reaps rewards: a response”, Talis Source Blog, available at http://www.talis.com/source/blog/2007/03/unity_reaps_rewards_a_response_1.html (Accessed 7th June 2007).
Roberts, A. (1997), Survey on the Use of Doctoral Theses in British Universities: report on the survey for the University Theses Online Group, available at http://www.lib.ed.ac.uk/Theses/ (Accessed 28th June 2007).
Smith, M. (1997), How theses are currently made available in the UK, available at http://www.cranfieldlibrary.cranfield.ac.uk/library/content/download/678/4114/file/smith.pdf (Accessed 6th July 2007).

The Consolidation of Two Campus Interlibrary Loan Units into One: A Partnership between Law and Main Campus
Ruth Sara Connell, Valparaiso University, 2010
Available at: https://works.bepress.com/ruthconnell/1/

ABSTRACT. Until July 2009 Valparaiso University had two campus interlibrary loan units, one serving the main campus and the other serving the law school. In the wake of a projected 15-20% campus-wide budget cut, the two campus libraries investigated ways to work together to reduce costs. The libraries came together with OCLC and Atlas Systems, Inc., to consolidate the two separate units and ILLiad databases into one. This article discusses the reasons for merging two interlibrary loan units and the steps undertaken to combine two ILLiad systems.

KEYWORDS. ILLiad, Interlibrary Loan, Law Libraries, Merger, Consolidation, Costs

Ruth S. Connell is the Electronic Services Librarian and oversees the Interlibrary Loan Department at Valparaiso University’s Christopher Center for Library and Information Resources, 1410 Chapel Drive, Valparaiso, IN 46383 (E-mail: Ruth.Connell@valpo.edu). The author acknowledges Kevin Ford, Customer Support Manager at Atlas Systems, without whom the merger could never have taken place. The author also acknowledges David Smith of OCLC for his assistance with the LDAP conversion.

Background

Valparaiso University is located in the city of Valparaiso, Indiana, a community of 30,000 in northwest Indiana, fifty-five miles from Chicago. Valparaiso University (Valpo) is a private comprehensive institution serving approximately 4,000 students.
Most of the university’s seven colleges (arts and sciences, engineering, business, honors, nursing, and the graduate school) are served by Valparaiso’s main library, the Christopher Center for Library and Information Resources. The School of Law is served by its own library. The Christopher Center for Library and Information Resources is a four-story, 115,000 square foot building that opened in 2004. In addition to the library, the building also houses the public services arm of the campus Information Technology department (IT), although the library and IT are separately managed. The Christopher Center employs nine librarians, eleven support staff, 50 student assistants, and serves approximately 3,400 students. Comparatively, the law school library is 22,456 square feet and located within the School of Law building. The law library has six librarians, six support staff, 20 student assistants, and serves approximately 600 students. Each library is separately managed and has its own OCLC symbol. The two libraries share a catalog, but shared little else until 2009. Both libraries ran independent interlibrary loan departments.

Literature Review

The literature discussing merging of interlibrary loan (ILL) departments into other departments largely consists of single ILL departments being moved under other administrative units, such as access services or reference. Although not the same as merging two separate ILL departments into one, there are many parallels that can be made between these two disparate types of consolidation. Tribble’s (1991) article discusses the merger of the Indiana State University (ISU) interlibrary loan unit of reference with the circulation department. The library merged two units into one in an effort to improve efficiency without hiring new staff. Tribble lists a number of factors to consider when merging operations. She suggests asking the question “Do two or more units perform like functions that can be accomplished more efficiently by one unit?” (p. 151) Another consideration should be whether the staff to be combined are compatible. Regular communication with staff members throughout the process is extremely important. Staff communication is a common theme in articles concerning mergers. Tribble credits some of the success of ISU’s merger to the open communication that helped relieve some of the uncertainty caused by the change. In their 2009 article, Alarid and Sullivan discuss the University of Denver Penrose Library’s 2006 merger of the Interlibrary Loan unit with Access Services. “As with any major organizational change, the challenge lies in the integration, readjustment, and refinement of communication and workflow procedures” (p. 222). Recently, Empire State College began offering interlibrary loan and document delivery to their patrons via a partnership with the University of Buffalo (Bertuca et al., 2009). The interlibrary loan hub is based within the University of Buffalo campus and Empire pays UB for the service. As these are two separate institutions, each has its own ILLiad client, but the interlibrary loan department on the University of Buffalo’s campus has configured both clients to work on the same machines. Empire does not have one physical campus, and therefore does not have a physical library or collection.
The University of Buffalo fills requests from their collection, and if an Empire patron requests a book not held in the Buffalo collection, the book is ordered through Alibris, and once returned by the patron, added into Buffalo’s collection. This project has proven popular with Empire faculty and students. An oft-discussed topic in interlibrary loan literature is cost analysis. Many articles have been written on how to reduce interlibrary loan costs. Although the exact estimates vary, many studies have found that the most expensive interlibrary loan cost is labor. In Morris’ article (2004) entitled “How to Lower Your Interlibrary Loan and Document Delivery Costs: An Editorial”, he stated that labor costs are about 80% of total interlibrary loan costs. The same year (2004) a study by Jackson found that staff costs comprised 54% of borrowing costs and 73% of lending costs (p. 74). Both Morris and Jackson suggest looking at the level of staffing (librarian, support staff, or student workers) to see if labor costs can be reduced by altering what level of staff works on different procedures. An unknown for Christopher Center staff going into this conversation was how law patron expectations and needs differed from the rest of campus. In an article concerning law faculty services, Schilt (2007) commented that training faculty to use electronic resources was difficult. One of the librarians she surveyed commented that “faculty preferred to ask us to find things for them” (p. 193). She described the role of librarians: “Librarians dedicate time to discover and learn how to use each new resource; if the faculty took the time to do this, they would never get anything to publication. Economy of effort suggests that they should rely on our expertise and dedicate their time to thinking and writing” (p. 198). This philosophy is contradictory to the Christopher Center’s approach to faculty support, which is to assist with training on databases, but not to perform the research on behalf of disciplinary faculty. Luckily, the type of hand-holding described in Schilt’s article is not practiced at the law library either. For the most part, Valparaiso law faculty members have student assistants to help them with their research, and law librarians help train those students to use databases, but do not perform in-depth research for law faculty members. Had the law patrons’ expectations for interlibrary loan been closer to those outlined in Schilt’s article, the prospect of consolidation would have been less palatable for the Christopher Center.

Interlibrary Loan Operations: Pre-Consolidation

If you accept that labor costs comprise a majority of the cost of an ILL operation, it follows that unmediated interlibrary loan costs will be lower than mediated interlibrary loan costs because they require less staff intervention. Jackson’s study (2004) found that not only are unmediated services more cost-effective, but they are also faster and provide a higher fill rate than mediated services (p. 98). Prior to the merger, the Christopher Center utilized two methods of unmediated borrowing that the law library did not. One of these was the electronic delivery component of ILLiad called Odyssey, with the trusted sender setting turned on. Prior studies have shown this option reduces turnaround time and costs of article borrowing (Connell & Janke, 2006). The
The Connell: Page 7 of 27 other unmediated processing option in use at the Christopher Center was the Direct Request book borrowing feature of OCLC WorldCat Resource Sharing. When patrons place book requests and include an OCLC number or ISBN, and enough lenders are available within the state, those requests are sent directly to lenders without staff intervention. The Christopher Center uses the OCLC symbol IVU and the interlibrary loan department’s daily operation is run by the Interlibrary Loan Manager (ILLM), a full time position. The ILLM has several student assistants reporting to her working about 0.5 FTE. The ILLM reports to the Electronic Services Librarian who does not work in the interlibrary loan office on a daily basis except during the absence of the ILLM, but sets all procedures, maintains the ILLiad web pages, implements new technology, and handles other issues brought to her by the ILLM. In an informal survey conducted on the Interlibrary Loan listserv in 2002 regarding where the interlibrary loan unit resides within the library, the greatest number of respondents indicated it fell under Circulation or Access Services (Cheung, Patrick, Cameron, Bishop, & Fraser, 2003). The reason interlibrary loan at IVU is not housed in Access Services is historical. The ILLM position used to report to the Reference Services Librarian, but in 2005 there was a restructuring of library faculty positions and the Reference Services Librarian was tapped to fill the newly created Electronic Services Librarian position. Because she was the only librarian with interlibrary loan and ILLiad experience, the department moved with her. Connell: Page 8 of 27 Although the IVU interlibrary loan department has never been a part of the Access Services umbrella, Access Services staff does assist by scanning lending articles overnight so that they can be delivered in the morning. As shown in a study by a team of University of Arizona librarians, one of the main problems contributing to slow turnaround time was processing delays during evening and weekend hours (Voyles, Dols, & Knight, 2009). This is another area where the Christopher Center had a slight advantage over the law library because of a few hours of weekend student labor and evening scanning help from Access Services. The IVU interlibrary loan department uses ILLiad on a hosted OCLC server. It has its own contract for the statewide courier service and has been using both Ariel and Odyssey for electronic delivery of articles since 2005. During the July 2008 to June 2009 fiscal year, IVU received 8,028 borrowing requests and filled 6,626 (83%) (see Table 1). On the lending side, 6,166 requests were received and 4,369 (71%) were filled. Average borrowing turnaround time was 3.39 days for articles and 8.6 days for loans (see Table 2). Lending turnaround time was 13.89 hours for articles and 22.29 hours for loans. The law library uses the OCLC symbol IVZ and its interlibrary loan department’s daily operation was also run by a staff member with the title Interlibrary Loan Manager, but this was only a 0.5 FTE position. The IVZ manager had no student assistants and reported to the Cataloging Services Librarian. IVZ had its own separate hosted ILLiad subscription, its own statewide courier service contract, and its own Ariel license. IVZ was not using Odyssey. Connell: Page 9 of 27 During the 2008 to 2009 fiscal year, IVZ received 394 borrowing requests and filled 321 (81%) (see Table 1). 
During that same time period they received 1,237 lending requests and filled 588 (48%). Average borrowing turnaround time was 6.14 days for articles and 8.96 days for loans (see Table 2). For lending, turnaround time was 1.47 days for articles and 7.31 days for loans. Some of IVZ’s items are held in the Christopher Center’s Automated Storage and Retrieval System (ASRS) which contributed to increased lending turnaround time since those items had to be shipped across campus before IVZ could process them for delivery. Fiscal Challenge In 2008, all departments and colleges within Valparaiso University were charged with brainstorming scenarios to reduce costs by 15-20% because of predicted budget cuts. The leadership of the two campus libraries decided that one possible cost savings measure would be to combine the two interlibrary loan departments. Anticipated cost savings included moving from two ILLiad licenses, hosted servers, sets of staff, and state courier contracts to one. Because there were a number of logistical concerns, this proposal was vetted during an inquiry period starting in December 2008. The Christopher Center’s Electronic Services Librarian took the lead on this project and contacted OCLC to find out what costs would be involved with the merger, and to ask questions about how operations might be affected as a result of such a combination. Based on that initial consultation with OCLC, it was determined Connell: Page 10 of 27 that consolidation would be feasible. The libraries submitted their budget reduction proposals with the inclusion of the interlibrary loan consolidation. In April 2009, the libraries received word that their respective budgets had been cut based on the assumption that the interlibrary loan departments would be combined. At the same time, the Christopher Center lost one part time and four full time positions through voluntary and involuntary severance, and the law library eliminated one full time position. Although the consolidation of interlibrary loan departments was not mandated from above, if the libraries had chosen not to work together, they would have had to make cuts elsewhere to equal the cost savings the consolidation would provide, possibly resulting in further staff cuts. Consolidation Consideration In April 2009, the leadership of the two libraries met to discuss specific details of the proposal. The decision was made to retain both OCLC symbols. As the larger system, IVU would take over IVZ’s operation, almost like an outsourced operation. This method of consolidation negated the staff compatibility concern raised in other interlibrary loan mergers (Tribble, p. 151). Although there would not be any full time interlibrary loan staff at IVZ, a student assistant would be hired to work daily to pull lending requests; scanning article requests and putting loan requests in campus mail. On April 23rd, the Electronic Services Librarian received notification from IVU’s leadership to begin working with IVZ on the merger. She quickly scheduled a meeting to be held on April Connell: Page 11 of 27 30th for interested parties from the two interlibrary loan units to discuss the merger. In preparation for the initial meeting, the Electronic Services Librarian contacted OCLC to let them know about the consolidation, talk about changing authentication methods, and establish a general timeline and plan of action. 
It was assumed that when IVU took over IVZ’s operation, IVZ patrons would be entered as new users in IVU’s database, therefore losing their request history. During late April, an OCLC representative informed IVU’s Electronic Services Librarian of another option. For a fee, IVZ’s ILLiad database could be merged with IVU’s so that IVZ would not lose any data. Atlas Systems would have to perform this operation, and a quote was provided. Taking into consideration the authentication change and a database merger, the OCLC representative estimated an end-of-June initial target date. The day before the meeting, the Electronic Services Librarian e-mailed all of the proposed attendees from both libraries with the information about the database merger option so that everyone would have time to digest the information. Shortly thereafter, an IVZ representative e-mailed back that they had not yet decided whether they wanted to move ahead with the consolidation, so a discussion of database merger was premature. This revealed a miscommunication. IVU thought the purpose of the meeting was to discuss the impending consolidation of departments, while IVZ thought it was to discuss whether the consolidation was feasible. This miscommunication led the meeting in a different direction. Much of the meeting was devoted to addressing concerns from both sides, although logistics were still discussed.

Issues & Concerns

Before the meeting, IVZ had given IVU access to their ILLiad reports, which revealed that in the previous year IVZ had filled as many requests (909) as an average month at IVU (10,995/12 = 916.25). Because of the relatively small size of IVZ’s operation, IVU’s staff was comfortable assuming the extra volume, but remained concerned about issues such as transportation of items between campus libraries and separation of billing. Both libraries agreed that if the consolidation were accepted, campus mail would be used to transport items between buildings. The Electronic Services Librarian was sufficiently confident that ILLiad’s billing manager would be up to the task of handling separation of billing for IVU and IVZ. Overall, IVU staff felt that a consolidation of departments would require considerable time for initial setup, but once everything was in place, would not require much extra effort to maintain. IVZ also had a number of concerns that they addressed in the meeting. A major concern was that the consolidation would not save them much time if they were still expected to verify citations, gather and copy articles, and gather lending loans to mail to IVU. IVU responded that they could provide interlibrary loan assistance to IVZ patrons over the phone, via e-mail, or in person if patrons stopped by the Christopher Center. However, Christopher Center staff would not be able to provide on-site support within the law library. It was thought that based on the volume, a student assistant working within the law library could handle pulling loans and sending them via campus mail, as well as pulling articles to scan and deliver electronically.
The prediction was that the consolidation would free the time of the 0.5 FTE Interlibrary Loan Manager to devote to other work within the law library. Another major concern for IVZ was the Valparaiso University Law Review. The editors of this journal use interlibrary loan services heavily and require special dispensation. They often require multiple renewals for loans, and also need an account that all law review staff members can share. IVU was considering switching authentication systems at the same time as the merger and could not address the shared account issue during the meeting, but later learned from OCLC that even with the new authentication system, such an account could be created. In addition, IVZ had a list of lenders they would not borrow books from for law review because those lenders did not allow multiple renewals. IVU agreed that if the merger went ahead, this list would be used to order books for the law review account. IVZ was also concerned about possible increased turnaround time for their patrons due to a suggestion to have all interlibrary loan deliveries go to IVU for processing and then be transported to IVZ via campus mail, which takes about a day. IVZ was still receiving many, and delivering all of their articles in print format, while IVU had moved almost exclusively to electronic delivery in 2004. If the consolidation went forward, IVU promised that all articles for both libraries Connell: Page 14 of 27 would be delivered electronically, eliminating the issue of campus mail for articles. The previous year IVU’s turnaround time for articles had been 2.75 days faster than IVZ’s, and IVU promised similar turnaround time for articles if the proposal were accepted (see Table 2). Regarding borrowing loans, IVU and IVZ’s turnaround times for the previous year were similar: 8.60 days versus 8.96 days respectively. The additional day required for campus mail would likely slow loan turnaround time a bit, but IVZ’s overall borrowing turnaround time when factoring in both loans and articles would decrease. In addition, IVU thought they could decrease IVZ’s lending turnaround time since IVZ was pulling many journals from their stacks to scan and deliver, when those articles were available electronically in PDF format through databases with licenses allowing for interlibrary loan lending. IVU offered that they could provide these other benefits to IVZ if the consolidation went forward: • Deliver loans via campus mail to the offices of IVZ faculty and staff (and accept return delivery the same way), so IVZ staff would not have to handle those items • Help train a student assistant to pull, scan, and send IVZ materials to IVU • Provide interlibrary loan statistics to IVZ as requested A concern of both libraries was that if an agreement were reached, that there would be sufficient time to complete the project early enough during the summer to allow for testing, trouble shooting, and familiarizing staff with new procedures before the fall semester began. Connell: Page 15 of 27 At the end of the meeting, it looked promising that the consolidation would go forward, but IVZ’s leadership did not give the green light to move ahead with the consolidation until almost two weeks later, on May 13th. Only at that time did the Electronic Services Librarian begin to work with OCLC on the technical aspects of the consolidation. LDAP Conversion Throughout the consolidation process, there were many setbacks, and one of the biggest frustrations was the LDAP conversion. 
During an initial conversation with OCLC, IVU inquired about switching from ILLiad authentication to Lightweight Directory Access Protocol (LDAP) authentication. IVU had been considering LDAP authentication for several years as it was the authentication system used by the rest of campus and would allow patrons to log into ILLiad with the same username/password combination they used for all other campus systems. IVU had not proceeded with implementation because of complications LDAP would have had separating out law patrons; possibly causing more problems than it solved. With the merger, LDAP became an attractive possibility, especially once it was confirmed that even with LDAP authentication we could manually create non-LDAP ILLiad authentication accounts for the law review or proxy borrowers (ex. student assistants placing requests for professors). It was expected that once all the inputs were received from Valparaiso University’s Information Technology (IT) department and those were entered into the ILLiad Customization Manager, LDAP would work. In fact, the IT Connell: Page 16 of 27 department inadvertently provided one incorrect line of code which caused authentication to fail, and it took eight working days of back and forth between the Electronic Services Librarian, the campus IT department, and OCLC (including numerous hours on hold) to diagnose the problem. Once the code was fixed on June 11th, there was one other unanticipated issue with LDAP authentication. Earlier versions of ILLiad had assigned new registrants with the AuthType of Default. For those patrons, LDAP authentication worked seamlessly after the switch. Patrons who had registered after the library had upgraded to later versions of ILLiad had been assigned an AuthType of ILLiad, and therefore could login under their old username/password combination after the switchover but not their LDAP authorization. There were literally hundreds of patrons with the incorrect AuthType, so OCLC ran a rapid update on the patron database to switch everyone with the AuthType ILLiad to Default. The LDAP conversion took ten days to complete, approximately nine days longer than expected. Database Merger Meanwhile, on June 1st, IVZ’s leadership notified the Electronic Services Librarian that they wanted to pay the extra money for the database merger so their patrons’ history would not be lost. This message was immediately forwarded onto OCLC. On June 12th the Electronic Services Librarian contacted OCLC to ask for an update on this issue. After an investigation taking another five days, it was discovered that OCLC had failed to notify Atlas Systems that the Connell: Page 17 of 27 merger would take place. Once Atlas Systems was finally notified, they contacted the Electronic Services Librarian with information about moving ahead, but 16 days were lost due to communication issues. In order to begin the merger, Atlas Systems needed to take both libraries’ ILLiad systems down for a period of time, predicted to last half a day, in order to convert IVU’s database into a Shared Server ILLiad system and add an IVZ portion. At this initial stage, the IVZ portion of the Valparaiso shared server would be empty and not yet operational. On June 19th Atlas Systems and IVU worked together to find possible times for the initial shared server conversion and settled on June 30th or July 1st. Atlas Systems contacted OCLC to ask for administrative level access to IVU’s OCLC hosted server. 
Ten days later OCLC responded with a message to Atlas Systems saying security procedures had changed, and they could not provide Atlas Systems access. On July 1st, Atlas Systems e-mailed the Electronic Services Librarian to say that they were having “spirited discussions with the folks at OCLC”, but were working to move things along. Meanwhile, on July 1st, IVZ’s state courier contract ended, and so all law school items started to be delivered to IVU and then had to be shipped over to IVZ for processing before being delivered to patrons or reshelved. On that same day, IVZ’s ILLiad license expired, but OCLC agreed to extend their ILLiad service for no additional cost until the switchover could be completed. To review the timeline, in April OCLC had given IVU/IVZ an initial target date of “end of June.” Both libraries were hoping to have the conversion complete by early July in order to have plenty of time to train staff and overcome Connell: Page 18 of 27 unanticipated issues with the new procedures before the start of school in late August. Sixteen days had been lost in June due to communication issues between OCLC and Atlas Systems, and because of changes in security procedures, weeks were passing without any progress. Because of all of these delays, on July 7th Valpo gave OCLC and Atlas Systems a deadline of the end of July to complete the merge. If that deadline could not be met, the order to merge the databases would be cancelled. The libraries decided that it was more important to get IVZ up and running on the shared ILLiad server without their history than to have their history and not have enough time prior to the beginning of the fall semester. OCLC and Atlas Systems agreed to the July 31st deadline. On July 9th, OCLC notified IVU’s Electronic Services Librarian that IVZ needed to upgrade to ILLiad 7.4 (IVU was already on 7.4) before the first stage of the merger could take place; IVZ took the first possible upgrade slot on July 13th. After the upgrade the initial downtime for the shared server conversion could be scheduled and was set for July 17th. When the day of the shared server conversion arrived, both IVU and IVZ’s staff side operations were taken down while OCLC completed the process. The customer web pages were not taken down at this time. After three hours, OCLC notified both libraries that they should be up and running. Although IVZ’s system recovered successfully from this operation, IVU’s did not. OCLC and IVU tried to troubleshoot, but it was a Friday afternoon so vendor staffing was limited and the problem was not able to be solved before the weekend. On Monday morning, OCLC notified Atlas Systems that IVU’s system had been down since Connell: Page 19 of 27 Friday, and Atlas Systems responded immediately and made some changes that fixed the system. At that time, Atlas Systems also let the Electronic Services Librarian know that IVU’s ILLiad customer web pages URL had changed. This was the first time IVU was informed the URL would be changing. It required an update of multiple website links as well as changing all interlibrary loan OpenURL links in vendor databases. Both IVU and IVZ had been under the impression that they would share one set of web pages at a base Valpo URL, but OCLC and Atlas Systems set them up for two separate sets of web pages with /IVU and /IVZ extensions. 
It would have been possible for IVU and IVZ to share web pages at the IVU base URL, but this information was communicated to IVU after the URL conversion had already taken place and the work to switch all links over had already been completed. IVU already had a set of web pages that would suffice with very little modification. IVZ, however, was on the pre-7.0 interface and needed extensive customization. During the shared server conversion a copy of IVU’s pages had been placed in the IVZ web folder, so the Electronic Services Librarian modified the IVU pages to make them appropriate for IVZ. An example of a modification that was made for IVZ was the inclusion of an explanation of how to request items to be delivered to the law library as opposed to held for pickup within the Christopher Center. IVZ’s web pages were created in advance of the merger so that they would be ready to upload as soon as the merger was complete. The final merger of IVZ’s database into IVU’s was scheduled for July 30th; one day before the project completion deadline. As this would require downtime Connell: Page 20 of 27 of both the staff side and customer web pages, announcements were made to campus that ILLiad would be down for an undetermined portion of the day. In preparation for the merger, Atlas Systems ran a report to find usernames that were duplicated in IVU and IVZ’s databases. In all cases, these duplicate usernames identified individuals who had registered in both systems; there were no cases of different people signing up with the same username. In many cases, these were former undergraduates who had registered within IVU and then moved onto law school and registered with the same username within IVZ. Valparaiso also offers some dual degrees involving law and other graduate programs (ex. JD/MBA) which resulted in some of the duplicates. Atlas Systems sent this list to IVU and IVZ and asked them to identify with which location each student should be affiliated. IVU and IVZ worked through the list, sent the results to Atlas Systems, and Atlas Systems merged duplicate accounts and affiliated each person with the correct symbol. Once it began, the merger took Atlas Systems 4.3 hours after which Valpo was contacted to begin testing on staff side operations, which took another three hours. Once complete, Atlas Systems turned the customer web pages back on. The next step of testing was to make sure that all patrons could log into the combined system, and for IVU patrons, there were no problems. The Electronic Services Librarian had a previously gathered list of IVZ patrons who had agreed to help with testing. She contacted several of them and asked them to log in. None of them were able to log in successfully. After some back and forth with both OCLC and ILLiad, a relatively simple fix was discovered. The WebAuth key Connell: Page 21 of 27 in the Customization Manager on IVZ’s side of the server had defaulted to ILLiad and had to be switched to LDAP in order to work. We discovered minor problems with authentication, email, and copyright clearance but these were resolved with the assistance of Atlas Systems and OCLC. Finally, on July 30, 2009, the merger was complete, and after troubleshooting everything was fully functional a day later. Although there were plenty of problems along the way, Atlas Systems and OCLC were able to meet the end of July deadline. Conclusion While the combined Valparaiso University interlibrary loan department is still young, it appears to be a success. 
In the two month period following the consolidation, IVZ’s lending fill rate went up (for both articles and loans) and borrowing turnaround time dropped by approximately four days for articles. Borrowing turnaround time for loans is up slightly, but overall law patrons and other libraries borrowing from IVZ are benefiting from this merger. In terms of cost, the university will certainly save money on services, equipment, and more importantly, labor. Two months after the consolidation, the heaviest IVZ users (those with more than five requests) were asked for their feedback regarding the change. The feedback was largely positive. One person stated that “the electronic delivery of articles is a nice improvement.” Another wrote, “The experience has been excellent -- no problems at all with books or articles.” A third person wrote, “My experience with ILL has been both positive and negative.” This person had two Connell: Page 22 of 27 main complaints. The first had to do with cancelled requests for items held in the law library’s microfiche collection. Because of this comment, it was decided that in the future, requests for items held in the microfiche collection would not be cancelled, but submitted. The second complaint was regarding cancelled requests supposedly available online through Westlaw and Lexis databases. Because access to those two databases is restricted to the law community, the interlibrary loan department did not have access to verify that material that appeared to be available according to the OpenURL link resolver actually was available. Normally when patrons request articles that the link resolver says are available, the ILLM verifies the content actually is available before canceling those requests. This patron’s comments highlighted the need to be able to do this for law restricted databases, so the interlibrary loan staff asked for and received username/password access to law databases for the sole purpose of verifying content. A half-time Interlibrary Loan Manager staff position at law has been replaced by a student employee working about 11 hours a week. The Interlibrary Loan department is now receiving 10 hours a week of assistance from Access Services that it had not in the past, but that is due to a restructuring of Access Services, not the consolidation of IVZ into IVU. Even with the extra assistance from Access Services, fewer Valparaiso University non-student staff hours are being devoted to interlibrary loan than before the merge. IVZ’s labor costs have dropped significantly because the student employee retrieving items at IVZ earns considerably less than the staff member who used to do those tasks. It should be noted that IVZ’s Interlibrary Loan Manager did not lose employment during the Connell: Page 23 of 27 consolidation process; she was transferred to another department within the law library. As noted in the literature review, communication is an issue that appears frequently in articles regarding mergers, and communication was a major issue during this consolidation. Many people fear change, and with regular communication, some of these fears can be allayed. Some law library staff members in particular had many concerns going into this process, but through regular conversations those issues were successfully addressed. Though it was anticipated that communication between library staff members might be an issue, we did not anticipate that vendor communication would consume so much time and energy throughout this process. 
There were five distinct entities involved from beginning to end: IVU, IVZ, Valparaiso’s IT department, OCLC, and Atlas Systems. It proved a major challenge to receive timely responses from all interested parties. If other institutions are considering a similar consolidation, time delays of this nature should be factored into the schedule. Although this project required a considerable amount of staff time over the summer of 2009 for implementation, the expected long-term cost and efficiency savings made this project worthwhile. The labor savings will accrue yearly, as well as software savings from eliminating duplication with ILLiad and state courier savings with the elimination of one contract. It is impossible to say what would have happened had the consolidation not taken place, but it is likely that it prevented the loss of other positions during this period of retrenchment. With expected long-term decreases in law school turnaround time and increased fill Connell: Page 24 of 27 rate, this consolidation of two separate interlibrary loan departments into one was a resounding success. REFERENCES Alarid, T., & Sullivan, C. (2009). Welcome to the Neighborhood: The Merger of Interlibrary Loan with Access Services. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 19(3), 219-225. Bertuca, C., Lelonek, C., Tuohy, R., Ortner, J., Bouvier, A., Dithomas, S., Hayes, S., Morehouse, S. (2009). Two ILLiad Clients, One Desktop, Purchase on Demand: Sharing a University's Collection, Staff, and Expertise. Journal of Access Services, 6(4), 497-512. Cheung, O., Patrick, S., Cameron, B., Bishop, E., & Fraser, L. (2003). Restructuring the Academic Library: Team-Based Management and the Merger of Interlibrary Loans with Circulation and Reserve. Journal of Interlibrary Loan, Document Delivery & Information Supply, 14(2), 5-17. Connell, R., & Janke, K. (2006). Turnaround Time Between ILLiad's Odyssey and Ariel Delivery Methods: A Comparison. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 16(3), 41-55. Connell: Page 25 of 27 Jackson, M. (with Kingma, B., & Delaney, T.). (2004). Assessing ILL/DD Services: New Cost-Effective Alternatives. Washington, DC: Association of Research Libraries. Morris, L. (2004). How to Lower Your Interlibrary Loan and Document Delivery Costs: An Editorial. Journal of Interlibrary Loan, Document Delivery & Information Supply, 14(4), 1-3. Schilt, M. (2007). Faculty Services in the 21st Century: Evolution and Innovation. Legal Reference Services Quarterly, 26(1), 187-207. Tribble, J. (1991). Merging library operations. Library Administration & Management, 5(3), 151-154. Voyles, J., Dols, L., & Knight, E. (2009). Interlibrary Loan Meets Six Sigma: The University of Arizona Library's Success Applying Process Improvement. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 19(1), 75-94. Connell: Page 26 of 27 Table 1 Requests Received and Filled during 2008-2009 Fiscal Year IVU IVZ Received Filled Received Filled Borrowing 8,028 6,626 394 321 Lending 6,166 4,369 1,237 588 Total 14,194 10,995 1,631 909 Connell: Page 27 of 27 Table 2 Turnaround Time for 2008-2009 Fiscal Year IVU IVZ Borrowing: articles 3.39 days 6.14 days Borrowing: loans 8.60 days 8.96 days Borrowing: overall 6.34 days 8.34 days Lending: articles 13.89 hours 1.47 days Lending: loans 22.29 hours 7.31 days Lending: overall 18.70 hours 5.14 days Valparaiso University From the SelectedWorks of Ruth S. 
Analysis of the use of open archives in the fields of mathematics and computer science
Anna Wojciechowska*
* GERSIC: LVIC - University Aix-Marseille 3, France, annaw@cmi.univ-mrs.fr
HAL Id: sic_00131187, https://archivesic.ccsd.cnrs.fr/sic_00131187 (submitted on 15 Feb 2007)
To cite this version: Anna Wojciechowska. Analysis of the use of open archives in the fields of mathematics and computer science. OCLC Systems & Services, 2007, 23 (1), pp. 54-69.

Open access to Scientific and Technical Information (STI) is becoming more and more important. The Open Archives Initiative (OAI) has developed to the point that the principle is now accepted as an essential component in the communication of scientific results and in the publication of knowledge. Such free access to STI can take two forms:
· self-archiving of articles, generally on the Web, and in particular in digital archives and repositories with public access (open archives),
· publication in open access (OA) journals.
In 2001, as part of a joint project with the CNRS [1] and the INRIA [2], in partnership with the MathDoc [3] unit, the CCSD [4] developed “Hal” [5], a tool for scientific communication between researchers which, within the framework of the Open Archive Initiative (OAI), was aimed at promoting institutionalised self-archiving of research results in open access archives. For mathematics (and other disciplines), whenever the corresponding sub-discipline exists in “ArXiv” [6], all recent documents are automatically transferred from Hal into ArXiv. The CNRS, Inserm [7], INRA [8] and INRIA signed a joint declaration promoting open access to Scientific and Technical Information (Berlin, October 2003). On March 22nd, 2005 they signed a common policy agreement to develop open archives with a common administrative and technical management. The researcher is at the centre of the archiving system. This enquiry aims at studying the requirements and use of these open archives by researchers.

Notes
1. National Centre for Scientific Research
2. French Institute for Research in Computer Science
3. National Centre for Documentary Coordination in Mathematics
4. Center for Direct Scientific Communication (CCSD/CNRS)
5. The Hal (“Hyper Article en Ligne”) archive provides authors with an interface enabling them to file manuscripts of scientific articles in every discipline in the database of the CNRS Center for Direct Scientific Communication (CCSD/CNRS).
6. http://arxiv.org/
7. French Institute for Health and Medical Research
8. French Institute for Agricultural Research

Presentation

A questionnaire was sent to part of the mathematical and computer science community in France by several libraries of the National Group of Mathematics Libraries.
The fields (mathematics and computer science) were chosen because:
· there is currently much discussion about open archives in human and social sciences (drawing comparisons with what is done in science, technology and mathematics),
· these fields are of direct concern for the author of this report – a librarian in a mathematics and computer science library who wanted to check whether the library’s users were aware of the existence of open archives, and if so, how they used them.
The first version of the questionnaire was tested in May 2005 with some researchers of the Laboratory of Analysis, Topology and Probability in Marseille (LATP). After several discussions some questions were reformulated or removed, others were added. Early in July 2005, the revised version was sent to the members of other laboratories in Marseille. After some further corrections, the final version of the questionnaire was sent by e-mail in November 2005 to mathematicians and computer scientists via several libraries of the RNBM (the National Group of Libraries in Mathematics). The librarians of the RNBM distributed this questionnaire to their users. The number of people reached by the questionnaire is estimated at 2,200; of these, 128 persons answered.

Participation

Twelve research centers participated in this survey: Besançon, Bordeaux, Clermont-Ferrand, Grenoble (Imag and the Fourier Institute), Marseille, Nancy, Paris (Jussieu and Orsay), Rouen, Strasbourg. The 128 persons who participated in the survey were essentially research lecturers and CNRS researchers, as shown in table 1:

Table 1
Participants         Percentage
CNRS researcher      20.30%
Research lecturer    56.30%
Other                23.40%

The productivity of a researcher is not the same at the beginning, middle or end of his career. Neither is the use of new technologies to access information. For this reason we included statistics concerning the age of the participants to try to reveal differences in use based on age. Almost 70% of participants were less than 40 years old. Participation in the investigation according to the age of the respondent is presented in table 2.

Table 2
Age                  Percentage
< 30 years old       30.50%
30-40 years old      37.50%
40-50 years old      13.30%
> 50 years old       18.70%

Almost half of the participants in the survey said that they knew the term “open archives” or had heard of these archives.

[Figure: Knowledge of the term “open archive” (yes/no), by age group]
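Since the tables report only percentages, a short sketch of how such columns are derived from raw counts may be useful. The counts below are back-calculated from the published percentages of the 128 respondents, so they are approximations made for illustration, not figures from the survey’s raw data.

```python
# Back-calculated illustration of how the percentage columns in Tables 1
# and 2 are derived. Counts are inferred from the published percentages
# (128 respondents), so they are approximations, not raw survey data.

from collections import Counter

N_RESPONDENTS = 128

status = Counter({
    "CNRS researcher": 26,    # ~20.3% of 128
    "Research lecturer": 72,  # ~56.3% of 128
    "Other": 30,              # ~23.4% of 128
})

for group, count in status.items():
    print(f"{group}: {100 * count / N_RESPONDENTS:.2f}%")
```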
It seemed interesting to know how researchers learned of the existence of open archives, what motivated them to self-archive their articles, and what they knew about copyright and open access (OA) journals. The probable sources of information on open archives are presented in table 3. Altogether 110 persons answered the question "How did you learn of the possibility of archiving your publications in institutional open archives?"

Table 3
Source of information about open archives    Percentage
colleagues                                   42.20%
information from the library                 15.60%
other                                        10.00%
information from CNRS                         9.40%
debate about open access                      7.80%
co-authors                                    7.00%
information from the Ministry                 0.80%
information from the university               0.80%

"Colleagues" are a source of information about the existence of open institutional archives for 42% of researchers. "Other sources" includes this questionnaire and the newsletters of the INRIA.

[Figure: Source of information about open archives, by age band]

Information search

Researchers are both readers and authors of articles. For this reason, the first part of the questionnaire concerned researchers as readers and their access to work-related information, especially the search for bibliographical references and for full-text articles, both past and recent. It seemed interesting to know where researchers find such information, how they search for it, whether they need assistance from librarians, how they use electronic information, how often they consult electronic articles, and how recent the consulted articles are. To the question "Where do you obtain the articles you need?" the researchers could give several answers (table 4).

Table 4
Source of articles    Percentage
Department library    75.80%
Databases             56.30%
Springer Link         45.30%
ScienceDirect         39.80%
Editors' websites     33.60%
Other                 32.00%
University library    27.30%
INIST                  5.50%

(Databases: MathSciNet, http://e-math.ams.org/mathscinet/, and Zentralblatt, http://www.emis.de/ZMATH/, with subscription access paid for by libraries. Springer Link: subscription access to the full text of Springer, Birkhäuser and Kluwer journals. ScienceDirect: Elsevier's journals in full text, paid for by universities. INIST: National Institute for Scientific and Technical Information.)

Even if 76% of respondents find the articles (or their references) in their department's library, bibliographic databases are quite heavily used; 56% of respondents have already used bibliographical databases as a source of information. Journals with full text online (with subscriptions paid by the library or university) are consulted more and more: 45% of researchers find articles in Springer Link and 40% in ScienceDirect. 32% give other sources of articles, like personal Web pages, ArXiv, Hal, other online subscriptions like Jstor (The Scholarly Journal Archive, http://www.jstor.org/) or CiteSeer, or direct contact with the authors. When it comes to searching the Web for research articles with open access to the full text, a majority of researchers use Google (66%) and ArXiv (66%). Some of them use Emis (a site that lists electronic journals and hosts home pages or mirrors of many of them), personal pages or Numdam (a database of digitized mathematics). Google indeed makes it possible to find personal Web pages, as well as the pages of the laboratories or of the libraries locally storing scientific publications, and it also enables direct access to full-text articles. The use of ArXiv means access to e-prints in particular, though some published articles can also be found there.
[Figure: Sources of articles (department library, databases, Springer Link, ScienceDirect, editors' websites, other, university library, INIST)]

Answers to the question concerning access points to e-preprints confirmed familiarity with Hal and ArXiv (for 58%), but 77% of researchers access electronic preprints via personal pages. Access via laboratory or library sites was indicated by 18.8%, which means nearly one researcher out of five knows that a local archive exists and is able to access it directly. More than 80% of the respondents find the articles they need without any difficulty, independently of the age of the researcher. Researchers usually ask librarians to find the references or publications they need, but the possibility of finding information online has made this easier. So it seemed interesting to know whether these articles are still accessed with the assistance of librarians (table 6).

Table 6
Ask for help    Percentage
always            0.00%
often             5.50%
sometimes        58.60%
never            35.20%

[Figures: Access to the full text (Google, ArXiv, Hal, other, Emis, any); Access to articles (often/sometimes/always/never), overall and by age band]

Almost 60% ask for the librarian's help only sometimes, and 35% do not need any help. Full-text articles online are recent (usually published after 1995), but more and more older articles, digitized within the framework of various local, national or international projects, are also online in open access. If researchers do not need help in retrieving information, it is because almost 40% of them use articles which were published during the last 10 years, i.e. articles which are for the most part available either online in full text or in paper version in libraries. E-publications are consulted by almost 60% of those questioned once per week or more. Almost all (93%) consult preprints online but, as we shall see later, not so many deposit their own preprints online in open access.

Publications

This part analyses the number of researchers' publications available, how they are filed in open archives, the types of filed articles, etc. A publication can take the form of a preprint, i.e. a written text without peer review or editorial review, or of a postprint, the final published version of an article. 46% of the questioned researchers publish a maximum of one article per year, but these are mostly recent PhDs or young lecturers; another 46% of those questioned said they publish 2-3 articles per year. The publication of an article generally corresponds to an advance in a research project, and it is on the basis of these articles that the researcher is evaluated and financed. For those researchers who answered the questionnaire, the main reason for publishing their work was to communicate their research findings to the scientific community (table 7).
Table 7
Aim of publication                            Percentage
Communication of results to the community     86.00%
Career advancement                            37.50%
Personal prestige in the domain               17.20%
Increased possibilities for future funding     9.40%
Other                                          5.50%

[Figures: Number of publications per year (0-1, 2-3, 4-5, more than 5), overall and by age band]

Each respondent could give several answers. Among the other answers are scientific curiosity, the wish for the results to be recognised and accepted, and the wish to clearly formulate the results obtained.

Experience in self-archiving

Self-archiving consists in posting an electronic document (generally an article, for a researcher) on a website, where it can then be consulted freely by all. This is done in order to maximize the visibility of, and accessibility to, the results. There are three ways a researcher can provide open access to articles:
· he can deposit a copy of an article on a personal or institutional website,
· he can place it in an institutional open access archive (such as Hal),
· he can put it in a disciplinary open access archive (such as ArXiv).

Scientific publications are mainly posted online by the authors and/or the co-authors (table 8), but some of the authors who publish between two and four articles per year do not archive them (the articles are sent directly to the publishers). The documents posted by the administrative staff of the laboratories usually do not contain full text; they are merely postings of bibliographical notes.

Table 8
Author of deposit           Percentage
You                         74.20%
Your co-author              10.90%
Staff of your laboratory    11.70%

To the question "What kind of documents do you put into open archives?" (table 9), 65% of those questioned posted postprints, and 77% preprints. Among other types of publications we find errata and files of conference talks (e.g. PowerPoint files).

Table 9
Types of publication deposited    Percentage
refereed article                  64.80%
congress paper                    22.70%
preprint                          77.30%
technical report                  13.30%
book chapter                       6.00%
dissertation, thesis, etc.        23.40%
course notes                      18.00%
exercises                         18.00%
other                              2.30%

73% do not fear "plundering" or improper use of their e-prints, which corresponds (at least) to the number of depositors of preprints in open archives. Among those who archive their publications in institutional open archives, 55% do so as a matter of principle, and 25% because the archives are there. To the question "How many articles did you deposit in open archives over the last three years?", some researchers declared having posted articles on personal websites; yet, among the researchers who did not deposit any article, most did not have personal Web pages. This question was divided into several sub-questions, making it possible to evaluate separately the posting of preprints and of peer-reviewed articles on personal and institutional sites. Some people did not answer these questions: either the posting was made by the co-author or the administrative staff, or quite simply the researcher did not give an answer (see table 8). Self-archiving respondents were asked what their original motivation was for self-archiving their work. Researchers could give several answers, which means that those who self-archived articles on their own websites probably also did so (at least partly) in institutional archives.
Thus, the total percentages do not indicate the true level of self-archiving, especially as the articles could be deposited by other people (co-authors, etc.). Below are the details about self-archiving.

Preprints on personal websites: 66.4% of respondents posted at least one preprint on a personal site. Among the 29.7% who did not post anything, the majority consists of PhD students and researchers over 50.

Preprints on laboratory or library websites: 36.7% of respondents posted at least one preprint on the site of the laboratory or the library.

Preprints on Hal: 18.6% of respondents filed at least one preprint on Hal. Among them, 87.5% filed at least two preprints on Hal.

Preprints on ArXiv: 33.6% filed at least one preprint on ArXiv. Among them, 79% filed at least two preprints.

Postprints on personal websites: 63.3% of respondents posted at least one refereed article on a personal website.

Postprints on laboratory or library websites: 17.2% of respondents posted at least one refereed article on the site of the laboratory or the library.

Postprints on Hal: 13.3% of respondents filed at least one refereed article on Hal. Among them, 82.4% deposited at least two refereed articles.

Postprints on ArXiv: 16.4% of respondents filed at least one refereed article on ArXiv. Among them, 71.4% filed at least two refereed articles.

[Figure: Self-archiving of publications (at least one preprint / at least one postprint archived) on personal sites, laboratory sites, Hal and ArXiv]

Answers were given by only 128 people, but a look at the Hal site shows that there are fewer publications in mathematics and computer science than in physics, for example (as of 31/12/05 the number of publications filed on Hal was: physics, 8,479; mathematics, 2,181; computing, 1,736).

It then seemed interesting to determine the opinion of researchers on the ergonomics of Hal and ArXiv and on the procedure of self-archiving. 124 researchers gave an answer on the use of Hal and 127 on the use of ArXiv. The findings show that among those who gave an opinion on the use of Hal, 62.5% find it easy. Among those who gave an opinion on the use of ArXiv, almost 50% find it easy and 31.3% very easy. Among the "no opinion" answers there is a certain number of people who do not deposit their articles.

[Figures: Ease of use of Hal and of ArXiv (very easy / easy / somewhat difficult / difficult), overall and by age band]

The next questions concerned the time necessary to self-archive an article. On Hal, the first deposit takes less than 30 minutes for 34.5% of respondents, and subsequent deposits take less than 15 minutes for 81.5% of them. On ArXiv, the first deposit takes less than 30 minutes for 41.3% of respondents, and subsequent deposits take less than 15 minutes for 73.3%. The directors of the different scientific departments of the CNRS asked the laboratories to deposit a copy of all their published articles in an institutional repository (Hal), but the CNRS did not make this compulsory.
The question "What would be your reaction if your employer (CNRS or Ministry of Education) required you to deposit your publications in an institutional repository?" was deliberately provocative, and the vast majority (74%) of respondents said they would comply willingly with this requirement. Only 7.80% said they would comply reluctantly, and 6.20% said they would not do so.

Copyright awareness

An author can file in open archives any type of document of which he is the intellectual owner. This concerns documents already published or in press, documents in the course of scientific validation (preprints), or working papers. Authors can make digital postprints available when:
· they have retained copyright and granted only non-exclusive rights to the publisher,
· they have transferred all rights to the publisher, but the publisher's policy permits authors to distribute postprints under specified terms and conditions (most publishers now have such self-archiving policies),
· they have modified the preprint using errata/corrigenda.

Only an explicit prohibition in the copyright transfer contract (which gives the publisher the exclusive rights to exploit the document electronically) makes it compulsory for the author to ask for the publisher's permission to file a document in open archives. It seemed interesting to find out what researchers really knew concerning the legal aspects of scientific publications, and to ascertain whether researchers read the copyright statements provided by publishers. Authors were asked who retained the copyright to the last article they had self-archived: 56.2% said that it remained with the publisher, 30% did not know, and only 5.5% said it was themselves. Of the researchers who had self-archived (especially on personal Web pages) their last published article, a majority (54%) did not know whether the publisher's permission was necessary before doing so; 18% did not ask their publisher for it, and only 7.8% did ask. 5.5% said they held the copyright on their last published article. Besides, 77.3% did not know that they could negotiate self-archiving with their publisher. These answers can mean either that self-archiving was not prohibited in the copyright transfer contracts, or that the articles were posted without the publisher's permission. It is possible to think here that a majority of researchers sign the copyright statements provided by publishers without reading them.

Knowledge of open access (OA) journals

Publication in open access journals is the second form of open access to scientific information. Questioned on whether they had submitted a manuscript to an open access journal in the last three years, only 17% gave a positive answer. Researchers were invited to indicate the reasons for publishing their work in an open access journal. The authors who publish there (22 respondents) are generally motivated by the principle of open access and the good reputation of the journal in their field. People who did not publish in an open access journal were asked to indicate the reasons for not having done so; the results are as follows (the percentages were calculated with respect to the 100 people who said they had not published in open access journals):
· the researchers do not know of OA journals in their field (72%),
· they say that the open access journals in their field are not prestigious (10%),
· they are against the author-pays principle (7%),
· other reasons: article rejected.
Finally, the intentions of researchers are as follows: 34.4% of researchers plan to publish in an OA journal in the future and 38.3% do not know yet if they will do so. Yet the level of awareness of open access journals is relatively low in mathematics, even though there are more than 80 titles (http://www.doaj.org/). Moreover, the "author-pays" economic model, which exists in other countries, is not popular in France, and it is probably not well known. (For instance, the new Springer-Kluwer group has launched Open Choice: if an author chooses to pay $3,500, his article will be available without a subscription being necessary.)

[Figure: Owner of the copyright (me / publisher / I don't know), overall and by age band]

Conclusions

· A majority of researchers find the articles (or their references) necessary for their work in libraries, but journals with full text (with subscriptions paid by the libraries) are often consulted.
· More than 80% of respondents find the articles they need without any difficulty. Almost 60% of them ask for the librarian's help only sometimes, and 35% do not need any help.
· To obtain full-text articles in open access, researchers especially use ArXiv (to get electronic preprints) and Google (to search for personal Web pages).
· The most consulted (at least once a week) full-text articles available online were published during the last ten years.
· Almost 50% of researchers publish 2-3 articles per year.
· A majority of them post a copy of their articles on personal websites, and 28% have done so for at least five years.
· Some of those researchers who post publications on personal sites also simultaneously file them in institutional open archives, but researchers still post many more articles on their personal webpages (63%) than in Hal (12%) or in ArXiv (16%).
· Those who have already filed publications in Hal or ArXiv find them easy to use, and say they needed less than 30 minutes to file their first article and less than 15 minutes for subsequent filings.
· The authors claim that they are becoming more aware of open access, and that they learned of the existence of institutional open archives thanks to communication among themselves.
· Yet sources of online articles in open access, such as OA journals, are still not very well known.
· Official information coming from the CNRS or the Ministry went largely unnoticed.
· A majority of researchers would accept being required to file publications in institutional open archives.
· Those who are already filing publications in open archives do so as a matter of principle.
· A majority of them do not read the contracts they sign with their publishers, and they do not know of the possibility of negotiating these contracts.
· Most researchers sign the copyright statements provided by their publishers without reading them.

With regard to the self-archiving of articles in open institutional archives: to increase the number of publications filed in institutional archives, it would be necessary to encourage researchers and help them adopt this new style of publication, in order to improve the distribution of their scientific production. The difficulties in the development of open archives are not technical but social. The practices of self-archiving already form part of the work pattern of researchers in mathematics and computer science, but only as a kind of "Google" reflex, i.e.
depositing publications on personal sites to ensure that they can be found by the most commonly used search engine. The utility of institutional open archives is not yet well understood. The development of open archives is founded on the self-archiving of scientific publications by their authors; it would be necessary to envisage training researchers in the use of Hal. Making the legal aspects of scientific publication more widely known, and sensitizing researchers to the necessity of checking the contracts they sign with their publishers, also appear to be key issues.

With regard to the publication of OA journals: according to a study undertaken by the Centre for Information Behaviour and the Evaluation of Research (CIBER; see the bibliography), more and more scientists publish in OA journals. Yet it is impossible to say the same of French researchers in mathematics and computer science for the moment. It is necessary to increase the dissemination of information about open access journals since, according to the results of this questionnaire, they are not sufficiently exploited. It could be the role of libraries to circulate relevant information from the publishers of OA journals.

Bibliography

Allen, James (2005), Interdisciplinary differences in attitudes towards deposit in institutional repositories. Available at: http://eprints.rclis.org/archive/00005180/

Aubry, C. and Janik, J. (dir.) (2005), Les archives ouvertes: enjeux et pratiques. Guide à l'usage des professionnels de l'information, ADBS, Paris, 332 pp.

Centre for Information Behaviour and the Evaluation of Research (CIBER) (2005). Available at: http://www.ucl.ac.uk/ciber/ciber_2005_survey_final.pdf

Chanier, T. (2005), Archives ouvertes et publication scientifique. Comment mettre en place l'accès libre aux résultats de la recherche?, L'Harmattan, Paris, 188 pp.

Fily, Marie-Françoise (2005), Introduction au concept d'archive ouverte. Available at: http://archivesic.ccsd.cnrs.fr/sic_00001523.html

Gallezot, Gabriel (2005), "Le Libre Accès (Open Access): partager les résultats de la recherche", Colloque international: L'information numérique et les enjeux de la société de l'information, Tunis, 14-16 April 2005, ISD. Available at: http://archivesic.ccsd.cnrs.fr/sic_00001416.html

Guide juridique du CNRS (in press). Available at: http://publicnrs.inist.fr/

JISC (2005), Disciplinary Differences Report. Available at: http://www.jisc.ac.uk/uploaded_documents/Disciplinary Differences and Needs.doc

Libre accès à l'information scientifique et technique. Available at: http://www.inist.fr/openaccess

work_guwr26hhpfaglpznq7fpdurieq ----

Towards user-centered indexing in digital image collections

Krystyna K. Matusiak
University of Wisconsin-Milwaukee Libraries, Milwaukee, Wisconsin, USA

Abstract

Purpose – User-created metadata, often referred to as folksonomy or social classification, has received a considerable amount of attention in the digital library world. Social tagging is perceived as a tool for enhancing description of digital objects and providing a venue for user input and greater user engagement.
This article seeks to examine the pros and cons of user-generated metadata in the context of digital image collections and compares them to professionally created metadata schemas and controlled vocabulary tools.

Design/methodology/approach – The article provides an overview of challenges to concept-based image indexing. It analyzes the characteristics of social classification and compares images described by users to a set of images indexed in a digital collection.

Findings – The article finds that user-generated metadata vary in the level of description, accuracy, and consistency and do not provide a solution to the challenges of image indexing. On the other hand, they reflect users' language and can lead toward user-centered indexing and greater user engagement.

Practical implications – Social tagging can be implemented as a supplement to professionally created metadata records to provide an opportunity for users to comment on images.

Originality/value – The article introduces the idea of user-centered image indexing in digital collections.

Keywords: Digital storage, Collections management, Image processing, Indexing
Paper type: Viewpoint

(OCLC Systems & Services: International digital library perspectives, Vol. 22 No. 4, 2006, pp. 283-298. © Emerald Group Publishing Limited. DOI 10.1108/10650750610706998)

Introduction

The expansion of digital technologies has enabled wider access to visual resources held by museums and libraries. In the last decade cultural institutions have undertaken large-scale digitization projects to convert their collections of historical photographs and art slides to digital format. Digitized images are presented to users on the web through digital collections that offer enhanced image manipulation and multiple search options. Advances in digital technologies and an increase in the number of digital image collections, however, have not been supported by comparable advances in image retrieval, indexing systems, and options for user interaction (Armitage and Enser, 1997; Choi and Rasmussen, 2002; Trant, 2003). Digitization has created a need for more extensive image description to facilitate image discovery in the digital environment. A considerable amount of indexing work accompanies image digitization in library and museum settings. Archivists and catalogers transcribe existing image captions, assign subject terms, and create other descriptive metadata to provide access points for image retrieval. Many archival collections have little or no accompanying textual descriptions, so image indexing also requires original research and verification of data. Descriptive metadata are created in museums and libraries by professional catalogers following standards and using controlled vocabulary tools. This approach represents traditional document-oriented indexing, where items are classified a priori by professional catalogers with little or no input from end-users (Fidel, 1994). The web, however, challenges this world of clear boundaries and distinct authority roles. With the introduction of blogs, wikis, newsfeeds, and bookmarking tools, the web provides an environment for collaborative knowledge construction and social networking (Hammond et al., 2005). It also creates new opportunities for sharing digital images and classifying them by user-generated keywords.
Photo sharing sites, like Flickr (www.flickr.com), allow users to upload images and categorize them using their own terms. User-created indexing, often referred to as folksonomy or social classification, has received a considerable amount of attention, with some enthusiasts calling it "a revolution in the art and science of categorization" (Sterling, 2005). It is probably premature to talk about "a revolution" and call for an abandonment of cataloging standards and controlled vocabulary tools. On the other hand, the social classification movement has initiated a discussion in the digital library community about the use of social networking applications, engaging users, and building virtual communities (Bearman and Trant, 2005). This article contributes to this discussion by reviewing the relevant literature on image indexing and providing an overview of social classification in relation to images. It examines the challenges and usefulness of social tagging and its potential implications for developing user-oriented indexing of digital image collections.

Image indexing in the digital environment

Purpose

A picture is worth 1,000 words: this old saying rings especially true for those who attempt to describe images for digital collections. As Roberts (2001) points out, it will take every one of these words to provide an adequate description of the pictures included in image databases. Without comprehensive indexing, the images will remain buried in the database, never seen by the users. In the online environment images pose problems of access and retrieval more complicated than those of text documents. Visual information embedded in pictures is difficult to access without prior indexing. The primary purpose of indexing is to identify images and provide access to them. Layne (1994) identifies two goals for image indexing:

(1) To provide access to images based on the attributes of those images.
(2) To provide access to useful groupings of images.

These general goals refer to analog as well as digital image collections, but have become more critical in the digital environment, where users access images without the assistance of librarians or archivists.

Approaches to image indexing

Research literature identifies two distinct approaches to image indexing (Goodrum, 2000; Enser, 2000; Rasmussen, 1997; Trant, 2003):

(1) Concept-based, where image attributes and semantic content are identified and described verbally by human indexers.
(2) Content-based, where features of images, such as color, shape, or texture, are automatically identified and extracted by computer software.

Goodrum (2000) points out that very little research has been conducted on the relative effectiveness of these approaches to image indexing in the digital environment. Chu (2001), examining the research literature on image indexing and retrieval, observes that there is very little collaboration between researchers of these two approaches. Content-based research, although very vibrant in the information science community, is not transferred into practice in digital libraries, where most systems are built with a concept-based approach. In "seeking the alliance of concept-based and content-based paradigms", Enser (2000) looks at visual image retrieval from the user's perspective and examines several user studies. He concludes that within archival image collections users tend to rely more on concept-based rather than content-based image retrieval techniques. Subject access to visual resources is particularly important.
His findings are confirmed by other studies of user queries in image databases. Choi and Rasmussen (2002) find subject description a very important factor assisting users in judging image relevance for their research needs.

Challenges to concept-based image indexing

Concept-based indexing provides intellectual access to the visual content of an image. It involves translating the visual information into textual description to express what the image is about and what it represents. In addition to subject description, metadata associated with an image can also contain information about image authorship and provenance. Descriptive metadata are created based on standardized metadata schemas, such as Dublin Core or VRA Core, using controlled vocabulary tools or natural language for metadata values. Concept-based indexing requires human indexers to interpret the meaning of the picture, assign subject headings, and transcribe image captions and textual annotations. The process of translating the content of an image into verbal expressions poses significant challenges to concept-based indexing. Several researchers provide an overview of the problem and generally agree that even extensive text-based indexing is usually inadequate to meet user needs and provide effective image retrieval (Besser, 1990; Chen and Rasmussen, 1999; Enser, 2000; Jorgensen et al., 2001; Layne, 1994). The literature is quite extensive, and the following lists just summarize the major points. Some challenges are due to the complexity and richness of the visual medium:

· Images are rich and often contain information useful to researchers from many disciplines (Besser, 1990).
· An image is often used for a purpose not anticipated by the original creator (Besser, 1990).
· The same image can mean different things to different people (Chen and Rasmussen, 1999; Enser, 2000).
· Images can have several layers of meaning, from specific to more abstract (Enser, 2000; Jorgensen et al., 2001; Layne, 1994).
· Unlike a text document, an image does not contain information about its authorship.

Other challenges are associated with language ambiguities and the limitations of human indexing:

· Lack of general agreement on what attributes of an image should be indexed (Chen and Rasmussen, 1999).
· Difficulty in determining the appropriate level of indexing (Enser, 2000).
· Subjectivity and lack of consistency: indexers cannot apply indexing terms with any degree of consistency (Rasmussen, 1997).
· Problems in matching the terms that users type to describe their information needs with the controlled vocabulary used in indexing (Gordon, 2001; Hastings, 1999; Jorgensen, 1998; Roberts, 2001).
· Difficulty in mapping a user's mental model of what a picture is about with the indexer's mental model (Heidorn, 1999).

User studies

Although the success of a user finding images on a topic of interest depends on the quality of image indexing and the matching of indexer vocabulary with user language, there are few studies evaluating the effectiveness of image indexing from the user perspective (Goodrum, 2000; Stephenson, 1999; Trant, 2003). User studies primarily focus on specific groups of users and examine queries within particular collections or subject domains. Armitage and Enser (1997) analyze requests from seven image archives and categorize them according to a facet-based matrix and three levels of abstraction. They observe similarities in image query formulation across a range of different libraries.
Choi and Rasmussen (2003) examine queries formulated by faculty and graduate students searching for visual information on American history in the Library of Congress American Memory collection. Their study demonstrates that most user needs fall into general/nameable needs, while only a small percentage belong to abstract categories. The researchers also find that date, title, and subject descriptors are important factors representing images. Few studies mention user participation in the indexing process or engage users in describing images as part of an evaluation of indexing systems (Hastings, 1999; Jorgensen, 1998; Jorgensen et al., 2001). Hastings (1999) compares user queries, user-supplied access terms and retrieval tasks in the online collection of Contemporary Caribbean Paintings. In addition to supplying their own keywords, users were also asked to rate the assigned index terms. Hastings's study indicates the need for users to add their own descriptors and index terms in the search process. It also poses several important questions for future research about user interaction with image databases and the role of user feedback. Jorgensen's research also engages non-specialist users performing image description. It focuses on the types of image attributes and levels of image indexing. In her 1998 study, Jorgensen observes a disjunction between the variety of image attributes that users describe and those attributes typically addressed in traditional image indexing systems. She recommends testing the assumptions underlying controlled vocabularies and newer descriptive tools, such as metadata schemas.

The need for a new approach

The reviewed research literature echoes Hastings's (1999) statement that the problem of intellectual access to images in digital collections remains largely unsolved. It also
Social classification Social classification represents a new approach to organizing content in the web environment where users create their own textual descriptors using natural language terms (tags) and share them with a community of users. This new organically emerging system of organization with users assigning keywords to their own or shared content has been referred by several terms, including social classification, distributed classification, social tagging, ethnoclassficiation, and folksonomy (Hammond et al., 2005). The term folksonomy, combining the words folk and taxonomy, has been attributed to Thomas Vander Wal. It has gained a considerable popularity, but as Merholz (2004a) points out in one of his blogs, the term folksonomy is actually inaccurate. Taxonomy implies a hierarchical relationship, while tagging applied in social networking software is characterized by a flat, non-hierarchical structure. The term social classification is used here to emphasize the collaborative nature of user-generated tags and their use in social context. Social tagging has been introduced in a number of web services. Users can assign their own tags to web site bookmarks (del.icio.us or Furl), weblog posts (Technorati), and photos (Flickr). CiteULike and Connotea provide an opportunity to tag academic publications. The purpose of tagging in this collaborative environment is not only to organize the web content for an individual user, but also to share the categories with other users, so they can easily browse and retrieve the information classified by others. Golder and Huberman (2005) observe that collaborative tagging is most useful when there is nobody in the role of “librarian” to classify information or there is simply too much content. Towards user-centered indexing 287 Social classification of digital images There are a number of web sites that provide users with space to store digital photos. What makes Flickr unique and popular is its classification and networking application that allows assigning tags, commenting, and sharing images and associated tags with a community of users. The site was launched in February 2004. Flickr’s images are also part of Yahoo image search as a result of the recent partnership with Yahoo! Eric Costello, one of Flickr’s developers, indicates in an interview with Jesse Garrett (2005) that initially Flickr was envisioned as a tool for an individual to organize collections of photos and share them with friends and family using a simple tagging functionality modeled on the bookmarking site del.icio.us. The push for broader classification and social interaction came from the community of users, who were interested in sharing their pictures and tags with a wider audience, not just a small collection of friends. Flickr provides a simple and unrestricted tagging system. Users can assign as many tags as they wish using keywords that they deem to be the most appropriate for their photos. They also have an opportunity to see how other users apply the tags in the context of other images. This aspect of communal verification or immediate feedback is what makes social classification different from traditional indexing that is usually conducted by a single authority in isolation from users. Mathes (2004) points out, “this tight feedback loop leads to a form of asymmetrical communication between users through metadata”. In social networking applications, such as Flickr, the meaning is created and negotiated by a community of users in the context of use. 
Social classification of digital images

There are a number of web sites that provide users with space to store digital photos. What makes Flickr unique and popular is its classification and networking application, which allows assigning tags, commenting, and sharing images and associated tags with a community of users. The site was launched in February 2004, and Flickr's images are also part of Yahoo image search as a result of the recent partnership with Yahoo! Eric Costello, one of Flickr's developers, indicates in an interview with Jesse Garrett (2005) that initially Flickr was envisioned as a tool for an individual to organize collections of photos and share them with friends and family, using a simple tagging functionality modeled on the bookmarking site del.icio.us. The push for broader classification and social interaction came from the community of users, who were interested in sharing their pictures and tags with a wider audience, not just a small collection of friends. Flickr provides a simple and unrestricted tagging system. Users can assign as many tags as they wish, using keywords that they deem to be the most appropriate for their photos. They also have an opportunity to see how other users apply the tags in the context of other images. This aspect of communal verification or immediate feedback is what makes social classification different from traditional indexing, which is usually conducted by a single authority in isolation from users. Mathes (2004) points out that "this tight feedback loop leads to a form of asymmetrical communication between users through metadata". In social networking applications, such as Flickr, the meaning is created and negotiated by a community of users in the context of use. Flickr displays "hot tags" added in the last 24 hours and a set of the most popular tags (Figure 1).

[Figure 1. Flickr's most popular tags as of January 10, 2006]

A brief analysis of the popular tags demonstrates several characteristics of this approach to organizing content:

· Proper names indicating geographic location (boston) are listed along topical terms, such as bridge and building.
· There are no explicit or hierarchical relationships: europe is on the same level as italy or rome.
· Singular nouns, such as animal, flower, dog, are accompanied by their plural equivalents, animals, flowers, and dogs.
· There is no control for synonyms: new york, newyorkcity, and nyc are listed in the same set.
· Specific terms, e.g. river or rock, are mixed with more abstract ones, such as reflection.
· There are several compound tags that combine two or more words, e.g. geotagging, blackandwhite, roadtrip.
· Modifiers, e.g. blue or urban, or pronouns (me) are listed in the mix of nouns.
· New terms, such as cameraphone or moblog, are added quickly to the list of tags.

Some of these features, such as the lack of synonym control or the use of singular and plural, indicate the limitations of social classification for retrieval purposes. Several researchers point out the messy, "jumbled," or "sloppy" nature of social tagging, especially when compared with formal classification systems (Guy and Tonkin, 2006; Hammond et al., 2005; Mathes, 2004). Guy and Tonkin, in a recent article, analyze the major flaws of folksonomy, including misspellings, badly encoded word groupings, singular and plural forms, personal tags, and single-use tags. The authors suggest some strategies for improving "sloppy tags," but also observe that such practices may discourage users. Shirky (2005) sees tagging as an organic way of organizing information that "seems like a recipe for disaster, but as the Web has shown us, you can extract a surprising amount of value from big messy data sets." Social classification also demonstrates a number of strengths, particularly for the description and retrieval of images. The interlinked system of tags supports browsing activities and the serendipitous discovery of images in the digital environment. The most important strength of social tagging, however, is its close connection with users and their language. Mathes (2004) points out that it directly reflects user "choices in diction, terminology, and precision." The vocabulary is current and flexible, as it quickly absorbs newly created terms and neologisms invented by web users. The chaotic mixture of synonyms, abbreviations, singulars, and plurals represents the actual language of users: the terms they use to describe their images and the words they will more likely type while searching for images in other digital collections.
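The tag-tidying strategies mentioned above (handling case, simple plurals, and synonyms) are easy to picture in code. A minimal Python sketch, assuming a hand-built synonym table drawn from the Flickr examples in the list above; the naive plural rule stands in for what a real system would do with a proper stemmer:

```python
# Illustrative tag normalization: lowercase, map known synonyms, fold plurals.
SYNONYMS = {"nyc": "new york", "newyorkcity": "new york"}

def normalize(tag):
    tag = tag.strip().lower()
    tag = SYNONYMS.get(tag, tag)
    # Naive plural folding; a production system would use a real stemmer.
    if tag.endswith("s") and not tag.endswith("ss"):
        tag = tag[:-1]
    return tag

raw_tags = ["Flowers", "flower", "NYC", "newyorkcity", "dogs"]
print({normalize(t) for t in raw_tags})  # {'flower', 'new york', 'dog'}
```

Even this toy version shows the trade-off Guy and Tonkin note: every rule that merges variant tags also risks merging tags a user meant to keep distinct.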
Comparison of the two approaches to image indexing

To examine the differences between traditional indexing and social classification, and to evaluate the potential usefulness of social tagging in digital collections, the author compared two sets of images: one featured on the Flickr site (Figures 2 and 3), the other indexed in a digital collection created at the University of Wisconsin-Milwaukee Libraries (Figures 4 and 5). The Flickr photos have been recently added to the site by two different users. The digital image collection "Cities Around the World" (http://collections.lib.uwm.edu/cgi-bin/browseresults.exe?CISOROOT=%2Fcatw), currently being constructed at the University of Wisconsin-Milwaukee Libraries, features
User-generated metadata reflect an increasingly multilingual and multicultural web audience. This brief comparison confirms a basic difference between social classification and traditional indexing techniques that employ metadata schema and controlled vocabularies. The traditional approach provides a more consistent and detailed description of images in a hierarchical, structured manner. Social classification, on the other hand, lists tags without indicating relationships in flat name spaces, though it does reflect the language, or sometimes multiple languages of the users. Kwasnik (1999), exploring the role of classification in knowledge representation and discovery, observes that “classification is a way of seeing”. Unlike formal classification systems, social classification is not an artificial construct representing highly structured knowledge in a mature or a specific domain. It emerges organically and reflects individual user perceptions, observations, and impressions. It gives users an opportunity to describe the world the way they see it. Discussion Challenges to implementing social tagging in digital image collections As demonstrated by the above comparison, social classification represents a significant shift and new possibilities in image indexing, but it does not offer a simple or miraculous solution to many complex issues inherent in image description. On the contrary, the challenges and problems of intellectual access to images seem to be multiplied in the social networking environment. There is also a fundamental difference between social classification and traditional indexing in regard to motivation. Flickr users tag their own content – private digital photo collections that they want to manage and share with friends, family, and a wider community. In the social networking environment, users engage in the game of tagging for their own benefit. Hammond et al.(2005) refer to this approach to classification as “selfish tagging.” Although there are examples of altruistic contributions on the web with Wikipedia being a primary one, it is difficult to predict whether users will be willing to invest their effort and time into describing images held at museums and libraries. Bearman and Trant (2005) discuss the issues of motivation and rewards in the context of the prototype project at the Guggenheim Museum. The discussion about social classification in digital collections will remain theoretical, if not futile, unless we see an implementation of social networking applications in digital library systems on a larger scale. Librarians also need to create an encouraging environment, where users become interested in participating in the indexing process and in contributing their expertise. OCLC 22,4 294 Implications for digital image collections Although social classification is not a universal solution and poses a set of old and new challenges, it does offer opportunities for enhancing image indexing and engaging users. Many librarians are probably wondering what will be the role of professional catalogers, if indexing goes into the hands of users? Interestingly, similar questions were posed by system designers when interface design moved towards a more user-centered approach (Henderson, 2000); however, iterative interface design with user participation and usability testing has not eliminated the jobs of system designers. Social classification does not have to be seen as an alternative or replacement of traditional indexing, but rather as an enhancement. 
These two approaches can supplement each other. In the view of challenges to intellectual access to visual resources, traditional indexing, nevertheless, offers more consistency in indexing and relatively similar level of specificity in describing image attributes. Controlled vocabularies and standards enable uniform access and interoperability. Social classification, on the other hand, brings user language, perspective, expertise, and eventually may lead towards more user-oriented indexing. Above all, it offers great opportunities for user engagement. In comparison with sites like Flickr, digital image collections appear rather static and monolithic. In the current digital library environment users have little or no opportunities for commenting on images or providing feedback on indexing, not to mention adding their own keywords. Heidorn (1999) views indexing as a form of communication between the indexer and the people who search for images in a collection. He mentions “shared cognitive heritage” and language as major factors in the communication between indexers and searchers. In traditional document-oriented indexing, however, this communication tends to go in one direction. With catalogers deciding a priori the structure and language of description and users remaining on the passive recipient end, it is difficult to determine how much knowledge and language is actually shared in this process. Social networking application, if implemented in digital collections may provide an opportunity for a communication model that works in both directions. As demonstrated by the reviewed literature on concept-based indexing, the gap between user language and controlled vocabularies applied in indexing have been identified as a major problem in providing intellectual access to images. Controlled vocabularies do not reflect users’ language, and for the purpose of image indexing, are too rigid and often outdated. User-generated tags, although unstructured and “sloppy,” are richer, more current, and multilingual. There are several options for incorporating user language into digital collections: . Users can add their own tags to the metadata in the records. . Users can provide feedback on the terms assigned by indexers. . User-supplied tags can be used to develop “a controlled vocabulary that truly speaks the users’ language” (Merholz, 2004b). In addition, implementing social networking applications in digital collections can foster collaborative knowledge construction. Users can contribute to the depth of image description and enhance the intellectual content of digital collections. Their engagement can take many forms from assigning tags, to commenting on images, Towards user-centered indexing 295 and annotating them. Expertise in local history and language can be particularly valuable in cultural heritage collections, where users can help to identify images and enhance description with their unique knowledge and perspectives. Users’ comments can also be a source of evaluation data indicating the relevance of collections to users’ needs and provide directions for future development of digital image collections. Conclusion The phenomenon of social classification raises questions about an established pattern in the current library practice where image indexing is performed in isolation from users. User-centered interface design of digital libraries received a considerable amount of attention, but image indexing still follows traditional document-oriented principles. 
The discussion of social classification and “metadata for the masses” (Merholz, 2004b) might help to introduce user language and user views to digital collections creating a more interactive and user-oriented environment. Although social classification is not an answer in itself to many inherent problems in image description, nevertheless, it can lead towards more user-oriented indexing. From the perspective of a practitioner involved in building digital image collections, it offers an opportunity for greater user engagement and help in building virtual communities. References Archives of CONTENTDM-L@OCLC.ORG (2005), “Web2.0 features for CONTENTdm?”, December, Week 1, available at: http://listserv.oclc.org/archives/contentdm-l.html Armitage, L.H. and Enser, P.G.B. (1997), “Analysis of user need in image archives”, Journal of Information Science, Vol. 23 No. 4, pp. 287-99. Bearman, D. and Trant, J. (2005), “Social terminology enhancement through vernacular engagement: exploring collaborative annotation to encourage interaction with museum collections”, D-Lib Magazine, Vol. 1 No. 9, available at: www.dlib.org/dlib/september05/ bearman/09bearman.html (accessed December 2, 2005). Besser, H. (1990), “Visual access to visual images: the UC Berkeley Image Database Project”, Library Trends, Vol. 38 No. 4, pp. 787-98. Chen, H.L. and Rasmussen, E.M. (1999), “Intellectual access to images”, Library Trends, Vol. 48 No. 2, pp. 291-302. Choi, Y. and Rasmussen, E.M. (2002), “Users’ relevance criteria in image retrieval in American history”, Information Processing and Management, Vol. 38 No. 5, pp. 695-726. Choi, Y. and Rasmussen, E.M. (2003), “Searching for images: the analysis of users’ queries for image retrieval in American history”, Journal of the American Society for Information Science and Technology, Vol. 54 No. 6, pp. 498-511. Chu, H. (2001), “Research in image indexing and retrieval as reflected in the literature”, Journal of the American Society for Information Science and Technology, Vol. 52 No. 12, pp. 1011-8. Enser, P.G.B. (2000), “Visual image retrieval: Seeking the alliance of concept-based and content-based paradigms”, Journal of Information Science, Vol. 26 No. 4, pp. 199-210. Fidel, R. (1994), “User-centered indexing”, Journal of the American Society for Information Science, Vol. 45 No. 8, pp. 572-6. OCLC 22,4 296 Garrett, J.J. (2005), “An interview with Flickr’s Eric Costello”, available at: www.adaptivepath. com/publications/essays/archives/000519.php (accessed December 19, 2005). Golder, S.A. and Huberman, B.A. (2005), “The structure of collaborative tagging systems”, available at: http://arxiv.org/ftp/cs/papers/0508/0508082.pdf (accessed December 2, 2005). Goodrum, A.A. (2000), “Image information retrieval: an overview of current research”, Journal of Information Science, Vol. 3 No. 2, pp. 63-7. Gordon, A.S. (2001), “Browsing image collections with representations of common-sense activities”, Journal of the American Society for Information Science and Technology, Vol. 52 No. 11, pp. 925-9. Guy, M. and Tonkin, E. (2006), “Folksonomies: tidying-up tags?”, D-Lib Magazine, Vol. 12 No. 1, available at: www.dlib.org/dlib/january06/guy/01guy.html (accessed January 18, 2006). Hammond, T., Hannay, T., Lund, B. and Scott, J. (2005), “Social bookmarking tools (I): a general review”, D-Lib Magazine, Vol. 11 No. 4, available at: www.dlib.org/dlib/april05/hammond/ 04hammond.html (accessed December 2, 2005). Hastings, S.K. 
Heidorn, B.P. (1999), "Image retrieval as linguistic and nonlinguistic visual model matching", Library Trends, Vol. 48 No. 2, pp. 303-26.

Henderson, A. (2000), "Engagement of design with use", Interactions, Vol. 7 No. 2, pp. 74-81.

Jorgensen, C. (1998), "Attributes of images in describing tasks", Information Processing & Management, Vol. 34 Nos 2/3, pp. 161-74.

Jorgensen, C., Jaimes, A., Benitez, A.B. and Chang, S.F. (2001), "A conceptual framework and empirical research for classifying visual descriptors", Journal of the American Society for Information Science and Technology, Vol. 52 No. 11, pp. 938-47.

Kwasnik, B.H. (1999), "The role of classification in knowledge representation and discovery", Library Trends, Vol. 48 No. 1, pp. 22-47.

Layne, S.S. (1994), "Some issues in the indexing of images", Journal of the American Society for Information Science, Vol. 45 No. 8, pp. 583-8.

Mathes, A. (2004), "Folksonomies – cooperative classification and communication through shared metadata", available at: www.adammathes.com/academic/computer-mediated-communication/folksonomies.html (accessed October 28, 2005).

Merholz, P. (2004a), "Ethnoclassification and vernacular vocabularies", August 30, available at: www.peterme.com/archives/000387.html (accessed November 11, 2005).

Merholz, P. (2004b), "Metadata for the masses", available at: www.adaptivepath.com/publications/essays/archives/000361.php (accessed December 19, 2005).

Rasmussen, E. (1997), "Indexing images", Annual Review of Information Science and Technology, Vol. 32, pp. 169-96.

Roberts, H.E. (2001), "A picture is worth a thousand words: art indexing in electronic databases", Journal of the American Society for Information Science and Technology, Vol. 52 No. 11, pp. 911-6.

Shirky, C. (2005), "Ontology is overrated: categories, links, and tags", available at: www.shirky.com/writings/ontology_overrated.html (accessed October 10, 2005).

Stephenson, C. (1999), "Recent developments in cultural heritage image databases: directions for user-centered design", Library Trends, Vol. 48 No. 2, pp. 410-37.

Sterling, B. (2005), "Order out of chaos", Wired, Vol. 13 No. 4, available at: www.wired.com/wired/archive/13.04/view.html?pg=4 (accessed December 2, 2005).

Trant, J. (2003), "Image retrieval benchmark database service: a needs assessment and preliminary development plan", available at: www.clir.org/pubs/reports/trant04/tranttext.htm (accessed October 10, 2005).

About the author

Krystyna K. Matusiak works at the University of Wisconsin-Milwaukee Libraries as a Digital Collections Librarian. She has managed digitization at the UWM Libraries since the program was initiated in 2001. The list of the collections she has designed and managed is available at: www.uwm.edu/Library/digilib/. She is also a doctoral student at the University of Wisconsin-Milwaukee. Her research interests include image indexing, usability, and evaluation of digital libraries.

work_gwbfs2r3j5eznf534uaayhtev4 ----

Published as: Baich, Tina. 2012. "Opening Interlibrary Loan to Open Access." Interlending & Document Supply 40, no. 1: 55-60.
Opening interlibrary loan to open access

Tina Baich
University Library, Indiana University-Purdue University Indianapolis, USA

Abstract

Purpose – The purpose of this paper is to examine interlibrary loan requests for open access materials submitted during fiscal years 2010 and 2011 and to determine the impact of open access materials upon fill rate for interlibrary borrowing requests.

Design/methodology/approach – Borrowing requests for open access materials were quantitatively analysed and compared to total borrowing requests.

Findings – During the period studied, borrowing requests for open access materials increased while overall requests held steady. As the number of requests filled with open access documents continues to rise, IUPUI University Library is able to provide a service to users and cost savings for the library by utilizing this material. The difficulty users have in navigating the online information environment makes it unlikely that interlibrary loan requests will decrease due to the growing amount of open access material available.

Originality/value – The literature discussing the use of open access materials to fulfill ILL requests is limited and largely focuses on educating ILL practitioners about open access and providing suggested resources for locating open access materials. This research paper studies actual requests for open access materials and their impact on interlibrary loan.

Keywords – open access, interlibrary loan, interlending, academic libraries

Paper type – Research paper

Introduction

Even though open access materials are freely available on the Internet, library users still request them through interlibrary loan (ILL). In February 2009, Indiana University-Purdue University Indianapolis (IUPUI) University Library began tracking borrowing requests for open access materials. As the number of requests filled with open access documents continues to grow, IUPUI University Library is able to provide a service to users and cost savings for the library by utilizing this material. This paper presents data regarding IUPUI University Library's open access ILL borrowing requests for fiscal years 2010 and 2011 and describes some of the most commonly used online resources for filling these requests.

Discussion of open access is generally focused on scholarly journal publishing and the free availability of content either directly from publishers or through the self-archiving efforts of authors. Proponents of open access in this context argue that it allows for wider dissemination of scholarly work, thus providing authors the opportunity for greater impact. It also lowers the cost barrier to providing content for libraries and, in the academic world, gives the institution access to the scholarly output of its faculty. However, many other documents fit the general criteria of open access: digital, online content that is both free of charge and free of most copyright and licensing restrictions (Suber, 2010). Based on these criteria, I include conference papers, electronic theses and dissertations (ETDs), and public domain works in my discussion of open access ILL requests.

Literature review

There is no shortage of articles on open access, but very little tying open access to interlibrary loan. In 2006, Karen Kohn encouraged ILL practitioners to find both free lenders and free materials in order to lower ILL costs (Kohn, 2006, p. 58).
The section on finding free materials describes "sites that list journals with free full-text access and databases that either include full text or provide links to full text at publishers' Web sites" (Kohn, 2006, p. 61). Kohn also rightly suggests checking for online availability of commonly free materials such as government documents, reports, and white papers before attempting to borrow them. Despite listing a number of resources for open access journal articles, Kohn never uses the term open access beyond recommending the Directory of Open Access Journals. The sites the author recommends are still prominent sources for open access materials.

In the same year, Heather G. Morrison discussed open access and its implications for resource sharing. Morrison uses the majority of her article to provide an overview of open access, a list of specific open access resources, and a discussion of a Canadian library network knowledgebase, which includes records for open access journals. Where she sets herself apart is in her presentation of possible implications of open access on resource sharing. Early in the article, Morrison quotes Mike McGrath's statement that open access "is one of the reasons for the decline in document delivery in many countries" (McGrath, 2005, p. 43), but does not entirely accept this assertion. She suggests that increased user expectations may result in "a decrease in routine interlibrary loan requests, combined with an increase in more complex requests requiring more expert knowledge and/or more advanced search skills" (Morrison, 2006, p. 106). While IUPUI University Library has not seen a marked decrease in routine ILL requests, it is clear that users are locating rare materials that do require more effort on the part of staff to locate. Though Morrison did not present data to support her arguments, she rightly anticipated a partial shift in interlibrary loan work.

Another article connecting open access and ILL was published in 2010 (Martin, 2010). Rebecca Martin seeks to educate reference and ILL staff about open access resources and the importance of maintaining current awareness of new open resources and trends in open access. Martin's article is less an inventory of resources and more a primer on the open access landscape. The author presents a concise, straightforward introduction to different categories of open access materials. She sees this as a way to provide a value-added service to patrons without detriment to library departments. Martin includes a discussion of open textbooks and educational resources in addition to open access journal content. These additional types of open access materials should not be discounted in the ILL environment, as this author expects we will begin seeing requests for such items in the near future.

The literature discussing the use of open access materials to fulfill ILL requests is limited and largely focuses on educating ILL practitioners about open access and providing suggested resources for locating open access materials. This paper will present data on the use of open access materials in interlibrary loan and an updated survey of commonly used open access resources.

Overview of institution and ILL operations

IUPUI is part of the Indiana University system, which comprises eight campuses across the state of Indiana. IUPUI also has its own extension campus, Indiana University-Purdue University Columbus, located approximately one hour south of Indianapolis in Columbus, Indiana.
All Indiana University campus libraries collaborate in a number of ways, including a shared online catalogue and a remote circulation service. IUPUI University Library's Interlibrary Services department serves the faculty, staff, and students of the Schools of Art & Design, Business, Education, Engineering & Technology, Health & Rehabilitation Sciences, Informatics, Journalism, Liberal Arts, Library & Information Science, Nursing, Physical Education & Tourism Management, Public & Environmental Affairs, Science, and Social Work as well as University College. The campus's professional schools are each served by their own library. The Interlibrary Services (ILS) department consists of ½ FTE librarian, 2 FTE staff members, and approximately 3 FTE student employees. The University Library is an OCLC supplier, participates in RapidILL, and uses the OCLC ILLiad ILL management system.

In Fall 2008, the ILS department began offering an Article Delivery Service to deliver articles electronically from the library's print collection to patrons. Prior to Fall 2008, only distance education students qualified for this service. This new service contributed to a large increase in requests in fiscal year (FY) 2009 as compared with the previous year. In FY 2008, ILS received a total of 16,638 ILL borrowing requests. In FY 2009, the first year of the Article Delivery Service, the department saw a 28% increase in borrowing requests received, with total submissions reaching 23,210. Total ILL borrowing requests increased slightly in FY 2010 to 23,422 submissions, of which 21,308 (91%) were filled through traditional interlibrary loan, the Article Delivery Service, or remote circulation between other Indiana University campus libraries. In FY 2011, requests declined 6% from the previous year, with 22,098 borrowing requests received and 20,093 (91%) requests filled through interlibrary loan, the Article Delivery Service, or remote circulation.

Figure 1. ILL borrowing requests submitted and filled

Open access ILL workflow

IUPUI University Library uses OCLC ILLiad as its ILL management system, which provides greater automation and customization of ILL procedures than OCLC WorldCat Resource Sharing (Weible, 2011, p. 95). The ability to create custom e-mails, queues, and routing rules within ILLiad makes it easy to process and track open access requests. Two custom queues, "Awaiting Open Access Searching" and "Awaiting Thesis Processing", prompt ILS staff to search for open access materials before referring the request to a potential supplier via OCLC. Items published in the US prior to 1923 fall into the public domain. In IUPUI University Library's ILLiad system, patron requests containing a pre-1923 publication date are therefore automatically routed to the "Awaiting Open Access Searching" queue regardless of document type. Staff members then use ILLiad add-ons to search the HathiTrust, Internet Archive, and Google for freely available electronic copies. Likewise, all requests with the document type thesis or containing the phrase "Dissertation Abstracts" are automatically routed to the "Awaiting Thesis Processing" queue. Staff members then search the ProQuest Dissertations & Theses database for subscription access and, if the thesis is not accessible through ProQuest, search the Internet for an ETD. Staff members search OCLC first for all requests that fall outside of these two queues.
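The queue routing just described reduces to a few metadata tests. The following Python sketch restates that logic; the field names loosely follow ILLiad transaction fields, but the matching rules and the fallback queue name are illustrative assumptions rather than IUPUI's actual routing configuration.

```python
import re

PUBLIC_DOMAIN_CUTOFF = 1923  # US works published before 1923 are public domain


def publication_year(transaction):
    """Pull a four-digit year out of whichever date field the request used."""
    for field in ("PhotoJournalYear", "LoanDate"):
        match = re.search(r"\b(\d{4})\b", str(transaction.get(field, "")))
        if match:
            return int(match.group(1))
    return None


def route_request(transaction):
    """Return the queue a newly submitted borrowing request should enter."""
    year = publication_year(transaction)
    if year is not None and year < PUBLIC_DOMAIN_CUTOFF:
        # Public domain candidates: staff search HathiTrust, Internet
        # Archive, and Google before the request goes to any lender.
        return "Awaiting Open Access Searching"
    cited_in = str(transaction.get("CitedIn", "")).lower()
    if transaction.get("DocumentType") == "Thesis" or "dissertation abstracts" in cited_in:
        # Theses: check ProQuest first, then look for an open access ETD.
        return "Awaiting Thesis Processing"
    # Everything else goes straight to OCLC searching.
    return "Awaiting Request Processing"
```

For example, `route_request({"LoanDate": "1907", "DocumentType": "Book"})` returns "Awaiting Open Access Searching", while a 2005 thesis request routes to "Awaiting Thesis Processing".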
OCLC holdings information prompts ILL staff to verify local holdings and, in the case of returnables, the library holdings of other Indiana University campuses. Requests for returnable items held locally or within the Indiana University library system are transferred to the remote circulation service for processing. Through the Article Delivery Service, locally owned non-returnable items are delivered electronically to patrons through the document delivery module in OCLC ILLiad. If an item is not available locally or through the remote circulation service, the staff member proceeds with requesting the item through OCLC unless it is apparent from the OCLC record that the material is open access. The staff member might also locate an open access item in the course of citation verification. Extensive searching for open access options does not occur for these requests until all other borrowing options have been exhausted.

When an open access document is located, the staff member enters information into the request form, including the URL in the Call Number field and "open" or "etds" (depending on the document type) in the Lender field, and changes the System ID to OTH. The Lender field entry allows for internal tracking of requests filled using open access materials and ETDs. She then saves the PDF to the ILLiad web server and sends the patron a custom e-mail notifying him of the document's availability. The URL located in the Call Number field is automatically inserted in the e-mail for the patron's reference. The e-mail also informs the patron that the document was found freely available on the Internet. IUPUI University Library chooses to deliver open access documents to patrons for their convenience. In acknowledgement of the staff time and effort required to locate and deliver these materials, open access requests are counted towards the department's fill rate. If a library preferred not to deliver the document to a user, the staff member instead could send the URL to the patron and complete the request without actually posting the document to the user's account, or cancel the borrowing request and provide the URL in the cancellation e-mail.

Open access borrowing requests and resources

In FY 2010, 318 borrowing requests were filled with open access materials. The following year, 487 were filled, an increase of 53%. Though these requests account for a small percentage of the whole, many of them would have been difficult to fill through traditional means and would have had a negative impact on the department's overall fill rate. Over two years, borrowing these 805 items through traditional ILL carries the potential cost of $14,087.50 based on Mary Jackson's 2004 cost estimate of $17.50 per borrowing transaction (Jackson, 2004, p. 31). Assuming borrowing all of these items would even be possible, the cost to potential lenders would be approximately $7,462.35 based on Jackson's mean lending cost of $9.27 per transaction (Jackson, 2004, p. 31). Though there are minor costs associated with processing open access requests, this represents a significant savings for the library and our lending partners.

The 805 open access requests received during the two-year period under study represent a wide range of material types. The most frequent was journal articles (405), followed by books and book chapters (136), theses and dissertations (108), conference papers (104), reports (44), government documents and patents (5), and other miscellaneous materials (3).
This ranking remains largely the same when considering individual years, with only theses and dissertations and conference papers trading places in FY 2011. A discussion of the top four document types follows.

Figure 2. Open access requests by document type

Article requests

Open access article requests were filled through a number of sources, but most were located on the websites of open access journals or in digital repositories. Search tools commonly used include Google Scholar, IUPUI University Library's e-journal portal, and OCLC. The library uses Serials Solutions as its vendor for electronic resource management. Within the administrative module, it is possible to activate "subscriptions" to various open access journal collections. Thanks to this feature, resources such as PubMed Central and the Directory of Open Access Journals as well as various collections of freely accessible journal titles are linked through the library's e-journal portal. MARC records are generated for the titles in these open access collections and added to the library catalogue, thus providing an additional access point. The ILS staff regularly use the e-journal portal to determine whether requested items are held electronically (a sketch of the kind of link resolver query behind such a portal appears at the end of this section). The inclusion of open access collections in the e-journal portal allowed staff to locate 101 open access articles. These account for one quarter of the total open access article requests received during fiscal years 2010 and 2011.

Figure 3. Number of open access requests filled through e-journal portal

Despite the wealth of open access titles included in the library's e-journal portal, 75% of open access article requests were discovered through other means. ILS staff filled an additional 81 requests using journal websites typically located through Google Scholar or through URLs present in OCLC bibliographic records. Other major sources for open access articles included university websites, institutional repositories, or US library digital collections (52); other digital repositories (51); author/faculty websites (34); organisation websites (33); and government websites (18). Though targeted searches are done for difficult requests, most of these were located with a simple Google search.

In the category "other digital repositories", the three most frequently used sites were Gallica (11), CiteSeerX (9), and arXiv.org (7). Gallica, the digital library of the Bibliothèque nationale de France, provides free access to over a million books, periodicals, manuscripts, maps, images, sound recordings, and scores. Text documents are freely available and can be downloaded as a PDF. Gallica contains a number of pre-1900 texts that would typically be difficult to borrow from a French library. CiteSeerX, a scientific literature digital library and search engine, was developed in 1997 at the NEC Research Institute and moved to Pennsylvania State University's College of Information Science and Technology in 2003 (Pennsylvania State University, 2010). It primarily indexes computer and information science research articles. CiteSeerX provides links to download open access documents from their original locations. arXiv.org is owned and operated by Cornell University Library. It started in 1991 as a subject-based repository for preprints in physics and has since expanded to include a number of science and science-related subjects. arXiv.org now contains nearly 700,000 open access articles.
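Behind an e-journal portal of this kind sits a link resolver that answers OpenURL queries. The sketch below builds such a query for an article citation; the resolver base URL is a placeholder, not IUPUI's actual Serials Solutions endpoint, and the citation dictionary keys are assumptions for the example.

```python
from urllib.parse import urlencode

RESOLVER_BASE = "https://resolver.example.edu/openurl"  # placeholder endpoint


def openurl_for_article(citation):
    """Build an OpenURL 1.0 (KEV) query string for a journal article."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": citation.get("article_title", ""),
        "rft.jtitle": citation.get("journal_title", ""),
        "rft.issn": citation.get("issn", ""),
        "rft.volume": citation.get("volume", ""),
        "rft.spage": citation.get("start_page", ""),
        "rft.date": citation.get("year", ""),
    }
    # Omit empty fields so the resolver is not sent blank values.
    return RESOLVER_BASE + "?" + urlencode({k: v for k, v in params.items() if v})
```

If the resolver's response shows full text in an activated collection such as PubMed Central or the Directory of Open Access Journals, the request can be filled without borrowing at all.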
Book and book chapter requests

Books and book chapters represented 17% of total open access borrowing requests. The titles obtained were evenly distributed in terms of their publication date, ranging from the oldest published in 1582 to the most recent published in 2009. The greatest number of requests was submitted by patrons from the History department (30), followed by those from Philanthropic Studies (18) and Religious Studies (17). Most freely available books were located in Google Books (50), the Internet Archive (39), or the HathiTrust (25). IUPUI University Library utilizes the ILLiad add-ons, which provide access to various websites from within the ILLiad client, to quickly check for electronic availability of public domain books in each of these repositories.

Google Books is an online repository of digitized print materials. Though free full text is not available for all content, Google Books does contain a large number of out-of-copyright monographs and journals. Google Books has a simple, user-friendly interface and is readily accessible to anyone with an Internet connection. The metadata associated with Google Books items is not always as complete or accurate as that of the Internet Archive or HathiTrust, which sometimes makes it difficult to locate an item.

Founded in 1996, the Internet Archive collaborates with a number of institutions to collect and preserve materials. It provides access to an extensive archive of moving images, audio, software, educational resources, and text and serves as home of the Wayback Machine, an archive of web pages. In addition to housing public domain documents, there is also a collection of open access documents. Text materials can be read online or downloaded as PDF, EPUB, Kindle, and various other file types.

The HathiTrust began as a collaborative digitization effort between the Committee on Institutional Cooperation (CIC) member universities and the University of California system, but is now open to other institutions. This shared digital repository currently contains over nine million volumes, with nearly two and a half million volumes in the public domain. Users from member institutions can log in to the HathiTrust to download full-text PDFs of public domain materials. Other users can view the full text online.

Thesis and dissertation requests

When processing requests for theses and dissertations, the ILL department first seeks to borrow a physical copy of the requested item. Additionally, the department will purchase PDF copies of theses from online-only institutions, such as Capella University, for all patrons. When borrowing proves impossible, student requests for theses and dissertations are cancelled with a note telling the requester that a copy can be purchased through ProQuest. The department will purchase a PDF copy of a dissertation if requested by faculty. However, thanks to the increasing availability of electronic theses and dissertations (ETDs), the department is able to provide patrons with access to content that is often difficult to borrow. Of the 1,119 borrowing requests for theses and dissertations during the period studied, 649 were obtained through traditional interlibrary loan. The ILS department was able to fill an additional 246 requests by using remote circulation services (28), purchasing through ProQuest (110), and locating open access ETDs (108). This left 224 requests unfilled for a fill rate of 80%.
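Restating the arithmetic behind these figures:

```latex
\[
\text{fill rate} \;=\; \frac{649 + 28 + 110 + 108}{1119} \;=\; \frac{895}{1119} \;\approx\; 80\%,
\qquad
\text{without ETDs: } \frac{895 - 108}{1119} \;=\; \frac{787}{1119} \;\approx\; 70\%.
\]
```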
Were it not for the availability of ETDs, the department's fill rate for theses and dissertations would have decreased by a full 10 percentage points. Of the 108 ETDs obtained, only eight were written prior to 2000; an additional 21 were written between the years of 2000 and 2005, leaving the highest concentration of theses written from 2006 to 2010. The ETDs represent the scholarly work of five different countries. While most ETDs obtained were written at institutions in the US (91), ETDs from other countries were also requested by users: Canada (12), Australia (3), The Netherlands (1), and South Africa (1). Florida (15), Ohio (11), and Texas (10) led US states in the number of ETDs requested. Results drop by nearly half for the next states in line, California (6) and Virginia (6). ETDs are typically located through URLs found in OCLC records or through Google searches. Not all ETDs are catalogued separately from print, so OCLC records for their print counterparts should be checked for URLs in addition to looking for Internet resource OCLC records.

Figure 4. ETDs obtained by country & state (US)

The majority (n=83, 76.9%) of ETDs were located in the granting institution's institutional repository. Institutional repositories are created by individual academic institutions to collect, preserve, and distribute the collective research output of their faculty, staff, and students. Such repositories may include research article preprints, conference papers, presentations, working papers, ETDs, and more. Institutional repository metadata can be harvested by any organisation using the OAI-PMH protocol, such as OCLC's OAIster database. The contents of OAIster are included in OCLC WorldCat search results, which increases the visibility of institutional repository collections. Institutional repositories can also be crawled by Google, thereby making their contents discoverable through a simple Google search.

Other sources for ETDs included the OhioLINK ETD Center (10) and Theses Canada Portal (9). The OhioLINK ETD Center, a joint repository for Ohio academic institutions, was the second most heavily used source for locating ETDs. OhioLINK, or Ohio Library and Information Network, is a consortium of 88 Ohio academic libraries and the State Library of Ohio that provides statewide access to resources and resource discovery systems. In 2001, OhioLINK launched the ETD Center, which is a free, online database of theses and dissertations granted by Ohio academic institutions and includes full text when available. Currently, 25 of the 88 OhioLINK academic institutions participate in the ETD Center. The Theses Canada Portal is a service of Library and Archives Canada. Though launched in 2004, most Canadian theses written since 1998 are available electronically through the portal. Library and Archives Canada estimates the number of ETDs available is approximately 50,000.

Conference paper requests

Conference papers comprised 9.7% of total open access borrowing requests in FY 2010 and 15% in FY 2011. Of the 805 total open access requests submitted during this two-year period, 103 requests were for conference papers (12.8%). More than half (57.3%) of the conference papers located were written between 2006 and 2009. While conference or association websites are often the best source for conference papers, 44.7% (46) of open access conference paper requests were located in All Academic or the related repository, Political Research Online.
All Academic is primarily a conference management tool with features including abstract management, peer review, scheduling tools, a reports generator, and final program documents. The site also offers an archiving service. Archived conference papers are available free of charge. All Academic also hosts Political Research Online, a pre-print repository project of the American Political Science Association and a consortium of similar associations. These two archives are especially strong in papers on the subject of political science.

Conclusion

In 2010, OCLC Research released a report on the findings of twelve user behaviour studies, which found that seven of the twelve provided evidence for the "increasing centrality of Google and other search engines" in the information-seeking behaviour of researchers (Connaway & Dickey, 2010, p. 27). This reliance on search engines may be a result of another common finding, the importance of speed and convenience to users (Connaway & Dickey, 2010, p. 32). The importance of convenience is reinforced in a paper by Connaway, Dickey and Radford (2011), in which they defined information source, ease of access and use, and time constraints as aspects of convenience. Significantly, the authors found that convenience is so critical to the information-seeking process that users will "readily sacrifice content for convenience" (Connaway et al., 2011, pp. 27-28).

The simplicity of a Google search typically results in millions of results. It takes time and evaluative skill to process even a fraction of these for relevant and accurate information. However, users value speed and convenience most, making it virtually impossible for them to assess the full information landscape. They will make do with the first page of search engine results and may disregard library resources entirely. If access to an important information item is not immediately apparent from the point of discovery, it is unlikely the user will search extensively for access before submitting an ILL request.

ILL requests are unlikely to decrease as a result of increasing numbers of open access materials. In fact, IUPUI University Library's ILL data shows that the number of requests filled with open access materials is actually growing while overall requests hold relatively steady. As more and more materials become freely available on the Internet, users have increasing difficulty in navigating the vast information environment and the myriad options that the Library and the Internet offer for finding what they want. ILL librarians and staff have specialized search skills and knowledge of resources of which our patrons are often unaware. Users can easily discover resources, but it is often up to ILL to deliver them. ILL departments must begin utilizing open access materials to enhance service and educate users about open access.

Resources

All Academic. http://convention3.allacademic.com/one/www/research/index.php?
arXiv.org. http://arxiv.org/
CiteSeerX. http://citeseerx.ist.psu.edu/
Gallica. http://gallica.bnf.fr/
Google Books. http://books.google.com/
HathiTrust. http://www.hathitrust.org/
Internet Archive. http://www.archive.org/
OhioLINK ETD Center. http://etd.ohiolink.edu/
Political Research Online. http://convention2.allacademic.com/one/prol/prol01/
Theses Canada Portal. http://www.collectionscanada.gc.ca/thesescanada/

References

Connaway, L.S. and Dickey, T.J. (2010), The Digital Information Seeker: Report of the Findings from Selected OCLC, RIN, and JISC User Behaviour Projects.
Available at: http://www.jisc.ac.uk/media/documents/publications/reports/2010/digitalinformationseekerreport.pdf

Connaway, L.S., Dickey, T.J. and Radford, M.L. (2011), "'If it is too inconvenient, I'm not going after it:' convenience as a critical factor in information-seeking behaviors", Library and Information Science Research, Vol. 33, pp. 179-190. Pre-print available at: http://www.oclc.org/research/publications/library/2011/connaway-lisr.pdf

Jackson, M.E. (2004), Assessing ILL/DD Services: New Cost-Effective Alternatives, Greenwood, Westport, CT.

Kohn, K. (2006), "Finding it free: tips and techniques for avoiding borrowing fees and locating online publicly available materials", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 16 No. 3, pp. 57-65.

Martin, R.A. (2010), "Finding free and open access resources: a value-added service for patrons", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 20 No. 3, pp. 189-200.

McGrath, M. (2005), "Interlending and document supply: a review of the recent literature – 51", Interlending & Document Supply, Vol. 33 No. 1, pp. 42-48.

Morrison, H.G. (2006), "The dramatic growth of open access: implications and opportunities for resource sharing", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 16 No. 3, pp. 95-107.

Pennsylvania State University (2010), "About CiteSeerX", available at: http://citeseerx.ist.psu.edu/about/site;jsessionid=440EFBF183647A637F0950A3DDEE89CE

Suber, P. (2010), Open Access Overview, available at: http://www.earlham.edu/~peters/fos/overview.htm

Weible, C.L. and Janke, K.L. (2011), Interlibrary Loan Practices Handbook, American Library Association, Chicago, IL.

About the Author

Tina Baich is an Assistant Librarian at Indiana University-Purdue University Indianapolis' University Library, where she has been the Interlibrary Loan Librarian since September 2006. Tina is a Member at Large on the ALA RUSA STARS Executive Committee and also serves as Chair of the ALA RUSA STARS International ILL Committee. She is especially interested in web-based interlibrary loan finding aids and the impact of open access on interlibrary loan. Tina is a graduate of the Indiana University Schools of Library & Information Science and Liberal Arts with master's degrees in Library Science and Public History. Tina Baich can be contacted at cbaich@iupui.edu.
work_h4q2hhrsqjfzljnoxxuggtadmm ----
work_h5qq5xidevbijefc5hrvuyslvi ----

OCLC Community Center

Lisa Romano
University of Massachusetts Boston

Published in Technical Services Quarterly, Vol. 34 No. 3 (2017), pp. 330-331. doi:10.1080/07317131.2017.1321403
Available at: https://works.bepress.com/lisa_romano/8/

Introduced in the summer of 2015, the OCLC Community Center is designed to give librarians one place to locate and share information on various OCLC products. According to OCLC, the Community Center "offers a place for library staff to connect online, share best practices, stay up to date on new product releases and contribute ideas to improve OCLC services" (https://www.oclc.org/en/news/releases/2015/201521dublin.html).
The Community Center is accessible from the OCLC WorldShare Platform (after signing in) by clicking the drop-down next to Need help? in the top right corner and then selecting Community Center. Alternatively, users can access the Community Center directly on the Web at https://www.oclc.org/community/home.en.html, then click the Sign in button and enter their username and password for the WorldShare platform. In both cases, the Community Center main page appears.

Currently, there are 11 communities: CONTENTdm, WorldShare Acquisitions, WorldShare Collection Manager, WorldShare License Manager, EZproxy, WorldShare Circulation, WorldShare Interlibrary Loan, WorldShare Record Manager, Tipasa, WorldShare Analytics, and WorldCat Discovery. Users are able to access the communities of the products to which they subscribe. OCLC plans on adding more communities in the future to accommodate other OCLC products.

Each community has a "page" for: Community Home, Discussions, News, Events, Enhancements, and Support & Training. In addition, some communities offer pages such as Downloads, Workflows, Insights, and Notes. The Community Home page includes a few of the most recent postings for each of the pages. Additionally, the Community Home page offers links, product insights, announcements, member stories, and a search box to locate content in the community. Unfortunately, the Community Center lacks a site-wide search. Most pages display the postings with the most recent posting listed first. The Events page shows the happenings that will occur first, by category. The Discussions and
Regrettably, the back button may not work if users have been inactive too long and were signed out by the system. The titles of the communities follow the name of the various OCLC products. However, Connexion users may not realize that Record Manager is the cataloging community. This community is geared mostly to WorldShare Record Manager users and not Connexion users. Thus, the Community Center is largely ignoring one of its largest user groups. Additionally, a very brief tutorial (less than 2 minutes) is available on the Community Center main page. A longer tutorial showing the features of the Community Center would be helpful and would assist in publicizing this site to more users. As the Community Center is a new site, it lacks much of the history of previous OCLC discussions and content. The OCLC listservs have been very helpful and responsive in the past. Users may be wary of adding another or changing the way they receive OCLC information, or may not have a WorldShare sign-on. At this point in time, the OCLC Community Center does not seem heavily used, focuses on the WorldShare platform, and lacks in-depth content in some communities. Rating: 3.5 out 5. Lisa Romano University of Massachusetts Boston, Massachusetts Lisa.Romano@umb.edu © 2017 Lisa Romano. Published with license by Taylor & Francis. https://doi.org/10.1080/07317131.2017.1321403 TECHNICAL SERVICES QUARTERLY 331 https://doi.org/10.1080/07317131.2017.1321403 https://doi.org/10.1080/07317131.2017.1321403 https://crossmark.crossref.org/dialog/?doi=10.1080/07317131.2017.1321403&domain=pdf&date_stamp=2017-06-15 University of Massachusetts Boston From the SelectedWorks of Lisa Romano June 22, 2017 OCLC Community Center tmpsHD3_1.pdf work_h7bpg2pyubh7bhbm3wjxmwcibi ----            Faculty and Staff Publication: Shannon Pritting Pre-Print Citation for Final Published Work: Pritting, S., & Jones, W. (2015). Advancements in Real-Time Availability in Interlibrary Loan. Journal of Interlibrary Loan, Document Delivery & Electronic Reserves, 25 (1/2), 25-38. doi:10.1080/1072303X.2016.1143905 http://www.tandfonline.com/doi/abs/10.1080/1072303X.2016.1143905 ADVANCEMENTS IN REAL-TIME AVAILABILITY IN INTERLIBRARY LOAN 2 Abstract Determining if items are available is a major part of Interlibrary Loan work. Many libraries try to minimize staff time spent on determining availability by investing in circulation based resource sharing systems that require a major investment in time and funds, and then work only for the libraries within the circulation based system. The IDS Project created a new solution, Lending Availability Service, to automatically determine availability through software that is integrated within the resource sharing software, ILLiad. The Lending Availability Service determines availability for any requests a library receives, and can automate portions of the ILL workflow that require determining whether an item is on the shelf or in a collection that can be lent. The Lending Availability Service is highly configurable and was designed with ILL workflows in mind, and overcomes problematic areas in workflows to allow for highly optimized resource sharing through automatic lookups of availability. 
Advancements in Real-Time Availability in Interlibrary Loan

Although there has been a large amount of technological development in libraries over the past decade, especially in the area of resource sharing, basic issues such as determining the availability of items for loan via Interlibrary Loan still present significant problems for most libraries. Existing availability software usually operates as a closed system, working only with other libraries using the same system, or requires availability to be checked at the point of request by the patron. Because availability checking is often the one area of resource sharing that is not consistently automated, it leads to higher costs in Interlibrary Loan. With the ability to automate availability checking for ILL requests regardless of what library they came from, libraries can decrease ILL costs. To meet the need for reliable and comprehensive availability checking, and to fill the gap in software that can work with a variety of systems and focus on automating availability when the library is a lender, the IDS Project created the Lending Availability Service application.

Literature Review

Though library consortia partner together to ensure that resource sharing is most effective among their members, according to Thomas Bruno (2013), "they still suffer from the same basic problems that libraries using Worldcat Resource Sharing experience, such as lack of real-time availability" (p. 47). Bruno goes on to chronicle how many consortia have addressed the issue of real-time availability: shared or union catalogs (2013, p. 47). Real-time availability and the ability to offer unmediated access to collections are crucial to extending access to the collections of other libraries, and multiple programs, such as Borrow Direct and the Committee on Institutional Cooperation libraries, utilize circulation standards such as the NISO Circulation Interchange Protocol (NCIP) (Bruno, 2013, p. 48). Bruno identifies one problem of circulation-based systems that determine availability and facilitate resource sharing: "consortial borrowing/lending operations often exist in isolation from traditional interlibrary loan or document delivery operations, patron expectations may not be adequately managed when transitioning from one service to another" (Bruno, 2013, p. 48). In short, libraries use separate systems to provide the same service, depending on which library owns the material, mostly for the sake of efficiency, as availability and other circulation actions are managed by circulation-based consortial resource sharing.

The need for complex real-time availability will become increasingly important as strategies for coordinated shared collections and other areas of collaboration continue to change and evolve in the next decade. Increasingly, libraries will adapt to an access model of collections, and the collections that they focus on building will be largely special collections, which may not be available for lending via Interlibrary Loan. As Michael Levine-Clark asserts, "Libraries will focus more on special collections and divide up remaining print collection responsibilities within a consortium" (2014, p. 435).
If libraries shift print collection building to consortia and only collect special materials within their individual libraries, being able to determine availability, to provide circulation-like efficiency for sharing consortial print collections, and to effectively restrict or manage complex rules regarding the lending of special collections will be areas that resource sharing will need to focus on both presently and in the future.

There are other systems that help to provide availability information, one of them being the Relais ILL system, which can "automatically search a range of catalogs via Z39.50 and use the results to automatically build a routing list" (Guadagno, 2005, p. 84). However, the Relais system checks availability only once, and the routing list then moves along without factoring in changes in availability as the transaction moves from library to library. Relais uses a number of factors to determine whether or not an item can be requested through the Relais D2D system (Relais, 2015). The first step in determining "requestability" is to identify whether the patron placing the request is eligible to use the Relais system for a specific partnership consortium, based on the patron type returned in the patron's profile. Next, specific parameters set up by a lending library using Relais D2D are analyzed to determine if an item is able to be requested. A lending library is able to set specific requestability criteria for item format, shelving location, and call number range. For item format and shelving location, a lending library must provide a list of formats and shelving locations that have been determined to be requestable. If the format or shelving location is not contained within the list stored within the Relais database, then those unlisted formats and locations are identified as unrequestable. Determining the requestability of items based on call number operates in the opposite listing method. If desired, lending libraries can list the beginning letters (or words, such as "micro") that would be unrequestable. If the alphabetic portion of the call number for the requested item is not contained within the list, then that item is identified as requestable. Finally, item availability is decided by checking the lending library's catalog to determine if the requested item is out on loan. (These rules are sketched in code below.)

Another system that allows for automatically determining the availability of library collections is the Rapid Returnables, or "RapidR," system. In 2013, RapidILL began offering book chapter requests as part of its direct request system for ILL articles. Initially, the pilot of book chapter requesting only automated the lookup of materials and the sending of requests, and did not factor in availability; despite "the lack of real time availability checking of the shelf status of the book, nearly a year's worth of experience has shown that this is not an undue burden on suppliers nor is it a delay to the requesting library's patron in receiving the chapter requested" (McWaters, 2013, p. 89). However, RapidILL has since added more availability functions as it has added Rapid Returnables (referred to as RapidR), which uses the Rapid Manager to look up availability and holdings information at the point of creating a borrowing request (Natale, 2014). Currently, however, RapidR is available only to RapidILL members and is not integrated with OCLC resource sharing functionality.
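The Relais requestability tests reduce to a short cascade. The following Python sketch restates them; the rule sets, field names, and parameters are placeholders for illustration, since Relais' internal implementation is not documented here.

```python
REQUESTABLE_FORMATS = {"book", "score"}            # lender-supplied whitelist
REQUESTABLE_LOCATIONS = {"main stacks", "annex"}   # lender-supplied whitelist
BLOCKED_CALL_PREFIXES = ("ref", "micro")           # lender-supplied blacklist


def is_requestable(item, patron_type, eligible_patron_types, on_loan):
    """Apply the tests in order; an item is requestable only if all pass."""
    if patron_type not in eligible_patron_types:
        return False          # patron not eligible for this partnership
    if item["format"].lower() not in REQUESTABLE_FORMATS:
        return False          # unlisted formats default to "not requestable"
    if item["location"].lower() not in REQUESTABLE_LOCATIONS:
        return False          # unlisted locations default to "not requestable"
    if item["call_number"].lower().startswith(BLOCKED_CALL_PREFIXES):
        return False          # listed call number prefixes are blocked
    return not on_loan        # finally, the catalog shelf-status check
```

Note the asymmetry the description requires: unlisted formats and locations fail by default, while unlisted call number prefixes pass by default, mirroring the "opposite listing method".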
Some libraries, which have focused on building collections responsive to user requests or to the library's special needs, are using Interlibrary Loan or holds and recalls to dictate what they purchase, and then use OCLC Local Holdings Records to deflect these items from being lent to other libraries due to higher demand at the home library. Indeed, this method is a way of solving one issue of availability: heavily requested material that is consistently cancelled because it is in use. For example, Gerrit Van Dyk provides a rationale and template for how to use circulation functions such as holds to create a demand-driven acquisitions program, part of which is using Local Holdings Deflections in the OCLC records to prevent other libraries from borrowing via ILL (2014, p. 306). The Local Holdings Record (LHR) Deflection as a method for denying ILL requests in the MARC record is "traditionally used only for serials, but it is extremely effective" (Van Dyk, 2014, p. 306). Many libraries use LHR deflections for frequently requested titles they do not want to lend, but "these titles must be reviewed frequently," especially as the "titles may drop in demand and be eligible to lend to another institution in time" (Van Dyk, 2014, p. 306). The process of changing LHR lending permissions is time-consuming and must be done on a title-by-title basis, so it is not a sustainable practice on a large scale without dedicating a large amount of staff time. Also, as Van Dyk discusses, the inability to merge circulation and Interlibrary Loan data seamlessly creates problems for automation and decision making for ILL staff: "ILL and holds personnel should meet often to discuss what titles are popular. The ILL borrowing team will need to keep up with these titles to know that they should not try to borrow in most cases." (2014, p. 306). As the volume of ILL borrowing and other operations increases and staff time decreases, there will be more need to have information such as an item status of "hold requested," or the number of holds placed, factored into the ILL workflow without staff intervention.
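A holds-aware lending check of the kind this paragraph calls for could look like the sketch below; the status values, threshold, and action names are assumptions, since no such integration is described as existing.

```python
MAX_LOCAL_HOLDS = 0  # deflect if any local patron is already waiting


def lending_decision(item_status, hold_count):
    """Decide how to handle an incoming lending request for one item."""
    if item_status in ("hold requested", "recalled") or hold_count > MAX_LOCAL_HOLDS:
        return "deflect"   # local demand wins; answer unfilled immediately
    if item_status == "on shelf":
        return "pull"      # route to stacks retrieval
    return "cancel"        # charged out, missing, lost, etc.
```

Run against live circulation data, a rule like this would replace both the standing meetings and the hand-edited LHR deflections Van Dyk describes.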
The National Library of Australia (2001) found that of the average lending cost of $17.03 per request, 61.2 percent was for staff time, with shipping as the second largest component (cited in Leon & Kress, 2012, p. 86). Equally important, the costs of unfilled and filled requests were broken out in the 1997 cost survey conducted at Wichita State, which found that a filled lending request cost $2.47, while an unfilled request cost $1.36 (Leon & Kress, 2012, p. 85). For Wichita State, "The low cost of lending for this institution results from student assistants performing the majority of work in the lending unit" (Leon & Kress, 2012, p. 85). Thus, figuring out ways to eliminate processes that consume staff time is essential. Leon and Kress's study, compared with previous studies, shows that staffing costs continue to be the major component of ILL lending costs, although the percentage of cost related to staff has fallen from 75 percent in the 2002 ARL study to 63 percent in their study (2012, p. 92). The 12-point reduction is largely related to continued improvements in software and technology for ILL, especially the rapid development of the major ILL management software ILLiad, developed by Atlas Systems and licensed by OCLC. However, as the processes that involve staff require the application of policies, discovering ways to program and configure policies to facilitate automation was an essential step in the development of the Lending Availability Service. One of the areas in Leon and Kress's cost study most relevant to availability was the difference between ILL and "Circ to Circ" borrowing and lending. The difference in borrowing cost was most striking, with Circ to Circ borrowing costing $2.22 per request and ILL loans costing $6.86 (Leon & Kress, 2012, p. 91). Likewise, Circ to Circ lending, at $3.58, was much less expensive than ILL lending, at $4.73 (Leon & Kress, 2012, p. 91). A major difference between Circ to Circ lending and ILL lending is likely the need for staff to look up request information, cancel requests for items that are not available, and apply policies. Although availability is not widely considered a major issue in resource sharing, the IDS Project identified it as a major area for increased automation, efficiency, and cost savings. Discussion Through the Lending Availability Service, the IDS Project Technology Development Team (TDT) sought to make lending physical items and book chapters less expensive without requiring a cost-prohibitive technology to do so. In short, the Lending Availability Service brings the cost savings of consortial circulation without the large investment in a major software package and union catalog, and provides a more flexible technology that works with requests from libraries that do not belong to the consortium using the availability software. The time and cost savings can be generated regardless of where the borrowing request originated. About the IDS Project The IDS Project, created in 2004, is a cooperative that has provided tools and support to make efficient and effective resource sharing possible for all libraries, regardless of size and financial status.
Specifically, the IDS Project has created many different software tools that connect existing software systems and services to which libraries already subscribe in order to solve important issues in resource sharing and other areas of libraries. One of the first tools the IDS Project created was the Article Licensing Information Availability Service (ALIAS), which connected OpenURL resolver holdings information with a centrally managed license database and integrated these tools seamlessly into ILLiad. With ALIAS, Project libraries went from filling only 33 percent of electronic article requests to filling 64 percent (Sullivan, 2014, p. 217). In 2009, the IDS Project created the Getting It System Toolkit (GIST) as a tool that libraries can use to connect acquisitions and resource sharing. GIST was created to "transform the business of borrowing, buying, and accessing library materials in two important ways," integrating "ILL and acquisitions into one flexible workflow" and "automating the integration of data to support librarians' decisions" (Pitcher et al., 2010, p. 224). GIST and ALIAS are both key examples of the IDS Project's ability to "combine data from various vendor Web application programming interface (API) services" with workflows and software platforms such as ILLiad, allowing for better services and tools for patrons and staff (Pitcher et al., 2010, p. 225). In 2010, the IDS Project created IDS Search, which relied on multiple API services and connected to libraries' Z39.50 services as well; this began the IDS Project's work on item availability and on innovative tools that factor availability into ILL workflows. IDS Logic During 2013, the IDS Project Technology Development Team (TDT) created the Lending Availability Service, which, through a connection to the library's ILL system via an ILLiad server addon, automates the determination of availability for loans requested through interlibrary loan. To create a real-time, dynamic method for determining availability and automating actions in ILLiad, the IDS TDT created IDS Logic, which communicates with ILLiad via a server addon, inserting information from a variety of sources into ILL transactions and sending information to external systems such as OCLC through the ILLiad system. Through the IDS Logic server addon, the actions staff would typically take to determine availability can be fully automated. Server addons are a method for libraries to interact with their ILLiad databases, send commands to update OCLC ILL requests, and automate parts of the ILL process. Server addons run via ILLiad's System Manager and can be used to send emails, cancel transactions, insert information into transactions, and perform a variety of functions in ILLiad without the need for staff to open ILL requests. Staff need only access to the ILLiad Customization Manager (typically given to library managers and staff) to install server addons, making powerful automation available to many users. As the IDS TDT developed server addons, iterating through different addons and struggling to troubleshoot issues effectively led the team to use server addons not to perform complex work, but to facilitate the connection between the ILLiad database and other automation services.
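The division of labor this produced can be suggested with a short sketch. Real ILLiad server addons are written within ILLiad's own addon framework, not in Python; the Python below is only a hypothetical illustration of the "thin bridge" pattern, with an invented middleware endpoint and invented helper functions.

import json
import urllib.request

MIDDLEWARE_URL = "https://logic.example.org/api/lending"  # invented endpoint

def process_transaction(txn):
    """Send one ILL transaction to middleware and act on the returned verdict."""
    req = urllib.request.Request(
        MIDDLEWARE_URL,
        data=json.dumps(txn).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        verdict = json.load(resp)  # e.g., {"action": "cancel", "reason": "..."}
    if verdict["action"] == "cancel":
        cancel_request(txn["id"], verdict["reason"])
    elif verdict["action"] == "route":
        route_to_queue(txn["id"], verdict["queue"])
    else:
        flag_for_staff(txn["id"])  # anything ambiguous gets human review

# Placeholders for the ILLiad-side actions the local component would perform:
def cancel_request(txn_id, reason):
    print("cancel", txn_id, reason)

def route_to_queue(txn_id, queue):
    print("route", txn_id, "to", queue)

def flag_for_staff(txn_id):
    print("flag", txn_id, "for staff review")

All policy interpretation lives on the middleware side; the local component only ferries data and applies the verdict, which is what makes such an addon simple to install and troubleshoot.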
Thus, IDS Logic uses only a basic connection to the ILL system via the ILLiad server addon, and communicates with middleware on the IDS Logic server to pass information and functions between a variety of systems and web services, such as the library's catalog and OCLC web services. IDS Logic also provides the ability to set up complex configurations of rules to apply in situations where staff previously needed to open requests and determine what actions to take based on library policies. IDS Logic makes the connections between the ILL system and other external services, and also stores each library's policies and configuration, which are used to automate both simple and complex decision points in resource sharing. IDS Logic pulls a variety of information from various web services and then applies the library's unique configuration for workflow decisions. Through Logic, libraries can choose to automate several parts of their ILL process, or just a few, depending on local policies, procedures, and eagerness to automate services. Through consultation with IDS Project members, new automation and efficiency services are continuously added to IDS Logic. Lending Availability Service In the case of the Lending Availability Service (LAS), there is a connection to the library's Z39.50 server or to a web service holding availability information, which returns availability, call number, location, and other relevant holdings information. The IDS TDT created an XML schema that allows for customized mapping of information and MARC data from the Z39.50 server or availability service, both to accommodate different systems that do not use the same fields and to accommodate non-standard cataloging practices that use different MARC fields for similar purposes. Custom mapping of information returned by the Z39.50 server or availability web service is also particularly helpful when staff want to make automation decisions based on local notes and unique cataloging practices. The IDS TDT surveyed technologies that would allow for real-time availability lookups and found that some, such as those afforded by the NISO Circulation Interchange Protocol (NCIP), were very well suited to this use, but usually required libraries to spend large amounts on licenses and usually could not be used if the requesting library did not use the NCIP protocol. However, virtually all libraries had a Z39.50 server or availability web service providing access to their online public access catalog, and non-standard cataloging and availability information could be mapped to regularized fields. This made reliance on Z39.50 and existing availability web services the best choice: there would be no extra cost to libraries, and non-standard information could be accommodated. As more libraries expressed interest in the Lending Availability Service, more methods of determining availability were added. For example, libraries using OCLC's WorldShare Management System (WMS) relied on an availability API with which the IDS Logic middleware interacts, applying complex rules and passing information between WMS and ILLiad. Similar connections were made with Ex Libris' Alma and other integrated library systems that may not rely on Z39.50 technology.
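As a rough illustration of this kind of configurable mapping, the Python sketch below normalizes holdings data that different libraries catalog in different MARC fields. The profiles, tags, and sample record are invented, and the real IDS Logic mapping is defined in XML rather than in code.

# Each profile says which (tag, subfield) feeds each regularized field.
GENERIC_PROFILE = {"call_number": ("852", "h"), "location": ("852", "b"),
                   "status": ("876", "j")}
LOCAL_NOTES_PROFILE = {"call_number": ("099", "a"),  # non-standard local practice
                       "location": ("852", "c"), "status": ("876", "j")}

def normalize_holdings(raw, profile):
    """Map raw (tag, subfield) -> value holdings data to regularized fields."""
    return {field: raw.get(source, "") for field, source in profile.items()}

raw_record = {("099", "a"): "Local Qk 12", ("852", "c"): "Annex",
              ("876", "j"): "Available"}
print(normalize_holdings(raw_record, LOCAL_NOTES_PROFILE))
# {'call_number': 'Local Qk 12', 'location': 'Annex', 'status': 'Available'}

Because the profile, not the code, carries each library's quirks, a new system or an idiosyncratic cataloging practice needs only a new profile.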
The Lending Availability Service finds all the relevant information needed either to import the information so staff can pull the books, or to cancel the request in both ILLiad and OCLC resource sharing. This leads to much faster cancellation times from lending libraries that cannot lend a book that is checked out or in a restricted collection, improving patron services for all. It also means less staff time spent cancelling requests for items that cannot be lent, and faster finding and delivery of loan requests. Because the Lending Availability Service can find all temporary or permanent location information within the catalog, it offers much more refined deflection capabilities than the OCLC Policies Directory Deflections, which depend on format type, publication age, and other broader criteria held in OCLC holdings. The deflection abilities in LAS allow for deflection based on collections (e.g., Rare Books Room), temporary locations (e.g., Course Reserves), statuses (e.g., Missing, Lost), or any other information the library's Z39.50 server contains. In addition to checking whether an item is checked out, the Lending Availability Service also applies several availability "rules" to each ILL lending request, which factor in the many variations in local systems and cataloging practices. One rule bypasses availability checks, which is useful for items such as media that may be booked or reserved in a system outside the library's integrated library system. This is helpful for items that need to be flagged for review by staff and require decisions that cannot be programmed efficiently. Another rule allows libraries to treat items that local catalog policies always display as unavailable as available to lend via ILL. So, if a library is willing to lend a non-circulating collection via ILL, those items can have the call number and location inserted but return a positive availability when the system would normally return unavailable. The last type of rule allows for automatic inclusion or exclusion of collections, item types, locations, or temporary locations. These include and exclude rules act as refined deflections for libraries that do not want to lend specific groups of items, or that know they have cataloged items differently in ways that affect availability. An example of an inclusion rule involves government documents that are not fully cataloged to the item level: no system could provide item-level availability for them, so the library needs to check the shelves. Exclusions could include course reserves, new book shelves, or permanent collections such as college archives or special collections. If a library has multiple copies in its collection, LAS will check all copies to determine whether any are available, prioritizing the collection or location the library has set as preferred.
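A minimal sketch of the three rule types just described (bypass, always-available overrides, and include/exclude deflections) might look like the following; all locations, item types, and policies here are hypothetical examples, not any library's actual configuration.

from dataclasses import dataclass

@dataclass
class Item:
    location: str
    item_type: str
    status: str  # e.g., "Available", "Checked Out", "Missing"

BYPASS_TYPES = {"media"}                    # booked in a separate system: staff review
ALWAYS_AVAILABLE_LOCATIONS = {"reference"}  # catalog says unavailable; library lends anyway
EXCLUDED_LOCATIONS = {"course-reserves", "special-collections"}

def evaluate(item):
    if item.item_type in BYPASS_TYPES:
        return "review"   # cannot be decided programmatically
    if item.location in EXCLUDED_LOCATIONS:
        return "cancel"   # refined, collection-level deflection
    if item.location in ALWAYS_AVAILABLE_LOCATIONS:
        return "ship"     # override the catalog's "unavailable" status
    return "ship" if item.status == "Available" else "cancel"

print(evaluate(Item("main-stacks", "book", "Checked Out")))  # cancel
print(evaluate(Item("reference", "book", "Lib Use Only")))   # ship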
Lastly, many libraries determine availability differently for libraries within specific groups, and many will lend restricted collections to some groups but not others. A major example of a group for which most member libraries expand lending privileges is the OCLC Research Library Partnership's SHARES. SHARES libraries, among many other benefits, try to lend as much as possible to each other and to treat each SHARES request as a special case rather than summarily cancelling it. In these cases, LAS is able to factor in the groups a library belongs to and apply different policies accordingly. So, if a library wanted to deflect special collections for everyone except SHARES libraries, LAS could facilitate group-based exceptions such as this. Once an item has been determined to be available, specific criteria are used to determine the due date for the item requested. A basic configuration allows libraries to set a standard due date for all items sent to a borrowing library. More complex configurations allow libraries to set specific due-date intervals based on information returned in the MARC record of the item requested. For instance, LAS could adjust the due-date interval if a lending library had a local policy of lending A/V materials for a shorter period than monographs. Any other information included in the MARC record, including local notes, could also be used to differentiate the due date of the requested item. In addition, LAS has the ability to set specific due dates for defined groups of libraries and can accommodate multiple lists to allow a different due date for each group. There are also several predictable problem areas in the lending loan workflow in ILL/ILLiad that Lending Availability helps staff identify and flag for careful processing. The first issue is identifying the correct ILLiad lender address to which the request should be attached. Those who have used the ILLiad interface on a regular basis will recognize how frequently staff are required to manually select lenders in ILLiad, as well as the shipping issues caused by incorrect lender selection. When a lending request is analyzed by LAS, it identifies whether the library's ILLiad database has multiple lender addresses for the OCLC symbol. If there is only one lender address that can be adequately matched, the lender address is set by LAS, and the transaction is routed or cancelled as necessary. If there are multiple lender addresses and an adequate matching record is not found, the transaction is flagged for staff review. Since the routine selection of lender addresses that have no issues or require no updates is now handled by LAS, staff can view more critically the lender selection for requests they must process manually, which helps reduce shipping errors. In addition to multiple lenders, LAS also determines when a lending loan request is for a title that has multiple volumes or multiple items. LAS looks in the Z39.50 or availability web service fields used for volume information, inserts information indicating that the title is multivolume, and routes the request to a custom ILLiad queue, if desired. Because multivolume requests are often problematic (the borrower may not indicate which volumes he or she would like), this functionality flags transactions that need special attention or review. Also, if there are collections or item types that often create false multiple-item requests, such as compact disc collections, these can be exempted from the multiple-item/multiple-volume check.
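The tiered due-date configuration described above reduces to a simple lookup, as the hedged sketch below shows. The intervals, material types, and group names are invented, and the assumption that the most generous group interval wins is ours, not a documented LAS behavior.

from datetime import date, timedelta

DEFAULT_DAYS = 56                     # standard loan period for all borrowers
TYPE_OVERRIDES = {"audiovisual": 21}  # e.g., a local policy of shorter A/V loans
GROUP_OVERRIDES = {"SHARES": 84, "state-consortium": 70}

def due_date(material_type, borrower_groups, shipped):
    days = TYPE_OVERRIDES.get(material_type, DEFAULT_DAYS)  # MARC-driven override
    for group in borrower_groups:
        days = max(days, GROUP_OVERRIDES.get(group, 0))     # assume most generous wins
    return shipped + timedelta(days=days)

print(due_date("audiovisual", set(), date(2015, 3, 2)))     # 21-day A/V loan
print(due_date("monograph", {"SHARES"}, date(2015, 3, 2)))  # 84-day SHARES loan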
Because automating even a process as seemingly simple as determining the availability and location of lending returnables is fraught with potential for error, the IDS Project TDT worked with staff at several libraries to analyze the ILL workflow, identify the areas that needed checks and quality control, and develop additional configuration and ILLiad addon functions to address these issues. One such development is the "MaxCost Reviewer," which normalizes the non-standard information that borrowers often insert into the maximum cost field in ILL requests and then compares this amount against the library's maximum cost, factoring in group and individual agreements for libraries to provide reciprocally free services. The goal throughout development of the Lending Availability Service has been to identify whenever a request must be opened and to develop ways to remove that need, or to make the decisions as quickly as possible after the request is opened. As development of the Lending Availability Service has grown, the IDS Project Technology Development Team has expanded availability automation to book chapter requests as well. During initial phases of development, when librarians and staff working in ILL were surveyed to determine how differently they treated scan or chapter requests versus loans, it became clear that most libraries were much more liberal in scanning than in lending. To create the book chapter availability lookup, the same custom mapping of Z39.50 or availability web service fields is used to determine whether the item has a number matching an ISBN's general characteristics in the correct MARC fields, and this is then connected with other parts of the MARC record to determine that the request is for a book chapter. Rather than relying on item type definitions (e.g., using "monograph" as the only type checked), a more refined approach was created to handle all of the different item types under which material that would be considered a "book chapter" request in ILL might be cataloged.
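The kind of normalization the "MaxCost Reviewer" described above performs can be suggested with a short sketch: reduce the free-text maximum-cost field to a number, then compare it with the lending fee, honoring reciprocal agreements. The parsing rule, symbols, and fee below are invented illustrations, not the actual IDS Logic implementation.

import re

RECIPROCAL_PARTNERS = {"XXX", "YYY"}  # invented OCLC symbols lent to at no charge

def parse_max_cost(raw):
    """Pull the first number out of strings like '$25.00', '25ifm', or '0'."""
    match = re.search(r"\d+(?:\.\d+)?", raw or "")
    return float(match.group()) if match else 0.0

def within_max_cost(raw_max_cost, lending_fee, borrower_symbol):
    if borrower_symbol in RECIPROCAL_PARTNERS:
        return True  # reciprocal agreement: no charge applies
    return parse_max_cost(raw_max_cost) >= lending_fee

print(within_max_cost("$25.00 IFM", 15.0, "ZZZ"))  # True
print(within_max_cost("no charge", 15.0, "ZZZ"))   # False: parses to 0.0
print(within_max_cost("0", 15.0, "XXX"))           # True: reciprocal partner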
Borrowing Availability Service Having developed an availability service that was functional for lending, the TDT then moved on to availability checking in interlibrary loan borrowing, creating the Borrowing Availability Service (BAS). Although ILL borrowing is meant to supplement local collections, for many libraries loan requesting in borrowing has become a second-copy service. More liberal circulation policies, in addition to improved service in ILL, have increased the percentage of ILL borrowing for items that the library already owns but that are checked out. For example, from 2012 to 2014, Syracuse University's ILL borrowing was often for materials owned locally. Syracuse patrons received 6,493 locally owned items via ILL, which was 31.6 percent of the 20,546 total ILL loans from that period. Of those 6,493 requests, 5,054 were unique titles. As these statistics were gathered by matching OCLC numbers for requests against the OCLC fields in the local catalog, the number of locally owned materials requested may be even higher, considering that alternate editions may have been requested for some locally owned materials and that additional requests were cancelled because the items were held in reserves collections. With the number of unique titles, and the high percentage of requests for items that are held but not available, real-time availability in borrowing has a large impact on the amount of time staff spend processing borrowing requests. The ability to check local availability before sending ILL requests out automatically via OCLC's "Direct Request" service for loans is a crucial part of automating more borrowing requests. To check local availability, all borrowing loan requests are automatically checked to determine whether the items are checked out. If the item is found to be available, the request is automatically sent to a document delivery queue in ILLiad, requiring only student assistant time to pull the item from the shelves. If the item is owned but not available, the request is sent via Direct Request. As the BAS can also determine local collections and temporary statuses, libraries using this service can also stop textbook or reserve collection requests from going out via Direct Request, a persistent issue for interlibrary loan departments. One of the main settings in a library's Direct Request profile is whether or not to review locally owned materials. Although some libraries choose to allow any requests that are locally owned to be sent to other libraries without review, many manually review requests that are held locally before sending them via ILL, and BAS offers an automated way to check availability locally without forcing staff to open ILL borrowing requests needlessly. Like other IDS Project tools, BAS seeks to be configurable enough to meet local needs so that automation tools are used to their fullest potential, rather than being deactivated, circumvented, or used minimally.
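The borrowing-side routing just described amounts to a three-way decision. The sketch below illustrates it under assumed queue names and protected locations; none of these names come from BAS itself.

PROTECTED_LOCATIONS = {"course-reserves", "textbook-collection"}

def route_borrowing_request(owned_locally, available, location=""):
    if owned_locally and location in PROTECTED_LOCATIONS:
        return "staff-review"       # stop reserve/textbook requests from going out
    if owned_locally and available:
        return "document-delivery"  # student assistants pull from local shelves
    return "direct-request"         # not owned, or owned but checked out

print(route_borrowing_request(True, True, "main-stacks"))   # document-delivery
print(route_borrowing_request(True, False, "main-stacks"))  # direct-request
print(route_borrowing_request(False, False))                # direct-request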
Case Studies of Effect of Lending Availability Service The Lending Availability Service (LAS) is currently being used at most of the IDS Project libraries and at some libraries outside the IDS Project, ranging from small community colleges to comprehensive four-year and master's granting institutions to large research institutions. At the State University of New York at Geneseo, primarily a comprehensive four-year college with a full-time enrollment of approximately 5,600 students, the Lending Availability Service processed 2,777 of 2,954 total lending loan requests from June 1 to December 31, 2014. Of the transactions LAS processed, 1,236 requests were automatically cancelled without staff mediation. About half of these requests were cancelled because the item was not available on the shelf (622 requests), and the other half were cancelled due to collection-level deflections (614 requests). None of the cancellations could have been handled by OCLC Policies Directory deflections, as the cancellations were based on availability and collection status. The Lending Availability Service determined that 1,420 requests (48 percent of the total lending loan volume) were for items available to be shipped, and of those requests, 99 percent (1,410 requests) were shipped to the borrowing library. The Lending Availability Service found that 86 requests did not contain an ISBN or OCLC number matching the catalog, and an additional 49 requests returned an unknown availability, meaning that the request contained an ISBN or OCLC number but the catalog did not contain an item record for the requested item, or the title within the request was less than a 60 percent match to the title within the catalog. In comparison to the total number of ILL requests handled at Geneseo, the Lending Availability Service automated 94 percent of lending loan requests (2,954 total requests), 36 percent of lending requests (7,650 total requests), and 13.6 percent of all interlibrary loan requests (20,395 total requests). Understandably, the larger the institution, the greater the effect the Lending Availability Service will have in saving time and improving services. At Syracuse University, from June 1 to December 31, 2014, the Lending Availability Service processed 11,743 requests. Of these, 2,557 were completely automated cancellations that no staff ever opened. Of the 2,557 cancellations, 2,038 were for items that were checked out, with 483 items cancelled because of collection-level deflections. The Lending Availability Service automated everything but pulling and marking as "shipped" for 8,523 items, making the processing of these items largely student-level work. Only 389 were identified as "Bypasses," which staff were required to process manually due to complex policy interpretation. Interestingly, as LAS runs every 2-3 minutes, the potential to automate another 1,231 requests was missed because staff opened transactions and manually processed them before the automation could run. For the period of 6/1/2014-12/31/2014, the total ILL volume for Syracuse University (including borrowing, document delivery, and all lending) was 42,369, which means that LAS automated or partially automated over 27 percent of the total ILL request volume at Syracuse University. The numbers of cancellations by LAS at Syracuse University and Geneseo demonstrate the time savings that availability checking can provide at a library. Further time savings can be demonstrated by automating availability lookups in borrowing. Additionally, the large number of cancellations due to items being checked out or in restricted collections also shows that ILL requests are often for items in high demand or in specialized local collections such as course materials or special collections. Conclusion Since the creation of the Lending Availability Service, IDS Logic, and later the Borrowing Availability Service in 2013, over 100 libraries have implemented the IDS Project's availability automation tools, saving a great deal of time in processing availability and looking up call numbers and locations. These libraries have benefitted from this automation without needing to implement a union catalog or a comprehensive circulation-based ILL system. Instead, the Lending Availability Service can be easily implemented and is designed to integrate with any requests coming into the ILLiad system, with highly configurable settings. Libraries' configurations have varied from hundreds of local collection deflections to just a few collections that are not allowed to lend. Each installation of the Lending Availability or Borrowing Availability Service has been unique, with IDS Logic serving to create flexible automation of ILL workflows.
Although innovations in determining real-time availability may not be as high profile or exciting as advancements in discovery or other areas of libraries, availability is an issue with large repercussions for the ability to deliver high-quality resource sharing services efficiently and in a cost-effective manner. By creating real-time availability tools that can be refined to fit ILL workflows and policies, a large percentage of ILL work can be automated, freeing resource sharing staff to focus on the time-consuming but rewarding public service aspects of interlibrary loan. In sum, staff freed from the need to determine shelf availability become more available to serve patrons. References Bruno, T. (2013). Access services beyond circulation: Interlibrary loan and document delivery. In Dawes (Ed.), Twenty-first century access services: On the frontline of academic librarianship (pp. 45-60). Chicago: Association of College and Research Libraries. Guadagno, E. (2005). Taking interlibrary loan and document delivery to new frontiers using Relais ILL. Journal of Interlibrary Loan, Document Delivery, and Electronic Reserve, 15(4), 83-87. Leon, L., & Kress, N. (2012). Looking at resource sharing costs. Interlending & Document Supply, 40(2), 81-87. doi:10.1108/02641611211239542 Levine-Clark, M. (2014). Access to everything: Building the future academic library collection. Portal: Libraries & the Academy, 14(3), 425-437. MacWaters, C. (2013). Having it all: Using RapidILL for book chapter interlending. Interlending & Document Supply, 41(3), 87-89. Natale, J. (2014). RapidILL and RapidR book chapters and loans. Paper presented at the IDS Project annual conference, Syracuse, NY. Presentation retrieved from http://idsproject.org/conferences/2014/presentations/Natale-Presentation-IDS-RapidL.pdf Pitcher, K., Bowersox, T., Oberlander, C., & Sullivan, M. (2010). Point-of-need collection development: The Getting It System Toolkit (GIST) and a new system for acquisitions and interlibrary loan integrated workflow and collection development. Collection Management, 35(3-4), 222-236. Relais. (2015). Requestability. Retrieved from https://relais.atlassian.net/wiki/display/ILL/Requestability Sullivan, M. (2014). Article Licensing Information and Availability Service (ALIAS). In V. Horton & G. Pronevitz (Eds.), Library consortia: Models for collaboration and sustainability. Chicago: ALA. Van Dyk, G. (2014). Demand-driven acquisitions for print books: How holds can help as much as interlibrary loan. Journal of Access Services, 11(4), 298-308. doi:10.1080/15367967.2014.945120 work_hcfoyn4aujbqjpo66w3rqykc3e ---- Intake of Digital Content: Survey Results From the Field D-Lib Magazine November/December 2016 Volume 22, Number 11/12 Intake of Digital Content: Survey Results From the Field Jody L. DeRidder and Alissa Matheny Helms University of Alabama Libraries {jlderidder, amhelms}@ua.edu DOI: 10.1045/november2016-deridder Abstract The authors developed and administered a survey to collect information on how cultural heritage institutions are currently managing the incoming flood of digital materials.
The focus of the survey was the selection of tools, workflows, policies, and recommendations, from identification and selection of content through processing and providing access. Results are compared with those of similar surveys, and a comprehensive overview of the current state of research in the field is provided, with links to helpful resources. It appears that processes, workflows, and policies are still very much in development across multiple institutions, and the development of best practices for intake and management is still in its infancy. In order to build upon the guidance collected in the survey, the authors are seeking to engage the community in developing an evolving community resource of guidelines to assist professionals in the field in facing the challenges of intake and management of incoming digital content.   1 Introduction Digital materials pouring into special collections and archives present new and complex challenges for archivists, librarians, and records managers. As new records of our unfolding history are almost completely in digital form at this point, many cultural heritage institutions are struggling to develop and institute practical policies and procedures for the intake, selection, processing, and access of digital content. Particularly when faced with the intake of multi-terabyte hard drives of mixed content, archivists may be overwhelmed with how to even begin to sort, identify, and select content from a device. Even the choice of tools can be overwhelming; for example, the Community Owned Digital Preservation Tool Registry currently lists 415 tools.1 In order to sift through the possible options more effectively before setting up local policies and procedures, we developed a survey (see Appendix I) to uncover what experienced digital archivists would recommend. The focus of the survey was on the selection of practical tools, the development of productive workflows, and recommendations.   2 Literature Review   2.1 Books The continual change in the field virtually precludes the inclusion of detailed recommendations in published books, which instead usually focus on overviews and thematic approaches. In 2001, Lazinger provided an excellent overview of many of the issues involved in selection and preservation of electronic documents, supplemented by a then-extensive compilation of existing archives and digitization centers for more information.2 While providing a useful overview of the current field of digital preservation and known standards, the recent tome by Corrado and Moulaison fails to cover practical tools and workflows for those in the field.3 Sabharwal's 2015 publication4 extends the overview to include various forms of social media, and builds on the DCC Curation LifeCycle,5 which provides a more complete overview of all aspects of digital preservation. While these books provide an excellent review of issues and sometimes recommendations for what should be done, they do not try to address how to actually perform the work.
More useful from an implementation perspective is the somewhat dated 2010 manual by Ross Harvey, which recommends specific tools, websites and tutorials.6 Also in 2010, a CLIR publication provided an in-depth review of the challenges and issues in intake and management of born-digital content, with a clear focus on the digital forensics aspects and rights issues.7 For current information about tools and resources, the online Digital Preservation Coalition Handbook8 is an excellent reference, which closely aligns with the purpose of our study, as it includes some information about what other institutions are doing (as case studies, primarily conducted in Europe). Many of the resources referenced by this handbook are best maintained online, such as a crowdsourcing effort that seeks to document all existing file formats,9 and an active site for questions and answers on digital preservation.10   2.2 Articles, White Papers and Reports Articles published in the past few years range from specific tool coverage and case studies to broad reviews of the challenges faced by librarians, archivists and records managers. Specific tools covered in depth by articles include BitCurator,11 a custom Word plug-in for substituting fonts,12 a custom Python script for transferring digital content across NTFS systems and collecting some data,13 AutoHotkey (automation in Windows) and Selenium IDE (for metadata work in Firefox),14 and guidance for small-scale web archiving (on Macs) with SiteSucker, TimeMachine and FileMerge.15 One fascinating case study on the use of Forensic Toolkit software for capture and processing of floppy disks, Zip disks, and CDs may be very useful for archivists faced with such media.16 Another case study described a low-cost exploration into building access to old media on newer equipment, for the digital forensics and extraction of obsolete formats.17 In 2013, the IMLS-funded Digital POWRR group organized almost seventy tools into categories based on the digital curation lifecycle and documented their functionality.18 A full discussion of their findings, and recommendations based on level of financial resources, are available online in a white paper.19 Some articles focus on case studies for collecting and archiving specific types of content, such as tweets,20 digital images,21 a Web collection of fugitive literature,22 catalogues raisonnés,23 videos from DVDs,24 and institutional records.25,26 Other case studies focus in depth on particular issues, such as the risks of data migration for certain formats,27 or appraisal of electronic records in national archives.28 One unusual approach to developing a framework for appraisal and selection of digital content was based on statistical sampling, risk analysis and appraisal;29 unfortunately, the effectiveness of this approach was not evaluated.
However, the criteria for selection are quite useful as guidelines: mission alignment, value of the resources, cost and feasibility.30 In 2015, the University of Minnesota Electronic Records Task Force published a public report of their initial efforts to develop capacity in the Libraries to preserve and provide access to electronic resources.31 In their effort to develop initial policies and procedures for ingest, they realized that tools often worked differently with different file types, and "each collection brings with it the possibility of a new ingest scenario".32 Because of the inconsistencies in how various tools worked, they developed step-by-step guides for over 20 tools or processes; while they did not include these in the report, they did provide an overview and general guidelines. An interesting note from a survey of the University of Minnesota library staff is that email attachments were increasingly a source of concern for collections content.33 After developing initial guidelines, roles, documentation and steps for ingest, the Minnesota task force summarized: "A sound ingest process requires understanding the original storage media, determining the method of transfer best suited to the media and file types, having a secure storage location within the Libraries, and running multiple programs against source and destination records for quality control and to establish a preservation baseline."34 Prior to the final report, staff managed to ingest 13 accessions (over 24 GB), but the problem of determining what to keep created a bottleneck in the workflows, to be addressed in the future.35 Based on their limited experiences, they estimate an average time of 12 GB/hour for initial ingest only, with an additional seven hours per collection.36 In contrast to the University of Minnesota case study, a recent article by a lone arranger outlines her explorations in ingest of small quantities of digital content; she highlights the critical importance of appraisal, and notes that outsourcing these efforts requires funding that simply isn't available.37 Focusing on access methodology, one article offered up a method of providing web browse access to disk images, to simplify the questions of selection, management of collected content, and access, putting the burden on the user of whether to seek emulation or migration for improved usability.38 An extensive comparison of several repository systems in terms of suitability for visual research data (extendable to other data) was published in 2013.39 A more recent article combined three case studies about how to emulate and a brief discussion of the pros and cons of emulation as a service delivered over the web.40 Theoretical discussions also abound, such as Xie's analysis of the concept "reproducibility" as used in digital forensics for the purpose of digital records management.41 In 2011, Goldman urged archivists and records managers to begin to take initial steps towards appropriate management of digital content, regardless of limitations and seemingly insurmountable odds.42 In 2012 and 2013, OCLC Research provided a series of useful reports echoing Goldman's article, in an effort to assist archivists in taking the difficult first steps to effectively manage born digital content.43 In the widely acclaimed initial report, Erway provided basic principles and instructions for surveying and inventorying born-digital content, as well as basic steps for extracting digital content from readable media.44 A following report expanded on these basic steps and
provided links to suggested tools, software, and resources that could provide in-depth discussions and further options.45 In 2012, the AIMS Project report stated that "the development of best practices within born-digital stewardship was not yet possible," so they sought to define good practices instead.46 Their inter-institutional framework was organized into four "Functions of Stewardship" (each with objectives, outcomes, decision points and tasks): collection development, accessioning, arrangement and description, and discovery and access.47 Critical considerations were clearly stated, such as "A determination is made as to whether the collection can be reasonably acquired, managed, and preserved within the constraints of the institution's resources."48 One of the points made in this report is perhaps key to developing best practices: each type of born digital content transfer has different implications that will likely lead to different workflows.49 The AIMS report was intentionally software-agnostic, though it included appendices with detailed case studies50 and tool reviews.51 A major outcome of this project was the draft of functional requirements for a tool to support arrangement and description of born-digital materials,52 though the resulting Hypatia project shows no signs of activity since 2013.53 Challenges faced by under-resourced institutions are eloquently described in a 2013 OCLC article.54 A recent broad overview of the current challenges of digital preservation observes that the field is developing swiftly, and that a danger to be avoided is being drawn into a preservation path that may not be critical, such as file format migration for open formats.55 Disturbingly, in the results of a survey of faculty at five Digital POWRR56 project partner institutions, 55.3% of respondents have lost irreplaceable work-related digital content, and 62.5% have obsolete digital content they likely can no longer access.57 The need for effective digital curation is pressing.   2.3 Surveys A 2009 OCLC survey identified born-digital materials as one of the top three "most challenging issues" in managing special collections (the others were digitization and space), and stated that "management of born digital archival materials is still in its infancy."58 Yet 79% of respondents had collected born-digital materials.59 This survey covered a variety of aspects of special collections and archives content and management, but did not attempt to identify tools and workflows, target formats or useful methods and recommendations for born digital materials. In 2012, an extensive survey of 64 of the 126 ARL libraries captured a snapshot of the tools, workflows, and policies used by special collections and archives to process, manage and provide access to born digital materials.60 This survey covered staffing, storage solutions, influences, and training as well as several of the areas covered in the survey described in this article. As the results of the current survey differed in several respects for some very similar questions, a comparison of these results will be included in the results discussion. Also in 2012, a survey of training needs in digital preservation found that the three most needed topics were "Methods of preservation metadata extraction, creation and storage" (70.3%), "Determining what metadata to capture and store" (68%), and "Planning for provision of access over time" (65.4%).61 The current survey may provide some guidance with regard to each of these. 
In 2013, the Smithsonian Institution surveyed seven of their archival units about their born digital holdings.62 They counted over 12,000 pieces of physical media, the majority of which were CDs,63 but their survey did not ask whether the holders were capable of extracting content from their media. Their holdings also may or may not be representative of those in other cultural heritage institutions. Mayer has performed a survey annually to assess the status of born-digital preservation only in Canadian archives;64 the results of the 2013 and 2014 surveys indicate a great deal of confusion about what holdings actually existed in each repository, and a good bit of uncertainty as to how they were actually managed.65 At this writing, the 2015 results are not yet available.66 At the end of 2014 and the beginning of 2015, another survey by UNESCO67 gathered information from members of 27 organizations (seven archives, twelve libraries, two museums, and six heritage organizations) to determine world-wide trends in selection of born digital heritage collections. The results of this survey were very general and varied, indicating that determining significance of content is key to selection, and clarifying that the field was still in its infancy.68   2.4 In Context Despite the excellent coverage of the issues and an expansive range of possible tools, none of the existing publications provides a clear and current overview of what a broad selection of cultural heritage peers are doing in the field: what works, what doesn't, and the choices they have made. When faced with almost any decision on setting policy in the field of digital librarianship, a review of what peers have implemented and an understanding of the pros and cons of their approaches can provide invaluable guidance. No one wants to repeat others' errors, and building upon existing practical experience is always ideal. While several institutions do provide some of this information in reports, articles, or on their websites, our survey was intended as an approach that would gather information in a form allowing us to compare responses across a targeted set of questions. Where the other surveys noted address similar issues, a comparison will be made in the results discussion.   3 Approach We reviewed the questions used by the surveys mentioned above, and adapted some of them for our survey while adding others that would provide us with detailed clarification about format selection, tools and workflows. The 20-question survey (see Appendix I) was divided into six sections: Materials & Content, Workflows, Content Management, Preservation Metadata, Access, and Recommendations. The Workflows section, which contained primarily open-ended questions, was divided by stages: Identification, Analysis, Selection, and Processing. To allow participants to review the questions prior to taking the survey, the survey was saved to PDF form and posted online, and a link to this PDF was included in the announcements. The month-long survey was announced in mid-May (and again in early June) 2016 to four Society of American Archivists listservs: Metadata and Digital Objects Roundtable, Electronic Records Section, Research Libraries, and Archives & Archivists. Other listservs included Code4Lib, the Digital-curation Google group, Digital Library Federation, Diglib, and the American Library Association Library Information Technology Association (LITA).
4 Results   4.1 Participants Half of the 62 respondents were from academic libraries (31); 12.9% (8) were from archives and the same number from government organizations; 4.84% (3) from museums, 3.23% (2) from public libraries, and 1.61% (1 each) from a historical library and a special library. Eight respondents (12.9%) identified their institution as "other": institutional religious archives, national library, public media organization, radio and television, museum and library, technical company, academic archives, and arts & education. Our survey allowed for free-text description of roles, but over half (58.07%) self-identified as some sort of archivist or curator (36 of 62), 43% (27) of them as having a digital role. 22.58% (14) identified as having a managerial role. Additionally, 17.74% (11) self-identified as a type of librarian, 11.29% (7) of whom mentioned digital in the title, one who was also an archivist (not counted above), and one metadata librarian. One respondent did not identify a role.   4.2 Types of Media Of the 50 who responded to the question about what obsolete media they could effectively manage, 90% (45) are prepared to obtain content from 3.5 inch floppy disks, 84% (42) from PC hard drives, and 72% (37) from Mac hard drives. Over half (52%, 26) can extract content from 5.25 inch floppy disks, but only 10% (6) can work with 8-inch floppy disks. Varieties of Unix/Linux hard drives are only manageable by 28% (14) of the respondents, and hard disk drives that are not 2.5 or 3.5 inches can be managed effectively by only 34% (17) of respondents. Furthermore, 74% (37) can extract from Zip disks, and 22% (11) of the respondents are prepared to extract content from MiniDiscs. Of the other types of media that respondents can extract content from, 3 respondents mentioned media that is not yet obsolete: CDs/DVDs and thumb drives (1 mention). Of those which are becoming obsolete, 2 mention video cassettes, computer magnetic tapes/LTO, and flash media (SD cards). Two respondents also mention Jaz disks, and 1 each mentions audio cassettes, reels, SyQuest cartridges, DAT (Digital Audio Tape) and DDS (Digital Data Storage). By comparison, the types of physical media (over 12,000 pieces) identified in the Smithsonian survey included: 33% CD; 17% DATs; 16% 3.5" diskette; 14% 5.25" floppy diskette; 12% DVD; 7% other (excluding other data cartridges); and 1% ZIP data cartridge.69 There was no indication in the Smithsonian survey as to whether access to the media was yet supported. At the time of the survey, the expectation was that future Smithsonian acquisitions would likely include a large percentage of digital images, digital video, and computer-aided design (CAD) files.70 Floppy disks, hard drives and Zip disks were far more prevalent in our current survey than in the Smithsonian one.   4.3 Types of Content Of our 62 respondents, all of them are collecting digital text documents, and 95% (59) are collecting still images; 92% (57) are collecting audio recordings and 90% (56) are collecting moving images/videos. Over half are collecting websites (55%, or 34) and databases (53%, or 33); 42% (26) are collecting email; 27% (17) are collecting Geographical Information Systems (GIS) data, and 26% (16) are collecting executable files (software). Other types of content mentioned include data sets (2 respondents), 3D models and computer-aided design, research data in all formats, books and ephemera, and other project files.
In addition, academic libraries collect the most types of content: 75% (9 of 12) of institutions collect 8-10 types of content. By comparison, in the 2012 ARL survey (which also had 62 respondents), all the percentages except GIS were lower, indicating a likely increase over the years, and the relative ranking among them was very similar. Only 69% (43) were collecting digital text; 85% (53) still images; 79% (49) audio; 68% (42) moving images; and 77% (48) videos.71 Institutional websites were collected by 39% (24); other websites by 29% (18); databases by 34% (21); email by 37% (23); GIS by 31% (19); and executables by 23% (14).72 In the 2009 OCLC survey, the percentages were even lower: 55% photographs; 47% audio; 46% institutional archival records; 45% video; 44% other archives and manuscripts; 36% publications and reports; 27% web sites; 15% serials; and 11% data sets.73 Some of these categories differ from those gathered in our survey, but digital text has clearly increased in importance, and every corresponding category was gathered by a much larger percentage of our respondents. We also asked what types of content respondent institutions did not collect. Email was the type least collected (48%, or 13 of 27 respondents), followed by executable files (30%, or 8). Websites and GIS data followed at 19% (5), and then databases at 15% (4). The highest numbers in this category correspond to the lowest numbers in the previous question about types of content collected. Interestingly, social media content was avoided by only 1 respondent (4%), though it was not mentioned by any respondents in the comments on the earlier question as a genre commonly collected.   4.4 Target Formats Of our total survey participants, 38.7% (24) answered the question about target file formats. Of those, 95.83% (23) provided information about text documents, 83.33% (20) about still images, 70.83% (17) about audio and spreadsheets, 58.33% (14) about video/moving images, and only 29.17% (7) about databases. This may indicate that few in the field are tackling this difficult task. Some comments clarified that certain formats were preferred, and other comments indicated that some target formats were for access only. One respondent qualified each entry by stating that their collection policies do not allow them to request specific formats, but then went on to identify which formats they strive to collect. In the "other" comments were an entry about MBOX for email and another about WordPress blog files, for which appropriate formats had not yet been selected. A third "other" comment was: "We don't have target formats, we would only migrate through a rigorous format migration process inside our preservation system." PDF/A was the clear leader for text documents, selected by 60.87% (14 of 23) as a target format. PDF and DOCX each came in at 39.13% (9), TXT at 34.78% (8), DOC at 26.09% (6), RTF and ODT at 13.04% (3), XML at 8.7% (2), and each of the following formats was selected by 1 respondent (4.35%): PPTX, ODP, TCT (TurboCAD drawings), FILE (openable with Microsoft Word in compatibility mode), and the original. For still images, TIFF took the lead at 95% (19 of 20), followed by JPEG (45%, 9; 1 of these specified this is for access only); 20% (4) selected JPEG 2000; 15% (3) selected PNG; SVG and GIF were each selected by 10% (2); and JFIF, DNG and the original format were selected by 5% (1 person each).
WAV files are still the predominant archival target format for audio, selected by 70.59% (12 of 17); MP3 follows at 47.06% (8, though 2 respondents clarified these were just for access). AIFF and the original were each selected by 11.76% (2), and the following were selected by 5.88% (1) each: MIDI, WMA, MPEG-4 audio, FLAC, OGG, and RealAudio. CSV is the preferred target format for spreadsheets (70.59%, 12 of 17 responses); only 29.41% (5) selected XLSX; 23.53% (4) selected TXT (1 specified ASCII); 17.65% (3) selected ODS; 11.76% (2) each selected PDF/A and the original; and XLS, TSV, and XML came in last at 5.88% (1) each. Of the 14 respondents on moving images/video target file types, 57.14% (8) selected MPEG-4 encoding; 21.43% (3) selected MPEG-2 encoding and AVI and MOV containers; 14.29% (2) selected FFV1 encoding with Matroska containers (1 of these specified LPCM audio encoding); 14.29% also selected Motion JPEG 2000 (1 with MXF), MPEG encoding, and whatever form the original is in; and 7.14% (1 each) identified OGG, the MXF wrapper, Digital Video file, and WMV. Very few seem to be tackling databases. Of the 7 useful responses, 42.86% (3) selected CSV, TXT (1 ASCII), or the original format; 14.29% (1 each) selected SIARD text with DDL, ACCDB (Access Database 2007), DBF, MDB (Access Database 2003 and earlier), MS Access (unspecified years), XML, PDF/A and EBCDIC.   4.5 Content Identification One of the first tasks when faced with incoming digital content is to examine what is there in order to build an inventory, identify duplicate files, and assess what types of material are present. A wide range of tools were specified by 31 respondents as useful in identification of content, with Forensic Toolkit and BitCurator in the lead (29%, 9 and 26%, 8 respectively). In-house scripts and TreeSize were recommended by 16% (5); spreadsheets and DROID by 13% (4); QuickView Plus and Karen's Directory Printer by 10% (3); and 6% (2 each) recommended FITS, Data Accessioner, JHOVE, Exiftool and WinDirStat. Mentioned by 3% (1) were Total Commander, Vireo and Siegfried & Brunnhilde. In describing workflows used for the identification of content, 24 of 34 participants (70.6%) indicated that they image disks of physical media, with 2 specifying that they assess the image to determine if the entire image or only individual files are necessary for preservation. Creation of a file inventory was detailed by 35.3% (12); creation of an accession record or some other kind of reference record using metadata by 29.4% (10); extraction and transfer of files by 26.5% (9); and recognition of duplicate files by 14.7% (5). Surprisingly, few respondents described security and validation procedures in this stage of ingest: only 4 respondents mentioned write-blocking media, only 2 said they generate file checksums or establish fixity, and just 1 mentioned performing a virus check of files. Of the total answers, 41.2% (14) indicated specific tools used to perform steps in the workflow (FTK, 7; BitCurator, 3; Data Accessioner, 3; FITS, 2), and 11.8% (4) referenced in-house scripts. In the 2013 Digital POWRR white paper, Data Accessioner was recommended for triage, to be used by institutions with no funding support.74 Three of our respondents found Data Accessioner to be useful in the identification workflows.
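To make the identification stage concrete, the following Python sketch builds the kind of file inventory respondents described: path, size, guessed type, and a checksum for later fixity verification. It stands in for no particular tool, and the paths and CSV layout are illustrative assumptions.

import csv, hashlib, mimetypes
from pathlib import Path

def sha256(path, chunk_size=1 << 20):
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def inventory(root, out_csv):
    """Walk extracted content and record path, size, guessed type, and checksum."""
    with open(out_csv, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "bytes", "guessed_type", "sha256"])
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                mime, _ = mimetypes.guess_type(path.name)
                writer.writerow([str(path), path.stat().st_size,
                                 mime or "unknown", sha256(path)])

inventory("extracted_media", "inventory.csv")  # hypothetical paths

Identical checksums in the resulting inventory also flag duplicate files, one of the first questions curators face in the analysis stage that follows.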
4.6 Content Analysis

After identification and building an inventory, some analysis of the content usually must precede selection by curators. This may include locating (and associating) all the versions of any particular file; documenting where groups of file types are found; documenting sets of files and file structures that together comprise a single item (such as a database or software system); identifying which files are system files or known common software; flagging files that contain social security numbers, phone numbers, and other potential privacy issues; counting how many of each file type are present; and more.

For analysis, spreadsheets and BitCurator were considered useful tools by 25% of the 24 respondents who answered this question. DROID and Forensic Toolkit were specified by 21% (5); bulk_extractor by 17% (4); and the following tools were listed as useful by 8% (2) each: IdentityFinder, Siegfried, TreeSize, and Karen's Directory Printer.

Workflows for content analysis were explained by 16 participants, 7 of whom (43.8%) include manual content analysis of some sort, while 2 stipulated that an archivist or specialist reviews content. Twenty-five percent (4) pinpointed personally identifiable information (PII) and used report and/or analysis functions of tools in this stage. Determining file type or format and identifying normalization or migration targets each account for 18.8% (3) of answers. Checksum generation or fixity establishment, backup copying of files, and generating a directory tree or structure each appear in 12.5% (2) of responses. Additionally, 3 respondents stated that their workflow for content analysis is still in development.

4.7 Content Selection

Once initial analysis takes place, the difficult task of selection begins. This may include determining which of multiple versions of a file are the one(s) of interest; it may include isolating files of particular types which the donor has specified; and it may require sorting through many types of files in many directories. It's a daunting task for a large set of incoming content. Often during this stage some descriptive metadata is generated or collected to specify why particular files or directories should be retained.

When asked for the most useful tools for selection of digital content, 27% (4 of 15 respondents) cited manual review. Collection policy and Forensic Toolkit (FTK) were listed by 20% (3) each. Retention schedules, tools to locate personally identifiable information, and simple knowledge of the collections were each specified as useful by 13% (2).

When asked whether collection policies at the host institution allow the respondent to select content to be preserved, 45.16% (28) of the 62 respondents did not answer. Of the remainder, 29.03% (18) answered "yes", 14.52% (9) selected "sometimes/under certain circumstances", and 11.29% (7) answered "no".

Details on workflows for content selection were provided by 14 respondents, 35.7% (5) of whom stated that selection was guided by donor agreements. Another 35.7% (5) indicated that selection is manual at their institutions (with 2 remarking that archivists perform selection), and policy and collection needs guided the selection activities of 21.4% (3) of respondents. Content restrictions and excluding personally identifiable information (PII), creating a records schedule, and file transfer to external storage each account for 14.3% (2) of the answers for this workflow.
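The privacy review that recurs in these two phases is exactly what tools such as bulk_extractor and IdentityFinder automate. The sketch below shows the general idea with deliberately simple regular expressions for US Social Security and phone number patterns; the patterns and paths are illustrative, and dedicated tools scan far more thoroughly, including at the bitstream level.

```python
# A minimal sketch of PII flagging: scan readable files for patterns
# that look like US Social Security or phone numbers. These regexes
# are intentionally naive and will miss (and over-match) many cases.

import os
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b"),
}

def scan_for_pii(root):
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue  # unreadable file; a real workflow would log this
            for label, pattern in PII_PATTERNS.items():
                if pattern.search(text):
                    hits.append((path, label))
    return hits

for path, label in scan_for_pii("/path/to/accession"):  # hypothetical path
    print(f"{label.upper()} pattern found in {path}")
```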
It is worth noting that the workflows for the middle two phases of ingest, content analysis and selection, require the most manual analysis work (both representing the greatest percentage of answers, as opposed to less than 10% of answers for identification and processing), making these workflows the most subjective and the least likely to be automated.

4.8 Content Extraction

When asked "do you preserve the original files (not on the original media)?", over half (51.61%, 32) of the 62 participants did not respond. Of those who did respond, 83.33% (25) said "yes" and 16.67% (5) selected "sometimes/under certain circumstances". There were no negative responses. By comparison, in the 2012 ARL survey, only 77% (49) were ingesting records from legacy media.75 However, only 40% of our total survey participants answered "yes" to this question, and only half of our survey respondents were from academic libraries. We contend that academic research libraries are likely better equipped to ingest born-digital content from a variety of legacy media.

When asked "do you migrate (or normalize) files?", again over half (53.23%, 33) of our participants failed to respond. Of those who did answer the question, 48.28% (14) said "yes", 41.38% (12) selected "sometimes/under certain circumstances", and 10.34% (3) said "no".

Only 2 respondents support emulation (1 in house); 36 respondents failed to answer this question, and 24 said "no". In the 2012 ARL survey, 8 of 57 respondents (14%) were actively building emulation systems, but only 1 out of 64 was providing access via emulation.76 Of our respondents, the only one who supports emulation in house qualified the response with: "Currently only in selective cases and mostly just for staff use, but we would like to expand use of emulation as an access method for end users (particularly for obsolete CAD and 3D modeling formats for which migration strategies are not particularly effective)."

4.9 Processing

The ultimate goal of processing is to make digital content accessible for both current and future users. A number of processing activities may occur to provide access points with current technology and to preserve digital content across generations of technology, including content migration, data normalization, and emulation implementation.

For processing, Archivematica was identified as useful by 23% (5) of the 22 respondents; Forensic Toolkit and Acrobat Pro followed at 18% (4). Spreadsheets, in-house software/scripts, and HandBrake came in at 14% (3); Bulk Rename Utility, Microsoft Word, and ffmpeg at 9% (2). One respondent (5%) said Preservica is useful for processing.

Of the 16 participants who described their workflows for processing, the most (43.8%, or 7) stated that they transfer files to a repository or external storage in this phase. Normalization or migration of files takes place for 37.5% (6) of respondents; creation of access copies for 25% (4); and 18.8% (3) said the workflow for processing is still in development. Metadata generation or creation was specified by 31.3% (5). It is notable that this function only occurs in the first and last of our specified workflow phases: content identification and processing.
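Since ffmpeg appears among the processing tools above, a normalization step of the kind respondents described can be sketched as a thin wrapper around it, here producing a WAV preservation copy (the leading audio target format from section 4.4). The paths are hypothetical, and a real workflow would also record the migration event in preservation metadata.

```python
# A minimal sketch of an audio normalization step, shelling out to
# ffmpeg to transcode a source file to 16-bit PCM WAV.

import subprocess
from pathlib import Path

def normalize_audio_to_wav(source: Path, dest_dir: Path) -> Path:
    """Create a WAV preservation copy of `source` in `dest_dir`."""
    dest = dest_dir / (source.stem + ".wav")
    subprocess.run(
        ["ffmpeg", "-i", str(source), "-c:a", "pcm_s16le", str(dest)],
        check=True,  # raise if ffmpeg reports an error
    )
    return dest

normalize_audio_to_wav(Path("interview.mp3"), Path("preservation/"))
```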
The 2013 Digital POWRR white paper recommended that institutions with some resources but no ability to add technical personnel use Preservica for processing; for institutions with technical staff who are able to take on extra work, Archivematica was recommended.77 Interestingly, only 1 of our respondents used Preservica, but 5 used Archivematica. This may indicate that the respondents to our survey had technical staff available for the additional work.

4.10 Metadata

Only 43.55% (27) of the 62 respondents answered the question about what types of technical metadata they capture. Of these, 92.59% (25) collect file date(s) and types, 88.89% (24) collect file sizes, and 85.19% (23) collect checksums. Original directory locations and file type versions were collected by 70.37% (19), creating software (and version) by 55.56% (15), and associated files and structure of document by 40.74% (11), while only 29.63% (8) document the operating system type from which the files come. Endianness and appropriate technical standards based on file type (such as MIX and AES57) are documented by only 14.81% (4) of the respondents who answered this question.

Of the 23 respondents to the question about what rights information they collect, 95.65% (22) collect access and use restrictions, 73.91% (17) collect copyright information, and 65.22% (15) collect intellectual property rights information. Over half document the rights to make copies and derivatives (56.52%, 13) and obtain the rights to preserve and migrate content (52.17%, 12); 39.13% (9) obtain digitization permissions, and 34.78% (8) collect the rights to make representations and metadata. One of the respondents qualified the above by stating that "These questions are addressed at a high level for accessions as a whole, but rarely at a more granular level." Three of the participants who did not collect any of this information added explanations:

"None [is collected] as the content belongs to the archives' mother organization"
"All of our content is in the public domain (government publications)"
"All our records are internal/institutional records so we own copyright"

By comparison, the ARL survey did not specifically ask what rights information is collected; instead, it asked whether a variety of rights-related issues were addressed by ingest policies or procedures. The most commonly addressed issue when setting policies was whether to retain or destroy personally identifiable information (PII), at 71% (30 of 42 respondents); second was whether to preserve copyrighted content (48%, or 20 of 42).78 Other types of administrative metadata that our respondents collected included repository-generated information, provenance and donor information, acquisition/accession information, disposition, and notes of actions taken.

4.11 Organizing/Tracking

It is notable that 53% (10 of 19) of our respondents use spreadsheets as one of their primary tools to organize and track content; this seems to indicate that the processes are not yet advanced and embedded enough to have been transferred to databases or systems. Content management systems were listed by 26% (5) of the respondents, including ArchivesSpace (2), DSpace (1), and Omeka (1); 16% (3) used accession records, labels, databases, or organization by collection and date to track content. Labeling tools mentioned included Forensic Toolkit and ePADD. Other tools mentioned included BagIt, Photo Mechanic, TreeSize Professional Reports, and Archivematica.
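BagIt, one of the tools mentioned above, packages content together with checksums and descriptive tags so that fixity can be verified later. A minimal sketch using the bagit-python library follows; the directory and metadata values are illustrative, and we assume a reasonably current version of the library that accepts the checksums argument.

```python
# A minimal sketch of packaging an accession as a BagIt bag.

import bagit

bag = bagit.make_bag(
    "/path/to/accession",  # hypothetical directory; it is bagged in place
    {"Source-Organization": "Example University Libraries",
     "External-Identifier": "accession-2016-042"},
    checksums=["md5", "sha256"],
)
print(bag.is_valid())  # verifies payload files against their checksums
```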
4.12 Providing Access

By far, the most common method of tracking rights information among our participants (77%, or 10 of 13 respondents) is to embed it in the metadata, whether that is the finding aid, an item-level description, or a catalog record. A single respondent tracks rights information only in a spreadsheet, another only in an in-house database, and a third only in the deed of gift. Some respondents selected multiple tracking mechanisms, as was the case with the 2 respondents who mentioned accession records.

Of the 16 respondents who shared how they use rights information to control access, 44% (7) use manual processes and 31% (5) control access via the delivery system; 13% (2) simply do not, though 1 of them openly shares the rights metadata; 1 respondent limits all access to in-house use, and 1 determines access based on user group (student or faculty member). In the ARL survey, only 13% (8 of 63) provided open access to all born-digital materials; 83% (52) restricted some of the content, and 5% (3) restricted use to certain categories of users.79

Fewer than half the survey participants answered the questions about the granularity with which access is provided to incoming digital content in house (26 out of 62). Of those, 65.38% (17) provide access at the item level, 46.15% (12) at the folder level, 23.08% (6) at the hard drive level, and 7.69% (2) provide access to the disk image. One of the respondents added that they are currently building capacity to provide access to disk images; a second stated that they receive only single files from departments across campus, so they have not needed disk images. Two respondents clarified that in-house access is only available to staff, who then try to fulfill patron requests. A fifth respondent clarified that some database records are available for content search online, some can be identified and downloaded from an online catalog, and other records can be ordered on removable media. Another respondent added that for them, a folder is usually considered a "collection."

Online access to incoming digital content presents a very similar picture. Of the 62 respondents, only 24 answered this question. Of those, 70.83% (17) provide online access at the item level, 50% (12) at the folder level, and 41.67% (10) provide the directory structure; 20.83% (5) provide online access to the hard drive, and 8.33% (2) provide online access to disk images. Six of the respondents do not provide online access to incoming digital content (though 1 makes exceptions for web archives), and a seventh states that most of their material is under copyright and thus is not available online. Another respondent clarifies that some records are available for download, but others are only available for record-level search and retrieval. One very helpful comment was this one: "In both reading room and online, access is given based on rights determined. We have four levels: online, three concurrent users online (for published material only), reading room only, or by permission only. Items are arranged to various levels of specificity and then delivered based on those arrangement and descriptive levels."

By comparison, in the 2009 OCLC survey, 40% of the born-digital holdings were described in archival collections, 29% were cataloged, and 34% had no records80; yet 40% of respondents provided access to their born-digital content even if unprocessed.81 How that access is provided (online, in house, or by request) was not included in that survey.
In the 2012 ARL survey, 66% (42 of 64) provided online access to born-digital content via a digital repository system; 28% (18) via a third-party access and delivery system; 23% (15) via a file space; and only 2% (1 respondent) provided online access via emulation.82 In-library access via a dedicated computer workstation was provided by 48% (31); 34% (22) used portable media and the user's personal computer; 2% (1) provided in-library access via emulation; and fully 20% (13) did not provide any access to born-digital materials, either in house or online.83

4.13 Participant Recommendations

When asked for general recommendations for other institutions that are just beginning to collect digital content, 25% (5 of 20) suggested developing policies and procedures first; 20% (4) stated that planning, identifying target formats, selecting metadata, and learning tools were critical. Documentation, connecting with professional communities, determining what works in your situation, and "just do it!" were mentioned by 15% (3); and 10% (2) focused on training and learning, making copies, using open source software, and starting small before attempting larger challenges.

5 Conclusions

This survey has clarified that processes, workflows, and policies are still very much in development across multiple institutions. Variations in levels of resources and technical expertise may account for much of the range exposed by this survey. The drop in respondents from the beginning of the survey (62) to the end (20) may indicate survey fatigue, but likely also reflects that many respondents have not yet developed more than the initial steps of content intake. While all 62 answered the question about what types of content they collect, only 24 answered the next question about target formats. The tools used for content identification were specified by 31 respondents; 24 answered questions about content analysis; but only 15 answered questions about selection. The number of respondents jumped to 22 for processing questions and 23 for rights information collected; however, only 13 answered how rights information is tracked, and of those, the primary method is to embed it in metadata. Only 26 of the 62 respondents spoke to the granularity of in-house access, and 24 to online access. It is quite possible that over half of the survey respondents have not yet developed any policies or procedures for access to the collected digital content.

Still, some highlights are clear: the amount of digital content being collected is increasing, and the top target formats are TIFF, WAV, PDF/A, MPEG-4, and CSV or TXT. Few are able to extract content from older and more obscure media, and email and executable programs are the least collected content types. The top technical metadata collected are file dates, file type, size, and checksums. Analysis and selection are primarily manual processes, and spreadsheets are widely used to organize and track content. Forensic Toolkit and BitCurator are the lead tools for identification; Archivematica, Forensic Toolkit, and Adobe Acrobat Pro are most useful for processing. Notably, most of the tools identified in our survey are open source. Similarly, in the 2012 ARL survey, 74% (31 of 42 respondents) used open source tools; 50% (21) stated they used commercial tools for digital processing; 43% (18) used home-grown tools; and 29% (12) used outsourced services such as Archive-It.84 Use of database calls to restrict online access based on rights and permissions is thus far a rarity.
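To make concrete what such a database-driven check might look like, the sketch below maps the four delivery levels quoted in section 4.12 to a yes/no decision at request time. The level names and the may_view() function are our own illustration, not any respondent's actual system.

```python
# A minimal sketch of a rights-based access check: each access level
# is a predicate over the request context (location, concurrency,
# permissions). All names here are hypothetical.

ACCESS_LEVELS = {
    "online": lambda ctx: True,
    "online_3_users": lambda ctx: ctx.get("concurrent_users", 0) < 3,
    "reading_room": lambda ctx: ctx.get("location") == "reading_room",
    "permission_only": lambda ctx: ctx.get("has_written_permission", False),
}

def may_view(item_rights_level: str, request_context: dict) -> bool:
    check = ACCESS_LEVELS.get(item_rights_level)
    return bool(check and check(request_context))

print(may_view("online_3_users", {"concurrent_users": 1}))  # True
print(may_view("reading_room", {"location": "remote"}))     # False
```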
Access methods are still focused primarily on the item level, and few institutions are engaged in emulation. It will be interesting to see how this changes as the flood of incoming digital content continues to grow and the possibilities of emulation as an online service85 expand. It seems clear that the development of best practices for the intake and management of digital content is still in its infancy. In the hope that the information gathered here will begin to lay the groundwork for the development of best practices and guidelines for others in the field, we are sharing the results of this survey widely, engaging practitioners in multiple conference venues.86 Results from this survey have been used to seed a framework built on the AIMS87 methodology, and practitioners across archives, libraries, and special collections are invited to assist in building upon this online resource88, which was announced at the Digital Library Federation forum in November 2016. By sharing our experiences across the community, we can develop a living online resource of what works with what materials, and how best to manage the difficult challenges of the intake and curation of digital content.

References

1 Community Owned digital Preservation Tool Registry (COPTR), "Category:Tools". Last modified October 24, 2013.
2 Susan Lazinger, Digital Preservation and Metadata: History, Theory and Practice (Englewood, CO: Libraries Unlimited, 2001).
3 Edward M. Corrado and Heather Lea Moulaison, Digital Preservation for Libraries, Archives & Museums (Maryland: Rowman & Littlefield, 2014).
4 Arjun Sabharwal, Digital Curation in the Digital Humanities: Preserving and Promoting Archival and Special Collections, Chandos Information Professional Series (Waltham, MA: Elsevier, 2015).
5 Digital Curation Centre, "DCC Curation Lifecycle Model". Last modified 2016.
6 Ross Harvey, Digital Curation: A How-To-Do-It Manual, How-To-Do-It Manuals: Number 170 (New York: Neal Schuman, 2010).
7 Matthew G. Kirschenbaum, Richard Ovenden, Gabriella Redwine and Rachel Donahue, Digital Forensics and Born-Digital Content in Cultural Heritage Collections (Washington, DC: Council on Library and Information Resources, December 2010).
8 Digital Preservation Coalition, "Digital Preservation Handbook", 2nd Edition. Last modified 2015.
9 "Let's Solve the File Format Problem!". Last modified July 14, 2016.
10 "Digital Preservation Q&A". Last modified June 23, 2016.
11 Christopher A. Lee, Kam Woods, Matthew Kirschenbaum and Alexandra Chassanoff, "From Bitstreams to Heritage: Putting Digital Forensics into Practice in Collecting Institutions". BitCurator Project, September 30, 2013.
12 Geoffrey Brown and Kam Woods, "Born Broken: Fonts and Information Loss in Legacy Digital Documents", The International Journal of Digital Curation, 6:1 (2011). http://doi.org/10.2218/ijdc.v6i1.168
13 Gregory Wiedeman, "Practical Digital Forensics at Accession for Born-Digital Institutional Records", Code4Lib Journal 31 (2016-01-28).
14 Andrew James Weidner and Daniel Gelaw Alemneh, "Workflow Tools for Digital Curation", Code4Lib Journal 20 (2013-04-17).
15 Katharine Dunn and Nick Szydlowski, "Web Archiving for the Rest of Us: How to Collect and Manage Websites Using Free and Easy Software", Computers In Libraries 29:8 (September 2009), 12-18.
16 Laura Wilsey, Rebecca Skirvin, Peter Chan and Glynn Edwards, "Capturing and Processing Born-Digital Files in the STOP AIDS Project Records: A Case Study", Journal of Western Archives 4:1 (2013), 1-22.
17 John Durno and Jerry Trofimchuk, "Digital forensics on a shoestring: a case study from the University of Victoria", Code4Lib Journal 27 (2015-01-21).
18 Digital POWRR, "Tool Grid". Last modified April 9, 2013.
19 Jaime Schumacher, Lynne M. Thomas, and Drew VandeCreek, "From Theory to Action: 'Good Enough' Digital Preservation Solutions for Under-Resourced Cultural Heritage Institutions", August 2014.
20 Timothy Arnold and Walker Sampson, "Preserving the Voices of Revolution: Examining the Creation and Preservation of a Subject-Centered Collection of Tweets from the Eighteen Days in Egypt", The American Archivist, 77:2 (Fall/Winter 2014), 510-533. http://doi.org/10.17723/aarc.77.2.794404552m67024n
21 Amanda A. Hurford and Carolyn F. Runyon, "New Workflows for Born-Digital Assets", Computers in Libraries (January/February 2011), 6-40.
22 Karen Schmidt, Wendy Allen Shelburne and David Steven Vess, "Approaches to Selection, Access, and Collection Development in the Web World: A Case Study with Fugitive Literature", Library Resources & Technical Services 52:3 (2008), 184-191.
23 Sumitra Duncan, "Preserving born-digital catalogues raisonnés: Web archiving at the New York Art Resources Consortium (NYARC)", Art Libraries Journal, 40:2 (2015), 50-55.
24 Laura Capell, "Building the Foundation: Creating an Electronic-Records Program at the University of Miami", Computers In Libraries (November 2015), 28-32.
25 Joseph A. Williams and Elizabeth M. Berilla, "Minutes, Migration, and Migraines: Establishing a Digital Archives at a Small Institution", The American Archivist 78:1 (Spring/Summer 2015), 84-95. http://doi.org/10.17723/0360-9081.78.1.84
26 Daniel Noonan and Tamar Chute, "Data Curation and the University Archives", The American Archivist 77:1 (Spring/Summer 2014), 201-240. http://doi.org/10.17723/aarc.77.1.m49r46526847g587
27 Chris Frisz, Geoffrey Brown and Samuel Waggoner, "Assessing Migration Risk for Scientific Data Formats", The International Journal of Digital Curation, 7:1 (2012), 27-38. http://doi.org/10.2218/ijdc.v7i1.212
28 Jinfang Niu, "Appraisal and Custody of Electronic Records: Findings from Four National Archives", Archival Issues 34:2 (2012).
29 Jinfang Niu, "Appraisal and Selection for Digital Curation", International Journal of Digital Curation 9:2 (2014), 65-82. http://doi.org/10.2218/ijdc.v9i2.272
30 Ibid., 71-73.
31 University of Minnesota Libraries, "Electronic Records Task Force Final Report". Last modified September 11, 2015. http://hdl.handle.net/11299/174097
32 Ibid., 19.
33 Ibid., 27.
34 Ibid., 4.
35 Ibid., 20.
36 Ibid., 25.
37 Elizabeth Charlton, "Working with legacy media: A lone arranger's first steps", Practical Technology for Archives, 6 (June 2016).
38 Sunitha Misra, Christopher A. Lee and Kam Woods, "A Web Service for File-Level Access to Disk Images", Code4Lib Journal 25 (2014-07-21).
39 Leigh Garrett, Marie-Therese Gramstadt and Carlos Silva, "Here, KAPTUR This! Identifying and Selecting the Infrastructure Required to Support the Curation and Preservation of Visual Arts Research Data", The International Journal of Digital Curation 8:2 (2013), 68-88. http://doi.org/10.2218/ijdc.v8i2.273
40 Dianne Dietrich, Julia Kim, Morgan McKeehan, and Alison Rhonemus, "How to Party Like it's 1999: Emulation for Everyone", Code4Lib Journal 32 (2016-04-25).
Xie, "Building Foundations for Digital Records Forensics: A Comparative Study of the Concept of Reproduction in Digital Records Management and Digital Forensics", The American Archivist, 74 (Fall/Winter 2011), 576-599. http://doi.org/10.17723/aarc.74.2.e088666710692t3k 42 Ben Goldman, "Bridging the Gap: Taking Practical Steps Toward Managing Born-Digital Collections in Manuscript Repositories", RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage 12:8 (2011), 11-24. 43 OCLC Research, "Demystifying Born Digital Reports". Last modified 2016. 44 Ricky Erway, "You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media", OCLC Research (2012). 45 Julianna Barrera-Gomez and Ricky Erway, "Walk This Way: Detailed Steps for Transferring Born-Digital Content from Media You Can Read In-house", OCLC Research (2013). 46 AIMS Work Group, "AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship", (2012). 47 Ibid, 14. 48 Ibid, 24. 49 Ibid, 32. 50 Ibid, 120-136. 51 Ibid, 137-147. 52 Ibid, 148-170. 53 DuraSpace Project, "Hypatia". Last modified August 15, 2013. 54 Amanda Kay Rinehart and Patrice-Andre Prud'homme, "Overwhelmed to action: digital preservation challenges at the under-resourced institution", OCLC Systems and Services 30:1 (2014), 28-42. http://doi.org/10.1108/OCLC-06-2013-0019 55 Bernadette Houghton, "Preservation Challenges in the Digital Age", D-Lib Magazine, 22:7/8 (July/August 2016). http://doi.org/10.1045/july2016-houghton 56 Digital POWRR, "Preserving Digital Objects With Restricted Resources," last modified January 6, 2016. 57 Jaime Schumacher and Drew VandeCreek, "Intellectual Capital at Risk: Data Management Practices and Data Loss by Faculty Members at Five American Universities", International Journal of Digital Curation, 10:2 (2015), 96-109. http://doi.org/10.2218/ijdc.v10i2.321 58 Jackie M. Dooley and Katherine Luce, "Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives", OCLC Research. Last modified January 28, 2011. 59 Ibid, 59. 60 Naomi Nelson, Seth Shaw, Nancy Deromedi, Michael Shallcross, Cynthia Ghering, Lisa Schmidt, Michelle Belden, Jacki R. Esposito, Ben Goldman, and Tim Pyatt, "SPEC Kit 329: Managing Born-Digital Special Collections and Archival Materials" (Washington DC: Association of Research Libraries, 2012). 61 Jody DeRidder, "First Aid Training for Those on the Front Lines: Digital Preservation Needs Survey Results 2012", Information Technology and Libraries (June 2013), 18-28. http://doi.org/10.6017/ital.v32i2.3123 62 Greg Palumbo, Smithsonian Institution Archives, "Disk Diving: A Born Digital Collections Survey at the Smithsonian", The Bigger Picture: Exploring Archives and Smithsonian History Blog, September 13, 2012. 63 Greg Palumbo, Smithsonian Institution Archives, "The End of the Beginning: A Born Digital Survey at the Smithsonian Institution," The Bigger Picture: Exploring Archives and Smithsonian History Blog, April 30, 2013. 64 Allana Mayer, "Survey about Born-Digital Collections in Canadian archives", Librarianship.ca Blog, November 3, 2015. 65 Allana Mayer, "An Annual Survey Towards a Collaborative Digital Preservation Strategy for Canada" (paper presented at the 2015 Association of Canadian Archivists Annual Conference in Regina, Saskatchewan, June 11-13, 2015). 66 Allana Mayer, email correspondence with author, July 12, 2016. 
67 United Nations Educational, Scientific and Cultural Organization (UNESCO), "Share your digital heritage strategy with the UNESCO-PERSIST project", Communication & Information Blog, January 20, 2015.
68 Wilbert Helmus, UNESCO-PERSIST (Platform to Enhance the Sustainability of the Information Society Transglobally), "Survey on selection and collecting strategies of born digital heritage — best practices and guidelines". Final version March 30, 2015.
69 Greg Palumbo, "The End of the Beginning."
70 Ibid.
71 Naomi Nelson et al., "SPEC Kit 329", 29.
72 Ibid.
73 Dooley and Luce, "Taking Our Pulse", 59.
74 Schumacher et al., "From Theory to Action", 13.
75 Naomi Nelson et al., "SPEC Kit 329", 34.
76 Ibid., 35, 71.
77 Schumacher et al., "From Theory to Action", 13.
78 Naomi Nelson et al., "SPEC Kit 329", 37.
79 Ibid., 82.
80 Dooley and Luce, "Taking Our Pulse", 46.
81 Dooley and Luce, "Taking Our Pulse", 39.
82 Naomi Nelson et al., "SPEC Kit 329", 71.
83 Ibid.
84 Ibid., 67.
85 bwFLA, "Legacy Environments at Your Fingertips: Emulation as a Service". Last modified date unknown.
86 Jody DeRidder and Alissa Helms, "Practical Options for Incoming Digital Content" (paper presented at the COSA/SAA Archives * Records 2016 Conference in Atlanta, GA, August 2-7, 2016); "Practical Options for Incoming Digital Content" (paper presented at the Digital Library Federation Forum in Milwaukee, WI, November 7-9, 2016); "What Works and What Doesn't? Developing Guidelines for Incoming Digital Content" (paper presented at the Digital Preservation 2016 Conference in Milwaukee, WI, November 9-10, 2016).
87 AIMS Work Group, "AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship".
88 Jody DeRidder and Alissa Matheny Helms, "Incoming Digital Content Management" (Open Science Framework project). Last modified November 3, 2016.

Appendix I: Digital Content Intake Survey

Survey by Alissa Helms and Jody DeRidder, University of Alabama Libraries, Spring 2016

MATERIALS & CONTENT

Which of the following types of digital content does your organization currently collect?
Audio recordings (including podcasts); Still images; Moving images, videos; Texts (such as unstructured office documents); Websites; Email; Databases; Geographical Information Systems (GIS) Data; Executable files (software); Other (please specify)

What kinds of digital content do you NOT collect, and why?

What forms of obsolete media are you prepared to obtain content from?
Floppy discs (3.5 inch); Floppy discs (5.25 inch); Floppy discs (8 inch); Zip discs; Mini discs; Old hard drives (Mac); Old hard drives (Windows); Old hard drives (*nix); Hard disk drives (sizes other than 2.5 or 3.5 inch); Other (please specify)

WORKFLOWS

Identification. One of the first tasks when faced with incoming digital content is to examine what is there in order to build an inventory, identify duplicate files, and assess what types of material are present. We consider this the "identification" stage.
What tools do you find useful for this process?
What tools have you tried that were not useful, and why?
What is your process or workflow for this aspect of digital content intake?

Analysis. After identification and building an inventory, some analysis of the content usually must precede selection by curators.
This may include locating (and associating) all the versions of any particular file, documenting where groups of file types are found, documenting sets of files and file structures that together comprise a single item (such as a database or software system), documenting which files are system files or known common software, documenting which files contain social security numbers, phone numbers and other potential privacy issues, how many of which file types are found, and more.
What tools do you find useful for this process?
What tools have you tried that were not useful, and why?
What is your process or workflow for this aspect of digital content intake?

Selection. Once initial analysis takes place, the difficult task of selection begins. This may include determining which of multiple versions of a file are the one(s) of interest; it may include isolating files of particular types which the donor has specified; and it may require sorting through many types of files in many directories. It's a daunting task for a large set of incoming content. Often during this stage some descriptive metadata is generated or collected to specify why particular files or directories should be retained.
What tools do you find useful for this process?
What tools have you tried that were not useful, and why?
What is your process or workflow for this aspect of digital content intake?

Processing. The ultimate goal of processing is to make digital content accessible for both current and future users. A number of processing activities may occur to provide access points with current technology and to preserve digital content across generations of technology, including content migration, data normalization, and emulation implementation.
What tools do you find useful for this process?
What tools have you tried that were not useful, and why?
What is your process or workflow for this aspect of digital content intake?

CONTENT MANAGEMENT

Do you preserve the original files (not on the original media)?
Do you migrate (or normalize) files? If so, what are your target archival formats for the following types of materials: Text? Images? Audio? Moving images/video? Databases? Spreadsheets? Other?
How do you track and organize content? Please share information about your workflows and any tools you have found helpful.

PRESERVATION METADATA

What types of technical metadata do you capture?
Checksums; File sizes; File date(s); Original location; Associated files; Structure of document; File type; File type version; Creating software (and version); Operating system; Endianness; Appropriate standards based on file type (such as MIX, AES57, etc.); Other:

What types of rights information do you collect for your incoming digital content?
Access and use restrictions; Digitization permissions; Rights to make copies and derivatives; Rights to preserve and migrate; Rights to make representations and metadata; Intellectual property rights; Copyright; Other:

How do you track rights information? (Example: Open Digital Rights Language (ODRL), METS rights language, etc.)
What other types of administrative metadata do you collect?

ACCESS

How is your rights metadata used to control access? (Example: database entries queried upon clicking on item link, content organized according to access restrictions, etc.)
At what level of granularity do you provide in-house access? (at the disk image, hard drive, directory structure, folder, or item level?)
At what level of granularity do you provide online access?
(at the disk image, hard drive, directory structure, folder, or item level?)
Do you provide emulation of the original access method?
Yes: in house; Yes: via online emulation services such as https://olivearchive.org or http://bw-fla.uni-freiburg.de/; Yes: other _____________; No

RECOMMENDATIONS

What recommendations do you have for other institutions that are just beginning to collect digital content?
Do you have other suggestions or comments about what has worked for you, and what has not?
If you are willing to be contacted with follow-up questions, please provide your email address. This information is confidential and will not be published or associated with your responses in any publication or presentation.

About the Authors

Jody L. DeRidder is the Head of Metadata & Digital Services at the University of Alabama Libraries and a co-founder of the DLF Assessment Interest Group. She holds an MSIS and an MS in Computer Science from the University of Tennessee.

Alissa Matheny Helms is the Digital Access Coordinator at the University of Alabama Libraries, where she works to improve access to digitized materials from the University's Special Collections and enhance policies and procedures regarding the intake of born-digital content. She received her MLIS from the University of Alabama.

Copyright © 2016 Jody L. DeRidder and Alissa Matheny Helms

The Effects of Computer Technologies on the Canadian Economy: Evidence from New Direct Measures

Michelle Alexopoulos and Jon Cohen1
University of Toronto

ABSTRACT

New indicators of technical change in the field of computers, based on new titles held by Canadian libraries, are presented and used to demonstrate that a positive computer technology shock in Canada increases hours worked, output, and productivity in the short run. These measures indicate, first, that advances in the implementation of computer technology in Canada are largely influenced by innovations in the United States; and second, when compared to a United States-based indicator, that a gap emerged between United States and Canadian-held titles around the time that the productivity gap emerged between the two countries.
Given that a strong, causal relationship is found to exist between the new indicators and total factor productivity, this evidence provides additional support for the hypothesis that cross-border differences in the development and use of new computer technologies play a key role in explaining Canada's productivity gap with the United States.

1 Michelle Alexopoulos is Associate Professor and Jon Cohen is Professor Emeritus in the Department of Economics at the University of Toronto. The authors thank the referees for useful comments. Email: malex@chass.utoronto.ca.

Computer technologies are often viewed as a key contributor to productivity growth in advanced industrial countries such as the United States and Canada.2 It follows that cross-country differences in the development and use of these technologies may at least partially account for cross-country productivity differentials. This is particularly relevant for Canada and the United States because it is frequently argued that the growth in the productivity gap between these two countries is attributable to the more rapid adoption of information technologies in the latter than in the former.3 As compelling as this argument may appear, the lack of direct measures of technical change in this area has made it difficult to provide a quantitative assessment of either the impact of computers on economic activity in the two countries or the causal link between differences in their adoption rates and productivity differentials. In short, if we want to quantify the impact on these economies of innovations in this area, we must be able to measure them. As it happens, that is precisely what we propose to do in this article.
We first present new direct measures of technological change in the field of computers in Canada for the 1950-2005 period, based on the number of new computer-related titles held by Canadian libraries,4 and then use them to show that: (1) the United States is the principal source of advances in computer technology in Canada; (2) the rate of adoption of new computer technologies in the United States began to surpass that of Canada in the 1970s; and (3) a strong relationship exists between computer innovations and productivity, GDP, and hours worked.5 Together, these findings provide empirical support for the hypothesis that cross-border differences in the commercialization and rate of adoption of new computer technologies have played a key role in the widening of the Canada-U.S. productivity gap.

The article is divided into four sections. The first section discusses the indicators and their properties. Section two reports the results of our regressions. Section three discusses the potential link to the Canadian and U.S. productivity gap. The fourth and final section concludes.

2 See, for example, Alexopoulos (2011), Alexopoulos and Cohen (2011), Oliner et al. (2007), Khan and Santos (2002), Stiroh (2002), Van Ark et al. (2003), and Sharpe (2006), and the citations within.
3 See Basu et al. (2003) for an interesting study of differences in productivity growth between the United States and the United Kingdom and the relationship to IT technologies.
4 The approach, developed by Alexopoulos (2011), was applied initially to the United States.
5 In what follows we use the terms book-based and publication-based interchangeably. The indicators are primarily based on new manuscripts. However, pamphlets that are catalogued are included. Serial publications and continuing resources are, by and large, excluded from the counts.

The Indicators

Most would agree that a good direct measure of technical change should, at a minimum: (1) be available at least on an annual basis over a long time horizon; (2) be objectively determined; (3) weight different technologies according to their importance; and (4) tap into the full range of new advances. Moreover, for many purposes we would add a fifth requirement, namely that the indicator capture innovations at the moment of their commercialization. This is important for two reasons. First, much of the impact on output, productivity and employment occurs through the adoption of new technologies. Second, unanticipated technology shocks, an important feature of many economic models, are generally identified not by the timing of the invention or even the patenting of a new technique but by its use.

Our new book-based indicators possess all of these features. They are: (1) objective because they are determined by cataloguing criteria established and followed consistently by librarians; (2) quantifiable because they are based on the number of new titles; (3) weighted because
T h e i n no v a t i n g c o m p a n i e s w a nt t o spread the word about their new devices – what they are, how to use and maintain them, – while publishers and their writers want to profit from the market demand for information about these new technologies. In all cases timing is critical – too early and there is no market, too late, and the market is fully served. Although there are clearly other means aside from print to convey information about new technologies, our find- ings (Alexopoulos (2011) and Alexopoulos and Cohen (2009, 2011)) suggest that these book- based indicators provide a compelling way to quantify technical change and to evaluate its impact on the economy.7 Description of the Indicators Although we focused in our previous work on the United States, it is possible to use a similar methodology (developed in Alexopoulos (2011)) to create comparable technology indicators for Canada. This is because Canadian libraries also use MAchine Readable Cataloging (MARC) records to run their online catalogues.8 As is well-known, computer-related innovations in Canada are an amalgam of home-grown and imported advances, much more so than, for example, in the United States. Moreover, Can- ada does not have a library of the size and scope of the Library of Congress in the United States. For these reasons, we reshaped our approach to ensure that we capture the full range of foreign and domestic developed computer technologies commercialized in Canada. In particular, we used information from the catalogues of 1,062 Canadian libraries covered in the WorldCat database of the Online Computer Library Cen- tre (OCLC) on the number of new computer titles published between 1950 and 2005, without regard for country of publication.9 While not all Canadian libraries are members of OCLC, the membership includes the National Library of Canada, the country’s largest public libraries (e.g. those in Toronto, Montreal and Vancou- ver), and all major university libraries. As such, the combined MARC records of the member libraries provide a comprehensive list of all major new computer-related titles available to the Canadian public. Even though the MARC records were designed to serve the online cataloguing needs of librari- ans, it turns out , because of the large amounts of data buried in them, that they are also a poten- tially powerful research device for, among others, economists. For example, each MARC record contains information on the type of book (for example, a new title, a new edition of an existing one, a reprint, or a translation), the country and language of the publication, the publisher, the Library of Congress and/or the Dewey Decimal Classification Code, and a list of major subjects treated in the book. These data enable us to com- 6 Alexopoulos (2011), and Alexopoulos and Cohen (2009, 2011) also present evidence that the book publication measures are related to traditional measures of technical change such as R&D, patents, major innovations, and journal article counts in the United States. 7 Although some may be concerned that changes in the number of titles is driven by ups and downs in the publishing industry, our findings in the papers cited indicate that the patterns, on the whole, appear to be dictated by changes in innovations. Finally, although cataloguing and keyword assignment are poten- tially subject to error, there is no reason to believe that misclassification is a problem. 8 See Appendix A for an example of a MARC record. 
compile a list of new titles published each year on computers and computer science that are held by Canadian libraries in the sample between 1950 and 2005.10 To ensure that our titles actually represent the appearance of new technologies, we eliminate from the sample all books that include history as a descriptor since they, almost by definition, focus on the past, not the present. Thanks to the richness of the MARC records, we are able to create, as can be seen in Chart 1, three slightly different indicators for the purposes of this analysis: the first includes all books held in the field by the Canadian libraries regardless of the language or country of publication; the second excludes non-English language titles; while the third is limited to English language titles published in the United States.

Chart 1: Indicators of Computer-Related Titles by Copyright Date (three panels: all new computer titles held in Canada; new English-language computer titles held in Canada; new US published English-language titles held in Canada)

Chart 2: Fraction of English-Language Held Computer Books in Canada Published in the United States (per cent)

In all cases, the indicator includes manuals and books that deal with new computer technologies, describing their nature and function, how they work, and how to use or repair them. Some of the titles are published or sponsored by the innovator or the company that developed the technology, while others are written by third parties who hope to profit from sales of the book. As noted earlier, all groups have an economic incentive to ensure that the publications appear as close as possible to the commercialization date of the new technologies. At the same time, libraries, seeking to serve their market, will then purchase the books they believe will be demanded and used by their patrons. For this reason we would expect to observe a close chronological coincidence between the copyright date of the first book on a new technology that appears in a library (as captured by the WorldCat database) and its commercialization date as reported in other source material. The results in Table 1, based on the dates for a sample of computer innovations commercialized in Canada and the United States, confirm this timing.

9 The data used for this research were based on a snapshot of the OCLC's WorldCat database as of the middle of 2010. We took 2005 as our cut-off date to avoid any biases created by the backlog of uncatalogued titles.
10 See Appendix B for a description of the Dewey Decimal Classifications and Library of Congress Classifications associated with the computer and computer science classifications included in the counts.
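To illustrate the construction, the sketch below counts new computer titles per copyright year from simplified bibliographic records, dropping histories, reprints, and new editions. The record layout is a stand-in for real MARC fields, and the Dewey 004-006 range is used here as a shorthand for the computer science classes; the precise classes used in the article are those listed in Appendix B.

```python
# A minimal sketch of the title-count indicator, over simplified
# stand-in records rather than real MARC data.

def is_computer_title(record: dict) -> bool:
    dewey_ok = record.get("dewey", "").startswith(("004", "005", "006"))
    not_history = "history" not in {s.lower() for s in record.get("subjects", [])}
    new_title = record.get("kind") == "new"  # exclude reprints and new editions
    return dewey_ok and not_history and new_title

def count_titles_by_year(records):
    counts = {}
    for rec in records:
        if is_computer_title(rec):
            year = rec["copyright_year"]
            counts[year] = counts.get(year, 0) + 1
    return dict(sorted(counts.items()))

sample = [  # hypothetical records
    {"copyright_year": 1983, "dewey": "005.36",
     "subjects": ["Lotus 1-2-3"], "kind": "new"},
    {"copyright_year": 1983, "dewey": "004",
     "subjects": ["Computers", "History"], "kind": "new"},
]
print(count_titles_by_year(sample))  # {1983: 1} -- the history is dropped
```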
For example, the commercialization dates in Canada for com- puter software programs such as Lotus 1-2-3, and Windows were virtually identical to those in Table 1 Comparison of Dates for Selected Computer Innovations Notes: First book dates correspond to the copyright date of the first book held by a library in the OCLC WorldCat data- base. See Alexopoulos and Cohen (2011) for sources of the US innovation and commercialization dates for Windows, Lotus, Apple II+, Macintosh, Lisa, IBM PC, IBM PC/AT and Commodore 64, and the footnotes for information used for the Canadian commercialization dates and the dating of PAT, Corel Draw and the Sinclair ZX80. Innovation Date of innovation Country of Invention Commercialization Date in the United States First American Book Date Commercialization Date in Canada First Canadian Book Date Windows Nov. 1983 US Nov. 1985 1985 Nov 19851 1. http://www.guidebookgallery.org/ads/magazines/windows/win10-powerwindows-8 1986 Lotus Nov. 1982 US Jan. 1983 1983 Feb 19832 2. Michael Kieran, "Programs for micros enter new generation," The Globe and Mail, pp. R.12-R.12, Feb 28, 1983 1983 Apple II+ 1978 US June 1979 1979 19793 3. See comment on http://www.facebook.com/torontostar/posts/227813607276004 by J. Lyng, a resident of Toronto, and blog post on http://taoofnews.com/2011/01/25/thirty-years-in-new-media/ 1979 Macintosh Jan. 1984 US First Quarter 1984 1983 Jan 19844 4. Jonathan Chevreau, "Xerox Canada will carry Lisa, Macintosh machines," The Globe and Mail, B.14, May 11, 1984 1984 Lisa 1978 US Jan. 1983 1983 April 19835 5. Michael Kieran, "Programs for micros enter new generation," The Globe and Mail, pp. R.12-R.12, Feb 28, 1983 1984 IBM PC July 1980 US Aug. 1981 19826 6. WorldCat points to a scanned book captured by Google that is entitled Technical specifications under the series: IBM Personal Computer. Hardware reference library published by IBM. Aug 19817 7. Jonathan Chevreau, "Computerland chief expects IBM entry to add credibility to personal market," The Globe and Mail, pp. B.9-B.9, Aug 27, 1981 1982 IBM PC/AT Aug. 1984 US Fall 1984 19858 8. While not physically held in a library, OCLC records point to a technical publication by IBM for this computer with a copyright date 1984. Fall 19849 9. Jonathan Chevreau, "IBM launches new 20-megabyte PC unit," The Globe and Mail, pp. B.1-B.1, Aug 15, 1984 1985 Commodore 64 Jan. 1982 US Nov. 1982 1982 Sept 198210 10. Jonathan Chevreau.”Price war is expected in personal computers” The Globe and Mail, pp. B 15, Sept. 8, 1982 1982 PAT (OpenText)11 11. http://www.opentext.com/2/global/company/company-history.htm#ecml 1989 Canada 1992 1993 by 1991 1992 Corel Draw12 12. http://www.fundinguniverse.com/company-histories/Corel-Corporation-Company-History.html and American review of the technology by S. Rosenberg, “Corel Draw shows great promise” Byte Magazine, June 1, p. 213. 1987-1989 Canada 1989 1988 Jan 1989 1988 Sinclair ZX8013 13. 
http://en.wikipedia.org/wiki/Sinclair_ZX80 and http://maben.homeip.net/static/S100/sinclair/brochure/Sinclair%20ZX80%20Jan%2081%20Byte%20review.pdf 1980 UK Fall 1980 1981 Late 1980 1980 I N T E R N A T I O N A L P R O D U C T I V I T Y M O N I T O R 21 http://www.guidebookgallery.org/ads/magazines/windows/win10-powerwindows-8 http://www.facebook.com/torontostar/posts/227813607276004 http://www.facebook.com/torontostar/posts/227813607276004 http://www.fundinguniverse.com/company-histories/Corel-Corporation-Company-History.html http://www.fundinguniverse.com/company-histories/Corel-Corporation-Company-History.html http://en.wikipedia.org/wiki/Sinclair_ZX80 http://maben.homeip.net/static/S100/sinclair/brochure/Sinclair%20ZX80%20Jan%2081%20Byte%20review.pdf the United States. The same is true for com- puter hardware such as the Apple II+, Commo- dore 64, IBM PCs, Sinclair, and the Macintosh all of which appear almost simultaneously in the two countries. As it happens, the flow was not uni - dir ec ti on al . C an ad ian i n n ovat io n s li ke C o re l D r aw, de v el o p ed by C o re l , a n d PAT, developed by Open Text Corporation, were adopted by American firms soon after their Canadian release dates. A close relationship can also be observed in Table 1 between first commercialization dates and copyright dates (book dates in the table), independent of the location of the innovations. For all of these cases, there is never more than a year’s difference between the copyright date and the year of its adoption in either the United States or Canada. In other words, the appear- ance of a computer-related book in a Canadian library provides a good indicator of the initial arrival (commercialization) of the new technol- ogy in the country. Moreover, as we have shown elsewhere (Alexopoulos (2011), and Alexopoulos and Cohen (2011)), new titles are associated with the introduction of new processes or prod- ucts and not their diffusion. Our indicators, in short, should provide a good measure of com- puter innovations in Canada. We turn now to the central question of this article – what impact did these new technologies have on economic activity in Canada? Output, Productivity, and Technical Change We have used similar indicators in other papers (Alexopoulos (2008, 2011), and Alex- opoulos and Cohen (2009 and 2011)) to explore the relationship between innovative activity in a v ar i et y o f f i el d s, ou t p u t , p ro du c t i vi t y, a nd employment in the United States, drawing on the MARC records of the Library of Congress, Amazon.com’s booklists, and R.R. Bowker’s publishers’ lists. In particular, we have found, first, that new computer technologies have been a n i m p o r t a n t d e t e r m i n a n t o f p r o d u c t i v i t y growth in the United States during the post WWII period, and second, that computer- driven technology shocks have led to short run increases in productivity, employment, and out- put. We repeat the analysis using the new Cana- dian indicators and ask: do we observe the same relationships in Canada? To answer this question, we estimate the fol- lowing bi-variate VARs:11 Yt = α+γt+ρYt-1 +εt (1) where Yt = [ln(Zt), ln(Xt)]’, with Zt being our measure of aggregate output or total factor productivity (TFP), and Xt being the number of new computer titles.12 As in Alexopoulos (2011) and Shea (1998), our computer indica- tor is ordered last in the VAR and a computer technology shock is identified by assuming that it affects the Z variables with a one year time lag. 
Our measure of aggregate TFP is from Madsen (2007), while hours worked and real GDP are based on data from Maddison (2010), the Historical Statistics of Canada, and CANSIM. Chart 3 displays the impulse responses to a one standard deviation computer technology shock (as identified by our indicator), together with 90 per cent confidence intervals. Table 2 reports the Granger-causality tests, and Table 3 reports the variance decompositions for the bi-variate cases. The results echo those for the United States reported in Alexopoulos (2011) and Alexopoulos and Cohen (2011). We find, first, that computer-related technical change, as measured by our indicators, had a significant impact on output, hours worked, and TFP in post-WWII Canada. Second, our Granger-causality tests indicate that causality runs from computer-based innovations to output, hours, and TFP and not the other way round. And, third, of the three series (all computer-related books, all English language titles, and English language, U.S.-based publications), it is the second that has the strongest influence on output, hours, and TFP.

[Chart 3: Impulse Response Functions, Bi-Variate VAR. Nine panels trace the responses of GDP, Hours, and TFP over a 20-year horizon to a positive shock to each of the three indicators: all Canadian held computer titles, all Canadian held English language computer titles, and all Canadian held English language US published computer titles.]

Table 2: P-values of Granger Causality Tests

Technology Indicator | Do Computer Technologies Granger-Cause GDP? | Does GDP Granger-Cause Computer Technologies? | Do Computer Technologies Granger-Cause TFP? | Does TFP Granger-Cause Computer Technologies? | Do Computer Technologies Granger-Cause Hours? | Do Hours Granger-Cause Computer Technologies?
All Canadian held computer books (COMPALL) | 0.071 | 0.795 | 0.059 | 0.914 | 0.046 | 0.789
All Canadian held computer books in English (COMPENG) | 0.019 | 0.518 | 0.018 | 0.856 | 0.040 | 0.757
All Canadian held computer books in English published in the United States (COMPUS) | 0.111 | 0.168 | 0.064 | 0.315 | 0.061 | 0.850

Notes: For all cases Y_t = α + γt + ρY_{t-1} + ε_t, where Y_t = [ln(GDP_t), ln(X_t)]′, Y_t = [ln(TFP_t), ln(X_t)]′ or Y_t = [ln(Hours_t), ln(X_t)]′, and X_t is the value of the indicator at time t.

The impulse response functions associated with our VARs can be seen in Chart 3. As the first panel in Chart 3 shows, GDP significantly rises above trend for approximately 25 years following a positive shock to computer technologies (as identified by our indicators), with the peak effect occurring after approximately seven years. Panels 2 and 3 demonstrate that at least part of the increase in output is attributable to rises in hours worked and TFP, both of whose responses are similar to that of GDP.13 Each of these variables rises significantly for 15 to 25 years, with the peak effect occurring between years 5 and 7. Of equal interest, the effects for all of the variables are roughly the same for all three indicators.

13. While the analyses in Alexopoulos (2011) and Alexopoulos and Cohen (2011) are based on slightly different time periods, 1955-1997 and 1980-2008, their findings suggest the peak impacts for a computer innovation occur earlier in the United States.

The variance decompositions are reported in Table 3. We find, first, that in the initial years the impact of technical change in computers on our three variables is relatively weak. To be more precise, in year three the indicators accounted for 3.4–6.9 per cent of the variation in GDP, 2.8–5.5 per cent of the variation in TFP, and 2.2–5.1 per cent of the variation in hours. By year 6, however, the effect has changed quite noticeably: technical advances in computers now account for 9.0–22.0 per cent of the variation in GDP, 11.8–21.3 per cent of the variation in TFP, and 9.1–14.8 per cent of the variation in hours. By year 12, the levels have jumped again: 13.1–40.8 per cent of the variation in GDP, 21.5–45.0 per cent of the variation in TFP, and 22.3–27.3 per cent of the variation in hours.14 In general, the impact of computer technologies on the three variables is largest at medium-run horizons. Second, the indicators based on new English language computer titles account, on the whole, for a much larger percentage of the variance in our three variables than do the other two.

14. The variation in GDP and TFP attributable to computers reported by Alexopoulos (2011) and Alexopoulos and Cohen (2011) are of similar magnitude. However, for the United States, the computer innovations tend to explain a larger share of the variance in years 3-6.

Table 3: Per cent of Variation Due to Technology in Two Variable VARs

All Canadian held computer books (COMPALL)
Years | ln(GDP) | ln(TFP) | ln(Hours)
3 | 3.379 | 2.821 | 2.158
6 | 12.593 | 11.791 | 9.124
9 | 21.117 | 21.433 | 16.531
12 | 27.353 | 29.247 | 22.293

All Canadian held computer books in English (COMPENG)
Years | ln(GDP) | ln(TFP) | ln(Hours)
3 | 6.888 | 5.528 | 2.732
6 | 22.359 | 21.275 | 11.474
9 | 33.719 | 35.360 | 20.510
12 | 40.779 | 45.048 | 27.339

All Canadian held computer books in English published in the US (COMPUS)
Years | ln(GDP) | ln(TFP) | ln(Hours)
3 | 3.803 | 5.193 | 5.121
6 | 8.987 | 13.650 | 14.785
9 | 11.715 | 18.697 | 20.775
12 | 13.135 | 21.506 | 23.788

Notes: These decompositions are based on bi-variate VARs where ln(GDP), ln(TFP) and ln(L) are ordered first. For the cases using the new book measures and patents, the VAR takes the form Y_t = α + γt + ρY_{t-1} + ε_t where Y_t = [ln(GDP_t), ln(X_t)]′, Y_t = [ln(TFP_t), ln(X_t)]′ or Y_t = [ln(L_t), ln(X_t)]′, and X_t is the value of the indicator at time t.
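The Table 2 and Table 3 statistics have direct statsmodels counterparts. Continuing the sketch above (same fitted object res, same hypothetical column names), one might compute them as follows; the exact p-values and shares will of course depend on the underlying data.

    # Granger-causality p-values in both directions (Table 2 style).
    print(res.test_causality("lnGDP", ["lnCOMPENG"], kind="f").pvalue)
    print(res.test_causality("lnCOMPENG", ["lnGDP"], kind="f").pvalue)

    # Forecast-error variance decompositions (Table 3 style): share of the
    # variance of lnGDP attributable to the indicator shock at each horizon.
    fevd = res.fevd(12)
    for h in (3, 6, 9, 12):
        # decomp[variable, step, shock]: steps are 0-indexed, so step h-1 is
        # the h-year horizon; variable 0 is lnGDP, shock 1 is the indicator.
        print(h, round(fevd.decomp[0, h - 1, 1] * 100, 1))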
Table 4 and Chart 4 report the variance decompositions and impulse responses related to the tri-variate VAR:

Y_t = α + γt + ρY_{t-1} + ε_t   (2)

where Y_t = [ln(TFP_t), ln(Hours_t), ln(X_t)]′. As above, the technology indicators are ordered last. Again, we find evidence that a positive computer technology shock significantly increases productivity and hours worked. However, the confidence intervals for this case do not exclude the possibility that hours worked may initially decrease immediately following the shock. On the other hand, the results in Table 4 do confirm that new computer technologies play a strong role in productivity movements and a moderate one in variations in hours worked in the medium run.

[Chart 4: Impulse Response Function, Tri-Variate VAR. Six panels trace the responses of TFP and Hours over a 20-year horizon to a positive shock to each of the three indicators: all Canadian held computer books, all Canadian held English language computer books, and all Canadian held English language US published computer books.]

Table 4: Per cent of Variation Due to Computer Technologies in the Tri-variate VARs

All Canadian held computer books (LNCOMPA)
Horizon (Years) | ln(TFP) | ln(Hours)
3 | 3.368 | 0.256
6 | 13.311 | 2.715
9 | 23.273 | 8.010
12 | 30.742 | 14.239

All Canadian held computer books in English (LNCOMPE)
Horizon (Years) | ln(TFP) | ln(Hours)
3 | 7.136 | 0.421
6 | 25.226 | 4.658
9 | 39.543 | 13.441
12 | 48.307 | 22.863

All Canadian held computer books in English published in the US (LNCOMPUE)
Horizon (Years) | ln(TFP) | ln(Hours)
3 | 6.571 | 1.119
6 | 15.683 | 6.110
9 | 20.348 | 11.656
12 | 22.568 | 15.515

Notes: For all cases Y_t = α + γt + ρY_{t-1} + ε_t, where Y_t = [ln(TFP_t), ln(Hours_t), ln(X_t)]′ and X_t is the value of the indicator at time t.
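The tri-variate case is the same estimation pipeline with one more variable; the only substantive choice is again the Cholesky ordering, with the indicator last. A brief sketch, reusing the hypothetical data frame and imports from the earlier blocks:

    # Tri-variate VAR: TFP and hours first, the technology indicator last.
    y3 = data[["lnTFP", "lnHours", "lnCOMPENG"]]
    res3 = VAR(y3).fit(1, trend="ct")
    res3.irf(20).plot(orth=True)  # Chart 4-style responses of TFP and hours
    fevd3 = res3.fevd(12)         # Table 4-style variance shares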
Canadian Productivity and the U.S.-Canada Productivity Gap

In Chart 5, we present Canadian and American TFP indices from Madsen (2007). Two trends are apparent. First, his estimates suggest that Canadian TFP in 2005 was approximately the same as it was in the mid-1970s. Second, starting in the late 1970s, Canada's TFP growth failed to keep pace with that of the United States, giving rise to a well-known productivity gap.15 On the face of it, the first trend would seem to be inconsistent with the analysis in the previous subsection. In addition, it appears to be at odds with the upsurge in computer titles held in Canada and, accordingly, with the apparent advances in computer technology in this country. As it happens, the problem lies not with the data or with our argument but with the misunderstanding that TFP is a proxy for technological innovation.

[Chart 5: Total Factor Productivity in Canada and the United States, 1950-2005 (1950=100). Two series, Canadian TFP and American TFP. Source: The TFP measures for the total economy are from Madsen (2007).]

15. This gap is also seen in labour productivity measures.

As we all know but often forget, TFP is a residual that contains all those factors other than labour and capital that affect GDP growth. These include, among other things, changes in scale economies, organizational capital, utilization rates, measurement errors and so on, some of which could easily affect the size and rate of change of the residual. The bottom line, for our purposes, is that TFP does not measure pure technical change, which is exactly what our book-based indicator is capturing. Moreover, although computer-based technical change did play an important role in driving productivity advances in Canada, there were other, counterbalancing forces at work as well. Although unpacking the contents of the residual exceeds the scope of this article, it is a worthwhile project for future research.
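The residual nature of TFP can be made explicit. Under a standard Cobb-Douglas growth-accounting decomposition (the textbook identity behind this argument, not a derivation given in the article itself), measured TFP is:

    \ln \mathrm{TFP}_t = \ln Y_t - \alpha \ln K_t - (1 - \alpha)\, \ln L_t

so any influence on output Y_t that is not captured by measured capital K_t or labour L_t, such as utilization, scale economies, organizational capital, or measurement error, lands in the residual alongside pure technical change.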
As for the second trend, the productivity gap has, naturally enough, attracted the attention of Canadian academics and policy makers.16 The central questions are the obvious ones: what caused the gap and why has it grown? The answers matter for at least two reasons. First, we cannot begin to address the problem until we identify its source and, second, our ability to compete with our neighbor to the south is closely linked to the relative productivity in the two countries. Results reported in papers such as Sharpe (2010), Rao (2011), Rao et al. (2004), Rao and Tang (2001) and Van Ark et al. (2003) suggest that differences in the use and the rate of adoption of information technologies, especially computers, in the two countries are likely a major contributor to the gap.

16. See, for example, Rao and Tang (2001), Rao et al. (2004, 2008), Baldwin, Gu and Yan (2008), and Rao (2011).

While there are always issues with cross-country comparisons, both our metrics and our overall findings tend to support this view. First, as noted earlier, technological advances in the field of computers have had a significant impact on Canadian productivity. Moreover, as reported in Alexopoulos (2011) and Alexopoulos and Cohen (2011), a similar relationship can be observed in the United States. It follows, then, that if there were a gap in the adoption of new computer technologies between the two countries, this may have been a non-trivial contributor to the productivity gap. The question is: did such a gap exist?

Cross-border data on ICT investment in Sharpe (2010) point to a gap, as do our new book-based indicators. Specifically, Chart 6 shows the number of new computer titles held by libraries in the United States as recorded by OCLC alongside the number of new computer titles held by Canadian libraries. It shows that a gap begins to emerge in the early 1970s.17 A similar pattern can be seen in Chart 7, based on indicators created from the holdings of the largest library in the United States, the Library of Congress, and the largest in Canada, the University of Toronto Libraries.18 Since our VAR results suggest that a lag exists between the commercialization of new computer technologies (as measured by our new indicators) and their impact on productivity, the appearance of a productivity gap in the 1980s is perfectly consistent with the emergence of an adoption gap a few years earlier.19 In short, the computer-related technology gap measured by our book-based indicators does appear to have contributed to the emergence of the productivity gap beginning in the 1980s.20

[Chart 6: New Computer Books Held in Canada and the United States, by Copyright Date. Two series, Canadian Held Computer Books and US Held Computer Books; vertical axis: Number of New Titles.]

[Chart 7: New Computer Titles by Copyright Date. Two series, University of Toronto Libraries and Library of Congress; vertical axis: Number of New Titles.]

17. One might be concerned that funding differences could affect the comparability of the indicators across the two countries. However, statistics available from http://www.oclc.org/reports/escan/economic/educationlibraryspending.htm suggest that Canada spends slightly more per capita (4.6 per cent) on its library collections than does the United States, and more as a fraction of GDP (0.20 per cent versus 0.12 per cent). Given that all major U.S. and Canadian libraries are represented in our sample, and that the budgets are sufficient to allow Canadian libraries to accumulate in aggregate the same titles as their American counterparts, we believe the indicators do provide important information about the knowledge gap.

Conclusion

In this article, we draw on the holdings of Canadian libraries to develop new book-based indicators of technical change in the field of computers for the years 1950-2005 and use them to determine the impact in Canada of technological advances in this area on output, productivity, and employment. As we have argued elsewhere (Alexopoulos, 2008, 2011; Alexopoulos and Cohen, 2009, 2011), these new indicators resolve many of the problems that plague traditional measures of innovative activity such as patent citations and research and development expenditures. They also have the additional advantage that, because they are consistent across countries as well as over time, they facilitate international time series comparisons. We are able to show, for example, that most of the computer innovations identified in publications held in Canada actually originate in the United States. More, we can demonstrate, using VARs, that similar to our results for the United States, positive computer-related technology shocks in Canada lead to increases in GDP, TFP and hours worked in the short and medium run. Finally, we can make use of our new approach to show that starting in the 1970s, the number of new computer titles in Canada began to lag significantly the number in the United States (the appearance, our indicators suggest, of a technology gap), contributing a decade or so later to the emergence of a productivity gap between the two countries. This finding still leaves open the question of why the technology gap emerged in the first place, but with the identification of this problem, we hope policy makers can take steps to address it.
18. The University of Toronto Libraries has one of the largest collections in North America. According to statistics based on the number of titles and volumes held, its collection is approximately 53 per cent of the size of the Library of Congress, despite the fact that the Library of Congress serves a much larger population than the University of Toronto Libraries.

19. It should be noted that a widening productivity gap between Canada and the United States does not require that there be a growing gap in computer titles in relative terms. The current level of TFP depends on the lags of all of the new titles in the economy (not in relative terms). The fact that the Americans are still accumulating more new titles would imply that the gap should be there. However, as the current gap in new books shrinks, the gap would widen less, provided that the coefficients on the lagged titles are the same in the two countries.

20. Alexopoulos and Tombe (2011) identify a gap in management techniques which may also contribute to the presence of the gap.

References

Alexopoulos, Michelle (2008) "Extra! Extra! Some positive technology shocks are expansionary!" Economics Letters, Vol. 101, pp. 153-156.

Alexopoulos, Michelle (2011) "Read All About it!! What happens following a technology shock?" American Economic Review, Vol. 101, No. 4, pp. 1144-79.

Alexopoulos, Michelle and Jon Cohen (2009) "Measuring our Ignorance, One Book at a Time: New Indicators of Technical Change, 1909-1949," Journal of Monetary Economics, Vol. 56, pp. 450-70.

Alexopoulos, Michelle and Jon Cohen (2011) "Volumes of Evidence: Examining Technical Change Last Century Through a New Lens," Canadian Journal of Economics, Vol. 44, No. 2, pp. 413-450.

Alexopoulos, Michelle and Trevor Tombe (2011) "Managerial Knowledge and Canadian Productivity," University of Toronto Working Paper.

Baldwin, John R., Wulong Gu and Beiling Yan (2008) "Relative Multifactor Productivity Levels in Canada and the United States: A Sectoral Analysis," Statistics Canada Research Paper, Catalogue No. 15-206-X, No. 019.

Basu, S., J. Fernald, N. Oulton and S. Srinivasan (2003) "The Case of the Missing Productivity Growth: or, Does Information Technology Explain Why Productivity Accelerated in the US but not the UK?" NBER Macroeconomics Annual.

Gospodinov, N., A. Maynard and E. Pesavento (2011) "Sensitivity of Impulse Responses to Small Low-Frequency Comovements: Reconciling the Evidence on the Effects of Technology Shocks," Journal of Business & Economic Statistics, Vol. 29, No. 4, pp. 455-467.

Khan, H. and M. Santos (2002) "Contribution of ICT use to Output and Labour Productivity Growth in Canada," Bank of Canada Discussion Paper 2002-07.

Library of Congress Classification, A-Z, Library of Congress, Cataloguing Distribution Services, Washington, D.C., various years.

Maddison, A. (2010) Statistics on World Population, GDP, Per Capita GDP 1-2008 AD, Groningen Growth and Development Center.
Madsen, J. (2007) "Technology spillover through trade and TFP convergence: 135 years of evidence for the OECD countries," Journal of International Economics, Vol. 72, No. 2, pp. 464-480.

Oliner, S., D. Sichel and K. Stiroh (2007) "Explaining a Productive Decade," Brookings Papers on Economic Activity, Vol. 38, No. 1, pp. 81-152.

Rao, Someshwar (2011) Cracking Canada's Productivity Conundrum, IRPP Study 25 (Montreal: Institute for Research on Public Policy).

Rao, Someshwar and Jianmin Tang (2001) "The Contribution of ICTs to Productivity Growth in Canada and the United States in the 1990s," International Productivity Monitor, Number 3, Fall, pp. 3-18.

Rao, Someshwar, Jianmin Tang and Weimin Wang (2004) "Measuring the Canada-U.S. Productivity Gap: Industry Dimensions," International Productivity Monitor, No. 9, Fall, pp. 3-14.

Rao, Someshwar, Jianmin Tang and Weimin Wang (2008) "What Explains the Canada-U.S. Labour Productivity Gap?" Canadian Public Policy, Vol. 34, No. 2, pp. 163-92.

Sharpe, Andrew (2006) "The Relationship between ICT Investment and Productivity in the Canadian Economy: A Review of the Evidence," Research Report 2006-05 (Ottawa: Centre for the Study of Living Standards).

Sharpe, Andrew (2010) "The Canada-U.S. ICT Investment Gap in 2008: Gains in Communications Equipment and Losses in Computers," CSLS Research Note 2010-01 (Ottawa: Centre for the Study of Living Standards).

Shea, John (1998) "What Do Technology Shocks Do?" NBER Macroeconomics Annual, Vol. 13, pp. 275-310.

Stiroh, K. J. (2002) "Information Technology and the US Productivity Revival: What Do the Industry Data Say?" American Economic Review, Vol. 92, No. 5, pp. 1559-76.

Van Ark, B., Robert Inklaar and Robert H. McGuckin (2003) "The Contribution of ICT-Producing and ICT-Using Industries to Productivity Growth: A Comparison of Canada, Europe and the United States," International Productivity Monitor, Number 6, Spring, pp. 56-63.

Appendix A: Sample MARC Record and Associated Online Display

MARC record:

00992cam 2200253 a 45000010008000000050017000080080041000250350021000 66906004500087010001700132020004600490400018001 95050002700213082001600240100002700256245008900 28326001220037230000340049450000200052865000430 0548650004300591630003800634991006600672- 4768599-19930312102159.8-860214s1986 waua 001 0 eng - 9(DLC) 86002512- a7bcbccorignewd1eocipf19gy-gen-catlg- a86002512 - a0914845705 (pbk.) :c$17.95 ($27.95 Can.)- aDLCcDLCdDLC-00aQA76.8.I2594bA541986- 00a005.265219-1 aAndrews, Nancy,d1945--10aWindows :bthe official guide to Microsoft's operating environment / cNancy Andrews.- aRedmond, Wash. :bMicrosoft Press ;a[New York] :bDistributed to the book trade in the U.S. by Harper & Row,cc1986.- axii, 292 p. :bill.
;c24 cm.- aIncludes index.- 0aIBM Personal Computer XTxProgramming.- 0aIBM Personal Computer ATxProgramming.-00aMicrosoft Windows (Computer file)- bc-GenCollhQA76.8.I2594iA54 1986p00034791090tCopy 1wBOOKS

Online display of the information in the MARC record:

Windows: the official guide to Microsoft's operating environment (record 4768599)
LC control no.: 86002512
Type of material: Book (Print, Microform, Electronic, etc.)
Personal name: Andrews, Nancy, 1945-
Main title: Windows : the official guide to Microsoft's operating environment / Nancy Andrews.
Published/Created: Redmond, Wash. : Microsoft Press ; [New York] : Distributed to the book trade in the U.S. by Harper & Row, c1986.
Description: xii, 292 p. : ill. ; 24 cm.
ISBN: 0914845705 (pbk.) : $17.95 ($27.95 Can.)
Notes: Includes index.
Subjects: Microsoft Windows (Computer file); IBM Personal Computer XT --Programming; IBM Personal Computer AT --Programming.
LC classification: QA76.8.I2594 A54 1986
Dewey class no.: 005.265

Appendix B: Computer Classifications in Library of Congress Classification and the Dewey Decimal System

In the Library of Congress, the books pertaining to computers and computer science are typically listed under the subclass QA Mathematics. For the indicators created in the paper, we used books classified under the QA 75 and QA 76 groups. Specifically, these are the books classified under QA 75-76.95 Calculating machines, which include titles on electronic computers, computer science, and computer software. The indicators also include books classified under the Dewey Decimal System classifications 004-006. The items under these designations are grouped as follows:

004 Data processing & computer science
005 Computer programming, programs & data
006 Special computer methods
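Records such as the one in Appendix A can be processed programmatically. The following is a rough sketch, using the third-party pymarc library, of the counting rule the appendices describe (titles classed under QA 75-76.95 or Dewey 004-006, dated by copyright year). The input file name is hypothetical, and the paper's actual indicator construction, which works from OCLC and Library of Congress holdings and counts new titles only, is more involved.

    from pymarc import MARCReader

    def first_subfield(record, tag, code):
        # Return the first subfield `code` of the first `tag` field, or "".
        for field in record.get_fields(tag):
            values = field.get_subfields(code)
            if values:
                return values[0]
        return ""

    counts = {}  # copyright year -> number of computer titles
    with open("records.mrc", "rb") as fh:  # hypothetical file of MARC records
        for record in MARCReader(fh):
            lcc = first_subfield(record, "050", "a")  # e.g. "QA76.8.I2594"
            ddc = first_subfield(record, "082", "a")  # e.g. "005.265"
            if lcc.startswith(("QA75", "QA76")) or ddc.startswith(("004", "005", "006")):
                date = first_subfield(record, "260", "c")  # e.g. "c1986."
                digits = "".join(ch for ch in date if ch.isdigit())
                if len(digits) >= 4:
                    year = digits[:4]
                    counts[year] = counts.get(year, 0) + 1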
work_hel2nmqj6bf6vdlyaqrvgbzqcy ---- ELIS_OTDCF_v21no4.doc

The Future of the Anglo-American Cataloguing Rules

by Norm Medeiros
Coordinator for Bibliographic and Digital Services
Haverford College
Haverford, PA

___________________________________________________________________________________________________

{A published version of this article appears in the 21:4 (2005) issue of OCLC Systems & Services.}

"He has all the virtues I dislike and none of the vices I admire."
-- Sir Winston Churchill

ABSTRACT

This article discusses the impending update to the Anglo-American Cataloging Rules (AACR) and its potential impact on libraries and other metadata communities.

KEYWORDS

Anglo-American Cataloging Rules (AACR) ; RDA ; Resource Description and Access ; cataloging ; metadata ; content rules

Each summer, circulation staff in my library inventories a section of the stacks and brings collection issues to the attention of appropriate bibliographers. Since I am responsible for the economics collection, I see an array of government documents that have managed to elude the cataloging process. Many of these titles are decades old, having squatted in the library undisturbed and uncirculated since our online catalog was implemented in 1990. This summer's group of cunning books included annual reports from the Comptroller of the Currency, FALK Project reports, and texts of various legislative acts. My favorite book in the group, however, is the 1949 edition of the Dictionary of Occupational Titles, published by the United States Employment Service. Not surprisingly, the first occupation I flipped to was "librarian," which carried a definition that still pertains today (United States Employment Service, 1949):

Manages a library, supervising assistants and performing specific duties according to size of library: Selects books to be purchased by library, or approves or rejects list of books prepared by subordinates. Determines library policies and coordinates work of departments. Supervises the classification, cataloging, shelving, and circulation of books and periodicals. Works with schools or organizations, giving advice in courses of reading and references for research. Furnishes expert service in giving information from books on subjects of general or special interest to groups or individuals.

Specialty occupations cross-referenced were "Librarian, Reference," "Medical Librarian," and "Patients' Librarian." There was no heading for "Librarian, Technical Services," or even "Catalog Librarian." "Cataloger" was listed, with the following brief definition:

Classifies books, magazines, or other library materials according to desired group headings, such as history, drama or fiction.

By comparison, the latest version of the dictionary expands this definition significantly, though it is still inadequate by today's standards (Dictionary of Occupational Titles, 1991):

Compiles information on library materials, such as books and periodicals, and prepares catalog cards to identify materials and to integrate information into library catalog: Verifies author, title, and classification number on sample catalog card received from CLASSIFIER (library) against corresponding data on title page. Fills in additional information, such as publisher, date of publication, and edition.
Examines material and notes additional information, such as bibliographies, illustrations, maps, and appendices. Copies classification number from sample card into library material for identification. Files cards into assigned sections of catalog. Tabulates number of sample cards according to quantity of material and catalog subject headings to determine amount of new cards to be ordered or reproduced. Prepares inventory card to record purchase information and location of library material. Requisitions additional cards. Records new information, such as death date of author and revised edition date, to amend cataloged cards. May supervise activities of other workers in unit.

As the definitions above illustrate, change is inherent in libraries, and nowhere more so than in cataloging. Whether systems, standards, or tasks, the cataloging community seems to remain in a constant state of transition. It's not going to get any easier in the near future. The hot topic in cataloging circles at the recent American Library Association Annual Conference was RDA: Resource Description and Access, the successor to AACR2. When the Joint Steering Committee for Revision of AACR (JSC) announced that the new text would veer from the original path that was to be AACR3, many catalogers were stunned. Rather than the original plan to simply evolve in a natural way to AACR3, which seemed to be accepted by the cataloging community at large, RDA takes a more progressive approach, providing relatively simple content rules that could be adopted by various metadata communities in need of such guidance.

WHAT WILL RDA MEAN TO ME?

Since learning about RDA and some of the motivations behind it, I've thought about the consequences for my library staff, as well as the larger information world. I will go on record as saying I applaud the shift from AACR3 to RDA. The JSC was courageous to make such a move, especially knowing that many catalogers would think RDA's simplification a deterioration of cataloging standards. Although this may be the case, development of a code that could be far-reaching in the international community is a bold move, and if successful, will facilitate meaningful data exchange across disparate metadata providers, not to mention place ALA and its counterpart associations in high esteem among metadata communities in desperate need of simple, useful content rules. That said, training and system redesign will require significant budget allocations. More difficult than the financial preparation may be the emotional distress some catalog librarians will undergo while adjusting to this new code. Comments I overheard while attending ALA lead me to believe this emotional hurdle will not be small for some.

RDA IN PRACTICE

"Our cataloguing rules need to remain independent of any communication format. They also provide a content standard for elements of bibliographic description and access that could be used by any of the emerging metadata standards" (Joint Steering Committee for Revision of AACR, 2005). It's exciting to imagine a world where use of RDA extends to metadata projects outside of the library system. Even my small library is involved in a number of non-MARC bibliographic projects that would benefit from RDA's guidance. The strength and ultimate long-term value of RDA, however, will not be measured by library acceptance and utilization. Instead, RDA will be judged by how well it accommodates the needs of cultural institutions outside the world of libraries.
Will these communities find the rules easy to understand and implement? Are the rules appropriate and sensible? Can the value of adhering to RDA be quantified and illustrated? These are the questions that will need affirmative answers in order for RDA to achieve the JSC's bold vision.

CONCLUSION

Discussion about RDA will only get hotter as the JSC makes available sample chapters and an overall prospectus later this summer. It's highly unlikely RDA will take any drastic turns from its present course, but input from the cataloging community may play a role as RDA is finalized over the next two years. This is in some ways a risky step for the JSC, and one that will continue to draw attention. Much like early detractors of Dublin Core and other non-MARC metadata schemes, I suspect RDA will ultimately convert even the most staunch traditionalists.

REFERENCES

Joint Steering Committee for Revision of AACR (2005). "RDA: Resource Description and Access." Available: http://www.collectionscanada.ca/jsc/docs/rdapptmay2005.pdf (Accessed: 19 July 2005).

United States Employment Service (1949). Dictionary of Occupational Titles, 2nd ed. Washington, DC: United States Government Printing Office.

United States Employment Service (1991). Dictionary of Occupational Titles, 4th ed., rev. Available: http://www.occupationalinfo.org/ (Accessed: 12 July 2005).

work_hf5nqfi63ra7fkhirdjvluguna ---- Trends in Twitter Hashtag Applications: Design Features for Value-Added Dimensions to Future Library Catalogues

Hsia-Ching Chang and Hemalata Iyer

LIBRARY TRENDS, Vol. 61, No. 1, 2012 ("Losing the Battle for Hearts and Minds? Next-Generation Discovery and Access in Library Catalogues," edited by Kathryn La Barre), pp. 248-258. © 2012 The Board of Trustees, University of Illinois

Abstract
The Twitter hashtag is a unique tagging format linking Tweets to user-defined concepts. The aim of the paper is to describe various applications of Twitter hashtags and to determine the functional characteristics of each application. Twitter hashtags can assist in archiving Twitter content, provide different visual representations of tweets, and permit grouping by categories and facets. This study seeks to examine the trends in Twitter hashtag features and how these may be applied as enhancements for next-generation library catalogues. For this purpose, Taylor's value-added model is used as an analytical framework. The morphological box developed by Zwicky is used to synthesize functionalities of Twitter hashtag applications. And finally, included are recommendations for the design of hashtag-based value-added dimensions for future library catalogues.

Introduction
The social media world contains a plethora of current information; however, it is not always easy to keep up with the volume or retrieve valuable information using traditional search engines. Various applications have been developed to help information consumers locate and share social media resources that match their interests. Twitter, a social media platform with 140 character limits, serves as a real time information communication network. A Twitter hashtag is a unique tagging format with a prefix symbol, #, that associates a user-defined tag with Tweet content. Beyond supporting different search criteria, various Twitter hashtag applications may also provide users with functionalities to organize, share, save, or publish the search results of twitterverse resources.

Management of information and information systems often emphasizes the importance of user needs.
Taylor's value-added model is one of the few models that provides the notion of adding value to both information and information systems as a way to meet user needs (Taylor, 1982). The core function of information systems is to manage the information required to perform business processes, regardless of the devices adopted to implement it. Hence, this study discusses Twitter hashtag applications available for use in computer and mobile devices. The morphological box developed by Zwicky (1969) is a design tool that supports generating ideas and detailing. The primary processes include (1) defining user requirements, (2) collecting functional characteristics from existing systems, and (3) listing attributes for each functional characteristic. Based on user requirements suggested by Taylor (1986), this study explores notable functional characteristics of Twitter hashtag applications and compares Twitter adoption and hashtag use on leading library and social cataloguing Web sites. The findings demonstrate hashtag use patterns from different perspectives, and suggest two value-added dimensions that combine effective functional characteristics and existing best practices. The suggestions from this case study can be used to generate new ways to address user needs in the library catalogues of the future.

Rationale
Taylor's value-added model has played a prominent role in user-centered design of information systems for more than two decades. However, the model does not specify any particular context of system use. Integrating a taxonomy of tagging motivation into Taylor's model provides an effective approach to analyzing user requirements and value-added dimensions.

The Essence of Taylor's Value-Added Model
Taylor's value-added model has been recognized as a visionary framework and is highly relevant for use in evaluating information systems (Eisenberg & Dirks, 2008) and knowledge organization systems (Pimentel, 2009). The following features of Taylor's model are used in this analysis: information spectrum and underpinning system features.

Information Spectrum
The notion of the "information spectrum" refers to a hierarchical structure of information, which clearly outlines a series of value-added processes and associated methods (Taylor, 1982). The spectrum identifies five phases of increasing complexity and sophistication: data, information, informing knowledge, productive knowledge, and action. For instance, transforming raw data into information requires the efforts of organizing processes through grouping, classifying, relating, formatting, signaling, or displaying. An information professional engages in synthesizing and making judgments in order to make the transition from the information phase to the informing- or productive-knowledge phases that can support action and decision making. The entire information spectrum represents how different levels of information processing enable and advance the creation of value-added processes.

Value-Added Model
Besides focusing on information, another influential value-added model by Taylor (1986) reflects the focal points of values: user criteria (of choice), system, and interface. Taylor suggested six user requirements that have the potential to add value to information systems: ease of use, noise reduction, quality, adaptability, time saving, and cost saving.
These areas of user criteria define an effective interface for bridging the gap between user and system. The value-added processes in the system instantiate at the interface level. Thus, developing examples of value-added processes helps focus and deliver functional features to users.

User Perspectives: Taxonomy of Tagging Motivations
Understanding user needs and how users make sense of the environment and their experiences are critical aspects of information system design. Optimal design pays close attention to the study of information behavior; such study seeks to understand why and how people seek and use information as they try to make sense of the environment and attempt to remain well informed. Tagging is a form of information behavior. Understanding the motivations underlying tagging behavior is becoming increasingly important to the design of information systems. This is especially true in library catalogues as user tagging becomes progressively more prevalent. Hashtags have also become a popular form of information organization for users of social media; the diffusion of innovation theory explains why users adopt new communication behavior, such as hashtagging (Chang, 2011).

While Taylor's model addresses the aspect of user feedback and offers universal design principles, an understanding of specific user requirements together with a holistic user perspective is also necessary for designing hashtag systems. Ames and Naaman (2007) examined the tagging motivations of users and developed a two-dimensional taxonomy of tagging motivation: function and sociality. As per this taxonomy, users engage in tagging for the purpose of organization and communication, and this constitutes the function dimension. The sociality dimension reflects that tagging could be undertaken to meet individual needs or to meet the needs of others, especially social needs. The resultant four categories are self organization, self communication, social organization, and social communication.

Tagging for self organization supports personal search and retrieval; users tag for personal information management intended for future retrieval. Tagging for self communication provides for personal context and also serves as a memory cue. In essence, the user motivation to tag is to aid in personal organization and communication. On the other hand, from the social or collective viewpoint, the organization and communication function is intended for others. Thus, in the social organization category, the primary motivation for tagging is to facilitate public search and self promotion. In the social communication category, tags are intended to communicate contextual information to others, thereby taking advantage of the wisdom of crowds to assist in public search and communication with friends or family. Amongst these four categories, the social organization aspect dominates tagging motivation; as Ames and Naaman (2007) discovered, taggers are motivated by the urge to share their organized information with the public. The function dimension in this taxonomy aligns closely with the user criteria depicted in Taylor's model discussed in the previous section.

Research on tags and tagging indicates that general users prefer the flexibility and control of using their own vocabulary to convey certain meanings to personalized information organization (Eden, 2005; Vander Wal, 2007).
An interesting case by Sinha (2005) distinguished categorization behavior from tagging behavior based on the cognitive process and concluded that taggers seem to encounter less cognitive overhead. Taggers typically do not hesitate to tag, while professional cataloguers often must comply with subject authority rules when using subheadings to combine concepts. Taggers have the freedom to upload as many tags as they want, whereas cataloguers often make parsimonious use of subject headings by combining more than one concept through the use of subheadings.

Drawing from Soergel's (1985) concept, Hjorland (1997) addressed the difference between content-oriented indexing and need-oriented indexing. Content-oriented indexing (for example, the Library of Congress Subject Headings [LCSH]) reflects the document perspective as a way to identify the item. This is less likely to reveal user contexts. In contrast to LCSH, hashtagging offers greater support and aligns more closely to need-oriented indexing. Therefore, when evaluating whether hashtagging systems meet user needs, it is important to note the different standpoints between professional and lay users. What matters most for design decisions is understanding the targeted user needs: whether the hashtag applications are adopted or integrated for professional use or library patron use.

Case Study Method: Morphological Box
A morphological box lists, in a table-like form, the attributes of a product or service along a matrix of user needs. This can be an effective device for creative idea generation and results in various new alternatives created by recombining attributes in the matrix (Ritchey, 2011; Zwicky, 1969). The Zwicky box has been applied to multiple domains, such as management, public policy, and natural science (Ritchey, 2011). In the information systems context, the salient attribute might be functional design elements. The problem is that each hashtag application fulfills certain aspects of information needs. This can cause difficulties in arriving at definitive hashtag management. These are the steps for creating a morphological box: (1) identify user requirements; (2) list the function attributes as column headings; (3) list available variations of the attributes; (4) select one item from each column randomly or mix interesting combinations of items; (5) evaluate whether the combination is feasible or, alternatively, recombine the elements in another new way.

Each user requirement in table 1 is drawn from Ames and Naaman's (2007) two dimensions of tagging motivation: communication and organization. The table indicates the corresponding functional attributes (called interfaces by Taylor, 1986) in the first entry and the subordinate attributes (called value-added examples by Taylor, 1986) in the second entry. This case study outlines five functional characteristics (memory, social signaling, search, directory, and archive) based on Ames and Naaman's taxonomy. As shown in table 1 for the first user tagging requirement, "communication," if a strand of the table is followed vertically for the first functional characteristic, "memory," it leads to more detailed value-added function attributes, acquired through system analysis of existing applications, such as self-publishing, scheduling to send tweets, archiving, and saved searches.
On the other hand, the horizontal path emphasizes moving forward to the next user requirement or functional characteristic.

Table 1. Morphological box of Twitter hashtag applications

User requirement: Communication
Function characteristic (interface): Memory
  Function attributes (value-added examples): self-publishing; scheduling to send tweets; archiving; saved searches
Function characteristic (interface): Social Signaling
  Function attributes (value-added examples): Retweet; Mention; sharing photos/videos; sharing organized tweets URL

User requirement: Organization
Function characteristic (interface): Search
  Function attributes (value-added examples): Tweets; Time; #hashtag; Keyword; Language; real-time trends; user-contributed definitions of real-time trends
Function characteristic (interface): Directory
  Function attributes (value-added examples): Interest; Subject; Industry; recent searches; Location; making it public or private
Function characteristic (interface): Archive
  Function attributes (value-added examples): Compile; Save; Analytics; Visualizations; recent 500-1500 tweets from @name; mentioning @name

The list of user requirements and functional attributes is notably neither comprehensive nor prescriptive. One of the disadvantages of the morphological box is that it is difficult to select or evaluate the best path among the potential combinations of ideas. For instance, our morphological box has 4032 (= 4*4*7*6*6) possible solutions. In brainstorming sessions, the process of elaborating details often gives support for expanding ideas. On the contrary, when attempting to converge ideas, listing more categories does not guarantee better results. To benefit from the method, adopters must identify elements most appropriate to their needs and omit those that are not.
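Steps (2) through (4) of the method translate directly into data and code. The sketch below transcribes the five Table 1 columns into a Python dictionary and enumerates every combination with itertools.product; the printed count reproduces the 4032 figure given above.

    from itertools import product

    box = {
        "Memory": ["self-publishing", "scheduling to send tweets",
                   "archiving", "saved searches"],
        "Social Signaling": ["Retweet", "Mention", "sharing photos/videos",
                             "sharing organized tweets URL"],
        "Search": ["Tweets", "Time", "#hashtag", "Keyword", "Language",
                   "real-time trends",
                   "user-contributed definitions of real-time trends"],
        "Directory": ["Interest", "Subject", "Industry", "recent searches",
                      "Location", "making it public or private"],
        "Archive": ["Compile", "Save", "Analytics", "Visualizations",
                    "recent 500-1500 tweets from @name", "mentioning @name"],
    }

    # Step (4), exhaustive form: one attribute from each column.
    combinations = list(product(*box.values()))
    print(len(combinations))  # 4 * 4 * 7 * 6 * 6 = 4032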
Trends in Twitter Hashtag Applications
Because of advances in technology, Twitter hashtags can blend both communication and organization by helping information professionals enhance research or reading experiences and by bringing users and resources closer together. As with other social media platforms, Twitter can be used to quickly connect people and to allow them to follow updates about each other. Alternatively, a person could connect with new people by joining the information networks and news channels that reflect one's own interests, link to breaking news, or disseminate business updates, which may take a great deal more effort. Brevity is another reason why Twitter has become an efficient means of communication, especially with the introduction of hashtags, which facilitate searching or keeping track of updates on relevant resources. A number of Twitter applications enhance communication capabilities to distribute tweets with music, photos, and videos. Twitter hashtag applications turn Twitter into a research and archiving tool, especially when it is integrated into applications that provide structured directories, advanced search, content organization, content presentation, and analytics. These functions are useful for people or businesses that need to organize and keep track of tweets that mention specific names, hashtags, keywords, or topics. Such functions not only allow content customization and presentation but also provide analytics that monitor illustrative statistical trends such as the top ten tweeted hashtags, tweet volume over time, top users, and the percentages of tweets and retweeting.

Table 2 presents the comparison of Twitter adoption and hashtag use (recent two-month tweeted #hashtags, from September 1, 2011, to November 17, 2011) between the leading library and social cataloguing Web sites. The leading library cataloguing Web sites include OCLC, Worldcat, and the Library of Congress; the representative social cataloguing Web sites are Goodreads and LibraryThing.

The Library of Congress joined Twitter in 2007 and is the earliest Twitter adopter in the case study, followed by Goodreads, OCLC, Worldcat, and LibraryThing. Similarly, the Library of Congress has the most followers of the institutions in the case study and follows the fewest. Interestingly, OCLC has the highest number of tweets, but the Worldcat catalogue is the lowest contributor of tweets, although it joined Twitter on the same day. OCLC and the Library of Congress have used hashtags more regularly (e.g., #oclcr, reused 4 times) than WorldCat.

Table 2. Twitter Adoption and Hashtag Use of Leading Library and Social Cataloguing Sites

@OCLC
  Description: "OCLC is about connecting people to knowledge through library cooperation."
  Profile: Following: 162; Followers: 5605; Tweets: 2016; Favorites: 0; joined on April 8, 2009.

@Worldcat
  Description: "WorldCat is the world's largest library catalogue and is a great place online to find materials in libraries, worldwide."
  Profile: Following: 258; Followers: 2293; Tweets: 594; Favorites: 1; joined on April 8, 2009.

@librarycongress
  Description: "We are the largest library in the world, with millions of books, recordings, photographs, maps and manuscripts in our collections."
  Profile: Following: 6; Followers: 302,385; Tweets: 1579; Favorites: 0; joined on June 29, 2007.

@LibraryThing
  Description: "Watch this space for feature announcements, site news, and more from the team at LibraryThing.com"
  Profile: Following: 3022; Followers: 4398; Tweets: 1303; Favorites: 3; joined on January 20, 2011.

@Goodreads
  Description: "The largest site for readers and book recommendations. Find new books, recommend books, track your reading, join book clubs, win advanced copies, and much more!"
  Profile: Following: 10,160; Followers: 223,784; Tweets: 1963; Favorites: 1; joined on August 19, 2008.

Recent two-month tweeted hashtags (across the five accounts): #oclcr (4), #WorldCat, #natbookfest (15), #LibraryThingReads (10), #quoteoftheday (9), #orrapcap, #Spotify, #thankyousteve, #fridayreads (9), #NBA2011, #wms4aca, #declaration, #bookhaiku (4), #GoodreadsChoice (10), #orss (2), #wanttohelp, #AllHallowsRead, #neardalert, #orarchivegrid, #birthdays, #oclc, #ala11, #ows, #weekendgoodreads, #wcid, #playlearn, #NatBookFestival, #DDC11, #librariannerdhumor.

Note: The number within parentheses indicates the frequency of tweeted hashtags from September 1, 2011, to November 17, 2011.

Frequently tweeted hashtags tend to fall into a few categories. The most consistently adopted type of hashtag relates to events or announcements. This type of event hashtag (#natbookfest) stands for "national book festival"; this hashtag also appeared on the most frequently tweeted hashtag list at the Library of Congress. Other events might include tweets about online book club events, for example, which bear hashtags such as #LibraryThingReads, #GoodreadsChoice, #fridayreads, and #quoteoftheday. These are fairly common on the two social cataloguing Web sites examined in this case study, LibraryThing.com and Goodreads. Other similar hashtags relate to collections of research resources (e.g., #oclcr) and knowledge sharing gained from reading (e.g., #quoteoftheday). One notable difference among sites relates to the number of followers.
One notable difference among sites relates to the number of followers. With the exception of LibraryThing, the Twitter followers of these sites outnumber the accounts they are following. However, social cataloguing Web sites appear to focus on creating two-way conversation, whereas professional cataloguing Web sites seem to focus on one-way broadcasting, or pushing news or announcements out to their Twitter followers.

Suggestions for Value-Added Dimensions to Future Library Catalogues

Libraries and social cataloguing organizations have used Twitter (as well as hashtags) in various ways. Professional cataloguing organizations tend to adopt Twitter primarily for broadcasting or communication purposes, to send out the latest news and announcements. Social cataloguing sites go beyond communication and actively engage followers in idea exchanges by adding hashtags to create and organize particular topics. Once a site regularly starts creating and organizing hashtags (such as #fridayreads), interactions with followers connect people with informative content. This study suggests two approaches to exemplify possible dimensions for establishing such connections through hashtag applications: connecting the process through a value-added interface, and connecting the materials through providing value-added information.

Connecting the Process through a Value-Added Cataloguing Interface

Use of the Zwicky morphological box presents many paths forward for conceptualizing innovative use of Twitter hashtag applications to solve information problems. Whether this is possible depends heavily on the willingness of library catalogue service providers to add hashtag features in support of existing library catalogue services. This is not beyond the realm of possibility; for example, WorldCat already integrates user-generated book reviews by connecting to Goodreads.com. This application embeds a Twitter icon into each book page, thereby allowing book sharing by making it simple to tweet a book title that includes a link to Goodreads. This not only retains and extends the catalogue format but also can greatly improve accessibility to materials.

According to the morphological box (table 1), one of the potential combinations for value-added dimensions to future library catalogues might lie in the combination of "#hashtag, analytics, and visualizations." One possible use of this combination would consist of allowing patrons to see the relevant hashtags assigned to a book title or an author's name by providing visualizations or listing analytics. Such an approach might be especially suitable on a library Web page promoting an author or a set of titles on a particular subject.

Connecting Library Materials through Value-Added Information

Taylor's value-added information spectrum suggests different levels of providing value-added information. Thus, another practical combination arrived at through morphological analysis is archiving, real-time trends, user-contributed definitions of real-time trends, and sharing organized tweets URL. For users who are curious about the meaning of real-time trending topics (hashtags or keywords), the HootSuite mobile application enables quick and easy browsing of current definitions by clicking the question marks next to the trends. If users are interested in coauthoring trend definitions, Whatthetrend.com provides a wiki system that allows users to update the definitions of the real-time Twitter top ten trends, grouped by countries and cities.
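Returning to the WorldCat example above: the "tweet this book" integration works because a pre-filled share link is nothing more than a URL. As a minimal sketch of how a catalogue page might generate such a link, the following uses Twitter's Web Intent URL format; the record title, catalogue URL, and hashtag here are hypothetical, and the exact intent parameters should be verified against Twitter's current documentation:

from urllib.parse import urlencode

def tweet_intent_url(text, url, hashtags=()):
    # Build a Web Intent link that opens a pre-filled tweet in the browser.
    params = {"text": text, "url": url}
    if hashtags:
        params["hashtags"] = ",".join(hashtags)  # comma-separated, no "#"
    return "https://twitter.com/intent/tweet?" + urlencode(params)

print(tweet_intent_url(
    "Steve Jobs / Walter Isaacson",
    "https://catalogue.example.org/record/12345",
    hashtags=["fridayreads"],
))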
For new users needing to locate social media connections that match their interests, Twitter's official mobile application has a built-in directory structured according to interests. Twitter users, like readers, are drawn to novel features and interesting content. Taking the recent record-breaking tweets-per-second event (SR7 Online News, 2011) as an example: when Steve Jobs passed away on October 5, 2011, four hashtags (#SteveJobs, #ThankYouSteve, #RIPSteveJobs, and #iSad) and two related keywords ("Think Different" and "Stay Hungry") dominated the top ten trend list. Many Twitter users worldwide shared various Internet resources that day, including image collections, videos, quotes, and links to news. This user-generated content could be archived, and in such a form it would be ideal for connecting with library materials. For example, this archived Web content could be linked to Steve Jobs' newly published biography and any other related publications on Steve Jobs.

Conclusions

Twitter's real-time updates represent a wealth of timely information. The issue is how to extract and transform potential information on almost any subject into knowledge that can be used. Twitter hashtag applications enable users to add value to the tweeted content by reorganizing, publishing, and distributing the content based on different criteria, ranging from hashtags to keywords to user names. Carrying out a morphological analysis can help discover new combinations of value-added services, though some of these combinations might be more practical than others. This is where cataloguing experience and practice come into play. Inheriting the spirit of Twitter's open API, Twitter hashtag applications allow developers to configure their applications according to user requirements. Hence, as long as library organizations have Twitter data stream connections, they can adopt third-party applications equipped with search, directory, and archiving functions that allow analysis and visualization of the Twitter hashtags used within the library catalogue environment, linking out to the richness of the Twitterverse. Information professionals familiar with advanced searching techniques may readily adopt hashtags as a useful search strategy. For example, imagine a reader following one hashtag connection to another and finding useful material. Such success might lead the reader to share and communicate their discoveries with others through the use of hashtags, thus adding to the value-added processes. As a design method, the morphological box has the potential to inspire creativity and assist in organizing user requirements into tables and decomposing categories of design features, through which innovative solutions can be developed to enhance next-generation catalogues.

References

Ames, M., & Naaman, M. (2007). Why we tag: Motivations for annotation in mobile and online media. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 971–980). New York: ACM.
Chang, H.-C. (2011). Rehashing information architecture: Exploring human-information interaction of collaborative tagging using Twitter hashtags. Unpublished doctoral dissertation, University at Albany, State University of New York.
Eden, B. (2005, November/December). Metadata and its applications: New directions and updates. Library Technology Reports, 41(6). Chicago: American Library Association.
Eisenberg, M. B., & Dirks, L. (2008). Taylor's value-added model: Still relevant after all these years. Paper presented at iConference, UCLA, Los Angeles.
Hjorland, B. (1997). Information seeking and subject representation: An activity-theoretical approach to information science. Westport, CT: Greenwood Press.
Pimentel, D. M. (2009). The KO roots of Taylor's value-added model. In E. K. Jacob & B. Kwasnik (Eds.), Proceedings from North American Symposium on Knowledge Organization, vol. 2 (pp. 58–67). Syracuse, NY: International Society for Knowledge Organization: Chapter for Canada and United States.
Ritchey, T. (2011). Wicked problems/Social messes: Decision support modelling with morphological analysis. Berlin: Springer.
Sinha, R. (2005, September 27). A cognitive analysis of tagging. Message posted to http://rashmisinha.com/2005/09/27/a-cognitive-analysis-of-tagging/
Soergel, D. (1985). Organizing information: Principles of data base and retrieval systems. London: Academic Press.
SR7 Online News. (2011, October 7). Doing the job: Forbes data reaffirms Twitter outpouring for Apple founder and icon. Message posted to http://www.sr7.com.au/2011/10/doing-the-job-forbes-data-reaffirms-twitter-outpouring-for-apple-founder-and-icon/
Taylor, R. S. (1982). Value-added processes in the information life cycle. Journal of the American Society for Information Science, 33(5), 341–346.
Taylor, R. S. (1986). Value-added processes in information systems. Norwood, NJ: Ablex.
Vander Wal, T. (2007). Folksonomy coinage and definition. Retrieved March 14, 2012, from http://vanderwal.net/folksonomy.html
Zwicky, F. (1969). Discovery, invention, research: Through the morphological approach. Toronto: MacMillan.

Hsia-Ching Chang is the project manager of HyWeb Technology Co., Ltd. in Taiwan, where she works on consulting for business analysis and transformation of business requirements in software systems. She received a PhD in information science from the University at Albany, State University of New York. Her research interests include human–computer interaction, user interface design, information architecture, IS/IT adoption and diffusion, and Web 2.0 technologies.

Hemalata Iyer is an associate professor in the Department of Information Studies at the University at Albany, State University of New York. Her research interests are in the areas of information organization, metadata and access, social informatics (especially as it relates to culture and cultural heritage information and communication), visual resources, image indexing and organization, and human information behavior.

work_htyhexv43jbq7fuxtfo7cqwfum ----

OpenDOAR Repositories and Metadata Practices

D-Lib Magazine, March/April 2015, Volume 21, Number 3/4

Heather Lea Moulaison, Felicity Dykas, and Kristen Gallant
University of Missouri
{moulaisonhe, dykasf}@missouri.edu and kahm9c@mail.missouri.edu
DOI: 10.1045/march2015-moulaison

Abstract

In spring 2014, authors from the University of Missouri conducted a nation-wide survey on metadata practices among United States-based OpenDOAR repositories.
Examining the repository systems and current practices of metadata in these repositories, researchers collected and analyzed the responses of 23 repositories. Results from this survey include information about the creators of metadata, best practices and resources, and controlled vocabularies. Findings will inform libraries about the current state of repository and metadata choices in open repositories in the United States, especially as they pertain to overarching questions of interoperability.   Introduction The creation of metadata for research and repository content is an essential part of the scholarly communication process and is necessary for the long-term access and preservation of our digital (and digitized) heritage. Metadata choices and practices affect the findability of resources in the online environment, and these choices, influenced by the content itself, also reflect the institutions, stakeholders, and users of specific repositories. Content in repositories may be one-of-a-kind, with academic libraries creating digital repositories to house and make available the campus's unique intellectual capital (see Cullen & Chawner, 2011). Other institutions such as the American Museum of Natural History and the New York Public Library have also chosen to make curated digitized information freely available on the open Web. Such collections often have original or unique content, and "can more broadly facilitate the creation of new knowledge by an even wider array of scholars and researchers than in the past" (Gasaway, 2010, p. 758-759). As it stands, no one-size-fits-all answers to metadata practices have been devised, and details about current practice remain understudied. Having an understanding of current practice in highly visible and accessible repositories that are part of the OpenDOAR registry will provide insight into larger questions of access and interoperability. In spring 2014, we conducted a nation-wide survey on metadata practices among United States-based OpenDOAR repositories. The current analysis begins by addressing questions of repository demographics, including providing an overview of which systems respondents are using, what is being made available, and the overall size of the collection. Next, we investigate metadata practices including the metadata creation environment. Specifically on the topic of standards, we investigate the metadata schema being used as well as the controlled vocabularies. In order to ensure consistency and interoperability, repositories must have knowledgeable team members performing related duties. The survey gathered information on the kinds of individuals doing metadata-related work and inquired into the resources, including best practices documentation, that they have at their disposal.   OpenDOAR Repositories OpenDOAR is a directory of open repositories developed by the University of Nottingham in the United Kingdom and Lund University in Sweden (Jacso, 2006; "OpenDOAR: Open Access," 2006; "OpenDOAR or Directory," 2005). The directory serves an international academic community, establishing an authoritative and quality-based source for accessing open-access scholarly materials ("About," 2014). OpenDOAR ensures quality by visiting the repositories prior to listing them in the directory. The directory includes 2,704 repositories and their content ("Content Types," 2014), and through the OpenDOAR website, it is possible to search or browse for repositories based on criteria such as repository type, content type held, and subject area. 
In addition, the site includes OpenDOAR Search, which uses Google Custom Search to search across repository content ("Tools for Repository Administrators," 2014). Content in OpenDOAR repositories can be as varied as private university scholarly publications or digital collections of music and art from a public library, and includes objects such as journal articles, theses and dissertations, conference papers, software, patents, datasets, learning objects, audio-visual materials, and books ("Content Types," 2014).

Method

This study limited its scope to United States-based institutions and their repositories as listed on the OpenDOAR registry. Of the 328 OpenDOAR institutions listed in February 2014, we randomly selected 50 institutions for study. If individual United States-based institutions listed more than one repository in the directory, we focused our efforts on the first listed repository. In one case, we included both a repository and the consortium of repositories of which it was a member. In May 2014, emails were sent directly to repository administrators with links to the Qualtrics online survey instrument. The survey questions included information about the demographics of the repository and the metadata creation environment. (View a PDF of the survey through the MOspace repository.) Recipients were asked to forward the email to the responsible party if he or she felt someone else was better suited to answer.

Findings and Analysis

The survey contained five sections with 19 major questions and sub-questions that aimed to investigate current practices in relation to metadata choices and practices. Representatives from 23 of the institutions identified (46%) completed the first two sections of the survey, with 19 institutions (38% of the total) completing the entire survey. The responses were collected using the online survey software, Qualtrics, and analyzed further in Excel.

Repository Demographics: System, Content, and Size

Institutions were asked demographics questions about the repository system they used, the content they collected, and the size of the collections. Among the 23 respondents, the most common repository system/software was DSpace, with 43% of survey respondents using it. This finding is consistent with the findings of Li and Banach (2011) in their spring 2010 survey on preservation in academic libraries in North America and with the posted report about all OpenDOAR repositories' usage of repository software ("Usage of Open Access Repository Software," 2014). Twenty-six percent of libraries surveyed used Digital Commons (bepress), with 13% (n=3) using Fedora. The rest of the respondents were using other software, including both commercial and homegrown systems; no respondents reported using EPrints in this study. Five repositories reported using more than one software package (see Table 1).

Table 1: Repository Software or System Used (N=23)
DSpace: 10 (43%)
Digital Commons (bepress): 6 (26%)
Fedora: 3 (13%)
ExLibris DigiTool: 2 (9%)
Hydra: 2 (9%)
Islandora: 1 (4%)
Omeka: 1 (4%)
ContentDM: 1 (4%)
Locally developed software: 2 (9%)
Other: 2 (9%)

Content made available in these open repositories was varied, with 78% making individual articles, student projects, and/or images available. Seventy-four percent made photographs available, and 65% made electronic theses and dissertations of some kind available. Sixty-one percent made reports, digitized books, video, journals, or presentations available (see Table 2).
These responses are consistent with the data reported for the entirety of the OpenDOAR repositories as listed on the OpenDOAR website, with journal articles and theses and dissertations listed as the most common kinds of content ("Content Types in OpenDOAR Repositories," 2014).

Table 2: Kinds of Content (N=23)
Images: 18 (78%)
Individual articles: 18 (78%)
Student projects: 18 (78%)
Photographs: 17 (74%)
ETDs: 15 (65%)
Presentations: 14 (61%)
Reports: 14 (61%)
Digitized books: 14 (61%)
Video: 14 (61%)
Journals: 14 (61%)
Newspapers: 12 (52%)
Audio: 11 (48%)
White papers: 9 (39%)
Research data/datasets: 8 (35%)
Born digital books: 6 (26%)
Databases: 3 (13%)
Websites: 1 (4%)
Other: government documents: 1 (4%)
Other: university archive items: 1 (4%)
Other: collective bargaining agreements: 1 (4%)

Overall, the 23 repositories surveyed held between two and 15 different kinds of content. The repository with only two kinds of content held individual articles and ETDs (see Figure 1).

Figure 1: Kinds of Content, by Repository

Finally, the repositories varied in size. Twenty-two respondents provided information about the extent of their digital collections. Thirty-two percent (n=7) held between 500 and 4,999 digital objects, with 23% (n=5) holding 5,000-9,999 and also 23% (n=5) holding 10,000-19,999 digital objects. Only one respondent held 100,000-1,000,000 digital objects, the highest of all the respondents. Given the variety of content mentioned in the previous question, it seems reasonable that smaller collections contain fewer content types. When plotted, the trendline generally confirms that collections with fewer kinds of content have smaller numbers of digital objects (see Figure 2).

Figure 2: Digital Objects in the Collection by Kinds of Content

Metadata Practices

Respondents also supplied information about the metadata schema and controlled vocabularies used in their repositories. Although the question of encoding schema was followed directly by the question about controlled vocabularies in use, all 23 respondents answered the schema question, but only 17 supplied information about the controlled vocabularies. In terms of metadata schema being used, many of the 23 respondents selected more than one schema. The greatest number of respondents by far used Dublin Core (n=12; 52%) or Qualified Dublin Core (QDC) (n=11; 48%). Metadata Object Description Schema (MODS) use (n=6; 26%) beat out MAchine-Readable Cataloging (MARC) use (n=4; 17%) in the repositories, and a variety of other schema were mentioned, each with only one or two repositories reporting their use. One respondent reported being unaware of schema being used (4%). The high use of standards such as Dublin Core ensures the interoperability of repository content. One cannot be overly optimistic with these results, however. Park reported in 2006 that the ambiguity of Dublin Core metadata elements can hamper consistency in their use. She noted that, "This in turn has great potential to hinder semantic interoperability" (Park, 2006, p. 32).
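In practice, the interoperability that Dublin Core affords is most visible through OAI-PMH, the harvesting protocol that platforms such as DSpace and Fedora expose by default. A minimal Python sketch of pulling one page of unqualified Dublin Core (oai_dc) records from a repository; the endpoint URL here is hypothetical:

import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def list_dc_titles(base_url):
    # Fetch one page of oai_dc records and yield their dc:title values.
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    with urlopen(base_url + "?" + query) as response:
        tree = ET.parse(response)
    for record in tree.iter(OAI + "record"):
        for title in record.iter(DC + "title"):
            yield title.text

# Hypothetical repository endpoint.
for title in list_dc_titles("https://repository.example.edu/oai/request"):
    print(title)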
Library-centric encoding schema like MARC are less used in these repositories, yet the library-based Library of Congress Subject Headings (LCSH) is the most common controlled vocabulary, used by 88% (n=15) of the seventeen respondents. Other library-based vocabularies used are the Library of Congress Name Authority File (NAF) access points (n=4; 24%); Medical Subject Headings (MeSH) (n=2; 12%); and Library of Congress Genre/Form Terms (LCGFT) (n=2; 12%). Controlled vocabularies maintained by the Getty were also mentioned, though not frequently, with 18% (n=3) using the Getty's Art & Architecture Thesaurus (AAT) and 12% (n=2) using the Thesaurus for Graphic Materials (TGM); only one respondent (6%) reported using the Getty Thesaurus of Geographic Names (TGN). As mentioned, six survey respondents chose not to answer the question about controlled vocabularies, and one indicated that no controlled vocabularies are used (6%). In the results as we present them, we interpret respondents' lack of response as a result of their inability to speak to specifics of controlled vocabularies in use and have calculated the responses out of 17; we acknowledge that other explanations, including that no controlled vocabularies are in use, are also possible.
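Several of these vocabularies are published as linked data at id.loc.gov, one of the resources respondents reported using (see Table 5 below), which makes headings verifiable programmatically. A minimal Python sketch against the site's term-suggestion service; the endpoint and its OpenSearch-style response format are assumptions to verify against the current id.loc.gov documentation:

import json
from urllib.parse import quote
from urllib.request import urlopen

def suggest_lcsh(term):
    # Ask id.loc.gov for LCSH headings whose labels match the term.
    url = ("https://id.loc.gov/authorities/subjects/suggest/?q="
           + quote(term))
    with urlopen(url) as response:
        # OpenSearch Suggestions format: [query, labels, notes, uris]
        _query, labels, _notes, uris = json.load(response)
    return list(zip(labels, uris))

for label, uri in suggest_lcsh("Institutional repositories"):
    print(label, uri)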
Metadata Creation Environments

On the topic of the metadata creation environment, the survey asked respondents about staff involved in the creation of metadata for their repository and the tools and resources used. Nineteen respondents provided information about who inputs or loads metadata, creates descriptive metadata, creates administrative metadata, and reviews metadata. Overall, professional librarians with a master's level degree did the majority of the work. These librarians created descriptive metadata at 16 out of 19 institutions; they created administrative metadata at 14 institutions; and reviewed metadata at 15 institutions. Paraprofessional staff were the next most common group contributing to the repository's metadata, followed by administrators and department heads (see Table 3).

Table 3: Creating and Reviewing Metadata (N=19)
Librarian (master's level): creates descriptive metadata 16 (84%); creates administrative metadata 14 (74%); reviews metadata 15 (79%)
Paraprofessional: descriptive 10 (53%); administrative 3 (16%); reviews 6 (32%)
Administrator (outside department): descriptive 7 (37%); administrative 3 (16%); reviews 3 (16%)
Department head: descriptive 3 (16%); administrative 3 (16%); reviews 4 (21%)
Subject specialist: descriptive 4 (21%); administrative 2 (11%); reviews 3 (16%)
Student worker: descriptive 4 (21%); administrative 2 (11%); reviews 0 (0%)
Volunteer: descriptive 2 (11%); administrative 1 (5%); reviews 1 (5%)
IT: descriptive 0 (0%); administrative 2 (11%); reviews 2 (11%)

These standards are applied based on documentation and specialized resources. Eighteen respondents supplied information about the resources used in their repository work, with many choosing more than one. In responding to the question about best practices, the majority of respondents reported using homegrown best practices (11 respondents; 61%), though 6 did not mention any best practices documentation (see Table 4).

Table 4: Best Practices Used (N=18)
Best practices: homegrown: 11 (61%)
Best practices: Resource Description and Access (RDA): 4 (22%)
Best practices: Western States/CDP Dublin Core Metadata Best Practices: 1 (6%)
Best practices: other: 1 (6%)
None mentioned: 6 (33%)

Most repositories used one set of best practices, with only four respondents using more than one set of best practices (see Figure 3).

Figure 3: Number of Best Practices Used by Repositories

Best practices documentation is not the only way that quality is ensured. Other resources respondents used included OCLC Connexion (n=8; 44%) and the oXygen XML editor and the RDA Toolkit (n=4; 22% each). Five of the 18 respondents did not indicate any additional tools or resources (see Table 5).

Table 5: Other Resources Used in Metadata Creation (N=18)
OCLC Connexion: 8 (44%)
oXygen XML editor: 4 (22%)
RDA Toolkit: 4 (22%)
Cataloger's Desktop: 3 (17%)
Classification Web: 3 (17%)
MARCedit: 3 (17%)
Virtual International Authority File (VIAF): 3 (17%)
id.loc.gov: 2 (11%)
DublinCore Generator.com: 1 (6%)
ORCID: 1 (6%)
None mentioned: 5 (28%)

Discussion

Having content available in open repositories is a first step toward ensuring the content's use, especially in a federated environment such as OpenDOAR. Providing the best conditions for access, including adequate systems and consistent and correct metadata, contributes to the usefulness of content over the long term. In general, the OpenDOAR repositories surveyed in this study are using known library standards, including Dublin Core, MODS, MARC, and LCSH. The use of some of these standards reflects the use of repository software systems that have been built with these standards as default options. Those creating and/or reviewing the metadata are primarily librarians and paraprofessional staff.

A strength of traditional cataloging is the use of shared standards for descriptive and subject cataloging. Libraries worldwide participate in creating and using bibliographic records in WorldCat. OCLC reports that 72,000 libraries are represented in WorldCat ("A Global Library Resource," 2014). As described on the OCLC website, "OCLC members harness the collective energy and innovation of library world to share collections, metadata, best practices and expertise" ("The Value of Cooperative Cataloging," 2014). From this survey, we conclude that this collective energy is yet to be realized in digital repositories. Hillmann (2008) discusses the differences between traditional cataloging and work done in digital libraries. She observes that in digital libraries, "few communities of practice have been able to define their needs as a community" (p. 68). The report of the types of content in the digital repositories in this survey shows that many are unique items, including electronic theses and dissertations (ETDs), student projects, white papers, and research data and datasets. It is likely that many of the photographs and images are unique to the reporting repository, too. Given that repositories are not sharing records, the need for shared best practices does not occur at the repository level, and the need for a community of practice may not be perceived as important. On the other hand, a large number of respondents are using controlled vocabularies with at least some of their repository material.

The size of repositories may impact this perception about best practices, too. In comparison to the traditional holdings of large academic libraries, digital repositories are still small. Thirty-seven percent of respondents answering the question indicated that their repositories held fewer than 5,000 items, and 82% reported fewer than 20,000 items. The smaller repositories also reported having fewer kinds of content. Stvilia and Gasser (2008) observe that large repositories may receive greater use than smaller repositories and their need for quality metadata may be greater, but, in what they call the "cycle of diminishing returns," larger repositories may have "greater difficulty in providing those metadata with limited resources as the metadata collection continues to grow and becomes increasingly diverse" (p. 67).
Repositories, although essential in the provision of information, especially unique content, are still working through the challenges of providing access to their content based on their own unique environments, cultures, and resources.

Conclusion

The usefulness of metadata is dependent on many factors, including system functionality, the encoding of metadata for machine manipulation, and the quality of the metadata. In this study we gathered information on systems used, metadata encoding schemes and elements that impact metadata quality, including the level of staff creating it and best practices resources in use, in an effort to describe metadata practices. Repository operations surveyed were drawn from those registered with OpenDOAR, a directory of vetted repositories adhering to practices of openness. Having an understanding of current practices in this environment provides insight into larger questions of access and interoperability.

Acknowledgements

This research was funded by a grant from the University of Missouri Richard Wallace Faculty Incentive.

References

[1] About OpenDOAR. (2014). OpenDOAR.
[2] Content Types in OpenDOAR Repositories — Worldwide. (2014). OpenDOAR.
[3] Cullen, R., & Chawner, B. (2011). Institutional repositories, open access, and scholarly communication: A study of conflicting paradigms. The Journal of Academic Librarianship, 37(6), 460-470. http://doi.org/10.1016/j.acalib.2011.07.002
[4] Gasaway, L. N. (2010). Libraries, digital content, and copyright. Vanderbilt Journal of Entertainment & Technology Law, 12(4), 755-778.
[5] A global library resource. (2014). OCLC.
[6] Hillmann, D. I. (2008). Metadata quality: From evaluation to augmentation. Cataloging & Classification Quarterly, 46(1), 65-80. http://doi.org/10.1080/01639370802183008
[7] Jacso, P. (2006). GeoScienceWorld, OpenDOAR, and Enciclopedia Estudiantil Hallazgos. Online, 30(3), 52-54.
[8] Li, Y., & Banach, M. (2011). Institutional repositories and digital preservation: Assessing current practices at research libraries. D-Lib Magazine, 17(5/6). http://doi.org/10.1045/may2011-yuanli
[9] OpenDOAR or directory of open access repositories. (2005). Information Services & Use, 25(2), 109-111.
[10] OpenDOAR: open access to research information. (2006). Library Hi Tech News, 23(3), 20-21.
[11] Park, J.-r. (2006). Semantic interoperability and metadata quality: An analysis of metadata item records for digital image collections. Knowledge Organization, 33(1), 20-34.
[12] Stvilia, B., & Gasser, L. (2008). Value-based metadata quality assessment. Library & Information Science Research, 30, 67-74.
[13] Tools for Repository Administrators. (2014). OpenDOAR.
[14] Usage of Open Access Repository Software — Worldwide. (2014). OpenDOAR.
[15] The value of cooperative cataloging. (2014). OCLC.

About the Authors

Heather Lea Moulaison is Assistant Professor at the iSchool at the University of Missouri. Her research focuses primarily on the intersection of the organization of information and technology and includes the study of issues pertaining to metadata, standards, and digital preservation. An ardent Francophile, Dr. Moulaison is also interested in international aspects of access to information.

Felicity Dykas is Head of the Digital Services Department at the University of Missouri. Previous positions have included Head of the Catalog Department and electronic resources librarian.
She works with the university institutional repository and digital library and has additional interests in metadata standards, organizational systems for online resources, and the preservation of print and digital material.

Kristen Gallant is a graduate student at the University of Missouri's iSchool. Ms. Gallant holds a master of arts in Art History and is interested in digital resources and their metadata.

(On March 23, 2015 a correction was made to the caption of Figure 3.)

Copyright © 2015 Heather Lea Moulaison, Felicity Dykas and Kristen Gallant

work_hz3bicvfavfdvfxc4smydbwhd4 ----

The Web Archives Workbench (WAW) Tool Suite: Taking an Archival Approach to the Preservation of Web Content

Patricia Hswe, Joanne Kaczmarek, Leah Houser, and Janet Eke

LIBRARY TRENDS, Vol. 57, No. 3, Winter 2009 ("The Library of Congress National Digital Information Infrastructure and Preservation Program," edited by Patricia Cruse and Beth Sandore), pp. 442–460. (c) 2009 The Board of Trustees, University of Illinois. http://hdl.handle.net/2142/13586

Abstract

The ECHO DEPository (also known as ECHO DEP, an abbreviation for Exploring Collaborations to Harvest Objects in a Digital Environment for Preservation) is an NDIIPP-partner project led by the University of Illinois at Urbana-Champaign in collaboration with OCLC and a consortium of partners, including five state libraries and archives. A core deliverable of the project's first phase was OCLC's development of the Web Archives Workbench (WAW), an open-source suite of Web archiving tools for identifying, describing, and harvesting Web-based content for ingestion into an external digital repository. Released in October 2007, the suite is designed to bridge the gap between manual selection and automated capture based on the "Arizona Model," which applies a traditional aggregate-based archival approach to Web archiving. Aggregate-based archiving refers to archiving items by group or in series, rather than individually. Core functionality of the suite includes the ability to identify Web content of potential interest through crawls of "seed" URLs and the domains they link to; tools for creating and managing metadata for association with harvested objects; website structural analysis and visualization to aid human content selection decisions; and packaging using a PREMIS-based METS profile developed by the ECHO DEPository to support easier ingestion into multiple repositories. This article provides background on the Arizona Model; an overview of how the tools work and their technical implementation; and a brief summary of user feedback from testing and implementing the tools.

The Web Archiving Problem

The Ubiquitous Web

For a broad range of organizations, websites are now the delivery mechanism of choice for nearly any type of information content. Much of this content is created and disseminated in electronic formats only, with printed copies considered just a courtesy or convenience. The electronic format environment, while expedient for current access purposes, presents challenges for anyone charged with preserving information over time. These challenges include the sheer volume of Web-published information, traditional issues of selection and description, as well as the technical challenges associated with long-term preservation of digital objects.
The Challenges of Web Archiving

Volume and Selection of Web Content

An immediate challenge of Web archiving is assuring that all content of long-term relevance delivered through the Web is identified and collected (i.e., harvested). Difficulties arise first from the task of selecting pertinent content for preservation from the enormous volume of information streaming from Web servers at any given point in time. Selection decisions will be influenced by the charge of the individual responsible for capturing specific content types (usually a librarian or archivist) based on appraisal or collection development; on policies created in concert with the mission of the institution or organization; and on the audience or user community being served. The sheer volume of content published on the Web makes a fully manual perusal of online resources infeasible and remains a major barrier to collecting content.

The Nature of the Web

The dynamic nature of the Web also creates problems for the selection and harvesting of content. URLs can change overnight; resources can be taken offline with little or no notice; and new, related content can be added in new or different directories than those visited previously by a Web crawler harvesting an organization's website. Although Web crawling automates the archiving of a website, it is quite possible for Web crawlers simply to miss content because of a "robots exclusion protocol" (activated by the website's administrator to make parts of a site "uncrawlable") or because of the impenetrable character of the Deep Web (where content, such as the results page of a Web form, is inaccessible to a Web crawler or Web spider). In addition, the vast extent of the Web renders scalable Web crawling an almost intractable technical challenge. Knowing where to find all content eligible for harvesting according to collection development and appraisal policies becomes nearly impossible without intentional coordination or without Web crawling tools and resources that are designed for, and take account of, the fluid nature of website content and the massive scale of the Web.

The Importance of Context

Context is about understanding relationships between different and discrete pieces of information. It is about understanding why the information was created, by which individual or organization, and at what point in time. Contextual information can help define the boundaries and the scope of harvested content. As with analog objects, much of the usefulness of digital objects, which make up our cultural record, depends on having descriptive and contextual information about them. Once content is identified and harvested, it is necessary to provide access to the digital object. Such content access means that attention should be paid to capturing accurate metadata along with the content itself. This contextual metadata will help describe the origin or "provenance of the resource," as well as why and when it was created. For example, is the discovered resource one in a series of annual reports from a particular state agency? Is it a single publication summarizing research findings? Or does it encompass results from a specific survey taken as part of a larger effort to revamp community services? In the case of a digital object, metadata not only supports human interpretation of content, it is needed to provide crucial technical information for maintaining the long-term viability of the object itself.
An Archival Approach to Web Archiving (the Arizona Model)

Foundational Elements of the Web Archives Workbench

The Web Archives Workbench tool suite is based on the principles of the "Arizona Model," an aggregate-based approach to Web archiving designed to bridge the gap between human selection and automated capture. "Aggregate-based" means that rather than archive items singly, or individually, they are organized (grouped) in series, or in aggregates. The Arizona Model was developed in 2003 by Richard Pearce-Moses of the Arizona State Library and Archives.

Background on the Arizona Model

Most state libraries and archives have mandates to collect state agency publications and make them available to the public. To this end, there are well-established depository systems that have worked with paper publications for many years. In a Web environment the nuances of determining what a publication is, or who is responsible for the selection and collection of particular information resources, become less clear. Nonetheless, to meet these mandates librarians and archivists still must identify, select, acquire, describe, and provide access to state agency information "published" on websites.

In early attempts to develop a collection of state agency electronic publications, two approaches came about. According to Cobb, Pearce-Moses, and Surface (2005), the first approach has its premise in "traditional library processes of selecting documents one by one, identifying appropriate documents for acquisition; electronically downloading the document to a server or printing it to paper; then cataloging, processing, and distributing it like any other paper publication" (175). While this approach ensures that valuable documents will be gathered, its dependence on manual selection will limit archiving to only a very few items. Scaling this process in accordance with the vastness of Web-based documents would necessitate an expansion in personnel that few state libraries have the funding to address (Cobb et al., 2005). Alternatively, in the other approach, software tools that automate regularly occurring Web crawls are engaged. As Cobb, Pearce-Moses, and Surface (2005) assert, this model "trades human selection of significant documents for the hope that full-text indexing and search engines will be able to find documents of lasting value among the clutter of other, ephemeral Web content captured in the process" (176). Yet, while this model relieves librarians and archivists of the upfront onus of selection and organization, at the same time it may unduly burden future searchers, if full-text indexing and search capabilities do not evolve as anticipated.

The Arizona Model, explained in detail below, constitutes a third approach to Web archiving, incorporating both human assessment and automated tools.

An Archival Approach

The Arizona Model applies an archival perspective to curating collections of Web publications. It exploits certain telling parallels between websites and archives: namely, the concept of provenance (i.e., documents classed together stem from the same source) and the organizational structure inherent in both these kinds of collections—directories and subdirectories for websites, and series and subseries for archives (Cobb et al., 2005). In theory, if websites organize Web publications using common file directory structures, information about individual documents within subdirectories could be inherited from parent directories; a sketch of this idea follows.
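A minimal sketch of that inheritance idea, with hypothetical paths and fields: if descriptive metadata is recorded once per directory, an individual document can assemble its description by walking its path from the root down, with deeper (more specific) values overriding higher-level ones:

import posixpath

# Metadata recorded once, at the aggregate (directory) level.
directory_metadata = {
    "/": {"creator": "Arizona State Library and Archives"},
    "/reports": {"series": "Annual reports"},
    "/reports/2004": {"date": "2004"},
}

def inherited_metadata(document_path):
    # Merge metadata from the root down to the document's directory.
    merged = dict(directory_metadata.get("/", {}))
    parts = document_path.strip("/").split("/")[:-1]  # drop the filename
    prefix = "/"
    for part in parts:
        prefix = posixpath.join(prefix, part)
        merged.update(directory_metadata.get(prefix, {}))
    return merged

print(inherited_metadata("/reports/2004/summary.pdf"))
# {'creator': 'Arizona State Library and Archives',
#  'series': 'Annual reports', 'date': '2004'}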
In the Arizona Model, which draws on basic archival practice, websites are handled as hierarchical aggregates rather than as individual items, and the original order of the documents (the order in which the creating agency oversaw them) is maintained. Provenance and original order are considered important contextual pieces of information. Retaining documents in the order in which they were originally managed and keeping them clustered together based on the originating agency enhance one's knowledge of the creation and original use of the documents. Provenance and original order also allow for "inheritance" of higher-level metadata meant to describe the home agency from which the documents came and the way the documents were originally arranged.

Finally, an archival approach to curating a collection of Web documents—focusing first on aggregates (collections and series), rather than on individual documents—trims the number of items that need to be appraised by a human down to a more manageable number.

Arizona Model Summary

The Arizona Model uses a methodology that applies both human selection and automated capture to the archiving of Web content. In this approach, Web materials are managed in a way similar to the organization of materials in paper-based archives: as a hierarchy of aggregates rather than as individual items. This approach reduces the problem of the sheer volume of preserving Web materials to a more manageable size, while maintaining a scalable degree of human involvement. It is the guiding model for OCLC's Web Archives Workbench.

A Tour of the Web Archives Workbench Tool Suite

The Web Archives Workbench Workflow

The Arizona Model is particularly instructive in its evocation of where, in the practice of archival management, automation can be considered most useful. That is, while technology may be applied to information processing activities such as data searching and tracking, and list construction and classification, tasks of distinguishing whether content is in-scope or is valuable are best reserved for humans. Thus, a key deliverable of the ECHO DEPository project, with OCLC as the technical lead, was to develop a suite of tools that would follow the Arizona Model and thus achieve a productive complement between automated processing and human decision making, all the while adhering to established archival principles.

Prior to tool design and development, OCLC carefully considered the user community's needs, which OCLC identified as a blend of librarians and archivists. Significant to its consideration was the issue of terminology: how should tools and features in the Web Archives Workbench be named if a mixed community of librarians and archivists was to serve as its user base? The word series, for example, while familiar to an archivist, might invoke semantics and usage that is different, even unfamiliar, for a librarian. Thus, in exploring the user community, OCLC had archivists look at new types of metadata and asked librarians to think about principles of archiving. Eventually, OCLC elected not to devise new terminology for the concepts at issue; not only did the team conclude that terminology was, in essence, a training matter, it also saw that the work of librarians and archivists often overlaps—that is, each is frequently engaged in the milieu of the other.
The software that OCLC created, the Web Archives Workbench (WAW), comprises five tools to identify, select, describe, and harvest Web-based materials, as well as to keep track of, or log, these activities and to generate reports about them. In doing so, they serve as a conduit between human involvement (via manual selection) and computerized capture of Web content: they convert the archivist's policies for collecting content created on the Web to software-centered rules and configurations. They also assist information professionals by providing the means to add metadata to harvested objects as aggregates. In addition, the tools implement the PREMIS-based METS profiles developed by ECHO DEP at the University of Illinois for packaging content; by design these profiles facilitate ingestion into multiple external repositories and support long-term preservation.1 Packaging is the last step in the WAW workflow, after which the objects are ready for ingest into an external digital repository (Figure 1).

Figure 1. Diagram of the Workflow Encompassed in the Web Archives Workbench

Furthermore, in doing high-level analysis for the user interface, OCLC arrived at several working assumptions that influenced the design of the tool suite. One assumption was that because the tools in the Web Archives Workbench might change over time, they needed to be "aware" of each other and enable the sharing of data, but—as important—the user should have the ability to opt not to use a tool in the Workbench. Through interviews with librarians and archivists, OCLC also learned that harvesting responsibilities often were shared among individuals; as a consequence, data generated by a tool had to be rendered shareable by multiple users—and simultaneously so. This feature would allow a user to view the work of another. In addition, rather than trying to integrate the Workbench into an institution's many authentication schemes, OCLC incorporated a simple scheme, allowing the Workbench to run with just basic administration. In terms of harvesting, OCLC designed more than one harvesting workflow, so that a user could select the appropriate level of analysis and sophistication for a task. For instance, the Quick Harvest feature is a single-screen launch point that runs a harvest immediately. The Analysis tool, which is part of an extended harvesting workflow, requires more set-up, but it results in a bigger "pay-off" in terms of the website change observations it handles automatically for the user.

Finally, where the deposit of harvested information is concerned, OCLC recognized that ingest to a variety of repositories, including its own Digital Archive as well as DSpace repositories, would need to be accommodated. A clean, simple interface was created between the point where the Workbench ends and a repository software application would begin; that is, the Workbench generates harvested packages of content in a file system that the repository then picks up and processes. This is the point in the workflow at which the above-mentioned PREMIS-based METS profiles developed by ECHO DEP are implemented.
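To give a concrete feel for this hand-off point, the following Python sketch emits a bare-bones METS wrapper around a single harvested file. It is only an illustration of the general METS pattern (a fileSec plus a structMap), not the ECHO DEP PREMIS-based profile itself, whose required sections and attributes are defined in the profile documentation; the object identifier and file URL are hypothetical:

import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)

def minimal_mets(object_id, file_url):
    # Build a bare-bones METS wrapper for a single harvested file.
    mets = ET.Element(f"{{{METS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(mets, f"{{{METS}}}fileSec")
    grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp", {"USE": "harvest"})
    f = ET.SubElement(grp, f"{{{METS}}}file", {"ID": "FILE1"})
    ET.SubElement(f, f"{{{METS}}}FLocat",
                  {"LOCTYPE": "URL", f"{{{XLINK}}}href": file_url})
    struct = ET.SubElement(mets, f"{{{METS}}}structMap")
    div = ET.SubElement(struct, f"{{{METS}}}div", {"TYPE": "website"})
    ET.SubElement(div, f"{{{METS}}}fptr", {"FILEID": "FILE1"})
    return ET.tostring(mets, encoding="unicode")

print(minimal_mets("urn:example:harvest-001",
                   "http://www.illinois.gov/news/index.html"))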
A Tour of the Software

The screenshot in figure 2 below displays the main WAW tools screen after the user has logged on. The five tools in the Workbench are the Discovery, Properties, Analysis, Harvest, and System tools. In the screenshot they are exemplified by the topmost row of tabs. Although the Alerts tab sits in this row, it is less a tool than a feature of the Workbench. It enables users to access a collection of reports and alerts for the Discovery, Properties, Analysis, and Harvest Tools. In the interface for the WAW tools, a tab is colored in to signify which tool is open, or active, at that particular moment. In figure 2, for example, the Discovery tab is shaded, because the Discovery tool is currently active. Similarly, the Entry Points tab is shaded, because it is active as a component of the Discovery tool.

Figure 2. Screenshot of the WAW Interface That User Sees After Logging On

A key advantage of the Workbench tools is that harvesting of Web content may be scheduled so that it occurs on a regular basis. However, the Workbench tools also offer users the alternative of running a one-time harvest. This is known as the Quick Harvest, accessible via the Harvest tab. Quick Harvest is addressed briefly in the discussion below of the Harvest tool.

The Discovery Tool: Finding Web Content of Interest

The first step in constructing an archive of Web-based resources is to determine which parts of the Web hold desirable, and thus collection-worthy, content. This step lies at the crux of the Discovery Tool. The Discovery Tool aids in identifying potentially relevant websites by crawling relevant "seed" entry points to generate a list of domains to which the "seed" sites link. (Note: An entry point is a specific website URL where the Discovery Tool will begin to search for domains or collect Web content. A domain is a server on the Internet that may contain Web content and is identified by a high-level address. For example, http://www.illinois.gov/news/ is a website, and its domain is "Illinois.gov." Domains do not include "http://".)

In an approach that effectively borrows from citation analysis, the Discovery Tool is designed on the idea that on-topic sites likely point to other sites addressing a similar topic. The domains in the generated list are then manually evaluated as in-scope or out-of-scope, based on subject interest and collecting policies. Figure 3 shows a list of domains returned after entry points have been crawled, as well as radio buttons that note the scope for each domain. This process results in a list of domains defining a subset of the Web that is relevant for the user's archiving purposes. Domains marked as in-scope can be associated with an Entity (i.e., the creator, agency, or organization responsible for the Web content). Later, in the Properties and Analysis Tools, metadata associated with entities (creators such as agencies or organizations) can be inherited by content harvested from a particular website.

Figure 3. Screenshot of the Interface for the Domains Feature of the Discovery Tool

In short, the Discovery Tool is used to:
• generate a list of potentially relevant domains by crawling seed sites (a stripped-down sketch of this step follows the list);
• assign domains as in-scope or out-of-scope;
• add domains manually to the Domains list;
• associate domains with entities (creating agencies or organizations).
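The sketch promised above, in Python: fetch a single seed page, collect its outbound links, and reduce them to the distinct domains they point to, as candidates for in-scope/out-of-scope review. It consults robots.txt first, since, as noted earlier, the robots exclusion protocol can make parts of a site uncrawlable; a production crawler would do far more (queueing, depth limits, politeness delays):

from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
import urllib.robotparser

class LinkCollector(HTMLParser):
    # Collect href values from anchor tags.
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def discover_domains(seed_url):
    # Return the distinct domains that a seed page links to.
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(urljoin(seed_url, "/robots.txt"))
    robots.read()
    if not robots.can_fetch("*", seed_url):
        return set()
    parser = LinkCollector()
    with urlopen(seed_url) as response:
        parser.feed(response.read().decode("utf-8", errors="replace"))
    return {urlparse(urljoin(seed_url, link)).netloc
            for link in parser.links}

# Using the example entry point from the text above.
for domain in sorted(discover_domains("http://www.illinois.gov/news/")):
    print(domain)  # candidate domains for in-scope/out-of-scope review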
The Properties Tool: Entering Metadata to Describe Content Creators (Entities)

Another premise of the Arizona Model is that, as much as possible, metadata should be entered only once and be inherited by associated harvested objects. After the Entry Points and Domain features of the Discovery Tool are run, and entities (i.e., content creators) have been associated with domains, metadata about the resulting entities may be entered via the Properties Tool. Besides enabling the management of information about entities, the Properties Tool also allows the user to describe the relationships (e.g., parent/child) of entities with one another, as well as enter other information such as contact information.

Importantly, the Properties Tool can also be easily engaged to create analyses and series from entities' websites. The purpose of enabling analysis of a website is to examine its structure—that is, the directories that make up the website. (For more on the Analysis Tool, see below.)

The Properties Tool is used to:
• create and manage a list of content creators (entities);
• assign metadata and other properties to entities;
• specify websites that entities are responsible for, and create analyses and series based on those websites.

The Analysis Tool: Visualizing the Structure of a Website

Through the Analysis Tool it is possible to discern whether there is valuable content in the directories that comprise a website and, if so, to identify those chunks of content (a simplified sketch of this directory-structure analysis appears at the end of this section). "Series" refers to flexible aggregates of content that are analogous to archival series; a series may be a whole website or a portion of it (e.g., only PDFs of annual reports), or even one individual page or document from a website. Loosely defined, a series is any collection of Web material that a user chooses to collect in one "bucket." In addition, series are used to drive the Workbench harvest operations. While series may be established within the Properties Tool, they can also be established and managed using the Analysis Tool, then harvested and packaged in the Harvest Tool.

The Analysis Tool has two functional areas:
• The Analysis screen, which provides visualization tools to aid in content selection decision-making and in series structure decisions. Here, too, a baseline analysis can be created against which to measure future website analyses.
• The Series screen, where series are created, edited, and managed; Series objects are kept; and Series harvests are regulated.

The Analysis Tool is used to:
• analyze the structure of a website;
• enter associated entities;
• set a baseline analysis for comparison with future analyses;
• adjust settings, such as spider settings and change notification threshold settings;
• define a "series" for harvesting (e.g., harvest as an individual object), with the option to associate it with an entity;
• hold series objects prior to harvest;
• schedule harvests of series.

In addition, operations for holding series objects and harvesting them may be accessed via the Properties Tool.
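The simplified sketch mentioned above: folding a list of URLs from a crawl into a nested tree whose branches are candidate series boundaries approximates what the Analysis screen visualizes. The URLs here are hypothetical:

from urllib.parse import urlparse

def build_tree(urls):
    # Fold URL paths into a nested dict mirroring site directory structure.
    tree = {}
    for url in urls:
        node = tree
        for part in urlparse(url).path.strip("/").split("/"):
            if part:
                node = node.setdefault(part, {})
    return tree

def show(node, indent=0):
    # Print the tree with indentation reflecting directory depth.
    for name, child in sorted(node.items()):
        print("  " * indent + name)
        show(child, indent + 1)

urls = [
    "http://agency.example.gov/reports/2008/annual.pdf",
    "http://agency.example.gov/reports/2009/annual.pdf",
    "http://agency.example.gov/press/releases.html",
]
show(build_tree(urls))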
The Harvest Tool: Reviewing, Packaging, and Ingesting Harvested Content

All the harvests in the Workbench, including series harvests (via the Analysis Tool) and quick harvests, are listed in the Harvest Tool. The Harvest Tool is used to monitor the status of harvests and to provide an opportunity to review and modify the harvest before packaging it up and ingesting it into a repository. There may be single-object harvests or multiple-object harvests, depending on whether the option to harvest content as individual objects was selected in the Series details screen of an Analysis-based Series (i.e., in the Analysis Tool). The Quick Harvest feature schedules one-time harvests of content based on a URL inputted directly into the Harvest Tool.

After harvests are complete they may be reviewed, at which time additional metadata may be assigned. The user can render, or display, the harvested content within the WAW tool from the Harvest Results page. The user can actually "step into" the harvested content at both the harvest starting point and at any other point in the website (via the website file structure display), and the software will render the website appropriately. The purpose of the display feature in the Web Archives Workbench is to allow the user to verify the correctness of what was harvested, "correctness" meaning that all the information expected to have been collected is collected. Once the harvested content is confirmed as correct, it can then be ingested into the user's local repository. Display of content for end users should occur in the local repository. OCLC did not want to duplicate the functions of a repository as part of this project, and it realized that institutions would already have a significant investment in the repository of their choice. This way, users can leverage their existing repository software, with its existing indexing, collection organization, metadata and display functions, and operational (back-up) procedures. The actual code for a repository to display a Web document was not included in the scope of this project.

In sum, the Harvest Tool is used to:
• monitor the status of harvests scheduled in the Analysis tool;
• delete completed harvests;
• review completed harvest content, whether single-object or multi-object, prior to ingest;
• review completed harvests and, if desired, edit metadata and/or include/exclude content;
• ingest harvested content into a repository;
• launch a one-time quick harvest using the Quick Harvest Tool.

The Alerts Tab: Workbench Notifications

As mentioned above, the Alerts tab is not a tool but, rather, a feature for notifying the user of a variety of systems information. This information includes notification about errors, incomplete processes, completed processes, and new information such as the discovery of a new domain or a new folder encountered during analysis. In short, the Alerts tab is used to review reports and alerts about Workbench functions.

The System Tools: Monitoring and Managing Workbench Activities

The System Tools tab contains a number of behind-the-scenes functions that affect and report on activities of the five main tools of the Workbench. The System Tools are divided into four functional areas:
• The Audit Log page, which displays recent Workbench activities and events;
• The Spider Settings page, where the user can configure default Domain, Analysis, and Harvest spider settings, as well as create additional Domain, Analysis, and Harvest spiders with custom settings. Types of spider settings include, but are not limited to, depth (how deeply a website should be crawled, or spidered) and parameters of time (when, how frequently, and for how long);
• The Import/Export page, through which the user can import or export a variety of metadata commonly used in the Workbench, including entities, domains, and subject headings;
• The Reports page, which generates printable reports on activities of the five main Workbench tools. It offers a view of in-development entity and series reports.
Web Archives Workbench Tools Summary
The Web Archives Workbench implements an archival approach to the selection and preservation of digital Web-based content. The Workbench automates much of the methodology embraced by the Arizona Model, particularly beyond the initial selection decisions made by the archivist (e.g., deciding at the start of the archiving process which website, or which part of a website, to capture and preserve). After selection parameters are set, the Workbench facilitates the capture and management of the digital materials in hierarchical aggregates, not unlike the archiving of print-based materials (see Table 1).

Table 1. Tools Summary

Discovery Tool: Comprising the Entry Points and Domains tabs, the Discovery Tool helps to identify potentially relevant websites by crawling relevant "seed" Entry Points to generate a list of domains that they link to. At the end of this process, users have a list of domains that defines the subset of the Web relevant for their archiving purposes. From here, the Properties and Analysis Tools are used to manage creator information about domains and associate this information with harvests of content.

Properties Tool: Comprising the Entities tab, the Properties Tool is used to maintain information about content creators or "Entities" (e.g., government agencies), and to associate them with the domains and websites they are responsible for. The Properties Tool also allows users to describe the relationships (e.g., parent/child) of Entities with one another, as well as to enter high-level metadata about them that may be inherited by content harvested from their websites. Importantly, the Properties Tool can also be used to create Series and associate them with Entities' websites. Series and harvests are then further managed using the Analysis and Harvest Tools.

Analysis Tool: Comprising the Analysis tab and the Series tab, the Analysis Tool provides website structure visualization tools to aid content selection decisions, and allows users to define archival Series, associate metadata with these Series, and schedule recurring harvests of Web content. Harvesting activities are then monitored and managed in the Harvest Tool.

Harvest Tool: Comprising the Harvester and Quick Harvest tabs, the Harvest Tool lists all harvests within the Workbench, including Series harvests scheduled using the Analysis Tool as well as Quick Harvests. It is used to monitor their status and to initiate the final harvesting and ingest steps for completed harvests, including reviewing harvest contents and metadata before ingest. This is the final step in the Web Archives Workbench workflow. It also offers a separate Quick Harvest feature.

System Tools: The System Tools manage and monitor Workbench activities, reporting on operations undertaken in the four other tools. They have four functional sections: an Audit Log page (shows recent Workbench activities); a Spider Settings page (parameters for spidering may be set here); an Import/Export page (for moving metadata); and a Reports page (for producing printable reports about activities performed by the other tools).
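To make the relationships among the objects in Table 1 concrete, here is a minimal data-model sketch. The class and field names are hypothetical: the article describes Entities, Series, and harvests and how they relate, but not WAW's internal schema (the actual system, as described in the next section, was implemented in Java with an Oracle database).

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Entity:
    """A content creator, e.g., a government agency (Properties Tool)."""
    name: str
    parent: Optional["Entity"] = None             # parent/child relationships
    metadata: dict = field(default_factory=dict)  # may be inherited by harvested content
    websites: list = field(default_factory=list)  # sites the entity is responsible for

@dataclass
class Series:
    """A flexible aggregate of Web content, analogous to an archival series."""
    name: str
    entity: Optional[Entity] = None               # optional association with a creator
    urls: list = field(default_factory=list)      # a whole site, one directory, or one page
    harvest_as_individual_objects: bool = False

@dataclass
class Harvest:
    """One scheduled or completed capture of a series (Harvest Tool)."""
    series: Series
    status: str = "scheduled"  # monitored until reviewed and ingested
```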
Behind the Scenes: OCLC's Technical Implementation of the Web Archives Workbench
As an ISO 9001 company, OCLC maintains an externally audited quality system, based on the requirements of that standard, as an aid for ensuring that products meet user expectations and specified requirements. OCLC's project development lifecycle is a process that specifies how OCLC services are marketed and developed. This process includes lifecycle documents such as project plans, requirements, design, test plans, operations support plans, and post-project reviews. The Web Archives Workbench program followed this lifecycle.

OCLC software development teams are free to follow different methodologies within the framework of the OCLC lifecycle. The WAW development team used the Dynamic Systems Development Method (DSDM), a comprehensive framework for agile project delivery (http://www.dsdm.org). The DSDM methodology applied to many parts of the project, including the requirements-gathering approach, requirements prioritization, and task scheduling. The core development team consisted of four to six developers, two product managers, and one test analyst. Supporting this team within OCLC were systems and network engineers, quality assurance staff, operations staff, and other groups. UIUC provided project management, requirements input, documentation, engineering support, and test beds for the METS-based inter-repository data exchanges.

The WAW program was divided into three main projects and many smaller releases in order to reduce risk and to create a feedback loop allowing refinement of the requirements based on previous releases. There were three major software releases, plus approximately twenty additional releases over the course of the three-year program. The three main development projects were based on the main areas of functionality of the tool suite: (1) Domain and Entity, (2) Analysis and Packager, and (3) Site Analysis and Change Management. Though the Domain and Entity features in WAW were functionally fairly simple, the Domain and Entity project carried a significant amount of risk because it built the technical foundation on which the rest of the project would rest. The Site Analysis and Change Management tools were risky due to the usability issues involved in clearly representing to the user the process of harvesting and evaluating changes to websites. Throughout the project one of our main concerns was how to represent the Arizona Model in a clear and usable way in software.

Based on early discussions, the system began to be seen as a "workbench," into which components and systems would be incorporated, and from which they might be dropped, over time: perhaps because users would prefer to apply some of their local tools, or perhaps because they would have multiple tools for a given task. Additionally, each component's data quality would grow over time, requiring the rest of the system to adapt easily to evolving specifications and data versions. Therefore, the architecture is designed for location, interface, and data-exchange transparencies, which means that changes in those three main areas are expected to drive all other system characteristics.

The high-level technical architecture of the system was specified using the Reference Model—Open Distributed Processing (RM-ODP). This framework uses various views of a system, including a domain model view, an information view, an application view, and a technology and deployment view.2 Using this framework, OCLC created the following early domain model of the system (see fig. 4). Some of the boxes in this domain model were later removed from the requirements, as our understanding of the system to be built changed over time.

Figure 4. Diagram showing OCLC's early domain model of the system that eventually developed into the WAW suite of tools
The architecture consisted of several layers: client, integration, service, and persistence. The client layer consisted of a user interface implemented using the Struts framework, which gave a model-view-controller structure to the code. The second layer was a Web services layer that provided the hooks for a client to talk to the application. This layer also provided integration between tools and translation between the internal and external representations of the data. Each WAW tool under development (Entity, Analysis, Domain, etc.) implemented a consistent Helper API to allow the user interface layer to Add/Update/Delete/Search single or multiple objects. An Oracle database provided the persistence layer.

Once the high-level design was complete, a detailed design was produced for each tool. OCLC created use cases for all main activities in each of the tools. The OCLC lifecycle requires formalized review and sign-off of requirements and design documents. Following DSDM meant that detailed requirements and design were produced as needed before each implementation time box started, as opposed to the traditional waterfall software development approach, where all requirements are written, then all designs produced, then all coding completed.

Each developer worked in his own sandbox, where a WAW interface was set up for his exclusive use. The work of multiple developers was integrated into a development test environment called Baseline. This way, product managers and test analysts could review work in progress in Baseline. When Baseline was ready it was migrated into a quality assurance environment, where formalized testing was done against a test plan. For major installs, Baseline was also installed at UIUC for additional testing. The final step of the development process was to deploy the software into a production environment.

The Web Archives Workbench was released as an open-source package on SourceForge in October 2007. Release documentation includes detailed installation instructions and a detailed user guide for understanding and using the tools.
• WAW Release home page: https://sourceforge.net/projects/webarchivwkbnch/
• Administration Guide: https://sourceforge.net/project/showfiles.php?group_id=205495
• User Guide: http://is.gd/gKlz
• WAW software package: http://webarchivwkbnch.cvs.sourceforge.net/webarchivwkbnch/webarchivwkbnch/
The Administration Guide lists the runtime environment requirements for WAW. It also lists all third-party software used by WAW in the incorporated-code section of the document. The third-party software is included in the WAW distribution. An OCLC subscription is not required to use WAW or to use this third-party software.

The WAW tools, as developed by this project, will continue to be made publicly available indefinitely through SourceForge. In addition, in 2008 OCLC released a new array of services incorporating components of the WAW tools into a workflow with CONTENTdm, WorldCat, and the OCLC Digital Archive.
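Before turning to user feedback, one detail of the architecture above is worth making concrete: the consistent per-tool Helper API. The sketch below expresses that add/update/delete/search contract in Python purely for illustration; the actual Workbench was Java code behind a Struts interface, and the method names and signatures here are assumptions, not OCLC's API.

```python
from typing import Iterable, List, Protocol, TypeVar

T = TypeVar("T")

class Helper(Protocol[T]):
    """The CRUD contract each tool (Entity, Analysis, Domain, etc.)
    exposed to the user-interface layer. Names and signatures are
    illustrative assumptions."""

    def add(self, objects: Iterable[T]) -> List[T]: ...
    def update(self, objects: Iterable[T]) -> List[T]: ...
    def delete(self, object_ids: Iterable[str]) -> None: ...
    def search(self, **criteria) -> List[T]: ...
```

Sharing one contract across tools is what let the user interface treat single and multiple objects uniformly while each tool evolved its own data quality behind the interface.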
Findings—User Feedback
Testing of the WAW tools was undertaken in varying degrees by the original project content partners, as well as by several volunteer organizations. Feedback about their experiences working with the tools was gathered during large-group project meetings at OCLC, as well as through phone conversations and e-mail exchanges. The overall response indicates that the Web archiving approach of the WAW tools was "elegant" and worth consideration, but in practice content partners generally did not implement the full functionality of the tools. Thus, the potential benefits of applying an archival approach to the Web were not realized completely. Reasons for this partial implementation have to do with the inadequate resources and time available for training in the proper use of the tools, which also points to their complexity. The Web Archives Workbench is powerful and extensive in terms of Web harvesting and content (series) analysis but, according to the feedback from our content partners, this power comes at a cost in intuitiveness and usability. Not surprisingly, the Quick Harvest functionality was engaged most often; for some, the Quick Harvest feature became a much-valued component of their daily workflows. Changes in content delivery approaches, such as the shift from static Web pages to database-driven pages, constituted another reason for not applying the full functionality of the tools.

Limited Resources and Limited Time
During their participation in the ECHO DEPository project, state library and archives partners remained under continual operational pressure to respond to the need for capturing content from agency websites. Some partners tested the WAW tools while continuing to use other Web content capture approaches in order to meet their immediate obligations, leaving fewer resources to focus on the WAW tools. Because the tools were still under development, testing of the various phased releases may also have been difficult to incorporate into daily workflows. Support from the project, in the form of interns, had been planned but was geared to the early releases of the Workbench, before the full functionality of the tools was implemented. In hindsight, putting project resources toward direct work with content partners, as originally intended, might have resulted in more use of the full functionality of the tools, especially if timed to coincide with later, more fully functional software releases.

Complexity of the Tools
According to user feedback, the Quick Harvest and Discovery tools were the easiest to use, because they could be set up quickly and incorporated into existing workflows without increasing the need for new resources. The full functionality of the tools involves understanding a process with a greater level of complexity than that presented by the Quick Harvest option. Partners reported that it was easier to use the Quick Harvest and Discovery tools than to expend time and resources on learning, or testing, the tool suite as a whole. Further, some content partners reported that the complicated interface of the tools was a barrier to using them to their fullest potential.

Web Content Delivery
The assumption proposed by the archival model, that a website and its directories are similar to an archival record collection and a set of record series, does not apply today as easily as it did when the model was first proposed in 2003. An increasing amount of content is now delivered through database-driven websites rather than through static Web pages. The relationships between content items that may have been obvious when stored in a file directory are not always apparent when stored in a database.
Therefore, crawling domains to find potential content to harvest and applying inherited metadata according to a directory structure are now less useful approaches than they were just a few years ago. Despite this shift in how information is delivered via websites, the concept of content inheriting metadata from previously harvested content, and then associating that content with an existing aggregate collection, continues to be useful for making automated harvest processes more effective.

Conclusions and Next Steps
State librarians and archivists continue to search for the best methods for capturing Web content based on their specific mandates and the resources they have available to them. Recent developments in Web archiving services and tools provide new opportunities for partnering with others and for exploring new workflows. The Web Archives Workbench tools are one option among many. They automate the methodology prescribed by the Arizona Model, which is premised on key archival practices such as observation of provenance and adherence to original order. The four main tools (Discovery, Properties, Analysis, and Harvest) enable the identification, selection, description, and packaging of digital content. In addition, the WAW suite includes functionality for error notification, as well as system tools for overseeing and reviewing Workbench activities. The lessons learned from developing the Workbench, and the underlying archival model used to direct its development, underscore the merging roles and responsibilities of archivists and librarians in the digital environment and the need to re-evaluate and re-envision workflows.

Moreover, the continuing mission and significance of this work have been affirmed in the second phase of NDIIPP. The University of Illinois, OCLC, and the University of Maryland have partnered to develop a stand-alone, open-source metadata extraction tool intended to provide access to archived content, a kind of next step for the Web Archives Workbench. In addition, in the State Initiatives component of NDIIPP, a selection of state libraries across the nation are collaborating to develop tools and service models for the management and preservation of state government digital materials. These projects address digital preservation in a variety of contexts, including disaster readiness and the recovery of data. Through the State Initiatives work, NDIIPP is addressing the fundamental issue of keeping at-risk state government resources viable as part of our national heritage and record.

Notes
1. Two METS profiles developed by ECHO DEP are at work here: the ECHO DEP Generic METS Profile for Preservation and Digital Repository Interoperability (2005) and the ECHO DEP METS Profile for Web Site Captures (2006). The former is the "top level" format-generic profile, which focuses on implementing PREMIS. The latter, a Web capture profile, is an example of a "sub-profile," which is used with the first one to provide a structure for more format-specific information.
2. In RM-ODP the architecture of a system is described by five views (essentially five different points of view) reflecting the separation of responsibilities between business sponsors, developers, and support staff.
Those views are:
• Enterprise—community, enterprise objects (domain model), objectives (requirements/use cases), roles
• Information—schemas, object attributes, data boundaries, constraints, semantics
• Computational—components, interfaces, interactions, contracts
• Engineering—transparencies (location, access, failure, persistence), nodes, channels
• Technology—technologies and products (the only dependence on specific products and implementation packages)

References
Cobb, J., Pearce-Moses, R., & Surface, T. (2005). ECHO DEPository Project. In Archiving 2005: Final program and proceedings, April 26, 2005, Washington, D.C. (pp. 175–178). Springfield, VA: The Society for Imaging Science and Technology. Retrieved July 5, 2008, from http://www.ndiipp.uiuc.edu/pdfs/IST2005paper_final.pdf/
ECHO DEP Generic METS Profile for Preservation and Digital Repository Interoperability. (2005). Retrieved August 27, 2008, from http://www.loc.gov/standards/mets/profiles/00000015.html
ECHO DEP METS Profile for Web Site Captures. (2006). Retrieved August 27, 2008, from http://www.loc.gov/standards/mets/profiles/00000016.html
The ECHO DEPository: An NDIIPP-partner project of the University of Illinois at Urbana-Champaign with OCLC and the Library of Congress. (n.d.). Retrieved July 5, 2008, from http://www.ndiipp.uiuc.edu/
The ISO Reference Model for open distributed processing: An introduction. (1996). Retrieved August 27, 2008, from http://www.enterprise-architecture.info/Images/Documents/RM-ODP2.pdf
OCLC Digital Management Services. (2008). Retrieved July 5, 2008, from http://www.oclc.org/us/en/services/collection/default.htm
The National Digital Information Infrastructure and Preservation Program. (n.d.). Retrieved July 5, 2008, from http://www.digitalpreservation.gov/
Pearce-Moses, R., & Kaczmarek, J. (2005). An Arizona Model for preservation and access of Web documents. DttP: Documents to the People, 33(1), 17–24. Retrieved July 5, 2008, from http://www.ndiipp.uiuc.edu/pdfs/azmodel.pdf/
Rani, S., Goodkind, J., Cobb, J., Habing, T., Eke, J., Urban, R., & Pearce-Moses, R. (2006). Technical architecture overview: Tools for acquisition, packaging, and ingest of Web objects into multiple repositories (poster). In Opening information horizons: 6th ACM/IEEE-CS Joint Conference on Digital Libraries, June 11–15, 2006, Chapel Hill, NC, USA (p. 360). New York: ACM.
Web Archives Workbench. (2008). Retrieved July 5, 2008, from http://sourceforge.net/projects/webarchivwkbnch/
Web archiving. (2008, August 21). In Wikipedia, the free encyclopedia. Retrieved August 27, 2008, from http://en.wikipedia.org/wiki/Web_archiving

Patricia Hswe is project manager for NDIIPP Partner Projects at the University of Illinois at Urbana-Champaign. In 2004–6 she held a CLIR postdoctoral fellowship in scholarly information resources at the university's Slavic and East European Library. While a student at the Graduate School of Library and Information Science in 2006–8, she worked in the Mathematics Library and in the Grainger Engineering Library Information Center; in addition, she was a graduate research assistant on NDIIPP during its first phase.
Her research interests focus primarily on digital libraries and digital collections: metadata (standards, creation, semantics, usability, management); the challenges of digital preservation; use and users of digital resources; data curation in the humanities; and information literacy and services for graduate students.

Joanne Kaczmarek is associate professor of library administration and archivist for electronic records at the University of Illinois at Urbana-Champaign. Kaczmarek was a co-PI on the university's NDIIPP project and is involved in several ongoing practical initiatives related to information management and digital preservation. Her research interests include exploring the interplay between technology and human behavior as it relates to the management of information resources and what becomes the lasting, historic record of an individual, a group, an organization, or a culture. Prior to her current position Joanne was the project coordinator for the Mellon-funded OAI-PMH Cultural Heritage Repository project at the University of Illinois.

Leah Houser is a manager in product development at the Online Computer Library Center (OCLC). Houser was the project manager for the OCLC staff who implemented the Web Archives Workbench. Activities for the project included prototyping, requirements, design, programming, testing, product support activities, and communications. She has worked on OCLC's digital preservation products since 2000. Prior to her current position she worked on OCLC's FirstSearch product as an administrative and project manager and software developer.

Janet Eke served as project coordinator of the NDIIPP-funded digital preservation projects based at the University of Illinois at Urbana-Champaign from fall 2004 through August 2008. She is now the research services coordinator at the Graduate School of Library and Information Science (GSLIS) at the University of Illinois at Urbana-Champaign, where she helps to develop services, tools, and resources to support and promote the research efforts of the school. Previously at GSLIS she provided research services at a fee-based custom research unit and taught a master's course in online searching. Before joining GSLIS in 1998, she worked for many years in public libraries.

work_i3tmyahagrb2tfrstte2atqdja ----
Türk Kütüphaneciliği 18, 1 (2004), 105-106
Mesleki Toplantılar / Professional Meetings

Date | Title/Theme | Venue | Contact
14.03.2004 | Vision and innovation: a key to developing tomorrow's library and information services | Bretby, Derbyshire, England | http://www2.britishcouncil.org/seminars
27.03.2004 | Annual Conference and Exhibition of the United Kingdom Serials Group | Manchester, England | http://www.uksg.org/events/annualconf04.asp
31.03.2004 | Museums and the Web, International Conference of the Archives & Museum Informatics | Washington, DC, USA | http://www.archimuse.com/
12.04.2004 | 4th ASSIST National Seminar on "Digital Resources and Services in Libraries" | Shankarghatta, Shimoga, India | http://www.freewebs.com/assist2004/
14.04.2004 | Workshop "New Developments in Digital Libraries (NDDL-2004)" of the Instituto para os Sistemas e Tecnologias de Informação, Controlo e Comunicação | Porto, Portugal

The Web Hub, a site hosted at Cornell University, brings together information regarding projects in the e-resource management arena.
The site is maintained by Adam Chandler (Cornell) and Tim Jewell (University of Washington), and it is a tremendous resource for anyone involved in the management of electronic resources.

THE CONUNDRUM
Expenditures for electronic resources have grown enormously in the past 10 years. Nearly 40% of my library's serials budget supports online resources, yet the staff's ability to manage this growing collection of resources is not much better than it was in 1994. For better or worse, serials departments still spend most of their time managing print resources (see Anderson & Zink, 2003 for a different approach). A large part of this conundrum is due to integrated library systems not providing tools to help us manage these resources. Administrative metadata, elements describing licensed resources, do not fit comfortably into most library systems. As a result, managing electronic resources is a reactive activity. Unlike the print environment, it is hard for us to know when an e-journal issue is missing or is late in being published. Unless publishers notify us when publication lapses occur, generally it is a user who notifies the library when an issue is not available. Some institutions have developed local systems that help them manage their e-resources. Grassroots efforts, led by Jewell and Chandler, have spearheaded development of standards and tools that hold promise for assisting libraries with this work.

THE MARKETPLACE
The Web Hub highlights some of the locally developed e-resource management solutions created by libraries. For the most part, these systems have been developed by large research institutions. What is intriguing is the number of vendors interested in developing products to meet the needs of most academic libraries. In May 2002, NISO and the Digital Library Federation cosponsored a meeting in Chicago to further discussions regarding development of a standardized administrative metadata element set. Among the 50 attendees were representatives of the integrated library system marketplace (Innovative Interfaces, Ex Libris, SIRSI, and Endeavor), serials subscription agents (EBSCO), and bibliographic utilities (OCLC). Clearly, these organizations recognize the opportunity in this area for commercial products, and want to be the first to the marketplace. This strategy certainly has worked for Ex Libris with respect to its SFX product. The e-resource management market is in need of a similar killer application.

Recently I had a chance to speak with Tim Jewell, Head of Collection Management Services at the University of Washington, and Adam Chandler, Information Technology Librarian at Cornell University, about the state of affairs in this exciting new area.

NM: Tim, tell me about your DLF-sponsored work. How did you come to spearhead this initiative, and what are your expected outcomes?

TJ: One of the more interesting discoveries I made while conducting a "best practices" study for the DLF a couple of years ago was the sheer amount of information concerning licensed electronic resources that DLF member libraries were gathering and trying to maintain and present to their staff and users. It began to seem really obvious that these libraries were all trying to capture the same kinds of information and do very similar things with it, and it gradually occurred to me that we could all progress much more quickly if we could define the common problem and find some ways to work together to solve it.
Since I had already been in touch with most of the people who seemed to be active in this area, I thought that I could help by trying to facilitate information sharing and coordinate efforts. The main outcome that I hope to see is rapid progress in developing systems to manage electronic resources, and I see some real evidence that this is happening.

NM: How pervasive is the e-resource management problem? Does it tend to be restricted to large academic institutions, or are libraries in general struggling with these issues?

TJ: The larger academic libraries clearly have the biggest problem, but most academic libraries struggle with it, and I think that larger public libraries also do, to some degree. For the academics, the problem is clearly getting bigger very quickly as they rely more heavily on databases and electronic journals. The more heavily libraries rely on e-journals and full-text aggregator packages, the more important and difficult it is for them to know what use restrictions might apply, and how to triage and track access problems. Doing those things effectively really requires different tools from what most libraries have had available to them.

NM: The number of locally-developed e-resource tools is amazing, and new systems are being developed all the time. When did libraries begin developing databases to help manage administrative metadata?

TJ: I'm not really sure who should get the credit for creating the first e-resource management system or database, since many libraries have been using different tools to track related information for some time, and at some point a few of those libraries began experimenting with database software. The two "early" efforts that I think have proven to be especially influential are Penn State's ERLIC system, which drew quite a lot of attention at a NASIG meeting a few years ago, and MIT's VERA system. I think that the articles that described VERA's capabilities gave a lot of people a hint of what could and maybe should be done by new support tools.

NM: Some commercial vendors are beginning to develop e-resource management tools. They've obviously realized that many libraries would prefer to purchase an off-the-shelf solution rather than spend staff time developing a local system. Do you think additional vendors will follow suit, and if so, do you envision a near-future market stocked with mature and standardized products?

TJ: For the last couple of years, what I have been hearing from librarians is that they want to see good offerings from the vendors they normally do business with, either for their serials business or OPAC/ILS functions. What's really interesting and heartening is to see those developments now taking place. For instance, I know of at least 5 companies and other organizations that are either actively developing e-resource management systems or are planning to do so, and I'm pretty sure some others will begin to do so shortly. I think that's a very good thing, since mounting a serious effort in this area takes a lot of resources, and few libraries have the staff to design, implement, and then maintain what need to be pretty sophisticated systems.

NM: Adam, the Web Hub is a tremendously useful resource to those of us wanting to learn from the efforts of others. How did you and Tim conceive of it?

AC: Tim and I started a dialogue about the problem of managing licensed electronic resources in the fall of 2000. Tim deals with these resources on a daily basis. I do not.
I work on a variety of IT problems in the Cornell Library, but licensing is something for which my colleagues in another part of my department are responsible. I became involved because my supervisor, Karen Calhoun, asked me to survey the environment to see if we should build a system locally to manage electronic resources. It was clear to me from the beginning of the project that I could be put to best use by focusing on how to bring the data Tim was gathering into a form that would be beneficial to myself and others. Tim and I started with a problem, then slowly over time cobbled together a structure to solve it. The first piece of the structure was the Web Hub. The site has two primary functions: one, to keep people informed about the status of the project (meeting dates, reports, time line); and two, to point to relevant projects and data sets which may be helpful for building a system for managing electronic resources. Having a stable place where we could point people brought more energy into the project. It wasn't really sustainable for us to work on this alone. There is too much flux and uncertainty. We needed more data and people. The Web Hub highlighted interesting work that others were doing. Over time, we were able to bring some of these people into the steering group (and now the reactor panel) because they too saw the value of collaborating on this difficult problem. The Web Hub provides a single point of reference.

NM: The grassroots nature of your work is a model of collaboration. It shows what dedicated librarians can accomplish when faced with a common problem. As you gaze into your crystal ball, what additional efforts will be needed down the road to fully tackle the issues inherent to managing administrative metadata?

TJ: As thrilled as I am to see active and excellent work on system development, I think we all will need to keep the "standards question" in mind and do all we can to standardize data elements and definitions, as we've been trying to do via the Web Hub and by working with NISO. There will be a lot of unhappy librarians at some point in the future if they find that the work they've invested in recording information about their electronic resources can't be moved from the systems that are being developed now to some future system, when it comes time for that. Having good, widely accepted standards where that makes sense should help prevent those problems. The other problem area that I think is really ripe for active discussion is whether it's possible and practical to describe such common license terms as ILL and course-pack permissions, so that those descriptions can be shared among libraries in much the same way we share catalog records. Not everyone thinks that can really be done, since there is such wide variation on these points from one publisher or vendor to another, and since the rights granted by a particular publisher to a given library might vary from what they are willing to grant another. It's my own feeling that it should be possible to overcome those problems, which I think we need to do in order to make the fullest use of our e-resources.
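To make Jewell's point concrete: the kind of standardized administrative metadata record under discussion might encode license terms as machine-readable elements, so they could be shared among libraries the way catalog records are. The element names below are simplified stand-ins invented for illustration, not the DLF or NISO element set itself.

```python
# Illustrative element names only; not an official standard.
license_record = {
    "resource": "Example Journals Online",  # hypothetical licensed resource
    "ill_print_permitted": True,            # interlibrary loan via print/fax
    "ill_electronic_permitted": False,
    "coursepack_permitted": True,
    "concurrent_users": 5,
}

def can_fill_ill_request(record: dict, electronic: bool) -> bool:
    """Triage an incoming ILL request against recorded license terms."""
    key = "ill_electronic_permitted" if electronic else "ill_print_permitted"
    return record.get(key, False)

print(can_fill_ill_request(license_record, electronic=True))  # False
```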
POSTSCRIPT
It is important to note that Michael Gorman, among the deities in our profession, made a number of accurate predictions in his 1991 article, including the continued importance of library buildings and the central role resource sharing would play in the new millennium.

REFERENCES
Anderson, R., & Zink, S. D. (2003). Implementing the unthinkable: The demise of periodical check-in at the University of Nevada. Library Collections, Acquisitions, and Technical Services, 27(1), 61-71.
Gorman, M. (1991). The academic library in the year 2001: Dream or nightmare or something in between? Journal of Academic Librarianship, 17(1), 4-9.
Schulz, N. (2001). E-journal databases: A long-term solution? Library Collections, Acquisitions, and Technical Services, 25(4), 449-459.

work_ig2jjush3rbjtpl67u7xx5qiiy ----
Bibliometrics and Information Retrieval: Creating Knowledge through Research Synergies

Judit Bar-Ilan, Bar-Ilan University, Ramat Gan, Israel, Judit.Bar-Ilan@biu.ac.il
Marcus John, Fraunhofer Institute for Technological Trend Analysis, Euskirchen, Germany, marcus.john@int.fraunhofer.de
Rob Koopman and Shenghui Wang, OCLC Research Europe, Leiden, Netherlands, Rob.Koopman@oclc.org, shenghui.wang@gmail.com
Philipp Mayr, GESIS, Cologne, Germany, Philipp.Mayr-Schlegel@gesis.org
Andrea Scharnhorst, Royal Netherlands Academy of Arts and Sciences, Amsterdam, Netherlands, andrea.scharnhorst@dans.knaw.nl
Dietmar Wolfram, University of Wisconsin-Milwaukee, Milwaukee, WI, USA, dwolfram@uwm.edu

ABSTRACT
This panel brings together experts in bibliometrics and information retrieval to discuss how each of these two important areas of information science can help to inform the research of the other. There is a growing body of literature that capitalizes on the synergies created by combining methodological approaches of each to solve research problems and practical issues related to how information is created, stored, organized, retrieved and used. The session will begin with an overview of the common threads that exist between IR and metrics, followed by a summary of findings from the BIR workshops and examples of research projects that combine aspects of each area to benefit IR or metrics research areas, including search results ranking, semantic indexing and visualization. The panel will conclude with an engaging discussion with the audience to identify future areas of research and collaboration.

Keywords
Bibliometrics, Information Retrieval, Digital Libraries, Visualization, Search, Semantic Indexing

INTRODUCTION
Information Retrieval (IR) and Bibliometrics/Informetrics/Scientometrics (referred to hereafter as "metrics") represent two core areas of study in Information Science. Each has a long history with noted contributions to our understanding of how information is created, stored, organized, retrieved and used. Until recently, researchers have treated each of these areas as separate areas of investigation, with little overlap between the research topics undertaken in each area and little collaboration among researchers in both areas. This is surprising given that there are many common elements of interest to researchers in IR and metrics. Recognition of the mutually beneficial relationship that exists between IR and metrics has been growing over the past 15 years, with literature that specifically addresses this topic (e.g., Wolfram, 2003; Mayr & Scharnhorst, 2015) and the recent Bibliometric-enhanced Information Retrieval workshops (Mayr, Frommholz & Cabanac, 2016) held at metrics and IR meetings.
The mutually beneficial relationship is evident in the application of metric and citation analysis methods in the design of IR systems and in the use of techniques developed in IR that lend themselves to the study of metric phenomena. A prime example is the development and use of the PageRank algorithm by Page, Brin, Motwani and Winograd (1999), which was inspired by ideas from citation analysis and then adapted to the Web to inform relevance ranking decisions for documents. It has since been re-purposed by metrics researchers for the ranking of authors and papers.

PANEL ORGANIZATION
This panel brings together researchers in IR and metrics to present an overview of how IR and metrics research may be combined, to provide examples of research that intersects both areas, and to engage in a discussion with the audience about future potential topics. The session will begin with an overview of the synergies that exist between IR and metrics, followed by a summary of findings from the BIR workshops and examples of research projects that combine aspects of each area to benefit IR and/or metrics research. The panel will conclude with an engaging discussion with the audience to identify future areas of research and collaboration. Initial questions to stimulate the discussion will include: 1) why don't more IR researchers look to metrics research to help solve their research problems, and vice versa? and 2) as an IR or metrics researcher, what do you see as a research problem of interest that could benefit from the approaches used by the other area?

OVERVIEW (Dietmar Wolfram)
Metrics researchers have long recognized that empirical regularities or patterns exist in the way information is produced and used, such as author and journal productivity, the way language is used, and how literatures grow over time. These regularities extend to the content of IR systems and how the systems are used. Knowledge of these regularities, such as patterns in how users interact with IR systems, can help to inform the design and evaluation of IR systems. Similarly, measures developed for metrics research also have applications in IR. Conversely, techniques developed to support more efficient IR are now being applied in metrics studies. This is exemplified in the use of language and topic modeling, which were developed to overcome limitations of more simplistic "bag of words" approaches in IR. Topic modeling has become a useful tool for better understanding relationships between papers, authors and journals by relying on the language used within the documents of interest. These tools complement existing methods based on citations and collaborations in helping researchers to reveal the underlying structure of disciplines. The relationships between metrics research and IR, and in particular academic search, are much closer than many researchers realize.
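As a concrete reference point for the PageRank example cited in the introduction (and revisited in Judit Bar-Ilan's contribution below), here is a minimal power-iteration sketch. The link graph, damping factor, and iteration count are illustrative choices, not part of the panel paper.

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - d) / n for p in pages}
        for p, outlinks in links.items():
            if outlinks:  # each link passes on a share of the linker's rank
                share = d * rank[p] / len(outlinks)
                for q in outlinks:
                    new[q] += share
            else:  # dangling page: spread its rank over all pages
                for q in pages:
                    new[q] += d * rank[p] / n
        rank = new
    return rank

# Page "a" is linked to by both "b" and "c", so it ends up ranked highest.
print(pagerank({"a": ["b"], "b": ["a"], "c": ["a"]}))
```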
RECENT ADVANCES IN BIBLIOMETRIC-ENHANCED INFORMATION RETRIEVAL (Philipp Mayr)
The presentation will report on recent advances of the Bibliometric-enhanced Information Retrieval (BIR) workshop initiative. Our motivation as organizers of the BIR workshops (2014, 2015 and 2016) started from the observation that the main discourses in the two fields are different and the communities only partly overlap, as well as from the belief that a knowledge transfer would be profitable for both sides.

The first BIR workshop in 2014 set the research agenda by introducing each group to the other, illustrating state-of-the-art methods, reporting on current research problems, and brainstorming about common interests. The second workshop in 2015 further elaborated these themes. The third, full-day BIR workshop at ECIR 2016 aimed to establish a common ground for the incorporation of bibliometric-enhanced services into scholarly search engine interfaces. In particular, we addressed specific communities, as well as studies on large, cross-domain collections like Mendeley and ResearchGate. The third BIR workshop explicitly addressed both scholarly and industrial researchers. In June 2016, we will organize the 4th BIR workshop at the JCDL conference in collaboration with the NLP and computational linguistics research group of Min-Yen Kan (see the BIRNDL workshop, http://wing.comp.nus.edu.sg/birndl-jcdl2016/).

The past workshop topics included (but were not limited to) the following:
• IR for digital libraries and scientific information portals
• IR for scientific domains, e.g. social sciences, life sciences, etc.
• Information seeking behavior
• Bibliometrics, citation analysis, and network analysis for IR
• Query expansion and relevance feedback approaches
• Science modeling (both formal and empirical)
• Task-based user modeling, interaction, and personalization
• (Long-term) evaluation methods and test collection design
• Collaborative information handling and information sharing
• Classification, categorization, and clustering approaches
• Information extraction (including topic detection, entity and relation extraction)
• Recommendations based on explicit and implicit user feedback

Previous BIR workshops have generated a wide range of papers. Proceedings are available at http://ceur-ws.org/Vol-1143/, http://ceur-ws.org/Vol-1344/ and http://ceur-ws.org/Vol-1567/. The main directions of these workshop papers have been:
• IR and recommendation tool development and evaluation
• Bibliometric IR experiments and data sets
• Document clustering for IR
• Citation contexts and analysis
The presentation will report on highlights of the past workshop papers and outline future directions of this initiative.

APPLICATION OF THE H-INDEX FOR RANKING SEARCH RESULTS (Judit Bar-Ilan)
In traditional IR, search results are usually ranked using tf*idf (term frequency/inverse document frequency). On the web, hypertext links can be utilized as well. The web graph (nodes = web pages, links = hypertext links) is similar to citation networks (nodes = publications, links = citations). Citations are usually counted without assigning weights to individual citations. Similarly, the number of links to a web page can be counted, but this turns out to be insufficient because of the lack of quality control on the web, and links have to be weighted by their "importance". This is the idea behind PageRank (Page et al., 1999), an idea that stems from bibliometrics (Pinski & Narin, 1976). It should be noted, however, that the PageRank calculation is quite costly. We suggest using a variant of the h-index for ranking. The h-index was introduced by Jorge Hirsch (2005). Hirsch is a physicist, but the idea of combining publication and citation counts captured the imagination of bibliometrics researchers, and a huge number of variants were suggested. One of them, the h-index of a single journal paper suggested by Schubert (2009), can be applied to the web graph as well, by assessing the importance of a web page through the inlink counts of the pages that link to it (Bar-Ilan & Levene, 2015). The advantage of this method is that, unlike PageRank, it is based on local computation. This idea shows how bibliometrics and information retrieval can inform each other.
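A minimal reading of the single-page h-index idea just described: a page scores h if at least h of the pages linking to it each receive at least h inlinks themselves. The sketch below illustrates that reading on a made-up link graph; Bar-Ilan and Levene's hw-rank has its own precise definition, so treat this only as the flavor of the computation.

```python
def inlink_counts(links):
    """links maps each page to the pages it links to."""
    counts = {p: 0 for p in links}
    for outlinks in links.values():
        for q in outlinks:
            counts[q] = counts.get(q, 0) + 1
    return counts

def page_h_index(page, links):
    """Largest h such that >= h in-linking pages have >= h inlinks each."""
    counts = inlink_counts(links)
    citing = sorted((counts[p] for p, out in links.items() if page in out),
                    reverse=True)
    return max((h for h, c in enumerate(citing, start=1) if c >= h), default=0)

toy = {"a": ["b", "c"], "b": ["c"], "c": ["b"], "d": ["b", "c"]}
print(page_h_index("c", toy))  # a, b, d link to c; only b has inlinks, so h = 1
```

Note that the computation touches only a page's immediate in-neighborhood, which is the locality advantage over PageRank noted above.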
SEMANTIC INDEXING FOR INFORMATION RETRIEVAL AND BIBLIOMETRIC ANALYSIS (Rob Koopman & Shenghui Wang)
Large-scale digital libraries offer users the opportunity to explore a vast amount of information using relatively uniform mechanisms, such as keyword-based or faceted searches. At the same time, users are challenged to make sense of overloaded result sets that are too big and complex to comprehend, and to understand and counteract the biases introduced by the different ranking mechanisms that render the results. We believe that semantic indexing based on statistical analysis, together with intuitive interfaces, can help users to find relevant information and discover patterns quickly and reliably. In this talk, we will present our Ariadne context explorer, which allows users to visually explore the context of bibliographic entities, such as authors, subjects, journals, citations, publishers, etc. The visualization is built on semantic indexing of these entities based on the terms that share the same contexts in a large-scale bibliographic dataset. The statistical analysis, based on Random Projection, results in an underlying semantic space within which each entity is represented vectorially. Each bibliographic record, or any piece of text, can also be represented as a vector in this semantic space. The information retrieval task then becomes one of finding the nearest neighbors in this space, whether the search starts with an author, a citation, an article or free text. We will demonstrate the Ariadne context explorer and report the results of applying such semantic indexing and visualization in a topic-delineation exercise.
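A compact sketch of the mechanism behind this kind of semantic indexing, under stated assumptions: a single random matrix projects high-dimensional term co-occurrence counts into a shared low-dimensional space, and retrieval becomes nearest-neighbour search there. The counts, dimensions, and similarity choice below are toy stand-ins, not OCLC's Ariadne pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy co-occurrence counts: rows are entities (authors, journals, subjects,
# ...), columns are the context terms they co-occur with.
n_entities, n_terms, k = 200, 5000, 100
counts = rng.integers(0, 3, size=(n_entities, n_terms)).astype(float)

# Random Projection: the same fixed random matrix maps every entity (or any
# piece of text, via its term counts) into one k-dimensional semantic space.
projection = rng.standard_normal((n_terms, k)) / np.sqrt(k)
vectors = counts @ projection
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

def nearest(query_vector, top=5):
    """Retrieval as nearest-neighbour search (cosine similarity)."""
    sims = vectors @ query_vector
    return np.argsort(-sims)[:top]

print(nearest(vectors[0]))  # entity 0 is, trivially, its own nearest neighbour
```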
SEEKING FOR THE NEEDLE IN THE HAYSTACK: BIBLIOMETRICS, INFORMATION RETRIEVAL AND VISUALIZATION IN THE CONTEXT OF TECHNOLOGY FORESIGHT (Marcus John)
Technology foresight is an important element of any strategic planning process, since it assists decision makers in identifying and assessing future technologies. One important assumption made in this context is that tomorrow's technologies are based on today's daily work in scientific laboratories. Consequently, any technology foresight process must rely on continuous scanning of the scientific and technological landscape in order to detect scientific advances, breakthroughs and emerging topics. In other words, a kind of science observatory has to be established. Due to the rising number of scientific papers published each year, it becomes more and more difficult to restrict this scanning and monitoring process solely to classical desktop research and information retrieval techniques. Additionally, the classical IR task of identifying relevant information is exacerbated by the need to identify information that is both relevant and new. Consequently, the information overload makes it necessary to complement classical approaches with quantitative data-driven approaches stemming from informetrics, bibliometrics, data mining and related fields.

This work-in-progress report presents an overview of the ongoing research at the Fraunhofer INT and addresses the question of whether, and how, these quantitative data-driven approaches might enhance the classic portfolio of technology foresight. This will be exemplified along a prototypical technology foresight process, along which different IR-related challenges will be identified. It will be demonstrated how eavesdropping on today's scientific communication by bibliometric means might support this process. As an example, a procedure coined "trend archaeology" will be presented. This approach examines historic scientific trends and seeks specific patterns within their temporal evolution. The proposed method is multidimensional, since it takes into account multiple aspects of a scientific theme using bibliometric means. Additionally, "trend archaeology" is based on the synoptic inspection of different scientific themes, which emanate from fields such as nanotechnology and materials science. It will be demonstrated that for technology foresight it is mandatory to take into account the multidimensional, multiscalar, dynamic and highly interconnected nature of science.

THE PANEL MEMBERS
Andrea Scharnhorst (moderator) is Head of e-Research at the Data Archiving and Networked Services (DANS) institution in the Netherlands, a large digital archive for research data primarily from the social sciences and humanities. She is also a member of the e-humanities group at the Royal Netherlands Academy of Arts and Sciences (KNAW) in Amsterdam, where she coordinates the computational humanities programme. Her work focuses on understanding, modeling and simulating the emergence of innovations.

Judit Bar-Ilan (panelist) is professor at the Department of Information Science of Bar-Ilan University in Israel. She received her PhD in computer science from the Hebrew University of Jerusalem and started her research in information science in the mid-1990s at the School of Library, Archive and Information Studies of the Hebrew University of Jerusalem. She moved to the Department of Information Science at Bar-Ilan University in 2002. Her areas of interest include informetrics, information retrieval, Internet research, information behavior, the semantic Web and usability. Additional details are available at: http://is.biu.ac.il/en/judit/.
Additional details are available at: http://www.gesis.org/de/das- institut/mitarbeiterverzeichnis/?alpha=M&name=philipp%2 Cmayr. Shenghui Wang (panelist) is a research scientist in OCLC Research since 2012, based in OCLC EMEA office in Leiden, Netherlands. Her current research activities include text mining, visualization as well as linked data investigations. Dietmar Wolfram (panelist) is professor at the School of Information Studies at the University of Wisconsin- Milwaukee. He received his PhD in Library and Information Science from the University of Western Ontario. His research interests include informetrics, information retrieval, the intersection between these two areas, scholarly communication and user studies. ACKNOWLEDGMENTS This panel is sponsored by ASIS&T SIG/MET. REFERENCES Bar-Ilan, J., & Levene, M. (2015). The hw-rank: An h- index variant for ranking web pages. Scientometrics, 102, 2247-2253. Hirsch, J. E. (2005). An index to quantify an individual's scientific research output. Proceedings of the National academy of Sciences of the United States of America, 102(46), 16569-16572. Mayr, P., Frommholz, I., & Cabanac, G. (2016). Bibliometric-Enhanced Information Retrieval: 3rd International BIR Workshop. In N. Ferro et al. (Eds.), Advances in Information Retrieval: 38th European Conference on IR Research, ECIR 2016 (pp. 865-868). Springer. Mayr, P., & Scharnhorst, A. (2015). Scientometrics and information retrieval - weak-links revitalized. Scientometrics, 102(3), 2193–2199. doi:10.1007/s11192- 014-1484-3 Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: bringing order to the web. URL: http://ilpubs.stanford.edu:8090/422/1/1999- 66.pdf. Pinski, G., & Narin, F. (1976). Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing and Management, 12(5), 297–312. Schubert, A. (2009). Using the h-index for assessing single publications. Scientometrics, 78(3), 559–565. Wolfram, D. (2003). Applied informetrics for information retrieval research. Westport, CT: Libraries Unlimited. http://is.biu.ac.il/en/judit/ http://www.gesis.org/de/das-institut/mitarbeiterverzeichnis/?alpha=M&name=philipp%2Cmayr http://www.gesis.org/de/das-institut/mitarbeiterverzeichnis/?alpha=M&name=philipp%2Cmayr http://www.gesis.org/de/das-institut/mitarbeiterverzeichnis/?alpha=M&name=philipp%2Cmayr work_igqypypvh5dhfjwffbammnt3n4 ---- 105772 113..117 Collection analysis outcomes in an academic library Elizabeth Henry, Rachel Longstaff and Doris Van Kampen Saint Leo University, Saint Leo, Florida, USA Abstract Purpose – The intent of this article is to illustrate outcomes and results of a collection analysis done by a smaller academic library. Design/methodology/approach – The collection was evaluated using an online analysis tool combined with a physical inventory of the collection. Findings – Peer group comparisons revealed some of the problems with this particular collection were also widespread among the comparison libraries. The value of the e-book collection to patrons was clear: not only did e-books provide resources to remote students; they help compensate for shortfalls in the print collection. Practical implications – The catalog more accurately reflects what is on the shelf and also what is reported to OCLC. Access to the collection has been improved and enhanced. Steps were taken to refocus the library’s collection development procedures and management. 
The changes made have led to increased faculty involvement in selection and a more balanced, more comprehensive collection management plan.
Originality/value – For any library considering whether they can or should do an analysis, the article illustrates that the benefits are well worth the time and expense. The analysis had a positive impact on collection development and management.
Keywords: Collections management, Academic staff, User interfaces, Data analysis, Academic libraries
Paper type: Case study

Introduction
Determining whether a library's collection meets the needs of the user and the educational goals of the institution should be considered part of the core mission of the library. Academic libraries exist in order to "work with other members of their institutional communities to participate in, support, and achieve the educational mission of their institutions" (ACRL, 2003). If the library does not critically analyze its collection in order to determine how well it is supporting the mission of the university, then the purpose of the library's existence could be called into question. Effective collection analysis and assessment provides quantitative and qualitative data for evaluating the usefulness and utility of a library's holdings. It assists with determining budget requirements by focusing attention on how well the library's collections in specific areas support the needs of the users and the needs of the institution. It also points out whether the institution's investment in the collection is being managed responsibly. The aim of assessment is to determine how well the collection supports the goals, needs, and mission of the library or parent organization. The collection (both locally held and remotely accessed materials) is assessed in the local context. Evaluation seeks to examine or describe collections either in their own terms or in relation to other collections and checking mechanisms, such as lists. Both evaluation and assessment provide a better understanding of the collection and the user community (Johnson, 2004). Conducting a collection analysis can be expensive, time-consuming and labor intensive, but it is well worth the investment. Due to the many changes affecting modern libraries, it is important that librarians are aware of their library's holdings. A collection analysis can educate current and new library staff about the collection, provide better data on which to determine collection development priorities for budget planning purposes, point out cataloging issues, and help the reference librarians better support and assist with the patron's information search. "Efficient use of budgets, shelves, staff, and information seekers' searching time – whether online or in the stacks – are a few of the less often articulated reasons to evaluate collections" (Agee, 2005). Collection analysis also allows for better management of resources, especially in fiscally lean times, and provides library administration with documented evidence on the stewardship of the library. The key for academic librarians is to think in terms of their role in overall institutional effectiveness. Accountability is a two-edged sword. It promotes the library and the librarians' visibility on campus and supports the academic mission. However, it also brings more responsibility and an obligation to quantitatively document just exactly how the library is fulfilling the purpose and objectives of the institution.
Thus, for the librarian, collection assessment is an integral component of the accreditation process. With its companion collection management and development policy, it represents institutional effectiveness in microcosm (Henderson et al., 1993).

Background to the study
Saint Leo University's Cannon Memorial Library supports an institution serving more than 14,000 students. Saint Leo University is a unique institution, with a small traditional campus-based student body and a large distance learning program. Many of the students are not within driving distance. Distance education programs, which use videoconferencing, online programs and offerings, and weekend and evening programs, are integral to the University. It is a challenge to provide library resources and services to such a diverse and dispersed student population.
As part of the university's 2004/2005 Institutional Effectiveness Plan (IEP), the library director proposed a collection evaluation to determine whether the library met the Southern Association of Colleges and Schools standards and the American Library Association's Standards for Libraries in Higher Education, approved by the ACRL Board of Directors in 2004 (ACRL, 2004). The ACRL standards encourage comparison with peer institutions, provide statements of good library practice, and suggest ways to assess that practice in the context of the institution's priorities.
When first considering a collection analysis, budget, staffing, and the types of tools and data collection techniques were considered. The library staff is not large, and it was understood that the study would be primarily quantitative rather than qualitative in nature. Qualitative collection analysis would have involved a considerably larger investment in time and staff (focus groups, citation analysis collected from student papers, surveys, etc.), and it was therefore decided to focus on the quantitative tools available. The collection analysis tool selected was the OCLC WorldCat Collection Analysis tool. OCLC developed the WorldCat Collection Analysis program at just the right time for Saint Leo. A collection analysis had not been done for quite a number of years, and it was important to determine how well the library was supporting current program offerings and educational curricula. The library's book collection consists of almost 100,000 print titles and approximately 53,000 electronic titles.
As a preliminary step in the process, two technical services librarians were hired in 2005 to finish a retrospective conversion process, implement systematic authority control, and determine the methodologies required for collections evaluation and funding. Due to discrepancies between books on the shelf, books cataloged, and holdings reported to OCLC, it was decided that a thorough inventory of the collection was going to be necessary before the retrospective conversion could be continued. The inventory process has taken longer than anticipated, with two librarians spending approximately 25 per cent of their time on this for three years. In an academic library, summers are often devoted to special projects. In the summer and fall of 2007, the team spent 60-70 per cent of their time doing collection analysis.
In 2006, Cannon Memorial Library contracted with OCLC for WorldCat Collection Analysis (WCA) for one year, even though the inventory was still ongoing and had not yet been completed. WCA provided information about both print and electronic holdings. The subscription price was based on the number of library holdings in OCLC. Most, but not all, of our e-books have been updated in OCLC. Comparison of the library's collection to peer libraries and authoritative lists was an important part of the process. The objective, quantitative results of the analysis validated subjective speculation and targeted observations performed on selected sections of the collection.

Methodology
In preparation for the evaluation, a shelf list was generated and an inventory of holdings for flagship programs – Criminal Justice, Education, Psychology, and Sport Business – was completed. Every book in the collection was physically removed from the shelf and checked against bibliographic records in the Voyager catalog and in WorldCat. After finishing the inventory of the collections supporting the flagship programs, the remainder of the collection was also inventoried. Corrections were made as needed and records enhanced where possible. "Collection evaluation always begins with a complete, up-to-date inventory" (Intner, 2003), since "physical assessment provides a good indicator of the condition of the overall collection..." (Agee, 2005). The assessment team also participated in several online seminars presented by OCLC in order to learn how the OCLC conspectus software operated.
The assessment team gathered data and generated graphs of publication dates for books and e-books. Some graphs were printed directly from WCA and others were printed from data exported from OCLC into an Excel spreadsheet. Because of the irregular way the statistics were displayed in WCA, and because of the small number of books in the library's collection published before 1950, earlier years were not included in the analysis. Using a weighted average formula, the approximate average publication dates for print and e-books in the individual disciplines were calculated (a simple sketch of this calculation is given below). Holdings were compared to two authoritative lists – Books for College Libraries (ALA, 1988) and Choice Outstanding Academic Titles (ACRL) – in order to generate a list of recommended titles that had not been purchased by SLU. Holdings were then compared to holdings from similar institutions selected by the assessment team.
Saint Leo University has a considerable off-campus student body, a large religion collection, and a growing theology program. It was impossible to construct true peer groups from the predefined lists provided; Saint Leo was too small to be classified with such institutions as Harvard, and too diverse to be compared to other smaller and medium liberal arts institutions. As an alternative, four groups of five libraries were created from institutions comparable in size to the university's campus student body:
1 ICUF (Independent Colleges and Universities of Florida) member institutions.
2 Small Catholic colleges and universities.
3 Colleges with accredited sport business/management programs.
4 Colleges with pastoral studies (theology) programs.
Many academic programs could not be compared to peer group institutions. In addition, some academic programs were not included in predefined WCA divisions, for example: Computer Science, Pastoral Studies, and Sport Business (a flagship program at SLU).
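As an aside, the weighted average mentioned above is simple enough to state precisely. The article does not print its formula, so the following Python fragment is only a minimal sketch of one plausible implementation, assuming the WCA/Excel export supplies (publication year, number of titles) pairs for a discipline; the function and variable names are illustrative, not taken from the study.

def average_publication_date(year_counts):
    # year_counts: iterable of (year, number_of_titles) pairs,
    # e.g. rows exported from WCA into an Excel spreadsheet.
    total_titles = sum(count for _, count in year_counts)
    if total_titles == 0:
        raise ValueError("no titles reported for this discipline")
    weighted_sum = sum(year * count for year, count in year_counts)
    return weighted_sum / total_titles

# Hypothetical discipline whose holdings cluster in the 1960s-1970s:
print(round(average_publication_date(
    [(1955, 120), (1965, 300), (1975, 260), (1995, 40)])))  # -> 1969

On data shaped like this, a 2008 analysis producing averages around 1968 would match the 30-to-40-year average collection age reported below.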
Also, English and History were split into multiple subdivisions, making it difficult to get a true picture of holdings in these disciplines. Hebrew titles were listed under "Language, linguistics and literature" in WCA, but these same titles would be supportive of a theology program, rather than a language program, at Saint Leo University.
Data were collected in two large three-ring notebooks, arranged by discipline, with graphs and spreadsheets illustrating both the total collection and each academic division. Data collected:
. total holdings;
. interlibrary loan statistics;
. publication dates;
. comparison of e-book and print book collections;
. comparisons of print collection using Books for College Libraries and Choice Outstanding Titles;
. comparisons of print collection to selected peer institutions.

Results and discussion
Verifiable, qualitative, and quantitative information about the print collection was amassed as a result of doing both an inventory and an analysis. The data validated or corrected subjective impressions, answered questions posed by librarians and faculty, facilitated decision making, and prompted changes. As a result of the data collected, it was now possible to illustrate the value of the library's print and electronic book collections, and to demonstrate strengths, weaknesses, and imbalances in the overall collection. It also highlighted a need for greater attention to the university's flagship programs. To correct disproportions in discipline-specific collections, new collection development policies and procedures were instituted, and the staff re-established systematic weeding of the print collection. The policy changes implemented led to increased faculty involvement in collection development and changes to the book selection process.
The analysis also revealed that the print collection is aging, somewhat unbalanced, and, in some disciplines, inadequate, and that the collection development policy needs to be updated. By using peer group comparisons, it became clear that some of the problems with the collection were widespread among comparable libraries and not unique to Saint Leo University. Some perceived weaknesses are universal rather than unique; for example, the age of the collection. The peer group analyses show that the average age of most collections is 30 to 40 years, purchased at a time when library budgets were larger and focused primarily on print materials (see Figure 1).
It is now possible to illustrate the value of purchasing and maintaining an electronic book collection, since an inclusion of recent e-book imprints improved the average age of the collection, increased circulation of the print collection, and better supported off-campus students. Not only do the e-book collections provide resources to off-campus students, but they help compensate for shortfalls in the print collections (see Figure 2). The library has been purchasing electronic book collections since June of 2001; during the fall semester of 2001, the 6,000 e-books purchased for the initial online collection were accessed 1,419 times. Currently, the library has an e-book collection that numbers more than 53,000, and in the fall semester of 2007 they were accessed 12,553 times.
Interestingly enough, the online collection might have increased print circulation as well: in the fall semester of 2001 there were 2,753 checkouts, and in the fall 2007 semester there were 2,941 checkouts, an increase of 7 per cent (this figure is checked in the short calculation at the end of this section). A quick survey of the literature on this topic showed that this trend was true at some institutions, but not at all institutions (Littman and Connaway, 2004).
The focus of collection development at Saint Leo has shifted to put more emphasis on developing core collections in key academic programs. The library is responsible for developing the collections in support of the teaching mission of the university, and for optimizing resources for users. Several years ago an allocation formula for monographic purchasing was introduced to make sure all disciplines were included. It became clear from the analysis that the formula was not enough to correct current imbalances, and a more focused core collection development strategy has been adopted.
As discipline-specific holdings were compared to those of SLU peer institutions, a list of titles not currently owned was created. Where several libraries were found to own a particular title, that title was included in a list for purchase consideration. This was done for all flagship programs. Additional lists of core titles were created by consulting Resources for College Libraries (Bowker). Lists were forwarded to department heads of flagship programs and programs with large enrollments, and were circulated among the faculty. The lists provided a way to reach out to faculty and facilitate their participation in collection development. This has increased faculty participation, and improved communication and relations.
Additionally, highlights of the results were presented to the library faculty and to several academic departments. As a result, the English department requested a more in-depth analysis and a report of findings when the team reported there was a disproportionate number of books in History and English.
[Figure 1: Cannon Library collection profile parallels peer libraries]
There are multiple reasons for this finding – there is no formal departmental liaison program, and librarians might choose more books within their fields of expertise or knowledge base. "Lack of knowledge concerning the subject material could be used to explain the absence of certain, individual texts" (Pankake et al., 1995), and currently most SLU librarians have a background in the humanities. Additionally, some individual faculty members are more vocal library supporters or more frequent library users, and, as a whole, their fields tend to be better represented in the collection because they order more often. Finally, there are more books available for purchase in the humanities. "It is not surprising that many of the subjects with very large collections are also subjects with a very high publishing output, such as history and literature" (Knieval et al., 2005).
Deselection is a valuable secondary result that is gained from collection evaluation, and is a key ingredient of successful collection management (Agee, 2005). From WCA data, lists of titles currently owned that need to be reviewed for possible weeding were compiled. Titles unique to Saint Leo and not held by any of the other benchmark libraries were identified as possible candidates for deselection.
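As a quick arithmetic check (not part of the original article), the 7 per cent circulation increase follows directly from the reported checkout counts:

\[
\frac{2941 - 2753}{2753} = \frac{188}{2753} \approx 0.068 \approx 7\%
\]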
Conclusions
WorldCat Collection Analysis is an excellent tool for learning how to perform collection analysis. As Munroe and Ver Steeg (2004) suggest: "If the selector has little collection development background or experience in the field, he or she will need to do more quantitative study in order to become familiar with the field."
WCA provided accurate (but not real-time) data that graphically illustrate the library's holdings by subject. Additionally, WCA revealed collection strengths and weaknesses, uniqueness and overlap, and age and format. The creation of peer groups and comparisons with similar libraries was very helpful, showing whether or not collection development at a particular library was on target with what other libraries were doing. Where results differed, inconsistencies were examined further and, occasionally, justified. For example, SLU owns many books specific to Florida that libraries outside of Florida would not have, and the library also owns a significant number of volumes pertaining to Catholicism, religion, and theology, quite a number of which are in German or which are uniquely held by this institution and a very few others. E-book totals for Saint Leo appear to be much larger than for peer libraries, but it is unclear whether peer libraries reported their e-book holdings to OCLC. It is expected that e-books will become an increasingly larger and more important part of the library's collection because of growing online and distance programs, and because some students prefer full text (MacDonald and Dunkelburger, 1998; Van Kampen, 2004). "Libraries have to invest in and prepare for a digital future while maintaining collections and services based on a predominately print world" (Bodi and Maier-O'Shea, 2005).
The library now has access to information which allows a better understanding of the collection and its profile. Records were enhanced by adding thousands of tables of contents and other notes, providing better access and clarity, which appreciably improved access to the collection through the catalog. Usage data collected thus far supports this supposition.
Inventory is one of the best ways for librarians to really get to know their collection. It helps to determine whether what is in the catalog is actually on the shelf. For example, while completing inventory, it was determined that up to 10 per cent of the books were missing in some disciplines. Also, by using WCA, titles were found in OCLC with the SLU symbol attached that were not in the catalog or on the shelf. Missing titles have turned out to be a larger and more complex problem than anticipated, a problem which will need to be addressed in the near future.
Analyzing the monographic collection was a great start, which provided the library staff with the data needed to determine collection policies and procedures. Including electronic books as part of the analysis pointed to future directions, especially in some subject areas where the information is changing faster than a print collection can possibly keep up. By considering all formats, the total amount of information in a given subject can be assessed (Bodi and Maier-O'Shea, 2005). It is likely that not all disciplines will be equally well supported by book collections; for example, the sciences are often better supported by databases and journal collections. Furthermore, not all disciplines publish books at the same levels.
As Bodi and Maier-O'Shea (2005) ask: "How do we reasonably allocate funds, and how would a holistic budget more meaningfully reflect the library's physical collection, electronic access, and 'things' to come?"
[Figure 2: Declining print collection offset by electronic holdings]
A second analysis of the collection will be scheduled in a few years, providing the library with additional data and a longitudinal look at the collection and how it has changed over a period of time. Collection analysis is not a static, one-time or occasional avenue with which to analyze budgetary considerations; rather, it is a way to "provide a better understanding of the collection and the user community" (Johnson, 2004). In order to continue to improve the quality of the collection, additional steps need to be taken: it will be necessary to repeat the analysis at regular intervals and to gather data from multiple sources, such as circulation and interlibrary loan data; user studies also need to be added, and all library faculty should be brought into the process; finally, other authoritative sources should be consulted.
A new system of collection management is in place as a result of the analysis, one that integrates faculty involvement with a more focused approach in selection. The goal is to have a more balanced, institutionally effective collection. It is a measurable goal: "presented and reported properly, evaluation data are a powerful tool that important people want to see" (Intner, 2003). A future analysis will show whether the library is using its resources wisely and effectively.

References
Agee, J. (2005), "Collection evaluation: a foundation for collection development", Collection Building, Vol. 24 No. 3, pp. 92-5.
American Library Association (1988), Books for College Libraries, 3rd ed., American Library Association, Chicago, IL.
Association of College and Research Libraries (ACRL) (2003), Guidelines for Instruction Programs in Academic Libraries, Association of College and Research Libraries, available at: www.ala.org/ala/acrl/acrlstandards/guidelinesinstruction.cfm (accessed 23 June 2008).
Association of College and Research Libraries (2004), "Standards for libraries in higher education", available at: www.ala.org/ala/acrl/acrlstandards/standardslibraries.cfm (accessed 15 February 2008).
Bodi, S. and Maier-O'Shea, K. (2005), "The library of Babel: making sense of collection management in a postmodern world", The Journal of Academic Librarianship, Vol. 31 No. 2, pp. 143-50.
Henderson, W.A., Hubbard, W.J. and McAbee, S.L. (1993), "Collection assessment in academic libraries: institutional effectiveness in microcosm", Library Acquisitions: Practice & Theory, Vol. 17, pp. 197-201.
Intner, S.S. (2003), "Making your collections work for you: collection evaluation myths and realities", Library Collections, Acquisitions, & Technical Services, Vol. 27 No. 3, pp. 339-50.
Johnson, M. (2004), Fundamentals of Collection Development and Management, American Library Association, Chicago, IL.
Knieval, J., Wicht, H. and Connaway, L.S. (2005), "Collection analysis using circulation, ILL, and collection data", Against the Grain, December 2004/January 2005, pp. 24-6.
Littman, J. and Connaway, L. (2004), "A circulation analysis of print books and e-books in an academic research library", Library Resources and Technical Services, Vol. 48 No. 4, pp. 256-62.
MacDonald, B. and Dunkelburger, R. (1998), "Full-text database dependency: an emerging trend among undergraduate library users?", Research Strategies, Vol. 16 No. 4, pp. 301-7.
Munroe, M.H. and Ver Steeg, J.E. (2004), "The decision-making process in conspectus evaluation of collections: the quest for certainty", Library Quarterly, Vol. 74 No. 2, pp. 181-205.
Pankake, M., Wittenborg, K. and Carpenter, E. (1995), "Commentaries on collection bias", College & Research Libraries, Vol. 56 No. 2, pp. 113-8.
Van Kampen, D. (2004), "Development and validation of the multidimensional library anxiety scale", College & Research Libraries, Vol. 65 No. 1, pp. 28-34.

Further reading
Burr, R.L. (1979), "Evaluating library collections: a case study", The Journal of Academic Librarianship, Vol. 5 No. 5, pp. 256-60.
Commission on Colleges of the Southern Association of Colleges and Schools (2004), "Principles of accreditation: foundations for quality enhancement", available at: www.sacscoc.org/pdf/PrinciplesOfAccreditation.PDF (accessed 15 February 2008).

Corresponding author
Elizabeth Henry can be contacted at: elizabeth.henry@saintleo.edu

memo—new frontiers to be reached
The development of memo—magazine of european medical oncology
Wolfgang Hilbe · Waltraud Radlherr
memo (2016) 9:57-58, DOI 10.1007/s12254-016-0254-8

With the publication of issue 4/2015 of memo in December 2015, memo finished its eighth year. Many experts, first of all the members of the editorial board (acting as authors and/or reviewers), have always actively supported memo to ensure the high quality memo has always offered to its readership. Therefore, we want to express our gratitude and thank everyone for contributing to memo's success!
memo was founded in 2008 and the first year was entirely devoted to a lot of administrative and organizational work to get the first issue of memo published in autumn 2008. Considering memo's development, it can be stated that the first years were busy with establishing a reliable pool of authors and reviewers and collecting manuscripts of high quality. To boost memo's way to internationality, section editors coming from different nations were established (2 Austria, 1 Hungary, 1 Switzerland, 1 Russia) and the editorial board could be extended to many European countries, and new frontiers were opened (Israel, Iran, USA).
Looking back—What are memo's achievements so far?
• memo is the official journal of societies and study groups (CECOG—since 2008; and OeGHO—since 2012)
• memo presents updates of various important conventions, such as ASH, ASCO, etc.
• memo discusses many hot spots in our field and offers an interdisciplinary platform of communication
• memo is included in the following indexing services: SCOPUS, EMBASE, Google Scholar, Academic OneFile, Expanded Academic, Health Reference Center Academic, OCLC, SCImago, Summon by Serial Solutions
• memo is read all over the world: displayed by a full-text download frequency of 9000 per year, the readers come from Europe, America, and Asia-Pacific, one third each.
To continue this positive development and strive for getting listed in PubMed in the near future, it is already high time to think about memo's coming issues. What would be interesting for our readers, then?
For the next years, our experts from the editorial board and our readers asked us to give special focus to the following topics:
• Smoking and its role in oncology
• New challenges in stem cell transplantation
• Secondary malignancies after prior chemotherapy or organ transplantation
• Drawbacks of targeted therapies—back to bench
• Molecular basis of new therapeutic targets
• Sarcomas: an interdisciplinary challenge
• Colorectal cancer, still a long way to cure
• New development in melanoma treatment
• Palliative care—more than end-of-life attendance
• Quality of life—the unknown endpoint—interpretation and clinical consequences
• Solid tumors in children
• Neoadjuvant strategies in gastrointestinal cancer
• Follow-up—scientific background for rational recommendations
Through the contributions of the editorial board members, authors, and reviewers the future development of memo can be equally fruitful. Therefore,
1. your feedback to Editor-in-Chief Prof. Hilbe is appreciated;
2. your proposals of experts who should be invited are necessary; and
3. your motivation of colleagues to submit their manuscripts is essential.
Counting on you and looking forward to many more interesting issues of memo,
Prof. Dr. Wolfgang Hilbe, Editor-in-Chief, memo magazine of european medical oncology
Waltraud Radlherr (memo Secretary), memo-secretary@springer.at
Conflict of interest: The authors have no conflict of interest to declare.

art libraries journal 35/3 2010
shakes out against the kinds of special collections we currently see in most academic libraries, I'm not so sure. ... If we do continue to have 'special collections' as some type of library unit, I think they need to be much, much more imbricated with other outward facing library and campus services than they currently seem to be in most places.' He is asking, in other words: Are we in special collections equipped to do our part? Food for thought. If you would like to contribute to the conversation, please get in touch.

References
1. David Pearson, 'Special collections in a digital future,' Art libraries journal 35, no. 1 (2010): 12-17.
2. The Digital Preservation Coalition (UK) has a good definition of 'born digital' that begins: 'Digital materials which are not intended to have an analogue equivalent, either as the originating source or as a result of conversion to analogue form' at http://www.dpconline.org/advice/introduction-definitions-and-concepts.html.
3. See, for example, Jackie Dooley, 'Ten commandments for special collections librarians in the digital age,' RBM: a journal of rare books, manuscripts, and cultural heritage 10, no. 1 (2009): 51-59. Originally presented at the ACRL Rare Books and Manuscripts Preconference, Los Angeles, California, 2008.
4. Jackie Dooley, 'Do born-digital materials belong "in" special collections?' Hangingtogether blog, entry posted on November 18, 2009, http://hangingtogether.org/?p=751

Editor's note
Particular thanks to Judy Vaknin and the ARLIS/UK & Ireland Art Archives Committee for helping realise some of the contributions to this born-digital issue of the Art libraries journal. Several of its papers are revised versions of those presented at their study day in November 2009, which was entitled 'Fear & learning: approaches to the born-digital challenge in art & design archives' (see Elinor Robinson and Hannah Green on page 5; Douglas Dodds on page 10; and Kurt G.F. Helfrich on page 23). Coincidentally OKBN*ARLIS Netherlands also held a study day on the born-digital later in the same month, and Inge Angevaare's article on page 17 is a re-worked version of the presentation she gave there. The article on KULTUR was specially commissioned, as was the one on sharing online learning and teaching resources. And those by Thomas Hill and Martin Flynn began life at the IFLA Art Libraries Section events in Italy last year.

Jackie Dooley
Consulting Archivist
OCLC Research and the RLG Partnership
6565 Kilgour Place
Dublin OH 43017-3395 USA
Email: dooleyj@oclc.org

El Derecho a la Verdad en el Ámbito Iberoamericano [The Right to Truth in the Ibero-American Context]
Ius Humani | revista de derecho
Volume 6, year 2017. Published in December 2017. Annual frequency. ISSN 1390-440X
Ius Humani, Revista de derecho is a place open to researchers from all over the world, in all languages, where original studies are published on the rights of the human being (natural, human or constitutional rights) and on the most effective procedures for their protection, both from the philosophical perspective and from that of the higher norms of the legal system. The print version of the journal has appeared annually since 2016 and is printed at the end of the period. It has the ISSN 1390-440X. The digital version of the journal has the e-ISSN 1390-7794 and operates as a continuous publication: once contributions are approved, they are published.
The journal is indexed in multiple systems such as LATINDEX, Academic Search Premier, Advance Sciences Index, Cosmos Impact Factor, Dialnet, DOAJ, DRJI, EBSCO Legal Source, Emerging Sources Citation Index, ERIH Plus, Fuente Académica Plus-EBSCO, Global Impact Factor, Google Scholar, HeinOnline, I2OR, Infobase Index, IPI, JournalTOCs, Miar, OAJI, OCLC, REDIB, Saif, Ulrich's, VLex, and in many other catalogs and portals (Copac, Sudoc, ZDB, JournalGuide, etc.).
www.iushumani.org
For exchanges and subscriptions write to: revista@derecho.uhemisferios.edu.ec
Ius Humani: Revista de derecho, Faculty of Legal and Political Sciences, Universidad de los Hemisferios, www.uhemisferios.edu.ec
EDITION: Publications Service of the Universidad de los Hemisferios (SPUH), revista@derecho.uhemisferios.edu.ec
Address: Universidad de los Hemisferios / Paseo de la Universidad Nro. 300 y Juan Díaz (Urbanización Iñaquito Alto) / Quito, Ecuador. Postal code: EC170135. ISSN: 1390-440X. Print run: 300 copies.
Text revision and correction: Esteban Cajiao Brito. Layout and typesetting: Mario de la Cruz, Santiago Ullauri.
Ius Humani | revista de derecho, Universidad de los Hemisferios, Quito, Ecuador
Rector: Diego Alejandro Jaramillo
Dean of the Academic Unit of Legal and Political Sciences: Dr. René Bedón Garzón
Journal Director: Dr. Juan Carlos Riofrío Martínez-Villalba, Univ. de los Hemisferios (Quito, Ecuador)
Scientific Committee:
Dr. Pedro Rivas Palá, Universidad de la Coruña (La Coruña, Spain) and Universidad Austral (Buenos Aires, Argentina)
Dr. Hernán A. Olano García, Universidad de la Sabana (Bogotá, Colombia)
Dr. Carlos Hakansson Nieto, Universidad de Piura (Piura, Peru)
Dr. Hernán Pérez Loose, Universidad Católica Santiago de Guayaquil (Ecuador)
Dr. Luis Castillo Córdova, Universidad de Piura (Piura, Peru)
Editorial Committee:
D.ª Maria Cimino, Università degli Studi di Napoli Parthenope (Naples, Italy)
Dr. Julián Mora Aliseda, Universidad de Extremadura (Extremadura, Spain)
Ph.D. Gustavo Arosemena, Maastricht University (Maastricht, Netherlands)
Dr. Alfredo Larrea Falcony, Universidad de los Hemisferios (Quito, Ecuador)
Ph.D. Juan Cianciardo, Universidad de Navarra (Pamplona, Spain)
Dr. Jaime Flor Rubianes, Pontificia Universidad Católica del Ecuador (Quito, Ecuador)
Mgr. María Teresa Riofrío M.-V., Univ. de Villanueva (Madrid, Spain)
Dr. Edgardo Falconi Palacios, Universidad Central del Ecuador (Quito, Ecuador)

Contents
"JUAN LARREA HOLGUÍN" PRIZE
Libertad de expresión y límites democráticos / Freedom of Speech and Democratic Constraint (José Luis Castro-Montero), 11-25
ARTICLES
Vida y razonabilidad / Life and Reasonability (Luis Castillo Córdova), 27-53
La persecución religiosa en el siglo XXI / Religious Persecution in the XXI Century (César Castilla Villanueva), 55-72
Evolución histórica de la oralidad y la escritura en el proceso civil español y ecuatoriano / Historic Evolution of Oral and Written Procedure in the Spanish and Ecuadorian Civil Process
(Álvaro Mejía Salazar), 73-94
Derecho administrativo y derechos sociales fundamentales / Administrative Law and Fundamental Social Rights (Jaime Rodríguez-Arana Muñoz), 95-105
Ineficacia de la acción de repetición / The Ineffectiveness of the Repetition Action (Guillermo Enríquez Burbano), 107-122
Estudio comparado sobre transparencia y derecho de acceso en el ámbito internacional y su influencia en España / A Comparative Study of Transparency and the Right of Access at the International Level and its Influence in Spain (Manuel Palomares Herrera), 123-153
Sobre la injusticia del aborto / On the Wrongfulness of Abortion (Gustavo Arosemena), 155-172
La fumigación de herbicidas colombiana y el derecho humanitario internacional / Colombian Fumigation of Herbicides and International Humanitarian Law (Natalia Andrade Cadena), 173-184
La paz del mundo y la perspectiva Islámica / World Peace and the Islamic Perspective (Bahram Navazeni, Alireza Nabawi), 185-198
Viabilidad de las sociedades laborales en Ecuador. Una aproximación documental / Viability of Labor Companies in Ecuador: A Documentary Approach (Mercedes Montiel), 199-212
La evaluación y revisión del criterio de ciudadanía y su distinción de otros conceptos similares en la legislación iraní / The Assessment and Review of the Citizenship Criterion and its Distinction from Other Similar Concepts in Iranian Law (Abbas Zera'at, Meysam Nematollahi), 213-230
Génesis de las costumbres no codificadas / Genesis of Non-Codified Customs (Atefeh Roohi Kargar, Rasoul Parvin), 231-251

THE AMERICAN ARCHIVIST
Life with Grant: Administering Manuscripts Cataloging Grant Projects
Susan Hamburger
The American Archivist, Vol. 62 (Spring 1999): 130-152

Abstract
Administering manuscripts cataloging grant projects requires planning and flexibility. The author uses three separate retrospective conversion projects for personal papers at the Library of Virginia (formerly the Virginia State Library and Archives), the University of Virginia, and the Virginia Historical Society as the basis for discussing staffing, training, record quality, workflow, and quality control. The author points out the problem areas and the successes, and makes suggestions for future manuscripts cataloging retrospective conversion projects.

Introduction
Retrospective conversion of finding aids and typed catalog cards to machine-readable cataloging in local and national databases often requires outside funding and additional staffing. Many repositories, from small one-person shops to large research institutions, benefit from cooperative grant projects. Funding agencies look more favorably on applications that offer access to nationally important collections, with a thematically organized focus, and that combine the resources of several institutions. Proper and adequate planning before writing the grant proposal can avoid most problems with staffing, workflow, cataloging, and quality control. This article examines three manuscripts cataloging grant projects in Virginia repositories to discover the problems encountered, explicate lessons learned, and make recommendations for managing future retrospective cataloging projects.

Literature Review
There is a paucity of documentation in the archival literature on managing grant projects. Instead we find fragments that can be applied to writing grant proposals—practical applications of processing times and technical discussions
about MARC AMC cataloging and subject access.1 Articles dealing with processing times cover initial arrangement and description but not the work involved in analyzing the resulting finding aid's suitability as the basis for retrospective conversion. Karen Temple Lynch and Thomas E. Lynch discuss rates of processing manuscripts and archives, and conclude "a rule-of-thumb rate for processing personal papers might fall into the range of 0.5 to 2 linear feet per full-time processor per week."2 Thomas Wilsted provides formulas to compute the total cost of archival processing, including personnel, supplies, and shelving3 (the general shape of such a cost model is sketched after notes 1-6 below). From bench-tests, Lyndon Hart estimates how long it would take one archivist to fully process one cubic foot of pre-1800, pre-1900, and post-1900 personal papers.4 A comparable study remains to be conducted on retrospective conversion of finding aids and catalog cards to machine-readable cataloging for personal papers collections.
Mark A. Vargas and Janet Padway noted that "retrospective conversion of archival cataloging and original cataloging of archival materials are resource-intensive enterprises and should be undertaken only after thorough planning."5 Part of the preparation includes identifying staff with archival cataloging experience or aptitude for learning. In the Milwaukee Urban Archives at the University of Wisconsin-Milwaukee, Vargas and Padway realized that "the archivists had little experience as catalogers but found themselves in that role."6 They pinpointed areas of concern that surfaced during the retrospective conversion project which could have been avoided with pre-planning. Understanding the quirks of the local online public access catalog (OPAC), deciding whether to create authority records in the local database, selecting collections needing improved description, and educating the library's general reference staff about the archives' collections all raise questions that need to be answered before writing a grant proposal.
The literature on grant writing does not deal specifically with managing an archival project, and the library literature concentrates on the technical aspects of outsourcing, quality control, and upgrading the existing catalog card information to comply with current cataloging standards and practice. There is little guidance for evaluating the existing finding aids for their inclusion of information for the required MARC fields, for redescribing or reprocessing a collection to provide a relevant on-line record, or the decision-making process for what level of staff is necessary to handle the conversion.

1 Karen Temple Lynch and Thomas E. Lynch, "Rates of Processing Manuscripts and Archives," The Midwestern Archivist 7, no. 1 (1982): 25-34; Harriet Ostroff, "Subject Access to Archival and Manuscript Material," American Archivist 53 (Winter 1990): 100-105.
2 Lynch and Lynch, "Rates of Processing Manuscripts and Archives," 25, 32.
3 Thomas Wilsted, Computing the Total Cost of Archival Processing, Technical Leaflet Series No. 2, Mid-Atlantic Regional Archives Conference, 1989.
4 Lyndon Hart, 26 August 1994, "RE: Archival processing times [Discussion]," ARCHIVES listserv.
5 Mark A. Vargas and Janet Padway, "Catalog Them Again for the First Time," Archival Issues 17, no. 1 (1992): 49.
6 Vargas and Padway, "Catalog Them Again for the First Time," 53.
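Wilsted's actual formulas are not reproduced in this review, so the Python fragment below only illustrates the general shape of such a cost model (personnel time plus supplies plus shelving). Every rate in it is an invented placeholder, not a figure from Wilsted, and note that the rule-of-thumb processing rate quoted above is stated in linear feet per week.

def processing_cost(feet, feet_per_week, weekly_salary,
                    supplies_per_foot, shelving_per_foot):
    # Structure follows Wilsted's categories (personnel, supplies,
    # shelving); all rates passed in here are illustrative only.
    weeks_of_labor = feet / feet_per_week
    personnel = weeks_of_labor * weekly_salary
    materials = feet * (supplies_per_foot + shelving_per_foot)
    return personnel + materials

# 50 feet processed at 1 foot/week (mid-range of the 0.5-2 rule of thumb):
print(processing_cost(50, 1.0, 600.0, 8.0, 12.0))  # -> 31000.0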
We are all aware of the changes in finding aids as they have evolved over decades. Older finding aids suffer from the repository-specific focus, judgmental comments on the usefulness of an item or collection for research purposes, assumptions that a reference archivist would interpret the contents, and the "dead white men" orientation that ignored the records of women or minorities buried within larger collections. David Stoker editorialized that "providing network access to catalogue records of hitherto under-used materials will inevitably have the desirable effect of encouraging their use by individuals who had no idea they existed."7 Do we, as archivists entering our catalog records into national databases, provide researchers with the outdated finding aid information simply to make them aware of our collections, or do we redescribe the collections to provide meaningful records? Can we afford the time and staff to examine selected collections and reprocess them if necessary before cataloging?
Should we accept the personal, family, and corporate names as written in a finding aid, or do the authority control work to maintain consistency in searching the database? James Maccaferri states that authority control "seeks to assure that the name . . . and subject headings used on bibliographic records are unique, uniform, and correctly formulated . . . and involves editing headings on existing bibliographic and authority records to achieve consistency."8 Archivists accustomed to transcribing the names as used in the collection often rebel against authority control. Avra Michelson argued that archivists "cannot ignore the greater costs associated with excessive searching or failed retrieval" despite the high costs of implementing authority control.9 But the consistency for researchers, who can locate all collections dealing with one person without having to guess the variant spellings or nicknames, far outweighs traditional practice within the institution. Archivists must learn to think beyond the needs of their own institution when embarking on a retrospective conversion project.
Vargas and Padway commented that "before automation, if users were to discover the archival collections, they had to presume that such material existed even though it was not in the OPAC and make the effort to inquire at the general reference desk, where the staff may or may not have known something about the archives."10 This also assumed that a researcher knew which institution to write to or visit.

7 David Stoker, "Editorial: Computer Cataloguing in Retrospect," Journal of Librarianship and Information Science (December 1997): 177.
8 James Tilio Maccaferri, "Managing Authority Control in a Retrospective Conversion Project," Cataloging & Classification Quarterly 14, nos. 3/4 (1992): 146.
9 Avra Michelson, "Description and Reference in the Age of Automation," American Archivist 50 (Spring 1987): 198.
10 Vargas and Padway, "Catalog Them Again for the First Time," 53.
How has subject access been addressed in finding aids, if at all? Avra Michelson concluded in her 1986 study of archival indexing practices that archivists inconsistently chose and constructed subject headings.11 How does the choice of subject headings affect useful retrieval in a stand-alone archival OPAC and in a national database? As Jackie Dooley has noted, "increasingly, archival descriptions are found in the same databases as books, periodicals, visual materials, museum objects, and other media."12 Will we provide general or specific subject access? Dooley stresses that "if high recall is paramount, archivists should focus on providing broad subject access to all collections. If precision is also required, they must learn to assign specific subject descriptors in a consistent manner."13
Do we take what is typed on a catalog card, assume it is an accurate description of the collection, and reproduce it in an on-line database? How should we handle accretions—as separate catalog entries or should we combine them into one record? When catalog cards contain subject headings, do we check the latest version of Library of Congress Subject Headings and Cataloging Services Bulletin to verify that the headings haven't been updated, superseded, or cancelled? All of the above questions should be, but are not always, addressed before undertaking a grant project.
Ruth A. Inman suggests that "two types of skills are needed by catalogers for retrospective conversion and cataloging in general. The 'composing skill' used in the course of cataloging is much different from the editorial skill needed for proofreading."14 This same difference can apply to archivists who create a narrative finding aid in a prescribed format but lack the technical skills to translate the contents into MARC coding. The project manager needs to ask if the processing archivists can learn and correctly apply cataloging principles, if book catalogers can adapt their knowledge and skills to encoding collections rather than single items, or if student assistants can be taught to fill in a preprinted workform from finding aids and catalog cards (a schematic example of such a workform follows notes 11-15 below). Jane McGurn Kathman and Michael D. Kathman suggest that performance measures for student assistants "assist managers in planning and monitoring activities . . ., enable the students to know what is expected of them and decrease the need for constant supervision while improving the quality and quantity of their work."15 How the students—or volunteers, in some cases—fit into the grant project should be part of the preplanning research.

11 Michelson, "Description and Reference in the Age of Automation," 192.
12 Jackie M. Dooley, "Subject Indexing in Context," American Archivist 55 (Spring 1992): 347.
13 Dooley, "Subject Indexing in Context," 351.
14 Ruth A. Inman, "Are Title II-C Grants Worth It? The Effects of the Associated Music Library Group's Retrospective Conversion Project," Library Resources and Technical Services 39 (April 1995): 175.
15 Jane McGurn Kathman and Michael D. Kathman, "Performance Measures for Student Assistants," College & Research Libraries 53 (July 1992): 300.
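The "preprinted workform" approach mentioned above amounts to filling a fixed set of MARC fields from each finding aid or catalog card. The fragment below is only an illustration: the tags shown (main entry, title, extent, scope note, subject) are typical of MARC AMC practice, but neither the field selection nor the sample collection reflects the actual workforms used in these projects; the collection described is invented.

# Minimal stand-in for a MARC AMC workform; each key is a MARC tag.
# Field choices are typical of archival practice, not the projects' form.
workform = {
    "100": "Ambler family.",            # main entry (family name)
    "245": "Papers, 1770-1860.",        # title and inclusive dates
    "300": "2.5 cubic ft.",             # extent
    "520": "Correspondence, deeds, and accounts of a Virginia planter family.",
    "651": "Virginia--History--18th century.",  # geographic subject entry
}
for tag in sorted(workform):
    print(tag, workform[tag])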
Case Studies
Three research repositories—the Library of Virginia (formerly the Virginia State Library and Archives), the University of Virginia Special Collections Department, and the Virginia Historical Society—each applied for and received grant funding between 1990 and 1994 to catalog selected manuscript collections. The Library of Virginia negotiated a two-year cooperative grant through the Research Libraries Group funded by the National Endowment for the Humanities to catalog a portion of its personal papers manuscript collections relating to its strengths in eighteenth- and nineteenth-century southern history.16 The University of Virginia received United States Department of Education Title II-C funding for one year for a projected three-year grant project to catalog all of its manuscript holdings.17 The Virginia Historical Society, following its success with a cooperative rare books cataloging grant, received a Title III.2 Library Services and Construction Act subgrant to catalog a minimum of four hundred manuscript collections for itself and seven other small repositories in the Commonwealth that lacked the resources and staff to catalog their own collections.18

Project Organization
Building on the expertise gained in the two-year Governmental Records Project19 to catalog state government records in the Research Libraries Information Network (RLIN) MARC AMC national database, the Library of Virginia participated with nine other repositories in a cooperative project to catalog manuscript collections.20 The preplanning included identifying the major strengths of the repository, selecting the collections, preparing a work plan and budget, and writing the Library of Virginia portion of the grant proposal.

16 National Endowment for the Humanities, "An RLG Retrospective Conversion Project for Manuscript and Archival Collections," September 1990-August 1992. NEH funded the overall grant at $200,000, of which LVA received $11,093 and provided $16,054 in cost-sharing to create 525 MARC AMC records.
17 U.S. Department of Education, Higher Education Act Title II-C, Strengthening Research Library Resources Program, "Retrospective Conversion of the University of Virginia Library's Manuscripts and Archives," October 1992-September 1993.
18 Virginia Historical Society, "History Library Network Manuscript Retrospective Conversion Project," October 1993-June 1994. The Virginia Historical Society received $19,355.12 from the Library Services and Construction Act and contributed $26,164 in cost-sharing. The LSCA funds covered part of the cataloger's salary and benefits, telecommunications costs to connect to OCLC, and equipment (one computer for Gunston Hall and one for the Virginia Historical Society). The participants in this History Library Network project included the Charlottesville-Albemarle Historical Collection at the Jefferson-Madison Public Library, Gunston Hall, James Monroe Museum, Lloyd House of the Alexandria Public Library, Mariners Museum, Mount Vernon, Valentine Museum, and Virginia Historical Society.
19 For a description of this NHPRC-funded cooperative grant project, March 1989-February 1991, see Marie B. Allen, "Intergovernmental Records in the United States: Experiments in Description and Appraisal," Information Development 8 (April 1992): 99-103.
20 The other participating repositories were the American Antiquarian Society, Cornell University, Emory University, Hagley Museum and Library, Louisiana State University, State Historical Society of Wisconsin, University of Minnesota, University of Pennsylvania, and Beinecke Rare Book and Manuscript Library at Yale University.

Because of the overlap between the end of one grant and the beginning of the second, for three months the project manager administered the two Research Libraries Group grants simultaneously.21 While the archivist who would do the manuscripts cataloging wrote the grant proposal for the Library of Virginia, the University of Virginia and Virginia Historical Society administrators wrote their grants to hire an outside project cataloger. At the Virginia Historical Society, administrators first planned to have the project cataloger train staff in each repository to catalog their own collections. During the two training sessions it became apparent that the project cataloger could catalog all of the collections more efficiently. Half of the repositories had librarians overseeing the collections, and the other half had curators or archivists. None had cataloging experience, but two were willing to try. The training sessions became orientations to the kinds of information needed from each repository to create catalog records, rather than training to create such records.

Funding
Funding agencies specify what they will pay for as part of the grant. Grant funding presents a challenge to institutions to match external dollars with either in-kind services and/or monetary contributions. Will the grant pay for salaries, equipment, and/or computer connection charges? What unanticipated hidden costs might a repository incur? Will the grant pay for what you actually need, or are you adapting yourself to the grant's requirements? How the balance is struck between cost-sharing and grant dollars requested can contribute to feeling that the grant was worthwhile or that it cost more than the repository received.
In order to reach the percentage of in-kind services cost-sharing required by the funding agency, for example, the Library of Virginia needed to purchase one computer and printer and contribute the prorated time of five archivists. The University of Virginia's costs were more substantial: five computer workstations, furniture, and OCLC dial-up charges. The Virginia Historical Society covered two-thirds of the salary and benefits of the project cataloger and travel to the other repositories for site visits. The choice between using grant money to purchase computers or for salaries often depends on what the funding agency will allow. Funding agencies require written reports and documentation on how the money is spent.

21 Although the Government Records Project grant ran through February 1991, the Library of Virginia completed its contracted 2,624 records in November 1990, leaving three months for the evaluation phase. The overlap required a separate spreadsheet to account for the time the same people worked on both grants.
For the personal papers project, as with the Government Records Project, the Research Libraries Group received the grant money, which it reallocated to the participating repositories. Each institution received a different amount of money based on its calculated cost to catalog the collections specified in the grant proposal. The Research Libraries Group set quarterly goals based on production and reimbursed the institutions their share of the grant each quarter when they reached their goal. Part of the project manager's responsibility at the Library of Virginia was to prepare the quarterly statistical report to the Research Libraries Group, listing the number of RLIN records created. The project manager designed a spreadsheet to track the hours worked and wages earned in order to keep the project within the budget (a guess at the shape of that arithmetic is sketched after note 22 below). Both the University of Virginia and the Virginia Historical Society received lump-sum payments at the beginning of the projects, and the project cataloger only needed to write one end-of-year report on accomplishments.

Staffing
Whether to use existing staff or hire a project cataloger often depends upon the organization's structure, regulations governing the hiring of contracted employees, the salary and benefits (or lack thereof) to be offered, the expertise and availability of internal staff, the external pool of qualified applicants, and what the grant will allow. Can the existing staff absorb an increased workload? Is there enough staff to meet the quota set by the grant? What other work can be postponed during the grant period? Does the staff need additional training to work on the project?
Between the time of submitting a grant proposal and notification of the award, internal, unpredictable staff changes can seriously affect the best-laid plans of grant writers and administrators. It is imperative to consider carefully who is going to actually do the day-to-day production work and to have contingency plans in case that person is promoted, reassigned, or quits. By the time NEH notified the Research Libraries Group about the successful funding for the personal papers project at the Library of Virginia, the project manager had assumed supervisory duties that precluded devoting 100 percent of her time to processing, cataloging, and working on the grant.22
The Virginia state government allowed agencies to hire part-time workers for a maximum of 1,500 hours per year and paid no benefits (health insurance, sick or vacation leave), but charged a percentage against the grant to pay social security and federal taxes. Archivists were not included on the state-approved list of positions for which temporary employees could be recruited. These restrictions limited the level at which an employee could be hired and paid.

22 The original grant request from NEH was $355,045, of which the Library of Virginia would receive $31,695 to create 1,500 records. They had to proportionately decrease their goal by the 65 percent drop in funding.
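Footnote 22's proration can be checked mechanically, and the hours-and-wages spreadsheet mentioned above reduces to the same kind of arithmetic. In the sketch below, the proration uses only the figures given in the footnote; the hours_report function is a guess at the spreadsheet's logic, not a description of the actual one.

# Proration from footnote 22: funded share of the requested amount,
# applied to the original goal of 1,500 records.
requested, funded, original_goal = 31695, 11093, 1500
share = funded / requested               # ~0.35, i.e. a 65 percent drop
print(round(original_goal * share))      # -> 525, the contracted total

def hours_report(hours_per_period, hourly_wage):
    # Sum hours and wages across pay periods to compare against budget.
    total_hours = sum(hours_per_period)
    return total_hours, total_hours * hourly_wage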
hired would have to be willing to work twenty-eight hours per week for fifty-two weeks or forty hours a week for thirty-seven-and-a-half weeks a year. To supplement the permanent staff, the archives hired the part-time cataloger who was completing the cataloging on the Government Records Project grant. Because the cataloger was experienced, only minimal orientation was required when switching from the government records workform to the personal papers workform. The University of Virginia and the Virginia Historical Society, however, could hire a professional archivist at a competitive salary plus all benefits, because they were not bound by state government hiring regulations.

One of the drawbacks of hiring a project archivist is not having that person's input into the initial planning process. What the administration conceives as a realistic plan on paper does not always work once the project archivist assesses the goals and compares the production expectations with the catalogability of the finding aids. The University of Virginia retrospective conversion of 10,873 literary and historical manuscript and archival collections expected one full-time cataloger, two paraprofessional staff, and four student assistants to create three to four thousand MARC AMC records each year from catalog cards created over a forty-year period. It became apparent that this was unrealistic. Conversely, the Virginia Historical Society underestimated the number of collections that could be cataloged during the grant project because it hired an experienced project cataloger/southern historian who needed no orientation or training before becoming productive.

Catalogers

In some repositories, one staff member catalogs all materials from books to manuscripts; in others, the technical services department maintains cataloging as its sole responsibility but excludes manuscripts and archives; in still others, the duties are split along material format lines.23 The integration of the archives within the parent institution often determines its relationship with the cataloging department. During the grant project, the library and the archives at the Library of Virginia occupied the same building but in opposite wings, with separate staffs, stacks, reading rooms, and access to collections. The historic separation between the two divisions led to the archives joining the Research Libraries Group to catalog its archival and manuscript collections in RLIN, while the library side provided access nationally via OCLC. For its OPAC the library used the Virginia Tech Library System (VTLS). The archives also decided to use VTLS, but to create a separate catalog for the archival and manuscript collections in order to customize the public display screens and to provide keyword searching, a feature that the library did not offer.

23 Vargas and Padway, "Catalog Them Again for the First Time," 50.
To move from a paper-based finding aid environment to an on-line system, the Library of Virginia archives division hired a cataloging librarian to become the processing section's automation archivist. The original plan, to have all the processing archivists funnel their finding aids through the automation archivist, who would transfer the information to MARC AMC workforms and enter them in VTLS, worked for current accessions but would create a bottleneck during a grant project. When the archives joined the Research Libraries Group, three processing archivists who were formerly librarians attended week-long training on how to catalog in MARC AMC for RLIN. The four remaining archivists—three of whom came to the profession with history degrees only—preferred to remain outside the automation thrust.

At the University of Virginia, however, because the Special Collections Department organizationally reported to the assistant director for technical services, the manuscripts cataloger and the cataloging department staff communicated much more openly. Not until the grant project promised to add over ten thousand new records to the database did the cataloging department take an active interest in the manuscript contributions to the shared on-line catalog. The manuscripts division within Special Collections separates the processing from the cataloging. The processing archivists do not catalog, but forward their finding aids to one professional manuscripts cataloger to create MARC AMC records and enter them in the OPAC. Combined with new accessions and the backlog, this presented much more work than one cataloger could handle. While the Library of Virginia book catalogers knew nothing of the cataloging grant projects the archives pursued, the University of Virginia catalogers contributed their time and resources to the manuscripts cataloging grant project.

The Virginia Historical Society had a fully developed technical services department for books and serials and a separate processing section for manuscripts. The society's library professionals—both catalogers and manuscript processors—prepared detailed catalog cards for their collections. In the early 1990s the catalogers began adding their book records to OCLC, and the processors soon followed with manuscript collections. But because the library did not have its own OPAC until 1998, they continued to have OCLC generate card sets. The other seven small repositories forming the History Library Network with the Virginia Historical Society joined OCLC to provide access to their holdings, but none had a local OPAC and only one had an OCLC terminal.

Deciding who will catalog the manuscript collections depends not only on the availability of staff but on in-house expertise and support. While the lack of communication between archivists and catalogers at the University of Wisconsin-Milwaukee hampered Vargas and Padway,24 the archivists had the expertise of catalogers to call upon had they expressed their needs and expectations more
clearly. The Library of Virginia archivists relied on the staff at the Research Libraries Group and archival colleagues outside the institution to answer their questions. The University of Virginia and the Virginia Historical Society infrastructures supported the project cataloger.

24 Vargas and Padway, "Catalog Them Again for the First Time," 51-52.

Record Quality

When designing a retrospective manuscripts cataloging grant project, it is important to consider the source for the cataloging record. The older a repository, the more likely it is that the finding aids were written following outdated formats and styles and lack the components needed to create a modern MARC record. Evaluating the collections to be cataloged not only for their topical, temporal, or geographical relevance to the project, but also for their compliance with current standards of description, will give a realistic idea of the magnitude of the preparatory work needed in the pre-grant phase. Anywhere from minor to major amounts of reprocessing may be necessary before collections can be cataloged. Given the increased workload archivists face as repositories lose staff positions to budget cuts or attrition, or as the number of collections accessioned increases, each repository must choose between creating minimal-level cataloging records in the hopes of someday enhancing them or creating full-level records the first time. Archivists no longer have the luxury of spending days conducting research for a single collection's record. The decision to create minimal-level records ensures that this is all that will ever be done. Striking a balance between the two extremes by selecting which collections will receive full treatment and which will receive minimal treatment is crucial to wise allocation of time and resources. In the last ten years, granting agencies such as the NEH and the U.S. Department of Education diverted funds away from initial processing and reprocessing. With the requirement to contribute the catalog records to national databases such as OCLC or RLIN, repositories must conform to national standards. For example, a catalog record needs to contain at least one Library of Congress subject heading. Bringing old finding aids up to current standards is a labor-intensive, time-consuming task.

In writing the grant application, the University of Virginia's library administration relied on the advice of an outside consultant who estimated that the retrospective conversion of the entire collection could be accomplished in three years working solely from the catalog cards, without doing authority work or consulting the finding aids, control folders, or collections. The OPAC would simply replicate the card catalog with its outdated subject headings, minimal description, inconsistencies, and nonstandard forms of personal and corporate names. No attempt was made to evaluate the collections for the degree of difficulty in cataloging based on the completeness of the catalog card, the currency of the finding aid, the complexity of the collection, or lack of adequate description.
After submitting the proposal, the library administration instead wanted full-level, accurate records ready to tapeload to OCLC and RLIN, and that took a lot more time per record than the proposed workplan allowed.

Work Flow

Because the Library of Virginia processing archivists lacked an in-depth understanding of MARC AMC and cataloging rules, the automation archivist reviewed the printouts for both typographical and coding errors. This created a bottleneck and would have posed a potential workflow problem had the quota of records been larger. The archives acquired, processed, and cataloged new accessions throughout the two-year grant period, but the grant work received priority handling because of the quarterly quota and reimbursement for salaries. Even though the Library of Virginia renegotiated the number of records to be created from 1,500 down to 525, proportional to the decreased funding, it still could not meet this goal with one part-time cataloger. Therefore, it enlisted other full-time archivists to assist with the cataloging. The project manager assigned each archivist five years of the Annual Report of Accessions to identify recent personal papers collections to catalog. The part-time cataloger and project manager cataloged the collections specifically named in the grant application.

The project manager already had devised a printed, encoded RLIN workform for personal papers collections to fill in and give to the data entry person. For the grant, a revision of the workform provided prompts and hints for catalogers accustomed to dealing with maps and with state and local government records (see Appendix). To relieve the catalogers from repetitious writing, certain key fields that appeared in every record, the structure within each field, and the field order to maintain consistency across records were preprinted on the form. The same form was adapted from RLIN-specific coding to NOTIS for the University of Virginia and then to OCLC for the Virginia Historical Society. Even though the workform contained preprinted fields and their definitions, not every archivist wrote the information in the correct field. The project manager reviewed the workforms created by the archivists to ensure their completeness, the use of appropriate subject headings, and adequate and relevant description.

Prior to the arrival of the project cataloger at the University of Virginia, the department head hired the two library assistants, and the project supervisor (who was the permanent manuscripts cataloger) hired and trained the student assistants. The project supervisor had the students start photocopying the shelf list cards beginning with the earliest accessions. This created a problem: the oldest cards contained the least information and necessitated consulting the control folders, finding aids, and often the collections themselves. The added research slowed down the cataloging process.
The project staff had to flip the process and catalog the recent accessions that had the most accurate and most complete description, then work backward, to come anywhere close to creating three thousand records the first year. By examining the shelf list cards and control folders, the project staff realized that they could not breeze through this retrospective conversion project the way the consultant had suggested. The project cataloger trained the library assistants in manuscripts cataloging and set monthly goals for each employee, goals which all found impossible to meet.25

25 For 3,000 records, the project cataloger divided the work among four student assistants and two library assistants to average 40 records each per month, plus the project cataloger's own contribution of 120 records. The complexity of the collections thwarted this goal.

The majority of shelf list cards needed extensive revision and fuller description before they could be entered into the OPAC. Some needed separate descriptions of unrelated items grouped together that had been purchased from one dealer, while others needed to be combined with later accretions. The intricacies of separating and combining left the catalogers with a skewed count of collections described. As the project staff moved further back in accessions, work proceeded more slowly as they unraveled and rewove descriptions. Necessary, but not realistically foreseen, was some amount of reprocessing, even if it were only a matter of redescribing from the original documents what the shelf list card failed to record. Extensive reprocessing was set aside for the processing unit to complete separately from the grant project.

Collections previously reported to NUCMC were separated from later accretions not reported; the accretions were then cumulated and described together to create a separate record while maintaining the integrity of the original NUCMC record. The project staff searched RLIN to obtain the printouts of University of Virginia records described by NUCMC and entered into that national database. Midway through the project they found out that the RLIN NUCMC tapes were being downloaded into OCLC and had to adjust their procedures to prevent tapeloading the same record from the OPAC to OCLC.26

26 The initial ninety-six records on the tape stayed in the test database and were never dumped into the main database. Tapeloading to RLIN and OCLC continued until the University of Virginia switched from NOTIS to SIRSI around September 1996. When the systems staff solves a technical problem, the University of Virginia will resume tapeloading.

A variety of approaches needed to be instituted for the History Library Network project at the Virginia Historical Society. Some of the repositories had finding aids or card catalogs of their holdings. Others (Mount Vernon, Gunston Hall, James Monroe Museum) interfiled all accessions in one chronological order because the collections dealt with one person, one family, or a succession of property owners. The project cataloger visited Gunston Hall and the James Monroe Museum to survey their collections and determine how best to create logical groupings of the manuscripts, then returned to catalog them onsite. Lloyd House and Mariners Museum librarians mailed their workforms to the project cataloger based at the Virginia Historical Society. The project cataloger visited
the Charlottesville-Albemarle Historical Collection and Valentine Museum to assist them in selecting the collections and returned to examine problematic collections. Mount Vernon hired a part-time graduate student to write descriptions of the papers of individual Washington family members and of discrete parts of the collection, which were then sent to be cataloged. The Virginia Historical Society provided a list of targeted collections to encode in MARC AMC from their detailed shelf list cards.

Quality Control

Because of the relatively low number of records being created for the personal papers project at the Library of Virginia, the library used one data entry person, whose work the automation archivist proofread and corrected in daily batches. The data entry person made the corrections to the saved RLIN records, passed them into production mode, and batch downloaded them to the local OPAC.27

27 The archives hired temporary data entry people through Kelly Services. During the Government Records Project they went through a succession of people totally unfamiliar with cataloging until finding an intelligent, sharp young man who quickly recognized coding errors or omissions and corrected them.

At the University of Virginia the library assistants and project cataloger proofread and corrected the students' work before two other student assistants entered the data from the workforms into the OPAC. The staff set up macros on the OPAC terminals to speed up data entry and to eliminate the need to proofread every field. The project cataloger edited the final printouts every night at home and corrected them on-line the next day, to save an extra proofreading step, instead of returning them to the student assistants to correct. One of the library assistants specialized in MARC MAP, and both assistants combined accretions with existing OPAC records as well as created MARC AMC workforms for the more complex collections, which were beyond the students' abilities.

Early in the project, the staff realized that the consultant's recommendation not to do authority work did not fit the library's need to have a clean catalog. The project cataloger and library assistants searched the name authority file in OCLC and downloaded authority records not already in the OPAC. Adding this step increased the uniformity of names in the catalog but also added to the workload of the catalogers. Philosophically, they knew it was unrealistic to do a quick-and-dirty job of data entry on the assumption that someone in the near-to-distant future would go back and clean it up. So they opted for fewer but more accurate records in the database.

Although the student assistants were remarkably prolific in creating workforms, their inexperience produced a substantial amount of work to proofread and correct before data entry. Their work always required extensive revision. It
was more efficient to sort the shelf list photocopies by degree of difficulty and topic (literary or historical) and assign batches to each student assistant based on his or her interest and ability. By the end of the project, the staff had created 2,079 workforms but entered only 1,384 in the OPAC.28

28 Of the 2,079 workforms created (69.3 percent of the annual goal of 3,000), 1,483 (71.3 percent) were proofread and corrected, and 18 records were combined with existing records in the OPAC; 93.3 percent of the proofread and corrected workforms were entered into the OPAC, yielding the 1,384 records noted above.

For the History Library Network, the project cataloger created the MARC records with OCLC's Cataloging Micro Enhancer (CATME) software on separate diskettes for each repository, printed the records, and mailed batches of them to the contacts at each repository to proofread and edit. When they either returned the proofed copy or phoned in their corrections, the project cataloger made the changes in CATME, then batch-loaded the records to OCLC, keeping separate statistics for each repository and reporting the monthly progress to the whole group.

Public Access

Designing how the records will display on-line may be an adjunct benefit of a grant project, especially when the institution is in the early stages of automation. At the same time that the Library of Virginia embarked on the Government Records Project grant and became a member of the Research Libraries Group, it also negotiated with VTLS to customize its OPAC to accept archival records. Because the library division at the Library of Virginia already used VTLS as its book catalog, the archives division decided to use it as well. Rather than integrate the archival records, the administration opted to set up a parallel catalog with customized field displays, such as "creator" rather than "author," and to offer keyword searchability that the library's database did not yet have. Until at least three thousand archival records filled the database and the archives and VTLS resolved all the display problems, the catalog was not available to researchers. The administration felt that researchers would be frustrated by the meagerness of the database and would be set up for disappointment when their expectations of finding collection records were not met.

During this period the project manager also faced the task of educating the reference archivists in how to use the archives OPAC and sought their help to refine it for better reference use. RLIN was only available on one computer in the processing section, and the data entry person received priority access. The project manager planned and executed training and orientation sessions with the reference archivists, most of whom were unaccustomed to, or at least uncomfortable with, using the OPAC, and printed a guide sheet to help them with their searches. The opportunity to
work on the grant focused attention on the personal papers and afforded the archives the chance to share information with researchers across the country via RLIN about a portion of the historically significant manuscript collections in the Library of Virginia.29

29 The ten institutions added approximately seven thousand records to the RLIN database as part of the grant project.

At the University of Virginia, the project staff coordinated with the screen design committee to customize the OPAC display screen for manuscript materials in its integrated catalog. The library already had help screens on-line to assist patrons in recognizing the component parts of a catalog record. The Virginia Historical Society did not have an OPAC until 1998. The manuscripts processing section typed voluminous, copiously detailed catalog cards for the public card catalog, in addition to preparing finding aids for large collections. The retrospective conversion project created MARC records for OCLC, which provided printed catalog cards; only two OCLC terminals were available in the technical services department.

Outcomes

The Library of Virginia insisted that every processing archivist also catalog his or her own collections, and the level of competency varied widely among the staff. Of the three archivists who received the initial week-long MARC AMC cataloging training from an RLIN staff member for the Government Records Project grant, only one remained employed at the Library of Virginia when the personal papers grant project began. The non-librarian archivists needed a crash course in archival cataloging. Despite sending everyone to the two-day "Understanding the USMARC Format for Archival and Manuscripts Control" workshop30 to provide them with the same basic information, each archivist absorbed the information and practiced it with different degrees of skill. The reluctance of some to do what they considered library work manifested itself in inadequately described collections. Consequently, closer supervision and review of their work was required of the project manager. Two archivists accidentally cataloged the same collection, with intriguing results that exemplify the inexact science of manuscripts cataloging, as Avra Michelson documented.31 One wrote an excellent description of the collection in the 520 scope note field but failed to assign an appropriate and sufficient number of subject headings. The other wrote a cursory description but selected pertinent subject headings. Combining the best of both created one good record.

30 The workshop, taught by Kathleen Roe (New York State Archives) and Debbie Pendleton (Alabama Department of Archives and History), was offered by the Society of American Archivists at the Mid-Atlantic Regional Archives Conference meeting in Alexandria, Virginia, on October 31 and November 1, 1990.

31 Michelson, "Description and Reference in the Age of Automation," 192-208.

The necessity to catalog daily
during the project reinforced the skills learned in the workshop, and the repetitive, regular practice, coupled with positive feedback, increased some of the archivists' confidence in their cataloging abilities.

One of the benefits of the University of Virginia retrospective conversion project was the amount of new material discovered in the collections. Much of it had been unrecognized, deemed unimportant at the time of receipt, or poorly cataloged. For example, a Fitz John Porter document on the second battle of Bull Run in the Civil War was originally described in the catalog record as "whining by a disgruntled Yankee." In another collection, John Randolph of Roanoke gave instructions to an "unidentified ship captain" who turned out to be the young Matthew C. Perry. The old cataloging emphasized politics and war, ignoring women, social history, economics, and slavery. The project's cataloging discovered materials relevant to current research topics, drew out local history connections, provided a breadth of subject access, and highlighted single items of unusual interest among larger collections.

The University of Virginia grant project staff produced records of uniformly high quality. They carefully reviewed previous catalog records, modern collection descriptions, and the collections themselves to provide succinct and accurate summaries designed to meet the needs of contemporary researchers in a variety of disciplines. The University of Virginia's statistics show a 17.5 percent increase in patron usage of manuscript collections during the grant project and an 11 percent increase during the following fiscal year. The number of research visits by these patrons increased 6.5 percent during the grant. Interestingly, in each year immediately after the first grant and a second one in 1994-95, in-person patron registration and the number of research visits steadily declined. The staff believes the use of manuscript collections continued to increase, but requests shifted from in-person visits to e-mailed, faxed, phoned, and mailed requests from outside Charlottesville. (They did not begin keeping these statistics until 1995.) The supposition that researchers discover the collections through on-line database searches and then request either additional information or photocopies indicates a change in researcher habits and methodology.32

32 Unfortunately, the big push to catalog the collections in MARC AMC was abandoned. The University of Virginia took advantage of an available grant to do encoded archival description for the finding aids. The manuscripts cataloger does the current retrospective conversion work at the rate of one to five records per day on the literary collections whose finding aids are now encoded and the historical collections with poor catalog records, no finding aids, or finding aids in need of rewriting.

During a nine-month period, the History Library Network project at the Virginia Historical Society contributed 494 catalog records to OCLC. Although they originally expected each repository to contribute fifty records, some did not have that many collections and others exceeded the goal.33 Because the
project cataloger completed the last six months part-time, it is conceivable that the Mariners Museum and Virginia Historical Society each might have been able to contribute one hundred additional records had she remained full-time. The challenge for Virginia Historical Society records lay in distilling the extensive catalog card descriptions into a cogent summary and including the most important subject headings.

33 The final tally per repository shows: Charlottesville-Albemarle Historical Collection, 48; Gunston Hall, 48; James Monroe Museum, 33; Lloyd House, 54; Mariners Museum, 106; Mount Vernon, 50; Valentine Museum, 54; and Virginia Historical Society, 101.

Conclusions

The experience of these grant projects leads to several conclusions about administration, staffing, and record quality.

Administration:

• The staff must have input into the plan for the project in order for a work plan to meet or exceed its goals. For example, the Library of Virginia staff had control over development of its work plan, and consequently the library created more records than were required.34

• When writing a grant proposal for manuscript retrospective conversion projects, it is important to consult with a manuscripts cataloger who is familiar with your collections if you plan to hire a project cataloger.

• Target specific collections rather than attempt to catalog the entire archives if the holdings are large and the finding aids were created more than ten years ago.

• Aim for a more reasonable annual goal. What at first glance appeared to be a simple retrospective conversion project at the University of Virginia proved over the course of one year to contain complex decision-making strategies and unrealistic expectations.35

• Be realistic about how many records can be created given the complexity of the collections and the adequacy of the existing finding aids.

• When selecting which collections to catalog, do a sampling of types of collections (level, age, and content of existing descriptions), and benchmark cataloging time for collections representing these factors. Despite
the suggestion in the literature that an average of two to two-and-one-half hours of work per record is the norm,36 a realistic assessment of your own collections' needs will provide more accurate data for your retrospective conversion project.

34 In addition to the required 525, they added 52 additional personal papers collection records, plus 1 organizational record, 17 business records, 6 church records, 77 genealogical records, 70 maps, 1 county government record, 19 state government and 55 agency history records, and 3,353 Bible records.

35 The University of Virginia did apply for and receive another one-year Title II-C grant for $123,621 to continue the manuscripts cataloging, October 1994-September 1995 (with a six-month no-cost extension), but without the expertise of the initial grant staff, who had all accepted permanent positions elsewhere. This grant focused on major historical collections processed before 1960, papers and architectural drawings of Thomas Jefferson, and collections that appeared in multi-collection guides such as the "Guide to Revolutionary War Collections." Because of the tighter focus, they cataloged 1,800 collections in the grant year plus another 601 during a six-month extension, still 99 records short of the original goal of 2,500.

36 Patricia Cloud, "RLIN, AMC, and Retrospective Conversion: A Case Study," Midwestern Archivist 11 (1986): 125-34; Patricia D. Cloud, "The Cost of Converting to MARC AMC: Some Early Observations," Library Trends 36 (Winter 1988): 573-83.

Staffing:

• Hire an archivist with expertise in the subject matter of the collections. The project manager/project cataloger developed an expertise in Virginia and southern history over the course of four years that greatly facilitated the application of Library of Congress Subject Headings.

• The number of catalog records created per person increased with each project.37 When dedicating 100 percent of work time to cataloging, productivity rose.

• The most knowledgeable and experienced cataloger did not always contribute cataloging records because of administrative responsibilities, thus depriving the project of these skills.

• Ensure that the archivists have and maintain cataloging skills through regular practice.

• Using student assistants who lack manuscripts cataloging experience to create workforms increases the proofreading and editing responsibilities of higher-level staff and creates a production backlog. Student assistant tasks should be geared to their ability.

37 In twenty-four months, one FTE created 24 records per month at the Library of Virginia; in twelve months, four FTEs created 43 records each per month at the University of Virginia; and in nine months, one PTE created 54.8 records per month at the Virginia Historical Society.

Record Quality:

• The more people engaged in the project, the more variable the quality of records, and the more time it takes to review them.

• The older a repository is, the greater the likelihood of significant variations in the quality and scope of finding aids, and the longer it will take to catalog the collections.

These grant experiences demonstrate that we do a disservice to our staff and to researchers when we force archivists with no library cataloging training to become intermittent catalogers. If processing archivists have an understanding of what makes a good finding aid and can write one, then cataloging archivists can create MARC AMC records that will provide enough information for the researcher to know if a collection potentially would be useful.

Preplanning will ensure greater success with a manuscripts cataloging retrospective conversion grant. Managing the project includes not only meeting the numerical goals but knowing how to reach them with proper staffing, a clear understanding of the complexity of the project, a realistic workplan, and the skills and tools necessary to do the job while creating an enjoyable experience.
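The throughput comparison in note 37 reduces to a single calculation: records created divided by the product of months and catalogers. A minimal sketch, assuming a hypothetical function name (the totals are the article's published figures; nothing else comes from the source):

    // Records cataloged per cataloger per month, as compared in note 37.
    function recordsPerCatalogerMonth(
      totalRecords: number,
      months: number,
      catalogers: number
    ): number {
      return totalRecords / (months * catalogers);
    }

    // Virginia Historical Society: 494 records in nine months by one cataloger.
    console.log(recordsPerCatalogerMonth(494, 9, 1).toFixed(1));
    // "54.9", matching note 37's figure of 54.8 to within rounding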
Appendix

PERSONAL PAPERS WORKFORM

[The printed workform does not survive digitization intact; the outline below preserves its legible structure.]

In-house use: Coded by / Verified / Input / Proof (your initials). The ID fixed field is filled in by the data entry person.

RLIN fixed fields: BLT (b; a for a miscellaneous bound collection, in which case a book workform must be completed first, with fields 580 and 773 added to this workform; c for collection); L (change if not English; mul for multilingual); PC (i = inclusive, s = single, n = no date, q = questionable date); PD (dates; for a single-date item use the first blank; for inclusive dates within the same year, put the year in both blanks); plus the preprinted codes RTYP:d, ST:s, EL:7, CC:9554, DCF:a, CSC:d, PROC:b, PP:vau, and blanks for REP, MMD, OR, POL, DM, RR, COL, RML, GEN, and BSE.

RLIN variable fields (end all fields with a period): 041 cataloging source; 100 author; 245 00 title, with inclusive dates; 300 physical description (items, leaves or pages, and/or cubic ft.); 533 reproduction note (state if a partial reproduction, e.g., "In part, photocopies"; if the entire collection is reproduced by various methods, "Photocopies and negative photostats"); 545 biographical note (occupation/residence); 520 0# summary; 555 8# finding aid note, preprinted "Inventory available in repository; item/folder level control"; 506 access restrictions; 540 terms for use; 524 preferred citation, preprinted "[Cite as:] Accession ___, Personal Papers Collection, Virginia State Library and Archives, Richmond, Va."

Additional 5XX and access fields: 600/610 subject names; 650 topical and 651 geographic Library of Congress subjects; occupation of author; 655 form/genre terms (alphabetical), including the preprinted "Personal papers."; 700/710 added entries (alphabetical), with the preprinted relator term "correspondent"; 740 01 other title, preprinted "Personal papers collection"; 851 location, preprinted "Virginia State Library and Archives, Archives and Records Division, 11th St. at Capitol Sq., Richmond, Va. 23219."

ARC segment (process control): PCDP process control display permit; PCID process control ID number; ACCN accession number and date; MATL material specified; source and address; MTHD method of acquisition (e.g., gift, lent for copying); OWNR owner and address, if different from the source; and numbered actions (ACID, ADP, ACT, TAC) recording when the collection was Received, Described, and Preserved, with the processor as action agent (AGT), times of future action, and preservation method (METH).
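Where the form itself is no longer legible, its preprinted skeleton can still be summarized in a structured way. The model below is a modern reading aid only, not anything the projects used: the type and variable names are invented, while the MARC tags and preprinted strings are taken from the legible parts of the appendix.

    // A sketch of the preprinted personal papers workform: each entry pairs a
    // MARC field tag from the appendix with its prompt and, where the form
    // supplied one, the constant text preprinted for every record.
    interface WorkformField {
      tag: string;          // MARC field tag
      label: string;        // prompt printed on the form
      preprinted?: string;  // boilerplate supplied so catalogers did not rewrite it
    }

    const personalPapersWorkform: WorkformField[] = [
      { tag: "100", label: "Author (main entry)" },
      { tag: "245", label: "Title, with inclusive dates" },
      { tag: "300", label: "Physical description (items, leaves/pages, cubic ft.)" },
      { tag: "520", label: "Summary (scope and content)" },
      { tag: "545", label: "Biographical note (occupation/residence)" },
      {
        tag: "555", label: "Finding aid note",
        preprinted: "Inventory available in repository; item/folder level control.",
      },
      {
        tag: "524", label: "Preferred citation",
        preprinted: "Accession ___, Personal Papers Collection, Virginia State Library and Archives, Richmond, Va.",
      },
      { tag: "650", label: "Topical Library of Congress subject headings" },
      { tag: "655", label: "Form/genre terms", preprinted: "Personal papers." },
      { tag: "700", label: "Added entries", preprinted: "correspondent." },
      {
        tag: "851", label: "Location",
        preprinted: "Virginia State Library and Archives, Archives and Records Division, 11th St. at Capitol Sq., Richmond, Va. 23219.",
      },
    ];

A structure like this captures why the form relieved "repetitious writing": the constant fields travel with the template, and only the collection-specific prompts are left blank.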
work_j2otfcp5lveg7enmoeqtnwzebi ---- Creating a digital library of three-dimensional objects in CONTENTdm

OCLC Systems & Services

Creating a digital library of three-dimensional objects in CONTENTdm

Maura Valentino, Center for Digital Scholarship and Services, Oregon State University, Corvallis, Oregon, USA
Brian Shults, Digitizing and Copying Center, University of Oklahoma, Norman, Oklahoma, USA

To cite this document: Maura Valentino, Brian Shults, (2012), "Creating a digital library of three-dimensional objects in CONTENTdm", OCLC Systems & Services, Vol. 28, Iss. 4, pp. 208-220. Permanent link: http://dx.doi.org/10.1108/10650751211279148

Abstract

Purpose – This paper aims to describe a project conducted by the University of Oklahoma Libraries to create a digital collection consisting of three-dimensional scientific objects.

Design/methodology/approach – The University of Oklahoma Libraries developed the following methodology for creating a digital collection of three-dimensional objects. Digital still photographs of six sides of each object were created. These photographs were then used to create videos that emphasized the most interesting feature on each side of the object. These videos were then imported into CONTENTdm using the picture cube feature to create the digital representation of the three-dimensional object.

Findings – This method was found to work well for representing three-dimensional objects in a two-dimensional format for inclusion in a digital collection. However, some limitations were encountered.
For example, only one interesting feature could be emphasized on each side of the object, and the software used to create the digital videos, while easy to use, offered only limited features for enhancing the resulting videos.

Practical implications – This paper demonstrates a cost-effective and resource-efficient method of implementing a digital collection of three-dimensional objects that could be further improved through the use of more advanced video-creation software.

Originality/value – This paper offers insight into a new way of representing three-dimensional objects in a digital library. This information will be useful to digital librarians faced with resource and cost constraints who have collections of three-dimensional physical objects that would be of interest to their user community.

Keywords: Digital collections, Three dimensional objects, CONTENTdm, Art collections, Digital libraries, Digital images

Paper type: Case study

(Images are courtesy of the History of Science Collections, University of Oklahoma Libraries.)

Introduction

The University of Oklahoma Libraries History of Science Collections houses a collection of unusual scientific instruments and historical artifacts. This collection is stored in various physical locations, and most library users are unaware of its existence. It therefore presented an ideal candidate for digitization. Digitization would enable users to locate and preview the collection online. Once the user had perused the objects in the collection online, they could then visit the objects in person if required. The digital collection would also enable users to obtain pictures of the objects for their own use.

However, this project presented several unique challenges related to the three-dimensional nature of the objects comprising the collection to be digitized. A traditional digital library inherently consists of two-dimensional objects. For example, many digital collections consist of digital photographs or digital representations of two-dimensional documents. Technologies and file formats for creating and storing two-dimensional objects, such as Joint Photographic Experts Group (JPEG) or Portable Document Format (PDF), are widely available, fairly easy to master, and reasonably inexpensive. In contrast, while software exists that can render accurate digital representations of three-dimensional objects, such software is costly and requires specific technical expertise not often found in library organizations. At the present time there is a lack of easy-to-use, cost-effective technologies for creating digital representations of three-dimensional objects. This project was designed to attempt to find solutions to these problems. And while HTML5 offers rendering of simple, symmetrical three-dimensional objects with minimal coding, it is not yet the solution for more complex objects such as those contained in this collection.

Literature review

Librarians and museum curators have been considering the utility and form of three-dimensional digital collections for some time, and yet a review of the existing academic literature reveals less than might be expected.
The majority of scholarship on the topic can be divided into three occasionally overlapping categories:

(1) pedagogical – how archived and accessible three-dimensional objects will improve teaching and education;

(2) research – the ways three-dimensional objects will improve research about the physical objects and the fields of study the objects belong to; and

(3) how to – similar to this article, this category focuses on how and with what software the digital collection of three-dimensional objects was assembled.

The pedagogical utility of a two-dimensional digital object is sufficient provided the object itself, for example a book page or a photograph, is itself two-dimensional. The additional pedagogical utility of a three-dimensional digital object depends on the nature of the object or artifact being digitally captured. If the object has three dimensions, then its digital presentation as a three-dimensional object offers more information for the user. Rowe and Razdan (2003, p. 3) created a prototype digital library for three-dimensional collections, summarizing the state of digital libraries thus: "Today digital museum collections and digital libraries include text, graphics, images, and increasingly, video, sound, animation, and sophisticated visual displays. Some now display three-dimensional objects and permit the user to rotate and view an image of the original object in their browser window using QuickTime, plug-ins, or custom applications. Examples range from presentation of objects for research or public access to time-lapse movies of exhibit construction and panoramas of exhibitions." Surprisingly, nearly a decade later this is still largely the technological milieu of three-dimensional digital libraries, though with the advent of HTML5 and web browsers featuring improved support of 3D rendering, technological advances are likely to usher in a wave of efficient and affordable capabilities to our digital shores.

Borgman et al. (2000, p. 228) studied the pedagogical utility of digital libraries and their role in developing students' scientific thinking, stating, "Digital libraries offer a wealth of opportunities to improve access to information resources in support of both 'traditional' on-campus instruction and distance-independent learning." Therefore it is important that librarians become adept at representing three-dimensional objects digitally in order to best serve user communities. Borgman et al. (2000, p. 232) note, "By understanding how students will use a digital library in the context of scientific thinking, it is possible to construct the digital library in a way that will support the underlying processes. In short, digital libraries are more than storehouses of information; they should be aids to the question-asking, information-gathering, information-organizing, information-analyzing, and question-answering processes of users." The more information provided by a digital object, the greater the utility of that object for the digital library patron.

Not to be overlooked is the novelty of three-dimensional digital objects. The researchers Liesaputra and Witten (2009, pp. 116-117) found that adding a feature as simple as three-dimensional page turning increased user retention of information. The researchers discovered users were more engaged and "answered questions faster using the page-turning model than the other two formats. [...]
The difference between the page-turning model and its closest competitor [were] statistically significant[.]" Qualitatively, the subjects studied commented that they preferred the page-turning model, particularly for longer digital books, "because the pagination breaks up the flow of information and helps them remember where things are in terms of their physical position on the page, allowing them to concentrate on searching rather than navigation. All subjects judged the page-turning book to be the format that is most engaging, natural and intuitive." In this case, mere novelty increased the pedagogical value of the digital object.

The increasing sophistication and affordability of the technology necessary to support digital libraries will soon present libraries and archives with a wealth of opportunities to make resources available online. As Eden (2007, p. 247) argued, "[T]he appearance of the Internet in human culture [...] has produced the capacity to graphically and visually represent ideas, problems, challenges, solutions, and results, not as one-dimensional paradigms or presentations as in previous centuries, but in two or more dimensions, allowing the human mind to radically and instantly perceive new ways of solving and representing information." Three-dimensional objects contain the capacity to reshape research, in addition to providing new avenues for research partnerships across locales. It is imperative that information professionals, such as librarians, begin to shape the methods by which these three-dimensional objects are archived, accessed, and made discoverable for present and future generations. As Eden (2007, p. 247) emphasizes, "[T]he next generation has been preparing itself for a future in which virtual collaboration with others globally will be the norm instead of the exception, and the fields of secondary and higher education are well behind the curve in addressing the learning needs of the future." Digital objects, three-dimensional or otherwise, serve not only as a pedagogical tool with which to increase instruction but also as a practical mode of collaboration.

Moreover, the three-dimensional physical objects that libraries, special collections, and museums choose to digitize are often rare, antique, or fragile. Enabling access to digital representations of these types of objects helps to reduce the hazards presented by repeated physical handling of the objects by numerous individuals. Seulin et al. (2004, p. 120) generated three-dimensional digital objects of ancient wooden stamps that were engraved to illustrate manuscripts. The digital objects permitted the user "interactive, realistic and non-photorealistic visualization of engraved stamps via local or remote access [...]", thus preventing wear and tear on the original physical object. Not only was damage averted, but researchers from distant locations were able to inspect the stamps in detail, with some researchers using the depictions and metadata obtained from the digital library to create facsimiles for their own study. Similarly, in 2004, IBM partnered with the Egyptian government to render three-dimensional digital animations of ancient Egyptian artifacts such as the throne of King Tutankhamen.
The project, titled Eternal Egypt, "produced multimedia animations, 360-degree image sequences, panoramas of important locations, virtual environments, three-dimensional scans, real-time photos from web cameras and thousands of high-resolution images of ancient artifacts that weave together seven millennia of Egyptian culture and civilization" (www.eternalegypt.org). Again, this illustrates the growing and diverse ways in which three-dimensional digital objects may aid scholars in their endeavors.

For librarians tasked with implementing a digital library containing three-dimensional objects, explicit explanations of protocols, processes, and procedures that have been used successfully to create such collections are vital to an effective implementation and an efficient use of resources. The previously mentioned articles by Rowe and Razdan, Liesaputra and Witten, and Seulin et al. also describe the specific methods they developed to create their digital collections. Their techniques varied widely and were contingent upon their specific resources and desired outcomes. As Politus et al. (2005, p. 1) note, there are two necessary elements of any three-dimensional digital object collection: "a database where all information about the exhibits, models, etc. is kept and a renderer which is responsible for graphically representing all this information on the computers screen." Librarians are familiar with the need to couple data and metadata, be it a book and catalog card or a digital object and its related digital metadata. Librarians must ensure the data and metadata continue to be functional as the tides of technology transform and new methods of retrieval and presentation become available. Doyle et al. (2009, p. 46) studied the issue of digital preservation of three-dimensional data, stating, "Two of our primary preservation requirements were to ensure that the preserved object remains authentic and usable through time for our future users, allowing them to access, view and interact with the object in the same way as users in the past could." They accomplished this by taking care to preserve the software used as well as the digital object itself, and by applying a metadata framework that was thorough and interpretable to the user.

Existing collections

In addition to the academic literature on the topic, existing digital libraries can be studied to determine how institutions have represented three-dimensional objects. The following projects provide a representative sample of such collections:

• The Digital Library of South Dakota maintains a collection entitled "The Altered Books Collection". This collection is comprised of books. However, these are not typical books; rather, they are works of art made from books. This digital library presents one picture of each object, arranged in the best view for each three-dimensional object.
- Texas Woman's University Libraries University Archives contains some three-dimensional objects but provides only one picture for each.
- The Oregon Health & Science University Digital Resources Library Historical Collections and Archives holds some three-dimensional objects, but offers only one picture in most cases and occasionally two.
- MoMA in New York City has some sculpture and other three-dimensional objects in its online collections but has only one picture of each.
- The Metropolitan Museum of Art also offers only one picture of three-dimensional works of art.
- The Perseus project, from Tufts University, offers anywhere from one to thirty-one pictures of sculptures to focus in on various details. This is the collection that shows the most views and takes into account that a three-dimensional object must be shown from several angles for the user to understand the object completely.
- The Digital Library of Appalachia has a picture of each three-dimensional art object and also has videos of the object spinning. The videos cannot be paused or slowed down; on a fast computer they spin very fast. While this digital library recognizes that three-dimensional objects need to be viewed from every angle, it is not user friendly because the objects spin so quickly.
- The University of Oklahoma Health Sciences Library is also working on a collection of scientific instruments. They are using three-dimensional photography with a two-lens system that attaches to a one-lens camera. This system takes two simultaneous pictures that can be displayed side by side. When one looks at these pictures through three-dimensional glasses, a three-dimensional image appears. However, this system has limitations in terms of display, as the user needs three-dimensional glasses.

The project

Software exists to enable a detailed rendering of a three-dimensional model of a physical object that can be manipulated in three dimensions on a computer screen. This software is typically used for scientific, architectural or medical projects where each detail and measurement must be precise and where the object must be able to be viewed in detail from any angle. While this type of software was considered for use in developing the scientific instrument digital collection, it was rejected for three reasons. First, such software is prohibitively expensive. Second, such software is difficult to use and would require either significant and time-consuming training before it could be effectively used by library staff or the retention of an expensive consultant with the required knowledge and skills. Third, extreme detail and accuracy in three-dimensional digital representation and manipulation were not required for this project. Nevertheless, the project did require realistic digital photographs and videos; even the most sophisticated, expensive, and labor-intensive photorealistic graphical rendering – be it two-dimensional vector graphics simulating three dimensions or JavaScript animations and data visualization tools – would fall short of the demands of the project. While HTML5 seems poised to permit better rendering of simple and sophisticated three-dimensional digital objects, the constant considerations of time and cost can quickly become prohibitive. Unlike the data that populate the graphs and computer models of an engineer, the data contained within this collection consist of the components and composition of the physical objects themselves.
Data such as the authentication page of an early twentieth century milliammeter – which notes where the object was manufactured and, if fortune favors, where and to whom it was sent – or the method by which one component connects glass to steel, thereby revealing the tools used in the construction of the instrument: these are the types of data contained within the collection of scientific instruments. As such, unaltered digital imagery is essential. The improvements in multimedia functionality in CONTENTdm version 6, and those soon to be available in HTML5, made examination and analysis of the collection timely and worthwhile.

The Scientific Instruments and Historical Artifacts digital collection presently features historical scientific instruments from the nineteenth and twentieth centuries, as well as a cuneiform tablet from the thirteenth century. These particular instruments were selected for the initial trial phase of the collection for their physical accessibility, scope, and novelty. The collection currently contains a McGraw-Edison Edisette Cassette Tape Recorder with a microphone for audio note taking, a Thomas Scientific gas sampling bottle used in gas chromatography to separate chemical mixtures, a Hoppler Viscosimeter to measure the viscosity of fluids (Plate 1), a Bausch & Lomb Abbe-type refractometer used to measure the refractive index, or the passing of light, through an item (Plate 2), a late nineteenth century Bausch & Lomb optical microscope, a slide box for glass microscope slides, and an early twentieth century Weston Electrical milliammeter and voltmeter. Additionally included are stills of an ancient Mesopotamian cuneiform tablet from a ziggurat located in southwestern Iran in the Khuzestan province (Plate 3). Each of these items required functional and realistic digital photographs and videos in order to capture the important facets of the instruments.

As the use of advanced rendering software was rejected, other approaches were designed and tested to determine how best to meet the needs of the project. Initially, experiments were conducted using a digital video camera to record the object. At first the camera was moved around the object in an attempt to film all sides of the object. However, the resulting video suffered from excessive camera vibration, and it was difficult to maintain a consistent distance from the object for the entire length of the video. To attempt to address these issues, the object was placed on a rotating platform and the camera was mounted in a stationary position. This solution addressed the problems of camera shake and uneven distance, but such a methodology only captured four sides of the object and did not allow for the filming of the top or bottom of the object. A video of a spinning object is also unhelpful for researchers wanting to focus on a certain feature of the object. As a result of these unsuccessful video tests, the use of multiple still photographs to represent the object was considered. Photographs were taken of each object from six viewpoints – all four sides and the top and bottom of each object. While these photographs provided a good overall representation of the object from a fixed distance, many of the objects contained interesting details that were not visible due to the scale of the photograph.
Consideration was given to providing close-up photographs of these areas; however, as such photographs would not be directly associated with their location on the larger object, and as many of the objects contained multiple detailed areas of interest, this idea was rejected as potentially confusing to the viewer.

Based on these experiments, a final methodology was developed. It was determined that the best solution would be to take still photographs of the object and then create videos based on these photographs (Plate 4). These videos would zoom in on the important areas of interest on the object to enable the user to view those areas close up and in detail (Plates 5 and 6). In addition, it was decided that a single still photographic image that best represented the entire object would also be included for the use of patrons who wanted a single image that represented the object for use in presentations, scholarly papers and for whatever purpose they desired.

The specific methodology used was as follows. Photographs were taken of each object from six viewpoints – all four sides and the top and bottom of each object. Adobe Photoshop was used to crop each photograph. Each picture was cropped twice. In the first instance, the photographs were cropped in such a way that the image best filled the frame. In the second instance, the pictures were cropped to the dimensions needed for use in iMovie. This photograph was then imported into iMovie. The slow zoom technique known as the Ken Burns Effect was utilized to focus on the most interesting detail presented by that view of the object.

The University of Oklahoma Libraries uses CONTENTdm to host its digital collections. CONTENTdm includes a picture cube feature, which allows six views of a single digital object. This cube may contain still photographs or digital videos. For each object two picture cubes were created. One contained the six still photos of the object and the other contained the six videos of the object's detail sections. The use of two cubes allowed the user to access the still photograph or the video representation of the objects as required.

A final challenging aspect of the project relates to the development of effective and accurate metadata for the objects. While some of the objects and their uses were easily identified by History of Science faculty or by associated documentation, others were of unknown origin and function. For the unknown objects various methods were used to obtain the data needed. For example, in one case a professor in the chemistry department used a chemistry listserv to contact experts in the subject worldwide. Working together they determined that the object was a gas sampling bottle (Plate 7). In other cases manufacturer information could be obtained from markings on the instrument. In such cases the manufacturers' websites and representatives were consulted and the required information was obtained.
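As an illustration of the kind of descriptive record this work produces, the following sketch expresses a record for one instrument using a subset of the fifteen Dublin Core elements, to which CONTENTdm fields are commonly mapped. It is written as a Python dictionary purely for compactness; only the element names follow the standard, and every value is invented for illustration rather than taken from the actual collection.

    # A hypothetical descriptive record for one instrument, keyed by Dublin
    # Core element names. The values below are illustrative, not real data.
    record = {
        "title": "Weston Electrical milliammeter",
        "creator": "Weston Electrical Instrument Company",  # assumed manufacturer
        "subject": "Scientific instruments; Electric measurements",
        "description": "Early twentieth century milliammeter; six still views "
                       "and six detail videos are provided.",
        "publisher": "University of Oklahoma Libraries",
        "date": "ca. 1910",             # illustrative, not a verified date
        "type": "Physical object",
        "format": "JPEG still images; digital video",
        "identifier": "sci-inst-0001",  # hypothetical local identifier
        "rights": "Contact the holding repository for reuse information.",
    }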
Results

This process resulted in a usable collection of three-dimensional objects. The user can see all six sides of each object in full view by using the picture cube containing the still images. The user can download any still picture desired for use in their projects. The user can also see interesting features on each side of the object in detail by accessing the picture cube which contains the videos. By referencing the associated metadata the user could also find a similar item for sale and add it to his or her collection.

Lessons learned

There were many lessons learned during the implementation of this project. One of the primary lessons was that although the digitization methodology was developed in advance, numerous unforeseen and time-consuming issues remained to be solved throughout the actual process of digital photographic capture. For example, the initial round of photographs was rejected due to problems with the appearance of the background placed behind the objects during the photographic process. Black velvet was initially used, but it was impossible to keep this material from wrinkling, creating a background that distracted from the object itself. An iron could not be used on scene to remove the wrinkles, as there are sensitive heat sensors in the History of Science Collections area and use of an iron would have risked setting off the chemical fire prevention system. For the next attempt to film the objects, a large roll of green paper was used. This paper could not be folded without wrinkling, so it was draped over a support structure, and this provided a good non-wrinkled background. However, when the resulting photographs were examined, the paper had shifted in many photographs, causing parts of the object to be obscured. Based on these experiences it was decided to purchase a professional photographic background for use on the project. A MyStudio Seamless Tabletop Background Sweep Cyclorama was obtained and used successfully. The time saved by not having to reshoot numerous photographs justified this purchase.

Another lesson learned related to the use of iMovie to create the required videos. While iMovie is an easy-to-use product, it sacrifices features for ease of use. The limitations of the pan and zoom effect were of particular interest on this project. To enable easy use of this feature, known as the Ken Burns Effect, iMovie allows the user only basic control over how the pan and zoom will occur. The user simply sets a starting rectangle and an ending rectangle, and then iMovie creates the pan and zoom effect. The user cannot control the size or shape of the rectangles or the speed and precise direction of the transition. However, this limitation is also why iMovie is so easy to use. More advanced and expensive software, such as Adobe Premiere, offers control over the entire video editing process and numerous advanced video editing features. However, it has a steep learning curve that must be considered relative to available staff time and expertise. If the video were to be created for more advanced research, or if greater video editing control were needed for a particular project, such software should be considered.
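The start-rectangle/end-rectangle behavior described above can be understood as simple linear interpolation between two crop windows. The Python sketch below is only an illustration of that idea under assumed frame counts and rectangle values; it is not iMovie's actual implementation, which is not public.

    # Minimal sketch of a Ken Burns-style zoom: linearly interpolate a crop
    # rectangle (x, y, width, height) from a start view to an end view.
    def interpolate_rect(start, end, t):
        """Return the crop rectangle at time t, where t runs from 0.0 to 1.0."""
        return tuple(s + (e - s) * t for s, e in zip(start, end))

    # Hypothetical values: zoom from a full 4000 x 3000 still into a detail area.
    full_view = (0, 0, 4000, 3000)
    detail_view = (1500, 900, 800, 600)
    num_frames = 250  # e.g., ten seconds at 25 frames per second
    rects = [interpolate_rect(full_view, detail_view, i / (num_frames - 1))
             for i in range(num_frames)]
    # Each rectangle would then be cropped from the still photograph and
    # scaled to the output video resolution to produce one frame of the zoom.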
Another learning opportunity presented itself when CONTENTdm version 5 displayed administrative metadata in the title bar of the video window. CONTENTdm created its own naming system and presented that information in the title bar of each video as it played. Administrative metadata cannot be changed or deleted in CONTENTdm, and this could be confusing for users. Therefore, a request for the ability to edit this metadata was submitted to the producers of CONTENTdm, and such a revision is in process. CONTENTdm version 6 incorporates a more coherent and user-friendly metadata display, dispensing with the issue.

A final lesson learned was that the creation of three-dimensional digital collections is a process that is still in its infancy. Librarians interested in these types of collections will benefit by sharing their experiences with others and by learning from the experiences of other librarians. Advanced imaging software is in constant development, and the technologies used to create three-dimensional digital representations of physical objects will continue to advance and improve. Digital librarians need to remain in the forefront of these developments so that they may provide the most effective digital resources for their user community.

Future projects

Based on the lessons learned and the success of this project, future projects using this same technology have been added to the digital initiatives queue. The knowledge attained through the trials, errors, and successes of the initial creation of the Scientific Instruments and Historical Artifacts collection will serve as a blueprint to begin incorporating additional instruments, artifacts, and artwork, with the primary purpose being the facilitation of further research.

References

Borgman, C.L., Gilliland-Swetland, A.J., Leazer, G.L., Mayer, R., Gwynn, D., Gazan, R. and Mautone, P. (2000), "Evaluating digital libraries for teaching and learning in undergraduate education: a case study of the Alexandria Digital Earth Prototype (ADEPT)", Library Trends, Vol. 49 No. 2, Special Issue, pp. 228-50.

Doyle, J., Viktor, H.L. and Paquet, E. (2009), "Long term digital preservation – preserving authenticity and usability of 3D data", International Journal on Digital Libraries, Vol. 10, pp. 33-48.

Eden, B. (2007), "2D and 3D information visualization: the next big Internet revolution", College & Research Libraries News, Vol. 68 No. 4, pp. 247-51.

Liesaputra, V. and Witten, I.H. (2007), "Computer graphics techniques for modeling page turning", working paper, Department of Computer Science, The University of Waikato, Hamilton, October.

Politis, D., Vaiou, G., Ioannis, M., Athanasios, P. and Charalampos, L. (2005), "Dynamically creating virtual museums", Proceedings of the 9th WSEAS International Conference on Systems (ICS'05), Article No. 66, World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, WI.

Rowe, J. and Razdan, A. (2003), "A prototype digital library for 3D collections: tools to capture, model, analyze, and query complex 3D data", paper presented at Museums and the Web 2003.

Seulin, R., Nicolier, F.D., Fofi, D., Millon, G. and Stolz, C. (2004), "3D imaging applications for ancient wooden stamps analysis", available at: http://le2i.cnrs.fr/IMG/publications/seulin_OSAV04.pdf

About the authors

Maura Valentino began her career as a Microsoft Certified Trainer, teaching programming and database administration. She then returned to school and received a BA in Art History from the University of South Florida and an MSLIS from Syracuse University. Beginning in 2009, she served as the Coordinator of Digital Initiatives at the University of Oklahoma and currently is the Metadata Librarian at Oregon State University. Her research interests focus on the re-use of data and the many ways users find to do that. Maura Valentino is the corresponding author and can be contacted at: maura.valentino@oregonstate.edu

Brian Shults is currently serving as the Interim Coordinator of Digital Initiatives in Bizzell Library at the University of Oklahoma Libraries. He has worked at the University of Oklahoma Libraries for over five years in various capacities and in digital initiatives for four years. His research focuses on digital libraries.
work_jgaaysdf3vg5dni67rrpiu7xey ----

THE AMERICAN ARCHIVIST

Archival MARC Records and Finding Aids in the Context of End-User Subject Access to Archival Collections

Rita L. H. Czeck

The author wishes to acknowledge her husband, David, for his encouragement and support throughout the research process.

The American Archivist, Vol. 61 (Fall 1998): 426-440

Abstract

This article discusses the findings of a study to determine the extent to which archival MARC records represent chronological, geographical, personal, and corporate information contained in corresponding finding aids to archival collections. A content analysis of twenty finding aids to archival collections and their corresponding archival MARC records was conducted. The data suggest that the level of representation in archival MARC records varies depending on subject category. Geographical terms were the most likely to be represented, followed by personal names, chronological terms, and lastly corporate names. Allowing for the searching of full-text electronic finding aids would enable end users to benefit not only from the subject information present at the collection level and in the abstract, but also from the areas in finding aids that tend to get less MARC representation: scope/content notes, historical/biographical information, series summaries, and container information.

Introduction

Many archives and manuscript repositories have made finding aids available via the Internet. Websites with finding aids from hundreds of repositories nationwide may be a future alternative to searching bibliographic utilities such as the Online Computer Library Center (OCLC) for archival and manuscript collection information. Searchers may have the option to search either archival Machine Readable Cataloging (MARC) records or full-text finding aids in the same database. While the detailed information in finding aids may be useful for end users in determining relevance, it is unclear whether the finding aid format will be suitable as an initial locator of archival collections. Steven Hensen, one of the main developers of the MARC format for archival use and the author of Archives, Personal Papers, and Manuscripts, the de facto standard for archival cataloging, asserts that while finding aids may be available on-line, "it still seems likely that the pointers to such material will probably be structured catalog records."1 The production of MARC records and their entry into bibliographic utilities, as well as the preparation of finding aids for on-line environments, represent a significant investment of time and money for archival repositories. Since MARC records contain a subset of the information provided in finding aids, what are the advantages of archival MARC records as compared to full-text finding aids for information retrieval? This article discusses a study conducted to analyze the subject information included in finding aids and the archival MARC records derived from them.

1 Steven L. Hensen, "RAD, MAD, and APPM: The Search for Anglo-American Standards for Archival Description," Archives and Museum Informatics 5 (Summer 1991): 2-5.
Literature Review

The main function of an archival MARC record is to abstract the most relevant information from the finding aid to provide a brief, accurate description of the collection. Among the portions of the MARC record that are typically available to end-user searching are title, author, a summary content note or abstract, and a brief historical or biographical note. Each MARC record also provides a list of index terms using a controlled vocabulary, such as Library of Congress subject headings. Index terms provide a succinct summary of the most important subject information in the finding aid. The purpose of a controlled vocabulary, which controls for synonyms and different forms of names, is to allow the end user to collocate records that are topically similar without developing an elaborate search strategy or conducting multiple searches for a given subject. The different types of subject terms listed in MARC records include geographical terms, personal names, corporate names, conferences, and occupations, as well as topical subject terms that do not refer to a specific person, place, or thing. Often, however, the list of index terms found in a MARC record is also in the corresponding finding aid, so index terms are not unique to the MARC format.
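As a concrete illustration of the record structure just described, the following is a minimal archival MARC record sketch in human-readable form. The field tags shown are standard MARC fields (100 main entry personal name, 245 title with $f inclusive dates, 520 summary, 545 biographical note, and 600/610/650/651 subject added entries), but the collection and every value below are invented for illustration, borrowing example names that appear later in this article; none of this comes from the study's actual records.

    100 1  Crump, Harry, 1890-1960.
    245 10 Harry Crump papers, $f 1910-1955.
    520    Correspondence, diaries, and photographs documenting Crump's
           career and family life.
    545    Harry Crump was a (hypothetical) Arizona rancher and civic leader.
    600 10 Woodell, Sarah.
    610 20 Central Arizona Project Association.
    650  0 Ranching.
    651  0 Arizona.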
In order to evaluate what information needs MARC records are best suited to address, it is useful to examine what elements typically make up user queries. In his study of patrons at the National Archives, Paul Conway analyzed 212 initial user questions posed to front desk staff at the archives.2 He reported the most frequent elements in these initial queries were medium, date, name, subject, place, and organization. Helen Tibbo in her study of providing access to historical literature asked historians to describe, in an open-ended format, what information would ideally be found in abstracts of historical writing.3 The results indicate the types of information most important to history scholars are chronological, geographical, individual/group, and topical subject terms. The Getty Online Searching Project conducted by Marcia Bates and colleagues was an attempt to study how humanities scholars operate as end users of on-line databases.4 The findings of the Getty study indicate that humanities scholars are most interested in personal names, geographical terms, chronological terms, discipline terms, and nonspecific topical subject headings when conducting on-line searches of document surrogates.

While both finding aids and MARC records incorporate personal, corporate, geographical, chronological, and nonspecific topical information, MARC records represent a subset of these data. To compare the advantages of the two formats for information retrieval, it is helpful to review studies that address differences between a full-text document and an abstract with a list of index terms using a controlled vocabulary.

Studies analyzing the advantages and disadvantages of these formats in the context of on-line searching generally show that full-text searching provides a higher recall ratio, whereas abstract and index language surrogates provide a higher precision ratio (see Table 1). The recall ratio is the proportion of relevant items retrieved out of all relevant items in a database. If there are a total of fifty relevant MARC records in OCLC, and ten are retrieved, then the recall ratio would be 20 percent. The precision ratio is the proportion of relevant items retrieved out of all items retrieved. If twenty items are retrieved, for example, and ten of those items are relevant, then the precision ratio would be 50 percent.

2 Paul Conway, Partners in Research: Improving Access to the Nation's Archive: User Studies of the National Archives and Records Administration (Pittsburgh: Archives & Museum Informatics, 1994).
3 Helen R. Tibbo, Abstracting, Information Retrieval and the Humanities: Providing Access to Historical Literature (Chicago: American Library Association, 1993).
4 Marcia J. Bates, Deborah N. Wilde, and Susan Siegfried, "An Analysis of Search Terminology Used by Humanities Scholars: The Getty Online Searching Project Report Number 1," The Library Quarterly 63 (January 1993): 1-39.

Table 1. Retrieval Performance of Full Text, Abstracts, and Index Terms

                 Full Text       Abstract and      Abstracts       Index Terms
                                 Index Terms
                 Prec*   Rec**   Prec    Rec       Prec    Rec     Prec    Rec
Tenopir          18%     74%     37%     19%
Ro               14%     84%                       59%     18%     67%     21%
McKinin          37%     75%     62%     41%
Blair & Maron    79%     20%

* "Prec" = Precision
** "Rec" = Recall
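Restated as formulas (a standard formulation, not drawn from the article itself), with R the set of relevant records in the database and D the set of records retrieved:

    recall = |R ∩ D| / |R|        precision = |R ∩ D| / |D|

With the worked numbers above, recall = 10/50 = 20 percent and precision = 10/20 = 50 percent. The studies summarized in Table 1 report these two ratios for full-text searching and for surrogate searching.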
Carol Tenopir conducted a study that compared the retrieval performance of searching full-text documents in the Harvard Business Review Online database versus searching a combination of abstracts and controlled vocabulary (or "bibliographic union").5 She found that searching the full-text documents produced an average recall ratio of 74 percent, but only an 18 percent precision ratio. Conversely, the bibliographic union of abstracts and index terms produced a recall ratio of only 19 percent, but a precision ratio of 37 percent. Jung Soon Ro's study was a replication of the Tenopir study on a smaller scale, and the findings produced an even more dramatic difference between full text and abstract/index formats.6 The recall ratio for full-text searching was 84 percent, while the precision ratio was only 14 percent. Searching only the abstracts produced a recall ratio of 18 percent, but a precision ratio of 59 percent. Finally, the controlled vocabulary terms produced a recall ratio of 21 percent, but a precision ratio of 67 percent. A more recent study conducted by Emma Jean McKinin and associates examined retrieval performance using the major medical databases: Medline, CCML, and MEDIS.7 Retrieval on Medline, a database with bibliographic records that include abstracts and index terms, was compared to retrieval using the full-text databases CCML and MEDIS. Again, as in the Tenopir and Ro studies, full-text searching produced a relatively high average recall ratio (75 percent) and a relatively low average precision ratio (37 percent).8 Searching the bibliographic records again produced a relatively low recall ratio (41 percent) and a relatively high precision ratio (62 percent). Conversely, David C. Blair and M.E. Maron found evidence that full-text retrieval produced a high precision ratio (79 percent) and a low recall ratio (20 percent).9

Sung Been Moon summarized the possible reasons why the Blair and Maron study produced different results from the other retrieval studies.10 The differences may have been caused by different document types, different definitions of recall, or different methods of evaluating relevance. The most important factor is the different definition of recall used by Blair and Maron. Ro and Tenopir defined the total number of relevant documents as the number of relevant documents in the union of sets retrieved by several searches on the same topic. McKinin used a similar method, but referred to it as comprehensiveness. Blair and Maron, however, sampled a subset of the document collection and examined it to assess the number of relevant documents, and used the sample to estimate the total number of relevant documents for a given query. For this reason, Blair and Maron's total number of relevant documents probably reflected a much higher percentage of the total number of documents in the database, thereby causing the recall ratio to be low. The sampling method used by Blair and Maron cannot be discounted, and in fact may be a better measure of the total number of relevant documents than the method used by Tenopir, Ro, and McKinin. While the preponderance of evidence from the various studies shows that full-text retrieval will generally produce high recall and low precision ratios,11 the findings of Blair and Maron suggest that the recall ratios found in the other studies are inflated. Blair and Maron's study did not, however, compare full-text retrieval to abstract/index term retrieval. It would have been interesting to see whether abstract/index term retrieval would have produced even smaller recall and greater precision ratios.

Given the general tendencies of full text and abstract/index term retrieval performance, there are implications for the effectiveness of archival MARC records and electronic full-text finding aids.

5 Carol Tenopir, "Full Text Database Retrieval Performance," Online Review 9 (April 1985): 149-64.
6 Jung Soon Ro, "An Evaluation of the Applicability of Ranking Algorithms to Improve the Effectiveness of Full-Text Retrieval. I. On the Effectiveness of Full-Text Retrieval," Journal of the American Society for Information Science 39 (March 1988): 73-78.
7 Emma Jean McKinin, Mary Ellen Sievert, E. Diane Johnson, and Joyce A. Mitchell, "The Medline/Full-Text Research Project," Journal of the American Society for Information Science 42 (May 1991): 192-208.
8 I have computed the values for full-text retrieval performance by averaging together the results of searching the CCML and MEDIS databases.
9 David C. Blair and M.E. Maron, "An Evaluation of Retrieval Effectiveness for a Full-Text Document Retrieval System," Communications of the ACM 28 (1985): 289-99.
10 Sung Been Moon, Enhancing Performance of Full-Text Retrieval Systems Using Relevance Feedback (Ph.D. diss., University of North Carolina at Chapel Hill, 1993).
11 The evidence presented only speaks to full-text retrieval performance in the absence of search engine techniques such as term-weighting and relevance feedback.
Because precision levels tend to be low for full text, searching a database of full-text finding aids could present the user with the problem of "output overload," or the retrieval of an excessive number of irrelevant documents for a given search. Precision levels are usually higher when retrieving information from databases of abstracts and/or lists of index terms. A high precision level will result in a more manageable number of hits per search, and this is a strong argument for using the archival MARC record as an initial locator of a collection. On the other hand, if a user is concerned with finding the complete set of relevant collections, the potential for a higher recall level is an argument for searching full-text finding aids.

While the MARC format ideally represents the most relevant subject information in finding aids and provides the advantage of precision, the individual record is only as good as the quality of the cataloging. Although descriptive standards are supposed to provide consistency in descriptions from different repositories, archival cataloging is often inconsistent. Jackie Dooley notes the need for more consistent subject access to archival and manuscript collections cataloged in the MARC format.12 She advises that more attention should be paid to proper names, time periods, geographic places, and organizations, among other types of terms. Dooley maintains the MARC format is more than adequate to accommodate subject data, and archivists need to upgrade the provision of subject access to archival collections within the MARC structure.

Although full-text finding aids should offer greater levels of recall in information retrieval than MARC records, it is not clear to what extent finding aids represent potential subject terms that MARC records do not. Conversely, if the most important categories of subject terms, such as chronological, geographical, personal, and corporate, are often omitted or underrepresented in MARC records, the advantage of precision may not outweigh the disadvantage of low recall. The following is an analysis of the extent to which archival MARC records are likely to include or omit important subject term categories.

12 Jackie M. Dooley, "Subject Indexing in Context," American Archivist 55 (Spring 1992): 344-54.

Methodology

In order to discover the extent to which archival MARC records are likely to represent the most important categories of subject terms, a content analysis comparing archival MARC records to their corresponding finding aids was conducted. The focus was specifically on four broad types of information: chronological, geographical, personal, and corporate. Twenty finding aids were chosen along with their corresponding MARC records in the OCLC database. All of the finding aids were chosen from the Berkeley Finding Aid Project website in order to provide electronic searching capabilities.13 During the initial phase of content analysis, however, it became clear that searching the finding aids electronically would not provide an accurate count of subject terms.
A manual count of subject terms proved to be more effective. While all of the finding aids selected for this study were chosen from the Berkeley Finding Aid Project website, they originated from three different repositories and ranged from three pages to twenty-six pages in length. Of these finding aids, two were for corporate papers, two were for family papers, and sixteen were for personal papers. Two study design factors prevented a more even mix of types of papers: all of the finding aids used for this study were chosen from the Berkeley Finding Aid website; and because the author did not have access to the Research Libraries Information Network (RLIN) database, only the finding aids at the Berkeley site that had corresponding MARC records available via OCLC were used. The OCLC limitations on MARC record length are fifty fields and 4,096 characters per record.14 Since RLIN records do not have the same size restrictions as OCLC records, they can include more subject information. If RLIN records had been used in this study, the results may have been very different. In addition, the finding aids did not follow a standard format, in that each repository has its own criteria for structure and inclusion of information. All of the finding aids, however, included typical finding aid elements, such as collection information, scope and content notes, historical or biographical information, and container information.

13 (accessed 7 November 1996).
14 Electronic mail received from Tony Chirakos of the OCLC organization, March 1996.

Once the finding aids were chosen, a print copy of each finding aid was visually scanned, and each instance of chronological, geographical, personal, and corporate terms was counted and recorded. Since counting subject terms is highly subjective and time-consuming, nonspecific topical subject terms, such as "trees" or "computers," were not included in the analysis. The following criteria specify what types of terms were counted in each category:

1. Chronological terms
   a. Individual dates, individual dates listed in a date range, or time-span indicators
   Examples: 1952, 1968-1972 (just 1968 and 1972, not the dates between 1968-1972), 1940s, Twentieth century, Middle Ages
2. Geographical terms
   a. Political: countries, states/provinces, counties, cities/towns
   b. Geological: deserts, rivers, mountains, oceans, seas, lakes, etc.
   c. Specific sites: buildings, dams, roads, etc.
   d. Adjectives
   e. Do not include common, unspecified terms (e.g., western states)
   Examples: Mexico, Colorado River, Hoover Dam, Mexican
3. Personal names
   a. A person's full name or last name
   b. Family names
   Examples: Harry Crump, Woodell, The Boyte Family
4. Corporate names
   a. Companies, associations, societies, institutions, foundations, etc.
   b. Subdivisions: bureaus, departments, etc.
   c. Newspapers and magazines, if a place of employment for or founded by someone listed in the finding aid
   d. Do not include common, unspecified terms (e.g., administrative committee)
   e. Do not include if in the title of a conference, meeting, forum, workshop, symposium, etc. (e.g., The Third Annual Conference of the Usenix Association)15
   Examples: General Electric, Library Association, Minnesota Historical Society, U.S. Department of Labor

15 Conference names have their own category, distinct from organizations. I did not investigate the representation of conference names for this study.
Some terms in the finding aids were not included in the content analysis because they were not in a context deemed useful for subject retrieval. If any of the four types of subject terms were found in the following contexts in the finding aids, they were not included in the content analysis:

1. Location of collection,
2. Encoder and encoding dates of finding aid,
3. Processing of collection,
4. Publishers, dates, and titles in bibliographical references,
5. Folder dates and chronological container information, or
6. Birth and death dates in Library of Congress authorized form of name.

Once all of the terms that fell into the four subject categories were recorded, each of these terms was compared with a print copy of the corresponding MARC record to see whether the term was represented anywhere in that format. The location of the term in the finding aid was also recorded. Each finding aid was broken into sections according to the following definitions:

1. Collection information: Includes both the title of the collection and the overall date span of the collection.
2. Abstract: A brief summary, usually no longer than a paragraph, recording the most important features of a collection.
3. Scope/Content notes: A short section, generally one to two pages in length, describing the scope and the series and subseries of the collection, and the types of materials present.
4. Historical/Biographical notes: A short section, usually ranging from one to two pages, that provides a background of the primary person or institution related to the collection.
5. Series information: Includes the series title, date span of the series, and series summaries that are normally no longer than a paragraph in length.
6. Container information: A detailed listing that describes the contents of containers, typically down to the folder level. Container information can range from a few pages to hundreds.
7. Other: Information that does not fall into the first six categories, such as related collections and donor information.
This section also provides an analysis of whether the MARC records represented terms that were located in different areas of the finding aids: collection information, abstract, scope/content notes, historical/biographical notes, series information, and container infor- mation.17 Only one subject category, geographical, was omitted from a MARC rec- ord when there were terms from that category present in the finding aid; this occurred in only one collection out of the twenty analyzed. Aside from this occurrence, if there were terms from a given subject category in a finding aid, that category had at least some representation in the corresponding MARC record. The extent to which the subject categories were represented in the MARC records is given in Tables 2, 3a, 3b, 3c, and 3d. Table 2 shows that all of the types of subject terms—chronological, geographical, personal, and corporate—were represented on average less than 50 percent but more than 20 percent of the time. The most represented type of subject term in the MARC records was the geographical category at 41 percent. Personal names from the finding aids were represented on average 37 percent of the time. Chronological terms were represented on average 27 percent of the time, and lastly an average of 23 percent of corporate names from the finding aids were represented in the MARC records. 16 The focus of this study was to discover whether a given concept from the finding aid was represented in the MARC record, not whether an individual searcher would be able to retrieve the term in precisely the same way from finding aid to MARC record. " While only the average percentage of terms from the finding aids represented in the MARC records are provided in this article, the author can supply the raw data from which the averages were computed upon request. 434 D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.61.2.3764m 56l67h827p5 by C arnegie M ellon U niversity user on 06 A pril 2021 M A R C R E C O R D S A N D F I N D I N G A I D S I N E N D - U S E R A C C E S S T O A R C H I V A L C O L L E C T I O N S Table 2. Average Percentage of Terms from Finding Aids Present in MARC Records, by Subject Category Subject Category Chronological Terms Geographical Terms Personal Names Corporate Names Average percentage of terms present in MARC records 27 41 37 23 Table 3a through 3d show the average percentage of terms from the finding aids present in the MARC records, but also break down the finding aids into their component parts so that further analysis is possible. C h r o n o l o g i c a l T e r m s While only an average of 27 percent of chronological terms from the finding aids were represented on the whole, Table 3a demonstrates that chronological terms derived from the collection information and abstract were represented at a relatively frequent 89 percent and 75 percent, respec- tively. Chronological terms in collection information almost exclusively delin- eate the date range for the whole collection and the date range that includes the bulk of the collection, and these dates tended to have a high represen- tation rate. The abstract of a finding aid includes the most important dates regarding the collection, and tended to have a high representation rate not only for chronological terms, but for all of the subject categories. Chrono- logical terms from the scope/content area were represented at 46 percent, and from the series level 41 percent were represented. 
Findings

This section provides a detailed analysis of the extent to which the four types of subject terms were represented in the MARC records. The analysis presents the average percentage of representation of each type of subject category in the MARC records. This section also provides an analysis of whether the MARC records represented terms that were located in different areas of the finding aids: collection information, abstract, scope/content notes, historical/biographical notes, series information, and container information.17

Only one subject category, geographical, was omitted from a MARC record when there were terms from that category present in the finding aid; this occurred in only one collection out of the twenty analyzed. Aside from this occurrence, if there were terms from a given subject category in a finding aid, that category had at least some representation in the corresponding MARC record. The extent to which the subject categories were represented in the MARC records is given in Tables 2, 3a, 3b, 3c, and 3d. Table 2 shows that all of the types of subject terms—chronological, geographical, personal, and corporate—were represented on average less than 50 percent but more than 20 percent of the time. The most represented type of subject term in the MARC records was the geographical category at 41 percent. Personal names from the finding aids were represented on average 37 percent of the time. Chronological terms were represented on average 27 percent of the time, and lastly an average of 23 percent of corporate names from the finding aids were represented in the MARC records.

17 While only the average percentages of terms from the finding aids represented in the MARC records are provided in this article, the author can supply the raw data from which the averages were computed upon request.

Table 2. Average Percentage of Terms from Finding Aids Present in MARC Records, by Subject Category

Subject Category         Average percentage of terms present in MARC records
Chronological Terms      27
Geographical Terms       41
Personal Names           37
Corporate Names          23

Tables 3a through 3d show the average percentage of terms from the finding aids present in the MARC records, but also break down the finding aids into their component parts so that further analysis is possible.

Chronological Terms

While only an average of 27 percent of chronological terms from the finding aids were represented on the whole, Table 3a demonstrates that chronological terms derived from the collection information and abstract were represented at a relatively frequent 89 percent and 75 percent, respectively. Chronological terms in collection information almost exclusively delineate the date range for the whole collection and the date range that includes the bulk of the collection, and these dates tended to have a high representation rate. The abstract of a finding aid includes the most important dates regarding the collection, and tended to have a high representation rate not only for chronological terms, but for all of the subject categories. Chronological terms from the scope/content area were represented at 46 percent, and from the series level 41 percent were represented. The scope/content area contains chronological terms in a narrative fashion similar to the abstract, but is typically much more detailed and lengthy than the abstract. Series level chronological terms are sometimes an indication of the range of the entire series. Series summaries, however, contain chronological information that indicates specific dates of events that relate to particular contents within the series. The historical/biographical section chronological terms were least likely to be represented in the MARC records at 17 percent. Often the historical/biographical section of a finding aid is simply a chronology, listing one date or date range after another in a list, with a short explanation after it.

Geographical Terms

Being the most represented of all subject categories overall in the MARC records at 41 percent, geographical terms had a higher level of representation in the abstract and scope/content sections than any other subject category.
For personal and family papers, the title of the collection always includes some form of the personal name, and collection information personal names were represented 100 per- cent of the time in the MARC records. Personal names mentioned in the finding aids' series level information were represented 91 percent of the time, a far greater number than the next highest percentage of representation at the series level, being corporate names at 64 percent. Personal names in the abstracts were represented 78 percent of the time, lower than both geographical terms (100 percent) and corporate names (93 percent). For personal names listed in the scope/content section, the rep- resentation in the MARC records was 61 percent, second only to geographical terms at 67 percent. Half of all personal names in the historical/biographical section on average were represented, a much higher level than any of the other subject categories for this section of the finding aids. Similarly, repre- sentation of personal names from the container information was significantly higher at 38 percent than any other subject category for container informa- tion. C o r p o r a t e Names As mentioned above, all corporate names from the finding aids' collec- tion information were represented in the MARC records. The level of rep- resentation of corporate names from the abstracts was also relatively high, 93 percent, second only to geographical terms at 100 percent. Representation of corporate names dropped off to an average of 64 percent at the series level and 59 percent from the scope/content section of the finding aids. The only subject category with less representation in both of these finding aid 437 D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.61.2.3764m 56l67h827p5 by C arnegie M ellon U niversity user on 06 A pril 2021 T H E A M E R I C A N A R C H I V I S T areas was chronological terms with 41 percent of series level information and 46 percent of scope/content information being represented. Corporate names from the historical/biographical section of the finding aids were rep- resented at 21 percent in the MARC records, and corporate names from the container information were represented only 12 percent of the time on av- erage, the lowest representational level out of all the subject categories for this section of the finding aids. C o n c l u s i o n Because of the increased accessibility of the Internet, archivists are pre- sented with an opportunity to make in-house finding aids accessible to a wide community of searchers. Clearly, searchable and downloadable finding aids are wonderful research tools once a user has connected to a repository's website. The question remains, however, whether finding aids alone are suf- ficient as an initial locator of a collection, especially when searching across collections. The production of MARC records and their entry into biblio- graphic utilities, as well as the preparation of finding aids for on-line envi- ronments, represent a significant investment of time and money for archival repositories. Although full-text finding aids should offer greater levels of re- call in information retrieval than MARC records, it is not clear to what extent finding aids represent potential subject terms that the MARC records do not. 
Conversely, if the most important categories of subject terms, such as chron- ological, geographical, personal, and corporate, are often omitted or under- represented in MARC records, the advantage of precision may not outweigh the disadvantage of low recall. The findings of this paper suggest that each of the subject types, chron- ological, geographical, personal, and corporate, are likely to be represented, at least at a minimal level, in MARC records. The level of representation varies, however, depending on subject category and section of the finding aids. Geographical terms were the most represented, followed by personal names, chronological terms, and lastly corporate names. The level of overall representation varied from 41 percent down to 23 percent. Since the purpose of a MARC record is to represent the most important information from a finding aid, it is expected that not all of the terms would be represented. The average number of terms from these important subject categories that were only present in the finding aids, however, was great. In addition, when looking at the different portions of the finding aids, the representation of terms varied considerably. Collection information should almost always be incorporated into a MARC record, since it is essentially the name and dates of a collection. This is reflected in the findings, in that personal and corpo- rate terms from the collection information were represented at 100 percent, 438 D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.61.2.3764m 56l67h827p5 by C arnegie M ellon U niversity user on 06 A pril 2021 M A R C R E C O R D S A N D F I N D I N G A I D S I N E N D - U S E R A C C E S S T O A R C H I V A L C O L L E C T I O N S and chronological terms at 89 percent. Since the abstract is intended to sum- marize the most important features of a collection, it would seem that most of the subject information from this section should be recorded. This, too, is borne out by the findings: the subject terms from the abstracts were rep- resented at least 75 percent of the time, up to 100 percent for geographical terms. The rest of the sections of the finding aids were not so consistently represented. The scope/content section chronological terms were repre- sented only 46 percent of the time, and the series chronological terms were present only 41 percent of the time. The historical/biographical section pro- vides a background for the collection, and perhaps is not as critical for subject access, but the level of representation from this section was quite low. The container information was the least represented area, although this is not surprising since the information in this area is relatively specific and more comprehensive than the other areas of finding aids. These findings must be viewed, however, with the understanding that RLIN records may provide an even greater average percentage of relevant subject information from finding aids than do OCLC archival MARC records. MARC records seem best suited to address searches for personal and corporate names that are central to the collection, such as a search for a person for whom the collection is named. Searching finding aids for a specific person, however, may retrieve a collection in which the person was only a minor correspondent. The person may have been considered too peripheral to be included in a MARC record, but the collection could still be retrieved by searching the full-text finding aid. 
A search for chronological terms in a database of MARC records may not be fruitful unless it is for the date range of the entire collection. Finding aids tend to provide a much greater number of chronological terms than MARC records, and the majority of these terms are single dates or date ranges haying to do with the historical background of the subject of the collection. Searchers who have a specific date or a spe- cific date range other than the range of the collection in mind, such as a series date range, would benefit from being able to search the full-text finding aid. Geographical terms that are prominent in the background of a person or corporation, such as where a person resided when they produced the materials in the collection, are likely to be found by searching MARC records. Searching MARC records when the collection itself is closely related to a geographical subject, such as the Central Arizona Project Association, may be useful if searching on frequently mentioned geographical features in the find- ing aid. Many geographical terms that specify folder contents, however, tend not to be represented in MARC records. It is clear from these findings that a significant amount of subject infor- mation tends to be present in finding aids, but not in their corresponding MARC records. Making the full text of finding aids available through an on- 439 D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.61.2.3764m 56l67h827p5 by C arnegie M ellon U niversity user on 06 A pril 2021 T H E A M E R I C A N A R C H I V I S T line database for subject searching would provide end users an alternative to searching MARC records in a bibliographic database. Allowing for the search- ing of full-text finding aids would enable end users to benefit from the subject information present not only in the collection information and in the ab- stract, but also from the areas in finding aids that tend to get less MARC representation: scope/content notes, historical/biographical information, series summaries, and container information. A useful alternative to search- ing MARC records or the entire full text of finding aids may be targeted field searching of certain sections of finding aids, e.g., collection information, abstract, and scope/contents notes. As with MARC records, however, the database into which the full-text documents are loaded can have an impact on retrieval effectiveness. The full- text format has the potential to burden the user with excessively large re- trieval sets with many nonrelevant hits, depending on the size of the database. A database that realistically reflects the hundreds of thousands of finding aids available nationwide may amount to nearly nineteen million pages of text.18 With the increasing reliance on retrieving information from large databases, there is a need for archivists to become expert searchers so they can both act as intermediaries for their patrons and educate them to conduct searches for themselves outside of repositories. In addition, research is needed to com- pare the retrieval performance of full-text finding aids versus their MARC surrogates in terms of recall and precision. 18 American Heritage Virtual Archive Project: A Proposal to the National Endowment for the Hu- manities (The Library, University of California, Berkeley) available at (accessed 7 November 1996). 
work_joeseilruvbbjfukhn3fic5l6y ---- A Study of Ways to Improve Periodical Indexing Services in Korea* Eunchul Lee (이은철)**, Sangbok Lee (이상복)***, Sam Gyun Oh (오삼균)****, Ok nam Park (박옥남)*****

Contents: 1. Introduction (1.1 Necessity and significance of the study; 1.2 Purpose and research questions); 2. Analysis of prior research (2.1 The development of periodical article indexing; 2.2 Research on periodical article indexing services); 3. Methodology; 4. Research survey (4.1 User study; 4.2 The state of periodical article indexing services in Korea and abroad); 5. Results and discussion (5.1 Cooperation-based integration of periodical article indexes; 5.2 A metadata standard; 5.3 Identifier systems; 5.4 Building authority files; 5.5 Multidimensional information discovery; 5.6 Expanding user-participation services); 6. Conclusions and suggestions.

Abstract: Recognizing the importance of the periodical article index as an information resource, this study examined users' requirements for article indexing services through focus group interviews and analyzed the state of periodical article indexing services in Korea and abroad. On that basis, the study proposes that Korean periodical article indexing move toward providing users with seamless service, and to that end recommends the cooperative building, sharing, and searching of article indexes, a standard metadata format, authority files, user-participation services, facet-based multidimensional information discovery, and identifier systems.

ABSTRACT The study acknowledges the values of periodical indexing as information resources. The study identified periodicals users' needs of article indexing services based on focus group interviews. The study also conducted a comparative study of periodicals indexing services of libraries and databases in Korea and the US.
The study also recommends the elements needed for improving the current service, which includes establishing a collaborative indexing system, adopting a metadata standard, implementing authority files, incorporating social web services, offering diverse ways of information discovery based on facet approach, and stabilizing identification systems.

Keywords: Periodicals, Article Indexing, Focus Group Interviews (정기간행물, 기사색인, 포커스 그룹 인터뷰)

* This study revises and supplements part of the National Assembly Library commissioned research project "A Study Analyzing and Evaluating the National Assembly Library's Periodical Article Indexes."
** Professor, Department of Library and Information Science, Sungkyunkwan University (eclee@skku.edu)
*** Professor, Department of Library and Information Science, Daejin University (sblee@daejin.ac.kr)
**** Professor, Department of Library and Information Science, Sungkyunkwan University (samoh@skku.edu)
***** Lecturer, Librarian Training Institute, Sungkyunkwan University (ponda777@gmail.com)
Received: February 23, 2009. First review: February 23, 2009. Accepted: March 5, 2009.

1. Introduction

1.1 Necessity and Significance of the Study

An index is a mediating tool connecting information sources and users: a retrieval, discovery, and identification tool that tells people exploring the literature of a subject field, or surveying its research results, where to find the documents they seek, the bibliographic details of a particular document, or the location of the document that contains it (Yoon 2001). With the rapid advance of scholarship and technology, the importance of periodicals as library materials has come to equal or exceed that of monographs, demand for them has expanded internationally, and the searching and use of periodicals have become increasingly important.

At present there is too little domestic research on the article indexing of scholarly periodicals to serve as a benchmark. It is therefore necessary to compare and analyze the periodical article indexes of Korean and foreign libraries, evaluate Korea's periodical article indexing, and propose directions for its improvement. The results of this study can serve as baseline data for improving Korean periodical article indexing services and, further, as important material for setting their future direction.

1.2 Purpose and Research Questions

This study aims at an understanding of Korean periodical article indexing and asks: What is the current state of Korean periodical article indexing services? What are the problems of those services? In what direction should they develop? To investigate these questions, the study first reviewed the literature on periodical article indexing, explored the information needs of academic librarians, graduate students, and members of the public concerning article indexing services, and analyzed domestic and foreign libraries and scholarly databases. It then proposes the direction Korean article indexing should take and the improvements required, and closes by indicating future research and the significance of the study.

2. Analysis of Prior Research

2.1 The Development of Periodical Article Indexing

In Korea, periodical article indexing began to appear around the 1950s, when a list of research papers on Korean language and literature was carried as an article in a journal of the College of Education at Seoul National University. Full-scale indexing of periodicals then began with the Index to Scholarly Periodicals published by the Korean Library Association. The National Assembly Library took this work over, retitled it the Index to Korean Periodicals from 1963, and issued it quarterly: in print until 1998 and on CD-ROM from 1999 to 2005 (Yoon 2001). Today the periodical article index is built and provided in database form, produced among indexing institutions chiefly by the National Library of Korea and the National Assembly Library, alongside the scholarly database vendors.

In the United States, the index compiled by W. F. Poole in 1848 marked the point at which periodical indexes became distinct from book catalogs and classified indexes; in the twentieth century, as science and technology advanced and scholarship specialized, periodical indexing shifted toward subject-specialized serial article indexes (Ha 2001). A representative example is the subject-access indexing, using cross-references, that the H. W. Wilson Company began publishing. From the early twentieth century indexes became specialized by subject, and in the late twentieth century they were converted to databases serving abstracts and full text together (Yoon 2001; Ha 2001).

Periodical article indexes have thus expanded from print publication to databases providing full text along with indexing, and their importance as information retrieval tools keeps growing. To satisfy users in the changing web environment, integrated catalog-based retrieval systems and Web 2.0 features have begun to be introduced, and services accepting user reviews and tagging have begun to appear.

2.2 Research on Periodical Article Indexing Services

Considering the expansion of electronic journals and full text and the growth of scholarly databases, periodical article indexing cannot be neglected as a research area; nevertheless, little research on it has been carried out in Korea. Domestic work divides into proposals for metadata standards and studies of data quality, mostly from the early 2000s, so continuous research can hardly be said to be under way.

On metadata standards, Han (2000) sought to standardize bibliographic data elements for serial article databases. Taking the ten descriptive elements of ISO 690 as the framework and adding seven more selected from an analysis of representative domestic and foreign bibliographic databases, Han derived a seventeen-element standard: title; author; author affiliation; secondary author; journal title; location in the journal (volume/issue and pages); ISSN; place of publication; publisher; publication date; language of text; edition; keywords; subject field; abstract (or table of contents); holding institution; and full-text access. That study is significant in deriving the elements of article indexing from an international standard and from the institutions producing article indexes. The present study accordingly takes Han's seventeen elements as its baseline for analyzing the bibliographic data elements of article indexes, but because several institutions (the National Library of Korea, KISTI, DBpia, the University of Washington) were not covered in Han's study, it includes them and analyzes the elements without restriction to the seventeen.
On data quality, Yoon (2003) evaluated the quality of MARC records in the Korea Institute of Science and Technology Information's union catalog of science and technology serials, assessing 195 serial records against the cataloging rules and description standards of AACR2 and KORMARC. The study identified errors in titles proper, abbreviated titles, parallel titles, English titles, places of publication, and unnecessary language codes, and proposed remedies. It is significant for evaluating record quality against union-catalog standards, that is, for consistency with and accuracy under the standards, but it considered only serial records, not serial article records.

Abroad, in contrast with the limited domestic research, periodicals have been studied steadily in journals such as The Serials Librarian and Serials Review. Prompted by the expansion of electronic journals and full text, advances in technology, and the emergence of Web 2.0, discussion continues on periodical collection building and research directions: managing periodicals and developing collections as electronic journals expand (Nisonger 2008; Scherlen 2004), article indexing services reflecting Web 2.0 (Kemp 2008), and investigating and incorporating user needs (Connaway 2006; Reynolds 2006; Stephens, Lott and Weston 2001). From focus group interviews with eight undergraduates, faculty, and graduate students, Connaway (2007) identified three things users of periodical article indexing consider important: improved multidimensional information discovery, systems supporting user evaluation of indexed articles, and organic connection with other systems such as Google Scholar.

As this review shows, despite the expansion of information resources tied to periodical article indexing, research on how Korean article indexes are provided and in what direction they should develop is insufficient. This study therefore surveys the current providers of indexing services at home and abroad and, grounded in an understanding of the current state of Korean indexing, proposes directions for its development.

3. Methodology

The study proceeded through a user study and an analysis of domestic and foreign libraries and commercial database providers. To identify user needs regarding periodical article indexing, focus group interviews were conducted with three librarians and seven article-index users (four students and three working professionals). To assess the state of Korean article indexing, domestic and foreign libraries and commercial database providers were surveyed with attention to indexing coverage, search methods, metadata, services, and cases of sharing and integrated operation among cooperating institutions. The institutions examined were:
- Domestic: the National Assembly Library; the National Library of Korea; the Korea Education and Research Information Service (KERIS); Korean Studies Information Service (KISS); Nurimedia (DBpia); the Korea Institute of Science and Technology Information (KISTI)
- Foreign: the Library of Congress; OCLC; the University of North Carolina; the University of Washington

4. Research Survey

4.1 User Study

The focus group interviews with users covered the systems mainly used for article searching, the reasons for preferring them, desired improvements, priorities among indexing services, and directions for development, from which the essential considerations in providing article indexing services were identified.

Integrated search. Users employed more than one system (KISS, DBpia, KERIS, the National Assembly Library) for article searching. Preferences differed among individuals, but users relied on multiple systems so that no materials would be missed in a search, and correspondingly they asked for a system that can search article indexes in an integrated way. (User 4: "You have to cross-check; there could be materials that get missed.")

Currency. Users showed a strong need for materials published within the past one to three months. Older materials can be found in many places, but the newest are sometimes not yet indexed, or are retrievable in the system while neither the print nor the electronic full text is yet available; users wanted such materials supplied promptly.

Access to electronic full text. Access to electronic full text was likewise identified as a key element of article indexing; users wanted document services expanded so that every article can be read over the Internet. (User 3: "I wish I could use all the materials online without having to visit the library.")

Flexible and efficient search functions. Users cited the ability to view and display results usefully on screen, to filter results, and to narrow or broaden a search helpfully as factors affecting their satisfaction with article retrieval systems. For example, National Assembly Library users found the vast coverage of its article index satisfying, but were dissatisfied that it offers no browsing by journal, publisher, or subject field, no search within results, and no diverse access points; by contrast, a database such as DBpia, despite holding less, was credited with useful retrieval and with making it easy to broaden or narrow a search. In the user evaluation, functions supporting the multidimensional exploration of information were also judged important: following reference links in the full text to extend a search, and expanding a search by clicking an author, subject, or volume/issue in the brief or detailed display.

Through the user evaluation, the study identified integrated searching through a single search box, assured access to full text, provision of current resources, and search functions enabling multidimensional exploration (re-searching other content or restricting a search) as the principal elements of periodical article indexing; on this basis the study analyzes domestic and foreign indexing services and proposes ways to improve them.

4.2 The State of Periodical Article Indexing Services in Korea and Abroad

4.2.1 Foreign services

In the United States, the building of periodical article indexes and full text relies mainly on web database and electronic journal vendors; American academic libraries, OCLC, and the Library of Congress do not separately build periodical article indexes. OCLC, which began as a nonprofit institution and later moved into commercial operation, provides integrated searching related to periodical article indexing.
For this purpose OCLC incorporates the article indexes of cooperating institutions and partner databases, serves them through OCLC databases such as FirstSearch and WorldCat, and converts incoming records to fit OCLC's internal format. A representative example is the use of OpenURL to link article index records with full text in databases; for periodicals, OCLC conducts a limited review of the serial information arriving from diverse outside institutions and databases, confined to cases where serial titles are duplicated or the same serial is recorded under different titles, selects an authoritative serial title, and provides article indexing accordingly.

What stands out at OCLC is OCLC WorldCat (Figures 1 and 2): an Article tab provides article searching; information retrieved through a single search box is presented in relevance order; and users can re-search the results across various scopes using facet browsing. Diverse access points are also provided so that the library's information can be reached through portals such as Google and Yahoo. It further offers user-participation and social networking services such as reviews and tagging, and bookmarking services such as MySpace links that support the reuse of retrieved materials.

[Figure 1. OCLC WorldCat basic search screen]
[Figure 2. OCLC WorldCat detailed results page]

American academic libraries provide article indexing services in one of two ways: by purchasing web databases and providing links so that users can search periodicals, or by using WorldCat Local, which interoperates OCLC WorldCat with the university library system to support integrated access to local holdings and global resources. The University of Washington in Seattle, one representative WorldCat Local adopter, introduced WorldCat Local into its own system (Figure 3); the UW WorldCat provides article index searching and full-text linking for ArticleFirst, the British Library, ERIC, NLM Medline, the materials of the Northwest library consortium, and the electronic journals to which the university subscribes. Results are automatically sorted Local (the library's own materials), then Group (regional consortium materials, here the Northwest consortium), then Global (WorldCat materials), letting users view materials conveniently in order of accessibility. The same query can return differently ordered results because the search engine weights the library's own holdings in ranking by relevance. As Figure 3 also shows, links are provided to the serials of regional libraries holding the journal in which an article appears, so that articles the home library does not supply can be requested from a regional library.

[Figure 3. Detailed article search page in the University of Washington Libraries' WorldCat]

Individual libraries can adjust the service and interface to their own character and needs: changing the metadata elements displayed and the terms used for them, the facet elements, sorting, and post-search functions such as exporting citations.

In 2007 the University of Washington Libraries confronted the inefficiency of their complicated search systems. They operated three internally developed catalog systems that were not integrated with one another, and searching and use fell markedly; students turned to portals such as Google and Yahoo for study and research, and use of the library's systems dropped further. After the library adopted OCLC's WorldCat Local, however, usage statistics for major services such as circulation and interlibrary loan rose sharply and user satisfaction grew dramatically (OCLC Success Story 2008). The service is still at the pilot stage, but such reported results deserve attention.

The Library of Congress mainly serves its collections in the subject fields of law, politics, music, and the social sciences and its archival materials related to American history; for periodical searching it provides little more than access through web databases in those fields. Its role with respect to periodical article indexing is to develop and maintain subject heading lists, authority files, and cataloging standards.

The United States can provide article indexing through such a system for several reasons. Commercial databases offering article indexes and full text are well established in every discipline, including law, economics, the humanities, and the social sciences, so individual institutions need not take on the labor of duplicating them; because multiple commercial databases exist in each field, the closure of any one poses little risk of losing intellectual resources; an information hub, OCLC, exists; institutions give priority to open access to their information resources; identifiers assigned to article indexes and article full text enable linking between institutions, allowing information to circulate; and cooperation between OCLC and individual libraries is presupposed, so that a library adopting WorldCat can provide and circulate accessible information efficiently to its users.

4.2.2 Domestic services

Korean periodical article indexing operates on a model in which the National Assembly Library, the National Library of Korea, and KERIS, together with scholarly database vendors such as KISS and DBpia, each build their own article indexes and, in part, their own full text. Unlike the United States, Korea has many organizations building indexes and full text piecemeal, so duplication is unavoidable and efficiency was found wanting. The National Assembly Library collects all periodicals judged to have scholarly value or to assist legislative activity and provides article indexing for them; as of December 2008 it held 8,683 periodical titles, indexed 6,140 titles, and had itself built 2.5 million article index records and a full-text database of 760,000 items (a full-text rate of about 30%). The National Library of Korea likewise indexes periodicals judged to have scholarly value, relying on in-house building and on KERIS (710,000 index records, 350,000 full-text items, a full-text rate of about 50%), while KERIS provides links for whatever it has not built itself (full-text access to 1.03 million items). Periodical collecting, too, is carried out independently by each institution, so information on collections and on index and full-text building is not shared among institutions, and, unlike the United States, Korea lacks an adequate system through which the article indexes and full-text access points already built can be searched in one place.
For example, the National Assembly Library's article index is not at present integrated with KERIS for searching. And although each institution professes the avoidance of duplication as a priority, none has ascertained the degree of overlap in the article indexes and full text each has built itself, or in the periodicals the institutions hold. Korea, moreover, has no authority files or subject heading lists in effective use: the National Library of Korea built a personal-name authority file, but its use for periodicals is slight, and the National Assembly Library built a subject headings list (thesaurus) that is not currently used. As a result, article searching by subject term is not provided in Korea; instead of subject terms, only a limited subject search is offered, relying on KDC or DDC class numbers to permit browsing of periodicals in a small number of subject fields. Nor are capabilities like those seen at OCLC and the University of Washington, such as facet-based clustering, retrieval systems connecting the local library, databases, and global resources, user-participation functions such as reviews and tagging, and bookmarking services, sufficiently provided. These domestic conditions obstruct exactly what the survey of users' information needs identified: integrated searching, assured access to electronic full text, and flexible, efficient service.

5. Results and Discussion

The volume of periodical article indexing is growing rapidly. Publications in scholarly research and development appear at a rate of more than twenty thousand a day, and users encounter vast quantities of information daily (Ko 2002). Article indexes produced in Korea are an important national intellectual asset, and users must be able to reach the information they need quickly and efficiently. Drawing on the user study and the survey of services discussed above, this study recognizes the importance of seamless periodical article indexing services, examines how the elements needed to realize them are provided in domestic and foreign services, and proposes ways to improve those services.

5.1 Cooperation-Based Integration of Periodical Article Indexes

Outside science and technology and education, Korea lacks article index databases that could characterize fields such as the social sciences, medicine, business and economics, and the humanities, and indexes covering broad subject ranges are being built piecemeal by several institutions. The role of an institution to integrate and manage article indexing is not clearly defined, and cooperation and communication among domestic institutions over indexing do not function properly. As a result, indexes are built redundantly across institutions, and even the indexes that individual institutions build exclusively are not shared efficiently. Without a central point for sharing article indexes, it is difficult to provide useful and efficient indexing service to users, and to librarians as well.

This contrasts with the United States, where the Library of Congress takes charge of metadata standards, authority files, and the selection of subject headings; where the article indexes produced by database and electronic journal vendors can be searched in an integrated way through OCLC as an information hub; and where a system in which integrated searching of local library and global materials takes place within the local library, through WorldCat Local, is in pilot operation.

This study proposes that, on a foundation of cooperation among the article-indexing institutions, a Korean Articles Union Catalog (KAUC) be built to advance the sharing and integrated searching of article indexes among institutions, and that access to full text be maximized through document delivery and linking services. The need for a union catalog of article indexes is as follows.

First, the scope of article indexing is diverse in subject and in form. By form it ranges over university publications, government research reports and society journals, current-affairs magazines (weeklies, monthlies), and scholarly journals; by subject it spans science and technology, culture and the arts, politics and society, law, medicine, and more; and the volume of indexing is increasing geometrically. Considering that electronic journals will surge, the targets of indexing will grow further. In these circumstances, exclusive literature built at individual institutions and in commercial databases is unavoidable, and information about it, too, must be shared through cooperation. This study found that, beyond the duplication among the indexes built by the National Assembly Library, the National Library of Korea, KERIS, KISTI, and others, the quantity of indexing each institution has built exclusively is also considerable. The National Library of Korea tends to concentrate on the humanities and literature, KISTI focuses on science and technology, and the National Assembly Library has strengths in government publications and current-affairs magazines. Under present conditions, however, information about each institution's indexing, its full-text building and provision, and its collected periodicals is not shared, and indexes are duplicated across institutions. Cooperative building and sharing of article indexes would raise each institution's specialization, helping it concentrate on systematic collection development in its special field, continuous updating, and the compilation of periodical lists, thereby strengthening its distinctive character.

Second, considering that the National Assembly Library's full-text rate is about 30% and even the National Library of Korea's is only about 50%, and that the user interviews showed full-text provision to be perceived as the most important element of indexing service, cooperation among institutions is judged a critical success factor for efficient full-text provision as well.

Cooperation-based integration, however, must reckon with Korea's actual conditions: institutional characteristics, governance, effort, time, and budget. Rather than pushing integration hastily, the groundwork of an article indexing infrastructure and environment for use should be laid first, in stages. This study therefore presents the organizational framework of a periodicals union catalog as in Figure 4, recommending that a consultative body be formed through which the National Library of Korea, the National Assembly Library, KERIS, and KISTI cooperate.

[Figure 4. Proposed organizational framework for a union catalog of periodical article indexes]

The union catalog system should consider including not only the cooperating libraries' indexes but also those of commercial databases. Through such a union catalog, article indexing can avoid duplication of records, enhance institutional specialization and expertise, establish interlibrary loan among institutions, refrain from generating unnecessary records, and let each library concentrate on building unique periodical collections that distinguish it from others, carrying the long-term advantage of a framework for cooperative periodical collecting (Anderson 2008), while answering the demand for a single, unified search identified in the user study.
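The article does not specify how records would move between the cooperating institutions and the proposed union catalog. One widely used transport for this kind of aggregation is OAI-PMH, whose responses carry simple Dublin Core records; the sketch below is a minimal harvesting client under that assumption. The endpoint URL and set name are hypothetical.

```python
# A minimal sketch, assuming the cooperating institutions exposed their
# article indexes via OAI-PMH (a transport the article does not itself name).
import urllib.request
import urllib.parse
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
OAI_DC = "{http://www.openarchives.org/OAI/2.0/oai_dc/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest_dc_records(base_url, set_spec=None):
    """Yield (identifier, title, creators) for each oai_dc record."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        for rec in root.iter(OAI + "record"):
            header = rec.find(OAI + "header")
            dc = rec.find(OAI + "metadata/" + OAI_DC + "dc")
            if dc is None:          # e.g., deleted records carry no metadata
                continue
            yield (
                header.findtext(OAI + "identifier"),
                dc.findtext(DC + "title"),
                [e.text for e in dc.findall(DC + "creator")],
            )
        # Follow the resumption token until the server reports completion.
        token = root.find(OAI + "ListRecords/" + OAI + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

# Hypothetical endpoint for one cooperating institution:
# for oai_id, title, creators in harvest_dc_records(
#         "https://index.example.ac.kr/oai", set_spec="articles"):
#     print(oai_id, title, creators)
```

A central KAUC aggregator could run such a harvest against each member institution on a schedule, deduplicating on the identifiers discussed in section 5.3 below.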
5.2 A Metadata Standard

Research on metadata standards for periodical article indexing has not been active in Korea, and this study found the institutions maintaining slightly different metadata formats as their needs dictate, differences apparent not only in the bibliographic metadata elements themselves but also in the elements shown in brief and detailed search displays (see Appendices 1 and 2). For example, KERIS supplies the publisher URL and KCI registration status; KISTI the full-text form (print, electronic, or both) and material type (e.g., journal article, presentation); DBpia the UCI (digital content identifier), material types (scholarly journal, conference proceedings, research report, professional magazine, and national knowledge categories), and a source classification (society, publisher, research institute, government agency); and OCLC and the University of Washington supply user-participation elements (user reviews, user keywords, holding institutions, identifiers). The institutions were thus found to hold differing metadata formats and values, and, except for DBpia, none provides supplementary content supporting user evaluation of documents, such as reviews or user tags. The National Assembly Library and the National Library of Korea, moreover, still maintain the MARC format and so provide no publisher URL, article index identifier, or material type, information that the basic bibliographic elements of Dublin Core (DC) can carry. This absence of a standard metadata format obstructs content sharing and the securing of interoperability; overcoming it requires standardized metadata for periodical article indexing.

The proposed metadata is shown in Table 1. Supplying a unique identifier for the index record and adding elements such as a full-text identifier, publisher URL, material type, source classification, subject terms, full-text form, and registration status would raise users' access to and use of the index; to provide user-participation services, user tags and user reviews are proposed for inclusion.

Table 1. Data elements for periodical article indexing
1. Title: representative name of the article
2. Author: the person chiefly responsible
3. Author affiliation: the body or institution to which the chief author belongs
4. Secondary author: individuals with secondary responsibility
5. Journal title: the journal carrying the article
6. Location and pages: where the article appears (volume/issue) and its pages
7. ISSN: standard serial number
8. Place of publication: locality of the publisher
9. Publisher: name of the publisher or issuing body
10. Publication date
11. Language of text: principal language of the article
12. Edition: edition statement of the periodical
13. Subject field: KDC/DDC-based subject field
14. Subject terms: controlled subject terms
15. Keywords: uncontrolled terms assigned by the author or indexer
16. User tags: subject terms users attach to the article's content
17. Reviews: reviews users write about the article
18. Abstract (table of contents): abstract or contents information
19. Holding institution: source of the article or full-text provider
20. Index record identifier: unique identifier of the index record
21. Resource identifier: unique identifier of the article
22. Publisher URL: the publisher's URL
23. Material type: type of the article's material
24. Source type: type of the issuing body
25. Full-text form: PDF, HTML, or print
26. Registration status: whether the journal is registered with the Korea Research Foundation
27. Full-text access: how the full text is obtained

Material type can take values such as scholarly journal, conference proceedings, research report, professional magazine, and current-affairs magazine (weekly, monthly, quarterly); source classification can take values such as government agency, research institute, university, society, and publisher. For subject terms in particular, most domestic institutions were found not to assign terms from a subject headings list but to stop at indicating a broad subject range from KDC or DDC call numbers; providing subject terms would bring out the distinctive strengths of the National Assembly Library's indexing service.

The study also proposes converting the format to DC-based XML capable of carrying URLs and material types. Beyond the advantage of accommodating metadata elements that MARC does not provide, transforming the format to XML improves data interoperability and is desirable for the web-based integration of bibliographic systems (MARC AND SGML 2002): the semantics of the metadata can be expressed more effectively, and, by enabling validation, efficient transfer of bibliographic data can be pursued.

5.3 Identifier Systems

Managing and using article index resources through metadata requires a scheme that effectively links articles with the index records, the metadata records, that describe them. For periodical article indexing, the management of publishers also matters, as does linking them to the index. At present, as Figure 5 shows, the National Assembly Library's article index assigns no identifier to index records or full text, so users cannot bookmark or save a retrieved index record or document, which impedes efficiently supporting the reuse of retrieved articles. Identifiers are indispensable for what the user needs survey identified: connecting information with other systems and exploring information.

[Figure 5. Search screen of the National Assembly Library's periodical article index]

On this analysis, the study proposes building an identifier scheme for index metadata records and for articles, with linking services built upon it; such identifiers make the effective and efficient management of article index metadata possible. Metadata is an essential apparatus for the effective and efficient management of information resources; an identifier system adds the foundation that heightens what metadata can do and, going further, makes it possible to identify all the information and knowledge related to article indexing, enabling its effective and efficient use, so identifiers must be considered in building the infrastructure for periodical article indexing.

The representative identifier scheme is the URI, a unique and invariant identifier of a resource. For URIs to function as intended, a system is needed that resolves each URI value to a URL connected to the actual resource. The best-known resolution system at present is the Handle system developed in the United States by CNRI (The Corporation for National Research Initiatives). The National Information Society Agency (NIA) actively operates a UCI (Universal Content Identifier) server similar to Handle, and the Dublin Core community, in a similar vein, uses the PURL (Persistent URL), which supplies an http-form URI value as a persistent URL (Oh 2002).
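The resolution step just described, turning a persistent identifier into the resource's current location, can be illustrated with a minimal sketch. The identifier scheme and URLs below are hypothetical; production systems such as Handle, UCI, or PURL servers implement this mapping with managed registries and HTTP redirects.

```python
# A minimal sketch of persistent-identifier resolution, in the spirit of
# Handle/UCI/PURL servers. All identifiers and locations are hypothetical;
# real resolvers keep this mapping in a managed registry and answer
# requests with HTTP 30x redirects.
registry = {
    # persistent identifier -> current location of the resource
    "kauc:article/2009-000123": "https://repository.example.ac.kr/articles/123.pdf",
    "kauc:journal/1225-598X":   "https://society.example.or.kr/journal",
}

def resolve(identifier: str) -> str:
    """Return the current URL for a persistent identifier."""
    try:
        return registry[identifier]
    except KeyError:
        raise LookupError(f"unregistered identifier: {identifier}")

def relocate(identifier: str, new_url: str) -> None:
    """When a resource moves, only the registry entry changes; the
    identifier that users have cited or bookmarked stays valid."""
    if identifier not in registry:
        raise LookupError(f"unregistered identifier: {identifier}")
    registry[identifier] = new_url

# resolve("kauc:article/2009-000123")
# relocate("kauc:article/2009-000123", "https://new-host.example.ac.kr/123.pdf")
```

The separation of identifier from location is what lets bookmarks, citations, and cross-system links survive server migrations, which is exactly the reuse the article finds missing in the current service.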
For periodicals, KERIS supplies a unique identifier for the metadata record together with the publisher URL; DBpia likewise assigns a UCI (digital content identifier) to its index records and provides URLs for publishers; the ERIC database attaches a PURL to the index record and a DOI (Digital Object Identifier) to the full text; and OCLC attaches PURLs to index records so that the unique identifier can be put to varied use as a web resource, for open access, bookmarking, and so on. As identifiers for periodical article indexing, one can consider the COI (Content Object Identifier) developed by the Korea Culture and Content Agency, the DOI widely used by foreign database vendors, and the UCI in use at DBpia.

The UCI is an identifier scheme developed for identifying digital objects in the online distribution of digital content, taking documents and multimedia as its targets. Under the National Information Society Agency as the general body, Registration Agencies that act for digital content registration receive content from its holders and, when a digital object is requested, provide its location to the user at a fixed address through a resolution system. Identification metadata supplies brief information about the digital content for the convenience of searching, and this identification metadata can be shared with the metadata proposed in this study through a simple mapping.

The COI is an identifier scheme developed to identify and manage works and copyrights in digital content; its targets are creation information, expression information, works in digital form, physical works, and physical objects, characteristically extending even to somewhat abstract objects. COI has an operating and management structure similar to UCI's: under the Korea Culture and Content Agency as the general managing body, registration agencies receive content from holders and provide the basis for delivering it to users at fixed addresses through a resolution system. The National Library of Korea has been selected as the COI registration agency for library-collected content (COI Handbook 2006).

The DOI, a digital object identifier, is an identifier scheme developed to identify and manage documents on the Internet: a permanent identifier for digital content unaffected by changes in the content's location or in systems. It was devised by an American publishers' consortium cooperating with the Corporation for National Research Initiatives (CNRI) and is now operated and managed by the International DOI Foundation.

5.4 Building Authority Files

The collocation that domestic institutions currently provide still has clear limits. Even KERIS, which groups results by author, volume/issue, and publisher, provides this grouping without authority files, meaning it offers no way to distinguish identical personal names, identical journals, or identical publishers; and where an author is known by several names, or a journal or publisher changes its name, there are limits to reflecting this in retrieval and providing collocation.

The same limitation appears for subject names: the subject searching and grouping that domestic institutions provide rely on KDC or DDC and stop at broad subject ranges. But article indexes often serve searches for specialized information, and many users already hold a general understanding of the subject field; they want specific subjects beyond broad ranges, and meeting that information need requires the assignment of subject terms by a subject headings list. The National Assembly Library developed a thesaurus around 2000 and has revised it once, but this thesaurus is not being used for article indexing, and scarcely any domestic institution was found to assign subject terms according to a subject headings list. Figure 6 shows the case of the Education Resources Information Center (ERIC) database, which developed an education thesaurus (the ERIC Thesaurus) and assigns its terms to periodical articles as subject terms.

[Figure 6. Use of the ERIC Thesaurus]

Existing thesauri should be put to active use, and for a thesaurus to remain useful it must be revised continuously.

5.5 Multidimensional Information Discovery

For periodical article indexing, keyword searching, as basic and advanced search, is the mainstay, and the following varied functions are provided alongside it:
- browsing by database, publisher, publication, and subject
- grouping by author, source journal, publisher, subject, and volume/issue
- re-sorting of results by source journal, publication year, relevance, author, and title
- user tags; popular publications and popular search terms; related search terms; popular papers; societies of interest; user reviews; suggested search terms; and links to references, citing works, and related documents

These varied functions support users' searching; post-search functions include scrapbooks, favorites, search-engine integration, terminology dictionaries, text-message delivery, and saving citations.

What this study attends to is the "multidimensional exploration" capability, enabling re-search of other content or restriction of a search, identified as a key element of index information in the user study; it is a principal component of article indexing service confirmed both in Connaway (2007) and in the present study's analysis of user needs. As Figure 7 shows, North Carolina State University, the University of Washington, and OCLC have introduced facet browsing systems, extending the grouping once offered only for publisher, source journal, author, and subject heading: elements helpful in exploring the retrieved articles, such as place, publication year, and language, are extracted and used as facets to assist the search. Users can also see at a glance other terms related to the query, with the number of articles under each, making it easy to broaden or restrict a search. For example, entering the same query, '환율' (exchange rate): in the KERIS screen shown in Figure 7, clicking a source journal (e.g., 국민경제연구) brings up the index of that journal, whereas in the OCLC screen shown in Figure 8, clicking an author name (e.g., Rose) retrieves the articles associated with both the exchange-rate query and Rose.

[Figure 7. KERIS search restriction/expansion functions]
[Figure 8. OCLC facet-based search restriction/expansion functions]

This study accordingly proposes the use of facets to improve multidimensional exploration; the usefulness of facet-based searching has been demonstrated in earlier research.
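A minimal sketch of the facet mechanics just illustrated with the exchange-rate example: counting candidate refinements over a result set and narrowing by a chosen facet value. The sample records and field names are hypothetical.

```python
# A minimal sketch of facet counting and drill-down over a result set.
# Records and field names are hypothetical illustrations.
from collections import Counter

results = [
    {"title": "환율 변동과 수출", "author": "Kim", "year": 2008,
     "material_type": "scholarly journal", "language": "Korean"},
    {"title": "Exchange Rates and Trade", "author": "Rose", "year": 2007,
     "material_type": "scholarly journal", "language": "English"},
    {"title": "환율 정책 보고서", "author": "Lee", "year": 2008,
     "material_type": "research report", "language": "Korean"},
]

def facet_counts(records, field):
    """Each value of a facet with the number of matching articles,
    as in the OCLC display discussed above."""
    return Counter(r[field] for r in records)

def narrow(records, field, value):
    """Restrict the result set to one facet value (drill-down)."""
    return [r for r in records if r[field] == value]

# facet_counts(results, "year")      -> Counter({2008: 2, 2007: 1})
# narrow(results, "author", "Rose")  -> only the Rose article
```

The access points proposed in the next paragraphs (material type, full-text form, source classification, and so on) are exactly the fields such counting and narrowing would operate over.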
In a usability evaluation of keyword versus faceted search systems, Yee, Swearingen, Li and Hearst (2003) reported that faceted search showed higher satisfaction in terms of search effectiveness, high recall, ease of searching, and learning from search results; Santos (2008), comparing a facet-based recipe site with a keyword-search site on the basis of users' evaluation of the information architecture, likewise found faceted search more usable than keyword search. In the usability evaluation of catalog retrieval systems as well, a North Carolina State University Libraries report (2008) found a facet-based catalog search interface more usable than the conventional interface dependent on a search box; users preferred faceted search for its ability to broaden or restrict results, with high preference observed for the subject, format, owning-library, and availability facets.

Providing facet-based searching in article indexing so as to support the multidimensional exploration of information requires, first, that on the foundation of the metadata standard, identifiers, and authority files, additional access points be provided for material type (scholarly journal, conference, research report), full-text form (print, HTML, PDF), registration status, search history, subject, and source classification (society, publisher, research institute, government agency); and, second, that functions be provided which, applying facet searching and collocation to this information, support the restriction and expansion of searches and improve the identification of information.

5.6 Expanding User-Participation Services

User-participation services in article indexing have begun to be used at home and abroad: foreign institutions led by OCLC offer user tagging and the posting of user reviews, while domestic institutions such as KISTI offer user tagging. Figure 9 shows INSPEC, the database for computer science, mechanical engineering, and electrical and electronic engineering, putting user participation to work in a system where users create keywords for papers and user-created keywords are employed in searching. As in Figure 10, users can enter tags on a paper; by displaying the aggregate of tags and notes users have attached to papers and information, tagged content can be searched as in Figure 9.

[Figure 9. INSPEC tag search screen]
[Figure 10. INSPEC user tag entry screen]

In Korea, databases such as KISTI provide services permitting user tagging, but the service has not extended to providing user reviews, and the scholarly databases providing article indexes across many fields, such as KERIS, DBpia, and KISS, do not incorporate user participation at all.

With the advance of semantic technology and the spread of Web 2.0, a broad current of information services grounded in user participation has emerged: social software, the collective name for software that assists and stimulates information exchange among users; social tagging, the collective name for users' classification of information resources; and social information architecture, the idea that the crowd's intentions should be reflected in the organization of information. The prefix "social" attaches to technologies and phenomena in which collective intelligence is given collective meaning. This suggests that the current of the next-generation web is shifting from centering on users to being shaped by their participation: to satisfy users' latent needs, the expressed needs of the crowd must be capable of being reflected.

User participation began to be used in earnest for information retrieval and exploration when Amazon introduced its customer review system. Although reviews sit at the very bottom of an Amazon page, users consult them because reviews are the content users truly want, and these independent activities of individuals came to function as an important resource in discovering and identifying information. Amid the flood of information, information created through the participation of individuals who have experience of a resource or product grows ever more useful, and user-provided information becomes the springboard that makes personalized recommendation services possible; on these grounds Web 2.0 is predicted to spread rapidly (Porter 2008).

User-participation services began as tools of personal information management, for users to reuse the information they had retrieved; but by sharing interests in collections, images, and products, users can satisfy the desire to act dynamically and interactively as members of the communities they belong to or wish to join, something that must also be considered in providing library services (Spiteri 2009). Research on periodical article indexing (Connaway 2007) likewise established through user studies that other users' reviews of materials help not only retrieval but also the identification of information; user-participation services will also aid the circulation of article index information. Reflecting the crowd's demands in article indexing becomes feasible through user-participation services, and applying them to article indexing brings the following advantages (Kemp 2008). First, they complement subject-term assignment and subject searching that otherwise rest solely on the intellectual capacity of the index creator. Second, where subject-term assignment depends only on a controlled vocabulary (thesaurus), they mitigate the confusion in retrieval arising when the controlled vocabulary and users' vocabulary do not match. Third, the latest vocabulary of a discipline can be grasped through the language users employ, enabling revision of the controlled vocabulary (thesaurus). Fourth, by supplying semantic data that indexing cannot assign, namely reviews, they make detailed information about an article available, helping users identify information. Finally, user participation promotes the currency of the index, a principal element of indexing service identified in this study.

As observed above, user-participation services in article indexing so far stop at tagging functions domestically and at tagging and reviews abroad. Going beyond the retrieval and identification of information through user tagging and reviews, this study proposes providing, as Spiteri (2009) suggests, spaces where users can participate and act dynamically in communities, and recommends offering services that, through user participation, suggest periodical articles in a user's fields of interest.
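Before the service forms are enumerated, a minimal sketch of the tag capture and tag-based lookup illustrated by the INSPEC example above; all records and tags here are hypothetical.

```python
# A minimal sketch of user tagging as described for INSPEC above:
# users attach tags to article records, and the tags become searchable
# access points. All article identifiers and tags are hypothetical.
from collections import defaultdict

tags_by_article = defaultdict(set)   # article id -> tags
articles_by_tag = defaultdict(set)   # tag -> article ids

def add_tag(article_id, user_tag):
    """Record a user-assigned tag and index it for retrieval."""
    tag = user_tag.strip().lower()
    tags_by_article[article_id].add(tag)
    articles_by_tag[tag].add(article_id)

def search_by_tag(tag):
    """Return the articles users have described with this tag."""
    return sorted(articles_by_tag.get(tag.strip().lower(), set()))

# add_tag("kauc:article/2009-000123", "exchange rates")
# search_by_tag("Exchange Rates")  -> ["kauc:article/2009-000123"]
```

The same tag store also gives index maintainers a running sample of user vocabulary, the raw material for the thesaurus revision mentioned above.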
User-participation services in article indexing can be provided in the following forms:
- User tagging: letting users attach tags to article index records, for use in retrieval
- User reviews: permitting users to write reviews of indexed articles, so that the opinions of other users, not only information supplied by experts or librarians, can be consulted
- Registration of user interests: letting users register their fields of interest related to periodicals, enabling personalized service and supporting activity in communities of interest
- Bookmarking: letting users bookmark article index records or full text for use in personal information management and the reuse of information
- Recommendation: suggesting periodical articles likely to help a user, based on registered interests and on the bookmarking and downloading of index records and full text

Given the potential of Web 2.0, Korean periodical article indexing should provide diverse user-participation services and support varied intellectual activity, such as collaborative work through communities and information sharing. On the side of user services, collective intellectual activity contributes to the reproduction of information and lays a basis for presenting users with semantically more meaningful information. On the side of operational support, it extends a system in which indexing depends only on the indexer or record creator, lightening indexers' workload; and, by reflecting users' information behavior in the library, it helps build retrieval and identification systems that users understand and use more easily. Therein lies its significance.

6. Conclusions and Suggestions

The volume of periodical article indexing is increasing quickly. Publications in scholarly research and development are multiplying rapidly, and users confront these accumulations of information every day. Article indexes produced in Korea are an important national intellectual asset, and users must be able to reach the information they need quickly and efficiently. With this motivation, the study carried out a literature review, a survey of user needs, and a survey of domestic and foreign indexing institutions in order to evaluate Korean article indexing services and indicate a direction for their development.

Through the user needs survey the study analyzed the elements that article indexing must possess, and through the institutional survey it ascertained the present state of Korean article indexing, emphasizing on that basis the seamless provision of indexing services. Users were found to require a flexible and efficient environment for access and use: an information gateway through which the indexes provided by many institutions and databases can be searched; linking to full-text providers; the retrieval and identification of information informed by other users' opinions; activity as members of communities; and personalized services delivering information suited to their interests. As the systematic elements for realizing this effectively, the study proposed the building of a Korean article indexing union service, a metadata standard, authority files, user-participation services, facet-based multidimensional discovery functions, and identifier systems.

The significance of this study lies in ascertaining the state of Korean periodical article indexing by analyzing domestic and foreign libraries and databases, underscoring the importance of article indexing, and indicating improvements to the elements its services require, thereby offering help to related research to come. The metadata proposed here, derived from the analysis of representative indexing institutions, should be evaluated through user assessment and application to indexing; and the elements this study proposes for indexing services should be taken up in further research for discussion of how they may be put to use.

References
[1] Young Man Ko. 2002. "The National Assembly Library as an Integrated Center for Specialized Information." In National Assembly Library: Fifty-Year History, 377-385. Seoul: National Assembly Library.
[2] Sam Gyun Oh. 2002. "The Semantic Web Technology and its Applications." Journal of the Korea Society for Information Management, 19(4): 297-319.
[3] Koo-ho Yoon. 2001. Indexing and Abstracting. Seoul: Korean Library Association.
[4] Cheong-Ok Yoon. 2003. "Evaluation of the Quality of Records of the Serials Union Catalog Database." Journal of the Korean Society for Library and Information Science, 37(1): 27-42.
[5] Woo Jung Ha. 2001. A Study on the Changes and Developments of the Wilson Periodical Indexes. Master's thesis, Keimyung University, Daegu.
[6] Jong-Yup Han. 2000. A Study on Standards for Bibliographic Data Elements of the Articles in Serials. Ph.D. diss., Chung-Ang University, Seoul.
[7] Korea Culture and Content Agency. 2006. COI Handbook: The Cultural Content Identification System Project. Seoul: Korea Culture and Content Agency.
[8] Anderson, R. 2008. "Future-Proofing the Library: Strategies for Acquisitions, Cataloging, and Collection Development." The Serials Librarian, 55(4): 560-567.
[9] Connaway, L. S. 2007. "Mountains, Valleys and Pathways: Serials Users' Needs and Steps to Meet Them - Part 1." The Serials Librarian, 52(1/2): 223-236.
[10] Littletree, S. 2008. Endeca Catalog Usability Test.
[11] Nisonger, E. 2008. "The 80/20 Rule and Core Journals." The Serials Librarian, 55(1/2): 62-84.
[12] Scherlen, A. 2004. "Courage of Our Convictions: Making Difficult Decisions About Serials Collections." Serials Review, 30(2): 117-121.
[13] Kemp, R. 2008. "Catalog/Cataloging Changes and Web 2.0 Functionality: New Directions for Serials." The Serials Librarian, 53(4): 91-112.
[14] Porter, J. 2008. Designing for the Social Web. Berkeley, CA: New Riders.
[15] Reynolds, R. R. 2007. "Mountains, Valleys and Pathways: Serials Users' Needs and Steps to Meet Them - Part 2." The Serials Librarian, 52(1/2): 237-249.
[16] Santos, A. 2008. Clustering and Faceted Search.
[17] Spiteri, L. F. 2009. "The Impact of Social Cataloging Sites on the Construction of Bibliographic Records in the Public Library Catalog."
Cataloging & Classification Quarterly, 47: 52-73.
[18] Stephens, D., Lott, C., and Weston, B. 2001. "Prioritizing Periodicals: A Web-based Approach to Gathering Faculty Advice on Journal Subscriptions." The Serials Librarian, 40(3/4): 369-373.
[19] Yee, K-P., Swearingen, K., Li, K., and Hearst, M. 2003. "Faceted Metadata for Image Search and Browsing." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 401-408.
[20] OCLC Success Story: University of Washington Libraries.

Appendix 1. Brief- and detailed-display bibliographic data elements of domestic and foreign article indexes
- National Assembly Library. Input elements: article title, author, input date, language of text, publication details (place, publisher, year), call number, source details (journal title, volume/issue, pages, ISSN). Brief display: article title, author, year, source details (journal title, volume/issue, pages, ISSN), publication details (place, publisher, year), call number, full text, table of contents, abstract. Detailed display: article title, author, source journal (title, volume/issue, pages, ISSN), publication details (place, publisher, year), call number, table of contents, abstract view (restricted materials only), full text.
- National Library of Korea. Input: title, author, publisher, year, source journal title, language, call number, physical description, keywords. Brief: title, author, source journal title, publisher, year, call number, holding location, table of contents, abstract, full text. Detailed: title/article title, author, notes, source journal title, call number, holding location, abstract.
- KISTI. Input: article title, parallel article title, author, material type, available form, journal title, issuing body, publication date, language, volume/issue, pages, subject terms, holding institution. Brief (user system): article title, author, journal title, volume/issue, year, pages (start-end), with related subject search terms supplied. Detailed: article title, parallel article title, author, material type, available form, journal title, issuing body, publication date, language, volume/issue, pages, subject terms, abstract, with the full-text access path supplied.
- Korean Studies Information (KISS). Input: publication title, ISSN, E-ISSN, author details, publisher, language, coverage start year, coverage end year, frequency, subject, volume/issue, article title. Brief: article title, author, publisher, publication title, year, full-text view. Detailed: article title, English article title, author, year, issuing body, publication details (title, volume/issue, start page, end page, total pages), subject keywords.
- Korea Education and Research Information Service (KERIS). Input: article title, English article title, author, journal title, volume/issue, publisher, publisher URL, material type, pages (start, end, total), language, year, KDC, registration information, subject terms, persistent URL, abstract, table of contents, full-text view, DDC, ISSN, successor journal title, print holding institutions, country of publication, frequency. Brief: article title, English article title, author, publication details (journal title, volume/issue, year), KCI registration information, full-text view, abstract view, contents view. Detailed: the full set of input elements.
- DBpia. Input: title, English title, author, secondary author, pages, total pages, publication title, issuing body, publication type, language, file format, KORMARC, DBpia unique (management) identifier, UCI, table of contents, Korean and English abstracts, Korean and English keywords, subject classification, detailed subject classification, publication type classification, source classification, publication period, edition, ISSN/ISBN, frequency. Brief: title, author, secondary author, issuing body, publication title, edition, year and month of publication, pages, total pages, subject field. Detailed: English title, author, secondary author, pages, total pages, publication title, issuing body, publication type, language, file format, KORMARC, DBpia identifier, UCI, table of contents, Korean and English abstracts and keywords.
- OCLC WorldCat. Input: article title, author, user rating, material type, language, source details (journal title, volume/issue, year, pages), publication details (place, publisher, publisher information), indexing database, ISSN, OCLC Number, holding institutions, Details (abstract), Reviews, Tags. Brief: article title, author, material type, language, source details, publication details, indexing database. Detailed: article title, author, user rating, material type, language, source details, publication details, indexing database, ISSN, OCLC Number, holding institutions,
Details (abstract), Reviews, Tags.
- University of Washington (Seattle) UW WorldCat. Input: article title, author, user rating, material type, language, source details (journal title, volume/issue, year, pages), publication details (place, publisher, publisher information), indexing database, ISSN, OCLC Number, holding institutions, Details (abstract), Reviews, Tags. Brief: article title, author, material type, language, source details, publication details, indexing database. Detailed: the full set of input elements, including user rating, Reviews, and Tags.
(LC and UNC provide searching only down to the journal level and are therefore excluded from the article index metadata comparison.)

Appendix 2. Bibliographic metadata elements of domestic and foreign article indexes. Eleven indexes were compared (the National Assembly Library, KISTI, KERIS, the National Library of Korea, KISS, DBpia, OCLC, LC, UNC, UW, EBSCOhost); the number of systems providing each element was: 1. Title: 9; 2. Author: 9; 3. Author affiliation: 1; 4. Secondary author: 4; 5. Journal title: 9; 6. Location and pages: 9; 7. ISSN: 9; 8. Place of publication: 7; 9. Publisher: 9; 10. Publication date: 9; 11. Language of text: 7; 12. Edition: 3; 13. Keywords: 9; 14. Subject field: 3; 15. Abstract (table of contents): 8; 16. Holding institution: 4; 17. Full-text view: 9; 18. DC support (XML-based): 2.

work_jukhjklns5hrvl4bwffa7vhwuy ---- The ALCTS Networked Resources and Metadata Committee: Setting Course in a Sea of Change by Norm Medeiros, Coordinator for Bibliographic and Digital Services, Haverford College, Haverford, PA

"Rapid changes in the means of information access occasioned by the emergence of the World Wide Web have spawned an upheaval in the means of describing and managing information resources. Metadata is a primary tool in this work, and an important link in the value chain of knowledge economics." - Erik Duval, et al.

{A published version of this article appears in the 18:4 (2002) issue of OCLC Systems & Services.}

ABSTRACT This article describes the work of the Networked Resources and Metadata Committee (NRMC), a committee within ALA's Association for Library Collections and Technical Services (ALCTS). The article discusses NRMC's history, initiatives, and program highlights, gleaned from an interview held with the present and past chairs of the committee. Numerous details about NRMC's role within the larger metadata community are shared, as are future directions for the committee.

KEYWORDS Networked Resources and Metadata Committee; NRMC; Association for Library Collections and Technical Services; ALCTS; metadata; Ann Sandberg-Fox; William Fietzer; Mary S. Woodley

Readers of this column are aware of some key organizations dedicated to furthering metadata development. The Open Archives Initiative, the World Wide Web Consortium, and of course, the Dublin Core Metadata Initiative, among countless others, work hard to enable progress in this maturing field. The Association for Library Collections and Technical Services (ALCTS), a division of the American Library Association, is one of these dedicated groups. During the 2002 Annual Conference held in Atlanta, I had a chance to speak with the current and past chairs of the Networked Resources and Metadata Committee (NRMC), the ALCTS divisional committee established to address metadata and related issues. The charge of the NRMC, available on its web site , reads:

Recognizing that a coherent view of networked information resources and metadata issues will benefit the activities of the division, its committees, and sections, the committee is charged
1. To provide a broad framework for information exchange on current research developments, tools, and activities affecting networked information resources and metadata;
2. To coordinate and actively participate in the development and review of standards concerning networked resources and metadata in conjunction with the division's committees and sections, other units within ALA, and relevant outside agencies; and
3. To develop programs and to foster and sponsor education and training opportunities that contribute to and enhance an understanding of networked resources and metadata, their identity, content, technology, access, control, and use.

The NRMC achieves these goals through strong leadership and dedicated membership. The accomplishments of this group -- nine appointed members (including the chair) and one intern -- are remarkable, given that the committee was born just seven years ago.

SOME HISTORY The Networked Resources and Metadata Committee was established by the ALCTS Board of Directors in 1995 under the name Digital Resources Committee (DRC). Ann Sandberg-Fox chaired the committee from 1995-2000, and two other ALCTS members have since held the reins: William Fietzer (2000-2001) and Mary S. Woodley (2001-present). The DRC was charged to "study and address issues regarding digital resources of all kinds," and to collaborate with the ALCTS Audio Visual Committee so as to not have overlapping scopes (Sandberg-Fox, 1996). As Ann notes, it was not easy convincing the ALCTS Board to create the DRC. "The committee had to fight to get its point across. In the end however, enough far-sighted people believed in us, and the committee was established." The original charter for the Digital Resources Committee included a sunset clause stating that the committee would be disbanded after the 2000 annual conference. In 1998 however, the ALCTS Board asked the DRC to review its charge and modify its name to reflect the increase in metadata development occurring worldwide. In response to the Board's request, Ann formed a working group to review the DRC's charge and to consider a new name. In her report to the Board, Ann proposed the name "Networked Resources and Metadata Committee," citing the following reasons (Sandberg-Fox, 1998): The term 'networked resources' applies, as its name suggests, to those resources that are available via the Internet, World Wide Web, and other particular network environments. It is these materials that have been the focus of the DRC's program of work... In addition, the working group felt it important that 'metadata' be included in the proposed name change ... [because of] its association with the networked resource environment where it was developed and has been implemented.

WHAT'S IN A NAME? Reconstituting the Digital Resources Committee as the Networked Resources and Metadata Committee broadened the already ambitious mission of this group, yet also provided it the stability of a long-term contract within ALCTS. The sunset clause was abolished, and although the terrain that the NRMC faced was in many ways unknown, it set out to achieve the goals identified in its charge.

NRMC Chairs: 1995-2000, Ann Sandberg-Fox, Cataloging Consultant & Trainer; 2000-2001, William Fietzer, University of Minnesota; 2001-present, Mary S. Woodley, California State University - Northridge.

Subcommittees were formed, including the Standards Subcommittee, which drafted the Standardized Handling of Digital Resources: An Annotated Bibliography, commonly referred to as the Standards Bibliography.
“Cecelia Preston initiated the Standards Bibliography,” says Ann. “It was her idea, and she worked extremely hard to develop it. Thanks to Bill and Mary, it continues to be a substantial contribution.” As noted in the document, available on the NRMC’s web site at , the Standards Bibliography “provide[s] a source of relevant information as it pertains to the issues facing ALCTS members as they address the collection, cataloging and provision of access to digital resources.” This resource is updated frequently, and as Mary notes, “it will soon include items on metadata transport protocols, as well as descriptive, administrative, technical, and preservation metadata standards. The Standards Bibliography will be a more timely and relevant document.” The work of mounting a lengthy and informative resource on the Web, although a common practice today, should not be overlooked. According to William, “the NRMC was one of the first committees to put materials up on the Web,” with the Standards Bibliography first appearing on July 25, 1997, and meeting minutes appearing as early as January, 1996. THE STRUCTURE Although the Standards Subcommittee was productive, the NRMC was overloaded with subcommittees by the time William was appointed chair in 2000. He restructured the NRMC by exchanging subcommittees for task forces, and removed many of the formal liaison relationships with other committees, the exception being the liaison to the Cataloging and Classification Section's Committee on Cataloging: Description and Access (CC:DA). As William notes, “not having such a formal structure can be beneficial.” Indeed, NRMC task forces have offered the committee the flexibility of convening to work on specific projects, and disbanding when these projects have been completed. Moreover, informal reports from NRMC members who attend other meetings serve the information needs of the group. As Mary notes, “as a divisional committee, we have the burden of making sure we coordinate with other committees within ALCTS.” NRMC PROGRAMS If you are interested in metadata, and regularly attend ALA annual conferences, chances are you have attended an NRMC program. Some past programs include: • "Intelligent Agents and the Digital Future," presented at the 1997 Annual Conference in San Francisco - this program’s featured speaker was Clifford Lynch, then the Director of Library Automation at University of California • "Security and the Digital Library: A Look at Authentication and Authorization Issues," presented in New Orleans in 1999 - Donald Waters, then of the Digital Library Federation, was part of a panel that explored issues of information integrity and user identity in an online environment • "Fish, Fungus, and Photos: Librarians as Metadata Collaborators," the most recent program, presented in Atlanta in 2002 - a panel of speakers, including Bill Garrison of the Colorado Digital Library Project, presented viewpoints on metadata collaboration among libraries and non-library communities, such as museums and scientific institutions The 2003 Annual Conference in Toronto will feature a two-day preconference metadata institute titled “Knowledge Without Boundaries” that will bring together an array of international presenters, including speakers from China, Canada, and Germany. In addition, a program will be held on the Open Archives Initiative Protocol for Metadata Harvesting that will feature some of the most highly-regarded practitioners of this innovative application. 
THE FUTURE OF THE NRMC "It's a big challenge keeping the committee a little ahead of the changes taking place," comments William. Nowhere is this more prevalent than in the Standards Bibliography, where maintenance of the acronym list is a job unto itself. "We're fortunate to have people who are not members of the NRMC interested and willing to help out," says Mary. "There's more work that the committee can do. We welcome them." As the committee looks towards the future, there is little time to rest. Mary has outlined ambitious goals for the year ahead. Continuing education of professionals, developing a clearinghouse for metadata, and outreach to the information community are just a few of the projects facing the NRMC. Much of this work will intersect with the Library of Congress' Action Plan, Bibliographic Control of Web Resources , which resulted from the Bicentennial Conference on Bibliographic Control for the New Millennium held in 2000. ALCTS members are heavily involved in this initiative, and clearly there is an important role for the NRMC to play. "A lot has happened in a short period of time," notes Mary. And there's plenty on the way. The author wishes to thank Ann Sandberg-Fox, William Fietzer, and Mary S. Woodley for their generous contributions to this article.

REFERENCES
Duval, E., Hodgins, W., Sutton, S. & Weibel, S. (2002), "Metadata principles and practicalities," D-Lib Magazine, vol. 8, no. 4, Available: http://www.dlib.org/dlib/april02/weibel/04weibel.html (Accessed: 2002, April 16)
Sandberg-Fox, A. (1996) "ALCTS Digital Resources Committee." Personal correspondence provided by the author on June 16, 2002.
Sandberg-Fox, A. (1998) "Report of the Working Group on the DRC Proposed New Name and Charge, June 17, 1998." Personal correspondence provided by the author on June 16, 2002.

work_jxa6xwqpqzhbrgxw7e7x6y5o3u ---- The Charleston Advisor / October 2018. ADVISOR REPORTS FROM THE FIELD. Heard on the Net: The New Deal May be No Deal. doi:10.5260/chara.20.2.57. By Jill Emery (Collection Development & Management Librarian, Portland State University). With contributions from: Irene Barbers (Head of Acquisitions, Central Library, Forschungszentrum Jülich GmbH); Lisa Lovén (Librarian, Licensing Coordinator, Stockholm University Library, Stockholm University).

With the advent of the worldwide financial downturn a decade ago, many libraries, in particular many medium-to-large academic research libraries in North America, found they could no longer afford the escalating costs associated with big deal journal packaging from major academic, commercial publishing houses. The results from the initial round of cancellations were scaled down versions of big deals. The Scholarly Publishing and Academic Resources Coalition (SPARC) has a tracking mechanism of big deal cancellations which can be found here: . There are currently around 30 instances noted on their spreadsheet, indicating where deals were reduced or switched over to ordering specific titles as requested/needed by faculty in North America. In addition, SPARC has also added where negotiations from big deal packages have failed worldwide. Jacob Nash & Karen McElfresh note in their 2016 article "A Journal Cancellation Survey and Resulting Impact on Interlibrary Loan" that there was, in fact, little to no impact on Interlibrary Lending of content that made up their big deal cancellation. (DOI: <10.3163/1536-5050.104.4.008>).
This study appears to be indicative of what many others have reported once they lose their big deal. There does not appear to be a significant upswing in ILL once a deal ends or is significantly reduced. One of the biggest concerns is whether walking away from package deals from the major academic scholarly commercial publishers will create faculty backlash and erosion of goodwill between academic librarians and the communities with which they work. No one has yet undertaken a study to explore these impacts. Have faculty chosen to work at other institutions due to the lack of scholarly content at the universities where cancellations have occurred? Has there been attrition of faculty at these institutions? Or have faculty just found other means of access to content? Daniel Himmelstein and others' article in eLife seems to indicate this may be the case. (DOI: <10.7554/eLife.32822>)

Given this fear and often great concern, my goal became to listen to librarians from Germany and Sweden about their consortial decisions regarding the biggest publisher in the mix, Elsevier. Bibsam, a consortium for Swedish academic institutions, was unable to reach an agreement with Elsevier, and their content access ended July 1, 2018 for all content published after this date. For Germany, the lack of a renewed DEAL contract has resulted in a cascading loss of access among higher education institutions and research institutions. In both Germany and Sweden, it is still very early days with their non-renewal of Elsevier deals. For Germany, a few research institutions continue to have access to content up through the end of 2018. In addition, their previous contracts supply perpetual access for the years to which they were subscribed, so backfile access for numerous titles has been retained. My two interviewees are Irene Barbers (IB), who is Head of Acquisitions, Forschungszentrum Jülich GmbH, Zentralbibliothek/Central Library, and Lisa Lovén (LL), Librarian, Licensing Coordinator, Stockholm University Library, Stockholm University. Both responded to four questions posed by me.

Do you think the landscape of scholarly literature is changing for the better and if so, can you provide examples?

IB: Scholarly publishing is defined by various interests and stakeholders. It has to serve the needs of the scientific community and of the public, and it is at the same time an industry that is determined by the specific laws of its market. This has led to controversial views, and it is fascinating to see the current developments of business and access models, and the discussion about the publishing process itself. Critical voices claim that scientists should take the business into their own hands instead of working in the traditional way with publishers. One interesting example is SciPost (), a publishing platform led and organized by scientists, and adhering to Fair Open Access principles with a nonprofit business model. This is perhaps not the only way scholarly publishing can work, but I am optimistic that more openness can help to achieve sustainability and quality and that the community can benefit from the diverse developments in ideas for scholarly publishing models.

LL: It does seem like the landscape of scholarly literature is changing into something better due to increased collaboration between libraries and different stakeholders.
While we have been looking into funding for opening up access to academic sources, we have been forced to scrutinize costs and workflows to improve them and make them easier to use. The focus of library services has been turned from acting as the provider of information to trying to attract the user of information, which is a significant development. We have also seen some change regarding harvesting and opening up access to metadata. New databases such as OAPEN, DOAB, DOAJ, and other similar services challenge libraries to update their content discovery methods and library systems, as they can no longer rely on sales representatives from publishers to suggest purchases. We can, on the other hand, not just rely on Google Scholar taking care of the content selection for us either, which helps to shine some light on how we organize information and show users how they can trust information from different sources. This process is far from mature yet, but it may end up with libraries having to put more emphasis on teaching information literacy in the future and by that strengthening the status of libraries (and librarians) as an invaluable part of the research process.

What has been the greatest achievement accomplished by your institution since the loss of the Elsevier deal?

IB: At this point in time we do not know; the deal is lost already. But since the loss of access to Elsevier content for the majority of German academic institutions, we have seen that the support among the library community is really excellent. There is a great solidarity with the DEAL negotiation goals. Perhaps even more remarkable is the feedback we are receiving from many researchers who not only express that they can cope with the loss of access but also tell us they are declining to review for Elsevier, try to avoid publishing with Elsevier or even step back from being editors in Elsevier journals.

LL: I would say that our greatest achievement was that we decided to use the money, otherwise spent on subscription costs with Elsevier, to fund publishing in fully Open Access journals. And that this decision was not delayed, but rather quickly made (between our Library Director and Vice-chancellor, both of them part of the negotiation team with Elsevier, so maybe not that surprising) and communicated already at the end of June. At Stockholm University we think the transition towards open science is too slow and using the "Elsevier-money" on fully Open Access journals is another step towards the target of 100% OA. Also, when handling requests for funding, we have tried to make it as easy as possible for our researchers. No application form, they just need to ask us by sending a simple e-mail to: . If the article meets the criteria, i.e., corresponding author affiliated with Stockholm University, to be published in a fully Open Access journal (listed in DOAJ), we take care of the APC invoice (no price caps). There are quite often pieces of information missing on publishers' APC invoices (like DOI or journal title or even the fact that this is an Open Access fee), but when collecting invoices from authors we don't expect them to debate the design of the invoice with the publisher. There are other forums for that discussion; for now, we're just happy to get our hands on the invoice and spend the money as decided.
Quite a few other Swedish HEIs are talking about doing the same; most recently, Linköping University are now doing the same (), which is really promising.

What has been the greatest challenge at your institution since losing current access to Elsevier content?

IB: We are one of the few libraries in Germany who at the moment still have access to their subscribed Elsevier journals (with the exemption of Cell Press journals), as our contract is running through to the end of 2018. So we are one of the institutions that are supporting those without access in terms of document delivery rather than having to deal with lost access ourselves. We had prepared ourselves for an increased demand for articles from other libraries, but contrary to what one would perhaps expect, the demand is surprisingly low!

LL: Again, I must mention the "Open Access fund" since the greatest challenge has been to communicate the offer to our researchers and make them aware of this possibility. There are about 1,700 doctoral students and 5,000 staff at Stockholm University, and it is not that easy to reach all of them, but we are planning to repeat the message through different channels. It is still early days and the offer came out during the holiday season, only two months ago. Now with the new semester and our planned communicative actions we hope to increase the number of requests for funding. And, with the requests, also the number of open access articles published at SU, funded by the library. Since we monitor all APCs paid at the university (within the local accounting system) I keep a close watch on the ones for fully Open Access journals and can detect payments still made by the individual institutions/authors. In other words, which ones that didn't know about the centralized funding via the library. I haven't gotten the August report of APC invoices yet (still being August), but as soon as we have it I will check the payments and also be able to contact the corresponding author and/or the institution with our offer if needed. (We cannot refund payments already made, especially not credit card payments, but hopefully, they will send the invoice to us for future accepted articles in OA journals.) Up until now, we have had about 40 requests, a little over half of them for funding of OA articles in hybrid journals. Those we have declined, explaining why (because of hybrid journal) with a link to our news article: .
A few patrons have turned to us saying they cannot access a specific journal, but in those cases they were unaware of the cancellation and the reasons behind it. When we explain, again with links to news about the Elsevier cancellation, none has replied upset.

How are you providing perpetual access to content that was previously subscribed to from Elsevier?

IB: Our library's agreement with Elsevier will run out at the end of 2018. In case there is no DEAL agreement by then, our plan is to hold on to an absolute minimum of subscriptions from Elsevier (that means only a very few journals, or perhaps a small e-book collection), and so we will continue to have access to the Elsevier platform and therefore to the content we previously subscribed to, with rights for perpetual access.

LL: As said, it is still early days, and the reason behind not having any complaints so far might have to do with the fact that we can still access all previously subscribed content (the Freedom Collection, Cell Press titles and a few other titles, depending on local institutional additions to the cancelled agreement) on ScienceDirect published between 1995 and the end of June 2018, this for the majority of the titles. In other words, only content published from the first of July 2018 can no longer be accessed, thanks to the PTA clauses in the cancelled agreement. For details, please see the Bibsam Q&A. Adding to this is access to backfiles (pre-1995) purchased years back. As for content published from the first of July, we offer the Get It Now service and have noticed a slight increase in Elsevier articles ordered. We also inform our patrons about alternative ways to find content.

––––––

It became clear to me, after my discussions with Irene Barbers and Lisa Lovén, that in North America we appear to be at a turning point in regards to traditional big deal purchasing. Big deals have become condensed deals. Many academic librarians, trying to find ways to reduce the costs they expend annually, have reduced their title bases and reconfigured their packages. We can no longer afford not to do so. At some point in the near future, we'll realize that the best deal is no deal at all.

FROM YOUR MANAGING EDITOR, continued from page 3: ... such as EBSCO, and a variety of other solutions, such as LEAN Library (now owned by SAGE). CASA (Campus Activated Subscriber Access) by Google Scholar, in collaboration with hosting platforms such as HighWire Press, and others provide new ways to eliminate access barriers for researchers.

Seventeenth Annual Readers' Choice Awards

Lemon/Vaporware Award

MLA International Bibliography: The Modern Language Association (MLA) recently signed an exclusive contract to offer the very popular MLA International Bibliography only on the EBSCO platform. This has upset librarians worldwide, as access was previously provided on other distribution platforms such as ProQuest and Gale. Many libraries tend to prefer one platform over another, and popular discovery layers such as Primo or Summon (both from ProQuest) will no longer have access to the content from this database.

Tipasa: This new cloud-based Interlibrary Loan (ILL) manager from OCLC has not been received well by many in the ILL community. Some feel it was brought to market too early by OCLC, and many are not happy that the most favored product in the market, ILLiad, will be retired by OCLC. Not ready for prime time.

Best Effort

Taylor and Francis: Kudos to T&F for pulling back on their plan to introduce a 20-year rolling (moving) wall on their periodical backfiles. This plan was strongly opposed by librarians around the world, as it would have created extra work and expense for librarians and researchers to access older content which had already been licensed.

work_k6vrtbeyk5dphdlc2oytwa7xiu ----

Information & Communications Technology Law, Vol. 18, No. 1, March 2009, 39-74

Preserving and ensuring long-term access to digitally born legal information

Sarah Rhodes (a) and Dana Neacsu (b)*

(a) Georgetown Law Library, Washington, DC, USA; (b) Columbia Law School Library, New York, NY, USA

Written laws, records and legal materials form the very foundation of a democratic society. Lawmakers, legal scholars and everyday citizens alike need, and are entitled, to access the current and historic materials that comprise, explain, define, critique and contextualize their laws and legal institutions. The preservation of legal information in all formats is imperative.
Thus far, the twenty-first century has witnessed unprecedented mass-scale acceptance and adoption of digital culture, which has resulted in an explosion in digital information. However, digitally born materials, especially those that are published directly and independently to the Web, are presently at an extremely high risk of permanent loss. Our legal heritage is no exception to this phenomenon, and efforts must be put forth to ensure that our current body of digital legal information is not lost. The authors explored the role of the United States law library community in the preservation of digital legal information. Through an online survey of state and academic law library directors, it was determined that those represented in the sample recognize that digitally born legal materials are at high risk for loss, yet their own digital preservation projects have primarily focused upon the preservation of digitized print materials, rather than digitally born materials. Digital preservation activities among surveyed libraries have been largely limited by a lack of funding, staffing and expertise; however, these barriers could be overcome by collaboration with other institutions, as well as participation in a large-scale regional or national digital preservation movement, which would allow for resource-sharing among participants. One such collaborative digital preservation program, the Chesapeake Project, is profiled in the article and explored as a collaborative effort that may be expanded upon or replicated by other institutions and libraries tackling the challenges of digital preservation.

Keywords: right to information; digital legal information; digital preservation; legal archives

Introduction

Legal information and risk: history and background

Nearly two centuries ago, American Founding Father James Madison wrote:

[A] popular Government, without popular information, or the means of acquiring it, is but a Prologue to a Farce or a Tragedy; or, perhaps both. Knowledge will forever govern ignorance: And a people who mean to be their own Governors, must arm themselves with the power which knowledge gives. (Madison, 1910, p. 103)

*Corresponding author. Email: dneacsu@law.columbia.edu
ISSN 1360-0834 print / ISSN 1469-8404 online. © 2009 Taylor & Francis. DOI: 10.1080/13600830902727905. http://www.informaworld.com

Madison helped author the United States Constitution and draft the Bill of Rights, two historic legal documents that continue to influence conceptions of basic human rights throughout the world. As his statement affirms, among those rights, access to information is fundamental; it is the cornerstone of a society built upon ideals of egalitarianism and popular governance. This notion is as true today as it was in the early days of the American Republic.

The body of current and historic legal works that comprise, explain, define, critique and contextualize society's laws, legal institutions and systems of justice represents a vital portion of the human record. These materials stand as the underpinnings of democracy, and access to legal information is essential for law practitioners, lawmakers, jurists, legal scholars and ordinary citizens alike. To ensure that this information remains accessible to current and future generations, it must be preserved. In a day and age characterized by the proliferation of information in digital formats, this task requires new strategies and new ways of thinking about preservation.
The written record, including our laws and legal documentation, has always been in danger of destruction, whether by natural or human means (Deegan & Tanner, 2006). The British burning of the Capitol building in 1814, for example, destroyed the original 3000 volumes of the Library of Congress (International Review, 1878). In 1966, the Arno River escaped its banks in the city of Florence, destroying masterworks of art as well as thousands of books, manuscripts and legal records in the collections of the Biblioteca Nazionale and Archivio di Stato (Burlington Magazine, 1967). More recently, the war in Iraq resulted in the tragic burning and looting of the National Library of Iraq (Kniffel, 2003), and in New Orleans, state and local officials continue to wrestle with the recovery and rescue of government records and documents following Hurricane Katrina in 2005 (NASCIO, 2007).

Human pursuit of profit has also put printed information at risk; in the latter part of the twentieth century, it was discovered that books published from the mid-1800s onward were deteriorating at an alarmingly rapid rate, the result of inexpensive mass-printing processes using highly acidic paper. To ensure that these materials and their content would be available for future access, libraries took action to preserve them, including treatments using acid-free materials, controlled environmental conditions and conversion to archival microforms (Anon., 1980).

At the present time, a new threat to our written heritage has surfaced, demanding immediate attention and action. The twenty-first century has witnessed not only the emergence of information created and disseminated in digital formats, but also an unprecedented mass acceptance and adoption of digital technologies, resulting in a flood of digital information that is expanding exponentially. In fact, for the first time in history, the amount of digital information 'created, captured or replicated' in the year 2007 alone, which equals about 281 billion gigabytes, surpassed the world's existing electronic storage capacity (International Data Corporation, 2008). Within a few short years, by 2011, the International Data Corporation expects the digital universe to increase 10 times over.

Not surprisingly, important legal materials are increasingly being digitally born and then distributed online rather than published on paper. Since the mid-1990s, for example, the number of government documents distributed in digital, as opposed to print, formats by the United States Government Printing Office (GPO) has ballooned. Federal and state agencies over the past decade have likewise produced a growing number of digitally born documents and reports, which have been posted directly to the Web (Lyons, 2006). Court opinions are now being published online, and legal scholarship increasingly relies on digitally born sources, identified only by a Uniform Resource Locator (URL), or website address, directing one to an online document (Rumsey, 2002; Neacsu, 2007). In fact, what has been called the 'disintermediation of legal scholarship' through collaborative and open-access Web-based publishing is having a noticeable impact on the practice and study of law in the United States (Solum, 2006, p. 1071).
Articles and commentary posted on legal Web logs (blogs, or 'blawgs'), for example, have been cited in many prestigious law reviews as well as in cases argued before state and federal courts, including the United States Supreme Court (as cited in Solum, 2006).

Exploring the risk of digitally born legal information published online

The appeal of digital formats and online publishing to entities responsible for creating and distributing legal information is understandable. Digital materials are more compactly stored, easily transportable, widely distributable and instantly accessible than information produced in any previous format or medium. However, the proliferation of digital information poses an extraordinary challenge for professionals in the fields of law and legal informatics who are concerned about the long-term preservation of legal materials. Although printed materials are produced in limited quantities and vulnerable to fires, floods and physical deterioration, digitally born materials without printed counterparts are, in fact, at tremendous risk of permanent loss. While the content of a printed item can be read directly by the human eye, accessing materials in digital formats is an indirect process requiring sophisticated retrieval technology. As older technology is rendered obsolete by the emergence of new technology, so, too, are the corresponding older digital formats. Thus, digital information existing in obsolete formats, without the appropriate preservation treatment, can be lost forever (Deegan & Tanner, 2006).

Additionally, a large store of knowledge has accumulated over the years about the lifespan and degradation of printed materials. As a result, archival specialists are able to safeguard both the intellectual content and physical packaging of printed items. The longevity of digital media, however, remains uncertain. For example, compact disks, by one estimation, have a physical lifespan of anywhere between five and 59 years, and a given format's descent into obsolescence, on average, can be expected to occur within five to 20 years (Rothenberg, 1999; Holdsworth, 2006). Whereas materials printed on acidic paper in the nineteenth and twentieth centuries were threatened by an abbreviated lifespan of less than 50 years, the twenty-first century's digitally born information could be rendered inaccessible within five short years of its creation.

Within the realm of digital information, the transient quality of legal information published directly to the free Web (as opposed to within subscription databases), often by government and independent entities, is troubling. Documents, reports and other legal information published online can be unexpectedly and permanently lost as files are removed and URLs are changed or inactivated through routine and seemingly innocuous website maintenance activities. A study by Robert Lopresti and Marcia Gorin (2002) found that among a sample of government agency documents removed from the Web, roughly half had been replaced by newer versions or volumes, while the rest had vanished completely. The non-profit Internet Archive (n.d.), creator of the Wayback Machine Web archive, estimates the average lifespan of a webpage to be between 44 and 75 days. Yet, the challenges of preserving and providing sustainable access to digitally born legal information are not insurmountable. Tools and standards for best practices in digital preservation have been developed.
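The link-rot problem described above is straightforward to measure. What follows is a minimal sketch, not drawn from any of the studies cited here, of the kind of audit a library might run over a list of cited URLs to flag documents that have moved or vanished; it uses only the Python standard library, and the URLs shown are placeholders.

```python
# Minimal link-rot audit sketch (illustrative only; URLs are placeholders).
import urllib.request
import urllib.error

def check_url(url, timeout=10):
    """Return an HTTP status code, or a short note if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status          # 200 means the page still resolves
    except urllib.error.HTTPError as error:
        return error.code                   # e.g. 404: the document has vanished
    except urllib.error.URLError as error:
        return f"unreachable ({error.reason})"

if __name__ == "__main__":
    cited_urls = [
        "https://www.archive.org/",             # placeholder for a cited source
        "https://example.org/report-2008.pdf",  # placeholder for a fugitive document
    ]
    for url in cited_urls:
        print(check_url(url), url)
```

A HEAD request avoids downloading the full document; a real audit would also follow and log redirects, since a 'soft 404' error page can return status 200 even though the original content is gone.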
Digital repositories and Web harvesters are available for a fee or for independent development using open-source software. Many libraries have experimented with and implemented this technology within their own institutions.

Statement of the problem and research objective

This article aims to advance the dialogue about the need for successful, sustainable digital preservation programs to ensure ongoing access to our legal heritage, and to advocate for increased collaboration among law libraries in digital preservation efforts. At present, the status of digital preservation activities at state and academic law libraries is unclear. Of particular concern are projects aimed at preserving digitally born legal information published on the free Web: materials that have been demonstrated to be among the most ephemeral and at-risk component of the United States' legal heritage. The objective of the present project is twofold: first, to survey the United States state and academic law library community about its digital preservation efforts, and second, to profile a collaborative, multi-institution project currently underway to preserve digitally born legal information. The digital preservation survey seeks to answer the following questions:

(1) To what extent are law libraries participating in digital preservation activities?
(2) What are the law library community's attitudes and perceptions about digital preservation and the vulnerability of digitally born legal materials?
(3) To what extent are law libraries collaborating in digital preservation projects and programs?
(4) What factors are limiting or fostering law libraries' participation in digital preservation activities?

In addition, a case study is presented of the Chesapeake Project: a collaborative pilot digital preservation program being implemented by the State Law Library of Maryland, the State Law Library of Virginia and the Georgetown Law Library, under the auspices of the Legal Information Preservation Alliance (LIPA). The pilot project is focused solely on harvesting and preserving legal information currently available in digital formats on the free Web, with the goal of evolving into a nationwide preservation program for digitally born legal materials. The Chesapeake Project is explored as a collaborative effort that may be replicated by other institutions and libraries tackling the challenges of digital preservation.

Literature review

Introduction

To provide a cursory overview of current digital preservation strategies, and also to determine which entities in the United States have taken responsibility for the preservation of various categories of digitally born legal information, a comprehensive review of both legal and non-legal literature, including the library and information science literature, was conducted. Searches using academic resources and finding aids (such as library catalogs and article indexes), full-text database searches and search-engine queries were conducted. In addition to these sources, articles and reports of interest appearing on subscription electronic mailing lists and online news sites, which linked to primary and secondary source materials, were also tracked, allowing for the analysis of works authored by private and governmental institutions.

Digital preservation overview

A number of digital preservation strategies have been introduced and explored since the issue emerged.
Kevin Bradley (2007), in a comprehensive article exploring contemporary notions of digital sustainability, affirms that migration and emulation continue to represent two of the most viable digital preservation strategies. Migration is a process by which information is successively copied from old formats onto different or newer formats, thereby ensuring the content remains accessible by staying one step ahead of technology obsolescence. Emulation is a preservation strategy involving the development of new programs and applications that emulate, or mimic, the functionality of outdated programs and applications. Information stored in obsolete formats can thus be retrieved using these programs, theoretically allowing the user to access the digital item in its original format, with its original significant properties intact.

Today, most experts in the field have shifted the thrust of their efforts away from mulling over specific digital preservation strategies and toward the active development and implementation of sustainable 'metadata, standards and architecture' (Bradley, 2007, p. 161). Undoubtedly, migration, emulation or other to-be-developed preservation actions will be taken to ensure ongoing access to preserved items. However, many unforeseen factors and technologies will impact these future decisions. In the meantime, digital items are being archived in standards-compliant digital repositories alongside the appropriate technical, structural, administrative and descriptive preservation metadata, which will guide the extraction and rendering of the archived digital objects for whatever future preservation action is taken.

Legal information and collaborative digital preservation efforts

Archiving and preserving digitally born legal materials for sustainable, permanent access requires a commitment of staffing, infrastructure and budget monies. It is important to understand that the successful preservation of these materials depends upon the stable, ongoing commitment of law libraries and institutions themselves, just as much as it depends upon their technological assets and digital repository systems. In fact, one of the core requirements for digital archives put forth by the Center for Research Libraries (2007) is a 'demonstrat[ed] organizational fitness (including financial, staffing structure and processes) to fulfill its commitment'. The costs associated with digital preservation are high, which can threaten the stability and long-term viability of a digital archiving program. As such, it is worthwhile for law libraries to collaborate in digital preservation activities in order to share costs, workloads and overall responsibility for sustainable access to digitally born legal materials.

There has been a movement to encourage governmental, organizational and institutional collaboration in the preservation of legal materials in digital formats. In 2000, the United States Congress established the National Digital Information Infrastructure and Preservation Program (NDIIPP), which is administered by the Library of Congress, and authorized up to US$100 million to fund the initiative, with the goal of establishing a collaborative 'national network of entities committed to digital preservation and that are linked through a shared technical framework' (LeFurgy, 2005, p. 164).
In 2003, the Legal Information Preservation Alliance, formed under the auspices of the American Association of Law Libraries, was established to begin to address the need for a collaborative national agenda for the preservation of legal literature, in both print and digital formats, by supporting and advancing such efforts (Legal Information Preservation Alliance, n.d.b). Yet, despite these strides toward collaborative digital preservation programs, two recently published studies, one by NASCIO (2007) and the other sponsored by the Center for Technology in Government (Pardo et al., 2006), found discouraging levels of fragmentation, inconsistency and lack of standardization in the digital preservation of state government information. Each report advocated increased institutional collaboration, on a regional and a nationwide scale, as a possible solution to ensure the sustainability of digital preservation programs.

Who is preserving digitally born legal information?

The professional literature reflects a variety of digital preservation projects, varying in size and approach, aimed at providing sustainable access to various types of legal information. Articles discovered were reviewed to discern the type of legal materials preserved (primary or secondary), the type of institution preserving the material and the preservation strategy implemented. Special attention was also paid to project-oriented collaborative efforts involving federal government institutions, academic libraries, law libraries, and state libraries and archives.

Regarding the preservation of legal publications produced by the United States Government, the Government Printing Office (GPO) is required by Title 44 of the United States Code to provide permanent public access to information and documents created and funded by the federal government. In an electronic world, achieving such permanent public access necessitates digital preservation, and Title 44 specifically mandates that the GPO maintain a directory, a system of access and an electronic storage facility for federal electronic information (Lyons, 2006). In 2003, the GPO and the National Archives and Records Administration (NARA) entered into an agreement whereby the GPO was made a NARA archival affiliate, designating NARA as the legal custodian of the content made available through GPO Access, the GPO's free public website, while conferring physical custody, permanent public access and preservation responsibilities upon the GPO (US Government Printing Office, 2003). However, a preservation challenge for the GPO persists in the form of fugitive documents, elusive government publications published directly to the Web, which have not been cataloged, indexed or archived by the GPO. As of 2003, fugitive documents were conservatively estimated to comprise 'about 50% of the universe of Federal printing' (Baldwin, 2003). Currently, the GPO is working toward the release of the Federal Digital System (FDsys), an impressive new initiative that 'will allow federal agencies to easily create and submit content that can then be preserved, authenticated, managed and delivered upon request' (US Government Printing Office, n.d.). This system is expected to minimize the problem of fugitive documents and is being released in increments, with its first public release planned for late 2008.
The GPO has also entered into formal agreements with libraries and government agencies to ensure permanent public access to electronic publications (Kumar, 2006). For example, the GPO, in collaboration with the United States Commission on Civil Rights, has conferred upon the University of Maryland's Thurgood Marshall Law Library the responsibility to store and ensure permanent public access via the Internet to an electronic collection of historic United States Commission on Civil Rights documents held in the law library's collections (Commission on Civil Rights, 2007). A different type of collaborative effort between the GPO and an academic library is described by Atifa Rawan and Cheryl Knott Malone (2006) in their account of a University of Arizona Library project implemented in conjunction with the GPO's Library Programs Service, which involved the identification of digital government documents by way of a 'virtual depository model'. The GPO acted as a partner in the program by providing legal and organizational support, as well as indexing services for the project, while resolving issues relating to fugitive documents and broken permanent URL (PURL) links discovered by the University of Arizona Library.

In early 2006, Yale University Library's Government Documents and Information Center (GDIC) librarians initiated an independent pilot project to migrate at-risk government information recorded on more than 3000 CD-ROMs acquired through the GPO's Federal Depository Library Program (FDLP) to a stable server environment, along with preservation of metadata records (Gano & Linden, 2007). The migration process was found to be costly and time-consuming, resulting in the successful migration of only 13 CD-ROMs. Project coordinators concluded that the challenge of ensuring the preservation and long-term accessibility of government information distributed on legacy digital formats could not be addressed by a single institution alone. However, through a collaborative effort, FDLP libraries could take responsibility for the preservation of materials 'suited to each institution's interest and expertise' while avoiding duplication of efforts (Gano & Linden, 2007). A schematic sketch of this kind of migration bookkeeping appears at the end of this section.

The University of North Texas Libraries have played an active role in the preservation of and provision of access to federal and state government information, notably with the CyberCemetery collection, which provides permanent public access to the websites and publications of defunct United States Government agencies and commissions (Glenn, 2007; Murray & Phillips, 2007; University of North Texas Libraries, n.d.a). As part of the Federal Depository Library Program, the University of North Texas Libraries created this archive in partnership with the GPO, and it has been cited in the literature as a successful effort in the area of preserving and ensuring access to online government information that would have otherwise been lost (Glenn, 2007; Murray & Phillips, 2007). Another worthy permanent public access project is the University of North Texas Libraries' Congressional Research Service (CRS) Reports collection, comprising CRS reports, which are not made readily available to American citizens by CRS. The reports in this collection are harvested from the Web, archived and then made freely accessible via the University of North Texas website (University of North Texas Libraries, n.d.b).
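Several of the efforts above, from the migration strategy Bradley describes to Yale's CD-ROM pilot, pair each converted object with preservation metadata so that future custodians can verify and trace it. The sketch below is illustrative rather than a depiction of any project's actual workflow: it records fixity (a SHA-256 checksum) and basic technical metadata for the source and target of a migration, loosely in the spirit of the event data a PREMIS-style record would carry. The file names and tool string are hypothetical.

```python
# Illustrative bookkeeping for a format migration: capture fixity and basic
# technical metadata for both objects, then log the migration event so the
# object's history can be audited later. Not a full metadata implementation.
import hashlib
import json
import os
from datetime import datetime, timezone

def describe(path):
    """Capture minimal technical preservation metadata for one file."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "file": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": digest,
        "recorded": datetime.now(timezone.utc).isoformat(),
    }

def log_migration(source_path, target_path, tool):
    """Write a small JSON record linking the original and migrated objects."""
    event = {
        "event": "migration",
        "source": describe(source_path),
        "target": describe(target_path),
        "tool": tool,  # converter used, recorded for future audits
    }
    with open(target_path + ".migration.json", "w") as f:
        json.dump(event, f, indent=2)

# Hypothetical usage, after converting report.wpd to report.pdf with an
# external tool:
#   log_migration("report.wpd", "report.pdf", "office-converter 7.6")
```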
The Library of Congress-administered National Digital Information Infrastructure and Preservation Program (NDIIPP) has established projects in collaboration with university libraries, state libraries and archives, consortia, nonprofit electronic archiving services and educational organizations to address the preservation needs of a number of priority, at-risk digital materials, including digital geospatial data produced by state and local government entities, political and cultural websites, noncommercial foreign- and American-based public television programming, social science datasets, electronic scholarly journals, historic archival materials and state government information (LeFurgy, 2005; Library of Congress, n.d.a).

Valerie Glenn (2007) describes the NDIIPP-funded Web-at-Risk project, which involves harvesting and then preserving digitally born legal information published online. The project is led by the California Digital Library, in collaboration with partners at the University of North Texas and New York University, and is developing and providing tools and services allowing librarians to 'capture, curate and preserve Web-based government and political information' (Glenn, 2007; California Digital Library, n.d.; Web-at-Risk, n.d.). NDIIPP has also launched a Multistate Government Digital Information Program as a collaborative 'effort to facilitate the development of digital preservation partnerships among states and territories', involving libraries, archives, records management units and other information agencies (Pardo et al., 2006, p. 1; Library of Congress, n.d.b).

In addition to its administration of NDIIPP projects, the Library of Congress has also been involved in the development of a related series of thematic Web Capture collections, involving the acquisition and preservation of websites of historical significance (Library of Congress, n.d.c). Many of the Library's Web Capture collections are highly relevant to legal scholars; these include collections of archived legal blogs, election websites, an 11 September Web archive, websites of members of Congress and congressional committees, online reports relating to the crisis in Darfur, websites relating to the nomination and appointment of Supreme Court justices, and online reports and commentary relating to Hurricane Katrina.

Among academic law libraries, the Cornell Law Library has played an active role in the preservation of website-based legal information. Claire Germain (2002) advocates the use of Web mirror sites as a relatively inexpensive option for the preservation of and long-term access to official legal information published online. In collaboration with producing organizations and governmental entities, entire collections can be loaded onto library servers and archived at regular intervals; a toy sketch of this snapshot approach appears below. The Cornell Law Library has used this strategy to create open-access Web mirror sites for the International Labour Organization and the International Court of Justice, which issue court decisions, treaties and other primary documents (Germain, 2002).

Many digital preservation projects involving legal materials focus on the preservation of primary and government resources. In response to this trend, some law librarians have called for academic law libraries to embark upon preserving secondary sources in the form of digitally born legal scholarship, especially online sources cited by scholars and identified only by URLs in the legal literature.
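The mirror-site idea Germain describes can be reduced to a simple snapshot routine: fetch an official page on a schedule and keep dated copies, so the collection can still be served if the source site changes or disappears. The sketch below is a toy under those assumptions, not Cornell's actual workflow, and the URL is a placeholder.

```python
# Toy snapshot mirror: keep one dated copy of a page per harvest run.
import pathlib
import urllib.request
from datetime import date

def snapshot(url, archive_dir="mirror"):
    """Save today's copy of `url` as mirror/YYYY-MM-DD.html and return the path."""
    directory = pathlib.Path(archive_dir)
    directory.mkdir(exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as response:
        content = response.read()
    out = directory / f"{date.today().isoformat()}.html"
    out.write_bytes(content)  # later runs add new dated snapshots
    return out

if __name__ == "__main__":
    # Placeholder address standing in for an official decisions index page.
    print(snapshot("https://example.org/decisions/index.html"))
```

A production mirror would also capture linked pages and embedded files, which is what dedicated Web harvesters, such as the tools developed by the Web-at-Risk project, are designed to provide.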
Dana Neacsu (2007), building upon her previous work, as well as other librarians' scholarship on the new challenges posed by Web citations, has offered up potential solutions to this issue, which involve preservation and access systems (Neacsu, 2002; Rumsey, 2002). Although librarians and the law library literature are increasingly advocating the preservation of digitally born sources, academic law libraries themselves appear hesitant to devote institutional resources to such projects involving the preservation of this body of legal scholarship and secondary sources of law.

Survey

Introduction and background

In an effort to gain insight into law libraries' digital preservation activities, preservation priorities, attitudes and perceptions about digital preservation, and collaborative digital preservation efforts, the authors devised an online survey and distributed it to selected members of the law library community. While the literature demonstrates a general awareness among the library community of digitally born materials' heightened risk of loss, as well as the need for preservation strategies to ensure sustainable access to digital materials, there remains some uncertainty as to how individual law libraries are addressing these challenges, what factors have impacted their activities and what they perceive their roles to be in this area.

In 2005, the Legal Information Preservation Alliance (LIPA) sponsored a survey of the state and academic law library community about their digital and analog preservation efforts. Of 41 respondents, only 19 had been involved in preservation projects involving legal materials in any format. Six respondents indicated that their institutions were involved in archiving digital publications that had been downloaded from the Internet. However, in describing their preservation strategy, five indicated that these digital documents were downloaded, printed, bound and added to their physical collections rather than stored in a digital archiving system (Breeze, 2005).

Several issues have emerged as factors inhibiting libraries' efforts in digital preservation, but primary among these is a lack of resources, both human and financial. The Center for Technology in Government in 2006 sponsored a baseline survey on the status of institutional efforts to preserve digital state government information, and found that the greatest barrier to participation in digital preservation activities was a lack of staffing and funding (Pardo et al., 2006). Beyond the law library community, a digital preservation survey by the Museums, Libraries and Archives Council found that 'no long-term funding is being provided for the management of digital material' in institutions based in the United Kingdom (Simpson, 2005, p. 1). A similar study of libraries, archives and museums in the United States, conducted by the Northeast Document Conservation Center, found that a large number of surveyed institutions had 'no or low levels of institutional funds allocated for creation, acquisition, management or sustainability of digital collections' (Clareson, 2006). In addition to lack of funding and staffing, these surveys have demonstrated that a dearth of expert personnel trained in digital preservation has limited institutions' ability to participate in digital preservation activities (Clareson, 2005; Simpson, 2005; Kenney & Buckley, 2005; Pardo et al., 2006).
Moreover, it has been shown that, in some cases, digital preservation is neglected because it simply is not an institutional priority (Kenney & Buckley, 2005). When it comes to the issue of preserving legal information specifically, there is a broad perception that the government, rather than law libraries, should be involved in such projects; after all, it is the government's mandate to publish the law and make it available to the public in a liberal democracy, where the rule of law requires the public to be informed (Pardo et al., 2006). The GPO has been involved with impressive efforts to ensure perpetual access to federal government information in the United States. Yet, the library's traditional role encompasses collecting information, making that information accessible to users and preserving the information in its collections for future access. As traditional stewards of information, the Library of Congress and other libraries throughout the United States have implemented some of their own projects protecting primary as well as some secondary sources. Yet, how exactly do state and academic law libraries view their roles in this enterprise? The preservation of which materials, primary or secondary, if any, and in which formats, print or digitally born, do state and academic law libraries believe to be within their institutional role? The authors surmised that law library digital preservation projects have been more focused on the preservation of digitized print materials, as opposed to digitally born materials, and that few law libraries were actively harvesting content from the Web. Moreover, it was suspected that the type of material (primary, as opposed to secondary legal sources) made a policy difference in terms of digital preservation priorities.

Methodology

Research questions

The survey sought to address the following research questions:

(1) To what extent are law libraries participating in digital preservation activities?
(2) What are the law library community's attitudes and perceptions about digital preservation and the vulnerability of digitally born legal materials?
(3) To what extent are law libraries collaborating in digital preservation projects and programs?
(4) What factors are limiting or fostering law libraries' participation in digital preservation activities?

Survey instrument

The authors developed a rather detailed, mid-length, Web-based questionnaire (see Appendix). It included 45 questions, which were designed to gather data related to law libraries' digital preservation activities, policies and priorities. The survey, titled 'Law Libraries & Digital Preservation: A Survey', was divided into the following five sections:

(1) Welcome (a brief introduction and instructions)
(2) Demographic information (4 questions)
(3) Digital-preservation activities (23 questions)
(4) Perceptions and attitudes toward digital preservation (8 questions)
(5) Copyright and access policies for archived digital materials (10 questions)

The survey was created and made available online using SurveyMonkey, survey software that was also used by the Center for Technology in Government's baseline survey of the preservation of state government information (Pardo et al., 2006). This survey development and distribution service was chosen after experimenting with alternate survey formats, including a PDF form designed to be submitted electronically, which was determined by a trial user in the target population to be too unwieldy and difficult to submit.

An assortment of question types was used in the survey: four multi-item Likert questions, 18 close-ended questions and 23 open-ended questions. It is worth noting that participants were not required to complete all survey questions, and that many of the open-ended questions included instructions asking for brief responses. Most open-ended questions were optional and designed to give respondents an opportunity to explain their answers or reactions to the preceding question. Before issuing the survey and publishing it online, the authors sought feedback on its content, design and usability from other law library professionals. An academic law library director and a law library government affairs representative reviewed the survey instrument, and their recommendations were incorporated into the final survey prior to distribution.

Survey distribution

State and academic law library directors comprised the survey's target population. This group was targeted in an effort to obtain the perspective of upper-level administrators and individuals with insight into the budgetary, organizational and policy issues that impact digital preservation decisions. Moreover, the authors sought to avoid duplication of survey responses, in which more than one respondent reports on the activities of a single law library. A brief introductory message containing the Web survey's hyperlinked URL was distributed via two separate electronic mailing lists, or listservs: the State Law Librarians Roundtable (sccllsllr@aallnet.org) and LawLibDir (lawlibdir@lists.washlaw.edu). The State Law Librarians Roundtable reaches 42 state law library directors who are members of the American Association of Law Libraries' Special Interest Section of State, Court, & County Law Libraries. LawLibDir, operated by Washburn University Law School, reaches a subscription base of 230 academic law library directors. The authors did not distribute the surveys directly; rather, they enlisted the assistance of colleagues who were listserv subscribers to post the survey on their behalf. The survey was published online and available for respondents to complete between 29 February 2008 and 20 March 2008. It was posted to the State Law Librarians Roundtable on 29 February 2008, and to LawLibDir on 4 March 2008.

Response rate

The survey was thus made available online to 272 recipients. A total of 39 recipients accessed the survey (i.e., they followed the URL to the survey's location on the Web); of these, however, only 37 submitted surveys were determined to be valid. As a result, the survey response rate was low (13.6%). Thus, though the sample size is a weakness of the study, it is worth noting that a similar survey of digital preservation conducted via e-mail by the Northeast Document Conservation Center reported a comparable response rate of 12.5%. Among submitted surveys, the types of law libraries represented were equally split: 18 academic law libraries and 18 state law libraries (see Figure 1). One additional respondent was identified as an 'other' law library, and the respondent declined to give additional information regarding his/her institution. Considering each listserv's subscription numbers, the academic law library response rate translates to 7.8%, while the state law library response rate stands at 42.9%.
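The reported rates can be reproduced directly from the counts given above (272 total recipients and 37 valid surveys; 230 LawLibDir subscribers and 42 Roundtable members, with 18 responses from each group); the check below simply restates that arithmetic.

```python
# Reproducing the response rates reported in the text.
rates = {
    "overall":  37 / 272,   # valid surveys / total recipients
    "academic": 18 / 230,   # academic responses / LawLibDir subscribers
    "state":    18 / 42,    # state responses / Roundtable members
}
for group, rate in rates.items():
    print(f"{group}: {rate:.1%}")   # -> overall: 13.6%, academic: 7.8%, state: 42.9%
```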
Results

Law libraries and digital preservation activities

Based on the reports of 21 survey respondents, representing 11 academic law libraries, nine state law libraries and one 'other' library, a cumulative total of about 59 digital preservation projects have been planned or implemented by respondents' libraries within the past five years. Responses varied, with nine respondents, representing four academic law libraries, three state law libraries and two 'other' libraries, indicating that their institutions had been involved in no digital preservation activities at all. Four academic law libraries indicated their involvement in one project, while four state law libraries and one academic law library indicated involvement in two digital preservation projects during the past five years. An academic law library and a state law library reported participation in three and five projects, respectively, and another state law library respondent indicated involvement in 12 projects over the stated period. Finally, an academic law library respondent reported participation in between 20 and 30 digital preservation projects (which was estimated to be 25 projects for the purpose of this study).

[Figure 1. Survey respondents by library type: academic law libraries (18) and state law libraries (18), about 48-49% each, and others (1, or 3%).]

In assessing the extent to which respondents' efforts focused on the preservation of digitized, or scanned, print materials, as opposed to digitally born materials with no print original, it was determined that 55 of the 59 digital preservation projects involved digitized print materials. Only four projects involved digitally born materials, and out of 15 respondents, 13 indicated that they were not harvesting and preserving Web-published materials. Table 1 provides an overview of respondents' digital preservation projects involving digitized print and digitally born materials. Of the total number of 18 state law library respondents, only two reported participation in a total of three projects to preserve digitally born materials, while five (27.8%) respondents indicated their involvement in a total of 22 projects to preserve digitized print materials. Of the 18 academic law library respondents who submitted responses to the survey, only one reported participation in a single project to preserve digitally born materials, while eight (44.4%) respondents indicated their involvement in a total of 33 projects to preserve digitized print materials.

Table 1. State and academic law library respondents' digital preservation projects involving digitized print and digitally born materials.
                                                          Total    Academic    State
Total number of digital preservation projects                59          34       25
Projects involving digitized (scanned) print materials       55          33       22
Projects involving digitally born materials                   4           1        3

With regard to law libraries' emphasis on the preservation of primary versus secondary legal materials, three academic law library respondents indicated that their institutions preserved primary domestic law, while one preserved both primary and secondary materials, whether domestic, foreign or international. Among state law library respondents, three respondents, representing all who answered the question, reported preserving domestic law materials in digital formats.

Regarding copyright and access issues, a total of eight academic law library respondents answered the questions on this topic. Of these respondents, two reported that their institutions did not preserve materials in the public domain and therefore always sought copyright permission before including protected materials in digital preservation projects. Two respondents indicated that they sometimes obtained permission, and four respondents never sought such permission, even though one respondent was involved in the preservation of non-public-domain materials. Of the six state law library respondents involved in digital preservation projects, five responded to the questions on copyright and access policies. Of these respondents, two indicated that copyright permission to include protected materials in digital preservation projects was obtained, while three indicated that it was not. Four respondents indicated that their institutions preserved materials that were in the public domain.

Respondents report different approaches to providing access to preserved digital materials in their collections. Out of seven academic law library respondents, four libraries make their archives accessible online to authenticated patrons, two provide onsite-only access to library patrons and one has no current access system but is working on an open-access Web interface. One academic law library respondent distinguishes between the methods of access for copyrighted and public-access materials, allowing online access to the latter. Three state law library respondents reported making preserved digital material fully accessible to the public via the Web, while two respondents provide access onsite to in-library patrons only.

Attitudes and perceptions about digital preservation in the law library community

Survey respondents were asked to indicate their level of agreement on a Likert scale to a series of four statements regarding the types of digital legal materials that law libraries should be preserving. Table 2 provides respondents' rankings for each item. Among eleven academic law library respondents, four strongly agreed that law libraries should be involved in the preservation of law-related information published on the Web, with one respondent adding government information as well as information published on the Web and cited in law review articles. Three academic law library respondents agreed that law-related information published on the Web should be preserved, while two agreed that law libraries should be involved in the preservation of law review articles published digitally within subscription databases. One respondent expressed disagreement with the statement that law libraries should be involved in the preservation of digitally born government information, and one respondent also did not agree that law libraries should be responsible for articles and materials made available through subscription databases. Among seven state law library respondents, all seven agreed or strongly agreed that law libraries should be involved in the preservation of law-related information published on the Web. Six respondents agreed or strongly agreed that law libraries should be involved in the preservation of law review articles published digitally within subscription databases, as well as the preservation of digitally born government information and Web-published legal information cited in law review articles.

Although the vast majority of digital projects implemented by survey respondents involved the preservation of digitized print, as opposed to digitally born, materials, respondents indicated by a two-to-one margin that they believed that digitally born materials were more in need of digital preservation efforts than print materials (see Table 3).

Table 2. State and academic law library respondents indicated their level of agreement to a series of statements regarding the types of digital legal materials that law libraries should be preserving (% in parentheses).

Law libraries should be involved in preventing the loss of law-related information published to the Web
            Strongly disagree   Disagree    Neutral     Agree       Strongly agree   Response count
Total           0 (0.0)         0 (0.0)     4 (22.2)    8 (44.4)    7 (38.9)         18
Academic        0 (0.0)         0 (0.0)     4 (36.4)    4 (36.4)    4 (36.4)         10
State           0 (0.0)         0 (0.0)     0 (0.0)     4 (57.1)    3 (42.9)         7

Law libraries should be involved in the preservation of digitally born government information
Total           0 (0.0)         1 (5.9)     3 (17.6)    9 (52.9)    4 (23.5)         17
Academic        0 (0.0)         1 (9.1)     3 (27.3)    6 (54.5)    1 (9.1)          10
State           0 (0.0)         0 (0.0)     1 (14.3)    3 (42.9)    3 (42.9)         7

Law libraries should be involved in preventing the loss of information published on the Web and cited within law review articles
Total           0 (0.0)         0 (0.0)     6 (33.3)    8 (44.4)    4 (22.2)         18
Academic        0 (0.0)         0 (0.0)     5 (45.5)    4 (36.4)    2 (18.2)         10
State           0 (0.0)         0 (0.0)     1 (14.3)    4 (57.1)    2 (28.6)         7

Law libraries should be involved in the long-term preservation of and sustained access to law review articles and other legal materials published digitally within subscription databases (HeinOnline, LexisNexis, Westlaw, etc.)
Total           0 (0.0)         1 (5.6)     4 (22.2)    9 (50.0)    4 (22.2)         18
Academic        0 (0.0)         1 (10.0)    2 (20.0)    5 (50.0)    2 (20.0)         11
State           0 (0.0)         0 (0.0)     1 (14.3)    4 (57.1)    2 (28.6)         7

Table 3. State and academic law library respondents' answers to the question: 'Which materials, in your opinion, deserve more attention when it comes to preservation?'
Which materials are in greater need of preservation?    Total    Academic    State
Print materials                                             6           4        2
Digitally born materials                                   12           6        6

Respondents advocating for the preservation of
In response to an open-ended question about which institutions and organizations, specifically, respondents had partnered with in digital preservation activities, nine state and academic law library respondents listed the following as collaborators: seven libraries and academic institutions, five for-profit companies, four nonprofit orga- nizations and two state government entities (excluding state libraries and archives). The list of collaborators named by respondents, classified by type, is available in Table 5. Factors impacting law libraries' digital preservation activities Survey respondents were asked to indicate their level of agreement, on a Likert scale, as to the extent to which various factors have impacted digital preservation activities at their respective libraries. The first series of Likert items assessed factors limiting libraries' involvement in digital preservation activities (see Table 6). The second series of Likert items assessed factors that would encourage libraries' involvement in digital preservation activities (see Table 7). Lack of funding, staffing shortages, lack Table 4. State and academic law library respondents' answers to the question: 'To what extent do you collaborate with other institutions and/or nonprofit/for-profit partners in developing your digital projects?' (% in parentheses). Almost Never Rarely Sometimes always Always Response collaborate collaborate collaborate collaborate collaborate count Collaborate with libraries/institutions Total 4 (30.8) 1 (7.7) 6 (46.2) 2 (15.4) o (0.0) 13 Academic 4(57.1) 1 (14.3) I (14.3) 1 (14.3) o (0.0) 7 State o (0.0) 0(0.0) 4 (80.0) 1 (20.0) 0(0.0) 5 Collaborate with nonprofit partners Total 9 (75.0) 0(0.0) o (0.0) 3 (25.0) o (0.0) 12 Academic 5 (88.3) o (0.0) 0(0.0) I (16.7) o (0.0) 6 State 3 (60.0) o (0.0) 0(0.0) 2 (40.) o (0.0) 5 Collaborate with for-profit partners Total 8 (66.7) o (0.0) 4 (33.3) o (0.0) 0(0.0) 12 Academic 3 (50.0) o (0.0) 3 (50.0) o (0.0) 0(0.0) 6 State 4 (80.0) 0(0.0) 1 (20.0) o (0.0) o (0.0) 5 54 S. Rhodes and D. Neacsu Table 5. Entities that have collaborated in the digital preservation initiatives of nine survey respondents. Named collaborators Libraries/academic partners Academic/law libraries Library consortia State libraries/archives Law schools/universities For-profit partners Publishers Google Nonprofit/organization partners Heritage/historical societies Legal institutions Professional organizations State government par tilers State committees State supreme courts Total 7 I 2 2 2 5 4 1 4 2 1 1 2 1 1 of staff with expertise in digital preservation and institutional lack of interest in digital preservation were all cited as limiting factors, and, not surprisingly, additional funding, additional staff and the recruitment of staff with digital preservation expertise were all cited as factors that would encourage increased participation in digital preservation activities at the institutional level. Although most respondents neither agreed nor disagreed that lack of opportunities for collaboration and lack of a large-scale digital preservation movement in which to participate has limited their activities, an increased number of respondents indicated agreement or strong agreement that increases in opportunities for collaboration and the emergence of a large-scale digital preservation movement in which to participate would encourage more involvement in digital preservation activities at their libraries. 
Discussion

While the low rate of response precludes broad generalization of the data, these survey findings nonetheless build upon an emerging area of academic interest in the field of law librarianship and are worth consideration. Based on survey responses, the majority of state and academic law libraries' digital preservation projects have involved the preservation of digitized, or scanned, print materials, as opposed to the preservation of materials that are digitally born; out of 59 reported digital preservation projects, only four involved digitally born items. Yet, paradoxically, when asked which material types were in greater need of preservation, print materials or digitally born materials, respondents replied that digitally born materials were in more urgent need of preservation by a margin of 2 to 1.

In response to questions about issues impacting their institutions' level of involvement in digital preservation activities, lack of funding, staffing shortages and lack of staff with technological or digital preservation expertise were cited as limiting factors. Not surprisingly, it was widely reported that increased funding, staffing and recruitment or cultivation of well-trained staff would encourage greater involvement in digital preservation activities.

Table 6. State and academic law library respondents indicated their level of agreement as to the extent to which various factors have limited involvement in digital preservation activities at their respective institutions (% in parentheses).

My library's level of involvement in digital preservation activities has been limited by:

                Strongly disagree  Disagree   Neutral    Agree      Strongly agree  Response count
Lack of funding
  Total         0 (0.0)            3 (16.7)   3 (16.7)   6 (33.3)   6 (33.3)        18
  Academic      0 (0.0)            3 (27.3)   1 (9.1)    3 (27.3)   4 (36.4)        11
  State         0 (0.0)            0 (0.0)    2 (28.6)   3 (42.9)   2 (28.6)        7
Concerns about technology/file format obsolescence
  Total         1 (5.6)            4 (22.2)   5 (27.8)   7 (38.9)   1 (5.6)         18
  Academic      0 (0.0)            4 (36.4)   2 (18.2)   4 (36.4)   1 (9.1)         11
  State         1 (14.3)           0 (0.0)    3 (42.9)   3 (42.9)   0 (0.0)         7
Staffing shortages
  Total         0 (0.0)            2 (11.1)   1 (5.6)    10 (55.6)  5 (27.8)        18
  Academic      0 (0.0)            2 (18.2)   1 (9.1)    6 (54.5)   2 (18.2)        11
  State         0 (0.0)            0 (0.0)    0 (0.0)    4 (57.1)   3 (42.9)        7
Lack of staff with digital preservation/technological expertise
  Total         0 (0.0)            3 (16.7)   3 (16.7)   10 (55.6)  2 (11.1)        18
  Academic      0 (0.0)            2 (18.2)   2 (18.2)   7 (63.6)   0 (0.0)         11
  State         0 (0.0)            1 (14.3)   1 (14.3)   3 (42.9)   2 (28.6)        7
Digital preservation is not an institutional priority
  Total         1 (5.6)            4 (22.2)   4 (22.2)   8 (44.4)   1 (5.6)         18
  Academic      0 (0.0)            1 (9.1)    3 (27.3)   6 (54.5)   1 (9.1)         11
  State         1 (14.3)           3 (42.9)   1 (14.3)   2 (28.6)   0 (0.0)         7
Lack of partners/opportunities to collaborate with other libraries and organizations in digital preservation activities
  Total         1 (5.6)            5 (27.8)   10 (55.6)  2 (11.1)   0 (0.0)         18
  Academic      0 (0.0)            3 (27.3)   7 (63.6)   1 (9.1)    0 (0.0)         11
  State         1 (14.3)           2 (28.6)   3 (42.9)   1 (14.3)   0 (0.0)         7
Lack of an organized statewide/nationwide/international digital preservation movement in which to participate
  Total         1 (5.6)            5 (27.8)   7 (38.9)   3 (16.7)   2 (11.1)        18
  Academic      0 (0.0)            5 (45.5)   6 (54.5)   0 (0.0)    0 (0.0)         11
  State         1 (14.3)           0 (0.0)    1 (14.3)   3 (42.9)   2 (28.6)        7
While most respondents indicated that lack of collaborative opportunities or large-scale movements in which to participate had not necessarily limited their digital preservation activities, more than half of the respondents agreed that increased opportunities to collaborate with other institutions or within the context of a large-scale digital preservation movement would encourage more involvement in digital preservation activities at their libraries. Most likely, there is some recognition among law libraries that collaborative or large-scale digital preservation programs would allow for sharing of resources, including funds and workloads, among participating institutions, reducing the burden placed upon any single institution in implementing a digital preservation program.

Table 7. State and academic law library respondents indicated their level of agreement as to the extent to which various factors would encourage digital preservation activities at their respective institutions (% in parentheses).

The following would encourage greater involvement in digital preservation activities at my library:

                Strongly disagree  Disagree   Neutral    Agree      Strongly agree  Response count
Increased funding
  Total         0 (0.0)            1 (5.6)    2 (11.1)   9 (50.0)   6 (33.3)        18
  Academic      0 (0.0)            1 (9.1)    2 (18.2)   7 (63.6)   1 (9.1)         11
  State         0 (0.0)            0 (0.0)    0 (0.0)    2 (28.6)   5 (71.4)        7
Increased staffing
  Total         0 (0.0)            1 (5.6)    2 (11.1)   8 (44.4)   7 (38.9)        18
  Academic      0 (0.0)            1 (9.1)    2 (18.2)   6 (54.5)   2 (18.2)        11
  State         0 (0.0)            0 (0.0)    0 (0.0)    2 (28.6)   5 (71.4)        7
Recruitment/cultivation of staff with digital preservation/technological expertise
  Total         0 (0.0)            0 (0.0)    4 (22.2)   9 (50.0)   5 (27.8)        18
  Academic      0 (0.0)            0 (0.0)    3 (27.3)   7 (63.6)   1 (9.1)         11
  State         0 (0.0)            0 (0.0)    1 (14.3)   2 (28.6)   4 (57.1)        7
Increased opportunity to collaborate with other libraries and organizations in digital preservation activities
  Total         0 (0.0)            1 (5.6)    7 (38.9)   7 (38.9)   4 (22.2)        18
  Academic      0 (0.0)            1 (9.1)    7 (63.6)   3 (27.3)   1 (9.1)         11
  State         0 (0.0)            0 (0.0)    0 (0.0)    4 (57.1)   3 (42.9)        7
Establishment of an organized statewide/nationwide/international digital preservation movement in which to participate
  Total         0 (0.0)            0 (0.0)    7 (38.9)   7 (38.9)   4 (22.2)        18
  Academic      0 (0.0)            0 (0.0)    7 (63.6)   3 (27.3)   1 (9.1)         11
  State         0 (0.0)            0 (0.0)    0 (0.0)    4 (57.1)   3 (42.9)        7

Case study

Introduction and overview

While law librarians, as an organized profession, have engaged in discussions and academic dialogues about the need for collaborative efforts to preserve digitally born legal materials, it is a pilot effort presently being implemented in the United States that may provide unique insight into the practical experience of realizing such a program. In the medical and social sciences, the case study is commonly used as a research tool to gather and present empirical data from quantitative and quasi-experimental designs, often for the purpose of establishing causal relationships. The case study that follows, however, diverges from this design and borrows from the educational case studies found in the disciplines of law, business and public policy, which are presented with the intent of impacting public opinion, practice and policy development by describing and raising awareness of specific problem-solving strategies (Yin, 2003). Within the digital preservation community, this type of descriptive case study has been put forward to encourage, support and provide a framework for the establishment of digital preservation programs.
For example, the Electronic Resource Preservation and Access Network (ERPANET), funded by the European Commission and the Swiss Confederation, is developing a series of sixty case studies to explore digital preservation programs in various institutions and commercial sectors (Ross, 2004). Presently, nearly forty such studies have been published on the ERPANET website (ERPANET, n.d.). The study that follows provides an account of the Chesapeake Project, a collaborative pilot digital preservation project that began its archiving activities in March 2007, and investigates the following issues: project origin and background; mission and objectives; selection and collection scope; digital preservation strategies and tools; discovery and access; organizational framework and staffing; project and collection status; and post pilot-phase prospects.

Methodology

In researching the Chesapeake Project for the purpose of developing the present case study, a number of published and unpublished project-created policy documents were consulted and analyzed. In addition to these official project documents, other records consulted included project meeting materials, such as agendas and handouts, correspondence between collaborating project members, and vendor documentation. Documents produced by the Legal Information Preservation Alliance related to the early establishment of the Chesapeake Project were also examined. It is important for readers to note that one of the two authors of this article is affiliated with an institution participating in the Chesapeake Project, and the other author is not. Working together, the authors have made every attempt to present an unbiased case study. Given this level of affiliation with the project, however, the authors were also privileged with extraordinary access to numerous materials and documents associated with the project, as well as the insight and direct observations of a project participant.

Project origin and background

The Chesapeake Project's origins can be traced to March 2003, when the conference 'Preserving Legal Information for the 21st Century: Toward a National Agenda' was held at the Georgetown University Law Center. The conference convened a select and strategic group of law library directors, law librarians, publishers and experts in the fields of information technology and digital preservation to discuss and set forth a national agenda to prevent the loss of legal information in both analog and digital formats (Legal Information Preservation Alliance, 2003). In an effort to advance this national agenda, participants formed a new organization called the Legal Information Preservation Alliance (LIPA), with the mission 'to provide the leadership, the necessary organizational framework and the professional commitment necessary to preserve vital paper and electronic legal information by defining objectives, developing and/or adopting appropriate standards and models, creating networks, and fostering financial and political support for long term stability' (Legal Information Preservation Alliance, n.d.b).
LIPA's membership has increased steadily since its founding in 2003; by September 2005, the association had enlisted 36 institutions, and by January 2008, LIPA's membership had risen to 69, representing the American Association of Law Libraries (AALL) and state and academic libraries throughout the United States (Special Committee on Permanent Public Access to Legal Information, 2005; Legal Information Preservation Alliance, n.d.a).

In June 2006, LIPA finalized its Strategic Plan Outline, which articulated, among other things, the association's priorities in the area of preservation activities. First listed among these activities was the creation 'of a pilot project to preserve born digital materials' (Legal Information Preservation Alliance, 2006, p. 2). In an effort to accomplish this strategic objective, three LIPA-member institutions in the Chesapeake Bay region of the United States - the Georgetown Law Library, the Maryland State Law Library and the Virginia State Law Library - established the Chesapeake Project as a two-year pilot digital preservation program to address the challenge of preserving legal information published directly to the Web. Library directors at the Georgetown, Maryland State and Virginia State Law Libraries began organizing the project and evaluating Web harvesting, digital archiving tools and repository options in 2006. In March 2007, the institutions participating in the pilot began actively harvesting content from the Internet and preserving it within a shared digital repository. As a two-year pilot, the Chesapeake Project is slated to end its pilot phase in 2009. It is anticipated that it will establish a solid framework, body of policy documentation and support structure that will evolve into a nationwide digital preservation initiative by enlisting the participation of state and academic law libraries throughout the United States.

Project mission and objectives

The mission of the Chesapeake Project, as stated in project documentation, is

to successfully develop and implement a pilot program to stabilize, preserve and ensure permanent access to critical born-digital legal materials on the World Wide Web. The Chesapeake Project is working to establish the beginnings of a strong regional digital archive collection of US legal materials as well as a sound set of standards, policies and best practices that could potentially serve to guide the future realization of a nationwide preservation program. (Chesapeake Project, 2007, p. 2)

Beyond this broadly stated mission statement, participants in the Chesapeake Project, upon its inception, did not create a list of specific benchmarks or strategic objectives. As a pilot project, the Chesapeake Project is primarily an investigative effort to lay the foundation for a larger collaborative program. As stated in the Chesapeake Project's first-year evaluation document:

[P]roject participants have utilized the first year of the pilot to familiarize themselves with the digital archiving process, create shared documentation to guide project participation, assess digital-archiving costs and necessary staffing commitments, and develop reasonable expectations for progress in digital archiving and archive collection development. (Chesapeake Project, 2008, p. 4)
Although specific project benchmarks were not set for the project's first year, evaluation parameters for the project were established early in the pilot, and participants set up a formal project evaluation schedule, with evaluations being conducted at the pilot's one- and two-year marks. First-year evaluation measures included: a count of items and titles archived during the first year; access statistics; a test of a sample to determine the number of archived titles altered or removed from their original locations on the Web; and a qualitative analysis of the project's progress and challenges (Chesapeake Project, 2008). Second-year evaluation measures are expected to consider user feedback.

Selection and collection scope

Because there are no in-print lists or approval plans generated to guide the acquisition of legal content from the expansive independent publishing medium that is the Internet, the selection of materials from the Web for long-term preservation can be a challenge. Due to the two-year time limitation of the pilot project, each library participating in the Chesapeake Project implemented a set of selection parameters in an effort both to guide and limit the scope of materials harvested and preserved, and to ensure the development of a cohesive collection of preserved legal materials. The selection of materials from the Web for preservation by institutions participating in the Chesapeake Project is driven by the following factors: the users of each library and their information needs; the institutional missions, mandates and priorities of each library; the perceived risk level of the material; and the collection scope and parameters determined, individually, by each institution.

The library patrons at each institution comprise the primary user group, which shapes the digital-archive selection decisions and collection scopes of participating libraries. These patron groups range from law students at Georgetown to state law practitioners in Maryland and Virginia; they are described in project documentation as 'law practitioners, law faculty members, law students, justices and their staff members, judges and their staff members, and state government officials and their staff members' (Chesapeake Project, 2007, p. 3). The Chesapeake Project also aims to serve the legal information needs of a broader secondary user group comprising law students, scholars and practitioners who are not affiliated with the Georgetown, Maryland State or Virginia State Law Libraries, in addition to general public users with legal information needs.

The Maryland and Virginia State Law Libraries' digital-archive collections consist primarily of state-issued materials, as well as some community- and organization-published reports and studies. The Maryland State Law Library specifically selects items 'that describe, analyze, document, propose, clarify or define public-policy and legal issues that affect the citizens of the state of Maryland' (Chesapeake Project, 2008, p. 3). Within the first year of its participation in the Chesapeake Project, the Maryland State Law Library has collected and preserved what may be the most comprehensive collection of Maryland General Assembly-mandated task force reports available online (Chesapeake Project, 2008).
The digital-archive collections of the Virginia State Law Library represent the online publications of the state's judicial branch of government, including those of the Supreme Court of Virginia and the Judicial Council of Virginia.

As an academic law library, the Georgetown Law Library's digital-archive collections are largely thematic and include secondary legal materials based on scholarly areas of interest and the established legal research institutes at the Georgetown Law Center. Additionally, the library collects jurisdictional materials by and about local and neighboring government entities and a limited number of reports from federal commissions. The library also works with the Law Center's Office of Journal Administration to archive Web-based sources that are cited in legal journals and fit within the established project collection scope.

While some Web-harvesting projects focus on the capture and preservation of entire Web sites, the Chesapeake Project focuses upon the capture and preservation of discrete online publications. If multiple reports are posted on a single webpage, for example, the entire webpage is not harvested; rather, each report is harvested individually, and preservation metadata is created to accompany each harvested title. Although the process of harvesting and archiving individual publications one by one, as opposed to entire collections of publications at once, is considerably more time-consuming, project participants believe that this strategy is in the best interests of their users, as individual titles can be cataloged and linked to single bibliographic records, facilitating user discovery. Entire websites, on the other hand, would require re-harvesting at regular intervals to capture newly posted content, and facilitating discovery of discrete reports embedded deeply within a harvested, content-rich website would pose a challenge.
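The title-by-title workflow just described can be pictured with a short sketch. The following Python fragment is a minimal illustration only, not the project's actual tooling: the report URL, file names and metadata field names are hypothetical, and real harvests would also carry the curator-supplied descriptive metadata discussed later in this case study.

```python
import hashlib
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical example: a single state-issued report published as a PDF.
SOURCE_URL = "https://example.gov/reports/task-force-report-2008.pdf"

def harvest_title(url: str, local_path: str) -> dict:
    """Fetch one discrete publication and build a stub preservation record."""
    with urllib.request.urlopen(url) as response:
        payload = response.read()
        content_type = response.headers.get("Content-Type", "unknown")
    with open(local_path, "wb") as f:
        f.write(payload)
    # Technical fields (size, checksum, MIME type) can be captured
    # automatically; descriptive and administrative fields would be entered
    # manually by curators.
    return {
        "original_url": url,
        "harvest_date": datetime.now(timezone.utc).isoformat(),
        "content_type": content_type,
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }

if __name__ == "__main__":
    record = harvest_title(SOURCE_URL, "task-force-report-2008.pdf")
    with open("task-force-report-2008.metadata.json", "w") as f:
        json.dump(record, f, indent=2)
```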
Digital preservation strategies and tools

After considering various options for the storage, preservation and management of digital materials, including open-source options, the libraries participating in the Chesapeake Project selected the OCLC Digital Archive, operated and administered by the nonprofit Online Computer Library Center (OCLC). Although open-source digital repository systems represent a less expensive option, they require a staff with significant technological expertise and a designated storage site. As such, project participants chose to utilize a vendor-operated system. A number of factors influenced the choice of the OCLC Digital Archive. The archive's storage system adhered to the ISO reference model for an Open Archival Information System (OAIS), which is the standard conceptual framework accepted by the digital preservation community for the permanent preservation of digital information. OCLC's prominence and stability in the library community also impacted this choice, as the long-term viability of any digital preservation project demands a digital repository backed by a sound organizational structure. Moreover, OCLC was willing to work with the Chesapeake Project to negotiate a shared trial pricing structure for the term of the two-year pilot phase.

The Chesapeake Project utilized the OCLC Digital Archive's bit-level preservation services, ensuring that the digital files deposited into the archive remained uncorrupted and renderable in their original formats. OCLC's responsibilities included secure onsite storage of archived items at OCLC facilities, maintaining multiple copies of backup data and disaster tapes stored at an offsite facility, and a regular schedule of virus-checking, file format verification and fixity-checking using checksum algorithms. There was no explicit preservation action strategy, such as format migration, for items in the OCLC Digital Archive; however, customized preservation treatments would be implemented by OCLC, based on the formats archived and institutional needs, to counteract future obsolescence.

All items harvested from the Internet and placed into the OCLC Digital Archive are accompanied by preservation metadata records, which contain information that will ultimately guide preservation action decisions and the future rendering of archived digital objects. The Digital Archive automatically generates and captures technical metadata about each item harvested, including the file format type, which is verified using the JSTOR/Harvard Object Validation Environment, or JHOVE. Project participants play an administrative and curatorial role in the creation of metadata records, manually entering descriptive and administrative metadata into the preservation records. As a quality control measure and to ensure consistency in metadata record creation, project participants consult a project metadata guide, which was developed at the start of the pilot.
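The checksum-based fixity-checking mentioned above follows a well-known pattern: a digest is recorded when a file enters the archive and is recomputed later, with any mismatch signaling silent corruption. The sketch below illustrates that pattern in minimal form; the manifest file and directory layout are invented for illustration, and OCLC's internal procedures are not documented at this level of detail.

```python
import hashlib
import json
from pathlib import Path

MANIFEST = Path("fixity_manifest.json")  # hypothetical manifest location

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large archives fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_fixity(paths):
    """At ingest: store a baseline checksum for each archived file."""
    manifest = {str(p): sha256_of(p) for p in paths}
    MANIFEST.write_text(json.dumps(manifest, indent=2))

def verify_fixity():
    """Later, on a schedule: recompute checksums and report mismatches."""
    manifest = json.loads(MANIFEST.read_text())
    return [name for name, digest in manifest.items()
            if sha256_of(Path(name)) != digest]

if __name__ == "__main__":
    record_fixity(sorted(Path("archive").glob("*.pdf")))
    failures = verify_fixity()
    print("fixity failures:", failures or "none")
```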
Like all things in the digital environment, digital archives and repositories themselves are not immune to the advancement of technology and the threat of obsolescence. Just as digital files require maintenance and migration, so too do digital archive systems. Within the first year of its pilot phase, the Chesapeake Project experienced this phenomenon first hand. Project participants were informed by OCLC in summer 2007 that the OCLC Digital Archive would be replaced by a new system. In April 2008, shortly after the project's first-year mark, OCLC transitioned the Chesapeake Project's archived collections and metadata from the original OCLC Digital Archive to a more sophisticated, two-tiered digital-preservation and access system. Whereas the original OCLC Digital Archive acted as both an access and a preservation system, the new system separates access from preservation. Using the new system, two digital objects are created from the original item harvested from the Web: a master file and an access copy. The master file is stored in a dark digital archive, which is very similar to the previous OCLC Digital Archive, except that it is completely inaccessible to users. The derived access copy is imported into CONTENTdm, a customized storage and retrieval system, which makes archived collections accessible to users via a searchable Web interface.

Discovery and access

Discovery of and access to digital collections archived by the Chesapeake Project is made available through participating institutions' local OPACs, the open-access WorldCat.org system, subscription OCLC FirstSearch and WorldCat databases, and the Chesapeake Project's new CONTENTdm system. The bibliographic treatment of each item in the archive is vital to user access to and discovery of the Chesapeake Project's collections. In addition to archiving an item and generating a preservation metadata record, every archived title has a corresponding bibliographic MARC record, created in OCLC's shared global bibliographic database. As a digital item is harvested from the Web and archived, it is assigned a unique URL that is hyperlinked to the archived access copy in the OCLC system. (Previous OCLC Digital Archive system URLs now resolve to CONTENTdm URLs.) This URL is added to local records and OCLC bibliographic records within an 856 field, alongside the original Web URL, and provides direct access to archived objects. If and when an object's original URL becomes inactive, the URL for the archived access copy will continue to provide access to the title. Any user with an Internet connection can discover these records through traditional catalog searching methods, using a library's OPAC or an OCLC database, and is provided with open access to archived resources via hyperlinked URLs placed prominently within the records. As a digital object is harvested from the Web, attached to a bibliographic record in OCLC and imported into CONTENTdm, so too is the item's bibliographic metadata, which is crosswalked from MARC format into a Qualified Dublin Core record in CONTENTdm. In addition to these Qualified Dublin Core records, the CONTENTdm system facilitates discovery through full-text PDF searchability.
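A crosswalk of this kind is essentially a mapping from MARC tags and subfields onto Dublin Core terms. The sketch below is a schematic stand-in, assuming a toy record and a handful of representative mappings; OCLC's production crosswalk covers far more fields, and the record values shown (including the 856 access URL) are hypothetical.

```python
# A flattened stand-in for a bibliographic record: (tag, subfield) -> value.
# The values are invented; 856 $u carries the archived access-copy URL that
# sits alongside the original Web URL in the real records.
marc_fields = {
    ("245", "a"): "Report of the Task Force on Example Matters",
    ("110", "a"): "Maryland. General Assembly.",
    ("260", "c"): "2007",
    ("856", "u"): "https://example.org/archive/access-copy-0001",
}

# Representative mapping from MARC tags onto Dublin Core terms; a production
# crosswalk handles many more tags and subfield combinations.
CROSSWALK = {
    ("245", "a"): "dc:title",
    ("110", "a"): "dc:creator",
    ("260", "c"): "dcterms:issued",
    ("856", "u"): "dc:identifier",
}

def to_dublin_core(fields):
    """Map each known MARC (tag, subfield) pair onto a Dublin Core term."""
    return {CROSSWALK[key]: value
            for key, value in fields.items()
            if key in CROSSWALK}

print(to_dublin_core(marc_fields))
```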
Organizational framework and staffing

The three libraries participating in the Chesapeake Project vary considerably. In addition to the fact that project participants represent two state and one academic law library, each with different patron groups and mandates, the three libraries also differ significantly in terms of size. The Virginia State Law Library is operated by a staff of five people. The Maryland State Law Library is larger, with a staff of 15, and the Georgetown Law Library, which consists of two separate law library buildings, has a staff of nearly 70. Given these differences, project structure, flexible policies and regular communication were required. The director of each participating library assists with project planning, upper-level decision-making and strategy, and has appointed a staff librarian to coordinate the library's day-to-day participation in the project and manage project-related curatorial, cataloging and digital archiving tasks. Technical services and cataloging librarians at each institution also assist with the project as needed. The Georgetown Law Library hired a full-time librarian to manage the project, who devotes roughly 30 hours per week to project-related archiving, cataloging and coordination. Two librarians, a project coordinator and a cataloger, at the Maryland State Law Library spend a combined 12 hours per week on project-related tasks, and at the Virginia State Law Library, the project coordinator devotes about five hours per week to the project - down from 15 hours per week at the start of the project. All libraries report that the most time-consuming task associated with the project is cataloging, largely because the majority of the items harvested and archived through the project are fugitive documents or gray literature, and as such, they require original cataloging. Other time-consuming tasks include Web harvesting, archiving and preservation metadata record creation.

A preservation metadata guide was developed early in the project to guide the creation of preservation metadata records for archived digital objects. Soon afterward, the libraries involved in the pilot approved a comprehensive collection plan, which laid out the project's mission and scope, methods of acquisition and selection, metadata policies, methods of access and preservation system. The structure of this collection plan was borrowed from the NDIIPP-sponsored Web-at-Risk project, which has developed and published online a flexible collection plan template to accommodate the various institutions participating in the Web-at-Risk (Web-at-Risk, n.d.). Project participants have continually convened to reassess established policies and update them, as needed, to address newly discovered challenges and the project's evolving circumstances. In addition to regular e-mail updates and discussions, project participants have implemented a formal schedule of quarterly meetings to facilitate communication, discuss project policy and share information. A conference call is set up to allow for the inclusion of any project participants who are unable to attend a quarterly meeting in person.

Project and collection status

In March 2008, at the time of the project's first-year evaluation, 2705 items, representing approximately 1270 titles, had been harvested from the Web and placed within the Chesapeake Project's shared digital archive (Chesapeake Project, 2008). The discrepancy between 'items' and 'titles' is largely due to serial publications, as well as some multi-part monographs, which require multiple, separate harvests but, comprising a single 'title', are attached to a single corresponding bibliographic record. An analysis of a random sample of 579 titles archived during the Chesapeake Project's first year demonstrated, with a confidence level of 95% and a confidence interval of +/-3, that more than 95% of the titles in the archive were published in PDF format; 4% of the sample were published as X/HTML documents; and the remaining titles were in either Microsoft Word format or multiple formats, such as an HTML publication with embedded supplements in PDF format (Chesapeake Project, 2008). Project participants tested the same sample of 579 archived titles to determine the number of archived titles altered or removed from their original locations on the Web. This exercise demonstrated that more than 8% of titles harvested from the Web between the project's start date in 2007 and its first-year mark in March 2008 had inactive original URLs, meaning that these items had already been altered, removed from their original locations or deleted from the Web entirely.
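The sampling figures reported above hang together arithmetically. Assuming the conventional 95% z-value of 1.96 and the conservative proportion p = 0.5 (the published evaluation does not say which formula was used), the standard sample-size calculation for a proportion, with finite-population correction for a population of roughly 1270 titles and a +/-3-point margin of error, lands within a couple of titles of the 579 actually tested:

```python
import math

def required_sample(population: int, margin: float,
                    z: float = 1.96, p: float = 0.5) -> int:
    """Sample size for estimating a proportion, with finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population estimate
    n = n0 / (1 + (n0 - 1) / population)        # finite-population correction
    return math.ceil(n)

# Population of roughly 1270 archived titles, +/-3-point margin of error:
print(required_sample(population=1270, margin=0.03))  # 581, close to the 579 tested
```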
The Chesapeake Project's first-year evaluation also included an analysis of access statistics for archived items. Although the project's first-year efforts were marketed to neither users nor other institutions, access figures showed a high level of archived item use, indicating that many users discovered and accessed archived titles through bibliographic records in participating institutions' OPACs, WorldCat.org and other OCLC bibliographic databases. In the project's first year, archived items were accessed a total of 5317 times. Within this figure, project participants accessed their own archived items a total of 2267 times. Public users, who accessed archived content through open-access means without first logging into an OCLC system, accounted for a surprisingly high 2528 instances of access, and authenticated OCLC-affiliated libraries and institutions, excluding those participating in the project, accounted for 522 instances of access, most probably occurring during the course of research, reference activities and the addition of OCLC bibliographic records with archived URLs to their own local catalogs.

Post pilot-phase prospects

The vision of the Chesapeake Project has been articulated as follows:

The Chesapeake Project aims to set a precedent for a national movement to prevent the widespread loss of legal information in digital formats, securing these materials for generations to come. Upon reaching the close of its two-year pilot phase in 2009, The Chesapeake Project hopes to help inspire, establish and galvanize widespread participation in a comprehensive, collaborative and nationwide preservation program for legal resources. (Chesapeake Project, 2007, p. 3)

It is important to remember that the Chesapeake Project is a two-year pilot, and it ultimately aspires to evolve into a much larger digital archive for legal materials, shared by law libraries throughout the United States. With the organization-wide support of the Legal Information Preservation Alliance and the American Association of Law Libraries, this vision is indeed within reach. Beyond the borders of the United States, the Chesapeake Project aims to inform the preservation initiatives of other organized groups of libraries, who may learn through its experiences, and to raise global awareness of the vulnerability of digitally born legal materials published on the Web.

Conclusion

The end of the first decade of the twenty-first century is in sight, and the law library community has reached a crossroads in determining its role as the steward of legal information in an increasingly digital world. The amount of information being produced in digital formats and distributed via electronic media has exploded over the past decade. However, digitally born materials - especially those that are published directly and independently to the Web - are presently at an extremely high risk of permanent loss. Our legal heritage is no exception to this phenomenon, and efforts must be put forth to ensure that our current body of digital legal information is not lost. Movements to preserve digitally born legal and government publications have been set in motion - most notably those implemented by the GPO and the NDIIPP projects administered by the Library of Congress, which have enlisted the assistance of libraries throughout the United States.

Where does the law library community stand when it comes to the active preservation of our digital legal heritage? Our survey findings indicate that the law libraries represented in our sample recognize that digitally born legal materials are at high risk of loss, yet their own digital preservation projects have primarily focused upon the preservation of digitized print materials, rather than digitally born materials. Digital preservation activities among surveyed libraries have been largely limited by a lack of funding, staffing and expertise; however, these barriers could be overcome by collaboration with other institutions, as well as participation in a large-scale regional or national digital preservation movement, which would allow for resource sharing among participants.
One such collaborative digital preservation program has been initiated within the past year: the Chesapeake Project. The project, which is a collaborative effort between academic and state law libraries, is being implemented under the auspices of the Legal Information Preservation Alliance. The first year of the program has been shown to be successful, and the Chesapeake Project shows great promise in its goal of inspiring a nationwide effort to prevent the loss of digital legal information. Tackling the challenges of digital preservation represents a means by which law libraries can reclaim their traditional roles as stewards of information in the digital sphere. More importantly, it would ensure that our contemporary legal heritage is preserved for generations to come.

Acknowledgements

The authors would like to acknowledge the support and guidance of Kent McKeever, Steve Anderson, Mary Alice Baish and Janice Anderson.

References

Anon. (1980). The preservation crisis. Journal of Academic Librarianship, 6(5), 290. Retrieved March 8, 2008, from http://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=11298426&site=ehost-live/
Baldwin, G. (2003). Fugitive documents: On the loose or on the run. Retrieved March 29, 2008, from http://www.access.gpo.gov/su_docs/fdlp/pubs/adnotes/ad081503.html#3/
Bradley, K. (2007). Defining digital sustainability. Library Trends, 56(1), 148-163.
Breeze, H. (2005). Results of general survey. Retrieved April 10, 2008, from http://www.aallnet.org/committee/lipa/inventory/InventoryReportSectV.html/
Burlington Magazine. (1967). The Florentine flood disaster. Burlington Magazine, 109(769), 192-194. Retrieved March 6, 2008, from http://links.jstor.org/sici?sici=0007-6287%28196704%29109%3A769%3C192%3ATFFD%3E2.0.CO%3B2-%23/
California Digital Library. (n.d.). The Web at risk: Preserving our nation's cultural heritage. Retrieved March 29, 2008, from http://www.cdlib.org/inside/projects/preservation/webatrisk/
Center for Research Libraries. (2007). Core requirements for digital archives. Retrieved March 29, 2008, from http://www.crl.edu/content.asp?l1=13&l2=58&l3=162&l4=92/
Chesapeake Project. (2007). Collection plan. Retrieved March 8, 2008, from http://web3.unt.edu/webatrisk/cp_cur/CP/ChesapeakeCollectionPlan_July2Draft.pdf/
Chesapeake Project. (2008). First-year pilot project evaluation. Unpublished manuscript.
Clareson, T. (2006). NEDCC survey and colloquium explore digitization and digital preservation policies and practices. RLG DigiNews, 10(1). Retrieved April 10, 2008, from http://digitalarchive.oclc.org/da/ViewObject.jsp?objid=0000068991&reqid=112907#article1/
Commission on Civil Rights. (2007). Reinvigorating the nation's civil rights debate: The strategic plan of the United States Commission on Civil Rights for fiscal years 2008-2013. Retrieved April 13, 2008, from http://www.usccr.gov/pubs/SPFY0813web.pdf/
Deegan, M., & Tanner, S. (2006). Key issues in digital preservation. In M. Deegan & S. Tanner (Eds.), Digital preservation (pp. 1-31). London: Facet.
ERPANET. (n.d.). ErpaStudies. Retrieved March 29, 2008, from http://www.erpanet.org/studies/index.php/
Gano, G., & Linden, J. (2007). Government information in legacy formats: Scaling a pilot project to enable long-term access. D-Lib Magazine, 13(7/8). Retrieved March 12, 2008, from http://www.dlib.org/dlib/july07/linden/07linden.html/
Germain, C.M. (2002). Web mirror sites: Creating the research library of the future, and more. Legal Reference Services Quarterly, 21(2/3), 87-104.
Glenn, V.D. (2007). Preserving government and political information: The Web-at-Risk project. First Monday, 12(7). Retrieved March 12, 2008, from http://firstmonday.org/issues/issue12_7/glenn/index.html/
Holdsworth, D. (2006). Strategies for digital preservation. In M. Deegan & S. Tanner (Eds.), Digital preservation (pp. 32-59). London: Facet.
International Data Corporation. (2008). The diverse and exploding digital universe: An updated forecast of worldwide information growth through 2011. Retrieved March 8, 2008, from http://www.emc.com/digital_universe/
International Review. (1878). The government library at Washington. International Review, 5, 754-769. Retrieved March 6, 2008, from http://0-proquest.umi.com.gull.georgetown.edu:80/pqdlink?did=409473881&sid=2&Fmt=2&clientId=5600&RQT=309&VName=HNP/
Internet Archive. (n.d.). Web archiving services. Retrieved March 12, 2008, from http://www.archive.org/web/web.php/
Kenney, A., & Buckley, E. (2005). Developing digital preservation programs: The Cornell Survey of Institutional Readiness, 2003-2005. RLG DigiNews, 9(4). Retrieved March 12, 2008, from http://digitalarchive.oclc.org/da/ViewObject.jsp?objid=0000068919&reqid=1642#article0/
Kniffel, L. (2003). Though devastating, Iraq library losses may be less than feared. American Libraries, 34(6), 40-41.
Kumar, S.L. (2006). Providing perpetual access to government information. Reference Librarian, 45(94), 225-232.
LeFurgy, W. (2005). Building preservation partnerships: The Library of Congress National Digital Information Infrastructure and Preservation Program. Library Trends, 54(1), 163-172.
Legal Information Preservation Alliance. (2003). Preserving legal information for the 21st century: Toward a national agenda, March 6-8, 2003. Retrieved March 2, 2008, from http://www.aallnet.org/committee/lipa/LIPA_Conference_Report.pdf/
Legal Information Preservation Alliance. (2006). Legal Information Preservation Alliance strategic plan outline. Retrieved March 8, 2008, from http://www.aallnet.org/committee/lipa/StratPlanFinalDraft20060620.doc/
Legal Information Preservation Alliance. (n.d.a). LIPA member libraries. Retrieved April 2, 2008, from http://www.aallnet.org/committee/lipa/members.asp/
Legal Information Preservation Alliance. (n.d.b). LIPA's mission statement. Retrieved March 2, 2008, from http://www.aallnet.org/committee/lipa/mission.asp/
Library of Congress. (n.d.a). Preservation of state digital information. Retrieved March 29, 2008, from http://www.digitalpreservation.gov/partners/states.html/
Library of Congress. (n.d.b). Web capture. Retrieved March 29, 2008, from http://www.loc.gov/webcapture/index.html/
Library of Congress. (n.d.c). What is being saved. Retrieved March 29, 2008, from http://www.digitalpreservation.gov/library/collect.html/
Lopresti, R., & Gorin, M. (2002). The availability of US government depository publications on the World Wide Web. Journal of Government Information, 29(1), 17-29.
Lyons, S. (2006). Preserving electronic government information: Looking back and looking forward. Reference Librarian, 94, 207-223.
Madison, J. (1910). The writings of James Madison: Comprising his public papers and his private correspondence, including numerous letters and documents now for the first time printed: Vol. 9. 1819-1836 (G. Hunt, Ed.). New York: G.P. Putnam's Sons. Retrieved March 13, 2008, from http://0-galenet.galegroup.com.gull.georgetown.edu:80/servlet/MOML?af=RN&ae=F150083426&srchtp=a&ste=14/
Murray, K., & Phillips, M. (2007). Collaborations, best practices and collection development for born-digital and digitized materials. Paper presented at DigCCurr 2007, An International Symposium in Digital Curation. Retrieved March 29, 2008, from http://ils.unc.edu/digccurr2007/papers/murrayPhillips_paper_9-3.pdf/
NASCIO. (2007). Electronic records management and digital preservation: Protecting the knowledge assets of the state government enterprise. Retrieved March 8, 2008, from http://www.nascio.org/publications/documents/NASCIO-RecordsManagement.pdf/
Neacsu, D. (2002). Legal scholarship and digital publishing: Has anything changed in the way we do legal research? Legal Reference Services Quarterly, 21(2/3), 105-122.
Neacsu, D. (2007). Google, legal citations and electronic fickleness: Legal scholarship in the digital environment. Retrieved March 8, 2008, from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=991190/
Pardo, T.A., Burke, G.B., & Kwon, H. (2006). Preserving state government digital information: A baseline report. Retrieved March 9, 2008, from http://www.ctg.albany.edu/publications/reports/digital_preservation_baseline/
Rawan, A., & Malone, C.K. (2006). A virtual depository: The Arizona project. Reference Librarian, 45(94), 5-18.
Ross, S. (2004). The role of ERPANET in supporting digital curation and preservation in Europe. D-Lib Magazine, 10(7/8). Retrieved March 29, 2008, from http://www.dlib.org/dlib/july04/ross/07ross.html/
Rothenberg, J. (1999). Ensuring the longevity of digital information. Retrieved March 9, 2008, from http://www.clir.org/pubs/archives/ensuring.pdf/
Rumsey, M. (2002). Runaway train: Problems of permanence, accessibility and stability in the use of Web sources in law review citations. Law Library Journal, 94(1), 27-39.
Simpson, D. (2005). Digital preservation in the regions: Sample survey of digital preservation preparedness and needs of organisations at local and regional levels. Retrieved March 8, 2008, from http://www.mla.gov.uk/programmes/digital_initiatives/digital_preservation/digital_preservation_in_the_regions/
Solum, L.B. (2006). Blogging and the transformation of legal scholarship. Washington Law Review, 84, 1071-1088. Retrieved April 13, 2008, from http://papers.ssrn.com/abstract=898168/
Special Committee on Permanent Public Access to Legal Information. (2005). Member's briefing: Preservation. AALL Spectrum, 10(3), 1-4.
University of North Texas Libraries. (n.d.a). Congressional Research Service Reports, hosted by UNT Libraries. Retrieved March 29, 2008, from http://digital.library.unt.edu/govdocs/crs/
University of North Texas Libraries. (n.d.b). CyberCemetery. Retrieved March 29, 2008, from http://govinfo.library.unt.edu/
US Government Printing Office. (2003). The GPO and National Archives unite in support of permanent online public access. Retrieved March 29, 2008, from http://www.gpoaccess.gov/pr/media/2003/03news46.pdf/
US Government Printing Office. (n.d.). FDsys overview. Retrieved March 29, 2008, from http://www.gpo.gov/projects/fdsys_overview.htm/
Web-at-Risk. (n.d.). Collection plans. Retrieved April 2, 2008, from http://web3.unt.edu/webatrisk/cpg.php/
Yin, R.K. (2003). Case study research: Design and methods (3rd edn). Thousand Oaks, CA: Sage.

APPENDIX

Law Libraries & Digital Preservation: A Survey
1. Welcome

Dear Librarian:

We are two academic law librarians interested in law library activities relating to digital preservation. (Dana Neacsu, at dana.neacsu@law.columbia.edu, and Sarah Rhodes, at sjr36@law.georgetown.edu.) We have developed a survey to gauge the law library community's digital preservation efforts, and we are contacting you today to request your participation in our research. The survey is divided into the following four short sections: (a) Demographic information, (b) Digital-preservation activities, (c) Perceptions and attitudes on digital preservation, and (d) Copyright and access policies for archived digital materials. We know how busy you are, so the survey has mostly multiple-choice and scale-based questions, with very few questions requiring more than selecting a checkbox to indicate your answer. Certainly, we will be happy to share the results with all respondents. Because we are working on a deadline, we would appreciate receiving your responses within two weeks from today. Please do not hesitate to contact either one of us with questions, suggestions, etc.

Sincerely,
Sarah and Dana

2. Demographic information

1. I represent a(n):
   □ Academic Law Library  □ State Law Library  □ Other
2. If you answered 'Other,' please describe your library:
3. My position is that of:
   □ Director  □ Other
4. If you answered 'Other,' please describe your position:

3. Digital preservation activities

1. How many large-scale and/or small-scale digital preservation projects have you planned and/or executed within your library in the last five years? Please provide approx. number of projects:
2. Are these projects focused on the preservation of digitized (scanned) print items?
   □ Yes  □ No  □ Some are, but not all
3. If you answered 'yes' or 'some,' how many of your library's projects have focused on the preservation of digitized (scanned) print items? Please provide approx. number of projects:
4. Are these projects focused on the preservation of digitally born materials (originally created and disseminated in a digital format, often with no print counterpart)?
   □ Yes  □ No  □ Some are, but not all
5. If you answered 'yes' or 'some,' approximately how many of your library's projects have focused on the preservation of items digitally born? Please provide approx. number of projects:
6. How many of these projects have you successfully completed? Please provide approx. number of projects:
7. How many of these projects are currently in progress? Please provide approx. number of projects:
8. What digital repository and content management systems have been used for these projects? Select all that apply.
   □ CONTENTdm  □ DigiTool  □ DSpace  □ EPrints  □ Fedora  □ Greenstone  □ Hyperion Digital Media Archive  □ Lots of Copies Keep Stuff Safe (LOCKSS)  □ MetaSource  □ OCLC Digital Archive  □ VITAL  □ Other
9. If you answered 'other,' please name/describe your digital repository and content management systems:
10. Does your institution use ... (Select all that apply.)
   □ An in-house digital repository  □ Repository hosted offsite by vendor  □ Other
11. If you answered 'other,' please describe your repository:
12. If you use a commercial digital repository, who operates your commercial digital repository? Please provide name of vendor(s):
13. To what extent do you collaborate with other institutions and/or nonprofit/for-profit partners in developing your digital projects?
   Please mark the appropriate box (Never collaborate / Rarely collaborate / Sometimes collaborate / Almost always collaborate / Always collaborate):
   - Collaborate with libraries/institutions
   - Collaborate with nonprofit partners
   - Collaborate with for-profit partners
14. With whom have you collaborated? Please list institutions, organizations, and commercial collaborators:
15. What was/is the collection scope of your institution's digital preservation projects? (Preservation of which materials takes precedence in your projects and why?) Please explain:
16. Do you preserve primary legal materials in digital formats?
   □ Yes  □ No  □ Sometimes
17. If so, are those ... (Please select all that apply.)
   □ Domestic  □ International  □ Foreign
18. If preserving domestic primary legal materials, are those ... (Please select all that apply.)
   □ Federal  □ State  □ County  □ Municipal
19. Do you preserve secondary legal sources in digital formats?
   □ Yes  □ No  □ Sometimes
20. If so, are those ... (Please select all that apply.)
   □ Domestic  □ International  □ Foreign
21. If preserving domestic secondary legal materials, are those ... (Please select all that apply.)
   □ Federal  □ State  □ County  □ Municipal
22. Do you preserve digital items harvested from the Web?
   □ Yes  □ No  □ Sometimes
23. If so, what Web harvesting software do you utilize? Please describe:

Section 4: Perceptions and attitudes about digital preservation

1. Please use the scale below (Strongly disagree / Disagree / Neutral / Agree / Strongly agree) to indicate your level of agreement with the statements below, as applicable.
   - Law libraries should be involved in preventing the loss of law-related information published to the Web.
   - Law libraries should be involved in the preservation of digitally born government information.
   - Law libraries should be involved in preventing the loss of information published on the Web and cited within law review articles.
   - Law libraries should be involved in the long-term preservation of and sustained access to law review articles and other legal materials published digitally within subscription databases (HeinOnline, LexisNexis, Westlaw, etc.).
2. My library's level of involvement in digital preservation activities has been limited by (Strongly disagree / Disagree / Neutral / Agree / Strongly agree):
   - Lack of funding
   - Concerns about technology/file format obsolescence
   - Staffing shortages
   - Lack of staff with digital preservation/technological expertise
   - Digital preservation is not an institutional priority
   - Lack of partners/opportunities to collaborate with other libraries and organizations in digital preservation activities
   - Lack of an organized statewide/nationwide/international digital preservation movement in which to participate
3. Are there factors, not listed above, which have limited your library's level of involvement in digital preservation activities? If so, please describe:
4. The following would encourage greater involvement in digital preservation activities at my library (Strongly disagree / Disagree / Neutral / Agree / Strongly agree):
   - Increased funding
   - Increased staffing
   - Recruitment/cultivation of staff with digital preservation/technological expertise
   - Increased opportunity to collaborate with other libraries and organizations in digital preservation activities
   - Establishment of an organized statewide/nationwide/international digital preservation movement in which to participate
5. Are there factors, not listed above, which would encourage greater involvement in digital preservation activities at your library? If so, please describe:
6. Which materials, in your opinion, deserve more attention when it comes to preservation:
   □ Print materials  □ Digitally born materials
7. Why do you think either a) print or b) digitally born materials are in greater need of preservation? Please explain your choice:
8. Please describe, briefly, the role and responsibility that you envisage law libraries should take in the preservation of Web-published and digitally born legal information:

Section 5: Copyright and access policies for archived digital materials

1. Is permission from the copyright holder obtained for copyright-protected items that are being digitally archived by your library?
   □ Yes  □ No  □ Sometimes
2. Do you preserve copyright-protected materials under a claim of fair use?
   □ Yes  □ No  □ Sometimes
3. Do you preserve only materials that are in the public domain?
   □ Yes  □ No  □ Sometimes
4. How is copyright being managed for Web-harvested items? Please explain:
5. How is copyright being managed for digitized (scanned) print items? Please explain:
6. How is copyright being managed for digitally born items (not Web-harvested)? Please explain:
7. To what extent are digitally archived items made available for patron use?
   □ Fully accessible to the public online  □ Accessible only to authenticated patrons online  □ In-library access for library patrons  □ Not accessible to patrons  □ Other
8. If you answered 'Other' above, please explain how your digitally archived items are made available for patron use:
9. What limitations on access, if any, are in place? Please describe:
10. Are digitally archived items accessible via a Web database or portal, or in-house only? Please describe:

We thank you for your time and consideration in completing this survey.

work_kfybopsxgrfx5b5yx6t6afcttq ---- Canadian Journal of Neurological Sciences / Journal Canadien des Sciences Neurologiques (cambridge.org/cjn). The official journal of the Canadian Neurological Sciences Federation: Canadian Neurological Society, Canadian Neurosurgical Society, Canadian Society of Clinical Neurophysiologists, Canadian Association of Child Neurology, Canadian Society of Neuroradiology.
†Fictional patient. May not represent all patients. Indicated for the prevention of migraine in adults KATHERINE, 31 † The only anti-CGRP with both quarterly (every 3 months) and monthly dosing options indicated in migraine prevention*,1 https://www.cambridge.org/core/terms. https://doi.org/10.1017/cjn.2020.232 Downloaded from https://www.cambridge.org/core. Carnegie Mellon University, on 06 Apr 2021 at 00:36:59, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms https://doi.org/10.1017/cjn.2020.232 https://www.cambridge.org/core Volume 47 / Number 6 / November 2020 A-1 Editorial 731 Screening for Cognitive Impairment, Being Cognizant of the Liminal Deities and Demons Chetan Vekhande, Bei Jiang, Mahesh Kate 734 Canada, Neurosurgery, and 5-Aminolevulinic Acid (5-ALA): The Long and Winding Road Joseph F. Megyesi rEViEW artiClEs 736 Embryology of Spinal Dysraphism and its Relationship to Surgical Treatment Matthew E. Eagles, Nalin Gupta 747 A Focused Update on Tardive Dyskinesia Pierre J. Blanchet original artiClEs 756 Methods for Improving Screening for Vascular Cognitive Impairment Using the Montreal Cognitive Assessment Khush-Bakht Zaidi, Jill B. Rich, Kelly M. Sunderland, Malcolm A. Binns, Linda Truong, Paula M. McLaughlin, Bradley Pugh, Donna Kwan, Derek Beaton, Brian Levine, Demetrios J. Sahlas, Dariush Dowlatshahi, Ayman Hassan, Jennifer Mandzia, The ONDRI Investigators, Angela K. Troyer, Richard H. Swartz 764 Call 911: Lower Ambulance Utilization Among Young Adults, Especially Women, with Stroke Arunima Kapoor, M. Patrice Lindsay, Amy Y.X. Yu, Cristina Goia, Sheldon Cheskes, P. Richard Verbeek, Richard H. Swartz 770 Delayed Thrombectomy Center Arrival is Associated with Decreased Treatment Probability Meah M. Gao, Jeffrey Z. Wang, Jane Liao, Stephanie D. Reiter, Manav V. Vyas, Elizabeth Linkewich, Richard H. Swartz, Leodante da Costa, Charles D. Kassardjian, Amy Y. X. Yu 775 Determining Corticospinal Tract Injury from Stroke Using Computed Tomography Timothy K. Lam, Daniel K. Cheung, Seth A. Climans, Sandra E. Black, Fuqiang Gao, Gregory M. Szilagyi, George Mochizuki, Joyce L. Chen 785 Readiness for First-In-Human Neuromodulatory Interventions Iris Coates McCall, Nicole Minielly, Allison Bethune, Nir Lipsman, Patrick J. McDonald, Judy Illes 793 The Cost-Effectiveness of 5-ALA in High-Grade Glioma Surgery: A Quality-Based Systematic Review Nebras M. Warsi, Rahel Zewude, Brij Karmur, Neda Pirouzmand, Laureen Hachem, Alireza Mansouri 800 Variants in CHRNB2 and CHRNA4 Identified in Patients with Insular Epilepsy Maxime Cadieux-Dion, Simone Meneghinia, Chiara Villa, Dènahin Hinnoutondji Toffa, Ronny Wickstrom, Alain Bouthillier, Ulrika Sandvik, Bengt Gustavsson, Ismail Mohamed, Patrick Cossette, Romina Combi, Andrea Becchetti, Dang Khoa Nguyen 810 A National Spinal Muscular Atrophy Registry for Real-World Evidence Victoria L. Hodgkinson, Maryam Oskoui, Joshua Lounsberry, Saïd M’Dahoma, Emily Butler, Craig Campbell, Alex MacKenzie, Hugh J. McMillan, Louise Simard, Jiri Vajsar, Bernard Brais, Kristine M. Chapman, Nicolas Chrestian, Meghan Crone, Peter Dobrowolski, Susan Dojeiji, James J. Dowling, Nicolas Dupré, Angela Genge, Hernan Gonorazky, Simona Hasal, Aaron Izenberg, Wendy Johnston, Edward Leung, Hanns Lochmüller, Jean K. Mah, Alier Marerro, Rami Massie, Laura McAdam, Anna McCormick, Michel Melanson, Michelle M. Mezei, Cam-Tu E. Nguyen, Colleen O’Connell, Erin K. 
O’Ferrall, Gerald Pfeffer, Cecile Phan, Stephanie Plamondon, Chantal Poulin, Xavier Rodrigue, Kerri L. Schellenberg, Kathy Selby, Jordan Sheriko, Christen Shoesmith, Garth Smith, Monique Taillon, Sean Taylor, Jodi Warman Chardon, Scott Worley, Lawrence Korngut 816 Poor Yield of Routine Transthyretin Screening in Patients with Idiopathic Neuropathy Dina Namiranian, Colin Chalk, Rami Massie Volume 47 / Number 5 / September 2020 A-1 COMMENTARIES 589 Practical Guidance for Outpatient Spasticity Management During the Coronavirus (COVID-19) Pandemic: Canadian Spasticity COVID-19 Task Force Rajiv Reebye, Heather Finlayson, Curtis May, Lalith Satkunam, Theodore Wein, Thomas Miller, Chris Boulias, Colleen O’Connell, Anibal Bohorquez, Sean Dukelow, Karen Ethans, Farooq Ismail, Waill Khalil, Omar Khan, Philippe Lagnau, Stephen McNeil, Patricia Mills, Geneviève Sirois, Paul Winston 594 Tackling the Burden of Neurological Diseases in Canada with Virtual Care During the COVID-19 Pandemic and Beyond Ramana Appireddy, Shirin Jalini, Garima Shukla, Lysa Boissé Lomax ORIGINAL ARTICLES 598 The Virtual Neurologic Exam: Instructional Videos and Guidance for the COVID-19 Era Mariam Al Hussona, Monica Maher, David Chan, Jonathan A. Micieli, Jennifer D. Jain, Houman Khosravani, Aaron Izenberg, Charles D. Kassardjian, Sara B. Mitchell 604 Early Dabigatran Treatment After Transient Ischemic Attack and Minor Ischemic Stroke Does Not Result in Hemorrhagic Transformation Anas Alrohimi, Kelvin Ng, Dar Dowlatshahi, Brian Buck, Grant Stotts, Sibi Thirunavukkarasu, Michel Shamy, Hayrapet Kalashyan, Leka Sivakumar, Ashfaq Shuaib, Mike Sharma, Ken Butcher 612 Endovascular Thrombectomy for Low ASPECTS Large Vessel Occlusion Ischemic Stroke: A Systematic Review and Meta-Analysis Jose Danilo B. Diestro, Adam A. Dmytriw, Gabriel Broocks, Karen Chen, Joshua A. Hirsch, Andre Kemmling, Kevin Phan, Aditya Bharatha 620 Detecting Subtle Cognitive Impairment in Multiple Sclerosis with the Montreal Cognitive Assessment Kim Charest, Alexandra Tremblay, Roxane Langlois, Élaine Roger, Pierre Duquette, Isabelle Rouleau 627 Effect of Sleep Disorder on Delirium in Post-Cardiac Surgery Patients Hongbai Wang, Liang Zhang, Qipeng Luo, Yinan Li, Fuxia Yan 634 Efficacy and Acceptance of a Lombard-response Device for Hypophonia in Parkinson’s Disease Scott Adams, Niraj Kumar, Philippe Rizek, Angeline Hong, Jenny Zhang, Anita Senthinathan, Cynthia Mancinelli, Thea Knowles, Mandar Jog 642 Disparities in Deep Brain Stimulation Use for Parkinson’s Disease in Ontario, Canada James A.G. Crispo, Melody Lam, Britney Le, Lucie Richard, Salimah Z. Shariff, Dominique R. Ansell, Melanie Squarzolo, Connie Marras, Allison W. Willis, Dallas Seitz 656 Diagnostic Yield of MRI for Sensorineural Hearing Loss – An Audit Helen Wong, Yaw Amoako-Tuffour, Khunsa Faiz, Jai Jai Shiva Shankar 661 Influence of Optic Nerve Appearance on Visual Outcome in Pediatric Idiopathic Intracranial Hypertension Jonathan A. Micieli, Beau B. Bruce, Caroline Vasseneix, Richard J. Blanch, Damian E. Berezovsky, Nancy J. Newman, Valérie Biousse, Jason H. Peragallo 666 Association between Graduate Degrees and Publication Productivity in Academic Neurosurgery Michael B. Keough, Christopher Newell, Alan R. Rheaume, Tejas Sankar 675 A Diverse Specialty: What Students Teach Us About Neurology and “Neurophobia” Fraser G. A. Moore NEUROIMAGING HIGHLIGHTS 681 Multiphase CT Angiography for Evaluation and Diagnosis of Complex Spinal Dural Arteriovenous Fistula Sudharsan Srinivasan, Zachary M. 
Wilseck, Joseph R. Linzey, Neeraj Chaudhary, Ashok Srinivasan, Aditya S. Pandey https://www.cambridge.org/core/terms. https://doi.org/10.1017/cjn.2020.232 Downloaded from https://www.cambridge.org/core. Carnegie Mellon University, on 06 Apr 2021 at 00:36:59, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms https://doi.org/10.1017/cjn.2020.232 https://www.cambridge.org/core Volume 47 / Number 6 / November 2020 A-2 nEUroiMaging HigHligHts 820 Congenital Herniation of the Gyrus Rectus Resulting in Compressive Optic Neuropathy Kia Gilani, Pejman Jabehdar Maralani, Arun NE Sundaram 822 Supranuclear Horizontal Gaze Palsy Following Anterior Internal Capsule Hemorrhage Nathan Chu, Grayson Beecher, Mohammed Wasif Hussain 824 Superior Oblique Myokymia Presumed Due to Large Posterior Fossa Arteriovenous Malformation Laura Donaldson, Brian van Adel, Amadeo R. Rodriguez 826 Three-Dimensional Computed Tomography Reconstruction Unmasks Shunt Disconnection in a Child Jignesh K. Tailor, Ian C. Coulter, Michael C. Dewan, Helen M. Branson, Peter B. Dirks, James T. Rutka 828 Megalencephaly–Capillary Malformation–Polymicrogyria with Cerebral Venous Thrombosis Olivier Fortin, Mohammed Ashour, Caroline Lacroix, Christine A. Sabapathy, Kenneth A. Myers BriEF CoMMUniCations 830 Access to Allied Health Care Services in Canadian Interdisciplinary Complex Nerve Injury Programs Kristine M. Chapman, Chris Doherty, Sean G. Bristol, Russell O’Connor, Michael J. Berger 834 Access to Focal Spasticity Care: A Cross Canada Survey of Physiatrists Kevin E. Liang, Pham Vivian Ngo, Paul Winston 839 Successful Management of Glioblastoma Chemotherapy- Associated Dysgeusia with Gabapentin Karen Turcotte, Charles Jean Touchette, Christian Iorio-Morin, David Fortin 842 Low Seroprevalence of Lyme Disease Among Multiple Sclerosis Patients in New Brunswick Gregg MacLean, Peggy Cook, L. Robbin Lindsay, Todd F. Hatchette, Duncan Webster rEFlECtions 845 Michael W. Nicolle: Canadian Leader in Neurology Ario Mirian lEttErs to tHE Editor 847 The Impact of the Covid-19 Pandemic on Stroke Volume Christopher R. Pasarikovski, Leodante da Costa 849 COVID-19 Presenting With Thalamic Hemorrhage Unmasking Moyamoya Angiopathy Ritwik Ghosh, Souvik Dubey, Biman Kanti Ray, Subhankar Chatterjee, Julián Benito-León 852 Guillain-Barré Syndrome with Facial Diplegia Related to SARS-CoV-2 Infection Jason L. Chan, Hamid Ebadi, Justyna R. Sarna 855 Association Between Hypomimia and Mild Cognitive Impairment in De Novo Parkinson’s Disease Patients Carmen Gasca-Salas, Daniele Urso 858 FA2H Mutations in a Young Adult Presenting as an Isolated Cognitive Impairment Syndrome Luis André Leal Ferman, Mathieu Lévesque, Sébastien Lévesque, Christian Bocti 861 Alien Limb Phenomenon as a Heralding Manifestation of Toxic Leukoencephalopathy Pranjal Gupta, Deepa Dash, Rajesh Kumar Singh, Leve Joseph Devarajan Sebastian, Manjari Tripathi Volume 47 / Number 5 / September 2020 A-2 683 Heavy Eye Syndrome Mimicking Abducens Nerve Palsies Caberry W. Yu, Jonathan A. Micieli 685 Vertical One-and-a-Half Syndrome Due to Metastatic Spindle Cell Carcinoma of the Lung Elie Côté, Jonathan A. Micieli 687 Prolonged Hyperperfusion in a Child With ATP1A2 Defect-Related Hemiplegic Migraine Katherine Cobb-Pitstick, Dana D. 
Cummings, Giulio Zuccoli 689 Marchiafava–Bignami Disease: Two Chronologically Distinct Stages in the Same Patient Miguel Quintas-Neves, José Manuel Amorim, João Paulo Soares-Fernandes 691 A Glioma Presenting as a Posterior Circulation Stroke Fangshi Lu, Amy Fowler, Keith Tam, Carlos R. Camara-Lemarroy BRIEF COMMUNICATIONS 693 COVID-19: Stroke Admissions, Emergency Department Visits, and Prevention Clinic Referrals Maria Bres Bullrich, Sebastian Fridman, Jennifer L. Mandzia, Lauren M. Mai, Alexander Khaw, Juan Camilo Vargas Gonzalez, Rodrigo Bagur, Luciano A. Sposato 697 Ischemic Monomelic Neuropathy: The Case for Reintroducing a Little-Known Term Paul Winston, Dannika Bakker 700 Perception of Healthcare Access and Utility of Telehealth Among Parkinson’s Disease Patients Dakota Peacock, Peter Baumeister, Alex Monaghan, Jodi Siever, Joshua Yoneda, Daryl Wile LETTERS TO THE EDITOR 705 Effects of Rapid Eye Movement Sleep in Anti-NMDAR Encephalitis With Extreme Delta Brush Pattern Dhruv Jain, Marcus C. Ng 709 AMPA-R Limbic Encephalitis Associated with Systemic Lupus Erythematosus Zoya Zaeem, Collin C. Luk, Dustin Anderson, Gregg Blevins, Zaeem A. Siddiqi 711 Crossed Zoster Syndrome: A Rare Clinical Presentation Following Herpes Zoster Ophthalmicus Andrea M. Kuczynski, Carla J. Wallace, Ryan Wada, Kenneth L. Tyler, Ronak K. Kapadia 714 Recurrent Abducens Palsy in Relapsing-Remitting Multiple Sclerosis Sanskriti Sasikumar, Chantal Roy-Hewitson, Caroline Geenen, Dale Robinson, Felix Tyndel 716 Trigeminal Autonomic Cephalalgia Secondary to Spontaneous Trigeminal Hemorrhage Mathieu Levesque, Christian Bocti, François Moreau 719 Cluster Headache with Temporomandibular Joint Pain Tommy Lik Hang Chan, David Dongkyung Kim, Werner J. Becker 721 Location, Location: The Clue to Aetiology in Cerebellar Bleeds Stephen A. Ryan, Sandra E. Black, Julia Keith, Richard Aviv, Victor X.D. Yang, Mario Masellis, Julia J. Hopyan 724 Aerococcus Urinae Endocarditis Presenting with Bilateral Cerebellar Infarcts Kaie Rosborough, Bryce A. Durafourt, Winnie Chan, Peggy DeJong, Ramana Appireddy 727 Unexpected Progressive Multifocal Leukoencephalopathy in a Hemodialysis Patient Bryce A. Durafourt, John P. Rossiter, Moogeh Baharnoori Cover image: Prolonged Hyperperfusion in a Child With ATP1A2 Defect-Related Hemiplegic Migraine. Katherine Cobb-Pitstick, Dana D. Cummings, Giulio Zuccoli See pages 687-688. https://www.cambridge.org/core/terms. https://doi.org/10.1017/cjn.2020.232 Downloaded from https://www.cambridge.org/core. Carnegie Mellon University, on 06 Apr 2021 at 00:36:59, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms https://doi.org/10.1017/cjn.2020.232 https://www.cambridge.org/core Volume 47 / Number 6 / November 2020 A-3 864 Recurrent MOG-IgG Optic Neuritis Initially Attributed to Sjögren’s Syndrome Jennifer Ling, Jonathan A. Micieli 866 Unruptured Posterior Cerebral Artery Aneurysm Presenting with Temporal Lobe Epilepsy Leeor S. Yefet, Mostafa Fatehi, Yahya Aghakhani, Gary Redekop 869 Glioblastoma Spinal Cord Metastasis With Short-Term Clinical Improvement After Radiation Seth A. Climans, Indu S. Voruganti, David G. Munoz, Warren P. 
Mason addEndUM 872 Rapid Eye Movement (REM) Sleep Behavior Disorder and REM Sleep with Atonia in the Young – ADDENDUM Garima Shukla, Anupama Gupta, Kamalesh Chakravarty, Angela Ann Joseph, Aathira Ravindranath, Manju Mehta, Sheffali Gulati, Madhulika Kabra, Afsar Mohammed, Shivani Poornima Cover image: Embryology of Spinal Dysraphism and its Relationship to Surgical Treatment. Matthew E. Eagles, Nalin Gupta See pages 736-746. Volume 47 / Number 5 / September 2020 A-3 Editor-in-Chief/Rédactueur en chef Robert Chen t o ro n t o, o n Associate Editors/Rédacteurs associés Robert Hammond london, on Philippe Hout m o n t rea l, qc Mahendranath Moharir t oron t o, on Tejas Sankar e dmo n t on, a b Manas Sharma lo n do n, o n Jeanne Teitelbaum mo n t re a l, qc Richard Wennberg toronto, on Past Editors/Anciens rédacteurs en chef G. Bryan Young lo n do n, o n Douglas W. Zochodne ca lg a ry, a b James A. Sharpe to ron t o, o n Robert G. Lee ca lg a ry, ab Robert T. Ross w i n n i p eg, mb (Emeritus Editor, Founding Editor) Editorial Board/Comité éditorial Jorge Burneo, lo n do n, o n Jodie Burton, ca lga ry, a b Colin Chalk, mo n trea l, qc K. Ming Chan, e dmo n to n, a b Alan Goodridge, st. jo h n’s, n l Mark Hamilton, ca lg a ry, ab Michael Hill, ca lga ry, ab Alan C. Jackson, w i n n i p eg, mb Draga Jichici, ha m i lto n, o n Suneil Kalia, to ro n t o, o n Daniel Keene, o t towa, o n Julia Keith, to ro n t o, on Nir Lipsman, to ro n to, o n Stephen Lownie, lo n do n, o n Jian-Qiang Lu, h ami lt o n, o n Patrick McDonald, van co u v e r, bc Joseph Megyesi, lo n do n, o n Tiago Mestre, o t t owa, o n Sarah Morrow, lo n do n, o n Michael Nicolle, lo n don, on Ian Parney, rochester, mn Narayan Prasad, l o n do n, o n Alex Rajput, saskat oo n, sk Kesh Reddy, hami lt on, o n Larry Robinson, toro n to, o n Ramesh Sahpaul, n ort h van cou ver, bc Dipanka Sarma, t o ro n to, o n Madeleine Sharpe, mon t re al, qc Sean Symons, t oro n t o, o n Brian Toyota, va n co u ver, bc Brian Weinshenker, roc h est e r, m n Sam Wiebe, ca lga ry, a b Jefferson Wilson, to ro n t o, o n Eugene Yu, to ro n to, o n Journal Staff/Effectif du Journal Dan Morin ca l ga ry, a b Chief Executive Officer Donna Irvin ca lga ry, a b CNSF Membership Services / Communications Officer The official journal of: / La Revue officielle de: The Canadian Neurological Society La Société Canadienne de Neurologie The Canadian Neurosurgical Society La Société Canadienne de Neurochirurgie The Canadian Society of Clinical Neurophysiologists La Société Canadienne de Neurophysiologie Clinique The Canadian Association of Child Neurology L’ Association Canadienne de Neurologie Pédiatrique The Canadian Society of Neuroradiology La Société Canadienne de Neuroradiologie The permanent secretariat for the five societies and the Canadian Neurological Sciences Federation is at: Le sécretariat des cinq associations et de la Fédération des sciences neurologiques du Canada est situe en permanence à : 143N – 8500 Macleod Trail SE Calgary, Alberta T2H 2N1 Canada CNSF (403) 229-9544 Fax (403) 229-1661 The Canadian Journal of Neurological Sciences is published bi-monthly. The annual subscription rate for Individuals (electronic) is £149/US$245. The annual subscription rate for Institutions (electronic) is £197/US$327. See for full details including taxes; e-mail: subscriptions_ newyork@cambridge.org. Send address changes to Journals Fulfillment Department, Cambridge University Press, One Liberty Plaza, New York, NY 10006 USA; usmemberservices@cambridge.org. 
The Canadian Journal of Neurological Sciences is included in the Cambridge Core service, which can be accessed at cambridge.org/cjn. For information on other Cambridge titles, visit www.cambridge.org. For advertising rates contact M. J. Mrvica Associates, 2 West Taunton Avenue, Berlin, NJ 08009; Phone: 856-768-9360; Fax: 856-753-0064; Email: mjmrvica@mrvica.com.

Le Journal Canadien des Sciences Neurologiques est publié tous les deux mois. Le prix d'abonnement annuel pour les individus (électronique) est 149£/245US$. Le prix d'abonnement annuel pour les établissements (électronique) est 197£/327US$. Veuillez consulter pour tous les détails, y compris les taxes; email: subscriptions_newyork@cambridge.org. Envoyer les changements d'adresses aux Journals Fulfillment Department, Cambridge University Press, One Liberty Plaza, New York, NY 10006 USA; usmemberservices@cambridge.org. Le Journal canadien des sciences neurologiques est inclus dans le service Cambridge Core, accessible à cambridge.org/cjn. Pour plus d'informations sur les titres disponibles chez Cambridge, veuillez consulter www.cambridge.org. Pour les tarifs de publicité, contacter M. J. Mrvica Associates, 2 West Taunton Avenue, Berlin, NJ 08009; Téléphone: (1) 856-768-9360; Email: mjmrvica@mrvica.com.

This journal is indexed by / Cette revue est indexée par: Adis International, ArticleFirst, BIOBASE, BioLAb, BiolSci, BIOSIS Prev, Centre National de la Recherche Scientifique, CSA, CurAb, CurCont, De Gruyter Saur, E-psyche, EBSCO, Elsevier, EMBASE, FRANCIS, IBZ, Internationale Bibliographie der Rezensionen Geistes- und Sozialwissenschaftlicher Literatur, MEDLINE, MetaPress, National Library of Medicine, OCLC, PE&ON, Personal Alert, PsycFIRST, PsycINFO, PubMed, Reac, RefZh, SCI, SCOPUS, Thomson Reuters, TOCprem, VINITI RAN, Web of Science.

ISSN: 0317-1671  EISSN: 2057-0155

COPYRIGHT © 2020 by THE CANADIAN JOURNAL OF NEUROLOGICAL SCIENCES INC. All rights reserved. No part of this publication may be reproduced, in any form or by any means, electronic, photocopying, or otherwise, without permission in writing from Cambridge University Press. Policies, request forms and contacts are available at: http://www.cambridge.org/about-us/rights-permissions. Permission to copy (for users in the U.S.A.) is available from Copyright Clearance Center: http://www.copyright.com, email: info@copyright.com.

COPYRIGHT © 2020 du THE CANADIAN JOURNAL OF NEUROLOGICAL SCIENCES INC. Tous droits réservés. Aucune partie de cette publication ne peut être reproduite, sous quelque forme ou par quelque procédé que ce soit, électronique ou autre, y compris la photocopie, sans l'accord écrit de Cambridge University Press. Les politiques, les formulaires de demande et les contacts sont disponibles à: http://www.cambridge.org/about-us/rights-permissions. La permission de copier (pour les utilisateurs aux États-Unis) est disponible auprès Copyright Clearance Center: http://www.copyright.com, email: info@copyright.com.

work_kvdv3kamkjhybjxivuhulj3cna ---- GLOBAL COLLECTIVE RESOURCES: A Study of Monographic Bibliographic Records in WorldCat

Report of a Study conducted under the auspices of an OCLC/ALISE 2001 Research Grant
by Anna H. Perrault
July 2002

GLOBAL COLLECTIVE RESOURCES

Abstract

In 2001, WorldCat, the primary international bibliographic utility, contained 45 million records with over 750 million library location listings. These records span over 4,000 years of recorded knowledge in 377 languages.[1] Under the auspices of an OCLC/ALISE research grant, a bibliometric study was conducted of WorldCat. A 10% systematic random sample of the database was analyzed utilizing the OCLC iCAS product to profile the monographic bibliographic records in WorldCat by type of library, subject, language, and publication date parameters. The profile details the "information commons" of global publication made accessible through the OCLC international network. There were 3,378,272 usable records from the 10% systematic random sample, of which 2,199,165 records had call numbers and could be analyzed by subject. Five types of library groupings were established for the study: research, academic, public, special, and school.
The research libraries grouping has the largest number of records in the sample with call numbers, at 1,745,034. The missions of the different types of libraries can be discerned in the subject profiles for each library grouping. Among the findings of the study are that the profile of WorldCat by time period and by subject divisions is mirrored in the profile of the grouping of research libraries. Of all of the records in the 10% sample, approximately 65% are English language materials, with 35% for foreign language materials. The analyses by number of unique records and title overlap demonstrate that the universe of materials under bibliographic control in WorldCat shows a high level of diversity of resources, with 53% of records having only one library location symbol. The number of records in the analysis shows a sharp decline by most measures from 1992 to the last imprint year in the study. An analysis was performed of the records in the sample with ISBN numbers, finding that only 21% of the 3 million plus records in the study had ISBN numbers. This can be due to the number of retrospective titles published before the numbering system came into use and also the number of publications that are not from mainstream publishers. But for publications since 1970, 57% of all records with call numbers have ISBN numbers, leaving an intriguing 43% of records with call numbers that do not have ISBN numbers. The findings establish that WorldCat is a rich resource for cataloging records, verification of the existence of titles, and identifying prospective materials for resource sharing. As OCLC continues to implement its Global Strategy, "Extending the Cooperative," the number of international members and thus foreign language records and unique titles may continue to increase.

[1] OCLC Newsletter, Jan./Feb. 2001, no. 249, pp. 6-7.
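The sampling design named in the abstract is simple to reproduce. As a minimal illustrative sketch only (the study itself used OCLC's iCAS service, and the file name and record format below are hypothetical), a systematic 10% sample takes every tenth record after a random starting offset:

import random

def systematic_sample(records, rate=0.10):
    """Systematic random sample: every k-th record (k = 1/rate)
    starting from a randomly chosen offset in [0, k)."""
    k = round(1 / rate)          # sampling interval; 10 for a 10% sample
    start = random.randrange(k)  # random starting offset
    return records[start::k]     # every k-th record from the offset

# Hypothetical usage: one bibliographic record number per line
with open("records.txt") as f:
    all_records = [line.strip() for line in f]

sample = systematic_sample(all_records)
print(f"{len(sample)} of {len(all_records)} records sampled")

Unlike a simple random sample, this keeps the sample spread evenly across the file, which is why it is convenient for very large, sequentially numbered databases.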
GLOBAL COLLECTIVE RESOURCES

Table of Contents*

Abstract
Table of Contents
List of Tables and Figures
Acknowledgments
About the Author
Chapter One: Global Collective Resources (http://www.oclc.org/research/grants/reports/perrault/chapter1final.pdf)
    Introduction; The Problem; Review of Related Research; Methods
Chapter Two: WorldCat: The Profile (http://www.oclc.org/research/grants/reports/perrault/chapter2final.pdf)
    The Sample; Subject Analysis; Summary
Chapter Three: Library Groupings (http://www.oclc.org/research/grants/reports/perrault/chapter3final2.pdf)
    Research Libraries; Academic Libraries; Special Libraries; Public Libraries and School Libraries: Bibliographic Records by Audience Level; Public Libraries Subject Analysis; School Libraries Subject Analysis
Chapter Four: Diversity of Resources (http://www.oclc.org/research/grants/reports/perrault/chapter4final.pdf)
    Unique Titles; WorldCat and Research Libraries; Library Groupings; Title Overlap
Chapter Five: Language Analysis (http://www.oclc.org/research/grants/reports/perrault/chapter5final.pdf)
    English, Non-English; Foreign Language Groupings; Subject Analysis for the Seven Language Groupings
Chapter Six: ISBN Analysis (http://www.oclc.org/research/grants/reports/perrault/chapter6final.pdf)
Chapter Seven: Summary and Conclusions (http://www.oclc.org/research/grants/reports/perrault/chapter7final.pdf)
    Findings; Decline in the Number of Records for Current Years; Implications; Further Research; Conclusion

Site Links
OCLC list of libraries: http://www.oclc.org/contacts/libraries/
OCLC iCAS: http://www.oclc.org/western/products/aca/icas.htm
OCLC Annual Reports: http://www.oclc.org/about/annualreport/

*Organization of the Report
This report has been designed as a web document with hyperlinks for easy navigation. Each chapter is self-contained with attached tables and figures. Chapters can be reached through the hyperlinks in the Table of Contents and also through the links at the end of each chapter to the next chapter. Each table or figure is hyperlinked from the title within the text to the attached table or figure. In chapter one, the references are hyperlinked from the superscript number to the notes at the end. Chapters with a small number of references have the references in footnotes on that page.

GLOBAL COLLECTIVE RESOURCES

List of Tables and Figures

Chapter Two (http://www.oclc.org/research/grants/reports/perrault/chapter2final.pdf)
Table 2-1 WorldCat Monographic Bibliographic Records: All Titles Held by Date (All Subject Divisions)
Table 2-2 WorldCat Monographic Bibliographic Records: Percentage Increase/Decrease by Time Period
Table 2-3 WorldCat Records: Subject Divisions, 50-Year Range
Table 2-4 WorldCat Records: Subject Divisions, 10-Year Range

Chapter Three (http://www.oclc.org/research/grants/reports/perrault/chapter3final2.pdf)
Table 3-1 Library Groupings: Total Number of Bibliographic Records from Subject Analysis
Table 3-2 Research Libraries: Percentage Increase/Decrease by Years
Table 3-3 Research Libraries: Percentage Increase/Decrease by Years
Table 3-4 Academic Libraries: Percentage Increase/Decrease by Years
Table 3-5 Academic Libraries: Percentage of Total by Subject Division
Table 3-6 Special Libraries: Percentage Increase/Decrease by Years
Table 3-7 Special Libraries: Percentage of Total by Subject Division
Table 3-8 Public Libraries: Percentage of Adult/Juvenile
Table 3-9 Public and School Libraries by Years
Table 3-10 School Libraries: Percentage of Adult/Juvenile
Table 3-11 Each Library Group by Subject Division
Chapter Four (http://www.oclc.org/research/grants/reports/perrault/chapter4final.pdf)
Table 4-1 WorldCat Unique Records: All Titles Held by Date (All Divisions)
Table 4-2 Unique Records Research Libraries: All Titles Held by Date (All Divisions)
Table 4-3 Total Number of Unique Bibliographic Records: WorldCat and Research Libraries
Table 4-4 Total Number of Unique Records by Decade: WorldCat and Research Libraries
Table 4-5 Unique Records by Subject Division: WorldCat and Research Libraries
Table 4-6 Unique Bibliographic Records From Subject Analysis by Library Grouping
Table 4-7 Unique Records by Subject by Library Groupings
Table 4-8 All Libraries: Title Overlap Between Groups
Table 4-9 Total Number of Shared Records by Library Grouping
Figure 4-1 WorldCat and Unique Research Record Comparison

Chapter Five (http://www.oclc.org/research/grants/reports/perrault/chapter5final.pdf)
Table 5-1 WorldCat: English and All Non-English as a Percentage of Total Subject Records
Table 5-2 Research Libraries: English and All Non-English as a Percentage of Total Subject Records
Table 5-3 Academic Libraries: English and All Non-English as a Percentage of Total Subject Records
Table 5-4 Increase/Decrease by Decades for English/Non-English Records: Academic and Research Libraries
Table 5-5 Increase/Decrease by Five-Year Periods, 1980-1999 for English/Non-English Language Groupings: Academic and Research Libraries
Table 5-6 Foreign Language Groupings as a Percentage of Total Records
Table 5-7 Foreign Language Groupings: Number of Titles by Time Period
Table 5-8 Language Records 1985-1999: Academic and Research Libraries
Table 5-9 WorldCat: Language Titles by Subject Division

Chapter Six (http://www.oclc.org/research/grants/reports/perrault/chapter6final.pdf)
Table 6-1 WorldCat ISBN Analysis: All Titles Held by Date (All Divisions)
Table 6-2 WorldCat English ISBN
Table 6-3 WorldCat Foreign ISBN
Table 6-4 WorldCat, English, and Foreign ISBN Analysis by Time Period
Table 6-5 Comparison of Total WorldCat Records with ISBN Totals
Table 6-6 WorldCat ISBN Tables: Records With Call Number Present by Subject

Chapter Seven (http://www.oclc.org/research/grants/reports/perrault/chapter7final.pdf)
Table 7-1 and Figure 7-1 Annual Increase/Decrease in Number of WorldCat Records 1990-2000
Table 7-2 and Figure 7-2 Annual Increase/Decrease in Number of Records 1990-2000: Academic and Research Libraries
Table 7-3 and Figure 7-3 WorldCat and Research Libraries Unique Records 1990-2000
Table 7-4a and Figure 7-4a WorldCat English and Non-English Language Records 1990-2000
Table 7-4b and Figure 7-4b Research Libraries English and Non-English Language Records 1990-2000
Table 7-4c and Figure 7-4c Academic Libraries English and Non-English Language Records 1990-2000
Table 7-5 and Figure 7-5 WorldCat Records with ISBN Numbers
Table 7-6 and Figure 7-6 WorldCat, English, and Foreign Language Records with ISBN Numbers
Acknowledgments

A number of people contributed to the success of this project. Sally Loken of WLN and Ed O'Neill of the OCLC Office of Research both lent their support to the idea for the project when I first broached it to them. The project was endorsed and supported by the administration of the Lacey Product Center and the staff who produce the iCAS products, including Scott Barringer, Paul Brogger, Eric Kraig, Will Ryan, Ann Marie Wehrer and Glenda Lins. The University of South Florida granted me a full semester of sabbatical leave in Fall 2001. The analysis of data began during that time. The support and cooperation I received from the Director of the School of Library and Information Science at the University of South Florida, Vicki L. Gregory, and my colleagues are greatly appreciated. Graduate assistants Jennifer Boucher and Monica Jenkins took an interest in the project and helped with the data analysis. Rich Austin, also of USF, readied the manuscript for the web. I am thankful to all of these people.

About the Author

Anna H. Perrault is an Associate Professor in the School of Library and Information Science at the University of South Florida in Tampa. Her research in collection analysis and assessment has been frequently cited and has won several awards, including the ALCTS/Blackwell's Scholarship Award, the LAPT Research Award, and an OCLC/ALISE Research Grant. Dr. Perrault has conducted collection analysis and assessment projects in Louisiana and with the College Center for Library Automation (CCLA), the network for community colleges in Florida. She has conducted collection assessment workshops throughout the Southeast for SOLINET. Perrault is a member of the Center for Research Libraries/Big 12 Plus Working Group to develop measures for quantifying, evaluating and maximizing the economic benefits of coordinated cooperative collection development projects. A complete vita and publication bibliography including an impact statement can be accessed at the web site below.

Contact information: Anna H. Perrault, perrault@chuma1.cas.usf.edu, (813) 974-6844, FAX (813) 974-6840, http://www.cas.usf.edu/lis/faculty/perra.html

OCLC Research Grant Site Title Page: GLOBAL COLLECTIVE RESOURCES: A Study of Monographic Bibliographic Records in WorldCat. Abstract; Table of Contents; Organization of the Report; List of Tables and Figures; Acknowledgments; About the Author; Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapter 6; Chapter 7; Print version (All sections in 1 file; 164 pp.)

work_kvoiegc2fjhw7av2yo6tncomcu ---- Microsoft Word - RIM Systems_WuStviliaLee_062316_v7.doc

Exploring Researchers' Participation in Online Research Identity Management Systems

Shuheng Wu, Queens College, The City University of New York, 65-30 Kissena Blvd., Queens, NY, Shuheng.Wu@qc.cuny.edu
Besiki Stvilia, Florida State University, 142 Collegiate Loop, Tallahassee, FL, bstvilia@fsu.edu
Dong Joon Lee, Texas A&M University, 5000 TAMU, College Station, TX, djlee@library.tamu.edu

ABSTRACT

Prior studies have identified a need for engaging researchers in providing and curating their identity data. This poster reports preliminary findings of a qualitative study exploring how researchers use and engage in online research identity management (RIM) systems. The findings identify nine activity- or task-related motivations for using RIM systems.
This study also identified three levels of participation in RIM systems: Readers, Personal Record Managers, and Community Members. Most participants of this study fell into the category of Personal Record Managers, who may maintain their own profiles in a RIM system. This suggests that a majority of researchers may be willing to maintain their research identity profiles. Institutional repository managers may consider recruiting researchers as not only research information and data providers, but also curators of their own research identity data.

Keywords
Research identity management systems, motivations, engagement, ResearchGate, Google Scholar.

INTRODUCTION

There are different research identity management systems, often referred to as research information management (RIM) systems, from publishers, libraries, universities, search engines, and content aggregators (e.g., Google Scholar, ORCID, ResearchGate). These systems employ different approaches to curating research identity information or data: manual curation by information professionals and/or users (including the subject of the identity data), automated data mining and curation scripts (aka bots), and some combination of the above. With universities engaging in curating digital scholarship produced by their faculty members, staff, and students through institutional repositories (IRs), some of these universities and their IRs try to manage the research identity profiles of their contributors locally (e.g., Expertnet.org, Stanford Profiles).

While knowledge curation by professionals usually produces the highest quality results, it is costly and may not be scalable (Salo, 2009). Libraries and IRs may not have sufficient resources to control the quality of large-scale uncontrolled metadata, often batch harvested and ingested from faculty-authored websites and journal databases. They may need help from IR contributors and users to control the quality of research identity data. The literature on online communities shows that successful peer curation communities that are able to attract and retain enough participants can provide scalable knowledge curation solutions of a quality comparable to that of professionally curated content (Giles, 2005). Hence, the success of online RIM systems may depend on the number of contributors and users they are able to recruit, motivate, and engage in research identity data curation.

There is a significant body of research (e.g., Cosley et al., 2006; Nov, 2007; Stvilia et al., 2008) on what makes peer knowledge creation and curation communities successful. However, most of the previous research has focused on encyclopedia, question answering, and citizen science communities. There has been little investigation of the peer curation of research identity data. This study explores how researchers use and participate in RIM systems. In particular, it addresses the need for greater knowledge of how to design scalable and reliable solutions for research identity data curation by examining researchers' perceived value of research identity data and services, and their motivations to engage in online RIM systems. Findings can enhance our knowledge of the design of research identity data/metadata services, and of mechanisms for recruiting and retaining researchers to provide and maintain their research identity data.
RELATED WORK

There have been considerable deliberations on the needs for and uses of research identity data, and on how to manage it effectively, in Library and Information Science (LIS) research and practice communities (e.g., the NISO Altmetrics Initiative; the Research Data Alliance). An OCLC Task Group aiming to register researchers in authority files identified five stakeholder groups of research identity data: researchers, funders, university administrators, librarians, and aggregators (OCLC Research, 2014). For the researchers stakeholder group, the Task Group formulated five needs: disseminate research, compile all publications and other scholarly output, find collaborators, ensure network presence is correct, and retrieve others' scholarly output to track a given discipline. This set of needs was compiled based on the expert opinions of the Group members, supplemented with a scenario-based analysis. It would be valuable to test this typology empirically and to investigate what the disincentives may be for researchers to participate in online research identity data sharing and curation.

Different units in universities (e.g., the office of research) are increasingly interested in collecting and analyzing research output for reporting, accreditation, and organizational reputation management. Those activities and interests overlap with the traditional interests of academic libraries, which have to better align their digital services with those broader organizational needs and priorities (Dempsey, 2014; Tenopir et al., 2012). One approach would be to add research identity management services to IRs (Palmer, 2013). There is evidence from practice that adding research identity management services to an IR might increase researchers' interest in the IR (Dempsey, 2014). However, increased interest in an IR might not always translate into increased use of the IR and/or increased engagement in research identity data curation, as multiple RIM systems (e.g., ResearchGate, Academia.edu) offer similar services and compete for researchers' attention and contributions. Relying solely on automated mining, extraction, and aggregation of research identity data might result in poor quality. Libraries need researcher engagement in identity data curation to provide scalable and high quality research identity management services.

The online community literature shows that volunteer knowledge curators in open peer-production systems like Wikipedia are mostly driven by intrinsic motivations such as their interests in specific areas (Nov, 2007; Stvilia et al., 2008). Previous studies have also examined user motivations to contribute in other online communities. Ames and Naaman (2007) interviewed 13 "heavy" users of a Flickr application and identified four types of motivations for tagging: self-organization, self-communication, social-organization, and social-communication. A study of Flickr collections by Stvilia and Jörgensen (2009) listed eight motivations members might have when organizing photographs into groups. Nov et al. (2010) found a positive relationship between the motivation of building reputation in the community and the amount of metainformation (i.e., tags) provided.
Similarly, in a study examining an online network of legal professionals, Wasko and Faraj (2005) found a significant positive effect of building reputation on the quality and volume of knowledge contribution. The online communities literature provides valuable insights for designing RIM systems and for building and maintaining user communities around those systems. However, more empirical research is needed to understand what motivates researchers to engage in RIM systems.

RESEARCH METHOD
Guided by activity theory (AT; Engeström, 1987), this study employed semi-structured interviews (Blee & Taylor, 2002) to answer the following research questions:
1. How do researchers use online RIM systems?
2. What are the levels of researcher engagement in online RIM systems?
This study defines the research population as employees and students of institutions that have an IR and are classified as Research Universities in the Carnegie Classification of Institutions of Higher Education. Participants were required to have at least one peer-reviewed research publication and to have used at least one RIM system by the time of the interviews. This study used AT and literature analysis to develop an interview questionnaire. The authors conducted semi-structured interviews with eight researchers from four institutions regarding their use of and participation in RIM systems. One participant was a full professor, one was an associate professor, one was an assistant professor, two were postdoctoral researchers, and three were doctoral students. All interviews, ranging from 17 to 68 minutes, were audio recorded, transcribed, and coded with NVivo 10. Two of the authors independently coded all the interviews using an initial coding scheme based on activity theory and literature analysis. After comparing, discussing, and resolving any differences in their coding, the two authors formed a new coding scheme with emergent codes and subcategories and recoded all interviews.

FINDINGS
Activities and motivations
From analysis of the interviews, we identified the activities in which researchers engaged using RIM systems (see Table 1) and the motives for those activities.

Find relevant literature
One of the most frequent activities for which RIM systems are used is literature search. Outcomes of this activity can be used as input to other scholarly activities such as literature analysis, manuscript writing, or planning a research project. The literature search activity may include four actions: search, determine, select, and obtain. Researchers may use different RIM systems for different types of searches (e.g., known-item, subject, or navigational searches) based on the strengths or capabilities of those systems. One participant explained how he used ResearchGate and Google Scholar for different purposes:

I think they have different functions. Like for ResearchGate I can follow some people, so I can have their most recent papers. But sometimes I also use Google Scholar when I have a specific paper that I want to look for. So if I know the title of the paper, or I know the author, and I want to see their publications, I will use Google Scholar. It's convenient.

Researchers may also use RIM systems to define and manage their own bibliographies by following or 'bookmarking' the core papers of a specific research area. One participant specified how he used ResearchGate to manage and expand his bibliographies:

Some of the big papers were sort of like in everyone's research.
These are the cornerstone articles that you base a lot of your research on ... I follow some of those articles [in ResearchGate].

To complete a literature search activity, researchers need to obtain the desired publications. Researchers may be motivated to use RIM systems that provide open access to self-archived versions of publications. RIM systems with social networking features can enable researchers to contact authors and request a copy of a publication they cannot access otherwise. One participant indicated that what motivated him to use ResearchGate was that it provides open access to his works and allows requesting papers from others:

It's good to have your stuff easily accessible because not everyone has access to databases, but if you're a researcher, it's easy to set up an account on one of these sites and connect with the authors to hopefully get the articles that you want to get.

Document manuscripts
Besides literature search, researchers may use RIM systems in a manuscript writing activity to manage citations. They may use Google Scholar to verify the bibliographic metadata of the resources cited in their papers, and/or to obtain citations in a specific style. One participant revealed his use of Google Scholar when working on the reference list of his paper:

There are times that I need to verify the source just to make sure the title, authors, and year, and just to make sure the information I put in are correct. Google Scholar is doing a good job in accurately reflecting publications, so I use it as a [citation management] resource.

Identify researchers
RIM systems and their citation extraction and analysis functions can be used for identifying potential reviewers, collaborators, letter writers, students, and advisors who may have similar or specific research interests. One participant explained how she used ResearchGate's citation information to learn about other researchers and identify potential collaborators:

One of the advantages to using these [RIM] systems is the ability to discover researchers that you may not have known like this ... I'm going to follow this guy from Boston now because apparently he likes my work and I want to be helpful to him, and I want to see what he's doing with the stuff of mine that he's citing, because maybe we could be good collaborators.

A potential future collaboration can be one of the motivations to follow other researchers in RIM systems. One junior researcher stated that she hoped to convert some of the connections she was cultivating with other researchers in ResearchGate into future collaborations. Table 1 summarizes the activities, their constituent actions, and the RIM system functionalities that support them:

Find relevant literature
- Search: search engine; author profile; follow papers; citing & cited papers
- Determine: author and publication profiles
- Select: citation count; author impact scores; publication venue impact scores; manuscript status
- Obtain: download a paper; request a paper from author(s)

Document manuscripts
- Document sources: citation generator
- Verify sources: author profile

Identify researchers
- Identify students, advisors, reviewers, collaborators, etc.: citing & cited papers; author profile; follow people; ResearchGate reads statistics

Disseminate research
- Make papers accessible: upload a paper; paper self-archiving status determination
- Promote papers: recommend papers; recommend people

Interact with peers
- Ask and answer questions on forums: Q&A service
- Send and receive private messages: messaging service

Monitor the literature
- Follow known researchers: receive updates on known researchers
- Follow papers: receive updates on papers
- Discover new papers: recommend papers; citing & cited papers
- Discover new researchers: recommend people; citing & cited papers

Evaluate
- Evaluate papers: citation count; number of reads; manuscript status
- Evaluate people: author profile; h-index; export a CV; ResearchGate scores

Curate
- Archive papers: upload papers
- Add and modify metadata for papers: add/update index terms; claim papers; disavow papers; add/update citation information
- Add and modify metadata for people: create/update profile; merge profiles; add/edit index terms; endorse people for expertise; add/remove suggested co-authors
- Review papers: open review

Look for jobs
- Search: recommended job postings

Table 1. Activities and RIM system functionalities.

Disseminate research
One of the main motivations for using RIM systems is to disseminate research results. Researchers may use RIM systems to share publications, data, and other research products. Nearly all participants mentioned they used RIM systems to promote their research. The dissemination activity may consist of making research results available and actively promoting those results. To make research results available, researchers may upload copies of their papers, presentations, or data to a RIM system. A service participants found particularly helpful in that action was one that helped them determine whether a publication could be self-archived in RIM systems based on the publisher's policies. After research results are uploaded to a RIM system, the system can then use push services to promote the results to the community. Researchers may choose a specific RIM system that provides more effective mechanisms (e.g., a social network) to promote research to the community they want to reach. One participant emphasized the social network provided by ResearchGate for promoting his research to his peers:

I used ResearchGate besides the Google Scholar because ResearchGate has slightly different methods of constructing the social network and the way they promote research is different – it's more active than Google Scholar. In that sense, it serves my purpose of trying to promote my research in the peers.

Interact with peers
Scholarly work may involve interaction. Researchers may interact about any aspect of research, such as what design to employ for a particular research problem, what tools to use and how, or how to replicate research results. Researchers may also interact to exchange information about employment opportunities, and to recruit students, collaborators, external reviewers, or letter writers for grant proposals or promotions. Some RIM systems provide researchers with Q&A forums and a direct messaging service to communicate. In some cases, those communication channels become the only means on the Web of reaching a particular researcher.
One participant revealed how ResearchGate helped him communicate with a researcher he could not reach otherwise when he was looking for a recommendation letter from industry:

ResearchGate really gives you a way to connect to the researchers if you somehow cannot find their email address or other contact information from other channels … I was looking for some recommendation letters for personal use. I wanted one from industry. This company cited my paper … But for that specific case, the first author's email was not on the paper. And the last author, the corresponding author, actually left the company. So I had nowhere to find them. Then I checked ResearchGate. He was on ResearchGate. So I tried my last resort. I just sent him a message. And surprisingly, he replied.

Monitor the literature
To stay current with the literature, researchers need to monitor the literature for new works and/or contributors. One participant indicated that his motivation for using RIM systems was to monitor his network of researchers:

Looking at what people whose work I'm interested in have cited, is useful for me and for following up on and finding out more about information that's useful for me in my own research.

RIM systems can be helpful to a researcher in monitoring the literature by sending alerts about new works from the researcher's network, and by recommending new works and authors based on topical or co-citation matches. RIM systems with social networking capabilities enable researchers to connect to and learn about junior researchers' works, which may not be as visible as those of more established researchers. One participant explained how ResearchGate might enable her to learn about junior researchers who otherwise could not be reached:

Researchers who are not that famous now, like junior faculty members or doctoral students, who are not big names, I probably cannot find another opportunity to know them all. If they also have a ResearchGate account and have some publications there, I hope this site can give me some automatic suggestions.

Evaluate
Evaluation can be a standalone activity (e.g., benchmarking oneself against other researchers) or part of a research process (e.g., evaluating papers for inclusion in a literature review). The targets of evaluation can be different entities such as a manuscript, a publication venue, an individual researcher, a lab, or an institution. Researchers may play the role of evaluators or be the objects of evaluation by others. In the latter case, a researcher can still be an active contributor to a distributed evaluation process by creating and maintaining a profile in a RIM system to support his/her evaluation by others. The context of those evaluation activities may vary. One participant revealed that he created a Google Scholar profile to support his application for an award. Another participant mentioned he used his Google Scholar profile and impact factors as evidence of his research impact when applying for U.S. permanent residency. A researcher's career status may affect the types of evaluation activities she or he may engage in or be asked to perform. Senior researchers may evaluate other researchers for promotion and tenure. Doctoral students, on the other hand, may benchmark themselves against other doctoral students who are at a similar stage in their doctoral programs to assess their competitiveness for the job market.
For example, one participant who was a doctoral student illustrated how she used ResearchGate to follow other doctoral students to help prepare herself for the job market:

I followed some students who are at the same stage as myself … in other schools to see their publication rate, how many publications they will get in one year … And then I can estimate how much work should be expected for a doctoral student at my stage, so later, when I'm actually in the job market, I will not be too far away.

Curate
Curation of research resources can be defined as a process of managing those resources for discovery and future use (Lord & Macdonald, 2003). A main component of curation activities is quality assurance, which is the process of assuring that research products, including information resources, meet the needs and requirements of the activities in which they are used (Stvilia et al., 2007). Researchers may use RIM systems to self-archive papers and data and to make them accessible. Researchers may create and manage metadata for those resources to make them findable and reusable, and may also use the metadata to construct a CV for different purposes. RIM systems with social networking capabilities allow researchers to request reviews of the content of their works from their peers. Curation of research information enables all the other activities in which that information is used or reused. Indeed, assuring the quality of their research identity metadata can be a motivation for researchers to establish a profile in a RIM system. For example, one participant created a profile in Google Scholar to correct an error after she found that Google Scholar had identified another researcher with the same name as the author of her article. Furthermore, the quality of information determines the outcome of an activity using that information. Concerns about the quality of an activity's outcome that uses research information, and about its possible effects on a researcher, can be a strong motivator for the researcher to engage in curating his/her research identity profile. One participant noted:

If you don't maintain it [research identity profile], then it gives people an inaccurate view of your productivity, so you run the risk of potentially sending a signal about your productivity that's not accurate.

All participants indicated that they maintained their own personal profiles in at least one RIM system. Their maintenance included adding bibliographic metadata and subject index terms to their publications, uploading full-text articles, and endorsing colleagues for their skills.

Look for jobs
RIM systems may serve as a social network for researchers to look for job information or find job candidates. One participant mentioned she used ResearchGate's Jobs service to look for relevant job postings. Another participant described how he used ResearchGate's messaging services to help a researcher in another country find a job:

For the messages I received, the only one that's not requesting a paper is the one from an Italian researcher. She told me she's going to graduate, and she's applying for a postdoctoral position. She's personally asked me if I knew any positions in the United States. So I replied her message, gave her some suggestions.

Engagement
Of the eight participants, four had public profiles in Google Scholar, seven used ResearchGate, and three had profiles in Academia.edu. Only one participant mentioned that he had an ORCID account.
When asked why they participated in a particular RIM system, some participants recalled incidents that led them to create a profile in that system. Some of them did not purposefully create profiles in a RIM system to meet their research identity management needs; rather, the profiles were automatically generated and pushed onto them. Others mentioned they acted on a recommendation from friends, colleagues, or advisors when creating profiles. Researchers can also be introduced to a RIM system by another information system, such as a search engine. They then perceived the value of membership after observing specific benefits provided by the system. For example, one participant revealed:

I first came to ResearchGate, because a paper I was looking for at that time only had full-text version on ResearchGate … Then I noticed that's a benefit. I should create an account there.

Levels of engagement
The data analysis identified three levels, or categories, of researcher participation in a RIM system. Researchers belonging to the first category have claimed or activated an account in a RIM system but do not maintain it or interact with other members of the system. This category was called Readers, as they use RIM systems mostly to access the literature. Researchers in the second category may maintain their profiles in a RIM system, but do not contribute to the system beyond that or interact with other members of that system directly or indirectly. That is, they do not ask or answer questions in Q&A forums, endorse other members for their skills, send emails, or respond to other members' emails or requests. This category was labeled Personal Record Managers. A majority of the participants (four out of eight) were grouped under this category. Researchers in the third category not only maintain their own profiles, but also are willing to curate the research information of other members by endorsing them for skills, and by sharing information via messages, emails, or Q&A services. This category was labeled Community Members; these researchers may be motivated by a feeling of reciprocity and of being 'a good member' of the community.

DISCUSSION AND CONCLUSIONS
As mentioned above, an OCLC Task Group formulated researchers' five needs for research identity data (OCLC Research, 2014), which map neatly onto five of the motivations for using RIM systems identified in the current study: find relevant literature, disseminate research, curate, identify researchers, and monitor the literature. However, the empirical data collected in the current study identified four more motivations for using RIM systems or research identity data: document manuscripts, interact with peers, evaluate, and look for jobs. Indeed, most IRs do not support those four activities (Lee, 2015). Several participants in the current study mentioned they used ResearchGate because it provided a social network allowing them to follow and communicate with other researchers and to look for jobs. IRs may consider incorporating these functionalities to allow their users to communicate with each other, and to generate profiles that support different evaluation activities (e.g., self-evaluation, annual review). Preece and Shneiderman (2009) presented a framework describing user engagement in online social communities, consisting of four levels: Reader, Contributor, Collaborator, and Leader. The current study identified three levels of participation in RIM systems: Readers, Personal Record Managers, and Community Members.
These three categories can be mapped to the first three levels of engagement in Preece and Shneiderman's framework. Most participants in the current study fell into the category of Personal Record Managers, who may maintain their profiles in a RIM system but do not contribute to the system beyond that or interact with other members of that system. A study of data curation practices in IRs found that IR staff's curation activities focused on ensuring the quality of publication metadata for the long-term preservation of publications, to increase their reusability (Lee, 2015). The findings of the current study suggest that a majority of researchers may be willing to maintain their research identity profiles. IR managers may consider recruiting researchers not only as research information/data providers, but also as curators of their own research identity data. This study provides rich qualitative data regarding how researchers use and participate in online RIM systems. Still, this poster is limited in that it reports preliminary findings based on interviews with eight participants from four institutions. More interviews will be conducted with researchers from other institutions and disciplines to gain different perspectives. Based on the interview findings, we will develop and implement a survey to reach more researchers, and develop a quantitative model of researcher participation in RIM systems.

ACKNOWLEDGMENTS
The authors would like to express their appreciation to the researchers who participated in the study. This research is supported by an OCLC/ALISE Library and Information Research Grant for 2016 and by National Leadership Grants from the Institute of Museum and Library Services (IMLS). The article reflects the findings and conclusions of the authors and does not necessarily reflect the views of OCLC, ALISE, or IMLS.

REFERENCES
Ames, M., & Naaman, M. (2007). Why we tag: Motivations for annotation in mobile and online media. In B. Begole & S. Payne (Eds.), Proceedings of the SIGCHI (pp. 971-980). New York, NY: ACM.
Bauin, S., & Rothman, H. (1992). "Impact" of journals as proxies for citation counts. In Representations of science and technology (pp. 225-239). Leiden: DSWO Press.
Blee, K. M., & Taylor, V. (2002). Semi-structured interviewing in social movement research. In B. Klandermans & S. Staggenborg (Eds.), Methods of social movement research (pp. 92-117). Minneapolis, MN: University of Minnesota Press.
Cosley, D., Frankowski, D., Terveen, L., & Riedl, J. (2006). Using intelligent task routing and contribution review to help communities build artifacts of lasting value. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1037-1046). New York, NY: ACM.
Dempsey, L. (2014). Research information management systems - a new service category? Retrieved from http://orweblog.oclc.org/archives/002218.html
Engeström, Y. (1987). Learning by expanding: An activity-theoretical approach to developmental research. Helsinki: Orienta-Konsultit Oy.
Giles, J. (2005). Internet encyclopedias go head to head. Nature, 438(7070), 900-901.
Lee, D. J. (2015). Research data curation practices in institutional repositories and data identifiers (Doctoral dissertation). Retrieved from http://purl.flvc.org/fsu/fd/FSU_migr_etd-9638
Lord, P., & Macdonald, A. (2003). E-Science curation report: Data curation for e-Science in the UK: An audit to establish requirements for future curation and provision. Bristol, UK: JISC.
Nov, O. (2007). What motivates Wikipedians. Communications of the ACM, 50(11), 60-64.
Nov, O., Naaman, M., & Ye, C. (2010). Analysis of participation in an online photo-sharing community: A multidimensional perspective. Journal of the American Society for Information Science & Technology, 61(3), 555-566.
OCLC Research. (2014). Registering researchers in authority files. Retrieved from http://www.oclc.org/research/themes/research-collections/registering-researchers.html
Palmer, D. (2013). The HKU Scholars Hub: Reputation, identity & impact management. How librarians are raising researchers' reputations. Retrieved from http://hub.hku.hk/bitstream/10722/192927/1/Reputation.pdf
Preece, J., & Shneiderman, B. (2009). The reader-to-leader framework: Motivating technology-mediated social participation. AIS Transactions on Human-Computer Interaction, 1(1), 13-32.
Salo, D. (2009). Name authority control in institutional repositories. Cataloging & Classification Quarterly, 47(3-4), 249-261.
Stvilia, B., Twidale, M., Smith, L. C., & Gasser, L. (2008). Information quality work organization in Wikipedia. Journal of the American Society for Information Science & Technology, 59(6), 983-1001.
Stvilia, B., Gasser, L., Twidale, M., & Smith, L. C. (2007). A framework for information quality assessment. Journal of the American Society for Information Science & Technology, 58, 1720-1733.
Stvilia, B., & Jörgensen, C. (2009). User-generated collection-level metadata in an online photo-sharing system. Library & Information Science Research, 31(1), 54-65.
Tenopir, C., Birch, B., & Allard, S. (2012). Academic libraries and research data services: Current practices and plans for the future. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.248.3251&rep=rep1&type=pdf
Wasko, M. M., & Faraj, S. (2005). Why should I share? Examining social capital and knowledge contribution in electronic networks of practice. MIS Quarterly, 29(1), 35-57.

work_leqhwycv7ff7zcu3i4ha3rn6mq ----

ORIGINAL PAPER

End users' trust in data repositories: definition and influences on trust development

Ayoung Yoon
School of Information and Library Science, University of North Carolina at Chapel Hill, 216 Lenoir Drive CB #3360, 100 Manning Hall, Chapel Hill, NC 27599-3360, USA
e-mail: ayhounet@gmail.com; ayyoon@email.unc.edu

Published online: 10 July 2013
© Springer Science+Business Media Dordrecht 2013
Arch Sci (2014) 14:17-34. DOI 10.1007/s10502-013-9207-8

Abstract
While repositories' efforts to build trustworthy digital repositories (TDRs) led to the establishment of ISO standards, much less research has been done regarding the user's side, despite calls for an understanding of users' trust of TDRs. In order to learn about users' perspectives on trust in digital repositories, the present study investigated users' definitions of trust and the factors that influence users' trust development, particularly addressing the users of three data repositories in the United States. A total of 19 participants were interviewed in this study. The results of this study indicate that users' definition of trust is largely based on a lack of deception when it comes down to the specific context of data repositories. Regarding factors influencing the development of users' trust in repositories, organizational attributes, user communities (recommendations and frequent use), past experiences, repository processes (documentation, data cleaning, and quality checking), and users' perceptions of the repository roles were identified.

Keywords: Trust · Data repository · Trusted digital repository

Introduction
Historically, cultural institutions that have been responsible for preserving paper records and physical artifacts have already developed considerable trust within the communities they serve.
Libraries, archives, and repositories are trusted to store materials valuable for cultural and scholarly purposes, to provide access to them to disseminate knowledge, and to preserve them for future generations (Research Libraries Group/Online Computer Library Center [RLG/OCLC] 2002). The flood of digital information, however, has created new challenges in curating and archiving information. Whereas physical objects or documented reliable surrogates are available to patrons as "proof" of an institution's capability to collect and preserve for the long term, digital information is less tangible and much more mutable than other materials, and trust and reliability are considered more difficult to establish (RLG/OCLC 2002, p. 8). Thus, new questions and new solutions in the field of archives and preservation are raised concerning whether and how the accumulated trust derived from traditional services can be transferred to repositories of digital information (Jantz and Giarlo 2007). Ross and McHugh (2005) noted that digital information holders or service providers might already be regarded as trustworthy based on reputations earned in the paper-based information environment. Institutions are likely to retain at least some trust from the public based on past successes (RLG/OCLC 2002). Others have made the point that digital resources are much more vulnerable than traditional paper-based information; this fact makes people and organizations insecure about digital information usage and about ways of guaranteeing the authenticity and longevity of digital objects in institutions' collections (Electronic Resource Preservation and Access Network (ERPANET) 2004). Whether or not trust derived from traditional services can be transferred to digital repositories, the concept of trust remains central in digital environments. The Commission on Preservation and Access/Research Libraries Group (CPA/RLG) Task Force on Archiving of Digital Information (1996) pointed out, "for assuring the longevity of information, the most important role in the operation of a digital archive is managing the identity, integrity and quality of the archives itself as a trusted source" (p. 23). Lynch (2000) also observed that "virtually all determination of authenticity or integrity in the digital environment ultimately depends on trust. […] Trust plays a central role, yet it is elusive". Thus, as early as the late 1990s, the discussion of trusted digital repositories (TDRs) spread and addressed how "trusted" information can be preserved. While trust in repositories has been much discussed, questions remain regarding whether end users will accept a repository with a solid record as "trusted". Ross and McHugh (2005) presented a broad range of trust-related issues surrounding digital repositories and argued that users' expectations (along with the expectations of depositors, the aspirations of service providers, and management concerns) must be addressed (p. 2). Understanding users' perspectives is particularly significant because it is directly related to the fundamental mission of repositories, which is to serve a particular user group or designated community.
As RLG/OCLC (2002) claimed, a trusted digital repository is "one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future" (p. 5). An empirical study to measure users' perceptions of trust has also been called for, because "trusted digital repositories can be classified as 'trusted' primarily because they meet or exceed the expectations and needs of the user communities for which they are designed" (Prieto 2009, p. 603). In response to this gap, this study attempts to investigate how users define "trust" in relation to digital repositories, and which factors influence users in building and/or maintaining trust. In particular, the focus of this study is on data repositories, where (digital) research data are stored and managed for reuse. Although most previous studies on TDRs attempted to answer how digital repositories can be trusted and provided criteria for developing TDRs, this study provides an in-depth understanding of users' perspectives on trust and contributes to broadening the understanding of questions such as "what is a trusted repository?" and "how can current TDRs meet users' expectations for being a trusted digital repository?" Lastly, this study provides implications for building more trusted repositories in the future.

Literature review
The first part of the literature review covers the efforts to build TDRs and the development processes and standards for Trustworthy Repositories Audit and Certification (ISO 16363). The second part of the literature review discusses definitions, dimensions, preconditions, and attributes of trust, as they have been investigated across disciplines.

Efforts to build trusted digital repositories: TRAC/ISO 16363
The concept of trust is known to be difficult to define or measure (Rousseau et al. 1998) because it is a vague term with an elusive definition (Gambetta 1988). How to define trust in a digital repository is also subjective, depending on the context of use. The archival and computer professions have been using the term as a synonym for "reliable", "responsible", "trustworthy", and "authentic", in relation to archival functions such as creating, managing, and using digital objects (RLG/OCLC 2002, p. 8). The RLG/OCLC report (2002) noted that no collective agreement exists, as of yet, on a more exact definition of "trusted archives (or repositories)", possibly because of the subjectivity and abstractness of the concept. This remains true today. Consequently, a "trusted" or trustworthy organization is unable to identify itself as trusted (CPA/RLG Task Force on Archiving of Digital Information 1996). Ross and McHugh (2005) pointed out that this situation leads the public (as well as the repositories themselves) to accept digital repositories as trusted if they can demonstrate that they have the properties of trustworthiness. Thus, the most important question is how to verify trustworthiness and how a repository can assert its own status as "trusted" (Ross and McHugh 2005). Therefore, efforts to identify the requirements for being a "trusted" repository were initiated, and certification by which digital archives could be declared "trusted" was needed. These efforts included constructing a robust audit and certification program for digital repositories to enable these institutions to maintain the authenticity, integrity, and accessibility of digital materials over the long term.
In 1996, the CPA/RLG Task Force on Archiving of Digital Information (hereafter, the Task Force) argued that, to be trusted, digital archives have to demonstrate that they can preserve information authentically for the long term. The Task Force emphasized the capabilities of "trusted" organizations, which include being able to store, migrate, and provide access to digital collections, because these capabilities are critical components of the digital archiving infrastructure (CPA/RLG Task Force on Archiving of Digital Information 1996). In 2002, a report by RLG/OCLC (2002) provided a starting point for describing a framework of attributes and responsibilities for TDRs. In the report, the concept of a TDR is defined as follows: the repository has associated policies, standards, and technology infrastructure that provide the framework for digital preservation; and the repository has a trusted system, such as a system of software and hardware that can be relied on to follow certain rules. The ERPANET workshop report (2003) emphasized the role of an audit in this process. An audit itself does not directly improve uncertain situations with respect to being trusted, because it only assesses those situations, but such assessments can certainly be intriguing efforts to analyze and improve situations (p. 6). Quality standards for creating digital resources, actions for capture, methods and procedures for storage and repositories, and technologies were discussed as ways to assess and improve situations. In response to these works and to the calls for audit and certification programs, in 2003 the Research Libraries Group (RLG) and the National Archives and Records Administration (NARA) created a joint task force to specifically address digital repository certification. First, they pointed out that institutions often declare themselves "OAIS (Reference Model for an Open Archival Information System)-compliant" to underscore their trustworthiness, but no established or agreed understanding existed of the meaning of "OAIS-compliant" (p. 1). OAIS was designed to provide a conceptual framework for building appropriate environments, functional components, and information objects for long-term preservation. Even before it became an ISO standard in 2002, because institutions had no other developed criteria, they used OAIS to declare themselves trusted (p. 1). Thus, RLG/NARA's (2005) research focused on building criteria for measuring this compliance by providing definitions of TDRs and the components that should be considered for TDRs. The metrics developed by the task force were tested in 2005 through the Certification of Digital Archives Project of the Center for Research Libraries (CRL). Funded by the Andrew W. Mellon Foundation, this project conducted actual audits of three digital archives: the National Library of the Netherlands–Koninklijke Bibliotheek (KB), Portico (Ithaka Harbors, Inc.), and the Inter-university Consortium for Political and Social Research (ICPSR); and one archiving system, LOCKSS; and it provided methodologies for auditing and certification with corresponding costs (CRL n.d.). Meanwhile, European researchers also responded to the call for audit and certification programs. The Network of Expertise in Long-term STOrage of Digital Resources (nestor) project published the Catalogue of Criteria for Trusted Digital Repositories in 2006.
Initiated in Germany, the main focus of nestor was to form "a web of trustworthiness in which digital repositories can function as long-term digital archives within various environments" (Dobratz et al. 2007, para 3). Focusing on trusted repository certification, nestor attempted to identify criteria that would facilitate the evaluation of digital repository trustworthiness at both the organizational and the technical level (nestor Working Group on Trusted Repositories Certification 2006). All of those efforts and contributions to build solid audit, assessment, and certification programs were combined in Trustworthy Repositories Audit and Certification: Criteria and Checklist (TRAC) in 2007 (CRL/OCLC 2007). TRAC became the basis for Audit and Certification of TDRs, prepared by the Consultative Committee for Space Data Systems (CCSDS 2011). It presented three categories of criteria: (1) organizational infrastructure, which includes governance and organizational visibility, organizational structure and staffing, procedural accountability and preservation, financial sustainability, and contracts, licenses, and liabilities; (2) digital object management, which includes ingest (acquisition and creation of the archival information package [AIP]), preservation planning, AIP preservation, and information and access management; and (3) infrastructure and security risk management, which addresses technical infrastructure and security risk management. In 2012, TRAC became a new ISO standard, ISO 16363: Audit and Certification of Trustworthy Digital Repositories, together with ISO/DIS 16919: Requirements for Bodies Providing Audit and Certification of Candidate Trustworthy Digital Repositories, which is awaiting approval. The two standards complement each other in relation to assessing and building TDRs; largely based on TRAC, ISO 16363 provides a list of criteria for being a TDR, and ISO/DIS 16919 provides requirements for organizations that conduct audits and certifications (National Archives 2011). The creation of ISO 16363 reflects consensus within the digital preservation community regarding best practices for digital repositories. Although ISO standards are known as "voluntary" standards rather than "must-do", they provide an influential guideline for organizations attempting to build TDRs. These previous efforts acknowledged the role of the user community in building TDRs and suggested ways of engaging users in the process. For instance, a repository should allow users to audit/validate that the repository is taking the necessary steps to ensure the long-term integrity of digital objects, and it should record and act upon problem reports about errors in data, so that users can consider the repository a trustworthy source (CCSDS 2011). However, much less research has been done regarding the user side, despite the call for an understanding of users' trust of TDRs. Prieto (2009) reviewed the concept of trust in online environments and TDRs and underscored the significance of user communities' perceptions of trust. He argued that "user communities are the most valuable component in ensuring a digital repository's trustworthiness" (p. 603) and called for empirical research measuring users' perceptions of trust as a factor contributing to TDRs.
Most recently, the Dissemination Information Packages for Information Reuse project, at the University of Michigan, Ann Arbor, together with OCLC Research, investigated trust in two user communities, quantitative social scientists and archeologists (Yakel et al. 2013). The findings of this project identified four indicators of trust: repository functions; transparency; structural assurance, including guarantees of preservation and sustainability; and the effects of discipline and level of expertise. More empirical research on how users perceive TDRs and on the factors that influence the building of users' trust would have practical implications for repositories' ability to prove themselves "trustworthy" to user communities.

Trust: definition, precondition, dimensions, and attributes

Trust definition
The concept of trust has been widely studied in various disciplines, such as psychology, organizational behavior, and economics. However, as researchers from different fields take varying approaches to understanding the concept of trust through their own disciplinary lenses and filters, full consensus on the definition of trust has not yet been reached. Researchers have also noted the difficulty of defining and measuring trust (Rousseau et al. 1998), since it is a vague term with an elusive definition (Gambetta 1988). However, several efforts to derive a definition of trust from different disciplines have been made. Mayer et al. (1995) saw trust as a relationship between a trusting party (trustor) and a party to be trusted (trustee) and defined trust as "willingness to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor" (p. 712). Similarly, Doney and Cannon (1997) saw trust as "willingness to rely on another", and Lewicki and Bunker (1995) defined trust as a "confident, positive expectation". Later, in their study of a multidisciplinary view of trust, Rousseau et al. (1998) reported that "confident expectations" and "willingness to be vulnerable" are critical components of all definitions of trust regardless of discipline, and they defined trust as "a psychological state comprising the intention to accept vulnerability based upon positive expectations of the intentions or behavior of another" (p. 395).

Precondition of trust
As preconditions for the development of trust, two components, risk and interdependence, have been mentioned across disciplines (Rousseau et al. 1998). Defined as the perceived probability of loss, risk was considered an essential component of the preconditions for trust (Rousseau et al. 1998, p. 395; Rotter 1967; Sheppard and Sherman 1998). Risk was also discussed in relation to the higher-level concepts of uncertainty (Doney and Cannon 1997; Gambetta 1988; Lewicki and Bunker 1995), which can result from a lack of information (Giddens 1990), and vulnerability (Blomqvist 1997; Rousseau et al. 1998), as discussed by Mayer et al. (1995). Interdependence (or dependence) means that a trustee holds the potential to satisfy a trustor's needs; thus, it occurs "where the interests of one party (trustor) cannot be achieved without reliance upon another (trustee)" (Rousseau et al. 1998, p. 395).

Dimensions of trust
Different types of trust can emerge from different factors. Trust can emerge based on a trustee's rational choice when a trustor perceives that the trustee will perform beneficial actions.
Rousseau et al. (1998) referred to this as calculus-based trust, borrowing from Barber's (1983) argument that this type of trust can be derived from credible information regarding the intention of another, which may be provided by reputation or certification. Trust can also be derived from repeated interaction over time between the trustee and trustor, which is classified as relational trust (Rousseau et al. 1998). Reliability and dependability in previous interactions create and increase a trustor's positive expectations or beliefs about a trustee's intentions (Rousseau et al. 1998; Lewicki and Bunker 1995). Finally, there can be institution-based trust, a trustor's feeling of security about situations because of structural assurance, such as guarantees, regulations, or the legal system (e.g., a contract or promise) (McKnight et al. 1998; Rousseau et al. 1998).

Trust attributes
From their review of the literature, Mayer et al. (1995) suggested three attributes of perceived trustworthiness: ability, benevolence, and integrity. Ability refers to the skills, competence, and characteristics of a trustee that are influential in a specific domain; benevolence is the belief of a trustor that a trustee wants to do good work for the trustor; and integrity is a trustor's perception that a trustee will adhere to principles acceptable to the trustor. Mayer et al. (1995) argued that these attributes are universally relevant, and they have been adopted by many researchers as a starting point for developing their own frameworks. Pirson and Malhotra (2011) slightly modified this framework in the context of stakeholders' trust in organizations and provided a new framework with four attributes: identification, integrity, benevolence, and transparency. Integrity is the belief that an organization will act fairly and ethically; benevolence is the belief that an organization is concerned with the stakeholders' well-being; identification refers to stakeholders' understanding of an organization's intentions or interests based on shared values and commitment (Lewicki and Bunker 1995); and transparency refers to a perceived willingness to share trust-related information with stakeholders. Though transparency did not appear to predict trust in the results of this study, it is worth noting that several scholars (e.g., Mishra 1996; Tschannen-Moran 2001) have argued for transparency as one attribute of trustworthiness. While the trust relationships and trust models investigated in previous studies are not exactly the same as the trust relationship between users and repositories, previous studies have provided useful insights into the factors that may influence trust and how it can be built. Thus, this study employed related concepts developed in previous studies, adopting an integrated approach from organizational studies, sociology, and social psychology to enhance our understanding of trust in repositories from the users' point of view.

Methods
I conducted semi-structured interviews to gain a more in-depth understanding of users' perceptions. Among the various types of digital repositories, I limited the scope to users of data repositories, because of both the significance of data in research and the increasing attention being paid to data sharing and reuse (e.g., Faniel and Zimmerman 2011). Potential subjects were identified from users of three major social science data repositories in the US:
Potential subjects were identified from users of three major social science data repositories in the US: Odum Institute for Research in Social Arch Sci (2014) 14:17–34 23 123 Science at the University of North Carolina, Roper Center for Public Opinion Research at the University of Connecticut, and the ICPSR at the University of Michigan. These three repositories are participating in the Data Preservation Alliance for the Social Sciences (Data-PASS). Data-PASS is a voluntary partnership of organizations created to archive, catalog, and preserve data used for social science research, and their data include opinion polls, voting records, surveys on family growth and income, social network data, government statistics and indices, and GIS data measuring human activity. Other organizations participating in Data-PASS were excluded from this study due to differences in the nature of the repositories (a government organization) or due to the difficulty of tracking their users. I used data citation tracking to identify people who have used data from these repositories for their research. Currently, as no consistent standard for citing data has yet been established (Mooney 2011), tracking data citations is a challenging process. In addition, perhaps because a number of articles use data without citing them (Mooney 2011), the use of citation tracking to identify users might be a limitation of this sampling plan. Even though data citation tracking has limitations, it is still the most effective way to identify users of datasets from each repository. Among the three repositories, ICPSR and Roper Center provide lists of publications that have used their data, 1 and potential study participants were identified from these lists. Users of the Odum data archives were identified by searching Google Scholar, as the search results generated from the search term ‘‘Odum’’ provide the lists of publications that mentioned Odum as their data source. To minimize the potential imbalance in the sample, created by the use of different repositories, a quota sampling technique was employed for deciding user numbers for each repository, based on the number of studies that have provided data to the repository (as reported by the repository or discovered through searches of the DataVerse network 2 ). Thus, this study initially aimed to recruit about eight participants from ICPSR, about three participants from Odum Institute, and about ten participants from Roper Center. Potential participants were identified through searches for publications that cite their use of a dataset from one of the three repositories. These searches were limited to journal publications and conference proceedings published since 2000. Users from the most recent years were included in the sample first, and this process was continued until a sufficient number of participants had been identified. For articles that have been written by multiple authors, either the corresponding author or the first author was contacted first. The interview data were collected from February to July 2012. An email invitation for this study was sent to 213 potential participants identified through the process described previously. Twenty-five people who received the email invitation volunteered for interviews, but only 19 ended up participating in this study. 
Six volunteers were excluded because they had not been heavily involved in the process of acquiring and using data, as their co-authors or research assistants had handled these activities. Among the 19 participants, ten were identified as ICPSR users, three were Odum users, and six were Roper Center users. Some participants used multiple repositories. Those participants first talked about their most recent experience within the repository they identified themselves as using, but they also talked about their experience with other repositories if they felt it necessary, including repositories other than Odum, the Roper Center, and ICPSR. All interviews were conducted by phone, and the data were fully transcribed. The transcribed data were inductively coded for analysis using TAMS Analyzer, a software program that facilitates qualitative analysis. The codes developed for the initial analysis were reviewed by one peer, who checked the validity of the codes.

Findings

Participant information
All 19 participants were either academic or corporate researchers, who can be seen as part of the "designated community" of each repository. Seventeen were university faculty members (including junior and senior levels), and two were classified as research associates at either a university or a research corporation. Eight participants were male and 11 were female. Most participants' ages ranged from 30 to 50 years (one was in their 20s, seven in their 30s, five in their 40s, and six in their 50s). While all the participants used the repositories to acquire data for their research, a few of them said that they also used them for teaching. Two participants also had experience with depositing, which might influence their perception of repositories, but the deposit experience was not investigated extensively since it is outside the scope of this research. The level of use varied among participants. Participants had different ways of expressing their level of experience, for instance, by the number of datasets used, by the number of years using the repository, or by the number of times datasets were used for research projects; it was especially difficult for users who had used a repository for a long time to express this. For example, descriptions included "Half of my publications over the last 15 years, […] double-digits number of papers (PB12)". However, most participants had used data repositories more than five times and had used more than five datasets. Five of them said they had used data repositories more than 100 times, and three of them had used repositories for more than 10 years. Only three participants had used repositories fewer than three times and had used fewer than three datasets.

Defining trust: what does trust mean to users?
Participants were first asked what they think trust is, and how they define trust in the context of data repositories.
Similar to the preconditions discussed in the previous literature, the interviews showed that trust became relevant to a particular situation when the trustor was uncertain about something (uncertainty) and when the trustor could depend upon the trustee (dependability). Trust can arise or become necessary under uncertain circumstances because it is sometimes necessary to "place confidence in things that you don't know" (PA05). Trust was also related to dependability. Participants expressed that "to trust" means to believe or count on someone or something (PA01, PB10, PB12), and PB10 defined trust as "being able to count on organizations or products or whatever". Dependability signified the trustor's expectation that the trustee would consistently satisfy the needs of the trustor. PA14 expressed this understanding as "[Trust is] your ability to have faith that someone is going to fulfill some kind of expectation that you have". Participants' sense that they could rely on someone or something was highly associated with truthfulness, which is the lack of deception. In the context of data repositories, lack of deception had two components: data validity and repositories' integrity. Even though participants were asked to define trust in repositories, discussion of data validity inevitably emerged, as the need for data came before other considerations because the data were what participants actually used for their research. Belief in the integrity of repositories was another component of trust, due to the repositories' role of managing data. Whether the data are presented accurately (validity) constituted the most important component of the definition of trust in the context of repositories. PA08 said, "The trust I have is […] the data in a way that accurately indicates what's there and really what was collected. So that's what I would be basing my trust on". PA03 remarked, "To trust them I believe that they are accurately representing what they say they are, which includes telling me the limitations of something and not just presenting all the good parts, but presenting the bad parts". Data should reflect exactly what they are, and accuracy in this context has nothing to do with evaluating how good or bad the data are. This dimension is also highly related to integrity, which is discussed in the next paragraph. The integrity of repositories was mentioned by most of the participants. Participants' belief that organizations will be honest rather than deceitful constitutes trust. As PB16 noted, "[repositories] are in fact doing what they're saying that they're doing, and they're not trying to intentionally, I guess, mislead people". PA07 echoed, "I really think about your believing that they represent themselves as true and honest […] or do what they say they are going to do and [that] they have respect for you and so on, et cetera".

Building trust: where does end users' trust originate, and how do users develop trust?
Study participants discussed a number of characteristics that contribute to the development of their trust in repositories. Regardless of the repository, participants' trust seemed to be based on five broad components: organizational attributes, the internal repository process, user (or designated) communities, their own past experiences, and their perceptions of the roles of repositories.

Organizational attributes
About half of the participants showed a strong belief in the integrity of repositories.
Here, integrity means that users believe repositories are honest and do not deceive or mislead their users. PA01 said, "Well, I don't think that the people or the organizations that managed the data were trying to mislead anyone", and PB16 added, "There's no reason to think that [the repository] would be doing anything to the data to affect its integrity. They're all about just making data available to the research community".

This strong belief in the repositories' integrity is also based on users' understanding of the repositories' mission and commitment to society; as PB16 stated above, repositories are "all about just making data available to the research community". PA04 echoed this view: "They had provisions in place for data use agreements that they try to make it so that people can diffuse the data that they were using it for research purposes that would further knowledge." PA14 acknowledged the value and mission of the repositories, noting that "the value of having things like that at [the repository], is that they are concerned with long-term preservation". Similarly, PB12 stated, "If they stopped operating or were no longer able to archive as much data as they did in the past, then, well, the data would be lost". Through this understanding of repositories' commitment to society (what Pirson and Malhotra (2011) call identification), PB16, PA04, PA14, and PB12 expressed their faith in the repositories' integrity.

Participants' belief in the staff was a third organizational attribute influencing trust. This trust is closely related to the reputation of the repositories, because it can arise from the reputation of the repositories as well as help to build a good reputation. However, it is worth noting that some of the participants' trust was directed particularly toward the staff. Participants believed that the staff "were well trained in this area" (PA01), "have expertise" (PB10), and "are the best possible people working on it" (PA03). One participant (PA09) also stated that knowing about staff helps to build his trust because "it just makes [repositories] more visible, rather than just being these mysterious sites where people put datasets up that aren't available for anyone to download". Interestingly, even though several participants expressed a strong belief in the repository staff's expertise, other participants did not know what the staff members do with data, which will be discussed in a later section.

Perceptions of and use by designated communities

Another component that emerged from the interviews was trust transferred from other users who are close to participants, or trust based on others' use. PA02 noted, for example, that "It's not like I just stumbled upon it myself; I worked with other researchers who were working with [the repository], and they are the ones who told me [to use it]". If users hear about a repository from sources with more authority, they tend to trust it more:

PA07 [...] maybe one of my professors said positive things about [the repository] so that I consider that professor a reliable source of information. So, to me, that is my... It's someone I trusted, that if this was a trustworthy organization, it made me feel... To trust it as well.

Another participant commented that the frequency of others' use of the sources in the repository can be one measure of trust, even though it is not the same as hearing this directly from other users.

PA09 I would trust the [repository]. It's been used a lot.
And so I can't even imagine how many master's theses and dissertations and research articles have been published based on [the repository] data.

The reputation of repositories was another attribute mentioned by a number of participants. Reputation plays a significant role in the trust users have in repositories and is sometimes a consideration when choosing a dataset:

PA05 I mean the institute that provides the dataset that I talk about on [the repository], that's a reputable institution, you know what they have, the data that they have are good quality data. [...] So people who are distributing the data, their reputations are important.

PC19 I would trust [the repository] because I've already heard so much about it. Reputation is important and it has a great reputation, so if I say I wanted to work on a different dataset I would go there versus some other small university somewhere else. I wouldn't know about them, whether they have all of these means of data collection and data security.

PA13 First of all, they've both been in... this kind of business for a long time. They're world-renowned for being data repositories and for leading the field in terms of data preservation and data access.

Users' own past experience

In addition to recommendations from other users, users' own experiences with repositories are important factors in building trust. The majority of participants in this study had used some repositories multiple times and related that having positive experiences with repositories over time helps to enhance their trust. For example, one participant said:

PC19 I guess I did not had any problems in the past and I haven't heard of other people having problems, and the data that I accessed through the repository, everything seems to be helping and there wasn't anything suspicious or missing from it. So I guess I've had a good experience, so I have no reason to distrust them.

Repository processes: documentation, data cleaning, and quality checking

What repositories do with datasets is closely related to users' trust. Almost every participant discussed the documentation (e.g., codebooks, original questionnaires, or methodology) of datasets, stressing the significance of having good documentation since it is "only possible to understand [the dataset] by the reading documentation (PA02)". Good documentation is one factor influencing the user's trust; as expressed by PA03, "Well, [I trust them] because they have really detailed and rigorous documentation describing their methodology". Another participant compared two different repositories and each one's documentation, describing having more trust in the one with more rigorous documentation:

PA14 Yeah, [repository A] tends to have more thorough complete documentation. [Repository Z] and [repository Y] are more whatever they get from the original investigator. So a lot of times with [repository Z], you'll actually get an extant copy of the original questionnaire which sometimes there's things crossed out by hand, different codes written in again by hand because of the last-minute changes. Whereas [the repository], you're always gonna get a kind of nice, clearly organized thing without corrections and crossed out, so it's kind of like getting somebody's notes versus getting a finished manuscript. [Repositories Y and Z were not included in the study; Repository A was included.]
Several participants were also aware of the internal data cleaning process of repositories and drew their trust from this process:

PA09 Well, there's no trust issue about it. I mean if there's an error, I think that [the repository] will do their best to make sure that it's corrected and they'll be very responsive. So that goes a long way to continuously building trust.

Whether it is true or not, a few of the participants believed that the repositories would perform quality checks and appraisals of datasets. PA09 assumed that the repositories would meet some appraisal criteria for data quality, stating, "I really don't know, but I'm guessing they would have. I don't think they would just put something out without sort of reviewing it or ensuring data quality". Two of the others (PA01 and PA02) had a different view, stating, "I don't think [the repository] is requiring each individual project or dataset that's placed on their site to pass some criteria... You can't just say, well, it's in [the repository], so it must be great" (PA01). Accordingly, appraisal was one factor that influenced at least some of the participants' trust, as some of them believed the repositories would check the quality of their data.

Users' perceptions of the roles of repositories

Participants perceived repositories in a variety of ways, particularly regarding their roles. These perceptions turned out to influence the users' trust in data repositories. Since this study does not aim to quantitatively test the correlation among factors influencing trust, it is not possible to argue for a consistent relationship between user perceptions and trust. However, the interview data indicated that some participants (PB06, PA08, and PC11) who perceived the repositories' roles as somewhat limited did not consider the repositories to be trustworthy. For instance, PB06 defined the role of repositories as very limited, which led PB06 not to associate repositories with trust. PB06 perceived repositories' functions as below:

PB06 I make the assumption that the repository has very little to do with the data because I know that... They don't do much more than manage the files that are provided to them by the organization. They're not in the business of doing much other than putting things in storage. So [the repository] is almost irrelevant from the point of view of trust.

This case was in sharp contrast to the views of PA01, PB12, and PA14, who thought repositories did data cleaning and trusted this process (see the section "Repository processes"). Such perceptions of the roles of repositories as limited made users see repositories as more of a "library" (PC11) or "warehouse" (PA13), where there are not many jobs or processes involving data. This view diminished users' concern about the trust issue in repositories, and often made users question why trust mattered in this context.

PC11 I don't think this question makes any sense. This is like asking whether I have any concerns about using non-fiction books in the library. Some books will be shown to have mistakes; others will become influential. Each book must be judged on its own merits. But librarians cannot know this before placing an order for a book and placing it on the shelves. Librarians can have general rules of thumb in terms of ordering non-fiction books. But just because a book is in the library means almost nothing. Unless the library is run by some extremist group, I judge the book, not the library.
Similarly, just because a dataset is in a repository means nothing. A repository should have very tight standards for documentation. Then the user can make informed decisions about each dataset or data series.

Discussion

This study did not account for possible personality-based aspects of trust (e.g., intrinsic trust), which can be considered a possible limitation. Even so, the findings of this study present how users perceived and defined trust, as well as what factors influenced their trust development.

At a higher level, users perceived trust as their willingness to depend on or rely on something or someone (including an organization) in uncertain circumstances. These elements, dependability and uncertainty, align with the preconditions of trust discussed in the previous literature across disciplines (Doney and Cannon 1997; Giddens 1990; Lewicki and Bunker 1995; Rousseau et al. 1998). When it comes down to the specific context of data repositories, users' definition of trust is largely based on a lack of deception. In particular, a lack of deception can be achieved in two different ways: by determining data validity (or accuracy) and by assessing the integrity of repositories. The outcomes of repositories, meaning the datasets deposited and processed in the repositories, should accurately represent the original dataset. Repositories should be honest and not intentionally mislead anyone. This strong presence of truth and honesty in users' definition of trust in data repositories reflects the significance of data integrity and validity in their research. Acquiring valid, accurate data is the first step for any type of research that reuses data produced by others. The issue of the integrity of repositories also relates to data integrity and validity, as repositories serve as trusted sources for data. A number of factors, such as organizational attributes and repository processes, that influence users' development of trust ultimately relate to the data integrity and validity issue.

Organizational attributes, user communities (recommendations and frequent use), past experiences, repository processes (documentation, data cleaning, and quality checking), and users' perception of repository roles were identified as influencing the development of users' trust in repositories. These findings also reflect several types of trust discussed in the previous literature. As Rousseau et al. (1998) and Barber (1983) argued, users could develop calculus-based trust based on their rational choices, knowing the good intentions of repositories. This can be influenced not only by a good reputation (Barber 1983) but also by others' recommendations and frequent use of repositories by user communities. Relational trust also appeared when users developed their trust based on repeated positive experiences with repositories over time; such experiences help users develop positive expectations about the repositories. Knowing the staff of repositories also helped to develop relational trust; for example, users expressed their trust of the staff when they had met the staff at a conference or knew them personally because, in one way, this contact gave users an impression of repositories as "real" and "visible", rather than being "mysterious sites" (PA09). Users might also develop institution-based trust from reputation. As Rousseau et al. (1998) found, institution-based trust can help to form calculus-based and relational trust; therefore, reputation could play an important role in users' development of trust.
One finding distinctive to users' trust in data repositories is users' perception of repository roles. Each study participant had a different level of understanding of repositories, and the level of understanding was sometimes, surprisingly, not related to the level of a user's experience with repositories, since most participants in this study had frequently used repositories. Different levels of understanding, for instance, how much users know about repositories' functions or the roles of staff and repositories, are not the result of possible differences among the three repositories in this study, because these differences were also apparent among users of the same repository. The level of users' understanding of repositories might be relevant to their level of trust, as can be seen from the factors of repository process and users' perception of repository roles. Participants who knew a repository's internal processes expressed their trust based on their belief in that process; however, others who did not know much about repository processes had different thoughts. In particular, a couple of participants viewed repositories as much less active players in maintaining data integrity than they are, although it is true that there might be differences among the repositories in this study regarding processes and other functions. For them, because repositories are the same as a "warehouse" (PA08) or "library" (PC11), not much room existed for trust. These findings suggest that users' awareness of repositories' roles or functions can be one factor in developing users' trust. In addition, if this is a factor, a new question is raised: Should the roles or functions of repositories be more visible to users to gain more trust? Some might argue that, if users do not know much about what repositories do, such lack of knowledge shows that repositories have been successfully performing their job because users have not experienced serious problems. However, knowing the missions and functions of repositories can be a way to decrease users' uncertainty, which can positively influence their trust, as can be seen in this study. Furthermore, helping users build a better understanding of repositories would itself be important.

Conclusion

As Prieto (2009) argued, "User communities are the most valuable component in ensuring a digital repository's trustworthiness" (p. 603). Gaining users' trust in repositories is important because one of the core missions of repositories is to serve their distinctive user communities. By understanding users' perspectives on "trusted" repositories, repositories can enhance their "trusted" status because users' perceptions of trust are often (though not always) related to repositories' practices. If understanding users' trust is the first step, the next step entails developing a metric to measure users' trust in repositories. Trust is a complex concept to measure, but having a standardized way of measuring users' trust can help to demonstrate how repositories have effectively gained users' trust and have been perceived as trusted sources of information. In addition, trust in data itself plays a distinctive and important role in users' reuse of data, which may or may not be related to trust in repositories. Although it is outside the scope of this study, the findings also suggested that users' trust in data is another important area to be investigated further.
Acknowledgments I would like to give special thanks to Professor Barbara Wildemuth at the University of North Carolina at Chapel Hill, School of Information and Library Science, for her assistance with this study.

References

Barber B (1983) The logic and limits of trust. Rutgers University Press, New Brunswick
Blomqvist K (1997) The many faces of trust. Scand J Manag 13(3):271–286
Center for Research Libraries (CRL) (n.d.) Certification of digital archives project. http://www.crl.edu/archiving-preservation/digital-archives/past-projects/cda. Accessed 30 Aug 2012
Center for Research Libraries/Online Computer Library Center (CRL/OCLC) (2007) Trustworthy repositories audit & certification: criteria and checklist (TRAC), Version 1.0
Commission on Preservation and Access/Research Libraries Group (CPA/RLG) Task Force on Archiving of Digital Information (1996) Preserving digital information: report of the task force on archiving of digital information
Consultative Committee for Space Data Systems (CCSDS) (2011) Requirements for bodies providing audit and certification of candidate trustworthy digital repositories (No. CCSDS 652.1-M-1). The Consultative Committee for Space Data Systems
Dobratz S, Schoger A, Strathmann S (2007) The nestor catalogue of criteria for trusted digital repository evaluation and certification. J Digit Inf 8(2). http://journals.tdl.org/jodi/article/view/199/180. Accessed 13 Nov 2011
Doney PM, Cannon JP (1997) An examination of the nature of trust in buyer–seller relationships. J Mark 61:35–51
Electronic Resource Preservation and Access Network (ERPANET) (2004) The role of audit and certification in digital preservation. Stadsarchief Antwerpen, Belgium
Electronic Resource Preservation and Access Network (ERPANET) Workshop Report (2003) Trusted digital repositories for cultural heritage. Accademia nazionale dei Lincei, Rome
Faniel I, Zimmerman A (2011) Beyond the data deluge: a research agenda for large-scale data sharing and reuse. Int J Digit Curation 6(1):58–69
Gambetta D (1988) Can we trust trust? In: Gambetta D (ed) Trust: making and breaking cooperative relations. Basil Blackwell, New York, pp 213–237
Giddens A (1990) The consequences of modernity. Stanford University Press, Stanford
ISO 14721 (2002) Reference model for an open archival information system (OAIS)
ISO/DIS 16363 (2012) Audit and certification of trustworthy digital repositories
ISO/DIS 16919 (under development) Requirements for bodies providing audit and certification of candidate trustworthy digital repositories
Jantz R, Giarlo M (2007) Digital archiving and preservation: technologies and processes for a trusted repository. J Arch Organ 4:193–213. doi:10.1300/J201v04n01_10
Lewicki RJ, Bunker BB (1995) Developing and maintaining trust in work relationships. In: Kramer RM, Tyler TR (eds) Trust in organizations: frontiers of theory and research. Sage Publications, CA, pp 114–139
Lynch C (2000) Authenticity and integrity in the digital environment: an exploratory analysis of the central role of trust. In: Authenticity in a digital environment (pub92). http://www.clir.org/pubs/reports/pub92/lynch.html. Accessed 7 Nov 2011
Mayer RC, Davis JH, Schoorman FD (1995) An integrative model of organizational trust. Acad Manag Rev 20(3):709–734
McKnight DH, Cummings LL, Chervany NL (1998) Initial trust formation in new organizational relationships. Acad Manag Rev 23(3):473–490
Mishra AK (1996) Organizational responses to crisis: the centrality of trust. In: Kramer RM, Tyler TR (eds) Trust in organizations: frontiers of theory and research. Sage Publications, CA, pp 261–287
Mooney H (2011) Citing data sources in the social sciences: do authors do it? Learn Publ 24(2):99–108. doi:10.1087/20110204
National Archives (2011) NARAtions: the blog of the United States National Archives (2011, March 15) ISO standards for certifying trustworthy digital repositories. http://blogs.archives.gov/online-public-access/?p=4697. Accessed 8 Sep 2011
Nestor Working Group on Trusted Repositories Certification (2006) Catalogue of criteria for trusted digital repositories: version 1. http://files.d-nb.de/nestor/materialien/nestor_mat_08-eng.pdf. Accessed 20 Dec 2011
Pirson M, Malhotra D (2011) Foundations of organizational trust: what matters to different stakeholders? Organ Sci 22:1087–1104. doi:10.1287/orsc.1100.0581
Prieto AG (2009) From conceptual to perceptual reality: trust in digital repositories. Libr Rev 58(8):593–606. doi:10.1108/00242530910987082
Research Libraries Group/National Archives and Records Administration (RLG/NARA) Task Force on Digital Repository Certification (2005) Audit checklist for certifying digital repositories. http://web.archive.org/web/20050922170446/http://www.rlg.org/en/page.php?Page_ID=20769. Accessed 30 Aug 2011
Research Libraries Group/Online Computer Library Center (RLG/OCLC) Working Group on Digital Archive Attributes (2002) Trusted digital repositories: attributes and responsibilities. http://www.rlg.org/longterm/repositories.pdf. Accessed 1 Jul 2011
Ross S, McHugh MA (2005) Audit and certification of digital repositories: creating a mandate for the Digital Curation Centre (DCC). RLG DigiNews 9(5). http://eprints.erpanet.org/105/01/Ross_McHugh_auditandcertification_RLG_DigiNews.pdf. Accessed 11 Nov 2011
Rotter JB (1967) A new scale for the measurement of interpersonal trust. J Pers 35:615–665
Rousseau DM, Sitkin SB, Burt RS, Camerer C (1998) Not so different after all: a cross-discipline view of trust. Acad Manag Rev 23(3):393–404
Sheppard BH, Sherman DM (1998) The grammars of trust: a model and general implications. Acad Manag Rev 23(3):422–437
Tschannen-Moran M (2001) Collaboration and the need for trust. J Educ Adm 39(4):308–331
Yakel E, Faniel I, Kriesberg A, Yoon A (2013) Trust in digital repositories. Int J Digit Curation 8(1):143–156. doi:10.2218/ijdc.v8i1.251
Author Biography

Ayoung Yoon is a third-year doctoral student at the School of Information and Library Science, University of North Carolina at Chapel Hill. Her research interests include users' trust in data and data repositories, data curation, and personal digital archiving on the Web. She has an MSI in both preservation and archives and records management from the University of Michigan School of Information, and a BA in history from Ewha Womans University, South Korea. She is currently a Carolina Digital Curation (DigCCurr) Doctoral Fellow.

work_lsiffuipcjgspjrk4wmubn3fhu ---- Türk Kütüphaneciliği 13, 2 (1999), 139-158

Opinion Papers - Letters (Görüşler - Okuyucu Mektupları)

The Problem of Preserving Electronic Information, from the World Literature (Dünya Literatüründen Elektronik Bilgileri Saklama Sorunu)

The rapid growth in the sharing of electronic information that began in the 1980s has made access to digital information resources a first-order concern for information centers. Recently, however, it has become impossible to predict today whether every kind of information now being stored in electronic environments will be worth preserving for the future. For this reason, policies for the preservation and storage of electronically recorded information are now as high a priority as information retrieval itself. This topic, which is treated extensively in the literature, seems likely to continue guiding information centers for a long time to come.
In her article published in the March issue of American Libraries, Abby Smith (1999: 36) emphasizes that practices for preserving information in electronic environments do not depend on hardware alone. Smith notes that even in the digital age, the term "preservation" is still understood to mean restoration. Indeed, in English the term "preservation" denotes keeping material in a form as close to the original as possible. Yet alongside protecting material by staying as faithful as possible to its original form and providing the physical conditions to do so, recorded information is also transferred to different media for use. Work of this kind, developed in line with a preservation policy, is described as modern conservation methods. The most widely used techniques today for reproducing original information by copying it directly onto different material, and for obtaining printed output, are COM (computer-output microfilm) and CD-ROM. Commercial organizations offering such services can now be reached over the Internet. As an example for information centers in our country, it is striking how rapidly the developed countries are investing in cooperation opportunities in this field, as in the other areas of our profession.

The Northeast Document Conservation Center (NEDCC) organized a workshop on the preservation and conservation of digital documents in Colorado (USA) on 11-13 May. The meetings discussed various projects applying different approaches to the preservation and conservation of digital material. One of these projects is being carried out within a consortium formed under the name of the Colorado Preservation Alliance (CPA). The consortium's aims include education and awareness-raising on preservation and conservation, taking designated material under protection, and services such as the discounted supply of archival material.

At a meeting held on 16 April in London (England) within the scope of the "cedars" project, it was proposed that libraries and archives cooperate to preserve digital material. Some of the decisions taken at this meeting can be listed as follows:

• Carrying out a situation assessment in order to develop policies for libraries' digital collections
• Conducting user studies to determine users' needs in this area
• Selecting model libraries in which work on digital collection development and preservation will be carried out.

Policies for the storage and preservation of digital material should largely be considered together with the process by which such material is created. This process covers different forms of selection and acquisition, namely:

• Transferring traditional information resources to electronic media
• Purchasing digital information resources, such as CD-ROMs, directly
• Providing access to information offered free of charge on the Web
• Storing this information using the library's own facilities.

Each of the processes listed may, under its own conditions, require a different preservation policy for each information center. Moreover, some information in electronic environments may, within a very short period, lose both its currency and its validity for the user, and may not need to be preserved at all.
Which information should be preserved, and in which media it can most soundly be stored, are questions that seem likely to remain problems for information centers in the future as well. The production of guide publications capable of helping information centers resolve these dilemmas has been included within the scope of the project mentioned above.

We should not think of the future of our movable cultural assets, scattered across our country and abandoned to their fate under unhealthy conditions, as separate from the problem of storing the masses of information in electronic environments. No single information center is responsible on its own for making every kind of information resource stored in different information centers available under sound conditions and for guaranteeing its future. All information centers should take ownership of the awareness of this issue built in the past and develop new projects and resource-sharing programs.

OCLC's Web-Based Information Search Service: "FirstSearch"

OCLC provides an online reference service for libraries over the World Wide Web. By visiting <http://www.oclc.org/oclc/menu/fs-new.htm>, thesaurus-based searches can be run across more than 75 databases. "FirstSearch" also provides access to the collections of OCLC member libraries. Through this site, which offers a service integrated with virtual libraries, users can make online use of OCLC's union catalog of library records, interlibrary loan services, full-text access, and access to Internet resources.

"FirstSearch" services, through which 5,000 titles can be searched in databases such as EconLit, MEDLINE, Social Science Abstracts, PsycFIRST and PsycINFO, will become compliant with the Z39.50 standard as of this month.

Subscribing to "FirstSearch-L": one may subscribe by joining the e-mail list at <http://www.oclc.org/oclc/menu/fs-new.htm>.

References

Smith, Abby. (1999). "Preservation in the digital age: What is to be done?", American Libraries, March: 36-38.
<http://www.aclin.org/other/libraries/cpa> (May 1999)
<http://www.leeds.ac.uk/cedars> (May 1999)

Dr. Özlem Bayram, Department of Librarianship (Kütüphanecilik Bölümü), A.Ü.D.T.C.F.

work_lufgnyxjrzgzlfxotzf6tfww4m ---- Bone Marrow Transplantation (2000) 25, 3. © 2000 Macmillan Publishers Ltd. All rights reserved 0887-6924/00 $15.00 www.nature.com/bmt

Publisher's Announcement

Macmillan Publishers Ltd is pleased to be able to announce the creation of a new company. Nature Publishing Group brings together Nature, the Nature monthly titles and the journals formerly published by Stockton Press. Stockton Press becomes the Specialist Journals division of Nature Publishing Group. Nature Publishing Group will use its unique strengths, skills and global perspective to meet the demand of a rapidly changing and challenging publishing environment. The Group's publications are known for delivering high-quality, high-impact content, fair pricing, rapid publication, global marketing and a substantial presence on the internet. These elements are the key to excellence in selecting, editing, enhancing and delivering scientific information in the future. As a company, we have three core values: quality, service and visibility.
These values are set to benefit all our customers – authors, readers, librarians, societies, companies and others – thus building strong publishing relationships.

Bone Marrow Transplantation

Bone Marrow Transplantation is now part of the Specialist Journals division of Nature Publishing Group. It will be marketed and sold from our offices in New York, Tokyo, London and Basingstoke. Within the electronic environment, Bone Marrow Transplantation will benefit from a substantial investment in innovative online publishing systems, offering global access, intelligent searches and other essential functions. Librarians will be able to provide their readers with print and online versions of Bone Marrow Transplantation through a variety of services including OCLC, Ingenta (linking to the BIDS service), SwetsNet, Ebsco, Dawson's InfoQuest and Adonis. At a time when the basis of traditional journal publishing is undergoing significant changes, Nature Publishing Group aims to support the scientific and medical community's needs for high quality publication services which provide rapid and easy access to the best of biomedical and clinical results.

Jayne Marks
Publishing Director
Specialist Journals, Nature Publishing Group

work_luqj67uia5bz7dl2bleujfmdwi ---- ASIST Paper (DRAFT)

Geographical Representation of Library Collections in WorldCat: A Prototype

Lynn Silipigni Connaway*, Consulting Research Scientist III, Research, OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, OH 43017. Email: connawal@oclc.org
Clifton Snyder, Software Engineer, Research, OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, OH 43017. Email: snyderc@oclc.org
Lawrence Olszewski, Director, OCLC Library, OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, OH 43017. Email: olszewsl@oclc.org

*All correspondence should be directed to Lynn Silipigni Connaway.

Note: This is a pre-print version of a paper given at ASIS&T 2005, "Sparking Synergies: Bringing Research and Practice Together," the Annual Meeting of the American Society for Information Science and Technology, Session (B): Information Science Issues and Practices, Tuesday, November 1, 2005. Please cite the published version; a suggested citation appears below.

Abstract

In today's world, people can be inundated by an overwhelming amount of information. Library and information science professionals attempt to provide information systems that are capable of retrieving precise and accurate information. One method for the organization and retrieval of geographically-based information is to develop a system to visually represent the data. A prototype for an interactive map, the OCLC WorldMap, was developed to provide a visual tool for the management and representation of geographically-based library collections and library data. These data are used to provide information for decision-making in regard to remote storage, collection management and cooperative collection development, preservation, and digitization. The collection data for the map were generated from WorldCat, the OCLC Online Computer Library Center bibliographic database. WorldCat contains approximately 55 million records, and not only serves as an aggregator of bibliographic data, but also identifies a billion holding locations by type of library for library resources. Additional data, gathered from more than thirty other sources, such as library expenditures and number of libraries, are represented on the OCLC WorldMap.
© 2005 OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, Ohio 43017-3395 USA. http://www.oclc.org/ Reproduction of substantial portions of this publication must contain the OCLC copyright notice.

Suggested citation: Connaway, Lynn Silipigni, Clifton Snyder, and Lawrence Olszewski. 2005. "Geographical Representation of Library Collections in WorldCat: A Prototype." Presented at the poster session at the 2005 ASIST Conference, Charlotte, NC, Nov. 1, 2005. Pre-print available online at: http://www.oclc.org/research/publications/archive/2005/connawayetal-asist05.pdf (PDF: 226K/10pp.)

Introduction

The development of the Internet has provided a means for the creation and distribution of more data than ever has been available before. One of the principles of library and information science is to provide systems for the organization and retrieval of information and to provide users assistance with the evaluation of this information. One way to organize large datasets is the creation of visual representations of data. Visual displays are often more appealing and more easily understood than tables of numbers. Another method is to pare the data, carving off large chunks of extraneous facts and figures in an attempt to get to the important information. This project combines these two concepts into a single visual interface.

The goal of the OCLC WorldMap is to create a visual tool for the management and representation of geographically-based library collections and data. Most of the data were mined from WorldCat, the OCLC bibliographic database, which contains more than 55 million records. Additional data were gathered from the Association of Research Libraries (ARL) (http://www.arl.org) and the National Center for Educational Statistics (NCES) (http://nces.ed.gov). The analysis of these statistics can provide useful information for remote storage, collection management and cooperative collection development, preservation, and digitization. Previously, the data existed in spreadsheets with thousands of rows and columns, which made it difficult to review and analyze. The interactive OCLC WorldMap was created to geographically represent these data.
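The paper does not publish the schema behind these spreadsheets. Purely as an illustration of the kind of structure that could drive the displays described below, the following ECMAScript sketch keys one dataset by country code; every field name and value here is hypothetical, not taken from the prototype:

  // Hypothetical shape of one OCLC WorldMap dataset, keyed by country code.
  // Field names and numbers are illustrative only; no schema was published.
  var worldCatRecords = {
    "CA": { name: "Canada", records: 1200345 },
    "TR": { name: "Turkey", records: 98012 }
  };

  // Selecting a dataset then reduces to one lookup per country on the map.
  function valueFor(countryCode, dataset) {
    var row = dataset[countryCode];
    return row ? row.records : 0; // countries absent from the data render as empty
  }

A structure of this kind would also make a sortable tabular view straightforward, since the same key-value pairs can be rendered as rows and ordered by any column.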
Literature Review

Discussion of geographic information systems (GIS) dates back to the early 1990s; the American Society for Information Science devoted a special issue to spatial information, edited by Gluck (1994), in an attempt to bring this topic to the attention of library and information scientists. Lamont (1997) discusses the management issues involved in collecting, describing, and accessing spatial data. Fraser and Gluck (1999) discuss how users determine the relevance or potential value of geospatial objects from metadata in an update to Gluck's work with the OCLC Office of Research (Gluck, 1997). Gluck and Yu (2000) provide an introduction to the topic of GIS uses in libraries in their discussion of standards for geospatial data, description of several library implementations of GIS, and analysis of the role of GIS as a library management tool.

Koontz (1996) addresses how GIS software facilitates library market analysis. Ottensmann (1997) discusses how geographical information systems can be employed to analyze patterns of library utilization in public library systems. Jue et al. (1999) analyze the distribution of poverty areas relative to public library outlets in order to assess the best funding and development policies for those residents. Koontz and Jue (2004, 2004a) focus on how public librarians can use data to help them fulfill their roles as agencies in equitable information provision. They also have made available a Public Library Geographic Database (PLGDB) that geographically represents United States public library census data (http://www.geolib.org/PLGDB.cfm).

Tufte's pioneering study (2001) explains how to communicate information through the simultaneous presentation of words, numbers and pictures. According to Shneiderman and Plaisant (2004), there are five factors for benchmarking the usability of an interface. Two factors, time to learn and speed of performance, are measured by carefully timing a set of tasks provided to the tester. Users are carefully watched to determine their rate of errors, and retention over time is judged by the tester's ability to complete similar tasks throughout the course of testing. Finally, the user's subjective satisfaction is gauged by both the spoken comments during testing as well as a follow-up interview and questionnaire.

OCLC WorldMap – Development and Specifications

The OCLC WorldMap does not attempt photographic accuracy; rather, it is a relatively simple means of displaying geographic data. The system allows the user to select a dataset of interest from several options provided on the map. The user is able to display library collections, a group of libraries' collections, and all holdings in WorldCat by country of publication and date, or by library data. See Figure 1. The results are displayed on the map by variations of gradation to represent the data for the selected geographic regions (see Figure 2) or in a data table (see Figure 3), which the user is able to sort by selected column headers.

Many different technologies are available for creating tools of this kind. In an attempt to create an entirely open source/open standards prototype, the first version of the map was created using Scalable Vector Graphics (SVG), an open standard maintained by the W3C. SVG supports the rendering of basic shape elements (circle, rectangle, line, ellipse, polyline, polygon), a text element, and complex paths. It supports styling for these elements and cascading style sheets (CSS), allowing the developer to manipulate various painting attributes as well as clipping, masking, and filtering. The specification also provides the means for describing various transformations and animations of these elements that can be automatically started or triggered by user interaction, and allows for custom scripting using ECMAScript, a scripting language based on JavaScript.
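The pre-print includes no source code, so the following ECMAScript sketch is only an assumed illustration of the SVG construction just described: a country is built as a polygon in the SVG namespace and its fill is derived from a data value, giving the gradation effect of Figure 2. The helper names and the five-step gray ramp are inventions for this example, not the prototype's actual palette or API:

  var SVG_NS = "http://www.w3.org/2000/svg";

  // Map a data value onto a simple light-to-dark ramp (assumed five-step palette).
  function shadeFor(value, max) {
    if (max <= 0) return "#f0f0f0";
    var level = Math.min(4, Math.floor((value / max) * 5));
    return ["#f0f0f0", "#c9c9c9", "#a3a3a3", "#7d7d7d", "#575757"][level];
  }

  // Draw one country as an SVG polygon whose fill encodes its data value.
  function drawCountry(svgRoot, pointList, value, max) {
    var shape = document.createElementNS(SVG_NS, "polygon");
    shape.setAttribute("points", pointList); // "x1,y1 x2,y2 ..." outline coordinates
    shape.setAttribute("fill", shadeFor(value, max));
    shape.setAttribute("stroke", "#333333"); // resting outline color
    svgRoot.appendChild(shape);
    return shape;
  }

Because SVG shapes are live DOM nodes, the same elements can later be restyled or animated from script, which is what makes the interaction model discussed below possible.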
Because it is scalable, any amount of zooming in or out of an SVG document can be done without pixelation or distortion. SVG is a very young technology, however; therefore, browser support is currently limited. The most popular of the SVG plugins, produced by Adobe, is only supported on Windows with Internet Explorer or Netscape Navigator. The Mozilla Foundation is currently working on an SVG project, with the eventual goal of providing native support for their Mozilla and Mozilla Firefox browsers. At this time most browsers support Macromedia Flash. In order to make the map available to a larger audience and to provide features not supported by SVG, Flash was used to implement a new version of the OCLC WorldMap.

Usability Testing

Usability testing for the OCLC WorldMap was done internally in the OCLC Usability Lab using OCLC employees. The tests provided a diverse set of users from a range of different educational backgrounds, disciplines, ages, and levels of technical experience. Informal requests for revisions also were provided by staff within the Office of Research and staff from other OCLC divisions.

The initial testing of the OCLC WorldMap returned a conclusive verdict: the map was interesting, but the interface design was cumbersome and difficult to use. All of the users had difficulty locating small, obscure countries, which was attributed to a poorly implemented search feature. This problem was solved by providing the user with a textual list of country names, organized hierarchically by continent.

The prototype map used only primary colors. According to Tufte (2001, pp. 153-54), a color-coded topical map should not "overload" the use of coloration as a means of communicating information to the user; it tends to be overwhelming and negatively impacts the user's understanding of the data. For example, the map that was tested used colors to represent the dataset selected, whether or not the user was hovering over a given country, and whether the user had selected (clicked on) a given country. Lighter tones were used to replace the primary colors, and user interaction is now indicated by changing the color of the outline of a country (see the sketch at the end of this section).

The users also noted the lack of sufficient "Help" functionality and confusion about how to use drop-down boxes effectively to access information; they found the labeling of the boxes misleading and uninformative. Despite these trouble spots, the testers saw the potential of the map and offered many helpful suggestions for improving its usability.
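As a sketch of the revised interaction model (hover and selection signaled by the outline rather than the fill), the ECMAScript below changes only the stroke of a country shape such as the one built in the previous sketch. The specific colors and the callback name are assumptions consistent with the description above, not the prototype's code:

  // Indicate hover and selection by recoloring the outline, leaving the fill intact.
  function makeInteractive(shape, onSelect) {
    shape.addEventListener("mouseover", function () {
      shape.setAttribute("stroke", "#0066cc"); // hovered: highlighted outline
      shape.setAttribute("stroke-width", "2");
    });
    shape.addEventListener("mouseout", function () {
      shape.setAttribute("stroke", "#333333"); // back to the resting outline
      shape.setAttribute("stroke-width", "1");
    });
    shape.addEventListener("click", function () {
      shape.setAttribute("stroke", "#cc3300"); // selected: distinct outline color
      if (onSelect) onSelect(shape);
    });
  }

Keeping state changes out of the fill leaves the full color range free to encode data, which is the point of Tufte's warning about overloading color.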
Conclusion and Future Development

The OCLC WorldMap is a prototype, with many planned additions and revisions. The development of a system that enables the addition and revision of datasets with greater ease will facilitate future updates. Although it is not practical for a system to represent every possible dataset, the map in its current form is much too static. The map is being revised to use varying shades of a single color to represent the data so that variability within the datasets will become more distinct and understandable to the user.

Other methods for displaying geographic data also are being developed, such as cartograms. One interesting example represents countries as circles as opposed to shapes defined by political boundaries. These circles vary in size depending upon the number of materials represented in WorldCat by country of publication. This tool makes it possible to show a user at a glance where the greatest number of WorldCat records is published. Other visual tools for information display are being explored to represent data in non-traditional ways, which will have the potential to make datasets more accessible to the user.

The data represented in the OCLC WorldMap can be utilized by several different user groups. Internally, marketing and sales staff can use the library data to target potential areas of growth by segment in global expansion and to assess current market penetration. Externally, the place, date and language of publication data can be used by collection development staff to identify strong and weak area studies collections and to determine collection overlap between, and last copies held by, individual and groups of libraries. Librarians can use the datasets to plan for the integration of paper and digital collaborative collections, suggest candidates for deaccessioning and remote storage, and identify areas for preservation and digitization.
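The circle cartogram mentioned above implies some rule for sizing each country's circle from its record count. A minimal sketch, assuming the conventional square-root scaling so that circle area, rather than radius, tracks the count (the paper does not state which rule the prototype uses):

  // Size a cartogram circle so its area is proportional to the record count.
  function cartogramRadius(records, maxRecords, maxRadiusPx) {
    if (maxRecords <= 0) return 0;
    return maxRadiusPx * Math.sqrt(records / maxRecords);
  }

  // Example: a country with one quarter of the maximum count gets half the
  // maximum radius, and therefore one quarter of the circle area.
  var r = cartogramRadius(250000, 1000000, 40); // r === 20

Area-proportional sizing matters because readers judge circle area, not radius; scaling the radius linearly would visually exaggerate the countries with the largest counts.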
REFERENCES

Association of Research Libraries. Retrieved June 29, 2005 from http://www.arl.org.
Fraser, B. & Gluck, M. (1999). Usability of geospatial metadata or space-time matters. American Society for Information Science Bulletin, 25, 24-26.
Gluck, M. (Ed.). (1994). Spatial information [Special issue]. Journal of the American Society for Information Science, 45(9).
Gluck, M. (1997). A descriptive study of the usability of geospatial metadata. Annual Review of OCLC Research. Retrieved June 29, 2005 from http://digitalarchive.oclc.org/da/ViewObject.jsp;jsessionid=36543e159640472aad5d9c6dc5e3ccc4?fileid=0000002652:000000058927&reqid=22096.
Gluck, M. & Yu, L. (2000). Geographic Information Systems: background, frameworks, and uses in libraries. In F. C. Lynden & E. A. Chapman (Eds.), Advances in Librarianship, 23, 1-38.
Jue, D. K. et al. (1999). Using public libraries to provide technology access for individuals in poverty: a nationwide analysis of library market areas using a Geographic Information System. Library & Information Science Research, 21, 299-325.
Koontz, C. (1996). Using Geographic Information Systems for estimating and profiling geographic library market areas. In L. C. Smith & M. Gluck (Eds.), Geographic Information Systems and libraries: patrons, maps, and spatial information (pp. 181-193). Clinic on Library Applications of Data Processing. Champaign: Graduate School of Library and Information Science.
Koontz, C. & Jue, D. K. (2004). Customer data 24/7 aids library planning and decision making. Florida Libraries, 47, 17-19.
Koontz, C. & Jue, D. (2004a). Unlock your demographics. Library Journal, 129(4), 32-33.
Lamont, M. (1997). Managing geospatial data and services. Journal of Academic Librarianship, 23, 469-73.
National Center for Educational Statistics. Retrieved June 29, 2005 from http://nces.ed.gov.
Ottensmann, J. R. (1997). Using geographic information systems to analyze library utilization. Library Quarterly, 67, 373-95.
Public Library Geographic Database. Retrieved June 29, 2005 from http://www.geolib.org/PLGDB.cfm.
Shneiderman, B. & Plaisant, C. (2004). Designing the user interface (4th ed.). Boston, MA: Pearson/Addison-Wesley.
Tufte, E. R. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.
UNESCO Institute for Statistics. Retrieved June 29, 2005 from http://www.uis.unesco.org.

The following are trademarks and/or service marks of OCLC Online Computer Library Center: WorldCat, WorldMap.

Appendix A: Tables

[Figure 1. OCLC WorldMap search screen]
[Figure 2. Results displayed by variations of gradation.]
[Figure 3. Results displayed in a data table.]

work_m6ljsw2ee5hkzp7t76y3owxhnm ---- Canadian Journal of Neurological Sciences, Volume 46, Number 3, May 2019
Intraoperative Flash Visual Evoked Potential Recording and Relationship to Visual Outcome. David A. Houlden, Chantal A. Turgeon, Nathaniel S. Amyot, Idara Edem, John Sinclair, Charles Agbi, Thomas Polis, Fahad Alkherayf. Can J Neurol Sci. 2019;46:299

Volume 46, No. 3, The Canadian Journal of Neurological Sciences, May 2019 (267-371). AN INTERNATIONAL JOURNAL PUBLISHED BY THE CANADIAN NEUROLOGICAL SCIENCES FEDERATION.

The official Journal of: The Canadian Neurological Society, The Canadian Neurosurgical Society, The Canadian Society of Clinical Neurophysiologists, The Canadian Association of Child Neurology, and The Canadian Society of Neuroradiology.

Cambridge Core. For further information about this journal please go to the journal website at: cambridge.org/cjn

Volume 46 / Number 3 / May 2019

267 Thank You to Our Reviewers

REVIEW ARTICLES
269 Standards of Practice in Acute Ischemic Stroke Intervention: International Recommendations. Laurent Pierot, Mahesh Jarayaman, Istvan Szikora, Joshua Hirsch, Blaise Baxter, Shigeru Miyachi, Jeyaledchumy Mahadevan, Winston Chong, Peter J. Mitchell, Alan Coulthard, Howard A. Rowley, Pina C. Sanelli, Donatella Tampieri, Patrick Brouwer, Jens Fiehler, Naci Kocer, Pedro Vilela, Alex Rovira, Urs Fischer, Valeria Caso, Bart van der Wort, Nobuyuki Sakai, Yuji Matsumaru, Shin-ichi Yoshimura, Luisa Biscoito, Manuel Pumar, Orlando Diaz, Justin Fraser, Italo Lifante, David S. Liebeskind, Raul G. Nogueira, Werner Hacke, Michael Brainin, Bernard Yan, Michael Soderman, Allan Taylor, Sirintara Pongpech, Karel Terbrugge
275 Incidentaloma Discoveries in the Course of Neuroimaging Research. Emmanuel Stip, Jean-Philippe Miron, Marie Nolin, Geneviève Letourneau, Odette Bernazzani, Laurie Chamelian, Bernard Boileau, Mona Gupta, David Luck, Ovidiu Lungu

COMMENTARY
280 Rowan's Rugby Fatality Prompts Canada's First Concussion Legislation. Charles Tator, Jill Starkes, Gillian Dolansky, Julie Quet, Jean Michaud, Michael Vassilyadi

ORIGINAL ARTICLES
283 A Survey of Cerebrospinal Fluid Total Protein Upper Limits in Canada: Time for an Update? Pierre R. Bourque, Christopher R. McCudden, Jodi Warman-Chardon, John Brooks, Harald Hegen, Florian Deisenhammer, Ari Breiner
287 Effects of Deep Brain Stimulation of the Subthalamic Nucleus Settings on Voice Quality, Intensity, and Prosody in Parkinson's Disease: Preliminary Evidence for Speech Optimization. Anita Abeyesekera, Scott Adams, Cynthia Mancinelli, Thea Knowles, Greydon Gilmore, Mehdi Delrobaei, Mandar Jog
Amyot, Idara Edem, John Sinclair, Charles Agbi, Thomas Polis, Fahad Alkherayf 303 Psychiatric Neurosurgery: A Survey on the Perceptions of Psychiatrists and Residents Josiane Cormier, Christian Iorio-Morin, David Mathieu, Simon Ducharme 311 The Montreal Cognitive Assessment as a Cognitive Screening Tool in Athletes Chantel Teresa Debert, Joan Stilling, Meng Wang, Tolulope Sajobi, Kristina Kowalski, Brian Walter Benson, Keith Yeates, Sean Peter Dukelow 319 Factors Associated with Having a Will, Power of Attorney, and Advanced Healthcare Directive in Patients Presenting to a Rural and Remote Memory Clinic Sydney Lee , Andrew Kirk, Emily A. Kirk, Chandima Karunanayake, Megan E. O’Connell, Debra Morgan 331 Clot Histopathology in Ischemic Stroke with Infective Endocarditis Sonu Bhaskar, Jawad Saab, Cecilia Cappelen-Smith, Murray Killingsworth, Xiao Juan Wu, Andrew Cheung, Nathan Manning, Patrick Aouad, Alan McDougall, Suzanne Hodgkinson, Dennis Cordato 337 Uric Acid Levels Correlate with Sensory Nerve Function in Healthy Subjects Alon Abraham, Hans D. Katzberg, Leif E. Lovblom, Bruce A. Perkins, Vera Bril NEUROIMAGING HIGHLIGHTS 342 Characteristic Cerebrovascular Findings Associated with ACTA2 Gene Mutations Andrew Zhang, Alexandria Jo, Karen Grajewski, John Kim 344 Eagle Syndrome as a Cause of Cerebral Venous Sinus Thrombosis Fu-Liang Zhang, Hong-Wei Zhou, Zhen-Ni Guo, Yi Yang Volume 46 / Number 3 / May 2019 A-1 https://www.cambridge.org/core/terms. https://doi.org/10.1017/cjn.2019.52 Downloaded from https://www.cambridge.org/core. Carnegie Mellon University, on 06 Apr 2021 at 00:37:04, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms https://doi.org/10.1017/cjn.2019.52 https://www.cambridge.org/core 346 Radiological Demonstration of Choroid Plexus Causing Proximal Shunt Dysfunction Lior M. Elkaim , Pascal Lavergne, Dominic Venne, Alexander G. Weil 348 Primary Spinal Cord Melanoma – An Uncommon Entity Ritodhi Chatterjee, Fábio A. Nascimento, Kent A. Heck, Alexander E. Ropper, Anita L. Sabichi BRIEF COMMUNICATIONS 351 Fatal Second Impact Syndrome in Rowan Stringer, A 17-Year-Old Rugby Player Charles Tator, Jill Starkes, Gillian Dolansky, Julie Quet, Jean Michaud, Michael Vassilyadi 355 Effect of Head Rotation on Jugular Vein Patency Under General Anesthesia Mark A. Burbridge , Jung Gi Min, Richard A. Jaffe LETTER TO THE EDITOR 358 Intrathecal Trastuzumab as a Potential Cause of Drug-Induced Aseptic Meningitis Evangelia Pappa, Rosa Conforti, Khê Hoang-Xuan, Agusti Alentorn 360 Step on the Gas! Focal Seizures with Autonomic Features Alyson Plecash, Daryl Wile 363 Post-ischemic Leukoencephalopathy after Endovascular Treatment for Acute Ischemic Stroke Ahmad Nehme, Andrée-Anne Pistono, François Guilbert, Érika Stumpf 366 A Unique Case of Sinonasal Teratocarcinosarcoma Presenting as Foster Kennedy Syndrome Madeleine de Lotbinière-Bassett, Michael B. Avery, Yves P. Starreveld 369 A 51-year-old Man with Primary Spinal Cord Germinoma Amanallah Montazeripouragha, Brian Schmidt, Patricia Baker, Janice Safneck, Perry Dhaliwal, Marshall Pitz, Saranya Kakumanu, Sherry Krawitz Volume 46 / Number 3 / May 2019 A-2 https://www.cambridge.org/core/terms. https://doi.org/10.1017/cjn.2019.52 Downloaded from https://www.cambridge.org/core. 
Carnegie Mellon University, on 06 Apr 2021 at 00:37:04, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms https://doi.org/10.1017/cjn.2019.52 https://www.cambridge.org/core Volume 46 / Number 3 / May 2019 A-3 Editor-in-Chief/Rédactueur en chef Robert Chen toronto, on Associate Editors/Rédacteurs associés Robert Hammond london, on Philippe Hout montreal, qc Mahendranath Moharir toronto, on Tejas Sankar edmonton, ab Manas Sharma london, on Jeanne Teitelbaum montreal, qc Richard Wennberg toronto, on Past Editors/Anciens rédacteurs en chef G. Bryan Young london, on Douglas W. Zochodne calgary, ab James A. Sharpe toronto, on Robert G. Lee calgary, ab Robert T. Ross winnipeg, mb (Emeritus Editor, Founding Editor) Editorial Board/Comité éditorial Jorge Burneo, london, on Jodie Burton, calgary, ab Colin Chalk, montreal, qc K. Ming Chan, edmonton, ab Alan Goodridge, st. john’s, nl Mark Hamilton, calgary, ab Michael Hill, calgary, ab Alan C. Jackson, winnipeg, mb Draga Jichici, hamilton, on Suneil Kalia, toronto, on Daniel Keene, ottowa, on Julia Keith, toronto, on Stephen Lownie, london, on Jian-Qiang Lu, hamilton, on Patrick McDonald, vancouver, bc Joseph Megyesi, london, on Tiago Mestre, ottowa, on Sarah Morrow, london, on Michael Nicolle, london, on Narayan Prasad, london, on Alex Rajput, saskatoon, sk Kesh Reddy, hamilton, on Ramesh Sahpaul, north vancouver, bc Dipanka Sarma, toronto, on Sean Symons, toronto, on Brian Toyota, vancouver, bc Brian Weinshenker, rochester, mn Sam Wiebe, calgary, ab Eugene Yu, toronto, on Journal Staff/Effectif du Journal Dan Morin calgary, ab Chief Executive Officer Donna Irvin calgary, ab CNSF Membership Services / Communications Officer The official journal of: / La Revue officielle de: The Canadian Neurological Society La Société Canadienne de Neurologie The Canadian Neurosurgical Society La Société Canadienne de Neurochirurgie The Canadian Society of Clinical Neurophysiologists La Société Canadienne de Neurophysiologie Clinique The Canadian Association of Child Neurology L’ Association Canadienne de Neurologie Pédiatrique The Canadian Society of Neuroradiology La Société Canadienne de Neuroradiologie The permanent secretariat for the five societies and the Canadian Neurological Sciences Federation is at: Le sécretariat des cinq associations et de la Fédération des sciences neurologiques du Canada est situe en permanence à : 143N – 8500 Macleod Trail SE Calgary, Alberta T2H 2N1 Canada CNSF (403) 229-9544 Fax (403) 229-1661 The Canadian Journal of Neurological Sciences is published bi-monthly. The annual subscription rate for Individuals (electronic) is £141/US$233. The annual subscription rate for Institutions (electronic) is £187/US$311. See for full details including taxes; e-mail: subscriptions_ newyork@cambridge.org. The Canadian Journal of Neurological Sciences is included in the Cambridge Core service, which can be accessed at cambridge.org/ cjn. For information on other Cambridge titles, visit www.cambridge.org. For advertising rates contact M. J. Mrvica Associates, 2 West Taunton Avenue, Berlin, NJ 08009; Phone: 856-768-9360; Fax: 856-753-0064; Email: mjmrvica@mrvica.com. Le Journal Canadien des Sciences Neuorlogiques est publié tous les deux mois. Le prix d’abonnement annuel pour les individus (électronique) est 141£/233US$. Le prix d’abonnement annuel pour les établissements (électronique) est 187£/311US$. 
This journal is indexed by / Cette revue est indexée par: Adis International, ArticleFirst, BIOBASE, BioLAb, BiolSci, BIOSIS Prev, Centre National de la Recherche Scientifique, CSA, CurAb, CurCont, De Gruyter Saur, E-psyche, EBSCO, Elsevier, EMBASE, FRANCIS, IBZ, Internationale Bibliographie der Rezensionen Geistes- und Sozialwissenschaftlicher Literatur, MEDLINE, MetaPress, National Library of Medicine, OCLC, PE&ON, Personal Alert, PsycFIRST, PsycINFO, PubMed, Reac, RefZh, SCI, SCOPUS, Thomson Reuters, TOCprem, VINITI RAN, Web of Science.

ISSN: 0317-1671. EISSN: 2057-0155.

COPYRIGHT © 2019 by THE CANADIAN JOURNAL OF NEUROLOGICAL SCIENCES INC. All rights reserved. No part of this publication may be reproduced, in any form or by any means, electronic, photocopying, or otherwise, without permission in writing from Cambridge University Press. Policies, request forms and contacts are available at: http://www.cambridge.org/about-us/rights-permissions. Permission to copy (for users in the U.S.A.) is available from Copyright Clearance Center: http://www.copyright.com, email: info@copyright.com.

work_mfhhrcalkvexfk3lrxnm2vkrj4 ----

Effects of the Gay Publishing Boom on Classes of Titles Retrieved Under the Subject Headings "Homosexuality," "Gay Men," and "Gays" in the OCLC WorldCat Database

By: James V. Carmichael, Jr., PhD

Carmichael, Jr., James V. (2002). 'Effects of the Gay Publishing Boom on Classes of Titles Retrieved Under the Subject Headings "Homosexuality," "Gay Men," and "Gays" in the OCLC WorldCat Database', Journal of Homosexuality, 42(3), 65-88. DOI: 10.1300/J082v42n03_05

Made available courtesy of Haworth Press (now Taylor and Francis): http://www.informaworld.com/smpp/title~content=t792306897

Reprinted with permission. No further reproduction is authorized without written permission from Taylor and Francis. This version of the document is not the version of record. Figures and/or pictures may be missing from this format of the document.

Abstract: What do searchers find when they look for literature on homosexuality?
This question has profound implications for older as well as younger gays in their coming out, as well as in their subsequent identity development. Library records provide credible data to answer the question, since they represent relatively free sources of information, unlike data from bookstores, publishers, and some World Wide Web sites. The records of WorldCat, the world's largest union database of library records, comprise over 30 million records listed in the Online Computer Library Center. For the purposes of the study, 18,757 records listed under "Homosexuality," "Gay Men," and "Gays" were downloaded; records for "Lesbian" and "Lesbians" were not examined. Findings of the study suggest that while there has indeed been considerable growth in terms of the quantity of gay literature produced since 1969, such gains may be offset by the deteriorating quality of cataloging copy, which makes the experience of browsing records a discouraging and confusing one.

KEYWORDS. Gay literature, gay publishing, libraries, gay subject headings, search terms, databases, gay nomenclature, OCLC

Author note: James V. Carmichael, Jr. is Professor of Library and Information Studies at The University of North Carolina at Greensboro. The author would like to thank graduate assistants Charles Wiggins, Katie Schlee, and Kevin Clement for their invaluable help in rechecking frequency counts and classifications. Correspondence may be addressed: P. O. Box 26171, The University of North Carolina at Greensboro, Greensboro, NC 24702-6171 (E-mail: Jim_Carmichael@uncg.edu).

INTRODUCTION

Library and Publishing Contexts

Gay and lesbian writers have often expressed their indebtedness to lesbigay literature in establishing and developing their identity. Actor Stephen Fry points to the "pansy path of freedom" which, at least in the Victorian era, constituted the only legal if nevertheless clandestine means of communication between those Uranians who traced their genealogy from the Socratic Dialogues to the veiled sensibilities of Swinburne's poetry, and the ambiguity of Tennyson's (Fry, 1997). Well into the post-Stonewall Era, literature has provided confirmation, if not sanction, of homosexual identity, whether that literature was indeed "literary fiction" or more graphically inclined "pornography." Lesbian activist Barbara Gittings, who for sixteen years headed the Task Force for Gay Liberation of the American Library Association (known soon after its founding as the Gay Task Force; after 1986, the Gay and Lesbian Task Force; after 1995, the Gay, Lesbian, and Bisexual Task Force; and after 1999, the Gay, Lesbian, Bisexual and Transgendered Round Table, hereinafter referred to as GLBTRT), explained the importance of reading and libraries to activist lesbians and gays of the 1960s and 1970s, many of whom grew up in the oppressive atmosphere, social distortions, and censorship of the McCarthy Era:

working in the movement kept reminding me that the written word has such a long-range effect, that the literature on homosexuality is so crucial in shaping the images we and others have of ourselves, that these distorted images we were forced to live with must not be allowed to continue. I knew that the lies in libraries had to be changed, but I didn't have a clear sense that we gay people could do it.
(Gittings, 1978, 108)

Other gay voices raised specific complaints about library practices that obscured the availability of gay information, or made the search for it perilous for those whose self-image was already fragile: pejorative Library of Congress Subject Headings ("Sexual Perversions") and Dewey Classifications ("Criminal Sexual Activity"); homophobia and ignorance among the professorate of library schools; and the marginality of alternative and radical press publications and the failure of libraries to collect them (Revolting Librarians, 1972). Particularly in the past decade, spurred by a promising Democratic administration, gay writers have flourished, relatively speaking, but not without a cost. The gay publishing boom, which reached its peak in 1993, when the biggest concern seemed to be the blurring of lines between gay/lesbian publishing and publishing in gender fields whose outlines were less distinct and more bizarre (Summer, 1993), quickly became a bubble that popped in the publishing realities of the decade. Mainstream publishers became loath to promote mid-market gay books (those selling 6,000 copies as opposed to the 20,000 copies expected of a best-seller), and gradually these were left to independent presses, university presses with specialized lists, and newer small publishers who aimed for mid-market titles with long back lists, an essential for the gay market (Mann, 1995; Bronski, 1999; Warren, 1999). Although the market for gay titles has not abated, the blurring of lines between creative fiction and gay literature, as in the case of Michael Cunningham's 1999 Pulitzer Prize-winning novel The Hours and Gore Vidal's (1999) collection of essays, Sexually Speaking, both marketed as mainstream literary titles by Barnes & Noble and Borders bookstores, makes their ready identification with gay content difficult for novice gay book buyers. Moreover, industry officials readily admit that promotion of gay titles by the mainstream press is ultimately dependent upon the support of a gay editorial administrator in the face of conscious or unconscious lack of support from others in the firm or in bookstores who may be squeamish about promoting a gay list.

The Problem of Definition

An academic industry has grown around the definition, delineation, and application of "queer theory" and "queer studies" (e.g., Browning, 1998; Mohr, 1992; Queer Representations, 1996; A Queer World, 1996; Sinfield, 1994) based loosely upon the supposition that discrete elements of homosexual identity and influence can be discerned. There is no agreement on the extent to which same-sex desire permeates society (Vidal, 1999, and cultural conservatives represented in Beyond Queer, 1996, often take great issue with homosexual identity, as opposed to homosexual acts), but same-sex desire, either explicit or implied, is frequently the measure by which themes of gay literature (Woods, 1998), art, and culture (Mohr, 1992, 129-218; Ortiz, 1999; Shillinglaw, 1999) are defined. Certainly homosexual pornography, the most prevalent and explicit expression of same-sex desire, has become as ubiquitous as its heterosexual equivalent, although it may signify more to gay men who equate gay liberation with a freewheeling sexual lifestyle, the AIDS era notwithstanding, than to those interested in other aspects of gay identity/issues (see, for example, O'Toole, 1998, and particularly Bergman, 1999, who blurs the line between gay pulp fiction of the sixties and pornography of the same period).
At present, gay men freely acquire the bulk of their pornography in video, magazine, or Internet format, but this was not so when legal penalties were more severe. However quaintly historical the value of these publications may now seem to younger gays weaned on the videos of Jeff Stryker and Rick Donovan, one must understand their function as pornography when they were published, particularly when access to raunchier pornography was extremely limited. At any rate, while such pornographic titles suffice as evidence of a genealogy of queerness, they usually do not represent holdings of public libraries except for historical or scholarly purposes, and they are discounted as gay literature in the present study.

What constitutes gay literature at any given time naturally reflects societal attitudes, as even a cursory glance at the ornamented euphemism of nineteenth-century examples will attest (White, 1999, particularly 289-309, in which Simeon Solomon sublimates expression of same-sex desire beneath the prolix ooze of an idealized, mystical Christianity). The question, therefore, of exactly what constitutes gay literature in any given period leads the researcher through a serpentine quest of definition that defies easy resolution, even among the coterie of specialists professionally preoccupied with such issues, particularly when those concerns are conflated with those of the gay movement (an adequate though muted example of the spectrum of ideological divergence among a sample of writers can be gleaned from State of the Struggle, 1999).

The library profession has made strides in improving access to gay information through the formation of the first professional gay organization in the world. For nearly thirty years, GLBTRT has published bibliographies of gay-positive literature, worked to end discrimination based on sexual orientation in the American Library Association, and sponsored a Gay Book Award to encourage and promote the development of quality gay fiction and nonfiction. Task Force programs over the years have explored various aspects of library practice that impede access to gay materials in libraries: failure to acquire gay materials, censorship by sequestration, the use of pejorative subject headings, and caving in to parental or religious pressure to ban such materials from collections (Gittings, 1998). In some of these areas, great progress seems to have been made within and without the library community, although there are many smaller communities and even municipalities where homosexuality is a hotly contested issue, especially where expenditure of public funds is concerned. Moreover, the rise of the Religious Right, particularly during the Reagan presidency (1981-89), had a profound counterbalancing effect on the support for gay issues.

Communication Theory and Database Searching

Reference work in libraries during the past fifty years has been greatly influenced by communications theory, and particularly the Shannon-Weaver model of communication, which posits the necessity of feedback and the distraction of noise in any communications transaction, from asking a question of a reference librarian to typing in a request at the computer (Eichman, 1978).
While the applicability of feedback to the reference situation is easily envisioned (the necessity for clarification of vague information requests), noise is more insidious, and may range from cultural differences, prejudice, and personal antipathies on the interpersonal level to system incompatibilities, downtime, and too many choices in the electronic environment. Richardson (1995, 138) defines noise as "interference, distortion, or errors" in the reference process. Where librarians may send out noise in the form of discouraging messages to gay users, in the form of implied disapproval or indifference, the computer registers a different kind of noise in the form of false hits, duplicate entries, and the inappropriate categorization of information. For example, the term "homosexuality" brings up homophobic works as well as gay-positive ones unless the searcher possesses relatively sophisticated search skills and can refine an electronic search accordingly. Obviously, in an age when access to information, and particularly electronic information, is equivalent to power, these problems assume greater proportions for those whose information needs are sensitive, and subject to social proscription and prejudice. Even the serendipitous delight of stack browsing may be abbreviated for the gay searcher when remote storage becomes an option for overcrowded areas of the stacks, and controversial items are routinely vandalized and stolen.

THE STUDY

Pertinent Subject Headings, Chronological Periods, and the OCLC WorldCat Database

What information do people actually find when they search for information on homosexuality? It is not the purpose of this study to propose a definition for gay literature, only to examine what is found when one explores prescribed Library of Congress subject headings utilized by libraries for information about homosexuality and gay men (LCSH, 1998, Vol. II, 2236-38, 2596-97). The present study consists of an examination of 18,757 WorldCat records retrieved (known as "hits") using the subject headings "Homosexuality," "Gay Men," and "Gays" for the time periods: (1) up to the Stonewall Rebellion (1969), a commonly recognized marker for the beginning of the current struggle for gay rights; (2) the period 1970-1981, from the Stonewall Rebellion to the beginning of the AIDS crisis, which, for the purposes of this study, begins in 1982 with the publication of the (N.Y.) Men's Health Crisis Newsletter; and (3) 1985 and 1995, to compare quantitative and qualitative features of types of works represented by these cataloging records.
The investigator possesses rudimentary working knowledge of ―gay‖ (i.e., written by, for, or about gay males) literature and culture in the United States. Some literature retrieved under the pertinent subject headings addresses gays and lesbians collectively, and is considered a part of this study. The WorldCat database theoretically provides a relatively comprehensive record of gay literature, for while many specialized bibliographies have been published over the years (Dynes, 1990, is one of the most comprehensive, although it is perhaps most useful in the area of scholarly journal literature), few even pretend to cover the entire field; most address only one subject field like Theology. On the other hand, the OCLC database provides a rich, internationally based collection of records. Since records represent input from member libraries, the database provides a fair representation of what all types of libraries actually hold–and what, therefore, an inquirer might be able to retrieve under the rubric of ―gay information‖ in libraries. Type Material Classifications Downloaded hits were printed out for analysis by year. Analysis proceeded in the following order: homosexuality, Gay Men, and Gays. Duplicate titles, excluding revised editions, both within a single subject heading and between subject headings, as well as AV materials (recordings, films, and videotapes), erroneous retrievals (―false drops‖) and pornographic works were subtracted to yield the total number of print tiles. Resulting titles were then classified according to a schemata suggested by the type of the materials themselves, to wit:  ARCHIVES: Personal and organizational archives and special collections, excluding collective series entries for pornographic fiction (counted separately).  NONFICTION: Fifty pages or over, regardless of subject matter or form (e.g., history, memoir, directory, manual, study, etc.).  FICTION: Novels, short stories, drama, poetry, anthologies and juvenile fiction.  POLEMIC: Religious/spiritual aspects of homosexuality, whether pro-gay or anti-gay.  THESES: Dissertations and theses; also includes papers submitted for completion of bachelor’s-, master’s-, honor’s- level courses or degrees, and several course papers not defined in this way.  REPORTS, LEGAL PROCEEDINGS, ETC.: Official governmental, legal, or professional proceedings or reports.  JOURNAL REPRINTS: Articles or books reprinted or Xerox copied for the purpose of greater availability. Also counted are monographs consisting of reprints of single journal issues on a single theme (e.g., The Haworth Press, Inc.), and analytic entries for single journal articles.  MISCELLANEA: Works of any kind 50 pages or less in length.  SERIALS: Newsletters, journals, magazines, newspapers, bulletins, and some directories or travel guides. • EPHEMERA: Items created for a temporary purpose (ex: resource packets, posters, some exhibition guides, programs, etc.).  PERIPHERA: Works which concern topics of which homosexuality is only a part (e.g., gender roles, masculinities, the nature of sexuality, the sociology of deviance, and social issues facing the church, variously defined).  PORNOGRAPHY: Either fiction, pseudo-humor (e.g., Tom of Finland’s Kake series) or serial-pictorial (including early predecessors to hard-core pictorial porn of the 1980s such as beefcake ma ga zine s of the 1960s). Ma y a lso include some ea rly pseudo-medical texts of the 1960s profusely and explicitly illustrated with photographs. 
This category excludes mainstream sex manuals (e.g., The Joy of Gay Sex, The Gay Kama-Sutra), which are counted with nonfiction.

Problems of Analysis

Early in the analysis, certain problems emerged which demanded resolution. For pornographic titles, the old Library of Congress qualifier "pornographic works" has been replaced by "erotic works," not only for newer fiction, but also for older titles cataloged retrospectively. Although, beginning in 1957, the courts went to great pains to distinguish between pornography and erotica, such shadings of meaning have been obliterated by what some feel is the less pejorative tone of "erotica" (see Note 1). The use of the classification "Pornography" by the investigator does not indicate a value judgment, but rather explicitly identifies works intended to sexually excite the reader, or in the case of pictorial works, the viewer. Such distinctions are harder to make in more recent works, such as coffee table photographic works or deluxe anthologies of the 1950s pictorial output of the Athletic Model Guild, which may have lost some of their sexual appeal and acquired a veneer of camp in light of the more explicit works of later years. The investigator has labeled works pornographic only if, as the courts phrased it, artistic, literary or social value was secondary to prurience.

The second major problem area is serials, the cataloging of which has undergone major changes in the past decade. Libraries no longer always provide the date of first publication of a serial, and catalogers may use an open-ended date ("19uu") to accommodate holdings. This feature makes chronological placement difficult for older serials that may not have been cataloged online until quite recently. Added to the tendency of journal titles, and particularly organizational newsletters and bulletins, to change titles frequently, a tremendous number of duplicate entries result. Although the focus of the present study was not on serial publications, that genre of gay publications having received masterful treatment (Streitmatter, 1995), an effort was exerted to count each publication only once, and to count as duplicates entries representing title changes.

Finally, in the area of gay spirituality, and to a lesser degree psychology, it was increasingly impossible to separate works affirmative of gay sexuality (such as the publications of the Metropolitan Community Church) from doctrinal debates of organized churches that either defended, offered qualified support, patronized, or attacked homosexuality. Particularly after the Anita Bryant anti-gay crusade of 1977 and the rise to national prominence of Reverend Jerry Falwell, more specifically anti-gay works from the Religious Right are cataloged under "Homosexuality." In this study, all works dealing with religious aspects of homosexuality are included under "Polemic," as well as a few general psychological or social studies intended as attacks upon the normalization of homosexuality.
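The counting procedure just described reduces raw hits to unique print titles and then buckets them by type. As a concrete illustration, here is a minimal Python sketch of that reduction; the record fields, exclusion flags, and classification rules are simplified placeholders of my own, not OCLC's actual FirstSearch export format or the study's full twelve-class schemata.

```python
from collections import Counter

def normalize(record):
    """Collapse variant printings and pseudonymous entries to one key.
    The study deduplicated within and across subject headings; a
    title + author key is a crude stand-in for that judgment."""
    return (record["title"].strip().lower(),
            record.get("author", "").strip().lower())

def classify(record):
    """Assign one type-of-material class (abridged from the schemata above)."""
    if record.get("form") == "serial":
        return "SERIALS"          # counted once, regardless of title changes
    if record.get("thesis"):
        return "THESES"
    if record.get("fiction"):
        return "FICTION"
    if record.get("pages", 0) <= 50:
        return "MISCELLANEA"      # works of any kind of 50 pages or less
    return "NONFICTION"

def tally(hits):
    """Subtract duplicates, AV items, false drops, and pornography,
    then classify what remains, mirroring the procedure in the text."""
    counts, seen = Counter(), set()
    for rec in hits:
        if rec.get("av") or rec.get("false_drop") or rec.get("porn"):
            counts["excluded"] += 1
            continue
        key = normalize(rec)
        if key in seen:
            counts["duplicates"] += 1
            continue
        seen.add(key)
        counts[classify(rec)] += 1
    return counts

# Three hypothetical hits; the second duplicates the first.
sample = [
    {"title": "A Gay Novel", "author": "Doe, J.", "fiction": True, "pages": 200},
    {"title": " a gay novel", "author": "DOE, J.", "fiction": True, "pages": 200},
    {"title": "Conference Program", "pages": 12},
]
print(tally(sample))  # expected: FICTION 1, duplicates 1, MISCELLANEA 1
```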
Yet when duplicates, non-print, and pornographic titles are subtracted from the totals, the results are not nearly so impressive: While the ratio of growth remains about the same (1 to 3.3 average for 1970-81, 1 to 4.37 for 1995), the raw numbers shrink to only 1067 items in the pre-1969 era, 3361 items from 1970 to 1981, and 779 titles (as opposed to 1250 uncorrected hits) for 1995, the high-water mark to date of gay publishing (see Table 2). Duplicates Duplication rate among WorldCat entries varies from a high of over 37 per cent to a low of just over 15 per cent for individual subject head- ings by year. Overall, the duplication rate runs from one-fourth to one-fifth of the total number of hits in any given year for the three subject headings. While some of the duplication results from definitions set out in the present study, e.g., the decision to count each serial only once, regardless of variant titles, while others represent the decision to count pornographic fiction series either as a collective entity (series) or individually as separate titles, but not both (in the present study, the collective entry is preferred, since the records indicate that at least one collection of gay pornographic fiction–the International Gay and Lesbian Archives–is used primarily for historical research) (see Appendix I) in many more cases, catalogers have entered duplicate entries for new printings, duplicate entries for author pseudonyms, or apparently created their own records with no regard for existing records in the database. This practice is contrary to OCLC cataloging protocol which predicates that libraries merely add their library symbol to the holdings record on existing Library of Congress cataloging copy. Incases where unique records need to be created, the library receives credit from OCLC against their outstanding bill. Whatever the cause, the implication of so much duplication in terms of browsing is noise, multiple and unnecessary duplication, and wasted time and effort in searching. AV and False Drops Since this study is concerned with gay publishing, AV materials, important as they are from an informational standpoint, are also subtracted from the total of hits. As one might expect, not all libraries are equipped to handle AV materials, and others catalog AV items separately from print materials using in-house classifications/adaptations. At any rate, the number of AV titles constitute just two per cent of records up to 1969, 10 per cent of records from 1970 to 1981, and 11 per cent of records for 1995. False drops, those idiosyncratic errors that occur because words in a given record match those of the subject heading although they may mean something entirely different, also account for a small but significant part of the hits one retrieves. They are a bit more troublesome to resolve since they require knowledge of the literature to understand how they occurred, or indeed, that they are, in fact, errors. These account for 8 percent of hits up to 1969, 4 per cent of hits from 1970 to 1981, and less than 1 per cent of hits in 1995. The majority occur under the awkward subject heading ―Gays,‖ which is seemingly being revived again after a couple of decades of relative somnolence. 
The latest edition of the LCSH, in fact, prefers the ambiguous "Gays" over "Gay people," "Gay persons," or "Homosexuals," although the "Lesbian" and "Gay Men" subject headings are still used, leaving the user to wonder if "Gays" is the new collective term, or if, indeed, homosexual men alone are intended; no scope note clarifies the usage. Worse, "Gays" as a subject heading retrieves every record in which the eighteenth-century heterosexual playwright John Gay's name is used in the possessive form, and even "Gay Men" yields similarly bizarre results: children's and adult books with gay used in its non-homosexual sense (although more numerous among false drops are editions of Mrs. Frank Leslie's play, "The Gay Deceivers," and most ironic is a collection of sheet music by homosexual composer Cole Porter, retrieved not because of his sexual orientation, but because of an analytic entry for the movie The Gay Divorcee. Ditto Ivor Novello).

Qualitative Distinctions: Subject Headings

At least three systems of modifications to gay subject headings in LCSH have been proposed (see, for example, Michel, 1985; Berman, 1982, 110-12; ALA GLBTF, 1999). The Library of Congress has made some modifications to terms over the years, although in practice there is some inconsistency in the way the headings are applied, and little guidance is given in LCSH for their correct use. The three main subject headings and their applications appear to be:

Homosexuality. The general term homosexuality is the catch-all term used for all works concerned with homosexuality, either pro- or anti-, particularly in the nonfiction genres (see Table 3). Fiction comprises only 2.6 per cent of print titles under "Homosexuality" before 1982, whereas nonfiction monographs constitute 23 per cent, and nonfiction of all types, including polemic, legal reports, journal reprints, miscellanea, ephemera, and periphera, makes up nearly 57 per cent of total print titles. The bulk (38 per cent) of the remaining titles consists of serials and theses. Generally, whatever the context, homosexuality provides the most information about the subject from an objective point of view, but it also includes the bulk of religious publications (5.5 per cent) debating the issue, particularly those with a negative slant.

Gay Men. For literature by gay males, one must turn to the subject heading "Gay Men" (see Table 4). Here, there are nearly twice as many novels as monographs. Only in recent years has the Library of Congress routinely added subject analytics to fiction, so there are some anomalies in the records. For example, the first edition of Gore Vidal's The City and the Pillar (1948) is not retrieved in the pre-1969 records, since the original edition was not identified by content, and only the revised editions of the novel receive subject coverage. (The assignment of subject headings to fiction represents a recent revival of turn-of-the-century cataloging practice.) Thirty-two novels turn up under "Homosexuality" in the pre-1969 records, and 126 under "Gay Men." The unwary user may mistakenly assume that the sparse number of novels under "Homosexuality" after 1969 (21 in all) represents an actual dearth of gay fiction, whereas in reality they represent a shift to "Gay Men," the more current term for items dealing exclusively with gay males. (Here one finds an additional 89 novels published from 1970 to 1981.)
Also, the fact that the pre-1969 records exist at all attests to the growth in retrospective cataloging and the growth of gay studies as a respectable academic specialty. Ironically, archival collections, although some might be more properly identified as vertical files of newspaper clippings and other ephemera, are fairly evenly split between "Homosexuality" and "Gay Men," about 50 records each, while "Gays" contains over 80 records for the same period.

Gays. There seems to be no rationale for the new use of the term "Gays" in cataloging applications (see Table 5). Although reason would dictate a collective term for gay men and lesbians, many titles in the nonfiction and serials category apply primarily to male homosexuals, and many lesbians prefer separate nomenclature. One of the ironic benefits of the emergence of the Religious Right was the rapprochement between gay men and lesbians, particularly where their political and legal interests coincided, which means that the separatist tendencies of the 1970-1981 period have been partially although by no means completely allayed (Streitmatter, 1995, 211-242). The AIDS crisis has created more common bonds, particularly as both gay men and lesbians have an interest in challenging the stereotype of the disease as gay. It remains to be seen whether new material will be consistently cataloged under "Gays."

DISCUSSION

The first impression received when the analysis of the records began, that there was much missing information, dissipated only after all of the records for all subject headings had been analyzed. There is indeed an abundance of information by and about gay men in library holdings, and it has grown tremendously, particularly in the past decade: a growth of 589 records per year between the 1970-81 period and 1995, an increase of over 400 per cent, all errors and duplications notwithstanding. More significantly, the number of creative fictional, poetical, and dramatic works appearing under the selected subject headings has increased annually from 1.75 to 18 for "Homosexuality," from 12 to 24 for "Gay Men," and from 0.83 to 12 for "Gays"; in other words, from fewer than two titles a year average to 54 titles annually. This growth is impressive, and reflects not only self-consciously progressive (if intermittent) tapping of the gay market by mainstream publishers, but also the growth of the gay press and of gay series published by academic and university presses.

The growth in fictional or creative works is paralleled by the growth of gay nonfiction, particularly in the areas of history and sociology. Until 1973, when the American Psychiatric Association declassified homosexuality as a mental illness, discussion of homosexuality was dominated by psychiatrists and sexologists. This is no longer true, for in 1995 there were as many historical monographs and biographies as there were total nonfiction titles in 1981. In 1995, 241 nonfiction gay monographs appear in the WorldCat database, whereas in the twelve years from 1970-1981, the average number of print titles represented is only 31 per annum. The growth represents a rate of nearly 775 per cent. On the other hand, the figures for nonfiction may be under-representative considering the great amount of miscellany retrieved, a category arbitrarily created for the purposes of this study. Miscellany of all types, fiction and nonfiction, accounts for almost 17 per cent of 1995 publications, and some researchers might consider these to be as important as longer works that bear a prestigious imprint.
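As a quick arithmetic check of the "nearly 775 per cent" figure above, reading it as the 1995 count expressed as a percentage of the earlier annual average:

$$\frac{241 \ \text{monographs in 1995}}{31 \ \text{monographs per year, 1970--81}} \approx 7.77,$$

that is, roughly 775-780 per cent of the 1970-81 annual rate.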
Certainly pamphlets such as the "Gay Flames" series of political tracts published in San Francisco in the heyday of gay liberation, poetry chapbooks, or AIDS brochures distributed often at no charge at libraries, schools, and health centers, and even speeches and position papers delivered at conferences and later distributed by the author, may have an impact not reflected in later re-publication by an established press. Whitt (1993) also points to the importance of the "grapevine" in disseminating information among oppressed minorities, not all of whose members are literate, or readers. The tendency, particularly since the early years of gay liberation, for some academic libraries to collect everything of importance in whatever form means that several significant items cataloged as books but actually representing archival material are included under miscellany as well; for example, a bound two-page statement by gay Civil Rights leader Bayard Rustin.

Similarly, the importance of the gay press and scholarly journal literature is under-weighted in the present study. The Journal of Homosexuality, for example, which has grown from a journal with a medical/psychiatric emphasis to a comprehensive interdisciplinary journal built around social and professional issue themes since its inception in 1974, cannot be underestimated. A single reprint edition of a journal issue by Haworth Press probably has an impact for specialists in gay studies that most mainstream publications fail to provide (Joyce and Schrader, 1999).

One undeniable trend is the growth of academic papers in gay studies, as reflected in the number of theses and dissertations from the seventies to 1995, from about 34 to 141 per year. Some of these, of course, are written for theology or divinity degrees, and not all are gay-positive. At the same time, the growth of scholarship in any field translates into the number of creative fiction and nonfiction works eventually available for wider consumption by the public.

The considerable number of polemic works (4 per cent of 1970-81 titles in homosexuality, 10 per cent of 1995 titles), some of which are counted in with miscellany due to their length, indicates a great deal of soul-searching and self-examination by religious denominations on the subject of homosexuality and, more recently, same-sex marriages, as well as frankly negative diatribes by religionists who take exception to homosexuality as well as to Darwinism and the ordination of women. The gay community has countered this trend with a prodigious amount of affirmative literature, from daily meditation missals to the more substantive position statements by organized groups like the Metropolitan Community Church, Beth Chaim Synagogue, and Dignity (an organization now officially banned from holding its meetings in the Catholic Church).
The example, although perhaps extreme, illustrates the problem of accurate naming in a climate of political correctness when common sense might dictate simplicity even if less fine distinctions might make some users squeamish. The American Library Association is committed, under the rubric of its ―Statement on Intellectual Freedom,‖ to providing all points of view on a controversial topic such as homosexuality. One can only hope that the young user of today gets past the negative information on homosexuality when searching WorldCat to find the great deal of positive information that now exists. One of the greatest dilemmas of the library profession is resolution of the dilemma of how works can be clearly identified by subject in such a way as to not offend the tenets of intellectual freedom, one interpretation of which would consider a subdivision of ―Homosexuality‖ such as ―homophobic works‖ a form of labeling that would constitute censorship. CONCLUSION While the present report does not attempt a detailed analysis of gay literature, it presents a cautionary view of the progress made in the creation and dissemination of gay literature throughout libraries. Future studies should examine the holdings patterns for various works to see how extensively these works are distributed, since a recent study by the American Library Association seems to indicate sizable gaps in the collections of major urban U.S. libraries (Bryant, 1995). They should also examine the growth of particular genres, the ratio of mainstream to gay press publications, and of course, patterns of growth in literature listed under lesbian subject headings. More importantly, the library profession should take note of the condition of the WorldCat database and the OCLC records on which it is based. Given the 1990s gospel of technological literacy, it hardly behooves librarians to emphasize the intricacies of refinements in Boolean search techniques when the database created by input from their own cataloging departments is increasingly flawed. This paper only suggests the extent to which errors, oversights, and ignorance of publication history and practice have worked their way into the records of what is supposedly an authoritative database. These faults are occurring at a time when cataloging theory has been demoted from a required course to an elective in some library programs (e.g., University of Pittsburgh), and it is accelerated by increasing reliance on search engines and the belief that the medium (electronic format) is more important than the message–the message being the accuracy of the information that the cataloging record conveys. While such errors are doubtlessly to be found in other subject areas of WorldCat, one would be hard pressed to think of another subject area in which the practical consequences of noise are more damaging. Whether their quests for identity ever take gay people into a library for information, the library remains the one place where such information is theoretically dispensed as a democratic right–relatively free of direct costs, discounting taxes and tuition fees. The WorldCat database indicates that gay information is available in some libraries in quantities and varieties unimaginable just thirty years ago. 
How easily such information can be accessed, whether all libraries acquire representative collections of such information, and whether the Internet environment provides increased access to such information are questions with profound implications for the future of gay people not only in metropolitan centers, but in the heartland as well.

NOTE

1. Roth v. United States: The book in question was D. H. Lawrence's Lady Chatterley's Lover, and it was the first time a defense of "redeeming social importance" was allowed.

REFERENCES

American Library Association. Gay, Lesbian, and Bisexual Task Force. (1999). Clearinghouse inventory. Chicago: American Library Association.
Bawer, B. (Ed.) (1996). Beyond queer: Challenging gay left orthodoxy. New York: The Free Press.
Bergman, D. (1999). The cultural work of sixties gay pulp fiction. In P. J. Smith (Ed.), The queer sixties (26-42). New York: Routledge.
Berman, S. (1982). The joys of cataloging: Essays, letters and other explosions. Phoenix, AZ: Oryx Press.
Bronski, M. (1999, May 3). After the "boom." Publishers Weekly, 246, 38-42.
Browning, F. (1996). A queer geography: Journeys towards the sexual self. New York: Noonday Press.
Bryant, E. (1995). Pride & prejudice. Library Journal, 120(11), 37-39.
Carmichael, J. V., Jr. (1998). Homosexuality and United States libraries: Land of the free, but not home to the gay. Proceedings of the 64th International Federation of Library Associations Conference 1998, Booklet 7, 136-145.
Cunningham, M. (1999). The hours. New York: Knopf.
Duberman, M. (1996). Queer representations: Reading lives, reading cultures. New York: New York University Press.
Duberman, M. (1996). A queer world: The Center for Gay and Lesbian Studies reader. New York: New York University Press.
Dynes, W. (1985). Homosexuality: A research guide. New York: Garland.
Eichman, T. L. (1978, Spring). The complex nature of opening reference questions. RQ, 17, 212-222.
Fry, S. (1997, June 16). Playing Oscar. The New Yorker, 73, 82-86.
Gittings, B. (1978). Combatting the lies in the libraries. In L. Crew (Ed.), The gay academic (107-18). Palm Springs, CA: ETC Publications.
Gittings, B. (1998). Gays in library land: The Gay and Lesbian Task Force of the American Library Association: The first sixteen years. In J. V. Carmichael, Jr. (Ed.), Daring to find our names: The search for lesbigay library history (81-94). Westport, CT: Greenwood Press.
Joyce, S. & Schrader, A. M. (1999). Twenty years of the Journal of Homosexuality: A bibliometric examination of the first 24 volumes, 1974-1993. Journal of Homosexuality, 37(1), 3-24.
Library of Congress subject headings. (1998). 21st ed., 5 vols. Washington, DC: GPO.
Mann, W. J. (1995, Spring). The gay publishing boom. The Harvard Gay & Lesbian Review, 2, 24.
Michel, D. (1985). Gay studies thesaurus: A controlled vocabulary for indexing and accessing materials of relevance to gay culture, history, politics, and psychology. Madison, WI: The Author.
Mohr, R. D. (1992). Gay ideas: Outing and other controversies. Boston: Beacon Press.
Ortíz, R. L. (1999). L. A. women: Jim Morrison with John Rechy. In P. J. Smith (Ed.), The queer sixties (264-86). New York: Routledge.
O'Toole, L. (1998). Pornucopia: Porn, sex, technology, and desire. London: Serpent's Tail.
Richardson, J. V., Jr. (1995). Knowledge-based systems for general reference work: Applications, problems, and progress. New York: Academic Press.
Shillinglaw, A. (1999). "Give us a kiss": Queer codes, male partnering, and the Beatles. In P. J. Smith (Ed.), The queer sixties (127-44). New York: Routledge.
Sinfield, A. (1994). Cultural politics: Queer reading. Philadelphia: University of Pennsylvania.
State of the struggle: Martin Duberman, Michelangelo Signorile, Urvashi Vaid: A roundtable with Betsy Billiard. (1999). The James White Review, 6(3), 17-21.
Streitmatter, R. (1995). Unspeakable: The rise of the gay and lesbian press in America. Boston: Faber and Faber.
Summer, B. (1993, August 9). Gay and lesbian publishing: The paradox of success. Publishers Weekly, 240, 36-40.
Vidal, G. (1999). Sexually speaking: Essays. New York: Knopf.
Warren, P. N. (1999, Fall). Has the gay publishing boom gone bust? The Harvard Gay & Lesbian Review, 6, 7.
West, C., & Katz, E. et al. (1972). Revolting librarians. San Francisco: Booklegger Press.
White, C. (Ed.) (1999). Nineteenth-century homosexuality: A sourcebook. New York: Routledge.
Whitt, A. J. (1993). The information needs of lesbian and bisexual women. Library and Information Science Research, 15(3), 275-88.
Woods, G. (1998). A history of gay literature: The male tradition. New Haven: Yale.

APPENDIX 1

Pornographic Fiction Series Included Under Collective Entries in WorldCat Records

101 Enterprises. 1967-69. 30 vols.
Adonis Classics. 1976-82. 43 vols.
Barclay House. 1969-70. 6 vols.
Big Boy/Backdoor. 1970-76. 7 vols.
Black Knight Classics. 1969-70. 22 vols.
Blueboy Library. 1976-78. 52 vols.
Brandon House. 1968-1970. 11 vols.
Companion Books. 1968. 9 vols.
Eros. 1973-78. 7 vols.
Finland Books. 1980-82. 21 vols.
French Line. 1967-69. 12 vols.
Frenchy's Gay Line. 1969-71. 7 vols.
Gay Way. 1969-72. 11 vols.
Gay Parisian Press. 1970-71. 5 vols.
Golden Boy Books. 1978-79. 13 vols.
Grand Prix Classics. 1970-75. 5 vols.
Greenleaf Classics. 1969-70. 29 vols.
Guild Press. 1966-67. 7 vols.
Hardboy. 1974-77. 18 vols.
His Collection I. 1971-75. 70 vols.
Impact Library. 1967-68. 12 vols.
Male Color Illustrated. 1970-79. 6 vols.
Manhard. 1973-79. 53 vols.
Midwood. 1969-76. 16 vols.
Monkey Publication. 1969-70. 7 vols.
Numbers Paperback Library. 1978. 5 vols.
Olympia Press. 1970-72. 9 vols.
Original Adult Books. 1968. 10 vols.
Parisian Press. 1971-72. 10 vols.
Pleasure Reader. 1969-73. 68 vols.
Proctor File Illustrated. 1970-79. 10 vols.
Ram. 1974-75. 14 vols.
Roadhouse Classics. n.d. 12 vols.
Rough Trade. 1975-82. 63 vols.
Sean Johnson. n.d. 5 vols.
Spade. 1968-74. 23 vols.
Spartan Collection. 1972-74. 5 vols.
Stud Series. 1976-79. 20 vols.
Surrey Stud. 1975-82. 42 vols.
Timely Books. 1971. 5 vols.
Travelers Companion. 1968-72. 16 vols.
Trojan Classic. 1968-73. 21 vols.
Twilight Classics. 1968-69. 7 vols.
Wildboys. 1975-76. 16 vols.

work_mgoihr7tujdslah35765fcro3u ----
work_mgp4qkyrkvaydd6xbnftz3xequ ----

Providence College. From the SelectedWorks of Mark J Caprio. 2014.

Student Publishing: Future Scholars as Change Agents
Mark J Caprio, Providence College

Available at: https://works.bepress.com/mark_caprio/5/
critical thinking, complex problem-solving and communication skills) and growing interest in undergraduate publication as a natural completion of the UR experience. Part II then describes Providence College's (PC's) alignment with national educational reform, support for undergraduate publication and the cross-departmental synergy now taking place through the newly established Center for Engaged Learning. Acknowledging academic libraries' expanding role in support of the entire UR and publication process, the article focuses primarily on academic libraries' publishing role and specifically on the publishing support provided by the Digital Publishing Services (DPS) Department at the Phillips Memorial Library/Commons.

Part I: UR, graduate attributes and undergraduate publication

Integration of research and education
When the Boyer Commission on Educating Undergraduates in Research Universities was formed in 1995, education scholars had already called for a re-examination of then-current educational traditions and practices (Katkin, 2003; Merkel, 2001). The Boyer Commission Report issued in 1998 provided an educational reform direction change, recommending greater focus on undergraduate inquiry-based learning and UR experiences. The report encouraged development of an institutional framework that would support inclusion of undergraduates in faculty research as early as possible. Through faculty–student research engagement, faculty as research mentors would "provide an appropriate balance of challenge and support" (Hodge et al., 2008, p. 11). Before the Boyer Commission Report, a limited number of faculty–UR programs existed at select research universities (Katkin, 2003; Merkel, 2001). The Massachusetts Institute of Technology (MIT) established the country's first institution-wide program, called the "Undergraduate Research Opportunities Program" (UROP). UROP championed the value of faculty-guided undergraduate entrance into the scholarly community. The California Institute of Technology followed a decade later with its "Summer Undergraduate Research Fellowships" (SURF) program (Merkel, 2001). Stanford and the University of Delaware followed shortly thereafter (Bauer and Bennett, 2003). In support of these institutional initiatives, the National Science Foundation (NSF) established the "Recognition Award for the Integration of Research and Education" (RAIRE). In its press release, NSF envisioned universities with "a pervasive culture promoting collaborative research between professors and students" (National Science Foundation, 1997).
Unlike many of the earlier studies and reports, which had a disciplinary focus (Katkin, 2003), the Boyer Commission Report set in motion a re-examination of the role of research universities across the curriculum in preparing:
[…] graduates who are well on the way to being mature scholars, articulate and adept in the techniques and methods of their chosen fields, ready for the challenges of professional life or advanced graduate study (Boyer Commission on Educating Undergraduates in the Research University, S.S. Kenny (chair), 1998, p. 38).
The growing body of literature and anecdotal evidence punctuated by the Boyer Commission Report laid the foundation for a cross-institutional pedagogical transformation, one from didactic-based to engagement-based teaching.

Benefits of UR
In the early days of developing UR programs, the term "research" predominantly reflected laboratory activities in the Natural Sciences. Highly qualified, highly motivated undergraduates were selected by faculty to participate in an authentic facet or aspect of the faculty member's own scholarship or research project. Mentored students would contribute original independent scholarship to faculty projects (Merkel, 2001; Case, 2007). Although they began in the Natural Sciences, UR programs now include support for UR experiences in the Social Sciences and Humanities, as well as discovery through creative inquiry and expression in the Arts. The Council on Undergraduate Research (CUR) currently defines "undergraduate research" as an "inquiry or investigation conducted by an undergraduate student that makes an original intellectual or creative contribution to the discipline" (Council on Undergraduate Research, CUR, 2013a). This definition is inclusive of all UR experiences across the academy. Much of the early research supporting the benefits of faculty–student/academic mentor–undergraduate apprentice relationships is based on out-of-class undergraduate summer and academic-year research experiences attached to faculty research projects (Hunter et al., 2007). Assessment of these UR programs has been largely ethnographic, with faculty and students providing feedback about their experiences (Bauer and Bennett, 2003; Hakim, 1998; Hunter et al., 2007; Manduca, 1997; Wenzel, 2000). Increasingly, the literature identifies benefits from faculty–student engagement and research collaboration. Identified benefits include strengthened inquiry/research skills (Bauer and Bennett, 2003; Kardash, 2000; Lopatto, 2004a, 2004b, 2006; National Survey of Student Engagement, NSSE, 2007); professional socialization or enculturation into a "community of practice" (Hunter et al., 2007; Lave and Wenger, 1991; Wenger, 1998); greater disciplinary knowledge (Ishiyama, 2002); refinement of communication and interpersonal skills (Bauer and Bennett, 2003; Lopatto, 2004a, 2004b, 2006; Seymour et al., 2004; Dunn, 1996; Schapman, 1998); promotion of creativity and critical thinking abilities (Addison, 1996; Ellis, 2006; Hubbard and Ritchie, 1995; Laursen et al., 2012); greater enthusiasm for scholarly pursuits (Khersonskaya, 1998); greater confidence and familiarity with the research process (Alexander et al., 1998; Wolverton, 1998); and benefit to epistemological development (Buckley, 2008; Ryder et al., 1999).
According to the National Survey of Student Engagement (National Survey of Student Engagement, NSSE, 2003), a significant number of undergraduate respondents indicated participation in UR experiences. These included independent studies, capstone courses or membership of a faculty research team. Bauer and Bennett (2003) surveyed alumni as one method to measure "the value added by the UR experience to baccalaureate education" (p. 214), comparing alumni who participated in a UR experience with those who did not. UR participants reported significantly greater perceived growth in their abilities to solve problems and communicate effectively. These were perceived as important educational gains for alumni pursuing graduate education or entering the professional workforce. The Association of American Colleges and Universities (AAC&U), in a survey of business and nonprofit leaders, found that an overwhelming percentage of respondents recommend a 21st century liberal education for preparing the future workforce for long-term professional success in today's global economy. These leaders highly value employee critical thinking, complex problem-solving and written and oral communication abilities and skills (AAC&U, 2013). Kuh (2008), in "High-impact educational practices: what they are, who has access to them, and why they matter", identified UR as a high-impact activity widely tested and shown to be beneficial for college students. As of the writing of this article (December 2013), the CUR Web site indicates that 900 two-year and four-year colleges and universities are members (Council on Undergraduate Research, CUR, 2013a). This high rate of institutional participation points to recognition of the value and positive impact of UR experiences and to an important pedagogical paradigm shift toward greater faculty–student engagement in higher education.

21st century graduate attributes
We continue to transition from an information and communication ecosystem with more clearly defined boundaries and more clearly prescribed interconnections to one that is data-abundant, dynamic, fluid, layered and continuously reconfigurable. In this evolving ecosystem, critical thinking, complex problem-solving and written and oral communication skills and abilities become precious, of the highest value and imperative as a set of demonstrable graduate attributes. These attributes parallel those developed through engagement with the scholarly research process (Ware and Burns, 2008). They are, in fact, attributes of the scholar. In From inquiry to discovery: developing the student as scholar in a networked world, Hodge et al. (2008) present the "Student as Scholar Model" as an essential educational paradigm shift to meet 21st century challenges. The authors state their:
[…] aim [as] not simply to advance UR and creativity, but more importantly, to cultivate the "Student as Scholar," where scholar is broadly conceived as an attitude, an intellectual posture, and a frame of mind derived from the best traditions of an engaged liberal education (Hodge et al., 2008, p. 2).
Through faculty–student mentorships, the transition from student to scholar is predicated on the development of self as authority (self-authorship). Students progress in stages from a state of "absolute knowing" (knowledge is certain and authority external) to "contextual knowing" (knowledge is shaped by context and is debatable within that context).
Supporting this student transformation should be a central goal of higher education in the 21st century (Baxter Magolda, 1998, 1999, 2001). The ability to pose questions in response to challenges; the skills to efficiently gather and effectively analyze and evaluate evidence; epistemological sophistication; intellectual integrity and responsibility; and the ability and skills to communicate meaningfully are those of the scholar. These skills, abilities and behaviors match those required by future employers and better position graduates to confidently engage within an evolving and disruptive information/communication ecosystem.

Completing the research process: undergraduate publication
Increasing numbers of institutions are recognizing "students as producers" (Case, 2007; Neary and Winn, 2009) and expecting some form of communication (written or oral) to accompany and to complete UR activities. Case (2007), Program Director for Davidson College's Research Initiatives, in a white paper summarizing faculty/student discussions about UR experiences at Davidson, writes:
Davidson faculty and students should share the expectation that their original work will result in a "product" (e.g. a research article, painting and performance) that will make a contribution to their field of study (Case, 2007, p. 5).
The National Conference on Undergraduate Research (NCUR), established in 1987 and dedicated to promoting and celebrating the academic accomplishments of young scholars "across all fields of study", publishes annual proceedings. These proceedings are a collection of papers "representative of the research presented at the annual conference" (National Conference of Undergraduate Research, NCUR Proceedings, 2013). The CUR lists and links to more than 100 disciplinary and cross-disciplinary undergraduate journals hosted by colleges and universities worldwide (Council on Undergraduate Research, CUR, 2013b), recognizing student achievement in research and the creative arts. Since 2012, Indonesia's education ministry has required that all undergraduates, graduates and post-graduates publish as a fulfillment of their degree (Rochmyaningsih, 2012). There is clear movement toward recognition of the importance of written and oral communication skills as critical graduate attributes and increased emphasis on undergraduate publication as a key activity for developing these skills. The literature on the benefits of undergraduates' publishing is scarcer than that on the benefits of UR experiences. This is likely because new scholarly online publication platforms and communication channels are not much more than a decade old. However, there is literature that points to the benefits of students' publishing.
Some benefits include: refinement of communication through feedback and peer review (Hunter et al., 2007; Rifkin et al., 2010; Walkington, 2008, 2012); recognition and curriculum vitae enhancement, with employers and graduate programs recognizing publication as an indicator of skills, abilities and commitment (Brownlow, 1997; Keith-Spiegel and Tabachnick, 1994; Walkington, 2012); confidence and motivation to continue publishing (Walkington, 2012); research literacy and media literacy (Feather et al., 2011; Rifkin et al., 2010; Walkington, 2012); and breadth of scholarly skills, research process completion, and public engagement within a disciplinary or general community (Boyer Commission on Educating Undergraduates in the Research University, S.S. Kenny (chair), 1998; Rifkin et al., 2010; Tatalovic, 2008; Walkington, 2008; Walkington and Jenkins, 2008; Ware and Burns, 2008). In practice, as in all preceding research stages, faculty mentors should model communication (publication) behaviors. Students will have limited or no previous publication experience with the various communication (publication) styles, priority channels or politics. Form, venue and audience "need to be carefully related to the merit and quality of the work" (Walkington et al., 2013, p. 25). Departmental or institutional publication scaffolding can assist here (e.g. posters, blogs, wikis, multimedia-rich objects, informal and formal papers, UR journals and faculty–student authored articles). Providing publication options for students will help them develop and refine communication (publication) skills for different purposes and different audiences (Rifkin et al., 2010; Walkington, 2008). Unless they fail to meet institutional degree requirements, all undergraduates can be mentored and fully engaged in the entirety of the research process, including dissemination of their research through various publication formats and channels (Walkington et al., 2013). There is by no means agreement across the academy regarding the benefits of undergraduates' publishing, especially publishing in undergraduate-only research journals, or about the educational and scholarly value versus resource expense proposition associated with supporting undergraduate publication (Corbyn and Rooney, 2008; Gilbert, 2004; Siegel, 2004). Some argue that the focus of the UR experience should be on teaching only and that, if undergraduates publish at all, they would do better to publish in "real" scholarly journals covered by academic indexing and abstracting services (Gilbert, 2004). The questions raised by academics are of value, although they may to some degree be predicated on traditional educational and scholarly communication paradigms. These paradigms are informed by educational models and print-based processes of evaluation and dissemination, now buckling under the pressures of educational reform and the abundance of information, information formats, information dissemination channels and emerging human/system and author/content value metrics (Jensen, 2007). Articulation of process and discovery increases understanding and deepens comprehension of conceptually complex subject matter (Rifkin et al., 2010). Taking responsibility for ideas (self-authorship) and entering public discourse through various formats and channels will assist undergraduates in meeting career goals.
They will become competitive graduate program applicants (Brownlow, 1997; Keith-Spiegel and Tabachnick, 1994) and satisfactorily demonstrate communication skills and abilities to future employers (AAC&U, 2013; Bauer and Bennett, 2003). Institutional responsibility here lies with providing infrastructure and scaffolding for both UR and publication.

Library as publisher
Partly in response to commercial scholarly journal monopolies and skyrocketing costs, notable scholars and open-access evangelists like Stevan Harnad began, as early as 1999, to encourage a re-examination of scholarly communication practices. Harnad (1999, 2000) encouraged scholars to leverage digital technologies and networks and to share their new scholarship immediately and openly through local or disciplinary e-print servers. Scholars in disciplines with pre-print traditions were the first to leverage new technologies and the Internet, creating disciplinary repositories (e.g. arXiv, RePEc and CogPrints). During this time, open-source (e.g. DSpace) and commercial cloud-based (e.g. bepress Digital Commons) institutional repository (IR) systems were being developed. In 2002, Raym Crow published his Scholarly Publishing and Academic Resources Coalition (SPARC) position paper, "The case for institutional repositories". Crow encouraged the adoption of IRs as a means of both capturing and archiving institutional intellectual output and as a proactive response to publishers' monopolistic practices (Crow, 2002). Since Crow's publication, academic libraries have implemented a variety of IR and online journal publishing systems (e.g. DSpace, EPrints, bepress, ETD-db, Fedora and Open Journal Systems) and have developed library publishing services (Mullins et al., 2012) to support institutionally hosted publications. There has been increasing faculty and institutional interest in providing undergraduates with publication platforms and services. As noted earlier, many institutions and faculty are recognizing the value of undergraduate public articulation and are collaborating with libraries to support undergraduate publications (Davis-Kahl et al., 2011; Jones and Canuel, 2013; Miller, 2013). Though library support for undergraduate publication is the focus of this article, it is worth noting that University of South Florida library faculty member Stamatoplos (2009) has called mentored UR "an emergent pedagogy in higher education" (p. 235). Stamatoplos encourages academic libraries to actively and formally engage with UR programs. There is an opportunity here for librarians to fully and meaningfully collaborate across the academy at the convergence of UR, information literacy and scholarly communication (Gilman, 2013; Davis-Kahl, 2012; Davis-Kahl and Hensley, 2013; Stamatoplos, 2009).

Part II
Within the larger context provided by Part I, Part II describes PC's alignment with national pedagogical shifts in higher education and explores emerging campus synergy in support of UR, engaged learning and publishing. As mentioned in Part I, although many academic libraries provide support across the UR and publication process, with respect to the library's role Part II focuses on undergraduate publishing support through the Phillips Memorial Library's (PML's) DPS Department.

A brief history
Founded in 1917, PC is a Catholic, predominantly liberal arts college located in Providence, Rhode Island.
PC has a student population of approximately 3,800 undergraduates and 700 postgraduate students. Since 2007, PC has been actively encouraging a campus culture supportive of faculty–student research collaboration and undergraduate publication. The College created a standing committee charged with promoting UR; developed library-based digital publishing platforms and services; hired a College grants officer; implemented funding programs for UR and travel; and most recently created a Center for Engaged Learning. Joining 900 other colleges and universities, PC became a member of CUR in 2007. Demonstrating its institutional commitment to supporting undergraduate scholarship and creative expression consistent with its strategic plans (Strategic Plan, 2006-2010; 2011-2015), PC has since worked toward establishing a campus culture that supports faculty–student collaboration and student independent studies supervised by faculty mentors. Since joining CUR, College representatives have attended various CUR workshops, events and institutes, namely, the CUR/NSF Northeast Regional Workshop on Institutionalizing Undergraduate Research (2008); CUR Dialogues (2009); and, most recently, the CUR Creative Inquiry in the Arts and Humanities Institute (2013). Since its formation in 2009, the Providence College Undergraduate Research Committee (PC-URC) has been funding UR projects at the College. PC-URC has been proactive in its outreach to all departments and has periodically invited faculty from other institutions to visit PC to share their student collaboration experiences (i.e. faculty including undergraduates in their own research projects). In 2009, PC received a $250,000 Davis Educational Foundation award. Davis funds were used to support "student engagement in learning" strategic planning goals (Strategic Plan, 2006-2010). Additionally, as a way of showcasing and celebrating faculty–student collaboration and student independent research and creativity, the Office of Academic Affairs began sponsoring an annual spring "Celebration of Student Scholarship and Creativity" in 2010. Each year, in preparation for the event, an e-mail is sent to faculty inviting them "to nominate high-quality projects that are or have been under their supervision or direction" (Call for Nominations, 2013). Since 2010, the annual celebration has grown in the number of projects and in campus-wide attendance. Most recently, based on a strategic planning goal (Strategic Plan, 2011-2015), the Center for Engaged Learning was established at PC. The Center's mission is "to promote, enhance, and expand the College's efforts to engage students deeply in their learning". The Center's Director is now responsible for oversight and coordination of the annual celebration, assisted by an event steering committee that includes campus faculty, students and staff. DPS was established at the PML in 2007. DPS pursues and investigates new collaborations and publishing models, prioritizing support for local faculty and student scholarship and creative expression (Caprio and Landry, 2013).
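A practical detail behind these platform choices is worth making concrete. IR systems such as DSpace, EPrints and bepress Digital Commons expose their metadata through the standard OAI-PMH protocol, which is how aggregators and discovery services typically pick up repository content and one reason deposit improves the visibility of student work. The short Python sketch below is offered only as an illustration, not as a description of DPS practice; the endpoint URL and set name are hypothetical placeholders.

# Minimal OAI-PMH harvesting sketch (illustrative; endpoint and set name are hypothetical).
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

ENDPOINT = "https://repository.example.edu/do/oai/"  # hypothetical repository endpoint
SET_SPEC = "undergraduate_theses"                    # hypothetical set name

def list_titles(endpoint, set_spec):
    """Yield (identifier, title) pairs, following OAI-PMH resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": set_spec}
    while True:
        response = requests.get(endpoint, params=params, timeout=30)
        root = ET.fromstring(response.content)
        for record in root.iter(OAI + "record"):
            identifier = record.findtext(".//" + OAI + "identifier", default="?")
            title = record.findtext(".//" + DC + "title", default="(untitled)")
            yield identifier, title
        token = root.findtext(".//" + OAI + "resumptionToken")
        if not token:  # an absent or empty token means the list is complete
            break
        params = {"verb": "ListRecords", "resumptionToken": token}

if __name__ == "__main__":
    for identifier, title in list_titles(ENDPOINT, SET_SPEC):
        print(identifier, "-", title)

Because the protocol is shared across platforms, a harvest like this works against any conformant repository, which is part of what makes IR deposit a durable dissemination channel for student scholarship.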
DPS has developed partnerships with academic departments and individual faculty representing all areas of study at the College (Sciences, Social Sciences, Humanities and Arts) and is currently publishing representative undergraduate capstone and honors theses, honors colloquia papers, a department-supported student-created journal and digital surrogates of senior art show works (studio theses) through its Digital Commons and Content Pro IRs. Unlike faculty, who have well-established pathways to publication, students have had little opportunity to publish their scholarship through an institutional host (e.g. PC) or other scholarly outlet. Most capstone theses are sequestered in an institution's physical archive with little chance of being read by anyone beyond the project supervisor or supervising committee (local library circulation statistics demonstrate this to be an accurate assessment). Since implementing its Digital Commons and Content Pro IRs in 2007 and 2010, PC's Phillips Memorial Library/Commons has published a steadily growing body of student scholarship and creative expression. With increased visibility and through Digital Commons' Search Engine Optimization (SEO), student publications are receiving high numbers of unique views and downloads: the top four theses from one of the oldest community series in the repository have received 19,898; 18,926; 18,618; and 16,012 downloads, respectively (usage statistics as of 19/01/2014). PC students are currently submitting mainly traditional forms (digital, but text-based), with few complex (multimedia) digital objects queued for review or publication. That may be due more to faculty preferences than to students' (Rifkin et al., 2010). With growing campus support (equipment, software and staff), the creation of the new MediaHub (a facility in the Phillips Memorial Library/Commons supporting the creation and use of new media in education, scholarship and creative expression) and greater outreach, students may begin to explore articulation through new media formats and new communication channels.

Toward synergy
The recent establishment of the Center for Engaged Learning as a College strategic initiative has created a focus for campus conversations about student engagement in learning at PC. In September 2013, the Center's Director invited the Head of DPS, the Chair of the PC-URC and faculty from Foreign Languages Studies to submit a joint application for attendance at a CUR Institute, "Creative Inquiry in the Arts and Humanities". The application was accepted, and the PC team (five members) attended the institute from November 8 to November 10, 2013 at Sacramento State University, Sacramento, CA. The institute deliverable was a team-created campus-wide action plan representing short-, medium- and longer-term objectives for expanding UR, scholarship, creativity and its public sharing. The three-day schedule included plenary sessions followed by institutional team breakout sessions. The plenaries presented topics of importance for advancing UR, scholarship and creativity on college and university campuses (e.g. research on student participation outcomes; models and budgets; internal and external funding sources; and tenure/promotion challenges). Team breakout sessions, informed by the preceding plenary presentations, provided discussions focused on each team's idiosyncratic campus culture.
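Before turning to the action plan, a brief aside on the usage figures quoted earlier in this section: repository platforms can generally export per-item download counts as tabular usage reports, and a few lines of code turn such an export into a ranked summary suitable for annual reporting. The following Python sketch is illustrative only; the CSV file name and its title and downloads column headings are assumptions, since export formats and column names vary by platform.

# Summarize a per-item usage export (illustrative; the CSV layout is assumed).
import csv
from collections import Counter

def top_downloads(path, n=4):
    """Return the n most-downloaded titles from a CSV with 'title' and 'downloads' columns."""
    totals = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            totals[row["title"]] += int(row["downloads"])
    return totals.most_common(n)

if __name__ == "__main__":
    for title, count in top_downloads("usage_export.csv"):  # hypothetical export file
        print(f"{count:>7,}  {title}")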
Recognizing the value of undergraduate publishing activities, the action plan created by the PC team includes short- and medium-term objectives for greater outreach to faculty and students about Library/Commons publishing platforms (Digital Commons and Content Pro) and services, with a longer-term objective to create an online, open-access, peer-reviewed undergraduate scholarly journal publishing the results of College-sponsored UR experiences. Increased cross-campus collaboration will help establish a model for greater integration of teaching, learning, research and publishing, and will increase opportunities for Library/Commons faculty and staff to introduce emerging digital methodologies, tools and scholarly communication issues to the campus community (Davis-Kahl et al., 2011; Davis-Kahl, 2012; Davis-Kahl and Hensley, 2013; Gilman, 2013; Jones and Canuel, 2013; Miller, 2013; Stamatoplos, 2009). The DPS Department at the Phillips Memorial Library/Commons continues to evolve, informed by advances in technology, changes in scholarly communication and changes in campus needs. DPS remains true to its original mission to investigate new collaborations and publishing models for its community by staying attuned to global patterns while respecting local idiosyncrasies (Caprio and Landry, 2013). Recognizing undergraduates as an emerging student–scholar author group, DPS has taken a future-oriented view. The department is formally engaged in greater campus-wide outreach and is allocating greater departmental resources to supporting a wider range of undergraduate publication activities and publication types (e.g. multimedia and text encoding) through the Phillips Memorial Library/Commons infrastructure.

Conclusions
Digital technologies, the Internet and the World Wide Web have fundamentally and disruptively changed the way we read, write and engage within community. In the past couple of decades, there has been a shift from didactic to engagement-based learning; an increase in the number and type of UR experiences; and increasing support for undergraduate communication (publication). Business leaders are looking to higher education to provide graduates who can think critically, solve complex problems and communicate effectively through traditional and new media. These are outcomes of a liberal education, attributes of the scholar. In Pedagogy of the Oppressed, Freire (1993) explores the narrative characteristics of the traditional teacher–student relationship. In this relationship, the teacher is the narrating "Subject" and the student is the acted-upon object. In this scenario, knowledge is available only through the teacher. There is little agency on the part of the student (the object). Freire's problem-posing method dislodges the teacher as "Subject", replaced by the problem or question itself as the "Subject". Teacher and student are together engaged by the problem or question (the "Subject"). In this dialogic space, the teacher can share knowledge and experience, disciplinary methods, tools and vocabularies; the student, fully and directly engaged with the problem or question, can share knowledge, experience and expertise. Teaching and learning become dynamic, fluid and interchangeable (teacher can become learner; learner can become teacher). It is through a dialogic learning and writing space (Walkington, 2012), team-mentored by all campus stakeholders (e.g.
faculty, librarians and administrators), that students will experience an authentic and complete research process, informed and capable of agency. Public articulation of ideas is an integral part of the research process for undergraduates. Without communication there is no community-developed knowledge base. It is through public expression, giving form to insights and discoveries, that their meaning is reflected back (heard by the author) and heard within the community. When shared, ideas are re-formed and inspire new ideas, new questions or new knowledge. Transformed into scholars, encouraged and supported to write publicly through traditional and new media, and engaged within disciplinary or general communities, graduates will be equipped to meet 21st century challenges and to participate fully in the transformation of their communities.

References
Addison, W.E. (1996), "Student research proposals in the experimental psychology course", Teaching of Psychology, Vol. 23 No. 4, pp. 237-238.
Alexander, B.B., Foertsch, J. and Daffinrud, S. (1998), "The spend a summer with a scientist program: an evaluation of program outcomes and the essential elements for success", University of Wisconsin-Madison LEAD Center, available at: http://wceruw.org/publications/LEADcenter/sas.pdf (accessed 12 December 2013).
Bauer, K.W. and Bennett, J.S. (2003), "Alumni perceptions used to assess undergraduate research experience", The Journal of Higher Education, Vol. 74 No. 2, pp. 210-230.
Baxter Magolda, M.B. (1998), "Developing self-authorship in young adult life", Journal of College Student Development, Vol. 39 No. 2, pp. 143-156.
Baxter Magolda, M.B. (1999), Creating Contexts for Learning and Self-Authorship: Constructive Developmental Pedagogy, Vanderbilt University Press, Nashville, TN.
Baxter Magolda, M.B. (2001), Making Their Own Way: Narratives for Transforming Higher Education to Promote Self-Development, Stylus, Sterling, VA.
Boyer Commission on Educating Undergraduates in the Research University, S.S. Kenny (chair) (1998), "Reinventing undergraduate education: a blueprint for America's research universities", State University of New York – Stony Brook, available at: http://eric.ed.gov/?id=ED424840 (accessed 5 November 2013).
Brownlow, S. (1997), "Going the extra mile: the rewards of publishing your undergraduate research", Psi Chi Journal of Undergraduate Research, Vol. 2 No. 3, pp. 83-85.
Buckley, J.A. (2008), "The disciplinary effects of undergraduate research experiences with faculty on selected student self-reported gains", paper presented at the Annual Meeting of the Association for the Study of Higher Education, Jacksonville, FL, 6-8 November, available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.169.6194&rep=rep1&type=pdf (accessed 30 December 2013).
Caprio, M.J. and Landry, C.M. (2013), "Publishing Inti: a suite of services", in Brown, A. (Ed.), Library Publishing Toolkit, IDS Project Press, Geneseo, NY, pp. 161-170.
Case, V. (2007), "The future of undergraduate research, creative scholarship, and in-depth studies at Davidson College", available at: www.academia.edu/4166612/The_Future_of_Undergraduate_Research_Creative_Scholarship_and_In-depth_Studies_at (accessed 5 December 2013).
Corbyn, Z. and Rooney, M. (2008), "Let students enjoy the power of print", Times Higher Education, available at: www.timeshighereducation.co.uk/403119.article (accessed 11 December 2013).
Council on Undergraduate Research (CUR) (2013a), "About CUR", available at: www.cur.org/about_cur/ (accessed 13 December 2013).
Council on Undergraduate Research (CUR) (2013b), "Undergraduate journals and publications", available at: www.cur.org/resources/students/undergraduate_journals/ (accessed 13 December 2013).
Crow, R. (2002), "The case for institutional repositories: a SPARC position paper", discussion paper, Scholarly Publishing and Academic Resources Coalition, Washington, DC, available at: http://works.bepress.com/ir_research/7/ (accessed 8 February 2014).
Davis-Kahl, S. (2012), "Engaging undergraduates in scholarly communication outreach, education, and advocacy", College and Research Libraries News, Vol. 73 No. 4, pp. 212-222.
Davis-Kahl, S. and Hensley, M.K. (2013), "Common ground at the nexus of information literacy and scholarly communication", Association of College and Research Libraries, available at: www.ideals.illinois.edu/bitstream/handle/2142/42666/CommonGround_OA.pdf?sequence=2#page=6 (accessed 31 December 2013).
Davis-Kahl, S., Hensley, M. and Shreeves, S. (2011), "Completing the research cycle: the role of libraries in the publication and dissemination of undergraduate student research", Association of College and Research Libraries, Fifteenth National Conference, Philadelphia, PA, March, available at: http://works.bepress.com/stephanie_davis_kahl/18 (accessed 31 December 2013).
Dunn, D.S. (1996), "Collaborative writing in a statistics and research methods course", Teaching of Psychology, Vol. 23 No. 1, pp. 38-40.
Ellis, A.B. (2006), "Creating a culture for innovation", The Chronicle Review, available at: http://chronicle.com/weekly/v52/i32/32b02001.htm (accessed 14 January 2014).
Feather, D., Anchor, J.R. and Cowton, C.J. (2011), "The value of the undergraduate dissertation: perceptions of supervisors", paper presentations of the 2010 University of Huddersfield Annual Learning and Teaching Conference, University of Huddersfield, Huddersfield, pp. 41-56, available at: http://eprints.hud.ac.uk/9655/ (accessed 29 November 2013).
Freire, P. (1993), Pedagogy of the Oppressed, Continuum, New York, NY.
Gilbert, S. (2004), "Should students be encouraged to publish their research in student-run publications? A case against undergraduate-only journal publications", Cell Biology Education, Vol. 3 No. 1, pp. 22-23.
Gilman, I.
(2013), "Scholarly communication for credit: integrating publishing education into undergraduate curriculum", in Davis-Kahl, S. and Hensley, M.K. (Eds), Common Ground at the Nexus of Information Literacy and Scholarly Communication, Association of College and Research Libraries, Chicago, IL, available at: http://commons.pacificu.edu/cgi/viewcontent.cgi?article=1021&context=libfac (accessed 31 December 2013).
Hakim, T. (1998), "Soft assessment of undergraduate research: reactions and student perspectives", Council on Undergraduate Research Quarterly, Vol. 18 No. 4, pp. 189-192.
Harnad, S. (1999), "Free at last: the future of peer-reviewed journals", D-Lib Magazine, Vol. 5 No. 12, available at: www.dlib.org/dlib/december99/12harnad.html (accessed 8 February 2014).
Harnad, S. (2000), "E-knowledge: freeing the refereed journal corpus online", Computer Law and Security Report, Vol. 16 No. 2, pp. 78-87, available at: www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.scinejm.htm (accessed 8 February 2014).
Hodge, D., Haynes, C., LePore, P., Pasquesi, K. and Hirsh, M. (2008), "From inquiry to discovery: developing the student as scholar in a networked world", keynote address at the Learning Through Enquiry Alliance Inquiry in a Networked World Conference, University of Sheffield, 25-27 June, available at: www.miami.muohio.edu/documents/about-miami/president/reports-speeches/From_Inquiry_to_Discovery.pdf (accessed 21 November 2013).
Hubbard, R.W. and Ritchie, K.L. (1995), "The human subjects review procedure: an exercise in critical thinking for undergraduate experimental psychology students", Teaching of Psychology, Vol. 22 No. 1, pp. 64-65.
Hunter, A.B., Laursen, S.L. and Seymour, E. (2007), "Becoming a scientist: the role of undergraduate research in students' cognitive, personal, and professional development", Science Education, Vol. 91 No. 1, pp. 36-74.
Ishiyama, J. (2002), "Does early participation in undergraduate research benefit social science and humanities students?", College Student Journal, Vol. 36 No. 3, pp. 380-387.
Jensen, M. (2007), "Authority 3.0: friend or foe to scholars?", Journal of Scholarly Publishing, Vol. 39 No. 1, pp. 297-307.
Jones, J.
and Canuel, R. (2013), "Supporting the dissemination of undergraduate research: an emerging role for academic librarians", in Mueller, D.M. (Ed.), Imagine, Innovate, Inspire: Proceedings of the Association of College and Research Libraries Conference, Association of College and Research Libraries, Chicago, IL, pp. 538-545, available at: www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2013/papers/JonesCanuel_Supporting.pdf (accessed 9 December 2013).
Kardash, C.M. (2000), "Evaluation of an undergraduate research experience: perceptions of undergraduate interns and their faculty mentors", Journal of Educational Psychology, Vol. 92 No. 1, pp. 191-201.
Katkin, W. (2003), "The Boyer Commission report and its impact on undergraduate research", New Directions for Teaching and Learning, Vol. 2003 No. 93, Spring, pp. 19-38.
Keith-Spiegel, P. and Tabachnick, B. (1994), "When demand exceeds supply: second-order criteria used by graduate school selection committees", Teaching of Psychology, Vol. 21 No. 2, pp. 79-81.
Khersonskaya, M.Y. (1998), "Impressions and advice about making an undergraduate research presentation", Journal of Psychological Inquiry, Vol. 3, pp. 50-51, available at: www.fhsu.edu/psych/jpi/ (accessed 11 December 2013).
Kuh, G.D. (2008), "High-impact educational practices: what they are, who has access to them, and why they matter", AAC&U, Washington, DC, available at: www.neasc.org/downloads/aacu_high_impact_2008_final.pdf (accessed 30 December 2013).
Laursen, S., Seymour, E. and Hunter, A.B. (2012), "Learning, teaching and scholarship: fundamental tensions of undergraduate research", Change: The Magazine of Higher Learning, Vol. 44 No. 2, pp. 30-37.
Lave, J. and Wenger, E. (1991), Situated Learning: Legitimate Peripheral Participation, Cambridge University Press, Cambridge.
Lopatto, D. (2004a), "Survey of undergraduate research experiences (SURE): first findings", Cell Biology Education, Vol. 3 No. 4, pp. 270-277.
Lopatto, D. (2004b), "What undergraduate research can tell us about research on learning", available at: http://web.grinnell.edu/science/ROLE/Presentation_2004_CUR_annual_meeting_WI.pdf (accessed 3 January 2014).
Lopatto, D. (2006), "Undergraduate research as a catalyst for liberal learning", Peer Review, Vol. 8 No. 1, pp. 22-25.
Manduca, C. (1997), "Broadly defined goals for undergraduate research projects: a basis for program evaluation", Council on Undergraduate Research Quarterly, Vol. 18 No. 2, pp. 64-69.
Merkel, C.A. (2001), "Undergraduate research at six research universities", California Institute of Technology, Pasadena, CA, available at: www.aau.edu/workarea/downloadasset.aspx?id=1900 (accessed 3 January 2014).
Miller, C. (2013), "Riding the wave: open access, digital publishing, and the undergraduate thesis", Pomona Faculty Publications and Research, Paper 377, available at: http://scholarship.claremont.edu/pomona_fac_pub/377 (accessed 13 December 2013).
Mullins, J.L., Murray-Rust, C., Ogburn, J.L., Crow, R., Ivins, O., Mower, A., Nesdill, D., Newton, M.P., Speer, J. and Watkinson, C. (2012), "Library publishing services: strategies for success", final research report, SPARC, Washington, DC, available at: http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1023&context=purduepress_ebooks (accessed 8 February 2014).
National Conference of Undergraduate Research (NCUR) (2013), NCUR Proceedings, available at: www.ncurproceedings.org (accessed 22 December 2013).
National Science Foundation (1997), "Recognition award for the integration of research and education" (RAIRE press release), available at: www.nsf.gov/news/news_summ.jsp?cntn_id=102824 (accessed 25 November 2013).
National Survey of Student Engagement (NSSE) (2003), "Converting data into action: expanding the boundaries of institutional improvement", Center for Postsecondary Research, Bloomington, IN, available at: http://nsse.iub.edu/2003_annual_report/pdf/NSSE_2003_Annual_Report.pdf (accessed 4 January 2014).
National Survey of Student Engagement (NSSE)
(2007), "Experiences that matter: enhancing student learning and success", Center for Postsecondary Research, Bloomington, IN, available at: http://nsse.iub.edu/nsse_2007_annual_report/ (accessed 4 January 2014).
Neary, M. and Winn, J. (2009), "The student as producer: reinventing the student experience in higher education", in Neary, M., Stevenson, H. and Bell, L. (Eds), The Future of Higher Education: Pedagogy, Policy and the Student Experience, Continuum, London, pp. 126-138.
Rifkin, W., Longnecker, N., Leach, J. and Davis, L.S. (2010), "Students publishing in new media: eight hypotheses – a house of cards?", International Journal of Innovation in Science and Mathematics Education (IJISME), Vol. 18 No. 1, pp. 43-54.
Rochmyaningsih, D. (2012), "Indonesia makes research publication a graduation requirement for all students", Asian Scientist, available at: www.asianscientist.com/academia/indonesia-dikti-aptisi-publication-a-graduation-requirement-for-all-students-2012/ (accessed 8 December 2013).
Ryder, J., Leach, J. and Driver, R. (1999), "Undergraduate science students' images of science", Journal of Research in Science Teaching, Vol. 36 No. 2, pp. 201-219.
Schapman, A.M. (1998), "Tips for presenting a poster", Journal of Psychological Inquiry, Vol. 3, p. 53.
Seymour, E., Hunter, A.B., Laursen, S.L. and Deantoni, T. (2004), "Establishing the benefits of research experiences for undergraduates in the sciences: first findings from a three-year study", Science Education, Vol. 88 No. 4, pp. 493-534.
Siegel, V. (2004), "Should students be encouraged to publish their research in student-run publications? Weighing the pros and cons of undergraduate-only journal publications", Cell Biology Education, Vol. 3 No. 1, pp. 26-27.
Stamatoplos, A. (2009), "The role of academic libraries in mentored undergraduate research: a model of engagement in the academic community", College and Research Libraries, Vol. 70 No. 3, pp. 235-249.
Tatalovic, M. (2008), "Student science publishing: an exploratory study of undergraduate science research journals and popular science magazines in the US and Europe", JCOM, Vol. 7 No. 3, A03, available at: http://jcom.sissa.it/archive/07/03 (accessed 18 November 2013).
Walkington, H. (2008), "Geoverse: piloting a national e-journal of undergraduate research in Geography", Planet, Vol. 20, pp. 41-46.
Walkington, H. (2012), "Developing dialogic learning space: the case of online undergraduate research journals", Journal of Geography in Higher Education, Vol. 36 No. 4, pp.
547-562.
Walkington, H. and Jenkins, A. (2008), "Embedding undergraduate research publication in the student learning experience", Brookes eJournal of Learning and Teaching, No. 20, available at: http://bejlt.brookes.ac.uk/paper/embedding_undergraduate_research_publication_in_the_student_learning_experi-2/ (accessed 23 November 2013).
Walkington, H., Edwards-Jones, A. and Grestly, K. (2013), "Strategies for widening students' engagement with undergraduate research journals", Council on Undergraduate Research Quarterly (From the International Desk), Vol. 34 No. 1, pp. 24-30.
Ware, M.E. and Burns, S.R. (2008), "Undergraduate student research journals: opportunities for and benefits from publication", in Miller, R.L., Rycek, R.F., Balcetis, E., Barney, S.T., Beins, B.C., Burns, S.R. and Ware, M.E. (Eds), Developing, Promoting and Sustaining the Undergraduate Research Experience in Psychology, Society for the Teaching of Psychology, Washington, DC, pp. 253-256, available at: http://scoes.ccsu.edu/uploaded/departments/AcademicDepartments/psychology/Mealy/UndergradResearchJournalsArticle.pdf (accessed 2 December 2013).
Wenger, E. (1998), Communities of Practice: Learning, Meaning and Identity, Cambridge University Press, Cambridge.
Wenzel, T.J. (2000), "Undergraduate research as a capstone learning experience", Analytical Chemistry, Vol. 72 No. 15, pp. 547A-549A.
Wolverton, A.S. (1998), "Establishing an ideal program of research", Journal of Psychological Inquiry, Vol. 3, pp. 49-50.

Further reading
AAC&U (2007), "College learning for the new global century", available at: www.aacu.org/leap/documents/GlobalCentury_final.pdf (accessed 5 December 2013).

Corresponding author: Mark J Caprio can be contacted at: mcaprio1@providence.edu
work_mh3ikikrhbfubivfofhgskjd4u ---- Canadian Journal of Neurological Sciences / Journal Canadien des Sciences Neurologiques
Volume 47, Number 5, September 2020 (pages 435-587)
For further information about this journal please go to the journal website at: cambridge.org/cjn
The official journal of the Canadian Neurological Sciences Federation: Canadian Neurological Society; Canadian Neurosurgical Society; Canadian Society of Clinical Neurophysiologists; Canadian Association of Child Neurology; Canadian Society of Neuroradiology
COMMENTARIES
589 Practical Guidance for Outpatient Spasticity Management During the Coronavirus (COVID-19) Pandemic: Canadian Spasticity COVID-19 Task Force. Rajiv Reebye, Heather Finlayson, Curtis May, Lalith Satkunam, Theodore Wein, Thomas Miller, Chris Boulias, Colleen O'Connell, Anibal Bohorquez, Sean Dukelow, Karen Ethans, Farooq Ismail, Waill Khalil, Omar Khan, Philippe Lagnau, Stephen McNeil, Patricia Mills, Geneviève Sirois, Paul Winston
594 Tackling the Burden of Neurological Diseases in Canada with Virtual Care During the COVID-19 Pandemic and Beyond. Ramana Appireddy, Shirin Jalini, Garima Shukla, Lysa Boissé Lomax

ORIGINAL ARTICLES
598 The Virtual Neurologic Exam: Instructional Videos and Guidance for the COVID-19 Era. Mariam Al Hussona, Monica Maher, David Chan, Jonathan A. Micieli, Jennifer D. Jain, Houman Khosravani, Aaron Izenberg, Charles D. Kassardjian, Sara B. Mitchell
604 Early Dabigatran Treatment After Transient Ischemic Attack and Minor Ischemic Stroke Does Not Result in Hemorrhagic Transformation. Anas Alrohimi, Kelvin Ng, Dar Dowlatshahi, Brian Buck, Grant Stotts, Sibi Thirunavukkarasu, Michel Shamy, Hayrapet Kalashyan, Leka Sivakumar, Ashfaq Shuaib, Mike Sharma, Ken Butcher
612 Endovascular Thrombectomy for Low ASPECTS Large Vessel Occlusion Ischemic Stroke: A Systematic Review and Meta-Analysis. Jose Danilo B. Diestro, Adam A. Dmytriw, Gabriel Broocks, Karen Chen, Joshua A. Hirsch, Andre Kemmling, Kevin Phan, Aditya Bharatha
620 Detecting Subtle Cognitive Impairment in Multiple Sclerosis with the Montreal Cognitive Assessment. Kim Charest, Alexandra Tremblay, Roxane Langlois, Élaine Roger, Pierre Duquette, Isabelle Rouleau
627 Effect of Sleep Disorder on Delirium in Post-Cardiac Surgery Patients. Hongbai Wang, Liang Zhang, Qipeng Luo, Yinan Li, Fuxia Yan
634 Efficacy and Acceptance of a Lombard-response Device for Hypophonia in Parkinson's Disease. Scott Adams, Niraj Kumar, Philippe Rizek, Angeline Hong, Jenny Zhang, Anita Senthinathan, Cynthia Mancinelli, Thea Knowles, Mandar Jog
642 Disparities in Deep Brain Stimulation Use for Parkinson's Disease in Ontario, Canada. James A.G. Crispo, Melody Lam, Britney Le, Lucie Richard, Salimah Z. Shariff, Dominique R. Ansell, Melanie Squarzolo, Connie Marras, Allison W. Willis, Dallas Seitz
656 Diagnostic Yield of MRI for Sensorineural Hearing Loss – An Audit. Helen Wong, Yaw Amoako-Tuffour, Khunsa Faiz, Jai Jai Shiva Shankar
661 Influence of Optic Nerve Appearance on Visual Outcome in Pediatric Idiopathic Intracranial Hypertension. Jonathan A. Micieli, Beau B. Bruce, Caroline Vasseneix, Richard J. Blanch, Damian E. Berezovsky, Nancy J. Newman, Valérie Biousse, Jason H. Peragallo
666 Association between Graduate Degrees and Publication Productivity in Academic Neurosurgery. Michael B. Keough, Christopher Newell, Alan R. Rheaume, Tejas Sankar
675 A Diverse Specialty: What Students Teach Us About Neurology and "Neurophobia". Fraser G. A. Moore

NEUROIMAGING HIGHLIGHTS
681 Multiphase CT Angiography for Evaluation and Diagnosis of Complex Spinal Dural Arteriovenous Fistula. Sudharsan Srinivasan, Zachary M. Wilseck, Joseph R. Linzey, Neeraj Chaudhary, Ashok Srinivasan, Aditya S. Pandey
683 Heavy Eye Syndrome Mimicking Abducens Nerve Palsies (Caberry W. Yu, Jonathan A. Micieli)
685 Vertical One-and-a-Half Syndrome Due to Metastatic Spindle Cell Carcinoma of the Lung (Elie Côté, Jonathan A. Micieli)
687 Prolonged Hyperperfusion in a Child With ATP1A2 Defect-Related Hemiplegic Migraine (Katherine Cobb-Pitstick, Dana D. Cummings, Giulio Zuccoli)
689 Marchiafava–Bignami Disease: Two Chronologically Distinct Stages in the Same Patient (Miguel Quintas-Neves, José Manuel Amorim, João Paulo Soares-Fernandes)
691 A Glioma Presenting as a Posterior Circulation Stroke (Fangshi Lu, Amy Fowler, Keith Tam, Carlos R. Camara-Lemarroy)

BRIEF COMMUNICATIONS
693 COVID-19: Stroke Admissions, Emergency Department Visits, and Prevention Clinic Referrals (Maria Bres Bullrich, Sebastian Fridman, Jennifer L. Mandzia, Lauren M. Mai, Alexander Khaw, Juan Camilo Vargas Gonzalez, Rodrigo Bagur, Luciano A. Sposato)
697 Ischemic Monomelic Neuropathy: The Case for Reintroducing a Little-Known Term (Paul Winston, Dannika Bakker)
700 Perception of Healthcare Access and Utility of Telehealth Among Parkinson's Disease Patients (Dakota Peacock, Peter Baumeister, Alex Monaghan, Jodi Siever, Joshua Yoneda, Daryl Wile)

LETTERS TO THE EDITOR
705 Effects of Rapid Eye Movement Sleep in Anti-NMDAR Encephalitis With Extreme Delta Brush Pattern (Dhruv Jain, Marcus C. Ng)
709 AMPA-R Limbic Encephalitis Associated with Systemic Lupus Erythematosus (Zoya Zaeem, Collin C. Luk, Dustin Anderson, Gregg Blevins, Zaeem A. Siddiqi)
711 Crossed Zoster Syndrome: A Rare Clinical Presentation Following Herpes Zoster Ophthalmicus (Andrea M. Kuczynski, Carla J. Wallace, Ryan Wada, Kenneth L. Tyler, Ronak K. Kapadia)
714 Recurrent Abducens Palsy in Relapsing-Remitting Multiple Sclerosis (Sanskriti Sasikumar, Chantal Roy-Hewitson, Caroline Geenen, Dale Robinson, Felix Tyndel)
716 Trigeminal Autonomic Cephalalgia Secondary to Spontaneous Trigeminal Hemorrhage (Mathieu Levesque, Christian Bocti, François Moreau)
719 Cluster Headache with Temporomandibular Joint Pain (Tommy Lik Hang Chan, David Dongkyung Kim, Werner J. Becker)
721 Location, Location: The Clue to Aetiology in Cerebellar Bleeds (Stephen A. Ryan, Sandra E. Black, Julia Keith, Richard Aviv, Victor X.D. Yang, Mario Masellis, Julia J. Hopyan)
724 Aerococcus Urinae Endocarditis Presenting with Bilateral Cerebellar Infarcts (Kaie Rosborough, Bryce A. Durafourt, Winnie Chan, Peggy DeJong, Ramana Appireddy)
727 Unexpected Progressive Multifocal Leukoencephalopathy in a Hemodialysis Patient (Bryce A. Durafourt, John P. Rossiter, Moogeh Baharnoori)

Cover image: Prolonged Hyperperfusion in a Child With ATP1A2 Defect-Related Hemiplegic Migraine (Katherine Cobb-Pitstick, Dana D. Cummings, Giulio Zuccoli). See pages 687-688.
Editor-in-Chief / Rédacteur en chef: Robert Chen (Toronto, ON). Associate Editors / Rédacteurs associés: Robert Hammond (London, ON), Philippe Huot (Montreal, QC), Mahendranath Moharir (Toronto, ON), Tejas Sankar (Edmonton, AB), Manas Sharma (London, ON), Jeanne Teitelbaum (Montreal, QC), Richard Wennberg (Toronto, ON); with past editors, an editorial board drawn from across Canada and the United States, and journal staff (Dan Morin, Chief Executive Officer; Donna Irvin, CNSF Membership Services / Communications Officer).

The official journal of the five societies of the Canadian Neurological Sciences Federation, whose permanent secretariat is at 143N - 8500 Macleod Trail SE, Calgary, Alberta T2H 2N1, Canada. The Canadian Journal of Neurological Sciences is published bi-monthly and is included in the Cambridge Core service at cambridge.org/cjn.
The journal is indexed by MEDLINE, EMBASE, PubMed, SCOPUS, Web of Science, and other services. ISSN 0317-1671; EISSN 2057-0155. © 2020 The Canadian Journal of Neurological Sciences Inc.

work_mhmeqq3ly5be3gphtsybh63f4q ---- Library Trends

Interconnected and Innovative Libraries: Factors Tying Libraries More Closely Together

Peter Webster

LIBRARY TRENDS, Vol. 54, No. 3, Winter 2006 ("Library Resource Sharing Networks," edited by Peter Webster.), pp. 382-393. © 2006 The Board of Trustees, University of Illinois.

Abstract
This article considers the many developments in technology and practice that are making libraries more connected and interdependent. It looks at new integrated online services and reviews the increasing importance of both formal and informal standards. Global centralized Web services are discussed.
The relationships between information industry companies and libraries are considered. Virtual reference services and far-reaching digitization projects are explored. The article concludes that close cooperation is allowing libraries to take their services to new levels and is key to the continued innovation of those services.

Introduction
Library consortia, organized at the local, state, national, and international levels, are what we most commonly think of when we discuss library resource-sharing networks. Library consortia—for shared catalog services, interlibrary lending, document delivery, and shared electronic licensing—are growing in influence and importance. However, library communities also work together in a variety of ways, both formal and informal, that go beyond, or underpin, consortium activities. What follows is a consideration of the many different ways in which library communities are becoming more closely interconnected.

The inherent capabilities of networked technology have presented libraries with opportunities to take their services to new levels. Libraries have been affected by general trends in computer technology. Libraries also share the enormous challenges of integrating new skills and methods, facing new sources of competition, and adapting to the rapid pace of technological change. The 2003 OCLC Environmental Scan: Pattern Recognition
Important new technologies like virtual reference, Open URL link resolving, feder- ated searching, content management systems, and user direct document delivery services are good candidates for shared and cooperative delivery. There are important economic benefi ts to sharing the costs of computer infrastructure needed for such services and spreading the workload among many libraries. There is also the considerable added benefi t of providing a more common experience to users from groups of libraries. As new services are being added to the offerings of ILS vendors, exist- ing library consortia are sharing a wider range of services. New services are also an incentive for new libraries to join consortia. For services such as user direct document delivery or virtual reference, there are great ben- efi ts to having very large groups of libraries participating. Sharing services among many libraries makes possible a level of service that could not be achievable by any single library. It is not surprising that Marshal Breeding also suggests that he is seeing renewed consolidation taking place in the ILS environment, as larger groups of libraries share centralized resources for a growing array of online services (Breeding, 2004). webster/interconnected & innovative libraries 384 library trends/winter 2006 Standards as a Key to Resource Sharing Development and use of common standards is one of the most impor- tant tasks that libraries perform collectively. Libraries have a long history of standards development previous to the development of the Dewey Decimal and Library of Congress classifi cation systems (Straw, 2003). Through adherence to standards, worldwide networks are created that successfully share resources, with little need for discussion among the par- ticipating agencies. Libraries exchanging materials via interlibrary loan need only follow agreed protocols to do so without the need for additional communication. In the same way, adherence to the Z39.50 search standard allows libraries and their users to routinely share information between their catalogs worldwide, without the need for any direct relationship or contact other than the reliance on a shared search standard. In the online environment, standards are taking on new importance. Networked information services are increasingly based on automated in- teroperability, where transactions between libraries take place with the few- est possible steps, with little human intervention, and at computer transfer speeds. Automated methods are becoming essential to reducing the cost of library services and providing the speed of service that users have come to expect. New data, format, and procedural standards have become neces- sary. Much more closely applied standards are proving essential to making automated interoperability work reliably and effectively. Library classifi cations systems and Machine-Readable Cataloguing (MARC) are major standardization achievements for libraries. The Z39.50 search stan- dard was the fi rst standard that allowed libraries to achieve the automated link- ages that are becoming central to our networked services today. The release of the Z39.50 standard in 1988 was an important step, but equally important for the advancement of library networking was the creation of the Bath profi le in 2000 (Lunau, 2003). Divergent implementations of the standard limited its usefulness. The uniform application of Z39.50 through use of the Bath profi le has been as important as the application of the standard itself. 
This has proven to be the case with the MARC cataloging standard as well. It is an ongoing process to make the application of MARC more uniform (Library of Congress, Network Development and MARC Standards Offi ce, 1998). The National Information Standards Organization (NISO) is becoming a critical resource for library integration. NISO has been instrumental in development of many of the more important standards that are allowing the closer integration of library services. The Z39.50 search standard, the International Standard Serial Number (ISSN) numbering system, and the underlying standards behind MARC are NISO standards. More recently de- veloped standards include the Open URL linking standard and the library Circulation Interchange Protocol (NCIP) (NISO, 2005a). NISO currently has task forces working on new standards for federated searching and cross- searching of multiple databases. 385 NISO is the information standards organization for a more general or- ganization, the American National Standards Institute (ANSI). NISO is also a key player in the technical standards group (T46) for the International Standards Organization (NISO, 2005b). The standards process itself is at every stage a collective activity. The standards organizations work through a broad process of consultation, with representatives from the information industry and from libraries. The fi nal approval of NISO standards is voted upon by the organization’s member- ship. Libraries and other organizations volunteer to act as Maintaining Agencies for each standard. For example, the U.S. Librar y of Congress is the lead agency for Z39.50, and NISO ILL is maintained by the Online Computer Library Center (OCLC). In addition to the organized standards process, interest groups and research communities form around individual existing and emerging standards. These informal groups are often as im- portant as the offi cial process in the implementation and advancement of standards. In addition to the ISO/NISO/ANSI international standards system, many library organizations are active in developing standards. Counting Online Usage of Networked Electronic Resources (COUNTER) is an ex- ample of a single purpose standard-setting organization. COUNTER is an international nonprofi t organization formed in 1992. It represents a large group of stakeholders including libraries and information companies. The group has worked cooperatively to implement standardized usage statistics for online journal databases. COUNTER built on the existing work done in this area, including guidelines developed by the International Coalition of Library Consortia (ICOLC) and the Association of Research Libraries (ARL) (COUNTER, n.d.). The International Federation of Librar y As- sociations and Institutions (IFLA) is particularly active in developing best practices and guidelines. ALA and its divisions are among the many other library organizations that are active in advancing standards and common practices in a wide range of areas. Informal Standards Libraries also share important resources through the use of a wide variety of informal standards. Of course, the process of standardization is not unique to the library industry. The Windows operating system or the Intel PC computer are common examples of informal standards. One example of an informal standard in libraries is the software product EzProxy. 
Useful Utilities Company’s EzProxy is one of the most popular means for libraries to offer their users remote access to the journal databases and other e-content resources that they license. It is considered a standard for this purpose. The software is used by over 1,500 library agencies in more than 35 countries and has recently seen its fi rst users in China (Chris Zagar, personal communication, April 15, 2005). It has become a standard for webster/interconnected & innovative libraries 386 library trends/winter 2006 providing remote access to library e-content. Another example is Infotrieve Inc.’s Ariel software, which has become a standard for online electronic document transmission. Some 6,000 librar y sites around the world are currently included on the Ariel site list (Infotrieve, 2005). Just as with offi cial standards, important communities of interest form around commonly used software, methods, and services. The users of Ariel or EzProxy communicate to solve problems and share information and best practices. In the same way, libraries using any common application or a particular ILS system, document delivery software, metasearch tool, or link resolver form informal but very valuable information- and resource- sharing networks. The use of XML markup language is another case of emerging stan- dardization. Roy Tennant’s XML in Libraries (2002) provides an excellent survey of the many ways XML can be useful in libraries. Major library system vendors, including Ex Libris, Sirsi, and Endeavor, have developed XML in- terchange features in their software to be used as the means of exchanging information with other systems. E-content vendors including Elsevier and Proquest have developed XML-based search interfaces as well. The use of this informally standardized markup language is allowing libraries to share XML methods and programming expertise. It also suggests possibilities for the creation of new formal interchange standards. It is very common for important new developments in information prac- tice to begin as informal standards and then be taken up by standards agen- cies and developed into more formal standards. This was the case with the Open URL linking standard, which was fi rst developed at Ghent University and then used by the SFX linking software (Grogg & Ferguson, 2004). Informal software standards are often transitory. The standard software or method for performing a certain task today is likely to change within a few years. It is also common for several informal standards to compete. One piece of software may be the common standard for one group of libraries in one region, while another competing application is favored by other libraries. Each software vendor of course strives to make its applica- tion the informal standard. This sometimes confusing competitive process has been the driving force behind much of today’s innovative technology. One of the keys to this process of innovation is the widespread exchange of information and expertise by groups and individuals using particular software, services, or standards. Open Source and Libraries Open source software is another example of collaboration at work in libraries. Eric Raymond’s The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary (Raymond, 2001) is a useful introduction to the open source community and its method of shared de- velopment and cooperative maintenance of freely available software. 
The 387 library community, with its inclination toward collaboration, has proven well suited to the shared method of software development. The open source software movement has a strong following in libraries. Thousands of libraries around the world rely on common applications developed through the open source process, such as the Linux operating system, the Apache Web-server software, or My-SQL and PHP Web database tools. These open source applications have become the informal standard in many libraries, as elsewhere. Open source development of library-specifi c software is widespread as well. The Koha ILS system is an excellent example of an open source library project (Koha Open Source Library Systems, n.d.). This application was developed in Australia in 1999 and is now used in over fi fty libraries around the world. The reSearcher suite of library integration software developed by the Council of Prairie and Pacifi c University Libraries (COPPUL) in western Canada is one of the most ambitious and successful open source library projects (COPPUL, n.d.). The PINES network of 249 public librar- ies in the state of Georgia has also recently announced plans to develop a new open source integrated library system (Kenney, 2004). Eric Lease Morgan’s “Possibilities for Open Source Software in Libraries” (Morgan, 2002) provides a useful introduction to the use of open source methods in libraries. The Web site of Open Source Systems for Libraries (OSS4Lib, 2005) is a prominent resource for learning about ongoing open source library activities. The open source movement in general is an important means for libraries to share software resources. Each individual open source project creates its own dynamic resource-sharing network. Centralized Information Services Centralized services such as bibliographic utilities and union catalogs have long been an important focus of library cooperative efforts. As some centralized services like catalog copy utilities have declined in importance, new centralized services are emerging. Increased Internet bandwidth, in- creasing capabilities of Web services software, and the decreasing cost of server technology are making wider sharing of library services possible. A growing capability and willingness to act collectively are also contribut- ing to this development. In a growing number of situations, nationally or internationally centralized library services are developing. Internet search engines, particularly Google at present, have become very important centralized information services. Google’s initiatives to ex- pand the public Internet content have received a great deal of attention. These include the Google Scholar scholarly materials search engine and Google’s partnership with prominent libraries to digitize library collec- tions (Carlson & Young, 2005). Google is partnering with a large number of e-content vendors and indexing projects to make a growing volume of journal information available via public Web search. webster/interconnected & innovative libraries 388 library trends/winter 2006 Google’s digitization projects have generated considerable controversy. Their efforts to expand the accessible content of the Web build on long- standing earlier cooperative efforts, notably Project Gutenberg. The recent announcement of a major digitization effort by national libraries in nine- teen European countries is also noteworthy, particularly for the non-English speaking world (Farrell, 2005). 
Other search engines including Yahoo and MSNet are also active in expanding Web content. Centralized Web services in general are an area of strong business competition (Vogelstein, 2005). New players and new content services will no doubt continue to evolve rap- idly on the World Wide Web. Web search engines will continue to emerge as one of the most important centralized information resources. OCLC has long been a key provider of shared library services. Their Open WorldCat service is a major new development in centralized library services. OCLC has partnered with Yahoo, MSN, and Google in the Open WorldCat project, which will make over 50 million library catalog records from OCLC’s WorldCat union catalog records searchable via Web search engines. OCLC also provides the means to link from a retrieved book refer- ence to the Web searcher’s local library (Mattison, 2005). In addition, both OCLC and Google are developing central ser vices that allow individual libraries to provide links to their journal holdings. Through these services, users will be routed to the appropriate link resolver or library catalog to determine if resource references found on the Web are available in a local library (OCLC, n.d.; ResourceShelf, 2005). Crossref is another important centralized service. Crossref is an industry organization with library membership that provides a central repository of location information to access e-journal materials available from over 1,400 publishers and societies. The service uses Open URL standard digi- tal object identifi ers to maintain up-to-date linking information for over 15 million articles in more than 11,000 journals available electronically (Crossref, 2005). Crossref can offer article- or journal-level Digital Object Identifi ers (DOIs) and has recently begun offering linking to material cited by a retrieved article. Crossref is not intended to be a tool for direct patron searching. Instead it can be used in the background, by library ILS software and e-journal search software, to link from retrieved citations to available full-text content held by many different publishers. The creation of Crossref is an indication that online vendors and publishers see the benefi t of working together rather than offering services independently. RedLightGreen is the Research Libraries Group’s (RLG) award- winning centralized Web accessible union catalog. This user-friendly library portal was developed with funding from the Mellon Foundation as a collabora- tion among RLG, Columbia University, New York University, Swarthmore College, and the University of Minnesota (Proffi tt, 2004). Rather than working primarily through the Web search engines, RedLightGreen of- fers centralized searching of over 45 million titles from the RLG union 389 catalog. Through its easy-to-use portal interface, it provides links to local library holdings as well as citation assistance. Shibboleth authentication is another example of a centralized service that will have a signifi cant impact on libraries. Shibboleth authentication was developed as an Internet 2 project. It provides a method for vendors of e-content and institutions that license full-text content to validate autho- rized users in order to share information. Shibboleth ensures the security of materials traveling over the Internet while providing authorized users with easy, safe, and private access. 
This federated method of authentica- tion requires content providers and users to work closely together and to share common methods of authentication and standards of security. It will provide a fl exible and more secure replacement for current methods used to validate the use of content over the Internet (Needleman, 2004). The possibilities for centralized information and library services are great. A growing number of information services can now be delivered as widely shared centralized services. Libraries worldwide are becoming more closely involved with these resources, including freely available Web resources and library consortium offerings. Greater connections are needed between freely available Web resources and individual library services and holdings. Virtual Reference Services Virtual Reference Services are another application where the sharing of technical resources and workload is proving to be valuable. These ser- vices have developed rapidly and received considerable attention recently. The Librar y of Congress worked with the “Global Reference Network” and OCLC on the early development of online reference. This work led to the development of OCLC’s popular QuestionPoint virtual reference software (Quint, 2002). A range of other software products has developed as well. A recent sur vey showed that seven prominent virtual reference software products are now being used by over 2,800 libraries around the world (Olivares, 2004). The Virtual Reference Desk is a promising project sponsored by the United States Department of Education. It has assisted in the creation of a network of more than 100 “Ask a”- type virtual reference services. Many of these are nonlibrary projects offering reference-type information on a wide variety of specialized topics. The Virtual Reference Desk is a wide-reaching resource-sharing project that includes both libraries and other information- providing organizations (Virtual Reference Desk, 2002). The process of establishing standards for virtual reference services is underway. Several organizations have developed best practices in this area. IFLA began a Digital Reference Standards project in 2001 to work with a wide variety of groups, including the Reference and User Services As- sociation (RUFA), OCLC, NISO, and the Virtual Reference Desk project (Fullerton, 2002). webster/interconnected & innovative libraries 390 library trends/winter 2006 Information Industry and Library Partnerships The publishing and information services industries are changing rapidly. Business mergers and partnerships are bringing about their own sort of re- source sharing through consolidation. Major publishers such as Gale, Bowker, and Academic Press have joined with larger companies. The merging of the ILS company Endeavor with the publisher Elsevier, or the e-serials service company Serials Solutions with the e-content aggregator Proquest, are ex- amples of formerly separate information services coming together. Libraries are being offered an increasingly unifi ed and integrated range of services. Online information vendors are involved in a growing array of part- nerships, of which Crossref is just one example. The new services that are becoming available—federated searching, Open URL linking, and virtual reference—all depend on the use of common standards and methods and on close cooperation among e-content vendors. 
Both Proquest’s director of platform management, John Law, and EBSCO’s chief systems architect, Oliver Pesch, agree that even more standardization and cooperation be- tween online information companies is needed (Grogg & Ferguson, 2004). It is not surprising that the metasearch company MuseGlobal prominently “showcases” its partnerships with major ILS vendors and e-content providers (MuseGlobal, 2005). In the same way, ILS vendor Sirsi lists eighty corporate partners on their Web site (Sirsi, 2005). The successful functioning of online products is increasingly dependant on cooperation. Publishers and information services vendors are also partnering with libraries in a growing variety of ways. As vendors rapidly develop new ser- vices, partnerships between software vendors and the library community for testing and evaluating new products are essential. The Endeavor com- pany promotes the collaborative approach taken to develop its software in partnership with library users. It lists over sixty libraries involved in “task forces” (Endeavor Information Systems, n.d.) working to enhance aspects of Endeavor services. Wide consultation and collaborative interaction with libraries have become the norm for information ser vices companies. It is important to build communities of interest for their products. Online information, product-specifi c publications, user groups, and mail lists are common methods for training users and providing information. They are also important for allowing users to share knowledge and join in discus- sions, which result in innovations and enhancement of the vendor’s prod- ucts. Informal networks grow around both commercial and public domain software. The product’s listserv often becomes a critical resource. The user community becomes an important force in application development. The range of library-related partnerships and network relationships is diverse and far reaching. The relationships among nonprofi t organizations, information vendors, and libraries have been instrumental in developing online information infrastructure in many parts of the world. Electronic 391 Information for Libraries (eIFL) is a particularly good example. eIFL was formed in 1999 as a joint project of the Sorus Foundation’s Open Society Institute and EBSCO publishing, with the aim of fostering library consortia and e-content services in countries with limited online information infra- structure. eIFL has developed into an independent consortium providing e-content ser vices in forty developing countries, particularly in Eastern Europe and Africa (Electronic Information for Libraries, n.d.). Preservation and Conservation Partnerships Another area where information industry and library partnerships have been particularly active is in digitization of print collections. A major ex- ample of such partnering is the Elsevier company’s collaborative effort to locate, digitize, and preser ve the complete archive of its print journals. Elsevier partnered with the National Library of the Netherlands and Yale University, in addition to many content-providing libraries, over a three-year period on this project (Elsevier Corporation, 2002). Thomson Web of Science has undergone a similar process to identify and index 100 years of historical journal materials for their Century of Sci- ence project (Thomson Scientifi c, 2004). Thomson credits partners Trinity College Dublin and University College Cork and lists eight other major libraries and institutions for providing materials for this project. 
Another interesting text conversion project is the Early English Books Online Text Creation Partnership (EEBOTCP), which involves Proquest and Chadwyck- Healey, partnered with over 130 universities, in the digitization of early works in English (EEBOTCP, 2005). Both business and nonprofi t partner- ships are involved in digitization efforts. These partnerships are making it possible to preserve and manage worldwide collections, both paper and electronic, in ways that have never been possible before. Conclusion Libraries are working ever more closely with one another, with online information companies, and with other cultural agencies. They increasingly share infrastructure and human resources to offer a range of common services. They are participating in widely available Web -accessible central- ized ser vices. Libraries collaborate and exchange resources by sharing both formal and informal standards. They participate in the cooperative process for developing those standards. Libraries participate collectively in the continuing innovation of information software and services, both commercial and open source. They routinely share information on the use of common software applications, large and small. The sharing of ideas, expertise, and resources by wide-reaching, often voluntary and informal, communities of interest is central to the way libraries offer and further develop online services. webster/interconnected & innovative libraries 392 library trends/winter 2006 These activities have made libraries more interconnected and inter- dependent than ever before. Through this interdependence, libraries are moving well beyond organizing and offering user access to local bodies of material within their own buildings to ordering and providing access to ever larger, increasingly comprehensive, ultimately global bodies of shared material. As the number, type, and complexity of sharing relationships grow, libraries will need to draw the threads together to better focus the many important ways in which they work together to share resources. References Breeding, M. (2004). The trend toward outsourcing the ILS: Recognizing the benefi ts of shared systems. Computers in Libraries, 24(5), 36–38. Breeding, M. (2005). Re-integrating the “integrated” library system. Computers in Libraries, 25(1), 28–30. Carlson, S., & Young, J. (2005). Google will digitize and search millions of books from 5 top research libraries. Chronicle of Higher Education, 51(18). Council of Prairie and Pacifi c University Libraries (COPPUL). (n.d.). reSearcher. Retrieved September 15, 2005, from http://researcher.sfu.ca/index.php/plain/about. Counting Online Usage of Networked Electronic Resources (COUNTER). (n.d.). About Coun- ter. Retrieved April 24, 2005, from http://www.projectcounter.org/about.html. Crossref. (2005). Crossref Newsletter. Retrieved May 1, 2005, from http://www.crossref.org/ 01company/10newsletter.html. Early English Books Online Text Creation Project (EEBOTCP). (2005). Participating Institutions. Retrieved May 4, 2005, from http://www.lib.umich.edu/tcp/eebo/proj_stat/ps_partners .html. Electronic Information for Libraries (eIFL). (n.d.). About eIFL.net. Retrieved May 4, 2005, from http://www.eifl .net/about/about.html. Elsevier Corporation. (2002). Elsevier Editors Update. Issue 2, Nov. 2002. Retrieved April 22, 2005, from http://www.elsevier.com/wps/fi nd/editorsinfo.editors/editors_update/issue2c. Endeavor Information Systems. (n.d.). Advisory boards. 
Retrieved May 4, 2005, from http:// www.endinfosys.com/support/advisory.htm. Farrell, N. (2005). European librarians march against Google. Inquirer, April 28, 2005. Retrieved April 28, 2005, from http://www.theinquirer.net/?article=22865. Fullerton, V. (2002). IFLA digital reference standards project. Retrieved May 4, 2005, from http:// www.ifl a.org/VII/s36/pubs/drsp.htm#1. Grogg, J., & Ferguson, C. (2004). Oh, the places linking will go. Searcher, 12(2), 48–58. Hyman, K. (2000). Struggling in a one-stop shopping world, or people want what they want when they want it. In Sara Laughlin (Ed.), Library networks in the new millennium: Top ten trends (pp. 93–97). Chicago: Association of Specialized and Cooperative Library Agencies. Infotrieve, Inc. (2005). Ariel address list. Retrieved May 1, 2005, from http://www4.infotrieve .com/ariel/fi les/wariadr.txt. Kenney, B. (2004). Georgia’s 250 PINES libraries to create an ILS their way. Library Journal, 129(14), 29. Koha Open Source Library Systems. (n.d.). About Koha. Retrieved September 15, 2005, from http://www.koha.org/about-koha/. Laughlin, S. (Ed.). (2000). Library networks in the new millennium: Top ten trends. Chicago: As- sociation of Specialized and Cooperative Library Agencies. Library of Congress, Network Development and MARC Standards Offi ce. (1998). MARC 21: Harmonized USMARC and CAN/MARC. Retrieved April 21, 2005, from http://www.loc .gov/marc/annmarc21.html. Lunau, C. (2003). The Bath Profi le: What is it and why should I care? Librar y and Archives Canada. Retrieved September 15, 2005, from http://www.collectionscanada.ca/bath/ ap-bathnew-e.htm. Mattison, D. (2005). RedLightGreen and Open WorldCat. Searcher, 13(4), 14–23. 393 Morgan, E. (2002). Possibilities for open source software in libraries. Information Technology and Libraries, 21(1), 12–15. MuseGlobal. (2005). Partner showcase. Retrieved May 4, 2005, from http://www.museglobal .com/partner/showcase.html. National Information Standards Organization (NISO). (2005a). About NISO. Retrieved April 24, 2005, from http://www.niso.org/about/index.html. National Information Standards Organization. (NISO). (2005b). Standards development pipeline. A quick summary of all NISO standards that are in development, approved, published or withdrawn. Retrieved April 24, 2005, from http://www.niso.org/creating/creating_process.html. Needleman, M. (2004). The Shibboleth authentication/authorization system. Serials Review, 30(3), 252–253. OCLC. (n.d.). Enable deep links to your online catalog from Open WorldCat. Retrieved May 21, 2005, from http://www.oclc.org/worldcat/open/deeplinking/. Olivares, O. (2004). Virtual reference systems. Computers in Libraries, 24(5), 25–29. OSS4Lib. (2005). Home page. Retrieved April 20, 2005, from www.oss4lib.org. Proffi tt, M. (2004). RedLightGreen: What we’ve learned since launch. RLG Focus. Retrieved May 1, 2005, from http://redlightgreen.com/librarianinfo/RFOCUSreprint-66_2.pdf. Quint, B. (2002). QuestionPoint marks new era in virtual reference. Information Today, 19(7), 50–51. Raymond, E. (2001). The cathedral and the bazaar: Musings on Linux and open source by an ac- cidental revolutionary. Cambridge, MA: O’Reilly. ResourceShelf. (2005). Google Scholar direct links to articles: Google Scholar is now open to all libraries. Retrieved May 1, 2005, from http://www.resourceshelf.com/2005/05/be-it-resolved-that- google-scholar-is.html. Sirsi. (2005). Complete alphabetical listing of all Sirsi partners. 
Endeavor Information Systems. (n.d.). Advisory boards. Retrieved May 4, 2005, from http://www.endinfosys.com/support/advisory.htm.
Farrell, N. (2005). European librarians march against Google. Inquirer, April 28, 2005. Retrieved April 28, 2005, from http://www.theinquirer.net/?article=22865.
Fullerton, V. (2002). IFLA digital reference standards project. Retrieved May 4, 2005, from http://www.ifla.org/VII/s36/pubs/drsp.htm#1.
Grogg, J., & Ferguson, C. (2004). Oh, the places linking will go. Searcher, 12(2), 48-58.
Hyman, K. (2000). Struggling in a one-stop shopping world, or people want what they want when they want it. In Sara Laughlin (Ed.), Library networks in the new millennium: Top ten trends (pp. 93-97). Chicago: Association of Specialized and Cooperative Library Agencies.
Infotrieve, Inc. (2005). Ariel address list. Retrieved May 1, 2005, from http://www4.infotrieve.com/ariel/files/wariadr.txt.
Kenney, B. (2004). Georgia's 250 PINES libraries to create an ILS their way. Library Journal, 129(14), 29.
Koha Open Source Library Systems. (n.d.). About Koha. Retrieved September 15, 2005, from http://www.koha.org/about-koha/.
Laughlin, S. (Ed.). (2000). Library networks in the new millennium: Top ten trends. Chicago: Association of Specialized and Cooperative Library Agencies.
Library of Congress, Network Development and MARC Standards Office. (1998). MARC 21: Harmonized USMARC and CAN/MARC. Retrieved April 21, 2005, from http://www.loc.gov/marc/annmarc21.html.
Lunau, C. (2003). The Bath Profile: What is it and why should I care? Library and Archives Canada. Retrieved September 15, 2005, from http://www.collectionscanada.ca/bath/ap-bathnew-e.htm.
Mattison, D. (2005). RedLightGreen and Open WorldCat. Searcher, 13(4), 14-23.
Morgan, E. (2002). Possibilities for open source software in libraries. Information Technology and Libraries, 21(1), 12-15.
MuseGlobal. (2005). Partner showcase. Retrieved May 4, 2005, from http://www.museglobal.com/partner/showcase.html.
National Information Standards Organization (NISO). (2005a). About NISO. Retrieved April 24, 2005, from http://www.niso.org/about/index.html.
National Information Standards Organization (NISO). (2005b). Standards development pipeline: A quick summary of all NISO standards that are in development, approved, published or withdrawn. Retrieved April 24, 2005, from http://www.niso.org/creating/creating_process.html.
Needleman, M. (2004). The Shibboleth authentication/authorization system. Serials Review, 30(3), 252-253.
OCLC. (n.d.). Enable deep links to your online catalog from Open WorldCat. Retrieved May 21, 2005, from http://www.oclc.org/worldcat/open/deeplinking/.
Olivares, O. (2004). Virtual reference systems. Computers in Libraries, 24(5), 25-29.
OSS4Lib. (2005). Home page. Retrieved April 20, 2005, from www.oss4lib.org.
Proffitt, M. (2004). RedLightGreen: What we've learned since launch. RLG Focus. Retrieved May 1, 2005, from http://redlightgreen.com/librarianinfo/RFOCUSreprint-66_2.pdf.
Quint, B. (2002). QuestionPoint marks new era in virtual reference. Information Today, 19(7), 50-51.
Raymond, E. (2001). The cathedral and the bazaar: Musings on Linux and open source by an accidental revolutionary. Cambridge, MA: O'Reilly.
ResourceShelf. (2005). Google Scholar direct links to articles: Google Scholar is now open to all libraries. Retrieved May 1, 2005, from http://www.resourceshelf.com/2005/05/be-it-resolved-that-google-scholar-is.html.
Sirsi. (2005). Complete alphabetical listing of all Sirsi partners. Retrieved May 4, 2005, from http://www.sirsi.com/Partners/partner_alpha_list.html.
Straw, J. (2003). When the walls came tumbling down: The development of cooperative service and resource sharing in libraries: 1876-2002. Reference Librarian, 83/84(40), 263-276.
Tennant, R. (Ed.). (2002). XML in libraries. New York: Neal-Schuman.
Thomson Scientific. (2004). Announcing the century of science: Over one hundred years of ground breaking research now available via Web of Science. Retrieved April 28, 2005, from http://www.thomsonscientific.com/centuryofscience/cos-backstory.html.
Virtual Reference Desk. (2002). About VRD. Retrieved May 4, 2005, from www.vrd.org.
Vogelstein, F. (2005). Search and destroy. Fortune, 151(9), 73-79.
Wilson, A. (Ed.). (2003). The 2003 OCLC environmental scan: Pattern recognition. Dublin, Ohio: OCLC Online Computer Library Center, Inc.

Peter Webster is Systems Librarian for Saint Mary's University in Halifax, Canada. He is also a participant in the Nova Scotia academic library consortium NOVANET, as well as regional and Canadian national resource-sharing efforts. Peter received his MLS from Dalhousie University in 1986. He has been a speaker at CLA, APLA, ALA, and ACCESS conferences. His recent publications include "Breaking Down Information Silos: Integrating Online Information," Online 28, no. 6; "Metasearching in an Academic Environment," Online 28, no. 2; and "Remote Patron Validation: Posting a Proxy Server at the Digital Doorway," Computers in Libraries 22, no. 8.

work_mul5t24lkvfehbanpnhlhoddbi ---- American Archivist/Vol. 49, No. 1/Winter 1986

Interpretation and Application of the AMC Format

NANCY A. SAHLI

Abstract: The USMARC Archival and Manuscripts Control (AMC) format is a standard format for the administrative and descriptive control of archives and manuscript materials, primarily in automated systems. This article describes the history of the AMC format's development, as well as the characteristics of its various parts. Information on AMC format implementation and use is provided, covering such topics as functional requirements analysis, information gathering, and system selection/design. The article concludes with recommendations for future action related to the format's ongoing development and use.
About the author: Nancy Sahli is on the staff of the National Historical Publications and Records Commission, where she is archives specialist for technological evaluation. Her undergraduate degree is from Vassar College and her M.A. and Ph.D. degrees in history are from the University of Pennsylvania. Much of her work has focused on archival automation and information systems, and she served as a member of the Society of American Archivists's National Information Systems Task Force. Her most recent book, MARC for Archives and Manuscripts: The AMC Format, was published in 1985 by the Society of American Archivists. The views expressed in this article are solely hers and do not represent the official position of the National Historical Publications and Records Commission or the National Archives and Records Administration. This article is a revised version of a paper presented at the 48th annual meeting of the Society of American Archivists, 3 September 1984, Washington, D.C.

EVERYONE REMEMBERS THE BIBLICAL STORY about the Tower of Babel, the highrise of confusion where the carpenters could not talk with the masons because no one spoke the same language. A similar situation occurs when archivists try to describe and communicate information about archives and manuscripts without having a common vocabulary and set of ground rules. The use of computers, which requires conformity and standard techniques, only complicates the situation.

Archivists and manuscript curators who undertake computer applications quickly become aware of the need for standardized formats and procedures. As Lydia Lucas noted in an American Archivist article in 1981, "Automation, though it tolerates wide variance in data, does not tolerate idiosyncracy. . . . Standard formats, where the required elements can be formalized, help encourage precision and accuracy at crucial points."[1] Traditionally, however, archivists have been idiosyncratic, and the lack of uniform descriptive standards and practices has been a definite hindrance to automation and information exchange in archives and manuscript repositories. What archivists are coming to realize, however, is that to argue that standard formats are not needed for archives and manuscripts or are impossible to achieve is to relegate archivists to an intellectual and professional backwater.

Because of a general lack of standards for archival description at the time, it was relatively easy for the Library of Congress in 1973 to issue a MARC Format for Manuscripts that had only a remote relation to archival needs and practices.[2] Although the introduction to the format acknowledged the assistance of John Knowlton of the library's Manuscript Division and Arline Custer and Harriet Ostroff of the National Union Catalog of Manuscript Collections, formal participation of the archival profession in the format's development apparently did not occur.
As a result, this format for machine-readable information exchange was best suited for the description of individual manuscript items—the kiss of death as far as archivists were concerned—and enjoyed little use.[3] More significantly, however, archivists rejected the format because it was seen as being oriented to library rather than archival practices, from its origination in the Library of Congress to its use of library concepts and terminology. Even the Manuscript Division at the library refused to use it. Meanwhile, the library community embraced the other MARC formats (for books, serials, and other materials) with enthusiasm and used them in creating automated networks for interlibrary loan, shared cataloging, and other applications.

It is hardly surprising that archivists turned away from the MARC formats and library automation activities in order to develop systems that were more in tune with their perceived needs. A variety of in-house systems were initiated at such institutions as the National Archives, the Smithsonian Institution Archives, and the University of Illinois—systems which used different hardware, software, and data configurations. During these early days there was little perception of the need to view archival description as part of a wider information environment or of the possible administrative uses of shared data.

[1] Lydia Lucas, "Efficient Finding Aids: Developing a System for Control of Archives and Manuscripts," American Archivist 44 (Winter 1981): 24-25.
[2] Library of Congress, MARC Development Office, Manuscripts: A MARC Format; Specifications for Magnetic Tapes Containing Catalog Records for Single Manuscripts or Manuscript Collections (Washington, D.C.: Library of Congress, 1973).
[3] No review of the MARC Format for Manuscripts appeared in the American Archivist in 1973, 1974, or 1975; nor was the format mentioned in the journal's "Technical Notes" section. It was, however, listed by Meyer H. Fishbein in his bibliography, "ADP and Archives: Selected Publications on Automatic Data Processing," American Archivist 38 (January 1975): 31-42, and in "Writings on Archives, Historical Manuscripts, and Current Records: 1973," American Archivist 38 (July 1975): 339-374.

One philosophical exception to this pattern was SPINDEX, a series of data base management programs developed at the National Archives in the 1960s and 1970s to deal with archival automation needs. Although SPINDEX's developers originally envisioned the use of a common data format by all of the system's users, individual institutions quickly learned that the programs' flexibility enabled them to create a wide variety of data base designs, formats, and implementations.
A big step toward the standardized use of SPINDEX and creation of a national information system for archives and manuscripts came in 1976 when the National Historical Publications and Records Commission (NHPRC) announced plans to develop a SPINDEX data base of information about historical records and manuscripts and the institutions in which they were located.[4] Conceived as a hierarchical system including repository, collection/record group, series, and lower levels of control, to emulate the eight-level hierarchy found in the SPINDEX programs, the data base eventually encompassed the NHPRC's Directory of Archives and Manuscript Repositories and several state-based survey projects in Washington, New York, Kentucky, and other areas, which used the same field structure or pattern for formatting data.[5]

At the time the NHPRC was beginning to develop its system, some archivists questioned the commission's decision. Concerns ranged from the possibility of duplication of effort with the National Union Catalog of Manuscript Collections (NUCMC) to concern that a SPINDEX-based information system could not, because of its inherent technological limitations, provide the kind of flexible, online access that was becoming more and more widespread in the information world. In order to address these concerns and to develop ideas for what a national archives and manuscripts information system should be, the Society of American Archivists formed the National Information Systems Task Force in 1977.[6] Early on NISTF, as the task force came to be called, perceived that no single system or entity could serve the needs of all archival users.

[4] The most detailed discussion of the NHPRC data base concept can be found in Report on the Conference on Automated Guide Projects, St. Louis, Missouri, July 19-20, 1977 (Atlanta: National Association of State Archives and Records Administrators, 1978). See also Larry J. Hackman, Nancy Sahli, and Dennis A. Burton, "The NHPRC and a Guide to Manuscript and Archival Materials in the United States," American Archivist 40 (April 1977): 201-205.
[5] National Historical Publications and Records Commission, Directory of Archives and Manuscript Repositories in the United States (Washington, D.C.: NHPRC, 1978). Publications of the state-based survey projects include Washington (State), Division of Archives and Records Management, Historical Records of Washington State: Guide to Records in State Archives and Its Regional Depositories (Olympia, Wash.: Washington State Division of Archives and Records Management and Washington State Historical Records Advisory Board, 1981) and Washington (State), Division of Archives and Records Management, Historical Records of Washington State: Records and Papers Held at Repositories (Olympia, Wash.: Washington State Historical Records Advisory Board, 1981); a continuing series of county guides produced by the New York Historical Resources Center at Cornell University; and the forthcoming guide to materials surveyed by the Kentucky Guide Project of the Kentucky Department for Libraries and Archives.
[6] See Richard H. Lytle, "A National Information System for Archives and Manuscript Collections," American Archivist 43 (Summer 1980): 423-426, and "An Analysis of the Work of the National Information Systems Task Force," American Archivist 47 (Fall 1984): 357-365; and David Bearman, "Toward National Information Systems for Archives and Manuscript Repositories," American Archivist 45 (Winter 1982): 53-56. NISTF functioned until 1983.
NISTF functioned until 1983. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.49.1.y10g533247774463 by C arnegie M ellon U niversity user on 06 A pril 2021 12 American Archivist/Winter 1986 decided that a more appropriate focus would be to establish a format for ar- chival information exchange that could be used with all types of hardware and software and could even be adapted for manual applications. Such a common format would enable information to be exchanged between institutions and need- ed to be designed to conform to accepted standards in the information world, such as those promulgated by the American National Standards Institute. After in- vestigating the resources needed to develop such a format, the task force decided that the most economical and best approach would be to take an ex- isting format, the MARC Format for Manuscripts, and try to adapt it to meet archival needs. The radicalism of such a measure should not be underestimated. Imagine, recommending a procedure that would involve archivists talking to librarians, learning about their practices, and even working with them toward the develop- ment of a common standard. Yet the ad- vantages of developing a new MARC for- mat were clear. Archives and manuscripts information could be integrated into ex- isting MARC-based bibliographic net- works, the costs of developing and main- taining an independent format could be largely avoided, and network users could ultimately obtain information about all types of materials relevant to their needs from a single source. Indeed, the Re- search Libraries Group's plan to develop an archives and manuscripts module for their MARC-based online system, the Research Libraries Information Network (RLIN), was a key force leading to adop- tion of the MARC format strategy by NISTF. Because the development of MARC formats was controlled by the library community, NISTF recognized the im- portance of not ceding all format deci- sions to librarians. Cooperation was the key, and the working group established by NISTF to iron out the details of infor- mation and format requirements includ- ed representatives from both the archives and library worlds. Likewise, it was agreed that maintenance of the revised format, christened the USMARC Format for Archival and Manuscripts Control (or AMC for short), would be the joint responsibility of the Library of Congress and two advisory bodies, the SAA's Committee on Archival Information Ex- change (CAIE) and the American Library Association's Committee on the Representation in Machine-Readable Form of Bibliographic Information (MARBI). Basic to this working relation- ship was the Library of Congress's agree- ment to make AMC format changes only with the consent of the SAA and ALA committees. What then is the AMC format and what are its implications for archival description? At its most elementary level the format is a container for informa- tion—a series of labeled pigeonholes—in- to which data or information about ar- chives and manuscripts may be placed, just as a recipe, another type of format, is a series of pigeonholes of data relating to the preparation of a particular food. Just as a recipe contains various parts, such as a title, a list of raw ingredients, and narrative details on preparation tech- niques, so the AMC format, like the other USMARC formats, has different parts, each of which contains a particular kind of data. These include the leader, the record directory, control fields, and variable data fields. 
Within the general framework of the format the user creates a separate record for each unit (such as a collection or record group) being described. For example, the data base being created by the Research Libraries Information Network contains many different records. Each of them, however, contains similar data fields. It is only the information in each record that is different.

One of the biggest mistakes that fledgling format users make is trying to understand the leader and record directory elements of the format. Both of these are machine-generated entries that contain general information about the record as a whole and also provide parameters for computer processing of the records. Conceptually they are very hard to understand and are apt to discourage novices from further exploration of the format. Most archivists, however, who are undertaking computer related implementations of the format will be using an existing processing system—an online network such as OCLC or RLIN; a turnkey system, such as Geac or LS2000, in which the vendor provides both equipment and programs; or general MARC application programs, which run in a mini- or microcomputer environment.7 Michigan State University, for example, with funding from the NHPRC, is developing a series of MARC-based programs that will run on an IBM PC-XT or compatible equipment. The project is scheduled for completion in mid-1986. All of these systems should generate leader and directory data automatically, with minimum intervention by the user. Those who prefer the challenge of a more individualistic approach may, of course, develop their own software and systems.

In addition to the leader and record directory, each MARC record contains control fields and variable data fields. The control fields provide information about the record's control number, the subrecord map of the directory (another technical term), the date and time of the latest transaction involving the record, certain physical characteristics of the material being described, and other abbreviated or coded information about the record useful for information retrieval. Following the control fields is the heart of the format, the seventy-seven variable data fields approved for inclusion in the AMC format (Table 1). Each variable data field begins with two characters called indicators, each of which provides summary information about the content of the rest of the field. Following the indicators, each field contains between one and twenty subfields. Each subfield contains a particular data element, such as a date, a name, or an index term. Many subfields and fields may be repeated within a single record, while others cannot. Each field and subfield has a unique field number, subfield letter or number, and name. Descriptions and examples have been created for all fields, as well as many subfields.

Figure 1 shows the layout of a typical AMC variable data field, 506, which provides information about restrictions on access. Although some fields, such as this one, give individual users considerable latitude in deciding how they want subfield information to appear, other fields require the use of Library of Congress designated codes or standard forms of entry.

If an archivist is interested in pursuing the use of the format, what steps should be taken?
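Before turning to those steps, a concrete illustration may help. The following hypothetical record fragment, invented for this discussion and not drawn from any repository, shows the anatomy just described: a field number from Table 1, two indicator positions (shown as blanks where undefined or unused), and one or more subfields, each introduced by a delimiter and a subfield code:

100 1  |a Smith, Jane, |d 1890-1960
245 00 |a Jane Smith papers, |f 1915-1958
300    |a 12 cubic ft.
545    |a Physician and civic leader of Madison, Wisconsin.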
First, if the prospective user is not already involved in automation, he or she will need to decide whether the initial implementation will be manual or computerized. Did I say manual? I certainly did, for although MARC is an acronym for Machine-Readable Cataloging, the basic principles of designing a descriptive system are not dependent on the technology that will be used for implementation. The content of archival descriptive information is no different whether it is written down by hand, on a typewriter, or entered into a national online system. The important thing is to gather and record the information in a manner that is compatible and consistent. That means isolating individual information elements and arranging them in the same logical order, or field order, in which they appear in the AMC format design. That way, if the decision is made to automate, it will be a simple matter to add field and subfield designators, indicators, and the other embellishments that are part of a machine-readable MARC record.

7. OCLC, the Online Computer Library Center, began providing online services in 1971 and is the largest bibliographic service in the United States. RLIN is the computer network of the Research Libraries Group, a corporation jointly owned by a number of American research institutions and libraries. Turnkey systems are automation systems which include hardware, software, installation, training, and ongoing support from a single source. Geac is a turnkey system marketed by Geac Computers International, a Canadian firm. LS2000 is being developed and marketed by OCLC.

Table 1. AMC Format Variable Data Fields

Tag        Field Title
001        Control number
002        Subrecord map of directory
005        Date and time of latest transaction
007/00     Category of material
007/01     Specific material designation
007/02     Original versus reproduction aspect
007/03     Polarity (microforms)
007/04     Dimensions (microforms)
007/05-08  Reduction ratio
007/09     Color (microforms)
007/10     Emulsion on film (microforms)
007/11     Generation
007/12     Base of film (microforms)
008/00-05  Date entered on file
008/06     Type of date code
008/07-10  Date 1
008/11-14  Date 2
008/15-17  Place of publication, production, or execution code
008/18-22  Undefined
008/23     Form of reproduction code
008/24-34  Undefined
008/35-37  Language code
008/38     Modified record code
008/39     Cataloging source code
010        Library of Congress control number
035        Local system control number
039        Level of bibliographic control and coding detail
040        Cataloging source
041        Language code
043        Geographic area code
045        Chronological code or date/time
052        Geographic classification code
066        Character sets present
072        Subject category code
09X        Local call numbers
100        Main entry — personal name
110        Main entry — corporate name
111        Main entry — conference or meeting
130        Main entry — uniform title heading
240        Uniform title
242        Translation of title by cataloging agency
243        Uniform title, collective
245        Title statement
260        Publication, distribution, etc. (imprint)
300        Physical description
340        Medium
351        Organization and arrangement
500        General note
502        Dissertation note
505        Contents note (formatted)
506        Restrictions on access
510        Citation note (brief form)/references
520        Summary, abstract, annotation, scope, etc., note
521        Users/intended audience note
524        Preferred citation of described materials
530        Additional physical form available note
533        Reproduction note
535        Location of originals/duplicates
540        Terms governing use and reproduction
541        Immediate source of acquisition
544        Location of associated materials
545        Biographical or historical note
546        Language note
555        Cumulative index/finding aids note
561        Provenance
562        Copy and version identification
565        Case file characteristics note
580        Linking entry complexity note
581        Publications note
583        Actions
584        Accumulation and frequency of use
59X        Local notes
600        Subject added entry — personal name
610        Subject added entry — corporate name
611        Subject added entry — conference or meeting
630        Subject added entry — uniform title heading
650        Subject added entry — topical heading
651        Subject added entry — geographic name
655        Genre/form heading
656        Index term — occupation
657        Index term — function
69X        Local subject added entries
700        Added entry — personal name
710        Added entry — corporate name
711        Added entry — conference or meeting
730        Added entry — uniform title heading
740        Added entry — title traced differently
752        Added entry — place of publication or production
773        Host item entry
851        Location
870        Variant personal name
871        Variant corporate name
872        Variant conference or meeting name
873        Variant uniform title heading
880        Alternate graphic representation
886        Foreign MARC information field

A second key in planning for implementation is to acquire the essential format documents. These include two manuals prepared specifically for archival users: MARC for Archives and Manuscripts: The AMC Format and MARC for Archives and Manuscripts: A Compendium of Practice.8 The first of these contains introductory guidelines for format use, the AMC format edited for archival users, and an updated version of the NISTF Data Elements Dictionary with cross references to fields and subfields in the format and to Steven L. Hensen's Archives, Personal Papers, and Manuscripts. The second volume is a product of the 1984 Conference on the Use of the MARC Format for Archives and Manuscripts, held at the State Historical Society of Wisconsin with support from the NHPRC.
It provides examples of format use and practice from some of the initial AMC users, including RLIN, OCLC, and individual institutions. The Library of Congress's MARC Formats for Bibliographic Data (MFBD), Update 10, the "official" release of the AMC format, contains the full text of the format as well as essential codes and other authority lists.9 Materials created by other organizations and individuals can also be of assistance in designing an AMC implementation. These include the Research Libraries Group's AMC Field Guide, Walt Crawford's MARC for Library Use, and more general MARC literature from the library world. Crawford's book contains a rich bibliography.10

Third, the need for archivists to have a clear sense of their own descriptive needs is as important as familiarity with MARC itself. We have all heard of the proverbial repository whose finding aid system changes each time there is a new curator of manuscripts. With the AMC format, there is now an opportunity for archivists to take a detailed look not only at their descriptive systems (or lack thereof), but also at the methods used for providing administrative control over materials. It is likely that such evaluation will reveal repetition and redundancy in archival administrative practices and record keeping, with multiple forms and a lot of duplicated effort. Archivists should be clear about their information needs before they begin format implementation. Some of the current publications on designing information systems that archivists should find particularly helpful are All in Order: Information Systems for the Arts by Mary Van Someren Cok, Richard M. Kesner's Automation for Archivists and Records Managers: Planning and Implementation Strategies, and Joseph R. Matthews's Choosing an Automated Library System: A Planning Guide.11

Finally, after determining the system requirements for information elements and computer hardware, it is time to start thinking about what the format is to do.

8. Nancy Sahli, MARC for Archives and Manuscripts: The AMC Format (Chicago: Society of American Archivists, 1985) and Max J. Evans and Lisa B. Weber, MARC for Archives and Manuscripts: A Compendium of Practice (Madison: State Historical Society of Wisconsin, 1985).
9. The MARC Formats for Bibliographic Data (MFBD) may be ordered either on an ad hoc or subscription basis from the Customer Services Section, Cataloging Distribution Service, Library of Congress, Washington, D.C. 20541. Steven L. Hensen's Archives, Personal Papers, and Manuscripts: A Cataloging Manual for Archival Repositories, Historical Societies, and Manuscript Libraries (Washington, D.C.: Library of Congress, 1983) may be ordered from the same source. A catalog listing other MARC-related publications is also available.
10. Research Libraries Group, AMC Field Guide (preliminary edition, Stanford, Cal.: Research Libraries Group, 1983; a revised edition is forthcoming). Walt Crawford, MARC for Library Use: Understanding the USMARC Formats (White Plains and London: Knowledge Industry Publications, Inc., 1984).

Figure 1. Layout of AMC variable data field 506, Restrictions on Access. Source: MARC Formats for Bibliographic Data (Washington, D.C.: Library of Congress, 1984). The figure itself is garbled in this copy; what remains legible identifies the field's two undefined indicator positions, its subfields (terms governing access; jurisdiction; physical access provisions; authorized users; materials specified), and several sample entries.
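Working only from the legible fragments of Figure 1, one of its sample entries appears to have read approximately as follows; this is a reconstruction rather than a verified quotation, and the subfield coding is an assumption based on the subfield list above:

506    |a Closed to investigators until 1995.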
Since it is generally not cost-effective or practical for archivists to develop their own computer programs, it will be necessary to evaluate one of the existing networks, turnkey systems, or software packages. Before reaching a decision, questions should be asked about cost, maintenance, user assistance, the layout of screen displays used for data entry and retrieval, and the kinds of hard copy products, such as reports, that can be generated by the system. Joining a network such as RLIN, for example, may be only a partial solution, because a stand-alone computer may still be needed for routine word processing and certain administrative functions. Other factors that should be considered include the ability of staff to adapt psychologically to the use of a standard format and automated techniques, the need to develop procedures for quality control of format data, education and training needs prior to and during implementation, and even such mundane matters as whether a repository's wiring system can handle computer equipment without a major overhaul.

It is also wise to talk with people at institutions that are implementing the AMC format. Some of these are listed in MARC for Archives and Manuscripts: A Compendium of Practice. Quite a few repositories and organizations, with varying prior levels of automation experience, are already using the format and are creating the basis for future implementation by the rest of the archival community.

Members of the Research Libraries Group, such as Yale, Cornell, and Stanford universities and the Hoover Institution, have been working as a consortium to design and test RLIN's implementation of the AMC format. Other RLIN participants include the National Archives, several state archives, and a host of research libraries. The Library of Congress's Manuscript Division and NUCMC have undertaken the planning steps essential for format implementation. OCLC is implementing the format both through its regular online network and through its LS2000 turnkey system. At the Smithsonian Institution, MIT, and other locations, the integrated MARC-based Geac turnkey system is being used. Format-based software and in-house systems for archival applications have been or are being developed by a number of organizations, including Automated Information Reference, Inc. (AIRS), Michigan State University, the Chicago Historical Society, Western Carolina University, Dickinson College, and Gallaudet College.

It is obvious that the USMARC Archival and Manuscripts Control Format is here to stay.

11. Mary Van Someren Cok, All in Order: Information Systems for the Arts, Including the National Standard for Arts Information Exchange (Washington, D.C.: National Assembly of State Arts Agencies, 1981), especially 63-100; Richard M. Kesner, Automation for Archivists and Records Managers: Planning and Implementation Strategies (Chicago: American Library Association, 1984); and Joseph R. Matthews, Choosing an Automated Library System: A Planning Guide (Chicago: American Library Association, 1980). See also Matthews's A Reader on Choosing an Automated Library System (Chicago: American Library Association, 1983). These are only a few of the many helpful works available in this field.
It is also obvious that some time is going to elapse before the format will be used by the majority of the profession. This is no cause for concern, however, for the format is complex and carries with it implications for the ways archivists describe and administratively control their holdings, assorted needs for education and outreach relating to its use, and a wide range of possibilities for the use of automated techniques. It also compels archivists to work with a wide range of professionals in the library, information, automation, and user communities. None of this can or should be accomplished overnight.

What then might archivists expect to see as format adoption and implementation progress? There will be continued development of standards for archival description and information formatting. There will also be those in the profession who resist this trend, who see no merit in constructing archival information systems integrated with those being developed for other kinds of information sources, and who feel that traditional archival descriptive and administrative practices should be religiously maintained. New ideas often face opposition. The primary concern, however, should be to ensure that the format and its implementations meet the needs of archivists and users alike.

In order to achieve this archivists need to develop strategies for education and outreach for AMC format implementation and use directed to both archivists and the users of archives and manuscript materials. Understanding the format, automation, systems analysis, and related concepts are challenges for the profession. Workshops and other short-term offerings may partially fill this need, but only if their participants immediately begin to apply the knowledge that they acquire. Self-instruction materials need to be prepared to guide archivists through the basics of format implementation. Similar instructional tools need to be developed for users of archival materials focusing, for example, on information retrieval strategies for use with online systems. The Society of American Archivists' current project, funded by the National Endowment for the Humanities, to develop an archival automation information and education program is designed specifically to meet these needs.

Archivists also need to consider the full range of possibilities for automated applications. Although there is no question that the AMC format is the standard for higher level archival description and information exchange, it may not be a suitable vehicle for providing certain types of administrative and process controls over the life cycle of records. Initial users of the format and of the networks and turnkey systems that have adopted it as a standard are evaluating these questions. The suitability of national networks for providing day-to-day administrative control of records is being evaluated, as are prospects for networking among microcomputers.12 Modifications to the AMC format have already been recommended, and additional changes are likely to occur in the future.13

The future of the AMC format depends on the work of many individuals and groups—the Society of American Archivists and its Committee on Archival Information Exchange, the Library of Congress, members of the archival profession, format users, software developers, and a host of others. It depends on the willingness of the archival profession to adopt standardized methods and procedures, on the availability of computer programs to manipulate formatted data, and on the ability of archivists, librarians, and other information professionals to continue the cooperation that has characterized their initial efforts.

12. For example, see "Historical Society of Wisconsin Joins RLG," SAA Newsletter (January 1985): 7; Tom Mills and Kathleen Roe, Development of LS2000 for Automated Control of Archives (Albany: New York State Archives, 1984); and David Bearman, "Who About What or From Whence, Why and How: Intellectual Access Approaches to Archives and their Implications for National Archival Information Systems" (Paper presented at the Conference on Archives, Automation and Access, University of Victoria, 1-2 March 1985).
13. Format users should note that several errors occurring in the Library of Congress's "official" format issuance (MFBD, Update No. 10) have been corrected in the SAA's edition of the format, MARC for Archives and Manuscripts: The AMC Format, as a result of discussion between the author and Margaret Patterson of the Network Development and MARC Standards Office of the Library of Congress. MFBD, Update No. 11, available from the Library of Congress, also includes these corrections. Substantive changes have been recommended as a result of the October 1984 meeting of AMC format users in Madison, Wisconsin. Those wishing to propose additional changes should address their concerns to the SAA's Committee on Archival Information Exchange.

work_mx5624rwivcsnaatb5e6ux7jlu ----

HKUST Institutional Repository

XML and global name access control

Ki-Tat Lam

The author: Ki-Tat Lam is Head of Systems, Hong Kong University of Science & Technology Library, Kowloon, Hong Kong.

Keywords: Access control, Computer languages, Foreign languages

Abstract: This paper discusses why the MARC21-based authority format has failed in a global setting and details the use of XML and its related technologies to achieve global name access control.

Electronic access: The research register for this journal is available at http://www.emeraldinsight.com/researchregisters and the current issue and full text archive at http://www.emeraldinsight.com/1065-075X.htm

OCLC Systems & Services, Volume 18, Number 2, 2002, pp. 88-96. MCB UP Limited. ISSN 1065-075X. DOI 10.1108/10650750210430150.

Introduction

While there are well-established standards and implementations for conducting authority control at the regional level, efforts to extend such control globally are still far from mature. For instance, the Library of Congress (LC) has established one of the world's largest name authority files using the MARC21 standard. However, this implementation was originally designed for Anglo-American libraries. Problems begin to appear when users worldwide adopted LC's records and standards for non-Latin names. For instance, there have been complaints about the inability to store CJK (Chinese, Japanese, Korean) scripts in LC's authority records, because it is very difficult to identify who is who if the record contains only the Latin transliterations.

Currently, the MARC21 format is a widely adopted markup standard for authority metadata. This standard has been very efficient for Latin-based names. However, if it is used for non-Latin names, the following issues may cause difficulties:

- providing more than one established heading in the same record;
- determining the field to be used to store multiple scripts with the same name;
- linking multiple fields of the same name in a record;
- identifying the script that is stored in a field;
- identifying the romanization scheme used in a field; and
- determining which character encoding scheme should be used in the record.

Enormous effort will be required to add original scripts to LC's authority files. However, the most fundamental issue for supporting global authority control lies in a search for a markup standard that fits well with the nature of multi-lingual name data.
This article presents a solution using XML and its related technologies to achieve global name access control.

Using multi-scripts in MARC21 name authority record

"金庸", the pen name of the renowned Chinese novelist and journalist "查良鏞", will be used to illustrate the multi-script issues in name authority records. Although this example involves only Chinese and Latin scripts, the problems discussed are common to other languages and scripts.

The following are the various forms and transliterations of "金庸":

(1) "金庸" – pen name used for his martial arts fiction;
(2) "Jin, Yong" – Pinyin romanization of this pen name;
(3) "Chin, Yung" – Wade-Giles romanization of this pen name;
(4) "Kim-Dung" – Vietnamese romanization of this pen name;
(5) "Kin, Yo" – Japanese romanization of this pen name;
(6) "查良鏞" – name used for his non-fiction publications;
(7) "Zha, Liangyong" – Pinyin romanization of this name;
(8) "Cha, Liang-yung" – Wade-Giles romanization of this name; and
(9) "Cha, Louis" – his English name.

Numbers 1-5 are various transliterations of the pen name "金庸" and numbers 6-8 are for his real name "查良鏞". The following shows the major fields of the authority record found in the OCLC database:

001 oca00560270
....
100 1  |a Jin, Yong, |d 1924-
400 1  |a Chin, Yung, |d 1924-
400 1  |a Kim, Dung, |d 1924-
400 1  |a Kin, Yo, |d 1924-
400 1  |a Zha, Liangyong, |d 1924-
400 1  |a Cha, Liang-yung, |d 1924-
400 1  |a Cha, Louis, |d 1924-
....

This record contains many weaknesses:

- It contains Latin scripts only; the Chinese scripts, i.e. 金庸 and 查良鏞, are not recorded.
- "Jin, Yong" is selected as the established heading by the Library of Congress, while other authority control agencies may prefer to use his other names. For example, a Hong Kong library would use the Chinese script in its catalog.

Presently, MARC21 can only tackle part of these problems. It is possible to encode multi-script by adopting the UCS/Unicode specification (Library of Congress, 2000). Position 9 in the Leader would contain the value "a" to indicate that the metadata is in UTF-8. MARC21 also offers two markup models for multi-script records, known as model A and model B. The following shows how the Chinese script is handled in these two models:
Model A

001 oca00560270
....
100 1  |6 880-01 |a Jin, Yong, |d 1924-
880 1  |6 100-01 |a 金庸, |d 1924-
400 1  |a Chin, Yung, |d 1924-
400 1  |a Kim, Dung, |d 1924-
400 1  |a Kin, Yo, |d 1924-
400 1  |6 880-02 |a Zha, Liangyong, |d 1924-
880 1  |6 400-02 |a 查良鏞, |d 1924-
400 1  |a Cha, Liang-yung, |d 1924-
400 1  |a Cha, Louis, |d 1924-
....

Model B

001 oca00560270
....
100 1  |a Jin, Yong, |d 1924-
400 1  |a Chin, Yung, |d 1924-
400 1  |a Kim, Dung, |d 1924-
400 1  |a Kin, Yo, |d 1924-
400 1  |a Zha, Liangyong, |d 1924-
400 1  |a 查良鏞, |d 1924-
400 1  |a Cha, Liang-yung, |d 1924-
400 1  |a Cha, Louis, |d 1924-
700 1  |a 金庸, |d 1924-
....

By adopting tag 880 and subfield 6 in model A, it is possible to: make 金庸 parallel to the established form "Jin, Yong"; and maintain a link between 查良鏞 and its Pinyin form "Zha, Liangyong". MARC21 specifies that the transliteration is stored in the regular tag with its vernacular forms stored in multiple 880 tags. With a slight modification to MARC21 by exchanging the regular tag and tag 880, model A will be able to link all transliterations of the same name form:

100 1  |6 880-01 |a 金庸, |d 1924-
880 1  |6 100-01 |a Jin, Yong, |d 1924-
880 1  |6 100-01 |a Chin, Yung, |d 1924-
880 1  |6 100-01 |a Kim, Dung, |d 1924-
880 1  |6 100-01 |a Kin, Yo, |d 1924-

In model B, tag 700 (established linking entry) is used to store the Chinese script 金庸, making it an alternate established form. However, model B fails to maintain the parallel relationship between multiple transliterations of the same name. The two fields for 查良鏞 and "Zha, Liangyong" are not linked. Therefore, one can fall back from model A to model B, but not from model B to model A. Although model A allows better linking of the vernacular script with its transliterations, neither library automation vendors nor bibliographic utilities such as OCLC are eager to support it for authority control. None of these models are sufficient for use in the global setting.

Three areas in the MARC21 authority format could benefit from enhancement. Firstly, 1XX should be repeatable, so that instead of depending on 7XX, more than one name can be coded in 1XX as established forms. Secondly, better linking capability between related forms of the name should be developed to avoid using tag 880 with subfield 6. This is because library automation vendors are reluctant to implement tag 880 for authority control, a task that requires substantial effort. Thirdly, there should be a place to hold the multi-lingual attributes of each name, including information about the script, language and romanization scheme. Although effort is needed to identify these attributes, and it is in some cases not an easy task, such information is particularly helpful in differentiating names in a multi-lingual environment.

From MARC to XML

XML (eXtensible Markup Language) is a simplified standard of SGML. It allows plain text data to carry not just layout information, but also semantic structure. Since its official release in early 1998 by the World Wide Web Consortium (W3C), XML has rapidly been adopted and implemented by many industries.
For the past few years, many initiatives and projects that make use of XML for library metadata have emerged (e.g. bibliographic, authority, holdings, patron records). MARC has been widely used as a markup language for library metadata since the 1960s. MARC21, UNIMARC and their variations can be considered as MARC's DTD (Document Type Definition) or markup schema. Due to the similarity between markup concepts in XML and MARC, it is feasible to use XML to describe a MARC record. By converting MARC to XML, library applications would be able to exchange library metadata in the emerging XML format and avoid working with the proprietary MARC interchange format, known as ISO2709. In addition, using XML, XSLT (Extensible Stylesheet Language Transformation) and other Web technologies, it is possible to do the following:

- create library metadata once and publish it in different formats;
- view library metadata directly from Web browsers, search engines, and potentially, library automation systems, without conversion;
- convert robustly between XML and MARC formats without data loss; and
- resolve many problems that were inherent with the MARC format.

In 1995, the Library of Congress began to look into the feasibility of using SGML to encode USMARC (a previous version of MARC21) data (Library of Congress, 2001), resulting in the subsequent release of MARC DTDs and conversion software. In 1999, Lane Medical Library at Stanford University released its XMLMARC software, a byproduct of its Medlane Project, for public use (Lane Medical Library, 2002). Based on its in-house developed DTDs for bibliographic and authority formats, this Java-based XMLMARC program offers conversion of MARC21 to XML. Many library automation vendors also developed their own DTDs for their MARC data. For example, Innovative Interfaces Inc. has its own DTDs to handle various record types.

All these projects are defining their own XML schema (DTDs). However, to facilitate wider acceptance, it is essential that a globally agreed upon XML schema for MARC metadata be developed. This global MARC-XML standard should closely follow the MARC structure. In addition, it should allow for variations, changes and enhancements in the content designation. Unlike the DTDs developed by LC and the Lane project, which are tightly based on MARC21, the global MARC-XML elements and attributes should be independent of regional content designation standards. To achieve this, Figure 1 shows the proposed namespace.

In the example, <marc> and </marc> mark the beginning and ending of a MARC record. <fd> and </fd> mark the beginning and ending of a MARC tag. <sf> and </sf> mark the beginning and ending of a subfield. Elements fd and sf contain a number of attributes. The name attribute defines the MARC tag name or the subfield code. ind1 and ind2 are the two indicators of the MARC tag. There should be additional attributes for fd and sf, to encode information not found in MARC. For instance, we can have:

- id – to give a unique id to each MARC tag and to allow for linking among MARC tags;
- script – to encode the language, the script and the romanization scheme used in the MARC tag;
- label – to describe the MARC tag, etc.

Attributes for the marc element can also be defined to encode record level information found in the leader and the controlled fields (e.g. the 00X tags in MARC21).
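Figure 1 itself is not reproduced in this copy of the text (its caption is retained below). Based on the element and attribute descriptions just given, a record in the proposed namespace might be sketched as follows. The id numbering follows the linking convention described in the next section, and the script token latin.Chinese.pinyin follows the script.language.romanization pattern used later in the article; the han.Chinese token, the label wording and the indicator values are assumptions for illustration only:

<marc>
  <!-- one fd element per MARC tag; the name attribute holds the tag number -->
  <fd name="100" ind1="1" ind2=" " id="1.1" script="latin.Chinese.pinyin" label="Pinyin form of the pen name">
    <sf name="a">Jin, Yong,</sf>
    <sf name="d">1924-</sf>
  </fd>
  <!-- the shared id stem (1.x) links this field to the form above -->
  <fd name="100" ind1="1" ind2=" " id="1.2" script="han.Chinese" label="Chinese script form">
    <sf name="a">金庸,</sf>
    <sf name="d">1924-</sf>
  </fd>
</marc>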
Figure 1. Proposed namespace.

Global name access control in XML – an experiment

With the adoption of XML, it is possible to expand authority control to access control. The idea of access control, as opposed to authority control, is to make obsolete the notion of "authority" so that variant forms of a name are equal in status and users are able to select any of them for searching, retrieval and display. Based on the proposed MARC-XML schema and the MARC21 specification, the name access control metadata for "金庸" in XML is shown in Figure 2. Note that all transliterations for the name "金庸" are assigned a value of "100" for their name attribute. And for linking, the id attributes of these fd elements have the same number stem (e.g. 1.1, 1.2, ...). Similar assignments are applied to the name "查良鏞". Also note that the multi-lingual information is stored in the script attribute, in the form of:

"script.language.romanization"

Figure 2. An example of name access control metadata for 金庸 in XML.

A feasibility study on the above MARC-XML format was conducted at the Hong Kong University of Science and Technology (HKUST) Library in the summer of 2001. A prototype system for a Global Name Access Control Metadata Repository was developed. The system was built on the XML-based Tamino Server, using the Perl programming language and XSLT stylesheet transformation to generate search results and displays. Figures 3-6 show MARC21 extended pages of the full record, and sample screens of the system: the main page, the search results page, and the public display. Note that in Figure 3, the prototype system is capable of converting the MARC-XML metadata to various MARC21 formats, including model A and model B. It should also be possible to convert the MARC-XML format to the UNIMARC format.

It is necessary to extend MARC21 so that the globalization attributes contained in the MARC-XML format can be maintained after the mapping. This can be achieved by enhancing subfield 8 (Field Link Control Subfield) for storing the id and the script attributes, for example:

|8 1.2\s\latin.Chinese.pinyin

"s" (script) is a new field link type that differentiates the multi-lingual attributes of the linked fields. See Figure 6 for the use of subfield 8.

MARC-XML offers other added-value features. It will be possible to implement Barnhart's (1996) idea of reducing data content redundancy in a name access control record by separating and linking data elements such as uniform title and dates associated with the name from the name elements. With sufficient globalization information encoded in the MARC-XML name metadata, a virtual international authority file as proposed by Tillett (2001) can become a reality. It is feasible to have globally distributed repositories, offering name access control metadata as Web services, for access by various client applications such as OPACs (Online Public Access Catalogs), authority control modules, bookstore catalogs and rights management systems.

SOAP communication with the repository

SOAP (Simple Object Access Protocol) (W3C, 2001) is an emerging standard of the W3C. It is a simple protocol for information exchange in a distributed environment.
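Purely as an illustration of the protocol's shape (the actual messages from the experiment appear in Figure 8, which is not reproduced in this copy), a SOAP 1.1 request asking a repository for records matching a name might look like the following sketch. Only the envelope markup is standard SOAP; the body element and its namespace are invented for this example:

<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <!-- searchName and its urn are hypothetical, not taken from the article -->
    <searchName xmlns="urn:name-access-control-demo">
      <query>Jin, Yong</query>
    </searchName>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The repository's response would carry the matching MARC-XML records inside a corresponding response body.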
Using this standard, the prototype name access control metadata repository can be published as a Web service. A Web service is a collection of functions that are published on the Web for use by other programs (Glass, 2000). To verify the SOAP communication concept, the prototype repository was enhanced as a SOAP node. Another SOAP application, pretending to be an INNOPAC WebPAC interface that communicates with the repository for author search assistance, was also developed. Figure 7 shows the SOAP messaging flow of this setup.

We chose to use the SOAP technology instead of the Z39.50 standard in this project. Although Z39.50 is a well-established information retrieval standard for library applications, it was designed before the World Wide Web era. Because of the steep learning curve for Z39.50, developers need to invest extra effort in order to deploy and support Z39.50 systems. On the contrary, SOAP uses the two common standards XML and HTTP for messaging and transport. Because it introduces no new technology and is simple to follow, SOAP can quickly be deployed for distributed applications requiring remote procedure calls. Driven by the advantages of these Web technologies, a group of Z39.50 implementers has begun to study the integration of XML, HTTP and SOAP into the Z39.50 standard (ZNG Initiative, 2001).

Figure 8 shows some SOAP messages between the INNOPAC and the repository. It is desirable that a globally agreed-upon profile for these kinds of SOAP request and response messages be formed. Once the specification is in place, distributed authority repositories can be published as compatible Web services, serving MARC metadata in XML format. Similar to the idea of Z39.50, client applications can then communicate with selective repositories worldwide for their name access control.

Conclusion

There are limitations and problems in MARC21 for name access control at the global level. These problems can be resolved with the help of XML and related technologies. The proposed MARC-XML schema can crosswalk with MARC21 and any other MARC content designation versions, such as UNIMARC. With the use of SOAP, XML-based name access control metadata can be easily published as Web services. If major authority files, such as the Library of Congress name authority file, are converted to the proposed XML format and are enabled as SOAP nodes, a true global name access control environment will be achieved. To make the above scenario happen, a global effort should be established to specify two standards, namely the XML schema for the MARC format and the SOAP messaging profile for bibliographic information retrieval.

Figure 3. Main page of the metadata repository system.
Figure 4. Search results page of the metadata repository system.
Figure 5. OPAC format view of the metadata repository system.
Figure 6. MARC21 extended format view of the metadata repository system.
Figure 7. SOAP request and response.

References

Barnhart, L. (1996), "Access control records: prospects and challenges", Authority Control in the 21st Century: An Invitational Conference, March 31-April 1, 1996, available at: www.oclc.org/oclc/man/authconf/barnhart.htm (accessed 29 January, 2002).

Glass, G. (2000), "The Web services (r)evolution, part 1: applying Web services to applications", available at: www-106.ibm.com/developerworks/webservices/library/ws-peer1.html (accessed 29 January, 2002).

Lane Medical Library, Stanford University Medical Center (2002), "Medlane XMLMARC", last updated 7 January, available at: xmlmarc.stanford.edu/ (accessed 29 January, 2002).

Library of Congress (2000), "MARC 21 specifications for record structure, character sets, and exchange media. Character sets: part 2. UCS/Unicode environment", January, available at: http://lcweb.loc.gov/marc/specifications/speccharucs.html (accessed 29 January, 2002).

Library of Congress (2001), "MARC SGML and XML", last updated March, available at: lcweb.loc.gov/marc/marcsgml.html (accessed 29 January, 2002).

Tillett, B.B. (2001), "A virtual international authority file", 67th IFLA Council and General Conference, August 16-25, 2001, available at: www.ifla.org/IV/ifla67/papers/094-152ae.pdf (accessed 29 January, 2002).

W3C (2001), SOAP Version 1.2 working drafts, available through the links in: http://www.w3.org/TR/2001/WD-soap12-part0-20011217/, December 17 (accessed 29 January, 2002).

ZNG Initiative (2001), "Z39.50 next generation", July, available at: www.loc.gov/z3950/agency/zng.html (accessed 29 January, 2002).

Implications for practitioners

This summary has been provided to allow a rapid appreciation of the significance of the content of this article. Browsers may then choose to read the article in toto, to derive full benefit from the authors' work.

It is often said that communications and technology have combined to make the world a smaller place. Such ideas, and with them the concept of the global village, are attractive. However, they are not much consolation to library professionals faced with issues of global name access control. The Library of Congress used the Machine Readable Cataloging (MARC) 21 standard to establish one of the world's largest name authority files. It was designed for Anglo-American libraries but its use has been far wider. So, it is when some worldwide users adopt the Library of Congress records and standards for non-Latin names that problems occur.

Chinese, Japanese and Korean are just three examples of scripts which create difficulties for MARC21 in Congress authority records, because of the limitations imposed by Latin transliterations. Yong Jin, the Chinese novelist and journalist, serves as a good example. That appellation would be the entry taking the Pinyin romanization of his pen name for martial arts fiction. One could make eight more entries, ranging from the Chinese script version of that name, to Japanese and Vietnamese romanizations, to his English name.

There are many weaknesses in the major fields of the authority record found in the OCLC database, and MARC21 can at the moment handle only some of them. The writer provides two different models which find ways of addressing some of the issues. Among them is the Established Linking Entry created by one model. This stores the Chinese script and makes it an alternate established form. The other allows better linking of the vernacular script with its transliterations. However, bibliographic utilities such as OCLC, to take just one example, would not want to support it for authority control. Neither model is sufficient for use in a global setting; something more radical is needed.

Extensible Markup Language (XML) and related technologies can help.
Figure 8. SOAP messages between the INNOPAC and the repository.

XML has been rapidly adopted and implemented by many industries during the last four years. XML and other Web technologies offer some answers, including conversion between XML and MARC formats without data loss and with resolution of many problems inherent in MARC. What we need now is a globally agreed-upon use of these opportunities.

A MARC-XML feasibility study was conducted last year at the Hong Kong University of Science and Technology Library. It suggests that, through globalization information encoded in the MARC-XML name metadata, a virtual international authority file can become a reality.

Other areas for further investigation include Simple Object Access Protocol (SOAP), an emerging World Wide Web Consortium standard. SOAP carries the dual advantage of being easy to follow while introducing no new technology. It could serve MARC metadata in XML format, and its use could pave the way for XML-based name access control metadata to be published as Web services.

The ultimate aim has to be creation of a true global name access control environment. That requires a global effort to work towards the standards described here. That shrinking world or global village, which nevertheless is witness to an information explosion, demands nothing less.

(Précis supplied to MCB UP Limited by consultants.)

work_mz3kp2or25g3ppkuwt7cxp34uq ----

PII: 0098-7913(91)90021-A

"A DREAM UNFOLDING": A GUIDE TO SELECTED JOURNALS, MAGAZINES, AND NEWSLETTERS ON PEACE, DISARMAMENT, AND ARMS CONTROL

Grant Burns

Burns is a reference librarian at the University of Michigan-Flint Library. This article is based in part on his book The Nuclear Present, a guide to current literature on nuclear war, nuclear weapons, and the peace movement, to be published by Scarecrow Press.

Why talk "peace movement" in a world that has recently been described as seeing peace "breaking out all over," where "velvet revolutions" have deposed communist dictatorships throughout Eastern Europe, and where the prospect of a head-on nuclear "exchange" between the U.S. and the Soviet Union seems to be the stuff of memory? If the recent experience in the Persian Gulf is not a sufficient reminder that peaceful resolution of human conflict is scarcely an entrenched habit of the species, then brief perusals of such documents as Amnesty International's annual reports should relieve most readers of any unwarranted rosy feelings about peace on earth and good will prevailing among men, women, and children.
From Indonesia to Ethiopia, from the Philippines to El Salvador, thousands of people are being killed, tortured, and otherwise physically intimidated for political purposes. Crippled by overwhelming military demands, national budgets fail to meet basic civil needs. Arms merchants, Bob Dylan's "Masters of War," swoop down to satisfy the hardware hungers of any state that can ante up the cash for the latest hot new missile or tank. Peace may be a "dream unfolding," as Penney Kome and Patrick Crean say in Peace: A Dream Unfolding (Sierra Club, 1986), but large numbers of people are not yet a part of the dream. The peace movement exists to help make the dream real. In a world where the U.S. and the Soviet Union still possess some 50,000 nuclear warheads, and where, according to a recent Brookings Institution report, at least sixteen nations possess ballistic missiles with ranges of up to 1,500 miles, that elusive reality is in clear need of assistance.

Peace Periodicals

Talking about the periodicals of the peace movement requires some further reflection on what that movement is--and what it is not. The "Peace Movement" is not a monolithic, unified force with a single, clear objective, but a loose assembly of individual social, political, and religious movements with diverse concerns. The assembly is composed of local, regional, national, and international organizations, from the church group that meets down the street to the Council for a Livable World, Women's Action for Nuclear Disarmament, and International Physicians for Social Responsibility.

The peace movement is also composed of individuals who belong to no voluntary associations, but whose awareness of the destructiveness of war and other forms of institutionalized violence as tools for addressing social and political problems leads them to question and criticize policies related to these tools. Those who contribute to the peace movement may do so by making financial donations to groups like Pax Christi or the CCCO (formerly the Central Committee for Conscientious Objectors). They may contribute by taking part in mass demonstrations against their governments' use of force, whether in Afghanistan or Panama or South Africa or the Middle East. They may contribute by writing letters to their local newspapers, or by talking with friends and work acquaintances about peaceful approaches to national and global problems.

Yet the peace movement extends far deeper than any of those activities. It entails a commitment to ways of living that honor life at large. This commitment can manifest itself in solitary reflection on the shared goals and trials of humanity, in prayer, in the practical application of environmental awareness (for the ways of peace necessitate peaceful treatment of the planet every bit as much as a peaceful approach to other people), and in the education of one's children in enlightened thinking about war and violence.

What is the peace movement? Is it a current of mingled hope and realization issuing from the soul of humanity and manifesting itself in a thousand very different yet complementary ways? Is it the sign of a slowly dawning but relentless consciousness that survival--of the species and of the planet--depends on cooperation rather than conflict? It may be nice to think so. But whatever the peace movement's origins, it is by now far too varied in its people and its activities to permit simplistic definitions.
A Diverse Literature

Given the great diversity of the peace movement, it stands to reason that its literature is equally diverse. It is. The objectives of the peace movement involve far more than the mere absence of people trying to kill one another to achieve their goals, although that absence is the fundamental reason for the movement's existence. Periodicals advocate the causes of peace from many different perspectives informed by a wide variety of values and experience.

People come to the peace movement with religious and philosophical motivations, with environmental concerns, with basic human compassion for the suffering of others, with legal and medical perspectives, even with enlightened business sense; the cynical slogan of the Vietnam War era, "War is good business; invest your son," is one whose irony many executives have come to recognize. (Many more, alas, have not; General Electric may claim to "bring good things to life," but it is also one of the nation's premier nuclear weapons contractors, and as such has been for several years the target of a nationwide boycott by the peace movement, as well as the object of various direct protest actions.)

Since Vietnam

The Vietnam War did more than any other recent event to stimulate the development of peace (or at least antiwar) publishing in the United States, especially through the briefly-flourishing underground press movement. Since the war's end, the periodicals of the peace movement have proliferated and diversified. Further, they have strengthened their theoretical underpinnings and have broadened their scope, moving beyond the gut issue of opposition to a specific war to address the multiple issues of peace, justice, and freedom.

With the election of Ronald Reagan and the U.S. military buildup of the early 1980s, and with the intense focus early in the Reagan years on renewed fears of nuclear war, the past decade witnessed an impressive resurgence and maturing of publishing on issues of war and peace. It was a resurgence particularly strong at the grassroots level, the level of citizen action, suggesting that the "slowly dawning but relentless consciousness" is a force real and insistent.

The most dramatic example of the grassroots peace movement in the 1980s was the Nuclear Freeze movement, a movement that did as much as anything to bring mainstream legitimacy to nuclear weapons protest, even to the extent of a congressional resolution in its favor. The freeze movement was eventually manipulated and co-opted by the Reagan administration's assertions about making nuclear weapons "impotent and obsolete" through the Strategic Defense Initiative, and about their complete abolition, but the movement's accomplishments were real and significant.

The Persian Gulf War brought some soul-searching to many peace movement activists and sympathizers; it is a search that can be traced in grassroots periodicals. The overwhelmingly sympathetic mass media treatment of the Bush administration's pursuit of the war led some long-time peace advocates to a cringing support of the U.S.-led war effort; other activists, who maintained a strong opposition to the war, nevertheless were sidetracked into devoting an embarrassing amount of time to showing "support for the troops."
With the illusion of what General Colin Powell called in a post-war speech to the Veterans of Foreign Wars (VFW) "a clean win" having dissipated in the bloody aftermath of the Gulf ceasefire, there will be further discussion in the grassroots press about the maintenance of clear thinking regarding war as a necessary evil.

At any rate, it is ultimately to the grassroots periodicals that one must turn to sense the depth of emotional and intellectual commitment that comprises the peace movement. These publications can, if one is in the mood, prove almost overwhelming in the purity of their commitment to peace on earth. The depth of caring in these periodicals is often so intense, so profound, and so selfless that reading them can be an almost transcendent experience, carrying one into the very realm of passion felt by those who would "save the world."

The arguments against subscriptions to grassroots periodicals are well-known: they aren't indexed, they tend to be irregular, sometimes the editorial style seems more than a little homespun to those accustomed to the banal slickness of the mass newsweeklies. Forget those threadbare arguments. The library that fails to make such literature available to its readers, and I am ashamed to say that most libraries fail on this point, is depriving them of the opportunity for a powerful intellectual and emotional adventure, one that has the potential to be life-changing.

Some Notes on the Selected List

The periodical list here is not comprehensive. It omits a lot of newsletterish publications, but it includes others for flavoring. It makes no real effort to cover periodicals published outside North America, although some do show up. Some titles that I tried to obtain for review eluded my grasp.

The titles appearing in this article came to light through a combination of circumstances, some deliberate, some fortuitous. I first began paying serious attention to peace periodicals in the early 1980s. Some titles here I have known and admired for a number of years, others are new to me. For The Nuclear Present, forthcoming from Scarecrow Press, I annotated a substantial number of peace movement periodicals, along with other titles dealing with nuclear issues from military and political perspectives. This task entailed use of such standard periodical guides as Ulrich's as well as recent reference books noting likely periodicals. Many of those titles appear in this article. My intention here is to present titles of potential use in almost any U.S. library. The list omits some interesting titles, such as some religious denominational publications, because their focus is too constricted for a general audience.

Subscription prices, dates of first publication, circulation, ISSNs, and OCLC numbers are noted when ascertained. If there are two subscription costs, the first is for individuals, the second for institutions. Given today's rapid changes in periodical prices, the figures listed here cannot be expected to prevail for long. When identified as covered by indexing or abstracting services, such tools are noted.

Titles are grouped for reader convenience in subject categories. Treating periodicals in this fashion is usually a risky game, for it compels forcing some against their will into boxes that don't really fit their natures. Nuclear Times, for example, noted in the "Professional" section, is at the same time very much a product of grassroots sensibilities, as is Ground Zero, located in the "Religious" division.
No one thinking straight attempts to arrange periodicals in anything but alphabetical order. I went ahead and did it, anyhow. Incomplete though the list below is, few libraries even begin to offer their users a healthy sample of the periodicals it covers. Perhaps some will be inspired by this article to take some corrective measures (see sidebar 1).

GENERAL TITLES WITH PEACE FOCUSES

Greenpeace Magazine (see figure 1). Edited by Andre Carothers. 1436 U St., NW, Washington DC 20009. 6/year. $20. 1981-. OCLC 16718179. ISSN 0899-0190. Circ. 800,000. Indexed: Alternative Press Index.

Next to the Sierra Club, Greenpeace is probably the world's best-known environmental organization. Its magazine is one of the leading journals of environmental activism, providing coverage on a broad range of issues. One of the organization's persistent interests has been pollution from nuclear weapons and nuclear power operations; its ship, the "Rainbow Warrior," was the target of a lethal 1985 terrorist bombing by French government functionaries in New Zealand. Among the articles in the magazine, one finds frequent pieces on peace-related topics. Early 1991 issues, for example, featured reports on Greenpeace actions in the Soviet nuclear weapons test territory in the Barents Sea and on French nuclear testing at Moruroa. A highly desirable addition to any library's peace and environmental offerings.

Figure 1: Greenpeace Magazine January/February 1991

New Outlook. Edited by Robert Berls. American Committee on U.S.-Soviet Relations, 109 11th St. SE, Washington, DC 20003. Quarterly. $25. 1990-.

New Outlook is the official journal of the American Committee on U.S.-Soviet Relations. This independent, nonpartisan group established in 1974 dedicates itself "to strengthen official and public understanding of U.S.-Soviet relations by providing accurate information and expert analysis." Only one issue could be reviewed for this guide; that one, the Winter 1990/91 number, contained in its 90 pages an extensive analysis on "Reform and the Soviet Armed Forces," addressing the U.S.-Soviet strategic balance, the Soviet defense conversion process, "The Troubled Soviet Armed Forces," and other topics. The report included pertinent Soviet documents, such as a public opinion poll from August 1990, indicating that only 12 percent of Soviet citizens believed that a threat of military attack against the Soviet Union then existed. The periodical reflects a thorough journalistic rather than a scholarly approach. It should prove a useful source of information and opinion.

Sidebar 1: From the Belly of the Beast

If a library cannot pretend to offer an adequate collection of periodicals on peace without including a decent sample of representative grassroots titles, it also cannot do the job unless it covers the other side of the coin with periodicals issuing from and devoted to the military-industrial complex. I'll not take up space to describe the following listings, but some titles worth carrying are:

Air Force Magazine. Edited by John T. Correll. Air Force Association, 1501 Lee Hwy., Arlington, VA 22209. Monthly. $21. 1942-. OCLC 5169825. ISSN 0730-6784. Circ. 235,000. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; America: History and Life; Historical Abstracts; International Aerospace Abstracts.

Air University Library Index to Military Periodicals. Edited by Emily J. Adams. Air University Library, Maxwell AFB, AL 36112-5564. Quarterly; cumulated annually. Free to libraries. 1949-. OCLC 2500050. ISSN 0002-2586. Circ. 1,500.

A subject index to approximately 80 English-language military and aeronautical periodicals. The substantial book review index could be helpful in locating reviews not indexed in other sources and in identifying the books themselves. A single issue runs to approximately 160 pages; publication lags a year or so behind the period being indexed. Was Air University Periodical Index until 1962.

Airpower Journal. Edited by Col. Keith W. Geiger. Air University, Maxwell AFB, AL 36112. Quarterly. $9.50. 1947-. OCLC 16481534. ISSN 0897-0823. Circ. 20,000. (U.S. federal depository serial D 301.26/24:2/4). Indexed: Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; America: History & Life; Abstracts of Military Bibliography; Engineering Index Monthly; Historical Abstracts; Index to U.S. Government Periodicals; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies.

Comparative Strategy. Edited by Richard B. Foster. Taylor & Francis, 1900 Frost Rd., Ste. 101, Bristol, PA 19007. Quarterly. $89. 1978-. ISSN 0149-5933. Indexed: Abstracts of Military Bibliography; American Bibliography of Slavic & East European Studies; Current Contents; International Political Science Abstracts; PAIS; Peace Research Abstracts; Social Science Citation Index.

Defense Analysis. Edited by Martin Edmonds. Pergamon Press Journals Div., Maxwell House, Fairview Park, Elmsford, NY 10523. Quarterly. $100. 1985-. OCLC 10490881. ISSN 0743-0175. Indexed: Current Contents.

Global Affairs. Edited by Charles M. Lichenstein. International Security Council, 1155 Fifteenth St. NW, Suite 502, Washington, DC 20005. Quarterly. $24. 1986-. OCLC 12954805. ISSN 0886-6198. Circ. 16,600.

International Security. Edited by Steven E. Miller. Harvard University Center for Science and International Affairs, 79 John F. Kennedy St., Cambridge, MA 02138. Quarterly. $25/$65. 1976-. OCLC 2682087. ISSN 0162-2889. Circ. 5,500. Indexed: A.B.C. Pol Sci; Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; America: History & Life; Future Survey; Historical Abstracts; International Bibliography of the Social Sciences; International Political Science Abstracts; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Political Science Abstracts; Risk Abstracts; Social Science Citation Index.

Jane's Defence Weekly. Edited by Peter Howard. Sentinel House, 163 Brighton Rd., Coulsdon, Surrey CR5 2NH, England. U.S. subscriptions: 1340 Braddock Pl., Alexandria, VA 22314. Weekly. $145. 1980-. OCLC 10366120. ISSN 0265-3818. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology.

National Defense. Edited by F. Clifton Berry, Jr. American Defense Preparedness Assoc., 2101 Wilson Blvd., Ste. 400, Arlington, VA 22201. 10/year. $35. 1920-. OCLC 4867930. ISSN 0092-1491. Circ. 40,200. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; Chemical Abstracts; Engineering Index; Predicasts Overview of Markets and Technologies.

Strategic Review. Edited by Walter F. Hahn. U.S. Strategic Institute, PO Box 618, Kenmore Sta., Boston, MA 02215. Quarterly. $15. 1973-. ISSN 0091-6846. Circ. 3,500. Indexed: Abstracts of Military Bibliography; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; Chicano Periodical Index; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Social Sciences Index.

Other useful military titles can be identified in Katz's Magazines for Libraries and in Michael E. Unsworth's "Professional Military Journals: An Overlooked Resource" (Serials Librarian 10 (Summer 1986): 143-54.) These journals and magazines will give the reader interested in peace and related issues some highly useful and enlightening perspectives on military thinking, the weapons industry, and the political connections between the two. As John Adams said, "I must study politics and war that my sons may have liberty to study mathematics and philosophy." Today one hopes that Adams would include his daughters in this statement. At any rate, the advocates of peace must expose themselves to the arguments of warriors and their kin. To do otherwise is to go as a sheep among wolves, with the likely result being mutton. Readers will also want to remain alert to the pertinent work that appears from time to time in such titles as Foreign Affairs, Foreign Policy, World Politics, Orbis, and many others that emphasize international relations.

Nuclear Times. Edited by John Tirman. 401 Commonwealth Ave., Boston, MA 02215. Quarterly. $18. 1982-. OCLC 8771147. ISSN 0734-5836. Circ. 60,000. Indexed: Alternative Press Index; Human Rights Internet Reporter.

Nuclear Times has evolved to serve as a wide-angle guide to the antiwar and antinuclear movements. It retains a primary focus on nuclear weapons and nuclear war issues, but also features commentary and assessments concerning political and military hotspots around the world (e.g., the Soviet crackdown in the Baltics, militarism in Japan, the Persian Gulf) that harbor the potential for far wider conflict. On the nuclear front, the magazine has recently featured articles on proliferation, nuclear deterrence in the context of the declining Cold War, and nuclear test protests in the Soviet Union. Contributors are journalists, scholars, and activists. Contains a good list of organizational resources keyed to each issue's articles. Belongs in all libraries.

Positive Alternatives (see figure 2). Edited by Jim Wake. Center for Economic Conversion, 222 View St., Suite C, Mountain View, CA 94041. Quarterly. $35. 1990-. Circ. 7,500.

This journal's self-description says: "Positive Alternatives is the primary publication of the Center for Economic Conversion and is the journal of the economic conversion movement. It covers all facets of economic conversion from editorials on alternatives to military dependency, to interviews with key actors in the movement, to case studies, book reviews, reports from the field and updates on CEC's work."
Although primarily devoted to promoting economic conversion in the U.S., the journal also turns to other regions, such as Eastern Europe, struggling with the burden of wasteful investment in military projects. Includes one or two book reviews and an annotated resources listing. If this new title survives, it could prove of real corrective value to opposition to the federal government's military base-closing plans.

Figure 2: Positive Alternatives Vol. 1, No. 1, Fall 1990

GRASSROOTS

Civilian-Based Defense: News & Opinion. Edited by Melvin G. Beckman, Philip Bogdonoff, and Robert Holmes. PO Box 31616, Omaha, NE 68131. 6/year. $15. 1982-. ISSN 0886-6015. Circ. 750.

This newsletter is intended as a source of information on non-violent civilian-based defense (CBD) as an alternative policy for national defense, and as a vehicle for the exchange of international news, opinion, and research on CBD. It features some interesting articles by international contributors on CBD, a form of "defense" which has been discussed for many decades, and which was reintroduced in the 1950s by such figures as Sir Stephen King-Hall (Defence in the Nuclear Age, Fellowship of Reconciliation, 1961), who proposed CBD as the best way to oppose Soviet expansion. The most recent issue focused on the relationship between CBD and the "Velvet Revolutions" that took place in Eastern Europe (some of which, of course, were more velvety than others). Resource notes and occasional substantial book reviews heighten the title's utility. Something of a hybrid between a grassroots and scholarly effort; the spirit is of the former, but the academic qualifications of many contributors lend it an air of the latter.

Fellowship. Edited by Virginia Baron. Fellowship of Reconciliation, 523 N. Broadway, Box 271, Nyack, NY 10960. 8/year. $15. 1934-. OCLC 1569084. ISSN 0014-9810. Circ. 8,000. Indexed: Human Rights Internet Reporter; PAIS; Peace Research Abstracts.

Fellowship contains peace movement news from around the world, news of Fellowship activities, personal accounts of peace activists (such as Joseph J. Fahey's "From Bluejacket to Pacifist" in the March 1991 issue), analysis of military events, and discussion of more subtle forms of violence, such as homelessness and war toys. Includes approximately a half-dozen book reviews in each issue, ranging from one or two paragraphs to several hundred words.

Global Report: Progress Toward a World of Peace With Justice. Edited by Richard Hudson. Center for War/Peace Studies, 218 E. 18th St., N.Y., NY 10003. Quarterly. $35 (membership). 1977-. ISSN 0730-9112. Circ. 2,500.

The 2,500-member Center for War/Peace Studies advocates a much-enhanced role for the United Nations in achieving and maintaining international peace. The centerpiece of the organization's current efforts is a campaign to make the U.N. the top level of a global federal system, with considerably-strengthened power to make and enforce decisions. Organizational membership brings this newsletter, along with other materials, such as Benjamin Ferencz's and Ken Keyes, Jr.'s Planethood: The Key to Your Future. Global Report provides in its four pages news and features relevant to the center's objectives. Recent issues have included an interview with Andrei D. Sakharov and critical discussion of the Bush administration's use of the U.N. during the Persian Gulf War.
INFACT: Nuclear Weaponmakers Campaign Update. INFACT National Field Campaign, PO Box 3223, S. Pasadena, CA 91031. Quarterly. $15. 1986-.

For five years INFACT has been leading a consumer boycott of General Electric, one of the nation's leading nuclear weapons contractors. This brief newsletter reports on progress in the campaign and on GE activities on the nuclear front, including current work and historical events, such as the company's involvement in the notorious 1949 release of radioactive iodine into the atmosphere from the Hanford nuclear facility. INFACT has published numerous materials concerning GE and the boycott. The $15 charge is more a campaign donation than a subscription fee. A significant grassroots contribution to the nuclear debate.

The Nonviolent Activist: The Magazine of the War Resisters League. Edited by Ruth Benn. War Resisters League, 339 Lafayette St., New York, NY 10012. 8/year. $15/$25. 1984-. ISSN 8755-7428. Circ. 15,000. Indexed: Alternative Press Index.

A 24-page magazine published by the nation's oldest secular pacifist organization, The Nonviolent Activist contains political analysis from a pacifist perspective, feature articles, and information relating to nonviolence, feminism, disarmament, international issues, resistance to registration and the draft, war tax resistance, and other topics. Although it occasionally runs articles on nuclear subjects (such as "Hiroshima and Nagasaki Remembered" in June 1990), the magazine's scope attempts to cover the whole of the peace and anti-militarist movement insofar as possible within its available space. Includes one or two book reviews per issue, up to 600 words long. Its value is enhanced by annual indexing.

The Nuclear Resister. Edited by Jack and Felice Cohen-Joppa. PO Box 43383, Tucson, AZ 85733. 8/year. $18/10 issues. 1980-. ISSN 0883-9875. Circ. 1,000.

This 16-page tabloid "works to foster a wider public awareness of imprisoned nuclear resisters, their motivations and their action." It facilitates a support network for such activists in the U.S., Canada, and Great Britain. The paper reports on arrests and jailings of civil disobedients, and provides analysis and commentary on underlying issues, as in the article on "The Militarization of the Academic Community" in the 21 September 1990 issue. Features statements of resisters themselves and a listing of forthcoming nonviolent direct actions at nuclear sites. An excellent source of information and opinion on this most committed segment of the peace movement, particularly in view of the mass media's almost total disregard in this area.

Nukewatch Pathfinder. The Progressive Foundation, PO Box 5658, Madison, WI 53701. Quarterly. $15.

Nukewatch (the informal name of the Progressive Foundation) came into being in 1979 following a federal court's decree restraining The Progressive from publishing information about the U.S. nuclear weapons program. The foundation, founded by the magazine, has developed into an independent action group working for peace and justice. Its 4-page tabloid newsletter reports on organizational activities, including the Nukewatch "H-Bomb Truck Watch," which monitors Department of Energy convoys that transport nuclear warheads and their components throughout the U.S., and the Missile Silo Campaign, designed to map the 1,000 ICBM missiles and 100 launch control centers in the Midwest and Great Plains. A good source of information on the secular arm of the grassroots peace movement.
The Objector: A Journal of Draft and Military Information. Edited by Jeff Schutts. PO Box 42249, San Francisco, CA 94142. 6/year. $15/$20. 1980-. OCLC 7534019. ISSN 0279-103X. Circ. 3,000.

The Objector covers in 12-16 pages Selective Service laws and activities, military regulations and life in the military, issues of conscientious objection, anti-militarism, draft registration and resistance, and other information of concern to those facing "compulsory" military service. Published by the CCCO, an agency founded in 1948 as the Central Committee for Conscientious Objectors. Includes news of life in the Soviet and other foreign military establishments. Strongly recommended as an information tool in any environment where young men and women ponder their futures and search their consciences.

On Beyond War. Edited by Mac Lawrence and Marilyn Rea. Beyond War, 222 High St., Palo Alto, CA 94301. 10/year. $25. 1984?-. ISSN 0887-9567.

Beyond War is an eight-year-old educational foundation dedicated to building a cooperative, sustainable world. It is active in a number of areas, including citizen diplomacy efforts with people in the Soviet Union and proposing initiatives for global security and cooperation. The 8- to 12-page newsletter contains discussions of current conflicts, ideas for positive change, interviews with peace activists and scholars, commentary on socio-psychological aspects of international relations, and occasional book reviews. Beyond War's dominant message is that all humanity shares the same vital need to preserve the planet; it believes that recognizing the common interest everyone has as a citizen of planet earth in its preservation is a logical and necessary step toward achieving preservation of the planet and humanity. "The Earth and all life are interdependent and interconnected," says one Beyond War document. "The well-being of each individual is inextricably linked to the well-being of the whole. All is one." The spirit here is grassroots, the execution professional.

Peace and Freedom. Edited by Roberta Spivek. Women's International League for Peace & Freedom, 1213 Race St., Philadelphia, PA 19107. 6/year. $10. 1941-. OCLC 2265762. ISSN 0015-9093. Circ. 11,000. Indexed: Alternative Press Index.

Billing itself as "the only U.S. magazine devoted solely to the women's peace movement," each issue of P & F covers a full spectrum of international peace-related issues, from advocacy of a comprehensive nuclear test ban to children's books, racism, sexism, disarmament, peace education, and WILPF activities. Useful to any peace activist or researcher, Peace and Freedom should be a basic item on any woman peace worker's periodical shelf.

Peace Brigades International Project Newsletters. Peace Brigades International, Box 1233, Harvard Sq. Sta., Cambridge, MA 02238. Monthly. $25. 1989-.

Peace Brigades International (PBI) sends unarmed international peace teams, on invitation, into areas of repression or conflict, acting on the belief that "citizens can act boldly as peacemakers when their governments cannot." The newsletter provides information about the activities of the teams and the organizations with which they work, as well as background information on the situations in the countries where PBI has projects. Formerly published separately by country, the newsletter began including information about all projects (Central America, Southeast Asia, North America) in summer 1991.
An effective tool for staying informed about troubled local situations in countries and regions with the potential to serve as catalysts for broader violence and military confrontation.

Peace Conversion Times. Edited by Will Loob. Alliance for Survival, 200 N. Main St., Suite M-2, Santa Ana, CA 92701. 9/year. $25. Circ. 8,000. 1983-.

The Alliance for Survival is a grassroots group whose major goals include the abolition of nuclear arms and power, reversal of the arms race, and an end to military interventions. It is primarily active in the city of Los Angeles and in Orange County, California. Peace Conversion Times is an 8-page tabloid featuring organizational news and articles on narrower aspects of the broad goals noted above. Included here as a good example of a local peace periodical produced on a slender budget.

Peace Magazine. Edited by Metta Spencer. Canadian Disarmament Information Service, 736 Bathurst St., Toronto, Ont. M5S 2R4 Canada. 6/year. $20. 1985-. ISSN 0826-9521. Circ. 8,000.

"To inform, enlighten, and inspire. To save Earth from the scourge of war." With this motto, Peace Magazine addresses a wide variety of issues and readers. The 32-page magazine endorses multilateral disarmament, but otherwise takes no editorial position and presents a variety of views. It contains a listing of upcoming peace events in Canada, notes on the Canadian peace movement, reviews of books, films, and videos, letters from abroad, and other regular features. Recent issues have offered articles on the Persian Gulf War, the General Electric boycott, Greenham Common, nuclear accidents, and many other relevant issues. A well put-together magazine that will be useful to peace activists and scholars in the U.S., and essential to those in Canada.

Peace Reporter. Edited by Kathleen J. Lansing. National Peace Institute Foundation, 110 Maryland Ave. NE, Suite 409, Washington, DC 20002. Quarterly. $35 (membership). 1986-. ISSN 1049-0779.

Peace Reporter is a six-page newsletter providing information on the growth and development of the United States Institute of Peace, activities and programs of the foundation, and other articles on peacebuilding, peacemaking, and conflict resolution. A recent issue contained articles on conflict management seminars in Armenia, establishment by the Institute of Peace of a Middle East program, networking notes, and other information. The foundation is an independent organization, not affiliated with the U.S. Institute of Peace, although its activities in behalf of the Institute helped enable its creation. Membership opens opportunities to meet in Regional Council workshops and seminars on peacemaking and conflict resolution.

RECON. Edited by Chris Robinson. RECON Publications, PO Box 14602, Philadelphia, PA 19134. 9/year. $15. 1973-. ISSN 0093-5336. Circ. 2,000.

This newsletter of approximately 14 pages "covers Pentagon activities around the world. RECON exposes little-known events and explains the reasons behind the mass-media headlines." Produced by volunteers, RECON reflects its editor's belief that what one reader calls "a goofy bunch of idealists" can help effect positive social change, in spite of the vast financial and political power of the military industrial complex. "We have faith that the change will come," says Robinson. RECON often publishes articles on nuclear resistance, nuclear weapons and warfare issues, and SDI. Includes eight to ten paragraph-long book and document reviews in each issue.

The Reporter for Conscience' Sake. Edited by David W. Treber. National Interreligious Service Board for Conscientious Objectors, Suite 750, 1601 Connecticut Ave., NW, Washington, DC 20009. Monthly. $20. 1940-. OCLC 2244974. ISSN 0034-4796.
This publication is an update on legislation and developments affecting conscientious objectors to participation in war. Each 8-page issue is likely to offer discussion of individual CO cases, commentary on military action, analysis of pro-military propaganda in the media, a number of brief book reviews and other leads to pertinent literature, coverage of congressional action, and more. A valuable source to help anyone understand contemporary conscientious objection to participation in war, but especially useful to those in a position to counsel young people concerned about the draft and what constitutes their "duty" to their country.

Space and Security News. Edited by Robert M. Bowman. Institute for Space and Security Studies, 5115 Hwy. A1A S., Melbourne Beach, FL 32951. Quarterly. $25. 1984-.

Editor Bowman, the author of Star Wars: A Defense Insider's Case Against the Strategic Defense Initiative (J.P. Tarcher, 1986), is a retired Air Force Lt. Colonel. He conducts an energetic campaign against the militarization of space and the continued funding of defense programs he considers wasteful and a threat to U.S. and global security. Each issue of his 8- to 16-page S & S News contains Bowman's analysis of global events and military programs, chiefly SDI. Like Thomas Liggett's World Peace News, Bowman's periodical reflects the thinking of a former military man who has seen a new light. He describes the publication as providing "an independent voice for the American people on space and other high-tech issues affecting national security.... We specialize in those areas where we feel the government has lied to the American people and their elected representatives to Congress. We 'Speak Truth to Power' on issues like 'Star Wars,' the KAL-007 shootdown, the Challenger explosion, nuclear testing, and the war against Iraq. We have vigorously opposed weapons in space since 1980." The format is homey (2-column, typed), the message urgent and clearly-presented.

Surviving Together: A Journal on Soviet-American Relations. Edited by Harriet Crosby, et al. Institute for Soviet-American Relations, 1601 Connecticut Ave., NW, Suite 301, Washington, DC 20009. 3/year. $25/$30. 1983-. ISSN 0895-6286. Circ. 6,000.

This journal's parent institute is a nonpartisan service organization working to improve Soviet-American relations through better communication, facilitating working relationships between individual Soviet and U.S. citizens, cultural exchanges, and other means. Surviving Together presents news and editorial opinion on U.S.-Soviet relations and chronicles exchanges between the two countries, especially private-sector contacts. Each 90-page issue's coverage is divided among approximately 20 subjects, such as health, education, world security, environment, city affiliations, and citizen diplomacy. It includes articles reprinted from other sources and those based on information retrieved from interested organizations. Both U.S. and Soviet sources are cited. An effective tool for keeping informed on healthy developments in U.S.-Soviet relations. Features a good number of resource and new book notes. Readable and exciting.

The Test Banner. American Peace Test, PO Box 26725, Las Vegas, NV 89126. Monthly. $10. 198?-.
American Peace Test is a grassroots group dedicated to nonviolent action to end the arms race. It advocates a comprehensive nuclear test ban as a first step towards disarmament, and engages in education and outreach to communities affected by nuclear weapons testing and the arms race. The organization's Testing Alert Network monitors U.S. and British tests at the Nevada Test Site and shares information on foreign tests with a global network of activists. The Test Banner reports both U.S. and international opposition to nuclear testing, including protests by Soviet citizens. The tabloid helps the reader keep up with a variety of testing issues, including environmental and legal matters. Readers seriously interested in participating in the movement for a comprehensive test ban will welcome access to this title.

WAND Bulletin. Women's Action for Nuclear Disarmament. PO Box B, Arlington, MA 02174. Quarterly. $30 (membership). Circ. 20,000. 1982-.

WAND was founded in 1980 by Dr. Helen Caldicott as a women's initiative to eliminate weapons of mass destruction and redirect military resources to human and environmental needs. WAND engages in congressional lobbying, grassroots organization, support of women congressional candidates, and other measures serving its objectives. The WAND Bulletin, an 8-page newsletter, includes notes from affiliates around the U.S. as well as discussions on a variety of political and military issues. A desirable addition to feminist and peace collections.

Washington Peace Letter (see figure 3). Washington Peace Center, 2111 Florida Ave., NW, Washington, DC 20008. Monthly. $25. 1963-. ISSN 1050-2823. Circ. 5,000.

An affiliate of the national grassroots network, Mobilization for Survival, the Washington Peace Center focuses on peace education and action in the metropolitan Washington area. The 8-page tabloid aims, in its editors' words, "to support the work of local progressive grassroots activists, and provide information on issues of local, national, and international importance." The paper concentrates on such issues as militarism, racism, sexism, homelessness, protection of the environment, homophobia, and economic justice, as well as the conventional peace issues of outright military confrontation. Occasional book reviews. In its broad spectrum of concerns, the Washington Peace Letter is emblematic of the contemporary peace movement's realization that institutionalized violence extends far beyond traditional ideas of "war" to include more subtle but still devastating affronts to the rights of both humanity and nature.

Figure 3: Washington Peace Letter December 1990

World Peace News: A World Government Report. Edited and published by Thomas Liggett. 300 E. 33d St., New York, NY 10016. 6/year. $20/3 years. 1970-. ISSN 0049-8130. Circ. 2,000.

Editor-publisher Liggett, a journalist and decorated World War II Marine Corps fighter pilot, dedicates WPN to "All the World-Government News That's Fit to Print and Almost Free of Cant, Hype and Twaddle." The tabloid's single, overriding interest is the objective of its subtitle, world government, and the sooner the better. The whole of WPN is given to short news notes and commentary, with occasional longer pieces, analyzing global affairs in light of that objective. It is relentlessly critical of efforts to preserve nationalism and the sovereignty of the nation-state; Liggett sees nuclear weaponry as the death knell--one way or another--of the present system of competing states. Each 8-page issue is full of information and opinion of interest to advocates of world government, contributed not only by Liggett but by other advocates of the rule of international law as well. WPN is currently campaigning for Czechoslovak President Vaclav Havel's designation as U.N. Secretary-General in the belief that Havel has a better understanding of internationalism than "the U.N.'s line of nationalist Secretaries-General." It is especially interesting for its quick takes on political attitudes expressed in the mass media.

RELIGIOUS PERSPECTIVES

The Advocate. Edited by Kathleen Hayes. Evangelicals for Social Action, 10 Lancaster Ave., Wynnewood, PA 19096. Monthly. $20 (membership). 1988-.

This nicely-designed 16-page newsletter includes attention to nuclear issues within its broad embrace of topics concerned with peace and justice. It describes its mission as seeking "to contribute to the development of social awareness and a consistently pro-life social ethic in the American Christian evangelical community, in order to, in the words of our slogan, 'promote shalom in public life.'" Each issue contains a feature article on an important public policy matter, federal legislative updates, news on developments abroad, and other organizational information.

Briefly. Edited by Nancy Lee Head. Presbyterian Peace Fellowship, Box 271, Nyack, NY 10960. Quarterly. $25 (membership). 1944-.

A newsletter designed to inform Presbyterian Church members of peacemaking ideas, activities, resources, and backgrounds, Briefly, in its 8 pages, covers issues on peace in general, including attention to nuclear matters such as the General Electric boycott led by INFACT and nuclear weapons facility investigations. It also features notes on resources and kindred organizations, plus occasional book reviews.

Christian Social Action (see figure 4). Edited by Lee Ranck and Stephen Brockwell. General Board of Church and Society, United Methodist Church, 100 Maryland Ave., N.E., Washington, DC 20002. 11/year. $13.50. 1968-. ISSN 0164-5528. Circ. 4,500.

Editor Ranck describes Christian Social Action as a magazine that "builds on the premise that faithful witness involves constant grappling with current issues in light of biblical and theological reflection. CSA is intended to stimulate thought, discussion and further study on a number of complex, sometimes controversial issues." These issues have recently included the Persian Gulf War, the situation in Panama, women's rights, the death penalty, and gay and lesbian concerns. Letters, a "U.N. Report," occasional book reviews, and other features round out the 40-page magazine. A good addition to libraries trying to offer readers access to a variety of religiously-informed views on the many aspects of peace and violence in today's world.

Desert Voices. Nevada Desert Experience, PO Box 4487, Las Vegas, NV 89127. Quarterly. Free (donations welcome). 1988-.

The Nevada Desert Experience describes itself as "a faith-based organization with Franciscan origins working to end nuclear weapons testing through a campaign of prayer, dialog, and nonviolent direct action."
Organized in 1984, the Experience conducts prayer vigils at the Nevada Test Site and sponsors annual commemorations of Hiroshima and Nagasaki in August. "NDE is a voice in the desert calling people of faith to nonviolence in the face of violence, truth in the face of illusion, hope in the face of despair, love in the face of fear." The 6-page newsletter features articles on the comprehensive test ban issue, organizational news, notes on activities of kindred groups, and occasional book reviews.

Episcopal Peace Fellowship Newsletter. Edited by Dana S. Grubb. PO Box 28156, Washington, DC 20038. Quarterly. $25 (membership). ca. 1965-.

This newsletter is primarily for the encouragement and information of EPF members and friends, and to keep bishops, church press, and others informed of organizational activities and objectives. The Episcopal Church has been an active peace and anti-nuclear weapons advocate for some time; Episcopalians seeking connections with other Church members will find this newsletter helpful.

Ground Zero (see figure 5). Ground Zero Center for Nonviolent Action, 16159 Clear Creek Rd. NW, Poulsbo, WA 98370. Quarterly. Donation.

The root of Ground Zero's orientation is secured in the tradition of Christian nonviolence, but, as the "Dear Gandhi" letters column suggests, the point of view is anything but narrowly sectarian, and not without a sense of humor. The 12-page tabloid dwells on peace issues at large, from testimonies to the power of prayer to sustain the peace activist to analysis of current U.S. military projects and protests around the nation. It includes the regular feature "Voices from Prison," in which peace activists jailed for their actions reflect on their situations and the meanings implicit in them. As in most grassroots publications, there is a strong sense of community evoked by Ground Zero, in this case a spiritual community. Recommended as a good example of its kind.

Figure 4: Christian Social Action March 1991

The Other Side. Edited by Mark Olson, Doug Davidson, and Dee Dee Risher. 300 W. Apsley St., Philadelphia, PA 19144. Bi-monthly. $29.50. 1965-. ISSN 0145-7675. Circ. 14,500.

This is an independent, ecumenical Christian magazine tending to the broad issues of peace and social justice. It addresses war, racism, nationalism, and the oppression of the disenfranchised. The magazine has published such writers as Daniel Berrigan, Mary Lou Kownacki, bell hooks, Margaret Drabble, William O'Brien, and many others; it maintains a very selective approach to its submissions. It includes poetry and fiction in addition to non-fiction pieces. "We abhor political rationalizing and the social posturing of the right and left," say the editors. "We welcome critical thinking about ourselves and those 'movements' of which we sometimes are a part." Good illustrations; a nice title for public libraries.

Figure 5: Ground Zero Vol. 9, No. 2, Fall 1990

Pastoral Care Network for Social Responsibility Newsletter. Edited by G. Michael Cordner, Th.D. PO Box 9243, Ft. Myers, FL 33902. Quarterly. $25 (membership). 1984-.

This organizational communication tool serves persons with training and interest in pastoral psychology and issues related to peace with justice and the "integrity of creation."
The 16-page newsletter informs members of the network and other interested persons about important related events, issues, resources, and concerns. The strong antiwar theme is accompanied by discussion of such social justice issues as adequate housing. It includes numerous notes from foreign readers and resource notes.

Pax Christi USA. Edited by Mary Lou Kownacki, OSB. National Catholic Peace Movement, 348 East Tenth St., Erie, PA 16503. Quarterly. $20 (membership). 1985-. ISSN 0897-9545. Circ. 10,000.

The primary goal of Pax Christi, the international Catholic peace movement, is "to work with all people for peace for all humankind, always witnessing to the peace of Christ. Its priorities are a Christian vision of disarmament, a just world order, primacy of conscience, education for peace and alternatives to violence." Pax Christi USA covers the Catholic peace movement in depth, with articles by and about activists, actions, and events, from analysis of the Persian Gulf War to campaigning for a Comprehensive Test Ban treaty. Each 38-page issue contains a variety of feature articles, columns, two or three book reviews, news of Pax Christi organizational matters, and "Network," a resources listing. Essential reading for Catholic peace activists and a desirable item for libraries that wish to make Catholic peace perspectives more readily available to their users.

Peace Office Newsletter. Mennonite Central Committee, International Peace Section, 21 S. 12th St., Box 500, Akron, PA 17501. 6/year. $10.

The Mennonite Central Committee is "the cooperative relief and service agency of North American Mennonite and Brethren in Christ conferences. It carries on community development, peacemaking and material aid 'in the name of Christ,' in response to His command to teach all nations the way of discipleship, love and peace." The 12-page newsletter features biblical perspectives on war and peace, examination of psychological issues, peace activism among different groups ("Seniors for Peace" is a current project), and reflections on the meaning of peacemaking.

World Peacemaker Quarterly. Edited by Dr. William J. Price. World Peacemakers, Inc., 2025 Massachusetts Ave. NW, Washington, DC 20036. Quarterly. $5. Circ. 2,500. 1979-.

This Christian, non-denominational newsletter emphasizes the importance of following the teachings of Christ in working for a peaceful world. The newsletter reflects editor Price's statement, drawn from his book Seasons of Faith and Conscience: Kairos, Confession, & Liturgy (Orbis, 1991), that "Every act of worship, every occasion where the sovereignty of the Word of God is celebrated, every instance where the realm of God is acknowledged, is always and everywhere expressly political." Church and state may be separate, but World Peacemakers is a group that approaches politics informed by religious conviction. The 20-page newsletter contains essays and notes concerning the spiritual motivations and rationales for turning away from war as a "solution" to international problems.

PROFESSIONAL PERIODICALS

The Arms Control Reporter: A Chronicle of Treaties, Negotiations, Proposals. Institute for Defense & Disarmament Studies, 2001 Beacon St., Brookline, MA 02146. Monthly. $325 libraries/$500 profit-making institutions. 1982-. OCLC 16159509. ISSN 0886-3490. Circ. 400.
This looseleaf service, useful if prohibitively costly for all but major research facilities, provides up-to-date information on the status of arms control negotiations, the positions of governments, the record of events leading to the current situation, and an update on weapons involved in negotiations. Each supplement contains 100-160 pages. The binder arranges material by topic; the 1991 cumulation, for instance, covers close to 40 arms negotiation areas, including short-range nuclear forces, nuclear-weapon-free zones, the Non-Proliferation Treaty, and missile proliferation. Although full of valuable information, the title's cost will inevitably keep it out of the hands of many researchers.

Arms Control Today. Edited by Matthew Bunn. Arms Control Association, 11 Dupont Circle NW, Washington, DC 20036. Monthly except two bimonthly issues, Jan./Feb. and July/Aug. $25/$30. 1972-. OCLC 2197658. ISSN 0196-125X. Circ. 4,000. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets and Technology; PAIS; Predicasts Overview of Markets and Technologies.

The Arms Control Association, a national membership organization, "seeks to create broad public appreciation of the need for positive steps toward the limitation of armaments and the implementation of other measures to reduce international tensions and promote world peace." Its journal is essential for any serious collection on peace, nuclear weapons, and strategic issues in general; ACT's typical 40-page issue contains interviews with influential figures and informed articles on such topics as nuclear proliferation, verification, movement toward a comprehensive test ban, and strategic defense. The regular departments, "News Briefs" and "Factfile," afford quick access to developments in or affecting arms control. One of the most valuable points for the researcher is "Arms Control in Print," a timely, two-page bibliography identifying books, pamphlets, government documents, and articles in various categories. One or two long book reviews per issue allow reviewers to address the topic at hand as well as the books under consideration. Contributors are prominent and varied in their viewpoints.

Barometer. Edited by Tariq Rauf. Canadian Centre for Arms Control and Disarmament, 151 Slater, Suite 710, Ottawa, Ontario, Canada K1P 5H3. Quarterly. $30/$45. 1990-. ISSN 0825-1894. Circ. 3,000.

The Canadian Centre for Arms Control and Disarmament was established in 1983 to encourage informed debate and to provide independent, non-partisan research and information on arms control and disarmament. Barometer, although subsidized to some extent by the government, maintains an independent editorial position. An 8-page tabloid printed on quality paper, its emphasis is on Canadian involvement in global issues of arms control and disarmament. 1990 issues contained articles on nuclear testing in the Arctic, International Atomic Energy Agency (IAEA) safeguards, trends in the arms trade, and Canadian-Soviet cooperation initiatives, among other topics, plus occasional book reviews.

Bulletin of the Atomic Scientists. Edited by Len Ackland. Educational Foundation for Nuclear Science, 6042 S. Kimbark Ave., Chicago, IL 60637. 10/year. $30. 1945-. OCLC 1242732. ISSN 0096-3402. Circ. 20,000. Indexed: A.B.C. Pol. Sci.; Academic Index; American Bibliography of Slavic & East European Studies; America: History & Life; Bibliography & Index of Geology; Biography Index; Biol. Dig.; Biological Abstracts; Book Review Index; Book Review Digest; Chemical Abstracts; Current Advances in Ecological and Environmental Sciences; Current Contents; Current Index to Journals in Education; Energy Review; Environmental Periodicals Bibliography; Excerpta Medica; Future Survey; General Science Index; Historical Abstracts; Index to Scientific Reviews; INIS Atomindex; Magazine Index; Media Review Digest; Metals Abstracts; Middle East: Abstracts & Index; Pollution Abstracts; Readers' Guide to Periodical Literature; Social Science Citation Index; Sociological Abstracts; Risk Abstracts; South Pacific Periodicals Index; World Aluminum Abstracts.
The BAS debuted in December 1945. Home of the famous "Doomsday Clock" logo indicating its editors' estimation of humanity's proximity to nuclear annihilation, the magazine is rather more optimistic about the future than it was a few years ago, or even at its inception when it warned of atomic catastrophe being "inevitable if we do not succeed in banishing war from the world." In its 45th anniversary issue, editor Ackland wrote, "The race to nuclear destruction between the world's two military behemoths has been reversed and the opportunity exists to dismantle the dangerous Cold War arsenals and superstructures." If that reversal has taken place, BAS can claim as much credit as any periodical. Throughout its history it has been at the forefront of "responsible" (i.e., professional, expert) forums for addressing the many and intricate aspects of the nuclear threat. Proliferation, testing, the arms race, nuclear weapon facility problems, and many other nuclear issues come into its scope. With articles by recognized authorities, a lively format with good illustrations and good book reviews, BAS is a must for all libraries.

CEASE News. Edited by Peggy Schirmer. Concerned Educators Allied for a Safe Environment, 17 Gerry St., Cambridge, MA 02138. 3/year. $5. Circ. 700. 1982-.

CEASE is a national network of parents, teachers, and other young children's advocates concerned about the dangers of violence, pollution, nuclear power, nuclear war, and a global military budget that drains resources from programs designed to help children and their families. CEASE News is a modest but neatly-produced little newsletter reporting organizational activities and featuring brief articles on various facets of the peace movement. Recent issues have offered articles on the children of Hiroshima, war toys in the classroom, and the Middle East crisis, with some book and audiovisual reviews of materials intended either for children or for their adult teachers and guides.

Council for a Livable World Newsletter. Council for a Livable World, 100 Maryland Ave. NE, Washington, DC 20002. Irreg.; free. 198?-.

Although in the words of Council office manager Chris Peterson, "This newsletter is published with no regularity whatsoever," it remains of interest when it does appear. The Council works in behalf of establishing a majority in the U.S. Senate supporting nuclear disarmament and "a big cut in the military budget." The 4-page newsletter contains updates on the current state of that budget, the status of weapons programs, arms control agreements, and other topics. The Council also publishes irregular "Fact Sheets," also free, on specific weapons and military issues, and operates a "Nuclear Arms Control Hotline" (202-543-0006), a 3-minute taped message.

CPSR Newsletter. Edited by Gary Chapman. Computer Professionals for Social Responsibility, PO Box 717, Palo Alto, CA 94302. Quarterly. $50. 1983-.
This desktop-published 30-page newsletter turns its attention generally to the socially responsible uses of computers, and has recently covered such issues as telephone privacy and how computers contribute to the ecological crisis. It has also published many articles in its history on nuclear war and related topics, including nuclear education, strategy, computer unreliability and nuclear war, SDI, and other topics. Articles contain references, but the style is accessible to the average educated reader; one need not be a computer scientist--or even use a computer--to make sense of it. Recently CPSR called for an end to the "Star Wars" program, and published a response to that call by the Strategic Defense Information Office. Given the importance of computers in contemporary weaponry and defense systems, this newsletter is worth the attention of anyone concerned about the relationship of high technology to war and peace.

ESR Journal: Educating for Social Responsibility. Edited by Sonja Latimore. Educators for Social Responsibility, 23 Garden St., Cambridge, MA 02138. Annual. $12. 1990-.

ESR Journal devotes itself to new ideas on educating students for their involvement in the world. It includes both theoretical and practical essays by ESR leaders and other experts in education. "Skilled, courageous, and creative teachers are essential for our country to survive and thrive," states a journal representative. "ESR exists to enable such teachers to work together to develop and share ideas." The 120-page 1990 issue, in the format of a typical scholarly journal, featured articles on human rights education, conflict management for students, the role of education for social responsibility in American culture, and other topics. Many of the articles contain footnotes and bibliographies. One hopes this welcome addition to educational literature will be able to evolve to a more frequent publication status.

F.A.S. Public Interest Report. Edited by Jeremy J. Stone and Steven Aftergood. Federation of American Scientists, 307 Massachusetts Ave. NE, Washington DC 20002. 6/year. $25/$50. 1970-. ISSN 0092-9824. Circ. 4,000.

The Federation of American Scientists was founded in 1945 by Manhattan Project scientists to promote the peaceful and humane uses of science and technology. Its journal describes itself "as a means to disseminate the research and analysis produced by various projects of the F.A.S. Fund (educational and research arm of the Federation) which deal primarily in the areas of nuclear proliferation, chemical/biological weapons, international scientific exchange, disarmament verification and the environmental and political implications of the U.S. space policy." Occasional book reviews are included.

LAWS Quarterly. Edited by Laura McGough. Lawyers Alliance for World Security, 1120 19th St., NW, Washington, DC 20036. Quarterly. $20. 1982-.

Recently revamped from a 4-page newsletter to a more substantial 20-page magazine, LAWS Quarterly is designed to assist its parent organization in providing a forum for the analysis and exchange of ideas concerning reduction of the threat of nuclear war, advancing non-proliferation, and enhancing movement towards the rule of law in the Soviet Union.
In addition to organizational news, the most recent issue featured essays by a scholar from the Center for International Security and Arms Control of Stanford University and by a former director of the U.S. Arms Control and Disarmament Agency. Previously published as the newsletter of the Lawyers Alliance for Nuclear Arms Control. LAWS Quarterly is a desirable addition to law libraries.

Meiklejohn Civil Liberties Institute PeaceNet Bulletin. Edited by Ann F. Ginger. Meiklejohn Civil Liberties Institute, Box 673, Berkeley, CA 94701. Monthly. $12. 1990-.

The Meiklejohn Civil Liberties Institute is active on a variety of fronts; a commitment to peace and social justice is one of them. The PeaceNet Bulletin is a four- to six-page newsletter devoted to single-issue analysis of "crucial current events and the central issues of peace law" regarding such topics as the U.S. invasion of Panama, the Persian Gulf War, and nuclear deterrence. The organization's goal "is to fulfill our responsibilities in the nuclear age by helping inform U.S. public discussion and debate on these events and to support appropriate action by U.S. policymakers, organizations, and also specifically by lawyers and lawmakers." Contributors are legal authorities. Chiefly of interest to those in the legal profession who want to explore the opportunities for pursuing peace and justice afforded by their professional expertise.

Nucleus. Edited by Steven Krauss. Union of Concerned Scientists, 26 Church St., Cambridge, MA 02238. Quarterly. Donation. 1978-. ISSN 0888-5729. Circ. 130,000.

Nucleus covers arms control, national security and energy policy issues, and nuclear power safety. The oversize 8-page tabloid contains news and analysis of all these issues, and benefits from good graphs, charts, and other illustrations. The Union of Concerned Scientists is dedicated to environmental health, renewable energy, and "a world without the threat of nuclear war." The organization also publishes books and brochures on these issues, along with its 4- to 6-page "Briefing Papers" on such topics as nuclear proliferation, antisatellite weapons, and other aspects of nuclear war and peace.

The PSR Quarterly: A Journal of Medicine and Global Survival. Edited by Jennifer Leaning, M.D. Williams & Wilkins, PO Box 23921, Baltimore, MD 21203. (Editorial offices: 10 Brookline Place West, Brookline, MA 02146). Quarterly. $48/$85. 1991-.

This most welcome new journal began in the thirtieth anniversary year of Physicians for Social Responsibility, a national organization of 25,000 health professionals and supporters working to prevent nuclear war and other environmental catastrophes. PSR is the U.S. affiliate of the International Physicians for the Prevention of Nuclear War. The journal provides the first peer-reviewed periodical coverage of the medical, scientific, public health, and bioethical problems related to the nuclear age. It features editorials, debate and rebuttal, news notes, letters, and book and journal reviews. The 65-page debut issue of March 1991 contained scholarly articles on the neutron bomb, health effects of radioactive fallout on Marshall Islanders, and other significant contributions to an informed understanding of medical issues in the context of a world bristling with weapons of mass destruction. Any library serving a clientele with an interest in medicine and allied health fields will want to give this title serious consideration.

PSR Reports. Edited by Burton Glass.
Physicians for Social Responsibility, 1000 16th St., NW, Suite 810, Washington, DC 20036. 3/year. $80 physicians/$40 associates/$15 students (membership). ISSN 0894-6264. Circ. 50,000. 1985-. (Was PSR Newsletter, 1980-.)

The official membership newsletter for Physicians for Social Responsibility, this 8-page tabloid informs readers of the organization's campaigns against nuclear weapons testing and production, federal budget priorities, and environmental protection and restoration. Some book reviews are included.

Psychologists for Social Responsibility Newsletter. Edited by Anne Anderson. 1841 Columbia Rd., NW, Suite 207, Washington, DC 20009. Quarterly. $35. 1982-.

This 12-page newsletter, in addition to covering activities of Psychologists for Social Responsibility, focuses on projects in which professional psychologists are involved concerning peace, war, conflict resolution, and related topics. The newsletter also features articles on such topics as the psychological case for a comprehensive test ban, profiles of antiwar psychologists, and commentary on current international crises. The organization defines its mission as using psychological principles and tools "to promote conversion from a war system to a world dedicated to peace and social justice." An annotated resource list is a regular feature; occasional book reviews are included. Psychologists who want to stay abreast of professional developments regarding war and peace will find this title useful; so would lay readers interested in psychology.

Research Report of the Council on Economic Priorities (see figure 6). Edited by Alice T. Marlin. Council on Economic Priorities, 30 Irving Pl., New York, NY 10003. Monthly. $25. 1969-. ISSN 0898-4328.

The Council on Economic Priorities is an independent, public interest research organization. A focus on arms control, military spending, and national security has long been one of the Council's interests. Recent issues of the 6-page Research Report have dealt with the economic effects of the Cold War's decline, particularly the need for conversion from military to civilian industry in both the U.S.S.R. and the United States. Succinct but informative.

[Figure 6: Research Report of the Council on Economic Priorities, October 1990. The reproduced front page, headlined "Beating Swords into Washing Machines," concerns the conversion of Soviet military production to civilian purposes; the page text is not otherwise legible in this copy.]

SCHOLARLY JOURNALS

Bulletin of Peace Proposals. Edited by Magne Barth. Sage Publications, PO Box 5096, Newbury Park, CA 91359. Quarterly. $37/$83. 1970-. OCLC 1537766. ISSN 0007-5035. Abstracts of Military Bibliography; America: History and Life; Historical Abstracts; Human Rights Internet Reporter; INIS Atomindex; Middle East: Abstracts & Index; PAIS; Risk Abstracts.

Recent issues of this scholarly journal have addressed such topics as religion and armed conflict, the alleged obsolescence of major war between developed countries, international environmental cooperation, current change in Europe, and the arms industry, technology, and democracy in Brazil. It includes the occasional article on nuclear and related issues, such as Sven Hellman's "The Risks of Accidental Nuclear War" in the March 1990 issue. Authors are an international lot, including those from the U.S., Western and Eastern Europe, Latin America, Africa, Canada, and elsewhere. The journal's motto is "To motivate research, to inspire future oriented thinking, to promote activities for peace." It concentrates on international policy in the light of general peace research theory. Perhaps a bit intimidating for undergraduates and the public at large.

Conflict Management and Peace Science. Edited by Walter Isard. Peace Science Society (International), Dept. of Political Science, SUNY Binghamton, Binghamton, NY 13901. Irreg. $20. 1974-. OCLC 8055590. ISSN 0738-8942. Circ. 1,000.
A.B.C. Pol. Sci.; America: History & Life; Current Contents; Historical Abstracts; Middle East: Abstracts & Index; PAIS; Social Science Citation Index.

It may not publish more than one issue in a year, but this journal nevertheless contributes some worthwhile points of view on peace issues. This scholarly title has featured articles on long-term effects of nuclear weapons, the high-technology arms race, and the relationship between trade and conflict. For advanced students and scholars; others will be frequently stymied by mathematical formulae in the articles. Contributors are almost exclusively U.S. scholars.

Current Research on Peace and Violence. Edited by Pertti Joenniemi. Tampere Peace Research Institute, Hameenkatu 13 b A, PO Box 447, SF-33101, Tampere, Finland. Quarterly. $40. 1971-. ISSN 0356-7893. Circ. 600. Abstracts of Military Bibliography; Current Contents; International Political Science Abstracts; Middle East: Abstracts & Index; Sociological Abstracts; Social Science Citation Index.

An interdisciplinary scholarly journal that publishes articles on a wide variety of topics in its 60 to 70 pages. Recent issues have featured articles on the U.N. and nuclear disarmament, Soviet military doctrine, "peace research as critical research," and other issues. A diversity of viewpoints and contributors, from Scandinavia, North America, Great Britain, and elsewhere, gives the journal appeal to peace activists, scholars, and students.

Disarmament: A Periodic Review by the United Nations. Edited by Lucy Webster. United Nations Dept. of Disarmament Affairs. Publications Sales Office, Rm. DC2-853, New York, NY 10017. Quarterly. $18. 1978-. American Bibliography of Slavic & East European Studies; PAIS.

Disarmament is intended to serve as a source of information and a forum for ideas concerning the activities of the United Nations and the wider international community with regard to arms limitation and disarmament issues. The periodical is issued in English, French, Russian, and Spanish editions. As one might expect, the breadth of subjects covered is extensive and its contributors are international. Recent issues have offered articles on economic conversion in the U.S.S.R., coverage of the Non-Proliferation Treaty Review Conference that took place in the fall of 1990, tactical nuclear weapons, international arms transfers, and other significant topics. Contributors come to their tasks with well-informed backgrounds in the issues. The majority of the articles contain references to other literature. From 20 to 30 brief book reviews, a list of publications received, recent documents on disarmament, and a chronology of disarmament activities round out each issue. At the price, Disarmament is an economical and desirable addition to most libraries.

International Journal on World Peace. Edited by Panos D. Bardis. Professors World Peace Academy, GPO Box 1311, New York, NY 10116. Quarterly. $15/$30. 1984-. ISSN 0742-3640. Circ. 10,000. Current Contents; Psychological Abstracts; Social Science Citation Index; Social Work Research & Abstracts; Sociology of Education Abstracts; Geographical Abstracts; International Political Science Abstracts; Key to Economic Science; LLBA Linguistics and Language Behavior Abstracts; PAIS; Peace Research Abstracts; Sociological Abstracts.

This is another title ranging widely over the world of peace issues.
A typical number contains two or three major articles; recent issues have focused on national self-determination, the link between Locke and Kant and ecological theories, the historical paradox of religious sects' lip-service to peace while engaging in war, apartheid, and wars of development in Latin America. A brief "News" section takes an equally broad approach to current political developments, such as the independence movements in the Soviet Union. It includes notes on new books and journals. Book reviews are lengthy, if not plentiful (8 to 10 per issue). Some of the books chosen for review are curious entries in a journal devoted to peace (e.g., E.D. Hirsch's Cultural Literacy), but the reviews also turn up some interesting and generally overlooked titles. Clearly a reflection of its editor's worldview, even to the inclusion of his long "Miscellany" column, in which he may offer anything from his own reflections on global affairs to poems sent in by readers to his "Pandebars," brief poetic musings on whatever catches his fancy.

Journal of Conflict Resolution. Edited by Bruce M. Russett. Sage Publications, 2455 Teller Rd., Newbury Park, CA 91320. $130. 1957-. OCLC 1623560. ISSN 0022-0027. A.B.C. Pol. Sci.; America: History & Life; American Bibliography of Slavic & East European Studies; Abstracts of Military Bibliography; Academic Index; Current Contents; EI (Excerpta Indonesica); Educational Administration Abstracts; Historical Abstracts; International Political Science Abstracts; Psychological Abstracts; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Peace Research Abstracts; Psycscan; Social Science Citation Index; Social Sciences Index; Social Work Research & Abstracts; Sociology of Education Abstracts.

Although war and its avoidance is a consistent theme in JCR, the journal is greatly varied in its subjects, and its focus is both historical and contemporary. The March 1991 issue, for example, offered articles on economic causes of a breakdown in military balance, another on Chinese community mediation, and an essay on foreign policy crises, 1929-1985. JCR often includes articles on nuclear deterrence and other facets of strategic arms. Contributors are chiefly U.S. academics, with occasional appearances by foreign scholars. The typical JCR essay is heavily annotated, laden with mathematical formulae, and more-or-less impenetrable to the lay reader. Abstracts precede the articles. Desirable for most academic collections; most public libraries can live without it.

Journal of Peace Research. Edited by Nils P. Gleditsch and Stein Tonnesson. Sage Publications, Box 5096, Newbury Park, CA 91359. Quarterly. $37/$83. 1964-. OCLC 1607337. ISSN 0022-3433. Circ. 1,200. A.B.C. Pol. Sci.; America: History & Life; Current Contents; Future Survey; Historical Abstracts; International Labor Documentation; LLBA Linguistics and Language Behavior Abstracts; Middle East: Abstracts & Index; PAIS; Peace Research Abstracts; Risk Abstracts; Social Sciences Index.

Published under the auspices of the International Peace Research Association, JPR "is committed to theoretical rigour, methodological sophistication, and policy orientation." The journal produces an occasional special theme issue; the February 1991 number is given over to international mediation and contains ten selections on the topic, including an introduction by former President Jimmy Carter. Other contributors to JPR are political scientists, sociologists, and psychologists from the U.S., U.K., Scandinavia, and elsewhere. Articles contain abstracts and end notes. Thematic issues feature an issuewide bibliography listing citations to all the items referred to in the issue in hand. JPR publishes numerous articles on nuclear issues; recent essays have dealt with ICBM trajectories, assumptions of British nuclear weapon decision makers, and factors predisposing individuals to support nuclear disarmament. The "Book Notes" section provides fairly substantial reviews of up to a dozen recent books. A good addition to most peace collections.

Peace and Change. Edited by Robert D. Schulzinger and Paul Wehr. Sage Publications, 2455 Teller Rd., Newbury Park, CA 91320. Quarterly. 1972-. ISSN 0149-0508. Circ. 1,000. Historical Abstracts; Abstracts of Military Bibliography; Human Rights Internet Reporter; International Political Science Abstracts; Middle East: Abstracts & Index; PAIS; Peace Research Abstracts; Sage Public Administration Abstracts; Sage Urban Studies Abstracts.

Peace and Change publishes scholarly articles on many peace issues, but focuses especially on work concerning the development of a just and humane society. The chronological scope is historical as well as contemporary; the January 1991 issue, for instance, features an assessment of the peace movement in the 1980s and a special section on Bertha von Suttner (1843-1914), author of the famous 1889 antiwar novel Die Waffen nieder! (Lay Down Your Arms). Contributors, both foreign and U.S., to each issue's 6 to 9 articles typically represent a variety of disciplines--anthropology, history, literature, political science, sociology, physics, and others. The journal's openness to work from different spheres gives it a healthy and stimulating eclecticism: few readers at all interested in peace topics will fail to find at least one or two articles per issue that strike sparks for them. Book reviews are few; it is an area the journal could bolster.

Peace and the Sciences. Edited by Peter Stania. International Institute for Peace, Mollwaldplatz 5, A-1040, Vienna, Austria. Quarterly. $240. 1969-. OCLC 6158329. ISSN 0031-3513. Circ. 800.

This journal reports discussions at international meetings of both Western and Eastern scientists organized by its publisher. It also recently inaugurated a more thorough attention to the research activities of the IIP. Chiefly of interest to those looking for a journal with a strong emphasis on European perspectives on peace issues; contributors are mostly European, although some U.S. scholars find their way into the journal's pages. Recent issues have dealt in depth with the future of Europe, economic conversion following disarmament, and ecological security. Contains a mix of research and reflective pieces.

Survival. Edited by Hans Binnendijk. International Institute for Strategic Studies, 23 Tavistock St., London WC2E 7NQ, England. U.S. subscriptions to Brassey's, Maxwell House, Fairview Park, Elmsford, NY 10523. 6/year. $30. 1959-. OCLC 5010177. ISSN 0039-6338. Circ. 6,500. Abstracts of Military Bibliography; Historical Abstracts.

A scholarly journal devoted to conflict and peacemaking, Survival covers the globe; articles range from Sri Lanka and Cambodia to Central America and South Africa. It contains occasional articles on explicitly nuclear issues, such as coverage of the 1990 Non-Proliferation Treaty Review and evaluation of SDI deployment options.
Each issue's book reviews are relatively few but lengthy, and often focus on works concerned with nuclear topics.

INDEXES AND ABSTRACTS

Alternative Press Index. Alternative Press Center, Inc., PO Box 33109, Baltimore, MD 21218. Quarterly. $30/$125. 1969-. OCLC 1479213. ISSN 0002-662X. Circ. 550.

Subject and author access to articles in close to 250 alternative and radical publications, many of which cover peace issues on a regular basis. Most of the periodicals indexed here are not well represented in other indexes; most of them are not well represented in libraries. The majority of the titles are U.S. publications, but the list includes many from Canada, Great Britain, Australia, and other nations.

Peace Research Abstracts Journal. Edited by Hanna Newcombe and Alan Newcombe. Peace Research Institute, Dundas, 252 Dundana Ave., Dundas, Ont. L9H 4E5, Canada. Monthly. $210. 1964-. OCLC 1605735. ISSN 0031-3599. Circ. 400.

A very useful tool for peace professionals, this abstracting journal cites and annotates (frequently at considerable length) over 3,000 documents annually. Coverage includes books, scholarly and semi-popular periodicals representing a large number of disciplines, institutional reports, newspapers, films, and other materials. Access is by author and subject indexes and by a code index that classifies entries by subject. Back issues are available from the publisher. Indispensable for researchers investigating Canada's role in affairs of peace and war because of its strong coverage of Canadian publications, the journal also treats a copious quantity of American and British materials. Some coverage of non-English language documents can also be found.

work_n2dtwukhbvdydlibl2kifqbb74 ----

Scientific models as works

By Anita S. Coleman, School of Information Resources and Library Science, University of Arizona at Tucson, AZ 85719, USA. Email: asc@u.arizona.edu

Abstract: This paper examines important artifacts of scientific research, namely models. It proposes that the representations of scientific models be treated as works. It discusses how bibliographic families of models may better reflect disciplinary intellectual structures and relationships, thereby providing information retrieval that reflects human information seeking and use purposes such as teaching and learning. Two examples of scientific models are presented using the Dublin Core metadata elements.

Outline:
1. Background
2. Current cataloging and indexing practices
3. Assumptions and limitations
4. Definition of scientific models
5. Importance of scientific models
6. Aspects of scientific models
7. IR of scientific models
   a. Table 1: Representation and resources about constitutive models
   b. Table 2: Subject (information) retrieval and constitutive models
   c. Table 3: Information resources in OCLC about constitutive models
8. Scientific models as works and their bibliographic families
9. Bernoulli model
   a. Table 4: Bibliographic families
   b. Table 5: Bibliographic and subject relationships
   c. Table 6: Relationships
10. Sample Metadata for Atmosphere Ocean Model
11. Conclusion
12.
Acknowledgements
Glossary
Appendix A: LCSH subject headings for models
Appendix B: Notes about sampling frames
Appendix C: Qualitative analyses: excerpts
Appendix D: Scientific models as works and future research questions
Notes

Background:

The current environment of scholarly information organization for retrieval in libraries is based on two important traditions:

1. Information handling tools like the library catalog are intrinsically different from bibliographic databases and indexes of journal articles. This is because library catalogs must accommodate information retrieval from both physical storage and conceptual content, while the databases and indexes are often concerned with conceptual information retrieval only. Therefore, from the library perspective, the two tools, periodical indexes (bibliographic databases) and library catalogs, together provide bibliographic control of the universe of knowledge. From the user perspective, both indexes and catalogs must be consulted for information retrieval from the bibliographic universe of knowledge.

2. Information resources for inclusion in the library catalog are often chosen because they are bibliographically independent publications.1 The Anglo-American Cataloging Rules, 2nd edition revised (AACR2R) specifies these as books, pamphlets and printed sheets, cartographic materials, manuscripts (including manuscript collections), music, sound recordings, motion pictures and videorecordings, graphic materials, computer files, three-dimensional artifacts and realia (the exception to "bibliographic"), microforms, and serials.2 In other words, these are the units of analysis, the item/object granularity level at which the library catalog functions. Typically, the whole item (book, serial, etc.) is described; individual book chapters are not cataloged, though AACR2R provides for this.3 Similarly, the periodical indexes have taken over the role of providing access to component parts that can also be considered independent units, such as journal articles.

Both of these traditions have been challenged as the practice of representing information on digital media continues to rise. Patrick Wilson notes that in the global, online, multimedia information world we can no longer take textual or conceptual stability for granted.4 An important question to investigate, then, is this: How can the primary bibliographic tool, the library catalog, better reflect disciplinary knowledge structures? This paper investigates by using the notion of works for one class of intellectual (and disciplinary) creations, scientific models. Before proceeding to a discussion of scientific models as works, current practices in cataloging and indexing, and the assumptions, limitations, and scope of this study are presented. There is a glossary, mostly drawn from Smiraglia, which defines terms used in this paper.5

Current cataloging and indexing practices:

Library catalogs and indexing databases focus on the subjects of disciplines (what topics and concepts are there within a particular subject or discipline) and not on the disciplinary intellectual activities (at least not in the sciences; for example, modeling) that result in creative products that can be indexed for information retrieval. Questions such as what the intellectual products of disciplinary activity are, how such products (for example, models) are represented in bibliographic entities, what the component parts of such representation are, and how the parts are related appear to be beyond the scope of bibliographic organization.
Therefore, current cataloging, indexing, and classification of models exist only for representations of the textual content of models: the written descriptions about models and the activity of modeling as recorded in published literature. These are usually found in texts, items, and documents such as journal articles, scientific reports, theses, dissertations, books, and chapters in books.

Bibliographic control of models and modeling reported in published literature as a scientific activity is enabled through three types of tools: library catalog, bibliographic utility, and periodical index (henceforth the term index includes bibliographic databases and periodical indexes). Information retrieval in these tools is facilitated through description and subject analysis. Subject analysis includes classification. Resources about models are often classified as subjects, and this entails the use of controlled vocabulary systems like thesauri, classification schemes, or subject heading lists. In the library catalog and in bibliographic utilities, the Library of Congress Subject Headings (LCSH) is the predominantly used controlled vocabulary list. Dewey Decimal Classification (DDC) provides the classification number for item location of the unit. The LCSH descriptor (preferred term) is Models and modelmaking, which may be subdivided geographically; there is Mathematical models, which may be subdivided by object, and narrower terms like Atmospheric models, Hydrologic models, Wind tunnel models, etc.6 Appendix A provides a list of the LCSH subjects under Models and modelmaking. Controlled vocabulary in the indexes is dependent upon discipline thesauri, and each index usually selects and uses a different thesaurus, subject heading list, or classification scheme. For example, GEOREF (an index for the geological sciences) uses the GEOREF Thesaurus, while INSPEC (another index in the physical sciences and engineering) uses the INSPEC Thesaurus (see Appendix B for more details).

Assumptions and limitations:

This study is based on the following assumptions:

1) Information retrieval in libraries and through bibliographic tools must actively support, if not enable, end-user information seeking purposes such as exploratory learning and information uses such as teaching and learning. Therefore, our library tools must reflect disciplinary knowledge structures, products, and uses.

2) In the online world, continuing to segregate tools such as indexes, catalogs, and bibliographies is inefficient. For information seeking purposes such as teaching and learning, efforts should be made to merge the three tools for improved end-user searching and information retrieval.

3) Boolean searching is end-user hostile. Tests as early as Cranfield have shown that Boolean searches do not improve retrieval performance significantly over other types of searches.7 This study assumes that phrase searching, for example with noun phrases like 'tree rings', is a preferred user search strategy, and hence bibliographic description should accommodate such information retrieval.

4) AACR2R defines models as "a three-dimensional representation of a real thing."8 It also provides descriptive cataloging rules for cataloging models as physical objects. This is not the definition of scientific models as used in this study. A different definition, one that is grounded in how the word is used in the disciplines (sciences and social sciences), is proposed.

A major limitation of this study is that it does not completely include the representations of models in heterogeneous formats; it is limited to electronic formats only. Also, because modeling is a widespread activity, an attempt is made to identify important properties of scientific models only. Finally, this is part of a larger funded study that is developing a classification scheme for scientific models in one area, water quality, and building a prototype catalog of scientific models.
A major limitation of this study is that it does not completely include the representations of models in heterogeneous formats; it is limited to electronic formats only. Also, because modeling is a widespread activity, an attempt is made to identify 6 important properties of scientific models only. Finally, this is part of a larger funded study that is developing a classification scheme for scientific models in one are, water quality, and building a prototype catalog of scientific models. Definition of scientific models: In every field of human endeavor, including the natural and engineering sciences, the word model can mean different things and conjure different images to different people. The Oxford English Dictionary Online provides three major meanings for the word model: a representation of structure, a type of design and an object of imitation.9 From the more than 15 meanings within the above three contexts, the following definition best fits an initial consideration of scientific models as works: a model is “a simplified or idealized description or conception of a particular system, situation, or process (often in mathematical terms: so mathematical model) that is put forward as a basis for calculations, predictions, or further investigation” [emphasis original]. The Encyclopedia Britannica Online offered 4145 articles for a search on the word model.10 This is in addition to the meanings for six words (model (n), model (v), model (adj.), animal model, role model, Watson-Crick model). The first article entry is titled “Model Construction from Operation Research” and has the brief introduction: “A model is a simplified representation of the real world and, as such, includes only those variables relevant to the problem at hand. A model of freely falling bodies, for example, does not refer to the colour, texture, or shape of the body involved. Furthermore, a model 7 may not include all relevant variables because a small percentage of these...”11 [emphasis original]. The Academic Press Dictionary of Science and Technology provides a similar definition for a model as used in the sciences.12 It is “a pattern, plan, replica, or description designed to show the structure or workings of an object, system, or concept.” More specific definitions are provided for the areas of Behavior, Computer Programming, Artificial Intelligence, and Photogrammetry. It is clear that there are models in both social sciences and the sciences. It is equally clear that the activity of scientific modeling includes mathematical and computer- based is prevalent in many disciplines. For example, a search for the term model in Science Direct, a full-text index to Elsevier Journals subscribed to by the University of Arizona Libraries, yielded 1157 articles for the year 2002 only1. A search in the same database, all journals, 2001-onwards, for the phrase “computer models” within abstracts retrieved 2759 articles in journals such as Chaos, Solitons, & Fractals, Muscle & Nerve, Computers & GeoSciences belonging to science, medicine and social science disciplines. 
Even in the sciences only we find that there are many different approaches that scientific disciplines take to models; they can be complex numerical models implemented on computers, mechanical analogs, performs of theories or restricted concepts within which basic dynamical aspects can be described and understood.13 However, there appears to be widespread consensus that ‘scientific models’ reflect the intellectual activity and include computational and mathematical modeling. Hence the phrase scientific models is more accurate than mathematical or computer models. 1 I was unable to replicate this search for other years as the search does not proceed because of the system limitation - retrieved too many hits – hence user is asked to modify the query. 8 Importance of scientific models for teaching and learning: Integrating scientific, mathematical, and computational modeling, which includes model use and building, with teaching and learning new science concepts has been recognized since the 1980s as important for a number of reasons. Such integration, it is believed, provides the framework for assembling data and knowledge, for stimulating scientific reasoning, discovery, synthesis, analysis, and theoretical development skills in novice student learners. NASA identified the kinds of prerequisites that are needed for integration of research and model use and building with pedagogical goals.14 1. The acquisition of observations (the model must exhibit a selective attitude to information) – student’s ability to select relevant observables 2. Analysis and interpretation of the observational data (structured and pattern- seeking/replicating) 3. Construction of and experimentation with conceptual and numerical models (as analogies and prediction devices) 4. Verification of the models, together with their use to furnish statistical predictions of future trends (testing, experimentation and replication). ThinkerTools, where researchers from the University of California at Berkeley and Educational Testing Services (ETS) collaborated with middle school teachers, is a scientific inquiry and modeling project.15 They developed Morton Modeler, a computer agent, who walks users through the process of building good scientific models. Another 9 noteworthy project is Modeling for Understanding in Science Education (MUSE) based in the University of Wisconsin at Madison.16 MUSE includes middle and high school students and is focused on improving understanding about science as a modeling enterprise and scientific modeling skills. Scientific models draw on fundamental laws and equations and are potentially rich sources for solutions to interdisciplinary problems. Hence, bibliographic tools that show disciplinary activity relationships associated with modeling, name, phenomenon, process, object relationships in the model (subject relationships), mathematical, computer, and use relationships (what kind of mathematical function does the model use, purpose of model, how many and what types of variables, what kind of computer, what kind of software) are important for disciplinary enlightenment during the information retrieval process that accompanies teaching and learning tasks. Aspects of scientific models: The specific area of water quality and the broad discipline of geography were studied to provide the preliminary aspects (facets) of scientific models. Appendix C provides excerpts of raw data used in this analysis. General properties of scientific models are: 1) Models reflect reality. 
2) They are small representations of reality.
3) They are simpler than the process/phenomenon they study or model.
4) They are closed, not open, systems.
5) Any real situation can be analyzed if it can be described in terms of mathematical equations.
6) The most important features of reality are correctly incorporated; less important features are initially ignored.

Important aspects of scientific models are as follows:

1. Purpose or type of model. What is the purpose of modeling? Many classifications of models by purpose exist: classifications in geography, hydrology, and the environmental sciences (see Appendix C), as well as in physics and biology. Scientific classification usually has different purposes from classification for information retrieval. ThinkerTools also provides a simple typology of models. Example: quasi-realistic (simulation), cognitive (explanatory)

2. Object of study. What is the object or objects being modeled? Example: wind tunnel

3. Process. The objects that the model studies, investigates, or imitates participate in a natural process. What process or processes does the model simulate or study? Example: erosion

4. Phenomena. What phenomenon does the model simulate, study, or seek to explain? Example: clouds

5. Fundamental law. Is there a fundamental law that the model is based upon? Which one? Example: Bernoulli Law

6. Mathematical or statistical function. Models are represented through textual theory, law, and mathematical equations. Can the fundamental law be expressed mathematically? What is the equation? What are the mathematical functions that the model uses? Example: differential functions

7. Variables. What are the conditions and variables studied, modeled, input, output, hypothesized, or generated? Example: soil properties

8. Spatial coverage. What is the spatial coverage of the model? Various schemes can be used here: geographic scale, named features, geo-references, etc. Example: global models, regional models

9. Temporal coverage. What is the temporal coverage of the model? Again, various schemes can be used here: temporal scales, named geologic periods, historical periods, etc. Example: micro, meso, 1-hr

10. Software. What software is needed to run the model? What documentation is available about the software? Is the software available in executable code? Several components, such as the operating system and other computing requirements for the model, can be included here. Example: Fortran source programs

11. Hardware. What is the hardware environment of the model? Example: PC

12. Person/group who proposed the original model or authored the paper, etc. Is there an original mathematical, computational, scientific model? Is there a theory? Whose? Who did the work? Is it possible to identify original models that continue to be revised, modified, and updated, or for which alternative solutions are proposed? What bibliographic relationships exist between these models? Example: Box-Jenkins, Streeter-Phelps

13. Discipline. What major discipline can be determined for the model, either by creator affiliations or other means? The disciplinary facet is an important one for promoting cross-disciplinary collaboration and information retrieval about models. Example: hydrology

14. Replication. Has the model been replicated? This facet is related to the one about the person/group who did the original model. Model replications are important reports for continued model modification and use. Identifying varying types of replications may be useful too. Example: IsReplicatedBy
15. Related materials. What types of other related materials exist about this model? Example: IsAnalysisOf

Much more work is needed before we regard this as the definitive list of scientific model properties and relationships to be used for information retrieval in a tool. But this provides a good beginning for initial prototype design and subsequent user study.

Information retrieval of scientific models:

An informal and small survey of the information retrieval and use problems associated with constitutive models, as currently represented in a library catalog, periodical indexes, a bibliographic utility, and the WWW, provided preliminary evidence for reconsidering the cataloging of scientific models and investigating scientific models as works. The following tables demonstrate the representation of, and problems associated with retrieving, information resources about models in four sampling frames: index, utility, catalog, and the World Wide Web (WWW). The survey focused on two types of terms: those advised by expert users and those selected from the LCSH. Controlled vocabulary terms from the LCSH include Atmospheric models, Hydrologic models, and Mathematical models. The user-suggested term is constitutive models. Constitutive models are an important class of models in civil and rock engineering; they are based on constitutive equations, relations, and laws. Engineering faculty at the University of Arizona suggested this as a class of scientific models that needed better information retrieval for novice learners: senior undergraduate and graduate level engineering students.

Table 1 shows the search terms used, the sampling frame where the search was conducted, the number of information items (hits) retrieved, and the dates of the search on a particular class of scientific models. Appendix B provides the notes about the selected sampling frames.

Table 1: Representation and Number of Resources about Constitutive Models
(Search Terms and Search Strategy | Sampling Frame | Number of Hits & Date of Search, in parentheses)
Constitutive models - Subject | OCLC | 9 (Nov. 2001)
Constitutive models - Subject | Sabio | 0 (Nov. 2001)
Constitutive models - Google | WWW | 35,200 (Nov. 2001)
Constitutive relations - Subject | OCLC | 2 (Nov. 2001)
Constitutive relations - Keyword | OCLC | 33 (Nov. 2001)
Constitutive relations - Subject | Sabio | 0 (Feb. 2002)
Mathematical models - Subject | OCLC | 130,141 (Nov. 2001)
Atmospheric models - Subject | OCLC | 2,018 (Feb. 2002)
Atmospheric models - Subject | Sabio | 1 (Feb. 2002)
Hydrologic models - Subject | OCLC | 1,075 (Feb. 2002)
Hydrologic models - Subject | Sabio | 95 hits with 11 subject entries (Feb. 2002)
Models - Subject | OCLC | 179,717 (Nov. 2001, Feb. 2002)
Models - Subject | Sabio | 447 subject entries (Feb. 2002)
Models - Keyword | OCLC | 241,349 (Feb. 2002)
Collocation, the bringing together of all ‘works’ on a particular subject, especially if we extend the notion to scientific models, is therefore complicated and made extremely difficult or incomprehensible for new students and people unfamiliar with the discipline or topic. For example, bibliographic records for models on runoff reveal nothing about objects such as water, hydrologic bodies, or processes, such as saturation and infiltration. When scientific models are cataloged with controlled values for objects, phenomenon, processes as parts different types of subjects, collocation is facilitated. Furthermore, current cataloging and indexing practices of models assigned one or two ‘subjects’ and not as creative artifacts, as works, reinforces the often-held views of non-scientific thinking; namely, that scientific models are physical objects, mechanical devices, or physical scale models only. 15 Table 2: Subject Information Retrieval (IR) & Constitutive Models Name of Database Name of Thesaurus Subject Terms Used and Their Narrow Terms Narrower Terms Subdivisions Notes OCLC Library of Congress Subject Headings Engineering models Mathematical models Acoustic models Beggs method Electromechanical analogies Hydraulic models Bridges, Concrete – Models Mathematical models – Construction Industry Neither constitutive models nor constitutive relations are preferred (used) subject headings. OCLC Local subject headings Constitutive models Used as local subject headings. INSPEC Inspec Thesaurus Models Over 30 such as Band theory models Exchange models Brain models Elementary particle interaction models Quark models Sandpile models None Neither constitutive models nor constitutive relations are preferred (used) as subject descriptors EI COMPENDEX Compendex Thesaurus Models is not a descriptor; nor is constitutive models Note: A search from abstract/title/subject pulled up 6605 hits but majority of the hits picked the phrase from the title “constitutive models” is a common phrase in titles; possibly useful for index displays (using subject or other models relationships criteria) for IR that enlightens novice users Materials Science (Cambridge Scientific) Abstracts (MSA) Copper Thesaurus, Engineered Materials Thesaurus, NASA Thesaurus, Metallurgical Thesaurus As above However 250 hits were retrieved for a search in title on constitutive models Many of the hits appeared to be chapters in books. 16 Table 3 shows descriptive bibliographic and brief subject information for the items about constitutive models that were found in OCLC. Table 3: Information Resources in OCLC about constitutive models OCLC # Type Language Date Format/Form* # of Subject Headings 4552889 Book English 1998 /Ph.D. Thesis 14 45523973 Book English 1997 /Report 11 42890758 Book English 1998 Microfom /Technical Memo 11 33345127 Book English 1986 Microform/ 7 33188617 Book English 1986 Micorform/ Symposium proceedings 6 32866588 Book English 1988 Microform/ Report 7 32617094 Book English 1989 Microform/ Report 7 32115587 Book English 1989 Microform/ Conference proceedings 4 25003695 Book English 1990 Microform/ Technical Memo 11 *Format is what OCLC calls them sometimes; form is my analysis. Therefore, reading row one, we find that format is not given (hence just book or printed text) and form is a Thesis. Using Tables 1-3, a partial list of information retrieval problems for scientific models can be deduced as follows: 1. 
For information resources and packages, such as books, theses, dissertations, and reports, in library catalogs, the preferred controlled vocabulary scheme is LCSH. LCSH uses a variety of subdivisions to enable the cataloger to assign subject headings; these include topical (other topics), form, time, and geography subdivisions. Smaller information packages such as journal articles, conference proceedings articles are covered by bibliographic indexing services and include indexing and abstracting sources. 17 For these, the preferred controlled vocabulary it can be seen varies from index to index. INSPEC uses the INSPEC thesaurus. EI COMPENDEX and MSA use their own home- brewed or multiple other schemes. Therefore, knowing the right vocabulary to search for constitutive models, or indeed any kind of scientific modeling, is a major problem. 2. Most information resources about models are theses, dissertations, and government or agency reports. These items are often the least cataloged in libraries and subject analysis is often cursory, maybe not even done. Yet, these are probably some of the richest resources about constitutive laws, equations and models. Table 3 is interesting in that it shows these types of resources richly cataloged in OCLC with local and controlled subject headings ranging from four to fourteen. 3. Subject cataloging principles such as specific, direct entry work only when there are specific subject entries and as we have seen there is no subject heading for “constitutive models.” Even when direct entries are available such as mathematical models, simulation methods under which objects can be added as subdivisions the resulting sort criteria of date and publication type become meaningless in the context of actual use of scientific models. Table 3 provides one example of the lack of usefulness of current categorizations in the utility between type, form, and is probably illustrative of cataloging confusion about form and format (see columns Type and Format/Form). 4. With the increasing scientific activity of modeling, current descriptions of textual content about models are incomplete. It appears that the subject headings for texts about models do not provide much information on the underlying laws, processes, phenomena, type of model, computational requirements or mathematical functions. Yet, these are critical factors in disciplinary teaching, use, and research of models. They should be 18 included. Information retrieval may be improved if we consider scientific models as works. Scientific models as works and their bibliographic families: Are scientific models works? Smiraglia provides an operational definition for a work.17 “A work is the intellectual content of a bibliographic entity; any work has two properties: a) the propositions expressed, which form ideational content and b) the expressions of those propositions (usually a particular set of linguistic (musical, etc.) strings) which form semantic content.” Using this definition, scientific models most certainly can be considered as works. Semantic content in scientific models includes mathematical expressions, formal propositions and hypotheses, and statements of laws. Ideational content includes ideas about objects, processes, and relationships, usually within or for specified spatial and temporal scales, and formally, semantically expressed as mathematical equations and algorithmic notation. 
The ideas include both observables (verified and expressed as measurements) and non-observables (hypothetical data, mathematical equations). MUSE researchers reinforce this view in their statement that “a scientific model is a set of ideas that describes a natural process” and that various “types of entities, namely representations, formulae, and physical replicas” are sometimes needed in the formation of scientific models.18 Works are bibliographic entities. Examining scientific models as bibliographic entities, we find that they have two properties, physical and conceptual. The physical 19 components of a scientific model can be determined in terms of its form (what the instantiation is): 1. Textual works – includes articles, abstracts, bibliographies, reviews, analysis, software documentation. 2. Datasets – includes observations and measurements of the observed phenomenon, object, process reported as data, images, visualizations, and graphs. 3. Software – includes computer code, both source code and downloadable executables. 4. Services – includes interactive and other services (animation applets, databases, indexes, contact pages, submit forms, etc.) Conceptual components can be determined in terms of the ideas the model expresses. Even more than just the ideas, the ideational (subject + other) relationships are important in modeling. Hence, conceptual components can also be called model concepts and relationships. They include: 1. Research foci - What is being modeled, or the object(s) being studied? Wind tunnels? Sediment? River? Are objects related? How? 2. Model type - What is the purpose of modeling – explain, predict, simulate, test? 3. Mathematic functions - This is probably the most complicated idea to abstract. The simplest mathematical submodel needs at least three types of variables and a set of operating system characteristics linking them. The three sets of variables are: input variables, status variables (the internal mathematical constant), and the 20 output variables (which depends on both input and status variables). Model strategy, the fitting and testing of the model by choosing the type of mathematical operations most suitable to the type of system one is trying to model. Finally, developing the algorithmic (computational notation) and testing the model to see if it works as planned before using it for prediction, etc. 4. Instrumentation - What relationships exist between observables, data collected, conditions, and instruments used to gather or generate data. 5. Fundamental theory, law, or hypotheses that drives the model. 6. Replication, revision, simulation and continued improvements, modification. Examples of two prototypical scientific models investigated as works are the Bernoulli Model and the NASA GISS 1999 Atmosphere Ocean Model. Bernoulli Model The Bernoulli family in Switzerland was one of the most productive families in the field of mathematics in the seventeenth and eighteenth centuries. Table 4 shows the names of three generation of Bernoullis who have contributed to the literature of fields such as calculus and fluid mechanics. Searching through the library catalog, it is hard for the novice learner to clearly identify how the various concepts, which bear Bernoulli names, are related. There are Bernoulli numbers, Bernoulli equations, Bernoulli models, Bernoulli principles, Bernoulli law and the Bernoulli theorem. 
These have all traditionally been thought of as 'subjects' and cataloged in terms of the various physical formats these ideas were packaged in. There is only one LCSH descriptor Bernoulli 21 Numbers. This was used along with other terms to identify what, if any, items existed on the Bernoulli models. Can a bibliographic family be identified? Is there a relationship between the Bernoulli model and the Bernoulli equations that are an application of Bernoulli’s Law and used in elementary physics learning for various things like curving baseballs and aerodynamic lift? What is the source of the Bernoulli model? A bibliographic family is a “set that includes all texts of a work that are derived from a single progenitor.”19 It is therefore, “the tangible, and to some extent quantifiable, instantiation of the mutability of works.” Bibliographic families usually are created by derivative relationships: one source is the progenitor. Derivative relationships are further classified simultaneous, successive, translations, amplifications, extractions, adaptations, performances.20 Tillett’s taxonomy of bibliographic relationships identifies other types of relationships besides derivative; the seven posited by Tillett and including derivative are equivalence, descriptive, whole-part, accompanying, sequential, and shared characteristic relationships.21 Other bibliographic relationships that the library catalog tries to address include access point relationships and subject relationships, but Tables 4 through 6 try to show that these relationships are not clearly specified or made obvious to the user in current bibliographical tools. Table 4 tries to show the searches to identify bibliographic families. It shows the term used, the type of search, and the number of hits the search retrieved in three sampling frames, OCLC (a bibliographic utility), Sabio (the University of Arizona Online Public Access Catalog), and the WWW (using the Google search engine). Sampling Frames & Dates of Search are: OCLC: 01/07/02; Sabio: 01/31/02; WWW: 01/31/02. Terms in Boldface type are Library of Congress subject headings. 22 Table 4: Bibliographic Families Search Term Type of Search Number of Hits Retrieved In OCLC | SABIO | WWW Bernoulli keyword in Basic Search 604 80 125,000 Bernoulli author in Advanced Search 648 43 N/a Bernoulli, Daniel author in Advanced Search 76 8 8,940 Bernoulli, Johann author in Advanced Search 44 2 (see entries to Bernoulli Jean) 4,860 Bernoulli, Jakob author in advanced search 107 3 2,530 Bernoulli, Jean author in advanced search 133 5 6,550 Bernoulli law keyword in (Basic search) 18 2 19,700 Bernoulli equations subject words in Advanced search 4 0 21,800 Bernoulli numbers Subject words in Advanced Search 21 2 subject search 22,800 Note: Bernoulli, Johann, 1667-1748 is same and entered in the library catalog as Bernoulli, Jean. Table 5 shows subject headings and disciplines in selected records. OCLC# with a d1 or d2 indicates that they are bibliographic records for the same copy; Call number, if any indicates the discipline to which the items has been assigned (physical or online shelf browse number); SH stands for subject heading that is found in the record and the number indicates the position in the list of subject headings if an item had more than 1 23 subject heading. Subject headings in OCLC records for records with Bernoulli Numbers are indicated by an X. 
Therefore, reading the first row, OCLC record #40937076 has another record for the same item (the line below), no call number, and three subject headings, with the subject heading Bernoulli Numbers in the third position.

Table 5: Subject & Discipline Relationships

OCLC #        | Call Number  | SH 1                                 | SH 2                                           | SH 3
40937076 (d1) | None         | Education Research Methodology       | Linear Models (statistics)                     | X
45340799 (d2) | 150          | Education Research Methodology       | Linear models (statistics)                     | X
34844313      | LB 1861 .C57 | X                                    | Numerical functions                            | -
37561894      | 515.5        | Numerical functions                  | X                                              | Euler numbers
34594846      | QA 8.4       | X                                    | Dissertations, academic, Mathematical sciences | -
33001372      | T171.G45x    | X                                    | -                                              | -
36246557      | Q 172.5      | Chaotic behavior in systems          | X                                              | -
48189946      | -            | X                                    | Graph Theory                                   | -
25099001      | -            | Bernoulli Numbers - Bibliography     | -                                              | -
12220874      | -            | Euler's Numbers                      | Bernoulli shifts                               | -
15817046      | QA 279.5     | Bayesian statistical decision theory | Bernoullian Numbers                            | Numerical functions, Computer programs
11107209      | QA 55        | X                                    | -                                              | -
41382818      | QA 246       | X                                    | Fermat's theorem                               | -
42909419 (d1) | QA 246 .F3   | X                                    | -                                              | -
41780960 (d2) | QA 246 .F3   | X                                    | -                                              | -
41777351      | -            | X                                    | Series                                         | Equations
3238858 (d1)  | QA 246 .S2   | X                                    | -                                              | -
48510451 (d2) | QA 246 .S2   | X                                    | -                                              | -
05616123 (d1) | QA 246 .S73  | X                                    | -                                              | -
05616122 (d2) | QA 246 .S73  | X                                    | -                                              | -
43307081      | QA 306 .S3   | Calculus, differential               | Mathematics, Dissertation, Germany, Berlin     | X

Table 6 tries to identify derivative and equivalence relationships, and while this appears straightforward in some cases, it is not always so. It is also not clear which of the various types of bibliographic relationships are found in scientific models.

Table 6: Bibliographic Relationships Matrix

OCLC # | Type (Format) | Language | Publisher or Place of Pub | Date of Pub | Form of Pub | Discipline & Sub-Discipline | Derivation Type | Equivalence Relationship
40937076 (d1) | Book | English | Michigan State University | 1998 | Ph.D. Thesis | Educational Psychology | New | Shares an equivalence relationship with below (copy)
45340799 (d2) | Book | English | As above | 1998 | Ph.D. Thesis | 150 | As above | As above
34844313 | Book | English | Eastern Illinois University | 1996 | M.A. Thesis | LB1861.C57 | New | None
37561894 | Book | German | Aachen | 1995 | - | 515.5 | Unknown | Unknown
34594846 | Book | English | Tennessee State University | 1995 | M.S. Thesis | QA 8.4 | New | None
33001372 | Book | English | Georgia Institute of Technology | 1995 | M.S. Thesis | T171.G4 | New | None
36246557 | Book | English | University of Rhode Island | 1995 | M.S. Thesis | Q172.5 | New | None
48189946 | Microform | English | Georgia Institute of Technology | 1992 | Ph.D. Thesis | Unknown | New | None
25099001 | Book | English | Kingston, Ontario (Queen's University) | 1991 | Bibliography | QA3 (510) | Revision | Updated bibliography is available on the WWW
12220874 | Microform | English | NASA (Institute for Computer Applications in Science and Engineering) | 1984 | (NASA contractor) Report | Unknown | New | None
15817046 | Book | French | Ecole Polytechnique de Montreal | 1977 | Report | 519 Mathematics | New | None
11107209 | Book | English | Lamar State College of Technology, Texas | 1968 | M.S. Thesis | QA 55 | New | None
41382818 | Book | English | Philadelphia | 1936 | Society Report | Philosophy | New | None
42909419 (d1) | Book | English | Unknown | 1925 | Extracted from Messenger of Mathematics (July 1925) | QA 246 | New? | Feinler Mathematics collection; shares equivalence relationship with below (copy)
41780960 (d2) | Book | English | Unknown | 1925 | Extract as above | QA 246 | As above? | Equivalence relationship with above
41777351 | Book | English | London | 1914 | Extracts from Quarterly Journal of Pure and Applied Mathematics (no. 181) | Unknown | New? | Unknown
3238858 (d1) | Book | German | Springer | 1893 | Photocopy | QA 246 | Unknown? | Equivalence relationship with below (copy)
48510451 (d2) | Book | German | Springer | 1893 | Electronic reproduction | QA 246 | Unknown? | Equivalence relationship with above
05616123 (d1) | Book | Latin | Unknown | 1977 reprint of 1845 edition | Reprint of 1845 edition | QA 246 | Edition? | Equivalence relationship with below + (bound with another text) (copy)
05616122 (d2) | Book | Latin | Unknown | As above | Reprint as above | As above | As above? | Equivalence relationship (bound with another text)
43307081 | Book | Latin | Berolini | 1823 | Dissertation | QA 306 | New | None

Metadata for Scientific Models:

For purposes of clearly revealing only the important model concepts and relationships that need to be represented and further investigated, the metadata in the examples below are kept very simple. For example, Dublin Core is used as the content standard, but a corresponding encoding scheme is not shown. Additionally, metadata for the English translations of the original works (in the case of the first example, the Bernoulli model) are not included. The facets are the one deviation from the standard Dublin Core elements, and again the purpose is to explore and demonstrate how analytical cataloging and faceted classification strategies can be combined to improve bibliographic control and information retrieval in ways that reveal disciplinary structures. Index displays based on controlled values may be derived semi-automatically; the dc/type and dc/relation elements and the facets are the key to representing disciplinary and other structures of models. They will be described and demonstrated in a subsequent article and through the models prototype database, a classification-based catalog that is currently under development.

Elaborations about the DC elements as used in this study are provided below; only deviations or elaborations are included, and elements used consistently with DC 1.1 are omitted. These are DC-Description, DC-Contributor, DC-Date, DC-Format, DC-Language, and DC-Rights.22

* DC-Identifier: the Universal Resource Identifier, the unambiguous reference identifier used for each of the items that make up the work
* DC-Title: title (either cataloger assigned or the creator's title)
* DC-Creator: author or other authority
* DC-Subject: not used in new records; will be kept if found. Instead, aspects or facets are proposed, and many different ones are identified; in keeping with the DC content standard model, facets are optional and repeatable. Currently, standard classification schemes and thesauri like the ACM's Computing Classification System, the American Mathematical Society's Mathematical Subject Classification, the GEOREF Thesaurus, and the ERIC Thesaurus can be used; but in the larger study they are being investigated and compared, and will be drafted into a simpler models classification scheme based on the facets below.
* FacetConcept: an idea, the traditional subject (for example, calculus of variations)
* FacetObject: the object studied in the model
* FacetDiscipline: the major discipline to which this model belongs (may be determined through author affiliations or other means)
* FacetPhenomenon: the phenomenon being modeled
* FacetProcess: the process being modeled
* FacetMathRepresentation: the mathematical functions and equations used
* FacetSoftware: the software needed to run the model
* FacetFunLaw: the fundamental laws that the model is based upon
* FacetModelType: the type of model, based on its purpose
* FacetVariable: the number, types, and conditions of the variables in this model
* FacetProblem: the problem the model is analyzing, often stated as a question
* DC-Type: the type of resource is taken to be its form; until the models classification is fully developed, the model semantic unit and modified LCSH form subdivisions are shown as placeholders
* DC-Relation: various types of bibliographic and model relationships are used to demonstrate work linkages, and they are not limited to the ones in the DC Qualified list
* DC-Coverage: a distinction is made between spatial and temporal coverage, so there are DC-CoverageSpatial and DC-CoverageTemporal

Figure 1: Bernoulli Model – Metadata for sources + two members of the bibliographic family

dc/identifier: http://foo.bar.org/arsconjectandi
dc/title: Ars conjectandi
dc/creator: Bernoulli, Jakob
facetobject: Bernoulli numbers
dc/description
dc/publisher
dc/date: 1713
dc/type: textual works
dc/format: text/html
dc/source: Work of Johann Faulhaber
dc/relation: IsRelatedTo
dc/identifier: http://foo.bar.org/workofjohann

Bibliography
dc/identifier: http://foo.bar.org/bibliography
dc/title: Bibliography of Jakob Bernoulli
dc/creator:
facetconcept: Bernoulli numbers
facetmathfunction:
facetperson: Bernoulli, Jakob
dc/description
dc/date
dc/type: textual works (Bibliography)
dc/format:
dc/source:
dc/language:
dc/relation: IsUpdateOf
dc/identifier: madeupisbnnumber

dc/identifier: http://foo.bar.org/workofjohann
dc/title: Work of Johann Bernoulli
dc/creator: Bernoulli, Johann
facetobject: Bernoulli numbers
dc/description:
dc/publisher:
dc/date: 1742
dc/type: textual works
dc/format: text/html
dc/relation: IsRelatedTo
dc/identifier: http://foo.bar.org/arsconjectandi

Abstract
dc/identifier: http://foo.bar.org/abstract
dc/title: The significance of Bernoulli's Ars conjectandi
dc/creator: Shafer, Glenn
facetconcept: series, infinite
dc/date
dc/type: textual works (Abstract)
dc/format: text/html
dc/source:
dc/language:
dc/relation: IsAnalysisOf
dc/identifier: http://foo.bar.org/arsconjectandi
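As noted above, the records use Dublin Core as the content standard but no corresponding encoding scheme is shown. Purely as an illustration of one possible encoding, the sketch below renders the first Figure 1 record as simple XML-style elements generated from a Python dictionary. The element and facet names are taken from the record above; the serialization itself, and the dc: prefix convention, are assumptions of this sketch rather than part of the study.

```python
# One possible rendering of a Dublin Core + facet record as simple
# XML-style elements. The serialization is illustrative only; the
# study does not prescribe an encoding scheme.
from xml.sax.saxutils import escape

record = {
    "dc/identifier": "http://foo.bar.org/arsconjectandi",
    "dc/title": "Ars conjectandi",
    "dc/creator": "Bernoulli, Jakob",
    "facetobject": "Bernoulli numbers",
    "dc/date": "1713",
    "dc/type": "textual works",
    "dc/format": "text/html",
    "dc/source": "Work of Johann Faulhaber",
    "dc/relation": "IsRelatedTo http://foo.bar.org/workofjohann",
}


def to_elements(rec):
    """Turn 'dc/identifier' style keys into <dc:identifier> elements;
    facet keys keep their plain names."""
    lines = ["<record>"]
    for key, value in rec.items():
        tag = key.replace("dc/", "dc:")
        lines.append(f"  <{tag}>{escape(value)}</{tag}>")
    lines.append("</record>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_elements(record))
```

Running the script prints a record element with one child element per field; a qualified or RDF-based encoding could be substituted without changing the record content.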
Figure 2: Bernoulli Model – Metadata for Other Forms, Members of the Bibliographic Family. This figure shows one added element under consideration, audience (the appropriate audience level for the resource).

Exercise
dc/identifier: http://www./mtn-bern.pdf
dc/title: Numerical, graphical and symbolic analysis of Bernoulli equations
dc/creator: Bern, David
facetmathfunction: differential equations
dc/description:
dc/date: 2001
dc/type: textual works (Exercise)
dc/format: text/html
dc/relation: IsProblem
dc/identifier: http://foo.bar.org/bernulliequations
dc/relation: IsSupplementTo
dc/identifier: http://foo.bar.org/computercode
dc/audience: 9-12 (US)
dc/typical learning time: unknown
dc/coverage: not applicable
dc/rights: none

Sample Metadata for Atmosphere Ocean Model:

dc/identifier: http://aom.giss.nasa.gov/index.html
dc/title: Atmosphere Ocean Model
dc/creator: NASA GISS AOM Group
facetdiscipline: Meteorology
facetobject: sea ice
facetphenomenon: climate
facetprocess: river flow
facetprocess: advection
facetprocess: atmosphere-ocean interactions
facetprocess: insolation
facetmathfunction: atmospheric mass equations
facetmathfunction: differential
facetsoftware: fortran source
facetsoftware: pc executables
facetfunlaw: mass
facetmodeltype: climate predictions
facetmodeltype: grid-point model
facetvariable: ocean entropy
dc/type: interactive service
dc/relation: References
dc/identifier: http://aom.giss.nasa.gov/publicaitons.html
dc/relation: ModelCode
dc/identifier: http://aom.giss.nasa.gov/code.html
dc/relation: Observations
dc/identifier: http://aom.giss.nasa.gov/observe.html
dc/relation: Personnel
dc/identifier: http://aom.giss.nasa.gov/people.html
dc/relation: Simulation
dc/identifier: http://foo.bar.org/12yearruns of the control simulation of 1995 version of CO2
dc/coveragespatial: global
dc/coveragetemporal: decade

As mentioned, the example metadata shown in Figures 1-3 continue to be under development for the models prototype database/classified catalog. The database is scheduled for completion by the end of the year 2002. There have been several other well-known efforts to catalog materials for learning, including the development of a content standard for computational models.23 However, many of these efforts fail to consider scientific models as works, instantiations with many entities and relationships, and hence are not useful attempts to improve information retrieval beyond identification and location. This study is trying to do more; it attempts to map structures for scientific models using the bibliographic definition of works.
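Because the records above point at one another through paired dc/relation and dc/identifier values, the work-level structure they describe can be assembled mechanically. The sketch below is illustrative only and is not the prototype database mentioned above: it gathers typed links from records like those in the figures into a simple adjacency list that a catalog display could traverse. The record data is abridged from the Bernoulli records shown earlier, and identifiers such as http://foo.bar.org/... are the figures' own placeholders.

```python
# Sketch: collect typed dc/relation links from metadata records into
# an adjacency list, so members of a bibliographic family can be
# traversed together. Illustrative only; abridged from the figures.
from collections import defaultdict

records = [
    {"id": "http://foo.bar.org/arsconjectandi",
     "relations": [("IsRelatedTo", "http://foo.bar.org/workofjohann")]},
    {"id": "http://foo.bar.org/abstract",
     "relations": [("IsAnalysisOf", "http://foo.bar.org/arsconjectandi")]},
    {"id": "http://foo.bar.org/bibliography",
     "relations": [("IsUpdateOf", "madeupisbnnumber")]},
]


def build_graph(recs):
    """Map each identifier to its outgoing (relation, target) links."""
    graph = defaultdict(list)
    for rec in recs:
        for relation, target in rec["relations"]:
            graph[rec["id"]].append((relation, target))
    return graph


if __name__ == "__main__":
    for source, links in build_graph(records).items():
        for relation, target in links:
            print(f"{source} --{relation}--> {target}")
```

Even this much structure would let a catalog group the abstract, bibliography, and source text of Ars conjectandi as one bibliographic family rather than as three unrelated hits.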
Conclusion:

Heckhausen, in the 1970s, discussed the growth of computer models and modeling as the analytical tools of a discipline.24 However, models are products of scientific research, the creative artifacts reflecting the intellectual structures (the mapping of relationships between objects and processes through bibliographic entities) of the discipline and the researcher. The activity of modeling conceals an incredible number of intellectual relationships that traditional bibliographical tools (primarily the catalog and the index) neither capture nor describe from the texts, documents, and items about models. Should our tools do so? Yes. Our increasing awareness of the conceptual and textual instability of electronic forms requires active investigation and experimentation with other types of knowledge organization and representation structures. Additionally, decades of research in both information retrieval and information seeking behavior, complemented by the widespread success of Internet search engines, have shown us that users tend to disregard Boolean searches, that human indexing does not improve search performance significantly over machine indexing, and that users want a few relevant, good materials. Assessing relevance in terms of disciplinary structures has never been researched. Finally, access to published literature alone is insufficient, and the continued segregation of our tools, the catalog distinct from the index, is unproductive and outmoded as the goals of knowledge management merge symbiotically with information retrieval.

This paper has tried to show that scientific models may be described more meaningfully for information seeking purposes, such as exploratory learning, if they are considered as works. Dublin Core was used as the content standard for the examples of models cataloged as works in the paper, and aspects of models and significant relationships continue to be investigated. Appendix D provides a brief graphical illustration of the status of scientific models as works and the research questions that should be empirically investigated.

Acknowledgements:

This paper benefited greatly from the comments of three anonymous reviewers who approved the funding for this models classification study. I also thank my former student, Olha Buchel, for many meaningful discussions on the cataloging of models for information retrieval that supports learning tasks and for her support in database and catalog design. Thanks to Professor Muniram Budhu, Dept. of Civil Engineering, Dr. William Rasmussen, Dept. of Agriculture and Hydrology, and Dr. John Kemeny, Dept. of Rock and Mining Engineering, for their enthusiastic participation and support of the informal survey of models in the engineering disciplines. Thanks to Professor Smiraglia for writing 'Works' and for all his helpful feedback. They have helped to focus this article. This study is supported by the University of Arizona, Office of the Vice President for Research, Faculty Small Grant, 2001-2002.

Glossary

Note: Many of the terms in this glossary are from Smiraglia (see Notes). Exceptions are noted. Definitions by the author are indicated by an asterisk.

Bibliographic Entity: A bibliographic entity is a unique instance of recorded knowledge (e.g., a dissertation, a novel, a symphony). Each bibliographic entity has two properties, physical and conceptual, and a containing relationship exists between the two properties. All recorded and published representations of scientific models are bibliographic entities (including datasets, observational data, and instrument- and model-generated data).

Bibliographic Family: The set of all works that are derived from a common progenitor.

Bibliographic Relationships: A bibliographic relationship is an association between two or more bibliographic items or works. Four types of relationships are possible in the library catalog: bibliographic, access point, name, and subject relationships. Source: Tillett (see Notes). See also Tillett Taxonomy of Bibliographic Relationships and Smiraglia Taxonomy of Derivative Bibliographic Relationships.

Derivative relationships: Bibliographic families usually are created by derivative relationships: one source is the progenitor. See Smiraglia Taxonomy of Derivative Relationships.

Form: Form subdivisions indicate what an item is rather than what it is about.
Items can be assigned form subdivisions because of:
* Their physical character (e.g., photographs, maps)
* The particular type of data that they contain (e.g., bibliography, statistics)
* The arrangement of information within them (e.g., diaries, indexes)
* Their style, technique, purpose, or intended audience (e.g., romances, popular works)
* A combination of the above (e.g., scores)
Source: "LCSH: Subject Cataloging Manual" [CD-ROM], The Cataloger's Desktop (Washington, D.C.: CDS, 2001).

Item: The physical property, the container that is the package for the intellectual part of the bibliographic entity.

Progenitor: A progenitor is the first instantiation of a work, the source that has an original idea. This starts the propagation of other texts, items, and documents.

*Scientific model: A class of models with mathematical and computational properties, which can be considered to be works. Like works, scientific models have semantic content and ideational content. The following types of relationships exist in scientific models: 1) bibliographic (for example, parent-child); 2) access point (not discussed in this paper); 3) name relationships (limited to model personal authors and informal/formal organizations and groups only); 4) subject relationships (to be determined for facets such as concept, object, process, phenomenon, discipline, mathematical representation, software, purpose, and coverage). The sum of these subject relationships is referred to as the ideational relationships in works, or as scientific model relationships, and can be expressed through controlled value lists and classification schemes for improved information retrieval.

Smiraglia Taxonomy of Derivative Relationships: Derivative relationships are simultaneous, successive, translations, amplifications, extractions, adaptations, and performances. See Smiraglia (Notes) for complete definitions.

Text: A text is the set of words that constitute writing. Includes textual works.

Tillett Taxonomy of Bibliographic Relationships: Bibliographic relationships include derivative, equivalence, descriptive, whole-part, accompanying, sequential, and shared characteristic relationships.

Services: Interactive and reference services, electronic.

Work: A work is an abstract entity; there is no single material object one can point to as the work. We recognize the work through individual realizations or expressions of the work, but the work itself exists only in the commonality of content between and among various expressions of the work. When we speak of Homer's Iliad as a work, our point of reference is not a particular recitation or text of the work, but the intellectual creation that lies behind all the various expressions of the work. Source: International Federation of Library Associations, Study Group on the Functional Requirements for Bibliographic Records, Functional Requirements for Bibliographic Records (München: K.G. Saur, 1998), 16-17.

Appendix A: Subject heading for Models in LCSH

Models and modelmaking (May Subd Geog)
[TT154-TT154.5 (Handicraft)]
UF Model-making
   Modelmaking
BT Handicraft
   Manual training
   Miniature objects
RT Modelmaking industry
   Simulation methods
SA subdivision Models under types of objects, e.g. Automobiles--Models; Machinery--Models; and phrase headings for types of models, e.g.
   Wind tunnel models
NT Architectural models
   Atmospheric models
   Engineering models
   Geological modeling
   Geometrical models
   Historical models
   Hydraulic models
   Hydrologic models
   Mannequins (Figures)
   Matchstick models
   Miniature craft
   Models (Patents)
   Patternmaking
   Relief models
   Ship models
   Surfaces, Models of
   Wind tunnel models
   Zoological models
--Motors
--Radio control systems (May Subd Geog)
   [TT154.5]
   BT Citizens band radio
      Radio control
Models and modelmaking in literature
Models in art USE Artists and models in art
Models of surfaces USE Surfaces, Models of

Appendix B: Qualitative Analyses: Excerpts (classification of models in water quality and geography)

Five books in the area of water quality, three in the area of physical and human geography, and two thesauri (GEOREF and Water Abstracts) were analyzed to see how models are represented and what vocabularies/classification schemes exist for scientific models in a specific area like water quality and in a broad discipline like geography. Excerpts are presented from the analysis of the books only. The entire text of each book, including the tables of contents, prefaces, selected chapters, and indexes of the five books, was scanned for classifications, names of models, objects, processes, phenomena, and laws (only classifications and named models are shown here). Here are the excerpts:

Text #1: An Introduction to Water Quality Modelling. Edited by A. James (Chichester: John Wiley, 1984). Audience for book: Beginner. Based on courses in a Civil Engineering Department introducing water quality modeling to scientists and engineers in water pollution control.
Classification Notes: Provides a classification of water models (see figure below).
[Figure: Water Quality Models, branching into optimization, simulation, and computer-aided design]
Models of Water Quality on Rivers:
1. BOD/DO models by Streeter and Phelps in the 1920s
2. Fick's Laws of Diffusion
3. River models
4. Lagrangian models
5. Moving segment models
6. Models for discharge
7. Dispersion model
8. Block models

Text #2: Water Resources: Environmental Planning, Management and Development (New York: McGraw-Hill, 1996). Analysis: Index, Table of Contents, Preface, Chapter 7.
Classification Notes: Provides a classification of water quality models (see figure below).
[Figure: Water Quality Models classified by representation of the mathematical model (empirical or statistical models, a.k.a. black box models; deterministic models; stochastic models), representation of time (stationary/nonstationary, static/dynamic), and representation of biological, chemical, and physical processes (kinetic, biocoenotic, and ecological models; hydraulics and quality; constituents: transformation, production, decay; organisms: primary production, consumption/secondary production, destruction/decomposition; kinetic plus food chain organisms)]
Named Models:
1. Advection-dispersion equation (Taylor, 1921 - classic cite)
2. Mass-balance modeling
3. Plug flow reactor models (Streeter & Phelps, 1925)
4. BOD/DO (Streeter & Phelps, 1925)
5. Oxygen-balance model (Streeter & Phelps, 1925)
6. Continuously stirred tank reactor (CSTR) modeling (Vollenweider, 1975)
7. Aggregate dead zone (ADZ) model
8. QUAL models (current version, Brown and Barnwell, 1987)
9. WQRRSQ models (Smith, 1986)
10. WASP (Ambrose et al., 1988)
11. MIKE 11 (Danish Hydraulics Institute)
12. Flow model
13. DRAINMOD
14. Land-subsidence model

Text #3: Ajit K. Biswas, Editor, Models for Water Quality Management (McGraw-Hill, 1981).
Classification Notes: States that most of the water quality models in use today (1981) are extensions of two simple equations by Streeter and Phelps (1925), BOD and DO. Other modelers are O'Connor, Thomann, DiToro, and Chen and Orlob.
Named models:
1. BOD/DO
2. QUAL II

Geography - Qualitative Summary:

Minshull (1967) summarizes the classification of models by geography scholars and offers a new classification:
Type I. Submodels of structure: iconic or scale, analogue, symbolic.
Type II. Submodels of function: mathematical, hardware, natural (all three can be used for 1) simulation and 2) experiment).
Type III. Submodels of explanation, or theoretical conceptual models (the causal factors of systems on the earth's surface): cause and effect models; temporal models (including process, narrative, models of time or stage, and models of historical processes); and functional models.

Minshull further proposes new labels for describing a model:
1. The nature of the model: hardware, symbolic, graphic, cartographic
2. Functions of the model: descriptive, normative, idealistic, experimental, tool, procedure
3. Form of the model: static or dynamic
4. Operational purpose of the model: to store data, to classify data, to experiment on the data
5. Stage at which the model is used: a priori, concurrent, a posteriori (theory is proposed and verified before the model is made)

Content standard for computational models (CSCM), proposed by Hill et al., D-Lib Magazine, 2001, http://www.dlib.org/. CSCM is a descriptive standard. It identifies 165 elements in ten sections for computational (scientific) models in the environmental sciences. The ten sections are:
1. Identification information
2. Intended use
3. Description
4. Access or availability
5. System requirements
6. Input data requirements
7. Data processing
8. Model output
9. Calibration efforts and validation
10. Metadata source

APPENDIX C: Notes about sampling frames

Most investigations into the nature of works have established the library catalog and the bibliographic utility as the natural sampling frames. However, bibliographic databases (periodical indexes) are also a natural sampling frame, since the definition of a work imposes no limitation upon the texts, manifestations, or other forms of derivative and other bibliographic entities that emanate from the progenitor. This appendix provides a brief description of the bibliographic databases (indexes) and their controlled vocabularies that were used in this study. The bibliographic utility OCLC is also briefly described.

Rationale for including indexes (bibliographic databases): Most information in the sciences gets outdated quickly. Articles, therefore, are timely in terms of providing information about models. Additionally, co-citation has proved valuable in identifying disciplinary intellectual structures.

OCLC: OCLC was the bibliographic utility chosen. I tried searching in both CatExpress (the cataloging service) and WorldCat (the reference service) and found both equally frustrating to use. The distinctions among subject word, subject phrase, and subject searches were blurred in many searches.

INSPEC: While OCLC was chosen as the bibliographic utility for this investigation into the nature of scientific models as works, I did not limit myself to a single indexing database, though not all of the data has been included in this paper. I searched GEOREF, Water Abstracts, EI Compendex, and the ISI indexes (these findings are only partially reported here).
INSPEC is a database that has over 5 million records, and approximately 300,000 records are added annually. Coverage includes physics, electrical engineering, electronics, telecommunications, computers, control technology, and information technology. The INSPEC thesaurus has broad subject terms under which constitutive models may be classed and indexed. They are:
1. A4630 - Mechanics of solids
2. A4660 - Rheology of fluids and pastes
3. A5100 - Kinetic and transport theory of fluids; physical properties of gases
4. A6200 - Mechanical and acoustic properties of condensed matter
5. A8140 - Treatment of materials and its effects on microstructures and properties

Once the user has browsed the thesaurus under these broad terms, she can search using a more specific term such as viscoelasticity or an even broader term such as mechanical properties. Examples of other narrower, specific terms are anelasticity, creep, cracks, deformation, and stress-strain relations. Many of these terms are very good, and we consider that INSPEC had the best indexing in terms of direct, specific entries, but it still does not identify constitutive models as a class of scientific models or as one category of works with many subject relationships, nor does it show specific bibliographic family relationships. These are critical aspects for successful information retrieval about scientific models. Additionally, these terms were established in 1995 and the field continues to change and evolve rapidly.

SABIO: This is the online public access catalog of the University of Arizona Libraries.

Appendix D: Scientific Models as Works and Future Research Questions

Model facets are indicated by arrows with dashed lines; bibliographic relationships that representations of models might possibly possess are indicated by solid arrows. Significant questions remain to be investigated. For example, if scientific models are identified as works, which bibliographic relationships do representations about model replications fit best? Can relationships in electronic resources be determined semi-automatically? What are the important, generic name and subject relationships that are present in all scientific models? How can they be used to display and sort through results in library catalogs? The ideational content of models includes identifying creator name and subject relationships. What are these?

[Figure: 'Scientific Model as work' at the center, with model facets (creator(s), software code, object, process/phenomena, mathematical function, replication) shown as dashed arrows and candidate bibliographic relationships as solid arrows. Sequential relationships: 1. sequels. Shared characteristic relationships: 1. common author, 2. common title, 3. shared language, 4. shared date of publication, 5. shared country of publication, 6. other access points. Equivalence relationships: 1. copies, 2. issues, 3. facsimiles, 4. reprints, 5. photocopies, 6. microforms, 7. electronic reproductions, 8. etc. Derivative relationships: 1. editions, revisions, translations, summaries, abstracts, digests; 2. adaptations or modifications; 3. new work based on the style or thematic content. Descriptive relationships: description, criticism, review, evaluation, annotated editions, casebooks, commentaries. Whole-part relationships: what are the component parts of scientific models? Semantic content of models: textual works, datasets, software, and services are the forms of models; is it important to identify and describe other forms?]

Notes

1 Patrick Wilson, "The Catalog as Access Mechanism: Background and Concepts," Library Resources and Technical Services 27 (1983): 7.
2 Anglo-American Cataloging Rules: 2nd Edition, 1998 Revision with 1999 Amendments [electronic edition] (Washington, D.C.: Library of Congress, 2001), Part I.
3 Op. cit., Chapter 13.
4 Patrick Wilson, "On Accepting the ASIST Award of Merit," Bulletin of the American Society for Information Science & Technology 28 (2002): 11.
5 Richard P. Smiraglia, The Nature of "A Work": Implications for the Organization of Knowledge (Lanham, Maryland: Scarecrow, 2001).
6 "Library of Congress Subject Headings," The Cataloger's Desktop [CD-ROM] (Washington, D.C.: CDS, 2001).
7 Cyril W. Cleverdon, "The Significance of the Cranfield Tests on Index Languages," Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 1991, Chicago, Illinois (Baltimore, Maryland: ACM Press, 1991).
8 Anglo-American Cataloging Rules: 2nd Edition, 1998 Revision with 1999 Amendments [electronic edition] (Washington, D.C.: Library of Congress, 2001), Glossary.
9 "model," Oxford English Dictionary, ed. J. A. Simpson and E. S. C. Weiner, 2nd ed. (Oxford: Clarendon Press, 1989), OED Online (Oxford: Oxford University Press, 4 Apr. 2000), http://oed.com/cgi/entry/00149259 [accessed 31 January 2002].
10 "model," Encyclopedia Britannica Online, http://search.eb.com/bol/search?type=topic&query=model&DBase=Articles&x=30&y=3 [accessed 31 January 2002].
11 "operations research," Encyclopedia Britannica Online, http://search.eb.com/bol/topic?eu=109277&sctn=7 [accessed 31 January 2002].
12 "model," Academic Press Dictionary of Science and Technology Online (San Diego, Calif.: Academic Press, 2001), http://www.academicpress.com/dictionary/ [accessed 31 January 2002].
13 Hans von Storch, Models in Environmental Research (Berlin: Springer, 2001), 17.
14 Earth System Science: A Closer View (Washington, D.C.: NASA, 1988).
15 ThinkerTools: The ThinkerTools Scientific Inquiry and Modeling Project [online], University of California at Berkeley, http://thinkertools.soe.berkely.edu/ [accessed 31 January 2002].
16 Modeling for Understanding in Science Education [online], University of Wisconsin-Madison, http://www.weer.wisc.edu/ncisla/muse/index.html [accessed 31 January 2002].
17 Op. cit., R.P. Smiraglia, 81.
18 Jennifer Cartier et al., "The Nature and Structure of Scientific Models" [online], National Center for Improving Student Learning and Achievement in Mathematics and Science (NCISLA), University of Wisconsin-Madison, http://www.wcer.wisc.edu/ncisla/ [accessed 31 January 2002].
19 R.P. Smiraglia, op. cit., 75.
20 Ibid., 42.
21 Barbara A. Tillett, "A Taxonomy of Bibliographic Relationships," LRTS 35 (1991): 150-158.
22 Dublin Core Metadata Element Set, Version 1.1: Reference Description [online], http://dublincore.org/dces/ [accessed 31 January 2002].
23 Linda H. Hill et al., "A Content Standard for Computational Models," D-Lib Magazine 7 (2001), http://www.dlib.org/ [accessed 31 January 2002].
24 Heinz Heckhausen, "Disciplines and Interdisciplinarity," in Interdisciplinarity: Problems of Teaching and Research in Universities (Paris: OECD, 1972), 85.

work_n2i2yvr6brgmbaalzp64qylccm ---- A dataset of systematic review updates

This is a repository copy of "A dataset of systematic review updates," White Rose Research Online, http://eprints.whiterose.ac.uk/146321/ (Accepted Version). Proceedings paper: Alharbi, A. and Stevenson, M. orcid.org/0000-0002-9483-6006 (2019) A dataset of systematic review updates.
In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 21-25 Jul 2019, Paris, France. ACM, pp. 1257-1260. ISBN 978-1-4503-6172-9. https://doi.org/10.1145/3331184.3331358

A Dataset of Systematic Review Updates

Amal Alharbi∗
King Abdulaziz University
Jeddah, Saudi Arabia
ahalharbi1@sheffield.ac.uk

Mark Stevenson
University of Sheffield
Sheffield, United Kingdom
mark.stevenson@sheffield.ac.uk

ABSTRACT
Systematic reviews identify, summarise and synthesise evidence relevant to specific research questions. They are widely used in the field of medicine where they inform health care choices of both professionals and patients. It is important for systematic reviews to stay up to date as evidence changes but this is challenging in a field such as medicine where a large number of publications appear on a daily basis. Developing methods to support the updating of reviews is important to reduce the workload required and thereby ensure that reviews remain up to date. This paper describes a dataset of systematic review updates in the field of medicine created using 25 Cochrane reviews. Each review includes the Boolean query and relevance judgements for both the original and updated versions. The dataset can be used to evaluate approaches to study identification for review updates.

KEYWORDS
Systematic review; systematic review update; test collection; evaluation

ACM Reference Format:
Amal Alharbi and Mark Stevenson. 2019. A Dataset of Systematic Review Updates. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '19), July 21-25, 2019, Paris, France. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3331184.3331358

1 INTRODUCTION

Systematic reviews are widely used in the field of medicine where they are used to inform treatment decisions and health care choices. They are based on assessment of evidence about a research question which is available at the time the review is created. Reviews need to be updated as evidence changes to continue to be useful. However, the volume of publications that appear in the field of medicine on a daily basis makes this difficult [2]. In fact, it has been estimated that 7% of systematic reviews are already out of date by the time of publication and almost a quarter (23%) two years after they have appeared [19].
A review ca♪ be updated at a♪y poi♪t after it has bee♪ cre- ated a♪d would ideally be carried out whe♪ever ♪ew evide♪ce be- comes available but the efort required makes this impractical. The ∗Curre♪tly studyi♪g at the U♪iversity of Sheield SIGIR ’19, July 21–25, 2019, Paris, France © 2019 Copyright held by the ow♪er⁄author(s). Publicatio♪ rights lice♪sed to AC℧. This is the author's versio♪ of the work. It is posted here for your perso♪al use. Not for redistributio♪. The dei♪itive Versio♪ of Record was published i♪ Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’19), July 21–25, 2019, Paris, France, https:⁄⁄doi.org⁄10.1145⁄3331184. 3331358. Cochra♪e Collaboratio♪ recomme♪ds that reviews should be up- dated every two years. Cochra♪e's Livi♪g Evide♪ce Network have rece♪tly started developi♪g livi♪g systematic reviews for which evide♪ce is reviewed freque♪tly (♪ormally mo♪thly) [7] but it is u♪- clear whether this efort is sustai♪able. The Age♪cy for Healthcare Research a♪d Quality suggests that reviews are updated depe♪di♪g o♪ ♪eed, priority a♪d the availability of ♪ew evide♪ce [15]. The process that is applied to update a systematic review is similar to the o♪e used to create a ♪ew review [6]. A search query is ru♪ a♪d the resulti♪g citatio♪s scree♪ed i♪ a two stage process. I♪ the irst stage (abstract screening) o♪ly the title a♪d abstract of the papers retrieved by the Boolea♪ search are exami♪ed. It is commo♪ for the majority of papers to be removed from co♪sideratio♪ duri♪g the abstract scree♪i♪g stage. The remai♪i♪g papers are co♪sidered i♪ a seco♪d stage (content screening) duri♪g which the full papers is exami♪ed. If a♪y ♪ew releva♪t studies are fou♪d the♪ data is extracted a♪d i♪tegrated i♪to the review. The review's i♪di♪gs are also updated if the evide♪ce is fou♪d to have cha♪ged from the previous versio♪. The scree♪i♪g stages are o♪e of the most time co♪sumi♪g parts of this process si♪ce a♪ experie♪ced reviewer takes at least 30 seco♪ds to review a♪ abstract a♪d substa♪tially lo♪ger for complex topics [22]. The problem is made more acute by the fact that the search queries used for systematic reviews are desig♪ed to maximise recall, with precisio♪ a seco♪dary co♪cer♪, while the volume of medical publicatio♪s i♪creases rapidly. Developi♪g methods to support the updati♪g of reviews are therefore required to reduce the workload required a♪d thereby e♪sure that reviews remai♪ up to date. However, previous work o♪ the applicatio♪ of I♪formatio♪ Retrieval (IR) to the systematic review process has o♪ly paid limited atte♪tio♪ to the problem of updati♪g reviews (see Sectio♪ 2). This paper describes a dataset created for evaluati♪g automated methods applied to the problem of ide♪tifyi♪g releva♪t evide♪ce for the updati♪g of systematic reviews. It is, to our k♪owledge, the irst resource made available for this purpose. I♪ additio♪, this paper also reports performa♪ce of some baseli♪e approaches applied to the dataset. The dataset described i♪ this paper is available from https:⁄⁄github.com⁄Amal-Alharbi⁄Systematic↓Reviews↓Update. 2 RELATED WORK A sig♪iica♪t ♪umber of previous studies have demo♪strated the useful♪ess of IR tech♪iques to reduce the workload i♪volved i♪ the systematic review scree♪i♪g process for ♪ew reviews, for exam- ple [3, 5, 12±14, 16, 17, 22]. A ra♪ge of datasets have bee♪ made available to support the developme♪t of automated methods for study ide♪tiicatio♪. 
Ωidely used datasets i♪clude o♪e co♪tai♪i♪g 15 systematic reviews about drug class eicie♪cy [3] a♪d a♪other https://doi.org/10.1145/3331184.3331358 https://doi.org/10.1145/3331184.3331358 https://doi.org/10.1145/3331184.3331358 https://doi.org/10.1145/3331184.3331358 https://github.com/Amal-Alharbi/Systematic_Reviews_Update co♪tai♪i♪g two reviews (o♪ Chro♪ic Obstructive Pulmo♪ary Dis- ease a♪d Proto♪ Beam therapy) [22]. Rece♪tly the CLEF eHealth track o♪ Tech♪ology Assisted Reviews i♪ Empirical ℧edici♪e [9, 20] developed datasets co♪tai♪i♪g 72 topics created from diag♪ostic test accuracy systematic reviews produced by the Cochra♪e Col- laboratio♪. A♪other test collectio♪ has also bee♪ derived from 94 Cochra♪e reviews [18]. However, ♪o♪e of these datasets focus o♪ the review updates. O♪ly a few previous studies have explored the use of IR tech- ♪iques to support the problem of updati♪g reviews [3, 11, 21]. I♪ the majority of cases this work has bee♪ evaluated agai♪st simulatio♪s of the update process, for example by łtime slici♪g" the i♪cluded studies a♪d treati♪g those that appeared i♪ the three years before review publicatio♪ as bei♪g added i♪ a♪ update [11]. A♪ excep- tio♪ is work that used update i♪formatio♪ for ♪i♪e drug therapy systematic reviews [4], but this dataset is ♪ot publicly available. To the best of our k♪owledge there is ♪o accessible dataset that focuses o♪ the problem of ide♪tifyi♪g studies for i♪clusio♪ i♪ a re- view update. The problem is subtly difere♪t from the ide♪tiicatio♪ of studies for i♪clusio♪ i♪ a ♪ew review si♪ce releva♪ce judgeme♪ts are available (from the origi♪al review) which have the pote♪tial to improve performa♪ce. A suitable dataset for this problem would i♪- clude the list of studies co♪sidered for i♪clusio♪ i♪ both the origi♪al a♪d updated reviews together with a list of the studies that were ac- tually i♪cluded i♪ each review. This paper describes such a resource. 3 DATASET The dataset is co♪structed usi♪g systematic reviews from the Cochra♪e Database of Systematic Reviews1, a sta♪dard source of evide♪ce to i♪form healthcare decisio♪-maki♪g. I♪terve♪tio♪ reviews, that is reviews which assess the efective♪ess of a particular healthcare i♪terve♪tio♪ for a disease, are the most commo♪ type of reviews carried out by Cochra♪e. A set of 25 published i♪terve♪tio♪ sys- tematic reviews were selected for i♪clusio♪ i♪ the dataset. Reviews i♪cluded i♪ the dataset must have bee♪ available i♪ a♪ origi♪al a♪d updated versio♪ (i.e. a♪ updated versio♪ of the review has bee♪ published) a♪d at least o♪e ♪ew releva♪t study ide♪tiied duri♪g the abstract scree♪i♪g stage for the update. The followi♪g i♪formatio♪ was automatically extracted from each review: (1) review title, (2) Boolea♪ query, (3) set of i♪cluded a♪d excluded studies (for both the origi♪al a♪d updated versio♪s) a♪d (4) update history (i♪cludi♪g publicatio♪ date a♪d URL of origi- ♪al a♪d updated versio♪s). 3.1 Boolean Query Ca♪didate studies for i♪clusio♪ i♪ systematic reviews are ide♪tiied usi♪g Boolea♪ queries co♪structed by domai♪ experts. These queries are desig♪ed to optimise recall si♪ce reviews aim to ide♪tify a♪d assess all releva♪t evide♪ce. Queries are ofte♪ complex a♪d i♪clude operators such as AND, OR a♪d NOT, i♪ additio♪ to adva♪ced operators such as wildcard, explosio♪ a♪d tru♪catio♪ [10]. Boolea♪ queries i♪ the reviews i♪cluded i♪ the dataset are created for either the OVID or Pub℧ed i♪terfaces to the ℧EDLINE database of medical literature. 
For ease of processi♪g, each OVID query was 1https:⁄⁄www.cochra♪elibrary.com⁄cdsr⁄about-cdsr automatically co♪verted to a si♪gle-li♪e Pub℧ed query usi♪g a Pytho♪ script created speciically for this purpose (see Figure 1). (a) Multi-line query in OVID format 1. endometriosis/ 2. (adenomyosis or endometrio$).tw. 3. or/1-2 (b) One-line PubMed translation endometriosis[Mesh:NoExp] OR adenomyosis[Text Word] OR endometrio*[Text Word] Figure 1: Example portion of Boolean query [8] in (a) OVID format and (b) its translation into single-line PubMed for- mat. This portion of the query contains three clauses and the last clause represents the combining results of clause 1 and 2 in a disjunction (OR). 3.2 Included and Excluded Studies For each versio♪ of the reviews (origi♪al a♪d updated) the dataset i♪cludes a list of all the studies that were i♪cluded after each stage of the scree♪i♪g process (abstract a♪d co♪te♪t). The set of studies i♪cluded after the co♪te♪t level scree♪i♪g is a subset of those i♪- cluded after abstract scree♪i♪g a♪d represe♪ts the studies i♪cluded i♪ the updated review. I♪cluded a♪d excluded studies are listed i♪ the dataset as P℧IDs (u♪ique ide♪tiiers for Pub℧ed citatio♪s that make it straightfor- ward to access details about the publicatio♪). If the P℧ID for a study was listed i♪ the systematic review (which accou♪ted for a majority of cases) the♪ it was used. If it was ♪ot the♪ the title of the study a♪d year of publicatio♪ were used to form a query that is used to search Pub℧ed (see Figure 2). If the e♪tire text of the title, publicatio♪ year a♪d volume of the retrieved record match the details listed i♪ the systematic review the♪ the P℧ID of that citatio♪ is used. Study title: Cli♪ical experie♪ce treati♪g e♪dometriosis with ♪afareli♪. Publication Year: 1989 Search Query: clinical[Title] AND experience[Title] AND treati♪g[Title] AND endometriosis[Title] AND nafarelin [Title] AND 1989[Date - Publication] Figure 2: Example of search query generated from title and publication year for study from Topic CD000155 [8]. 3.3 Update History Details of the date of publicatio♪ of each versio♪ (origi♪al a♪d update) are also extracted a♪d i♪cluded. 3.4 Dataset Characteristics Descriptive statistics for the 25 systematic reviews that form the dataset are show♪ i♪ Table 1. It is worth drawi♪g atte♪tio♪ to the small ♪umber of studies i♪cluded after the i♪itial abstract scree♪i♪g stage. Table 1: List of the 25 systematic reviews with the total num- ber of studies returned by the query (Total) and the num- ber included following the abstract (Abs) and content (Cont) screening stages. The average (unweighted mean) number of studies is shown in the bottom row. Note that for the up- dated review, the number of included studies in the table lists only the new studies that were added during the update. 
Original Review Updated Review Review Total Abs Cont Total Abs Cont CD000155 397 42 14 101 6 4 CD000160 433 7 6 1980 1 1 CD000523 34 6 3 18 1 1 CD001298 1384 22 15 1020 17 13 CD001552 2082 2 2 844 2 2 CD002064 38 2 2 9 1 0 CD002733 13778 30 10 6109 6 6 CD004069 951 5 2 771 9 7 CD004214 57 5 2 21 4 1 CD004241 838 25 9 193 5 3 CD004479 112 6 1 153 4 3 CD005025 1524 43 8 1309 46 4 CD005055 648 8 4 353 3 0 CD005083 462 46 16 107 9 2 CD005128 25873 5 4 5820 9 3 CD005426 6289 13 8 1413 3 0 CD005607 851 11 7 103 2 1 CD006839 239 8 6 93 3 3 CD006902 290 18 6 106 10 5 CD007020 348 47 4 47 4 3 CD007428 157 7 3 190 9 3 CD008127 5460 7 0 6720 2 1 CD008392 5548 15 5 1095 2 0 CD010089 41675 22 10 4514 4 0 CD010847 571 15 1 111 6 0 Average 4402 17 6 1335 7 3 4 EXPERIMENTS AND RESULTS Experime♪ts were co♪ducted to establish baseli♪e performa♪ce ig- ures for the dataset. The aim is to reduce workload i♪ the scree♪i♪g stage of the review update by ra♪ki♪g the list of studies retrieved by the Boolea♪ query. Performa♪ce at both abstract a♪d co♪te♪t scree♪i♪g levels was explored. The collectio♪ was created by usi♪g the Boolea♪ query to search ℧EDLINE usi♪g the Entrez package from biopython.org. The list of studies i♪cluded after abstract scree♪i♪g was used as the releva♪ce judgeme♪ts for abstract level evaluatio♪ a♪d the list of studies i♪cluded after the co♪te♪t scree♪i♪g was used for co♪te♪t level evaluatio♪. 4.1 Approaches 4.1.1 Baseline uery. A łbaseli♪e query" was formed usi♪g the review title a♪d terms extracted from the Boolea♪ query. This query is passed to B℧25 [1] to ra♪k the set of studies retur♪ed from the Boolea♪ query for the review update. 4.1.2 Relevance Feedback. A feature of the problem of ide♪tify- i♪g studies for i♪clusio♪ i♪ updates of systematic reviews is that a sig♪iica♪t amou♪t of k♪owledge about which studies are suit- able is available from the origi♪al review a♪d this i♪formatio♪ was exploited usi♪g releva♪ce feedback. Rocchio's algorithm [1] was used to reformulate the baseli♪e query by maki♪g use of releva♪ce judgeme♪ts derived from the origi♪al review. Co♪te♪t scree♪i♪g judgeme♪ts (i♪cluded a♪d excluded studies) were used for the ma- jority of reviews. Abstract scree♪i♪g judgeme♪ts were used if these were ♪ot available, i.e. ♪o studies remai♪ed after co♪te♪t scree♪i♪g. 4.2 Evaluation Metrics ℧ea♪ average precisio♪ (℧AP) a♪d work saved over sampli♪g (ΩSS) are the metrics most commo♪ly used to evaluate approaches to study ide♪tiicatio♪ for systematic reviews, e.g. [5, 9, 20]. ℧AP represe♪ts the mea♪ of the average precisio♪ scores over all reviews. ΩSS measures the work saved to retrieve a dei♪ed perce♪tage of the i♪cluded studies. For example ΩSS@95 measures the work saved to retrieve 95% of the i♪cluded studies. ΩSS at recall 95 a♪d 100 (ΩSS@95 a♪d ΩSS@100) was used for the experime♪ts reported i♪ this paper. 4.3 Results Results of the experime♪t are show♪ i♪ Table 2. As expected, perfor- ma♪ce improves whe♪ releva♪ce feedback is used. The scree♪i♪g efort required to ide♪tify all releva♪t studies (100% recall) is re- duced by 63.5% at abstract level a♪d 74.9% at co♪te♪t level. This demo♪strates that maki♪g use of i♪formatio♪ from the origi♪al review ca♪ improve study selectio♪ for review updati♪g. Table 2: Performance ranking abstracts for updated reviews at (a) abstract and (b) content levels. Results are computed across all reviews at abstract level (25 reviews) and only across reviews in which a new study was added in the up- dated version for content level (19 reviews). 
Approach MAP WSS@95 WSS@100 (a) abstract level (25 reviews) Baseli♪e Query 0.213 51.70% 56.60 % Releva♪ce Feedback 0.413 58.80% 63.50% (b) content level (19 reviews) Baseli♪e Query 0.260 65.50% 70.50% Releva♪ce Feedback 0.382 69.90% 74.90% Figure 3 shows the results of AP scores for all 25 reviews. Rele- va♪ce feedback improved AP for 23 (92%) of the reviews. There are also four reviews where the use of releva♪ce feedback produced a♪ AP score of 1, i♪dicati♪g that the approach reduces work required by up to 99.9%. 5 CONCLUSION Updati♪g systematic reviews is a♪ importa♪t problem but o♪e which has largely bee♪ overlooked. This paper described a dataset co♪tai♪- i♪g 25 i♪terve♪tio♪ reviews from the Cochra♪e collaboratio♪ that Figure 3: Abstract screening AP scores for each review using Baseline Query and Relevance Feedback. ca♪ be used to support the developme♪t of approaches to automate the updati♪g process. The title, Boolea♪ query, releva♪ce judge- me♪ts for both the origi♪al a♪d the updated versio♪s are i♪cluded for each systematic review. Sta♪dard approaches were applied to the dataset i♪ order to es- tablish baseli♪e performa♪ce igures. Experime♪ts demo♪strated that i♪formatio♪ from the origi♪al review ca♪ be used to improve study selectio♪ for systematic review updates. It is hoped that this resource will e♪courage further research i♪to the developme♪t of ap- proaches that support the updati♪g of systematic reviews, thereby helpi♪g to keep them up to date a♪d valuable. REFERENCES [1] Ricardo Baeza-Yates a♪d Berthier Ribeiro-Neto. 2011. Modern Information Re- trieval (2♪d ed.). Addiso♪-Ωesley Publishi♪g Compa♪y, Bosto♪, ℧A, USA. [2] Hilda Bastia♪, Paul Glasziou, a♪d Iai♪ Chalmers. 2010. Seve♪ty-Five Trials a♪d Eleve♪ Systematic Reviews a Day: How Ωill Ωe Ever Keep Up? PLOS Medicine 7, 9 (Sep 2010), 1±6. https:⁄⁄doi.org⁄10.1371⁄jour♪al.pmed.1000326 [3] Aaro♪ Cohe♪. 2008. Optimizi♪g feature represe♪tatio♪ for automated systematic review work prioritizatio♪. AMIA ... Annual Symposium proceedings (2008), 121± 125. [4] Aaro♪ Cohe♪, Kyle Ambert, a♪d ℧aria♪ ℧cDo♪agh. 2012. Studyi♪g the pote♪tial impact of automated docume♪t classiicatio♪ o♪ scheduli♪g a systematic review update. BMC Medical Informatics and Decision Making 12, 1 (2012), 33. https: ⁄⁄doi.org⁄10.1186⁄1472-6947-12-33 [5] Aaro♪ Cohe♪, Ωilliam Hersh, Kim Peterso♪, a♪d Po-Yi♪ Ye♪. 2006. Reduci♪g workload i♪ systematic review preparatio♪ usi♪g automated citatio♪ classiicatio♪. Journal of the American Medical Informatics Association : JAMIA 13, 2 (2006), 206± 19. https:⁄⁄doi.org⁄10.1197⁄jamia.℧1929 [6] ℧ark R Elki♪s. 2018. Updati♪g systematic reviews. Journal of Physiotherapy 64, 1 (2018), 1±3. https:⁄⁄doi.org⁄10.1016⁄j.jphys.2017.11.009 [7] Julia♪ H. Elliott, A♪♪eliese Sy♪♪ot, Tari Tur♪er, ℧ark Simmo♪ds, Elie A. Akl, et al. 2017. Livi♪g systematic review: 1. I♪troductio♪ - the why, what, whe♪, a♪d how. Journal of Clinical Epidemiology 91 (November 2017), 23±30. https: ⁄⁄doi.org⁄10.1016⁄J.JCLINEPI.2017.08.010 [8] Edward Hughes, Julie Brow♪, Joh♪ Colli♪s, Ci♪dy Farquhar, Do♪♪a Fedorkow, et al. 2007. Ovulatio♪ suppressio♪ for e♪dometriosis for wome♪ with subfertil- ity. Cochrane Database of Systematic Reviews 3 (2007). https:⁄⁄doi.org⁄10.1002⁄ 14651858.CD000155.pub2 [9] Eva♪gelos Ka♪oulas, Da♪ Li, Leif Azzopardi, a♪d Re♪e Spijker. 2017. CLEF 2017 tech♪ologically assisted reviews i♪ empirical medici♪e overview. 
I♪ Working Notes of CLEF 2017 - Conference and Labs of the Evaluation forum, Dublin, Ireland, September 11-14, 2017, CEUR Workshop Proceedings, Vol. 1866. 1±29. [10] Sarv♪az Karimi, Stefa♪ Pohl, Falk Scholer, Lawre♪ce Cavedo♪, a♪d Justi♪ Zobel. 2010. Boolea♪ versus ra♪ked queryi♪g for biomedical systematic reviews. BMC medical informatics and decision making 10, 1 (2010), 1±20. https:⁄⁄doi.org⁄10. 1186⁄1472-6947-10-58 [11] ℧adia♪ Khabsa, Ahmed Elmagarmid, Ihab Ilyas, Hossam Hammady, ℧ourad Ouzza♪i, et al. 2016. Lear♪i♪g to ide♪tify releva♪t studies for systematic reviews usi♪g ra♪dom forest a♪d exter♪al i♪formatio♪. Machine Learning 102, 3 (℧ar 2016), 465±482. https:⁄⁄doi.org⁄10.1007⁄s10994-015-5535-7 [12] Halil Kilicoglu, Di♪a Dem♪er-Fushma♪, Thomas C Ri♪dlesch, Na♪cy Ωilczy♪ski, a♪d Bria♪ Hay♪es. 2009. Towards automatic recog♪itio♪ of scie♪tiically rigorous cli♪ical research evide♪ce. AMIA 16 (2009), 25±31. https:⁄⁄doi.org⁄10.1197⁄jamia. ℧2996 [13] Seu♪ghee Kim a♪d Ji♪wook Choi. 2014. A♪ SV℧-based high-quality article classiier for systematic reviews. Journal of Biomedical Informatics 47 (2014), 153±159. [14] Atha♪asios Lagopoulos, A♪to♪ios A♪ag♪ostou, Adama♪tios ℧i♪as, a♪d Grig- orios Tsoumakas. 2018. Lear♪i♪g-to-Ra♪k a♪d Releva♪ce Feedback for Litera- ture Appraisal i♪ Empirical ℧edici♪e. I♪ Experimental IR Meets Multilinguality, Multimodality, and Interaction - 9th International Conference of the CLEF Asso- ciation, CLEF 2018, Avignon, France, September 10-14, 2018, Proceedings. 52±63. https:⁄⁄doi.org⁄10.1007⁄978-3-319-98932-7↓5 [15] Ersilia Luce♪teforte, Alessa♪dra Bettiol, Salvatore De ℧asi, a♪d Gia♪♪i Virgili. 2018. Updating Diagnostic Test Accuracy Systematic Reviews: Which, When, and How Should They Be Updated? Spri♪ger I♪ter♪atio♪al Publishi♪g, Cham, 205±227. https:⁄⁄doi.org⁄10.1007⁄978-3-319-78966-8↓15 [16] David ℧arti♪ez, Sarv♪az Karimi, Lawre♪ce Cavedo♪, a♪d Timothy Baldwi♪. 2008. Facilitati♪g biomedical systematic reviews usi♪g ra♪ked text retrieval a♪d classiicatio♪. I♪ 13th Australasian Document Computing Symposium (ADCS). Hobart Tasma♪ia, 53±60. [17] ℧akoto ℧iwa, James Thomas, Aliso♪ O'℧ara-Eves, a♪d Sophia A♪a♪iadou. 2014. Reduci♪g systematic review workload through certai♪ty-based scree♪i♪g. Journal of Biomedical Informatics 51 (2014), 242±253. https:⁄⁄doi.org⁄10.1016⁄j.jbi.2014. 06.005 [18] Harrise♪ Scells, Guido Zucco♪, Beva♪ Koopma♪, A♪tho♪y Deaco♪, Leif Azzopardi, et al. 2017. A test collectio♪ for evaluati♪g retrieval of studies for i♪clusio♪ i♪ systematic reviews. I♪ 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. Tokyo, Japa♪, 1237±1240. https: ⁄⁄doi.org⁄10.1145⁄3077136.3080707 [19] Kaveh G Shoja♪ia, ℧argaret Sampso♪, ℧ohammed T A♪sari, Ju♪ Ji, Steve Doucette, et al. 2007. How quickly do systematic reviews go out of date? A survival a♪alysis. Annals of Internal Medicine 147 (2007), 224±233. https: ⁄⁄doi.org⁄10.7326⁄0003-4819-147-4-200708210-00179 [20] Ha♪♪a Suomi♪e♪, Liadh Kelly, Lorrai♪e Goeuriot, Eva♪gelos Ka♪oulas, Leif Azzopardi, et al. 2018. Overview of the CLEF eHealth Evaluatio♪ Lab 2018. I♪ Experimental IR Meets Multilinguality, Multimodality, and Interaction. Spri♪ger I♪ter♪atio♪al Publishi♪g, Cham, 286±301. [21] Byro♪ C Ωallace, Kevi♪ Small, Carla E Brodley, Joseph Lau, Christopher H Schmid, et al. 2012. Toward moder♪izi♪g the systematic review pipeli♪e i♪ ge♪etics: eicie♪t updati♪g via data mi♪i♪g. Genetics in Medicine 14 (2012), 663. 
work_n2i37hrlv5gtfmk2gb4qk72jo4 ----

American Archivist, Vol. 58, Fall 1995

From Managerial Theory and Worksheets to Practical MARC AMC; Or, Dancing with the Dinosaur at the Amistad

FREDERICK STIELOW, WITH REBECCA HANKINS AND VENOLA JONES

Abstract: This article discusses how theory and historical analysis can help inform managerial practices toward the integration of MARC AMC as part of a descriptive chain. The staff of the Amistad Research Center used their own experiences and research and Zipf's Law of Least Effort to produce techniques to simplify and rationalize the complex, library-based MARC format for their environment and ongoing technological change. The process is ongoing and far from revolutionary, but the techniques to date include the production of a standard cataloging worksheet and an authority list of subject headings.

About the authors: Frederick J. Stielow is now the executive director of the Mid-Hudson Library System. He served as executive director of the Amistad Research Center from 1992 to 1995 and had previously taught archives and information technology at Catholic University and the University of Maryland. He has a dual Ph.D. in History and American Studies and an M.L.S. Stielow has published over seventy articles, including the Posner Prize-winning "Archival Theory Redux and Redeemed," and six books, including the Leland Award-winning The Management of Oral History Sound Archives. Well-known on the dance floor, he also consults for bodies from the Hip-Hop Hall of Fame and Jazz and Heritage Association to the World Bank. The initial draft of this article was written during the luxury of the 1994 Archival Summer Institute at the Bentley Library of the University of Michigan. Thanks to Susan Rosenfeld for additional comments. Rebecca Hankins is the archivist at the Amistad Research Center. Venola Jones is a cataloger at the Dillard University Library in New Orleans. She also does cataloging for the Amistad Research Center.
THE TRANSIT FROM THE ivory tower of teaching to the nitty-gritty of archival management can prove a learning experience. Some colleagues have even developed a sordid interest in how one addresses real practice instead of just dancing around with theory. The admittedly verbose lead author of this article acknowledges the significance of this challenge. Indeed, this article developed in partial response to a minor contretemps on his comments in the editorial pages of the American Archivist.[1]

The argument, dating back to the author's ill-spent youth and training as a systems analyst and data processing section chief in the late 1960s, was then and is now that the Machine Readable Cataloging (MARC) format seems to be a "technological dinosaur." MARC simply could not escape its origins during that almost paleolithic era of mainframes with expensive storage costs and military communication protocols. Since then we have had a microcomputer revolution. The need to code and keep data neatly isolated is disappearing. Our vocabulary has enlarged to more "user friendly" data models. We have gone from lines of programming code to spreadsheets and relational database models of the 1970s, to the word processing of the 80s, and to the interactive and hypermedia world of today.[2]

[1] "To the editor," American Archivist 55 (Fall 1992): 524. The comments were in regard to an earlier article by Bruce Bruemmer on oral history and the MARC-AMC format. My specific point was that not every individual tape merits an AMC record—that archival theory and descriptive practices allow us to operate at the collections level and avoid the library imperative for unit cataloging. (In a minor contretemps, Bruemmer and Judith Campbell Turner took some exception to my thoughts, but still failed to deal with the collective nature of archival description in their subsequent letters to the editor.)

[2] In addition to his work while in a U.S. Army computer center, the author produced one of the first text-edited history dissertations in the mid-1970s. He also taught introductory computer and information systems classes to graduate students, and has consulted on automated systems for a variety of businesses and institutions.

Just as the alligator survives from the "Age of Dinosaurs" and functions in fairly effective fashion, however, MARC has its place. The archival manager must recognize that MARC provides a key answer to the goal of a national inventory for archival records and has also succeeded in bringing archives into the "Information Age." Yet, such an embrace does not deny the responsibility of keeping up with ongoing technological improvements or future changes; nor does it come without the need for historical and critical analysis to insure the system's proper integration within the institution.[3]

The following article describes an attempt to blend theory and practice from an institutional perspective. It rests on historical and observational methods. With tongues and mixed metaphors firmly in cheek, we want to show how the Amistad Research Center is learning to dance with the MARCosaurus.[4]

The Setting

The Amistad Research Center is one of the nation's premier minority archives.
The first repository created with a specific eye to chronicling the Civil Rights Movement, the Center currently holds over 800 collections with more than ten million documents and thousands of tapes and photographs. It has a 25,000-volume library and the Deep South's finest African-American art collection. An independent organization, the Amistad maintains its own Solinet/OCLC catalog account, but is housed at Tulane University with ties to the campus library system. In addition, Tulane provides an ethernet hub and direct access to computer experts and the Internet, including gopher and Web nodes.

[3] The alligator analogy is to reassure Judith Campbell Turner, "To the editor," American Archivist 57 (Winter 1994): 8-9—who ignored the author's dancing style, but did question background knowledge on MARC and chastise with the faint hope of a developmental framework, "Stielow is using dinosaur in the way paleontologists and evolutionary biologists would."

[4] Apologies to Trudy Peterson and her "Archival Bestiary," as well as the designer of a dinosaur tee-shirt that helped to symbolize the struggles of the National Archives' movement for independence in the early 1980s.

The Amistad joined the rush to MARC in the late 1980s with the aid of a Department of Education grant. All of the Center's archival collections then received AMC breakdowns and were downloaded into the OCLC national bibliographic database. Yet, by the arrival of a new director in mid-1992, the Center had not really integrated MARC into its descriptive apparatus. We at the Center faced an ever growing backlog with few new MARC records to show. While significant, MARC entries still remained largely the domain of the cataloger. They were somehow apart from most of the archivists and their mainstay two-steps with registers and card indexes—an element for the specialist and, frequently, only an afterthought or a potentially easily overlooked, time-consuming burden. In the jitterbug toward a "sexy" and "funded" technological advancement, the Center may have abrogated some of its professional responsibilities. One doubts that we were alone.

Some Historical Factors

From at least the early nineteenth century and Antonio Panizzi's dictates at the British Museum, librarians were able to devolve strict rules to standardize their descriptive practices across institutions. They produced a generic "book and catalog card model" with demands for precision of entry of an eighteenth-century minuet. In the United States, the late nineteenth-century establishment of professional library education helped firm up a new national bibliographic order. The model gained more clout and economic expediency following the introduction of printed card sets from the Library of Congress in the early 1900s.[5]

[5] Historical information is drawn from notes from Stielow's courses on the History of Libraries and the History of Archives and Information Systems.

The underlying American intellectual schema went through several permutations before eventually linking back across the Atlantic and into the Anglo-American Cataloguing Rules. AACR was a special pioneer. It was conceived to dovetail with emergent mainframe technologies of the 1960s Cold War era and lay the ground rules for projected MARC standards. Through the monumental labors of people like Henrietta Avram at the Library of Congress, MARC itself surfaced during the late 1960s.
It helped provide the economies of scale, borrowing services, and "copy cataloging" that continue to drive library automation.

Archives followed jazzier, idiosyncratic patterns and did not partake in the library evolution until recently. Even the development of an archival/library model with the National Union Catalog of Manuscript Collections in the 1950s was strangely distant from the AACR that was being discussed in the same halls at the Library of Congress. Instead, the AMC initiative emerged as the controversial breakthrough of the SAA's National Information Systems Task Force in the early to mid 1980s—a decade and a half after MARC's creation.[6]

[6] Unfortunately, MARC-AMC evolved under the auspices of the far less archivally sensitive second edition, or AACR2.

The AMC format helped introduce data processing concepts and a new precision to the art form of archival description. MARC entries inform the researcher around the world about the existence and location of a collection. They can facilitate the internal collocation of similar subjects across provenance lines and bring a new order to archival management. With more than 500,000 records already logged, MARC has emerged as a standard for modern American archival description.

Because of such factors, the Amistad remains professionally committed to MARC AMC, and proudly continues to proclaim that all its collections will receive such entries. We take it as a given that such a presence is vital in informing the widest range of outside researchers on the existence of our holdings. We understand too that such acceptance implies acquiescence to a panoply of outside rules and the entry of such formerly alien tunes as "Subfield Delimiters" and "National Thesauruses."

Managerial and Theoretical Considerations

Any archival manager knows that MARC is far from a panacea. While archives did come to the MARC cotillion, they did not necessarily move with the same rhythms or partake as fully in its synergies as their library sisters. The key portions of archival description still remain fuzzy and tied to descriptive narratives beyond the easy reach of a MARC record.[7]

The manager has bottom-line considerations. Archives do not fully join in such economic benefits as shared cataloging and interlibrary loan. MARC AMC depends on the expensive and time-consuming norm of "original cataloging." Many archives are linked to bibliographic utilities with costly annual fees and incur additional charges whenever they update records for growing collections. MARC may also call for increasing staff specialization and slow down the descriptive process, thus prolonging backlogs.

[7] For more background and additional challenges, see David Bearman, "Archives and Manuscript Control with Bibliographic Utilities," American Archivist 52 (Winter 1989): 26-39.

At the human level, how can any casual or infrequent user reasonably keep in mind the nuances of AMC? Who can memorize its seventy-seven variable-length field options and their myriad of sub-field delimiters?
The visible format is dated with unnecessary redundancies between the variable and a block of fixed field codes, which are themselves largely unusable and unsearchable. The use of the 650 field with LCSH [Library of Congress Subject Headings] alone may be described as a tango within Dante's Inferno. Users face dizzying possibilities and ever-changing rulings to meet national library needs. Library literature and anecdotal evidence are pockmarked by repeated technical and intellectual failures to live up to its potential. Archivaria also recently illustrated a growing nest of acronyms from MAD to RAD, which have appeared as supplemental standards to expand and potentially challenge MARC. MARC's limitations also are evident to anyone conversant with current database design. From conversations with network specialists, it seems that even MARC's underlying Open Systems Interconnection (OSI) or computer communications standard is under scrutiny and may prove insufficient to meet data transfer needs in the fiber optics age.[8]

[8] Archivaria 35 (Spring 1993).

Given that most archivists come with primary training in history, we can also posit a likely lack of awareness of pertinent managerial theoretical perspectives from other fields. For instance, George Zipf's Law of Least Effort is a recognized classic in information science. His is a form of game theory with cost/benefit checks for an applied and managerial context. Zipf argues from the warning maxim that "jobs seek tools; tools seek jobs." He calls for avoiding the inefficiencies of unplanned or makeshift responses to new demands through the conscious development of techniques and tools designed for the least effort to accomplish the tasks. Zipf's Law suggests putting energy at the front-end to structure efficient mechanisms and hence heighten probable returns at the back-end. Thus, descriptive practice should be weighed and formulated to avoid demanding more time and energy than the likely value to be derived from the information. Although it may be possible to so describe a record as to virtually guarantee access, the economic and managerial equations must also be weighted with probability and risk assessment. The resulting equation suggests that Input (time × costs) should be ≤ Output (value × costs). Without the formula, such evaluation relates directly to appraisal and many archival practices outside of MARC AMC and its "flat" or unitary form of description.[9]

[9] George Zipf, Human Behavior and the Principle of Least Effort (Cambridge, Mass.: Addison Wesley, 1949).
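The inequality above can be read as a simple decision rule. A minimal sketch, with invented hour, cost, and value figures (the article gives no numbers), might look like this:

def worth_describing(hours_required, hourly_cost, expected_value,
                     probability_of_use):
    """Zipf-style front-end check: describe only as deeply as the
    likely payoff justifies. All inputs are hypothetical estimates."""
    input_side = hours_required * hourly_cost
    output_side = expected_value * probability_of_use
    return input_side <= output_side

# A small collection with low expected research traffic:
print(worth_describing(hours_required=6, hourly_cost=20.0,
                       expected_value=300.0, probability_of_use=0.3))
# False -> stop at a brief collection-level entry rather than item lists.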
The Amistad Experience

Historical, practical, and theoretical considerations thus led us to a deeper examination of how best to use MARC AMC. We sought to maximize the integration of useful, staff-efficient, standardized, and easily accomplished description, with a minimum of energy. Our cautions were to avoid "reinventing the wheel" and stay with Zipf's injunctions, as well as following the rule of KISS—keep it simple, stupid.

We needed to deal with our reality. This process largely relied on historical and observational methods. The Amistad had to recognize that the overwhelming bulk of its descriptive tools were not tied to MARC nor adapted to accommodate its arrival. The register was still our primary focus and what our clients sought. We were already in the process of recasting this device to allow for enhanced retrieval through word-processed narratives and box and folder lists formed with database management software. Moreover, Internet ties appeared to be increasing our traffic more than MARC had.[10]

[10] We do not view automated registers or Internet connections as being in an "either-or" conflict with MARC. Instead they are all related methods toward the same goal within our environment and its ties to a university library system. However, this does not mean that some archives may not make a logical choice for themselves to concentrate electronic delivery on the Internet without a MARC format.

Other pragmatic factors intruded. We featured trained library catalogers, several with MARC AMC workshop training, and a director who made MARC entries for archives even before AMC. But the Center had not begun to address the full range of what MARC has to offer, and probably cannot do so with the staff at hand. Although quite active, the Center may, at best, catalog seventy-five collections in a year. Such a number is barely sufficient to maintain the sophistication necessary for the complex art form of original cataloging. In addition, the Amistad must deal with "non-MARCian" processors. The Center can simply not afford to extend the requisite workshops to its transient pool of student interns and volunteers working on its backlog.

Our quest also led to the literature and contacts with other institutions. We learned that the basic recourse lay in solid manual approaches and the design of a standard worksheet. Nancy Sahli suggested such techniques early on in the MARC AMC revolution. As we interpreted her 1985 writings, an archives could systematically foxtrot through the MARC maze by preselecting and standardizing its fields for entry. We attempted to streamline and further simplify this process with forms design theory. Instead of seventy-seven major variable field options, why not present only a dozen and make most of those mandatory? Why not attempt to default all the fixed fields at the top of the record, rather than looking up the choices? Why not design to ease manual entry, but still enhance data retrieval: e.g., default where possible, avoid codes, and use check blocks with built-in terminology controls?[11]

[11] Nancy Sahli, MARC: For Archives and Manuscripts (Chicago: Society of American Archivists, 1985). We looked at a number of later publications and forms at several institutions. In addition, Stielow had built a MARC archival worksheet at the University of Southwestern Louisiana as early as 1982. Among other features, our forms design approaches for check blocks are consciously limited by the Miller Number of 7 (±2), which conforms to human capacities, versus overly long lists of terms fit only for the computer. Note, too, the placement of a control number with year of creation and retention schedule.

We even extended these latter concepts to subject selections in the 650 field. Instead of the two volumes of LCSH, the Center developed a single sheet of terms and codes for our processors. The Sisyphian choices were researched and broken down into rough "thesaurus" categories to reflect the activities of our preexisting and likely holdings.

Other managerial decisions helped increase our "probable returns." In essence, we weighed the importance of promoting finely polished descriptions versus the value of getting information out quickly to researchers and attacking our backlog. We opted for speed and minimal energy. Worksheets would be addressed immediately following a quick preliminary inventory, or as early as possible during processing. Entries need not be very long, will usually be one-time ventures, and will be limited to a collection-level overview.
(But they could also be revisited if substantial errors or other factors interfered.) Finally, we decided that the collection's processors should be primarily responsible for filling out the initial forms.

The results could stand alone as the sole pointer, especially for a small, less important collection. But the MARC record could also be an introduction to standard finding aids with box and folder lists for larger and more complex collections. At the Amistad, MARC does not stand at the apex; rather, it is an initial step and integrated into an overall descriptive chain. Information from MARC helps inform other parts of that chain. Eventually, AMC records will likely link, or "front-end," to a full range of electronic in-house registers and database indexes, which will also be placed on our gopher and Web nodes.

Compromises and Bending the Rules

The cognoscenti are aware of legalistic problems—elements that differ from the originating library model and may trouble the more literal MARC interpreter. To Steven Hensen, for example, in the APPM bible, "In such a system, a catalog record created according to these rules is usually a summary or abstract of information contained in other finding aids." His underlying assumptions follow from the finished book model with a finding aid as "chief source of information." Theoretically, the intense scrutiny given in the production of the finding aid will lead to more accurate and "cleaner" records.[12]

[12] Steven Hensen, Archives, Personal Papers, and Manuscripts, 2nd ed. (Chicago: Society of American Archivists, 1989), 4.

Our waltz was obviously a compromise to fit a particular situation, but we did have internal evidence to argue for our simplified, early entries. For instance, we had found no evidence of increased use through MARC. In light of our other finding aids, automation advances, and user requests, we also found little motivation to expand the size of our catalog records. We were aware, too, that many of our collections continued to receive deposits and had economic imperatives against costly and awkward on-line updating. Most importantly, a review of earlier and properly formulated entries from completed registers and trained staff showed a great deal of inconsistency and "dirt." The summary information in the scope note often appeared distant from a comparative reading of the finding aid. Subject headings were often isolated "break dances," too overly diverse to help tie our collections together.[13]

[13] Helen Tibbo, Abstracting, Information Retrieval, and the Humanities: Providing Access to Historical Literature (Chicago: American Library Association, 1993), demonstrates the difficulties in producing a good abstract—problems that are exacerbated the further removed they are from the original author. Tibbo, who is one of the coming lights in the field, has also provided some disturbing information on the impracticality of complex subject headings in actual application within current on-line systems.

We still tried to build in qualitative safeguards. We knew that student interns and undertrained staff would have to be involved even to dent the backlog.
Thus, we made certain that all staff and interns received similar training in an attempt to coordinate in-house processing. They also have ready recourse to key background readings and an internal processing manual. Each collection is managed by a Holdings Folder and Processing Control Sheet, which helps coordinate and integrate the full range of processing. It contains both check blocks to indicate the level of description, and pertinent information for the MARC entry. Moreover, the Senior Archivist provides the processors with tutelage and assistance in completing their sheets. Finally, trained catalogers make the actual data entries and are responsible for quality and authority control, which involves additional authority checks through an off-line microcomputer cataloging package.

Although Hensen's recommendations did not meet our needs or experiences, the Center was still committed to following the rules. Fortunately, he also had hinted at the absence of an absolute requirement for the record, just "to be an abstract of a more substantial finding aid." Fortunately too, OCLC obligingly provided us with a concomitant technological break. Sitting in the fixed fields at the start of the entry screen in OCLC is a demand for encoding level (Enc.Lvl). OCLC's MARC AMC Cataloging Manual reveals that Enc.Lvl comes with several options—from "1", showing that processing and description are complete, to "5", indicating incompleteness. The Amistad elected to rhumba and throw the "5" switch.

MARC AMC Coding Sheet

Rather than prolong what could become a tedious theoretical debate, or go beyond still preliminary findings—let us examine the dance card. The Amistad's AMC Worksheet is far from revolutionary; many institutions regularly employ similar devices. Ours is perhaps designed to be more "transparent" and user-friendly. At present, it appears as a two-page form, mimicking the pre-prepared OCLC computer template, with an explanatory guide included. [Worksheet and Guide are included as Appendix A-Ed.]
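As an illustration of what such a pre-selected, collection-level entry might contain, the following sketch assembles a record from a handful of worksheet fields. The tag subset, the sample collection, and the validation rule are illustrative inventions, not the Center's actual form or cataloging software.

# Hypothetical, simplified subset of the worksheet's MARC AMC fields.
REQUIRED_FIELDS = ["035", "110", "245", "300", "520", "851"]

def build_record(worksheet):
    """Turn a filled-in worksheet (a dict keyed by MARC tag) into a
    flat list of tag/value pairs, refusing incomplete sheets."""
    missing = [tag for tag in REQUIRED_FIELDS if not worksheet.get(tag)]
    if missing:
        raise ValueError(f"worksheet incomplete, missing fields: {missing}")
    return [(tag, worksheet[tag]) for tag in sorted(worksheet)]

sample = {
    "035": "ARC-0001",                                   # collection number
    "110": "Example Benevolent Association",             # corporate main entry
    "245": "Archives, 1921-1965",                        # title and year span
    "300": "4.5 linear feet",                            # extent
    "520": "Minutes, correspondence, and photographs documenting ...",
    "650": "African Americans--Civil rights",            # subject heading
    "851": "Amistad Research Center, Tulane University, New Orleans, LA",
}

for tag, value in build_record(sample):
    print(tag, value)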
Conclusions

Let us admit that the reality of implementation—of going from theory to practice—can be frustrating. We are in the midst of an information revolution. Archivists and catalogers do have problems communicating, and the case is magnified when dealing with automation and networking specialists. Specific software packages and the need to conform to an on-line bibliographic utility can provide slam dance nightmares, which lay waste to theory and logic. For example, we could not default all the fixed field codes. If researchers were to receive a reasonable initial on-line pointer, OCLC requires that DATES be filled in—even though they are replicated at the end of the 245 field. LCSH subject headings caused expected headaches and a tarantella back to the manuals for sub-field indicators before they would be accepted by the system.

Finally, we concede that we are still studying at the Arthur Murray School for MARC Dancers. The readers are only glimpsing a portion of a work in progress—the MARC section of what is intended to be an integrated and highly computerized system. Our future plans include descriptive apparatus with hypertext links from all key terms and subjects, pointers to the location of the materials, and, eventually, hypermedia buttons to the actual information and across collection lines. Much study and quality control remains to be done. The Center invites comments and criticism from others in a similar struggle, so we can begin to rock-n-roll in the Information Age—especially before the new integrated format finally hits the airways with sounds guaranteed to disturb archivists and send us back for new dance lessons once again.[14]

[14] Those interested in how we are developing our overall procedures can glimpse them in the Procedures Section of our Installation Manual, which is available on line through the Amistad's web page or directly in the gopher under Departments in gopher@mailhost.tcs.tulane.edu.

Appendix A

AMISTAD RESEARCH CENTER MARC-AMC FORM

Processor:   Date Finished:   To Cataloger:   OK-date:

FIXED FIELD CODES [predetermined, except Dates and if added languages]
OCLC: NEW  Rec. Stat: n  Type: b  Bib lvl: c  Source: d  Lang: eng  [ ] other languages:
DCF:  Repro:  Enc lvl: k  Ctry: us  Desc:  Mod rec:  Dattp: i  Dates: , [repeat from 245 field]

Variable Length Fields

035 Collection Number [take from "Acquisitions Register"]:

Main Entry [check appropriate category, fill in information]
[ ] 100 1--person; [ ] 100 2--family; [ ] 110 2--corporate body
Last Name:  First:  Middle/(maiden):
Dates [optional] ($f) year born , year died

245 10 Title ($a) [Check one] [ ] Archives (corporate body/institution)  [ ] Papers (person or family)  [ ] Collection (artificial grouping)
Collection Year Span ($f) , [give earliest and latest years]

300 Extent [if > 3" fill in] linear feet  [if < 3" fill in no. of] items

340 03 Medium [optional—only for collection with non-manuscript materials]
[Check any applicable] [ ] painting; [ ] sculpture; [ ] photographs; [ ] audiotapes; [ ] videotapes; [ ] computer files; [ ] other:

500 General [optional—use to list related collections by title] See Also:

520 Summary and Scope Note [describe collection in a few sentences that define the subject and indicate key events/locations/individuals, as well as our series breakdown—may abstract Register's Collection Overview]

545 Biographical/Historical Note [optional, if you feel 520 note needs more on the subject's life or milestones—relate to Register's chronology]

555 0 Finding Aids Note [optional, check any present] [ ] register; [ ] computer inventory; [ ] gopher file; [ ] mosaic file

Subject Added Entries [Use appropriate codes (600=Person; 610=Corporate; 650=Topic; 651=Place); list the key persons, institutions, or places. Go to the "Topic Sheet for LCSH" for 650 terms]
6__  6__  6__  6__  6__  6__

851 Location ($a) [predetermined] Amistad Research Center, Tilton Hall, Tulane University, New Orleans, LA 70118 $d USA
E-Mail: amistad@mailhost.tcs.tulane.edu
Marc.94-1yr

Amistad Research Center MARC-AMC WORKSHEET GUIDE

Instructions: Processors must be aware that their collections require MARC-AMC Worksheets [the initials stand for Machine Readable Cataloging--Archives and Manuscripts Collections]. MARC includes unseen data communications protocols, which you do not have to worry about, and the visible fields on your MARC-AMC Worksheet. Most of the fixed fields at the top of the form are already completed. You will concentrate on abstracting information within the remaining variable length fields. The results will be converted into a short "catalog card" image for the online public access catalog (OPAC). This guide is to help explain the fields and how to enter data on the worksheet. It will include several sample entries.* Should you want more information refer to the APPM Manual or one of several articles and books available to you on the subject in the professional reference shelves. If you have additional questions or problems, ask your supervisor, the Senior Archivist or director.

Write for clarity and to communicate with others outside the Amistad. Keep sentences concise with no more than 25 words. In general, try to report out what you would think a typical researcher might need to find the information.

Data Entry

A. Initial Blocks: Fill in your name and the date that you complete the worksheet. All Worksheets go to the Catalog In-Tray for review—feel free to ask to help with the data entry.

B. Fixed Fields: With two exceptions do not make any entries: 1. If you encounter a significant amount of non-English materials, check the box next to LANG and enter the languages; 2. DATES: you will enter the earliest year, the latest year of materials that you encountered during your Preliminary Inventory—entry the same as the 245 field.

C. Variable Length Fields: The information to complete these sections will come from your research and initial inventorying of the collection, as well as the Processing Control Sheet and Holdings Folder. The numbers are tag lines to define data entry elements and an asterisk * before any tag means that entry is optional—all other fields must be completed before passing the form to the cataloger.

035—Collection Number [found on the Processing Control Sheet, or ask the Acquisitions Archivist.]

Main Entry [use this area to enter information on the provenance or creator of the material. First check the appropriate 100's delimiter—the materials come from a person, family, or a corporate body (a business, college, association). Next enter the proper name of this originator—if you have questions, the cataloger and Mic-Me software have a predetermined "authority list" of some of the names. Finally, if known and verified, enter the year in which the originator was born or founded and any death or closing year.]

245—Title [we have limited you to three choices: check "Archives" for the records of a corporate body; "Papers" for a person or family's documents; or, "Collections."
The last refers to any holdings without clear provenance and that we have artificially drawn together to describe a person or event—for example, "The David Duke Collection" was not donated by Duke, but brought together by the staff as we monitored his actions.]

Collection Year Span: Indicate the earliest, latest year of the materials that you encountered in the collection (Duplicate in DATE: in fixed field area).

300—Extent [approximate the size of the holdings: if less than a Hollinger Box, give the number of items; if larger, indicate the number of feet and/or a decimal equivalent for less than a foot—e.g., .6 linear feet.]

340—Medium [optional] unless the holdings have materials other than paper records. Check any and all applicable blanks and write in any materials not covered by the check list.

500—General Note [optional] use to show if it relates significantly to other holdings in the archives, or to cross reference for materials that were pulled from another holding—e.g., an artwork that was separated into the art collection.

520—Scope and Content Note [this is the heart of your work]—a narrative paragraph on the holding and any significant people, places, or events that it helps inform. Think of this as an abstract of the Collection Overview from a register. Keep it short, but you can use the reverse side of the sheet for more.

545—Biographical Note [optional, but highly recommended and may extend to the verso also] Build a short biographical statement chronicling the person, family, or institution. This should put stress on the time frames/events that are actually documented by the materials and feed to the Chronology of a Register.

555—Finding Aids Note [optional, unless one of the terms is checked on the Processing Guide Sheet] You should check any and all applicable entries—are you doing a register; does the register include a Paradox DBMS index of the inventory; is that material scheduled for downloading into a textual "gopher" and/or "Mosaic" hypermedia platform.

600—Subject Added Entries [with the scope note, the key pointers for researchers] First select the significant persons, families, events, institutions that you have cited in your 520 or 545 notes—go back and correct any oversights. Fill in the appropriate numerical tags found in the header notes and then the selection. Once that is done, turn to the Subject Headings—650 Topic Notes guide sheet, which is an authority list of acceptable terms from the Library of Congress's Subject Headings. Refer to the directions and make the appropriate selections and entries.

851—Location [the standard address to contact the Center]

*Sample Entries [eliminated for this paper]

work_n5pyu2hmenhl7jdp2dltvyhjta ----

OCLC Systems & Services: International digital library perspectives, Vol. 25 No. 3, 2009, pp. 186-199. ISSN 1065-075X. DOI 10.1108/10650750910982575. Received December 2008; reviewed January 2009; accepted January 2009.

Bibliometric tools applied to analytical articles: the example of gene transfer-related research

Donghui Wen
College of Environmental Sciences and Engineering, Peking University, Beijing, China, and

Te-Chen Yu and Yuh-Shan Ho
Bibliometric Research Centre, I-Shou University, Kaohsiung, Taiwan

Abstract

Purpose – The objective of this study is to conduct a bibliometric indicator and to conduct an analysis of citations per publication of all horizontal gene transfer-related publications in the Science Citation Index (SCI). A systematic search was performed using the SCI for publications during the period 1991-2005.
Design/methodology/approach – The data were based on the online version of the Science Citation Index (SCI), Web of Science. Analyzed parameters included authorship, patterns of international collaboration, journal, language, document type, number of times cited, author, and KeyWords Plus.

Findings – The USA and Germany produced 57 percent of the total articles and 77 percent of the total times cited in three years after publication. In addition, a simulation model was applied to describe the relationship between the cumulative number of citations and the article life.

Originality/value – This is one of the first studies that uses analysis of citations per publication, defined as the ratio of the number of citations to the number of publications in a certain period, to assess the impact relative to the entire field.

Keywords: Serials, Genetics, Research results, Publishing

Paper type: Research paper

Introduction

Horizontal gene transfer (HGT), or lateral gene transfer (LGT), is the collective name for processes that permit the exchange of DNA among organisms of different species (Jain et al., 2003). In the reproduction strategies of a replicon, vertical transfer of the chromosome is the faithful way of increasing the genotype of a species, while horizontal transfer of transposons, plasmids or viruses provides the chance of creating a recombinant genotype by contributing to the genome of a neighbor recipient cell (Heinemann, 1998; Brown, 2003).

The earliest mention of HGT can be traced to 1905, when Merechowsky suggested that the eukaryotic mitochondria and chloroplast originated when bacteria invaded the eukaryotic cell and were subsequently incorporated by it (Syvanen and Kado, 1998). In the 1950s, upon the previous attempts of demonstrating recombination between diverse species of bacteria, Baron et al. (1959) and Miyake and Demerec (1959) reported that high frequency of recombination (Hfr) strains of Escherichia coli could transfer genetic information to certain mutant strains of Salmonella typhimurium. Ochiai et al. (1959) discovered infectious multiple-antibiotic resistant plasmids in pathogenic bacteria. Theoretical implications of HGT began to grow from the view of evolution in the 1970s (Went, 1971; Hartman, 1984). By the mid-1980s, numerous mechanisms for natural HGT were firmly established in the range from bacteria to metazoans (Ochman and Selander, 1984; Erwin and Valentine, 1984; Syvanen, 1984, 1987). However, the name "horizontal gene transfer" did not appear in the title, abstract or keywords of a publication until 1983, when Aporpium was reported as an example of horizontal gene transfer (Setliff, 1983). In the succeeding research history, with the advance of molecular biology, more and more available scientific evidence has indicated that HGT is a natural process among wild-type organisms of prokaryotes (Penalva et al., 1990; Di Giovanni et al., 1996; Eisen, 2000; Koonin et al., 2001) and eukaryotes (Penalva et al., 1990; Rosewich and Kistler, 2000; Andersson, 2003).
Now, under increasing pressure from human society, HGT is detected frequently, related to the occurrence of resistance to herbicides (Ka and Tiedje, 1994; de Lipthay et al., 2001) or antibiotics (Penalva et al., 1990; Coffey et al., 1995; Shoemaker et al., 2001; Zolezzi et al., 2004), and the catabolic pathways for the degradation of synthetic compounds (Herrick et al., 1997; van der Meer et al., 1998; Top et al., 2002; Wilson and Metcalf, 2005).

In this study, the authors attempted to analyze bibliometrically the HGT-related literature published in journals listed in the SCI from 1991 to 2005, in order to provide insights into the characteristics of the HGT literature and identify patterns, tendencies, or irregularities that may exist in the literature. An indicator, citations per publication, was also applied in this study. Furthermore, this will provide a comprehensive evaluation of current HGT research.

Methodology

The data were based on the online version of the Science Citation Index (SCI), Web of Science. SCI is a multidisciplinary database of the Institute for Scientific Information (ISI), Philadelphia, USA. One common way of conducting bibliometric research is to use the SCI database to trace the times each document has been cited (Hsieh et al., 2004). In the 2005 edition of the Journal Citation Reports (JCR), 6,088 journals are listed in the SCI. "Horizontal-gene-transfer" or "lateral-gene-transfer" were used as keywords to search titles, abstracts, and keywords to identify HGT-related publications from 1991 to 2005. Articles originating from England, Scotland, Northern Ireland and Wales were re-categorized as being from the UK. Collaboration type was determined by the address of each author, where the term "single country" was assigned if the researchers' addresses were from the same country. The term "international collaboration" was designated to those articles that were co-signed by researchers from different countries.

The information downloaded included names of authors, contact address, title, year of publication, keywords, times cited, subject categories of the journal, names of journals publishing the articles, and publisher information. The records were downloaded into spreadsheet software, and additional coding was performed manually for the number of authors, country of origin of the collaborators, and impact factors of the publishing journals. Impact factors were taken from the Journal Citation Reports (JCR) published in 2005.

To assess the visibility of an article, the authors used the number of times it was cited as an indicator. The number of times cited for an article, however, is highly correlated with the length of time since its publication. To adjust for that, a new variable was created (Chuang et al., 2007). Figure 1 shows the relationship between the average number of times cited per paper and the number of years since the paper's publication for all HGT-related articles from 1991 to 2005. It shows that the frequency of being cited was highest in the second full year since publication, and began to decrease thereafter. To adjust for bias due to differences in the length of time since publication, a new variable, TC2 (times cited before year 2), instead of just times cited since publication, was used to assess the visibility of articles. A TC2 for the year 2003 would be the number of times being cited before 2006 for all articles published in 2003. Another variable, CPP (citations per publication), for articles published in a particular year was calculated as TC2 divided by the number of articles published in that year. In some cases, the authors only discuss documents published in the period 1991-2003, since articles published after 2003 would not have TC2 and CPP values during the analyzing period, i.e. 1991-2005.

Figure 1. Citations per article by article life for 1,208 HGT-related articles from 1991 to 2005
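The coding rules and indicators just described can be summarized in a short sketch; the record layout and field names here are hypothetical stand-ins for the downloaded SCI fields rather than the authors' actual spreadsheet.

def collaboration_type(author_countries):
    # "single country" if every author address is from one country,
    # otherwise "international collaboration".
    return ("single country" if len(set(author_countries)) == 1
            else "international collaboration")

def tc2(article):
    # TC2: citations received in the publication year and the two
    # following years, i.e. before the second full year has passed.
    return sum(count for year, count in article["citations_by_year"].items()
               if year <= article["year"] + 2)

def cpp(articles, pub_year):
    # CPP for a year: total TC2 of that year's articles divided by
    # the number of articles published in that year.
    cohort = [a for a in articles if a["year"] == pub_year]
    return sum(tc2(a) for a in cohort) / len(cohort) if cohort else 0.0

records = [  # hypothetical records
    {"year": 2003, "countries": ["USA", "Germany"],
     "citations_by_year": {2003: 1, 2004: 5, 2005: 7, 2006: 9}},
    {"year": 2003, "countries": ["Japan"],
     "citations_by_year": {2004: 2, 2005: 4}},
]
print(collaboration_type(records[0]["countries"]))  # international collaboration
print(cpp(records, 2003))                           # (1+5+7 + 2+4) / 2 = 9.5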
Results and discussion

Document type and language

There were 1,549 HGT-related documents published from 1991 to 2005. The distribution of document types identified by ISI is listed in Table I. For the period 1991-2003, the journal article was the most frequent document type, with 765 articles comprising 79 percent of total production and a CPP of 14, followed distantly by 173 reviews comprising 18 percent of total production with a CPP of 20. Editorial material, letters, meeting abstracts, corrections, reprints, bibliographies, news items, notes, and software reviews showed lesser significance than articles and reviews. As journal articles represented the majority of document types, and were also peer-reviewed within this field, a total of 1,208 relevant articles in the period 1991-2005 was identified and analyzed. The percentage of reviews related to HGT (18 percent) was notably high, however, implying that HGT research is highly comprehensive. The concept of HGT provides a penetrating insight into many significant phenomena of life, such as evolution, adaptation, genetic modification, recombination, antibiotic resistance, conjugation, transduction, and transformation.

Table I. Document distributions from 1991 to 2005 with CPP from 1991 to 2003

Document type       P (1991-2005)  P (1991-2003)  TC2     CPP
Article             1,208 (78)     765 (79)       10,823  14
Review              282 (18)       173 (18)       3,378   20
Editorial material  28 (1.8)       18 (1.8)       173     10
Letter              16 (1.0)       10 (1.0)       136     14
Meeting abstract    6 (0.39)       3 (0.31)       0       0
Correction          3 (0.19)       1 (0.10)       0       0
Reprint             2 (0.13)       2 (0.21)       25      13
Bibliography        1 (0.065)      0 (0)          0       0
News item           1 (0.065)      1 (0.10)       0       0
Note                1 (0.065)      1 (0.10)       13      13
Software review     1 (0.065)      0 (0)          0       0
Total               1,549          974            14,548  15

Notes: Figures in parentheses are percentages. P, number of papers; TC2, times cited before the second full year since publication; CPP, citation per publication.

The predominant language for all journal articles was English (99 percent); others were published in French (four articles, 0.44 percent), German (two articles, 0.22 percent), and Chinese and Russian (one article each, 0.11 percent). Garfield and Welljamsdorof (1992) reported that English is the main language of microbiology research, accounting for 90-95 percent of all SCI papers. In addition, it could be expected that English would be used more frequently because more journals listed in ISI were published in English.

Chronological publication output

There were 1,208 HGT-related articles published from 1991 to 2005. Table II shows that the number increased significantly from 1991 (12 articles) to 1997 (32 articles). After 1997 there was a large increase, reaching 195 articles in 2004 and 248 articles in 2005. Table II also shows TC2 and CPP during the period 1991-2003. Only 765 articles had CPP values. The average CPP was 14. The lowest CPP was found in 1995 at 8.2, while the highest CPP occurred in 1997 at 27.

Table II. Article characteristics by year of publication

Year   No. of articles  TC2    CPP  ICA
1991   12               199    17   3 (25)
1992   11               110    10   4 (36)
1993   16               176    11   7 (44)
1994   24               236    10   3 (13)
1995   27               222    8    4 (15)
1996   35               320    9    8 (23)
1997   32               871    27   6 (19)
1998   53               566    11   13 (25)
1999   67               1,177  18   17 (25)
2000   87               1,277  15   22 (25)
2001   114              2,171  19   38 (33)
2002   128              1,454  11   37 (29)
2003   159              2,044  13   30 (19)
2004   195                          60 (31)
2005   248                          79 (32)
Total  1,208                        331 (27)

Notes: Figures in parentheses are percentages. TC2, times cited before the second full year since publication; CPP, citation per publication; ICA, international co-authorship.

Figure 2. Relationships among the number of articles, citations per article, and year

Figure 2 also shows that CPP has fluctuated over the years, with a peak in 1997. The reason for this is that Kunst et al. (1997) published "The complete genome sequence of the Gram-positive bacterium Bacillus subtilis" in Nature, with a TC2 of 499. This article analyzed the complete genetic information of a bacterial strain, Bacillus subtilis, the best-characterized member of the Gram-positive bacteria. In its genome, many genes are involved in the synthesis of secondary metabolites including antibiotics, and a few prophages or remnants of prophages are contained, indicating that bacteriophage
In its genome, many genes are involved in the synthesis of secondary metabolites including antibiotics, and a few prophages or remnants of prophages are contained, indicating that bacteriophage 1991-2005 1991-2003 Document type P P TC2 CPP Article 1,208 (78) 765 (79) 10,823 14 Review 282 (18) 173 (18) 3,378 20 Editorial material 28 (1.8) 18 (1.8) 173 10 Letter 16 (1.0) 10 (1) 136 14 Meeting abstract 28 (0.39) 3 (0.31) 0 0 Correction 3 (0.19) 1 (0.10) 0 0 Reprint 2 (0.13) 2 (0.21) 25 13 Bibliography 1 (0.065) 0 (0) 0 0 News item 1 (0.065) 1 (0.10) 0 0 Note 1 (0.065) 1 (0.10) 13 13 Software review 1 (0.065) 0 (0) 0 0 Total 1,549 974 14,548 15 Notes: Figures in parentheses are percentages. P, number of papers; TC2, times cited before the second full year since publication; CPP, citation per publication Table I. Document distributions from 1991 to 2005 with CPP from 1991 to 2003 Gene transfer- related research 189 Year No. of articles TC2 CPP ICA 1991 12 199 17 3 (25) 1992 11 110 10 4 (36) 1993 16 176 11 7 (44) 1994 24 236 10 3 (13) 1995 27 222 8 4 (15) 1996 35 320 9 8 (23) 1997 32 871 27 6 (19) 1998 53 566 11 13 (25) 1999 67 1,177 18 17 (25) 2000 87 1,277 15 22 (25) 2001 114 2,171 19 38 (33) 2002 128 1,454 11 37 (29) 2003 159 2,044 13 30 (19) 2004 195 60 (31) 2005 248 79 (32) Total 1,208 331 (27) Notes: Figures in parentheses are percentages. TC2, times cited before the second full year since publication; CPP, citation per publication; ICA, international co-authorship Table II. Article characteristics by year of publication Figure 2. Relationships among the number of articles, citation per articles, and year OCLC 25,3 190 infection has played an important evolutionary role in horizontal gene transfer. Since its publication, this article was cited 1,633 times up to 2005 by 49 countries. Later, increasing studies of genes and genomes have indicated that considerable horizontal transfer has occurred between different species, leading to the steep increase in the number of HGT-related articles published after 1997. International collaboration Of the 1,208 articles, 331 articles, or about 27 percent, had international co-authorship (ICA). The annual percentage of articles with ICA is listed in Table II. The percentage of ICA articles was highest in 1993 at 44 percent, followed by 1992 at 36 percent, and 2005 at 32 percent. In general, ICA articles were more prevalent in recent years than earlier years. Using five-year intervals, the percentages of articles with ICA were 23 percent, 24 percent, and 29 percent for the periods 1991-1995, 1996-2000, and 2001-2005, respectively. It has been reported that the European Union is becoming more important as a scientific collaboration partner of both advanced and developing countries (Glänzel et al., 1999). In the case of stroke-related research in Taiwan, international co-authorship also increased. The percentages of articles with ICA were 14 percent, 17 percent, and 23 percent for the periods 1991-1995, 1996-2000, and 2001-2005, respectively (Chuang et al., 2007). Table III lists the ten most productive countries in total publications between 1991 and 2003, with ICA and CPP values. Among the 765 articles with CPP information from 1991 to 2003, international articles comprised 25 percent of the articles with a CPP of 29, compared to 75 percent from single countries, with a CPP of 13. International collaboration is a factor that attracts citations (de Granda Orive et al., 2007). 
The most highly cited European papers were found to be the multinational papers (Narin et al., 1991). In the case of stroke-related research in Taiwan, the CPP values of articles with international co-authorship were likewise significantly higher (Chuang et al., 2007). Horizontal gene transfer-related articles with ICA had significantly higher CPP values. It would be reasonable to assume that more international collaboration would lead to more output due to the sharing of ideas and workloads (Chuang et al., 2007).

Meanwhile, single-country articles were produced by authors from 36 different countries, with the majority originating from the USA (192; 34 percent) with a CPP of 18, followed by Germany (85; 15 percent) with a CPP of 11. Twenty countries contributed only one or two single-country articles. The country with the most international co-authorship was also the USA with 97 articles, comprising 51 percent of the total number of internationally co-authored articles, with an average CPP of 24. Germany was the country with the second greatest number of international collaborations, with 61 articles and an average CPP of 27. Nineteen countries contributed only one or two international collaborative articles. The ICA article with the highest TC2 value (Kunst et al., 1997) was co-authored by researchers from France, Japan, Germany, The Netherlands, the UK, the USA, Poland, and Ireland. In addition, the USA was the most productive and cited country in patent ductus arteriosus treatments (Hsieh et al., 2004), asthma in children (Chen et al., 2005), biomedicine (Figueredo et al., 2003), ophthalmic research (Ohba, 2005), and otolaryngology research (Cimmino et al., 2005).

Table III. Ten most productive countries in total publications from 1991 to 2005

Country          SP        CPP  CP        CPP  TP        CPP
USA              192 (34)  18   97 (51)   24   289 (38)  20
Germany          85 (15)   11   61 (32)   27   146 (19)  18
UK               46 (8.0)  10   38 (20)   26   84 (11)   18
France           34 (5.9)  16   30 (16)   36   64 (8.4)  25
Canada           34 (5.9)  12   25 (13)   13   59 (7.7)  13
Japan            30 (5.2)  18   10 (5.2)  54   40 (5.2)  27
The Netherlands  11 (1.9)  7.9  16 (8.3)  44   27 (3.5)  29
Sweden           17 (3.0)  10   10 (5.2)  10   27 (3.5)  10
Spain            13 (2.3)  7.4  13 (6.8)  47   26 (3.4)  27
Australia        14 (2.4)  10   12 (6.3)  12   26 (3.4)  11

Notes: Figures in parentheses are percentages. SP, single country publications; CP, international collaborative publications; TP, total publications; CPP, citation per publication.

Journals and subject categories

The 765 articles with CPP were published in 213 journals during 1991 to 2003. Table IV shows the ten journals that published the most articles, the number of articles published by each journal, CPP, ranking order of CPP, IF, and ranking order of IF in a subject category. The journal that published the greatest number of articles was Journal of Bacteriology, with 53 articles, followed by Proceedings of the National Academy of Sciences of the United States of America, and Journal of Molecular Evolution. The 48 articles published in Proceedings of the National Academy of Sciences of the United States of America had the highest CPP (31) and the highest IF (10.231) among the top ten journals. The 765 articles with subject category and CPP information were included in 58 subject categories.

Table IV. Core journals publishing horizontal gene transfer-related articles

Journal | Articles | CPP | CPP rank | IF (2005) | Subject category (IF rank)
Journal of Bacteriology | 53 (6.9) | 12 | 41 | 4.167 | Microbiology (16/86)
Proceedings of the National Academy of Sciences of the United States of America | 48 (6.3) | 31 | 7 | 10.231 | Multidisciplinary sciences (3/48)
Journal of Molecular Evolution | 41 (5.4) | 9.2 | 57 | 2.703 | Biochemistry and molecular biology (108/261); Evolutionary biology (13/33); Genetics and heredity (59/124)
Molecular Biology and Evolution | 36 (4.7) | 13 | 39 | 6.233 | Biochemistry and molecular biology (32/261); Evolutionary biology (4/33); Genetics and heredity (15/124)
Applied and Environmental Microbiology | 30 (3.9) | 12 | 44 | 3.818 | Biotechnology and applied microbiology (21/139); Microbiology (19/86)
FEMS Microbiology Letters | 24 (3.1) | 5.7 | 99 | 2.057 | Microbiology (53/86)
Gene | 23 (3.0) | 8.0 | 69 | 2.694 | Genetics and heredity (60/124)
Molecular Microbiology | 22 (2.9) | 14 | 36 | 6.203 | Biochemistry and molecular biology (33/261); Microbiology (11/86)
Infection and Immunity | 20 (2.6) | 11 | 48 | 3.933 | Immunology (22/115); Infectious diseases (8/43)
Microbiology-SGM | 18 (2.4) | 7.8 | 70 | 3.173 | Microbiology (21/86)

Notes: Figures shown in parentheses after article counts are percentages. CPP, citations per publication published in respective journals; IF, impact factor of the journal in 2005.
In the research history, HGT and recombination between two streptococcal lineages was first reported in the subject category of immunology and infectious diseases (Simpson et al., 1992); later, the horizontal transfer of bacterial heavy metal resistance genes was first presented in the subject category of environmental-related fields (Dong et al., 1998). Today, researchers in a number of unrelated fields are making observations related to HGT, leading to an unusual breadth of topics. Table V shows categories that had at least ten articles. The three top categories with the largest number of articles were Microbiology (279), Biochemistry & Molecular Biology (223), and Genetics & Heredity (180). All numerical analyses used integer counts, i.e. if an article was included in two or more different subject categories, each subject category was counted once; thus in these instances the percentages will add up to more than 100 percent.

Table V. Number of articles and CPP by subject category

Subject category                        Article  Percentage  TC2    CPP
Microbiology                            279      36          2,537  9.1
Biochemistry and molecular biology      223      29          2,741  12
Genetics and heredity                   180      24          2,073  12
Biotechnology and applied microbiology  102      13          1,230  12
Evolutionary biology                    97       13          1,013  10
Multidisciplinary sciences              75       9.8         3,587  48
Infectious diseases                     43       5.6         457    11
Immunology                              41       5.4         351    8.6
Cell biology                            31       4.1         357    12
Biology                                 27       3.5         341    13
Plant sciences                          24       3.1         176    7.3
Pharmacology and pharmacy               18       2.4         142    7.9
Virology                                15       2.0         158    11
Biophysics                              13       1.7         58     4.5
Ecology                                 12       1.6         89     7.4

Notes: TC2, times cited before the second full year since publication; CPP, citation per publication.

Distribution of KeyWords Plus

KeyWords Plus provides search terms extracted from the titles of papers cited in each new article listed in the database in ISI (Garfield, 1990). KeyWords Plus substantially augments title-word and author-keyword indexing. Examination of KeyWords Plus revealed that 2,740 keywords were used in the 754 articles. Among them, 1,946 keywords (71 percent) appeared only once, 355 keywords (13 percent) appeared twice, and 134 keywords (4.9 percent) appeared three times. The large numbers of once-only keywords are caused by a wide disparity in research focus (such as "DNA gyrase", "nitrogen-fixation", "methane", "rhizobium", and "human skin"), a special material used in research (such as "Bacillus sphaericus", "bacteriophage T7", "L12" and
Table VI shows the 19 KeyWords Plus keywords that appeared at least 30 times. The most frequently used keyword was "Escherichia coli", appearing in 29 percent of the 754 articles published in 1991-2003, with a CPP of 14. Other frequently used keywords were "evolution" at 18 percent, followed by "sequence" at 16 percent, "horizontal gene transfer" at 13 percent, and "DNA" at 12 percent. The 53 articles with the keyword "protein" and the 49 articles with the keyword "genes" had the highest CPP (22) among the 19 keywords. In addition, "project", "strands", "terminators", and "yeast artificial chromosomes" each appeared once, with the highest CPP of 499.

KeyWords Plus             No. of articles  Percent  TC2    CPP
Escherichia coli          221              29       3,143  14
Evolution                 133              18       2,116  16
Sequence                  118              16       1,629  14
Horizontal gene transfer  99               13       1,184  12
DNA                       89               12       924    10
Identification            78               10       1,321  17
Expression                76               10       1,102  15
Nucleotide sequence       70               9.3      1,024  15
Bacteria                  63               8.4      613    10
Gene                      58               7.7      1,117  19
Cloning                   54               7.2      631    12
Protein                   53               7.0      1,149  22
Genes                     49               6.5      1,055  22
Strains                   47               6.2      560    12
Sequences                 45               6.0      494    11
Genome                    39               5.2      374    10
Bacillus subtilis         33               4.4      516    16
Proteins                  32               4.2      557    17
Origin                    31               4.1      567    18
Notes: TC2, times cited before the second full year since publication; CPP, citations per publication.
Table VI. Frequency of KeyWords Plus keywords used

Information regarding popular keywords is useful in understanding the research profile. "Escherichia coli" is the most frequently used host bacterium in HGT-related research, and analyses of "DNA" and "sequence" provided further evidence that "horizontal gene transfer" might be an important driving force of "evolution".

Citation model
For the period 1991-2003, the cumulative number of citations increased. In the year of publication, the 765 articles obtained 969 citations, while three years after the articles were published (including the year of publication), the cumulative number of citations (TC2) was 10,823. A model can be used to describe the relationship between the cumulative number of citations, C, and the article life, Y (Chiu and Ho, 2005). The model can be expressed as C = KY + S, where K is the citation rate (number of times cited per year) and S is the visibility potential when a paper is published (number of times cited). Moreover, K is a measure of how quickly the "average article" in the field is cited, and S shows how often the articles published in the field are cited in the year of publication.
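As a sketch of how such a linear citation model can be fitted in practice, the following assumes the cumulative citation counts for one publication year are available as (Y, C) pairs and uses an ordinary least-squares fit via numpy.polyfit; the sample data are invented for illustration and are not the study's raw counts.

    import numpy as np

    # Cumulative citations C after Y years for one publication year
    # (illustrative values only).
    Y = np.array([0, 1, 2, 3, 4, 5], dtype=float)
    C = np.array([270, 1210, 2150, 3090, 4030, 4980], dtype=float)

    # Fit C = K*Y + S; polyfit returns the slope (citation rate K)
    # and the intercept (visibility potential S).
    K, S = np.polyfit(Y, C, 1)

    # Coefficient of determination for the fitted line.
    C_hat = K * Y + S
    r2 = 1 - ((C - C_hat) ** 2).sum() / ((C - C.mean()) ** 2).sum()
    print(f"K = {K:.1f} citations/year, S = {S:.1f}, r^2 = {r2:.3f}")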
This model fitting suggested that citations were sustained at a constant rate in each year. Figure 3 shows that significant correlations between the yearly cumulative number of citations and the article life were obtained for the years 1991-2003, with the model having high coefficients of determination (r^2 > 0.984). The results indicated that articles published in 2001 had the highest citation rate and visibility potential, followed by those published in 2003 and 1999, respectively (Table VII). In other words, the 114 research articles published in 2001 had the highest impact potential and the greatest number of times cited in each year after publication.

Year  K (times cited/year)  S (times cited)  r^2
1991  71.7                  61.2             0.984
1992  48.3                  18.3             0.990
1993  77.6                  42.0             0.981
1994  95.2                  60.9             0.986
1995  93.5                  64.5             0.980
1996  151                   28.5             0.996
1997  395                   71.3             0.996
1998  254                   63.1             0.999
1999  514                   152              0.998
2000  628                   17.5             0.998
2001  940                   274              0.999
2002  679                   101              0.998
2003  933                   116              0.987
Table VII. Citation model constants

Figure 3. Relationships between the cumulative number of citations and age of articles with simulated models

Conclusions
SCI-indexed studies on HGT have increased during the past 15 years. Journal articles were the most frequent document type, with a lower citation per publication rate than reviews. The top three countries in terms of total publications were the USA, Germany, and the UK. Articles with international co-authorship had higher visibility. Journals listed in the subject category of microbiology published the most articles, and Journal of Bacteriology published the greatest number of articles. "Escherichia coli" was the most frequently used KeyWords Plus keyword. A linear model was successfully applied to describe the relationship between the cumulative number of citations and article life; articles published in 2001 had the highest citation rate and visibility potential. The most frequently cited article was published in 1997 in Nature, the highest impact factor journal in the category of multidisciplinary sciences.

References
Andersson, J.O. (2003), "Evolution of glutamate dehydrogenase genes: evidence for lateral gene transfer within and between prokaryotes and eukaryotes", BMC Evolutionary Biology, Vol. 3 No. 14.
Baron, L.S., Carey, W.F. and Spilman, W.M. (1959), "Genetic recombination between Escherichia coli and Salmonella typhimurium", Proceedings of the National Academy of Sciences of the United States of America, Vol. 45 No. 7, pp. 976-84.
Brown, J.R. (2003), "Ancient horizontal gene transfer", Nature Reviews Genetics, Vol. 4 No. 2, pp. 121-32.
Chen, S.R., Chiu, W.T. and Ho, Y.S. (2005), "Asthma in children: mapping the literature by bibliometric analysis", Revue Française d'Allergologie et d'Immunologie Clinique, Vol. 45 No. 6, pp. 442-6.
Chiu, W.T. and Ho, Y.S. (2005), "Bibliometric analysis of homeopathy research during the period of 1991 to 2003", Scientometrics, Vol. 63 No. 1, pp. 3-23.
Chuang, K.Y., Huang, Y.L. and Ho, Y.S. (2007), "A bibliometric and citation analysis of stroke-related research in Taiwan", Scientometrics, Vol. 72 No. 2, pp. 201-12.
Cimmino, M.A., Maio, T., Ugolini, D., Borasi, F. and Mela, G.S. (2005), "Trends in otolaryngology research during the period 1995-2000: a bibliometric approach", Otolaryngology - Head and Neck Surgery, Vol. 132 No. 2, pp. 295-302.
Coffey, T.J., Dowson, C.G., Daniels, M. and Spratt, B.G.
(1995), "Genetics and molecular biology of beta-lactam-resistant pneumococci", Microbial Drug Resistance: Mechanisms, Epidemiology and Disease, Vol. 1 No. 1, pp. 29-34.
de Granda Orive, J.I., Río, F.G., Benavent, R.A., Zurián, J.C.V., Ruiz, C.A.J., Reina, S.S., Serrano, S.V. and Arroyo, A.A. (2007), "Spanish productivity in smoking research relative to world and European Union productivity from 1999 through 2003, analyzed with the Science Citation Index", Archivos de Bronconeumologia, Vol. 43 No. 4, pp. 212-18.
de Lipthay, J.R., Barkay, T. and Sorensen, S.J. (2001), "Enhanced degradation of phenoxyacetic acid in soil by horizontal transfer of the tfdA gene encoding a 2,4-dichlorophenoxyacetic acid dioxygenase", FEMS Microbiology Ecology, Vol. 35 No. 1, pp. 75-84.
Di Giovanni, G.D., Neilson, J.W., Pepper, I.L. and Sinclair, N.A. (1996), "Gene transfer of Alcaligenes eutrophus JMP134 plasmid pJP4 to indigenous soil recipients", Applied and Environmental Microbiology, Vol. 62 No. 7, pp. 2521-6.
Dong, Q.H., Springael, D., Schoeters, J., Nuyts, G., Mergeay, M. and Diels, L. (1998), "Horizontal transfer of bacterial heavy metal resistance genes and its applications in activated sludge systems", Water Science and Technology, Vol. 37 No. 4/5, pp. 465-8.
Eisen, J.A. (2000), "Horizontal gene transfer among microbial genomes: new insights from complete genome analysis", Current Opinion in Genetics & Development, Vol. 10 No. 6, pp. 606-11.
Erwin, D.H. and Valentine, J.W. (1984), "Hopeful monsters, transposons, and metazoan radiation", Proceedings of the National Academy of Sciences of the United States of America - Biological Sciences, Vol. 81 No. 17, pp. 5482-3.
Figueredo, E., Perales, G.S. and Blanco, F.M. (2003), "International publishing in anaesthesia - how do different countries contribute?", Acta Anaesthesiologica Scandinavica, Vol. 47 No. 4, pp. 378-82.
Garfield, E. (1990), "KeyWords Plus - ISI's breakthrough retrieval method. 1. Expanding your searching power on Current Contents on diskette", Current Contents, Vol. 32, pp. 5-9.
Garfield, E. and Welljams-Dorof, A. (1992), "The microbiology literature: languages of publication and their relative citation impact", FEMS Microbiology Letters, Vol. 100 Nos 1-3, pp. 33-7.
Glänzel, W., Schubert, A. and Czerwon, H.J. (1999), "A bibliometric analysis of international scientific cooperation of the European Union (1985-1995)", Scientometrics, Vol. 45 No. 2, pp. 185-202.
Hartman, H. (1984), "The origin of the eukaryotic cell", Speculations in Science and Technology, Vol. 7 No. 2, pp. 77-81.
Heinemann, J.A. (1998), "Looking sideways at the evolution of replicons", in Syvanen, M. and Kado, C.I. (Eds), Horizontal Gene Transfer, Chapman & Hall, London.
Herrick, J.B., Stuart-Keil, K.G., Ghiorse, W.C. and Madsen, E.L. (1997), "Natural horizontal transfer of a naphthalene dioxygenase gene between bacteria native to a coal tar-contaminated field site", Applied and Environmental Microbiology, Vol. 63 No. 6, pp. 2330-7.
Hsieh, W.H., Chiu, W.T., Lee, Y.S. and Ho, Y.S. (2004), "Bibliometric analysis of patent ductus arteriosus treatments", Scientometrics, Vol. 60 No. 2, pp. 205-15.
Jain, R., Rivera, M.C., Moore, J.E. and Lake, J.A. (2003), "Non-clonal evolution of microbes", Biological Journal of the Linnean Society, Vol. 79 No. 1, pp. 27-32.
Ka, J.O. and Tiedje, J.M.
(1994), "Integration and excision of a 2,4-dichlorophenoxyacetic acid-degradative plasmid in Alcaligenes paradoxus and evidence of its natural intergeneric transfer", Journal of Bacteriology, Vol. 176 No. 17, pp. 5284-9.
Koonin, E.V., Makarova, K.S. and Aravind, L. (2001), "Horizontal gene transfer in prokaryotes: quantification and classification", Annual Review of Microbiology, Vol. 55, pp. 709-42.
Kunst, F., Ogasawara, N., Moszer, I., Albertini, A.M. and Alloni, G. (1997), "The complete genome sequence of the gram-positive bacterium Bacillus subtilis", Nature, Vol. 390 No. 6657, pp. 249-56.
Miyake, T. and Demerec, M. (1959), "Salmonella-Escherichia hybrids", Nature, Vol. 183, pp. 1586-8.
Narin, F., Stevens, K. and Whitlow, E.S. (1991), "Scientific cooperation in Europe and the citation of multinationally authored papers", Scientometrics, Vol. 21 No. 3, pp. 313-23.
Ochiai, K., Yamanaka, T., Kimura, K. and Sawada, O. (1959), "Inheritance of drug resistance (and its transfer) between Shigella and E. coli strains", Nihon Iji Shimpo, Vol. 1861, p. 34.
Ochman, H. and Selander, R.K. (1984), "Evidence for clonal population structure in Escherichia coli", Proceedings of the National Academy of Sciences of the United States of America - Biological Sciences, Vol. 81 No. 1, pp. 198-201.
Ohba, N. (2005), "Bibliometric analysis of the current international ophthalmic publications", Nippon Ganka Gakkai Zasshi, Vol. 109 No. 3, pp. 115-25.
Penalva, M.A., Moya, A., Dopazo, J. and Ramon, D. (1990), "Sequences of isopenicillin N synthetase genes suggest horizontal gene transfer from prokaryotes to eukaryotes", Proceedings of the Royal Society of London, Series B - Biological Sciences, Vol. 241 No. 1302, pp. 164-9.
Rosewich, U.L. and Kistler, H.C. (2000), "Role of horizontal gene transfer in the evolution of fungi", Annual Review of Phytopathology, Vol. 38 No. 1, pp. 325-63.
Setliff, E.C. (1983), "Aporpium - an example of horizontal gene transfer", Mycotaxon, Vol. 18 No. 1, pp. 19-21.
Shoemaker, N.B., Vlamakis, H., Hayes, K. and Salyers, A.A. (2001), "Evidence for extensive resistance gene transfer among Bacteroides spp. and among Bacteroides and other genera in the human colon", Applied and Environmental Microbiology, Vol. 67 No. 2, pp. 561-8.
Simpson, W.J., Musser, J.M. and Cleary, P.P. (1992), "Evidence consistent with horizontal transfer of the gene (Emm12) encoding serotype M12 protein between group A and group G pathogenic streptococci", Infection and Immunity, Vol. 60 No. 5, pp. 1890-3.
Syvanen, M. (1985), "Cross-species gene transfer - implications for a new theory of evolution", Journal of Theoretical Biology, Vol. 112 No. 2, pp. 333-43.
Syvanen, M. (1987), "Molecular clocks and evolutionary relationships - possible distortions due to horizontal gene flow", Journal of Molecular Evolution, Vol. 26 No. 1/2, pp. 16-23.
Syvanen, M. and Kado, C.I. (1998), Horizontal Gene Transfer, Chapman & Hall, London.
Top, E.M., Springael, D. and Boon, N. (2002), "Catabolic mobile genetic elements and their potential use in bioaugmentation of polluted soils and waters", FEMS Microbiology Ecology, Vol. 42 No. 2, pp. 199-208.
van der Meer, J.R., Werlen, C., Nishino, S.F. and Spain, J.C. (1998), "Evolution of a pathway for chlorobenzene metabolism leads to natural attenuation in contaminated groundwater", Applied and Environmental Microbiology, Vol. 64 No. 11, pp. 4185-93.
Went, F.W. (1971), "Parallel evolution", Taxon, Vol. 20, pp. 197-226.
Wilson, M.M. and Metcalf, W.W.
(2005), "Genetic diversity and horizontal transfer of genes involved in oxidation of reduced phosphorus compounds by Alcaligenes faecalis WM2072", Applied and Environmental Microbiology, Vol. 71 No. 1, pp. 290-6.
Zolezzi, P.C., Laplana, L.M., Calvo, C.R., Cepero, P.G., Erazo, M.C. and Gomez-Lus, R. (2004), "Molecular basis of resistance to macrolides and other antibiotics in commensal viridans group streptococci and Gemella spp. and transfer of resistance genes to Streptococcus pneumoniae", Antimicrobial Agents and Chemotherapy, Vol. 48 No. 9, pp. 3462-7.

Further reading
Vergidis, P.I., Karavasiou, A.I., Paraschakis, K., Bliziotis, I.A. and Falagas, M.E. (2005), "Bibliometric analysis of global trends for research productivity in microbiology", European Journal of Clinical Microbiology & Infectious Diseases, Vol. 24 No. 5, pp. 342-5.

About the authors
Donghui Wen is an Associate Professor in the College of Environmental Sciences and Engineering, Peking University, China. Her research interests include environmental biotechnology and wastewater treatment. To link the two research areas, she uses molecular methods to detect the distribution of target bacteria and genes, the microbial community structure, functional gene expression, and the behaviour of mobile genetic elements during wastewater treatment in biological reactors.
Te-Chen Yu obtained his first degree from the School of Public Health (2004) at Taipei Medical University, Taiwan. He is a Research Assistant at the Bibliometric & Research Centre, I-Shou University, Taiwan. His area of specialization is bibliometric and research studies.
Yuh-Shan Ho obtained his PhD (1995) at the University of Birmingham, UK. He has published 110 papers in refereed journals, which have attracted more than 3,500 citations. He is the Executive Editor of the Journal of Environmental Protection Science and the Director of the Bibliometric & Research Centre, I-Shou University, Taiwan. His research interests are the adsorption process for water and wastewater treatment, and bibliometric studies. Yuh-Shan Ho is the corresponding author and can be contacted at: ysho@isu.edu.tw

----

MARC Formats in China: Local or International?
Ben Gu

1 Introduction
The use of MARC tapes was the first step for Chinese librarians in using MARC records. It can be traced back to 1980, when the National Library of China (NLC) imported LC MARC tapes for test retrieval of books in Western languages. In July 1981, the Party School of the Central Committee of the Communist Party of China (CPC) imported LC MARC tapes containing 413,000 records covering April 1979-April 1981. The tapes were used to retrieve books on the international communist movement, and there were about 1,000 hit records. Because Chinese libraries had never used such new technology before, the work even had to be approved by the CPC Secretariat, the top-level management at that time (An, "Yin jin MARC ci dai de shou ci ying yong"). The test work was done by a MARC Cooperation Group jointly formed by the National Library of China, Peking University Library, Tsinghua University Library, Renmin University Library and the China National Publications Import & Export Corporation (An, "MARC ci dai yu wo guo xi wen bian mu"; Chen, "Beijing di qu xi wen tu shu ji du mu lu yan zhi jin zhan").
Actually, Chinese librarians began to pay attention to LC MARC as early as 1975, but they could do nothing beyond research, because of the lack of international exchange during the "Cultural Revolution" period (1966-1976) (Liu). In 1985 Chinese libraries began to create bibliographic records of monographs in Chinese, and had accumulated about 1 million records by the end of 1990.

2 Decision of CNMARC Based on UNIMARC
After the first publication of UNIMARC in 1977 and the second edition in 1980, the Chinese library cataloging community, which was just then trying to catch up with international developments, decided to develop a Chinese local format based on UNIMARC, because it was an IFLA-developed international standard and even the Library of Congress had announced that it would implement UNIMARC. (This was what Chinese librarians knew about LC's decision; in fact, LC's implementation was to provide conversion between USMARC and UNIMARC.) (IFLA). There were other possible reasons for Chinese libraries to use UNIMARC as the basis of CNMARC:
• Main Entry Issue: Chinese libraries do not use main entries in the cataloging of Chinese materials; therefore USMARC, which was built around the concept of main entry, seemed strange to Chinese catalogers.
• MARC Expert Suggestion: It is said that Henriette Avram, the first developer of the MARC format, visited China in the late 1980s and affirmed the feasibility of Chinese libraries using UNIMARC.
• Japanese Example: Japanese libraries had adopted MARC earlier and had based JapanMARC on UNIMARC.
• USMARC Shortcomings: There were some shortcomings in USMARC, and even other countries using USMARC made modifications and maintained their own versions.
Because of the lack of historical records, the author can offer the above reasons only on the basis of interviews with Mr. Yan ZHU, Ms. Beixin SUN and Mr. Yuanzheng CHEN, who were members of the group that drafted the first edition of CNMARC. People can hardly imagine the conditions of the 1970s and 1980s, when international exchange of information was extremely limited.

3 CNMARC and CMARC
The process of compiling the Chinese national format was as follows: the National Library of China began to compile the Chinese MARC Communication Format (later called CNMARC) in 1986 and completed a draft by the end of that year (Sun); it sent copies to other libraries for national review in January 1987 and revised the draft according to the latest edition of UNIMARC (1987) in early 1988. The format was reviewed a second time during a special workshop hosted by the Library Society of China in 1989 ("Zhongguo ji du mu lu ge shi xue shu yan tao hui ji yao") and was finally published by the Bibliography and Documentation Publishing House in February 1991. At the same time, the National Library began to distribute CNMARC records to libraries in China and around the world ("Beijing tu shu guan xiang guo nei wai fa xing ji du mu lu"). More and more libraries have since adopted CNMARC for the cataloging of Chinese materials (Chen, "Zhong wen tu shu ji du mu lu zhu ti biao yin cai yong hou kong gui fan de she xiang").
In Taiwan Province, a local MARC format (later called CMARC) for monographs, strictly based on UNIMARC, was published as early as 1981 and revised in almost the same year, while a MARC format for all types of publications was published in July 1984. This format predated the CNMARC of the Chinese mainland and was used as an example in the drafting of CNMARC (Chen, "Da lu he Taiwan de Zhong wen MARC bi jiao"). Although both are based on UNIMARC, CNMARC and CMARC are slightly different, especially in their use of character sets and terminology (Zhu and Mengjie).
In January 2002, the China MARC Format / Authorities was released as an industrial standard of the Ministry of Culture (WH/T15-2002). It was based on UNIMARC / Authorities, published by IFLA in 1991. Since then, there has been no revision of the format (China MARC Format / Authorities).

4 UNIMARC Translations
The UNIMARC Bibliographic Format was translated into Chinese in 1986. However, the translation was used only internally for the drafting of CNMARC, and was not published or distributed through commercial channels (Chen, "Zhongguo ji du mu lu tong xun ge shi cun zai de wen ti yu xiu gai jian yi"). There have been translations of some later editions, but none of them has been published. Translation is very important for a correct understanding of the original UNIMARC text. In the Manual of the New Edition of CNMARC of March 2004 (Guo jia tu shu guan, Manual of the New Edition of CNMARC), for example, the EAN (073 - International Article Number) was misunderstood as a kind of number assigned to articles in published journals. By the end of 2013, the Chinese translation of the UNIMARC Manual: Authorities Format (3rd Edition, 2009) will be published by the National Library of China Publishing House. The translation was done by the Working Group for the Drafting of the CNMARC Authorities Format, a national standard proposal supervised by the National Library Standardization Technical Committee in 2009 and approved by the Standardization Administration of the People's Republic of China.

5 CNMARC Revisions
After the publication of CNMARC in 1991, there were some criticisms of it, pointing out that:
• There are too many fields and subfields.
• There are some duplicated fields or subfields.
• The definitions of some data elements are not clear.
In March 1993, the Ministry of Culture approved the proposal of the CNMARC Format as a library industrial standard (Qi). The standard (WH/T0503-96) was published in 1996 and came into force on July 1, 1997 (Zhu). This standard was based on the 1994 edition of the UNIMARC Manual. At the same time, the main contents of the standard were included in the published China MARC Format Manual (1995) ("Zhongguo ji du mu lu ge shi ji shi yong shou ce nian nei chu ban"), and a revised edition of the manual was published in 2001 (Pan). The industrial standard (WH/T0503-96) was later revised according to the latest edition (2002) of UNIMARC and was intended to become a national standard. The draft national standard was completed and approved by an expert committee organized by the Ministry of Culture in 2003, and its major contents were published as the Manual of the New Edition of CNMARC in March 2004. Because of some procedural problems, the national standard was never finally released.
In December 2009, the Standardization Administration of the People's Republic of China approved the proposal of two national standards, CNMARC Bibliographic Format and CNMARC Authorities Format.[1] The two national standards were planned for release in 2011, but they have been delayed by personnel and technical problems.

6 Special Characteristics in CNMARC
In addition to some minor changes to the UNIMARC format, the major differences of CNMARC are reflected in the definition of some local fields (9--, -9-, --9) and subfields ($9, and $A, $B, etc.). For example (Bibliographic format):
• 091 Union Books and Serials Number: numbers assigned by Chinese administrative agencies for books and serials, especially before the implementation of ISBN and ISSN in China.
• 092 Order Number: numbers assigned by distributors.
• 094 Standard Publication Number: for the numbers of international, national, industrial or enterprise standards.
• 191 Coded Data Field: Rubbings.
• 192 Coded Data Field: Ethnic Music of China.
• 193 Coded Data Field: Chinese Antiquarian - General.
• 194 Coded Data Field: Chinese Antiquarian - Copy Specific Attributes.
• 393 Outsystem Chinese Character Note: for the description of Chinese characters not defined in the character set.
• 690 Chinese Library Classification.
• 692 Classification for Library of Chinese Academy of Sciences.
• 696 Other Local Class Numbers.
• 905 Holding Information: used in the second edition of the China MARC Format Manual (2001), when Chinese libraries did not use a separate MARC format for holdings records.
[1] Standardization Administration of the People's Republic of China, December 14, 2009: http://www.sac.gov.cn/gjbzjh/201012/t20101213_56788.htm.
As to Pinyin romanization, NLC and public libraries use subfield $9 to include all the Latin script corresponding to the Chinese characters in the major subfields of a field, while CALIS (China Academic Library & Information System) and university libraries use $A, $B, etc. for the Latin script corresponding to the Chinese characters in $a, $b, etc. Besides, CNMARC does not include some UNIMARC fields that are not applicable to Chinese library cataloging, e.g. 012 (Fingerprint Identifier) and 670 (PRECIS).
As libraries in China do not use the concept of main entry for the cataloging of monographs in Chinese, fields 700, 710 and 720 are usually not used. Besides, name/title access points are not used in NLC records. Therefore, there are difficulties in adding uniform titles to bibliographic records, and there are very few records with uniform titles. In NLC records, 701$b and 701$g are not used, while in CALIS records, 701$b and 701$g are used for foreign names (Xie).
Punctuation is not stored in CNMARC records; it is generated automatically by computer systems for OPAC display. However, there are some problems with this approach. For example, 200$d is for Parallel Title Proper and 200$e is for Other Title Information, and it is hard to define the automatic generation of punctuation for parallel other title information, because there are many possibilities for 200$e. CALIS solved this problem by adding punctuation in these particular subfields.

7 Co-Existence of Two MARC Formats
In China, libraries use two MARC formats, MARC21 and CNMARC.
CNMARC has some special fields for Chinese publications in addition to the UNIMARC fields. Most small libraries use UNIMARC-based CNMARC for all materials, for ease of management. Large libraries with sizeable collections in foreign languages prefer USMARC/MARC21 to CNMARC, for international compatibility and easy copy cataloging.
In the National Library of China, we use CNMARC, the Chinese Library Classification and the Classified Chinese Thesaurus for Chinese publications, and we use MARC21 for foreign publications, including those in Western languages, Japanese and Russian. For publications in Western languages, we use AACR2 (later RDA), LC Subject Headings, the LC Name Authority File and the Chinese Library Classification. The Aleph 500 system allows us to maintain two separate databases in CNMARC and MARC21 formats respectively. There is no relationship between the two databases now. We considered the possibility of establishing relationships between the two authority databases in CNMARC and MARC21 respectively, but the estimated cost was very high.
In the Chinese mainland, most libraries use UNIMARC-based CNMARC for Chinese publications. In the Hong Kong Special Administrative Region, most libraries follow the practices of Western countries and use MARC21 for all publications. In the Macau Special Administrative Region, some libraries use CNMARC and some use MARC21. Large libraries in Taiwan Province use CMARC (also based on UNIMARC and similar to CNMARC) or MARC21; mid-sized libraries favor CMARC. Libraries using UNIMARC-based formats also have different rules, especially for name headings and authority records. As to romanization, most libraries now use Pinyin, but they sometimes use different segmentation, for example "mao ze dong" or "Mao Zedong" (Gu).

8 Holdings Format
At the early stage of library automation in China, no local library automation system used relational database technology, and there was no need for a separate holdings format: libraries used tag 905 in the bibliographic format to record holdings data. Since the beginning of the 21st century, more and more libraries have adopted international systems, such as Aleph 500, and have had to consider the use of a holdings format. Because there was no UNIMARC holdings format, a research group of the National Library of China considered the localization of the MARC21 Format for Holdings Data. In 2003, a Chinese translation of the MARC21 Format for Holdings Data was published (Guo jia tu shu guan, MARC21 Format for Holdings Data), and the format began to be applied in NLC and other libraries.

9 Problems in International Exchange
CNMARC is used mainly for the distribution of MARC records within China. We sometimes exchange information with East Asian libraries, and CNMARC records are easy to convert into JapanMARC (Japan) and CMARC (Taiwan Province) records, which are based on UNIMARC. However, we have difficulties in sharing bibliographic information with libraries in Hong Kong, a special administrative region of China, which use MARC21, and with those in the Republic of Korea, which use KorMARC, based on MARC21. Further, we have even more difficulties in sharing CNMARC records with foreign libraries, most of which use the MARC21 format.
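The conversion difficulty can be illustrated with a deliberately simplified field-mapping sketch. The three-tag crosswalk below (CNMARC 200 title to MARC21 245, CNMARC 701 personal name to MARC21 700, local 690 class number to 084) is our own toy example, not the actual NLC/OCLC conversion table; everything that falls outside such a mapping is precisely what makes exact automatic conversion so hard.

    # Toy CNMARC-to-MARC21 crosswalk; the tag choices are illustrative
    # and do not reproduce any official conversion specification.
    TAG_MAP = {
        "200": "245",  # title and statement of responsibility
        "701": "700",  # personal name added entry
        "690": "084",  # Chinese Library Classification number
    }

    def convert(record):
        """Map a CNMARC record, given as (tag, subfields) pairs, to MARC21."""
        converted, unmapped = [], []
        for tag, subfields in record:
            if tag in TAG_MAP:
                converted.append((TAG_MAP[tag], subfields))
            else:
                unmapped.append(tag)  # left for manual upgrading
        return converted, unmapped

    cnmarc = [("200", {"a": "Hong lou meng"}),
              ("690", {"a": "I242.4"}),
              ("905", {"a": "local holdings data"})]
    marc21, leftovers = convert(cnmarc)
    print(marc21)     # [('245', {...}), ('084', {...})]
    print(leftovers)  # ['905'] -- no MARC21 equivalent in this toy map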
After almost ten years of discussion and negotiation, NLC and OCLC signed an agreement in 2008 for the batch uploading of NLC Chinese bibliographic records to the OCLC WorldCat database,[2] and 2.3 million records had been uploaded by the end of 2008.[3] This is a great step forward in making Chinese MARC records usable by foreign libraries. However, because exact CNMARC-MARC21 automatic conversion is almost impossible, and manual conversion requires a huge investment of human resources, NLC and OCLC had to choose the option of batch conversion, leaving details to be upgraded by OCLC users. The batch uploading process also adds NLC holdings information to WorldCat.

10 Future Perspectives
Although there have been talks about the death of MARC, CNMARC is still the major format for the cataloging of Chinese materials and the exchange of bibliographic records in China. If there is a need for records in other formats, such as XML, we simply convert CNMARC into them. No cataloging agency distributes XML records yet.
Since the bibliographic year 2009, the German National Library has been delivering its data in the MARC 21 format.[4] Some libraries in Taiwan Province have also switched their bibliographic format from UNIMARC-based CMARC to MARC21 in recent years.[5] Chinese librarians have to consider the following questions:
• What was the original purpose of choosing UNIMARC as the basis of CNMARC twenty years ago?
• Shall we still use UNIMARC as the basis of our cataloging?
• Shall we even abandon MARC formats and use a "modern" format?
The author has answered the first question in this article, but he cannot answer the other two questions now, and he does not think there will be any answers in the next few years. CNMARC will still be our major format for the cataloging of printed resources. Some new fields and subfields related to FRBR in the 2012 updates will be good tools for the application of FRBR in Chinese libraries.
[2] National Library of China to Add Its Records to OCLC WorldCat: http://newsbreaks.infotoday.com/Digest/National-Library-of-China-to-Add-Its-Records-to-OCLC-WorldCat-41153.asp.
[3] http://www.nlc.gov.cn/dsb_zx/gtxw/201201/t20120116_58355.htm.
[4] MARC21, Deutsche Nationalbibliothek: http://www.dnb.de/EN/Standardisierung/Formate/MARC21/marc21_node.html.
[5] http://catwizard.net/posts/20120125095904.html.

References
An, Shulan. "MARC ci dai yu wo guo xi wen bian mu". Computers and Libraries 4 (1983): 7-10. Print.
An, Shulan. "Yin jin MARC ci dai de shou ci ying yong". Computers and Libraries 4 (1981): 58. Print.
"Beijing tu shu guan xiang guo nei wai fa xing ji du mu lu". Science and Technology Daily, February 21, 1991. Print.
Chen, Fuliang. "Da lu he Taiwan de Zhong wen MARC bi jiao". Library Work and Research 1 (1994): 28-30. Print.
Chen, Yuanzheng.
"Beijing di qu xi wen tu shu ji du mu lu yan zhi jin zhan". Library and Information Service 5 (1983): 1-6, 41. Print.
Chen, Yuanzheng. "Zhong wen tu shu ji du mu lu zhu ti biao yin cai yong hou kong gui fan de she xiang". Journal of Library Science in China 4 (1991): 69-71. Print.
Chen, Yuanzheng. "Zhongguo ji du mu lu tong xun ge shi cun zai de wen ti yu xiu gai jian yi". Library Development 5 (1992): 53-57. Print.
China MARC Format / Authorities. Beijing: Zhonghua ren min gong he guo wen hua bu fa bu, 2002. Print.
Gu, Ben. "National Bibliographies: the Chinese experience". Alexandria 18.3 (2006): 173-178. Print.
Guo jia tu shu guan. Manual of the New Edition of CNMARC. Beijing: Beijing tu shu guan chu ban she, 2004. Print.
Guo jia tu shu guan. MARC21 Format for Holdings Data. Beijing: Ke xue ji shu wen xian chu ban she, 2003. Print.
IFLA. "Library of Congress Implements UNIMARC". IFLA Journal 11.3 (1985): 273-274. Print.
Liu, Rong. "Tan LC-MARC zai wo guo de shi yong". Library and Information Service 4 (1982): 22-23. Print.
Pan, Taiming. China MARC Format Manual. Beijing: Ke xue ji shu wen xian chu ban she, 2001. Print.
Qi, Siyan. "CN-MARC de wen shi ji she hui xiao ying". Library Theory and Practice 4 (1997): 29-30. Print.
Sun, Beixin. "The role of the National Library of China in standardization activities". Scholarly Information and Standardization: Proceedings of the Twelfth Open Forum on the Study of the International Exchange of Japanese Information and Scholarly Databases in East Asian Scripts 1992/1993, November 20. Tokyo, 1992. Print.
Xie, Qinfang (ed.). Manual for CALIS Online Cooperative Cataloging, Part I. Beijing: Beijing da xue chu ban she, 2000. Print.
"Zhongguo ji du mu lu ge shi ji shi yong shou ce nian nei chu ban". New Technology of Library and Information Service 6 (1995): 64. Print.
"Zhongguo ji du mu lu ge shi xue shu yan tao hui ji yao". New Technology of Library and Information Service, Suppl. (1989): 10-55. Print.
Zhu, Qingqing and Huang Mengjie. A Comparative Study of Chinese Cataloging Practices between Libraries across the Taiwan Straits. Beijing: Shi jie tu shu chu ban gong si Beijing gong si, 2011, 15-47. Print.
Zhu, Yan, et al. (drafters). CNMARC Format. WH/T0503-96, issued February 6, 1996; implemented July 1, 1997. Print.

BEN GU, Director of Foreign Acquisitions & Cataloging Department, National Library of China, Beijing, China. bgu@nlc.gov.cn
Gu, B. "MARC Formats in China: Local or International?". JLIS.it 5.1 (January 2014): Art. #9083. DOI: 10.4403/jlis.it-9083. Web.
ABSTRACT: The application of MARC formats in Chinese libraries has a history of about 20 years. In the age of internationalization and digitization, people ask: why was UNIMARC chosen as the basis of CNMARC? Is it the best choice for China? Are there any other options? In this paper, the author reviews many historical documents and analyzes the present status of MARC formats in China.
KEYWORDS: Bibliographic formats; China; CNMARC; MARC; MARC21; UNIMARC.
Submitted: 2013-09-20. Accepted: 2013-10-26. Published: 2014-01-01.
----

A Revival of the Music Conspectus: A Multi-Dimensional Assessment for the Score Collection
Katie Lai
Notes, Volume 66, Number 3, March 2010, pp. 503-518. Published by the Music Library Association. https://doi.org/10.1353/not.0.0310
(Katie Lai is cataloging librarian in the Technical and Collection Services Division of the Hong Kong Baptist University library, and library liaison to the music faculty.)

The Hong Kong Baptist University (HKBU) is a medium-size public-funded tertiary institution with a full-time student enrolment of around 8,000 and is one of the three universities in Hong Kong that offer music programs. Serving a small body of around 230 music students at undergraduate and graduate levels, the HKBU library's music collection contained over 15,000 volumes of scores as of June 2007, in addition to books and audiovisual materials. In order to understand the current situation of the score collection in Western art music published in Western languages, an assessment was conducted between summer 2007 and spring 2008. With an innovative and modified use of the music conspectus initially developed by the Research Libraries Group (RLG), the library was able to identify not only the strengths and weaknesses of the collection but, more importantly, problems with the choice of score publishers and score formats in the selection and acquisitions process. Because of its flexible application, this modified music conspectus can easily be adopted by libraries of all sizes and by libraries that use any classification system. This article provides a detailed description of the preparation, techniques used, and findings of the assessment, and highlights the benefits received and actions taken following the project.

COLLECTION BACKGROUND
The score collection of Western art music in the HKBU library comprises scores in all formats, such as full scores, miniature scores, piano reduction scores, solo instrumental parts, etc. With a short collection history of about fifty years, selecting scores is primarily the responsibility of the music faculty, who decide what the library should acquire based on the faculty's and students' teaching, research, study, and performance needs. Faculty members are regularly sent "yellow slips" or approval plan notification slips, publishers' catalogs, and new title announcements, and then forward their requests to the library for orders to be placed. Consequently, the content of the score collection reflects, to a large extent, the faculty's interests or what was presented to them in publishers' catalogs.
As Elizabeth Henry, Rachel Longstaff, and Doris Van Kampen observed, the music areas in which faculty members are more vocal tend to be better represented in the collection.[1] Also, there has been little input from the library, and there is no effective approval plan to complement the faculty's selection. Hence, the selection process lacks a systematic approach to developing the score collection as a whole, and is therefore susceptible to holes and gaps in many areas.

LITERATURE REVIEW
Originally developed by the Research Libraries Group (RLG) in the late 1970s, the conspectus was a tool that gave an overview and a comparison of existing collections, showing where the strengths lay and recording future collecting intensity among the RLG conspectus participating member institutions.[2] As Ferguson, Grant and Rutstein have explained, the goal was to "improve the stewardship of funds through better communication among those building collections to acquire, make accessible, and preserve the world's scholarly production for the national community."[3] By making collecting activities a coordinated plan, unnecessary duplication of research materials could be avoided, so that a larger scope of library materials could be made available to users through the interlibrary loan system.[4] The conspectus was soon adopted by the Association of Research Libraries (ARL) in 1983 for its North America Collections Inventory Project (NCIP), and later by other regional consortia such as the Library and Information Resources for the Northwest (LIRN) and the New York Metropolitan Reference and Research Library Agency (METRO).[5] The Music Program Committee of the RLG also began to create a music conspectus in the early 1980s, and in 1986 the Music Library Association (MLA) proposed to use the music conspectus to gather information from libraries of all sizes and types to form the National Music Collection database.[6]
[1] Elizabeth Henry, Rachel Longstaff, and Doris Van Kampen, "Collection Analysis Outcomes in an Academic Library," Collection Building 27, no. 3 (2008): 116.
[2] Nancy E. Gwinn and Paul H. Mosher, "Coordinating Collection Development: The RLG Conspectus," College & Research Libraries 44, no. 2 (March 1983): 129.
[3] Anthony W. Ferguson, Joan Grant, and Joel S. Rutstein, "The RLG Conspectus: Its Uses and Benefits," College & Research Libraries 49, no. 3 (May 1988): 199.
[4] Jim Coleman, "The RLG Conspectus: A History of Its Development and Influence and A Prognosis for Its Future," Acquisitions Librarian 7 (1992): 25.
[5] Larry R. Oberg, "Evaluating the Conspectus Approach for Smaller Library Collections," College & Research Libraries 49, no. 3 (May 1988): 188.
[6] Jane Gottlieb, ed., Collection Assessment in Music Libraries, MLA Technical Reports, 22 (Canton, MA: Music Library Association, 1994), 2-3.
Much literature has been published on the topic of the conspectus methodology.
With regard to the music conspectus in particular, Jane Gottlieb compiled a book titled Collection Assessment in Music Libraries, which included papers originally presented at the 1991 MLA annual meeting.[7] In the book, Elizabeth Davis provided guidelines on evaluating the collection using the music conspectus in the METRO project,[8] and Peggy Daub supplied a very detailed paper about its application, brief results from it, and the benefits received by various institutions.[9] In Daub's survey, most of the music librarians who had used the music conspectus agreed that its values had accurately represented their collections. Some indicated that through its use they were able to identify weak areas in their collections, and that it helped them write stronger collection development policies. Others opined that knowing the conspectus values of other peer institutions aided them in making justifications for increased funding.[10]
Nonetheless, music librarians also criticized the challenges of using this assessment tool. In the same survey, Daub revealed that music librarians found the LC-based subject lines did "not represent useful categories that would be used in collection evaluation and development," but were only quantitative shelflist measurements intended to give a quick overview of the music collection.[11] This argument was also echoed by librarians using the conspectus in non-music fields. For example, Richard J. Wood stated that the LC classification numbers on the conspectus worksheets failed to embody the total collection,[12] while Larry R. Oberg pointed out that the gaps between the LC-based conspectus lines were one of its problems.[13] In the survey conducted by Mary H. Munroe and Jennie E. Ver Steeg, respondents complained about how imprecise any classification scheme was in their conspectus studies.[14] This deficiency of the conspectus was in fact even more prominent in the field of music, where publications are quite distinctive compared with materials in other disciplines. As Kent Underwood stated, "real differences in content do tend to accompany differences in format . . . [and] the different formats are created and collected for different purposes."[15] Similarly, Lenore Coral argued that the LC classification scheme in the music conspectus did not provide the kind of detail that would describe actual music collecting activities, nor clarify which composers' works, which editions, which genres, or which periods or geographical areas are collected.[16] As such, the music conspectus has indeed left many important areas untouched. Though the use of the conspectus was quite popular in the 1980s and '90s, many librarians found using it laborious and time-consuming.
[7] Ibid.
[8] Elizabeth Davis, "Guidelines for Evaluating Music Collections as Part of a Regional Assessment Plan," in Collection Assessment in Music Libraries (see note 6), 25-49.
[9] Peggy Daub, "The RLG Music Conspectus: Its History & Applications," in Collection Assessment in Music Libraries (see note 6), 7-24.
[10] Ibid., 18.
[11] Ibid., 20.
[12] Richard J. Wood, "A Conspectus of the Conspectus," Acquisitions Librarian 7 (1992): 12-13.
[13] Oberg, "Evaluating the Conspectus Approach," 195.
[14] Mary H. Munroe and Jennie E. Ver Steeg, "The Decision-Making Process in Conspectus Evaluation of Collections: The Quest for Certainty," Library Quarterly 74, no. 2 (2004): 200.
Thus, variations of the conspectus method were used. In 1995, Howard D. White created the "Brief Test," which was based on the idea of the conspectus but aimed to simplify the entire process by assessing as few as forty titles selected by subject experts. These forty titles were grouped into four conspectus levels (ten titles for each level) from Level 1 to Level 4 (with Level 0, "Out of Scope," and Level 5, "Comprehensive," excluded), based on the ranking of the holdings counts retrieved from the Online Computer Library Center (OCLC). The library collection was then checked against this final conspectus value-ranked title list, and the library could claim the highest conspectus level in which at least fifty percent of the titles were owned.[17]
Later, the "Brief Test" evolved into the "Coverage Power Test," which was designed "to test the entire collection of each library against the entire literature" and aimed to rectify some issues, for example the possible inconsistency problem[18] and the sensitivity of results[19] due to the small number of sample titles chosen for each conspectus level. Instead of having a subject expert prepare the forty-item list, a list of titles in the "entire literature" of a specific subject was retrieved from OCLC based on a certain call number range. This list would be ranked from high to low according to the holdings counts. Similarly, the same process would be done for a list of titles in the same call number range for the library collection being assessed. Comparisons would then be made between the holdings counts of the "entire literature" and those of the library collection assessed, and conspectus values would be assigned to the library collection based on the percentage coverage of the entire literature in the subject.[20]
[17] Howard D. White, Brief Tests of Collection Strength: A Methodology for All Types of Libraries (Westport, CT: Greenwood Press, 1995).
[18] Howard D. White, "Better Than Brief Tests: Coverage Power Tests of Collection Strength," College & Research Libraries 69, no. 2 (March 2008): 158-59.
[19] Jennifer Benedetto Beals, "Assessing Collections Using Brief Tests and WorldCat Collection Analysis," Collection Building 26, no. 4 (2007): 106.
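A minimal sketch of the Brief Test scoring rule just described, under the stated assumption of ten titles per level and a fifty-percent threshold; the title lists here are placeholders, not titles from any published test.

    # Brief Test scoring: claim the highest conspectus level whose
    # title list is at least 50 percent owned by the library.
    def brief_test_level(titles_by_level, owned):
        claimed = 0
        for level in sorted(titles_by_level):      # levels 1..4
            titles = titles_by_level[level]
            hits = sum(1 for t in titles if t in owned)
            if hits / len(titles) >= 0.5:
                claimed = level
        return claimed

    levels = {1: ["t1", "t2"], 2: ["t3", "t4"], 3: ["t5", "t6"]}  # placeholders
    print(brief_test_level(levels, {"t1", "t2", "t3"}))           # -> 2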
In recent years, OCLC has also offered a service called WorldCat Collection Analysis (WCA), which allows libraries to analyze their collections according to size, coverage, publication date, language, format, and audience, based on the data found in WorldCat. It also facilitates peer comparison with two to five libraries and checks for collection overlap and uniqueness. Because the whole WCA process is automated, librarians with little knowledge of the subject area studied can still easily carry out the assessment, and worries about biases in compiling the core list, such as arose with the conspectus or Brief Test methods, are eliminated.[21]
However, in the music field, little development or application of new assessment tools has appeared in recently published literature, with the exception of projects using circulation statistics, interlibrary loan statistics, or the preservation condition of scores. Although there were heated discussions of the Brief Tests and the WCA, Jennifer Benedetto Beals commented that these approaches might be more suitable for monographs than for serial or multimedia materials,[22] and that they are rather impossible to use with a music score collection. In addition, since both the Brief Tests and the WCA rely on the accurate reporting of data and holdings in OCLC, libraries that do not have a consistent practice of doing so will find the two methods unfeasible. Moreover, especially for music scores, separate cataloging records have been created for bibliographically similar editions, each with its own OCLC accession number. Since the WCA performs its analysis by matching accession numbers only, it is prone to produce doubtful results when reporting collection uniqueness and overlap.[23] Further, according to Darby Orcutt and Tracy Powell, a lower institutional reporting rate to OCLC was found for videos and other non-book formats, making the results of the Brief Tests and the WCA unreliable.[24] Although the WCA does provide such details as the age and language of the collection, this information does not seem to be of much use: the publication year of scores is often ignored by music users, and the fact that scores are usually cataloged as items with "no linguistic content" makes language examination meaningless. Music users, on the other hand, are generally more concerned about the edition or the publisher of the score.
[20] Ibid., 106.
[21] Ibid., 106.
[22] Ibid., 107.
[23] Darby Orcutt and Tracy Powell, "Reflections on the OCLC WorldCat Collection Analysis Tool: We Still Need the Next Step," Against the Grain 18, no. 15 (November 2006): 44.
[24] Ibid., 44.
Therefore, to overcome the shortcomings of the music conspectus and other assessment methods, the HKBU library created a tool that would allow these methods to be used in a more comprehensive way when assessing its score collection. By modifying the RLG Music Conspectus and dividing the collection hierarchically, it facilitated a less complicated application of the music conspectus for internal assessment purposes. Furthermore, a positive result of this new multidimensional approach to music assessment is to make librarians rethink the benefits of the conspectus and what it can do that other methods cannot.

ASSESSMENT PREPARATION
Before the project began, many decisions were made with respect to scope and methodology.
Defining the scope
Because of the small number of students studying Chinese music at HKBU, the scope of this assessment covered music scores published in Western languages only; scores published in Chinese or other Asian languages were excluded. In addition, only those music categories that the HKBU Music Department needed for its curriculum and research were considered.
For example, wind band music, for which the department does not have an individual course, was left out of the project. Complete editions, sets containing a comprehensive collection of works by a specific composer, were treated separately using a simple benchmarking exercise comparing holdings against other local academic music libraries; they have therefore been excluded from this study.
Defining the purpose and choosing the appropriate assessment method
Because of the small size of the collection and its numerous gaps, it was deemed not worthwhile, and too costly, to use the automated evaluation analysis service (WCA). As a result, a conspectus project was considered. However, unlike earlier projects that aimed to obtain an overview of a national collection or to compare holdings among a group of libraries, the current assessment focused on comparing the library's music score holdings against a core list, so that the results could serve as internal guidelines for future collection development.

ADOPTING AND MODIFYING THE MUSIC CONSPECTUS
Although many libraries have used the music conspectus successfully, it was quite difficult for the HKBU library to carry out such a task. In the original music conspectus, the M schedule of the LC classification for scores was divided into over fifty conspectus lines according to subject, for the purpose of comparison and analysis between music groups (see Table 1).[25] Though this LC-based conspectus was theoretically usable in libraries using other classification systems, its employment by the HKBU library, which uses the Dewey Decimal Classification (DDC), was not easy. Because of the major revamp of the music section of the DDC in past years, most of the older scores were not retrospectively reclassified to mirror these changes. So scores of the same music genre might be classed in different places, making it rather impossible to do the assessment by strictly following the classification numbers in the conspectus lines. Also, given the small size of the collection, the meticulous division of the classification schedule in the music conspectus was considered too complex. All of this lowered its usability.
Nonetheless, the concept of the music conspectus was adopted for the project. Rather than splitting the classification schedule into numerous segments as the original conspectus did, a few broad music categories were identified based on music type, namely "Orchestral," "Concerto," "Chamber Music," "Instrumental," "Voice/Choral," "Opera/Musical," and "Anthology." Each music category was then subdivided by music genre (see Table 2). For instance, the "Orchestral" category was broken down into "Symphonies," "Overtures, Suites, Tone Poems, etc.," "String & Chamber Orchestra," and "Ballet." Then, for some genres that were especially important to HKBU music users, and to allow for more refined analysis, these were further split into smaller subjects according to their instrumentation or ensemble type (see Table 3). Using this strategy, the application of the conspectus was not bound by the classification system or the call number attached to the score, but was based on the genre of the music itself. Therefore, not only could this modified music conspectus be used in non-LC libraries, it could also solve the problems caused by the inconsistent use of classification numbers resulting from the redesign of a classification schedule.
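The worksheet arithmetic that Tables 2 and 3 imply can be read as a simple roll-up: each genre's coverage is titles owned over titles compared, and each category's overall figure sums the genre counts, as in the sketch below (the data and field names are placeholders, not HKBU's actual worksheet values).

    # Coverage roll-up behind the modified-conspectus worksheet:
    # per-genre percentage and per-category total, as in Tables 2-3.
    worksheet = {
        "Orchestral": {"Symphonies": (40, 55),   # (titles owned, titles compared)
                       "Ballet":     (12, 30)},
        "Concerto":   {"Piano":      (25, 40)},
    }

    for category, genres in worksheet.items():
        owned = sum(a for a, _ in genres.values())
        compared = sum(b for _, b in genres.values())
        print(f"{category}: {100 * owned / compared:.0f}% of core titles held")
        for genre, (a, b) in genres.items():
            print(f"  {genre}: {100 * a / b:.0f}%")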
Once the framework of the modified music conspectus was completed, the Western score collection was checked against a core list. Two numerical values on a scale of 0 to 5, with 0 being "Out of Scope" and 5 being "Comprehensive Level," were then assigned to each music genre assessed. The first value was the Existing Collection Strength (ECS), which described the collection level of a particular portion of the collection at the time of assessment; the second was the Desired Collecting Intensity (DCI), which indicated the level the collection should ultimately achieve to adequately support users' needs. While the ECS scores were assigned by the music liaison librarian, the DCI scores were provided by the conductor of the university orchestra, who oversaw the performance activities in the music department. Involving a faculty member enabled the library to gather a more objective opinion about how the collection should develop, from an expert who works with music students and professors on a daily basis and best knows their musical needs.

COMPILING THE CORE TITLE LISTS AND CHECKING HOLDINGS

Similar to other assessment projects, a core title list was compiled based on standard bibliographies such as A Basic Music Library: Essential Scores and Sound Recordings (BML), published by the American Library Association in 1997,26 and other sources, including audition lists of major music schools and professional orchestras, repertoire requirements of important international music competitions, and curriculum and course syllabi. The music faculty was also consulted, and a list of the major works of thirty-eight contemporary composers was added to the core list to ensure adequate coverage of contemporary and twentieth-century music in the assessment. The Western score holdings were then checked against this core list by the music liaison librarian or a part-time student worker studying in the music department.

25. "RLG Music Conspectus Lines," in Gottlieb, Collection Assessment in Music Libraries, 82-88.
26. 3d ed., compiled by the Music Library Assoc., Elizabeth Davis, coordinating editor. A new, 4th edition of the Basic Music Library is currently being compiled and should provide a more up-to-date listing of repertoire essential to building a music collection.
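The arithmetic behind the worksheets shown in tables 2 and 3 below is simple percentage work. As an illustration only (the genre names and counts in this sketch are hypothetical, not HKBU's data), it could be scripted as follows:

```python
def holding_percentages(genres):
    """For each genre: % of holding = titles owned (a) / titles compared
    against the core list (b) x 100. The category-level holding is
    Total (a) / Total (b) x 100, as in the worksheet columns."""
    total_a = sum(a for a, b in genres.values())
    total_b = sum(b for a, b in genres.values())
    by_genre = {g: 100.0 * a / b for g, (a, b) in genres.items()}
    return by_genre, 100.0 * total_a / total_b

# Hypothetical counts for two genres in the "Orchestral" category.
orchestral = {"Symphonies": (45, 60), "Ballet": (5, 20)}
by_genre, category_level = holding_percentages(orchestral)
print(by_genre)        # {'Symphonies': 75.0, 'Ballet': 25.0}
print(category_level)  # 62.5
```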
TABLE 1. Excerpt of the conspectus lines of the original Music Conspectus (Source: Jane Gottlieb, ed., Collection Assessment in Music Libraries [Canton, MA: Music Library Association, 1994], 82)

ID | LC Class | Subjects
MUS14 | M217-285 | Piano & one other instrument
MUS15 | M286-298 | Duets without keyboard instruments
MUS16 | M300-986 | Chamber ensembles: trios-nonets & larger combinations
MUS17 | M300-986 | Chamber music for early instruments
MUS18 | M1000-1075 | Orchestral music
MUS19 | M1100-1160 | String orchestra music
MUS20 | M1200-1270 | Band music

TABLE 2. Worksheet for HKBU's modified Music Conspectus
Columns: Category | Genre* | Titles owned (a) / Titles compared against the core list (b) | % of holding = (a)/(b) x 100 | Overall library's holding level (by music category) = Total (a) / Total (b) x 100 | ECS^ | DCI+

Orchestral: Symphonies; Overtures, suites, tone poems, etc.; String & chamber orchestra; Ballet
Concerto: Piano; Strings; Woodwinds; Brass; Percussion; Mixed instruments
Chamber music: Ensemble with piano; Strings; Woodwinds; Brass; Percussion; Mixed without piano
Instrumental: Piano & keyboard (solo & duo); Strings; Woodwinds; Brass; Percussion
Voice & choral: Voice; Choral
Opera & musicals: Opera; Musical & stage works
Anthology

* "Genre" can be further subdivided based on the instrumentation or ensemble type for detailed analysis. See table 3 for an example.
^ ECS = Existing Collection Strength
+ DCI = Desired Collecting Intensity

TABLE 3. Further subdivisions of the genre "Chamber Music"
Columns: Genre | Ensemble type | Titles owned (a) / Titles compared (b) | % of holding = (a)/(b) x 100 | % of titles owned in the music genre = Total (a) / Total (b) x 100

Ensemble w/ piano: Piano trio; Piano quartet; Piano quintet; Piano sextet & up
Strings: String trio; String quartet; String quintet; String sextet; String octet & up
Woodwinds: Wind trio; Wind quartet; Wind quintet; Wind sextet; Wind octet; Wind septet & up
Brass: Brass trio; Brass quartet; Brass quintet; Brass sextet & up
Percussion: Percussion ensemble
Mixed w/o piano: Mixed ensemble w/o piano - trio; quartet; quintet; sextet & up

MORE THAN A CONSPECTUS EXERCISE: PUBLISHER AND FORMAT EVALUATION

While many conspectus studies primarily or solely involve a yes-or-no title check against a core list or against the holdings of other institutions, the HKBU library further employed a multidimensional technique to identify not only what the library owned (the number of titles), but also whether the score publishers and formats available for use (full scores, miniature scores, piano reduction scores, etc.) sufficiently fulfilled users' needs. In the music industry, a work in the public domain, such as a Mozart piano sonata, can be published by many companies. While some offer "urtext" editions or include critical commentary in performance scores, others provide reprints of other editions, or add numerous editorial notes or interpretation marks to the music. Though there is no hard line between good and bad, musicians generally prefer particular editions or publishers for certain composers or types of works. Thus, having the right editions from the more highly regarded publishers is an important matter in good music-collection management. Apart from the quality of publishers, it was also of interest to look at the availability of score formats in the library.
Music publications differ from other library materials in many ways, and music scores may come in many versions, each serving a different purpose. Therefore, taking all of the above into consideration, an additional step was taken to record the name of the publisher and the score format found for each title assessed. Such careful scrutiny allowed the library to know whether scores produced by the "preferred publishers" had been purchased. Through this extra effort, the library was also able to obtain a distribution of all score formats acquired for each type of music.

ANALYSIS RESULTS

With only one music liaison librarian working on this project while engaging in other duties, such as cataloging and library instruction, and one part-time student helper working in the summer, the project took about nine months to complete. After checking holdings against the core list and examining the publishers and formats of each score in the library, many valuable findings emerged.

Strengths and weaknesses

Like other conspectus studies, this one identified the strengths and weaknesses of the collection. The strongest parts of the HKBU library's Western score collection were clearly the orchestral and opera/musical areas, and the weakest part was the chamber music section. The breadth and depth of other parts of the collection differed widely. There was broad coverage of orchestral works but, in contrast, an imbalance in the collection of solo works for different instruments (e.g., more core titles available for piano, fewer for percussion or brass).

Variety of score publishers

With this multi-angle approach, the analysis uncovered issues relating to the choice of music score publishers. Recording the publisher of each title assessed revealed that a significant portion of the scores held were published by "less preferred" publishers, even when better alternatives were available. For example, the library owned two sets of scores and parts for Franz Schubert's Piano Trio No. 1. While the editions most preferred by music users would be the urtexts published by Bärenreiter or Henle, neither was acquired; reprint editions with substantial amounts of interpretative markings were purchased instead. Though it is not possible to know the history or cause of such acquisition decisions, this demonstrated a need for better quality control and clearer guidelines in the selection process.

Suitability of score formats acquired

Studying score formats revealed hidden phenomena that were unknown in the past. For chamber music works, it was found that oftentimes only scores were available, without their corresponding performance parts: over seventy percent of the chamber music items were full, study, or miniature scores, and merely thirty percent were performance parts. This was a good indicator that the library should begin buying the missing instrumental parts, which are crucial to chamber music study. Moreover, there seemed to be a pattern of buying miniature scores rather than full scores: sixty-two percent of the orchestral works were in miniature score format, and only thirty-one percent were in full score format. Again, the reasons behind these acquisition decisions remain a mystery.
Nonetheless, this raised the question of whether it was a result of the faculty's selection bias or of real users' needs. Other important findings included the presence of only a few sets of score and parts for large orchestral works, and the absence of the corresponding full or study scores for concertos for which piano reduction scores and solo part sets had been bought. It became apparent that revised collection development guidelines are needed so that the appropriate or preferred score formats will be acquired for certain types of music. The acquisitions and collection scope may also need to be redefined. For example, some formats, such as the score and parts sets for large orchestral works, which often contain over sixty instrumental parts, should perhaps be housed in a separate performance library where direct supervision and proper management of the parts could be done by orchestra staff.

DISCUSSION AND RECOMMENDATIONS

After the assessment, the library had acquired additional knowledge of what the Western score collection contained, from several perspectives. Knowing the strengths and weaknesses of the collection, the library is now able to accommodate changing needs more quickly. The effort of evaluating score publishers and formats also proved worthwhile without requiring much extra time, for the resulting all-around picture was instrumental in detecting flaws in the selection and acquisitions process. As Marcia Pankake, Karin Wittenborg, and Eric Carpenter noted, librarians need to know the causes of weak selection practices and to act upon them.27 Drawing on these findings, areas for improvement were identified and two sets of follow-up actions were performed.

Externally, a score enhancement project was initiated immediately after the assessment, and ten weak areas were selected for prioritized collection development, with approval and financial contributions from the library and the music department. Informal discussions with music faculty and students were conducted to ascertain whether music users prefer full, study, or miniature scores. This created a more casual channel for users to express their opinions and the reasons for certain preferences. A formal music user survey was also conducted to gather statistical information about music users' library use behavior, their perceived importance of music materials, and their collection development preferences. The survey results were invaluable in helping to understand the library use patterns of each music user group and their real music needs.28 These additional activities allowed for the direct involvement of users in collection building and facilitated the creation of a truly user-centered collection.

27. Marcia Pankake, Karin Wittenborg, and Eric Carpenter, "Commentaries on Collection Bias," College & Research Libraries 56, no. 2 (March 1995): 114.
28. Katie Lai and Kylie Chan, "Do You Know Your Music Users' Needs? A Library User Survey that Helps Enhance a User-Centered Music Collection," Journal of Academic Librarianship 35, no. 1 (January 2010) [forthcoming].

Follow-up actions were also taken internally.
A list of "preferred" music publishers for different types of music or composers was compiled by the music liaison librarian for the technical services staff to follow when such order details were not provided by the faculty requester. A training workshop introduced the staff to the differences between, and purposes of, the various score formats, so that they would understand why certain materials should be chosen over others. This way, staff would not blindly follow the guidelines provided, but could make sensible judgments based on music users' needs. Furthermore, music orders submitted to the acquisitions section could now be looked over quickly by the music liaison librarian before being sent out to vendors, to ensure that the best possible or necessary score formats and music publishers had been chosen. A plan to fully update the collection development policy is also underway, aiming to provide clearer guidance on the consistent selection of appropriate materials that support the research, teaching, study, and performance needs of music users.

CONCLUSION

Music score publications are complex, and the existence of a diverse range of scores for the same musical work goes beyond mere reproduction. The variations in format, publisher, and edition are of great concern to music users. Hence, the assessment of a score collection should not be just a title-checking procedure, but should employ a more qualitative approach that can actually guide collection development activities. Tailor-made for music scores, this new modified music conspectus turned the collection inside out and revealed many selection and acquisitions loopholes that one could easily miss in daily work. Its separation from the classification schedule also enhanced its usability in non-LC settings, and its application can be straightforwardly extended to libraries that have not been able to keep up with changes in classification. Since it is genre-based, libraries have the flexibility of doing a simple broad assessment based on a few large music categories and genres that are particularly needed by users, or a comprehensive in-depth analysis by adding more refined music categories to the conspectus list or further subdividing each music genre into smaller subsets according to instrumentation or ensemble type. Consequently, conducting a conspectus project is no longer only for large universities or consortia; it can also be carried out by smaller libraries, where money and staff may be limited. There are many ways to evaluate score collections, but this is the first attempt to incorporate a multidimensional concept in music collection assessment, and as such, there is more to be explored. Music users are very specific about what they need in regard to formats, editions, and quality, and an assessment tool for music scores should reflect this need.

ABSTRACT

With an innovative use of the music conspectus, the Hong Kong Baptist University library conducted a score collection assessment to identify not only the strengths and weaknesses of the collection, but also problems with the choice of score publishers and formats in the acquisitions process. Because of its flexible application, this modified music conspectus can easily be adopted by libraries of all sizes and by libraries using any classification system.
This article provides a detailed description of the techniques used and highlights the findings and benefits received, as well as the actions taken following the project.

work_nimair2ck5b3vnwckwuiphxmkm ----

How to assess the impact of an electronic document? And what does impact mean anyway? Reliable usage statistics in heterogeneous repository communities

This document is a preprint of the formal publication with the same title, written by the same authors, in: OCLC Systems & Services 26 (2), p. 133-145
www.emeraldinsight.com/10.1108/10650751011048506
DOI: 10.1108/10650751011048506
Publisher: Emerald Group Publishing Limited

Authors:
Ulrich Herb - Saarland University and State Library, Saarbrücken, Germany (corresponding author)
Eva Kranz - Saarland University and State Library, Saarbrücken, Germany
Tobias Leidinger - Saarland University and State Library, Saarbrücken, Germany
Björn Mittelsdorf - Saarland University and State Library, Saarbrücken, Germany

Purpose
The impact of research and researchers is usually quantified using citation data: either journal-centred citation data, as in the case of the journal impact factor (JIF), or author-centred citation data, as in the case of the Hirsch index (h-index). The paper discusses a range of impact measures, especially usage-based metrics. Furthermore, the authors report the results of two surveys. The surveys focused on innovative features for open access repositories, with an emphasis on functionalities based on usage information.

Design/methodology/approach
The first part of the article analyzes both citation-based and usage-based metrics. The second part is based on the findings of the surveys: one in the form of a brainstorming session with information professionals and scientists at the OAI6 conference in Geneva, the second in the form of expert interviews, mainly with scientists.

Findings
The results of the surveys indicate an interest in the social aspects of science, such as visualizations of social graphs both for persons and for their publications. Furthermore, usage data are considered an appropriate measure to describe the quality and coverage of scientific documents, though the consistency of usage information among repositories has to be kept in mind.
The scientists who took part in the survey also asked for community services, assuming these might help to identify relevant scientific information more easily. Other topics of interest included personalization and easy submission procedures.

Originality/value
This paper delineates current discussions about citation-based and usage-based metrics. Based on the results of the surveys, it depicts which functionalities could enhance repositories, which features are required by scientists and information professionals, and whether usage-based services are considered valuable. These results also outline some elements of future repository research.

Acknowledgments
The authors would like to thank Philipp Mayr, Sven Litzcke, Cornelia Gerhardt, the experts who prefer to remain anonymous, and all participants of Breakout Group 6 at the OAI6 conference.

Introduction
As Harnad (2008) explains, the meaning of an impact measure can only be determined by correlating said measure with either another measure (construct validity) or an external criterion (external validity). But which data should be employed to check impact measures like the Journal Impact Factor or the Hirsch index? The range and divergence of potential validating data sets, in their object selection, object granularity, and complexity of calculation instructions, reveal that the scientific value of a document has multiple dimensions (Moed 2005b). The actual choice depends on the perspective from which the impact (usefulness) question is asked. Galyani Moghaddam and Moballeghi (2008) give an extensive overview of possible methods. A matter seldom addressed, however, is the concrete motivation for impact measurement, a question that can help define what impact should mean in a specific context.

Statistical predictions and especially quality assessments can become self-fulfilling prophecies, particularly if the numbers are already in use. If we used the height of academics as a quality criterion when appointing new staff, academic teams would naturally become taller. A later study of height and institutional quality would find a high correlation between quality and height, not because of the inevitable working of things but because the relation was man-made and the variables were confounded to begin with. Nicholas addresses this issue, commenting on the Journal Impact Factor in an interview conducted by Shepherd (2007).

Scientometric perspective: Eugene Garfield devised the Journal Impact Factor (JIF) as a filter criterion to determine whether a journal should be included in the Science Citation Index (SCI) sample (Garfield 2006). At that time, each journal in the sample meant a serious amount of work, so the restriction to a finite number of journals was not only a matter of quality but also of practicability. The assumption is that a higher rate of citations indicates a higher importance/quality/impact of an article but, more importantly, of the journal. For journal assessment, the JIF is presumably superior to simple publication counts, as quantity does not depend on quality; but it can be argued that the JIF rather describes a journal's prestige, concentrates on established topics, and depends on a certain amount of honesty, while it can easily be misunderstood by the naive or corrupted by the dishonest (Brumback 2009).
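Both citation metrics discussed so far are easy to state precisely: the JIF of a journal for year Y is the number of citations received in Y by items the journal published in the two preceding years, divided by the number of citable items published in those two years; a researcher's h-index is the largest h such that h of his or her papers have each been cited at least h times. A minimal sketch in Python, with made-up inputs:

```python
def journal_impact_factor(citations_in_y, citable_items):
    """JIF for year Y: citations received in Y to items published in the
    two preceding years, divided by the citable items published in
    those two years."""
    return citations_in_y / citable_items

def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(journal_impact_factor(citations_in_y=210, citable_items=100))  # 2.1
print(h_index([10, 8, 5, 4, 3]))  # 4
```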
Evaluation perspective: "The use of journal impacts in evaluating individuals has its inherent dangers. In an ideal world, evaluators would read each article and make personal judgements." (Garfield 2006) There are two main problems in evaluation. First, scientific quality and researcher value are assumed to be multidimensional, including characteristics outside the publication activity (Moed 2005b). Second, subjective or personal statements, i.e. evaluation by peers, are not reproducible and nowadays have an air of bias and distortion about them. Jensen et al. (2009) investigate career predictors for French scientists in the national research organisation CNRS, where promotions are decided by a peer committee. The correlation between promotion on the one hand and publication and citation measures on the other is highest for the Hirsch index. Nevertheless, the share of promotions correctly predicted by h is only 48%. This gap might result from human failings that h cannot predict, or from measurement bias to which the experts do not succumb, but presumably from a mixture of both. The questions therefore are: Which variables should be collected in addition to citation metrics? How should the variables be weighted? How can fairness and openness be maximized, and can objective measures and human-made recommendations be synchronised? The need for multidimensional evaluation is shown by Shepherd (2007), who reports that over 40% of a web survey sample perceive the JIF as a valid measure, while over 50% regard it as over-used.

Journal perspective: The motivations of journal editors can be assumed to be purely economic, as only economically sound journals can compete with economically managed journals in a spiral of competition. Mabel et al. (2007) investigated the attitudes of medical editors towards the JIF and their handling of independent variables that are likely to increase their journal's JIF rating. Editors relied chiefly on increasing the quality of their staff to boost author recruiting, article selection, and author support. The JIF was accepted as the status quo, but editors expressed concern that it is not useful for impressing their practising readership; thus, they could not rely solely on optimizing their JIF scores. They hope that complementary metrics representing clinical impact or public advancement will be implemented. Empirical analyses of the interrelation of journal characteristics and journal performance (McWilliams et al. 2005) seem to contradict some of the medical editors' statements; it is rather likely that different circumstances in management science and medicine account for these discrepancies. Further assessment of the properties of different disciplines will improve the transfer of insights.

Library perspective: Electronic documents fundamentally change the mechanisms in libraries. Whereas formerly the library was the keeper of objects and controlled, monitored, and sometimes even created the processes of search, localisation, and usage, it has nowadays become an intermediate agent. These changes might be as trivial as people being able to receive a text without physically visiting the library, leading to bulletin boards no longer being read. Librarians have to adapt by offering telematic versions of their services (Putz 2002). On the other hand, the easily adaptable electronic reception desk offers opportunities for personalisation, customisation, and portalisation.
Ghaphery and Ream (2000) and Ketchell (2000) warn that personalisation appeals rather to the professional or heavy user, whereas customised views, centred on a specific topic or even on specific academic classes, aid the average student user, who shuns high investments (e.g. login procedures, training periods) and has to change focus of interest quickly. Additionally, metrics can aid subject librarians in the compilation of resources. Another issue is billing. As libraries no longer have control over objects, they have to rely on external server statistics provided by the publishers and hosts to make licensing and acquisition decisions. The COUNTER standard (Project COUNTER 2008) is widely used to generate usage reports. Its granularity is rather unfit to help with acquisition decisions: usage is reported per journal, not per volume, making it impossible to identify irrelevant vintages that would better be bought article-on-demand rather than included in a flat-rate licensing bundle. COUNTER reports also tend to distort actual usage; for example, Davis and Price (2006) report how interface characteristics of a publisher portal can unjustly increase usage frequency.

Educational perspective: Educational science research in general focuses on the classroom aspects of digitisation. Collaborative work environments, online instruction, and testing environments are designed and evaluated to enhance lecturers' efficiency, for example with homework management, and to boost student-to-student and student-to-tutor communication (Appelt 2001). Electronic resources are produced by students or prepared by the lecturer to be stored, and possibly versioned, in the system. Course reserve collections are often created independently of library activities, as many coursework software systems are designed as closed applications that cannot easily be connected with other services. The aim of education is on the one hand to teach the curricula, but emphasis is also placed on teaching information and communication technology (ICT) competence (Sikkel et al. 2002). As education relies heavily on textbooks, contemporary citation measures are not applicable.

The usability perspective is a specialised point of view that can complement the paradigms described above. It is most obvious in education, as most education research explicitly includes investigations of ease of use and practicability. On the other hand, institutions and organisations in a competitive environment (libraries, universities, and publishers) can improve their strategic position by increasing user efficiency. These can be purely technical aspects (e.g. a user achieving his goal in fewer steps requires less server computation time), but in general it has to be discussed whether the service fulfils the request at all and whether it meets the user's needs. Much-discussed aspects of usability in information dissemination are recommender services, though their main application is in the commercial area (Montaner et al. 2003). The vast number of works already available and the increasing growth rate can be assumed to overload the faculties of non-elite information hunters and gatherers (i.e. most students, practitioners, interested private persons, and persons concerned). Even a professional academic researcher can overlook an article and be informed in peer review about his non-optimal library search. But recommenders do not only help to clarify a topic.
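Many of the recommender ideas collected later in this paper are usage-based in the simplest possible sense: documents that tend to be requested within the same sessions are assumed to be related. As an illustrative sketch only (not the algorithm of any project named here), a co-occurrence recommender over sessionized download data might look like this:

```python
from collections import Counter, defaultdict

def co_occurrence_recommender(sessions):
    """Build 'visitors who downloaded X also downloaded Y' counts from
    sessionized usage data; each session is the set of document ids
    requested by one visitor."""
    related = defaultdict(Counter)
    for docs in sessions:
        for doc in docs:
            for other in docs:
                if other != doc:
                    related[doc][other] += 1
    return related

sessions = [{"doc1", "doc2"}, {"doc1", "doc2", "doc3"}, {"doc2", "doc3"}]
related = co_occurrence_recommender(sessions)
# Top recommendation for doc1: the document most often seen alongside it.
print(related["doc1"].most_common(1))  # [('doc2', 2)]
```

Note that the experts' tolerance threshold reported below (an accuracy of 90% or more) would rule out such a naive approach for most real collections; the sketch only shows the kind of signal that usage data can contribute.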
Content providers are very interested in recommenders that show the visitor alternative objects of interest, hoping that he spends more time with the provider's products. This can be a benefit in itself, as straying users increase the number of views and visits, which is reflected in COUNTER statistics as well as in revenues for paid advertisements. Other aspects of usability include, among others, visualisation of data and personalisation, including user notes and comments being saved for later visits; see the sections on the expert interviews and the brainstorming session later in this article for further examples.

But valid usage statistics are valuable to all of the perspectives. To scientometry, they are an additional database that enables research into construct validity and into sociological aspects of citation habits, though it has to be emphasised that there is no mono-variant relation between usage and citation (Moed 2005a); possibly citations and usage are independent dimensions of a multi-dimensional impact. Access to an electronic resource can be measured in real time and, to a certain extent, in-house. This should appeal to evaluation committees as well as to developers (and usability testers) of educational methods and academic services. Methodologically speaking, access, and to a lesser degree usage, is observable, whereas questionnaires and even references/citations are susceptible to bias and distortions rooted in human language, beliefs, and self-consciousness (Nicholas et al. 2005). Libraries and publishers have always counted their users' activity; it is a simple consequence of billing. And of course these numbers were used to advertise journals, following the logic of the tyranny of the majority: the journal read by many should also be read by you.

There are problems that have to be addressed. The observable event in a repository of digitised objects reached via HTTP is the client computer's request for said object to the web server; neither a human intention nor successful delivery is strictly necessary. There are visits that result from search engines updating their indices, from errantry, and from prefetching. Attributing requests to different individuals is further hampered by technologies like thin clients and proxy servers, but also by public search terminals. Thin clients allow their users to interact with software located and executed on a central infrastructure; in the case of web browsers, this implies that all browser instances serving one thin-client cluster are routed via one IP address. Intransparent proxies, nowadays mainly an aspect of network security, pose the same problem. The obvious solution is to identify a unique user not only by the request's IP address but also by utilising session identifiers transmitted as cookies or as dynamically created URL arguments. However, there is no reliable way to tell apart visitors who use the same physical machine and account. This is common in educational facilities with search terminals located, for example, in libraries. It would be necessary to clear the browser's cache each time before a successor begins his work, or to identify the account as belonging to multiple persons, for example by an appendix to the user-agent header field.
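In log terms, the identification step amounts to building a visitor key from whatever signals are available. A minimal sketch of this idea (the field names and the one-browser-one-user fallback are simplifications for illustration, not the actual algorithm of any of the projects discussed here):

```python
import hashlib

def visitor_key(record):
    """Derive a pseudonymous visitor identifier from a parsed log record.
    Prefer an explicit session id (cookie or URL argument); fall back to
    IP address plus user-agent, which conflates users behind proxies,
    thin clients, and shared terminals."""
    if record.get("session_id"):
        raw = record["session_id"]
    else:
        raw = record.get("ip", "") + "|" + record.get("user_agent", "")
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

requests = [
    {"ip": "10.0.0.1", "user_agent": "Mozilla/5.0", "session_id": "abc"},
    {"ip": "10.0.0.1", "user_agent": "Mozilla/5.0", "session_id": None},
]
# Two distinct keys despite identical IP and browser: the session id wins.
print({visitor_key(r) for r in requests})
```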
Furthermore, aggregated statistics (e.g. author statistics) suffer from multiple instances of one document, but also from print-outs and the private sharing of articles, making it very hard for the statistics provider to produce an ecologically valid parameter (Nicholas et al. 1999); see Stassopoulou and Dikaiakos (2007) for a dynamic approach to robot detection. The heterogeneity of perspectives strongly indicates that a single measure, even a single method, is hardly a reliable decision base. Furthermore, this diversity implies that even if one perspective were to reject usage analysis for scientifically valid reasons, the rejection cannot automatically extend to other motivations.

Open Access Statistics (OA-S) and similar projects

Interoperable Repository Statistics (IRS) is a British project tailored to the British repository context. Utilising Perl scripts and the software tool AWStats, it can analyse the access logs of EPrints and DSpace repositories. Its strength lies in its well-prepared presentation possibilities, offering various kinds of graphs and granularities (IRS 2007). MESUR is a research project which has established a large set of usage data by logging activities on publisher and link-resolver servers; it aims at the creation and validation of usage-based metrics as well as at validity testing of other impact measures (Bollen et al. 2008). The PEER project investigates the relation between open access archiving, research, and journal viability (Shepherd and Wallace 2009a); to this end, PEER measures the usage of a closed set of journals made open access by publishers for the project's duration. PIRUS is developing a standard for article-level usage reports that adheres to the COUNTER Codes of Practice; a prototype and an abstract description were created to enable document hosts to report their raw access data to an aggregating server or to process it themselves (Shepherd and Needham 2009b).

Open Access Statistics (OA-S, http://www.dini.de/projekte/oa-statistik/english/) is a project funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) and conducted by the project partners State and University Library Göttingen (Georg-August-Universität Göttingen), the Computer and Media Service at Humboldt University Berlin, Saarland University and State Library, and the University Library Stuttgart. OA-S aims to (1) establish protocols and algorithms to standardise the calculation of usage frequency for web-technology-based open access repositories, (2) create an infrastructure to collect the raw access data and process it accordingly, and (3) supply the participating repositories with the usage metrics. In contrast to IRS, statistical parameters are not calculated locally, so in addition to article-level measurements, parameters beyond document granularity can be implemented, such as author-centred statistics or usage aggregation over different versions or multiple instances of the same publication (preprint vs. postprint, self-deposit vs. repository copy). This flexibility in scope should ease the combination and comparison of usage statistics and bibliometric indices. Methods and experiences are similar to those of the PIRUS project, but OA-S concentrates on the centralised aggregation strategy and faces an even more diverse repository software ecosystem. In addition to the bibliometric perspective, the project will specify functionalities to enhance the usability of repositories, e.g. quality valuations or document recommendations, based among others on usage data.
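Standardising "usage frequency" largely means agreeing on which raw requests to discard. As a hedged illustration only (the 30-second double-click window and the robot patterns below are simplified stand-ins; the actual rules are defined by the COUNTER Codes of Practice and by the project specifications):

```python
ROBOT_PATTERNS = ("bot", "crawler", "spider")  # toy list, not an official one

def count_usage(requests, window=30):
    """COUNTER-style counting sketch: drop known robots, then count
    repeated requests by the same visitor for the same document within
    `window` seconds as a single access (double-click filtering)."""
    counts = {}
    last_seen = {}
    for r in sorted(requests, key=lambda r: r["time"]):
        if any(p in r["user_agent"].lower() for p in ROBOT_PATTERNS):
            continue
        key = (r["visitor"], r["doc"])
        if key in last_seen and r["time"] - last_seen[key] <= window:
            last_seen[key] = r["time"]
            continue
        last_seen[key] = r["time"]
        counts[r["doc"]] = counts.get(r["doc"], 0) + 1
    return counts

reqs = [
    {"visitor": "v1", "doc": "d1", "time": 0,   "user_agent": "Mozilla/5.0"},
    {"visitor": "v1", "doc": "d1", "time": 10,  "user_agent": "Mozilla/5.0"},
    {"visitor": "v1", "doc": "d1", "time": 100, "user_agent": "Mozilla/5.0"},
    {"visitor": "x",  "doc": "d1", "time": 5,   "user_agent": "Googlebot/2.1"},
]
print(count_usage(reqs))  # {'d1': 2}: double-click collapsed, robot dropped
```

Whatever the exact parameters, the point of central aggregation is that every participating repository's counts are produced by the same filter, so that the resulting numbers are comparable.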
In order to focus on the services most in demand, a questionnaire survey will be conducted to determine actual user priorities. It should be noted that neither the interviews nor the brainstorming session was limited to a particular perspective or methodology; all ideas were accepted equally.

Expert Interviews

The expert interviews were conducted according to the guidelines given by Bogner et al. (2005). Most experts were identified through their publications; to add the publisher and user perspectives, persons involved in journal production and situated at Saarland University were contacted as well. Five out of ten candidates agreed to participate in a loosely structured interview. Interview length ranged from 12 to 57 minutes. To ensure privacy, no person-centred presentation of results is given. Most interviews were conducted via phone; all were recorded with the explicit consent of the participants and afterwards transcribed to text. The following list consists of the experts' ideas and inspirations from the interviews:

1. Recommender (high-usage-quota-based)
2. Freshness-Recommender (recent-publication-based)
3. Minority-Recommender (low-usage-quota-based)
4. Profile-Recommender (based on profile similarities)
5. Subject-Recommender (thematic-proximity-based)
6. Usage-Similarity-Recommender (clickstream-similarity-based)
7. Citation-Recommender (citation-intersection-based)
8. Favourites-Recommender (users'-favourites-lists-based)
9. Recommendation of central authors
10. School-of-thought Recommender (scientific social network graph)
11. Author-centred usage statistics
12. Repository-centred usage statistics
13. Subject-centred usage statistics
14. User-centred usage statistics
15. Reordering links (usage-quota-based)
16. Collapsing links in large result sets (usage-quota-based)
17. Re-rendering result list layout
18. Dead link identification
19. Users' quality statements, i.e. comments (free text)
20. Users' quality statements (rating)
21. Quality statements (usage-based)
22. Ensuring document accessibility (bridging the gaps between different storages)
23. Automated retro-digitalisation requesting
24. Automated translation requesting
25. Feed notifications
26. Notifying friends manually (e.g. via e-mail)
27. Search phrase recommender
28. Search result commenting

Brainstorming Session

A brainstorming session was conducted as part of Breakout Group 6 at the OAI6 conference in Geneva (Mittelsdorf and Herb 2009). In contrast to the expert interviews, many proposals were concerned with interface design and with data visualisation and presentation, possibly because many participants work in libraries. The ideas are grouped and preceded by buzzword labels (Arabic numbers) to heighten readability.
1. Authority/Standardisation
   a. Central unique author identification
      i. Author
      ii. Identification/profile
      iii. Picture
      iv. Projects
      v. Competence
   b. Network of authors
      i. Social
      ii. Professional
      iii. Expertise
      iv. Field of interest
2. Visualisations/Indexing Dimensions
   a. Paper's context
   b. Visual social graph
   c. Show development of ideas (network graph displaying publication times for a document set)
   d. Visualisation of publication's geo-location
   e. Position publication in the "landscape of science"
   f. Project's social map
   g. Visualise data and connections
   h. Semantic classification
   i. Numerical semantics (speech independence)
3. Barrier Reduction
   a. Connect to the world (link between science and application)
   b. Publication-news-binding
   c. Solicit further research
      i. Need stack
      ii. Wish list
      iii. Request notification
   d. Practicable access to repositories not only via modern PC capabilities and resolution (e.g. mobile phones, handhelds, OLPC, etc.)
4. Reception Tracking
   a. Consistent access statistics
   b. Re-use tracking
   c. Enhanced (complex) metrics (for better evaluation)
5. Assistance (author and user)
   a. Automatic update/linking of pre-print and new version
   b. Thumbnail/snapshot creation (first-page display)
   c. Integrate everything (information, processes, and results), seamlessly working
   d. Modification of the document catalogues' structures
6. Assistance (author)
   a. Real-time assistance in form fill-in
   b. Automatic metadata creation/lookup
   c. Reduce redundant work (intelligent submission)
   d. Dynamic publication (version management of production and reception; collaborative production)
   e. Easy submission process
   f. Dynamic publication list (exportable)
   g. Bonus point system
   h. Easy feedback from authors to repository
   i. Repository as workspace
   j. Repository as research/production environment
   k. Educational assistance/encouragement for new authors (how-tos)
   l. Automatic/easy classification/positioning of new publications
   m. Automatic citation generation
7. Assistance (user)
   a. Track/pursue other searchers' ways through the repository
   b. User recommendations as part of the repository
   c. Graph/image extraction from papers
   d. Dataset extraction
   e. Assign personalised searchable attributes
      i. Personal comments
      ii. Pictures as bookmarks
      iii. Memory aids
      iv. Relevance statements
   f. Transparent relevance criteria in result display

Comparison

Simple numbers in this section indicate items from the expert interview set, while number-letter combinations belong to the brainstorming set. Both samples expressed a strong awareness of, and interest in, the social aspects and laws shaping modern science, calling for social graphs of publications and authors (10. and 2.a-f). Statistics were perceived as an information source for judging quality and coverage (11.-14.), whereas the brainstorming group emphasised inter-repository consistency of statistics as a precondition (4.a) and possible benefits for evaluation (4.c). Many ideas revolved around community sharing, assuming a positive shift in the amount of work required to identify an interesting paper (7.a, 7.b, and 27.). These trends are probably inspired by the widely perceived impact of user-generated content and community content management. The same is probably true of 7.b, 7.e.i-iv, 28., 19., 20., and 8.; in addition, this strong demand for personalisation mechanisms implies that users have the impression of many redundant steps in repository handling (e.g. searching for a specific piece of text in a previously read article).
Overall, the experts accepted the repository interface as it is, in contrast to the brainstorming group. Most technical and bureaucratic proposals came from the latter, possibly because a majority of its members are employed in the library/knowledge-management sector. The experts interviewed, on the other hand, emphasised that not only the number of services is important but also each service's success rate: all of them would tolerate recommender systems with an accuracy of 90% or more, but would rather not be bothered by the noise produced by an inaccurate service. There seems to be a demand for complex measures and for the unfiltered presentation of complex interrelations instead of simplifications; the persons interviewed no longer believed in the promise of simple numbers describing the world without loss of information.

Future Research

To investigate the desirability ranking of the collected ideas, quantitative methods will be used. The questionnaire will have three logical parts:
1. demographic questions, for identifying user subcultures and for later data re-use;
2. general "attitude towards different kinds of service" questions: filter questions that determine which blocks of questions for specific services are asked;
3. specific questions: a participant answers a number of thematic blocks based on his general attitudes.

The questionnaire will be a set of HTML forms. Adaptive testing is easily implemented using dynamically generated HTML pages (a sketch of the branching logic follows below). Adaptive testing reduces the number of items presented, which helps to prevent participants from giving random answers to questions they are not interested in. In electronic testing there is also no need to manually transcribe answers from hard-copy forms into the computer, eliminating the risk of transcription errors. Execution via HTML forms is today the cheapest and most efficient way to conduct a survey targeting a large and international sample. There will be at least a German and an English version.
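As an illustration of that branching (a minimal sketch; the block names and the threshold are invented for the example, not taken from the project's actual instrument):

```python
def select_blocks(general_attitudes, threshold=3):
    """Adaptive filtering: present a thematic question block only if the
    participant rated interest in that service type at or above
    `threshold` (e.g. on a 5-point scale) in the general attitude part."""
    return [service for service, rating in general_attitudes.items()
            if rating >= threshold]

answers = {"recommender": 5, "usage_statistics": 4, "personalisation": 1}
print(select_blocks(answers))  # ['recommender', 'usage_statistics']
```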
Conclusion

The ideas presented in this paper provide those concerned with usability improvement and the creation of new services, in particular, with valuable hints from the library and interface perspectives. Their informative value will greatly increase once the results of the questionnaire survey can be interpreted quantitatively. The benefit to the other perspectives should not be underrated: aside from designing specialised tools for evaluators, the data needed to implement added-value services, and the data generated by visitors utilising these services, can be integrated with established data sources, increasing validity and the amount of variance explained. Usage data can be used to analyse the validity of bibliometric constructs. New modes of synchronous and asynchronous communication can help libraries and universities, even publishers, to tailor their stock to their clients' demands and, for example, to rectify content or reference structures. A stronger awareness of the social aspects of the publishing process can renew peer communication and make peer review more transparent, if not completely open. Educational as well as non-academic personnel are not only beneficiaries but, as shown in the brainstorming session, can be a source of major transformations, assuming they are supported by students, academics, and bureaucrats. Additionally, the use of open protocols and standards for object description and data transfer is strictly necessary: different solutions can aid the innovation process, but this should not be an excuse for implementing the same algorithm on a different set of objects without retaining interoperability with other providers. The OAI standards, as well as standards such as IFABC, ORE, and OpenURL context objects, need to be employed and further refined.

References

Appelt, W. (2001), "What groupware functionality do users really use? Analysis of the usage of the BSCW system", in Parallel and Distributed Processing, 2001, Mantova, IEEE, pp. 337-341.
Bogner, A., Littig, B. and Menz, W. (2005), Das Experteninterview: Theorie, Methode, Anwendung, VS Verlag für Sozialwissenschaften, Wiesbaden.
Bollen, J., Van de Sompel, H. and Rodriguez, M.A. (2008), "Towards usage-based impact metrics: first results from the MESUR project", in Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, Pittsburgh, 2008, ACM, New York, pp. 231-240.
Brumback, R.A. (2009), "Impact Factor Wars: Episode V - The Empire Strikes Back", Journal of Child Neurology, Vol. 24 No. 3, pp. 260-262.
Davis, P.M. and Price, J.S. (2006), "eJournal interface can influence usage statistics: Implications for libraries, publishers, and Project COUNTER", Journal of the American Society for Information Science and Technology, Vol. 57 No. 9, pp. 1243-1248.
Galyani Moghaddam, G. and Moballeghi, M. (2008), "How Do We Measure Use of Scientific Journals? A Note on Research Methodologies", Scientometrics, Vol. 76 No. 1, pp. 125-133.
Garfield, E. (2006), "The History and Meaning of the Journal Impact Factor", Journal of the American Medical Association, Vol. 295 No. 1, pp. 90-93.
Ghaphery, J. and Ream, D. (2000), "VCU's My Library: Librarians Love It. ...Users? Well, Maybe", Information Technology and Libraries, Vol. 19 No. 4, pp. 186-190.
Harnad, S. (2008), "Validating Research Performance Metrics Against Peer Rankings", Inter-Research Ethics in Science and Environmental Politics, Vol. 8, pp. 103-107.
IRS (2007), "IRS: Interoperable Repository Statistics", available at: http://irs.eprints.org/ (accessed 17 July 2009).
Jensen, P., Rouquier, J.-B. and Croissant, Y. (2009), "Testing bibliometric indicators by their prediction of scientists promotions", Scientometrics, Vol. 78 No. 3, pp. 467-479.
Ketchell, D.S. (2000), "Too Many Channels: Making Sense out of Portals and Personalization", Information Technology and Libraries, Vol. 19 No. 4, pp. 175-179.
Mabel, C., Villanueva, E.V. and Van Der Weyden, M.B. (2007), "Life and times of the impact factor: retrospective analysis of trends for seven medical journals (1994-2005) and their Editors' views", Journal of the Royal Society of Medicine, Vol. 100 No. 3, pp. 142-150.
McWilliams, A., Siegel, D. and Van Fleet, D.D. (2005), "Scholarly Journals as Producers of Knowledge: Theory and Empirical Evidence Based on Data Envelopment Analysis", Organizational Research Methods, Vol. 8 No. 2, pp. 185-201.
Mittelsdorf, B. and Herb, U. (2009), "Breakout group 6. Access Data Mining: A new foundation for Added-value services in full text repositories", available at: http://indico.cern.ch/contributionDisplay.py?contribId=72&confId=48321 (accessed 17 August 2009).
Moed, H.F. (2005a), "Statistical Relationships Between Downloads and Citations at the Level of Individual Documents Within a Single Journal", Journal of the American Society for Information Science and Technology, Vol. 56 No. 10, pp. 1088-1097.
Moed, H.F. (2005b), Citation Analysis in Research Evaluation, Springer Netherlands, Dordrecht.
Montaner, M., López, B. and de la Rosa, J.L. (2003), "A Taxonomy of Recommender Agents on the Internet", Artificial Intelligence Review, Vol. 19 No. 4, pp. 285-330.
Nicholas, D., Huntington, P., Lievesley, N. and Withey, R. (1999), "Cracking The Code: Web Log Analysis", Online & CD-ROM Review, Vol. 23 No. 5, pp. 263-269.
Nicholas, D., Huntington, P. and Watkinson, A. (2005), "Scholarly journal usage: The results of deep log analysis", Journal of Documentation, Vol. 61 No. 2, pp. 248-280.
Project COUNTER (2008), "COUNTER Codes of Practice", available at: http://www.projectcounter.org/code_practice.html (accessed 17 July 2009).
Putz, M. (2002), Wandel der Informationsvermittlung in wissenschaftlichen Bibliotheken, University of Applied Sciences for Library and Information Management, Eisenstadt (accessed 17 July 2009).
Shepherd, P. (2007), "Final Report on the Investigation into the Feasibility of Developing and Implementing Journal Usage Factors", available at: http://www.uksg.org/sites/uksg.org/files/FinalReportUsageFactorProject.pdf (accessed 17 July 2009).
Shepherd, P. and Wallace, J.M. (2009a), "PEER: a European project to monitor the effects of widespread open access archiving of journal articles", Serials, Vol. 22 No. 1, pp. 19-23.
Shepherd, P. and Needham, P.A.S. (2009b), "PIRUS Final Report", available at: http://www.jisc.ac.uk/media/documents/programmes/pals3/pirus_finalreport.pdf (accessed 17 August 2009).
Sikkel, K., Gommer, L. and Van der Veen, J. (2002), "Using Shared Workspaces in Higher Education", Innovations in Education and Teaching International, Vol. 39 No. 1, pp. 26-45.
Stassopoulou, A. and Dikaiakos, M.D. (2007), "A Probabilistic Reasoning Approach for Discovering Web Crawler Sessions", Lecture Notes in Computer Science, Vol. 4505 No. 1, pp. 265-272.

Biographical Notes:

Ulrich Herb studied Sociology at Saarland University, Germany. He is a member of the electronic publishing group of Saarland University and State Library.
Affiliation: Saarland University and State Library, Saarbrücken, Germany

Eva Kranz is studying Bioinformatics at Saarland University, Germany, where she has also been working as a student assistant for Open Access Statistics since 2008. Ms. Kranz is actively involved in the open source project Collabtive, where she is responsible for development, documentation, and community management.
Affiliation: Saarland University and State Library, Saarbrücken, Germany

Tobias Leidinger is studying Computer Science at Saarland University, Germany. He has been working for several electronic publishing projects at Saarland University and State Library (e.g. OPUS 4 and Open Access Statistics) since 2006.
Affiliation: Saarland University and State Library, Saarbrücken, Germany

Björn Mittelsdorf has been a member of Open Access Statistics since 2008. Previously he spent two years at the Institute for Psychology Information, Trier, Germany, where he was involved in the digital preservation of primary research data.
Affiliation: Saarland University and State Library, Saarbrücken, Germany

work_nj5bwc2yrzc7zof5t3mepbvyde ----

Researching the Multiple Murderer: A Comprehensive Bibliography of Books on Specific Serial, Mass, and Spree Killers
Michael G. Aamodt & Christina Moyse
Radford University

True crime books are a useful source for researching serial killers.
Unfortunately, many of these books do not include the name of the killer in the title, making it difficult to find them in a literature search. To make researching serial killers easier, we have created a comprehensive bibliography of true crime books on specific multiple murderers. This was done by identifying the names of nearly 1,800 serial killers and running searches of their names through such sources as WorldCat, Amazon.com, Barnes and Noble, and crimelibrary.com. This listing was originally published in 2004 in the Journal of Police and Criminal Psychology and was last updated in August, 2012. An asterisk next to a killer's name indicates that a timeline written by Radford University students is available on the Internet at http://maamodt.asp.radford.edu/Psyc%20405/serial_killer_timelines.htm and an asterisk next to a book indicates that the book is available in the Radford University library.

Adams, John Bodkin
Devlin, Patrick (1985). Easing the passing. London: Robert Hale. (ISBN 0-37030-627-9)
Hallworth, Rodney & Williams, Mark (1983). Where there's a will. Jersey, England: Capstans Press. (ISBN 0-946-79700-5)
Hoskins, Percy (1984). Two men were acquitted: The trial and acquittal of Doctor John Bodkin Adams. London: Secker & Warburg (ISBN 0-436-20161-5)

Albright, Charles*
*Matthews, John (1997). The eyeball killer. NY: Pinnacle Books (ISBN 0-786-00242-5)

Alcala, Rodney+
Sands, Stella (2011). The dating game killer. NY: St. Martin's Press. ISBN 978-0312535896

Allanson, Patricia*
*Rule, Ann (1992). Everything she ever wanted. New York: Pocket Books.

Anderson, Dale*
*Busch, Alva (1998). Deadly deception. (ISBN 0-786-00617-x)

Archer-Gilligan, Amy*
*Phelps, M. William (2010). The devil's rooming house: The true story of America's deadliest female serial killer. Lyons Press. (ISBN-13 978-1599216010)

Ball, Joe
Radin, Edward D. (1953). Crimes of passion. NY: Digit Books

Barfield, Velma*
Barfield, Velma (1985). Woman on death row. Nashville: Oliver Nelson (ISBN 0-840-79531-9)
Barfield, Velma (1986). On death row. Basingstoke: Marshall Pickering (ISBN 0-551-01322-2)
*Bledsoe, Jerry (1999). Death sentence. NY: Onyx Books. (ISBN 0-451-40755-5)

Bathory, Elizabeth*
McNally, Raymond (1983). Dracula was a woman. NY: McGraw-Hill. (ISBN 0-070-45671-2)
Penrose, Valentine (1970). The bloody countess. London: Calder & Boyars. (ISBN 0-714-50134-4)

Baumeister, Herb*
*Weinstein, Fannie & Wilson, Melinda (1998). Where the bodies are buried. NY: St. Martin's Press. (ISBN 0-312-96653-9)

Beck, Martha
Brown, Wenzell (1952). Introduction to murder. NY: Greenburg. (OCLC 2155500)
Buck, Paul (1990). The honeymoon killers. London: Xanadu Publications. (ISBN 1-854-80038-8)

Beets, Betty Lou*
*Pence, Irene (2001). Buried memories. NY: Pinnacle. (ISBN 0-786-01263-3)

Bell, Mary
Sereny, Gitta (1972). The case of Mary Bell. London: Methuen. (ISBN 0-413-27940-5)
Sereny, Gitta (1995). Portrait of a child who murdered. NY: Vintage (ISBN 0-712-66297-9)
Sereny, Gitta (2000). Cries unheard: Why children kill - The case of Mary Bell. Owl Books. (ISBN 0-805-0608-5)

Bender Family
Adleman, Robert (1970). The bloody Benders. NY: Stein & Day. (ISBN 0-71810-837-x)
De la Garza, Phyllis (2004). Death for dinner: The Benders of (old) Kansas. Talei Publishers. (ISBN 9780963177292)
James, John (1913). The Benders of Kansas. Wichita: Kan-Okla Publishing. (LCCN nuc 87-562578)

Berdella, Robert*
Jackman, Tom & Cole, Troy (1998). Rites of burial. NY: Pinnacle. (ISBN 0-786-00520-3)
NY: Pinnacle. (ISBN 0-786-00520-3) Berkowitz, David (Son of Sam)* Abrahamsen, David (1985). Confessions of Son of Sam. NY: Columbia University Press (ISBN 0-231-05760-1) Calohan, G. H. (2001). My search for the Son of Sam. Iuniverse.com (ISBN 0-595-19694-2) Carpozi, George (1977). Son of Sam: The .44-Caliber Killer. NY: Manor Books (ISBN 0-532-22212-5) Cender, Stephen and Cender, Kenneth (2001). A serial killer: David Berkowitz: Son of Sam/Son of Hope. (ISBN 1-588-20920-2) Klausner, Lawrence D. & Klausner, Larry (1981). Son of Sam: Based on the authorized transcription of the tapes, official documents, and diaries of David Berkowitz. NY: McGraw-Hill (ISBN 0-070-35027-2) Terry, Maury (1987). The ultimate evil. NY: Doubleday. (ISBN 0-385-23452-X) Bernardo, Paul* Burnside, Scott, & Cairns, Alan (1995). Deadly innocence: The true story of Paul Bernardo, Karla Homolka, and the schoolgirl murders. NY: Warner Books. (ISBN 0-446-60154-3) Davey, Frank (1995). Karla's web. Toronto: Penguin (ISBN 0-140-25375-0) DeAngelo, Lee (2011). The twisted relationship of Karla Homolka and Paul Bernardo: Serial killers. (ISBN 978-1241688479) *Pron, Nick (1996). Lethal marriage. NY: Ballantine Books (ISBN 0-345-39055-5) *Williams, Stephen (1996). Invisible darkness. NY: Bantam Books. (ISBN 0-553-56554-x) Bianchi, Kenneth* O'Brien, Darcy (1985). Two of a kind: The Hillside Stranglers. NY: New American Library (ISBN 0-451-14643-3) Schwarz, Ted (2001). The Hillside Strangler. NY: Vivisphere Publishing (ISBN 1587-76041-x) Bible John Crow, Alan (1997). Bible John: Hunt for a killer. NY: First Press Publishing (ISBN 1-901-60300-8) Stoddart, Charles (1980). Bible John. Edinburgh: Paul Harris. (ISBN 0-904-50589-8) Bindner, Timothy (Suspected) *Philpin, John (1997). Stalemate: The shocking true story of child abduction and murder. NY: Bantam (ISBN 0-553-56999-6) Bittaker, Lawrence* Markman, Ronald & Bosco, Dominic (1989). Alone with the devil. NY: Doubleday. (ISBN 0-553-28520-3) Bland, Warren James* *Braidhill, K. (1996). Evil secrets. New York: Pinnacle Books. Bonin, William* Bonin, William (1991). Doing time: Stories from the mind of a death row prisoner. Red Bluff, CA: Eagle Publishing Boskett, Willie James Butterfield, Fox (1996). All God's children: The Boskett family and the American tradition of violence. NY: Avon Books. (ISBN 0-380-72862-1) Brady, Ian Brady, Ian (2001). The gates of Janus: Serial killing and its analysis by the 'Moors Murderer'. Feral House. (ISBN 9780922915736). Cowley, Chris (2011). Face to face with evil: Conversations with Ian Brady. John Blake (ISBN 978-1844549818). Goodman, Jonathan (1973). Trial of Ian Brady and Myra Hindley: The Moors case. David Charles (ISBN 0-715-35663-1) Harrison, Fred (1986). Brady & Hindley: Genesis of the moors murders. London: Ashgrove Press. (ISBN 0-906-79870-1) Johnson, Pamela (19 ). On iniquity: Some personal reflections arising out of the Moors murder trial. NY: Scribner (ISBN 0-684-10313-3) Potter, J.D. (1966). The monsters of the Moors. NY: Ballantine. (OCLC 5965528) Sparrow, Gerald (1966). Satan's children. London: Odhams. (LCCN 67-73292) Williams, Emlyn (1968). Beyond belief. NY: World Books Wilson, Robert (1986). Devil's disciples. Poole, England: Javelin Books. (ISBN 0-713-71960-5) Wilson, Robert (1988). Return to Hell. London: Javelin Books. (ISBN 0-713-72073-5) Bright, Larry Rosencrance, Linda (2010). Bone crusher. NY: Pinnacle (ISBN-13 978-0786022144) Brisbon, Henry Romero, James (2008). The I-57 murderer: The story of Henry Brisbon. CreateSpace.
(ISBN-13 978-1438254500). Brown, David Rule, Ann (1994). If you really loved me: A true story of love and murder. NY: Pocket Books (ISBN 0-671-76920-0) Browne, Robert Charles* *Michaud, Stephen G. & Price, Debbie M. (2007). The devil's right hand man: The true story of serial killer Robert Charles Browne. NY: Berkley. (ISBN 0-425-21727-2). Hess, Charlie (2011). Hello Charlie: Letters from a serial killer. Atria (ISBN 978-1416544869) Brudos, Jerome* *Rule, Ann (1983). Lust killer. NY: Penguin. (ISBN 0-451-16687-6) Buenoano, Judias* Anderson, Chris & McGehee, Sharon (1993). Bodies of evidence: The true story of Judias Buenoano – Florida's serial murderess. NY: St. Martin's Press (ISBN 0-312-92806-8) Bundy, Ted* Dekle, George R. (2011). The last murder: The investigation, prosecution, and execution of Ted Bundy. New York: Praeger. (ISBN 978-0313397431) Humphries, William (1999). Profile of a psychopath. Batavia, IL: Flinn (ISBN 1-877-99150-3) Kendall, Elizabeth (1981). The phantom prince: My life with Ted Bundy. Washington: Madrona Publications (ISBN 0-914-84270-6) Keppel, Robert (1995). The Riverman: Ted Bundy and I hunt for the Green River Killer. NY: Pocket Books. (ISBN 0-671-86763-6) Larsen, Richard W. (1980). The deliberate stranger. Englewood Cliffs, NJ: Prentice Hall. (ISBN 0-130-89185-1) Michaud, Stephen, Aynesworth, Hugh, & Hazelwood, Roy (1989). The only living witness: The story of serial killer Ted Bundy. (ISBN 1-928-7041-5) Michaud, Stephen, & Aynesworth, Hugh (2000). Ted Bundy: Conversations with a killer. Authorlink Press. (ISBN 1-928-70417-4) Nelson, Polly (1994). Defending the devil: My story as Ted Bundy's last lawyer. NY: William Morrow (ISBN 0-688-10823-7) Perry, Michael (1992). The stranger returns (fiction). NY: Pocket Books (ISBN 0-671-73495-4) Rule, Ann (1996). The stranger beside me. NY: New American Library. (ISBN 0-451-16493-8) Sullivan, Kevin M. (2009). The Bundy murders: a comprehensive history. McFarland (ISBN-13 978-0786444267). Winn, Steven & Merrill, David (1980). Ted Bundy: The killer next door. NY: Bantam. Bunting, John* Pudney, Jeremy (2005). The bodies in barrels murders. Australia: Harper Collins Publishers. Buono, Angelo* O'Brien, Darcy (1985). Two of a kind: The Hillside Stranglers. NY: New American Library (ISBN 0-451-14643-3) Schwarz, Ted (2001). The Hillside Strangler. NY: Vivisphere Publishing (ISBN 1587-76041-x) Burke, William* *Bailey, Brian (2002). Burke and Hare: The year of the ghouls. London: Mainstream Publishing Company (ISBN 1-840-18575-9). Knight, Alanna (2007). Burke and Hare. National Archives of England. (ISBN 9781905615131). Rosner, Lisa (2009). The anatomy murders: Being the true and spectacular history of Edinburgh's notorious Burke and Hare and of the man of science who abetted them in the commission of their most heinous crimes. University of Pennsylvania Press (ISBN-13 978-0812241914). Buss, Timothy *Kirsch, Mary Ann (1999). When love is not enough. Sarasota: Disc Us Books (ISBN 1-584-44068-6) Butler, Eugene Zieren, Gregory R. (1980). Oral history interview with Eugene Butler in Davenport, Iowa Campbell, Charles* King, Gary C. (1996). Savage vengeance. NY: Pinnacle Books. (ISBN 0-786-00251-4) Cannan, John* *Berry-Dee, C., & O'dell, R. (1992). Ladykiller. London: Virgin (ISBN 0-86369-690-2) Caputo, Ricardo Papa, Juliet (1995). Lady killer. NY: St. Martins (ISBN 0-671-51720-1) Wolfe, Linda (1998). Love me to death. NY: Pocket Books (ISBN 0-671-51732-5) Carignan, Harvey* *Rule, Ann (1999). The want-ad killer.
NY: New American Library. (ISBN 0-454-46688-4) Carpenter, David* Graysmith, Robert (1990). The Sleeping Lady: The trailside murders above the Golden Gate. NY: Onyx Books. (ISBN 0-451-40255-3) Carson, Suzan and Michael Reynolds, Richard (1987). Cry for war. Walnut Creek, CA: Squibob Press. (ISBN 0-961-85772-2) Cavaness, John Dale, Dr. *O'Brien, Darcy (1989). Murder in Little Egypt. NY: Onyx Books. (ISBN 0-451-40167-0) Chambers, Robert Taubman, Bryna (1988). The preppy murder trial. NY: St. Martins. (ISBN 0-312-92205-1) Chase, Richard Trenton* Biondi, Ray & Hecox, Walt (1992). Dracula killer. NY: Pocket Books. (ISBN 0-671-74003-2) Markman, Ronald & Bosco, Dominic (1989). Alone with the devil. NY: Doubleday (ISBN 0-061-09221-5) Chikatilo, Andrei* *Conradi, Peter (1992). The Red Ripper. NY: Dell. (ISBN 0-44021-603-6) Cullen, Robert (1993). The killer department. NY: Pantheon Books. (ISBN 0-67942-276-5) *Krivich, M., & Ol'gin, Ol'gert (1993). Comrade Chikatilo: The psychopathology of Russia's notorious serial killer. NY: Barricade Books. (ISBN 0-942-63790-9) *Lourie, Richard (1994). Hunting the devil. NY: Harper-Collins (ISBN 0-061-09221-5) Christie, John* *Eddowes, John (1994). The two killers of Rillington Place. London: Little, Brown. (ISBN 0-31690-946-7) Furneaux, Rupert (1961). The two stranglers of Rillington. London: Panther. (OCLC 24923958) *Kennedy, Ludovic (1961). 10 Rillington Place. London: Gollancz. (OCLC 47627484) Marston, Edward (2007). John Christie: Crime Archive. National Archives (ISBN-13 978-1905615167) Clark, Douglas and Carol Bundy *Farr, Louise (1992). The sunset murders. NY: Pocket (ISBN 0-671-70088-x) Clark, Hadden* Havill, Adrian (2001). Born evil: A true story of cannibalism and sexual murder. NY: St. Martins (ISBN 0-312-97890-1) Clements, Dr. Robert George Firth, J. B. (1960). A scientist turns to crime. London: William Kimber Cline, Alfred Leonard *Rice, Craig (1952). 45 murders. NY: Simon & Schuster Cohen, Charles Walsh, Mike (1994). The fallen son. NY: New American Library (ISBN 0-451-40488-2) Coit, Jill* Linedecker, Clifford (1995). Poisoned vows. NY: St. Martins. (ISBN 0-312-95513-8) Singular, Stephen (1995). Charmed to Death. New York: Kensington Publishing (ISBN 0-786-00257-3) Cole, Carrol Edward* *Newton, Michael (1994). Silent rage: Inside the mind of a serial killer. Dell (ISBN 0-440-21313-4) Coll, Vincent Delap, Breandan (1999). Mad Dog Coll. Dublin: Mercier Press (OCLC 44014145) Rosen, Victor (1955). Dark plunder. New York: Lion Books (OCLC 37865854) Collins, John Norman* Keyes, Edward (1976). The Michigan murders. NY: Pocket Books. (ISBN 0-671-73480-6) Constanzo, Adolfo de Jesus* Linedecker, Clifford (1990). Hell ranch. London: Futura *Provost, Gary (1989). Across the border. NY: Pocket (ISBN 0-671-69319-0) Humes, Edward (1991). Buried secrets. NY: Dutton. (ISBN 0-525-24946-x) Kilroy, Jim & Stewart, Bob (1990). Sacrifice. Dallas: Word Publishing. (ISBN 0-849-90783-7) *Schutze, Jim (1989). Cauldron of blood. NY: Avon. (ISBN 0-38075-997-7) Cook, William* *Shirley, Glenn (1963). Born to kill. Derby: Monarch Books (OCLC 19092177) Cooke, Eric Edgar *Drewe, R. (2000). The shark net: Memories and murder. New York: Viking Copeland, David* McLagan, Graeme & Lowles, Nick (2002). Mr. Evil: The secret life of pub bomber and killer David Copeland. London: Blake Publishing (ISBN 1-857-82416-4) Copeland, Faye Miller, Tom (1993). The Copeland killings. New York: Windsor Pub. Corp. (ISBN 1-558-17675-6) Copeland, Ray* Miller, Tom (1993).
The Copeland killings. New York: Windsor Pub. Corp. (ISBN 1-558-17675-6) Corll, Dean and Elmer Wayne Henley* Gurwell, John K. (1974). Mass murder in Houston. Houston: Cordovan Press Hanna, David (1975). Harvest of horror: Mass murder in Houston. NY: Belmont Tower. *Olsen, Jack (2001). The man with the candy. NY: Simon and Schuster. (ISBN 0-743-21283-5) Corona, Juan* Cray, Ed (1973). Burden of proof: The case of Juan Corona. NY: Macmillan Kidder, Tracy (1974). The road to Yuba City: A journey into the Juan Corona murders. NY: Doubleday (ISBN 0-385-02865-2) Talbitzer, Bill (1978). Too much blood. NY: Vantage Press. (ISBN 0-533-03801-4) Villasenor, Victor (1997). Jury: The people vs. Juan Corona. Boston: Little Brown (ISBN 0-440-22333-4) Villasenor, Victor (1976). Beyond reasonable doubt. Boston: Little, Brown. (ISBN 0-316-40300-0) Costa, Antone* *Damore, Leo (1990). In his garden: The anatomy of a murderer. NY: Dell (ISBN 0-440-20707-x) Cottingham, Richard Francis* Leith, Rod (1973). The prostitute murders. NY: St. Martins (ISBN 0-523-42281-4) *Leith, Rod (1984). The torso killer. NY: Pinnacle Books. (ISBN 1-558-17518-0) Cotton, Mary Ann Appleton, Arthur (1973). Mary Ann Cotton. London: Michael Joseph. (ISBN 0-718-11184-2) Whitehead, Tony (2000). Mary Ann Cotton, dead, but not forgotten. London: Whitehead. (ISBN 0-953-96140-0) Crawford, John Martin* *Goulding, Warren (2003). Just another Indian: A serial killer and Canada's indifference. Fitzhenry & Whiteside Limited (ISBN 1-894-00451-5) Cream, Dr. Thomas Neill *McLaren, Angus (1995). A prescription for murder: The Victorian serial killings of Dr. Thomas Neill Cream. Chicago: University of Chicago Press (ISBN 0-226-56068-6) Creighton, Mary Frances Hoffman, Richard Horace (1999). The girl in Poison Cottage, a Gold Medal original. NY: Fawcett Publications. (OCLC 44116922) Cummings, Jesse James* *Davis, Barbara (1999). Suffer the little children. NY: Pinnacle. (ISBN 0-786-00664-1) Cunanan, Andrew *Clarkson, Wensley (1997). Death at every stop: The true story of alleged gay serial killer Andrew Cunanan. St. Martins (ISBN 0-312-96636-9) Indiana, Gary (2000). Three month fever: The Andrew Cunanan story. Cliff Street Books (ISBN 0-060-93112-4) Orth, Maureen (1999). Vulgar favors. Delacorte Press (ISBN 0-385-33286-6) Cunningham, Robbie Mitchell, Sandra (1992). The Miramichi Axe Murder. Halifax, N.S.: Nimbus. (ISBN 1-551-09011-2) Dahmer, Jeffrey* *Baumann, Edward (1991). Step into my parlor: The chilling story of serial killer Jeffrey Dahmer. Chicago: Bonus Books (ISBN 0-929-38764-3) Dahmer, Lionel (1994). A father's story. NY: Avon (ISBN 0-380-72503-7) *Davis, Don (1995). The Milwaukee Murders: Nightmare in Apartment 213. St. Martins (ISBN 0-312-92840-8) *Dvorchak, Robert, and Holewa, Lisa (1991). Milwaukee massacre: Jeffrey Dahmer and the Milwaukee murders. Dell. (ISBN 0-440-21286-3) *Jaeger, Richard W., & Balousek, M. William (1991). Massacre in Milwaukee: The macabre case of Jeffrey Dahmer. Oregon, WI: Waubesa Press (ISBN 1-878-56909-0) Martin, Herman & Lorenz, Patricia (2010). Serial killer's soul: Jeffrey Dahmer's cell block confidant reveals all. Titletown Publishing (ISBN-13 978-0982720615). *Masters, Brian (1993). The shrine of Jeffrey Dahmer. Dunton Green, Sevenoaks, Kent, UK: Hodder and Stoughton (ISBN 0-340-59194-3) Murphy, Dennis, & Kennedy, Patrick (1992). Incident, Homicide: The Story of the Milwaukee Flesh-Eater. Boston, MA: Quimby World Headquarters Publications. *Norris, Joel (1992).
Jeffrey Dahmer: A bizarre journey into the mind of America's most tormented serial killer. NY: Pinnacle Books (ISBN 1-558-17661-6) *Schwartz, Anne E. (1992). The man who could not kill enough: The secret murders of Milwaukee's Jeffrey Dahmer. Carol Publishing. (ISBN 1-559-72117-0) Tithecott, Richard, & Kincaid, James (1999). Of men and monsters: Jeffrey Dahmer and the construction of the serial killer. Madison: University of Wisconsin Press (ISBN 0-299-15684-2). Daveggio, James* Scott, Robert (2001). Rope burns. NY: Pinnacle (ISBN 0-786-01195-5) *Smith, Carlton (2007). Hunting evil. NY: St. Martin's (ISBN 0-312-97572-2) Davis, Frank Davis, Frank (1935). A fugitive from Hell. Joplin, MO: Davis. (OCLC 1930322) DeBardeleben, James Michael* Meloy, J. Reid (1992). Violent attachments. Northvale, NJ: Jason Aronson. *Michaud, Stephen G. (1994). Lethal shadow: The chilling true-crime story of a sadistic sex slayer. NY: Onyx Books (ISBN 0-451-40530-7) *Michaud, Stephen G. (2007). Beyond cruel: The chilling story of America's most sadistic killer. NY: St. Martin's (ISBN 0-312-94251-9). Denyer, Paul Kidd, Paul (2002). Knick-knack man: Inside the mind of Australia's most deranged serial killer. NY: Harper Collins (ISBN 0-732-27058-8) Petraitis, Vikki (1995). The Frankston serial killer. Australia: Clandestine Press. (ISBN 9780980790078) de Rais, Gilles Bataille, Georges (1965). The Trial of Gilles de Rais. Paris: Jean-Jacques Pauvert. Translated by Richard Robinson. Benedetti, Jean (1971). The real Bluebeard: The life of Gilles de Rais. New York: Stein & Day. Winwar, Frances (1948). The Saint and the Devil: The story of Joan of Arc and Gilles de Rais. NY: Harper and Brothers. Wolf, Leonard (1980). Bluebeard: The life and crimes of Gilles de Rais. NY: Clarkson N. Potter. DeSalvo, Albert (Boston Strangler) Banks, Harold (1967). The strangler! NY: Avon. (LCCN 67-2278) Frank, Gerold (1988). Boston strangler. NY: New American Library (ISBN 0-451-16625-6) Kelly, Susan (1995). The Boston Stranglers: The Public Conviction of Albert DeSalvo and the True Story of Eleven Shocking Murders. Birch Lane (ISBN 1-559-72298-3). Rae, George (1967). Confessions of the Boston Strangler. NY: Pyramid Books. Dhillon, Sukhwinder Wells, Jon (2009). Poison: From Steeltown to the Punjab, the true story of a serial killer. Dieteman, Sam+ +Kimball, Camille (2009). A sudden shot: The Phoenix serial shooter. NY: Berkley. (ISBN-13 978-0425230190). Dodd, Westley Allan* Dodd, Westley Allan, Steinhurst, Lori, & Rose, John (1994). When the monster comes out of the closet: Westley Allan Dodd in his own words. Salem, OR: Rose Publications (ISBN 1-881-17006-3) *King, Gary (1993). Driven to kill. NY: Pinnacle Books (ISBN 0-786-01347-8) Downey, Leslie Sparrow, Gerald (1966). Satan's children. London: Odhams. (LCCN 67-73292) Dugan, Brian* *Frisbie, Thomas, & Garrett, Randy (1994). Victims of justice. NY: Avon Books. (ISBN 0-380-79845-X) Dunkle, Jon Scott* Brinck, G. (1999). The boy next door. New York: Kensington Books. Durand, Earl Blake, Jim F. (2000). A chronicled account of the outlaws, robbers, and shootists of the Cowboy Bar, Meeteetse, Wyoming. Meeteetse, WY: Blake Co. (OCLC 46999842) Hilken, Glen (1976). The legend of Earl Durand. NY: Manor Books. (OCLC 18159003) Engleman, Glennon* *Bakos, Susan C. (1991). Appointment for murder. NY: Pinnacle Books (ISBN 1-558-17552-0) Erler, Bob* *Erler, Bob & Souter, John (1980). They called me the catch-me killer. (ISBN 0-842-30214-x) Souter, John (1980). The catch-me killer.
(ISBN 0-842-30213-1) *Varon, Joseph (1996). The Catch-me killer. NY: St. Martins (ISBN 0-312-95934-6) Evans, Edward Sparrow, Gerald (1966). Satan's children. London: Odhams. (LCCN 67-73292) Evans, Gary* Phelps, M. W. (2005). Every move you make. NY: Pinnacle Books (ISBN 0-786-01695-7) Evans, Timothy Eddowes, John (1955). The man on your conscience. London: Cassell. (LCCN 5512621) Eddowes, John (1994). The two killers of Rillington Place. London: Little, Brown. (ISBN 0-31690-946-7) Furneaux, Rupert (1961). The two stranglers of Rillington. London: Panther. (OCLC 24923958) Kennedy, Ludovic (1985). 10 Rillington Place. NY: Avon (ISBN 0-685-03264-7) Evonitz, Richard Marc* Fanning, Diane (2004). Into the water. New York: St. Martin's. (ISBN 0-312-98526-6) Eyler, Larry* *Kolarik, Gera-Lind (1992). Freed to kill: The true story of serial murderer Larry Eyler. NY: Avon Books. (ISBN 0-380-71546-5). Fernandez, Raymond Brown, Wenzell (1952). Introduction to murder. NY: Greenburg. (OCLC 2155500) Fish, Albert* *Angelella, Michael (1979). Trail of blood. NY: New American Library. (ISBN 0-451-09673-8) *Heimer, Mel (1971). The cannibal. NY: Lyle Stuart *Schechter, Harold (1990). Deranged. Pocket Books (ISBN 0-671-67875-2). *Wertham, Frederic (1949). The show of violence. London: Gollancz +Fisher, John and Lavinia Orr, Bruce (2012). Six miles to Charleston: the true story of John and Lavinia Fisher. The History Press. Available for Kindle Flegenheimer, Arthur Burroughs, William S. (1969). The last words of Dutch Schultz. Boston, Mass.: The Atlantic Monthly. (OCLC 25199751) Weston, Paul B. (1962). Muscle on Broadway. Evanston, Ill.: Regency Books. (OCLC 12710973) Fontaine, Roy* Copeland, James (1981). The butler. London: Granada. (ISBN 0-586-04906-1) *Lucas, Norman, & Davies, Philip (1979). The monster butler. London: Weidenfeld and Nicolson (ISBN 0-297-81032-4) *Hall, Roy Archibald & Holt, Trevor Anthony (2002). To kill and kill again: The chilling true confessions of a serial killer. John Blake Publishing (ISBN 9781857825558). Ford, Wayne Adam* *Smith, Carlton (2001). Shadows of evil. NY: St. Martins (ISBN 0-312-97887-1) Francois, Kendall* Gado, Mark (2011). Nightcrawler. (Kindle Edition) Rosen, Fred (2002). Body dump. NY: Pinnacle (ISBN 0-7860-1133-5) Franklin, Joseph Paul* Ayton, Mel (2011). Dark soul of the south: the life and crimes of racist killer Joseph Paul Franklin. (ISBN 978-1597975438) Fraser, Leonard* *Doneman, Paula (2006). Things a killer would know: The true story of Leonard Fraser. Allen & Unwin (ISBN 1-741-14231-8) Frazier, John Linley Ward, Damio (1974). Urge to kill. NY: Pinnacle (ISBN 0-523-00380-3) Fry, Robert* *Scott, Robert (2005). Monster slayer. NY: Pinnacle Books (ISBN 0-7860-1603-5). Fugate, Caril Ann Beaver, Ninette, Ripley, B. K., & Trese, Patrick (1974). Caril. NY: Lippincott (ISBN 0-397-00997-6) Newton, Michael (1998). Waste land. NY: Pocket Books (ISBN 0-671-00198-1) Gacy, John Wayne* Amirante, S. L., & Broderick, D. (2011). John Wayne Gacy: Defending a monster. Skyhorse Publishing. (ISBN 978-1616082482) *Cahill, Tim (1986). Buried dreams: Inside the mind of a serial killer. NY: Bantam Books. (ISBN 0-553-25836-2) Gacy, J. W. (1991). Question of doubt: The John Wayne Gacy story. NY: Craig Bowley Consultants. (ISBN 1-878-86503-x) Kozenczak, Joseph, & Henrikson, Karen (1992). A passing acquaintance. NY: Carlton Press (ISBN 0-806-24132-2) Kozenczak, Joseph R., & Henrikson, K. M. (2004). The Chicago killer: The hunt for serial killer John Wayne Gacy.
Xlibris (ISBN 1-401-09532-1) *Linedecker, Clifford L. (1980). The man who killed boys: A true story of mass murder in a Chicago suburb. St. Martins Press. (ISBN 0-312-95228-7) *Mendenhall, Harlan, Kowaleski, Mary Lou, & Mars, John (1999). Fall of the house of Gacy. M.K. Enterprises (ISBN 1-887-82701-3) *Moss, Jason (2000). The last victim. NY: Warner Books (ISBN 0-446-60827-0) Nelson, Mark, & Oswald, Gerald (1986). The 34th victim. Westmont, IL: Fortune Productions (OCLC 43874402) Rignall, Jeff (1979). 29 Below. Chicago: Wellington Press. (OCLC 6366698) *Sullivan, Terry and Maiken, Peter T. (1983). Killer clown: The John Wayne Gacy murders. NY: Pinnacle Books. (ISBN 0-786-00086-x) Gallego, Gerald* *Biondi, Ray, & Hecox, Walt (1990). All his father's sins. NY: Pocket Books. (ISBN 0-671-67265-7) Flowers, R. Barri (1996). The sex slave murders. NY: St. Martins. (ISBN 0-312-95989-3) *Van Hoffman, Eric (1999). A venom in the blood. NY: Pinnacle. (ISBN 0-786-00660-9) Garrow, Robert *Alibrandy, Tom & Armani, Frank (1984). Privileged information. NY: HarperCollins. (ISBN 0-396-08363-3) Gary, Carlton Rose, David (2011). The Big Eddy Club: The stocking stranglings and southern justice. New Press (ISBN 978-1595586711) Gaskins, Donald "Pee Wee" *Gaskins, Donald & Earle, Wilton (1993). Final Truth: The autobiography of mass murderer/serial killer Donald "Pee Wee" Gaskins. NY: Pinnacle Books (ISBN 0-786-00022-8) *Griffin, John Chandler (2010). Pee Wee Gaskins: America's No. 1 serial killer. Xlibris, Corp. (ISBN-13 978-1450090889) Hall, Frances S. (1990). Slaughter in Carolina. Florence, SC: Hummingbird Publishers. (ISBN 0-962-69810-5) Gecht, Robin Fletcher, Jaye Slade (1995). Deadly thrills. NY: Onyx (ISBN 0-451-40625-7) Gein, Ed* Gollmar, Robert H. (1990). Edward Gein. New York: Windsor. *Schechter, Harold (1989). Deviant: The shocking story of the original "Psycho". NY: Pocket Books. (ISBN 0-671-02546-5) Woods, Paul Anthony (1995). Ed Gein: Psycho. NY: St. Martin's Press. (ISBN 0-312-13057-0) Gerard, David* Scott, Robert (2010). Blood frenzy. New York: Pinnacle. (ISBN-13 978-0786020362) Gilbert, Kristen* *Phelps, M. William (2003). Perfect poison: A female serial killer's deadly medicine. New York: Pinnacle (ISBN 0-786-01550-0) Gillis, Sean* Mustafa, Susan D., & Israel, Sue (2011). Dismembered. New York: Kensington Publishing (ISBN 978-0-7860-2361-5) Glatman, Harvey* *Newton, Michael (1998). Rope: The twisted life and crimes of Harvey Glatman. NY: Pocket Books (ISBN 0-671-01747-0) Wolf, Marvin J. & Mader, Katherine (1988). Fallen angels. NY: Ballantine Books. (ISBN 0-345-34770-6) Gonzales, Benjamin Pedro *Scott, Robert (2002). Savage. NY: Pinnacle (ISBN 0-786-01409-1). Gore, David* *Ward, Bernie (1994). Innocent prey: The astounding true story of two killing cousins and their Florida rampage of terror. New York: Pinnacle. Gray, Dana Sue *Braidhill, K. (2000). To die for: The shocking true story of female serial killer Dana Sue Gray. NY: St. Martins. (ISBN 0-312-97416-7) Green, Deborah Rule, Ann (1999). Bitter harvest. NY: Pocket Books (ISBN 0-671-86869-1) Green, Ricky Lee and Sharon Springer, Patricia (2000). Blood rush. NY: Kensington (ISBN 0-786-00552-1). Griffiths, Stephen* Dixon, Cyril (2011). The crossbow cannibal: The definitive story of Stephen Griffiths. John Blake. (ISBN-13 978-1843583592) Grissom, Richard *Mitrione, Dan (1996). Suddenly gone: The Kansas murders of serial killer Richard Grissom. NY: St. Martins. (ISBN 0-312-96052-2) Gunness, Belle* De La Torre, Lillian (1955).
The truth about Belle Gunness. NY: Gold Medal. *Langlois, Janet (1985). Belle Gunness. Bloomington: Indiana University Press. (ISBN 0-253-31157-8) *Shepherd, Sylvia E. (2001). The mistress of murder hill: the serial killings of Belle Gunness. Haarmann, Fritz Bolitho, W. (1926). Murder for profit Lessing, T. (1993). Monsters of Weimar: Haarmann – The Story of a Werewolf Hahn, Anna Marie* *Franklin, D. B. (2006). The good-bye door: The incredible true story of America's first female serial killer to die in the chair. Kent, OH: Kent State University Press. (ISBN 0-873-38874-7) Haigh, John George Byrne, Gerald (1954). John George Haigh, Acid killer. London: J. Hill. Briffett, David (1988). The acid bath murders. Sussex: Field Place Press La Bern, Arthur (1973). Haigh: The mind of a murderer. London: W.H. Allen & Co., Ltd. (ISBN 0-491-01190-3) Lefebure, M. (1958). Murder with a difference: Studies of Haigh and Christie. London: Heinemann. Olsen, J. (1953). The trial of John George Haigh: The acid bath murder. London: T&A Constable Hall, Della Faye *Jones, Aphrodite (1998). Della's web: The many husbands of a suburban black widow. NY: Pocket Books. (ISBN 0-671-01379-3) Hall, Larry* *Martin, Christopher (2010). Urges: a chronicle of serial killer Larry Hall. CreateSpace (ISBN-13 978-1451589948). Hall, Roy Archibald* Copeland, James (1981). The butler. London: Granada. (ISBN 0-586-04906-1) *Lucas, Norman, & Davies, Philip (1979). The monster butler. London: Weidenfeld and Nicolson (ISBN 0-297-81032-4) *Hall, Roy Archibald & Holt, Trevor Anthony (2002). To kill and kill again: The chilling true confessions of a serial killer. John Blake Publishing (ISBN 9781857825558). Nicol, Allen (2011). The monster butler. B & W Publishing (ISBN-13 978-1845023362). Hansen, Robert* DuClos, Bernard (2008). Fair game. BackinPrint.com. (ISBN-13 978-0595481361) *Gilmour, Walter & Hale, Leland E. (1991). Butcher, baker: A true account of a serial murderer. NY: Onyx Press. (ISBN 0-451-40276-6) Hare, William *Bailey, Brian (2002). Year of the ghouls: The complete history of Burke and Hare. Mainstream Publishing Company (ISBN 9781840185751). *Knight, Alanna (2007). Burke and Hare. National Archives of England. (ISBN 9781905615131). Harrelson, Sharon (not a serial killer – only two husbands killed) *Olsen, Gregg (1998). The confessions of an American black widow: A true story of greed, lust, and a murderous wife. NY: St. Martins (ISBN 0-312-96503-6) Harvey, Donald* Whalen, William, & Martin, Bruce (2005). Defending Donald Harvey: The case of America's most notorious angel-of-death serial killer. Emmis Books (ISBN 1-578-60209-2) Hatcher, Charles* *Ganey, Terry (1989). Innocent blood: The true story of obsession and serial murder. St. Martins (ISBN 0-312-92269-8) Ganey, Terry (1989). St. Joseph's children: a true story of terror and justice. NY: Carol Pub. Group. (ISBN 0-818-40509-0) Hausner, Dale* Kimball, Camille (2009). A sudden shot: The Phoenix serial shooter. NY: Berkley. (ISBN-13 978-0425230190). Heath, Neville Selwyn, Francis (1988). Rotten to the core. London: Routledge. (ISBN 0-710-21033-7) Heaulme, Francis Abgrall, Jean-Francois, Luret, Samuel, & Schwartz, R. (2005). Inside the mind of a killer: On the trail of Francis Heaulme. Profile Books Limited. (ISBN 1-861-97656-9). Heidnik, Gary* *Englade, Ken (1989). Cellar of horror. NY: St. Martins (ISBN 0-312-92929-3). Apsche, Jack (1993). Probing the mind of a serial killer. Morrisville, PA: International Information Associates.
(ISBN 0-945-51012-8) Heirens, William* Downs, Thomas (1984). Murder man. NY: Dell. (ISBN 0-44015-995-4) Freeman, Lucy (1955). Catch me before I kill more. NY: Crown (OCLC 758195) Kallio, Lauri E. (1999). Confess or die: The case of William Heirens. NY: Minerva Press (ISBN 0-754-10440-0) Kennedy, Dolores (1992). William Heirens: His day in court. Chicago: Bonus Books. (ISBN 0-929-38750-3) Lindberg, Richard (1999). Return to the scene of the crime – Chicago. Nashville: Cumberland House Publishers. Hickey, John Frank* *McLaughlin, V. (2006). The postcard killer: The true story of J. Frank Hickey. Thunder Mouth Press (ISBN 1-560-259094) +Hicks, James Scee, Trudy Irene (2009). Tragedy in the north woods. History Press (ISBN 9781596295506) Hightower, Christopher (mass murderer) Hightower, Susan and Ryzuk, Mary (1995). Shattered innocence, shattered dreams. NY: Pinnacle (ISBN 0-786-00219-0) +Hilley, Marie Ginsburg, Phillip E. (1987). Poisoned blood. NY: Charles Scribner +McDonald, Robin (1987). Black widow. NY: St. Martin's. (ISBN 0-31290-266-2) Hillside Stranglers* O'Brien, Darcy (1985). Two of a kind: The Hillside Stranglers. NY: New American Library (ISBN 0-451-14643-3) Schwarz, Ted (2001). The Hillside Strangler. NY: Vivisphere Publishing (ISBN 1587-76041-x) Hilton, Gary Michael* Butcher, Lee (2012). At the hands of a stranger. NY: Pinnacle Books (ISBN 0786021934) *Rosen, Fred (2011). Trails of death: The national parks killer. Titletown Publishing. (ISBN 978-0982720691) Tucker, Gloria (2011). Victimized by a serial killer. Moorhen Press. (ISBN 1105283461) Hindley, Myra Harrison, Fred (1986). Brady & Hindley. London: Ashgrove Press. (ISBN 0-906-79870-1) Jones, Frank (1981). Trail of blood. Toronto: McGraw-Hill Ryerson. (ISBN 1-85685-024-2) Jones, Janie (1994). Devil and Miss Jones: The twisted mind of Myra Hindley. London: Blake Publishing (ISBN 1-856-85060-9) Lee, Carol Ann (2011). One of your own: the life and death of Myra Hindley. (ISBN 978-1845967017) Potter, J.D. (1966). The monsters of the Moors. NY: Ballantine. (OCLC 5965528) Sparrow, Gerald (1966). Satan's children. London: Odhams. (LCCN 67-73292) Wilson, Robert (1986). Devil's disciples. Poole, England: Javelin Books. (ISBN 0-713-71960-5) Wilson, Robert (1988). Return to Hell. London: Javelin Books. (ISBN 0-713-72073-5) +Holmes, H. H. (America's First Serial Killer) Borowski, John (2008). The strange case of Dr. H. H. Holmes. Waterfront Productions. (ISBN-13 978-0975918517). Eckert, Allan W. (2000). The scarlet mansion. NE: iUniverse.com (ISBN 0-595-08988-7) Griffith, William & Selzer, Adam (2011). The murder castle of H.H. Holmes: Eyewitness accounts, diagrams, and pictures. Franke, David (1975). The torture doctor. NY: Hawthorne Books. (ISBN 0-801-57832-9) *Geary, Rick (2003). The beast of Chicago: The murderous career of H.H. Holmes. NY: NBM Publishing (ISBN 1-56163-362-3-51595). *Schechter, Harold (1998). Depraved: The shocking story of America's first serial killer. Pocket Books (ISBN 0-671-02544-9). Van Wagoner, Colby (2011). Mulberry Lane: Based on the true story of H. H. Holmes. CreateSpace (ISBN 978-1463769857) Hoyt, Waneta Ethel* *Firstman, Richard, & Talan, Jamie (1998). The death of innocents. NY: Bantam Books. (ISBN 0-553-37977-1) Hickey, Charles; Lighty, Todd; O'Brien, John (1996). Goodbye, my little ones: the true story of a murderous mother and five innocent victims. NY: Onyx. (ISBN 0-451-40692-3) Jablonski, Phillip* *Bortnick, B. (1997). Deadly urges. New York: Pinnacle. James, Stephen Gerald* Root, Neil (2007).
Cold blooded evil. John Blake Publishing (ISBN 9781844544813). Jaspers, Virginia Belle* *Peinkofer, James (2007). Lilacs in the rain: The shocking story of Connecticut's shaken-baby serial killer. Bloomington, IN: Rooftop Publishing (ISBN 978-1-6000-8073-9) Jesperson, Keith Hunter* *Olsen, Jack (2002). I: The creation of a serial killer. NY: St. Martins (ISBN 0-312-24198-4) Johnson, Russell Jones, Frank (1981). Trail of blood: A Canadian murder odyssey. Toronto: McGraw-Hill. (ISBN 0-075-48414-5) Jones, Genene* *Elkind, Peter (1989). The death shift: The true story of Genene Jones and the Texas baby murders. NY: Viking Press. (ISBN 0-670-81397-4) *Moore, Kelly & Read, Dan (1989). Deadly medicine. NY: St. Martin's (ISBN 0-312-91579-9) Jones, Jeremy Bryan* *Johnson, Sheila (2007). Blood lust. NY: Pinnacle Books. (ISBN 978-0-7860-1852-9). Joubert, John* *Pettit, Mark (1990). A need to kill. NY: Ivy Books. (ISBN 0-93964-438-X) Judy, Steven Timothy Nunn, Bette (1981). Burn, Judy, burn. Martinsville, Ind: B. Nunn. (OCLC 7891489) Kaczynski, Theodore (The Unabomber) *Douglas, J., & Olshaker, M. (1996). Unabomber: On the trail of America's most-wanted serial killer. NY: Simon & Schuster. (ISBN 0-671-00411-5) *Gibbs, N., Lacayo, R., Morrow, L., Smolowe, J., & Van Biema, D. (1996). Mad genius: The odyssey, pursuit, and capture of the Unabomber suspect. NY: Warner Books. Graysmith, Robert (1998). Unabomber: Desire to Kill. Berkley (ISBN 0-425-16725-9). Waits, Chris & Shors, David (1999). Unabomber: The secret life of Ted Kaczynski. Mountain Magazines (ISBN 1-560-37131-5) Kallinger, Joseph* Downs, Thomas (1984). The door-to-door killer. NY: Dell (ISBN 0-440-12132-9) *Schreiber, Flora Rheta (1983). The shoemaker: The anatomy of a psychotic (ISBN 0-671-22652-5) +Kearney, Patrick +Stewart, Tony (2010). Trash bag murderer. Bloomington, IN: Tony Stewart Publication. (ISBN-13 978-0557908998) Kemper, Ed* *Cheney, Margaret (2000). Why: The serial killer in America. Lincoln, NE: Backinprint.com. (ISBN 0-595-08915-1) Cheney, Margaret (1976). The co-ed killer. NY: Walker and Co. (ISBN 0-802-70514-6) Ward, Damio (1974). Urge to kill. NY: Pinnacle (ISBN 0-523-00380-3) Kibbe, Roger (The I-5 Strangler) *Henderson, Bruce (1999). Trace evidence: The hunt for an elusive serial killer. Scribner (ISBN 0-684-80708-4) Kilbride, John Sparrow, Gerald (1966). Satan's children. London: Odhams. (LCCN 67-73292) Kimbrough, Petrie Coleman, J. Winston (1952). Death at the court-house: an account of the mob action in Lexington, Kentucky, on February 9th, 1920, and the events leading up to it. Lexington, KY: Winburn. (OCLC 22075991) +Klenner, Frederick +Bledsoe, Jerry (1988). Bitter blood. NY: E.P. Dutton. (ISBN 0-525-24591-X) +Trotter, William & Newsom, Robert (1988). Deadly kin. NY: St. Martin's. (ISBN 0-929-30700-3) Knorr, Theresa Cross* *McDougal, Dennis (1995). Mother's day. NY: Fawcett Publishing. *Clarkson, Wensley (1995). Whatever mother says. NY: St. Martins Knowles, Paul John Fawkes, Sandy (1977). Killing time. London: Peter Owen. (ISBN 0-720-60514-8) Fawkes, Sandy (2008). In love with a serial killer. John Blake. (ISBN-13 978-1844544738). Kraft, Randy* *McDougal, Dennis (1991). Angel of darkness. NY: Warner Books. (ISBN 0-446-36302-2) Krajcir, Timothy* *DiCosmo, Bridget (2009). Serial murder 101: Timothy Krajcir. New York: Berkley (ISBN 978-0-425-22698-8) Echols, Paul & Byers, Christine (2011). In cold pursuit: My hunt for Timothy Krajcir. New Horizon Press (ISBN 978-0882823485).
*Walker, Steven (2010). Predator. NY: Pinnacle. (ISBN-13 978-0786020188). Krebs, Rex* Mitchell, Corey (2004). Dead and buried: A shocking account of rape, torture, and murder on the California coast. New York: Pinnacle Books (ISBN 0-786-01517-9) Krist, Gary Steven Krist, Gary Steven (1972). Life: the man who kidnapped Barbara Mackle. NY: Olympia Press. (ISBN 0-700-40100-8) Kuklinski, Richard Baden, Michael M. & Hennessee, Judith A. (1992). Unnatural death. NY: Ivy. (ISBN 0-804-10599-5) Bruno, Anthony (1993). The Iceman: The True Story of a Cold-Blooded Killer. New York: Delacorte. Mustain, Gene, and Capeci, Jerry (1993). Murder Machine: A True Story of Murder, Madness, and the Mafia. NY: Onyx Books Kurten, Peter Berg, Karl (1945). The sadist. London: Heinemann Goodwin, George (1938). Peter Kurten – A study in sadism. London: Acorn Press Wagner, Margaret Seaton (1932). The monster of Dusseldorf. London: Faber and Faber LaBarre, Sheila* *Benson, Michael (2009). The burn farm. NY: Pinnacle. (ISBN-13 978-0786020300). *Flynn, Kevin (2010). Wicked intentions: A remote farmhouse, a beautiful temptress, and the lovers she murdered. NY: St. Martin's (ISBN-13 978-0312575779). Lake, Leonard Harrington, Joseph and Burger, Robert (1993). Eye of evil. NY: St. Martins. (ISBN 0-312-95175-2) Harrington, Joseph, & Burger, Robert (1999). Justice denied. NY: Perseus Press (ISBN 0-306-46013-0) Lasseter, Don (2000). Die for me: The terrifying true story of the Charles Ng and Leonard Lake torture murders. NY: Pinnacle Books (ISBN 0-786-01107-6) Landru, Henri Bardens, Dennis (1972). The lady killer. London: P. Davies. (ISBN 0-432-01140-4) Wakefield, Russell (1936). Landru. London: Duckworth. (OCLC 8633696) Lang, Donald Myers, Lowell J. (1969). Can a man be imprisoned without a trial because he cannot communicate? The case of Donald Lang, petition for a writ of habeas corpus filed August 29, 1969. (OCLC 4137627) Tidyman, Ernest (1974). Dummy. London: W. H. Allen. (ISBN 0-491-01392-2) Lara, Mario Lara, Mario (1983). Inside out. San Diego, CA. (OCLC 12147904) Lawrence, Jonathan* *Rosen, Fred (2003). Flesh collectors. NY: Pinnacle Books (ISBN 0-7860-1583-7). Leasure, William Ernest Humes, Edward (1993). Murderer with a badge: the secret life of a rogue cop. NY: Onyx. (ISBN 0-451-40402-5) LeBaron, Ervil Bradlee, B. & Van Atta, D. (1981). Prophet of blood. NY: Putnam. (ISBN 0-399-12371-7) Chynoweth, R. & Shapiro, D. (1990). The blood covenant. Austin, Tex: Diamond Books. (ISBN 0-890-15768-5) Leclerc, Marie *Thompson, T. (2000). Serpentine. NY: Carroll & Graf (ISBN 0-786-70749-6) Lee, Derrick Todd* *Mustafa, S. D., & Clayton, T. (2006). I've been watching you: The South Louisiana serial killer. Bloomington, IN: Author House. (ISBN 1-4259-1326-1) *Stanley, Stephanie A. (2006). An invisible man. NY: Berkley. (ISBN 0-425-20887-7) Weeber, Stan (2007). In search of Derrick Todd Lee: the Internet social movement that made a difference. NY: University Press of America (ISBN 9780761838425). Legere, Allan Joseph MacLean, Rick (1992). Terror's end. Toronto: McClelland & Stewart. (ISBN 0-771-05595-1) MacLean, Rick & Veniot, Andre (1990). Terror. Toronto: McClelland & Stewart. (ISBN 0-77105-592-7) Mitchell, Sandra (1992). The Miramichi axe murder. Halifax, N.S.: Nimbus. (ISBN 1-551-09011-2) Lehmann, Christa Thorwald, Jurgen (1966). The power of poison. London: Thames and Hudson Lindsey, William Darrell* Vernon, McCay, & Vernon, Marie (2005).
Deadly lust: A sex killer strikes again and again. NY: Pinnacle Books (ISBN 0-786-01699-X) List, John Emil Benford, Timothy & Johnson, James (2000). Righteous carnage: The List murders. Iuniverse.com (ISBN 0-595-00720-1) Ryzuk, Mary (1990). Thou shalt not kill. NY: Warner Books (ISBN 0-445-21043-5) Sharkey, Joe (1990). Death sentence: The inside story of the John List murders. NY: Signet (ISBN 0-451-16947-6) +Lockhart, Michael Lee +Fletcher, Jaye Slade (1996). A perfect gentleman. (ISBN 0-786-00263-8) Long, Bobby Joe* *Flowers, Anna (2000). Bound to die: The shocking true story of Bobby Joe Long, America's most savage serial killer. (ISBN 0-786-01187-4) Steel, Fiona (2000). Bobby Joe Long: Savage lust and murder. Ward, Bernie (1995). Bobby Joe: In the mind of a monster. Boca Raton, FL: Cool Hand Communications (ISBN 1-567-90093-3) Wellman, Joy (1997). Smoldering embers: The true story of a serial murderer and three courageous women. NY: New Horizon Press (ISBN 0-882-82154-7) Lucas, Henry Lee* Call, Max (1985). Hand of death: The Henry Lee Lucas story. Lafayette, LA: Prescott Press (ISBN 0-933-45100-8) *Cox, Mike (1991). The confessions of Henry Lee Lucas. NY: Pocket Books. (ISBN 0-671-70665-9) Norris, Joel (1991). Henry Lee Lucas: The shocking true story of America's most notorious serial killer. Kensington (ISBN 0-821-73530-6) +Lumbrera, Diana +Cavenaugh, Mary Lou (1998). Mommy's little angels. NY: Onyx (ISBN 0-451-40493-9) Luther, Thomas* *Jackson, Steve (1998). Monster. NY: Pinnacle Books. (ISBN 0-786-00586-6) Lyles, Anjette White, Jaclyn (1999). Whisper to the black candle: Voodoo, murder, and the case of Anjette Lyles. Atlanta, GA: Mercer University Press. (ISBN 0-865-54638-x) Lynch, Susie Sharp Newsom Trotter, William & Newsom, Robert (1988). Deadly kin. NY: St. Martin's. (ISBN 0-929-30700-3) Maybrick, Florence and James Christie, Trevor (1968). Etched in arsenic. Philadelphia: J.B. Lippincott. (DLC 68044374) Mackay, Patrick Clark, Tim & Penycate, John (1974). Psychopath. London: Routledge & Kegan. (ISBN 0-710-08402-1) *Manuel, Peter *MacLeod, Hector, & McLeod, Malcolm (2010). Peter Manuel, serial killer. Mainstream Publishing (ISBN-13 978-1845965728). *Nicol, A. M. (2009). Manuel: Scotland's first serial killer. Black and White Publishing. (ISBN-13 978-1845022419). Manson, Charles Bishop, George (1971). Witness to evil. Los Angeles: Nash Publishing. (ISBN 0-840-21155-4) Bugliosi, Vincent, & Gentry, Curt (1974). Helter Skelter. NY: Norton. (ISBN 0-393-08700-X) Emmons, Nuel & Manson, Charles (1986). Manson in his own words. NY: Grove. (ISBN 0-394-55558-9) George, Edward, & Matera, Dary (1999). Taming the beast: Charles Manson's life behind bars. (ISBN 0-312-20970-3) Livsey, Clara (1980). The Manson women. NY: Marek. (ISBN 0-39-990073-X) Murphy, Bob (1999). Desert shadows: A true story of the Charles Manson family in Death Valley. (ISBN 0-930-70429-0) Sanders, Ed (1971). The family. NY: Dutton. (ISBN 1-897-74315-7) Schiller, Lawrence & Atkins, Susan (1970). The killing of Sharon Tate. NY: New American Library. (OCLC 2851466) Terry, Maury (1987). The ultimate evil. NY: Doubleday. (ISBN 0-385-23452-X) Udo, Tommy (2002). Charles Manson. (ISBN 1-860-74388-9) Watkins, Paul & Soledad, Guillermo (1979). My life with Charles Manson. NY: Bantam. (ISBN 0-553-12788-8) *McCullough, Patrick *Vernon, McCay & Vernon, Marie (2010). Deadly charm: the story of a deaf serial killer. Washington, D.C.: Gallaudet University Press (ISBN-13 978-1563684432). *McDuff, Kenneth Allen *Lavergne, Gary M. (1999).
Bad boy from Rosebud: the murderous life of Kenneth Allen McDuff. Denton, TX: University of North Texas Press (ISBN 1-574-41072-5) *Stewart, Bob (1996). No remorse. NY: Kensington Press. (ISBN 0-786-00231-x) McElroy, Ken *MacLean, Harry N. (1990). In broad daylight. NY: Dell (ISBN 0-440-20509-3) +McGinnis, Virginia +Heilbroner, David (1994). Death benefit. NY: Avon. (OCLC 34298102) McGurn, Jack Gusfield, Jeffrey (1999). Who was "Machine Gun" Jack McGurn?: correcting the myths about Al Capone's chief assassin. (OCLC 42399469) Malvo, Lee Boyd* Cannon, Angie (2003). 23 days of terror: The compelling true story of the hunt and capture of the Beltway snipers. New York: Pocket Books (ISBN 0-743-47695-6) Horwitz, Sari & Ruane, Michael (2003). Sniper: Inside the hunt for the killers who terrorized the nation. New York: Random House (ISBN 1-400-06129-6) Moose, Charles & Fleming, Charles (2004). Three weeks in October: The manhunt for the serial sniper. NY: Signet Books (ISBN 0-451-21279-7) Penn, Eric (2005). Psychology of killing: What drove John Allen Muhammad to kill? Authorhouse (ISBN 9781420870060). Manuel, Peter MacLeod, Hector, & McLeod, Malcolm (2009). Peter Manuel, serial killer. Mainstream Publishing (ISBN 978-1845963972) Merrett, John Tullett, Tom (1956). Portrait of a bad man. London: Evans Brothers. (OCLC 1963949) Michaud, Michelle Scott, Robert (2005). Rope burns. NY: Pinnacle (ISBN 0-786-01195-5) Middleton, David* *Kaye, Jeff (2009). Beware of the cable guy: From cop to serial killer. Polimedia Publishing. (ISBN 978-0976861737) Milat, Ivan Mercer, Neil (1997). Fate. Australia: Random House. Shears, Richard (1997). Highway to nowhere: The chilling true story of the backpacker murders. Australia: Harper Collins (ISBN 0-732-25105-2) Whittaker, Mark, & Kennedy, Les (1998). Sins of the brother. Australia: Pan Macmillan. Montes, Francisco Arce* *Clarkson, W. (2008). Wolf man. NY: John Blake. (ISBN-13 978-1844545049). Moody, Walter Leroy, Jr. Jenkins, Ray (1997). Blind vengeance: the Roy Moody mail bomb murders. Athens: University of Georgia Press. (ISBN 0-820-31906-6) Winne, Mark. Priority mail: the investigation and trial of a mail bomber obsessed with destroying our justice system. NY: Scribner. (ISBN 0-026-30420-3) +Moore, Blanche +Schutze, Jim (1993). Preacher's girl. NY: William Morrow. (ISBN 0-68811-934-4) Morris, Raymond Hawkes, Harry (1971). Murder on the A34. London: John Long. (ISBN 0-09102-960-0) +Mudgett, Herman (America's First Serial Killer) +Borowski, John (2005). The strange case of Dr. H. H. Holmes. West Hollywood, CA: Waterfront Productions (ISBN 978-0975918517) +Boswell, Charles (1955). The girls in nightmare house. NY: Fawcett Eckert, Allan W. (2000). The scarlet mansion. NE: iUniverse.com (ISBN 0-595-08988-7) Franke, David (1975). The torture doctor. NY: Hawthorne Books. (ISBN 0-801-57832-9) *Geary, Rick (2003). The beast of Chicago: The murderous career of H.H. Holmes. NY: NBM Publishing (ISBN 1-56163-362-3-51595). Larson, Erik (2003). The Devil in the White City: Murder, magic, and madness at the fair that changed America. NY: Crown Publishers (ISBN 978-0375725609) *Schechter, Harold (1998). Depraved: The shocking story of America's first serial killer. Pocket Books (ISBN 0-671-02544-9). Snavely, Judy Miller (2006). Devil's disciple: The deadly Dr. H.H. Holmes. Authorhouse (ISBN 978-1425926908) Muhammad, John Allen* Cannon, Angie (2003). 23 days of terror: The compelling true story of the hunt and capture of the Beltway snipers.
New York: Pocket Books (ISBN 0-743-47695-6) Horwitz, Sari & Ruane, Michael (2003). Sniper: Inside the hunt for the killers who terrorized the nation. New York: Random House (ISBN 1-400-06129-6) Moose, Charles & Fleming, Charles (2004). Three weeks in October: The manhunt for the serial sniper. New York: Signet Books (ISBN 0-451-21279-7) *Penn, Eric (2005). Psychology of killing: What drove John Allen Muhammad to kill? Authorhouse (ISBN 9781420870060). Mullin, Herbert* *Lunde, Donald T. & Morgan, Jefferson (1980). The die song. NY: W.W. Norton (ISBN 0-393-01315-4) Ward, Damio (1974). Urge to kill. NY: Pinnacle (ISBN 0-523-00380-3) West, Don (1974). Sacrifice unto me. NY: Pyramid Books. Nance, Wayne *Coston, John (1992). To kill and kill again. New York: Onyx. (ISBN 0-451-40323-1) Napoletano, Eric *Pienciak, Richard T. (1997). Mama's boy: The true story of a serial killer and his mother. Onyx Books (ISBN 0-451-40748-2) Narciso, Filipina Wilcox, Robert (1977). The mysterious deaths at Ann Arbor. NY: Popular Library. (ISBN 0-445-04030-0) Neal, William Lee* *Jackson, Steve (2011). Love me to death. New York: Pinnacle. (ISBN-13 978-0786026906) Neelley, Judith Cook, Thomas (1990). Early graves. NY: Dutton. (ISBN 0-525-24918-4) Neilson, Donald Hawkes, Harry (1978). The capture of the Black Panther. London: Harrap. (ISBN 0-245-53257-9) Nelson, Earle Leonard* Anderson, Frank (1974). The dark strangler. Calgary: Frontier. (NL Canada 740028510) Graysmith, Robert (2011). The laughing gorilla: the true story of the hunt for one of America's first serial killers. Berkley (ISBN-13 978-0425237366). Schechter, Harold (1998). Bestial: The savage trail of a true American monster. NY: Pocket Books. (ISBN 0-671-73219-6) Ng, Charles Harrington, Joseph and Burger, Robert (1993). Eye of evil. NY: St. Martins. (ISBN 0-312-95175-2) Harrington, Joseph, & Burger, Robert (1999). Justice denied. NY: Perseus Press (ISBN 0-306-46013-0) Lasseter, Don (2000). Die for me: The terrifying true story of the Charles Ng and Leonard Lake torture murders. NY: Pinnacle Books (ISBN 0-786-01107-6) Owens, Greg, & Henton, Darcy (2001). No kill, no thrill: The shocking true story of Charles Ng. (ISBN 0-889-95209-4) Nickell, Stella Maudine Olsen, Gregg (1993). Bitter almonds: the true story of mothers, daughters, and the Seattle cyanide murders. NY: Warner Books. (ISBN 0-446-36359-6) Nilsen, Dennis Masters, Brian (1994). Killing for company: The story of a man addicted to murder. Dell Publishing. (ISBN 0-440-22043-2) McConnell, Brian, & Bence, Douglas (1983). The Nilsen file. London: Futura. Noe, Marie *Glatt, John (2000). Cradle of death. NY: St. Martins (ISBN 0-312-97302-0) Norris, Roy* Markman, Ronald & Bosco, Dominic (1989). Alone with the devil. NY: Doubleday. (ISBN 0-553-28520-3) Olson, Clifford Robert* Ferry, Jon & Inwood, Damian (1982). The Olson murders. Langley, B.C.: Cameo Books. (NLC 820079014) Holmes, W. Leslie & Bruce L. Northrop (2000). Where Shadows Linger: The Untold Story of the RCMP's Olson Murder. Surrey, Canada: Heritage House Publishing Co. Ltd. (ISBN 1-895-81192-9) Mulgrew, Ian (1991). Final payoff: The true price of convicting Clifford Robert Olson. Toronto: Seal Books. O'Neall, Darren Dee *King, Gary C. (1995). Blind rage. NY: Penguin (ISBN 0-451-40532-3) Panzram, Carl Gaddis, Thomas (2002). Panzram: A journal of murder. NY: Amok Books (ISBN 1-878-92314-5) Peete, Louise Anthony, Helen B. (1999). The search for Lofie Louise. Dallas, TX: Glendale Press.
(ISBN 1-893-45102-x) *Percy, Derek Ernest *Marshall, Debi (2010). Lambs to the slaughter: Inside the depraved mind of child-killer Derek Ernest Percy. Australia: Random House (ISBN-13 978-1741666519) Perez, Leonora Wilcox, Robert (1977). The mysterious deaths at Ann Arbor. NY: Popular Library. (ISBN 0-445-04030-0) Petiot, Marcel Grombach, John (1980). The great liquidator. NY: Doubleday. (ISBN 0-385-13271-9) Maeder, Thomas (1980). The unspeakable crimes of Dr. Petiot. Boston: Little, Brown. (ISBN 0-316-54366-7) Maeder, Thomas (1992). Docteur Petiot: The study of France's most diabolical serial killer. NY: Penguin (ISBN 0-140-16927-x) Pickton, Willie *Cameron, Stevie (2007). The Pickton file. Canada: Knopf (ISBN 0-676-97953-x) Pitchfork, Colin* *Wambaugh, Joseph (1989). The blooding: The true story of the Narborough Village murders. New York: Morrow. Pomeroy, Jesse* (America's Youngest Serial Killer) *Schechter, Harold (2000). Fiend: The shocking true story of America's youngest serial killer. Simon & Schuster (ISBN 0-671-01448-x) Powers, Harry* McLaughlin, Vance (2011). The mail order serial killer: the life and death of Harry Powers. Cincinnati, OH: Post Mortem Press. (ISBN-13 978-1615480121) Price, Craig *Lang, Denise (2000). A call for justice: A New England town's fight to keep a stone cold serial killer in jail. NY: Morrow/Avon. (ISBN 0-380-78077-1) Puente, Dorothea* Blackburn, Daniel (1990). Human harvest. Los Angeles: Knightsbridge. (ISBN 1-877-96110-8) Norton, Carla (1994). Disturbed ground: The true story of a diabolical female serial killer. NY: Avon Books. (ISBN 0-688-09704-9) Wood, William (1994). The bone garden. NY: Pocket Books. (ISBN 0-671-68638-0) Putt, George Howard Meyer, Gerald (1974). The Memphis murders. NY: Seabury Press (ISBN 0-816-49202-6) Quantrill, William Clarke Schultz, Duane (1997). Quantrill's war: The life and times of William Clarke Quantrill. NY: St. Martin's Press (ISBN 9780312169725). Rader, Dennis (BTK Killer)* Douglas, John & Dodd, Johnny (2007). Inside the mind of BTK: The true story behind the thirty-year hunt for the notorious Wichita serial killer. NY: Wiley (ISBN 9780787984847). *Beattie, R. (2005). Nightmare in Wichita: The hunt for the BTK strangler. NY: New American Library (ISBN 0-451-21738-1) *Singular, S. (2006). Unholy messenger: The life and crimes of the BTK serial killer. NY: Scribner. (ISBN 0-743-29124-7) *Smith, C. (2006). The BTK murders: Inside the "Bind Torture Kill" case that terrified America's heartland. NY: St. Martin's (ISBN 0-312-93905-1) Wenzl, R., & Potter, T. (2007). Bind, torture, kill: The inside story of the serial killer next door. NY: Regan Books (ISBN 0-061-24650-6) Ramirez, Richard (The Night Stalker)* *Carlo, Phillip (1996). The Night Stalker: The life and crimes of Richard Ramirez. Pinnacle Books (ISBN 0-786-00379-0). Dee, Kay (2006). Life without hope: Into the mind of Richard L. Ramirez. Los Angeles: D.B. Press International (ISBN 0-978-89840-0). Linedecker, Clifford (1989). Hell ranch. Austin, Tex.: Diamond Books. (ISBN 0-312-92505-0) *Linedecker, Clifford (1991). Night stalker. NY: St. Martins (ISBN 0-312-92505-0) Rathburn, Charles Linedecker, Clifford (1997). Death of a model. NY: St. Martin (ISBN 0-312-96163-4) Ray, David Parker* Fielder, Jim (2003). Slow death. NY: Pinnacle Books. *Sparks, J. E. (2006). Consequences: The criminal case of David Parker Ray. Roswell, NM: Yellow Jacket Press (ISBN 0-9787734-0-3). Reeves, Jack *Springer, Patricia (1999). Mail order murder.
NY: Penguin (ISBN 0-786-00640-4) +Reldan, Robert Muti, Richard, & Buckley, Charles (2012). The charmer: the true story of Robert Reldan. Titletown Publishing (ISBN-13 978-0985247874). Reles, Abraham Block, Alan Abner (1975). Lepke, Kid Twist and the Combination: organized crime in New York City, 1930-1944. (OCLC 3169906) Resendez, Angel Maturino (The Railroad Killer) *Clarkson, Wensley (1999). The Railroad Killer: The shocking true story of Angel Maturino Resendez. NY: St. Martins. (ISBN 0-312-97452-3) Rhoades, Robert Ben* *Busch, Alva (1996). Roadside prey. NY: Pinnacle Books (ISBN 0-786-00221-2) Ridgway, Gary (Green River Killer)* King County Journal (2004). Gary Ridgway: The Green River Killer. Seattle: King County Journal (ISBN 0-974-70380-x) *Prothero, Mark & Smith, Carlton (2007). Defending Gary: Unraveling the mind of the Green River Killer. NY: Jossey-Bass (ISBN 0-787-99548-7) Reichert, David (2004). Chasing the devil: My twenty-year quest to capture the Green River Killer. New York: Little, Brown (ISBN 0-316-15632-9) Rule, Ann (2005). Green River, running red: The real story of the Green River Killer. New York: Free Press (ISBN 0-743-23851-6) Rifkin, Joel* Eftimiades, Maria (1993). Garden of graves: The shocking true story of Long Island serial killer Joel Rifkin. NY: St. Martins. (ISBN 0-312-95298-8) *Mladinich, Robert (2001). The Joel Rifkin story: From the mouth of the monster. (ISBN 0-743-41152-8) Pulitzer, Lisa Beth & Swirsky, Joan (1994). Crossing the line. NY: Berkley. (ISBN 0-425-14441-0) Robinson, John Edward* Douglas, John & Singular, Stephen (2004). Anyone you want me to be: A true story of sex and death on the Internet. New York: Pocket Star (ISBN 0-743-44880-4) *Glatt, John (2001). Internet slave master. NY: St. Martin's Press. (ISBN 0-312-97927-4) *Glatt, John (2001). Depraved. NY: St. Martin's Press. (ISBN 0-312-93684-2) *Wiltz, Sue (2010). Slave master. NY: Pinnacle Books (ISBN-13 978-0786022221) Rode, Jimmy* (a.k.a. Cesar Barone) *Lasseter, Don & King, Gary C. (1997). Dead of night: The true story of a serial killer. Onyx Books (ISBN 0-451-40703-2) Rodgers, Jeremiah* *Rosen, Fred (2003). Flesh collectors. NY: Pinnacle Books (ISBN 0-7860-1583-7). Rogers, Dayton Leroy (Molalla Forest Killer) *King, Gary C. (1992). Blood lust: portrait of a serial sex killer. Onyx (ISBN 0-451-40352-5) Rogers, Glen* *Combs, Stephen M. & Eckberg, John (2002). Road dog: The bloody trail of Glen Rogers. Federal Point Publishing (ISBN 9780966825916). Linedecker, Clifford L. (1997). Smooth operator: The true story of accused serial killer Glen Rogers. St. Martins Press. (ISBN 0-312-96400-5). *Spizer, Joyce & Rogers, Claude (2002). Cross country killer: the Glen Rogers story. Top Publications. (ISBN 9781929976119). Rogers, Kenneth Paul *Rogers, Kenneth Paul (1974). For one sweet grape: the true story of a convicted rapist-murderer. NY: Playboy Press. (ISBN 8-722-3419-3) Rolling, Danny (The Gainesville Ripper)* *Fox, James & Levin, Jack (1996). Killer on campus: The terrifying true story of the Gainesville Ripper. NY: Avon Books (ISBN 0-380-76525-x) Philpin, John & Donnelly, John (1994). Beyond murder: The inside account of the Gainesville murders. NY: Penguin (ISBN 0-451-40409-2) *Rolling, Danny, & London, Sondra (1996). The making of a serial killer: The real story of the Gainesville student murders in the killer's own words. Portland, OR: Feral House. (ISBN 0-922-91540-7) Ryzuk, Mary S. (1995).
The Gainesville Ripper: A summer's madness, five young victims – the investigation, the arrest, and the trial. St. Martins. (ISBN 0-312-95324-0) Rulloff, Edward Howard +Bailey, Richard W. (2004). Rogue scholar: The sinister life and celebrated death of Edward H. Rulloff. Ann Arbor, MI: University of Michigan Press +Butz, Stephen D. (2007). Shall the murderer go unpunished: The life of Edward H. Rulloff. Utica, NY: North Country Books (ISBN 978-1-59531-015-6). +Freeman, E. H. (2012). Edward H. Rulloff: The veil of secrecy removed. CreateSpace (ISBN 978-1461142751) Ross, Michael Ramsland, Katherine (2012). The ivy league killer. RosettaBooks (available on Kindle) Roth, Randy* *Smith, Carlton (1993). Fatal charm: The shocking true story of serial wife killer Randy Roth. NY: Onyx Books. (ISBN 0-451-40416-5) Russell, George* *Olsen, Jack (1995). Charmer: The true story of a ladies' man and his victims. NY: Avon Books. (ISBN 0-380-71601-1) Sapp, William K.* *Rothgeb, Carol J. (2011). Hometown killer. New York: Pinnacle (ISBN 978-0786026883). Schaefer, Gerard Schaefer, G. J. (1990). Killer fiction: Tales of an accused serial killer. Atlanta: Media Queen Ltd. Schmid, Charles* *Gilmore, John (1996). The saga of Charles Schmid, the notorious Pied Piper of Tucson. Portland, OR: Feral House (ISBN 0-922-91531-8) *Moser, Don & Cohen, Jerry (1967). The Pied Piper of Tucson. NY: New American Library. (LCCN 67-29731) Seda, Heriberto "Eddie" Crowley, Kieran (1997). Sleep my little dead: The true story of the Zodiac Killer. St. Martins (ISBN 0-312-96339-4) Graysmith, Robert (1996). Zodiac. Berkley Publications Group (ISBN 0-425-09808-7) +Sellers, Sean Dawkins, Vickie & Higgins, Nina (1989). Devil's child. NY: St. Martin's. (ISBN 0-312-91533-0) Sells, Tommy Lynn* *Fanning, Diane (2003). Through the window: The terrifying true story of cross-country killer Tommy Lynn Sells. New York: St. Martin's (ISBN 0-312-98525-8) Rivers, Tori (2008). 13 ½: Twelve jurors, one judge and a half-assed chance: A serial killer in his own words. Riverbend Press (ISBN-13 978-0980080209). Shaw, Sebastian* *Scott, Robert (2009). Lust to kill. Shawcross, Arthur (Genesee River Killer) *Norris, Joel (1992). Arthur Shawcross: The Genesee River killer. NY: Pinnacle Books (ISBN 1-558-17578-4) *Olsen, Jack (1993). The misbegotten son: A serial killer and his victims: The true story of Arthur Shawcross. Bantam Books. (ISBN 0-440-21646-X) Sharif, Abdul Latif Whitechapel, Simon (2000). Crossing to kill: The true story of the Serial-Killer Playground. NY: Virgin. (ISBN 0-753-50496-0) Shipman, Harold, Dr.* Clarkson, Wensley (2005). Evil beyond belief: How and why Dr. Harold Shipman murdered 357 people. NY: John Blake (ISBN 1-904-03446-2) *Clarkson, Wensley (2001). The good doctor. NY: St. Martin's. (ISBN 0-312-98260-7) Sitford, Mikaela (2000). Addicted to murder. NY: Virgin Publishing. (ISBN 0-753-50445-6) Sitford, Mikaela (2006). Serial killer file: The doctor of death investigation. NY: Bearport Publishing. (ISBN 1-597-16551-4) Whittle, Brian, & Ritchie, Jean. Prescription for murder. NY: Warner Books Shore, Anthony Allen* *Mitchell, Corey (2007). Strangler. NY: Kensington Publishing. (ISBN 9780786018505) Siegel, Benjamin Jennings, Dean (1992). We only kill each other: the life and times of Bugsy Siegel. London: Penguin. (ISBN 0-140-17290-4) DeMexico, N. R. (1951). Vice over America. NY: Designs. (OCLC 21634865) Reid, Ed (1972). The mistress and the mafia: the Virginia Hill story. NY: Bantam Books. (OCLC 21575533) Carpozi, George (1973).
Bugsy; the high-rolling bullet-riddled story of Benjamin “Bugsy” Siegel. (OCLC 19499474) Carpozi, George (1976). Bugsy: the godfather of Las Vegas. London: Everest. (ISBN 0-905-01809-5) Hanna, David (1974). Bugsy Seigel: the man who invented Murder Inc. NY: Belmont Tower Books. (OCLC 11415601) Silveria, Robert* Palmini, W. G., & Chalupa, T. (2004). Murder on the rails: The true story of the detective who unlocked the shocking secrets of the Boxcar Serial Killer. NY: Horizon Press (ISBN 0-882- 82243-8) Sims, Paula* Becker, Audrey (1993). Dying dreams: The secrets of Paula Sims. NY: Pocket Books (ISBN 0-671- 73232-3) Weber, Don & Bosworth, Charles (1991). Precious victims. NY: Signet. (ISBN 0-451-17184-5) Smith, Lemuel* *Foley, Denis (2004). Lemuel Smith and the compulsion to kill: The forensic story of a multiple personality serial killer. New Leitrim House Publishing. (ISBN 0-972-238301) Smithers, Samuel* *Rosen, Fred (2000). Deacon of death. New York: Pinnacle. Sobhraj, Charles *Thompson, Thomas (2000). Serpentine. NY: Carroll & Graf (ISBN 0-786-70749-6) Sowell, Anthony Miller, Steve (2012) Nobody's women: The crimes and victims of Anthony Sowell, the Cleveland Serial Killer NY: Penguin/Berkley (ISBN 0425250512) Spahalski, Robert* *Benson, Michael (2010). Killer twins. NY: Pinnacle. (ISBN-13 978-0786022052). Speck, Richard Altman, Jack & Ziproryn, Marvin (1967). Born to raise hell: The untold story of Richard Speck. NY: Grove Press. Altman, Jack (1984). Speck: The untold story of a mass murderer. (ISBN 0-873-19025-4) Breo, Dennis, & Martin, William (1993). The case of the century: Richard Speck and the murder of eight nurses. NY: Bantam (ISBN 0-553-56025-5) Spencer, Timothy *Mones, Paul (1995). Stalking justice. NY: Pocket Books (ISBN 0-671-70348-x) Spilotro, Anthony Roemer, William F. (1995). The enforcer: Spilotro, the Chicago mob’s man over Las Vegas. NY: Ivy Books. (ISBN 0-804-11310-6) Stanko, Stephen* *Benson, Michael (2011). Watch mommy die. New York: Pinnacle. (ISBN 978-0786024995) Stano, Gerald Eugene* *Flowers, Anna (1993). Blind fury: The shocking true story of Eugene Stano. NY: Pinnacle Books (ISBN 0-786-00662-5). Kelly, Kathy & Montane, Dianna (2011). I would find a girl walking. ISBN-13: 978-0425231869 Starkweather, Charles Allen, William (1967). Starkweather: The story of a mass murderer. Boston: Houghton Mifflin. (ISBN 0-395-24077-8) Newton, Michael (1998). Waste land. NY: Pocket Books (ISBN 0-671-00198-1) O’Donnell, Jeff (1993). Starkweather: A story of mass murder on the great plains. Lincoln, NE: J&L Lee Publishers (ISBN 0-934-90431-6) Reinhardt, Jim (1960). Murderous trail of Charles Starkweather. Springfield, IL: Charles C. Thomas (ISBN 0-398-01565-1) Sargeant, Jack (1996). Born bad: The story of Charlie Starkweather and Caril Ann Fugate. (ISBN 1- 871-59262-3) Starrett, Danny* *Naifeh, Steven & Smith, Gregory W. (1996). A stranger in the family: A true story of murder, madness, and unconventional love. NY: Penguin Books. (ISBN 0-451-40622-2) Stewart, Raymond Lee *Kelly, Greg (1999). Killer on the loose: The true story of serial killer Raymond Lee Stewart. Paperboy Press. (ISBN 0-966-84440-8) Strayner, Cary* McDougal, Dennis. (2000). The Yosemite murders. NY: Ballantine Books. (0-312-98201-1) *Smith, Carlton (1999). Murder at Yosemite. NY: St. Martins. (ISBN 0-312-97457-4) Stutzman, Eli Olson, Gregg (2002). Abandoned prayers. NY; St. Martins. (ISBN 0-312-98201-1) Suff, William Lester (Riverside Prostitute Killer)* *Keers, Christine & St. Pierre, Dennis (1996). 
The Riverside killer. NY: Pinnacle Books (ISBN 0-786- 00345-6) Lane, Brian Alan & Suff, Bill (1997). Cat and mouse: Mind games with a serial killer . Dove Books (ISBN 0-787-10860-x) Sutcliffe, Peter Beattie, John (1981). The Yorkshire Ripper. London: Quartet/Daily Star. (ISBN 0-140-09614-0) Burn, Gordon (1990). Somebody’s husband, somebody’s son. NY: Penguin. (ISBN 0-434-09827-2) Cross, Roger. (1981). The Yorkshire Ripper. London: Granada. (ISBN 0-586-05526-6) Jones, Barbara (2002). Evil beyond belief. London: Blake Publishing (ISBN 1-903-40298-0) Jouve, Nicole (1986). The street cleaner: The Yorkshire Ripper case on trial. London: Marion Boyars (ISBN 0-714-52847-1). Yallop, David (1982). Deliver us from evil. NY: Coward, McCann. (ISBN 0-698-11113-3) Swango, Michael *Stewart, James (2000). Blind eye: the terrifying story of a doctor who got away with murder. NY: Touchstone Books (ISBN 0-684-86563-7) Sweeney, Dennis Harris, David. (1982) Dreams die hard. NY: St. Martin’s Press. Taylor, Gary Imbrie, A. (1993). Spoken in darkness. New York: Plume. +Tinning, Mary Beth +Eggington, Joyce (1989). From cradle to grave. NY: William Morrow. (ISBN 0-68807-566-5) Tison, Gary *Clarke, James (1988). Last rampage. NY: Houghton Mifflin. (ISBN 0-39546-721-7) Toppan, Jane* Schechter, Harold (2003). Fatal: The poisonous life of a female serial killer. New York: Pocket Star (ISBN 0-671-01450-1) Unterweger, Jack* *Leake, John (2009). Entering Hades: The double life of a serial killer. NY: Berkley (ISBN-13 978- 0425228012) Vacher, Joseph Starr, Douglas (2010). The killer of little shepherds: A true crime story and the birth of forensic science. NY: Knopf (ISBN-13 978-0307266194). Wager, Terilynn Earle, Wilton (2001). Terilynn: Based on the true story of America’s youngest serial killer. (ISBN 0- 963-24221-0) Wallace, Henry Louis Burlington, Thomas A. (1994). Bodies of evidence: the Wallace investigation. (OCLC 31498934) Wardrip, Faryion* *Springer, Patricia. (2001). Body hunter. NY: Pinnacle Books (ISBN 0-786-01264-1) *Stowers, Carlton. (2004). Scream at the sky. NY: St. Martin’s (ISBN 0-312-99819-8) Warren, Lesley Eugene* *Bellini, Jon (2002). The babyface killer. New York: Pinnacle Books (ISBN 0-786-01202-1) Watkins, Paul Watkins, Paul & Soledad, Guillermo (1979). My life with Charles Manson. NY: Bantam. (ISBN 0- 553-12788-8) Watts, Coral Eugene* *Mitchell, Corey (2006). Evil eyes. NY: Pinnacle (ISBN 0-7860-1676-0). West, Fred and Rose Burn, Gordon (1999). Happy like murderers. London: Faber & Faber. (ISBN 0-571-19757-4) *Masters, Brian. (1996). She must have known: The trial of Rosemary West. London: Transworld Publishers. Sounes, Howard (1995) Fred and Rose: The full story of Fred and Rose West and the Gloucester house of horrors. (ISBN 0-751-51322-9) Wansell, Geoffrey. (1996). An evil love: The life of Frederick West. London: Headline Book Publishing. West, Anne (1996). Out of the shadows. London, Pocket Books (0-671-85516-6) *Wilson, Colin. (1998). The corpse garden: The crimes of Fred and Rose West. London: True Crime Library. (ISBN 1-874-35824-9) Woodrow, Jane Carter (2012). Rose: A portrait of the serial killer as a young girl. Whitman, Charles Lavergne, G. M. (1997). A sniper in the tower: The true story of the Texas tower massacre. NY: Bantam Publishing. (ISBN 1-574-41029-6) Wilder, Christopher *Gibney, Bruce (1990). The beauty queen killer. NY: Pinnacle. (ISBN 1-558-17345-5) Williams, Charlene *Van Hoffman, Eric (1999). A venom in the blood. NY: Pinnacle. 
work_nkxqva5efrcg7litapywmuo6fm ----
Duplicate detection algorithms of bibliographic descriptions
Anestis Sitas
School of Philosophy, Aristotle University of Thessaloniki, and School of Library Science, Technological Institute of Thessaloniki, Thessaloniki, Greece, and
Sarantos Kapidakis
Archive and Library Sciences Department, Ionian University, Paleo Anaktoro, Greece
Abstract
Purpose – The purpose of this paper is to focus on the duplicate record detection algorithms used in bibliographic databases.
Design/methodology/approach – Individual algorithms, their application process for duplicate detection and their results are described based on the available literature (published articles), information found at various library web sites, and follow-up e-mail communications.
Findings – Algorithms are categorized according to their application as a process of a single step or of two consecutive steps. The results of deletion, merging, and temporary or virtual consolidation of duplicate records are studied.
Originality/value – The paper presents an overview of duplicate detection algorithms and an up-to-date account of their application in different library systems.
Keywords Cataloguing, Algorithms, Bibliographic systems, Records management
Paper type Research paper

Introduction
The ideal setup for a library catalogue would be to register a unique bibliographic record for each bibliographic entity. However, bibliographic databases include several types of duplicate records. Even if the search cues are clearly specified, locating the correct entry is still an issue that requires further investigation as new materials are added in a variety of media. Duplicate records slow down the indexing process, significantly increase the cost of saving and managing data, and delay retrieval. As a result, duplicate records constitute a system deficiency and compromise quality control for all parties involved, namely users, catalogers, and technical staff. Shared cataloging further aggravates the problem as, through the automated systems, each library-member of one system can access the other members' records. Administrators have to improve the quality of the bibliographic database and keep it functional and "clean".

Duplicate records
In the environment of bibliographic databases, a duplicate record could be defined as two or more records which stand for or describe the same document (defined as any information resource). Duplicate records can cause problems in the following areas:
- User information overload. Because a larger number of documents is recalled, the user is presented with more information than he or she can actually handle.
- Reduced system efficiency. The actual number of records in the database is increased, which complicates indexing. This also hinders searching and cataloging decision-making, and affects end-user satisfaction.
- Low cataloging productivity. Identifying duplicate records and cleaning the database requires valuable time from catalogers, which could be spent on other essential tasks.
- Increased cost of database maintenance. More time spent on database maintenance results in increased cost.
Possible reasons for the existence of duplicate records include novice searchers, unsuccessful searches, and the wish to enter a "perfect" record (Wanninger, 1982). Additional factors for record duplication include:
- local practices and policies of cataloging;
- cataloging inconsistencies;
- careless record entry; and
- errors in the syntax of the MARC format.

Record matching algorithms
The existence of duplicate records constitutes a problem which is becoming increasingly alarming in networked environments, as the size of individual databases increases and new cooperative networks or consortia are created. In order to reduce the number of duplicate records, new software is developed using special detection algorithms. Record matching algorithms are programs used to maintain the integrity of bibliographic databases. It would be quite easy to create a process that matches two identical bibliographic descriptions, but it is not as easy to match similar records (Hunstad, 1988).
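The contrast can be made concrete with a short sketch. The records, field names and normalization choices below are invented for illustration and are not taken from any system discussed in this paper; the point is only that exact string comparison finds identical descriptions, while even light normalization is needed before similar records can be matched.

```python
import re

def normalize(value):
    """Lowercase, strip punctuation and collapse spacing, so that
    trivial cataloging variations no longer block a match."""
    value = re.sub(r"[^\w\s]", " ", value.lower())
    return re.sub(r"\s+", " ", value).strip()

record_a = {"title": "The Dublin Core:  a review"}
record_b = {"title": "The Dublin core - A Review"}

# Exact comparison fails on case, punctuation and spacing differences.
print(record_a["title"] == record_b["title"])                        # False
# After normalization the two descriptions match.
print(normalize(record_a["title"]) == normalize(record_b["title"]))  # True
```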
Developing a detection and deduplication process
Designing the process of detection and deduplication of records within a bibliographic database should take the following into consideration:
- Design goal. Specifying which types of documents will be represented in the records to be processed (articles, journals, etc.).
- Specification of duplicate records. A detailed definition of the term "duplicate record" based on the needs of the particular database.
- Application of the process. Specifying whether the process will be applied automatically, semi-automatically, or manually.

Creating a record-matching algorithm
In order to develop an effective algorithm, it is essential to define the application steps, the MARC fields to be used as matching keys, and the criteria for identifying and assessing record similarity/duplication.

Application steps
The algorithm can be applied as a one- or two-step comparison. A final step follows which deals with the management of the duplicate records. The single-step application of the algorithm is, in most cases, a compromise in order to achieve fast and inexpensive deduplication. In general, these algorithms are more general and have loosely defined criteria, resulting in a large number of duplicate records in need of further control. During the initial step of a two-step algorithm, a file of duplicate records is created based on a limited comparison of fields. Its principal aim is to minimize the number of comparisons during the second step and to reduce mismatches that could lead to the deletion of unique records. The second step verifies matches from the first step and then applies a detailed and accurate comparison to determine actual duplicates.

Selection of fields
In order for such an algorithm to be created, it is important to select fields which exhibit significant stability regardless of who created the record (a specific cataloger or bibliographic agency). Fields with less stable data offer a low probability of record matching (Meir and Lazinger, 1998). Although deduplication based on a control number (ISBN, etc.) is the best method of detection, it does not always ensure full detection. Other data serving as sources for detection include author, title, publisher, pagination, place and year of publication (Coyle, 1992).

Matching keys
The algorithms for detection of duplicate records use matching keys, which are strings constructed from a pre-selected field or a combination of fields. A field can be used as a key in part (e.g. ISBN) or in whole (e.g. title proper). Moreover, a combination of fields or a combination of field parts can also be used. Before these keys are created, the data are processed for normalization of spacing, punctuation, special fonts or characters, and capitalization. In addition, a variety of techniques are used to accommodate field content differences such as spelling errors, missing data, and small variations of words. These techniques include truncation, keywording, Harrison Keys, Hamming distance, USBC, and others (Toney, 1992).
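As an illustration of key construction, the sketch below builds a simple composite key from truncated, normalized field values. The choice of fields and truncation lengths is an invented example, not the key of any particular system described in this paper.

```python
import re

def normalize(value):
    """Prepare field content for key building: lowercase, drop
    punctuation, special characters and all spacing."""
    return re.sub(r"[^a-z0-9]", "", value.lower())

def matching_key(record):
    """Concatenate truncated fragments of selected fields: here the
    first four characters of the author, the first seven of the
    title proper, and the publication year."""
    return (normalize(record.get("author", ""))[:4]
            + normalize(record.get("title", ""))[:7]
            + record.get("year", ""))

record = {"author": "Coyle, Karen",
          "title": "Rules for Merging Records", "year": "1992"}
print(matching_key(record))  # 'coylrulesfo1992'
```

Because the key is built from normalized fragments, records that differ only in punctuation, casing or trailing words still collapse onto the same key.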
Matching evaluation
Two methods are used to evaluate the matching of duplicate records:
(1) Field comparison. This is based on binary comparisons of selected fields, that is, whether fields appear to be the same or not. The software uses YES/NO indications. When the entire field is used, the comparison is safer but the process is time-consuming. This method is very strict and complicates the detection of records that have variations in cataloging or data entry errors (O'Neill and Oskins, 1990).
(2) Weight assigning. This method concerns the matching of strings and estimates similarity by assigning weights/values which do not reflect the bibliographic significance of the data, but their usefulness in the recognition of similar records (Coyle, 1992). The matching algorithm allows the merging or deletion of entries only if the assigned weight reaches a pre-determined value, a threshold. This method is open to the existence of minor differences in field content, spelling errors, completeness or missing data, and variations in cataloging practice (Coyle and Gallaher-Brown, 1985).
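A minimal sketch of the weighting approach follows. Each field that agrees contributes its weight to a total score, and a pair is accepted as a duplicate only when the total reaches the threshold. The weights, threshold and record data below are invented for the example; a real system would tune these values.

```python
# Invented example weights and threshold; real systems tune these values.
FIELD_WEIGHTS = {"isbn": 40, "title": 25, "author": 20, "year": 10, "pages": 5}
THRESHOLD = 60

def match_score(rec_a, rec_b):
    """Sum the weights of all fields whose (already normalized)
    values are present in both records and agree."""
    return sum(weight for field, weight in FIELD_WEIGHTS.items()
               if rec_a.get(field) and rec_a.get(field) == rec_b.get(field))

a = {"isbn": "0312974523", "title": "duplicate detection",
     "author": "sitas", "year": "2008", "pages": "15"}
b = {"title": "duplicate detection", "author": "sitas",   # ISBN missing
     "year": "2008", "pages": "15"}

score = match_score(a, b)
print(score, score >= THRESHOLD)   # 60 True: the missing ISBN is tolerated
```

Because the decision rests on the total rather than on any single field, an omission or error in one field does not necessarily prevent a match.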
The algorithms that will be presented further on concern bibliographic records of monographs, except than the one by Oak Ridge National Laboratory which addressed journal articles. Algorithms for ALEPH-ULM, MDBUPD and IUCS are also applied to other types of documents (microforms, maps, etc.) while the one for MELVYL handles journal articles apart from monographs and journals. Finally, the Union Catalog of Greek Academic Libraries algorithm manages all sorts of materials except journals. As far as the state of their use is concerned, four out of ten (40 percent) are of the research type (Oak Ridge National Laboratory, MDBUPD, IUCS and Hickey and Rypka). Half of them (50 percent), including ILCSO, DDR, OPAC, MELVYL and Union Catalog of Greek Academic Libraries, continue to be in use even today. One algorithm was applied to the ALEPHs ULM catalog but its application ended in 1998. Processes of application and evaluation Apart from the type of materials they are applied to, these algorithms can also be distinguished according to the following characteristics: . Application. This refers to the number of stages of applications as either one- or two – step processing. Three algorithms (30 percent) are applied in one step (ALEPH-ULM, ILCSO, and Union Catalog of Greek Academic Libraries). The remaining seven algorithms (70 percent) follow the practice of two-step process. . Evaluation. This refers to the methods of comparison used to assess whether two or more bibliographic records are identical. These methods include either a comparison between fields or the assignment of weights. Of the algorithms presented in this paper, 40 percent use the method of the field comparison (ALEPH-ULM, MDBUPD, IUCS, and Union Catalog of Greek Academic Libraries). The remaining 60 percent, assign points/values for weights. Document type Status Monographs Journals Other Prototype Inactive Active ALEPH-ULM U U U U ILCSO U U U Greek Union Catalog U U U OAK Articles U MDBUPD U U U IUCS U U U OCLC (Hickey and Rypka) U U DDR U U U COPAC U U U MELVYL U U Articles U Note: U ¼ Yes Table I. Document type and status Duplicate detection algorithms 291 Table II presents each algorithm and their respective application method, whether the application is done during the process of searching or retrieval (on the fly), and their evaluation method. Final handling and algorithm running Furthermore, we can distinguish algorithms according to the final handling of the detected duplicate records (deletion or merging), as well as whether this process is done online or offline. Final handling information for each algorithm is presented in Table III. Final handling This refers to the final stage of the process of detecting duplicate records. Three programs (ILCSO, MDBUPD, IUCS), 30 percent, delete duplicate records. The MDBUPD and IUCS algorithms end up deleting the spare ones and retaining just one record, while ILCSO selects and retains the most suitable one. In total, five of them, 50 percent, including ALEPH-ULM, Union Catalog of Greek Academic Libraries, DDR, COPAC, and MELVYL, merge duplicate records in one integral record. COPAC merges Application Evaluation Steps On the fly Field comparison Weights ALEPH-ULM 1 U ILCSO 1 U Greek Union Catalog 1 U OAK 2 U MDBUPD 2 U IUCS 2 U OCLC (Hickey and Rypka) 2 U DDR 2 U COPAC 2 U U MELVYL 2 U U Note: U ¼ Yes Table II. 
Results of duplicate detection algorithms
In every effort of duplicate record detection, the matching process may bring about the following results:
- Exact matches. Records which are absolutely identical.
- Partial matches. Only some parts of the records are duplicated.
- Mismatches (false matches). Although indicated as duplicates, the records do not represent the same document.
- Missed/undetected matches. Existing duplicate records that are not detected by the algorithm.
Mismatches are considered a more important problem than missed matches, since deletion causes a permanent loss of information. To avoid this problem, the algorithm should be loose enough to gather records with a degree of variation while avoiding the deletion of bibliographic information; on the other hand, it should be tight enough to restrict the accumulation of a large number of possible duplicate records without letting genuine duplicate records escape (Meir and Lazinger, 1998).

Algorithm categorization
Types of material and status
This paper describes ten algorithms. Table I presents these algorithms based on the type of records they are designed to detect; that is, whether they refer to the detection of duplicate bibliographic records of monographs, serials/journals, journal articles, or other types of material. In addition, the current status of each algorithm is noted. Their status may be defined as:
- Prototype systems. Applied in a lab environment.
- Inactive. While they were once applied in a real environment, their application has now been abandoned.
- Active. Algorithms that are still applied.
Table I. Document type and status
Algorithm | Monographs | Journals | Other | Prototype | Inactive | Active
ALEPH-ULM | U | U | U | | U |
ILCSO | U | U | | | | U
Greek Union Catalog | U | | U | | | U
OAK | | | Articles | U | |
MDBUPD | U | | U | U | |
IUCS | U | | U | U | |
OCLC (Hickey and Rypka) | U | | | U | |
DDR | U | U | | | | U
COPAC | U | U | | | | U
MELVYL | U | U | Articles | | | U
Note: U = Yes
The algorithms presented below concern bibliographic records of monographs, except the one from Oak Ridge National Laboratory, which addressed journal articles. The algorithms for ALEPH-ULM, MDBUPD and IUCS are also applied to other types of documents (microforms, maps, etc.), while the one for MELVYL handles journal articles in addition to monographs and journals. Finally, the Union Catalog of Greek Academic Libraries algorithm manages all sorts of materials except journals. As far as the state of their use is concerned, four out of ten (40 percent) are of the research type (Oak Ridge National Laboratory, MDBUPD, IUCS, and Hickey and Rypka). Half of them (50 percent), including ILCSO, DDR, COPAC, MELVYL and the Union Catalog of Greek Academic Libraries, continue to be in use even today. One algorithm was applied to ALEPH's ULM catalog, but its application ended in 1998.

Processes of application and evaluation
Apart from the type of materials they are applied to, these algorithms can also be distinguished according to the following characteristics:
- Application. This refers to the number of stages of application, as either one- or two-step processing. Three algorithms (30 percent) are applied in one step (ALEPH-ULM, ILCSO, and the Union Catalog of Greek Academic Libraries). The remaining seven algorithms (70 percent) follow the two-step process.
- Evaluation. This refers to the methods of comparison used to assess whether two or more bibliographic records are identical. These methods include either a comparison between fields or the assignment of weights. Of the algorithms presented in this paper, 40 percent use the method of field comparison (ALEPH-ULM, MDBUPD, IUCS, and the Union Catalog of Greek Academic Libraries). The remaining 60 percent assign points/values as weights.
Table II presents each algorithm and its respective application method, whether the application is done during the process of searching or retrieval (on the fly), and its evaluation method.
Table II. Algorithm application and evaluation methods
Algorithm | Steps | On the fly | Field comparison | Weights
ALEPH-ULM | 1 | | U |
ILCSO | 1 | | | U
Greek Union Catalog | 1 | | U |
OAK | 2 | | | U
MDBUPD | 2 | | U |
IUCS | 2 | | U |
OCLC (Hickey and Rypka) | 2 | | | U
DDR | 2 | | | U
COPAC | 2 | U | | U
MELVYL | 2 | U | | U
Note: U = Yes

Final handling and algorithm running
Furthermore, we can distinguish algorithms according to the final handling of the detected duplicate records (deletion or merging), as well as whether this process is done online or offline. Final handling information for each algorithm is presented in Table III.
Table III. Final handling and time of algorithm running
Algorithm | Deletion | Merging | Offline | Online
ALEPH-ULM | | U | U |
ILCSO | U | | U |
Greek Union Catalog | | U | U |
OAK | * | * | U |
MDBUPD | U | | U |
IUCS | U | | U |
OCLC (Hickey and Rypka) | * | * | U | U
DDR | | U | U | U
COPAC | | U | U | U
MELVYL | | U | U |
Notes: U = Yes; * = not available

Final handling
This refers to the final stage of the process of detecting duplicate records. Three programs (ILCSO, MDBUPD, IUCS), or 30 percent, delete duplicate records. The MDBUPD and IUCS algorithms end up deleting the spare records and retaining just one, while ILCSO selects and retains the most suitable one. In total, five of them (50 percent), including ALEPH-ULM, the Union Catalog of Greek Academic Libraries, DDR, COPAC, and MELVYL, merge duplicate records into one integral record. COPAC merges the records in two of its three segments (the first segment includes only the British Library records, and each of the other two segments includes approximately 50 percent of the other catalog records). Among the three segments, however, there is no physical merging, but merged records can be presented to users in real time during a search. MELVYL's practice does not lead to the physical merging of duplicate records, but to the online presentation of merged records during the recall phase. For two out of ten algorithms (Oak Ridge National Laboratory, Hickey and Rypka), or 20 percent, there is no information available.

Application time
This refers either to the offline or the online process. All algorithms "run" offline. Only three of them (30 percent) can apply online procedures as well. The Hickey and Rypka algorithm was designed to run both ways; DDR was also designed to be applied both ways, but the offline procedure is preferred. Finally, in COPAC part of the procedure is applied offline and part of it online. The term "online" is used here to refer to real-time running.

Fields used for the creation of keys (monographs)
Another significant characteristic of the algorithms is the set of MARC fields used for the creation of comparison keys. As we can see in Figure 1, the majority of algorithms (nine out of ten, 90 percent) use author, title and publication year for key creation. In addition, the algorithms use the following fields in key creation: 70 percent of algorithms use pagination, 60 percent use ISBN, 50 percent use LCCN and/or publisher, 40 percent use the edition statement, 30 percent use place of publication and/or series, 20 percent use fields such as reproduction code, country of publication, government document number and ISSN, and finally 10 percent use fields such as document type, language of the document, CODEN, control number, cataloging source, statement of responsibility, volume/part, and dimensions.
Figure 1. MARC field use for key creation (monographs)
Table IV presents detailed information on all fields that are used for duplicate bibliographic record detection.
Table IV. Fields used in the creation of keys (monographs), by algorithm
One step algorithms ALEPH-ULM ALEPH is the network of the research libraries of Israel, which maintained the Union List of Monographs. The entries were loaded with the use of their detection and merging algorithm. It was based on the comparison of a stable number of not frequently met letters that came from four fields: author (five characters), title proper (seven characters), publication date and language (Lazinger, 1994). In a 1996 research study examined the efficiency of the algorithm when applied to monographs. It was reported that it yielded 0 percent mismatches for records describing Hebrew materials and 1.4 percent for English but it failed to detect existing duplicates in for 17.4 percent of English and 34 percent of Hebrew records (Meir and Lazinger, 1998). ULM, now named Union List of Israel, decided that their algorithm did not satisfy their demands and in 1998 stopped all deduplication efforts. Illinois Library Computer Systems Organization For duplicate record detection, the system uses indices of the following control numbers: OCLC, LCCN, ISBN, ISSN, and publisher number. When the data of these indices overlap, they are given specific values. Then, further actions, based on the sum LHT 26,2 294 F ie ld s A L E P H -U L M IL C S O G re ek U n io n O A K M D B U P D IU C S H ic k ey a n d R y p k a D D R C O P A C M E L V Y L D o cu m en t ty p e U R ep ro d u ct io n co d e U U C o u n tr y o f p u b li ca ti o n U U L a n g u a g e U L C C N U U U U U IS B N U U U U U U IS S N U U C O D E N U C o n tr o l n u m b er U C a ta lo g in g so u rc e U G o v er n m en t d o cu m en t U U A u th o r U U U U U U U U U T it le U U U U U U U U U S ta te m en t o f re sp o n si b il it y U V o lu m e/ p a rt U E d it io n U U U U P la ce o f p u b li ca ti o n U U U P u b li sh er U U U U U P u b li ca ti o n d a te U U U U U U U U U P a g in a ti o n U U U U U U U D im en si o n U S er ie s U U U N o te : U ¼ Y es Table IV. Fields used in the creation of keys (monographs) Duplicate detection algorithms 295 of the weights, are determined. This is an offline process. The following are the values recommended for the bulk import of records (ILCSO, 2004) (see Table VI). Once the comparison is done and the matching shows that two bibliographic records represent the same document, they are evaluated so that the most suitable is selected to remain in the database while the other will be deleted. For each field used for the matching process, there is a corresponding field weight to help decide which record will remain. The fields used for matching include: cataloging source, encoding level, agency that has modified the original record and bibliographic level of the record. In dubious cases the final decision is taken by comparing the records manually (ILCSO, 2004). Union Catalog of Greek Academic Libraries Use of this algorithm started in April of 2005. At the time of import, records are checked for duplicate detection and merging. Imported records are created in a variety of software and therefore records have differences in format, the number of letters, the holdings of existing records, etc. After loading these records are processed so there are no such variations. To accommodate this, the key is formed by taking data from the fields further down: Title, Author, Edition statement, Publication date and ISBN (Vougiouklis, 2007). Questionable duplicate records are kept in a work to be examined manually. 
Based on the algorithm evaluation, it was estimated that 44.95 percent of actual duplicate records were detected. Among the detected problems, 17.8 percent were mainly due to the applied key, while 12.47 percent were due to the policy issues, 7.05 percent represented cataloging problems, and 17.62 percent referred to other kinds of problems. Effectiveness Mismatches Missed matches % % % ALEPH-ULM * 0-1.5 17.4-34 Greek Union Catalog 44.95 * * IUCS 56.58-99.62 0.54 * OCLC (Hickey and Rypka 54-69 1.3 * Note: * Not available Table V. Algorithm efficiency Duplicate replace ¼ 100 Duplicate warn ¼ 30 Indexes and weights 035O ¼ 100 010A ¼ 20 020A ¼ 25 022A ¼ 15 028A ¼ 10 Table VI. Recommended values for bulk import of records LHT 26,2 296 Two step algorithms Oak Ridge National Laboratory In 1976, Oak Ridge National Laboratory created an algorithm aiming at detecting duplicate records of cited journals articles. It was used offline and it produced fixed length keys (Hickey and Rypka, 1979). Publication date, initial page number, journal CODEN, volume number, and samplings from the author, journal title, and article title elements were used for record matching. For duplicate record detection the keys were sorted in many and various fields. When fields matched perfectly, a weighted matching of the remaining fields was used. The algorithm was completed with a page/year and author/title sorting. Online Computer Library Center (OCLC): MDBUPD This program was created by OCLC shortly after 1976; it was named Master Data Base Update (MDBUPD) and was used offline. This algorithm was designed as a two-step application (Wanninger, 1982). Initially, it searched the database using LCCN and keys produced by OCLC. These keys were derived from the name/title fields or just from the title field. Then, it checked additional fields for verification. These were: Publisher, Place of publication, Title, Date of publication, Pagination. Towards the end of this process, after the absolute matching of all compared fields, duplicate records were deleted. University of Illinois: IUCS IUCS (IRRL [Information and Retrieval Research Laboratory] Union Catalog System), was developed to detect non-monographic documents as well as maps, filmstrips, etc. (Williams and MacLaury, 1979). Once the data were normalized, they were processed by comparing fields and applied in two steps/passes. The first step involved the creation of a matching key. The “title-year” keys were sorted and the keys of the documents that were identical were later recalled and compared in the second step (Hickey and Rypka, 1979). For the second step, a number of detailed matching processes were applied so that the first estimation was either verified or rejected. A title mapping key different from that of the first step was used. The author names, titles and pagination of records that were recalled in the previous step as possibly duplicate ones were compared and it was then specified, which were ultimately duplicate ones. The efficiency of this algorithm ranged from 56.58 to 99.62 percent depending on the database which was being tried. Mismatches accounted for 0.54 percent of the total number of duplicate records (Cousins, 1998). When it was not possible to reject or accept records as duplicates, a non-automated comparison of records was used (Hickey and Rypka, 1979). Online Computer Library Center (OCLC) – Hickey and Rypka During 1978-1979 OCLC tried once again to develop a research program for detecting duplicate monographs. 
Online Computer Library Center (OCLC): Hickey and Rypka
During 1978-1979 OCLC tried once again to develop a research program for detecting duplicate monographs. This algorithm was developed by Hickey and Rypka and could be applied both online and offline. It was applied in two steps/sections (Hickey and Rypka, 1979):
(1) The first step, or exact-match section, aimed at the clustering of related keys in order to reduce the number of full key comparisons.
(2) In the second step, all other keys of selected fields that matched in part or in whole were applied. These keys were derived from the following fields of the bibliographic record: Reproduction code, Record type, Title (only the beginning), Publication date, Place of publication, Author, Pages, Publisher, and Hashed title. SuDoc number, ISBN, Edition statement, Series, and LCCN were incorporated only if present in the bibliographic records.
The algorithm was checked against a decision table to determine whether the keys were duplicates. This table specified 16 alternative ways in which two keys could be matched. The comparison of two keys could yield a result taking any one of three values: – (mismatch), P (partial match), or E (exact match). It was found that mismatches were 1.3 percent of the total records identified as duplicates (Hickey and Rypka, 1979). The algorithm located approximately 54-69 percent of duplicate records, depending on whether reprints were defined as duplicates or not.

Online Computer Library Center (OCLC): DDR
In 1990, OCLC created a new algorithm for duplicate record detection. It is applied to monographs and journals and consists of two steps. In the first step, the clustering algorithm groups possible duplicate records using a key consisting of eight characters, after the data have been normalized. Records sharing the same title key are then compared on seven further elements: LCCN, ISBN, Publication date, Pages, Author, Publisher, and Full title. Records with the same title key and an identical LCCN or ISBN, or with at least two of the other five elements identical, are considered possible duplicates (O'Neill and Oskins, 1990). In the second step, the evaluation algorithm is applied. This estimates the similarity between possible duplicate records. The similarity values range from "0.0" for entirely different records to "1.0" for absolutely identical records (O'Neill and Oskins, 1990). Elements are considered partial matches if their similarity is greater than 0.85. When no automated decision is possible, the records are flagged for non-automated control. Research showed that the recall of clustering is 96 percent and that 56 percent of the total duplicate records can eventually be detected (O'Neill and Oskins, 1990). This algorithm led to the creation of the DDR software, which is used to identify and merge duplicate records representing books and periodicals. Although it can also run online, OCLC has chosen to apply it as an offline procedure.
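A sketch in the spirit of this evaluation step: each element pair receives a similarity between 0.0 and 1.0 (computed here with Python's difflib, an arbitrary choice), and elements above 0.85 count as partial matches. The averaging rule and the record data are invented for illustration and are not DDR's actual formula.

```python
from difflib import SequenceMatcher

ELEMENTS = ("title", "author", "publisher", "pages", "year")

def similarity(a, b):
    """Similarity of two element values: 0.0 (different) .. 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()

def evaluate(rec_a, rec_b, partial=0.85):
    """Score each shared element; report partial matches and the average."""
    scores = {}
    for e in ELEMENTS:
        if rec_a.get(e) and rec_b.get(e):       # compare only shared elements
            scores[e] = similarity(rec_a[e], rec_b[e])
    partials = [e for e, s in scores.items() if s > partial]
    return sum(scores.values()) / len(scores), partials

a = {"title": "duplicate records in catalogs", "author": "oneill edward",
     "year": "1990"}
b = {"title": "duplicate records in catalogs", "author": "oneill e",
     "year": "1990"}
overall, partials = evaluate(a, b)
print(round(overall, 2), partials)   # ~0.92, ['title', 'year']
```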
Consortium of University Research Libraries (CURL): COPAC
COPAC, the union catalog of the members of CURL, has been in use since 1996. The process of duplicate record detection follows two distinct practices. The first deals with the detection of duplicate records applied to only one part of the database (the second practice is described in the "Detection and merging on the fly" section below). The process takes place offline, with the aim of merging duplicate records, and is applied in two steps/stages.
Step one: each imported record is compared to every record in the database. To achieve this, two methods are used (Cousins, 1998):
(1) Matching ISBN/ISSN: clusters of matching records are located based on ISBN. After the text is normalized, matching fields are assigned weights/values. In the end, the values of all fields are added up. If the total assigned weight is equal to or greater than 13, the record is identified for merging. If the record has an edition statement, matching of this field is also necessary. In the same way, checks for series volumes and multi-volume works take place (Cousins, 1998).
(2) Matching of author/title acronym: records without ISBN or ISSN, and records with ISBN/ISSN which fail to find a similar record, are re-examined with the use of an author/title acronym (four letters of the author and four of the title) and the publication year. Possible matches are promoted to the next step. At this point no weighting is determined, and for each field matching is a simple YES/NO. Matching based on acronyms introduces the matching of two new fields: publisher and total number of pages (Cousins, 1998).
Step two: in order to verify possible matching records, a number of detailed matchings take place. The fields used in this process are: ISBN, ISSN, Publication date, Title, Author, Edition statement, Series, Pagination, and Publisher.
COPAC still continues to apply the process described above, but part of its processing is done while end users search. This part of the process is presented in the following section.

Detection and merging on the fly
All the algorithm applications for duplicate record detection described so far aim primarily at the deduplication or merging of duplicate records. Another practice is the application of the program on the fly. Detection and merging of duplicate records is done during the search or retrieval of records and does not lead to their physical merging, but only to a temporary or "virtual" merging for the purpose of presentation to end users. Two programs that apply this method are described below.

COPAC: detection and merging of records upon search
The majority of the duplicate record detection and merging process continues to take place offline, as described previously. A process of three sets of data loading is applied, which leads to the creation of three segments in the database (Cousins, 2006):
- One set is the data from the British Library. These records are not consolidated.
- The other two data sets, each consisting of records from approximately half of the other COPAC libraries, have their records consolidated into a specific segment during the data loading, using the process described earlier.
There is no record consolidation between the three segments, which leads to the existence of duplicate records between them. To compensate for this problem, a check for duplicate records is performed as an on-the-fly process. When a user searches, results are checked for any possible duplicate records before they are displayed. When duplicate records are found, they are displayed to the user as just one record. This record includes all information from the other records that are included in the result set. This matching and consolidation process during loading time, combined with the matching performed during the search process, is a substantial compromise compared to actual detection and merging of duplicate records with large amounts of data.
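On-the-fly consolidation can be sketched as a post-processing step over a result set: duplicate hits are grouped and each group is shown as one combined record, while the stored records remain untouched. The grouping key below is a placeholder for the real matching algorithm.

```python
from collections import defaultdict

def display_key(rec):
    """Placeholder grouping key; a production system would reuse its
    full matching algorithm here."""
    return rec["title"].lower() + rec.get("year", "")

def consolidate(results):
    """Virtually merge duplicate hits, for display only."""
    groups = defaultdict(list)
    for rec in results:
        groups[display_key(rec)].append(rec)
    merged = []
    for group in groups.values():
        combined = {}
        for rec in group:   # copy each field from the first record supplying it
            for field, value in rec.items():
                combined.setdefault(field, value)
        merged.append(combined)
    return merged

hits = [{"title": "COPAC", "year": "1998", "library": "A"},
        {"title": "COPAC", "year": "1998", "library": "B"}]
print(consolidate(hits))   # one combined record instead of two hits
```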
MELVYL: detection and merging of entries upon retrieval
The network of the University of California libraries supports the entire system, which runs duplicate record merging on the fly. The records are not merged physically; they are merged and presented dynamically during the search process. Apart from book and journal records, the monographs algorithm is applied to in-analytics as well as to other non-print materials. This algorithm is applied when each new record is loaded into the MELVYL database, which is basically an offline process. Every time a new record is loaded, its possible identical records are located and the result is saved in an Oracle table. If a record matches a user's search criteria, the system automatically checks this table and the best record is recalled (Campbell, 2006). A two-step process is followed for the advancement of identical records to the final phase of merging. Initially, a pool of possible duplicate records is created. In the first step, there is a comparison of LCCN/ISBN, publication year, and the first twenty-five characters of the title. At this point a threshold weight is assigned; the threshold for merging monograph records is 875 points. If identification for merging is not achieved during the first step of comparisons, a second step of comparisons is performed based on data from the title, main entry (normalized), country of publication, pagination, and publisher.
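The load-time variant can be sketched with a plain dictionary standing in for the relational lookup table described above; the matching function is passed in as a parameter because the linking logic, not the matching itself, is the point here.

```python
duplicate_table = {}   # record id -> representative id shown to the user

def register(new_id, loaded_ids, is_duplicate):
    """At load time: link the new record to an already-loaded duplicate,
    so that no comparison work is needed at query time."""
    for other in loaded_ids:
        if is_duplicate(new_id, other):
            duplicate_table[new_id] = duplicate_table.get(other, other)
            return
    duplicate_table[new_id] = new_id       # starts its own cluster

def best_record(record_id):
    """At query time: a constant-time lookup replaces record matching."""
    return duplicate_table.get(record_id, record_id)

# Hypothetical usage: ids 'r1' and 'r2' stand for duplicate records.
register("r1", [], lambda a, b: False)
register("r2", ["r1"], lambda a, b: True)
print(best_record("r2"))   # 'r1'
```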
Conclusion
This paper examined the algorithms applied to eliminate the problems caused by the existence of duplicate bibliographic records in a database. When algorithms are applied in one step, a faster application is achieved but the percentage of database cleanup usually remains low. Most algorithms are two-step applications. These result in a greater improvement of database quality since, with the initial application of a short key, all possible duplicate records are collected and the rest of the algorithm is then applied only to this new file. The methods used for duplicate matching evaluation are field comparison and weight assignment. Almost all algorithms studied so far run offline. Also presented was another approach, which facilitates a temporary consolidation as a user carries out a search or during the recall stage (the on-the-fly process). The result of this method is not the physical merging of duplicate records in the database but their temporary or "virtual" consolidation for the purpose of presentation to the user.
For the creation and selection of appropriate duplicate record handling algorithms, there is neither an absolute and specific solution, nor a system or tool which can simply be transferred and applied unchanged from one environment to another. Each environment has its own specifications and policies; it applies specific practices and has specific and special needs. In every system the application of these algorithms calls for a special study and for modifications to correspond to the given needs.
The focus of future research is the handling of large-scale data in a network environment and in real time. Virtual catalogs and the Z39.50 protocol are the focus of future study. Users wish for a comprehensive, updated, clear, consistent, and fast catalog, which is capable of incorporating searches between distributed databases in a heterogeneous network with consistency, accuracy and speed. Further research on conventional ways of duplicate record management, including the most current practices such as virtual merging, is needed. This research is important in order to fully understand a problem to which no satisfactory solutions have been found, while at the same time the needs for such solutions are constantly increasing.

References
Campbell, C. (2006), Melvyl Project Coordinator, information given by e-mail (accessed 31 January 2006).
Cousins, S.A. (1998), "Duplicate detection and record consolidation in large bibliographic databases: the COPAC database experience", Journal of Information Science, Vol. 24 No. 4, pp. 231-40.
Cousins, S. (2006), COPAC Service, Manchester Computing, University of Manchester, available at: copac@mcc.ac.uk (accessed 11 January 2006).
Coyle, K. and Gallaher-Brown, L. (1985), "Record matching: an expert algorithm", ASIS Proceedings, Vol. 4 No. 1, pp. 77-80.
Coyle, K. (1992), Rules for Merging MELVYL Records, Technical Report No. 6, University of California, DLA, Oakland, CA.
Hickey, T.B. and Rypka, D.J. (1979), "Automatic detection of duplicate monographic records", Journal of Library Automation, Vol. 12 No. 2, pp. 125-42.
Hunstad, S. (1988), "Norwegian bibliographic databases and the problem of duplicate records", Cataloguing and Classification Quarterly, Vol. 8 Nos 3/4, pp. 239-48.
ILCSO (2004), Using OCLC for ILLINET Online/Voyager Data Entry, Illinois Library Computer Systems Office, available at: http://office.ilcso.illinois.edu/Docs/using_OCLC.pdf (accessed 15 February 2007).
Lazinger, S.S. (1994), "To merge and not to merge – Israel's Union List of Monographs in the context of merging algorithms", Information Technology and Libraries, Vol. 13 No. 3, pp. 213-9.
Meir, D.D. and Lazinger, S.S. (1998), "Measuring the performance of a merging algorithm: mismatches, missed-matches, and overlap in Israel's Union List", Information Technology and Libraries, Vol. 17 No. 3, pp. 116-23.
O'Neill, E. and Oskins, W.M. (1990), Duplicate Records in the Online Union Catalog, OCLC Office of Research, Dublin, OH.
Toney, S.R. (1992), "Cleanup and deduplication of an international bibliographic database", Information Technology and Libraries, Vol. 11 No. 1, pp. 19-28.
Vougiouklis, G. (2007), ELiDOC, available at: gvoug@elidoc.gr (accessed 2 February 2006).
Wanninger, P.D. (1982), "Is the OCLC database too large? A study of the effects of duplicate records in the OCLC system", Library Resources and Technical Services, Vol. 26, pp. 353-61.
Williams, M.E. and MacLaury, K.D. (1979), "Automatic merging of monographic data bases: identification of duplicate records in multiple files: the IUCS Scheme", Journal of Library Automation, Vol. 12 No. 2, pp. 156-68.

Corresponding author
Anestis Sitas can be contacted at: sitas@lit.auth.gr

work_nlt72cbrcrg2vfr347spy6po2u ----
Title Page
Article: Scalable Decision Support for Digital Preservation: An Assessment
*This version is a voluntary deposit by the author. The publisher's version is available at: http://dx.doi.org/10.1108/OCLC-06-2014-0026.
Author Details
Author 1 – Name: Christoph Becker; Department: Faculty of Information; University/Institution: University of Toronto; Town/City: Toronto; Country: Canada
Author 2 – Name: Luis Faria; University/Institution: KEEP Solutions; Town/City: Braga; Country: Portugal
Author 3 – Name: Kresimir Duretec; Department: Information and Software Engineering Group; University/Institution: Vienna University of Technology; Town/City: Vienna; Country: Austria
Acknowledgments: Part of this work was supported by the European Union in the 7th Framework Program, IST, through the SCAPE project, Contract 270137, and by the Vienna Science and Technology Fund (WWTF) through the project BenchmarkDP (ICT12-046).
Structured Abstract:
Purpose – Scalable decision support and business intelligence capabilities are required to effectively secure content over time. This article evaluates a new architecture for scalable decision making and control in preservation environments for its ability to address five key goals: (1) Scalable content profiling, (2) Monitoring of compliance, risks and opportunities, (3) Efficient creation of trustworthy plans, (4) Context awareness, and (5) Loosely-coupled preservation ecosystems.
Design/methodology/approach – We conduct a systematic evaluation of the contributions of the SCAPE Planning and Watch suite to provide effective and scalable decision support capabilities. We discuss the quantitative and qualitative evaluation of advancing the state of the art and report on a case study with a national library.
Findings – The system provides substantial capabilities for semi-automated, scalable decision making and control of preservation functions in repositories. Well-defined interfaces allow a flexible integration with diverse institutional environments. The free and open nature of the tool suite further encourages global take-up in the repository communities.
Research limitations/implications – The article discusses a number of bottlenecks and factors limiting the real-world scalability of preservation environments. This includes data-intensive processing of large volumes of information, automated quality assurance for preservation actions, and the element of human decision making. We outline open issues and future work.
Practical implications – The open nature of the software suite enables stewardship organizations to integrate the components with their own preservation environments and to contribute to the ongoing improvement of the systems.
Originality/value – The paper reports on innovative research and development to provide preservation capabilities. The results of the assessment demonstrate how the system advances the control of digital preservation operations from ad-hoc decision making to proactive, continuous preservation management, through a context-aware planning and monitoring cycle integrated with operational systems.
Keywords: Repositories, preservation planning, preservation watch, monitoring, scalability, digital libraries.

Scalable Decision Support for Digital Preservation: An Assessment
This article presents a systematic assessment and evaluation of the SCAPE decision support environment comprising Plato, Scout and C3PO. We discuss the improvements and identified limitations of the presented system. We furthermore discuss the quantitative and qualitative evaluation of advancing the state of the art and report on a case study with a national library. Finally, we summarize the contributions and provide an outlook on future work.
1. Introduction
This article continues the discussion in Becker et al (2014) and presents a systematic assessment and evaluation of the SCAPE decision support environment comprising Plato, Scout and C3PO. We discuss the improvements and identified limitations of the presented system. We furthermore discuss the quantitative and qualitative evaluation of advancing the state of the art and report on a case study with a national library. Finally, we summarize the contributions and provide an outlook on future work.

2. Evaluation and assessment
While some of the questions that are raised by the design goals discussed in Becker et al (2014) can be readily evaluated using standard metrics, others require a detailed qualitative assessment. This section will discuss how to systematically assess improvements on the dimensions of trust and scalability. We will report on a typical case study conducted with the State and University Library Denmark, discuss key metrics that can be used for evaluation, apply them to assess recent advances, and discuss a set of limitations. We further discuss how these findings can be applied on a wider scale.

2.1 Evaluation dimensions and challenges
Five major design goals have been proposed in Becker et al (2014):
● G1: Scalable content profiling is required to create and maintain an awareness of the holdings of an organization, including the technical variety and the risk factors that cause difficulties in continued access and successful preservation.
● G2: Monitoring compliance, risks and opportunities is a key enabler to ensure that the continued preservation activities are effective and efficient.
● G3: Efficient creation of trustworthy plans is required so that preservation can function as a normal part of an organization's processes in a cost-efficient way.
● G4: Context awareness of the systems ensures that they can adapt to the specific situation rather than provide generic recommendations or require extensive manual configuration.
● G5: Loosely-coupled preservation ecosystems, finally, enable organizations to follow a stepwise adoption path and support continuous evolution of the preservation systems as new solutions and improved systems emerge.
Given this set of design goals, it is clear that a systematic evaluation has to be based on both qualitative and quantitative criteria and account for the various socio-technical dimensions of the design problem. Scalable content profiling requires, first and foremost, efficiency in the data processing system. This can be measured in terms of the amount of data processed in a certain timeframe using a defined set of resources. This applies to the content profiling tool C3PO. The efficiency of decision making, on the other hand, can be measured in controlled experiments. This, however, has to be done in a real-world environment to be meaningful, which creates additional challenges for a large-scale assessment and has to be interpreted with caution.
The effectiveness of a preservation system composed of several heterogeneous and asynchronous processes, collaborating over time and controlled by decision makers in a real organization, is much harder to measure, since very often what needs to be measured in terms of effects is time-delayed and to a large extent defies objective measures in the present time. Similarly, trust is extremely hard to measure, and the preservation community has for a decade discussed different ways of assessing the trustworthiness of a repository (Ross & McHugh 2006; OCLC and CRL 2007).
The resulting criteria catalogue ISO 16363 (ISO 2010) provides a useful checklist for assessing the assumed trust of an organization and hence can form a guideline for evaluation, but does not apply to the actual operations and the preservation lifecycle on the operational level. The Plato planning approach that forms the basis for the architecture presented here has been designed with these criteria in mind and evaluated for adherence with and support of these criteria (Becker et al. 2009). However, it can be argued that more holistic perspectives are required to assess and improve an organization's trustworthiness, perspectives that emphasize enterprise governance of IT and the maturity of organizational processes (Becker et al. 2011).
The following discussion is designed loosely along the Goal-Question-Metric paradigm (Basili et al. 1994). Each goal is associated with a set of questions corresponding to the objectives outlined in Becker et al (2014). The answers to these should support an assessment as to how far the goal has been achieved. To this end, each question is further linked to a set of metrics that provide objective indicators to support an answer to the question. We discuss each of the design goals in turn and discuss the specific questions that need to be answered to provide an assessment of how the state of the art is improved with the proposed system design and implementation.

2.2 Evaluation of design goals
G1: Provide a mechanism for scalable in-depth content profiling
Figure 14: Scalable profiling goals
Figure 14 poses the key questions we need to answer to evaluate the scalability and quality of in-depth profiling. Content profiles need to be meaningful, i.e. cover the interesting features that are known to be relevant for preservation, and trustworthy. Clearly, a profile covering only the size of files will be less meaningful than a profile including mime-types, formats, and validity. Additionally, a plethora of features influence the success of continued access, ranging from the presence of Digital Rights Management settings to the number of embedded tables in electronic documents and other dependencies. The features needed for the in-depth characterization process are broad coverage in terms of supported file formats and extracted features, usage of a common vocabulary for the identification of formats, feature names and their values, and reasonably low resource consumption combined with adequate performance, so that the process can be used in large-scale frameworks.
By relying on the FITS file information toolset, the C3PO profiler maximizes the coverage of features, arguably providing the highest possible feature coverage that can currently be achieved (Petrov & Becker 2012). The correctness of the aggregation itself can be verified in a straightforward way, since the operations are basic statistical calculations. The correctness of general map-reduce based operations in themselves can be assumed. On the other hand, the correctness of characterization components in providing accurate feature descriptors based on arbitrary input is far from proven. In fact, current data sets are entirely insufficient for proving the correctness of the complex interpretation processes that take place. This, however, is a problem on the level of operations and cannot be attributed to the aggregation step of content profiling.
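As a toy illustration of why the aggregation step is easy to verify (this is not C3PO's actual implementation), the sketch below reduces a stream of per-object property records into collection-level distributions by plain counting.

```python
from collections import Counter, defaultdict

def aggregate_profile(objects):
    """Reduce per-object characterization output (one dict of properties
    per object) into a value distribution for every property."""
    profile = defaultdict(Counter)
    for props in objects:                    # 'map' over characterization records
        for name, value in props.items():
            profile[name][value] += 1        # 'reduce' by counting
    return profile

sample = [{"format": "PDF", "valid": "true"},
          {"format": "PDF", "valid": "false"},
          {"format": "HTML"}]
profile = aggregate_profile(sample)
print(profile["format"])   # Counter({'PDF': 2, 'HTML': 1})
```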
Separate efforts are underway to verify the correctness of characterization tools using model-driven engineering to generate annotated test data (Becker & Duretec 2013). To enable future evolution, yet another aspect of scalability, a meaningful content profiler must be flexible enough to work on arbitrary property sets. C3PO supports this by relying on a generic data model, so that any additional property sets can be profiled. This also supports the integration of further characterization tools. While FITS fulfils the coverage and vocabulary requirements, it consumes considerable resources and takes a substantial amount of time to execute [i]. Hence, C3PO also supports Apache Tika, which in experiments showed far better resource consumption and performance, with a throughput of up to 18 GB per minute [ii] when used with large-scale platforms such as Apache Hadoop [iii]. While in this case Apache Tika was only used for file format identification, it supports feature extraction and has good coverage of file formats [iv], but it does not yet use a well-defined vocabulary for the identification of extracted features.

The objectively measurable throughput and resource usage in profiling, then, is the crucial final question. To measure the time and resource behavior of C3PO, a set of controlled experiments has been conducted. The first measured the throughput of C3PO on a single standard machine, while the second employed a server with strong hardware and explored the boundaries of scalability by attempting to profile up to 400 million resources (12 Terabytes) in a single profile, enabling a further extrapolation of these results to the entire set of 300 TB in this collection. The third examined the limits of the web visualization platform in coping with these amounts of data.

The first experiment (Petrov & Becker 2012) tested the performance on limited resources and showed that on a standard computer with 4 GB of RAM and a 2.3 GHz CPU, ingesting and generating a profile of 42 thousand FITS files takes about 1.5 minutes. Large-scale tests were performed by Niels Bjarke Reimer from the Danish State and University Library [v]. A 12 TB sample was taken from the 300 TB dataset of the Danish web archive. FITS was run on the sample content, resulting in 441 million FITS files. This characterization process took about a year to complete [vi]. For content profiling, two processing steps need to be considered: 1) the gathering of files into the internal data structure, and 2) the analysis of that data set using map-reduce queries. The experiments were executed on a single machine with the specifications described in Table 1.

Table 1: C3PO scalability test machine specifications
Processor: 2 x Intel Xeon X5660, 2.8 GHz (12 cores)
RAM: 72 GB
Storage: Isilon storage box with 20 TB storage and 400 GB SSD, connected by a 1 Gbit/s Ethernet network
Operating system: Linux x86 64-bit
MongoDB version: 2.4
Application server: Apache Tomcat 7

The first step, which ingests the FITS files into a MongoDB server, was tested with the 441 million FITS items. The graph depicted in Figure 15 shows the import time for samples of around 3,600 files; the Y-axis shows time in milliseconds and the X-axis is the sample number, which can also be read as a timeline.

Figure 15: Performance of C3PO import process using FITS metadata (Reimer, et al. 2013)

The complete import process took less than 80 hours, with an average execution time of 0.65 milliseconds per FITS file.
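As an illustration of these two processing steps, the following sketch shows how parsed FITS property records might be bulk-inserted into MongoDB and then aggregated with a map-reduce query. This is a hypothetical sketch, not C3PO's actual code or schema: it assumes a local MongoDB instance, uses the map_reduce call as available in the pymongo 3.x driver, and the database, collection and field names are illustrative.

from pymongo import MongoClient
from bson.code import Code

client = MongoClient("localhost", 27017)
db = client["profiling"]        # hypothetical database name
elements = db["elements"]       # one document per characterized file

# Step 1: gather -- bulk-insert parsed FITS property records.
elements.insert_many([
    {"uid": "file-001", "format": "PDF/A-1b", "size": 123456},
    {"uid": "file-002", "format": "HTML", "size": 2048},
])

# Step 2: analyze -- a map-reduce query computing the format distribution.
mapper = Code("function () { emit(this.format, 1); }")
reducer = Code("function (key, values) { return Array.sum(values); }")
result = elements.map_reduce(mapper, reducer, "format_distribution")

for row in result.find():
    print(row["_id"], int(row["value"]))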
The measured import time is quite constant, with only a few outliers, which implies that the platform and the software are suitable for importing large amounts of data. The second step, the analysis of the data using map-reduce queries, was tested with a data set of about 12 million FITS files and took 15 hours and 18 minutes, which is about 4.63 milliseconds per FITS file. Using sharding and map-reduce technologies, the processing time of the second step should also scale linearly. In conclusion, both steps are linear and together take on average 5.28 milliseconds per FITS file. This means that processing the current 300 TB dataset would take about 16,170 hours, or about 677 days, on a single machine. As both processes are massively parallelizable and the MongoDB platform already supports sharding and map-reduce, the processing time can be greatly reduced by distributing the load over several servers. Substantial resources may be needed to bring the processing time down to a practical level, but this profile does not have to be re-generated frequently.

The C3PO tool also provides a web interface that supports real-time analytics on the gathered data. While this is not one of the requirements strictly required for automated monitoring, it provides interesting insights into content profiles that are considered highly valuable by the decision makers. In this scenario, however, the limits of the web application are revealed. Several test runs were made with different data set sizes to ascertain the limits of the application. For each test run, two manual procedures on the web interface were performed: 1) opening the overview page, which calculates, in real time, the distributions of several extracted features, and 2) drilling down into the characteristics of a subset of the collection, such as all the files of a given format.

Table 2: Testing the limits of real-time analytics in the C3PO web application (Reimer, et al. 2013)
Test run | # FITS files | Size (GB) | # properties | Overview processing time | Drill-down processing time
1 | 13,962 | 0.03 | 80 | fast | fast
2 | 108,348 | 0.26 | 96 | 18 sec | 11 sec
3 | 363,991 | 1.00 | 106 | 30 sec | 34 sec
4 | 1,020,514 | 2.46 | 113 | 2 min 25 sec | 1 min 42 sec
5 | 1,639,842 | 3.95 | 119 | 3 min 52 sec | 2 min 50 sec
6 | 2,683,596 | 6.44 | 119 | 6 min 28 sec | 4 min 25 sec
7 | 11,905,935 | 28.63 | 211 | not finished within 3 hours | N/A
8 | 441,923,560 | 1183.50 | 5122 | N/A | N/A

Table 2 shows the results of the tests: the application delivers acceptable results up to around 2.5 million files, with a waiting time of around 6.5 minutes. Above this limit, the web system does not respond within 3 hours, which is considered unacceptable. Hence, it is not feasible to perform real-time analytics with the current solution on the set of 441 million FITS files of the 12 TB data set. It should be noted that the analysis of the entire dataset corresponds to a view over 400 million rows in a table with over 5,000 columns, with a resulting database size of over a Terabyte.

G2: Enable monitoring of operational compliance, risks and opportunities

This section analyzes how the mechanisms presented in Becker et al (2014) can be used to accomplish proactive monitoring of operational compliance, risks and opportunities in a preservation environment.

Figure 16: Monitoring

As outlined in Figure 16, the key questions relate to the identification of aspects that need to be monitored and to the coverage of measures available to provide indicators related to these aspects. Based on an analysis of a reference model for drivers and constraints (Antunes et al.
2011), which classifies each of the influencers a preservation organization should be aware of, the discussion in (Becker, Duretec, et al. 2012) showed that relevant questions and measures can be derived for each of the influencers of interest. This enables the development of appropriate adaptors for measuring specific indicators pertaining to each driver. Table 3 shows key examples; a full discussion and detailed table is provided in (Becker, Duretec, et al. 2012).

Table 3: Selected preservation drivers and related information sources (Becker et al. 2012)
Driver | Question | Indicator | Sources
Content | Is the content volume growing unexpectedly? | Rate of growth in ingest changes dramatically | content profile, Repository Report API
Operations | Are our content profiles policy-compliant? | Mismatch between content profiles and policy statements | content profiles, control policy statements
Format | How many organizations have content in this format? | Number of shared content profiles containing a format | content profiles shared by organizations
Format | What is the predicted lifespan of format X? | Lifespan estimates based on historic profiles | model-based simulation

In practice, the achieved coverage of measures is by no means complete, but it is increasing. Currently supported sources include format registries, semantic policies, content profiles, and an automated rendering and comparison tool (Law et al. 2012). A prioritization approach is taken to target first and foremost those aspects that are perceived as most critical. The open nature of the adaptor design, the data model, and the licensing model means that additional sources can be integrated by anybody in the preservation community, and the coverage is rising steadily.

Figure 17: Evaluation of monitoring compliance

Monitoring of operational plans is illustrated schematically in Figure 17 with a simplified example. Consider a preservation plan that evaluates four potential actions ("alternatives") against a set of four decision criteria. These criteria evaluate the important aspects of the data to be preserved, the environment, and the actions to be applied. Based on these criteria, the preservation actions in question are evaluated and a ranking is calculated. The planner then chooses the best-suited action and adopts it. In this example, a check mark denotes best-in-class performance, a tilde denotes acceptable performance, and a cross reflects unacceptable performance, for example a process that did not terminate or an image conversion that produced a distorted image. We can see that two alternatives have been rejected, and alternative 1 has the highest score and will be selected [vii].

Since the decision criteria identified during planning lead to the adoption of a certain action, they must be monitored during operational executions as well, to enable the organization to track whether the action keeps performing according to expectations. This is shown on the bottom left of the figure. However, this is not the only aspect prone to evolution:
(1) New alternatives will emerge over time that may perform better than the chosen alternative. In cases where no alternative was acceptable, this will sometimes be the only thing monitored, since the organization would wait for a better solution to become available before embarking on premature preservation actions. For example, this was the case in (Kulovits et al. 2009).
(2) Updated or new Quality Assurance tools can emerge that provide more reliable or more efficient measures for Quality Assurance, or even the first automated way to measure a relevant quality. For example, these could be of the kind described in (Jurik & Nielsen 2012) or (Bauer & Becker 2011).
(3) Related to this, experiments including certain criteria may be conducted by other individuals or organizations and can reveal risks and opportunities related to this plan. For example, the chosen Quality Assurance tool might be shown to malfunction on similar objects, which poses a major risk (Bauer & Becker 2011).
(4) Finally, the organization's objectives themselves may shift over time as goals change. This would be reflected by a change in the control policies.

The tool suite described in this article is designed to provide full support for this monitoring scenario. The upcoming release of Plato generates specifications describing the expected quality of service (QoS), similar to a service-level agreement (SLA), for the set of decision criteria considered, linked to the corresponding organizational policies, and deposits corresponding monitoring conditions in Scout upon deployment of a preservation plan. Such QoS specifications are created for those criteria in the tree which are influenced by the dynamic behavior of the service, i.e. the components. That means they are not created for aspects relating to the format, such as the ISO standardization of PDF versions, but they do include criteria such as whether the created files are well-formed. QoS is then measured within executable workflows and monitored for fulfillment. Aspects pertaining to the format and other non-dynamic aspects are monitored as risks and opportunities using Scout. While Scout is able to collect a wide variety of measures, these are naturally limited by the availability of operations that support such measures. The controlled vocabulary encourages developers to declare which measures their tools deliver to support discovery, but the coverage of measures will naturally vary across different scenarios. It is important to note, however, that any required measures can be integrated by any organization due to the open nature of the ecosystem. Finally, transparency of the monitoring process is achieved through the usage of the permanent shared vocabulary and the explicit declaration of tolerance levels in the QoS, corresponding to the specified acceptance thresholds that are derived from the organization's control policies.

G3: Improve planning efficiency

Previous work has shown that the key challenge in planning is to make the decision making process more efficient (Kulovits et al. 2009; Becker & Rauber 2011c). In Becker et al (2014), we reflected on the dimension of trust, which should not be sacrificed in this quest. Correspondingly, the key questions shown in Figure 18 relate to the aspect of effort: How long does it take to create one preservation plan now, and how much further improvement is possible?

Figure 18: Efficient creation of trustworthy plans

Previous discussions have established the trustworthiness of plans produced by Plato (Becker et al. 2009), which rests on evidence-based measures of decision criteria directly linked to organizational goals, based on factual evidence and documented with full change tracking assigned to acting users. These strengths continue to form the backbone of trustworthy planning. While it is clear that fully automated, i.e.
autonomous, preservation planning would contradict the goal of trustworthiness in this domain, the goal nevertheless must be to achieve a substantial increase in efficiency (Becker & Rauber 2011c). We focus our discussion of effort measurements on a controlled case study conducted with the Danish State and University Library, described in detail in (Kulovits et al. 2013a). In this study, a set of responsible decision makers and experts from the library set out to create a preservation plan, with the assistance of a planning expert and a moderator who kept time of all activities throughout the planning process. The goal of planning was to create a preservation plan for a large set of audio recordings; the drivers and events motivating the plan included the goal to homogenize the formats of the library's holdings and to provide well-supported and efficient access to authentic content. The team at the library has comprehensive expertise in all relevant areas, ranging from technical knowledge of audio formats and quality assurance mechanisms for comparing audio files to a documented understanding of the designated communities and the preservation purpose of the content set at hand. The preservation plan was created using the then-current version 3 of the planning tool Plato, the precursor of the solution presented here. The goal was to identify the major areas of decision making effort and to measure the potential improvement that can realistically be achieved.

The total time required to create a preservation plan amounted to 35.5 person hours, completed over a period of two days. This shows, on the one hand, that efficient teams in well-established settings can already plan quite efficiently. Nevertheless, the effort must be further reduced to make planning truly a part of "business-as-usual" preservation in practice. To contextualize the effort required in this case, it is important to understand that this effort strongly depends on a well-defined understanding of the decision making context, including the understanding of the goals and constraints; the expertise of the decision makers; and the technical proficiency of the staff carrying out the experimental steps of preservation planning. Finally, a strong cost driver is the homogeneity of content: for large object sets that are very diverse, several preservation plans will have to be created, each respecting to a certain degree the specific aspects of a subset of the content and the means available to ensure access to this subset.

Figure 19: Distribution of effort across activities in preservation planning (Kulovits et al. 2013a)

Figure 19 shows the distribution of effort across each of the types of activities that were part of this planning process. It should be noted that several of these activities were in fact on the upper end of the efficiency range, for several reasons:
● Experiment execution often takes more time. The experiments conducted were highly efficient due to the minimal number of alternatives evaluated, the high technical proficiency of the staff, the homogeneity of the content, and the quality assurance mechanisms employed. In many cases, the experimentation process consumes a multiple of this time. The integration of Taverna workflows and myExperiment can reduce this massively, since potential components can be discovered and automatically invoked within planning.
This automation is similar to an existing integration of automated measures in Plato (Becker & Rauber 2011a), but makes these mechanisms available on an open, standardized and easily extensible basis.
● Background information is often unavailable. This applies in particular to the user communities and the statements of preservation intent that many organizations are only now beginning to document systematically (Webb et al. 2013). The organization in question, however, has a stable and well-supported definition of collections and user communities, from which the preservation goals could be derived rather efficiently. Formal policy specification makes this background explicit and known to the systems, so that the effort can be further reduced.
● Analysis and verification is complex. Even with the support of a planning expert, 14% of the time was spent in sense-making: analyzing the completed set of evidence and assessments in the decision making tool to arrive at a conclusion that was well understood by the stakeholders. This points to the need to improve the decision support tool by visualizing results in a more easily understandable and user-friendly way. Improved summaries in Plato are planned to this end.
● Entering data into the system is tedious, in particular for users not familiar with the tools. This was alleviated by the involvement of a planning expert familiar with the tool. Similar to other aspects, this benefits greatly from the integration of the tool with workflows and from the explicit endowment of Plato with an understanding of the policy models of organizations. A subsequent controlled experiment showed that Plato 4 reduced this effort by over 50% (Kulovits et al. 2013a).

In an ideal case, the effort required to cover the above aspects (software testing, background information, analysis and verification, and data input) can be removed almost entirely. Still, 50% of the time in this case would be spent discussing requirements. However, the majority of these discussions concern objectives about formats, significant properties, and technical encoding or representation (Becker & Rauber 2011a). For all of these aspects, standard definitions are now available as part of the controlled vocabulary, enabling decision makers to reuse definitions and formalize these aspects on a policy level, removing this activity from the operational planning process. This applies to the designated community and preservation intent statements as well as to format and risk factors and technical constraints. The control policy statements thus can reduce this effort by enabling reuse of these goals and constraints across plans. As the discussion on Goal 4 will show, the context awareness of Plato can eliminate the need for in-depth discussions of requirements as part of planning almost entirely.

For an organization that establishes planning as a proper function in its roles and responsibilities and possesses a solid skills and expertise base, we estimate that preservation planning should on average take about one to two person days per plan, provided that policies and content profiles are known and documented. However, a large variance across organizations is to be expected. This estimate will strongly depend on a variety of specific factors and certainly needs to be further validated in longer-term empirical studies. These should in particular also cover the question of the homogeneity of content sets covered in a plan: how many plans are required to safeguard a particular heterogeneous set of objects?
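A back-of-envelope check, using only the figures reported above, illustrates how this estimate of one to two person days is obtained. The sketch assumes the reported reduction applies to the total planning effort and an 8-hour person day:

# Figures reported in the case study (Kulovits et al. 2013a).
baseline_effort_hours = 35.5   # total effort for one plan with Plato 3
plato4_reduction = 0.5         # Plato 4 reduced this effort by over 50%

remaining_hours = baseline_effort_hours * (1 - plato4_reduction)
person_days = remaining_hours / 8.0   # assuming 8-hour person days

print(f"~{remaining_hours:.1f} person hours, ~{person_days:.1f} person days")
# ~17.8 person hours, ~2.2 person days -- consistent with the estimate of
# roughly one to two person days once policies and profiles are documented.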
A detailed discussion of the activities in this process and the relevant skills and expertise is presented in (Kulovits et al. 2012).

G4: Make systems aware of their context

Figure 20: Context awareness

By providing a model that enables decision makers to formulate policies so that they can be understood by automated processes, the systems can understand their context and stay informed about its state. To assess the context awareness of the systems in question, we investigate three distinct aspects, as illustrated in Figure 20. First, the context needs to be well understood and modeled in order to ensure that a solid approach has been taken. Second, each of the systems needs to demonstrate that it can appropriately use the part of the context that is relevant for its function. Finally, it is crucial to ensure that this does not come at the cost of coupling the context too closely to the systems. To this end, we discuss how this context can evolve independently from each of the systems.

The modular approach of the semantic models has been discussed in Becker et al (2014), and a detailed documentation of the model is provided in (Kulovits et al. 2013b). The models are based on W3C-approved standards and follow established Linked Data principles. At the heart of the model is the Resource Description Framework [viii] (RDF), a standard model for representing data and metadata in subject-predicate-object triples. The Web Ontology Language [ix] (OWL) provides the mechanisms for the description of vocabularies, defining classes and properties, which are used to annotate, describe and define resources. Having well-defined semantics, OWL facilitates reasoning, ontology management and the querying of data. The model is represented as an RDF graph and queried using SPARQL. The vocabulary domains have permanent identifiers according to the following ontologies:
● http://purl.org/DP/preservation-case contains the basic elements that link a preservation case together.
● http://purl.org/DP/quality describes the quality ontology, linking attributes and measures in a domain-specific quality model.
● http://purl.org/DP/quality/measures contains the vocabulary individuals that are used for annotating, describing and discovering measures and the mechanisms for measuring.
● http://purl.org/DP/control-policy, finally, defines the classes of objectives relevant for making a preservation case operational.

Each of the systems presented is aware of those parts of the model that are relevant for its domain. Correspondingly, each system shows its awareness of this model in a different manner. Plato uses the control policy model in several ways. On the one hand, the preservation case provides the basic cornerstones of planning. Instead of providing the documentation of the planning context in textual form, as used to be standard (Becker & Rauber 2011c), a planner who has specified the policy model selects a preservation case to start planning, and the contextual information for this case is extracted from the policy model. Additionally, the objectives and measures specified in the control policy enable the decision support tool to derive the complete goal hierarchy automatically from the model, leaving it to the decision maker only to revise, verify and confirm the decision criteria to be used for the experimental evaluation. In the case study discussed above, this requirements specification alone accounted for 30% of the effort.
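To give a concrete flavour of such a machine-understandable policy model, the following hypothetical sketch builds a tiny control-policy graph and queries it with SPARQL using the rdflib library. The namespaces echo the purl.org/DP identifiers listed above, but the class and property names (cp:Objective, cp:measure, cp:modality, cp:value) are illustrative stand-ins, not the actual terms defined in those ontologies:

from rdflib import Graph

# A tiny, hypothetical control-policy fragment (illustrative terms only).
policy_ttl = """
@prefix cp: <http://purl.org/DP/control-policy#> .
@prefix m:  <http://purl.org/DP/quality/measures#> .
@prefix ex: <http://example.org/policy#> .

ex:obj1 a cp:Objective ;
    cp:measure m:55 ;
    cp:modality "MUST" ;
    cp:value "uncompressed" .
"""

g = Graph()
g.parse(data=policy_ttl, format="turtle")

# Ask which measures the organization constrains, and how.
query = """
PREFIX cp: <http://purl.org/DP/control-policy#>
SELECT ?measure ?modality ?value WHERE {
    ?o a cp:Objective ;
       cp:measure ?measure ;
       cp:modality ?modality ;
       cp:value ?value .
}
"""
for measure, modality, value in g.query(query):
    print(measure, modality, value)

Because the model is plain RDF, a tool such as Scout or Plato can evaluate exactly this kind of query to derive goal hierarchies or compliance conditions without any tool-specific configuration.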
The policies themselves of course require a similar discussion; however, much of the objective specification has to be discussed only once and can then be carried forward across preservation cases, which represents a substantial efficiency gain as soon as more than one plan is created. Similarly, the acceptable values, and hence the utility functions associated with each measure, can be computed in a straightforward way based on the objectives specified in the control policies, which represented a potential gain of another 17% in the case study discussed above.

C3PO uses the vocabulary relevant to characterization in the content profile, referencing elements from the quality measures catalogue (such as http://purl.org/DP/quality/measures#55). Given that it only provides objective analytics of factual statements about the domain elements, it has no understanding of the policy model and does not require any.

Scout, finally, leverages the policy model for monitoring the alignment of operations and plans with the policies, and also monitors the policy itself: if it is updated, affected plans should be re-evaluated. Specific standard queries are provided as templates that monitor policy compliance and can be activated by the user. For example, Figure 21 shows Scout starting a monitoring activity on the policy conformance of a specific content set (identified by a collection key). In this case, it shows in a preview that the property compression scheme is violated by 3 entries, and it provides the option to create a continuous monitoring process by specifying a trigger with a condition and an event.

Figure 21: Checking collection policy conformance in Scout

It can be seen that the model of the context is shared between the tools, with the decision maker updating the ontology independently of the tools. A crucial requirement is that the context model can evolve independently of the systems. This is especially important considering that the current model is very much focused on operational support and can benefit greatly from being expanded to cover aspects of decision making that are further removed from operations. Similarly, it can be expected that meaningful linkages will surface that connect the existing ontologies to emerging ontologies from neighboring areas, ranging from software quality and ontologies for describing software dependencies and platforms to preservation metadata and related policies. The potential for such evolution is guaranteed by the choice of representation and languages, since the Linked Data principles that the model adheres to are designed with these very goals in mind.

G5: Design for loosely-coupled preservation ecosystems

The design goal of loosely-coupled systems is relevant for several reasons. On the one hand, it is crucial for enabling the stepwise adoption approach preferred by many organizations (Sinclair et al. 2009). On the other hand, it ensures that evolution can take place independently, enabling each organization to replace parts of its system without negatively affecting continued operations, and enables each component of the ecosystem to be sustained independently (to a degree) of the others.

Figure 22: Loosely-coupled preservation ecosystems

Figure 22 relates these goals to more specific questions. While it is clear that the components are open source, licensed under OSI-approved conditions [x], and highly modular, it is useful to consider closely both the functional specifications and the data structures.
The API specifications for the SCAPE Planning and Watch suite are in the process of being published openly on GitHub. The data exchanged between components is standardized and supported by schemas, as shown in Table 4.

Table 4: Interoperability of components
Criterion | Plato | Scout | C3PO
All functional interfaces openly published | in progress | in progress | in progress
All data structures documented using standards and schemas | XML schemas published for each version | Linked Data model, policy model | XML schema published
Component is used independently | yes | yes | yes
Component follows the controlled vocabulary | objectives, measures, control policies, preservation cases | objectives, measures, control policies | measures

The controlled vocabulary, as the glue that connects much of the ecosystem, is maintained on GitHub [xi]. Curating this vocabulary over the long term will be sustained by a community effort. Recent discussions in the metadata and preservation communities have brought forward long-term requirements for such evolution that will be considered carefully (Gallagher 2013a; Gallagher 2013b).

The components are functionally independent in that every component can be, and actually is, used independently. Nevertheless, it is clear that the compound value proposition is larger than the sum of its parts, serving to encourage take-up of the suite as a whole. Similarly, the usage of this tool suite benefits greatly from integration with the workflow development, execution and sharing platforms Taverna and myExperiment, whose latest releases provide specific support for semantic annotation, driven by the requirements outlined in this article.

Since such an ecosystem should be built with sustainable evolution in mind, we consider a recent discussion that identified eleven factors affecting the sustainability of a modular preservation system (Gallagher 2013a; Gallagher 2013b). Table 5 shows how our system performs on each of these criteria.

Table 5: Sustainability evaluation of the SCAPE Planning and Watch suite
Sustainability factor [xii] | How does the SCAPE Planning and Watch suite perform?
Ability to view and modify source code | All components are openly licensed, and all source code elements are freely available in a GitHub repository.
Widely used | C3PO and Scout are relatively new but enjoying quick take-up in the community, while Plato has grown to over 1,000 user accounts since its first publication in 2008. However, usage so far has been limited to prototypical evaluation rather than production-level deployment, mostly due to the level of effort involved.
Well tested, few bugs or security flaws | All tools support automated tests and have an active ticketing system, and the major releases are considered very stable. No security incident has been reported so far.
Actively developed, supported | All tools are part of an active development community, continuously supported, and the development platform is hosted by the Open Planets Foundation [xiii].
Standards aware | All components follow standards on multiple levels wherever possible, ranging from standard technologies such as JavaServer Faces to XML Schema declarations and Linked Data principles.
Well documented | All components have extensive code documentation, manuals, built-in help and tutorials, as well as scientific publications explaining the theoretical foundations and practical implications of the software.
Unrestricted licensing | All software components are licensed under OSI-approved open licenses such as the LGPL and the Apache Software License 2.0. All documentation is licensed under a Creative Commons license.
Ability to import and export data and code | Preservation plans, executable plans and content profiles can be freely imported, exported, and shared between users. The Scout knowledge base is a Linked Data triple store and hence equally portable.
Compatible with multiple platforms | Being based on standard server technologies, all components are compatible with multiple platforms. Plato even integrates with multiple platforms at once in the case of preservation action discovery (Kraxner et al. 2013).
Backward compatible | This is very relevant in the context of Plato, which has been an online service since 2008. There is full backward compatibility, with a fully traceable forward conversion upon import of legacy preservation plans. All plans created on the online service have been automatically migrated for all releases. Similarly, the knowledge base of Scout is designed to keep growing incrementally, without disposing of accumulated historical data.
Minimal customization | There is almost no customization required, since all contextual adaptation of the systems' behavior can be achieved through the configuration of API endpoints and the corresponding definition of control policies.

While the ecosystem is well positioned for future sustainability, there is still room for improvement. This includes the development and publication of Technical Compatibility Kits that can automatically test the functional compliance of a component with an API specification, as has been done for the Data Connector API [xiv], but also the long-term evolution of vocabularies and any future extensions of the tool suite.

2.3 Practical adoption

Considering the preservation lifecycle outlined in Becker et al (2014), what does the availability of the described system mean for an organization that has content and a preservation mandate, has set up a reasonable organizational structure and defined corresponding responsibilities, but has not yet ventured to create and maintain specific, actionable preservation plans? The exact measures to be taken will certainly depend on the specific institutional context, but essentially such an organization can follow a series of steps.
1. Getting started entails several aspects.
   a. Start content profiling. Run format identification and characterization components such as FITS on the set of content to extract metadata, deploy the content profiling tool C3PO, gather the metadata, and conduct an analysis of the content profile.
   b. Sign up with SCAPE Planning and Watch, either on the online service [xv] or on an organization-specific deployment based on a public code release [xvi].
   c. Connect the organization's repository to SCAPE Planning and Watch, either by configuring a standard adaptor or by implementing a specific adaptor.
2. Specify control policies based on a thorough analysis of the organization's collection, the user communities, and the preservation cases that are considered relevant.
3. Activate the monitoring of policies and content profiles in Scout to detect policy violations.
4. Create preservation plans to increase the alignment of the organization's content and operations with the goals declared in the policies. This planning is done by evaluating action components using characterization and QA components in Taverna workflows, all integrated in planning. The finished plans contain a workflow specification, including QoS criteria that can be automatically monitored.
5.
Deploy the operational plans to the repository through the plan management API, connected to a workflow engine such as Taverna.
6. Establish responsibility for continuous monitoring. This is supported by Scout, which will monitor the compliance of operations with plans and detect risks and opportunities connected to these plans and policies.

2.4 Limitations

From the discussion above, a number of limitations can be observed. These can be divided into limitations of the current capabilities of available tools, which can be expected to grow; more fundamental limitations of current approaches, which require new perspectives to be overcome; limitations of the problem space, which set natural limits to further improvement; and limitations on the quantitative evaluation that can feasibly and meaningfully be conducted. This section discusses those limitations that are seen as central to the further advancement of the state of the art.

Coverage and correctness of available measurement techniques

The availability of tools and mechanisms that deliver objective, well-defined measures shown to be correct and reliable is a key challenge holding back operational preservation today (Becker & Rauber 2011c; Becker & Duretec 2013). Scout supports a growing set of adaptors to feed measures into the knowledge base and, by the nature of its design, alleviates some of the shortcomings and gaps of existing tools through the free combination of multiple information sources, but it is still limited by the availability of these information sources. Similarly, experiment automation in Plato and, equally important, the feasibility of large-scale preservation operations in general are entirely dependent on the existence of well-tested, efficient and effective mechanisms for Quality Assurance. Recent work is showing promising advances (Jurik & Nielsen 2012; Bauer & Becker 2011; Law et al. 2012), but there is still a wide gap to be addressed before preservation operations are broadly supported. It seems crucial that this gap is made explicit and shared with a wide community, so that efforts to close it can be based on a solid assessment of the shortcomings of existing tools rather than on the isolated, ad-hoc identification of application scenarios within single institutions, as is often practiced today. Scalable preservation operations are only possible with fully automated, reliable and trustworthy Quality Assurance; and such quality assurance is expensive to develop and difficult to verify. Only through coordinated community efforts based on solid experimentation can the evidence be constructed to make a convincing case for authenticity (Bauer & Becker 2011). The utter lack of solid, reliable and open benchmark data sets with full ground truth is a fundamental inhibitor to validating the correctness of such measures. To address this gap, we are investigating innovative approaches that turn the publication of test data sets around: from ex-post annotation, inherently plagued by unreliable ground truth and copyright problems, to an open, model-driven generative approach (Becker & Duretec 2013).

Scalable distributed and cost-efficient processing: How to profile a Petabyte?

As shown above, the content profiling tool C3PO provides support for scaling out on distributed platforms. However, it requires considerable resources if the content to be profiled approaches the Petabyte range, and visual analytics are not currently supported on such amounts of data.
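A back-of-envelope extrapolation from the measurements reported in the evaluation above illustrates the scale of the challenge. The sketch assumes 1 PB = 1,000 TB, the measured 5.28 ms of single-machine processing per FITS record, the record density observed in the Danish web archive, and near-linear scaling through sharding:

ms_per_file = 5.28                 # measured: ingest (0.65) + map-reduce (4.63)
files_per_tb = 441_000_000 / 12    # ~36.75M FITS records per TB in this archive

petabyte_tb = 1000                 # assuming 1 PB = 1000 TB
total_files = files_per_tb * petabyte_tb
hours_single_machine = total_files * ms_per_file / 1000 / 3600

for shards in (1, 10, 50):         # assuming near-linear scaling via sharding
    print(f"{shards:3d} machine(s): ~{hours_single_machine / shards / 24:.0f} days")
#   1 machine(s): ~2246 days
#  10 machine(s): ~225 days
#  50 machine(s): ~45 days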
Yet, it is important to point out that the core goals of content profiling are achieved regardless of the collection size: visual analytics is an additional capability on top of the processing activity. To enable the cost-efficient creation of large content profiles without visual analytics requirements, we are exploring purely sequential profilers with a small footprint as a low-cost alternative, and we are investigating a set of techniques for feature-space pruning and dimensionality reduction prior to the more expensive processing steps. Similar considerations apply to preservation operations such as actions, characterization, and quality assurance. As noted, the execution of FITS on the 441 million resources of the Danish web archive took a year to complete, which clearly indicates the need for improvements. Similarly, automated QA mechanisms are computationally demanding (Bauer & Becker 2011). These processes need to be supported by parallel execution environments and more efficient algorithms to be truly applicable to large-scale volumes.

The element of human decision making

As observed above, trustworthy preservation should always be driven by careful decision making and factual evidence. While this element of human decision making can be reasonably minimized, replacing it entirely will only be possible once a solid, substantial knowledge base of real-world cases populates the ecosystem described above. Eventually, the human element can, in the ideal case, be reduced to a policy specification activity and a monitoring oversight function. This is clearly out of scope for this article, but it will provide the logical next step in research on preservation planning and monitoring.

Trust and maturity

The assessment of complex socio-technical systems such as the one presented here is challenging. Arguably, it will not be complete without an enterprise governance view incorporating a set of dimensions on the level of organizational process performance and maturity. A first view on this perspective has been presented in (Becker et al. 2011), where a process and maturity model for preservation planning was outlined that was aligned with the IT governance framework COBIT (IT Governance Institute 2007). Current efforts are building on this work to develop a full-fledged process and capability maturity model that shall support organizations in the systematic improvement of their preservation capabilities [xvii].

2.5 Summary

This section discussed each of the key design goals of the architecture and system presented in Becker et al (2014) and conducted a quantitative and qualitative evaluation of the key objectives for each of the goals. We showed that the system significantly improves on the existing state of the art in digital preservation by combining a context-aware business intelligence support tool with a scalable mechanism for content profiling, both integrated with a successor of the standard preservation planning tool Plato that is showing substantial efficiency gains over previous solutions. While there are limits on the scale of content that can be profiled, analyzed and preserved in limited amounts of time, the improvements show that preservation planning and monitoring can realistically be advanced to a continuous preservation management function integrated with operational systems.
This will provide a substantial step forward for the many organizations that are looking for ways to enable their repositories to truly support the long-term access promise that digital preservation has set out to deliver (Hedstrom 1998). We pointed out a number of limitations that currently hold back further progress and outlined current efforts to tackle them.

3. Conclusion and Outlook

Ensuring the longevity of digital assets across time and changing social and technical environments requires continuous action. The volumes of today's digital assets make effective business intelligence and decision support mechanisms crucial in this endeavor. While the purely technical scalability of data processing can be handled using state-of-the-art technologies, curators require specific decision support to enable the large-scale management of digital assets over time. This demands a set of systems and services that facilitate scalable in-depth content analysis, intelligent information gathering, and efficient decision support, designed as loosely-coupled systems that are able to interact and connect to the wider preservation context.

This article presented a systematic assessment and evaluation of the SCAPE Planning and Watch suite presented in Becker et al (2014). The results of the assessment demonstrate that full preservation lifecycle support can be deployed in preservation systems of real-world scale by adopting a loosely-coupled, open and extensible suite of preservation tools that each support particular aspects of the core preservation planning and monitoring capabilities:
1. Scalable content profiling is supported by the highly flexible and efficient content profiler C3PO, which has been tested on a data set of 441 million files.
2. Monitoring of compliance, risks and opportunities is supported by the monitoring system Scout, which provides an extensible open platform for drawing together information from a variety of sources to support the much-needed business intelligence insights that are key to continued preservation success.
3. Preservation planning efficiency is being continuously improved as the ecosystem grows, and recent advances show that planning can become a well-understood and managed activity of repositories.
4. Context awareness of each of the systems is supported by a shared permanent vocabulary, set to grow over time through extensions with related ontologies, connecting the domains of solution components and the preservation community with the organizational policies and the decision support and control systems presented here.
5. Loose coupling of the components in this ecosystem guarantees that organizations can follow an incremental approach to improving their preservation systems and capabilities.

We discussed the evaluation of key aspects of each tool as well as of the ecosystem as a whole, and outlined the key benefits and advances over the existing state of the art. Based on the limitations identified, we define a number of key goals for future research. These include real-time profiling of very large data sets in the Petabyte range; benchmarking of automated tools against solid, reliable ground truth in open, fully transparent experiments with shared data sets; and a systematic framework for assessing the performance of organizations in terms of process metrics and organizational maturity.
Acknowledgements

Part of this work was supported by the European Union in the 7th Framework Programme, IST, through the SCAPE project, Contract 270137, and by the Vienna Science and Technology Fund (WWTF) through the project BenchmarkDP (ICT12-046).

References

Antunes, G. and Borbinha, J. and Barateiro, J. and Becker, C. and Proenca, D. and Vieira, R. (2011), "SHAMAN reference architecture", version 3.0, SHAMAN project report.
Basili, V.R. and Caldiera, G. and Rombach, H.D. (1994), "The Goal Question Metric Approach", Encyclopedia of Software Engineering, Volume 2, John Wiley, pp 528-532.
Bauer, S. and Becker, C. (2011), "Automated Preservation: The Case of Digital Raw Photographs", in Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. Proceedings of the 13th International Conference on Asia-Pacific Digital Libraries (ICADL 2011), Beijing, China, 2011, Springer-Verlag.
Becker, C. and Antunes, G. and Barateiro, J. and Vieira, R. and Borbinha, J. (2011), "Control Objectives for DP: Digital Preservation as an Integrated Part of IT Governance", in Proceedings of the ASIST Annual Meeting, 2011, New Orleans, USA, American Society for Information Science and Technology.
Becker, C. and Kraxner, M. and Plangg, M. and Rauber, A. (2013), "Improving decision support for software component selection through systematic cross-referencing and analysis of multiple decision criteria", in Proceedings of the 46th Hawaii International Conference on System Sciences (HICSS), 2013, Maui, USA, pp 1193-1202.
Becker, C. and Duretec, K. and Petrov, P. and Faria, L. and Ferreira, M. and Ramalho, J.C. (2012), "Preservation Watch: What to monitor and how", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.
Becker, C. and Duretec, K. and Faria, L. (2014), "Scalable Decision Support for Digital Preservation", to appear in OCLC Systems & Services, Volume 31, Number 1.
Becker, C. and Kulovits, H. and Guttenbrunner, M. and Strodl, S. and Rauber, A. and Hofman, H. (2009), "Systematic planning for digital preservation: evaluating potential strategies and building preservation plans", International Journal on Digital Libraries, Volume 10, Issue 4, pp 133-157.
Becker, C. and Duretec, K. (2013), "Free Benchmark Corpora for Preservation Experiments: Using Model-Driven Engineering to Generate Data Sets", in Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), 2013, Indianapolis, USA, pp 349-358.
Becker, C. and Rauber, A. (2011a), "Decision criteria in digital preservation: What to measure and how", Journal of the American Society for Information Science and Technology, Volume 62, Issue 6, pp 1009-1028.
Becker, C. and Rauber, A. (2011c), "Preservation Decisions: Terms and Conditions Apply. Challenges, Misperceptions and Lessons Learned in Preservation Planning", in Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2011, Ottawa, Canada, pp 67-76.
Gallagher, M. (2013a), "Improving Software Sustainability: Lessons Learned from Profiles in Science", in Proceedings of Archiving 2013, Washington D.C., USA, pp 74-79.
Gallagher, M. (2013b), "Why can't you just build it and leave it alone?", retrieved from http://blogs.loc.gov/digitalpreservation/2013/06/why-cant-you-just-build-it-and-leave-it-alone/
Hedstrom, M. (1998), "Digital Preservation: A time bomb for digital libraries", Computers and the Humanities, Volume 31, Issue 3, pp 189-202.
ISO (2010), "Space data and information transfer systems - Audit and certification of trustworthy digital repositories (ISO/DIS 16363)", International Organization for Standardization.
IT Governance Institute (2007), COBIT 4.1 Framework.
Jurik, B. and Nielsen, J. (2012), "Audio Quality Assurance: An Application of Cross Correlation", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.
Kulovits, H. and Rauber, A. and Kugler, A. and Brantl, M. and Beiner, T. and Schoger, A. (2009), "From TIFF to JPEG2000? Preservation Planning at the Bavarian State Library Using a Collection of Digitized 16th Century Printings", D-Lib Magazine, 2009, Volume 15, Number 11/12.
Kulovits, H. and Becker, C. and Rauber, A. (2012), "Roles and responsibilities in digital preservation decision making: Towards effective governance", in The Memory of the World in the Digital Age: Digitization and Preservation, 2012, Vancouver, Canada.
Kulovits, H. and Becker, C. and Andersen, B. (2013a), "Scalable preservation decisions: A controlled case study", in Proceedings of Archiving 2013, Washington D.C., USA, pp 167-172.
Kulovits, H. and Kraxner, M. and Plangg, M. and Becker, C. and Bechhofer, S. (2013b), "Open Preservation Data: Controlled vocabularies and ontologies for preservation ecosystems", in Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES) 2013, Lisbon, Portugal.
Law, M.T. and Thome, N. and Gançarski, S. and Cord, M. (2012), "Structural and visual comparisons for web page archiving", in Proceedings of the 2012 ACM Symposium on Document Engineering (DocEng'12), 2012, New York, NY, USA, pp 117-120.
OCLC and CRL (2007), "Trustworthy Repositories Audit & Certification: Criteria and Checklist".
Petrov, P. and Becker, C. (2012), "Large-scale content profiling for preservation analysis", in Proceedings of the 9th International Conference on Preservation of Digital Objects (iPRES) 2012, Toronto, Canada.
Ross, S. and McHugh, A. (2006), "The Role of Evidence in Establishing Trust in Repositories", D-Lib Magazine, 2006, Volume 12, Number 7/8.
Sinclair, P. and Billenness, C. and Duckworth, J. and Farquhar, A. and Humphreys, J. and Jardine, L. (2009), "Are you Ready? Assessing Whether Organisations are Prepared for Digital Preservation", in Proceedings of the 6th International Conference on Preservation of Digital Objects (iPRES) 2009, San Francisco, USA, pp 174-181.
Webb, C. and Pearson, D. and Koerbin, P. (2013), "'Oh, you wanted us to preserve that?!' Statements of Preservation Intent for the National Library of Australia's Digital Collections", D-Lib Magazine, 2013, Volume 19, Number 1/2.

[i] http://www.openplanetsfoundation.org/blogs/2013-01-09-year-fits
[ii] http://www.openplanetsfoundation.org/blogs/2012-11-06-running-apache-tika-over-arc-files-using-apache-hadoop
[iii] http://hadoop.apache.org
[iv] https://tika.apache.org/1.4/formats.html
[v] http://en.statsbiblioteket.dk
[vi] http://www.openplanetsfoundation.org/blogs/2013-01-09-year-fits
[vii] In Plato, the scoring functions range between 0 and 5, with 0 being unacceptable, and are aggregated across the goal hierarchy. This is discussed in detail in (Becker et al. 2013).
[viii] http://www.w3.org/RDF/
[ix] http://www.w3.org/TR/owl2-overview/
[x] http://opensource.org/licenses
[xi] https://github.com/openplanets/policies
[xii] (Gallagher 2013a; Gallagher 2013b)
[xiii] http://openplanetsfoundation.org/
[xiv] https://github.com/fasseg/scape-tck
[xv] http://www.ifs.tuwien.ac.at/dp/plato/
[xvi] https://github.com/openplanets/plato
[xvii] www.benchmark-dp.org

work_nly7uohk4zhazoek6ggf6d6laa ---- Microsoft Word - 9813.doc

Design and Implementation of the E-Referencer

Danny C. C. Poo, Christopher S. G. Khoo, Teck-Kang Toh

School of Computing, National University of Singapore, Lower Kent Ridge Road, Singapore 119260, dpoo@comp.nus.edu.sg
Centre for Advanced Information Systems, School of Applied Science, Nanyang Technological University, Singapore 639798, assgkhoo@ntu.edu.sg
The E-Referencer uses the Z39.50 Information Retrieval protocol [1] to communicate with the various library systems, and the Java Expert System Shell (JESS) [2] to implement the knowledge base of the system. At the present moment, the knowledge base of the E-Referencer consists of: 1. a conceptual knowledge base that maps free-text keywords to concepts represented by the Library of Congress (LC) subject headings 2. search strategies coded in the system, including • initial search strategies, used to convert the user’s natural language query to an appropriate Boolean search statement • reformulation strategies, used for refining a search based on the results of the previous search statement 3. rules for selecting an appropriate search strategy. The E-Referencer processes a user’s natural language query, selects a suitable search strategy and formulates an appropriate search statement for the library system. Based on the user’s relevance 2 feedback on the search result, it further selects a strategy for reformulating the search. The process goes on until the user is satisfied with the final result. This paper begins by discussing the state of implementation of current search systems as implemented in Online Public Access Catalogues System (OPACS). The discussion then proceeds to illustrate the improvement in search results obtained by using an initial prototype of the E-Referencer. The design and implementation of the E-Referencer covers the remaining sections of the paper. 2. Experiences in Online Catalog Searches 2.1 User Difficulties in searching OPACS Traditional text-based boolean search systems as incorporated in the design of OPACS are difficult to use. In the 1984 Online Catalog Evaluation Projects sponsored by the Council of Library Resources, Markey found that users have the following problems when performing subject searches of online catalogues [6] : • Users have problems matching their terms with those indexed in the online catalogue. • They have difficulty identifying terms broader or narrower than their topic of interest. • They do not know how to increase the search results when too little or nothing is retrieved. • They do not know how to reduce the search results when too much is retrieved. • They lack understanding of the printed LCSH (Library of Congress Subject Heading) Hildreth also pointed out that “conventional informational retrieval systems place the burden on the user to reformulate and re-enter searches until satisfactory results are obtained.” [7] Indeed the boolean search system, which was the conventional informational retrieval system used then, requires too much knowledge from their users. The new Web-based library OPAC search systems today are not much different from those in the eighties in terms of functionality. Borgman indicated that most of the improvements to online catalogues in recent years were in surface features rather than in the core functionality. Also, online catalogues “were designed for highly skilled searchers, usually librarians, who used them frequently, not for novices or for end-users doing their own searching.” [8]. 3 Other studies conducted by Cousins [3], Dalrymple [4], Ensor [5], Lancaster [6] have found deficiencies in the present-day online catalogue systems and identified the problems users have. For example, Cousins [3] analyzed the types of subject queries that users brought to two online catalog systems. 
She found that many subject queries were not expressed at the level of specificity that was appropriate or suitable for searching the system. She concluded that online catalog systems should provide more information about document content (e.g. contents pages), facilities for browsing the thesaurus and the classification scheme, facilities for browsing records arranged by class number, ranked display of search results, help with query formulation, and relevance feedback.

2.2 Recent Developments in Improving the Search System

To help users search more effectively, recent research efforts have concentrated on improving the search system. The more notable developments are:
• Introduction of 'best match' or statistically-based search systems that incorporate numerically-based algorithms for estimating the relevance of documents to the user's query.
• Use of knowledge-based search systems that encode expert knowledge to provide advice or assistance to users for searching.
• Development of the Z39.50 Standard for Information Retrieval that provides common access to multiple online catalogs.

2.2.1 'Best Match' Search Systems

There are many variations of 'best match' or statistically-based search systems. They support a mixture of features like document ranking, relevance feedback and query expansion. These systems use either the vector-based information retrieval model or the probabilistic model to re-index the collection of documents to support more effective retrieval. The user is allowed to enter his search query in natural language. The search query is then treated as a list of unstructured keywords. Stop-words such as 'is', 'a', and 'the', which are not very useful in the search process, are removed. The remaining content-bearing words are stemmed to a common form. Inverse frequency weights are then calculated for each of the remaining query stems, with the greatest weights assigned to those terms that occur least frequently in the document collection. The weight of each document in the collection is then calculated by taking the sum of the weights of the common terms that occur in both the query and the document. The documents are subsequently sorted based on their calculated weights. This process is known as document ranking (a sketch of this weighting and ranking scheme follows below).

In order to determine the relevancy of the documents, a certain number of top-ranking documents are displayed to the user. This process is known as relevance feedback. Based on the user's feedback, terms are extracted from the relevant documents; the selected terms are then weighted and ranked. A certain number of the top-ranking terms can be automatically added to the original query. Alternatively, they can be displayed to the user, who decides whether the new terms are to be included in the original query. This process is known as query expansion.

Examples of best match systems include the Interactive System for Teaching Retrieval (INSTRUCT) developed at the University of Sheffield [11], the Okapi System by City University in London [12] and almost all current Web search engines. There are advantages in using such systems. First, users do not need to compose their queries using logical operators. Second, the ranking of documents means that the user is more likely to find the required document at the top of the list. Third, the use of relevance feedback information to refine the query, by automatic or semi-automatic query expansion, provides a useful mechanism for the user to clarify his query and locate the required information quickly.
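The inverse-frequency weighting and ranking just described can be made concrete with a minimal sketch. This is our own illustration of the general technique, not the code of INSTRUCT, Okapi or any other system named above; all names in it are invented.

import java.util.*;

// Minimal sketch of 'best match' retrieval: inverse document frequency
// weighting followed by document ranking. Illustrative only.
public class BestMatchSketch {

    // Score every document by summing the idf weights of the query stems
    // it contains, then sort in descending order (document ranking).
    public static List<Map.Entry<Integer, Double>> rank(List<Set<String>> docs,
                                                        Set<String> queryStems) {
        int n = docs.size();
        Map<Integer, Double> scores = new HashMap<>();
        for (String stem : queryStems) {
            int df = 0;                      // number of documents containing the stem
            for (Set<String> doc : docs) if (doc.contains(stem)) df++;
            if (df == 0) continue;
            double idf = Math.log((double) n / df);  // rarest terms get greatest weight
            for (int i = 0; i < n; i++)
                if (docs.get(i).contains(stem))
                    scores.merge(i, idf, Double::sum);
        }
        List<Map.Entry<Integer, Double>> ranked = new ArrayList<>(scores.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return ranked;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = List.of(
            Set.of("expert", "system", "librar"),
            Set.of("digit", "librar", "project"),
            Set.of("expert", "system"));
        // the top-ranked documents would be shown to the user for relevance feedback
        rank(docs, Set.of("digit", "librar")).forEach(e ->
            System.out.println("doc " + e.getKey() + " score " + e.getValue()));
    }
}

In a real system the extracted feedback terms would be weighted the same way and the best of them fed back into the query (query expansion), which is why re-indexing the collection is a prerequisite for this family of systems.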
However, there are also problems with such systems. Firstly, there is a need to re-index the existing database, which can be very costly and time-consuming. This may explain why very few library OPAC search systems are 'best match' systems. Secondly, a large number of records is usually returned by each search, as is evident in searches carried out on the Web. Users are unlikely to browse through all the returned records; only a small number are ever examined. Lastly, such numerically-based systems do not utilize any extra information or domain knowledge that may be available to help users in their search.

2.2.2 Knowledge-based Search Systems

Another approach is to make use of techniques from research in expert systems. An intermediary system that removes the need for users to have knowledge of Boolean searching can be provided. Expert systems encode the knowledge of a human expert in the domain of interest and the strategies the expert uses for reasoning. In the field of information retrieval, the expert is typically the reference librarian, whose expert knowledge lies in the ability to convert natural language queries into Boolean queries, and to refine and reformulate the query according to his/her knowledge of the subject area being searched. A number of such systems based on expert system techniques like production rules, semantic nets and frames have been developed over the years, for example:
• Gauch's Query Reformulation system uses production rules to reformulate the query by manipulating the Boolean operators, by adding related terms, or by replacing terms with broader or narrower terms from a thesaurus [13].
• PLEXUS, an expert referral system, uses a frame-based representation of topics in the domain of gardening to map words to semantic primitives. It incorporates some search strategies expressed as production rules, and builds a temporary user profile based on the user's gardening experience, knowledge and location. This profile is used to set the level of help for the user [14].
• RUBRIC (Rule-Based Retrieval of Information by Computer) uses production rules to define a hierarchy of retrieval concepts. Domain knowledge is modeled as a collection of concepts; each concept contains a description, the relationships between the concept and other concepts, and the rules that describe the patterns of text that should be present to support the use of the retrieval concept. In this system, the user is required to provide domain-specific concepts to the system [15].
• Drabenstott and her colleagues developed a prototype online catalog that uses search trees or decision trees to represent how experienced librarians select a search strategy and formulate a search statement. The decision tree is represented as a flowchart [16-18].

The main advantage of the knowledge-based approach is the ability to reuse the existing database index. By building on top of existing infrastructure, libraries are more likely to adopt a knowledge-based solution to augment their existing OPAC Boolean search systems. Furthermore, the classification systems in library catalogs, such as the Dewey Classification System and the Library of Congress Subject Classification System, are already well-defined.
Such structured domain information and knowledge can thus be incorporated in any system to help users in their searches.

2.2.3 Z39.50 Information Retrieval Protocol

For many years, users were limited to performing searches at the library premises, and had to use the OPAC search system provided by each library to search its catalogs. Although most systems are Boolean search systems offering similar search capabilities, users still need to learn each of the different search interfaces in order to be proficient in accessing information from the online catalogs. To overcome this compatibility issue, the Z39.50 Information Retrieval Protocol was developed to standardize the information retrieval process. Commercial products and freeware search systems that implement the Z39.50 protocol are now available. For a list of Z39.50 products and freeware, refer to http://lcweb.loc.gov/z3950/agency/projects/software.html. We have evaluated a few of these Z39.50-compliant search systems, namely:
• SiteSearch WebZ software from the Online Computer Library Center (OCLC): http://www.oclc.org/oclc/sitesearch/components.htm
• Chalmers from Chalmers Library, Chalmers University of Technology: http://www.lib.chalmers.se/prov/Z3950/gateway.html
• Willow from the University of Washington: http://www.washington.edu/willow
• WWW/Z39.50 Gateway from the Library of Congress: http://lcweb.loc.gov/z3950/gateway.html

Most of these Z39.50-compliant search systems provide access to multiple databases, but they only support Boolean logic queries. Users are still burdened with the responsibility of query formulation and reformulation. Susannah noted that "Almost all literature on Z39.50 and its implementation focuses on the issues related to the implementor and the development of the standard. In the ten years since work on Z39.50 began, little attention has been given to the end user, the one who is supposed to ultimately benefit from the implementation of the standard." [19]

2.3 Our Approach

A study conducted by Robertson and his colleagues found very little overall difference in performance between the 'best match' search system INSTRUCT and the knowledge-based system TomeSearcher [20]. In order to help current users of library online catalogs search more effectively, the knowledge-based approach is currently the most suitable and feasible one. The approach taken in this paper is to build an intermediary system on top of the existing facilities provided by OPACs. In this way, we can reuse the existing infrastructure to reduce cost, and take advantage of the domain knowledge present in the structure of online catalog classification systems, and of the expertise of librarians in searching catalogs, to enhance the user's search session. The task of interfacing is made simpler by the widespread acceptance and use of the Z39.50 Information Retrieval Protocol.

Our expert intermediary system, the E-Referencer, encompasses the expert knowledge of reference librarians. Librarians are trained in formulating queries in Boolean logic and reformulating them to obtain sufficient relevant records. Such formulation and reformulation strategies are domain-independent and can be applied to any Boolean search system. The E-Referencer also makes use of the domain knowledge present in the structure of the classification schemes of online catalogs to assist users.
In a library OPAC system, documents and records are classified based on information such as subject headings and call numbers. This information is fully utilized in helping users maximize their searches.

2.4 Evaluation of E-Referencer and Results

The system, E-Referencer version 2.0, has been implemented and is accessible at the URL http://revelation.comp.nus.edu.sg/ERef2.0/.

An evaluation of the system was carried out using an earlier version of the E-Referencer on 12 queries selected from among those submitted by university staff and students for literature searches. The queries were selected to cover a wide range of subjects. A complete list and description of the 12 queries is given below:

Query No. – Topic
A96-7 – Digital library projects
A96-14 – Cognitive models of categorization
A97-16 – Internet commerce
B97-1 – Making a framework for surveying service quality in academic libraries
B97-3 – Looking to do a comprehensive literature review of the Sapir-Whorf Hypothesis
D96-2 – Software project management
D96-16 – Decision under uncertainty
D97-1 – Thermal conductivity of I.C. packaging
D97-2 – Fault-tolerant multiprocessor topologies
D97-4 – Face recognition
D97-7 – A study on computer literacy and use by teachers in primary schools
N97-13 – Expert systems in library reference service

The Nanyang Technological University (NTU) library system in Singapore was used in the evaluation. Searches were performed by entering the query topics into the traditional search system provided by NTU's library at http://www.ntu.edu.sg/library/opacs.html, and also into the E-Referencer prototype that we developed. The same set of queries was also given to an expert librarian, who performed her own search and reformulation using the NTU library search system. To illustrate the process of searching by the three systems, we shall use one of the queries above, A96-7 "Digital library projects".

a. NTU Library Search System
No record was returned when the query string was entered in this search system.

b. E-Referencer
The same query string was then entered into the E-Referencer, and Initial Search Strategy 1 was carried out as follows:
• Stop-words were replaced with ANDs (the query contains none, so it is unchanged): "Digital library projects"
• The words between the AND and OR operators were assumed to be adjacent: "Digital library projects"
• Individual words were stemmed and truncation signs added: "Digit? librar? project?"
• The formulated query string "Digit? librar? project?" was sent to the NTU library search system.

No record was retrieved, so Broadening Strategy 1 was activated:
• The operator AND was inserted between all adjacent words: "Digit? AND librar? AND project?"
• Once again, the NTU library search system was sent the query "Digit? AND librar? AND project?"

The search yielded four records, including:
1. Conversion of the microfilm to digital imagery: a demonstration project: performance report on the production conversion phase of Project Open Book / by Paul Conway, principal investigator.
2. Digital library visualization tool / by Yee Mun Sung.
3. Library development for mixed analog-digital circuit simulation / submitted by Ng Kian Ching, Ng Meng Hui.
4. Electronic services in academic libraries: ALA survey report / by Mary Jo Lynch, project director.

c. Expert Librarian
The expert librarian was able to retrieve 6 records from the given query string.

Searches using the other query strings from the above topics were also carried out (a sketch of the two strategies just illustrated follows below).
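The walkthrough above is a mechanical transformation of the query string. The following sketch is ours, with invented helper names and a toy stand-in for the Porter stemmer; in E-Referencer itself these steps are JESS production rules calling the Expression Module, as described in Section 4.

// Sketch of the two strategies applied to query A96-7 above.
// Helper names are ours, not E-Referencer's.
public class StrategySketch {
    static final java.util.Set<String> STOP =
        java.util.Set.of("a", "an", "is", "the", "of", "in", "for");

    // Initial Search Strategy 1: drop stop-words (the remaining words are
    // treated as adjacent) and stem each word, adding a truncation sign.
    static String initialStrategy1(String query) {
        StringBuilder sb = new StringBuilder();
        for (String w : query.toLowerCase().split("\\s+")) {
            if (STOP.contains(w)) continue;
            sb.append(sb.length() > 0 ? " " : "").append(stem(w)).append('?');
        }
        return sb.toString();                         // "digit? librar? project?"
    }

    // Broadening Strategy 1: convert adjacency into AND operators.
    static String broadeningStrategy1(String expr) {
        return expr.trim().replaceAll("\\s+", " AND "); // "digit? AND librar? AND project?"
    }

    // Toy stand-in for the Porter stemmer [22] used by the Expression Module.
    static String stem(String w) {
        for (String suf : new String[] {"ies", "ing", "al", "s", "y"})
            if (w.endsWith(suf)) return w.substring(0, w.length() - suf.length());
        return w;
    }

    public static void main(String[] args) {
        String q = initialStrategy1("Digital library projects");
        System.out.println(q);                        // sent to the OPAC first
        System.out.println(broadeningStrategy1(q));   // sent when nothing is retrieved
    }
}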
The results of the searches were given to two judges, who were asked to indicate whether the records retrieved were relevant, marginally relevant or not relevant. For this evaluation, records judged marginally relevant were considered non-relevant. The precision measure (the proportion of records retrieved that are relevant) was calculated for the first 20 records displayed. (The expert system currently displays only 20 records to the user for relevance judgment.) The mean precision over the two sets of relevance judgments was then calculated for each query. The consolidated results are given in Table 1.

Table 1: Comparison of Search Results
(columns per system: No. Displayed / No. Relevant / Precision)

Query No. | NTU Search System  | E-Referencer       | Librarian
A96-7     |  0 /  0    / 0.00  |  4 /  0.5  / 0.13  | 17   / 6   / 0.35
A96-14    |  0 /  0    / 0.00  | 20 /  1    / 0.05  |  3   / 2   / 0.67
A97-16    |  4 /  1    / 0.25  |  4 /  1    / 0.25  | 20   / 7.5 / 0.38
B97-1     |  0 /  0    / 0.00  | 20 /  3    / 0.15  | 13   / 7   / 0.54
B97-3     |  0 /  0    / 0.00  |  2 /  0    / 0.00  |  4   / 3.5 / 0.88
D96-2     | 11 / 10.5  / 0.95  | 11 / 10.5  / 0.95  | 20   / 14  / 0.70
D96-16    |  0 /  0    / 0.00  | 20 /  9    / 0.45  | 20   / 9   / 0.45
D97-1     |  0 /  0    / 0.00  |  3 /  1.5  / 0.50  | 20   / 5   / 0.25
D97-2     |  0 /  0    / 0.00  |  5 /  3.5  / 0.70  |  6   / 6   / 1.00
D97-4     |  0 /  0    / 0.00  |  8 /  2    / 0.25  | 15   / 5   / 0.33
D97-7     |  0 /  0    / 0.00  | 19 /  0    / 0.00  | 14   / 2.5 / 0.18
N97-13    |  0 /  0    / 0.00  |  4 /  3.5  / 0.88  |  5   / 4   / 0.80
Average   | 1.25 / 1.0 / 0.10  | 10 /  3.0  / 0.36  | 13.1 / 6.0 / 0.54

Note:
1. The figures given for No. Relevant and Precision are the averages of the two sets of relevance judgments by two persons.
2. The evaluation is based on the first 20 records retrieved. E-Referencer currently displays only 20 records for relevance judgment.

Clearly, Table 1 shows that the E-Referencer performed much better than the traditional library search system. The reformulation strategies used by the E-Referencer helped the user retrieve more relevant records. However, the results also show that the E-Referencer did not perform as well as an expert librarian. While this may be true from Table 1, it must be pointed out that the evaluation is skewed towards the expert librarian: the librarian was able to execute several search statements, continually refining the search after examining the records retrieved by earlier formulations. The above results show the final search set obtained by the librarian, whereas only the first non-null set retrieved is shown for the E-Referencer.

The evaluation suggests that the E-Referencer can be an efficient tool for helping online catalog users search better. The strategies used in the prototype still need to be refined further, and current work is aimed at realizing this. A complete description of the search strategies can be found in [21]. In the rest of the paper, we shall focus on the design and implementation of the E-Referencer.

3. Conceptual Design

For many years, when library users have had problems finding what they need using the search systems in the libraries, they have approached the reference librarians, who are more knowledgeable and proficient with the catalog search system. In general, librarians ask what the user is looking for, clarifying the subject or topic when necessary, before constructing a query in Boolean logic to search the catalog. If too few records are retrieved, the librarian will either change some of the Boolean operators in the query, such as changing AND to OR to get more records, or try similar keywords or broader subject headings in the reformulation.
Similarly, if too many records are retrieved, the librarian may modify the Boolean operators or use narrower subject headings to reduce the search results. This process goes on until the user is satisfied with the search result. Basically, there are two types of knowledge present in a search session:
1. Domain knowledge – classification information of records in OPACs, such as subject headings and Dewey and Library of Congress call numbers. This information can be used to help users clarify their search topic and locate the relevant documents. The hierarchy present in the various classification schemes can also be used to broaden or narrow a search.
2. Domain-independent knowledge – the search strategies librarians use to formulate the user's original query in the format expected by the OPAC (Boolean logic), and to reformulate the query by modifying the Boolean operators to get more or fewer records.

In the design of the E-Referencer, we have incorporated these two types of knowledge into the system. A conceptual knowledge base of domain knowledge has been incorporated into the E-Referencer to map keywords to concepts represented in the subject headings. The domain-independent knowledge of formulation and reformulation rules has been implemented as search strategies in the E-Referencer. For a complete description of the search strategies used in the E-Referencer, the reader is referred to an earlier published paper [21].

4. System Design and Implementation

Having illustrated the potential of the E-Referencer, we now describe its design and implementation.

4.1 Design Approach and Considerations

The approach we have adopted in developing our system is that of rapid prototyping and incremental development. We first implemented an initial prototype using simple strategies specified by an experienced librarian. We then carried out experiments to evaluate the system and compare its performance with that of experienced librarians. From this, we identify the areas in which the system is deficient and how it can be improved. The prototype system was designed to study the reasons why experienced librarians are superior in their searches to ordinary users, and how expert search systems can be designed to match what experienced librarians can do. This approach of rapid prototyping was necessary because, at the start of the project, we did not know which strategies would be the most effective ones to use; a detailed planning and design approach was thus not feasible. This cycle of incremental development, testing and redevelopment has allowed us to add new features and refine the system gradually.

The increasingly popular Z39.50 Information Retrieval protocol was used to provide a common interface to multiple online catalogs. Search strategies as used by librarians have been incorporated into the knowledge base of the prototype system using JESS.

E-Referencer uses a three-tier design architecture consisting of a client, a proxy and a server. The client handles user interaction, the server (a Z39.50 server) contains the data and search strategies, and the proxy sits between the client and the server. This approach is necessary because the subject heading database required by the conceptual knowledge base is huge, around 300 megabytes. It is not feasible to send this large database across the network to every client in order to extract the subject headings for what is usually a few keywords (at most ten). In our design, the proxy houses the database and handles all the processing required on the conceptual knowledge base.
In our design, the proxy houses the database and handles all the 13 processing required on the conceptual knowledge base. Keywords are submitted by clients to the proxy, which will then search the subject heading database and return the relevant subject headings to the clients. The traffic generated in this case is greatly reduced since only the keywords and the resulting subject headings are returned. Another advantage of this approach is that we can log all the activities carried out by the different clients; the log information can be useful for further analysis. 3.2 Design of the E-Referencer The design of the E-Referencer is shown in Figure 1. It consists of the following modules: Client Modules a. The Graphical User Interface (GUI) Module handles the interaction between the user and E- Referencer. b. The Network Interface Module communicates with the E-Referencer proxy. c. The Expression Module provides functions for manipulating a search expression. d. The Control Module is the heart of the expert system. It controls and calls the various functions of the system and has the following components: • A Knowledge Base of search strategies • A Fact Base which stores the intermediate search results and information needed to select the next search strategy. • An Explanation Facility for explaining why and how certain strategies were chosen. e. The Knowledge Module contains wrapper functions for integrating the expert system script of the Control Module with the other modules of E-Referencer. Proxy Modules a. The Proxy Controller Module accepts new connections from clients and activates the appropriate modules to handle the various clients’ requests. b. The Keyword-Subject Association Module provides a list of subject headings that associates with the keywords users specify in their query. The subject headings are used to augment the user’s original query to perform a more accurate search. 14 c. The OCLC Z39.50 Client API provides functionality for connecting to, searching and retrieving information from the various library systems that support the Z39.50 protocol. d. The Z39.50 Interface Module provides a clean interface to the OCLC Z39.50 Client API. It isolates the rest of the system from changes to the OCLC Z39.50 Client API. 3.3 Client Modules Design and Implementation a. Graphical User Interface Module The widespread use of graphical-based operating systems like Windows, OS/2 and X-Windows have greatly increased the demand for programs written with graphical user interfaces. The proper usage of graphical user interface provides a very simple and friendly way for the user to interact with the system. The mouse pointer allows for easy manipulation of the system and the use of graphical items like buttons and scroll windows allows the system to present its information to the user clearly and effectively. Thus, the E-Referencer, which will eventually be used by ordinary users, has to support a graphical user interface. In addition, we hope to make the E-Referencer easily accessible to all online catalog users, and thus a Web-based graphical interface is required in the design of the E-Referencer. We have also designed the user interface to be simple, so that it is easy to use. The interface contains only one keyboard input area for the user to enter the query string, so that users will not need to spent too much time learning how to use the system. Limited information on the search results is displayed; the information includes title, author and publisher information. 
The records retrieved are also arranged in a list for easy browsing. Since the E-Referencer is also used as an experimental tool to help us refine and test our search strategies, we have included a server option and a strategy option in the design. The server option allows us to search different Z39.50 servers, while the strategy option allows us to use different strategies for reformulation purposes.

Figure 1: Design of E-Referencer (architecture diagram: the Client, comprising the Graphical User Interface, Expression, Network Interface, Knowledge and Control Modules, the latter with its Knowledge Base, Fact Base, Explanation Facility and JESS functions; the Proxy, comprising the Proxy Controller Module, the Keyword-Subject Association Module with its Subject Heading Database, the Z39.50 Interface Module and the OCLC Z39.50 Client API for Java; and the external Z39.50 servers of the Library A and Library B online catalogs)

Screenshots of the main user interface and the feedback window are shown in Figures 2 and 3.

Figure 2: E-Referencer Frame (main GUI)

b. Expression Module
This module provides functions to manipulate the search expression that a user keys into the E-Referencer. The functions implemented are:
• Removing stop-words like a, an, is, the, which do not help in the search session.
• Stemming words to remove suffixes, to get a common form for retrieving more records. Porter's algorithm [22], available at the Glasgow IDOM – IR Resources Web site (http://www.dcs.gla.ac.uk/idom/ir_resources/liguistic_utils/), was selected because of its simplicity and fast implementation.
• Conversion between AND, OR and adjacency operators in a search string. Example:
  AND operators -> "expert" AND "systems"
  OR operators -> "expert" OR "systems"
  Adjacency operators -> "expert systems"
• Creating combinations of two or three keywords. Example, for the keywords Expert, Systems, Internet, Intranet:
  Two-keyword combinations -> "Expert Systems" OR "Expert Internet" OR "Expert Intranet" OR "Systems Internet" OR "Systems Intranet" OR "Internet Intranet"
  Three-keyword combinations -> "Expert Systems Internet" OR "Expert Systems Intranet" OR "Expert Internet Intranet" OR "Systems Internet Intranet"

Figure 3: Feedback Window

c. Control Module
JESS, developed by Sandia Labs, is used to represent the expert knowledge in the E-Referencer. JESS is written entirely in Java and implements a subset of the CLIPS [23] language, which uses production rules to represent knowledge. A production rule consists of a list of facts followed by a list of actions. When all the facts in the list are asserted, the rule is said to have fired, and the list of actions is executed. Actions can include asserting more facts, which may fire more rules. In JESS, all production rules are specified in a script. JESS comes with a set of standard functions and provides a mechanism for wrapping functions written in Java as JESS functions. All these functions can then be called from within the JESS script, which allows for greater flexibility in manipulating the system. JESS was chosen because its syntax is simple and search strategies can easily be specified as production rules. Furthermore, the ability to integrate Java functions into JESS makes it very favorable for our use, since E-Referencer is written in Java. JESS uses an inference engine based on the Rete (Greek for net) algorithm [24] to process the production rules. The Control Module creates this JESS inference engine, which processes the production rules contained in the JESS script.
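Before looking at the JESS syntax itself, the fire-on-facts mechanism just described can be illustrated with a small, self-contained sketch. This is our own generic illustration, not JESS code; it mirrors the Rule 1 / Rule 2 example given later in this section.

import java.util.*;
import java.util.function.Consumer;

// Toy illustration of the production-rule mechanism: a rule is a list of
// facts plus a list of actions; when all its facts are present in the fact
// base, the rule fires and its actions run (possibly asserting new facts,
// which may fire further rules).
public class ProductionSketch {
    record Rule(Set<String> facts, Consumer<Set<String>> actions) {}

    static void run(Set<String> factBase, List<Rule> rules) {
        boolean fired = true;
        Set<Rule> done = new HashSet<>();     // stands in for retracting the trigger fact
        while (fired) {
            fired = false;
            for (Rule r : rules) {
                if (!done.contains(r) && factBase.containsAll(r.facts())) {
                    r.actions().accept(factBase);   // fire the rule
                    done.add(r);
                    fired = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        Set<String> facts = new HashSet<>(Set.of(
            "No. of records retrieved = 0", "No. of words in query > 1"));
        List<Rule> rules = List.of(
            new Rule(Set.of("No. of records retrieved = 0",
                            "No. of words in query > 1"),
                     fb -> fb.add("Broadening Strategy 1")),
            new Rule(Set.of("Broadening Strategy 1"),
                     fb -> System.out.println("convert adjacent operators to and")));
        run(facts, rules);
    }
}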
The components of the Control Module are:
• A Knowledge Base of search strategies. The strategies are represented as production rules in the JESS script EReferencer.clp. Example of a search strategy:
  If "No. of records retrieved = 0" and "No. of words in query = 1"
  then assert "Broadening Strategy 6" (prompt user to enter synonyms)
• A Fact Base which stores the intermediate search results and the information needed to select the next search strategy. The intermediate search results and information are represented as asserted facts in JESS, which may lead to the firing of other production rules, in some cases the firing of a rule which contains another search strategy. Example of facts:
  "No. of records retrieved = 10"
  "No. of words in query = 5"
• An Explanation Facility, to explain why and how a certain strategy is chosen. The explanations are embedded in the production rules.

Below is a sample piece of code for a search strategy implemented as a production rule in the JESS script. For a more thorough understanding of the JESS syntax and rules, refer to the JESS README at http://herzberg.ca.sandia.gov/jess/README.html.

(defrule BROADENING_STRATEGY1                                            1
  ; Expression has adjacency operators. Convert them into ands.

  ?expr <- (Expr SearchExpr ?str)
  ?strategy <- (BroadeningStrategy 1)                                    5
  =>
  (retract ?strategy)
  (miscPrintout "Broadening Strategy 1: convert adjacent operators to and")
  (miscPrintout "Checking if adjacency operators are present.")
  (if (exprHasAdjWords ?str)                                             10
    then
      (retract ?expr)
      (miscPrintout "Adjacency operators found.")
      (bind ?newstr (exprAdjToAnd ?str))
      (assert (Expr SearchExpr ?newstr))                                 15
      (assert (KeyWordSearch))
    else
      (miscPrintout "No adjacency operators found.")
      (assert (BroadeningStrategy))
  )                                                                      20
)

The sample code is the production rule representing "Broadening Strategy 1". The lines before the => represent the list of facts, while the lines after => represent the list of actions. The Fact Base is implemented as asserted facts. The fact (Expr SearchExpr ?str) is always asserted; it is used to store the current search query in the variable ?str. Thus, in this case, the above rule is fired when the fact (BroadeningStrategy 1) is asserted in the Fact Base. Upon firing the rule, the fact (BroadeningStrategy 1) is retracted from the Fact Base to prevent further firing of the same rule. The Explanation Facility is implemented as code embedded in the different rules of the script. The code at lines 8, 9, 13 and 18 is part of the Explanation Facility; it indicates to the user the strategy used and the actions executed. In this sample code, Broadening Strategy 1 is used, and a check is performed to determine if there are any adjacent words in the query. Line 14 shows how the JESS script calls the JESS function exprAdjToAnd from the Knowledge Module. This function is a Java function wrapped as a JESS function, and the method Call of the private class ExprAdjToAnd is invoked. Other facts are then asserted into the Fact Base so as to fire other rules (lines 15, 16), which may then select another strategy or perform some other function.

d. Knowledge Module
In order to integrate the rest of the modules written in Java (e.g. the Expression, Subject, Z39.50 Interface and GUI Modules) with JESS, we need to create wrappers for them. All these wrapper functions are grouped according to module: all the wrapper functions for Expression are grouped under Expression Functions, all the wrapper functions for the Z39.50 Interface are grouped under Z3950 Functions, and so on.
A prefix is added to each wrapper function to denote the module it belongs to. The Knowledge Module thus contains all these groups of wrapper functions. Examples of wrapper functions:
  Expression Functions { exprStem, exprRemoveStopWord, exprAndToOr ... }
  Z3950 Functions { z39Connect, z39Search, z39Display ... }
  GUI Functions { GUIFeedbackDialog, GUIFrame ... }

When facts in the Fact Base of the Control Module are asserted, certain rules in the Knowledge Base are fired to select an appropriate strategy. The actions in the strategy make calls to the wrapper functions in the Knowledge Module to perform the required operations. For example, given these rules:

  Rule 1: If "No. of records retrieved = 0" and "No. of words in query > 1"
          then assert fact "Broadening Strategy 1"
  Rule 2: If "Broadening Strategy 1"
          then ExprAdjToAnd <- wrapper function call: Convert Adjacent to And

When a user enters the query "Expert Systems", the fact "No. of words in query = 2" is asserted in the Fact Base. The search is performed but no record is retrieved, so the fact "No. of records retrieved = 0" is then asserted. These two facts fire Rule 1 in the Knowledge Base, which asserts the fact "Broadening Strategy 1"; this in turn fires Rule 2. Rule 2 calls the wrapper function ExprAdjToAnd in the Knowledge Module to activate the appropriate Java function in the Expression Module, converting the adjacency operators in the original query to AND operators. A new query, "Expert AND Systems", is eventually created and used to perform another search.

Each group of wrapper functions is implemented as a public Java class implementing the JESS Userpackage interface. The various wrapper functions are created as private classes and added into the JESS inference engine. Sample code:

public class ExprFunctions implements Userpackage {
    public void Add(Rete engine) {
        engine.AddUserfunction(new exprRemoveStop());
        engine.AddUserfunction(new exprStem());
        // ... one AddUserfunction call per wrapper function
    }
}

The above code shows how the class ExprFunctions implements the group of wrapper functions for the Expression Module. Other groups are implemented similarly. Each wrapper function is implemented as a private class implementing the JESS Userfunction interface, which requires a private attribute _name to store the name the programmer uses for invoking the function in the JESS script. The public method Call then defines the operations to be performed when the function is called in the JESS script. Sample code:

class exprRemoveStop implements Userfunction {
    Expression ex = new Expression();
    int _name = RU.putAtom("exprRemoveStop");

    public int name() { return _name; }

    public Value Call(ValueVector vv, Context context) throws ReteException {
        String expr = "";
        if ((vv.size() == 2) && (vv.get(1).type() == RU.STRING)) {
            expr = vv.get(1).StringValue();   // the argument passed from the script
            expr = ex.removeStop(expr);       // remove the stop-words
        }
        return new Value(expr, RU.STRING);
    }
}

This code shows how the class exprRemoveStop wraps the Java function ex.removeStop of the Expression Module as the JESS function exprRemoveStop. All other Java functions are wrapped in the same way.

e. Network Interface Module
This module sets up the connection between the client and the proxy. Search requests issued by the client are submitted to the proxy through this module, and the search result returned by the proxy is collected by this module before being displayed to the user. All networking functions are implemented via the Socket class in Java, along the lines of the sketch below.
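The paper does not specify the wire format between client and proxy, so the following is only a minimal sketch of what such an exchange could look like; the host, port and line-oriented message format are invented for illustration.

import java.io.*;
import java.net.Socket;

// Minimal sketch of a client-to-proxy exchange over a Java Socket.
// Host, port and message format are hypothetical.
public class NetworkInterfaceSketch {
    public static String submitSearch(String query) throws IOException {
        try (Socket s = new Socket("proxy.example.edu", 9000);   // hypothetical proxy
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            out.println("SEARCH " + query);          // send the search request
            StringBuilder result = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null)   // collect the returned result
                result.append(line).append('\n');
            return result.toString();
        }
    }
}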
4.4 Proxy Modules Design and Implementation

a. Proxy Controller Module
As the name implies, this module controls all the activities and transactions carried out in the proxy. It accepts connections from E-Referencer client applets and establishes connections with the selected Z39.50 server on behalf of the clients; the Z39.50 Interface Module is invoked for the connection. If a client applet requests the subject headings associated with the keywords it submits, the Keyword-Subject Association Module is invoked by the controller to retrieve the relevant subject headings. If logging is required at a later stage, it can be implemented in this module, since this module handles all transactions.

The Proxy Controller Module is implemented using Java threads. When the proxy starts up, a controller thread is created to listen for client requests. Each time a new client connection request is received, the controller thread instantiates two new threads to handle all future requests from that client: one thread handles all activities between the client and the proxy, while the other handles all activities between the proxy and the Z39.50 server the client is connecting to. With two separate threads there is continuous communication, since the blocking of one communication channel does not affect the other. When the connection to the client dies, the two threads are killed and reclaimed.

b. OCLC Z39.50 Client API
The E-Referencer is developed as a system capable of searching the many different online library catalogs available. Thus, a standard system interface module to all these online catalogs is needed. By implementing the system interface based on the Z39.50 protocol, users will be able to access all existing and newly created Z39.50-compliant online catalogs. A search on the Web for existing resources that could be used in our development effort revealed the following class libraries and toolkits:
• OCLC Z3950 Client API. The API is written entirely in Java and implements the latest version of the Z39.50 protocol, version 3, released in 1995. The API is provided free and comes with source code.
• Index Data's YAZ (Yet Another Z39.50 Toolkit), a toolkit for implementing the Z39.50v3 protocol. The toolkit supports both the Z39.50 and the ISO 10163 SR protocols, and both the Origin (client) and Target (server) roles of the protocol. The toolkit is written in C. YAZ is also provided free.
• ZedKit for Unix. These Z39.50 application development libraries were developed for the German Library Project DBV OSI II and the ONE project co-funded by the European Commission Libraries Programme, and are written in C/C++.

We tested the OCLC Z39.50 Java API by using the sample client application, zclient, that comes with the API to connect to a few Z39.50-compliant online catalogs. Since there are many different Z39.50 server implementations, such as INNOPAC and DRA, it was necessary to test the OCLC API by connecting it to a representative few of the servers implemented by the different vendors. The API was tested with the DRA server at the NTU Library, the INNOPAC server at the National University of Singapore (NUS) and the Ameritech HORIZON server at Clarke College, Dubuque, Iowa. Although some fine-tuning was needed due to differences in implementation by the different vendors, we managed to connect to the various servers, specify the database to search, send some sample search queries and retrieve the records from all the servers.
From the tests, we found the OCLC Z39.50 Client Java API most suitable for our use. The API is written entirely in Java, which makes it easy to integrate into our design framework, and it supports many functions of the latest version of the Z39.50 protocol.

c. The Z39.50 Interface Module
Having decided to adopt the OCLC Z39.50 Client API for development, we designed the Z39.50 Interface Module to serve as a "wrapper" module on top of the OCLC Z39.50 Client API. The module provides simple Z39.50 functions like connect, search, retrieve and close for use by the E-Referencer; these functions are implemented using the OCLC Z39.50 Client API. There are two advantages to this design. Firstly, the Z39.50 Interface Module isolates the rest of the program from the OCLC Z39.50 Client API code. In the event of changes to the OCLC Z39.50 Client API, or even a change to a different API, we will just need to modify the code in the Z39.50 Interface Module, while keeping the rest of the program intact. Secondly, by grouping the primitive OCLC API procedures into higher-level procedures like search and retrieve, we have simplified the coding of the other modules that make use of Z39.50 functions.

A new z39client class was created to provide Z39.50 functions like connect, search and retrieve for the proxy. Each function is implemented as a method and makes calls to the OCLC Z39.50 Client API to perform its operation. From our implementation, we found that the zclient application implemented most of the Z39.50 functions supported by the API. We modified the zclient source code to create our own z39client class. Most of the existing methods in zclient were modified and reused, but some of the functions provided by zclient were too simple, and we had to create new methods to suit our needs. For example, the display method in zclient does not guarantee that the requested number of records is retrieved and displayed. We therefore created a retrieve method in our z39client class, which uses a loop to call the display method repeatedly, guaranteeing that the specified number of records is retrieved.

d. Keyword-Subject Association Module
The conceptual knowledge base that maps free-text keywords to concepts represented as LC (Library of Congress) subject headings is a very useful tool for users to clarify their search topic. This conceptual knowledge base is implemented as the Keyword-Subject Association Module in the proxy. The module accesses a subject heading database that contains the keyword-subject heading mappings for all the keywords found in the LC bibliographic catalog from 1980-1998. As mentioned earlier, this large database is one of the main reasons that necessitated the three-tier design architecture.

The major task in implementing this module was the creation of good data structures for storing the keyword-subject heading mappings to allow efficient retrieval. Since the database is located at a centralized proxy, disk space is not a constraint. An inverted file is thus used to store the keyword-subject heading mappings, allowing fast and efficient retrieval. A Subject Heading Map is also created to map each subject heading to a unique number; this number is used in the keyword-subject heading inverted file to represent the subject heading. This greatly reduces the size of the inverted file, because the numbers require much less storage than the subject heading strings. The Subject Heading Map is implemented as a sorted text file, so that a binary search algorithm can quickly map subject headings to their unique numbers and vice versa, as sketched below.
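The two data structures just described can be pictured with a small in-memory sketch; the headings and keyword mappings here are invented examples, and the real module works against the 300-megabyte on-disk database.

import java.util.*;

// Sketch of the Keyword-Subject Association Module's data structures:
// an inverted file mapping each keyword to the numbers of the subject
// headings it occurs in, plus a sorted list of headings whose index
// serves as the unique number (binary-searchable in both directions).
public class SubjectHeadingSketch {
    // Sorted "Subject Heading Map": the index is the unique number.
    static final String[] HEADINGS = {
        "Education--Data processing", "Educational technology", "Expert systems"
    };
    // Inverted file: keyword -> heading numbers.
    static final Map<String, int[]> INVERTED = Map.of(
        "education", new int[] {0, 1},
        "technology", new int[] {1},
        "expert", new int[] {2});

    static List<String> headingsFor(String keyword) {
        List<String> result = new ArrayList<>();
        for (int n : INVERTED.getOrDefault(keyword.toLowerCase(), new int[0]))
            result.add(HEADINGS[n]);                    // number -> heading string
        return result;
    }

    static int numberFor(String heading) {
        return Arrays.binarySearch(HEADINGS, heading);  // heading -> number
    }

    public static void main(String[] args) {
        System.out.println(headingsFor("education"));   // headings for a query keyword
        System.out.println(numberFor("Expert systems")); // reverse lookup
    }
}

Storing numbers rather than heading strings in the inverted file is what keeps it compact, which matters at the scale of all keywords in the 1980-1998 LC bibliographic catalog.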
The use of this greatly reduces the size of the inverted files, because the numbers require much less storage than the subject heading strings. This Subject Heading Map is implemented as a sorted text file so that we can use a binary search algorithm to map subject headings to their unique numbers and vice-versa quickly. 5. Conclusion The current online catalog search systems do not provide enough assistance for users in their searches. There have been various attempts to develop new systems that utilize different approached to help users in this area. Such systems can be broadly categorized as ‘best match’ systems or knowledge- based systems. Both systems have their merits and problems. However, we felt that the knowledge- based approach is more suitable for use in this domain of online catalog searching, and have thus have adopted it to solve the above-mentioned problem. 26 A Web-based search interface known as E-Referencer has been developed to provide an accessible and useful search tool for online catalog users. Presently, the E-Referencer is also used as a tool for experimenting with search strategies to create a system that is capable of helping users search effectively. Early test results are encouraging and refinements are still being done on the E-Referencer. We hope that with more testing and iterations of refinements, it can eventually be deployed for widespread use as an effective searching tool for online catalog users. References 1. Z39.50 Maintenance Agency. URL http://lcweb.loc.gov/z3950/agency/. 2. JESS, the Java Expert System Shell. URL http://herzberg.ca.sandia.gov/jess/. 3. Cousins, S. A.: In their own words: An examination of catalogue users’ subject queries. J. Amer. Soc. Inf. Sci. 46 (1992) 329-341. 4. Dalrymple, P.W.: Retrieval by Reformulation in Two Library Catalogs: Toward a Cognitive Model of Searching Behavior. J. Amer. Soc. Inf. Sci. 41 (1990) 272-281. 5. Ensor, P.: User Practices in Keyword and Boolean Searching on an Online Public Access Catalog. Inf. Tech. Libr. 11 (1992) 210-219. 6. Lancaster, F.W., Connell, T.H., Bishop, N., McCowan, S.: Identifying Barriers to Effective Subject Access in Library Catalogs. Libr. Reso. Tech. Serv. 35 (1991) 377-391. 7. Markey, K.: Subject Searching in Library Catalogs: Before and After the Introduction of Online Catalogs. OCLC Online Computer Library Center, Dublin, OH. (1984). 8. Hildreth, C.: Beyond Boolean: Designing the Next Generation of Online Catalogs. Libr. Trends 35 (1987) 647-667. 9. Borgman, C.L.: Why are Online Catalogs Still Hard to Use? J. Amer. Soc. Inf. Sci. 47 (1996) 493- 503. 10. Khoo, C., Poo, C.C.D.: An Expert System Front-End as a Solution to the Problems of Online Catalogue Searching. In: Information Services in the 90s: Congress Papers. Library Association of Singapore, Singapore (1991) 6-13. 11. Al-Hawamdeh, S., Ellis, D., Mohan, K.C., Wade, S.J., and Willet, P.: Best match of document retrieval: development and use of INSTRUCT. Proceedings of the Twelfth International Online Information Meeting, (1998) 761-767. 12. Robertson, S.E.: Overview of the Okapi Projects. J. of Doc. Vol. 53, no.1 (1997) 3-7. 13. Guach. S: Search improvement via automatic query reformulation. ACM Transactions of Information Systems, 9 (1991) 14. Vickery, A., Brooks, H.M.: PLEXUS – The expert system for referral. Information Processing & Management, 23 (1987) 99-117. 15. Tong, R.M., Applebaum, L.A., Askmann, V.N., Cunningham, J.F.: Conceptual information retrieval using RUBRIC. 
In: Yu, C.T., van Rijsbergen, C.J. (eds.): Proceedings of the Tenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1987) 247-253.
16. Drabenstott, K.M.: Enhancing a New Design for Subject Access to Online Catalogs. Libr. Hi Tech 14 (1996) 87-109.
17. Drabenstott, K.M., Weller, M.S.: Failure Analysis of Subject Searches in a Test of a New Design for Subject Access to Online Catalogs. J. Amer. Soc. Inf. Sci. 47 (1996) 519-537.
18. Drabenstott, K.M., Weller, M.S.: The Exact-Display Approach for Online Catalog Subject Searching. Inf. Proc. Manag. 32 (1996) 719-745.
19. Z39.50: An Overview of Development and the Future. URL http://www.cqs.washington.edu/~camel/z/z.html.
20. Robertson, A.M., Willett, P., Vickery, A., Thompson, W.: Comparison of statistically-based and knowledge-based approaches to information retrieval. Inf. 90 (1990) 282-286.
21. Khoo, C.S.G., Poo, D.C.C., Liew, S.-K., Hong, G., Toh, T.-K.: Development of Search Strategies for E-Referencer, an Expert System Web Interface to Online Catalogs. In: Toms, E., Campbell, D.G., Dunn, J. (eds.): Information Science at the Dawn of the Millennium: Proceedings of the 26th Annual Conference of the Canadian Association for Information Science. CAIS, Toronto (1998).
22. Porter, M.F.: An Algorithm for Suffix Stripping. Program 14 (1980) 130-137.
23. CLIPS: A Tool for Building Expert Systems. URL http://www.ghg.net/clips/CLIPS.html (1997).
24. Forgy, C.L.: Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem. Artificial Intelligence 19 (1982) 17-37.
work_nm3sfzvatvdhrotz4v477gmjni ---- Customized Mapping and Metadata Transfer from DSpace to OCLC to Improve ETD Work Flow

Customized Mapping and Metadata Transfer from DSpace/SOAR to OCLC to Improve ETD Work Flow
• Sai Deng, Susan Matveyeva, Tse-Min Wang, Wichita State University Libraries
• Consultant: Terry Reese, Oregon State University Libraries

Outline
• Thesis cataloging workflow dynamics: overview of changes
• Cataloging ETDs in SOAR and OCLC/Voyager: records and workflow
• Improving ETD workflow through metadata harvesting, customized mapping and metadata transfer

Workflow for Paper Theses
• 1929-2002 – over 80% of records (~5,000)
• 70-year range: stable record structure
• Workflow: (1) original cataloging (2) item marking/labeling
• Cataloging efficiency: constant data
• Labor intensive: subject headings (SH)

Presenter Notes: The WSU Library catalog, Voyager, has over 6,000 records of WSU dissertations and Master's theses. The records span 70 years; those from 1929-2000 have a similar structure. Over 80% of all thesis records are similar to the record shown on this slide. The workflow for cataloging a thesis was similar to typical monograph cataloging; the most labor-intensive part was subject analysis. The majority of these records have two subject headings, but some are short records with no subject headings.

Thesis MARC Record (till 2002)
000 01093nam a2200277 i 4500
001 331612
005 19991028065706.0
008 780705s1977 ksu 000 0 eng d
035 __ |a (OCoLC)ocm04023056
035 __ |9 ABK7544WS
040 __ |a KSW |c KSW
099 __ |a LD |a 2667 |a .T4 |a V871d
100 1_ |a Vliet, Martha Tasheff.
245 12 |a A descriptive study of obstetric patients' knowledge of and self reported attitudes toward the prenatal experience / |c by Martha Tasheff Vliet.
246 3_ |a Patients' perceptions of prenatal experience
260 __ |a Wichita, Kan. : |b WSU, |c 1977.
300 __ |a viii, 75 leaves ; |c 29 cm.
490 1_ |a Wichita State University. Theses
500 __ |a Also in University Archives: THESIS.
500 __ |a Title on spine: Patients' perceptions of prenatal experience.
502 __ |a Thesis (M. Ed.) - Wichita State University, December 1977. Department of Instructional Services.
504 __ |a Bibliography: leaves 48-52.
650 _0 |a Pregnancy.
650 _0 |a Pregnancy |x Psychological aspects.
650 _0 |a Prenatal care.
810 2_ |a Wichita State University. |t Thesis.
Theses Digitization, Workflow & Records
• 2003-2004: digitization of WSU theses began
• UMI/ProQuest affects the workflow
• Linking Voyager records to UMI/ProQuest

Presenter Notes: Explain how OCLC/Voyager and UMI/ProQuest interact. Record enhancements (fields/contents): 856 – links from the catalog to full text in UMI; 520 – author abstracts; 500 & 700 – advisor's name. Workflow changes: in special projects, repetitive data entry goes to students; the cataloger creates the procedure and a macro for speedy processing, trains the students, and reviews their work.

Thesis Bib Record 2004 (MARC)
000 03794ctm a2200289Ia 4500
001 1172115
005 20070208132604.0
008 050201s2004 xx a bm 000 0 eng d
035 __ |a (OCoLC)ocm57545066
035 __ |a 1172115
040 __ |a KSW |c KSW
049 __ |a KSWA
050 _4 |a LD2667.T42 |b P437733
099 _9 |a Microfilm 1391
100 1_ |a Perera, Bupani Asiri.
245 12 |a A comparison of multiple-stage tandem MS of protonated and metal cationized peptides in the context of direct sequencing and sequence tag generation / |c by Bupani Asiri Perera.
260 __ |c 2004.
300 __ |a xiv, 136 leaves : |b ill. ; |c 29 cm.
502 __ |a Thesis (Ph.D.)--Wichita State University, College of Liberal Arts and Sciences, Dept. of Chemistry.
500 __ |a "July 2004."
500 __ |a Thesis advisor: Michael J. Vanstipdonk.
504 __ |a Includes bibliographical references (leaves 128-136).
520 8_ |a [Author abstract] We have examined the multiple stage collision … we bind to the metal ion significantly …
700 12 |a Vanstipdonk, Michael J. |e advisor
810 2_ |a Wichita State University. |t Thesis.
856 40 |u http://proxy.wichita.edu:2048/login?url=http://wwwlib.umi.com/cr/wichita/fullcit?p3137654 |z Click here for available full-text of this dissertation via Current Research@Gateway.
994 __ |a C0 |b KSW

Transitional Period: 2004-2006
• e-Theses in four places: OCLC/Voyager, ProQuest, a temporary web site, and SOAR
• Paper theses are still submitted
• Development of a new workflow for ETDs: e-docs, paper docs, inventory table
• Naming convention, ETD file preparation
• MARC and DC manual input; further changes in records (identifiers)

(New additions to the ETD record: identifiers of the several databases that hold this thesis. The record consists of 30 fields.)

000 03279ctm a2200433Ia 4500
001 1245843
005 20080422003723.0
006 m d
007 cr m||||||||||
008 070423s2005 xx a sbm 000 0
020 __ |a 9780542757921
020 __ |a 0542757923
024 7_ |a AAT 1436580 |2 UMI
024 8_ |a 778 SOAR
035 __ |a (OCoLC)ocn123426976
035 __ |a 1245843
040 __ |a KSW |c KSW
049 __ |a KSWA
099 _9 |a Microfilm 1502
099 __ |a t05040
100 1_ |a Radhakrishnan, Preetha.
245 10 |a Enhanced routing protocol for graceful degradation in wireless sensor networks during attacks |h [electronic resource] / |c by Preetha Radhakrishnan.
260 __ |c 2005.
300 __ |a xii, 50 leaves : |b ill., digital, PDF file.
500 __ |a "December 2005."
504 __ |a Includes bibliographic references (leaves 48-50).
500 __ |a Title from PDF title page (viewed on April 23, 2007).
533 __ |a Electronic reproduction. |b Ann Arbor, MI : |c ProQuest Information and Learning Company, |d c2006.
538 __ |a System requirements: Adobe Acrobat Reader.
538 __ |a Mode of access: World Wide Web.
502 __ |a Thesis (M.S.)--Wichita State University, College of Engineering, Dept. of Electrical and Computer Engineering.
500 __ |a Thesis adviser: Ravi Pendse.
500 __ |a UMI Number: AAT 1436580
520 3_ |a [Author's abstract] With the deployment of Sensor networks gaining some …
655 _0 |a Electronic dissertations.
700 12 |a Pendse, Ravindra. |e advisor
856 40 |u http://proxy.wichita.edu:2048/login?url=http://wwwlib.umi.com/cr/wichita/fullcit?p1436580 |z Click here for available full-text of this thesis via Current Research@Gateway.
856 40 |u http://soar.wichita.edu/dspace/handle/10057/778 |z A link to full text of this thesis in SOAR

Presenter Notes: Further changes in the thesis record: the addition of identifiers of the databases that hold this title. The reason to include identifiers in a record is workflow efficiency; the workflow consists of many small operations that are easier to perform using identifiers.
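The identifiers visible in the records and in the inventory table and SOAR record that follow (t05040, d07001, d06005) suggest a file-naming pattern: a type letter, a two-digit year, and a three-digit running number. The sketch below assumes that reading; it is our inference, not something stated on the slides.

// Sketch of the ETD file-naming convention inferred from the identifiers
// above and below (d06005, d07001, t05040): one letter for the document
// type, two digits for the year, three digits for the sequence number.
// The 'd' = dissertation / 't' = thesis reading is our assumption.
public class EtdIdSketch {
    static String makeId(boolean dissertation, int year, int seq) {
        return String.format("%s%02d%03d",
                dissertation ? "d" : "t", year % 100, seq);
    }

    public static void main(String[] args) {
        System.out.println(makeId(true, 2007, 1));   // d07001
        System.out.println(makeId(false, 2005, 40)); // t05040
    }
}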
ETD Program 2006-2008
• Since 2006, WSU has had a full-scale ETD program (400 records, 2005-2007)
• e-Theses only (no paper); no ProQuest stage or temporary access to ETDs via a web site
• e-Theses are in three databases: SOAR, OCLC and Voyager
• The workflow includes a number of operations on the digital file (thesis) and on the metadata records (MARC and DC)

Inventory Table (columns, with a sample row)
Pdf ID: d07001; No: 1; Last Name: Smith; First Name: John; Year: 2007; Month: May; GS send list: date; PDF harvested: date; PDF property filled: date; PDF submitted to UMI: date; PDF secured: date; PDF re-named: date; GS paperwork received: date; Soar ID: 1074; Voyager Bib: 1262388; UMI ID: 3240865; UMI Link: yes/no; Soar Link: yes/no; Microfilm No: 2740; Link Checked: date; Note: (blank)

ETD Workflow: Manual Input of DC & MARC
The Improved Workflow: no draft record and no manual MARC input

A Wider Context of ETD Workflow
• ETD workflow in different institutions:
  – University of Virginia (1999), Texas A&M (2004): home-grown scripts, site-specific harvesters
  – Kent State University (2007): harvest from the OhioLINK ETD Center, ETD-MS to MARC…
• XSLT transformation:
  – LC MARC 21 XML schema with the MARCXML toolkit
  – Dublin Core to MARCXML stylesheet: http://www.loc.gov/standards/marcxml/xslt/DC2MARC21slim.xsl
  – OAI community developed tools, mostly for IT staff
• MarcEdit (Terry Reese): metadata harvester, MARC editor
  – Low-barrier harvester, can be used by catalogers
  – http://oregonstate.edu/~reeset/marcedit/html/index.php

Sample Record in SOAR (Dublin Core)
dc.contributor.author – Niles, Rae
dc.date.accessioned – 2006-12-24T14:56:10Z
dc.date.available – 2006-12-24T14:56:10Z
dc.date.copyright – 2006
dc.date.issued – 2006-05
dc.identifier.other – d06005
dc.identifier.uri – http://hdl.handle.net/10057/373
dc.description – Thesis (Ed.D.)--Wichita State University, College of Education.
dc.description – "May 2006."
dc.description – Includes bibliographic references (leaves 129-145).
dc.description.abstract – The purpose of this study was to describe and identify Sedgwick High School's teacher and student perceptions of the impact of one-to-one laptop computer access using an appreciative inquiry theoretical research perspective and the theoretical frameworks of change and paradigm shift…
dc.format.extent – xiv, 167 leaves : digital, PDF file.
dc.format.extent – 1174852 bytes
dc.format.mimetype – application/pdf
dc.language.iso – en_US
dc.rights – Copyright Rae Niles, 2006. All rights reserved.
dc.subject.lcsh – Educational technology
dc.subject.lcsh – Education--Data processing
dc.subject.lcsh – Electronic dissertations
dc.title – A study of the application of emerging technology: teacher and student perceptions of the impact of one-to-one laptop computer access
dc.type – Dissertation
dc.thesis.adviser – Calabrese, Raymond L.
dc.identifier.oclc – 71805797

Appears in Collections: EL Theses and Dissertations; COE Theses and Dissertations; Dissertations
http://soar.wichita.edu/dspace/handle/10057/192
http://soar.wichita.edu/dspace/handle/10057/253
http://soar.wichita.edu/dspace/handle/10057/352
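A harvested DC record like the one above is the input to the Dublin Core to MARCXML stylesheet named on the previous slide. Outside MarcEdit, applying such a stylesheet can be sketched with Java's built-in XSLT engine; the file names here are placeholders, and MarcEdit performs this step internally.

import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch of applying a DC-to-MARCXML stylesheet (such as LC's
// DC2MARC21slim.xsl, cited above) with Java's standard XSLT engine.
// Input and output file names are placeholders.
public class DcToMarcXmlSketch {
    public static void main(String[] args) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource("DC2MARC21slim.xsl"));
        t.transform(new StreamSource("oai_dc_record.xml"),   // harvested DC record
                    new StreamResult("marc_record.xml"));    // MARCXML result
    }
}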
Dublin Core to MARC Mapping (fields in DSpace and the transformed MARC fields we want in OCLC)

dc.contributor.author → 100 1_ Author.
dc.date.accessioned, dc.date.available, dc.date.copyright, dc.date.issued → 260 ǂc year.
dc.identifier.other → 099 …
dc.identifier.uri → 856 40 …
dc.description → 502 Thesis (Ed.D.)--Wichita State University, College of …
dc.description → 500 "Month year."
dc.description → 504 Includes bibliographic references…
dc.description.abstract → 520 3_ …
dc.format.extent → 300
dc.format.extent, dc.format.mimetype → (no mapping)
dc.language.iso → 546 en_US
dc.rights → 540 Access restricted to WSU students, faculty and staff (delete)
dc.subject → 690 (keywords, non-CV; delete)
dc.subject.lcsh → 650 _0
dc.title → 245 1_ …
dc.type → 655 _7 Dissertation ǂ2 local
dc.thesis.adviser → 700 12 … ǂe advisor
dc.identifier.oclc → 856 41 …
Appears in Collections: (no mapping)

Using MarcEdit
(MarcEdit interface screenshot)

Metadata transformation in MarcEdit: the wheel-and-spoke design for metadata transformation (by Reese), with MARCXML as the hub and EAD, TEI, MODS and Dublin Core as the spokes.

Data flow diagram: DSpace answers an OAI request with raw XML (DC); the OCLC Metadata Harvester in MarcEdit applies the customized XSLT (DC to MARCXML), covering authorized data processing (title, author, subject…), resolving data ambiguity (many-to-one mapping with element positioning…) and string processing (data normalization…); the MarcEditor then exports MARC to Voyager and OCLC.

Selective Harvesting
Define in MarcEdit:
- by identifier (e.g. oai:soar.wichita.edu:10057/255)
- by set (e.g. hdl_10057_351)
- by date (e.g. from=2007-01-01&until=2008-01-01)
Or: http://soar.wichita.edu/dspace-oai/request?verb=ListRecords&metadataPrefix=oai_dc&from=2007-01-01&until=2008-01-01

How do we define harvesting theses only? Define by set (http://soar.wichita.edu/dspace-oai/request?verb=ListSets):
- Sets by schools and departments: AE Theses and Dissertations (hdl_10057_313), ANTH Theses (hdl_10057_233), BIO Theses (hdl_10057_389), CE Theses and Dissertations, …
- Or sets in two categories: Master's Theses (hdl_10057_351) and Dissertations (hdl_10057_352)

Alternatively, Define Theses Sets in XSLT
Dublin Core to MARCXML stylesheet: http://www.loc.gov/standards/marcxml/xslt/DC2MARC21slim.xsl

XSLT customization: transform and display theses and dissertations only (a sketch of this set-based filtering follows below).
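The stylesheet fragments shown on the original slides have not survived, so the following is a minimal sketch of how such set-based filtering can be expressed in XSLT 1.0, assuming the transform is applied to the raw OAI-PMH 2.0 response and using the two theses set identifiers listed above; it illustrates the technique rather than reproducing the stylesheet actually used.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:oai="http://www.openarchives.org/OAI/2.0/">
    <!-- Pass through only records belonging to the Master's Theses or
         Dissertations sets; all other records in the response are skipped. -->
    <xsl:template match="oai:record">
      <xsl:if test="oai:header/oai:setSpec = 'hdl_10057_351'
                    or oai:header/oai:setSpec = 'hdl_10057_352'">
        <!-- Hand the Dublin Core payload to the DC-to-MARCXML templates. -->
        <xsl:apply-templates select="oai:metadata"/>
      </xsl:if>
    </xsl:template>
  </xsl:stylesheet>

Because comparing a node-set with a string in XPath 1.0 succeeds if any node matches, records that belong to several sets (for example, a departmental set as well as the Dissertations set) are still selected.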
Sample Result Exported to OCLC
(screenshot slide)

Mapping Problems and Error Reports (for Variable Fields)
- 100 occurrence 1, indicator 2: invalid code
- 520 occurrence 4, $a occurrence 1, position 76: invalid character; data must be ALA characters
- 655 occurrence 1, indicator 1: invalid code
- 655 occurrence 1, indicator 2: invalid code
- 655 occurrence 1, $2: invalid relationship; when the element is present, then 655 indicator 2 must equal 7
- …
Customization is needed to meet our needs.

Mapping Test Results Using OAIDCtoMARCXML.xsl (in MarcEdit)
DSpace (version 1.4 or below) only responds with a simple Dublin Core XML file (to be transformed to MARCXML using XSLT). For each field in DSpace, the transformed field in OCLC and the correction or customization needed:

- dc.contributor.author → 100 10 Niles, Rae ǂe author (delete "ǂe author.")
- dc.date.accessioned, dc.date.available, dc.date.copyright, dc.date.issued → 260 ǂc 2006-05 (only keep 2006)
- dc.identifier.other → 500 d06005 (change to 099)
- dc.identifier.uri → 500 http://hdl.handle.net/10057/373 (change to 856 40)
- dc.description → 520 Thesis (Ed.D.)--Wichita State University, College of Education. (change to 502)
- dc.description → 520 "May 2006." (change to 500)
- dc.description → 520 Includes bibliographic references (leaves 129-145). (change to 504)
- dc.description.abstract → 520 The purpose of this study was to describe and identify Sedgwick High School's teacher and student perceptions of the impact of one-to-one laptop computer access using an appreciative inquiry theoretical research perspective and the theoretical frameworks of change and paradigm shift... (change to 520 3)
- dc.format.extent, dc.format.extent, dc.format.mimetype → (no output)
- dc.language.iso → 546 en_US (delete)
- dc.rights → 540 Access restricted to WSU students, faculty and staff (delete)
- dc.subject.lcsh → 690 Educational technology (change to 650 _0)
- dc.subject.lcsh → 690 Education--Data processing
- dc.subject.lcsh → 690 Electronic dissertations
- dc.title → 245 00 A study of the application of emerging technology: teacher and student perceptions of the impact of one-to-one laptop computer access (if 100 exists, use 245 1_; or else use 245 0_)
- dc.type → 655 7_ Dissertation ǂ2 local (change to 655 _7)
- dc.thesis.adviser → (add 700 12 … ǂe advisor.)
- dc.identifier.oclc → 856 41 ǂu 71805797 ǂz Connect to this object online. (replace ǂu with the value from dc.identifier.uri)
- Appears in Collections: http://hdl.handle.net/10057/373

Customized Mapping in XSLT: Resolving data ambiguity
- The same DC field maps to different MARC fields: description → 502 (Dissertation), 500 (General Note), 504 (Bibliography)
- Qualified DC element: description.abstract → 520 (Summary)
- Solution: element positioning (illustrated in the sketch at the end of this section)

Customized Mapping in XSLT: Authorized data processing
- Primary entries vs. added entries: title and personal names processing
- Template to deal with personal names (in MarcEdit), e.g. "Webb, Kyle M." and "Webb, Kyle M., 1977-" are transformed to:
  =100 1\$aWebb, Kyle M.
  =100 1\$aWebb, Kyle M.,$d1977-
- Identify field relationships and correct the indicators: the 100/245 (author/title) relationship: if 100 exists, use 245 1_; or else, 245 0_
- Local element: dc.thesis.advisor is transformed to 700 1_ (if more than one dc.thesis element exists, positioning is needed)

Customized Mapping in XSLT: Processing of non-filing characters in the title
- The 245 (title) second indicator counts a leading article: "A", "An", "The" → 2, 3, 4
- Alternatively, it can be defined in the title template (see the sketch below).

Customized Mapping in XSLT: Subjects vs. keywords
- Only common subjects were kept in the test (when keywords and subjects were mixed inconsistently)
- Subject template (OSU solution): the keywords "ocean wave energy", "direct-drive" and "fluid-structure interaction" and the subjects "Ocean wave power" and "Fluid-structure interaction" are transformed to:
  =650 \0$aOcean wave power.
  =650 \0$aFluid-structure interaction.
  =690 \\$aocean wave energy.
  =690 \\$adirect-drive.
  =690 \\$afluid-structure interaction.

Customized Mapping in XSLT: String processing
- Functions: normalize-space(), translate(), substring()…
- Example: extract a partial value from a DC element; for the 260 (date), extract only the year from the issued date (see the sketch below).
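The XSLT fragments for these customizations were likewise lost in conversion. As a stand-in, here is a minimal sketch of the three techniques just described (element positioning, non-filing characters, and year extraction), assuming simple Dublin Core input in the standard dc namespace and MARCXML (MARC21 slim) output; the template names are illustrative, not those of the production stylesheet.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:marc="http://www.loc.gov/MARC21/slim">

    <!-- (1) Element positioning: the first dc:description becomes a 500
         general note and the second a 502 dissertation note, the order
         that holds in our sample records. Assumes the current node is
         the record's Dublin Core container when the template is called. -->
    <xsl:template name="descriptions">
      <marc:datafield tag="500" ind1=" " ind2=" ">
        <marc:subfield code="a"><xsl:value-of select="dc:description[1]"/></marc:subfield>
      </marc:datafield>
      <marc:datafield tag="502" ind1=" " ind2=" ">
        <marc:subfield code="a"><xsl:value-of select="dc:description[2]"/></marc:subfield>
      </marc:datafield>
    </xsl:template>

    <!-- (2) Non-filing characters: derive the 245 second indicator from
         a leading English article in the title. -->
    <xsl:template name="nonfilingInd2">
      <xsl:param name="title"/>
      <xsl:choose>
        <xsl:when test="starts-with($title, 'The ')">4</xsl:when>
        <xsl:when test="starts-with($title, 'An ')">3</xsl:when>
        <xsl:when test="starts-with($title, 'A ')">2</xsl:when>
        <xsl:otherwise>0</xsl:otherwise>
      </xsl:choose>
    </xsl:template>

    <!-- (3) String processing: keep only the year of an issued date
         such as "2006-05" for 260 $c. -->
    <xsl:template name="pubYear">
      <xsl:param name="issued"/>
      <marc:datafield tag="260" ind1=" " ind2=" ">
        <marc:subfield code="c">
          <xsl:value-of select="concat(substring(normalize-space($issued), 1, 4), '.')"/>
        </marc:subfield>
      </marc:datafield>
    </xsl:template>
  </xsl:stylesheet>

In a full stylesheet these templates would be called from the record-level template, with the 245 first indicator set to 1 or 0 depending on whether a 100 field was generated, as noted above.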
Customized Mapping in XSLT: Leaders and fixed fields
Leaders are fixed fields that comprise the first 24 character positions (00-23) of each MARC record; they provide information for the processing of the record.

008 field (Fixed-Length Data Elements):
- Type: t (manuscript language material)
- BLvl: m (bibliographic level is monograph)
- Desc: a
- ELvl: I (encoding level is full level)
- Form: s (form of item is electronic)
- Cont: b, m (content is theses with bibliographies)
- Ills: a (illustrations included)
- Srce: d (cataloging source)
- Conf: 0 (not a conference publication)
- Fest: 0 (not a festschrift)
- LitF: 0 (not fiction)
- DtSt: s (single date)
- Indx: 0 (no index)
- Lang: eng (language is English)
- Ctry: xx

Ways to handle the fixed fields (a sketch of the first option follows below):
- Scripting and adding all fixed fields (leader and 008) in OAIDCtoMARCXML.xsl; or
- Adding the 008 in the MarcEditor after record export; or
- Applying a fixed-field template after the records have been exported to OCLC.
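For the first option (scripting the fixed fields directly in OAIDCtoMARCXML.xsl), a sketch of the idea follows. The leader and 008 values are assumed constants mirroring the list above; the length-dependent leader positions are left as placeholder zeros, since MarcEdit and OCLC recalculate record length and base address when the record is compiled.

  <!-- Emit constant fixed fields inside each generated marc:record
       (marc bound to http://www.loc.gov/MARC21/slim as before). -->
  <xsl:template name="fixedFields">
    <xsl:param name="year"/>
    <!-- Leader: status n; Type t (manuscript language material) and
         BLvl m at positions 06-07; ELvl I and Desc a at 17-18. -->
    <marc:leader>00000ntm a2200000Ia 4500</marc:leader>
    <!-- 008, 40 characters: placeholder date entered at 00-05, DtSt s
         at 06, the year at 07-10, Ctry xx, Ills a, Form s, Cont bm,
         Conf/Fest/Indx 000, LitF 0, Lang eng, Srce d. Blanks are
         literal spaces padded by position. -->
    <marc:controlfield tag="008">
      <xsl:value-of select="concat('000000s', $year, '    xx a    sbm   000 0 eng d')"/>
    </marc:controlfield>
  </xsl:template>

The same values could instead be applied as a batch edit in the MarcEditor or as a constant-data template in OCLC, the second and third options above.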
Harvesting Using the Revised XSLT Crosswalk
- Harvest raw data: raw DC XML (harvest the OAI data to a local file)
- Harvest and transform DC to MARCXML: the records will be dumped into the MarcEdit MarcEditor

MarcEditor
- Edit the harvested theses in the MarcEditor
- Batch edit fields, subfields and indicators (if needed), e.g. add the 008 field for all records
- .mrk (MARC text file): compile to .mrc (MARC); or save as .mrk8 (MARC UTF8 text file) and compile to .mrc (MARC)

Import Records to OCLC
- Click "File > Import Records…"
- Select "Import to Local Save File"
- After being exported to OCLC, in the OCLC Connexion client: open each file, do some review/editing as needed, attach the KSW holding and apply the ETD fixed-field template (if needed) in OCLC.

Alternatively, records exported to Voyager directly (this part is performed by Gemma Blackburn)
- Send the .mrc file to the Voyager server.
- Create a Bulk Import rule in the Voyager System Administration module: go to Cataloging > Bulk Import Rules > New; name the rule; choose (or create a new) Bib De-Duplication Rule; modify the mapping as needed; save the rule.
(Voyager System Administration bulk import rules screenshot)
- Bulk import the records using the Bulk Import rule. On your Voyager server, go to .../voyager/xxxdb/sbin/ and write the command for Bulk Import to run:

  Pbulkimport -ftheses-sample.mrc -iSOAR -b1 -e3

  -f and the file name (required)
  -i and the Bulk Import rule name (required)
  -o and your name (not required, but will let people know who ran the bulk import)
  -b and a number, defining the beginning record in the file that you want to import if you prefer to import a select set at a time (not required)
  -e and a number, defining the end record in a set to import (not required)
  There are several other options; check the Technical User's Guide.

A real case: transformation of the ETDs of 2007
- Ph.D. dissertations (Summer, Fall 2007): 23
- Master's theses (Summer, Fall 2007): 55
- Some adjustments in the transformation:
  - Transfer dc.format.extent[1] to the physical description (MARC 300), e.g. "ix, 53 leaves, ill." becomes 300 $a ix, 53 leaves : $b ill.
  - Keep three description fields: description[1] → 500 (General Note); description[2] → 502 (Dissertation); description.abstract → 520 (Summary)
  - The 008 field values were added in the MarcEditor rather than applied in OCLC, e.g. =008 …s2008\\\\xx\\\\\\sbm\\\000\0\eng\d

Discussion and Conclusion
The customized mapping and metadata transfer can eliminate the need for double entry in DSpace and OCLC/Voyager and significantly improve our ETD workflow.
- Metadata management: one single crosswalk and style sheet will not meet all needs; it needs to be based on standard practice but add local variations; application-specific mapping is needed for special projects; coordination in metadata repurposing is important.
- Data mapping, manipulation and transformation: using qualified DC instead of element positioning in XSLT; DSpace 1.5 enables a qualified DC crosswalk for OAI-PMH; handling of the MARC fixed fields and the 008 field.
- Other technical issues: using other tools for harvesting besides MarcEdit; using the DSpace Item Importer and Exporter instead of the Metadata Harvester.

Project Team and Acknowledgement
Sai Deng, metadata mapping and transformation; Susan Matveyeva, ETD cataloging and mapping; Tse-Min Wang, programming assistance; Sandy Oswald and Manoj Gogoi, ETD cataloging assistance; Terry Reese, consultant; Nancy Deyoe, administrative support; Connie Basquez, Voyager support; Gemma Blackburn, Voyager support. Thank you!

work_nmnwp2jykjcvdaj77w76k5uemi ---- Editorial

Vietnam J Comput Sci (2016) 3:1, DOI 10.1007/s40595-016-0057-1

EDITORIAL

Ngoc-Thanh Nguyen (Department of Information Systems, Faculty of Computer Science and Management, Wroclaw University of Technology, Str. Wyb. Wyspianskiego 27, 50-370 Wrocław, Poland; Ngoc-Thanh.Nguyen@pwr.edu.pl)

Published online: 26 January 2016. © The Author(s) 2016. This article is published with open access at Springerlink.com.

It is our great pleasure to present to you the first issue of Volume 3 of Vietnam Journal of Computer Science (VJCS). This issue starts the third year of VJCS activities. During the first two years (2014-2015), VJCS published 8 issues comprising 45 high-quality papers selected in a rigorous peer-review process. All of these issues were published on schedule, so regularity is well guaranteed. The published papers come from many countries all over the world, giving the journal strong internationality. Some words about the statistics: as of 31 Dec. 2015, we had received 348 papers in total, of which 53 had been accepted, 283 rejected, and 16 were under review. According to Google Scholar, papers published in VJCS have so far been cited 117 times, and the number is quickly increasing. These statistics show that VJCS has taken the first step towards becoming an international and prestigious journal in the field of Computer Science.
VJCS is now indexed by Google Scholar, Computer and Information Systems Abstracts, DBLP, OCLC and Summon by ProQuest. VJCS papers are published both in open access and in print. We are trying our best to maintain the regularity, the internationality and the quality of the selected papers so that the journal can obtain such important indexes as Scopus or SCI-E as early as possible.

On behalf of the VJCS community and the sponsor, Nguyen Tat Thanh University, we cordially thank Alfred Hofmann, Ronan Nugent and the Editorial Office from Springer for their kind support of VJCS. Special thanks go to the members of the Advisory Board, the Associate Editors, the Editorial Board members and the Editorial Reviewer Board members for helping us in the paper selection process and in journal promotion, and for all their advice and efforts in building VJCS into a significant journal for the field of Computer Science. Finally, we sincerely thank all authors for their valuable contributions to VJCS.

We would like to take this opportunity to wish you a Happy and Prosperous New Year!

Ngoc-Thanh Nguyen, Editor-in-Chief
Manh-Hung Nguyen, President of Nguyen Tat Thanh University
Dac-Hien Cao, Assistant Editor

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
work_nx7m2rbwrrgofl76ahi7sgivke ---- Interlibrary Loan - Reference Collaboration: Filling Hard-to-Find Faculty Requests

MARGARET H. BEAN, Resource Sharing Librarian, University of Oregon, Eugene, Oregon, USA
MIRIAM RIGBY, Social Sciences Librarian, University of Oregon, Eugene, Oregon, USA

INSTITUTION PROFILE
The University of Oregon (UO), located in Eugene, Oregon, has a population of almost 26,000, including students, faculty and staff. The UO is a member of the Orbis Cascade Alliance, a consortium made up of 36 college and university libraries in the Pacific Northwest. The ILL unit resides in the Access Services department of the UO Libraries. The unit is currently staffed by one resource sharing librarian, 4.5 full-time equivalent (FTE) staff and 2 FTE student assistants. We fill approximately 30,000 borrowing requests and 45,000 lending requests per year. The UO Libraries employ over 25 FTE subject specialists, whose subject-based expertise covers virtually all the fields taught and researched at the university. These subject specialists are placed throughout the library across a number of departments, including Cataloging, Reference, Acquisitions, and subject-based branch libraries.

BACKGROUND
Prior to November 2008, UO library patrons utilized three separate databases in searching for books and audiovisual materials: the UO's local catalog, the Orbis Cascade Alliance consortial database (Summit), and OCLC's FirstSearch. Undergraduates usually searched in the local catalog and only occasionally ventured into Summit. Graduate students and faculty members searched in the local catalog and in Summit, and sometimes requested items via interlibrary loan that they located in FirstSearch. In November 2008, the UO Libraries, as part of the Orbis Cascade Alliance, went live with OCLC Navigator as our consortial borrowing tool. Navigator became a search engine incorporating our local Summit catalog as well as the WorldCat catalog, allowing patrons to search both catalogs at once. In August 2009 the UO Libraries went live with WorldCat Local, which includes the Summit catalog, OCLC's FirstSearch and the UO local catalog.
With WorldCat Local's integration of the three databases, all library patrons are able to search all three catalogs at one time. The search bar for WorldCat Local is prominently displayed on the UO homepage, allowing patrons immediate access to the database. Also, since the interlibrary loan request button is prominently displayed on each WorldCat record for which there is no Summit or UO holding, both discovery and requesting have become easy one-click processes.

The impact of implementing OCLC Navigator was immediate. In January 2009 we saw an increase of 122% in filled returnable borrowing requests over January 2008. Implementing WorldCat Local further impacted the UO's interlibrary loan unit: when school reconvened in late September 2009, we saw an additional increase in filled ILL borrowing returnable requests of 57%. Hiring additional staff and implementing the OCLC/ILLiad interlibrary loan management software have enabled us to manage our increased workload. However, dealing with the increase in borrowing requests has not been our only concern. We have also seen a change in the types of items requested, including requests for foreign, audiovisual and very new items, as well as for items held by only a few institutions. It is often difficult and time-consuming to find providers for these types of materials. Illustrating this problem is the fact that at one point in January 2010 we had over 800 requests in our borrowing "unfilled" queue.

ENGAGING SUBJECT SPECIALISTS
We realized that ILL could no longer go it alone. In the past, UO library subject specialists assisted the ILL unit in tracking down hard-to-verify bibliographic citations. However, thanks to WorldCat Local and to bibliographic databases using OpenURL resolvers to pass complete book and article citations through into ILLiad, citation verification is no longer a major issue for us. The challenge now is finding a means of getting hard-to-locate items into the hands of our patrons.

For some time we had been thinking about tapping into staff members' hidden skills and interests to promote communication and interdepartmental collaboration throughout the library. Library units hold regular cross-training sessions and dedicate at least one day annually to sharing knowledge and skills with one another. Further, individual librarians have taken it upon themselves to broaden their skills by offering their time to other departments in exchange for the learning that comes with working on new projects. With this type of collaboration in mind, we realized that we could employ the expertise of UO subject specialists to help process our more challenging requests. We needed what subject specialists have in such abundance: an in-depth knowledge and understanding of their patrons' current research, language fluency, and a knowledge of strong library collections, of colleagues in their subject areas, and of specialized publishers.

Our first step was to decide which subject specialists to include in the program. Using ILLiad Web Statistics, we identified the areas in which we receive most of the difficult requests from faculty and selected subject specialists who support these areas. Our focus was primarily on music, East Asia, the social sciences, Romance languages, and the humanities. We were concerned about the impact of such a program on the subject specialists' workload, so we decided to begin by asking subject specialists to work only on requests placed by faculty members.
Researching difficult requests placed by graduate and undergraduate students could be added to the subject specialists' workload at a later time, once the impact on their workload had been determined.

We began our ILL-reference collaboration in May 2010, and the program is ongoing. It is similar to other libraries' projects to provide better access to materials, though it is unique in some of its goals and in the manner of implementation (Kern and Weible, 2006).

  Note: Most notably, Kern and Weible's article, Reference as an Access Service (2006), explores the inclusion of reference librarians and graduate assistants in the ILL process to provide better access to resources. We created our project independently, used significantly different technical methods of implementation, and explore additional future opportunities made possible by our methods. Yet encountering both projects together provides an opportunity for "repetition" of an experiment in different environments with differing university communities. As librarians move away from one-size-fits-all implementation of ideas from other libraries, having multiple testing grounds for similar projects provides an invaluable resource for determining how such a project might work in one's own library environment.

DELIVERING REQUESTS TO SUBJECT SPECIALISTS
We debated how best to provide subject specialists with information regarding these difficult requests. After considering several options, we decided that the most efficient means of transferring this information would be to give them direct access to the original record within ILLiad. With access to the ILLiad record, the subject specialist can easily see patron information (name, status and department), the exact citation, and the "reason for no" history. This gives the subject specialist a more complete picture of who is requesting the item and of the history of actions already taken, thus cutting down on duplication of effort.

Our next decision was which ILLiad processing steps should be handled by subject specialists. We determined that subject specialists would not place actual requests; instead they would use data from the ILLiad request record to determine how to find a supplier for the item. Using the powerful ILLiad email routing rules, subject specialists would then move the request to the next queue so that ILL staff could finish processing it.

TRAINING
We provided subject specialists with an hour-and-a-half training session, which included a PowerPoint demonstration with ILLiad screen shots as well as time to work on processing sample requests. Training included logging into ILLiad, locating the "Awaiting Processing by Subject Specialist" queue, learning to read the request record (finding bibliographic information, the OCLC accession number, user information and "reasons for no" data), a discussion of the "reasons for no", and how to choose the appropriate option to email the patron or to route the request to another queue. The PowerPoint slides were loaded onto the library intranet for future reference and training, and a link was provided to a list of online sources used by interlibrary loan staff for international bibliographic verification.

WORKFLOW
Hard to track down requests are routed to the “Awaiting Processing by Librarian” queue where the resource sharing librarian double checks to make sure all obvious sources have been explored. The librarian then identifies the appropriate subject specialist and uses ILLiad e-mail routing to alert the subject specialist that the request has been routed to the “Awaiting Processing by Subject Specialist” queue. Subject Specialists The subject specialist opens the record in “Awaiting Processing by Subject Specialist” queue and reviews it for patron information, “reasons for no,” double checks OCLC accession numbers used in prior searches, and then searches other sources. After review the subject specialist makes the decision to send an email from within the ILLiad record of one of the following types: (1) To the faculty member for more information. Sending this email leaves the request in the “Awaiting Processing by Subject Specialist” queue. (2) To the faculty member canceling the request. Sending this email automatically routes the request to the “Canceled by ILL Staff” queue. 8 (3) To the faculty member to inform him or her that the item will be purchased. Sending this email will automatically route the request to the “Canceled by ILL Staff” queue. (4) To the faculty member providing a link to URL if item is available on the web. Sending this email automatically routes the request to the “Request Finished” queue. (5) To the ILL librarian with a new OCLC number to use in requesting the item and leaves the request in the subject specialists queue. These emails are prepopulated with patron and request data. ILLIAD CUSTOMIZATION As noted above, this project was accomplished by the use of ILLiad queues and ILLiad email routing. The following new queues were set up in ILLiad Borrowing using the ILLiad Customization Manager: Awaiting Processing by Librarian Awaiting Processing by Subject Specialist We also created the following emails with their routing rules: FacultyCancel (request is moved to “Canceled by ILL Staff” queue) Full Text on Web (request is moved to “Request Finished” queue) 9 Note to Faculty (request is left in “Awaiting Processing by Subject Specialist” queue) Faculty Purchase (request is moved to “Canceled by ILL Staff” queue) RESULTS This program has been very successful. Of the 50 requests that have been forwarded to subject specialists since May 2010, 29 have resulted in purchases, 15 have been canceled, 4 have been filled via traditional ILL and 2 have been filled by web sources. Our subject specialists say that this collaboration has been successful on many levels. They now have an additional point of contact with the faculty in their liaison departments and insight into the types of materials their faculty are requesting. Subject specialists have enjoyed the chance to learn about the ILLiad software. Despite the additional training time and work, the subject specialists report that they have not felt this has been a burden. Subject specialists currently deal with reference requests and purchase requests daily, so incorporating the ILLiad requests into these existing workflows has not significantly impacted anyone – even for the librarians receiving the most requests. Perhaps even more importantly, our faculty members seem similarly pleased with the program. While they are not aware of the change behind the scenes, they have reacted positively when approached by subject specialists about their hard to track down interlibrary loan requests. 
Sometimes, a simple clarification is all that has been required – such as with an item originally requested in Japanese, for which we owned an English copy, which was equally acceptable to the 10 professor. In other cases, faculty members have been thrilled to be offered the option of having an item purchased for the library, when an ILL request could not be filled. We were initially concerned that subject specialists would feel uncomfortable being bearers of bad tidings when they delivered news that an item could not be supplied either via interlibrary loan or purchase. Positive faculty response has quelled this fear; even when an item cannot be procured through interlibrary loan or purchase, faculty have told us that they appreciate the effort and the additional explanation that now comes with an interlibrary loan request rejection notice. CONCLUSION Overall, this collaborative effort can be called a success, and we expect to continue it indefinitely. Although some job-sharing ventures can create additional burden, and each project should be carefully considered in terms of how it fits into each library’s structure and each department’s workflow, that has not been the case here. An unexpected benefit of this collaboration has been the opening up of communication between interlibrary loan staff and subject specialists. Not only are subject specialists filling the needs of interlibrary loan but they are now more apt to forward requests for “just-in-time” borrowing for unfillable faculty purchase requests. We have also found this process to be a way to get our feet wet with respect to a purchase on demand program. As the interlibrary loan staff and subject specialists become more familiar with the types of requests that can only be filled by purchase rather than via interlibrary loan we will 11 be able to establish realistic guidelines for a more formal purchase on demand program in the future. Far from stealing jobs as much outsourcing does, or just shifting an overload of work around as many collaborative projects might, we have found a successful way to lighten the workload on interlibrary loan staff, without creating a noticeable burden on the subject specialists. In the future, as the ILL unit migrates to ILLiad 8.0, we will train subject specialists in this new version of ILLiad. In particular, we will focus on training them to use the ILLiad “Addon” feature. Our next step will be to expand this program to include ILL requests from graduate students with an eye to including undergraduate requests in the future. We will also use statistics gathered in this project to make a case for a purchase on demand program at the University of Oregon. REFERENCES Buchanan, S. (2009). Interlibrary loan is the new reference: reducing barriers, providing access and refining services. Interlending & Document Supply, 37(4), 168-170. Deardorff, T. & Nance, H. (2009). WorldCat Local implementation: the impact on interlibrary loan. Interlending & Document Supply, 37(4), 177-180. Hodges, D., Preston, C., & Hamilton, M.J. (2010). Patron-initiated collection development: progress of a paradigm shift. Collection Management, 35, 208-221. 12 Kern, M. K. & Weible, C.L. (2006). Reference as an access service: collaboration between reference and interlibrary loan departments. Journal of Access Services, 3(1), 17-35. Way, D. (2009). The assessment of patron-initiated collection development via interlibrary loan at a comprehensive university. 
be able to establish realistic guidelines for a more formal purchase-on-demand program in the future.

Far from stealing jobs as much outsourcing does, or just shifting an overload of work around as many collaborative projects might, we have found a successful way to lighten the workload on interlibrary loan staff without creating a noticeable burden on the subject specialists. In the future, as the ILL unit migrates to ILLiad 8.0, we will train subject specialists in this new version of ILLiad; in particular, we will focus on training them to use the ILLiad "Addon" feature. Our next step will be to expand this program to include ILL requests from graduate students, with an eye to including undergraduate requests in the future. We will also use statistics gathered in this project to make a case for a purchase-on-demand program at the University of Oregon.

REFERENCES
Buchanan, S. (2009). Interlibrary loan is the new reference: Reducing barriers, providing access and refining services. Interlending & Document Supply, 37(4), 168-170.
Deardorff, T. & Nance, H. (2009). WorldCat Local implementation: The impact on interlibrary loan. Interlending & Document Supply, 37(4), 177-180.
Hodges, D., Preston, C., & Hamilton, M.J. (2010). Patron-initiated collection development: Progress of a paradigm shift. Collection Management, 35, 208-221.
Kern, M.K. & Weible, C.L. (2006). Reference as an access service: Collaboration between reference and interlibrary loan departments. Journal of Access Services, 3(1), 17-35.
Way, D. (2009). The assessment of patron-initiated collection development via interlibrary loan at a comprehensive university. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 19, 299-308.

work_nyj2ann7u5c3boxw2f5mbv22zq ---- Assessing the collective wealth of Australian research libraries: measuring overlap using WorldCat Collection Analysis

PAUL GENONI AND JANETTE WRIGHT

The article has been adapted from the original version of a paper presented at the ALIA Access 2010 Biennial Conference, Brisbane, Australia, 1-3 September 2010.

Abstract
This paper reports the results of recent research examining the holdings of Australian research library collections recorded in the WorldCat database using OCLC WorldCat Collection Analysis software. The objectives of the research are:
1. To better understand the distribution of printed monographs amongst Australian research collections in order to assess the potential for enhanced collaboration in aspects of collection management.
2. To test the OCLC WorldCat Collection Analysis software in order to ascertain its value in comparing collection data based on the Australian research libraries subset of the WorldCat database.
The collections compared are the National Library of Australia, the University of Melbourne, Monash University, and the CAVAL Archival and Research Materials Centre. The data record the extent of overlap between collections, and the prevalence and distribution of single copies. The paper reflects on the use of WorldCat Collection Analysis software as a means of supporting the future management of Australian research collections. The research was undertaken as a pilot for a larger study.

IMPLICATIONS FOR BEST PRACTICE
• The results support those of previous overlap studies that point to the potential for significant de-duplication of Australian research collections.
• WorldCat Collection Analysis software has the potential to enable far more detailed comparison of Australian collections than has previously been possible.
• Libraries, or groups of libraries, undertaking detailed collection overlap analysis will have available data that can be used to support collaborative collection management.
• Identification of last copies in the national collection will provide added confidence in their retention.

Introduction
As library collections continue to transition from physical to digital formats, library managers are faced with the challenge of deciding on the medium-term and long-term storage of their print collections. While libraries continue to acquire many new print items, there is widespread acceptance that the proportion of new material acquired in this form will continue to decline. In addition to the increasing amount of content that is acquired in digital form, research libraries and communities are also adjusting to the impact of mass-digitisation programs that will potentially absorb at least some of the demand for access to print materials. As a result, research libraries are under increasing pressure to manage their "legacy collections" of print material in the most space- and cost-effective manner possible, while continuing to provide efficient access to items from these collections. This issue is particularly acute in research libraries, where there is a responsibility to ensure that formidable print-based collections remain secure and accessible, while at the same time freeing up space for new technology-dependent uses (Sharp 2009).
In this environment it is imperative for research library managers to have reliable data on which to base decisions relating to the storage, disposal and digitisation of items in print collections. Increasingly these decisions are being taken on a basis of collaboration, with a view to meeting the needs of a group or network of libraries while reducing the burden on individual libraries within those systems. This reliance on collaboration means that the data used to underpin decision making should also be system-wide, in order to provide the most relevant evidence to support those decisions.

The Australian research library community is adequately cohesive, and sufficiently supported by an existing collaborative discovery and delivery infrastructure, to enable it to potentially function as a single network. In this circumstance there is significant benefit in approaching key collection management decisions on a whole-of-system basis. This is particularly true of decisions relating to the sustainable management of low-use print materials (the so-called "long tail"), where collaboration provides great potential for substantial savings in the cost required by long-term print storage.

This research is therefore intended as a pilot study for a wider analysis of the incidence of overlap and last copies within Australian print collections. It is undertaken as part of an ongoing project investigating the long-term storage, discovery and delivery of legacy print collections for the mutual benefit of Australian research libraries and communities. It also forms part of a considerable body of international research and commentary looking at the possibilities for the transformation of print storage through increased collaboration (see for example: Vattulainen 2004; Gherman 2007; Payne 2007; O'Connor and Jilovsky 2008; Yoon and Oh 2008; Genoni and Varga 2009; Sharp 2009).

Australian collection overlap studies
Collection overlap studies are a standard method of investigating the relationship between collections and the distribution of items between two or more libraries. The data derived from overlap studies is useful both for the individual libraries included in the studies and for the group of libraries whose collections are examined. Data relating to overlap and unique holdings can assist in making decisions related to the management of those collections. The types of decisions that might be influenced include collection development, last-copy retention, interlibrary loan/document delivery, disposal, and storage. Because overlap studies involve two or more libraries, the data is particularly useful for libraries seeking to develop cooperative policies or processes relating to the management of their collections.

The capacity to conduct overlap studies depends on the ease and accuracy with which holdings can be compared. In recent years the development and implementation of increasingly large-scale, inclusive union catalogues has provided additional impetus for overlap studies by enabling them to be increasingly broadly based and effective. The use of such catalogues does, however, raise issues relating to methodology and the completeness and accuracy of catalogue data (Rochester 1987), and studies based on national union catalogues inevitably encounter problems associated with inaccurate and incomplete data.

In Australia recent overlap studies have relied upon the holdings recorded in the National Bibliographic Database (NBD).
These include a study conducted by the National Library in 2002 on behalf of the Higher Education Information Infrastructure Advisory Committee (Missingham and Walls 2003). This study investigated the overlaps between academic libraries on a state-by-state basis, and included both serial and non-serial holdings. Missingham and Walls encountered some of the frustrations of relying upon the NBD for examining overlap, noting that incomplete holdings and duplicate records had the effect of "limiting the accuracy of any study based on a large collaborative catalogue" (p. 249).

A second major study focusing on academic library holdings was undertaken during 2002 and 2003, when the Australian Research Libraries Collection Analysis Project (ARLCAP) examined the overlaps in the South Asian and Indian Ocean collections of the Group of Eight libraries (serving Australia's most research-intensive universities) and the National Library (ARLCAP 2004). The ARLCAP research is relevant to the current study in that it used the Automated Collection Analysis Services (ACAS) of OCLC. The ACAS undertook the analysis on behalf of ARLCAP relying upon the holdings data recorded in the NBD. The classification numbers of items were mapped to the WLN/OCLC conspectus in a manner similar to that used for the current research. The results were compromised to some extent by the low number of holdings (as low as 55% for one library) that had at that time been added to the NBD by participating libraries. Nevertheless the ARLCAP Report concluded that:
The study relied upon an analysis conducted by the National Library of NBD holdings data of the relevant collections, and was limited to monographs in the Dewey Decimal range of 600-699. OCLC WorldCat OCLC WorldCat has become established as the foremost international union catalogue. As at September 2010 OCLC claimed the database consisted of over 203 million bibliographic records with 1.64 billion holdings provided by over 72,000 libraries (http://www.oclc.org/worldcat/statistics/default.htm). Given the amount of catalogue data that is federated in WorldCat it is not surprising that librarians and researchers have investigated ways in which this extraordinarily rich source can be used to support research investigating the nature of collections and to make decisions related to their management. The potential uses cover a wide range of library operations including collection management, with Lavoie, Dempsey and Connaway (2006) arguing that with the assistance of WorldCat, . . . data mining across library collections could open the door to new opportunities for shared collection management. Studies of holdings patterns for institutional clusters at the consortium, regional, or even national level could reveal opportunities to reduce cross-collection redundancies and free up resources to fill gaps in collections. Some of the reported research-based uses of WorldCat data include: identifying the distribution and characteristics of last copies to provide data for decisions relating to de-accessioning and storage (Connaway, O’Neill and Prabha 2006); making inferences about the level of audiences for which texts are intended (O’Neill, Connaway and Dickey 2008); assisting with collection development by testing the effectiveness of an approval plan (McClure 2009); and conducting a collection evaluation test by comparing strengths and weaknesses of different collections (White 2008). The WorldCat Collection Analysis software has been used to conduct “brief tests” of collection strengths and weaknesses (Beals and Gilmour 2006), and to support decisions relating to the withdrawal of material from storage facilities (Ward and Aagard 2008). From July 2007 the National Library of Australia entered into an agreement with OCLC that covered all Libraries Australia subscribers. Under the terms of the agreement records in the NBD that have attached holdings are uploaded to WorldCat (with a small number of exceptions for records obtained from some commercial suppliers); and WorldCat records with Australian holdings are in turn uploaded to the NBD. The agreement did not give Australian libraries access to additional services such as WorldCat Collection Analysis, although the National Library noted that the arrangement with regard to exchange of catalogue data would “allow Australian libraries to benefit from OCLC research and development” (National Library of Australia). Research Methodology Aim The aim of the current research design is on recording the extent of overlap between collections, and identifying the likely prevalence and distribution of single (last) copies in the collections of Australian research libraries. The particular objectives of the pilot phase are: 1. To better understand the distribution of printed monographs amongst Australian research collections in order to assess the potential for enhanced collaboration in aspects of collection management. 
This includes the use of high-end technologies to support seamless discovery and delivery for the purpose of interlibrary loan and document delivery. 2. To test the OCLC WorldCat Collection Analysis software in order to ascertain its value for comparing collection data based on the Australian research libraries subset of the WorldCat database. Methodology WorldCat Collection Analysis software was used to undertake a study of holdings of single (last) copies in, and collection overlap between, a subset of Australian research library collections. ‘Australian research libraries’ in this context was defined as the members of Council of Australian University Librarians (CAUL); the Australian members of National and State Libraries Australasia (NSLA); and the CARM Centre. The collections included in the study were the libraries of The University of Melbourne (UM) and Monash University (Mon) representing CAUL; the National Library of Australia (NLA) representing NSLA, and the CARM Centre. The data mined from WorldCat were intended to identify: • The number of unique titles held by each library. Four results are possible: UM; Mon; NLA; CARM • The number of titles held by any two of the libraries. Six results are possible: NLA+UM; NLA+Mon; NLA+CARM; UM+Mon; UM+CARM; Mon+CARM • The number of titles held by any three of the libraries. Four results are possible: NLA+UM+Mon; NLA+UM+CARM; NLA+Mon+CARM; UM+Mon+CARM • The number of titles held by all four libraries. One result is possible: NLA+UM+Mon+CARM As noted, the quality of data in union catalogues has been a problem with many overlap studies, and using WorldCat does not avoid these problems (Orcutt and Powell 2006). Holdings data in WorldCat may be incomplete (for example not all records have been uploaded); inaccurate in a fashion which prevents matching of the same item resulting in duplicate records; or contributing libraries might have different cataloguing practices (e.g. with series titles) that prevent similar items from being identified. It is, for example, estimated that some 50,000 to 70,000 records for CARM Centre holdings that are recorded in the Libraries Australia database have not been able to be uploaded to OCLC due to system problems. There are indications that this is also true for the holdings of the university libraries included in this research. This will result in distortions to the overlap data and a likely understatement of the degree of overlap. The rate of duplication within this network of libraries will also be understated as this methodology does not count duplication within a collection. That is, multiple holdings of the same title by a single library will not be identified. Results The results presented in Table 1 were obtained by compiling the data from the 24 Conspectus divisions, plus those designated by the WorldCat collection analysis process as “unknown” (ie items for which a subject division could not be determined). The Table presents data for the number of items that are held uniquely by each of the four collections, plus the extent of overlap as measured by items that are held by two, three, or all four of the collections. 
Uniqueness and overlap Unique % Held by 2 % Held by 3 % Held by 4 % Total CARM 114,119 57.8 41,964 21.3 28,768 14.6 12,522 6.3 197,373 UM 617,006 47.8 437,636 33.9 222,514 17.3 12,522 0.9 1,289,678 Mon 458,421 41.7 397,888 36.2 231,275 21.0 12,522 1.1 1,100,106 NLA 1,594,816 70.5 420,678 18.6 232,826 10.3 12,522 0.6 2,260,842 Totals Holdings Items 2,784,362 2,784,362 57.4 75.6 1,298,166 649,083 26.7 17.6 715,383 238,461 14.8 6.5 50,088 12,522 1.0 0.3 4,847,999 3,684,428 Table 1: Unique holdings and overlap The 3,684,428 items have a total of 4,847,999 holdings, with an average of 1.32 holdings per item. This indicates that there are some 1,163,571 duplicate holdings within the 24 subject divisions of these collections. While this can be construed as a significant level of overlap, it is also noticeable that the level of unique items could also be assessed as being high, with 75.6% of all items having one holding only. The comparatively high level of unique holdings within the National Library has been noted in previous overlap studies that have compared the National Library with academic libraries (ARLCAP). This can be explained by the National Library’s historical–but now reduced–role of collecting in depth for some international materials; and their continued commitment to the comprehensive collecting of Australiana, irrespective of the ‘level’ of the intended readership. In both cases this will result in the acquisition of material that is unlikely to be of interest to curriculum driven academic library collections. The considerably higher rate of duplication within the CARM Centre collection (ie the high rate of holdings of items that are held within each of the other three collections) is likely explained by the presence of duplicate copies within the collections of member libraries, with de-duplicated copies being deposited with CARM. It might be assumed that these are likely to be textbooks or similar curriculum focused items. By Conspectus division Within the scope of this paper the results for three subject divisions are reported as examples of the type of data that can be readily extracted using WorldCat Collection Analysis. The subject divisions are Art and Architecture (211,880 total holdings); Sociology (238,461); and Medicine (250,041). These divisions were selected to represent the three broad disciplinary groupings of humanities, social science and science; and because the number of items within each of the three divisions is broadly similar. Art and Architecture Unique % Held by 2 % Held by 3 % Held by 4 % Total CARM 1,996 57.6 835 24.1 472 13.6 161 4.6 3,464 UM 43,074 55.0 26,318 33.6 8,775 11.2 161 0.2 78,328 Mon 23,960 44.9 20,456 38.3 8,775 16.4 161 0.3 53,352 NLA 50,708 66.1 16,985 22.1 8,882 11.5 161 0.2 76,736 Totals Holdings Items 119,738 119,738 56.5 74.3 64,594 32,297 30.5 20.0 26,904 8968 12.7 5.6 644 161 0.3 0.1 211,880 161,164 Table 2: Unique holdings and overlap for Art and Architecture division The 161,164 Art and Architecture items have a total of 211,880 holdings, with an average of 1.31 holdings per item. It is notable that the percentages of unique items held are very similar within other major humanities subject divisions. For example, for the division ‘Language, Linguistics and Literacy’, results for uniqueness for the three library collections were University of Melbourne, 54.4%; Monash University, 44.5%; and the National Library, 63.5%. 
Sociology Unique % Held by 2 % Held by 3 % Held by 4 % Total CARM 2,659 40.0 1,682 25.2 1,540 23.1 773 11.6 6,654 UM 21,907 33.5 24,319 37.2 18,422 28.2 773 1.2 65,421 Mon 17,704 29.1 23,500 38.6 18,930 31.1 773 1.3 60,907 NLA 63,450 60.2 22,275 21.1 18,981 17.9 773 0.7 105,479 Totals Holdings Items 105,720 105,720 44.3 65.4 71,776 35,888 30.1 22.2 57,873 19,291 24.3 11.9 3092 773 1.3 0.5 238,461 161,672 Table 3: Unique holdings and overlap for Sociology division The 161,672 Sociology items have a total of 238,461 holdings, with an average of 1.47 holdings per item. Medicine Unique % Held by 2 % Held by 3 % Held by 4 % Total CARM 3,296 46.9 2,002 28.5 1,298 18.5 434 6.2 7,030 UM 34,977 44.9 29,206 37.5 13,269 17.0 434 0.6 77,886 Mon 39,843 49.0 27,556 33.9 13,562 16.7 434 0.5 81,395 NLA 47,601 56.9 22,136 26.4 13,559 16.2 434 0.5 83,730 Totals Holdings Items 125,717 125,717 50.3 69.7 80,900 40,450 32.3 22.4 41,688 13,896 16.7 7.7 1,736 434 0.7 2.4 250,041 180,497 Table 4: Unique holdings and overlap for Medicine division The 180,497 Medicine items have a total of 250,041 holdings, with an average of 1.36 holdings per item. Observations relating to the data The NLA has the most recorded holdings for 18 of the 24 divisions. The exceptions are Art and Architecture (see Table 1), Chemistry, Computer Science, Mathematics, Music, and Physical Science. It is also the case that for 23 of the 24 divisions the NLA recorded the highest percentage of unique items, usually by a considerable margin (the exception was Library Science). As discussed above, this can be explained by the nature (breadth) of their collecting. It is also likely, however, that the degree of uniqueness in a collection has some correlation with collection size. This is apparent when comparing results for the two academic libraries. For 23 of the 24 divisions the larger of the two collections was also the one that recorded the higher percentage of unique items. This can logically be explained in that smaller collections will be driven by the need to acquire a core set of curriculum driven items, with a greater likelihood of duplication in other collections. As collections become larger they will inevitably focus on more research-related material, with a corresponding decline in duplication. The one exception was again Library Science, where The University of Melbourne has a slightly smaller collection than Monash University, but a higher percentage of unique items. This is almost certainly explained by the fact that the University of Melbourne collection has been developed for use by library staff rather than to serve a curriculum (the university does not educate in the area of library and information studies). Tables 2-4 reveal a considerable difference in the results for the sample disciplines represented. The difference in results between Art and Architecture (humanities) and Sociology (social sciences) indicate the substantially higher level of uniqueness and lower rate of duplication (as indicated by average holdings per item) of the former. The results do not, however, suggest there is a linear progression from humanities to sciences, as Medicine has produced an outcome that is placed between these two extremes. There is evidence from other Conspectus divisions indicating that the humanities tend to produce a lower level of overlap than other discipline areas, but this requires closer examination. Discussion One of the challenges inherent in overlap studies is the interpretation of the results. 
There are no benchmarks available for assessing a 'high', 'low', or 'acceptable' level of overlap. Establishing an acceptable level of overlap is particularly difficult when, as in this case, there are no cooperative collecting agreements in place designed to minimise duplication and overlap. When libraries are driven by the needs of curricula, as in the case of the two university libraries, or by commitments to comprehensive collecting, as in the case of the National Library, then a degree of overlap is both unavoidable and necessary. It is also the case, however, that in a nationwide network of research libraries where efficiency in collection storage is at a premium, reduced long-term overlap in the retention and storage of low-use print material will benefit the system as a whole. These benefits in turn have the potential to flow through to further efficiencies in the discovery and delivery of research materials in a system where a repository such as the CARM Centre has a commitment to permanent retention of low-use material in a high-density storage environment. The National Library is also obligated to the permanent retention of Australian material.

The presence of in excess of 1.1 million duplicate holdings for the collections studied is indicative of the potential for de-duplication. Obviously this overlap number would grow, and grow quite quickly, as additional libraries were added to the calculation. The National Library has 666,026 duplicated holdings within this small sample of the academic library sector alone. It is of course the case that many of these will be part of the National Library's Australiana collections, but there is nonetheless scope for a more intensive examination of the characteristics of this duplicated material.

Further insight into the extent of the overlap can be gained by examining additional data recording the duplication between collections. Table 5 reports the overlap for all of the recorded holdings on WorldCat, as opposed to the 24 divisions in Table 1, for the two university-based collections included in this study, and the National Library. The diagonal cells show each library's total recorded holdings.

     | UM        | Mon       | NLA
UM   | 1,524,110 | 482,845   | 520,052
Mon  | 482,845   | 1,405,960 | 547,486
NLA  | 520,052   | 547,486   | 3,233,921

Table 5: Three-way overlap, University of Melbourne, Monash University, and National Library

Of note in these results is that 34.3% of the Monash University collection is duplicated by the University of Melbourne, and that 31.6% of the University of Melbourne collection is duplicated by Monash. In addition, both academic libraries have considerable duplication with the National Library: 34.2% in the case of University of Melbourne, and 38.9% for Monash.

The data in Table 5 again indicate that there is considerable potential for de-duplication, but the exact extent of possible de-duplication can only be confirmed by closer examination of the items that comprise the overlap. This would be necessary in order to establish the characteristics of duplicated items and whether there is likely to be ongoing demand for this material that would justify retention in more than one library, or in main library sites as opposed to storage. With access to a service such as WorldCat Collection Analysis it should be feasible to undertake this additional level of analysis.

Although not utilised in the present study, WorldCat Collection Analysis offers access to more detailed levels of data regarding collection overlap.
This includes the capacity to collate and compare holdings by features such as publication date, format and audience level. Also of particular relevance to the issue of understanding overlap and unique holdings in the national context is the capacity to establish 'groups' of libraries for comparison purposes. This might include, for example, groups that represent sectors within the university library community, such as the Group of Eight (research intensive) or the ATN (technology based) university libraries; or groups from outside the university sector such as major special libraries, or the Technical and Further Education libraries. Collection comparisons can then be made within or between these groups, with a view to assisting the collection management decisions of individual libraries, the particular group or network to which they belong, or the wider library community. Indeed, it is only when non-university based libraries or groups of libraries are included in the data-gathering that a full picture of the collective wealth of Australia's research libraries will emerge.

While data related to overlap and uniqueness has previously been available in Australia from the NBD, it has been extremely difficult to mine, with no National Library service or software function specifically designed to meet the need. It has therefore not been possible for Australian libraries to use the NBD data in order to optimise its potential to assist in managing either local or system-wide collections. WorldCat Collection Analysis software, however, uses the major international union catalogue to make possible a rapid and detailed analysis of local collections, and to enable libraries to undertake collection comparisons on a scale of their own choosing. This extends to providing Australian libraries with the capacity to benchmark using international collections.

The breadth of the coverage of the WorldCat database, supported by the WorldCat Collection Analysis software, also provides an opportunity to broaden the basis for conceptualising and managing Australia's national research collection. The comparative ease with which collection analysis can be undertaken using WorldCat Collection Analysis makes it conceivable to include a wider range of collections within the scope of overlap-based studies, and therefore within any framework for collaborative planning and management of the national research collection. While there has been acknowledgement that special libraries include valuable research material that is unlikely to be duplicated in academic libraries (Stephens 2009), the practical difficulty of including these libraries within any data collection exercises has meant that they have been largely excluded. This exclusion, for example, has extended to the recent Australian overlap studies mentioned above.

Conclusion

The results of this pilot study add to the growing body of data regarding the potential for the rationalisation of print storage in ways that might produce benefits for the Australian research collections. It is apparent, however, that the data available as yet is preliminary and partial, and that a much more complete investigation of both unique holdings and overlap is required.
The study has also identified that there are ongoing problems in the accuracy of some Australian holdings data in WorldCat, but that nonetheless the WorldCat database and its collection evaluation software have the potential to provide important data in support of the management of Australian research collections. It is also possible to conclude that the WorldCat Collection Analysis software is appropriate for the subsequent and expanded phases of this research, and that it is also likely to have substantial benefits for other Australian libraries interested in a better understanding of their collections.

Acknowledgement

The authors would like to thank OCLC for their assistance in making available the WorldCat Collection Analysis software used to support this research.

References

Australian Research Libraries Collection Analysis Project, Report 2004. Available at http://www.library.uwa.edu.au/__data/assets/pdf_file/0004/524794/arlcap_final_report.pdf
Beals, Jennifer, and Ron Gilmour. 2006. Assessing collections using brief tests and WorldCat Collection Analysis. Collection Building 26(4): 104-107.
Connaway, Lynn S., Edward T. O'Neill, and C. Prabha. 2006. Last copies: what's at risk? College & Research Libraries 67(4): 370-379.
Genoni, Paul, and Eva Varga. 2009. Assessing the potential for a national print repository: results of an Australian overlap study. College & Research Libraries 70(6): 555-567.
Gherman, Paul. 2007. The North Atlantic Storage Trust: maximising space, preserving collections. Portal: Libraries and the Academy 7(3): 273-275.
Lavoie, Brian, Lorcan Dempsey, and Lynn S. Connaway. 2006. Making data work harder. Library Journal 131(1). Available at http://www.libraryjournal.com/article/CA6298444.html
McClure, Jennifer Z. 2009. Collection assessment through WorldCat. Collection Management 34(2): 79-93.
Missingham, Roxanne, and Robert Walls. 2003. Australian university libraries: collections overlap study. Australian Library Journal 52(3): 247-260.
National Library of Australia. OCLC Agreement. Available at http://www.nla.gov.au/librariesaustralia/oclc.html
OCLC. Introduction to the WorldCat Collection Analysis service. Available at http://www.oclc.se/support/documentation/collectionanalysis/using/introduction/introduction.htm
O'Connor, Steve, and Cathie Jilovsky. 2008. Approaches to the storage of low use and last copy research materials. Library Collections, Acquisitions & Technical Services 32(3/4): 121-126.
O'Neill, Edward T., Lynn S. Connaway, and Timothy J. Dickey. 2008. Estimating the audience level for library resources. Journal of the American Society for Information Science and Technology 59(13): 2042-2050.
Orcutt, Darby, and Tracy Powell. 2006. Reflections on the OCLC WorldCat Collection Analysis tool: we still need the next step. Against the Grain 18(4): 44-48.
Payne, Lizanne. 2007. Library Storage Facilities and the Future of Print Collections in North America. Dublin, OH: OCLC. Available at http://www.oclc.org/programs/publications/reports/2007-01.pdf
Rochester, Maxine K. 1987. The ABN database: sampling strategies for collection overlap studies. Information Technology and Libraries 6(3): 190-199.
Sharp, Steven. 2009. No more room aboard the ark! A UK higher education perspective on space management. Interlending & Document Supply 37(3): 126-131.
Stephens, Matthew. 2009. Heritage book collections in Australian libraries: what are they, where are they and why should we care? Australian Library Journal 58(2): 173-189.
Vattulainen, Pentti. 2004. National repository initiatives in Europe. Library Collections, Acquisitions, & Technical Services 28(2): 39-50.
Ward, Suzanne M., and Mary C. Aagard. 2008. The dark side of collection management: deselecting serials from a research library's storage facility using WorldCat Collection Analysis. Collection Management 33(4): 272-287.
White, Howard D. 2008. Better than brief tests: coverage power tests of collection strength. College & Research Libraries 69(2): 155-174.
Yoon, Hee-Yoon, and Sun-Kyung Oh. 2008. Shortage of storage space in Korean libraries: solutions centering upon hub-based collaborative repositories. Aslib Proceedings 60(3): 265-282.

Associate Professor Paul Genoni teaches with the Department of Information Studies at Curtin University in Perth. He has published numerous papers related to collection management, reference services and continuing professional development. From 2004 to 2010 he served as an educator representative on the ALIA Education Reference Group, and the Education and Workforce Planning Standing Committee.

Janette Wright, Chief Executive of CAVAL Ltd, has extensive management experience within the library and information services sector. Previous roles include Director, RMIT Publishing, aggregator and online publisher of Australian scholarly content, and Managing Director of the journal subscription agency RoweCom Australia/Divine Information Services - Europe. As Director of Public Library and Network Services at the State Library of NSW and Director, Library and Community Services, at Waverley Council, NSW, Janette has managed significant operational and grant funding programs for libraries with accountability at a senior level across a number of different legislative environments.
work_o3esvw4clnexpgljio2wdqlb3m ---- ELIS_OTDCF_v19no4.doc

by Norm Medeiros
Coordinator for Bibliographic and Digital Services
Haverford College
Haverford, PA

You're invited: XML's fifth birthday celebration

{A published version of this article appears in the 19:4 (2003) issue of OCLC Systems & Services.}

"The future ain't what it used to be." – Yogi Berra

ABSTRACT

This article looks at XML's first five years of existence. It reviews the original motivation for creating XML, and some of the applications this standard has made possible, specifically Rich Site Summary and Open eBook Publication Structure.

KEYWORDS

XML; Extensible Markup Language; World Wide Web Consortium; Rich Site Summary; RSS; Open eBook Publication Structure; OEBPS

Recently, I was asked by my director to give the library staff a presentation on the changing nature of cataloging. When I suggested that providing a history of the universe might be easier, he reminded me of my interview presentation years prior, which, like the Roy Tennant article he was holding, discussed the marginalizing of MARC in favor of other metadata schemes. After reflecting on my assignment and forming a committee to assist (my typical reaction when faced with anything unpleasant, laborious, or beyond me), I came to consider this task more an opportunity than a burden. It gave me a chance to take stock of my colleagues' activities; to learn not only how they were using various metadata flavors, but far more importantly, why they had decided on these particular standards. What these schemes, like so many others, have in common is their carrier, XML.

XML TURNS FIVE

Quite by accident, I discovered that XML turned five years old on February 10, 2003. Considering my chief accomplishments at that tender age involved crayons and shoe laces, by comparison XML has had a prodigious first few years. Like an adolescent whom you have not seen since he was in diapers, XML had grown quickly. I remember reviewing the first XML recommendation while on a commuter train bound for Grand Central Station (it may have been a Tuesday), struggling with concepts such as element type declarations, literals, and validity constraints. It was such a divergence from the simple nature of HTML.

"XML provides a new versatile structure for tagging and packaging metadata as the rapid proliferation of digital resources demands both rapidly produced descriptive data and the encoding of more types of metadata." (Guenther and McCallum, 2003)

Dave Hollander and C.M. Sperberg-McQueen, two original members of the XML Working Group, offer a sentimental look back at the early years of XML in an article available on the World Wide Web Consortium site. As they note, XML was invented in order to make SGML-encoded papers publishable on the Web, a simple ambition that has since spawned a revolution in the way information is managed across numerous communities (Hollander & Sperberg-McQueen, 2003). The earliest reference that I can find to XML in Library Literature is from the June, 1997, issue of Online & CDROM Review.
The article, entitled "A leaner, meaner markup language: simpler form of SGML called Extensible Markup Language (XML)," discusses the embryonic language and its ten guiding principles (A leaner, meaner markup language, 1997):

- XML shall be straightforwardly usable over the Internet
- XML shall support a wide variety of applications
- XML shall be compatible with SGML
- It shall be easy to write programs which process XML documents
- The number of optional features in XML is to be kept to the absolute minimum, ideally zero
- XML documents should be human-legible and reasonably clear
- The XML design should be prepared quickly
- The design of XML shall be formal and concise
- XML documents shall be easy to create
- Terseness in XML markup is of minimal importance

As noted in this early article, "[XML's] main purpose is to position the Internet or intranet for a far wider range of document markup than just HTML"; this is to say, a middle-ground position between the onerous coding required by SGML and the content insensitivity of HTML.

XML APPLICATIONS TODAY

Numerous XML applications exist, many of which we use without ever knowing that XML is under the hood. A lengthy list of some of these applications is maintained by the Organization for the Advancement of Structured Information Standards (OASIS). Below I focus on two such applications that respectively have or soon will have an impact on library staff and users.

RICH SITE SUMMARY (RSS)

Rich Site Summary (sometimes referred to as RDF Site Summary) is an XML application that streams channels, commonly in the form of news feeds, to a computer via intermediary software. The end-user experience has been described in recent articles (see for instance Cohen and Notess), but the underlying foundation of RSS, specifically its XML horsepower, is worth describing here. RSS was developed by Netscape in 1999 for use with their MyNetscape portal (RDF Site Summary (RSS) 1.0, 2000). RSS utilizes the XML namespace feature to parse channel vocabularies. Namespaces are a convenient means of pointing to definitions of elements in order to facilitate context and use of these elements. An RSS document provides a title, link, and description of the channel it is describing. These metadata provide the framework by which users select channels. The World Wide Web Consortium provides illustrative examples of RSS documents formatted in XML, along with descriptions and usage guidelines for RSS elements such as channel, image, and items. Use of RSS continues to grow as organizations of all types incorporate channels into their web architecture.

"XML is only a tool. Just as a word processor cannot write an interesting article by itself, XML cannot automatically find what people want, present information in an easy-to-read format, or solve problems relating to the content of information resources. XML can't do your work for you" (Banerjee, 2002).
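To make the channel structure concrete, the following is a minimal sketch of an RSS 1.0 document. It is not drawn from the published article; the feed address, channel, and item shown are invented for illustration.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <!-- The channel carries the title, link, and description users see -->
  <channel rdf:about="http://www.example.edu/library/news.rdf">
    <title>Example Library News</title>
    <link>http://www.example.edu/library/news/</link>
    <description>Announcements from the Example Library.</description>
    <items>
      <rdf:Seq>
        <rdf:li resource="http://www.example.edu/library/news/xml-at-five"/>
      </rdf:Seq>
    </items>
  </channel>
  <!-- Each item describes one resource included in the feed -->
  <item rdf:about="http://www.example.edu/library/news/xml-at-five">
    <title>XML turns five</title>
    <link>http://www.example.edu/library/news/xml-at-five</link>
    <description>A look back at XML's first five years.</description>
  </item>
</rdf:RDF>

The namespace declarations at the top are what allow an aggregator to recognize these elements as RSS 1.0 vocabulary, which is the namespace feature described above.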
OPEN EBOOK PUBLICATION STRUCTURE (OEBPS)

OEBPS was developed by the Open eBook Forum, the international trade and standards organization for the ebook industry. The specification was released in 1999, and has undergone two revisions since then. The purpose of the XML-based specification is threefold (Open eBook Publication Structure Specification 1.2, 2002):

- to give content providers (e.g., publishers and others who have content to be displayed) and tool providers minimal and common guidelines that ensure fidelity, accuracy, and accessibility in the presentation of electronic content over various electronic book platforms
- to reflect established content format standards
- to provide the purveyors of electronic book content (publishers, agents, authors et al.) a format for use in providing content to multiple reading systems

The Open eBook Forum is an organization consisting of 85 hardware and software companies, publishers, and authors, including the Association of American Publishers, Random House Inc., the American Library Association, and Palm Digital Media. It recognizes that electronic books will soon need to be rendered on an assortment of readers, devices that range in size, processing power, and operating systems. The OEBPS specification uses stylesheets to render XML-encoded data into suitable formats for particular reading devices.

Each ebook described using OEBPS must contain a package file, a group of identifiers that describe relationships within the ebook. The package file, an XML document, consists of six entities that can be broadly defined as technical metadata. These include package identity, metadata, manifest, spine, tours, and guide. The metadata required in the package file is descriptive, and must consist of Dublin Core elements. An example of this metadata is the following (Open eBook Publication Structure Specification 1.2, 2002):

<dc:Title>Alice in Wonderland</dc:Title>
<dc:Language>en</dc:Language>
<dc:Identifier>123456789X</dc:Identifier>
<dc:Creator>Lewis Carroll</dc:Creator>
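To show how the six package entities fit together, here is a minimal sketch of an OEBPS package file. This is an illustrative outline rather than a complete, validated package; the identifiers and file names are invented, and the Dublin Core block shown above slots into the metadata section.

<?xml version="1.0"?>
<package unique-identifier="bookid">
  <metadata>
    <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <!-- Descriptive Dublin Core elements, as shown above -->
      <dc:Title>Alice in Wonderland</dc:Title>
      <dc:Identifier id="bookid">123456789X</dc:Identifier>
    </dc-metadata>
  </metadata>
  <!-- Manifest: every file that makes up the publication -->
  <manifest>
    <item id="chap01" href="chap01.html" media-type="text/x-oeb1-document"/>
  </manifest>
  <!-- Spine: the default linear reading order -->
  <spine>
    <itemref idref="chap01"/>
  </spine>
  <!-- Tours: optional alternative paths through the content -->
  <tours/>
  <!-- Guide: key structural landmarks such as a table of contents -->
  <guide>
    <reference type="toc" title="Table of Contents" href="chap01.html"/>
  </guide>
</package>

Note how the package's unique-identifier attribute points at the id of a Dublin Core Identifier element, tying the package identity to the descriptive metadata.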
Given the important purpose of this specification and the strong organizational backbone affiliated with it, OEBPS will no doubt become an increasingly important standard as electronic books become more popular.

CONCLUSION

XML has come a long way in a remarkably short period of time. It has enabled some innovative tools, including those noted above, and changed the foundation of World Wide Web development. The amazing accomplishments XML will achieve in its future are left to the imagination, though this phenom's best years surely lie ahead.

REFERENCES

"A leaner, meaner markup language" (1997). Online & CDROM Review, vol. 21, no. 3, p. 181-184.
Banerjee, K. (2002). "How does XML help libraries?" Computers in Libraries, vol. 22, no. 8, p. 30-34.
Cohen, S.M. (2002). "Using RSS: an explanation and guide." Information Outlook, vol. 6, no. 12, p. 6-11.
Guenther, R. & McCallum, S. (2003). "New metadata standards for digital resources: MODS and METS." Bulletin of the American Society for Information Science and Technology, vol. 29, no. 2, p. 12-15.
Hollander, D. & Sperberg-McQueen, C.M. (2003). "Happy birthday, XML." Available: http://www.w3.org/2003/02/xml-at-5.html (Accessed: 28 April 2003).
Notess, G.R. (2002). "RSS, aggregators, and reading the Blog fantastic." Online, vol. 26, no. 6, p. 52-54.
"Open eBook Publication Structure 1.2" (2002). Available: http://www.openebook.org/oebps/oebps1.2/downlaod/oeb12.pdf (Accessed: 21 May 2003).
"RDF Site Summary (RSS) 1.0." (2000). Available: http://www.purl.org/rss/ (Accessed: 20 May 2003).

work_o424hmcdpnc5xafkdymnutcbl4 ---- OTDCF_v21no2.doc

by Norm Medeiros
Coordinator for Bibliographic and Digital Services
Haverford College
Haverford, PA

Electronic Resources Management: An Update

{A published version of this article appears in the 21:2 (2005) issue of OCLC Systems & Services.}

"Always do sober what you said you'd do drunk. That will teach you to keep your mouth shut." – Ernest Hemingway

ABSTRACT

This article reviews the Digital Library Federation's (DLF) Electronic Resource Management Initiative (ERMI) guidelines, finalized in August 2004. The specifications are reviewed in light of the electronic resource (e-resource) management needs of academic libraries. The piece reflects on comments made by Tim Jewell and Adam Chandler in an earlier "On the Dublin Core Front" column. A review of commercial e-resource management system development is also included.

KEYWORDS

DLF ERMI; Digital Library Federation Electronic Resource Management Initiative; electronic resource management; e-resource management; Tim Jewell; Adam Chandler

2004 was a year of achievements. Politics aside (after all, this is a bi-partisan column), evidence of water was discovered on Mars, Google went public, Britney Spears got married (twice), and the 86-year Curse of the Bambino ended as the Boston Red Sox won the World Series. While the Red Sox were turning around an otherwise unspectacular season during the dog days of August (they went 21-7 that month), the ERMI Steering Committee, under the auspices of the Digital Library Federation, was publishing its final report. Although less titillating than the BoSox's historic comeback against the Yankees in the American League Championship Series, the ERMI report ranks to this librarian as one of the most important accomplishments of the past year.

Readers of this column may recall an interview piece with Tim Jewell and Adam Chandler that appeared in the 19:3 (2003) issue of OCLC Systems and Services. Jewell and Chandler discussed the problems associated with e-resource management, and the ways they envisioned the work of the ERMI Steering Group helping libraries. In August 2004, ERMI's final report was released; it is available on the DLF web site. The report and its six appendices represent many years of intense effort by Jewell, Chandler, and their Steering Group colleagues, Ivy Anderson, Sharon Farb, Angela Riggio, Nathan Robertson, and Kimberly Parker. Not surprisingly, this work has attracted the attention of leading library system vendors, publishers, and standards organizations.

By reviewing the work of a small number of libraries that had developed early e-resource management systems, querying librarians about the types of e-resources data they needed to store and disseminate, and by sharing these findings in public venues such as the American Library Association Annual Conference and Midwinter Meeting, DLF Forums, and the Charleston Conference, the ERMI Steering Group developed functional requirements for an e-resource management system that have become a development roadmap for commercial vendors. It is amazing to consider how much this group of talented, dedicated librarians has accomplished in a relatively short period of time.
MARKETPLACE UPDATE

In the interview piece alluded to above, Jewell noted that the primary outcome he hoped to see as a result of ERMI was "rapid progress in developing systems to manage electronic resources" (Medeiros, 2003). Not quite two years later, Jewell's wish is a reality. The leading library system providers all have e-resource management products either currently available or in the works.

First out of the gate was Innovative Interfaces. Their product, ERM, has been in commercial release for over a year. They have sold over 100 modules, many of these to libraries whose library management system is not an Innovative product. Innovative capitalized on early recognition of the e-resource needs of libraries, choosing a handful of libraries, including the University of Washington and Ohio State University, to help develop their product, which can integrate with an existing Innovative system or be used as a standalone module. Additional information about ERM is available on the Innovative Interfaces web site.

Ex Libris is working with MIT and Harvard to develop its e-resource management system, Verde. At the 2005 American Library Association Midwinter Meeting held in Boston, Ellen Duranceau from MIT and Ivy Anderson from Harvard gave a brief presentation on Ex Libris' work-in-progress. Verde is designed to accommodate the DLF ERMI functional specifications, and will interact with the SFX Knowledge Base. Verde is being built as a standalone system, useful for libraries regardless of their library management system, and is expected for commercial release sometime in 2005. Additional information about Verde is available on the Ex Libris web site.

VTLS is working with the Tri-College Consortium (Bryn Mawr, Haverford, and Swarthmore Colleges) to develop its product, Verify. Verify is framed around the DLF ERMI functional specifications, and seeks to provide an application extensible for use by both individual libraries and groups of libraries. Like ERM and Verde, Verify is being designed as a standalone system, and is planned for commercial release by the end of 2005. Additional information about Verify is available on the VTLS web site.

Endeavor Information Systems is at work on its e-resource management product, Meridian. Meridian is being developed with input from the international community. The system, which can operate in tandem with Endeavor's Voyager library management system or as a standalone application, should be in general release by June 2005. Additional information about Meridian is available on the Endeavor web site.

Dynix is planning to build an e-resource system that will complement existing Horizon systems. Additional information is available on the Dynix web site.

In addition to these commercial products, a number of feature-rich library-developed systems exist. These systems include Penn State's ERLIC, Johns Hopkins University's HERMIS, and Boston College's ERMdb <http://www.bc.edu/bc_org/avp/ulib/staff/erm/erm-db/>. Some libraries, including MIT, Harvard, Ohio State, and the Tri-College Consortium, will need to address the challenge of data migration from their homegrown e-resource management systems to new commercial systems. It will be interesting to see how well these commercial systems are able to map local data elements into the DLF construct.

For additional information about commercial e-resource management system development, I suggest Ellen Duranceau's "Electronic Resource Management Systems from ILS Vendors," which appears in the September 2004 issue of Against the Grain (Duranceau, 2004).
The timeliest information on this topic is available at the "Web Hub for Developing Administrative Metadata for Electronic Resource Management" maintained by Adam Chandler at Cornell University.

WORKFLOW CONCERNS

Electronic resource management systems are just one step in helping libraries manage e-resources. Effective e-resource management requires changes to the ways libraries do their work. A colleague mentioned to me a few days ago that his library would soon be hiring an "e-resources person" to assist with the burgeoning array of tasks affiliated with the access, administration, and control of e-resources. Our conversation reminded me of a study conducted by Cindy Hepfer and the much-referenced Ellen Duranceau that appeared in a 2002 issue of Serials Review (Duranceau, 2002). The article details data gathered by the authors during the period 1997 to 2002. Although just a small sample (15 libraries responded), a trend was clear: staffing for e-resource activities is being vastly outpaced by the growth of e-resource collections in libraries.

A more recent and expansive survey of libraries on the topic of staffing is found in Managing Electronic Resources (Grahame, 2004). This February 2004 study details the results from 69 ARL libraries that responded to questions about their ability to manage electronic resources. Not surprisingly, the vast majority of respondents indicated they had made staffing or organizational changes as a result of e-resource demands.

CONCLUSION

It's the combination of e-resource systems, sufficient and proper allocation of staff, adoption and standardization of the ERMI functional specifications, and an unreferenced but critical issue, agreeable model licensing, that will help libraries overcome the challenge presented by today's digital information environment.

REFERENCES

Duranceau, E.F. & Hepfer, C. (2002). "Staffing for Electronic Resource Management: The Results of a Survey," Serials Review, vol. 28, no. 4: 316-320.
Duranceau, E.F. (2004). "Electronic Resource Management Systems From ILS Vendors," Against the Grain, vol. 16, no. 4: 91-94.
Grahame, V. & McAdam, T. (2004). Managing Electronic Resources. Washington, DC: Association of Research Libraries.
Medeiros, N. (2003). "A Pioneering Spirit: Using Administrative Metadata to Manage Electronic Resources," OCLC Systems and Services, vol. 19, no. 3: 86-88.

work_o6npgfvcvvfljhtdd2jzuhkaz4 ----

THEME ARTICLE

Modeling the digital content landscape in universities

Paul Conway
School of Information, University of Michigan, Ann Arbor, Michigan, USA

Abstract

Purpose – Digital content is a common denominator that underlies all discussions on scholarly communication, digital preservation, and asset management. This past decade has seen a distinctive evolution in thinking among stakeholders on how to assemble, care for, deliver, and ultimately preserve digital resources in a college and university environment. At first, institutional repositories promised both a technical infrastructure and a policy framework for the active management of scholarly publications. Now other approaches that take a broader view of digital content hold sway, the result being confusion rather than clarity about where digital content originates, who the stakeholders are, and how to establish and adjust asset management priorities. This article seeks to present a model for plotting the range of digital content that might be amenable to management as digital assets in higher education.
Design/methodology/approach – The article reviews differing perspectives on digital content, outlines a generalized model, and suggests how the model could be used for examining the distribution of campus digital assets and fostering dialog on management priorities across stakeholder communities.

Findings – A multivariate model of digital content provides a rich framework for analyzing asset management priorities in a university setting. The model should be applied and tested in a variety of university settings.

Practical implications – The model is a tool for establishing asset management priorities across campus units that produce digital content.

Originality/value – The paper offers an original model for evaluating the asset values of digital content produced or acquired in a university context.

Keywords Assets management, Digital storage, Digital libraries, Content management

Paper type Research paper

© Paul Conway. The author thanks Karen Markey and Soo Young Rieh at Michigan's School of Information and Tim Pyatt at Duke for their careful reading and very helpful comments on an earlier draft of this article.

Library Hi Tech Vol. 26 No. 3, 2008, pp. 342-354. Emerald Group Publishing Limited 0737-8831. DOI 10.1108/07378830810903283. Received 30 December 2007. Revised 11 April 2008. Accepted 26 April 2008.

Introduction

Depending on who you ask, the idea of the Institutional Repository (IR) in higher education means anything from "innovative solution" to "irrelevant curiosity". To librarians, archivists, programmers, and the faculty who are building systems and contributing publications, an IR is, at least, a more convenient and more reliable place to hold the output of scholarly communication (Smith, 2005). At best, advocates for a network of repositories hope to revolutionize these processes (Harnad, 1990) and, along the way, catalyze the reinvention of library collections and services (Keller et al., 2003). Critics of the IR movement have their own arguments, including that the technologies and associated policy frameworks are too limited, too narrowly construed, too political, or unconvincing. The vast majority of academics appears to be largely unaware of or uninterested in a suite of technologies that have too little impact on their lives as scholars or administrators (Davis and Connolly, 2007).

Against this backdrop of challenging technology development is the near ubiquity of digital content (Lyman and Varian, 2003) on and off the university campus, significant mass digitization projects transforming library book collections (Coyle, 2006), and the inexorable shift to digitally based and tool-rich scholarship within and across disciplinary boundaries. As an all-digital academy emerges, the perspective of IR advocates and supporters is broadening to encompass the active management of digital content from a panoply of sources supporting a variety of academic purposes. This shift may be due, in part, to increased concern about the preservation challenges of digital content in general (Waters and Garrett, 1996) or to explicit advocacy for new tools and services.
Libraries and other campus organizations are initiating and championing Institutional Repositories as critical components of a very diffuse approach to stewardship on campus, as libraries themselves evolve from serving primarily as "archive" to becoming social centres for teaching and learning (Lynch et al., 2007).

This article proposes one way of conceptualizing the digital content landscape in a university context. The proposed model, which the author has developed, refined, and vetted over a three-year period, is an effort to facilitate a rich cross-campus dialog on the challenges and opportunities of aggressive and effective digital asset management. The article reviews differing perspectives on digital content, outlines a generalized model, and suggests how the model could be used to examine the distribution of campus digital assets and to foster dialog on management priorities across stakeholder communities.

Evolving definitions of digital content

The concept of the IR is well described and well defined, even if not as well accepted in the academy as proponents hope. Agreement is less clear on the types of digital content included in repositories or asset management systems. For the past 15 years, the notion of what constitutes digital content worthy of local management has expanded greatly from an initial focus on the scholarly preprint to a current view that encompasses almost any sort of digital object that can be identified and described intellectually.

Initially, Harnad (1990, 2001) proposed that scholars archive their scholarly preprints as a direct challenge to the control of scholarly content by commercial publishers. Raym Crow, working under the auspices of the Association for Research Libraries' SPARC initiative[1], synthesized one of the first formal definitions of the "IR". His definition builds on but moderates Harnad's political argument while expanding the content domain slightly to encompass "intellectual product" of a university.

Institutional repositories represent the logical convergence of faculty-driven self-archiving initiatives, library dissatisfaction with the monopolistic effects of the traditional and still-pervasive journal publishing system, and availability of digital networks and publishing technologies (Crow, 2002).

Lynch's definition (2003) of IR is the most widely cited. He correctly establishes the IR as a set of services provided by a university to its community. He purposefully takes a broad approach to digital content, suggesting that an IR supports the management and dissemination of "digital materials created by the institution and its members".
The project limits the scope of its census population to organizations and systems that collect locally produced publications, but then investigates how universities within this population are expanding their use of IR technologies to assemble and manage over three dozen document types, including electronic theses and dissertations, learning objects, digitized images, software, and other types of content that may be deemed valuable for longer term retention. IR to asset management As the concept of IR stretches to accommodate an ever expanding array of formats and sources, some have begun adopting the term “digital asset management” as a broader concept capable of encompassing the active management of any form of digital content. Bicknese (2004) targets a university’s electronic records as worthy of special management attention. Thomas and Rothery (2005) argue quite forcefully for accessible and more systematically managed repositories of the digital learning objects accumulated over nearly a decade in some combination of proprietary courseware systems, open source applications, and every flavor of website. Green and Gutmann (2007) add to the complex asset management picture with a pointed appeal for attention to the maintenance of research databases and other science and social science data resources either created by the university research enterprise or acquired to support it. In digital asset management, the concept of value is a critical factor. Ross (2002), writing from the UK/European Union perspective, provides one of the earliest and most complete definitions of digital asset management applied to the higher education environment: Digital assets have the very unique characteristic of being both product and asset. Some digital assets exist only in digital form while others are created through the digitisation of analogue materials such as text, still images, video and audio. Content has the same value to institutions as other assets such as facilities, products and knowhow. Just as an organisation seeks to make efficient and effective use of its financial, human and natural resources, it will now wish to use its digital assets to their full potential without reducing their value. The University of Kansas (Fyffe et al., 2004) adopts an “asset management” perspective in its IR initiative, KU ScholarWorks[2]. “A digital asset is an electronic object that has value for some purpose”. The KU definition explicitly places digital preservation at the core of its management approach. “To become part of the University’s digital preservation program, a digital asset must support (directly or indirectly) the University’s fundamental instructional, research, or public service missions”. Asset management is criteria driven, focusing on three support functions: the academic mission of the university, university administrative needs, and the acquisition by license or purchase of data for continuing use. LHT 26,3 344 Waters (2006) reflects deeply on the trend toward stewarding digital assets and provides the most insightful description of the challenges and opportunities of expanding the content landscape. Digital assets “are resources for research and teaching in higher education and that the aim of academic institutions in managing them is to advance knowledge and improve education”. 
Waters offers a critique of the obsession that librarians have tended to have with escalating electronic journal pricing, and warns of the consequences of accepting a “dramatic, jump-off-the-cliff shift in the academy from owning scholarly output to effectively renting it”. Waters challenges universities to invest in the necessary and significant costs of repository development, including “compelling rationales for collecting, preserving, and providing access to these kinds of scholarly output”. He predicts that demand in universities “will grow for deepening connections between digital library systems used for managing digital assets in various forms and combinations of licensed, digitized, and open source materials and learning management systems”. In his approach, Waters calls for an integrated and balanced approach to the wide range of digital materials that exist in various distributed forms and function fluidly as repurposeable raw material for the emerging world of cyberscholarship. Regardless of how institutional repositories and asset management systems define the scope of content, advocates have confronted significant adoption challenges. Markey et al. (2007) suggests that a typical (median) operational repository contains 1,000 documents. Lynch and Lippincott (2005) find that comparing the size of repositories between institutions is at present an intractable problem but that repository tools are being positioned as general-purpose infrastructure with an increasingly wide array of digital content types. van Westrienen and Lynch (2005) report similar use-measurement challenges but an widening adoption of underlying repository technologies in thirteen industrialized nations. Walters (2006) finds a near total absence of both a broad understanding of what content is appropriate for asset management and a distinctive lack of awareness of end-user functional requirements in an asset system. Librarians have recruited anthropologists (Foster and Gibbons, 2005), marketing specialists (Gierveld, 2006), and economists (Lavoie, 2003) in attempts to encourage adoption. Advocates appeal to scholarly responsibility (Harnad, 2001), logic (Courant, 2006), institutional efficiency (Mackie, 2004), and the preservation mandate of universities (Hitchcock et al., 2007) to find effective stakeholder incentives. Davis and Connolly (2007), supplying data from Cornell, suggest that the idea of author-archiving is so disconnected from the reality of faculty life that there may be no real progress made until value-added aggregation services transcend the functional value of solitary repositories. As university administrators expand their notions of the resource stewardship beyond the library to encompass “asset management” by a variety of campus stakeholders, major unresolved questions centre on defining the landscape of digital content appropriate for management as an asset. The growing research literature on both institutional repositories and asset management is quite loose in its definition of what digital content is appropriate for local management, where that content originates, what its administrative limits are, and, in general, how the components of an emerging all-digital content landscape fit together. 
What is largely missing from the literature of claims and counter claims for either IR technologies or a broader “asset management” are clear distinctions among the varieties digital content that a university creates, physically assembles, and/or provides access for its community Digital content landscape in universities 345 of users. The absence of a consensus framework for digital content increases the planning overhead at every university interested in capturing institutional value. As with the clichéd story of the blind men and the elephant, the obscurity of the content landscape complicates cross-campus communication and limits opportunities to develop stakeholder-driven priorities. As the university landscape of digital content broadens beyond the realm of the scholarly preprint, it becomes increasingly necessary to model this landscape in ways that reflect the various roles and perspectives of digital content creators, stewards, and users. A comprehensive environmental scan of the digital technology landscape effecting libraries and users suggests that “too few initiatives include all the stakeholders . . . and there is no common view of what an IR is, what it contains, and what its governance structure should be” (OCLC, 2003). Additionally, tools are needed that foster rich dialog among campus stakeholders, on the content appropriate to manage as an asset, and on the priorities for allocating increasingly scarce resources that are competing for a plethora of technology needs. (Camp, 2007). Digital collection models In 2003, OCLC and Stanford University separately proposed distinctive two-dimensional collection models that envisions an evolving library environment. The first of these is the Collection Grid from OCLC (2003), which plots collections (digital and analog) in four quadrants based on the degree of stewardship required (high/low) and the extent to which uniqueness lends a distinctive character to the library and the university (high/low). As shown in Figure 1, the OCLC Collections Grid gives priority value to those special collections materials with high stewardship and uniqueness values – the very sort of materials that endow research institutions with distinctive collections identity. On the surface, The OCLC Collections Grid’s (Dempsey, 2007) embedded value system encompasses the traditional view of preservation that emphasizes long-term preservation needs over short-term user needs. (Hazen et al., 1998) The Grid reflects the traditional Figure 1. OCLC collections grid, 2003 LHT 26,3 346 archivist’s perspective that the value of unique research collections trumps redundant physical or digital collections of books and web resources. Although the Collections Grid appears to be an accurate snapshot of the collection behaviors of research libraries that are increasingly focusing their collection efforts and their university’s collection dollars on digital resources, the Collections Grid may be less useful for engaging the broad array of campus stakeholders to who may not value investment in library-oriented stewardship that is not related to immediate scholarly need. Another model from 2003 is Stanford University’s portrayal of evolving digital collections and services. (Keller, 2005) It shares with the OCLC Collections Grid awareness of stewardship responsibilities ranging from short-term need to long-term preservation. 
The Stanford model, however, plots the second dimension in terms of the “compass direction” or the evolving orientation of digital services from individual to institutional need. The strength of the Stanford model, represented in Figure 2, is the way it maps emerging academically oriented digital content on a suite of library digital repository and preservation services. The model explicitly presumes the library’s role as campus repository but does not address the management of digital assets that fall outside the library’s self-defined scope. An alternative content landscape model is the subject of this article. The Conway model was first developed at Duke University to support campus conversations on the scope of digital library activities. The model was presented and refined at a series of workshops and symposia, including the OCLC Distinguished Seminar Series. (Conway, 2004) It was applied to the specific Duke context during a year-long exploration of digital content generated by interdisciplinary research centres and academic departments. The following sections of this article describe a more fully realized version of the Conway landscape Figure 2. Stanford University libraries, digital collections and services, 2003 Digital content landscape in universities 347 model and suggest ways that the model can support a broad planning process that involves content stakeholders across an entire campus. Conway content landscape model The Conway Content Landscape Model (CLM) is a multi-dimensional framework that addresses three outstanding issues with digital asset management in universities. First, the model acknowledges the broader academic mission within which digital content is created, acquired (bought and licensed), managed and preserved, and distributed and used. Second, the model provides for selection processes and priority setting exercises based on the dual perspectives of content creator/stakeholders and content user/stakeholders. Third, the model identifies four digital content property scales that provide an analytical foundation for assigning management priorities to particular classes of digital content. At its most abstract level, seen in Figure 3, the model recognizes the information environment within which universities carry out their four-part mission to foster research, teaching, publication, and preservation (Waters, 2006). This wider environment of e-research, e-teaching, e-publishing, and e-recordkeeping is similar in structure and perspective to the digital framework that motivates the research and development activities of the UK Joint Information Systems Committee (JISC)[3]. More specifically the CLM articulates four interacting variables that together describe the core asset management challenges that universities face with digital content: property rights, structure, source and possession. Property rights distinguishes campus digital assets based on the likelihood that the university can retain the rights to capture, store, preserve and make available digital content to its academic community. In the present environment, the rights of a university vis-a vis digital content are not a dichotomous proposition, but rather depend on a number of factors that limit options for preservation and access. Complexity is lightened in situations where a university has unambiguous rights to manage digital content. Structure recognizes that digital objects range from tightly structured, highly relational database elements to loosely affiliated items assembled for varying purposes. 
Figure 3. The variable world of digital content LHT 26,3 348 Tight structure improves the likelihood that valuable assets can be identified and managed actively; dispersed and loosely affiliated objects add complexity. The source of digital assets plays a significant role in determining management priorities. Digital content that originates on a university campus (internal), either through digitization or through acquisition, may be simpler to identify and more technically capable of effective management than externally generated content. Digital content that originates locally has the value of “uniqueness” that adds distinctive character to a university, much like a library’s special collections have done through the past century. Possession as a variable of the content landscape points to the diversity of campus access models. Although some digital content of critical value to the academic mission is secured on campus-managed servers, the university rarely possesses some of the most significant digital resources in which the university has a continuing stake, particularly licensed electronic journals and books. Access is most likely through links to external data providers (journal publishers, database contractor, multimedia conglomerate) with limited or no commitment to preservation. Possession is quite often unassociated with property rights. Populating the digital content landscape are overlapping clusters of digital content whose existence in a management framework are due to specific actions taken by the university. Some content is digitized surrogates of physical objects; some content may have been “born digitally” and may be managed to varying degrees as university records. Other digital content has been purchased or otherwise acquired by university units, ranging from libraries to academic departments, specifically to support research and learning. Yet other digital content is merely licensed for use under sometimes highly restrictive access provisions. The model assumes that nearly all digital content is accessible through a browser-based web gateway, even if the university limits access to local users as a way of dealing constructively with the present intellectual property regime. These clusters overlap in the model to illustrate that the characteristics or functional origins of digital content on a university campus is rarely clear cut. For example, the university might retain the right to mount significant licensed resources on a local server; the university library might purchase and manage directly a significant collection of digitized artwork and may or may not deliver this asset to campus users from its own servers. Placing digital assets appropriately within the landscape is the first important step in establishing asset management priorities (Figure 4). The Conway CLM embeds a conscious distinction between actively managed content and the wider world of digital possibilities. The dotted line in the model represents a porous, potentially two-way boundary where selection and de-selection replace random assembly and deletion as a management ethic. Atkinson (1996) defined the area inside the boundary as the “control zone” and declared unambiguously that selection adds fundamental value to scholarship. Accessibility, particularization, maintenance, certification, standardization, and coordination are all boosted “when an object of information is moved across the boundary from the open zone into the control zone”. 
Atkinson assigned to the library and its sponsoring institution full responsibility for moving specialized scholarly publications into the control zone and maintaining them according to standards agreed on by the scholarly community.

Figure 4. Digital asset clusters on the content landscape

Figure 5 provides examples of the types of digital content that a university community typically produces and plots this content on the landscape. In the domain of digitized content (upper left) live digital objects and resources usually created locally to support teaching and learning. Digitized content that is more aggressively managed, represented by the overlapping section at centre-left, encompasses growing image and text databases, multimedia "warehouses," and portfolios of student-produced content. In the more fully managed sector (bottom left) are the output of campus research centres, faculty and university publications, and the contents of enterprise systems, most especially university electronic records systems and the increasingly important web content management systems. The domain of acquired content (upper right) encompasses research data and associated software, the digital acquisitions of the library (often on portable or fugitive media) and other digital resources purchased or otherwise obtained to support the research mission of the university. Finally, the domain of licensed content (lower right) is the large and growing world of digital books and electronic journals that have become the academic lifeblood of the campus.

Figure 5. Examples of digital content plotted on the landscape

Uses of the model

The Conway model has been applied at Duke University as a framework for gathering and evaluating information about the scope of digital assets produced by interdisciplinary research centres and academic departments. A report on this work is in preparation. The applicability of the model to other university settings should be evaluated and reported. Additionally, the potential of the model to foster a collaborative, multi-institutional approach to asset management should be explored.

The CLM has a number of possible uses as a tool for planning and advocating campus asset management activities and commitments. The model provides a framework for identifying the most salient management characteristics of existing and emerging digital assets on campus. It is a mechanism for assembling and organizing the results of a content survey. Indeed, the four issue-dimensions of the model (property rights, structure, source, possession) could provide a useful outline of the information about clusters of digital assets that should be assembled and analyzed in a campus-wide investigation. A common stumbling point in campus discussions is the tendency of stakeholders to view digital asset challenges through the prism of a particular administrative need. For example, the managers and designers of campus course management systems may view the management of e-learning objects as a pressing need while remaining relatively unaware of, or unconcerned about, the library's electronic journal management challenge. Similarly, faculty who are struggling to deal with burgeoning collections of research data from grant-funded projects may have less of an interest in the challenges of building a campus-wide digital image repository.
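The model's use as a mechanism for organizing the results of a content survey can be made concrete. The following sketch is illustrative only and is not part of Conway's article: it assumes hypothetical ordinal scores (0 = hardest to manage, 3 = easiest) on the four CLM dimensions and a simple ranking; all cluster names and values are invented.

    from dataclasses import dataclass

    @dataclass
    class ContentCluster:
        """One cluster of campus digital content, scored on the four
        CLM scales (scores are hypothetical ordinal values)."""
        name: str
        rights: int      # likelihood the university can retain rights
        structure: int   # tightly structured (3) .. loosely affiliated (0)
        source: int      # internal origin (3) .. external origin (0)
        possession: int  # campus-managed servers (3) .. external links (0)

        def priority_score(self) -> int:
            return self.rights + self.structure + self.source + self.possession

    # Hypothetical survey results, for illustration only.
    survey = [
        ContentCluster("Digitized special collections", 3, 2, 3, 3),
        ContentCluster("Licensed e-journals", 0, 3, 0, 0),
        ContentCluster("Faculty research data", 2, 1, 3, 1),
    ]

    # Rank clusters as one input to a campus priority-setting exercise.
    for cluster in sorted(survey, key=ContentCluster.priority_score, reverse=True):
        print(f"{cluster.priority_score():2d}  {cluster.name}")

In practice the four dimensions would more plausibly be weighted to reflect the differing perspectives of creator/stakeholders and user/stakeholders rather than simply summed.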
As a visual representation of the variety of digital assets that have the potential for long-term management, the content landscape model could be used as a tool for plotting the varying perspectives of campus stakeholders regarding the desirability of managing particular clusters of digital content. In the emerging all-digital academy, the quantity and variety of assets worthy of specialized management in campus repositories could well overwhelm the resources of a university. The content landscape model has the potential to serve as a framework for establishing campus digital asset priorities through the inclusion of stakeholder perspectives and commitments. For example, a campus-wide survey of digital assets, plotted on the content landscape, might well reveal clusters of valuable content and associated stakeholders that distinguish a given university within its peer group. Alternatively, information from multiple universities plotted on the content landscape may reinforce the notion that a consortium shares deeply in the value of addressing the needs of a particular type of digital asset.

Limitations of the model

The CLM does not provide adequately for some types of digital content for which campus administrators are increasingly called upon to provide technical support. Specifically, collections of important digital content owned, and sometimes even managed, by individual faculty do not fit well in the model. Depending upon their research interests and their affinity for information technology tools, faculty possess, and are continuing to assemble, significant research resources on personal computers, departmental servers and other relatively unmanaged spaces. The model also does not provide for managing as assets the burgeoning collection of web pages, whether hand-coded individually or generated by dynamic database-driven applications. Content delivered via a widely distributed network of campus servers, maintained by significant numbers of support staff, has proven to be largely immune to active management, and efforts to implement enterprise-wide web content management systems in higher education have generally not met expectations. Finally, the CLM is a static view of the world that does not account for the flow of digital content into and from asset management systems. Further research might well match the Conway model to emerging dynamic management flow models, exemplified by the consulting work of Lyon (2007) on behalf of the JISC Digital Repositories Programme.

Conclusion

Further research should also compare and contrast the strengths and weaknesses of the three models. An empirical test of the Conway model that plots the characteristics of actual collections of digital assets across the four potentially interacting variables (rights, structure, source, possession) would help refine the relevance of the model and begin quantifying the scope of the campus digital asset management challenge. The content landscape model proposed here may be most valuable, ultimately, for placing in a wider perspective the particular collection development priorities of a university library in relation to other stakeholders on campus. One of the biggest challenges that libraries face as they begin tackling the preservation of digital information is identifying, and establishing responsibility for, critical clusters of digital assets, such as campus scholarly publications, which the library is particularly well poised to preserve.

Notes

1.
Association of Research Libraries, Scholarly Publishing & Academic Resources Coalition: http://www.arl.org/sparc/

2. University of Kansas, KU ScholarWorks: https://kuscholarworks.ku.edu/dspace/

3. JISC E-resources Initiative: http://www.jisc.ac.uk/

References

Atkinson, R. (1996), "Library functions, scholarly communication, and the foundation of the digital library: laying claim to the control zone", Library Quarterly, Vol. 66 No. 3, pp. 239-65.

Bicknese, D. (2004), "Institutional repositories and the institution's repository: what is the role of the university archives with an institution's on-line digital repository?", Archival Issues, Vol. 28 No. 2, pp. 81-93.

Branin, J. (2003), "Institutional repositories", in Drake, M. (Ed.), Encyclopedia of Library and Information Science, Taylor & Francis, Boca Raton, FL, preprint available at: https://kb.osu.edu/dspace/bitstream/1811/441/1/inst_repos.pdf

Camp, J. (2007), "Top-ten IT issues", EDUCAUSE Review, Vol. 42 No. 3, pp. 12-32, available at: http://www.educause.edu/apps/er/erm07/erm073.asp

Conway, P. (2004), "Institutional repositories: is there anything else to say?", OCLC Distinguished Seminar Series, October 7, OCLC, Dublin, OH, available at: www.oclc.org/research/dss/conway.htm

Courant, P. (2006), "Scholarship and academic libraries (and their kin) in the world of Google", First Monday, Vol. 11 No. 8, p. 2, available at: www.firstmonday.org/issues/issue11_8/courant/index.html

Coyle, K. (2006), "Mass digitization of books", Journal of Academic Librarianship, Vol. 32 No. 6, pp. 641-5.

Crow, R. (2002), "The case for institutional repositories: a SPARC position paper", ARL Bimonthly Report, No. 223, available at: www.arl.org/newsltr/223/instrepo.html

Davis, P. and Connolly, M. (2007), "Institutional repositories: evaluating the reasons for non-use of Cornell University's installation of DSpace", D-Lib Magazine, Vol. 13 Nos 3/4, available at: www.dlib.org/dlib/march07/davis/03davis.html

Dempsey, L. (2007), "Thinking about collections", Fiesole Collection Development Retreat, April 12-14, Hong Kong, available at: http://digital.casalini.it/retreat/retreat_2007.html

Foster, N. and Gibbons, S. (2005), "Understanding faculty to improve content recruitment for institutional repositories", D-Lib Magazine, Vol. 11 No. 1, available at: www.dlib.org/dlib/january05/foster/01foster.html

Fyffe, R., Ludwig, D., Roach, M., Schulte, B. and Warner, B.F. (2004), Preservation Planning for Digital Information: Final Report of the HVC2 Digital Preservation Task Force, KU ScholarWorks, University of Kansas, Lawrence, KS, available at: https://kuscholarworks.ku.edu/dspace/handle/1808/166

Gierveld, H. (2006), "Considering a marketing and communications approach for an institutional repository", Ariadne, No. 49, available at: www.ariadne.ac.uk/issue49/gierveld/

Green, A. and Gutmann, M. (2007), "Building partnerships among social science researchers, institution-based repositories and domain specific data archives", OCLC Systems and Services, Vol. 23 No. 1, pp. 35-53, available at: http://deepblue.lib.umich.edu/handle/2027.42/41214

Harnad, S. (1990), "Scholarly skywriting and the prepublication continuum of scientific inquiry", Psychological Science, Vol. 1, pp. 342-3, available at: http://cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad90.skywriting.html

Harnad, S.
(2001), "For whom the gate tolls: how and why to free the refereed research literature online through author/institution self-archiving, now", online journal article (unpaginated), available at: http://cogprints.org/1639/

Hazen, D., Horrell, J. and Merrill-Oldham, J. (1998), "Selecting research collections for digitization", Council on Library and Information Resources, Washington, DC, available at: www.clir.org/pubs/abstract/pub74.html

Hitchcock, S., Brody, T., Hey, J.M.N. and Carr, L. (2007), "Digital preservation service provider models for institutional repositories: towards distributed services", D-Lib Magazine, Vol. 13 Nos 5/6, available at: www.dlib.org/dlib/may07/hitchcock/05hitchcock.html

Keller, M. (2005), "Institutional repositories: strategic considerations", presentation available at: www.ala.org/ala/acrl/aboutacrl/acrlsections/universitylib/CurrentTopicsKeller.ppt

Keller, M., Reich, V. and Herkovic, A. (2003), "What is a library anymore, anyway?", First Monday, Vol. 8 No. 5, available at: www.firstmonday.org/issues/issue8_5/keller/index.html

Lavoie, B. (2003), "The incentives to preserve digital materials: roles, scenarios, and economic decision-making", OCLC, Dublin, OH, available at: www.oclc.org/research/projects/digipres/incentives-dp.pdf

Lyman, P. and Varian, H. (2003), "How much information 2003?", School of Information Management and Systems, University of California, Berkeley, CA, available at: www.sims.berkeley.edu/how-much-info-2003

Lynch, B. et al. (2007), "Attitudes of presidents and provosts on the university library", College & Research Libraries, Vol. 68 No. 3, pp. 213-27, available at: www.ala.org/ala/acrl/acrlpubs/crljournal/backissues2007a/crlmay07/crlmay7.cfm

Lynch, C. (2003), "Institutional repositories: essential infrastructure for scholarship in the digital age", ARL Bimonthly Report, No. 226, available at: www.arl.org/newsltr/226/ir.html

Lynch, C. and Lippincott, J. (2005), "Institutional repository development in the United States as of early 2005", D-Lib Magazine, Vol. 11 No. 9.

Lyon, L. (2007), "Dealing with data: roles, rights, responsibilities and relationships (consultancy report)", JISC Digital Repositories Programme, available at: www.jisc.ac.uk/whatwedo/programmes/programme_digital_repositories/project_dealing_with_data.aspx

Mackie, M. (2004), "Filling institutional repositories: practical strategies from the DAEDALUS project", Ariadne, No. 39, available at: www.ariadne.ac.uk/issue39/mackie/

Markey, K. et al. (2007), "Census of institutional repositories in the United States: MIRACLE project research findings", Council on Library and Information Resources, Washington, DC, available at: www.clir.org/pubs/abstract/pub140abst.html

OCLC (2003), "Pattern recognition: a report to the OCLC membership", OCLC, Dublin, OH, available at: www.oclc.org/reports/escan/default.htm

Ross, S. (2002), "Position paper on DAMS for the heritage sector", Digital Asset Management Systems for the Cultural and Scientific Heritage Sector, DigiCult Thematic Issue No. 2, available at: www.digicult.info/downloads/html/1039519224/1039519224.html

Smith, M. (2005), "Exploring variety in digital collections and the implications for digital preservation", Library Trends, Vol. 54 No. 1, pp. 6-15.

Thomas, A. and Rothery, A. (2005), "Online repositories for learning materials: the user perspective", Ariadne, No. 45, available at: www.ariadne.ac.uk/issue45/thomas-rothery/

van Westrienen, G. and Lynch, C.
(2005), "Academic institutional repositories: deployment status in 13 nations as of mid 2005", D-Lib Magazine, Vol. 11 No. 9, available at: www.dlib.org/dlib/september05/westrienen/09westrienen.html

Walters, T. (2006), "Strategies and frameworks for institutional repositories and the new support infrastructure for scholarly communications", D-Lib Magazine, Vol. 12 No. 10, available at: www.dlib.org/dlib/october06/walters/10walters.html

Waters, D. (2006), "Managing digital assets in higher education: an overview of strategic issues", ARL Bimonthly Report, No. 244, pp. 1-10, available at: www.arl.org/bm~doc/arlbr244assets.pdf

Waters, D. and Garrett, J. (1996), "Preserving digital information: report of the task force on archiving of digital information", Commission on Preservation and Access, Washington, DC, available at: www.rlg.org/ArchTF

Corresponding author: Paul Conway can be contacted at pconway@umich.edu

work_ofcgyavdrjc63kqliuzi3npejq ----

TÜRK KÜTÜPHANECİLİĞİ

Computer-Based Library Systems: A Review of the Last Twenty-One Years (Bilgisayara Dayalı Kütüphane Sistemleri: Son Yirmi Bir Yılın Gözden Geçirilmesi)

Lucy A. Tedd

Abstract: The developments in the use of computer systems in libraries from 1966 to date have been great. This report looks at some of these developments in Britain, in North America, and in other countries. It traces the history of library automation from the early experimental systems through the co-operative systems, the locally developed systems, and the mini- and microcomputer-based and stand-alone integrated systems that are available today.

Öz: There have been great developments in the use of computer systems in libraries from 1966 to the present. This article examines some of these developments in Britain, North America and other countries. The history of library automation is traced from the first experimental systems to the co-operative systems, locally developed systems, and mini- and microcomputer-based and stand-alone integrated systems of today.

1. THE EARLY YEARS

At the beginning of 1966 a series of six lectures on 'Computers and the library' was given in Newcastle upon Tyne. In the preface to the book based on these lectures[1], Maurice Line, then Deputy Librarian of Newcastle upon Tyne University Library, apologised for being unable to review comprehensively the relevant literature, which was thought to be 'very extensive'. In the period since then, publications on library automation have grown to such an extent that a comprehensive review is now even more difficult. This review therefore aims to outline the main developments of the past twenty-one years and to direct the reader to the key publications of that period.

One of the reasons for choosing 1966 as the starting date for this review is that Program first appeared in that year. In March 1966 the first issue of Program was published as an independent newsletter with the subtitle News of Computers in British University Libraries. Its editor was Richard Kimber, then at the School of Library Studies at Queen's University, Belfast (and now the editor of this journal). [Translator's note: Richard Kimber is the editor of the Journal of Documentation.]

The first issues included details of the punched-card-based circulation system at Southampton University, the acquisitions system at Newcastle University, the development of a machine-readable catalogue at Essex
University, and the plans for producing a serials list at Loughborough University. In 1968 the scope of Program was widened with a new subtitle, News of Computers in British Libraries, and the journal was adopted as the main channel of communication of the Mechanisation Group recently formed within Aslib. In 1969 (Volume 3) it was decided that Program should be published by Aslib and should 'complement' the Journal of Documentation by 'dealing with the actual techniques of using computers for library processes'. The annual subscription became 30 shillings for Aslib members and 40 shillings for non-members, and the scope of Program was widened further with the new subtitle News of Computers in Libraries.

Aslib's mechanisation group soon changed its name to the Computer Applications Group. It was the members' wish that working parties be set up to examine a range of topics such as acquisitions and cataloguing, circulation, serials control systems, record structure, file manipulation and information retrieval. The reports of these working parties were published in Program; the first of them was a paper comparing the four computer-based circulation systems then existing in the United Kingdom (West Sussex County Library, Southampton University, the Atomic Weapons Research Establishment (AWRE) at Aldermaston and the Atomic Energy Research Establishment (AERE) at Harwell)[2].

When the London metropolitan boroughs were reorganised in the mid-1960s, the librarians of the new boroughs faced the problem of producing union catalogues of their collections. Four libraries (the borough libraries of Barnet, Camden, Greenwich and Southwark) decided to try a computer system to accomplish this task. These first cataloguing systems were based on eighty-column punched cards, and the resulting catalogues were printed on the computer's line printer. A good description of these and of other early British computer-based library systems is given by Woods[3].

Another important event of 1966 was the Anglo-American conference on library automation held at Brasenose College, Oxford[4]. Besides interested librarians and some of the leading American librarians working in this field, the conference brought together representatives of the major deposit libraries, such as the British Museum, the Bodleian Library and Cambridge University Library.

Libraries in North America were experimenting with computer systems too. The first volume of the Annual Review of Information Science and Technology (ARIST) was published in 1966 and contained a chapter on library automation[5]. It also carried information about the computer-produced book-form catalogue at the University of Toronto and details of the INTREX project (at the Massachusetts Institute of Technology), which aimed to provide 'a design for the transition from a large university library to a new information transfer system'.

The important American journal examining computer systems in libraries, the Journal of Library Automation (since 1982 Information Technology and Libraries), was first published in March 1968 with F.G. Kilgour as its editor. The articles in the first issue covered a computer-based acquisitions system at Texas A&I (Arts and Industries) University and the production of a book catalogue at Stanford University (using a computer with 12 Kb of memory and four magnetic tape drives).
This journal is the official organ of the Information Science and Automation Division of the American Library Association (ALA), now the Library and Information Technology Association (LITA).

Another useful source of information about computer-based library systems in the USA in the early years is the series of annual clinics on data processing and library applications held at the Graduate School of Library Science of the University of Illinois. In the early years these clinics were regarded as an arrangement 'intended to bring together librarians concerned with the various aspects of data processing'. The 1968 clinic, for example, included accounts of the online circulation system at the Illinois State Library, the initial system design for the Ohio College Library Center, a computer-based book ordering system at the University of Michigan, and the avoidance of failure in library automation[6]. In later years the clinics concentrated on specific topics (for example, contracting for a computer (1977) and public access to automation (1980)). A general history of library automation developments in the USA is given by Kilgour[7].

The development of computer-based library systems in Australia is well documented in LASIE (Information Bulletin of the Library Automated System Information Exchange). This journal started as an informal newsletter; Dorothy Peake became editor in 1970, and in 1971 it became available on subscription. LASIE aims to act as a clearing house for information on library automation in Australia, and its policy is to publish articles giving details of the planning, design and implementation of systems rather than 'well-polished' papers.

From 1967 the Office for Scientific and Technical Information (OSTI) in Britain supported a number of projects for developing housekeeping systems in libraries. [Translator's note: the routine operations, such as acquisitions, cataloguing, circulation and serials control, which have to be carried out continually in the day-to-day running of libraries, are known in English as 'housekeeping'.] Without support of this kind, many of the automation projects might never have been carried out. As part of the 'conditions of grant', libraries receiving money from OSTI were obliged to produce reports; Woods's preliminary report[8] describes the OSTI-supported acquisitions and cataloguing system developed at Southampton University. OSTI also funded the post of Information Officer for Library Automation Projects, and in 1971 the first issue of the Very Informal Newsletter, or Vine, edited by this officer and giving details of OSTI-supported projects, was published. When the British Library was established in 1974, the OSTI staff were transferred to the British Library Research and Development Department (BLR&DD). BLR&DD continues to support the post of library automation officer, and Vine (published three or four times a year) now covers all aspects of library housekeeping systems in British libraries. Articles in Vine are usually written by the editor after a visit to the organisation concerned.

Textbooks aimed at library school students and at librarians interested in computer systems began to appear at the end of the 1960s, often followed some years later by a second edition to bring readers up to date with this developing field (for example, Kimber[9], Hayes and Becker[10], and Eyre and Tonks[11]).
2. MARC DEVELOPMENTS

The birth of MARC (machine-readable cataloguing) is often attributed to a report on automation prepared in 1963 by King and others at the Library of Congress (LC) in the USA[12]. The report's main conclusion was that the bibliographic system at LC could be automated within ten years. Work on the MARC project began, but in 1967 it was realised that the problem was more complex than originally thought. By that time OSTI was financing a feasibility study of the requirements of a machine-readable bibliographic record at the BNB (British National Bibliography). There was close co-operation between LC and BNB in the design of the structure of the MARC record. The aim of the record structure was to make possible the communication of bibliographic descriptions in machine-readable form, a form which could be reformatted as required for a particular application. Between 1968 and 1974 the use of MARC at LC and BNB was largely experimental, and many libraries obtained copies of MARC records on magnetic tape for use in producing local catalogues. Wainwright[13] describes how some British libraries used these tapes.

Developments in MARC have generally been closely linked with developments in library automation at the national libraries. Before the establishment of the British Library, a report[14] argued that the new national library should offer MARC-based cataloguing services. Indeed, when BNB became the British Library's Bibliographic Services Division (BLBSD), it offered a variety of MARC-based services[15]. One of these, LOCAS (Local Cataloguing Service), provided libraries with computer output microfiche (COM) catalogues of their own holdings; this service grew out of experiments carried out at Brighton Public Library[16]. LOCAS proved popular, and by 1983 there were around eighty LOCAS customers. Since then, however, many libraries have acquired their own computer systems, and in 1986 BLBSD announced that it would cease to offer the LOCAS service in 1988.

In 1975 the British Library began to develop a complex general-purpose database management system known as the Machine Readable Library Information Service (MERLIN). MERLIN included subsystems for online book ordering and acquisition, circulation and cataloguing. Cuts imposed by the British Government in 1979 resulted in MERLIN being shelved; Robinson describes the MERLIN version as far as it had been implemented at that time[17]. In 1977 online searching of UK and LC MARC records became possible through the British Library's Automated Information Service (BLAISE). In 1985 the British Library announced its future plans for its automated services: it acquired two software packages (BRS/Search and WLN) to replace the existing BLAISE batch and online services. The new system will be known as the BLAISE 2 System, or BS2, and is intended to provide sophisticated online search and retrieval features, online cataloguing for departments of the British Library, and maintenance of catalogue files.

Since the mid-1970s the MARC record structure has been used in many countries for the production of national bibliographies.
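The communication structure that underlies these MARC-based services (standardised internationally as ISO 2709) is a simple directory-based format: a 24-byte leader, a directory of fixed-width entries, and the variable fields themselves. The following minimal reader is an illustrative sketch only, not any agency's production software; it assumes a single well-formed record held in a Python bytes object.

    FIELD_TERMINATOR = b"\x1e"  # marks the end of each variable field

    def parse_marc_record(record: bytes) -> list[tuple[str, bytes]]:
        """Parse one ISO 2709 (MARC communication format) record into
        (tag, field data) pairs. The leader gives the base address of
        the data (bytes 12-16); each 12-byte directory entry holds a
        3-byte tag, a 4-byte field length and a 5-byte start position.
        Within field data, subfields are introduced by byte 0x1f."""
        leader = record[:24]
        base_address = int(leader[12:17])
        directory = record[24:base_address - 1]  # ends with a field terminator
        fields = []
        for i in range(0, len(directory), 12):
            entry = directory[i:i + 12]
            tag = entry[:3].decode("ascii")
            length = int(entry[3:7])
            start = int(entry[7:12])
            data = record[base_address + start:base_address + start + length]
            fields.append((tag, data.rstrip(FIELD_TERMINATOR)))
        return fields

It is this fixed, easily parsed structure that allowed records created at one agency to be reformatted for quite different local applications.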
A universal MARC record (UNIMARC) was designed to facilitate the international exchange of bibliographic data in machine-readable form between national agencies. The development of UNIMARC was necessary because many countries had developed variations on the original MARC format to meet their individual needs. The first edition of UNIMARC was published in 1977 by a working group that included representatives from Austria, Belgium, Canada, Denmark, Finland, France, the German Democratic Republic, the Federal Republic of Germany, Great Britain, Hungary, the Netherlands, Ireland, the USSR, the USA and Yugoslavia. Long[18] gives a good account of the development of the UK MARC and LC MARC records, while Hopkinson[19] examines the international aspects of the subject.

In Western Europe an INTERMARC group was formed in the 1970s with representatives of national and academic libraries in Belgium, France, Switzerland and the United Kingdom. This group was initially concerned with the definition of MARC record formats and later concentrated for a time on software for MARC-based systems. In 1979 it was renamed the European Library Automation Group (ELAG). ELAG organises a seminar/conference every year, the proceedings of which are a valuable source of information on library automation developments in the various European countries[20].

The use of the MARC record structure for catalogue production in individual libraries varies. The amount of information that should appear in a bibliographic record and in a catalogue entry has long been a matter of debate. In the early 1980s the Centre for Catalogue Research at Bath University carried out a series of experiments on full and short entry catalogues, investigating the effects of catalogue entries of different lengths on system costs, user requirements and usability[21].

3. CO-OPERATIVE SYSTEMS AND SERVICES

The 1970s saw a growth in resource sharing and co-operative services among libraries. The two major co-operatives that emerged in Britain, following initial support from OSTI, are BLCMP (Library Services) Ltd (formerly the Birmingham Libraries Co-operative Mechanisation Project) and SWALCAP Library Services Ltd (formerly the South West Academic Libraries Co-operative Automation Project, and originally the South West University Libraries Systems Co-operation Project (SWULSCP)).

BLCMP was the first co-operative cataloguing service in Britain, and in 1972 it was offering an experimental service to the first three members of the co-operative (Aston University Library, Birmingham University Library and Birmingham Public Library). The original system operated in batch mode. In 1980 BOSS (BLCMP Online Support Service), providing facilities for the online searching, retention and editing of bibliographic information, was introduced. As in other co-operative systems of this kind, the BLCMP database consists both of UK and LC MARC records and of MARC records compiled by members of the co-operative for material not in the MARC database; the latter are often referred to as EMMA (extra-MARC) records. Catalogue output can be obtained in various formats: in the early 1980s most of the thirty or more academic and public libraries then in membership chose to have their catalogues produced on COM microfiche. As is also common among the other co-operatives, BLCMP has now developed a range of in-house stand-alone systems.
The first co-operative service offered by SLS (SWALCAP Library Services Ltd) was for circulation, and the first member libraries were the university libraries of Bath, Bristol, Cardiff and Exeter. An online cataloguing service was introduced in 1978, its basic design feature being flexibility in use. Member libraries' requests for MARC records were checked first against their own files, then against SWALCAP's database of MARC records, and then against the databases of BLCMP and BLAISE. If a record could not be found, the required MARC record had to be created online by library staff. In 1986 SLS announced its stand-alone integrated library management system, known as LIBERTAS; the Polytechnic of Central London is one of the first users of this new system.

The London and South Eastern Library Region (LASER) is an organisation established for library co-operation in the south-east of England. Libraries of all types may join LASER, which currently has about 90 members. In 1971 LASER set up a union catalogue system consisting, for each title, of the ISBN (or other unique identifier) and a list of codes showing which libraries in the region held that title. The first study of Co-operation on Library Automation was carried out by LASER in the mid-1970s[22]. Since then LASER has developed various systems to provide cataloguing services for member libraries. LASER was also formally responsible, at the request of the British Library, for the retrospective conversion into MARC format of BNB records going back to 1950. In addition, LASER has taken a close interest in Prestel, Britain's public access viewdata service, acting as an 'umbrella' for libraries wishing to put information on Prestel[23].

Another co-operative service is SCOLCAP (Scottish Libraries Co-operative Automation Project), founded in 1973 by a group of librarians representing the National Library of Scotland, three universities (Dundee, Glasgow and Stirling) and two public libraries (Edinburgh and Glasgow). The SCOLCAP system currently uses a Hewlett Packard 3000 series 48 minicomputer housed at the National Library of Scotland, with links to the member libraries and to BLAISE. Cataloguing, acquisitions and information retrieval facilities are available. A study of the various computer-based cataloguing services available in Britain has been prepared by the Centre for Catalogue Research[24].

There are also many examples of co-operative systems in North America. The largest of these is undoubtedly OCLC (at first the Ohio College Library Center, now the Online Computer Library Center). OCLC was founded in 1967 with two main aims: resource sharing among fifty academic libraries in the state of Ohio, and curbing the rise in library costs. The shared cataloguing system came into operation in 1971, with member libraries gaining access to the MARC-based database through online terminals; the output received was essentially in the form of catalogue cards. Over the years OCLC has expanded greatly.
By 1975 there were 240 libraries using the service; by 1982 the figure had risen to about 3000, most of them in North America. Since then OCLC has begun to market its services actively in other countries, so that now, in 1986, OCLC has around 4800 members in North America, Europe and Australia. [Translator's note: OCLC's membership had risen to 6375 by the end of 1988.] In addition to providing access to its main database of around 15 million bibliographic records, OCLC has developed software for a local system known as LS/2000, which includes cataloguing, retrieval, circulation, serials control and acquisitions facilities. LS/2000 is currently used in some 60 libraries worldwide, including a pilot project for Newcastle upon Tyne University Library in the United Kingdom and some of the libraries of Oxford University. Recently, OCLC Europe decided not to market LS/2000 in Europe, although it is still marketed elsewhere; OCLC Europe will re-enter the market when integrated local system software has been developed sufficiently for the international European market.

UTLAS (University of Toronto Library Automation Systems) is another example of the North American co-operatives. UTLAS developed out of an online catalogue support service using MARC records at the University of Toronto, and by 1973 it was providing computer-based systems and services to member libraries. Besides UK and LC MARC records, the UTLAS database includes Canadian MARC records, Quebec MARC records (the database of the National Library of Quebec) and the Japan/MARC records of the National Diet Library in Japan.

In 1967 the Washington State Library in the USA took on the responsibility of developing a system that would provide shared bibliographic support, a union catalogue and authority control for all the libraries in that state. Until 1976 the system operated in batch mode and supplied catalogue cards. In 1977 the Washington Library Network (WLN) became semi-autonomous. [Translator's note: in 1988 WLN's coverage was extended to include libraries in all the Western states, and its name, the acronym remaining the same, was changed to Western Library Network.] The online system of that time offered member libraries shared cataloguing, authority control, ordering, accounting and inter-library loan records. Member libraries of WLN are mostly in the 'Pacific Northwest' of the USA, and membership includes academic, public and state libraries. The software developed by WLN for its own services has proved extremely portable: it is used in the national libraries of Australia, New Zealand and Singapore and in the British Library, as well as in several academic libraries. Biblio-Techniques, an American firm, has added circulation and serials control modules to the main WLN software and markets the whole as a turnkey system.

In 1974 four large North American research libraries (the university libraries of Columbia, Harvard and Yale, and the New York Public Library) founded the Research Libraries Group (RLG) with the aims of co-operative collection development, shared access to collections, the preservation of research materials, and the creation and operation of advanced bibliographic tools. To achieve this last aim, the Research Libraries Information Network (RLIN) was established. Research libraries were offered facilities for creating their own online cataloguing files and searching them, as well as for viewing the records of other libraries.
RLIN is used by about 360 other libraries in addition to the more than 20 members of RLG. Matthews and Williams[25] provide a comparison of the services offered by these four North American co-operatives. The Council on Library Resources in the USA is supporting a piece of work known as the Linked Systems Project (LSP), which aims to create protocols and standards for inter-system communication among these co-operatives. In 1980 a Co-operative Automation Group (CAG) was formed in the United Kingdom under the auspices of the British Library, with representatives of BLCMP, LASER, SCOLCAP and SWALCAP as well as of Aslib, the Library Association, the Standing Conference of National and University Libraries (SCONUL) and the Council of Polytechnic Librarians (COPOL). Bakewell describes the Group's terms of reference[26]. One of CAG's first activities was to set up a Standing Group on Bibliographic Standards to examine ways of sharing bibliographic information through the adoption of common standards. Such a standard was announced in 1986: it provides for the adoption of UK MARC as the data interchange format and for the use of the Anglo-American Cataloguing Rules (AACR2) for bibliographic description.

4. LOCAL SYSTEMS IN THE 1970s

4.1 Overview

Despite the growth of co-operative systems and services, most libraries using computers in the 1970s made use of the computer of their parent organisation. This was naturally a cheap solution, but the computer (and the computer centre staff) were not under the library's control, so the library frequently had to compete with other departments in the organisation for the computer system resources it needed. In the early 1970s the Library Automation Research and Consulting Association (LARC) carried out a number of surveys of computer use in libraries in countries throughout the world. The European survey[27], for example, identified 127 operational systems in Austria, Belgium, Czechoslovakia, Denmark, Finland, France, Hungary, Ireland, Italy, the Netherlands, Norway, Poland, Romania, Spain, Switzerland, Sweden, East Germany and West Germany. LARC also carried out a survey in the early 1970s of the problems frequently encountered by libraries using computer systems[28]. Concluding his analysis of the problems encountered, Patrinostro states: 'The basic factors in adapting successfully to automation and avoiding many problems are adequate communication; long-range planning that is flexible in both human and technical matters; participation; improved vendor and staff relations; and competent supervision.'

In 1974 LARC began to publish a useful monthly journal, Network: International Communications in Library Automation, containing general articles on developments in various countries. Chauveinc[29] provides a general review of developments in France, describing the MARC-based MONOCLE cataloguing system at the University of Grenoble, a computer-produced union list of serials (Inventaire Permanent des Périodiques Étrangers en Cours) and the computer-based circulation system at the Bibliothèque Municipale d'Antony.
Publication of Network ceased in 1976.

In Britain, Aslib's Computer Applications Group was responsible for carrying out two major surveys of operational computer applications in libraries and information units. The first was based on a questionnaire sent in March 1973 to all United Kingdom libraries and information units known or thought to have operational computer applications; details of the operational systems in 135 libraries are given in the survey report[30]. The second edition, in 1976, gave information on 170 libraries[31]. One of the most popular applications among the computer systems reported in 1976 was circulation: 79 libraries were operating systems of this kind. This figure compares with 33 operational systems in 1973 and an estimated 53 systems at the beginning of 1975[32].

4.2 Circulation Systems

A basic feature of a computer-based circulation system is the recording of information about the material lent and about the borrower to whom it is lent. In the American systems of the 1960s this was often achieved by recording the book details on an 80-column card and the borrower details on a special credit-card-like 'badge'. More advanced equipment for this task was developed in the United Kingdom. This equipment, generally called a data collection unit, records the unique numbers used to identify the book and the borrower. Automated Library Systems (ALS) developed a 'card-based' system in which the book and borrower number information was punched onto cards of the same size as Browne issue cards; this system was first used at West Sussex County Library in 1967. In 1974, as an alternative to the card-based system, ALS developed what is known as the 'label-based' system, using a non-magnetic label fixed inside the back of the book. This system was first used at the Bolton-le-Sands branch of Lancashire County Library. ALS equipment is also used in Australia and in several European countries, and in 1986 ALS exported two-thirds of its production. In the early 1970s Plessey installed bar code and light pen systems in some supermarkets for stocktaking purposes. The public libraries of Camden, Luton, Oxford and Sutton decided to try this device for recording circulation information, and so the first Plessey system was installed at the Kentish Town branch of Camden Public Libraries in 1972. Plessey and ALS supplied most of the data collection units used in libraries in the 1970s, although there were some other devices. Mills Associates Ltd designed a special system for Lancaster University using a book card (physically smaller than an 80-column punched card) containing some bibliographic information, and Rontec designed a system for Bradford University using 40-column cards. Another firm, S.B. Electronic Systems Ltd, developed the Telepen (teleprinter and light pen), which had a different technical design from the Plessey light pen and permitted a more flexible bar code structure containing alphabetic as well as numeric characters. This device was used in several libraries, for example Manchester University, Sheffield Polytechnic and Cambridge University.
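The data captured by such units was deliberately minimal: one book number and one borrower number per transaction, accumulated for later processing, with a small 'trap store' of wanted numbers checked at the issue desk (both described in the next paragraph). The sketch below illustrates the basic matching idea only; it is not a reconstruction of any vendor's system, and the file layout and numbers are invented.

    # Invented transaction log: TYPE,BOOK_NO,BORROWER_NO per line, where
    # TYPE is I (issue) or R (return), as a data collection unit might
    # have recorded on paper tape for overnight batch processing.
    TRANSACTIONS = """\
    I,100234,5017
    I,100871,5017
    R,100234,0000
    """

    # The trap store holds numbers of items wanted back (for example,
    # reserved books), so a match can be flagged at the desk at once
    # rather than a day later in the batch run.
    TRAP_STORE = {"100871"}

    loans = {}  # book number -> borrower number

    for line in TRANSACTIONS.split():
        ttype, book_no, borrower_no = line.split(",")
        if book_no in TRAP_STORE:
            print(f"TRAPPED: book {book_no} is wanted ({ttype} transaction)")
        if ttype == "I":
            loans[book_no] = borrower_no   # record the issue
        elif ttype == "R":
            loans.pop(book_no, None)       # clear the loan on return

    print("Books on loan:", loans)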
The great majority of circulation systems in the early 1970s operated in batch mode: details of transactions were accumulated in the library on punched paper tape or magnetic tape cassette for processing at a set time (usually on the parent organisation's computer). Further control was possible with a 'trap store' attached to the data collection unit to deal with reservations, excessive borrowing and so on. This electronic store could hold book and borrower numbers and compare them automatically with the numbers 'read' by the data collection unit. ALS installed the first trap store system at Sussex University in mid-1971[33]. In 1975 Plessey developed a system known as Stored Program Control, incorporating a minicomputer (an Interdata 74) which played the role of an enlarged trap store and provided communication facilities between a central library and its branches. This system offered some online facilities but was still dependent on a large computer for batch processing; the London Borough of Havering was the first library to install it. In all these systems there was no standard software for processing the circulation information, which had to be produced by staff at the parent organisation's computer centre. Buckland and Gallivan[34] outline the desirable features of a circulation system and discuss how the various processing methods affect these basic functions.

4.3 Acquisition and Cataloguing Systems

The computer-based acquisition and cataloguing systems of the early 1970s were generally quite independent of the computer-based circulation systems. Deciding on the physical format of the catalogue was important for the designers of cataloguing systems. The first computer-produced catalogues were printed on line printer paper, which had certain disadvantages: poor print quality, large physical size, the length of time taken in printing, and high reproduction costs. Some of these problems were solved by the typeset catalogue, a solution adopted by some libraries, for example West Sussex County Library. Printing directly from the computer onto catalogue cards, which were then filed in the usual way, was another solution, adopted in particular by the OCLC libraries in the USA. Yet another was to use computer output microform (COM); Becker[35] outlines some of the advantages of using COM in the production of library catalogues. COM output can be in either microfilm or microfiche form, and both were used in various libraries, but it soon became clear that COM microfiche was preferred. This was demonstrated by the Bath University Comparative Catalogue Study (BUCCS)[36], which investigated the performance of four physical catalogue forms (line printer, card, COM film and COM fiche) in addition to four catalogue arrangements (name, title, divided catalogue and keyword out of context (KWOC)). Again, the great majority of acquisition and cataloguing systems ran in batch mode on the parent organisation's computer. One exception was Cheshire County Library, which began operating its integrated online acquisitions and cataloguing system in 1971[37]; the basic ideas behind this system were also used by other local authorities in the United Kingdom (Cleveland, Lancashire, Staffordshire).
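Of the four arrangements BUCCS compared, KWOC (keyword out of context) is the most mechanical: every significant word of a title becomes a heading under which the complete title is filed. The sketch below illustrates the general technique with invented titles and a toy stopword list; it is not the BUCCS software.

    STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

    def kwoc_index(titles):
        """Build a KWOC index: each non-stopword keyword of a title
        becomes a heading, with the complete title filed under it."""
        index = {}
        for title in titles:
            for word in title.split():
                keyword = word.strip(".,:;").lower()
                if keyword not in STOPWORDS:
                    index.setdefault(keyword.upper(), []).append(title)
        return index

    index = kwoc_index([
        "Computers for libraries",
        "An introduction to library automation",
    ])
    for keyword in sorted(index):
        for title in index[keyword]:
            print(f"{keyword:<12} {title}")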
In the mid-1970s a number of software packages for catalogue production appeared. The British Library produced a package for processing MARC records using IBM 360 or 370 series computers. Dataskil, a subsidiary of ICL, produced the Dataskil Integrated Library System (DILS) in conjunction with several public libraries. Telecomputing, a company based in Oxford, produced TeleMARC, an ordering and cataloguing system for ICL 1900 series computers. The acquisitions system developed at Southampton University Library was also used at the New University of Ulster in Northern Ireland.

4.4 Serials

In the 1970s various academic and special libraries began to operate computer-based systems for producing lists of their serial holdings, and some union lists of serials (by geographical area or by subject) were also produced. The PHILSOM (Periodical Holdings in Library School of Medicine) system in the USA covers 8300 serial titles held by seven medical libraries[38]. Another development concerning serials in the 1970s was the establishment of the International Serials Data System (ISDS) within the UNISIST programme[39]. An international centre was set up in Paris to promote the use of ISSNs and to develop serials files, in order to facilitate access to the scientific and technical information contained in serials. Member countries of UNESCO were expected to establish national and regional centres, and by 1975 more than 20 such centres had been set up in countries such as Argentina, Australia, Canada, France, Japan, the USA and the United Kingdom.

Dewe[40] provides an annotated bibliography of library automation from 1975 to 1978. The reports of Vickers[41] and Wainwright[42] give an insight into the computer-based library systems and services available in British libraries in the mid-1970s, as does Tedd's textbook[43].

5. THE USE OF MINI- AND MICROCOMPUTERS IN LIBRARIES

In the late 1970s several libraries began to supplement the computing facilities obtained from their parent organisations by installing a minicomputer in the library. The minicomputer was used for a variety of functions, a popular one being the provision of online access to certain files of a computer-based circulation system. Grosch[44] gives details of the possible distribution of power between local and remote computer systems. The proceedings of a conference on minicomputers in cataloguing and circulation[45] provide details of the systems planned in Britain in 1975, while the use of minicomputers in Australian libraries is covered by Middleton[46].

The trend for libraries to acquire their own computing facilities was greatly accelerated in the 1980s by the appearance of microcomputer systems. The first microcomputers used in libraries (such as the Commodore PET, the Apple or the RML 380Z) were mostly 8-bit machines with 32K or 64K of main store and perhaps 0.5 Mbyte of floppy disk storage, using the CP/M operating system. By the mid-1980s the technology had developed to provide faster machines with increased main store, and more advanced operating systems, including CP/M-86, MS-DOS, UNIX, PICK and PC-DOS, were developed for these faster microcomputers. There have also been developments in storage systems for microcomputers: the 'hard' or Winchester disks developed by IBM have much higher storage capacities (between 5 and 200 Mbytes) and faster access times than floppy disks.
A more recent development has been CD-ROM (Compact Disc Read-Only Memory) devices; many organisations in the information industry are currently investigating this new medium. Another development is the local area network (LAN), consisting of two or more microcomputers sharing storage and output facilities together with a 'network server' controlling the despatch of data and programs to the individual microcomputers. Collier[47] outlines the implications of local area networks for librarianship and information science. Microcomputers now also have 'intelligent' communication facilities giving them access to external resources such as online search services, viewdata systems and electronic mail systems. The other major factor affecting microcomputers in the 1980s has been the entry into the market of the big computer manufacturers, such as DEC, ICL and, most notably, IBM (with its IBM PC).

In the early 1980s many new journals and newsletters dealing with library automation and the use of microcomputers in libraries appeared, for example Electronic Library, Library Micromation News, Access, Microcomputers for Information Management and Small Computers in Libraries. Established journals also carry articles on microcomputer applications in libraries; the January 1985 issue of Program[48] contains reviews of the use of microcomputers in libraries in Australia and the USA and in British public libraries, as well as a review of available software. Another development in this area in Britain has been the establishment of the Library Technology Centre. In November 1982 (Information Technology Year) an Information Technology Centre was opened at the Polytechnic of Central London, supported for two years by the Department of Trade and Industry and the British Library Research and Development Department (BLR&DD). The Centre's main aim was to increase awareness of the application of information technology among librarians and information professionals. In November 1984 the Centre became the Library Technology Centre (and is supported by BLR&DD); one of its aims is the demonstration of a range of library systems, with the emphasis on microcomputer applications.

Microcomputers are used in libraries for a wide variety of purposes. Leggate and Dyer[49] divide these into 'initial applications' and 'later developments'. The initial applications include:

- intelligent terminal access to online search services;
- intelligent terminal access to computing power within the organisation;
- the use of database management systems (DBMS) and other utility programs;
- do-it-yourself software.

Database management systems (such as Compsoft DMS, dBaseII and dBaseIII) are really aimed at microcomputer users in the business field, but, as Burton[50] describes, many libraries have used and are using such packages successfully (a minimal illustration appears after the list below). Leggate and Dyer's 'later developments' rest on the emergence of software written specifically for library applications on microcomputers (for example BOOKSHELF, CALM, Sydney) and fall into three categories:

- single function, single user;
- 'integrated' systems, modular in design and providing the main library housekeeping functions;
- the extension of the above to multi-user and network configurations.
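'Single function, single user' software was typically very small indeed: a flat file of records and a lookup program of the kind a librarian might have built with dBaseII or coded directly. The sketch below is purely illustrative; the field layout is invented, although the sample records describe textbooks cited in this review.

    # A tiny flat-file catalogue of the kind a do-it-yourself program
    # might have kept on floppy disk.
    CATALOGUE = [
        {"author": "Kimber", "title": "Automation in libraries", "year": 1974},
        {"author": "Tedd", "title": "An introduction to computer-based library systems", "year": 1977},
    ]

    def search(field, term):
        """Single-function lookup: records whose field contains term."""
        term = term.lower()
        return [rec for rec in CATALOGUE if term in str(rec[field]).lower()]

    for rec in search("title", "librar"):
        print(f'{rec["author"]}: {rec["title"]} ({rec["year"]})')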
Leggate and Dyer's paper[49] is the first in a series of articles published in the Electronic Library forming an introduction to microcomputer applications in libraries (especially small libraries).

Much has been written about microcomputers and their use in libraries. Burton's annotated bibliography[51] contains 600 references drawn from a range of journals and conference proceedings from 1976 to 1985. The main British textbook in this field was written by Burton and Petrie and published in 1984[52]; a second edition was in preparation in 1986. In the preface to this book the authors stress the importance of software. The value of developing software for microcomputers can be gauged by comparing Trevelyan and Rowat's 1981 report on the use of systems programs in library applications of computers[53], Tedd's review published in 1983[54], the review by Burton and Gates[55], and Gates's guide to the software available for microcomputer use in libraries[56]. In the USA a database on microcomputer applications in libraries is compiled and maintained at Simmons College, Boston[57].

An international conference on the application of mini- and microcomputers in information, documentation and libraries was held in Israel in 1983 and attended by some 400 participants from 27 countries. The proceedings of this conference[58] contain many papers, including some from developing countries, and an international inventory of software packages in the information field was published by the conference organisers[59]. A successor conference, this time concentrating on microcomputers, was held in West Germany in 1986 with representatives from 53 countries. Unesco sponsored a pre-conference seminar on the use of computers in libraries and information work in developing countries, a theme which re-emerged during the main conference, where two papers described a data management package called IV+V (Informations Vermittlung und Verarbeitung) developed for use in developing countries. Many microcomputer users in libraries in developing countries make use of standard packages such as dBaseII; for example, Haravu and others[60] describe a microcomputer-based book acquisition system using dBaseII at the International Crops Research Institute for the Semi-Arid Tropics at Patancheru in India.

A comparison of 16 microcomputer circulation systems in the USA is given by Matthews[61]. Some of the systems developed on microcomputers and used in small academic or special libraries are described below:

ADLIB/ADLIB2. ADLIB was developed in the mid-1970s by LMR Computer Services as a database management tool to enable individual libraries to build their own computer-based library systems. The microcomputer version, ADLIB2, is a more standardised product designed to run under the Unix operating system. LMR was taken over by Databasix in 1985, and the two products are now marketed by Databasix.

Biblio Lend/Biblio Buy. These are two modules, for circulation and acquisitions respectively, developed by Biblio Tech Ltd, a firm specialising in the development of microcomputer systems using the Telepen device for small and medium-sized educational libraries.
Clark [62] describes the requirements for a circulation system for the College of Librarianship Wales Library, which resulted in the acquisition of Biblio Lend.

Bookshelf. Bookshelf was designed by Logical Choice (Computer Services) together with the Cairns Library at the John Radcliffe Hospital in Oxford [63]. It runs under the Pick operating system and comprises modules for acquisitions, cataloguing and enquiry, circulation and serials control. It is also used by other medical libraries, for example Fife Health Board and Leicester General Hospital.

CALM. CALM (Computer Aided Library Management) was first developed in Israel in 1982 and has since been marketed in Britain by Pyramid Computer Systems Ltd. It is an integrated system with modules for acquisitions, cataloguing and enquiry (thesaurus control and subject indexing are optional extras), circulation and serials. It is used in 120 libraries worldwide, of which about 11 are in the United Kingdom.

Dynix. The Dynix package was developed in the USA in 1983. It uses the Pick operating system and is designed as an integrated package with modules for cataloguing, circulation, retrieval and acquisitions. There are about 80 libraries using Dynix in the USA; current British users are mainly academic libraries and include Stirling University, Barnet College and the South Glamorgan Institute of Higher Education [64].

LIBRARIAN. LIBRARIAN was developed in 1982 by Eurotec Consultants Ltd, a British firm. It is an integrated system with modules for cataloguing and enquiry; the library at the University of Buckingham is one user of this system [65].

MICRO LIBRARY. Sydney Ltd's modular library management system MICRO LIBRARY is a microcomputer version of the Easy Data integrated library system developed in 1981. Designed for technical, research and other specialist libraries and for commercial libraries, MICRO LIBRARY runs on the IBM PC (either stand-alone or as part of a network) or on DEC VAX systems. Customers in the United Kingdom include British Coal, Midland Bank, Unilever and the Central Computing and Telecommunications Agency.

Manson [66] reviews the computer-based packages available in Britain for 'housekeeping' purposes, and Gates [67] describes some of the factors to be considered in choosing a microcomputer system for small libraries.

6. THE DEVELOPMENT OF INTEGRATED LIBRARY 'HOUSEKEEPING' SYSTEMS

Another trend of the 1980s has been the provision of packaged hardware and software, often referred to as turnkey systems, covering a range of library 'housekeeping' operations. Alongside new vendors (for example Geac), most of the vendors who originally sold data capture units for circulation systems (for example ALS and Plessey) developed turnkey systems. Besides circulation these systems provide various applications, including the popular facility of online public access to the catalogue file (OPAC). Various co-operative services (for example BLCMP, OCLC, SLS) have developed turnkey systems which can be used with, or without, centrally held shared bibliographic files.
Several traditional suppliers of library services (such as the British bookseller Blackwell, the American subscription agency Faxon and the Dutch subscription agency Swets) also provide packaged systems which carry out some local processing and give access to their own central computer systems. The integration of various operations into a single computer-based library system is an attractive idea, and a few libraries have achieved it (for example, Northwestern University in the USA with NOTIS, and the library at Virginia Polytechnic Institute and State University with VTLS). There are also commercial organizations marketing integrated systems (such as IBM with DOBIS/LIBIS).

Any library manager in Britain or North America contemplating implementation in the 1980s was therefore faced with a wide choice of systems. Matthews provides a guide for the library decision-maker containing a list of 'dos and don'ts', and also describes the present-value method used in cost-benefit analysis. Boss, Corbin and Toohill likewise provide useful material on installing computer-based systems in libraries. A special issue of Program contains the papers from a one-day conference on 'contracting for a computer', including useful contributions by Wainwright on drawing up the technical specification and by Ashford on the evaluation of tenders. A guide and checklist on the selection of online and stand-alone circulation systems, aimed primarily at British public librarians, was produced by Lee and others. More recently Leeves has produced standard descriptions of some 20 turnkey library 'housekeeping' systems. In the USA, Rush has prepared a series of guides for those concerned with providing computer-based support for library operations; the seventh in the series, for example, covers cataloguing. A report by Boss and McQuinn describes nine automated circulation systems available in the USA at the beginning of the 1980s.

Some of today's turnkey systems are briefly described below:

ALS. ALS developed its integrated database system known as System 5, first installed at Derbyshire County Library in 1979. The system is aimed almost entirely at the public library market and uses the Browser access terminal. It includes cataloguing, circulation and retrieval facilities. ALS's latest system, System 38, is being installed at its first customer's library.

CLSI Ltd. Computer Library Services International (CLSI) has been involved in library automation in the USA since 1971 and has recently begun marketing its products in other countries; there are now over 1,100 libraries using CLSI systems. At the end of 1986 the National Library of China in Beijing signed a contract with CLSI for the automation of its circulation activities. CLSI's customers in the United Kingdom include Coventry Polytechnic and Coventry City Library, Warwickshire County Library, Cumbria, Hammersmith and Fulham, and Heriot-Watt University. The system offers acquisitions, cataloguing, retrieval, circulation and (using Blackwell's PEARL software) serials control services.

DS. DS is a British firm, founded essentially by former Plessey staff when Plessey withdrew from the library market in 1983.
It took over responsibility for Plessey's contracts with existing library customers. DS operates in the public library market with its Module IV system, which uses Perkin-Elmer minicomputers, for example in Kent County, Hampshire County and Ealing. Module IV was designed as a stand-alone circulation system, but modules for full MARC cataloguing and for acquisitions are planned.

Geac. Geac is a Canadian firm whose turnkey circulation system was first used in 1977 at the university libraries of Guelph and Waterloo. Since then it has developed cataloguing, retrieval and acquisitions modules and added MARC record management. Various academic libraries in Britain have Geac systems, for example the university libraries of Durham, Hull, Leeds and London and the polytechnic libraries of Preston and South Bank. Sussex University Library has itself enhanced its Geac system locally.

IBM. IBM's DOBIS/LIBIS integrated system was developed by IBM together with the universities of Dortmund (West Germany) and Leuven (Belgium) and first appeared in 1978. It provides searching, acquisitions, cataloguing, retrieval, circulation and serials control. DOBIS/LIBIS is now used at about 80 sites throughout the world, in at least a dozen languages including Arabic and Chinese. It is now being actively marketed in Britain; its first users are Liverpool University and Bristol Polytechnic.

McDonnell Douglas. McDonnell Douglas's URICA system is an integrated set of software modules providing functions such as acquisitions, cataloguing, circulation, enquiry and serials control. The software is used in 67 libraries of various kinds worldwide. British users include Lincolnshire County Libraries, the London Borough of Newham, Bath University and Wiltshire County Library.

As mentioned earlier, the co-operatives are also developing stand-alone software. BLCMP (Library Services) Ltd developed the CIRCO circulation system using the Data General Eclipse series of computers. The first CIRCO system was installed in mid-1982 at the Barbican Centre in London; further systems were installed in 1982 and 1983 at the polytechnic libraries of Manchester, Middlesex, Portsmouth, North London, Thames and Huddersfield. CIRCO is essentially concerned with circulation, but BLCMP plans to devolve all application routines to a local stand-alone system (the enhanced CIRCO, now known as BLS), used in conjunction with the BLCMP computer as a source of bibliographic records. SLS's stand-alone system, LIBERTAS, runs on DEC VAX computers and covers cataloguing, circulation, retrieval, acquisitions and serials, with access to the main SLS computer for bibliographic records.

7. RECENT DEVELOPMENTS

Many librarians in various countries of the world are now concerned with providing library users with online access to library files. The April 1986 issue of Program concentrates on these online public access catalogues (or OPACs) and includes, as well as details of OPAC systems in North America and Australia, descriptions of the OPAC modules of some stand-alone turnkey systems and reports of various research projects related to OPACs.
The British Library Research and Development Department (BLR&DD) has supported several projects concerned with OPACs; probably the largest of these to date has been the design of a prototype OPAC known as OKAPI at the Polytechnic of Central London. Mitev, one of the OKAPI researchers, contributed a useful paper on the users and usability of OPAC systems to the special issue of Program. The Centre for Catalogue Research at Bath University (also supported by the BLR&DD) is, among its other activities, much involved in OPAC work and organizes annual conferences specifically on OPAC developments. In October 1985 the BLR&DD's working party on information technology in libraries and information services recommended that £300,000 be spent on OPAC research. Online catalogues were also the theme of the 1985 Essen symposium, whose proceedings give details of OPAC developments in various European countries.

Another area of concern, or problem, for anyone installing a computer-based cataloguing system is the retrospective conversion of existing catalogue records into machine-readable form. Various services now exist to help libraries in this major undertaking. The Carrollton Press in the USA has created a bibliographic database (known as REMARC) containing records for some five million works in the Library of Congress that were not previously in MARC format. REMARC records are used by, among others, the National Library of Wales, the National Library of Singapore and the University of Kent. OCLC also offers various retrospective conversion services, such as MICROCON and TAPECON. In Britain a long-term project known as GKIII is under way to convert the whole of the British Museum Library's catalogue of printed books (some 4.5 million bibliographic records) into machine-readable form. Harrison describes a technique for intelligently and optically scanning catalogue cards to create full MARC records; this new technique is being used in several libraries, for example Edinburgh University Science Library and the library of the University of Sydney in Australia. Textbooks describing some recent developments in library automation in Britain are those by Lovecy, Rowley and Tedd.

Various surveys of computer use in different types of library in the United Kingdom have been carried out recently, giving a general picture of recent developments.

The Council of Polytechnic Librarians (COPOL) carried out surveys of information technology use in the thirty British polytechnics in 1975, 1982 and 1986. The latest survey indicates that 28 libraries (93%) have computer-based cataloguing systems, 27 (90%) computer-based circulation systems, 18 (60%) computer-based acquisitions systems and 3 (10%) some form of computer-based serials control. Some members of the COPOL information technology group have prepared a useful paper on the accommodation and environmental requirements of information technology in academic libraries.

The New Technology Panel of the Association of London Chief Librarians (ALCL) carried out surveys of new technology use in the 34 London borough libraries in 1979, 1983 and 1985. The latest survey shows that 24 (71%) use computers for catalogue production, 22 (65%) for circulation and 3 (9%) for book ordering.
In 1985 a survey based on the same questionnaire was also carried out in the 166 British public library authorities. Of these, 96 (58%) use computers for catalogue production (the most significant trend being towards integrated stand-alone systems: 29 libraries had such systems in operation and a further 12 were shortly to bring them into use) and 93 (56%) use them for circulation.

In 1984 the University Grants Committee decided to send a questionnaire on library automation to the 53 university institutions in the United Kingdom (the University of London being treated as a single institution). The results showed that 51 (96%) were operating some form of computer-based system.

In 1986 a series of surveys known collectively as LIB-2 was supported by DGXIII of the European Economic Community (EEC). These surveys (one for each member state) aimed 'to create a real consensus for effective European library co-operation'. Each survey was wide-ranging, assessing the impact and cost of applying information technology to libraries as well as examining library 'housekeeping' systems and online resources. The British survey was carried out jointly by the Library Association and the Library Technology Centre. Its results have not yet been published, but some background information is given in the paper presented by Iljon and Lupovici to a European conference on library automation held in Harrogate in 1986.

Technological developments have inevitably advanced rapidly. Telecommunications (including local area networks and facsimile transmission), storage techniques (CD-ROM and so on), display techniques, processing speeds, videotex and others will all enter the domain of the library 'housekeeping' systems of the future. The relevance of expert systems to library and information work will doubtless be explored further; for example, Davies and James report experiments carried out at the University of Exeter on the automatic application of certain AACR2 rules. It is to be hoped that the designers of the computer-based library systems of the next 21 years will bear in mind some of the lessons learned from the mistakes of the last 21, and will develop cost-effective systems that genuinely meet the needs of library users.

REFERENCES

1. Cox, N.S.M., Dews, J.D. and Dolby, J.L. The computer and the library. Newcastle upon Tyne: University of Newcastle upon Tyne Library, 1966.
2. Wilson, C.W.J. Comparison of UK computer-based loans systems. Program 3(3/4), 1969, 127-142.
3. Woods, E.G. Library automation. London: British Library, 1982.
4. Harrison, J. and Laslett, P. eds. Brasenose conference on the automation of libraries. London: Mansell, 1967.
5. Black, D.V. and Farley, E.A. Library automation. In: Cuadra, C., ed. Annual Review of Information Science and Technology, vol. 1. New York: Wiley, 1966, 273-303.
6. Carroll, D.E. ed. Proceedings of the clinic on library applications of data processing. Urbana, Illinois: Graduate School of Library Science, University of Illinois, 1969.
7. Kilgour, F.G. Historical development of library computerization. In: Hammer, D.P. ed. The information age, its impact. Metuchen, New Jersey: Scarecrow Press, 1976.
8. Woods, R.G. Acquisitions and cataloguing systems: preliminary report. Southampton: Southampton University Library, 1971.
9. Kimber, R.T. Automation in libraries. Oxford: Pergamon Press, 1968; 2nd edition 1974.
10. Hayes, R.M. and Becker, J. Handbook of data processing for libraries. New York: Becker and Hayes, 1970; 2nd edition, Los Angeles: Melville Publishing Company, 1974.
11. Eyre, J. and Tonks, P. Computers and systems: an introduction for librarians. London: Bingley, 1971.
12. King, G.W. et al. Automation and the Library of Congress: a survey sponsored by the Council on Library Resources. Washington: Library of Congress, 1963.
13. Wainwright, J. BNB MARC users in the UK: a survey. Program 6(4), 1972, 271-283.
14. The scope for automatic data processing in the British Library: report of a study into the feasibility of applying ADP to the operations and services of the British Library. London: HMSO, 1972.
15. British Library MARC services: a guide for intending users. London: British Library Bibliographic Services Division, 1975.
16. Duchesne, R.M. and Donbroski, L. BNB/Brighton Public Libraries Catalogue Project: BRIMARC. Program 8(3), 1973, 205-224.
17. Robinson, S. Sleeping beauty: MERLIN, a state of the art report. Program 14(1), 1980, 1-13.
18. Long, A. UK MARC and US MARC: a brief history and comparison. Journal of Documentation 40(1), 1984, 1-12.
19. Hopkinson, A. International access to bibliographic data: MARC and MARC-related activities. Journal of Documentation 40(1), 1984, 13-24.
20. Chauveinc, M. ed. European Library Automation Group ninth library systems seminar, Paris, 14-16 April 1985. Paris: Bibliothèque Nationale, 1986.
21. Seal, A.W. et al. Full and short entry catalogues: library needs and uses. Bath: Bath University Library, 1982.
22. Ashford, J.H., Bourne, R. and Plaister, J. Co-operation in library automation. London: LASER, 1975.
23. Plaister, J. LASER and Prestel. Aslib Proceedings 33(9), 1981, 343-350.
24. Seal, A.W. Automated cataloguing in the UK: a guide to services. Bath: Bath University Library, 1980.
25. Matthews, J.R. and Williams, J.F. The bibliographic utilities: progress and problems. Library Technology Reports 18(6), 1982, 603-653.
26. Bakewell, K.G.B. The UK library networks and the Co-operative Automation Group. Aslib Proceedings 34(6/7), 1982, 301-309.
27. Patrinostro, F.S. ed. A survey of automated activities in European libraries. Vol. 8. Tempe, Arizona: LARC, 1972.
28. Patrinostro, F.S. ed. A survey of commonplace problems in library automation. Vol. 11. Tempe, Arizona: LARC, 1973.
29. Chauveinc, M. Automation of library and information services in France. Network 1(1), 1974, 21-24.
30. Wilson, C.W.J. ed. Directory of operational computer applications in United Kingdom libraries and information units. London: Aslib, 1973.
31. Wilson, C.W.J. ed. Directory of operational computer applications in United Kingdom libraries and information units. 2nd edition. London: Aslib, 1976.
32. Young, R.C. United Kingdom computer-based loans systems: a review. Program 9, 1975, 102-114.
33. Young, R.C., Stone, P.T. and Clark, G.J. University of Sussex Library automated circulation control system. Program 6(3), 1972, 223-247.
34. Buckland, M.K. and Gallivan, B. Circulation control: on-line, off-line or hybrid. Journal of Library Automation 5(1), 1972, 30-39.
35. Becker, J. Computer output microform for libraries. Unesco Bulletin for Libraries 28(5), 1974, 242-248.
36. Lambie, J.H., Bryant, P. and Needham, A. Bath University Comparative Catalogue Study: final report (nine parts). Bath: Bath University Library, 1975.
37. Berriman, J.G. and Pilliner, J. Cheshire County Library acquisitions and cataloguing system. Program 7(1), 1973, 38-59.
38. Mayden, P. The PHILSOM network: the co-ordinator's viewpoint. In: Axford, H.W. ed. Proceedings of the LARC Institute on Automated Serials Systems. Tempe, Arizona: LARC, 1973.
39. Koster, C.J. ISDS and the functions and activities of national centres. Unesco Bulletin for Libraries 27(4), 1973, 199-204.
40. Dewe, A. An annotated bibliography of library automation 1975-1978. London: Aslib, 1980.
41. Vickers, P.H. Automation guidelines for public libraries. London: HMSO, 1975.
42. Wainwright, J. Computer provision in British libraries. London: HMSO, 1975.
43. Tedd, L.A. An introduction to computer-based library systems. London: Heyden, 1977.
44. Grosch, A.N. Minicomputers in libraries, 1981-82: the era of distributed systems. White Plains, New York: Knowledge Industry Publications, 1982.
45. Ross, J. ed. Minicomputers in cataloguing and circulation: papers presented at a one-day conference on 24 October 1975 at Aslib, London. London: British Library Bibliographic Services Division, 1975.
46. Middleton, M.R. ed. Proceedings of a national conference on library and bibliographic applications of minicomputers, Sydney, Australia, August 22-24, 1978. Sydney: School of Librarianship, University of New South Wales, 1979.
47. Collier, M. Local area networks: the implications for library and information science. London: British Library, 1984.
48. Program. Special issue on microcomputers. 19(1), 1985.
49. Leggate, P. and Dyer, H. The microcomputer in the library: I. Introduction. Electronic Library 3(3), 1985, 200-209.
50. Burton, P.F. Microcomputer applications and the use of database management software. Program 16(3), 1982, 180-190.
51. Burton, P.F. Microcomputers in library and information services: an annotated bibliography. Aldershot: Gower, 1985.
52. Burton, P.F. and Petrie, J.H. Introducing microcomputers: a guide for librarians. Wokingham: Van Nostrand Reinhold, 1984; 2nd edition 1986.
53. Trevelyan, A. and Rowat, M.J. An investigation of the use of systems programs in library applications of microcomputers. London: British Library, 1983.
54. Tedd, L.A. Software for microcomputers in libraries and information units. Electronic Library 1(1), 1983, 31-48.
55. Gates, H. A directory of library and information retrieval software for microcomputers. Aldershot: Gower, 1985; 2nd ed., 1988.
56. Chen, C-C. and Wang, X. MicroUse: the database on microcomputer applications in libraries and information centers. Microcomputers for Information Management 1(1), 1984, 39-56.
57. Keren, C. and Perlmutter, L. eds. The application of mini- and micro-computers in information, documentation and libraries. Amsterdam: North-Holland, 1983.
58. Keren, C. and Sered, I. eds. International inventory of software packages in the information field. Paris: Unesco, 1983.
59. Lehmann, K.-D. and Strohl-Goebel, H. eds. Proceedings of the Second International Conference on the application of micro-computers in information, documentation and libraries.
SCIENCE (American Association for the Advancement of Science)
10 November 1989, Volume 246, Number 4931

A Question of Information Policy

Two of the nation's premier libraries, the Library of Congress and the National Library of Medicine, may be growing disenchanted with their altruistic images. More likely, implicitly being asked to assume an unfair share of the federal deficit, they are looking for solutions to their own budget crunch. Encouraged by Congress to reduce operating costs, the Library of Congress announced that on 1 January 1990 it would begin charging licensing fees and imposing restrictions on some of the reuse of bibliographic records distributed to libraries and library utilities. The fees would be in addition to subscription fees currently charged to recover costs of reproduction and distribution. Librarian of Congress James Billington says he is tired of everybody making money off Library of Congress efforts except the Library of Congress. The National Library of Medicine has also announced plans to implement restrictions and licensing fees.

The response by the library community was swift and uniformly negative: the community had problems, not only with the licensing philosophy, but also with interpreting the proposed licensing agreement which, as written, did not capture the Library of Congress intentions for implementation. The Library of Congress has since announced that implementation of a licensing policy will be delayed until an evaluation can take place. The library and academic communities will thus have an opportunity to discuss policy questions, an opportunity that should have been provided prior to issuing the conditions of the agreement.

Many questions come to mind. Is it ethical and legal for the Library of Congress and the National Library of Medicine to charge costs over and above reproduction costs for bibliographic records created by government employees at libraries funded by tax receipts? Since the government subsidizes many libraries, is this not largely a case of taking funds out of the left pocket to put in the right? And, if already financially strapped academic, research, and university libraries are required to use more of their budgets to purchase machine-readable records for their electronic catalogs, will not other aspects of our libraries, such as collection development and patron services, suffer? Through a variety of cooperative programs, libraries other than the Library of Congress have contributed records to the files that would be licensed. Should not these partner libraries have some influence on any potential licensing arrangement for their records?

The planned charges and restrictions seem to challenge the intentions of statutory and constitutional provisions that shape U.S. federal government information policy: the First Amendment to the Constitution, the Freedom of Information Act, the Privacy Act of 1974, the Paperwork Reduction Act of 1980, and section 105 of the Copyright Act. It also seems that the scientific research community will be disadvantaged. With our libraries paying additional fees, there will be fewer already scarce dollars to purchase materials and fewer funds to provide access to materials not held by particular libraries. Although the Library of Congress and the National Library of Medicine provide highly respected cataloging information, these libraries do not have wide distribution mechanisms in place.
That task falls to nonprofit bibliographic utilities such as the Online Computer Library Center (OCLC) and the Research Libraries Information Network. These utilities add value to the Library of Congress and National Library of Medicine records as they are shared with libraries. OCLC has estimated the proposed agreement from the Library of Congress could cost an additional $500,000 to $6 million a year, most of which would have to be passed on to member libraries. The fees required by the National Library of Medicine will result in significant increases in the prices of compact disc databases containing library records.

Perhaps a reexamination of the missions of these great libraries is in order. The Library of Congress serves Congress, the American people, and their libraries. It exists to make its resources maximally accessible and to facilitate and celebrate free intellectual creativity by all people on all subjects. The National Library of Medicine's purpose is to assist the advancement of medical and related sciences and to aid the dissemination and exchange of scientific and other information important to the progress of medicine. These are noble objectives. Let's hope they are not forgotten. - Richard C. Atkinson, Chancellor, University of California, San Diego, La Jolla, California 92093

Science 246(4931), 10 November 1989, p. 733. DOI: 10.1126/science.2814490
A Study of ONIX for Books
Journal of Educational Media & Library Sciences 41:2 (December 2003), 219-235
Ho-chin Chen and Hui Ou-Yang

The electronic book can be traced back to Project Gutenberg, begun in 1971, whose texts, much of them light literature, circulate in device-independent form (note 1); commercial e-book formats such as Adobe Acrobat and Microsoft Reader files are, by contrast, device-dependent (note 2). Against this background two families of e-book standards have emerged: EDItEUR's ONIX (ONline Information eXchange) for trade product information, and the Open eBook Forum's (OeBF) Open eBook Publication Structure 1.2 (OEBPS) for content (note 3).

ONIX grew out of earlier work on book-trade product information by parties such as the wholesaler Ingram and the bibliographic publisher Bowker (note 4). In July 1999 the Association of American Publishers (AAP), together with some 60 publishing houses, took up the problem of online book information, and in 2000 the AAP and its partners released ONIX (note 5). ONIX stands for ONline Information eXchange. It is maintained by EDItEUR (the European EDI group) jointly with the Book Industry Study Group (BISG) in the United States and Book Industry Communication (BIC) in the United Kingdom (note 6). BIC was founded by the Publishers Association, the Booksellers Association, the Library Association and the British Library (note 7), and an international steering group drawn from EDItEUR, BISG and the AAP oversees the standard (note 8). An electronic discussion list, ONIX_IMPLEMENT, supports implementers (note 9). ONIX is based on EDItEUR's EPICS (EDItEUR Product Information Communication Standards) data dictionary (note 10); EPICS in turn draws on the INDECS (Interoperability of Data in E-Commerce Systems) analysis, through which EDItEUR also cooperates with the International DOI Foundation (IDF) (note 11).

ONIX 1.2 was issued in November 2000, release 1.2.1 on 1 July 2001, and release 2.0 on 25 July 2001 (with a revision on 2 August); ONIX for Books 2.1 followed in June 2003 (note 12). Release 2.0 made a number of changes (note 13), among them the introduction of separate record formats for series and subseries; release 2.1 added, among other things, support for XHTML in text fields and revisions to the Website and Title composites (note 14).

ONIX messages are expressed in XML and validated against an XML DTD (note 15). A controlled product form code identifies the physical or digital form of each product (note 16).

[Table 1. Product form codes. Source: ONIX code lists.]

ONIX defines two parallel tag sets: Level 1 uses readable XML reference names, while Level 2 uses terse short tags. Taking product numbers as an example, the older individual elements PR.2.1-PR.2.6 carry specific number types, while the product identifier composite comprises PR.2.7 (product identifier type code), PR.2.8 (identifier type name) and PR.2.9 (identifier value). The type codes for PR.2.7 are drawn from code list 5: code 01 denotes a proprietary identifier, in which case PR.2.8 must supply the identifier type name, while code 02 denotes a ten-digit ISBN, so that, for example, ProductIDType 02 with IDValue 8474339790 records the ISBN 8474339790.

[Table 2. Product number elements and the product identifier composite. Source: EDItEUR, BISG and BIC, ONIX Product Information Message: Product Record Format, v2.1, June 2003, p.20 (19 Nov. 2003).]
[Table 3. Product identifier type codes. Source: EDItEUR, BISG and BIC, ONIX Code Lists, Issue 1 (London: EDItEUR, 2003), List 5, p.7 (13 May 2003).]

In release 2.1 a number of the older elements, including PR.2.1-PR.2.6, are deprecated in favour of the composite, new elements such as class of trade are added, and some elements are flagged for use in one territory only ('UK only', 'US only', 'Europe only').

The ONIX for Books 2.1 documentation comprises an XML message specification together with specifications for the product record, the main series record and the subseries record (note 17). An ONIX message consists of a start of message (the XML declaration and opening tag), a header block, the body of the message carrying one or more records, and the end of message (note 18).
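To make the structures just described concrete, the following Python sketch assembles a minimal ONIX 2.1 message using the Level 1 reference names: a header naming sender and addressee, and one product record carrying the ISBN example quoted above (ProductIDType 02, value 8474339790). The company names, EAN/SAN codes, record reference, date and title are hypothetical placeholders, and the element set shown is only a small subset of the specification.

```python
# Minimal ONIX 2.1 message sketch (Level 1 reference names).
# Only the ISBN example (ProductIDType 02 = ISBN, value 8474339790) comes
# from the text above; every other value is a hypothetical placeholder.
import xml.etree.ElementTree as ET

message = ET.Element("ONIXMessage")

header = ET.SubElement(message, "Header")
ET.SubElement(header, "FromEANNumber").text = "5012345678900"   # hypothetical EAN-13
ET.SubElement(header, "FromSAN").text = "1234567"               # hypothetical SAN
ET.SubElement(header, "FromCompany").text = "Example Publisher Ltd"
ET.SubElement(header, "ToCompany").text = "Example Bookseller"
ET.SubElement(header, "SentDate").text = "20031119"             # YYYYMMDD

product = ET.SubElement(message, "Product")
ET.SubElement(product, "RecordReference").text = "example.com.0001"
ET.SubElement(product, "NotificationType").text = "03"          # confirmed record

identifier = ET.SubElement(product, "ProductIdentifier")
ET.SubElement(identifier, "ProductIDType").text = "02"          # code list 5: ISBN
ET.SubElement(identifier, "IDValue").text = "8474339790"

title = ET.SubElement(product, "Title")
ET.SubElement(title, "TitleType").text = "01"                   # distinctive title
ET.SubElement(title, "TitleText").text = "Example Title"        # hypothetical

print(ET.tostring(message, encoding="unicode"))
```

In the Level 2 form the same content would be carried in the terse numbered short tags defined by the DTD rather than in these reference names.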
The header identifies the parties to the packaged exchange: the sender and the addressee, each of which may be identified by an EAN-13 number (note 19) or a SAN (note 20) as well as by name, together with the date sent and related housekeeping details.

[Table 4. ONIX 2.1 message header elements (Level 1). Source: EDItEUR, BISG and BIC, ONIX Product Information Message: Product Record Format, v2.1, June 2003, pp.6-12 (19 Nov. 2003).]

Within a product record, series detail is carried in the series composite, which records the series title and the product's position within the series (note 21).

[Figure 1. Group structure of an ONIX Product record. Source: Graphical view of the ONIX DTD Release 2.1, Revision 01 (20 Nov. 2003).]

To describe multi-level series and subseries, ONIX for Books release 2.0 introduced two further record types. The main series record, whose elements MS.1-MS.6 identify the record and the main series itself, with a further group (from MS.7) linking to subseries, treats a main series much as a collection (notes 22-23). The subseries record, whose elements SS.3-SS.8 carry the subseries identifiers and title and its link to the parent series, completes the structure (notes 24-25).

[Figure 2. ONIX Main Series record: outline structure. Source: Graphical view of the ONIX DTD Release 2.1, Revision 01 (20 Nov. 2003).]
[Figure 3. ONIX Subseries record: outline structure. Source: Graphical view of the ONIX DTD Release 2.1, Revision 01 (20 Nov. 2003).]

ONIX International has been taken up across the supply chain (note 26). Publishers deliver ONIX feeds to aggregators, wholesalers and online retailers such as Amazon and BN.com (Barnes & Noble), which use them to build their product pages. In September 2001 OCLC began loading publisher ONIX data: records received in ONIX are stored in an ORACLE database and crosswalked among MARC, Dublin Core and ONIX so that they can serve WorldCat. ONIX metadata also figures in Digital Object Identifier (DOI) work, EDItEUR cooperating through EPICS with the International DOI Foundation (IDF). On 26 January 2003 the National Information Standards Organization (NISO) and EDItEUR presented a program at the ALA Midwinter Conference covering ONIX for Books and the emerging ONIX for Serials work (notes 27-28). For libraries, ONIX promises richer records and, importantly, availability information (note 29). In Britain, BIC has promoted ONIX to libraries, and the British National Bibliography Research Fund (BNBRF) has funded related study; the Library of Congress (LC) uses publishers' ONIX data in its Cataloging in Publication (CIP) program (note 30). In November 2000 LC published an ONIX to MARC 21 mapping, based on ONIX 1.2 (note 31); OCLC's Bob Pearson prepared an ONIX/MARC21 mapping (note 32); and Alan Danskin reported on an ONIX-UNIMARC crosswalk (note 33).
Keywords : ONIX; Metadata standard; Ebook standard Ho-chin Chen Associate Professor E-mail chin@mail.tku.edu.tw Hui Ou-Yang Graduate Student Department of Information & Library Science, Tamkang University Taipei, Taiwan, R.O.C. E-mail 691070014@s91.tku.edu.tw J o u r n a l o f E d u c a t i o n a l M e d i a & L i b r a r y S c i e n c e s , 4 1 : 2 ( J u n e 2 0 0 3 ) : 2 1 9 - 2 3 5 2 3 5 J o u r n a l o f E d u c a t i o n a l M e d i a & L i b r a r y S c i e n c e s , 4 1 : 2 ( S e p t e m b e r 2 0 0 3 ) : 0 0 0 - 0 0 02 3 6 work_oo4k2jmrhnfnppeikpus3i5m6m ---- http://joemls.tku.edu.tw 教育資料與圖書館學 Journal of Educational Media & Library Sciences http://joemls.tku.edu.tw Vol. 47 , no. 4 (Summer 2010) : 403-428 圖書資訊系統的演變與發展 Evolutionary Development of Library Information Systems 黃 鴻 珠 Hong-Chu Huang * Professor E-mail: kuanin@mail.tku.edu.tw 陳 亞 寧 Ya-Ning Chen System Analyst E-mail: arthur@sinica.edu.tw English Abstract & Summary see link at the end of this article mailto:kuanin@mail.tku.edu.tw� mailto:arthur@sinica.edu.tw� http://joemls.tku.edu.tw 教育資料與圖書館學 47 : 4 (Summer 2010) : 403-428 圖書資訊系統的演變與發展 黃鴻珠 教授兼館長 淡江大學資訊與圖書館學系暨覺生紀念圖書館 E-mail: kuanin@mail.tku.edu.tw 陳亞寧* 系統分析師兼組長 中央研究院計算中心 E-mail: arthur@sinica.edu.tw 摘要 圖書資訊系統是圖書館作為徵集、組織與服務圖書資訊資源的重 要工具,隨著資訊技術的發展,圖書資訊系統也從傳統的卡片目 錄逐漸蛻變成不同的多元樣貌,如整合式圖書館系統、聯合目 錄、電子資源管理系統、亞馬遜網路書店與Google圖書等。本文 旨在探討圖書資訊系統的歷史發展與未來方向,採取個案研究為 方法,以主要趨勢的演變歷程為主軸,依序選取4類14個個案進 行討論,分析各式圖書資訊系統的特色,進而從物件類型、物件 精細度、資源範圍、組織方式、聚合方式、聚合資源、呈現方 式、軟體導入模式、社會化線上目錄、設計方式、運作方式與取 用方式等12種觀點提出研究發現與建議。 關鍵詞: 圖書資訊系統,圖書資訊資源 前 言 早期圖資界的圖書資訊系統可追溯至西方修道院的藏書清單,或是中國藏 書樓的目錄清單,隨著時代的功能需求演變,之後陸續出現了書本式目錄、 卡片目錄、線上資料庫、圖書館自動化系統等。近年來,圖書資訊系統除了受 電腦的影響外,隨著網路化、數位化、Google化、Web 2.0化與手機行動化等 潮流趨勢的衝擊下,未來的圖書資訊系統會有何種面貌、圖書館需要何種功能 的圖書資訊系統等議題,皆引起圖資界的關心與研究。本文採取個案研究為方 * 本文通訊作者。 2010/02/27投稿;2010/05/14修訂;2010/06/16接受 研究論文 http://joemls.tku.edu.tw 404 教育資料與圖書館學 47 : 4 (Summer 2010) 法,以網路、數位、Google、Web 2.0與手機行動化等五大潮流趨勢為範圍, 選擇4類14個案進行分析,進而提出研究發現與討論,以及研究建議。 二、發展趨勢 早期圖資界的圖書資訊系統係以本身的館藏資源為主要範圍,同時以人工 製作的館藏清單,及書本式與卡片式目錄作為館藏資源管理與服務的工具。隨 著電腦的發明與運用,圖書資訊系統邁入所謂的圖書館自動化(library automa- tion)時代,因而有圖書館管理系統(Library Management System, LMS),或是 整合圖書館系統(Integrated Library System, ILS)等名詞的出現及相關研究。然 而,隨著時間的演變與資訊技術的發展,圖書資訊系統所面對的環境也有了重 大變遷,近廿年來的主要發展趨勢約可歸納成下列幾項: ㈠網路化 自1990年代起,隨著網際網路(Internet)與全球資訊網(World-Wide Web, W3)的蓬勃發展與應用,各式資訊資源及其系統走向網路化,讀者可經由網際 網路、全球資訊網與各式網路搜尋引擎查詢各項資訊資源,及取得相關的資訊 資源。在此一趨勢影響下,許多資訊資源系統開始進行數位再製(digitization) 並與網際網路進行連線,且數位原生(born digital)資訊資源逐漸大量產生。原 本的圖書資訊系統由圖書館內部擴展至外部,同時也逐漸全球資訊網頁化,提 供讀者跨越時空藩籬的取用功能。 ㈡數位化 自1994年起,在美國國家科學基金會(National Science Foundation, NSF)、 國防部高等研究計畫機構(Defense Advanced Research Projects Agency, DARPA) 與航空暨太空總署(National Aeronautics and Space Administration, NASA)三個 單位共同推動數位圖書館計畫(Digital Libraries Initiatives Phase One, DLI1)至 今,全球各國也開始展開各式的數位圖書館、數位博物館、數位檔案館、數位 自然標本館與數位學習(elearning)等計畫。此外,各式商業電子資料庫與開放 取用(Open Access, OA)也陸續推出,形成一種數位原生與數位再製資訊資源 並存,且快速成長的現象。在此一趨勢衝擊下,圖書館除了應用原來的LMS或 ILS管理這類資訊資源外,各式電子資源管理系統(Electronic Resource Manage- ment, ERM)與聯合查詢系統(Meta-search或Federated search)也應運而生,協 助圖書館管理與服務此類電子資訊資源庫。另外,亞馬遜網路書店(Amazon. 
com)則與書商直接合作,除了書目資料外,更進一步提供書影、目次、部分 全文預覽、購書推薦與讀者評論等新式的圖書資訊系統與服務。就讀者而言, http://joemls.tku.edu.tw 405黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 數位資訊資源已大幅快速成長,讀者可直接取得數位資訊資源的機率大為提 高。 ㈢Google化 就在Internet廣泛應用後,除了Yahoo外,Google也異軍突起,同時將 搜尋引擎服務擴展至各式數位資訊資源,乃至於學術資訊資源,包括期刊、 圖書、學位論文、技術報告、工作報告等。同時也推出了Google學術搜尋 (Google Scholar)與Google圖書(Google Books Search)等服務,以嶄新的方式 提供另一新型圖書資訊系統,供讀者查詢與取用。在此一趨勢潮流下,讀者已 習於Google的單一簡易查詢及資訊取用模式,幾乎徹底改變了傳統上資訊資源 的查詢與取用方式,也促使各式資訊資源及其系統直接與Google接軌,讓讀者 可經由Google找到所需的資訊資源。 ㈣Web 2.0化 自2004年起,由歐瑞利(Tim O’Reilly)提出Web 2.0的概念後,自此各式 社會化工具軟體推陳出新,著名者如維基百科(Wikipedia)、部落格(blog)、推 特(twitter)、Flickr與YouTube等。Web 2.0的主要影響在於邀請使用者共同參 與,以集體方式產生各式資訊資源,有別於第一代全球資訊網只限於資訊資源 的提供或使用而已,也擴展至人際間的互動交流,同時資訊資源的載體也趨向 多元化。對讀者而言,資訊資源及其系統不再限於單向式的提供,而是擴展至 讀者的參與,進而由讀者提供各式資訊資源。另外,Web 2.0也融合了開放原 始碼(open source code)的精神,不同之處在於後者限於電腦軟體的共同開發、 設計與使用等方面。 ㈤手機行動化 隨著智慧型手機的盛行下,手機的普及化及上網化,又是另外一波改變讀 者對資訊資源及其系統的使用需求,因為各家手機廠商已開發設計與提供各式 工具軟體,供使用者下載,以利使用各種資訊資源及其系統的使用,包括提供 簡訊服務、影音新聞、地圖查詢與定位及使用各類加值服務。換言之,一旦圖 書資訊系統可以符合手機的軟體規格要求後,即可經由手機直接使用。 三、研究範圍與限制 有關圖書資訊系統的意涵,目前並無一致的定義與說法。如果從時間軸 加以追溯,可發現圖書館導入電腦進行業務自動化後,LMS與ILS等名詞才逐 一出現,因而所謂的「圖書資訊系統」與圖書館自動化間有極為密切的關係。 http://joemls.tku.edu.tw 406 教育資料與圖書館學 47 : 4 (Summer 2010) 依據圖書資訊學線上詞典(Online Dictionary for Library and Information Science, ODLIS, http://lu.com/odlis/)對圖書館自動化的定義是:「最初是圖書館導入電腦 技術至圖書館內部作業與服務,包括徵集、分編與權威控制、期刊控制、流通 與盤點及館際互借與文獻傳遞。隨著趨勢的發展,圖書館自動化必須外加(add- ons)許多功能,以完成數位內容的傳遞,包括:鏈結解析(link resolver)、入口 網站與聯合查詢界面(portal and metasearch interfaces)、電子資源管理(e-resource management module)及全球資訊網環境的結合,乃至於與學習管理系統的整合 等。」具體而言,從ODLIS對圖書館自動化的定義即可得知,圖書資訊系統面 對不同時代的潮流變化,會有不同的變革與調整,乃至於不同的範圍與名詞。 除此之外,圖書資訊系統的功能範圍也從圖書館館藏及其內部作業,逐漸擴展 至圖書館外部資訊資源及各項服務。次則,若從國際開放式圖書館環境(Open Library Environment, OLE)計畫提出的報告中,內容也指出ILS必須與其他系統 與服務相互結合,才能提供讀者需要的資訊資源及其服務(OLE, 2009, p.3)。 Dempsey(2008, p.5)則直接指出:「圖書館所管理與採用的圖書資訊系統已不 僅是ILS而已,也包括了ERM與各式典藏庫系統(如機構典藏庫、數位學習與 數位圖書館等系統)。」簡言之,圖書資訊系統隨著不同潮流趨勢的發展,會有 不同的名詞及其所定義的功能範圍,以因應時代的需求,且彼此間必須相互整 合。 本文以上述OLDIS、OLE計畫與Dempsey對圖書資訊系統提出的定義與 觀點為依據,以1990至2010期間為時間範圍,依上述五大趨勢所產生的代表 性圖書資訊系統為研究對象,作為歷史發展軌跡的追溯,採取個案研究為方 法,討論各式圖書資訊系統的變革與特色,進而探討圖書資訊系統的演變與發 展。因而,本文選擇的研究對象並非以LMS或ILS的傳統定義選擇單一性質系 統進行同質比較,改以異質且多元化的圖書資訊系統為主,探究圖書資訊系統 的全貌。本文研究對象涵蓋14個圖書資訊系統,包括:ILS、開放原始碼軟體 (open source software)系統、聯合目錄系統、亞馬遜網路書店、ERM系統、聯 合查詢系統、開放取用系統、Google圖書、數位圖書館/數位學習資源/機構典 藏系統、合集式服務(collection level)系統、Web 2.0系統、BiblioCommons、 LibraryThing及美國北卡羅萊納州立大學圖書館(North Carolina State University Libraries, http://www.lib.ncsu.edu/m/) 的行動化服務等,且可進一步區分為4大 類型:包括圖書館管理系統、數位資訊資源系統、社會化工具軟體及行動化服 務。 四、研究個案與分析 本文分為下列4類14個個案進行探討,4大類包括了圖書館管理系統(項目 http://joemls.tku.edu.tw 407黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 1-4)、數位資訊資源系統(項目5-10)、社會化工具軟體(項目11-13)及行動化 服務(項目14)等,分述如下: ㈠ILS 傳統上,典型的ILS包含了採購、編目、流通、期刊管理、線上公用目錄 (Online Public Access Catalog, OPAC)等模組,乃至於參考服務與館際互借等。 本質上,是以圖書館的專業活動與業務為主軸,讀者服務方面只限於OPAC、 WebPAC與參考服務。換言之,以ILS作為圖書館專業工作的管理系統,進而 將成果以OPAC或參考服務模組方式提供館藏給讀者使用。近年來,因應圖書 資訊資源的數位化、網路化與Web 2.0化,ILS的功能發展也有所調整,主要包 括: 1. 加入URL:以既有的書目控制或知識組織為基礎,整理與讀者需求相關 的電子式網路資訊資源,由外而內將網際網路上各式數位化圖書資訊資源納為 圖書館館藏的一部分,擴展既有的館藏範圍與項目。 2. 全球資訊網服務化(web service):採用OpenURL作為書目紀錄的查詢 語法結構,以利於不同系統間的串連與查詢,將OPAC或WebPAC的查詢範圍 擴展至網路書店(如:亞馬遜網路書店、博客來網路書店),以及Google圖書 與Google學術搜尋,由內而外擴展查詢範圍,以找出相關的電子式網路資訊資 源。 3. 提供聯合查詢功能(Metasearch):如Primo,提供讀者一致化的查詢功 能,一次查檢圖書館實體館藏與各式訂購的電子資料庫,作為目錄查詢的範 圍,期能協助讀者一次找出相關的圖書資訊資源。 4. 提供Web 2.0功能:如Encore、Primo等產品,則是提供社會化標籤 (soical tags)、標籤雲(tag cloud)與層面(facet)瀏覽等功能。 5. 
在圖書資訊系統運作與應用方面,可區分為單一圖書館與聯盟式兩種, 後者如美國的OhioLINK與香港的JULAC等。除此之外, OCLC計劃推出的「全 球資訊網型管理服務」(web-scale management service)則提供另外一種方式。 此種方式圖書館無須實體購置電腦主機,及維護圖書資訊系統軟體,圖書資訊 系統及其主機與軟體的發展與維護皆由廠商全權負責,圖書館只需支付系統年 費與維護資料,以使用所需的功能與作業(OCLC, 2009)。 ㈡開放原始碼軟體系統 目前ILS的主要系統與市場係由幾家廠商所主導,圖書館必須同時支付購 置與年度維護費用,同時系統也屬於一種專屬系統(proprietary system),圖書 館不易主導系統的設計發展,也不易從中取出資料加以利用。然而,近年來 http://joemls.tku.edu.tw 408 教育資料與圖書館學 47 : 4 (Summer 2010) 此一現象已逐漸有所改變。例如,開放原始碼軟體的發展潮流也踏入ILS領域 中,結合眾人力量開發符合圖書館需求的ILS軟體。其中著名者有Evergreen (http://www.open-ils.org/about.php)、Koha(http://koha.org/)。此種模式源自於開 放原始碼軟體的特質,對圖書資訊系統的影響主要有下列幾點,分述如下: 1. 集合眾人智慧共同發展軟體,使用單位亦可一同參與,表達其需求,而 不是由單方主導或決定。 2. 所發展軟體可以無償或低價使用,也無所謂的年度維護費用。 3. 所發展軟體可以在工作站或個人電腦等級的平台運作,無須工作站或大 型主機級以上的電腦主機,更為平民大眾化。 4. 使用單位必須配置技術人員負責電腦應用軟體及作業軟硬體的運作與維 護,所需電腦技術能力的挑戰性頗高。 5. 在Evergreen的使用個案中,PINES計畫是以美國喬治亞州(Georgia State)275家公共圖書館為範圍,應用在類似聯盟圖書館層次的ILS。 ㈢聯合目錄系統 在聯合目錄方面,世界級WorldCat融合全球圖書館等資訊服務機構,致力 於書目紀錄的管理與服務,同時與Google進行策略聯盟,讀者在使用Google 圖書時,也可一併獲知全球圖書館的實體館藏,就近取用。另外,英國Intute (http://www.intute.ac.uk/)則是將所整理的網站資源記錄與Google作結合,讀 者除了可在Intute本身網站查詢與瀏覽外,也可經由Google查詢取得使用。傳 統上,聯合目錄是一個地區內所有圖書館館藏資源的集合體代表。隨著網路化 與數位化的技術應用,聯合目錄也形成一種館際互借與資源共享的利器。從 WorldCat與Intute的個案中,可發現共建共享所展現的團結力量,如果再與網 路搜尋引擎進行策略聯盟,可以進一步提高圖書資訊系統中資訊內容與服務的 可見度與取用性。 ㈣亞馬遜網路書店 亞馬遜網路書店提供了各類圖書資訊資源,供讀者查詢及購買之用。亞馬 遜網路書店的資料來源直接來自出版社,包括了書目、封面書影、目次及部分 內容或電子檔。更進一步,亞馬遜網路書店也提供讀者上傳評論與評價等級, 以及提供讀者購買圖書的歷史記錄,作為購書參考之用。除此之外,亞馬遜網 路書店也包括許多物品,例如3C產品。亞馬遜網路書店的特色主要如下: 1. 整理對象:除了紙本圖書外,也包括電子書及各式物品,如:3C產 品、影音、器材等。 2. 組織方式:以簡略方式進行書目控制或知識組織,而不分物件類型,且 http://joemls.tku.edu.tw 409黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 資料來源由廠商提供,不是由人工建立與維護。 3. 參與式:最早實現Web 2.0的先驅者之一,可由讀者提供評論的意見內 容及其等級,形式包含文字與影音檔等。 4. 資料探勘:依據讀者購買的歷史記錄,提供下一位讀者購買的參考,是 一種商業智能(business intelligence)型內容與服務。 5. 聚合資料:除了基本的書目資訊外,也包括了價格、數量、新書或二手 書、封面書影、目次、部分或全部內容的電子檔、讀者評論與其等級、讀者 購書的行為記錄,及相關主題的串連查詢等。 6. 另外,亞馬遜網路書店提供了網路服務(Amazon Web Service, AWS), 服務範圍包括硬碟的儲存空間(Amazon Simple Storage Service)、內容傳 遞(Amazon CloudFront)、資料庫建置(Amazon SimpleDB)、軟體作業排 程(Amazon Simple Queue Service)、雲端計算能力(Amazon Elastic Compute Cloud)等,可作為數位化圖資資訊資源服務與管理所需的電腦軟硬體平台環 境。 ㈤ERM系統 因應各式電子化的索引、摘要、全文(如:圖書、期刊與學位論文等)、 參考資料(如:辭典、百科全書等)與引文等各式資料庫的產生,圖書館會以 網頁或簡單的資料庫方式,提供依字母順序或筆劃數的瀏覽,及關鍵詞查詢等 功能,協助讀者找到所需的特定圖書資訊資源。另外一種方式即是選用ERM系 統,此種ERM具備下列特色,包括: 1. 整理對象:物件可從單一資料庫,乃至於單一的期刊或圖書;例如, ProQuest的Serials Solution與EBSCO的A2Z等產品。 2. 組織方式:與上述ILS方式大異其趣,以簡略方式進行書目控制或知識 組織,著重在電子資源的授權、認證與租用內容等資訊,以提供單一資料庫 /期刊/圖書的瀏覽與查詢功能。主要方式有二,一是由圖書館自行發展軟體或 租用ERM系統,自行建立與維護資料。二則是資料來源由ERM協力廠商提供 相關的書目與館藏資料,及維護URL的有效性,而圖書館再依實際訂購內容進 行修改。 3. 採用OpenURL:作為不同圖書資訊系統間的串連,例如EBSCO A2Z採 用OpenURL藉以串連ILS與ISI引文資料庫等。 ㈥聯合查詢系統:電子資料庫 所謂聯合查詢係指讀者可以經由單一界面,查詢2個以上的資料來源,其 中著名者為WorldCat Local與Acquabrowser等。依現況而言,約可分為下列幾 http://joemls.tku.edu.tw 410 教育資料與圖書館學 47 : 4 (Summer 2010) 種方式,其特色分述如下: 1. 廣播型資訊檢索:以 Z39.50 系列為主,包括 SRU/SRW 與 NISO Metasearch XML Gateway(MXG),是一種資訊檢索協定,以網路廣播式查詢2 個以上的資訊來源,本身並不儲存任何資料,多數ILS有提供此類產品。 2. 資料擷取:如OAI-PMH等,係為一種資料擷取協定,實際儲存資料 記錄,並可區分為合集型(如本文稍後探討的IMLS Digital Collections and Content)與單件型(如英國的CiteBase, http://www.citebase.org/search)兩種層 級。 3. 資料彙整(Mashup):源自於Web 2.0概念,係以API軟體為主,自2個 以上來源進行資料的彙整與重組。 4. 
採用數位物件識別碼:如DOI、URN、Handle等,作為串連與取得單篇 文獻之用,尤其是在電子期刊與電子書方面。 ㈦開放取用系統 自1991年起,在美國NSF贊助下,成立了arXiv;1994年Steven Harnad提 出「顛覆建議方案」(Subversive Proposal);2002年布達佩斯會議宣言(Budapest Open Access Initiatives, BOAI)的公告,自此OA已為全球高等教育與學術界所 接受,同時也發展出了開放取用期刊與機構典藏(有關機構典藏請詳下文「㈨ 數位圖書館、數位學習資源與機構典藏系統」一節)。以arXiv為例,已融合了 作者、出版者與讀者等三種角色所需的圖書資訊系統功能,利於進行投稿、 審閱、出刊與發行,及查詢與取用等多種功能。若依據Van de Sompel, Payette, Erickson, Lagoze, and Warner(2004)等人的研究分析,提出arXiv已獲致下列成 效(見圖1)。 圖1 arXiv運作生態與服務路徑 資料來源:Van de Sompel, Payette, Erickson, Lagoze, & Warner, 2004 http://joemls.tku.edu.tw 411黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 1. 已具備期刊的所有功能,即登記(register)、認證(certify)、公告 (awareness)、典藏(archive)與回饋(reward)。 2. 可經由Google搜尋引擎,達到更高的可見度與取用。 3. 期刊功能流程的自動化與公開化,同時由學者專家共同參與,包括投 稿、評論與修正。 4. arXiv與正式期刊間的接軌。 ㈧Google圖書 Google網路搜尋引擎除了以全球資訊網為主要範圍外,也逐漸擴展至不同 的資料物件,包括部落格、Wiki、出版社資料庫,乃至於圖書與期刊論文等。 在Google圖書方面,仍然延續Google搜尋引擎的作法,提供關鍵字查詢。綜 合Google圖書主要特色如下: 1. 整理對象:全然數位化資料,包括數位原生與數位再製等兩種類型,也 涵蓋了圖書內的文字內容等。 2. 組織方式:提供關鍵字查詢,範圍不僅限於書目資料項目而已,還包括 內文,亦可由Google網頁搜尋找到。 3. 參與式:在Google圖書方面,讀者也可以提供評論,讓讀者共同參與。 4. 聚合資料:除了重要的書目資料要項外,也包括封面影像、內文影像、 引用的電子書、引用的網頁或網站、讀者評論、串連網路書店與WorldCat、標 籤雲,以及以Google Map方式呈現書中內容所指涉的空間地點。 5. 呈現方式:結果呈現方式則有別於ILS,Google圖書可以提供標籤雲方 式顯示內容的相關性、全影像書影的瀏覽方式,以及以頁面影像與加註關鍵字 等視覺化界面。 ㈨數位圖書館、數位學習資源與機構典藏系統 除了受Web 2.0的趨勢左右外,圖書資訊系統也深受數位化發展的影響, 也包括數位圖書館、數位學習資源與機構典藏(Institutional Repositories, IRs)等 系統。首先,在組織的結構方式,上述資訊系統作了截然不同的改變。以美國 加州線上檔案數位圖書館(Online Archive of California, OAC)計畫為例,該資 訊系統將組織對象區分為全宗(fonds)、系列(series)、卷(files)與件(items), 也可進一步提供讀者查詢與瀏覽之用。另外,在數位學習資源系統則可劃分為 課程(Content Aggregation, CA)、學習單元(Sharable Content Object)與素材(as- set),而IRs則可依學校組織層級加以區分不同階層結構。其次,在資訊資源方 面,OAC計畫說明了圖書館亦從事檔案的組織與服務,而MIT@DSpace則具體 落實了圖書館從事IRs徵集與組織教職員各項學術產出資源與數位學習資源等 方面。 http://joemls.tku.edu.tw 412 教育資料與圖書館學 47 : 4 (Summer 2010) ㈩合集式服務系統 隨著數位圖書館計畫的執行,產生一種新的圖書資訊系統,管理各項主題 的數位資訊資源;例如,美國數位圖書館聯盟(Digital Library Federation, DLF) 的「數位館藏註冊中心」(Digital Collections Registry, http://dlf.grainger.uiuc.edu/ DLFCollectionsRegistry/browse/)、美國博物館暨圖書館服務研究機構(Institute of Museum and Library Service, IMLS)的「數位館藏與內容」(Digital Collections and Content)網站(http://imlsdcc.grainger.uiuc.edu/),及OCLC WorldCat的OAIster 等皆是。除此之外,在英國的Intute服務則是提供有關網際網路上的網站資源, 供研究與學習之用。上述這些服務也針對後設資料方面提出標準或規範,包括 英國圖書館支援研究計畫(Research Support Libraries Programme, RSLP)的合集 資源描述格式(RSLP Collection Description Schema)、都柏林後設資料(Dublin Core Metadata Initiative, DCMI)的DC合集描述應用檔(DC Collection Description Application Profile, DC CDAP),以及美國資訊標準組織(National Information Standard Organization, NISO)起草的Z39.91合集資源描述規格(Collection Description Specification)等。 Web 2.0系統 在Web 2.0發展趨勢下,現有Web 2.0也被圖書館加以應用,作為另外一 種圖書資訊系統。以美國國會圖書館(Library of Congress, LC)為例,LC導入 部落格(http://www.loc.gov/blog)、臉書(http://www.facebook.com/libraryofcon- gress)、Flicker(http://www.loc.gov/rr/print/flickr_pilot.html)與推特(http://twitter. com/Librarycongress),作為提供新訊息或促進讀者的參與,以推展各項服務 之用。此外,英國格拉斯哥大學圖書館採用維基(Glasgow University Library, http://en.wikipedia.org/wiki/Glasgow_University_Library),來建立整個圖書館簡 介、圖書館歷史,以及現在建築的一個情形,以作為圖書館相關資訊的公布。 此一作法主要特色分述如下: 1. 圖書館以現有的圖書資訊資源為基礎,予以重新定位(repurpose),再 經由Web 2.0工具進行重新組織(remix),將現有圖書資訊資源再予以新的應用 (reuse),以另外一種嶄新面貌呈現圖書館的資訊資源及其服務內容。 2. 結合Web 2.0工具與圖書館的圖書資訊資源,邀請讀者共同參與,以產 生各種社會化的資訊;如:社會化標籤與大眾分類(folksonomy)。 3. 經由讀者的共同參與,重新行銷圖書館的資訊資源及其服務,也進入 Google搜尋領域中,形成一種Google化的資源索引與發掘服務。 4. 經由讀者的參與,從中了解與掌握讀者的使用行為,以發展一種群體智 能(collective intelligence)機制的主要參考來源。 http://joemls.tku.edu.tw 413黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 5. 圖書館的資訊資源及其服務的呈現方式,可結合多元化的媒體,同時也 可經由類似層面與標籤雲方式呈現資訊的結構,提供讀者取用所需的資源。 6. 
以推特的即時式訊息與讀者進行雙向式溝通或報導。 BiblioCommons 在圖書館系統的OPAC方面,除了所謂的WebPAC外,也有些圖書館融合 Web 2.0社會化功能於現有ILS之中。這些社會化導向式的讀者界面功能,讀 者除了可加入社會化標籤外,尚可進一步建立私域與公域的閱讀清單(reading list)、讀者網路(my network),及個人化設定(personalization)等;例如,加拿 大奧克維爾公共圖書館(Oakvilee Public Library)採用傑佛遜(Beth Jefferson)發 展的BiblioCommons(http://www.bibliocommons.com/),主要特色在於: 1. 讀者可採用社會化標籤方式參與圖書資訊的組織外,還可進一步查詢與 分享個人的主題閱讀清單,以及建立個人的圖資社會網路。 2. 提供層面分析式的查詢功能。 3. 建立在現有ILS之上(如:SirsiDynix的Unicorn與Horizon,及Ex Libris 的Voyager),同時建立一種讀者參與式的OPAC,針對現有的書目紀錄進行評 論與標籤等。 LibraryThing 由史波帝(Tim Spalding)設計發展的LibraryThings(http://www.librarything. com/),於2005年推出(LibraryThing, 2010)。LibraryThing使用者界面系統主要 的社會化特色在於: 1. 在LibraryThing個案中,其範圍不限於單一圖書館及其讀者,其他 圖書館、網路書店與出版社亦可經由Z39.50協定提供書目資料,以匯入至 LibraryThing平台。 2. 讀者可以利用現有的書目資料進行評論、標籤,進而建立屬於個人的圖 書目錄。 3. 除了書影外,讀者可以查看評論、參與討論、相關參考書目、書的背 景知識,及瀏覽相關的統計數據,包括評論數、成員數、討論事項數、等級 與歡迎度等。 4. 可以利用手機,行動上線使用。 美國北卡羅萊納州立大學圖書館行動化服務 美國北卡羅萊納州立大學圖書館結合手機,推出行動化的圖書館服務。以 北卡羅萊納州立大學圖書館為例,讀者可經由手機上網方式查詢圖書館開閉館 時間、可用電腦設備數量與狀態(computer availability)、圖書館OPAC、參考 http://joemls.tku.edu.tw 414 教育資料與圖書館學 47 : 4 (Summer 2010) 服務答詢、特定場所網路即時攝影(如:咖啡廳排隊情形)、圖書館最新消息, 及圖書館網站等(見圖2)。除此之外,美國維吉尼雅大學圖書館(University of Virginia Library, http://www2.lib.virginia.edu/mobile/)及威奇塔州立大學圖書館 (Wichita State University Libraries, http://library.wichita.edu/techserv/WSU_faculty/ search.asp)等也有類似的作法與服務。此一使用個案主要特色在於: 1. 結合智慧型手機,提供行動化的圖書館服務,除了網際網路外,圖書館 提供另外一種資訊的取用管道。 2. 手機化的行動服務係為一種重新定位與再應用(reuse)性質,可為現有 的圖書館服務開拓另外一種新面貌,以及使用環境與經驗。 圖2 北卡羅萊納州立大學圖書館M化圖書資訊服務 資料來源:北卡羅萊納州立大學圖書館網站http://www.lib.ncsu.edu/m/ 五、研究發現與討論 依據上述個案,可以了解圖書資訊系統的演變發展。更進一步,也可以發 現在網路化、數位化、Google化、Web 2.0化及行動化等多重趨勢衝擊下,圖 書資訊系統不僅在功能上有所改變,在本質上也有所變動。茲分為下列重點, 闡述如下: ㈠物件類型 傳統上,圖書資訊系統係以實體的資訊載體為主要對象,其中又以紙本式 圖書與期刊等文獻為主要的處理對象。因而,圖書資訊系統係以這些圖書資訊 文獻為中心,進行徵集、分編、閱覽、典藏等專業工作為主要活動。然而, http://joemls.tku.edu.tw 415黃鴻珠、陳亞寧:圖書資訊系統的演變與發展 隨著數位化圖書資訊,以及數位創作與出版工具的盛行,圖書資訊系統所處理 的資訊資源除了實體的資訊載體外,也包括了數位再製與數位原生等數位資訊 資源。此類資源來源眾多,包括圖書館自製、商業資料庫廠商製作、讀者產 生與提供等不同方式,且類型繁複,從簡易書目資料與社會標籤外,乃至於文 本、圖片、影音、簡訊、教學資料等無所不包。本質上,這些新式圖書資訊 資源的產生方式並無常規可以遵循,同時也具備瞬息萬變的特質,亦即容易被 引用、複製、修改、多種版本、存放位置更改與消失等。因而,面對此種新 式的數位資訊生命週期(digital information lifecycle),無論圖資界的學術與實務 專業領域,乃至於其所應用的圖書資訊系統,皆必須對圖書資訊資源的形式、 來源、類型與特質等作妥善的管理,方能提供良好的圖書資訊服務。 ㈡物件精細度 基本上,圖書資訊系統處理的資訊資源多數以單件(item)為主要對象, 例如一本書、一種期刊等。然而,隨著各式數位化產品與服務的推陳出新,此 一現象已有所改變。首先,在向下深化方面,ILS方面已有許多系統可以提供 目次(table of content, TOC)的查詢。在聯合查詢與開放取用期刊(如DOAJ)方 面,則是將查詢對象深入至單篇的期刊文獻。甚至在亞馬遜網路書店與Google 圖書方面,則是將處理對象深化至圖書資訊的內文。次則,在向上廣度化方 面,隨著各式數位圖書館與數位學習計畫的實施與具體化,也有往合集層次 (collection level)發展。例如,DLF的數位館藏註冊中心、IMLS的數位館藏 與內容、OAIster及MIT@DSpace的OCW資源等皆是。除此之外,也有些圖 書資訊系統將服務納入處理對象;例如,ERM整理的各式資料庫、奧克漢計 畫(OCKHAM Project,http://www.ockham.org),以及NISO的Z39.91資訊檢索服 務描述規格(Information Retrieval Service Description Specification)等皆屬之。但 Google搜尋引擎則是不加以區分,全部包括在索引與查詢範圍之內(見表1)。 表1 圖書資訊系統處理的資訊物件精細度 內文 單篇文獻 目次 單件 合集 服務 Google Web � � � � � � DLF � IMLS � DSpace for OCW � � OAIster � Intute � ILS � � � ERM � � � � Metasearch � � � � Google Books � � � � OCKHAM � NISO Z39.91 � 資料來源:本文自行整理 http://joemls.tku.edu.tw 416 教育資料與圖書館學 47 : 4 (Summer 2010) ㈢資源範圍 以往圖書資訊系統處理與服務的範圍係以圖書館徵集的實體圖書資訊資源 為主,然而隨著網路化、數位化與開放取用等趨勢發展下,圖書館的業務與 服務需求也有所調整,所蒐集、整理與提供的資訊資源也由圖書館本身擴及 外界。當各式商業電子化網路資訊資源大量興起時,圖書館已將亞馬遜網路書 店、Google圖書與相關網路資訊納入圖書資訊範圍之一。次則,當開放取用與 開放式數位學習資源逐漸成為另一種學術與學習文獻的重要資訊資源時,機構 典藏庫也成為圖書館另外一項重要的圖書資訊系統,以徵集與整理所屬機構內 的學術產出與教學資源。若依據Dempsey(2008, p. 
According to Dempsey (2008, p. 5), the library information systems that libraries manage and adopt are no longer just the ILS; they also include ERM and various repository systems (such as IR, eLearning, and digital library systems), reflecting that libraries, in response to the evolution of information resources, must apply and manage a range of different library information systems (see Figure 3).

[Figure 3. The different library information systems facing libraries. Source: Dempsey, 2008, p. 5]

(4) Methods of Organization

To cope with different types of information resources, libraries apply and develop different library information systems to serve different services and purposes, such as the aforementioned ERM, IRs, metasearch, blogs, and wikis. The resulting problem is that a library must manage many different systems by itself and coordinate and communicate with different vendors, while its staff must master a full arsenal of skills, commanding diverse information technologies and domain knowledge. Following Dempsey's suggestion (2008, p. 10), libraries should introduce the unified resource management (URM) system proposed separately by the UK library community (Adamson, Bacsich, Chad, Kay, & Plenderleith, 2008, p. 126) and by Ex Libris Ltd. (Fallgren, 2008; Ex Libris Ltd., 2009), or the single business approach recommended by the National Library of Australia (2007). The National Library of Australia's IT architecture project report recommends that libraries operate their information systems on the concept of a service-oriented architecture (SOA) and build a single data corpus. This corpus may consist of one physical repository or a number of separate aggregations; the library should clearly identify its common core requirements and develop them once, rather than developing different software to organize or manage this single data corpus for each distinct purpose and functional requirement (National Library of Australia, 2007, pp. 11-13). In other words, library information systems can be divided into two environments, management and reader (see Figure 4). The former is what the library uses to acquire and organize all kinds of library information resources, operated in the single business manner described above; the latter develops different reader-side applications and systems in response to different reader needs, to provide and deliver information resource services (Dempsey, 2008, pp. 12-15).

[Figure 4. Libraries in a flat world. Source: Dempsey, 2008, p. 12]

(5) Methods of Integration

Most library information systems provide linking functions to integrate different information resource systems and specific information objects. On the whole, this kind of linking is not physical storage and aggregation of resources; it is achieved through the linking capability of the network. The methods and types therefore differ in their degree of precision, and two standards play key roles: OpenURL and the digital object identifier. The main purpose of the OpenURL standard is to regulate the syntactic structure and format of the query records exchanged among different host servers, seeking uniformity so as to achieve interoperability among hosts; its target object is thus the bibliographic record rather than the physical digital object. The digital object identifier, in turn, prescribes naming principles for digital objects, and its targets are mostly the physical digital objects themselves rather than bibliographic records. The ultimate target of these linked or mediated forms of aggregation may therefore be the home page of a library information system or database, a specific bibliographic record, or a specific physical information object (such as a monograph or a journal article); the levels of precision differ, as the sketch below illustrates.
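To make the contrast concrete, here is a minimal, hypothetical sketch in Python of building an OpenURL 1.0 (ANSI/NISO Z39.88-2004) link in key/encoded-value (KEV) form; the resolver address and the DOI are invented placeholders, not real services.

```python
from urllib.parse import urlencode

# Hypothetical link-resolver base URL; each institution runs its own resolver.
RESOLVER = "https://resolver.example.edu/openurl"

def make_openurl(atitle, jtitle, issn, volume, spage, date, doi=None):
    """Build an OpenURL 1.0 (Z39.88-2004) KEV query describing a journal
    article, so that any compliant resolver can parse the citation."""
    kev = {
        "url_ver": "Z39.88-2004",                       # version of the standard
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # referent format: journal
        "rft.genre": "article",
        "rft.atitle": atitle,    # article title
        "rft.jtitle": jtitle,    # journal title
        "rft.issn": issn,
        "rft.volume": volume,
        "rft.spage": spage,      # starting page
        "rft.date": date,
    }
    if doi:
        # A DOI names the object itself, while the KEV fields describe the
        # bibliographic record: the two levels of precision noted in the text.
        kev["rft_id"] = "info:doi/" + doi
    return RESOLVER + "?" + urlencode(kev)

print(make_openurl("Rethinking scholarly communication", "D-Lib Magazine",
                   "1082-9873", "10", "1", "2004", doi="10.1234/example"))
```

The distinction the article draws shows up directly in the query string: without `rft_id` the resolver must match the citation against its knowledge base (record-level precision), whereas the DOI, once present, points at one specific digital object.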
(6) Aggregated Resources

Basically, the ILS takes the library's collection as its main object; it therefore centers on bibliographic surrogates for information and then extends them, for example with tables of contents. First, with the appearance of Amazon.com, the resources aggregated by library information systems expanded to combine cover images, reader reviews (in text, audio, and video form), full text, and even information on readers' purchasing behavior. Second, with the rise of Web 2.0, many library information systems further provide social tags and reader reading lists. Third, the revamped Google Books service goes on to integrate Google Maps and cited references (see Table 2).

[Table 2. Resources aggregated by library information systems, comparing commercial ILS, open source ILS, ERM, Amazon.com, and Google Books across bibliographic records, cover images, tables of contents, full text, reviews, tags, smart recommendations, citations, maps, and social networks; bibliographic records are common to all five systems, while the placement of the remaining check marks was lost in extraction. Source: compiled by the authors]

(7) Presentation

In form, most library information systems present information primarily as text. With the appearance of Amazon.com, cover images came to be presented visually. Google Books then pushed visual presentation further: besides displaying every e-book with its full cover image, the complete text can be shown as page images, query strings are highlighted in different colors, and a tag cloud can display the relevance of the content. Third, WorldCat presents the subject content of information resources through different subject facets for retrieval. Fourth, the various Web 2.0 tools go a step further, presenting subject content as tag clouds whose font sizes reflect how frequently each tag is used. Fifth, IR and digital library systems support retrieval through structures such as organizational levels or archival provenance and fonds. In short, beyond text, visual interfaces for presenting information resources are steadily emerging and becoming more diverse.

(8) Software Implementation Models

Beyond commercial turnkey systems, the models for implementing library information system software have also changed radically. First, with the emergence of open source software, libraries can obtain the systems they need for free or at far lower cost, and can participate directly in software development and requirements design, escaping proprietary, closed systems; the most representative cases include the aforementioned Evergreen and Koha and, for IRs, DSpace and EPrints. On the other hand, OCLC's planned Web-scale service is a Software as a Service (SaaS) model: a library can choose to pay an annual fee without purchasing and maintaining server hardware or staffing the related technical positions. Amazon.com goes further still, extending to hardware and software services by offering Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). The implementation model of library information systems is thus quietly changing, and the required provisions, from tangible servers to intangible human resources, are changing with it.

(9) The Social OPAC

One of the important Web 2.0 concepts is mass participation, including the generation and contribution of information resources and social tags. This has clearly influenced the development of both ILS and open source library information systems. Concrete examples directly add social tagging and tag cloud features, including Encore and Primo as well as the aforementioned LibraryThing and BiblioCommons. In addition, LibraryThing and BiblioCommons let readers build subject reading lists on top of the existing bibliographic records, and these lists can be brought within the searchable scope of the catalog. Going further, BiblioCommons lets readers develop social networks, building sharing and blocking lists to strengthen ties among those with similar interests. Moreover, the LibraryThing and Amazon.com cases show that the co-participants now include three parties, namely readers, libraries, and online bookstores and commercial publishers, rather than just readers and libraries.

(10) System Design

The Google search engine is characterized by harvesting data, then repositioning and reorganizing it to enable reuse for different purposes. Another important Web 2.0 concept emphasizes disruption and disaggregation together with mashup, reorganizing and reusing content toward different ends. Under the combined impact of the four major trends of networking, digitization, Web 2.0, and Googlization, library information systems must likewise rethink the direction of development and application for the next generation of systems so as to reflect the full variety of information resources and reader needs. Given the URM and single business approach recommendations above, adopting SOA has become one of the inevitable implementation strategies for libraries: keeping each library information system open and independent facilitates functional applications for different needs and purposes, as well as the development, combination, and interoperation of related application software, in service of the library's recurring goals of repositioning, reorganizing, and reusing.

(11) Operation Models

Conventional library information systems (ILS, ERM, IRs, and even metasearch) take a single library or institution as their main scope of application, and cases of collaborative building and sharing in Taiwan's library community remain few. In electronic resource acquisition, however, library consortia do operate. Conversely, under the Web 2.0 tide, the various Web 2.0 tools and information resources draw on the power of collective intelligence, a virtual consortium model of collaborative building and sharing. In the operation of library information systems there are likewise two such cases and types, OCLC WorldCat and Evergreen. Whether the library community can extend its consortial arrangements from electronic resource acquisition to library information systems remains to be seen and worked toward, so as to pool diverse information resources, their applications, and their management and achieve the goal of collaborative building and sharing.

(12) Modes of Access

In the past, the mobile library meant the bookmobile, which traveled from place to place to provide physical and networked library information services, extending the services offered inside the library. With mobile phone Internet access, another new form of mobile library information service, and of access, has opened up. Based on the results of the University of Cambridge's Arcadia Project, that project recommends that libraries move toward "M-Libraries" information services, whose main items include (Mills, 2009, pp. 7-10):

1. SMS alert services, such as hold, renewal, and overdue notices.
2. SMS reference services.
3. OPAC search services.
4. Delivery services for electronic information resources.
5. Internet access services.

Notably, the Arcadia Project's survey of readers at the University of Cambridge and The Open University shows which library information services readers are willing to use over their phones (see Figure 5), a useful reference when developing M-Library information services.

[Figure 5. Survey of University of Cambridge and Open University readers' willingness to use library information services over mobile Internet. Source: Mills, 2009, p. 11]

6. Conclusions and Research Suggestions

Since the 1970s, library automation has risen alongside the development of the computer, and the ILS gradually became prevalent. In the 1990s, with the wide application of the Internet and the World Wide Web, library information resources began to move toward networked and electronic forms, and the library field, entering through the ILS and through web pages, set about introducing electronic resources of all kinds. From 1994, the U.S. NSF began promoting the first phase of its digital library program, guiding the digitization of library collections and related applied research. At this juncture, Google was formally founded in 1998, offering web search engine services and setting off a wave of Googlization and a Google generation. In the 2000s, as publishers and database vendors rolled out commercial electronic information resources, ERM and metasearch library information systems emerged in turn to help libraries manage and provide information services. Furthermore, after the Web 2.0 concept was proposed in 2004, the development and application of Web 2.0 tools let users generate and contribute electronic information resources themselves, making library information resources richer and more diverse, and at the same time harder to manage. As mobile phones became smart and 3G-enabled, a boom in mobile Internet access followed. In short, driven and amplified by the worldwide currents of networking, digitization, Googlization, Web 2.0, and mobile phones, library information systems have changed in both form and nature. Given the above, and facing such multifaceted change and challenge, where should the library field go from here? This article offers two suggestions for reference:

(1) Redefine the "Library" and Examine the "Discovery to Delivery" Process in Depth

In the past, library information systems took the single library or institution as their scope, with acquisitions, cataloging, circulation, reference, interlibrary loan, and the OPAC as the main operations and services. In essence they centered on the library's internal activities, building on internal work products and extending outward to reader services. As a result of networking and digitization, the resource scope of library information services is no longer mainly the library's own collection; it now spans both the digital information resources obtainable over the Internet and the library's own holdings. In information use, traditional and digital resources carry equal weight; commercial electronic databases in particular have grown ever more important and have become readers' main source for information of every kind. At the same time, readers treat Google as their primary discovery tool for finding and obtaining information. Moreover, with the socialization and mass participation of Web 2.0, many new information resources are generated by readers themselves, or are value-added resources given new form, accelerating the change and production of born-digital resources. The so-called "library" and its information services have thus changed fundamentally in both form and nature. The information objects that library information systems must handle include traditional, digitized, and born-digital information; their domain extends from the single institution into Internet space; and information now comes not only from traditional scholarly sources but also from reader-generated material on every side of the network. Meanwhile, generational turnover among readers and changes in their behavior compel libraries to re-examine their own role and functions. When libraries explore and seek the next generation of library information systems, they must therefore weigh these factors and their effects. The British Library, for example, put forward a strategic plan to "redefine the library" in its strategy report for 2005-2008 (Redefining the Library: The British Library's strategy 2005-2008) (British Library, 2005), while in its strategy report for 2008-2011 it stressed that the library must "connect our users with content" (The British Library, 2008). This is what Dempsey (2005) called "Discovery to Delivery" (D2D): library information systems must embed themselves in the D2D process at the right moments, so that once information is discovered by readers it is delivered to them immediately. In other words, the library field must examine in depth the content of D2D and each of its links and workflows, and from there define the professional activities and information services that match readers' needs, together with the library information systems suited to them.

(2) Embed in Readers' Workflows: From "Discovery to Delivery" to "Create to Curate"

The reach of Googlization extends beyond general web users and is most evident among young readers. Research by the UK Centre for Information Behaviour and the Evaluation of Research (CIBER) calls these young users the "Google generation," that is, network users born after 1993. In its report Information Behaviour of the Researcher of the Future: Executive Summary, besides describing the usage characteristics of each age group (see Figure 6), CIBER offered research libraries three recommendations: make their sites highly visible in cyberspace by opening them up to search engines; abandon the one-stop shop discovery mechanism; and accept that use of library collections will steadily decline ("much content will seldom or never be used, other than perhaps a place from which to bounce") (CIBER, 2008, p. 31).

[Figure 6. Significant age-related differences in article discovery method in the UK: personal recommendations, visiting a library in person, electronic tables of contents, visiting journal publishers' websites, and searching Google Scholar. Source: CIBER, 2008, p. 13]
In addition, Dempsey (2008) recommends that beyond increasing disclosure, libraries need to embed themselves further into the workflow environments of readers' research and learning. In other words, although libraries and their information systems focus on D2D, D2D must be embedded within a "Create to Curate" (C2C) environment, in which the library, through its information systems, provides appropriate information services at the right moments, so that readers and the library jointly create, analyze, organize, select, obtain, store, and preserve these information resources (Dempsey, 2006). The library field must therefore come to understand in depth an institution's reader workflows and information needs (that is, research and learning) under eResearch and eLearning environments before it can introduce D2D-style library information systems and services at the right time.

With the march of time, library information systems have advanced from the early inventory lists, card catalogs, book catalogs, and indexing and abstracting services, through the ILS and electronic databases, to today's Web 2.0 tools and their information resources. The library field has long been accustomed to shifts in the currents of its times and has strategies and practices for responding to them. Now, under the combined impact of networking, digitization, Googlization, Web 2.0, and mobile phones, how to design and make good use of library information systems, so as to accommodate different types of information resources and to meet the needs of library management and of readers' information use, will be one of the field's important future tasks and topics. This article has examined recent library information systems as cases and offered suggestions, in the hope of aiding the library field's understanding of library information systems and serving as a reference for its decisions.

References

Adamson, V., Bacsich, P., Chad, K., Kay, D., & Plenderleith, J. (2008). JISC & SCONUL library management systems study: An evaluation and horizon scan of the current library management systems and related systems landscape for UK higher education. Retrieved February 14, 2010, from http://www.jisc.ac.uk/media/documents/programmes/resourcediscovery/lmsstudy.pdf

British Library. (2005). Redefining the library: The British Library's strategy 2005-2008. Retrieved February 14, 2010, from http://www.bl.uk/aboutus/stratpolprog/strategy0811/blstrategy20052008.pdf

British Library. (2008). The British Library's strategy 2008-2011. Retrieved February 14, 2010, from http://www.bl.uk/aboutus/stratpolprog/strategy0811/strategy2008-2011.pdf

Centre for Information Behaviour and the Evaluation of Research. (2008). Information behaviour of the researcher of the future: Executive summary. Retrieved February 25, 2008, from http://www.ucl.ac.uk/infostudies/research/ciber/downloads/ggexecutive.pdf

Dempsey, L. (2005). Discover, locate, ...vertical and horizontal integration. Retrieved February 14, 2010, from http://orweblog.oclc.org/archives/000865.html

Dempsey, L. (2006). The (digital) library environment: Ten years after. Ariadne, 46. Retrieved February 14, 2010, from http://www.ariadne.ac.uk/issue46/dempsey/

Dempsey, L. (2008). Reconfiguring the library systems environment. Portal: Libraries and the Academy, 8(2). Retrieved February 14, 2010, from http://www.oclc.org/research/publications/library/2008/dempsey-portal.pdf

Ex Libris Ltd. (2009). Unified resource management: The Ex Libris framework for next-generation library services (Ver. 1.1). Retrieved March 14, 2010, from http://www.exlibrisgroup.com/files/Solutions/TheExLibrisFrameworkforNextGenerationLibraryServices.pdf

Fallgren, N. J. (2008). Users and uses of bibliographic data meeting: Brief meeting summary. Retrieved February 14, 2010, from http://www.loc.gov/bibliographic-future/meetings/2007_mar08.html

LibraryThing. (2010, May 5). In Wikipedia. Retrieved May 5, 2010, from http://en.wikipedia.org/w/index.php?title=LibraryThing&oldid=360199022

Mills, K. (2009). M-libraries: Information use on the move. Retrieved February 14, 2010, from http://arcadiaproject.lib.cam.ac.uk/docs/M-Libraries_report.pdf

National Library of Australia. (2007). National Library of Australia IT architecture project report. Retrieved February 14, 2010, from http://www.nla.gov.au/dsp/documents/itag.pdf

OCLC. (2009). OCLC announces strategy to move library management services to Web scale. Retrieved February 14, 2010, from http://www.oclc.org/news/releases/200927.htm

Open Library Environment Project Report. (2009). Retrieved May 5, 2010, from http://oleproject.org/wp-content/uploads/2009/11/OLE_FINAL_Report1.pdf

Van de Sompel, H., Payette, S., Erickson, J., Lagoze, C., & Warner, S. (2004).
Rethinking scholarly communication: Building the system that scholars deserve. D-Lib Magazine, 10(9). Retrieved February 14, 2010, from http://dlib.ejournal.ascc.net/dlib/september04/vandesompel/09vandesompel.html

Journal of Educational Media & Library Sciences 47:4 (Summer 2010): 403-428

Research Article

Evolutionary Development of Library Information Systems

Hong-Chu Huang
Professor, Department of Information & Library Science
Library Director, Chueh Sheng Memorial Library
Tamkang University, Taipei, Taiwan
E-mail: kuanin@mail.tku.edu.tw

Ya-Ning Chen*
System Analyst, Computing Center, Academia Sinica
Taipei, Taiwan
E-mail: arthur@sinica.edu.tw

* To whom all correspondence should be addressed.

Abstract

A library information system is an essential tool for libraries to acquire and organize information resources and to deliver services to users. With the advancement of information technologies, library information systems have evolved from the card catalog into diverse types, such as integrated library systems, electronic resource management systems, Amazon.com, and Google Books Search. This article aims to explore the future of library information systems by reviewing exemplar systems. The study used the case study research methodology to analyze 14 library information systems. The research findings and suggestions address the following points about library information systems: the types of information, information granularity, information scope, information organization, integration methods, resource aggregation, the presentation of information, software implementation, the social OPAC, system design, operation models, and access.

Keywords: Library information systems; Information resources

SUMMARY

The history of library information systems can be traced back to the inventory lists preserved by the medieval monastic libraries of the western world, to the bibliographies of family libraries in ancient China, and to the later book catalogs and the card catalog. With the advancement of time and information technologies, library information systems have evolved into diverse types, such as integrated library systems, electronic resource management systems, Amazon.com, and Google Books Search. However, as an essential tool to organize library information resources, library information systems have been experiencing great changes due to the growing popularity of the Internet, digitization, Googlization, Web 2.0, and mobile computing. These changes have raised concerns in the library field regarding the future of library information systems and the essential functions they should offer.

This article aims to explore the future of library information systems by reviewing exemplar systems. It used the case study research methodology to analyze 14 library information systems in 4 categories. The research findings and suggestions comprise 12 key points: the types of information resources, granularity, scope, information organization, integration methods, resource aggregation, the presentation of information, software implementation, the social OPAC, system design, operation models, and access.
For the research samples, this study chose library information systems from a number of categories rather than from just the one category usually meant by the traditional LMS (Library Management System) or ILS (Integrated Library System). There is, however, no agreed definition of the library information system. Historically, the LMS and ILS began to appear only after computerized automation was introduced to libraries, so library information systems are closely related to library automation. Based on the definitions given by ODLIS (Online Dictionary for Library and Information Science) and the OLE (Open Library Environment) Project, as well as the one given by Dempsey, this research studied library information systems from 1990 to 2010 in terms of their characteristics, functions, and developments. The research samples covered 14 library information systems: the ILS, open source software, the federated library system, Amazon.com, ERM (Electronic Resource Management), the union catalog, open access, Google Book Search, digital library/digital learning resource/IR systems, collection-level systems, Web 2.0 tools, BiblioCommons, LibraryThing, and the North Carolina State University Libraries system. These systems can be divided into 4 categories: the LMS/ILS, the digital library/archive information resource system, social software, and the mobile service.

This research found that the information processed by library information systems has been undergoing many changes because of the Internet, digitization, Googlization, Web 2.0, and mobile services. These changes include:

• Types: the types of information have expanded from traditional books and journals to digitized and born-digital materials. In addition, the types are becoming more complex than ever.
• Granularity: the granularity of information either goes deeper, down to the TOC and full text, or wider, up to the collection level.
• Scope: the information scope has been extended from the library collection alone to various Web-based resources, such as Amazon.com, Google Book Search, and open access.
• Organization: library information systems adopt either URM (unified resource management) or the single business approach to develop a single data corpus for various needs.
• Integration methods: both OpenURL and the identifier will play important roles in retrieving and integrating data.
• Resource aggregation: besides bibliographies and the TOC, resource aggregation has been extended to integrate book covers, videos, customer reviews, full text, book purchase suggestions, social tags, Google maps, and citations.
• Presentation: besides text, visualization tends to be the future of information presentation.
• Software implementation: open source and SaaS will dominate the future implementation of library information systems.
• Social OPAC: the OPAC is becoming more socialized by adopting Web 2.0.
• System design: it becomes important to keep systems both open and independent, to facilitate various applications based on different requirements and purposes, as well as system design, integration, and interoperability.
• Operation model: the consortium-based collaborative model needs in-depth examination to achieve integration and sharing of information resources through collective action.
• Access: mobile access, which used to be an extra function, will become the main service.

Last, this study proposed two suggestions as follows:

• To redefine the "library" and to examine in depth the process of information D2D (discovery to delivery).
• To integrate with user services and to look for the best way to provide library services from the perspective of C2C (create to curate) information resources.
work_op4hvs4irncejdhw6gzr27vcra ----
In 2015, we received just 156,000 requests (74,278 lending requests, 54,032 borrowing requests, and 28,064 local book retrieval and scanning requests), a 34% decrease from 2010 (Figure 1). Because of the downward trends in requests, the director of the depart- ment eliminated three positions (two in borrowing, one in local document delivery) after they were organically vacated due to a retirement, promotion C© Zheng Ye (Lan) Yang Address correspondence to Zheng Ye (Lan) Yang, Texas A&M University Libraries, 5000 TAMU, College Station, TX 77843-5000, USA. E-mail: Zyang@TAMU.edu This article was accepted by the Great Lakes Resource Sharing Conference as a presen- tation but was not presented due to extenuating circumstances. Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/wild. 107 http://dx.doi.org/10.1080/1072303X.2016.1254706 mailto:Zyang@TAMU.edu http://www.tandfonline.com/WILD 108 Z. Y. L. Yang FIGURE 1 Yearly Total Requests Received From 2003–2015. to another library department, and resignation to attend graduate school between 2011 to 2014. On average, we now handle and process about 900 requests/items daily. The decrease in requests might be contributed to our robust elec- tronic resources, which allow users to find fulltext online themselves, the implementation of demand driven acquisition, the installation of scanners on every single floor of the library stacks, and the improvement of our dis- covery services. In the summer of 2015, two other staff members resigned, one of whom previously held a position in lending and another in local document deliv- ery. Already running a lean team, the director did not choose to eliminate these positions. Instead, the director used their departure as an opportunity to establish a new staffing model. When the two newly vacated positions were advertised, the revised position descriptions included all three func- tions, namely, borrowing, lending, and local document delivery, instead of focusing on only one aspect of operation. There are many benefits for cross training, including (but not limited to) consistent productivity even when employees are absent, decreased em- ployee boredom, spreading employees’ understanding and capabilities over a wide range of skills and tasks, and building empathy amongst team mem- bers for their colleagues. Providing employees with varied work typically results in increased productivity and satisfaction. TRAINING APPROACH AND STRATEGIES Why We Love Our Job In mid-August of 2015, we hired two new staff members. On their first day of work, we explained why we love our job: GLRSC Reports From the Field 109 1. Our department is regarded as the most valued and vital library service in the eyes of our customers. 2. We enjoy the detective work and resulting satisfaction of tracking down a resource. 3. We don’t compete - rather, we share, help, and cooperate. 4. Seeing an item coming from South Africa/Australia/Hong Kong makes the world seem smaller. 5. We are in touch with users’ real needs and make a difference in their lives. 6. The articles/books that we deliver enable our students to complete their assignments/theses and our faculty to make breakthroughs in their re- search efforts or secure their grants. First Things First After this pep talk, the supervisor gave the new hires a list of approximately 70 job tasks performed by department staff, not to overwhelm them, but to give them an idea of what they will expect to learn. 
They were also asked to submit a request as a user to our Get it for me system (our brand name for ILLiad). This would help them understand how requests came into the ILLiad client as they started to learn how to process them. Finally, the supervisor asked them to read the FAQ page of the Get it for me service, which would inform them of our service coverage.

Training on Borrowing Processing

They were trained on borrowing processing first, focusing on two tasks: sending requests to other libraries to be filled, and electronic delivery of articles found in TAMU Libraries databases and online resources. During the first week, the supervisor spent 2–3 hours every day sitting next to them, watching them process each request and answering questions along the way.

Figure 2 is our ILLiad borrowing module front page. Each staff member has their own specific tasks underneath their name. This arrangement makes clear what tasks need to be addressed should a staff member be absent; for example, if Bobbie were out for the day, the backup staff would know to take care of the requests in the queues underneath Bobbie's name. Each task has at least one back-up person.

FIGURE 2 ILLiad Borrowing Module Front Page.

Attention to detail is key in our job. When we trained new staff members, we stressed that they need to check several items of information in the request, including the following: Has the patron indicated whether he/she will accept other editions or languages? What specific format has been requested (e.g., print book, audio book, ebook, or CD/DVD)? Are there any notes from the patron in the note field? Is it a rush request, in which case we need to call the lending library to alert them and put the word "rush" in the borrowing note field? (A sketch of this checklist follows below.) For article requests, staff members were instructed to Google the article first to find open access PDFs freely available on the internet. New staff members were shown the power of keyword searching in OCLC and provided with a key MARC fields descriptions reference sheet we prepared (Appendix 1). They were instructed that searching on the ISBN or ISSN in OCLC is quicker, but a title search will bring more results depending on the complexity of the request.
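As an illustration only, the request-checking routine above can be captured as a simple checklist function. The following is a hypothetical Python sketch, not part of ILLiad, and the field names are invented for the example.

```python
# Hypothetical triage helper mirroring the checklist above; the request is
# modeled as a plain dict, and the field names are invented for
# illustration (they are not ILLiad's actual field names).
def triage_borrowing_request(request):
    """Return a list of reminders a trainee should act on before
    sending the request out to a potential lending library."""
    reminders = []
    if "accepts_other_editions" not in request:
        reminders.append("Check: will the patron accept other editions or languages?")
    if not request.get("format"):
        reminders.append("Check: which format was requested (print, audio book, ebook, CD/DVD)?")
    if request.get("patron_note"):
        reminders.append("Read the patron's note field: " + request["patron_note"])
    if request.get("rush"):
        reminders.append("Rush: call the lending library and add 'rush' to the borrowing note field.")
    return reminders

sample = {"format": "", "patron_note": "Need by Friday", "rush": True}
for line in triage_borrowing_request(sample):
    print(line)
```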
Training on Lending Processing

In week two, they were trained on lending processing. They had a tour of the library stacks and learned the library layout, the call number structures, and the locations of the re-shelving areas for each floor. We talked about our consortia group and priority processing for our consortia members, and we informed them that in the case of a bad citation they should send the borrowing library a conditional response asking them to check the citation, instead of cancelling or not filling the request. If unfilled, the request will continue to the next library without being corrected, a waste of time for all parties involved. We also shared with them some OPAC search tips. They were instructed to print pull slips for stacks searches, and they learned packaging, organizing received items, document scanning, and sending shipments via FedEx and TExpress (our state courier). We emphasized that when we return items to a lending library, we include the paperwork that was sent along with the item to help the lending library in their check-in processes.

New staff members were also informed about database licenses and their effect on interlibrary loan; for example, per our contract with Elsevier, we cannot send copies of articles from its packages to libraries outside of the United States. During the second week, they learned lending processes and also spent about two hours every day on borrowing request processing so that they wouldn't forget what they had learned the week before.

Training on Local Document Delivery Processing

On the last day of week two, moving over to our Document Delivery module training was smooth sailing. They only needed to learn how to create a hold record in the Voyager circulation module after retrieving books from the stacks for our patrons, so patrons could be alerted to an available pickup. They were instructed to print pull slips, update stacks searches for loans, and scan/update documents for electronic delivery.

Practice on Their Own

Starting from week three, the two new hires were on their own. We saved the screen shots used during the training so that they could serve as refreshers during processing, and the new hires also took extensive notes for reference. A copy of the step-by-step processing procedure was provided to them during the training. Both realized that taking notes enabled them to reinforce the learning, leading us to conclude that it is better to have new hires take notes during the training session rather than just give them the process procedure documents. When they ran into any uncertainty, they asked other staff, or pinged their supervisors or the director. Whenever the director saw a request that was out of the norm, she would explain how she processed it, sometimes asking them to show her how they would tackle the request. We talked about copyright law, custom holdings, and patron privacy. They had a solid 4 months of practice after the initial training.

Review and More Training

In mid-December of 2015, the director sat down with each of the new hires and checked their progress. They both did a fantastic job. The director then showed them how to process the following tasks: awaiting copyright clearance, awaiting renewal request processing, awaiting denied renewal processing, awaiting odyssey delivery, awaiting SFX requests processing, and using the OCLC blank work form to submit a request.

With the hands-on experience they'd gained over the previous four months, it took less than 30 minutes to complete the above training. We decided that after the holiday, in the spring semester, they would be trained on the following tasks: borrowing unfilled, conditional, and incoming books processing; lending conditional, renewal, and unshipped; and OCLC special messages complete and not received. They would also be trained on preparing items for faculty/staff office mail stop delivery and branch library delivery, using FedEx to ship books to distance education students' homes, and checking in borrowed returned books from faculty and distance students. Overall, it took them about 8 months to feel comfortable handling all the tasks in the department.

CONCLUSION

This model was put to the test when, unexpectedly, one staff member whose responsibilities were solely in lending resigned in mid-February of 2016. Another lending staff member was out that entire week on a pre-planned vacation. The two new hires simply shifted and balanced their focus, giving more attention to lending in their daily work activities. This successfully alleviated any potential extra burden on other staff.
Their daily responsibilities include processing incoming requests in all three modules and filling in for staff absences seamlessly. Though they initially felt overwhelmed by the training, after a semester they feel much less stressed and have begun to feel more confident in their abilities within the department as a whole. They are very receptive to and appreciative of this training model; because the work is very varied, they get to do many different things instead of focusing on only one thing every day. They are able to learn many of the department processes through incremental instruction.

This model also earned support from existing staff members. They commented that we developed a larger pool of employees who can step in when the department is short-staffed. Now each task is covered by at least five staff members. All tasks can be carried out throughout the day. For example, we used to process books only in the morning, because books were usually delivered at the end of the shift of the staff member whose main responsibility was to process incoming borrowed books. With this new model, we can process incoming books multiple times a day. This model breaks away from traditional staffing practices in big resource-sharing/document-delivery departments, where specific staff attend to their specific responsibilities only. Based on our experiences implementing this model, we created a training schedule for our third new hire (Appendix 2), who started in May 2016.

In short, this model has paid off. The new hires developed a clear understanding and appreciation for the interconnection of the department's services. They are more confident and self-reliant, with a broader skill set. Department dynamics have improved. Turnaround times have improved by .5 days for borrowing and 1 full day for lending.

ACKNOWLEDGMENT

This article is based on the proceedings of the Great Lakes Resource Sharing Conference held June 9–10, 2016 in Indianapolis, Indiana.

REFERENCES

Yang, Z. Y. (2004). Customer satisfaction with Interlibrary Loan Service—DeliverEdocs: A case study. Journal of Interlibrary Loan, Document Delivery & Information Supply, 14(4), 79–94.

Yang, Z. Y. (2005). Providing free document delivery services to a campus of 48,000 library users. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 15(4), 49–55.

Yang, Z. Y., Hahn, D., & Thornton, E. (2012). Meeting our customers' expectations: A follow-up customer satisfaction survey after 10 years of free document delivery and interlibrary loan services at Texas A&M University Libraries. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 22(2), 95–110.

APPENDIX 1: MARC FIELDS DESCRIPTIONS

010: Library of Congress Catalog Number (LCCN: 12-345678)
020: International Standard Book Number (ISBN: 0123456789)
022: International Standard Serial Number (ISSN: 1234–5678)
030: CODEN (ABCDEF) - assigned by Chemical Abstracts Service
037: Source of acquisition - NTIS and ERIC documents microfiche
050: Library of Congress call number
082: Dewey Decimal call number
086: US Documents classification/call number
100: Personal name/author
110: Corporate name
111: Meeting/Conference name
210: Abbreviated title
245: Title
260: Publication place, company, & date
300: Physical description - book
362: Dates of publication/sequential designation - serials
440: Series title
502: Dissertation/thesis note
772: Parent record entry - for supplements and single issues
776: Additional physical form entry merged with alternate title
780: Preceding bibliographic record - serials
785: Succeeding bibliographic record - serials
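To show how a trainee might read these tags together, here is a minimal, hypothetical sketch in Python of a brief book record keyed by the tags above; the bibliographic details and call number are invented, and real MARC records also carry indicators and subfield codes that this sketch omits.

```python
# A brief, invented book record keyed by tags from the reference sheet.
# Real MARC fields carry indicators and subfield codes; this sketch keeps
# only tag -> display value to mirror the quick-reference use above.
brief_record = {
    "020": "9780123456789",                 # ISBN
    "050": "Z675.U5 D63 2015",              # LC call number (invented)
    "100": "Doe, Jane",                     # personal name/author
    "245": "Interlibrary loan basics",      # title
    "260": "Chicago : Example Press, 2015", # place, publisher, date
    "300": "xii, 250 pages ; 24 cm",        # physical description
}

for tag in sorted(brief_record):            # print in MARC tag order
    print(tag, brief_record[tag])
```

Scanning the 020 or 022 value first and then falling back to a 245 title search mirrors the OCLC search advice given during borrowing training.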
362: Dates of publication/sequential designation - serials 440: Series title 502: Dissertation/thesis note 772: Parent record entry - for supplements and single issues 776: Additional physical form entry merged with alternate title 780: Preceding bibliographic record - serials 785: Succeeding bibliographic record – serials APPENDIX 2: TRAINING SCHEDULE FOR NEW HIRE • Week 1: Train on lending process, OPAC/Database search and floor search • Week 2: Practice lending processing and train on opening incoming mail and distributing mail, preparing for lending outgoing packages (FedEx, Texpress, International Mail) • Week 3: Train on printing lending pull slips, updating lending stacks search for loan and scanning for electronic delivery (Odyssey, Article Ex- change) • Week 4: Practice lending processing • Week 5: Train on processing borrowing requests in Borrow from Others and Awaiting Request Processing queues • Week 6–7: Practice both lending and borrowing processing • Week 8: Train on placing hold record in Voyager circulation for book re- trieval in DocDel, updating DocDel loan/article stacks search for delivery • Week 9–12: Practice Lending, Borrowing and DocDel processing • Week 13: Comprehensive review with the supervisor • Week 14: Train on the following borrowing tasks: • Awaiting Copyright Clearance • Awaiting Renewal Request Processing • Awaiting Denied Renewal Processing • Awaiting Odyssey Delivery • Users to Clear • Awaiting SFX Requests Processing • Using OCLC blank work form to submit request • Week 15–17: Practice all of the above GLRSC Reports From the Field 115 • Week 18–19: Train on processing incoming books for borrowing • Week 20: Comprehensive review with the supervisor • Week 21: Train on borrowing unfilled/conditional processing • Week 22: Train on following lending tasks: • Conditional request processing • Unshipped • Renewal request • OCLC Special Message: Complete, Not Received • Week 23–24: Practice all of the above • Week 25: Comprehensive review with the supervisor • Week 26–27: Train on following DocDel tasks: • Monitor request queues for books to/from branch libraries • Prepare items for faculty office and branch library delivery • Ship books to distance education student’s home via FedEx • Check in returned books from faculty/distance students in borrowing • Week 28–31: Practice all of the above processing • Week 32: Comprehensive review with the supervisor INTRODUCTION Training Approach and Strategies Why We Love Our Job First Things First Training on Borrowing Processing Training on Lending Processing Training on Local Document Delivery Processing Practice on Their Own Review and More Training CONCLUSION ACKNOWLEDGMENT REFERENCES 1.MARC FIELDS DESCRIPTIONS 2.TRAINING SCHEDULE FOR NEW HIRE work_oqfpjs7pqzf3pgwazpkv2vk4nu ---- Testing RDA at Dominican University's Graduate School of Library and Information Science: The Students’ Perspectives This article was downloaded by: [Biblioteca del Congreso Nacional], [Mr Biblioteca Congreso Nacional] On: 28 December 2011, At: 07:44 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Cataloging & Classification Quarterly Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/wccq20 Testing RDA at Dominican University's Graduate School of Library and Information Science: The Students’ Perspectives Marjorie E. 
Bloss a a Graduate School of Library and Information Science, Dominican University, River Forest, Illinois, USA Available online: 17 Nov 2011 To cite this article: Marjorie E. Bloss (2011): Testing RDA at Dominican University's Graduate School of Library and Information Science: The Students’ Perspectives, Cataloging & Classification Quarterly, 49:7-8, 582-599 To link to this article: http://dx.doi.org/10.1080/01639374.2011.616264 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material. http://www.tandfonline.com/loi/wccq20 http://dx.doi.org/10.1080/01639374.2011.616264 http://www.tandfonline.com/page/terms-and-conditions Cataloging & Classification Quarterly, 49:582–599, 2011 Copyright © Taylor & Francis Group, LLC ISSN: 0163-9374 print / 1544-4554 online DOI: 10.1080/01639374.2011.616264 Testing RDA at Dominican University’s Graduate School of Library and Information Science: The Students’ Perspectives MARJORIE E. BLOSS Graduate School of Library and Information Science, Dominican University, River Forest, Illinois, USA Dominican University’s Graduate School of Library and Informa- tion Science (GSLIS) was one of a funnel group of graduate schools of library and information science selected to test Resource De- scription and Access (RDA). A seminar specifically for this purpose was approved by the dean and faculty of the library school and was conducted from August to December 2010. Fifteen students participated in the test, creating records in Anglo-American Cata- loguing Rules (AACR2) and in RDA, encoding them in the MARC (Machine Readable Cataloging) format, and responding to the re- quired questionnaires. In addition to record creation, the students were also asked to submit a final paper in which they described their experiences and recommended whether or not to accept RDA as a replacement for AACR2. KEYWORDS Resource Description and Access (RDA), Anglo- American Cataloguing Rules (AACR2), user studies, descriptive cataloging, cataloging Received May 2011; revised August 2011; accepted August 2011. Marjorie E. Bloss was a full-time lecturer in Dominican University’s Graduate School of Library and Information Science from 2004–2011. She also served as the RDA Project Manager from 2005 to 2009. She retired at the end of June 2011. The author thanks Dean Susan Roman and the Dominican GSLIS faculty for their support, and the following students who participated in the RDA Testing seminar: Albulena Bruncaj, Mary Jo Chrabasz, James Hennelly, Andrea Jarratt, Phyllis Kastle, Concetta Kellough, Heidi Knuth, Richard Martin, Lauren Robb, Jennifer Rubin, David Sanborne, Anthony Santaniello, Stacy Taylor, Julie Tegmeier, and Amanda Vermeulen. Address correspondence to Marjorie E. Bloss, 2827 W. 
Gregory Street, Chicago, IL 60625, USA. E-mail: marjorie bloss@msn.com 582 D ow nl oa de d by [ B ib li ot ec a de l C on gr es o N ac io na l] , [ M r B ib li ot ec a C on gr es o N ac io na l] a t 07 :4 4 28 D ec em be r 20 11 Testing RDA at Dominican University 583 INTRODUCTION There is a saying that the author of this article first heard from her library director a number of years ago: “Pioneers are often found with arrows in their backs.” What the director was referring to at the time was something called the MARC (Machine Readable Cataloging) format. We were in the early days of MARC and were busy trying to convince the university administration that converting our card catalog records into MARC was a cost-effective thing to do. To prove the point, we decided to generate a microfiche catalog from our MARC records in order to replace the card catalog. It did not go over very well with the users even though our new-found ability to hold what used to be the card catalog in one hand impressed everyone considerably. While the microfiche catalog was mercifully replaced by an online catalog, we learned an important lesson: life is not easy for pioneers—even in the library world. So here we are, experiencing yet another example of the pioneering spirit in our exploration and testing of Resource Description and Access (RDA). Dominican University’s Graduate School of Library and Information Sci- ence, (GSLIS, located in River Forest, Illinois), was one of fourteen library schools constituting a funnel group that was selected to participate in the formal testing of RDA. (A “funnel group” is a group of library schools—in this case—working together as a single unit. Information and processes are “funneled” through an institution representing the group as a whole.) Each library school that agreed to participate in the test was given free reign with regard to how it wished to design its approaches to the testing. This article will focus on the testing that took place specifically at Dominican University’s GSLIS and will include the following: • The RDA testing process • How RDA testing was conducted at Dominican University • Students’ comments and observations—The Negatives • Students’ comments and observations—The Positives • Perspectives on teaching RDA THE UNITED STATES NATIONAL LIBRARIES’ TESTING PROCESS The plans to test RDA were created jointly by a steering committee consisting of representatives from the Library of Congress (LC), the National Agricultural Library (NAL), and the National Library of Medicine (NLM). The process they identified was to select approximately 25 libraries from various communities (academic, public, special and school libraries, library automation vendors, and library schools). Each participant would be expected to catalog the same 25 titles selected by the three U.S. national libraries and to create original cataloging records for them using both Anglo-American Cataloguing Rules, D ow nl oa de d by [ B ib li ot ec a de l C on gr es o N ac io na l] , [ M r B ib li ot ec a C on gr es o N ac io na l] a t 07 :4 4 28 D ec em be r 20 11 584 M. E. Bloss 2nd edition (AACR2) and RDA. Five additional records were selected that would be copy-cataloged using RDA. Finally, each participating institution would be expected to create a minimum of 25 “extra set” records using only RDA. The 25 extra set records were to be selected by each participating library, reflecting the materials the institution received as part of its normal acquisitions. 
Libraries not selected for the formal testing process were also encouraged to contribute records for the test. Once the records were cataloged participants were asked to submit them to the three U.S. national libraries for review. This could be done through OCLC or some other method for submission. For every record created, par- ticipants had to fill out a survey assessing such things as the amount of time it took to create the record, the difficulties they had along the way (be it with the content of the RDA rules or using the online version of the cataloging instructions), the amount of time taken to consult with others, and ultimately, whether or not RDA should be adopted. These surveys were then analyzed by the three U.S. national libraries in order to determine whether RDA would become accepted cataloging practice in the United States. TESTING TIMELINES The three U.S. national libraries had identified timelines for the testing, de- pendant on RDA’s release. When RDA was released on June 23, 2010, the testing period began. The end of June through the end of September 2010 was designated as a training period when test participants were expected to become familiar with and experienced in using the RDA instructions and the RDA Toolkit (the online package that includes RDA itself plus additional features such as tools related to RDA’s use, e.g. AACR2, workflows, RDA to MARC and MARC to RDA mappings). From October through December 2010, participants were expected to catalog the 25 original records using both AACR2 and RDA, the five copy cataloging records, and a minimum of 25 “extra set” records. Following the submission of the cataloging records and their related questionnaires, LC, NAL, and NLM analyzed the results from January through March 2011. During this time, they began to formulate their recommendations regarding the adoption of RDA. The three U.S. national libraries announced their recommendations at the 2011 annual American Library Association conference. The overriding recommendation was to adopt RDA but not to do so until January 2013 at the earliest. LC, NAL, and NLM identified a number of modifications they felt should be made in the intervening 18 months both to RDA content and the RDA Toolkit as well as developing a replacement for the MARC format. Many of these recommendations were based on comments received from the RDA test participants. D ow nl oa de d by [ B ib li ot ec a de l C on gr es o N ac io na l] , [ M r B ib li ot ec a C on gr es o N ac io na l] a t 07 :4 4 28 D ec em be r 20 11 Testing RDA at Dominican University 585 SELECTION OF THE LIBRARY EDUCATORS’ GROUP TO TEST RDA Prior to the selection of the RDA test participants, a call went out to the grad- uate library schools inquiring if they would like to form a group of library school educators for the purposes of testing RDA. Educators from fourteen schools indicated they would be interested in doing so. The appropriate ap- plication forms were submitted and the three U.S. national libraries selected the group to participate in the formal testing of RDA. The group was a loose confederation—one where each institution could decide how it wished to catalog and submit its records. In some institutions, only the faculty submitted records. In others, students contributed records ei- ther as part of a practicum, a seminar devised specifically for testing RDA, or voluntarily.1 Some of the educators decided not to participate in the testing once they saw how time-consuming the process was. 
Although the edu- cators’ group was considered a funnel group, coordination occurred at the administrative level only rather than creating and submitting bibliographic records and surveys through one institution. RDA TESTING IN DOMINICAN UNIVERSITY’S GSLIS PROGRAM During the spring of 2010, the author proposed to Dominican University’s GSLIS faculty that she conduct a seminar designed specifically for the purpose of testing RDA. The faculty approved the proposal for the fall 2010 semester. In order to register for the seminar, students had to have taken the cataloging-related core course, Organization of Knowledge, and the second- level cataloging course. Students who had taken only the core course were permitted to take the seminar if they could prove they had sufficient cata- loging knowledge and skills equivalent to the second-level cataloging course. To this end, they had to submit cataloging records demonstrating knowledge of both AACR2 and the MARC format. Consequently, fifteen students were admitted to the seminar. This num- ber was very advantageous when it came to divvying up the 25 original set records, the five copy cataloging records, and the extra set records. The final tally of records submitted by Dominican included all 25 original set records in both AACR2 and RDA, the 5 copy cataloging records, and 95 extra set records using only RDA. Of course, the related surveys for each record were also submitted. The students’ record creation process for the test was divided as follows. • Five students created 5 original set records each, using RDA; in addition they created 5 extra set records using RDA • Five students created 5 original set records each, using AACR2; in addition they created 5 extra set records using RDA D ow nl oa de d by [ B ib li ot ec a de l C on gr es o N ac io na l] , [ M r B ib li ot ec a C on gr es o N ac io na l] a t 07 :4 4 28 D ec em be r 20 11 586 M. E. Bloss • One student did the 5 copy cataloging records; in addition she created 5 extra set records using RDA • Four students created 10 extra set records using RDA Classroom time was spent reviewing records, discussing successes and diffi- culties encountered with RDA content, the RDA Toolkit, creating records in OCLC, as well as general observations about the testing process itself. SOME STUDENT DEMOGRAPHICS One of the goals in having GSLIS students participate in testing RDA was to gauge whether RDA was easier to use than AACR2. The hypothesis was that library school students who approached RDA without the baggage of many years of using AACR2 would have an easier time adjusting to the new cata- loging code than those immersed in AACR2. To this end, the self-selecting process of students who registered for the class proved very effective. Cat- aloging experience for 11 out of the 15 students ranged anywhere from no cataloging experience to one year. Four students had 1–2 years of cata- loging experience. Unfortunately, the survey questions regarding cataloging experience were no more specific than this; therefore, it was impossible to know what, exactly, students’ cataloging experience consisted of (e.g., experience gathered only during course work, copy cataloging experience, original cataloging experience). Another interesting demographic was the ages of the students. Although no specific information was requested from the students regarding their ages, the author estimates that all of them were in their 20s, 30s, and 40s. 
Consequently, these particular students would have a number of years left as practicing librarians and would indeed witness the impact of RDA on library staff and users alike. In short, they had a vested interest in the future of cataloging and of RDA. Prior to the seminar, students were advised that they needed to have an ability to tolerate ambiguity and they would need to be flexible. We knew instructions would be coming in quick succession, often while students were training or even after we had moved from the training period to creating the test records themselves. We also knew there would be modifications to testing instructions since all of us (the U.S. national libraries and OCLC as well) were learning as we went along and no one knew all the answers right out of the box. Furthermore, and perhaps most important, students needed to understand that we would need to work collaboratively and make allowances for mistakes—even by the professor. In addition to the requirements for the test itself, the students were re- quired to submit a final paper that would describe their learning experiences using RDA and to record their observations on the management of the testing D ow nl oa de d by [ B ib li ot ec a de l C on gr es o N ac io na l] , [ M r B ib li ot ec a C on gr es o N ac io na l] a t 07 :4 4 28 D ec em be r 20 11 Testing RDA at Dominican University 587 process as a whole. In other words, they were asked to look beyond record creation and to look at the testing of RDA from a project management per- spective. Finally, they were asked to recommend whether or not RDA should be adopted and to explain the reasons for their recommendation. Students were asked to comment on the following in their final paper: • How easy was the transition from AACR2 to RDA for you? • What helped you acclimatize yourself to RDA? Were there certain “tricks of the trade” that you found useful? • What is your assessment of RDA’s content? Based on your experiences, what changes would you like to see? • What is your assessment of the RDA Toolkit? Based on your experiences, what changes would you like to see? • Do you feel RDA lives up to the goals that the Joint Steering Committee (JSC) and the Committee of Principals (CoP) identified for RDA (e.g., more cataloging efficiency, better internationalization, better accommodation of digital materials, etc.)? • Do you prefer AACR2 over RDA or RDA over AACR2 and why? • What has this course taught you from the perspective of management: ◦ Introducing a new cataloging code ◦ Introducing new software ◦ Observing how people learn and what helps them learn ◦ Assessing your leadership role should you be in a position of introducing RDA to your staff, or to the library as a whole • Would you recommend that RDA (a) not be adopted, (b) be adopted with some changes along the lines of what you previously identified, (c) be adopted as soon as possible realizing that some changes are inevitable? All the students were extremely enthusiastic and excited about being part of a national program to test RDA. The fact that they were putting their library school experience into practice and contributing to a national decision about the future of cataloging provided them with an experience only few would have. Through the semester we experienced moments of frustration but in the end, everyone had a feeling of immense satisfaction for having participated in testing RDA. DISCONNECTS BETWEEN RDA AND SCHOOL SEMESTER TIMELINES Even before the semester began, we were at a disadvantage. 
As has been mentioned previously, the training period for RDA began immediately after its release in late June 2010. The seminar, however, was not scheduled to begin until Dominican's fall semester, which commenced at the end of August. Technically, this would mean a loss of two months in the testing process. Additionally, our time would be cut short at the end of the testing period, since the semester concluded in mid-December rather than at the end of the month. To compensate to some degree, the university allowed us to hold two sessions during early August (before the fall semester officially began). This helped immensely in terms of discussing some of the basics of the testing process and distributing training documents to the students for them to review prior to the beginning of the semester.

We were at another disadvantage with regard to the submission of our cataloging records through OCLC. While Dominican University's GSLIS most certainly has an account with OCLC, we are (understandably) allowed to use the system only in a limited mode. What this means is that we can save records in OCLC in order for them to be reviewed, but we cannot upload them into the system. OCLC's policy for testing RDA was that we were expected to create Institutional Records (IRs), a process that had its own procedures and guidelines. This, then, was another component of the test (in addition to becoming familiar with RDA and the RDA Toolkit) that was part of our learning curve.

AVAILABILITY OF TRAINING MATERIALS

Even in the early stages of RDA testing, there were a number of excellent training materials available. Many of these were generated by the LC, including its nine PowerPoint training modules as well as the training documents used by LC staff. The LC also made available its RDA policy decisions (Library of Congress Policy Statements, or LCPS) that were specific to the RDA testing. Not long after the RDA Toolkit became available, the LCPS were integrated with RDA content, making it very easy to go back and forth between a policy statement and the RDA instruction to which it referred. The University of Chicago in particular began to catalog using RDA and generously provided access to its training documentation. This included a number of valuable workflows that were incorporated within the RDA Toolkit itself. A series of Webinars was given, with instruction and guidance for using both RDA content and the RDA Toolkit. When it came down to it, there was such a plethora of good documentation that it became necessary to be selective in what to use and what to omit.

Even with the LCPS, test participants were encouraged to make their own decisions regarding certain RDA instructions and workflows. One of these decisions had to do with whether or not to include authority control as part of the testing process. In the Dominican GSLIS seminar, we decided not to create any authority control records.
The semester was simply too short given the university schedule described above, and there was enough to do with training, bibliographic record creation, learning how to create IRs in OCLC, and filling out the survey for each record created, not to mention writing the final paper for the class. In hindsight, we felt this omission was a good decision.

WHAT WE LEARNED: GENERAL COMMENTS

Initially, the author thought she would be the only one having difficulty making the transition from AACR2 to RDA due to her many years of using AACR2. This was not the case, as every member of the class commented on the steep learning curve required for creating bibliographic records using RDA. What surprised us most was the need for a detailed knowledge of the Functional Requirements for Bibliographic Records (FRBR) and, to a lesser degree, the Functional Requirements for Authority Data (FRAD). This knowledge goes beyond an overview of FRBR. It calls for a solid understanding of the attributes of the groups of entities and how they relate to one another when creating a bibliographic record.

We observed that RDA does not lay out cataloging instructions nearly as linearly as does AACR2. Attributes that we are used to seeing as a unit in AACR2 (e.g., extent data) are found in separate chapters in RDA (e.g., pagination and size are found under the instructions for "Manifestation" while the instructions for illustrations are found under "Expression"). We also found ourselves needing to adjust to splitting AACR2's General Material Designation (GMD) into three parts, especially as one of those elements (content type) has its instructions in chapter 6, whereas the instructions for the other two elements (carrier type and media type) are found in chapter 3.

Another learning curve was adjusting to RDA's terminology. In some cases, it was like learning a new language, as the vocabulary used in RDA comes very much from FRBR, as do the concepts that underlie RDA. In other cases, we discovered that the terminology in RDA does not always have an equivalent in AACR2, and vice versa. This caused frustration when searching for a term using AACR2 vocabulary that does not exist in RDA. There was nothing that immediately pointed us in the right direction in the RDA Toolkit. Consequently, we developed skills for what we called "going through the back door"—namely, reviewing training documentation or consulting materials familiar to us, like MARC and AACR2, that would then provide a map into the appropriate instruction in RDA.

Even with the modifications made to MARC to accommodate RDA, we discovered we had difficulty putting the round RDA pegs into the square MARC holes (unless it was the other way around). One major example is the 1xx field, the main entry field. RDA provides instructions for access points but not main entries. We found ourselves needing to select an access point for a main entry in order to create a MARC record. We seriously considered putting all personal, corporate, or family access points into 7xx fields as a way to adhere to RDA more closely.
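To make the GMD and main-entry points concrete, here are two simplified, hypothetical record fragments (invented for illustration; they are not records from the test). Under AACR2, the general material designation rides along in the 245 field; under RDA, that single designation is replaced by three elements recorded in the MARC 336, 337, and 338 fields, while the access point chosen to serve as a main entry still lands in a 1xx field:

AACR2-style fragment:
100 1_ $a Doe, Jane.
245 10 $a Sample resource $h [electronic resource] / $c by Jane Doe.

RDA-style fragment:
100 1_ $a Doe, Jane.
245 10 $a Sample resource / $c by Jane Doe.
336 __ $a text $2 rdacontent
337 __ $a computer $2 rdamedia
338 __ $a online resource $2 rdacarrier

The terms "text," "computer," and "online resource" come from the controlled RDA content, media, and carrier vocabularies (hence the $2 source codes); the name and title are, again, invented.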
STUDENT COMMENTS: THE NEGATIVES

The author analyzed the students' comments from their final papers, sorting them into "negative" and "positive" categories and grouping like comments together. What follows is a listing of student comments based on their submitted final papers. The students' comments underscored the importance of having a solid understanding of the details of FRBR and FRAD due to the lack of a linear approach for creating bibliographic records in RDA, the difficulty in correlating AACR2's vocabulary with RDA's, and the observation that MARC is not the best encoding scheme for RDA. Here are some of the other difficulties the students found in using the RDA content and the RDA Toolkit.

Negative Comments on RDA Content

• RDA's rules can be vague and lack clarity in places; the language of RDA should be simplified
• Initially, students ran into difficulty adjusting to RDA's vocabulary and the order of the rules
• Students felt that RDA's structure was based too heavily on that of FRBR and FRAD, making a linear approach to cataloging difficult. While the students were very supportive of the concepts of FRBR and FRAD, they expressed a desire to see RDA recast for catalogers rather than having the instructions governed by FRBR and FRAD entity groups
• The rules for description and the creation of access points seemed fragmented and at times jumped around to different chapters rather than keeping instructions together (e.g., the creation of access points)
• The lists of relationship designators should be combined into one list
• Students were not always sure when they had completed a bibliographic record—they often had an uneasy feeling that there was more information they needed to include

Negative Comments on the RDA Toolkit

• The RDA Toolkit slowed down the cataloging process, as the software drilled down through the entire chapter before arriving at the specific rule
• The students were unable to reference multiple rules in the RDA Toolkit simultaneously
• The RDA Toolkit did not have an index (this has recently been rectified)
• Scrolling in the RDA Toolkit was often slow or erratic
• Some of the RDA Toolkit's functionality was unclear or awkwardly structured (e.g., "Previous" and "Next-hit" arrows and "Advanced search")

STUDENT COMMENTS: THE POSITIVES

Positive Comments on RDA Content

• Many students commented on the fact that RDA's de-emphasis on format and type of material led to greater flexibility in the rules
• RDA was much better equipped for cataloging digital materials
• RDA was much more "future-proof" than AACR2 in that it was much more in line with other digital knowledge and information communities
• Students found that many rules in RDA were similar or identical to those in AACR2 (especially those dealing with access points).
Therefore, once they became familiar with where to find these rules in RDA, their comfort level in using RDA increased significantly
• For the most part, students liked replacing the GMD with the carrier, content, and media designators (especially for digital materials) and also supported the individual MARC tags for them, although one person felt that an expansion of the GMD identifiers would have sufficed
• Students believed that eliminating the "rule of three" was essential in providing better access to materials
• A number of students commented that they felt RDA would meet its goals regarding a broadening of scope internationally
• Students were highly in favor of the entity-relationship-based database concepts of RDA, believing that these provide a greater ability to support user needs. They also commented on the importance of the vendor community in supporting this database architecture in order to realize the full potential of RDA
• Students noted that although the RDA learning curve was steep, they became more adept at using RDA as they gained practice in its use. Several students were surprised when, at the end of the semester, they were creating RDA bibliographic records very quickly.

Positive Comments on the RDA Toolkit

Generally, the RDA Toolkit received high marks from the students, although there were certainly areas for improvement. One of the students commented that he could not see how anyone could easily learn RDA from a print document due, in large part, to the many interactive tools, mappings, and resources available in the Toolkit. Specific comments included the following:

• Students felt the RDA Toolkit was readable and easy to navigate
• Students felt that the Toolkit itself greatly aided their learning of the RDA content. Specifically, they liked such Toolkit functions as:
◦ Integration of the Library of Congress Policy Statements with the RDA instructions
◦ Quick Search function
◦ Various mappings (e.g., RDA to MARC and MARC to RDA, AACR2 to RDA)
◦ Bookmarks and notes
◦ Synch Table of Contents
◦ Workflows

RDA'S IMPACT ON USERS

Although the students did not have the opportunity to query catalog users as to which record was preferable, AACR2 or RDA, they made their own comparisons. Students noted that in many cases there was little difference in the representation of bibliographic data from an AACR2 to an RDA record; however, they still felt there were some improvements in the RDA records. A number of students pointed to the elimination of abbreviations as something that would benefit the user, especially in the effort to internationalize RDA. The students also supported the separation of the GMD into three components—content, carrier, and media identifiers. While students did not believe this was overly useful for print material, they observed the great value in being able to be more specific when describing digital materials. And finally, students noted the value of a FRBRized catalog as hugely beneficial for library users. Providing them with the ability to access all the manifestations of a work at once, rather than having to sort through several different records, was seen as a major plus. By extension, RDA's focus on attributes and relationships provides increased access to information, thus increasing responsiveness to user needs.
OTHER CONCERNS

The students' comments went beyond RDA content and the RDA Toolkit, demonstrating that they had broader concerns than simply RDA cataloging. These included:

• The cost of RDA—both the Toolkit and the hardware and software on which to run it
• Concern for the paraprofessional and clerical staffs needing to understand FRBR theory in order to accomplish their day-to-day cataloging tasks
• The need to address the important issue of authority control and whether AACR2 and RDA headings could, in fact, co-exist in a single file
• And again, the steep RDA learning curve, even for people who do not carry AACR2 baggage

GENERAL SUPPORT FOR RDA

In their final papers, the students were asked to make a recommendation about RDA's future. All of the students supported the adoption of RDA, although there was hesitation in some cases as well as differences of opinion as to how quickly RDA should be adopted. Nine of the 15 students recommended that RDA be adopted immediately. Six recommended that it be adopted, but only after modifications were made to the content, the Toolkit, or both.

TEACHING RDA

What is the best way to teach RDA to GSLIS students, regardless of whether or not they intend to become catalogers? One thing that is obvious is that getting our minds and mouths around RDA is considerably different than when AACR2 was implemented. Technology—the ability to share documents over the Web, to hold training sessions via Webinars and other similar technologies, to ask questions and receive responses within hours if not minutes, to hold philosophical discussions about cataloging in general and cataloging codes in particular—has provided us with information overload. If anything, our difficulty is going to be sorting out all of the available material (much of it excellent) and deciding what is most appropriate to use for our particular circumstances.

Training in a classroom situation can be very different from training in a cataloging department. Dominican University's GSLIS prides itself on face-to-face classroom settings. Although the number of online courses taught is growing and such courses are certainly included in the curriculum, our current course ratio has us teaching a larger number of face-to-face courses. Face-to-face sessions are held once a week during the fall and spring semesters (twice a week is the norm in summer), with opportunities through Blackboard for asking questions, making observations, and holding discussions between class sessions. This process differs from a cataloging department, where staff have the opportunity to meet, review material first-hand, and ask questions on a daily basis. So what works in a classroom situation? What is the most effective way to teach RDA in that setting?

For the immediate future, it is essential to include some fundamental instruction in AACR2 in basic cataloging syllabi. There are simply too many existing bibliographic records that were created using AACR2 cataloging, and it is highly unlikely anyone will have the time or money to convert them to RDA cataloging. It would be a disservice to students and the people they ultimately assist not to include AACR2 in our curriculum.
This is also true because a number of RDA instructions (particularly in the way access points are formatted) are based on AACR2 rules. As time continues and more and more RDA records are integrated into our databases and files, we will see a shift toward spending more curricular time on RDA than on AACR2; for the meantime, we need to teach both.

As has been mentioned previously, anyone working with RDA must spend time with FRBR and FRAD, and that means consulting the complete documents, not only overviews of them. This will form the foundation for understanding RDA's organization, the terminology and vocabulary used in RDA, and the relationships between the various groups of entities.

The Library of Congress has made a number of its excellent documents available. These range from PowerPoint presentations to training documents to documents that compare AACR2 cataloging with RDA cataloging (see the list of references). Giving students examples of the similarities and differences between AACR2 and RDA immediately (even before introducing them to RDA's instructions) provides an excellent visual introduction to what they can expect to see in RDA bibliographic records.

Once students have an idea of the content of an RDA bibliographic record, it is time to look at the RDA instructions themselves. Providing an overview of the organization of RDA is essential, and, of course, tying it in with FRBR terminology and principles emphasizes how the FRBR conceptual model forms a foundation for the RDA instructions. Comparisons between AACR2 and RDA can prove helpful here if students have been exposed to AACR2. With regard to the creation of a descriptive cataloging record, RDA's core record attributes for manifestations are a valuable way for students to understand the elements of a bibliographic record and their related rules.
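As a rough illustration of that approach, a skeleton along the following lines can be filled in for a resource in hand. This is a paraphrased sketch of some commonly cited core elements for a manifestation, not an authoritative rendering of RDA's core element list, and the record itself is invented:

Title proper: A sample handbook
Statement of responsibility: by Jane Doe
Edition statement: Second edition
Publication statement: Chicago : Example Press, 2010
Extent: 250 pages
Identifier for the manifestation (ISBN): 978-0-000-00000-0
Carrier type: volume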
In addition to becoming familiar with the content of RDA's instructions, students will also need to become familiar with the features of the RDA Toolkit. Giving students the opportunity to spend time experimenting with RDA Toolkit functionality, developing a good understanding of how the Toolkit is organized, and examining its various resources and features is essential. Those training for and teaching RDA will need to factor in time for students to become comfortable with the Toolkit and how it supports RDA content.

The RDA Toolkit functionality can also be an excellent instructional tool when learning the content of RDA instructions. As was noted in the students' comments, the various features of the Toolkit, such as "Quick Search" (where you can key in either rule numbers or phrases, e.g., "statement of responsibility"), help tremendously when attempting to find specific rules. The mappings, especially those from RDA to MARC and MARC to RDA (there are also mappings from Dublin Core to RDA and vice versa), are extremely valuable as another way of finding the appropriate RDA instruction. AACR2 is also included in the RDA Toolkit, and the rules there are hot-linked to the corresponding RDA instructions. The Toolkit's bookmarking system is yet another method that can help students track down a rule previously used; and the comments feature, allowing users to add their own observations on the application of instructions, assures users that their earlier work will not be lost.

The author previously noted that RDA is not nearly as linear as AACR2 in providing users with a step-by-step method of proceeding when creating a bibliographic record. The "Workflows" feature in RDA can help alleviate this difficulty. (It is here that RDA users can submit their own internal workflows—either for all to see and use or limited only to their own institution.) The "Workflows" section is already beginning to be populated, thanks initially to the Library of Congress and the University of Chicago. Students found the workflows invaluable when looking for step-by-step cataloging instructions. All in all, it is essential to give students time to understand the interactive nature of the RDA Toolkit and to use its features effectively.

JUMPING INTO THE DEEP END OF THE POOL

And after all the documentation is read and reviewed, and after all the PowerPoint presentations are given and Webinars are attended, the best way to learn RDA's instructions is to take a deep breath and simply begin to use them. An effective method for doing this is to have all students catalog the same resource—often one of the course texts works best. Having students identify the descriptive cataloging attributes using RDA's core record elements is a good starting place. In addition to having the students record the attributes themselves, having them identify the specific rules they used is another important factor in familiarizing them with the RDA instructions (and AACR2, for that matter).

There is no question that students will have growing pains when first matching wits with RDA, but they had similar growing pains when they first began to use AACR2. We are introducing them to a new vocabulary and new concepts (as was the case with AACR2). There is no way we can expect students to create perfect records the first time they apply RDA. Therefore, it is essential to build in time for students to learn and to make mistakes, for us to provide them with feedback, and for all of us to critically analyze RDA's instructions. The more experience students have using RDA (and anything new, for that matter), the easier and faster record creation becomes. Cataloging is a new experience for the majority of GSLIS students. They need time to learn the language and understand the rules—whether AACR2 or RDA or both.

CONCLUSION

The students who participated in Dominican University's GSLIS RDA Testing Seminar went into the class knowing they would need to be flexible and open-minded, and that additional instructions and modifications to those instructions would be coming out at the same time as they were creating records. They knew they were working with both instructional content and software that had not yet been tested. They knew changes would be made once the test results had been analyzed and that they were, for all intents and purposes, beta testing RDA—both content and software. There was never a debate about the importance of AACR2, which held us in good stead for more than 30 years, but RDA moves us into the digital age in ways AACR2 cannot. Perhaps most important in having students participate in the RDA testing is that these are the people who will be the catalogers of the future. To quote one of the students:

I prefer RDA to AACR2 and am all for adopting it as a new standard as I am rather heavily invested in the future of cataloging.
It is clear to me that the future depends on making some significant changes. RDA represents an important step forward... [and] is imperative for the growth of the cataloging profession, for the evolution of libraries as a whole, and for the fulfillment of the basic principles of library service.

We must pay attention to students' comments and observations regarding the RDA instructions, the effectiveness of the RDA Toolkit, and the impact of RDA on catalogers and users alike. Today's students are truly our pioneers, and they are moving us forward into a new cataloging frontier.

RECOMMENDED RESOURCES

(All URLs accurate as of July 26, 2011.)

About FRBR and RDA:
About RDA (OCLC), http://www.oclc.org/us/en/rda/about.htm
Le Boeuf, Patrick, ed. 2005. Functional Requirements for Bibliographic Records (FRBR): Hype or Cure-All? Binghamton, NY: Haworth Information Press.
Joint Steering Committee for Development of RDA, http://www.rda-jsc.org/rda.html
Maxwell, Robert L. 2008. FRBR: A Guide for the Perplexed. Chicago, IL: American Library Association.
Oliver, Chris. 2010. Introducing RDA: A Guide to the Basics. Chicago: ALA Editions.
RDA listserv, RDA-L, http://www.rda-jsc.org/rdadiscuss.html. To subscribe to the list, send an e-mail to LISTSERV@LISTSERV.LAC-BAC.GC.CA with "Subscribe RDA-L Firstname Lastname" in the body of the message.
RDA Toolkit, http://access.rdatoolkit.org/
Tillett, Barbara. 2004. What is FRBR?: A Conceptual Model for the Bibliographic Universe. Washington, D.C.: Library of Congress Cataloging Distribution Service, http://www.loc.gov/cds/downloads/FRBR.PDF
Tillett, Barbara. "RDA Changes from AACR2 for Texts," http://www.loc.gov/today/cyberlc/feature_wdesc.php?rec=4863
Taylor, Arlene G., ed. 2007. Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools. Westport, CT: Libraries Unlimited.
Zhang, Yin, and Athena Salaba. 2008. Implementing FRBR in Libraries: Key Issues and Future Directions. New York: Neal-Schuman.

RDA and MARC
MARC 21 Standards, http://www.loc.gov/marc/
RDA in MARC, http://www.loc.gov/marc/RDAinMARC29.html

The US RDA Test and RDA Examples
General Information on the US Test of RDA, http://www.loc.gov/bibliographic-future/rda/
Joint Steering Committee for Development of RDA: Working Documents: Complete Examples for RDA Toolkit 2010, http://www.rda-jsc.org/working2.html#rda-examples
Library of Congress Documentation for the RDA Test, http://www.loc.gov/catdir/cpso/RDAtest/rdatest.html
Library of Congress Choices for the RDA Test, http://www.loc.gov/catdir/cpso/RDAtest/rdachoices.html
Library of Congress Documentation: Examples for RDA Compared to AACR2, http://www.loc.gov/catdir/cpso/RDAtest/rdaexamples.html
OCLC Policy Statement on RDA Cataloging in WorldCat for the U.S.
testing period, http://www.oclc.org/us/en/rda/policy.htm
RDA Test Partners Handout, http://www.loc.gov/bibliographic-future/rda/RDA%20test%20partners%20handout.xls
RDA Test "Train the Trainer" (Kuhagen and Tillett—9 modules), http://www.loc.gov/bibliographic-future/rda/trainthetrainer.html and http://www.loc.gov/catdir/cpso/RDAtest/rdatraining.html
Resource Description and Access (RDA) Testing at the University of Chicago Library, http://www.lib.uchicago.edu/staffweb/depts/cat/rda.html (includes training materials, UC's timeline and testing decisions, and records created using RDA)
US RDA Test Record Collection Plan, http://www.loc.gov/catdir/cpso/RDAtest/admindoc2.doc
US RDA Test Policy for the Extra Set: Use of Existing Authority and Bibliographic Records (Common Copy Set), http://www.loc.gov/catdir/cpso/RDAtest/admindoc1.doc

Metadata Beyond RDA
Coyle, Karen. Coyle's InFormation, http://kcoyle.blogspot.com/
Hillmann, Diane. Metadata Matters, http://managemetadata.org/blog/
Open Metadata Registry: The RDA (Resource Description and Access) Vocabularies, http://metadataregistry.org/rdabrowse.htm
W3C Library Linked Data Incubator Group, http://www.w3.org/2005/Incubator/lld/

Recommendations for RDA's Future
Issued by the Library of Congress, National Agricultural Library, National Library of Medicine. Testing Resource Description and Access (RDA): Report and Recommendations. Washington, DC, June 13, 2011, http://www.loc.gov/bibliographic-future/rda/
Report and Recommendations of the U.S. RDA Test Coordinating Committee: Executive Summary. Washington, DC, June 13, 2011, http://www.nlm.nih.gov/tsd/cataloging/RDA_report_executive_summary.pdf
Response of the Library of Congress, the National Agricultural Library, and the National Library of Medicine to the RDA Test Coordinating Committee. Washington, DC, June 13, 2011, http://www.nlm.nih.gov/tsd/cataloging/RDA_Executives_statement.pdf
A Web site has been established that will be the central place for plans, news, and progress of the MARC Transition Initiative: http://www.loc.gov/marc/transition/

NOTE

1. The University of Illinois at Urbana-Champaign published the details of their site experiences in Robert Bothmann, ed., "Cataloging News," Cataloging & Classification Quarterly 49, no. 3 (2011): 242–256.

work_or3zpqp6dzcnfeysd4yx3rbyua ---- Meetings and Conferences | Semantic Scholar

Fung, Margaret C. "Meetings and Conferences." Oncology Research and Treatment 40 (2017): 310–312. DOI: 10.1159/000475538. Corpus ID: 37961711.
work_os6jdcktafhzlmgwbqpg7ijf6m ---- Microsoft Word - OTDCF_v22no1.doc

by Norm Medeiros
Coordinator for Bibliographic and Digital Services
Haverford College
Haverford, PA

On the Road Again: A Conversation with Jill Emery

(A published version of this article appears in the 22:1 (2006) issue of OCLC Systems & Services.)

"Nothing can substitute for the knowledge and experience of a good librarian." -- Harold Evans

ABSTRACT

This article features an interview with Jill Emery, Director of the Electronic Resources Program at the University of Houston. Ms. Emery discusses her career, the potential impact of the open access movement, and the nuances of licensing electronic resources.

KEYWORDS

Jill Emery; open access; licensing; electronic resource; e-resources

One of the thrills of writing this column is that I often get the privilege of interviewing people for whom I have great admiration. Jill Emery, Director of the Electronic Resources Program at the University of Houston, is an esteemed member of this group. Jill's thoughtfulness, leadership, and deep commitment to the profession are sterling qualities that contributed to her selection as one of Library Journal's 2004 "Movers & Shakers" ("Digital Girl," 2004). Her numerous publications and presentations, not to mention rigorous participation in the "Liblicense" listserv, make her name a familiar one, yet I hope the piece that follows gives readers a more meaningful perspective on an engaging librarian whose relatively brief ten-year professional career is marked by impressive accomplishments and even greater promise.

NM: You earned your undergraduate degree from Texas A&M, your MLIS from the University of Texas at Austin, and your professional positions have all been at academic institutions within the Lone Star State. I take it you're a native Texan.

JE: There's a refrigerator magnet you can buy in Texas that reads, "I wasn't born here, but got here as soon as I could." Actually, I'm an Air Force brat. I was born in Tucson, AZ, but for various reasons my parents decided to settle in Texas. I've lived in Texas since I was ten years old. Deciding to go to school in Texas was more of an economic advantage than anything else. While at Texas A&M, I started working at the library and became a state employee at the age of 19. At this point, I'm vested in the state employee system, which has certain advantages.

NM: What prompted you to become a librarian?

JE: Perhaps you've heard this someplace before: getting my library degree was an interim step between college and whatever came next. I'm still waiting to see what comes next. I started working in libraries at the age of nine when, as a member of the nerdy kids at my elementary school, I spent half of one class period "helping" out in the library. When you're indoctrinated that young, there is no escape. I believe some refer to this condition as fate.

NM: If you weren't a librarian, in what profession would you be engaged?
JE: Lately, I've been thinking that being a kept woman wouldn't be such a bad thing. If you know of any way to facilitate this goal, suggestions are appreciated.

NM: You're presently the Director of the Electronic Resources Program at the University of Houston. Although there's no doubt much diversity in your job, describe a typical workday.

JE: You obviously haven't read my chapter in Jump Start Your Career in Library & Information Science, edited by Priscilla Shontz, in which I affirm that there are no typical days anymore. That statement aside, email is a huge part of my day and the rest is meetings. Here's an hour-by-hour synopsis of what an observer would see today:

[Photo caption: Jill sporting her favorite leopard print fez]

8 AM: I arrive at work, log into my computer, glance at the phone for messages, sigh at the paperwork on my desk, and move some of it to a new stack arrangement for the day. You can tell my mood by how deep the piles on the desk are and in what geometric pattern.

9 AM: I work on something tangible, like figuring out why we're only getting a few months' access to journal X from publisher Y. I send three emails about this matter: one to the serials librarian so she can ask the subscription agent; one to the publisher to see why they think it's happening; and one to the subject librarian to explain that it may take a while to resolve.

10 AM: I attend the morning meeting of the User Services Assessment Task Force, which sounds interesting and like fun at this point.

11 AM: I come back to my office and check my email, looking for a reply on the journal problem.

11:30 AM: I go to lunch with two colleagues from technical services and come up with alternatives to celebrating Constitution Day. We decide on creating Day of the Dead coffins that feature the Founding Fathers, and a means by which students can interactively turn the Founders in their graves.

12:30 PM: I head back to email. I receive three or four messages from vendors/publishers trying to sell the library something or meet to sell the library something; announcements of price increases or platform changes; and solicitations to train staff on various products. I forward the messages to appropriate subject librarians.

1 PM: I review any of the contracts sitting on my desk and make marginal notes, strike-outs as needed, or decide they're fine as written. I type the Texas addendums, print them out, and fax them to the appropriate vendor contact. I sign necessary riders and fax these also. I review electronic requests submitted by subject librarians, add them to spreadsheets for review by the collection management committee, and track down any missing bits of information like pricing, etc.

2 PM: This is my phone hour. There's at least one conference call to be had a week, either for consortium business, vendor/publisher stuff, or professional organization matters.

3 PM: I face the paperwork on my desk and try to dispense with it. I bring in folders and am able to file some things; some things get mailed to other parts of the library; some get re-arranged yet again. I tackle professional committee work, such as writing reports or documents, reviewing appointments and calendars, and making sure information has been submitted properly.

4 PM: I finish my workday with outstanding professional presentation needs that almost always begin with a review of various blogs, both library-related and otherwise.
I peruse various music web sites and listen to someone inspirational such as Nick Cave, Joe Strummer, P J Harvey, Brit Pop, or Grime, depending on my mood. I find clip-art for the presentation -- a wise man once told me that all presentations must begin with clip art, and I have to say this technique has not failed me yet.

NM: You're a prolific author and presenter. To what do you attribute this scholarly productivity?

JE: Wow, prolific, really? I'm blushing. Actually, it is vanity. People ask me to write something or present something and I think, yeah, I can talk about that or I can write about that, and so I do. There was also a continuing appointment (tenure-esque) process I underwent years ago that helped me to stay focused in this manner. Part of my motivation is that I have opinions on things and am not afraid to share them. I'm sometimes embarrassed by what I say, but never, ever afraid to say it. It's a curse and a blessing really. When writing an article I always try to incorporate some sort of musical lyric into the title, heading, or text somewhere just to see if anyone really reads these things. I've been getting really obvious with this convention lately, though, and should go back to being a bit more obscure. I try to tie in what I'm writing to something in popular culture in some way, usually tying in a couple of things that I'm reading at the same time. It's all really a game for me to see what I can get away with and still come across as knowledgeable on a subject.

NM: I read and very much enjoyed your recent publication, "Is Our Best Good Enough? Educating End-Users About Licensing Terms" (Journal of Library Administration 42:3/4 (2005)). In that article, you argue that digital resources, the use of which is governed by contract law rather than copyright law, require libraries to take a more active approach to informing users of permissible and illegal uses of these materials. I agree with your position, but it's difficult to push important licensing terms to patrons, and even more difficult to entice patrons to read these terms. Do you think electronic resource management systems might offer help to libraries in this regard?

JE: Electronic resource management systems let librarians and libraries off the hook in this regard. We can say the information was there and posted to the user population, much like we do with those copyright signs over the photocopiers. Does that mean we've fulfilled our responsibilities in regard to educating users about copyright laws? That's a bit hedgy. Let's face it, copyright laws aren't something our user population really wants to be educated about; however, it's something all librarians feel guilty about. Electronic resource management systems help alleviate some of this guilt. They are a best effort at allowing us a single place to collectively gather administrative metadata. Truth be told, I'm ambivalent about electronic resource management systems. They serve a worthwhile purpose for a library internally, but I'm not sure they help the majority of end-users with much of anything in their search for information.

NM: But there are other systems, such as a library's acquisitions module, that don't serve the information-seeking needs of students, yet help libraries operate.

JE: We need to answer this question: what do we truly need to know about license agreements? For this, ERMs may be overkill. There's not a lot of payoff for users. What I need from an ERM is standardized metadata that can be migrated to other systems.
The effort currently underway by the DLF ERMI group to map licenses into standardized language will be really helpful for libraries.

NM: Will a day come when libraries and content providers are able to agree on standard license terms -- the so-called model license -- and if so, what conditions will cause this to happen?

JE: Do we really honestly want this? Do we really want one interface and one way of doing things? Where's the variety or fun in that? With a few really oddball situations that continue to exist, primarily with business resources that were never intended or designed for mass consumption, most licenses are pretty standard these days. They all seem to cover the things I need them to. Since I find myself signing more riders than actual licenses because of renewals or additions to content or the purchasing of back-files, standardized licensing is much less of an issue than it once was. In actual fact, libraries are the ones that have become non-standard. In one consortium that this institution belongs to, there are something like five different state addendums that have to be added to the consortia licenses. Some of the addendums are incredibly detailed and are basically licenses in themselves. In our attempts to secure our own licensing terms, we may become our own worst enemy. One thing I like to muse about is whether libraries will ask other libraries to sign licenses to use their institutional repositories. Now there will be some fun!

NM: A few libraries are pushing out to publishers an institutional license for electronic resources they want to purchase. Do you think this approach is strategic, or one that could undermine the license standardization efforts that seem to be bringing publishers and libraries closer together?

JE: It's a good idea, especially for companies you haven't worked with before. There are companies not accustomed to working with libraries in this way. I wouldn't do it every time you negotiate a license, especially with big companies. Some of the existing publisher licenses are great. You need to understand the benefits of distributing an institutional license to a vendor. It should be a selective process.

NM: Are the hopes of open access as a means of wresting power from publishers a realistic expectation for the library community?

JE: Information does not want to be free; information is a commodity. Not with the current models of open access, no. The current models are subscriptions that just aren't called subscriptions. Will there be a model that will change the entire scholarly communication landscape? Probably, but it isn't around yet and I cannot, for the life of me, fathom what it would be. Here are the tenets that anything new requires for ubiquitous adoption:

1. It is intuitive to use by all users (those publishing and those reading).
2. It doesn't require a huge shift in the way current organizational structures work, but rather minor adjustments (in this case, tenure structures).
3. There is some type of understanding of, and/or trust in, the entity that creates this new thing.

NM: You've just begun your term as Chair of the ALA/ALCTS Serials Section. What are some of the goals you hope to achieve during the Emery administration?

JE: Status quo. I am powerless beneath the bureaucracy of the American Library Association and do not pretend otherwise.

NM: During your off time, what do you enjoy doing?

JE: Right now, I'm reading a lot of Japanese mystery novels in translation.
I get into these paths of reading and keep at it until I've exhausted myself of either the subject or run out of good writers to read. I watch loads of television and movies at home and knit copious amounts of scarves as presents for people. There's an Art Space here in Houston where I volunteer to do whatever they need during their exhibit openings and performance art shows. The artists are always amused that there's this librarian in their midst who doesn't really do anything artistic other than serve as a barmaid when they need me to do so. I am always excited to see live music shows and try to go to at least two or three a month. Due to my travel schedule this summer this wasn't accomplished, but I'm getting back into it. My preference is for local/smaller bands and venues that most people wouldn't know about -- not to be a music snob or obscure or anything, but rather because I tend not to like huge crowds of people and being forty feet away from the band. I'm rather naturally curious as well, so I travel lots, even if it's driving to some small town just outside of Houston and wandering about to see what's there. For instance, I've been to most of the museums in Beaumont, Texas. When they start the rebuilding of the Gulf Coast, I'd like to pull together a volunteer group to work with Habitat for Humanity to go in and help with the reconstruction.

NM: What are the next professional challenges for you, and where do you see yourself 10 years from now?

JE: As they say down here in Houston, I'll just keep on keeping on.

REFERENCES

"Digital Girl" (2004). Library Journal, v. 129, no. 5, p. 49.

work_p4e57gbbbvgfle33j5u2fn2wtq ---- Microsoft Word - OTDCF_v23no1.doc

by Norm Medeiros
Associate Librarian of the College
Haverford College
Haverford, PA

The Evolutionary Research Process

{A published version of this paper appears in the 23:1 (2007) issue of OCLC Systems & Services.}

"By the time you know what to do, you're too old to do it." -- Ted Williams

ABSTRACT

This article describes the evolution of key technologies that have improved the research process. The article notes that these advancements have occurred outside of libraries, and that future development, as evidenced by Google's ever-expanding array of innovative tools, will likely continue to occur with little librarian involvement.

KEYWORDS

Technological advances in librarianship; access to electronic resources

The first article I wrote for OCLC Systems & Services appeared in the 15:1 (1999) issue, a viewpoint piece that described the effects of digitization on catalogers as I saw them. I recently returned to this essay, happening upon it during a spring cleaning of my file cabinet. Some of the points I made in the article included:

• The online catalog has assumed a diminished role in the world of researchers.
• E-journals have become the medium of choice for disseminating scholarly information.
• The supreme information gateway is the library's web site.
• A cooperative database of e-resource records cataloged by librarians would be "the ultimate search engine experience."

Assessing these statements in light of today's environment yields a mixed bag. Let's evaluate:

The online catalog has assumed a diminished role in the world of researchers.
The role of the online catalog has continued to diminish given the many other tools available to information seekers.
That said, the emergence of innovative search applications powered by Endeca and MediaLab has not only given promise to rejuvenating the catalog experience, it has spawned development of similar tools within the traditional library system marketplace.

E-journals have become the medium of choice for disseminating scholarly information.
E-journals are well entrenched as the format of choice for disseminating journal literature in most disciplines.

The supreme information gateway is the library's web site.
The library web site is not as convenient or simple to use as many of us would like to think. Yet libraries insist on developing ever more attractive portals to electronic resources, despite our users' disinterest. There are simply more direct means of accessing e-resources than through a library's web site.

A cooperative database of e-resource records cataloged by librarians would be "the ultimate search engine experience."
My referenced database of high-quality e-resource records was OCLC's Cooperative Online Resource Catalog (CORC), which, although deserving of praise, was nowhere near successful enough to be coined "the ultimate search engine experience." This distinction goes to Google, which was far from an empire in 1999. To this day Google continues to evolve in an impressive way, the U2 of the search engine world.

REFLECTIONS

In reflecting on the environment of the time and how it's progressed over the past eight years, I'm not surprised by anything that's happened. Google's ubiquity is perhaps the exception, though the larger mystery is why search competitors have not been able to imitate Google's success or even beat it to the punch once in a while. What I think is worthy of note, however, is how in 1999 libraries still controlled the gates to information. Even if by that time publisher web sites had become the more expedient means to scholarly articles, libraries, through web-based catalogs and A-Z e-journal lists, still controlled the roads to these resources. The gateways to information we build and maintain today, however, are less immediate and less attractive than the gateways commercial entities provide.

Since 1999, three initiatives have had a profound effect on the way users connect to electronic resources. Google, as noted above, is the most impressive of these. CrossRef and OpenURL link resolvers are two others that have largely removed from libraries the gatekeeper status. We still spend lots of time and money on creating and maintaining gateways -- they are a "just in case" tool -- but they are not the most immediate route to information. Although libraries ultimately control their users' access to commercial information -- libraries, after all, pay the invoices -- outside agents provide the vehicles that transport users to this information. The concept isn't foreign, nor should it be one that necessarily concerns us. Abstracting and indexing services have for decades provided the means through which users locate journal citations. Libraries have been comfortable with this contract, especially given the unrealistic alternatives. In the case of OpenURL applications, libraries have a great deal to do with the success of this transmission protocol. The application is costly to libraries, both in dollars and maintenance, and it requires customization in order to assure users are led down the most successful paths. The following example illustrates well the way OpenURL link resolvers and CrossRef have simplified information retrieval and made it more immediate (a sketch of the underlying request follows the comparison):

Circa 1999
1. A user in need of journal articles accesses an online abstracting and indexing service.
2. She locates a useful citation and searches for the title in her library's online catalog.
3. She finds the bibliographic record for the journal and connects to it via a hotlink in the record.
4. In the bibliography of the journal article she finds what looks like an even better article.
5. She reconnects to her library's online catalog and searches for the journal.
6. She finds the record for the journal and connects to it via a hotlink in the record.

Circa 2007
1. A user in need of journal articles accesses an online abstracting and indexing service.
2. She connects directly to the article through an OpenURL embedded in the citation.
3. In the bibliography of the journal article she finds what looks like an even better article.
4. She clicks on the citation and retrieves the full text via CrossRef.
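For readers curious about the plumbing, here is a minimal sketch (Python) of how the 2007 user's step 2 works. The resolver address and the citation values are invented for illustration; real installations differ. The source database packs the citation's metadata into an OpenURL 1.0 (Z39.88-2004) query string addressed to the library's link resolver, which then decides where the user should land:

from urllib.parse import urlencode

# Hypothetical link resolver base URL; each library runs its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def build_openurl(citation):
    """Pack basic journal-article metadata into an OpenURL query string."""
    params = {
        "url_ver": "Z39.88-2004",                       # OpenURL standard version
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # journal-article metadata format
        "rft.jtitle": citation["jtitle"],               # journal title
        "rft.atitle": citation["atitle"],               # article title
        "rft.volume": citation["volume"],
        "rft.issue": citation["issue"],
        "rft.spage": citation["spage"],                 # starting page
        "rft.date": citation["date"],
        "rft.issn": citation["issn"],
    }
    return RESOLVER_BASE + "?" + urlencode(params)

# An invented citation, for illustration only:
print(build_openurl({
    "jtitle": "Journal of Examples",
    "atitle": "A sample article",
    "volume": "12",
    "issue": "3",
    "spage": "45",
    "date": "2006",
    "issn": "1234-5678",
}))

CrossRef handles step 4 differently but toward the same end: publishers register each article's DOI and metadata centrally, so a reference list can carry stable DOI links that resolve to the article wherever it currently lives.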
Both examples presume the library subscribes to the sought-after e-journals. If that wasn't the case, the 1999 user would need to continue searching her library's catalog for journals. The 2007 user, however, would be informed through her library's OpenURL link resolver whether the title was available, and if not, how to retrieve it. The example above also presumes the 2007 user didn't go first to Google, a stretch indeed.

LOSING OUR GRIP?

The future of the online catalog is a hot topic. Much has been written about the catalog's diminished use among undergraduates. (1) In response to this threat and the general perception that academic libraries are losing their place in the research lives of their users, collaborations are underway to rejuvenate the catalog. Products such as Endeca's search application, pioneered at North Carolina State University, help the clunky catalog provide successful experiences. Library management system vendors are also getting into the act. The promise of this technology is great, but so too are the costs. Yet the cost of obsolescence is greater.

1. See Karen Calhoun's "The Changing Nature of the Catalog and its Integration with Other Discovery Tools" and the University of California's "Rethinking How We Provide Bibliographic Services for the University of California" as examples of this literary trend.

work_p5nw3uel7fdfvcasyuq2oib2pa ---- U-M Weblogin

work_p7acfvqzmjfrzgapb32yx7sja4 ---- R. Inter. Interdisc. INTERthesis, Florianópolis, v.12, n.1, p.vi-xiv, Jan-Jun. 2015
Revista Internacional Interdisciplinar INTERthesis – v.12, n.1, 2015

EDITORS
Selvino José Assmann, PPGICH, UFSC, Florianópolis, SC, Brasil
Silmara Cimbalista, PPGICH, UFSC, Florianópolis, SC, Brasil
Javier Ignacio Vernal, PPGICH, UFSC, Florianópolis, SC, Brasil

REVIEWS AND TRANSLATIONS EDITOR
Selvino José Assmann, PPGICH, UFSC, Florianópolis, SC, Brasil

ASSISTANT EDITORS
Area: The Human Condition in Modernity
José Eliezer Mikosz, EMBAP/PR, Curitiba, PR, Brasil
João Lupi, PPGICH, UFSC, Florianópolis, SC, Brasil
Area: Society and Environment
Javier Ignacio Vernal, PPGICH, UFSC, Florianópolis, SC, Brasil
Area: Gender Studies
Luciana Rosar Fornazari Klanovicz, Unicentro, Londrina, Brasil
Teresa Kleba, PPGICH, UFSC, Florianópolis, SC, Brasil

PROOFREADING – TRANSLATION
Spanish: Leandro Marcelo Cisneros, PPGICH, UFSC, Florianópolis, SC, Brasil
English: Javier Ignacio Vernal, PPGICH, UFSC, Florianópolis, SC, Brasil

SCHOLARSHIP STUDENT
Victória Moraes, PPGICH, UFSC, Florianópolis, SC, Brasil

AUTHORS
Adriano Correia, correiaadriano@yahoo.com.br
Ahmad Saeed Khan, saeed@ufc.br
Alessandro Pinzani, Alessandro@cfh.ufsc.br
Alexandre Franco de Sá, alexandre_sa@sapo.pt
Camilo Henrique Silva, camilo.henrique@ufms.br
Cláudia Valéria Fonseca da Costa Santamarina, claufcost@gmail.com
Cristiana Carneiro, cristianacarneiro13@gmail.com
Daniela Rosendo, daniela.rosendo84@gmail.com
Fabiano Garcia, f.garcia7@hotmail.com
Fernanda Luiza Fontoura de Medeiros, flfmedeiros@gmail.com
Fernando Michelotti, fmichelotti@ufpa.br
Gabriela Cristina Braga Navarro, gabrielabnavarro@gmail.com
Henrique Luiz Caproni Neto, henriquecap_adm@yahoo.com.br
Hildete Pereira dos Anjos, dosanjoshildete@gmail.com
Jactania Marques Muller, jac-muller@hotmail.com
Josaida de Oliveira Gondar, jogondar@uol.com.br
Leila Maria Amaral Ribeiro, leirib@gmail.com
Leonardo Andrade Rocha, leonardoandrocha@yahoo.com.br
Letícia Albuquerque, leticia.albuquerque@ufsc.br
Lucas Moraes Martins, lucasmoraesmartins@hotmail.com
Marcos Aurélio da Silva, maurelio@cfh.ufsc.br
Maria Alice da Silva, mariaalicesilv@gmail.com
Maria Inácia D'Ávila Neto, inadavila@gmail.com
Maria Isabel Araújo Rodrigues, isabel.rodrigues@fjp.mg.gov.br
Mariany Freitas de Oliveira, marianyfoliveira@hotmail.com
Marina Rocha de Sousa, marina_rochadesousa@ymail.com
Marlene Almeida de Ataíde, maataide@yahoo.com.br
Paola Maia Lo Sardo, pmaialosardo@gmail.com
Patricia de Sá Freire, patriciadesafreire@gmail.com
Patrícia Verônica Pinheiro Sales Lima, pvpslima@gmail.com
Pedro Augusto Boal Costa Gomes, pedroaugustoboal@gmail.com
Rafael Speck de Souza, rafaelspk@gmail.com
Rita Ippolito, rita.ippolito@gmail.com
Rodolfo Antônio de Figueiredo, raf@cca.ufscar.br
Rosânia Rodrigues de Sousa, rosania.sousa@fjp.mg.gov.br
Simone Cristina Dufloth, sduf@uol.com.br
Tânia Aparecida Kuhnen, taniakuhnen@hotmail.com
Vanessa Cunha Prado D'Afonseca, vanessadafonseca@hotmail.com
Wesley Felipe de Oliveira, wesley.filosofia@hotmail.com
REVIEWERS
Agripa Faria Alexandre, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
Beatriz Brandão Polivanov, Universidade Federal Fluminense, Niterói, RJ, Brazil
Cristiano Luis Lenzi, Universidade de São Paulo, SP, Brazil
Delâine Cavalcanti Santana de Melo, Universidade Federal de Pernambuco, Recife, PE, Brazil
Francisco da Cunha Silva, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil
Izabel Guimarães Marri, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
Jaqueline Russczyk, Instituto Federal de Educação, Ciência e Tecnologia de Santa Catarina, Chapecó, SC, Brazil
Jason de Lima e Silva, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil
Javier Ignácio Vernal, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil
Julia Rodrigues Leite, Universidade de São Paulo, São Paulo, SP, Brazil
Katarini Giroldo Miguel, Universidade Federal de Mato Grosso do Sul, Campo Grande, MS, Brazil
Leonardo de Mello Ribeiro, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
Luís Antônio Cunha Ribeiro, Universidade Federal Fluminense, Niterói, RJ, Brazil
Marcos Gerhardt, Universidade de Passo Fundo, Passo Fundo, RS, Brazil
Maria do Socorro Rayol Amoras Sanches, Universidade Federal do Pará, Belém, PA, Brazil
Mariana Aranha Moreira José, Universidade de Taubaté, Taubaté, SP, Brazil
Mônica Schiavinatto, Universidade Estadual Paulista Júlio de Mesquita Filho, São Paulo, SP, Brazil
Newton Narciso Gomes Jr, Universidade de Brasília, Brasília, DF, Brazil
Paulo Sérgio Moreira da Silva, Centro Universitário de Patos de Minas, Patos de Minas, MG, Brazil
Raquel Gianolla Miranda, Centro Universitário Herminio Ometto de Araras, Araras, SP, Brazil
Ricardo Gilson da Costa Silva, Universidade Federal de Rondônia, Porto Velho, RO, Brazil
Sandro José da Silva, Universidade Federal do Espírito Santo, Vitória, ES, Brazil
Selvino Assmann, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil
Silmara Cimbalista, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil
Vinícius Nicastro Honesko, Universidade Federal do Paraná, Curitiba, PR, Brazil
PEER REVIEW AT INTERthesis
INTERthesis publishes articles and book reviews. Articles submitted to the Editors are sent for assessment to two referees who serve on the Editorial Board as ad hoc consultants. Reviews submitted to the Editors are sent for assessment to two referees. In this process, manuscripts are first read by the Editors, who, if they judge them consistent with the journal's general editorial line, forward them to the referees. Manuscripts submitted to INTERthesis must follow the INSTRUCTIONS FOR AUTHORS, in order to preserve the academic character of the publication, its standardization, and its recognition among peers. Since the journal is limited to two issues a year, one within the first and one within the second semester, every effort is made to edit manuscripts received by the end of December for publication in the first issue of the following year, and manuscripts received by the end of June for publication in the second issue of the year; to this end, the date of receipt is included in each article. Although this is the intention, manuscripts that require revision by their authors may alter this goal. The need for rigorous editorial care, which implies a careful review process, follows from the fact that scholarly electronic journals must meet the same quality criteria customarily adopted for journals published in traditional formats and media.
INDEXING
INTERNATIONAL
DRJI - Directory Of Research Journal Indexing - http://www.worldcat.org/title/revista-internacional-interdisciplinar-interthesis/oclc/325327527&referer=brief_results
RCAAP - http://diretorio.ibict.br/xmlui/handle/1/601
Dialnet - http://dialnet.unirioja.es/servlet/revista?codigo=15803
Directory of Open Access Journals (DOAJ) - http://www.doaj.org/doaj?func=issues&jId=44439&uiLanguage=en
Google Scholar - http://scholar.google.com.br/scholar?hl=pt-BR&lr=lang_pt&q=Interthesis&btnG=Pesquisar&lr=lang_pt
LATINDEX - http://www.latindex.unam.mx/buscador/ficRev.html?opcion=1&folio=17140
Oaister - http://oaister.worldcat.org/search?qt=wc_org_oaister&q=interthesis&scope=0&oldscope=&wcsbtn2w=Buscar&dblist=239
Ulrich's - http://www.ulrichsweb.com/ulrichsweb/Search/fullCitation.asp?navPage=1&tab=1serial_uid=599337&issn=1807138
Scirus - http://scirus.com/srsapp/
vLex - http://vlex.com/
EBSCOhost - "Revista Internacional Interdisciplinar INTERthesis has entered into an electronic licensing relationship with EBSCO, the world's most prolific aggregator of full text journals, magazines and other sources."
NATIONAL
LivRe, the portal for open-access journals on the Internet - http://livre.cnen.gov.br/
CAPES Portal of Journals - http://www.periodicos.capes.gov.br/portugues/index.jsp
SEER (Electronic Journal Publishing System) site - http://seer.ibict.br
Sumários.org - http://sumarios.org/revista.asp?id_revista=397&idarea=9
QUALIS-CAPES RATINGS (August 2013)
B2: History; Interdisciplinary
B3: Administration, Accounting and Tourism; Arts/Music; Political Science and International Relations; Environmental Sciences; Economics; Education; Geography; Psychology; Social Work; Sociology
B4: Law; Nursing; Public Health
B5: Anthropology/Archaeology; Agricultural Sciences I; Philosophy/Theology (Philosophy subcommittee); Letters/Linguistics; Medicine II
Available at: http://qualis.capes.gov.br/webqualis/
WORK TEAM
Javier Ignacio Vernal, PPGICH, UFSC, Florianópolis, SC, Brazil
José Eliezer Mikosz, EMBAP, Curitiba, PR, Brazil
Leandro Marcelo Cisneros, PPGICH, UFSC, Florianópolis, SC, Brazil
Selvino José Assmann, PPGICH, UFSC, Florianópolis, SC, Brazil
Silmara Cimbalista, PPGICH, UFSC, Florianópolis, SC, Brazil
Victoria Moraes, PPGICH, UFSC, Florianópolis, SC, Brazil
work_p7skt5qo4bb7rf4vrjzcwhw4zy ---- Journal of Information Management (정보관리연구), vol. 42, no. 4, 2011, pp. 115-136. http://dx.doi.org/10.1633/JIM.2011.42.4.115
A Study on the Policy on Digital Contents Archiving in the Field of Science and Technology (과학기술분야 디지털 콘텐츠의 아카이빙 정책 연구)*
Seung-Jin Kwak (곽승진)** · Jae-Hwang Choi (최재황)*** · Kyung-Jae Bae (배경재)**** · Young-Mi Jung (정영미)*****
Contents: 1. Introduction; 2. A Framework for Digital Archiving Policy; 3. Analysis of KISTI's Digital Archiving Environment; 4. KISTI's Digital Archiving Strategy; 5. Conclusions and Suggestions; References
ABSTRACT
Cases of archiving digital content are increasing throughout the world. Because of the characteristics of digital content, including volatility, authenticity problems, fragility in preservation, and the aging of playback tools, digital content requires special management for long-term preservation. Successful digital archiving becomes possible through staged execution, carried out in cooperation among the institutions concerned and based on a systematically established, written policy. This is a basic study toward a digital content archiving policy for KISTI (Korea Institute of Science and Technology Information), the representative institution for the distribution of science and technology information. By surveying and analyzing overseas digital archiving policy cases, it identifies a framework for digital archiving policy, and, through interviews with expert groups, it proposes measures for the macroscopic items included in an archiving policy.
KEYWORDS
Digital Content, Archiving, Preservation, Curation, Policy, Science & Technology Information, KISTI
* This paper revises and supplements part of a study supported by KISTI in 2010.
** Associate Professor, Department of Library and Information Science, Chungnam National University (sjkwak@cnu.ac.kr) (first author)
*** Associate Professor, Department of Library and Information Science, Kyungpook National University (choi@knu.ac.kr) (co-author)
**** Full-time Lecturer, Department of Library and Information Science, Dongduk Women's University (kjbae@dongduk.ac.kr) (co-author)
***** Assistant Professor, Department of Library and Information Science, Dong-eui University (yomjung@deu.ac.kr) (corresponding author)
Received: August 19, 2011 · Revised: October 4, 2011 · Accepted: October 14, 2011

1. Introduction
Unlike paper records, digital content is volatile: easily lost, and easily altered or damaged. More serious still, the lifespans of the media that carry the information, of the hardware and operating systems that read those media, and of the preservation systems encompassing them all cannot be guaranteed even into the near future.
Preserving digital content therefore requires efforts to overcome these life-cycle limits, long-term preservation measures that maintain the integrity of the content, and measures that keep it readable and accessible as information technology evolves. To obtain the greatest effect from limited staff and budgets, methods for prioritizing the content to be archived and criteria for assessing the need to archive each content type must also be established.
For the national institutions that collect, produce, serve, and preserve digital content, the National Library of Korea, the National Archives of Korea, and the Korea Institute of Science and Technology Information (KISTI), it is therefore crucial to establish and execute a concrete, systematic archiving policy through cooperation.
Countless digital contents are created, used, and lost around us today. Unlike printed materials, digital content passes from creation to extinction very quickly. According to the Library of Congress, the average lifespan of an Internet site in 2002 was 44 days. About 65% of the web documents cited in a single chapter of a monograph disappear or change URL within a year, and 50% of the web documents cited in a journal article move from their original location before the article is even published (Charlesworth 2003). It is therefore time to put in place the varied policies and institutional arrangements needed for the long-term preservation and archiving of digital content.
Countries around the world have worked for years on the collection and permanent preservation of digital content. As a result, national libraries and memory institutions run digital archiving programs such as NDIIPP (National Digital Information Infrastructure and Preservation Program) in the United States; DCC (Digital Curation Centre) and JISC (Joint Information Systems Committee) in the United Kingdom; PANDORA (Preserving and Accessing Networked Documentary Resources of Australia), led by the National Library of Australia; and NII-REO (NII Repository of Electronic Journals and Online Publications) in Japan. Archiving projects are also active beyond institutional and national boundaries: Portico, established by JSTOR (Journal Storage), a consortium of universities and research institutions that recognized the need for knowledge use and information collaboration across borders; DPE (Digital Preservation Europe), joined by European university libraries, archives, and scientific, technical, and cultural institutions; the internationally developed e-Depot; and Stanford University's LOCKSS (Lots of Copies Keep Stuff Safe).
Korea, too, amended its Libraries Act in 2009 to establish a deposit system for digital materials, under which the National Library of Korea selects, collects, and preserves the online materials of high preservation value among those served in Korea. In particular, OASIS (Online Archiving & Searching Internet Sources), operated by the National Library of Korea, is a project to collect and preserve the present generation's digital intellectual and cultural heritage for future digital generations; it was developed to collect, preserve, and serve valuable digital resources effectively (OASIS homepage 2011).
Its current collection priorities are domestic websites on topics of social, political, cultural, religious, or economic importance related to Korea, written by Korean authors and cleared for use under copyright; online digital resources produced by the central government; university publications; and online digital resources on current issues. A system that supports the permanent collection of, and access to, the research papers, research reports, and scientific data needed by researchers in science and technology, a key factor in national competitiveness, is still lacking, however.
KISTI is Korea's representative institution for the management, production, and distribution of science and technology knowledge and information, with the mission of "establishing an R&D and service system for the science and technology knowledge-information infrastructure." Its assigned tasks include the comprehensive collection and analysis of knowledge and information on science and technology and on national R&D programs at home and abroad; the construction, linking, and shared use of the related databases; the construction of an integrated management system; standardization for shared use; and support for comprehensive plans to promote management and distribution.
In a digital environment in which the production and use of digital information resources are growing rapidly and long-term preservation is gaining importance, it is thus vital above all that KISTI formulate and execute a comprehensive plan for digital archiving, so that the knowledge and information of science and technology and of national R&D programs retain lasting value.
Over the long term, successful digital archiving will become possible through staged execution, based on a systematically established, written policy and carried out in cooperation with the institutions concerned. The purpose of this study is basic research toward a digital content archiving policy for KISTI, the institution that produces, collects, serves, and distributes Korea's science and technology information. The digital content discussed in this paper is the family of products produced, distributed, and consumed by digital means: research papers, research reports, scientific data, patents, websites, e-books, e-journals, image files, MP3 files, video files, and other materials produced in digital form.
To formulate KISTI's digital content archiving policy, we approached it from the macroscopic standpoint of policy through expert-group interviews. We first surveyed and analyzed overseas digital archiving policy cases to identify a framework for digital archiving policy, and then, through the expert-group interviews, proposed measures for the macroscopic items such a policy should contain.

2. A Framework for Digital Archiving Policy
A digital archiving policy should contain macroscopic provisions, such as the mission and purpose, the roles of the institutions involved, and the scope of the material to be archived, together with detailed provisions on the procedures, strategies, and methods needed in actual execution. Before formulating a policy, we reviewed the theory of digital archiving policy frameworks and surveyed cases to see what such a policy should contain and how it should be structured. As representative cases we selected JISC and the National Library of Australia (NLA), which run state-led digital archiving projects; we added the institution-led OCLC (Online Computer Library Center) and Columbia University Libraries, and included a digital curation case.

2.1 JISC's model for digital long-term preservation
The JISC report (2008) draws on existing digital archiving policies to provide the procedures and framework an institution needs when formulating and implementing a policy.
Through case studies of the DCC, TNA (The UK National Archives), the Digital Preservation Coalition, and others, the report identifies the items a policy should contain, approaching the policy model in two stages: policy formulation and implementation.
For the formulation stage, the macro-level items a policy should contain are those shown in Table 1: the mission and purpose of the policy, its scope and supporting policies, the importance of and responsibility for preservation work, related policies, the documents that provide additional guidelines and procedures, definitions of the terms used in the policy, and the policy's version.

Table 1. Macro-level provisions of a digital archiving policy
1. Principle Statement: states the policy's mission and purpose, describing how digital preservation supports the institution's needs and what benefits it brings.
2. Contextual Links: illuminates how the policy fits into the organization and how it relates to other high-level strategies and policies.
3. Preservation Objectives: the objectives of preservation and how they will be supported.
4. Identification of Content: the overall scope of the policy in terms of content, explained in relation to collection development aims.
5. Procedural Accountability: identifies the high-level roles for the policy and states the key responsibility for preserving the institution's core resources.
6. Guidance and Implementation: how the preservation policy will be implemented, and where staff can find additional guidelines and procedures in separate documentation; the detailed implementation-stage provisions are subsumed under this item.
7. Glossary: precise definitions of the terms used in the policy, where needed.
8. Version Control: version information for the policy.

A digital archiving policy needs to state clearly not only the macro-level items but also the micro-level items that are all the more necessary at the implementation stage. Drawing on a variety of policy cases, the JISC report (2008) presents the implementation-stage micro provisions shown in Table 2.

Table 2. Micro-level provisions of a digital archiving policy
1. Financial and Staff Responsibility: names those responsible for digital archiving within the institution, and states the financial sustainability of maintaining the policy within the institution's budget plan.
2. Intellectual Property: shows how the institution's plan acknowledges or addresses copyright issues, and its level of awareness of them.
3. Distributed Services: in some cases, distributing some or all preservation activities is more convenient and cost-effective; defines which services will be provided.
4. Standards Compliance: defines which standards and tools will be adopted.
5. Review and Certification: how often the policy's implementation will be reviewed: semiannually, annually, biennially, and so on.
6. Auditing and Risk Assessment: procedures for carrying out standardized audits and for recognizing the risks the policy faces.
7. Stakeholders: all the parties covered by the policy and its implementation procedures.
8. Preservation Strategies: guidelines for the technical implementation of the policy and for the preservation strategies adopted.
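Purely as an illustration of how the two-stage JISC structure could be made operational, the macro-level provisions of Table 1 lend themselves to a machine-readable policy skeleton. The sketch below is an assumption throughout: the JISC report prescribes no schema, and every field name and sample value here is invented for illustration.

```python
# Illustrative only: a machine-readable skeleton for the macro-level
# provisions of Table 1. The JISC report prescribes no such schema;
# all field names and sample values below are assumptions.
from dataclasses import dataclass, field

@dataclass
class ArchivingPolicy:
    principle_statement: str                                  # Table 1, row 1
    contextual_links: list = field(default_factory=list)      # row 2
    preservation_objectives: str = ""                         # row 3
    content_scope: str = ""                                   # row 4
    accountability: dict = field(default_factory=dict)        # row 5: role -> duty
    guidance_documents: list = field(default_factory=list)    # row 6
    glossary: dict = field(default_factory=dict)              # row 7
    version: str = "1.0"                                      # row 8

policy = ArchivingPolicy(
    principle_statement="Preserve the institution's digital collections.",
    contextual_links=["Collection development policy"],
    preservation_objectives="Maintain integrity and long-term access.",
    content_scope="Born-digital and digitized science and technology content",
    accountability={"Head of Digital Archiving": "maintains this policy"},
    guidance_documents=["Format migration guideline"],
)
print(policy.version)
```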
2.2 OCLC's digital long-term preservation policy
OCLC has paid sustained attention to the long-term preservation of digital resources since 2001. The OCLC Digital Archive was established in 2002, and OCLC's Digital Archive Preservation Policy was prepared in 2004. The policy was built on the OAIS reference model and on the RLG/OCLC report Trusted Digital Repositories: Attributes and Responsibilities. As Table 3 shows, it consists broadly of the policy itself and the documents that support it. OCLC's policy contains a good deal of technical and practical material: the purpose and scope of the policy, service levels and preservation activities, preservation strategy, data format risk assessment, the content access environment, a preservation action plan, and a provision on organizational continuity. The more technical and detailed material supporting the policy is provided in separate documents.

Table 3. Structure of OCLC's long-term preservation policy
OCLC Digital Archive Preservation Policy
∙ Purpose: the purpose of the policy, an overview of it, and the scope of its content.
∙ Service Levels and Preservation Activities (Bit Preservation; Local Preservation; Full Preservation): the service levels and preservation activities for content objects currently deposited in the repository; in the future, OCLC plans to provide full preservation for some content objects in the repository.
∙ Preservation Strategy: a strategy for maintaining the integrity of bitstream objects, and a strategy for guaranteeing long-term access to digital objects.
∙ Data Format Risk Assessment: risk assessment of data formats as the technical environment changes.
∙ Content Access Environment (current access environment; open-source access environment): the set of technical applications, operating systems, and hardware needed to display a content object; two types of access environment are specified.
∙ Preservation Action Plan: tied to the selected file formats.
∙ Succession Plan: the long-term continuity of the organization.
Supporting Documentation
∙ Data integrity and continuity: strategies for guaranteeing the integrity of and access to data: data management, backup policy, storage facilities and security, disaster prevention and recovery, and policy updates.
∙ Risk assessment: presents criteria usable in risk-assessment procedures: organizational risk, preservation-activity risk, and file format, hardware, and software risk.
∙ A glossary: definitions and explanations of the terms used in the policy.
∙ Policy and supporting documentation update procedure and notification process: update information for the policy and its supporting documents.

2.3 The National Library of Australia's digital preservation policy
The first edition of the National Library of Australia's (NLA) digital preservation policy was written in 2002 and has gone through two revisions; the content covered here is from the 2008 revision. The NLA policy not only outlines the core areas a digital preservation policy should contain, but also includes the more detailed procedures and methods needed at the implementation stage. Table 4 shows its structure.

Table 4. Structure of the NLA's long-term preservation policy
∙ Purpose: the purpose of the policy, an overview of it, and the institution's relationship to other policies.
∙ The objectives of the Library's digital preservation activities: a definition of digital preservation, and the objectives of the Library's long-term digital preservation activities.
∙ The nature of the Library's digital collections: the content and scope of the digital collections to be preserved.
∙ The challenges of keeping digital information resources accessible: the problems and challenges faced in providing continuing access to digital resources.
∙ Broad directions for preserving the Library's digital collections:
  Scope: the aim is to preserve the accessibility and usability of all the Library's digital resources, but since this is not realistic, priorities are considered in long-term preservation.
  Preserving accessibility: methods for preserving accessibility.
  Models: states the OAIS model on which the policy is based.
∙ Implementation principles (costs and resources; methods; coordination and planning of the Library's preservation efforts): implementation principles are needed to provide usable, cost-effective methods, and may vary with the scale of the Library's collections, technological change, the life-cycle infrastructure, and so on.
∙ Research and standards development: managing digital resources effectively requires continuing research and standards development.
∙ Working with others to preserve the nation's digital information resources: cooperation with other institutions to preserve the nation's digital information resources.
∙ Working with others to foster digital preservation: cooperation with national and international bodies on long-term preservation strategies, practices, and supporting technologies.
∙ Contact: contact information for the person responsible for the policy.

2.4 Columbia University Libraries' long-term preservation policy
Columbia University Libraries' (CUL) policy for the preservation of digital resources is more practical and concrete. It contains items on life-cycle management, and it also states how resources are managed and where responsibility for long-term preservation lies. It recommends that the following be considered in developing long-term preservation strategies:
∙ integration with the storage, backup, and preservation of non-digital library resources;
∙ the use and development of decision tools (for example, risk analysis, monitoring methods, loss-probability calculation, and cost models);
∙ maintenance strategies (backup: online and/or offline monitoring; conversion; duplication through mirror sites or caches);
∙ preservation strategies (migration, emulation, and so on);
∙ integration with external consulting and storage services.
The structure and content of the CUL policy are shown in Table 5.

Table 5. Structure of Columbia University Libraries' long-term preservation policy
∙ Statement of CUL policy for preservation of digital resources:
  Policy: the purpose of the long-term preservation policy and an overview of its content.
  Scope of preservation responsibility: the mandatory preservation level for each type of digital resource.
  Frequency with which the preservation/retention policy for digital material will be updated: how often the policy is reviewed and updated.
∙ Statement of CUL's commitment to lifecycle management:
  Development of preservation strategies.
  Selection: selection criteria for long-term retention when digitizing, acquiring, or signing license agreements.
  Conversion: conversion guidelines.
  Metadata creation and management.
  Storage: recommendations on online, offline, and duplicate storage.
  Access arrangements: copyright, permission requirements, and so on.
∙ Statement of CUL's resource management policies and plans: strategic planning (technical infrastructure, budget planning, staffing, and copyright management).
∙ Statement related to regional, national, consortial, and international responsibilities: CUL's regional, national, consortial, and international roles in long-term preservation.

2.5 A framework for life-cycle-based curation policy
To support the formulation of curation policy for digital information resources, Ahn and Park (안영희, 박옥화 2010) proposed a curation policy framework organized by the life-cycle stages of digital information resources, drawing on the digital curation policies and support services of the UK Digital Curation Centre and eight research funding bodies. The framework places 14 subcategories under four categories: principles and guidelines for digital curation, policy formulation, life-cycle-based curation policy, and supporting provisions. Figure 1 shows the detailed structure of the framework proposed in that paper.

[Figure 1. Structure of the digital curation policy framework. Principles and guidelines: purpose; roles and responsibilities. Policy formulation: scope; legal rights and ethical issues; recommendations. Life-cycle-based policy: management planning; collection and development; appraisal and selection; storage and management; access, sharing, and reuse; transformation. Supporting provisions: education; promotion and guides; monitoring.]

The principles-and-guidelines category of the framework states the policy's purpose and its roles and responsibilities; the policy-formulation category contains the subcategories of scope, legal rights and ethical issues, and recommendations. The more operational life-cycle-based category, covering management planning, collection and development, appraisal and selection, storage and management, access, sharing and reuse, and transformation, structures the framework so that policy can be formulated along the digital content life cycle and its processes; the remaining supporting-provisions category covers education, promotion and guides, and monitoring.

2.6 Developing a digital content archiving policy framework
Across the policy cases examined above, the framework of a digital content archiving policy divides mainly into two categories: provisions from a macroscopic viewpoint and provisions from a microscopic viewpoint.
The macroscopic provisions include the purpose and objectives of the policy, the scope of digital archiving, the cooperating institutions with their roles and responsibilities, preservation strategy, and quality control; a glossary and the policy's version information can be included under miscellany.
The microscopic provisions concern the policy's executional side, organized along the processing stages of the entire digital content life cycle, from content collection through long-term preservation strategy. In Columbia University Libraries' policy and in Ahn and Park's digital curation framework in particular, the policy's subcategories are grounded in the digital content life cycle.
Analyzing the long-term preservation policy cases examined in this study yields the digital content archiving policy framework shown in Figure 2.

[Figure 2. Framework of a digital content archiving policy.]

Among the macroscopic provisions, preservation strategy and quality control require concrete implementation methods; a more stable policy results if the policy itself gives only an overview of them and the substantive micro-level strategies and methods are supported in separate documents.
Earlier studies have already defined a life cycle suited to KISTI's digital content (곽승진, 성원경, 배경재 2011) and presented concrete material on preservation strategy and quality control (정영미, 윤화묵, 김정택 2010). It is desirable to manage these as technical documents that support the policy, independently of it, and to provide only overviews in the policy itself. Accordingly, to prepare a draft policy from the macroscopic viewpoint, we conducted expert interviews and analyzed the environment for policy formulation and execution, as described next.

3. Analysis of KISTI's Digital Archiving Environment
To prepare a basic draft of a digital archiving policy suited to KISTI, we interviewed a group of six experts and held a focus-group interview with the eight members of KISTI's digital archiving task force. Through these interviews we identified the purpose, objectives, and role of KISTI's digital archiving and examined the internal and external environments surrounding it. For the analysis of KISTI's competitiveness and strategy in relation to policy execution, we used the SWOT technique.
The expert interviews for policy formulation covered six domestic experts in library and information science, archival science, computer engineering, and the library field, together with eight working-level KISTI managers. The interview questions, organized around internal environment analysis, external environment analysis, and key tasks, are shown in Table 6. The vision and objectives, roles and responsibilities, SWOT analysis, and cooperation with related institutions presented below are a synthesis of the focus-group interview results.
Table 6. Expert-group interview questions
∙ Internal environment analysis: KISTI's mission (role) in digital content archiving; the importance of digital archiving for science and technology information resources; KISTI's strengths and weaknesses as a digital archiving center; the purpose and objectives of KISTI's digital archiving; core tasks and detailed action plans for digital archiving.
∙ External environment analysis: opportunities for KISTI in performing digital archiving; threats to KISTI in performing digital archiving; ways of cooperating with related domestic institutions.
∙ Key tasks: the priority tasks to pursue with emphasis.
∙ Other: any other opinions.

3.1 KISTI's vision and objectives
KISTI's mission is "to establish an R&D and service system for the science and technology knowledge-information infrastructure," and the tasks it must perform to that end are grounded in Article 26 of the Framework Act on Science and Technology. Article 40(8) of the Act's Enforcement Decree describes those tasks as follows:
∙ comprehensive collection and analysis of knowledge and information on science and technology and on national R&D programs, at home and abroad;
∙ construction, linking, and shared use of databases of such knowledge and information;
∙ construction of a distribution system and an integrated management system for such knowledge and information;
∙ standardization for the shared use of such knowledge and information;
∙ support for the formulation of comprehensive plans and measures to promote its management and distribution;
∙ other matters necessary to promote science and technology informatization.
In other words, KISTI must perform these same tasks for science and technology digital content, one form of science and technology knowledge and information; in short, it must perform the digital archiving role for digital content. Because digital content behaves differently from earlier information resources, with volatility, diversifying resource types, dependence on information technology, and copyright problems, digital archiving that includes long-term preservation requires somewhat different policies and technologies.
KISTI's vision, the ultimate goal of its digital archiving, should contain the following:
∙ establishing digital archiving governance for the digital knowledge and information of science and technology and of national R&D programs;
∙ building and operating a national science and technology knowledge resource center that supports everything from the comprehensive collection of electronic information, regardless of where it was produced, to permanent access;
∙ providing comprehensive collection and use of domestic science and technology materials, and establishing, externally, the position of national representative institution for science and technology materials;
∙ being the most comprehensive and trustworthy archiving center for domestic science and technology digital content;
∙ formulating national-level policy for the long-term preservation of science and technology information resources, especially born-digital content, and conducting the R&D for it, so that the science and technology heritage is handed down to later generations and the use of key science and technology resources is maximized.
Taken together, KISTI's vision for digital content archiving can be stated as follows: "a science and technology digital content archiving center, representing the nation, that provides comprehensive collection, accumulation, management, and long-term preservation of science and technology digital content."
Synthesizing the opinions on the objectives that support this vision yields five objectives, each with key tasks for achieving it, as shown in Table 7.

Table 7. Objectives of digital archiving and key tasks
∙ Build a science and technology digital archive: develop an integrated platform for the long-term, life-cycle-based preservation of digital content; build a trustworthy archive of the highest-priority resources; monitor continuously.
∙ Build internal and external systems for science and technology digital content archiving: build an operating organization for a national digital archiving center; prepare a common understanding of archiving and improve the institutional arrangements; secure funding for digital archiving from a long-term perspective.
∙ Formulate a long-term preservation policy for science and technology digital content: prepare a written policy for digital archiving; formulate it in short-, mid-, and long-term stages; build a copyright-permission mechanism for archiving externally produced information resources.
∙ Experiment with and research long-term preservation techniques for science and technology digital content: research long-term preservation strategies by type of digital content; research strategies for preserving science and technology web resources; research the archiving of network-based R&D projects; build an experimental test bed.
∙ Form consultative bodies for the long-term preservation of science and technology digital content: participate in and contribute to international digital archiving networks; build a digital archiving consultative body with related domestic institutions; cooperate across the policy domain (the National Library of Korea, the National Archives of Korea, UNESCO, learned societies and other copyright holders), the research domain (universities, research institutes), and the implementation domain (private information technology companies).

3.2 KISTI's roles and responsibilities
The following reasons were given for why digital archiving in science and technology matters to KISTI's information services:
∙ the rapid numerical growth of science and technology digital content;
∙ the volatility of digital content itself, and the growing uncertainty, for various reasons, of its future use;
∙ the need for a mechanism that guarantees permanent access to licensed digital content, such as foreign e-journals, even after license agreements expire;
∙ the technology-dependent character of digital content, which makes continuous archiving and curation necessary to keep information readable as technology advances.
Alongside these characteristics of digital content, KISTI holds the legal responsibility for managing and preserving science and technology information and is the information service institution with the largest holdings of data, so it must carry out digital archiving for science and technology responsibly. A systematized archiving policy and its execution are all the more necessary to maintain the continuity of services for science and technology digital content and to provide the authenticity and integrity of the content served. At the organizational level, KISTI's roles in archiving science and technology digital content can be stated as follows:
∙ the national preservation center for science and technology digital content;
∙ the hub for domestic science and technology digital content in international cooperation;
∙ the central body for research on digital content archiving policy and technology.
For KISTI to perform these roles, the specific areas of responsibility it must take on in digital content archiving can be defined as follows:
∙ as the representative institution for managing, preserving, and serving science and technology digital content, formulating the policies for the acquisition, management, preservation, and service of these resources;
∙ building the information technology infrastructure for archiving science and technology digital content;
∙ operating a dedicated unit for digital content archiving focused on long-term preservation;
∙ pursuing internal and external cooperation for archiving science and technology digital content;
∙ pursuing standardization for digital content archiving.

3.3 SWOT analysis
SWOT analysis is a useful analytical tool for formulating management strategy; it analyzes the environment in terms of an institution's strengths, weaknesses, opportunities, and threats. These can be grouped into internal factors (strengths and weaknesses) and external factors (opportunities and threats), or into the favorable environment surrounding the institution (strengths and opportunities) and, conversely, the risk-bearing factors (weaknesses and threats). SWOT analysis is commonly used by managers in the competitive private-sector market to read the market situation and set strategy, but these days it is also often used for environmental analysis in the formulation and execution of public-sector policy.
Figure 3 presents the results of the SWOT analysis, based on the expert-group and focus-group opinions, of KISTI's strengths, weaknesses, opportunities, and threats in archiving science and technology digital content.

[Figure 3. SWOT analysis of KISTI's digital archiving.]

KISTI's greatest strengths, named by multiple respondents, are its position as the representative information service institution in science and technology, its accumulated professional experience, and its broad holdings of digital content. The external opportunities are the growth in user demand for digital content; the widening recognition of the importance of and need for digital archiving; the digital archiving efforts of, and cooperation among, countries around the world, which make policy formulation and execution easier to put into practice; and advances in the related information technology, which were also judged an opportunity.
On the other hand, the negative factors are KISTI's lack of experience in digital archiving compared with related institutions, a shortage of leadership among related and cooperating institutions for a national representative body, the limits public institutions face in securing sufficient funding and staff, and still-unresolved copyright problems. The factor judged most threatening to KISTI's digital archiving is the problem of securing stable funding: unstable economic conditions at home and abroad and the instability of national finances are the greatest threats to digital archiving, which requires large, long-term budgets, and they were identified as a problem that must be solved.

3.4 Cooperation with related institutions
All the expert-group participants stated that even if the scope of digital archiving is limited to science and technology digital content, the volume of resources covered is so broad and their types so varied that building a cooperative system with related institutions is indispensable. Some held that such cooperation is essential enough to determine the success or failure of the policy's execution. Digital archiving is a large-scale undertaking that presupposes nationwide investment in people, money, and technology; with those resources limited, duplicated investment among related institutions and the wasteful activity it produces could cause great losses. The proposed approach to cooperation is that the state define archiving domains by field to avoid duplicated effort, emphasizing each institution's specialization while devising ways to expand services through shared use.
Cooperation should also cover research on and the technology of digital archiving. Life-cycle and risk assessment of digital content should be carried out first, in cooperation with producers and users at every level; joint work is needed to avoid duplication when assessing archiving priorities, and stakeholder cooperation is needed for access management.
The institutions related to digital archiving with which KISTI should cooperate can be divided into six domains, as in Table 8.

Table 8. Domains of cooperation for digital archiving
∙ Domestic policy-domain institutions: the National Library of Korea, the National Archives of Korea, KERIS, and others.
∙ Domestic research-domain institutions: universities and research institutes.
∙ Commercial scholarly-information distributors: EBSCO, ProQuest, KSI (한국학술정보), Nurimedia, and others.
∙ Technology-domain institutions: software and systems companies.
∙ Overseas policy-domain institutions: the Library of Congress, the British Library, the National Library of Australia, and others.
∙ Overseas digital archiving networks: Portico, LOCKSS, KB (Koninklijke Bibliotheek), and others.

4. KISTI's Digital Archiving Strategy
4.1 Strategy by archiving domain
KISTI has the mission of being the nation's archiving center for science and technology digital content. To achieve the ultimate goal of digital content archiving, it is therefore advisable to prepare the strategy in three domains: the national domain, the joint national and KISTI domain, and the KISTI domain. Table 9 summarizes, for each domain, the objective, the types of content to archive, the cooperating institutions, the main tasks, and the principal overseas archiving cases for reference.
First, the national domain is the role of national digital archiving (National Digital Archive, NDA) center for science and technology: KISTI's own domain, but one that consumes so much budget and time that national support is indispensable.
Next, the archiving domain that the state and KISTI must perform jointly is the role of archiving center for foreign scholarly information, such as foreign electronic journals. Finally, the objectives can be set for the KISTI domain, the role only KISTI can play: archiving center for domestic STM (science, technology, and medicine) information resources.
The digital archiving content types for the national domain are, broadly, web content and scientific data; the joint national and KISTI domain covers foreign scholarly journals, foreign conference materials, foreign research reports, and the like. The content types for the KISTI domain requiring archiving first are domestic scholarly journals, domestic research reports, conference materials, factual information, and trend information.
As the cooperation network for developing cooperation models and forming an archiving consultative body, the national domain involves the National Library of Korea, the National Archives of Korea, the Korea Education and Research Information Service, and related government bodies, with cooperation needed on coordinating archiving work and dividing roles. In the joint national and KISTI domain, cooperation with foreign publishers and KESLI must come first, and cooperation with networks such as Portico, LOCKSS, and the KB should be sought. In the KISTI domain, forming a consultative body with the government-funded research institutes, the Korea Research Council of Fundamental Science and Technology, the domestic STM learned societies, KESLI, and others must come first.

Table 9. KISTI's strategy for archiving digital content
∙ National domain. Objective: the role of national digital archiving (NDA) center. Content types: web content, scientific data, and others. Cooperating institutions: the National Library of Korea, the National Archives of Korea, the Korea Education and Research Information Service, the Korea Research Council of Fundamental Science and Technology, related government ministries, and others. Main tasks: formulate national policy and program strategy; prepare legal and institutional arrangements; develop cooperation models and form the related consultative bodies; secure budgets; train archiving specialists; educate and promote; conduct related research (value assessment, demand surveys, and so on). Reference cases: e-Depot, UKDA, NDAD, PANDORA, UKWAC, DCC, DPC, the DareLux Project.
∙ Joint national and KISTI domain. Objective: the role of archiving center for foreign scholarly information. Content types: foreign scholarly journals, foreign conference materials, foreign research reports, and others. Cooperating institutions: foreign publishers, KESLI, Portico, LOCKSS, KB, BL, and others. Main tasks: formulate long-term policy and program strategy; participate in international digital archiving cooperation networks; secure budgets; develop cooperation models and form the related consultative bodies; resolve copyright; conduct related research (value assessment, demand surveys, and so on). Reference cases: e-Depot, Portico, LOCKSS, TDR, Journal@rchive, NII-REO.
∙ KISTI domain. Objective: the role of archiving center for domestic STM information resources. Content types: domestic scholarly journals, domestic research reports, domestic conference materials, factual information, trend-analysis materials, and others. Cooperating institutions: government-funded research institutes, the Korea Research Council of Fundamental Science and Technology, the council of special libraries, the domestic STM learned societies, KESLI, and others. Main tasks: formulate the basic digital archiving plan and detailed strategies; develop cooperation models and form the related consultative bodies; build an archiving system based on the OAIS standard model; operate a pilot service; establish a dedicated archiving unit; train archiving specialists. Reference cases: LOCKSS, TDR, Journal@rchive, OCLC's ECO, TRAIL, NERS.

4.2 Strategy by development stage
For KISTI's digital content archiving to develop successfully, it is effective to divide the undertaking into stages and set execution targets for each stage of development. Conceptually, the stages can be presented as four: launch, construction, growth, and take-off.
In the launch stage, the first steps are to formulate the basic plan for archiving digital content and to build a pilot digital archiving system. A task force is also formed to draw up the detailed plans for the basic archiving policy, to develop cooperation models with the related institutions and stakeholders and form consultative bodies, and to study advanced archiving techniques.
The construction stage (2012-2013) requires formulating the national, long-term archiving policy needed to perform the most important role, that of national digital archive (NDA) center; promotion and education are important for building a national consensus on the need for the long-term preservation of science and technology materials. A cooperation network among domestic institutions is also required. Internally, KISTI lays the institutional foundation for archiving digital content by upgrading the archiving system, establishing a dedicated archiving unit, and working to secure the related budget.
The growth stage (2014-2016) is the period in which the detailed plans and budgets for archiving web content, scientific data, and foreign electronic journals are prepared, the international cooperation network is built, and the related systems are constructed. The collection of the content to be archived and the construction and operation of the storage system take place, and the related value-added services are developed.
The take-off stage (2017-2020) aims, on the basis of the results achieved so far, to cement KISTI's standing as the national digital archive center and, in cooperation with overseas digital archive institutions, to prepare a working data-sharing arrangement and support international services. It is also the period in which content building expands and services are upgraded, establishing a stable national digital archiving system.
Realistically, the time needed to reach each stage's targets is open to debate, but drawing on overseas cases we proposed a staged execution plan of roughly ten years, including the launch stage, for KISTI's digital archiving program to take stable root. The timing can of course be adjusted to national digital archiving policy and to the budget, staff, and other resources committed.

5. Conclusions and Suggestions
National libraries and representative information centers in Korea and around the world have long pursued policies of comprehensively collecting and preserving their national literature. Digital archiving nonetheless matters more now because, while the amount of information produced and consumed in digital form grows exponentially, digital materials vanish within a short time, without even the leisure to judge their value.
KISTI carries the national duty of collecting, managing, and preserving Korea's science and technology information and is the institution holding the most data, so it must carry out digital archiving for science and technology responsibly.
The purpose of this study was basic research toward a digital content archiving policy for KISTI, the specialized institution for science and technology information; the study proceeded in depth, combining qualitative methods: a survey of related overseas cases for policy development and interviews with archiving experts.
To prepare a framework for KISTI's digital archiving policy, we first identified the provisions a policy should contain by analyzing leading cases, and then analyzed the environment through expert-group interviews.
Through this we analyzed KISTI's digital archiving vision and objectives for science and technology digital content, its roles and responsibilities, and, through SWOT analysis, its strengths and weaknesses and the opportunities and threats, and we identified the related institutions with which it should cooperate.
Because KISTI has the mission of national archiving center for science and technology digital content, we formulated, to realize an effective digital archiving policy, a strategy by archiving domain and a strategy by development stage. To achieve the ultimate goal of KISTI's digital content archiving, the strategy was planned in three domains: the national domain, the joint national and KISTI domain, and the KISTI domain; for each domain, the objective, the content types to archive, the cooperating institutions, the main tasks, and overseas reference cases were described. Since it is also effective to approach the undertaking by setting execution targets for each stage of development, the stages were presented conceptually as four: launch, construction, growth, and take-off.
KISTI's digital content archiving policy should be enacted flexibly, so that it can respond actively to changes in the information environment and advances in technology, and initiative should be shown in collecting and preserving science and technology digital content as a precious cultural heritage.
In addition, the development and pilot operation of an archiving system flexibly linked with KISTI's existing information systems (NDSL, NTIS, and others), the training and education of archiving specialists, and sufficient discussion and study through varied, continuing research projects on digital archiving should come first. Efforts are also needed to gather the opinions of, and cooperate with, the many institutions and stakeholders involved in archiving digital content.

References
곽승진, 성원경, 배경재. 2011. 디지털 콘텐츠 수명주기 모델 분석 및 평가에 관한 연구. 정보관리연구, 42(1): 25-46.
국가기록원 대통령기록관. 2008. 국제모범기준과의 격차분석에 기반한 대통령기록관의 디지털 아카이브 발전전략연구. 성남: 국가기록원 대통령기록관.
김희정. 2003. 디지털 아카이빙 최근 연구동향 및 OAIS 참조모형에 관한 연구. 기록관리학회지, 3(1): 23-42.
박현영, 남태우. 2004. 디지털 아카이빙 정책에 관한 연구. 한국정보관리학회 학술대회 논문집, 2004년 8월, 69-76.
서은경. 2004. 디지털 아카이브의 영구적 보존을 위한 개념적 모형 설계에 관한 연구. 한국문헌정보학회지, 38(1): 13-34.
안영희, 박옥화. 2010. 디지털 큐레이션 정책을 위한 프레임워크 개발. 한국도서관·정보학회지, 41(1): 167-186.
이윤주, 이소연. 2009. 진본 전자기록의 장기보존을 위한 정책프레임워크. 기록학연구, 19: 193-249.
정영미, 윤화묵, 김정택. 2010. 디지털 콘텐츠의 무결성 유지를 위한 장기적인 보존 정책에 관한 연구. 정보관리연구, 41(4): 205-226.
정영임, 최호남, 최선희. 2010. 아카이빙 데이터의 활용성 증진을 위한 전략연구. 정보관리학회지, 27(1): 185-206.
최재황, 곽승진, 김정택. 2009. 디지털자료의 납본체계 및 이용에 관한 연구. 한국도서관·정보학회지, 40(1): 209-232.
최호남, 이응봉. 2005. 해외 전자저널의 디지털 아카이브 구축 전략에 관한 연구. 한국문헌정보학회지, 39(2): 161-183.
한국과학기술정보연구원. 2005. 국가 디지털 아카이빙 체제 구축에 관한 연구. 최종연구보고서. 서울: 한국국가기록연구원.
Beagrie, Neil et al. 2008. "Digital Preservation Policies Study." [cited 2010. 5. 12].
Beagrie, Neil et al. 2008. "Keeping Research Data Safe: A Cost Model and Guidance for UK Universities." JISC. [cited 2010. 7. 20].
Björk. 2007. "Economic Evaluation of LIFE Methodology. Research Report." LIFE Project, London, UK. [cited 2010. 7. 20].
British Library Digital Preservation Strategy. [cited 2010. 5. 3].
Columbia University Libraries. 2006. "Policy for Preservation of Digital Resources." [cited 2010. 5. 23].
Digital Curation Centre, Digital Preservation Europe. 2007. DCC and DPE Digital Repository Audit Method Based on Risk Assessment.
Digital Preservation Coalition. "Preservation Management of Digital Materials: The Handbook." [cited 2010. 4. 22].
Higgins, Sarah. 2009. "Applying the DCC Curation Lifecycle Model." [cited 2010. 5. 19].
ICPSR. 2006. "Digital Preservation Management." [cited 2010. 5. 6].
Knight. 2007. "Recommendations to ensure the long-term preservation of digital objects stored by institutional repositories." [cited 2010. 7. 20].
McLeod, Wheatley and Ayris. 2006. "Lifecycle Information for e-Literature: Full Report from the LIFE Project. Research Report." LIFE Project, London, UK. [cited 2010. 7. 20].
National Library of Australia. 2008. "Digital Preservation Policy." [cited 2010. 5. 22].
National Library of Australia. "Recommended Practice for Digital Preservation." [cited 2010. 5. 22].
OCLC, CRL. 2007. Trustworthy Repositories Audit & Certification: Criteria and Checklist.
OCLC Digital Archive Preservation Policy and Supporting Documentation, Last Revised. 2006. [cited 2010. 4. 15].
Parliament Archives. 2008. "A Digital Preservation Policy for Parliament." [cited 2010. 5. 23].
Shenton. 2003. "Life Cycle Collection Management." LIBER Quarterly: the Journal of European Research Libraries, 13(3): 254-272. [cited 2011. 7. 20].
Thibodeau, Kenneth. 2010. "Preservation and Migration of Electronic Records: The State of the Issue." [cited 2011. 6. 12].
University Digital Conservancy Preservation Policy. [cited 2011. 6. 1].
University of Minnesota Digital Conservancy Preservation Policy. [cited 2011. 6. 1].
UK Data Archive, The National Archives. 2005. Assessment of UKDA and TNA Compliance with OAIS and METS Standards.
Watson. 2005. "The LIFE project research review: mapping the landscape, riding a life cycle. Literature review." London, UK. [cited 2011. 7. 20].
Wheatley et al. 2007. "The LIFE Model v1.1. Discussion paper." LIFE Project, London, UK. [cited 2011. 7. 20].
<Related homepages>
CCSDS Homepage. [cited 2011. 3. 30].
DPC Homepage. [cited 2011. 4. 12].
KISTI Homepage. [cited 2011. 7. 12].
OASIS Homepage. [cited 2011. 9. 20].
work_pcah26is3rge7flkw6vlfeyhvi ---- From the Air: the photographic record of Florida's lands
Stephanie C. Haas, haas@uflib.ufl.edu
Erich Kesse, kesse@ufl.edu
Mark Sullivan, marsull@uflib.ufl.edu
Randal Renner, ranrenn@uflib.ufl.edu
Digital Library Center, University of Florida Libraries
Joe Aufmuth, mapper@ufl.edu
GIS Coordinator, Documents Department, University of Florida Libraries
Support staff for the project: http://www.uflib.ufl.edu/digital/collections/FLAP/Credits.htm
BACKGROUND
Historical aerial photographs dramatically document changes in Florida's land use. Between 1938 and 1971, the U.S. Department of Agriculture (U.S.D.A.) created more than 88,000 black-and-white, 9 x 9 inch aerial photographs of Florida, with 2,500 accompanying photomosaic indexes (1938-1971). Flight lines were county-based, and each flight created dozens of individual aerial photographs, or tiles. Because of the unstable cellulose nitrate composition of the photographic negatives, the U.S. government destroyed the archival negatives for the earliest photos. As a result, the aging hardcopy photographic prints are all that remain of this historic resource. Originally intended to help farmers obtain accurate assessments of their farms and to provide information on crop determination and soil conservation, today these images provide some of the oldest land use/cover information available. They are used extensively in agriculture, conservation, urbanization, recreation, education, hydrology, geology, land use, ecology, geography, and history.
The University of Florida Map & Digital Imagery Library houses the largest and most complete collection of Florida aerial photographs (~160,000 photos) outside the National Archives in Washington, D.C. U.S.D.A. aerial photographs (~120,000) comprise the largest and most heavily used single set of photographs in this collection. In 2002, the Digital Library Center, the Map & Digital Imagery Library, and the GIS Coordinator, Federal Documents Department, of the University of Florida Libraries submitted a grant proposal to the Florida Department of State, Division of Library and Information Services, to digitize the aging 1937-1951 images and make them available over the Web through an ESRI map server.
The grant was funded as an LSTA grant in 2002; Phase II, funded in 2003, digitized the tiles from 1952-1970. The project site can be found at http://www.uflib.ufl.edu/digital/collections/flap.
INTRODUCTION
Within Florida, government agencies and private consulting firms have made extensive use of the UF collection. But few citizens, and fewer educators and students, recognize the many potential uses of these early historical images, which document Florida's transition from rural to urban, track the containment of Florida's terrestrial waters, and trace its phenomenal growth as a "sunshine" mecca. Although physically accessible, these images remained Florida's hidden visual heirloom.
Once the project content was defined, we sought to document the need by soliciting input from the Florida community. To do this, we targeted GIS and other email lists, asking individuals to forward the message to others who might be interested in such a project. Within five days, we had received over 90 responses that reflected serious interest from a wide, diverse user base. Support came from federal, state, and local agencies; Florida industries for which land use is an intrinsic factor; educators of students in grades 1-12; nonprofit organizations with interests ranging from the environment to genealogy; and non-affiliated individuals with personal interests. The comments of Mark W. Glisson, Environmental Administrator, Division of State Lands of Florida, clearly indicate the tremendous need for and interest in this project:
"It is my understanding that the Digital Library Center and the Map & Imagery Center, University of Florida Libraries, are currently pursuing a grant to digitize and make available on the Web the historical Florida aerial photographs, taken by the USDA between 1937 and 1955. As Staff Director for the Acquisition & Restoration Council, and as director of the office responsible for reviewing land management plans, proposed land uses, and reviewing management activities on all conservation lands leased for management by the Board of Trustees in Florida, please allow me to lend my enthusiastic support and encouragement for this endeavor. More than 8 million acres, or approximately 25% of Florida's total upland acreage, are managed for conservation purposes. Included among those responsible for managing these conservation lands are the state agencies that manage state parks, state forests, wildlife management areas and greenways; federal agencies such as the U.S. Forest Service, the U.S. Fish & Wildlife Service, the National Park Service, and even the Department of Defense; the five state water management districts; and a growing number of NGOs and local governments. Common among all of these land managers is an increasing emphasis on restoring Florida's natural systems to some semblance of their natural state. For the first time ever, the Florida Forever land acquisition program now includes a focus on restoration funding, in recognition of the fact that we may now finally be in a position not only to set aside remaining "natural" areas, but to proactively pursue the return of these mostly-altered systems to functioning habitat and wetland systems.
In order to collectively work toward this goal, there is a genuine need for a consensus on the objectives and desired outcomes of restoration initiatives, so that different agencies and different funding sources are not in conflict when lands lie in common watersheds or on adjacent uplands. Uniform access to these historical aerials could help immensely in guiding restoration objectives and identifying historical water flows and habitat characteristics, across agency and management boundaries. Beyond this role as a "standardizer" and data source for large-scale restoration, the maps would also provide invaluable and quickly accessed information for any land manager seeking, through accepted land management practices, to restore natural functions and flow patterns on conservation lands. As a land manager for 18 years with the state park system, I can tell you from firsthand experience that such a tool could go a long way toward taking the guesswork out of management objectives. In addition, the historical land uses and working landscapes revealed by these aerials could help managers protect and interpret Florida's rich cultural heritage."
Because there is a nationwide emphasis on restoration of native landscapes, a high level of support for similar projects in other states is highly probable. Similar projects in Georgia, Illinois, and Kansas attest to this need.
PROJECT DESCRIPTION
Four action lines were followed in the project's development: 1) digitization of aerial photographs and photomosaic indexes; 2) development of a GIS interface; 3) creation of instructional materials; and 4) creation of a Web site integrating Products 1-3.
Digitization of aerials and photomosaic indexes
Scanning of the aerial photographs (tiles) began in November 2002 and continued through August 2004. The photomosaic indexes were captured in four months on a large-format camera. Throughout the aerial tile scanning, photogrammetric glass targets were used to document distortion introduced by the various scanners. Although drum scanners, which are photogrammetrically more accurate, were considered, their cost precluded their use in this project. Because the original negatives were no longer available, distortions introduced by the photographic printing process were accepted, as were distortions introduced into the source image by the pitch and yaw of the plane. Subsequent reviews of the scanned images showed that the amount of distortion was quite minimal and within the acceptable range.
Image capture of the 9 x 9 inch aerial tiles and the 20 x 24 inch photomosaic indexes adhered to the guidelines promulgated by the Cornell Department of Preservation and Conservation (see Digital Imaging for Libraries and Archives, Anne R. Kenney and Stephen Chapman, Ithaca, NY: Cornell University Library, 1996). TIFF masters were the original capture format. Electronic archive masters are uncompressed TIFF files (TIFF 6.0) at 100% scale, the current de facto standard for electronic image archives. Aerial tile images were scanned at 615 dpi, 8-bit greyscale. Because the digitized aerial photographs average approximately 29.9 MB, a compressed SID version of 1.3-1.5 MB was created for serving over the Web. SID images are served from a dedicated server at the Florida Center for Library Automation. Migration to the JPEG 2000 (Level 1: JP2) file format is planned. Epson Expression 1640XL-SE and Microtek 9600XL and 9800XL flatbed scanners were used to capture the aerial tiles.
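The stated master-file size follows directly from the scan parameters; as a quick check (assuming exactly 9 x 9 inches of image area and ignoring TIFF header overhead):

```python
# Sanity check of the ~29.9 MB average master size from the scan settings.
# Assumes exactly 9 x 9 inches of image area; TIFF header overhead ignored.
inches, dpi, bytes_per_pixel = 9, 615, 1      # 8-bit greyscale = 1 byte/pixel
pixels_per_side = inches * dpi                # 5,535 pixels
size_bytes = pixels_per_side ** 2 * bytes_per_pixel
print(f"{size_bytes / 1e6:.1f} MB")           # ~30.6 MB, near the 29.9 MB reported
```

The small gap from the reported average is consistent with tiles whose usable image area falls slightly short of a full 9 x 9 inches.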
For the larger photomosaic indexes, a PhaseOne PowerPhase FX 4 x 5 inch digital camera back with a 10,500 pixel by 12,600 pixel scanning area and 8-bit greyscale capture was used. The camera back is mounted on a planetary ZBE Satellite universal scanning system that is no longer manufactured; it includes three turret-mounted lenses and bellows, a camera stand, and an automated control system for calibrated imaging. A Rodenstock Rodagon 135 mm professional enlarging lens (f/5.6-f/22) with an AR-1 high-aspect-ratio filter was used for imaging. Two daylight-balanced fluorescent Videssence ICELITE 360 light banks provided even illumination. Indexes were held in place during imaging by a Cobra-Pro vacuum easel powered by two Craftsman wet/dry shop vacuums. All of the index images were captured and processed on an Apple Macintosh G4 with dual 1 GHz processors, 1 GB RAM, and a 36 GB SCSI hard drive. The Macintosh runs OS 10 with OS 9.2 subroutines for compatibility with the Phase One imaging software. Final quality control is performed using Adobe Photoshop CS (v.8). TIFF images were processed with LizardTech's MrSID 1.3.1 to create servable SID images.
Image Management
The management of the 88,000 physical aerial tiles and the subsequent collection and processing of the scanned images were major challenges for this project. Each physical aerial tile was tracked from the time it was borrowed from the Map Library until the time it was returned. For each TIFF image created, several key pieces of data were collected: the scanner, the time of the scan, and the technician who performed the work. TIFF header information, including bit depth and resolution, was verified against project requirements. Every tile was post-processed for both image (e.g., gamma) correction and the creation of web-friendly formats. Additionally, tile images were visually inspected to assure quality. Finally, each image was archived, and the web formats were sent offsite via FTP to the image server at the Florida Center for Library Automation.
A programmer developed three different software tools to address these needs. The first, a user-friendly front end, allowed each physical tile to be tracked through the entire in-house process. As each physical tile was received, it was checked against a database and then assigned to a technician for scanning. Once scanned, images were collected, processed, FTPed, and archived, and each action was tracked in the application. Once all the tiles for a flight were complete, the tiles were returned en masse.
The second tool automated image collection from the disparate scanning locations, performed basic image manipulation for quality control, and stored the collected data in the database. Image scans were stored in individual flight folders on local hard drives. As images were collected from the drives, data on the scanner and flight numbers were also stored in the database. This application also read the TIFF header of each image to ensure that the correct technical specifications, e.g., dpi and greyscale, were followed during scanning. Deviant parameters triggered error notifications.
The last tool performed routine image corrections and prepared web-format images on a dedicated computer. The processor performed histogram correction and SID creation for each image. Additional JPEG thumbnails were kept for visual quality control of the images.
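The paper does not reproduce the collection tool's code, and its implementation language is not named; the TIFF header check it describes might look like the following sketch, using Python and the Pillow imaging library (both assumptions):

```python
# Sketch of the TIFF header verification described above. Pillow and the
# tolerance logic are assumptions; the project's actual tool is unpublished.
from PIL import Image

EXPECTED_DPI, EXPECTED_MODE = 615, "L"  # project spec: 615 dpi, 8-bit greyscale

def check_tile(path):
    """Return a list of deviations from the project's scan specification."""
    errors = []
    with Image.open(path) as img:
        if img.format != "TIFF":
            errors.append(f"format {img.format} is not TIFF")
        if img.mode != EXPECTED_MODE:
            errors.append(f"mode {img.mode} is not 8-bit greyscale")
        dpi = img.info.get("dpi", (0, 0))
        if any(abs(float(d) - EXPECTED_DPI) > 0.5 for d in dpi):
            errors.append(f"dpi {dpi} differs from {EXPECTED_DPI}")
    return errors

# Example: check_tile("flight_123/tile_0001.tif") returns [] when within spec.
```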
Each of the web formats was automatically sent to the FCLA web hosting site, and the raw TIFF images were packed into CD-size folders for local archiving. Finally, all of this data was stored in an aerials database and the aerial tiles were returned to UF's Map & Imagery Library. Cost savings accrued from these automated procedures reduced the project costs by $20,000.
Each of the tiles captured has a record in an MS-SQL database. The database programmer collaborated with the Libraries' Systems Department and the GIS Coordinator to create a web-based map interface. The interface permits searching by county, latitude and longitude, township/section/range, and year. These access points were suggested by the same group of 90 individuals who responded to the initial request concerning the need for the project. The interface is accessed through the Aerial Photography: Florida home page at http://web.uflib.ufl.edu/digital/collections/FLAP/.
GIS Interface
To create the map interface, GIS technicians under the supervision of the GIS Coordinator used ERDAS Imagine software to geographically rectify and stitch together the multiple county mosaic index tiles. Individual aerial tiles were then hyperlinked to the corresponding spatial points on the indexes. Each tile is represented on the interface by a green dot that turns yellow when selected. Users may select an area of interest by using zoom tools, by drawing spatial footprints (e.g., rectangles, polygons), or by searching the tile database. The individual tile is zoomable and can be viewed at a 1:1 ratio.
[Figures: the map interface. Zoom in to the St. Augustine area; the selected tile dot turns yellow, and the tile data displays below. Clicking on the camera icon displays the tile.]
The spatial search engine, individual images, and metadata are integrated through the ESRI ARCIMS (Internet Map Server) software and served from a map server housed at the University of Florida Libraries' Systems Department. ESRI ArcGIS software is used to integrate other geospatial data layers, e.g., roads, rivers, and political boundaries, with the aerial index layers.
While initial programming efforts were directed at creating an interface that allowed the user to search by multiple access points (for example, by county and year), the built-in search functionalities of the ARCIMS software proved the most intelligible and functional approach to combining search terms for users already familiar with GIS applications. Additional revision is pending to make advanced searches more intelligible to the average user with no previous GIS experience.
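The paper names the access points but does not publish the MS-SQL schema; a county-and-year lookup against the tile database might look like the sketch below, in which the DSN, table, and column names are all assumptions:

```python
# Sketch of a county/year tile lookup against the MS-SQL tile database.
# The DSN, table, and column names are assumptions; the schema is unpublished.
import pyodbc

def find_tiles(county, year):
    conn = pyodbc.connect("DSN=aerials")  # hypothetical ODBC data source
    sql = ("SELECT tile_id, flight_line, latitude, longitude "
           "FROM tiles WHERE county = ? AND flight_year = ?")
    return conn.cursor().execute(sql, county, year).fetchall()

for row in find_tiles("Alachua", 1949):
    print(row)
```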
Miami [http://www.uflib.ufl.edu/digital/collections/FLAP/education/Cities/Miami.htm ] (Grade 6- 8) This unit discusses the evolution of Miami from the stone circles of the Tequesta to modern high rises. An introduction to aerial photos shows students how to track city growth. A Place in Time [http://www.uflib.ufl.edu/digital/collections/FLAP/Education/StAug/Placeintime.htm] provides a model of how to use aerial images to determine landscapes over time. Two images of St. Augustine Harbor one from 1942 and the other from 1960 show changing land use. The activity for this unit asks students to choose a feature in his/her own county and see how surrounding land use has changed over a period of years. In additional to the educational components, eleven search guides http://web.uflib.ufl.edu/digital/collections/flap/Helpdetail/Howto.htm were developed to help users understand how GIS mapping interfaces function and to give instructions on using the site. USE OF THE SYSTEM Although the project has never formally been announced, use of the images has grown rapidly. 2003-2004 statistics collected by the Florida Center for Library Automation (FCLA) indicate 1,028 unique users had accessed the site. These users accounted for more than 398,600 manipulations of the aerial images. Manipulations are defined as both the initial display of an aerial image and zooming around the image itself. The following three charts compare the usage of county aerial tiles for 2003 and 2004. It should be noted that the preliminary aerial site became available in late 2003. The data collection for 2004 covers January through November 5, 2004. http://www.uflib.ufl.edu/digital/collections/FLAP/education/Interpreting/Interpret.htm http://www.uflib.ufl.edu/digital/collections/FLAP/education/Explorer/SPANISHEXPLORERS.pdf http://www.uflib.ufl.edu/digital/collections/FLAP/education/Cities/Miami.htm http://www.uflib.ufl.edu/digital/collections/FLAP/Education/StAug/Placeintime.htm http://www.uflib.ufl.edu/digital/collections/FLAP/Education/PlaceInTime.htm http://web.uflib.ufl.edu/digital/collections/flap/Helpdetail/Howto.htm Email and phone contacts concerning aerial use continue to grow. The site has an FTP request form that is used to facilitate the transferring of SID images and occasionally TIFF images to requestors. Between May and November more than 1,200 images were transferred to requestors. Alachua to Hardee County Aerials Used, 2003-2004 comparison 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 Counties 2003 2004 Hendry to Monroe County Aerials Used, 2003-2004 comparison 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 Counties 2003 2004 Nassau to Washington County Aerials Used, 2003-2004 comparison 0 2000 4000 6000 8000 10000 12000 Counties 2003 2004 FUTURE PLANS Although the GIS functionality is familiar to land use professionals, the project team has become aware of the difficulty it represents for the general public and K-12 audiences. This project was presented at the annual conference of the Florida Association for Media in Education and while there was great interest in the content, the interface proved too daunting for many of the attendees. In order to provide the broadest service to the citizens of Florida, the design of a second, less complex interface to the aerial collection is a high priority. Expanded year coverage is also a priority. The completed U.S. 
Department of Agriculture aerial collection will include digitized aerials from 1971 through 1995, the starting date for the online collection of the Florida digital orthophoto quarter quads [http://data.labins.org/2003/MappingData/DOQQ/doqq.cfm]. Robert R. Terry, Agricultural Statistics Administrator, Commercial Citrus Inventory, has given us permission to digitize and serve the aerial photographs taken as part of the Commercial Citrus Inventory. These images have been taken every two years since 1965 and document changes in citrus grove/land use. Currently there are 160 canisters of film containing approximately 24,000 images. Procedural modifications necessitated by these roll films are currently being tested. Additionally, many agencies that are using the USDA aerial collection have indicated a willingness to return georectified images and/or to provide images of tiles we are missing. Such collaboration will help us build a digital collection of significant value to the entire state. Automated methods of contribution and attribution of "trust" are being configured.

In terms of technical enhancements, the project team has identified several functionalities that need to be addressed:

1) All data fields for any individual tile should be searchable and viewable;
2) The geospatial footprint of the tiles, currently points, should be buffered to a closer approximation of actual earth coverage*;
3) The georectified photomosaic indexes should be stitched together and viewable as a single layer;
4) Features from the Geographic Names Information System (GNIS) should be added to the data layers;
5) More detailed transportation layers need to be added to help identify locations of interest; and
6) A publicly accessible FTP server needs to be developed so individual users with proper authorization can download needed high-resolution images.

CONCLUSIONS

It is most appropriate to conclude with comments from the actual users:

Our Division is actively restoring natural areas in the state of Florida. We have used the aerials to delineate native undisturbed areas (now disturbed) for restoration. We have also used the aerials in geo-rectified form and loaded them onto our GPS data loggers and navigated to disturbed areas (previously undisturbed) that were undetectable by any other means or data source. This site is and will be valuable to government agencies for the data that the imagery contains. The data site is easy to use and lets non-technical people access the data. Please keep the site and data collection effort up and going. – GIS Coordinator at a state environmental agency

The Florida aerial images site has been indispensable to me for identifying potential research sites for examining the effects of land conversion on storage of soil carbon in South Florida. The website was useful in identifying available historical aerials, choosing the images most relevant to my study area, and having those images transferred to our organization. This site is an outstanding resource for researchers and land managers. – Director of Research at a biological research laboratory

This site was extremely useful to my research on gopher tortoises. It allowed me to trace habitat changes and anthropogenic disturbance to a large number of study sites over the last 50-60 years. Being able to download all of the maps from one site saved me considerable time and effort.
– Graduate Student

* Georectification of individual tiles or the definition of bounding boxes would be ideal, with sufficient funding.

work_pdeknlsmwzhg7itraxxytd5xja ----

D-Lib Magazine January 2000 Volume 6 Number 1 ISSN 1082-9873

Mapping and Converting Essential Federal Geographic Data Committee (FGDC) Metadata into MARC21 and Dublin Core: Towards an Alternative to the FGDC Clearinghouse

Adam Chandler and Dan Foley, Energy and Environmental Information Resources Center, University of Louisiana Lafayette, Lafayette, Louisiana; adam_chandler@usgs.gov, dan_foley@usgs.gov
Alaaeldin M. Hafez, Center for Advanced Computer Studies, University of Louisiana Lafayette, Lafayette, Louisiana; ahafez@cacs.usl.edu

Abstract

The purpose of this article is to raise and address a number of issues related to the conversion of Federal Geographic Data Committee metadata into MARC21 and Dublin Core. We present an analysis of 466 FGDC metadata records housed in the National Biological Information Infrastructure (NBII) node of the FGDC Clearinghouse, with special emphasis on the length of fields and the total length of records in this set. One of our contributions is a 34-element crosswalk, a proposal that takes into consideration the constraints of the MARC21 standard as implemented in OCLC's WorldCat and the realities of user behavior.

1. Introduction

This paper describes a continuing digital library research project at the Energy and Environmental Information Resources Center to enhance access to Federal Geographic Data Committee (FGDC) data sets.[1] It presents a mapping of selected FGDC metadata elements into Dublin Core (DC) and MARC21 metadata that is based on standard crosswalks [Mangan 1997; LC 1999]. The FGDC elements included in our mapping are referred to as "essential FGDC metadata." They provide the basis for a converter being developed to import FGDC metadata into the Online Computer Library Center's WorldCat, its Cooperative Online Resource Catalog (CORC) project, and into local MARC-based library catalogs. We also analyze a data set of 466 FGDC records: 1) as a criterion for selecting essential FGDC elements, and 2) in terms of FGDC record length, because record and field lengths are a limitation for records in WorldCat and often in local library systems.

One impetus for this work is our discovery in 1998 that more than 50% of the queries directed at the National Biological Information Infrastructure (NBII) node of the FGDC Clearinghouse retrieve zero (0) hits for the user. To us, that number represents a failure in the system architecture. A follow-up analysis of NBII log files between the period of July 1998 and March 1999 substantiated the earlier finding. We are following two research threads: the first is to create an alternative Clearinghouse model that makes management and maintenance of the metadata easier for the individuals responsible for taking time to create FGDC compatible metadata; the second is to convert existing and future metadata to more widely used metadata standards for inclusion in systems other than the Clearinghouse. Our metadata converter model addresses both concerns. (The permanent URL for our converter project is: .) Before describing our project, however, it may be useful to first offer some important definitions for readers who are not professional librarians.
WorldCat is an international bibliographic database of more than 40 million records maintained by the Online Computer Library Center (OCLC) in Dublin, Ohio, and used by more than 34,000 libraries worldwide [OCLC 1999]. WorldCat records are in MARC21 format, which is the current version of the MARC (Machine Readable Cataloging) standard originally developed in the 1960s by the Library of Congress [LC 1998]. MARC21 is used in the United States and Canada. There are also other national and international MARC standards such as UKMARC and UNIMARC.

The Cooperative Online Resource Catalog (CORC) is an initiative sponsored by OCLC to advance the creation and sharing by libraries of metadata for Internet resources. Some of the main features of CORC are the integration of Dublin Core and MARC21 metadata into a single system that provides for both shared and local metadata for digital and physical items, editing in DC and MARC21 views, import and export of DC and MARC21 records, RDF/XML import and export, authority control, assisted (DDC) classification and subject heading assignment, automated keyword extraction and data extraction, link maintenance, and Unicode support [CORC 1999].

The Dublin Core Metadata Initiative is well known to researchers in the digital library community [DC 1999]. The first Dublin Core Metadata Workshop was sponsored by OCLC and the National Center for Supercomputing Applications in March 1995. Since that time, six more workshops have taken place, with the Seventh Dublin Core Workshop (DC-7) being held October 25-27, 1999 at Die Deutsche Bibliothek in Frankfurt, Germany [DC-7].

2. Analysis of FGDC Metadata

FGDC metadata is based on the "Content Standard for Digital Geospatial Metadata" [FGDC 1998]. The standard is available in several electronic formats, for example, as hypertext image maps [CSDGM Image Map 1998]. FGDC metadata has a hierarchic structure of more than 300 elements, including 199 data entry elements, that are organized into seven information sections:

Identification
Data Quality
Spatial Data Organization
Spatial Reference
Entity and Attribute
Distribution
Metadata Reference

and three supporting sections called templates:

Citation
Time Period
Contact

Sections and elements are either mandatory, mandatory if applicable, or optional. Templates are not used alone, but are inserted into information sections at appropriate places. Some data elements are repeatable, as are the templates. Only the Identification and Metadata Reference sections (sections 1 and 7) are mandatory in a fully compliant FGDC metadata record.

After an FGDC record is created in one of the available editors, for example the MetaMaker program created by FGDC [MetaMaker 1999] and the Army Corps of Engineers' effort called CorpsMet [CorpsMet 1999], the structured ASCII text file is run through a parser which first checks its syntax and then outputs three different versions of the record (text, HTML, SGML). All three versions are then sent to a node within the FGDC Clearinghouse. There, Isite software indexes the SGML version of the record. The nodes are then searched through one of the Clearinghouse web sites. The user's request is entered on a web form and sent to a Z39.50 client that broadcasts the request to all the selected nodes, then returns the results of the query to the user's browser as one set. It should be noted that the FGDC Clearinghouse has published statements that indicate that, on average, 10% or more of nodes are not functioning at any given time.
We believe that the percentage may even be higher. The reader may check the status of the FGDC Clearinghouse nodes at any time at the following location: <http://130.11.52.178/serverstatus.html>. The model we are proposing to address this reliability problem eliminates the complicated Z39.50 based nodes. Instead, researchers will register the location of their metadata files with a central search engine/converter. During this registration process, a unique persistent identifier will be assigned for each full metadata record. At that point, the content of the file will be ingested into this centralized portal, which will offer features such as searching, browsing, and conversion to MARC21 or other metadata standards. We will be reporting on a working prototype of this model in the months ahead.

Analyzing the NBII Data Set for Record Length

FGDC records provide considerably more information than is usually found in library online catalogs. This applies to both the kind and amount of information that they convey. Thus, one of our goals was to determine how much FGDC records may exceed the field and record length limits of MARC21 records in OCLC's WorldCat database. While MARC21 records have a theoretical maximum record length of 99,999 ASCII characters and a maximum field size of 9,999 ASCII characters, in WorldCat the maximum record length is 4096 characters and the maximum field size is 1230 characters in a variable field. WorldCat also has a maximum of 50 variable fields. Accordingly, we obtained a data set of 466 metadata records from the National Biological Information Infrastructure (NBII) of the U.S. Geological Survey.[2] The output of the SGML format maps into a flat text file of 444 element cells for each record. A summary of results is presented in Table 1. For those interested in examining the data in more detail, see Appendix A: NBII Data.

Table 1: Record and Field Length Summary of 466 NBII Metadata Records

                                               record length    largest field in record
average                                        6792 bytes       2125 bytes
median                                         6474 bytes       1258 bytes
largest values of 466 NBII records             28042 bytes      9525 bytes
number of records with length > 4096           343 records
number of records with a field length > 1230                    236 records

It is clear that the FGDC metadata in this set is, essentially, of a different type than the typical catalog record found in WorldCat. The largest record in the set contains 28042 bytes, that is, nearly seven times larger than the WorldCat record length limit. The largest field value is about eight times larger than the WorldCat field limit. In fact, about 74% of this set exceeds the maximum record size, while 51% of the records have at least one field, usually the abstract (element 14 in our output), distributor liability statement (element 376), or the process description (element 135), that exceeds the field length limit in WorldCat.

How does this relate to records in OCLC's CORC system, which allows the import and export of both MARC21 and Dublin Core records? According to Thomas Hickey, CORC Project Manager at OCLC, the differences in size between FGDC records in the NBII dataset and MARC21 bibliographic records in WorldCat should not be a problem for the CORC system. The only real limitation to record size in CORC is what browsers can handle. There have been some problems with records having tens of thousands of bytes, but the average FGDC record is well below this range.
WorldCat may adopt CORC's XML system sometime in the future, but for now, moving very long CORC records into WorldCat would require an algorithm to cut or drop fields in order to make the record fit. In other words, the WorldCat record would display abbreviated data in some fields, but the CORC system would display the entire record. The newer Dublin Core, XML/RDF, and FGDC standards do not have field or record length limitations.

Criteria for Mapping and Converting FGDC Elements

It is not our intention to map and convert all 300-plus FGDC elements (or 195 data entry elements). Rather, we selected a smaller number of elements that we refer to as "essential FGDC metadata" for a fully compliant FGDC record. Elements were selected for three reasons:

1) they are required (mandatory) for the production of a fully compliant FGDC record;
2) they are search keys such as author, title, subject, and date that are commonly found in online library catalogs;
3) they are fields commonly used by creators of FGDC metadata that may be used as search keys by persons interested in FGDC geospatial data sets.

The first two criteria are determined, respectively, from mandatory elements in the "Content Standard for Digital Geospatial Metadata" (CSDGM) and by generally accepted library practice for the selection of access points in online catalogs. The third criterion is based on a frequency analysis of the NBII data set for actual usage of FGDC elements by persons who created the metadata records. The results of this analysis are presented in Table 2. Columns 1 and 2, respectively, give the tag numbers and names of each essential FGDC element as found in the CSDGM. Column 3 gives the number of times each essential element was used in the sample set (out of a possible 466 times).

Table 2: Element Frequency Count for Sample Set of 466 NBII Metadata Records

FGDC Tag     FGDC Element                        NBII Frequency
8.4          Title                               466
8.1          Originator                          466
1.6.1.1      Theme_Keyword                       466
1.6.2.2      Place_Keyword                       424
1.2.1        Abstract                            465
1.2.2        Purpose                             466
8.8.2        Publisher                           305
8.2          Publication_Date                    466
1.5.1.1      West_Bounding_Coordinate            466
1.5.1.2      East_Bounding_Coordinate            466
1.5.1.3      North_Bounding_Coordinate           466
1.5.1.4      South_Bounding_Coordinate           466
9.3.1        Beginning_Date                      345
9.3.1        Ending_Date                         345
9.1.1        Calendar_Date                       117
10.1.1       Contact_Person                      396
10.4.1       Address_Type                        461
10.4.2       Address                             459
10.4.3       City                                461
10.4.4       State_or_Province                   461
10.4.5       Postal_Code                         461
10.4.6       Country                             165
10.5         Contact_Voice_Telephone             461
10.6         Contact_Facsimile_Telephone         226
10.8         Contact_Electronic_Mail_Address     315
10.9         Hours_of_Service                    47
4.1.2.1.1    Map_Projection_Name                 74
4.1.4.1      Horizontal_Datum_Name               59
1.7          Access_Constraints                  466
1.8          Use_Constraints                     466
10.1.2       Contact_Organization                65
10.3         Contact_Position                    282
6.4.2.1.1    Format_Name                         257

3. FGDC to MARC21/DC Crosswalk

The following table (Table 3) presents our crosswalk from FGDC to Dublin Core and MARC21. It consists of 34 essential FGDC elements and is based on standard crosswalks [LC 1999; Mangan 1997].[3] It includes mandatory elements from the Identification and Metadata Reference sections, as well as specific elements from the Spatial Reference, Distribution, Citation, Time Period, and Contact sections. Our crosswalk has both similarities to and differences from the "Metadata Entry System" for minimally compliant metadata that has been proposed recently by the Federal Geographic Data Committee [FGDC 1999]. We recommend that the reader compare those guidelines with the elements in our crosswalk.
Appendix B contains a detailed discussion of the essential FGDC metadata elements. It also includes our reasons why there are temporary blank spaces in the crosswalk in Table 3. The crosswalk and converter represent the current state of an evolving process rather than a final product. The converter software program is written in C by one of us (Alaaeldin Hafez). It has a modular and adaptable design, that is, it is very easy to add, change, or delete particular features within its general design. However, even the best machine conversion may require some human intervention: in other words, librarians may want to do some editing of records produced by the converter in order to adapt them to their local automated library systems.

Table 3: Crosswalk from FGDC to Dublin Core and MARC21

     FGDC Tag    FGDC Name                        DC Name                    MARC21 Tag
01   8.4         Title                            Title                      245 00 $a
02   1.2.1       Abstract                         Description                520 __ $a
03   1.2.2       Purpose                          Description                500 __ $a
04   8.1         Originator                       Creator.Name               720 __ $a
05   8.8.2       Publisher                        Publisher                  260 0_ $b
06   8.2         Publication_Date                 Date.Issued                260 0_ $c
07   9.1.1       Calendar_Date                    Coverage.Date              513 __ $b
08   9.3.1       Beginning_Date                   Coverage.dateStart         513 __ $b
09   9.3.1       Ending_Date                      Coverage.dateEnd           513 __ $b
10   1.5.1.1     West_Bounding_Coordinate         Coverage.Box.westLimit     034 0_ $d
11   1.5.1.2     East_Bounding_Coordinate         Coverage.Box.eastLimit     034 0_ $e
12   1.5.1.3     North_Bounding_Coordinate        Coverage.Box.northLimit    034 0_ $f
13   1.5.1.4     South_Bounding_Coordinate        Coverage.Box.southLimit    034 0_ $g
14   1.6.1.1     Theme_Keyword                    Subject.Keyword            653 0_ $a
15   1.6.2.2     Place_Keyword                    Subject.Geographic         653 0_ $a
16   6.4.2.1.1   Format_Name                      Format                     856 $q
17   10.1.1      Contact_Person                                              270 $p
18   10.1.2      Contact_Organization                                        270 $q
19   10.3        Contact_Position                                            270 $q
20   10.4.1      Address_Type                                                270 $i
21   10.4.2      Address                                                     270 $a
22   10.4.3      City                                                        270 $b
23   10.4.4      State_or_Province                                           270 $c
24   10.4.5      Postal_Code                                                 270 $e
25   10.4.6      Country                                                     270 $d
26   10.5        Contact_Voice_Telephone                                     270 $k
27   10.6        Contact_Facsimile_Telephone                                 270 $l
28   10.8        Contact_Electronic_Mail_Address                             270 $m
29   10.9        Hours_of_Service                                            270 $r
30   1.7         Access_Constraints               Rights.Access              506 $a
31   1.8         Use_Constraints                  Rights.Use                 540 $a
32                                                Identifier.URL             856 $u
33   4.1.2.1.1   Map_Projection_Name              Coverage.Box.projection    255 $b
34   4.1.4.1     Horizontal_Datum_Name            Description                342 05 $a

4. Conclusion

The realities of mapping FGDC to MARC21 and Dublin Core standards are most clearly understood by examining the record and field length limits of the OCLC WorldCat system. It is our supposition that there are others who are interested in putting FGDC records into their local MARC21 library systems to increase the access points and availability of this valuable metadata. The whole notion of cooperative cataloging mandates that we look for least common denominators for our metadata standards. While some library automation systems do not impose the same kind of limits as WorldCat, it would be counterproductive to design individual crosswalks for each library vendor's system. It appears the CORC project's success will translate into a new way of storing metadata for OCLC over time. Given OCLC's leadership in the field, there is a good chance that the XML based record structure will be adopted by vendors. Completion of a migration away from MARC, however, considering the massive investment in equipment and training in libraries, is years in the future. Therefore, metadata conversion efforts ought to consider the OCLC WorldCat field and record length and number limitations as constants for now.
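The "cut or drop fields" step mentioned earlier can be made concrete with a small sketch. This is not the authors' C converter; it is a minimal Python illustration, assuming a converted record is represented simply as a list of (MARC tag, value) pairs and measuring lengths on field values only, with the WorldCat limits quoted in this article as constants.

    # Minimal sketch (not the authors' C converter): trim a converted
    # record to the OCLC WorldCat limits quoted above.
    MAX_FIELD = 1230    # maximum variable-field size in WorldCat
    MAX_RECORD = 4096   # maximum record length in WorldCat
    MAX_FIELDS = 50     # maximum number of variable fields

    def fit_to_worldcat(fields):
        # Truncate each oversized field, then drop trailing fields until
        # the total length fits. A production converter would choose which
        # fields to shorten (abstract, liability statement, process
        # description) rather than blindly dropping from the end.
        trimmed = [(tag, value[:MAX_FIELD]) for tag, value in fields[:MAX_FIELDS]]
        while trimmed and sum(len(v) for _, v in trimmed) > MAX_RECORD:
            trimmed.pop()  # drop the last field
        return trimmed

    record = [("245", "Title..."), ("520", "A very long abstract " * 200)]
    print(len(fit_to_worldcat(record)))  # both fields kept, abstract truncated

As the analysis in Table 1 shows, roughly three quarters of the NBII records would pass through some such trimming step on their way into WorldCat, which is why the CORC-side copy, free of these limits, remains the record of reference.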
One of the core issues we would like to highlight is the lack of a persistent URI for FGDC metadata. As the system is currently designed, an SGML version of the record is dumped into a Z39.50 database server. Each time the system re-indexes, the address for the record is changed. This design flaw embedded in the FGDC Clearinghouse model violates a core rule of networked information. No stronger statement of this is available than that made by Tim Berners-Lee of the World Wide Web Consortium: "The most fundamental specification of Web architecture, while one of the simpler, is that of the Universal Resource Identifier, or URI. The principle that anything, absolutely anything, "on the Web" should [be] identified distinctly by an otherwise opaque string of characters is core to the universality." [Berners-Lee 1998]. Until the problem of dynamic metadata locations is addressed, it will not be possible to create meta-metadata for the FGDC records on a large scale.

There are other problems with the Z39.50 FGDC Clearinghouse system, such as slow response time and unreliable search results. These are liabilities that cause the metadata searcher and creator to lose faith in the system, thus accelerating the need to export the metadata into other systems with better user interfaces. Solutions must take metadata maintenance into consideration.

An area ripe for empirical investigation is the study of what preferences and habits scientists have when searching FGDC metadata. Myke Gluck and Bruce Fraser, for example, have shown that the appearance or format of metadata records has a very large effect on the user's perception of relevance [Gluck 1998]. Another fruitful area of digital library research is to study the relationship between metadata and scholarly electronic journals. We believe FGDC metadata should be peer reviewed and included in the institutional reviews of scientists for promotion and tenure. More discussion and critical analysis is due. We hope our effort here will stimulate an exchange of ideas.

5. Notes

[1.] By way of background, Adam Chandler is a systems librarian, Dan Foley is a cataloger, and Alaaeldin M. Hafez is a computer scientist. Our library, the Energy and Environmental Information Resources Center (EE-IR Center), is a digital special library of text, numeric, and geospatial data. It was formed as a partnership between the National Wetlands Research Center (NWRC) of the U.S. Geological Survey, and the Center for Advanced Computer Studies of the University of Louisiana (CACS/ULL). Both partners are located in Lafayette, Louisiana. The EE-IR Center is funded by the Office of Scientific and Technical Information (OSTI) of the U.S. Department of Energy. The scope of the collection pertains to energy and the environment of Louisiana, especially the wetland areas of South Louisiana. An area of special interest is pollution and contamination of the Lower Mississippi Watershed and offshore in the Gulf of Mexico. For more information, see Foley 1999 [Foley 1999]. The EE-IR Center is located in the NWRC Library. Other Center personnel are NWRC Librarian Judy Buys and GIS Specialist Suzanne Harrison. The work presented in this paper is funded by U.S. Dept. of Energy Grant No. DE-FG02-97ER1220. The principal investigators for our digital library project under this grant are Dr. Vijay Raghavan, CACS/USL, and Gaye Farris, Branch Chief, Technical and Informatics Branch, NWRC.

[2.] We are grateful to Susan Stitt of NBII for supplying this data set to us.

[3.]
For readers unfamiliar with the MARC21 bibliographic format, the best introduction is "Understanding MARC Bibliographic: Machine Readable Cataloging" [Furrie 1998]. Throughout this paper, a three-digit number indicates a MARC tag for a particular MARC field. Fields have subfields $a, $b, $c, etc., where the dollar sign ($) is a subfield indicator. For example, the notation 856 $u refers to an Electronic Location and Access field (856) having a subfield ($u) that contains a Uniform Resource Locator (URL).

6. References

[Berners-Lee 1998] Berners-Lee, Tim. (1998). "Web Architecture from 50,000 feet." Retrieved 5 May 1999 from:
[CSDGM Image Map 1998] (1998). "An Image Map of the Content Standard for Digital Geospatial Metadata: Version 2, 1998 (FGDC-STD-001 June 1998)." Available at:
[CORC 1999] CORC -- Cooperative Online Resource Catalog. Available at:
[CorpsMet] United States. Army. Corps of Engineers. (1999). "CorpsMet." Available at the Corps' "Geospatial Data Clearinghouse Node" Web page:
[DC-7] 7th Dublin Core Metadata Workshop, October 25-27, 1999, Die Deutsche Bibliothek, Frankfurt am Main, Germany. Available at:
[DC 1999] Dublin Core Metadata Initiative. (1999). Available at:
[FGDC 1998] Federal Geographic Data Committee. (1998). "Content Standard for Digital Geospatial Metadata, Version 2, 1998." Available at:
[FGDC 1999] Federal Geographic Data Committee. (1999). "Metadata Elements Included in the Metadata Entry System." Retrieved 9 September 1999 from:
[Foley 1999] Foley, Dan. (1999). "Metadata in a Digital Special Library: the Energy and Environmental Information Resources Center in Lafayette, Louisiana." Journal of Southern Academic and Special Librarianship: 01 [iuicode: ]
[Furrie 1998] Furrie, Betty. (1998). "Understanding MARC Bibliographic: Machine Readable Cataloging." Fifth edition reviewed and edited by the Network Development and MARC Standards Office, Library of Congress. Available at:
[Gluck 1998] Gluck, Myke, and Bruce Fraser. (1998). "Usability of Geospatial Metadata or Space-Time Matters." Presented in the "Theory and Practice of the Organization of Image and Other Visuo-Spatial Data for Retrieval: From Indexing to Metadata" session, American Society for Information Science 1998 Annual Meeting, Pittsburgh, Pennsylvania, 25-29 October 1998.
[Iannella 1999] Iannella, Renato. (1999). "DC Agent Qualifiers: DC Working Draft, 12 November 1999." Available at:
[LC 1998] Library of Congress. Network Development and MARC Standards Office. (1998). "MARC 21: Harmonized USMARC and CAN/MARC." 22 October 1998. Available at:
[LC 1999] Library of Congress. Network Development and MARC Standards Office. (1999). "Dublin Core/MARC/GILS Crosswalk." Available at:
[Mangan 1997] Mangan, Elizabeth. (1997). "Crosswalk: FGDC Content Standards for Digital Geospatial Metadata to USMARC." Available at:
[MetaMaker] MetaMaker. (1999). U.S. Geological Survey. Available at:
[OCLC 1999] OCLC Online Computer Library Center, Inc. [home page] (1999).

7. Contact Information

Adam Chandler
Systems Librarian
Energy and Environmental Information Resources Center
University of Louisiana at Lafayette
700 Cajundome Blvd.
Lafayette, LA 70506
email: adam_chandler@usgs.gov
tel: 318-266-8697

Dan Foley
Metadata Librarian
Energy and Environmental Information Resources Center
University of Louisiana at Lafayette
700 Cajundome Blvd.
Lafayette, LA 70506
email: dan_foley@usgs.gov
tel: 318-266-8539

Alaaeldin M.
Hafez
Research Scientist
Center for Advanced Computer Studies
University of Louisiana at Lafayette
P.O. Box 44330, Conference Center Room 459
Lafayette, LA 70504-4330 USA
email: ahafez@cacs.usl.edu
tel: 318-482-5791

This work is supported in part by a grant from the U.S. Department of Energy (under grant No. DE-FG02-97ER1220).

Copyright © 2000 Adam Chandler, Dan Foley and Alaaeldin M. Hafez

DOI: 10.1045/january2000-chandler

work_pg22i2dn4fh3dkfznxt7d2atbe ----

Journal of Educational Media & Library Sciences, Vol. 27, No. 2 (Winter 1990)

PLANNING PROCESS AND CONSIDERATIONS FOR A STATEWIDE ACADEMIC LIBRARIES INFORMATION SYSTEM IN OHIO

Hwa-Wei Lee
Director of Libraries
Ohio University
Athens, Ohio, U.S.A.

Abstract

Academic libraries in Ohio have led in cooperative library automation, with the establishment of OCLC in 1967 as one example. Beyond OCLC, which provides online shared cataloging, interlibrary loan and the world's largest bibliographic database, many have developed or acquired local systems to meet the needs of individual libraries. A 1986 study by the state Board of Regents recommended development of an Ohio Libraries Information System (OLIS) which would permit students and faculty at any public university to have full access to the resources at any public university in the state. Beyond bibliographic access, the system emphasizes information delivery. This paper describes the planning process and considerations of the system, which will go to RFP in June 1989.

1. Ohio: The Birth Place of OCLC

Cooperation for automation and resource sharing among academic libraries, especially the state-supported university libraries, has been firmly established in Ohio since the 1960s. The most important accomplishment was the establishment of OCLC in 1967.
Originally, OCLC was the abbreviation for Ohio College Library Center, an entity founded by a group of academic libraries whose institutions were members of the Ohio College Association. Under the leadership of the Inter-university Library Council (IULC), an informal organization of the library directors of state-supported universities, initial funding was obtained from the Ohio Board of Regents, the planning and coordinating agency for all state-supported institutions of higher education. OCLC's success in creating a central bibliographic database of MARC (MAchine-Readable Cataloging) records to facilitate online, shared cataloging by participating libraries induced many other libraries to join. Within fifteen years, OCLC had become a multi-type library network. The membership had grown from 48 in 1967 to 2,934 in 1982, covering every state of the Union (Maciuszko, 1984: 17 & 219). The expanding membership caused OCLC to change its name and governance. Today, OCLC stands for the Online Computer Library Center. As of June 30, 1988, OCLC had 9,400 participating libraries of all types and sizes in 50 states and 23 other countries, with 17,748,222 bibliographic records, making it the world's largest bibliographic database. In 1987-88 alone, 21.9 million books and other materials were cataloged into the database, and 3.78 million transactions for interlibrary loans were handled (OCLC, 1988).

By the late 1970s and early 1980s, with the advances in minicomputer technologies, many libraries found it desirable to develop or acquire local library systems for other library functions not provided by OCLC. In 1988, there were 50 library systems vendors in the market (Walton & Bridge, 1988), most claiming to include a variety of integrated library functions. Additionally, many of these systems are capable of networking among a group of libraries on a local or regional basis. Even with local systems, most libraries still participate in OCLC for shared cataloging and interlibrary loans. In Ohio, for example, of the thirteen
T hose libraries with a local system allow the o t her libraries remote d ial-up access T h rough OCLC these tib raries all have acce鉛 to the bibliographic records of the o t hers; however, such records do no t in d icate the 三 l::IIf Juurnal 叫 Edurauon叫 M叫ia ~是 I 山r,or y S<:;們'ct" ~'7 : ::! (W,nl<:r 1蛇m number of copies in a given library nor circ叫ation status Information on serial holdings is o ft en incomplete or absent Further, OCLC's massive database does not yet allow for subje口, keyword , or boolean searching. Most local systems provide these capabilities 2. A New Initiative ln 1986, facing massive requests for new and enlarged library facilities on sta t e-supp o rt ed campuses, the stat e legislature man- dated that Ohio Board of Regen t s asses心 th e need for space by the uni versity libraries and possible alternatives. The Board created a seventeen-member Li brary Study Committee, cha ired by Dr Elainc Hairston , Vice Chance Jlor for Academic and Special Pr。 grams of the Board of Reg凹的, consisting of a university presi一 dent , a provost , two vice presidents , two dean s, two library dire ctors, a professo r, an OCLC researcher, a publisher, and four additiona! Board of Regents senio r staff officers. The Committee decided early in its d eliberations that its charge would require assessment o f “ the rol e of th e academic Ubrary . . . in its broadest con t emporary sense" and that it "should co nsider such oppor- tunities for improvmg the quality of libraries as might appear in the context of its considerations." (Ohio Board of Rege nts, 1987 vii) In its published report of th e year-Iong study (Ohio Boa rd of Regen t s, 1987), th e Committee felt that This wîder perspective is nec的sary because the academic library of today ha. a thl't'efold purp。紹 , scrving not only as a storehouse of information but also as a gatcway to information hcld e1scwhc悶, and as a center for instruction about information. (p 泊。 According旬 , the Committee's recommendations centered on ~~ Lee : Plann ll\l!; Pr<.><;c.s and Con~lderauon~ for Lihrarie. Jnfonn削ion Sy~'~'" 的l three broad a reas 的 eoUaboration. which en曲mpuses a 扇風阱。f iJlues JUch aJ曲.u. borative a呵Ulstnons. 仙ared ac僧U, and shared ston阱, 2) technology, including high density me.帥。r P"叫“tion such aJ the uisting micrororm and the emefiing compact disk 3)“ternatlve 5t。開阱. including 伽削。“ m.酬。ds of maintaining rarely u.sed malerials In a wareh間.se enVÎJonment_ (p_ vü) The principal recommendation for collaboration was to implement "as expeditiously as possible a statewide electronic catalog system" - the project , initially the Ohio Lib rary Access System (OLAS) was later named the Ohio Library Information System (O LlS)_ Co l1 ateral recommendations inc1 uded r etro- spective .conversion of remaining paper catalog records to MARC format , the development and implementation of a statewide delivery system for library materials, and a plan for a cooperative preservation program 3. 0h岫Library Jnfom旭tion System: The Ratio nale 500n after the release of the Committee Report , the Ohi。 Board of Regents acted to begin planning for a statewide electron ic Iibrary system_ They commissioned a feasibility study (RMG Consultan筒, 1988) and an evaluation of centralized vs. 
distributed approach目 to the statewide system (Hurley , 1988), established a steenng committee and three task forces (one each for systems managers, librarians, and users), held a working conference featuring reports of experts on multi-campus systems from seven different states, drafted a planning paper and held re&Îonal hearings, and prepared a “ Request for Information" (RFI) document. A chrono logy of events from the formation of the Lib間ry Study Committee to the issuance of the RFI is recorded in Table 1I 1:12 - • JOUrrlßl 01 Educ~tlo 叫伽如(ha & I.i[, rary Sci..n~~. '.!l : 2 (Wirllt. 1990) FalI 1986 Sept.1987 FalI I987 T.bl~ 11 O! ronology of Evcnb Llbrary Study Committee fonned by Ohio B個.rd of Regenu Library Smdy Committee report, p,旬greu Th rough Col IDbonu;o,爪 Sto'Qge, Qnd Technology , Isω.d Ohio Board of Regents 回mm.甜。ns a feasibill吵 sludy of statew而de syste m fro m RMG Associales W'"惚r 1987 Stee ring Committce 叩pointcd March 1988 Taslc: Forces (or sYltems managen, librarians, and usen established Apr.-Aug. 1988 Task Forces m叫. work loward p1anning documenl and July 1988 Summer 1988 RFI Board of Regenls receives capi叫 budgel appropriation 。($2.5 mUlion for planning Board of Rege nts eommisslons an evaiuation of cen tralized 'IS distributed app間)8ch to stalewide syslem Scpt. 1988 Co-d irectou ror pJa nning hired s.p' 油, 1988 Dtaft of planning paper cirωlo",d Sep t. 27一28 , 1988 Working Conference I in Columbul N肝 2 , 1988 Planning Pa pcr circulatcd '"阻c. 5-9 , 1閃9間8閻8 R.叩E斟."甜a刮J h.組artngs 0酬n the pl恥a Dec. 1 6 , 1 9呵8閻8 R連 FI dtaft eαU~吼仙‘u叫.1岫aled Feb. 3, 1989 RFI sent to ycndors Tabl e UI presents th e Projec ted Ti m etabl e of Future Actions T.bk m Projecud T血曲kof Ma戶Action. Apr. 15 , 1989 May 2-3 , 1989 June 15 , 19帥 3凶y-Aug . , 1989 Sepl. 4, 1989 5ep l. 15 , 1989 Scpt. 22, 1989 Dec. 1 間的 July 1 1990 Vendor responses 10 RFI due Working Conference 11 in Kent RFP sent to vcndofl Vendor demonstra包。.... RFP responses due Capital budget requeJt fo r 199品" ACling duector and initlai staff hlred Vendo rlsYltem se lectcd Capl叫 budget for 1990-92 aval1able 呵,、. 口可 • t.." I'lanninl! Proc.,,;s and Consid<,ral;ons for L;brar;c~ Informa1 ;on Sys!"m 133 Aug. 1, 1990 July 1, 1991 Operatin唱 budget for 1991-93 Operating budget for 1991-93 availablc First Ph ase of lmplementation Begins The planning Paper, issued on November 2 , 1989 (OLAS Steering Committec , 1988), was divided into the following sec tion s (;QaJ statement Need for an Ohio Library Information System Assumptions Governance issues 一 Tentati瞻仰吋ect timetable Because th e cu rren tJ y insta ll ed six different lo ca l systems at the nin e IULC Iibraries are not compatible, direct communi cation affiong them is impra ct ica1. OLIS will conn ect lo cal systems at the thirteen state universit i間, plus the two medi cal co l1eges. QLlS is co n ceived as a multi-dimensional information system which will integrate traditional ca talog and circulation functions for a state- wide system wîth a docu ment delîve ry service to make the infor matî on resour ces readtly available for users from each particîpating university and beyond The Ohio Board of Rege nts has e mphasized the importance of the system by incorporating OLlS into its Selective Excellence initiatives - nationally acc1aimed challenge grants to encourage outsta nd ing programs specia !! 
y funded by the State of Ohio A1though O LlS will dîrectly benefit the faculty , researchers and st udents of the stat e-suppo rted unîversîties initîally , th e sys t em will be available to all citize ns in Ohio and latcr may be expanded 10 include other institutions of high er learnin g an d othe r types of Iibraries The Planning Paper (OLAS Steering Committec, 1988: 4-5) identìfies the following reasons for creation of OLIS 同』 '" Joomal 01 Educ8 110nal M吋;8& [巾rary S<:icnc<'K 'n : 2 (W,"'... L9'JO) Acc:eu 10 the divefSt re50urces of IULC übrarles Enhanee interlibrazy I。帥 and intcr-祉utitutional borrowin4ι Cooperativc 四.JIection devclopment and management AcC<. S. The Road Ahead At the time of this writing (March 1989) , the Request for Inform ation (RF I) document has gone ou t to some 5日 vendors and interested parties. The responses are due on Apri1 15. In the meantime, the Task Forces a re working on functional speci日ca tions which will be in cJ uded in the Request For Proposa1 (REP) do cument to be 悠sued on June 1 S. Specia1ized consultative working conferences on the functional specifications are schedu led for late April and a second general working conference is schedul ed on May 2-3 to consider the vendor responses to the RFI and to finalize the RFP Although the fmal shape of OLIS is sti1l un cJear , al1 involved in the process are encouraged by the progress thus far and remain 。ptimistic about the future. Many questions rem到n , some of which will not be answered until the vendor and system have been ' 學f Lee : 1'1a,"""1,t Pr<65 are shown. The last two columns show the fractions fHEP and fcore of HEP and core articles, respectively [21]. Journal Publisher Ntot NHEP Ncore fHEP fcore Phys. Rev. D APS 2285 2101 1635 92% 72% JHEP SISSA/IOP 859 859 840 100% 98% Phys. Lett. B Elsevier 957 862 740 90% 77% Nucl. Phys. B Elsevier 522 481 465 92% 89% Phys. Rev. Lett. APS 3836 407 279 11% 7% Eur. Phys. J. C Springer 331 272 234 82% 71% Mod. Phys. Lett. A World Scientific 281 216 138 77% 49% Phys. Rev. C APS 853 298 136 35% 16% Class. Quant. Grav. IOP 491 255 89 52% 18% Int. J. Mod. Phys. A World Scientific 878 143 88 16% 10% J. Math. Phys. AIP 446 108 74 24% 17% J. Phys. A IOP 850 78 65 9% 8% Eur. Phys. J. A Springer 458 91 58 20% 13% JCAP SISSA/IOP 156 128 57 82% 37% J. Phys. G IOP 414 87 55 21% 13% Prog. Theor. Phys. IPAP 159 68 54 43% 34% Nucl. Phys. A Elsevier 692 92 51 13% 7% Gen. Rel. Grav. Springer 190 103 20 54% 11% Int. J. Mod. Phys. D World Scientific 160 97 18 61% 11% Nucl. Instrum. Meth. A Elsevier 1371 312 16 23% 1% Astropart. Phys. Elsevier 85 74 13 87% 15% Table 1: The most popular HEP journals and their publishers, together with the total number of articles published in 2005, Ntot; the number of HEP articles, NHEP; and the number of articles in the HEP core subject, Ncore. The journals are ordered in decreasing order of Ncore. Only journals with NHEP>65 are shown. The last two columns show the fractions fHEP and fcore of HEP and core articles, respectively. From reference [21]. As discussed in section 2, the vast majority of content of arXiv tagged as HEP, appeared in just six peer-reviewed journals from four publishers [8]. Five out of these six journals carry a majority of HEP content, as listed in table 1, these are: 1. Physical Review D (published by the American Physical Society); 2. Journal of High Energy Physics (SISSA/IOP); 3. Physics Letters B (Elsevier); 4. Nuclear Physics B (Elsevier); 5. European Physical Journal C (Springer). 
SCOAP3 aims to assist publishers in converting these "core" journals entirely to OA. As seen from the last column of table 1, these five "core" journals include up to 30% of articles beyond the core HEP topics, particularly in Nuclear Physics and Astroparticle Physics. These articles will also be included in the OA conversion of the journals. This is in the interest of the HEP readership and promotes the long-term goal of an extension of the SCOAP3 model to these related disciplines. The sixth journal, Physical Review Letters (American Physical Society), is a "broadband" journal, which carries only a small fraction (11%) of HEP content. SCOAP3 aims to sponsor the conversion to OA of this fraction on an article-by-article basis. A similar approach holds for another popular "broadband" journal in instrumentation: Nuclear Instruments and Methods in Physics Research A (Elsevier), which carries about 25% of HEP content.

These seven journals covered, in 2005, around 4'200 core HEP articles and about 5'300 articles in the wider HEP definition, including all related subjects. The conversion to OA of these five "core" journals and the HEP part of these two "broadband" journals would cover over 80% of the core HEP subjects and over 60% of the entire HEP literature, including all related subjects. The remaining 3'300 HEP articles, not published in the journals mentioned above, are scattered over some 140 other journals. It is important to note that the SCOAP3 model should not be limited to this set of journals, which the SCOAP3 report spotlights for the sake of clarity, but is open to all existing and future high-quality journals which carry HEP content, within budgetary limits.

An interesting fact, which underscores the necessity of a worldwide consensus for the implementation of the SCOAP3 model, is the geographical origin of HEP articles and the geographical distribution of their publication outlets. Two of the journals mentioned above, Physical Review D and Physical Review Letters, are published in the US and four, Journal of High Energy Physics, Nuclear Physics B, Physics Letters B, European Physical Journal C, are published in Europe. Figure 3 shows that US journals see an equal amount of literature coming from the US, Europe and the rest of the world. European journals see slightly more literature coming from Europe, but still more than half of their publications originate outside Europe [24]. Therefore, for a conversion to OA of HEP literature, which would benefit all authors worldwide, a conversion of all journals, independently of their country of publication, is necessary.

Figure 3: Origin of publications in US- and Europe-based journals. Co-authorship is taken into account on a pro-rata basis, assigning fractions of each article to the geographical region in which the authors are affiliated. CERN and its Member States are, to a wide extent, a representation of Europe. From reference [24].

5. FINANCIAL ASPECTS OF THE SCOAP3 MODEL

The price of an electronic journal is driven by the costs to run the peer-review system, by editorial costs for copy-editing and typesetting, by the cost for electronic publishing and access control, and by subscription administration. Some publishers today quote a cost, from reception to final publication, in the range of 1'000-2'000 Euros per published article [25]. This includes the cost of processing articles which are eventually rejected, the fraction of which varies substantially from journal to journal.
The annual budget for a transition of HEP publishing to OA can be estimated from this figure and the fact that the journals which cover a large fraction of the HEP literature publish about 5'000 articles per year: the annual budget for a transition of HEP publishing to OA would amount to a maximum of 10 million Euros per year.(1)

A "fair-share" scenario for the financing of SCOAP3 is to distribute these costs among all countries active in HEP, on a pro-rata basis, taking into account the size of the HEP author base of each country. The rationale behind this scenario is that SCOAP3 will be effectively covering the costs of the peer-review service and therefore the countries using this service more are naturally expected to contribute more. To cover publications from scientists from developing countries, which cannot be reasonably expected to contribute to the consortium at this time, an allowance of not more than 10% of the SCOAP3 budget is foreseen.

The size of the HEP author base in each country is estimated from a recent study [21,24] which considered all articles published in the years 2005 and 2006 in the five HEP "core" journals, Physical Review D, Physics Letters B, Nuclear Physics B, Journal of High Energy Physics and the European Physical Journal C, as well as those HEP articles published in the two "broadband" journals, Physical Review Letters and Nuclear Instruments and Methods in Physics Research A. A total sample of about 11'300 articles was considered and, for each of them, all authors were uniquely assigned to a given country, thereby assigning a fraction of each article to each country. Being an international laboratory, CERN was treated as an additional country. About 5% of the authors were found to have multiple affiliations, often in different countries, reflecting the intense cross-border collaborative tradition of HEP. In these cases, the ambiguity in the assignment of authors to countries was resolved by biasing a larger expenditure towards countries with a larger gross domestic product, as described in reference [24]. The results from this study are summarized in table 2 and figure 4.

Country               Share of HEP Scientific Publishing
United States         24.3%
Germany               9.1%
Japan                 7.1%
Italy                 6.9%
United Kingdom        6.6%
China                 5.6%
France                3.8%
Russia                3.4%
Spain                 3.1%
Canada                2.8%
Brazil                2.7%
India                 2.7%
CERN                  2.1%
Korea                 1.8%
Switzerland           1.3%
Poland                1.3%
Israel                1.0%
Iran                  0.9%
Netherlands           0.9%
Portugal              0.9%
Taiwan                0.8%
Mexico                0.8%
Sweden                0.8%
Belgium               0.7%
Greece                0.7%
Denmark               0.6%
Australia             0.6%
Argentina             0.6%
Turkey                0.6%
Chile                 0.6%
Austria               0.5%
Finland               0.5%
Hungary               0.4%
Remaining countries   3.7%

(1) Another indication which corroborates this estimate is that the costs to run a "core" journal such as Physical Review D amount to 2.7 million Euros per year [25], and it covers about a third of the HEP publication landscape [8].

Table 2: Contribution to the HEP scientific publishing of several countries. Co-authorship is taken into account on a pro-rata basis, assigning fractions of each article to the countries in which the authors are affiliated. The last cell aggregates contributions from countries with a share below 0.4%.
This study is based on all articles published in the years 2005 and 2006 in the five HEP "core" journals, Physical Review D, Physics Letters B, Nuclear Physics B, Journal of High Energy Physics and the European Physical Journal C, and the HEP articles published in two "broadband" journals, Physical Review Letters and Nuclear Instruments and Methods in Physics Research A. A total sample of about 11'300 articles is considered. From reference [21].

Figure 4: Contribution to the HEP scientific publishing of several countries. Co-authorship is taken into account on a pro-rata basis, assigning fractions of each article to the countries in which the authors are affiliated. A total of 11'300 articles published in the years 2005 and 2006 is considered, covering entirely the five HEP "core" journals, Physical Review D, Physics Letters B, Nuclear Physics B, Journal of High Energy Physics and the European Physical Journal C, and the HEP articles published in two "broadband" journals, Physical Review Letters and Nuclear Instruments and Methods in Physics Research A. Contributions from countries with a share below 0.8% are aggregated in the slice denoted as "Other Countries". From reference [21].

Three transitional aspects in the implementation of the SCOAP3 model are particularly relevant [21]:

1. Journal licence packages. In the case of "core" HEP journals which will be entirely converted to OA, and which are part of a large journal licence package, a contractual condition to the publishers will be to extract these titles from the package and to correspondingly reduce the subscription cost for the remaining part of the package.

2. Partially-converted journals. For "broadband" journals, where only the conversion of selected HEP articles is paid by SCOAP3, a contractual condition for publishers will be to reduce the subscription costs according to the fraction of content supported by SCOAP3. For journals of this kind that are part of a licence package, the reduction should be reflected in a corresponding reduction of the package subscription cost.

3. Multi-year subscription contracts. In the case of existing long-term subscription contracts between publishers and libraries or library consortia, a contractual condition for publishers will be to reimburse the subscription costs pertaining to OA journals or to the journal fractions converted to OA.
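To make the "fair-share" scenario of section 5 concrete, the following sketch (our illustration, not part of the SCOAP3 report) computes a country's indicative annual pledge from its authorship share in Table 2, assuming the 10 million Euro envelope and reading the 10% allowance for developing countries as a uniform surcharge on every contribution; other readings of the allowance are possible.

    # Illustrative fair-share computation (an assumption-laden sketch,
    # not taken from the SCOAP3 report itself).
    BUDGET_EUR = 10_000_000   # global budget envelope quoted in the text
    ALLOWANCE = 0.10          # assumed pro-rata surcharge for developing countries

    shares = {                # authorship shares from Table 2
        "United States": 0.243,
        "Germany": 0.091,
        "Greece": 0.007,
    }

    def pledge(country):
        # pledge = share of HEP authorship x envelope, plus the 10% allowance
        return shares[country] * BUDGET_EUR * (1 + ALLOWANCE)

    for country in shares:
        print(f"{country}: {pledge(country):,.0f} EUR/year")
    # United States: 2,673,000 EUR/year
    # Germany: 1,001,000 EUR/year
    # Greece: 77,000 EUR/year

Under this reading, the pledges reported in section 6 below (for example, the roughly 2.5 million Euros jointly covering Germany, Italy, France, CERN, Sweden and Greece) are consistent in magnitude with the Table 2 shares.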
At the time of writing, in early December 2007, partners from many European nations have pledged, over a few weeks, over ¼ of the budget envelope of SCOAP3: a total of 2.5 million Euros, covering the contribution of Germany, Italy, France, CERN, Sweden and Greece will be made available for the initiative. Intense discussions are underway within the remaining European countries, many of which are expected to join the consortium in the immediate future. Entities in Asia and in the United States are also considering the model, and signs of interest are appearing from leading libraries and library consortia in the United States. The situation is in continue evolution and can be monitored on the consortium website [26]. Once sufficient funds will have been pledged towards the establishment and the operation of SCOAP3, from partners around the world, a tendering process involving publishers of high- quality HEP journals will take place. Provided that the SCOAP3 funding partners are ready to engage into long-term commitments, most publishers are expected to be ready to enter into negotiations along the lines presented in this article. The outcome of the tendering process will allow the complete SCOAP3 budget envelope to be precisely known and will trigger the formal establishment of the consortium and the definition of its governance. Finally, contracts with publishers will be established in order to make Open Access publishing in High Energy Physics a reality, when the first experimental and theoretical publications of the LHC program, a watershed in the history of HEP research, will appear. It is important to remark how this process is driven by the author community: at the beginning of 2007, the large scientific collaborations giving the final touch to the particle detectors of unprecedented complexity to be used for discoveries at the LHC, counting a total of over 5’000 scientists, each voted, unanimously, a statement in support of OA publishing: “We strongly encourage the usage of electronic publishing methods for our publications and support the principle of Open Access Publishing, which includes granting free access of our publications to all. Furthermore, we encourage all collaboration members to publish in easily accessible journals, following the Open Access Paradigm.” [21] The conversion of the HEP scientific publishing to the OA paradigm, along the lines of the SCOAP3 model, will be an important milestone in the history of scientific publishing. The SCOAP3 model could be rapidly generalized to other disciplines and, in particular, to related fields such as Nuclear Physics or Astroparticle Physics and to all disciplines which are practiced by a compact and closely-knit community, with a limited number of publishing outlets, and a strong presence of learned-societies in the publishing process. REFERENCES [1] http://oa.mpg.de/openaccess-berlin/berlindeclaration.html [Last visited December 1st, 2007]. 12 [2] R. Voss et al., Report of the Task Force on Open Access Publishing in Particle Physics, CERN, 2006; http://library.cern.ch/OATaskForce_public.pdf [Last visited December 1st, 2007]. [3] Luisella Goldschmidt-Clermont, Communication Patterns in High-Energy Physics, High Energy Physics Libraries Webzine, issue 6, March 2002 http://library.cern.ch/HEPLW/6/papers/1/ [Last visited December 1st, 2007]. [4] http://arXiv.org [Last visited December 1st, 2007]. [5] Paul Ginsparg, Computers in Physics 8 (1994) 390. [6] http://www.slac.stanford.edu/spires [Last visited December 1st, 2007]. 
[7] The history of SPIRES is recounted in L. Addis, 2002; http://www.slac.stanford.edu/spires/papers/history.html [Last visited December 1st, 2007].
[8] S. Mele et al., JHEP 12(2006)S01; arXiv:cs.DL/0611130.
[9] http://publish.aps.org/FREETOREAD_FAQ.html [Last visited December 1st, 2007].
[10] http://www.elsevier.com/wps/find/authorshome.authors/physicslettersb [Last visited December 1st, 2007].
[11] http://www.epj.org/open_access.html [Last visited December 1st, 2007].
[12] http://jhep.sissa.it/jhep/docs/SISSA_IOP_OA_proposal.pdf [Last visited December 1st, 2007].
[13] http://www.iop.org/EJ/journal/-page=extra.9/NJP [Last visited December 1st, 2007].
[14] http://www.physmathcentral.com/pmcphysa/about [Last visited December 1st, 2007].
[15] http://www.hindawi.com/journals/ahep [Last visited December 1st, 2007].
[16] http://www.bentham.org/open/tonppj/index.htm [Last visited December 1st, 2007].
[17] Convention for the establishment of a European Organisation for Nuclear Research, Paris, 1953, Art. II.1.
[18] http://open-access.web.cern.ch/Open-Access/20050916.html [Last visited December 1st, 2007].
[19] http://indico.cern.ch/conferenceDisplay.py?confId=482 [Last visited December 1st, 2007].
[20] http://indico.cern.ch/conferenceDisplay.py?confId=7168 [Last visited December 1st, 2007].
[21] S. Bianco et al., Report of the SCOAP3 Working Party, CERN, 2007; http://scoap3.org/files/Scoap3WPReport.pdf [Last visited December 1st, 2007].
[22] S. Bianco et al., Executive Summary of the SCOAP3 Working Party Report, CERN, 2007; http://scoap3.org/files/Scoap3ExecutiveSummary.pdf [Last visited December 1st, 2007].
[23] http://www.soros.org/openaccess/read.shtml [Last visited December 1st, 2007].
[24] J. Krause, C.M. Lindqvist and S. Mele, Quantitative Study of the Geographical Distribution of the Authorship of High-Energy Physics Journals, CERN-OPEN-2007-014, CERN, 2007; http://cdsweb.cern.ch/record/1033099 [Last visited December 1st, 2007].
[25] M. Blume, Round table discussion: Policy Options for the Scientific Publishing System in FP7 and the European Research Area, Conference on Scientific Publishing in the European Research Area: Access, Dissemination and Preservation in the Digital Age, Brussels, 15-16 February 2007.
[26] http://scoap3.org [Last visited December 1st, 2007].
work_qq2rb6ol2jg5vpnhjvo3snncse ---- Information-Sharing Pipeline

UvA-DARE (Digital Academic Repository), University of Amsterdam. Final published version.

Citation (APA): Ilik, V., & Koster, L. (2019). Information-Sharing Pipeline. The Serials Librarian, 76(1-4), 55-65. https://doi.org/10.31219/osf.io/hbwf8; https://doi.org/10.1080/0361526X.2019.1583045

Information-Sharing Pipeline

Violeta Ilik and Lukas Koster, Presenters

ABSTRACT
In this article we discuss a proposal for creating an information-sharing pipeline/real-time information channel, where all stakeholders would be able to engage in exchange/verification of information about entities in real time. The entities in question include personal and organizational names as well as subject headings from different controlled vocabularies. Three World Wide Web Consortium–recommended protocols are considered as potential solutions: the Linked Data Notifications protocol, the ActivityPub protocol, and the WebSub protocol. We compare and explore the three protocols for the purpose of identifying the best way to create an information-sharing pipeline that would provide access to the most up-to-date information to all stakeholders.

KEYWORDS
Identity management; authority records; Linked Data Notifications; ActivityPub; WebSub; ResourceSync Framework Specification—Change Notification

Introduction

Our longstanding interest in this topic stems from the fact that there is no single place where one can find the most up-to-date information about creators, their institutions, and the organizations with which they are affiliated. There are many creators, institutions, and organizations without authority records in the Library of Congress Name Authority File. On the other hand, information about those same creators, institutions, and organizations exists in other data stores such as discipline-specific databases, vendor or publisher databases, databases developed by non-profit organizations, and many more.
In most cases these creators, institutions, and organizations have one or more identifiers assigned to them. We also know that information about creators, institutions, and organizations exists in institutional directory databases. In many cases these institutional databases feed into library discovery tools. However, institutional databases do not always synchronize all of this information with outside data stores held by publishers, vendors, non-profit organizations, or open source platforms. The information from all of these databases could be leveraged to support improved discoverability of relevant information about individual authors and institutions. Our article's primary focus is how to exchange and verify data in real time. Below we outline the characteristics of three World Wide Web Consortium (W3C) standards that could enable such an exchange.

Background

We discussed our original idea in a blog post on which we received feedback from experts in the infrastructure field. The current need to solve the identity management/authority control problem led us to initially name the system "Global Distribution System (GDS) system for authors' information exchange." As we discussed in the blog post, the system would be comprised of hubs where all stakeholders would engage in exchange/verification of information about authors/institutions/organizations.1 We imagined it as a decentralized system that joins together various software instances so that everyone would be able to see the activities in all of the hubs. Stakeholders in this case include individuals, libraries and other cultural heritage institutions, vendors, publishers, and identity providers, just to name a few. They have knowledge and experience in working with and/or developing various repositories (national or disciplinary), metadata aggregators, personal identifier systems, publisher platforms, library vendors, and individually or otherwise hosted profiles. We further discussed how the proposed solution to the present challenge is a shared information pipeline where all stakeholders/agents would have access to up-to-date information and also contribute their own. Figure 1 shows a simplified version of the information-sharing pipeline. As soon as someone updates a profile in a database, anyone interested in that profile could either pull the information or be notified through a subscription service that keeps subscribers informed of changes to a profile.

Interoperability and unique identification of researchers

Everyone wants to uniquely identify a researcher, starting with the researchers themselves, institutions, publishers, libraries, funding organizations, and identity management systems. The most interesting cases for researcher name disambiguation are the ones with common names.
In cases where we are dealing with last names of researchers of Chinese descent, for example, where the top fifty family names comprise 70% of China's population of over one billion, name disambiguation is especially difficult.2 We need to be able to pull all the information about researchers from various databases and obtain the best possible aggregated data. Interoperability and reuse of data need to be addressed, since we live in an age that has offered us unique opportunities. As Ilik noted, "these revolutionary opportunities now presented by digital technology and the semantic web liberate us from the analog based static world, re-conceptualize it, and transform it into a world of high dimensionality and fluidity. We now have a way of publishing structured data that can be interlinked and useful across databases."3

Figure 1. Information-sharing pipeline/real-time information channel.

The problem

Landscape

The OCLC Online Computer Library Center, Inc. (OCLC) Research Registering Researchers in Authority Files Task Group report looked into the landscape of systems that capture data about researchers. According to the report, "rather than waiting until sufficient information is available to create a national authority heading, librarians could generate a stub with an ID that others could augment or match and merge with other metadata."4 The stakeholders currently do not utilize the rich information about persons and corporate bodies that is available from various data sources in a dynamic way. The bibliographic utilities and the vendor community, which includes Integrated Library Systems, automated authority control vendors, contract cataloging vendors, and publishers, are not yet ready to offer a technical environment that enables us to control the name headings that come from linked data products and services. Some of this has changed already now that the Authority toolkit, developed by Gary Strawn at Northwestern University to work with OCLC Connexion Client, allows us to semi-manually search outside sources and add information to the Name Authority Record. Some of those sources include Wikidata, Wikipedia, OpenVIVO, the Getty vocabularies, the Virtual International Authority File, Medical Subject Headings, LC Linked Data Services, and GeoNames.5 However, we still cannot execute the control headings from those sources; we can only enhance the records with data from them. The solution may not be to try to work within closed and constrained systems, but rather to completely change the way we currently think about and apply the W3C standards and protocols to decouple data from closed systems in a decentralized web.

A decentralized web

The notion of the decentralized web has been the focus of a number of initiatives, publications, and projects in recent years. Most notable of these are Tim Berners-Lee's 2009 post "Socially Aware Cloud Storage,"6 the Solid project,7 Herbert Van de Sompel and associates' distributed scholarly communication projects such as ResourceSync,8 Open Annotation, and Memento, Ruben Verborgh's work on paradigm shifts for the decentralized web,9 Sarven Capadisli's dokie.li,10 and the Linked Data Notifications (LDN) protocol, a W3C recommendation.
The concept of the decentralized web basically consists of four principles: separation or decoupling of data and applications; control of and access to data by the owner; applications as views on the data; and exchange of information between data stores by means of notifications.

Decoupling of data and applications

The current standard situation on the web is that sites store their own and their users' data in a data store that is only accessible to the site's own applications, or via Application Programming Interfaces (API) through the site's own services. These data stores function as data silos that are only accessible via services provided and managed by the data and system owners. This is the situation with the aforementioned categories of systems that hold creators' data.

Berners-Lee states that there is a possible web architecture where applications can work on top of a layer of read-write data storage, serving as a commodity (a good that can be exchanged), independent of the applications. This decoupling of data and applications "allows the user to control access to their data," "allows the data from various applications to be cross-linked," and "allows innovation in the market for applications."11 Moreover, data are not lost when applications disappear, and vice versa. Persistence of data and of applications are independent of each other. In this architecture Uniform Resource Identifiers (URI) are used as names for users, groups, and documents; Hyper Text Transfer Protocol (HTTP) is used for data access; and application-independent single sign-on systems are used for authentication. Ruben Verborgh, who works with Berners-Lee on the Solid project, comes to a similar conclusion, but arrives there from the other side. He discusses three paradigm shifts to prepare for if we want to build web applications with a decentralized mindset: (1) End users become data owners, (2) Apps become views, (3) Interfaces become queries.12

Control of data and access

In a situation of decoupled data and applications, the service provider no longer controls the data itself nor access to the data. Instead the data owner can decide on the storage location, the reliability of the content, and the types of access granted to applications and other users. This improves privacy and control.13 Credentials and access are managed only once, in the data store or data pod, not multiple times in each application. Besides that, trust, provenance, and verification are distributed and are no longer derived from a single actor.

dokie.li, a client-side editor for decentralised article publishing, annotations, and social interactions, is an example implementation of where the concepts of decentralization and interoperability meet, toward an architecture where atomic data components are managed by their creators. A Solid-based server complements tools like dokie.li by using the same set of open web standards. In the context of creators' authority data, creators would be able to store, verify, and control their own data, coexisting with data pods controlled by other data providers, such as the current system categories described above. However, in practice the number of data pods containing overlapping, identical, or complementary creator data would probably decrease as applications indeed become mere views. Of course this will be a gradual development. For the remaining data stores a real-time information pipeline using notifications is essential.
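As a concrete, if simplified, reading of these two principles, the sketch below accesses a creator profile held in a hypothetical data pod over plain HTTP: any application can act as a view on the same data, while write access is granted by the owner's credentials rather than by an application vendor. The pod URL, the bearer token, and the choice of Turtle as the serialization are illustrative assumptions, not details from the article.

```python
# Minimal sketch of decoupled data and "applications as views" (assumed names).
import requests

POD_PROFILE = "https://pod.example.org/people/jdoe/profile"  # hypothetical pod
TOKEN = "owner-issued-token"  # from the owner's single sign-on / identity provider

def read_profile() -> str:
    # Any application may act as a read-only view on the owner's data.
    resp = requests.get(POD_PROFILE, headers={"Accept": "text/turtle"})
    resp.raise_for_status()
    return resp.text

def update_profile(turtle_doc: str) -> None:
    # Write access is controlled by the data owner, not by an application.
    resp = requests.put(
        POD_PROFILE,
        data=turtle_doc.encode("utf-8"),
        headers={"Content-Type": "text/turtle",
                 "Authorization": f"Bearer {TOKEN}"},
    )
    resp.raise_for_status()
```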
Applications as views

The logical consequence of decoupling data and applications is that the data can and must be viewed, (re)used, and managed using any number of applications of any type, including commercial, free, and open source. These applications then act as mere views on the data, because there is no direct and intrinsic relation between data and services in one system. The data service provider can also provide its own applications, as is the case with dokie.li and Solid, but other applications are not excluded. This allows innovation in the market for applications, as Berners-Lee and Verborgh say. For this scenario to be realized, the data must be FAIR (findable, accessible, interoperable, and reusable), and Open if applicable, to be decided by the data owner. For the creators' authority data ecosystem this would mean that library catalogue systems would directly access the data stores of all the categories of creator authority data systems mentioned above, as well as individual data pods managed by creators themselves.

Exchange of information via notifications

Now, the problem with applications acting as views on distributed data pods obviously lies in keeping track of where the required data reside and when the data are updated. Here notifications come into play. Instead of a data pull mechanism whereby various applications access all possible data stores in a decentralized web, a data push mechanism is the optimal way of sharing data. All individual data stores will have to notify all interested client applications of any updates. Berners-Lee mentions the need for a "publication/subscription" system for real-time notifications.14 Verborgh proposes a different perspective with his "interfaces become queries" paradigm shift.15 In the current situation, data-consuming applications send requests to a web API, a custom interface exposed by the data provider. This is not feasible in a decentralized web, where the many individual data pods will not have one standard interface. Instead, decentralized web applications should use declarative queries, which would be processed by client-side libraries translating these queries into concrete HTTP requests against one or multiple data pods. In a later presentation at ELAG [European Library Automation Group] 2018 elaborating on his post, Verborgh mentions LDN as the way to notify interested parties of data updates.16 He also predicts the transition from existing data aggregators to a network of caching nodes in the decentralized web. For the creators' authority data situation this would mean a decentralized network of caching nodes or hubs exchanging and synchronizing information via a publish/subscribe notifications system. One such protocol is LDN; others are WebSub and ActivityPub, which we discuss below. At the same ELAG 2018 conference, a presentation entitled "Pushing SKOS" was given by Felix Ostrowski about distributing controlled vocabularies containing authority data in general, using a publication/subscription model with LDN and WebSub.17 This idea shares similarities with the information-sharing pipeline for creators' data proposed here.

Who holds the data we need?

Data we are interested in are diffused among all of these stakeholders: individuals, libraries and other cultural heritage institutions, vendors, publishers, and identity providers.
What is becoming more and more obvious is that the technologies to implement some changes in how we do business are already available, but their adoption and implementation is taking time. The resistance is based, in our opinion, on fear of the unknown. Experts see the benefit, but unfortunately the stakeholders are comfortable with the status quo and are not receptive to this big change. We still want to control our systems, re-entering the same data over and over again, when in fact what we should do is remember Verborgh's paradigm shift and enter data only once. In order to build an inclusive playground and work together to exchange/verify information about entities in real time, we need to break down the walls. Once we manage to do that, all stakeholders can engage in the exchange/verification of information about entities. Sarven Capadisli noted that the social paradigm shift is 25-50 years behind the technical shift, and this has presented significant obstacles for the efforts to decentralize the web.18

Existing standards/protocols

We will look at similarities and differences between three W3C protocols, LDN, ActivityPub, and WebSub, for the purpose of identifying which one fits the use case of creating an information-sharing pipeline that will enable all stakeholders to have access to the most up-to-date information.

Linked Data Notifications

The design principles of LDN are that "data on the Web should not be locked in to particular systems or be only readable by the applications which created it. Users should be free to switch between applications and share data between them."19 Applications generate notifications about activities, interactions, and new information, which may be presented to the user or processed further. LDN supports autonomy, so any resource can have an Inbox anywhere; it gives an identifiable unit, meaning that notifications have URIs; it is reusable in the sense that a notification can contain any data and can use any vocabulary; and it maintains a separation of concerns among sender, consumer, and receiver.20 According to the W3C, "Linked Data Notifications is a protocol that describes how servers (receivers) can have messages pushed to them by applications (senders), as well as how other applications (consumers) may retrieve those messages. Any resource can advertise a receiving endpoint (Inbox) for the messages. Messages are expressed in RDF [Resource Description Framework], and can contain any data," as noted on the W3C site and in the Capadisli et al. paper.21,22 This allows for more modular systems, which decouple data storage from the applications that display or otherwise make use of the data. As previously mentioned while describing the decentralized web and the Solid project, decoupling data from applications is an important and necessary step toward a decentralized web. The protocol is intended to allow senders, receivers, and consumers of notifications, which are independently implemented and run on different technology stacks, to seamlessly work together, contributing to the decentralization of our interactions on the web. This specification enables the notion of a notification as an individual entity with its own URI. As such, notifications can be retrieved and reused. It is important to remember that the LDN protocol is a simple protocol for delivery only. It is used to publish notifications on the web to another user that the receiver has not explicitly asked for.23 There is no subscription option.
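A minimal sketch of the sender role follows, consistent with the protocol as described here and with the Inbox discovery mechanism detailed in the next paragraph: the sender finds the target's Inbox from the HTTP Link header (rel "http://www.w3.org/ns/ldn#inbox") and POSTs one JSON-LD notification to it. The target URL and the notification contents are hypothetical.

```python
# Sketch of an LDN sender: discover the Inbox, push one notification.
import requests

TARGET = "https://repo.example.org/records/123"  # hypothetical resource

def discover_inbox(target: str) -> str:
    resp = requests.head(target)
    resp.raise_for_status()
    # requests parses Link headers into resp.links, keyed by their rel value.
    return resp.links["http://www.w3.org/ns/ldn#inbox"]["url"]

def send_notification(target: str) -> None:
    notification = {  # any RDF payload is allowed; JSON-LD is shown here
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "object": target,
        "summary": "Authority data for this creator was revised.",
    }
    resp = requests.post(discover_inbox(target), json=notification,
                         headers={"Content-Type": "application/ld+json"})
    resp.raise_for_status()  # the receiver typically answers with the new
                             # notification's own URI, so it can be reused
```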
No further specification of the payload is needed; it is a Pull approach for retrieval of notifications from an Inbox. Senders and consumers discover a resource's Inbox Uniform Resource Locator (URL) through a relation in the HTTP Link header or in the body of the resource (see Figure 2). LDN completely decouples the three roles that an application (actor) can perform: sender, receiver, consumer. Those applications can be different, and, as mentioned before, an application can have more than one role.

Figure 2. Overview of Linked Data Notifications.

ActivityPub

The W3C describes the ActivityPub protocol as "a decentralized social networking protocol based upon the ActivityStreams 2.0 Terms. It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content."24 The W3C further defines the features of ActivityPub and describes the two layers it provides:

- A server to server federation protocol (so decentralized websites can share information)
- A client to server protocol (so users, including real-world users, bots, and other automated processes, can communicate with ActivityPub using their accounts on servers, from a phone or desktop or web application or whatever)

In ActivityPub, a user is represented by "actors" via the user's accounts on servers. The core Actor Types include: Application, Group, Organization, Person, and Service. Every Actor has:

- An inbox: how they get messages from the world
- An outbox: how they send messages to others

In the client-to-server setting, notifications are always published to the sender's outbox. Actors wishing to receive a sender's notifications must send a request to that sender's outbox. In the federated server setting, notifications are published directly to the receiving actor's inbox. For the purposes of our case, ActivityPub plays well with the idea of each system keeping its data intact while sharing the relevant information with other systems that relate to each actor type (people, corporate bodies). However, subscriptions are only available for federated servers. In ActivityPub the sender must be aware of the receiving federated servers in a Followers list. On the other hand, non-federated server actors must be aware of all senders. ActivityPub specializes LDN as the mechanism for delivery of notifications by requiring that payloads are ActivityStreams 2.0. Inbox endpoint discovery is the same. LDN receivers can understand requests from ActivityPub federated servers, but ActivityPub servers cannot necessarily understand requests from generic LDN senders. It is important to note that ActivityPub reuses LDN's Inbox mechanism. This means that tools are not only interoperable over sets of standards, but some of these standards themselves are designed to work with each other out of the box. Another important fact that positions LDN as a great solution for the problem we discuss is that any Linked Data Platform (LDP) implementation is an LDN Receiver out of the box. LDP "defines a set of rules for HTTP operations on web resources, some based on RDF, to provide an architecture for read-write Linked Data on the web."25 Fedora Commons,26 a flexible, modular, open source repository platform with native linked data support, is an example of a service that passes the LDP conformance test.27 Figure 3 shows the simple flow of the data.

Figure 3. ActivityPub: illustration of data flow (https://www.w3.org/TR/activitypub/illustration/tutorial-2.png).
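The sketch below shows what the pieces just described might look like on the wire: a hypothetical actor document of type Person with its inbox/outbox pair, and an Update activity that a client could POST to its own outbox for the server to federate. All identifiers are invented for illustration.

```python
# Sketch of ActivityPub documents as plain JSON-LD (hypothetical URLs).
actor = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Person",
    "id": "https://hub.example.org/actors/jdoe",
    "name": "J. Doe",
    "inbox": "https://hub.example.org/actors/jdoe/inbox",    # gets messages
    "outbox": "https://hub.example.org/actors/jdoe/outbox",  # sends messages
}

# An activity the client posts to its own outbox; the server then delivers
# it to the inboxes of the actors that follow this one (federated setting).
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Update",
    "actor": actor["id"],
    "object": {
        "type": "Profile",
        "id": "https://hub.example.org/actors/jdoe/profile",
        "summary": "Affiliation changed",
    },
}
```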
WebSub

WebSub provides a common mechanism for communication between publishers of any kind of web content and their subscribers, based on HTTP webhooks. Subscription requests are relayed through hubs, which validate and verify the request. Hubs then distribute new and updated content to subscribers when it becomes available. The W3C document defines the important terms used in WebSub, described below.28

- A subscriber is an entity (person or application) that wants to be notified of changes on a topic.
- The topic is the unit of content that is of interest to subscribers, identified by a resource URL. The topic is owned by a publisher.
- The publisher notifies one or more hubs, servers that implement both the sending and receiving protocols.
- The hub notifies all subscribers that have a subscription to a specific topic.
- A subscription is a unique key consisting of the topic URL and the subscriber's callback URL.

In WebSub a subscriber does not have to be aware of all publishers that own the topics of interest. Also, a publisher does not have to be aware of all interested subscribers. Here the hub is a caching node in the decentralized web. WebSub would provide an environment where each party could post its evolving version of a description on a channel to which all parties subscribe. Each party would be able to gather the information it needs from that channel. In this environment there is no central/correct/unique version of the data. There are many versions that are informed by work being done in different institutions/applications that manage and use identity information. This is a real-time information channel fed by and consumed by institutions and applications that manage and use identity information. The high-level protocol flow is shown in Figure 4, taken from the W3C site; a sketch of the subscriber's side of this flow follows below.

ResourceSync Framework Specification—Change Notification is based on WebSub. ResourceSync Change Notifications can be used to create/update/delete links when information about a new or updated description is sent via the URI of the description. The nature of the change (create, update, or delete) and the associated URI are sent through Change Notification Channels29 as Change Notifications.30 These notifications "are sent to inform Destinations about resource change events, specifically, when a Source's resource that is subject to synchronization is created, updated, or deleted," as described in the ResourceSync Framework Specification (ANSI/NISO Z39.99-2017).31 The ResourceSync Change Notification specification describes an additional, push-based capability that a Source can support. It is aimed at reducing synchronization latency and entails a Source sending notifications to subscribing Destinations. The push-based notification capability is aimed at decreasing the synchronization latency between a Source and a Destination that is inherent in the pull-based capabilities defined in the ResourceSync core specification. In order to implement the publish/subscribe paradigm, WebSub introduces a hub that acts as a conduit between Source and Destination. A hub can be operated by the Source itself or by a third party. It is uniquely identified by the hub URI. WebSub's topic corresponds to the notion of channel used in this specification. A topic is uniquely identified by its topic URI.
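Below is a minimal sketch of the subscriber side of this handshake, using the hub.mode/hub.topic/hub.callback parameters that WebSub defines; the hub, topic, and callback URLs are hypothetical placeholders.

```python
# Sketch of a WebSub subscriber requesting a subscription at a hub.
import requests

HUB = "https://hub.example.org/"                       # hub URI
TOPIC = "https://source.example.org/people/jdoe"       # topic URI (the channel)
CALLBACK = "https://destination.example.org/callback"  # subscriber endpoint

def subscribe() -> None:
    # Subscription requests are form-encoded POSTs relayed through the hub.
    resp = requests.post(HUB, data={
        "hub.mode": "subscribe",
        "hub.topic": TOPIC,
        "hub.callback": CALLBACK,
    })
    resp.raise_for_status()  # hub accepts the request, then verifies intent

# The hub then verifies intent with
#   GET {CALLBACK}?hub.mode=subscribe&hub.topic=...&hub.challenge=...
# and the subscriber must echo hub.challenge back with a 200 response.
# Afterwards every change to the topic is distributed as a POST to CALLBACK
# whose body carries the updated content (e.g., a revised description).
```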
Hence, per set of resources, the Source has a dedicated topic (and hence topic URI) for change notifications.

Figure 4. WebSub flow diagram (https://www.w3.org/TR/websub/#x2-high-level-protocol-flow).

The WebSub protocol and ResourceSync Change Notification work well together. As shown in Figure 5, the Source submits notifications to the Hub, the Destination subscribes to the Hub to receive notifications, the Hub delivers notifications to the Destination, and the Destination unsubscribes from the Hub. It is important to remember that WebSub has a higher bar to send/receive, and it specializes in subscriptions.

Conclusion and recommendation

All three protocols have the capacity to provide a technical solution to the problem of creators' authority data being unavailable to all stakeholders. As previously mentioned, the holdup is the simple yet hard social paradigm shift that lags behind the technological shift. Organizations that manage identity information (OCLC, ORCID, the Library of Congress, DBpedia, WikiData, to name a few) need to come together to deploy an information-sharing pipeline. Consumers of such data need to be involved, including higher education and library system vendors. One obvious benefit would be that all stakeholders would be able to look up information about authors in one place, since the information that comes from various data sources would be synchronized. Next, even if the information about a person is stored on a personal server, a copy of the data can always be found in the Information-Sharing Hubs or in other data pods and would not be lost after the person, for any reason, ceases to maintain the personal server. All organizations that manage identity information (Library of Congress, OCLC, ORCID, DBpedia, WikiData, libraries, museums, archives, library system vendors) should have a clear interest in deploying an information-sharing pipeline. The most important motivator for all of these organizations to agree on an information-sharing pipeline is that all of them would need to work with only one information exchange method, which saves everyone time and money while enhancing information accuracy.

Figure 5. ResourceSync change notifications—WebSub as transport protocol: HTTP interactions between Source, Hub, and Destination (http://www.openarchives.org/rs/notification/1.0.1/notification).

Acknowledgments

Sarven Capadisli [http://csarven.ca/#i]; Herbert Van de Sompel [http://public.lanl.gov/herbertv/]; PCC Task Group on Identity Management in NACO [https://www.loc.gov/aba/pcc/taskgroup/PCC-TG-Identity-Management-in-NACO-rev2018-05-22.pdf].

Notes

1. Violeta Ilik, "Real Time Information Channel," Violeta's Blog, October 19, 2017, https://ilikvioleta.blogspot.com/2017/10/real-time-information-channel.html (accessed June 11, 2018).
2. Cultural Diversity: A Resource Booklet on Religious and Cultural Observance, Belief, Language and Naming Systems (London: HM Land Registry), archived from the original (PDF) on January 13, 2006, https://web.archive.org/web/20060113025139/http://www.diversity-whatworks.gov.uk:80/publications/pdf/hmlandregistryculturaldiversity.pdf (accessed January 4, 2019).
3. Violeta Ilik, "Cataloger Makeover: Creating Non-MARC Name Authorities," Cataloging & Classification Quarterly 53, no. 3-4 (2015): 382-98, doi:10.1080/01639374.2014.961626.
4.
Karen Smith-Yoshimura et al., Registering Researchers in Authority Files (Dublin, Ohio: OCLC Research, 2014), 23, https://www.oclc.org/content/dam/research/publications/library/2014/oclcresearch-registering-researchers-2014.pdf (accessed January 4, 2019).
5. Gary L. Strawn, "Authority Toolkit: Create and Modify Authority Records," http://files.library.northwestern.edu/public/oclc/documentation/ (accessed July 29, 2018).
6. Tim Berners-Lee, "Socially Aware Cloud Storage," August 17, 2009, https://www.w3.org/DesignIssues/CloudStorage.html (accessed July 29, 2018).
7. The Solid Project, "What is Solid?" 2017, https://solid.mit.edu/ (accessed July 29, 2018).
8. Open Archives Initiative, "ResourceSync Framework Specification – Table of Contents," February 22, 2017, http://www.openarchives.org/rs/toc (accessed July 29, 2018).
9. Ruben Verborgh, "Paradigm Shifts for the Decentralized Web," December 20, 2017, https://ruben.verborgh.org/blog/2017/12/20/paradigm-shifts-for-the-decentralized-web/ (accessed July 29, 2018).
10. Sarven Capadisli, dokie.li, https://dokie.li/ (accessed July 29, 2018).
11. Berners-Lee, "Socially Aware Cloud Storage," August 17, 2009.
12. Verborgh, "Paradigm Shifts," December 20, 2017.
13. Ibid.
14. Berners-Lee, "Socially Aware Cloud Storage," August 17, 2009.
15. Verborgh, "Paradigm Shifts," December 20, 2017.
16. Ruben Verborgh, "The Delicate Dance of Decentralization and Aggregation," International Conference of the European Library Automation Group (ELAG), June 5, 2018, http://slides.verborgh.org/ELAG-2018/#inbox (accessed July 29, 2018).
17. Felix Ostrowski and Adrian Pohl, "Pushing SKOS," International Conference of the European Library Automation Group (ELAG), June 6, 2018, http://repozitar.techlib.cz/record/1241/files/idr-1241_1.pdf (accessed July 29, 2018).
18. Sarven Capadisli, "Enabling Accessible Knowledge," CeDEM 2015, Open Access (Danube University Krems, 2015), http://csarven.ca/presentations/enabling-accessible-knowledge/?full#trouble-in-paradigm-shifts (accessed July 29, 2018).
19. Sarven Capadisli, "Linked Data Notifications," Scholastic Commentaries and Texts Archive (SCTA) (Basel, June 6, 2018), http://csarven.ca/presentations/linked-data-notifications-scta/ (accessed July 29, 2018).
20. Ibid.
21. World Wide Web Consortium, "Linked Data Notifications," May 2, 2017, https://www.w3.org/TR/ldn/ (accessed July 29, 2018).
22. Sarven Capadisli et al., "Linked Data Notifications: A Resource-Centric Communication Protocol" (14th International Conference, Extended Semantic Web Conference (ESWC), Portorož, Slovenia, 2017), http://csarven.ca/linked-data-notifications (accessed July 28, 2018).
23. Ibid.
24. World Wide Web Consortium, "ActivityPub," January 23, 2018, https://www.w3.org/TR/activitypub/ (accessed July 29, 2018).
25. World Wide Web Consortium, "Linked Data Platform 1.0," February 26, 2015, https://www.w3.org/TR/ldp/ (accessed August 29, 2018).
26. DuraSpace, "Fedora," https://duraspace.org/fedora/ (accessed August 29, 2018).
27. W3C Working Group Note, "Linked Data Platform Implementation Conformance Report," December 2, 2014, https://dvcs.w3.org/hg/ldpwg/raw-file/default/tests/reports/ldp.html (accessed August 29, 2018).
28. World Wide Web Consortium, "WebSub: Definitions," January 23, 2018, https://www.w3.org/TR/websub/#definitions (accessed July 29, 2018).
29. Open Archives Initiative, "ResourceSync Framework Specification—Change Notification: Notification Channels," July 20, 2017, http://www.openarchives.org/rs/notification/1.0.1/notification#NotificationChannels (accessed July 29, 2018).
30. Open Archives Initiative, "ResourceSync Framework Specification—Change Notification," July 20, 2017, http://www.openarchives.org/rs/notification/1.0.1/notification#ChangeNoti (accessed July 29, 2018).
31. Open Archives Initiative, "ResourceSync Framework Specification (ANSI/NISO Z39.99-2017)," February 2, 2017, http://www.openarchives.org/rs/1.1/resourcesync (accessed July 29, 2018).

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes on contributors

Violeta Ilik, Head of Digital Collections and Preservation Systems, Columbia University Libraries.

Lukas Koster, Library Systems Coordinator, University of Amsterdam.
work_qrkpjrrysbdyhaucql5ww47ouy ---- Reports Generation with Koha Integrated Library System (ILS): Examples from Bowen University Library, Nigeria | Semantic Scholar

DOI: 10.4018/IJDLS.2015070102. Corpus ID: 38511545.

Adekunle P. Adesola, Grace Olla, Roseline Mitana Oshiname, and A. Tella, "Reports Generation with Koha Integrated Library System (ILS): Examples from Bowen University Library, Nigeria," Int. J. Digit. Libr. Syst. 5 (2015): 18-34.

Abstract: The paper showcases various library house-keeping reports that can be generated effortlessly using Koha ILS. Examples of reports generated in Bowen University Library include Circulation, Acquisitions and Cataloguing/Classification reports. Circulation activity reports like user registration, patron category, overdue payments, item issue, returns and renewals are showcased. Acquisition reports highlighted include acquisitions by purchase and donation, expenditure on acquisitions and also by…

Topics: Koha; Registered user; Library (computing)
work_qwinipqezbfmndy3of43kj3uma ---- American Archivist / Vol. 51 / Winter and Spring 1988

Commentaries & Case Studies

DEAN DeBOLT and JOEL WURL, Editors

The Commentaries and Case Studies department is a forum for sharply focused archival topics that may not require full-length articles. Commentaries and Case Studies articles generally take the form of analyses of archivists' experiences implementing archival principles and techniques within specific institutional settings, or short discussions of common theoretical, methodological, or professional issues. Members of the Society and others knowledgeable in areas of archival interest are encouraged to submit papers for consideration. Papers should be sent to Managing Editor, The American Archivist, Society of American Archivists, 600 S. Federal, Suite 504, Chicago, IL 60605.

The NHPRC Data Base Project: Building the "Interstate Highway System"

RICHARD A. NOBLE

The second edition of the National Historical Publications and Records Commission's Directory of Archives and Manuscript Repositories in the United States will be published in the spring of 1988.1 The volume describes the manuscript, archival, and special media holdings of 4,200 institutions, a 50 percent increase over the first edition. The Directory is the only single-volume reference work covering the nation's archival and manuscript institutions. Yet it began as part of a much more ambitious project that has an interesting and significant history.

Chronological Overview

The Directory effort began in 1951 with a decision by the National Historical Publications Commission to compile a central register of manuscript collections. The purpose was to improve administration of the commission's documentary publications program.

1The Oryx Press of Phoenix, Arizona, is publisher of the Directory's second edition. The author, archivist with the Machine-readable Branch of the National Archives and Records Administration, previously worked for the National Historical Publications and Records Commission as a grants analyst and an editor of the Directory of Archives and Manuscript Repositories. This article is a revised version of a paper presented at the 1987 annual meeting of the Society of American Archivists.
Hamer, 'The Oryx Press of Phoenix, Arizona, is publisher of the Directory's second edition. The author, archivist with the Machine-readable Branch of the National Archives and Records Administration, previously worked for the National Historical Publications and Records Commission as a grants analyst and an editor of the Directory of Archives and Manuscripts Repositories. This article is a revised version of a paper presented at the 1987 annual meeting of the Society of American Archivists. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.51.1-2.l9h2173177m 12827 by C arnegie M ellon U niversity user on 06 A pril 2021 Commentaries and Case Studies 99 gathered collection-level data in the 1950s. The result was the commission's A Guide to Archives and Manuscripts in the United States, which was published in 1961 and covered 1,300 repositories nationwide.2 The Library of Congress's National Union Cat- alog of Manuscript Collections (NUCMC) also was first prepared during the late 1950s, and its first volume was published in 1962, a year after the Hamer Guide? After deciding in 1974 to revise the Hamer Guide, the commission quickly recognized that the amount of repositories and records holdings had grown dramatically in the thirteen years since publication of Hamer's volume, making a full update of Hamer's collection-level information prohibitively time consuming and unwieldy. The com- mission instead decided to prepare a repo- sitory-level directory and to compile the collection-level data over a longer time pe- riod. The Directory was intended to pro- vide an overview of the nation's historical records repositories—a matter of increased interest to the commission when its records grant program was inaugurated in 1975.4 The commission chose to automate pro- duction of the Directory so that future edi- tions could be produced more easily and selected the SPINDEX computer program package for this purpose. SPINDEX, which stands for Selective Permutation Indexing, was developed in the late 1960s primarily by the Library of Congress and the Na- tional Archives, initially as a means of in- dexing finding aids. By 1975 it was able to generate descriptive guides with sophis- ticated indexes in an attractive typescript format. SPINDEX, however, had the dis- advantage of being an off-line, batch mode system.5 Starting in 1976, the commission's Di- rectory staff canvassed 10,000 repositories by mail and telephone. Published in 1978, the Directory described the holdings of 2,700 repositories—over twice as many as ap- peared in the Hamer Guide. But, unlike the Hamer Guide, the Directory provided only repository-level information. Reviews of the Directory from the archival and library community were almost uniformly pos- itive.6 The Directory was conceived to be part of the commission's plan for a national col- lection-level data base on archives and manuscripts. The data base was designed to use the same SPINDEX programs, stan- dard descriptive elements, and thesaurus of index terms as the Directory. Due to the flexibility of the SPINDEX programs, the data base could generate a variety of rec- ords guides, such as to specific states, re- gions, types of repositories, or types of records.7 The commission staff planned that the data base would be implemented in a piece- 2Frank G. Burke, " A Proposal for Revision of A Guide to Archives and Manuscripts in the United States," unpublished, 15 August 1975, 1-2; Philip M. 
Hamer, ed., A Guide to Archives and Manuscripts in the United States (New Haven: Yale University Press, 1961). 'Library of Congress, National Union Catalog of Manuscript Collections 1959-61 (Ann Arbor: J. W. Edwards, 1962). Subsequent editions have been as follows: 1962 and Index 1959-1962 (Hamden, Conn.: Shoe String Press, 1964), and 1963-1985 (Washington: Library of Congress, various dates). 4With the inauguration of the records program, the National Historical Publications Commission (NHPC) was redesignated the National Historical Publications and Records Commission (NHPRC). Larry J. Hackman, Nancy Sahli, and Dennis A. Burton, "The NHPRC and a Guide to Manuscript and Archival Materials in the United States," American Archivist 40 (April 1977): 201-03. ^SPINDEX II: Report and Systems Documentation (Washington: National Archives and Records Service, 1975), 1-7; Hackman, Sahli, and Burton, "The NHPRC and a Guide," 203-04. ''Directory of Archives and Manuscript Repositories in the United States (Washington: National Historical Publications and Records Commission, 1978). Examples of favorable reviews include: " ' A ' rating—important for even a small basic reference collection in this subject" (Wilson Library Bulletin 65 [April 1979]: 583, 587- 88); "Most academic libraries will need at least one copy" (Choice 16 [July/August 1979]: 652); "The directory is a gold mine of information for scholars, researchers, and other clientele of academic and research libraries" (Booklist 75 [1 July 1980]: 1627). 7Hackman, Sahli, and Burton, "The NHPRC and a Guide," 204-05. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.51.1-2.l9h2173177m 12827 by C arnegie M ellon U niversity user on 06 A pril 2021 100 American Archivist / Winter and Spring 1988 meal but methodical manner. In 1977 the staff described the data base concept as fol- lows: "Like the interstate highway system, as each little bit is completed it can be used. Someday all of the pieces will be there, integrated into a single whole."8 To initiate the national data base, the commission from 1977 to 1982 provided over $1 million in grant funding for several descriptive projects. These projects de- scribed historical records, using standard data elements, for inclusion in the national data base. There were three NHPRC-funded statewide survey projects—in Kentucky, New York, and Washington. These proj- ects gathered collection-level and some- times series-level data from all known repositories by means of site visits. The commission also funded a cooperative de- scriptive project by three midwestern state archives—Illinois, Minnesota, and Wis- consin. In 1980 the commission issued a report on the feasibility of creating the national data base. The report projected that almost 20,000 repositories and over 700,000 col- lection descriptions would be included in the national data base. (It is interesting to note that, today, after only three years of data entry, the Research Libraries Infor- mation Network (RLIN) data base already includes over 100,000 archives and man- uscript descriptions.) The NHPRC report projected that creation of the data base would cost $15 million, over $10 million of which would come from the commission. The re- port acknowledged that a large increase would be needed in the commission's es- tablished records grant budget of $2 million a year.9 By 1982 several factors had led the com- mission to abandon the national data base. 
Of greatest importance were the dramatic cuts in the commission's grant funding and staffing implemented at the beginning of the Reagan administration. Another factor was that the General Services Administration and National Archives stopped supporting the SPINDEX programs and, in particular, did not develop promised new programs to permit more efficient manipulation of data. Moreover, the batch mode requirements of SPINDEX made it less desirable than the on-line descriptive systems offered by the Research Libraries Information Network (RLIN) and On-line Computer Library Center (OCLC). Because of these problems, the commission in 1982 decided to limit the data base's goals to producing the second edition of the Directory. Now that the second edition is a reality, the commission is considering contracting with a publisher to update the data periodically and to publish future editions.

Results and Impact

Though its original primary objectives were not attained, the NHPRC data base played a role in the development of the USMARC Archival and Manuscripts Control (AMC) format. The format was developed by the Society of American Archivists' National Information Systems Task Force (NISTF), which was active from 1977 to 1983. NISTF was asked to investigate establishing one or more national automated information systems for description of archives and manuscripts. In 1977 the NHPRC data base was the only national archival descriptive system that was automated and used standard data elements; however, as indicated above, the SPINDEX programs carried disadvantages. Another candidate for a national data base was NUCMC, which had the drawback of being a manual system. Finally, there were the two major on-line library bibliographic networks: RLIN and OCLC. In 1977, however, RLIN and OCLC were devoted almost exclusively to the description of books and periodicals using the MARC format for published materials. Possibly because of the inadequacies of the several potential national information systems, NISTF soon gave up on selecting a particular one and decided instead to create standard data elements to permit information exchange among the existing systems. From 1982 to 1984, NISTF, its successor the SAA Committee on Archival Information Exchange, and the Library of Congress developed the standard data elements, named the USMARC AMC format and formally approved in 1985. The RLIN and OCLC data bases, using the USMARC AMC format, now have emerged as the dominant national data bases for archives and manuscripts.10 The NHPRC data base, through its use of standardized data elements, helped pave the way for the USMARC AMC format. In fact, a number of the NHPRC data elements are compatible with the AMC format, and an automated conversion can be done between the NHPRC and AMC formats. The Midwest guide project and the New York state project, for example, have successfully converted their records into RLIN. Although the NHPRC national data base never was realized, it left a legacy of useful survey projects across the country. Table 1 is a summary of the projects, and the appendix is a bibliography of the guides published by the NHPRC data base.

8Ibid., 205.
9"Staff Report: NHPRC Data Base" (Unpublished, 1980), 14.
The first NHPRC data base project to receive funding was a statewide survey by the Washington State Historical Records Advisory Board, beginning in 1977. The project director noted that this effort was the country's first "since the WPA Historical Records Survey of the 1930s to survey historical records on a statewide basis simultaneously in a wide variety of records sectors."11 Among other products, the Washington project produced a typeset collection-level guide to 250 repositories in the state. After NHPRC grant funding expired in 1983, the Washington project drastically reduced its personnel and maintained the data base at a minimal level by obtaining voluntary updates from the state's major repositories. The project now is considering conversion of the SPINDEX data to the USMARC AMC format for inclusion in RLIN, OCLC, or the Western Library Network (WLN), a third bibliographic network.12

A second survey project—by the Kentucky Department for Libraries and Archives—was funded by the commission between 1978 and 1983. Since expiration of commission funding, the project has received support from the U.S. Department of Education. The project finished surveying the state in 1984, having covered 6,000 collections. In 1986 the project published the Guide to Kentucky Archival and Manuscript Repositories, which provides repository-level information for the state's 285 archival institutions. The project also has been entering collection-level information into the data base and plans to publish a series of collection-level volumes. The project is considering switching from

10Richard H. Lytle, "A National Information System for Archives and Manuscript Collections," American Archivist 43 (Summer 1980): 423-26; David Bearman, "Toward National Information Systems for Archives and Manuscript Repositories," American Archivist 45 (Winter 1982): 53-56; Richard H. Lytle, "An Analysis of the Work of the National Information Systems Task Force," American Archivist 47 (Fall 1984): 357-65; Nancy A. Sahli, "Interpretation and Application of the AMC Format," American Archivist 49 (Winter 1986): 11-12.
11John F. Burns, "Statewide Surveying: Some Lessons Learned," American Archivist 42 (July 1979): 295.
12Author's conversation on 15 June 1987 with David W. Hastings, Chief of Archives, Washington State Archives.

Table 1. NHPRC Data Base Projects: Summary Statistics

Institution | Repositories Described | Collections & Record Groups | NHPRC Funding* | Other Federal** | State
Kentucky Department for Libraries & Archives | 285 | 6,000 | $354 | $167 | —
Cornell University, New York Historical Resources Center | 1,000 | 20,000 | $321 | $363 | $500
Washington State Historical Records Advisory Board | 1,000 | 25,000† | $457 | $25 | —
State Historical Society of Wisconsin (Midwest Guide Project) | 3 | 13,000† | $100 | — | —
NHPRC Directory (1978) | 2,675 | — | federal non-grant funds | — | —
NHPRC Directory (1988) | 4,600 | — | federal non-grant funds | — | —
TOTAL | 6,888†† | 64,000 | $1,232 | $555 | $500

* In thousands of dollars. Figures are approximate.
** Includes funding from the National Endowment for the Humanities and the Department of Education.
† Includes series descriptions.
†† Excludes 1978 NHPRC Directory.
SPINDEX to an in-house automated system, as well as transferring the SPINDEX data base to RLIN or OCLC.13

The third NHPRC survey project covered New York State and was administered by Cornell University. For each county, the project produced a computer printout describing historical records at the collection level. After NHPRC funding ran out in 1983, the project was able to continue the survey with funding from the National Endowment for the Humanities, matched by an appropriation from the New York State legislature. In 1986, with a grant from the U.S. Department of Education, the Research Libraries Group (the parent institution of RLIN) wrote a computer program to convert New York's 12,000 SPINDEX collection descriptions into RLIN. The New York project now has abandoned SPINDEX and enters all of its data directly into RLIN; a total of 20,000 New York descriptions are now in the system. The project has thereby ensured national access to the records of many small repositories that probably never would have participated in a national data base on their own. Cornell is negotiating for the New York State Archives to take over maintenance of the data base when the statewide survey is completed.14

A final NHPRC-funded project, the Midwest State Archives Guide Project, was administered by the State Historical Society of Wisconsin. The project endeavored to establish a cooperative data base among the state archives of Illinois, Minnesota, and Wisconsin and to publish a guide to the records of the three institutions. The project failed in the latter goal, although it produced a series-level guide to a representative sample of records. More importantly, the project achieved close cooperation among the three participating state archives and thereby produced a consistent multi-institutional data base. Unfortunately, soon after NHPRC support ended in 1982, the Midwest guide project was discontinued and the common data base abandoned; however, the Midwest project, like the New York State project, was successful in transferring most of its data to the RLIN data base through an automatic conversion program.15

In summary, the NHPRC data base was unique in several ways. Its central product—the NHPRC Directory—provides summary information, in one volume, about several thousand repositories of all types nationwide. The NHPRC data base was also the first attempt to compile a national automated data base for archives and manuscripts. Under its aegis, the first comprehensive records surveys since the 1930s were undertaken in three states. The guides produced by these survey projects have greatly increased access to historical records in the three states. Additionally, the on-site visits to repositories by trained field surveyors allowed records custodians to gain better in-house control of their collections.16

On the negative side, the NHPRC national data base was heavily dependent upon federal grant money, which has proved to be an unstable funding base in the 1980s. The data base projects also depended upon the federally funded NHPRC data base staff to promote standardization of data elements

13Barbara A.
Teague, "The Burden of Batching: Current Uses of SPINDEX at Kentucky's Public Records Division" (Paper presented at the Fiftieth Annual Meeting of the Society of American Archivists, August 1986).
14G. David Brumberg, "From Batch Mode to On-Line: The New York Historical Documents Inventory and the Growth in Archival Automation" (Paper presented at the Fiftieth Annual Meeting of the Society of American Archivists, August 1986).
15The Midwest Guide Project converted 12,000 SPINDEX collection/record group descriptions into RLIN. Marion Matters, "SPINDEX—The Mother of Invention" (Paper presented at the Fiftieth Annual Meeting of the Society of American Archivists, August 1986).
16"Final Report [to NHPRC], Kentucky Guide Project, Phase II, Grant #81-45" (Unpublished, 1986); Burns, "Statewide Surveying," 296.

and index terms. When that staff was eliminated in 1982, the projects no longer were able to maintain common descriptive standards.

As a result of their continued dependence upon federal funding, the three surviving data base projects—in Kentucky, New York, and Washington State—have a somewhat uncertain future. This is one reason the inclusion of the survey data in the current bibliographic networks is so important. At least the data will be available in the networks, even if the individual data base projects do not continue.

The future is particularly tenuous for the SPINDEX computer system, because of its batch mode requirements. In fact, few institutions are still using SPINDEX.17 But the NHPRC data base, and its use of SPINDEX, did help pave the way for development of the USMARC AMC format and its implementation in national bibliographic systems. In this respect, there is still validity to the "interstate highway system" metaphor used by the NHPRC staff. The RLIN, OCLC, and WLN data bases are gradually realizing the goal envisioned ten years ago by the NHPRC.

17Besides the Kentucky and Washington state archives, other institutions still using SPINDEX include the Portland (Oregon) City Archives and the South Carolina Department of Archives and History.

Appendix
NHPRC Data Base: Published Guides

National Historical Publications and Records Commission:
Directory of Archives and Manuscript Repositories in the United States (Washington, D.C.: NHPRC, 1978). $25.
Directory of Archives and Manuscript Repositories in the United States (Phoenix, Ariz.: Oryx Press, 1988). $55.

Kentucky State Survey Project:
The Guide to Kentucky Archival and Manuscript Repositories (Frankfort, Ky.: Kentucky Department for Libraries and Archives, Public Records Division, 1986). $12.
In preparation: collection-level guides, arranged alphabetically by city, ca. 8 to 10 volumes, with a consolidated index.

New York State Survey Project:
Fifty-five county-wide records guides have been published. (New York has 62 counties.) Example: Guide to Historical Resources in Allegany County, New York, Repositories (Ithaca, N.Y.: Cornell University, New York Historical Resources Center, 1980). Prices range from $3 to $16; price list available.
Washington State Survey Project:
Historical Records of Washington State: Guide to Records in State Archives and its Regional Repositories (Ellensburg, Wash.: Washington State Historical Records Advisory Board [WASHRAB], 1981). $25.
Historical Records of Washington State: Records and Papers Held at Repositories (Ellensburg, Wash.: WASHRAB, 1981). Out of print.
Historical Records of Washington State: Guide to Public Records Held by State and Local Government Agencies [computer output microform (COM)] (Ellensburg, Wash.: WASHRAB, 1984). $10.
Historical Records of Washington State: Private Records and Papers not held in Archival Custody [COM] (Ellensburg, Wash.: WASHRAB, 1984). Distributed only to major repositories, which were asked to screen researchers wanting access to the materials. Out of print.
Genealogical Resources in Washington State (Ellensburg, Wash.: WASHRAB, 1983). $10.

Midwest State Archives Guide Project:
"Prototype Guide" (1978). A "test" typeset guide including a sample of records descriptions from each of the participating Midwest state archives and a consolidated index. Out of print.

work_r23jiatkfjfvzfmrxz6yphd42a ----

Looking at Resource Sharing Costs

Originally published: Lars Leon, Nancy Kress (2012), "Looking at resource sharing costs", Interlending & Document Supply, Vol. 40, Iss. 2, pp. 81-87. Permanent link to this document: http://dx.doi.org/10.1108/02641611211239542

Lars Leon is Head of Resource Sharing at the University of Kansas Libraries. He holds an MLIS from Emporia State University. His research interests include best practices, assessment, and staff development.

Nancy Kress (North Carolina State University Libraries, nancy_kress@ncsu.edu) is Head of Access and Delivery Services at North Carolina State University Libraries. She holds an MLIS from the University of Illinois, and a Process Management and Improvement Certificate from The University of Chicago Graham School. She has practical experience applying Lean and business process improvement methods in libraries and higher education. Her research includes supply chain management and Lean principles applied to library operations, and middle management.

Title: Looking at Resource Sharing Costs

Abstract

Purpose – This paper is the result of a small cost study of resource sharing services in 23 North American libraries. Trends that have affected resource sharing costs since the last comprehensive study are discussed.

Design/methodology/approach – Selected libraries were approached for this phase of the study. A pilot phase helped to clarify the cost and service definitions while revising the database which served as the data collection instrument.

Findings – Immediate access to electronic items at point of use has resulted in user demand for faster turnaround for physical materials. This in turn has led to increased costs for ILL technology and shipping. Costs have decreased but continue to show a noticeable disparity between higher ILL Borrowing mean costs compared to ILL Lending. The data also clearly supports the perception that per-request costs for patron-initiated Circ to Circ module transactions are lower than those for ILL.

Originality/value – Libraries have been using cost data that is almost ten years old.
While this study is small, the data provides an updated benchmark to assist libraries in making effective decisions regarding resource sharing. The study illustrates a range of costs, which reinforces the need for libraries to investigate their own average costs for optimal decisions.

Introduction

The economic crisis of the past few years means that all areas of libraries, including interlibrary loan, have to do more with less. Concurrently, the Web has changed how users interact with the library and with information. Users expect interlibrary loan (ILL) performance to match the ease and speed of electronic access. The emerging standard for delivery turnaround of articles and book chapters is within 48 hours of when the user places a request (Sturr and Parry, 2010). Services are changing as well. The prevalence of electronic journals, combined with the migration of print to off-site storage and the high cost of photocopying, has led many institutions to offer local document delivery. Most absorb the costs of retrieving, scanning and electronically delivering print journal content to address what users consider a time-consuming impediment to research.

Changes in the scholarly and technical environment have affected the costs of resource sharing. An unavoidable response to shrinking library budgets is to reduce acquisitions. The rising cost of journal subscriptions is causing some institutions to move away from Big Deals, using an alternative model of purchasing essential journals on a title-by-title basis and making use of interlibrary loan to fill the gaps (Howard, 2011). While this may save acquisition monies, some journals have a high copyright fee per use, and these costs can add up quickly (Reighart and Oberlander, 2008). Monographs are also affected. Electronic books do not follow the same models as physical books; many ebook lending systems allow only one patron to access a book at a time, and lending permissions are limited (Vigen and Paulson, 2003). Thus, libraries continue to ship physical materials, in some cases using expensive expedited methods to meet consortia expectations for turnaround.

Technology has benefited interlibrary loan by automating many processes that were once manual. ILL management systems, electronic delivery software, and professional scanners improve the efficiency of ILL, but can be prohibitively expensive (Hosburgh and Okamoto, 2010). Mary Jackson completed an ILL cost study for the Association of Research Libraries that libraries and vendors continue to refer to when considering new services and alternative workflows (Jackson, 2004). There has not been a comprehensive U.S. cost study since that time. New models for acquiring information resources are being developed as libraries move from a collection-centric just-in-case model to a user-centric just-in-time model. Selection of material is increasingly driven by the user as purchase-on-demand models become part of collection development. To make more effective decisions, an updated view of the costs for resource sharing services is needed. This information is vital to make more informed, data-assisted decisions that are most cost-effective for the library.

Literature review

ILL is an important resource for users, providing support for research and academic course work (McCaslin, 2010). In his article on the expectations of ILL in an economic crisis, McCaslin points out that patron needs do not decrease along with the financial situations of their institutions.
In fact, in an era in which collection budgets – and hence local collections – are shrinking, ILL requests continue to increase. Two recent surveys of ILL conducted by Primary Research Group questioned participants on the total percentage increase or decrease of ILL over the past three years, reporting median increases of 9.5% to 14% (Primary Research Group, 2009, 2011). The ARL study from 2004 found an increase in borrowing (mean 75 percent) and lending (mean 59 percent) requests from 44 libraries that had participated in the 1996 and 2002 studies (Jackson, 2004). These changes emphasize the need to know the current costs of performing ILL.

The recent literature on change in interlibrary loan includes discussion of changes in user preferences and behaviors, as well as how technology has changed ILL work and the definition of resource sharing. An OCLC report analyzing several user behavior studies conducted in the US and the UK found that the electronic environment led to clear changes in what is important to users. The study notes some common conclusions about changed user behavior, including the importance of speed and convenience, and a preference for desktop access to scholarly content (OCLC Research, 2010). McHone-Chase (2010) presents a historical perspective, through a review of literature from 1995 to 2009, on how technology has changed user expectations and driven change in interlibrary loan. She found ILL has been extensively affected by technology and is struggling to keep up with user expectations for service. ILL workloads are increasing due to a combination of libraries purchasing less content and users' ability to find more content through databases.

The evolution of resource sharing continues to change the costs of interlibrary loan. Dannelly (1995) writes about the effect of the electronic era on resource sharing, beginning with the acceptance of interlibrary loan during the 1990s as a substitute for ownership. Dannelly points out there is no such thing as free information. The combined effects of technology and economic factors caused interlibrary loan costs to increase, necessitating cuts to other library services and personnel. While technology costs have dropped, reliance on electronic access has required additional staff, software, and workstations, all of which have increased the cost of ILL.

The introduction of Total Quality Management (TQM) into the academic environment in the early 1990s influenced many libraries to reengineer ILL to reduce costs. Chang and Davis (2010) examine how these changes have affected access services, including interlibrary loan. An increase in serial inflation rates meant increases in ILL transactions, causing many libraries to look at the labor-intensive process of ILL. The authors note that, "in short, the adoption of automated systems for circulation functions and TQM in ILL produced the first transformation wave for access services in the 1990s" (Chang and Davis, 2010).

New technologies affecting discovery and delivery of information have shifted the focus of interlibrary loan from simply delivering materials to providing user-centered service. Posner (2007) examines what Google and other web-based information services have meant for resource sharing. One role of the library is to provide information at no cost to the patron, in a world in which not all people can afford computers or connections. This cost is often borne by the library.
In spite of all of the free information, there are sources for which interlibrary loan is the only option, and the introduction of document delivery service, which is labor intensive, has driven up costs. Posner discusses the possibility of putting the cost on users, but cautions that librarians should carefully consider whether it would generate enormous bills for patrons and become more of a headache and public relations challenge than a benefit for otherwise positively received ILL departments. Buchanan (2009) describes how new ways to obtain information resources have forced staff to rethink the work of ILL. Discovery tools increase users' ability to find, and request, content from a myriad of sources. New tools built into the ILL request process create very different workflows, including options for purchase of content if the price is right. As the line between ILL and acquisitions continues to blur, understanding the cost differences between purchasing and borrowing content becomes increasingly important.

As interlibrary loan became accepted as a substitute for ownership, it became necessary to understand the costs to inform decisions on whether to buy or borrow research materials. The studies in this literature review use different methodologies and report costs differently, so it is important to understand what exactly is being included and measured before comparing costs. The most recent comprehensive cost study (Jackson, 2004) collected data on the 2001-02 performance of primarily ARL libraries and assessed mediated and user-initiated borrowing and lending, as well as data from a subset of participants on local document delivery services. The study reported an average cost for a filled request of $18.35 for borrowing and $9.48 for lending (Jackson, 2003). The study methodology involved detailed cost worksheets; participants received an electronic report with summary data.

The National Resource Sharing Working Group for Australia (National Library of Australia, 2001) performed a large, comprehensive benchmarking study of interlibrary loan and document delivery that allowed institutions to compare their operations against a common set of data. The survey instrument was based on the ARL study and, in addition to costs, determined characteristics of high-performing operations. The study measured filled requests; locally filled requests were not included. The average total cost for the participating libraries was $32.10 for borrowing and $17.03 for lending. Staff made up the highest proportion of total unit cost, representing 61.2 percent for borrowing and 76.8 percent for lending, with delivery being the second largest cost. On average, patrons paid $10.28 per item. Patrons who had not paid a fee were surveyed and asked to estimate the amount they would pay. The most common response was $0, with the average amount being $4.09. The study found that, generally, fees didn't recover costs. One library in the study charged users a fixed fee of 8 francs per request; the study found this covered the cost of simple requests, but more complicated requests were found to cost on average 10.71 francs.

In 1994 three universities in the State University of New York system performed a study assessing the cost per use of a journal article as compared to the cost to borrow the item via interlibrary loan (Kingma, 1996). The study updated the data collection form used in the 1993 ARL/RLG report.
To arrive at an exact cost per transaction, ILL staff members were timed to determine the actual costs of processing a typical borrowing and lending request. This differs from the ARL studies in that only the staff time spent processing the article was considered, rather than the total percentage of time a staff person worked in an ILL unit. Thus, the study did not include time that an employee was employed in an ILL unit but not processing requests. The study found that there was a decline in average cost per transaction as requests increased.

Wichita State University in Wichita, Kansas (Naylor, 1997) performed a cost study to determine the cost effectiveness of ILL as compared to commercial document delivery and purchasing full-text databases. The author also wanted to compare the institution's costs, as a medium-sized academic library, to those from the ARL study, whose participants are primarily large research libraries. Using data collected in fiscal year 1995/1996, the study methodology used the basic structure of the ARL/RLG study. Personnel were found to be the highest expense, with network and communication costs second and delivery costs third. The cost of both filled and unfilled transactions was determined. Cost per filled borrowing transaction totaled $8.51, while an unfilled request, as it still involved staff time, cost $4.68 per transaction. A filled lending request cost $2.47 and an unfilled request $1.36. The low cost of lending for this institution results from student assistants performing the majority of work in the lending unit.

Zhou (1999), in an article that looks at ILL cost studies, recommended that future interlibrary loan cost studies include marginal costs, specifically copyright fees. The author notes that in studies up to June of 1998, copyright was not included in the calculation of average cost per transaction. While the Commission on New Technological Uses of Copyrighted Works (CONTU) allows five articles from one title to be copied within the calendar year, the article points to a study from 1997 that found median copyright fees for all subjects to be $5.00. This marginal cost significantly increases the total cost per transaction. For this reason, Zhou recommends including copyright cost in any future studies.

As resource sharing continues to evolve, the variables that affect cost, and what libraries will need to know to make the most efficient decisions for both users and the library, will change. Reighart and Oberlander (2008) explore the future of ILL using a framework that places resource sharing into four domains: borrowing from other libraries, purchasing items from publishers, renting, and free. The convergence of acquisitions, patron-driven purchasing, collection development and new technologies will continue to push the evolution of interlibrary loan. In this new environment, libraries must evaluate the factors of reliability and stability, user expectations, cost and license terms, and then determine whether or not to change workflow. The authors look at the impact of copyright, providing an example of a special issue in which seven articles from the same issue would have had a $38.00 copyright fee per article. Although increasing amounts of materials appear in electronic form, users and libraries will have to deal with paper holdings for some time into the future.
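Zhou's argument about marginal costs is easy to see with a small worked example. The sketch below is illustrative only: it combines figures quoted above from different studies (Naylor's $8.51 filled borrowing cost and the $5.00 median copyright fee cited by Zhou) with a simplified reading of the CONTU guideline as described in the text, under which the first five copies from one title in a calendar year carry no fee. The function name and request counts are ours, not from either study.

    # Illustrative sketch (Python): how a median copyright fee inflates the
    # per-transaction cost once the five-copy allowance described above is
    # exhausted. Combining figures across studies is a simplification.
    BASE_COST = 8.51     # Naylor's cost for a filled borrowing transaction
    MEDIAN_FEE = 5.00    # median copyright fee from the 1997 study Zhou cites
    FREE_COPIES = 5      # copies per title per year before fees apply

    def yearly_cost(copies_from_one_title):
        """Processing cost for all copies, plus fees beyond the allowance."""
        fee_paying = max(0, copies_from_one_title - FREE_COPIES)
        return copies_from_one_title * BASE_COST + fee_paying * MEDIAN_FEE

    for n in (5, 10, 20):
        total = yearly_cost(n)
        print(f"{n} copies: ${total:.2f} total (${total / n:.2f} per transaction)")

At five copies the average stays at $8.51; at twenty it rises to roughly $12.26 per transaction, which is exactly the effect Zhou warns that average-cost studies will miss.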
Pederson (2010) calls the separation between immediate electronic access for e-journals and print journal collections the "paper divide." To meet users' preference for downloading content immediately to their own PCs, some libraries are offering free document delivery to users. Providing documents from the paper collection comes at a substantial cost to libraries, including equipment, software and staff costs. To evaluate the best option for both users and libraries, quantifiable measures of usage and cost are needed. Costs per use of both electronic serials and interlibrary loan need to be evaluated side by side: "Data concerning journal article access on both the left and right sides of the ACC (Article Access Continuum) will be invaluable for the future development of academic libraries."

Cost studies can serve as more than a benchmark for institutions to compare themselves against. The data can point to ways in which an individual institution can create efficiencies and cost savings. Cost studies have found staffing to be the most substantial portion of ILL costs. Morris (2004) notes that labor costs can represent about 80% of total ILL cost. As a step towards reducing these costs, she suggests rethinking the level of staff that handles transactions, comparing the salary differences between a librarian, a clerk, and a student. She provides the example of how an interlibrary loan librarian who makes $40,000 a year can add as much as $4.00 to each transaction (for example, at 10,000 transactions a year, that salary alone works out to $4.00 of labor per request).

Cost Study Objectives

The interlibrary loan community has a need for updated cost data, as the landscape of resource sharing has changed dramatically since the last comprehensive study, which is almost ten years old. In an environment of multiple formats, increases in user requests for journal content, and rising subscription prices, there is a need for current cost data to inform service and collection development decisions. Past resource sharing studies have primarily examined turnaround time, fill rate and unit cost for successfully filled requests. The authors determined which resource sharing activities and which costs to collect, after discussion with pilot library participants. The authors were initially prompted to undertake a cost study by the need for up-to-
Research Data Design and Collection Data Design We defined the data to collect after reviewing the literature and determining what was most relevant for current decision making. We identified resource sharing services as services. We selected the fiscal year that runs July 1, 2010 through June 2011 so we could have complete and comparable data between libraries. We developed a Microsoft Access® database that each participating library would use as the collection tool. The costs included in this study were staffing, equipment, copyright, payments to supplying libraries or other sources, payments received for requests fulfilled, management tools (e.g. ILLiad), request systems (e.g. OCLC, RAPID), shipping costs, and supplies. See the appendix for the definitions. The resource sharing service costs were separated out as the following services. See the appendix for definitions. Borrow through Circ to Circ module Copies from local collections to local patrons ILL Borrowing Copies ILL Borrowing Loans ILL Lending Copies ILL Lending Loans Lend through Circ to Circ module Deliver to Campus Mail to non-Campus Page from local collections to local patrons For each service, fiscal year request totals were collected for all ILL requests submitted. Eight categories were used to calculate unit cost per service: staff, request systems, request management tools, shipping, equipment, supplies, fees (debits and credits), and copyright. For 7 the purpose of analysis, we used the costs applied over all requests submitted both filled and unfilled. Participating Libraries We completed a pilot data collection with five libraries in order to determine the best data definitions, revise instructions, and modify the Access® database. As our goal was to develop a prototype costing tool, we limited the number of libraries invited to participate, soliciting participation from a limited number of libraries through selected consortiums. Almost all participating libraries were medium to large academic libraries. A list of the pilot and full phase participating libraries can be found in the appendix. We analyzed the final data received and were able to keep almost all submitted data for the final analysis. However, in a few cases we excluded some data where it was extremely less or more expensive than the norm. Data Collection A website was created that included the Microsoft Access® database to download, cost gathering guidelines and online tutorials, and information on participation and privacy. Participants were able to enter data into their own copy of the study database. The database structure made it easy for participants to extract the information needed for the cost study. In some cases, data such as staff salaries cannot be publicly shared. This method of data collection allowed those libraries to enter their own staff salaries and only send the aggregated data that was needed for the study. Once they completed entering data, participants were able to see their unit costs immediately. Reports generated Once each participant completed entering data into their copy of the database they were then able to immediately download reports specific to their library. One report was based on the total requests submitted for each service while the other two provided data across filled requests as defined in two ways. Each report provided the mean cost by category for each service entered by that library. 
One filled request report did not count ILL Borrowing copy requests that were returned to the patron as locally available in licensed content; the other filled request report did.

Results of study and analysis

Table 1 displays the mean costs across all requests submitted. As stated earlier, the data set is small, especially for some of the services. However, the authors feel this information helps identify areas that need further exploration.

Table 1. Mean cost per request across all requests submitted, by category

Service | Nr Libs. | Mean Nr. Reqs | Mean Total Costs | Credits Paid to Libs | Net Mean | Staff | Equip | Cpyrt | Debits | Mgmt Tools | Req Sys | Shipp. | Suppl.
Borrow through Circ to Circ module | 5 | 27,442 | $3.85 | $0.00 | $3.85 | $2.18 | $0.18 | $0.00 | $0.00 | $0.02 | $0.19 | $1.27 | $0.00
Lend through Circ to Circ module | 5 | 28,105 | $4.70 | $0.00 | $4.70 | $2.68 | $0.24 | $0.00 | $0.00 | $0.02 | $0.21 | $1.53 | $0.01
ILL Borrowing Copies | 18 | 20,391 | $7.98 | $0.05 | $7.93 | $4.33 | $0.05 | $0.81 | $1.94 | $0.12 | $0.71 | $0.00 | $0.02
ILL Borrowing Loans | 17 | 13,875 | $12.12 | $0.01 | $12.11 | $6.86 | $0.12 | $0.00 | $1.93 | $0.14 | $0.61 | $2.31 | $0.14
ILL Lending Copies | 18 | 28,232 | $4.11 | $1.09 | $3.02 | $2.90 | $0.24 | $0.00 | $0.00 | $0.16 | $0.79 | $0.00 | $0.03
ILL Lending Loans | 18 | 20,210 | $6.21 | $1.00 | $5.21 | $3.28 | $0.07 | $0.00 | $0.04 | $0.16 | $0.74 | $1.86 | $0.06
Copies from Local Collections to Local Patrons | 14 | 10,933 | $7.14 | $0.00 | $7.14 | $6.43 | $0.48 | $0.00 | $0.00 | $0.11 | $0.08 | $0.00 | $0.04
Page from Local Collections to Local Patrons | 9 | 18,468 | $7.34 | $0.00 | $7.34 | $6.13 | $0.17 | $0.00 | $0.00 | $0.04 | $0.00 | $0.92 | $0.09
Deliver to Campus | 3 | 13,805 | $3.65 | $0.00 | $3.65 | $3.48 | $0.02 | $0.00 | $0.00 | $0.01 | $0.00 | $0.00 | $0.14
Mail to Non-Campus | 2 | 1,221 | $7.54 | $0.00 | $7.54 | $5.50 | $0.06 | $0.00 | $0.00 | $0.03 | $0.00 | $1.89 | $0.06

Table 2 illustrates the mean and median costs for the total net cost and for staff costs alone. For some services the median and mean costs are close; some libraries sat further from the mean, but there was a good distribution. For other services, however, the median is noticeably lower, indicating that a few higher-cost libraries are pulling the mean above the typical cost. These numbers are a good start for analysis, but a follow-up study should have a larger participant pool to help determine whether the averages are useful for a greater number of libraries.

Table 2. Average costs based on all requests submitted

Service | Nr Libs. | Mean Nr. Reqs | Net Total Mean | Net Total Median | Staff Mean | Staff Median
Borrow through Circ to Circ module | 5 | 27,442 | $3.85 | $2.94 | $2.22 | $1.35
Lend through Circ to Circ module | 5 | 28,105 | $4.70 | $3.58 | $2.68 | $1.97
ILL Borrowing Copies | 18 | 20,391 | $7.93 | $7.68 | $4.33 | $2.85
ILL Borrowing Loans | 17 | 13,875 | $12.11 | $12.02 | $6.86 | $6.98
ILL Lending Copies | 18 | 28,232 | $3.02 | $2.54 | $2.90 | $2.59
ILL Lending Loans | 18 | 20,210 | $5.21 | $4.73 | $3.28 | $3.09
Copies from Local Collections to Local Patrons | 14 | 10,933 | $7.14 | $5.94 | $6.43 | $5.34
Page from Local Collections to Local Patrons (1) | 9 | 18,468 | $7.34 | $4.02 | $6.13 | $3.88
Deliver to Campus | 3 | 13,805 | $3.65 | $3.02 | $3.48 | $2.88
Mail to Non-Campus | 2 | 1,221 | $7.54 | $7.54 | $5.50 | $5.50

(1) One library had 108,015 requests. Excluding this data brings the average to 7,275.

Observations on Data

General Overview

Staffing continues to be the largest cost in all categories. The debits and shipping costs are the next highest average costs and warrant some analysis. Above all, however, this data confirms that managing staff costs is the most important factor. This study did not separate out mediated and unmediated requests for interlibrary loan; that should be accomplished in a further study. The framework of the study did, though, help illustrate the impact on costs a library can have when it is able to minimize staff time, as in the Circ to Circ module services. In addition, several libraries in the study had lower costs than the average. Further analysis is needed to understand what systems, workflows, policies and other factors have helped to reduce their costs.

Comparing Circ to Circ module system Costs and Interlibrary Loan Costs

This study provides additional evidence of how much less expensive the Circ to Circ systems are compared to traditional interlibrary loan. It shows a net average cost per request of $3.85 to borrow in a Circ to Circ module compared to $12.11 to borrow loans via traditional interlibrary loan. Two critical areas drive these figures. The main one is staffing, where Circ to Circ borrowing averages $2.22 while ILL Borrowing Loans averages $6.86. A more detailed analysis of staffing in the other services is needed to determine how to make progress equivalent to the Circ to Circ modules.

[Figure 1: Average cost across all requests for services that obtain materials from other libraries and sources]

Shipping costs

For services that involved moving physical materials, the next highest cost area was typically shipping. For ILL Borrowing Loans this was 19% of the net mean cost per request; for Circ to Circ module borrowing and lending it was around 32 to 33% of the net mean cost per request. Reducing staffing costs will help. Several libraries had lower mean costs; further study is needed of their workflows and commitments to determine applicability for the greater ILL community.

Comparing Costs: Leon/Kress and ARL 2004

The literature review discussed recent factors that have changed the nature of resource sharing. This raises questions regarding how costs have changed in the almost ten years since the ARL study. Our study defined, collected and measured the data differently, so the numbers cannot be compared directly, but some observations can be made. Tables 3 & 4 compare our data with the 2002 ARL (Jackson, 2004) data. Observations and discussion of major points follow.

Table 3: Borrowing Costs by Category (ILL Borrowing Copies and ILL Borrowing Loans)

ARL (2002 data) Cost Category | ARL % of Unit Cost | Leon/Kress (FY 2011 data) Cost Category | L/K % of Unit Cost | L/K % of Unit Cost (excluding Debits)
Staff | 58% | Staff | 55% | 70%
Network | 1% | Request systems | 7% | 9%
Delivery | 5% | Request mgmt tools | 1% | 2%
Photocopy | 0% | Shipping | 8% | 9%
Supplies | 1% | Supplies | 1% | 1%
Equipment | 2% | Equipment | 1% | 1%
Borrowing Fees | 20% | Copyright | 6% | 8%
— | — | Debits | 21% | ---
Mean Mediated Cost | $17.50 | Mean Cost | $9.62 | $7.69

Table 4: Lending Costs by Category (ILL Lending Copies and ILL Lending Loans)

ARL (2002 data) Cost Category | ARL % of Unit Cost | Leon/Kress (FY 2011 data) Cost Category | L/K % of Unit Cost
Staff | 75% | Staff | 63%
Network | 5% | Request systems | 16%
Delivery | 13% | Request mgmt tools | 3%
Photocopy | 2% | Shipping | 12%
Supplies | 1% | Supplies | 1%
Equipment | 4% | Equipment | 4%
Mean Mediated Cost | $9.27 | Mean Net Cost | $3.93

The first point to note is that the ARL data counts filled requests, whereas our data counts all requests, both filled and unfilled (at an 85 percent fill rate, for example, a mean of $9.62 across all submitted requests would correspond to roughly $11.32 per filled request). The ARL study excludes RAPID from mediated categories; our study includes RAPID and other user-initiated services. The Circ to Circ, Deliver to Campus and Mail to non-Campus services were excluded from the Leon/Kress data for this comparison to provide similar data.

Comparing the 2002 and 2011 data raises many questions, particularly why staffing represents similar percentages of total unit costs for both studies in ILL Borrowing. Most of these cannot be answered and are not addressed in this article, but they point to future areas for research. There are a few trends, however, that have affected unit costs on which we can comment.

The Leon/Kress "Shipping" cost category is similar to the ARL "Delivery" cost category. This category has doubled for borrowing while remaining similar for lending. To meet user service desires for fast delivery, many libraries in this study have been using expedited shipping for returnables and moving away from slower ground mail. Several libraries participating in the study have consortia turnaround time requirements for both electronic and physical materials. Further analysis is needed of the discrepancy in the percentage changes between borrowing and lending.

The other increase worth noting concerns network and system costs. The ARL "Network" costs included "applicable telephone, Ariel or other electronic transmission, electronic mail, Internet, and network fees (OCLC, RLIN, etc)" (Jackson, 2004). Leon/Kress "Request systems" included OCLC, RAPID, Article Reach, and LINK+; we did not include phone, email, or Internet. This increase is due to increases in OCLC fees in the time since the ARL study, and to the fact that, at present, libraries may belong to several fee-based resource sharing systems.

Conclusions and next steps

The purpose of this study was to provide updated average costs to assist libraries in making decisions in the increasingly complex environment of expanded resource sharing services and buy-versus-borrow decisions. These challenges have arisen even as patrons increasingly demand faster service. In addition, this was a test of a stand-alone Microsoft Access® database that empowered libraries to enter their own data and immediately see their own average costs.

The primary conclusion is that costs have changed and libraries need to determine their own, updated costs to make the most accurate decisions. Use of a tool like the database provided empowered libraries to gather and determine their own costs. With this information, each library is able to make more informed decisions. In addition, the contribution of this type of data in an organized fashion has provided greater insight into costs across different libraries, which is useful for the library community. The figures in this study can be a starting point for libraries to consider, but the authors encourage libraries to determine their own costs through our tool or future group efforts.

The largest cost across resource sharing services continues to be staffing. Additional, more granular analysis is needed, especially for the ILL Borrowing services. Most participants in this study had a combination of unmediated and mediated requests in these services. That combination helps provide useful averages for buy-versus-borrow decisions, but it does not give practitioners enough detail to understand the most cost-effective means within ILL of obtaining materials. Further analysis is needed specifically into why staff costs are high and which workflows provide the best return for the services desired.

The authors hope this information encourages libraries to take the time to identify their own costs and to contribute to a community effort. The ILL community has a tremendous track record of sharing. Extending that cooperation into a broad, shared look at costs across libraries can only enhance our common understanding, which should lead to efficiencies and better decisions as we expand services and collaborate more with other library services such as acquisitions and collection development.

References

Buchanan, S. (2009), "Interlibrary loan is the new reference: reducing barriers, providing access and refining services", Interlending and Document Supply, Vol. 37, No. 4, pp. 168-170.
Chang, A. and Davis, D. (2010), "Transformation of Access Services in the new era", Journal of Access Services, Vol. 7, pp. 109-120.
Dannelly, G. (1995), "Resource sharing in the electronic era: potentials and paradoxes", Library Trends, Vol. 43, No. 4, pp. 663-678.
Hosburgh, N. and Okamoto, K. (2010), "Electronic document delivery: a survey of the landscape", Journal of Interlibrary Loan, Document Delivery and Electronic Reserve, Vol. 20, No. 4, pp. 233-252.
Howard, J. (2011), "Libraries abandon expensive 'Big Deal' subscription packages to multiple journals", Chronicle of Higher Education, July 17, available at: http://chronicle.com/article/Libraries-Abandon-Expensive/128220/
Jackson, M. (2003), "Assessing ILL/DD services study: initial observations", ARL Bimonthly Report 230/231, October/December.
Jackson, M. (2004), Assessing ILL/DD services: new cost-effective alternatives, Association of Research Libraries, Washington, DC.
Kingma, B. (1996), The economics of access versus ownership, Hawthorne Press, NY.
McCaslin, D. (2010), "What are the expectations of interlibrary loan and electronic reserves during an economic crisis", Journal of Interlibrary Loan, Document Delivery and Electronic Reserve, Vol. 20, No. 4, pp. 227-231.
McHone-Chase, S. (2010), "Examining change within interlibrary loan", Journal of Interlibrary Loan, Document Delivery and Electronic Reserve, Vol. 20, pp. 201-206.
Morris, L. (2004), "How to lower your interlibrary loan and document delivery costs: an editorial", Journal of Interlibrary Loan, Document Delivery and Information Supply, Vol. 14, No. 4, pp. 2-3.
National Resource Sharing Working Group (2001), Interlibrary loan and document delivery benchmarking study, National Library of Australia.
Naylor, T. (1997), "The cost of interlibrary loan services in a medium-sized academic library", Journal of Interlibrary Loan, Document Delivery and Information Supply, Vol. 8, Issue 2.
OCLC Research (2010), The digital information seeker: report of the findings from selected OCLC, RIN, and JISC user behavior projects, research report prepared by L. Silipigni and T. Dickey, OCLC Research.
Pederson, W. (2010), "The paper divide", The Serials Librarian, Vol. 59, No. 3, pp. 281-301.
Posner, B. (2007), "Library resource sharing in the early age of Google", Library Philosophy and Practice, Special Issue: libraries and Google, viewed 2 August 2011.
Primary Research Group (2009), Higher education interlibrary loan management benchmarks, 2009-10 edn, Primary Research Group.
Primary Research Group (2011), Academic interlibrary loan benchmarks, 2011 edn, Primary Research Group.
Reighart, R. and Oberlander, C. (2008), "Exploring the future of interlibrary loan: generalizing the experience of the University of Virginia, USA", Interlending and Document Supply, Vol. 36, No. 4, pp. 184-190.
Sturr, N. and Parry, M. (2010), "Administrative perspectives on dynamic collections and effective interlibrary loan", Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Vol. 20, No. 2, pp. 115-125.
Vigen, J. and Paulson, K. (2003), "E-books and interlibrary loan: an academic centric model for lending", paper presented at the 8th Interlending and Document Supply Conference, 28-31 October, Canberra, available at: www.nla.gov.au/ilds/abstracts/ebooksand.htm
Zhou, J. (1999), "Interlibrary loan cost studies and copyright fees", Journal of Interlibrary Loan, Document Delivery and Information Supply, Vol. 4, No. 4, pp. 29-38.

APPENDIX

Costs included in the study

Staffing – total salary, including fringes, for everyone who helped with the services. This included an appropriate percentage of managers' time.
Equipment – equipment used in support of services, such as scanners, faxes, multifunctional devices, and computers. Equipment costs entered were amortized over the replacement cycle of the library submitting costs, or a standard study value if there was not one.
Copyright – copyright costs paid for ILL Borrowing.
(Debits) Payments to supplying libraries or other sources – this included payments to other libraries, document suppliers, and publishers for services included in the study.
(Credits) Payments received for requests fulfilled – this included payments received by the participating library for items fulfilled to other libraries. For several libraries, this also included payments from their own patrons for items supplied to them.
Management tools – whatever software and supporting hardware were used to manage requests. For almost all participants this was ILLiad, plus either software hosting fees or local hardware costs such as servers.
Request systems – systems that helped to share requests with other libraries. For almost all participants this included OCLC, and for many RAPID. Several "Circ to Circ" modules were also linked, such as LinkPlus or OrbisCascade.
Shipping costs – all costs that helped move materials to/from the participating library (e.g. USPS, UPS, FedEx, courier costs).
Supplies – all supplies associated with the services in the study. Rough estimates were acceptable due to the very small percentage of cost.

Services included in the study

Borrow through Circ to Circ module – unmediated loan requests from the participating library's patrons for loans not available locally. The patron initiates requests that are sent in unmediated fashion to the potential lending library.
Lend through Circ to Circ module – unmediated loan requests for the participating library's collections from other libraries. Other libraries' patrons initiate requests that are sent in unmediated fashion to the participating library.
ILL Borrowing Copies – mediated and unmediated requests for copies from other libraries and supplying sources for the participating library's local patrons.
ILL Borrowing Loans – mediated and unmediated requests for loans from other libraries and supplying sources for the participating library's local patrons.
ILL Lending Copies – copy requests from other libraries for the participating library's collections.
ILL Lending Loans – loan requests from other libraries for the participating library's collections.
Copies from local collections to local patrons – requests from the participating library's patrons for copies of items from the participating library's collections.
Deliver to Campus – requests from the participating library's patrons for loans to be delivered to their campus addresses.
Mail to non-Campus – requests from the participating library's patrons for loans to be mailed to their off-campus addresses.
Page from local collections to local patrons – requests from the participating library's patrons for loans from the local collections to be paged and made available at a participating library's service desk.

Participating libraries

Arizona State University
Florida International University
Kansas State University
Massachusetts Institute of Technology
New York University
Ohio State University
Oklahoma State University
Oregon State University
Pennsylvania State University
Philadelphia Museum of Art
Texas A&M University
University of Arizona
University of Colorado Boulder
University of Connecticut
University of Houston
University of Iowa (pilot library)
University of Kansas (pilot library)
University of Massachusetts Medical School
University of Nebraska Lincoln (pilot library)
University of Nevada Las Vegas (pilot library)
University of Utah (pilot library)
Utah State University
Washington University in St. Louis

work_rmr2ilkponfz3ezt4u34emsac4 ----

by Norm Medeiros
Associate Librarian of the College
Haverford College
Haverford, PA

Harvard, NIH, and the Balance of Power in the Open Access Debate

{A published version of this article appears in the 24:3 (2008) issue of OCLC Systems & Services.}

ABSTRACT

This article reviews the recent decision by Harvard's Faculty of Arts & Sciences to submit scholarly articles to the University's institutional repository prior to (or in lieu of) publication in a journal. The remarkable decision, the first of its kind in the United States, reverberated quickly across the open access landscape, making many wonder which universities will follow Harvard's lead. This article also looks at the National Institutes of Health (NIH) Public Access Policy, which as of 8 April 2008 requires NIH-sponsored investigators to place into PubMed a copy of their peer-reviewed journal articles. The impact of this legislation will be enormous, as some 80,000 articles per year result from NIH-sponsored research.
KEYWORDS open access ; Harvard University ; institutional repository; National Institutes of Health ; NIH ; public access “It's a mere moment in a man's life between the all-star game and the old-timer's game.” - Vin Scully I can’t recall who first posited the idea that electronic journals would eliminate the concept of the journal issue, but it’s come to pass. In the print world, distribution of a selection of articles packaged in a convenient container was practical. Yet in a world where articles appear without much to remind us of their brethren or provenance, it’s not surprising that the concept of the journal issue is all but dead. This isn’t a bad thing, just an observation really, but it shows true every time I work with an undergraduate whose research yields either electronic journal articles, or a link to an interlibrary loan form. Never, it seems, is a bound journal needed, further destroying the idea of the issue. The same is happening in music. Album sales continue to decline, to the tune of 25% since 2000. 1 Meanwhile ear buds have become a fashion accessory as legal (and illegal) digital singles download at breakneck speed into the ubiquitous iPod. Two recent events will further contribute to this article-as-sovereign- object transformation. OPEN ACCESS AT HARVARD In February, Harvard’s Faculty of Arts and Sciences (FAS) voted to grant the university nonexclusive rights to preserve and make accessible its scholarly journal articles. The landmark decision ensures that most articles authored by Harvard FAS will be made freely-available in the University’s institutional repository (a waiver to opt out of the arrangement is available, which one hopes will be used sparingly). The implications of this remarkable move are numerous, not the least of which being the potential effect on the journal supply chain. The FAS decision is a big win for the managers of the institutional repository. Unlike most repository managers who plead with faculty to submit publications, Harvard’s IR staff has the benefit of a faculty that recognizes the good in making freely- available its scholarship. The move reminded me of a similar faculty-endorsed motion that occurred in 2003. At that time, the Cornell University Library sought support from the faculty when attempting to break from its “big deal” with Elsevier. The faculty senate endorsed the Library’s decision to divest itself of its existing relationship with Elsevier, on the grounds the license to the bundled journals was excessively expensive. Subsequently, several hundred journals were cancelled. The resolution passed by Cornell’s faculty in 2003 included the following prophetic passage: Recognizing that the increasing control by large commercial publishers over the publication and distribution of the faculty’s scholarship and research threatens to undermine core academic values promoting broad and rapid dissemination of new knowledge and unrestricted access to the results of scholarship and research, the University Faculty Senate encourages the library and the faculty vigorously to explore and support alternatives to commercial venues for scholarly communication.3 Harvard’s recent decision brings to fruition this idea. Which universities will follow? 
Harvard Open Access Motion

Harvard University is committed to disseminating the fruits of its research and scholarship as widely as possible. In keeping with that commitment, the Faculty adopts the following policy: Each Faculty member grants to the President and Fellows of Harvard College permission to make available his or her scholarly articles and to exercise the copyright in those articles. In legal terms, the permission granted by each Faculty member is a nonexclusive, irrevocable, paid-up, worldwide license to exercise any and all rights under copyright relating to each of his or her scholarly articles, in any medium, and to authorize others to do the same, provided that the articles are not sold for a profit. The policy will apply to all scholarly articles written while the person is a member of the Faculty except for any articles completed before the adoption of this policy and any articles for which the Faculty member entered into an incompatible licensing or assignment agreement before the adoption of this policy. The Dean or the Dean's designate will waive application of the policy for a particular article upon written request by a Faculty member explaining the need. To assist the University in distributing the articles, each Faculty member will provide an electronic copy of the final version of the article at no charge to the appropriate representative of the Provost's Office in an appropriate format (such as PDF) specified by the Provost's Office. The Provost's Office may make the article available to the public in an open-access repository. The Office of the Dean will be responsible for interpreting this policy, resolving disputes concerning its interpretation and application, and recommending changes to the Faculty from time to time. The policy will be reviewed after three years and a report presented to the Faculty.[2]

NIH PUBLIC ACCESS POLICY

More far-reaching open access news occurred with the passage of the National Institutes of Health (NIH) Public Access Policy (Public Law 110-161, Division G, Title II, Section 218), part of the Consolidated Appropriations Act of 2008, which mandates the submission into PubMed Central of peer-reviewed journal articles that result from NIH-funded research. The policy, which went into effect on 7 April 2008, will provide public access to the roughly 80,000 articles published annually by NIH-sponsored investigators.[4] The law states:

The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine's PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication: Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law.[5]

An earlier access policy ("Policy on Enhancing Public Access to Archived Publications Resulting from NIH-Funded Research"), which went into effect on 2 May 2005, requested, but did not mandate, that NIH-sponsored investigators submit research articles to PubMed Central. A paltry percentage of investigators adhered to the suggestion, hence the more rigorous law recently put in place.[6] Presumably the new law will have a high compliance rate and provide opportunities for librarians to assist their faculty colleagues with the submission process.
The effect of this open access law on publishers will be interesting to watch over the next several years, though the NIH's 12-month embargo should have a minimal effect on subscription revenue. "Add-on" revenue generated from the sale of backfiles, however, may be susceptible to losses as the NIH-sponsored articles are released to the open web.

REFERENCES
1. United States, Bureau of the Census (2006). Statistical Abstract of the United States: 2007, 126th ed. (Lanham, MD: Bernan Press).
2. Harvard University, Faculty of Arts and Sciences (2008). "Agenda." Available: http://www.fas.harvard.edu/~secfas/February_2008_Agenda.pdf (Accessed: 7 April 2008).
3. Cornell University Faculty Senate (2003). "Resolution Regarding the University Library's Policies on Serials Acquisitions, with Special Reference to Negotiations with Elsevier." Available: http://www.library.cornell.edu/scholarlycomm/resolution2.htm (Accessed: 16 April 2008).
4. Association of Research Libraries (2008). "NIH Public Access Policy, Guide for Research Universities." Available: http://www.arl.org/sc/implement/nih/guide/ (Accessed: 7 April 2008).
5. National Institutes of Health (2008). "NIH Public Access Policy." Available: http://publicaccess.nih.gov/policy.htm (Accessed: 7 April 2008).
6. Suber, Peter (2006). "NIH Public Access Policy: Frequently Asked Questions." Available: http://www.earlham.edu/~peters/fos/nihfaq.htm (Accessed: 24 April 2008).

work_rmyy4gb4qrgvlifs6ostcixszy ----

JLIS.it 11, 2 (May 2020). ISSN: 2038-1026 online. Open access article licensed under CC-BY. DOI: 10.4403/jlis.it-12624

Inside the Meanings. The Usefulness of a Register of Ontologies in the Cultural Heritage Sector

Chiara Veninata, Istituto Centrale per il Catalogo e la Documentazione (ICCD-MiBACT), http://orcid.org/0000-0003-0981-1726
Contact: chiara.veninata@beniculturali.it
Received: 27 January 2020; Accepted: 18 February 2020; First Published: 15 May 2020

ABSTRACT
The article deals with the semantic web and the publication of information on cultural heritage as linked open data. In particular, it analyzes ontology registers, that is, the tools that formally describe the ontological models available on the web and facilitate their retrieval and evaluation, encouraging their reuse and easing semantic alignment and interoperability processes. Ontology registers effectively address the absence of reference and orientation tools in the conceptual modelling of information; they have been successfully tested in various domains but remain untried in the cultural heritage field.
An examination of the initiatives carried out over the last decade in the cultural heritage field clearly reveals the lack of a consolidated epistemological structure in the conceptual modelling of information resources, despite the numerous ontologies created for the many linked open data publication projects. Consequently, it is often difficult to gain a full picture of the ontologies available for a given area of interest and to obtain, in a smooth and systematic way, a reliable assessment of their representative capacity and their degree of semantic interoperability. The analysis of the main ontology registers created so far outside the cultural heritage domain has made it possible to identify and define the requirements of an ontology register for cultural heritage and to develop the corresponding ontology. The specification of the requirements also took into account a distinctive function that registers could play in the cultural domain: supporting certain features of a digital library of cultural heritage.

KEYWORDS: Ontology; Cultural heritage; Linked open data.

CITATION: Veninata, C. "Inside the Meanings. The Usefulness of a Register of Ontologies in the cultural Heritage Sector." JLIS.it 11, 2 (May 2020): 45−58. DOI: 10.4403/jlis.it-12624.

Foreword

The cultural heritage sector is one of the most promising and stimulating areas for the application of semantic web standards and technologies and, more generally, for experimenting with new solutions capable of connecting intrinsically heterogeneous types of information. Cultural institutions such as libraries, archives and museums are paying growing attention to the new web technologies in order to offer users search and browsing functions that, by exploiting the semantic relations between data of different origins, make it possible both to improve interoperability between different systems and to increase the possibilities of integrating, retrieving and using information. The use of the RDF language[1] and the availability of ontologies constitute the technological basis needed to publish data on the web according to the linked open data (LOD) paradigm, that is, data in an open format and linked to one another. Ontologies, understood as external dictionaries that represent and explain the data, make it possible to represent resources by describing their characteristics (or attributes) and by identifying the relations that exist between them and the semantics that bind those entities. The semantic reference space, that is, the sector in which such resources and relations are meaningful, is also called the domain of the ontology and reflects the specific context and the specific point of view from which reality is observed. The cultural heritage domain is, by its nature, rather complex, and the semantic explication of the relations among its numerous components (bibliographic, archival and museum items, photographs, archaeological objects, immovable heritage, tangible and intangible ethno-anthropological heritage, architecture, and then artists, authors, publishers, etc.) can be expressed more or less meaningfully depending also on the ontology taken as a model.
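The triple-based description just outlined can be made concrete in a few lines of code. The following is a minimal sketch using the Python rdflib library; the resource URI, the sample values and the choice of Dublin Core terms are illustrative assumptions, not prescriptions drawn from the article.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
g.bind("dcterms", DCTERMS)

# Hypothetical URI for a cultural heritage resource
work = URIRef("http://example.org/heritage/work/42")

# Each attribute and relation becomes a subject-predicate-object triple
g.add((work, DCTERMS.title, Literal("Ritratto di dama", lang="it")))
g.add((work, DCTERMS.creator, Literal("Anonymous Lombard painter")))
g.add((work, DCTERMS.date, Literal("ca. 1490")))

# Serialize to Turtle, a syntax commonly used for linked open data
print(g.serialize(format="turtle"))
```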
The process of building an ontology usually involves a rather large commitment of resources (people, time, costs). Typically, one circumscribes a given domain of knowledge, defines its concepts (classes) and organizes them into a hierarchy. Next come the attributes and the relations (properties or slots) between concepts, together with the restrictions on them. Finally, the instances of the concepts are identified, populating the ontology. Defining the concepts and the relations between concepts among the experts of a given domain usually requires an enormous amount of negotiation among peers to reconcile a great many different needs and points of view (one need only think of all the points of view represented by the different semantic systems associated with the various linguistic systems). A practice recommended by much of the literature on standards – particularly in the semantic web and linked open data arena – is that, before creating a new ontology, one should evaluate the reuse of the ontologies already available for the domain of knowledge at hand, preferably referring to standard ontologies or, failing that, to well-known domain ontologies that are documented and maintained by agencies deemed reliable. Recommendations on the reuse of ontologies are also given in the guidelines for semantic interoperability through linked open data ("Linee guida per l'interoperabilità semantica attraverso i linked open data") published in November 2012 by the Agenzia per l'Italia Digitale.[2]

[1] Cf. Resource Description Framework, https://www.w3.org/RDF/.
[2] Cf. https://www.agid.gov.it/sites/default/files/repository_files/documentazione_trasparenza/cdc-spc-gdl6-interoperabilitasemopendata_v2.0_0.pdf.
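To make the construction steps just described concrete, here is a minimal sketch in Python with rdflib. The namespace and the class and property names are invented for illustration and do not come from any of the standard ontologies discussed in this article.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/onto/")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# 1. Circumscribe the domain: define concepts (classes) and organize them in a hierarchy
g.add((EX.CulturalProperty, RDF.type, OWL.Class))
g.add((EX.Painting, RDF.type, OWL.Class))
g.add((EX.Painting, RDFS.subClassOf, EX.CulturalProperty))
g.add((EX.Agent, RDF.type, OWL.Class))

# 2. Define relations (properties or slots) and restrict them
g.add((EX.createdBy, RDF.type, OWL.ObjectProperty))
g.add((EX.createdBy, RDFS.domain, EX.CulturalProperty))  # subject must be a CulturalProperty
g.add((EX.createdBy, RDFS.range, EX.Agent))              # object must be an Agent

# 3. Populate the ontology with instances of the concepts
g.add((EX.work42, RDF.type, EX.Painting))
g.add((EX.someArtist, RDF.type, EX.Agent))
g.add((EX.work42, EX.createdBy, EX.someArtist))

print(g.serialize(format="turtle"))
```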
Nevertheless, the distributed character of ontology modelling and publication processes has led to the development of several ontologies for very similar domains. This tendency, while not a problem in itself, risks weakening the chances of achieving semantic interoperability between data, which is instead favoured by complete knowledge of the models underlying data publication.

Linked open data and ontologies in the cultural field

Before examining tools such as ontology registers and highlighting their usefulness in the semantic web, it is worth starting from the state of the art of linked open data publication projects in the cultural field and of the ontologies used in them. A first analysis begins with the Review on linked open data sources, carried out within the Athena Plus Project in October 2013;[3] it was subsequently supplemented and updated by an OCLC survey conducted between July and August 2014 and then republished with some corrections in 2015.[4] The latter concerned a first census of linked open data publication projects in a predominantly library context. The OCLC surveys were carried out by circulating the link to the survey on many listservs and on Twitter, thus relying on the potential of social media to reach the largest possible number of institutions. The results of the first survey of 2014, completed in 2017,[5] were later compared with the revised edition[6] of the same survey conducted by OCLC between 17 April and 25 May 2018. The 2018 survey was answered by 81 institutions reporting a total of 104 projects, against the 71 institutions that had reported 112 projects in 2015. Of those 104 projects, only 42 had already been described previously. Seventy-five per cent of the 104 implementations are active, and 40% of these have been active for more than four years. Of the 104 projects described, 29 declare that they use SPARQL endpoints to allow direct access to the linked data and 33 declare that they provide access to dump files. Only 4 Italian institutions took part in the 2018 OCLC survey: the Library of the Camera dei deputati, Casalini Libri (SHARE-VDE group), the Coordinamento delle Biblioteche Speciali e Specialistiche di Torino (CoBIS) and the Università degli Studi Roma TRE. While on the international front the present contribution adds little to the OCLC survey, apart from a precise check on the actual availability of the endpoints, it considerably enriches the inquiry on the Italian front. As regards the OCLC survey, this research considers only the projects that concerned the production of descriptive data on cultural heritage[7] and that declared the presence of a SPARQL endpoint or of dump files, that is, those for which it was possible to verify the use of the declared ontologies and the resulting modelling of the data. It therefore excludes the projects that declare they use linked data only in the back end,[8] those in which access to the data is still declared private for technical reasons[9] or proved unreachable as of 31/12/2019,[10] and, finally, the digitization projects in which the provision of LOD is in fact delegated to the portal.

[3] Available at http://www.athenaplus.eu/getFile.php?id=190 (accessed 19/12/2019).
[4] The 2014 survey did not consider some significant players, such as the national libraries of France and Germany. The survey was therefore repeated between 1 June and 31 July 2015.
[5] Cf. Smith-Yoshimura, Analysis of International Linked Data Survey for Implementers, D-Lib Magazine, 2017, 22 (7/8), pp. 141–167, available at http://doi.org/10.1045/july2016-smith-yoshimura.
[6] Available at https://www.oclc.org/research/themes/data-science/linkeddata/linked-data-survey.html.
[7] Projects concerning innovative methods of automatic linked open data production are therefore excluded, such as those of the Charles University in Prague (cf. https://etl.linkedpipes.com), Cornell University (cf. http://ld4p.org/) and the North Rhine-Westphalian Library Service Center (cf. http://lobid.org).
[8] This is the case of the digital library projects of the Agencia Española de Cooperación Internacional para el Desarrollo (AECID), of the Anythink Libraries catalogue, of the Digital Public Library of America (cf. http://dp.la/), of the library of the Spanish Ministry of Defence, etc.
[9] See, for example, the projects of George Washington University, the National Diet Library project on the "Nippon Decimal Classification" (NDC), and those of the National Library of Medicine (USA), the University of South Florida St. Petersburg, the Memorial University of Newfoundland, the National Library of Portugal, the National Library of Scotland, and the National Library of Wales.
[10] This is the case of the Australian National University.
The survey identifies 57 linked open data publication projects attributable to the cultural field.[11] Among them, the 15 Italian projects counted are all maintained and equipped with a working SPARQL endpoint.

[11] The list of projects follows: American Numismatic Society (http://numismatics.org/archives and http://numismatics.org/authorities), Amsterdam Museum (https://www.amsterdammuseum.nl/open-data), Archaeology Data Service (http://archaeologydataservice.ac.uk/research/stellar/), Archivio Centrale dello Stato (http://dati.acs.beniculturali.it), Archivio storico della Presidenza della Repubblica italiana (https://archivio.quirinale.it/aspr/redazione/linked-open-data), Bavarian State Library (lod.b3kat.de), Biblioteca de Galicia (http://biblioteca.galiciana.gal/gl/datos_abiertos/datos_abiertos.cmd), Biblioteca Nacional de España (http://datos.bne.es/), Bibliothèque nationale de France (http://data.bnf.fr/), Biblioteca Nazionale Centrale di Firenze, Soggettario (http://thes.bncf.firenze.sbn.it/thes-dati.htm), British Library, British National Bibliography (http://bnb.data.bl.uk), British Museum (http://collection.britishmuseum.org), Camera dei deputati (http://dati.camera.it/), Carnegie Hall (http://data.carnegiehall.org), Casalini Libri (www.share-vde.org), Centro di Documentazione Ebraica Contemporanea CDEC (http://dati.cdec.it/), Claros Project (https://clarosdata.wordpress.com/), CoBIS - Coordinamento delle Biblioteche Speciali e Specialistiche di Torino (https://dati.cobis.to.it/), Consiglio Nazionale delle Ricerche (http://data.cnr.it/), Corago LOD (http://www.disit.org/corago/), Cultura Italia (http://dati.culturaitalia.it/), Data Archiving and Networked Services (DANS), Royal Netherlands Academy of Arts and Sciences (http://www.cedar-project.nl/), Deutsche Nationalbibliothek (http://www.dnb.de/EN/lds), Europeana Foundation (http://europeana.eu/), Fondazione Zeri (http://data.fondazionezeri.unibo.it/), Fundación Ignacio Larramendi (http://www.larramendi.es/i18n/cms/elemento.cmd?id=estaticos/paginas/Biblioteca_Virtual_Ignacio_Larramen.html), German National Library (http://www.dnb.de/lds), Getty Vocabularies (http://vocab.getty.edu/), Goldsmiths' College (http://slickmem.data.t-mus.org/), Hellespont Project (http://hellespont.dainst.org/startpage/index.html), Historic Environment Scotland (http://heritagedata.org/live/schemes/scapa.html), Istituto Centrale per gli Archivi – Sistema Archivistico Nazionale (http://san.beniculturali.it/web/san/dati-san-lod), Istituto Centrale per il Catalogo e la Documentazione – Catalogo generale dei beni culturali (http://dati.beniculturali.it/arco/), Istituto per i beni artistici culturali e naturali (IBACN) della Regione Emilia-Romagna (http://ibc.regione.emilia-romagna.it/servizi-online/lod), Library of Congress (http://id.loc.gov/ and http://bibframe.org/), Linked Jazz (https://linkedjazz.org/), Linking Lives (http://archiveshub.ac.uk/linkinglives/), linked open data for ACademia (LOD.AC) project (http://lod.ac/), Ministero per i beni e le attività culturali e per il turismo (http://dati.beniculturali.it), National Diet Library (http://id.ndl.go.jp/auth/ndla), National Library of Finland (http://data.nationallibrary.fi), National Library of Medicine (https://id.nlm.nih.gov/mesh/), National Széchényi Library (http://v.mek.oszk.hu/FlintSparqlEditor/index-mek.html), Nomisma (http://nomisma.org), North Carolina State University Libraries (http://www.lib.ncsu.edu/ld/onld/), NTNU (Norwegian University of Science and Technology) University Library (http://www.ntnu.no/ub/digital/), OCLC (http://www.worldcat.org/ and http://viaf.org), Oslo Public Library (http://data.deichman.no/), Progetto Reload (http://labs.regesta.com/reloadProject/client/), Rijksmuseum (https://datahub.io/rasvaan/201805-rma-collection), Russian Linked Culture Cloud (http://culturecloud.ru/), Springer Nature (http://www.springernature.com/scigraph), University of Alberta Libraries (http://canlink.library.ualberta.ca and http://dx.doi.org/10.7939/DVN/URXSGC), University of California - Los Angeles (http://link.library.ucla.edu), University of Nevada, Las Vegas (https://www.library.unlv.edu/linked-data), University of Oxford (oxlod.eng.ox.ac.uk), Yale Center for British Art (http://britishart.yale.edu/collections/using-collections/technology/linked-open-data).
Classifying, in broad terms, the types of descriptive data on cultural heritage by specific disciplinary area, out of the total projects counted, 34 publish bibliographic data, 30 authority files, 13 publish data on works of art, and 11 publish archival data. The other types of published data are archaeological data (6), musicological data (2), vocabularies (9), biographical data (9), historical data (8) and geographical data (4). Moving to the analysis of the ontological models underlying the description of cultural heritage, of the 57 projects counted, 38 declare that they use the Dublin Core and DC Terms ontologies, 14 the CIDOC-CRM ontology, 12 the Bibliographic Ontology, 9 BIBFRAME, 6 the EDM ontology, 7 the Biographical Ontology, and 10 RDA; as many as 27 projects have instead developed their own ontologies.
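A hedged illustration of how such a verification can be performed: interrogating a project's SPARQL endpoint for the classes its resources are typed with reveals, through the class URIs, which ontologies the dataset is modelled on. The sketch below uses the Python SPARQLWrapper library; the endpoint address is a placeholder to be replaced with any of the endpoints listed above.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder address: substitute any of the SPARQL endpoints surveyed above
endpoint = SPARQLWrapper("http://example.org/sparql")
endpoint.setReturnFormat(JSON)
endpoint.setQuery("""
    SELECT ?class (COUNT(?s) AS ?n)
    WHERE { ?s a ?class }
    GROUP BY ?class
    ORDER BY DESC(?n)
    LIMIT 25
""")

# The namespaces of the returned class URIs identify the ontologies in use
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["n"]["value"].rjust(8), row["class"]["value"])
```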
The analysis thus reveals a composite and interesting reality, a sign also of the revival of a lively debate concerning not only the potential of the new technologies but also the descriptive models, with some critical rethinking of the scientific and technical output of the previous decades. Until the end of the 1990s, the scientific debate on descriptive models was in fact characterized by the search for, and publication of, data schemas considered adequate – within a given community of reference – for the description of cultural heritage and specific to bibliographic description, archival description, and the description of so-called museum objects. The fruit of that debate was the production of numerous conceptual models: a survey conducted in 2009 by Jenn Riley of the Indiana University Libraries (White Professional Development Award) counted about 105 among those most widespread internationally. Jenn Riley provides a spectacular graphic representation of them,[12] evocative of the complexity of the situation, organizing the models according to their specific domains (libraries, archives, museums), their communities of reference, the function performed (conceptual models, controlled vocabularies), the technologies used (XSD schemas, ontologies, SKOS thesauri) and the purpose for which they were conceived (technical, descriptive and structural metadata, metadata for long-term preservation, etc.). At the national level this complexity grows further: suffice it to say that the Istituto Centrale per il Catalogo e la Documentazione alone provides another 30 descriptive metadata schemas for the cataloguing of archaeological, historical-artistic, architectural and other heritage,[13] and that, in general, the central institutes of MiBAC have often adopted national profiles that are adaptations of the international schemas (for example, the schemas of the catalogue of the Sistema archivistico nazionale – CAT-SAN – based on the international EAD standard, or the MAG standard for the metadata of digital images, published by ICCU and based on the international METS standard). This proliferation undoubtedly stems from the desire to express, with the greatest possible analytical detail, the full informative potential of the data collected during the inventorying and cataloguing of cultural heritage items, without the often restrictive constraints imposed by adherence to external standard schemas. This has often come at the cost of losing the potential for syntactic interoperability that adherence to standards was meant to guarantee in data-sharing projects. Since 2010, the debate around the potential of the new web (the web of data, or semantic web) has helped to revive the discussion among the experts concerned with the methodologies for describing cultural heritage.

[12] Cf. Jenn Riley, Seeing Standards: A Visualization of the Metadata Universe, available at http://jennriley.com/metadatamap/.
[13] Cf. Maria Letizia Mancinelli, Gli standard catalografici dell'Istituto Centrale per il Catalogo e la Documentazione, in Roberta Tucci, Le voci, le opere e le cose. La catalogazione dei beni culturali demoetnoantropologici, Roma, Istituto centrale per il catalogo e la documentazione - Ministero dei beni e delle attività culturali e del turismo, 2018, pp. 279–302, available at http://www.iccd.beniculturali.it/getFile.php?id=6670.
The first attempts to explore the formalisms and technologies connected with the semantic web – and in particular ontologies and linked open data – can be traced in part to two opposing tendencies: the first is represented by attempts to translate standard metadata schemas, at times uncritically, into ontological form; the second can be seen in the desire to break free from the constraints often present in the sector's standards and to publish cultural heritage data on the basis of lightweight conceptual models, even ones not specifically conceived for describing cultural heritage items and their collections (see, for example, the case of OCLC, which publishes its entire bibliographic catalogue using the schema.org ontology, designed for more efficient indexing of websites by search engines). What should be stressed here is that, even in this new phase of the theoretical and scientific debate, we witness, on the one hand, the re-emergence, in the library, archival and museum sectors, of attitudes reluctant to accept contamination, in the name of the specificities of the various disciplines and descriptive methodologies; on the other hand, a flourishing of disconnected initiatives devoted either to producing new, often extremely complex conceptual models for purposes tied to specific research projects, or to the uncritical adoption of already available ontological models for the rapid publication of open, easily linkable data. What has seemed to be missing so far is a serious reflection, supported by an adequate number of case studies, on the real implications that the choice of an ontological model has for linked open data, in terms of the quality of the links between data and therefore of the growth in the cognitive power offered to those who consult the data through web portals. Anyone who has worked with linked open data knows well that ontological modelling is never entirely indifferent to the intended use of the data, and that the choice of a given ontology is often functional precisely to the multiple uses of the data considered plausible by the modellers. On the other hand, not even recourse to ontologies considered the gold standard for a given domain – as CIDOC is considered in the cultural field – makes one immune from possibly wrong data-modelling choices.[14]

[14] This is, for example, the case of the linked open data of the Cultura Italia portal, whose CIDOC-based modelling is riddled with both syntactic and semantic errors due to a mistaken interpretation of the standard. Cf. Vladimir Alexiev, How Not To Do Linked Data, available at https://gist.github.com/VladimirAlexiev/090d5e54a525d57acb9b366121e77573. It is also the case of the British Museum's data modelling which, although correct from a syntactic and semantic point of view, is often cited among practitioners as an example of excessively complex and abstractly conceptual modelling that hampers interlinking with other data and querying of the endpoint: reference is made to the critical points highlighted in specific working groups during the fourth LOD-LAM Summit, held in Venice at the Fondazione Giorgio Cini in June 2017 (cf. https://summit2017.lodlam.net/), in which the present writer took part as a delegate for the Università La Sapienza di Roma. One working group in particular was devoted to analyzing the use of CIDOC as an ontology for the publication of LOD in the cultural field (cf. the notes on the working sessions, http://bit.ly/2sPCj3e, last accessed 27/04/2019).
The role of ontology registers and the experience of metadata registries

Ontology registers, understood as repositories of ontologies, aim to help the researcher identify the ontology or ontologies best suited to his or her purpose and to evaluate their quality and reusability on the basis of evaluation criteria resting on parameters that are as objective as possible, that is, on measurable metrics. Some metrics are based on statistics and others on quality control. The former calculate, for example, the number of non-anonymous classes present in the ontology, the number of properties (or slots), the number of individuals,[15] the maximum depth of the hierarchical relations between classes,[16] and the average and maximum number of sibling classes, that is, classes on the same level of the tree. Quality-control metrics, on the other hand, are based on parameters such as classes with a single subclass (a situation that often indicates an under-specified hierarchy or a poorly drawn distinction between class and subclass), classes with more than 25 subclasses (such an articulated class is often eligible for further distinctions and categorizations), and classes without definitions.[17] However, the mere availability of quantitative or otherwise objective elements does not appear sufficient to decree that one model is better than another. Evaluation is never free of subjective elements, and we need to understand whether tools exist that can also assist the modellers' interpretations, and whether ontology registers can prove useful in this regard. Ontology registers can make it possible to assess how a concept identified by a user has been expressed in various ontologies, making explicit the equivalences and alignments between the various registered conceptual models, with tools able not only to express the RDFS and OWL axioms specifically provided for managing conceptual alignments (subclass/subproperty or equivalence relations between classes or properties of two different ontologies) but also to provide syntactic constructs for expressing more complex relations (e.g., a concept in one ontology that corresponds to several concepts, or to a chain of classes and properties, in another ontology). Ontology registers are strongly related to the experience of metadata registries. One of the most complete studies on the usefulness of metadata registries arose in the field of the digital humanities: the "Principles of Metadata Registries. A White Paper of the DELOS Working Group on Registries",[18] sponsored by the DELOS Network of Excellence on digital libraries, a European Commission initiative created to promote research and international cooperation in the digital library field.

[15] This metric is meaningful only for specific ontology-representation languages, since, for example, ontologies in the OBO format contain no individuals.
[16] For ontologies expressed in OWL and RDFS, the "is-a" relation is considered the hierarchical relation. For ontologies in the OBO format, the hierarchical relations considered are "is-a", "has-part" and the inverse of "develops-from".
[17] Some of these properties are defined in the OMV Ontology and have the omv: prefix; the rest are defined locally in the BioPortal Metadata Ontology (metrics: prefix).
[18] Baker, Thomas, et al. "Principles of Metadata Registries. A White Paper of the DELOS Working Group on Registries" (2003), available at https://pdfs.semanticscholar.org/01ea/e200c915fbb38faf2584e87230bb15d2d683.pdf (accessed 26/12/2018).
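The statistics-based and quality-control metrics listed at the beginning of this section lend themselves to direct computation over an ontology file. The following Python/rdflib sketch, which assumes a hypothetical local Turtle copy of the ontology, shows two of them: the count of named (non-anonymous) classes and the detection of classes with a single subclass.

```python
from rdflib import BNode, Graph
from rdflib.namespace import OWL, RDF, RDFS

g = Graph()
g.parse("ontology.ttl")  # hypothetical local copy of the ontology, in Turtle

# Statistics-based metric: number of named (non-anonymous) classes
named_classes = {c for c in g.subjects(RDF.type, OWL.Class) if not isinstance(c, BNode)}
print("named classes:", len(named_classes))

# Quality-control metric: classes with exactly one subclass,
# often the sign of an under-specified hierarchy
for cls in named_classes:
    subclasses = list(g.subjects(RDFS.subClassOf, cls))
    if len(subclasses) == 1:
        print("single-subclass class:", cls)
```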
Compared with metadata registries, ontology registers can be used – exploiting the characteristics of the RDF and OWL languages – to make the conceptual models formalized in ontologies comprehensible to machines as well as to human beings, and to encourage the explication of the semantic relations between classes and properties of the ontologies in order to perform mappings and complex queries on data expressed according to different data models. An ontology register in fact supports the following possible usage scenarios, many of which it shares with metadata registries, although the formalisms of ontological representation amplify their effects:
- identifying the best way to describe and model a resource (a query to the register should return a list of classes and properties with indications of their use and concrete examples);
- harmonizing the ontologies used in a given domain by various institutions or within various projects (the register will describe how individual classes and properties have been used, so as to support crosswalk operations between ontologies; a minimal sketch of such an alignment layer follows this list);
- publishing given data as linked open data while making maximum reuse of existing ontologies (one will query the register and view all the classes and properties related to a searched concept, contextualized with respect to the characteristics of the reference conceptual model and to use cases);
- keeping an ontology on which certain data have been modelled comprehensible in the long term (the register can keep a local copy of the various versions of the ontologies, thereby also acting as a repository, and track the changes occurring over time between one version of an ontology and the next, also defining compatibility between versions);
- identifying, for each ontology examined, references to who maintains it, who updates it, who uses it and in which projects, and establishing whether its authors have developed ontologies in neighbouring domains (the register will record the ontology's update frequency, its authors, the datasets based on it, and the people who modelled it);
- verifying whether an ontology has been used in a given domain and which class of the ontology has been used; how a given class is defined in a given ontology, by which institution it has been used and how, which other classes it is declared equivalent to, if any, and what relations exist between one class and another within the same ontology and/or with classes of other ontologies, and through which properties they are expressed.
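The crosswalk scenario above rests on the register making alignments machine-readable. A minimal sketch of the idea, in Python with rdflib: the asserted equivalence between a Dublin Core term and a schema.org property is an illustrative assumption chosen because both vocabularies are widely known, not an officially sanctioned mapping.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import DCTERMS, OWL

SCHEMA = Namespace("http://schema.org/")

registry = Graph()
# The register records that two properties from different models are aligned
registry.add((DCTERMS.creator, OWL.equivalentProperty, SCHEMA.creator))

# Hypothetical dataset using one of the two models
data = Graph()
data.add((URIRef("http://example.org/work/1"), SCHEMA.creator,
          URIRef("http://example.org/agent/7")))

# A client harmonizing datasets can read the alignment layer and restate
# each statement under the aligned property of the other model
for p1, _, p2 in registry.triples((None, OWL.equivalentProperty, None)):
    for s, _, o in list(data.triples((None, p2, None))):
        data.add((s, p1, o))

print(data.serialize(format="turtle"))
```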
Potentially, the register can simplify the mapping and integration of metadata sets, thanks to an architecture able to extract the common entities from the individual vocabularies, made explicit through mapping mechanisms towards top-level ontologies and through mapping operations between classes and properties, and/or complex constructs of classes and properties, carried out by registered and accredited users. Also fundamental are the functions providing for the automatic collection of the requirements serving the policies adopted to guarantee not only storage but also the long-term availability of the published ontology, both as regards the computer format of the registered files and as regards the metadata needed for the long-term preservation of the ontologies.

Ontology registers

A first review of ontology registers was carried out in 2001 by Ding and Fensel, who counted 8 ontology registers, almost none of which, however, is maintained today.[19] D'Aquin and Noy in 2012[20] provide a more up-to-date analysis of ontology registers, counting as many as 11, of which only 6 are maintained today: BioPortal, developed for the biomedical sector by the National Center for Biomedical Ontology; Cupboard, created within the NeOn Project, based on the concept of a space in which each user can select and work on (annotate, semantically align and comment on) a specific subset of ontologies; the OBO Foundry initiative, which aims to create sets of well-documented and well-defined biomedical ontologies able to interrelate with one another; oeGov - Ontologies for e-Government, an initiative for the creation and storage of ontologies dedicated to digital government, maintained by TopQuadrant; OLS - Ontology Lookup Service, developed by the European Bioinformatics Institute, a register used in various biomedical projects; OntologyDesignPatterns.org (ODP), a catalogue of ontology design patterns, which are in some cases micro-ontologies in themselves and in others ontological components that can easily be integrated into domain ontologies; OntoSelect, equipped with an advanced ontology search mechanism that organizes results on the basis of a ranking using parameters such as the number of imports or the number of languages; OntoSearch2, which aims to provide an efficient query mechanism over ontologies, extending the SPARQL language; the ONKI ontology server, supporting various information services of the Finnish government; the TONES repository, a register of ontologies all rigorously expressed in OWL; and Schema-Cache, developed on the Talis Platform.
In 2016 Debashis Naskar and Biswanath Dutta identified the following 10 registers, classifying them by functionality into repositories, directories and registries,[21] where ontology registries are credited with search, browsing, mapping and metadata functions; ontology directories are characterized as relating to a particular domain of knowledge and performing reference functions, that is, user assistance, but without mapping or metadata functions; and ontology repositories add to the functions of ontology registries those of storage, long-term preservation and standardization:
1) BioPortal, a domain repository;
2) AgroPortal, a domain repository;
3) COLORE (COmmon Logic Ontology REpository), a general repository;
4) ROMULUS (Repository of Ontologies for MULtiple USes), a mixed repository;
5) oeGov, a domain directory;
6) ODP - OntologyDesignPatterns, a general directory;
7) ONKI, a mixed directory;
I quattro registri esaminati, in particolare, sono conformi al modello FAIR nel senso che consentono di rendere al loro interno le risorse (le ontologie) findable, accessible, interoperable e reusable conformemente ai 15 principi FAIR.26 Inoltre essi si caratterizzano per la volontà di gestire in maniera uniforme i metadati delle ontologie registrate, secondo un preciso modello concettuale che è posto a loro fondamento, a differenza di altre realizzazioni che si limitano a trattare le ontologie sulla base dei soli metadati in esse esplicitamente previsti e sulla base di metriche automaticamente. In particolare, LOV – che è il portale di gran lunga più consultato da chi si occupa di ontologie – ha un orizzonte poco specialistico e mira a ricomprendere qualunque ontologia realizzata nei vari domini della conoscenza, mentre Bioportal e Agroportal si specializzano rispettivamente nel settore biomedico e agronomico. Dall’analisi delle loro funzionalità e dai test sul loro utilizzo è emerso chiaramente che strumenti generalisti come LOV risultano efficaci solo nel caso in cui un utente miri ad ottenere una prima panoramica sulle ontologie esistenti e ad identificare le ontologie afferenti al dominio di proprio interesse. Nella restituzione dei risultati di ricerca l’analisi ha evidenziato come la metodologia di calcolo del ranking attribuito a ciascuna classe/proprietà dell’ontologia presti – soprattutto in LOV – una attenzione nettamente maggiore al concetto di “diffusione” dell’utilizzo di determinate ontologie (diffusione calcolata sull’uso delle classi e delle proprietà nella metadatazione dei dati censiti nella lod- 22 Jonquet, C., Toulet, A., Arnaud, E., Aubin, S., Dzale Yeumo, W. E., Emonet, V., Graybeal, J., Laporte, M.-A., Musen, M. A., Pesce, V., Larmande, P. (2018), “AgroPortal: A vocabulary and ontology repository for agronomy”, Computers and Electronics in Agriculture, 144, 126–143, DOI: 10.1016/j.compag.2017.10.012, disponibile all’indirizzo https://prodinra.inra.fr/record/427822. 23 Cfr. https://fairsharing.org/. 24 Cfr. https://www.godan.info/datasets/vest-agroportal-map-standards. 25 Cfr. https://aginfra.eu/. 26 Cfr. https://www.nature.com/articles/sdata201618. https://fairsharing.org/ https://www.godan.info/datasets/vest-agroportal-map-standards https://aginfra.eu/ https://www.nature.com/articles/sdata201618 JLIS.it 11, 2 (May 2020) ISSN: 2038-1026 online Open access article licensed under CC-BY DOI: 10.4403/jlis.it-12624 55 cloud) a discapito del concetto di “provenienza”, ovvero di collocazione in un determinato contesto – scientifico o meno – dell’ontologia medesima. In riferimento alle caratteristiche dei registri, si riportano di seguito (fig. 1) in forma schematica alcuni risultati del processo di valutazione dei 4 registri analizzati in dettaglio. Fig. 1. Risultati del processo di valutazione di 4 registri di ontologie Alcune annotazioni finali e sviluppi possibili Come ogni risorsa, anche le ontologie, i thesauri e i vocabolari controllati devono essere descritti con metadati pertinenti e adatti a facilitarne l’identificazione, la selezione, il riutilizzo e la conservazione a lungo termine. I registri di ontologie sono strumenti che descrivono formalmente i modelli ontologici disponibili sul web, a fronte di modelli di metadatazione il più possibile uniformi, e ne agevolano il reperimento e la valutazione, incentivandone il riuso e facilitando i processi di allineamento semantico e di interoperabilità. 
Un registro di ontologie per i beni culturali dovrebbe essere ideato per travalicare i limiti di un mero catalogo di risorse formali da prendere a riferimento per operare le scelte più opportune nella valorizzazione semantica dei propri dati. Esso dovrebbe anzitutto costituire un primo nucleo di quello che in prospettiva potrebbe diventare un repertorio ufficiale mantenuto, ad esempio, o dal MiBACT o da AgID, nell’ambito delle proprie funzioni istituzionali di promozione dell’omogeneità dei linguaggi, delle procedure e degli standard, connesse in particolare alle politiche di valorizzazione del patrimonio informativo pubblico nazionale, ivi compresa la definizione della strategia in materia di dati aperti nonché lo sviluppo e la gestione del portale nazionale dei dati aperti. Analogamente a quanto già avviene nell’ambito della conservazione digitale infatti, AgID inizia a porsi JLIS.it 11, 2 (May 2020) ISSN: 2038-1026 online Open access article licensed under CC-BY DOI: 10.4403/jlis.it-12624 56 come garante di strumenti che raccolgono tecnologie e teorie formali di tipo semantico certificate o accreditate mediante un sistema di valutazione e certificazione condiviso.27 Nell’ambito dei beni culturali, un possibile ulteriore obiettivo è che un registro di dominio possa costituire il fondamento semantico di una digital library trasversale ai vari domini della cultura (dominio biblioteconomico, archivistico e museale), divenendone una componente di servizi interni, al fine di consentire la mappatura delle varie ontologie utilizzate nei sistemi di origine, che interagiscono con la digital library, e per ottimizzare le prestazioni delle interrogazioni. In tale veste, il registro garantirebbe una architettura in grado di estrapolare le entità comuni dai singoli modelli ontologici e rendere più semplici le mappature e le integrazioni fra set di metadati. Questa finalità riguarda la possibilità di usare il registro per facilitare la costruzione di indici comuni a risorse metadatate sulla base di ontologie differenti per la ricerca delle risorse culturali e l’accesso ad esse. La disponibilità di un simile strumento consentirebbe di superare una delle maggiori difficoltà ravvisate nei tentativi anche recenti di costruzione di digital library cross-domain. Infatti le risorse relative a ciascun dominio potrebbero continuare ad essere pubblicate sulla base dei modelli concettuali e dei sistemi di metadatazione propri di ciascun ambito disciplinare (archivistico, librario, archeologico, demoetnoantropologico, storico-artistico etc.) mentre la reductio ad unum sarebbe demandata solo alla fase di ricerca nella digital library. Si supererebbe in questo modo uno dei principali limiti degli attuali portali aggregatori di risorse come CulturaItalia o Europeana: per inviare i dati all’aggregatore è necessario effettuare operazioni di mappatura semantica a partire dai sistemi originari sulla base del tracciato dati individuato per l’aggregatore. Tale tracciato spesso coincide con un tracciato di minima scelto come modello dati comune alle varie tipologie di risorse gestite. Le operazioni di mappatura hanno pertanto comportato operazioni di schiacciamento semantico dei dati che in origine si presentavano molto ricchi (si pensi ai dati espressi secondo i tracciati descrittivi all’interno del Catalogo generale dei beni culturali) o di forzature semantiche di dati espressi secondo modalità consolidate di metadatazione peculiari di un determinato dominio (si pensi ai dati archivistici). 
A digital library based on semantic web technologies should instead offer the possibility of searching across the full set of descriptive elements of each resource, each indexed according to its own descriptive model; moreover, thanks to the use of an ontology register in which the layer of conceptual mapping between the descriptive models of the various domains is made explicit (and computable by software), it would also be possible to search on common keys, in the sense of entities expressed in a definite manner through a procedure that gathers and aggregates particular aspects that a multiplicity of information objects have in common.

[27] Cf. OntoPiA, https://github.com/italia/daf-ontologie-vocabolari-controllati.

References

Allocca, Carlo, Mathieu D'Aquin, Enrico Motta. 2009. "DOOR: towards a formalization of ontology relations." In International Conference on Knowledge Engineering and Ontology Development, KEOD'09, 13–20. Madeira, Portugal.
Allocca, Carlo, Mathieu D'Aquin, Enrico Motta. 2012. "Impact of Using Relationships between Ontologies to Enhance the Ontology Search Results." In Knowledge Discovery, Knowledge Engineering and Knowledge Management, 164–176. Berlin, Heidelberg: Springer.
D'Aquin, Mathieu, Holger Lewen. 2009. "Cupboard - a place to expose your ontologies to applications and the community." In Lecture Notes in Computer Science, vol. 5554, 913–918. Berlin: Springer.
Naskar, Debashis. 2014. Ontology and ontology libraries: a critical study. Master's dissertation (carried out under the supervision of Biswanath Dutta). Bangalore, India: DRTC, Indian Statistical Institute, 10–49.
Ding, Ying, Dieter Fensel. 2001. "Ontology Library Systems: The Key for Successful Ontology Reuse." In Isabel F. Cruz, Stefan Decker, Jérôme Euzenat, Deborah L. McGuinness (eds.), Proceedings of SWWS'01, The First Semantic Web Working Symposium, 93–112. http://sw-portal.deri.org/papers/publications/ding+01.pdf (last accessed 30/12/2019).
Hartmann, Jens, Raúl Palma, Asunción Gómez-Pérez. 2009. "Ontology Repositories." In Handbook on Ontologies, 906–915. Berlin, Heidelberg: Springer-Verlag.
Jonquet, Clement, et al. 2018. "AgroPortal: A vocabulary and ontology repository for agronomy." Computers and Electronics in Agriculture 144:126–143. https://prodinra.inra.fr/record/427822.
Jonquet, Clement, Anne Toulet, Biswanath Dutta, Vincent Emonet. 2018. "Harnessing the Power of Unified Metadata in an Ontology Repository: The Case of AgroPortal." Journal on Data Semantics 7:191–221. https://doi.org/10.1007/s13740-018-0091-5.
Magkanaraki, Aimilia, Sofia Alexaki, Vassilis Christophides, Dimitris Plexousakis. 2002. "Benchmarking RDF Schemas for the Semantic Web." Lecture Notes in Computer Science 2342. Berlin, Heidelberg: Springer.
McCarthy, John W., et al. 2009. Data Modeling and Harmonization with OWL: Opportunities and Lessons Learned. http://ceur-ws.org/Vol-524/swese2009_7.pdf (last accessed 30/12/2019).
Noy, Natalya F., et al. 2009. "BioPortal: ontologies and integrated data resources at the click of a mouse." Nucleic Acids Research, vol. 37.
Noy, Natalya F., Mathieu D'Aquin. 2012. "Where to Publish and Find Ontologies? A Survey of Ontology Libraries." Journal of Web Semantics, vol. 11, January. http://www.websemanticsjournal.org/index.php/ps/article/view/217 (last accessed 30/12/2019).
Stadtmüller, Steffen, Andreas Harth, Marko Grobelnik. 2013. "Accessing Information About Linked Data Vocabularies with vocab.cc." In Li J., Qi G., Zhao D., Nejdl W., Zheng H.T. (eds.), Semantic Web and Web Science, Springer Proceedings in Complexity. New York: Springer.
Thomas, Edward, Jeff Z. Pan, Derek Sleeman. 2007. "ONTOSEARCH2: Searching ontologies semantically." In Proceedings of the OWLED 2007 Workshop on OWL: Experiences and Directions, Innsbruck, Austria, June 6-7, 2007. ceur-ws.org.
Xiang, Zuoshuang, Mélanie Courtot, Ryan R. Brinkman, Alan Ruttenberg, Yongqun He. 2010. "OntoFox: web-based support for ontology reuse." BMC Research Notes, vol. 3.

work_rpd4prb2bngc7ju7cehbwxujny ----

ProInflow : časopis pro informační vědy 2/2011

Tamar Sadeh

DISCOVERY AND MANAGEMENT OF SCHOLARLY MATERIALS: NEW-GENERATION LIBRARY SYSTEMS

Abstract: The discovery and management of scholarly materials have changed in recent years with the introduction of new-generation systems based on decoupled architecture. In addition to offering a modern user experience, new library discovery systems extend the scope of materials available through a single interface far beyond the physical collections of the library, reaching the wealth of scholarly collections of global significance. Such systems also leverage a body of usage data gathered from institutions worldwide to enhance the discoverability of materials. New management systems, built from the outset to manage all types of scholarly assets, harness technological advances, shared bibliographic metadata, and community collaboration to optimize library services.
The paper examines some of the current trends in new-generation systems and focuses on the way in which collaboration among stakeholders and the aggregation of information—scholarly content, bibliographic metadata, and usage data—combine with the local and individual context to establish a new level of discovery and management of scholarly materials.

Keywords: search engine, relevance ranking, aggregated index, usage data, recommender, decoupled architecture

1 Introduction

The advent of metasearch technology, which was first presented by Ex Libris at the 10th CASLIN conference in 2001, marked the beginning of a new era for library systems. Metasearching and context-sensitive linking, which Ex Libris also introduced at about the same time, were the first forays by library-system vendors into an arena that until then had been the sole territory of information aggregators. Metasearch technology, which offers unified searching across heterogeneous information resources,1 and context-sensitive linking, which establishes library-controlled links between various components of the information landscape,2 helped libraries break down the traditional barriers between various silos of local materials and global content and added the institution's and the user's context to the information-seeking flow. With the addition of electronic-resource management systems and digital-asset management systems shortly after, libraries gained the functionality required for managing the full spectrum of library content—print, electronic, and born-digital content—and making discovery and access possible. However, the multiplicity and complexity of the library systems, along with the unparalleled scale of content and speed offered by non-library information systems (primarily Web search engines), triggered the drift of end users toward the latter. Librarians, on their part, were deploying less than optimal workflows for the management of all aspects of library services.

This paper addresses the way in which the discovery and management of scholarly materials have changed in recent years and examines some of the current trends. In particular, the paper focuses on the way in which collaboration among stakeholders and the aggregation of information—scholarly content, bibliographic metadata, and usage data—combine with the local and individual context to establish a new level of discovery and management of scholarly materials.

1 For a discussion of metasearching, see Tamar Sadeh, "The Challenge of Metasearching," New Library World 105, no. 1198/1199 (2004), doi: 10.1108/03074800410526721.
2 Herbert Van de Sompel and Oren Beit-Arie, "Open Linking in the Scholarly Information Environment Using the OpenURL Framework," D-Lib Magazine 7, no. 3 (2001), http://www.dlib.org/dlib/march01/vandesompel/03vandesompel.html.
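The context-sensitive linking mentioned above works by packaging a citation as an OpenURL and sending it to the library's link resolver, which then decides which services to offer that particular user. A minimal sketch in Python follows; the resolver address is a made-up placeholder, and the key/value fields follow the OpenURL 1.0 (Z39.88-2004) KEV journal format:

from urllib.parse import urlencode

# Hypothetical link-resolver address; each library runs its own resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

citation = {
    "url_ver": "Z39.88-2004",                       # OpenURL 1.0
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # journal-article metadata format
    "rft.atitle": "Open Linking in the Scholarly Information Environment",
    "rft.jtitle": "D-Lib Magazine",
    "rft.volume": "7",
    "rft.issue": "3",
    "rft.date": "2001",
}

link = RESOLVER_BASE + "?" + urlencode(citation)
print(link)  # the resolver returns services appropriate to this user's library

Because the resolver, not the content provider, interprets the citation, the library controls where its users end up; that is the essence of context-sensitive linking.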
2 A Transformation in Progress

There is no doubt that the changing behavior of library users is part of an overall transformation that the society, economics, politics, and culture of our era are undergoing. The expectations of today's library users differ from those of the past as the result of several factors, such as the immediacy of information and communication, the abundance of online activities in which people are typically involved, and the effect of the social networks with which a great majority of users are engaged.

Web search engines, primarily Google, have had a profound influence on the way in which people find information. Having started at the turn of the millennium as a means of reaching general information without mediators and without prior information literacy, Web search engines have shaped the way in which students seek scholarly information today. The ease of use, the immediacy, and the heterogeneous nature of information provided by Web search engines trigger expectations that library information systems were unable to match until recently.

In late 2005, an OCLC Online Computer Library Center survey of users' perceptions of library and information systems and services gave a clear picture of the changing user behavior.3 OCLC has continued to monitor the behavior of users, including college students.4,5,6 Already in 2005, library users had shifted from library services to Web search engines, online bookstores, and other Web-based services, such as blogs, online news, and e-mail, to satisfy their information needs without the help of the library. At that time, 89% of the undergraduate and graduate students that OCLC surveyed reported that they started their searches for information with Web search engines, whereas only 2% reported the library Web site as their starting point.7 The 2007 OCLC survey showed an increase in the usage of all Web services except one: the library Web site.8 The 2010 OCLC survey revealed changes that occurred with the emergence of social networks, Wikipedia, social sharing sites, ask-an-expert sites, and new communication channels such as Skype and Twitter.

3 Cathy De Rosa et al., Perceptions of Libraries and Information Resources: A Report to the OCLC Membership (Dublin, Ohio: OCLC Online Computer Library Center, 2005), http://www.oclc.org/reports/2005perceptions.htm.
4 Cathy De Rosa et al., College Students' Perceptions of Libraries and Information Resources: A Report to the OCLC Membership (Dublin, Ohio: OCLC Online Computer Library Center, 2006), http://www.oclc.org/reports/perceptionscollege.htm.
5 Cathy De Rosa et al., Sharing, Privacy and Trust in Our Networked World: A Report to the OCLC Membership (Dublin, Ohio: OCLC Online Computer Library Center, 2007), http://www.oclc.org/reports/sharing/default.htm.
6 Cathy De Rosa et al., Perceptions of Libraries, 2010: Context and Community (Dublin, Ohio: OCLC Online Computer Library Center, 2011), http://www.oclc.org/reports/2010perceptions/2010perceptions_all.pdf.
7 De Rosa et al., Perceptions of Libraries and Information Resources (2005).
8 De Rosa et al., Sharing, Privacy and Trust (2007).
Although search engines clearly dominate information seeking (93% of college students use Web search engines to find online content), Wikipedia has gained considerable recognition among the college student population (88% of the students reported that they use Wikipedia to find information). Furthermore, social networks have become pivotal in the exchange of information—92% of college students use such networks, and two-thirds of them log on daily. The library Web site, on the other hand, was not mentioned as an initial starting point for information seeking by any survey participants, although 57% of the students do use their library's site (a slight decline compared to 61% in 2005).9,10

As the result of a heavy reliance on social networks and social sharing sites, users have shifted from being consumers of so-called "objective" information systems—those that arrange a result list based on the degree of correlation between the query and the results, regardless of the specific user—to being the recipients of advice from other users, be they friends, other community members, or individuals whose path happened to cross that of the searcher. Although Web search engines do take a user's context into account to a certain degree, the user is likely to view a recommendation from a trusted person as more reliable than that of a Web search engine.

Until recently, the providers of scholarly information systems took the opposite stand: they strived to remain objective (if one discounts librarians' selection of which materials to offer). Even though many systems have by now applied relevance ranking to result lists, the use of an assemblage of many criteria such as number of citations, number of downloads, and recency for sorting results (as opposed to sorting alphabetically or by date of publication) is a topic of debate among librarians.

Nevertheless, users still seek information that is authoritative and has been summarized by someone else, such as on ask-an-expert sites (for example, WikiAnswers). According to the 2010 OCLC survey, the number of respondents who use such sites increased by 136% from 2005, and the frequency of use increased as well. Online librarian-question services have become slightly more popular (10% of the respondents to the 2010 survey reported that they use such services, versus 8% in 2005), but they are still less popular than ask-an-expert sites.

9 De Rosa et al., Perceptions of Libraries, 2010.
10 De Rosa et al., Perceptions of Libraries and Information Resources (2005).
On the other hand, college students attribute greater trustworthiness and accuracy to library information systems than they did five years earlier (43% of the 2010 respondents indicated that information from library sources is more trustworthy than information from search engines, as opposed to 31% in 2005).11,12

Commissioned by JISC and the British Library, the Centre for Information Behaviour and the Evaluation of Research (CIBER) at University College London (UCL) undertook a study aiming "to identify how the specialist researchers of the future, currently in their school or pre-school years, are likely to access and interact with digital resources in five to ten years' time."13 The investigation identifies some of the information-seeking behavior patterns that scholarly information systems will need to address: "in general terms, this new form of information seeking behaviour [digital information-seeking behavior] can be characterised as being horizontal, bouncing, checking and viewing in nature. Users are promiscuous, diverse and volatile and it is clear that these behaviours represent a serious challenge for traditional information providers, nurtured in a hardcopy paradigm and, in many respects, still tied to it."14 The authors conclude that the information literacy of young people has not improved despite their exposure to technological tools from an early age. Furthermore, young people do not invest time in understanding their information need, developing search strategies, or evaluating the information that they find.

11 De Rosa et al., Perceptions of Libraries, 2010.
12 De Rosa et al., Perceptions of Libraries and Information Resources (2005).
13 Centre for Information Behaviour and the Evaluation of Research (CIBER), Information behaviour of the researcher of the future (London: CIBER, 2008), http://www.ucl.ac.uk/slais/research/ciber/downloads/ggexecutive.pdf.
14 CIBER, Information behaviour, 9.
Much of the available research addresses the information-seeking behavior of academic library users, primarily undergraduates and graduate students. When the behavior of only graduate students and faculty members is examined, the findings are slightly different:15,16,17,18,19 although the great majority of academic users employ Web search engines, graduate students and faculty members tend to use discipline-specific information systems to satisfy some, or even most, of their information needs. However, it is clear that even in research communities, users are drawn to the simplicity, comprehensiveness, and ease of use of Web search engines. The CIBER study suggests that "it would be a mistake to believe that it is only students' information seeking that has been fundamentally shaped by massive digital choice, unbelievable (24/7) access to scholarly material, disintermediation, and hugely powerful and influential search engines. The same has happened to professors, lecturers and practitioners. Everyone exhibits a bouncing/flicking behaviour, which sees them searching horizontally rather than vertically. Power browsing and viewing is the norm for all."20

Because most scholarly materials are discoverable through multiple interfaces, users may well be able to obtain the same materials through Web search engines and academic systems, although Web search engines provide an easier and faster route to these materials. However, Web search engines come with their own drawbacks, particularly the limited search and filtering options available to users and a search scope that comprises a universe of materials of unequal quality. Furthermore, with the growing amount of available data, even Web search engines have lost some of their attraction, as revealed by the 2010 OCLC survey: "only" 83% of college students begin their searches using search engines, as opposed to 89% in 2005.21,22 In addition, library-driven services such as bibliographic tools and citation analyses are not available through Web search engines. Hence, most searchers rely on more than one type of information system and typically use both scholarly information systems and Web search engines.

In a report commissioned by the Bibliographic Services Task Force of the University of California, the authors conclude that their "users expect simplicity and immediate reward and Amazon, Google, and iTunes are the standards against which we are judged. Our current systems pale beside them."23 The challenge for libraries, therefore, is to determine how they wish to portray themselves to their users and how they can best serve the institutions to which they belong.

15 Research Information Network, Researchers and discovery services: Behaviour, perceptions and needs (Research Information Network, 2006), http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/researchers-and-discovery-services-behaviour-perc.
16 Bradley M. Hemminger et al., "Information Seeking Behavior of Academic Scientists," Journal of the American Society for Information Science and Technology 58, no. 14 (2007), doi: 10.1002/asi.20686.
17 Anne Gentil-Beccot et al., "Information Resources in High-Energy Physics: Surveying the Present Landscape and Charting the Future Course," arXiv:0804.2701v2 [cs.DL] (arXiv, 2008), doi: 10.1002/asi.20944.
18 H. R. Jamali and D. Nicholas, "Information-Seeking Behaviour of Physicists and Astronomers," Aslib Proceedings 60, no. 5 (2008), doi: 10.1108/00012530810908184. Version used for this study: http://eprints.rclis.org/16127/1/JAMALIi-FINAL-preprint.pdf.
19 Laura L. Haines et al., "Information-Seeking Behavior of Basic Science Researchers: Implications for Library Services," Journal of the Medical Library Association 98, no. 1 (2010), doi: 10.3163/1536-5050.98.1.019.
20 CIBER, Information behaviour, 8.
21 De Rosa et al., Perceptions of Libraries, 2010.
22 De Rosa et al., Perceptions of Libraries and Information Resources (2005).
Looking for ways to retain their users and maintain their hegemony as information providers, libraries have started considering new approaches—user-centric solutions that replace their traditional online user interfaces. However, in order to address users' expectations regarding the interface, the breadth and relevance of services, and the comprehensiveness of the body of information available through the system, libraries had to undergo a major conceptual shift.24

3 The Challenges for Library Management Systems

While worrying about the drift of users to other information systems, librarians must also meet the challenge of efficiently managing their assets. Because the various systems in the library were developed over time to support specific needs as they arose, the overall library environment became complex and workflows became cumbersome. Finally, the economic crisis of the last decade, coupled with the resulting budget cuts, led to the realization that to retain their role as the providers of quality information, libraries would have to operate in a different way.

Recent reports indicate that libraries are undergoing considerable changes.25 New trends include the following:

• The nature of collection development is changing and is becoming much more driven by patrons' requests. Shifting to e-materials, libraries are expanding their collections and dismantling the traditional distinction between local and remote content—be it owned by other libraries, under subscription, available on demand, or open access. The types of materials are becoming more varied, and libraries are likely to be involved in dealing with new material types, such as data sets.
• Because of continuous budget cuts all over the world, libraries are becoming more vulnerable and have to demonstrate measurable value for the institutions that they serve. To stay relevant, libraries are trying to engage in new tasks traditionally not within the library domain, such as teaching and conducting research. Topics such as scholarly communication and intellectual property are increasing in relevance with the involvement of libraries in the publishing of locally created materials, such as theses, dissertations, and course materials. Libraries are also active in digitization projects and are likely to become involved in the preservation of digital materials.

• Driven by the need to increase efficiency and adopt cost-effective processes, libraries are seeking assessment tools and analytics that can provide more insight into library activities and help library executives in their decision making. True to their long tradition of collaboration, which is now supported by technological advances, libraries are collaborating more with other groups in their institution and beyond—primarily suppliers and other libraries.

• The rapid pace of technological change and the ongoing library investment in hardware, operations, and skilled staff are motivating library decision makers to seek an infrastructure that can ease the burden of managing information technology or exempt libraries altogether from those tasks. Cloud-computing environments, now offered by many vendors, are already gaining momentum and are likely to continue attracting libraries.

24 For a discussion of user preferences, see Tamar Sadeh, "Time for a Change: New Approaches for a New Generation of Library Users," New Library World 108, no. 7/8 (2007), doi: 10.1108/03074800710763608; "User-Centric Solutions for Scholarly Research in the Library," LIBER Quarterly 17, no. 3/4 (2007), http://liber.library.uu.nl/publish/issues/2007-3_4/index.html?000215; "A Model of Scientists' Information Seeking and a User-Interface Design" (PhD thesis, City University London, 2010).
25 ACRL Research Planning and Review Committee, "2010 top ten trends in academic libraries," College & Research Libraries News 71, no. 6 (June 2010), http://crln.acrl.org/content/71/6/286.full; Association of College and Research Libraries (researched by Megan Oakleaf), Value of Academic Libraries: A Comprehensive Research Review and Report (Chicago: Association of College and Research Libraries, 2010), http://www.acrl.ala.org/value/; Council on Library and Information Resources, No Brief Candle: Reconceiving Research Libraries for the 21st Century (Washington, D.C.: Council on Library and Information Resources, 2008), http://www.clir.org/pubs/reports/pub142/pub142.pdf; L. Johnson, A. Levine, and R. Smith, The 2009 Horizon Report (Austin, Texas: The New Media Consortium, 2009), http://wp.nmc.org/horizon2009/; L. Johnson, A. Levine, R. Smith, and S. Stone, The 2010 Horizon Report (Austin, Texas: The New Media Consortium, 2010), http://wp.nmc.org/horizon2010/; Matthew P. Long and Roger C. Schonfeld, Ithaka S+R Library Survey 2010: Insights from U.S. Academic Library Directors (ITHAKA, 2010), http://www.ithaka.org/ithaka-s-r/research/ithaka-s-r-library-survey-2010/insights-from-us-academic-library-directors.pdf; James Michalko, Constance Malpas, and Arnold Arcolio, Research Libraries, Risk and Systemic Change (Dublin, Ohio: OCLC Research, 2010), http://www.oclc.org/research/publications/library/2010/2010-03.pdf; David J. Staley and Kara J. Malenfant, Futures Thinking for Academic Librarians: Higher Education in 2025 (Association of College and Research Libraries, 2010), http://www.acrl.org/ala/mgrps/divs/acrl/issues/value/futures2025.pdf.
While many such changes trigger discussions about the role of the library, it is clear that no matter what new missions libraries undertake, existing software solutions fall short of providing optimal support for all current and future library activities because their architecture is neither flexible nor scalable enough. New software solutions addressing the new needs are likely to take the lead.

4 The Introduction of Decoupled Architecture

One of the main challenges in adapting library system environments to better serve end users is the systems' focus on library staff workflows and the tight bond between the management of the library assets and the way in which the library makes these assets available to end users. The librarian-centric focus originates from the fact that librarians perform most of the tasks in the library; furthermore, library vendors sell to librarians, not to end users.

Two main reasons make it difficult for information systems to both fulfill the expectations of users and accommodate librarians' need for optimized systems. First, to date, each system in the library has tended to deal with only a single type of material—print, electronic, or digital—whereas end users expect to find everything they need in one place, regardless of format; given the architecture of the existing systems, it is not likely that one such system can be extended to cover all materials. This "silo" approach to managing materials is also challenging for librarians who attempt to understand what is going on across the library's collections. Second, the traditional model of library workflows—each of which was developed for a single type of material—coupled with a lack of efficient technical channels for collaboration among libraries and other stakeholders does not lend itself to an efficient, cost-effective infrastructure.

In 2006, Ex Libris, a provider of library automation solutions, introduced "decoupled architecture" as the cornerstone of its future offering. Other vendors, such as OCLC and Innovative Interfaces, have demonstrated the same vision. This architecture, which separates the user experience from the management of the library collections and the library services, is based on data exchange between the discovery layer and the management layer of the library systems; each such layer is designed around the needs of its users—end users and librarians, respectively. Because retaining their target audience was an urgent task for most libraries, the first new systems based on decoupled architecture were discovery systems. With such systems in place, the libraries' current solutions, such as integrated library systems and digital-asset management systems that were tailored to the needs of the librarians, continued to fulfill their administrative functions.

Embraced by all stakeholders, the decoupled architecture model has been implemented since 2007 in discovery systems developed by library system vendors, other software vendors, information providers, and open-source communities. While library system vendors have been addressing both components of decoupled architecture, others have been offering only the discovery layer.

Decoupled architecture provides libraries with several benefits:

• Flexibility: Because it is not tied to the library's management systems, the user experience is free of constraints related to specific data structures and administrative workflows and hence can be designed and tailored to the needs of end users.

• Unification: Data coming from heterogeneous, harvestable information resources—both local and remote—form one source of information for end users (see the harvesting sketch after this list). The fact that data elements originate from separate sources has become irrelevant to the discovery process.
• Unification: Data coming from heterogeneous, harvestable information resources—both local and remote—form one source of information for end users. The fact that data elements originate from separate sources has become irrelevant to the discovery process. 12 T. Sadeh Discovery and management of (...) • Data enhancement: Once it harvests the data from the administrative systems, the new system easily enhances that data with additional elements from other resources, typically from third parties. Such data elements include images of book covers, tables of contents, descriptions, and reviews. • Integration: Adherence to the newest standards of interoperability enables a library to easily integrate its new interface into an existing environment and the various systems in the user’s workspace, such as course management systems, social network sites, and reference management tools. Furthermore, discovery interfaces are now available for mobile devices and hence have become more accessible and practical. • Analytics: Providing access to both local and remote collections through a single interface enables a library to easily gather information about user behavior and analyze the information for purpose of development planning and the adjustment of library collection and services. • Step-by-step evolution: With a system that deploys decoupled architecture, libraries have started offering a new, improved user experience without disturbing the behind-the-scenes administration. Similarly, libraries can add, modify, and replace components of their administrative systems without affecting the user experience. Decoupled architecture lays the groundwork for a robust discovery service for end users, on the one hand, and a unified resource-management system for librarians, on the other hand. The first service already exists: end-user-centric discovery solutions, including Primo from Ex Libris, appeared on the market in 2007 and have been embraced by national, academic, and public libraries everywhere (Primo alone has been selected by more than 800 institutions worldwide). Evidence shows that, indeed, the number of searches in library-based systems increases dramatically when the library introduces a solution whose user experience is tailored to the library community’s needs. 26 26 According to statistics cited by New York University in an e-mail exchange to the NGC4LIB listserv in October 2009, the average number of search sessions after the implementation of Primo almost tripled compared to the average number before the implementation, and the average session time decreased from 15 minutes to six minutes, indicating that users are finding what they want much faster. At a user group meeting in early 2010, Yonsei University (in Seoul, South Korea) reported a similar increase in the number of search sessions after the implementation of Primo. In addition, the Primo search session time decreased to 10%, on average, of the previous system’s session time for a particular set of queries. The difference can be attributed to the display of all the titles on the first page in Primo as opposed to on the second to fifteenth page in the previous system. 13 ProInflow : časopis pro informační vědy 2 /2011 5 Current Discovery Challenges No longer considered new, discovery systems resemble each other in many respects. All such systems are fast and offer heterogeneous scholarly materials, a modern user interface, one search box for searching all types of materials, faceted browsing, and relevance-ranked result lists. 
Decoupled architecture lays the groundwork for a robust discovery service for end users, on the one hand, and a unified resource-management system for librarians, on the other hand. The first service already exists: end-user-centric discovery solutions, including Primo from Ex Libris, appeared on the market in 2007 and have been embraced by national, academic, and public libraries everywhere (Primo alone has been selected by more than 800 institutions worldwide). Evidence shows that, indeed, the number of searches in library-based systems increases dramatically when the library introduces a solution whose user experience is tailored to the library community's needs.26

5 Current Discovery Challenges

No longer considered new, discovery systems resemble each other in many respects. All such systems are fast and offer heterogeneous scholarly materials, a modern user interface, one search box for searching all types of materials, faceted browsing, and relevance-ranked result lists. However, one significant difference between discovery systems lies in the way in which they deal with global offerings in a local context. This difference is manifested in the breadth and quality of the materials offered, the customizability of the search scope, the ease of integration with local library services, the sophistication and adaptability of relevance ranking, the inclusion of recommendations as search aids, and the provision of functionality for local branding.

There is no doubt that the content available for scholarly discovery systems sets the point of departure for the discovery process. On the one hand, the content should be as far-reaching as possible and include all types of materials in every possible discipline. On the other hand, if discovery systems provide only relevant, high-quality scholarly results, academic users are more likely to turn to them (as opposed to Web search engines). Because collections are no longer bound to space and location, the whole spectrum of scholarly information is open for discovery, and the boundaries set by institutional physical or digital holdings are no longer relevant. Furthermore, in today's academic environment, information needs can rarely be satisfied by local resources alone. However, the integration between global searching and the local physical holdings is crucial in many disciplines and for specific user communities.

The biggest challenge for discovery systems is how to provide users with the most relevant items in the immense landscape of available content. Thus, new tools have been added to such systems to help users find specific items that they are looking for or items that will satisfy a broader search query. For example, faceted navigation helps users quickly refine their result list and focus on subsets of it,27,28 and a display of recommendations based on other users' selections draws one's attention to items similar to a given item. However, familiarity with Google and other search engines leads users of discovery systems to scan only the first results; hence, relevant items can easily remain unnoticed if they are not displayed on the first page. Relevance ranking, whose purpose is to highlight what the system deems the most appropriate materials for the particular query, has become a major factor in satisfying user needs, together with immediate delivery (or, in the case of physical items, immediate OPAC-based services), and in increasing the value of the library for the user and for the institution.

26 According to statistics cited by New York University in an e-mail exchange on the NGC4LIB listserv in October 2009, the average number of search sessions after the implementation of Primo almost tripled compared to the average number before the implementation, and the average session time decreased from 15 minutes to six minutes, indicating that users are finding what they want much faster. At a user group meeting in early 2010, Yonsei University (in Seoul, South Korea) reported a similar increase in the number of search sessions after the implementation of Primo. In addition, the Primo search session time decreased to 10%, on average, of the previous system's session time for a particular set of queries. The difference can be attributed to the display of all the titles on the first page in Primo, as opposed to on the second to fifteenth pages in the previous system.
27 Tamar Sadeh, "Multiple Dimensions of Search Results" (paper presented at the Analogous Spaces Interdisciplinary Conference, Ghent University, Belgium, May 15-17, 2008).
28 Marti A. Hearst, "Clustering Versus Faceted Categories for Information Exploration," Communications of the ACM 49, no. 4 (2006), http://people.ischool.berkeley.edu/~hearst/papers/cacm06.pdf.
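How relevance ranking might blend the kinds of signals mentioned earlier in the paper (textual match, citations, downloads, recency, and the user's context) can be illustrated with a deliberately simplified sketch in Python. The weights, damping constants, and discipline boost are illustrative assumptions, not the formula of any actual discovery system:

import math
from datetime import date

# Illustrative weights only; real systems use far more elaborate, tuned models.
WEIGHTS = {"text_match": 0.5, "citations": 0.2, "downloads": 0.2, "recency": 0.1}

def relevance(text_match, citations, downloads, pub_year,
              user_discipline=None, item_discipline=None):
    # Dampen raw counts so a handful of very popular items cannot dominate.
    citation_signal = math.log1p(citations) / 10.0
    download_signal = math.log1p(downloads) / 10.0
    # Newer items score higher; the signal decays over roughly a decade.
    age = max(0, date.today().year - pub_year)
    recency_signal = math.exp(-age / 10.0)
    score = (WEIGHTS["text_match"] * text_match
             + WEIGHTS["citations"] * citation_signal
             + WEIGHTS["downloads"] * download_signal
             + WEIGHTS["recency"] * recency_signal)
    # Local context: boost items from the searcher's own discipline.
    if user_discipline and user_discipline == item_discipline:
        score *= 1.2
    return score

The logarithmic damping keeps a few heavily cited items from drowning out everything else, and the final boost is one simple way of folding the user's context into an otherwise global score.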
6 What Does the Library Offer the User?

If in the past a library was judged by the number of volumes it held, today scholarly information is broad and borderless. However, one of the main roles of librarians—the selection of appropriate resources—is no less applicable today than in the past. Libraries need to differentiate themselves from Web search engines by ensuring the quality and breadth of the local and remote information that is provided, by making the information easy to use through preprocessing (such as the detection and grouping of duplicates), and by integrating scholarly search functionality into the user's environment (by supporting, for example, institutional single sign-on and embedding search boxes in various institutional and external systems). Given the huge quantity of available data, libraries can reduce information overload by setting an initial search scope that is more appropriate to their communities and by using techniques such as the grouping of similar materials.

Although discovery systems match Web search engines when it comes to ease of use and speed, speed is not measured only by the amount of time that elapses until a result list is displayed. Much more important is the amount of time that an information system takes to satisfy an information need and provide the user with the desired outcome. In this respect, the control that libraries have over the search scope and the relevance-ranking algorithms, the deployment of services such as recommendations, and the immediate delivery of the materials significantly decrease the time that users take to find items of interest and amplify the value of the library's services for its users.

When defining the search scope, libraries should be addressing the "long tail" of information resources that are of utmost importance to some of their users. While it is likely that almost all information needs of undergraduates can be satisfied by the most popular information resources, many researchers require more specialized information that might dictate the adoption of various searching technologies. A discovery system that is based on one technology and is not flexible enough to provide access to information resources of all types cannot become the ultimate search entry point for many users.

Depending on a user's information need—an exploratory search for items on a particular topic or a search for a specific item—an information system might have to use more than one method of selecting the most appropriate results. When a user is looking for a specific item, the system should display that item at the top of the first page of results. However, an exploratory search is more complex, because the broader the search is, the greater the quantity of relevant results.
While the usefulness of the information available through a system lies in the aggregation of global data —both the content itself and measures that are associated with it, such as the journal impact factor and usage statistics—the user is always an individual who is part of a local and wider community and has a specific information need at a given moment. “Awareness” of the user context, such as the person’s discipline, can help an information system adjust the relevance ranking so that items related to the user’s discipline are ranked higher. Similarly, the academic level of the user may indicate the degree of applicability of items that are more general. Usage data has proven to be a most valuable resource for information systems and management systems in the scholarly arena. 30 From such data, a system can generate metrics for evaluating the significance of items and for associating items with each other; then the system can feed the results back to the user through recommendations (such as those provided by the bX article recommender service) and relevance ranking and can aid librarians in collection-development decisions. However, the gathering of usage data is most meaningful when the data are aggregated across institutions rather than related to the few individuals that happen to be at one institution. Recommendations from the bX service expand the search results to include items that are not retrieved by the query yet are clearly relevant. Such recommendations are highly valuable for cross-disciplinary research and for research in an area with which the user is less familiar. A user who does not know the applicable 29 According to a survey by University of Minnesota Libraries, over 70% of undergraduate searches are exploratory; as researchers become more knowledgeable in their area, they tend to search more for known items. Discoverability: Phase 2 Final Report (2010), http://conservancy.umn.edu/bitstream/99734/3/DiscoverabilityPhase2ReportFull.pdf 30 Johan Bollen and Herbert Van de Sompel, “An Architecture for the Aggregation and Analysis of Scholarly Usage Data (paper presented at JCDL ’06, Chapel Hill, North Carolina, June 11–15, 2006, ACM 1-59593- 354-9/06/0006), http://public.lanl.gov/herbertv/papers/jcdl06_accepted_version.pdf. 16 http://public.lanl.gov/herbertv/papers/jcdl06_accepted_version.pdf http://conservancy.umn.edu/bitstream/99734/3/DiscoverabilityPhase2ReportFull.pdf T. Sadeh Discovery and management of (...) terminology is likely to miss relevant results; however, recommendations that are displayed along with an item on a result list aid such users by highlighting materials that are relevant even though they may not share the same keywords. The system’s evaluation of the user’s context serves the entire information-seeking process, not just searches. Services such as those related to evaluating materials, accessing them, integrating them into the user’s space (for example, enabling the user to save the citation or bookmark the item), and accessing other relevant materials should be part of the institutional context. In addition, because more and more users carry a mobile device, they can identify and make use of services that are relevant in their current location. The user’s context brings up issues of privacy, which is of great concern to libraries; however, gaining more information about a user is the key to tailoring the system’s behavior to that user, just as in human interaction. 
7 New-Generation Library Management Systems

While the new-generation discovery systems, available since 2007, were the first manifestations of decoupled architecture, corresponding library management systems started to emerge toward the end of the last decade. No such system is in full-scale production yet, although two systems—Alma from Ex Libris31 and Web-scale Management Services (WMS) from OCLC32—have been made available to selected libraries for specific functionalities.

Designed from scratch rather than as an extension of existing products, the new-generation management systems have the privilege of presenting an optimal infrastructure that is likely to serve libraries for the next decade and more. Ideally, such systems address the following aspects:

• The unification of data structures and the consolidation of processes, regardless of the type of material, its location, and its type of ownership. Such unification and consolidation open up opportunities to build more efficient library services by creating optimized workflows that are supported by comprehensive information gathering. Furthermore, unified data structures enable business processes to be automated and facilitate the sharing of data among libraries.

• Collaboration among stakeholders—other libraries, parent institutions, suppliers, and the scholarly community—to optimize processes and collections. With such collaboration, community efforts can be leveraged to reduce library-specific investment in generating or obtaining information (for example, metadata) that is already available to the community, thus supporting streamlined processes such as acquisition on demand. Furthermore, collaboration can be extended to activities such as joint collection building, with physical items shared among institutions.

• Network-level architecture based on cloud services. This type of architecture increases the return on investment and reduces the total cost of ownership of the library's infrastructure while providing easy access to shared data and services.

• The leveraging of intelligence gathered by the system to generate analytics that support both informed decisions regarding all library activities (primarily collection development) and the tailoring of a personal context for users to improve the discovery and delivery process.

• Integration of library services within the institutional environment to facilitate strategic library support of the institution's mission statement and of institutional activities such as teaching and research, thus demonstrating the library's value within its environment.

• A core system that supports the building of software extensions by libraries. Such extensions enable libraries to add services and local adaptations.

The handling of metadata demonstrates, once again, the way in which aggregation provides a springboard to greater efficiency. While the optimization of metadata handling is tightly bound to the capability of a system to leverage large aggregates of metadata shared by libraries, it is crucial that the system operate in the context of the specific library and balance the global sharing with the local library's characteristics and needs.

31 For information about Alma, see http://www.exlibrisgroup.com/category/URM_ResourceCenter.
32 For information about Web-scale Management Services, see http://www.oclc.org/webscale.
The individuality of a library—primarily, its unique collections—should be combined with the global metadata repository shared by many libraries to achieve optimal flexibility while supporting the efficiency of processes.

8 Conclusions

The past decade has seen a fundamental change in the way in which libraries have been providing services to their users. During that time, libraries expanded their services to offer a much greater volume and variety of materials through multiple systems. Yet, because of global changes outside the boundaries of libraries, users drifted to other spaces, and libraries found themselves looking for ways to remain relevant.

Decoupled architecture, through which discovery systems support the needs and expectations of end users while administrative systems serve librarians, has been embraced by industry stakeholders in recent years. Hundreds of libraries have adopted discovery systems that were developed by library software vendors, such as Ex Libris and Innovative Interfaces; information providers, such as EBSCO and Serials Solutions (a ProQuest company); the open-source community (from which comes the VuFind portal, for example); and other providers, such as Endeca. The new-generation systems based on this architecture—both discovery systems and management systems—leverage technological advances and the aggregation of content, bibliographic data, and usage data to deliver library services on a new scale.

While enabling libraries to expand their offerings to their users, on the one hand, and optimizing administrative processes, on the other hand, software systems need to help libraries maintain their individuality and set the appropriate context for their users. By doing so, libraries can better serve their users and add greater value to their institutions.

References

1. ACRL Research Planning and Review Committee. "2010 top ten trends in academic libraries." College & Research Libraries News 71, no. 6 (June 2010): 286-292. http://crln.acrl.org/content/71/6/286.full.

2. Association of College and Research Libraries (researched by Megan Oakleaf). Value of Academic Libraries: A Comprehensive Research Review and Report. Chicago: Association of College and Research Libraries, 2010. http://www.acrl.ala.org/value/.

3. Bollen, Johan, and Herbert Van de Sompel. "An Architecture for the Aggregation and Analysis of Scholarly Usage Data." Paper presented at JCDL '06, Chapel Hill, North Carolina, June 11–15, 2006. ACM 1-59593-354-9/06/0006. http://public.lanl.gov/herbertv/papers/jcdl06_accepted_version.pdf.

4. Centre for Information Behaviour and the Evaluation of Research (CIBER). Information behaviour of the researcher of the future. London: CIBER, 2008. http://www.ucl.ac.uk/slais/research/ciber/downloads/ggexecutive.pdf.

5. Council on Library and Information Resources. No Brief Candle: Reconceiving Research Libraries for the 21st Century. Washington, D.C.: Council on Library and Information Resources, 2008. http://www.clir.org/pubs/reports/pub142/pub142.pdf.

6. De Rosa, Cathy, Joanne Cantrell, Diane Cellentani, Janet Hawk, Lillie Jenkins, and Alane Wilson.
Perceptions of Libraries and Information Resources: A Report to the OCLC Membership. Dublin, Ohio: OCLC Online Computer Library Center, 2005. http://www.oclc.org/reports/2005perceptions.htm.

7. De Rosa, Cathy, Joanne Cantrell, Janet Hawk, and Alane Wilson. College Students' Perceptions of Libraries and Information Resources: A Report to the OCLC Membership. Dublin, Ohio: OCLC Online Computer Library Center, 2006. http://www.oclc.org/reports/perceptionscollege.htm.

8. De Rosa, Cathy, Joanne Cantrell, Andy Havens, Janet Hawk, and Lillie Jenkins. Sharing, Privacy and Trust in Our Networked World: A Report to the OCLC Membership. Dublin, Ohio: OCLC Online Computer Library Center, 2007. http://www.oclc.org/reports/sharing/default.htm.

9. De Rosa, Cathy, Joanne Cantrell, Matthew Carlson, Peggy Gallagher, Janet Hawk, and Charlotte Sturtz. Perceptions of Libraries, 2010: Context and Community. Dublin, Ohio: OCLC Online Computer Library Center, 2011. http://www.oclc.org/reports/2010perceptions/2010perceptions_all.pdf.

10. Gentil-Beccot, Anne, Salvatore Mele, Annette Holtkamp, Heath B. O'Connell, and Travis C. Brooks. "Information Resources in High-Energy Physics: Surveying the Present Landscape and Charting the Future Course." arXiv:0804.2701v2 [cs.DL]. arXiv, 2008. doi: 10.1002/asi.20944. (Journal reference: Journal of the American Society for Information Science and Technology 60, no. 1 (2009): 150-160.)

11. Haines, Laura L., J. Light, D. O'Malley, and F. A. Delwiche. "Information-Seeking Behavior of Basic Science Researchers: Implications for Library Services." Journal of the Medical Library Association 98, no. 1 (2010): 73–81. doi: 10.3163/1536-5050.98.1.019.

12. Hearst, Marti A. "Clustering Versus Faceted Categories for Information Exploration." Communications of the ACM 49, no. 4 (2006): 59-61. http://people.ischool.berkeley.edu/~hearst/papers/cacm06.pdf.

13. Hemminger, Bradley M., Dihui Lu, K. T. L. Vaughan, and Stephanie J. Adams. "Information Seeking Behavior of Academic Scientists." Journal of the American Society for Information Science and Technology 58, no. 14 (2007): 2205–2225. doi: 10.1002/asi.20686.

14. Jamali, H. R., and D. Nicholas. "Information-Seeking Behaviour of Physicists and Astronomers." Aslib Proceedings 60, no. 5 (2008): 444-462. doi: 10.1108/00012530810908184. Version used for this study: http://eprints.rclis.org/16127/1/JAMALIi-FINAL-preprint.pdf.

15. Johnson, L., A. Levine, and R. Smith. The 2009 Horizon Report. Austin, Texas: The New Media Consortium, 2009. http://wp.nmc.org/horizon2009/.

16. Johnson, L., A. Levine, R. Smith, and S. Stone. The 2010 Horizon Report. Austin, Texas: The New Media Consortium, 2010. http://wp.nmc.org/horizon2010/.

17. Long, Matthew P., and Roger C. Schonfeld. Ithaka S+R Library Survey 2010: Insights from U.S. Academic Library Directors. ITHAKA, 2010. http://www.ithaka.org/ithaka-s-r/research/ithaka-s-r-library-survey-2010/insights-from-us-academic-library-directors.pdf.

18. Michalko, James, Constance Malpas, and Arnold Arcolio.
Research Libraries, Risk and Systemic Change. Dublin, Ohio: OCLC Research, 2010. http://www.oclc.org/research/publications/library/2010/2010-03.pdf.

19. Research Information Network. Researchers and discovery services: Behaviour, perceptions and needs. Research Information Network, 2006. http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/researchers-and-discovery-services-behaviour-perc.

20. Sadeh, Tamar. "The Challenge of Metasearching." New Library World 105, no. 1198/1199 (2004): 104-112. doi: 10.1108/03074800410526721.

21. Sadeh, Tamar. "A Model of Scientists' Information Seeking and a User-Interface Design." PhD thesis, City University London, 2010.

22. Sadeh, Tamar. "Multiple Dimensions of Search Results." Paper presented at the Analogous Spaces Interdisciplinary Conference, Ghent University, Belgium, May 15-17, 2008.

23. Sadeh, Tamar. "Time for a Change: New Approaches for a New Generation of Library Users." New Library World 108, no. 7/8 (2007): 307-316. doi: 10.1108/03074800710763608.

24. Sadeh, Tamar. "User-Centric Solutions for Scholarly Research in the Library." LIBER Quarterly 17, no. 3/4 (2007). http://liber.library.uu.nl/publish/issues/2007-3_4/index.html?000215.

25. Staley, David J., and Kara J. Malenfant. Futures Thinking for Academic Librarians: Higher Education in 2025. Association of College and Research Libraries, 2010. http://www.acrl.org/ala/mgrps/divs/acrl/issues/value/futures2025.pdf.

26. University of California Libraries Bibliographic Services Task Force. Rethinking How We Provide Bibliographic Services for the University of California. University of California, 2005. http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf.

27. University of Minnesota Libraries. Discoverability: Phase 2 Final Report. University of Minnesota, 2010. http://conservancy.umn.edu/bitstream/99734/3/DiscoverabilityPhase2ReportFull.pdf.

28. Van de Sompel, Herbert, and Oren Beit-Arie. "Open Linking in the Scholarly Information Environment Using the OpenURL Framework." D-Lib Magazine 7, no. 3 (2001). http://www.dlib.org/dlib/march01/vandesompel/03vandesompel.html.
LIBER Webinar: Generating Metadata with Artificial Intelligence

libereurope.eu, CC BY

HOST: Jeannette Frey, LIBER President; Director, Bibliothèque Cantonale et Universitaire (BCU) Lausanne, Jeannette.frey@bcu.unil.ch

SPEAKER: Martijn Kleppe, Head of Research, National Library of the Netherlands (KB), Martijn.Kleppe@kb.nl

NOTES
○ The webinar is being recorded.
○ Slides and a recording will be shared by email after the webinar.
○ Questions? Put them in the chat box.
○ 10-15 minutes of discussion will take place following the presentations.

Generating metadata with AI: Experience of the KB, National Library of the Netherlands
Martijn Kleppe – Head of Research, Martijn.Kleppe@kb.nl | @martijnkleppe | www.kb.nl/martijnkleppe

See https://www.kb.nl/en/news/2019/kb-explores-artificial-intelligence-to-generate-metadata and: Kleppe, M., Veldhoen, S., Waal-Gentenaar, M., Oudsten, B. den, & Haagsma, D. (2019). Exploration possibilities Automated Generation of Metadata. http://doi.org/10.5281/zenodo.3375192 (with Sara Veldhoen, Meta van der Waal-Gentenaar, Dorien Haagsma, and Brigitte den Oudsten).

Outline: I. Introduction · II. Set-up experiment · III. Lessons learned · IV. Next steps

I. INTRODUCTION

About me
• Research Department at KB, National Library of the Netherlands (18 FTE)
• Topics: Digital Preservation, Public Library Research, Copyright, Data Science
• KB Research Agenda 2018-2022 (https://www.kb.nl/en/organisation/research-expertise/research-agenda-2018-2022; https://doi.org/10.5281/zenodo.1254226; publications at https://zenodo.org/communities/kbnl/search?page=1&size=20)

Introduction – KB Research Agenda
• 5 research themes: Information Society, Publications, Access & Sharing, Customers, Impact
• 8 research groups with KB colleagues from the whole organisation
• Short term: proof of concept, internships, researcher-in-residence, workshops
• Long term: collaborate with partners: academic, libraries, industry
• 1 research group on (semi-)automated metadata
Introduction – Research Group
• Literature review: media sector, heritage institutes, libraries
• Site visits
• Which part of the process do we focus on?
• ICT with Industry Workshop

Introduction – Metadata process at KB
[Diagram slides showing the metadata process at the KB; no further text on the slides.]

Introduction – ICT with Industry Workshop
• Dutch Research Council (NWO)
• Formulate use-case & get selected
• Small funding required (1.5K EUR)
• Full week
• 13 participants
• Workspace & hotel: Lorentz Center Leiden
(https://www.lorentzcenter.nl/lc/web/2019/1061/info.php3?wsid=1061&venue=Oort)

II. SET-UP EXPERIMENT

[Diagram slides: stepwise build-up of the KB workflow, from researcher via a search interface to a repository. Physical publications with manual annotation of keywords since 1789; full-text digital publications in a digital repository since 2003; a combined physical & digital repository since 2019.]

Set up - Research question
Research question: How can we automatically label dissertations with relevant keywords from the Brinkman thesaurus?
[Diagram: dissertations mapped to Brinkman topics via mapping + metadata.]

Set up – Data & Thesaurus
• Data – Dissertations: full text and metadata via 6 university libraries
• Thesaurus Brinkman ("Brinkeys"): 15K keywords, since 1885
• Challenge: map dissertations of the university libraries to titles in the KB Catalog

In the ideal world:
• Every thesis has an ISBN
• Every author has an ORCID
• Every thesis is in Dutch (or English)
• A title is always written consistently
• Author names are written consistently
• All text is in UTF-8
• Every university uses the same keywords consistently

Set up – Approaches
• Naive baselines:
  • Lexical overlap between titles and Brinkeys
  • Lexical overlap between the universities' keywords and Brinkeys
• Methods:
  • Naive Bayes: a simple machine learning algorithm that predicts a Brinkey on the basis of the words that appear in the title and/or a summary
  • Word embeddings: neural networks that place the meaning of words in a continuous virtual "vector space"
  • fastText
• Annif (Finnish National Library): use own thesaurus, open source, combination of techniques (http://annif.org/)
• Ariadne (OCLC Research): trained on a lot of data, scores very well, not open source (https://www.oclc.org/research/themes/data-science/ariadne.html)

Set up – Results
Focus on recall: if the system outputs a list of twenty possible Brinkeys, are the correct Brinkeys according to our thesaurus among them?

III. LESSONS LEARNED
• Data, data, data: quality, amount
• Do not underestimate preprocessing
• How to keep up with researchers that go beyond the state of the art?
• "The human perspective, expertise and skill will remain necessary for guaranteeing the quality that we as the KB, National Library of the Netherlands represent"
• Results still vague for cataloguers at KB

IV. NEXT STEPS

[Diagram slides: the ingest workflow (researcher, search interface, physical & digital repository, manual annotation of keywords) annotated with two follow-ups.]
Follow-up 1: an interface that suggests keywords to annotators (http://lab.kb.nl/ | https://lab.kb.nl/tool/brinkeys-tool)
Follow-up 2: apply the techniques to other types of documents

Next steps
• Experiments with full-text data:
  • Do we need full text, or is the title or a summary sufficient?
  • Do we need different approaches per type of text?
  • Set up a dedicated & highly secure server with the full-text files
• Main focus on Annif:
  • Open source
  • Use own thesaurus
  • Active user community (http://swib.org/swib19/programme.html)
• Experiment with other types of materials: documents of the 16th & 17th century

ACKNOWLEDGEMENT
Fantastic participants of the ICT with Industry Workshop: Alex Brandsen (Leiden University), Hugo de Vos (Leiden University), Karen Goes (VU Amsterdam), Lin Huang (Leiden University), Hugo Huurdeman (University of Amsterdam), Aruembyeol Kim (VU Amsterdam), Sepideh Mesbah (TU Delft), Myrthe Reuver (Radboud University), Shenghui Wang (University of Twente & OCLC), Richard Zijdeman (IISG & Stirling University), Iris Hendrickx (Radboud University).
Great colleagues at KB: Erik Vos, Arjan Dekker, Lida Zoutewelle, Angelique Tempels, Meta van der Waal-Gentenaar, Enno Meijers, Sara Veldhoen, Brigitte den Oudsten, Irene Wolters, Rene van der Ark, Willem Jan Faber, Dorien Haagsma.

Interested in more? Working on similar challenges? LET'S COLLABORATE!
Generating metadata with AI: Experience of the KB, National Library of the Netherlands
Martijn Kleppe – Head of Research | Martijn.kleppe@kb.nl | @martijnkleppe | www.kb.nl/martijnkleppe

THANKS! Questions? Please put them in the chat box. Slides and a recording will be sent to all registered delegates.
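To make the deck's Naive Bayes baseline and its recall-focused evaluation concrete, here is a minimal Python sketch. It is not the KB's actual code: the toy titles and Brinkman-style keywords are invented, and a real run would use the roughly 15K-term thesaurus and many thousands of dissertation records.

```python
# Minimal sketch (not the KB's code): a Naive Bayes baseline that suggests
# Brinkman keywords ("Brinkeys") for dissertation titles, scored with the
# recall question from the slides: are the correct Brinkeys among the
# top-k suggestions? Titles and keyword labels below are invented.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

train = [  # (title, gold Brinkeys): toy data
    ("Machine learning for historical newspaper archives", {"machine learning", "newspapers"}),
    ("Neural text classification for Dutch sources", {"machine learning", "dutch language"}),
    ("A history of Dutch newspapers in the nineteenth century", {"newspapers", "history"}),
    ("Printing and publishing in the Dutch Golden Age", {"history", "publishing"}),
]
titles = [title for title, _ in train]
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform([keys for _, keys in train])   # one column per Brinkey
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(titles)
model = OneVsRestClassifier(MultinomialNB()).fit(X, Y)

def recall_at_k(title, gold_keys, k=20):
    """Fraction of the gold Brinkeys found among the top-k suggestions."""
    scores = model.predict_proba(vectorizer.transform([title]))[0]
    top_k = {mlb.classes_[i] for i in np.argsort(scores)[::-1][:k]}
    return len(gold_keys & top_k) / len(gold_keys)

# With a real 15K-term thesaurus, k=20 is a meaningful cut-off; the label
# space here is tiny, so this only illustrates the metric itself.
print(recall_at_k("Deep learning for nineteenth century newspaper archives",
                  {"machine learning", "newspapers"}))
```

The same recall-at-k scoring applies unchanged to the stronger methods the deck mentions (word embeddings, fastText, Annif): only the scoring function that ranks candidate Brinkeys would differ.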
Yazımızın asıl değinmek istediği nokta ise, otomasyonun kütüphanelerarası işbirliği ve örgütlenmede nasıl bir önemi oldu­ ğunu ve rol oynadığını birkaç örnek vererek ortaya koymaktır. Bazı ' Batı ülkelerinde özellikle Amerika Birleşik Devletlerinde kütüphanelerarasmda işbirliğine gidilmesini gerektiren nedenler vardır.2 Bunlardan başlıca ikisi: Kütüphanelerin hizmet ettiği kişi ve topluma daha yararlı olmak, diğeri ' de hızlı bilgi artışının ortaya çıkardığı sorunlardır. 192 Son yıllarda yapılan bir araştırmaya göre A.B.D. de 125 resmî akademik kütüphane işbirliği programının bulunduğu belirtilmiştir.3 Genellikle işbirliği programına katılan kütüphaneciler içişlerinde ser best olmakla beraber, bazı hizmetlerin verilmesi konularında birbir­ lerine bağımlıdırlar. Bu hizmetler şöylece sıralanabilir: Katalog bil­ gilerinin değişimi, ihtiyaç duyulan kitap ve kitap dışı materyallerin ödünç verilmesi, müracaat sorularını cevaplamada yardım, az -kul­ lanılan materyallerin fazladan alınmasını önlemek için toplu kata­ logların hazırlanması, tecrübe, bilgi ve araştırmaların paylaşılması gibi. Harvard Üniversitesi’nden. Susan Martin «Library Automation» (Kütüphanede Otomasyon) adlı makalesinde, tekrarların önlenmesi ve çeşitli sistemler arasında uyum sağlanması için yapılan çağrıların önceleri pek az ilgi gördüğünü söylemekte ve fakat son yıllarda çe­ şitli faktörlerin işbirliği programlarını değiştirmekte olduğunu yaz­ maktadır. Bu faktörlerden bir tanesi ticarî bir kuruluş olmayan kü­ tüphanelerin karşılaştığı ekonomik sorunlar, İkincisi de bibliyogra­ fik bilgilerin dağıtım ve denetimi ' için MARC (Machine Readable Cataloging) bilgisayar kataloglaması projesinin başarıya ulaşması­ dır. Bunlara ek olarak da OCLC (Ohio College Library Center) ve NELINET (New England Library and Information Network) gibi böl­ gesel işbirliği örgütlerinin otomasyonu ekonomi ve başarı ile gerçek­ leştirmeleri gösterilebilir.4 Günümüzün hızla gelişen dünyasında ve dinamik bir çevre için­ de kütüphanelerin topluma, sosyal ve ekonomik kalkınmaya katkı­ da bulunabilmesi ancak yaratıcı ve yetenekli bir yönetimle müm­ kündür. Bu - nedenle gelişmiş ve bazı gelişmekte olan ülkelerde kü­ tüphane yönetici ve planlayıcılara otomasyondan yararlanma yolla­ rını aramakta ve olumlu çalışmalar yapmaktadır. Çağımızın artan bilgi üretimi artık insan gücü ile denetlenemez bir duruma gelmiştir. Özellikle bilgiye kısa bir sürede ulaşmak ar­ zusu ve bilginin yayımı geleneksel bibliyografik uygulama için bir sorun olmaya başlamıştır. Dünyanın en büyük buluşlarından olan tekerlek kadar bilgisa­ yar da çağımızda yeni bir çığır açmış bulunmaktadır. Bilgi üretimi ve denetiminin de emrine verilen bilgisayar endüstrisi yayılmakta ve gelişmektedir. Yapılan araştırma ve tahminlere göre bu endüstri­ nin 1980 yılında dünyanın üçüncü büyük, 20. Yüzyılın sonlarına doğru da en büyük endüstrisi olacağı hesaplanmıştır? Bunun nedeni de bilgisayarın hesaplama gücü ve hızının insan zekâsının yapabi­ leceğinden milyonlarca kere daha fazla olmasıdır. 193 Tüm bu olanaklara rağmen bir kütüphanenin hangi işlemleri­ nin bilgisayarla yapılması ' gerektiği - konusunda kesin bir - sonuca ya­ nlamadığını kütüphanecilik literatürünün incelenmesinden anla­ maktayız. Bilindiği gibi, verilen hizmet kütüphaneden kütüphaneye - değiş­ mekte ise de kütüphane içinde yapılan bazı- işlemler hemen hemen her kütüphane türünde aynıdır. İşbirliğini gerçekleştirmek isteyen kütüphaneler için bu işlemler üç ana bölümde toplanabilir : 1. 
Yönetim (kütüphanenin politikası, amacı, personeli, bütçesi, plânlaması, halkla ilişkileri, v.s.) 2. Teknik - hizmetler (sağlama, kataloglama, sınıflama, ciltleme, depolama - ve koruma) 3. Okuyucu hizmetleri (müracaat, ödünç verme) Verilen bu bilgilerin ışığı altında kütüphane yöneticilerinin kü­ tüphanenin otomasyonu konusunda karar verirken hangi noktala:! göz önünde bulundurmalarının yararlı olacağına değinen görüşleri inceleyelim. Konıi genellikle dört geniş alanda ele alınmaktadır. Bunlar : 1. Mevcut teknoloji 2. Kütüphanecilik teori ve ilkeleri 3. Toplumun ve çevrenin ihtiyacı 4. Ana kuruluşun yapısı ve amacı.6 Mevcut teknoloji ile ilgili olarak bilinmesi önerilen üç nokta ise şudur : a) bilgi toplama, b) bilgi depolama ve bilgiye ulaşım, c) bil­ gi - alma ve hizmete sunma. Bilgisayarların yararlı olabilmesi için otomasyonun kütüphane­ cilik teori ve ilkelerine dayanması gerekmektedir. Bu alandaki baş­ lıca sorunları bilginin organizasyonu, bilgiyi ulaşım stratejileri ile dil ve terim yapılarının oluşturduğu belirlenmiştir. Toplumun ve çevrenin ihtiyacını ise, bilgi kaynaklarının artışı, toplum kuruluşlarının gittikçe daha karmaşık bir hale gelmesi ve bireylerin yeterli derecede bilgi kaynaklarına sahip olma olanakla­ rının azalması etkilemektedir. Dördüncü- alanda da, kütüphanenin tek başına bağımsız bir ku­ ruluş olmadığı noktasından hareketle, bağlı bulunulan ana kuru­ luşun - yapı ve amacının göz önünde - bulundurulması önerilmiştir. 194 Bilgisayar metodlarmın kütüphane işlemlerine uygulanması ti­ caret dünyasındaki uygulamaya - çok benzemektedir. Çünkü ikisi de iyi - bir plânlamayı, sistem analiz ve çalışmasını ve programlamayı öngörmektedir. Bu çalışmalar sonucu ortaya çıkan ve kütüphane yöneticileri tarafından dikkate alınması gereken başlıca karar alan­ larını şöyle sıralayabiliriz : 1. Mevcut kütüphanenin yapısı ve çalışma sisteminin durumu nedir? 2. Yapılacak değişikliğin sonuçları neler olacaktır? 3. Değişikliğin gerçekleştirilmesinde- kullanılacak metod veya metodlar? 4 Kütüphanede - otomasyona elverişli işlemlerin belirlenmesi. 5. Kütüphanenin mevcut kaynakları ve İnsan gücü. 6. Sistem analizi ve uygulama grubunun oluşturulması. 7. Personelin seçim ve eğitimi. 8. Kullanılacak bilgisayar ve programlar. 9. Otomasyonda kullanılacak standartların belirlenmesi. 10. Dokümanların hazırlanması. 11. Yerel, bölgesel ve ulusal amaçlar. Yukarda sıralanan karar alanlarının incelenmesi ve değerlen­ dirilmesi otomasyonun katı bir takım işlemlerden oluşmadığını, çe­ şitli işlemler için çeşitli seçeneklerin bulunduğunu gösterecektir. Şüphesiz seçeneklerin tercihi de çevreden çevreye değişebilecektir. Bu nedenle günümüzde kütüphanelerin otomasyonunda kullanılan ve diğerlerine üstün olarak kabul edilebilen tek bir metod yoktur. Bir kütüphanede mantıklı ve ekonomik görülmeyen bir yaklaşım, diğer bir kütüphane - için son derece yararlı olabilir. Yazımızın başında da belirttiğimiz- gibi otomasyonun kütüpha- nelerarası işbirliğindeki önemine birkaç örnekle değinelim. Bunlar­ dan birincisi, CLSD (Collaborative Library System Development) adı altında, Chicago, Columbia, ve Stanford üniversite kütüphanelerinin ortaklaşa gerçekleştirmeye çalıştığı otomasyon- projesi olup, kütüp­ hane otomasyonu ve bilgi transferi konularında yapılan çalışma, el­ de edilen bilgiler ve hazırlanan teknik raporların değişimi için or­ taklaşa bir sistem oluşturmayı amaçlar. Bu proje 1968 yılında Na­ tional Science Foundation' m para yardımı ile başlatılmıştır. 
Paul Fasana’ya7 göre projenin - önemli iki özel amacı şudur : 195 1. Bir takım teknik varsayımların doğruluğunu denemek. (Bun­ lar, ortaklaşa işlerin görülmesini kolaylaştırıcı bir sistemle uyumlu kütüphane otomasyon sistemlerinin geliştirilebileceği olanaklarının bulunduğu görüşüdür.) 2. Proje grubunun tecrübe ve buluşlarını kütüphanecilik top- lumuna duyurmak. Söz konusu proje 1968 yılında başlatılmış ise de ortakların oto­ masyon çalışmaları daha önceki yıllardanberi yürütülmekteydi. Bun­ lardan Chicago 1964, Columbia 1965, Stanford’da 1967 de - gerekli ha­ zırlıkları tamamlamışlardı. İşte bu nedenledir ki, - bu üç üniversite kütüphanesinin otomasyon felsefesi, yaklaşımı ve değerlendirmesi arasında - farklılıklar vardır. Uzun görüşmelerden sonra ortaklar ayrı ayrı çalışmalarını sürdürmeyi yararlı görerek işbirliğine kendi tek­ nik görüş ve buluşları ile katılmayı kararlaştırmışlardır.8 Bu üç kü­ tüphanenin otomasyon felsefe ve yaklaşımları şöyledir : Stanford Stanford’un yaklaşımı ve çalışmaları tam kapasiteli bir «on-line» (ki buna doğrudan bağlantılı- diyebiliriz) sistemini gerçekleştirme­ ye yöneliktir. Çalışmalarda özellikle bibliyografik bilgi işlem siste­ mine ağırlık verilmiş olup, işlemler dört grupta yürütülmektedir : a) bilgi toplama ve işleme, b) MARC. c) kataloga ait veriler, d) ödünç verme envanteri. Bunlardan en önemlisi olarak belirtilen birinci grupta, bir bibliyografik künyeye ilişkin bilgilerin toplanması söz konusudur. Buradaki tüm bilgiler indekslenmekte ve bilgisayar­ dan «on-line» olarak kitap adı, yazar adı ve tüzel yazar adına göre arama yapılabilmektedir. - Anılan dört gruptaki bilgiye ulaşım ve aramanın yapılması 30 kadar «on-line» bilgisayar görüntü ekranı ile yapılabilmektedir.9 Ayrıca Stanford’da, bilgisayarla kataloglama alanında da araş­ tırmalar yapılmaktadır.‘0 Bu konuda benimsenen iki yaklaşım şöyle verilmektedir : 1) kitabın iç kapağındaki bilgilerin basit olarak bil­ gisayarın okuyabileceği bir şekle dönüştürülmesi, 2) iç kapaktaki ya­ zar adı, kitap adı ve diğer öğeleri -mekanik olarak tanıyabilecek bir hesaplama metodunun (algorithm) geliştirilmesi. Kısa.ca özetlemek gerekirse, Stanford «on-line» otomasyon siste­ mi kitabın iç kapağındaki öğelerin bilgisayarca okunabilmesi ilkesi­ ne dayanmaktadır. İ96 Sistemi diğerlerinden ayıran başlıca özelliği ise istenilen bilgiyi edinebilmek için hazırlanan indeks kelimelerinin ilk üç harfinden oluşan bir anahtar dizininin bilgisayar tarafından kullanılmasıdır. Columbia Columbia’nm otomasyon yaklaşımı ise Stanford’un aksine «off­ line» (ki buna doğrudan ' bağlantısız diyebiliriz) bilgisayar tekniğine yöneliktir. Columbia kütüphanede otomasyonu gerçekleştirmek için iki beş yıllık plân hazırlamıştır. Bunlardan ilki 1966- 1971 yıllarını kapsamış ve teknik hizmetler alanında bibliyografik verilerin işlen­ mesinde bilgisayardan yararlanma amacını gütmüştür. İkinci beş yıllık plân süresindeki çalışmalar daha ziyade mevcut sistemlerin ve okuyucu hizmetlerinin geliştirilmesi doğrultusunda olmuştur. Columbia, otomasyona gidilirken verilen hizmetlerin aksama­ ması için hızlı bir değişimi sakıncalı görmüş, buna neden olarak da bilgisayar teknolojisindeki hızlı gelişmeyi ve «on-line» sisteminin pa­ halı oluşunu göstermiştir’. 
Columbia'daki otomasyon sistemi teknik hizmetler işlemlerinin yürütülmesini sağlıyan ve merkezi olmıyan beş ana bilgi grubundan oluşmuştur : 1) Bilgi toplama ve işleme 2) Parasal işler 3) MARC 4) Kataloglama - 5) Bibliyografya11 Chicago Chicago'nun otomasyona - yaklaşım felsefesi «on-line» ve «off-line» bilgisayar tekniğinin her ikisinin de kullanılmasına dayanır. Chicago'da tek bir merkezi sistem vardır ve teknik işlemlerin yürütülmesini sağlamak amacıyle bilgisayar tarafından düzenlenen her bibliyografik künye için hazırlanan kayıtlar burada depolanır. Depolama disklerle yapılmaktadır ve istenilen bilgi «on-line» olarak elde edilebilir. Depolanmış bilgiden katalog fişleri ile kitap kartlan ve etiketleri, MARC fişleri, istatistik listeleri hazırlanabilir. Bu sistemde verilen depolanması iki yolla yapılmaktadır: a) yerel olarak b) MARC bandları ile. 1969-70 istatistiklerine göre sistem 107 37.000 bibliyografik künyeyi işliyerek depolamış, 413,000 katalog fişi ile 81,000 kitap kartı ve etiketi hazırlamıştır.12 CLSD’den sonra gerçekleştirilen ikinci - önemli otomasyon işbir­ liği ise «SLICE - The Southwestern Library Interstate Cooperative Endeavor» dır. Proje SWLA (Southwest Library Association) nm yar- dımıyle eyaletlerarası kütüphane işbirliğini gerçekleştirmek ve kü­ tüphane kaynaklarının ve hizmetlerinin geliştirilmesini sağlamak amaciyle 1971 yılında başlatılmıştır. SWLA altı eyaletten oluşan bir bölgeyi içine almaktadır ve şunlardır: Arizona, Arkansas, Louisiana, New Mexico, Oklahoma ve Texas SLICE ’m ilk önemli projesi MARC-O sistemi olmuştur. Oklahoma Eyalet Kütüphaneler Merkezi’nce geliştirilen bu sistem Kongre Kü­ tüphanesi tarafından hazırlanan MARC bandlarmın - ortaklar arasın­ da aynı kolaylıkla kullanılabilmesi için standartlaştırılmasını öngör­ müştür. Bu projeye öncelik verilmesinin nedenleri ise şöyle sıralan­ maktadır:” 1) Bibliyografik kayıtlar, kataloglama ve bunların çoğaltılması işlemi, yeri, şekli ve büyüklüğü ne ve nasıl olursa olsun böl­ gedeki tüm kütüphaneler için aynıdır. 2) MARC-O sistemi yalnız bibliyografik bilgi ve kayıtları ver­ mekle kalmayacak aynı zamanda kitapların - hangi kütüpha­ nelerde bulunduğuna ilişkin bilgileri de- içerecektir. Bu da dolayısı ile bölgesel bir kitap katalogunun oluşturulmasına yardım edecektir. 3) MARC-O sistemi yerel kütüphanelerin «SDI - Selective Disse­ mination of Information» (ki buna seçici bilgi dağılımı diye­ biliriz) yolu ile en yeni bilgileri edinmelerine yardımcı ola­ caktır. 4) Eyalet Kütüphaneler Merkezi MARC-O sisteminin geliştiril­ mesi ve iyi işleyebilmesi için gerekli harcamaları yapacaktır. Mary Duggan1'1 MARC-O sistemini iyi ve yararlı bir sistem olarak nitelemektedir. Bir kere, çok sayıda kütüphanenin - kataloglama işlem­ lerine yardımcı olan ve bölgede eyalet kütüphanesince başlatılan tek projedir, ikinci olarak, verilen SDI hizmeti metaryal sağlama, ön - kataloglama ve müracaat hizmetleri için büyük bir potansiyele sa­ hiptir. Kütüphanecilikte yeni bir gelişmenin eseri olan SDI hizmeti iki şekilde verilmektedir: a) standart SDI hizmeti, - b) kütüphanecile- 198 rin özel işleklerini karşılayan SDI hizmeti. Birincisi, yeni katalog­ lanmış kitapların konularına göre ayrılmış basılı haftalık listelerdir ki birçok kullanıcıya yararlı olmaktadır. Diğeri ise, adından da an­ laşılacağı üzere kütüphaneciler için hazırlanmaktadır. 
MARC-O sistemi bir «on-line» IBM 3330 bilgisayarı kullanmakta ve 316,000 kadar MARC kaydını depolamış bulunmaktadır.15 Bilgiye ulaşım ve atama LC (Kongre Kütüphanesi) numarası, konu, kitap adı, yazarı ve ISBN (Uluslararası Standart Kitap Numarası) ile ya­ pılabilmektedir. Yukarda gördüğümüz iki -işbirliğinden biraz farklı olmakla bera­ ber başarısından söz. etmemiz gereken diğer bir işbirliği de, daha doğrusu kütüphaneler arasındaki örgütlenme de (Network) «OCLC- Ohio College Library Center» dir. Ohio eyaletinin Columbus kentin­ de Frederick Kilgour tarafından 1972 yılında başlatılan bu proje bü­ yük bir üne kavuşmuştur.16 Merkezin amacı Ohio eyaletindeki kü­ tüphanelerle, Georgia, Pennsylvania, ve New Hampshire eyaletlerin­ deki kütüphanelere «on-line» kataloglama hizmeti sunmaktır. Bu­ nunla beraber, OCLC büyük üniversite kütüphanelerinin sorunları­ na cevap vermemektedir. Bunun nedeni de, Chicago ve Stanford ör­ neklerinde olduğu- gibi, büyük üniversitelerin kendi bilgisayarlarının bulunması, dolayısiyle bu kütüphanelerin otomasyon programlarını buralardaki programlara göre hazırlama zorunluluğudur. OCLC’nin ise özel bir bilgisayarı bulunmaktadır. OCLC iki amaca ulaşma çabasındadır- 1) Örgüte katılan kütüp­ haneler kullanıcılarının ihtiyacını karşılayacak kaynakları artırmak, 2) Gittikçe aılan kütüphane masraflarını azaltmak. Uzun sürede ulaşılacak amaç ise, tam bir otomasyona gidilerek birey ve kütüphanelere istedikleri bilgiyi istenilen yerde ve en kısa süre içinde verebilmektir. Bunu gerçekleştirmek için OCLC bir «on­ line» bibliyografik sistem geliştirmekte olup tüm ortakları birer «CRT-Cathode Ray Tube» (bilgisayar görüntü ekranı) veya teleksle bu sisteme bağlamaktadır. Biraz daha ayrıntıya gidecek olursak OCLC otomasyon sistemi­ nin altı - altsistemden oluştuğunu görürüz. Bunlar? 1) «on-line» toplu katalog ve ortaklaşa kataloglama 2) Süreğen yayınlar denetimi 3) Kütüphanelerarası ödünç verme 4) Kitap sağlama J 99 5) Merkezi kataloga uzaktan ulaşım - (Remote access) ve ödünç verme denetimi 6) Konu ile bilgiye ulaşım (Retrieval by subject) OCLC'nin merkezi katalogu 1,000,000 kayıttan oluşmaktadır. Bu sayı günde ortalama iki bin artmaktadır. 1974 yılındanberi de örgü­ tü oluşturan kütüphaneler merkeze kendi süreğen yayınlarının bib­ liyografik künyelerini vermeye başlamışlardır. Merkezdeki süreğen yayınlar sistemi aşağıdaki işlemlerden oluşmaktadır : 1) Dergi ve mecmuaların gelip gelmediklerinin denetimi 2) Gelmeyenlerin bilgisayar tarafından belirlenip otomatik olarak, istenmesi 3) Ciltleme denetimi OCLC otomasyon sisteminden yararlanan kütüphaneciler kendi kütüphanelerindeki «on-line» görüntü ekranı kanalıyle - merkezi ka - talogdakı bilgiye kitabın konusu, kitabın adı, yazar ve editör adla­ rını kullanmak suretiyle ulaşabilmektedirler. KAYNAKLAR 1. James ' Martin and Adrian R. D. Norman. The Computerized Society (Englewood Cliffs, NJ. : Prentice Hall. 1970), s. 286 2. Barbara Evans Markuson, «An Overview of Library Systems and Automation*, Datamation, 16 (February 1970), s. 62 3. Carlos A. Cuadra and Ruth J. Patrick, «Survey of Academic Library Consortia in the U.S.», College and Research Libraries, (July 1972), s. 273 4. Susan K. Martin, «Library Automation», ARIST, 7 (1972), s. 253 5. IBM, Management Information Systems: The Executive View, (New York, IBM, 1970), s. 3-11 .... .............. 6. I. A. Warheit, Computers in Libraries, (New York, Special Libraries Association, 1969). 7. Paul J. Fasana and Allen Veaner, Collaborative Library Systems Develop­ ment, (Cambridge, Mass. 
: MIT Press, 1971), s. 3 8. Paul J. Fasana, ' aynı eser, s. 3 9. Paul J. Fasana, aynı eser, s. 46-48 10. Frederick G. Kilgour, «Concept of an On-Line Computerized Library Catalog» Journal of Library Automation, 3 (March 1970), s, 8 200 11. Fanasa and Veaner, s. 11-12 12. Fanasa and Veaner, s. 8 13. Mary Duggan, «The SLICE Project of the Southwestern Library Association and Experiment in Interstate Intertype Library Cooperation», Illinois Libraries, 55 (May 1973), s. 311 14. Mary Duggan, ' «The SLICE-MARC-O Project», Louisiana Library Bulletin, 34 (Winter 1972), s. 125 15. Robert L. Clark, «MARC-Oklahoma Data Base Storaje and Retrieval Project­ Report Number 8» Oklahoma Department of Libraries Automation Newsletter, 4 (December 1972), s. 2 16. Susan K. Martin, Library Automation, (Chicago, ALA, 1975), s. 61 17. The Bowker Annual . of Library and Book Trade Information, New York, R. R. Bowker Co., 1975. s. 95-96. work_s66wwceusngqvc3xowwpc2ndry ---- Evaluation of Web Discovery Services: Reflections from Turkey Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 1877-0428 © 2013 The Authors. Published by Elsevier Ltd. Selection and peer-review under responsibility of The 2nd International Conference on Integrated Information doi: 10.1016/j.sbspro.2013.02.074 The 2nd International Conference on Integrated Information Evaluation of Web Discovery Services: Reflections from Turkey Güleda Doğana,*, Selahattin Cihan Doğanb aHacettepe University, Department of Information Management, 06800, Ankara, Turkey bHacettepe University Libraries, 06800, Ankara, Turkey Abstract After the occurrence of Google and failure of federated search engines against Google, web discovery services appeared as a savior for libraries. Main importance of web discovery services is a google-like single search box, speed and comprehensiveness. This study aims an overview for web discovery services and analyzing the use and awareness of these services in Turkey. Web Discovery Services are used by 52 of the Turkish universities, 39 state universities and 13 private universities (August 2012). Ebsco Discovery Service (EDS) and Serial Solutions Summon are most commonly used discovery services in Turkey. EDS have 101 users (31 Universities, 3 Institutions, 1 Military Institution of Higher Education and 66 Ministry of Health Training and Research Hospitals), Summon have 21 users that all are universities. The other two discovery services, OCLC Worldcat Local, Encore Synergy and Primo Central don’t have any user in Turkey library market for now. According to the results of the conducted survey (February 2012) to all 160 higher education institutions in Turkey, 63 of the 74 respondents have information about web discovery services. About half of the 39 nonuser institutions attempted for using a discovery service. © 2012 The Authors. Published by Elsevier Ltd. Selection and/or peer-review under responsibility of The 2nd International Conference on Integrated Information. Keywords: Web discovery services; web-scale discovery; federated search; Turkish university libraries 1. Introduction The main goal of libraries is to connect users with the information they seek. Because of the significant changings in scholarly use of information services in recent years, getting the most relevant information as quickly as possible have a great importance for today’s users. Google has a very important impact on users’ experiences and expectations mainly in the last decade. 
Positive experiences with Google have raised users' expectations; they want the same performance from libraries. They do not want to jump from one interface to another; instead they prefer a single access point with the ability to search multiple resources simultaneously. They want the library catalog to foresee what they are looking for based on the words they type in the search box.
This is a big change for users that they no longer need to choose a specific search tool to begin their search. While users can search across a broad range of content, they can also limit their search for only the available sources at their own institutions. Web-scale discovery services have a Google-style single search box designed as a single access point to wide range of library content, eliminates the need to merge results, allow easier deduplication and provide rapid search, quick and relevantly ranked results. It also have advanced searching capabilities [1, 10-17]. Because web-scale discovery tools are new for the information market, library environment are not necessarily aware of them [15, 17], they will start to gain more attention with the implementation of these tools by a growing number of academic libraries [1]. Academic libraries spend lots of money for purchasing online databases, electronic collections etc. and lots of staff time for supporting these services [10]. They must evolve in response to the rapidly changing information environment for not losing their potential users and to make the users to put more value to gateway role of libraries to information that own licence valuable and costly full-text databases [4, 6]. This paper aims to introduce the concept of web-scale information services, understand the need for web discovery services and provide an overview information for librarians want to know about web-scale discovery services; to make them consider about these services, begin or continue investigations on these services. 2. Web Discovery Services Evolution of information discovery tools within the library context started with card catalogs. As a next step, those card catalogs transferred to online integrated library systems (ILS). These were only available in the library. In the 1990s, with the development of the web, HTTP web-based online catalogs were created. Electronic journal content, e-text and e-book content, abstracting and indexing databases also appeared in 1990s, but firstly in CD-ROM format [15-16]. In 1998, a new search engine developed by two Stanford graduate students. It was Google which is often first and mainly the last stop for todays’ users who have grown up with Google for information discovery [15]. Google became the standard for users with its simple interface as easy as entering 446 Güleda Doğan and Selahattin Cihan Doğan / Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 keywords in a single search box, speed, accessibility wherever a searcher can get Internet access, broad content and quality results [4, 17-18]. In contrast to Google, libraries generally subscribe about 100-400 databases each of which have different interfaces [4, 17]. Federated searching also known as metasearching, identified by Roy Tennant and others during late 1990s and early 2000s, was supposed to be simple interfaces providing seamless searching across logically clustered databases of information and being a way of meeting the expectations and needs of Google generation, that they would allow users to search, retrieve and display content from multiple resources such as abstracting and indexing databases and full-text databases simultaneously as easy as entering keywords in a simple search box. In other words, they could allow libraries to become “one-stop shop” for their users. 
It is one of the major advantages of federated search to access library resources without having to select a specific database and without repeating the search that users do not want to jump from one interface to another when they are searching for information. The biggest players in this market are MuseGlobal, Serials Solutions WebFeat, and Fretwell- Downing [3-4, 11, 15-16, 18-20]. The development of federated searching had become one of the main growth in academic libraries at that years that many libraries thought federated searching instead of Google because of including more scholar sources than Google [4, 9, 18]. Federated search tools have not been able to achieve to provide a Google-like search box that can quickly retrieve information for all library resources for several reasons [19]. Over time, it has been clear that federated searching are not meeting the needs of users [9]. Issues raised on the capabilities of federated search mainly as difficulty, complication, slowness, merging of resources, deduplication (merging results) and ranking of retrieved results [9, 15, 17-18, 20-23). Beside these, the number of individual resources that can be simultaneously searched was limited with federated search tools [3, 17]. It may be better for smaller libraries or public libraries to use federated search tools for providing access to a selected group of databases [20]. In 2000s, library web catalogs changed as “next-generation” library catalogs. Next-generation library catalogs such as Encore, Aquabrowser, Exlibris Primo were more functional that they had "Web 2.0" features like tagging, submission of reviews, facets etc. and a user interface with popular sites like Amazon. They provided harvesting of records from various locally hosts; catalog records from one host, digital collection records from another host. They search, retrieve and present results in a single next-generation catalog interface. It was believed that this modern looking interfaces would retake some of the users from Amazon and maybe from Google but they have failed so far, users continued to use Amazon, Google. Library Catalog was used more to check whether an item they found in Amazon or Google was available in the library. Next-generation catalogs are still new technology for many libraries [15-16, 24]. Development of Google Scholar next to these issues about federated search and next-generation catalogues, a quest began for a new kind of resource that would compete with Google Scholar both in terms of speed, scope, harvesting and preprocessing of information [3, 19]. Beside these, the new kind of resource is supposed to suggest better searh terms, spell check, suggest other terms based on the entered search terms, easily live help, relevance ranking as default display, helping links, include less jargon such as Boolean, ISSN etc. [8]. After the failure of Federated search engines and Next generation catalogs, libraries began to suffer against Google and Google Scholar. A few months after Google Scholar start in November 2004, Marshall Breeding discussed that federated searching could not compete with Google Scholar’s speed and power. He called for a “centralized search model” [3, 11]. Web discovery tools became the latest attempt to solve this problem, providing a Google-like single search box that have access to all library resources [1]. 
Different from federated search engines which searches multiple databases and aggregates the results, they search a unified index and presents search results an a single interface Federated search engines results rely on each tool’s search algorithm and relevance rankings as well as the federated search engine’s. Web discovery tools import metadata into one index and apply one set of search algorithms to retrieve and rank results [11]. First web discovery service appeared in late 2007 was OCLC WorldCat Local. In the middle of 2009 Serials Solutions announced its web- 447 Güleda Doğan and Selahattin Cihan Doğan / Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 scale discovery tool Summon. Others followed with similar products, such as Ebsco's Discovery Service in early 2010, finally Encore Synergy and Exlibris Primo Central in mid 2010 [12-13, 16, 25-26]. The approach of Encore Synergy to web-scale information discovery is a bit different from the other four. It doesn’t create a preharvested control index. Ebsco Discovery Service and Innovative Interfaces Encore Synergy were released in 2010. There is not any studies on these Web Discovery Services yet [14-15, 24, 26]. Also some open sourced Web-scale discovery services, such as eXtensible Catalog (XC) Project by University of Rochester were developed [15]. Summon was announced in January 2009 with Dartmouth College Library and Oklahoma State University Libraries as beta sites [27]. University of Sydney, and the University of Liverpool were the other two beta sites [28]. Summon launched commercially in July 2009 and recognized as Best Enterprise Search Solution at 2011 CODIE awards by the Software & Information Industry Association (SIIA). It has the capability to search more than 800 million records that is more than even the size of the largest traditional library database. It contains more than 35 Open Access repositories and more than 90 institutional repositories, providing the most full-text searchable content. The service is openly available for searching without authentication [17, 19, 29]. It includes content and metadata from more than six thousand publishers, database producers and content providers. The content consists of different type of sources such as manuscripts, archival materials, journal articles, monographs and sound recordings. For harvesting content, Serials Solutions’ Summon has been contracting with dozens of content producers, publishers including the American Institute of Physics, the American Psychological Association, Cambridge University Press, Oxford University Press, and Springer. Summon is also capable of utilizing OpenURL and Digital Object Identifiers [28]. Summon is able to search better than even the best federated search tool. Unlike from Google Scholar, Summon is tied to a library's resources. Finally, unlike federated search or Google Scholar, Summon's normalized data also allows for a greater level of purification of data before and after the search [17]. John Law who is the director for Serial Solutions Summon noticed that Summon had a very significant impact on library usage. From the publisher side, libraries using Summon have increasing use of their content [26]. Sam Brooks indicated that the main aim for developing the EBSCO Discovery Service (EDS) was helping libraries compete with Google and Wikipedia that one of the most important content of EDS is academic encyclopedia [26, 30]. 
Millersville University’s integration to EDS was easy because of common using of EBSCO products in this university. Usage of some subject databases increased after EDS, it also prevented the cancellation of some. Carl Grant pointed out that the libraries using Primo have increased average number for search sessions, on the other hand, decreased session length. They thought this situation as the effect of quickly finding of what is looked for by users. Chip Nilges states that OCLC WorldCat Local came along for integrating library collections that provides a single search of all library collections. University of Delaware (UD) was the first place of production for WorldCat Local and it was a major conceptual change for the library staff of UD [26]. There are different criteria to consider while selecting a web discovery service for your library/institution. Some of the important criteria are scope and depth of content, richness of metadata, simplicity of interface, customizing of interface, supporting of mobile access and of course cost. Diane Bruxvoort, summarized main issues for selecting discovery services for an academic library as prices, top criteria of the institutions, discovery service providers, time you need to get ready for a new product etc. [26]. Carl Grant think that the next steps for discovery services are “personal relevancy ranking”, “improved mobile interfaces” and “addressing of growth in e-book usage” [26]. 3. Web Discovery Services in Turkey There is no study on Web Discovery Services in the national literature apart from the symposium organized by Turkish Librarianship Association Istanbul Office in October 3, 2011 with the aim of raise awareness for these 448 Güleda Doğan and Selahattin Cihan Doğan / Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 services. Turkey representers of Web Discovery Service providers’ introduced their products to library environment with this symposium titled as “Web Discovery Tools and Services in Libraries” [31-32]. EDS is the most commonly used discovery service in Turkey. According to the information received from Erol Gökduman, Regional Sales Manager at EBSCO Publishing, Turkey, EDS have 101 subscriptions in Turkey distributed as 31 Universities, 1 Military Institution of Higher Education, 3 Institutions and 66 Ministry of Health Training and Research Hospitals [30, 33]. Summon has been used by 21 Universities [34]. OCLC Worldcat Local used by Pamukkale University for a while but then the subscription has been cancelled. Encore Synergy and Primo haven’t been in use in Turkey market yet [33]. Totally, 52 Turkish universities (49 state and 13 private) are subscribed to a web discovery service. In the view of such information and universities list of the The Council of Higher Education [35], it is clear that one third of the Turkish universities subscribe to web discovery services. According to Erol Gökduman, web discovery services spread faster in Turkey than Europe [30]. 
A basic web-based survey consist of six questions developed with SurveyMonkey, a free online survey software [36], and sent to 160 higher education institutions (96 State Universities, 50 Private Universities- Institutions of Higher Education Established by Foundations, 5 Turkish Republic of Northern Cyprus Universities, 4 Vocational Schools of Higher Education Established by Foundations, 2 Military Institutions of Higher Education, 1 State University with Special Status, 1 Institution of Higher Education Affiliated with Police Organization)† [35, 37-38]. 74 (46%) of these higher education institutions replied the survey. As the first question indicates 63 (85%) of the 74 institutions have information about web discovery services, 11 (15%) don’t have any information about them. Higher education institutions are mostly informed of Ebsco Discovery Service (84%), Summon (71%) and OCLC WorldCat Local (55%). There are 6 institutions haven’t been informed about any of the web discovery services. About half of the 74 institutions (53%; 39) don’t use a web discovery service, 19 of the institutions use Ebsco Discovery Service, 15 use Summon. Approximately the half of the 39 nonuser institutions attempted for using a web discovery services. Only 11 of the web discovery service subscribers made studies (survey, user statistics etc.) to understand the effect of these services on the library use. An example to these subscribers is Izmir Institute of Technology. Library management had a survey on the library web page mainly about Summon consists of four questions (February, 2012). They asked where do the users begin to search, where do the users begin to search in library, if they know about Summon and if they use Summon. They don’t have a report for the results of this survey on their web page. Gültekin Gürdal, Director of the Library indicated that they also used a focus group study carried out in 2010 for subscription decision. They asked 15 Institute member about response rates, search results by categories, page design, user help, easily finding what you search, displaying content without losing search results, showing the number of materials contained by each of the search filtering options, ease of using adding to folder” option, ease of processing for sending the user to a new page. According to these questions, very positive responses were observed about Summon and this study had effect on subscription decision of the library management [39-40]. Another example is Bilkent University. They conducted a user satisfaction survey in 2011 [41] after they subscribed to Ebsco Discovery Service. We compared the results of this study with a similar survey conducted in this library in 2008. The number of users think that the library catalog is fast increased 65% in 2011 while it is 58% in the 2008 survey [42-43]. This is not a vital increase of course, the most crucial changings are about library website. While the %31 of users are satisfied with accessibility, %29 with content, %18 with interface in 2008; these number changed as %76 accessibility, %80 content and %70 interface in 2011 [44-45]. 
Overall † Totally 183 higher education institution exist within the structure of the Council of Higher Education including 103 State Universities, 65 Private Universities, 7 Vocational Schools of Higher Education, 5 Turkish Republic of Northern Cyprus Universities, 5 Military Institutions of Higher Education, 2 State Universities with Special Status and 1 Institution of Higher Education Affiliated with Police Organization (The Council of Higher Education, 2012a, 2012b, 2012c). Some of these are so new that their library websites are under construction, some of them don’t have a library web page by its structure and some don’t have an electronic mail address for their library on their web page. Web- based survey is mailed to all institutions that are relatively easy to access (160 institutions). 449 Güleda Doğan and Selahattin Cihan Doğan / Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 satisfaction from the Bilkent University Library is increased to 86% from 49% [46-47]. All of these increases may be the effect of Ebsco Discovery Service. 4. Conclusions This study summarized the history of information discovery in libraries before web discovery services, evaluated the web discovery services which are so new for library market, and finally pointed to the use of web discovery services in Turkey. These services have an increasing awareness and use in Turkey. References [1] Asher, A. D., Duke, D. M. and Wilson, S. (In press). Paths of discovery: Comparing the search effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and conventional library resources. College & Research Libraries. Retrieved from http://guides.main.library.emory.edu/content.php?pid=43389&sid=1367871 [2] Bates, M. J. (2003). Improving user access to library catalog and portal information, final report, version 3. Washington, DC: Library of Congress. Retrieved from http://www.loc.gov/catdir/bibcontrol/2.3BatesReport6-03.doc.pdf. [3] Breeding, M. (2005). Plotting a new course for metasearch. Computers in Libraries, 25(2), 27–29. [4] Luther, J. (2003). Trumping google: Metasearching's promise. Library Journal, 128(16), 36-39. [5] Vaughan, J. (2010). Web scale discovery. Retrieved August 23, 2012, from http://americanlibrariesmagazine.org/columns/dispatches- field/web-scale-discovery [6] Schonfeld, R. C. and Housewright, R. (2010). Faculty survey 2009: Key strategic insights for libraries, publishers and societies. Newyork: Ithaka S+R. Retrieved from http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-2000-2009//Faculty%20Study%202009.pdf [7] OCLC. (2009). Online catalogs: What users and librarians want. Dublin, OH: OCLC. Retrieved from http://www.oclc.org/reports/onlinecatalogs/fullreport.pdf [8] Tallent, E. (2010). Where are we going? Are we there yet?. Internet Reference Services Quarterly, 15(1), 3-10. [9] Warren, D. (2007). Lost in translation: The reality of federated searching. Australian Academic & Research Libraries, 38(4), 258-269. [10] Mussell, J. and Croft, R. (In press). Discovery layers and the distance student: Online search habits of students. Journal of Library & Information Services in Distance Learning, Retrieved from http://dspace.royalroads.ca/docs/bitstream/handle/10170/471/Mussell_Croft_Discovery_Layers.pdf?sequence=3 [11] Fagan, J. C., Mandernach, M. A, Nelson, C. S., Paulo, J. R., Saunders, G. (2012). Usability test results for a discovery tool in an academic library. Information Technology and Libraries, 31(1), 83-112. [12] Hadro, J. (2009a). 
Summon aims at one-box discovery. Library Journal, 134(3), 17-18. [13] Hadro, J. (2009b). EBSCOhost unveils discovery service. Library Journal, 134(8), 17. [14] Tay, A. (2011, March 25). One search box to rule them all? Web scale discovery tools? [Blog post]. Retrieved from http://musingsaboutlibrarianship.blogspot.com/2011/03/one-search-box-to-rule-them-all-web.html [15] Vaughan, J. (2011). Web scale discovery: What and why?. Library Technology Reports, 47(1), 5-11. [16] Vaughan, J. and Hanken, T. (2011). Evaluating and Implementing web scale discovery services in your library (Part I). Retrieved August 24, 2012, from http://www.scribd.com/doc/59958617/Evaluating-and-Implementing-Web-Scale-Discovery-Services-Part-1 [17] Way, D. (2010). The impact of web-scale discovery on the use of a library collection. Serials Review, 36(4), 214-220. [18] Fryer, D. (2004). Federated search engines. Online, 28(2), 16-19. [19] Stern, D. (2009). Harvesting: Power and opportunities beyond federated search. Online, 33(4), 35-37. [20] Tennant, R. (2003). The right solution: Federated search tools. Library Journal, 128(11), 28-30. [21] Tennant, R. (2001). Cross-database search: One-stop shopping. Library Journal, 126(17), 29-30. [22] Xu, F. (2009). Implementation of a federated search system: Resource accessibility issues. Serials Review, 35(4), 235–241. [23] Helfer, D. S. and Wakimoto, J. C. (2005). Metasearching: The good, the bad, and the ugly of making it work in your library. Searcher, 13(2), 40-41. [24] Tay, A. (2011, June 15 ). Why not web scale discovery tools [Blog post]. Retrieved from http://musingsaboutlibrarianship.blogspot.com/2011/06/why-not-web-scale-discovery-tools.html [25] Primo Central - more data for discovery, better service to end users, less hassle for libraries (2009). Retrieved August 24, 2012, from http://initiatives.exlibrisgroup.com/2009/07/primo-centralmore-data-for-discovery.html. [26] Hawkins, D. (2011). Web scale information discovery: The opportunity, the reality, the future – An NFAIS Symposium. Retrieved August 24, 2012, from http://www.theconferencecircuit.com/2011/10/04/web-scale-information-discovery-the-opportunity-the-reality-the- future-an-nfais-symposium/ 450 Güleda Doğan and Selahattin Cihan Doğan / Procedia - Social and Behavioral Sciences 73 ( 2013 ) 444 – 450 [27] The Summon Service™ goes live: Grand Valley State University is first adoption of Web-scale discovery service. (2009). Retrieved August 24, 2012, from http://www.serialssolutions.com/assets/attach_news/PR-Summon-Live.pdf [28] Medeiros, N. (2009). Researching the research process: Information-seeking behavior, Summon, and Google Books. OCLC Systems & Services, 25(3), 153-155. [29] Lund University launches Serials Solutions® Summon Discovery Service. (2011). Retrieved August 24, 2012, from http://www.serialssolutions.com/news/lund-university-launches-serials-solutions-summon-discovery-service/ [30] E. Gökduman, personal communication, February 10, 2012 [31] Web discovery tools and services in libraries symposium. (2011a). Retrieved August 23, 2012, from http://www.istanbulkutuphaneci.org/federe-arama-servisleri [32] Web discovery tools and services in libraries symposium. (2011b). Retrieved August 23, 2012, from http://www.istanbulkutuphaneci.org/node/914 [33] E. Gökduman, personal communication, July 26, 2012 [34] A. İkinci, personal communication, August 2, 2012 [35] Universities. (2012a). 
Retrieved August 23, 2012, from http://www.yok.gov.tr/content/view/527/222/lang,tr/
[36] SurveyMonkey. http://www.surveymonkey.com
[37] Other institutions of higher education. (2012). Retrieved August 23, 2012, from http://www.yok.gov.tr/en/content/view/750/lang,tr/
[38] Universities. (2012b). Retrieved August 23, 2012, from The Council of Higher Education Website: http://www.yok.gov.tr/en/content/view/527/222/
[39] G. Gürdal, personal communication, August 24, 2012.
[40] G. Gürdal, personal communication, August 28, 2012.
[41] Bilkent University Library Survey 2011. (2012). Retrieved August 23, 2012, from http://library.bilkent.edu.tr/survey_2011.html
[42] http://library.bilkent.edu.tr/Survey/04_Eng.pdf
[43] http://library.bilkent.edu.tr/Survey/soru5Eng.pdf
[44] http://library.bilkent.edu.tr/Survey/soru8Eng.pdf
[45] http://library.bilkent.edu.tr/Survey/05_Eng.pdf
[46] http://library.bilkent.edu.tr/Survey/soru17Eng.pdf
[47] http://library.bilkent.edu.tr/Survey/10_Eng.pdf

work_s7do7yfwura6he7vj2phar5bt4 ---- http://old.arl.org/newsltr/194/identifier.html

Identifiers and Their Role In Networked Information Applications
by Clifford Lynch, Executive Director, Coalition for Networked Information

Editor's Note: When the Association of American Publishers' proposed Digital Object Identifier system reached a level of news that was worthy of a headline in the New York Times (First Business Page, Sept. 22, 1997), ARL turned to CNI for this reality check on the new, high profile role of identifiers.

Identifiers are an enormously powerful tool for communication within and between communities. For example, the International Standard Book Number (ISBN) has played a central role in facilitating business communications between booksellers and publishers; it has also been important to libraries in identifying materials. The International Standard Serial Number (ISSN) plays a pivotal role in facilitating commerce among publishers, libraries, and serials jobbers; it is also vital to libraries in managing their own internal processes, such as serials check-in. Bibliographic utility identifier numbers such as the OCLC or RLIN numbers are used in duplicate detection and consolidation in the construction of online union catalog databases. The traditional bibliographic citation can be viewed as an identifier of sorts, albeit one that is not rigorously defined; it has many variations in style, and data elements based on editorial policies. Yet the ability to cite is central both to the construction of the record of discourse for our civilization and to the development of scholarship; the citation plays an essential role in allowing authors to reference other works, and in permitting readers to locate these works.

The assignment of identifiers to works is a very powerful act; it states that, within a given intellectual framework, two instances of a work that have been assigned the same identifier are the same, while two instances of a work with different identifiers are distinct. The use of identifiers outside of their framework of assignment, though, is often problematic. For example, normal practice assigns a paperback edition of a book one ISBN and the hardcover edition another, so bookstores can distinguish between these versions, which usually vary in price and availability.
But ISBNs are also used sometimes in bibliographic citations; in this situation, when the content and pagination of the hardcover and paperback editions are identical, either will serve equally well for a reader tracking down a citation, and the inclusion of an ISBN as an identifier for the cited work may actually cause problems because it is making an unnecessary distinction (for this purpose) among versions of the same work. A great deal of scholarship involves the development of identifier systems that allow scholars to name things in a way which makes distinctions and recognizes logical equivalence--ways of identifying editions of major authors or composers, variations in coinage having numismatic significance, or the identification of chemicals, proteins, or biological species. Often the rules for assigning identifiers to objects are the subject of ongoing scholarly debate and form a key part of the intellectual framework for a field of study.

Identifiers take on a new significance in the networked environment. To the extent that a computational process can allow a user to move from the occurrence of an identifier to accessing the object being identified, identifiers become actionable. For example, in the World Wide Web links can be constructed between the entries in an article's bibliography and digital versions of the cited works, links that can be traversed with a mouse-click. The significance of making a citation actionable is so great that it has been the subject of several recent lawsuits--for example, the litigation between Microsoft and Ticketmaster about the inclusion of links to Ticketmaster's web pages in Microsoft's web service over Ticketmaster's objections, which remains pending as of this writing. Another interesting case involved a service on the Web called Totalnews, which included citations and offered access to many other services, "framed" by the Totalnews service. The case was recently settled out of court and failed to establish a precedent. If one translates these practices under legal challenge, particularly in the Microsoft v. Ticketmaster case, into analogous practices in the print world, one can view this litigation as questioning whether one author remains free to cite the work of another without permission--which is certainly a well-established practice in print, and a profoundly important right to lose in the networked environment. Of course, this is just one interpretation of the Microsoft v. Ticketmaster case--it is complicated by a number of commercial factors. Yet it helps to illustrate what is at stake in establishing identifier systems, the control of the use of identifiers, and the practices surrounding them.

In the networked information environment, we have recently seen the emergence of a number of important new identifiers, some of which are relatively mature, and others that are still under development. The remainder of this article briefly discusses a number of these identifiers.

URLs and URNs

Uniform Resource Locators (URLs) are a class of identifiers that became popular with the emergence of the World Wide Web. We first saw them on web pages, later in newspaper advertising and on the sides of buses, and then everywhere; currently they serve as the key links between physical artifacts and content on the Web, as well as providing linkage between objects within the Web. URLs have clearly been very effective; yet they are unsatisfactory in one very major way.
They are really not names, in that they don't specify logical content, but, rather, are merely instructions on how to access an object. URLs include a service name (such as "FTP" for file transfer or "HTTP" for the Web's hypertext transfer protocol) and parameters that are passed to the specified service--most typically a host name and a file name on that host, both of which may be ephemeral. From a long-term perspective, the service name is also ephemeral--for example, content may well outlive a specific service (as has already been the case with the GOPHER service). It is important to recognize that URLs were never intended to be long-lasting names for content; they were designed to be flexible, easily implemented, and easily extensible ways to make reference to materials on the Net.

The Internet Engineering Task Force (IETF), which manages standards development for the Internet, realized the limitations of URLs for persistent reference to digital objects several years ago, and as a result began a program to develop a parallel system called Uniform Resource Names (URNs). The IETF URN working group recognized that the URN system must accommodate a multiplicity of naming policies for the assignment of identifiers. Roughly speaking, the syntax of a URN for a digital object is defined as consisting of a naming authority identifier (which is assigned through a central registry) and an object identifier which is assigned by that naming authority to the object in question; the specific content of the identifier may have structure and significance to users familiar with the practices of a given naming authority, but has no predefined meaning within the overall URN framework. Note that the URN syntax does not specify an access service for the object, unlike a URL.

The second key idea in the URN framework is that of resolution services or processes--which may be as complex as new network protocols and infrastructure (analogous to the Domain Name System, for example) or processes as simple as a database lookup--which translate a URN into instructions for accessing the named object. Systems which provide resolution services are called "resolvers"; sometimes the IETF work also refers to "resolution databases" which provide the mapping from names to object locations and access services. URNs are resolved to sets of URLs which provide access to instances of the named digital object. A URN may resolve to more than one URL because there are copies of the digital object that have been replicated at multiple locations such as mirror sites, or because the URN (as defined by the relevant naming authority) specifies the object at a high degree of abstraction, and multiple manifestations of the object (for example, in different formats, such as ASCII, SGML and PDF) are available. There is no explicit requirement that the URN to URL resolution process expose the mapping from an abstract definition of content to a variety of specific manifestations; it is equally legitimate for the choice of format to be made as part of a protocol negotiation in evaluating a URL when using a sophisticated protocol such as the Z39.50 Information Retrieval Protocol which supports such negotiation. As the location and means of access for objects change, the resolver's database is updated; thus, resolving a URN tomorrow may return a different set of URLs.
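The core of this resolution idea, a stable name looked up in a maintainable database that answers with the current set of URLs, can be sketched in a few lines. Everything in the snippet below is hypothetical (the URN namespace, the hosts, the file names), and a production resolver would of course be a network service rather than an in-memory dictionary.

```python
# A toy URN resolver: maps a persistent name to the (changeable) set of
# URLs where instances of the named object currently live. All names and
# locations here are invented for illustration.
RESOLUTION_DB = {
    # one abstract object, available in multiple manifestations
    "urn:example:report-42": [
        "http://host-a.example.org/pubs/report42.sgml",
        "ftp://mirror.example.net/pub/report42.pdf",
    ],
}

def resolve(urn: str) -> list[str]:
    """Return the current URLs for a URN; an empty list if unregistered."""
    return RESOLUTION_DB.get(urn, [])

# Updating the database changes what tomorrow's resolution returns,
# while the URN itself stays stable.
RESOLUTION_DB["urn:example:report-42"].append(
    "http://host-b.example.org/report42.pdf")
print(resolve("urn:example:report-42"))
```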
Today's standard browsers do not yet understand URNs and how to invoke resolvers to convert them to URLs, but hopefully this support will be forthcoming in the not too distant future. One can reasonably view the URN framework as the means by which both existing and new identifier systems will be moved into the networked environment. The URN framework is intended to be sufficiently flexible to subsume virtually all existing bibliographic identifiers (sometimes referred to as "legacy" identifier systems); for example, the IETF working group documented how the ISSN, ISBN, and SICI might be implemented as URNs. The IETF uses the term Uniform Resource Identifiers (URIs) as a generic name to cover both URLs and URNs, along with the still immature concept of Uniform Resource Characteristics (URCs), which can be thought of as structures which allow one or more URNs (perhaps from different naming frameworks) to be related both to sets of URLs and to metadata describing the objects identified by the URNs and URLs. The Coalition for Networked Information is active in the IETF standards work on URIs.

The OCLC Persistent URL (PURL)

As a stopgap measure to address some of the problems with the persistence of URLs, about two years ago OCLC deployed a system called the PURL (Persistent URL). Basically, PURLs are HTTP URLs where the usual hostname has been replaced with the host "PURL.ORG" and the filename is an identifier for the "real" content being referenced. The PURL.ORG host will be maintained for the long term by OCLC under that name; when someone registers an object with this PURL server, they provide the current hostname and filename for the object, and the PURL server creates a database entry linking this hostname and filename to the identifier that will appear in the PURL. When the PURL server is contacted because someone is evaluating a PURL, it looks up the identifier in its database, finds out where the object in question currently resides, and uses the redirect feature of the HTTP protocol to connect the requester to the host housing the object. Content providers are responsible for sending updates to the PURL server when the content file name and/or location changes. PURLs share the idea of indirection--looking up an identifier in a database to find out where the object is currently stored--with URN resolvers as a means of achieving persistence. They are a very clever and practical design, in that they work with the existing installed base of web browsers. However, they are not truly names, since they only permit content to be accessed through a specific service, namely HTTP. PURLs will probably no longer work as new protocols appear that supersede HTTP, and as content migrates to access through such successor protocols.
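Because the PURL mechanism is nothing more than a database lookup followed by a standard HTTP redirect, it can be illustrated with a very small server. The sketch below uses Python's standard library; the registered identifier and target URL are invented, and this is in no way OCLC's actual implementation.

```python
# Minimal PURL-style resolver: an HTTP server that answers each request
# with a 302 redirect to the registered current location. The identifier
# and target URL below are invented for illustration.
from http.server import BaseHTTPRequestHandler, HTTPServer

REGISTRY = {  # updated by content providers as locations change
    "/demo/report42": "http://host-a.example.org/pubs/report42.html",
}

class PurlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = REGISTRY.get(self.path)
        if target:
            self.send_response(302)            # HTTP redirect
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_error(404, "unregistered PURL")

if __name__ == "__main__":
    HTTPServer(("", 8080), PurlHandler).serve_forever()
```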
The SICI Code and Related Developments

The Serial Item and Contribution Identifier (SICI) code was recently revised by a standards committee under the auspices of the National Information Standards Organization (NISO), the ANSI-accredited standards body serving libraries, publishers, and information service providers; it is described in American National Standard Z39.56-1996. The SICI relies in an essential way on the ISSN to identify the serial, and can be used to identify a specific issue of a serial, or a specific contribution within an issue (such as an article, or the table of contents). The SICI code is starting to see wide implementation and is likely to serve a central role in a number of applications: it can be used not only to identify articles, but also to link citations from article bibliographies or abstracting and indexing databases to articles in electronic form. It is an important part of the infrastructure that supports ARL's NAILDD program to streamline interlibrary loan and document delivery. One of the great strengths of the SICI is that it can be determined directly from an issue of a journal (or an article within the issue), assuming only that the ISSN for the journal can be somehow determined. As such, it represents an open standard for creating linkages to articles or other serial components.

Also under NISO auspices, work has just begun on a new identifier with the working name of Book Item and Contribution Identifier (BICI). The BICI can be used to identify specific volumes within a multivolume work, or components such as chapters within a book. There are still a number of unresolved issues surrounding the exact scope of this standardization effort, both in terms of the range of works that it applies to (for example, sound recordings as well as books) and the level of granularity of the identifier (for instance, whether it can identify a specific illustration or table within a work, something the SICI is not currently designed to do). Both ARL and CNI are heavily involved in the SICI and BICI work; Julia Blixrud of ARL chairs the BICI committee, of which Clifford Lynch from CNI is a member, as are representatives of several other ARL and CNI member organizations. ARL is an institutional member of NISO.
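To make the point about deriving a SICI from the piece itself concrete, the sketch below assembles the leading portions of a SICI-style string from data visible on a journal issue: the ISSN, a chronology in parentheses, an enumeration, and an item segment in angle brackets. It is deliberately incomplete; Z39.56-1996 also defines control, version, and check-character segments that are not computed here, and all sample values are invented.

```python
# Assemble the leading segments of a SICI-like string from citation data.
# Illustrative only: the control, version, and check-character segments
# that Z39.56-1996 additionally defines are omitted.
def sici_prefix(issn, chronology, volume, issue, location=""):
    enumeration = f"{volume}:{issue}"
    item = f"<{location}>" if location else ""
    return f"{issn}({chronology}){enumeration}{item}"

# Hypothetical article: January 1997 issue, vol. 58 no. 1, contribution
# beginning on page 28 of a journal with (invented) ISSN 1234-5679.
print(sici_prefix("1234-5679", "199701", "58", "1", "28"))
# -> 1234-5679(199701)58:1<28>
```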
The Digital Object Identifier (DOI)

In the past few months, the Association of American Publishers (AAP) and their technical contractor, the Corporation for National Research Initiatives (CNRI), have issued a great deal of publicity about a new identifier called, rather grandly, the Digital Object Identifier (DOI). The DOI is based on CNRI's Handle System(TM), a very general identifier system that fits roughly within the URN framework, and that provides a mechanism for implementing naming systems for arbitrary digital objects. Thus far, the DOI has been demonstrated within the context of online consumer acquisition of intellectual property, and perhaps for this reason it is somewhat difficult to disentangle the proposed DOI standard, the demonstration implementation of the DOI, and the applications enabled by it. Major demonstrations of the DOI system are scheduled for the Frankfurt Book Fair in October 1997.

There are a number of misconceptions surrounding various aspects of the DOI. Its development does not mean that everything on the Web will become pay-per-view; rather, the DOI provides a method for collecting revenue for access to material that is described by a DOI (either on a one-time license or pay-per-view basis), if the organization that owns the rights to the object wishes to do this. Some objects described by DOIs may be accessible without charge. DOIs in and of themselves are only identifiers, and do not imply that any sort of copyright enforcement mechanisms (like an "envelope" or other secure container) will be bundled with the objects that they describe; the presence or absence of such copyright enforcement technologies is an entirely separate issue. These copyright enforcement technologies can be used with objects described by all sorts of identifiers, not just DOIs.

I believe there are some legitimate concerns about the use of DOIs as a means of implementing actionable citations among works on the Web, since this is likely to mean that the author of the citing work will need to obtain the DOI of the work that he or she wishes to cite either from the owner of the cited work or from some third party, and accessing a citation would then involve interaction with the DOI resolution service, raising privacy and control issues. But the notion that the use of DOIs will make the networked environment "safe" for proprietary intellectual property in a way that it is not today is as improbable as the idea that the introduction of DOIs, as one type of commonly used URN, will somehow convert the entire Web into a pay-per-view environment.

Discussions with the DOI developers suggest that the DOI's role will be as an identifier of content that is available for acquisition; there is currently some ambiguity as to whether it actually identifies content directly or if it simply identifies a method of acquiring content (such as an order screen). It is also extremely unclear under what circumstances similar objects are assigned distinct DOIs. Current plans seem to be to carefully control what organizations are permitted to assign DOIs, limiting the groups to "legitimate" publishers; thus, a DOI is hoped to offer some "brand name" confidence to consumers purchasing content on the Net. DOIs will be assigned to content as it is made available for acquisition, and perhaps removed from the DOI database as content is withdrawn from availability for acquisition. It is important to recognize that there does not seem to be consensus on most of these issues at present within the DOI developer community, which underscores the uncertainties about the potential roles and utility of the DOI outside of its use as a means for consumers to acquire content.

In general, one cannot determine the DOI assigned to a digital object, or even whether the object has a DOI, unless the object carries it as a label. However, this can be confusing, because some publishers use, for those digital objects which are within the scope of the SICI, the SICI code as their (publisher-assigned) identifier. The implications of this practice will require careful examination and analysis. It is also unclear what role the DOI can usefully play in identifying material outside of acquisitions--for example, for material that is already licensed and is part of a library's collection, where it would be desirable to resolve "bibliographic" links to this material, but where it is inappropriate to connect library patrons to the acquisitions apparatus defined by the DOI. It appears that DOIs can be implemented within the IETF URN framework, though there are a few messy details having to do with character coding; to the best of my knowledge no documentation has yet been developed which spells out these details.

Recently, representatives of the DOI developer community have asked CNI to work with them to help to increase understanding of the DOI's objectives and roles, particularly as they relate to library services, and to help to suggest ways in which the DOI might be made more useful to the broader bibliographic community. NISO has also been active in trying to relate the publishing community work on DOIs to the broader needs of the full NISO constituency, and held a workshop in June 1997 to begin developing requirements for general purpose bibliographic identifiers in the networked environment.
The DOI as it currently seems to be evolving is likely to be a useful tool to permit consumers to acquire content from publishers on the Net with some confidence about who they are doing business with. My present concerns with it relate to the lack of clarity surrounding many aspects of this identifier, the very broad applicability implied by the name DOI, which doesn't seem to be consistent with its actual definition (something like Publisher Object Access Identifier might be more accurately descriptive), and the very real potential dangers that are raised if this identifier is pressed into broader uses, such as a means of implementing navigable citations in digital documents. In a very real sense, there are no bad identifiers, but it is very possible to put identifiers to bad or inappropriate uses.

Conclusions

Many new identifier systems are appearing; some have been developed specifically for the networked information environment, while others are long-standing identifiers that are being brought forward into the digital context. When evaluating a new identifier system, there are a number of essential questions to ask:

1. What is the scope of the identifier system--what kinds of objects can be identified with it? Who is permitted to assign identifiers, and how are these organizations identified, registered, and validated?

2. What are the rules for assigning new identifiers: when are two instances of a work the same (that is, assigned the same identifier) within the system, and under what criteria are they considered distinct (that is, assigned different identifiers)? What communities benefit from distinctions that are implied by the assignment of identifiers?

3. How does one determine the identifier for a work: can one derive it from the work itself, or does one need to consult some possibly proprietary database maintained by a third party? To what class of objects are the identifiers applicable? Within this class of objects, is there an automatic method of constructing identifiers under the identifier system, or does someone have to make a specific decision to assign an identifier to an object? If so, who makes this decision, and why? Note that, if the identifier cannot be derived from the identified work, it is unsuitable for use as a primary identifier within any system of open citation. The act of reference should not rely upon proprietary databases or services.

4. How is the identifier resolved--that is, how does one go from the identifier to the identified work, or to other identifiers or metadata that permit instances of the work to be located and accessed? Again, what is the role of possibly proprietary third-party databases in resolving the identifier? Do the operator or operators of these resolution services have monopoly control over resolution? What are the barriers to entry for new resolution services? What are the policies of the resolution services in areas such as user privacy and statistics gathering?

5. How persistent is the identifier across time? Can one still resolve it after the work ceases to be commercially marketed? Identifiers that rely on the state of the commercial marketplace are very treacherous for constructing citations or other references that can serve the long-term social or scholarly record.
All of the new identifiers are likely to be useful to some community, for some purpose, but it will be essential to determine what roles each new identifier is suitable for, and to avoid using various types of identifiers in roles that are inappropriate. The URN framework being established by the IETF invites all communities who are coming to rely on networked information to carefully consider what they need from identifier systems, and whether those needs are best served by defining new identifier systems.

Resources on Identifiers

URLs are defined in Internet RFC 1738. Functional requirements for URNs are defined in Internet RFC 1737, and the syntax details are defined in RFC 2141. There are also a number of experimental resolver systems that are currently being deployed on a prototype basis on the Internet (see, for example, RFC 2168). A number of internet drafts are also currently moving towards RFC status (see under "draft-ietf-urn" in internet drafts) that cover areas such as resolver system requirements and the use of bibliographic identifiers as URNs. See http://ietf.org.

OCLC Persistent Uniform Resource Locator: http://www.purl.org
National Information Standards Organization: http://www.niso.org
Digital Object Identifier System: http://www.doi.org

Copyright (c) 1997 by Clifford Lynch. The author grants blanket permission to reprint this article for educational use as long as the author and source are acknowledged. For commercial use, a reprint request should be sent to clifford@cni.org

Source: ARL: A Bimonthly Newsletter of Research Library Issues and Actions 194 (October 1997). Washington, DC: Association of Research Libraries.

work_s7f6rlilbzcwdgxdszz4bd2dmu ----

The Romani Language: Cataloging Ramifications for a Language in the Process of Standardization

GEOFF HUSIC

Preprint. Final edited version is published in: Slavic and East European Information Resources, Volume 12, Issue 1 (January 2011), pages 37-51. DOI: 10.1080/15228886.2011.556076

Address correspondence to: Geoff Husic, MA, MS, Slavic and Special Languages Librarian, University of Kansas Libraries, 1425 Jayhawk Blvd, Room 519, Lawrence, KS 66045-7544. E-mail: husic@ku.edu

Abstract

A discussion of issues related to the cataloging of a language, Romani (or Romany), which is only in the 21st century beginning to achieve some degree of standardization. The discussion focuses on issues of Romani orthography, specifically a small number of unusual Unicode characters that may cause technical problems in certain automated cataloging environments, such as OCLC WorldCat, the OCLC cataloging client Connexion, and online library catalogs.

Keywords: Romani (Romany) language, Cataloging, MARC, Unicode, Numeric Character References

Introduction

The discussion of cataloging foreign language materials has been recently invigorated by a number of developments. The ability to add matching non-latin[1] vernacular fields in bibliographic records in online library catalogs for a variety of languages (e.g. Japanese, Arabic, Chinese, Korean, Persian, Hebrew, Yiddish, Greek, and some Cyrillic languages[2]), i.e.
the so-called JACKPHY languages, has somewhat ameliorated the problem of the end-user's lack of acquaintance with the romanization schemes used in library catalogs and other databases, as he now has, in some cases, the option of searching either in transliteration or in the native script. In addition, especially among the languages of the former Soviet Union, there have been changes in the orthographies of several languages that have abandoned Cyrillic in favor of romanized scripts (e.g. Moldavian, Azeri, and most recently Uzbek), which have certain ramifications for cataloging and retrieval in library catalogs. In some cases the change is unproblematic; for example, when abandoning Cyrillic the Moldovans, whose language is basically identical to Romanian, just chose to return to writing in standard Romanian, while preserving the token Moldovan language designation. In the case of Azeri and Uzbek, complications arise, in that the new romanized alphabets diverge somewhat from the ALA transliteration schemes for the Cyrillic that have been employed heretofore in library catalogs and many databases. In some cases this has resulted in the need to address the Library of Congress guidelines for supplying uniform titles in cases of changes in orthography that affect alphanumeric filing and retrieval.[3] The Slavic and East European Section (SEER) of the Association of Research Libraries (ARL) has recently convened a taskforce to discuss these and other remaining problematic issues such as mixed orthographies, the reemergence of pre-Revolutionary orthography in Russian publications, and problems with various recensions of Church Slavic. This information will be used to update the widely consulted Slavic Cataloging Manual.[4]

In this article I would like to discuss a related but somewhat distinct phenomenon, which deals with the ramifications for cataloging and retrieval not of changes in an established language, but rather those of a language which is only now beginning to achieve a degree of standardization, namely Romani (commonly also spelled "Romany", and now sometimes Rromani). Romani, spoken by approximately 4 million Roma[5] (for a variety of reasons an exact count is impossible) in Europe, Asia Minor, and the Americas, is considered to be a northern Indic language. The great bulk of Romani vocabulary is of Indic origin, and its grammar preserves some archaic Indic elements that have been lost in the other modern Indic languages. In a yet to be published article, Marcel Courthiade, a preeminent scholar of the Romani language at the Institut National des Langues et Civilisations Orientales in Paris (INALCO), convincingly traces the original migration of the Roma out of the Indian city of Kannauj, in the state of Uttar Pradesh, sometime after 1018, as a result of raids on the city by the Persian ruler Mahmud of Ghazni (971-1030).[6] After a lengthy peregrination through Asia Minor, the Roma first appeared in Europe in the early 1300s, a portion of the population establishing a fairly sedentary life in some areas, while others remained mainly itinerant, travelling freely throughout several countries. On its way to Europe, the Romani language was greatly influenced by other Indo-European languages spoken along the path, such as Persian, Armenian, and Greek.
Because of the large populations of Roma in the Balkans, Romani has also developed sufficient features shared by the other Balkan languages (Serbian/Croatian/Bosnian, Bulgarian, Macedonian, Albanian, Greek, Romanian, and Vlax) to be considered part of the Balkansprachbund.[7] As is the case with all languages of Europe, Romani has been present in Europe long enough now to develop a number of definable dialects, which differ among themselves in points of grammar, vocabulary, and pronunciation, often rather substantially.[8]

Until recently, orthographic variations used to write Romani have been utterly chaotic, and the most common pattern was to use the orthographic system of the host country, resulting in Romani being written in a variety of non-latin scripts (Cyrillic, Greek, and Arabic) as well as a multitude of latin scripts, based on, e.g., Czech, Croatian, Hungarian, Albanian, German, Spanish, and English, often with no internal consistency, even in writings by the same author. A survey of Romani on Internet websites, in chat rooms, and on Facebook shows that this situation still obtains to this very day.

This discussion is intended to be somewhat forward looking. This author probably has much more contact with Romani materials than librarians in most other research libraries. Because of my close working relationship with INALCO, the University of Kansas Libraries receive most of the Romani-language publications (books, journals, and multi-media) published by this institute and others collected by the institute. Despite the lack of any authoritative cataloging documentation on how to deal with this language, as cataloger as well as bibliographer, I have had to develop my own strategies for processing Romani-language materials. We must however acknowledge that there has not traditionally been a large body of printed literature in the Romani language, which is perhaps why the problems associated with Romani in the scholarly and library context have not been raised until this point. Until most recently, educated Roma generally have deferred to the languages of their host countries for scholarly and educational purposes, whereas among the uneducated and illiterate, oral literature and song have been the usual methods of transmitting their history and culture. The environment has changed dramatically in the last 20 years, as a cadre of Romani intellectuals, historians, and linguists have perceived an urgent need to try to achieve some kind of standardization of the European Romani variants, in order to promote education by creating culturally-inclusive instructional materials in Romani and to foster and preserve their rich culture.

Warsaw Agreement

In 1990, the 4th World Romani Congress, an umbrella cultural organization that represents Roma in 25 countries, convened in Warsaw, Poland in order to accomplish two goals: to validate Romani as a European language on par with other European languages, and to attempt to achieve a degree of standardization, making it a viable vehicle for primary education and publishing.[9] A team of linguists forged a new standard alphabet, quite different from any that had ever been used before. There was quite a bit of debate and disagreement between the various team members. Some wished to adhere to something closer to the common latin-script usage in those countries with the largest Romani populations, i.e. the Balkan countries.
Others saw an opportunity to showcase the unique features of the Romani language, and they viewed a truly distinctive alphabet (see Appendix for the complete alphabet) as a means to achieve this end. The chosen alphabet exhibits a careful attempt to avoid adhering too closely to the orthographic systems of the countries in which the majority of Roma reside, and is a rather unfortunate product from my perspective, especially from the point of view of cataloging and data retrieval. This is not purely a value judgment on my part, for practical reasons I will explain further below.

In order to expound on some of the special complications that Romani presents, I must first delve into a bit of linguistic minutiae that explain how this new alphabet became endowed with some rather exotic features that will be problematic. Romani is an Indic language, descended from Middle Indic languages that had already lost the elaborate case and verbal systems we see in the older Indic languages such as Sanskrit and Pali. It therefore shares certain syntactic features with other modern Indic languages such as Hindi, but has subsequently undergone certain innovations since the Roma left the Indian subcontinent. One salient feature concerns what has been traditionally viewed as a case system in Romani. This syntactic feature can indeed be interpreted quite differently when comparing Romani to its sister modern Indic languages. In Hindi, for example, case relationships, such as possession, indirect object, and location, are formed by placing a postposition after the modified noun, which, in turn, appears in the oblique form (which may or may not change from the direct form); e.g. the direct form of the Hindi word for 'boy' is larka, and larke ki ankh means 'boy's eye', and 'boys' eyes' will be larkoŋ ke ankhe, with the noun changing to the oblique form with the placement of the possessive postposition ki. We see a very similar reflex in Romani. The Romani word for boy is raklo,[10] and 'the boy's eye' will be rakleski yakh, and 'the boys' eyes' will be raklǎnge yakha.[11] When comparing the Romani structure with Hindi, we can clearly see that what may look like a case ending can, in fact, be interpreted as a postposition.[12] What differs in Romani from Hindi is that the phonetic nature of the postpositions in Romani changes based on the consonant to the immediate left. If that consonant is voiced, then so will be the consonant of the postposition, and conversely, if it is unvoiced then so will be the consonant in the postposition. This can be seen in the example above. The singular oblique stem of raklo is rakles-, while the plural oblique stem is raklen-. The phonetic form of the possessive postposition can thus be either /-ke/ or /-ge/ depending on whether the preceding consonant is unvoiced or voiced. The same reflex occurs with the postpositions -te 'to' and -tar 'from', which phonetically become /-de/ and /-dar/ after voiced consonants. Some Romani linguists considered this syntactic feature so distinctive and significant that it should be conspicuously accounted for in any standardized orthography chosen for Romani. They therefore came up with the following solution: rather than write the consonant of the postposition with two different consonants, in each instance they would choose a rather obscure grapheme to represent them.
The postpositions added to the oblique stem would thus be written as below:

  Pronounced    Written as    Meaning
  -ke/-ge       -qe           'of'
  -te/-de       -θe           'to'
  -tar/-dar     -θar          'from'
  -sa/-tsa*     -ça           'with'

  *The sound /ts/ is written in Romani with the letter c.

Thus, the above examples would now be written as raklesqi yakh and raklǎnqe yakha.

In order to try to further distance Romani from the graphical representation of the neighboring Slavic languages, the diacritic hachek, i.e. caron (ˇ), was eschewed in favor of the acute accent, thus ć, ś, and ź for the sounds ch (as in Charley), sh (as in ship), and s (as in pleasure). Practically speaking, this is a rather modest modification. However, somewhat more contentious was the discussion of how to represent the frequent Romani consonant corresponding to the j in judge. This particular sound (and closely related sounds) is represented by a wide variety of graphemes in the languages of Europe, e.g.: j, g (English); đ, dž, dj (Bosnian/Croatian/Serbian); rz, ż (Polish);[13] c (Turkish); gy (Hungarian); and gj, xh (Albanian). Previous to the 1990 Warsaw agreement, treatment of this sound varied widely, as it still does in the non-standard Romani orthographies, being variously represented as ž, ƶ, ź, as the number 3, and even a very quirky square box minus the left side and an acute accent above.[14] While it would have been logically consistent to choose ź to represent this sound, the committee, rather inexplicably, decided to select the obscure International Phonetic Association character 'ezh' ʒ. The ezh, alongside the theta that appears in the Romani postpositions, will be problematic from the perspective of cataloging.

Cataloging Ramifications

The problem of languages that have undergone changes in the official orthography, e.g. Russian after the Soviet Revolution, is well known to catalogers in research libraries. However, the Roma live in a number of countries and therefore are not able to claim any governmentally sanctioned "official orthography." Thus, there is some uncertainty whether the AACR2 stipulation about changes in official orthography is applicable, or even desirable, in the cataloging of Romani materials. Nevertheless, as more and more materials are being published in Romani, many, but not all, adhering to the new standardized orthography, it is important to consider having a systematic approach to cataloging materials in the Romani language.

AACR2 cataloging records consist of two main components: the descriptive elements, transcribed from the item being cataloged, and additional controlled metadata elements assigned by the cataloger, such as names and subject headings, which are subject to authority control. In the context of Romani, the latter are generally unproblematic, as authors' names are usually established based on the orthographies of the countries in which they reside, and therefore do not normally contain the problematic characters ezh and theta.[15] The descriptive elements of the bibliographic record, on the other hand, present very little flexibility. AACR2 clearly stipulates that the descriptive elements should ideally be transcribed exactly as they appear in the item being cataloged, although with further caveats. Until recently, for languages in non-latin scripts the only option in bibliographic utilities, such as OCLC, was a romanized transliteration scheme, which, at least for North American libraries, was based on the ALA norm.[16]
While we are now able to add matching vernacular fields for many scripts in OCLC WorldCat, this is not the case for all valid Unicode characters.[17] In addition, AACR2 stipulates some rather elaborate rules about how to deal with special characters such as ezh and theta, which seem inappropriate to apply in the case of a language for which these characters are part of the established orthography.[18] We are therefore left with a dilemma when it comes to Romani.

The Problematic Characters

In the context of 1990, when the new alphabet was adopted, computer automation and the Internet were not as developed as they are today, and therefore the Romani linguists did not see the introduction of new exotic letters as unduly problematic, as typewriters could be ordered with specific keys as needed. However, in the current computer environment these characters present a particular problem. OCLC WorldCat is still limited to the smaller MARC-8 Unicode subset and limits what characters can be entered in certain contexts. While the vast majority of letters of the new Romani alphabet are MARC-8 compatible, the ezh is a character that is still unrecognized in that character set, and so we must replace this character with some alternative.[19]

The introduction of the Greek character theta (θ) also presents some distinct problems for cataloging. This character is disallowed in OCLC cataloging in otherwise latin-script fields and is generally replaced with the bracketed name of the letter, i.e. [theta], or a phonetic equivalent. If we do attempt to input this character in OCLC Connexion (other than in a matching vernacular Greek field), it will force Connexion to assume that the record being input is a non-latin script record and will automatically code the resulting record as such. This code is system supplied and cannot be removed. Connexion will also attempt to add matching fields for romanization, as we see in OCLC WorldCat records for Cyrillic, Arabic, etc., which cannot be left blank and must be filled with text of some sort in order to validate and add the record to OCLC WorldCat.

Alongside these problematic consonants, the new orthography contains a number of vowel graphemes: ǎ, ě, ǐ, ǒ, ǔ. These so-called pre-jotizing vowels are pronounced with a faint "y" sound after the consonant preceding them. From the descriptive aspect of cataloging these characters are not unduly problematic and are compatible in OCLC WorldCat. The most important thing to bear in mind is that these diacritics are the caron, and not the breve, which is more commonly seen above vowels in European languages.

In addressing the descriptive aspect of the bibliographic record, how then do we deal with the theta and ezh? Computer alphabetical sorting and keyword searching is normally good at regularizing variants of roman-alphabet letters occurring with various diacritics, so that, e.g., z, ź, ż, ž, etc. will all be retrieved when searching the stripped z. The theta and ezh, however, are not characters for which such a correlation is normally encoded. As far as the descriptive elements of Romani bibliographic records that contain these problematic elements are concerned, we must turn to practical solutions. There seem to me to be two acceptable solutions for dealing with the ezh, and, for the sake of consistency, I believe it would be prudent to treat the theta as well as the ezh in the same fashion when cataloging in OCLC WorldCat. The approaches I will suggest below, while admittedly not extremely aesthetic, will nevertheless assure that important graphical information will be retained until catalogs and bibliographic utilities are compatible with all Unicode characters. For the time being, it is likely that these steps will be mainly necessary in the descriptive areas of the bibliographic record and will mostly affect the transcription of the titles.

Option 1:

Except for the theta and ezh, all other Romani letters have ALA character equivalents and can be transcribed as they are. The ezh has two acceptable variants according to the Warsaw Agreement, ƶ and з (the Cyrillic /z/). However, neither of these is a valid ALA character, and therefore cannot be used in OCLC WorldCat to transcribe the title proper (MARC 245 field). Although not commonly seen in OCLC cataloging, there is a provision in the Library of Congress Rule Interpretation documentation for replacing special characters that cannot be used in the MARC record. According to this list, the ezh, which can also appear in some African languages, should be transcribed as z̳ (z with double underscore) and theta as t̳ (t with double underscore). Using this substitution for ezh, Kote ʒàna e Kosoviaqe Rroma would be transcribed as: Kote z̳àna e Kosoviaqe Rroma. Since titles are generally not unwieldy, I believe it would also be helpful to provide a note, such as "z̳ in the word z̳ana appears as the IPA symbol ezh." Such notes will be helpful to users unfamiliar with the vagaries of transcribing unusual characters in OCLC cataloging. The use of t̳ for theta seems a little less satisfactory, since this letter has two phonetic outcomes, /t/ and /d/, depending on the preceding consonant. Nevertheless, it makes sense to consider employing this substitute character. In the case of theta there will also be cases where added title entries will be helpful. Regrettably, the treatment of theta will require a bit of knowledge of Romani phonology as revealed in the discussion of postpositions above. However, with this knowledge in hand, when constructing added title entries (MARC 246 field), I would recommend, in cases where the theta is indeed pronounced as the voiced d, that the theta be transcribed according to its phonetic value, i.e. d or t depending on whether the preceding consonant is voiced or unvoiced respectively; e.g. Sar me vastesθar xutǐlav tu would be transcribed as Sar me vastestar xutǐlav tu, while But lenθar ʒivdinèna would be transcribed as But lendar z̳ivdinèna.
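Since this voicing rule is mechanical, the derivation of such phonetic added-entry forms can be sketched in a few lines of code. The function below is an illustration only: it handles θ, q, and ç by the t/d, k/g, and s/c patterns described in this article, and its set of voiced consonants is a deliberate simplification rather than a complete statement of Romani phonology.

```python
# Derive a phonetic form (for a MARC 246 added title entry) from the
# standardized orthography: postposition letters surface voiced after a
# voiced consonant, unvoiced otherwise. Simplified illustration only.
VOICED = set("bdgjlmnrvz")  # deliberately simplified voiced-consonant set

def phonetic_form(word):
    out = []
    for i, ch in enumerate(word):
        prev = word[i - 1].lower() if i else ""
        if ch == "θ":
            out.append("d" if prev in VOICED else "t")
        elif ch == "q":
            out.append("g" if prev in VOICED else "k")
        elif ch == "ç":
            # s/c pattern (the Romani letter c spells /ts/); see Appendix
            out.append("c" if prev in VOICED else "s")
        else:
            out.append(ch)
    return "".join(out)

print(phonetic_form("vastesθar"))  # -> vastestar
print(phonetic_form("lenθar"))     # -> lendar
print(phonetic_form("raklǎnqe"))   # -> raklǎnge
```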
The approaches I will suggest below, while admittedly not extremely aesthetic, will nevertheless assure that important graphical information will be retained until catalogs and bibliographic utilities are compatible with all Unicode characters. For the time being, it is likely that these steps will be mainly necessary in the descriptive areas of the bibliographic record and will mostly affect the transcription of the titles. 8 Option 1: Except for the theta and ezh all other Romani letters have ALA character equivalents and can be transcribed as they are. The ezh has two acceptable variants according to the Warsaw Agreement, ƶ and з (the Cyrillic /z/). However, neither of these is a valid ALA character, and therefore cannot be used in OCLC WorldCat to transcribe the title proper (MARC 245 field). Although not commonly seen in OCLC cataloging, there is a provision in the Library of Congress Rule Interpretation documentation for replacing special characters that cannot be used in the MARC record. According to this list, the ezh, which can also appear in some African languages, should be transcribed as z ̳(z with double underscore) and theta as t ̳(t with double underscore). Using this substitution for ezh, Kote ʒàna e Kosoviaqe Rroma would be transcribed as: Kote zà̳na e Kosoviaqe Rroma. Since titles are generally not unwieldy, I believe it would also be helpful to provide a note, such as “z ̳ in the word z ̳ana appears as the IPA symbol ezh.” Such notes will be helpful to users unfamiliar with the vagaries of transcribing unusual characters in OCLC cataloging. The use of t ̳ for theta seems a little less satisfactory, since this letter has two phonetic outcomes, /t/ and /d/ depending on the preceding consonant. Nevertheless, it makes sense to consider employing this substitute character. In the case of theta there will also be cases where added title entries will be helpful. Regrettably, the treatment of theta will require a bit of knowledge of Romani phonology as revealed in the discussion of postpositions above. However, with this knowledge in hand, when constructing added title entries (MARC 246 field), I would recommend, in cases where the theta is indeed pronounced as the voiced d, that the theta be transcribed according to its phonetic value, i.e. d or t depending on whether the preceding consonant is voiced or unvoiced respectively, e.g. Sar me vastesθar xutǐlav tu would be transcribed as Sar me vastestar xutǐlav tu, while But lenθar ʒivdinèna would be transcribed as But lendar zi̳vdinèna. Option 2: The second option involves the use of a so-called “lossless” solution. Such a proposal for 9 dealing with problematic characters in MARC cataloging records already exists in the MARC documentation. 20 Although accepted in 2006, its use with characters that fall outside the MARC-8 character repertoire has not be widely used in OCLC cataloging, the reasons for which might become evident from the examples below. In short, this lossless method recommends replacing a Unicode character that cannot be mapped to MARC-8 with a placeholder that contains the Numeric Character Reference (NCR) for the character. The NCR consists of the hexidecimal representation of the code point of the character (four ASCII characters), preceded by #x and all surrounded by & and ; Using this method yields the following: The lower case theta can be substituted with θ and the capital letter as Θ The lower case ezh will be ʒ and the upper case Ʒ There are advantages and disadvantages to each of these options. 
There are advantages and disadvantages to each of these options. The first option will allow for searching and alphanumeric filing based on a more natural representation of what a speaker of Romani, or one researching Romani, might expect, in the case of the ezh because z̳ will regularize out with the many forms of the letter z frequently used to represent this sound in the various common manifestations of the Romani alphabet. However, it is not ideal in this respect, as digraphs, such as dž, dj, and xh, are also frequently used to represent this sound in Romani texts. In regards to the theta, this approach is somewhat better suited, as it represents a form used by the majority of Romani writers, many of whom still shun the use of the theta. On the other hand, the second option will allow for eventual automatic conversion of the characters back to the proper graphic representation when these Unicode characters become available in the MARC format. Some newer web interfaces are already able to render these characters correctly when they are manually input in the local catalog database, even if it is not yet possible to input these non-MARC-8 characters in OCLC WorldCat directly.[21] Nevertheless, there are obvious drawbacks to this approach as well. First, since these NCRs are most likely to occur conspicuously in the title of the bibliographic record, any user, other than those familiar with cataloging limitations and hexadecimal coding of Unicode characters, will likely assume that a coding error has occurred, resulting in garbled characters.[22] Second, this method will profoundly affect keyword searching and alphanumeric sorting of main titles. Finally, even if the characters are rendered correctly in the catalog and browser, another unfortunate result is that the catalog user may not have the ability to input the correct characters for the purpose of searching.

If the second option is chosen, as in the case of the first option, it will still be desirable to provide some alternate title headings to the bibliographic record, based on a reasonable assumption of what the user might expect. This is clearly not a trivial issue. As in the case of cataloging all foreign language materials, the cataloger must have an adequate working knowledge of the language being cataloged in order to be able to discern which added entries are most appropriate and useful for the end user. In the Appendix I discuss my personal preference for treating these problematic characters, but I feel both approaches are equally legitimate.

Authority Control Implications

Turning now to the portion of the bibliographic record subject to authority control, i.e. the assignment of controlled vocabulary and standardized names, we are left with a further dilemma.[23] Many Romani writers are either unaware of the standardized alphabet adopted by the 1990 World Romani Congress or find the system so bizarre that they simply choose to disregard it, opting for the orthographic system of their home country or an idiosyncratic one of their own design. However, as a cataloger of these materials, I must make a decision, one way or another, how to approach this issue. As a linguist, I recognize that Romani is very different from most other European languages in that it is represented by no nation state, therefore has no governmentally-sanctioned official orthography, and is represented by a wide variety of dialects.
However, as a result of my experience as a cataloger, I recognize the importance of collocation in the context of the library catalog and strive to find a means to make Romani materials as accessible as possible. This raises the question of whether it is desirable to accept the 1990 Romani Congress decision as authoritative enough to apply the AACR2 guidelines concerning change in orthography to Romani. I would argue that this is indeed desirable, at least in certain cases where possible, and where there is a desire to bring together related manifestations of a specific work.[24]

One problem, until very recently, has been the lack of any authoritative reference source to resolve questions of orthography for Romani. Such reference sources are frequently consulted by foreign-language catalogers, for example, when cataloging Arabic, to ensure a consistent transcription of the short vowels, which are normally not written. In 2009, a dictionary, of which I was coeditor of the English content, was published, which may be very helpful in resolving problems. This UNESCO-sponsored dictionary is the first international dictionary of the proposed standardized form of Romani, and it employs the new orthography discussed above.[25]

While it is difficult at this stage of Romani publication to point to many examples to support this argument, I can present at least one real-life example where assigning uniform titles based on the current form of the orthography can bring together related works. The Romani scholar Rajko Đurić has published many books on the topic of the Roma, several in Romani. His earlier Romani publications were written in an orthography that mirrored Serbian latin-script orthography, whereas he later chose to adopt the 1990 Romani Congress orthography. Đurić published two related books treating the history of the Roma, the first, in 1988, under the title Bibahtale breša [Unhappy years]. Then, in 1996, he published a companion volume, using the new orthography, under the title Bibaxtale berśa.[26] Clearly these works will regularize differently in a title browse and a keyword search. In cases such as this, it seems that it would be prudent to provide uniform titles (MARC 130 or 240 fields) based on the new orthography on the bibliographic record for the title in the old orthography, in order to provide access via the newly established orthography. An authoritative source, such as the dictionary mentioned above, can facilitate determining the appropriate form.
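Schematically, the record for the 1988 volume might then carry a pairing along the following lines (a hypothetical fragment; indicator values and any additional subfields would follow local practice):

```
100 1  $a Đurić, Rajko
240 10 $a Bibaxtale berśa
245 10 $a Bibahtale breša
```

The 240 supplies the collocating form in the new orthography, while the 245 transcribes the title as it appears on the piece.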
Other than examples such as the above, where there is a desire to bring together related manifestations of a work, I fear that it may be impractical to try to routinely construct uniform titles for Romani works, for a number of practical reasons. First, the new orthography is only a recommended standard, with no legal weight. While it is being used by several Romani organizations and scholars in Europe, others may either totally disregard it, or use it only partially (especially avoiding the q, ç, ʒ, and θ). While above I have concentrated on the problem of the consonants, similar problems exist with the new Romani vowel characters. To give just an example for ǎ and ǒ, these have been written as ja, ia, ya, я and jo, io, yo, ё respectively in various kinds of orthographies. While books being printed by publishers that specialize in Romani materials may be more likely to adhere to the new orthography, there will continue to be a great variety in how Romani is written in books and journals, on the Internet, etc. The burden will therefore be on the cataloger to attempt to construct uniform titles for a language that is generally little known outside the Romani community and the circle of Balkanologists.

Conclusion

Since Romani is a non-territorial and transnational language, it is very difficult to envision a time in the near future when even the European variants of Romani will achieve the degree of standardization of other European languages. Nevertheless, considering that there are now UN-affiliated and other cultural groups that are attempting to promote the new standard orthography and are introducing it more and more in printed publications and Internet resources, it should be approached by the cataloger with the same degree of attention that we pay to other complicated issues of orthography. The goal of our diligence is always to keep the potential end user in mind, and to anticipate how in the future automated algorithms may be able to bring together related works, as envisioned in FRBR models.[27] Even if we conclude that certain issues of authority control, such as assigning uniform titles, are too complicated or ambiguous to be applied systematically, we are nevertheless left with the problematic characters that presently prevent their exact transcription in some cataloging environments. It is my hope that this discussion will help other catalogers who may be presented with Romani materials to understand the complicated cataloging issues involving this fascinating language. Please feel free to contact me at husic@ku.edu if you should need any assistance in cataloging Romani materials.

Appendix: The Unified Romani Alphabet (Based on the 1990 Warsaw Agreement) and Summary of Suggested Cataloging Approaches

Most Romani letters are pronounced very similarly to Croatian. The exceptions are: ç (as s or ts), q (as k or g), and θ (as t or d); x (as the Russian х); kh, ph, and th (aspirated k, p, and t); rr (uvular r); and ʒ (as the j in jam). The letters ć, ś, ź are very close to the Croatian č, š, and ž in most dialects.

Unproblematic characters:

The following letters (in Romani order) present no special problems to the cataloger in any cataloging environment. The only possible point of confusion is whether rr (pronounced as a uvular sound, such as the r in French Paris), when initial, is written as Rr or RR; this is of little concern, although Rr is preferred.

a, b, c, ç, d, e, f, g, h, x, i, j, k, kh,[28] l, m, n, o, p, ph, r, rr, s, t, th, u, v, z.

Vowels with the acute (representing unpredictable stress): à, è, ì, ò, ù

Possibly problematic characters:

The pre-jotizing vowels, i.e. vowels with a caron (hachek) written above. For bibliographic utilities such as OCLC Connexion these characters no longer present a problem. Each vowel can be input with the necessary caron above. Library catalogs may display them correctly or strip out the diacritics, depending on the system, but searching and filing will not usually be affected. Nevertheless, it behooves the cataloger to input these letters with the proper caron, rather than to accidentally substitute the diacritic breve.

ǎ, ě, ǐ, ǒ, ǔ

The Latin characters with the acute diacritic:

The characters representing these letters are available in OCLC Connexion and should search and display correctly in any Unicode-compliant library catalog.
The characters may be downgraded when imported into some bibliographic utilities, such as EndNote, to their stripped forms c, s, z, or may be replaced with blank or filler characters, depending on how character importing is set. However, this particular problem is beyond the scope of this article.

ć, ś, ź

The postpositions q and ç:

In the examples with the ezh and theta, I have recommended making title added entries in certain cases where problematic characters occur in a title proper. I would also recommend constructing such added entries in the cases of the characters q and ç, used in the postpositions, based on their phonetic outcome, i.e. k/g and s/c. Again, while requiring a bit of knowledge of Romani phonology, these added entries will aid a user by providing keyword searching based on a more natural form.

Problematic characters and their Numeric Character Reference codes:

These characters may present considerable problems for the reasons discussed above.

Theta (θ) and ezh (ʒ)

Although visually unaesthetic, my preferred recommendation for transcribing these characters in OCLC is to use the lossless method discussed above, replacing these characters with their appropriate Numeric Character References, i.e. θ (&#x03B8;) and Θ (&#x0398;), ʒ (&#x0292;) and Ʒ (&#x01B7;). I am hopeful that in the near future we will have all Unicode character sets available to us in MARC cataloging, and therefore this problematic situation should not last indefinitely. Until then, some catalogs can already display these characters correctly when encoded in this way. If this solution is locally considered to be unacceptable, then the use of t̳ and z̳ (t and z with double underscore) is a good pragmatic alternative. At some future point these replacement characters could be searched and replaced in the database with the proper Unicode characters.

One further note: if attempting to search OCLC WorldCat for examples of records for items in Romani, you will discover that a large number of records have the MARC fixed-field language code for Romanian (rum) miscoded as rom, which is in fact the language code for Romani. This will result in many false hits when attempting to filter by language.

Romani Samples on the WWW

Below I have listed a few examples of "in the wild" Internet sites that use the new orthography or a permutation of it, which may interest the reader.

A website of news from Kosova, using the new orthography in totality: http://rroma.courriers.info/spip.php?article394

A website devoted to Romani rights. Some articles are in the new orthography, while others are in a variety of the idiosyncratic variants: http://romarights.wordpress.com/category/nevipenewsvijesti/nevipe/page/3/

Website of the European Roma Rights Center. Except for the notation of the aspirated consonants, i.e. kh, ph, and th, the orthography is the same as Croatian: http://www.errc.org/cikk.php?cikk=2255

NOTES

1. In cataloging literature and documentation, the terms latin and roman are often used interchangeably when speaking of scripts. However, the process of transliterating a non-latin script into latin script is almost always called "romanization."

2. The complete character sets for certain Central Asian and other Cyrillic languages are not yet available in OCLC.
One further note: if attempting to search OCLC WorldCat for examples of records for items in Romani, you will discover that a large number of records for Romanian-language materials (properly coded rum) have the MARC fixed-field language code miscoded as rom, which is in fact the language code for Romani. This will result in many false hits when attempting to filter by language.

Romani Samples on the WWW

Below I have listed a few examples of "in the wild" Internet sites that use the new orthography, or a permutation of it, that may interest the reader.

A website of news from Kosova, using the new orthography in totality: http://rroma.courriers.info/spip.php?article394

A website devoted to Romani rights. Some articles are in the new orthography, while others are in a variety of the idiosyncratic variants: http://romarights.wordpress.com/category/nevipenewsvijesti/nevipe/page/3/

Website of the European Roma Rights Center. Except for the notation of the aspirated consonants, i.e. kh, ph, and th, the orthography is the same as Croatian: http://www.errc.org/cikk.php?cikk=2255

NOTES

1. In cataloging literature and documentation, the terms latin and roman are often used interchangeably when speaking of scripts. However, the process of transliterating a non-latin script into latin script is almost always called "romanization."

2. The complete character sets for certain Central Asian and other Cyrillic languages are not yet available in OCLC. Further information on available character sets can be found at: http://www.oclc.org/support/documentation/worldcat/records/subscription/4/4.pdf

3. Library of Congress Rule Interpretation based on Anglo-American Cataloguing Rules, 2nd ed., 2002 revision (Chicago: American Library Association, 2002), via Cataloger's Desktop (http://desktop.loc.gov), section 25.3A: "For monographs, on the bibliographic record for any edition of a work whose title proper contains a word in the old orthography provide a uniform title reflecting the new orthography, although no edition with the reformed orthography has been received."

4. Slavic Cataloging Manual (http://www.indiana.edu/~libslav/slavcatman/smtocs.html)

5. The term "Gypsy" is considered pejorative and is no longer used in scholarly writing. In English both Roma and Romanies are used for the plural.

6. This information has been included with the kind permission of Prof. Courthiade.

7. A linguistic concept that examines the syntactic, morphological, and lexicological commonalities that have arisen in the territorially contiguous, but linguistically quite different, Indo-European language groups spoken in the Balkans.

8. The Romani dialects are broadly grouped into Vlax, Balkan, Carpathian, and Sinti.

9. Courthiade, Marcel, ed. Morri angluni rromane ćhibǎqi evroputni lavustik. Budapest: Fővárosi Önkormányzat Cigány Ház--Romano Kher, 2009, pp. 496-497.

10. Both raklo and yakh are cognate with the Hindi larka and ankh.

11. I am intentionally not using the new standardized alphabet in these examples, as it would obscure my argument.

12. Victor Friedman, "Case in Romani: old grammar, new affixes," Journal of the Gypsy Lore Society, 5th ser., vol. 1, no. 2 (1991), pp. 85-102.

13. In fact, in standard Bosnian/Croatian/Serbian and Polish there are two distinct "zh" phonemes, one more palatalized than the other. In Romani there is generally just one such phoneme, although there is great variety in the actual pronunciation of the phoneme.

14. Courthiade, Marcel. Gramatika e gjuhës rrome. Tirana: [s.n.], 1989, p. 19.

15. Since theta only exists in the postposition, it of course cannot occur in the non-oblique form of a name.

16. http://www.loc.gov/catdir/cpso/roman.html

17. As of this writing, OCLC WorldCat is limited to the MARC-8 character set, plus four additional scripts (Thai, Bengali, Devanagari, and Tamil), which use the UTF-8 character set.

18. Anglo-American Cataloguing Rules, 2nd ed., 2002 revision (Chicago: American Library Association, 2002), via Cataloger's Desktop (http://desktop.loc.gov), section 1.0E: "Language and script of the description."

19. In order not to become overly technical in this article and to concentrate on the current cataloging implications, I have chosen not to go into specific details about the variety of MARC formats and the ALA, MARC-8, and Unicode character sets, as they are out of scope.

20. http://www.loc.gov/marc/marbi/2006/2006-09.html

21. This is the case, e.g., with the Voyager catalog.

22. An example can be seen in OCLC WorldCat record #233230871 (an example with Mongolian and Kazakh in Cyrillic script).

23. An excellent explanation of authority control in the context of bibliographic records can be found at: http://en.wikipedia.org/wiki/Authority_control
24. The new requirements for how related manifestations are to be handled will be clearer when RDA, the cataloging code that is to replace AACR2, is published in 2010, but this is not likely to greatly affect the suggestions made in this article.

25. Courthiade, Marcel, ed. Morri angluni rromane ćhibǎqi evroputni lavustik. Budapest: Fővárosi Önkormányzat Cigány Ház--Romano Kher, 2009. This dictionary has not yet been widely circulated, but questions concerning its availability can be directed to Mr Zsigo Jenö, Roma Parlament, 1084 Budapest, Tavaszmezö ut 6 (Roma.parlament@chello.hu), or to Mr Lakatos Laszlo (lakatoslaszlo@frokk.hu).

26. Admittedly this is an imperfect example, as these two items, although they share the same main title, are not variant editions, but different books covering different time periods. The discrepancy between breša and berśa is not a typo. It is a dialect variation for this word.

27. FRBR (Functional Requirements for Bibliographic Records) is an entity-relationship model for displaying related works in a catalog in a more holistic manner.

28. The h after another consonant indicates aspiration. These represent separate phonemes and are thus also considered separate digraph letters.

----

Concerning the Revision of Classification Systems

Ho-chin Chen
Associate Professor, Department of Educational Media & Library Sciences, Tamkang University, Taipei, Taiwan, R.O.C.
Journal of Educational Media & Library Sciences 36:3 (March 1999)

Abstract

For reviewing a classification scheme, we usually look in much detail at its traditional features, such as detailed schedules, hospitable notation, a supportive index and its adaptability. Yet another desirable feature, a good and financially secure revision programme, is a key point in the success of a classification scheme. The author traces the history of the revision processes of three successful classification systems (the Dewey Decimal Classification, the Library of Congress Classification and the Universal Decimal Classification), and attempts to recommend a better revision mechanism for Lai's New Classification Scheme for Chinese Libraries (中國圖書分類法) in Taiwan.

Keywords: Revision; Library classification; Dewey Decimal Classification; Library of Congress Classification; Universal Decimal Classification

1. Introduction

The end of the nineteenth century and the early twentieth century have been called the golden age in the historical development of Western library classification, because many schemes of great renown were born within that period, including the Dewey Decimal Classification (hereafter DDC, first published in 1876), the Expansive Classification (published 1891-1893), the Library of Congress Classification (hereafter LCC, publication begun in 1902), the Universal Decimal Classification (hereafter UDC, publication begun in 1905), the Subject Classification (published 1906), the Colon Classification (published 1933), and the Bibliographic Classification (published 1935).
Of these many schemes, however, few remain vigorous today; the DDC, LCC and UDC are generally acknowledged as the outstanding, highly successful representatives. Many factors contribute to a classification scheme's success, but in the author's view continuous revision and maintenance is the most important one. Chinese library classification had its golden age as well: in the early years of the Republic, as Western learning spread eastward, new schemes appeared in profusion, no fewer in number than those produced abroad. Decades later, however, the scheme used by the great majority of libraries in Taiwan is the New Classification Scheme for Chinese Libraries as enlarged and revised by Mr. Lai Yung-hsiang (hereafter the "Chinese Scheme"). That the scheme has won the favor of the library community has its reasons, but for many years its users' hopes for a new edition have gone unmet, and its unsatisfactory maintenance has been a sore point for the user community. This paper examines the literature on the revision processes of the three most widely used schemes (DDC, LCC and UDC) in search of the best model for revising a classification, in the hope of providing a reference for the agencies concerned.

2. DDC (Note 1)

The Dewey Decimal Classification is the most widely used classification scheme in the world today, adopted by more than 40,000 libraries in some 135 countries and translated into over 30 languages. The latest full edition, the 21st, appeared in 1996; the latest abridged edition, the 13th, appeared in 1997.

(1) Revision of the scheme

1. Editions 1 through 14 (1876-1942)

In 1876 a small pamphlet was published under the title A Classification and Subject Index for Cataloguing and Arranging the Books and Pamphlets of a Library; this was the scheme that later became world-famous as the Dewey Decimal Classification. Its creator, Melvil Dewey, explained that he had developed the system in 1873 for the library of his alma mater and employer, Amherst College.

For more than a hundred years, from the first edition to the present, the principal means of revising the DDC has been the regular publication of new editions. The second edition appeared in 1885 under the new title Decimal Classification and Relative Index. In its introduction Dewey announced the policy he would adopt for revision: the integrity of numbers. That is, although classes might be expanded, a class number would never be reused with a different meaning, and existing numbers would not be replaced by new notation. Dewey wanted classifiers to rest easy: the numbers were "settled," and any change that did occur would be a necessary one, made only after years of careful deliberation. This shows how well Dewey understood the toil of classifiers, their dislike of change, and their preference for the convenience of the status quo. Over the following 57 years the DDC went through twelve further editions, the 3rd to the 14th, all from Dewey's own hand, with W.R. Biscoe, May Seymour and Dorcas Fellows assisting and supervising. These editions all followed the pattern of the second edition, so changes to the schedules were never large. Thanks to Dewey's gentle approach, the scheme won the goodwill of many users of the time.

2. Editions 15 through 20 (1951-1989)

The 15th edition of 1951, the so-called "standard edition," retitled Dewey Decimal Classification and Relative Index, was edited by Milton J. Ferguson. Responding to users' long-standing hopes that the DDC would be modernized, standardized and better balanced, Ferguson and his collaborators slimmed the scheme drastically, from the 31,364 entries and 1,927 pages of the 14th edition to 4,688 entries and 716 pages. Because some had complained that Dewey's unbroken promise never to change numbers had made the schedules less and less usable, Ferguson introduced a thousand relocations. In addition, because the index had drawn complaints, Dewey's son Godfrey Dewey re-edited it, and it was published separately in 1952. Users, however, did not appreciate the aggressive measures of the 15th edition; among the harsher critics were some who pronounced the DDC's death to be imminent.

Chastened by this lesson, the 16th edition of 1958, prepared under editor Ben Custer, returned to the traditional pattern of publication, undoing most of the 15th edition's relocations and restoring to the schedules the detailed classes that had been cut. The first completely new schedule (Phoenix schedule) was introduced in DDC 16, covering inorganic chemistry (546) and organic chemistry (547). Custer's steady approach restored the confidence of the classifiers who had stayed with the DDC.

DDC 17 appeared in 1965, again with Custer as editor, but its index was poorly received, and Forest Press hurried out a revised index as a stopgap. The edition introduced Phoenix schedules for psychology (150) and parapsychology (130), and the outside response was unexpectedly cool; apart from Table 2 for geographic areas, which was considered an improvement on its predecessor, little was praised. On the whole, many library science scholars admired the 17th edition, but practicing classifiers did not much like it.

Forest Press published the 18th edition in 1971. Its two Phoenix schedules, law (340) and mathematics (510), were both regarded as technically outstanding work, and the auxiliary tables, now expanded to seven, were considered easier to use than those of the 17th edition. A 1974 survey of DDC use in the United States and Canada identified four major strengths of the 18th edition: the new introduction, the index, the manual, and the workshops held to accompany it (Note 2).

DDC 19 appeared in 1979. (1) Its Phoenix schedule for the political process (324) was one of the few to escape negative criticism, but the other, for sociology (301-307), met strong opposition as soon as it was published; only after Forest Press issued a more complete revision of the sociology schedule (301-307) in 1982 did the criticism subside. Around the time DDC 19 was published, it was suggested that publishing individual Phoenix schedules separately was the more practical approach, while smaller additions and changes could appear, at longer intervals, in the serial Dewey Decimal Classification Additions, Notes and Decisions (DC&). DC& had begun as Notes and Decisions on the Application of the Decimal Classification, issued by the Decimal Classification Office between 1934 and 1955, and took its later title after the appearance of DDC 16. Although originally planned to appear twice a year, it in fact became an irregular publication, and only from volume 5 was it issued regularly once a year. Besides the sociology schedule (301-307), separately published Phoenix schedules included music (780) in 1980 and data processing and computer science (004-006) in 1985. (2) A new editor, John Comaromi, took office in 1980. (3) Among the major changes facing the DDC in the 1980s was the development, with Inforonics, Inc., of an online Editorial Support System, which enabled the editors to produce a machine-readable classification on tape; the absence of a standard format for classification data and other constraints, however, delayed the tape's release. In 1988 Forest Press's parent body, the Lake Placid Education Foundation, sold the rights to the Dewey classification to OCLC (Online Computer Library Center) for US$3.8 million, and from that point the fortunes of the scheme began to change.

DDC 20 (1989). (1) The 20th edition appeared in 1989. Its relative index dropped see references, making access more direct; the general view was that, although the new index was still not ideal, it was at least far more usable than those of the preceding editions. DDC 20 contained only one Phoenix schedule (music, 780), but it did not bring the expected results.
Some considered this Phoenix schedule unnecessary and felt it failed to solve the difficulties of the old schedule; some even worried about its consequences: that so drastic a change would drive librarians to the Library of Congress Classification (LCC) instead. Even so, DDC 20 had several merits worth noting: the rewritten introduction to using the classification was well received; the manual was now included with the schedules; the high-quality manual made excellent teaching material for classification and cataloging courses in library schools; the reference structure was strengthened; and the scheme offered various means of teaching users how to use the new edition.

(2) In January 1993 OCLC released Electronic Dewey, version 1, followed in March of the following year by Electronic Dewey, version 2. Electronic Dewey was issued on a single CD-ROM containing the whole of DDC 20 together with the additions and changes to the schedules issued before 1994; the search software was on the disc, accompanied by documentation. To use Electronic Dewey, a library needed a CD-ROM drive and an IBM personal computer. Compared with the printed schedules, its chief features were browsing and Boolean search capabilities, links between LCSH and DDC numbers, keyword indexing, sample bibliographic records, guidance in using the facets of the schedules, and an electronic notepad. The 1994 release of Electronic Dewey could add segmentation marks after DDC numbers, and its documentation included the full text of the introduction to DDC 20. Users could search by keyword, by phrase, by DDC number and by Library of Congress subject heading, and such features of automated indexing as Boolean operators, truncation and search history were all present. The Electronic Dewey database offered nine indexes: a basic index covering headings, notes, the Relative Index and DDC numbers, together with keyword and phrase indexes to the headings, a numerical index of DDC numbers, keyword and phrase indexes to the Relative Index, a keyword index to the notes, and a keyword index to the Library of Congress subject headings matched with DDC numbers. In short, through the appropriate indexes, phrases and keywords, and through the hierarchical displays of the schedules, a user could arrive at a good class number.

3. DDC 21 (1996)

Seven years after the 20th edition, in 1996, OCLC Forest Press brought out the 21st. The 21st edition exists in two forms: a print version in four volumes, and an electronic version for Windows, called Dewey for Windows, on CD-ROM. The electronic Dewey provides many new search capabilities that make the scheme easier and more effective to apply. For the user's convenience the new edition adds many index terms; the Electronic Index is far richer than in the old version, covering entries from the manual, the schedules and the index. The new manual adds many notes and includes tables that let users compare the application of related class numbers. The vocabulary of the notes has also been much simplified, consolidating the old example, contains and including notes into "including" notes. Multiple term headings covering several facets can now be extended by standard subdivision, and new guides have been added to direct users.

In addition, the editors drew on actual bibliographic data to guide classification decisions, using records from LC's online catalog and from the OCLC online union catalog. For revision purposes the OCLC database reflects broader and more international use than LC's own cataloging, since it includes records contributed by users abroad, for example in Britain, Australia, Canada and France. To help users further, a second, revised edition of Dewey Decimal Classification: A Practical Guide was published at the same time as the 21st edition, and 1997 saw the publication of the 13th abridged edition and its accompanying workbook. To keep the scheme current, OCLC Forest Press publishes Dewey Decimal Classification, Additions, Notes and Decisions (DC&) every year.

The editorial changes in the new edition show the strong influence of modern classification theory. Changes in content include regularization, which makes the scheme easier to apply, and greater use of facet indicators and synthesized notation, since faceting and number synthesis can improve precision in retrieval. With the changes introduced in each edition, from whatever source they come, the editors aim to extend the effectiveness of subject access. To strengthen the DDC's capabilities further, OCLC's researchers have carried out a range of related studies; OCLC's recent Dewey 2000 plan, for instance, is intended to make the DDC the most powerful tool for knowledge organization in the global network environment.

(2) The organizations behind the revision

The success of the DDC owes much to Dewey's own unstinting efforts. He established the Lake Placid Company and then, in 1922, the Lake Placid Club Education Foundation, channeling the profits of the Lake Placid Club to the Foundation, which in turn funded Forest Press to publish the DDC and sustain the scheme's development. Forest Press operated under the Foundation's supervision but was a semi-independent body with its own board of directors.

1. The editor

In 1901 the Catalog Section of the American Library Association pressed for DDC numbers to be printed on LC's catalog cards, and from then on Dewey spared no effort to bring this about. In 1923 the Dewey editorial office moved to the Library of Congress, where the Decimal Classification Office was established; editorial policy for the scheme came under the control of the Joint Lake Placid Club Education Foundation-American Library Association Editorial Policy Committee.
In 1930 LC began supplying DDC numbers on its printed cards; this was the year before Dewey's death, so he lived to see his wish fulfilled. In 1954 Forest Press signed an agreement with LC under which Forest Press funded LC's editing of DDC 16. In 1957 the Dewey editorial office and the Decimal Classification Office were merged, with the Editor responsible both for editing the classification and for supplying DDC numbers on LC cards; the merged unit later became a formal unit within LC, the Decimal Classification Division of LC. In 1987 the editorship and the division chief's post were separated into two positions, and the editor now devotes full attention to the editorial work of the DDC.

2. The publisher

In 1988 Forest Press became part of OCLC Online Computer Library Center. OCLC's professional management supplied the skills and resources that allowed Forest Press to carry out its plans for expansion. Under the agreement between OCLC Forest Press and the Library of Congress, the editing of the DDC schedules remains the responsibility of the Decimal Classification Division of LC; the Division's functions are unchanged, while publication of both the printed and the electronic versions is carried out by OCLC Forest Press.

3. The user representatives

Even with OCLC's databases behind it, user representation remains an indispensable part of the DDC's development, and the Decimal Classification Editorial Policy Committee (EPC) is the users' representative body. The EPC was founded in 1937 as the Committee on the Decimal Classification, an advisory body offering counsel on decisions about DDC policy and direction. In 1955 it was reorganized as a joint committee of the Lake Placid Club Education Foundation and the American Library Association. Although the sale of Forest Press to OCLC in 1988 ended the committee's connection with the Foundation, the EPC continues to play its advisory role under OCLC Forest Press. The EPC now has ten members drawn from ten international member organizations, including the American Library Association, the Library of Congress, OCLC Forest Press (publisher of the DDC) and the Library Association (United Kingdom); the relevant ALA unit is its Subject Analysis Committee, the relevant (British) Library Association unit is its Dewey Decimal Classification committee, and the relevant Library of Congress unit is the Decimal Classification Division. There are also Canadian and Australian representatives, and the remaining members come from public, special and academic libraries and from library schools. The DDC editor brings prepared schedule revisions and expansions to the meetings of the user representatives, who evaluate them and recommend further action. Besides the American representatives, those of Britain, Canada and Australia have contributed substantially to the revision of the schedules: the Phoenix schedule for music (780), for example, first appeared in Britain, and the three countries' representatives also did much for the revision of Table 2 for geographic areas. Every year LC's Decimal Classification Division assigns DDC numbers to more than 110,000 titles, and with the MARC records bearing DDC numbers contributed by Britain, Australia, New Zealand, Canada and the other DDC-using countries, the annual accumulation is considerable. With its iron triangle of publisher (OCLC Forest Press), editor (the Decimal Classification Division of LC) and user representatives (the Decimal Classification Editorial Policy Committee, EPC), the DDC moves toward the year 2000 striving to become the most powerful information-organizing tool in the world.

3. LCC (Note 3)

(1) A brief history

The Library of Congress (hereafter LC) was founded in 1800, and for most of the nineteenth century it arranged its collections by the classification designed by President Thomas Jefferson. In 1897 LC moved into its new building; its collections by then exceeded half a million volumes, and the Jeffersonian scheme no longer sufficed. Two classifications were then in use in American libraries, Dewey's DDC and C.A. Cutter's Expansive Classification (EC), but LC decided to develop its own scheme, named the Library of Congress Classification. J.C.M. Hanson and Charles Martel chose the EC as the scheme's principal guide, and under their direction the schedule for each class was developed separately by specialists in the relevant discipline, each schedule being published on its own. Unlike other general classifications, the LCC is thus not the work of a single master but a kind of collective creation, and some regard it as a series of special classifications. Class Z was the first schedule to be developed, and by 1948 all the schedules except law (Class K) had been completed and published. Within law, Class KF (United States law), published in 1969, was the first law schedule to appear, and the development of Class K continues to this day.

(2) Revision

Revision and expansion of the LCC proceed continuously. Revision and development of the classification is one of the responsibilities of LC's Cataloging Policy and Support Office (CPSO), with LC catalogers assisting. Proposals for revisions, additions and changes usually come from LC's subject catalogers. An editorial committee of representatives from CPSO and the other cataloging units meets regularly each week to review the proposals put forward by the subject catalogers, and once a proposal is approved the new number takes effect immediately. Additions and changes first appear in an LC quarterly, LC Classification - Additions and Changes. There is no fixed timetable for publishing revised editions of the individual schedules; each proceeds as its needs require, so that some schedules have reached an eighth edition (Class Q, for example) while others are still in their first (Class K, for example). New editions of the schedules take essentially four forms: schedules newly completed and never before published are new schedules; schedules that are out of stock, with no new edition planned in the near term, are reprinted by LC with a supplement of cumulated additions and changes, and these are reprint editions.

NOTES

Note 1. "Dewey Decimal Classification," in A.C. Foskett, The Subject Approach to Information, 5th ed. (London: Library Association, 1996), pp. 256-280; J.S. Mitchell, "DDC 21 and Beyond: The Dewey Decimal Classification prepares for the future," Cataloging and Classification Quarterly 21:2 (1995): 37-47; R. Trotter, "Electronic Dewey: The CD-ROM version of the Dewey Decimal Classification," Cataloging and Classification Quarterly 19:3/4 (1995): 213-234; the DDC website, http://www.oclc.org/oclc/fp/.

Note 2. J.P. Comaromi, Mary Ellen Michael, & Janet Bloom, A Survey of the Use of the Dewey Decimal Classification in the United States and Canada (Albany, NY: Forest Press, 1975).

Note 3. L.M. Chan, Cataloging and Classification: An Introduction, 2nd ed. (New York, N.Y.: McGraw-Hill, 1994), Chapter 13.

Note 4. On the revision and development of the UDC, the principal sources consulted are the following:
A.C. Foskett, The Subject Approach to Information, 5th ed. (London: Library Association, 1996), Chapter 18, "The Universal Decimal Classification," pp. 281-294; I.C. McIlwaine, "The Universal Decimal Classification: Some factors concerning its origins, development, and influence," Journal of the American Society for Information Science (Apr. 1997): 331-344; P. David Strachan & Frits M.H. Oomes, "Universal Decimal Classification Update," Cataloging & Classification Quarterly 19:3/4 (1995): 119-131; The UDC: Essays for a New Decade, edited by Alan Gilchrist and David Strachan (London: Aslib, c1990).

----

Connecting Systems for Better Services Around Special Collections

D-Lib Magazine, September/October 2014, Volume 20, Number 9/10

Saskia van Bergen
Leiden University Library, the Netherlands
w.van.bergen@library.leidenuniv.nl
doi:10.1045/september2014-vanbergen

Abstract

Over the last few years, several projects to improve physical and digital access to special collections have been undertaken by Leiden University Libraries in the Netherlands. These heritage collections include manuscripts, printed books, archives, maps, atlases, prints, drawings and photographs, from the Western and non-Western worlds. They are of both national and international importance. The projects were undertaken to meet two key requirements: providing better and faster service for customers when using the collections, and creating a more efficient workflow for the library staff. Their interdependencies, with regard to creating new formats for the description of graphic materials and providing digital access, led to a merger of the projects with a combined set of goals for conversion, cataloging and digitization-on-demand. This article describes the infrastructure behind these projects, and the impact of the projects on users and staff to date.

1. Introduction

Founded in 1575 by William I, Prince of Orange, Leiden University owns a large number of heritage collections of national and international importance. These special collections include manuscripts, printed books, archives, maps, atlases, prints, drawings and photographs both from the Western and non-Western world.1 Recently, the university library acquired the library collections of the Royal Tropical Institute in Amsterdam and the Royal Netherlands Institute of Southeast Asian and Caribbean Studies in Leiden. Because of this international orientation, researchers and students from all over the world come to Leiden to visit the library.

The special collections are part of our national and international heritage. They are important for reconstructing the development and dissemination of science in the Dutch Republic. For instance, how did 17th century researchers like Christiaan Huygens and Antonie van Leeuwenhoek obtain their knowledge? The aim of the library is to facilitate the use of our special collections in research and education by students, teachers, and researchers, but also by culturally interested people in the general public. To reach these objectives, the library continuously invests in digital and in physical services, such as the facilities in the reading room and (virtual) exhibitions.
The foundation of the Scaliger Institute, a research center that aims to stimulate and support the use of the special collections by means of lectures, symposia, master classes and the provision of scholarships, also serves to further the library's objectives.

The special collections are increasingly made available through the library's catalogue. Since the late 1960s, Dutch libraries have described their library materials in a union catalogue, which is currently hosted by OCLC.2 Apart from this, Leiden makes use of three Ex Libris products: Aleph for our library services, DigiTool for our digitized special collections, and Primo for discovery and delivery (see Figure 1). The OCLC-GGC union catalogue is used for bibliographic information. It facilitates interlibrary loans within the Netherlands, and it is also used to ensure worldwide availability via WorldCat. Metadata records from the national OCLC-GGC database are fetched by the local Aleph database, in which descriptive information about the individual copies is added. Materials can only be lent out or made available in the reading room when the shelfmark is included in the metadata. Primo, finally, is used for the discovery of materials in both Aleph and DigiTool. But as you can see in Figure 1, these last two databases were treated as separate silos. This means that the information was not synchronized, causing inconsistencies in Primo.

Figure 1: Graphic overview of the main systems used by Leiden University Library for the cataloguing of the special collections. Old situation.

Another problem we were dealing with was that the union catalogue was originally designed for books, periodicals and other textual sources. As described above, our library owns many non-textual collections, and the union catalogue did not contain the right formats to describe these. The result was that the curators of our library started looking for their own personal solutions. They were cataloguing in their own Access or Excel databases, without using a metadata standard. As a consequence, the metadata were locked in these databases. DigiTool was 'misused' for cataloguing purposes as well, both for digitized and non-digitized materials. Because DigiTool and Aleph are not connected, no services could be delivered for these collections in Primo. Visitors could see the metadata, but viewing the images online or placing a request for the physical items was impossible.

2. Project Goals

The main aim of the project we started with OCLC was to create new formats for the description of graphic materials, such as prints, drawings and photographs, and to convert all of our special collections metadata to the standard used in the union catalogue. We had already started two other projects concerning digital access to our special collections, but as it soon became clear that these had many interdependencies, it was decided to merge them into a program focusing on three main goals:

- Converting all special collections metadata to OCLC's union catalogue, which would thereafter be used for cataloguing all of our special collections (including archival materials, prints, photographs and objects).
- Making all of our special collections available through our library catalogue. This means that clients can place view requests 24/7 from anywhere in the world and no longer need to come to the library just to fill in a paper call slip (and wait until the materials are available in the reading room). This is especially an advantage for our many foreign researchers. When researchers can select and request materials in advance, they can plan their travels much more efficiently.
- Creating a new service for digitization-on-demand of our special collections, built on the library catalogue. A visitor survey held earlier had revealed that clients considered the existing application procedure for reproductions too slow and the costs of the scans too high. By integrating the new service into Primo, the administrative process could be simplified, both for the clients and for the staff.

3. Project Stages

Four main stages were identified within the project. The first step concerned the transfer of the metadata to the union catalogue, carried out in close collaboration with OCLC. To accomplish this, we had to take several factors into account. First, we had to make separate records for the physical and digital objects, each with its own set of metadata. This approach was recommended by the consortium of Dutch university libraries and the National Library (UKB). Although referring to the same object, digital and physical records describe different material types, each with its own services. Selections in WorldCat, for example, are based on material types: if you search for e-only, these materials can only be selected when they are catalogued separately as a digital record. To be able to make this distinction, we had to create different procedures for each of the following situations (a sketch of the resulting decision logic follows this list):

- Materials that had been catalogued in DigiTool, but had not been digitized, and for which the system had thus been misused. For these materials only one new record was created in the union catalogue, representing the physical object. The records in DigiTool were deleted after conversion.
- Materials that were catalogued in the union catalogue, but had also been digitized and therefore had a record in DigiTool as well. In this case, only one extra record was made, for the digital object, with a link to the scans in DigiTool.
- Digitized materials that were only catalogued in DigiTool. For these materials two records were created, one for the physical and one for the digital object.
- Scans of books and manuscripts in DigiTool that were not scanned completely into a digital facsimile, and for which only one or a few specimen scans were made. For these records, a link to these scans was added to the record for the physical object.
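These rules amount to a small decision table. Below is a minimal sketch in Python, assuming three flags per object; the flag names and action strings are ours, not part of any OCLC or Ex Libris tool:

```python
# Sketch of the record-creation rules for the conversion (illustrative only).
# in_union:      a record for the physical object already exists in the union catalogue
# digitized:     scans of the object exist in DigiTool
# complete_scan: the object was digitized in full, not just specimen scans

def records_to_create(in_union: bool, digitized: bool, complete_scan: bool) -> list[str]:
    if not digitized:
        # DigiTool was misused for cataloguing only: one new physical record;
        # the DigiTool record is deleted after conversion.
        return ["physical record in union catalogue", "delete DigiTool record"]
    if in_union:
        # Already catalogued and digitized: add only a digital record
        # linking to the scans in DigiTool.
        return ["digital record linking to DigiTool scans"]
    if complete_scan:
        # Catalogued only in DigiTool: one physical and one digital record.
        return ["physical record", "digital record"]
    # Specimen scans only: a physical record with a link to the sample scans.
    return ["physical record with link to specimen scans"]

assert records_to_create(True, True, True) == ["digital record linking to DigiTool scans"]
```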
A second concern was that the union and Aleph catalogues describe editions, or, according to the FRBR terminology, manifestations, whereas DigiTool describes copies or items.3 Fortunately, this distinction didn't affect our procedures that much, because our records in most cases describe unique materials. This means that, in practice, edition and copy are identical. The printed books were already catalogued in the union catalogue, so this didn't cause any conversion problems either. Strictly speaking, the print collections do not contain unique materials. However, since the metadata do not contain any information about state or edition, it was decided that these would be considered as unique materials nonetheless.4

The second stage of the project concentrated on the development of a digitization-on-demand service. All of our scans can already be viewed for free in DigiTool in high JP2 resolution. The service had to be set up for people who preferred the original TIFF scans or a PDF, or who wanted to order reproductions of non-digitized materials.

The first thing we did was to add order buttons to all special collections records in the catalogue and to connect them with an order form (Figure 2). Clients can now order scans of all our materials, catalogued or uncatalogued, digitized or non-digitized. They can place their orders regardless of time and location and pay by credit card, PayPal or bank transfer. Scans are delivered to the client after payment.

Previously, ordered TIFF scans and PDF files were sent by email, WeSendit, or other external services. When a large order was placed, the scans were put on a DVD and shipped, which was not very practical considering the fact that our clients are from all over the world. Part of the project was therefore to implement an FTP server, which could be used to deliver the ordered scans quickly and safely. At present, clients receive a link by email, which they can use to download the scans for one month, as often as they want and on various devices.

Figure 2: All special collections records in the catalogue are provided with an order button.

To organize the digitization-on-demand workflow, we use the open-source software application Goobi. This software allows you to model, manage and supervise all production processes involved in creating a digital library. These include importing data from library catalogues, scanning and content-based indexing, and the digital presentation and delivery of results in standardized formats. The software was developed by a consortium of German libraries and commercial companies. It is used in libraries and archives in Germany, England, Spain and Austria, and we are the first library to implement the software in the Netherlands.5

Scan requests placed in Primo are sent to Goobi automatically and connected instantaneously to the bibliographic metadata imported from the catalogue (see Figure 3). The software is also used to make METS files, add structural metadata, and deliver scans to the client. Scans of completely digitized materials are imported into DigiTool as well.

Figure 3: The Goobi software is connected to various other applications: scan requests placed in Primo are sent to Goobi automatically (1), together with client information taken from the order form (2). Goobi imports the bibliographic data from the catalogue (3), scans are sent to the client with an FTP service (4), and when a complete object is digitized, the scans are exported to DigiTool as well (5).

Step three was to synchronize the metadata between all systems in an automated process (see Figure 4). Eventually all metadata in DigiTool will be replaced by the records for the digital objects in the union catalogue. This is a great advantage: in DigiTool there is no validation of the use of MARC 21, so the standard wasn't always used in the right way. During the conversion the metadata were enriched and corrected where possible, and the result is that we have cleaned up the metadata in all systems. Our aim is to close the connection between DigiTool and Primo as soon as possible. Because the images from DigiTool will be made available in Primo through a link in Aleph, the connection is no longer necessary.

Figure 4: Overview of the connections between the systems in use by Leiden University Library.

Scans that are made in projects by external vendors are uploaded in batches into DigiTool with a locally developed tool called MEGI (which stands for METS ingester and uploader). With this tool it is possible to prepare the structure of a METS file before upload, and to create metadata at both collection and item level. For uploads of single items, DigiTool's own ingest service Meditor is still used. In both cases, the identifier from the union catalogue is added to the records during ingest, to make it possible for DigiTool to import the right metadata.
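To give an impression of what such a package contains, here is a minimal sketch of a METS wrapper built with Python's standard library. The METS and XLink namespaces are the standard ones, but the identifier handling, attribute choices and file paths are our assumptions for illustration, not MEGI's actual output format.

```python
# Minimal METS skeleton: a descriptive section carrying the union-catalogue
# identifier, a file section listing the scans, and a physical structMap.
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)

def build_mets(identifier: str, scan_paths: list[str]) -> ET.ElementTree:
    mets = ET.Element(f"{{{METS}}}mets")
    # Descriptive metadata: only the identifier; the full record is
    # fetched from the union catalogue by DigiTool at ingest time.
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="OTHER", OTHERMDTYPE="ID")
    ET.SubElement(ET.SubElement(wrap, f"{{{METS}}}xmlData"), "identifier").text = identifier
    # One <file> per scan, each referenced once from the structMap.
    grp = ET.SubElement(ET.SubElement(mets, f"{{{METS}}}fileSec"),
                        f"{{{METS}}}fileGrp", USE="archive")
    top = ET.SubElement(ET.SubElement(mets, f"{{{METS}}}structMap", TYPE="physical"),
                        f"{{{METS}}}div", DMDID="DMD1", TYPE="object")
    for i, path in enumerate(scan_paths, start=1):
        f = ET.SubElement(grp, f"{{{METS}}}file", ID=f"F{i}")
        ET.SubElement(f, f"{{{METS}}}FLocat",
                      {f"{{{XLINK}}}href": path, "LOCTYPE": "URL"})
        page = ET.SubElement(top, f"{{{METS}}}div", TYPE="page", ORDER=str(i))
        ET.SubElement(page, f"{{{METS}}}fptr", FILEID=f"F{i}")
    return ET.ElementTree(mets)

build_mets("ggc:123456789", ["file:///scans/0001.tif"]).write("object.mets.xml")
```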
The last step was to add holding and item information to the new records in Aleph, to allow for view requests and loans. Because our collections are highly diverse, different request procedures and restrictions had to be taken into account. Materials that are kept in our stacks are collected throughout the day and are available an hour after the request. Photographs, however, have to acclimatize for 24 hours before they can be made available in the reading room. In addition, a part of our collection is kept in the Bibliotheca Thysiana, the only Dutch book collection from the seventeenth century that is still housed in its original purpose-built building.6 These materials can be consulted in the special collections reading room of the University Library only, and are collected once a week. (The sketch at the end of this section summarizes these retrieval rules.)

When we started the project, we knew it would take some time before all materials received an item description, especially because of the conversion project. For this reason, we decided to use the existing scan request button for view requests as well. The buttons only appear with materials that don't have an item yet, so they will gradually become obsolete.

An additional problem is that a considerable part of the special collections is not available in online search systems. Some materials are described only in non-digital form, like printed catalogues and inventories. In some cases, uncatalogued materials are mentioned in scholarly publications, or researchers find references in publications. For these materials we placed the buttons for scan orders and view requests prominently on the special collections tab of Primo (see Figure 5). This way, all of our special collections, catalogued and uncatalogued, digitized and non-digitized, can be requested online.

Figure 5: For uncatalogued materials, special order buttons are made available through the catalogue.
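A minimal sketch of these retrieval rules as a lookup table; the location keys, wording and helper function are ours, not a configuration format of any of the systems involved:

```python
# Illustrative mapping of shelving location to retrieval rule.
RETRIEVAL_RULES = {
    "stacks":      {"pickup": "several runs per day", "delay": "about 1 hour"},
    "photographs": {"pickup": "several runs per day",
                    "delay": "24 hours (acclimatization)"},
    "thysiana":    {"pickup": "once a week", "delay": "up to 1 week",
                    "where": "special collections reading room only"},
}

def expected_delay(location: str) -> str:
    rule = RETRIEVAL_RULES.get(location)
    return rule["delay"] if rule else "unknown location"

print(expected_delay("photographs"))  # 24 hours (acclimatization)
```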
4. Conclusion

By connecting the existing systems and by developing local additions, researchers and students are offered easier and faster ways to view materials in the reading room, and to order, pay for and receive reproductions of our heritage collections. The project has made an important contribution to the visibility of these collections. All of our materials, not just the textual sources but also the various image collections (prints, drawings and photographs), are now available in WorldCat, thus improving the discoverability of our records. We are also experiencing a significant increase in orders. By placing the buttons prominently in the catalogue, ordering scans has apparently become so much simpler that the scanner we bought for our in-house scanning activities is now in use almost full-time for digitization-on-demand.

As explained above, OCLC not only converted the metadata to the standard used in the union catalogue, but also created new formats for various material types. Because we are the first library in the Netherlands to use the union catalogue for these materials on this scale, our project was conceived as a pilot. While our opinions clearly carried much weight, the main goal was to create new formats for all Dutch libraries.

An important secondary goal of the program was to make the current systems more manageable, by using them correctly and by automating the workflow where possible. The cleanup consisted of more than technical solutions. During the project it became apparent that it was even more important than previously assumed to make proper arrangements for the use of systems, to identify deficiencies, and, when necessary, to create workarounds to avoid future problems in our systems. Although it took our staff some time to get used to the new way of working, they now clearly see the advantages as well. For example, colleagues involved in cataloguing at first considered the stricter procedures inflexible. But working in a consistent manner improves the quality of the content considerably, thus increasing the possibility of building new services on the content as well.

Altogether, the most important result of the project is that, at present, we are much better prepared for the future of library cataloguing. In the near future, we plan to implement an update of the infrastructure for our digital collections, and the results of this project will make replacement with any possible future system much easier.

Notes

1 Christiane Berkvens-Stevelinck. Magna commoditas: Leiden University's great asset. 425 years library collections and services. Amsterdam: Leiden University Press, 2012.

2 Information on the Dutch union catalogue GGC (which stands for Gemeenschappelijk Geautomatiseerd Catalogiseersysteem, or shared automated cataloguing system) can be found here.

3 IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. München, 1998. PDF and HTML files of the report can be found here.

4 Rare Books and Manuscripts Section of the Association of College and Research Libraries. Descriptive Cataloging of Rare Materials (Graphics). Chicago, 2013. Especially Appendix E, Variations requiring a new record.

5 For Goobi case studies see http://www.goobi.org/en/ and http://slideshare.net/goobi_org.

6 For English information about the Bibliotheca Thysiana, with further literature, see here.

About the Author

Saskia van Bergen works as a senior project manager for the Innovations and Projects Department of Leiden University Library. She is responsible for projects focusing on digital access to special collections, and deals with the management of digitization, cataloguing and digital collections. She also participated in several national projects, like Early Dutch Books Online (now Delpher) and the Dutch portal for academic heritage, Academische Collecties.
Copyright © 2014 Saskia van Bergen

----

Pre-ILS Migration Catalog Cleanup Project

In the Library with the Lead Pipe, 19 August 2016, by Robyn Gleasner

Image by flickr user ashokboghani (CC BY-NC 2.0)

In Brief: This article describes the University of New Mexico Health Sciences Library and Informatics Center's (HSLIC) catalog cleanup process prior to migrating to a new integrated library system (ILS). Catalogers knew that existing catalog records would need to be cleaned up before the migration, but weren't sure where to start. Rather than provide a general overall explanation of the project, this article provides specific examples from HSLIC's catalog cleanup process and discusses specific steps to clean up records for a smooth transition to a new system.

Introduction

In February 2014, the Health Sciences Library and Informatics Center (HSLIC) at the University of New Mexico (UNM) made the decision to migrate to OCLC's WorldShare Management Services (WMS). WMS is an integrated library system that includes acquisitions, cataloging, circulation, and analytics modules, as well as a license manager. The public interface/discovery tool, called Discovery, is an open system that searches beyond items held by your library and extends to items available worldwide that can be requested via interlibrary loan. We believed that Discovery would meet current user expectations with a one-stop searching experience by offering a place where users could find both electronic and print resources rather than having to search two separate systems. In addition to user experience, we liked that both WMS and Discovery are not static systems: OCLC makes enhancements to the system and offers streamlined workflows for the staff. These functionalities, along with a lower price point, drew us to WMS.

This article will discuss HSLIC's catalog cleanup process before migrating to OCLC's WMS. Before the decision was made, the library formed an ILS Migration Committee consisting of members from technical services, circulation, and information technology (IT) that met weekly. This group interviewed libraries that were already using WMS, conducted literature searches, and viewed recorded presentations from libraries using the system. This research solidified the decision to migrate.

HSLIC began the migration and implementation process in June 2014 and went live with WMS and WorldCat Discovery in January 2015. Four months elapsed from the time the decision was made to the time the actual migration process began, due to internal security reviews and contract negotiation. Catalogers knew that existing catalog records would need to be cleaned up before the migration, but weren't sure where to start. Because of this, the cleanup process was not started until the OCLC cohort sessions began in June 2014.
These cohort sessions, led by an OCLC implementation manager, were designed to assist in the migration process with carefully thought-out steps and directions; they provided specific training in how to prepare and clean up records for extraction, as well as showing which fields from the records would migrate. In addition to providing information about the migration, the OCLC cohort sessions also provided information on the specific modules within WMS, including Metadata/Cataloging, Acquisitions, Circulation, Interlibrary Loan, Analytics and Reports, License Manager, and Discovery. While the sessions were helpful, the cleanup of catalog records is a time-intensive process that could have been started during the waiting period. Luckily, we were one of the last institutions in the cohort to migrate bibliographic records. This allowed more time to consider OCLC's suggestions, make decisions, and then clean up records in our previous ILS, Innovative's Millennium, before sending them to OCLC.

Literature Review

While there is extensive information in the professional literature regarding how to choose an ILS and how to decide whether or not to move to a cloud-based system, there is little information about the steps needed to clean up catalog records in order to prepare for the actual migration process.

Dula, Jacobson, et al. (2012) recommend thinking "of migration as spring-cleaning: it's an opportunity to take stock, clear out the old, and prepare for what's next." They "used whiteboards to review and discuss issues that required staff action" and "made decisions on how to handle call number and volume entry in WMS;" however, catalog record cleanup pre-migration was not discussed in detail.

Similarly, Dula and Ye (2013) stated that "[a] few key decisions helped to streamline the process." They "elected not to migrate historical circulation data or acquisitions data" and were well aware that they "could end up spending a lot of time trying to perfect the migration of a large amount of imperfect data" that the library no longer needed. They planned on keeping reports of historical data to avoid this problem.

Hartman (2013) mentioned a number of questions and concerns for migrating to WMS, including whether or not to migrate historical data or to "start with a clean slate." They decided that they "preferred the simpler two-tiered format of the OCLC records" to their previous three-tiered hierarchy, but found some challenges, including the fact that multi-volume sets did not appear in the system as expected. The cataloger chose to view this as "an opportunity to clean up the records" and methodically modified records prior to migration. Hartman (2013) also discussed that the "missing" status listed in their previous ILS did not exist in WMS and that they had to decide how or if they should migrate these records.

While the questions and concerns that these authors mentioned helped us focus on changes to make in the catalog prior to migration, we found no literature that discussed the actual process of cleaning up the records. From the research, it was obvious that a number of decisions would have to be made in the current ILS before the migration would be possible.

Process

In order to make those decisions, the ILS Migration Committee met every other week to discuss what had been learned in the OCLC cohort sessions as well as any questions and concerns.
It was important for catalogers to understand why certain cataloging decisions had been made over the years to determine how items should be cataloged in the new system. Our library's cataloging manual and procedure documentation was read, and questions were asked of committee members who had historical institutional knowledge. Topics included copy numbers, shelving locations, and local subject headings. Notes and historical purchasing information were closely examined and their importance questioned. Material formats and statuses were also examined before determining what should be changed to meet the new system's specifications.

Copy Numbers

OCLC recommended taking a close look at copy numbers. A few years ago, a major weeding of the media and book collections was conducted. Unfortunately, when items were withdrawn, the copy numbers were not updated in the system. In some cases, copies 4 and 5 were kept while copies 1-3 were withdrawn and deleted from the system. In the new system this would make it appear that the library had five copies of a title, when it really owned two. We decided that the actual copy number of an item wasn't important to our library users, because we could rely on the barcode; however, it was important to determine the number of copies so that WMS could accurately identify when multiple copies of an item existed.

In order to make these corrections, a list was run in Millennium for items with copies greater than 1, and item records were then examined to discover how many copies actually existed in the catalog. Corrections were made as needed. This was a bigger job than anticipated, but it was a necessary step to avoid post-migration cleanup of the copy numbers and to prevent errors in WMS.
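An audit like this can also be scripted against an export of the review file. Below is a minimal sketch, assuming the items have been exported to a CSV with one row per item; the file layout and column names are our assumptions, not Millennium's export format:

```python
# Flag bib records whose attached items' copy numbers do not run 1..n,
# e.g. a title that holds two items numbered copy 4 and copy 5.
import csv
from collections import defaultdict

copies = defaultdict(list)
with open("item_export.csv", newline="", encoding="utf-8") as fh:
    for row in csv.DictReader(fh):
        copies[row["bib_num"]].append(int(row["copy_num"]))

for bib, nums in sorted(copies.items()):
    if sorted(nums) != list(range(1, len(nums) + 1)):
        print(f"{bib}: {len(nums)} item(s), but copy numbers are {sorted(nums)}")
```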
Shelving Locations

One of the first things we learned in the OCLC cohort sessions was that many of the statuses that we used in Millennium did not exist in WMS. Some examples were:

MISSING
STOLEN
BILLED
CATALOGING
REPAIR
ON SEARCH

Because these statuses were no longer an option, we decided to create shelving locations that would reflect these statuses in WMS. Some of these shelving locations aren't necessarily physical locations in the library, but rather designations for staff to know where the item can be found. For example, items with a previous status of "repair" in Millennium now have a shelving location of "repair" in WMS. This alerts staff that the item is not available for checkout and is in repair in our processing room. We decided to delete items that had statuses of "stolen" and "missing" prior to migration to better reflect the holdings of our library. We also decided to delete a number of shelving locations, as they were no longer being used or no longer needed. For example, some locations were merged, and others were renamed to better reflect and clarify where the physical shelving locations were in the library as well as the type of material the locations held.

Local Bibliographic Data and Subject Headings

WMS uses OCLC's WorldCat master records for its bibliographic records. This means that WMS libraries all use the same records and must include information that is specific to a given library in a separate section called Local Bibliographic Data (LBD). After much discussion, we decided to keep the following fields: 590, 600, 610, 651, 655, 690, and 691. We felt that keeping these fields would create a better record and provide multiple access points for our users.

A number of records for Special Collections had local topical terms in the 690 field and local geographic names in the 691 and 651 fields. For the most part, master records did not exist for these records, as they were created locally for HSLIC's use. When these bibliographic records were sent to OCLC for the migration, the WorldCat master record was automatically created by OCLC as part of the migration process. It was important that these subject headings were migrated as part of the project, so that they were included with the record and not lost as an access point.

We also decided that the local genre information in the 655 field was important to retain, as it provided an access point on a local collection level. For example, we wanted to make sure that "New Mexico Southwest Collection" was not lost to researchers who are familiar with that particular collection. Generally, a genre heading contained in the 655 field would be considered part of the WorldCat master record that other libraries could use. Because our local information would not be useful to other libraries, we decided to transfer this information to a 590 local note so that it would only be visible to our library users, as sketched below.

Notes

Decisions regarding local notes that were specific to our institution, such as general notes in the 500 field and textual holdings notes in the 850 field, had to be made. We requested that Innovative make the information in the 945 field visible to our catalogers. This is the field that contains all of the local data, including item information, and it is instrumental in the migration process.

500 General Notes

During the migration process, libraries have the option to load local bibliographic data to supplement the OCLC master records. This means that when OCLC receives the library's bibliographic records, the records are automatically compared with OCLC's master records according to a translation table submitted by the library.

The 500 field was closely examined to ensure that information wasn't duplicated or deleted. OCLC master records usually contain a 500 note field, a general note that would be relevant to any library that holds the item. For example, some records contain "Includes index" in the 500 note field. Because this field already exists within the master record and is relevant for anyone holding the item, we wanted to keep the information in the master record. However, we had a number of notes in this field that were relevant only to our library, and we could not simply keep those notes in this field. If we had migrated the 500 field, it would have resulted in two note fields containing the same information in the master record, as the local note would "supplement" the master record. Because of this, we chose not to migrate information in the 500 field in order to prevent duplicate information. Instead, a list was created in Millennium, mainly of Special Collections records that were created locally and not previously loaded into WorldCat. Catalogers then examined the information in the 500 field of these records to determine whether the information was local or general, and changed it manually one record at a time. If the information in this field was considered local and only important to HSLIC, it was moved to a 590 field, so that it would be visible to our users in Discovery and staff in WMS, but not to any other libraries who might want to use the record.
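The 655-to-590 transfer lends itself to a batch job. Here is a minimal sketch using the open-source pymarc library (pymarc 5-style subfields); the file names and the genre list are our examples:

```python
# Copy local genre headings (655) into 590 local notes and drop the 655,
# so the local term never touches the WorldCat master record.
from pymarc import MARCReader, MARCWriter, Field, Subfield

LOCAL_GENRES = {"New Mexico Southwest Collection"}

with open("bibs.mrc", "rb") as src, open("bibs_fixed.mrc", "wb") as dst:
    writer = MARCWriter(dst)
    for record in MARCReader(src):
        if record is None:               # skip records pymarc could not parse
            continue
        for f655 in list(record.get_fields("655")):
            if any(genre in f655.value() for genre in LOCAL_GENRES):
                record.add_field(Field(tag="590", indicators=[" ", " "],
                                       subfields=[Subfield("a", f655.value())]))
                record.remove_field(f655)
        writer.write(record)
    writer.close()
```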
Local Holding Records

WMS's local holding record (LHR) incorporates information from Millennium's item record along with the holding information from the bibliographic record. It includes information like the call number, chronology and enumeration, location, and price. The LHR in WMS was created using the information found in the 945 field, which was included in the extracted bibliographic records we sent to OCLC. For the most part, migrating this information was simple, except for a few cases unique to our library.

850 Holding Institution Field

The 850 holding institution field is part of the bibliographic record and was labeled in our instance of Millennium as "HSLIC Owns". This field was used to list coverage ranges, or the dates and issues held by our library, for journals, special collections material, and continuing resources. This information is usually cataloged in the 863 field within an item or local holdings record; however, HSLIC did not use this in Millennium. WMS reserves the 850 field for OCLC institution symbols with holdings on a particular title, which meant that we could not continue to use the 850 field as we had previously. Because WMS coverage dates are generated from the enumeration listed in the LHR, we explored the possibility of migrating the 850 field from the bibliographic record to the 863 field in the local holding record. Unfortunately, it was not possible to do a global update to cross from bibliographic record to item record within Millennium during the migration process.

There were two options for creating coverage statements in the migration process: 1. allow the statements to be newly generated in WMS through the holdings statements generating tool, or 2. move the current coverage statements to a 590 note. Because there were so many notes that would need to be moved to the 590 field, a decision was made to delete the 850 holding institution fields from almost all of our records and use the automated summaries generated in WMS. This left all serial records without coverage dates during the migration project in Millennium; however, we believed it would make the migration process to WMS easier.

Special Collections records did not include item-level date and enumeration in the item records and were instead cataloged at a box or series level. This eliminated the possibility of using WMS automated summaries. Because of this, coverage statements were moved to a 590 public note for all Special Collections records. This way the information was retained in the system, while still creating an opportunity to change the formatting at a later date if needed.

After the migration, it was discovered that the system-generated coverage dates were not as complete or as easy to read in WMS as they had been in Millennium. It is an ongoing project to clean up these summaries and keep them current in the new system. Below is a screenshot of how the coverage dates appeared on the staff side of Millennium:

[Screenshot: coverage dates as displayed in Millennium]

This is how the coverage dates appear in WMS:

[Screenshot: coverage dates as displayed in WMS]

In hindsight, we should have migrated the 850 field to a 590 field to keep the information as local bibliographic data, in addition to using the WMS automated summary statement. The coverage dates would then have appeared in a public note, which would have given our staff and users an additional place to look for the coverage dates. It would also have given technical services staff a point of comparison when cleaning up the records post-migration.
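Had we done so, the batch edit would have been simple. Here is a minimal sketch of that retrospective fix, again with pymarc; the file names and note wording are placeholders:

```python
# Preserve each 850 "HSLIC Owns" statement as a 590 local note, then
# delete the 850, which WMS reserves for OCLC institution symbols.
from pymarc import MARCReader, MARCWriter, Field, Subfield

with open("serials.mrc", "rb") as src, open("serials_out.mrc", "wb") as dst:
    writer = MARCWriter(dst)
    for record in MARCReader(src):
        if record is None:
            continue
        for f850 in list(record.get_fields("850")):
            note = "HSLIC Owns: " + f850.value()
            record.add_field(Field(tag="590", indicators=[" ", " "],
                                   subfields=[Subfield("a", note)]))
            record.remove_field(f850)
        writer.write(record)
    writer.close()
```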
Info/Historical Records

In Millennium, a local practice had developed of keeping notes about subscriptions in an item record attached to the bibliographic record. In WMS, these could not be migrated as items, because they were not real items that could be checked out, but rather purchasing notes that were only important to staff. Because of this, it was important that these notes not be visible to the public. These notes were a constant topic of discussion among the implementation team members and with the OCLC cohort leaders. One idea was to migrate them from an item to a bibliographic field by attaching each note as an 850 holding institution field. Unfortunately, just as it was not possible to do a global update to cross from bibliographic record to item record, it was also not possible to cross from item record to bibliographic record. OCLC tried to help with this, but could not find a solution for crossing between record types. Even if this were possible, the above-mentioned issues with the 850 field would have been encountered, and the information would have had to be moved to a 590 field to retain it.

Because this seemed complicated, a list was created of all of the info/historical records in Millennium and then exported to Excel to create a backup file containing these notes. Soon after this was completed, OCLC developers found a way to translate the information from the 850 field to the 852 non-public subfield x note in WMS as part of the migration. Historical purchasing information is now in a note that is only visible to staff in WMS.
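The effect of that translation can be illustrated with a short sketch: the purchasing note ends up in an 852 subfield $x, which the MARC holdings format defines as a nonpublic note. The location, call number and note text below are invented examples:

```python
# Build a local holdings field carrying a staff-only subscription note.
from pymarc import Field, Subfield

def holdings_with_staff_note(location: str, call_no: str, note: str) -> Field:
    return Field(
        tag="852",
        indicators=["2", " "],            # 2 = NLM classification scheme
        subfields=[
            Subfield("b", location),      # shelving location
            Subfield("h", call_no),       # classification part of call number
            Subfield("x", note),          # $x: nonpublic (staff-only) note
        ],
    )

field = holdings_with_staff_note("serials", "W1 JO706",
                                 "Print subscription cancelled 2009; see vendor file.")
print(field)
```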
A few short weeks before the scheduled migration date, the linked records were discovered in the form of series analytic records. A series analytic record is cataloged as an overarching monographic series title that is then linked to the individual titles within that series. This means the item record is linked both to the overarching bibliographic record for the series and to the bibliographic record for the individual title, which links the two bibliographic records to each other. Unknown to those working on the migration project, previous catalogers had an ongoing project to unlink these analytic records whenever a monographic series subscription was no longer active. Notes were found on how to unlink the records, but no notes on which titles were involved or where the previous catalogers had left off. Unfortunately, we had no way to identify linked records in Millennium. We unlinked as many of the records as possible before the migration, but finally had to send the data to OCLC knowing that many linked records remained.

These records migrated as two separate instances of the same barcode, which created two LHRs in WMS and consequently duplicate barcodes. After the migration, OCLC provided a number of reports, including a duplicate barcode report, so that these duplicate instances could be found. To correct these records, the item was pulled and examined to determine whether the serial or the monograph record best represented it. The local holdings record was corrected for the chosen title, and the LHR attached to the unchosen bibliographic record was deleted.

In Millennium, the choice between representing an item with a serial or a monograph record had few implications for users. In WMS, however, choosing a serial record can allow article-level holdings to be returned in Discovery, while choosing a monograph record will not. Conversely, choosing a serial record for an item that looks like a monograph might make the item harder to find if users narrow their search to "book." Because of this, careful review of items and material types was necessary to create the best user experience. For example, "The Handbook of Nonprescription Drugs" looks like a hardcover book to most library users and even staff. In Discovery, if the format is limited to "journal," the title is the first search result; if the search is limited to the format "book," the title does not appear on the first page of results.

Serials

As mentioned previously, OCLC relies on the 945 field for all item information. For the most part, serials records contained the 850 HSLIC Owns field discussed earlier. The 945 subfield a was used to record one of the following distinctions: Current Print Subscription, Current Print and Electronic Subscription, or Electronic Subscription. Because the 945 subfield a also contained the volume dates, we chose to move this information to a 590 local note field. Once those notes were moved, we found that enumeration and chronology had been entered in various subfields within the 945 field. The date was usually in subfield a, volume notes were found in subfield d, and the volume number was in subfield e. The example below is taken from an extraction in Millennium and shows the enumeration and chronology for volume 53 of the journal "Diabetes," published in 2004. The first line shows a note that this volume is a supplement, while the second shows a more typical entry with volume number and coverage:

945 |c|e53|a2004|dSupplements|
945 |c|e53|a2004:July-2004:Dec|

The enumeration and chronology was constructed from these subfields where possible; however, where information was repeated in a different subfield, it had to be cleaned up post-migration.
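Reconstructing holdings from these inconsistently used subfields is essentially a small parsing exercise. Here is a minimal sketch of how 945 lines like the ones above could be normalized; the function names and output format are our own illustration of the local subfield conventions just described, not a Millennium or WMS feature.

def parse_945(line: str) -> dict:
    """Parse one extracted 945 line into a dict of subfield code -> value."""
    tag, _, rest = line.partition('|')
    assert tag.strip() == '945'
    subs = {}
    for chunk in rest.split('|'):
        if chunk:                        # e.g. "e53" -> code "e", value "53"
            subs[chunk[0]] = chunk[1:]
    return subs

def enum_chron(line: str) -> str:
    """Build a human-readable enumeration/chronology statement."""
    subs = parse_945(line)
    pieces = []
    if subs.get('e'):
        pieces.append('v.' + subs['e'])          # volume number
    if subs.get('a'):
        pieces.append(subs['a'])                 # chronology/date range
    if subs.get('d'):
        pieces.append('(' + subs['d'] + ')')     # volume note
    return ' '.join(pieces)

print(enum_chron('945 |c|e53|a2004|dSupplements|'))   # v.53 2004 (Supplements)
print(enum_chron('945 |c|e53|a2004:July-2004:Dec|'))  # v.53 2004:July-2004:Dec

Encoding the conventions once also makes the post-migration cleanup auditable: any 945 whose parsed output disagrees with the WMS-generated summary can be flagged for review.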
Electronic Resources

We decided not to migrate electronic resources cataloged in Millennium to WMS. Electronic resources are managed within Collection Manager, WMS's electronic resource manager. The translation table specified that any record with a location of "electronic resource" not be migrated to the new system. Unfortunately, many of the electronic resource records migrated anyway; they may have been attached to a print record or did not have the location set as electronic resource. Holdings had to be removed from these records post-migration.

Before migration, we decided to delete records for freely available e-books from Millennium. Most of these resources were provided to the public via government websites hosted by the Centers for Disease Control and Prevention (CDC) and could easily be accessed through other means of searching. These resources could be added to Collection Manager post-migration if deemed important.

Similarly, electronic records were not migrated directly from Serials Solutions, our previous electronic resource manager. Instead, electronic resources were manually added to Collection Manager for a cleaner migration. All electronic resources are shared with University Libraries (UL), the main campus library, so close collaboration with UL was necessary in order to share and track these resources. While all HSLIC resources were shared with UL and all UL resources were shared with us, we selected only the resources relevant to the health sciences in Collection Manager. This created a more health sciences-focused electronic resources collection, so that titles relevant to these subjects are displayed at the top of the search.

Suppressed Records

One of OCLC's slogans is "because what is known must be shared," so it makes sense that WMS does not have the capability to suppress records. If an item has our holdings and an LHR, it is viewable to the public in Discovery. For the most part this concept worked for us. There were two record types in Millennium where it presented challenges: suppressed items and equipment records.

Suppressed Items

At the time of migration, there were around 1,200 books that had been removed from the general collection and stored offsite for future consideration for addition to Special Collections. These records were suppressed in Millennium so that only staff could see them in the back end. We considered adding these items back into the collection so that the records would not be lost, but finally decided this would be far too time consuming in the middle of the migration, and that many of the titles would probably be deleted later anyway. Instead, another list was created in Millennium containing items in offsite storage with a status of "suppressed," and an Excel spreadsheet was created containing the titles, OCLC numbers, and call numbers of all of the formerly suppressed titles, allowing for easy reference to the items in storage. We instructed OCLC not to migrate any records with a status of suppressed.
Equipment Records

Similarly, there were a number of equipment records that were only viewable and useful to staff at the circulation desk. These records were for laptops, iPads, a variety of cables and adapters, and even some highlighters and keys. These items all had barcodes and could be checked out, but patrons had to know they existed in order to ask for them. Although this never seemed to be a problem for users, and although it seemed strange to create bibliographic records for equipment, we decided to create brief records and migrate them in the hope of promoting use. Now users can see whether a laptop is available for checkout before even asking. While these records are a bit unorthodox from a traditional cataloging perspective, creating them ultimately added to the service the library was already providing, in addition to providing a way to circulate the equipment using WMS.

Conclusion

Although there were many steps, surprises, and decisions to be made, the pre-migration cleanup process was definitely worth the work. Many errors were discovered post-migration, but without the initial cleanup there would have been even more problems.

At HSLIC, we have one full-time cataloger/ILS manager and one full-time electronic resources/serials librarian. It took nearly six months to clean up catalog records before migrating to WMS. Starting the cleanup process earlier would have saved us a great deal of work and resulted in cleaner records to migrate. We should have started looking for the linked series analytic records immediately. This would have given us more time to identify the records, unlink them, and decide which record best represented each item before sending the records to OCLC. It would also have prevented post-migration cleanup of duplicate barcodes and spared circulation staff confusion when checking these items out to users.

Five of the eight members of HSLIC's ILS migration committee had worked at HSLIC for less than a year before we began the migration process. This provided a balance between historical institutional knowledge and new perspectives. It helped us look at the catalog with fresh eyes and allowed us to ask "why" whenever the answer was "that is the way we have always done things." If "why" couldn't be answered or no longer seemed relevant, we considered making a change.

The catalog should reflect what is on the shelf and what is accessible electronically. The online catalog is the window to the library itself and should accurately represent what the library holds. Because of electronic access to e-books and e-journals, some of our users won't ever step into the physical library, which makes the accuracy of the online catalog or discovery layer even more important. Even if your library isn't moving to a new ILS, it is important for catalogers and technical services staff to ask, "What is in the library's catalog?" and then ask, "Why?" As we discovered at HSLIC, keeping notes and shelving locations just because "that is what had always been done" was in some cases no longer compatible with the new system and in others no longer efficient or comprehensible. Sometimes change is exactly what is needed to keep the catalog relevant to library users.

Acknowledgements

Thank you to the peer reviewers, Violet Fox and Annie Pho, for helping me focus and clarify my ideas and experiences in this article. You both made the peer review process an interesting and enjoyable experience.
Thank you to Sofia Leung, publishing editor, for guiding me through the process. I would also like to thank all of the members of the HSLIC ILS Migration Committee who made the migration possible. I would especially like to thank Victoria Rodrigues for her hard work on cleaning up the serial records and adding our electronic resources to the new system.

This work is licensed under a CC Attribution 4.0 License. ISSN 1944-6195

CASE REPORT
DOI: dx.doi.org/10.5195/jmla.2019.666
Journal of the Medical Library Association, 107(4), October 2019

Evaluating a historical medical book collection

Karen R. McElfresh, AHIP; Robyn M. Gleasner
See end of article for authors' affiliations.

Background: After several years of storing a large number of historical medical books that had been weeded from the general collection, the University of New Mexico Health Sciences Library and Informatics Center developed a set of evaluation criteria to determine whether the material should be kept and included in the library catalog or discarded. The purpose of this article is to share lessons learned in evaluating and processing a historical medical book collection. The authors share how we determined review criteria as well as cataloging and processing procedures.

Case Presentation: Best practices for evaluating, cataloging, and processing historical library material were determined through a literature search and then reviewed and adapted for application to this project. Eight hundred sixty-two titles were selected to add to the catalog and were added to a shelving location in our offsite storage facility.

Conclusions: These materials are now discoverable in the library's catalog for library users who are interested in historical research, and the materials have been processed for easy retrieval as well as preservation purposes.

BACKGROUND

Preserving the history of medicine to help scholars and clinicians discover errors and to connect practitioners and institutions to the past are important values for health sciences librarians [1–5]; however, libraries do not always have staff with the expertise and resources to implement these values. In addition to the difficulty of managing historical collections, it can also be difficult for librarians to find historical information. To remedy this, the Medical Library Association (MLA) created a BibKit that includes ready reference, primary and secondary sources, and Internet resources relevant to the history of medicine [6].
While this resource was helpful, librarians at the University of New Mexico Health Sciences Library and Informatics Center (HSLIC) wanted to make the library's modest collection of older print books, dating from the early 1800s to the 1950s, more discoverable and findable to support historical researchers.

In 2007, approximately 1,300 monographic volumes were weeded from the general collection and placed in offsite storage to be evaluated for addition to the historical collection. The bibliographic records were suppressed in the library catalog system so that they would not display in the public catalog. Due to personnel changes, no one was available to review these titles for several years. In 2015, the library migrated to a new catalog system that did not permit a suppressed status. Because the collection needed substantial review, we decided not to migrate these records to the new system, which meant the books would need to be re-cataloged. Before we lost access to our previous catalog system, we exported information about the suppressed titles, including title, author, barcode number, Online Computer Library Center (OCLC) number, National Library of Medicine (NLM) call number, and circulation data, to a spreadsheet. Because of the suppressed status, this review project became known as the "Suppressed Books Project."

At the time of this project, the University of New Mexico HSLIC did not have an archivist or special collections librarian, so the collection development librarian evaluated the books. It was challenging to find current case studies from health sciences libraries that documented similar projects. There is a plethora of information in the professional literature regarding the importance of writing policies for collection development, as well as on determining standards for rare books, but little information on selection criteria and evaluation of books with historical value. The Association of College and Research Libraries' Rare Books and Manuscripts Section offers guidelines on selecting and transferring materials from general collections to special collections that consider (1) market value, (2) rarity and scarcity, (3) date and place of publication, (4) physical and intrinsic characteristics, (5) bibliographic and research value, and (6) condition [7]. While the library has a collection development manual that covers selection criteria for the general collection and a separate special collections policy, these historical titles did not fit into existing documentation or procedures. Our goal was to develop selection criteria to help guide future decisions to include historical information in the collection. The Cleveland Clinic Foundation Library had a similar objective: to evaluate and provide selection criteria for its offsite historical book collection to incorporate into the general collection, explain the evaluation process and rationale for criteria, and provide a written collection development policy to guide future decisions [8].
While our goal was to develop selection criteria rather than a policy, their advice was helpful: look in the catalog to determine whether the title is already held, check availability in consortia or our university library systems, look at the number of copies available in OCLC's WorldCat, and investigate the dollar value of the book. Other libraries focused their criteria on preservation, relevance, potential research need, quality, and type of publication, which also helped shape our approach [3]. Our objective here is to share our process and experience to help other librarians in similar situations, such as weeding projects or assessments of donations of older materials.

CASE PRESENTATION

Setting

The University of New Mexico HSLIC supports the academic, research, and clinical enterprises of the University of New Mexico Health Sciences Center. While the majority of our library resources are electronic, HSLIC also collects monographs, serials, and archival or historical material in print. To support researchers who are interested in the history of medicine and health, we maintain a modest collection of older print books, dating from the early 1800s to the 1950s. The library also has a separate special collection and archive with materials related to the history of our institution and health care in the state.

Review criteria

We started the process by developing a set of criteria to apply to the suppressed books to guide our decisions about which books to keep. Because the books were suppressed in the catalog and were not accessible to library users, we did not have any recent usage data to use in our evaluation. Even if this information had been available, it would not have been a useful metric, given that past use would not necessarily predict future use in historical research. While circulation data are commonly used in collection assessment, they were not discussed as a metric in any of the literature that we reviewed on historical collections [1, 4, 8–11]. Essentially, the books were evaluated in a manner similar to the evaluation of donated materials.

Subjects

Much of the criteria that we developed were subject based. The material had to be in scope to be retained in the historical collection, meaning it had to fall within the same subject parameters that we applied to the general collection as defined in the library's collection development manual. For material that fell outside the health sciences scope, we made a note to offer it to the university's main campus library system. HSLIC is a separate entity from the main campus library system, even though we are part of the same institution and located on the same campus. Each library system maintains its own special collections, and any materials transferred between the libraries are treated as donations.

We were particularly interested in any materials in the health sciences subjects that documented the history or foundation of a health profession or an area of study. An example is History of Cardiology by Bishop and Neilson (1927). We also kept all materials related to specific subject areas of local interest, including Native American health, Latin American health, health care in New Mexico and the American Southwest, rural health, tuberculosis, military medicine, and midwifery.
In the 1990s, the library decided to keep all editions of certain textbooks that were considered "core" in their specialties to allow researchers to track the development of a specialty over time. A textbook was chosen for each NLM classification, and all editions of that text that the library owns are kept in perpetuity. Some of these editions were mistakenly pulled and placed in the suppressed book collection. This error was corrected, and the items were added back to the catalog.

Any book that was included in Morton's Medical Bibliography: An Annotated Check-List of Texts Illustrating the History of Medicine [12], also known as Garrison-Morton, was kept. Garrison-Morton is a comprehensive and authoritative resource on the history of medicine and has been maintained since 1912. We chose to use the online version of this resource, as it allows searching by title, author, subject, publication date, or entry number [13]. The web version contains all the information from the print editions of Garrison-Morton and is updated with new information [14]. When a book on the Garrison-Morton list was cataloged, a note including the Garrison-Morton entry number was added to the catalog record. This note will allow us to retrieve a list of all Garrison-Morton books that the library owns, which could be helpful in future collection review or weeding projects.

Availability at other libraries

As suggested in the literature and standard collection analyses, we searched for each book in WorldCat to determine the number of copies available worldwide. However, this did not prove to be especially useful. HSLIC is the only health sciences library in New Mexico that is open to the general public, and it was very unlikely that any other local libraries would hold historical medical materials. Even copies available at libraries in our regional consortium would not be accessible to our users, as the closest member library is more than 250 miles away, and most libraries will not loan historical or special collections materials via interlibrary loan.

Languages

The majority of the books in the collection were written in English or were English translations of works originally written in other languages. We also had a fairly large amount of material in German and some in Spanish and French. We kept non-English material, provided it was in the original language of the work and not a translation.

HathiTrust

Many of the books in the suppressed book collection were published before 1923 and were, therefore, no longer under copyright. For the books that were in the public domain, we checked the HathiTrust Digital Library to see if a scanned version was available. Because our library uses OCLC's WorldShare Management Services (WMS) as our catalog and electronic resources management tool, we were able to search the OCLC Collection Manager to see if a scanned version was available to add to our collection. We added the scanned versions, which display like e-books in our discovery layer, WorldCat Discovery, to provide an additional access point for our users. In most cases, we still kept the print edition, because the electronic scans in HathiTrust can be inconsistent in quality and difficult to navigate.

The process of adding items from HathiTrust to our catalog was fairly straightforward, but we did encounter some problems. Some titles were in HathiTrust but were not available in the OCLC Collection Manager and had to be added by our cataloger.
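For titles like these, availability can also be checked directly against HathiTrust's public Bibliographic API rather than through Collection Manager. The snippet below is a minimal sketch of such a batch check, assuming the brief-record endpoint keyed by OCLC number; the OCLC numbers shown are placeholders, and the "Full view" rights string is used here as the marker for a usable public-domain scan.

import json
import urllib.request

# Minimal sketch: query HathiTrust's Bibliographic API for full-view
# (public domain) scans, given a list of OCLC numbers.
# Endpoint form: https://catalog.hathitrust.org/api/volumes/brief/oclc/{n}.json
def full_view_items(oclc_number: str) -> list:
    url = f'https://catalog.hathitrust.org/api/volumes/brief/oclc/{oclc_number}.json'
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    return [item['itemURL'] for item in data.get('items', [])
            if item.get('usRightsString') == 'Full view']

for oclc in ['14779043', '1548060']:   # hypothetical OCLC numbers
    links = full_view_items(oclc)
    print(oclc, '->', links if links else 'no full-view scan')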
There were also records that linked to the wrong item in HathiTrust, and in some cases HathiTrust had incorrect information on its site. These errors were reported to HathiTrust using their feedback form.

Cataloging and processing

Our goal was to integrate the selected books into our offsite storage shelving location. The books that were not selected were discarded by marking an "X" through the call number and barcode to show the item had been evaluated; they were then put in our recycling bin for weekly pick-up. Books selected to keep were cataloged.

Cataloging the material was more challenging than we had anticipated. We consulted the spreadsheet from our previous catalog for the OCLC number and NLM call number. We expected to be able to reuse the previously assigned call numbers and then use the OCLC number to pull in the appropriate master bibliographic record using OCLC's Record Manager. Unfortunately, this method was not always effective. Many of the OCLC numbers were no longer accurate, as the records had probably been merged with a more complete record and given a new OCLC number. We developed the following process to find and select the best record:

1. Look for the OCLC number listed on our spreadsheet in WMS Record Manager.
2. If not available, use WMS Record Manager Advanced Search to search WorldCat for the title, author, and year of publication.
3. From here, select "Book - PrintBook" as the material format and "English" as the cataloging language.
4. Select the record with the correct publisher.

Because our original method did not work as expected, cataloging the suppressed books took much longer than anticipated. Many of the suppressed books still had their call numbers and spine labels intact, so we reused them; however, a number of books had to be reclassified. We first noticed some inconsistencies in how reprints had been cataloged in the previous system. According to NLM's Shelflisting Procedures for Monographs and Classed Serials, the call number for a reprint should use the year of the original publication followed by the letter "a" [15]. We were able to correct this in WMS and on the spine labels to differentiate this material from originals in the collection.

We also noticed that previous catalogers had not been consistent in following NLM's Classification Practices for the nineteenth century schedule. The schedule consists of "A simplified subject classification derived from the letters that represent the preclinical and clinical subjects used for nineteenth century (1801–1913) monographs" [16]. This includes classification notations W1–6, W600, and WX2, as well as the entire WZ schedule for the History of Medicine. It was unclear whether previous catalogers entered a zero where a blank should have been for the classification number, or whether the integrated library system did not allow blanks and forced a zero. A blank classification number (or the zero) should shelve before actual numbers, but library staff seemed to have trouble shelving these items because they were mixed throughout the call number range. Previous catalogers were also inconsistent in using the schedule, so earlier editions of titles were not always classified together.
We asked the cataloging community how they dealt with the schedule, via a Facebook cataloging group and by directly emailing the Cleveland Clinic Foundation Library [17]. While there were not many responses, the schedule seemed to be used mainly for rare book collections, and librarians saw no reason not to use the regular schedule in place of the nineteenth century schedule. Because new labels had to be printed anyway, we decided not to use the nineteenth century schedule, in the hope that this would promote findability on the shelf as well as make shelving easier for library staff.

Processing and preservation

Processing these books was also a concern. While we wanted to make the books findable, we also wanted to protect them and limit the amount of processing needed, due to the age and fragility of the material. Because we were integrating these books with books already on the shelf in offsite storage, we decided to continue the processing practice already in place, including spine labels covered with a label protector and property stamps.

Many of the books were damaged and needed repairs before being shelved, but our library has limited preservation expertise and resources. We were able to repair corners and loose spines using polyvinyl acetate (PVA) glue, waiting for the glue to dry, and then testing the repair. If a repair was not successful, the repair process began again. For the books that we could not repair, we made book boxes to better protect them on the shelf. We repaired around 100 books and made boxes for about 40.

We also discovered around 35 books affected by leather or red rot, meaning the leather was decaying and turning into powder. We are considering using a product to consolidate the leather on the covers, but we are not sure how to use the product, and the library might not have the appropriate ventilation. If we are not able to use the product, we will make boxes for these books. An additional benefit of moving these books to offsite storage is that the location is temperature and light controlled and, therefore, a better environment for preservation.

Shelving

Before shelving the material, we wanted to ensure that we had room for the collection to grow. To calculate the potential growth of the collection, we created a spreadsheet that listed the call numbers of the books that had been suppressed and those of the books currently in the offsite shelving location. We added these together to show which call number classes had the most material. WB and QV were the highest, followed by WH–WK, and so on. We left space at the end of the call number ranges that had the most items, in the hope that other uncataloged older material could be added to this shelving location.
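That spreadsheet arithmetic is easy to reproduce in a few lines of code. Here is a minimal sketch of the tally, assuming the combined call numbers sit in a plain text file, one per line (the file name is hypothetical); it simply counts volumes per NLM classification letters so the busiest classes can be given growth space.

import re
from collections import Counter

# Minimal sketch: tally call numbers by NLM class (the leading letters,
# e.g. "WB 100 ..." -> "WB") to see which classes need growth space.
counts = Counter()
with open('suppressed-and-offsite-call-numbers.txt') as f:
    for line in f:
        match = re.match(r'\s*([A-Z]+)', line)
        if match:
            counts[match.group(1)] += 1

for nlm_class, total in counts.most_common():
    print(f'{nlm_class}\t{total}')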
RESULTS

To make this project possible, careful planning, strong communication, and teamwork were imperative. The team included the collection management librarian, who made selection decisions; the cataloger, who cataloged material and created a project plan for shifting and shelving material; another technical services employee, who repaired material and created boxes; and a student employee, who transported material from the offsite storage facility to the library for review, shelved cataloged materials, and integrated them with the existing collection.

We added 862 titles that met the criteria for selection to the catalog and to the new offsite storage location. We shifted 970 linear feet to integrate the additional 144 linear feet of selected, formerly suppressed books. The project took approximately one year to complete, including two to three months of preparation time to determine the process and evaluation criteria. Team members worked on the project as time allowed, while managing day-to-day work and other projects. While we did not have a clear time frame in place for completing the project, it took longer than we expected due to other projects and staff changes.

DISCUSSION

The main goal of this project was to re-catalog the books so that they would be discoverable in our catalog/discovery interface, WorldCat Discovery. Now that the project is complete, our users have access to a collection of materials that they would not have known about previously, and they have a way to request access to the materials through our catalog's hold system. Despite the books being more discoverable, we have not seen an increase in use of the collection.

The collection is housed in our offsite storage to alleviate space issues in our main collection. The downside of the offsite location is that it makes the collection noncirculating; the only way someone can access it is to contact the library and make arrangements to visit and view the material. We had not received any such requests as of the time this article was written. One of our goals for the future is to explore ways to market and promote the use of our historical book collection.

When we started this project, we discussed creating a new shelving location for the historical books and new loan rules. We debated dividing the historical books into two locations, one circulating and one noncirculating. While many of the historical books were rare or in poor condition and should not circulate, others were in good condition, and we felt circulation was appropriate for them. However, it became too complicated to determine what could and could not circulate and how to shelve materials in two locations, so we created only one, noncirculating shelving location. This decision may need to be revisited in the future to promote use. On the other hand, many of the titles are now also available electronically through the HathiTrust records that were added, so users may be opting to use the titles electronically.

Now that we have completed the review of the suppressed titles, we plan to use the review criteria and methodology to deal with other uncataloged books in our offsite storage. The majority of these uncataloged books are donations that were never processed into the collection due to lack of staff. As at the Cleveland Clinic Foundation Library, the suppressed book project allowed us to test our criteria and process so that we can more easily review the other uncataloged books as time allows [8].

Many health sciences libraries do not have positions dedicated to special collections and archives but may still maintain historical collections. Additionally, because the majority of users in health sciences libraries are interested in current information, librarians may not be fully comfortable assisting users with queries about the history of medicine.
Despite many libraries having this issue, there is not a large body of literature on this topic or sufficient resources specific to health sciences historical collections. By sharing our decision criteria and process, we hope to provide a helpful resource to other libraries with similar collections of historical books.

REFERENCES

1. Flannery MA. Building a retrospective collection in pharmacy: a brief history of the literature with some considerations for US health sciences library professionals. Bull Med Libr Assoc. 2001 Apr;89(2):212–21.
2. Chaplin S. The medical library is history. RBM J Rare Books Manuscr Cult Herit. 2014 Sep 1;15(2):146–56. DOI: http://dx.doi.org/10.5860/rbm.15.2.427.
3. Richards DT, McClure LW. Selection for preservation: considerations for the health sciences. Bull Med Libr Assoc. 1989 Jul;77(3):284–92.
4. Nasea MM, Moskop RMW. Preparing to honor the past in the future: collection development in the history of the health sciences. Against the Grain. 2013 Nov 4;20(5):article 9. DOI: http://dx.doi.org/10.7771/2380-176X.5187.
5. Reznick JS. Embracing the future as stewards of the past: charting a course forward for historical medical libraries and archives. RBM J Rare Books Manuscr Cult Herit. 2014;15(2):111–23. DOI: http://dx.doi.org/10.5860/rbm.15.2.424.
6. Greenberg SJ, Gallagher PE. History of the health sciences: MLA BibKit #5. 2nd rev ed. Chicago, IL: Medical Library Association; 2002.
7. Association of College & Research Libraries (ACRL). Guidelines on the selection and transfer of materials from general collections to special collections. 4th ed. The Association; 2016.
8. Schleicher MC. Assembling selection criteria and writing a collection development policy for a variety of older medical books. J Hosp Librariansh. 2010 Jul 28;10(3):251–64. DOI: http://dx.doi.org/10.1080/15323269.2010.491424.
9. Association of College & Research Libraries (ACRL), Rare Books and Manuscripts Section (RBMS). ACRL RBMS guidelines on the selection and transfer of materials from general collections to special collections. The Association; 2015.
10. Hatfield AJ, Kelley SD. Case study: lessons learned through digitizing the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research Collection. J Med Libr Assoc. 2007 Jul;95(3):267–70. DOI: http://dx.doi.org/10.3163/1536-5050.95.3.267.
11. Overmier J, Mueller MH. Collection development policies and practices in medical school rare book libraries. Bull Med Libr Assoc. 1984 Apr;72(2):150–4.
12. Morton LT, Norman JM. Morton's medical bibliography: an annotated check-list of texts illustrating the history of medicine (Garrison and Morton). 5th ed. Aldershot, Hants, England, UK: Scolar Press; 1991.
13. Norman J. History of medicine and the life sciences. HistoryofMedicine.com; 2018.
14. Norman J. Bibliographical evolution, or transforming a static printed bibliography into a growing interactive website: a progress report. HistoryOfMedicine.com; 2018.
15. US National Library of Medicine. Shelflisting procedures for monographs and classed serials. The Library; 2014.
16. US National Library of Medicine. NLM classification practices. The Library.
17. Troublesome catalogers and magical metadata fairies [Facebook group].

AUTHORS' AFFILIATIONS

Karen R. McElfresh, AHIP,* kmcelfresh@salud.unm.edu, https://orcid.org/0000-0003-1404-9764, Health Sciences Library and Informatics Center, University of New Mexico, Albuquerque, NM

Robyn M. Gleasner, rgleasner@salud.unm.edu, Health Sciences Library and Informatics Center, University of New Mexico, Albuquerque, NM

Received January 2019; accepted April 2019

* Current contact information: kmcelfresh@rice.edu, Fondren Library, Rice University, Houston, TX.

Articles in this journal are licensed under a Creative Commons Attribution 4.0 International License. This journal is published by the University Library System of the University of Pittsburgh as part of its D-Scribe Digital Publishing Program and is cosponsored by the University of Pittsburgh Press. ISSN 1558-9439 (Online)

2019 I-SPIE Conference
Explorations and Adventures: Mapping our Future
July 18–19, 2019 at CSU, Fullerton

Presentations

Keynote: Char Booth
Information Privilege: Access, Advocacy, and the Critical Role of Libraries

In the world of higher education and libraries, there is a growing awareness of widespread inequality in information access. Information privilege is the concept that some enjoy greater access to information than others based on a variety of sociocultural factors such as social class and institutional affiliation.
Information privilege creates an inherent barrier to knowledge, yet libraries and librarians hold a fundamental value of open information - and no group within libraries is more aware of the constraints of information privilege than those who work in resource sharing. This keynote will examine the role of libraries in a complex information ecosystem, question the obligations we have to our communities to challenge information privilege, and offer strategies that libraries can pursue to facilitate greater information equity.

Presenter: David Walker
Now I Get It! Customizing and Simplifying the Get It Menu in Alma/Primo

Alma's display of physical holdings and related request options within the Primo 'Get It' menu is rather complex. It appears to be optimized for large research institutions, leaving other types of institutions with an interface that is not ideal for their (often simpler) needs. This presentation will showcase work by the Discovery UResolver Task Force to replace the Get It menu in Primo with one of our own design, providing a streamlined display of holdings and a simplified resource sharing request option. David will explain the problems our users faced with the default Alma UResolver interface, the group's exploration of options to improve the user experience, the work they did to implement this customization, and some considerations for the future.

Presenter: Meghann Brenneman
IDS Workflows at Humboldt

A review of the IDS workflows and the benefits of IDS implementation at Humboldt, with an evaluation of the new and old workflows and the impact IDS may have at your library.

Presenters: Resource Sharing Functional Committee Members
ULMS Resource Sharing Functional Committee Overview

The RSFC will provide an overview of the projects and tasks they have worked on during the prior year. Accomplishments, achievements, and ongoing goals will be discussed, and new members of the RSFC for the upcoming year will be introduced.

Presenter: Meghann Brenneman
ILL Data Resources and You

How do you become an ILL data expert? Through a review of data assessment resources, attendees will have what they need to bring their ILL data into conversations at their libraries.

Presenter: Mike Richins
RapidILL – Current Projects and Future Directions

This session will provide an update on RapidILL, including current projects and future directions. We will look at RapidR (Returnables) and its ability to facilitate efficient monograph sharing for both consortial and non-consortial Rapid partners using smart lender identification and custom routing. Information on Project Bedrock, a new ILL request management platform, will also be shared.

Presenters: Mark Bilby and Greg Witmer
What Does Open Access Mean for the Future of Interlibrary Lending?

Open Access can mean a lot of things to a lot of people. On the technological side, it can simply mean immediate access to content, whether legally or not, via the internet, social media, or cell phones. On the legal side, it can refer to massive shifts happening in Open Access policies or big deal negotiations with publishers. 2020 is sizing up to be an especially pivotal year, as countries and universities are increasingly playing hardball through major OA initiatives such as OA2020 and Plan S. What does all this mean for Interlibrary Lending going forward? As more and more content becomes not only digital, but also legally open and instantly accessible, how will this impact the role of Interlibrary Loan?
In this presentation, we will talk about currents in Open Access and their implications for collection development, systems integration, discovery tools and workflows, and even information literacy. What we envision is that Interlibrary Loan staff will be called to focus increasingly on user service and interaction, and in the process will have a myriad of opportunities to be advocates and transformational agents who make academic information open.

Presenter: Tony Landolt
Article Galaxy Scholar - Your Collection Development Safety Net

The Article Galaxy Scholar Collection, powered by Reprints Desk, is tailored to help academic institutions supplement subscriptions and interlibrary loan (ILL) service to reduce spending and meet the immediate research needs of faculty and graduate students. Learn how Reprints Desk is making it easier for library staff to manage on-demand access for selected patrons or to specific journals while controlling spending and reducing manual mediation time.

Presenters: Joe Adkins and Allyce Ondricka
Reduce Your ILL Copyright Fees

Everybody has to pay ILL copyright fees. But have you ever wished you could pay less, so that your library could better spend the money elsewhere? If so, Joe Adkins and Allyce Ondricka will show you how to use a variety of tools, including Open Access and comparison shopping, to reduce your ILL copyright fees.

Presenters: Ann Roll and Mike De Mars
A Floating Collection for the CSU: One Step Further Toward Collaborative Collection Development and Management

Thanks to the options provided by the ULMS implementation and CSU+, the CSU is ready to explore further opportunities for collaboration. This presentation proposes the possibility of a floating collection for the CSU libraries. In a floating collection, individual items have no fixed home. When one CSU campus lends an item to a user at another CSU campus, upon the user's return of the item, the item would be shelved at the borrowing institution rather than being returned to the lending campus. This ensures that books reside where they are being used but have the flexibility to relocate as needs change. The presentation will respond to the recommendations of the Systemwide Committee on Print Management (SCOPM) and will discuss the floating collection currently in place at the Pennsylvania State University system.

Presenter: Jenny Rosenfeld - OCLC
OCLC Resource Sharing Update

Learn about what's new with OCLC's resource sharing products, with a special focus on Tipasa and all the ways it now integrates with Alma. You'll hear about Tipasa's OPAC integration with Alma for availability and shelving location lookup, as well as the NCIP integration with Alma for circulation. OCLC partnered with several libraries to test this integration from December 2018 to March 2019 and then launched NCIP integration with Alma for all current and implementing Tipasa/Alma libraries in April 2019. We will also review updated road maps to show you what to expect from OCLC's resource sharing products in the upcoming year.

Presenter: Ex Libris
Article Processing / Resource Sharing Next Gen Solution / Product Roadmap

Abstract forthcoming.

Moderators: I-SPIE Conference Planning Committee
Future Directions of I-SPIE Discussion

Conference attendees will have the opportunity to discuss how I-SPIE can best help and serve the resource sharing community. Is I-SPIE's current charge still accurate and appropriate in our changing environment? Are all the I-SPIE committees still vital and necessary?
What workflows and functions can I-SPIE focus on to help your library and the resource sharing community flourish?

CRIS & Visibility of Datasets
Dr. Reingis Hauck | #IDCC19

Slide 2: Benefits of Visibility
- Visibility
- Reuse & citation
- Increase in motivation for data curation

Slide 3: What is a CRIS?
- Interlinked metadata on: researcher, publications, datasets, equipment, projects, activities
- Multifunctional use for: reports & research assessment, showcasing research in web portals

Slide 4: CRIS & Datasets - Current Status
- Only 16% interoperability between data repositories and CRIS
- Less than 10% of CRIS web portals have datasets as content type
- Caused by: weak data policies, missing incentives, technical challenges

Slide 5: Generic Solution for Metadata Import of Data Sets into CRIS
[Diagram: metadata flow from data sources into the CRIS (Pure)]

Slide 6: Where are we now?
- Integration of DataCite with Pure is competing with other requirements for CRIS development and with commercial interests (Mendeley Data)
- We are still dedicated to this solution!

Slide 7: Conclusions
- CRISs are important for the visibility of datasets
- Get in touch with your local CRIS team
- We need to integrate DataCite with Pure
@rsilvia77 #DataciteandCris #DataciteandPure
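The "generic solution" the deck outlines boils down to harvesting dataset metadata from DataCite and mapping it onto CRIS records. Below is a minimal sketch of that idea, assuming the public DataCite REST API (api.datacite.org/dois); the affiliation query string is hypothetical, and the flat dictionary stands in for whatever record structure the local CRIS (for example, Pure) actually expects.

import json
import urllib.request

# Minimal sketch: harvest dataset metadata from the public DataCite REST
# API and map it to flat records a CRIS import could consume.
# The affiliation query below is a hypothetical example.
url = ('https://api.datacite.org/dois'
       '?query=creators.affiliation.name:*hannover*'
       '&resource-type-id=dataset&page%5Bsize%5D=25')

with urllib.request.urlopen(url) as response:
    payload = json.load(response)

cris_records = []
for doi in payload.get('data', []):
    attrs = doi['attributes']
    cris_records.append({
        'doi': doi['id'],
        'title': attrs['titles'][0]['title'] if attrs.get('titles') else '',
        'year': attrs.get('publicationYear'),
        'creators': [c.get('name') for c in attrs.get('creators', [])],
        'publisher': attrs.get('publisher'),
    })

print(f'{len(cris_records)} dataset records ready for CRIS import')

From there, the records can be pushed through whatever bulk-import route the local CRIS offers; that final step is what the deck's diagram abstracts away.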
Türk Kütüphaneciliği, 7(4) (1993), 303-309

NEWS (AUTUMN 1993)

OCLC USES ROBOTS

Robots, introduced to reduce data storage costs, store tapes in the two tape libraries in the computer room and retrieve and load requested tapes. Each tape library holds 6,000 cartridges loaded with 1,200 gigabits of information. The robot arm placed at the center of each library can load a requested tape in 11 seconds. According to the announcement, OCLC will thus be able to load large full-text and image files on demand. This development is thought to have significant implications for reference services and electronic publishing. (OCLC Newsletter, No. 204, July/August 1993, p. 7)

COMLIS IV

The Fourth International Congress of Muslim Librarians and Information Scientists will convene in Islamabad, Pakistan, in the spring. The Congress's theme is "Major issues in librarianship and information services in Muslim countries." Two types of submissions will be accepted: national library and information services profiles, and papers. The paper topics have been set as: manpower training, information technology applications suited to Islamic literature, full-text Quran and Hadith databases, document delivery among Muslim countries, development of Islamic literature databases, and a model national library and information services policy. As will be recalled, the third COMLIS Congress was held in Turkey in 1989.

IFLA 1993 BARCELONA

IFLA's 59th Council and General Conference was held in Barcelona on 20-28 August. The following colleagues from Turkey took part in the activities you have followed in the pages of our journal: Tuncel ACAR, Milli Kütüphane Başkanlığı; YILMAZ ADNAN, Anadolu Üniversitesi; Hatice Ulya AKBAYRAK, Yıldız Teknik Üniversitesi; Mustafa AKBULUT, Yüksek Öğretim Kurulu Dokümantasyon Merkezi; Asım AKKAYA, Gümrükler Genel Müdürlüğü; Neşe ARAT, Uludağ Üniversitesi; Gülgün ARIGÜMÜŞ, Türkiye Ziraat Bankası Eğitim Organizasyonu Müdürlüğü; Selma ASLAN, Türk Kütüphaneciler Derneği; Gülgün BOSTANCI, İstanbul Üniversitesi; Hilmi ÇELİK, TBMM Kütüphane ve Dokümantasyon Dairesi; Nevzat ÇOLAKOĞLU, Kültür Bakanlığı Bütçe Dairesi Başkanlığı; Özkan DERBEND, Kongresist; Hasan DUMAN, Kültür Bakanlığı; Esra FINDIK, Bilkent Üniversitesi; Oya FİŞEKÇİ, Gazi Üniversitesi; Seylan GÖKSEL; Adile GÜNDEN, YÖK Dokümantasyon Dairesi; Figen İŞÇİ, Boğaziçi Üniversitesi; R. Serdar KATIPOĞLU, Orta Doğu Teknik Üniversitesi; Ayten KORAN, Marmara Üniversitesi; Sevgi KORKUT, TBMM Kütüphane ve Dokümantasyon Dairesi; Serap KURBANOĞLU (at her own expense); Firdevs NURCAN, Kongresist; Lale ÖZMUMCU, TBMM Kütüphane ve Dokümantasyon Dairesi; Nuran RAMAZANOĞLU, Kongresist; Altınay SERNİKLİ, Milli Kütüphane Başkanlığı; Sezen TAN (at her own expense); Leyla TEKİN, Akdeniz Üniversitesi; Hansın TUNÇKANAT, Hacettepe Üniversitesi; Hülya ÜNAL, Merkez Bankası; Gökçin YALÇIN, Kütüphaneler Genel Müdürlüğü; Davut YILDIRIM, Kongresist.

SUCCESSFUL PUBLIC-SECTOR MANAGERS TO BE REWARDED

The magazine Ekonomik Trend has drawn up a list of candidates to identify successful bureaucrats who work with dedication in the public institutions they manage. Prof. Dr. Tülin Sağlamtunç was included on this list for her efforts to move libraries, among public social institutions, beyond being "institutions for show" by expanding them nationwide, and to enrich these cultural institutions with the works of contemporary Turkish writers.

NATIONAL LIBRARY COMPUTER SYSTEM OPENED TO SERVICE

The National Library (Milli Kütüphane), completing this year the work it began in 1987 to establish a computer-based information system, opened the system to users on 18 October 1993 with a ceremony attended by Acting Prime Minister Necmettin CEVHERİ, Minister of Culture D. Fikri SAĞLAR, and other distinguished guests. The system is one of the fastest and highest-capacity systems in the country and is capable of growing to meet the services the Library plans for the future. With the system, users now have local and remote access to the National Library's collection.

NATIONAL LIBRARY PRINTING HOUSE OPENED

The National Library Printing House, which reflects the latest technology and can print all publications to be issued by the Ministry of Culture and the National Library, was opened to service on 18 October 1993 by Minister of Culture D. Fikri SAĞLAR.

IFLA PRESIDENT AND SECRETARY GENERAL VISIT TURKEY

IFLA President Robert WEDGEWORTH and Secretary General Leo VOOGT visited Turkey on 7-12 December 1993 at the invitation of the Executive Committee of the 61st IFLA General Conference. The President and Secretary General held various discussions with Executive Committee members about the 1995 IFLA General Conference to be held in Istanbul, visited historical and tourist sites in Ankara and Istanbul, and inspected the venues where the General Conference will take place.

NUMBER OF PUBLIC LIBRARIES RISES TO 1,107

The General Directorate of Libraries is working to increase the number of public libraries rapidly across the country so that everyone can benefit from library services.
In this context, during the 1993 fiscal year the following libraries were opened to service through the Investment Program: Şanlıurfa-Suruç, Şanlıurfa-Hilvan, Antalya-Manavgat, Afyon-Dinar, Antalya-Tekelioğlu, Erzincan-Kemaliye, Burdur Provincial Public, Kars-Sarıkamış, Diyarbakır-Silvan, Amasya-Gümüşhacıköy, Konya-Sarayönü, Ordu-Fatsa, Muş-Bulanık, and the Tokat Provincial Public Library.

In addition, as a result of cooperation between the Ministry of Culture and local governments, the following libraries, whose buildings and staff are provided by local governments, were opened: the Karaman-Ermenek-Görmeli, Amasya-Merkez-Uygur, Niğde-Merkez-Bağlama, Kahramanmaraş-Türkoğlu-Yeşilyöre, Konya-Yunak-Yukarıpiribeyli, Afyon-Merkez-Işıklar, Afyon-Dinar-Haydarlı, Afyon-Sincanlı-Tınaztepe, İçel-Tarsus-Huzurkent, Ankara-Bâlâ-Kesikköprü, İçel-Gülnar-Konur, Malatya-Merkez-Dilek, Hatay-İskenderun-Karaağaç, Konya-Kulu-Zincirlikuyu, Muğla-Ortaca-Dalyan, Kahramanmaraş-Beyoğlu, and Kayseri-Merkez-Fevzioğlu public libraries, and the Tekirdağ-Saray, Tekirdağ-Çerkezköy, Çankırı-Ovacık, Kastamonu-Azdavay, Çanakkale-Lapseki, and Manisa-Köprübaşı district public libraries, bringing the total number of libraries to 1,107.

LARGE NUMBERS OF CURRENT BOOKS AND PERIODICALS BEING ACQUIRED FOR PUBLIC LIBRARIES

Increasing interest in public libraries, raising the quality of service, and drawing more users into the library are possible only if collections are kept current and continuously replenished. With this in mind, in 1993 the Ministry of Culture's publication selection committee subscribed 150 large libraries to 7 daily newspapers and 790 libraries to 2 daily newspapers, took out subscriptions to 94 periodicals including the Official Gazette, and has so far purchased 608,731 copies of 2,181 different book titles.

IN-SERVICE TRAINING SEMINARS IN HACIBEKTAŞ AND MARMARİS

In 1993, a "Public Librarianship Seminar" for 30 provincial and district public library directors or officers-in-charge from the Central Anatolia Region was held on 26-30 April in the Hacıbektaş district of Nevşehir. In the same vein, a second seminar was organized by the General Directorate of Libraries on 1-5 November 1993 in Marmaris for 44 directors or officers-in-charge of public libraries in the Eastern and Southeastern Anatolia Regions. The seminar proved useful for informing library staff about delivering quality library service in these regions, keeping them abreast of new developments in librarianship, and creating a forum in which problems concerning libraries and librarianship could be discussed and solutions sought.

LIBRARIANS BEING SENT ABROAD

To follow professional developments closely and to help them improve their foreign-language skills, Müfit TAŞ, a librarian at the General Directorate of Libraries, was sent to the USA, and Figen BALABAN, librarian at the Kocaeli-Gölcük District Public Library, was sent to the Netherlands. Arrangements are under way to send Berrin ACAR, Director of the Ankara Cebeci District Public Library, Harun Reşit KARADEMİR, Deputy Director of the Yenimahalle District Public Library, and Velhan GÜCER, librarian at the Erzincan Provincial Public Library, to England in December 1993. In addition, within the framework of activities abroad, librarians took part in the IFLA meeting in Barcelona, the Europalia Festival meeting in Belgium, the European writers' meeting and the Frankfurt Book Fair in Germany, the ISBN Agency Consultation Meeting in Budapest, and the meeting in Russia to draw up a Cultural Cooperation Protocol.
FIRST PUBLIC LIBRARIANSHIP SYMPOSIUM
Technological developments and information technology are causing fundamental change and restructuring in librarianship, as in every other field, both worldwide and, accordingly, in our society. In Turkey, which is striving to catch up with the information age and to become an information society, public libraries will be able to fulfil the role that falls to them only if they can keep pace with this rapid change. The General Directorate of Libraries is the natural organiser when it comes to keeping public library staff informed of these innovations, so that accumulated professional knowledge is passed on without losing touch with technological and social developments and users are served at the highest level. Proceeding from this fact, the national "First Public Librarianship Symposium" was held in Ankara on 29 November - 1 December 1993, with scholarly papers presented by specialists from the library science departments of the universities and from various libraries. Meanwhile, under the bilateral cultural exchange programme signed between Turkey and England, two specialists in public librarianship visited Turkey between 29 November and 5 December: Ms Jennifer SHEPHERD lectured on "Resource Building in Public Libraries" at 10.00 on 29 November 1993, and Mr Peter OLDROYD on "Extending Public Library Services" at 14.00 the same day. The coincidence of the symposium with this visit also helped create new opportunities in public librarianship between the two countries and contributed to a mutual exchange of professional experience. The symposium took up the importance of public libraries in economic and socio-cultural development; the latest developments in public librarianship were examined and debated at a scholarly level and solutions were sought. Staff from 105 public libraries across the country attended the "First Public Librarianship Symposium", and 26 people presented papers.
SCHOLARLY WORK COMPETITION CONCLUDED
The scholarly work competition on "Public Libraries and Freedom of Thought in the 70th Year of Our Republic", organised by the General Directorate of Libraries as part of the celebrations of the 70th anniversary of the founding of the Republic, has been concluded. In the evaluation made by a jury chaired by Prof. Dr. Özer SOYSAL and composed of Prof. Dr. Tülin SAĞLAMTUNÇ, Prof. Dr. Mustafa AKBULUT, Prof. Dr. Nilüfer TUNCER and Prof. Dr. Bengü ÇAPAR, no work was found worthy of first prize. The joint work of M. Tayfun GÜLLE and Zafer KIZILKAN was awarded second prize, and the work of Handan GÜÇLÜOĞLU third prize. The second prize of TL 25,000,000, the third prize of TL 15,000,000 and plaques were presented to the winners by Minister of Culture Fikri Sağlar on 29 November 1993, the first day of the "First Public Librarianship Symposium".
PROFESSIONAL PUBLISHING ACCELERATED
So that staff working in public libraries and newly trained young librarians can follow professional developments, refresh their knowledge and adapt to innovations, the General Directorate of Libraries prepares professional publications and reprints those that are out of print. In this context, the 4 professional publications listed below will also be printed in 1993.
The books to be printed are:
- Bülent Yılmaz's "Okuma Alışkanlığında Halk Kütüphanelerinin Önemi" (The Importance of Public Libraries in the Reading Habit)
- "Osman Ersoy'un Makaleleri" (The Articles of Osman Ersoy)
- Bülent Yılmaz, Aytaç Yıldızeli, Serap Narinç and Oya Gürdal's "Türk Kütüphaneciliği Dizini" (Index of Türk Kütüphaneciliği)
- Prof. Dr. Özer Soysal's "Türk Kütüphaneciliği II: Geleneksel Yapıda Yeniden Yapılanış" (Turkish Librarianship II: Restructuring within the Traditional Framework)
With these, the number of our professional publications will reach 25.
STATISTICAL RESULTS FOR 1993 DETERMINED
Public libraries: number of libraries 1,107; number of books 10,027,738; number of readers 12,413,009; number of books lent 2,086,189. Mobile libraries: number of mobile libraries 66; number of neighbourhoods served 1,073; number of books 139,476; number of registered members 108,135; number of books lent 314,812.
LARGE QUANTITIES OF FURNISHINGS SENT TO PUBLIC LIBRARIES
The General Directorate of Libraries had furnishings made and sent to existing and newly opened libraries, as listed below:
- Adana-Ceyhan-Kösreli Public Library: 4 large reading tables, 24 large reader chairs, 5 wooden bookshelves.
- Afyon-Ballık Public Library: 3 large reading tables, 12 large reader chairs, 3 wooden bookshelves.
- Afyon-Kadıhlar Public Library: 3 large reading tables, 12 large reader chairs, 3 wooden bookshelves.
- Afyon-Kırca Public Library: 5 large reading tables, 20 large reader chairs, 4 large wooden bookshelves, 2 medium wooden bookshelves.
- Ankara-Altındağ Public Library: 16 large reading tables, 96 large reader chairs, 4 rectangular children's tables, 24 children's chairs, 12 large wooden bookshelves, 2 periodicals shelves, 3 card catalogue cabinets.
- Ankara-Kızılcahamam Public Library: 10 large reader chairs.
- Ankara-Polatlı Public Library: 6 wooden bookshelves, 3 periodicals shelves.
- Ankara-Yenimahalle Public Library: 2 card catalogue cabinets.
- Bayburt provincial and district public libraries: 5 large reading tables, 20 large reader chairs, 5 rectangular children's tables, 24 children's chairs, 15 large wooden bookshelves, 5 medium wooden bookshelves, 1 card catalogue cabinet.
- Edirne Provincial Public Library: 64 shelves, 2 catalogue cabinets.
- Erzincan-Kemaliye Public Library: 45 shelves, 4 catalogue cabinets.
- İçel-Erdemli Public Library: 35 shelves, 15 tables, 70 chairs, 2 card catalogue cabinets.
- İzmir-Aliağa-Yenişakran Public Library: 8 large reading tables, 32 large reader chairs, 2 round children's tables, 8 children's chairs, 4 large wooden bookshelves, 2 medium wooden bookshelves, 1 card catalogue cabinet.
- İzmir-Çiğli-Sasalı Public Library: 5 large reading tables, 20 large reader chairs, 2 rectangular children's tables, 8 children's chairs, 2 large wooden bookshelves, 4 medium bookshelves.
- Hatay provincial and district public libraries: 25 large reading tables, 100 large reader chairs, 6 rectangular children's tables, 24 children's chairs, 12 wooden bookshelves, 22 medium wooden bookshelves.
- Kayseri-Develi-Şıhlı Public Library: 6 large reading tables, 24 large reader chairs, 2 rectangular children's tables, 8 children's chairs, 4 large and 3 medium bookshelves.
- Nevşehir Provincial Public Library: 6 wooden bookshelves, 1 card catalogue cabinet.
- Nevşehir-Hacıbektaş Public Library: 6 large reading tables, 24 large reader chairs, 4 rectangular children's tables, 16 children's chairs, 6 large and 4 medium wooden bookshelves, 1 card catalogue cabinet.
- Sakarya-Akyazı Public Library: 20 shelves, 14 tables, 56 chairs.
- Sakarya-Hendek Public Library: 10 large and 10 medium wooden bookshelves, 1 periodicals shelf.
- Uşak-Banaz Public Library: 8 large reading tables, 32 large reader chairs, 10 large and 5 medium wooden bookshelves, 1 card catalogue cabinet.
- Yozgat-Doğankent Public Library: 8 large reading tables, 32 large reader chairs, 5 large and 3 medium wooden bookshelves.
İBRAHİM KARAER APPOINTED DEPARTMENT HEAD
Dr. İbrahim Karaer, Deputy President responsible for Administrative Affairs on the Central Executive Board, has become Head of the Documentation Department at the Prime Ministry's General Directorate of State Archives. We congratulate Mr Karaer and wish him continued success.
NAFİZ ÜNLÜ HAS DIED
Nafiz Ünlü, retired director of the Murat Molla Public Library in the Fatih-Çarşamba quarter of Istanbul, died on 2 October 1993 and was laid to rest on 4 October 1993. May he rest in peace; our condolences to his grieving family.
work_sk7dpvfc6rch7bjgx65pvye7eq ---- OCLC Research Publications Repository
D-Lib Magazine, March 2005, Volume 11 Number 3, ISSN 1082-9873
OCLC Research Publications Repository
Shirley Hyatt and Jeffrey A. Young, OCLC Research, OCLC Online Computer Library Center Inc. {hyatts, jyoung}@oclc.org
Introduction
Conducting research on behalf of the library community has, for 27 years, been part of the mission of OCLC, and we believe that if research is worth doing, it's worth telling people about. Making our research outcomes known is integral to our mission, and we do this both by writing about it and by demonstrating it. Accordingly, in September 2003, OCLC's Office of Research set about to create a repository of its staff's publications. The goal of the repository was to maximize the visibility, usage, and impact of OCLC's research output. A secondary goal was to make the publication section of the OCLC Research web site more nimble. (This portion of the site had previously been one long straggle of citations, not particularly searchable in a helpful manner.) A third, de facto, goal was to demonstrate OCLC Research technologies. The publications repository achieved all of these goals. The site was launched in June 2004, and by November 2004 it included metadata for all OCLC Research's 900+ publications and full text for half of them. The project was completed with .25 FTE, which included project management, cataloging labor, copyright permissions handling, scanning, archiving, and one-time development time. The OCLC Research repository primarily uses lightweight, no- or low-cost, easy-to-implement technologies [Note]. It doesn't take a rocket scientist to implement this sort of repository; indeed, we challenged a developer to implement a new repository from scratch, to discover how long it would take now that we'd completed development of the components. It took him just 1 hour. The purpose of this article is to describe the repository and provide links to additional information about the technical components used.
Content of the Repository
The OCLC Research repository contains works produced, sponsored, or submitted by OCLC Research. In general, the works are research-oriented and are in the subject area of library and information science. Many items describe OCLC Research projects, activities, and programs and were originally published by OCLC, while others are from peer-reviewed scholarly journals. The repository contains MARC metadata about publications and, whenever available and permitted, a link to the full digital text of items described.
Though the repository will always be under construction in the sense that our authors will continue to publish, the repository is presently up-to-date.
Access to the Content (or, Good Bang for the Buck)
The records in the repository are available for browser-based searching on the OCLC Research Web site [1]. The records are available for automated searching via the SRW/U (Search Retrieve Web Service / Search Retrieve URL) interface [2]. The metadata records are available from an Open Archives Initiative (OAI) v2.0 repository, and we encourage harvesting them [3]. The MARC records are in WorldCat [4], and are searchable in FirstSearch [5] and search engines participating in OCLC's Open WorldCat Program [6]. Full text of articles is linked from the metadata record wherever permitted by the copyright holder. The repository also has an RSS [7] feed and supports OpenURLs [8].
Figure 1. The variety of interfaces under which items in the repository are available. We searched for Lorcan Dempsey's "Pick up a Portal" article.
How It All Happens: Architecture & Data Flow
WorldCat was and is our starting point; we use WorldCat as a "registry" for our holdings. That said, the architecture used is quite modular and could be reconfigured to accept catalog records from a variety of sources. We chose to use WorldCat because we found that WorldCat provides more bang for the buck: having records in WorldCat ensures that the records are accessible in FirstSearch and by major search engines via the Open WorldCat Program, as well as all the other access points our architecture provides. In our architecture, all of the other access flows virtually automatically from the act of cataloging an item into WorldCat. WorldCat has its own (optional) metadata creation utility (Connexion), and there are no cataloging transaction costs to enter original records into WorldCat. This workflow also fits neatly into existing library workflows and requires no additional training. In essence, the cost of the project is the cost of labor of cataloging the items into WorldCat (and this need not be full MARC) [Note]. First we catalog our publications into WorldCat. For practicality's sake we catalog our preprints using some minimal level of cataloging (level K—MARC or level 3—DC), with an 856 ‡3 field for the preprint. We then build onto this record as the document is published by adding additional 856 ‡3 fields for the postprint. (Acknowledging the experience of many institutional repositories that it is unlikely authors would have the time or inclination or habit to catalog their own work, we used professional catalogers and, where needed for older materials, professional scanners for our project.) We opted to use OCLC Connexion as our metadata creation utility, but any system would work. Our repository records are identified by a holding symbol unique to the project. In the 25 years of OCLC Research's existence, OCLC had already lost a handful of items (i.e., we have citations for originally paper-based items, for which even the authors have no copies), and so where we own copyright or have permission from the copyright holder, we store full text of documents in the OCLC Digital Archive [9]. This too is optional to our architecture. Using Z39.50 [10] access, we then run an automated script that ports our MARC records from WorldCat into a Pears database [11], which is open source.
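The port script itself reduces to a fetch-and-upsert loop. The following is a minimal sketch of that idea, not OCLC's actual code: the record source stands in for the Z39.50 retrieval step, and LocalStore stands in for the Pears database, both hypothetical placeholders.

import hashlib

class LocalStore:
    """Stand-in for the Pears database: keeps a fingerprint per record so
    unchanged records are skipped rather than rewritten."""
    def __init__(self):
        self.records = {}
        self.fingerprints = {}   # record_id -> digest of last stored copy

    def upsert(self, record_id, raw_marc):
        digest = hashlib.sha1(raw_marc).hexdigest()
        if self.fingerprints.get(record_id) == digest:
            return False                      # unchanged: nothing to re-index
        self.records[record_id] = raw_marc    # new or changed: store and re-index
        self.fingerprints[record_id] = digest
        return True

def sync(marc_records, store):
    """marc_records yields (record_id, raw_marc) pairs, e.g. from a Z39.50
    scan of every record carrying the project's holding symbol."""
    return sum(store.upsert(rid, raw) for rid, raw in marc_records)

# Demo with two fake records; a real run would pass the Z39.50 result set.
store = LocalStore()
batch = [("ocm00000001", b"00000nam a2200000..."), ("ocm00000002", b"00000nam...")]
print(sync(batch, store))   # 2 on the first pass
print(sync(batch, store))   # 0 on a repeat pass: nothing has changed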
WorldCat via Z39.50 provides no mechanism for getting only records that have changed, so we get all the records, and Pears is smart enough not to replace records that have not changed. As soon as a record appears in our Pears database, it is available as OAI metadata, and to SRW/U clients, RSS clients, and search application program interfaces (APIs). We catalog into WorldCat, where it surfaces in FirstSearch, Connexion, and in public search engines via the Open WorldCat Program; and when the record additionally appears in Pears, it is available for searching in our repository's interface (where it can appear as MARC or other formats we define), as an RSS feed, and for OAI harvesting.
Figure 2. The OCLC Repository's workflow and architecture. Blue items are OCLC services; orange items are offerings of OCLC Research; green represents commercially available services. The grey box represents an activity which is purely internal to OCLC.
Inside the Black Box: Core Technologies
Although we used OCLC services for our implementation, a repository could be built without these. The essential requirement for this architecture is a combination of OAI and SRW/U access to the data. In our case, the Pears database was connected to an open-source standards-based SRW/U server, which was connected to an open-source standards-based OAI server, which was connected to an open-source ERRoL server [12], which applies the power of standards-based XSLT [13] to bring it all together as a holistic unit. Pears technology, first introduced into the open source community by OCLC in 2000, is database engine software. Pears databases have features that make them ideal for storing hierarchically structured data, such as bibliographic records, authority records and text documents. The open-source software includes all the utilities to build and maintain a Pears database. SRW (Search/Retrieve Web Service) and SRU (Search/Retrieve URL Service) are companion Web Service protocols for querying Internet indexes or databases and returning search results. SRW/U forms a standard web-based text-searching interface drawing heavily on the abstract models and functionality of Z39.50, without Z39.50's complexity. They are built using common web development tools (WSDL [14], SOAP [15], HTTP [16] and XML [17]). Although the OAI-PMH harvesting component could have been built directly on top of the Pears database as a sibling to the SRW/U search component, we chose to register this SRW/U service in an SRW Registry [18] instead, which automatically provides OAI access to the data as a value-added service. Although this layering may be less efficient, it demonstrates how the SRW/U and OAI standards can build on each other. In this way, similar publication repository systems can be created from a lone SRW/U service while letting the SRW Registry provide the necessary OAI component. The same effect could have been achieved the other way around. The ERRoL service component can be thought of as a central clearinghouse for all services related to a data repository. The resolution of requests to the ERRoL service involves issuing a variety of web service requests behind the scenes and then transforming and manipulating the responses into something new (typically with XSLT). Although interaction with OAI repositories is the foundation of the ERRoL service, its capabilities aren't limited to OAI.
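Both SRU and OAI-PMH ride on plain HTTP GET requests, which is part of what makes the stack easy to replicate. The sketch below shows roughly what the two standard request types look like; the base URLs are hypothetical placeholders, while the parameters are the ones defined by the SRU 1.1 and OAI-PMH 2.0 specifications.

from urllib.parse import urlencode
from urllib.request import urlopen

SRU_BASE = "http://example.org/search/sru"   # hypothetical SRU endpoint
OAI_BASE = "http://example.org/oai"          # hypothetical OAI-PMH endpoint

def sru_search(query, max_records=10):
    """Standard SRU searchRetrieve request: a CQL query over HTTP GET."""
    params = urlencode({
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,                      # CQL, e.g. 'dc.creator = "dempsey"'
        "maximumRecords": max_records,
        "recordSchema": "marcxml",           # ask for the records back as MARCXML
    })
    with urlopen(f"{SRU_BASE}?{params}") as response:
        return response.read()               # XML containing the matching records

def oai_list_records(metadata_prefix="oai_dc"):
    """Standard OAI-PMH ListRecords request, the bulk-harvesting verb."""
    params = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    with urlopen(f"{OAI_BASE}?{params}") as response:
        return response.read()               # XML; a resumptionToken pages large sets

Because the whole interface is just URL parameters, any HTTP-capable client, whether a harvester, a browser, or a resolver service, can consume the same data.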
By describing related services in the OAI Identify response (such as SRW/U), the ERRoL service can manipulate them as well with a common XSLT stylesheet that provides seamless integration across all services and protocols. The user experience of our repository—everything you see including the "SRU interface"—is presented via XSLT stylesheets. The output formats (Dublin Core and brief citations) are produced by performing XSLT crosswalks on the MARC records at runtime. This crosswalk capability is provided by the SchemaTrans service, which is discussed in the December D-Lib Magazine article, "A Repository of Metadata Crosswalks" [19]. Several of the core technologies used in the repository are also described in a preprint entitled "Metadata Switch" [20].
Conclusion
Our repository was built using lightweight, modular, inexpensive services, and we found these to be highly flexible in building our repository. At the repository manager's end, the only requirement is to catalog items into WorldCat. For that effort, the consumer gets all this magic: RSS, OAI, SRW/U, ERRoLs. We analyzed the cost of this repository, from conception to present. The retrospective project (approximately 900 items) took a total of .25 FTE to implement. This included project management and planning time, copyright permissions time, cataloging costs, scanning costs, web page design time, and implementation time. Our permissions, cataloging and scanning costs were about $10.00/record [Note]. We challenged an OCLC developer to implement a new repository from scratch, to get some sense for how difficult this would be. It took him 1 hour to bring up the repository. (Of course, this did not include the content: bibliography, metadata creation, permissions, and digitization of full text.) The basic development costs for our own repository are "sunk", i.e. the components have been developed and are freely available. It's our assumption that others may want to replicate this sort of repository, and/or that IT groups might want to use some of these components in similar or more creative ways. We continue to develop and enhance the ERRoL service and to revamp the underlying architecture of our publications repository. People interested in replicating this architecture should feel free to contact us for the latest developments.
Acknowledgements
The following people worked on the OCLC Research Repository: Bob Bolander, Ginny Browne, Christine Guenther, Shirley Hyatt, Ralph LeVan, Lance Osborne, Linda-Ann Sturgeon, and Jeffrey A. Young.
Note
Our implementation included use of OCLC FirstSearch and the OCLC Digital Archive. OCLC membership, FirstSearch, and Digital Archive incur fees, but these are optional to the architecture, and "sunk" for OCLC member institutions that already have them. These fees are not included in our cost figures, though cataloging and scanning labor and expenses are included.
References
[Links checked March 9, 2005.]
[1] OCLC Research Repository.
[2] SRW/SRU (Search/Retrieve Web Service / Search/Retrieve URL Service) Publications Interface. For SRW/SRU general information, see the project pages.
[3] OCLC Research Publications OAI repository.
[4] WorldCat.
[5] FirstSearch.
[6] Open WorldCat program.
[7] OCLC Research Repository RSS feed.
[8] OCLC Research and the OpenURL Registry.
[9] OCLC Digital Archive.
[10] Z39.50.
[11] Pears.
[12] ERRoLs.
[13] XSLT Stylesheets.
[14] Web Service Definition Language (WSDL).
[15] Simple Object Access Protocol (SOAP).
[16] Hypertext Transfer Protocol (HTTP).
[17] Extensible Markup Language (XML).
[18] ERRoLs SRW registry. (Don't let the fact that this SRW Registry is itself implemented using ERRoLs distract you from its described purpose. ERRoL services, which are explained next, are occasionally incestuous in this way.)
[19] Godby, Jean Carol, and Jeffrey A. Young, 2004. A Repository of Metadata Crosswalks. D-Lib Magazine 10 (12): December.
[20] Dempsey, Lorcan, Eric Childress, Carol Jean Godby, Thomas B. Hickey, Diane Vizine-Goetz, and Jeffrey A. Young, 2004. Metadata Switch: Thinking About Some Metadata Management and Knowledge Organization Issues in the Changing Research and Learning Landscape. Forthcoming in LITA Guide to E-Scholarship [working title], ed. Debra Shapiro. Preprint available from OCLC Research.
Copyright © 2005 OCLC Online Computer Library Center, Inc. doi:10.1045/march2005-hyatt
work_skeu6gefyjcyhbsrsrr3bfd37i ---- The STARPAHC collection: part of an archive of the history of telemedicine
EDUCATION AND PRACTICE: History
The STARPAHC collection: part of an archive of the history of telemedicine
Gary Freiburger, Mary Holcomb and Dave Piper, Arizona Health Sciences Library, University of Arizona, Tucson, Arizona, USA
Summary
An early telemedicine project involving NASA, the Papago Tribe (now the Tohono O'odham Indian Nation), the Lockheed Missile and Space Company, the Indian Health Service and the Department of Health, Education and Welfare explored the possibilities of using technology to provide improved health care to a remote population in southern Arizona. The project, called STARPAHC (Space Technology Applied to Rural Papago Advanced Health Care), took place in the 1970s and demonstrated the feasibility of a consortium of public and private partners working together to provide medical care to remote populations via telecommunication. In 2001 the Arizona Health Sciences Library acquired important archival materials documenting the STARPAHC project and in collaboration with the Arizona Telemedicine Program established the Arizona Archive of Telemedicine. The material is likely to interest those studying early attempts to use technology to deliver health care at a distance, as well as those studying the sociological ramifications of technical and scientific projects among indigenous populations.
Introduction
In 2001 several boxes of documents relating to a 1970s telemedicine project came to light. The materials consisted of reports, correspondence and photographs of a telemedicine project conducted on the Papago reservation in Arizona (now called the Tohono O'odham reservation). The project was called STARPAHC (Space Technology Applied to Rural Papago Advanced Health Care). The University of Arizona agreed to archive these materials in the hope that the collection would form the nucleus of an archive of historical materials related to telemedicine.
The STARPAHC project
The STARPAHC project was conceived and sponsored by NASA, assembled by the Lockheed Missiles and Space Corporation, managed and evaluated by the US Indian Health Service, and used and evaluated by the Papago Nation.
The original budget was US$4.26 million in 1973 dollars. The project employed advanced technology to deliver medical services on the Papago Indian reservation; extensive evaluation criteria were used.1 The STARPAHC system included a control centre located in the Indian Health Service hospital on the Papago reservation, which was staffed by physicians and a system operator. There was a remote clinic in the village of Santa Rosa located 50 km away which was staffed by a physician assistant. There was also a mobile health unit staffed by a physician assistant and a laboratory technician (Figures 1 and 2). Finally, there was a referral centre at the Indian Health Service hospital in Phoenix with access to medical specialists (Figure 3). Two-way video, audio and data communications linked these units, which were used primarily for remote diagnosis. Communications were provided via microwave (video, voice, data), VHF radio (voice, data) and telephone (voice, data, pre-recorded video). The project was active from 1973 until 1977.2 Subsequently, Bashshur summarised the significance of the programme as follows:
- NASA and the Indian Health Service demonstrated the organizational and technological capacity to provide medical care to remote populations;
- the approach to the design and implementation of this mode of care delivery was effective and holds promise for other situations;
- the efficacy of remote telemetry and non-physician medical personnel in the provision of medical care was demonstrated;
- the cooperation and advance planning on the part of all the participants in the project can serve as a model for others.3
Journal of Telemedicine and Telecare 2007; 13: 221-223. Accepted 1 March 2007. Correspondence: Gary Freiburger, Arizona Health Sciences Library, University of Arizona, Tucson, AZ 85724, USA (Fax: +1 520 626 2922; Email: garyf@ahsl.arizona.edu)
The STARPAHC collection
Although archives are often located in university libraries, the term 'library' is not synonymous with the term 'archive'; librarians are not necessarily trained as archivists, nor in most cases are archivists trained as librarians. In general, librarians organize items such as books, journals, audiovisual and digital materials, most of which are not unique. In contrast, archivists organize items from the records of a person, company or institution into a unique aggregation. Library materials are usually arranged according to an established classification scheme such as the National Library of Medicine system, while archival collections are arranged according to provenance (i.e. office or person of origin) and original order. In addition, library materials are described individually and listed in a library catalogue, while an archival collection is described as an aggregate by means of a finding aid.4 Since none of the staff at the Arizona Health Sciences Library was a professionally trained archivist, an archivist was hired to carry out the work. The aim was to organize the papers, reports and photographs (approximately 3 m of shelf space) into an archive usable by researchers. In addition, policies were to be established so that the collection could be maintained and additional collections could be added to create a general archive of telemedicine.
Over a seven month period, working an average of twenty hours a week, the archivist reviewed the contents of the collection, stored documents, publications and photographs in acid-free binders and boxes, and created a finding aid for future researchers. After processing, the STARPAHC collection occupied approximately 2.6 m of shelf space. The documentary material was organized into 22 storage containers and there were two large framed pictures in addition. The reports, correspondence and photographs are from the period 1970-1991, but the bulk of the material is from the period 1972-1978. The collection was organized into seven distinct series based on the manner in which the materials were collected, filed and maintained by three of the people directly involved in the project: James W Justice, the STARPAHC evaluation officer and medical director; Peter G Decker, the Indian Health Service project engineer; and Norman Belasco, STARPAHC's project officer and chief of NASA's Integrated Medical and Behavioral Laboratory. The first six record series were organized chronologically. The seventh series contained undated material. Once the processing of the collection was complete and the finding aid had been prepared, press releases were sent to major library and archival publications to let others know of its existence. A record for the finding aid was created and added to the OCLC WorldCat database to ensure that researchers could locate the collection from anywhere in the world (see http://worldcat.org/oclc/53231018).
Figure 1. Exterior view of the mobile health unit on location.
Figure 2. Interior view of the mobile health unit with medical personnel, neonatal patient and mother.
Figure 3. The STARPAHC sites. The distance between Sells and Phoenix is approximately 220 km.
Significance of the STARPAHC collection
The STARPAHC project represented the 'first generation' of telemedicine, a generation which has been said to be unsuccessful because the projects were not sustained and because telemedicine was not widely adopted for health-care delivery. Nevertheless, first generation telemedicine projects provided evidence of the feasibility of remote consultation, the clinical effectiveness of several clinical functions, training and education.5 The press release announcing the archive of telemedicine project stated that 'This collection will be of great value to scholars interested in the historical roots of "e-health care", its early successes and failures… Arizona has had important experiences with multi-cultural telemedicine for more than a generation. As other major institutions extend their e-health networks around the world, the Arizona experiences provide a frame of reference for studies on the critical roles of telecommunications in health care in the information age.'6 In the last five years, the STARPAHC collection has been consulted several times by researchers from Arizona and elsewhere. The material is likely to interest those studying early attempts to use technology to deliver health care at a distance, as well as those studying the sociological ramifications of technical and scientific projects among indigenous populations.
From the library's perspective, an important lesson in regard to establishing archival collections is that the processing and managing costs are not insignificant and are easily underestimated. For future collections, the library will need to seek specific funding. Nevertheless, we urge people from other telemedicine programmes to document their history and preserve important documents in collaboration with their own institutional libraries.
References
1 Lovett JE, Bashshur RL. Telemedicine in the USA: an overview. Telecomm Policy 1979;3:3-14
2 Lockheed Missiles and Space Company, Inc. STARPAHC Systems Report. 30 October 1977: 2 vols. Report number: LMSC-D566138
3 Bashshur RL. Technology Serves the People: The Story of a Co-operative Telemedicine Project by NASA, The Indian Health Service and the Papago People. Tucson, AZ: Indian Health Service, Office of Research and Development, 1980:107-9
4 Hunter GS. Developing and Maintaining Practical Archives: A How-To-Do-It Manual. New York, NY: Neal-Schuman, 1997:6-10
5 Bashshur RL. Telemedicine and the health care system. In: Bashshur RL, Sanders JH, Shannon GW, eds. Telemedicine: Theory and Practice. Springfield, IL: Charles C Thomas, 1997:5-35
6 Weinstein RS. Comments in Arizona Health Sciences Center press release file 2001. See http://www.ahsc.arizona.edu/opa/news/sep01/telemed.htm (last checked 16 February 2007)
work_skrd5qwmxnchjiks425piiwh4u ---- Informing Science, Special Issue on Information Science Research, Volume 3 No 2, 2000
Representation and Organization of Information in the Web Space: From MARC to XML
Jian Qin, School of Information Studies, Syracuse University, jqin@syr.edu
Abstract
Representing and organizing information in libraries has a long tradition of using rules and standards. As the very first standard encoding format for bibliographic data in libraries, MAchine Readable Cataloging (MARC) format is being joined by a large number of new formats since the late 1980s. The new formats, mostly SGML/HTML based, are actively taking a role in representing and organizing networked information resources. This article briefly describes the historical connection between MARC and the newer formats for representing information and the current development in XML applications that will benefit information/knowledge management in the new environment.
Keywords: MARC, information representation, information organization, XML schemas
Introduction
The notion of information representation and organization traditionally means creating catalogs and indexes for publications of any kind. It includes the description of the attributes of a document and the representation of its intellectual content. Libraries in the world have a long history in recording data about documents and publications; such practice can be dated back to several thousand years ago. Indexes and library catalogs are created to help users find and locate a document conveniently. Records in the information searching tools not only serve as an inventory of human knowledge and culture but also provide orderly access to the collections. Just like every other business and industry, the representation and organization of information in the network era has gone through dramatic changes in almost every stage of this process.
The changes include not only the methods and technology used to create records for publications, but also the standards that are central to the success and effectiveness of these tools in searching and retrieving information. Today the library catalog is no longer a tool serving only the library's own collection and its visitors; it has become a network node that users can visit from anywhere in the world via a computer connected to the Internet. The concept of indexing databases is no longer just for newspapers and journal articles; it has expanded into the Web information space that is being used for e-publishing, e-businesses, and e-commerce. The heart of such a universal information space lies in the standards that make it possible for different types of data to be communicated and understood by heterogeneous platforms and systems. We all know that TCP/IP allows different computer systems to talk to each other and to understand different dialects of networking language; in the world of organizing information content, the content is represented by terms either in natural or controlled language or both. The characteristics of its container (book, journal, film, memo, report, etc.) will be encoded in a certain format for computer storage and retrieval. Libraries in the world have used MAchine Readable Cataloging (MARC) (Library of Congress, 1999) to encode information about their collections. In conjunction with cataloging rules, the MARC format standardized the record structure that describes information containers, i.e., books, manuscripts, maps, periodicals, motion pictures, music scores, audio/video recordings, 2-D and 3-D artifacts, and microforms. The Online Computer Library Center (OCLC) in Dublin, Ohio is the largest and the busiest cataloging service in the world. Almost 33,000 libraries from 67 countries now use OCLC products and services and more than 8,650 of them are OCLC members. As e-publishing thrives and the Web information space grows, libraries have expanded conventional cataloging of their collections into organizing the information on the Web. In the early 1990s, OCLC started the Internet cataloging project, in which librarians from all types of libraries volunteered to contribute MARC records they created for Gopher servers, listserves, ftp and Web sites, and other
The metadata scheme was named after the city where OCLC is located: Dublin Core Metadata Ele- ment Set (Dublin Core for short). Since its debut, it has be- come an important part of the emerging infrastructure of the Internet. Many communities are eager to adopt a common core of semantics for resource description, and the Dublin Core has attracted broad ranging international and interdisci- plinary support for this purpose. Metadata and Metadata Creation The term "metadata" refers to "machine-understandable in- formation about Web objects" (Swick, 1997). It is the "docu- mentation about documents and objects. They describe re- sources, indicate where the resources are located, and outline what is required in order to use them successfully" (Younger, 1997). Metadata schemes, such as Dublin Core, entail a group of codes or labels that describe the content and/or container of digital objects. When the metadata is embedded in hypertext documents, they can accommodate automatic indexing for digital objects and thus provide better aids in networked re- source discovery. Several terms have been used interchangea- bly in describing the digital objects that a user views through various interfaces (e.g., a Web browser). They are given names such as Web document, Web object, digital object, hy- pertext, and hypermedia. Post-Publishing Representation Post-publishing representation is a method in which a special type of computer program generates metadata from digital objects already published. These programs are known as spi- ders, knowbots or automatic robots, Webcrawlers, wanderers, etc. Using these programs, metadata are extracted from the objects that were made available on the Internet. Many of the Web search engines, e.g., Excite, Lycos, AltaVista, employ the post-publishing representation method to collect metadata and build their metadata bases for networked information dis- covery purposes. This fully automated process of metadata generation is "a mixed blessing": it requires little or no human intervention, but the methods used to extract metadata are too simple and far from effective in resource discovery. Lynch indicates that automatic indexing is "less than ideal for re- trieving an ever-growing body of information on the Web" for several reasons: the inability to identify characteristics of a document such as its overall theme or its genre, lack of stan- dards, and inadequate representation for images (Lynch, 1997). However, post-publishing representation has its merits. The most appealing advantage is probably that updating a metadata base can be done automatically and as frequently as one desires. This advantage makes it possible for popular search engines such as Yahoo! AltaVista, and HotBot to create dynamic metadata in response to queries. Since they do not generally retrieve the metadata content, results are created on the fly to answer users' queries (Schwartz, 1998). Another advantage comes with this automatic indexing process: the labor costs tend to be low because little or no human interven- tion is involved in the metadata harvesting process. Pre-Publishing Structuring One way to compensate for the shortcomings in post- publishing representation is through pre-publishing structur- ing, i.e., attaching structured metadata to the digital objects so that automated indexing programs can collect this information in a more efficient way. Earlier efforts in pre-publishing struc- turing of metadata have taken place in various domains. 
The Text Encoding Initiative (TEI) (University of Virginia, 1994) was one of the pioneers. It is basically an encoding scheme consisting of a number of modules or Document Type Decla- ration (DTD) fragments, which include 3 categories of tag sets: (1) core DTD fragments; (2) base DTD fragments; and (3) additional DTD fragments. Another project, the Encoded Archival Description (EAD) (Library of Congress, 1996) is an SGML document type definition for encoding finding aids for archival collections. Other domain-specific projects include the Content Standards for Digital Geospatial Metadata (CSDGM) (Federal Geographic Data Committee, 1998) and the Government Information Locator Service (GILS) (OIW/SIG-LA, 1997). As of April 1998, there were over 40 projects in more than 10 countries that either use Dublin Core or are developing their own metadata element set that are based on Dublin Core. The common element among these projects is that they embed the structured metadata into the Web objects prior to or after their "publication." The structured metadata consists of com- ponents that allow establishing relationships among data ele- ments with other entities, and these components are usually categorized into several different "packages" or "layers." Newton (1996) maintains that "[meta]data elements must be described in a standard way as well as classified. Attribute standardization involves the specification of a standard set of attributes, and their allowable value ranges, independently of the application areas of data elements, tools, and implementa- tion in a repository." Her five categories of attributes include identifying, definitional, relational, representational, and ad- ministrative, reflecting a complex structure in metadata ele- ments. Bearman and Sochats (1996) propose a reference model for business-acceptable communication. They define clusters of data elements that would be required to fulfill a range of functions of a record. The functions of records are identified as: 4LQ • The provision of access and use rights management • Networked information discovery and retrieval • Registration of intellectual property • Authenticity, including: handle, terms and conditions, structural, contextual content, and use history Metadata and Digital Information Repositories Among the key concepts in digital information repositories, metadata plays two important roles: as a handler (i.e., identi- fier) and as points of access to data/document content (Kahn & Wilensky, 1995). As a locator, metadata helps users obtain the data or document by providing the exact location. As ac- cess points, metadata supplies information about the content of resources. The demand for effective organization of infor- mation does not diminish with powerful information technol- ogy, but rather, people nowadays have higher expectations for networked resources. The success of a digital information repository in meeting such high expectations depends largely on the quality and scale of metadata, which, in turn, depends on a whole set of information processing standards and qual- ity control management. Metadata and XML The dilemma of post-publishing representation and pre- publishing structuring reflects the inadequacy of describing unstructured data/documents coded with HTML. Given the shorter publishing cycle and huge volume of information, any method requiring heavy manual intervention in creating meta- data records would be impractical. 
If data or documents can be structured with meaningful tags at the time they are cre- ated, it would greatly increase the flexibility of these data/documents to be exchanged and understood over the network systems. The structured documents can make it easier to extract information about them to build metadata reposito- ries. This is where the eXtensible Markup Language (XML) (Cover, 2000) comes in to play. XML describes a class of data objects called XML documents and partially describes the behavior of computer programs that process them. It is an application profile or restricted form of SGML, the Standard Generalized Markup Language. XML allows large-scale Web content providers to perform such tasks as industry-specific markup, vendor-neutral data exchange, media-independent publishing, one-on-one market- ing, workflow management in collaborative authoring envi- ronments, and the processing of Web documents by intelligent clients. XML applications for creating metadata involve a wide range of activities: sitemaps, content ratings, stream channel, definitions, search engine data collection (web crawl- ing), digital library collections, and distributed authoring. There are several parallel efforts in developing XML-based metadata applications. One of them is the Resource Descrip- tion Framework (RDF) developed at W3C (Lassila & Swick, 1999). RDF "is a foundation for processing metadata; it pro- vides interoperability between applications that exchange ma- chine-understandable information on the Web. RDF empha- sizes facilities to enable automated processing of Web re- sources." RDF uses XML as syntax to express the semantics in the RDF data model. A simple example is diagramed in Figure 1 to demonstrate how RDF/XML structures data ele- ments. This diagram represents that "the individua referred to by employee id 85740 is named Ora Lassila and h s the email address lassila@w3.org. The resource http://www.w3.org/Home/Lassila was created by t ual." In RDF/XML, it will be represented as: iption, , which documents created, and reduce e ex- e potential mailto:lassila@w3.org http://www.w3.org/Home/Lassila http://www.w3.org/staffId/ 85740 5HSUHVHQWDWLRQ DQG 2UJDQL]DWLRQ RI ,QIRUPDWLRQ 86 in XML syntax-based metadata opens up opportunities for a wide range of applications not only in e-publishing and digital libraries, but also in e-businesses and e-commerce. XML Namespaces One of the requirements for organizations these days is to have effective information systems that can quickly respond to information needs of ad hoc nature or for decision-making. XML can contribute to build such a system by quickly gener- ating both data-centric and document-centric documents. The so-called "data-centric" documents are characterized by "fairly regular structure, fine-grained data (that is, the smallest independent unit of data is at the level of a PCDATA-only element or an attribute), and little or no mixed content… The document-centric documents often have irregular structure; larger grained data (that is, the smallest independent unit of data might be at the level of an element with mixed content or the entire document itself" (Bourret, 1999). It becomes a real- ity now that almost all the information flowing within and between organizations can be represented as one of these two kinds of documents (marked up by XML), stored in databases, and communicated through network systems. A recent statistical survey found that up to October 1999, a total of 179 initiatives and applications emerged (Qin, 1999). 
Many of these applications propose specialized data elements and attributes that range from business processes to scientific disciplinary domains (Figure 2). Businesses and industry as- sociations are the most active developers in XML initiatives and applications (Figure 3). The burgeoning of these special- ized XML applications raises a critical issue: how can we be sure that data/documents marked up by these specialized tags can be understood correctly cross different systems in differ- ent applications? It is well known that different domains use their own naming conventions for data elements in their op- erations. For example, the same data element "Customer ID" may be named as "Client ID" or "Patron ID." Besides the same data may be named differently, the same term may also mean different things, such as "title" may be referring to a book, a journal article, or a person's job position. To further complicate the issue, future XML documents will most likely contain multiple markup vocabularies, which pose problems for recognition and collision. Solutions to the problems related to XML namespaces lie largely in the hands of the library and information science community who, over the years of research on informa- tion/knowledge representation and organization, have devel- oped a whole spectrum of methodologies and systems. An immediate example is that the techniques used in thesaurus construction and control can be applied to standardize the naming of data elements in various XML applications and map out semantics of data element names in namespace re- positories. With more and more XML applications sprouting, the demand for namespace control and management will also increase. Conclusion When libraries began to use MARC format for their library catalogs back in the late 1960's, they mainly converted their printed records into electronic form for storage and retrieval. The materials represented by these records are physical and static. In the Web space, there is not much physical, nor static- -the material is virtual and the information is dynamic. The library's role today has more emphasis in being as a "path- finer" than a "gatekeeper." All these grant the library and in- formation profession a wonderful opportunity to take a sig- nificant part in this information revolution, as well as a great challenge to demonstrate the value of library and information science and its potential contribution to e-organizations and e- enterprises. Application development 42% Business process 19% Document format 19% Communication 4% Resource description 7% Standard for XML 3% Other 6% Figure 2. Areas of XML application Organizations involved in XML applications Business 42% Industry association 30% Government 3% Scholarly society 4% Research institute 4% University 4% Other 13% Figure 3. Organization categories involved in XML applications 4LQ 87 References Bearman, D. and K. Sochats. (1996). Metadata Requirements for Evi- dence. Accessed January 30, 2000: http://www.lis.pitt.edu/~nhprc/BACartic.html. Bourret, R. (1999). XML and Database. Accessed January 30, 2000: http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/ xml/XMLAndDatabases.htm Cover, R. (1999). The SGML/XML Web Page: Extensible Markup Lan- guage (XML). Accessed January 30, 2000: http://www.oasis- open.org/cover/xml.html. Federal Geographic Data Committee. (1998). Content Standard for Di- giotal Geospatial Metadata (CSDGM). Accessed January 30, 2000: http://www.fgdc.gov/metadata/contstan.html . Kahn, R. and R. Wilensky. (1995). 
A Framework for Distributed Digital Object Services. Accessed January 30, 2000: http://WWW.CNRI.Reston.VA.US/home/cstr/arch/k-w.html. Lassila, O. & Swick, R. R. (1999). Resource Description Framework (RDF) Model and Syntax Specification. Accessed January 30, 2000: http://www.w3.org/TR/1999/PR-rdf-syntax-19990105/. Library of Congress. (1999).1MARC Standards. Accessed January 30, 2000 http://lcweb.loc.gov/marc/ Library of Congress. Network Development and MARC Standards Of- fice. (1996). Encoded Archival Description (EAD) DTD. Accessed January 30, 2000: http://lcweb.loc.gov/ead/. Lynch, C. (1997). Searching the Internet: Organizing Material on the Internet. Scientific American, 276, March, 52-56. Namespaces in XML. W3C Working Draft 16-September-1998. Ac- cessed February 10, 2000: http://www.w3.org/TR/1998/WD-xml- names-19980916. Newton, J. (1996). Application of Metadata Standards. In: Proceedings of the First IEEE Metadata Conference, April 16-18, 1996, Silver Spring, Maryland. Accessed January 30, 2000: http://www.computer.org/conferen/meta96/newton/paper.html. OCLC. (1996). Internet Cataloging Project Call for Participation: Building a Catalog of Internet-Accessible Materials. Accessed January 30, 2000 http://www.oclc.org/oclc/man/catproj/catcall.htm OCLC. (1999). Dublin Core Metadata Initiative. Accessed January 30, 2000 http://www.oclc.org/oclc/research/projects/core/index.htm OIW/SIG-LA. (1997). Application Profile for the Government Informa- tion Locator Service (GILS). Version 2. Accessed January 30, 2000: http://www.usgs.gov/gils/prof_v2.html. Qin, J. (1999). Discipline- and industry-wide metadata schemas: Seman- tics and Namespace Control. Paper presented at the ASIS Annual Meeting 1999. Schwartz, C. (1998). Web Search Engines. Journal of the American Society for Information Science 49, 973-982. Swick, R. (1997). Metadata: A W3C Activity. Accessed January 30, 2000: http://www.w3.org/Metadata/Activity.html . University of Virginia. Electronic Text Center. (1994). TEI Guidelines for Electronic Text Encoding and Interchange (P3). Accessed February 10, 2000: http://etext.virginia.edu/ TEI.html. Younger, J. A. (1997). Resources Description in the Digital Age. Li- brary Trends, 45, 462-481. 
http://www.w3.org/TR/1999/PR-rdf-syntax-19990105/ http://www.lis.pitt.edu/~nhprc/BACartic.html http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/XMLAndDatabases.htm http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/XMLAndDatabases.htm http://www.oasis-open.org/cover/xml.html http://www.oasis-open.org/cover/xml.html http://tulip.itc.nrcs.usda.gov/meta/ContStan.html http://tulip.itc.nrcs.usda.gov/meta/ContStan.html http://www.cnri.reston.va.us/home/cstr/arch/k-w.html http://www.w3.org/TR/1999/PR-rdf-syntax-19990105/ http://lcweb.loc.gov/marc/ http://lcweb.loc.gov/ead/ http://www.w3.org/TR/1998/WD-xml-names-19980916 http://www.w3.org/TR/1998/WD-xml-names-19980916 http://www.computer.org/conferen/meta96/newton/paper.html http://www.oclc.org/oclc/man/catproj/catcall.htm http://www.oclc.org/oclc/research/projects/core/index.htm http://www.usgs.gov/gils/prof_v2.html http://www.w3.org/Metadata/Activity.html http://etext.virginia.edu/ work_snagbwj3l5acjcusfq3dsiqtwe ---- None work_srwkh6dxvnbv7nplxsml2r4dra ---- Preservation Metadata: Pragmatic First Steps at the National Library of New Zealand Search   |   Back Issues   |   Author Index   |   Title Index   |   Contents D-Lib Magazine April 2003 Volume 9 Number 4 ISSN 1082-9873 Preservation Metadata Pragmatic First Steps at the National Library of New Zealand   Sam Searle Digital Library Projects Leader National Library of New Zealand Dave Thompson Digital Library Resource Analyst National Library of New Zealand Introduction The National Library of New Zealand Te Puna Mätauranga o Aotearoa (NLNZ) has a legislative mandate "to collect, preserve and make available recorded knowledge, particularly that relating to New Zealand" [1]. In common with other cultural institutions, the Library is undergoing a period of intense change brought about by the quantity of digital resources that must be managed and the knowledge that the rate at which we accumulate this material will dramatically increase year on year. The complexity of digital objects is a concern, as is the rising proportion that are "born digital" rather than as digital copies of analogue items from the Library's collections. NLNZ is adopting a holistic approach to the long-term management of its digital assets. The Library has established a Digital Library Transition Team to: Develop and implement business process workflows Specify infrastructure for digital material, e.g., storage, access, data authentication Research and develop a range of Digital Library activities, e.g., metadata (resource discovery, preservation, structural) and persistent identifiers Pilot web harvesting for the capture and preservation of New Zealand web sites Implement production processes for bulk digitisation of textual materials. The primary objective is that processes for digital objects become "business as usual" for the Library. This includes activities relating to digital preservation. In contrast to the inertia that may seem an understandable response to what some describe as an unmanageable flood of digital materials, NLNZ is developing pragmatic business-oriented processes for managing this material, for the long term. One component of the digital preservation puzzle is preservation metadata. NLNZ has developed a Preservation Metadata Schema [2] designed to strike a balance between the principles expressed in the OAIS Information Model [3] and the practicalities of implementing a working set of preservation metadata. 
This tension has informed a recent OCLC/RLG report [4] and work at the University of North Carolina [5]. A pragmatic response to this environment is required, one that recognises the need to implement a workable solution within existing resources and organisational structures. This article introduces the NLNZ schema, describes the environment in which it was conceived and identifies areas of further development, which will include:

- Developing data definitions for the elements in the schema
- Designing a repository based on those data definitions
- Investigating and developing tools for automatically extracting metadata to populate the repository.

The NLNZ Preservation Metadata Schema

The NLNZ schema identifies the data that the Library will collect and maintain. This relates to the Preservation Master held in the Digital Archive, but could also cater for an object that is not, or is no longer, a Preservation Master, e.g., the CD-ROM on which the Library received the original digital object, or a previous Preservation Master that has been superseded through hardware or software obsolescence.

The Preservation Master will be a "best effort" creation of a working preservation object. It will be a rendition of some form of "original" as supplied to or acquired by the Library, in a file format that can be preserved, managed and disseminated over time. The Preservation Master is dynamic and will be subject to processes such as migration during a lifecycle of creation, use and eventual replacement. At any time there can be only one Preservation Master for an object, and maximum preservation effort will be applied whilst it has that status.

As shown in the figure below, the NLNZ schema is split into four entities.

[Figure 1. The four entities of the NLNZ schema.]

Entity 1 - Object contains 18 elements describing the logical object, which may exist as a file or aggregation of associated files. These elements identify the object and describe those characteristics relevant to preservation management.

Entity 2 - Process contains 13 elements that record the complete history of actions performed on the objects. It includes the objectives of a process, who has given permission for the process, critical equipment used, and the outcomes of the actions taken. An audit trail of date/time stamps and responsible persons and/or agencies is constructed.

Entity 3 - File contains technical information about the characteristics of each of the files that comprise the logical object identified in Entity 1. Nine elements are common to all file types, and further elements are specified for certain categories of file (e.g., image, audio, video, text). Entity 3 will develop further in light of emerging standards such as NISO Z39.87 Technical Metadata for Still Images [6].

Entity 4 - Metadata modification contains 5 elements and records information about the history of changes made to the preservation metadata. This acknowledges that the record is itself an important body of data that must be secure and managed over time.
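The sketch below shows one way the four-entity structure might be modelled in code. It is a minimal illustration under stated assumptions: the field names (reference_number, structural_type, and so on) are stand-ins for the schema's actual elements, whose data definitions were still being drafted at the time of writing, and each class carries only a token subset of the 18, 13, 9 and 5 elements the entities define.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Process:
    """Entity 2: one action performed on the object (e.g. a migration)."""
    objective: str
    permission_granted_by: str
    equipment_used: str
    outcome: str
    performed_by: str
    performed_at: datetime

@dataclass
class File:
    """Entity 3: technical data for one file within the logical object."""
    filename: str
    file_format: str   # format-specific elements (image, audio...) would extend this
    size_bytes: int

@dataclass
class MetadataModification:
    """Entity 4: one change made to this preservation metadata record."""
    changed_by: str
    changed_at: datetime
    description: str

@dataclass
class PreservationObject:
    """Entity 1: the logical object, a file or aggregation of files."""
    reference_number: str   # cf. element 1.2 Reference Number
    structural_type: str    # cf. element 1.8: simple / complex / group
    name: str
    files: List[File] = field(default_factory=list)
    processes: List[Process] = field(default_factory=list)
    metadata_history: List[MetadataModification] = field(default_factory=list)
```

A record for a simple object would hold one File and accumulate Process and MetadataModification entries over its lifecycle, matching the rule below that each set of preservation metadata pertains to a single logical object.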
Although an object may have multiple files or processes for which data is recorded, each set of preservation metadata will pertain to a single logical object. This is an arbitrary construct allowing the Library to differentiate between the following types of digital objects:

- Simple objects: One file intended to be viewed as a single object (e.g., a word-processed document comprising one essay).
- Complex objects: A group of dependent files intended to be viewed as a single object (e.g., a website, or an object created as more than one file, such as a database), which may not function without all files being present in the right place.
- Object groups: A group of files not dependent on each other in the manner of a complex object (e.g., a group of 100 letters originally acquired on a floppy disk). This object may be broken up into (described as) 100 single objects or 4 discrete objects containing 25 letters each, or it may be kept together as a single logical object ("Joe Blogg's Letters").

Practice will determine the viability of this model, especially in relation to complex objects. More information about the schema and each of the elements it contains is available online [2].

Schema development: international context and local business requirements

The schema developed in light of international endeavours relating to preservation metadata, particularly work undertaken by the National Library of Australia [7], as well as initiatives through the CEDARS programme [8], the OCLC/RLG Preservation Metadata Working Group [9] and the emerging consensus regarding the role of OAIS. These efforts provided a useful framework, but they were not immediately applicable to NLNZ's business requirements and environment. Much of the work to date has occurred at a largely theoretical level. The standards and guidelines available are not yet backed up by attention to business processes. This more practical focus will inevitably emerge as organisations develop and maintain working preservation metadata repositories, but the current lack of documented experience poses some risks for organisations needing to implement preservation metadata schemas sooner rather than later. As Seamus Ross has argued, "Concentration on the definition of metadata divorced from the processes that need to be undertaken using metadata, will result in the creation of guidelines of limited value because they will not reflect the data environment" [10].

The development of digital preservation activities requires in-depth knowledge of the Library's business activities, which revolve around processes such as:

- Acquiring digital objects, both published and unpublished, in a variety of file formats
- Storing many digital objects comprising gigabytes or terabytes of data
- Processing large volumes of material, e.g., migrating multiple objects to avoid format obsolescence
- Disseminating digital objects to users in easy, secure, and meaningful ways.

We recognise that we will need to achieve our aims with limited resources, in a budgetary environment where responses to a changing electronic world are being developed within baseline funding. There is little concrete information about the immediate and long-term costs of digital preservation, although it is generally acknowledged that the costs will be greater than for analogue materials. The schema addresses these limitations by stressing that the resources we do have must be used efficiently. In this context, NLNZ's approach is an integrative one, in line with Joint Information Systems Committee/National Preservation Office (JISC/NPO) findings: "Rather than attempting to isolate a global preservation cost, we should assume that there are some preservation costs associated with all the elements involved in the lifecycle of a digital resource" [11].
We believe that if digital preservation is to be incorporated into the Library's routine business, it must become as productionised as other processes such as cataloguing. We acknowledge upfront that bulk processing (mass migration and/or emulation) will be required and that hand-crafting, while sometimes necessary, will be avoided wherever possible. The schema suggests that successful implementation within a resource-constrained environment will require at least three things: 1) limiting the scope of preservation metadata; 2) maximising potential for automation; and 3) ensuring change control for metadata.

Limiting the scope of preservation metadata

We focus only on the data that is key to digital preservation. Wherever possible we have stripped out elements that more properly support other activities, such as the preservation of analogue formats or the resource discovery and rights management of disseminated digital objects. We envisage that these types of metadata will rarely be required in order to undertake digital preservation activities and that they can be drawn upon as needed from other sources rather than routinely collected as part of our preservation strategy. Where preservation and other functions do require a common element, this has been identified within the schema, so that further rationalisation can occur during implementation. For example, where elements are collected for preservation and also required for resource discovery (or vice versa), such as 1.2 Reference Number and 1.8 Structural Type, this has been noted so that duplication can be avoided during repository development.

The schema also removes the need to collect preservation metadata about dissemination formats. This differentiates the NLNZ Schema from some other schemas that require the identification and categorisation of the relationships between different manifestations of the object. It is clear that certain metadata required for preservation (for example, details of the internal structure of a complex object) may also be needed to manage dissemination formats. However, we believe that relationships between preservation objects and the dissemination formats that they generate can be efficiently documented through the use of persistent identifiers and consistent file storage structures. This effectively removes the need to collect dissemination-related metadata against preservation objects.

Maximising potential for automation

The drive towards automatic population of the maximum number of elements is most clearly demonstrated in the NLNZ Schema's focus upon the Preservation Master. As noted above, the Preservation Master will be a "best effort" representation of the material acquired by the Library, distilled into one of a small number of preferred file types. It is the Preservation Master, not the "original", that will be subject to preservation processes to continually transform it from obsolete into current formats. The concept of the Preservation Master is in line with guidelines provided by Resource: "Narrowing the range of file formats handled streamlines the management process and reduces preservation costs" [12]. Through working with preferred file types we envision that most preservation-related processes will become more standardised and the number of required processes limited. As a consequence, the range of values that need to be collected as preservation metadata will also be reduced to a more manageable set amenable to automation.
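What a preferred-file-type policy looks like in practice can be suggested in a few lines of code. This is a sketch only: the format mappings below are invented for the illustration and do not reflect the Library's actual format choices.

```python
# Illustrative mapping from an acquired format to the preferred format
# for its Preservation Master. The specific choices are assumptions made
# for this sketch, not NLNZ policy.
PREFERRED_MASTER_FORMAT = {
    "image/gif": "image/tiff",
    "image/bmp": "image/tiff",
    "audio/x-aiff": "audio/x-wav",
    "application/msword": "application/pdf",
}

def master_format_for(acquired_mime_type: str) -> str:
    """Return the format a Preservation Master should be distilled into.

    Formats with no mapping are assumed to be on the preferred list
    already and pass through unchanged.
    """
    return PREFERRED_MASTER_FORMAT.get(acquired_mime_type, acquired_mime_type)
```

Because every master lands in one of a handful of formats, the range of values that Entity 3 must record stays small, which is what makes automatic population realistic.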
Change control for metadata

In developing the schema, an important question arose: why provide an audit trail for the object but not for the metadata? The decisions relating to preservation processes and the steps involved in those processes will not stand up to scrutiny in ten, fifty or a hundred years' time if the metadata records in which they are documented are not subject to similar processes of change control as the preservation objects themselves. It is for this reason that the NLNZ Schema has a built-in audit trail. Entity 4 - Metadata Modification will enable the Library to track changes to the metadata record and the person responsible for them. This ensures that the goal of long-term integrity is not only applied to digital objects but also to the related metadata records.

Although these characteristics of the NLNZ Schema may seem superficially to depart from existing work, in fact they reflect the widespread tension noted above, between high-level conceptual models for preservation metadata and the pragmatism required to actually implement them.

Further work

We are engaged in a variety of other activities to support the work done on the Preservation Metadata Schema and to ensure that management of digital objects continues to be aligned with the business objectives of the Library. Three pieces of work are key to this: 1) an implementation data model; 2) a preservation metadata repository; and 3) a preservation metadata extract script.

Implementation data model

Whilst the Preservation Metadata Schema offers a generalised conceptual model of preservation metadata, it is not an implementation model. NLNZ is currently undertaking data modelling work that will inform the implementation of the schema. This work is due for release in May 2003.

Preservation metadata repository

Following the production of data definitions, NLNZ will develop a metadata repository, with a view to integrating it with our existing systems for other types of metadata. Ultimately we hope to incorporate preservation metadata into our core portal product, Endeavor Information Systems' Encompass. Until that takes place, it is likely that the Library will need to develop an interim solution.

Preservation metadata extract script

NLNZ is also developing a tool that automatically extracts metadata embedded in commonly found file types. This automation is essential given the number of files involved and the complexity of their associated metadata. The script, which is currently moving beyond the proof-of-concept phase, produces an XML report of that metadata identified as important to preservation. This will then be uploaded to the metadata repository. The script's flexible modular architecture will allow the addition of extraction components for new file types and for the fine-tuning of the XML output as required. This tool is due for completion in June 2003.
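The article gives no implementation detail for the extract script beyond its XML output and modular design, so the following is a speculative sketch of that architecture: a registry of per-format extraction components feeding a single XML report. All function and element names here are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict

# Registry of extraction components, keyed by file extension. New file
# types are supported by registering another component, mirroring the
# modular design described above.
EXTRACTORS: Dict[str, Callable[[Path], Dict[str, str]]] = {}

def extractor(extension: str):
    """Decorator that registers an extraction component for one file type."""
    def register(func: Callable[[Path], Dict[str, str]]):
        EXTRACTORS[extension] = func
        return func
    return register

@extractor(".txt")
def extract_plain_text(path: Path) -> Dict[str, str]:
    # A trivial component; real components would read embedded technical
    # metadata such as image dimensions or audio sample rates.
    return {"size_bytes": str(path.stat().st_size), "format": "text/plain"}

def extract_to_xml(path: Path) -> ET.Element:
    """Run the matching component and wrap its output in an XML report."""
    component = EXTRACTORS.get(path.suffix.lower())
    report = ET.Element("preservation_metadata", {"file": path.name})
    if component is None:
        report.set("status", "no extractor registered")
        return report
    for element_name, value in component(path).items():
        ET.SubElement(report, element_name).text = value
    return report
```

In the workflow described above, the serialized report for each file would then be uploaded to the metadata repository.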
Conclusion

The desired outcome of the activities described in this article is the integration of digital objects into the NLNZ collections as simply another type of material we collect, preserve and make available. To move digital preservation into a business-as-usual framework requires a change in language and in thinking, away from describing the requirements of digital preservation as "problematic" and the accumulation of digital material as "an unmanageable flood" [13]. The risk of such rhetoric is that digital preservation continues to be perceived as outside the norms of business processes.

There is no doubt that the Library will face many challenges in ensuring that digital objects remain functional into the future. Our Preservation Metadata Schema will need to evolve as these challenges arise and are resolved. In the immediate future, the schema is particularly likely to be influenced by the following:

- Research and development in the area of emulation (especially of complex objects)
- The evolution of METS, the Library of Congress Metadata Encoding and Transmission Standard [14]
- The practical experience that NLNZ and other organisations will gain from managing a wide range of digital objects.

In the meantime, we are working to implement the Preservation Metadata Schema. This is one of the first steps to ensure that the preservation of our digital objects takes place within a set of agreed processes and policies. In time we hope these policies and processes will become as standardised as those that currently relate to other Library activities such as acquisitions, collection management and bibliographic description.

References

[1] National Library of New Zealand Te Puna Mātauranga o Aotearoa. 2001. The 21st Century: The Strategic Direction of the National Library of New Zealand Te Puna Mātauranga o Aotearoa. A Revised Framework for Planning.
[2] National Library of New Zealand Te Puna Mātauranga o Aotearoa. 2000. Metadata Standards Framework: Preservation Metadata Schema.
[3] International Organisation for Standardisation. 2003. ISO 14721:2003: Space data and information transfer systems - Open archival information system - Reference model. Also available as: Consultative Committee for Space Data Systems. 2002. CCSDS 650.0-B-1. Reference Model for an Open Archival Information System (OAIS). Blue Book. Issue 1. January 2002.
[4] OCLC/RLG Working Group on Preservation Metadata. 2002. A Recommendation for Preservation Description Information.
[5] North Carolina ECHO. 2003. Exploring Cultural Heritage Online.
[6] National Information Standards Organisation. NISO Z39.87 Data Dictionary - Technical Metadata for Digital Still Images.
[7] National Library of Australia. 1999. Preservation Metadata for Digital Collections - Exposure Draft.
[8] Cedars: CURL Exemplars in Digital Archives. 2002. Cedars Guide to: Preservation Metadata.
[9] OCLC/RLG Preservation Metadata Working Group.
[10] Ross, Seamus. 2000. Changing Trains at Wigan: Digital Preservation and the Future of Scholarship. London: National Preservation Office.
[11] Mary Feeney, ed. 1999. The Digital Culture: Maximising the Nation's Investment. (A synthesis of JISC/NPO studies on the preservation of digital materials.)
[12] Jones, M. & Beagrie, N., for Resource: The Council for Museums, Archives and Libraries. 2001. Preservation Management of Digital Materials: A Handbook. London: The British Library.
[13] University of Heidelberg Institute for Chinese Studies. Digital Archive for Chinese Studies: About DACHS.
[14] Library of Congress. Metadata Encoding and Transmission Standard. Official Website.

(All URLs accessed 20 March 2003.)
Copyright © Sam Searle and Dave Thompson. D-Lib Magazine, DOI: 10.1045/april2003-thompson.

work_stj233lty5e33iykpvdz245zxy ---- IDS article: Recent developments in Remote Document Supply (RDS) in the UK - 3

Stephen Prowse, King's College London

British Library to pull out of document supply

You read it here first. Purely for the sake of artistic and dramatic licence I've omitted the question mark that should rightly accompany that heading. But even with the question mark firmly in place it still comes as a shock, doesn't it? Can you imagine life without the Document Supply Centre? Can you think the unthinkable? Why should the BL pull out, and where would that leave RDS? Rather like a science fiction dystopia, I've tried to imagine what such a post-apocalyptic world would look like and what form RDS might take - assuming, that is, that RDS would survive the fallout. This article will attempt to show why the BL may be on the verge of abandoning document supply and what could fill some of the huge gap that would be left.

Seven minutes to midnight

We can think of the likelihood of a post-BL document supply world in the same terms as the Doomsday Clock positing the likelihood of nuclear Armageddon - the nearer to midnight the clock hands, the closer the reality. Perhaps we'll start it at seven minutes to and then adjust in future articles? Perhaps a clock could adorn the cover of this esteemed journal? It could be argued that trends have been pushing the BL towards an exit for a while - the relatively swift and ongoing collapse of the domestic RDS market, for example. But the idea was first publicly mooted or threatened (take your pick) at a seminar jointly organised by the BL and CURL on 5th December 2006 at the BL in London. Presentations from the event can still be found on the CURL website [1].

This event brought together all those with a stake or an interest in the proposed UK Research Reserve (UKRR), a collaborative store of little-used journals and monographs. Librarians are notoriously loath to completely discard items, preferring to hang on to them in case of future need. Sooner or later this creates a storage problem as space runs out. Acquiring extra space is often problematic and expensive. What is to be done? Moving to e-only preserves access to the content and frees up space but can't be wholly trusted, so print needs to be held somewhere - just in case. Co-operating with other libraries, HE institutions can transfer print holdings to an off-site storage depot and, once an agreed number of copies have been retained, can dispose of the rest. This is the theory underpinning the UKRR.

The UKRR is a co-operative that will eventually invite institutions to become partners or subscribers. The first phase involves the following institutions working with the BL: Imperial College (lead site), and the universities of Birmingham, Cardiff, Liverpool, St Andrews, and Southampton. Research has shown that the BL already holds most of the stock that libraries would classify as low use and seek to discard - an 80% overlap of journals held by the BL and CURL libraries has been identified. Additional retention copies (meaning a minimum of three in total) would be required to placate fears of stock accidentally being destroyed.
It is not felt that extra building will be necessary - stock will be accommodated at BLDSC and at designated sites by encouraging some institutions to hold on to their volumes so that others can discard. SCONUL will be the broker in negotiations as to who will be asked to retain copies. The first phase began in January 2007 with 17.3 km of low-use journals being identified among the partners for storage/disposal. If the BL already holds most of these volumes, and there is a need to ensure that two more copies are kept, it will be interesting to see how much of the 17.3 km will actually make it to disposal. I expect that libraries will be asked to hold on to much of the material that they would like to send for disposal. In fact, at a subsequent CURL members' meeting in April 2007, Imperial disclosed that 30 out of 1,300 metres of stock selected for de-duplication had been sent to the BL. This represents only 2.3% being offloaded. Once participation widens there will be increased scope for disposal, but I can't see the partner institutions creating much space until that happens. Should the UKRR really take off, there may be a need for more building space to accommodate stock, although the BL now has a new, high-density storage facility at Boston Spa.

The business model behind the UKRR will mark a change in the way remote document supply is offered to HE institutions and could determine the future of the service. Instead of the current transaction-based model, the new model will be subscription-based and will comprise two elements: 1) a charge to cover the cost of BL storage, and 2) a charge according to usage. Institutions that don't subscribe, including commercial organisations, will be charged premium rates. The theory is that costs will not exceed those currently sustained for document supply. Assuming funding is provided for Phase 2, we will see the roll-out of this new model after June 2008.
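No prices had been announced at the time of writing, so the comparison below is purely illustrative: every figure in it is invented, and it shows only the shape of the two charging models rather than any real tariff.

```python
def transaction_model_cost(requests: int, fee_per_request: float) -> float:
    """Current model: pay per satisfied request."""
    return requests * fee_per_request

def subscription_model_cost(requests: int, storage_charge: float,
                            usage_charge_per_request: float) -> float:
    """Proposed UKRR model: a storage charge plus a usage-based charge."""
    return storage_charge + requests * usage_charge_per_request

# Hypothetical figures for one institution in one year:
requests = 400
print(transaction_model_cost(requests, fee_per_request=6.30))        # 2520.0
print(subscription_model_cost(requests, storage_charge=1500.0,
                              usage_charge_per_request=2.50))        # 2500.0
```

On these invented figures the two models roughly break even, which is the stated intent: costs should not exceed those currently sustained for document supply.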
Advocacy will be crucial to the success of the UKRR. The original study reported widespread buy-in to the idea, but will that translate into subscriptions? Many libraries will already be undertaking disposal programmes, particularly those with more confidence in and/or subscriptions to e-repositories such as Portico. Will anyone really want access to that material once it's out of sight (and out of mind)? If requests remain low (and decline further) and take-up isn't great, then that could spell the end as far as the BL and RDS go. The editor has already commented on the apparent lack of commitment to RDS from the chief executive of the BL (McGrath, 2006). This year the BL faces potentially very damaging cutbacks from the 2007 Spending Review, with threats to reading rooms, opening hours and collections, together with the possible need to introduce admissions charges. RDS wasn't mentioned as a possible target for cutbacks, but then the BL will want to see how the UKRR and the new model fare. Tough financial targets in the future, coupled with low use/low take-up, could lead to a time when the BL announces enough is enough. I'm sure that scenario has been considered by a new group of senior managers set up within the British Library - the Document Supply Futures Group. Not surprisingly, little is known about this Group (again, this was something divulged at the CURL December presentation), but if it's looking at all possible futures then it must also be considering no future. The group is headed by Steve Morris, the BL's Director of Finance and Corporate Services.

McGrath reported in his paper just quoted that senior figures were seriously considering the future of document supply in 2001. Whatever comes from this Group's deliberations, the present tangible outcome is a commitment to the UKRR. We'll see where that goes - the clock's ticking.

Alternative universes

If ever we're left adrift in RDS without the BL, then what are the alternatives? One option is to go it alone and request from whoever will supply. For this a good union catalogue will be a fundamental requirement. COPAC has had a facelift and, as with other search tools, the Google effect can be seen in the immediate presentation of a simplified "quick search" screen. Expansion is taking place, with the catalogues of libraries outside of CURL also being added, e.g. the National Art Library at the Victoria and Albert Museum already added and the Cathedral Libraries' Catalogue forthcoming. The national libraries of both Wales and Scotland are on COPAC, as is Trinity College Dublin. The National Library of Scotland has always had an active ILL unit, although this is far too small to take on too many requests. Further development of COPAC has seen support for OpenURLs, so that users are linked to document supply services at their home libraries.

CISTI is the Canadian document supplier that would welcome more UK customers. However, it should be remembered that the BL acts as a backup supplier for CISTI, so without them CISTI could only play a minor role. A new service for 2007 is the supply of ebooks. For US$25 users can access the ebook online for 30 days, after which the entitlement expires. As far as I'm aware this is the first solution aimed at tackling the problem of RDS in ebooks. Ejournal licences have become less restrictive and usually allow libraries to print an article from an ejournal and then send it to another library. This obviously isn't an option for ebooks, and neither can libraries download and pass on the whole thing or permit access, so the CISTI solution is an attractive option.

Of course, a major undertaking of the BL is to act as a banker on behalf of libraries for the supply of requests. Libraries quote their customer numbers on requests to each other, and charges can then be debited and credited to suppliers once suppliers inform the BL (via an online form or by sending a spreadsheet). IFLA vouchers can act as currency, but these are paper-based rather than electronic, even though an e-voucher has long been desired and projects have looked at producing one. Realistically, survivors in a post-BL document supply world would need to band together with like-minded others to form strong consortia and reap the benefits of membership of a large group. Effectively that boils down to two options - joining up with OCLC or with Talis.

Despatches from the Unity front - no sign of thaw in the new cold war

OCLC and Talis have both, naturally enough, been promoting their distinct approaches to a national union catalogue and a RDS network that can operate on the back of that, while firing an occasional blast into the opposing camp. Barely was the ink dry on the contract between The Combined Regions (TCR) and OCLC for the production of UnityUK when opening salvos were being exchanged. At the time Talis had announced they were going ahead with their own union catalogue and RDS system. Complaints relating to data handover and data quality were being lobbed Talis' way.
Since then, news items on UnityUK have appeared regularly in CILIP's Library + Information Update (Anon, 2006) along with a stream of letters (Chad, Froud, Graham, Green, Hendrix, McCall, 2006), including one from a Talis user and two from senior Talis staff, bemoaning the situation and seeking "a unified approach". Talis' position is that they would like to enable interoperability between Talis Source and OCLC for both the union catalogue and the RDS system. I'm sure TCR's position is that OCLC won the tender to provide a union catalogue and RDS services, while Talis didn't bid, and they are happy to press on without Talis, thank you very much. An article by Rob Froud, chair of TCR, in a previous issue of ILDS (Froud, 2006b), providing some history and an update on progress, was met with a counter-blast from Talis' Dr Paul Miller in the Talis Source blog (Miller, 2007). A particular bone of contention was the decision taken by Rob Froud to withdraw a number of TCR libraries' holdings from the Talis Source union catalogue. Not an especially surprising move given the circumstances, but neutrals should note that libraries can contribute holdings records freely to both. However, access to the union catalogue will only be free with Talis. This free access for contributors has seen more FE and HE libraries joining Talis Source.

It's interesting comparing membership lists. While there isn't great overlap between the two, there is a significant minority of public library authorities who are members of both. Will this continue, and, if so, for how long? UnityUK and Talis Source have staked their claims to be the pre-eminent union catalogue and RDS network on their respective websites [2, 3]. UnityUK has this to say: "In 2007, the combined UnityUK and LinkUK services will bring together 87% of public libraries in Great Britain, Jersey and Guernsey into one national resource sharing service." They show their extent of local authority coverage with the following membership figures:

- 97% County Councils
- 97% London Boroughs
- 97% Metropolitan authorities
- 75% Unitary authorities

Meanwhile, Talis Source announces itself as "the largest union catalogue in the UK comprising 26 million catalogue items and 55 million holdings from over 200 institutions" (25th April 2007).

No more ISO ILL for NLM

In January 2007 the National Library of Medicine (NLM) in the U.S. said that it would no longer accept ILL requests into its DOCLINE system via ISO ILL. The reasons cited were poor take-up (only three libraries were using it) and the drain on resources caused by having to test separately with every supplier and every institution that wanted to use it. The protocol itself is quite long, but implementers do not have to implement every item - they can select. This meant, however, that each implementer had to test with NLM even if they were using one (out of only four) of the systems suitable for use. The time and effort required to support ISO ILL was too much, and so the NLM pulled the plug. This raises a number of questions about the use of ISO ILL and its future. It doesn't seem to be well-used in the U.S., e.g. OCLC's Resource Sharing website lists nearly four times as many Japanese libraries using it compared to those in the U.S., and the British Library hasn't developed its own ISO ILL gateway since that came on stream. That gateway is of course run on VDX. On the other hand, ISO ILL is used in VDX-based consortia in the UK (UnityUK), the Netherlands, Australia and New Zealand.
Quite where all this leaves ISO ILL I don't know, but I wouldn't be too optimistic about its prospects.

Big deals - unpicking the unused from the unsubscribed

Statistics on ejournal usage have moved on apace since publishers committed themselves to achieving COUNTER compliancy in their reports. By creating a common standard, COUNTER reports from one publisher can be meaningfully compared with those of another, knowing that both treat data in the same way. SUSHI takes that a step further by consolidating reports from several publishers into one to provide easy comparisons and show usage across platforms. These can be accessed via Electronic Resource Management Systems (ERMS) or by subscribing to a service such as ScholarlyStats. By utilising such tools, analysis of these statistics will become increasingly sophisticated, but I suspect that for the moment it remains at a somewhat elementary level. After all, who has the time to look much beyond full-text downloads and what titles are or are not being used?

The Evidence Base team at the University of Central England have been running a project involving 14 HE institutions that looks at their usage of ejournals, specifically big deals. Libraries are given reports on their usage of ejournals within selected deals and how these rate for value etc. Furthermore, libraries can compare their use with use made at other libraries in the project. At King's we have received a number of reports, including our use of Blackwell's STM collection in 2004-05, ScienceDirect in 2004-05 and Project Muse in 2005 (Conyers, 2006-07). The Blackwell's report runs to 22 pages and provides a wealth of detail. Some key findings are highlighted:

- 19% increase in usage from 2004 to 2005
- 91% of requests come directly from the publisher's web-site, compared to 9% through Ingenta
- The average number of requests per FTE user was 6.7 in 2004 and 8.4 in 2005
- 50% of titles in the STM deal were used 100 times or more, and 96% of total requests were generated by these titles
- 62% of high-priced titles in the deal (£400 and over) were used 100 times or more. Higher-priced titles were used more frequently than those with a low price (under £200)
- 78% of subscribed titles and 39% of unsubscribed titles were used 100 times or more
- 62 titles (14% of the total) received nil or low use (under 5 requests) in 2005. 22 of these (35%) were unpriced titles not fully available within the deal and a further 18 (29%) were low-price (under £200)
- The average number of requests per title in 2005 was 369. Average requests for a subscribed title were 860 and for an unsubscribed title 186
- The heaviest-used title was the Journal of Advanced Nursing, which recorded 15,049 requests in 2005 and 13,840 in 2004

So the report confirms that heavy use is made of titles in the deal, that practically all use is concentrated on half the titles (although practically every title gets some use), and that it is the expensive titles that are most used, but also that unsubscribed titles can attract heavy use. Furthermore, in discussing costs the report finds that the average cost of a request to a subscribed title is 84p in 2005, and just 16p to an unsubscribed title. Pretty good value when all is said and done.
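The cost-per-request figures quoted above are, in essence, spend divided by COUNTER-reported use. A sketch of that calculation follows; the spend figures in it are invented, since the report states only the resulting ratios, not the underlying subscription costs.

```python
def cost_per_request(annual_cost: float, counter_requests: int) -> float:
    """Cost per full-text request: spend divided by COUNTER-reported use."""
    return annual_cost / counter_requests

# Hypothetical inputs; only the output ratios echo the report.
subscribed_spend, subscribed_requests = 120_000.0, 142_000
unsubscribed_spend, unsubscribed_requests = 15_000.0, 95_000

print(f"{cost_per_request(subscribed_spend, subscribed_requests):.2f}")      # ~0.85
print(f"{cost_per_request(unsubscribed_spend, unsubscribed_requests):.2f}")  # ~0.16
```

The same ratio computed per title, rather than across the whole deal, is what distinguishes well-used subscriptions from filler.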
The second report confirms much of what the first found. I'll focus on two of the deals - ScienceDirect (SD) and Project Muse (PM) - as the first is our biggest deal (and will be the case for other libraries too) and PM has a humanities focus which provides a nice contrast. In SD, 35% of titles were used 100 times or more; in PM, 15%. SD had 2% of titles with nil use*, PM 4% (*nil use doesn't include "unpriced" titles with limited availability). SD had 80% of subscribed titles used 100 times or more and 27% of unsubscribed titles; for PM the figures were 36% and 9% respectively. This reflects the relative importance of ejournals to users in STM and Humanities fields, but also shows how much users gain from a big deal like SD. The average cost for a request to a subscribed SD title was £1.12 and only 2p for an unsubscribed title. One of the arguments against big deals is that you are buying content that you don't really need - a lot of filler is thrown in with the good stuff. While not totally dispelling that presumption, research such as that produced by Evidence Base can counter that argument somewhat and certainly puts a lot more flesh on bare bones. If you choose carefully which deals you sign up to, then your users can make good use of this extra content. At the time of writing (June), Evidence Base were recruiting institutions for a second round of the project.

A report from Content Complete (the ejournals negotiation agent for FE, HE and the Research Councils) outlined what they discovered from trials involving five publishers and ten HE institutions that took place between January and December 2006 (Content Complete Ltd, 2007). The idea behind the trials was to look at alternative models to the traditional big deal, and in particular to focus on unsubscribed or non-core content and acquiring this via pay-per-view (PPV). Although the common idea of PPV as a user-led activity was quickly dropped as impractical, a cheaper download cost per article was agreed for all but one of the publishers instead. PPV was then considered in the context of two models: one where unsubscribed content is charged per downloaded article, and a second, also with a download charge per article, but where, should downloading reach a certain threshold, PPV would convert to a subscription and there would be no further download charges. This second option appears more attractive to librarians at first glance as it puts a ceiling on usage, and therefore cost per title, but costs could still mount up considerably if the library saw heavy usage across a wide range of unsubscribed content and was forced into taking further subscriptions.

The report highlights a number of problems to do with accurately measuring downloads, such as the need to discount articles that are freely available, to not count twice those that are looked at in both HTML and PDF, and to include those downloaded via intermediaries' gateways. Ultimately these problems proved too much of a technical and administrative difficulty to overcome during the trials for both publishers and librarians. Such problems are likely to continue for some time, although one imagines that, given sufficient incentive, they could be overcome with automation and developments to COUNTER and SUSHI. However, would the incentive exist? The trials also found that the PPV models didn't compare too well against the traditional big deals in terms of management, and in almost all cases they ended up more expensive.
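The two trial models can be captured in a few lines. The prices and threshold below are invented (none were published), and the sketch assumes that conversion to a subscription supersedes the download charges already incurred; the trials' actual accounting is not specified in the report.

```python
def ppv_cost(downloads: int, price_per_download: float) -> float:
    """Model one: every download of unsubscribed content is charged."""
    return downloads * price_per_download

def ppv_with_conversion_cost(downloads: int, price_per_download: float,
                             threshold: int, subscription_price: float) -> float:
    """Model two: PPV converts to a subscription once downloads reach a
    threshold, after which no further download charges apply."""
    if downloads >= threshold:
        return subscription_price
    return downloads * price_per_download

# Hypothetical figures for one unsubscribed title in one year:
for downloads in (40, 120, 400):
    print(downloads,
          ppv_cost(downloads, price_per_download=8.0),
          ppv_with_conversion_cost(downloads, price_per_download=8.0,
                                   threshold=100, subscription_price=800.0))
# 40 -> 320.0 vs 320.0; 120 -> 960.0 vs 800.0; 400 -> 3200.0 vs 800.0
```

The cap in model two explains its appeal to librarians: however heavily one title is used, its cost cannot exceed the subscription price, though heavy use spread across many titles still multiplies the number of forced subscriptions.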
Updates

In Recent Developments... 2, I reported on the RDS proposal for the NHS in England. There's been some progress on this, but there's still quite a way to go. A list of options has been trimmed to five to undergo cost-benefit analysis before deciding on an eventual winner. The options range from doing little or nothing, to improving direct access to content, to using a vendor's RDS system, to outsourcing. Building a search engine across catalogues or developing a national union catalogue were the rejected options. It won't be until November that the preferred option is chosen, and then, should procurement prove necessary, that will take until September 2008, with implementation following early in 2009 (ILDS Task Group, 2007).

There have been two significant developments on open access (OA). Firstly, the UK version of PubMed Central launched in January 2007. Like the original U.S. version, this will be a permanent archive of freely available articles from biomedical and life sciences journals. Although initially set up as a mirror service, the UK version has 307 such journals at the time of writing (June 2007) against 334 in the U.S. version. We can expect future developments to favour UK and European resources. The UK version is supported by a number of organisations - the British Library, the European Bioinformatics Institute and Manchester University are the suppliers, while a number of organisations including the Wellcome Trust provide funding. Secondly, for researchers who do not have access to an institutional or subject repository, JISC is now offering a service called the Depot, where peer-reviewed papers can be deposited. The Depot is not intended as a long-term repository but rather more of a stop-gap until more become available.

eTheses - a long time coming

Of course, repositories don't just have to be homes for journal articles; they can contain a lot more. The possibility of institutions holding their own theses in electronic form has been mooted since the early to mid nineties. Early projects often had a Scottish base and had wider dissemination of research material as a key factor in their raison d'être. An important group looking into the subject was the University Theses On-line Group (UTOG), chaired by Fred Friend. A survey they undertook showed how important theses were to those who consulted them, how authors would be happy to see their own theses more widely consulted, and that most theses were being produced in electronic form and so should be easily adapted to storage in an electronic form (Roberts, 1997). One of the members of the UTOG, the Robert Gordon University, subsequently led a smaller group to look at etheses production, submission, management and access. The recommendations from that group led to the EThOS (Electronic Theses Online Service) project, which in turn is in the process of establishing itself as a service. From that service researchers will be able to freely access theses online, while deposit can be directly into EThOS or by harvesting from institutional repositories. Digitisation of older theses can also be undertaken by the British Library as part of the service. Around the peak of BLDSC's RDS operations in 1996-97, over 11,000 theses were supplied as loans, with more than 3,000 also being sold as copies (Smith, 1997).

Final point

With UK PubMed Central and EThOS, the British Library will be making material freely available that would previously have had to be obtained via RDS. That seems to be the way that much RDS has been going. Previously it was quite expensive, took a while and had to be done via an intermediary; increasingly, the documents traditionally obtained via RDS are free and available directly to users immediately. It's an interesting turnaround, isn't it?
Notes

1. BL & CURL presentations on the UKRR from the December 2006 meeting can be found at http://www.curl.ac.uk/projects/CollaborativeStorageEventDec06.htm
2. TCR/UnityUK: http://tcr.futurate.net/index.html
3. Talis Source: http://www.talis.com/source/

References

Anon. (2006), "Will UnityUK bring ILL harmony?", Library + Information Update, Vol. 5 No. 5, p. 4.
Anon. (2006), "OCLC Pica/FDI and Talis set out their stalls", Library + Information Update, Vol. 5 No. 5, p. 4.
Chad, K. (2006), "Removing barriers to create national catalogue", Library + Information Update, Vol. 5 No. 7-8, p. 24.
Content Complete Ltd (2007), JISC business models trials: a report for JISC Collections and the Journals Working Group, available at http://www.jisc-collections.ac.uk/media/documents/jisc_collections/business%20models%20trials%20report%20public%20version%207%206%2007.pdf (accessed 28th June 2007).
Conyers, A. (2006-2007), Analysis of usage statistics, Evidence Base, UCE, Birmingham, unpublished reports.
Froud, R. (2006), "Small price to pay for a proper inter-library lending system", Library + Information Update, Vol. 5 No. 7-8, p. 25.
Froud, R. (2006b), "Unity reaps rewards: an integrated UK ILL and resource discovery solution for libraries", Interlending & Document Supply, Vol. 34 No. 4, pp. 164-166.
Graham, S. (2006), "We want a unified approach to inter-library lending", Library + Information Update, Vol. 5 No. 9, p. 25.
Green, S. (2006), "Make Unity UK freely available to boost demand", Library + Information Update, Vol. 5 No. 6, p. 24.
Hendrix, F. (2006), "Struggle for national union catalogue", Library + Information Update, Vol. 5 No. 6, p. 26.
ILDS Task Group (2007), Strategic business case for interlending and document supply (ILDS) in the NHS in England: recap and update on short listing of options, unpublished report.
McCall, C. (2006), "Seeking a unified approach to inter-library lending", Library + Information Update, Vol. 5 No. 10, p. 21.
McGrath, M. (2006), "Our digital world and the important influences on document supply", Interlending & Document Supply, Vol. 34 No. 4, pp. 171-176.
Miller, P. (2007), "Unity reaps rewards: a response", Talis Source Blog, available at http://www.talis.com/source/blog/2007/03/unity_reaps_rewards_a_response_1.html (accessed 7th June 2007).
Roberts, A. (1997), Survey on the Use of Doctoral Theses in British Universities: report on the survey for the University Theses Online Group, available at http://www.lib.ed.ac.uk/Theses/ (accessed 28th June 2007).
Smith, M. (1997), How theses are currently made available in the UK, available at http://www.cranfieldlibrary.cranfield.ac.uk/library/content/download/678/4114/file/smith.pdf (accessed 6th July 2007).

work_sxpywixh4japxkoz5vwuiubdzq ----
Passing the Baton

Charles B. Lowry

portal: Libraries and the Academy, Vol. 4, No. 1 (2004), pp. vii-viii. Copyright © 2004 by The Johns Hopkins University Press, Baltimore, MD 21218.

Whenever there is a change of editors at a scholarly journal, one expects to see the ritual "thank you," and it may seem pro forma. However, for portal: Libraries and the Academy this is a significant milestone, and I want to attach special meaning to my first editorial statement - not because it is mine, but because our young journal has become such a success so quickly and many are due a sincere thanks. Four years ago, the creation of this journal was the idea of a majority of the members of the board of the Journal of Academic Librarianship (JAL), who felt that it was time for librarians to take a stand similar to our colleagues in science, who were at last rebelling against the deleterious effects of commercialism on the exchange of scholarly information. Debate within the board was serious-minded, and some stayed out of principled commitment to the title. Those of us who resigned our positions on the board did so with considerable angst and not a little regret. My own reasons for resigning were typical, as I stated in e-mail to colleagues at the time:

    [A]s a member of the SPARC Advisory Committee it would be profoundly inconsistent of me to continue on the JAL board. I think the right thing to do is to quit and devote my energy to working with you like-minded colleagues to create a competing title. That is consistent with principle. I do not think we are engaged in a great moral crusade against some "Evil Empire," but I do think we are engaged in a transformational struggle that will define the future. If we want a future we can live with, we had best understand that Elsevier wants something else. Their company stock is highly touted because of its P/E ratio. They are answerable to their stockholders, not to the academy and certainly not to academic libraries. This is a matter of the future of scholarly communication, higher education, and our libraries. [1]

We did the right thing, but there was no certainty that we would succeed. The first challenge was to find a publishing partner with the willingness to support our little rebellion and with the track record that would give us confidence that our vision could be fulfilled. Our first stroke of luck was the willingness of Jim Neal to approach The Johns Hopkins University Press and our good fortune in getting the ear of Marie Hansen at the press. Her support, experience, and perspicacious advice in shaping a new publication were indispensable.

We also received support from other quarters. The Association of Research Libraries Board and Executive Director Duane Webster encouraged and endorsed our effort. In particular Rick Johnson, enterprise director of SPARC, has been vigorous in supporting our work. Perhaps he described best what our journal means: "portal is a community built around diverse needs of its members. It offers a superior alternative to commercial journals in the field, and it is the kind of initiative academic librarians everywhere should support." [2] The critical requisite was the active participation of the board members who were willing not merely to talk, but to walk and, most importantly, to work very hard in the creation of a new journal, and recruits were added to them in the effort.
That original board included: Don Bosseau, Nicholas Burckel, Karyle Butcher, Meredith Butler, Deborah Dancik, John DiBraggio, Ray English, Larry Hardesty, Pat Harris, Eddy Hogan, Neal Kaske, John Lombardi, Jim Matarazzo, James Neal, Ann Prentice, Sarah Pritchard, Helen Spalding, Steve Stoan, Jack Sulzer, and Jerome Yavarkovsky. I vociferously thank each one; most are still with us today. Sue Martin and I signed on as executive editors, but Gloriana St. Clair did the heavy lifting of managing editor. More than any other individual contribution, her experience, brought daily to the first three years of portal, has made the journal what it is today. She deserves special credit for forwarding and implementing the idea of mentors in the context of a double-blind review scholarly journal. It has advanced a new idea: a manuscript may be raw, but if it contains the core of good research the author deserves more than a rejection. Thus we extend mentoring to help develop good ideas and good research into viable publishable articles. Many of the articles that have appeared in portal are the result of this nurturing effort. On behalf of the board I want to also extend our appreciation to Copy Editor Martha Bright Anandakrishnan and Editorial Assistant Cindy Stell Carroll. We all know that their contributions have been critical.

Our publisher deserves a word. The Johns Hopkins University Press is well known for the quality of its titles and for the groundbreaking venture into the world of e-journals with Project Muse. We are proud to be associated with it for its quality and for the principles it stands for in advancing the exchange of scholarly information as its primary purpose for existing.

Finally, in my inaugural issue as editor, I am introducing the new occasional feature of a "Guest Editorial." Jay Jordan, CEO of OCLC, was generous in responding to my request for a submission in which he revisits The Keystone Principles, and the role of OCLC in the world of libraries and scholarly information.

Notes

1. Charles B. Lowry, e-mail to JAL Board, February 10, 1999.
2. Statement provided by Rick Johnson to portal, summer 2003.

work_sxtvfsdqwbc3tfsmxr4ie7mtpq ---- None
work_sytykq55vndr7ooggx3hfb5l2a ---- None
work_t3vkrdfn6rdghevpoymuc5yuri ---- Hierarchical Catalog Records: Implementing a FRBR Catalog

D-Lib Magazine, October 2005, Volume 11 Number 10, ISSN 1082-9873

Hierarchical Catalog Records: Implementing a FRBR Catalog

David Mimno, University of Massachusetts, Amherst
Gregory Crane, Tufts University
Alison Jones, Tufts University

Abstract

IFLA's Functional Requirements for Bibliographic Records (FRBR) lay the foundation for a new generation of cataloging systems that recognize the difference between a particular work (e.g., Moby Dick), diverse expressions of that work (e.g., translations into German, Japanese and other languages), different versions of the same basic text (e.g., the Modern Library Classics vs. Penguin editions), and particular items (a copy of Moby Dick on the shelf). Much work has gone into finding ways to infer FRBR relationships between existing catalog records and modifying catalog interfaces to display those relationships. Relatively little work, however, has gone into exploring the creation of catalog records that are inherently based on the FRBR hierarchy of works, expressions, manifestations, and items. The Perseus Digital Library has created a new catalog that implements such a system for a small collection that includes many works with multiple versions. We have used this catalog to explore some of the implications of hierarchical catalog records for searching and browsing.

1. Introduction

Current online library catalog interfaces present many problems for searching. One commonly cited failure is the inability to find and collocate all versions of a distinct intellectual work that exist in a collection and the inability to take into account known variations in titles and personal names (Yee 2005). The IFLA Functional Requirements for Bibliographic Records (FRBR) attempts to address some of these failings by introducing the concept of multiple interrelated bibliographic entities (IFLA 1998). In particular, relationships between abstract intellectual works and the various published instances of those works are divided into a four-level hierarchy of works (such as the Aeneid), expressions (Robert Fitzgerald's translation of the Aeneid), manifestations (a particular paperback edition of Robert Fitzgerald's translation of the Aeneid), and items (my copy of a particular paperback edition of Robert Fitzgerald's translation of the Aeneid). In this formulation, each level in the hierarchy "inherits" information from the preceding level.

Much work has gone into finding ways to infer FRBR relationships between existing catalog records and modifying catalog interfaces to display those relationships. Relatively little work, however, has gone into rethinking what information should be in catalog records, or how the records should relate to each other. It is clear, however, that a more "native" FRBR catalog would include separate records for works, expressions, manifestations, and items. In this way, all information about a work would be centralized in one record. Records for subsequent expressions of that work would add only the information specific to each expression: Samuel Butler's translation of the Iliad does not need to repeat the fact that the work was written by Homer. This approach has certain inherent advantages for collections with many versions of the same works: new publications can be cataloged more quickly, and records can be stored and updated more efficiently.
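One natural way to realise this "inheritance" in code is to resolve a field by walking up the hierarchy whenever a record does not define it locally. The sketch below is a possible modelling under stated assumptions, not the Perseus catalog's actual implementation; the class and field names are invented.

```python
from typing import Dict, Optional

class FRBRRecord:
    """One catalog record at any FRBR level. Fields not defined locally
    are inherited from the parent record (item -> manifestation ->
    expression -> work)."""
    def __init__(self, level: str, fields: Dict[str, str],
                 parent: Optional["FRBRRecord"] = None):
        self.level = level
        self.fields = fields
        self.parent = parent

    def get(self, name: str) -> Optional[str]:
        if name in self.fields:
            return self.fields[name]
        return self.parent.get(name) if self.parent else None

# The Iliad example from the text: the expression record states only
# what is specific to the expression, yet still answers for "creator".
work = FRBRRecord("work", {"title": "Iliad", "creator": "Homer"})
expression = FRBRRecord("expression",
                        {"language": "English", "translator": "Samuel Butler"},
                        parent=work)
print(expression.get("creator"))   # Homer, inherited from the work record
```

Centralising work-level information this way is what lets a new expression record stay short and makes updates to shared facts a single-record operation.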
This approach has certain inherent advantages for collections with many versions of the same works: new publications can be cataloged more quickly, and records can be stored and updated more efficiently.

2. FRBR Implementations

One recent survey of FRBR implementations by Martha Yee has described some of the common problems users have with library catalog systems and ways that a FRBR organization can address those problems (Yee 2005). First, she finds that it is often difficult to search for author and title combinations because variant name information is isolated in authority records. Second, she finds that catalogs are often poor at displaying the full range of relevant materials that the library holds because of variations in titles. Yee then describes how both problems can be addressed by making the catalog more aware of connections between author information and work information and between different versions of the same work.

A great deal of research has examined how the implementation of FRBR might affect online public access catalogs (OPACs). In recent years, OCLC has launched a number of FRBR-related research projects. After conducting a number of experiments with WorldCat records, OCLC researchers concluded that the algorithmic identification of expressions is quite difficult and decided to focus their research on the identification of works instead (Hickey 2002). Other OCLC projects have included the creation of an open source FRBR Work-set algorithm that converts MARC21 bibliographic databases to a FRBR model (FRBR Work-set Algorithm 2005) and the development of the FictionFinder catalog, a FRBR prototype system for searching and browsing bibliographic records for fictional works (FictionFinder 2005). Searches in the FictionFinder system return a list of works, rather than a list of individual bibliographic records. Selecting a particular work leads to a list of expressions of that work, and choosing an individual expression leads to a list of manifestations. In early 2005, OCLC announced a new project entitled CURIOSER that will seek to make open WorldCat more useful and will integrate previous OCLC FRBR research in order to support display and navigation of records in a FRBR context (CURIOSER 2005).

Similarly, the Research Libraries Group has created RedLightGreen, an online catalog designed specifically for undergraduates that utilizes the FRBR model while not yet serving as a full FRBR implementation (RedLightGreen 2005). Merrilee Proffitt of RLG has described this prototype as a "FRBRish, not FRBRized" catalog that has utilized many of the intellectual concepts behind FRBR in its design. System designers found that the FRBR Group 1 entities of work, expression, manifestation and item were important during the conceptual modeling stages and have used these concepts to determine how records should cluster in their catalog (Proffitt 2004).

FRBR has also been implemented in the Australian Literature Gateway (AustLit) with some modifications and extensions to the basic model. AustLit augmented the FRBR model with INDECS event modeling in their digital gateway to Australian literature. In their model, works have a creation event, expressions have a realization event, and manifestations have an embodiment event (Ayres 2003). Their modeling also introduced the concept of the Super Work.
They found that there were a number of issues in converting the data to the FRBR model; of particular concern was the fact that the FRBR model has a whole-monograph emphasis, and most of their records were non-monograph items such as individual poems or reviews (Ayres 2004). Nonetheless, they were able to develop a number of automated processes to convert records to the FRBR format, and developed a maintenance interface to support human intervention with problematic records. The hardest challenge for catalogers was in distinguishing between new expressions and manifestations of works. In designing their user interface, they chose to use light visual cues like dot points and simple text statements such as "this work has appeared in x different versions" to guide the user. During evaluation they found that users seemed to have no difficulty in navigating the interface (Ayres 2004).

Some FRBR research has involved the creation of tools for experimenting with the model, rather than full implementation of FRBR catalogs. The Library of Congress has created a FRBR display tool that takes flat files of MARC records and uses XSLT to transform this data into meaningful displays by grouping the bibliographic data into the "Work," "Expression" and "Manifestation" FRBR entities (FRBR Display Tool 2004). The tool is largely intended to display FRBR on the fly for libraries wishing to experiment with the model (Radebaugh 2004). Roberto Sturman, a librarian at the University of Trieste, has also developed an experimental FRBR tool called IFPA (ISIS FRBR Prototype Application) that can be viewed on the Web (IFPA 2005). This software serves as an application for the UNESCO ISIS retrieval software and was developed to manage the data and relationships implied in the FRBR model. Sturman stresses that his tool is meant to serve as an academic experiment that will assist people interested in experimenting with the FRBR model (Sturman 2004).

Several commercial vendors have also designed FRBR implementations. The Virtua catalog, developed by VTLS Inc., is marketed as offering full support of the FRBR model (Virtua 2005), while Portia has created VisualCat, an integrated cataloging system that is capable of consolidating different types of metadata within a single semantic framework based on RDF and FRBR (VisualCat 2005).

3. The Perseus Digital Library

We felt that the Perseus Digital Library (PDL) would be an ideal testbed for research on FRBR-based records and catalog interfaces for several reasons:

It is moderately sized. Although the digital content of the PDL is substantial, the number of bibliographic entities is moderate: on the order of one or two thousand distinct intellectual works rather than the millions in catalogs like WorldCat. This scale makes it possible for staff to give individual, personal attention as necessary to each bibliographic record within a reasonable time period.

It is highly structured. The core of the PDL is a collection of literary documents from the Greco-Roman period. Almost all of the ancient works, such as Homer's Iliad, are present in the collection in multiple versions, for example the original text in Greek along with one or more English translations. Additionally, the collection contains many works that have fundamental relationships to other works, such as a commentary on a particular book of the Iliad or a lexicon specific to Homer.

It is already well cataloged.
Most of the digital works in the PDL are derived from scanned print editions that are present in WorldCat or LC Voyager. Although many of the catalog records for our out-of-copyright editions are not consistent with modern cataloging practice, we do have some level of professional catalog data for the great majority of the collection. This provides a strong starting point. Additionally, we are in the unusual position of already having unique, authorized identifiers for many of our distinct intellectual works because of the Thesaurus Linguae Graecae canon, which assigns numbers to all attested Greek authors and their works. A similar scheme exists for Latin, developed by the Packard Humanities Institute.

It is entirely digital. As a result, our catalog does not need to keep track of any sort of circulation information. Additionally, our digital document interface provides virtually no interesting item-level metadata. This may change if we are able to expand the number of documents that are available in different formats, such as HTML, XML, and PDF, but at the moment there is little point in cataloging our collection beyond the manifestation level. Dealing with a three-level hierarchy rather than a four-level hierarchy simplifies the system somewhat, but adding item information would make little difference.

4. Building a Hierarchical Catalog

The format we chose to express our metadata was a combination of the MODS (Metadata Object Description Schema) and MADS (Metadata Authority Description Schema) standards. As was previously argued by Karen Coyle (Coyle 2004), we felt that incrementally improving the MARC record would not be a productive direction. Compared to newer XML-based formats, the infrastructural overhead of parsing, searching, and displaying MARC records was too great to make a MARC-based approach feasible. Another important feature of the MODS/MADS family is that it includes the extremely flexible relatedItem field, which was explicitly designed to express a wide variety of relationships between catalog entities.

[Figure: A screenshot of the Perseus hierarchical catalog]

The starting point for our catalog was a set of MODS catalog records downloaded directly from the Library of Congress through the Voyager SRU service and manually created from OCLC WorldCat catalog data. The Perseus Digital Library has for many years maintained links between TLG identifiers for canonical works and the documents in the collection that are instances of those works (Mahoney 2000). These two databases were sufficient to generate the hierarchical catalog.

Our work-level records are MADS records, extending the notion of the work as an authorized entity. The two remaining levels are implemented as MODS records. Although this distinction makes sense bibliographically, it did present difficulties in implementation, as all XPath queries against the database had to be formulated both in the MODS namespace and the MADS namespace.

The last step in the process was to divide information currently in individual records into multiple hierarchical records. For single-manifestation works, this was simply a matter of passing the record through three different XSL filters. The division of metadata fields between FRBR levels is based on a proposal by Sally McCallum (McCallum 2004). For works with more than one manifestation (largely primary texts in the Classics collection), the work-level records were generated separately from an existing bibliographic database at Perseus, independently from the LC MODS records.
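A rough Python analogue of this per-level filtering is sketched below. This is our illustration, not the project's code: the authors used XSL stylesheets, not Python, and the element lists are a simplification of the field distribution that follows (in practice, for example, title information is handled differently at the work and manifestation levels):

    import copy
    import xml.etree.ElementTree as ET

    MODS_NS = "http://www.loc.gov/mods/v3"
    LEVEL_FIELDS = {
        "work":          {"titleInfo", "name", "subject", "classification", "genre"},
        "expression":    {"language", "abstract", "tableOfContents"},
        "manifestation": {"originInfo", "physicalDescription", "note"},
    }

    def split_record(mods_root: ET.Element) -> dict:
        """Return one new MODS element per FRBR level, each holding only
        the top-level fields assigned to that level."""
        records = {}
        for level, fields in LEVEL_FIELDS.items():
            rec = ET.Element(f"{{{MODS_NS}}}mods")
            for child in mods_root:
                tag = child.tag.split("}")[-1]      # strip the namespace prefix
                if tag in fields:
                    rec.append(copy.deepcopy(child))
            records[level] = rec
        return records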
The distribution of MODS fields was as follows:

At the work level, we have placed metadata fields that are common to all versions of a work. These include the author name and the uniform title, which specifies a single authorized title for works that have many editions, each potentially with a different title. As with a standard authority record, the work record also provides a good place to specify common variants of the title. This factor is especially important in a collection with many multilingual materials: a student assigned to find Cicero's Catilinarian orations may not know to look under the Latin title In Catilinam. Additionally, we have placed subject, classification, and genre information at the work level.

At the expression level, we have included language, editor, translator, abstract, and table of contents fields. This level tends to be the sparsest in our collections, although this is to some extent based on the fact that the Library of Congress MODS records on which we based our catalog did not distinguish between authors, editors, and translators: only the MARC role "creator" was provided.

Finally, at the manifestation level we have included the publication's actual title, publication information, physical description information, and any notes. This level also includes any related item information that is not directly related to the FRBR hierarchy, such as membership in a series.

5. Hierarchical Records and Analytical Cataloging

Bringing more focus to the hierarchical relationships between intellectual works and their manifestations in our documents has also brought part-whole relationships into more prominence. Very often, manifestations of shorter works are found in our collection as part of larger volumes, for example collections of speeches by minor Attic orators or the complete works of Aeschylus. These relationships raise interesting questions. At the work level, one play by Aeschylus is clearly a distinct entity from another play, but at the manifestation level, the publication information for every translated play in the volume is the same, and therefore should be kept in a single record. Our current implementation addresses this problem by linking a single manifestation-level record for the multi-work volume to multiple expression-level works. This compromise works for our collection, but we cannot be certain that it will scale to larger collections, especially those for which analytical catalog data is not currently available.

6. Searching a Hierarchical Catalog

Although a hierarchical catalog has many obvious advantages for searching and browsing, it presents some new challenges in implementation. With ordinary MARC records, searching a catalog has been fairly simple. A set of query terms is matched against a set of records, returning the subset of records that contain some or all of the search terms. In contrast, searches against hierarchical catalogs can become extremely complex. A single search might contain terms that occur at several different levels. For example, a search for "Butler AND Iliad" would need to match an expression-level element (translator) and a work-level element (uniform title). Even when matching records are located, they may not contain complete information: in the previous example, the system would still need to request one or more manifestation-level records. The system may also want to expand search results to include other expressions and manifestations of a matched work. Finally, de-duplication becomes an important problem.
If both the work record and a manifestation record match the query (such as "Iliad" in the title and the uniform title), the system needs to know not to expand the records in both directions. None of these problems are directly addressed by current XML databases. In order to create a catalog that displayed the unique, complete results returned by arbitrary queries in hierarchical format, we would have needed to implement complex matching algorithms and slow, recursive record expansion algorithms. Initial experiments with simple queries showed noticeable performance problems.

The simplest solution we have found for the problems outlined above is to keep two parallel versions of the catalog, both containing the same records. The first is simply a collection of individual records, one for each work, expression, and manifestation. The second is a set of "composite" records, one for each work, that bring together in one XML tree all of the expressions of the work and all of the manifestations of each of those expressions. The first set are the editable copies. To draw an analogy to computer programming, they are the "source code". The second set is the "compiled" version, optimized for searching.

By creating composite records, we essentially reduce the problem of searching the hierarchical catalog to the earlier problem of searching a "flat" catalog: the search engine, in our case an XML database, is simply asked to return the records that match a query. The display code, an XSL stylesheet, can then choose whether or not to display unmatched elements of the composite record (e.g. versions of the Iliad not translated by Butler), and how to highlight the terms that matched the query. In our experiment, we used custom tags as a framework for the hierarchical structure of the composite records. In order for catalogs of this nature to be accessible through federated searching systems such as SRU/W, the library community must standardize on a means for specifying relationships between blocks of XML. This could include some form of METS or RDF XML.
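The "source/compiled" split can be sketched as follows. This is a hypothetical Python illustration, not the project's code: the article's own wrapper tag names do not survive in this copy, so generic work/expression/manifestation tags are used, and matching is reduced to simple keyword containment:

    import xml.etree.ElementTree as ET

    def build_composite(work_rec, expressions):
        """Bundle one work record with all of its expression records, and each
        expression with its manifestation records, into a single XML tree."""
        root = ET.Element("work")
        root.append(work_rec)                      # MADS work-level record
        for expr_rec, manifestations in expressions:
            expr = ET.SubElement(root, "expression")
            expr.append(expr_rec)                  # MODS expression-level record
            for man_rec in manifestations:
                ET.SubElement(expr, "manifestation").append(man_rec)
        return root

    def matches(composite, terms):
        """Flat keyword search over the whole composite, so a query like
        'Butler Iliad' can hit fields at different FRBR levels at once."""
        text = " ".join(composite.itertext()).lower()
        return all(t.lower() in text for t in terms)

Because each composite bundles every level of one work into a single tree, the query is answered by flat matching, and the display stylesheet can decide which unmatched siblings to show.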
7. Conclusions

Our work has shown that small, theoretically well-founded changes in the structure of catalog records, combined with readily available database software, can produce catalog interfaces that address previously identified problems with existing library catalog interfaces.

The Need for New Identifiers

One aspect that this work has highlighted is the need for unique identifiers at every hierarchical level. This need has been previously identified by Karen Coyle (Coyle 2004). Identifiers provide the "glue" that holds distributed records together. Currently, most standard identifiers (OCLC accession numbers, ISBNs, LCCNs) are issued at the manifestation level. Work-level identifiers exist in narrow domains, such as the TLG for works in Greek and uniform titles for works with many cataloged versions. Even these sources are problematic. Catalogers may not be aware of special-purpose schemes such as the TLG. Uniform titles, which are designed to be read by humans, are inefficient and unwieldy for computational use. As Kristin Antelman points out, "Documents do not need to be described to be referenced in a networked world; they must be identified. An inherently descriptive element, such as title, cannot meet the requirements of a network identifier." (Antelman 2004)

Creating new identifiers for work- and expression-level entities is a reasonable goal within a short period of time. The technology of identifiers is already well studied. Several high quality systems for creating identifiers have been implemented. These systems are not, however, currently being applied to more abstract bibliographic entities. The creation of work-level identifiers is a question of determining the correct authorities for issuing the identifiers and reliable methodologies for distinguishing works, not one of actually producing the identifiers themselves. At a recent workshop at OCLC on issues related to FRBR, Ketil Albertsen (Albertsen 2005) outlined a number of the attributes of a successful identification scheme and Patrick Le Boeuf (Le Boeuf 2005) described systems that could potentially fill some of the requirements. Deciding on a new system of identifiers and issuing authorized numbers for works is a role that small libraries like Perseus cannot fulfill. In order for distributed hierarchical catalogs to become useful, there must be a network of widely known naming authorities, such as national libraries, that can assign unique identifiers to all cataloged intellectual works and expressions.

Benefits

Moving to a hierarchical catalog would involve a substantial investment in time, money, and technology. The end result, however, is a higher quality catalog that can be more easily maintained, distributed, and searched. The hierarchical MODS/MADS catalog effectively separates FRBR levels into manageable segments. These segments in turn provide easily updatable and reusable building blocks for further cataloging and networked catalog reuse. When we add a new translation of Livy's history of Rome to our library, we only need to locate the standard identifier for the work and specify it along with a small amount of publication information.

The benefits for library users are also clear. The composite records described in this article automatically organize complex works with many manifestations into a simple, understandable format. Searching for a title and author combination brings up an interface that displays all available versions, even those that would not by themselves match the query. These records can also integrate work-level authority data such as alternate titles into the searchable record. As a result, searching the catalog becomes dramatically more powerful while at the same time retaining simplicity and efficiency in implementation.

Appendix: A sample of a composite record made up of distinct work, expression, and manifestation records

References

Albertsen, Ketil. "What Do We Want to Identify? - FRBR and Identifier Semantics." Presentation at FRBR in 21st Century Catalogues: An Invitational Workshop, May 2-4, 2005, Dublin, Ohio.

Antelman, Kristin. "Identifying the Serial Work as a Bibliographic Entity." Library Resources & Technical Services, 48.4 (2004): 238-55.

Ayres, Marie-Louise, Kerry Kilner, Kent Fitch and Annette Scarvell. "Report on the Successful AustLit: Australian Literature Gateway Implementation of the FRBR and INDECS Event Models, and Implications for Other FRBR Implementations." International Cataloguing and Bibliographic Control, 31.1 (2003): 8-13.

Ayres, Marie-Louise. "Case Studies in Implementing FRBR: AustLit and MusicAustralia." Paper presented at "Evolution or Revolution? The Impact of FRBR," 2 Feb. 2004, Melbourne, Australia.

Coyle, Karen. "Future Considerations: The Functional Library Systems Record." Library Hi Tech, 22.2 (2004): 166-174.

CURIOSER. OCLC. 30 June 2005.

FictionFinder: A FRBR-Based Prototype for Fiction in WorldCat. OCLC.

FRBR Display Tool Version 2.0. 31 Mar. 2004.
Network Development and MARC Standards Office, Library of Congress. 7 July 2005.

FRBR Work-Set Algorithm. OCLC. 30 June 2005.

Hickey, Thomas B., Edward T. O'Neill, and Jenny Toves. "Experiments with the IFLA Functional Requirements for Bibliographic Records (FRBR)." D-Lib Magazine, 8.9 (2002).

IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications-New Series, Vol. 19. München: K.G. Saur, 1998.

IFPA (FRBR Prototype Application) home page, ed. Roberto Sturman. 23 Nov. 2004.

Le Boeuf, Patrick. "Identifying 'Textual Works': ISTC: Controversy and Potential." Presentation at FRBR in 21st Century Catalogues: An Invitational Workshop, May 2-4, 2005, Dublin, Ohio.

Mahoney, Anne, Jeffrey A. Rydberg-Cox, David A. Smith and Clifford E. Wulfman. "Generalizing the Perseus XML Document Manager." Paper presented at the Workshop on Web-Based Language Documentation and Description, 12-15 Dec. 2000, Philadelphia, USA.

McCallum, Sally. "An Introduction to the Metadata Object Description Schema (MODS)." Library Hi Tech, 22.1 (2004): 82-88.

Proffitt, Merrilee. "RedLightGreen: FRBR Between a Rock and a Hard Place." Presentation at ALCTS Preconference, ALA Annual Meeting, 25 June 2004, Orlando, Florida.

Radebaugh, Jackie, and Corey Keith. "FRBR Display Tool." Cataloging & Classification Quarterly, 39.3/4 (2004): 271-283.

RedLightGreen. Research Libraries Group. 30 June 2005.

Sturman, Roberto. "Implementing the FRBR Conceptual Approach in the ISIS Software Environment: IFPA (FRBR Prototype Application)." Cataloging & Classification Quarterly, 39.3/4 (2004): 253-70.

Virtua. Visionary Technology in Library Services Inc. 12 July 2005.

VisualCat. Portia. 14 July 2005.

Yee, Martha M. "FRBRization: A Method for Turning Online Public Finding Lists into Online Public Catalogs." Information Technology and Libraries, 24.3 (2005): 77-95.

Copyright © 2005 David Mimno, Gregory Crane, and Alison Jones. doi:10.1045/october2005-crane

work_tapzfqz2k5bpbduygwq45qpgge ---- Author Posting. (c) Taylor & Francis, 2010. This is the author's version of the work. It is posted here by permission of Taylor & Francis for personal use, not for redistribution. The definitive version was published in Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, Volume 20, Issue 4, September 2010. doi:10.1080/1072303X.2010.502096 (http://www.informaworld.com/smpp/content~db=all?content=10.1080/1072303X.2010.502096)

Electronic Document Delivery: A Survey of the Landscape and Horizon

NATHAN HOSBURGH, John H. Evans Library, Florida Institute of Technology, Melbourne, Florida, USA
KAREN OKAMOTO, Lloyd Sealy Library, John Jay College of Criminal Justice, New York, New York, USA

Based on a survey of users, the authors examine the electronic document delivery methods currently in place, as well as recent changes and future developments. Interlibrary loan and document delivery staff were surveyed from institutions across the United States in order to ascertain what document delivery mechanisms are currently in place, how they are being used, and why.
Findings from this study should lead to an increased awareness of electronic delivery options in libraries across the country and elucidate the dynamics involved at individual sites. This, in turn, will assist librarians in making decisions, based not only on their individual circumstances, but on the experience and trends found across a broad sampling of institutions.

KEYWORDS Odyssey, Ariel, BScan ILL, Relais Express, RapidX, interlibrary loan, electronic document delivery, scanning

INTRODUCTION

Electronic document delivery has been an important component of interlibrary loan (ILL) operations since the advent of the fax machine. Through the development of software programs such as Ariel, Prospero, ILLiad's Odyssey, and RapidX, electronic document delivery has been further refined to meet the needs of both interlibrary loan staff and end users. With the portability of file formats such as PDF and TIFF, even email is being increasingly used as a low-cost, low-tech method of electronic document delivery which remains system-independent and relatively convenient, especially for average- to low-volume ILL units. Posting documents to a secure web server for either borrowing institutions or patrons has also become a viable option.

The electronic document delivery landscape is always changing, but perhaps never so much as it is now. It is this unprecedented change and the desire to make sense out of it that has been the impetus behind this study. Following the results of a poll released by Timothy Bowersox highlighting the use of Odyssey's Trusted Sender feature by ILLiad libraries, the authors became interested in how many ILLiad sites are still using Ariel in addition to Odyssey. This interest widened in scope, as there is significant electronic document delivery software crossover between platforms, ILL management systems, and libraries of various size and type. In order to gain the clearest and most comprehensive view of the current environment of electronic document delivery in the United States, a web survey was distributed to subject-specific discussion lists. The findings of this survey have been collected, compiled, analyzed, and represented in such a way as to be useful for disparate ILL operations. The goal of this paper is to examine use of different methods across various institutions and to identify trends that will lead not only to improvement of interlibrary loan and document delivery, but also to improvement of the software products themselves.

PRODUCT BACKGROUND

Ariel

Ariel, released in the fall of 1991, was developed by the Research Libraries Group (RLG), formerly a membership-based, not-for-profit organization composed of research libraries, museums, archives, historical societies and national libraries. RLG is now part of the Online Computer Library Center (OCLC). Limited by the capabilities of fax and postal delivery methods, RLG members wanted to take advantage of their existing Internet connection to send documents inexpensively and rapidly. Members also wanted to improve the quality of transmitted images. Ariel was developed in response to this need. Initially, Ariel was restricted to the DOS environment, but a Windows version was developed in 1994.
Since then, Ariel "has grown from a solution for a relatively small group of large research libraries into a de facto international standard for document exchange used by libraries and document suppliers of all sizes and specialties" (Lavigne & Eilts, 2000, p. 7). Ariel delivers documents to other Ariel workstations through FTP or email, and converts them to PDF files for patron delivery. It supports grayscale and color images, and scans and prints on letter, legal, journal, A3 or A4 sizes. By the end of 1998, there were over 4,000 Ariel users around the world, mainly in the United States, but across 18 other countries as well (Ibid., p. 5). Infotrieve, a multifaceted document delivery company, purchased Ariel from RLG in 2003 (Shigo 2003). According to the Infotrieve web site, Ariel is currently used by over 9,400 institutions worldwide.

Prospero

Developed by the Prior Health Sciences Library at Ohio State University in Columbus, Prospero was released in 1999 as free, web-based, open source software that works either with Ariel or as a standalone program. Prospero conveniently converts tagged image file format (TIFF) files into a portable document format (PDF), preserving the layout of the original, scanned document. When Prospero was released, Ariel did not have this feature; only TIFF files were supported. Prospero also provides patron access to scanned documents electronically. Patrons are sent an email notification message that indicates the URL for the available document and assigns a PIN to the account. Once users log into their account, they will see a list of viewable documents. Libraries can set the number of views allowed for a document and establish the length of time it will be available for viewing, giving libraries control over copyright-compliant practices. Prospero also includes a database of patron email addresses, and a log of documents sent and documents remaining on the server. Technical support for Prospero is limited to a Web Board, email list and other web documentation. The latest version, 2.0, was released in 2002, but according to the Interlibrary Loan and Document Delivery (ILL-DD) web board of Open Source Systems for Libraries (oss4lib), version 2.0 is unstable. Instead, version 1.40 is currently available for download. Prospero use may be waning now, not only because of technical issues and the lack of user support, but also because Ariel is now able to convert TIFF files into PDFs.

Odyssey

Released in April 2003 with ILLiad version 6.2.0.1, Odyssey has developed into a significant electronic document delivery component for ILLiad users. Odyssey enables document transmission between ILLiad sites and has been expanded, with Atlas's Standalone product, to include non-ILLiad sites as well. The ILLiad component consists of such features as inclusion of all request information, auto-updating of the request, and the Trusted Sender setting. The latter allows for unmediated receiving: a borrowing library's ILLiad server receives an article sent from a lending library via Odyssey, converts it to PDF format, delivers the article to the web, notifies the customer, and updates the request to "received". This feature may be further modulated by selecting the desired level of staff review (Connell & Janke, 2006).
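The unmediated receiving path just described can be summarized in schematic Python. This is our sketch, not Atlas's implementation; every function and field name here is hypothetical:

    def convert_to_pdf(tiff_pages):            # stand-in for real conversion
        return b"%PDF-1.4 ..."

    def post_to_patron_web(pdf, request):      # stand-in for the web posting step
        return f"https://ill.example.edu/pdf/{request['id']}"

    def notify_patron(email, url):
        print(f"mail {email}: your article is available at {url}")

    def receive_odyssey_article(tiff_pages, request, trusted_sender=True):
        """Trusted Sender flow: convert, post, notify, and mark 'received'."""
        pdf = convert_to_pdf(tiff_pages)
        if not trusted_sender:                 # optional staff review instead
            request["status"] = "awaiting review"
            return request
        url = post_to_patron_web(pdf, request)
        notify_patron(request["patron_email"], url)
        request["status"] = "received"         # auto-updated, no staff mediation
        return request

    print(receive_odyssey_article([b"page1"],
                                  {"id": 42, "patron_email": "pat@example.edu"}))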
Odyssey Standalone is a free version that was released by Atlas Systems during summer 2005 (Miller, 2010). Although Standalone is not built into ILLiad, it allows libraries to send documents to and receive documents from ILLiad libraries using Odyssey, other Odyssey Standalone locations, and other vendors' software that supports the Odyssey protocol (Atlas Systems, 2010). It functions similarly to Ariel in that it is solely an electronic scanning and delivery mechanism and does not allow the user to initiate and receive requests. Brian D. Miller (2009) of Ohio State University Libraries has created an extremely useful set of FAQs that aim to assist both prospective and current users of Odyssey Standalone in getting the most out of this product. A search of the OCLC Policies Directory on May 10, 2010, revealed that, out of 1,107 current ILLiad sites, 451 officially list Odyssey as a delivery method, along with 131 non-ILLiad sites (OCLC Policies Directory, 2010). These figures reflect a sizable Odyssey user base, yet may not be completely accurate as they are dependent on ILL units keeping their information current, particularly that listed under "Delivery Methods" --> "Copies".

Relais Express

Based in Ottawa, Canada, Relais International is a company that has provided software solutions for interlibrary loan and document delivery since 1995. With customers in Canada, the USA, and the UK, Relais has found its market share predominantly among larger academic research libraries, national libraries, and archives. Library and Archives Canada, the National Institutes of Health, and the British Library are among those institutions that rely on Relais. Relais Express combines scanning and delivery functions within a single interface. Any TWAIN-compliant scanner may be used and a range of electronic delivery options may be employed: Ariel, Odyssey, fax, post to web in PDF format, and email attachment. Patron and library delivery information is stored in the system. Once a document is scanned, it is properly prepared for delivery according to the specified method, updated, and sent via automated processes (Relais, 2010). Although Relais is not widely used in the United States, it is worthy of mention in the context of this discussion.

BScan ILL

Since 1993, Image Access has provided digitization technologies to commercial markets. In 2004, Image Access created the Digital Library Systems Group (DLSG), a division that focuses on the ongoing development, service, and support of hybrid library digitization products. BScan ILL is a software package developed by DLSG specifically for interlibrary loan and document delivery units. The software can be used with a variety of scanning devices, but is most often integrated with DLSG's line of Bookeye planetary scanners. BScan ILL provides a single, intuitive document delivery interface that enables staff to produce high-quality images with automatic deskew, book curve correction, fan and gutter removal, etc. The finished document may be transmitted to the receiver via ILLiad/Odyssey, Ariel, Rapid, Clio, FTP or email, while the request is updated in the library's respective ILL management system in the process. BScan ILL automatically reads the requestor's information from scanned pull slips, saving time and eliminating human error. A growing number of academic libraries involved in high-volume resource sharing and document delivery are using BScan ILL. Represented in this expanding list are important research institutions such as Duke University, Harvard University, M.I.T., and the University of Florida (DLSG, 2010).
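Relais Express and BScan ILL both act as a dispatch layer over several transports, and the same pattern recurs with RapidX below. A schematic sketch of that pattern follows, again with hypothetical names and stand-in transport functions rather than any vendor's API:

    def send_odyssey(path, ip):  print(f"Odyssey -> {ip}: {path}")
    def send_ariel(path, ip):    print(f"Ariel   -> {ip}: {path}")
    def post_to_web(path):       return f"https://docs.example.edu/{path}"
    def send_email(addr, body):  print(f"email   -> {addr}: {body}")

    def deliver(path: str, borrower: dict) -> None:
        """Route one scanned document by the borrowing site's stored preference."""
        method = borrower.get("preferred_method", "email")
        if method == "odyssey":
            send_odyssey(path, borrower["ip"])
        elif method == "ariel":
            send_ariel(path, borrower["ip"])
        elif method == "post_to_web":
            send_email(borrower["email"], post_to_web(path))
        else:
            send_email(borrower["email"], f"attached: {path}")
        print(f"request {borrower['request_id']} updated")  # ILL system update

    deliver("article_123.pdf", {"preferred_method": "ariel", "ip": "192.0.2.10",
                                "request_id": 555, "email": "ill@example.edu"})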
RapidX

RapidILL was designed by the ILL staff at Colorado State University Libraries in 1997 to provide fast, cost-effective article requesting and delivery. Since that time, many libraries have joined what is known as "Rapid" - a total of about 180 sites worldwide at the time of writing. A holdings database common to the system matches requests down to the year level, providing a system average fill rate of 95% or greater and enabling consistent turnaround time between members of less than 24 hours (RapidILL.org). The newest innovation of Rapid is "RapidX", the electronic document delivery mechanism that makes interoperability between disparate systems possible. Via RapidX, lending libraries are able to transmit documents in TIFF or PDF format to any Rapid site regardless of that site's delivery preference (Odyssey, Ariel, Relais Express, Rapid website, etc.). During the process of converting the document to the appropriate format, RapidX will also automatically insert a coversheet and update the request to "filled" in the Rapid system. According to Jane Smith, Director of Rapid Development and Training, RapidX was first launched at Colorado State University around mid-2009 and will soon be employed by all Rapid sites (personal communication, May 6, 2010).

LITERATURE REVIEW

Ariel software, the first electronic document transmission system designed specifically for interlibrary loan, was reviewed as early as 1992 by Mary Jackson. At this time, the enhanced capabilities of Ariel over fax transmission were highlighted and the potential for Ariel to become the de facto standard for document transmission was recognized. Landes (1997) echoed Jackson's findings, delving deeper into hardware and software requirements and costs, while further updating technical product information. Ariel literature peaked at the turn of the millennium when the Journal of Interlibrary Loan, Document Delivery & Information Supply (now the Journal of Interlibrary Loan, Document Delivery & Electronic Reserve) devoted an entire issue to experiences of institutions using the software (Ives, 2000). Around the same time, publications began to examine how Ariel functionality could be enhanced through the use of software such as Prospero (Schnell, 1999; Sayed, Murray, & Wheeler, 2001; Weible & Robben, 2002) and DocMorph (Franke-Webb, 2001).

Comparatively, there has been a dearth of formal literature published on Odyssey since its inception in 2003. Connell and Janke (2006) published a study which evaluated turnaround time between Ariel and ILLiad Odyssey. Examining data across two separate institutions, the authors found that, with the Trusted Sender setting turned on, Odyssey delivery was faster than Ariel (p. 42). Most of the information concerning Odyssey can be gathered via websites, some of which are discussed in this paper. Although the Rapid system has been around for some time, the RapidX component has not been discussed in the literature up to this point. Rapid itself has been covered in recent years by Smith (2006) and Delaney (2007). BScan ILL has only recently been mentioned in the literature. Staff from the State University of New York at Buffalo and Empire State College briefly discussed BScan ILL and its role in providing expanded resource sharing. BScan ILL dramatically improved image quality and processing speed, while offering greater flexibility in electronic delivery methods (Bertuca et al., 2009).
The efficiencies achieved through the use of BScan software were also discussed in the context of scanning productivity at Iowa State University Library (Pedersen & Runestad, 2009). Although Relais Express was not explicitly mentioned, the document delivery component of the Relais product was discussed to some extent in a comparison of procedures in University of California, San Diego's (UCSD) interlibrary loan and course reserves units. In 2009, UCSD was in the process of implementing Relais ILL and anticipated that the software would resolve resource contention issues found when using Ariel and allow dispersed ILL departments to work effectively in a network environment (Elliot & Longacre, 2010). The Relais product has also been covered by Cornish (2000) and Guadagno (2005). To date, the authors are unaware of a study that has systematically examined the range of electronic document delivery methods available to libraries across the United States and garnered user feedback regarding these methods.

METHODOLOGY

In Spring 2010, the authors created a survey using Google Docs that was distributed to STARS-L, Arie-L, Odyssey-L, ILL-L, and Workflowtoolkit-L listserv subscribers as well as members of the ILLiad Webjunction group. These forums all focus on resource sharing, including interlibrary loan and document delivery, making them appropriate places to solicit responses about electronic document delivery methods. At the time of the survey, there were thousands of combined subscribers to these various lists, both national and international. However, the authors purposely limited the scope of the survey to U.S.-based institutions by framing the informed consent statement and survey questions to reflect this intent. This was done in an effort to keep the study more focused and manageable. In order to ensure confidentiality, survey respondents were not asked to identify themselves or their institution. Duplicates were eliminated based on location and identical responses to questions. Since the results of this survey represent a self-selected group of practitioners in the area of resource sharing, they cannot be deemed conclusive or representative of all libraries. Nevertheless, insight can be gained from the many respondents, representing libraries across the United States and its territories. For reference purposes, appendices are included. Appendix A contains the online survey and the formats for each question as they originally appeared on the web. Appendix B includes the scanner types used in conjunction with the Ariel and Odyssey software.

RESULTS AND DISCUSSION

There were a total of 104 respondents who participated in the web survey during April 2010. The types of libraries represented were unevenly distributed, with 90% Academic, 3% Public, and 7% Special (including corporate, medical, etc.). There were respondents from most states across the country. The states with the most responses, in descending order, were: New York (13), Massachusetts (7), Pennsylvania (7), Texas (7), Illinois (6), North Carolina (6), and Florida (5). Most other states had between 1 and 4 respondents, although no responses were recorded for Alabama, Delaware, Hawaii, Idaho, Maine, Mississippi, Montana, New Hampshire, New Jersey, Rhode Island, South Dakota, or Wyoming. In addition to those from the United States, there was also a response from the U.S. territory of Guam. Of the total, 66% send documents via Odyssey, either with ILLiad or as a standalone product.
Of those, 89% employ Odyssey integrated with ILLiad, while 11% take advantage of the free standalone version. Most Odyssey users (93%) both send and receive via the software, yet 3% only receive and 4% only send. Ariel is utilized by an even higher proportion of libraries in the sample (78%). Of those, the overwhelming majority (93%) have purchased the software. Only 7% have chosen the annual subscription option, most likely due to budgetary reasons or projection of future use. 95% of Ariel users send and receive documents via the software and 5% employ it on a receive-only basis. Electronic document delivery via email comes out above any proprietary methods, being used by 89% of respondents. Surprisingly, fax is still used by 64% of libraries to some degree. The method of uploading documents to a server and making those available to borrowers is used by 21%. Prospero's use is certainly waning, being used by only 5% of those surveyed. 8% of respondents used some "other" mechanism: Relais Express, RapidX, or a tool such as www.transferbigfiles.com. (See Table 1)

TABLE 1 Electronic Document Delivery Methods in Use by Responding Institutions
(indented rows are percentages of that method's users)

    Document delivery systems in use     Percent of respondents using method
    Odyssey                              66%
        ILLiad Odyssey                   89%
        Odyssey Standalone               11%
        Send & receive                   93%
        Receive only                      3%
        Send only                         4%
    Ariel                                78%
        Purchased software               93%
        Annual subscription               7%
        Send & receive                   95%
        Receive only                      5%
        Send only                         0%
    Email                                84%
    Fax                                  64%
    Upload to server                     21%
    Prospero                              5%
    Other                                 8%

Among libraries that use Ariel and Odyssey, 70% use both on a single PC, while 30% use them on separate PCs. ILL Management Systems in use include: ILLiad 65%, Clio 15%, Homegrown 10%, Relais 1%, Other 15%. The total adds up to more than 100%, as some respondents indicated multiple systems.

While it is important to ascertain what methods are currently being used, it is equally important to see what methods have recently been discontinued or will be discontinued in the near future. Although Ariel exhibits high use by libraries, it has the highest rate of abandonment. Fax is second, as many libraries find it no longer necessary due to the ease and speed of email transmission as well as the overall poor quality of faxed documents. That said, one respondent did recently discontinue transmission via email, while another discontinued uploading to server, as it forced activity outside of their system's workflow. Prospero's use is dropping, making the once-prominent companion to Ariel rather insignificant. A few institutions either have dropped Odyssey recently or are planning to drop it in the near future. Two of these were never able to get the Standalone version to work, while the other has opted to rely on RapidX. (See Table 2)

TABLE 2 Recently Discontinued Methods / Methods to Be Discontinued by Respondents

    Discontinued systems     Percent of respondents indicating recent or future discontinuation
    Odyssey                   3%
    Ariel                    22%
    Email                     1%
    Fax                       9%
    Upload to server          1%
    Prospero                  4%
    Other                     4%

Respondents were also asked what electronic document delivery methods they have recently adopted or will be adopting in the near future. Unsurprisingly, not a single library specified Ariel in this regard. Odyssey, on the other hand, came out on top as the method adopted most. To a lesser extent email, fax, uploading to server, and other methods such as RapidX have been recently adopted or planned for future adoption.
(See Table 3)

TABLE 3 Recently Adopted Methods / Methods to Be Adopted by Respondents

    Adopted systems          Percent of respondents indicating recent or future adoption
    Odyssey                  21%
    Ariel                     0%
    Email                     2%
    Fax                       1%
    Upload to server          5%
    Prospero                  0%
    Other                     5%

Ariel

Ariel users expressed a range of likes and dislikes with the software, which we list below, but they also provided illuminating comments that speak to Ariel's unique history. One respondent commented that "Ariel was built by librarians for library use and is suited to ILL". This respondent is referring to RLG and its role in developing Ariel. Similarly, other respondents cite a history or habit of using the software:

"Ariel has been in use here since the mid-90s"
"We received a grant for Ariel approximately 9 years ago and have used it ever since."
"Ariel is the standard"

This familiarity with the software, cultivated through years of use, is echoed in favorable comments about Ariel. Other respondents articulated an imminent decline in the software, which we list further below. There are contradictions, though, in responses. Some state that Ariel is cost-effective and easy to use. Others state the contrary. Ariel clearly has both fans and detractors, but the most important question raised by our respondents is, what is the future of Ariel?

Respondents cited a range of reasons why they like Ariel. The top three reasons were: many libraries and/or consortium partners use Ariel (22%); it is simple and easy to use, and works well with older operating systems and hardware (19%); and Ariel is speedy (15%). Respondents also stated that Ariel works well, most of the time. Documents can be received or imported through email or Odyssey, and uploaded onto the library's web server for patrons to easily access. PDFs can also be conveniently emailed via Ariel to libraries that do not have Ariel or Odyssey. Scanned articles are available on the local server, so they can be redelivered easily, even as a PDF file, if initial delivery fails. The address book is easy to use and is saved during upgrades. Respondents also appreciate the delivery log or queue, which is searchable. The log conveniently lists problems with received, sent or emailed articles. Because Ariel does not rely on initiating the scanner for each page, it scans faster. PDF and TIFF formats are also supported. One respondent stated that Ariel interacts better with their scanner, and with self-feeding scanners. Secure document transmission was another valued feature. One respondent liked the customer support. The scan settings are adjustable, and Ariel handles color documents. Ariel is good for large files; works well with ILLiad; includes libraries that do not have Odyssey; and allows batch sending.

Ariel users indicated reasons why they use Ariel over Odyssey. Most stated that if the borrowing institution does not have Odyssey, they will use Ariel instead (18% of respondents). Other respondents stated that Ariel has a larger user base. Some choose Ariel for scanned articles, and Odyssey only for articles from electronic journals. Also, Ariel is preferred because it monopolizes the scanner. If the borrowing library prefers Ariel, respondents said they would choose Ariel over Odyssey. For large PDF files, one library stated that it is easier for them to send the article through Ariel rather than convert it to a TIFF file to be sent through Odyssey. Unlike with Odyssey Standalone, articles do not need to be rescanned in order to be resent.
The "send to patron" email function in Ariel is used to send documents to libraries that lack Ariel or Odyssey because the transmission is better than fax. Some long-time Ariel users interestingly said they prefer to use Ariel "out of habit". One Docline library stated that Ariel works better than Odyssey. A Rapid library claimed that ArticleReach and Docline do not integrate well with ILLiad, so they use Ariel instead. Another Rapid library stated that for Rapid requests, they use Ariel, because many Rapid libraries still use it. A few libraries (4) expressed that they are not ready for Odyssey due to budgetary constraints or a lack of time. Other reasons include: their microfilm scanner does not work well with Odyssey; Ariel is part of the existing workflow; some libraries only have Ariel; there are more export options for documents that are received; Ariel is common in the state; documents upload onto a secure server; and when Odyssey fails, Ariel is their backup option.

Technical problems and the lack of customer support from Infotrieve were frequently cited as reasons why respondents dislike Ariel. An overwhelming 40% of respondents expressed their dissatisfaction with customer support. Conflicts with the scanner between Ariel and Odyssey were also frequently cited as a nuisance. Other technical problems included: slow and prone to errors and crashes; inconsistent connections; does not work with Vista; firewall and IP address issues; problems with the email server in terms of delivering articles to patrons; difficulty setting up the program and loading it onto another machine; problems sending to institutions with different versions of Ariel; and a need for institutional staff or IT support with the program. According to respondents, a number of features and services are lacking: updates are infrequent; deliveries are not updated in OCLC or ILLiad; Ariel needs better security; and varying shades of black to gray do not transmit well. Various inconveniences were also cited: Ariel may not send larger documents; editing documents can be clunky; the program needs to be on in order to receive; it does not have a log for email transmissions; users cannot change an email PDF attachment name or subject line easily; and in borrowing, documents are delivered as TIFFs which are too large to work with and must be converted to PDFs. Some stated that Ariel is not user-friendly and lacks a usable manual. The documentation is also confusing. Some respondents stated that Ariel is expensive, the technology outdated, and the number of Ariel institutions limited, making Ariel less desirable.

Some of the problems cited above compel respondents to phase out Ariel. Conflicts with Odyssey, lack of support, infrequent updates, firewall issues, technical problems, and the high cost to maintain and update Ariel are convincing libraries to discontinue it. Programs such as Relais Express, RapidX and Odyssey are competitors that are also drawing libraries away from Ariel. Comments from two respondents express a bleak future for Ariel:

"Ariel is a doomed product and we want to reduce our dependence on it to the smallest possible footprint so that when it's no longer supported by the vendor it will be a non event."
"We discontinued Ariel because support from Infotrieve was minimal, and it took too much of our IT staff time to maintain so many systems."

Odyssey

Among those libraries that use ILLiad Odyssey, there were consistent reasons why they like it.
The number one reason involved Odyssey's integration with the ILLiad management system, which enables auto-updating and patron notification of requests (42%). Related to this, 25% specifically mentioned Trusted Sender as being a standout feature, as it allows for unmediated delivery of articles to patrons' ILLiad accounts and speeds turnaround time. Respondents also remarked that Odyssey requires very little staff intervention and is easy to use (15%) and that it is fast and efficient (11%). A number of positive comments were tied to the scanning function. Odyssey is compatible with most scanners and allows for flexibility in scanning from multiple PCs. Scanning features included the ability to easily preview scanned pages and edit them with many tools, support for color/grayscale, batch page rotation, a hot key to trigger the scanner, and the ability to mix resolutions and color depth within a single document to reduce file size. Odyssey Helper, an ILLiad module that batch processes scanned articles for document delivery and lending, was also mentioned. One respondent believed that Odyssey simplifies the transmission of articles from e-journals. A few others pointed out that customer support from Atlas and OCLC is reliable. Odyssey Standalone was valued primarily because it is free and easy to configure and use. Users are also able to easily deskew and edit images in other ways. Some find that Odyssey Standalone coexists with Ariel without problems and that it seems to be able to handle larger document transmissions than Ariel.

Odyssey users specified reasons for using Odyssey over Ariel. 30% of Odyssey users (ILLiad and Standalone combined) were emphatic that Odyssey is always their first choice of electronic document delivery. If Odyssey is the preferred or only method used by the borrowing institution, this also factors in. Trusted Sender and integration with ILLiad ranked high. Less staff intervention, less manual updating, and less chance of human error were cited. This leads to more efficient delivery: "If articles arrive after-hours or on weekends, patrons receive them immediately without ILL staff intervention". One respondent believed Odyssey worked better with their scanners (Ricoh Aficio IS330DC and HP Scanjet 8290). After locating a print article or checking in an item, Odyssey also presents the option to scan right away. Additional scanning features that set it apart from Ariel include: a larger preview window of the current scan, a higher success rate with color scans, and a greater range of document editing tools. Installations and upgrades appear to be smoother as well and are facilitated by OCLC/Atlas (ILLiad version). According to some, an increasing number of consortial partners are opting to use Odyssey. One respondent carried this further by saying, "We wish all libraries would use Odyssey. It would simplify our procedures."

Respondents criticized ILLiad Odyssey because of the following: not as many libraries, particularly smaller schools, have it as have Ariel, so they must rely on email as well (10%); and imported files/sent documents must be in TIFF format (10%). Negative comments related to the scanning interface included: errors with color scans/large imported files, slow scanning (due to initiation of the scanner for each page), inability to adjust brightness/contrast, less flexible image settings, no recognition of black edges/autocropping, and extra mouse clicks involved in the scanning process.
Other scanning issues included Odyssey not working well with a particular scanner, problems running alongside Ariel, resends more problematic than Ariel (because of the way the document is tied to a specific ILLiad transaction), and "clunkiness" in clearing up failed deliveries. A few respondents believed that it did not work well with their existing workflows. One respondent pointed out that when hosted by OCLC/Atlas, a failed scan necessitates rescanning the entire document. The inability to "post to web" on the lending side, select Odyssey OR Ariel on a per request basis, and easily send to Odyssey Standalone sites were mentioned. Annoyance was expressed over the need to include a coversheet for Standalone sites for the purposes of request identification. A Docline library felt that Odyssey could be improved for libraries using that system. A RAPID library expressed a similar sentiment by saying that "Odyssey should be designed to work with ILLiad like RapidX works with RAPID". The user base for Odyssey Standalone was appreciably smaller, yet they had their share of criticism. This comment was echoed amongst respondents: "The Standalone version allows other institutions using ILLiad to send articles to us but does not automatically include any request information (such as OCLC ILL number or transaction number, which show up for ILLiad Odyssey users)". If an article is not successfully delivered, it must be scanned again. One respondent believed the address book is limited by the inability to assign one IP address to multiple libraries. However, the authors are aware of a workaround for this problem, which can be found at https://osu.illiad.oclc.org/illiad/osu/lending/odysseyfaq.html. Not being able to remove unsent items from the queue was mentioned. The fact that Odyssey Standalone does not have an email capability in order to deliver to end users or other libraries was pointed out. Others believed it to be slow, clunky, and confusing. One respondent cited firewall issues, while another brought attention to the fact that the software times out after 4 attempts at electronic delivery. In certain cases, potential users are unsuccessful at getting Odyssey Standalone to work with their existing systems. Since fewer libraries have Standalone than either Ariel or ILLiad Odyssey, this also appears to be a disincentive. Based on responses for Ariel and Odyssey, we identified features and characteristics that libraries want and need from a document delivery program. Libraries need reliable technical support. They want interoperable software, integrated with an ILL program, making transaction updates seamless and automated. For example, automatic updating in OCLC or the Trusted Sender feature in ILLiad where documents received via Odyssey are sent directly to the patron without ILL staff intervention, are highly desirable. Libraries want regular and inexpensive software updates, including updates that are compatible with new operating systems (e.g. Vista and Windows 7). They want fast and secure document transmission, and the ability to deliver any file type (PDFs, TIFFs). They want a program that is robust and reliable, easy to install, and that requires minimal IT staff support. 
Other desirable features include:
- the ability to quickly resend documents if initial delivery fails;
- the ability to store scanned files temporarily or until deleted;
- compatibility with various scanner brands and models, without monopolizing the scanner;
- easy editing of documents before sending;
- the ability to send large files and color documents;
- document preview before sending; and
- the option to deliver directly to patrons.

CONCLUSION
Respondents identified both beneficial features of and drawbacks to existing document delivery methods. Their responses suggest that no software meets all of their document delivery needs. This may explain why 95% of respondents use more than one delivery method, and why 49 out of 66 ILLiad libraries (75%) use Ariel along with Odyssey. One respondent aptly writes: "The more electronic DD options the better -- you never know what the other institution might have." Respondents stated that technical problems, the number of libraries using a particular delivery method, and the type of document, among other reasons, determine which delivery method they use for each transaction. Electronic document delivery software may be system-dependent, outdated, prohibitively expensive, etc., which creates stumbling blocks to interoperability. Ariel is system-independent, but it is no longer being developed and therefore may not work well with future operating systems. Atlas Systems has shown magnanimity in releasing the free standalone version of Odyssey. However, the uptake of this free product has been rather slow. Some libraries have found its functionality too limited, while others have not been able to integrate it into their existing systems. Perhaps software with the accessibility of Odyssey Standalone, yet the interoperability of RapidX, will hold the most promise for the future. The document delivery horizon is certainly wide open for innovation and improvement.

REFERENCES
Atlas Systems. (2010). Odyssey - product information. Retrieved from http://www.atlas-sys.com/products/odyssey/
Connell, R., & Janke, K. (2006). Turnaround time between ILLiad's Odyssey and Ariel delivery methods. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 16(3), 41-56. doi:10.1300/J474v16n03_07
Delaney, T. (2007). RAPID and the new interlending: A cooperative document supply system in the USA. Interlending & Document Supply, 35(2), 56-59. doi:10.1108/02641610710754042
Franke-Webb, J. (2001). Using DocMorph in conjunction with Ariel to expand digital document delivery options. Journal of Interlibrary Loan, Document Delivery & Information Supply, 12(1), 85-92. doi:10.1300/J110v12n01_08
Holbert, G. L., Sayed, E. N., & Murray, S. D. (2002). Prospero power: Web-based document delivery allowing libraries to exchange documents through interlibrary loan; panel presentation given October 10, 2001, international scientific conference in library and information service, V. Vernadsky National Library of Ukraine, Kiev, Ukraine, October 9-11, 2001. E-JASL: The Electronic Journal of Academic and Special Librarianship, 3(1/2), 1.
Infotrieve. (2009). Ariel interlibrary loan software. Retrieved from http://www.infotrieve.com/ariel
Ives, G. (2000). Introduction. Journal of Interlibrary Loan, Document Delivery & Information Supply, 10(4), 1-2. doi:10.1300/J110v10n04_01
Jackson, M. E. (1992). Using Ariel, RLG's document transmission system, to improve document delivery in the United States. Interlending & Document Supply, 20(2), 49-52. doi:10.1108/02641619210154477
Landes, S. (1997). The ARIEL document delivery system.
Journal of Interlibrary Loan, Document Delivery & Information Supply, 7(3), 61-72. doi:10.1300/J110v07n03_08
Lavigne, J., & Eilts, J. (2000). The evolution of Ariel. Journal of Interlibrary Loan, Document Delivery & Information Supply, 10(4), 3-7. doi:10.1300/J110v10n04_02
Miller, B. (2009). Odyssey Standalone FAQ. Retrieved from https://osu.illiad.oclc.org/illiad/osu/lending/odysseyfaq.html
Miller, B. (2010). What every ILLiad library should know about Odyssey Standalone. 2010 ILLiad International Conference, Virginia Beach, VA. Retrieved from https://www.atlas-sys.com/conference/ConferenceSessions.aspx
Morgen, E. B., & Hersey, D. (2003). Prospero 2.0. Journal of the Medical Library Association, 91(3), 381-382.
Online Computer Library Center, Inc. (2010). OCLC Policies Directory. Retrieved May 10, 2010, from https://illpolicies.oclc.org/
Online Computer Library Center, Inc. (n.d.). Uniting OCLC and RLG. Retrieved from http://www.oclc.org/research/about/oclcrlg/default.htm
Open Source Systems for Libraries (oss4lib). (2010, May 12). ILL-DD board. Retrieved from http://oss4lib.org/node/569
RapidILL.org. (2010). RapidILL: Frequently asked questions. Retrieved from https://rapid2.library.colostate.edu/PublicContent/AboutRapid.aspx
Sayed, E. N., Murray, S. D., & Wheeler, K. P. (2001). The magic of Prospero. Journal of Interlibrary Loan, Document Delivery & Information Supply, 12(1), 55-72. doi:10.1300/J110v12n01_06
Schnell, E. H. (1999). Freeing Ariel: The Prospero electronic document delivery project. Journal of Interlibrary Loan, Document Delivery & Information Supply, 10(2), 89. doi:10.1300/J110v10n02_08
Smith, J. (2006). The RAPIDly changing world of interlibrary loan. Technical Services Quarterly, 23(4), 17-25. doi:10.1300/J124v23n04_02
Weible, C. L., & Robben, C. (2002). Calming the tempest: The benefits of using Prospero for electronic document delivery in a large academic library. Journal of Interlibrary Loan, Document Delivery & Information Supply, 12(4), 79-86. doi:10.1300/J110v12n04_08

APPENDIX A
Web-Based Survey - "2010 Electronic Document Delivery Options Study"
1. What is your library type? a. Academic b. Public c. Special
2. In which U.S. state is your library located? (select from list of states, including District of Columbia)
3. If outside the U.S., please specify your location. (short text box for territories such as Guam)
4. What electronic delivery methods do you use to lend articles? a. Odyssey b. Ariel c. Email d. Fax e. Upload to server f. Prospero g. Other:
5. Why did your institution decide to use this or these method(s)?
6. Have you recently adopted/are you planning on adopting any of these electronic delivery methods in the next 12 months? a. Odyssey b. Ariel c. Email d. Fax e. Upload to server f. Prospero g. Other:
7. Why?
8. Have you recently discontinued/are you planning on discontinuing any of these electronic delivery methods in the next 12 months? a. Odyssey b. Ariel c. Email d. Fax e. Upload to server f. Prospero g. Other:
9. Why?
10. In what way do you use Ariel? a. Send and receive b. Receive only c. Send only
11. What do you like most about Ariel?
12. What do you like least about Ariel?
13. Do you use Ariel over Odyssey for any specific reasons?
14. What model scanner are you using with Ariel?
15. In what way do you use Odyssey? a. Send and receive b. Receive only c. Send only
16. What version of Odyssey do you use? a. Included with ILLiad b. Standalone version
17. What do you like most about Odyssey?
18. What do you like least about Odyssey?
19. Do you use Odyssey over Ariel for any specific reasons?
20. What model scanner are you using with Odyssey?
21. Do you use Ariel and Odyssey on a single PC? (YES/NO)
22. If so, what issues do you find with this arrangement?
23. What type of ILL management system do you use? a. ILLiad b. Clio c. Relais d. homegrown system e. other:
24. Additional comments:

APPENDIX B
Number (#) next to a model indicates the number of respondents who specified an identical scanner.

Ariel - Scanner models: Bookeye (unspecified model) (6), Bookeye 2 (2), Bookeye 3, Canon Canoscan 8000F (2), Canon ImageRunner 400E, Canon iR3235/iR3245 PS3, Epson GT15000 (2), Epson Perfection 1640SU, Fujitsu flatbed/ADF, Fujitsu fi-4120C, Fujitsu fi-4220-C (2), Fujitsu fi-4340C (8), Fujitsu fi-5220C (5), Fujitsu fi-5750C (2), Fujitsu fi-6230, Fujitsu fi-6240dj, Fujitsu fi-6770, Fujitsu M3093G, Fujitsu M3096Gx, Fujitsu M4097D (2), Fujitsu ScanPartner 15c (2), Fujitsu ScanPartner 620c, Fujitsu ScanPartner 93Gx, HP Officejet 7130, HP Scanjet (unspecified model), HP Scanjet 3500, HP Scanjet 5550C, HP Scanjet 5590 (2), HP Scanjet 7000, HP Scanjet 7400c (3), HP Scanjet C7710A, HP Scanjet 8250, HP Scanjet 8270 (2), HP Scanjet 8300, HP Scanjet N8460, Lanier LP425, Minolta PS 5000C (2), Minolta PS7000 (9), Minolta 7145, Plustek Opticbook 3600, Ricoh IS760D, Ricoh Aficio IS330DC (3), SCSI scanner device, Toshiba e-Studio 351c, WideTek (unspecified model), WideTek 25, WideTek Super B, WideTek flatbed scanner, Xerox Documate 510

Odyssey - Scanner models: Bookeye (unspecified model) (8), Bookeye 2, Bookeye 3, Canon C 3380, Canon Canoscan 8000F (2), Epson GT2500, Epson GT15000, Fujitsu flatbed, Fujitsu fi-4120C, Fujitsu fi-4200c2, Fujitsu fi-4340C (8), Fujitsu fi-5220C (2), Fujitsu fi-5750C, Fujitsu fi-6230, Fujitsu fi-6240dj, Fujitsu fi-6770, Fujitsu ScanPartner 93Gx (2), Fujitsu ScanPartner 620c, Fujitsu M3096Gx, Fujitsu M4097D, HP LaserJet 3015, HP LaserJet M5053 (3), HP LaserJet M5590, HP Scanjet 7000, HP Scanjet 8250, HP Scanjet 8270, HP Scanjet 8290, HP Scanjet 8300 (2), HP Scanjet N8460, Konica Minolta 7145, Lanier LD 425c, Minolta PS 5000c, Minolta PS 7000 (5), OCE 3165, Plustek Opticbook 3600, Ricoh Aficio IS330DC (2), Ricoh IS760D, WideTek (model unspecified), WideTek 25, WideTek (w/B-scan software) (3), Xerox Documate 510

NSF Budget Summary for Research and Related Activities, FY 1987-1988 (Dollars in Millions)

Activity                                              FY1987 Current Plan   FY1988 Request   % Change FY88/87
Mathematical and Physical Sciences                          $464                 $514             11%
Engineering                                                  163                  205             26%
Biological, Behavioral, and Social Sciences                  257                  297             16%
Geosciences                                                  285                  330             16%
Computer and Information Science and Engineering             117                  143             23%
Scientific, Technological, and International Affairs          44                   51             17%
Program Development and Management                            77                   95             24%
Research and Related Activities                            $1,406               $1,635            16%
Amounts have been rounded.

John Zaller. Analysis of Information Items in the 1985 NES Pilot Study. June, 1986. Staff memoranda: John Brehm and Santa Traugott. Similarity and Representativeness of 1985 Pilot Half-samples. March 17, 1986. Santa Traugott. Staff Evaluation of the Data Collection and Documentation. May 19, 1986.

• National Science Foundation Increases FY88 Budget Request
Erich Bloch, Director of the National Science Foundation (NSF), has announced that the NSF budget request for FY 1988 is $1.89 billion, almost 17% above the FY 1987 appropriation. Bloch said, "This increase is part of a national effort to bolster our economic competitiveness internationally and preserve our lead in science and technology."
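The "% Change" column in the budget table above is simple year-over-year growth of the FY1988 request against the FY1987 current plan; for the Engineering Directorate, for instance:

\[
\frac{205 - 163}{163} \approx 0.258 \quad\Rightarrow\quad 26\%
\]

Slight deviations in one or two rows reflect rounding of the underlying dollar amounts.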
"In addition," he said, "the Administra- tion is committed to doubling the Foun- dation's budget over the next five years. This amounts to increases of 1 4 percent per year for the next four years following 1988, raising the NSF total budget to more than $3.2 billion in FY 1 9 9 2 . " Budget increases for 1988 are widely distributed among the directorates and major activities of NSF. The largest in- creases go to the Engineering Directorate and to the new Computer and Informa- tion Science and Engineering Directorate, both of great importance in helping the U.S. regain its economic competitive- ness. Budget figures by directorate are listed on the chart. "The 1988 budget is a watershed event for NSF." Bloch concluded, " I n time of severe budget constraints it signals clear- ly the broader understanding of the importance of strong education and re- search programs to improvements in national productivity and our ability to compete in world markets." • Announcements Symposia Published The following journals offer symposia of interest to political scientists: Publius: The Journal of Federalism "Assessing the U.S. Voting Rights A c t " (1 6:4) guest edited by Charles L. Cotrell. Articles include general assessments of the VRA as well as assessments for selected states and localities. "Developments in State Constitutional L a w " (17:1) guest edited by Mary Cor- nelia Porter and G. Alan Tarr. Articles cover general developments in state con- stitutions and state constitutional law, 2 9 3 News of the Profession individual rights protection under state constitutions, state high court activity, attorneys general as interpreters of state constitutions, and other matters. Available: Publius: The Journal of Feder- alism, c/o Department of Political Sci- ence, North Texas State University, Den- ton, TX 7 6 2 0 3 - 5 3 3 8 . $ 1 0 , single issue ( + $1.50 for foreign postage). Make checks payable to CSF-Publius. The Review of Politics "American Constitutional Theory and Interpretation," Summer 1987. "German Politics," including articles related to the 1987 national election in the Federal Republic of Germany, to be published in Fall 1987. Available: The Review of Politics, Notre Dame, IN 4 6 5 5 6 . $5, single issue; $ 1 2 . 7 5 , APSA members annual rate. D Newsletter Summarizes Proposed Federal Budget The February 13, 1987 (Vol. VI, No. 3) issue of COSSA Washington Update summarizes and analyzes proposed budgets for social and behavioral science research in FY 1988. Information is given separately for each federal depart- ment and independent agency. A single copy of the newsletter is free from: COSSA Washington Update, 1 200 17th Street, NW, Suite 520, Washing- ton, DC 2 0 0 3 6 . Phone: (202) 887- 6166. • work Analysis, Contextual and Multilevel Effects, Artificial Intelligence, Bootstrap and Jackknife Re-sampling Methods, and LISREL Models. There will also be work- shops on methodological application in the areas of American Electoral Re- search, Latino Research Issues, Crime and Criminal Justice, Population Projec- tion and Estimation, and the Survey of Income and Program Participation (SIPP). The eight-week program will be divided into two four-week terms and offer as well the standard courses on Linear Models, Causal Analysis, Time Series, Mathematical Modeling, and Logit/Log- Linear Models. For more information, application and brochures, contact: ICPSR Summer Program, P.O. Box 1248, Ann Arbor, Ml 4 8 1 0 6 ; (313) 7 6 4 - 8 3 9 2 . 
• Department of Education Seeks Reviewers
The Department of Education is seeking to enlarge its list of proposal reviewers for several of its categorical programs, including the graduate fellowships in Title IX and various foreign languages and international area studies programs. Interested parties should submit a full c.v. showing subject/area specialization and expertise to: Richard Naber, Program Support Branch Chief, Office of Higher Education Programs, Department of Education, 400 Maryland Avenue, SW (stop ROB-3, 3108A), Washington, DC 20202.

• Summer Program Offered in Quantitative Methods
The twenty-fifth annual ICPSR Summer Program in Quantitative Methods of Social Research will be held in Ann Arbor, Michigan, June 29 to August 21, 1987. The Summer Program will feature a number of special courses and presentations on such topics as: Structural Equations with Limited Dependent Variables, Pooled Time Series Analysis Models, Statistical Estimation of Formal Models, Network Analysis, Contextual and Multilevel Effects, Artificial Intelligence, Bootstrap and Jackknife Re-sampling Methods, and LISREL Models. There will also be workshops on methodological application in the areas of American Electoral Research, Latino Research Issues, Crime and Criminal Justice, Population Projection and Estimation, and the Survey of Income and Program Participation (SIPP). The eight-week program will be divided into two four-week terms and offer as well the standard courses on Linear Models, Causal Analysis, Time Series, Mathematical Modeling, and Logit/Log-Linear Models. For more information, application and brochures, contact: ICPSR Summer Program, P.O. Box 1248, Ann Arbor, MI 48106; (313) 764-8392.

• Maggiotto to Direct Southern Association
Michael A. Maggiotto, associate professor of government and international studies at the University of South Carolina, has been named executive director of the Southern Political Science Association. The association's more than 3,500 institutional and individual members worldwide make it the oldest and largest regional political science organization in the United States. His appointment means that headquarters for the association will move to USC. The editorial offices of The Journal of Politics will remain at Emory University. Maggiotto, whose areas of specialization include American politics, political behavior and methodology, earned his bachelor's degree from the State University of New York at Buffalo and his master's and doctoral degrees in political science from Indiana University. He joined the USC faculty in 1982 and previously taught at the University of Florida from 1976-82. He is a member of The Journal of Politics editorial board and served as assistant managing editor from 1977-82. He also has served on the executive council of the Southern Political Science Association.

• British Politics Group Elects Executive Committee
Editor's Note: In the winter 1987 PS we stated that the following individuals were elected to the executive committee of the Hoover Institution. That was wrong. They are the new executive committee members of the British Politics Group. In a recent election, the following were chosen for the Executive Committee of the British Politics Group: James Alt, David Butler, Leon Epstein, Donley Studlar, and Kenneth Wald. The Committee elected Ivor Crewe as president. Gerald Dorfman is acting as executive secretary while Jorgen Rasmussen is in Scotland for a year.

• Parliament Recognizes Hull Professor
MPs, peers, officers of Parliament, journalists, academics and Hull politics graduates crowded a reception room at the House of Commons on January 28, 1987, to celebrate the election of Philip Norton, an expert on Parliament, to a personal chair. Speaking on behalf of the Hull University Politics Graduates, Cliff Grantham paid tribute to Professor Norton's dedication to his teaching and students as well as his scholarship. Sir Bernard Braine, representing the parliamentarians present, declared that Norton knew more about Parliament than anybody outside it, "and probably more about it than most people inside it."
" Norton spoke of the need for more in- depth studies of Parliament as opposed to superficial observations and studies that divorced Parliament from an appre- ciation of British political history. • Program Preserves U.S. Newspapers In 1 9 8 2 , the National Endowment for the Humanities initiated a program to organize, preserve, and provide access to U.S. newspapers. Under the U.S. News- paper Program, the NEH provides funding to national repositories and to state pro- jects involving libraries, archives, and his- torical societies. The Library of Congress provides technical management for the 2 9 5 News of the Profession newspaper program, and the Online Computer Library Center (OCLC) in Dublin, Ohio, provides facilities for the bibliographic phase of the program. In 1982, six national repositories, with NEH funding, began to catalogue and enter their newspaper holdings into The OCLC/CONSER data base. The Library of Congress also contributed its catalogue records. To date, eight repositories, as well as institutions in twenty-four states, have participated in the U.S. Newspaper Program. Participants in the program accept the responsibility to catalogue newspaper holdings within their respective states and territories and to preserve on micro- film the titles most important for re- search. They then enter bibliographic records for originals and microfilms into the OCLC data base. The 1985 data base provides extensive cross-references for variant titles, which allows users to identify more than 5 0 , 0 0 0 newspapers published in the United States and its territories, where the title is held, and which issues are held. A second edition, which will include a further 3 5 , 0 0 0 titles, is expected soon. Indexes to a hard-copy listing allow access by place of publication (city and state), publication date (the year in which the paper began), predominant language of the paper, and intended audience (ethnic, political, or religious). Source: Humanities (Washington, DC: NEH, January/February 1 987), p. 2 6 . U Grassroots Peace Directory Published The Grassroots Peace Directory is a computer-based directory of information on religious and secular groups working in the areas of peace, disarmament, and international security. Detailed listings include phone centacts, issue focus, method of operation, organizational infor- mation, and program description. Direc- tories are updated biannually. Prices for regional directories vary from $ 6 . 5 0 to $ 1 6 . 0 0 depending on region. Mailing labels are also available. For further information, contact: Grass- roots Peace Directory, PO Box 2 0 3 , Pomfret, CT 0 6 2 5 8 ; phone: (203) 928- 2 6 1 6 . • Peace Research Compiled If you are involved in planned, in- progress, or completed but unpublished peace-related research that also involves children and families, you may share your research with others by sending a one- page abstract to: Helen Raschke, c/o Libra Foundation, 3 3 0 8 Kemp Street, Wichita Falls, TX 7 6 3 0 8 . Also write to this address for the upcoming compila- tion, available summer 1 9 8 7 . • Jimmy Carter Library Opens to Researchers The Jimmy Carter Library in Atlanta, Georgia, officially opened its research facilities January 2 7 , 1987. More than six million pages of material documenting the Carter Administration was opened. 
This material includes the Subject and Name Files of the White House Central File and significant portions of the papers of the following White House staff members: Jody Powell, Press Secretary to President Carter; Stuart Eizenstat, Director of the Domestic Policy staff; Hugh Carter, Jr., Director of the Office of White House Administration; Gerald Rafshoon, Director of the White House Communications Office; and Sarah Weddington, Director of the Office of Women's Affairs. Other material opened for research will include memoranda from White House staff members advising the President on national and international policies; reports to the President; and correspondence between President Carter and national and world notables on topics ranging from the environment, education, and mental health to nuclear disarmament and the exploration of space. Federal records relating to presidential commissions and White House conferences have also been opened. These include records relating to the National Commission on Neighborhoods; the White House Conference on Balanced National Growth and Economic Development; the President's Commission for a National Agenda for the Eighties; and the Presidential Commission on Mental Health. Nearly one million photographs, hundreds of hours of motion picture film, and audio and video tapes relating to the Carter Presidency were opened as well. The Jimmy Carter Library was dedicated on October 1, 1986. It is one of eight presidential libraries administered by the National Archives. The library is located in the Carter Presidential Center on Copen Hill, along with the former President's office, the Carter Center of Emory University, and offices of national and international organizations President Carter supports. The Carter Library is open for research 9 a.m. to 4:45 p.m. Monday through Friday. For further information, contact: Jimmy Carter Library, (404) 331-3942.

• Archive Established for Materials Relating to Assassination of R. F. Kennedy
Southeastern Massachusetts University has established this new archive to serve scholars, journalists, and interested publics. The facility holds a rich variety of primary material relevant to history, criminology, journalism, jurisprudence, and the politics of the 1960s. The collection contains all or significant portions of the following files: Superior Court trial exhibits, Los Angeles District Attorney's Office records, Los Angeles Police Department investigative records, FBI records (32,000 pages of newly released documents), audio and video tapes (600 hours of hearings, media programs, and interviews), photographs, and magazine and newspaper clippings. In addition it contains the papers of journalist Robert Blair Kaiser (author of the award-winning book RFK Must Die!), as well as the collections of three private researchers. The facility is still acquiring and cataloging data, but is open to scholars. Inquiries can be directed to the Chairperson of the Archive: Philip H. Melanson, Political Science Department, Southeastern Massachusetts University, North Dartmouth, MA 02747.

• Women Helped in Finding Washington Internships
Washington Internships Unlimited offers counseling to women interested in internships in Washington. The service encourages women to regard volunteer internships as a step from homemaking to working in a paying career. Counseling includes help in defining goals, locating internships, and preparing resumes.
For further information, contact WIU, 4264 S. 35th Street, Arlington, VA 22206. Phone: (703) 998-1746.

• Negative TV Political Ad Tape Is Available
A 45-minute video tape of historic negative ads used in campaigns from 1952 to 1986 is available. This tape is an excellent teaching tool in communication, journalism, advertising, and political science classes. The tape is available for $200. Four additional video tapes of historic television ads are also available, as previously announced in PS, and vary in price from $100 to $200. The individual, department, or library ordering one or several tapes provides blank ½-inch or ¾-inch VHS cassette tapes. If no tape is provided, an additional $10.00 for ½-inch tapes or $30.00 for ¾-inch tapes will be charged. Order from L. Patrick Devlin, Department of Speech Communication, University of Rhode Island, Kingston, RI 02881-0812, or phone (401) 792-2552.

• GSS Bibliography Available
The sixth edition of the annotated bibliography of papers using the General Social Survey is now available through the Inter-University Consortium for Political and Social Research. The bibliography contains 1,498 citations of publications that have used the General Social Survey in their analyses. Each entry in the bibliography contains a full citation, a list of the General Social Surveys used, the mnemonics used, and a short abstract. A mnemonic index permits quick identification of all references using particular variables of interest. The cost of the GSS Bibliography is $12 per copy for individuals from ICPSR member institutions and $18 per copy for those individuals not affiliated with an ICPSR institution. Inquiries and requests should be sent to: Member Services, ICPSR, P.O. Box 1248, Ann Arbor, MI 48106; (313) 763-5010.

• CIS Launches "Academic Editions" Product Line
Congressional Information Service, Inc. (CIS) will introduce a new product line of subject-specific microfiche collections to be published under the imprint CIS Academic Editions. The collections will cover a wide variety of subjects in the humanities and the arts, history, history of science, business, government, politics, interdisciplinary topics, and biography. CIS generally will publish a printed index or finding aid with each collection to ensure ease of use. The first file to be published, a documentary history of the Library of Congress, will be available in spring 1987. Developed in cooperation with the Center for the Book in the Library of Congress, the collection will chart the development of the Library from its inception through 1985. Other collections currently planned for 1987 include: The Franklin Institute and the Making of Industrial America; The Occupation of Japan: U.S. Planning Documents, 1942-1945; Jesuit Woodstock Letter and Essays; and Peronista and Other Argentinian Political Pamphlets. For further information, contact: Congressional Information Service, 4520 East-West Highway, Suite 800, Bethesda, MD 20814; phone: (301) 654-1550 or (800) 638-8380.

• Subscriptions to Humanities Available
Humanities is the bimonthly magazine of the National Endowment for the Humanities with an annual subscription rate of $14. Features include "how to" details on applying for NEH grants, excerpts from persuasive proposals, lists of most recent NEH grants, and names and telephone numbers of staff members who will help grant applicants.
To subscribe, write to: Humanities, Room 409, 1100 Pennsylvania Avenue, NW, Washington, DC 20506.

• Guides Help Plan Study Abroad
The Institute of International Education is a nonprofit organization that publishes guides to studying and teaching abroad. Research papers are published on foreign students studying in the United States, and videotapes are available for loan to staff members of U.S. colleges and universities on international exchange issues. One videotape package is directed toward Fulbright program advisers. For further information, write: Publications Service, Institute of International Education, 809 United Nations Plaza, New York, NY 10017.

THEME ARTICLE
Open access indicators and information society: the Latin American case
Nancy Gómez, E-LIS: E-prints in Library and Information Science, Santiago, Chile
Atilio Bustos-Gonzalez, Pontifical Catholic University of Valparaiso, Valparaiso, Chile
Julio Santillan-Aldana, Documentation Center, Bartolome de Las Casas Institute, Lima, Peru, and
Olga Arias, Luis Federico Leloir Library, University of Buenos Aires, Buenos Aires, Argentina

Abstract
Purpose - The purpose of this paper is to estimate open access penetration ratios through cross-analysis of existing social context and open access indicators in Latin America.
Design/methodology/approach - The following parameters were used to characterize the chosen countries. On the one hand, the study takes social context indicators such as the digital opportunity index (DOI), GDP 2007 (Organization for Economic Co-operation and Development) (www.oecd.org/home/0,3305,en_2649_201185_1_1_1_1_1,00.html), scientific output 2005, and investment in science and technology vs GDP 2004. On the other hand, it analyses open access indicators considering the two main open access strategies - the green and gold routes - and the existing legal framework.
Findings - This paper discusses the evolution of the DOI and compares it with open access parameters (number of repositories, number of registries in repositories, DOAJ journals and number of Creative Commons licences) in the context of scientific information in developing countries in Latin America.
Research limitations/implications - This paper is not an exhaustive survey and limits the comparison to Latin American countries, focusing on Brazil, Chile and Argentina.
Originality/value - This paper gives an overview of the situation of three particular countries - Brazil, Chile and Argentina - and explains the position of these countries in the open access movement in Latin America.
Keywords Information society, Open systems, Communication technologies, South America
Paper type Viewpoint

Introduction
The internet has created unprecedented possibilities to disseminate, share and build on the outcome of research efforts. Developing countries face major challenges in terms of research infrastructure, mainly due to high levels of poverty and unequal wealth and income distribution. This has a direct influence on informational literacy rates, and other indicators of the information society, which have led to the coining of the term
"digital divide," meaning the difference in terms of access to information and communication technologies (ICTs) between developed and developing countries. At the World Summit on the Information Society (WSIS, 2006), an international conference held in two phases in 2003 and 2005 in Geneva and Tunis, respectively, world leaders from more than 174 countries (among them heads of government, ministers and vice-ministers, as well as high-level representatives from international organizations and the private sector) committed to turning the digital divide into a digital opportunity for all, as one of the main objectives of the conference. Developing countries face major economic problems, among them poverty, intense foreign debt and high illiteracy rates (Chan and Costa, 2004). Research infrastructure is also inadequate, which in turn leads to low levels of scientific output. Information is a key component in the production of new scientific knowledge, so that, in this sense, both ICTs and access to quality research content are crucial.

The open access movement arose at the beginning of the 2000s as a reaction against the traditional scientific publishing model. According to Suber's (2004-2006) definition, "open access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions." The BOAI (2005) defines it as follows: By "open access" to this literature, we mean its free availability on the public Internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited. As Suber (2004-2006) points out, the "[. . .] campaign for OA focuses on literature that authors give to the world without expectation of payment". A strong argument in favor of open access is the concept of free access to publicly funded research, best set out in the Declaration on Access to Research Data from Public Funding (Organization for Economic Co-operation and Development - OECD, 2004), signed in 2004 by over 30 countries.

There are two main strategies for attaining open access: publishing in an open access journal, known as "the gold route," or archiving in institutional archives or repositories, known as "the green route." The chief difference between them is that OA journals conduct peer review, whereas OA archives do not. John Willbanks, Executive Director of the Science Commons project, writing in the BMJ, states that "[. . .] Open access journals are entering the mainstream of scholarly publishing". The Directory of Open Access Journals (DOAJ) (www.doaj.org), a listing of "free, full text, quality controlled scientific and scholarly journals," included 2,478 journals, "with on average more than one journal a day added in 2006" (Willbanks, 2006). As of this writing, the DOAJ lists 2,995 journals, amounting to nearly 10 per cent of the whole scientific literature included in Ulrich's Periodicals Directory (www.ulrichsweb.com/ulrichsweb/).
In the same direction, the number of OA repositories is steadily growing. The Registry of Open Access Repositories (ROAR) (http://roar.eprints.org/) lists around 1,000. According to Willbanks (2006), ROAR has grown "[. . .] by nearly one every other day, and the number of records in those archives has grown by nearly 600 percent, to 1.2 million papers. Open access is here to stay, in one form or another."

Context
In order to survey the Information Society situation, one must examine the institutional organizations in the region. Two reports were analyzed, released, respectively, by the Information Society Observatory for Latin America and the Caribbean (OSILAC, Spanish acronym), under the framework of the Economic Commission for Latin America and the Caribbean (Spanish acronym: CEPAL), and by the ITU/UNCTAD World Information Society Report 2007.

The eLAC2007 report
Since 2000, governments in Latin America and the Caribbean have stated their commitment and worked towards the development of regional observatories to track the impact of ICTs on the economy, based on statistical data of the Information Society, issuing basic indicators. In this sense, OSILAC was assigned the mission of building indicators and monitoring the situation of eLAC2007, the regional Plan on the Information Society for Latin America and the Caribbean. The plan consists of 30 subject areas with 70 short-term activities, which contribute to the long-term implementation of the WSIS Global Action Plan, enshrined within the Millennium Development Goals (Observatorio para la Sociedad de la Información en América Latina y el Caribe, 2007). The plan is structured around five critical areas identified by the countries in the region: access and digital inclusion; knowledge and capacity building; public efficiency and transparency; policy instruments; and enabling environment. The activities within eLAC are aimed at reaching three main objectives:
(1) Fostering regional projects to reinforce initiatives and cooperation, to help develop common synergies.
(2) Promoting strategies to obtain measurable results in specific areas through the implementation of indicators on the development and progress of the information society.
(3) Deepening knowledge to better understand critical areas, to support the definition, implementation and evaluation of policies.

The plan shows significant progress throughout the region in the development of the information society: out of the 27 aims surveyed, 15 show progress or even strong progress, whereas 12 remain at moderate or low progress levels. Table I depicts the level of progress for each aim. The greatest progress is concentrated in the first two areas - access and digital inclusion, and knowledge and capacity building - while public efficiency and policy instruments show less improvement. Open access can be considered a transversal variable with a direct effect on a number of activities described in the table, especially in the first two areas, as explained below:
• 10 Education and research networks. Cooperación Latino Americana de Redes Avanzadas (CLARA) is a research network which connects 16 countries in Latin America, including universities and research centers, to encourage regional cooperation in scientific and educational activities (www.redclara.net/). The necessary infrastructure for building a network of institutional open access repositories could be provided by CLARA, since it includes the countries selected for this survey.
• 11 Science and technology. Most Latin American countries have a very low investment rate in research and development, on average around 0.5 per cent of their gross domestic product (GDP); the exception is Brazil, in first place in the region with 1 per cent, but still very far from the levels of developed countries. As Rossini (2007) states: "[. . .] open access represents the best method for the flow, interchange and production of scientific knowledge - that access to knowledge is crucial for innovation and innovation is crucial for development."
• 16 E-learning. The main objective of e-learning is the creation and fostering of digital educational contents. The building of electronic theses databases in Brazil (Biblioteca Digital de Teses) (http://bdtd.ibict.br/bdtd/) and Chile (Cybertesis) (www.cybertesis.net/) since the late 1990s has led to significant progress in the field of e-learning. Higher education institutions rely heavily on these types of materials, which enable researchers and students to keep up to date with cutting-edge scientific advances.

Table I. eLAC2007 aims - progress levels
A. Access and digital inclusion: 1. Regional infrastructure - Progress; 2. Community centres - Strong progress; 3. Online schools and libraries - Progress; 4. Online health centers - No progress; 5. Work - Moderate progress; 6. Local governments - Strong progress; 7. Alternative technologies - Moderate progress.
B. Knowledge and capacity building: 8. Software - Moderate progress; 9. Training - Progress; 10. Education and research networks - Strong progress; 11. Science and technology - No progress; 12. Companies - Progress; 13. Contents and creative industries - Progress; 14. Internet governance - Progress.
C. Public transparency and efficiency: 15. E-government - Progress; 16. E-learning - Strong progress; 17. E-health - No progress; 18. Catastrophes - No progress; 19. E-justice - Moderate progress; 20. Environment protection - Moderate progress; 21. Public information and cultural heritage - Progress.
D. Political instruments: 22. National strategies - Progress; 23. Financing - No progress; 24. Universal access policies - No progress; 25. Legal framework - No progress; 26. Measuring and indicators - Strong progress.
E. Enabling environment: 27. WSIS and eLAC follow-up - Strong progress.

• 17 E-health. Virtual libraries can take advantage of ICTs through greater connectivity with their partners in the region and worldwide. In addition, it is often not recognised that international medical and environmental research programmes may be inappropriate for developing countries, owing to the lack of knowledge of the research generated in them, where the major health problems exist. A search for "malaria" on the Bioline International web site illustrates the volume of relevant research available from the developing world (Chan and Costa, 2004).

The World Information Society Report 2007: beyond WSIS
The WSIS is a summit held in two phases, in Geneva in December 2003 and in Tunis in November 2005, with the main objectives of building the information society and making advances towards bridging the digital divide. The World Information Society Report 2007: beyond WSIS (International Communication Union, 2007) is the second of a series of reports intended to track current implementation and progress in achieving the WSIS targets. It has been created by the Digital Opportunity Platform, an open multi-stakeholder platform with contributions from governments, academics and civil society, as well as inter-governmental organisations.
As an evaluating tool, the digital opportunity index (DOI) has been created as a composite index comprising eleven separate indicators, grouped into three clusters: opportunity, infrastructure and utilization (Figure 1). Indicators in the various data series are standardized on a scale of zero to one, making the DOI simple and straightforward to calculate: access to and affordability of ICTs are condensed into a single index number, permitting comparison of countries' scores in any one year, as well as over time (a computational sketch follows Table II below). This index has been chosen because it best represents the variability of a number of Latin American countries between 2004 and 2006. In that period, Chile, Argentina and Brazil appear in the first places, above the Latin American average growth of 3 per cent and close to the DOI variability worldwide. These three countries will face the challenge of tracking their progress within the Information Society, as well as the impact that the open access movement has had in recent years (Figure 2).

Figure 1. DOI categories (source: ITU/UNCTAD/KADO Digital Opportunity Platform). The eleven indicators, by cluster:
Opportunity: 1. percentage of population covered by mobile cellular telephony; 2. internet access tariffs as a percentage of per capita income; 3. mobile cellular tariffs as a percentage of per capita income.
Infrastructure: 4. proportion of households with a fixed line telephone; 5. proportion of households with a computer; 6. proportion of households with internet access at home; 7. mobile cellular subscribers per 100 inhabitants; 8. mobile internet subscribers per 100 inhabitants.
Utilization: 9. proportion of individuals that used the internet; 10. ratio of fixed broadband subscribers to total internet subscribers; 11. ratio of mobile broadband subscribers to total mobile subscribers.

Countries situation
The following parameters were used to characterise the chosen countries:
• social context indicators;
• DOI;
• GDP 2007 (OECD) (www.oecd.org/home/0,3305,en_2649_201185_1_1_1_1_1,00.html);
• scientific output 2005 (SCImago Research Group) (www.scimago.es/); and
• investment in science and technology vs GDP 2004 (Table II).

Open access indicators
Considering the two main open access strategies - the green and gold routes - and the legal framework, the following parameters were analyzed:
• number of repositories in ROAR;
• number of registries in ROAR repositories;
• number of journals in DOAJ;
• number of journals in the Scientific Electronic Library Online - SciELO (2007);
• number of journals in BIOLINE;

Figure 2. DOI evolution in Latin America: DOI for 2004-2005 and 2005-2006 by country (Chile, Argentina, Brazil, Uruguay, Mexico, Venezuela, Costa Rica, Colombia, Dominican Republic, Panama, Peru, Ecuador, El Salvador, the Latin American average, Guatemala, Paraguay, Bolivia, Nicaragua, Cuba, Honduras, Haiti; values range from 0 to 0.6).

Table II. Social context indicators
                                                       Argentina   Brazil    Chile
DOI 2005-2006                                            0.51       0.48      0.57
GDP 2007                                                 5,677      5,885     9,275
Investment in science and technology vs GDP 2004 (%)    0.49       1.28      0.68
Scientific production ISI 2005 (SCImago Group)           5,875      19,386    3,385
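As a complement to the verbal description of the DOI above, the following is a minimal sketch of the standardize-average-average structure of such a composite index. All values and reference maxima below are illustrative placeholders, not the ITU's official goalpost figures:

```python
# Minimal sketch of a DOI-style composite index: each indicator is
# rescaled to [0, 1] against a reference maximum, indicators are
# averaged within their cluster, and the index is the mean of the
# three cluster scores. Numbers are made up for the illustration.
def rescale(value, max_value):
    """Standardize an indicator onto a 0-1 scale."""
    return min(value / max_value, 1.0)

clusters = {
    # (observed value, reference maximum) pairs per indicator
    "opportunity":    [(95, 100), (4, 100), (7, 100)],
    "infrastructure": [(60, 100), (35, 100), (30, 100), (80, 100), (12, 100)],
    "utilization":    [(25, 100), (0.3, 1.0), (0.1, 1.0)],
}

cluster_scores = {
    name: sum(rescale(v, m) for v, m in pairs) / len(pairs)
    for name, pairs in clusters.items()
}
doi = sum(cluster_scores.values()) / len(cluster_scores)
print(cluster_scores, round(doi, 2))
```

Because every indicator is first mapped onto [0, 1], the final index is bounded and directly comparable across countries and years, which is what permits the rankings reported in this paper.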
Its scientific output is the highest in the region, with around 200,000 documents in 2005, closely related to the investment level in science and technology. Sely Costa makes an exhaustive report (Costa, 2007; Costa and Kuramoto, 2007; Electronic Publishing Trust for Development, 2007) where she describes the multiple steps Brazil has taken to accomplish the goals of open access, since the SciELO platform was brought in as early as 1997. These steps include statements in favour of OA, ways to implement OA initiatives, publications and courses. All these initiatives have been carried out mainly by the Instituto Brasileiro de Ciencia e Tecnologia (IBICT), which has had a leading role coordinating the Open Access Movement with SciELO, the Brazilian Parliament, the Brazilian Council of University Chancellors and other societies in Brazil, and in partnership with other countries. Some of these initiatives are: . carrying out technology prospective studies; . customizing software (OJS, OCS, Eprints, Dspace and NDLTD); . training people (640 people from 189 institutions); . transferring technology (SEER – open journal systems, SOAC – open conference system, Institutional and Discipline Repositories, mostly to universities); . building portals – data and service providers (BDTD; Oasis.br); . sensitising the scholarly community and policy makers; and . expanding Brazilian initiatives to the Portuguese speaking ALemPLus project) and LA countries. Open access indicators Argentina Brazil Chile Gold route indicators Number of repositories (ROAR) 2 55 4 Number of registries in repositories (ROAR) 2,143 346,411 11,610 Green route indicators BIOLINE 4 2 DOAJ Journals 42 287 81 SciELO (July 2007) 29 185 66 Legal framework open access Adaptation of Creative Commons License to local legislation October 5 July 4 July 5 Number of Creative Commons Licences (Data obtained from the Creative Commons 1.0 Statistics Generator web site)a 84,890 216,500 158,210 Source: awww.ccestadisticas.negociosabiertos.com/index.php (accessed December 2007) Table III. Open access indicators OCLC 25,2 88 Hélio Kuramoto (IBICT) has helped to formulate a proposed law (introduced by Rodrigo Rollemberg, Member of Brazil’s House of Representatives) that would require all Brazil’s public institutions of higher education and research units to create OA institutional repositories of their technical-scientific output (www.camara.gov.br/sileg/ integras/461698.pdf). In this sense, Brazil is the first country in Latin America to have a Parliament bill related to the open access movement. The fundamentals of this bill are found on many declarations undersigned in Brazil to support open access, among them the one issued by IBICT at the 57th Annual Meeting of the Sociedade Brasileira para o Progresso da Ciencia (http://ibict.br/openaccess/arquivos/manifesto.htm) and also the declaration approved at an international conference at Salvador (Bahia), known as the Bahia Statement (www.icml9.org/public/documents/pdf/pt/Dcl-Salvador). Another equally important initiative on open access in Brazil is OASIS.Br. OASIS stands for Open Access and Scholarly Information System, and it is the Brazilian portal of open access journals and repositories. Its 109 digital repositories can be searched simultaneously through a single interface by means of the OAI-PMH (Open Archives Initiative for Metadata Harvesting) protocol. 
The portal focuses on four main objectives: to increase the visibility of the Brazilian scientific production; to ease the access to information through a single site; to act as metadata service provider, and to support the open access movement in Brazil and worldwide. Chile Among the countries surveyed, Chile has the highest DOI in Latin America, reaching 0.57. Chile was the second country to implement and adapt creative common licenses to local legislation in the region since July 2005. So far, 158,210 licenses have been already granted. Regarding its scientific output, Chile stands in the third position, showing a production registered in ISI of about 3,400 documents: . The Consejo Nacional de Rectores de Universidades Chillenas (CRUCH) and the Comisión Nacional de Ciencia y Tecnologı́a (Conicyt) have developed a National Plan of Access to Scientific and Technical Information to serve universities and other public and private research institutions. CINCEL is a private law corporation established by CRUCH and Conicyt, engaged in building a National Electronic Library, and managing acquisition and access to a range of scientific and technical journals to be equally available to all participating institutions. A clause has been added recently compelling publishers to allow downloads of all national scientific literature, in order to be made available at public repositories. These are considered to be an important part of the plan, which is still under development. Chile is represented in international open access directories, such as ROAR and Directory of Open Access Repositories (DOAR) by five open access repositories: . Two at Universidad de Chile: Cybertesis and Captura. Cybertesis hosts doctoral theses, and its name is related to the software upon which it is built; Captura hosts journal articles, theses and other document types. . One at Universidad de Talca, which hosts Memoirs and other document types. . One at Universidad del Bı́o Bı́o, with doctoral theses. . SciELO Chile is registered as an open access journals repository. Open access indicators 89 As it can be observed, doctoral theses and journal articles count among the most visible Chilean electronic documents accessible through the internet. A significant step in favour of the open access movement was the Valparaı́so Statement for the improvement of scientific communication in the electronic environment, held in Valparaı́so in January 2004. The statement stresses the need to use the internet as the most efficient tool for the immediate dissemination of knowledge, and it declares that journal publishers are responsible for their maximum visibility and availability at various repositories. Argentina Considering the DOI and the Information Society Growth, Argentina appears in tenth place showing a growth of 4 per cent. In ROAR, this country is represented only by two repositories: SciELO Argentina, and a journal published by the public university, Universidad Nacional del Centro, since both of them are harvested through the OAI protocol. There are several other repositories and/or digital libraries in the country, mainly within public universities, all of which have started collecting electronic theses. Examples are Universidad Nacional de Cuyo, Instituto Balseiro, which belongs to the same university, Universidad Nacional de La Plata and the Latin American Counsil on Social Sciences (Spanish acronym, CLACSO) (www.clacso.org.ar/). CLACSO is an international NGO associated to UNESCO, created in 1967. 
It brings together about 170 research centers distributed across 21 countries in Latin America and the Caribbean. Since 2004 CLACSO digital library activities has been supported by INASP (International Network for the Availability of Scientific Publications) (www.inasp. info/). As of now, it hosts over 9,000 full-text documents. In total, Argentina has seven institutional repositories, though not OAI compliant yet. The Directory of Open Access Journal (www.doaj.org/) registers 42 electronic journals from Argentina, and the SciELO Argentina repository, 29. There are no public policies or mandates related to open access; still, there is growing concern to promote its benefits, especially among the librarian community. Regarding the creative commons licenses, many have been adapted to legal local frameworks since October 2005, and statistics observed at Negocios Abiertos web site (www.negociosabiertos.com/) show that around 85,000 licenses were granted. Regarding the scientific output, Argentina is the second of the three selected countries, producing around 5,900 documents/year. Nonetheless, the number of registries per repository is very low, about 2,200, far from the other two countries. Argentina’s visibility level is the lowest of the region. Conclusions In terms of scientific output vs investment level in science and technology, Brazil stands out in first place, followed by Chile and Argentina, respectively. The same relation applies to visibility on the web: Brazil stands in the first place, Chile is second, and Argentina is in the last place. Additionally, the DOI shows that Chile is in the best position considering levels of inclusion in the Information Society in Latin America, even over several European countries. Brazil is in the second place, followed by Argentina. As far as the legal framework is concerned, Brazil was the first country of the three to adapt CC licenses to local legislation, followed by Chile in the second place, and OCLC 25,2 90 Argentina in the last position. Brazil is also the first country in Latin America with a bill in Parliament in favour of the open access movement, a major step even considering the international context. In terms of document types, electronic theses and OA journals are the most widely represented from the late 1990s both Brazil and Chile have engaged in electronic theses programs, taking advantage of the support offered by UNESCO. A number of repositories in Chile started as electronic theses repositories. Finally, from many perspectives, Brazil is the country that epitomizes the concept underlying the open access movement: publicly funded research should be available free of charge to anyone. In this way, the SciELO platform has been a key opportunity to launch the open access initiative in Latin America, joined by Chile in 1998, and by Argentina several years later. Digital repositories in these three countries are slowly catching up with the concept of access to research supported and financed by government and other public agencies. They are essential to help bridge the south-north digital divide, that is often ignored by developed countries (Kirsop et al., 2007). References BOAI (2005), Budapest Open Access Initiative, available at: www.soros.org/openaccess/ (accessed November). Chan, L. and Costa, S. 
(2004), “Participation in the global knowledge commons: challenges and opportunities for research dissemination in developing countries”, E-prints in Library and Information Science – E-Lis, available at: http://eprints.rclis.org/archive/00002611/ (accessed November). Costa, S. (2007), “New publishing models for scholarly communication and the Brazilian open access policy”, paper presented at PKP Scholarly Publishing Conference Blog, available at: http://scholarlypublishing.blogspot.com/search/label/Brazil (accessed December). Costa, S. and Kuramoto, H. (2007), “New publishing models for scholarly communication and the Brazilian open access policy”, paper presented at 1st PKP Conference, Vancouver, July 2007, available at: http://pkp.sfu.ca/ocs/pkp2007/index.php/pkp/1/paper/view/63/82 (accessed December). Electronic Publishing Trust for Development (2007), “Brazilian OAI”, November 12, 2007, available at: http://epublishingtrust.blogspot.com/2007/11/oa-in-brazil.html (accessed December). International Communication Union (2007), World Information Society Report 2007: Beyond WSIS, 2nd ed., available at: www.itu.int/osg/spu/publications/worldinformationsociety/ 2007/ Kirsop, B., Arunachalam, S. and Chan, L. (2007), “Access to scientific knowledge for sustainable development: options for developing countries”, Ariadne, Vol. 52, available at: www. ariadne.ac.uk/issue52/kirsop-et-al/ (accessed October). Observatorio para la Sociedad de la Información en América Latina y el Caribe (2007), “Monitoreo del eLAC 2007: avances y estado actual del desarrollo de las Sociedad de la Información en América Latina y el Caribe”, available at: www.eclac.org/publicaciones/ xml/5/29945/ResumenEjecutivo.pdf (accessed December). Organization for Economic Co-operation and Development (2004), “Declaration on access to research data from public funding”, available at: http://webdomino1.oecd.org/horizontal/ Open access indicators 91 oecdacts.nsf/Display/2F3530C4FE5F02D7C125729C00508A9A?OpenDocument (accessed December). Rossini, C. (2007), “The open access movement: opportunities and challenges for developing countries. Let them live in interesting times”, Diplo Foundation Internet Governance Program 2007, available at: http://campus.diplomacy.edu/env/scripts/Pool/GetBin.asp? IDPool !3737 (accessed December). SciELO (2007), available at: www.scielo.org/php/index.php?lang !es (accessed November). Suber, P. (2004-2006), “Open access overview”, available at: www.earlham.edu/, peters/fos/ove rview.htm (accessed November). Willbanks, J. (2006), “Another reason for opening access to research”, BMJ 33, available at: www. bmj.com/cgi/reprint/333/7582/1306 (accessed December). WSIS (2006), World Summit on the Information Society, available at: www.itu.int/wsis/basic/ about.html (accessed November). Further reading Babini, D. and Fraga, J. (2004), “Bibliotecas virtuales para las ciencias sociales”, CLACSO, Buenos Aires, available at: http://eprints.rclis.org/archive/00005185/01/intro.pdf (accessed December). Bustos-González, A. and Moya-Anegón, F. (2008), “La investigación cientı́fica chilena (1990 – 2005)”, Tesis de grado para oponer el grado de doctor en Ciencias de la Información. Dirigida por Félix de Moya y Anegón, Facultad de Bibliotecologı́a y Documentación, Universidad de Granada, Granada. International Communication Union (2006), The Digital Opportunity Index: A User’s Guide, available at: www.itu.int./ITU-D/ict/doi/material/doi-guide.pdf (accessed December). Morrison, H. 
Corresponding author

Nancy Gómez can be contacted at: nancydiana@gmail.com

Promoting open access in Germany as illustrated by a recent project at the Library of the University of Konstanz

Anja Kersting and Karlheinz Pappenberger
Library of the University of Konstanz, Konstanz, Germany

Abstract

Purpose - By illustrating a best practice example of implementing open access in a scientific institution, the paper will be useful in fostering future open access projects.

Design/methodology/approach - The paper starts with a brief overview of the existing situation of open access in Germany. The following report describes the results of a best practice example, complemented by the analysis of a survey of the scientists at the University of Konstanz on their positions on open access.

Findings - The dissemination of the advantages of open access publishing is fundamental to the success of implementing open access in a scientific institution.
For the University of Konstanz, it is shown that elementary factors of success are an intensive cooperation with the head of the university and a vigorous approach to informing scholars about open access. Some further conditions are also essential to present a persuasive service: the Library of the University of Konstanz offers an institutional repository as an open access publication platform and hosts Open Journal Systems for open access journals. High-level support and consultation for open access publishing is provided at all administrative levels. The integration of the local activities into national and international initiatives and projects is pursued, for example, by the joint operation of the information platform open-access.net.

Originality/value - The paper offers insights into one of the most innovative open access projects in Germany. The University of Konstanz belongs to the pioneers of the open access movement in Germany and is currently running a successful open access project.

Keywords: Open systems, Publishing, University libraries, Germany

Paper type: Research paper

Open access in Germany

Since the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (http://oa.mpg.de/openaccess-berlin/berlindeclaration.html) in 2003, open access has gained increasing significance within the academic world. Various academic institutions, research funding agencies, as well as a growing number of universities, have already signed the Berlin Declaration and are actively promoting open access publishing. Thus, in the last four years, open access has become a topic of high actuality and relevance in Germany, especially regarding the publication policies of institutions and funding agencies: scientific organizations and institutions such as various universities, scholarly societies, the German Rectors' Conference (HRK), and the alliance of the research organizations are supporting open access and have put forward policies regarding open access publishing, its relevance, and its specific impact on scientific communication. Accordingly, the major German funding agency, the German Research Foundation (DFG), encourages scientists to publish the results originating from DFG grants open access. The publications are either to be deposited in discipline-related or institutional electronic archives (repositories) following conventional publication, or to be published in a recognized peer-reviewed open access journal. A growing number of universities support the open access idea and implement institutional repositories to provide the opportunity of free archiving of, and free access to, scientific documents. Meanwhile, the advantages of worldwide free and unrestricted access to scientific information are beyond controversy. For researchers, open access maximizes their research impact, increases their visibility, and raises their reputation. Users can access relevant information on the web at any time and place. In addition, worldwide free access to scientific information is essential in enabling an enhanced information supply in developing countries.
The need for information on open access

Even though many scholars already have a general idea of what open access means, there are still large information deficits, as identified in a 2005 DFG survey (Deutsche Forschungsgemeinschaft, 2005). In this survey, more than 1,000 scientists were asked about open access publications. The survey results show that the major problem in implementing open access is the lack of knowledge in the scientific community.

Similar experiences have been made at the University of Konstanz. Some scholars had already realized the opportunities of open access publishing many years ago, and since the cooperation between the Library of the University of Konstanz and its scientists is traditionally very close, an intensive open access discussion began and the library's engagement in the open access field gained momentum. While convinced scholars mainly saw the publishing advantages of open access, for the library the consequences of rising journal subscription costs and increasingly limited budgets led to a consequent pursuit of the open access idea. In 2004, the library began to put out information about open access issues on its homepage. Another concrete activity was the institutional BioMed Central membership to support open access publications by Konstanz scientists. During that time, the awareness of the importance of open access for the entire information supply steadily grew. The library also realized the complexity and subject-specific diversity of the open access issue. The knowledge of scientists concerning open access varied significantly; informing them all adequately turned out to be a challenge. While in 2004 many scientists were not aware of the chances, possibilities, and details of the publishing process in open access, discussions at the university showed a basically open and positive attitude towards it once the researchers became better informed and saw the relevance for their specific field of interest. The objective of the library was to inform scientists about the open access topic in as precise and concentrated a form as possible. With the awareness of the complexity and subject-specific diversity, the idea of a nationwide open access information platform developed.

Establishing the information platform on open access

A lot of widespread information was accessible on the web, but clearly structured information with a general overview for scientists as well as for librarians and similar infrastructure providers was lacking. Why should similar information be held and administered at different places in parallel? The directors of the university libraries in Konstanz and Bielefeld initiated the founding of a nationwide information platform on open access. Together with the University of Goettingen and the Free University of Berlin, they asked the DFG for support for this project. Fortunately, the support was granted, and since 2006 the platform has been funded by the DFG to support scientists and research institutions in the implementation of open access practices. Open-access.net is an online information platform on open access issues jointly operated by the Universities of Bielefeld, Goettingen, and Konstanz and the Free University of Berlin, which belong to the pioneers of the open access movement in Germany. The platform provides information about the growing scientific and political significance of open access issues. Open-access.net is the first online platform providing information on open access in the German language.
It offers information on publishing strategies, costs, and legal aspects, and lists open access journals and other publishing possibilities for different research areas. This subject-specific information is of great practical relevance, in that the specific information needs in individual research areas can be fulfilled. Additionally, it presents convincing arguments for the use of open access not only for researchers, but also for research institutions, universities, professional associations, libraries, and publishing houses. The official start of the platform was announced at the "German e-Science Conference" in Baden-Baden in May 2007. As of today, the platform is already well accepted and well known among people dealing with open access.

To guarantee the abundance, quality, and sustainability of the content, as well as to reach a high media resonance, a cooperative network was established: the platform is supported by the German Rectors' Conference, the Volkswagen Foundation, and the German Initiative for Networked Information (DINI e.V.). At the same time, several research institutions, such as the Helmholtz Association, the Fraunhofer-Gesellschaft, and the Max Planck Society, have added information about their own open access policies to the content of the platform and are promoting the platform as the main source of information on open access. An academic advisory council regularly evaluates the platform to ensure that it meets the needs of the researcher community. Two scholarly societies, the German Psychological Association and the German Linguistic Association, underline their support with their labels on open-access.net; collaborations with other scientific societies are in preparation. In a next step, the platform is going to be translated into English and internationalized even further. The aim is to establish an internationally recognized, powerful, and continuous information platform on open access.

The implementation of open access at the University of Konstanz

For the open access engagement of the Library of the University of Konstanz, the information platform was an important starting point. Whenever talking to researchers, profound knowledge is a key element in promoting open access. So the platform is both a service for everybody interested in open access in Germany and an important foundation of the open access policy at the University of Konstanz. Besides the launch and operation of the platform, the University of Konstanz is very active in promoting and implementing open access inside the university. At the beginning of the open access discussion in Konstanz, the golden road was not really the "golden road" for Konstanz: there were only a few pioneers who published in the small number of existing open access journals, so the "golden" road for the library in Konstanz first became …

A Study on User Satisfaction with CJK Romanization in the OCLC WorldCat System
Journal of the Korean Society for Information Management, 27(2), 2010

Multilingual information system needs
Users self report on efficiency of WorldCat

Respondents pointed out their concerns with Romanization issues:
∙ Difficulty getting from the Romanized language to the target or native language.
∙ Meaning is lost under the current system, since it is not transparent how to move across languages.
∙ Romanized titles were reported as particularly difficult to understand; respondents noted that the system works best when the user knows both conventions in use.
∙ Typing the correct, exact query becomes tantamount to mastering the Romanization problem.
∙ Use of Chinese characters in bibliographic descriptions for Korean and Japanese materials.
∙ Korean material using Korean Hangul would be more accessible if titles also carried Chinese characters linked to Romanization.
∙ Linking of the original native language, English translation, and Romanization would facilitate understanding of bibliographic records.
∙ Addition of an English-language abstract would allow users to assess whether bibliographic records meet the original information need for topic searches.

The survey provided a framework to define the secondary access problem: how do individuals get information about information (the bibliographic problem) as they move from one language to another and from one alphabet to another? The survey confirmed the importance of topic, task, and display, and it offered specific information on how each of these might be assessed when individuals conduct searches for information. Thus, the survey funneled and focused these issues, allowing for the design of an experiment to explore how individuals might seek such information in a realistic but controlled environment.

3.2 Experiment

A separate experiment was conducted to explore the use of transliterated information when searching for bibliographic information using the WorldCat system.

3.2.1 Sample

This study used a non-probability convenience sample of nine individuals whose native languages were Chinese, Japanese, or Korean, and whose second language is English. There were three individuals who were native speakers from each language group. The subjects were selected to include librarians from Rutgers University Libraries and students from three academic departments: Library and Information Science (LIS), Communication, and Journalism and Media Studies. The subjects were purposively selected to accommodate the experimental design; for example, one librarian from each language group and two students from LIS and non-LIS areas were selected.

3.2.2 Experimental Design

The main focus of this experiment is to examine how sensitive the system is to a person's particular needs, especially when seeking information across different languages. Subjects were observed conducting three searches using the WorldCat system, and this was followed by a personal interview. The three different search tasks assigned to each user served as the unit of analysis for this study, with three individuals assigned to each of the three languages searching three tasks with different topics. The three topics were chosen from the areas of health, information science, and business because it was assumed that these areas were relatively important for the subjects conducting the searches, given their professional or academic positions. After choosing the subject areas, the actual topics were set up. Although the search results and satisfaction levels varied with subjects' interest in these subject areas, their topic knowledge, and their search experience with these topics, all subjects were required to search all three topics; their satisfaction levels were self-recorded and then reflected in each individual's overall satisfaction results. This design resulted in 27 cases (3 subjects x 3 Tasks x 3 Topics).
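As a reading aid, the crossing that yields these 27 cases can be enumerated directly. The sketch below is a reconstruction of the design as described, using the task and topic labels defined in the next paragraphs; the particular topic rotation shown is an assumption for illustration, not the authors' published protocol.

```python
# Illustrative enumeration of the 27 searches: 3 language groups x
# 3 subjects each (9 subjects), each performing 3 tasks. The topic
# rotation below is one plausible assignment, not the authors' code.
from itertools import product

languages = ["Chinese", "Japanese", "Korean"]
tasks = ["T1", "T2", "T3"]          # native / English / unknown language
topics = ["health", "information science", "business"]

cases = []
for lang, subj in product(languages, range(3)):
    for i, task in enumerate(tasks):
        topic = topics[(subj + i) % 3]   # rotate topics so each subject sees all three
        cases.append((f"{lang}-{subj + 1}", task, topic))

print(len(cases))   # 27 searches, the unit of analysis
```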
Embedded within the design is the use of three different languages, CJK, in addition to English. Incorporated within the search protocol is the use of different languages available through transliterated records in WorldCat.

The Tasks (T) are defined as follows:
T1: Do a search looking for information written in your native language.
T2: Do a search looking for information written in English.
T3: Do a search looking for information written in a language you do not know.

The Topics were assigned as follows:
Topic 1: Food nutrition business in the United States.
Topic 2: Socio-cognitive concept in Information Science.
Topic 3: Globalization in industry.

3.2.3 Hypotheses for the Experiment

A fundamental premise underlying transliteration from CJK to Romanized script is that seekers would be able to interpret the Romanized version, which requires knowledge of two languages. Also tested were individuals' searches in a language they did not know, to provide preliminary data on how transliteration serves those not knowing one of the languages.

H1: Users will have better results and greater satisfaction when looking for information written in English than when searching for transliterated information written in their native language. (T2 > T1)
H2: Users will have better results and greater satisfaction when looking for information written in their native language than when searching for information written in a language they do not know. (T1 > T3)

3.2.4 Data Analyses and Findings for the Experiment

A profile of the subjects was obtained to capture demographic information in a pre-test questionnaire, and this revealed that 56% of the subjects had experience with WorldCat and had an average online searching experience spanning three to five years. Note that one librarian was assigned to each language group, and this increased the dispersion in the experience variable when compared to the experience of non-librarians.

Variables used in this experiment can be cast as follows: task, subject, and topic as independent measures and user satisfaction as the dependent measure. Overall Satisfaction is a measure encompassing assessments of Results, Relevance with expectations, Understanding level, Efficiency of the system, and Friendliness of the system. The users' Overall satisfaction value was obtained through a factor analysis of search scores obtained when evaluating task and system performance. Table 2 reports that the principal-components rotated component matrix revealed that two vectors could be used for each search to represent overall user satisfaction: one vector representing Task based satisfaction, which included Results, Relevance, and Understanding; and the other reflecting System based satisfaction, which encompassed users' search assessments of the Efficiency and Friendliness of the system.
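For readers who want to see the mechanics of this two-factor reduction, the sketch below extracts and varimax-rotates two factors from the five satisfaction ratings. The data file and column names are hypothetical, and scikit-learn's FactorAnalysis (a maximum-likelihood factor model) stands in for the principal components procedure reported in the article, so this illustrates the technique rather than reproducing the published loadings.

```python
# Two-factor extraction with varimax rotation over the five
# satisfaction ratings (hypothetical column names). Requires
# scikit-learn >= 0.24 for rotation="varimax".
import pandas as pd
from sklearn.decomposition import FactorAnalysis

ratings = pd.read_csv("searches.csv")[
    ["results", "relevance", "understanding", "efficiency", "friendliness"]
]

fa = FactorAnalysis(n_components=2, rotation="varimax")
scores = fa.fit_transform(ratings)   # one (task, system) score pair per search

loadings = pd.DataFrame(
    fa.components_.T,
    index=ratings.columns,
    columns=["factor_1", "factor_2"],
)
print(loadings)                      # analogue of the rotated component matrix

# Overall satisfaction as the sum of the two factor scores, as in the text.
overall = scores.sum(axis=1)
```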
Overall satisfaction was then computed as the summation of the two individual factor scores for the 27 searches, which represented the unit of analysis. Separate analyses of each factor were conducted as well.

By using a Generalized Linear Model (GLM), the three tasks, three topics, and nine subjects were partitioned to identify users' Overall satisfaction with the results they achieved. The GLM test revealed a task effect indicating that T2 > T1 and T1 > T3 (T2: Beta = 1.770, T1: Beta = 1.142, and T3: Beta = 0, all at p < .001). That is, the two hypotheses achieved weighted scores that are not likely to occur by chance. Tests of between-subjects effects uncovered statistically significant results for Subject and Task (p < .05), with non-significant results for Topic. The entire model is presented in Table 2; the effect size for this model explains 85% of the variance in the Overall satisfaction score.

A one-way analysis of variance model with multiple group comparisons was performed to explore users' satisfaction ratings by the three tasks, to determine whether statistically significant differences existed across and between groups. Results revealed that there were statistically significant differences among all groups, F(2, 24) = 14.063, p < .001. Post hoc comparisons using a Scheffé test showed that there were statistically significant mean differences (p ≤ .05) between all pairs of tasks: task 1 with task 2, task 2 with task 3, and task 1 with task 3. These results affirm the importance of task when individuals perform multilingual information searches.

A separate GLM analysis of System based satisfaction and Task based satisfaction was conducted to partition the impact of subject, task, and topic on the original factor-derived satisfaction variables.

Rotated component matrix for the overall satisfaction variable:

Separate Variables                    Component 1   Component 2
Satisfaction with the results             .930          .089
Relevance with users' expectation         .880          .188
Catalog understanding level               .759         -.085
Efficiency of the system                  .099          .916
Friendly system                           .011          .927

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 3 iterations.

Tests of between-subjects effects. Dependent variable: Overall satisfaction.

Source             Type III SS   df   Mean Square        F     Sig.
Corrected Model*     44.134(a)   12         3.678     6.546    .001
Intercept              .000       1          .000      .000   1.000
Subject*             26.351       8         3.294     5.863    .002
Task*                14.497       2         7.248    12.901    .001
Topic                 3.286       2         1.643     2.925    .087
Error                 7.866      14          .562
Total                52.000      27
Corrected Total      52.000      26

* Statistically significant at p < .05. R Squared = .849 (Adjusted R Squared = .719).
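The between-subjects analysis above can be reproduced in outline with an ordinary least squares fit and Type III sums of squares. The sketch below uses statsmodels with hypothetical column names and sum-to-zero contrasts; it illustrates the technique rather than the authors' actual analysis code.

```python
# GLM / Type III between-subjects analysis sketch (illustrative).
# df has hypothetical columns: subject, task, topic, overall.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("searches.csv")   # 27 rows, one per search

model = smf.ols(
    "overall ~ C(subject, Sum) + C(task, Sum) + C(topic, Sum)",
    data=df,
).fit()

print(sm.stats.anova_lm(model, typ=3))        # Type III sums of squares
print("R squared:", round(model.rsquared, 3),
      "adjusted:", round(model.rsquared_adj, 3))
```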
Table 3 reports the results for System based satisfaction, showing that the statistical significance for this model rested on the differences among the individuals participating in the experiment: Chinese, Japanese, or Korean. The model accounted for 92% of the system satisfaction variance explained. These results might be used to inform the design of future research, which could consider developing separate models for each CJK bibliographic environment. It would be important in future research to separate the perceived effectiveness of the system from its friendliness.

A one-way ANOVA examined subjects' background as an explanatory variable for System based satisfaction. The results are based on small sample subgroups, but they do show that native language has a statistically significant effect, F(2, 9.25) = .001 (p < .05). In other words, individuals' first language corresponds to their level of satisfaction with how user friendly the system is perceived to be. The Chinese group reported higher degrees of satisfaction than the Japanese group, and the Japanese group reported greater satisfaction than the Korean group, in both interface satisfaction and system cross-language ability satisfaction. These results correspond to linguistic issues covered later in this report (see Section 4, Transliteration Issues).

Tests of between-subjects effects. Dependent variable: System based satisfaction.

Source             Type III SS   df   Mean Square        F     Sig.
Corrected Model      23.860(a)   12         1.988    13.011   .0001
Intercept              .000       1          .000      .000   1.000
Subject*             23.379       8         2.922    19.122   .0001
Task                   .284       2          .142      .931    .417
Topic                  .197       2          .099      .646    .539
Error                 2.140      14          .153
Total                26.000      27
Corrected Total      26.000      26

* Significant at p < .05. R Squared = .918 (Adjusted R Squared = .847).
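The one-way ANOVA and Scheffé comparisons used at several points in this section can be sketched as follows; the scikit-posthocs package supplies a Scheffé test, and the data file and column names are again hypothetical stand-ins rather than the authors' code.

```python
# One-way ANOVA across the three tasks plus Scheffé post hoc
# comparisons (illustrative; hypothetical column names).
import pandas as pd
from scipy.stats import f_oneway
import scikit_posthocs as sp

df = pd.read_csv("searches.csv")
groups = [g["overall"].values for _, g in df.groupby("task")]

F, p = f_oneway(*groups)
print(f"F(2, {len(df) - 3}) = {F:.3f}, p = {p:.4f}")

# Pairwise Scheffé comparisons between T1, T2, and T3.
print(sp.posthoc_scheffe(df, val_col="overall", group_col="task"))
```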
Table 5 provides the GLM results for Task based satisfaction, and it indicates that Task and Topic are statistically significant influences, explaining 79% of the effect size for this dependent measure. This result is not surprising, but it does affirm the importance of task and topic when individuals retrieve information from a multi-language bibliographic system. These results, based on a small non-random sample, would need further testing in a larger study so that the individual effects of topic and task can be removed systematically to create separate explanatory models.

Tests of between-subjects effects. Dependent variable: Task based satisfaction.

Source             Type III SS   df   Mean Square        F     Sig.
Corrected Model      20.495(a)   12         1.708     4.344    .005
Intercept              .000       1          .000      .000   1.000
Subject               1.378       8          .172      .438    .879
Task*                15.890       2         7.945    20.205   .0001
Topic*                3.227       2         1.614     4.104    .040
Error                 5.505      14          .393
Total                26.000      27
Corrected Total      26.000      26

* Significant at p < .05. R Squared = .788 (Adjusted R Squared = .607).

3.2.5 Observation and Interview Data

Patterns of searching were noted for respondents to assess differences by language, by country, and by status of the individual. At the beginning of the search, most individuals spent three to four minutes exploring the design of the search page. Although the page appears simple, it has features that give it more power when searching. In particular, even when searching Task 1 for information written in the subject's native language, the participants questioned how to assign the language they were looking for (the target language). Since there are a number of different options on the first screen and also on the "advanced search" page, this required some time for users to gain familiarity with the system.

The subjects for this experiment were all Asians who said they were most comfortable searching in English, their second language, which all of them knew in addition to their primary, native language. The potential pool of relevant hits in the database could be perceived as more productive when searching in English, which represented the dominant language of the database. In Task 3, when looking for information written in a language that they did not know, most subjects sought an English word when they browsed the bibliographic description and reported that they viewed English as a common link which should span all records in the database. Most of the bibliographic records retrieved, however, did not provide English words, and this precluded subjects from continuing their search.
There is one exception to this pattern: when subjects tried to look in languages having an alphabet similar to English, such as French, the subject could sometimes guess the meanings of particular words, and this encouraged them to continue their search.

After completing the three tasks, a short follow-up interview was held to assess how users viewed the search process they had completed. Chinese, Japanese, and Korean individuals expressed serious reservations about using Romanized transliteration systems when creating or interpreting a search. All but one individual reported great difficulty searching bibliographic records across languages. Most subjects commented that WorldCat may be well designed for searching for known items in a known language, but that it is less effective when searching for information by topic and even less effective when searching for or retrieving information in unknown languages.

4. Discussion

Most Chinese and Korean native subjects claimed that it is very difficult to understand the descriptive Romanized text without prior knowledge of the record or special expertise in the original language. The problems were less pronounced for the Japanese subjects, who were better able to read the Romanization for Japanese materials. For Korean native subjects, especially those with more extensive search experience using Korean words, some confusion might have arisen during the survey and experiment due to changes in the Korean Romanization system and to differences between the Romanization systems used in Korea and in foreign countries. This is one example of different needs arising from different languages, and it is assumed there will be more issues related to such cultural and language differences that should be addressed when structuring a Cross Language Information Retrieval system for target users.

It is noteworthy that the respondents in this study began by preferring to input their query in their own language and ended by preferring input in English. When users expressed confusion, it became evident that certain functions would have aided them, such as query expansion with suggestions of other words, synonyms, thesauri, or the distinguishing of homophonic words. Most users want to have an abstract or summary of a document or book in their language, as well as in English. Thus, the respondents here preferred a system whose bibliographic description included three features: original language, Romanization, and English.

4.1 Study Limitation

This study has several limitations. First, it focuses on limited language choices involving Chinese, Japanese, and Korean (CJK). Next, this study used two convenience samples of individuals whose native languages are Chinese, Japanese, or Korean. Sample selection was achieved by identifying individuals using a network of colleagues. The sample was not randomly selected and cannot be said to be representative of a larger population. This, then, decreases the generalization available from such a study and limits validity beyond the sample.

4.2 Transliteration Issues

The CJK languages differ from languages written in a Latin alphabet in that CJK include unique writing and phonological systems.
For example, in their writing systems there are 400 syllables in Chinese, written with Chinese logograms; 110 different moras or syllables, written in kana or kanji, in Japanese; and 2,000 alphabetic syllabary blocks in Korean. One common characteristic shared by these three languages is the use of Chinese characters, although the frequency of their use differs in each language. Each Chinese character represents a meaning, and readers from Japan or Korea can approximate the meaning of a Chinese character even when its specific meaning changes depending on the context. Japanese and Koreans use about 2,000 Chinese characters (Taylor and Taylor 1995, 17).

The biggest challenge of Romanization is making accurate isomorphic representations using a Roman script. Most Romanization systems have attempted to decode the original script through one of two methods, either transliteration or transcription: the former tries to map each character one by one based on the original written script of the language, whereas the latter tries to transcribe the sound of the language. Each Romanization system has its own defining principles, and each causes some confusion and difficulty of use which, from the results presented here, is exacerbated during topic searches. Japanese users experienced fewer problems in this study than others; yet, as Kudo (2010) reports, Japanese Romanization still confuses users with word-division issues and a lack of application of standardized procedures for transliteration. (A brief illustration of this ambiguity appears after the list below.)

The data from the survey and from the experiment with interviews led to an examination of the underlying linguistic structure for Romanization. That effort then led to areas of concern which might be tested in research settings in order to provide better access to the CJK materials in current online database systems. From user interviews the following emerged as core topics for further investigation: standardization, simplification, Rosetta Stone, and provisions for a vernacular search, which might include:
∙ Exploration of a single standardization system, complete with transparent rules which can be applied by those seeking information, both native and non-native speakers.
∙ Studies of traditional vs. simplified Romanization for the Chinese and Korean languages to assess user satisfaction and ability to retrieve pertinent information.
∙ Over half the users requested that a standard language, English, be used in parallel with the Romanized script and that English language abstracts be provided. This Rosetta Stone preference implies that translation might be studied as an alternative to transliteration.
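To make the homophone problem concrete: in Mandarin, many distinct characters collapse onto the same Romanized syllable, so a toneless pinyin query cannot distinguish them. The sketch below uses the pypinyin library for Chinese only; it is an illustration of the ambiguity, not part of the study's procedure.

```python
# Romanization collapses distinct Chinese characters onto identical
# pinyin strings, one source of the topic-search ambiguity discussed
# above. Illustrative only; requires the pypinyin package.
from pypinyin import lazy_pinyin

for word in ["他", "她", "它"]:           # "he", "she", "it": three characters
    print(word, "->", lazy_pinyin(word))  # all Romanize to ['ta']

# A toneless Romanized title query likewise conflates many characters:
print(lazy_pinyin("上海"))   # ['shang', 'hai']
```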
It would also be appropriate to explore a third area: comparing different types of Multilanguage systems, such as those used by Amazon.com and/or online catalog systems, by different language backgrounds. Of spe- cial note will be the socio-cognitive and cultural perspectives of the individuals from each country. Another future area for exploration would be the process of the potential sharing of bibliographic in- formation across borders. Within this would be some exploration of the cooperative work now being done, which much of it under the leadership of Online Computer Library Center (OCLC), which currently directs the WorldCat effort. Future research might also address culture in terms of its influence on user satisfaction and retrieval effectiveness. Currently, WorldCat represents one of the largest multilanguage databases in existence and its im- pressive size and content expand our information boundaries. OCLC continues to advance the features and friendliness of WorldCat. Transliteration is a bridge to knowledge but it currently needs more transparency if it is to satisfy the needs of those seeking information References Arsenault, Clément. 2002. “Pinyin romanization for OPAC retrieval – Is everyone being served?” Information Technology and Libraries, 21(2): 45-50. Bossmeyer, Cristine, Willian R. H. Koops, and Stephen W. Massil. Ed. 1987. Automated Systems for Access to Multilingual and Multiscript Library Materials: Problems and Solutions. Paper presented at the Pre-Conference held at Nihon Daigaku Kaikan, IFLA, August 21-22, 1986, in Tokyo, Japan. Gao, Mobo C. F. 2000. Mandarin Chinese: An Introduction. Victoria: Oxford University Press Ha, YooJin. 2008. Accessing and Using Multilanguage Information by Users Searching in Different Information Retrieval Systems. Ph. D. diss., Rutgers University. Jeong, Wooseob. 1998. “A pilot study of OCLC CJK plus as OPAC.” Library & Information Science Research, 20(3): 271-292. 110 Journal of the Korean Society for Information Management, 27(2), 2010 Kim, Kyongsok. 1999. “Standardizing romanization of korean hangeul and hanmal.” Computer Standards Interfaces, 21(5): 441-459. Kudo, Yoko. 2010. “A study of romanization practice for japanese language titles in oclc world- cat records.” Cataloging & Classification Quarterly, 48(4): 279-302. Lindén, Krister. 2006. “Multilingual modeling of cross-lingual spelling variants.” Information Retrieval, 9(3): 295-310. Oard, Douglas W. and Anne R. Diekema. 1998. “Cross language information retrieval.” In Annual Review of Information Science and Technology (ARIST), 33: 223-256. Oh, Jong-Hoon, Key-Sun Choi, and Hitoshi Isahara. 2006. “A comparison of different machine transliteration models.” Journal of Artificial Intelligence Research, 27: 119-151. Park, Jung-ran. 2001. “Information retrieval of Korean materials using the CJK bibliographic system: Issues and problems.” In Proceedings of the Second KSAABiennial Conference: Korean Studies at the Dawn of the Millennium, 245-255. Shaker, A. K. 2002. Bibliographic Access to Non- Roman Scripts in Library OPACs: A Study of Selected ARL Academic Libraries in the United States. Ph. D. diss., University of Pittsburgh. Shin, Hee-sook. 2003. “Quality of Korean cataloging records in shared databases.” Cataloging & Classification Quarterly, 36(1): 55-90. Sohn, Ho-min. 1999. The Korean Language. Cambridge: Cambridge University Press Taylor, Insup, and M. Martin Taylor. 1995. Writing and Literacy in Chinese, Korean and Japanese. 
Wang, Andrew H. 2007. OCLC Update. OCLC Online Computer Library Center, Inc. [cited May 10, 2008].

Zeng, Lei. 1991. "Automation of bibliographic control for Chinese materials in the United States." International Library Review, 23: 299-319.

Zeng, Lei. 1992. An Evaluation of the Quality of Chinese-Language Records in the OCLC OLUC Database and the Study of a Rule-Based Data Validation System for Online Chinese Cataloging. Ph.D. diss., University of Pittsburgh.

Zhang, Foster and Marcia Lei Zeng. 1998. "Multiscript information processing on crossroads: demands for shifting from diverse character code sets to the Unicode Standard in library applications." IFLA Journal, 25(3): 162-167.

Zhu, Xiaojin. 2001. "Chinese languages: Mandarin." In Garry, J. and C. Rubino (eds.), Facts about the World's Languages: An Encyclopedia of the World's Major Languages, Past and Present, 146-150. New York: H.W. Wilson Company.

Appendix A: Survey questionnaire

Ⅰ. Background questions

1. What is your native language?
2. Please indicate all other languages you know.
3. Could you please indicate your current position?
   student (please indicate your major, degree, and place): _____________________
   librarian (please specify your library's area and your subject area): ___________
   other: ______________________________________________________________
4. When was the last day you used a library system to search for information in a language other than your own? Please respond to ONE of the below:
   a. ____ days ago   b. ____ weeks ago   c. ____ months ago   d. ____ years ago
5. Please indicate below your use of online library systems which can provide information in languages other than your own language. Include the extent to which you have used such systems. ___________________________________________
6. Have you ever tried to use OCLC's WorldCat online library catalog? Yes ____ No ____
   (If yes, could you please comment on your use of this system? If you have not used WorldCat, please skip question 7.)
   Your comment about the WorldCat system: ___________________________________________
7. When you conduct a search, which of the below factors is your greatest concern?
   a. misspelling   b. ambiguity of a term   c. hard to understand a term   d. no problem   e. other: ___________________________________________
8. Imagine that you could design a new information system which had the ability to support cross-language information searching and retrieval. Which of the below would be the most helpful in searching your query?
   a. the system would provide a translation dictionary for the query
   b. a translation of the abstract would be available in the target language
   c. highlighting of the indexing words
   d. support for synonyms with a top-down menu
9. Could you please indicate why you might need information written in other languages, which might include a language you cannot read?
10. Overall, for how many years have you been doing online searching? _____________ years.

Ⅱ. WorldCat usage

Please conduct a search on any topic of interest to you using OCLC's WorldCat system. For purposes of this study, you are being asked to make sure that your search results are written in a language different from that of the country where you now live.
For example, if you are in the US, please try to find certain information written in languages other than English. Please record your search experience by responding to the following questions. (If you belong to Rutgers University, you can visit the library website, go to http://www.libraries.rutgers.edu/rul/rr_gateway/catalogs.shtml, and then find WorldCat.)

1. Query you searched for (please type in the same language that you used in the search): _______
2. Translate your topic statement to English if it was not in English (if possible).
3. What language were you looking for, and from what language? (i.e., Korean–English)
   a. From what language: _____________   b. To what language: ___________
4. How long did it take you to get a satisfactory response to your original question? __________ minutes. (Please fill in the number of minutes.)
5. How satisfied are you with the description of each retrieved document? (circle the appropriate response)
   a. not satisfied   b. somewhat satisfied   c. I don't know   d. satisfied   e. very satisfied
6. Was the retrieved document relevant to your information needs?
   a. not relevant   b. somewhat relevant   c. I don't know   d. relevant   e. very relevant
7. Do you think this system is efficient, especially when searching for documents in different languages?
   a. not efficient   b. somewhat efficient   c. I don't know   d. efficient   e. very efficient
8. Is there any word that you could not understand even if it was in your native language? Yes ____ No _____ (If yes, please give an example.)
   (example: _______________________________________________________)
9. When you conduct a search, which of the below factors is your greatest concern?
   a. misspelling   b. ambiguity of a term   c. hard to understand a term   d. no problem   e. other: ___________________________________________
10. All things considered, I am satisfied with the system services.
   a. Strongly Agree   b. Agree   c. Undecided   d. Disagree   e. Strongly Disagree
11. Please describe in detail any difficulties you encountered.

Appendix B: Experiment Questions

Ⅰ. Background questions

1. What is your native language? (please circle) 1: Chinese   2: Japanese   3: Korean
2. Have you ever tried to use OCLC's WorldCat online library catalog? 0: No   1: Yes
3. Overall, for how many years have you been doing online searching?
   0: none   1: 1-2 years   2: 3-5 years   3: 6-8 years   4: 9-10 years   5: more than 11 years
4. Are you a librarian? 0: No   1: Yes

Ⅱ. Task questions

Three tasks will be assigned, with different topics.

Task 1: Do a search looking for information written in your native language. The topic will be given at the experiment.
T11. How familiar are you with the topic? 0: I don't know   1: none   2: little   3: somewhat   4: familiar   5: very familiar
T12. How many queries did you issue to find the final answer for this task? _______
T13. How much time did this task take to get the result? ___________ minutes
T14. How many catalog records did you examine? _______
T15. How many catalog records did you save? _______
R1: Are you satisfied with the result? 0: I don't know   1: not at all   2: little   3: satisfied   4: very satisfied
R2: How much related information did you retrieve? 0: I don't know   1: not related at all   2: slightly related   3: fairly related   4: perfect match
R3: Was the information on the retrieved catalogs understandable to you?
0: I don't know   1: not at all   2: little   3: understandable   4: very understandable

Task 2: Do a search looking for information written in English (second language). The topic will be given at the experiment.
T21. How familiar are you with the topic? 0: I don't know   1: none   2: little   3: somewhat   4: familiar   5: very familiar
T22. How many queries did you propose to find the final answer for this task? _______
T23. How much time did this task take to get the result? ___________ minutes
T24. How many catalogs did you examine? _______
T25. How many catalogs did you save? _______
R21: Are you satisfied with the result? 0: I don't know   1: not at all   2: little   3: satisfied   4: very satisfied
R22: How much related information did you get on what you were looking for? 0: I don't know   1: not related at all   2: slightly related   3: fairly related   4: perfect match
R23: Was the information on the retrieved catalogs understandable to you? 0: I don't know   1: not at all   2: little   3: understandable   4: very understandable

Task 3: Do a search looking for information written in a language you do not know. The topic will be given at the experiment.
T31. How familiar are you with the topic? 0: I don't know   1: none   2: little   3: somewhat   4: familiar   5: very familiar
T32. How many queries did you ask to find the final answer for this task? _______
T33. How much time did this task take to get the result? ___________ minutes
T34. How many catalogs did you examine? _______
T35. How many catalogs did you save? _______
T36. What language of materials were you looking for? 0: English   1: Chinese   2: Chinese (traditional)   3: Japanese   4: Korean   5: French   6: Arabic   7: Persian   8: Spanish   9: Other languages
R31: Are you satisfied with the result? 0: I don't know   1: not at all   2: little   3: satisfied   4: very satisfied
R32: How much relevant, related information did you get on what you were looking for? 0: I don't know   1: not related at all   2: slightly related   3: fairly related   4: perfect match
R33: Was the information on the retrieved catalogs understandable to you? 0: I don't know   1: not at all   2: little   3: understandable   4: very understandable

Ⅲ. Overall questions

R4: Do you think this system is efficient, especially when searching for documents in a different language? 0: I don't know   1: not at all   2: little   3: efficient   4: very efficient
R5: Do you think this system is user friendly? 0: I don't know   1: not at all   2: little   3: friendly   4: very friendly
Diverse Population, Diverse Collection? Youth Collections in the United States

Authors

Virginia Kay Williams, Acquisitions Librarian, Wichita State University, 1845 Fairmount Dr., Wichita, Kansas 67260-0068, ginger.williams@wichita.edu
Nancy Deyoe, Assistant Dean for Technical Services, Wichita State University, 1845 Fairmount Dr., Wichita, Kansas 67260-0068, nancy.deyoe@wichita.edu

Acknowledgements: The authors gratefully acknowledge the support provided by the Amigos Fellowship and Opportunity Award in collecting and analyzing data for this study.

Abstract. Do school, public, and academic library collections in the United States provide the children, young adults, and future teachers we serve with books that reflect diverse families and life experiences? Using checklists and OCLC holdings, the authors assessed the extent to which libraries collect youth literature that includes characters from racial and ethnic minorities, characters with disabilities, and characters who identify as LGBTQ. They also assigned public libraries to Conspectus levels and compared youth-diversity holdings by collection expenditures. They found that more than one-third of public libraries spending over $100,000 annually on materials did not achieve the minimal level for representations of diversity in their youth collections, indicating a need for local assessments and additional efforts to provide diverse youth collections.

Keywords: children's literature, young adult literature, diversity, Hispanic, Latino, African American, Asian Pacific Islander, disability, LGBTQ

This is an electronic version of an article published in Technical Services Quarterly, Volume 31, Issue 2 (2014), pp. 97-121. DOI: 10.1080/07317131.2014.875373. Technical Services Quarterly is available online at http://www.tandfonline.com/.
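The checklist-and-holdings approach named in the abstract can be pictured concretely: given the OCLC numbers of the titles on a diversity checklist and a table recording which libraries hold each one, the share of the checklist each library holds is a simple aggregation. The sketch below is illustrative only; every file and column name is hypothetical, and it does not reproduce the authors' actual procedure.

```python
# Checklist-based holdings assessment sketch (hypothetical data).
# checklist.csv: one OCLC number per diverse title on the checklist.
# holdings.csv: columns library_id, oclc_number (one row per held title).
import pandas as pd

checklist = set(pd.read_csv("checklist.csv")["oclc_number"])
holdings = pd.read_csv("holdings.csv")

# Percentage of checklist titles held, per library.
held = holdings[holdings["oclc_number"].isin(checklist)]
pct = (
    held.groupby("library_id")["oclc_number"].nunique()
    .div(len(checklist))
    .mul(100)
    .round(1)
)
print(pct.sort_values(ascending=False).head())
```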
These casual conversations led to the question: How well do library youth collections reflect the diverse families and life experiences of the communities they serve? The authors looked for information on how well youth collections reflect diversity but found little information to support commonly shared unconscious assumptions that some libraries were more likely to have diverse collections because of library type or community demographics. The authors had discussed possible research on how selector education or beliefs influenced juvenile collection diversity, but realized that the first step needed to be determining that significant differences in collections actually exist. Diverse collections are important. Bernice Cullinan (1989) used the metaphors of a mirror and a window to describe two purposes for children’s literature. Books mirror children’s lives, reflecting their own experiences and reassuring children that their hopes, challenges, and dreams are shared by others. Books also create windows into the lives of others, allowing children to peek into other lives and other cultures, to vicariously experience life through another’s eyes. A library collection that includes books with diverse characters and cultures subtly reaffirms young people's sense of self and the value of their culture while providing opportunities to expand their awareness and understanding of other viewpoints and cultures. How well do library collections in the United States provide those mirrors and windows for Diverse Population, Diverse Collection? Page 3 of 32 children and young adults when the characters are members of populations which have been discriminated against or disadvantaged? Literature Review According to Jaeger, Subramaniam, Jones, and Bertot (2011), racial and ethnic diversity have been the focus of library and information science (LIS) diversity research and education, as have efforts to diversify the profession. Jaeger et al. recommended that LIS diversity research and education include populations which may be discriminated against, disadvantaged, and underserved due to factors such as disability, geography, sexual orientation, and socioeconomic status. This study focuses on the extent to which members of three populations are represented in youth collections: members of minority racial and ethnic groups, individuals with disabilities, and individuals who identify as lesbian, gay, bisexual, transgender, queer, or questioning (LGBTQ). Selecting appropriate terminology is a challenge when writing about diversity issues. Terms may be politically charged, may be rejected by the groups they are used to describe, and may change meaning over time. In this article, when discussing work done by other researchers, the authors use the source’s terminology. For example, the same group may be described as Negro, Black, Afro-American, or African American depending on how that group was described in the source material. The authors followed the contemporary practice of using LGBTQ to describe individuals who identify as lesbian, gay, bisexual, transgender, and queer or questioning, but used LGB when discussing work focused on lesbian, gay, and bisexual populations. Following the 2010 usage of the United States Census Bureau, the authors use the terms “majority” or “non-Hispanic White” to represent the currently dominant racial/ethnic group in the United States, while the term “minority” represents members of other racial and ethnic groups. 
Librarians have recognized the importance of diversity in library collections for at least half a century. The Freedom to Read Statement, adopted by the American Library Association and American Association of Publishers in 1953, states, "It is in the public interest for publishers and librarians to make available the widest diversity of views and expressions, including those that are unorthodox, unpopular, or considered dangerous by the majority" (para. 8). The ALA's Policy on Diversity in Collection Development states that "Librarians have a professional responsibility to be inclusive, not exclusive, in collection development …. Access to all materials and resources legally obtainable should be assured to the user…. This includes materials and resources that reflect a diversity of political, economic, religious, social, minority, and sexual issues" (American Library Association, 2013, B.2.1.11).

Despite this recognition that diversity is important in children's literature and in library collections, both the library and education literature contain evidence that building diverse youth collections may be difficult. Although slightly over 16% of the United States population was nonwhite in 1960 (Dodd, 1993), just a few years later, Larrick (1965) reported that, of over 5,000 books published for children between 1962 and 1964 that she examined, a mere 6.7% included one or more Negroes. Larrick also noted that many of those books included Negroes only as a dark face in a crowd or with derogatory stereotypes. Decades later in the 2010 Census, non-Hispanic Whites were the majority population of the United States, but 36% of the population indicated that they were members of racial and ethnic minorities (Humes, Jones, & Ramirez, 2011). Although more than one-third of the population of the United States were members of racial and ethnic minorities, statistics from the Cooperative Children's Book Center (CCBC) showed that less than 10% of books published for children and teenagers in the United States during 2011 were by or about African Americans, American Indians, Asian/Pacific Island Americans, and Latinos (Cooperative Center for Children's Books, n.d.).

Concerns about lack of racial and ethnic minority representation in youth literature are also evident in library science and education literature. In a study of middle-school genre fiction, Agosto, Hughes-Hassell, and Gilmore-Clough (2003) found that about 16% of books published between 1992 and 2001 included at least one minority character in a significant role. Hughes-Hassell and Cox (2010) found that children's board books rarely include minorities, but when they do, frequently present them in inauthentic contexts. Kurz (2012) found that not quite 18% of South Carolina Picture Book Award nominees featured a minority main character, none of them Latino, although more than 40% of the state's children are minorities and more than 7% were Latino. Concerns about racial and ethnic diversity in youth literature go beyond the mere scarcity of titles. For example, Sims noted that middle-aged African Americans who had attended newly integrated schools in the mid-twentieth century often had "painful memories of having white schoolmates call him or her 'Little Black Sambo' or some equally demeaning epithet from classroom story hours" (Sims, 1983, p. 650).
Sims stressed the importance of author’s perspective as well as cultural authenticity, pointing out that even a book with a setting rife with poverty, drugs, and crime can emphasize cultural values such as “the strength and support to be found in human relationships” (Sims, 1983, p. 653). Yokota (1993) pointed out that while an author writing about his or her own culture will naturally include cultural details in a story, it is possible for an author to gain sufficient understanding through research and close contact to write accurately about another culture. Yokota also pointed out that a person who does not live in his or her birth culture may be able to write authentically of the experience of living in an adopted culture but not be able to write authentically of the birth culture.

Caldwell, Kaye, and Mitten (2007) discussed both culturally authentic titles about American Indians and popular but problematic titles. For example, Caldwell et al. cited Susan Jeffers’ Brother Eagle, Sister Sky as a problematic title, praising its environmental message but pointing out that it is based on a made-up speech by Duwamish Chief Seattle and has illustrations reflecting the cultures of the Plains rather than the Northwest Coast Duwamish people. Two books edited by Slapin and Seale (1992, 2005) used a combination of personal essays and annotated bibliographies to explain why many depictions of American Indians in youth books are inaccurate, offensive, and perpetuate harmful stereotypes; both books recommend many youth books which are accurate and authentic in their portrayal of American Indians. The problems of damaging stereotypes and lack of cultural authenticity are so extensive that college textbooks, such as Multicultural Literature and Response: Affirming Diverse Voices (Smolen & Oswald, 2011) and Cultural Journeys: Multicultural Literature for Children and Young Adults (Gates & Mark, 2006), have been published to introduce future teachers, librarians, and others preparing to work with young people to issues in selecting and using youth books about minority cultures.

Individuals with disabilities are another population which may be discriminated against, disadvantaged, and underserved by libraries. Approximately 13% of children and young adults in United States public schools were served under the Individuals with Disabilities Education Act (IDEA) during the 2008-09 school year, indicating a strong need for library collections to include books that mirror the experiences of these youth (Aud et al., 2011). The authors were unable to find statistics on the number of youth books published about individuals with disabilities, although they did note that Disabilities and Disorders in Literature for Youth (Crosetto, Garcha, & Horan, 2009) included just over 500 youth titles, most published between 2000 and 2008.

Many librarians and educators have written about using books to help children and teenagers understand and accept individuals with disabilities. In a review of research studies, Anthony (1972) found that neither providing contact with nor information about individuals with disabilities consistently improved attitudes toward them, but that combinations of regular contact with and reading or viewing information about individuals with disabilities did improve attitudes.
Smith-D’Arezzo (2003) found that encouraging elementary school children to read enjoyable and realistic books about children with disabilities can promote understanding, although she also stated that not all children will develop more accepting attitudes unless an adult provides some guidance. Prater, Dyches, and Johnstun (2006) encouraged educators to use books that include characters with learning disabilities to help children and young adults understand the challenges that disabled peers face and accept them as people first, rather than focusing on their differences. Rogers and Writt (2010) suggested that reading children’s literature about disabilities helps education students learn about books to share with children and helps them learn about disabilities without intimidating jargon. Andrews (1998) encouraged librarians who are evaluating fiction about individuals with disabilities to look for a plot that does not center on the disability, but reveals aspects of the disability naturally through the storyline. Finally, Walling (2010) stressed that books about children with disabilities should present the disability as just one characteristic of a whole person who is involved in life, faces realistic problems, and is involved with others who do not have disabilities.

A third population which may be discriminated against, disadvantaged, and underserved by libraries is people who identify as LGBTQ. Estimates of the number of LGBTQ individuals in the United States vary, but demographer Gary J. Gates (2011) used data from multiple surveys to estimate that 3.8% of the United States population identifies as LGBTQ, while the 2010 Census showed that more than 110,000 same-sex couples in the United States are raising children (Lofquist, Lugaila, O’Connell, & Feliz, 2012). Sexual orientation is a controversial topic in the United States; such controversial topics require extra diligence from librarians to ensure all children and young adults can find mirrors of themselves and their families in the collection.

Several studies conducted during the 1990s examined the extent to which libraries hold LGBTQ titles. A 1995 Library Journal survey of LGB issues in public and college libraries found “widespread, though not universal, inattention to gay book collections”; half of the responding libraries reported holding fewer than 30 LGB titles, and 14% reported holding none (Bryant, 1995, p. 37). Sweetland and Christensen (1995) compared holdings of LGB titles to a list of general titles selected from Publishers Weekly, finding that the general titles had significantly more OCLC holdings than the LGB titles even though both title lists had received approximately the same number of reviews. Spence (1998, 1999) conducted two studies of the catalogs of urban public library systems in the United States and Canada, one focusing on adult LGB titles and the other on young adult LGB fiction, finding that both the number of titles held and the number of copies held per capita varied greatly. Rothbauer and McKechnie (1999), investigating holdings of LGB young adult fiction in Canadian public libraries, also found a disparity in the number of titles held. In recent years, LGBTQ materials have been widely discussed among librarians and educators; for example, Stringer-Stanback (2011) studied the relationship between holdings of LGBTQ non-fiction young adult materials and the adoption of county anti-discrimination policies in the Southeastern United States.
The publication of Rainbow Family Collections (Naidoo, 2012) provided librarians with a guide to selecting and using children’s books with LGBTQ content.

Interest in diverse youth literatures among researchers has been paralleled by the growth of book awards and other recognitions targeting specific groups. The ALA, its divisions, and its affiliates have established awards to recognize and promote quality youth literature by and about racial and ethnic minorities, about people with disabilities, and about people who identify as LGBTQ.

The Coretta Scott King Award, founded in 1969, recognizes excellence in children's and young adult books that "demonstrate an appreciation of African American culture and universal human values” (Ethnic and Multicultural Information Exchange Round Table, 1996-2013, para. 1). Titles are selected annually by the Coretta Scott King Book Award Committee, part of ALA's Ethnic and Multicultural Information Exchange Round Table (EMIERT). The Coretta Scott King Book Award Committee also periodically recognizes a new African American author or illustrator for children and young adults with the John Steptoe Award.

The Pura Belpré Award was established in 1996 to recognize children's and young adult literature that "best portrays, affirms, and celebrates the Latino cultural experience” (Association for Library Service to Children, 1996-2013, para. 1). The honored book must be by a Latino or Latina author or illustrator. The award is co-sponsored by the Association for Library Service to Children (ALSC) and the National Association to Promote Library and Information Services to Latinos and the Spanish-Speaking (REFORMA). The Belpré Award was given biennially until 2008; it is now an annual award.

In 2000-2001, the Asian Pacific American Library Association (APALA) established awards to recognize books about Asian Pacific Americans and their heritage (Asian Pacific American Library Association, 2010). The criteria focus on literary and artistic merit, and the guidelines encourage that nominated titles be by Asian Pacific Island Americans. Awards are presented biennially in several categories, ranging from picture book to adult, and the association may also name honor books.

The American Indian Library Association (AILA) presented the first AILA Youth Literature Awards in 2006 (American Indian Library Association, ca. 2006). The award committee is charged with identifying the best writing and illustrations by and about American Indians. The winning books must "present American Indians in the fullness of their humanity in the present and past contexts” ([linked criteria handout], para. 1). The AILA Awards are presented biennially in picture book, middle school, and young adult categories; the committee may also name honor books.

In remarks made at the first presentation of the Schneider Family Book Award, Schneider (2004) described the librarian at the Michigan Library for the Blind as her childhood hero and mentioned that during her school years in the 1950s, books and media rarely mentioned people with disabilities. Schneider and her family endowed the award in 2003 to recognize "a book that embodies an artistic expression of the disability experience for child and adolescent audiences” (American Library Association, 1996-2013b, para. 1). An ALA award jury selects three books annually, usually one each for younger children, middle grades, and teens.
The award’s guidelines state that winning books must portray a character living with a physical, mental, or emotional disability; the character does not have to be the protagonist but must be important to the story.

The Rainbow List is an annual bibliography of books for youth from birth to age 18 that reflect the LGBTQ experience (American Library Association, 1996-2013a). Titles are selected by a committee drawn from members of the Gay, Lesbian, Bisexual, and Transgender Round Table and the Social Responsibilities Round Table of the ALA. Criteria include commendable literary quality and distribution in the United States during the previous 18 months; both fiction and nonfiction titles are eligible. The first Rainbow Book List was announced in 2008, with 45 titles published between 2005 and 2007. The initial committee noted that most of the titles were for teens because few titles had been published that were appropriate for children (Rainbow Books, 2009).

Research Questions

This research began with a very broad question: How well do library youth collections reflect the diverse families and life experiences of the communities they serve? Diversity can be viewed in many ways; the ALA’s Policy on Diversity in Collection Development lists "political, economic, religious, social, minority, and sexual issues” (American Library Association, 2013, section B.2.1.11). The authors decided to focus on four questions in measuring the extent to which youth collections include representations of diverse families and life experiences:

• To what extent do libraries in the United States collect youth literature that includes characters from racial and ethnic minorities? Characters with physical, learning, and emotional disabilities? Characters who identify as LGBTQ?
• Is there any difference in representations of diversity in youth collections in academic, public, and school libraries?
• Is there any difference in representations of diversity in youth collections by region of the United States?
• Is there any difference in representations of diversity in public library youth collections by library collection size? By expenditures on collections?

Initially, the authors also hoped to address a fifth question: Is there any difference in representations of diversity in public library youth collections by racial diversity, ethnic diversity, or median household income of the community served? However, the authors soon discovered that while such demographic data is readily available in the United States Census, matching library holdings to census data is complicated because public libraries may serve a city, a county, or a group of counties.

Research Method

The authors used the list-checking method to determine the extent to which libraries collect youth literature that includes characters from racial and ethnic minorities, characters with disabilities, and characters who identify as LGBTQ. As Porta and Lancaster (1988) noted, the first problem in evaluating collections by list checking is the selection or creation of a bibliography that represents the types of books the library's users are likely to seek.
Criteria for the lists included:

• Titles had been positively reviewed or included on recommended lists
• Titles were appropriate for birth to age 18
• Titles had been evaluated for cultural authenticity and avoidance of stereotypes
• Titles ranged from widely to less commonly held
• Titles were published between 2000 and 2009

The restriction on publication date was intended to emphasize relatively recent collection decisions and to minimize the likelihood that checklist titles had been weeded from library collections.

The authors were unable to locate an appropriate checklist for racial and ethnic diversity, so they developed a checklist. The basis of the racial/ethnic diversity checklist was titles recognized by professional library associations for excellence in portraying minority cultures in youth literature, including titles which won or were named as honor books for the Coretta Scott King Award, the John Steptoe Award, the Pura Belpré Award, the APALA Award, and the AILA Award. Because one goal of the checklist was to include a broad spectrum of titles, from widely to less commonly held, the authors used Multicultural Review to identify additional titles for the racial/ethnic diversity checklist. Multicultural Review was founded in 1991 to foster a "better understanding of ethnic, racial and religious diversity" (Smith, 2009, para. 16) among educators and librarians; most book reviewers were either members of racial and ethnic minorities or had an academic background in the culture of the book being reviewed. In addition, Multicultural Review explicitly addressed issues of cultural authenticity in its book reviews. The authors added titles from the children’s review section to the checklist. The authors did not add a title when the reviewer noted problems with cultural authenticity, the reviewer specifically stated that the title was not recommended, the book was not published between 2000 and 2009, or when the authors were not able to determine which racial/ethnic group was portrayed in the book.

The authors assigned each title to a broad racial/ethnic grouping, using the award list the title appeared on or the race/ethnicity mentioned in the Multicultural Review book review. Some titles added to the checklist from Multicultural Review were assigned to two racial/ethnic groups when the review mentioned both groups. The authors made no attempt to ensure that the checklist included equal representation of each group; the checklist reflects the availability of culturally authentic children’s and young adult books distributed in the United States from 2000 to 2009. Given that broad groupings like Asian Pacific Islander American reflect many cultures, from Iraqi Americans to Japanese Americans, the value of any data analysis made based on such broad groupings would be limited. Table 1 shows the number of titles from each broad group on the racial/ethnic checklist.

Table 1. Titles per checklist.

Checklist      | Subgroup                      | No. of Titles
Racial/Ethnic  | All Titles                    | 964
               | African American              | 422
               | American Indian               | 72
               | Asian/Pacific Island American | 278
               | Hispanic/Latino               | 209
Disabilities   | All Titles                    | 334
               | Emotional                     | 117
               | Mental                        | 115
               | Physical                      | 102
LGBT           | All Titles                    | 116
Major Awards   | All Titles                    | 30
               | Caldecott                     | 10
               | Newbery                       | 10
               | Printz                        | 10
All Checklists |                               | 1,421

Note: Some titles were assigned to multiple categories.
The second checklist of titles portraying people with disabilities was developed from two sources, the Schneider Family Book Award and the selective bibliography Disabilities and Disorders in Literature for Youth (Crosetto, Garcha, & Horan, 2009). Titles for the bibliography were selected from positive reviews in a variety of standard sources; only titles published during the decade from 2000 to 2009 were included on the disabilities checklist. Because the authors relied heavily on a bibliography published in 2008, the disability title list includes few titles published during late 2008 or during 2009.

The third checklist, titles portraying people who identify as LGBTQ, is composed of titles from the Rainbow Book List. The earliest Rainbow List began with titles published in 2005, so the LGBTQ list includes no titles from the first half of the decade. While Naidoo’s Rainbow Family Collections (2012) would also be an excellent source of titles for children ages birth to 11, it was published after the authors completed the data collection for this project.

The authors used a fourth checklist, major awards, to identify libraries that actively collect youth literature. For the purpose of this research, the authors defined a library that actively collects youth literature as one that held at least 20% of the Caldecott Medal, Newbery Medal, and Printz Award winning titles from 2000 to 2009. As these three awards recognize the best American picture book for children, the most distinguished book for children published in America, and excellence in young adult literature, any United States library that actively collects youth literature should hold some of these titles. While 20% may seem a low threshold, the authors did not want to eliminate school libraries which serve a limited age range.

After developing the four checklists, the authors searched each title in OCLC WorldCat to determine library holdings. All holding symbols were copied into a spreadsheet for each title. Many titles had multiple records, reflecting hardcover and paperback editions, audio books, video recordings based on the book, Braille editions, and editions in multiple languages. Since content, rather than format or language, was of primary importance for this research, holdings were recorded for every OCLC record. Holding symbols were de-duplicated, then transferred to an Access database. The database included 11,394 unique holding symbols.

Using checklists is a standard method of assessing library collections, but libraries sometimes need a method to express their collection goals and to compare their collections to those of similar institutions. In the 1980s, the Research Libraries Group (RLG) developed a six-level Conspectus scale to rate library collections; the scale ranges from 0 for out-of-scope to 5 for comprehensive collecting (Bushing, Davis, & Powell, 1997). Early Conspectus rankings were based solely on librarians’ professional judgment. White (1995) proposed brief tests of collection strength as a method of reducing the subjectivity of Conspectus rankings. His brief tests relied on a subject expert developing title lists ranked from most likely to least likely to be held by a library. By obtaining OCLC holding counts for each title on the list, White could establish an objective scale for comparing libraries.
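Both the authors' data collection and White's brief tests rest on the same preparatory step: collapsing the holding symbols attached to a title's many OCLC records into a single de-duplicated set, whose size becomes that title's holding count. The following minimal sketch in Python illustrates that step; the record structure, titles, and symbols are hypothetical stand-ins, not the authors' actual data or code.

```python
# Minimal sketch of per-title holdings aggregation (hypothetical data).
# Each checklist title may have several OCLC records (hardcover,
# paperback, audio, Braille, translations); a library counts once per
# title no matter how many of those records carry its symbol.
records = [
    {"title": "Example Title A", "holding_symbols": ["AAA", "BBB", "CCC"]},
    {"title": "Example Title A", "holding_symbols": ["BBB", "DDD"]},  # paperback
    {"title": "Example Title B", "holding_symbols": ["AAA"]},
]

holdings_by_title = {}
for record in records:
    # set.update() de-duplicates symbols across a title's records
    holdings_by_title.setdefault(record["title"], set()).update(
        record["holding_symbols"]
    )

for title, symbols in holdings_by_title.items():
    print(title, len(symbols))  # de-duplicated holding count per title
# -> Example Title A 4
# -> Example Title B 1
```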
Extensive testing showed that White’s brief tests were cumulative; a library that held at least half of the titles at a given level would also hold at least half of the titles at all lower levels. Later tests by Twiss (2001) and Lesniaski (2004) demonstrated that results from White’s brief-test method were replicable and consistent with expectations in both academic and public libraries. Given a list of books on a specific topic, OCLC holdings could be used to establish an objective scale for assigning Conspectus levels.

After de-duplicating the OCLC holdings for all checklist titles, the authors created a chart matching OCLC holdings ranges with RLG Conspectus levels. Titles were ranked from most to least commonly held, then the ranges were divided into five levels: minimal, basic, instructional/study, research, and comprehensive. Each level included 20% of the checklist titles. The authors used this scale to assign each library a Conspectus level for youth racial/ethnic diversity, youth disability diversity, and youth LGBTQ diversity. Following White’s procedures, a library was assigned to the highest Conspectus ranking for which it held at least half of the titles. Table 2 shows the Conspectus levels with the corresponding OCLC holdings ranges.

Table 2. Conspectus levels with corresponding OCLC holding ranges.

Conspectus Level        | Race/Ethnicity Checklist | Disability Checklist | LGBTQ Checklist
L-1 Minimal             | more than 1480           | more than 1100       | more than 1000
L-2 Basic               | 900-1480                 | 600-1100             | 730-1000
L-3 Instructional/Study | 640-899                  | 401-599              | 541-729
L-4 Research            | 420-639                  | 280-400              | 330-540
L-5 Comprehensive       | fewer than 420           | fewer than 280       | fewer than 330

The authors next queried the database to identify OCLC symbols for libraries that met the 20% threshold for actively collecting youth literature. Student workers looked up these holding symbols (OCLC, 2013) and recorded the symbol, library name, street address, city, state, zip code, country, and institution type in the database. The authors eliminated libraries that were outside the United States or which were not clearly identified as academic, public, school, or votech/community colleges in the OCLC listing. Libraries identified as either academic or votech/community college in the OCLC listing were categorized as academic libraries for this project. By the conclusion of data collection, the authors had identified 5,004 United States academic, public, and school libraries that actively collect youth literature.

Each library that actively collected youth literature was assigned to a region based on the Census Bureau regions: Northeast, Midwest, South, and West (United States Census Bureau, n.d.). To determine the extent to which libraries collect youth literature that includes characters from racial and ethnic minorities, characters with disabilities, and characters who identify as LGBTQ, the authors calculated the average number of titles held for each type of library by region for each checklist. The authors also computed a one-way ANOVA (F) comparing the number of titles held from each checklist by type of library, and a one-way ANOVA comparing the number of checklist titles held by region of the United States, to determine whether there is a difference in representations of diversity in youth collections by type of library or by region of the United States.
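As an illustration of the two computations just described, the sketch below assigns a Conspectus level under White's half-the-titles rule and then runs a one-way ANOVA across library types. It is a minimal sketch with hypothetical title sets and counts, not the authors' code or data; the cumulative treatment of levels follows White's brief-test findings cited above.

```python
# Sketch of Conspectus-level assignment and a one-way ANOVA
# (hypothetical data throughout).
from scipy.stats import f_oneway

def conspectus_level(titles_held, titles_by_level):
    """Return the highest level (1-5) for which the library holds at
    least half of that level's checklist titles, or 0 (out of scope).
    Levels are treated as cumulative, per White's brief-test results."""
    rating = 0
    for level in range(1, 6):
        level_titles = titles_by_level[level]
        if len(titles_held & level_titles) * 2 >= len(level_titles):
            rating = level
        else:
            break  # failing a level means higher levels cannot be met
    return rating

# Toy checklist: two titles per level, ranked from most to least held.
titles_by_level = {
    1: {"t1", "t2"}, 2: {"t3", "t4"}, 3: {"t5", "t6"},
    4: {"t7", "t8"}, 5: {"t9", "t10"},
}
print(conspectus_level({"t1", "t2", "t3"}, titles_by_level))  # -> 2 (Basic)

# One-way ANOVA comparing checklist titles held across library types,
# with hypothetical per-library counts.
academic_counts = [12, 20, 8, 15, 11]
public_counts = [150, 210, 90, 300, 180]
school_counts = [30, 25, 40, 18, 22]
f_stat, p_value = f_oneway(academic_counts, public_counts, school_counts)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```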
To enable comparisons between libraries with similar collection sizes and similar collection budgets, the authors recorded the number of print volumes held and total collection expenditures for each public library from the Public Library Survey 2009 (Manjarrez et al., 2011). The authors manually matched OCLC symbols to the Public Library Survey data by comparing library names and addresses. Although the authors had identified 2,507 public libraries that actively collected youth literature, the authors were only able to match 2,264 public library OCLC symbols to libraries in the Public Library Survey. The most common problem in matching OCLC holding symbols to the Public Library Survey was that some OCLC symbols were for regional library systems or cooperatives, even though they were listed as public libraries in the OCLC listing. Another common problem was the existence of multiple possible matches in the same town, but none that closely matched the library name in the OCLC listing. After matching OCLC symbols and library names, the authors computed the coefficients of determination (r²) for each of the checklists to determine whether a relationship exists between collection size and youth diversity holdings or between collection expenditures and youth diversity holdings.

Finally, the authors decided to include limited demographic information for a few public libraries that had strong holdings of titles on the three diversity checklists. The authors recorded the total population, the number who identified themselves as White with no other race indicated, and the number who identified themselves as Hispanic/Latino in the United States Census (2010) for each library’s service area. Determining which demographic data should be matched to each OCLC holding symbol was complicated, as some libraries serve cities, some counties, and some multi-county consortia. The authors checked each library’s website to determine its service area, then extracted data for the relevant cities and counties from the appropriate census tables.

Results and Discussion

The authors identified 5,002 academic, public, and school libraries in the United States that actively collect youth literature; these institutions held an average of 158.1 titles from the 964 titles on the race/ethnicity checklist, 45.7 titles from the 334 titles on the disability checklist, and 24.9 titles from the 116 titles on the LGBTQ checklist. The number of titles held varied both by type of library and by region; see Table 3 for the average number and range of titles held.

Table 3. United States libraries that actively collect youth literature with titles held per checklist.
Library Type | Region    | No. Libraries | Major Awards (n=30) Mean (Range) | Race/Ethnic (n=964) Mean (Range) | Disability (n=334) Mean (Range) | LGBT (n=116) Mean (Range)
Academic     | Northeast | 327           | 21.27 (6-30)                     | 104.54 (1-780)                   | 20.61 (0-156)                   | 8.87 (0-109)
Academic     | Midwest   | 469           | 21.74 (6-30)                     | 119.43 (1-808)                   | 22.08 (0-148)                   | 9.29 (0-101)
Academic     | South     | 592           | 20.45 (6-30)                     | 100.8 (0-736)                    | 20.73 (0-143)                   | 7.39 (0-95)
Academic     | West      | 264           | 19.89 (6-30)                     | 109.8 (0-687)                    | 18.56 (0-126)                   | 9.21 (0-106)
Public       | Northeast | 163           | 22.99 (6-30)                     | 348.4 (2-888)                    | 91.72 (1-316)                   | 43.43 (0-128)
Public       | Midwest   | 1095          | 21.04 (6-30)                     | 193.08 (2-929)                   | 56.37 (0-290)                   | 22.85 (0-135)
Public       | South     | 726           | 19.67 (6-30)                     | 210.28 (5-864)                   | 52.04 (0-259)                   | 19.74 (0-117)
Public       | West      | 523           | 22.35 (6-30)                     | 246.64 (1-893)                   | 60.69 (0-252)                   | 29.48 (0-130)
School       | Northeast | 46            | 11.48 (6-30)                     | 75.02 (11-655)                   | 20.91 (1-182)                   | 8.87 (0-70)
School       | Midwest   | 512           | 12.53 (6-30)                     | 71.43 (1-684)                    | 22.82 (1-200)                   | 5.6 (0-90)
School       | South     | 96            | 9.8 (6-30)                       | 70.73 (11-767)                   | 17.43 (1-259)                   | 3.6 (0-74)
School       | West      | 189           | 12.9 (6-30)                      | 82.03 (6-810)                    | 22.94 (1-249)                   | 6.87 (0-89)

Note: There was a significant difference among library types in holdings of titles on the race/ethnicity (F(2, 4999) = 328.29, p < .05), disability (F(2, 4999) = 493.68, p < .05), and LGBTQ (F(2, 4999) = 335.81, p < .05) checklists. There was a significant difference among regions in holdings of titles on the race/ethnicity (F(3, 4998) = 8.46, p < .05), disability (F(3, 4998) = 3.71, p < .05), and LGBTQ (F(3, 4998) = 15.28, p < .05) checklists.

The authors were surprised to note that three academic institutions held no titles from the race/ethnicity checklist. None of these three libraries held more than one-third of the titles from the major award checklist, and none held more than two titles each from the disability and LGBTQ checklists. This suggests that these libraries, and perhaps other academic libraries, may have very small youth literature collections. Some of the academic libraries may limit their youth collections to titles that won a specific award, but have a few titles in their general collections which were included on the diversity checklists as appropriate for young adults.

Forty academic libraries and eight public libraries held no titles from the disability checklist. With the exception of the University of Wisconsin-Green Bay, the academic libraries that held no titles from the disability list also held less than 0.5% of the titles on the race/ethnicity checklist and five or fewer titles from the LGBTQ checklist, a finding which suggests that these libraries may limit their youth collections to titles that win specific awards. The University of Wisconsin-Green Bay was unusual because it held 142 titles from the race/ethnicity checklist but no titles from the disability checklist. The authors speculate that the university may have faculty members expressing an interest in racially/ethnically diverse titles, but not specifically requesting titles or assigning students to read youth titles about individuals with disabilities. The authors were also surprised to note that eight public libraries held no titles on the disability checklist. All eight also had relatively few holdings from the other diversity checklists and held less than half of the major award checklist titles. Three of the eight spent less than $10,000 on collections, according to the Public Library Survey 2009; the authors were unable to match the other five libraries to a library in the Public Library Survey database.
While limited budgets are one possible explanation for the lack of disability checklist titles at these eight libraries, other possibilities include inconsistencies in attaching holdings to OCLC records, reliance on floating collections within a public library system to supply some needs, and weeding or loss of books.

Fifteen percent of the libraries held none of the 116 titles on the LGBTQ checklist. Finding that 237 academic libraries, 326 public libraries, and 207 school libraries held none of the LGBTQ checklist titles is concerning. While sexual orientation is a controversial topic, the Freedom to Read Statement specifically recognizes the need for librarians “to make available the widest diversity of views and expressions, including those that are unorthodox, unpopular, or considered dangerous by the majority” (American Library Association & Association of American Publishers, 1953-2004, para. 8). Some of the 770 libraries that hold no titles from the LGBTQ checklist may have very limited collection budgets or restrict their youth collections to titles that win specific awards; however, 31 of the libraries held at least 10% of the 964 titles on the race/ethnicity checklist, and 35 held at least 20% of the 334 titles on the disability checklist. Limited budgets cannot be the only reason some libraries have no holdings from the LGBTQ checklist.

The authors ranked the libraries by number of titles held from the race/ethnicity checklist, the disability checklist, and the LGBTQ checklist; the top ten holding libraries for each checklist are listed in Table 4 with the number of titles held and ranking for each checklist. With one exception, libraries that ranked in the top ten for holdings of one checklist also ranked in the top 100 on the other two checklists. Brooklyn Public Library was a puzzling exception; it ranked tenth on holdings of the LGBTQ checklist titles and 84th on holdings of the disability checklist titles, but only 288th on holdings of the race/ethnicity checklist titles. No academic libraries ranked in the top ten, and only one school library did. That school library is the Pasco County School System, Florida; the authors searched for its website and discovered that Pasco is a school district with 84 schools (Pasco County Schools, 2013). Although Pasco County School System ranked in the top ten on disability checklist holdings, it was not in the top 50 for either race/ethnicity checklist or LGBTQ checklist holdings, suggesting that one or more of the district’s librarians makes collecting books about individuals with disabilities a high priority. Four of the public libraries ranked in the top ten on all three youth diversity checklists.

Table 4. Libraries holding the most titles from the diversity checklists.
Region    | Library                                            | % White Alone | % Hispanic | Print Volumes | Expenditures ($) | Race/Ethnicity No. (Rank) | Disability No. (Rank) | LGBTQ No. (Rank)
Midwest   | ALLEN CNTY PUB LIBR, Ft Wayne, IN                  | 79.3          | 6.5        | 3,374,517     | 3,281,970        | 929 (1)                   | 290 (2)               | 92 (10)
South     | BIRMINGHAM-JEFFERSON PUB LIBR, Birmingham, AL      | 53            | 3.9        | 768,914       | 1,628,168        | 851 (11)                  | 259 (8)               | 76 (27)
Northeast | BROOKLYN PUB LIBR, Brooklyn, NY                    | 42.8          | 19.8       | 3,943,126     | 7,448,824        | 442 (288)                 | 149 (84)              | 92 (10)
Northeast | CARNEGIE LIBR OF PITTSBURGH, Pittsburgh, PA        | 81.5          | 1.6        | 1,566,561     | 3,003,292        | 703 (92)                  | 183 (51)              | 97 (9)
Midwest   | CHICAGO PUB LIBR, Chicago, IL                      | 45            | 28.9       | 5,295,965     | 10,187,665       | 869 (7)                   | 250 (13)              | 103 (2)
West      | Denver Public Library, Denver, CO                  | 68.9          | 31.8       | 1,986,100     | 4,577,200        | 830 (17)                  | 198 (41)              | 95 (10)
South     | Harris County Public Library, Houston, TX          | 56.6          | 40.8       | 1,940,144     | 4,098,703        | 864 (9)                   | 227 (22)              | 75 (32)
Midwest   | HENNEPIN CNTY LIBR, Minnetonka, MN                 | 74.4          | 6.7        | 4,417,865     | 6,910,200        | 891 (3)                   | 208 (33)              | 104 (1)
Midwest   | INFOSOUP (NE WI PUB LIBR), Appleton, WI            | 92.8          | 2.7        | 326,112       | 532,980          | 792 (38)                  | 269 (5)               | 90 (10)
South     | Jacksonville Public Library, Jacksonville, FL      | 60.9          | 7.6        | 2,627,949     | 3,738,487        | 864 (9)                   | 253 (11)              | 83 (26)
West      | JEFFERSON CNTY PUB LIBR, Lakewood, CO              | 88.4          | 14.3       | 1,065,523     | 3,821,000        | 765 (56)                  | 189 (47)              | 101 (3)
West      | KING CNTY LIBR SYST, Issaquah, WA                  | 68.7          | 8.9        | 3,084,584     | 12,567,119       | 893 (2)                   | 252 (24)              | 96 (6)
Midwest   | LIBRARY NETWORK, THE, Southgate, MI                | 66.4          | 4.3        | 744,914       | 831,198          | 763 (57)                  | 258 (9)               | 58 (45)
Midwest   | LINK (SOUTH CENT LIBR SYS), Madison, WI            | 88.8          | 4.7        | n/a           | n/a              | 881 (6)                   | 267 (6)               | 98 (4)
West      | LOS ANGELES PUB LIBR, Los Angeles, CA              | 49.8          | 48.5       | 6,433,495     | 10,115,236       | 893 (2)                   | 223 (24)              | 87 (16)
Midwest   | MILWAUKEE CNTY FEDERATED LIBR SYST, Milwaukee, WI  | 60.6          | 13.3       | 2,228,916     | 2,005,404        | 867 (8)                   | 252 (12)              | 94 (7)
Northeast | NASSAU LIBR SYST, Uniondale, NY                    | 73            | 14.6       | 103,522       | 227,833          | 883 (5)                   | 316 (1)               | 99 (5)
Northeast | New York Pub Libr Res Libr, Long Island City, NY   | 44            | 28.6       | n/a           | n/a              | 797 (36)                  | 163 (70)              | 94 (9)
Northeast | ONONDAGA CNTY PUB LIBR, Syracuse, NY               | 81.1          | 4          | 635,799       | 763,155          | 806 (29)                  | 260 (7)               | 94 (8)
South     | PASCO CNTY SCH SYST, Land O'Lakes, FL              | n/a           | n/a        | n/a           | n/a              | 767 (54)                  | 259 (8)               | 62 (56)
Northeast | QUEENS BOROUGH PUB LIBR, Jamaica, NY               | 39.7          | 27.5       | 6,882,543     | 9,166,748        | 860 (10)                  | 254 (10)              | 81 (23)
Northeast | ROCHESTER PUB LIBR, Rochester, NY                  | 43.7          | 16.4       | 1,394,992     | 1,474,150        | 888 (4)                   | 289 (3)               | 94 (8)
West      | SEATTLE PUB LIBR, Seattle, WA                      | 69.5          | 6.6        | 1,781,861     | 5,960,001        | 832 (15)                  | 185 (49)              | 95 (7)
West      | Washoe County Library System, Reno, NV             | 76.9          | 22.2       | 1,146,667     | 761,290          | 773 (49)                  | 209 (32)              | 93 (10)
Northeast | WESTCHESTER LIBR SYST, Tarrytown, NY               | 68.1          | 21.8       | 86,988        | 113,217          | 881 (6)                   | 287 (4)               | 84 (24)

Note: Libraries holding the same number of titles for a checklist are ranked identically.

Examining the service areas of the four libraries that ranked in the top ten on all three youth diversity checklists revealed a problem with relying on OCLC WorldCat holdings to assess library collections. Some OCLC holding symbols are for multi-branch public library systems. For example, Allen County Public Library (2005) consists of the main library and 13 branches. LINK is the shared catalog for the South Central Library System (2013), which serves a 7-county area of Wisconsin. The Nassau Library System (2007-2012) includes 54 public libraries in Nassau County, New York, and Rochester Public Library (City of Rochester, n.d.) has ten locations. OCLC WorldCat holdings indicate that children and teenagers at each of these four libraries have access to diverse youth collections through the library catalog, but the data do not reveal the number of copies held or their distribution among branches.
As Spence (1998, 1999) noted in his studies of LGBTQ materials in the late 1990s, by looking at both the number of titles held and the number of copies held per capita, librarians can gain a better picture of how accessible materials on a particular topic are in a library system.

After comparing holdings of the diversity checklist titles by geographic region and type of library, the authors focused on holdings by public libraries. Examining the number of diversity checklist titles held per 10,000 volumes held revealed a mild relationship between collection size and the number of race/ethnicity titles (r² = 0.22), disability titles (r² = 0.22), and LGBTQ titles (r² = 0.22) held. Larger collections tended to have more titles from the diversity checklists. On average, the number of race/ethnicity checklist titles held increased by 18.4 for every additional 10,000 volumes in the collection, while the number of disability titles increased by 0.4 and the number of LGBTQ titles increased by 0.3 per additional 10,000 volumes in the collection. The number of titles on each checklist varied considerably, so it is not appropriate to make comparisons among checklists.

As can be seen from the data in Table 5, which shows average checklist holdings by collection size, the number of volumes held is not a perfect predictor of diversity title holdings. The average number of diversity titles held by libraries with very small collections (under 5,000 volumes) and very large collections (over five million volumes) does not fit the general rule of larger collections having more diversity checklist titles. The authors examined the libraries with fewer than 5,000 volumes and noted that this group includes a consortium with many more diversity titles than the other six libraries in the group, skewing the average holdings for the very-small-libraries group. The libraries within that consortium have separate holding symbols, leading the authors to speculate that the consortium collection consists of last copies, professional titles, or other titles that supplement the member library collections. The authors also examined the libraries with over five million volumes, noting that the number of race/ethnicity checklist titles held ranged from 250 to 893, disability titles from 77 to 250, and LGBTQ titles from 45 to 133, but did not identify any possible reason for the wide disparity in diversity checklist holdings.

Table 5. Average titles held in public libraries by print collection size.

No. Print Volumes      | No. Public Libraries | Major Award (n=30) | Race/Ethnicity (n=964) | Disability (n=334) | LGBTQ (n=116)
Less than 5,000        | 7                    | 17.71              | 134.29                 | 33.71              | 12.29
5,000 to 9,999         | 64                   | 12.39              | 41.45                  | 13.27              | 2.30
10,000 to 24,999       | 414                  | 14.12              | 40.31                  | 13.90              | 2.38
25,000 to 49,999       | 429                  | 19.00              | 84.29                  | 26.70              | 6.72
50,000 to 99,999       | 443                  | 22.76              | 174.23                 | 49.34              | 18.11
100,000 to 499,999     | 707                  | 24.57              | 342.38                 | 86.84              | 40.38
500,000 to 999,999     | 115                  | 26.10              | 507.62                 | 126.65             | 62.11
1,000,000 to 2,499,999 | 64                   | 27.25              | 560.66                 | 143.64             | 74.67
2,500,000 to 4,999,999 | 14                   | 29.14              | 710.86                 | 180.07             | 103.79
5,000,000 or more      | 7                    | 28.86              | 624.71                 | 168.86             | 83.00
Overall Mean           | --                   | 21.05              | 219.33                 | 58.31              | 24.67

The authors also found a mild relationship between expenditures on public library collections and the number of race/ethnicity titles (r² = 0.27), disability titles (r² = 0.27), and LGBTQ titles (r² = 0.3) held. Libraries that spend more money on collections tend to hold more titles from the diversity checklists.
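These coefficients of determination, and the per-dollar slopes reported next, are the standard outputs of a simple linear fit of checklist holdings against expenditures (or collection size). A minimal sketch of such a fit follows; the expenditure and holdings figures are hypothetical illustrations, not the study's data.

```python
# Sketch of a simple linear fit relating collection expenditures to
# diversity-checklist titles held (hypothetical data).
from scipy.stats import linregress

expenditures = [4_000, 30_000, 75_000, 250_000, 900_000, 3_000_000]
titles_held = [1, 6, 13, 33, 62, 73]

fit = linregress(expenditures, titles_held)
r_squared = fit.rvalue ** 2        # coefficient of determination
per_100k = fit.slope * 100_000     # added titles per $100,000 spent
print(f"r^2 = {r_squared:.2f}, titles per $100,000 = {per_100k:.1f}")
```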
On average, the number of race/ethnicity checklist titles held increased by 108.3, the number of disability titles increased by 2.7, and the number of LGBTQ titles increased by 1.6 per $100,000 spent on collections. Table 6 shows average checklist holdings by 2009 collection expenditures.

Table 6. Mean public library holdings of checklist titles by 2009 collection expenditures.

Expenditures (in dollars) | No. of Libraries | Major Award (n=30) | Racial/Ethnic (n=964) | Disability (n=334) | LGBTQ (n=116)
Less than 5,000           | 87               | 11.48              | 24.40                 | 8.11               | 0.85
5,000 to 9,999            | 158              | 13.11              | 32.74                 | 10.51              | 1.70
10,000 to 24,999          | 349              | 15.95              | 51.94                 | 17.09              | 2.77
25,000 to 49,999          | 314              | 19.41              | 90.79                 | 28.67              | 6.75
50,000 to 99,999          | 307              | 21.66              | 148.09                | 42.47              | 12.99
100,000 to 499,999        | 676              | 24.31              | 287.78                | 74.76              | 32.86
500,000 to 999,999        | 189              | 26.40              | 479.92                | 120.22             | 61.96
1,000,000 to 2,499,999    | 112              | 25.85              | 507.85                | 129.38             | 65.44
2,500,000 to 4,999,999    | 46               | 27.70              | 536.57                | 136.61             | 72.80
5,000,000 or more         | 26               | 28.81              | 694.92                | 177.19             | 100.08
Overall Mean              | --               | 21.08              | 213.9                 | 56.98              | 24.13

The authors looked at the Conspectus rankings of each public library to develop a better understanding of the strength of collections by library expenditures. As Table 7 shows, only 22 public libraries ranked at level 5 on the race/ethnicity conspectus, only five on the disability conspectus, and only 20 on the LGBTQ conspectus. Level 5 represents a comprehensive collection, where the library attempts to acquire every title available on a given topic, so it is unsurprising that only a few libraries achieve this rating. Level 0, on the other hand, represents an out-of-scope collection, an area in which the library does not attempt to collect and holds very few of the most commonly held titles.

The authors were concerned by the number of libraries with level 0 ratings on the three youth diversity Conspectus scales. While it is understandable that a library spending less than $5,000 a year will not be able to purchase many youth diversity titles, it is disturbing to find that so many libraries, including many that spend more than $100,000 annually on collections, did not achieve a minimal rating on the youth diversity conspectus scales. The standard for a level 1 minimal collection is holding at least half of the level 1 titles, so a library would need to hold at least 95 titles from level 1 of the race/ethnicity checklist, at least 33 titles from level 1 of the disability checklist, and at least 13 titles from level 1 of the LGBTQ checklist to be rated as having a minimal collection in that area. All of these libraries were identified as actively collecting youth literature based on OCLC holdings of major award winners, but it is possible that some are inconsistent in setting holdings in OCLC, others stopped setting OCLC holdings during the last decade, and some have strict weeding criteria based on age or circulation, resulting in removal of older titles from their youth collections. It seems likely, however, that some of these libraries simply do not make collecting youth diversity literature a priority.

Table 7. Number of libraries by 2009 collection expenditures and Conspectus level.
Expenditures (in dollars) | Race/Ethnicity L-0/L-1/L-2/L-3/L-4/L-5 | Disability L-0/L-1/L-2/L-3/L-4/L-5 | LGBTQ L-0/L-1/L-2/L-3/L-4/L-5
Less than 5,000           | 86/0/0/0/0/0                           | 86/0/0/0/0/0                       | 86/0/0/0/0/0
5,000 to 9,999            | 157/0/0/0/0/0                          | 156/1/0/0/0/0                      | 157/0/0/0/0/0
10,000 to 24,999          | 345/2/1/0/0/0                          | 347/1/0/0/0/0                      | 347/0/1/0/0/0
25,000 to 49,999          | 293/19/2/0/0/0                         | 293/20/1/0/0/0                     | 309/4/0/1/0/0
50,000 to 99,999          | 239/60/5/1/2/0                         | 225/79/1/0/2/0                     | 277/18/7/3/2/0
100,000 to 499,999        | 261/189/141/55/28/2                    | 250/356/59/7/1/3                   | 343/141/106/69/15/2
500,000 to 999,999        | 28/19/30/65/44/3                       | 26/69/78/11/4/1                    | 35/17/36/55/46/0
1,000,000 to 2,499,999    | 24/7/8/16/51/6                         | 25/19/40/22/5/1                    | 29/9/11/10/49/4
2,500,000 to 4,999,999    | 9/5/6/3/18/5                           | 6/15/12/8/5/0                      | 11/3/7/1/18/6
5,000,000 or more         | 1/0/2/2/15/6                           | 1/1/9/9/6/0                        | 1/0/1/2/14/8
TOTAL                     | 1443/301/195/142/158/22                | 1415/561/200/57/23/5               | 1595/192/169/141/144/20

NOTE: L-0 = Out of Scope, L-1 = Minimal, L-2 = Basic, L-3 = Instructional/Study, L-4 = Research, L-5 = Comprehensive.

Implications for Collection Development and Assessment

The first finding of this study should not surprise any librarian who understands the basics of collection development and assessment: the authors found significant differences between academic, public, and school libraries in the extent to which they collect youth literature that reflects the diverse families and life experiences of members of racial and ethnic minorities, individuals with disabilities, and people who identify as LGBTQ. Library collections are built to meet the needs of their users. Table 3 shows that, in every region of the United States, public libraries collected a greater percentage of the checklist titles than either academic or school libraries. Both public and school libraries collect youth materials to support recreational reading and information needs, but public libraries collect for all age groups while school libraries often support narrow age ranges. Academic libraries typically collect youth literature to support teacher education and other programs preparing professionals to work with youth, so their youth collections tend to be small collections of exemplary materials rather than the broader collections of school and public libraries. Librarians must consider the library’s mission and population served when selecting and using checklists to assess youth literature collections.

The second finding, that there are significant regional variances in representations of diversity in youth collections, is intriguing but must be considered in relation to population distribution. For example, Table 3 shows that public libraries in the Northeast held more titles, on average, from each of the checklists than public libraries in the other regions. Is this because public librarians in the Northeast are more attuned to the need for representations of diverse families and experiences? Or is it because many large cities are in the Northeast, and large cities have large public library systems with extensive collections?

Finding that over 700 libraries held no titles from the LGBTQ checklist should prompt librarians to reconsider their selection practices and their youth collections. Of the 5,002 academic, public, and school libraries identified as actively collecting youth literature, only three held no titles from the race/ethnicity checklist.
The authors were mildly concerned to realize that 48 libraries held none of the disability titles, but finding that 15% of libraries held no titles from the LGBTQ checklist was disturbing. Sexual orientation is a highly controversial topic in the United States, so it would be easy for selectors to unconsciously shy away from titles that are likely to be challenged. Librarians responsible for youth collections should make assessing their collections for LGBTQ-friendly titles a priority to verify that they are not unconsciously practicing censorship by selection. The results of this study should also remind librarians that while the Freedom to Read Statement was first adopted in 1953, its goals have not yet been met in all libraries. In a nation with increasing diversity, libraries need a renewed focus on serving all parts of our communities.

Looking at the Conspectus levels in Table 7 should encourage public librarians to assess their youth collections for representations of diversity. Most librarians would probably have predicted the findings, shown in Tables 5 and 6, that increased collection size and increased collection expenditures were positively correlated with representations of diversity in public library youth collections. Table 7, however, shows that more than one-third of libraries spending over $100,000 per year on materials did not achieve the minimal level for representations of racial/ethnic diversity or representations of disability, while half did not meet the minimal level for representations of LGBTQ orientation in youth collections. As noted above, it is possible that inconsistent setting of holdings in OCLC or strict weeding practices resulted in low Conspectus rankings for some public libraries, but these results also suggest that public libraries need to review their youth collections for representations of diversity.

Conclusion

This study used checklists to assess representations of diversity in youth collections based on OCLC holdings. The authors noted several potential problems, such as inconsistent setting of OCLC holdings, the possibility that older checklist titles may have been weeded before data collection commenced, the difficulty of matching OCLC symbols with libraries in the Public Library Survey, and the need for caution in interpreting results given that some OCLC holding symbols represented discrete libraries while others represented school districts, multi-branch public libraries, or regional library systems. Despite these limitations, the results indicate that librarians need to assess their youth collections for representations of diversity, particularly in representations of LGBTQ individuals.

The results also suggest several areas for further research. For example, the authors developed a broad checklist for racial/ethnic diversity, but using a checklist of award-winning or highly recommended titles focused on a specific group might indicate that most libraries do collect the best of the best. The authors did not divide the checklists by the age range of materials; would using separate checklists for children and young adults reveal differences in collecting patterns, particularly of LGBTQ titles?
The authors limited this study to OCLC holdings, which do not indicate whether a library holds one copy or many; assessing how well a multi-branch library system serves its population would require looking at both the number of titles and the number of copies.

This study was an attempt to assess the extent to which library youth collections reflect diverse families and life experiences. While the authors did not fully achieve their goals, the results suggest that children and teenagers from racial and ethnic minorities, youth with disabilities, and youth from families with LGBTQ members are finding few representations of people like themselves in many libraries. Professional ethics require that libraries provide diverse youth collections, but doing so is also in libraries’ best interests. Minority populations are growing at a faster rate than the non-Hispanic White majority, the percentage of the population living with disabilities is likely to increase as health care improves and the population diversifies, and more people are publicly identifying as LGBTQ. According to File (2013), for the first time since the Census Bureau began tracking voting data by race, the percentage of eligible Black voters who voted in 2012 was higher than the percentage of eligible White voters. If libraries are to survive and thrive in the United States, libraries must work to appeal to members of minority populations and multiple constituencies. Providing youth collections that reflect diverse families and life experiences is one way that libraries can ensure that members of populations which have been discriminated against or disadvantaged feel welcomed and included in libraries today and in the future.

REFERENCES

Agosto, D. E., Hughes-Hassell, S., & Gilmore-Clough, C. (2003). The all-White world of middle-school genre fiction: Surveying the field for multicultural protagonists. Children’s Literature in Education, 34, 257-275.
Allen County Public Library. (2005). Locations [Web page]. Retrieved from http://www.acpl.lib.in.us/
American Indian Library Association. (ca. 2006). American Indian Youth Literature Award. Retrieved from http://ailanet.org/activities/american-indian-youth-literature-award/
American Library Association. (1996-2013a). Rainbow Project Committee. Retrieved from http://www.ala.org/glbtrt/about/committees/jnt-rainbowprj
American Library Association. (1996-2013b). Schneider Family Book Award. Retrieved from http://www.ala.org
American Library Association. (2013). Diversity in collection development. In ALA Policy Manual (section B.2.1.11). Retrieved from http://www.ala.org/aboutala/governance/policymanual/
American Library Association & Association of American Publishers. (1953-2004). The Freedom to Read Statement. Retrieved from http://www.ala.org/advocacy/intfreedom/statementspols/freedomreadstatement
Andrews, S. E. (1998). Using inclusion literature to promote positive attitudes toward disabilities. Journal of Adolescent & Adult Literacy, 41, 420-426.
Anthony, W. A. (1972). Societal rehabilitation: Changing society’s attitudes toward the physically and mentally disabled. Rehabilitation Psychology, 19, 117-126.
Asian Pacific American Library Association. (2011, June 1). Literature Award Guidelines. Retrieved from http://www.apalaweb.org/awards/literature-awards/literature-award-guidelines
Association for Library Service to Children, A Division of the American Library Association. (1996-2013). About the Pura Belpré Award. Retrieved from http://www.ala.org/alsc/awardsgrants/bookmedia/belpremedal/belpreabout
Aud, S., Hussar, W., Kena, G., Bianco, K., Frohlich, L., Kemp, J., & Tahan, K. (2011). The Condition of Education 2011 (NCES 2011-033). United States Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.
Bryant, E. (1995, June 15). Pride & prejudice: LJ’s survey of gay and lesbian library service. Library Journal, 120, 37-39.
Bushing, M., Davis, B., & Powell, N. (1997). Using the Conspectus method: A collection assessment handbook. Lacey, Washington: WLN.
Caldwell, N., Kaye, G., & Mitten, L. A. (2007). “I” is for inclusion: The portrayal of Native Americans in books for young people. Retrieved from http://ailanet.org/about/publications/
City of Rochester. (n.d.). Rochester Public Library Director’s Message [Web page]. Retrieved from http://www.cityofrochester.gov/
Cooperative Children’s Book Center, School of Education, University of Wisconsin-Madison. (n.d.). Children’s books by and about people of color published in the United States. Retrieved from http://www.education.wisc.edu/ccbc/books/pcstats.asp
Crosetto, A., Garcha, R., & Horan, M. (2009). Disabilities and disorders in literature for youth: A selective annotated bibliography for K-12. Lanham: Scarecrow Press.
Cullinan, B. (1989). Literature and the child (2nd ed.). San Diego: Harcourt Brace Jovanovich.
Dodd, D. (1993). Historical statistics of the states of the United States: Two centuries of the census, 1790-1990. Westport, Connecticut: Greenwood Press.
Ethnic and Multicultural Information Exchange Round Table, American Library Association. (1996-2013). The Coretta Scott King Book Awards. Retrieved from http://www.ala.org/emiert/cskbookawards
File, T. (2013). The diversifying electorate -- voting rates by race and Hispanic origin in 2012 (and other recent elections). Current Population Survey Reports, P20-569. Retrieved from United States Census Bureau website: http://www.census.gov/
Gates, G. J. (2011). How many people are lesbian, gay, bisexual, and transgender? The Williams Institute on Sexual Orientation and Gender Identity Law and Public Policy, School of Law, University of California Los Angeles. Retrieved from http://www.law.ucla.edu/williamsinstitute/
Gates, P. S., & Mark, D. L. H. (2006). Cultural journeys: Multicultural literature for children and young adults. Lanham: Scarecrow Press.
Hughes-Hassell, S., & Cox, E. J. (2010). Inside board books: Representations of people of color. Library Quarterly, 80, 211-230. doi: 10.1086/652873
Humes, K. R., Jones, N. A., & Ramirez, R. R. (2011). Overview of race and Hispanic origin: 2010. 2010 Census Briefs, C2010BR-02. Retrieved from United States Census Bureau website: http://www.census.gov/
Jaeger, P. T., Subramaniam, M. M., Jones, C. B., & Bertot, J. C. (2011). Diversity and LIS education: Inclusion and the age of information. Journal of Education for Library and Information Science, 52, 166-183.
Kurz, R. F. (2012). Missing faces, beautiful places: The lack of diversity in South Carolina Picture Book Award nominees. New Review of Children’s Literature and Librarianship, 18, 128-145. doi: 10.1080/13614541.2012.716695
Larrick, N. (1965, September 11). The all-White world of children's literature. The Saturday Review, 48, 63-65, 84-85.
Lesniaski, D. (2004). Evaluating collections: A discussion and extension of brief tests of collection strength. College & Undergraduate Libraries, 11, 11-24. doi: 10.1300/J106v11n01_02
doi: 10.1300/J106v11n01_02
Lofquist, D., Lugaila, T., O'Connell, M., & Feliz, S. (2012). Households and families: 2010. 2010 Census Briefs, C2010BR-14. Retrieved from United States Census Bureau website: http://www.census.gov/
Manjarrez, C., Miller, K., Craig, T., Dorinski, S., Freeman, M., Isaac, N., O'Shea, P., Schilling, P., & Scotto, J. (2011). Data file documentation: Public libraries survey: Fiscal year 2009 (IMLS-2011-PLS-01). Institute of Museum and Library Services, Washington, DC. Retrieved from http://www.imls.gov/statistics
Naidoo, J. C. (2012). Rainbow family collections: Selecting and using children's books with lesbian, gay, bisexual, transgender, and queer content. Santa Barbara, CA: Libraries Unlimited.
Nassau Library System. (2007-2012). NLS Online: About NLS [Web page]. Retrieved from http://www.nassaulibrary.org/
OCLC. (2013). Find an OCLC library [Data file]. Retrieved from http://www.oclc.org/contacts/libraries/
Pasco County Schools. (2013). 2012-2013 District School Board of Pasco County fact sheet. Retrieved from http://www.pasco.k12.fl.us/
Porta, M. A., & Lancaster, F. W. (1988). Evaluation of a scholarly collection in a specific subject area by bibliographic checking: A comparison of sources. Libri, 38, 131-137. doi: 10.1515/libr.1988.38.2.131
Prater, M. A., Dyches, T. T., & Johnstun, M. (2006). Teaching students about learning disabilities through children's literature. Intervention in School and Clinic, 42, 14-24. doi: 10.1177/10534512060420010301
Rainbow Books. (2009, February 1). 2008 Rainbow List [Web log message]. Retrieved from http://glbtrt.ala.org/rainbowbooks/archives/153
Rogers, K., & Writt, H. (2010). No rose-colored glasses needed: Learning about illnesses and disabilities through children's literature. Kentucky Libraries, 74(1), 14-19.
Rothbauer, P. M., & McKechnie, L. E. F. (1999). Gay and lesbian fiction for young adults: A survey of holdings in Canadian public libraries. Collection Building, 18, 32-39. doi: 10.1108/01604959910256526
Schneider, K. (2004, June 29). Remarks for Donor Katherine Schneider, Ph.D., L.P., ABPP, on the occasion of the first presentation of the Schneider Family Book Award. In Schneider Family Book Award Manual. Retrieved from http://www.ala.org/awardsgrants/awards/1/all_years
Sims, R. (1983). What has happened to the all-White world of children's books? Phi Delta Kappan, 64, 650-653.
Slapin, B., & Seale, D. (Eds.). (1992). Through Indian eyes: The native experience in books for children. Philadelphia: New Society Publishers.
Slapin, B., & Seale, D. (Eds.). (2005). A broken flute: The native experience in books for children. Walnut Creek, CA: AltaMira Press.
Smith, C. L. (2009, November 19). Editor interview: Lyn Miller-Lachmann on Multicultural Review [Web log message]. Retrieved from http://cynthialeitichsmith.blogspot.com/2009/11/editor-interview-lyn-miller-lachmann-on.html
Smith-D'Arezzo, W. M. (2003). Diversity in children's literature: Not just a black and white issue. Children's Literature in Education, 34, 75-94.
Smolen, L. A., & Oswald, R. A. (Eds.). (2011). Multicultural literature and response: Affirming diverse voices. Santa Barbara: Libraries Unlimited.
South Central Library System. (2013). SCLS Directory of Public Libraries [Web page]. Retrieved from http://www.scls.info/
Spence, A. (1999). Gay young adult fiction in the public library: A comparative survey of public libraries in the US and Canada. Public Libraries, 38, 224-243.
Spence, A. (1998). Gay books in the public library: Responsibility fulfilled or access denied? How nineteen large urban American and Canadian library systems compare in service to their communities. Toronto: International Information Research Group.
Stringer-Stanback, K. (2011). Young adult lesbian, gay, bisexual, transgender, and questioning (LGBTQ) non-fiction collections and countywide anti-discrimination policies. Urban Library Journal, 17. Retrieved from http://ojs.gc.cuny.edu/index.php/urbanlibrary
Sweetland, J. H., & Christensen, P. G. (1995). Gay, lesbian and bisexual titles: Their treatment in the review media and their selection by libraries. Collection Building, 14(2), 32-41. doi: 10.1108/eb023399
Twiss, T. M. (2001). A validation of brief tests of collection strength. Collection Management, 25, 23-32. doi: 10.1300/J105v25n03_03
United States Census Bureau. (n.d.). Census regions and divisions of the United States. Retrieved from http://www.census.gov/geo/www/us_regdiv.pdf
United States Census Bureau. (2010). United States Census dataset 2010 Summary File 1 [Data file]. Retrieved from http://www.census.gov/
Walling, L. L. (2010). Evaluating materials about children with disabilities [Web page]. Retrieved from http://faculty.libsci.sc.edu/walling/evaluatingmaterialsabout.htm
White, H. D. (1995). Brief tests of collection strength: A methodology for all types of libraries. Westport, Connecticut: Greenwood Press.
Yokota, J. (1993). Issues in selecting multicultural children's literature. Language Arts, 70, 156-167.

The self-publishing phenomenon and libraries

Juris Dilevko*, Keren Dali
Faculty of Information Studies, University of Toronto, Toronto, Ontario, Canada M5S 3G6

Library & Information Science Research 28 (2006) 208-234. Available online 4 May 2006. doi:10.1016/j.lisr.2006.03.003
* Corresponding author. E-mail addresses: juris.dilevko@utoronto.ca (J. Dilevko), keren.dali@utoronto.ca (K. Dali).

Abstract

In the late 1990s and early 2000s, the concept of book self-publishing for fiction and nonfiction began to loom large in the North American publishing universe. As traditional mainstream publishers consolidated and were often loath to take chances on unknown writers whose books might not turn immediate profits, some authors found that fewer and fewer publishing venues were open to them. As a result, new self-publishers—collectively called "author services" or print-on-demand (POD) publishers—appeared alongside subsidy (or vanity) publishers. Against the background of an increasing corporatization of mainstream publishing, book self-publishing can theoretically be situated as one of the last bastions of independent publishing. This article examines how academic and public libraries dealt with the book self-publishing phenomenon during 1960–2004. To what extent did libraries collect fiction and nonfiction published by self-publishing houses? Can any patterns be discerned in their collecting choices? Did libraries choose to collect more titles from "author services" publishers than subsidy publishers?

© 2006 Elsevier Inc. All rights reserved.

1. Background

Self-publishing of books has a long and illustrious history. Kremer (n.d.) has compiled an extensive list of now-famous authors who chose initially to self-publish their books or were forced to take this path because one or more of their books were rejected by one or more traditional publishers.
Most of these authors undertook self-publishing without the help of formal self-publishing companies, contacting printers with whom they made a financial arrangement. Some authors even started their own presses. Among self-published authors are Margaret Atwood, William Blake, Elizabeth Barrett Browning, Willa Cather, W. E. B. DuBois, Benjamin Franklin, Nathaniel Hawthorne, Beatrix Potter, Mark Twain, Walt Whitman, and Virginia Woolf. More recently, Mawi Asgedom (Of Beetles and Angels), Dave Chilton (The Wealthy Barber), Irma Rombauer (The Joy of Cooking), and James Redfield (The Celestine Prophecy) enjoyed self-publishing success.

In North America, self-publishing evolved into a formal industry in the early and middle part of the twentieth century, with the growth of book subsidy (or vanity) publishers such as Dorrance Publishing (Pittsburgh, PA), founded in 1920, and Vantage Press (New York), founded in 1949. They typically used offset printing and charged an author between $8,000 and $50,000 "for a limited quantity of copies, some owned by the author and the rest warehoused" by the publisher (Glazer, 2005, p. 10). If Vantage Press can be taken as a representative example, each of these publishers produces "between 300 and 600 titles a year" (Glazer, 2005, p. 10), all the while warning authors that, although "[s]ome prestige and popularity may come your way . . . it is important to recognize that you may only regain a small part of the fee" (Span, 2005, p. T8). All in all, these publishers never enjoyed stellar reputations, and were consistently on the sidelines of the publishing world.

The consolidation of mainstream publishing houses into corporate behemoths in the late 1990s meant that many formerly independent publishers became part of the entertainment divisions of profit-oriented companies answerable to shareholders, pension funds, and mutual funds. Once they became a part of these divisions, they were expected to show profits on each book they published (Schiffrin, 2000). Epstein (2001) explained that "[c]onglomerate budgets require efficiencies and create structures that are incompatible with the notorious vagaries of literary production, work whose outcome can only be intuited," adding that "the retail market for books is now dominated by a few large bookstore chains whose high operating costs demand high rates of turnover and therefore a constant supply of bestsellers, an impossible goal but one to which publishers have become perforce committed" (p. 12). One publishing official describes the situation in vivid terms: "Companies like Random House and Simon & Schuster are in the process of investing in highly valuable properties. They want to find Deepak Chopra; they don't want to find a writer necessarily who has an audience of 10,000 people" (quoted in Glazer, 2005, p. 10). Conventional publishers—that is, publishers that do not concentrate on scholarly monographs—became risk-averse, concentrating their energies on books that they were confident would be guaranteed bestsellers and profit makers. In this environment, many first-time authors—and even seasoned writers with one, two, or three published books—found it increasingly difficult to convince established publishers to take a chance on their new books.
For example, Wyatt (2004) describes how Jeffrey Marx, a published Simon & Schuster (S&S) author, after having his idea for a new book (Seasons of Life) about "a former professional football player turned minister who teaches high school football players how to be men of substance" rejected by S&S in 2003, finally secured a book contract with S&S in late 2004, but only after he self-published and sold—through extensive entrepreneurial efforts that included travel, speeches, and self-generated publicity—14,000 copies. In addition, university presses and small independent presses were paying more and more attention to financial questions—a circumstance that invariably meant that they took fewer chances on manuscripts whose sales potential they could not accurately gauge (Thompson, 2005).

It was in this context that a new generation of self-publishers such as AuthorHouse, iUniverse, and Xlibris developed in the middle and late 1990s. Often referred to as "author services" publishers and employing print-on-demand (POD) technology, they marketed themselves to the growing number of disaffected authors who had been frustrated by repeated rejections from corporate and independent publishers. iUniverse, for example, "print[s] a trade paperback for $299 to $748, depending on how many 'free' copies and how much 'editorial review' a customer wants, with additional charges for line-editing, proofreading and press releases" (Span, 2005, p. T8), assigns it an ISBN number, and makes it available to online book retailers (Glazer, 2005). Xlibris, in conjunction with Borders bookstores, offers a "take-home self-publishing kit," explaining that, "for between $299 and $598, customers can have a manuscript converted into a book by Xlibris, be listed on Amazon.com and get shelf space in Borders" (Glazer, 2005, p. 11). AuthorHouse, which began under the name of 1stBooks (or 1st Books), offers standard paperback publishing for $698 and color paperback publishing for $999, with a wide array of ancillary services, including a "Personal Media Valet" for $3,000, "Expanded Promotion" for $750, a "Booksellers Return Program" for $699, and "Media Alerts" for $450 (AuthorHouse, 2005, p. 8).

Although vehemently against being considered either a POD or subsidy publisher, PublishAmerica, which released 4,800 titles in 2004 (Span, 2005), was nevertheless part of the new wave of self-publishers (PublishAmerica, 2005). Despite offering authors a nominal $1 advance, seven-year contracts, and royalties of 8%, 10%, and 12.5% (depending on the number of copies sold), it soon garnered a negative reputation (Span, 2005). Whether this was warranted or not, it associated PublishAmerica with older subsidy models of self-publishing, although one of PublishAmerica's titles—Mary Carpenter's Rescued by a Cow and a Squeeze, a biography of Temple Grandin, a professor at Colorado State University who designs humane animal facilities—received a highly favorable review in 2003 in the prestigious Washington Post Book World.

The "author services" business model made an immediate impact. AuthorHouse had approximately 23,000 books under contract in early 2005; between its inception in 1997 and the end of 2003, it sold approximately 2 million titles (Glazer, 2005). In 2004, AuthorHouse, iUniverse, and Xlibris—considered to be the top three self-publishing firms—introduced "a total of 11,906 new titles" (Glazer, 2005, p. 10).
And, in an attempt to escape the vanity press stigma and reach bookstore shelves, iUniverse introduced a program called "Star," which "select[s] two or three books a month that have passed an internal editorial review and sold more than 500 copies," offers them to bookstores at competitive discounts, accepts returns, and sends out advance galleys to reviewing outlets (Glazer, 2005, p. 11).

The impact of these companies was such that many established authors turned to self-publishing "because they're unable to interest their publishers in a new genre" (Glazer, 2005, p. 11). This was the case with fantasy and science fiction author Piers Anthony, who, at the beginning of 2005, "ha[d] published more than 15 books with Xlibris, either to release serious historical fiction or to make out-of-print books available" (Glazer, 2005, p. 11). In perhaps the clearest sign that self-publishers such as AuthorHouse, iUniverse, and Xlibris had escaped the debilitating vanity press stigma, literary agents began to recommend to some best-selling authors that they publish with these companies. Kathryn Harvey was advised to publish her book Private Entrance with Xlibris because traditional mainstream publishers "complained that [it] . . . fit into neither the 'chick-lit' category nor the older woman's audience (sometimes called 'hen lit')" (Glazer, 2005, p. 11). Harvey's agent summarized the new publishing landscape: "The self-publishing route has become a viable alternative for a lot of these authors who can't conveniently categorize what they're doing" (quoted in Glazer, 2005, p. 11). His implication was clear: mainstream publishers prefer proven and safe categories that have reliable sales records. As soon as something different appears, these publishers become reticent, fearing a lack of profits if they take a chance on an unproven commodity.

There is little dispute that the self-publishing of fiction and nonfiction books in North America grew in the late 1990s and early 2000s, mainly because of the POD model. Still, book self-publishers of all kinds continued to have what could best be described as mixed reputations because of the perceived poor quality of the books they publish and because—no matter the often complex contractual arrangements between the companies and authors—authors themselves, in the final analysis, pay to have their works published. Indeed, the first factor is often seen as leading to the second. Thus, although there have been many best-selling self-published successes over the past decades and centuries, a stigma hovers over the book self-publishing universe—a stigma exacerbated by the controversy, in the early 2000s, involving PublishAmerica, which stood accused of a wide range of deceptive practices (Span, 2005). As a result, many newspapers, including the New York Times, have longstanding policies whereby they do not review books published by self-publishers (Glazer, 2005). In addition, bookstores are typically "reluctant to stock self-published books . . . because they carry the vanity press taint, they aren't returnable and they aren't discounted as much as traditional books" (Glazer, 2005, p. 10).

2. Libraries and self-publishers

In public and academic libraries, there has been, for the most part, an awkward silence about how to deal with books from self-publishers, mainly because of the lack of reviews of self-published books in mainstream reviewing outlets.
But, as the nature of publishing changes by taking on myriad electronic manifestations and as libraries begin to come to terms with the philosophies and concepts underlying electronic publishing and collection development, the issue of whether to collect self-published books assumes importance. The first statement about the importance of self-published books for library collections appeared in 1984 (Hayward, 1992). Crook and Wise (1987), two proponents of self-publishing, in explaining that self-publishers should target libraries as potential customers, observed that libraries "have little prejudice against self-published books," mainly because, as Kremer (1986) pointed out, they are "information specialists . . . continually and actively seeking new titles which can help them better serve their library patrons" (quoted in Hayward, 1992, p. 290). Hayward (1992) remarked that libraries should make a concerted effort to collect self-published books because "[g]ood writers are writing and publishing good books on specialized subjects that trade publishers will no longer produce because of the limited financial returns possible on these books" (p. 290).

The enthusiasm of the 1980s and early 1990s soon gave way to a harsher view. Manley (1999) may be seen as having a realistic attitude towards self-publishers. He lamented that librarians and reviewers are often inundated with "poorly xeroxed cop[ies] of an announcement of a new book" that a "true-believer" author has published himself (p. 485). These books are typically either "a personal testimony of someone who has seen God, survived a terminal disease, fought in a war, or met an alien coming out of a flying saucer; a technical treatise on something obscure like a two-phased parachute, a four-barreled carburetor, or an eight-sided kite; the history of an inconsequential sports team, religious sect, educational institution, or residential community; or a conspiracy theory imputing evil intent to the U.S. government, the British royal family, or the Chinese Mafia" (p. 485). He noted the "brutal reality" that almost 100% of self-published books "have been rejected by mainstream publishers for one of two reasons: the book is a poorly written piece of drivel, or the book is on a subject that no one cares about with the possible exception of an author's family and his two best friends" (p. 485). Still, he identified one exception to his general comments—The Prison Called Hohenasperg: An American Boy Betrayed by His Government during World War II by Arthur D. Jacobs—and suggested that "[o]ur standard line that a book must merit at least one positive review from a reputable source can be rather tyrannical in that rare instance when a self-published book does represent an important contribution to a valid subject area" (p. 485).

Manley's (1999) ambiguous stance with regard to self-publishers—consider the cumulative negative effect of his use of the adjectives "rare," "important," and "valid" in his article—has echoed the debate about whether libraries should collect zines, another major form of self-publishing. While Bartel (2003), Herrada (1995), and Stoddart and Kiser (2004) stressed the importance of establishing zine collections in public and academic libraries, the ephemeral nature of zines—not to mention the cataloging and preservation problems they represent—was a factor in the disinclination of many libraries to start collecting them.
In broad terms, faced with an overwhelming number of books published by well-known corporate and independent publishers, on the one hand, and tighter and tighter budgets, on the other, librarians may not consider self-publishing companies and their products to be worthy of attention, especially given Manley's (1999) statement that "99.99 percent" of self-published books are "drivel" (p. 485).

3. Problem statement and research questions

Given the rapid growth of book self-publishing, as well as the fraught reputation of self-publishers, how have academic and public libraries dealt with the issue of self-published books in the years 1960–2004, as represented by the number and range of books published by self-publishers appearing in their catalog records and hence on their shelves? A word about the vocabulary used here. The terms "title" and "book" are used synonymously in the remainder of this article. The Online Computer Library Center (OCLC) defines titles as a "[t]erm(s) used to name a library resource such as a book, article, transcript, video, recording, song, score, or software" (Online Computer Library Center, 2005a). But, as will be seen below, 99.9% of titles produced by the self-publishers studied in the present article are books. In addition, libraries that own a copy of a specific title are referred to as "holding libraries," or a sentence such as "the library held a particular title" is used.

Six research questions were posited:

1. To what extent are libraries in the United States and Canada choosing to collect self-publisher titles? That is, how many titles published by self-publishers appear in the catalog records of libraries?
2. Are libraries in general choosing to collect more titles from one self-publisher than from another?
3. Are there identifiable trends in library holdings with regard to subsidy self-publishers and "author services" self-publishers?
4. Which types of titles covering which subject areas from which self-publishers are libraries choosing to collect the most?
5. Do different types of libraries collect (i.e., hold) different types of titles from self-publishers?
6. What percentage of self-published titles are held collectively and individually by major public and academic libraries?

4. Procedures

From the hundreds of self-publishers in the United States and Canada, seven self-publishers were identified for further study. These seven were all mentioned in two prominent articles dealing with the self-publishing phenomenon during 2005 (Glazer, 2005; Span, 2005). As such, they represent a good cross-section of what is understood, in the public mind, to be a self-publisher. Three were subsidy publishers: Dorrance Publishing, Ivy House, and Vantage Press. Four were "author services" publishers: AuthorHouse/1st Books/1stBooks (hereafter referred to as AuthorHouse), iUniverse, PublishAmerica, and Xlibris. In this latter category, AuthorHouse, iUniverse, and Xlibris are POD publishers, while PublishAmerica wants to distance itself from the POD designation (PublishAmerica, 2005).

Using WorldCat Advanced Search, an online database developed and maintained by OCLC, the investigators searched by the individual publisher name (and likely variants, to take into account cataloging entry errors) as mentioned above and, where applicable, by publisher location. All searches were carried out in a three-week period in May–June 2005 and updated on June 13, 2005.
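Because imprints appear in catalog records under several spellings, matching records to a publisher requires some name normalization. The following minimal Python sketch is illustrative only (it is not the authors' actual search procedure, and the variant lists are assumptions); it simply shows how imprint strings might be mapped to the seven canonical publisher names.

    import re

    # Illustrative variant lists; the study searched WorldCat by publisher
    # name and likely spelling variants to catch cataloging entry errors.
    VARIANTS = {
        "AuthorHouse": ("authorhouse", "author house", "1stbooks", "1st books"),
        "iUniverse": ("iuniverse", "i universe"),
        "Xlibris": ("xlibris",),
        "PublishAmerica": ("publishamerica", "publish america"),
        "Dorrance": ("dorrance",),
        "Ivy House": ("ivy house",),
        "Vantage Press": ("vantage press",),
    }

    def canonical_publisher(imprint):
        """Map a raw imprint string from a catalog record to a canonical name."""
        text = re.sub(r"[^a-z0-9]+", " ", imprint.lower()).strip()
        padded = f" {text} "
        for canonical, variants in VARIANTS.items():
            if any(f" {variant} " in padded for variant in variants):
                return canonical
        return None  # not one of the seven publishers studied here

    print(canonical_publisher("1st Books Library"))  # -> AuthorHouse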
Searches were meant to elicit both the raw number of total titles published by each self-publishing company that were held by all libraries participating in the OCLC consortium, as well as the names of the top 25 titles published by each self-publishing company that were held by all libraries in the OCLC consortium and the number of libraries holding these top 25 titles. For Dorrance Publishing and Vantage Press—the two oldest self-publishers in the set of seven—searches covered 1960–2004. For the five other publishers—all of which were founded in the middle and late 1990s—searches covered the years 2000–2004. The researchers also asked WorldCat to generate a frequency distribution of how many OCLC-member libraries owned a particular title published by each of the seven self-publishers.

The names of the libraries holding the top 25 titles published by each of the seven self-publishers were downloaded from WorldCat using the feature called "Display All Libraries." In the interests of completeness, the feature called "Find Books with Same Title and Author" was also used. When this feature generated additional records corresponding to the titles in the top 25 list, these records were included in the statistics reported in the present article, but only if they were published by the same self-publisher as in the original set of records. The output for the "Display All Libraries" records comes in the form of alphabetical state-by-state lists of all OCLC-member libraries possessing at least one copy of a particular title.

Each of the top 25 titles from each of the seven self-publishers was then categorized according to its subject matter and form of publication based on the Library of Congress (LC) and/or Dewey classification numbers and/or LC subject headings found in the OCLC catalog records. Broad subjects and forms were assigned using LC classification schedules and subject headings. Subject categories with only one title were combined with a closely related area; for instance, the one title about the history of Canada was combined with titles about the history of the United States to form a subject category called history of the United States and Canada.

In addition, the libraries that held the top 25 titles from the seven self-publishers were classified according to the following nine-fold categorization:

- university library, which included libraries belonging to medical schools and law schools;
- college library, which included libraries pertaining to seminaries and religious colleges;
- community college library, which included libraries at technical colleges and junior colleges;
- public library, which included public library consortia and school libraries;
- military library, which included libraries at military bases, military institutions of higher learning, and Veterans Affairs medical centers;
- government library, which included national depository libraries and libraries belonging to nonmilitary departments of national government entities;
- state library;
- a library belonging to a historical society, museum, archives, or art gallery; and
- other library, which included private corporations, law firms, banks, churches, and nonmilitary hospitals.

OCLC provides a one-line identifier for each holding library; this identifier typically contains either the full name of the holding institution or abbreviations such as UNIV, COL, COMMUN COL, PUB, CNTY, REG, MIL, and others that make the categorization straightforward. In those cases where there was any doubt about which category a library belonged to—for example, all institutions designated as COL were checked in order to make sure whether they were in fact universities, colleges, or community colleges—the library name was searched using Web sites and an ultimate categorization was decided.
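As a rough illustration of how this identifier-based sorting could be automated, the sketch below encodes a few of the abbreviation cues named above as ordered rules. The exact keyword list is an assumption, and, as the study itself notes, ambiguous identifiers (such as COL entries) still required manual checking.

    def classify_library(identifier):
        """Assign an OCLC holding-library identifier to one of the nine
        categories; more specific cues are tested before generic ones."""
        name = identifier.upper()
        if "COMMUN COL" in name:
            return "community college"
        if "UNIV" in name:
            return "university"
        if "COL" in name or "SEM" in name:
            return "college"  # flagged for manual review in the study
        if "MIL" in name or "VET" in name:
            return "military"
        if "STATE LIB" in name:
            return "state library"
        if any(cue in name for cue in ("PUB", "CNTY", "REG")):
            return "public"
        if any(cue in name for cue in ("HIST SOC", "MUS", "ARCH")):
            return "historical society/museum/archives"
        if "GOV" in name or "DEPOSITORY" in name:
            return "government"
        return "other"  # corporations, law firms, banks, churches, hospitals

    print(classify_library("SMITHVILLE PUB LIB"))  # -> public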
To determine the extent to which major academic libraries held titles published by the seven selected self-publishers, the researchers used the 2003 list of top-ranked academic library systems (the most recent available at the time this research was conducted) as published by the Association of Research Libraries (ARL) (2005) on its Web site. The researchers picked the 25 top-ranked ARL library systems in the United States and searched for the three-letter institutional code of each system's main library in OCLC's "Find Codes for Participating Institutions" feature. Entering these 25 codes as a group in the "limit by library code" box on the WorldCat search screen, the researchers asked WorldCat to generate the raw number of titles published by the seven self-publishers that were held by the main libraries of the 25 top-ranking ARL library systems. This procedure was repeated with public libraries, this time using a list of the top 25 public library systems (ranked according to holdings) from Statistical Report 2004 (Public Library Association, 2004). Finally, the researchers selected the top five ARL library systems (not from the same state) and the top five public libraries (not from the same state) and repeated these procedures using each of their institutional codes separately. Taken together, these procedures allowed the researchers to gauge the degree to which major academic and public libraries in the United States held titles published by the selected seven self-publishers.
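In effect, the "limit by library code" tally counts, for each publisher, the titles whose holding-library codes intersect a fixed code list. A minimal sketch of that computation (with hypothetical records and placeholder codes, not real OCLC symbols) might look like this:

    from collections import Counter

    # Placeholder three-letter codes standing in for the 25 ARL main libraries.
    ARL_CODES = {"AAA", "BBB", "CCC"}

    # Hypothetical records: (OCLC number, publisher, codes of holding libraries).
    records = [
        ("1001", "Xlibris", {"AAA", "ZZZ"}),
        ("1002", "iUniverse", {"QQQ"}),
        ("1003", "AuthorHouse", {"BBB", "CCC"}),
    ]

    # A title counts if at least one listed library holds it, mirroring the
    # effect of entering the codes as a group on the WorldCat search screen.
    held = Counter(
        publisher for _, publisher, codes in records if codes & ARL_CODES
    )
    print(held)  # Counter({'Xlibris': 1, 'AuthorHouse': 1})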
The methodology is subject to all the limitations encountered when working with bibliographical databases administered by OCLC. While over 9,000 libraries—most in the United States and Canada—belong to OCLC and while OCLC has over 58 million records (Online Computer Library Center, 2005b), not all libraries in North America are members of OCLC and thus do not contribute records. Many libraries that are not OCLC members may therefore have extensive collections of books published by self-publishers. In addition, OCLC records do not indicate how many copies of a specific title a reporting library has. A large metropolitan or regional library system with many branches may report as a single system holding a single copy of a particular self-published title though its various branches hold, say, 20 copies of that title. Another large central system may have a policy whereby its branches report separately. The presence of bibliographic records in OCLC is also dependent on the speed with which participating institutions enter information about their holdings. Institutions with backlogged cataloging departments yield an underestimation of OCLC cumulative statistics. Moreover, when catalogers use OCLC to download records to their individual institutional catalogs, they may neglect to update cumulative OCLC holding statistics, which necessitates an extra step. These factors may result in an underreporting of self-publisher titles in libraries. Finally, the research presented here is necessarily a snapshot of an evolving picture, since OCLC updates its database frequently. The reported findings are therefore best viewed as broad trends.

5. Results

The first three research questions are addressed in Section 5.1. Research question four is addressed in Section 5.2. Research question five is addressed in Section 5.3. Research question six is addressed in Section 5.4.

5.1. Self-publishers in North American libraries

OCLC-member libraries held 14,061 titles that were published in 2000–2004 by the seven self-publishers, with 14,042 of these titles (99.9%) identified as books by OCLC. As shown in Table 1, titles published by AuthorHouse (5223), Xlibris (3351), and iUniverse (2945) are the most widely held in OCLC-member libraries. If the total number of titles held from a publisher is conceived of as a measure of publisher reputation in the library world, then AuthorHouse leads the way, with 37.1% of all titles held, followed by Xlibris (23.8%) and iUniverse (20.9%). Titles published by PublishAmerica (1250) and each of the three subsidy publishers—Dorrance, Ivy House, and Vantage Press—are held at substantially lower rates.

Table 1. Number of titles published by seven American self-publishers (2000–2004) held by OCLC-member libraries

Publisher | No. of titles | %
AuthorHouse | 5223 | 37.1
Dorrance | 525 | 3.7
iUniverse | 2945 | 20.9
Ivy House | 69 | 0.5
PublishAmerica | 1250 | 8.9
Vantage Press | 698 | 5.0
Xlibris | 3351 | 23.8
Total | 14,061 | 100
(Percentages do not add to 100 due to rounding.)

However, if another measure of publisher reputation is looked at—that is, the number of OCLC-member libraries that hold the top 25 overall held titles from the seven self-publishers—a slightly different picture emerges from that in Table 1. As shown in Table 2, Xlibris titles are held by the largest number of OCLC-member libraries (2589), followed by iUniverse (1998) and AuthorHouse (1905). In percentage terms, Xlibris titles account for 29% of the holdings of OCLC-member libraries in terms of the overall top 25 held titles published by the seven self-publishers; AuthorHouse is at 21.3%, slightly behind iUniverse (22.4%).

Table 2. Number of OCLC-member libraries holding the top 25 overall held titles from seven American self-publishers

Publisher | No. of libraries | %
AuthorHouse | 1905 | 21.3
Dorrance | 462 | 5.2
iUniverse | 1998 | 22.4
Ivy House | 316 | 3.5
PublishAmerica | 676 | 7.6
Vantage Press | 989 | 11.1
Xlibris | 2589 | 29.0
Total | 8935 | 100
(Percentages do not add to 100 due to rounding.)

Findings from Table 2 are given credence by data in Table 3: of the top 10 self-published titles (2000–2004) held by OCLC-member libraries, Xlibris published four, iUniverse three, AuthorHouse two, and Vantage Press one.

The 14,061 total self-published titles are not broadly held across OCLC-member libraries. As Table 4 shows, there is a pyramid effect in the distribution of self-publisher titles: 42.8% of the titles are held by only one OCLC-member library each, while another 38.6% of the titles are held by 2–4 OCLC-member libraries, and 12% are held by 5–9 OCLC-member libraries. In other words, 93.4% of the titles from self-publishers are held by fewer than 10
OCLC-member libraries each. Only 822 titles (5.8%) are held by 10 or more OCLC-member libraries each, and only 61 titles (less than 1%) are held by 50 or more OCLC-member libraries each. Only 14 titles are held by 100 or more OCLC-member libraries each, with 12 of these 14 titles published by AuthorHouse, iUniverse, or Xlibris. Only three titles are held by 400 or more OCLC-member libraries each, and all three of these are published by iUniverse (1) and Xlibris (2).

Table 3. Top 10 titles published by seven American self-publishers, ranked by the number of OCLC-member libraries holding these titles

Rank | Title (Author) | Publisher | No. of libraries holding title
1 | Abortion and Common Sense (Ruth Dixon-Mueller and Paul K. B. Dagg) | Xlibris | 483
2 | If I Knew Then . . . (Amy Fisher and Robbie Woliver) | iUniverse | 459
3 | American Western Song: Poems from 1976 to 2001 (Victor W. Pear) | Xlibris | 403
4 | Lewis and Clark in the Illinois Country: The Little Told Story (Robert E. Hartley) | Xlibris | 224
5 | Dancing with Mosquitoes: To Liberate the Mind from Humanism (Theo Grutter) | Vantage Press | 222
6 | The Guide to Identity Theft Protection (Johnny R. May) | AuthorHouse | 217
7 | The Russian Adoption Handbook (John H. Maclean) | iUniverse | 201
8 | The Chinese Adoption Handbook (John H. Maclean) | iUniverse | 194
9 | "Misty": First Person Stories of the F-100 Misty (Don Shepperd) | AuthorHouse | 174
10 | Race and the Rise of the Republican Party, 1848–1865 (James D. Bilotta) | Xlibris | 155

Table 4. Frequency distribution of how many American self-publisher titles are held in how many OCLC-member libraries (% for selected rows)

No. of OCLC libraries | AuthorHouse | Dorrance | iUniverse | Ivy House | PublishAmerica | Vantage Press | Xlibris | All self-publishers
400 or more | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 3
200–399 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 3
100–199 | 2 | 0 | 3 | 0 | 1 | 0 | 2 | 8
50–99 | 17 | 0 | 7 | 0 | 3 | 4 | 16 | 47
25–49 | 47 | 4 | 26 | 2 | 4 | 5 | 32 | 120
10–24 | 224 | 23 | 143 | 11 | 20 | 38 | 182 | 641
5–9 | 609 (11.7) | 58 (11.0) | 396 (13.4) | 16 (23.2) | 121 (9.7) | 79 (11.3) | 403 (12.0) | 1682 (12.0)
2–4 | 2069 (39.6) | 241 (45.9) | 1046 (35.5) | 24 (34.8) | 439 (35.1) | 286 (41.0) | 1318 (39.3) | 5423 (38.6)
1 | 2213 (42.4) | 197 (37.5) | 1307 (44.4) | 15 (21.7) | 655 (52.4) | 278 (39.8) | 1360 (40.6) | 6025 (42.8)
No data | 41 | 2 | 16 | 1 | 7 | 7 | 35 | 109
Total | 5223 | 525 | 2945 | 69 | 1250 | 698 | 3351 | 14,061

Can the arrival of "author services" self-publishers in the late 1990s and 2000s be associated with a decline in the holdings of subsidy publishers by OCLC-member libraries? As shown in Table 5, the answer is mixed. Vantage Press titles that were published in each of the two five-year increments between 1995 and 2004 are held by fewer OCLC-member libraries than Vantage Press titles published in each of the five-year increments between 1965 and 1994. On the other hand, Dorrance titles published in the two five-year increments between 1995 and 2004 are held by more OCLC-member libraries than Dorrance titles published in each of the five-year increments between 1960 and 1994, with the exception of the periods 1970–1974 and 1975–1979. However, in no five-year increment between 1960 and 2004 did OCLC-member libraries choose to collect as many titles from Vantage or Dorrance as they did in the five-year increment 2000–2004 from each of the three principal "author services" publishers (AuthorHouse, iUniverse, and Xlibris).

Does the picture change when the extent to which individual Dorrance and Vantage Press titles are held by OCLC-member libraries is examined? During 1960–1999, both Dorrance and Vantage Press had many titles that were held by more than 200 OCLC-member libraries each. For example, 14 Dorrance titles published between 1960 and 1999 were in more than 200 OCLC-member libraries each, including The Moral Foundations of United States Constitutional Democracy (1992) (229 libraries), Black Mathematicians and Their Works (1980) (268 libraries), The Navajo Code Talkers (1973) (450 libraries), Ruth Suckow: A Critical Study of Her Fiction (1972) (246 libraries), and Chaucer and the Liturgy (1967) (284 libraries). Twenty-seven Vantage Press titles published between 1960 and 1999 were in more than 200 OCLC-member libraries each, including Opera as Dramatic Poetry (1993) (535 libraries), The Inventor's Handbook on Patent Applications (1993) (279 libraries), The Human Vocal Tract: Anatomy, Function, Development, and Evolution (1987) (471 libraries), The Hatch Act and the American Bureaucracy (1981) (250 libraries), American Cut Glass for the Discriminating Collector (1965) (421 libraries), and The United Colonies of New England, 1643–1690 (1961) (614 libraries). Conversely, in 2000–2004, Dorrance does not have a single title that is held by more than 50 OCLC-member libraries, while Vantage Press has only one title held by more than 50 OCLC-member libraries (Dancing with Mosquitoes, held by 222 libraries). If the number of OCLC-member libraries holding a specific title is an indication of the quality of that title, then the quality of Dorrance and Vantage Press publications declined when comparing the period of 1960–1999 with 2000–2004.

Table 5. Number of titles held by OCLC-member libraries that were published by Dorrance and Vantage Press in five-year increments (1960–2004)

Period | No. of Dorrance titles | No. of Vantage Press titles
2000–2004 | 525 | 698
1995–1999 | 618 | 1192
1990–1994 | 334 | 1608
1985–1989 | 168 | 1404
1980–1984 | 249 | 1269
1975–1979 | 548 | 1911
1970–1974 | 671 | 1493
1965–1969 | 329 | 1208
1960–1964 | 182 | 927
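The concentration figures quoted above follow directly from the Table 4 row totals. The short sketch below, which uses only the numbers given in that table, redoes the arithmetic behind the pyramid effect.

    # Row totals from Table 4: titles broken down by number of holding libraries.
    distribution = {
        "1": 6025, "2-4": 5423, "5-9": 1682, "10-24": 641, "25-49": 120,
        "50-99": 47, "100-199": 8, "200-399": 3, "400 or more": 3,
    }
    total = 14_061  # includes 109 titles for which OCLC reported no data

    single = distribution["1"]
    fewer_than_10 = single + distribution["2-4"] + distribution["5-9"]
    print(f"{single / total:.1%}")         # 42.8% held by a single library
    print(f"{fewer_than_10 / total:.1%}")  # 93.4% held by fewer than 10 libraries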
5.2. Types of titles collected

As shown in Table 6, the type of title that OCLC-member libraries collect the most (26.3%) from self-publishers is handbooks, manuals, guidebooks, and self-help titles (based on the 25 titles held the most in OCLC-member libraries from each of the seven self-publishers).

Table 6. The top 25 titles held by OCLC-member libraries that are published by seven American self-publishers, subdivided by form of publication (n = 175)

Publisher | Handbooks, manuals, guidebooks, self-help, etc. | Popular works | Fiction | Biography or autobiography | Juvenile fiction | Speeches, letters, diaries, essays | Poetry
AuthorHouse | 11 | 1 | 8 | 2 | 0 | 3 | 0
Dorrance | 7 | 8 | 2 | 6 | 2 | 0 | 0
iUniverse | 12 | 4 | 4 | 2 | 2 | 1 | 0
Ivy House | 2 | 5 | 10 | 6 | 1 | 1 | 0
PublishAmerica | 4 | 6 | 10 | 3 | 2 | 0 | 0
Vantage Press | 5 | 8 | 1 | 9 | 1 | 1 | 0
Xlibris | 5 | 9 | 3 | 4 | 1 | 1 | 2
Total (%) | 46 (26.3) | 41 (23.4) | 38 (21.7) | 32 (18.3) | 9 (5.1) | 7 (4.0) | 2 (1.1)
(Percentages do not add to 100 due to rounding.)

The "handbooks and manuals" category includes life skills guides, study guides, textbooks, self-instruction books, directories, indexes, and so on. "Popular works"—defined as nonfiction monographs written on subjects such as history, medicine, technology, and science for nonspecialist readers—was the second most popular category (23.4%), followed by fiction (21.7%) and "biography or autobiography" (18.3%). Table 6 also shows that there are substantial differences in the types of titles each of the seven self-publishers publishes that are widely held in OCLC-member libraries. Almost half of the top 25 titles held in OCLC-member libraries published by AuthorHouse (11) and iUniverse (12) fall in the "handbooks and manuals" category. PublishAmerica and Ivy House seem to be making their mark with fiction (10 each), Vantage Press with "biography or autobiography" (9), and Xlibris with "popular works" (9). These holdings data also suggest that iUniverse is more dependent for its presence in OCLC-member libraries on a single category of title ("handbooks and manuals"—12) than any of the other self-publishers. For instance, while AuthorHouse does have 11 "handbooks and manuals" in its overall top 25 held titles by OCLC-member libraries, it also has 8 fiction titles. Similarly, while PublishAmerica has 10 fiction titles in its overall top 25 held titles, it also has 6 titles categorized as "popular works." The most diverse and wide-ranging self-publisher is Xlibris, the only self-publisher with titles in all identified categories.

As shown in Table 7, which looks only at nonfiction titles, exactly half of the "handbooks and manuals" category is composed of social sciences (13) and medicine (10) titles, while the categories of "popular works" and "biography or autobiography" have numerous titles falling within the field of history, especially the history of the United States and Canada (7), but also the history of Europe (4) and the history of Asia (2). The "social sciences" category includes such topics as commerce and business, office management and retail trade, labor, criminology, and family issues. Overall, the three most popular nonfiction (and nonpoetry) subjects published by self-publishers and held in OCLC-member libraries are the social sciences (19), history of the United States and Canada (18), and medicine (18).

Table 7. Subjects covered by the nonfiction titles among the top 25 titles held by OCLC-member libraries and published by seven American self-publishers (n = 126)

Subject | Handbooks, manuals, guidebooks, self-help, etc. | Popular works | Biography or autobiography | Speeches, letters, diaries, essays | Total
Social sciences | 13 | 4 | 2 | 0 | 19
History of the United States and Canada | 1 | 7 | 9 | 1 | 18
Medicine | 10 | 3 | 5 | 0 | 18
Political sciences, education, and law | 5 | 2 | 2 | 3 | 12
Science | 2 | 5 | 2 | 0 | 9
Religion and theology | 1 | 5 | 1 | 1 | 8
History of Europe | 0 | 4 | 1 | 1 | 6
Technology | 4 | 2 | 0 | 0 | 6
Fine arts and music | 1 | 1 | 2 | 1 | 5
Geography, anthropology, and recreation | 3 | 1 | 1 | 0 | 5
History of Asia | 0 | 2 | 3 | 0 | 5
Psychology | 3 | 1 | 0 | 0 | 4
Agriculture | 1 | 1 | 1 | 0 | 3
Literature—biography and criticism | 0 | 2 | 1 | 0 | 3
Military and naval sciences | 1 | 1 | 1 | 0 | 3
Library and information science | 1 | 0 | 1 | 0 | 2
Total | 46 | 41 | 32 | 7 | 126

Table 8 blends some of the findings of Tables 6 and 7 to provide a more detailed picture of the differences in the way OCLC-member libraries collect books from different self-publishers, this time from the perspective of subject matter. When OCLC-member libraries hold titles from self-publishers in the broad field of social sciences, almost half of such titles come from iUniverse (1033 out of 2205, or 46.8%). In much the same way, when OCLC-member libraries hold titles from self-publishers in the field of history of the United States and Canada, almost half of such titles come from Xlibris (422 out of 898, or 47%). Table 8 also reiterates a finding from Table 6: iUniverse seems to be the most one-dimensional of the self-publishers discussed here. Slightly more than half of its overall top 25 held titles are in the social sciences (1033 out of 1998, or 51.7%).

Table 8. Number of OCLC-member libraries holding copies of the top 25 titles published by American self-publishers, subdivided by subject (n = 8935)

Subject | AuthorHouse | Dorrance | iUniverse | Ivy House | PublishAmerica | Vantage Press | Xlibris | Subject total (%)
Social sciences | 419 | 0 | 1033 | 7 | 63 | 15 | 668 | 2205 (24.7)
Fiction | 495 | 23 | 218 | 88 | 211 | 83 | 215 | 1333 (14.9)
History of the United States and Canada | 79 | 14 | 34 | 79 | 144 | 126 | 422 | 898 (10.1)
Political sciences, education, and law | 288 | 0 | 307 | 76 | 0 | 25 | 0 | 696 (7.8)
Medicine | 183 | 62 | 36 | 12 | 42 | 146 | 176 | 657 (7.4)
Science | 0 | 41 | 96 | 0 | 0 | 341 | 0 | 478 (5.4)
Poetry | 0 | 0 | 0 | 0 | 0 | 0 | 450 | 450 (5.0)
Religion and theology | 138 | 83 | 0 | 12 | 86 | 0 | 0 | 319 (3.6)
Juvenile fiction | 0 | 29 | 89 | 5 | 52 | 23 | 61 | 259 (2.9)
History of Europe | 0 | 0 | 0 | 29 | 10 | 91 | 113 | 243 (2.7)
History of Asia | 174 | 26 | 0 | 0 | 0 | 32 | 0 | 232 (2.6)
Fine arts and music | 0 | 0 | 34 | 0 | 0 | 36 | 155 | 225 (2.5)
Geography, anthropology, and recreation | 76 | 30 | 0 | 0 | 0 | 21 | 57 | 184 (2.1)
Technology | 53 | 59 | 0 | 0 | 34 | 15 | 0 | 161 (1.8)
Psychology | 0 | 18 | 0 | 0 | 34 | 0 | 100 | 152 (1.7)
Library and information science | 0 | 0 | 116 | 0 | 0 | 35 | 0 | 151 (1.7)
Agriculture | 0 | 27 | 35 | 0 | 0 | 0 | 60 | 122 (1.4)
Military and naval sciences | 0 | 16 | 0 | 8 | 0 | 0 | 64 | 88 (1.0)
Literature—biography and criticism | 0 | 34 | 0 | 0 | 0 | 0 | 48 | 82 (0.9)
Publisher total (%) | 1905 (21.3) | 462 (5.2) | 1998 (22.4) | 316 (3.5) | 676 (7.6) | 989 (11.1) | 2589 (29.0) | 8935 (100)
(Percentages do not add to 100 due to rounding.)
No other self-publisher is as dependent on a single category as iUniverse for the presence of its titles in OCLC-member libraries, although PublishAmerica, with 211 held fiction titles out of 676 total held titles (31.2%), is moving in a similar direction. On the other hand, both AuthorHouse and Xlibris have a broad subject range of titles in OCLC-member libraries.

5.3. Types of libraries collecting self-publisher titles

As shown in Table 9, in terms of the number of OCLC-member libraries holding the overall top 25 held titles published by each of the seven American self-publishers, 5,150 OCLC-member public libraries hold self-published titles (57.6%), more than double (2.56 times) the number of OCLC-member university libraries (2008, or 22.5%) and about eight times the number of OCLC-member community college libraries (646, or 7.2%) or OCLC-member college libraries (569, or 6.4%). Much further down the list are OCLC-member military-related institutions (252, or 2.8%) and OCLC-member libraries grouped under the rubric of "historical society" (69, or 0.8%).

Different types of OCLC-member libraries have varying degrees of emphasis with regard to their holdings of different types of publications (Table 10), subject matter of publications (Table 11), and different self-publishers (Table 12). As shown in Table 10, while the number of OCLC-member libraries which hold self-publisher "handbooks and manuals" (28.4%) and "popular works" (25.8%) is about the same, public libraries account for 65.7% (1,667 out of 2,536) of the total number of held "handbooks and manuals." Similarly, public libraries account for 62.3% of the total number of held "biographies or autobiographies" (1,050 out of 1,685) and 79.7% of the total of held fiction titles from self-publishers (1,062 out of 1,333). Universities, community colleges, and colleges, taken together, account for 62% of the total number of held "popular works," many of which, as shown in Table 7, are histories of the United States, Europe, and Asia.

As seen in Table 9, OCLC-member public libraries hold 2.56 times the total number of self-publisher titles held by OCLC-member university libraries. If 2.56:1 is taken as a benchmark ratio, it is easy to identify the subject areas that, for public libraries, significantly exceed this ratio or, for university libraries, invert this ratio, thus pinpointing subject areas that are of particular importance for a particular type of library. Thus, as shown in Table 11, OCLC-member public libraries hold 4.8 times as many self-publisher social sciences titles as OCLC-member university libraries, 6.6 times as many self-publisher technology titles, 6.9 times as many self-publisher fiction titles, 7.8 times as many "geography, anthropology, and recreation" titles, and 83 times as many juvenile fiction titles. Conversely, OCLC-member university libraries hold 1.1 times as many "fine arts and music" self-publisher titles as OCLC-member public libraries, 1.3 times as many "religion and theology" self-publisher titles, 1.8 times as many European history self-publisher titles, 2.6 times as many "political sciences, education, and law" titles, and 9.6 times as many "library and information science" self-publisher titles. These divergences from the benchmark ratio show that OCLC-member public and university libraries are collecting self-publisher titles that appeal the most to their respective user groups.
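Using the public and university columns of Table 11, this benchmark comparison can be reproduced in a few lines. The sketch below computes the public-to-university ratio for a handful of subjects and flags which side of the 2.56:1 benchmark each falls on.

    # (public, university) holdings by subject, taken from Table 11.
    subject_counts = {
        "Social sciences": (1442, 302),
        "Technology": (119, 18),
        "Fiction": (1062, 153),
        "Geography, anthropology, and recreation": (133, 17),
        "Juvenile fiction": (249, 3),
        "Library and information science": (7, 67),
    }
    BENCHMARK = 5150 / 2008  # overall public-to-university ratio, ~2.56

    for subject, (pub, univ) in subject_counts.items():
        ratio = pub / univ
        side = "public" if ratio > BENCHMARK else "university"
        print(f"{subject}: {ratio:.1f}:1 (skews {side})")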
This is even more apparent in the case of military-related OCLC-member libraries. Of the 252 OCLC-member military-related libraries that hold self-publisher titles, 169 hold titles dealing with the history of Asia—an umbrella subject grouping that includes personal accounts of the Korean and Vietnam wars, among other topics.

As shown in Table 12, iUniverse titles are held most often by OCLC-member public libraries (in 1518 libraries), closely followed by Xlibris titles (in 1327 libraries). The picture changes in OCLC-member university, community college, and college libraries, where Xlibris titles are held most often (in 654, 285, and 218 libraries, respectively), followed by AuthorHouse titles (in 448, 164, and 128 libraries, respectively). In OCLC-member military-related libraries, AuthorHouse titles are the most frequently held titles (in 177 libraries). In many ways, results shown in Table 12 reiterate findings from Table 8. While iUniverse titles are concentrated at the rate of 76% (1518 out of 1998 total titles) in OCLC-member public libraries and PublishAmerica titles are concentrated in OCLC-member public libraries at a rate of 79.1% (535 out of 676 total titles), titles published by AuthorHouse and Xlibris have wide diffusion across all library types.

5.4. Top-ranked libraries and self-publisher titles

The 25 top-ranked ARL libraries held 1,056 self-publisher titles (7.5%) from the seven self-publishers (see Table 13). In addition, these 25 ARL libraries held more titles from Xlibris (411) and AuthorHouse (326) than from the remaining five self-publishers. The 25 largest public libraries in the United States (as measured by total number of holdings) held 2,306 self-publisher titles (16.4%). By a large margin, these 25 public libraries favor titles published by AuthorHouse (816), with titles published by iUniverse (488) and Xlibris (480) a distant second and third. Together, these 50 major academic and public libraries hold 23.9% of the total number of 14,061 self-publisher titles identified in the present study and favor, in descending order, AuthorHouse, Xlibris, and iUniverse.

There were important differences in the nature of the self-published titles that top-ranked ARL libraries and top-ranked public libraries had in their collections. For example, there were no common titles among the top five iUniverse titles held by top-ranked ARL and public libraries. The top five iUniverse titles held by the top 25 public libraries included If I Knew Then, The Chinese Adoption Handbook, The Russian Adoption Handbook, and The Great Garage Sale Book, while the top five iUniverse titles held by the 25 top-ranking ARL libraries included Research Strategies: Finding Your Way through the Information Fog, Beyond the Answer Sheet: Academic Success for International Students, The International Student's Survival Guide to Law School in the United States, and Others: Third Party Politics from Nation's Founding to the Rise and Fall of the Greenback-Labor Party. Compare also the differences in the top three titles published by PublishAmerica that were held by the top 25 public libraries—How to Destroy a Village: What the Clintons Taught a Seventeen Year Old; Science vs.
Religion: The 500-year War: Finding God in the Heat of the Battle; and The Adventures of Coker LaRue—with the top three titles published by PublishAmerica that were held by the 25 top-ranked ARL libraries: Domestic Abuse: Our Stories; Rescued by a Cow and a Squeeze: Temple Grandin; and Will the Gay Issue Go Away?: Questioning Sexual Myths: Toward a New Theological Consensus on Sexual Orientation. There were, however, similarities in collecting patterns between the 25 top-ranked public libraries and the 25 top-ranked ARL libraries in the case of self-published titles from AuthorHouse. Three books—The Guide to Identity Theft Prevention; Measuring Sky without Ground: Essays on the Goddess Kali, Sri Ramakrishna and Human Potential; and A Genealogical Index to the Guides of the Microfilm Edition of Records of Ante-bellum Southern Plantations from the Revolution through the Civil War—were among the top five titles from this publisher held by both top-ranking ARL and public libraries. Such crossover appeal between academic and public libraries may be one reason that AuthorHouse titles are so widely available in libraries.

Were there differences in the self-publisher holding patterns of the top 25 ARL libraries and top 25 public libraries, on the one hand, and all academic libraries (universities, colleges, and community colleges considered as a group) and all public libraries, on the other? As shown in Table 14, the 25 top-ranking ARL libraries, as a percentage of their total holdings of self-publishers, held AuthorHouse titles at a rate of 30.9%, while the comparable figure for all academic libraries was 23%. The top 25 public libraries, as a percentage of their total holdings of self-publishers, held AuthorHouse titles at a rate of 35.4%, while the comparable figure for all public libraries was 18%. If holding rates are a proxy for publisher reputation, the reputation of AuthorHouse is higher at both the 25 top-ranking ARL and public libraries than at academic and public libraries in general. However, the reputations of PublishAmerica, Ivy House, and Vantage Press are slightly lower at both the 25 top-ranking ARL and public libraries than at academic and public libraries in general. The reputations of Dorrance, iUniverse, and Xlibris are variable, depending on whether one is comparing the 25 top-ranking ARL libraries with all academic libraries, or the top 25 public libraries with all public libraries.
Indeed, Jacksonville Public Library's total of 937 self-published titles is more than three times as many as CPL, more than 15 times as many as Harvard, and just slightly less than the total of self-published titles held by the 25 top-ranked ARL libraries (1056).

Table 9
Number of OCLC-member libraries holding the overall top 25 held titles published by seven American self-publishers by type of library

Public (PUB), 5150 (57.6%): Libraries designated as PUB, MEM, CNTY, or REG by OCLC, as well as public library consortia and school libraries.
University (UNIV), 2008 (22.5%): Libraries designated as UNIV by OCLC, as well as law and medical school libraries.
Community college (CC), 646 (7.2%): Libraries designated as COMMUN COL by OCLC, as well as technical colleges and junior colleges.
College (COL), 569 (6.4%): Libraries designated as COL by OCLC, as well as seminaries and religious colleges (all COLs checked to see whether they were in fact a COL or COMMUN COL).
Military (MIL), 252 (2.8%): Libraries at military-controlled institutions of higher learning, military bases, and Veterans Affairs medical centers and agencies.
State library (SL), 102 (1.1%): Libraries designated by OCLC as being state libraries.
Government (GOV), 74 (0.8%): Libraries designated by OCLC as being national depositories or pertaining to a nonmilitary department or agency of a national government.
Historical society (HIST), 69 (0.8%): Libraries designated by OCLC as pertaining to historical societies, museums, archives, and art galleries.
Other (OTHER), 65 (0.7%): Libraries pertaining to private corporations, nonmilitary hospitals, law firms, banks, churches, and other institutions not included in any of the other categories above.
Total (all libraries): 8935 (100%).
The acronyms are used in Tables 10-12. Percentages do not add to 100 due to rounding.

Table 10
Number of OCLC-member libraries holding copies of the top 25 titles published by seven American self-publishers sub-divided by type of library and form of publication (n = 8935)

Form of publication                               PUB   UNIV   CC   COL   MIL   SL   GOV   HIST   OTHER   Total no. (%)
Handbooks, manuals, guidebooks, self-help, etc.  1667    408  232   125    23   30    15     11      25   2536 (28.4)
Popular works                                     739    886  291   253    30   27    35     34      11   2306 (25.8)
Biography or autobiography                       1050    275   26    81   186   19    16     16      16   1685 (18.9)
Fiction                                          1062    153   28    51     6   16     5      4       8   1333 (14.9)
Poetry                                            269     96   40    34     4    5     1      1       0    450 (5.0)
Speeches, letters, diaries, essays                114    187   28    25     2    2     2      3       3    366 (4.1)
Juvenile fiction                                  249      3    1     0     1    3     0      0       2    259 (2.9)
Total                                            5150   2008  646   569   252  102    74     69      65   8935 (100)
Percentages do not add to 100 due to rounding.

Table 11
Number of OCLC-member libraries holding copies of the top 25 titles published by seven American self-publishers sub-divided by type of library and subject (n = 8935)

Subject (no.)                                     PUB   UNIV   CC   COL   MIL   SL   GOV   HIST   OTHER
Social sciences (2205)                           1442    302  284   123     8   18     9      4      15
Fiction (1332)                                   1062    153   28    51     6   16     5      4       8
History of the United States and Canada (898)     458    254   37    64    13   16    14     34       8
Political sciences, education, and law (696)      135    352   80   109     4    3     3      7       3
Medicine (657)                                    496     84   31     7     7   11     7      0      14
Science (478)                                     215    183   32    21     7   10     5      1       4
Poetry (450)                                      269     96   40    34     4    5     1      1       0
Religion and theology (319)                       105    141   23    41     1    2     5      0       1
Juvenile fiction (259)                            249      3    1     0     1    3     0      0       2
History of Europe (243)                            70    123   11    18     4    2     6      9       0
History of Asia (232)                              27     19    4     4   169    0     4      3       2
Fine arts and music (225)                          92    100   10    17     0    1     4      1       0
Geography, anthropology, and recreation (184)     133     17   14    13     1    3     1      1       1
Technology (161)                                  119     18    9     5     3    3     4      0       0
Psychology (152)                                   70     52    8    14     4    1     0      0       3
Library and information science (151)               7     67   30    41     1    2     1      1       1
Agriculture (122)                                  97     18    0     0     2    1     2      1       1
Military and naval sciences (88)                   58      6    2     2    13    4     1      2       0
Literature—biography and criticism (82)            46     20    2     5     4    1     2      0       2
Total no. (%)                             5150 (57.6)  2008 (22.5)  646 (7.2)  569 (6.4)  252 (2.8)  102 (1.1)  74 (0.8)  69 (0.8)  65 (0.7)
Percentages do not add to 100 due to rounding.

Table 12
Number of OCLC-member libraries holding copies of the top 25 held titles published by seven American self-publishers sub-divided by publisher and type of library (n = 8935)

Publisher         PUB   UNIV   CC   COL   MIL   SL   GOV   HIST   OTHER   Total no. (%)
AuthorHouse       927    448  164   128   177   17     8     19      17   1905 (21.3)
Dorrance          192    184   14    33    11    5    12      4       7    462 (5.2)
iUniverse        1518    235  106    98     8   21     2      5       5   1998 (22.4)
Ivy House         193     57   15    13    10    7    10      4       7    316 (3.5)
PublishAmerica    535     66   18    33     2   10     4      1       7    676 (7.6)
Vantage Press     458    364   44    46    18   17    22     11       9    989 (11.1)
Xlibris          1327    654  285   218    26   25    16     25      13   2589 (29.0)
Total            5150   2008  646   569   252  102    74     69      65   8935 (100)

6. Discussion

Academic and public libraries are aware of titles published by self-publishers, having in their collections 14,061 unique titles published by AuthorHouse, Dorrance, iUniverse, Ivy House, PublishAmerica, Vantage Press, and Xlibris in 2000–2004. However, 42.8% of the titles are in a single OCLC-member library, with 93.4% of the titles in fewer than 10 OCLC-member libraries each. Only a relatively small number of titles published by self-publishers (defined as those that are held by at least 10 libraries) are broadly represented in OCLC-member libraries.
Possible explanations for this are: generally poor marketing on the part of self-published authors and their publishers; failure of library vendors to include self-publishers in their approval plan profiles; lack of discounts and incentives by self-publishers to library vendors so that the vendors would include self-publishers in their approval plan profiles; lingering concerns about the quality of titles bearing the imprint of a self-publisher; or some combination of the above. When OCLC-member libraries do choose to collect titles from self-publishers, they are making distinctions among them, favoring AuthorHouse, iUniverse, and Xlibris (all "author services" POD publishers) rather than Dorrance, Ivy House, PublishAmerica, and Vantage Press.

Table 13
Holdings of titles published by seven American self-publishing houses in groups of top-ranked libraries in the United States

Publisher        No. of titles in top 25 ARL libraries   No. of titles in top 25 public libraries   Total no. of titles
AuthorHouse            326                                     816                                      1142
Dorrance                51                                     173                                       224
iUniverse              154                                     488                                       642
Ivy House                5                                      17                                        22
PublishAmerica          29                                     153                                       182
Vantage Press           80                                     179                                       259
Xlibris                411                                     480                                       891
Total                 1056                                    2306                                      3362

Table 14
Comparison of holdings of titles published by seven American self-publishing houses between top-ranked public and academic libraries in the United States and all public and academic libraries in the United States

Publisher        Top 25 ARL libraries   All university, college, and community college libraries   Top 25 public libraries   All public libraries
AuthorHouse          326 (30.9)              740 (23.0)                                                 816 (35.4)               927 (18.0)
Dorrance              51 (4.8)               231 (7.2)                                                  173 (7.5)                192 (3.7)
iUniverse            154 (14.6)              439 (13.6)                                                 488 (21.2)              1518 (29.5)
Ivy House              5 (0.5)                85 (2.6)                                                   17 (0.7)                193 (3.7)
PublishAmerica        29 (2.7)               117 (3.6)                                                  153 (6.6)                535 (10.4)
Vantage Press         80 (7.6)               454 (14.1)                                                 179 (7.8)                458 (8.9)
Xlibris              411 (38.9)             1157 (35.9)                                                 480 (20.8)              1327 (25.8)
Total               1056 (100)              3223 (100)                                                 2306 (100)               5150 (100)
Percentages do not add to 100 due to rounding.

Table 15
Holdings of titles published by seven American self-publishing houses in selected academic and public libraries in the United States

Library                              AuthorHouse   Dorrance   iUniverse   Ivy House   PublishAmerica   Vantage Press   Xlibris   Total
Harvard University                        18           8          2           1             1                6            24       60
Yale University                           11           1          9           0             1                7            28       57
University of California-Berkeley         17           1          4           0             0                5            29       56
University of Michigan                    13           3          7           0             1                3            22       49
University of Illinois-Urbana              7           8          7           1             3                6            16       48
Los Angeles Public Library                37           9         21           6             9               13            57      152
New York Public Library                   34          27         28           3             4               11            66      173
Chicago Public Library                    87          17         43           5            10               53            71      286
Houston Public Library                    12           0          9           1             2                5             8       37
Miami-Dade Public Library                 56           3         44           2             6               12            26      149
Jacksonville (FL) Public Library         126          14        582           4            68               15           128      937
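The percentage comparisons in Table 14 are straightforward shares of each group's total self-publisher holdings. The following minimal sketch, in Python, reproduces the figures quoted above using the published counts; the variable names are ours, not the study's.

    # A minimal sketch of the share computation behind Table 14, using the
    # published counts; the variable names are ours, not the study's.
    top25_arl = {"AuthorHouse": 326, "Dorrance": 51, "iUniverse": 154,
                 "Ivy House": 5, "PublishAmerica": 29, "Vantage Press": 80,
                 "Xlibris": 411}
    all_public = {"AuthorHouse": 927, "Dorrance": 192, "iUniverse": 1518,
                  "Ivy House": 193, "PublishAmerica": 535, "Vantage Press": 458,
                  "Xlibris": 1327}

    def shares(holdings):
        """Express each publisher's count as a percentage of the group total."""
        total = sum(holdings.values())
        return {pub: round(100 * n / total, 1) for pub, n in holdings.items()}

    print(shares(top25_arl)["AuthorHouse"])   # 30.9, as reported in Table 14
    print(shares(all_public)["AuthorHouse"])  # 18.0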
Given the negative publicity surrounding PublishAmerica in late 2004 and early 2005 (Span, 2005), OCLC-member libraries and librarians may not feel comfortable about the quality of titles published by PublishAmerica, nor about the older subsidy presses such as Vantage Press and Dorrance. Although AuthorHouse, among the seven self-publishers, has the greatest number of unique titles present on the shelves of all OCLC-member libraries (5223 titles, 37.1% of the total; Table 1), the rate at which its overall top 25 held titles are held by OCLC-member libraries, as a percentage of the number of OCLC-member libraries holding top 25 held titles from all seven self-publishers, no longer makes it the leading self-publisher (1905 libraries, or 21.3%; Table 2). When it comes to the number of OCLC-member libraries holding top 25 held titles of self-publishers, Xlibris (2589 libraries, 29%) and iUniverse (1998 libraries, 22.4%) surpass AuthorHouse. This may be an indication that OCLC-member libraries are finding that the top echelon of iUniverse and Xlibris titles is qualitatively superior to the top echelon of AuthorHouse titles—a circumstance substantiated by the fact that only iUniverse (1) and Xlibris (2) have titles that are held by 400 or more OCLC-member libraries each. However, AuthorHouse titles, as a percentage of the holdings of all seven self-publishers, are held at a proportionally greater rate by the 25 top-ranking ARL libraries and the top 25 public libraries than by all OCLC-member academic and public libraries. In general, the better quality self-published titles may have migrated away from Dorrance and Vantage Press in 2000–2004, especially when compared with their successes (measured in terms of the numbers of OCLC-member libraries holding their most popular titles) between 1960 and 1999. Writers who may have published with Dorrance and Vantage Press in the past may now be using "author services" publishers such as AuthorHouse, iUniverse, and Xlibris. The relatively heavy rates at which OCLC-member libraries hold "author services" self-published titles, as opposed to subsidy self-publisher titles, may indicate that the titles published by "author services" self-publishers are either about topics and subjects of more interest to libraries than the titles published by subsidy self-publishers, are of a better quality, or both. In addition, the circumstance that a publisher—as in the case of AuthorHouse—has a number of titles that are judged to be worth holding by both academic and public libraries (as opposed to books that have little crossover in academic and public libraries, as in the case of iUniverse and PublishAmerica) may be another way to measure the quality (or breadth of appeal of the topic) of self-published titles, as well as the reputations of their publishers. Based on the comparison between OCLC-member public and university libraries, on the one hand, and top-ranked ARL libraries and public libraries on the other, public libraries (no matter their ranking) hold more than twice as many self-publisher titles as academic libraries (no matter their ranking). In broad terms, academic libraries are not as receptive to self-publishers as public libraries, perhaps because of the emphasis on peer-review in the academic world. And if the Jacksonville (Florida) Public Library is any indication, some public libraries are making a serious effort to develop strong collections of self-publisher titles.
These public libraries may have realized that the level of quality of recent self-published titles, especially from POD "author services" publishers, warrants their inclusion in library collections in significant numbers, especially in the categories of fiction and "handbooks and manuals," and in such subject areas as the social sciences, history, and medicine.

7. Conclusion

Library collections have always attempted to meet the needs of their users. As large mainstream publishers become focused on profit-and-loss statistics (Schiffrin, 2000) and as the demands of bookstores stoke the corporate emphasis on bestsellers (Epstein, 2001), librarians should remember that self-publishers often release titles that would not typically find a home with a profit-oriented publisher. Self-publishers may be one of the last frontiers of true independent publishing. Richard Sarnoff, the president of Random House Ventures (RHV), which owns "a minority stake in Xlibris," commenting on why RHV invested in Xlibris, explained that "[w]hat's interesting is the capability of having micro-niches that are so small that publishers would not be interested in publishing them in the traditional way" (Glazer, 2005, p. 11). Thus, micro-niche titles such as Laparoscopic Adjustable Gastric Banding and Be Brief. Be Bright. Be Gone: Career Essentials for Pharmaceutical Representatives may represent the new face of self-publishing (Glazer, 2005). If this is the direction that self-publishing is taking, public and academic librarians should reevaluate their negative preconceptions about self-publishers, especially AuthorHouse, iUniverse, and Xlibris, because catering to segmented, niche, and individualized markets has been shown, in influential marketing textbooks such as Principles of Marketing (Kotler & Armstrong, 2003), to be an effective way to generate demand for a given product or service. In blunter terms, collection development librarians in public and academic libraries should make a conscious effort not to exclude self-published titles from their field of vision because the stigma traditionally associated with self-publishing is quickly disappearing.

References

Association of Research Libraries. (2005). ARL statistics: Ranked lists for institutions for 2003. Retrieved May 21, 2005, from http://fisher.lib.virginia.edu
AuthorHouse. (2005). Author services agreement. Retrieved June 11, 2005, from http://www.authorhouse.com/GetPublished/Agreements.asp
Bartel, J. (2003). The Salt Lake City Public Library zine collection. Public Libraries, 42, 232–238.
Crook, M., & Wise, N. (1987). How to self-publish and make money. Kelowna, BC: Sandhill Publishing and Crook Publishing.
Epstein, J. (2001). Book business: Publishing past present and future. New York: W. W. Norton.
Glazer, S. (2005, April 24). How to be your own publisher. New York Times Book Review, 10–11.
Hayward, P. (1992). The trend towards self-publishing. Canadian Library Journal, 49, 287–293.
Herrada, J. (1995). Zines in libraries: A culture preserved. Serials Review, 21(2), 79–88.
Kotler, P., & Armstrong, G. (2003). Principles of marketing (10th ed.). Upper Saddle River, NJ: Prentice Hall.
Kremer, J. (1986). 1001 ways to market your books—For publishers and authors. Fairfield, IN: Ad-Lib Publications.
Kremer, J. (n.d.). The self-publishing hall of fame. Retrieved June 9, 2005, from http://www.selfpublishinghalloffame.com/
Manley, W. (1999). One-tenth of one percent.
Booklist, 96, 485.
Online Computer Library Center. (2005a). Glossary. Retrieved June 12, 2005, from http://firstsearch.oclc.org
Online Computer Library Center. (2005b). WorldCat: Window to the world's libraries. Retrieved June 12, 2005, from http://www.oclc.org/worldcat/default.htm
Public Library Association. (2004). Statistical report 2004: Public library data service. Chicago: American Library Association.
PublishAmerica. (2005). Facts and figures about one of America's most spectacular book publishing companies. Retrieved May 23, 2005, from http://www.publishamerica.com/facts/index.htm
Schiffrin, A. (2000). The business of books: How international conglomerates took over publishing and changed the way we read. London: Verso.
Span, P. (2005, January 23). Making books. Washington Post Book World, T8–T9.
Stoddart, R., & Kiser, T. (2004). Zines and the library. Library Resources & Technical Services, 48, 191–198.
Thompson, J. B. (2005, June 17). Survival strategies for academic publishing. The Chronicle of Higher Education, 51(41), B6.
Wyatt, E. (2004, September 10). Ingredients of a best seller: Faith, luck and hard work. New York Times, E30.

work_tymmmuio6raitfa7a65jhtu7ve ----

Library Acquisitions: Practice & Theory, Volume 20, Issue 1 (Spring 1996), p. 9-21. ISSN: 0364-6408. doi:10.1016/0364-6408(95)00076-3. Copyright © 1996 Published by Elsevier Science Ltd. SSDI 0364-6408(95)00076-3.

PROMPTCAT ISSUES FOR ACQUISITIONS: QUALITY REVIEW, COST ANALYSIS AND WORKFLOW IMPLICATIONS

MARY M. RIDER
Head of Copy Cataloging, The Ohio State University Libraries, 040S Main Library, 1858 Neil Avenue Mall, Columbus, OH 43210

MARSHA HAMILTON
Head of Monograph Acquisitions, The Ohio State University Libraries, 036 Main Library, 1858 Neil Avenue Mall, Columbus, OH 43210

Abstract

PromptCat is a new service offered by OCLC, in conjunction with monograph materials vendors, that provides libraries with a full bibliographic record from the OCLC Online Union Catalog (OLUC) simultaneous with the supply of materials from a vendor. The library's holdings are set automatically on the OLUC record. Because PromptCat eliminates the need for libraries to do individual title-by-title searching and record selection when materials are received, it will streamline local cataloging activities. It may also provide an impetus for libraries to reevaluate local editing practices and determine whether materials can be processed quickly upon receipt in acquisitions rather than in copy cataloging. This article addresses issues relating to PromptCat, including tests of the service conducted at The Ohio State University (OSU) and Michigan State University (MSU), an estimated cost/benefit analysis based on OSU's approval plan, and issues including coordination among OCLC, materials vendors, system vendors, and the library, as well as workflow, organizational, and staffing implications.

INTRODUCTION

In recent years, libraries have shown increasing interest in relying on outside vendors to supply all or part of their cataloging services. In the 1990's, economic constraints have forced many libraries to reduce staff, particularly in large cataloging departments. As a result, many libraries have concluded that they can no longer achieve desired productivity and quality standards and also maintain reasonable costs by cataloging in-house.
A few libraries have outsourced cataloging entirely, but many feel uncomfortable with the loss of flexibility and diminished local expertise which this may entail. A hybrid option is to receive cataloging copy that can still be reviewed and adjusted as desired for the local system. In this environment, OCLC announced plans for PromptCat, a service released in spring 1995 that allows libraries to receive MARC records from the OCLC Online Union Catalog (OLUC) at the same time they receive monographs from approval plan and firm order vendors. Initially, OCLC plans to offer the service in conjunction with four approval plan vendors: Academic Books, Baker & Taylor, Blackwell North America, and Yankee Book Peddler. Other vendors will be added as the service expands [1]. Although PromptCat is intended for both firm order and approval monographs, this analysis will focus on approval materials only. PromptCat operates in the following way. OCLC receives weekly updates of new titles added to the vendor's inventory. These titles are searched against the Online Union Catalog (OLUC) to locate matching bibliographic records. The bibliographic record is selected according to several match algorithms that use multiple elements to compare the vendor record to the OLUC record. At regular intervals, the vendor sends OCLC a manifest with a list of the titles to be shipped to the library on approval or firm order. OCLC supplies the matching bibliographic record to the library and sets the library's holding symbol on the OLUC record. Libraries will have various options regarding type, bibliographic level, and format of records selected, method of record delivery, and timing of the setting of holdings. A Cataloging Report will be supplied for records not delivered if no matching records are found, for duplicate holdings in case the library's symbol is already attached to the selected OLUC record, and for records delivered. Based on preference or local practice, libraries may specify which categories of copy they want to receive, e.g., DLC, CIP, member, UKM, or a combination. Records will be supplied to the local catalog using one of four delivery options. They can be provided on 1600 or 6250 bpi magnetic tapes, via Electronic Data Exchange (EDX), which is OCLC's implementation of FTP, through an online PRISM PromptCat file, or on catalog cards. Holdings may be set immediately or with a built-in 21-day delay to allow for review and processing of approval titles. If approval titles are returned to the vendor by the library, holdings may be removed from the OLUC record manually on PRISM, via EDX, by magnetic tape or MICROCON Delete.

THE PROMPTCAT TEST AT MICHIGAN STATE UNIVERSITY

Two libraries participated with OCLC in conducting tests of the prototype PromptCat service: Michigan State University Libraries (MSU) in October-December 1993 and The Ohio State University Libraries (OSU) in January-March 1994. The MSU test was performed with approval records supplied by Yankee Book Peddler. Yankee selected books for MSU's weekly approval shipment and sent bibliographic data via FTP to OCLC. OCLC searched the titles against the OLUC and set MSU's holdings on those records. A tape of the records was generated and shipped to MSU's Computer Center [2]. A major question in both tests was whether OCLC's matching algorithm would locate the correct bibliographic record based on the vendor-supplied data.
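Neither the article nor OCLC's promotional literature spells out the matching algorithm, but its general shape can be illustrated. The Python sketch below is a hypothetical reconstruction of multi-element matching of a vendor record against candidate OLUC records; the elements, weights, and threshold are invented for illustration and are not OCLC's actual values.

    # Hypothetical illustration of multi-element record matching of the kind
    # PromptCat performs; elements, weights, and threshold are invented.
    def match_score(vendor, candidate):
        """Score a candidate OLUC record against a vendor-supplied record."""
        score = 0
        if vendor.get("isbn") and vendor["isbn"] == candidate.get("isbn"):
            score += 50
        if vendor.get("title", "").lower() == candidate.get("title", "").lower():
            score += 30
        if vendor.get("publisher") == candidate.get("publisher"):
            score += 10
        if vendor.get("year") == candidate.get("year"):
            score += 10
        return score

    def best_match(vendor, candidates, threshold=60):
        """Return the best-scoring OLUC record, or None (a Cataloging Report line)."""
        scored = [(match_score(vendor, c), c) for c in candidates]
        score, record = max(scored, key=lambda pair: pair[0], default=(0, None))
        return record if score >= threshold else None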
Another concern was whether OCLC would select what libraries considered the best available record in the OLUC and whether the copy would be suitable for the local catalog. In her review of the test at MSU, Kay Granskog indicated that the percentage of matching records supplied by OCLC was 99.4% [3]. She also indicated that 40% of the records supplied were full-level Library of Congress records that required no revision. Since MSU performed some "straightforward cataloging" in acquisitions, they were able to process many of the PromptCat records in this same manner. Acquisition staff matched the PromptCat record against the book in hand, input the location code, and wrote the call number in the book. Since access points are reviewed later by an outside vendor, headings were not verified. Call numbers were not adjusted if they fit certain criteria. Because MSU does not consider this process full copy editing, they describe it instead as "monographic check-in," and have developed the following criteria for "check-in" of books in the Acquisition Department:

1. Author, title, edition, imprint, and date on record match the book in hand;
2. Bib Lvl = m and Enc Lvl = (blank) or I;
3. Call number (050/090) is present and not in class M, ML, MT, P-PZ, Z [These classes require cutter number adjustment.];
4. Title is not an analytic, LAW title, a set, an added edition, or an added conference proceedings [4].

One of the problems raised by the test at MSU was how to distinguish newly-loaded PromptCat records from records for fully-processed materials available for public use. In the MSU test, it took approximately 2 1/2 to 3 weeks to complete shelf processing of titles in the PromptCat sample. These titles already showed MSU's holdings in OCLC and in the local catalog. Because of concern that these titles would generate interlibrary loan activity, acquisition personnel forwarded rejected approval titles immediately to the database management area to remove MSU's holdings from the OCLC record and the online catalog. This was particularly important because there was no indication in the public catalog that PromptCat titles were not available on the shelf. Because of the speed with which PromptCat bibliographic records are supplied, each library will need to decide whether OCLC holdings should be set immediately or delayed and whether a note or status code should be added to records in the local system to alert users that titles may not be available for circulation. MSU considered the test of the early product to be "a good compromise between totally outsourcing a workflow that represents a high volume of material and full in-house processing. By stamping and labelling books ourselves the library retains the right to return an unmarked book to the approval vendor. The library gains the possibility of faster processing but still receives the books from the vendor immediately after they are profiled" [5].

THE PROMPTCAT TEST AT THE OHIO STATE UNIVERSITY

The second PromptCat test was performed at The Ohio State University Libraries with the Baker and Taylor Company (B&T). The OSU Libraries (OSUL) receives approximately 18,000 approval titles annually from B&T. Normally, abbreviated MARC bibliographic records are provided by B&T on tape shortly before receipt of the weekly approval shipment.
Each title is searched in Monograph Acquisitions against the local catalog (OSCAR, an INNOPAC system) to identify duplicates, added volumes, added editions, analytics, series on standing order, series or serials cataloged as monographs, etc. Books are then displayed for 1 week for collection managers who decide whether items should be selected or rejected. Acquisition staff complete processing of the local order record and forward all selected books to the cataloging department where titles are searched to locate suitable OCLC copy. The copy chosen for cataloging will eventually overlay the brief record supplied by the vendor. Although several vendors, including B&T, are capable of providing cataloging copy using the LC MARC tape service, until now there has been no mechanism to obtain the unique OCLC number on vendor-supplied records or to automatically upload the library's holding symbol to OCLC. For libraries like OSU that have a commitment to contribute to the OCLC database, this was a problem. PromptCat does fulfill these two needs. Since PromptCat will supply the local system with an OCLC record at the point of a book's receipt, it will also make it possible for OSU to finish processing in acquisitions rather than send all books to the cataloging department for search and copy editing. The test at OSU examined whether the record selected by the PromptCat service would match the record selected in cataloging. Based on a random sample of 200 books, 182 records (91%) matched one-to-one with those chosen by OSU. For the remaining 18 titles (9%), two records were supplied by OCLC, one of which matched OSU's choice. Multiple records were supplied for OSU personnel to test and evaluate the OCLC searching algorithm. In all but one case, the record with the highest match rate was also the preferred OSU record. In the final product, only one record will be supplied according to the matching algorithm. The search algorithm was also evaluated in a second non-random sample of 128 problematic titles flagged by OCLC for special review. Of this group, OCLC supplied one record for 108 titles (84%) that matched those chosen by OSU internally, multiple records for 10 titles (8%), one of which matched OSU's choice, and 10 records (8%) that did not match the record selected by OSU. Again, these were considered problematic titles. The 10 records that did not match OSU's choice also included records where OSU decided on a different cataloging treatment (e.g., multivolume set vs. individual monograph) than the one for which matching copy was found in the OLUC. As a result of these two tests, the Library gained confidence that OCLC's record selection (based on the match algorithm and other criteria) could replicate OSU's own decision process in selecting cataloging copy 91-99% of the time. Because acquisition staff search approval titles in the local system prior to retention review, local cataloging practice that may differ from PromptCat copy can easily be identified without additional searching. A second issue addressed in the OSU test was the quality of records supplied by PromptCat. In this case, the speed with which OCLC supplied copy meant that a higher than usual proportion of CIP records were delivered. Sixty-five percent of the test records were CIP, 25% were DLC (Library of Congress), 8% were OCLC member copy, and 2% were UKM (UK MARC). Currently, OSU approval titles are cataloged about 3 to 4 weeks after receipt.
By the time OSU cataloged the sample of 182 titles, 49 CIP records (or 42%) had been upgraded to DLC blank or 1 encoding level copy. This raised the question of whether it would be advantageous to wait several weeks after receipt of books for PromptCat CIP records to be upgraded to full-level DLC copy or to have OSU staff upgrade the CIP records to make materials available for patrons. To help answer this question, the amount of editing that would have to be done to integrate CIP copy into the local catalog was evaluated. For the sample of 182 titles, 78% of the CIP records required adjustment to fields other than call number, but the average number of changes per record was relatively low, an average of two fields per record. These changes were mostly in the 300 description field (pagination, size) and 260 publisher field (including date of publication) and would have to be added or revised on most CIP copy. The evaluation helped to convince the OSU Libraries that it would be relatively simple to revise CIP records at the point of receipt and that it would better serve patrons to upgrade CIP PromptCat records when the books were processed in acquisitions rather than delay processing. This would provide materials 2-3 weeks faster than the current workflow. The tests conducted at MSU and OSU showed that the selection of PromptCat records using the OCLC matching algorithm was highly accurate. This prompt delivery of records also reinforced the idea that libraries will realize a greater benefit from PromptCat if they are willing to accept OCLC-supplied copy with a minimum of local adjustment [6].

COST/BENEFIT ANALYSIS

A preliminary estimate of PromptCat costs associated with the OSU Baker & Taylor approval plan indicates that the service will be cost-effective, particularly when compared with average figures for in-house processing that include direct OCLC charges (e.g., search for copy, FTU's for uploading holdings) as well as indirect charges for staff time and salaries. The total cost of PromptCat will vary somewhat for each library depending upon the method chosen for delivery of records (e.g., tape, EDX, cards or PRISM PromptCat file) and removal of holdings for rejected approval titles. OHIONET has announced that the 1995/96 PromptCat charge will be $1.925 per record. This per record charge will occur regardless of whether the record is retained in the local system. Libraries with a low approval rejection rate, thus, will realize greater savings. In addition, there will be an initial set-up fee of $220.00 per vendor for profiling. A library may be assessed charges for the EDX annual fee (Electronic MARC annual fee of $198.00) and electronic MARC monthly processing charges of $22.00 if that is the delivery mechanism chosen. If records are placed in a PRISM PromptCat file, the library will be charged regular export charges of $.1045 per record when the record is downloaded [7]. To estimate annual direct costs for the PromptCat service at OSUL, excluding profiling charges, the $1.925 per record delivery charge was multiplied by 18,000, which is the average number of Baker & Taylor approval titles received yearly [8]. If OSU were to receive PromptCat records via tape for approval books, the annual cost of those tapes ($16.50 each for 52 weekly 1600 bpi tapes) would be $858.00 (see Table 1).

TABLE 1. Estimated Cost of PromptCat Service for OSUL B&T Plan
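The arithmetic behind this direct-cost estimate can be laid out explicitly. The following Python sketch uses only the figures quoted in the text ($1.925 per record, 18,000 titles, 52 tapes at $16.50 each, and the 6% return rate discussed below).

    # Direct-cost arithmetic using only the figures quoted in the text.
    PER_RECORD_CHARGE = 1.925          # 1995/96 OHIONET PromptCat charge
    ANNUAL_TITLES = 18_000             # average B&T approval receipts per year
    TAPE_COST, TAPES_PER_YEAR = 16.50, 52

    record_charges = PER_RECORD_CHARGE * ANNUAL_TITLES   # $34,650.00
    tape_charges = TAPE_COST * TAPES_PER_YEAR            # $858.00
    annual_service = record_charges + tape_charges       # $35,508.00

    # The per-title figure is spread over retained titles only (6% return rate).
    retained = ANNUAL_TITLES - 1_200                     # 16,800 titles
    print(round(annual_service / retained, 2))           # 2.11, as in the text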
OSU will investigate the feasibility of electronic transmission of PromptCat data, but since tapes are the current delivery mechanism used for approval records, tapes were used for this cost/benefit analysis. The cost per record for using the PromptCat service can be estimated on the basis of the average number of B&T approval records received annually (ca. 18,000) minus an average return rate of 6% (ca. 1,200 books were returned to the vendor last fiscal year). This is a rough estimate that assumes PromptCat would supply catalog records for all of the remaining 16,800 approval titles annually (see Table 2). We know, however, that a small percentage of records will not have matching or acceptable OCLC copy. In a few cases, OSU may not use the PromptCat-supplied record if it does not reflect the Libraries' preferred cataloging treatment (e.g., monograph copy supplied for title cataloged at OSU as serial or monographic set), but these numbers should not significantly affect the average per record charge. For an average of $2.11 per title (excluding set-up charges), OCLC will supply a matching bibliographic record for approval titles and set OSU holdings in the OLUC. Other processing activity will continue to be performed in-house. In addition to processing the record for retention and payment, OSU staff will need to add some data for the local catalog including library location codes, material and record type, circulation status, and barcode numbers. OSU personnel will need to complete descriptive information that may be lacking on CIP copy (estimated to be 65% of the PromptCat records), and accept with minor revisions DLC (25%) and member copy (8%). Due to different national standards, UKM copy (2%) may require more extensive checking and revision by cataloging staff. It is hoped that in the majority of cases, OSUL will be able to process books and records quickly after they are reviewed for retention in acquisitions. In order to achieve this goal at the OSU Libraries, local editing policies and practices are being reviewed to change workflows so that some records can be completed in acquisitions without having to go to a separate cataloging area. This means that it will be possible to have books on the shelf and available for circulation an estimated 2-3 weeks earlier than is presently the case. Catalog records will also be displayed more quickly in the local system while books are in process. This will facilitate review by collection managers and reduce potential duplication in preorder search for firm ordered titles. The option will be available in the local system to display records for staff and suppress them from public view until books are completely processed. To further evaluate how the estimated costs for PromptCat compare with traditional costs for OCLC search and copy cataloging activities, OSUL looked at both direct OCLC charges as well as the average number of titles cataloged per hour at OSUL and the number of FTE staff needed to search and edit approximately 16,800 books annually. Presently, OCLC charges for online search and update of holdings in the OLUC via OHIONET amount to direct costs of $.79 per record. Average direct costs for cataloging 16,800 titles would, therefore, amount to $13,272 annually (see Table 3).
TABLE 2. Estimated Per Record Cost of PromptCat

TABLE 3. Estimated OCLC Costs for B&T Approval Books

While the direct costs of in-house search and cataloging per title are low, less than $1.00, other factors, including divided workflows, sorting and moving materials, staff time spent editing records for local practice, etc., increase costs substantially. At OSUL, an average of about 3 titles per hour can be searched and edited, even though search for copy is currently handled separately from copy cataloging for most materials. At this rate, assuming that a staff member edits records for 7 hours per day, a total of 21 titles/day/person can be processed. Although there are 260 work days in the year, OSU recognizes 11 official holidays. Civil service staff also receive from 2-5 weeks vacation, depending on years of service [9]. Using a figure of 3 weeks vacation per year plus holidays (234 working days/year × 21 titles/day), approximately 4,914 titles can be searched and edited annually per FTE staff. It would therefore require 3.4 FTE to search and edit the 16,800 B&T approval books selected annually based on current OSU procedures and editing practices. Based on an average salary of $21,500, plus the university's estimate of benefits equal to 22% of the base salary, the total cost of hiring 3.4 staff members comes to $89,182 [10]. If the annual cost of weekly OCLC tapes to load records to the local system is added to staff salaries, in addition to direct OCLC online charges, the annual cost of cataloging 16,800 approval titles amounts to $104,456 annually or $6.22 per title (see Table 4). This is a conservative estimate because it does not factor in staff overhead such as sick leave, other assignments including committees or meetings, and other library duties. It also does not include additional database charges for authority processing, which is handled separately at OSUL [11]. Staff costs for completing PromptCat records in acquisitions are difficult to estimate because this process was not actually performed during the OSU test. However, using time study estimates based on the assumption that a percentage of PromptCat copy will be processed with minimal editing, it may be possible for a staff member to process 8 titles per hour (see "Determine Staffing Needs" later in this article). This includes current acquisition check-in (3 minutes/title) plus review and completion of the catalog record (4 minutes/title). At this rate, it is estimated that 1.2 FTE would be needed to check-in and complete processing for 16,800 approval titles. Using the same average salary plus benefits figure used above, $31,476.00 ($26,230.00 × 1.2 FTE), plus the estimate of OSU's annual PromptCat service ($35,508), the cost of processing 16,800 approval titles using PromptCat would be $3.99 per title. This figure is calculated differently than the cost of search and copy cataloging ($6.22/title) because it includes staff time for acquisition check-in, but also assumes, for the sake of comparison, that all titles would be processed in acquisitions, even though it is estimated that up to 20% of records would still be forwarded to cataloging, again increasing the total cost per title. This 20% reflects an estimate of titles classed in P, M, and N as well as miscellaneous serial, analytic and special collections titles that require call number adjustment or special processing. Based on the above estimates, it appears that PromptCat is a cost-effective service.
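The bottom-line unit costs can be checked directly from the annual totals quoted above. In the Python sketch below, all dollar inputs come from the text; note that the computed savings round to 36%, which the article reports as approximately 35%.

    # Verifying the bottom-line unit costs; dollar inputs are the annual
    # totals quoted in the text.
    TITLES = 16_800

    in_house_total = 104_456                            # staff + OCLC charges + tapes
    promptcat_total = 1.2 * (21_500 * 1.22) + 35_508    # $31,476 staff + $35,508 service

    print(round(in_house_total / TITLES, 2))    # 6.22 per title
    print(round(promptcat_total / TITLES, 2))   # 3.99 per title
    print(f"{1 - promptcat_total / in_house_total:.0%}")  # 36%; the text rounds to 35%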
Compared to current costs for search and copy cataloging ($6.22/title), if all 16,800 records could be processed at the rate of $3.99/title, the library would realize a 35% savings for processing of approval titles, with no significant differential in quality [12]. While promising, it does appear that the degree to which PromptCat can be truly cost-effective, and the extent to which it can replace or reduce OCLC search and copy editing, will depend largely on individual libraries and whether they can minimize in-house editing after books are reviewed and selected. These "hidden" factors can quickly escalate per title processing costs.

TABLE 4. Estimated OSUL Costs for Search and Copy Cataloging

WORKFLOW ISSUES

In general, smaller libraries traditionally have had an advantage in being able to process incoming materials quickly because items do not have to move from place to place or person to person, sit in backlogs, or have the same record worked on at different times. Libraries with a larger number of incoming receipts are more likely to have separate units to handle different functions. The same item is passed from person to person in an assembly line process. Historically, this arrangement developed to allow large libraries, which also had large personnel budgets in the past, to take advantage of staff who were trained to handle specific tasks. For example, acquisition receipt, search for OCLC copy, copy cataloging, and even labeling could be handled in separate units or departments on the theory that this arrangement would maximize expertise on the part of staff performing a narrowly-defined function. While PromptCat does not require libraries to change workflows or handle materials differently, it does provide a powerful impetus, particularly for larger libraries, to reevaluate existing practices and organizational schemes to maximize PromptCat's benefits. Because PromptCat provides a full OCLC catalog record at the point of a book's receipt, it enables acquisition and copy cataloging functions to be handled at the same time as part of a unified process. The merging of acquisition and cataloging activities has been practiced in recent years in libraries that have instituted "fastcat" processing so that monographs can be "checked in" and records added to local catalogs without extensive review or revision of existing copy. Michigan State, among other libraries, already had experience with this method of monographic "check-in" when they tested the prototype PromptCat service in late 1993. However, even if OCLC records were downloaded to the local system at the time of firm order, staff still had to search OCLC for approval titles. It is this step — the OCLC search for acceptable catalog copy and setting of holdings for approval titles — that PromptCat will eliminate. The library is left with editing records for the local catalog only and shelf preparation of the physical piece. By handling receipt and cataloging at the same time, all information gained during check-in or problem-solving would remain with one individual, and not be subject to loss as the item is passed from unit to unit. Approvals with no copy or unacceptable copy could continue to be forwarded to a copy and/or original cataloging unit to handle problems, such as call number assignment, that would otherwise slow approval processing. This would speed the shelf-readiness of a larger percentage of newly acquired materials and hypothetically free copy editors to focus on more problematic titles.
It also means that there are fewer opportunities for books to be misplaced or errors made as items are transferred from one work area to another.

IMPLEMENTING PROMPTCAT IN TECHNICAL SERVICES

Libraries interested in using PromptCat will want to do some planning in advance to ensure that local system requirements and library needs are compatible with the various options offered by the service. Although the service does appear to offer a cost-effective alternative to local OCLC searching and may reduce editing of catalog copy, libraries will want to evaluate it based on their specific needs and objectives. In planning for implementation, at least three major areas should be addressed by libraries. These include knowledge of the local system and how PromptCat records will interface with the local database, impact of the service on library workflows, and staffing requirements.

Know Your Local System

Since libraries will have various options with PromptCat and can establish criteria for record selection as well as choose the method of delivery, it is important that libraries understand how PromptCat will interact with the local system. For example, OSUL currently receives brief MARC bibliographic and order record information from B&T which is loaded via tape directly into the local INNOPAC system. Since PromptCat includes the option to pass through vendor information in 9xx MARC fields of the OCLC bib record, it will be important to review order/acquisition data and its location in the MARC record with the system vendor, B&T and OCLC network representatives to make certain that this crucial information can be entered and read correctly by the local system. Libraries will need to take the initiative and work closely with system vendors to insure that vendor-supplied data in PromptCat records interfaces appropriately with the local online system. Coordination and additional programming may be necessary to take advantage of the options offered by PromptCat. Because OSUL's local catalog (OSCAR) is an INNOPAC system, separate item records must be created for each physical piece and linked to appropriate bibliographic records for circulation. OSUL wants to develop the capability to create item records using a template in the local system that will allow certain default information to be provided automatically when bibliographic records are loaded. This would greatly speed up processing when books are checked-in. Again, coordination with OCLC or a network representative, the system vendor, and the library will be important during the initial set-up period. The PromptCat delivery method chosen will have an impact on library workflow. With tape or EDX delivery of PromptCat records, the library may be able to create order records immediately for acquisition purposes but editing of bibliographic records must be done directly in the local system. This may have an impact on some libraries, including OSUL, where bibliographic records are currently edited on PRISM and tapeloaded to the local catalog. Delivery using a PromptCat PRISM file will allow records to be edited on OCLC using PRISM commands and then exported when complete to the local system. However, this means order records for acquisition processing will not be available until after the title is fully cataloged. For OSUL, this alternative is unacceptable. Libraries should also be aware of the impact PromptCat will have on local patrons if records are available to the public in the local system before books have been processed and sent to the shelf.
This may be a particular problem if a patron requests an approval title which a collection manager elects to return to the vendor. Libraries may want to suppress PromptCat records from public display while materials are still in process, if they have that option in the local system. Alternatively, libraries may want to load PromptCat records with an added note alerting patrons that the item has been received on approval and is not yet available for circulation. Because the OCLC holdings symbol is set at the time the record is delivered to the library (or set at a 21-day delay), interlibrary loan requests may be generated for materials that are still in process or that might be returned to the vendor. For books that are returned to the vendor, OLUC holdings will have to be removed as quickly as possible to reflect accurate holdings and to minimize interlibrary loan requests.

Decide What Workflows Will Work Best for the Library

PromptCat can be implemented within the library's current workflows if that is desired. In other words, if acquisition and cataloging functions are presently handled separately in different departments, materials could still be checked-in and cleared for payment in acquisitions and then forwarded to cataloging for editing. The only "real" activity that PromptCat would replace is the need to search for OCLC copy since records will be supplied. However, this minimizes the advantages PromptCat can provide. If libraries choose to develop new workflows and reduce local editing to maximize PromptCat's potential, they will need to consider whether it will be acceptable to edit some records in acquisitions rather than forwarding all items to cataloging. Libraries can profile with OCLC to select what types of records will be supplied via PromptCat (e.g., DLC, CIP, member, UKM). If they determine what is considered "acceptable" copy (e.g., LC call number and subjects available on record; call number does not have to be adjusted locally), these titles can be "checked in" in acquisitions. Records that do not fall within these categories could be forwarded to cataloging for more extensive revision. Since PromptCat records include a sizable percentage of CIP copy (at OSU, 65% of test records were CIP), libraries will need to determine whether they will supply missing descriptive information in acquisitions. (At MSU, CIP records were sent to cataloging for completion during the PromptCat test. OSU is currently reviewing whether to process DLC and CIP records in acquisitions.) Libraries can further enhance the value of PromptCat if they can accept call numbers on OCLC copy without additional adjustment for the local shelflist. At MSU, records with M, ML, MT, P-PZ, and Z call numbers were sent to cataloging; otherwise the call number on incoming copy was accepted without further adjustment. OSU has also been reviewing call number policy and recently decided that LC call numbers will no longer be adjusted to maintain alphabetic shelf list order. The exception to this will be classes M, N, and P, which OSUL will continue to adjust in order to keep works by or about individual musicians, artists, or literary authors together on the shelf. While each library will evaluate its needs, it appears likely that some records will be processed quickly in acquisitions and other materials will need to be forwarded to cataloging for completion of problematic records or adjustment to reflect local practice.
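The routing decision just described can be expressed as a simple rule. The Python sketch below combines the MSU check-in criteria with OSU's class exceptions; the record fields, codes, and class list are illustrative assumptions, not an implemented policy at either library.

    # Illustrative routing rule combining the MSU check-in criteria with OSU's
    # class exceptions; field names, codes, and classes are assumptions.
    CUTTER_ADJUST_CLASSES = ("M", "N", "P")   # classes OSU continues to adjust

    def finish_in_acquisitions(record):
        """True if a PromptCat record can be completed at check-in; otherwise
        the book goes to a cataloging editor."""
        if record.get("bib_level") != "m":                  # monographs only
            return False
        if record.get("encoding_level") not in ("", "I", "8"):
            # blank = full LC, I = full member copy, 8 = CIP (upgraded at check-in)
            return False
        call_number = record.get("call_number", "")
        if not call_number:                                 # no LC call number supplied
            return False
        if call_number.startswith(CUTTER_ADJUST_CLASSES):   # cutter work needed
            return False
        if record.get("is_serial") or record.get("is_analytic") or record.get("is_set"):
            return False
        return True                                         # check in, label, shelve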
The following outlines a proposed PromptCat workflow for the OSU Libraries that would facilitate processing in acquisitions for the majority of records with acceptable copy.

• Vendor supplies OCLC with weekly updates of new titles added to their inventory. Titles are searched against the OLUC and matches are made.
• The vendor supplies OCLC with a manifest of titles to be supplied on approval to the library. OCLC retrieves the appropriate bibliographic records, sets holdings, and produces a Cataloging Report.
• PromptCat records are loaded into the local catalog (OSCAR). Order record, item record, and online invoice are generated automatically by data in the PromptCat 9xx MARC fields and load templates in the local system [13].
• Approval books are searched against the local catalog to identify pertinent information, including added volumes, duplicates, added editions, analytics, monographic series cataloged separately, etc. Search results are transcribed to approval slips placed in each title for collection manager review.
• Titles are displayed on review shelves for one week.
• Titles are pulled from shelves; INNOPAC order records are updated to show selecting location, fund, price, date of processing, etc.
• At the same time, the PromptCat bibliographic record is verified against the book in hand; additions or corrections are made if necessary. (If the record does not fall within guidelines for processing in acquisitions, the book is forwarded to a cataloging editor for completion.)
• The item record is updated for circulation data in the local system, including codes for library location, material and record type, circulation status, volume or copy number, and barcode, etc.
• Book is barcoded, labelled, magnetic stripped, and prepared for the shelf.
• Book is forwarded to Circulation for shelving or the Mail Room for shipment to departmental libraries.

Determine Staffing Needs

After the library determines how and where PromptCat materials will be handled, they will need to consider whether current staff should be reassigned to handle PromptCat processing. Staffing needs can be estimated based on three factors:

1. number of approval titles received/retained annually,
2. percentage of materials that can be handled in acquisitions, and
3. estimated amount of time per record needed to "check-in" vs. "catalog" materials.

It is estimated above that OSUL will catalog approximately 16,800 B&T approval titles per year. Copy cataloging production averages 3 titles per hour per person, but OSUL would like to move processing for a sizable majority of the PromptCat records to the "front-end" in acquisitions even though it is not certain how many titles per hour can be handled this way. Since it currently takes approximately 3 minutes for the acquisition order record to be processed, if it takes another 4 minutes to verify that the bibliographic record matches the book in hand and to update the item record, it may be possible to "check-in" books at the rate of 7 minutes per title. This would be about 8 titles per hour, substantially higher than the current copy cataloging rate. A staff member processing 7 hours per day could complete 56 titles. At 234 working days per year, one FTE staff member could process 13,104 titles annually, which is about 78% of OSUL's annual B&T approval receipts. Of course, these are only estimates, which may vary in practice depending upon how many records fit criteria for quick completion in acquisitions. For example, if CIP records are processed in acquisitions, it may take longer to revise those records.
But up to 20% of records will continue to be forwarded to cataloging for certain categories including class numbers M, N, or P, which means less time will be spent on those records in acquisitions. The advantages of approval plans are that materials come in quickly, can be reviewed prior to purchase, and can be rejected if inappropriate for the library. Since a feature of PromptCat is to set the OCLC holdings in the Union Catalog, the library should be aware of its approval return rate when deciding to use this new product. In general, a return rate of over 10% for domestic plans is considered unacceptable; a return rate of over 5% for foreign plans is costly due to overseas shipping and postage. If the library's return rate is high, it may be advisable to update the profile prior to implementing PromptCat. Otherwise, the deletion of holdings for rejected titles might be too costly and time-consuming. First, the holdings must be removed promptly so that interlibrary loan requests are not generated. Second, the library is paying a PromptCat fee for all copy provided, even for titles that are rejected. There is also staff processing time for rejected titles, preparation of credit memos, invoice adjustment, shipping, and postage on returns. In these days of shrinking technical services personnel, the number of FTE staff assigned to approval processing may be limited. When the functions of verifying the book in hand against PromptCat copy, editing copy, and preparing the book for the shelf are moved forward, it must be realized that acquisition processes may be slowed, although through-put time from beginning to end for individual titles should improve. Because payments can be delayed if approval processing is not maintained in a timely manner, the acquisition function has traditionally not had the ability to allow backlogs to develop in a manner considered more acceptable for cataloging functions. To maintain current status in acquisitions, it may be necessary to reassign personnel. Depending on the library, a forward shift of personnel from copy cataloging to PromptCat processing may be viewed as a welcome chance for cross-training, learning new skills, seeing a book through from beginning to end, and a chance to better serve patrons by providing materials more quickly. Each library has a different organizational climate, and the impact on staff and librarians of the immense changes facing technical services today should not be underestimated. Librarians and staff are very aware of the consequences of outsourcing technical services processes. The number of people employed in technical services today is far smaller than in the typical library of ten years ago [14]. The psychological implications for staff and librarians of implementing vendor-based services should be addressed openly. Ideally, staff and librarians should be given the opportunity to be reassigned or cross-train for positions where their expertise is needed and required for the processing workflows of the future [15].

SUMMARY

In summary, PromptCat is another in the new line of vendor-based services being introduced to streamline technical services processes within libraries. It provides OCLC copy for titles supplied by participating approval vendors. Since OCLC bibliographic copy is provided and OCLC holdings are set at the point approval books are shipped, the library can adjust workflow to edit copy during acquisition processing.
SUMMARY

In summary, PromptCat is another in the new line of vendor-based services being introduced to streamline technical services processes within libraries. It provides OCLC copy for titles supplied by participating approval vendors. Since OCLC bibliographic copy is provided and OCLC holdings are set at the point approval books are shipped, the library can adjust workflow to edit copy during acquisition processing. This can speed up processing materials for patrons and cut costs by minimizing the number of staff handling an item and the number of times a record must be accessed. The overall effect can be to introduce "small library" processing in a large library approval environment.

PromptCat eliminates the local search for copy by using an effective matching algorithm. While not eliminating the need for local editing and approval processing, PromptCat streamlines the process by providing copy containing the unique OCLC number and holdings at a price generally below that possible using local library staff. Preliminary estimates indicate that PromptCat may save approximately one-third of copy cataloging costs for selected approval titles.

When considering the PromptCat service, the library should ideally study local cost and workflow implications. The service provides an excellent impetus to review expensive and time-consuming local practices. The library should work closely with systems and book vendors to ensure the compatibility of the PromptCat service with the local online system. Psychological and staffing implications should also be considered. Many staff and librarians are concerned about job security and satisfaction in the face of increasingly sophisticated vendor-supplied services and the specter of outsourcing. Finally, the increased speed with which materials can be made available to patrons should be considered, especially if changes in workflow and local systems can be adopted to maximize the benefits that PromptCat can provide.

REFERENCES

1. Information on PromptCat was obtained from OCLC-supplied promotional literature and from OCLC project coordinators during the OSUL PromptCat test. At the time of writing, some libraries and vendors have postponed testing PromptCat due to programming needed for local systems.
2. Granskog, Kay. "PromptCat Testing at Michigan State University," Library Acquisitions: Practice & Theory, 18 (1994), 419-420.
3. Ibid., p. 420.
4. Ibid., p. 423.
5. Ibid., p. 425.
6. The OSUL test focused on comparing PromptCat record selection and match rates to the Libraries' current search methods. The test was performed using paper reports and was not done by adding records to a "live" system. Further programming will be required before PromptCat can be integrated into the Libraries' INNOPAC system.
7. These figures are for the PromptCat service through OHIONET. Charges for libraries within other local networks and consortia may vary.
8. In FY 1994, OSU received 18,380 Baker & Taylor approval titles.
9. Information provided by OSU Libraries Personnel office.
10. Salary and benefits based on average figures provided by the OSU Libraries Personnel and Business Offices.
11. See Dilys E. Morris, "Staff Time and Costs of Cataloging," Library Resources & Technical Services, 36 (January 1992), 79-91, for a detailed, longitudinal study of cataloging costs conducted at Iowa State University Library in 1987-90. ISUL also copy cataloged an average of 3 titles/hour and factored in staff overhead and other cataloging activities for an average per title cost of $8.18 per record over a 3-year period. It was also noted that these per title costs increased 25% over the 3-year period of the study; from $7.74 in 87/88; to $8.24 in 88/89; to $9.02 in 89/90.
12. All savings are estimates since "monograph checkin" was not tested during the OSUL PromptCat test.
Additional costs for system programming to enable local implementation of PromptCat are also not covered in this article and would vary depending on the local online system and library needs.
13. This is a hypothetical workflow because the OSUL system currently does not have the capability to create bibliographic, order, and item records simultaneously.
14. Horny, K. L. "Fifteen Years of Automation: Evolution of Technical Services Staffing," Library Resources & Technical Services, 31 (Jan.-Mar. 1987), 69-76.
15. For more detailed information on the OSUL PromptCat test, see manuscript by Mary M. Rider, "PromptCat: A Projected Service for Automatic Cataloging — Results of a Study at The Ohio State University Libraries," which is forthcoming in Cataloging & Classification Quarterly.

THE ROLE OF THE LIBRARY IN DEVELOPING THE INSTITUTIONAL REPOSITORY AT UIN SUNAN KALIJAGA YOGYAKARTA

Ida Nor'aini Hadna [1]

Abstract: This paper describes the role of the library in the development of the Institutional Repository (IR) at UIN Sunan Kalijaga Yogyakarta. The IR is a unique collection owned by an institution that needs to be maintained and preserved. In addition, an IR collection in a digital library can raise the institution's profile: the repository achieved 10th rank in Indonesia in the Webometrics repository ranking in 2012, rose to rank 5 in 2013, and rose again to rank 3 by 2014. The role of the library is therefore very important in the development of the IR. The activities undertaken by the library are (1) collecting; (2) managing; (3) preserving; (4) evaluating; and (5) promoting.

Keywords: Institutional Repository (IR), role of libraries

A. Introduction

The term Institutional Repository (IR) refers to the activity of gathering and preserving digital collections that constitute the intellectual output of a particular community.[2] In line with this definition, Clifford Lynch, as cited in Bevan,[3] states that an IR is a set of services offered by a university to the members of its community for the management and dissemination of digital materials created by the institution and its community members. Intellectual works constitute highly valuable scholarly information owned by an individual or an institution. This intellectual wealth needs to be preserved, both physically and in terms of its content. In addition, these works need to be well managed so that they can be widely accessed. According to Arianto,[4] the ability to gather the abundant local content information sources and make them accessible to the global community is the main challenge faced by information professionals. What is commonly done today for the storage (repository) and preservation of an institution's intellectual works, and for presenting them so that they can be widely accessed, is the digital library.

[1] Librarian, UIN Sunan Kalijaga Library, Yogyakarta
[2] Putu Laxman Pendit, Perpustakaan Digital dari A Sampai Z (Jakarta: Cita Karyakarsa Mandiri, 2008), 137; Perpustakaan Digital: Kesinambungan dan Dinamika (Jakarta: Cita Karyakarsa Mandiri, 2009), 50
[3] Bevan, Simon J. 2007. "Developing an institutional repository: Cranfield QUEprints – a case study", OCLC Systems & Services, Vol. 23 Iss: 2, pp. 170-182. DOI (Permanent URL): 10.1108/10650750710748478
According to Pendit,[5] the role of the digital library in the context of the IR was initially a matter of discussion and debate, but in the end the IR phenomenon found its home in the digital library, which carries on the 'spirit' of librarianship as a gatherer of knowledge that can be trusted by the community that uses that knowledge. According to Arianto,[6] the rationale driving the management and development of local content that is then published as an IR is (1) to improve the reputation and ranking of the institution concerned and to sustain the institutional repository for long-term access (digital preservation); (2) to enable wider access; and (3) to increase the visibility of the authors. Digital preservation in a digital library is a planned and managed activity to ensure that digital materials can continue to be used for as long as possible.[7] Pendit further notes that preservation is essentially an effort to maintain cultural and intellectual resources so that they can be used for as long as possible.[8]

[4] M. Solihin Arianto, "Diseminasi Informasi: Strategi Pengelolaan Local Content", paper presented at the National Seminar Diseminasi Informasi Local Content: Peluang dan Tantangan dari Sudut Pandang Cyberlaw, organized by the UNS Solo Library, 18 June 2014
[5] Pendit, Op.Cit., 2009: 51 & 54
[6] Arianto, Op.Cit.
[7] Pendit, Op.Cit., 2008: 248; 2009: 111
[8] Ibid., 2008: 248

The IR is a unique and valuable collection owned only by the institution in question. The collection therefore needs to be cared for and preserved so that it can be accessed for as long as possible. This paper briefly discusses the role of the library in managing the IR at UIN Sunan Kalijaga Yogyakarta. To give a clearer picture, a short history of the establishment of the UIN Sunan Kalijaga digital library is presented first.

B. Managing the IR at the UIN Sunan Kalijaga Library

As an academic library, the UIN Sunan Kalijaga Library, as set out in SNI 7330:2009 article 2.19 on academic libraries, has the primary purpose of meeting the information needs of its lecturers and students, although it is also open to the public. Furthermore, article 5.2 on collection types states that one of the collection types held by an academic library is the institution's own publications; that is, the library provides the publications of the institution concerned, including publications of research institutes, students' final projects, lecturers' works, and works related to that institution.[9]

Like other universities, UIN Sunan Kalijaga has required students completing their studies to deposit their final projects. Before 2003 (when the institution was still IAIN), final projects were deposited in the library only in hard copy/printed form. Since 2003, the UIN Sunan Kalijaga Library has required students registering for graduation to submit their final projects in both hard copy and soft copy. The motivation at the time was the increasingly limited display space for final projects and the need to back up final project data in digital form.

[9] Badan Standarisasi Nasional, Standar Nasional Indonesia Perpustakaan Perguruan Tinggi, SNI 7330:2009
The devastating Yogyakarta earthquake of 27 May 2006 made the library even more aware of the need to safeguard its collection. In a disaster-prone area, efforts must be made to store the collection in a form that can, as far as possible, withstand natural disasters such as earthquakes, floods, and fires. Digital form is the best alternative for facing this disaster-prone situation. Through storage in digital form, the highly valuable works of the academic community can be kept safe. In 2007 the leadership of UIN Sunan Kalijaga began building a digital library to manage this collection. At that time the digital library was designed to collect all of UIN Sunan Kalijaga's institutional repository materials, including photographs of activities, rector's speeches, lecturers' articles, examination questions, and journals published within UIN Sunan Kalijaga. The Library began searching for open source software that suited the needs of the planned digital library. The choice at the time fell on GDL (Ganesha Digital Library). Through this digital library, the IR collection could be accessed much more widely, rather than being enjoyed only by members of the university's own academic community.

In line with the rapid development of information technology, the library rethought and reorganized its digital library. This was because the existing digital library could no longer cope as the number of items to be uploaded grew and users demanded easier services. As for the open source software in use, its managers found it difficult to develop it further or to reach its developers. In May 2012 the UIN Sunan Kalijaga digital library therefore switched to another open source package, EPrints, in the hope of further improving the quality of service to users. The switch to EPrints turned out to be the right policy. One year after this transition, the UIN Sunan Kalijaga digital library stood 10th in Indonesia in the Webometrics repository ranking. The ranking continued to rise as the library's digitization of various scholarly works increased. In August 2013 the UIN Sunan Kalijaga digital repository rose to 5th place in Indonesia, 8th in Southeast Asia, and 27th in Asia. In July 2014 the UIN Sunan Kalijaga digital library rose to 3rd place in Indonesia, 6th in Southeast Asia, and 18th in Asia.[10]
Pengumpulan IR Sebagaimana telah disampaikan di atas bahwa IR selain berisi tentang tugas akhir mahasiswa juga berisi tentang foto-foto kegiatan, pidato rektor, artikel sivitas akademika, soal ujian, jurnal yang diterbitkan di lingkungan UIN Sunan Kalijaga, dan lain-lain. Pengumpulan tugas akhir dilakukan sendiri oleh mahasiswa sebagai salah satu syarat pendaftaran wisuda.. Format file tugas akhir yang diserahkan ke perpustakaan harus sesuai dengan standar yang telah disahkan oleh Pembantu Rektor Bidang Akademik. Standar tersebut antara lain adalah format dalam bentuk pdf, harus ada lembar pengesahan yang sudah disahkan, bookmark, dan adaabstrak. Sebelum tahun 2013 file diserahkan dalam bentuk CD (compact disc), tetapi dengan berbagai pertimbangan, sejak akhir tahun 2013 mahasiswa langsung menyerahkan flashdisc kepada petugas untuk disimpan filenya di komputer petugas. Selanjutnya, adanya pertimbangan terjadinya penumpukan mahasiswa pada akhir pendaftaran wisuda, maka perpustakaan melakukan evaluasi agar dapat memberikan layanan yang lebih baik kepada mahasiswa. Perpustakaan bekerjasama dengan unit PTIPD (Pusat Teknologi Informasi dan Pangkalan Data) membuatsistem 10 http://repository.webometrics.info/en/Asia/Indonesia, diunduh tgl 23 September 2014 18 Ida N. Hadna, PERAN PERPUSTAKAAN DALAM... bebas pustaka on line dan up load mandiri tugas akhir. Melalui sistem yang diberlakukan sejak bulan Agustus 2014, maka seluruh mahasiswa yang akan wisuda, baik mahasiswa D3, S1, maupun mahasiswa Program Pascasarjana wajib mengupload tugas akhirnya secara mandiri di server perpustakaan. Setelah itu, bagian repository yang akan menguploadnya ke digital library. Ketika mengumpulkan soft copy tugas akhir, mahasiswa diminta mengisi form penyerahan dan pemberian izin kepada perpustakaan untuk mempublikasikan tugas akhirnya sesuai dengan ketentuan dan kebijakan yang berlaku di Perpustakaan UIN Sunan Kalijaga. Sementara itu, pengumpulan IR selain tugas akhir, harus dilakukan sendiri oleh perpustakaan. Pustakawan terutama di bagian digital repository harus aktif mencari dan mengumpulkan materi-materi seperti makalah dosen/peneliti/pegawai, pidato rektor, dokumentasi foto/film tentang UIN, hasil-hasil penelitian, dan lain-lain. Keragaman content dalam digital library ini juga menjadi salah satu faktor pendukung dalam penilaian peringkat ranking webometrics. Oleh karena itu, kegiatan berburu naskah IR oleh perpustakaanini menjadi kegiatan yang tidak dapat diabaikan. b). Pengelolaan IR Setelah IR diserahkan ke perpustakaan, maka selanjutnya bagian Repositori perpustakaan akan mengelolanya agar dapat diakses oleh pemustaka. Tugas akhir yang tercetak setelah diterima oleh bagian informasi, akan dikirim ke bagian Pengembangan Koleksi untuk distempel, diinventaris, dan diinput datanya. Selanjutnya oleh bagian Pengembangan Koleksi akan dikirim ke bagian Pengolahan untuk diklasifikasi berdasarkan fakultas, diberi label, kemudian dikirim ke bagian Referensi untuk siap dilayankan kepada pemustaka. Sementara itu, file soft copy tugas akhir yang telah diserahkan akan segera dikelola oleh pustakawan. File akan dipecah, bab 1 dan bab 5 (terakhir) adalah bab yang bisa diakses dan didownloadsecara fullteks, sedangkan bab 2, 3, dan bab 4 tidak dapat diakses. Pada tahun 2012 Perpustakaan UIN Sunan Kalijaga mengajak mahasiswa part time perpustakaan untuk membantu mengelola tugas akhir untuk diupload ke dalam digital library. Hal ini dilakukan bersamaan dengan 19 Pustakaloka, Vol. 6. No.1 Tahun 2014 peralihan dari GDL ke EPrints. 
Agar file tugas akhir yang begitu banyak tersebut dapat segera diakses oleh pemustaka dengan EPrints, maka pengelolaannya dibantu oleh mahasiswa part time. Kerja keras ini akhirnya membuahkan hasil dengan masuknya perpustakaan digital UIN Sunan Kalijaga ke dalam rangking webometrics. Pengelolaan selain tugas akhir seperti foto-foto kegiatan, pidato rektor, jurnal terbitan fakultas, makalah dosen, dan lain-lain adalah dengan menguploadsecara fulltekske dalam digital library. Jika masih dalam bentuk tercetak, maka discan terlebih dahulu kemudian filenya diupload ke digital library. c). Pelestarian IR Menurut Pendit11 ada dua hal yang perlu diperhatikan dalam kegiatan pelestarian digital, yaitu: (1) media penampungnya harus tahan lama (CD-Rom, tape, disk); (2) format isinya juga harus tahan lama dalam arti dapat terus terbaca (PDF, TIFF, JPEG). IR Perpustakaan UIN Sunan Kalijaga disimpan dalam hard disc local server dan hard disc external, sedangkan format isi dalam bentuk PDF dan JPEG. Melalui format PDF, maka content dapat terjaga keasliannya. d). Evaluasi IR Salah satu tantangan dalam pengembangan IR adalah peningkatan jumlah koleksi12. Hal ini menjadi tugas pustakawan untuk aktif mencari dan menghimpun koleksi dari seluruh sivitas akademika. Pengukuran ranking webometrics repository didasarkan pada: size (S) yaitu jumlah halaman yang ditemukan dalam mesin pencari google; visibility (V) yaitu jumlah tautan eksternal); rich files (R) yaitu volume file dalam bentuk Adobe Acrobat (.pdf), MS Word (doc, docx), MS Powerpoint (ppt, pptx) and PostScript (.ps & .eps) dalam mesin pencari google; dan scholar(Sc) yaitu makalah ilmiah dan kutipan (http://repositories:webometrics.info/en/Methodology). Perpustakaan UIN Sunan Kalijaga yang telah menduduki peringkat ketiga se-Indonesia dalam webometrics repositorypada tahun 2014 memiliki tanggung jawab yang cukup berat untuk dapat menjaga 11 Pendit, Op.Cit., 2009:114 12 Ibid., 2008:141 20 Ida N. Hadna, PERAN PERPUSTAKAAN DALAM... atau bahkan meningkatkan peringkat tersebut. e). Promosi IR Berdasarkan evalusi seperti tersebut di atas, maka pustakawan perlu terus- menerus melakukan promosi kepada sivitas akademika untuk mengakses dan mengumpulkan hasil karyanya ke perpustakaan. Promosi yang dilakukan perpustakaan untuk mengenalkan digital library UIN Sunan Kalijaga antara lain melalui penyampaian materi dalam pendidikan pemakai perpustakaan bagi seluruh mahasiswa baru, baik mahasiswa D3, S1, S2, maupun S3, road show ke fakultas dan unit-unit terkait di lingkungan universitas. Promosi juga diberikan kepada para tamu dari lembaga atau institusi yang melakukan kunjungan ke perpustakaan untuk studi banding. D. Kendala Pengembangan Perpustakaan Digital Dalam pengembangan perpustakaan digital, Perpustakaan UIN Sunan Kalijaga menghadapi beberapa permasalahan, antara lain: 1. Masih banyak skripsi terbitan sebelum tahun 2007 yang belum ada soft filenya sehingga perlu didigitalkan, sedangkan anggaran untuk digitalisasi terbatas. 2. Kurang lancarnya hubungan perpustakaan dengan fakultas atau unit dalam penyerahan local content. Hal ini antara lain masih kurangnya kesadaran sivitas akademika untuk menyerahkan hasil karyanya kepada perpustakaan, baik untuk diupload agar dapat diakses secara luas maupun untuk penyimpanan dokumen/ preservasi. 3. Masih banyak sivitas akademika yang belum mengetahui digital library UIN Sunan Kalijaga. 
Hal ini antara lain diketahui ketika penulis bertanya kepada para peserta seleksi mahasiswa part time tentang akses skripsi, maka sebagian besar hanya mengakses melalui skripsi tercetak yang ada di perpustakaan. Ada beberapa yang mengetahui alamat digilib.uin-suka.ac.id tetapi tidak mengetahui fungsinya. Dalam road show yang dilakukan oleh perpustakaan ke fakultas juga terungkap masih banyak dosen yang belum mengetahui tentang digilib ini. 21 Pustakaloka, Vol. 6. No.1 Tahun 2014 E. Solusi Pengembangan Perpustakaan Digital Kendala-kendala di atas bisa diselsaikan dengan melakukan langkah-langkah berikut: 1. Agar skripsi lama yang masih dalam bentuk tercetak dapat segera selesai didigitalkan, maka diperlukan penambahan tenaga untuk melakukannya. Hal ini karena tenaga di bagian Repositori yang berjumlah 3 orang tidak dapat memenuhi pekerjaan tersebut. Peran serta mahasiswa dan seluruh staf perpustakaan di luar jam kerja akan sangat membantu penyelesaian pekerjaan ini. Oleh karena itu, diperlukan anggaran yang cukup. 2. Membangun kesadaran sivitas akademika untuk mau menyerahkan hasil karyanya untuk disimpan dan diupload di digital library menjadi tugas yang penting dilakukan oleh perpustakaan. Keuntungan penyimpanan hasil karya ilmiah ke digital library dari segi keselamatan dokumen dari bencana serta keterjangkauan akses informasi yang luas ke pemustaka perlu disampaikan kepada sivitas. Sosialisasi kegiatan ini dapat dilakukan oleh perpustakaan melalui kegiatan road show ke seluruh fakultas. Hal lain yang perlu dilakukan oleh perpustakaan adalah mengusulkan kepada pimpinan universitas untukmembuat surat keputusan tentang wajib serah simpan karya sivitas ke perpustakaan.Selain itu, kerja sama dengan unit terkait dalam kenaikan pangkat dan jabatan fungsional dosen, peneliti, pustakawan, dan lain-lain untuk menyerahkan hasil karyanya ke perpustakaan. Demikian juga dengan hasil penelitian yang didanai oleh pemerintah/universitas wajib diserahkan ke perpustakaan. 3. Agar koleksi IR dapat diakses secara luas, maka perlu sosialisasi atau promosi, baik kepada sivitas maupun kepada masyarakat umum. Sosialisasi dapat dilakukan antara lain melalui kegiatan user education bagi mahasiswa baru, road show ke fakultas dan unit terkait, web, spanduk, dan lain-lain. F. Kesimpulan Dari uraian sebelumnya dapat disimpulkan sebagai berikut: 1. Koleksi IR yang merupakan koleksi unik dan khusus yang dimiliki 22 Ida N. Hadna, PERAN PERPUSTAKAAN DALAM... oleh suatu lembaga akan semakin bermanfaat jika dapat diakses lebih luas oleh berbagai pihak. Oleh karena itu, kerjasama dari berbagai lembaga untuk dapat mengakses IR dari masing-masing lembaga yang tergabung di dalamnya (seperti portal Garuda DIKTI) perlu ditingkatkan lagi. 2. Kepemilikan IR akan meningkatkan citra lembaga, misalnya melalui pencapaian peringkat dalam webometrics. 3. Perpustakaan perlu memiliki SDM yang mampu menangani teknologi informasi dengan baik agar perpustakaan dapat mandiri dalam mengelola otomasi dan perpustakaan digitalnya. DAFTAR PUSTAKA Arianto, M. Solihin. 2014. “Diseminasi Informasi: Strategi Pengelolaan Local Content”.Makalah Seminar Nasional Diseminasi Informasi Local Content: Peluang dan tantangan dari Sudut Pandang Cyberlaw, diselenggarakan oleh Perpustakaan UNS Solo pada tanggal 18 Juni 2014 di kampus UNS. Badan Standarisasi Nasional,Standar Nasional Indonesia Perpustakaan Perguruan Tinggi, SNI 7330:2009, Bevan, Simon J. 2007. 
Bevan, Simon J. 2007. "Developing an institutional repository: Cranfield QUEprints – a case study", OCLC Systems & Services, Vol. 23 Iss: 2, pp. 170-182. DOI (Permanent URL): 10.1108/10650750710748478.
http://repository.webometrics.info/en/Asia/Indonesia, retrieved 23 September 2014.
Indonesia. 2007. Undang-undang Republik Indonesia Nomor 43 Tahun 2007 tentang Perpustakaan. Jakarta: Perpustakaan Nasional RI.
Pendit, Putu Laxman. Perpustakaan Digital dari A Sampai Z. Jakarta: Cita Karyakarsa Mandiri, 2008.
_________________. Perpustakaan Digital: Kesinambungan dan Dinamika. Jakarta: Cita Karyakarsa Mandiri, 2009.

D-Lib Magazine
April 2003
Volume 9 Number 4
ISSN 1082-9873

Trends in the Evolution of the Public Web, 1998 - 2002

Edward T. O'Neill, Brian F. Lavoie, Rick Bennett
OCLC Office of Research

Introduction

The swiftness of the World Wide Web's ascension from obscure experiment to cultural icon has been truly remarkable. In the space of less than a decade, the Web has extended into nearly every facet of society, from commerce to education; it is employed in a variety of uses, from scholarly research to casual browsing. Like other transformational technologies that preceded it, the Web has spawned (and consumed) vast fortunes. The recent "dot-com bust" was a sobering indication to organizations of all descriptions that the nature and extent of the Web's impact is still unsettled.

Although the Web is still a work in progress, it has accumulated enough of a history to permit meaningful analysis of the trends characterizing its evolution. The Web's relatively brief history has been steeped in predictions about the direction of its future development, as well as the role it will ultimately play as a communications medium for information in digital form. In light of the persistent uncertainty that attends the maturation of the Web, it is useful to examine some of the Web's key trends to date, in order both to mark the current status of the Web's progression and to inform new predictions about future developments. This article examines three key trends in the development of the public Web — size and growth, internationalization, and metadata usage — based on data from the OCLC Office of Research Web Characterization Project [1], an initiative that explores fundamental questions about the Web and its content through a series of Web samples conducted annually since 1998.

II. Characterizing the Public Web

In 1997, the OCLC Office of Research initiated a project aimed at answering fundamental questions about the Web: e.g., how big is it? what does it contain? how is it evolving? The project's objective was to develop and implement a methodology for characterizing the size, structure, and content of the Web, making the results available to both the library community and the public at large.

The strategy adopted for characterizing the Web [2] was to harvest a representative sample of Web sites, and use this sample as the basis for calculating estimates and making inferences about the Web as a whole. Using a specially configured random number generator, a 0.1% random sample of IP (Internet Protocol) addresses was taken from the IPv4 (32-bit) address space. For each of these IP addresses, an HTTP connection attempt was made on Port 80, the standard port for Web services.
An IP address identified a Web site if it returned an HTTP response code of 200 and a document in response to the connection attempt. Each Web site identified in the sample was harvested using software developed at OCLC. Following the collection of the Web sites, several diagnostic tests were applied to identify sites duplicated at multiple IP addresses. This yielded an estimate of the total number of unique Web sites. Finally, the set of unique Web sites from the sample was analyzed to identify public Web sites. A public Web site offers to all Web users free, unrestricted access to a significant portion of its content. Public Web sites may also contain restricted portions, but in order for the site to be considered public, a non-trivial amount of unrestricted content must be available as well. Completion of this analysis yielded a representative sample of unique public Web sites. The set of all public Web sites is called the public Web. It is this portion of the Web that is the most visible and readily accessible to the average Web user. The public Web is the focus of the analysis and discussion in this paper.

After a pilot survey in 1997, the project conducted five consecutive surveys (1998 - 2002) of the Web, based on the sampling methodology described above.
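The project's harvesting software is not published in the article; the following is a minimal sketch of the core sampling step as described above — draw random IPv4 addresses, attempt an HTTP connection on port 80, and count an address as a Web site if it returns a 200 response with a document. Error handling is reduced to the essentials, and the later harvesting, deduplication, and public-site analysis are omitted:

    # Sketch of the sampling step described in the text: random IPv4 addresses
    # are probed on port 80 for an HTTP 200 response carrying a document.
    import random
    import urllib.request

    def is_web_site(ip: str, timeout: float = 5.0) -> bool:
        try:
            with urllib.request.urlopen(f"http://{ip}/", timeout=timeout) as resp:
                return resp.status == 200 and len(resp.read(1)) > 0
        except Exception:
            return False

    sample = [
        ".".join(str(random.randint(0, 255)) for _ in range(4))
        for _ in range(1000)   # the real survey sampled 0.1% of the IPv4 space
    ]
    hits = sum(is_web_site(ip) for ip in sample)
    print(f"{hits} of {len(sample)} sampled addresses answered as Web sites")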
III. Size and Growth

According to the results of the Web Characterization Project's most recent survey, the public Web, as of June 2002, contained 3,080,000 Web sites, or 35 percent of the Web as a whole. Public sites accounted for approximately 1.4 billion Web pages. The average size of a public Web site was 441 pages.

Is the public Web remarkable by virtue of its size? By at least one account, the answer is no — Shapiro and Varian [3] recently estimated that the static HTML text on the Web was equivalent to about 1.5 million books. They compared this figure to the number of volumes in the University of California at Berkeley Library (8 million), and, noting that only a fraction of the Web's information can be considered "useful", concluded that "the Web isn't all that impressive as an information resource."

But Shapiro and Varian's assessment seems harsh. The Web encompasses digital resources of many varieties beyond plain text, often combined and re-combined into complex multi-media information objects. To assess the Web's size based solely on static text is to ignore much of the information on the Web. Furthermore, many Web analysts now recognize the distinction between the "surface Web" and the "deep Web". While this terminology suffers from different shades of meaning in different contexts, the surface Web can be interpreted as the portion of the Web that is accessible using traditional crawling technologies based on link-to-link traversal of Web content. This approach is used by most search engines in generating their indexes. The deep Web, on the other hand, consists of information that is inaccessible to link-based Web crawlers: in particular, dynamically-generated pages created in response to an interaction between site and user. For example, online databases that generate pages based on query parameters would be considered part of the deep Web. Although an authoritative estimate of the size of the deep Web is not available, it is believed to be large and growing [4].

In another study, Varian and Lyman [5] estimate that in 2000, the surface Web accounted for between 25 - 50 terabytes of information, based on the assumption that the average size of a Web page is between 10 and 20 kilobytes. However, Varian and Lyman make no distinction between public and other types of Web sites. Combining their estimate with results from the 2000 Web Characterization Project survey, and assuming that Web sites of all types are, on average, the same size in terms of number of pages, 41 percent of the surface Web, or between 10 - 20 terabytes, belonged to the public Web in 2000. In comparison, the surface Web in 2002 accounted for 14 - 28 terabytes (combining the page count from the Web Characterization Project's 2002 survey with an average Web page size of between 10 - 20 KBs). Varian and Lyman estimate that a 300-page, plain text book would account for 1 MB of storage space. This in turn implies that as of June 2002, the information on the surface Web was roughly equivalent in size to between 14 and 28 million books. This would seem to suggest that the Web is indeed an information collection of significant proportions: consider that in 2001, the average number of volumes held by the 112 Association of Research Libraries (ARL)-member university libraries [6] was approximately 3.7 million. The largest number of volumes, held by Harvard University, was just under 15 million.
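The back-of-the-envelope conversion behind these figures is simple enough to state directly; the following is a minimal sketch using only the assumptions quoted above (1.4 billion public pages, 10-20 KB per page, 1 MB per 300-page book):

    # Book-equivalent of the 2002 public Web, using the stated assumptions.
    pages = 1.4e9                      # public Web pages, June 2002 survey
    kb_per_page = (10, 20)             # assumed average page size range
    mb_per_book = 1                    # Varian and Lyman: 300-page book ~ 1 MB

    for kb in kb_per_page:
        terabytes = pages * kb / 1e9               # KB -> TB (decimal units)
        books = pages * kb / 1e3 / mb_per_book     # KB -> MB -> books
        print(f"{kb} KB/page: ~{terabytes:.0f} TB, ~{books/1e6:.0f} million books")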
The conclusion, however, that the Web is equivalent to, or perhaps even surpasses, the largest library collections is probably unwarranted. A significant percentage of the surface Web is taken up by "format overhead" — for example, HTML or XML tagging. In addition, Shapiro and Varian's point that a significant portion of the information on the Web is not generally useful cannot be dismissed lightly. What is probably most remarkable about the size of the Web is how rapidly it rose from relatively insignificant proportions to a scale at least comparable to that of research library collections. A widely-cited estimate [7] placed the size of the Web as a whole in 1996 at about 100,000 sites. Two years later, the Web Characterization Project's first annual survey estimated the size of the public Web alone to be nearly 1.5 million sites. By 2000, the public Web had expanded to 2.9 million sites, and two years later, in 2002, to over 3 million sites. In the five years spanning the series of Web Characterization Project surveys (1998 - 2002), the public Web more than doubled in size.

Figure 1: Number of Public Web Sites, 1998 - 2002

But this impressive overall growth masks an important trend: that the public Web's rate of growth has been slowing steadily over the five-year period covered by the Web Characterization Project surveys. Examination of year-on-year growth rates (measured in terms of the number of Web sites) for the period 1998 - 2002 reveals this decline: between 1998 and 1999, the public Web expanded by more than 50 percent; between 2000 and 2001, the growth rate had dropped to only 6 percent, and between 2001 and 2002, the public Web actually shrank slightly in size. Most of the growth in the public Web observed during the five years covered by the surveys occurred in the first three years of the survey (1998 - 2000). In 1998, the public Web was a little less than half its size in 2002; by 2000, however, it was about 96 percent of its size in 2002.

Figure 2: Public Web Year-on-Year Growth Rates

The slowdown in growth of the public Web is even more dramatically evident in absolute terms. Between 1998 and 1999, the public Web exhibited a net growth of 772,000 sites; a similar number (713,000) were added between 1999 and 2000. After this point, however, absolute growth dropped off precipitously: between 2000 and 2001, only 177,000 new public Web sites were added, and between 2001 and 2002, the public Web shrank by 39,000 sites.

The evidence suggests, then, that the public Web's growth has stagnated, if not ceased altogether. What factors can explain this? One key reason is simply that the Web is no longer a new technology: those who wish to establish a Web presence likely have already done so. In this sense, the rush to "get online" witnessed during the Web's early years has likely been replaced with a desire to refine and develop existing Web sites. Indeed, estimates from the Web Characterization Project's June 2002 data suggest that while the public Web, in terms of number of sites, is getting smaller, public Web sites themselves are getting larger. In 2001, the average number of pages per public site was 413; in 2002, that number had increased to 441.

In addition to a slower rate of new site creation, the rate at which existing sites disappear may have increased. Analysis of the 2001 and 2002 Web sample data suggests that as much as 17 percent of the public Web sites that existed in 2001 had ceased to exist by 2002. Many of those who created Web sites in the past have apparently determined that continuing to maintain the sites is no longer worthwhile. Economics is one motivating factor for this: the "dot-com bust" resulted in many Internet-related firms going out of business; other companies scaled back or even eliminated their Web-based operations [8]. Other analysts note a decline in Web sites maintained by private individuals — the so-called "personal" Web sites. Some attribute this decline to the fact that many free-of-charge Web hosting agreements are now expiring, and individuals are unwilling to pay fees in order to maintain their site [9].

In sum, the dual effect of fewer entities creating Web sites, combined with entities discontinuing or abandoning existing Web sites, combines to dampen the public Web's rate of growth. This, of course, is not to say that the public Web is definitively shrinking: by other measures, e.g., the number of pages, or number of terabytes, it may in fact be growing. But in terms of the number of Web sites, which is roughly equivalent to the number of individuals, organizations, and business entities currently maintaining a presence on the Web in the form of a public site, the numbers suggest that growth in the public Web, at least for the time being, has reached a plateau.

IV. Internationalization

As its name suggests, the World Wide Web is a global information resource in the sense that anyone, regardless of country or language, is free to make information available in this space. Ideally, then, the Web's content should reflect the international community at large, originating from sources all over the world, and expressed in a broad range of languages. In 1999, the second year of the Web Characterization Project survey, the public Web sites identified in the sample were traced back to entities — individuals, organizations, or business concerns — located in 76 different countries, suggesting that the Web's content at that time was fairly inclusive in terms of the global community. A closer examination of the data, however, belies this conclusion. In fact, approximately half of all public Web sites were associated with entities located in the United States.
No other country accounted for more than 5 percent of public Web sites, and only eight countries, apart from the US, accounted for more than 1 percent. Clearly, in 1999, the Web was a US-centric information space.

Figure 3: Distribution of Public Web Sites By Country, 1999

Three years later, little has changed. The proportion of public Web sites originating from US sources actually increased slightly in 2002, to 55 percent, while the proportions accounted for by the other leading countries remain roughly the same. In 2002, as in 1999, the sample contained public sites originating from a total of 76 countries. These results suggest that the Web is not exhibiting any discernable trend toward greater internationalization.

Figure 4: Distribution of Public Web Sites By Country, 2002

This conclusion is reinforced when the language of textual content is considered. Given the fact that more than half of all public sites originate from US sources, it is easy to predict that English is the most prevalent language on the Web. But to what degree does this dominance extend? How has it evolved over time? Examination of the 1999 and 2002 Web Characterization Project survey data provides insight into these questions.

In 1999, 29 different languages were identified among the sample of public Web sites included in the survey, which, taken at face value, suggests that the Web's text-based content is fairly diverse in terms of the languages in which it is expressed. But, as with the geographical origins of public Web sites, the raw total of languages represented in the public Web overstates the degree of internationalization actually achieved. Data from 1999 indicate that nearly three-quarters of all public Web sites expressed a significant portion of their textual content in English. The next most frequently encountered language was German, which appeared on about 7 percent of the sites. Only seven languages, apart from English, were represented on 2 percent or more of the public Web sites identified in the survey.

Figure 5: Relative Frequency of Languages, 1999 (Percent of Public Sites)

Not all sites represent their textual content in a single language: in 1999, for example, 7 percent of the public Web sites identified in the sample presented textual content in multiple languages. Interestingly, however, in each instance where textual content was offered in more than one language, English was, without exception, one of the choices.
Comparison of the Web to a single library collection is problematic, because the latter reflects a collection development strategy unique to a single institution. However, comparison of the Web to an aggregation of many library collections will tend to average out the idiosyncratic features of particular collections, offering a more meaningful comparison to the Web. WorldCat® (the OCLC Online Union Catalog) is the world's largest bibliographic utility, representing content held by libraries all over the world, but predominantly from the US. As of July 2001, WorldCat contained about 45 million bibliographic records [10]. Of these, about 63 percent were associated with English-language content. German and French were the next most common languages, at 6 percent each; Spanish was 4 percent, and Chinese, Japanese, Russian, and Italian were each 2 percent. All other languages were 1 percent or less. This distribution is very similar to that of public Web sites, both in terms of the shape of the distribution (heavily skewed toward English, then immediately dropping off to a long, thin tail), as well as the relative frequency ranking of languages. This suggests that the Web and library collections exhibit roughly the same degree of internationalization in terms of language of textual content. V. Metadata Usage Libraries serve as more than just repositories of information. In addition, the information is organized and indexed to facilitate searching and retrieval. A complaint that has often been made about the Web is that it lacks this organization. Searching is done using "brute force" methods such as keyword indexing, often without context or additional search criteria. Some improvements have been made from the earliest days of the Web: the search engine Google, for example, employs relatively sophisticated algorithms that rank search results based on linkage patterns and popularity. Librarians achieve their organization through the careful preparation and maintenance of bibliographic data — i.e., descriptive information about the resources in their collections. More generally, this descriptive information is called metadata, or "data about data". A movement has been underway for some time to introduce metadata into the Web, most notably through the Dublin Core Metadata Initiative [11]. Has any significant progress been made in this regard? Metadata for Web resources is typically implemented with the META tag, which can be used by creators to embed any information deemed relevant for describing the resource. The META tag consists of two primary components: NAME, which identifies a particular piece of metadata (keyword, author, etc.) and CONTENT, which instantiates, or provides a value for, the metadata element identified in the NAME attribute. Using the data from all five Web Characterization Project surveys, it was possible to examine trends in metadata usage on the public Web over the past five years. The purpose of the analysis was simply to detect the presence of any form of metadata, implemented using the META tag, on public Web sites. Analyzing the public sites collected in the samples between 1998 and 2002 revealed several important characteristics about metadata usage on the Web. First, it seems clear that metadata usage is on the rise: steady increases in the percent of public Web sites containing metadata on the home page (where metadata usage is most common) are observed throughout the five-year period. 
Similar increases were observed in the percentage of all Web pages harvested from public sites that contained some form of metadata. One caveat should be mentioned, however: with the advent of more sophisticated HTML editors, some META tags are created and populated automatically as part of the document template. It is likely that this accounts for at least part of the perceived increase in META tag usage on public Web sites.

A second interesting feature about metadata usage on the Web is that, apparently, it is not becoming more detailed. If it is assumed that one META tag is equivalent to one metadata element, or piece of descriptive information about the Web resource, then it is clear that, on average, Web pages that include metadata contain about two or three elements. Clearly, there is no widespread movement to include detailed description of Web resources on the public Web.

Table 1: Metadata Usage on Public Web Sites, 1998 - 2002

                                         1998       1999       2000       2001       2002
  No. of Public Sites               1,457,000  2,229,000  2,942,000  3,119,000  3,080,000
  % of Sites with Metadata
    on Home Page                           70         79         82         85         87
  Mean No. of Tags (sites)               2.75       2.32       2.72       2.97       3.14
  % of Pages with Metadata                 45         50         59         62         70
  Mean No. of Tags (pages)               2.27       2.16       2.46       2.63       2.75

A discouraging aspect of metadata usage trends on the public Web over the last five years is the seeming reluctance of content creators to adopt formal metadata schemes with which to describe their documents. For example, Dublin Core metadata appeared on only 0.5 percent of public Web site home pages in 1998; that figure increased almost imperceptibly to 0.7 percent in 2002. The vast majority of metadata provided on the public Web is ad hoc in its creation, unstructured by any formal metadata scheme.
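As an illustration of this distinction, the fragment below contrasts the kind of ad hoc META tags the surveys typically found with the same description expressed in Dublin Core's conventional HTML encoding; the page and its values are invented for the example:

    <!-- Ad hoc metadata: NAME/CONTENT pairs following no formal scheme -->
    <meta name="keywords" content="web, surveys, metadata">
    <meta name="description" content="Annual sample surveys of the public Web">

    <!-- The same description using the Dublin Core element set -->
    <meta name="DC.title" content="Annual sample surveys of the public Web">
    <meta name="DC.creator" content="Example Author">
    <meta name="DC.date" content="2002-06">
    <meta name="DC.subject" content="World Wide Web; metadata">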
VI. Conclusion

In this paper, three key trends in the evolution of the public Web over the last five years were examined, based on five annual surveys conducted by OCLC's Web Characterization Project. The results of these surveys indicate that the public Web is an information collection of significant proportions, exhibiting a remarkable pattern of growth in its short history. But evidence suggests that growth in the public Web, measured by the number of Web sites, has reached a plateau. The annual rate of growth of the public Web has slowed steadily throughout the five-year period covered by the Web Characterization Project surveys; over the last year, the public Web shrank slightly in size.

A second trend concerned the internationalization of the public Web. The Web has been positioned as a global information resource, but analysis indicates that the public Web is dominated by content supplied by entities originating in the US. Furthermore, the vast majority of the textual portion of this content is in English. There are no signs that this US-centric, English-dominated distribution of content is shifting toward a more globalized character.

Finally, examination of metadata usage on the public Web over the five-year span of the Web Characterization Project surveys indicates that little if any progress is being made to effectively describe Web-accessible resources, with predictable results for ease of search and retrieval. Although metadata usage (via the HTML META tag) is common, the metadata itself is created largely in an ad hoc fashion. There is no discernable trend toward adoption of formal metadata schemes for public Web resources.

As we consider the current status of the Web's evolution, and speculate on its future progression, the trends described in this paper suggest that the public Web may have reached a watershed in its maturation process. The rush to get online is, at least for the time being, over, evidenced by the plateau in the growth of the public Web. Maintaining a Web presence has become a routine, and in many cases, necessary activity for organizations of all descriptions. But the public Web clearly has some distance yet to go to reach its full potential, a point corroborated by the two other trends examined in this paper. The ubiquity of the public Web in other parts of the world has not reached the level realized in the United States. Also, while we have become extremely proficient at making information available on the public Web, progress in terms of making that information more organized and "findable" has been comparatively limited. The past five years have witnessed extraordinary validation of the Web as "proof of concept". Hopefully, the next five years will witness equally remarkable progress in fine-tuning the Web to enhance both the scope of its users, and the utility of its content.

Notes and References

[1] For more information about the Web Characterization Project, please visit the project Web site at .
[2] For more information about the Web sampling methodology, please see "A Methodology for Sampling the World Wide Web". Available at .
[3] Shapiro, C. and H. Varian (1998). Information Rules: A Strategic Guide to the Network Economy, (Harvard Business School Press, Cambridge)
[4] Bergman, M. (2001). "The Deep Web: Surfacing Hidden Value" Journal of Electronic Publishing, Volume 7, Issue 1. Available at .
[5] Information on this study, "How Much Information?", along with the study's findings, are available on the project Web site: .
[6] All ARL-member statistics were obtained at the ARL Statistics and Measurement Program Web site at .
[7] Statistics from Gray, M., "Web Growth Summary". Available at .
[8] Kane, M. (2002). "Web Hosting: Only the Strong Survive" ZDNet News. Available at .
[9] Mariano, G. (2002). "The Incredible Shrinking Internet" ZDNet UK News. Available at .
[10] WorldCat statistics were obtained from the OCLC Annual Report 2000/2001. The report is available at .
[11] For more information about the Dublin Core Metadata Initiative, please visit the DCMI Web site at .

Copyright © OCLC Online Computer Library Center. Used with permission. OCLC and WorldCat are registered trademarks of OCLC Online Computer Library Center, Inc. Dublin Core is a trademark of OCLC Online Computer Library Center, Inc. DOI: 10.1045/april2003-lavoie

American Archivist / Vol. 54 / Spring 1991

Case Study
SUSAN E. DAVIS, editor

American Medical Association's Historical Health Fraud and Alternative Medicine Collection: An Integrated Approach to Automated Collection Description

JAMES G. CARSON

Abstract: From 1913 to 1975, the American Medical Association's Department of Investigation assembled more than 300 cubic feet of files on health fraud, quackery, "patent" medicines, and alternative medicine. In 1988, the AMA obtained a grant from the National Library of Medicine to process and catalog these materials, now known as the Historical Health Fraud and Alternative Medicine Collection.
Using Minaret software (a stand-alone USMARC AMC cataloging system) in combination with WordPerfect word-processing software, the project staff developed procedures that allowed it to generate textual and index entries for the printed guide to the collection as well as upload USMARC AMC records directly to the OCLC (Online Computer Library Center) union catalog. About the author: James G. Carson, Ph.D., is an independent archival consultant and former project manager, Historical Health Fraud and Alternative Medicine Collection Project, Division of Library and Information Management, American Medical Association. The author expresses his appreciation to Arthur W. Hafner, Ph.D., director, Division of Library and Information Manage- ment, American Medical Association, for editorial counsel and administrative support; to Victoria A. Davis, former director of the Division of Library and Information Management's Department of Archives, History, and Policy, for her significant participation in this project; to Micaela Sullivan- Fowler, who recognized the need to organize the collection and worked on early drafts of the grant proposal that was eventually funded; and to John F. Zwicky, Ph.D. for his technical contributions. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.54.2.d0415l75772k5174 by C arnegie M ellon U niversity user on 06 A pril 2021 An Integrated Approach to Automated Collection Description 185 Background of the Project THE AMERICAN MEDICAL ASSOCIA- TION'S Historical Health Fraud and Alter- native Medicine Collection (hereafter referred to simply as the Historical Health Fraud Collection) consists of more than 300 cubic feet of files on health fraud, quack- ery, "patent" medicines, and alternative medicine. The collection originated as the office files of the AMA's Department of Investigation, which existed from 1913 to 1975 and was charged with answering in- quiries about fraud, quackery, and alter- native medicine. A combination of factors led to the ab- olition of the Department of Investigation in 1975. By that time, government agen- cies such as the Food and Drug Adminis- tration and the Federal Trade Commission were largely duplicating the department's investigative functions. The AMA Library accepted responsibility for both the records and the information-dispensing function. The library gradually phased out active gather- ing of new information on health fraud and questionable therapies as private organiza- tions such as the National Council Against Health Fraud moved into this arena.1 The Department of Investigation's files, though no longer active, contained an un- paralleled wealth of original source mate- rial on thousands of fraudulent or alternative health practitioners, products, and prac- tices that the department investigated dur- ing its sixty-two years of existence. Recognizing the unique historical value of these files, in 1988 the AMA Division of Library and Information Management ap- plied for and received a two-year, $165,000 grant from the National Library of Medi- cine to process the collection, describe it in MARC Archival and Manuscripts Con- trol (AMC) format, and produce a collec- tion guide.2 The grant funds permitted the AMA li- brary to develop automated procedures that integrated what are typically related-but-sep- arate operations. 
In order to provide wide access to information about the collection through a nationwide bibliographic utility, USMARC AMC records were to be added to OCLC (Online Computer Library Center), the national network with which the AMA Library is affiliated. This objective could be combined with that of efficiently producing a conventional guide to the collection, com- plete with indexes, and a local searchable database containing more detailed data than could appropriately be entered in OCLC. The procedures described here could be adapted for use by other archives, whether or not the same software and bibliographic utility de- scribed here are involved. Creating USMARC AMC Records In archival terms, the AMA's Historical Health Fraud Collection is an alphabetical subject file that constitutes a single large record series. Within this series, holdings range from single folders, in the case of many minor topics, to several cubic feet on topics of great interest, such as claimed cures for alcoholism, cancer, and obesity. This variation in depth of coverage within the collection leads to corresponding variation in the descriptive approach. The project staff created a master collection-level catalog re- cord, supplementing it with separate AMC records for each major subseries, i.e., holdings on a single subject. Subseries of sufficient size and complexity were addi- tionally described by a folder list (which is not a part of an AMC record). Because the 'The National Council Against Health Fraud, Inc., P.O. Box 1276, Loma Linda, CA 92354. Its resource center is located at Trinity Lutheran Hospital, 3030 Baltimore, Kansas City, MO 64108. department of Health and Human Services, Public Health Service, National Institutes of Health, Re- source Grant G08 LM04637, Arthur W. Hafner, Prin- cipal Investigator. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.54.2.d0415l75772k5174 by C arnegie M ellon U niversity user on 06 A pril 2021 186 American Archivist / Spring 1991 collection contained more than 3,000 sub- series, as many as four alphabetically ad- jacent minor subseries were combined into a single record. This effort reduced the number of MARC records to approxi- mately 950. Original plans for the Historical Health Fraud Collection project called for AMC records to be created directly in OCLC.3 This approach was ruled out early in the project for several reasons. At the time, OCLC lacked subject-searching capability, and searching the OCLC database for in- house reference purposes would have in- volved cumulatively expensive connect-time charges.4 The most serious problem, how- ever, related to authority control. Bibliographic networks such as OCLC naturally and legitimately require prior veri- fication of personal and corporate names and other headings used as access points in cat- alog records; this verification is carried out using appropriate national authority data- bases such as the Library of Congress name authority file. However, in archival contexts this process can become a black hole into which mountains of work-time disappear to small discernable purpose. Very few of the names encountered in a typical archival col- lection are those of published authors or other similarly prominent entities. Hence they are unlikely to be found in the relevant authority files.5 However, the use of many such names, and of other headings such as names of med- icines and similar products, is highly desir- able to provide access points for local searching of the collection. 
Fortunately, this dilemma arose just as the first personal-computer-based AMC systems, Minaret and MicroMARC:amc, were becoming generally available. The AMA Library eventually decided to install Cactus Software's Minaret system on the project's OCLC workstation, an AT-class personal computer.6 This permitted project staff to do the original cataloging in Minaret and then transfer the records to OCLC. This decision allowed the development of a customized database configuration that includes both standard MARC and local versions of each of the USMARC AMC subject added entry (6xx) fields, as well as a special local field for product names. The OCLC version of a catalog record includes only fields with standard MARC tag numbers and omits the corresponding local fields. Cataloging staff verifies terms used in the standard 6xx fields in the relevant authority files, namely the LC name authority file and National Library of Medicine subject headings. Terms used in the local 6xx fields are subject only to a much more streamlined local authority-control system built into Minaret. In general, the local fields are preferred except for persons or other entities that appear to be of sufficient prominence to justify their inclusion in a national database.7

3The Historical Health Fraud Collection is the only portion of the AMA Archives to be cataloged in OCLC.

4OCLC's recently introduced EPIC service has filled this gap.

5On a previous project in which the author participated, the hit rate for authority searching in a similar context was about 2 percent.

6Cactus Software, Inc., 15 Kary Way, Morristown, NJ 07960-5604. Among the factors favoring Minaret were its built-in authority control routine, variety of inputting-form options, flexibility in formatting output, and automatic index updating.

7This strategy can complicate local searching on the Minaret database, because a searcher may not know whether a given search term appears in the standard MARC version or the local version of a given field. But this is not a serious difficulty, for Minaret's free-form search editor allows searches with Boolean operators involving multiple fields. Formulating a free-form search can present problems for a computer novice, but we have streamlined the process by using SuperKey, a RAM-resident utility, to create search macros that take care of all the necessary keystrokes except for the search term itself.

In conjunction with word-processing and other auxiliary software, Minaret has become the heart of an integrated automation approach that uses only two inputting procedures to generate five different products (see Figure 1). The processing staff enters AMC catalog records in Minaret and creates folder listings with WordPerfect as the collection is processed. The Minaret records are then manipulated to produce OCLC catalog records. With the help of the search-and-replace and other editing conveniences of WordPerfect, they also yield textual and index entries for the collection guide. Index entries are also drawn from the folder lists.

Figure 1. Data-flow diagram: original input (the Minaret database and WordPerfect folder lists) yields the derived products (OCLC database records, guide text, and guide indexes).

Producing a Guide from Minaret Records

Text conversion.
The process for producing guide text entries takes advantage of Minaret's form-editor feature. A Minaret form is, in effect, a template through which catalog records are viewed. The guide-entry form includes only those USMARC AMC data elements that appear in the collection's printed guide: record title, dates, extent, call number, and note fields.8 Once this form is invoked in Minaret, the operator creates an export file and then transfers it to WordPerfect. In WordPerfect, search-and-replace macros (stored instructions that simplify the repetitive replacement of one text string or formatting code with another) perform a number of editing functions, the most notable of which exploits WordPerfect's paragraph-numbering feature to assign serial numbers to the entries. (See Figures 2a and 2b.)

8"Call numbers" for the collection are simply inclusive box/folder numbers. For example, 0106-07/0107-03 means that the materials described are to be found beginning in folder 7 of box 106 and ending with folder 3 in box 107.

Figure 2a

035    ‡aAMC89-000140
040    ‡aAMA ‡eappm ‡cAMA
099 9  ‡a0113-03/0116-01
049    ‡aAMAF
110 2  ‡aAmerican Medical Association. ‡bDept. of Investigation.
245 00 ‡aRecords. ‡pCathartics, ‡f1904-1973.
300    ‡a1.0 cubic ft. (3 boxes).
520    ‡aCorrespondence, reports, advertisements, articles and clippings, press releases, and promotional and supplementary materials concerning cathartics. ‡bThere are six folders of material on cathartics in general. The rest concern individual cathartics, mostly patent medicines, but also quack devices such as the "Sphincter Muscle Expander." Among the more prominent cathartics are "Cereal Meal," Phillip's Milk of Magnesia, and Zo-Ro-Lo.
555 0  ‡aA folder list is available for this material.

Portion of a USMARC AMC catalog record as it appears in Minaret.

Figure 2b

125. Cathartics, 1904-1973.
1.0 cubic ft. (3 boxes).
Call number: 0113-03/0116-01
SUMMARY: Correspondence, reports, advertisements, articles and clippings, press releases, and promotional and supplementary materials concerning cathartics. There are six folders of material on cathartics in general. The rest concern individual cathartics, mostly patent medicines, but also quack devices such as the "Sphincter Muscle Expander." Among the more prominent cathartics are "Cereal Meal," Phillip's Milk of Magnesia, and Zo-Ro-Lo.
A folder list is available for this material.

Collection guide entry, derived from record in Fig. 2a.

Building an index. The procedure for deriving index entries from Minaret catalog records also involves the Minaret form editor. For this purpose, project staff have defined a form that contains only the call number and the subject added entry (6xx) fields. In this form the tag numbers are replaced by two-letter mnemonic codes, e.g., "PN" for a personal name. Again, an export file created with this form is transferred to WordPerfect.9 There the serial number of the corresponding guide entry replaces the call number, and a macro appends this serial number to each index entry. Next, another set of macros appends each entry to one of seven index files, depending on the index code that precedes it.10 In the final step, WordPerfect sorts the index files alphabetically to move the new entries into proper sequence.
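For readers who want the shape of these two conversions without the WordPerfect macros, the sketch below restates the data flow in Python. It is only an illustration: the dictionary layout, field names, and the "PR" sample entries are assumptions standing in for Minaret's actual export format, not a reproduction of the project's tooling.

```python
from collections import defaultdict

def guide_entry(serial, rec):
    """Render one numbered guide entry in the style of Figure 2b."""
    lines = [
        f"{serial}. {rec['title']}, {rec['dates']}.",
        rec["extent"],
        f"Call number: {rec['call_number']}",
        f"SUMMARY: {rec['summary']}",
    ]
    lines.extend(rec.get("notes", []))
    return "\n".join(lines)

def index_entries(serial, rec):
    """Yield (index_code, entry) pairs: the guide serial number stands
    in for the call number, as in the article's macro-driven step."""
    for code, heading in rec.get("added_entries", []):
        yield code, f"{heading} {serial}"

rec = {  # hypothetical export of the Figure 2a record
    "title": "Cathartics", "dates": "1904-1973",
    "extent": "1.0 cubic ft. (3 boxes).",
    "call_number": "0113-03/0116-01",
    "summary": "Correspondence, reports, advertisements ...",
    "notes": ["A folder list is available for this material."],
    "added_entries": [("PR", "Ceremel."), ("PR", "Citrolax.")],
}

print(guide_entry(125, rec))

index_files = defaultdict(list)   # one list per two-letter index code
for code, entry in index_entries(125, rec):
    index_files[code].append(entry)
for code in index_files:          # WordPerfect's final alphabetic sort
    index_files[code].sort()
```

Routing entries into per-code lists and then sorting each list mirrors the seven index files and the final alphabetic merge described above.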
The system also derives index entries from folder lists originally entered in WordPerfect.11 In addition to columns for folder title, dates, and box/folder number, the folder-list format includes a column for index codes. A processor will enter the appropriate two-letter code in this column whenever a folder title is suitable for inclusion in one of the seven indexes—for example, when it comprises the name of a person, corporate body, or product. Another series of macros then strips the folder list down to include only folder titles and call numbers, thus corresponding to the output from the Minaret index form.12 From this point on, the procedure is exactly the same as for the 6xx AMC fields. (See Figures 3a, 3b, and 3c.)

9Unlike the guide-text conversion routine described in the previous section, this procedure must be performed separately for each catalog record. To streamline it as much as possible, all the keystrokes needed to generate the export file are stored as a SuperKey macro.

10There are indexes of personal, corporate, conference/meeting, geographic, and product names; titles; and topical subjects.

11This procedure would be unnecessary if every appropriate heading appearing in a folder list were also incorporated as an added entry in the corresponding catalog record. The decision not to follow this practice was largely a concession to time constraints and may be reconsidered in the future.

12This process is actually performed on a copy of the folder list; the original folder list is naturally retained (on disk as well as in hard copy).

Figure 3a

692  ‡aCeremel.
692  ‡aCitrolax.
692  ‡aCream of Magnesia.

Product-name added entries, derived from USMARC AMC catalog record shown in Fig. 2a.

The result of these steps (which, once the procedures and macros are established, are employed far more routinely than their description here may convey) is a guide to the collection that provides a clear description of the collection (including indexes) in an effective, recognizable format. The same data entry also produces a locally searchable Minaret database which can be accessed in a wide variety of ways—even by minimally trained personnel or by researchers themselves, using the search macros described in note 7. As described in the next section, it also produces records to be added to a national database.

Uploading Minaret records to OCLC

The purchase of the Minaret system for the project did involve one major uncertainty. While Minaret produced records that conformed to the OCLC implementation of the AMC format and were hence OCLC-compatible, it did not originally have the capacity to upload records directly to OCLC. At this stage, it would have been necessary to export records onto tape and then send the tape to OCLC—a cumbersome procedure that would have added significantly to the project's expenses. The solution was to develop a direct upload protocol using a modem, which would obviate the necessity for tape uploading.
Eventually, following a few false starts and several discussions with OCLC personnel, the author and Cactus Software president Geoffrey Mottram developed a procedure based on one previously devised by Richard Aroksaar and Ellen Traxel of the Pacific Northwest Regional Library, National Park Service.13 The initial version of the OCLC upload routine worked as follows.

The first step was to strip the local fields out of the records to be uploaded; this was accomplished by exporting them to a separate database within Minaret. Staff then transferred this record set to WordPerfect, where search-and-replace macros rectified some minor format differences between Minaret and OCLC. For example, Minaret requires subfield delimiters at the beginnings of all subfields; in OCLC, the ‡a delimiter is omitted where subfield a is the first subfield in a field. Hence, one of the search-and-replace macros stripped out ‡a delimiters that occurred at beginnings of fields.

Figure 3b

Index  Folder Title                        Date(s)    Folder No.
CN     Cerag Company                       1916       0114-04
PR     Colonaid                            1957-1960  0114-05
PR     Correctol                           1958-1959  0114-06
PR     Cryst-L-Dex                         1936-1939  0114-07
PR     Dorsey's Mixture                    n.d.       0114-08
CN     Druggists Cooperative Association   1913-1917  0114-09
PR     Dunbar's System Tonic               1913-1937  0114-10

Portion of corresponding folder list showing product-name entries (index code "PR").

Figure 3c

Ceremel. 125
Chamberlain's Colic Remedy. 129
Chase's Kidney Pills. 130
Citrolax. 125
Citrophan. 148
Clarke's Blood Mixture. 224
Collum Dropsy Remedy. 158
Colonaid. 125
Connelley Liquor Cure. 14
Correctol. 125
Cosmic Wave Vitalizer. 173
Cream of Magnesia. 125
Crotalin. 181
Cryst-L-Dex. 125
Cystex. 184

Portion of product-name index, showing intermixed entries from AMC record and folder list.

After these modifications, the resulting file went through a routine that transformed it into a script that could be read by ProComm communications software and transmitted via modem to OCLC. Aroksaar originally developed the "transcat" utility file which performs this transformation; the utility is available via the Fedlink bulletin board ALIX.14 Since this utility was originally designed for use in a book-oriented environment, its output required some modification for archival purposes, notably by replacing the books-format workform command with the appropriate workform command for the AMC format.15 Thus, instead of going directly into ProComm, the "transcat" output file was first loaded into WordPerfect again, where another series of search-and-replace macros replaced the workform commands and made other necessary changes. Staff then manually inserted appropriate passwords and identification numbers, and transferred the file to ProComm, which transmitted it to OCLC.

13The original procedure is described by Aroksaar and Traxel in OCLC Micro 5 (June 1989): 9-11. The American Medical Association's adaptation was described briefly by Marion Matters in the SAA Newsletter, March 1990, 11.

14ALIX can be dialed at 202-707-9656; the "transcat" routine is in files area #3, files section.

15The process of entering an OCLC record always begins by calling up the appropriate workform for the MARC format desired. The workform includes prompts for required fields and others that are commonly used. Although data can be entered at the "home" position on the screen, rather than on the workform, the latter must still be present.
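The record clean-up that preceded transmission is mechanical enough to restate in a few lines of code. The sketch below, in Python rather than WordPerfect macros, shows the two transformations the text describes—dropping local fields and stripping a field-initial ‡a. The tuple layout and the "9-in-the-tag" test for local fields are illustrative assumptions, not the project's actual rules.

```python
def is_local(tag: str) -> bool:
    # Assumption for illustration: tags containing 9 (e.g., 692) are
    # locally defined; the project maintained its own local-field list.
    return "9" in tag

def prepare_for_oclc(fields):
    """fields: list of (tag, indicators, content) tuples, with the
    double-dagger subfield delimiter shown in Figures 2a and 3a."""
    cleaned = []
    for tag, ind, content in fields:
        if is_local(tag):
            continue                       # OCLC copy omits local fields
        if content.startswith("‡a"):
            content = content[len("‡a"):]  # OCLC drops a field-initial ‡a
        cleaned.append((tag, ind, content))
    return cleaned

record = [
    ("245", "00", "‡aRecords. ‡pCathartics, ‡f1904-1973."),
    ("692", "  ", "‡aCeremel."),  # local product-name field
    ("555", "0 ", "‡aA folder list is available for this material."),
]
for tag, ind, content in prepare_for_oclc(record):
    print(tag, ind, content)
```

In the project's workflow the cleaned text was then wrapped into a ProComm script by the "transcat" utility; that wrapping step is omitted here.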
Recently, drawing on the Health Fraud project's experience, Cactus has added an upgraded version of the OCLC upload utility for Minaret that completely eliminates the need for auxiliary massaging in a word processor. This latest upload utility includes a special upload form and a DOS utility, "mkscript," that incorporates the "transcat" routine. These features accomplish the stripping of local fields, elimination of superfluous subfield delimiters, substitution of AMC workform commands, and all other necessary changes. User passwords and identification numbers need only be inserted once in an auxiliary text file; the utility then includes them automatically in each output file produced by the "mkscript" routine. This output file is then loaded directly into ProComm and transmitted. What OCLC "sees" during this process is cataloging text being entered at "home" position on the workstation screen, one line at a time. The process takes from 60 to 90 seconds for an average record containing between twenty and thirty fields.

The version of the upload utility used on the Health Fraud project places catalog records in the OCLC "save" file, from which project staff then retrieve and "produce" them in a separate manual operation. This is the final step that actually places a record in the OCLC online union catalog and assigns it a serial number. The ProComm script could incorporate this step; however, problems such as undiscovered typographical errors and communication difficulties during the uploading session may result in the necessity for last-minute changes. It is easier to make these changes in the "save" file than in a record which has already been "produced."

Comment

It is difficult to quantify the impact of these automated procedures on the AMA's Historical Health Fraud Collection project. However, a reasonable estimate is that it would have taken the project staff at least twice as long to create OCLC catalog records, guide text, and guide indexes manually. More likely, of course, these helpful additional finding aids would never have been developed. Thus, the Historical Health Fraud Collection project's automated procedures have saved roughly four person-years of work, and can serve as a useful model for other repositories in using automation to improve collection access with minimal descriptive effort.

work_ua2vd3s3frbtnej62cmgxjx2fe ----

The Journal of Academic Librarianship, 2007, Vol. 33, No. 1, p. 56-66. ISSN: 0099-1333 (print), 1879-1999 (online). DOI: 10.1016/j.acalib.2006.08.008. http://www.sciencedirect.com/science/journal/00991333/33/1 http://www.sciencedirect.com/science/article/pii/S0099133306001601 Copyright © 2007 Elsevier Inc. All rights reserved.

Open WorldCat and Its Impact on Academic Libraries

by Magda El-Sherbini

This paper analyzes librarians' reactions to the OCLC Open WorldCat.
A detailed survey was sent to ARL libraries to explore what, if anything, the libraries are currently doing to prepare for these changes and how they plan to cope with the probability of having all their records open to the whole world. Survey findings indicate that most of the ARL member institutions are not making immediate preparations to cope with issues that have not yet emerged and that they will continue to provide access to materials based on priorities dictated by current needs and available resources.

Introduction

During the last few years, technology has been advancing at a very rapid pace. The introduction of the World Wide Web in the 1990s, the subsequent development of effective search engines such as Yahoo! and Google, and advances in digital technologies helped create a virtual universe of information that is available to all users via the Internet. Information that is made available through the Internet can now be accessed from homes, offices, cafes, airplanes, etc. As traditional library users became increasingly dependent on instant information available via the Internet, they began to lose interest in some of the services that were traditionally offered by libraries. Librarians began to feel the impact of these changes and initiated debates on how technology is affecting their business and what the response to this challenge might be. In response to these changes in user behavior on the academic level, OCLC and RLG have recently gone beyond making their union catalogs available to library users and made the contents of their databases available to the public at large via the Web.

The present study was undertaken to explore the degree to which libraries and librarians are committed to the idea of having their collections available through the Internet. A large part of the paper focuses on the quality of records in the library OPACs and the potential need to have these records upgraded or enriched, so the remote user can assess their usefulness. The second important theme addresses the issue of un-cataloged collections and digital document repositories. At this time most of these materials are not represented in the library OPACs, and this study attempts to determine if libraries are committed to the idea of providing full access to all of their institutional materials. Digitizing library collections and institutional repositories are addressed in the final parts of the discussion. These two sets of materials have existed outside the traditional library catalogs, and the author will attempt to determine how libraries perceive the future of these collections.

Background

Recent technological developments in the area of information processing and dissemination have made a profound impact on the ways in which information is accessed. The explosion of information that is available through the Internet is making a huge impact on libraries and their users. Information that is available through the Internet ranges from scholarly information on the one hand, to answers to quick questions, such as a recipe for a pound cake, on the other. Many academic Web searchers and users prefer to use the two leading search engines, Yahoo! and Google, rather than the library OPAC. This is due, in part, to the flexibility of the Web, which is rapidly becoming an all-purpose tool that includes personal communication as well.
As Stephen Abram observed, "For users who cry out for online discussions, communities of practice, group and individual blogs, and connections through social networking software, Google already offers Blogger, Google Groups, and the new Orkut beta, already a pretty advanced social networking tool."1 In addition to speed and personal communication, today's users are also becoming dependent on the convenient manner in which information can be accessed and delivered. As Judy Luther points out, "Google has radically changed users' expectations and redefined the experience of those seeking information. For many searchers, quality of the results matter less than the process—they just expect the process to be quick and easy."2

Library content has, for the most part, been excluded from the Internet revolution. The wealth of bibliographic information contained in major bibliographic utilities such as the OCLC WorldCat and the Research Libraries Group catalog was available only at the library. Until recently, only those institutions which use a cataloging interface to the WorldCat database or which subscribe to the WorldCat database via FirstSearch products were able to provide WorldCat access to their users. This left the valuable resources of the world's libraries outside the emerging information mainstream that we now call the Web.

Initial remedies to this situation were offered by RLG and OCLC. On September 22, 2003, RLG launched its "RedLightGreen" pilot to support undergraduates using the Web.3 RedLightGreen is a free resource that helps the user to find important books for research, check their availability at the library, and create citations. It delivers information from RLG members about more than 130 million books for education and research, and it links students back to their campus libraries for the books they select. RedLightGreen allows any library with a Web-based catalog (with URL-supported search syntax) to be linked, regardless of whether the institution is an RLG member. This innovation provides access to bibliographic and location information on library materials via the Web.

OCLC was also among those institutions that began to seek solutions. Their response was to bring the library to the Internet user by opening the WorldCat database to Web searchers through Internet search engines such as Google and Yahoo! and other partners. To use these search services, patrons would simply enter a search phrase to locate an item. Through the "Find in a Library" function, the user could then enter their postal code to locate the item at a participating library in their city, region, or country.

On October 27, 2003, OCLC announced that it would begin a pilot project to test the feasibility of opening the WorldCat database to Google and other search engines.4 The idea behind this project was to utilize available technology to integrate library collections with the Internet and make these materials available to Web users. By taking advantage of the popular search engines, libraries would increase their visibility to the broad user community and make information about their treasured collections available to the world via the Internet. The experiment with Google was a success, and nearly a year later, in July 2004, OCLC expanded its pilot project to locate library materials to Yahoo! search.5 On October 11, 2004, OCLC decided to open the entire collection of their records.6
Another step toward further integration of library resources was taken at the time this paper was being reviewed. In May 2006, Jay Jordan, President and CEO of OCLC, announced that OCLC and RLG had signed a definitive agreement to combine the two organizations.7 If this agreement is approved by the RLG membership, OCLC will purchase the assets and assume certain liabilities of RLG, effective July 1, 2006. This merger of RLG with OCLC appears to be another major step in the process of integrating library resources and is likely to have an impact on WorldCat and its users.

Literature Review

The subject of opening access to OCLC's WorldCat database through Google is fairly new, and the author found relatively little literature that addressed the question directly. Most of what is found in the literature consists of announcements, comments on the user's perspective, news stories and reports, and exchanges of e-mails. OCLC's own Web site is the primary and authoritative source of information on this topic. It contains valuable information on topics such as: Open WorldCat Program,8 Quick Facts about the Open WorldCat Pilot,9 How the Open WorldCat Pilot Works,10 Open WorldCat Pilot: Frequently Asked Questions,11 and User-Contributed Content Pilot.12 These materials provide most of the current information about the project.

An interesting comparison between RedLightGreen and Open WorldCat is provided in a recent article by David Mattison.13 In his comparison, the author addressed the strengths and weaknesses of both projects, the target audience, source data and basic searching methods, advanced searching methods, library holdings, and search personalization options. He concluded that both projects offer great information to users; however, Open WorldCat works best when users are looking for a specific title and need to know whether a library holds it or not, whereas RedLightGreen, with its research focus, displays many titles at once, with no capability of indicating whether a library owns any one title.

Various aspects of searching the WorldCat data in a Web browser environment are discussed in two papers by Nancy O'Neill and Gary Price.14 In Open WorldCat: A User's Perspective, O'Neill addressed issues related to searching WorldCat items in Google and Yahoo! and how different search techniques can provide different results, or no results. Price raised important points about the number of mouse clicks that are required to locate a specific item and how that might impact user behavior. He also raised the question, "Why doesn't OCLC make subject headings viewable and hyperlinked?" This issue in particular was later resolved by making subject headings visible and hot-linked. This feature, as Price emphasized, will help in limiting the results of a search to only Open WorldCat records. In another paper, Gary Price and Steven Cohn point out that OCLC must improve subject access to materials.15 After running several subject searches on random topics, the authors received poor results at both Yahoo! and Google.

Paula Wilson raises important questions about providing Internet access to the WorldCat database in her paper.16 Wilson discusses how Open WorldCat works, the benefits of having library materials accessible via the major search engines, and some issues of concern. Any number of other short articles and announcements could be mentioned in this review, but most of them do not add substantially to the discussion of this topic.
As statistical data about the use of WorldCat by Internet users become available, further research will be conducted. At this time, librarians and researchers are not offering speculation about ways in which this change is going to impact library operations, and the survey conducted by this author attempts to address some of these issues.

Research Methodology

Since the opening of OCLC's WorldCat via Google and Yahoo!, librarians began to raise questions about the impact this development would have on libraries and their operations. Since libraries are the feeders of the WorldCat database, would they have a role in adapting to and facilitating this important transformation of the WorldCat database? To provide a more in-depth analysis of the issues, the author developed a survey (Appendix A), which was distributed electronically via Zoomerang to the heads of cataloging departments or to assistant directors of technical services or their equivalents at 123 Association of Research Libraries (ARL) member institutions.17 A cover letter was sent along with the survey to outline the purpose of the study and to provide instructions for completing the survey (Appendix B). The letter assured participants that the results would be reported in this study. The survey was initially distributed informally to a small number of ARL librarians for their input and for proofreading and clarity. Based on their comments and feedback, the survey was revised.

ARL academic libraries formed the population sample for this study, since these institutions represent a well-defined group of libraries with sufficiently large collections. A number of obstacles were encountered in the process. It was unexpectedly difficult to create a mailing list of the ARL recipients and to find addresses or e-mails for some recipients. Reliance on institutional Web sites was a very disappointing experience. On a number of occasions, the author had to resort to the telephone to establish the identities of heads of cataloging or assistant directors for various departments. Names, titles, and address information changed as individuals changed positions, retired, or left their positions for other reasons, and online information was not updated.

The survey was distributed, followed by two reminders sent within ten days of each other. Fifty-four of the 123 ARL libraries (43.9 percent) responded. Although a better response rate could have been expected, given the timeliness of this topic, the author considers the fifty-four responses acceptable. Some questions in the survey required a yes or a no response. All of the questions, however, offered an opportunity for comments and opinions. The survey consisted of twenty questions. Some of the questions (Questions 19 and 20) were of a general nature, and the data obtained are not discussed in this paper in detail. Other questions, pertaining to un-cataloged collections, provide interesting detail that falls outside the scope of this paper. Data gathered in this part of the survey will be analyzed and discussed at a future time.

Survey Results

Survey results will be grouped in four parts and discussed as follows:

Part one: Provides information about the quality of records; the commitment of libraries to cataloging according to standards; cataloging staffing levels; and the use of additional services such as vendor records.
Part two: Addresses the issue of un-cataloged collections; the commitment of libraries to catalog these materials; the level of cataloging of these materials; enriching bibliographic records; and whether libraries are planning to add their entire collections to the Open WorldCat.

Part three: Provides information about library repositories and asks if libraries are willing to integrate their repository materials into their OPACs and into Open WorldCat.

Part four: Provides information about libraries' vision and whether they foresee a time when all library collections will be digitized.

Part One: Quality of Records in WorldCat

There is an underlying assumption that opening library catalogs through the Internet will increase the library's presence on the Web and in the user community. What this implies is an increased use of libraries by current patrons and new, Web-based patrons, who will now find library materials useful and desirable. This survey attempts to ascertain what steps libraries are planning to take to provide the quality of catalog records that would assure full access to their collections in the Web environment. The quality of cataloging records has always been the determining factor in providing effective access to materials. The library community has always taken great care in developing cataloging standards and professional training to assure that quality would be maintained. As the opportunity to move library records to the Web environment arises, it is useful to revisit these issues and determine whether changes need to be made to meet this new challenge.

Standards

Table 1 of the survey is intended to measure the level of and commitment to various cataloging standards, by asking which cataloging standards are currently in use by the ARL members. The response shows that all responding ARL libraries (100 percent) continue to use traditional resources for cataloging. These resources include the Anglo-American Cataloguing Rules, second edition revised (AACR2 rev.), Machine-Readable Cataloging (MARC21), Library of Congress Subject Headings (LCSH), and Library of Congress Classification (LCC). However, there are new standards that are being adopted and used to catalog digital materials. Dublin Core (DC) is used by 41 percent of the respondents, who use it to catalog library Web content and archived images. Table 1 also shows that 18.5 percent of ARL libraries use other metadata, such as Encoded Archival Description (EAD), Federal Geographic Data Committee (FGDC), Extensible Markup Language (XML), Text Encoding Initiative (TEI), and Visual Resources Association (VRA) Core. As for the Functional Requirements for Bibliographic Records (FRBR), 4.5 percent of libraries responded that they would adopt it when it is incorporated into Resource Description and Access (RDA, the planned successor to AACR2 revised).

Table 1. What Cataloging Standards Does Your Library Use for Cataloging?

                      Yes           No
                      N      %      N      %
AACR2 rev.            54     100    0      0
LCSH                  54     100    0      0
LC Classification     54     100    0      0
DDC                   2      3.7    52     96.2
UDC                   0      0      0      0
MARC21                52     96.2   2      3.7
FRBR                  5      4.5    49     90.7
Dublin Core           22     41     32     59.2
Others                10     18.5   44     81.4

These results reflect continuing commitment to the quality standards that are the backbone of cataloging. As librarians continue to adhere to the established standards, they develop new standards to accommodate the new formats and material types.
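All of the tables that follow report raw counts (N) alongside percentages of the fifty-four responding libraries. As a quick illustration of that arithmetic, the short sketch below recomputes a few of Table 1's "Yes" percentages; small differences from the printed figures (e.g., 96.3 vs. 96.2) reflect rounding in the source.

```python
# Recompute selected Table 1 percentages as count / 54 respondents.
RESPONDENTS = 54

yes_counts = {
    "AACR2 rev.": 54,
    "DDC": 2,
    "MARC21": 52,
    "Dublin Core": 22,
    "Others": 10,
}

for standard, n in yes_counts.items():
    pct = 100 * n / RESPONDENTS
    print(f"{standard}: {n}/{RESPONDENTS} = {pct:.1f}%")
# e.g., AACR2 rev.: 54/54 = 100.0%; DDC: 2/54 = 3.7%; Others: 10/54 = 18.5%
```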
Participation in the Library of Congress Cooperative Cataloging Programs (PCC) is generally considered by many to be an indication of the commitment to quality that the library community expects. Most ARL libraries responding to the survey are actively participating in one or more of the Library of Congress PCC programs. Table 2 shows that the largest number of respondents (59.2 percent) indicated that they are contributing authority records to the Name Authority Cooperative Program of the PCC (NACO), while 31 percent are contributing their records to the Monographic Bibliographic Records Program (BIBCO). Another 29.6 percent are contributing records to the Subject Authority Cooperative Program (SACO), and 20.3 percent are participating in the Cooperative Online Serials Program (CONSER). Nearly half of the respondents (40.5 percent) indicated that they are not members and are not contributing to the PCC Program at this time.

Libraries are not likely to change their participation in PCC programs as their collections are being made available through the Internet. As Table 3 demonstrates, a majority of libraries responded to this question with considerable doubt. Nearly one in three libraries (29.6 percent) indicated that they would participate or continue participating in NACO, 20.3 percent will participate or continue participating in BIBCO, 20.3 percent will participate or continue participating in SACO, and 18.5 percent would participate or continue participating in CONSER. Five libraries commented that they are thinking of participating and that it is in their future plans, while three libraries felt that Open WorldCat would increase the pressure on libraries to perform high quality cataloging to assure patron access to all materials. Most of the comments indicated that libraries have neither enough qualified staff nor the time to go through the Library of Congress quality review in order to become members in these programs.

Table 2. Is your Library a Member of the Library of Congress International Program for Cooperative Cataloging?

          Yes           No
          N      %      N      %
NACO      32     59.2   22     40.5
BIBCO     17     31     37     67.5
SACO      16     29.6   38     70.3
CONSER    11     20.3   43     79.6
None      22     40.5   32     59.2

Table 3. Will the Potential Visibility in Opening WorldCat Create the Need for Your Library to Participate in the Library of Congress International Program for Cooperative Cataloging (PCC) for Providing More Quality Content?

          Yes           No
          N      %      N      %
NACO      16     29.6   38     70.3
BIBCO     11     20.3   43     79.6
SACO      11     20.3   43     79.6
CONSER    10     18.5   44     81.4
None      38     70.3   16     29.6

Staffing Levels

One of the ways to assess the quality of cataloging is to look at how libraries handle the staffing issue and staff training. The question in Table 4 was designed to measure the level of staffing assigned to cataloging in the ARL libraries and the ways in which cataloging tasks are being assigned to various staff levels. All respondents (100 percent), as shown in Table 4, indicated that they are using librarians to perform original cataloging, complex copy cataloging, cataloging of foreign language materials, some incomplete copy, copy needing subject headings and call numbers, and cataloging of various formats. Table 4 also demonstrates that all responding libraries (100 percent) are utilizing paraprofessionals in cataloging.
Although some of the responses did not specify which specific tasks are assigned to paraprofessionals, responses from other libraries indicated that paraprofessionals are performing copy cataloging and simple original cataloging, such as new editions, variant editions, and original cataloging for theses.

Table 4. Who does the Cataloging in your Library?

                       N      %
Librarians             54     100
Para-professionals     54     100
Student assistants     23     42.5
Vendor (specify)       35     64.8

Although only 42.5 percent of respondents indicated that their libraries are using student assistants, it was very important to see how libraries are using students and what kinds of tasks they are performing. Some respondents indicated that they are using students to perform simple copy cataloging, processing of records with DLC copy, copy cataloging of non-Roman languages, checking, cataloging electronic resources, and handling materials that need vernacular scripts. The practice of using non-professional staff or graduate student assistants in cataloging is fairly common among academic libraries. As these libraries strive to maintain high quality cataloging, it becomes necessary for them to provide a level of training that would allow them to continue to adhere to the cataloging standards discussed in the preceding section.

Vendor Services

Table 4 also shows that nearly two-thirds (64.8 percent) of the respondents indicated that they are using vendor services for part of their cataloging. As libraries take advantage of vendor services, they need to pay particular attention to record quality. The survey mentions a variety of vendors that are being used. Librarians provided comments on what vendor services they are currently using as follows: 17.1 percent (six) indicated that they are using YBP Library Services; 17.1 percent (six) are using MARCIVE, Inc. for obtaining bibliographic records for government documents and other cataloging services; 25.7 percent (nine) are using OCLC PromptCat services; 14.2 percent (five) are using OCLC TECHPRO for cataloging foreign languages, such as Arabic and CJK (Chinese, Japanese, and Korean) materials; 8 percent (three) are using shelf-ready services; 11.4 percent (four) are using Blackwell; 5 percent (two) are using Backstage Library Works; and one library is using EBSCO services. One library also indicated that it would increase vendor services, while other libraries mentioned that they are purchasing record sets for electronic resources and microforms.

This part of the survey reveals that ARL libraries continue to use mixed methods to satisfy their cataloging needs. In most cases, in-house cataloging is augmented by contracts with outside vendors. In-house cataloging is conducted by professional catalogers, with the assistance of paraprofessional staff and student assistants. However, professional catalogers and paraprofessionals constitute the foundation of cataloging in most institutions.

At this time, libraries are not planning to hire new professional staff for the purpose of enriching their cataloging records (Table 5). In responding to this question, 50 percent of respondents indicated that they were not sure if library administrations would create additional cataloger positions solely for the purpose of enriching their cataloging records. The second largest group of respondents (44.4 percent) indicated that they would not hire new catalogers or staff.
Some of the responses seem to reflect the fact that technical service operations have been understaffed for a long time and that libraries are not showing willingness to go back and rebuild staffing in this area.

Table 5. If your Library Decides to Enrich Records Added to WorldCat, Are you Planning to Add More Professional Catalogers to Your Staff?

            N      %
Yes         3      5.5
No          24     44.4
Not sure    27     50

Table 6. Do you Expect the Decision to Open WorldCat to Increase the Cost of Cataloging?

            N      %
Yes         4      7.4
No          44     81.4
Not sure    6      11

One respondent expressed a concern that a shortage of professional librarians would definitely affect the quality of cataloging records and negatively impact access to information. However, a very small percentage of the responses (5.5 percent) mentioned that they would add more staff, but not necessarily librarians, if they had to enrich their bibliographic records.

In view of the response to the question above, the issue of the increased cost of cataloging as a result of hiring additional staff and enriching bibliographic records seemed less relevant (Table 6). Most libraries (81.4 percent) did not anticipate any increase in the cost of cataloging. A small percentage (11 percent) indicated in their responses that they are not sure if the cost of cataloging will increase, while 7.4 percent thought that there would be an increase in cost related to the time spent in cataloging and enriching bibliographic records, in addition to the hiring of additional staff. Some of the comments addressed increases in the cost of cataloging related to how much additional time libraries would need to spend on enriching bibliographic records and the level of staffing involved in the process. Some respondents mentioned that enriching bibliographic records would slow overall production and raise the cost of cataloging. Adding new librarians would have an impact on the overall cost as well. Some librarians emphasized the importance of producing high quality records and how this impacts overall costs: "if you do it right from the beginning, this will save money and time to fix it."

Part Two: Un-cataloged Collections

Many libraries have treasure troves of materials in special collections, un-cataloged backlogs, and archives and manuscript repositories that are often not fully represented in the library OPAC. They are used locally and available to walk-in patrons only. Parts of these collections have been digitized and locally cataloged. Libraries are now faced with the opportunity to make these rare collections available through the Internet. This would involve a commitment of energy and resources, and the survey raises important questions about where libraries stand on this issue. In this section, the author will attempt to broadly identify the types of un-cataloged collections held at the surveyed libraries and to establish whether libraries are making plans to have these collections cataloged.

In Table 7, nearly 90 percent of all respondents indicated that they have significant un-cataloged collections, while 11 percent mentioned that they have no un-cataloged materials. In addition to un-cataloged new receipts, some libraries have backlogs related to retrospective conversion, gift collections, pre-1976 government documents, rare collections, microforms that need analytics, and archival materials. The question in Table 8 was designed to provide detailed information about the types of un-cataloged materials held by the ARL libraries.
Table 7. Does your Library have Un-cataloged Collections?

       N      %
Yes    48     88.8
No     6      11

Table 8. What Kind of Un-cataloged Collections does your Library have?

                                                    Yes           No
                                                    N      %      N      %
Western language print new receipts                 11     20.3   43     79.6
Western language print retrospective conversion     17     31.4   37     68.5
Non-Roman language print new receipts               18     33.3   36     66.6
Non-Roman language retrospective conversion         4      7.4    50     92.5
Western language special formats                    22     40.7   32     59.2
Non-Roman language special formats                  6      11     49     90.7
Electronic resources (e.g., Web sites)              11     20.3   43     79.6
Special collections materials                       22     40.7   32     59.2

Responses to this question provide a rich data set that deserves further study. A number of libraries reported that they have sizable special collections materials (e.g., ca. 60,000 rare books), while another group reported significant un-cataloged foreign language materials, mainly in Japanese. Gift collections, collections of slides, and microforms constitute significant subsets of un-cataloged materials as well. It is important to emphasize that most of these un-cataloged materials are not present in the OCLC or RLIN databases, and most have no records in local library online catalog systems.

Although the design of question 9 did not require the respondents to provide direct information about their plans to catalog the un-cataloged materials in their libraries, most respondents expressed their wish to catalog all of their materials. This conveys a sense of urgency about the need to have these materials made available to the public as soon as possible. The question was directed at the issue of methods that might be employed to catalog these collections. As Table 9 indicates, the answers were diverse. Nearly half (48 percent) of respondents chose collection-level cataloging for handling specific collections, such as archival materials and manuscripts. One library mentioned that it will be using collection-level cataloging for some materials but will not add bibliographic records to the OCLC WorldCat. More than one third (35 percent) of the responding libraries plan on using short/brief bibliographic records, especially "K"-level records for outsourced materials such as maps. Other respondents indicated that they would input brief records with no subject headings. Three fourths of libraries (75.5 percent) responded that they will be cataloging at the full bibliographic level whenever possible. One respondent indicated that their library decided to use full bibliographic records for un-cataloged foreign language materials. Others responded that they would be using this approach for cataloging serials and electronic monographs.

One of the important questions in this part of the survey dealt with the issue of how libraries would deal with hidden collections once their catalogs became available through the Internet (Table 10). The response to this question was neutral. Nearly 45 percent of the respondents indicated that they already experience some pressure to catalog their hidden collections and that these materials need to be in circulation. Increased visibility resulting from opening their collections through Open WorldCat did not seem to be a factor here.
About 10 percent of the respondents indicated that they were not certain how Open WorldCat would affect their efforts to catalog their hidden collections. Many comments received with the survey emphasize the need to have important library collections made available to users regardless of how Open WorldCat might impact the library. This response reveals a basic uncertainty about how most librarians view the Open WorldCat initiative and the way it might impact libraries. It seems that the implications of this change are difficult to predict, and libraries are not taking additional steps at this time to accommodate this development.

Table 9. If you are Making Plans to Catalog your Un-cataloged Collection, Which Methods are you Likely to Use to Catalog and Add these Collections to WorldCat?

                                               Yes           No
                                               N      %      N      %
Collection-level record                        26     48     28     51.8
Short/brief record (e.g., "K"-level record)    19     35     35     64.8
Full bibliographic records                     41     75.5   13     24
Other (specify)                                0      0      0      0

Table 10. Does the Potential Visibility in Open WorldCat Encourage your Library to Catalog "Hidden" Collections?

            N      %
Yes         24     44.4
No          25     46.2
Not sure    5      9.2

Until now, many libraries excluded parts of their collections from their OPACs. The question of whether or not libraries will exclude some records from WorldCat after it becomes available through Google was raised in Table 11. Nearly three out of four libraries answered this question in the negative. Only 9.2 percent of respondents indicated that they would be excluding materials from WorldCat. Some of the materials that would be excluded are theses and dissertations, electronic theses and dissertations (ETDs) that are embargoed for patents, U.S. depository materials, passworded materials, materials that are required to be excluded by the vendor, license-restricted materials, and materials not available for document delivery. However, 16.6 percent of respondents indicated that they have not made a decision about this issue at this time. The response to this question suggests that many librarians would like to have more of their collections available through the Internet.

Information available to Internet users is typically much richer than what is available in a bibliographic database such as WorldCat or a library OPAC. Survey question 12 raised the issue of record enrichment and how libraries are planning to accomplish this task. Table 12 shows that 20.2 percent of responding institutions plan on providing some enrichment to bibliographic records as a result of the Open WorldCat initiative. However, two thirds (66.6 percent) of the respondents indicated that they would not enrich their bibliographic records as a result of the opening of WorldCat. The remaining 15 percent of respondents were not sure if they would enrich their bibliographic records. Some of the respondents commented that libraries would provide enhancements to their bibliographic records not because of the Open WorldCat initiative, but to provide quality access to their own users. They expect to perform these enhancements locally in their OPACs. One respondent mentioned that their institution is adding tables of contents to some materials locally, while another mentioned that they are adding more descriptions and details to their special collections materials. Some other libraries plan to add more access points to their bibliographic records if they have more staff. Others indicated that the issue of enhancing or enriching bibliographic records has not been discussed in their libraries until now.
Table 11. Will Your Library Attempt to Exclude Some Records from WorldCat After it Becomes Available Through Google?

               N      %
Yes            5      9.2
No             40     74
No decision    9      16.6

Table 12. Since Users will be Remotely Evaluating an Item Through Information Recorded in the Bibliographic Records, Is your Library Likely to Provide Enrichments to the Bibliographic Records?

            N      %
Yes         11     20.2
No          36     66.6
Not sure    8      15

A number of librarians gave a clear indication that their library would not automatically enhance all cataloging records in the new environment. It was mentioned that libraries would select records to be enhanced based on the collection being cataloged and available resources. Survey participants who responded "yes" to question 12 specified in Table 13 what kinds of enrichments they would provide to their bibliographic records. About 9 percent indicated that they would add abstracts, 16.6 percent would add tables of contents, 11 percent would add summaries where appropriate, and 5.5 percent would add keywords. Some of the comments indicated that libraries would be willing to add more information to their records if that information were easily available from vendors. Libraries would consider purchasing more tables of contents, for example. One respondent mentioned that they already subscribe to the Blackwell North America (BNA) table-of-contents service, but they add these contents only locally. Other librarians indicated that they are exploring various ways of enriching their bibliographic records. Some libraries began to add abstracts and keywords to their special collections and theses and dissertations. One library indicated that it is already adding reviews and links to author information to its bibliographic records.

A feeling of ambiguity and a certain lack of clear direction are evident from the answers to most of the questions in this section. Libraries responding to the survey do not seem to be making special preparations or plans that might address issues expected to arise from the Open WorldCat initiative. Survey responses convey the sense that it is too soon to anticipate what demands or requirements will be raised by the "Google generation" of information users.

Table 13. If you Answered Yes to Question Number 12, What Kind of Information Would your Library Add to the Bibliographic Record?

                     Yes           No
                     N      %      N      %
Abstracts            5      9.2    49     90.7
Table of contents    9      16.6   45     83.3
Summaries            6      11     48     88.8
Keywords             3      5.5    51     94.4

Table 14. Libraries are Now Engaged in Creating Local Repositories of Digital Items that are Kept in Silos. Some of these are Available Online and Can be Searched by Google, Yahoo!, etc. Does your Library have such a Repository?

            N      %
Yes         29     53.7
No          17     31.4
Planning    8      15

Part Three: Institutional Repositories

Another important decision facing library managers is the issue of integrating institutional repositories of digital content with library OPACs. Many institutions have been experimenting with digital repositories, while some are just in the planning stages. The last part of the survey attempts to determine how libraries are dealing with this issue now and what plans they are making for the future. Survey responses confirmed that building an institutional repository is a priority for most major institutions. Table 14 revealed that most of the ARL libraries that participated in this survey (53.7 percent) are already engaged in creating local repositories for digital resources, while another 15 percent indicated that they are in the planning stage.
Nearly one-third (31.4 percent) of the respondents indicated that they are not creating such a repository at the present time. One of the issues facing a typical repository is the absence of consistent selection policies. Contents of institutional repositories vary from one institution to another not only in content, but also in the ways they are integrated with the local library. Table 15 shows that of the twenty-nine libraries that indicated the presence of a repository at their institution, twenty-one (72.4 percent) provide partial access to the repository collection through their OPAC system. Typical collections include ETDs, digitized print collections only (not images), and honors theses. Among the twenty-one libraries that provide access, four indicated that they are providing access to the repository collection by creating collection-level records or complete bibliographic records. The remaining eight (27.5 percent) respondents indicated that their libraries do not provide access at this point but are reviewing their selection criteria for inclusion in their OPACs. Two of the eight libraries commented that they are using their Web sites and DSpace for accessing their repository collections.

Table 15. If Yes, does your Library also Provide Access to the Repository Contents Through the OPAC?

       N      %
Yes    21     72.4
No     8      27.5

Table 16. Does your Library also Provide Access to the Repository Contents Through Bibliographic Utilities such as OCLC and RLIN?

                              N      %
Yes                           20     68.9
No                            3      10
Planning to provide access    6      20.6

Many of the institutions that already have repositories did not perceive the need to add all of their repository records to OCLC or RLIN. Table 16 reveals that although more than two thirds of the libraries surveyed (68.9 percent) are adding records to the bibliographic utilities (16 OCLC and 1 RLIN), they are adding records for only selected collections, such as ETDs, fiction collections, etc. A majority of repository records added to OCLC or RLIN are collection-level records. Three of the twenty libraries that add their repository records to OCLC or RLIN indicated that everything in the repository/OPAC is also represented in OCLC. Ten percent of all respondents indicated that they are not planning to add their repository records to any of the bibliographic utilities. These respondents felt that cataloging these materials and adding them to the catalog would prove too costly and redundant, given the fact that these materials are already indexed and accessible through major search engines such as Google and Yahoo! Another 20.6 percent indicated that they are planning to add some of their repository collections to OCLC at a future time. Most libraries indicated in Table 17 that their repositories include mixed collections. (Please note that not all the repositories include all of these categories.)

Table 17. If Yes, Could you Provide a General Description of Materials Held in your Repository?
- Theses/dissertations and honors theses
- Archival collections, papers, working papers, letters, interviews, photographs, reports, technical reports
- Article pre- and post-prints
- Electronic journals and electronic books, open access journals, locally published electronic journals
- Textual objects, graphical objects, and multimedia
- Digitized materials from Peel's Prairie Provinces bibliography and related materials
- Slides of historic landscape architecture, university photographs, and photos of Iowa barns
- Faculty publications (results of faculty research, departmental papers from across campus, faculty research output, faculty research materials)
- University publications and archival materials
- Specific manuscript collections in digitized form, special collections, and unique materials
- Locally digitized work; pre-prints, post-prints
- TEI text collections (largely Americana: American South, Virginia, American fiction pre-1923); a collection of EAD finding aids for a large image collection (images used for teaching)

Table 18. If not, do you Envision a Future when all Library Contents will be Digitized and Available Electronically in Full Text?

             N      %
Yes         11    20.2
No          28    51.8
Not sure    15    27.7

Part Four: Digitizing Library Collections

Many of the survey respondents expressed considerable restraint in their appraisal of this development. This can be attributed in part to some level of skepticism toward the concept of a digital library. When asked to envision a future when all library content will be digitized and available electronically (Table 18), 20.2 percent of respondents indicated that they see some potential for this, but not in the next few years. More than half (51.8 percent) responded that it will be impossible to digitize all library collections, and 27.7 percent were not sure, though they do not see this happening in the near future. Most of the respondents indicated that money would be an issue when it came to digitizing library collections, and many had questions about sources of funding. Some felt that perhaps commercial support (such as Google) would make this possible. There was also great concern about copyright laws, which would likely be an obstacle to full digitization. Most of the comments indicated that libraries would not digitize their entire collections, but perhaps large parts. Respondents offered comments such as these: "Probably not in the foreseeable," "All? - no; most - yes, but several decades ahead," "A large percentage, perhaps, but not all," "possibly digitization on demand; but not comprehensive," "Very unlikely that ALL contents will be digitized," "maybe not all, but a lot will be," "At least not for some years due to legal issues." Many respondents to this question found it difficult to envision a future where all library collections would be digitized. They predicted that some selected materials would be digitized, but not all. Copyright and budgets seemed to be the greatest obstacles. A few respondents commented that print materials would stay with us for a long time: there is still demand for paper, and not everyone uses the Internet.

Conclusion

The findings of this survey provide important insights into the way ARL library administrators perceive the OCLC Open WorldCat initiative and how they are preparing to handle the potential impact of this new development. Although the prospect of making library collections visible to the whole world via the Internet is likely to have a significant effect on libraries, the survey reveals a rather cautious attitude.
Responses to survey questions strongly suggest that most of the ARL libraries are not making immediate preparations to address issues that have not yet fully emerged. This generally guarded approach to the concept of a digital library is reflected in answers to many of the questions in the survey. An explanation for this rather cautious response can be viewed from a number of perspectives suggested by the survey respondents themselves. Realities of library operations are such that no new funding is being made available to provide special treatment to materials going into Google via Open WorldCat. Enhancements that are currently being made to some bibliographic records will continue to be made in response to the need to serve all clients, regardless of how they access these materials.

Regardless of how one estimates the relative importance of Google and other search engines to the overall research process, it can be generally acknowledged that no search engine can locate items in libraries without relying on cataloging records that point to the object being sought. In order to facilitate an effective search, these cataloging records, whether we call them metadata or something else, will have to be based on consistently applied standards that describe each item going into an OPAC, a silo, a digital archive, or any other storage device. Most librarians responding to the survey remain committed to those cataloging standards that provide consistency of catalog records. Most of the respondents felt that these records will have to be consistent, complete, and accurate. Participation in the Library of Congress programs seems to confirm the need to maintain accepted standards developed and maintained by a responsible body.

The Open WorldCat initiative does not seem to be resulting in a major change in library operations at this time. Library administrators are discussing its impact, but remain focused on continuing to deliver quality services to their own constituents. This study brought together and analyzed important data at the time when the Open WorldCat initiative is being implemented. It also recorded the state of the libraries at this important juncture. Mid-level and upper-level library administrators who responded to this survey perceive the need to provide access to library materials based on priorities dictated by immediate need, as expressed by their own users, and on locally available resources. The Open WorldCat initiative has yet to send strong signals that something specific needs to be done. These signals will most likely emerge in the near future, when some statistical data become available. This will offer an excellent opportunity to evaluate the real impact of the Open WorldCat initiative on the Internet user. Such data will also provide opportunities to study the impact that the WorldCat database is having on the Internet and its users.

Acknowledgment: The authors wish to thank George Klim for reading this article and making valuable comments.

Appendix A: Survey

1. What cataloging standards does your library use for cataloging?
   a. AACR2 rev
   b. LCSH
   c. LC Classification
   d. DDC
   e. UDC
   f. MARC 21
   g. Functional Requirements for Bibliographic Records (FRBR)
   h. Dublin Core
   i. Comments

2. Is your library a member of the Library of Congress International Program for Cooperative Cataloging?
   a. NACO
   b. BIBCO
   c. SACO
   d. CONSER
   e. Comments
3. Will the potential visibility in Open WorldCat create the need for your library to participate in the Library of Congress International Program for Cooperative Cataloging (PCC) to provide more quality control in your records?
   a. NACO
   b. BIBCO
   c. SACO
   d. CONSER
   e. Comments

4. Who does the cataloging in your library?
   a. Librarians
   b. Para-professionals
   c. Student assistants
   d. Vendors (please specify)
   e. Comments

5. If your library decides to enrich records added to WorldCat, are you planning to add professional catalogers to your staff?
   a. Yes
   b. No

6. Do you expect the decision to open WorldCat to increase the cost of cataloging?
   a. Yes
   b. No
   c. Comments

7. Does your library have un-cataloged collections?
   a. Yes
   b. No
   c. Comments

8. What kind of un-cataloged collections does your library have?
   a. Western language print new receipts (please specify)
   b. Western language print retrospective conversion (please specify)
   c. Non-Roman language print new receipts (please specify)
   d. Non-Roman language print retrospective conversion (please specify)
   e. Western language special formats (please specify)
   f. Non-Roman language special formats (please specify)
   g. Electronic resources (e.g. Web sites) (please specify)
   h. Special Collections materials (please specify)
   i. Comments

9. If you are making plans to catalog your un-cataloged collections, which methods are you likely to use to catalog and add these collections to WorldCat?
   a. Collection-level record
   b. Short/brief record (e.g. K level record)
   c. Full bibliographic record
   d. Others (please specify)
   e. Comments

10. Does the potential visibility in Open WorldCat encourage your library to catalog 'hidden' collections?
   a. Yes
   b. No
   c. Comments

11. Will your library attempt to exclude some records from WorldCat after it becomes available through Google?
   a. Special materials (e.g. theses and dissertations)
   b. Special formats
   c. Expensive materials
   d. Comments

12. Since users will be remotely evaluating an item through information recorded in the bibliographic record, is your library likely to provide enrichments to the bibliographic record?
   a. Yes
   b. No
   c. Comments

13. If you answered yes to question number 12, what kind of information would your library add to the bibliographic record?
   a. Abstracts
   b. Table of contents
   c. Summary
   d. Key words
   e. Others (please specify)
   f. None
   g. Comments

14. Libraries are now engaged in creating local repositories of digital items that are kept in silos. Some of these are available online and can be searched by Google, Yahoo!, etc. Does your library or institution have such a repository?
   a. Yes
   b. No
   c. Comments

15. If yes, does your library also provide access to the repository contents through your OPAC?
   a. Yes
   b. No
   c. Comments

16. Does your library also provide access to the repository contents through the bibliographic utilities such as OCLC and RLIN?
   a. Yes
   b. No
   c. Comments

17. Could you provide a general description of materials held in your repository?

18. If not, do you envision a future when all library contents will be digitized and available electronically in full text?
   a. Yes
   b. No
   c. Comments

19. What is the approximate size of your library?
   a. Under 500,000
   b. 500,000-1,000,000
   c. 1,000,000-3,000,000
   d. 3,000,000-5,000,000
   e. Above 5,000,000
   f. Comments

20. What utility does your library use for cataloging?
   a. OCLC
   b. RLIN
   c. Others (please specify)

21. Please identify yourself and your institution.
APPENDIX B: INVITATION

Dear Colleagues,

I am writing to ask you to participate in a very brief survey on the impact of Open WorldCat and accessing materials remotely in ARL libraries. You were chosen as the most obvious person in your organization to answer a few short questions about the impacts in your library or libraries. If you are not the appropriate person to answer these questions, please pass along this message to another person in your library. I realize that many libraries have complex organizational structures and functions may take place in many departments. This brief survey is intended to address the question of the impact of Open WorldCat and accessing materials remotely. There are only twenty-one questions, which should take fewer than ten minutes to complete. If you prefer, I would be happy to send a paper copy in the mail so that it could be completed without using electronic communications, or to send the survey as an e-mail attachment. In any case, the responses will be kept confidential and will not be associated with any individual institution. I appreciate your willingness to participate in this research project, and I thank you in advance for your time. Please feel free to contact me if you have any specific questions. I plan to publish my results, and would also be happy to discuss the project in more detail.

Magda El-Sherbini
Head, Cataloging Department
Ohio State University Libraries
http://www.zoomerang.com/survey.zgi?p=U244C4LGE9AR

Notes and References

1. Stephen Abram, "The Google Opportunity," Library Journal (February 1, 2005). Available: http://www.libraryjournal.com/article/CA498846.html (Accessed June 6, 2006).
2. Judy Luther, "Google? Metasearching's Promise," Library Journal (October 1, 2003). Available: http://www.libraryjournal.com/article/CA322627.html (Accessed June 6, 2006).
3. "RedLightGreen." Available: http://www.rlg.org/en/page.php?Page_ID=433 (Accessed May 22, 2006).
4. Barbara Quint, "OCLC Project Opens WorldCat Records to Google," InfoToday NewsBreaks (October 27, 2003). Available: http://www.infotoday.com/newsbreaks/nb031027-2.shtml (Accessed June 6, 2006).
5. Barbara Quint, "Yahoo! Search Joins OCLC Open WorldCat Project," InfoToday NewsBreaks (July 6, 2004). Available: http://www.infotoday.com/newsbreaks/nb040706-2.shtml (Accessed June 6, 2006).
6. Barbara Quint, "All of OCLC's WorldCat Heading Toward the Open Web," InfoToday NewsBreaks (October 11, 2004). Available: http://www.infotoday.com/newsbreaks/nb041011-2.shtml (Accessed June 6, 2006).
7. "RLG to Combine with OCLC." Available: http://www.oclc.org/news/releases/200618.htm (Accessed June 6, 2006).
8. "Open WorldCat Program." Available: http://www.oclc.org/worldcat/open/default.htm (Accessed June 6, 2006).
9. "Quick Facts about the Open WorldCat Pilot." Available: http://www.oclc.org/worldcat/open/facts/default.htm (Accessed June 7, 2006).
10. "How the Open WorldCat Program Works." Available: http://www.oclc.org/worldcat/open/how/default.htm (Accessed June 7, 2006).
11. "Open WorldCat Pilot: Frequently Asked Questions." Available: http://www.oclc.org/worldcat/open/faq/default.htm (Accessed June 7, 2006).
12. "User-Contributed Content Pilot." Available: http://www.oclc.org/worldcat/open/features/default.htm (Accessed June 7, 2006).
13. David Mattison, "RedLightGreen and Open WorldCat: Changing the World of Academic Search," Searcher 13 (4) (April 2005): 14-23. Available: http://web.ebscohost.com/ehost (Accessed October 17, 2006).
14. Nancy O'Neill, "Open WorldCat Pilot: A User's Perspective," Searcher 12 (10) (November/December 2004). Available: http://www.infotoday.com/searcher/nov04/oNeill.shtml (Accessed June 7, 2006); Gary Price, "Web Search-Yahoo! News Breaks: Two Million Open WorldCat Records Hit the Yahoo! Database," ResourceShelf (July 7, 2004). Available: http://www.resourceshelf.com/2004/07/two-million-open-worldcat-records-hit.html (Accessed June 6, 2006); Gary Price, "Open WorldCat: Subject Headings Now Hyperlinked in Open WorldCat," ResourceShelf (November 10, 2004). Available: http://www.resourceshelf.com/2004_11_01_resourceshelf_archive.html (Accessed June 6, 2006).
15. Gary Price and Steven M. Cohen, "OCLC Opens up the Complete WorldCat Database to Web Engines and Other Partners," ResourceShelf (October 11, 2004). Available: http://www.resourceshelf.com/2004_10_01_resourceshelf_archive.html (Accessed June 6, 2006).
16. Paula Wilson, "Open WorldCat: Earth's Largest Library," Public Libraries (March-April 2005): 82-83.
17. "Association of Research Libraries." Available: http://www.arl.org/members.html (Accessed June 6, 2006).

work_ubghlx6iajebvdqg4q5fg4grri ----

by Norm Medeiros
Associate Librarian of the College
Haverford College
Haverford, PA

Good Enough is Good Enough: Cataloging Lessons from the University of California Libraries
___________________________________________________________________________________________________

{A published version of this article appears in the 22:3 (2006) issue of OCLC Systems & Services.}

"Baseball is like church. Many attend, few understand." -- Leo Durocher

ABSTRACT

This article reviews the recent report, "Rethinking How We Provide Bibliographic Services for the University of California." It discusses some of the report's recommendations in light of similar initiatives underway. The article includes comments from John Riemer, Chair of the Bibliographic Services Task Force, the group responsible for the report. The article concludes by affirming many of the suggestions detailed in the report.

KEYWORDS

Bibliographic services; cataloging; metadata; University of California Libraries; John Riemer

Living in interesting times, as the saying goes, is either a blessing or a curse, and I suspect those administering academic technical services over the next few years will have very strong opinions on this matter. Library patrons to a large degree are growing dissatisfied with the fragmented and ever-expanding array of information tools that confront them, none of which seem terribly easy to use. The online catalog (opac) is a particular source of frustration for many users. In light of this dissatisfaction, I read with much interest the recent report issued by the Bibliographic Services Task Force (BSTF) at the University of California (UC) Libraries. "Rethinking How We Provide Bibliographic Services for the University of California" describes means by which academic libraries can continue to provide relevant, sophisticated bibliographic services to their constituents. There is much hard truth in the 79-page document, and it comes at a time when similar questions regarding the future of library services are at the forefront of many minds.
"Only through knowing our audience, respecting their needs, and imaginatively reengineering our operations, can we revitalize the library's suite of bibliographic services." --UC report

UC's Systemwide Operations and Planning Advisory Group (SOPAG) established the Bibliographic Services Task Force (BSTF), charging it with assessing means by which bibliographic services can provide experiences similar to those provided by Amazon and Google. Not surprisingly, BSTF's resulting report references Amazon 18 times and Google 23 times. SOPAG clearly recognized these commercial enterprises as the standards to which UC should aspire. John Riemer (UCLA), Chair of the BSTF, identified the rapidly growing stable of non-interoperable information tools as a prevailing factor that led to the formation and charge of the group (Riemer, 2006). As Riemer notes, students usually don't know which tool will satisfy a particular information need, and thus these stymied users often abandon library resources for the immediate gratification provided by the general Web. Moreover, given the declining need for undergraduates to visit campus libraries, the noticeable migration of users away from the virtual library is a cause for alarm.

In the course of performing its work, the BSTF reviewed a large assortment of papers on the topic of improving bibliographic access to information, while also soliciting the informed opinions of library leaders and visionaries such as John Byrum, recently retired Chief of the Regional and Cooperative Cataloging Division, Library of Congress; Lorcan Dempsey, Vice President and Chief Strategist, OCLC; Clifford Lynch, Director, Coalition for Networked Information; and Roy Tennant, User Services Architect, Digital Library Services, California Digital Library. The resulting report provides a succinct, persuasive set of recommendations for improving the way libraries provide bibliographic services. The executive summary states the obvious: library systems can't compete with tools provided by juggernauts Amazon and Google. The summary continues with the less obvious statement, however, that libraries offer fragmented systems to users, the distinctions of each being lost on nearly all undergraduates. Moreover, libraries are expending great effort -- too much effort, it's argued -- at maintaining these fragmented systems. The summary ends on a deservingly ominous note, stating that the recommendations contained within the report must be implemented if libraries "are to remain viable in the information marketplace."

"For the past 10 years online searching has become simpler and more effective everywhere, except in library catalogs." --UC report

Focusing on a few of the UC recommendations should provide a sense of the simple, practical, and essential activities that academic libraries must learn to provide and/or incorporate into their bibliographic provision:

Offer alternative actions for failed or suspect searches

I can't count the number of times Google has saved me from a failed search, simply by recognizing a misspelled word within my query. For instance, if one searches for Champange Jam in Google, Google's first response is to ask whether I meant Champagne Jam. After acknowledging that I did mean Champagne Jam, I am brought to a results list that links me to information about the Atlanta Rhythm Section's album. My library's catalog, however, returns no results when the query is entered mistakenly. It suggests I search Champange Jam as keywords, but this too yields no results, since the catalog is helpless against misspellings. To users, it looks as though the library doesn't own a copy of this album, though in fact we do. As a result, users are too harshly punished for misspelling and mistyping, and may harbor dissatisfaction towards the library, particularly given the ease with which Google recognizes and corrects obvious spelling errors.
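The machinery behind a "did you mean" prompt is not exotic, and nothing prevents an opac from offering one. The sketch below is purely illustrative (it is not drawn from the UC report or any vendor's system): it compares each word of a failed query against a hypothetical slice of a catalog's word index using fuzzy matching.

    import difflib

    # Hypothetical slice of a catalog's indexed title words; a real
    # system would pull these terms from its own search indexes.
    INDEXED_TERMS = ["champagne", "jam", "atlanta", "rhythm", "section"]

    def did_you_mean(query):
        """Return a corrected query for a failed search, or None."""
        corrected, changed = [], False
        for word in query.lower().split():
            # Fuzzy-match each word against the index (similarity cutoff 0.8)
            match = difflib.get_close_matches(word, INDEXED_TERMS, n=1, cutoff=0.8)
            if match and match[0] != word:
                corrected.append(match[0])
                changed = True
            else:
                corrected.append(word)
        return " ".join(corrected) if changed else None

    print(did_you_mean("champange jam"))  # -> champagne jam

A production catalog would weight suggestions by term frequency and holdings, but even this crude approach would turn the failed Champange Jam search into a successful one.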
Provide relevance ranking and leverage full-text

Library catalogs are increasing in size while user search strategies are becoming less sophisticated -- a recipe for very large result sets. Even the most dedicated bibliographic instruction program will only effect change in a small number of students. Rather than fight the tide, results should be ranked and/or clustered as a means of helping searchers more ably access the most desired results. Given the expectation of immediate gratification, part of the relevancy could be based on the availability of an electronic version of the desired object. It's not unreasonable to think that librarians could develop criteria that would provide useful relevancy regardless of the simplicity or sophistication of the query. Yet many opacs don't allow anything beyond the most rudimentary relevancy setting.

Automate metadata creation / manually enrich metadata in important areas

The title of this article is a quote from the UC report section referencing the need to automate metadata creation. From an administrative perspective, I think it's critical for library cataloging departments to cease seeking perfection. There are many reasons for my position, chief among them the need to deploy cataloging staff to other projects that require fairly complex levels of cataloging. It's inconceivable for cataloging staffs to continue to provide near-perfect bibliographic records while also immersing themselves in an ever-growing array of non-MARC/AACR digital projects. Moreover, from a strictly utilitarian perspective, maintenance of such near-perfection is not warranted. Libraries should determine acceptable error rates for different parts of the bibliographic record and be comfortable adhering to them. There are simply too many competing demands to allow legacy cataloging practices to inhibit such progress.

"If we wish to remain a contender in the information marketplace, we need to incorporate efficient ways for obtaining, creating, and exporting metadata." --UC report

It did not go unnoticed that the above quote says contender rather than leader. The implication is that the library community has surrendered the top spot in providing information to undergraduates, and it's likely we will never regain it. The UC report is sprinkled with a sufficient and warranted number of similarly alarming statements about the future of libraries. It should serve as a wake-up call. Indeed if it doesn't, academic libraries jeopardize their existence.

WORKS CITED

Riemer, J. (2006). Telephone correspondence with the author (10 April 2006).

work_ug5e77el35cb5c6h6zi574c7ja ----

Free as in Tibet: ibiblio's cultural cultivation and community creation

Jessamyn West

The author: Jessamyn West is Outreach Librarian, Rutland Free Library, Rutland, Vermont, USA.

Keywords: Digital libraries, Communities, Public domain software

Abstract: ibiblio is a digital library hosted at the University of North Carolina-Chapel Hill that manages to be both a repository for cultural information and a resource for community building.
The project has existed in many forms since the beginning of the web, and has maintained a core commitment to open source software and tools. ibiblio's maintainers have continually expanded the project's offerings in response to the availability of new technologies and the support of financial and technological partners. Their newest project is an open source weblog development and distribution system.

If you've got something unusual that you want to share with the world, chances are that ibiblio will put it online for you [1].

I have a confession to make. I am not just discussing ibiblio's digital archive -- which they call "the public's library" -- I am also a client. By "client", I mean someone to whom ibiblio has graciously extended the free use of their servers to host my weblog librarian.net and its associated archives. My small site is one of many content archives hosted by the ibiblio project at the University of North Carolina, one of the largest "collections of collections" online. ibiblio is run by librarians, computer scientists and students. The servers host many large, well-known digital archives and provide services such as storage, shell access, mailing lists, technical support and website statistics for the site owners and maintainers. Where other digital repositories tout their size and power in terms of terabytes and gigaflops, ibiblio prefers to think of their immense storage and retrieval systems in terms of freedom. To quote their FAQ, "We're all about freedom, man! Free Tibet, Free Burma, Free Love, you get the picture. We offer a free platform for the exchange of free thought" [2]. As libraries have become more than just storage places for books, the role of digital libraries has also been growing and changing. ibiblio, through offering space, tools and know-how, and encouraging synergies between them, is "breaking the fourth wall" [3] of the Internet: using the web to connect people to each other.

My website and its associated archives comprise about a hundred files out of over two million that live on ibiblio's servers. I had been maintaining librarian.net as a labor of love for several years, paying the hosting and domain registration fees myself and updating and designing when I found time. One of ibiblio's collection developers contacted me and asked if I'd like to host my pages on their servers free of charge. They had more robust server architecture than the host I was currently with, and their only caveats were that all my content had to be freely available, and I could not sell anything or solicit donations from the site. I had been interested in moving to a content management system (CMS) to run my website for some time, and this provided me the perfect opportunity to upgrade while not incurring additional costs. I moved my site over after installing Movable Type in the last week of September 2003. The more time I spent looking at the collections that ibiblio hosted, with their collection index available in Universal Decimal Classification and their list of online library catalogs dated 1991, the more I became interested in this fusion of computing and libraries.
This overview includes information on ibiblio's system architecture, statements of purpose over time, and some evaluation of how they have managed to do what many aspire to: create an actual community out of a series of websites.

Form

Tools have their way of defining us and defining our next tools as well. The Web became more of an encyclopedia than a town hall with Mosaic and in that we are all made a little less by it (Jones, 2003).

The ibiblio project is much more than a vast filing cabinet with a fat pipe to the Internet, though it is also that. Here is some raw data from the project [4]:
- constant outbound network traffic in the 160-180 Mbits/sec range;
- five terabytes of server space;
- 1.5 terabytes of data moved daily;
- 10 million server requests per day;
- original home to the Internet Movie Database [5] and the Internet Underground Music Archive [6], and the current home to Project Gutenberg [7]; and
- host of the Linux Software Archives [8], housing 171 gigabytes of freely available Linux programs, as well as The Linux Documentation Project [9].

The project has always been conceptualized as a community resource. The bits and bytes that were ibiblio's predecessors grew out of a technical support system at the University of North Carolina. Their DECStation 3100 system had 1.2GB of disk space -- an amount equal to the RAM on many current home computers -- and ran Ultrix, an old Unix-like operating system created by Digital Equipment Corporation. In an attempt to make university resources available to UNC alumni in the late 1980s, they built a Bulletin Board System (BBS) facilitating access for remote users. By the early 1990s they were experimenting with wide area information server (WAIS) systems, distributed text-searching systems that search remote databases using the z39.50 protocol. According to current project director Paul Jones, "we were in a kind of race with Brewster [Kahle] to see who could get the most and the most interesting databases available on the net" [10]. Jones wrote a grant application to Sun Microsystems in 1992 asking for server resources to "make free software and multimedia resources of interest to Sun users available to all". They called the program Sun Software, Information & Technology Exchange (SunSITE) [11,12] and became the first grantee in a program that now includes over 50 members worldwide. Some of the purposes of SunSITEs (which were outlined originally as part of Jones's proposal) were to provide easy, global access to free software and tools, and to act as a repository for key Sun, local and government information [13]. The Clinton administration became the first presidency to have its presidential papers, speeches and even budgets archived and made available online, years before Yahoo was available. On campus, the project was jointly sponsored by the School of Journalism and Mass Communication, the School of Information and Library Science and the Office of the Vice Chancellor for Information Technology and Networks. The University of North Carolina's (UNC) SunSITE was one of the first public file transfer protocol (FTP) servers, and represented a way for people who used the Internet for communication to keep track of that communication.
"[I]f the electronic mailing list was the community, SunSite was the book shelf", Eric Troan, former Linux Archive maintainer, has said [14]. ibiblio's support from Sun continued until 1998, when they amicably parted ways and the servers were temporarily renamed MetaLab [15]. Soon thereafter, a local open source software company called Red Hat approached Jones about working with their new philanthropic organization, which later became the Center for the Public Domain (CPD) [16]. CPD pledged $4 million in support over five years. To reflect their renewed commitment to information sharing, the project managers rechristened the site ibiblio in 2000 -- "a made-up word that alludes well to librararyness" [17] -- the name that it bears today [18,19]. Their goal is to become the largest collection of freely distributed information on the Internet or, put another way, "a lively, noisy, Jacksonian library" [18].

Function and family

The free market is a wonderful device for cooperation, and we say there is no point in fighting it when it's so much easier (and so much more fun) to co-opt it [20].

Unlike many online ventures that call themselves electronic libraries, ibiblio has no product. It sells nothing. In fact, one of the few content restrictions for collections on ibiblio is that the collection must be entirely non-commercial: no t-shirts, no bumper stickers, no corporate sponsor, no banner ads. Browsing ibiblio's content takes you back to a web era before pop-ups, when content was king. ibiblio also reflects an earlier time on the web through hosting the Linux Archive, nearly 200 gigabytes of Linux programs and documentation dedicated to helping Linux and open source software "evolve faster and spread further" [20]. Paul Jones, director of the project, open source and poetry advocate, and member of the Luxuriant Flowing Hair Club for Scientists [21], created his first hypertext page in 1991 and has been active recently in newer commons-oriented projects such as the Creative Commons licensing project [22]. To the public, ibiblio exists as one of the great "go to" places on the web, hosting the Online Burma Library, the Folk Music Index, and the Internet Poetry Archive. To the archive owners, the site is an oasis where they can create and present content free of the usual financial, space and support restraints that can be prohibitive to large collections. To the system maintainers, an assortment of students [23], staffers and educational and financial supporters, the project is a way of realizing their dream of a 21st century library using open source technology. They envision the site as being driven by the collaborators who choose and maintain the content and lend their personalities to their projects. While the servers are maintained by UNC students and a few paid staffers, each content provider operates as a homesteader within their own space on the ibiblio servers, setting up their own archives, processes, files, programs and communities. A collection of mailing lists keeps the participants in touch with each other and allows for rapid dissemination of information, as well as serving as an informal tech support network. The ibiblio site has several Wikis available for public interaction to facilitate further information exchange.
Wikis are pages on the web that can be edited on the fly by anyone with a knowledge of the tags used in the Wiki syntax -- putting square brackets around a page name to create a link to that page, for example [24] -- and are used for creating spaces online where users can help each other [25], or even where they can create and add content [26]. Jones and the site take pains to point out that just because they believe in free software, it doesn't mean they are against the free market:

We think [selling low-cost CD-ROMs of files in the archives] is a good thing, because it spreads Open Source software much faster than pure network distribution possibly can. It also creates market incentives for people to support Linux and Open Source software as a full-time job, something we think is essential if we want to succeed at waking the world up from proprietary nightmares [27].

In their quest to represent the future of Internet librarianship, they would like the field to be able to support paid practitioners as well as provide free content. This model is a bit odd to people used to supply-and-demand economics, but is fairly familiar on the Internet. As Ghosh (1998) explains in his paper on what he calls "cooking pot markets", "... much of the economic activity on the Net involves value but no money". He then later rhetorically asks "What, indeed, is valuable, when everything's free?", and finds an answer in the vast scale of the Internet, creating not a barter or a gift economy, but a large cooking pot:

The economy of the Net begins to look like a vast tribal cooking-pot, surging with production to match consumption, simply because everyone understands [...] that trade need not occur in single transactions of barter, and that one product can be exchanged for millions at a time. The cooking-pot keeps boiling because people keep putting in things as they themselves, and others, take things out (Ghosh, 1998).

With its attempts to grow larger and larger, ibiblio begins to approximate cooking-pot scale and, additionally, cooking-pot economics. The question then becomes: who runs the place? The ibiblio team learned early on that removing the librarian authority figure from the archive and collection development model required some tinkering to ensure quality control of the material available. Or, put another way, what happens to authority control when everyone (or no one) is the librarian? Jones approached this problem by creating and offering tools for the community to assess, approve, and comment on the items submitted to the Linux Archive:

By removing nearly all barriers to submission and instituting instead some simple verification procedures, we were able to accept (and later distribute) very high quality software with a very low rejection rate. [...] By giving contributors and readers access to tools for evaluation, ranking and managing the collections, we are not just off-loading work; we are building communities of intellectual discourse. Strong community members are recognized by reputation capital and trust metrics and are rewarded (Jones, 2001).

For example, when a user wants to upload open source software to the Linux Archives, they must first fill out a template that ensures that their software includes a complete set of metadata, such as keywords and contact information. Software cannot be uploaded without this document. The information from this document is then fed to the archiving program, which categorizes it for easy retrieval [28].
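The template the archive asked for was, in practice, a Linux Software Map (LSM) entry, a simple fielded text file. The sample below is illustrative only: the package name, author, and archive path are invented, though the field layout follows the public LSM convention.

    Begin3
    Title:          gizmo
    Version:        1.0.2
    Entered-date:   2003-09-20
    Description:    A small utility for converting playlist formats.
    Keywords:       audio, playlist, conversion
    Author:         jdoe@example.org (Jane Doe)
    Primary-site:   ibiblio.org /pub/Linux/apps/sound/convert
    Copying-policy: GPL
    End

Because each field sits on its own predictable line, the archiving program can parse an entry like this mechanically and file the package under the right category and keywords with no human mediation.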
Strict naming conventions are observed so that a user can tell if the software in the archive is more recent than the one they have, just by looking at the filename [29]. (Under such a versioned scheme, a hypothetical gizmo-1.0.3.tar.gz visibly supersedes gizmo-1.0.2.tar.gz.) A similar approach is seen with etree.org, a site on ibiblio dedicated to digital-audio distribution of high-quality concert recordings, all legally tradable [30]. etree.org is maintained by an all-volunteer community of 85,000 registered users who together own digital recordings of over 150,000 unique shows [31]. The site uses a very strict set of naming conventions for their uploaded files to facilitate easy storage and retrieval; these conventions typically encode the performer, the show date, and the recording source directly in the filename. Each file that is indexed by the site is available to be reviewed by the entire community, with reviews of the recording posted directly to the information page about the file [32]. Individuals can set up trades with other members and are encouraged to post feedback about these trades [33]. While supremely bad behavior can get a user banned from the site, the community mostly self-moderates. Since no money is being exchanged, reneging on a trade can get you at most a few free CDs, but it will also inspire the enmity of a large community with access to a very large collection of resources. Involving the people who care about the material when you create and maintain the archive is the best way to ensure high quality of that same material.

This reputation model is seen at work in ranking and karma systems built into many larger sites such as eBay, Amazon and Slashdot, as well as social software sites like orkut.com. Creating a reputation system in a community that people want to be involved in encourages co-operation. People who participate have a stake in the work created, whether it is a software repository, a book review or a comment on a threaded discussion list (Durand, 2002). Once the internal content is created by the community, ibiblio is available to help present the archives and other content to the external community, where reputation models are not as binding. ibiblio's work with Creative Commons puts them in a position to assist or enable their content providers to share their material on their own terms. While ibiblio maintains that collections on their server must be provided to the public at no cost, they do not envision this sharing as an information free-for-all where people plunder available resources simply in order to make them available in a for-profit manner outside of ibiblio. "We are working with Creative Commons, which we also host, to develop a small but viable set of licenses for folks including our contributors who want to share their work on various terms (attribution, home or personal use, educational use, etc.)", says Jones [4].

Future

Revolution is born of enabling technologies; this is our experience with the Internet. Technologies that facilitate the sharing of information, in ways both remarkable and intuitive, enables users fundamentally [34].

My mention of ibiblio as the host of my weblog wasn't just for publicity: it represents a new direction being taken by the project. Fred Stutzman, one of ibiblio's few paid employees, has been working on an open-source weblog development project for the past six months.
The general idea is to position ibiblio not only as an archive for storing and retrieving open source software, but also as a place where interested people can comment on and discuss their open source and academic topics. Using an open source tool called Lyceum, they hope to ease people's transition into communicating digitally about topics they are already discussing. The software has the added benefit of offering robust tracking and reporting tools built into it for users to track site statistics and visitors. According to Stutzman, making this software freely available will allow people to focus on content and not get dissuaded by hard-to-use tools or technological barriers to entry: "Blogs, like listservs, emails, bulletin boards, are just wrappers for digitized thought", he says. "The main difference I see is the simplicity and pervasiveness of blogs; many great content producers who once enjoyed a luddite status have no excuse to not contribute to the digital sphere" (Stutzman, 2004).

Paul Jones's vision for the future also shows how far they've come:

What began as let's-see-what-happens has become a valuable net.resource for millions of people and is becoming more of a trusted archive and a contributor-run digital library. We've become, thanks to my colleagues, more aware of our roles as archivists, librarians, publishers and broadcasters while trying to remain true to openness, information sharing, and user empowerment that were our roots.

The staff I spoke with all included "helping people" as one of their favorite things about their work (Lazorchak, 2003). Jones fleshes this out, saying:

Not every job gives you a chance to help out a Nobel prize winner in literature, the Dalai Lama, an organic farming cooperative, a rock musician, and a historical [chronicler] of slavery, not to mention helping out the folks who use the information that's shared here (Jones, 2004).

With the assistance of newer technologies and a collection of people devoted to the cause, ibiblio brings the best ideas from the technology world (i.e. open source development of things and ideas, rapid deployment of new ideas, and a sense of humor) together with positive aspects of libraries (i.e. free access for all, quality collections, good finding aids) to create a sustainable 21st century digital library and, even more importantly, a thriving digital library community.

Notes

1 newsobserver.com article "The Renaissance geek", available at: www.ibiblio.org/pjones/menconi.html
2 ibiblio FAQ, available at: www.ibiblio.org/faq/?sid=1#2
3 "Fourth wall", Wikipedia, available at: http://en.wikipedia.org/wiki/Fourth_wall
4 Unless otherwise indicated, statistics come from "Slashdot: ibiblio Director Paul Jones Answers", available at: http://interviews.slashdot.org/interviews/02/08/07/0010200.shtml
5 Internet Movie Database, now available at: www.imdb.com
6 Internet Underground Music Archive, now available at: www.iuma.com
7 Project Gutenberg, available at: www.gutenberg.net/
8 Linux Software Archive, available at: www.ibiblio.org/pub/Linux/
9 The Linux Documentation Project, available at: http://tldp.org/
10 Brewster Kahle, founder of the Internet Archive Project, available at: www.archive.org
11 Original URL for the project, available at: http://sunsite.unc.edu
12 Original web page for the project, available at: www.ibiblio.org/newlook/old.html
13 SunSITE project page, available at: www.sun.com/sunsite/
14 "The wide, wild world of ibiblio", available at: www.ibiblio.org/pjones/ibiblio/dyrness-story.html
15 Available at: http://www.metalab.unc.edu
16 Center for the Public Domain, available at: www.centerpd.org/
17 "ibiblio takes MetaLab contempt to a new level", available at: http://slashdot.org/features/00/09/17/155240.shtml
18 Press release announcing rename, available at: http://carolinafirst.unc.edu/connections/fall2000/fall00ibiblio.htm
19 ibiblio's first home page, available at: www.ibiblio.org/index.old/index-old.html
20 ibiblio Linux Archive Mission, available at: www.ibiblio.org/pub/linux/POLICY.html
21 Luxuriant Flowing Hair Club for Scientists site, available at: www.improb.com/projects/hair/hair-club-top.html
22 Creative Commons, available at: http://creativecommons.org/
23 "ibiblio ratz" staff page, available at: www.ibiblio.org/wdg/
24 "How to edit a page", Wikipedia, available at: http://en.wikipedia.org/wiki/Wikipedia:How_to_edit_a_page
25 "IbibConsultWiki", available at: www.ibiblio.org/ic/
26 "Permaculture Wiki", available at: www.ibiblio.org/ecolandtech/pcwiki/index.php/HomePage
27 "ibiblio Archive Mission", available at: www.ibiblio.org/pub/linux/POLICY.html
28 "How to submit open source software", from the ibiblio Linux Archive, available at: www.ibiblio.org/pub/linux/HOW.TO.SUBMIT.html
29 "How to name things", from the ibiblio Linux Archive, available at: www.ibiblio.org/pub/linux/NAMES.html
30 Motto "Free music, free software, free thought", available at: http://etree.org
31 etree self-reported statistics, available at: http://db.etree.org/stats.php
32 For examples of file comments, see http://db.etree.org/shninfo_detail.php?shnid=22006#comments
33 For examples of personal comments, see http://db.etree.org/userrating_view.php?ref_userid=mudpie
34 ibiblio's blog project, available at: http://blogs.ibiblio.org/

References

Durand, A. (2002), "Pondering digital reputations", Kuro5hin, 1 May, available at: www.kuro5hin.org/story/2002/4/30/225111/850
Ghosh, R.A. (1998), "Cooking pot markets: an economic model for the trade in free goods and services on the Internet", First Monday, Vol. 3 No. 3, available at: www.firstmonday.dk/issues/issue3_3/ghosh/
Jones, P. (2001), "Open(source)ing the doors for contributor-run digital libraries", Communications of the Association for Computing Machinery, Vol. 44 No. 5, pp. 45-6.
Jones, P. (2003), "Web turns 10 - but was Mosaic really first and best browser? No, no", LocalTechWire, 21 April, available at: www.localtechwire.com/article.cfm?u=3891
Jones, P. (2004), Personal communication, 23 January.
Lazorchak, B. (2003), Personal communication, 15 December.
Stutzman, F. (2004), Personal communication, 23 January.

Further reading

Delio, M. (2003), "Where sharing isn't a dirty word", Wired News, 15 November, available at: www.wired.com/news/roadtrip/0,2640,61200,00.html
Tennant, R. (1997), "A digital library showcase and support service: the Berkeley Digital Library SunSITE", Ariadne, No. 10, available at: www.ariadne.ac.uk/issue10/sunsite/
Witten, I. (2003), "Examples of practical digital libraries: collections built internationally using Greenstone", D-Lib Magazine, Vol. 9 No. 3, available at: www.dlib.org/dlib/march03/witten/03witten.html
work_ukih6to6bvhrpoua2udgyh5ogy ----

Faculty and Staff Publication: Shannon Pritting, William Jones III, Timothy Jackson, and Michael Mulligan

Citation for PrePrint: Pritting, S., Jones, W., Jackson, T., & Mulligan, M. (2017, April 12). Enhancing Resource Sharing Access Through System Integration. Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 1-14. http://dx.doi.org/10.1080/1072303X.2017.1305034
http://www.tandfonline.com/doi/full/10.1080/1072303X.2017.1305034

Enhancing Resource Sharing Access through System Integration

Shannon Pritting, Library Director, SUNY Polytechnic Institute, Utica NY
William Jones III, Creative Technologist, IDS Project, Geneseo NY
Timothy Jackson, Resource Sharing & Reserves Coordinator, University at Albany, Albany NY
Michael Mulligan, Information Management & Technology, SUNY Upstate Medical University, Syracuse NY
The method to integrate systems that the IDS Project took was to create a middleware platform, IDS Logic, that can connect multiple library systems and open or vendor web services to create the best resource sharing experience for staff and patrons. One specific application that is hosted within the IDS Logic middleware platform is Article Gateway, which uses resource sharing technology and workflows to deliver fast or instant access to research material to users with little or no staff time and removes as many barriers to user access as possible. Where resource sharing has typically sought to deliver articles in one-to-two days, libraries using Article Gateway typically deliver a significantly higher percentage of articles to patrons within a few hours. ENHANCING RESOURCE SHARING ACCESS THROUGH SYSTEM INTEGRATION 3 Enhancing Resource Sharing Access Through System Integration Resource Sharing is an area of academic libraries that often serves to illustrate the operational effect of larger issues such as disparate systems needing integration, and increasing costs for access to journal content. Resource Sharing departments also help to create solutions and different approaches to major issues, benefitting both library users and library staff. Electronic subscriptions occupy an ever increasing majority of budgets and prevent libraries from expanding services in other areas. There are few attractive options for libraries who want to provide access to research materials to users in a way that is cost effective and simple for users. Libraries are caught between subscriptions to single journals, large research collections, or article-level purchasing that either involves delays or must allow expensive access to everyone, which can become costly. The IDS Article Gateway platform, developed by the IDS Project and SUNY Polytechnic Institute Cayan Library, uses resource sharing technology and workflows to deliver fast or instant access to research material to users in a way that involves little or no staff time and removes as many barriers to user access as possible. Where resource sharing has typically sought to deliver articles in one-to-two days, libraries using Article Gateway deliver a high percentage of articles within four hours. The IDS Article Gateway is delivered through IDS Logic, a middleware platform created by the IDS Project that aims to integrate different library systems and connect with vendor web services to help Resource Sharing departments improve existing services and deliver new services. Literature Review In the past several years, there are many examples of individual libraries using Application Programming Interfaces (APIs) or web services to address issues, streamline work, or enhance library functions. Additionally, library vendors are beginning to include access to ENHANCING RESOURCE SHARING ACCESS THROUGH SYSTEM INTEGRATION 4 extract data from systems through APIs, and to allow for connection to systems via web services and APIs. However, beyond groups such as OCLC's developer's network (https://www.oclc.org/developer/home.en.html) or the Ex Libris Developer network (https://developers.exlibrisgroup.com/), which are meant for sharing ideas and code, there is not a large community or cooperative based on technology integration and development. The IDS Project brings together a community with ideas and strategies about how to improve libraries, and connects them with a platform and development expertise to integrate systems and foster innovation. 
Recent examples of system integration through application and software development reveal how much effect software solutions can have on library functions. In 2011, Wayne State University created an application that connected data from its two systems, ILLiad and ArticleReach, and submitted orders to the Copyright Clearance Center API (Sharpe and Gallagher, 2011). It was estimated that upon implementation, the library would save over 500 staff hours per year that were previously spent paying royalties (Sharpe and Gallagher, 2011, p. 137). Services such as CCC and the global library cooperative OCLC are ripe for integration, and the positive effect in saved time is evident, even with integrative applications that are limited in scope. As OCLC services overlap with many other systems, whether in cataloging, resource sharing, or discovery such as WorldCat, the OCLC web services are key resources to leverage for system integration. Sarah Johnston of St. Olaf College developed Perl scripts to drive an application that uses the WorldCat Metadata API to create a "do-it-yourself" reclamation project (2015, p. 1). The project at St. Olaf allowed for a high level of automation for a reclamation of roughly 500,000 holdings, with only minimal staff intervention (Johnston, 2015, p. 5). Terry Reese, at Ohio State University, provided a thorough analysis of the OCLC Metadata API, seeing the creation of the API as "a welcome shift in how libraries are able to interact with their data, and a set of opportunities to develop new collaborations and workflows around the library community's metadata operations" (2014, p. 9). The API is a major shift from the environment that is "tightly controlled and coupled to the client software OCLC has provided to the cataloging community," which has resulted in "a lack of innovation and integration of workflows, as the need to work with WorldCat hamstrung those efforts" (Reese, 2014, p. 10). The ability to integrate systems changes how libraries can operate and frees them to work more effectively. Since OCLC services are so highly integrated into so many areas of libraries, having access to OCLC APIs and web services is crucial.
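A "do-it-yourself" reclamation of the kind Johnston describes is, at its core, a loop over the catalog's OCLC numbers with one web service call per record. The sketch below shows that pattern in Python rather than Johnston's Perl; the endpoint URL and credential are placeholders, since the real WorldCat Metadata API requires a registered WSKey and its request format has changed across versions.

    # Sketch of a batch holdings-maintenance loop in the spirit of a
    # "do-it-yourself" reclamation. Endpoint and auth are placeholders;
    # consult the current WorldCat Metadata API documentation for real values.
    import requests

    API_BASE = "https://example.worldcat.org/ih/data"   # hypothetical endpoint
    HEADERS = {"Authorization": "Bearer <token>"}        # placeholder credential

    def set_holding(oclc_number):
        """Assert the institution's holding on a single OCLC number."""
        resp = requests.post(API_BASE, params={"oclcNumber": oclc_number},
                             headers=HEADERS, timeout=30)
        return resp.status_code

    def reclaim(oclc_numbers):
        """Walk the local catalog's OCLC numbers, collecting any failures."""
        failures = []
        for number in oclc_numbers:
            if set_holding(number) not in (200, 201):
                failures.append(number)
        return failures

    # e.g. failures = reclaim(numbers_exported_from_the_ils)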
The library technology environment involves maintaining many different systems, "often with overlapping spheres of functionality or data" (Breeding, 2014, p. 22). For libraries, "such a matrix of interrelated products and services brings considerable complexity as libraries manage each separately, while attempting to fit them into a coherent technology strategy" (Breeding, 2014, p. 22). However, with more major library systems offering APIs, "rather than considering each system as isolated and self-enclosed relative to how data flows in and out, the use of APIs opens the possibility for more dynamic interactions that are beneficial both in terms of more efficient operations behind the scenes and a more elegant presentation to patrons" (Breeding, 2014, p. 24). Breeding sees a two-fold benefit of API usage: "Libraries benefit from APIs when they are able to perform tasks that are not possible through the bundled user interfaces or that automate tasks that otherwise might be performed through manual or batch processes" (Breeding, 2014, p. 23). Improving the user experience and optimizing staff capabilities through automation are at the heart of API usage.

In his 2015 NISO white paper, Breeding goes further and indicates "there is a window of opportunity for a set of cross-vendor APIs to be defined within each of the areas of intersection among products" (p. 34). This "ecosystem of interoperable APIs might not be codified as standards, but instead as recommended practices that can be validated with compliance assessment," with defined cross-platform operations that should be available via APIs (Breeding, 2015, p. 34). Earlier, a task group proposed standards for API-based interactions between an ILS and a discovery system (Breeding, 2008, p. 18). Although library API standards have not been fully implemented, there is still a high degree of interoperability via APIs, and proposed standards such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) are now widely in use.

Although not an integrated library system, the software platform that has allowed the most customization and optimization has been ILLiad, developed by Atlas Systems and supported by OCLC. In many cases, ILLiad has been extended beyond its main purpose as a complex hub for resource sharing into a core system that libraries have integrated into many areas and departments. With the experience developed by many libraries and the IDS Project in integrating LSP components, OCLC services, ILLiad, and other software systems, the potential is great for further development within the next few years. As the IDS Project developed a community of highly talented librarians and staff, and systems matured to become more open to integration, there is now the ability to connect "mission critical" systems that will "support better, more informed decisions and free employees to undertake higher-value tasks," which will ultimately "offer the capability to unlock talent and time" (Oberlander, 2012, p. 15). The promise of freeing time and talent through improved systems was at the core of many of the software projects that have come from the IDS Project, which has resulted in staff who have more time for professional development, are more engaged with an innovative community, and can contribute more to their individual libraries and the IDS Project community.
One area of library service with a direct impact on users that has suffered from a lack of system integration and innovation in the past decade is article borrowing. Borrowing or purchasing articles for patrons is a complex process that, beyond direct requesting of articles through the IDS Project's ALIAS system, RapidILL, and OCLC's direct request for articles, has not seen significant recent automation advancements. The one advancement in borrowing article workflows has been mediated article purchasing through services such as CCC's Get It Now or Reprints Desk's Article Galaxy service.

The literature analyzing trends in ILL borrowing article requests indicates that there is high potential for saving time and reducing costs in ILL borrowing, but users still place many requests manually, which may lead to incomplete citations. In Leykam's 2013 study of four years' worth of ILL borrowing requests, he found that 59 percent of requests were manually entered rather than submitted through the library's link resolver, SFX (p. 106). Surprisingly, students manually entered ILL requests more often than faculty and did not use the direct linking capabilities of the OpenURL resolver (Leykam, 2013, p. 110). Others, such as Ashmore, Allee, and Wood, look at the accuracy and reliability of OpenURL data, with incorrect, inconsistent, or incomplete information (such as ISSN) being passed from aggregated databases or other indexes, leading to link resolver failures and incomplete information being passed to the ILL system (2015, p. 27).

Cost-effective models for delivering access to articles and journals have also been a major thread in recent literature regarding resource sharing and collections. Heather L. Brown, in comparing copyright payments, found that over $500 could have been saved in the course of a semester by purchasing articles through document providers or publishers, with purchases being filled in almost half the time of traditional ILL (2012, p. 101). Libraries, especially those in the health sciences, are exploring multiple models to deliver articles, including purchase on demand, subscriptions, and resource sharing. In the case of Loyola Health Sciences Library, a hybrid approach using CCC's Get It Now to deliver journal content saved the library an estimated $640,000 in related access costs over a two-year period (April 2014 – April 2016) (Hendler and Gudenas, 2016, p. 368). In addition, Hendler and Gudenas see the value of ILL in delivering material in a cost-effective and very quick manner (for Loyola, under 11 hours), but also acknowledge that article purchasing has its place in the merged world of access and collection management (2016, p. 369). For the users of a health sciences library, "if it is the weekend or if the user wants the article immediately, they might not elect to use ILL. Get It Now fulfills that need for immediacy and prevents customers going outside of the library to meet their needs" (Hendler & Gudenas, 2016, p. 369). For the University of California, Irvine, Imamoto and Mackinder (2016) found significant improvements in speed and relatively low costs associated with a pay-per-view pilot project with Reprints Desk.
During the pilot, 72 articles were filled; a "vast majority of articles arrived within 36 minutes of placing the order," and "the average cost per article was $34 plus a service charge of $5.85" (Imamoto and Mackinder, 2016, p. 381). As libraries move toward different models for providing access to research material, there will certainly be a need for more complex configuration of delivery options based on user need and user status.

Background on the Development of IDS Logic

Technology development within the IDS Project has always been focused on the needs of the IDS Project community and the interlibrary loan community as a whole. One of the first major IDS Project developments was the Getting It System Toolkit (GIST), which allowed staff to "easily route requests between ILL and acquisitions depending on a number of factors, such as user recommendations, the borrowing cost versus the purchase price, regional library holdings, and more" (Pitcher, Bowersox, Oberlander, & Sullivan, 2010). Another development, addressing users' need for ease of discoverability across siloed member catalogs, was the consortial catalog IDS Search, providing users with an "intuitive search experience which enables libraries to easily customize the search interface and add geographic search limits" and allowing almost instant submission of ILL requests for items held at regional libraries (Oberlander & Rivenburgh, 2012). ILLiad™ addon development has also been focused on the needs of the interlibrary loan practitioner, allowing for faster ILL request processing by connecting disparate systems and platforms to provide the information necessary for making intelligent processing decisions.

In ILLiad version 8.4, Atlas Systems introduced Server Addons into their product, a major enhancement to the ability to pull external data automatically into ILLiad. As many tasks in ILL involve consulting external systems and gathering data, Server Addons were a major opportunity for libraries to customize and tailor the ILLiad software to specific needs for optimal efficiency. Server Addons allow reading from the ILLiad database, and also writing to most of it, except for the user and lender address tables, where patron information and information about other libraries are stored. The ability to set values and insert information into ILL transactions has proven to be a powerful tool for integrating other systems with ILLiad. In addition to inserting data, the statuses of transactions can be changed and notifications such as e-mails can be sent. Lastly, external commands to other systems that ILLiad regularly interacts with, such as OCLC ILL, can be sent via Server Addons. Although Atlas released an ILLiad Web Platform API in version 8.4, the Web Platform API does not allow setting values or inserting data into ILLiad; it only provides read access. It does allow notifications to be sent, such as e-mails and SMS messages, but no further functionality currently exists. Currently, Server Addons are the best way to connect external systems to ILLiad, and they provide the best functionality for building customized and efficient workflows that greatly increase levels of automation in resource sharing. This is why the IDS Logic platform was built using the functionality provided by Server Addons.
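To make the Server Addon pattern concrete, the following is a minimal, self-contained sketch of the shape such an addon can take. The function names on the illiad table are hypothetical stand-ins, stubbed here so the sketch runs on its own; a production addon would instead call the Atlas-supplied API, whose actual names and signatures are documented by Atlas and differ from these.

    -- A minimal sketch of the Server Addon pattern described above.
    -- Everything on the `illiad` table is a hypothetical stand-in for the
    -- vendor-supplied addon API, stubbed so this file runs in plain Lua.

    local illiad = {}

    -- Stub: return transactions sitting in a named processing queue.
    function illiad.getQueue(name)
      return { { tn = 1001, issn = "0028-0836", status = name } }
    end

    -- Stub: move a transaction to another queue and record a note.
    function illiad.route(tn, queue, note)
      print(string.format("TN %d -> %s (%s)", tn, queue, note))
    end

    -- Stub: ask the IDS Logic server what to do with one transaction.
    -- In production this is an HTTP round trip to the Logic platform.
    function illiad.askLogic(t)
      if t.issn then return { action = "route", queue = "Awaiting Lender Search" }
      else return { action = "route", queue = "Staff Review" } end
    end

    -- The handler a Server Addon registers to run on a timer: inspect each
    -- waiting request, consult Logic, and act on the decision.
    local function processQueue()
      for _, t in ipairs(illiad.getQueue("Logic: Awaiting Review")) do
        local decision = illiad.askLogic(t)
        if decision.action == "route" then
          illiad.route(t.tn, decision.queue, "decided by IDS Logic")
        end
      end
    end

    processQueue()

The essential pattern (inspect each request in a queue, consult an external decision service, act on the answer) is the same one IDS Logic applies throughout the services described below.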
IDS Logic Technical Overview

ILLiad Server Addons interact with ILLiad in much the same way that addons in the ILLiad software client do, by utilizing functions that perform actions. Because client-side addons were first designed to extend ILLiad with user-facing functionality within the client software, Atlas chose the simple but powerful programming language Lua for its addons. As Lua was intended for lightweight embedded uses, it has its limitations, which is why IDS Logic takes the approach of creating Lua templates that can be rendered by PHP, allowing much more extensive, dynamic, and flexible software customizations. In addition, having a PHP-to-Lua interface allows pieces of basic data to be pulled from ILLiad and sent to the IDS Logic server, where more complex analysis and tasks, as well as interactions with APIs, can be performed with PHP or other software languages. After this analysis is performed, the appropriate actions can be sent back to ILLiad using the simpler functionality and commands available in Lua.

For example, core pieces of citation information might be pulled from ILLiad via Lua and sent to the IDS Logic server. There, PHP sends the citation data to APIs such as OCLC web services or CCC's Get It Now service, and the responses from those web services are brought back to the Logic server, where complex rules and algorithms are applied to determine the decision or data that is sent back to ILLiad via Lua. In some cases, the IDS Logic platform uses PHP to interact with several disparate APIs to gather sufficient information to automate decision making. One example is the use of five sequential APIs to take an ISBN and identify related holdings and the number of libraries owning the item. In short, Lua is the basic interface for interacting with ILLiad and the basic data within transactions, while Logic and PHP act as the more robust interface with external web services and with configurations based on library policies and processes, translating the results into actions to send to transactions via Lua.

IDS Logic as a Platform

As IDS Logic serves as the platform that integrates multiple systems, a way to streamline the maintenance of electronic holdings and license information within the resource sharing workflow was necessary. ALIAS, the Article License Information Availability Service, was one of the first functions to be integrated into IDS Logic. Taking the approach that maintaining holdings in multiple systems isn't beneficial, ALIAS harvests data about electronic holdings that libraries already maintain in knowledgebase software such as EBSCO's Full Text Finder or Serials Solutions 360 Link. ALIAS and IDS Logic accumulate this data for all 100 libraries and use it as the basic local article availability lookup and article direct request system. Thus, libraries have less maintenance to do, benefit from identifying items they already own, and send requests only to libraries that can deliver the article needed. Rather than create a separate system, ALIAS uses OCLC for sending these requests, which keeps more transactions in one familiar workflow.
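A sketch of the kind of availability lookup ALIAS performs appears below. The holdings sample and field names are invented for illustration; in production the table is populated by harvesting each member library's knowledgebase, and the resulting lender list is handed to OCLC rather than printed.

    -- Sketch of an ALIAS-style lookup: given a citation, pick the member
    -- libraries whose harvested e-holdings cover the journal and year and
    -- whose licenses permit lending. The sample holdings are invented.

    local holdings = {
      { symbol = "VXW", issn = "0028-0836", from = 1990, to = 2016, lendable = true },
      { symbol = "XBM", issn = "0028-0836", from = 2005, to = 2016, lendable = false },
      { symbol = "ZCU", issn = "0028-0836", from = 2010, to = 2016, lendable = true },
    }

    -- Return the OCLC symbols of potential lenders for one article request.
    local function potentialLenders(issn, year)
      local lenders = {}
      for _, h in ipairs(holdings) do
        if h.issn == issn and h.lendable and year >= h.from and year <= h.to then
          lenders[#lenders + 1] = h.symbol
        end
      end
      return lenders
    end

    -- A lender string like this can then be handed to OCLC Direct Request.
    print(table.concat(potentialLenders("0028-0836", 2012), ","))  --> VXW,ZCU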
To send these requests through OCLC, ALIAS and IDS Logic manage the status of OCLC libraries using OCLC's ILL Policies Directory API, so that requests go only to libraries that are "upper case," that is, active to receive requests from other libraries. Since ALIAS uses OCLC, if IDS libraries do not hold an article that is needed, which is rare, the request can be sent to a broader set of libraries using OCLC via the WorldShare ILL article direct request process.

Another major service that highlights how IDS Logic functions as a platform connecting systems is the Lending Availability Service. A major task of many resource sharing departments is looking up call numbers and availability for books or other materials requested. Essentially, this task involves three major components: looking up information in a library catalog, applying policies based on subcollections or other criteria, and then acting upon the information in ILLiad to either route the transaction (TN) for searching or cancel it. The Lending Availability Service pulls information from the citation in ILLiad and sends it to IDS Logic, which then queries the library's Z39.50 server or availability API. The Z39.50 server or availability API typically returns all the information libraries need to cancel transactions or route them for pulling from the shelves.

In some cases, IDS Logic can enhance existing software and processes by pulling further information, enabling libraries to make their processes even less time consuming. A key example is the pairing of the Borrowing Availability Service and the Direct Request Enhancer to send more book requests out automatically. Libraries can use the Borrowing Availability Service to automatically check whether they own items being requested by their patrons, which encourages libraries to let requests for items they hold but that are unavailable go out automatically, without staff first checking the catalog manually. In addition, many users will request a rare edition or a version of an OCLC record held by few libraries; these typically fail as direct requests. The Direct Request Enhancer combines multiple OCLC web services, such as xID, with the Borrowing Availability Service to find the best widely used OCLC record and title on which to send the request; and if the library owns an alternate edition that is available, the request can be turned into an in-house document delivery request.

Article Gateway

A major initiative that began in 2016, the Article Gateway (AG) workflow, demonstrates the depth of what is possible with the IDS Logic platform. The AG workflow streamlines and automates the fixing of citations, the checking of copyright clearance and compliance, and, when needed, the checking of multiple article vendors for the best price, ultimately leading to unmediated delivery for most requests. To ensure that copyright checking is as accurate as possible, all borrowing article requests must have ISSNs and fairly consistent citations. To achieve an all-ISSN process without forcing staff to open many requests, citation data is sent to the PubMed web service and to the OCLC WorldCat and xID web services to harvest ISSNs, PMIDs, and other citation information, which is then inserted back into the transaction to produce a verified and standardized citation.
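The general shape of that citation-completion step is sketched below: try each bibliographic service in turn until an ISSN is found, then write it back to the request. The lookup functions are hypothetical stubs standing in for the PubMed and WorldCat/xID calls, which in production are HTTP requests made from the Logic server.

    -- Sketch of the citation-completion step: consult services in order
    -- until one yields an ISSN. The stubs below are invented stand-ins for
    -- PubMed and WorldCat/xID lookups.

    local function pubmedLookup(t)      -- stub for a PubMed query
      if t.pmid then return { issn = "1476-4687" } end
    end

    local function worldcatLookup(t)    -- stub for a WorldCat/xID query
      if t.journalTitle then return { issn = "0028-0836" } end
    end

    -- Try each source in turn; return the first usable ISSN.
    local function harvestIssn(t)
      for _, lookup in ipairs({ pubmedLookup, worldcatLookup }) do
        local hit = lookup(t)
        if hit and hit.issn then return hit.issn end
      end
    end

    local request = { journalTitle = "Nature", journalYear = "2016" }
    request.issn = harvestIssn(request)
    print(request.issn or "route to staff for manual citation repair")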
In addition, the date, volume, issue, and other citation information are run through custom scripts that standardize them so that years and other citation data can be precisely compared. Additional queries, such as the "Rule of 5" query, which checks whether five requests from the same ISSN (or ISSN grouping for a journal) have been filled within the past year, are run to prevent the need for staff to review requests (a sketch of this check appears at the end of this section). If copyright limitations have been reached, the Article Gateway platform checks Copyright Clearance Center licensing fees and pricing from CCC's Get It Now service and the Reprints Desk Article Galaxy service. Whichever option offers the best value is then selected by Article Gateway, and the request is fulfilled with no staff intervention or delays. Whether a request is submitted with an incomplete citation, with no ISSN, or for an article where a copyright payment is needed, IDS Logic and Article Gateway work to facilitate almost instant delivery.

A strong motivation behind the development of Article Gateway was to create a method of delivering pay-per-view article access that was unmediated and highly configurable. However, when we began analyzing the ILL borrowing article workflow, we discovered many more opportunities for automation improvements that would have major effects on ILL services. In October 2016, using ILLiad data for all 100 IDS Project libraries from January 1 to October 24, 2016, we found that ILL borrowing requests remained in the ILLiad queue "Awaiting Copyright Clearance" for an average of 22.3 hours. The "Awaiting Copyright Clearance" queue is typically designated for requests that need staff review to determine whether license fees must be paid or whether the CONTU "Rule of 5" limits have been met. A total of 314,252 IDS Project borrowing article requests created between December 1, 2015 and November 30, 2016 were analyzed, with a total of 7.01 million hours (or 292,254 days) during which requests sat waiting for staff to review copyright. Clearly, copyright clearance has a major effect on ILL delivery time.

Throughout the installation and implementation of Article Gateway at member libraries, we learned that many libraries were unwilling to automate the purchasing of articles. Two major threads emerged. One is that articles available for purchase can sometimes also be acquired through open access. At other times, articles are freely available online in a form that is not the final version, such as a preprint in a repository, or can be downloaded from sites such as ResearchGate, having been posted by the author. All libraries wanted to take advantage of true open access articles, and we took this into account in development. Open access filtering is being built into Article Gateway using APIs such as PubMed, which will filter for open access links and content. However, libraries differed on whether they would point patrons to content that was not the "copy of record" or was posted on sites that violate publishers' rights. In these cases, Article Gateway is configured to allow libraries to make choices about how best to educate and engage with patrons about access, or simply to pay for and deliver the article of record from a document delivery provider or publisher.
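As promised above, the "Rule of 5" check can be expressed as a count over a library's previous transactions. The sketch below embeds the query in Lua, as a Server Addon could; the table and column names are assumptions in the general spirit of ILLiad's schema rather than its actual definitions, and db.query is a stub standing in for the addon API's database access.

    -- Sketch of the "Rule of 5" check: count filled borrowing copies,
    -- requested this calendar year, for articles published in the most
    -- recent five years, across the journal's whole ISSN grouping.
    -- Table/column names are assumptions, not ILLiad's real schema.

    local function ruleOfFiveReached(db, issns, year)
      local sql = [[
        SELECT COUNT(*) FROM Transactions
        WHERE ISSN IN (%s)
          AND RequestType = 'Article'
          AND TransactionStatus = 'Request Finished'
          AND YEAR(CreationDate) = %d
          AND JournalYear >= %d
      ]]
      local list = "'" .. table.concat(issns, "','") .. "'"
      local filled = db.query(string.format(sql, list, year, year - 4))
      return filled >= 5
    end

    -- Hypothetical stub so the sketch runs: pretend four copies were filled.
    local db = { query = function() return 4 end }
    print(ruleOfFiveReached(db, { "0028-0836", "1476-4687" }, 2016))  --> false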
Article Gateway Workflow

The first step in the Article Gateway workflow is to determine whether the article request contains standardized ISSN and publication date information. Through this process, the ISSN is standardized using the ISSN-L list to ensure that the ISSN for the request is the same for both print and electronic holdings. The ISSN-L, maintained by the International Standard Serial Number International Centre, "is a specific ISSN that groups the different media of the same serial publication" (ISSN-L). The information in the ISSN field of the request is also checked for length to separate out numbers that are either ten or thirteen digits long, denoting that the request is for a book chapter and not a journal article. Requests containing a ten- or thirteen-digit number are separated from the remainder of the Article Gateway processes and put into a queue for staff to review. Next, the journal year of the transaction is standardized so that the Journal Year field contains only a four-digit year. If the ISSN and/or Journal Year field cannot be fixed by Article Gateway's automated processes, the request is placed into a holding queue for staff to intervene and manually fix the incorrect citation information. (These standardization steps are sketched in code at the end of this subsection.)

After the ISSN and journal year have been standardized, Article Gateway checks how to acquire the article for the user. First, the user's library holdings are checked to see whether the request can be filled locally. If the article is not held by the requester's library, the Rule of 5 is applied to the request. If the article was published more than five years before the creation date of the request, it is requested from potential lending libraries. For IDS Project libraries in particular, the holdings and lending license rights for the requested article are checked in the ALIAS database to automatically create a list of potential lending member libraries, and the request is then sent out using OCLC's Direct Request service. When an article cannot be fulfilled by IDS Project member libraries, the request is processed manually, to be sent either to Custom Holdings groups or to other selected lenders.

If the requested article was published within five years of the creation date of the request, specific steps must be taken to ensure that the Rule of 5 is followed while processing the request. For these requests, Article Gateway performs a check across previous borrowing article requests fulfilled within the given calendar year to determine whether the ISSN or ISSN grouping has been requested five times or more for articles published within the last five years. If the ISSN of the request has not reached the five-article limit, the request is routed to be filled through partner libraries or other standard processing means. However, if the request has reached the five-article limit for the year, copyright fees will need to be paid if the article is borrowed from another library. This opens a unique opportunity for borrowing libraries, because they have the option not to pay borrowing copyright fees and instead to buy the article from purchase-on-demand providers such as CCC's Get It Now or Reprints Desk Article Galaxy.
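The standardization steps at the start of the workflow can be sketched as follows. The tiny ISSN-L table is a hypothetical excerpt; in production it is loaded from the ISSN International Centre's list, and the queue names are invented for illustration.

    -- Sketch of the first Article Gateway step: normalize the ISSN via an
    -- ISSN-L mapping, divert ten- and thirteen-digit numbers (ISBNs, i.e.,
    -- book chapters) to staff, and reduce the year field to four digits.

    local issnL = { ["1476-4687"] = "0028-0836" }  -- electronic -> linking ISSN

    local function standardize(t)
      local digits = t.issn and t.issn:gsub("[^%dXx]", "") or ""
      if #digits == 10 or #digits == 13 then
        return "Staff Review: Book Chapter"        -- an ISBN, not an ISSN
      elseif #digits ~= 8 then
        return "Staff Review: Bad Citation"        -- unfixable automatically
      end
      local issn = digits:sub(1, 4) .. "-" .. digits:sub(5):upper()
      t.issn = issnL[issn] or issn                 -- collapse to the ISSN-L
      t.year = t.journalYear and t.journalYear:match("%d%d%d%d")
      if not t.year then return "Staff Review: Bad Citation" end
      return "Continue: Check Holdings"
    end

    local req = { issn = "1476-4687", journalYear = "Jan. 2016" }
    print(standardize(req), req.issn, req.year)
    --> Continue: Check Holdings   0028-0836   2016

Requests that survive these checks proceed to the holdings lookup and Rule of 5 stages described above.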
When the five-article limit has been reached, the least expensive option for acquiring the article is determined by querying the web services or APIs of CCC's Get It Now and Reprints Desk Article Galaxy and comparing those prices with the copyright licensing fee. Once the price from each potential provider has been determined, a recommendation is made by Article Gateway and the article request is routed to a queue for staff purchasing or borrowing. If desired, these final steps of acquiring the article can be automated, and the article can be delivered to the user within minutes. Figure 1 provides a visual diagram of the Article Gateway workflow.

Figure 1: Article Gateway Workflow

An important point to remember when checking a library's previous transactions for adherence to the Rule of 5 is that, in order to perform this check systematically and automatically, ISSNs from previous transactions need to be standardized. In the very first installations of Article Gateway, we realized that staff often needed to go through previous article transactions manually to ensure that they contained both a standardized ISSN and a standardized year. This process was rather cumbersome, especially later in the calendar year, when the number of article requests to fix grew each day. For this reason, the Batch Year Fixer and Batch ISSN Fixer were created to minimize the number of previous transactions that staff needed to fix manually. After these functions ran, the number of transactions needing fixes was significantly reduced. Standardization of citations has been such a popular portion of the workflow that libraries have even used the batch standardization tools on old ILL data to make reporting and analysis more precise. However, even with the Batch Fixers available, it is recommended to implement Article Gateway toward the beginning of the year in order to save staff time.

The Effect of Article Gateway

Three of the major goals of the Article Gateway tools are to significantly reduce the average turnaround time for ILL articles, to increase the percentage of extremely fast deliveries (articles delivered within 4 hours), and to reduce the number of requests that staff must process manually. One of the largest users of the Article Gateway system is the University at Albany. Prior to implementing Article Gateway, the University at Albany was using a combination of ALIAS and RapidILL to automate article request processing. However, both RapidILL and ALIAS require ISSNs to process article requests, and neither is capable of automatically checking copyright, so the percentage of article requests requiring at least some manual processing was high. Furthermore, staffing levels limited manual request processing to 9:00 a.m. – 5:00 p.m. on weekdays, so article requests would often sit unprocessed for anywhere from 16 to 64 hours, preventing many requests from being filled quickly. To help resolve these issues, the University at Albany implemented Article Gateway during the summer of 2016. To determine the impact, article requests from a five-month period following implementation (June 30 – November 30) were compared to article requests from the same date range during the previous calendar year.
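Before turning to the results, note how the turnaround figures reported in the tables that follow (medians and the percentage of requests delivered under a given number of hours) can be derived from transaction creation and delivery timestamps. The sketch below shows one way to compute them; the sample data is invented.

    -- Sketch of turnaround-time reporting: compute the median and the
    -- share of requests delivered under each hour limit. The hours here
    -- are invented; in practice they come from creation and delivery
    -- timestamps extracted from the ILL database.

    local hours = { 0.4, 1.5, 3.0, 6.0, 14.0, 19.0, 26.0, 48.0 }

    local function median(t)
      local s = {}
      for i, v in ipairs(t) do s[i] = v end
      table.sort(s)
      local n = #s
      return n % 2 == 1 and s[(n + 1) / 2] or (s[n / 2] + s[n / 2 + 1]) / 2
    end

    local function percentUnder(t, limit)
      local hits = 0
      for _, v in ipairs(t) do
        if v <= limit then hits = hits + 1 end
      end
      return 100 * hits / #t
    end

    print(("median: %.1f hours"):format(median(hours)))
    for _, limit in ipairs({ 1, 2, 3, 4 }) do
      print(("under %d hour(s): %.0f%%"):format(limit, percentUnder(hours, limit)))
    end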
The comparison showed that the percentage of article requests requiring manual ISSN lookup and copyright processing dropped significantly upon implementation of Article Gateway (see Table 1).

                                                    Before Article Gateway    After Article Gateway
    Article requests filled                         12,541                    9,125
    Requests requiring manual ISSN lookup           6,070 (48%)               940 (10%)
    Requests requiring manual copyright processing  5,403 (43%)               579 (6%)
    Turnaround time                                 median: 19 hours          median: 14 hours
                                                    mean: 32.2 hours          mean: 27.6 hours

Table 1: Effect of Article Gateway on manual ISSN lookup - University at Albany

By automating the ISSN lookup and copyright processing, Article Gateway significantly increased the percentage of article requests that could be processed automatically by ALIAS and RapidILL, which in turn led to faster turnaround times for article requests (see Table 2).

                                                    Before Article Gateway    After Article Gateway
    Requests requiring manual processing            9,116 (73%)               1,522 (17%)
    Requests processed automatically by
    ALIAS and RapidILL                              3,425 (27%)               7,603 (83%)
    Turnaround time                                 median: 19 hours          median: 14 hours
                                                    mean: 32.2 hours          mean: 27.6 hours

Table 2: Effect of Article Gateway on automatic article requesting - University at Albany

A closer examination of article request turnaround times before and after the implementation of Article Gateway shows a dramatic increase in the number of articles delivered in four hours or less (see Table 3).

    Delivery time     % of total requests before    % of total requests after
    Under 1 hour      5                             13
    Under 2 hours     11                            20
    Under 3 hours     16                            26
    Under 4 hours     20                            30

Table 3: Effect of Article Gateway on article turnaround time - University at Albany

Although SUNY Polytechnic Institute is a much smaller institution than the University at Albany, the effect of Article Gateway on article delivery was equally significant in improving service for a large percentage of the 2,852 article requests placed from June 1, 2015 to December 1, 2016. Table 4 below shows the percentage of requests at SUNY Polytechnic Institute that were delivered in 24 hours or less, indicating that Article Gateway, by removing obstacles in the ILL borrowing article workflow, can make ILL a "near instant" option. Of the 2,852 requests, only 50 were purchased from document delivery providers (Reprints Desk or Get It Now), and none of those 50 were delivered in less than 2 hours, as SUNY Polytechnic Institute only recently turned on automatic ordering of articles through document providers. With copyright clearance removed, the percentage of requests delivered within 2 - 4 hours through library-to-library lending can be significantly increased, and with automatic document ordering through document providers, the percentage of requests delivered almost instantly will increase even further.
    Delivery time           Number of requests    % of total requests
    Under 20 minutes        117                   4
    30 minutes and under    223                   8
    60 minutes and under    422                   15
    2 hours and under       640                   22
    4 hours and under       826                   29
    9 hours and under       990                   35
    15 hours and under      1,296                 45
    16 hours and under      1,560                 55
    24 hours and under      1,956                 69

Table 4: Effect of Article Gateway on article requests - SUNY Polytechnic Institute

Through the case studies at SUNY Polytechnic Institute and the University at Albany, the effect of Article Gateway in making ILL a nearly instant option that requires significantly less staff time is clear. With more configuration, and by enabling Article Gateway to trigger purchases during certain days or times, ILL can become an increasingly meaningful part of the resource access and subscription landscape.

Conclusion

Based on the evidence provided by the case studies of the University at Albany and SUNY Polytechnic Institute, Article Gateway has made significant advances in both the workflow and the turnaround time of delivering article requests to users. By having Article Gateway manage the increasingly complex borrowing article workflow, interlibrary loan departments can both save financial resources and use the time gained to provide new patron services. Over time, users at these libraries should grow to expect and receive faster access, and will become comfortable relying on interlibrary loan departments when their requests are urgent. The improvements in delivering articles to patrons with minimal delays would not be possible without the complex middleware platform, IDS Logic, to integrate systems and vendor web services. The IDS Article Gateway is an example of how dedication to back-end integration and efficiency can result in improvements in patron services.

References

Ashmore, B., Allee, E., & Wood, R. (2015). Identifying and troubleshooting link-resolution issues with ILL data. Serials Review, 41(1), 23-29. doi:10.1080/00987913.2014.1001506

Breeding, M. (2008). Progress on the DLF ILS discovery interface API: The Berkeley accord. Information Standards Quarterly, 20(3), 18-19. Retrieved from http://www.niso.org/apps/group_public/download.php/5637/ISQv20no3.pdf

Breeding, M. (2014). APIs unify library services. Computers in Libraries, 34(3), 22-24.

Breeding, M., & National Information Standards Organization (U.S.). (2015). The future of library resource discovery: A white paper commissioned by the NISO Discovery to Delivery (D2D) Topic Committee. Retrieved from http://www.niso.org/apps/group_public/download.php/14487/future_library_resource_discovery.pdf

Brown, H. L. (2012). Pay-per-view in interlibrary loan: A case study. Journal of the Medical Library Association, 100(2), 98-103. doi:10.3163/1536-5050.100.2.007

Hendler, G. Y., & Gudenas, J. (2016). Developing collections with Get It Now: A pilot project for a hybrid collection. Medical Reference Services Quarterly, 35(4), 363-371. doi:10.1080/02763869.2016.1220751

Imamoto, B., & Mackinder, L. (2016). Neither beg, borrow, nor steal: Purchasing interlibrary loan requests at an academic library. Technical Services Quarterly, 33(4), 371-385. doi:10.1080/07317131.2016.1203642

ISSN-L. (n.d.). International Standard Serial Number International Centre.
Retrieved from http://www.issn.org/understanding-the-issn/assignment-rules/the-issn-l-for-publications-on-multiple-media/

Johnston, S. (2015). Homegrown WorldCat reclamation: Utilizing OCLC's WorldCat Metadata API to reconcile your library's holdings. Code4lib Journal, 27, 1. Retrieved from http://journal.code4lib.org/articles/10328

Leykam, A. (2013). The road not taken: SFX, ILLiad, and interlibrary loan submission methods. Journal of Interlibrary Loan, Document Delivery & Electronic Reserves, 23(2), 97-121.

Oberlander, C. (2012). Why mission critical systems are critical to the future of academic libraries. Computers in Libraries, 32(8), 14-18.

Oberlander, C., & Rivenburgh, E. (2012). The IDS Project: Promoting library excellence through community and technology. Interlending & Document Supply, 40(2), 76-80. doi:10.1108/02641611211239533

Page, J. R., & Kuehn, J. (2009). Interlibrary service requests for locally and electronically available items: Patterns of use, users, and canceled requests. Portal: Libraries & The Academy, 9(4), 475-489.

Phillips, M., & Tarver, H. (2014). Enhancing descriptive metadata records with freely-available APIs. Code4lib Journal, 24, 3. Retrieved from http://journal.code4lib.org/articles/9415

Reese, T. (2014). Opening the door: A first look at the OCLC WorldCat Metadata API. Code4lib Journal, 25, 9. Retrieved from http://journal.code4lib.org/articles/9863

Rodríguez-Gairín, J., & Somoza-Fernández, M. (2014). Web services to link interlibrary software with OCLC WorldShare. Library Hi Tech, 32(3), 483-494.

Sharpe, J., & Gallagher, P. (2011). Developing a Web API for interlibrary loan copyright payments. Journal of Interlibrary Loan, Document Delivery & Electronic Reserves, 21(3), 133-139. doi:10.1080/1072303X.2011.585099

Sullivan, M., Jones, W., Little, M., Pritting, S., Sisak, C., Traub, A., & Zajkowski, M. (2013). IDS Project: Community and innovation. In A. Woodsworth & D. Penniman (Eds.), Advances in Librarianship, 36 (pp. 281-311). United Kingdom: Emerald Group Publishing Limited. doi:10.1108/S0065-2830(2013)0000036013

work_ulr4dpr6kzhobkdrxz5bk53rau ----

Border Crossings: Reflections on a Decade of Metadata Consensus Building

D-Lib Magazine, July/August 2005, Volume 11 Number 7/8, ISSN 1082-9873

Stuart L. Weibel
Senior Research Scientist
OCLC Research

In June of this year, I performed my final official duties as part of the Dublin Core Metadata Initiative management team. It is a happy irony to affix a seal on that service in this journal, as both D-Lib Magazine and the Dublin Core celebrate their tenth anniversaries. This essay is a personal reflection on some of the achievements and lessons of that decade.
The OCLC-NCSA Metadata Workshop took place in March of 1995, and as we tried to understand what it meant and who would care, D-Lib magazine came into being and offered a natural venue for sharing our work [16]. I recall a certain skepticism when Bill Arms said "We want D-Lib to be the first place people look for the latest developments in digital library research." These were the early days in the evolution of electronic publishing, and the goal was ambitious. By any measure, a decade of high-quality electronic publishing is an auspicious accomplishment, and D-Lib (and its host, CNRI) deserve congratulations for having achieved their goal. I am grateful to have been a contributor. That first DC workshop led to further workshops, a community, a variety of standards in several countries, an ISO standard, a conference series, and an international consortium. Looking back on this evolution is both satisfying and wistful. While I am pleased that the achievements are substantial, the unmet challenges also provide a rich till in which to cultivate insights on the development of digital infrastructure. The Achievements When we started down the metadata garden path, the term itself was new to most. The known Web was less than a million pages, people tried to bribe their way into sold-out Web conferences, and the term 'search engine' was as yet unfamiliar outside of research labs. The OCLC-NCSA Metadata Workshop brought practitioners and theoreticians together to identify approaches to improve discovery. In two and a half days, an eclectic Gang of 52 (we affectionately described ourselves as 'geeks, freaks, and people with sensible shoes') brought forward a core element set upon which many resource description efforts have since been based. The goal was simple, modular, extensible metadata – a starting place for more elaborate description schemes. From the thirteen original elements we grew to a core of fifteen, and later elaborated the means for refining those rough categories. In recent years much work has been done on the modular and extensible aspects, as application profiles have emerged to bring together terms from separate vocabularies [9]. A Consensus Community The workshop series coalesced as a community of people from many countries and many domains, drawn by the appeal of a simple metadata standard. Openness was the Prime Directive, and early progress was often marked by the contentious debate of consensus building. But our belief that value would emerge from many voices informed our deliberations, and still does. Not without difficulty: in one early meeting, participants spent an hour of scarce plenary time talking about Type before realizing that the librarians and the computer scientists had been talking about completely different concepts. Crossing borders is often difficult. This open, inclusive approach to problem solving helped the Dublin Core community to frame the metadata conversation for the past decade. The Dublin Core brand has been for some years the first link returned for the Google search term "metadata", and for a time, it outranked all other results for the search "Dublin" (as of this writing, it is #6). With only moderate irony, we might say "I feel lucky!" Process As a workshop series evolved into a set of standards and a community, the need for rules and governance evolved as well. DCMI developed a process for evaluating proposed changes and bringing them into conformance with the overall standard [5]. 
The DCMI Usage Board is comprised of knowledgeable, experienced metadata experts from five countries who exercise editorial guidance over the evolution of DCMI terms and their conformance with the DCMI Abstract Model [13]. This model itself is among the most important of the achievements of the Initiative, representing as it does the convergence of theory and practice over a decade of vigorous debate and practical implementation. It emerged from early intuition and experience, informed by an evolving sense of grammatical structure [2,6] and further refined by a long co-evolution with the W3C's Resource Description Framework (RDF) and the Semantic Web. At a higher level, DCMI has a Board of Trustees [1], who oversee operations and do strategic planning, and an Affiliate Program and governance structure that distributes the cost of the initiative and assures that the needs of stakeholders are accommodated [3]. At the time of this writing, there are four national DCMI Affiliates and several more in discussion. Internationalization The global nature of the Web demands commitment to internationalization. The difficulties of achieving system interoperability in multiple languages are immense, and still only partially solved (anyone used IRIs recently?). Nonetheless, DCMI has succeeded in attracting translations of its basic terms in 25 languages and offers a multilingual registry infrastructure of global reach [14]. The venues for the workshops and conferences have been chosen to make the Initiative accessible to people in as many places as possible. Workshops and conferences are held in the Americas, Europe, and Austral-Asia on a rotating basis, and Dublin Core principals have given talks on every continent save Antarctica. This policy of international inclusion has been a philosophic mainstay for the Initiative, attracting long-term participation from around the world. Where we were confused Confusions and unmet challenges are both interesting and instructive. A few of these are historical curiosities, and interesting mostly as a source of wry humility. Others represent unsolved dilemmas that remain prominent challenges for the metadata world in general. Author-created Metadata The idea of user-created metadata is seductive. Creating metadata early in the life cycle of an information asset makes sense, and who should know the content better than its creator? Creators also have the incentive of their work being more easily found – who wouldn't want to spend an extra few minutes with so much already invested? The answer is that almost nobody will spend the time, and probably the majority of those who do are in the business of creating metadata-spam. Creating good quality metadata is challenging, and users are unlikely to have the knowledge or patience to do it very well, let alone fit it into an appropriate context with related resources. Our expectations to the contrary seem touchingly naïve in retrospect. The challenge of creating cost-effective metadata remains prominent. As Erik Duval pointed out in his DC-2004 keynote, 'Librarians don't scale' [7]. We need automated (or at least, hybrid) means for creating metadata that is both useful and inexpensive. What is metadata for? Another naïve assumption was that metadata would be the primary key to discovery on the Web. While one may quibble about the effectiveness of unstructured search for some purposes, it is the dominant idiom of discovery for Web resources, and may be expected to remain so. What then, is metadata for? 
There are many answers to this question, though given the high stakes in the search domain, expect these answers to shift and weave for the foreseeable future. Searching the so-called 'dark web' remains a function of gated access, and metadata is a central feature of such access. One might simply say – harvest and index. OCLC's exposure of WorldCat assets in search engines such as Google and Yahoo is exemplary of this approach [11]. Indexed metadata terms connect users to the location of the physical assets via holdings records, but it is reasonable to ask... would simple, full-text indexing of these assets be better still? We may argue the fine points today but in the future, we'll know the answer, for the day of digitization is fast upon us.

Structured metadata remains important in organizing and managing intellectual assets. The Canadian Government's approach to managing electronic information illustrates this strategy [4]. Metadata becomes the linkage relating content, legislative mandates, reporting requirements, intended audience, and various other management functions. One does not achieve this sort of functionality with unstructured full text.

The International Press Telecommunications Council is exploring embedding Dublin Core in their new generation of news standards [17]. No domain is more digitally now than this one. If you want to know the value of structured metadata, look to the requirements and business cases in such communities [10]. Similarly, in the management of intellectual property rights, well-structured data is essential, and as these requirements become ubiquitous, the creation and management of metadata will be central to the story.

Metadata for images is a critical use. Association of images with text makes them discoverable. When the asset is a stand-alone image, metadata is the primary avenue by which it can be accessed. Picture Australia is an early and enduring (and widely copied) model in this area, showing how a photo archive can become a primary cultural heritage asset through the addition of systematic search tools and Web accessibility [12].

There is much talk of taxonomies, their strengths, and deficiencies these days, and in fact the emergence of 'folksonomies' hints at a sea change in the use of vocabularies to improve organization and discovery [9]. The Dublin Core community has struggled with the role of controlled vocabularies, how to declare and use them, and how important (or impotent?) they might be. The notion that uncontrolled vocabularies – community-based, emergent vocabularies – might play an important role in aggregation and discovery occasions a certain discomfort for those schooled in formal information management. Whether it is just the latest fad, or an important emerging trend, remains to be seen.

A Major Unmet Challenge

Entropy is an arrow. In the absence of constant care and fussing, our greatest successes break down. Failures, however, remain potent without much attention, retaining their power to impede. One of the yet-unsolved problems in the metadata community is the railroad gage dilemma. The first editor of D-Lib, Amy Friedlander, introduced me to the notion of train gages as metaphor for interoperability challenges [8]. Last year I rode that metaphor from Beijing to Ulan Bator, Mongolia. A cursory knowledge of Asian history reminds us that relations between Mongolia and China have been less-than-cordial from time to time, and this history remains manifest at the Gobi border crossing today.
In the dark of night, the Beijing train of the Trans-Siberian Railway pulls into a longhouse of clanking and switching as the entire train is raised on hydraulic jacks. Chinese bogeys (wheel carriages) are rolled out, and Mongolian bogeys of a different gage are rolled in. Border guards with comically high hats (and un-comical sidearms) work their way through the train cars in the manner of border guards everywhere. After a couple of hours, the train is rolling through the Gobi anew. It is a fascinating display of technological diplomacy – a kind of Maginot line that helps those on both sides of the border sleep better. These images belong to a Bogart movie or a Clancy novel, but their abstraction pervades the metadata arena.

Stacked bogeys, ready to be rolled into use. Photo by Stuart Weibel.

A railroad car raised on one of dozens of hydraulic jacks that raise an entire train at once for the exchange of bogeys. Photo by Stuart Weibel.

We load our metadata into structures in one domain and when we cross borders we unload it, repackage it, massage it to something slightly different, and suffer a measure of broken semantics in the bargain. We're running on different gages of track, manifested in different data models, slightly divergent semantics, and driven by related, but meandering, often poorly-understood functional requirements. Crosswalks are the hydraulic jacks – quieter, but no more efficient than the clanking and grinding in the train longhouse.

Metadata standards specify the means to make (mostly) straightforward assertions about resources. Many of these assertions are as simple as attribute-value pairs. Others are more complex, involving dependencies or hierarchies. None are so complicated that they cannot be accommodated within a common formal model. Yet we do not have such a model in place. Why?

NIH (Not Invented Here) Syndrome is often blamed for disparities that emerge in solutions from separate domains targeted at similar problems. Certainly our propensity to like our own ideas better than those of others plays a role, but my view is that it is not such a large role. Developments take place in parallel. It is unusual to have the luxury of waiting to see how another group is approaching a particular problem before tackling it yourself. It is quite hard enough to know what is happening in one's own community, let alone to follow related developments in others, whose differences in terminology obscure what we need to know.

The functional requirements of various metadata standards are often ambiguous and always focused slightly differently. DCMI focuses on simple, extensible, high-level metadata. IEEE LOM (Learning Object Metadata) also concerns itself with discovery metadata, but focuses more strongly on educational process descriptors. MPEG is about media, where technical image metadata is central, and intellectual property rights management is crucial. MODS is grounded firmly in the legacies of MARC (and the world's largest installed base of resource discovery systems).

The cost of collaboration – in intellectual as well as financial terms – is high. People have to know and trust one another, which generally requires face-to-face engagement: transporting ourselves and our ideas to other time zones, surviving frequent-flyer flus, finding the means to support travel costs, and missing our children's baseball games. The problems are more complicated than we imagine at the outset.
The recent approval of the Dublin Core Abstract Model by DCMI is the culmination of a journey that began almost at the outset of the Initiative. Early attempts, under the guise of the DC Data Model Working Group, rank among my most contentious professional experiences. To borrow from the oldest joke of the Dismal Profession, put all the data modelers in the world end to end, and you won't reach a conclusion (we did, but it took ten years to manage it). The idea of achieving similar consensus across communities with their own legacies of such conflict is daunting in the extreme, though recent discussions on this topic with colleagues in another metadata community remind me that hopefulness and optimism are as much a part of our domain as contention [18].

Collaboration and consensus in the digital environment

The Web demands an international, multicultural approach to standards and infrastructure. The costs in time and treasure are substantial, and the results are uncertain. Paying for collaboration that spans national boundaries, language barriers, and the often-divergent interests of different domains is a major part of these challenges. Doing this while sustaining forward progress and attracting a suitable mix of contributors, reviewers, implementers, and practitioners is particularly difficult.

A recent presentation by Google's Adam Bosworth, referenced in the Blandiose blog [15], makes for provocative reading for those debating the costs and benefits of heavy-weight versus light-weight standards. The tension between these approaches sharpens designers and practitioners (and especially, entrepreneurs), to the eventual benefit of users. Any standards activity ignores this balancing act at its peril. As we try to foment change and react to it at once, we are like Escher's Hands – designing the future as it, in turn, designs us... except that there are often implements other than pencils in those hands.

Ever try explaining what you do for a living to your mother? In the Internet standards arena, conveying an appropriate balance of glee, terror, satisfaction, frustration, and pure wonder is no easy task. I just tell her I'm not a real librarian, but I play one on the Internet. It seems enough.

Acknowledgements

I wish to acknowledge my personal debt to uncountable colleagues in the Dublin Core community, and my deep sense of gratitude for the opportunity to have played the role I have. The patience, forbearance, and generosity of OCLC management in supporting my efforts and DCMI in general have been singular and essential. Thomas Baker reviewed and improved this manuscript with several insightful suggestions. Amy Friedlander and Bonnie Wilson, successive editors of D-Lib, have made me look better than I am in these pages for 10 years. Congratulations to them and to all who have helped make this journal (and its authors) what they are.

References and Notes

[1] About the Initiative. DCMI Website, accessed June 23, 2005.

[2] Baker, Thomas. "A Grammar of Dublin Core." D-Lib Magazine, October 2000, Volume 6 Number 10.

[3] DCMI Affiliate Program. DCMI Website, accessed June 23, 2005.

[4] Committee of Federal Metadata Experts, Metadata Action Team, Council of Federal Libraries. Government of Canada Metadata Implementation Guide For Web Resources, 3rd edition, July 2004.

[5] DCMI Usage Board. DCMI Usage Board Mission and Principles. DCMI Website, June 11, 2003.

[6] DCMI Usage Board. DCMI Grammatical Principles. DCMI Website, 2003-11-18.
[7] Duval, Erik and Wayne Hodgins. "Making metadata go away: Hiding everything but the benefits." Keynote address at DC-2004, Shanghai, China, October 2004.

[8] Friedlander, Amy. Emerging Infrastructure: The Growth of Railroads. Infrastructure History Series, CNRI, 1995.

[9] Mathes, Adam. Folksonomies - Cooperative Classification and Communication Through Shared Metadata. Computer Mediated Communication - LIS590CMC, Graduate School of Library and Information Science, University of Illinois Urbana-Champaign, December 2004.

[10] News Architecture Version 1.0 Metadata Framework Business Requirements. IPTC Standards Draft, 2005.

[11] Open WorldCat Program. OCLC Website, accessed June 23, 2005.

[12] Picture Australia. Hosted by the National Library of Australia. Website accessed June 23, 2005.

[13] Powell, Andy; Mikael Nilsson, Ambjörn Naeve, and Pete Johnston. DCMI Abstract Model. DCMI Website, 2005-03-07.

[14] Wagner, Harry and Stuart Weibel. "The Dublin Core Metadata Registry: Requirements, Implementation, and Experience." Journal of Digital Information, accepted for publication, May 2005.

[15] "Web of Data." Blandiose blog, 2005-04-21.

[16] Weibel, Stuart. Metadata: the Foundations of Resource Discovery. D-Lib Magazine, July 1995, Volume 1, Number 1. doi:10.1045/july95-weibel

[17] Wolf, Misha. DC in XHTML2. Semantic Web and DC-General mailing lists, June 7, 2005.

[18] The author has been party to discussions with Erik Duval and Wayne Hodgins of the IEEE LOM effort centered around the possibility of cross-standard data modeling that might promote convergence among various metadata activities. The means and methods for carrying such work forward are presently undetermined.

Copyright © 2005 OCLC Online Computer Library Center, Inc.
doi:10.1045/july2005-weibel

work_ulritpmhmzaftnb36xb7mhf3ay ----

"ILLiad: Customer-Focused Interlibrary Loan Automation" by Harry M. Kriz, M. Jason Glover, & Kevin C. Ford. Journal of Interlibrary Loan, Document Delivery & Information Supply, Vol. 8, No. 4, pp. 31-47 (1998).

This PDF file contains the text of that article as it was finally submitted for publication on November 17, 1997. Only minor formatting changes distinguish the published version from this preprint version.

Accepted by Journal of Interlibrary Loan, Document Delivery, and Information Supply. Final Revision: 11/17/97.

ILLiad: Customer-focused Interlibrary Loan Automation

By Harry M. Kriz, M. Jason Glover, and Kevin C. Ford
University Libraries
Virginia Polytechnic Institute & State University

ABSTRACT. ILLiad is an interlibrary loan borrowing system designed and implemented in the University Libraries at Virginia Tech. ILLiad models the ILL borrowing process so as to track the status of an ILL request as it is processed, either by the staff or by software. This process approach to automating interlibrary loan is leading to fundamental improvements in ILL management and service. It allows continued expansion and modification of the system, including the addition of electronic delivery of articles, and it results in substantial improvements in customer service by allowing customers to intervene directly in the borrowing process without staff assistance.

WHAT IS ILLIAD?
ILLiad is a model, implemented in software, of the interlibrary borrowing process at Virginia Polytechnic Institute & State University (Virginia Tech). The name "ILLiad" can be thought of as an acronym for InterLibrary Loan Internet Accessible Database. However, ILLiad is much more than a database that stores information about interlibrary loan requests. As a model of the entire borrowing process, ILLiad has many implications for service and future expandability.

ILLiad is the software that examines the current state of each ILL borrowing request. Depending on the state of the request, ILLiad may perform an action, such as sending an overdue notice to the customer if the book is overdue. Alternatively, ILLiad may inform the ILL staff of the need to perform an action, such as searching OCLC for possible lenders. Performance of an action usually alters the state of a request, thereby preparing the request for another processing step. Because ILLiad is a model of the complete borrowing process, the way remains open to replace staff actions with ILLiad actions by implementing additional algorithms in software. For instance, it will be possible to automate the process of verifying that a photocopy request falls within fair use guidelines.

In this paper, we describe the reasons for developing ILLiad, the goals established for the system, and the ways in which ILL customers interact with ILLiad. Then we describe how ILLiad is used by the staff to conduct interlibrary borrowing. The reader may view the public face of ILLiad on the World Wide Web at http://www.ill.vt.edu/.

WHY AUTOMATE INTERLIBRARY LOAN?

Interlibrary loan was once thought of, if it was thought of at all, as a minor part of reference service. ILL was a small, back-office operation that was considered a nice, but not particularly important, supplement to local collections. This view is changing rapidly, especially in large research and public libraries. Decades of inflation in journal and book prices have greatly reduced the purchasing power of libraries at a time when the quantity of published information, and the range of information needed to support multi-disciplinary research and education, has increased greatly. Today, interlibrary loan is a fundamental adjunct of collection development and reference service. ILL is now vital to the success of a library's clientele.

A new perspective on interlibrary borrowing at Virginia Tech emerged when ILL activity was considered in relation to activity levels in acquisitions and in circulation. For example, during the 1996-1997 fiscal year, the borrowing unit of the Interlibrary Loan Department searched and ordered 20,716 items, an increase of nearly 8% from the previous year. During that same time, the Collection Development and Acquisitions Departments searched and ordered only 20,540 items, a decrease of nearly 11% from the previous year. Thus, interlibrary loan ordering already exceeds that of traditional collection development and acquisitions functions, and the disparity should increase. We should also consider that items obtained through ILL are most likely going to be used, in contrast to the well-documented fact that many items purchased in large libraries remain unused after many years on the shelves (1, 2).

In addition to its acquisitions function, ILL circulates returnable materials borrowed from other libraries. During the 1996-1997 fiscal year, the Interlibrary Loan Department circulated 6,163 items.
This is nearly as many items as were circulated by the Geology branch library, and almost twice the circulation in the Veterinary Medicine branch library. Thus, the borrowing unit of the Interlibrary Loan Department may be viewed as a branch library in its own right, but one whose collection is shelved in widely dispersed locations.

The magnitude of interlibrary loan borrowing, coupled with the expectation of increased growth in borrowing, led to the conclusion that moving from a manual, paper-based system was worth exploring. Our belief that interlibrary loan is a major activity worthy of automating has been justified by the high level of use of ILLiad during its first four months of operation. During that time, fully one-fourth of the Virginia Tech faculty and about one-third of the graduate students made use of ILLiad. This level of use occurred at the end of the academic year and during a slow summer session. Almost nothing that occurs on campus affects such a large proportion of faculty and graduate students so quickly, unless it is a change in parking regulations.

GOALS FOR ILLIAD

Upon recognizing that interlibrary loan had a level of activity and a public service function that justified automation, the Interlibrary Loan Department developed general goals for the system, including:

• Customer identification through registration, to assure successful delivery of materials, to eliminate repetitive input of customer information, and to prevent unauthorized use.
• Customer-initiated requests submitted through online forms.
• Customer interaction to allow altering a request after submission, renewing a request, and tracking the progress of a request.
• Tracking and reporting of requests at every stage.
• Elimination of all paper records and manual record keeping.
• Statistical reports that could be shared with customers.

HOW WAS ILLIAD BUILT?

Virginia Tech's ILL process was first modeled as a flowchart. That flowchart, which is still evolving, today contains more than 490 individual steps and decisions. A complete printout in a readable font size would be about 24 feet long and 4 feet high. Within the chart, a group of steps and decisions represents a sub-process, such as clearing article requests for copyright fair use compliance or sending overdue notices. These sub-processes were implemented in software, the sum total of which constitutes ILLiad. The choice of which sub-process is to affect a particular request at any one time depends on the "state" of the request. Information about the customers, the requests, the lending libraries, and the status of a request is stored in a database named ILLData.

Designing the database

Designing ILLData was perhaps the simplest part of building ILLiad. The required data fields were already known from the content of the paper ILL request cards that had been in use for years, and from the content of various other paper files in the department. The content of these existing paper files was translated into several database tables as described below:

Transactions - This table contains the bibliographic information about the request, the OCLC lending string, the OCLC symbol of the library that supplied the item, and the UserName of the requester. The key to this table is the TransactionNumber assigned by ILLiad to each request as it is submitted by the customer.

Users - This table contains information about registered customers. Besides the usual names, addresses, phone numbers, and e-mail addresses, the Users table contains the customers' preferred method of delivery. It is linked to Transactions by the UserName field.

LenderAddresses - This table contains information about lending libraries. It is linked to the Transactions table by the OCLC symbol for the lending library.

Tracking - This table logs every change in the status of a transaction. It includes the date and time the status was changed and which staff member or customer changed the status. It is linked to Transactions by the TransactionNumber.

Invoices - This table stores any payment information needed by the lending institution. It is linked to the Transactions table by the TransactionNumber.

ILLiad uses additional tables to manage its internal operations. These include Increment (where the TransactionNumber is created), Inventory (where data is stored from a hand-held scanner during the inventory process), and OCLCUpdates (where ILL numbers are stored awaiting an update, along with their status of received or returned).
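The article does not publish ILLData's full schema, but the table descriptions above are concrete enough to sketch. The following is a minimal, hypothetical reconstruction using Python's standard-library sqlite3 module (the production system ran on Microsoft SQL Server, not SQLite): column names such as TransactionNumber, UserName, TransactionStatus, RequestType, and DueDate come from the article, while the data types and the remaining columns are assumptions added for illustration.

```python
import sqlite3

# Hypothetical sketch of the core ILLData tables. Names mentioned in the
# article are kept; types and any columns not named there are assumptions.
SCHEMA = """
CREATE TABLE Users (
    UserName        TEXT PRIMARY KEY,
    FullName        TEXT,
    EMailAddress    TEXT,
    DeliveryMethod  TEXT                           -- preferred delivery method
);
CREATE TABLE Transactions (
    TransactionNumber  INTEGER PRIMARY KEY,        -- assigned at submission
    UserName           TEXT REFERENCES Users(UserName),
    RequestType        TEXT,                       -- 'Loan' or 'Photocopy' (assumed values)
    TransactionStatus  TEXT,                       -- e.g., 'Submitted By Customer'
    Title              TEXT,
    LendingString      TEXT,                       -- OCLC lending string
    LenderSymbol       TEXT,                       -- OCLC symbol of supplying library
    RequestDate        TEXT,                       -- ISO date; assumed column
    DueDate            TEXT                        -- ISO date; loans only
);
CREATE TABLE Tracking (
    TransactionNumber  INTEGER REFERENCES Transactions(TransactionNumber),
    ChangedDate        TEXT,                       -- date and time of the change
    ChangedBy          TEXT,                       -- staff member or customer
    NewStatus          TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

The later sketches in this article reuse this in-memory database and its assumed column names.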
Implementing the ILL process in software

Choosing the software environment in which to implement ILLiad was perhaps the most critical decision. Choices usually followed logically from conditions at Virginia Tech. The campus is heavily computerized and networked. Ethernet is available in many dorms and in some nearby apartment complexes. The dominant operating system is Microsoft Windows, though significant Mac and UNIX populations exist. To accommodate customers on all platforms, we developed a Web-based user interface for customers. This interface was a logical extension of the simple Web forms that were already in use, and which already accounted for more than 90% of all ILL borrowing requests by the time ILLiad was put into production.

We chose Microsoft SQL Server as the relational database engine for ILLiad because it is well integrated into the Windows NT Server environment used to network the department's PCs. To minimize programming, we chose Borland's Delphi as the development tool. ILLiad's live statistical reports displayed on the Web are generated through the Active Server Pages feature of Microsoft Internet Information Server version 3.0, which is the Web server included with NT Server.

ILLiad is a software model of the ILL request process. The progress of a request as it is processed is described by its "Status." A complete explanation of the 25 statuses used by ILLiad is available at the ILLiad Web site. However, the Status of a request is not a complete description of the "state" of the request. ILLiad determines the state of a request from its Status combined with other data. For example, a request is in an overdue state if three conditions are met: (1) the RequestType is a loan, (2) the TransactionStatus is "Checked Out to Customer," and (3) the current date is later than the DueDate.
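Expressed against the sketch schema above, the overdue test is a simple conjunction of those three conditions. This is illustrative code, not ILLiad's own; the production system would have evaluated the equivalent query against SQL Server.

```python
import datetime

# The three overdue conditions from the text, as one query over the sketch schema.
OVERDUE_SQL = """
SELECT TransactionNumber
FROM Transactions
WHERE RequestType = 'Loan'
  AND TransactionStatus = 'Checked Out to Customer'
  AND DueDate < ?
"""

def overdue_transactions(conn):
    """Return transaction numbers of requests currently in the overdue state."""
    today = datetime.date.today().isoformat()
    return [row[0] for row in conn.execute(OVERDUE_SQL, (today,))]
```

A nightly job could feed this list to the overdue-notice sub-process described earlier.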
A custom-written ILLiad client is used by the staff to process requests. The ILLiad client imports and exports OCLC information in a process facilitated by Passport for Windows macros. Other software components in ILLiad include Microsoft Word for Windows and Adobe Acrobat Exchange.

System Requirements

ILLiad requires the following computer configurations:

Customer workstations
• Any computer capable of running a Web browser that understands frames, tables, and forms. Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0 and higher versions are examples of such browsers. Screen resolution of 800x600 pixels or higher is recommended for ease of use of the frames-based Web pages, but it is possible to work at lower resolutions. For those with small or low-resolution monitors, the customer-input pages are available without frames.

Database server
• Microsoft Windows NT Server 4.0; 64 MB of RAM recommended.
• Microsoft SQL Server 6.5 or greater, or another ODBC-compatible SQL database. The database is standard ANSI SQL compatible; any industry-standard SQL reporting tools will work with it.

Web server
• Microsoft Windows NT Server 4.0; 64 MB of RAM recommended.
• Microsoft Internet Information Server version 3.0 or greater. This is a free component of Windows NT Server 4.0.

Staff workstations
• Windows 95 or Windows NT client workstations.
• 15" monitors with resolution of 1024x768 pixels (17" monitors recommended).
• FAX modem, in addition to an Internet connection, to transmit ALA request forms.
• Passport for Windows (for OCLC access).
• Web browser.
• For electronic delivery, the Ariel receiving workstation also should have this configuration, along with a copy of Adobe Acrobat Exchange.

Other software
• Microsoft Word for Windows (or any other ODBC-compliant application suitable for formatting and printing book labels and checkout slips from data generated by the ILLiad client).
• Microsoft FrontPage 97 for maintaining Web page content.

WHAT THE CUSTOMER SEES AND DOES

Customer procedures

It is not possible in this article to illustrate the rich and colorful Web interface that is the public face of ILLiad. Readers should point their Web browsers at the Virginia Tech Interlibrary Loan page at http://www.ill.vt.edu to see ILLiad.

A customer accesses ILLiad using a Web browser. The first-time user registers for interlibrary loan service by filling out a form on a Web page. The customer supplies his name, e-mail address, campus mailing address, status and department at Virginia Tech, Virginia Tech ID number, and preferred method of delivery for photocopies, including a preference for electronic delivery of articles. The customer also specifies a username and a password of his choosing. This is the only time the customer has to supply this personal information. When submitting a borrowing request, the customer logs on to ILLiad with his username and password. His personal information is associated automatically with his requests.

When a registered customer logs on to ILLiad, he sees the ILLiad Main Menu. Usually he will request a photocopy or a book loan by clicking the appropriate button on a menu. Clicking a request button displays a request form. The customer fills in the bibliographic information about his request. He does not need to supply any personal information because it is already known to ILLiad. When the form is complete, the customer clicks a button to submit the information to ILLiad. The initial status of the request is "Submitted By Customer." If the request is for a photocopy of an article, ILLiad immediately updates the status to "Awaiting Copyright Clearance." The significance of these statuses will become apparent when we describe the staff functions of ILLiad.

Customers have other options on the ILLiad Main Menu.
If a photocopied article has been received via Ariel, and if the customer prefers electronic delivery of photocopies, then clicking a button lists the articles for that customer that are available for download. These articles are stored by ILLiad in Adobe's Portable Document Format (PDF). They can be read or printed within the Web browser using the free Adobe Acrobat Reader.

Other selections on the Main Menu allow customers to view details about any outstanding requests. For each request they can view a complete TransactionStatus history of the request as it went through processing. This history lists the date, time, and staff member's name associated with each change in status that occurred during processing. Customers can revise any request that has not yet been processed by the staff. They can renew loans for items in their possession, and they can see complete information about all their past requests. If a request was previously cancelled by the staff for incomplete information, or because a source for the item could not be found, the customer can display that request, edit the content, and resubmit the request with additional information that might now make it possible to obtain the item. Finally, the customer can revise his personal information, including addresses, delivery preferences, and ILLiad password.
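The status history shown to the customer is, in effect, a chronological read of the Tracking table described earlier. A brief sketch over the same illustrative schema (again, assumed column names, not ILLiad's actual code):

```python
# Chronological status history for one request, as displayed to the customer.
HISTORY_SQL = """
SELECT ChangedDate, ChangedBy, NewStatus
FROM Tracking
WHERE TransactionNumber = ?
ORDER BY ChangedDate
"""

def status_history(conn, transaction_number):
    """Return (date, person, status) rows for a request's processing history."""
    return list(conn.execute(HISTORY_SQL, (transaction_number,)))
```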
ILLiad composes and sends automatic e-mail notifications to customers when an item is received or when a request is cancelled. The e-mail messages are customized according to different pickup locations and customer characteristics. The message contains full details about the request, as well as other pertinent information, such as a reason for a cancellation and a call number for items owned by the Virginia Tech libraries. Returnables are picked up by the requester in the Interlibrary Loan office. The items are checked out to the customer on ILLiad by wanding a barcode on the loan label printed by ILLiad and attached to the book.

It is worth emphasizing what the customer does not have to do under ILLiad. Unlike most paper and Web request forms, the customer does not have to supply personal information each and every time he submits a request. This alone cuts in half the amount of time it takes the customer to complete each request.

Customer acceptance of ILLiad

During its first four months of operation, ILLiad was used by 1,444 individuals, including 363 faculty members, 813 graduate students, 102 staff members, and 166 undergraduates. It is not surprising that ILLiad was heavily used, as it is the only means of requesting items through Interlibrary Loan. What is remarkable is that the system was introduced in the middle of the second semester, a time usually regarded as undesirable for the introduction of significant procedural changes in academic routines.

The successful introduction of ILLiad can be attributed to several factors. First, about 90% of all ILL requests were already being submitted through Web-based forms. Paper request forms still existing at various library service points were gradually withdrawn over a period of months. As a result, the customer's interface with the ILL Department did not change greatly when ILLiad was introduced. Second, Virginia Tech is a computer-literate campus. Most ILL customers enjoy the convenience and technological sophistication of an online system that allows them to work from home and office to submit requests. Third, the user interface was designed and redesigned for logical and easy use. Graphic images provide cues to help the customer navigate the menus. A request that contains errors is presented back to the customer with clear instructions for revision and resubmission. Fourth, ILL staff gave demonstrations of ILLiad and its benefits to reference and information desk staff. This helped reference personnel answer questions when ILLiad was introduced.

The greatest stumbling block we anticipated when introducing ILLiad was the requirement that customers register for service and use a password to log on to ILLiad to submit requests. To avoid surprising our customers on the day ILLiad was introduced, we publicized the system on our Web pages and gradually shifted the existing Web environment into the new environment of ILLiad. Thus, the look and feel of the system for submitting requests was modified prior to changing the actual procedures. As a result, ILLiad's introduction was seen as a useful, convenient, and labor-saving transition, rather than as a surprising or disruptive change.

Informing the customer

Having focused on the customer throughout the development of the public face of ILLiad, we wanted to keep customers informed about the benefits of ILLiad. The best way to do this was to share with the customer everything the staff knew about interlibrary loan service. In particular, we wanted to inform the customer about how long it takes to get items through interlibrary loan. This information is available to everyone through the Web-based ILLiad Reports. ILLiad includes reports on turn-around time, on the distribution of delivery times, and on the status of all requests in the system. Other reports list the number of registered users from each academic department, the most requested journals, and the number of items obtained from each lending library. These reports are "live." That is, they are generated by ILLiad at the instant the user clicks on the desired item on the report menu. The information cannot be more current or complete because it is based on every request in the system, not just a sample. Interlibrary Loan may be the first public service unit on campus that enables its customers to generate their own reports about the unit's scope and effectiveness.

Because of ILLiad Reports, we know that between March 17 and July 30 the Interlibrary Loan Department delivered 215 requested items within one day of the request submission. We also know that 55% of all successful requests are delivered to the customer within 7 days. The average time to obtain an item, from the moment it is submitted to ILLiad until the customer is notified that the item is available, is 8.39 days. This time includes nights and weekends, not just weekdays.

WHAT THE STAFF SEES AND DOES

ILLiad includes a custom-programmed Windows client that the interlibrary loan staff uses to process ILL requests and to circulate borrowed items to customers. This client organizes much of the work according to the TransactionStatus of requests. The operator launches the client, logs on with his username and password, and sees a screen displaying the total number of requests for each status in the system. From this table, or from the ILLiad client menu bar, the operator selects a task to work on.

Validating new users

Anyone with Web access can fill out and submit ILLiad's registration form for new users.
As part of preprocessing procedures, the staff clicks on the menu selection that displays a list of new registrants. The names are reviewed against lists of faculty, students, and staff at Virginia Tech so that only eligible registrants are accepted into the system. ILLiad sends an automated e-mail message to each registrant confirming successful registration or disavowal. Because of Tech's size and complexity, confusion can arise about service eligibility for individuals not directly part of the university community. The message disavowing a registrant provides an explanation and leaves the path open for correction of an error by the staff.

Copyright clearance

The operator selects Copyright Clearance from the ILLiad menu to display a window with the photocopy requests to be reviewed. Two other windows simultaneously display a list of all titles previously obtained that have reached the copyright fair use limits. The first of these windows displays the titles from which five or more items were obtained. The second window displays those titles for which multiple requests were obtained from a single issue. The operator compares each new request against the previously requested titles to determine if a copyright fee must be paid.

If payment is required, the operator uses his Web browser to determine the copyright fee from the Copyright Clearance Center Web page. He enters the payment information into the copyright payment fields in the ILLiad client window. The entry includes base fee, per-page fee, and ISSN. The operator then clicks ILLiad's "Pay Copyright" button to submit the data to the ILLiad database. ILLiad then updates the transaction status to "Awaiting Request Processing." Periodically, ILLiad generates copyright payment reports to be used in paying fees to the Copyright Clearance Center. For those requests requiring payment but not listed on the CCC Web page, ILLiad changes the status to "Awaiting Document Provider Processing." The staff obtains the item through a commercial document provider who handles copyright payment as part of the provider's fee. Of course, there is always the option of cancelling a request if no means of paying the required copyright fees can be found.
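The first of the two fair-use windows, titles from which five or more items have already been obtained, can be approximated with a counting query. This sketch follows only the "five or more" threshold stated above; the RequestDate column and the use of the journal title as the grouping key are assumptions carried over from the earlier schema sketch, and real fair-use review also weighs factors not modeled here.

```python
# Titles that have reached the 'five or more items' fair-use limit this year.
# RequestDate (ISO text) is an assumed column; strftime extracts the year.
FAIR_USE_SQL = """
SELECT Title, COUNT(*) AS filled
FROM Transactions
WHERE RequestType = 'Photocopy'
  AND strftime('%Y', RequestDate) = ?
GROUP BY Title
HAVING COUNT(*) >= 5
"""

def titles_at_fair_use_limit(conn, year):
    """Return (title, count) pairs needing copyright-fee review."""
    return list(conn.execute(FAIR_USE_SQL, (str(year),)))
```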
Requesting items through OCLC

Requests ready for processing are searched in OCLC. The operator clicks the "Search Requests" button, and ILLiad presents the operator with a list of all requests ready to be searched. After selecting one of the requests for processing, the operator switches to an OCLC search window in Passport for Windows and presses a designated hot key. This invokes a macro that reads the request data from ILLiad and searches OCLC for the item. If the search does not result in a unique OCLC record, the operator continues with the usual manual search procedures. This may mean selecting a record from a menu, or revising the search argument if no records are retrieved by the automated search.

When a usable OCLC record is found, the holdings are examined and potential lenders are selected. The operator creates an ILL workform using the desired lending string. A Passport for Windows macro then copies all relevant data from the ILLiad request into the OCLC form. This includes customer name, article author and title, page numbers, etc. The macro then sends the request. After the request is sent, OCLC assigns an ILL transaction number and updates the workform display in Passport. The operator clicks a button on the ILLiad client window to import the OCLC number, the ILL number, and the lending string into ILLiad, where the information is attached to the customer's request. During this entire process the operator does not key any information about the requester or about the item to be borrowed.

Receiving items and notifying customers

When an item arrives from a lender, the ILL staff clicks the ILLiad client's "Receive from Lending Library" button to call up a search form. The item can be searched in ILLiad by ILL number, ILLiad transaction number, customer name, book or journal title, and by several other fields. Once the request is found, the operator enters the due date (if applicable), the lender, and any special instructions for a loaned item. If the incoming item is a loan or a printed photocopy, the operator queues the transactions for printing of mailing labels for articles and loan labels for loaned items. ILLiad then updates the status of each transaction to "Awaiting Customer Contact." The operator then clicks on the ILLiad menu selection "Contact Customers" and ILLiad sends an e-mail to the requester announcing arrival of the item. The e-mail contains all relevant information about the transaction. A small number of customers have indicated their preference to be notified by phone. For those customers, ILLiad prints a calling list and a staff member phones the requester with news that the item has arrived.

The printed articles are placed in envelopes for mailing and matched with the appropriate mailing labels. Loan labels are affixed to returnables, which are held for customer pickup. These labels are removable adhesive labels containing the requester's name, title of the loaned material, the due date, the number of pieces comprising the request, and any special instructions stipulated by the lending library. The loan labels also contain an ILLiad-generated barcode that allows automated checkout and checkin on ILLiad.

If the item is a photocopy that arrived through Ariel, and if the customer requested electronic delivery of such articles, then the ILL staff converts the Ariel file to Adobe's Portable Document Format (PDF). The PDF file is posted to the ILL server, where it can be accessed by the requester through the Web. ILLiad automatically sends an e-mail to the requester announcing that the item is available for download. As little as eight minutes has elapsed between the sending of the e-mail and the downloading of the item by the requester.

Returning borrowed items

When the customer returns a borrowed item, the ILL staff checks it in on ILLiad using the barcode on the loan label. ILLiad updates the transaction status to "Awaiting Return Label Printing." After checking in all returned items, the ILL staff prints return mailing labels and OCLC ILL forms by clicking a menu item on the ILLiad client. By printing these items on demand, ILLiad eliminates the need to file and store mailing labels and OCLC printouts used when returning items to the lender. There are no paper files whatsoever for interlibrary borrowing under ILLiad. Once the return paperwork is generated, the items are boxed for return and mailed back to the lender. ILLiad updates the TransactionStatus to "Request Finished."
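Gathering up the statuses named in this article, the borrowing life cycle can be summarized as a transition table. This is a deliberately simplified reconstruction: ILLiad distinguished 25 statuses, cancellation paths are omitted, and the "Request Sent to Lender" name is an assumed stand-in for whatever ILLiad actually recorded between OCLC submission and receipt.

```python
# Simplified map of the borrowing statuses named in this article. ILLiad used
# 25 statuses; side branches are omitted and one status name is assumed.
TRANSITIONS = {
    "Submitted By Customer": {"Awaiting Copyright Clearance",    # photocopies
                              "Awaiting Request Processing"},    # loans
    "Awaiting Copyright Clearance": {"Awaiting Request Processing",
                                     "Awaiting Document Provider Processing"},
    "Awaiting Request Processing": {"Request Sent to Lender"},   # assumed name
    "Request Sent to Lender": {"Awaiting Customer Contact"},
    "Awaiting Customer Contact": {"Checked Out to Customer"},    # loans only
    "Checked Out to Customer": {"Awaiting Return Label Printing"},
    "Awaiting Return Label Printing": {"Request Finished"},
}

def is_valid_transition(current, new):
    """True if the simplified model permits moving a request to `new`."""
    return new in TRANSITIONS.get(current, set())
```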
BENEFITS OF ILLIAD

ILLiad enables the library to provide a number of services programmatically, at the initiative and convenience of the customer. Such services include feedback to the customer about the current status of a request, enabling the customer to cancel a request or renew a loan, and enabling the customer to generate statistical reports about the performance of the interlibrary loan service.

In the Interlibrary Loan Department, ILLiad saves time and eliminates typing errors. After searching a request using OCLC's Passport for Windows, a single keystroke by the operator copies all customer and request information from ILLiad to the OCLC ILL workform and sends the request on OCLC. ILLiad then copies back from OCLC the correct title of the item, the ILL and OCLC numbers, and the lender string. ILLiad stores both the title originally requested by the customer and the title on the OCLC record selected by the staff. ILL staff are better able to answer customer inquiries because all data about a request is immediately available online. There is no need to search through multiple paper files when answering an inquiry. Customer service is improved by gathering statistical information about the performance of the lending libraries, enabling the staff to make better decisions when choosing a lender to supply an item.

While providing increased customer satisfaction, ILLiad is also saving time and money. Preliminary work measurements show that request processing is cut by an average of 9.7 minutes per item, which translates into a saving of more than $21,000 in student and staff time during the course of one year. This staff time is being devoted to improving other aspects of Interlibrary Loan service, including improved lending services to other libraries.

ILLiad's benefits are becoming apparent throughout the library. Detailed borrowing data contributes to informed decisions about collection development and cost avoidance. Finally, the investment in ILLiad is an example of "buying the future." The Virginia Tech interlibrary loan service is in a position to handle increased borrowing activity as customer needs increase.

ILLIAD "PROBLEMS"

ILLiad requires a higher level of computer sophistication on the part of ILL staff. To some this may be thought of as a problem. We see it as a benefit, in that it develops the ability of the staff and the customers to take advantage of the power of software to improve services. As with other aspects of library automation, ILLiad requires the library to employ knowledgeable computer specialists who can deal with the inevitable hard disk and network failures. Some see this shift in the skill set of library employees as a problem. We see it as a natural evolution that is appropriate in an information organization.

Confidentiality of information submitted by a customer at a public Web workstation has been suggested as a possible problem. Web browsers cache information and create a history file on the local workstation. This allows a subsequent user of the workstation to access the system under a previous user's name by using the browser's Back button to recall a page from the cache, or by finding a page with a user's personal information in the browser's history file. Customers concerned about this issue should take the kinds of precautions they are accustomed to using with any private information that is at risk of exposure in public places. They can access ILLiad only from their personal computer or an otherwise secure workstation.
Alternatively, they can access ILLiad from the supervised public workstations in the Interlibrary Loan office in the library. If they are working at a public workstation in the library or a campus computer lab, they can delete from the history file those pages that contain personal information. Also, they should exit from the Web browser before leaving the workstation. This prevents the browser's Back button from accessing the pages they were using.

FUTURE DEVELOPMENTS

ILLiad is being extended to automate the lending process. Rather than printing in duplicate the 45,000 lending requests received each year via OCLC, ILLiad will capture the requests to provide a completely online environment for lending. Data about items to be retrieved from the stacks for lending or photocopying will be downloaded to handheld computers. As items are retrieved, or not found, the status will be updated on the handheld computer. Information from the handhelds will be uploaded into ILLiad for transmittal to OCLC as to the final action on a lending request.

ABOUT THE AUTHORS

Harry M. Kriz is Assistant to the Dean of Libraries for Special Projects and Head of the Interlibrary Loan Department in the University Libraries at Virginia Tech. Jason Glover was a programmer in the ILL Department when he developed ILLiad. He is now a programmer/analyst with the Virginia Technical Information Center. Kevin C. Ford was in charge of the ILL borrowing unit. He now teaches history in the Greensville County High School in Emporia, Virginia.

ACKNOWLEDGEMENTS

We wish to thank Dean of University Libraries Eileen Hitchingham for recognizing the value of committing resources to improve interlibrary loan services. We are grateful for the consistent support and assistance of our colleagues in the Interlibrary Loan Department. Their knowledge, patience, and enthusiasm for change were vital to the successful implementation of ILLiad. Janet Bland helped educate customers about the benefits of using ILLiad. Donna Deplazes and Heather Ford created useful Web pages and helpful documentation. Lucy Cox ran the lending operations efficiently even when we "borrowed" her workers. And Sharon Gotkiewicz managed the day-to-day operations of the department and kept us rooted in reality.

ILLIAD AVAILABILITY

ILLiad is available under license from Virginia Tech Intellectual Properties, Inc. VTIP can be contacted at (540) 231-3593 or through its Web page at http://www.vtip.org/.

FOOTNOTES

1. Kent, Allen, et al. (1979). Use of Library Materials: The University of Pittsburgh Study. New York: Marcel Dekker, Inc. The authors of this classic study found that any given book had only about a 50% chance of being used in the seven years following its purchase.

2. Broadus, Robert N. (1980). "Use Studies of Library Collections." Library Resources & Technical Services, 24(4), Fall, 317-324.

----

FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies

Jeffrey Mixter, Research Assistant
Eric R. Childress, Consulting Project Manager
OCLC Research

© 2013 OCLC Online Computer Library Center, Inc. This work is licensed under a Creative Commons Attribution 3.0 Unported License.
http://creativecommons.org/licenses/by/3.0/

August 2013

Updates:
3 September 2013, pp. 30-33: updated RMIT Publishing case study.
16 September 2013, p. 15: updated University of Quebec in Montreal details in Table 2; p. 47: added screen shot to University of Quebec in Montreal case study.

OCLC Research
Dublin, Ohio 43017 USA
www.oclc.org

ISBN: 1-55653-459-0 (978-1-55653-459-1)
OCLC (WorldCat): 850981735

Please direct correspondence to: fast@oclc.org

Suggested citation: Mixter, Jeffrey, and Eric R. Childress. 2013. FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies. Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf.

Contents

Introduction
About FAST
Adopters of FAST
Non-adopters of FAST
User feedback on FAST vocabulary and services
Conclusion
Case Studies
1. Bodleian Libraries (University of Oxford)
2. Databib.org
3. National Library of New Zealand (Te Puna Mātauranga o Aotearoa)
4. OCLC Online Computer Library Center, Inc.
5. RMIT Publishing
6. Sterling and Francine Clark Art Institute Library
7. Universiteitsbibliotheek Amsterdam (University of Amsterdam Library)
8. University of Illinois at Chicago
9. University of North Dakota
10. Biodiversity Heritage Library (BHL)
11. Minnesota State University, Mankato
12. University of Western Ontario (Ph.D. Research Project)
13. People of the Founding Era
14. l'Université du Québec à Montréal (UQAM) [University of Quebec in Montreal]
15. University of Texas School of Public Health at Houston
16. World Maritime University
Notes
References

Tables

Table 1. Agencies known to have adopted FAST
Table 2. Agencies known to have considered, but not adopted FAST
Table 3. FAST heading queries in Classify (2012)
Table 4. OCLC Research projects using FAST

Figures

Figure 1. Positive attributes cited by FAST adopters
Figure 2. FAST facets use by adopters
Figure 3. Positive attributes of FAST
Figure 4. An individual record in Databib.org
Figure 5. An RDF sample in Databib.org
Figure 6. National Library of New Zealand's use of FAST
Figure 7. A list of FAST subjects in search result for OCLC Classify
Figure 8. A list of titles in Classify
Figure 9. A list of FAST index terms displayed in Informit
Figure 10. Individual record display in Informit
Figure 11. A record display in the Sterling and Francine Clark Art Institute Library's CONTENTdm system
Figure 12. A record display in the University of North Dakota's use of CONTENTdm system
Figure 13. Individual record illustrating the resemblance between University of Quebec's RASUQAM system and FAST

Introduction

Over the past ten years, various organizations, both public and private, have expressed interest in implementing FAST in their cataloging workflows. As interest in FAST has grown, so too has interest in knowing how FAST is being used and by whom. Since 2002, eighteen institutions (see tables 1 and 2) in six countries have expressed interest in learning more about FAST and how it could be implemented in cataloging workflows. Currently OCLC is aware of nine agencies that have adopted or that support FAST for resource description. This study, the first systematic census of FAST users undertaken by OCLC, was conducted, in part, to address these inquiries. Its purpose was to examine:

• how FAST is being utilized;
• why FAST was chosen as the cataloging vocabulary;
• what benefits FAST provides; and
• what can be done to enhance the value of FAST.

Interview requests were sent to all parties that had previously contacted OCLC about FAST. Of the eighteen organizations contacted, sixteen agreed to provide information about their decision whether to use FAST (nine adopters, seven non-adopters). This document presents:

• a brief overview of FAST;
• a brief analysis of common characteristics of parties that have either chosen to adopt FAST or chosen against using FAST;
• suggested improvements for FAST vocabulary and services;
• tables summarizing FAST adopters and non-adopters; and
• sixteen individual "case studies" presented as edited write-ups of interviews.
Since 2002 eighteen institutions (see table 1) in six countries have expressed interest in learning more about FAST and how it could be implemented in cataloging workflows. Currently OCLC is aware of nine agencies that have actually adopted or support FAST for resource description. This study, the first systematic census of FAST users undertaken by OCLC, was conducted, in part, to address these inquiries. Its purpose was to examine: • how FAST is being utilized; • why FAST was chosen as the cataloging vocabulary; • what benefits FAST provides; and • what can be done to enhance the value of FAST. Interview requests were sent to all parties that had previously contacted OCLC about FAST. Of the eighteen organizations contacted, sixteen agreed to provide information about their decision whether to use FAST (nine adopters, seven non-adopters). This document presents: • a brief overview of FAST; • a brief analysis of common characteristics of parties that have either chosen to adopt FAST or chosen against using FAST; • suggested improvements for FAST vocabulary and services; • tables summarizing FAST adopters and non-adopters; and • sixteen individual “case studies” presented as edited write-ups of interviews. http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf August 2013 Jeffrey Mixter and Eric R. Childress, for OCLC Research Page 7 Please note: OCLC Research also uses FAST in several prototypes. Thus, it is treated as an adopter in the remainder of this report, including the case-study write-ups, in order to share our own assessment of FAST. (See table 4 for a comprehensive list of OCLC services using FAST.) About FAST FAST (Faceted Application of Subject Terminology)1 is derived from the Library of Congress Subject Headings (LCSH), one the library domain’s most widely-used subject terminology schemes. The development of FAST has been a collaboration of OCLC Research and the Library of Congress. The origin of FAST can be traced to observations by OCLC Research staff involved with the OCLC Cooperative Online Resource Catalog (CORC)2, which focused on the cataloging of Web resources. CORC participants typically wanted to be able to adopt simple, low-cost, low- effort approaches to describing Web resources (e.g., using Dublin Core rather than AACR2 and MARC). In the course of the CORC project, it became clear that a significant barrier to minimal-effort resource description was the lack of an easy-to-learn and -apply general subject vocabulary. Additionally, work during the same time period by the Subcommittee on Metadata and Subject Analysis of the Association for Library Collections and Technical Services’ Subject Access Committee identified specific functional requirements of subject data in the metadata record (ALCTS 1999), and these requirements mapped well to the intended outcomes of what would become the FAST project. So, FAST has been developed in large part to attempt to meet the perceived need for a general-use subject terminology scheme which is: • simple to learn and apply; • faceted-navigation-friendly; and • modern in its design. The full development of FAST has required several years and resulted in an eight-facet vocabulary with a universe of approximately 1.7 million headings across all facets. FAST facets are designed to be used in tandem, but each may also be used independently. 
Adopters of FAST

Institutions which have implemented FAST have done so for a variety of purposes, from using it exclusively for cataloging digital materials to making it the primary cataloging vocabulary. The FAST project team, based on anecdotal discussions over the years, has assumed that FAST would appeal most to agencies needing a rich but low-investment vocabulary. Adopters interviewed for this report validated this basic appeal of FAST.

A manual categorization of comments by adopters indicates [number reporting in parentheses] that ease of use (8), simple syntax (5), FAST's suitability for use by non-specialist staff (5), one-to-one heading-to-authority record structure (4), and the rich vocabulary (4) were the most frequently cited positive attributes of FAST. The next most cited attributes were ease of learning (3), availability as linked data (3), support for faceted navigation (2), ease of implementation (2), and FAST's potential value as a "super vocabulary" to facilitate uniform indexing of metadata from multiple sources and/or for a diverse range of resources (2) (see figure 1).

Figure 1. Positive attributes cited by FAST adopters
[Bar chart of mentions: easy to use (8), simple syntax (5), non-cataloger use (5), all headings are linked (4), rich vocabulary (4), easy to learn (3), linked data (3), faceted navigation (2), easy to implement (2), uniform indexing (2).]

Other positive attributes, each cited by only a single agency, included: the usefulness of FAST tools (searchFAST, assignFAST), the ability to make quick assignment of FAST headings, FAST's short strings (when compared to other vocabularies), and the importance of FAST being published under an open license.

The types of resources for which FAST headings are being assigned are varied and include: main-stacks library materials, journal articles, book chapters, digital materials (including datasets, images, historical documents, and institutional repository resources), rare books, and state/local government documents. Several of the agencies have used or are considering using FAST for short-term special projects, but the majority of those interviewed are using FAST routinely for selected categories of resources. For those agencies that have switched from another vocabulary to FAST, the most common predecessor vocabulary is LCSH. Several agencies report using FAST (selected facets) in combination with another vocabulary or vocabularies. Based on mentions in the interviews of specific facets, the most frequently adopted facet is Topics, followed by Geographic Names and Form/Genre. Other facets are used far less frequently (see figure 2).
Figure 2. FAST facets use by adopters
[Bar chart of mentions: Topics (9), Geographic Names (5), Form/Genre (4), Personal Names (3), Corporate Names (3), Chronological (3), Events (2), Titles (2).]

The types of institutions represented among the adopters of FAST include four universities (Oxford, UvA, UIC, UND), a national library (the National Library of New Zealand/Te Puna Mātauranga o Aotearoa), a special library (the Clark), a publisher (RMIT), and other agencies (Databib, TRCDL, OCLC) (see table 1). Adopters are located in five countries: six in the United States, and one each in Australia, the Netherlands, New Zealand, and the United Kingdom. A variety of system environments for metadata editing and user discovery were referenced, but only one system, CONTENTdm, received multiple mentions (by two of the agencies interviewed).

Many of the agencies have adopted FAST for specific categories of resources. Databib and RMIT are relying on FAST as their sole or primary controlled vocabulary. Databib initially made use of LCSH but found it cumbersome for their particular purpose, cataloging data repositories, and switched to FAST. FAST supports search and navigation on the Databib website and is also expressed in linked data (RDFa) output by Databib.
Non-adopters of FAST When interviewed for this study, non-adopters cited many of the same positive attributes of FAST as adopters, but with some differences in emphasis (see figure 3). FAST’s simple syntax was the most-often-cited positive attribute. Because of the small number of agencies interviewed, many attributes were referenced by only a single interview subject. http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf August 2013 Jeffrey Mixter and Eric R. Childress, for OCLC Research Page 12 Figure 3. Positive attributes of FAST Overall, reasons for not choosing FAST varied and often reflected specific time-and-place considerations that did not reflect any shortcomings in FAST. Nevertheless, there were some re-occurring themes in cases where FAST was not adopted in favor of alternatives; reasons such as an absence of customer support and concerns about OCLC’s commitment to FAST going forward. Three agencies (University Of Texas School Of Public Health at Houston, the Biodiversity Heritage Library, and People of the Founding Era) chose not to adopt FAST for funding and/or administrative reasons. One agency (University of Quebec in Montreal) has developed a French language vocabulary similar to FAST. Two parties did identify specific barriers to their use of FAST: Minnesota State University, Mankato had shown interest in using FAST geographic facets headings but abandoned the idea after communications between the school and OCLC broke down in late 2006. Olha Buchal, a former Ph.D. student at the Western University of Ontario, considered using FAST in her Ph.D. research but chose instead to use LCSH due to limitations in the geographic headings. 0 2 4 6 8 10 simple syntax uniform indexing easy to use non-cataloger use all headings are linked easy to implement faceted navigation geographic search innovative uses staff efficiency FAST Adopters FAST Non-Adopters Mixter and Childress, for OCLC Research. 2013. http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf August 2013 Jeffrey Mixter and Eric R. Childress, for OCLC Research Page 13 User feedback on FAST vocabulary and services The interviews presented a welcome opportunity to learn about not just the respective agencies’ use, or considered use, of FAST, but also their overall experience with FAST as a “product” (albeit not a true OCLC product, as FAST is still a research project). There were some consistently mentioned areas for improvement, issues that pose either barriers to use or cause for concern. These include: • Better customer service and communication: The email correspondence from interested parties indicates that requests for further information regarding the development of FAST and FAST tools were frequently ignored or not followed up. The lack of communication caused at least one of the interested organizations to abandon the idea of using FAST. In particular, OCLC needs to communicate changes and updates, including development, release, and update notes, to FAST on its website and by other means. • User proposals: Respondents have requested a way to suggest new or revised FAST headings for authorization consideration. 
• Implementation tips and advice and documentation: Users want suggestions from OCLC on how libraries can use the FAST tools to help improve the FAST implementation as well as its user experience. At least four interview subjects mentioned the value of the Chan and O’Neill book, but there also were requests for briefer, online documentation as well. • Commitment: Some respondents asked for a statement of commitment from OCLC in regards to how FAST will be supported in the future. • Enrichments: The Geographic headings are of particular interest to current FAST users, but there is a need to assure that more headings have associated latitude and longitude coordinates. • Fix Form/Genre: Some respondents would prefer the ability to choose between FAST and LC Form/Genre Headings. • Add FAST to WorldCat: Some respondents requested the addition of FAST headings to WorldCat records and the implementation of FAST into WorldCat.org. • Disclose user base: University of Amsterdam, which is currently using FAST in a pilot study, is concerned that no other large research universities (that they are aware of) are currently using FAST. http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf August 2013 Jeffrey Mixter and Eric R. Childress, for OCLC Research Page 14 All of the organizations that were interviewed expressed interest in the tools associated with the FAST project (searchFAST, mapFAST, assignFAST and the FAST linked data API). In addition to improving cataloging workflow, the tools could also be used for improving end- user experience (this is possible due to the simple syntax associated with FAST headings). Table 1. Agencies known to have adopted FAST4 Agency (Country) FAST Usage Facets used Notes Case # Bodleian Libraries, University of Oxford (UK) Institutional repository, data catalog, institutional data archive Topical FAST is used in 3 services 1 Databib.org (US) Used for cataloging data repositories Topical Using FAST Linked Data 2 National Library of New Zealand (NZ) Indexing national articles Geographic, Topical, and Forms Replaced use of APAIS5 3 OCLC (US) Classify, Fiction Finder, Kindred Works, WorldCat Identities, more… All facets (Interview covers selected OCLC services) 4 RMIT Publishing (AU) Article, book chapter, book indexing All facets RMIT’s Informit service 6 5 Sterling and Francine Clark Art Institute Library (US) Cataloging a rare books collection Topical and Form Used in CONTENTdm 6 Theodore Roosevelt Center Digital Library, Dickinson State U. (US) Cataloging digitized materials Topical, Event, Geographic, and Corporate names 7 Used in a DARMA system [No Response] University of Amsterdam (NL) Cataloging monographic materials All facets 7 University of Illinois at Chicago (US) Cataloging IL state documents [unknown] 8 University of North Dakota (US) Cataloging digitized materials All facets Used in CONTENTdm 9 http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf FAST (Faceted Application of Subject Terminology) Users: Summary and Case Studies http://www.oclc.org/content/dam/research/publications/library/2013/2013-04.pdf August 2013 Jeffrey Mixter and Eric R. 
Table 2. Agencies known to have considered, but not adopted, FAST[8]

Agency/Person (Country) | Possible use | Facets | Notes | Case #
Biodiversity Heritage Library (US) | Cataloging digitized materials | Topical | | 10
California Historical Society (US) | | | | [No Response]
Minnesota State University, Mankato (US) | Cataloging student research papers | Geographic | | 11
Monash University Library (AU) | | | | [No Response]
Olha Buchel, University of Western Ontario (CA) | Visualization of Ukrainian materials | Geographic | | 12
People of the Founding Era (US) | Cataloging digitized materials | | | 13
University of Quebec in Montreal (CA) | Mapping between thesauri, indexing all types of materials | All facets | FAST is similar to a vocabulary from UQAM | 14
University of Texas School of Public Health at Houston (US) | Local materials | | | 15
World Maritime University (SE) | | Geographic, Topical, Uniform Titles | | 16

Conclusion

A primary motivation for undertaking this census of FAST adopters was to enumerate the agencies using FAST, understand their use cases, and gain a sense of the degree to which FAST was suited to its purpose. Since OCLC did not know in advance whether the agencies that had expressed interest in FAST were adopters, this census also includes information about non-adopters, the better to understand cases where FAST was considered but not adopted. This census is also a response to the most common question from parties that use FAST or are considering using FAST: "Who is using FAST?"

The case studies that follow present summaries of interviews with sixteen agencies, including nine adopters and seven non-adopters. The interviews were conducted by phone in late 2012 and early 2013. Initially, the intent was to create a brief report for OCLC's internal use, but our interview subjects and other parties expressed such interest in seeing the report that we returned to our interview subjects for permission to include their interviews in a publicly accessible version of the document. All subjects readily agreed. Draft versions of the summaries were provided to the respective interview subjects for review and correction. Final edits to the summaries were made by OCLC Research staff.

OCLC Research is very grateful to all of our interview subjects for being generous with their time and providing very useful information and, in some cases, screenshots and other material. It has also been very helpful and illuminating for the FAST team to have feedback on how FAST, the FAST tools, and the customer service aspects of supporting FAST can be improved.

Case Studies
1. Bodleian Libraries (University of Oxford)
Interview Date: 17 May 2013
Contact: Sally Rumsey

Interview Notes: The Bodleian Libraries have recently been developing a variety of new services to help users from the University of Oxford community submit and retrieve deposited information resources (e.g., reports, papers, and research datasets). In developing these services, Bodleian staff decided that a single, preferred controlled vocabulary should be used to support uniform indexing across all of the new services. One of the initial challenges in implementing this approach was identifying a single vocabulary that could cover a wide range of topics while remaining specific enough to describe a given topic in detail. Additionally, the staff preferred a vocabulary that was easy and intuitive to use and implement. Finally, the library wanted to use Linked Data but did not want to reinvent the wheel and create its own custom linked data subject vocabulary. Currently, scholars at the University use a wide range of topic-specific vocabularies within their own specialist subject disciplines (e.g., the Journal of Economic Literature (JEL) Classification System[9] and the Mathematics Subject Classification[10] from the American Mathematical Society (AMS)). FAST was identified as a candidate general-use vocabulary in the course of investigating various options and was eventually selected as the vocabulary of choice for the Bodleian Libraries' research outputs services. One of the reasons for choosing FAST was the intuitive nature of navigating the various subject headings. Ms. Rumsey also noted that technical staff at Bodleian Digital Libraries Systems and Services were impressed by the Linked Data that FAST supported. (Ms. Rumsey first heard about the FAST vocabulary around 2009 and noted that FAST was considered for use in the JISC-funded Building the Research Information Infrastructure (BRII) project[11], which attempted to create an entity store that could serve as a foundation for all of the Bodleian Libraries' repositories.)

The FAST vocabulary is being implemented in three different projects: the Oxford University Institutional Repository, the data catalog (DataFinder), and the DataBank institutional data archive. FAST will not be used by library catalogers; rather, end-users making manual deposits will use FAST to add subject terms to submitted works. Additionally, Oxford hopes to add FAST subject terms to material that is bulk-uploaded to the various systems, though this will require a mapping tool to convert existing subject headings to FAST (a rough sketch of this kind of lookup follows below). The library staff have not developed any training material for using the FAST vocabulary, but they intend to develop a short "how to" section for users.
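The report does not describe the mapping tool Oxford would need; as a rough illustration of the kind of lookup such a tool performs, the following Python sketch converts legacy subject heading strings to FAST headings through a hand-built crosswalk. The crosswalk entries, function names, and headings here are invented for illustration; a production tool would be driven by the published FAST files or a service such as the FAST Converter.

# Illustrative sketch only: maps legacy subject heading strings to FAST
# headings via a hand-built crosswalk. The crosswalk entries below are
# hypothetical; a real tool would be driven by the published FAST files
# or a service such as the FAST Converter.

# Hypothetical crosswalk from legacy (e.g., LCSH-style) strings to
# (FAST heading, FAST facet) pairs.
CROSSWALK = {
    "Economics--History": ("Economic history", "Topical"),
    "Mathematics--Study and teaching": ("Mathematics--Study and teaching", "Topical"),
}

def map_to_fast(legacy_heading: str):
    """Return the FAST equivalent of a legacy heading, or None if the
    heading needs manual review."""
    return CROSSWALK.get(legacy_heading.strip())

def convert_batch(headings):
    """Split a batch of legacy headings into auto-converted FAST terms
    and leftovers that require a cataloger's attention."""
    converted, needs_review = [], []
    for h in headings:
        match = map_to_fast(h)
        (converted if match else needs_review).append((h, match))
    return converted, needs_review

if __name__ == "__main__":
    done, review = convert_batch(["Economics--History", "Alchemy--Early works to 1800"])
    print(f"auto-converted: {len(done)}, for manual review: {len(review)}")

Anything the crosswalk cannot resolve is routed to manual review, mirroring the pattern reported by several adopters in this study: automatic conversion for most headings, cataloger intervention for the remainder.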
2. Databib.org
Interview Date: 7 February 2013
Contact: Michael Witt, Purdue University

Interview Notes: Professor Witt first heard about FAST two years ago through an OCLC news release. The vocabulary looked very applicable to the work he was doing with the IMLS-funded Databib project,[12] but at the time the OCLC FAST licensing was too restrictive, so he opted not to use it. In December of 2011, OCLC released FAST as experimental Linked Data and changed the licensing agreement to an Open Data Commons Attribution License. This change in licensing, together with the fact that FAST was now available as Linked Data, prompted Witt to reconsider using FAST in the Databib.org project.

Databib.org is a website that catalogs data repositories and allows users to quickly and easily find and access data for research. (The repositories primarily contain research datasets such as instrument and sensor data, spreadsheets, interview transcripts, surveys, observation logs, bioinformatics data, software source code, etc.) Prior to using FAST, Databib.org was using LCSH for subject cataloging, but the team decided that the complexity of the vocabulary was too frustrating and that it took too much time to assign terms to a record. The LCSH dataset that Databib.org was using included over 400,000 terms/URIs, of which only about 3,000 were used. In October of 2012, Databib.org began using FAST as its primary cataloging vocabulary. The team used the FAST Converter to match the existing LCSH terms with FAST headings. Approximately 2,000 terms matched automatically; another 100 required manual intervention to match.

Databib.org switched to FAST (for a sample display, see figure 4) because the vocabulary was easier to use and implement than LCSH. Mr. Witt commented that FAST was developed for applications such as Databib.org: it allows inexperienced catalogers to add headings quickly, and it allows users to easily discover material by using facets. Databib.org users can submit records for data repositories and subsequently catalog them with appropriate subject terms. Consequently, the team needed a simple vocabulary that could be used by people having little or no knowledge of cataloging vocabularies or practices.

Databib.org downloads the FAST Topical facet headings to its local system and has an auto-complete function that assists users in finding and selecting terms for use on a record. Users spend approximately 20 minutes cataloging a record, each of which requires a minimum of one subject heading. After a record is submitted to Databib.org, it is reviewed by the Editorial Board and, if need be, changes or additions are made. Databib.org does not have any professional user's guide for FAST but does provide a one-page outline for users to reference when adding subject headings. Databib.org currently has over 500 repositories cataloged using FAST. In addition to the plain subject term string, Databib.org also exposes the FAST URI serialized as RDFa (see figure 5, and the sketch below) in the hope that, as the Semantic Web continues to develop, Linked Data will become more relevant and important for data organization, data searching, and data retrieval.

Figure 4. An individual record in Databib.org

Figure 5. An RDFa sample in Databib.org
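Figure 5 reproduces Databib.org's markup only as an image. As a rough approximation of what exposing a FAST URI in RDFa can look like, the following Python sketch renders an HTML fragment that pairs a display heading with its FAST Linked Data URI. The identifier 1234567 is a placeholder, and the choice of the dcterms:subject property and the surrounding markup are illustrative assumptions, not Databib.org's actual code.

# Illustrative sketch only: renders an HTML fragment that exposes a FAST
# heading both as a display string and as an RDFa-annotated link to its
# FAST Linked Data URI. The identifier 1234567 and the use of
# dcterms:subject are hypothetical choices, not Databib.org's markup.
from html import escape

def fast_subject_rdfa(record_url: str, heading: str, fast_id: str) -> str:
    fast_uri = f"http://id.worldcat.org/fast/{fast_id}"  # FAST Linked Data URI pattern
    return (
        f'<div about="{escape(record_url)}" '
        'prefix="dcterms: http://purl.org/dc/terms/">\n'
        f'  <a property="dcterms:subject" href="{escape(fast_uri)}">'
        f'{escape(heading)}</a>\n'
        "</div>"
    )

if __name__ == "__main__":
    print(fast_subject_rdfa("http://example.org/repo/42", "Bioinformatics", "1234567"))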
3. National Library of New Zealand (Te Puna Mātauranga o Aotearoa)
Interview dates: 17 and 25 October 2012
Contacts: Karen Rollitt and Diana Sola (interviewed on separate occasions)

Interview Notes: Karen Rollitt, 17 October: Ms. Rollitt first heard about FAST when Eric Childress and Ed O'Neill ran a tutorial at the 2003 Dublin Core Conference (DC-2003) in Seattle, Washington (USA). The National Library of New Zealand (NLNZ) has been using FAST since 2005 to index national articles in Index New Zealand,[13] and the article collection currently includes approximately 750,000 items. Prior to using FAST, the library used the APAIS (Australian Public Affairs Information Service) Thesaurus for all of its indexing. It was determined that APAIS was not detailed enough, and there was fear that the thesaurus would cease to be updated or maintained. The library adopted FAST because it was more intuitive and easier to use than LCSH. Since the library is only indexing articles, it was not cost-effective to train staff, some of whom are not catalogers, to use LCSH. In addition to being easier to use than LCSH, the library liked the fact that all FAST headings are backed by authority files, which saves time and effort for indexers. The library also uses Ngā Ūpoko Tukutuku / Māori Subject Headings[14] to provide localized and culturally specific terms.

For the purposes of indexing articles, NLNZ uses FAST topical and geographic headings (for a sample record, see figure 6). The library has had problems downloading FAST updates and would like continued support and maintenance of the vocabulary. Ms. Rollitt is also concerned that there is little communication from OCLC about updates or development news regarding FAST. Even though the library has a local copy of the FAST authority files, the indexers frequently use searchFAST to find the most up-to-date terms. This is due primarily to difficulties downloading new facet headings to their local systems; the source of these difficulties remains unclear.

Diana Sola, 25 October: Ms. Sola first became aware of FAST in 2007, while working with the NLNZ Indexing Team. Like Karen Rollitt, she mentioned that the library had used APAIS prior to adopting FAST. When concerns arose about future maintenance and the broad use of subject terms in the APAIS thesaurus, the library began to consider other possible vocabularies to use for article indexing, including LCSH.
They decided to test FAST for one month because it was relatively easy to learn and implement and because its facets lent themselves well to indexing. The library currently employs 12 indexers, 10 of whom are full-time employees. While the staff is well experienced, not all have cataloging experience; this was one of the factors considered when determining which vocabulary to adopt. After FAST was implemented, the library had initial problems converting from APAIS to FAST, due primarily to problems with the local database. There was no initial effort to retroactively add FAST headings to older records, but the staff is now doing so during routine database maintenance. As part of the initial implementation, the NLNZ Indexing Team developed internal training programs and tutorials for indexing staff on how to implement and use FAST.

The library currently uses the topical, geographic, and form facets. It has mapped most APAIS headings to FAST, although manual work is required to convert APAIS headings with no direct match to FAST headings. In addition to using FAST authorized headings for indexing, the staff also use FAST's personal name syntax and format to create their own local list of personal names. Since 2009, FAST terms have been linked with the library's indexing database so users can use them to search the article database for desired material.

In the future, the library would like to be able to suggest new or revised FAST headings for authorization consideration. Additionally, the library would like to receive more regular updates about current FAST developments and releases. Ms. Sola was familiar with services such as mapFAST but had not considered using them for any future end-user services. She was interested in hearing how OCLC researchers think these tools can be used or leveraged by libraries to improve both FAST implementations and the end-user experience.

Figure 6. National Library of New Zealand's use of FAST

4. OCLC Online Computer Library Center, Inc.
Interview Date: 16 January 2013
Contact: Diane Vizine-Goetz, Ph.D.

Interview Notes: Dr. Vizine-Goetz has used FAST in three of her OCLC research prototypes: Classify (see figures 7 and 8 for samples of FAST headings displayed in a search result), Kindred Works, and Fiction Finder. She chose to use FAST primarily because all of the headings are controlled, whereas LCSH is not controlled in WorldCat.org. Her research projects primarily used the topical and geographic facets, although in both Kindred Works and Fiction Finder the Form facet was also implemented. FAST is a frequently used index in OCLC Classify (see table 3 for FAST-use metrics for 2012). In addition to the authority control of FAST headings, Dr. Vizine-Goetz also liked not having to deal with long LCSH strings that can at times confuse end-users.
Overall, Dr. Vizine-Goetz is very impressed with FAST regarding its ease of use for researchers as well as end-users. She noted that she intends to continue to use FAST for internal, self-contained OCLC projects.

Although FAST is easy to use, there were a few problems with some of the current headings, and Dr. Vizine-Goetz mentioned a few improvements that could be made to increase the overall utility of the vocabulary for future projects. A challenge with using FAST is that FAST headings are not currently added to a large number of OCLC bibliographic records in the production version of WorldCat, and so are not in WorldCat.org, which makes information retrieval using FAST headings problematic. She suggested that if FAST were implemented in WorldCat.org, search and retrieval using FAST in projects such as Kindred Works could be greatly improved. Another area of concern for Dr. Vizine-Goetz was FAST's Form facet: she thinks that the various form headings need to be cleaned up to conform to the expanding LCGFT (Library of Congress Genre/Form Terms) vocabulary. FAST currently imports and maps headings from both vocabularies, which can cause overlap and confusion on the part of catalogers.

Another facet that Dr. Vizine-Goetz would like to see improved is the chronological headings. She thinks that these headings should be cleaned up so that chronological time periods can be matched more accurately with resources. She also suggested that the FAST team continue its ongoing efforts to add latitude and longitude coordinates to geographic headings. As a final note, Dr. Vizine-Goetz thought that it would be interesting to see how end-users respond to FAST's presentation of Uniform Title headings. She is interested in seeing how they use and understand not only the concept of a uniform title while conducting searches but also the structural form of uniform titles (e.g., Romeo and Juliet (Shakespeare, William) or Real Housewives of Atlanta (Television Program)).

Figure 7. A list of FAST subjects in a search result for OCLC Classify

Figure 8. A list of titles in Classify

Table 3 shows the number of FAST heading queries conducted in Classify each month of 2012 (the report also presents these figures as a bar chart).

Table 3. FAST heading queries in Classify (2012)

Month | Subject queries
Jan | 221,886
Feb | 230,432
March | 194,315
April | 230,432
May | 182,584
June | 340,456
July | 354,161
Aug | 253,860
Sept | 263,186
Oct | 333,533
Nov | N/A
Dec | 155,755
Table 4. OCLC Research projects using FAST

Project Name | URL / Information | Contact | Email
OCLC Linked Data | http://www.oclc.org/us/en/news/releases/2012/201238.htm ; API: http://oclc.org/developer/documentation/worldcat-identities/response-details | Rick Bennett | bennetr@oclc.org
WorldCat Identities | http://worldcat.org/identities/ ; API: http://oclc.org/developer/documentation/worldcat-identities/response-details | Ralph LeVan | levan@oclc.org
WorldCat Genres | http://www.worldcat.org/genres/ | Diane Vizine-Goetz | vizine@oclc.org
WorldCat Facebook App | https://apps.facebook.com/worldcat/ | Bruce Washburn | washburb@oclc.org
Enhanced WorldCat (Research Version) | Enriched version of WorldCat stewarded by OCLC Research (OCLC internal access only) | Kerre Kammerer | kammerer@oclc.org
Kindred Works | http://experimental.worldcat.org/kindredworks/ | Diane Vizine-Goetz | vizine@oclc.org
Classify | http://classify.oclc.org/classify2/api_docs/index.html ; API: http://classify.oclc.org/classify2/api_docs/index.html | |
Fiction Finder | http://www.oclc.org/research/activities/fictionfinder.html (currently on hiatus) | |
assignFAST | http://experimental.worldcat.org/fast/assignfast/ ; API: http://oclc.org/developer/services/assignfast | Rick Bennett | bennetr@oclc.org
mapFAST | http://experimental.worldcat.org/mapfast/ ; API: http://oclc.org/developer/services/mapfast | |
searchFAST | http://fast.oclc.org/searchfast/ (GUI for FAST) | |
FAST Linked Data | http://experimental.worldcat.org/fast/ ; API: http://oclc.org/developer/services/fast-linked-data-api | |

5. RMIT Publishing[15]
Interview Date: 23 January 2013
Contact: Leanne Whitby

Interview Notes: Ms. Whitby first heard about FAST after discovering and reading chapter 12 of Library of Congress Subject Headings: Principles and Application by Lois Mai Chan (2005). Her initial reaction to the new vocabulary was that it would be a great replacement for RMIT's existing indexing vocabulary, which at the time relied on LCSH (after having used the APAIS Thesaurus for a number of years). RMIT primarily indexes articles, reports, conference papers (both professional and academic), book chapters, and books.
All of RMIT's indexed materials are available through its website, www.informit.com.au.[16] In early 2011 RMIT undertook a project to develop a new indexing interface that used FAST headings as a replacement for LCSH (see figures 9 and 10 for sample displays). The switch was made primarily because of the ease of use that FAST offered when compared to LCSH. Since the indexers are generally not trained in cataloging or librarianship, they found it difficult to use the lengthy and sometimes complicated LCSH strings to create article subject indexes. Ms. Whitby expressed the concern that LCSH was simply too complex for the average indexer. She also commented that FAST offered improved term governance, since all FAST headings are also authority records.[17]

The updated indexing interface was launched in February of 2012 (RMIT Publishing 2012).[18] The system uses a custom indexing interface that allows indexers to select FAST headings from a dropdown menu, with auto-suggest included to help decrease the time spent finding a desired term. If a specific term is not found, the indexers are instructed to use the OCLC searchFAST service to find the appropriate term and manually import it into the indexing system. Every morning an automated job scans all new FAST headings in the RMIT systems and cross-references them with the FAST SRU database to verify that they are valid (a rough sketch of this kind of check appears below). RMIT uses all of FAST (i.e., all of the facets) and keeps its database up to date by regularly downloading FAST file dumps from the OCLC website.[19] RMIT has not created any training material specific to FAST, but it has produced and distributes to its indexers guidelines on how to use the custom interface to index material, which also cover indexing with FAST.

In addition to creating a custom user interface from which indexers can select and assign FAST headings, the RMIT development team also created a custom algorithm to convert existing LCSH terms into FAST headings. RMIT had approximately 169,000 LCSH terms (80,000 of which were used only once), and the algorithm successfully converted 82% of them to FAST headings.[20] Once the algorithm converted an LCSH term into a FAST term, the RMIT team used the FAST Converter to validate the accuracy of the conversion. One question Ms. Whitby posed was whether OCLC could help RMIT convert the remaining 18% of unconverted LCSH headings into FAST headings. For now, RMIT still uses the old LCSH headings for terms that could not be converted to FAST. The algorithm has subsequently been used to retroactively assign FAST headings to material that had previously been indexed using LCSH.

The success of the new system has prompted RMIT to expand the use of FAST to other areas of indexing. The organization is currently implementing FAST indexing in an update of RMIT's Informit TVNews service,[21] which will allow users to search for and find news programs online. Ms. Whitby thought that FAST's faceted nature is very intuitive for users to understand and could be leveraged by RMIT to improve overall information search and retrieval.
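RMIT's overnight validation job is described only at this high level. As a hedged sketch of what one such round trip might look like, the following Python snippet asks an SRU 1.1 endpoint whether a heading string retrieves any FAST authority records, using the FAST SRU base URL listed later in this case study. The operation, version, and maximumRecords parameters are standard SRU 1.1; the CQL index name used in the query is an assumption and may differ from the server's actual index set.

# Hedged sketch only: asks an SRU 1.1 endpoint whether a heading string
# retrieves any FAST authority records. The base URL comes from this case
# study; the CQL index name ("cql.any") is an assumption, so treat this
# as a template, not a recipe.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

SRU_BASE = "http://tspilot.oclc.org/fast/"
SRW_NS = "{http://www.loc.gov/zing/srw/}"  # standard SRU response namespace

def fast_heading_hits(heading: str) -> int:
    params = {
        "operation": "searchRetrieve",      # standard SRU 1.1 operation
        "version": "1.1",
        "maximumRecords": "1",
        "query": f'cql.any = "{heading}"',  # index name is an assumption
    }
    url = SRU_BASE + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        tree = ET.parse(resp)
    count = tree.getroot().findtext(f"{SRW_NS}numberOfRecords")
    return int(count or 0)

if __name__ == "__main__":
    for h in ["Climatic changes", "Not a real FAST heading"]:
        status = "valid" if fast_heading_hits(h) else "needs review"
        print(f"{h!r}: {status}")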
She is also interested in exploring the possibilities of using FAST's Linked Data for future research and development. Ms. Whitby's remaining questions and comments concerned further improvements to FAST, including the development of better reference guides and vocabulary guidelines. One of the main guides used during the FAST research and development that went into RMIT's indexing system was Chan and O'Neill's FAST: Faceted Application of Subject Terminology (2010). This book was published when RMIT was in the final stage of implementing FAST and was used to guide that effort through completion. Prior to that, the RMIT development team used the following resources and tools to guide its FAST product development work:

Databases and Tools
• FAST (full authority file): http://fast.oclc.org/
• FAST Converter: http://experimental.worldcat.org/fast/fastconverter/
• FAST SRU database: http://tspilot.oclc.org/fast/?operation=explain&version=1.1

Resources
• Chan, Lois Mai. 2005. Library of Congress Subject Headings: Principles and Application. Westport, Connecticut: Libraries Unlimited. http://www.worldcat.org/oclc/57373683.
• O'Neill, Edward T., Lois Mai Chan, Eric Childress, Rebecca Dean, Lynn M. El-Hoshy, and Diane Vizine-Goetz. 2001. "Form Subdivisions: Their Identification and Use in LCSH." Library Resources & Technical Services 45 (October): 187-197. http://www.ala.org/alcts/sites/ala.org.alcts/files/content/resources/lrts/archive/45n4.pdf

Additional resources and tools are available at http://www.oclc.org/research/activities/fast.html

Ms. Whitby also expressed interest in a more up-to-date and detailed set of guidelines, as well as a more developed and connected forum for asking FAST-related questions. She would like to see only FAST headings used in RMIT's subject index database; work remains ongoing until all LCSH terms are converted to FAST.

Figure 9. A list of FAST index terms displayed in Informit

Figure 10. Individual record display in Informit

6. Sterling and Francine Clark Art Institute Library
Interview Date: 12 February 2013
Contact: Penny Baker

Interview Notes:
Ms. Baker first heard about FAST through a presentation given at a conference in 2006. She recalls that the new vocabulary was receiving a large amount of attention and hype because it was being presented as compatible with Dublin Core. In 2012, the library began an IMLS-funded project to digitize and provide cataloging for a portion of the Mary Ann Beinecke Decorative Art Collection. The records for the rare books (approximately 400) in the collection were originally cataloged between 1977 and 1978 and contained very few subject terms. The team's plan was to add new subject terms to each record in order to enhance access and visibility of the materials in the library's CONTENTdm collection (for a sample record display, see figure 11). The team selected FAST as their cataloging vocabulary because of its simplicity and ease of use. Since the majority of the cataloging was being done by library interns with little cataloging experience, there was great interest in using a vocabulary that was easy to use and did not have the complexity typical of LCSH. The existing assigned headings (primarily LCSH) were retained, and FAST headings and AAT (Art and Architecture Thesaurus) terms were added; there was no attempt to convert existing LCSH headings into FAST headings. The team is using the FAST Topical facet as well as the Form facet. The terms are found and selected using assignFAST, which Ms. Baker said was very easy to use and worked well. She noted that assignFAST's cut-and-paste function was similar in concept to OCLC Connexion (OCLC's cataloging service interface) and therefore easy to use and explain.

One of the minor problems that Ms. Baker had in implementing FAST was the lack of any guidelines or training material. The team purchased and used FAST: Faceted Application of Subject Terminology (Chan and O'Neill 2010), but she commented that it was not designed for quick look-up or reference. She was also slightly annoyed by the lack of updates and release notes on the FAST project website. In addition to using the FAST string headings, the team is also including the FAST Linked Data URIs, though they are not packaging them in RDF markup (a sketch of how such a stored URI can be dereferenced follows below).

The IMLS-funded project is set to end in October of 2013, and Ms. Baker said that there are no current plans to use FAST in any upcoming projects. She did say that she would use FAST again in another digital cataloging project or non-MARC-based cataloging project; otherwise, there is no interest in using FAST for everyday cataloging.

Figure 11. A record display in the Sterling and Francine Clark Art Institute Library's CONTENTdm system
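The Clark team stores bare FAST URIs rather than RDF, but anyone holding such a URI can, in principle, dereference it for machine-readable data. The Python sketch below requests RDF/XML for a FAST URI via HTTP content negotiation; this reflects common linked data practice rather than documented FAST service behavior, so both the Accept-header handling and the identifier are assumptions.

# Hedged sketch: dereferences a stored FAST Linked Data URI, asking for a
# machine-readable representation via content negotiation. Whether the
# FAST service honors this exact Accept header is an assumption; the
# identifier below is a placeholder.
import urllib.request

def fetch_fast_rdf(fast_uri: str) -> bytes:
    req = urllib.request.Request(
        fast_uri,
        headers={"Accept": "application/rdf+xml"},  # negotiate for RDF/XML
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()

if __name__ == "__main__":
    uri = "http://id.worldcat.org/fast/1234567"  # placeholder identifier
    data = fetch_fast_rdf(uri)
    print(f"retrieved {len(data)} bytes from {uri}")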
7. Universiteitsbibliotheek Amsterdam (University of Amsterdam Library)
Interview date: 27 November 2012
Contact: Aad van Duijn

Introductory remarks: Late in 2007, inspired by a MARC21 conference in Frankfurt and a visit to the American Library Association (ALA), the University of Amsterdam Library initiated a nationwide transition program from Dutch metadata standards, rules, and regulations to international metadata standards for all Dutch academic libraries. FAST was discussed as early as the initial brainstorming session in Amsterdam. In accordance with the nationwide initiative, the University of Amsterdam Library itself adopted MARC21 and AACR in 2009 while implementing a new ILS (Aleph). In 2012 the Library switched to English as the language of cataloging, working directly in WorldCat using the Connexion client. RDA and LCC will be implemented in 2013, and a decision on the active use of LCSH and/or FAST is expected shortly. The recent FAST pilot therefore forms part of a long-term strategy.

Interview Notes: Mr. Van Duijn first came to know about FAST through a book about LCSH written by Lois Mai Chan (Chan 2005). He found additional information about FAST on the OCLC Research website. His manager first heard about FAST at an ALA conference several years ago. The University of Amsterdam Library used the Gemeenschappelijke Onderwerpsontsluiting (GOO) from the mid-1980s to June of 2012. Since the GOO is no longer maintained, and the Library is switching to English-language cataloging, an alternate English-language thesaurus was needed. FAST was chosen and is currently being used in a pilot program for original English-language cataloging of monographs.[22] Titles that are copy-cataloged already have LCSH assigned to them in most cases. The study is to see whether FAST is adequate in terms of retrievability, visibility, and "clickability" in the Library's local OPAC and its Primo search engine, as well as in WorldCat. FAST was chosen because it has a simple syntax but still retains the semantic richness of LCSH.

Mr. Van Duijn is the only cataloger currently using FAST. He has thus far not experienced any difficulties in incorporating FAST into his cataloging workflow. The Library has not developed any training material, but Mr. Van Duijn has relied on the book FAST: Faceted Application of Subject Terminology by Ed O'Neill and Lois Chan as a reference guide (2010). The features of FAST most appreciated include the extensive number of headings (over 1.6 million) and the extremely simple, easy-to-use syntax. All of the facets are being used for subject cataloging.

The university is currently testing FAST to see if it is suitable for regular use. As a replacement for LCSH in day-to-day cataloging, there are still some major concerns: FAST is not, to their knowledge, used by large research libraries, and FAST is probably not included in third-party authority services (Marcive, LTI, etc.). Additionally, FAST is not an international standard in the way that LCSH de facto is. There is also a quality issue at stake, as LCSH headings offer more context than FAST headings. The University of Amsterdam Library is considering the use of FAST for large digitization projects.
Furthermore, the Library is interested in mapping the Dutch GOO thesaurus to FAST.[23] Of the FAST tools, Van Duijn is most impressed with searchFAST (for looking up terms) and assignFAST (for simple copy-and-paste term assignment). FAST headings can currently be found in cataloged items through the university's local OPAC as well as in Primo and WorldCat.

8. University of Illinois at Chicago
Interview Date: 22 February 2013
Contact: Joan Schuitema

Interview Notes: Ms. Schuitema worked with FAST while she was at the University of Illinois at Chicago (UIC) from 2005 to 2010. She learned about FAST through a presentation given at ALA and from various OCLC news releases. She was a member of the PCC Standing Committee on Standards with Eric Childress and, after consulting him about FAST, decided that it would be a good fit for her project at UIC. UIC began to use FAST in 2007 to catalog local city/suburban government documents from the American Planning Association (APA) collection held by the university. There were tens of thousands of documents in the collection, and prior to this project there had been little attempt to catalog any part of it.

FAST was chosen because it was seen as easier and faster to use than LCSH. Since the materials in the collection were very similar in type, location, and scope, the cataloger could set up templates for a particular group of documents. For example, a set of documents from the same town would all have the same geographic heading, so the only work needed for those documents would be to add specific topical headings. This saved time and effort for catalogers and helped improve uniformity across the collection. The project was intended to be started by a professional cataloger and then transitioned to less experienced library staff members. This plan made the choice of FAST even more appealing, since the long-term goal was to have paraprofessional library staff add headings to documents. Although FAST was very easy to use and implement, there was some push-back from experienced catalogers who wanted to continue to use LCSH. When Ms. Schuitema left UIC in 2010, the project was still ongoing, and FAST was still being used.

9. University of North Dakota
Interview date: 19 November 2012
Contact: Shelby Harken

Interview Notes: The University of North Dakota digital library has two full-time catalogers. Ms. Harken has over 45 years of experience in cataloging; the associate cataloger holds a bachelor's degree in art rather than a degree in library science but has been using FAST since the digital library was created in 2004. Ms. Harken first heard about FAST in 2002. The University of North Dakota began using FAST in 2004 to catalog digital images in CONTENTdm.
The library's CONTENTdm items (see a sample record display in figure 12) use a custom Dublin Core template, and FAST is used in conjunction with the Art & Architecture Thesaurus (AAT) and the Thesaurus for Graphic Materials in the creation of subject metadata for photos, pottery, and special collections (a toy illustration of this DC-plus-FAST pattern follows at the end of this case study). FAST is not currently used to catalog the "Writers' Conference" items, which include readings and writings of up-and-coming authors. The university began using FAST when it initiated the development of its digital collection in 2004, so no vocabulary was used prior to FAST. FAST was chosen primarily because it is easy for non-professionals to use and apply, in part because all of the terms in the FAST vocabulary have valid authority records. The digital collections library has continued its use of FAST because it has been easy to implement and apply to item records.

When the library first began using FAST, the application of subject terms required a fair amount of guesswork. This was primarily because FAST was not yet completed, and the rules and guidelines for specific facets were not yet established. One example is the assignment of bays: FAST assigns bays to the associated body of water, while LCSH assigns them to an associated land mass. The library has not developed any training material for the use or implementation of FAST, but it does use the book FAST: Faceted Application of Subject Terminology by Ed O'Neill and Lois Chan as a reference text (2010).

The feature that the library likes most about FAST is the ability to easily and quickly look up terms online via searchFAST and then apply them in CONTENTdm. The search interface is straightforward, and the faceted system allows for easy distinction between terms. Unlike LCSH, there is no need to string together multiple terms in order to form a valid subject heading. Ms. Harken was familiar with mapFAST and the FAST Linked Data API, but the digital collections library is not currently using either service. In the future, Ms. Harken would like to integrate FAST into the regular cataloging workflow. Once the library gets Primo fully integrated, users will be able to search across the entire University of North Dakota library system for materials and use FAST headings to conduct searches.

Figure 12. A record display in the University of North Dakota's CONTENTdm system
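As a toy illustration of the pattern just described (a Dublin Core record template carrying FAST terms as subjects), the following Python sketch builds a minimal DC description. The record content is invented, and the sketch is not the University of North Dakota's actual CONTENTdm template.

# Illustrative sketch only: builds a tiny Dublin Core description carrying
# FAST terms as dc:subject values, similar in spirit to pairing a DC
# template with FAST. The record content is invented; it is not the
# University of North Dakota's actual CONTENTdm template.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC)

def dc_record(title: str, fast_subjects):
    record = ET.Element("record")
    ET.SubElement(record, f"{{{DC}}}title").text = title
    for term in fast_subjects:
        ET.SubElement(record, f"{{{DC}}}subject").text = term
    return record

if __name__ == "__main__":
    rec = dc_record(
        "Prairie homestead, ca. 1910 (invented example)",
        ["Frontier and pioneer life", "Dwellings"],
    )
    print(ET.tostring(rec, encoding="unicode"))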
10. Biodiversity Heritage Library (BHL)[24]
Interview Date: 11 October 2012
Contacts: Suzanne Pilsk and Bianca Crowley

Interview Notes: The Biodiversity Heritage Library project began in late 2006 and includes 14 member libraries, 2 of which are based in the UK. When the project first started, Ms. Pilsk was an advocate for using FAST, but the opinion was not shared by the rest of the management team. The BHL contains records from all of the member libraries, but the vocabularies in the MARC records vary greatly; they include LCSH, the National Agricultural Library's Agricultural Thesaurus headings, and unique subject headings. In order to add uniformity to the records, the BHL currently uses a very basic algorithm to parse the headings and create a keyword list that can be used for searching. The library did not get very far in pursuing the use of FAST, but it is still interested in the possibility of using FAST to standardize the various types of subject headings that its records contain. Since the library does not have the necessary underlying structure in place, there are no current plans to use the linked data features of FAST, but there is excitement about the end-user features that FAST supports. mapFAST was one such feature that Ms. Pilsk was familiar with and thought was of interest; Ms. Crowley was not familiar with mapFAST.

11. Minnesota State University, Mankato[25]
Interview Date: 12 September 2012
Contact: Robert Bothmann

Interview Notes: Mr. Bothmann first heard about FAST when he met Ed O'Neill at an ALA conference in 2002. The university library is not currently using FAST, but FAST headings do appear in its catalog, presumably because of the presence of records created by libraries that are using FAST. The library currently uses LCSH as its cataloging vocabulary. Mr. Bothmann was very interested in using the FAST vocabulary as a means of geographic location and searching. He mentioned that this feature would help regional researchers by showing them where in the state prior research had been conducted. It would also help Minnesota State University students decide on research proposals for their capstone projects: his idea was for students to use the coordinates associated with FAST headings to identify locations that have already been the subject of research papers, both to develop unique proposals and to find relevant research to support their projects. He noted that it is very difficult to use MARC records to track location, but the library has experimented with using FileMaker Pro to manually add latitude and longitude to individual records, and it worked well.

The cataloging staff at the Minnesota State University, Mankato library consists of two expert catalogers, and the catalog is open to external users. Mr. Bothmann was very excited about the possibility of using the FAST geographic facet in the library's records. After exploring mapFAST, he asked why he could not find his thesis in WorldCat when using mapFAST to search for his home town (which was the topic of his thesis). The problem was that the subject heading he used for the thesis was an LCSH term, and mapFAST was not able to parse the heading in WorldCat.
This will no longer be an issue once FAST is implemented in WorldCat.

12. University of Western Ontario (Ph.D. Research Project)
Interview Date: 17 January 2013
Contact: Olha Buchel, Ph.D.

Interview Notes: Dr. Buchel initially heard about FAST through a presentation that Eric Childress gave at a conference in Canada. She was interested in using the geographic headings to assist in her doctoral research into how metadata records could be used by services to produce end-user data visualization tools. Her project[26] focused specifically on mapping visualizations for Ukrainian bibliographic holdings. She did not end up implementing FAST in her research, primarily because not many FAST headings for Ukraine actually had latitude and longitude coordinates. Additionally, there were almost no bibliographic records that had FAST headings associated with them. She was also disappointed that there was no option for users to submit coordinates for addition to FAST geographic headings. With the continual improvement of FAST geographic headings, Dr. Buchel is considering using FAST in future geo-spatial research and possibly in further development of her initial Ph.D. research.

13. People of the Founding Era
Interview Date: 15 January 2013
Contact: Susan Perdue

Interview notes: People of the Founding Era had been interested in using FAST for tagging historical documents. Ms. Perdue first heard about FAST from a co-worker who had previously worked as a programmer for the Theodore Roosevelt Center Digital Library (where FAST was being used). The plan to use FAST was never initiated due to the complexity of the initial project. Ms. Perdue was very interested in using FAST because the people who would have been tagging documents were not professional catalogers and had no experience using or assigning the complex strings that LCSH would have required. In their situation, the relative simplicity of FAST would have been a huge advantage, not only for accuracy but, more importantly, for the speed at which workers could tag documents. Another advantage would be the retrieval of relevant historical documents via controlled subject headings for historical events. Ms. Perdue was most interested in using FAST's event headings to connect relatively vague or obscure historical documents with broader and more widely known historical events. Previously, the collection was tagged using occupational headings, which could be problematic in terms of finding desired documents and related material. There was also some interest in using the names facet, but Ms. Perdue thought that it would have been somewhat difficult to match FAST names with the relatively obscure individuals who authored the historical documents in the collection. While there are no current plans to use FAST in any set projects,
Ms. Perdue is still very interested in using the vocabulary.

14. l'Université du Québec à Montréal (UQAM) [University of Quebec in Montreal]
Interview date: 10 October 2012
Contact: Benoit Bilodeau

Interview Notes: Mr. Bilodeau first heard about FAST at the 2008 10th International ISKO Conference in Montreal, where he heard Eric Childress give a presentation on the development of FAST. He was surprised at how similar the faceted vocabulary was to a project that he and his colleagues had been working on at the University of Quebec since 1994 (see figure 13). His project, called le Répertoire des Autorités-Sujet de l'UQAM (RASUQAM),[27] developed out of the need to create a more robust thesaurus to replace the Uniterm automated system that the library had been using since 1970. The RASUQAM system was first used in November 1994. The thesaurus currently has 52,000 terms and is still being developed; the University of Quebec in Montreal is its only user. French is the indexing language used in the system, but English can be used as a searching language. Mr. Bilodeau decided to use facets because "they come naturally to users." While the system does not employ linked data, it does include reference links to Wikipedia.org for all applicable headings; this information is coded in the MARC 670 $u field. Five years ago, the team began to map RASUQAM to both LCSH and the Répertoire de vedettes-matière (RVM).[28]

Figure 13. Individual record illustrating the resemblance between the University of Quebec's RASUQAM system and FAST

15. University of Texas School of Public Health at Houston[29]
Interview Date: 13 September 2012
Contact: Richard L. Guinn

Interview Notes: Mr. Guinn first heard about FAST in 1998/1999 and contacted OCLC about the possibility of implementing it in the library. Local issues prevented the library's adoption of FAST. His library has a large local collection of material that uses both MeSH and LC subject vocabularies. The library uses Ex Libris and has a lot of local records, and Mr. Guinn was hoping to use FAST as a means to standardize all of the headings. On first hearing about FAST, he thought that it would be very well adapted to the task of standardizing local library records: in contrast to LCSH, it appeared that FAST had comprehensively authorized terms and a much simpler syntax to use and implement. Mr. Guinn's cataloging staff consists of himself and another part-time cataloger.
He described their cataloging skills as "well experienced," since together they have about 20 years of cataloging experience. Mr. Guinn had not heard any news about FAST since 2008, when he last contacted OCLC to express interest in the FAST website.

16. World Maritime University
Interview Date: 4 March 2013
Contact: Chris Hoebeke

Interview Notes: Mr. Hoebeke was a student of Dr. Lois Chan and became familiar with the FAST vocabulary through FAST: Faceted Application of Subject Terminology, authored by Chan and O'Neill (2010). Since the interview, he has converted fifty thousand bibliographic records and loaded them on a test instance of the library's Koha ILS. Mr. Hoebeke mentioned that one of the reasons he was attracted to FAST was that the faceted outline of the vocabulary makes it easy to use, both as a cataloger and as an end-user conducting searches. For the purposes of his library, traditional LCSH is too complex for faceting. One aspect of LCSH that Mr. Hoebeke does like is that the vocabulary lends itself well to browsing by subject and subject hierarchies, whereas facets serve only to limit or narrow an existing result set (the sketch below illustrates this narrowing). He postulated that using LCSH in combination with FAST could make for a very powerful and easy-to-use search system. If the library were to use FAST in any future projects, Mr. Hoebeke would be particularly interested in the Geographic and Topical headings, though he thought that the Uniform Titles might also be of interest. In considering FAST's approach to geographic headings, Mr. Hoebeke disagreed with some of the second-level body-of-water headings, suggesting that they should probably be valid first-level headings. In particular, he questioned whether anyone living in Europe would think of the Baltic as a subordinate region of the Atlantic (Atlantic Ocean--Baltic Sea), and he thought it would be more useful as a first-level geographic heading.
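Mr. Hoebeke's distinction between hierarchy browsing and facet filtering is easy to see in code. The Python sketch below narrows a small in-memory result set by FAST-style facet values; the records and facet assignments are invented for illustration.

# Illustrative sketch: facets narrow an existing result set, in contrast
# to browsing a subject hierarchy. Records and facet values are invented.
SAMPLE_RESULTS = [
    {"title": "Baltic shipping lanes", "Geographic": "Baltic Sea", "Topical": "Shipping"},
    {"title": "Port management basics", "Geographic": "Atlantic Ocean", "Topical": "Harbors"},
    {"title": "Baltic harbor pilots", "Geographic": "Baltic Sea", "Topical": "Harbors"},
]

def narrow(results, facet, value):
    """Keep only the records whose facet matches the chosen value."""
    return [r for r in results if r.get(facet) == value]

if __name__ == "__main__":
    hits = narrow(SAMPLE_RESULTS, "Geographic", "Baltic Sea")
    hits = narrow(hits, "Topical", "Harbors")  # facets compose by intersection
    for r in hits:
        print(r["title"])

Each successive facet selection intersects with the current result set, which is why facets can only narrow results, never broaden them the way hierarchy browsing can.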
Notes

1 FAST project page: http://www.oclc.org/research/activities/fast.html
2 CORC: http://www.oclc.org/research/news/1998/09-28.html
3 FAST tools and data: http://fast.oclc.org/searchfast/
4 As known to OCLC staff
5 Australian Public Affairs Information Service [thesaurus]: http://www.nla.gov.au/apais/
6 FAST implemented in 2012 (see RMIT 2011)
7 LCNAF is used for all personal names: http://id.loc.gov/authorities/names.html
8 As known to OCLC staff
9 JEL Classification System: http://www.aeaweb.org/jel/jel_class_system.php
10 AMS Mathematics Subject Classification: http://www.ams.org/mathscinet/msc/msc2010.html
11 BRII project: http://brii.medsci.ox.ac.uk/
12 Databib project: http://databib.org
13 Index New Zealand: http://innz.natlib.govt.nz
14 Ngā Ūpoko Tukutuku / Māori Subject Headings: http://mshupoko.natlib.govt.nz/mshupoko/index.htm
15 "RMIT Publishing is a business unit of RMIT Training Pty Ltd, a wholly owned subsidiary of RMIT University": http://www.informit.com.au
16 Note: RMIT content is indexed in WorldCat Local
17 Exception is Chronological headings (FAST facet 148): "Authority records for chronological headings are established only when needed for references or linkages" (Chan and O'Neill 2010, 99)
18 See (RMIT Publishing 2012)
19 Available at: http://tspilot.oclc.org/fast/?operation=explain&version=1.1
20 4% of the LCSH terms were set up as LC-to-FAST-mapping test cases to verify the conversion.
21 Informit TVNews: http://www.informit.com.au/tvnews.html
22 For an example of a WorldCat record created by UvA using FAST, see http://www.worldcat.org/oclc/706651462
23 Note: OCLC Research has a project underway to test algorithms to map GOO to FAST.
24 As a follow-up, OCLC sent both Pilsk and Crowley links to the most recent version of the FAST website as well as a searchFAST PowerPoint (O'Neill, Bennett and Kammerer 2010) that Ed O'Neill presented at the IFLA Satellite Post-Conference in 2010.
25 As a follow-up to the interview, Mr. Mixter sent Mr. Bothman links to mapFAST (OCLC Research 2012) and a searchFAST PowerPoint (O'Neill, Bennett and Kammerer 2010) that Ed O'Neill presented at the IFLA Satellite Post-Conference in 2010.
26 Collection About Local History of Ukraine (DK508): http://abuchel.apmaths.uwo.ca/~obuchel/maps/VICOLEX.php
27 le Répertoire des Autorités-Sujet de l'UQAM (RASUQAM): http://www.bibliotheques.uqam.ca/services-techniques/RASUQAM/presentation
28 Répertoire de vedettes-matière (RVM): https://rvmweb.bibl.ulaval.ca/en/a-propos
29 As a follow-up to the interview, Mr. Mixter sent Mr. Guinn the URL for the searchFAST website (see note 3 above).
References

ALCTS (Association for Library Collections & Technical Services). 1999. Subject Data in the Metadata Record: Recommendations and Rationale: A Report from the ALCTS/CCS/SAC/Subcommittee on Metadata and Subject Analysis. July. http://www.ala.org/alcts/resources/org/cat/subjectdata_record.

Chan, Lois Mai. 2005. Library of Congress Subject Headings: Principles and Application. Westport, Connecticut: Libraries Unlimited. http://www.worldcat.org/oclc/57373683.

Chan, Lois Mai, and Edward T. O'Neill. 2010. FAST: Faceted Application of Subject Terminology: Principles and Applications. Santa Barbara, California: Libraries Unlimited. http://www.worldcat.org/oclc/624025531.

OCLC Research. 2012. "mapFAST." Last updated 27 November. http://www.oclc.org/research/activities/mapfast.html.

O'Neill, Edward T., Lois Mai Chan, Eric Childress, Rebecca Dean, Lynn M. El-Hoshy, and Diane Vizine-Goetz. 2001. "Form Subdivisions: Their Identification and Use in LCSH." Library Resources & Technical Services 45 (October): 187-197. http://www.ala.org/alcts/sites/ala.org.alcts/files/content/resources/lrts/archive/45n4.pdf.

O'Neill, Edward T., Rick Bennett, and Kerre Kammerer. 2010. "Using Authorities to Improve Subject Searches." Session 1: User Needs and Subject Access Design in the Digital Environment. Presented at the IFLA Satellite Post-Conference Beyond Libraries—Subject Metadata in the Digital Environment and Semantic Web. 17 August. Tallinn, Estonia. http://www.nlib.ee/html/yritus/ifla_jarel/papers/1-1_ONeill.docx.

RMIT Publishing. 2011. "New! FAST Indexing." informit bulletin (blog). October-December: 2. http://www.informit.com.au/downloads/bulletin_issue3_2011.pdf.

RMIT Publishing. 2012. "Going FAST: Informit's Conversion from LCSH to FAST." informit bulletin (blog). 17 December. http://informitbulletin.com/2012/12/17/going-fast-informits-conversion-from-lcsh-to-fast/.
work_uq62lvcyjvdqxo63hhplarvp5e ----

University of Copenhagen
Metric Assessments of Books As Families of Works
Zuccala, Alesia Ann; Breum, Mads; Bruun, Kasper; Wunsch, Bernd Thomas
Published in: Journal of the Association for Information Science and Technology
DOI: 10.1002/asi.23921
Publication date: 2017
Citation for published version (APA): Zuccala, A. A., Breum, M., Bruun, K., & Wunsch, B. T. (2017). Metric Assessments of Books As Families of Works. Journal of the Association for Information Science and Technology, 69(1). https://doi.org/10.1002/asi.23921
Download date: 06. Apr. 2021

Preprint version accepted in the Journal of the Association for Information Science (June 7, 2017).

Metric Assessments of Books as Families of Works

Alesia Zuccala[1], Mads Breum, Kasper Bruun, and Bernd T. Wunsch
[1] a.zuccala@hum.ku.dk, Royal School of Library and Information Science, University of Copenhagen, Njalsgade 76, 2300 København S, Denmark

Abstract

We describe the intellectual and physical properties of books as manifestations, expressions and works and assess the current indexing and metadata structure of monographs in the Book Citation Index (BKCI). Our focus is on the interrelationship of these properties in light of the Functional Requirements for Bibliographic Records (FRBR). Data pertaining to monographs were collected from the Danish PURE repository system as well as the BKCI (2005-2015) via their International Standard Book Numbers (ISBNs). Each ISBN was then matched to the same ISBN and family-related ISBNs cataloged in two additional databases: OCLC-WorldCat and Goodreads. With the retrieval of all family-related ISBNs, we were able to determine the number of monograph expressions present in the BKCI and their collective relationship to one work. Our results show that the majority of missing expressions from the BKCI are emblematic (i.e., first editions of monographs) and that both the indexing and metadata structure of this commercial database could significantly improve with the introduction of distinct expression IDs (i.e., for every distinct edition) and unifying work-related IDs. This improved metadata structure would support the collection of more accurate publication and citation counts for monographs and has implications for developing new indicators based on bibliographic levels.

1. Introduction

In the past, bibliographic data and citation data pertaining to books were inaccessible, if not difficult to retrieve. Now, as digital resources have improved, so has the priority to advance book-related metrics. This is partly due to the introduction of Thomson Reuters' Book Citation Index (BKCI) (Adams & Testa, 2011)[1] and the addition of books to Elsevier's Scopus.
These commercial databases, however, are not the 'be-all and end-all' for the discerning bibliometrician. Recent assessments of the BKCI point to numerous indexing problems, which can lead to flawed evaluations (Gorraiz et al., 2013; Leydesdorff & Felt, 2013; Torres-Salinas et al., 2014). Still, researchers continue to use the BKCI and/or Scopus by finding ways to extract book citations from journal articles (Hammarfelt, 2011; Zuccala et al., 2014). Some have chosen instead to work with alternative resources, like Google Books (Kousha & Thelwall, 2009), Google Scholar (Kousha & Thelwall, 2011) and OCLC-WorldCat (Torres-Salinas & Moed, 2009; White et al., 2009). Concerted efforts are even being made to compare data that has been retrieved from multiple databases (Kousha et al., 2016; Zuccala & Cornacchia, 2016; Zuccala et al., 2015a; Zuccala & White, 2015b).

[1] At the time this research was carried out, the Book Citation Index was owned by Thomson Reuters. It is now part of the parent company Clarivate Analytics.

The metric community is making rapid progress, but this is related primarily to the exploration of new data sources. The BKCI indexing problem therefore persists. One solution is to avoid studies based on citations and work with library holding counts instead (Torres-Salinas & Moed, 2009). With this option, books cataloged in various international libraries (e.g., the OCLC-WorldCat union catalog) may be evaluated according to "their perceived impacts on culture and the life of the mind" (White et al., 2009, p. 1086). Thus far, the libcitation has generally been accepted, though researchers are reluctant to separate libcitations from citation counts, suggesting that both indicators might be used in a complementary manner (Linmans, 2010; Zuccala & White, 2015b). To a large extent, the citation is inexorable: it is the principal indicator upon which the BKCI was founded, and it remains pertinent to the use of other databases as well (e.g., Scopus and Google Scholar). Another solution for improving book-related metrics is to take the problem of book indexing more seriously and put more emphasis on index-related improvements. This approach does not rest entirely with the bibliometrician's expertise, yet most studies that rely on indexes/book catalogs still point to the same issue: regardless of where and how bibliographic and citation data are collected, it is essential to recognize that books often belong to bibliographic families.

Since bibliographic families may be examined both theoretically and empirically, the aim of this study is to do both. First, we will examine and explain several interrelated concepts linked to a family-oriented entity-relationship model, known as the Functional Requirements for Bibliographic Records (FRBR). We have chosen to use this model because it can effectively illustrate the extent to which books, as complex entities, are not always indexed accurately in the BKCI using appropriate metadata. In the second, empirical part of this study, we will present data collected specifically from the Book Citation Index (BKCI), OCLC-WorldCat, and Goodreads, and use these data to demonstrate why a robust model is necessary in order to improve upon the accuracy of book-oriented metrics (i.e., citation counting). The empirical aspect of our research is based on the following question: Do books currently indexed in the Book Citation Index (BKCI) have adequate metadata and data designed to reflect inherent familial components and relationships?
2. Background to the Problem

2.1 Bibliographic entities and their properties

Counts of books as publications and/or counts of their received citations may be compounded or not, depending on how we recognize their intellectual and physical properties. According to Lubetzky (1953), all bibliographic entities possess at least two: an intellectual property, which we refer to as the work, and a physical property, which is the container for the work. It is worth noting that when Lubetzky (1953) first established these definitions, digital media had not yet been introduced. Attempts have also been made since then to elaborate upon the term work; hence the general consensus today is that what we observe from a work is the synthesis of its ideational and semantic content (Smiraglia, 2001). If we examine a journal article, we are likely to observe familial components based on a one-to-one, or a one-to-many relationship. An article's intellectual property begins as a piece of work and its physical property can manifest as an official print publication and/or a digital publication with a Digital Object Identifier (DOI). The purpose of the DOI is to provide a persistent link to both the print and digital object. Circa Lubetzky's time period (1950s) there would have been little confusion about what is counted when a journal article was accepted for publication, printed and indexed. Today, with print and digital publishing, it can be interesting to examine when an article is officially published – i.e., whether it is available online with a DOI, or printed and indexed at a later date (Haustein et al., 2015).

A monograph is similar to a journal article in that it typically appears first as one intellectual contribution, or work. Like a journal article, it may be published in print or digital form. Unlike the journal article, the monograph can be re-itemized as a new edition. Chi et al. (2015) initially reflect on this problem when they note that the BKCI sometimes includes different editions of the same work:

    The BKCI distinguishes different editions of a book for some of its source items and indexes one or more editions of a work. For example, "CRIME SCENE TO COURT: THE ESSENTIALS OF FORENSIC SCIENCE, SECOND EDITION (2004)" and "CRIME SCENE TO COURT THE ESSENTIALS OF FORENSIC SCIENCE, 3RD EDITION (2010)" coexist in the database. Therefore, the citation links provided by the BKCI to the different editions of a book are edition sensitive and may need further judgment or weight for an additional evaluation process.

While it is clear that these items have been published as distinct editions, Chi et al.'s (2015) use of the term work needs further attention. A basic Google search for the second edition and third edition of "CRIME SCENE TO COURT" confirms that both have been published under the same title, with the same editor (WHITE, PC), but in different publication years. Moreover, a closer examination indicates that not only do they possess unique International Standard Book Numbers [i.e., ISBN: 978-1-84755-065-1 for the second edition and ISBN: 978-1-84755-882-4 for the third edition], they also do not share the same content. This is evidenced by the fact that each volume is made up of different chapter titles and different authors corresponding to each chapter. Chi et al. (2015) have the impression that both editions of "CRIME SCENE TO COURT" are the same work, but we show that this may not be the case.
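The distinction matters mechanically as well: each ISBN-13 carries its own check digit, so distinct editions are verifiably distinct identifiers at the manifestation level. The short Python sketch below is ours, for illustration only (it is not part of this study's method); it validates the two ISBNs cited above using the standard checksum rule that the thirteen digits, weighted alternately by 1 and 3, must sum to a multiple of 10.

    def isbn13_is_valid(isbn: str) -> bool:
        # Keep digits only, e.g. "978-1-84755-065-1" -> 9781847550651.
        digits = [int(c) for c in isbn if c.isdigit()]
        if len(digits) != 13:
            return False
        # Alternating 1/3 weights; a valid ISBN-13 sums to a multiple of 10.
        return sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits)) % 10 == 0

    # The two editions of "Crime Scene to Court" carry distinct, valid ISBNs:
    assert isbn13_is_valid("978-1-84755-065-1")  # second edition
    assert isbn13_is_valid("978-1-84755-882-4")  # third edition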
2.2 The structure of bibliographic 'families'

In the Functional Requirements for Bibliographic Records (FRBR) the term work is an abstract entity, which serves as the focal point for a full conceptual model of the bibliographic universe (Tillett, 2005). The FRBR was first developed by a study group affiliated with the International Federation of Library Associations and Institutions (IFLA Study Group on the Functional Requirements for Bibliographic Records, 1998), but Tillett (2005), one of the original members of this group, explains that it was written to serve as "a generalized view… independent of any cataloguing code or implementation" (p. 24). Now it is often recommended for the restructuring of catalogs:

    the number of records we make is a decision made up front by the cataloger based on local policies reflecting local user needs. We may choose to catalog at various levels: the collection of works (FRBR calls this an aggregation), an individual work, or a component of a work. At the description level, we may include a description of all the parts and should provide access to each component. At the component level, we should provide a link to relate to the larger 'whole.' (Tillett, 2005, p. 27)

Long before the FRBR was introduced, O'Neill and Vizine-Goetz (1989) were the first to examine the term work as part of an entity-relationship model of the bibliographic family. In this early model, the top concept of work refers abstractly to a common origin and content. Subsequent concepts – i.e., text, edition, printing, and book – are used to gradually represent a more narrow understanding of a work, down to the individual printed book on the shelf of a library. Book is the only term for a physical object and thus the only one that is not abstract. O'Neill and Vizine-Goetz (1989) explain also how a work and its physical object are linked on the basis of a one-to-many relationship: each book is affiliated with one work, but one work can have multiple books with which it is affiliated. Tillett (2005) agrees with an abstract notion of work, but refers to a text and its specific arrangement of sentences, paragraphs, chapters, etc. as an expression. The expression is then manifested by a specific version, leading to one example, which she calls an item (p. 25). These four concepts – i.e., work, expression, manifestation, and item – belong to a family tree with inherent relationships. It is a bibliographic family because "all texts of a work are derived from a single progenitor" (Smiraglia, 2001, p. 75). At the level of the original work there may be expressed equivalent works, such as copies (e.g., hardcopy or paperback) or reprints. There might also be expressed derivatives, which can include multiple editions, revisions, translations, etc. At the descriptive level, the family tree could also include reviews, commentaries, annotated editions or critical evaluations of the original work (Tillett, 2001). Figure 1 illustrates what the FRBR entity-relationship model might look like as a guide to evaluating the current structure of the BKCI. This is an adapted version of Tillett's (2001) figure, which was printed first in Relationships in the Organization of Knowledge and reprinted later in What is FRBR? A conceptual model for the bibliographic universe (Tillett, 2005).
Below Figure 1, we present a list of concepts, which have also been adapted from Tillett (2001, 2005). Our definitions do not deviate too much from the classical definitions, but we include references to other texts in some cases for further clarification.

Figure 1. Modified model of bibliographic families for a scholarly work (Tillett, 2001; 2005).

1. First Edition: the emblematic or original version of a work as an intellectual contribution
2. Revised Edition: an edition that includes small corrections made to the original work
3. Literal Translation: a direct translation of the original language text into another specified target language text (e.g., Danish to English) whereby the intellectual domain and the historical-temporal context of the original work is recognized and maintained (Pellizzi, 2015).
4. Augmented Edition: a new edition of a work that is based on an earlier work with augmented or new intellectual content
5. Free Translation: an approach to translating a text which intentionally recognizes the cultural gap between the "intellectual world of the author and that of the translator" (Pellizzi, 2015, p. 10); it modifies parts of the original language text so that it appeals differently to the audience of the target language text.
6. Edited Series: by default every new expression of an edited series with new intellectual content will become a new work, even if the title of the edited series remains the same.
7. Review: a focused piece of work written by a new author to describe and review the intellectual content of the original, emblematic work or one of its expressions (e.g., a book review)
8. Criticism: an extensive piece of work written by a new author which critically evaluates the intellectual content of the original, emblematic work or one of its expressions in connection with other similar works (e.g., literary criticism)
9. Commentary: a work that explains and annotates an original work (e.g., a commentary on one or more expressions of the Bible).

2.3. The monograph as a complex 'work'

At present little is known about why certain books are included in the BKCI. Most books that have been indexed have been published in 2005 or after, and there is currently a book-by-book editorial selection process in place at Thomson Reuters (Testa, 2012). One of the goals of Thomson Reuters' development team is to include books with a relatively high citation impact, yet it will always be unclear which particular item was originally used by the citing person(s). The distinct item that was used, however, is not important, as long as it has been accurately referenced. This means that all manifestation details for an indexed item need to be accurate (i.e., author name, title, ISBN, publication date) so that a decision can be made as to which expressions are equivalent and which shall be characterized as new work. This is one of the key recommendations of FRBR, and thus far it has already had some impact on other bibliographic structures like OCLC-WorldCat (see Bennett et al., 2003).

According to Bennett et al. (2003), "the majority of benefits associated with applying the FRBR" may be "obtained by concentrating on a relatively small number of complex works" (p. 45). Figure 2, below, illustrates what is meant by the term "complex work". The example that we use is a monograph that was first written and published in Dutch, titled De Vergeten Wetenschappen: Een Geschiedenis van de Humaniora (Bod, 2010).
De Vergeten Wetenschappen has been reprinted in its first edition language, and has also been 'translated' to Polish and Ukrainian (i.e., as two new expressions of the same 'work'). Note that the 2013 Polish and the 2016 Ukrainian expressions are linked back to the original 'work'; thus they were not (according to international catalogers linked to OCLC-WorldCat) recognized as new works. In OCLC-WorldCat they have been recorded as direct or 'literal translations' of the Dutch progenitor, but the latest English expression has not (see Figure 2). The term 'literal translation' generally means that a source language text is rendered to a target language text while retaining similar meaning and structure of content (Bassnett, 2002). In this sense, the translation process seems relatively straightforward; however, several factors can influence the exercise. For instance, it may become more complex if there is a deeper focus on the cultural or historical background of the source language text and its author, as well as the target language text and translator (e.g., a free translation). With some freely translated texts, changes are often rooted in the historical period in which the translation was carried out, including the conditions surrounding the translation and the intellectual world of the translator herself. With other translated texts, more emphasis is placed on the reception and influence of the translation on the target language and culture. In simpler terms, an author may have a work re-written by a translator, or she may translate her own work, but the translated work can only be recognized later as a new work if it includes significant changes.

Figure 2. Model of a complex work with expressions and manifestations of a new work

Note from Figure 2 that Rens Bod has translated and published an English derivative of De Vergeten Wetenschappen, titled A New History of the Humanities: The Search for Principles and Patterns from Antiquity to the Present (Bod, 2013). Again, this English expression, unlike the Dutch-to-Polish expression and Dutch-to-Ukrainian expression, has been identified (in WorldCat) as a new work. In an e-mail exchange with the author, we received the following information:

    I would consider the English [expression] as a kind of improved edition of the Dutch book. When the Dutch work was translated into English, I sent it to OUP [Oxford University Press] and incorporated the comments by the 5 OUP reviewers into the English version; I also had the book read by an arabist, indologist and a sinologist, and incorporated their comments as well. And, I added a few additional humanists to the book (e.g. Mabillon)[2] as well as some additional concepts, such as the Chinese theory of parallel perspective (R. Bod, personal communication, June 16, 2016).

In the future, Rens Bod notes that there will be new expressions of his work in "Chinese (just finished), Italian, Armenian and Korean versions…translated directly from the English version" (R. Bod, personal communication, June 16, 2016). Clearly one work has the potential to possess complex family relationships, and in the case of De Vergeten Wetenschappen, we see that the bibliographic family is still growing. With many more works like this, there are multiple implications for the structural design of the BKCI.

[2] Jean Mabillon was a French Benedictine monk and scholar, and Bod (2013) has referred to his De re diplomatica ('On the Science of Diplomatics') in A New History of the Humanities.
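To make the layering in Figure 2 concrete, the sketch below models the De Vergeten Wetenschappen family with simple Python data structures. This is a minimal illustration under our own naming, not a FRBR schema or any catalog's data model, and the ISBN strings are placeholders rather than the books' real identifiers.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Manifestation:
        isbn: str   # one physical or digital embodiment (print, paperback, e-book, ...)
        form: str

    @dataclass
    class Expression:
        language: str
        year: int
        manifestations: List[Manifestation]

    @dataclass
    class Work:
        title: str
        expressions: List[Expression] = field(default_factory=list)

    # Dutch progenitor with its literal Polish and Ukrainian translations under one work.
    vergeten = Work("De Vergeten Wetenschappen", [
        Expression("Dutch", 2010, [Manifestation("ISBN-A", "paperback")]),
        Expression("Polish", 2013, [Manifestation("ISBN-B", "paperback")]),
        Expression("Ukrainian", 2016, [Manifestation("ISBN-C", "paperback")]),
    ])

    # The augmented English translation is catalogued as a new work in its own right.
    new_history = Work("A New History of the Humanities", [
        Expression("English", 2013, [Manifestation("ISBN-D", "hardcover")]),
    ])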
Currently "it is possible to distinguish bibliometrically between monographs and edited volumes among the books [in the BKCI]", but according to Leydesdorff and Felt (2012) "monographs may be underrated in terms of citation impact or overrated because individual chapters are counted separately" (p. 1). This problem of over-counting or under-counting pertains solely to the metadata used for indexing monographs in the BKCI and their components as familial entities.

3. Research

3.1 Databases

Our research focuses primarily on the BKCI, but in order to assess its reliability as a data source for bibliometric analyses, we have chosen to compare it to three other catalogs: 1) the Danish PURE repository system for scholarly research outputs, 2) the OCLC-WorldCat, and 3) Goodreads. Each database/catalog was selected for a specific reason.

The Danish PURE repository system is a collection of repositories corresponding to eight universities across Denmark. Each university has created its own PURE database in order to register and maintain records of all scholarly research outputs. Due to this system's nation-wide adoption, it is often used in conjunction with the performance-based evaluation system in Denmark. As of 2009, all Danish scholars across the country received a mandate to register their scholarly publications in PURE. Each year, performance points are then calculated on the basis of these PURE records and used to determine the amount of leftover government funding to be distributed across departments or research centers (i.e., 25% of the new basic funds, which are 5% of the total basic funding). Monographs are included, and each registration earns a department or research center 5.00 points (level 1 authority publisher) or 8.00 points (level 2 authority publisher) (Giménez-Toledo et al., 2016). The data retrieved for our study was a set of monographs that had been registered in eight university PURE repositories between the years of 2005-2015. Our main reason for working with these PURE repositories was to examine their current indexing quality, and to determine the extent to which books published by Danish scholars have been indexed also in the new BKCI.

The OCLC-WorldCat and Goodreads were also chosen for this study because both catalogs comply to some degree with the FRBR standard. The BKCI does not; hence by matching ISBNs and extracting all related data from these two extra databases, it is possible to assess the extent to which the BKCI is an accurate index of monographs as family-based entities.

3.2. Data retrieval and data curation method

The procedural list below explains how all monograph data for this study were collected (over six months in 2015-2016), integrated and 'curated' into a new database for all research queries:

1. ISBNs from the BKCI (2005-2015) were retrieved and added to a new SQL database. This included monographs only, which had been indexed with the following metadata tags: a) Pubtype=book/books, b) Doctype=book, c) Norm_doctype=book, and d) Role=author (n=16,392).
2. ISBNs from the eight Danish PURE repositories were also retrieved, based on the following indexing tags: a) doc_type=db, b) doc_level=sci, and c) person role=NOT editor, and added to the SQL database (n=8,604).
3. All duplicate ISBNs from both the BKCI and PURE were removed and the two datasets were merged to produce a total of n=24,961 ISBNs (note: only 35 records between the two original lists were duplicates).
4. With OCLC-WorldCat and Goodreads, we used an Application Programming Interface (API) to retrieve additional metadata (e.g., book title, author name, publisher, publication year) matched to our initial list of ISBNs (n=24,961), including all extra related ISBNs (i.e., additional manifestations of the same work).
5. Our final research dataset in the SQL database included a total of n=56,445 unique ISBNs.
6. All ISBN-13 numbers were trimmed to create a new numerical ISBN for retrieval purposes (e.g., 978-92-95055-02-5 was reduced to 929505502) and to minimize errors in SQL queries.
7. An OCLC-Work-ID was created as a distinct metadata field for a work and all of its related ISBNs for the OCLC-WorldCat data.
8. We also created a Goodreads-Work-ID for a work and all of its related ISBNs from Goodreads.
9. In all cases where there was a relational overlap of the same work in Goodreads and/or OCLC-WorldCat, we created a Final-Work-ID. This enabled us to identify the most comprehensive relational overview of one work. If a particular ISBN was not identified at all in Goodreads or OCLC-WorldCat, the individual item was given its own Final-Work-ID.
10. A final Expression-ID was created for each work based on the following rules. First, if a manifestation (of a book) was published in the same year and in the same language, then it was identified as being the same expression. If a manifestation (of a book) was published in the same year but in a different language, then it was categorized as being a different expression. The last part of the Expression-ID was designed to show the number of manifestations related to one expression. (Steps 9 and 10 are sketched in code after this list.)
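The following sketch is our reading of steps 9 and 10, not the study's actual SQL: Final-Work-IDs fall out of merging any catalog work sets that share an ISBN (a small union-find over ISBNs), and Expression-IDs fall out of grouping a work's manifestations by language and publication year. All field and function names are ours.

    from collections import defaultdict

    def merge_work_sets(work_sets):
        """work_sets: ISBN sets harvested from OCLC-WorldCat and Goodreads.
        Any two sets sharing at least one ISBN are merged, giving the most
        comprehensive view of one work (step 9)."""
        parent = {}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for s in work_sets:
            isbns = list(s)
            for i in isbns:
                parent.setdefault(i, i)
            for i in isbns[1:]:
                ra, rb = find(isbns[0]), find(i)
                if ra != rb:
                    parent[rb] = ra

        merged = defaultdict(set)
        for i in parent:
            merged[find(i)].add(i)
        # An ISBN found in neither catalog simply forms a singleton set here,
        # i.e., it receives its own Final-Work-ID.
        return list(merged.values())

    def assign_expression_ids(records):
        """records: dicts with keys work_id, isbn, language, year.
        Manifestations of one work sharing language AND year collapse into one
        expression; the suffix records the manifestation count (step 10)."""
        groups = defaultdict(list)
        for r in records:
            groups[(r["work_id"], r["language"], r["year"])].append(r["isbn"])
        ids, n_in_work = {}, defaultdict(int)
        for (work_id, language, year), isbns in sorted(groups.items()):
            n_in_work[work_id] += 1
            label = f"{work_id}-E{n_in_work[work_id]}(x{len(isbns)})"
            for isbn in isbns:
                ids[isbn] = label
        return ids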
Goodreads, on the other hand, supported a much better understanding of the relationship between the ISBNs because, unlike OCLC-WorldCat, all ISBNs were united under one work-related metadata tag. 3.3. Results Table 2 indicates the results of our data crawling and matching procedure beginning with two original datasets – 1) an ISBN list from the BKCI, and 2) an ISBN list from the Danish Pure Repository. In relation to the 16,392 ISBNs retrieved from the BKCI, an extra 30,903 ISBNs (65% more) were found following the API procedures with OCLC-WorldCat and Goodreads. With the Danish PURE repository only a few extra related ISBNS (19%) were found using the APIs. 10 Table 2. ISBN matching and retrieval results for total manifestations, expressions, and works. Figure 3. Frequency distribution of works with one or more ISBNs and published as one or more edition. The Book Citation Index The Danish Pure repository 1 Number of ISBNs crawled 16,392 (35%) 8,604 (81%) 2 Number of overlapping ISBNs 35 (0.41%) 3 Extra related ISBNs found in OCLC-WorldCat and Goodreads 30,903 (65%) 2,042 (19%) 4 Total unique ISBNs in the dataset under study 47,295 (100%) 10,646 (100%) 5 ISBNs with distinct language and publication year 34,236 8,362 6 Total Expressions 20,284 7,844 7 Total Works 16,311 8,195 11 Figure 3 indicates the number of works from the full dataset with one or more ISBNs (i.e., physical manifestations), including those that had been published as one or more edition (i.e., distinct expressions). Although a little more than half (52%; n=12,723) were published with only one ISBN, almost half (48%, or n=10,249) could also be identified as having two or more ISBNs. The highest count of ISBNs was a total of n=28 for one work, and the lowest was 1, but on average, a scholarly work is likely to be published as two editions, each with approximately 3 different ISBNs. Figure 4. Indexing quality of the BKCI based on ISBNs per edition for publication years 1995-2015. Figure 4 presents the indexing quality of the BKCI pertaining to editions or expressions of a work published in 1995, up to and including 2015. We selected this time frame because 98% of the ISBNs originally retrieved from the BKCI were for at least one edition of a work that had been published between these years. The black portion of each column per year indicates that all ISBNs related to an edition of a work are present in the BKCI. With the ISBN as the counted variable, this means that several works in their entirety have been accurately indexed. The white part of the column indicates that there is at least one ISBN indexed for a particular edition of a work, but that ISBNs for additional family-related editions are missing. The grey portion at the top of each column then represents all of these missing editions, which were confirmed to exist based on data matching with Goodreads and OCLC-WorldCat, but were not recorded in the BKCI. Note that for the publication year of 2005, most editions (i.e., expressions) had been fully indexed in the BKCI, as shown by the proportionally longer black column. For the publication year of 2009, more editions in general were added to the BKCI, but a full indexing of each edition (i.e., expression) and related ISBN seems to decrease, as shown by the proportionally longer white column. Again, the gray column indicates the proportion of editions that have no representation in the BKCI. 
For the publication year of 2010 and onward there is no real observable pattern other than the fact that the indexing quality for all editions (expressions) has remained inconsistent. 12 To illustrate this indexing problem more clearly, we refer back to the sample title list shown in Table 1. From this table, note that both the fifth and sixth editions of Manias, Panics, and Crashes: A History of Financial Crises had been indexed in the BKCI, but all earlier editions published (or printed) in the years 1978, 1989, 1996 and 2000, each with their own related ISBNs, were not added. Overall, what we found is that for all of the monographs originally identified with ISBNs in the BKCI, approximately 21% of their related editions (or expressions) were not represented. Figure 5. Indexing quality of the Danish PURE repository system based on ISBNs per edition for publication years 1995-2015. Figure 5 shows the same information shown in Figure 4, but this time for the Danish PURErepository system. Here the indexing quality for editions per work tends to be much better. Note also that most of the works that had been registered in PURE do not have more than one associated ISBN (as shown by the black and white portion of the columns). There could be two reasons for this: 1) many works were never published or reprinted again as second or third editions with new ISBNs, or 2) the Danish author decided to only register his/her work under a single ISBN. Also, if an author had been responsible for producing and publishing both a Danish and English edition of a work, both would have had to be indexed. For some works identified as having a non-indexed edition (i.e., the proportionally smaller grey bars), we found that only a Danish edition of a work was registered, but not the original language one. If the Danish author-as-translator did not produce the original language edition; he or she would not have been required to register this in PURE. 13 Figure 6. Indexing of emblematic expressions of a work in the BKCI based on ISBNs per edition for publication years 1995-2015. Figure 6 illustrates the extent to which emblematic expressions were indexed in the BKCI for the publication years of 1990 up to and including 2014. For all works with more than one edition (i.e. expression) in our dataset (n=10,731) we were able to identify a total of n=3,370 that were emblematic. Again, the emblematic edition or expression is the first publication and printing of a work as an original intellectual contribution. According to our data, approximately 40% of these emblematic expressions had not been indexed, even though they are represented in the BKCI in the form of later editions. 4. Discussion: Metrics for Monograph ‘Families’ With the Book Citation Index currently as it is, counting citations to monographs is problematic; hence a discussion is needed both in light of FRBR standards and our study results. While many similar problems apply to edited books, here we will focus strictly on monographs. One of the data accuracy problems related to the BKCI stems directly from the referencing practices of researchers. With the BKCI structured as it is now “monographs may be underrated in terms of citation impact or overrated because individual chapters are counted separately” (Leydesdorff and Felt, 2012, p. 1). For instance, if, a scholar who writes a research paper refers repeatedly to a specific chapter, (s)he may choose to cite only that chapter. 
If the scholar refers to several chapters from the same monograph, (s)he may choose to cite the full monograph. There is no rule regarding this practice, but different research associations often set guidelines. According to the Publication Manual of the 14 American Psychological Association (2016), referencing a chapter from a monograph is in fact not recommended (note: only a chapter from an edited book), yet there are instances in the BKCI where this occurs. For example: • Full Monograph: Moed, H. (2005). Citation analysis in research evaluation. Dordrecht, NL: Springer • Chapter in Monograph: Moed, H. (2005). Assessing social sciences and humanities. In Citation analysis in research evaluation (pp. 145-166). Dordrecht, NL: Springer. If this practice continues, and the BKCI is re-developed to follow FRBR, the problem of citation undercounting would cease to exist. In other words, separate citation counts might still be attributed to the Moed (2005) chapter-based reference as well as the monograph-based to reference, but the implementation of a work-related identifier would confirm that the two records are related. Applying the FRBR standard to the BKCI would, in general, ensure that all expressions of a work are indexed distinctly with an identification code. This is our first recommendation, and to some degree it has already been accomplished. For instance, currently there are two separate indexed editions of Manias, Panics, and Crashes in the BKCI (see Table 4), but not all editions have been indexed (as the data illustrate in Figure 4) and with the two that are present, there is no linking ID that shows they are part of the same work or progenitor. For all expressions and not just these two, a primary work identifier is critical, and will show the extent to which different editions within the BKCI belong to the same bibliographic family. The follow-up effect of this practice is that bibliometricians would also have new options for collecting citation counts at specific family levels. A suggested indexing structure for the BKCI, including levels for citation counting, is outlined in Figure 7. Note from Figure 7, that a work is the highest proposed target entity for a citation count; while all individual expressions (editions) constitute the lowest proposed target entity. Each expression of Manias, Panics, and Crashes has been labeled from #1 to #7 (note: see the same list in Table 2). The first four expressions link back to the same work, and the last three expressions may potentially be indexed as new work(s), as illustrated by the line leading to the box labeled “New Work ID”. Earlier, we indicated that Bod’s (2013) English translation of De Vergeten Wetenschap, newly titled as A New History of the Humanities, was said to possess augmented properties that make it identifiable as a new work. With Figure 7, we also show that when the fifth, sixth and seventh editions of Manias, Panics, and Crashes were published, C. P. Kindelberger was no longer writing alone, but with R. Z. Aliber as his co-author. For these later editions, particularly the sixth one, a note on Amazon.com indicates that there have been changes to the content: “This highly anticipated sixth edition has been revised to include an in-depth analysis of the first global crisis of the twenty-first century” (Amazon.com, 2016). 
Sometimes small revisions appearing in a new edition still fit the abstract and intellectual concept of the work as a whole, but because the revisions in this case are substantial, one might apply both a new author and augmented text rationale towards indexing the last three editions of Manias, Panics, and Crashes under a new work ID. 15 Figure 7. Recommended indexing structure for the BKCI. 16 At Figure 7, the arrows leading to the box labeled “citations/libcitations” illustrate how our proposed indexing structure would support metric assessments books at different bibliographic levels, and for two different types of metric indicators. For example, one could analyze the sum of citations given to the first four expressions of Manias, Panics, and Crashes at #1 to #4 (i.e., the work as a whole), or evaluate the individual counts of citations/libcitations given to each expression at #1 to #4. The same process may be repeated again, for every expression indexed as a new work (i.e., #5 to #6). Again, the indexer has little control over the appearance of references in the academic literature, but if most scholars adhere to proper guidelines, a reference should always be given to the correct edition of a monograph used at the time of writing. Figure 7 also illustrates that the two different counting options may be applied to libcitations or library holdings for each cataloged edition (e.g., using OCLC-WorldCat). The value in calculating indicators at different bibliographic levels is that it can help to identify whether or not a specific expression or edition of a monograph is receiving more attention than the work as a whole. For instance, one specific expression of a work may be cataloged in libraries, used, referred to, or reviewed more frequently than another. This could be the literal translation of a non-English edition of a work to English, with the new English-language edition potentially having a wider appeal. For some types of translated works, in fact, an author might even have more than one metric profile. At Figure 2, we see how distinct metrics could be calculated for De Vergeten Wetenschappen (Bod, 2010) as well as for A New History of the Humanities (Bod, 2013). The delineation between new monograph expressions (editions) would also support the identification of associated descriptive works (e.g., book reviews; commentaries). Last but not least, bibliographic levels present better opportunities for bibliometricians to discuss the merits of certain weighting options. 5. Conclusion The purpose of this study was to investigate the extent to which books currently indexed in the Book Citation Index (BKCI) have adequate metadata and data designed to reflect inherent familial components and relationships. Our research focuses primarily on monographs, and results confirm that some familial components are present in the BKCI, but not all. In terms of ISBNs, many are missing for extra editions of the same work and many in particular that need to be indexed are the ISBNs of emblematic (original/first) editions. The purpose of including all ISBNs is to ensure that every physical manifestation of a monograph is recognized (e.g., print, paperback, hardcopy, e-print) and that each ISBN is indexed as part of the correct edition or expression. This, in turn, ensures that all monograph editions can clearly be identified as being part of the same intellectual contribution, or work. 
Thus, publication counts and citation counts would be more accurate in the BKCI, and new metric indicators could be calculated more effectively.

Part of this research was also designed to compare the indexing of monographs in the BKCI with the Danish PURE repository system. Only a small percentage of books (0.41%) that had been indexed in eight Danish university PURE databases were also present in the BKCI. The BKCI is therefore not a reliable or accurate tool for citation-based evaluations of Danish scholars who mainly publish books. At present, the Danish evaluation system does not focus on citations, or citation-based approaches to evaluation. However, indexing problems still point to some drawbacks related to the PURE system when taking a performance-based approach. If monographs continue to be indexed without recognizing that they are family-based entities, a few problems might arise. For example, if co-authoring scholars from two different Danish universities register two manifestations of the same work differently in PURE, this could result in a single BFI point given towards each university department. Normally, if two scholars are responsible for the same work, each department should actually receive a fractionalized BFI point for the shared contribution. Until it is clear whether or not FRBR might be applied to the PURE system, the Ministry of Higher Education and Science in Denmark is at least making an effort to improve upon the accuracy of book registrations, by producing and publishing a set of document registration guidelines (Uddannelses-og Forskingsministeriet, 26 January 2017).

6. References

Adams, J., & Testa, J. (2011). Thomson Reuters Book Citation Index. In E. Noyons, P. Ngulube & J. Leta (Eds.), The 13th Conference of the International Society for Scientometrics and Informetrics (Vol. I, pp. 13-18). Durban, South Africa: ISSI, Leiden University and the University of Zululand.

Amazon.com. (2016). Manias, panics and crashes: a history of financial crises, sixth edition, paperback – September 27, 2011. Retrieved October 10, 2016 from https://www.amazon.com/Manias-Panics-Crashes-History-Financial/dp/0230365353.

American Psychological Association. (2001). Publication Manual of the American Psychological Association (5th ed.). Washington, DC.

Bassnett, S. (2002). Translation studies. London: Routledge.

Bennett, R., Lavoie, B. F., & O'Neill, E. T. (2003). The concept of a work in WorldCat: an application of FRBR. Library Collections, Acquisitions, and Technical Services, 27(1), 45-59.

Bod, R. (2010). De vergeten wetenschappen. Een geschiedenis van de humaniora. Amsterdam: Bert Bakker.

Bod, R. (2013). A new history of the humanities. The search for principles and patterns from antiquity to the present. Oxford, UK: Oxford University Press.

Chi, P., Thijs, B., & Glänzel, W. (2015). The challenges to embody a new data source: The Book Citation Index. ISSI Newsletter, 11(1), 24-29.

Giménez-Toledo, E., Manana-Rodrıguez, J., Engels, T. C. E., Ingwersen, P., Pölönen, J., Sivertsen, G., Verleysen, F. T. and Zuccala, A. A. (2016). Taking scholarly books into account. Current developments in five European countries. Scientometrics, 107(2), 685-699.

Gorraiz, J., Purnell, P., & Glänzel, W. (2013). Opportunities and limitations of the book citation index. Journal of the Association for Information Science and Technology, 64(7), 1388–1398.

Hammarfelt, B. (2011). Interdisciplinarity and the intellectual base of literature studies: citation analysis of highly cited monographs.
Scientometrics, 86(3), 705-725.

Haustein, S., Bowman, T. D., & Costas, R. (2015). When is an article actually published? An analysis of online availability, publication, and indexation dates. In A. A. Salah, Y. Tonta, A. A. Akdag Salah, C. Sugimoto, & U. Al (Eds.), Proceedings of ISSI 2015 Istanbul: 15th International Society of Scientometrics and Informetrics Conference, Istanbul, Turkey, 29 June to 3 July, 2015 (pp. 1170-1179). Bogaziçi University Printhouse.

IFLA Study Group on the Functional Requirements for Bibliographic Records (1998). Functional requirements for bibliographic records, final report. UBCIM Publications New Series, Vol. 19. Munchen: K.G. Saur. Retrieved February 15, 2017 from http://www.ifla.org/files/assets/cataloguing/frbr/frbr.pdf.

Kindleberger, C. P. (1978). Manias, panics, and crashes: a history of financial crises. New York: Basic Books.

Kousha, K. & Thelwall, M. (2009). Google book citation for assessing invisible impact? Journal of the American Society for Information Science and Technology, 60(8), 1537-1549.

Kousha, K. & Thelwall, M. (2011). Assessing the citation impact of books: The role of Google Books, Google Scholar, and Scopus. Journal of the American Society for Information Science and Technology, 62(11), 2147-2164.

Kousha, K., Thelwall, M. & Abdoli, M. (2016, in press). Goodreads reviews to assess the wider impacts of books. Journal of the Association for Information Science and Technology. Retrieved November 1, 2016 from https://wlv.openrepository.com/wlv/handle/2436/619162.

Leydesdorff, L. & Felt, U. (2012). "Books" and "book chapters" in the book citation index (BKCI) and science citation index (SCI, SoSCI, A&HCI). Proceedings of the American Society for Information Science and Technology, 49(1), 1-7. DOI: 10.1002/meet.14504901027.

Linmans, A. J. M. (2010). Why with bibliometrics the Humanities does not need to be the weakest link. Indicators for research evaluation based on citations, library holdings, and productivity measures. Scientometrics, 83(2), 337-354.

Lubetzky, S. (1953). Development of cataloging rules. Library Trends, 2(2), 179-186.

Moed, H. (2005). Citation analysis in research evaluation. Dordrecht, NL: Springer.

O'Neill, E. T., & Vizine-Goetz, D. (1989). Bibliographic relationships: Implications for the function of the catalog. In E. Svenonius (Ed.), The Conceptual Foundations of Descriptive Cataloging (pp. 167-179). San Diego: Academic Press.

Pellizzi, F. (2015). Art historical and anthropological translation: Some notes and recollections. Art in Translation, 4(1), 9-16.

Smiraglia, R. P. (2001). The nature of a work: implications for the organization of knowledge. London: The Scarecrow Press, Inc.

Testa, J. (2012). The book selection process for the Book Citation Index in Web of Science. Thomson Reuters. Retrieved June 24, 2016 from http://wokinfo.com/media/pdf/BKCI-SelectionEssay_web.pdf.

Tillett, B. (2001). Bibliographic relationships. In C. A. Bean and R. Green (Eds.), Relationships in the Organization of Knowledge (pp. 19-35). Dordrecht: Kluwer Academic Publishers.

Tillett, B. (2005). What is FRBR? A conceptual model for the bibliographic universe. The Australian Library Journal, 54(1), 24-30. DOI: 10.1080/00049670.2005.10721710.

Torres-Salinas, D. & Moed, H. F. (2009). Library catalog analysis as a tool in studies of social sciences and humanities: An exploratory study of published book titles in economics. Journal of Informetrics, 3(1), 9–26.

Torres-Salinas, D., Robinson-Garcia, N., Cabezas-Clavijo, A. & Jimenez-Contreras, E.
(2014). Analyzing the citation characteristics of books: edited books, book series and publisher types in the Book Citation Index. Scientometrics, 98(3), 2113–2127.

Uddannelses-og Forskingsministeriet. (26 January 2017). Retrieved March 10, 2017 from http://ufm.dk/forskning-og-innovation/statistik-og-analyser/den-bibliometriske-forskningsindikator/filer/retningslinjer-for-forskningsregistrering-til-bfi-v-1-0.pdf.

White, H., Boell, S. K., Yu, H., Davis, M., Wilson, C. S. and Cole, F. T. H. (2009). Libcitations: a measure for comparative assessment of book publications in the humanities and social sciences. Journal of the American Society for Information Science and Technology, 60(6), 1083-1096.

Zuccala, A. & Cornacchia, R. (2016). Data matching, integration, and interoperability for a metric assessment of monographs. Scientometrics, 108(1), 465-484.

Zuccala, A., Guns, R., Cornacchia, R., & Bod, R. (2014). Can we rank scholarly book publishers? A bibliometric experiment with the field of history. Journal of the Association for Information Science and Technology, 66(7), 1333-1347.

Zuccala, A. A., Verleysen, F., Cornacchia, R., & Engels, T. (2015a). Altmetrics for the humanities: Comparing Goodreads reader ratings with citations to history books. Aslib Proceedings, 67(3). http://dx.doi.org/10.1108/AJIM-11-2014-0152.

Zuccala, A. A., & White, H. D. (2015b). Correlating libcitations and citations in the humanities with WorldCat.org and Scopus data. In A. A. Salah, Y. Tonta, A. A. Akdag Salah, C. Sugimoto, & U. Al (Eds.), Proceedings of the 15th International Society for Scientometrics and Informetrics Conference, Istanbul, Turkey, 29 June to 3 July, 2015 (pp. 305-316). Bogazici University Printhouse.

work_uwjnzazcafa7niigkpyndhqqza ----

Shear waves with orthogonal polarisations for thickness measurement and crack detection using EMATs

NDT&E International 111 (2020) 102212. Available online 7 January 2020. © 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Jaime Parra-Raad, Pouyan Khalili, Frederic Cegla*
NDE Group, Mechanical Engineering Department, Imperial College London, Exhibition Road, South Kensington SW7 2AZ, United Kingdom
* Corresponding author. E-mail addresses: j.parra-raad18@imperial.ac.uk (J. Parra-Raad), f.cegla@imperial.ac.uk (F. Cegla).

Keywords: Polarised shear waves; Crack detection; Orthogonal co-located coils (OCLC); EMAT; Thickness measurement

Abstract

The use of polarised shear waves to detect the presence of crack-like defects seems to have received little or no attention in the past. The authors believe that the main reason for this appears to be the lack of a device with the capability to excite shear waves of different polarisations. In this paper, the authors first present the design of an EMAT that permits the excitation of two orthogonally polarised shear waves in metallic materials by means of two coils that are orthogonal with respect to each other. This is then followed by a 3D finite element analysis of the wavefield generated by the EMAT and its interactions with crack-like defects of different sizes, positions and orientations. Then a methodology of how this EMAT can be used to simultaneously measure material thickness and detect crack-like defects in pulse-echo mode is introduced.
Good agreement between the finite element simulations and the experimental results was observed, which makes the presented technique a potential new method for simultaneous thickness measurement and crack detection.

1. Introduction

Pulse-echo thickness measurement is one of the most commonly employed ultrasonic non-destructive evaluation (NDE) techniques. The procedure consists of exciting 0° waves with an ultrasound transducer (UT); these waves propagate into the inspected material at normal incidence to the surface, reflect from the back wall of the component, and return to the transducer. The series of wave reflections and their time of flight (ToF) can be used to determine the thickness of the material at the specific spot where the UT is located [1,2]. Example applications where UTs are frequently used to inspect metallic materials are manual thickness measurements [3-5], corrosion mapping [6,7], inline inspection (ILI) with e.g. pipeline inspection gauges (PIGs) [8,9], and permanently installed devices for thickness gauging [10-13].

The effectiveness of thickness measurement depends on the quality of the returning signals. Severe surface roughness and non-uniformity can result in a low signal-to-noise ratio (SNR) [14] and poor thickness measurements [15]. At the same time, ToF calculations of 0° wave reflections are relatively insensitive to small and crack-like defects, because such defects reflect very little energy back and do not influence the arrival time of back-wall reflections [16]. Their effects on the signal are, in some cases, comparable to the changes induced by surface roughness, so 0° waves are usually unsuitable for crack detection.

A different setup is used to perform crack detection with ultrasound [17]. An angled beam is sent into the inspected component, usually by mounting a piezoelectric UT on an angled wedge [18]. An ultrasound pulse travels through the wedge and refracts at the wedge-specimen interface, resulting in an obliquely travelling wave in the material under inspection. A defect in the travel path of the wave can reflect it partially or completely (depending on the defect geometry) back to the transducer, so a defect is detected when a reflected signal is received [18,19]. Applications of this type of UT probe include crack detection [19] and NDE inspection of train wheels [20].

The above shows that two different setups are required to perform thickness measurement and crack detection, which can be inconvenient in practice. To overcome this, we investigated whether both measurements can be combined by performing two simultaneous pulse-echo tests. The concept is to use a pair of orthogonal, linearly polarised shear waves emitted by the same transducer, enabling thickness gauging and crack-like defect detection with the same setup.
The UT probe presented in this paper is an electromagnetic acoustic transducer (EMAT) that contains two orthogonal, co-located butterfly coils. The EMAT can excite a pair of perpendicular, linearly polarised shear waves propagating into the inspected material at 90° to its surface. The generated waves travel through the metallic material and bounce back from the back wall to the transducer. The received back-wall reflections are analysed to quantify the thickness of the material and to determine the presence of a crack-like defect. Thickness gauging with an orthogonal co-located coil (OCLC) EMAT is performed by calculating the ToF of the received back-wall reflections, while the presence of crack-like defects is detected by comparing the amplitudes of the two orthogonal, linearly polarised shear wave reflections. This is possible because the amplitude of the reflected shear waves is strongly dependent on the relative orientation of a crack-like defect with respect to the polarisation direction of the incident shear wave.

There has been prior work aimed at understanding how linearly polarised shear waves interact with crack-like defects. For instance, S. K. Datta et al. [21] investigated how shear horizontal waves are diffracted when interacting with surface-breaking defects at different angles; their experimental results indicated that SH waves propagating at 0° can be used to detect surface-breaking defects. A. H. Harker [22] analysed in-plane shear waves interacting with surface-breaking cracks at different incident angles and concluded that in-plane incident shear waves split their energy into other waves after interacting with the defect. Howard [16] described qualitatively how the back-wall reflections of bulk waves are affected by the width of surface-breaking defects relative to the UT aperture. In industry, for example, it is a well-established procedure to measure the arrival times of several oblique-angle shear vertical wave reflections to size a crack-like defect [17]. However, there is no methodology that utilises relative changes in amplitude between orthogonally polarised shear waves to detect the presence of a crack-like defect. The authors believe that this is due to the absence of a device capable of recording the amplitudes of two orthogonal shear waves simultaneously, and also because there are no studies that relate the amplitude of shear wave reflections to the size of crack-like defects. This is therefore what we set out to investigate in this paper.

To understand how both 0° shear waves interact with back-wall cracks and how crack-like defects affect the reflected waves, 3D finite element (FE) simulations of the waves interacting with different crack-like defects were performed.
An experimental setup that mimics the 3D FE model was then built to validate the FE simulation results; the simulation and experimental results were compared and their agreement was verified.

This paper is organised as follows: in section 2, the OCLC EMAT layout is described. In section 3 we present the FE model used to study the interaction of the EMAT wavefield with notches of different sizes and orientations. In section 4, we explain the behaviour of the reflected waves when the EMAT wavefield interacts with crack-like defects and illustrate the concept of crack-like defect detection with FE simulation results. In section 5, the experimental design and setup are presented. In section 6 we compare the FE simulation results with the equivalent experimental measurements. Section 7 discusses the presented work and its conclusions.

2. The orthogonal co-located coils (OCLC) EMAT

The authors propose an EMAT that contains a pair of orthogonal co-located coils (the magnetic flux orientation, the orthogonal coil design, and the directions of the generated shear forces are shown schematically in Fig. 1a and b). The design is based on the results presented in Ref. [12]: that EMAT maximised the bias magnetic field so that, in combination with one butterfly coil (20 turns, 0.25 mm track width and 0.25 mm gap between tracks), it generates a linearly polarised 0° shear wave by means of the Lorentz force mechanism [23]. The OCLC EMAT is composed of the same magnetic circuit and coil introduced in Ref. [12], plus a second, identical butterfly coil placed on top of the first and rotated by 90° relative to it. Both coils are placed at the bottom of the EMAT. The magnetic field and the ultrasound wave generation are therefore as described by Isla et al., with the difference that an additional ultrasound shear wave with perpendicular polarisation is generated by the second coil. The active aperture of the OCLC EMAT is the square area at its centre (marked in blue in Fig. 1b) [16] where the vertical and horizontal tracks of the coils cross; it is in this area that the magnetic flux is applied and hence where the Lorentz force exerts a surface traction on the specimen.

Fig. 1. Schematic of the orthogonal co-located EMAT coils. a) Side view of the OCLC EMAT. b) Top view showing the two coils, identical but rotated by 90° and placed on top of each other, the active aperture area, and the shearing directions resulting from the two coils. ICoilA and ICoilB indicate the current direction in the coils.

3. Finite element simulation of the interaction of SH waves with notches/cracks in isotropic materials

FE analysis was employed to assess the interaction of the two orthogonal, linearly polarised shear waves with crack-like defects of various depths, widths and orientations. FE is a versatile and powerful tool that allows the investigation of a large number of defect shapes and sizes which would be very costly to examine fully by experiment. The predictions were performed in a pulse-echo setup at a 2 MHz centre frequency, which in an aluminium (ρ = 2700 kg/m³, E = 70.76 GPa, ν = 0.3375) structure corresponds to a wavelength of around 1.6 mm.
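As a quick plausibility check of the quoted numbers (ours, not part of the paper), the bulk shear wave speed implied by the stated elastic constants, and the corresponding wavelength at 2 MHz, can be computed in a few lines of Python:

    # Sanity check of the quoted ~1.6 mm wavelength from the stated material
    # constants (all input values below are taken from the text).
    import math

    rho = 2700.0   # density [kg/m^3]
    E = 70.76e9    # Young's modulus [Pa]
    nu = 0.3375    # Poisson's ratio
    f = 2.0e6      # centre frequency [Hz]

    G = E / (2.0 * (1.0 + nu))   # shear modulus [Pa]
    c_s = math.sqrt(G / rho)     # bulk shear wave speed [m/s]
    wavelength = c_s / f         # shear wavelength [m]

    print(f"c_s ~ {c_s:.0f} m/s, wavelength ~ {wavelength * 1e3:.2f} mm")
    # -> c_s ~ 3130 m/s, wavelength ~ 1.57 mm, i.e. the ~1.6 mm quoted above

The result also agrees well with the measured velocities of 3178 m/s and 3151 m/s reported later in Table 1.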
The full 3D FE simulations were carried out with the Pogo software package [24], an FE package dedicated to solving explicit time-domain wave propagation problems [25]. To ensure the accuracy and stability of the simulations, cubic elements with side lengths of 0.05 mm and a time step of 6.34 ns were used. To keep the computational cost as low as possible, only a small section of a block was modelled; the geometry investigated throughout the simulations is shown in Fig. 2. A typical model had around 648 million degrees of freedom and took approximately one and a half hours to run on a machine with 8 Nvidia Quadro RTX 8000 GPU cards.

Fig. 2 displays the general FE setup used throughout this study. To mimic the OCLC EMAT active area, an excitation area of 12 mm x 12 mm [16] was placed on the top surface of a 30 mm thick aluminium block, where polarised shear waves were generated via tangential surface tractions. The excitation signal consisted of a 5-cycle Hanning-windowed toneburst at 2 MHz centre frequency with a maximum amplitude of 1. (The authors assume that the propagation of the shear waves is a linear problem and are only interested in relative amplitude changes of the waves; the absolute values of the surface tractions in the simulations are therefore not relevant to the results of this study.) A 5-cycle Hanning-windowed toneburst was preferred over a single pulse as the excitation signal because of its well-defined frequency bandwidth.

The surface defect was placed on the opposite surface of the aluminium block. By simply changing the geometry of the defect, the effect of parameters such as defect depth, width and orientation on the reflection of the shear waves can be studied. To simulate a pulse-echo measurement, the reflection was recorded via the same nodes used for the excitation (red area in Fig. 2); OCLC EMAT reception was simulated by summing and then normalising the surface displacement components oriented along the corresponding coil direction.

To understand the behaviour of the polarised shear waves when interacting with surface crack-like defects, a one-element-wide notch was placed at the middle of the block, as shown in Fig. 2. The defect was created by disconnecting the set of nodes at the location of the discontinuity; with this configuration, the displacement coming from the left-hand side of the defect is not transferred to the right-hand side and vice versa, simulating the effect of a surface-breaking defect. Simulations of the wave propagation were then carried out for various defect depths (0.05, 0.3, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 mm), for defect lengths normalised to the transducer size of 5%, 10%, 20%, 40%, 60%, 80% and 100%, and for orientations of 0°, 10°, 20°, 30°, 40°, 45°, 50°, 60°, 70°, 80° and 90°.
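For illustration, the excitation signal described above can be synthesised as follows. This short Python sketch is ours, and the sampling rate (chosen here to match the quoted FE time step) is an assumption rather than a value given in the paper:

    # Illustrative synthesis of the excitation: a 5-cycle, 2 MHz
    # Hanning-windowed toneburst with peak amplitude 1.
    import numpy as np

    f0 = 2.0e6            # centre frequency [Hz]
    n_cycles = 5          # cycles in the burst -> 2.5 us duration
    fs = 1.0 / 6.34e-9    # sample rate matching the 6.34 ns FE time step

    t = np.arange(0.0, n_cycles / f0, 1.0 / fs)
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * f0 * t / n_cycles))  # Hann window
    toneburst = window * np.sin(2.0 * np.pi * f0 * t)

    # 'toneburst' would be applied as a tangential surface traction over the
    # 12 mm x 12 mm excitation area in the FE model.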
4. Measurement concept

The thickness of a metallic material is measured by generating an ultrasound wave at the material surface and recording the reflections of the emitted wave from the back wall of the material; the ToF between consecutive back-wall reflections, multiplied by the wave speed, gives an estimate of the thickness. To detect the presence of crack-like defects, we propose comparing the relative amplitudes of the reflections of two co-located, orthogonal, linearly polarised shear waves. This is possible because the reflected amplitude of a linearly polarised shear wave transmitted through an area of material containing a crack-like defect is strongly dependent on the relative orientation of the defect and the incident shear wave polarisation.

Fig. 3 illustrates how the shear waves are affected by the presence of a crack-like defect; the FE results were obtained for a notch of 0.05 mm width, 5 mm depth and 30 mm length. Fig. 3a shows that an incident shear wave polarised parallel to the defect face is almost unaffected by the notch, while Fig. 3b shows that an incident shear wave polarised perpendicular to the notch scatters at the defect tip and mode-converts into compression and Rayleigh waves. Fig. 3c shows the amplitudes of the signals recorded when the incident shear wave is parallel or perpendicular to the notch orientation. The time-traces show the excitation signal used for the pulse-echo tests; the received L + S wave, which results from the longitudinal wave excited at the edges of the OCLC EMAT's aperture mode-converting to shear waves at the back-wall surface of the specimen [12]; and the 1st back-wall reflection recorded at the aperture area. The amplitude difference between the two 1st back-wall reflections is due to the relative orientation of the shear waves with respect to the notch.

The amplitude of a shear wave reflected when interacting with a crack-like defect also depends on the length of the defect relative to the transducer aperture: if the defect is shorter or smaller than the aperture, part of the incident wave does not interact with it, and the wave reflected from that area is unaffected by the presence of the defect. Fig. 4 shows the drop in back-wall reflection amplitude of the shear wave perpendicular to the defect as a function of crack depth and length; the drop increases for deeper and/or longer defects. The amplitude drop is also dependent on the width of the defect/notch, as additional scattering occurs due to reflection from the top face of the defect; hence a higher received amplitude is observed as the width of the defect is increased. Fig. 5 displays the effect of notch width on the amplitude drop of the back-wall reflections: the FE results show the percentage amplitude drop of the perpendicular and parallel (relative to the defect face) shear waves as a function of notch width from 0 to 1 mm.

In pulse-echo mode, the OCLC EMAT is thus proposed for measuring the thickness of a metallic material and detecting the presence of a crack-like defect in the same material by exciting two orthogonal, linearly polarised shear waves simultaneously. The two generated waves interact with the back-wall surface and reflect towards the OCLC EMAT. The thickness is measured from the ToF of the signal recorded with one of the coils. To detect a crack-like defect, the amplitudes of the two reflected waves are recorded and compared: a drop in one wave's amplitude relative to the other indicates that one of the incident waves has scattered more energy away due to an anomaly in the material, most likely a notch or crack-like defect.
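A minimal sketch of this decision logic is given below. It is our illustration, not the authors' code: the echo gate and the 10% flag threshold are hypothetical choices, and in practice the gate would be set from the expected ToF of the first back-wall echo.

    # Sketch of the proposed defect flag. 'trace_a' and 'trace_b' are the
    # time-traces received on the two orthogonal coils; 't' is the time axis.
    import numpy as np

    def echo_amplitude(trace, t, t_min, t_max):
        """Peak absolute amplitude inside the gate around the 1st back-wall echo."""
        gate = (t >= t_min) & (t <= t_max)
        return np.max(np.abs(trace[gate]))

    def crack_flag(trace_a, trace_b, t, gate=(18e-6, 20e-6), threshold=0.10):
        """Flag a defect when one polarisation reflects noticeably less than the other."""
        amp_a = echo_amplitude(trace_a, t, *gate)
        amp_b = echo_amplitude(trace_b, t, *gate)
        drop = abs(amp_a - amp_b) / max(amp_a, amp_b)  # relative amplitude drop
        return drop > threshold, drop

The default gate of 18-20 us corresponds to the expected arrival of the first back-wall echo for a 30 mm plate at roughly 3.2 mm/us shear speed.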
Fig. 2. Schematic of the FE setup. The red area represents the equivalent OCLC EMAT excitation/reception area (aperture). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5. Experiment design and setup

To verify the FE predictions, an OCLC EMAT with a 12 mm active aperture was built and placed on a 30 mm thick, 250 mm long and 100 mm wide aluminium block. Three notches of different depths (0.3, 1 and 5 mm, as shown in Fig. 6a) were machined into the back wall of the block; the notches were 0.3 mm wide and spanned the whole width of the specimen.

Four experiments were then carried out by performing several pulse-echo tests on the aluminium block. The first, a pulse-echo test in a defect-free region (OCLC EMAT placed at 100 mm, as shown in Fig. 6a), established the baseline amplitude of the reflected shear waves and the propagation velocity of the excited shear waves. The second consisted of pulse-echo tests over the three notches, with one of the linearly polarised shear waves aligned parallel to the notch orientation, to measure the effect of defect depth on the amplitude of the reflected shear waves; the results of the first and second experiments were analysed to extract the back-wall reflection amplitude drop due to the notches and the estimated material thickness when a defect is present. The third experiment investigated the effect of the transducer offset relative to the defect position, varying the offset between the 5 mm deep notch and the centre of the OCLC EMAT from -10 to 10 mm in 1 mm steps. The last experiment measured the amplitude drop as the relative orientation of the OCLC EMAT with respect to the defect was varied, rotating the EMAT in 10° steps from the 90° position to the 0° position. In all four experiments both coils were used in pulse-echo mode and were excited simultaneously. Fig. 6a depicts the geometrical setup of the first three experiments and Fig. 6b that of the last.

A schematic of the experimental setup is shown in Fig. 7a. The measurement setup consists of a computer; a data acquisition system (DAQ) containing two synchronised Handyscope-5 (HS5) units with arbitrary waveform generators (AWG) and oscilloscopes; an amplification system (AS) developed in-house; and the OCLC EMAT.

Fig. 3. FE results of two orthogonal, linearly polarised shear waves (2 MHz, 5-cycle Hanning-windowed) interacting with a notch defect of 0.05 mm width and 5 mm depth across the back surface of the material. a) Displacement magnitude of the shear wave polarised parallel to the notch. b) Displacement magnitude of the shear wave polarised perpendicular to the notch. c) Normalised time-traces of the incident and reflected shear waves when the back wall has a notch in it.

Fig. 4. FE simulation predictions of the amplitude drop of the perpendicularly (with respect to the defect face) polarised shear wave relative to the parallel polarised shear wave, as a function of crack depth and crack length. The results were obtained for a 30 mm thick aluminium block at a centre frequency of 2 MHz.
Fig. 5. FE simulation results of the effect of notch width on the percentage amplitude drop of the linearly polarised shear wave back-wall reflections. The simulated notches were 30 mm long and 5 mm deep.

A picture of the full system is shown in Fig. 7b. The computer, which controlled the experiment, executed a Matlab [26] routine that used the two synchronised HS5 devices to send and receive signals to and from each coil. The AS was connected between the HS5s and the OCLC EMAT. This purpose-built amplifier enhances the excitation and received signals of the transducer; it contains a buffer amplifier on the transmitting side, an isolation switch to interchange between transmitter and receiver, and a 60 dB receive amplifier. Ultrasound data could be acquired rapidly and simultaneously from both OCLC EMAT coils by using coded sequences to excite each coil [27]. The encoded signal was a 5-cycle Hanning-windowed toneburst at 2 MHz centre frequency with an amplitude of 24 Vpp from the AWG. The signal was amplified by the AS and transmitted to the OCLC EMAT, which induced the tangential traction at the surface of the aluminium block. Once the ultrasound signal had been excited and had travelled through the material, its reflections from the back wall were measured by the OCLC EMAT and sent to the AS, which amplified the received signal for recording by the HS5 devices.
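The following sketch illustrates the idea of simultaneous two-coil acquisition with coded excitation in the spirit of [27]. The random binary codes and the correlation receiver are our illustrative choices; the sequences actually used in the experiments are not specified in this paper.

    # Conceptual sketch: both coils fire at once with different codes, and the
    # two channel responses are separated at the receiver by correlation,
    # because the codes have low cross-correlation.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 1024
    code_a = rng.choice([-1.0, 1.0], N)   # drive sequence for coil A
    code_b = rng.choice([-1.0, 1.0], N)   # drive sequence for coil B

    # Toy channel impulse responses: one back-wall echo per coil,
    # with different delays and amplitudes.
    h_a = np.zeros(256); h_a[100] = 1.0
    h_b = np.zeros(256); h_b[160] = 0.6

    # The receiver records the superposition of both excitations.
    rx = np.convolve(code_a, h_a) + np.convolve(code_b, h_b)

    # Correlating the mixture with each code recovers that coil's response.
    est_a = np.correlate(rx, code_a, mode="full")[N - 1:] / N
    est_b = np.correlate(rx, code_b, mode="full")[N - 1:] / N
    print(np.argmax(np.abs(est_a)), np.argmax(np.abs(est_b)))  # -> 100 160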
6. FE and experimental results

The results of two pulse-echo tests on the aluminium block using the OCLC EMAT are shown in Fig. 8: the time-traces in Fig. 8a were obtained with the OCLC EMAT placed on an undamaged area of the block, and those in Fig. 8b with the EMAT placed on top of the 5 mm deep notch.

The ToFs shown in Table 1 were calculated by measuring the time difference between the maximum absolute values of the 1st and 2nd back-wall reflections of the same time-trace. The ToFs measured on the undamaged back wall were used to estimate reference shear wave propagation velocities in the aluminium block in the direction of each coil; the velocities were calculated by dividing the distance travelled by the shear waves in the block by the ToF. Once these velocities were estimated, the ToFs measured with the notch present were used to measure the thickness of the block: the thickness was obtained by multiplying the ToF by the shear wave velocity and dividing by the number of times the wave traversed the material between the 1st and 2nd back-wall reflections. The estimated velocities and thicknesses, for the undamaged back wall and with the notch present, are shown in Table 1.

Table 1. Estimated velocity and thickness using the OCLC EMAT.
                        | Undamaged | Notch present
  Coil A  ToF           | 18.88 us  | 18.90 us
          SW velocity   | 3178 m/s  | 3178 m/s
          Thickness     | 30 mm     | 30.03 mm
  Coil B  ToF           | 19.04 us  | 19.06 us
          SW velocity   | 3151 m/s  | 3151 m/s
          Thickness     | 30 mm     | 30.03 mm

The results of the second, third and last experiments, together with the equivalent FE results, are shown in Figs. 9-11 respectively. The FE results in Figs. 9-11 were obtained for 0.05 mm wide notches, while the experimental results were obtained for 0.3 mm wide notches; the other geometrical dimensions of the notches were the same in both cases. Fig. 9 shows the maximum amplitude drop of the shear waves resulting from their interaction with notches of different depths. Fig. 10 shows the relative amplitude drop from each coil as a function of OCLC EMAT offset relative to the notch (an offset of 0 mm means the EMAT is directly over the defect). Fig. 11 shows the results of the fourth experiment: the normalised maximum amplitude recorded on both coils is plotted for each angular position. The amplitude drop percentages in Figs. 9 and 10 were calculated by comparing the results to those obtained on the undamaged section of the specimen.

Fig. 6. Schematic of the measurement setup and geometrical orientation of the OCLC EMAT. a) Front view. b) Top view. Arrows indicate the coil orientations when the OCLC EMAT is rotated 0° with respect to the notch: the blue arrow represents coil A and the red arrow coil B. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 7. Measurement setup. a) Schematic. b) Picture of the full acquisition system including laptop, HS5, amplifier, specimen and OCLC EMAT.

Fig. 8. Back-wall reflections recorded with the OCLC EMAT coils. a) Reflection from an undamaged back surface. b) Reflection when a 5 mm deep, 0.3 mm wide and 100 mm long notch is present.

Fig. 9. Maximum amplitude drop of the back-wall reflection vs notch depth. The experimental results are for 0.3 mm wide notches; the FE results are for 0.05 mm wide notches. The amplitude drop percentage was calculated by comparing the results with the defect present and absent.

Fig. 10. Maximum amplitude drop of the back-wall reflection vs offset distance between notch and OCLC EMAT. The experimental results are for the 5 mm deep notch; the FE results are for a 0.05 mm wide, 5 mm deep notch. The amplitude drop percentage was calculated by comparing the results with the defect present and absent.

Fig. 11. Maximum amplitude of the back-wall reflection vs relative angle between the 5 mm deep notch and the OCLC EMAT. The experimental results are for the machined notch; the FE results are for a 0.05 mm wide notch. The amplitude drop percentage was calculated by measuring how much each amplitude data point decreased compared with the maximum value in the experiment.
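The Table 1 values can be reproduced directly; this worked check is ours. Between consecutive back-wall reflections the wave crosses the 30 mm plate twice, so the round-trip path between the 1st and 2nd echoes is 60 mm.

    # Worked check of the Table 1 values for coil A.
    round_trip = 2 * 30.0e-3      # [m], two traversals of the 30 mm plate
    v = round_trip / 18.88e-6     # undamaged ToF      -> ~3178 m/s
    thickness = v * 18.90e-6 / 2  # notch-present ToF  -> ~30.03 mm
    print(f"v = {v:.0f} m/s, thickness = {thickness * 1e3:.2f} mm")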
7. Discussion

The capability of the OCLC EMAT to detect crack-like defects was established. In all cases the experimental results showed good agreement with the corresponding FE predictions. It should be noted, however, that the majority of the FE results were obtained for 0.05 mm wide notches, hence the lower predicted amplitude drops compared with the experimental measurements. Other factors, such as surface roughness of the notch faces and notch bottoms that are not perfectly flat, are thought to further enhance the amplitude drop recorded in the experiments.

The amplitude drop percentage recorded with the OCLC EMAT depends on the relative position and orientation of the coils with respect to the crack-like defect; the highest chance of detecting the defect therefore occurs when the transducer is positioned directly above the defect with one coil oriented perpendicular to it. If both coils have a similar orientation relative to the defect (close to 45°), the amplitude drops of the two coils will be similar, complicating detection; in such cases the presence of the defect can be verified by rotating the transducer in place, which restores contrast between the amplitude drops recorded by the two OCLC EMAT coils.

The capability to perform thickness measurements with the OCLC EMAT was verified by estimating the thickness of the aluminium block with the back-wall surface undamaged and with a crack-like defect present. The experimental thickness results are presented in Table 1 and Fig. 8. For each coil the estimates are similar in both cases, showing that the thickness measurement can be performed irrespective of the presence of a crack-like defect. The slight difference between the ToFs recorded by the two coils was found to be due to material anisotropy of the tested aluminium block, producing a wave velocity variation of around 1%; this discrepancy can be compensated for by comparing the ToFs recorded by each coil on an undamaged section of the specimen. Moreover, there is a 0.02 us difference between the ToFs measured with the same coil with and without the notch. This corresponds to a shift of the maximum peak of the back-wall reflection by 1 sample at the 50 MHz sampling frequency and is therefore not believed to be significant, as slight changes in amplitude and noise can push the peak into the next digital sample point.

8. Conclusion

This paper has demonstrated that EMATs containing orthogonal co-located coils can be used to excite two perpendicular, linearly polarised shear waves. The two perpendicularly oriented shear waves interact very differently with crack-like defects, and the resulting amplitude difference between the received signals can be used as a reliable indicator of the presence of a defect in metallic materials. The results obtained with the OCLC EMAT show that crack-like defects of depth 0.3 mm (0.2λ) or more can be detected by aligning one of the shear waves parallel to the defect face and comparing the maximum amplitude difference between the signals measured with the two coils. However, the received amplitude drop plateaus at a crack depth of roughly 2λ, meaning the amplitude drop cannot be used to estimate the size of defects 3 mm deep or larger. Additionally, the amplitude drop of the recorded back-wall reflections depends on the relative orientation and position of the OCLC EMAT with respect to the crack-like defect: the chance of detecting a defect decreases when the EMAT is neither on top of nor aligned with the direction of the defect.

To investigate how the amplitude of the received signals can be used to determine the presence of a defect, FE simulations of the two orthogonal, linearly polarised shear waves interacting with different crack-like defects were performed and validated with experimental results.
The FE and experimental results show good agreement; the discrepancies arise because the FE simulations were performed for 0.05 mm wide notches while the experiments were carried out on notches of finite width.

The NDE technique introduced in this paper is a new concept that can be used for thickness measurement and crack-like defect detection. Its advantage over conventional NDE techniques is that a single setup allows both measurements to be performed simultaneously.

Acknowledgements

The authors would like to acknowledge funding from the ORCA hub (EPSRC grant EP/R026173/1) and Permasense Ltd.

References
[1] ASTM E797/E797M-10. Standard practice for measuring thickness by manual ultrasonic pulse-echo contact method. West Conshohocken, PA, USA: ASTM International; 2015.
[2] BS EN 14127. Non-destructive testing - Ultrasonic thickness measurement. London, UK: British Standards Institution; 2011.
[3] Kobayashi M, Jen C-K, Bussiere J, Wu K-T. High-temperature integrated and flexible ultrasonic transducers for nondestructive testing. NDT E Int 2009;42:157-61.
[4] Dixon S, Edwards C, Palmer S. High accuracy non-contact ultrasonic thickness gauging of aluminium sheet using electromagnetic acoustic transducers. Ultrasonics 2001;39:445-53.
[5] Robinson A, Drinkwater B, Allin J. Dry-coupled low-frequency ultrasonic wheel probes: application to adhesive bond inspection. NDT E Int 2003;36:27-36.
[6] Benstock D, Cegla F, Stone M. The influence of surface roughness on ultrasonic thickness measurements. J Acoust Soc Am 2014;136:3028-39.
[7] Brizuela J, Camacho J, Cosarinsky G, Iriarte J, Cruza J. Improving elevation resolution in phased-array inspections for NDT. NDT E Int 2019;101:1-16.
[8] Vanaei H, Eslami A, Egbewande A. A review on pipeline corrosion, in-line inspection (ILI), and corrosion growth rate models. Int J Press Vessel Pip 2017;149:43-54.
[9] Xie M, Tian Z. A review on pipeline integrity management utilizing in-line inspection data. Eng Fail Anal 2018;92:222-39.
[10] Honarvar F, Salehi F, Safavi V, Mokhtari A, Sinclair AN. Ultrasonic monitoring of erosion/corrosion thinning rates in industrial piping systems. Ultrasonics 2013;53:1251-8.
[11] Cawley P, Cegla F, Stone M. Corrosion monitoring strategies - choice between area and point measurements. J Nondestruct Eval 2013;32:156-63.
[12] Isla J, Cegla F. Optimization of the bias magnetic field of shear wave EMATs. IEEE Trans Ultrason Ferroelectr Freq Control 2016;63:1148-60.
[13] Cegla F, Allin J. Ultrasonic monitoring of pipeline wall thickness with autonomous, wireless sensor networks. John Wiley & Sons, Ltd; pp. 571-8.
[14] Chen J, Shi Y, Shi S. Noise analysis of digital ultrasonic nondestructive evaluation system. Int J Press Vessel Pip 1999;76:619-30.
[15] Cegla F, Jarvis A. Modeling the effect of roughness on ultrasonic scattering in 2D and 3D. AIP Conf Proc 2014;1581:595-601.
[16] Howard RD. Quantitative evaluation of ultrasonic techniques for the detection and monitoring of corrosion in pipes. D.Eng. thesis, Imperial College London; 2017.
[17] ASTM E2192-13. Standard guide for planar flaw height sizing by ultrasonics. West Conshohocken, PA, USA: ASTM International; 2018.
[18] Ermolov IN. Progress in the theory of ultrasonic flaw detection. Problems and prospects. Russ J Nondestruct Test 2004;40:655-78.
[19] Mak D. Ultrasonic methods for measuring crack location, crack height and crack angle. Ultrasonics 1985;23:223-6.
[20] Pohl R, Erhard A, Montag H-J, Thomas H-M, Wüstenberg H. NDT techniques for railroad wheel and gauge corner inspection. NDT E Int 2004;37:89-94.
[21] Datta SK, Shah AH, Fortunko CM. Diffraction of medium and long wavelength horizontally polarized shear waves by edge cracks. J Appl Phys 1982;53:2895-903.
[22] Harker AH. Numerical modelling of the scattering of elastic waves in plates. J Nondestruct Eval 1984;4:89-106.
[23] Kawashima K, Wright OB. Resonant electromagnetic excitation and detection of ultrasonic waves in thin sheets. J Appl Phys 1992;72:4830-9.
[24] Huthwaite P. Pogo; 2019. http://www.pogo.software.
[25] Huthwaite P. Accelerated finite element elastodynamic simulations using the GPU. J Comput Phys 2014;257(Part A):687-707.
[26] MathWorks Inc. Matlab R2018b; 2018. https://www.mathworks.com/products/matlab.html.
[27] Isla J, Cegla F. Coded excitation for pulse-echo systems. IEEE Trans Ultrason Ferroelectr Freq Control 2017;64:736-48.
work_uzqs374q3natdm5bg66ptrgn5i ---- Developing a Basic Framework for the Korean National Collection Policy (국가장서개발정책 기본모형 연구)

Hye-Rhan Chang, Hyun-Jin Hong, Younghee Noh, Eui-Kyung Oh

Contents: 1. Introduction; 2. Collection Development Policies of National Libraries in Major Countries; 3. Current Collecting at the National Library of Korea; 4. A Basic Framework for the National Collection Development Policy (4.1 Building a Hybrid Collection; 4.2 Building a Cooperative Linking Collection; 4.3 Collection Evaluation; 4.4 Codifying the Collection Development Policy); 5. Conclusion and Suggestions.

ABSTRACT: A more sophisticated library collection policy is required due to the changing information environment. Both traditional resources and diversified networked information resources are increasing continuously. The National Library of Korea should construct a systematic collection development policy in order to fulfill its mission. The purpose of this study is to develop a basic framework that is intended to improve the coverage of the national collection. Examining previous studies and the national collection development policies of major advanced countries, a model consisting of four sub-models has been proposed.
It includes hybrid collection, cooperative linking collection, collection evaluation, and a collection policy statement. Details of the sub-models and strategies are described.

Keywords: National Collection, Collection Development Policy, Collection Policy Framework, National Library of Korea

* This paper is a revised and expanded version of the study "A Basic Framework for the National Collection Development Policy," carried out in 2009 with policy research funding from the National Library of Korea.
** Professor, Department of Library and Information Science, Sangmyung University (chrhan@smu.ac.kr)
*** Professor, Department of Library and Information Science, Chonnam National University (hjhong@chonnam.ac.kr)
**** Associate Professor, Department of Library and Information Science, Konkuk University (irs4u@kku.ac.kr)
***** Lecturer, Department of Library and Information Science, Konkuk University (ohspace@kku.ac.kr)
Received: 20 November 2009. First review: 23 November 2009. Accepted: 16 December 2009.
Journal of the Korean Society for Library and Information Science, 43(4): 193-215, 2009. [DOI:10.4275/KSLIS.2009.43.4.193]

1. Introduction

Changes in the library information environment, the diversification of information materials, and the growth of electronic resources all require libraries to overhaul their collection development policies and to review and reflect a range of new circumstances. The National Library of Korea (NLK), as the nation's representative library, must likewise establish a systematic and rational collection development policy if it is to reflect these trends and fulfill its mission and role; that is, it needs to move from a traditional collection development plan to collection building grounded in the kind of comprehensive, well-organized policy found in advanced countries.

This study proposes a basic framework for a collection development policy capable of systematically collecting the nation's vast documentary cultural heritage. To this end, it examines comprehensive collection-building strategies covering online and offline materials for a hybrid library; Korea-related materials published abroad, including works by Koreans and works on Korean society, culture, and history; grey literature; and materials for persons with disabilities.

For the study, previous research was systematically surveyed and synthesized, and the collection development policy statements of national libraries abroad were reviewed to analyze the full range of collection development matters: collecting scope, methods, regulations, and evaluation. The websites of each country's national library were also searched directly to identify recent changes and directions of development.

The basic framework presented here, grounded in analysis of the current situation and of foreign policy examples, can serve as useful input for the NLK's detailed collection expansion plans and guidelines. It should help the NLK establish its standing as the national library, fulfill its responsibility to collect and preserve the nation's documentary heritage comprehensively and transmit it to future generations, and serve as the national repository of knowledge. It can also be used as baseline data for reorganizing the NLK's collecting operations and formulating mid- and long-term collection expansion plans.

2. Collection Development Policies of National Libraries in Major Countries 1)

1) See the collection development policy pages of the Library of Congress, the British Library, the National Library of Australia, and the national library of Canada.

A collection development policy is closely tied to the library's purpose, and the library's purpose determines the policy's direction, structure, and scope. Analysis of the collection development policies of major national libraries shows that their components include a preface or overview stating the purpose, collection development policies by subject, policies by format, policies for overseas materials by region, collection levels and scope, mid- and long-term development plans, electronic resource policies, and other matters. The core contents can be summarized as follows.

1) Policy overview
This part states the national library's purpose, the direction collection development should take to achieve that purpose, the scope of collecting, and the preservation of the collection; some overviews also include long-term strategies. For example, the national library of Canada states its purpose as preserving and transmitting materials on Canadian culture and history and collecting and organizing foreign materials for use, and it presents collecting principles, core collection concepts, and development strategies.

2) Collection development policy by subject and format
Foreign national libraries describe their policies by subject and by format. The Library of Congress divides its policy in detail into 46 academic subject areas and 16 formats, describing the scope of development, target collection size, strengths, characteristics, weaknesses and excluded materials, and the collection level pursued. The National Library of Australia groups subjects into the social sciences, humanities, arts, and sciences, and formats into oral history and folklore recordings, photographs, maps, music, and special materials such as dance, describing the purpose of development, definitions and categories, collecting methods, access, and cooperation with related institutions.

3) Policy for overseas materials by region
Large national libraries that aim to collect materials from the whole world set out their collection development direction, scope, and strategies region by region. The Library of Congress collects area-studies materials from around the world comprehensively. The British Library describes its policy for Britain and Ireland, the Middle East and North Africa, the Netherlands, Germany, Greece, India, Italy, Scandinavia, and Slavic and Eastern Europe. The National Library of Australia, focusing on neighbouring regions and countries, provides guidance on purpose, categories, and languages for overseas collections in general, electronic resources, Asia, the Pacific region, special subjects, government and international organization publications, and newspapers.

4) Collection levels and scope
Most national libraries state collection levels or scope in the preface or within the subject, format, and regional sections. Levels are usually presented per subject: the Library of Congress applies the conspectus developed by the Research Libraries Group (RLG), which is suited to describing research-level collections and runs from Level 0 to Level 5, denoting Out of Scope, Minimal, Basic Informational, Study or Instructional Support, Research, and Comprehensive. The National Library of Australia also applies a conspectus for collection evaluation; the Australian conspectus runs from Level 0 to Level 5, denoting Out of Scope, Minimal, Basic, Intermediate, Research, and Comprehensive.

5) Collection development strategy
Systematic, balanced collection development requires a mid- and long-term development plan.
The national library of Canada presented 'Key Directions, 2005-2010' in 2005, containing five strategies: building one complete LAC collection, a digital collection building plan, materials related to indigenous peoples, multicultural materials, and building the national collection.

6) Digital collection development policy
As digital materials increase, collection development policies for collecting and preserving them are increasingly presented. The Library of Congress, under 'Electronic Resources' in the Supplementary Guidelines to its subject and format policies, defines digital materials and their collecting categories, and even provides guidelines for building digital collections. The British Library likewise presents directions and collecting guidelines for building its 'Web Collections'. The German National Library was assigned the task of collecting non-physical media (that is, online publications) when the German library law was passed in June 2006, and its deposit regulations and collection guidelines were revised accordingly.

3. Current Collecting at the National Library of Korea

The NLK builds its collection through legal deposit, purchase, exchange, donation, and in-house production. Holdings by acquisition method as of 31 August 2009 are shown in Table 1. Legal deposit accounts for the largest share, 5,403,444 items (74.10% of the total), followed by donation (774,404), purchase (558,722), exchange (403,439), and in-house production (152,228).

Table 1. Holdings by acquisition method (unit: volumes, copies, items, reels)
  Deposit 5,403,444 | Purchase 558,722 | Exchange 403,439 | Donation 774,404 | In-house 152,228 | Total 7,292,237
  (Source: NLK website, About the Library)

Acquisitions in 2008 by material type and method are shown in Table 2.

Table 2. 2008 acquisitions by material type (unit: volumes, copies, items, reels)
  Type                  | Deposit | Purchase | Exchange | Donation | In-house | Total
  Monographs (books)    | 296,484 |  21,663  |  3,910   |  16,459  |  1,255   | 339,771
  Monographs (non-book) |  52,256 |   5,010  |  1,888   |   2,131  |  1,926   |  63,211
  Serials (books)       | 266,684 |  11,710  |  4,563   |   1,718  |    181   | 284,856
  Serials (non-book)    |   4,108 |     889  |    318   |     -    |    568   |   5,883
  Total                 | 619,532 |  39,272  | 10,679   |  20,308  |  3,930   | 693,721
  (Source: 2008 NLK Annual Report, pp. 261-262, rearranged)

Legal deposit covers all publications issued in Korea, accounts for the largest share of acquisitions, and operates as a dual system of direct deposit to the NLK and deposit through the Korean Publishers Association acting as agent. According to the Association's publishing statistics, 43,099 titles of new publications (published in 2008) were deposited through it in 2008, an increase of 4.9% over the previous year (http://www.kpa21.or.kr/bbs/board.php?bo_table=d_total). A commissioned deposit project brought in a further 113,800 volumes (items).

Purchase is used for materials published in Korea before the deposit law took effect in 1965, rare items not yet held, Korea-related materials published abroad, and materials in various languages such as North Korean, Japanese, Chinese, and Western-language publications. Through the 'Chaekdamoa' website (http://www.nl.go.kr/sun/index.php) the library also receives donations of materials held by individuals and organizations, adding missing items to the collection.

Through exchange and deposit arrangements with foreign libraries and international organizations, the library acquires publishing information, government publications, scholarly materials, statistics of various kinds, Korea-related materials published abroad, and international organization publications; it also collects materials it produces itself, such as reproductions of old Korean books held at home and abroad. Domestically, 2,983 titles in 3,246 volumes held by Kyujanggak were acquired on 225 rolls of microfilm. Abroad, under a three-year reproduction project running to 2010, 297 titles in 347 volumes of old Korean books held by the Harvard-Yenching Library are being digitized, and 84 titles in 313 volumes held by the Library of Congress have been digitized into 48,592 digital image files. Modern-history materials related to Korea held abroad are also being reproduced systematically and continuously; 1,030,000 pages of Korea-related materials held by the U.S. National Archives and Records Administration (NARA) have been collected to date.

According to the 'Comprehensive Library Advancement Plan 2009-2013' (Presidential Committee on Library and Information Policy 2008), the NLK aims to exceed 11 million volumes in total holdings by 2013, reaching the level of eighth place among OECD countries.

4. A Basic Framework for the National Collection Development Policy

As the basic framework for the national collection development policy, this study proposes four sub-models (see Figure 1): a 'hybrid collection' model for the balanced acquisition of printed materials, full-text materials, electronic publications, and web information resources; a 'cooperative linking collection' model for efficiently collecting special materials whose users or issuing bodies are limited, including materials for the information-poor (persons with disabilities); a 'collection evaluation' model for maintaining the balanced development, preservation, and quality of the national collection; and the codification of the collection development policy.

Figure 1. Basic framework for the national collection development policy

4.1 Building a Hybrid Collection

A national library aiming at a hybrid collection, in which offline information materials and online electronic publications can be used in an integrated way, must establish a new collecting policy for electronic publications alongside traditional media. In recent years countries around the world have added projects for collecting and preserving Internet digital resources to their existing collection development policies; representative examples are MINERVA in the United States, CEDARS in the United Kingdom, and PANDORA in Australia. The NLK has pursued the OASIS (Online Archiving & Searching Internet Sources) project since 2004, collecting digital resources such as websites and web documents.

1) Offline collecting
For offline materials, the current collecting methods, namely legal deposit (submission of materials), purchase, donation, international exchange, and the reproduction and preservation of old Korean materials held abroad, should be maintained, while collecting strategies by subject, type, and format are established in parallel. As the statutory national library, the NLK should presuppose comprehensive collecting while also drawing up mid- and long-term strategies for the subjects and types to be collected intensively. For the deposit operation that underpins NLK acquisitions, the areas requiring strategic focus are the strengthened collection of un-deposited materials, publicity to promote deposit, and the enforcement of deposit obligations; these three should be planned and implemented in that order.
2) Online collecting
National libraries abroad have collected and preserved online digital resources since 1996, providing user services within the scope permitted by copyright; because the range of digital resources online is vast, they have drawn up selection guidelines suited to their own countries and collect only the resources that satisfy them. The NLK likewise needs to develop a systematic and efficient collecting model for online electronic publications. The NLK's current collecting principles should remain the basis of this work: measured against all electronic publications produced and distributed, what has been collected so far is quantitatively small, but maintaining the basic principles is desirable for the sustained pursuit of the collecting policy.

Digital information resources include text, databases, photographs and moving images, sound, pictures, software, and web pages. Most of these resources have lasting value and importance, and their continued preservation requires deliberate production, maintenance, and management; preserving them in this way is what is meant by a digital archive. A digital archive is not simply a repository or warehouse of digitized materials: whereas analogue-era archives were built for storage, the archive demanded by the digital era not only supports the active use of materials but allows materials to be combined to create new ones. By organizing Korea's distinctive cultural source materials into databases by medium (sound, text, still image, moving image) and by subject, and making them usable from anywhere in the world, the NLK's digital archive can serve in future as part of the nation's intellectual capacity.

Web resources are convenient to use, but frequent change means that valuable information is deleted or altered, making collection and preservation difficult. They should therefore be recognized as a major object of collection development for long-term use as cultural resources, collected systematically, and preserved as a documentary heritage for future generations. Efforts to collect and preserve web resources began in the 1990s, led mainly by national libraries: Canada's EPPP in 1994, followed by Australia's PANDORA, the Internet Archive (IA) in the United States, and Sweden's Kulturarw3, marked the start of web archiving in earnest.

Korea's representative web archiving initiatives are the NLK's OASIS and the Information Trust pursued in the civil sector. As the national library, the NLK began OASIS in 2001 to collect and preserve valuable Internet resources and opened the public service in 2006. The Information Trust was a project carried out in the civil sector to archive valuable digital resources produced by individuals and organizations.

Web archiving requires technical staff with a range of relevant skills, together with a leader able to coordinate their joint work. For digital materials, deposit-based collecting has not yet been established, so neither a census of digital materials nor funding for their collection is in place; with the revised Library Act, however, the collection of digital materials is expected to begin in earnest, centred on the National Digital Library opened in May 2009.

The current total volume of digital materials, covering e-books, e-journals, databases, web information resources (homepages, websites), music files (MP3 etc.), video materials, images, and electronic documents of all kinds, is estimated through the search engines GOOGLE and NAVER at 9.26 billion items, with the annual collecting volume estimated at 10% of that figure, 920 million items (Kwak et al. 2008). Collecting the estimated total under these circumstances requires a special budget and dedicated staff.

4.2 Building a Cooperative Linking Collection

With its mission of comprehensive collecting, the NLK has faithfully built an ownership-centred collection over several decades of history. As the information environment changes, however, the scope of the national collection must expand so that access and linking through digital information technology are possible, not ownership alone. Materials such as grey literature and alternative-format materials for persons with disabilities are ones the NLK is obliged to hold, yet their sources are diverse, their distribution methods are restricted, and they are already held, scattered, across other domestic institutions. Because the users of such materials are limited, it is more efficient for preventing duplicated investment and achieving complete bibliographic control for the NLK to permit access through cooperation with other institutions than to hold everything itself. The concept of the collection needs to be widened from possession to access and linking, and the collecting policy should adopt strategies of cooperatively built bibliographic databases and, to the extent copyright allows, shared full text.

This section proposes ways of building cooperative linking collections for grey literature, materials for persons with disabilities, and Korea-related materials held abroad, all of which lend themselves to the widened concept of access and linking. Korea already has cooperative experience in building bibliographic databases and providing interlibrary loan and document delivery through the national digital library projects, so the technical infrastructure at national level is in place; what matters is a division of roles that takes account of the administrative characteristics and standing of the participating institutions, the will to cooperate, and the designation of a lead institution through consultation. Since the Library Act makes cooperation between cultural institutions and among library types a statutory duty, requiring them to make facilities and materials available to the public so far as this does not impede their founding purposes, problems arising from such administrative differences should be resolvable.

4.2.1 Cooperative linking collection for grey literature

1) Theses and dissertations
Korea's representative collectors of theses and dissertations include the NLK, the National Assembly Library, and KERIS. KERIS's Research Information Sharing Service (http://www.RISS4U.net) provides search and full-text or full-text-linking services for 737,156 theses from 191 university libraries. The National Assembly Library holds 1,051,559 volumes of master's and doctoral theses as of 30 April 2009, steadily adds more than 50,000 volumes a year, and provides catalogue records for 1,245,325 theses issued since 1945.

The NLK can cooperate with these two institutions to build a database combining links and shared bibliographic records, providing more complete access. In particular, as universities' dCollection expands and spreads, the share of print donations is expected to decline, so shared bibliographic records and an expanded link collection with KERIS, the National Assembly Library, and the university libraries can lay the groundwork for bibliographic control of theses and for user services.

2) Government publications
There is at present no policy body for the integrated management of government publications, and collecting is not unified, making the information difficult to find. Overall control of cooperative collecting of government publications should therefore rest with the NLK, with roles divided among the current collecting institutions, the National Assembly Library and the National Archives of Korea, and the collection built through the sharing and linking of bibliographic records.
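By way of illustration only, a record in such a shared database might pair one bibliographic description with distributed holdings and access links, along the following lines; the field names and sample values here are hypothetical, not an existing standard.

    # Illustrative union-catalogue record for the cooperative model proposed
    # above: one shared bibliographic description, with holdings entries that
    # link to whichever partner institution holds or provides the item.
    record = {
        "record_id": "GOVPUB-2009-001234",            # hypothetical identifier scheme
        "title": "Annual report of a central ministry",
        "issuing_body": "Central government ministry",
        "holdings": [
            {"institution": "National Library of Korea",  "method": "legal deposit"},
            {"institution": "National Assembly Library",  "method": "donation"},
            {"institution": "National Archives of Korea", "method": "deposit under the records act"},
        ],
        "access": {
            "full_text_url": "https://example.org/...",   # only where copyright permits
            "note": "link-only access for in-copyright items",
        },
    }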
Under this division, the NLK collects the publications of the central government ministries, the National Assembly Library those of research institutes and public bodies, and the National Archives the records of government agencies; applying current law as it stands, the NLK retains legal deposit, the National Assembly Library donation, and the National Archives deposit as their main collecting methods. The bibliographic records of the publications each institution collects are built into a union catalogue database for joint use or linking, forming a cooperative collection; in the longer term the circle of cooperating institutions should be widened until a complete collection of government publications is achieved.

3) Research reports
Research reports, which account for a large share of grey literature, must also be collected as part of the national collection. As a first priority, a union bibliographic database and linking should be pursued with KISTI, SERI, KOSEF, and other institutions that produce and collect reports in large volume, with cooperating institutions added continuously to build a research report collection of national scale. Cooperation and the division of roles should follow the subjects and types of each institution's holdings, with the NLK leading the control of the cooperatively maintained bibliographic records, the allocation of subjects among institutions, and smooth linking.

For each type of grey literature, the NLK's strategy for the cooperative linking collection should be to lead the improvement and reinforcement of existing collecting methods such as deposit and donation, to benchmark foreign practice such as SIGLE, the British Library, and NTIS, and to consider establishing a dedicated unit responsible for grey literature database building and standardization (Nam 2008; Shin 1999). Figure 2 outlines the cooperation model for grey literature.

Figure 2. Building a cooperative linking collection for grey literature

4.2.2 Cooperative linking collection for persons with disabilities

Within the NLK, the National Library Support Center for the Disabled, established in 2007, has since been pursuing library service policy for persons with disabilities; the production and distribution of braille, audio, large-print, sign-language, and electronic materials; and cooperation among related domestic and foreign institutions and libraries for services to the disabled. In April 2009 a reading room for users with disabilities was also opened, providing space and equipment for information use and library services. Despite these active efforts, the NLK's holdings of alternative-format materials for the disabled remain extremely small.

The areas most in need of collecting effort are materials for visually and hearing impaired users, whose access to information is most severely impeded; this study therefore limits its discussion to a collecting model for materials for these two groups. A concrete development strategy for materials for the disabled has recently been proposed (Yoon 2008). The 'Comprehensive Library Advancement Plan 2009-2013', in its strategy for activating library services to the disabled and other vulnerable groups, details the production and distribution of alternative materials: content development for reading promotion among children and young people, scholarly content for students, researchers, and professionals, and content supporting adults' leisure and cultural activities (Committee on Library and Information Policy 2008).

Unlike general collections, materials for the disabled are limited to a subset of users, but as the Framework Act on Closing the Digital Divide, the Library Act, the Welfare of Disabled Persons Act, the Act on Guarantee of Promotion of Convenience of Persons with Disabilities, the Aged, and Pregnant Women, the Act on the Prohibition of Discrimination against Persons with Disabilities and Remedy for Infringement of Their Rights, and the Copyright Act all provide, there must be no discriminatory treatment in the use of and access to information; active collecting must therefore proceed in formats appropriate to each type of disability. The National Library Support Center for the Disabled has defined the types of alternative materials by disability in its service standards and guidelines.

Materials for the disabled can be used only after conversion into alternative formats, but the diversity of those formats and the added costs of production labour and production tools make it difficult for individual libraries to collect and provide them on their own. Cooperation can proceed in two main ways: first, the joint collection and production of alternative materials; second, the joint building and sharing of bibliographic databases, linking, and interlibrary loan. The Support Center should be the hub of cooperation, with braille libraries and regional central libraries as the nodes of a nationwide cooperative network, dividing the roles among themselves.

Since most public libraries support users with disabilities, the number of cooperating institutions will keep growing; representative institutions that could join the collecting cooperation at present include the following (National Library Support Center for the Disabled 2008): the Daegu University Braille Library (braille textbooks), the Busan Braille Library for the Visually Impaired (audiobooks), the Seoul Welfare Center for the Visually Impaired (braille study materials), the Siloam Center for the Blind (braille study materials, specialist books), the Hasang Welfare Center for the Disabled (audiobooks, braille language-learning serials), the Korea Welfare Center for the Visually Impaired (audiobooks, braille language-learning serials), the Korea Blind Union (audio description, audiobooks), the Korea Braille Library (DAISY books, braille books, publisher of the union catalogue of braille books), the LG Sangnam Library's 'Library That Reads Books Aloud' (DAISY books, etc.), the Korea Association of the Deaf (Deaf Korea), the Korea Sign Language Broadcasting Station, Korea Deaf TV, Cheongeum Hall, the Samsung Sorisaem Welfare Center, the Daegu Welfare Center for the Hearing and Speech Impaired, the Jeju Welfare Center for the Deaf, and the Seodaemun Municipal Welfare Center for the Deaf.

Although many of the institutions listed above produce alternative materials, no accurate statistics on alternative material production and services can be found. The Korea Braille Library, the most representative producer holding the largest volume of alternative materials, produces some 3,000 braille titles and 12,000 DAISY titles a year; as of 2008 it holds 21,631 braille titles and 950 titles in 1,900 volumes of audiobooks and cassette tapes, and it also compiles the union catalogue of braille books.

4.2.3 Cooperative linking collection for Korea-related materials abroad

A national library has the duty to collect materials about its own country published abroad and materials related to its country held abroad. The legal basis on which the NLK pursues the acquisition of overseas Korean-studies materials lies in Chapter 3 of the Library Act, Article 19 on the functions of the NLK ('collection, provision, and preservation of domestic and foreign materials'), and the following tasks are currently being carried out: surveying and collecting Korea-related materials published abroad; collecting Korea-related records held by the U.S. National Archives and Records Administration; surveying and collecting materials held by Korea-related research institutions and individuals abroad; and surveying and collecting materials related to the territory of the Republic of Korea.

The NLK's current collecting of Korea-related materials is quantitatively very weak. To preserve and transmit the national documentary heritage by systematically collecting modern-history materials related to Korea held abroad, and to support Korean-studies research effectively by collecting Korea-related materials published abroad, a more active and complete collecting strategy is needed: systematic searching for Korea-related materials, cooperation with overseas institutions holding Korean-studies collections, and active collecting through the dispatch of Korean-studies specialists, pursued in parallel with a union bibliographic database and a linked collection built with the related domestic institutions.

The first task the NLK must carry out is to compile information on the institutions holding Korean-studies materials and to assess the quantity and quality of their holdings, broken down by subject, format, and similar categories.
On that basis, a year-by-year strategy for concluding cooperation agreements with foreign institutions should be drawn up, expanding the number of countries and institutions step by step. The collections of the Library of Congress, the Smithsonian Institution, Harvard University, and other institutions building Korea-related collections should of course be surveyed, as should those of universities and research institutes across China, Europe, and the rest of the world, and unheld materials should be reproduced. The collecting strategy must also decide how far records and archival materials are to be included.

Surveying and purchasing Korea-related materials held abroad runs into many difficulties if it relies only on bibliographies, the Korean collection catalogues of major libraries, and the catalogues of specialist publishers' lists and large online and offline bookshops; Korean-studies specialists should therefore be dispatched abroad to carry out surveying and collecting continuously on site, and a system should be established whereby collecting is supported by the Korean Cultural Centers and cultural attachés abroad. The United States, for example, actively uses its cultural centers around the world to collect American-related materials as well as local materials; using the cultural centers would both save budget and raise effectiveness. There are currently Korean Cultural Centers in 12 locations in 9 countries, and cultural attachés at embassies or consulates in 22 countries.

The collection of Korea-related materials held abroad has generally been carried out through the NLK, but from the 1990s onward several domestic institutions took it up in earnest; the institutions known to collect Korean-studies materials number 11, including the NLK, the National Assembly Library, the National Archives, and the National Institute of Korean History (Cho 2007). The approximate collecting status of the main institutions is shown in Table 3.

Table 3. Institutions collecting Korea-related materials abroad and their holdings
- National Library of Korea: 39 record groups; about 200,000 pages collected annually; 4,000 old Korean titles reproduced; 5,000 rare volumes reproduced. Source countries and institutions: NARA (U.S.); China, the U.S., France, Russia.
- National Assembly Library: 35 volumes, 74 microfilm rolls, 1,014 microfiche. Sources: U.S., Japan, China.
- National Institute of Korean History: 3,297,633 pages / 3,231 rolls. Sources: the Americas (NARA, the national archives of Canada); Japan (National Diet Library, University of Tokyo, National Institute of Japanese Literature, Imperial Household Agency Archives (Shoryobu), Shin'aijuku, reprint materials); China (Beijing Municipal Archives, National Library of China); Russia and Europe (International Institute of Social History in Amsterdam, the Tashkent document research institute, the National Central Archives of Uzbekistan).
- Jangseogak Archives, Academy of Korean Studies: 2,860 pages. Sources: NARA; related Chinese universities; Japan (Hokkaido University, University of Tokyo, Gakushuin University, etc.).
- National Archives of Korea: 3,774 microfilm rolls; 1,343,931 document files; 1,238,306 drawings; 5,290,190 cards (personnel record cards, military service cards, etc.). Sources: Russia (defence ministry document institute); China (Second Historical Archives, Jiangsu Provincial Archives, Nanjing Municipal Archives, Yanbian Municipal Archives); Japan (Diplomatic Archives); the U.S., the U.K., Mongolia, Vietnam, etc.
- Institute of Korean Independence Movement Studies, Independence Hall of Korea: books 11,963; documents 29,622; manuscripts 88; calligraphy and paintings 505; photographs 13,691; weapons 179; personal effects 978; entrusted items 2,062; reproductions 10,518; other 11,480. Sources: Japan, China, North Korea, etc.

Cooperation among these institutions should prevent duplicate collecting and enable more systematic and strategic collecting; if possible, a cooperation model such as that in Figure 3, built around a union catalogue database, is urgently needed so that Korea-related materials can be collected comprehensively.

Figure 3. Building a cooperative linking collection for Korea-related materials abroad

Because Korean-studies archival materials differ from existing library classification schemes, a separate scheme must be operated: materials should be arranged under the General International Standard Archival Description (ISAD(G)), following the principles of respect for provenance, respect for original order, and multilevel control. The NLK already observes representative standard schemes and these basic principles in its current collecting, and it should also take charge of controlling the incidental standardization work among the cooperating institutions. Collected materials should be databased, with bibliographic databases throughout and full-text databases for materials cleared of copyright, and a separate database of institutions holding Korea-related materials should also be built.

4.3 Collection Evaluation

4.3.1 Introducing the conspectus

Collection evaluation, the process of judging a collection's adequacy for fulfilling the library's mission, has been regarded as important in every library: the collection is the library's reason for being, and an excellent collection is essential to achieving its purpose. Evaluation measures the degree of past collecting effort, reviews current programmes, and provides information for setting development priorities to achieve collection goals. Evaluating a collection requires consideration of the library's type, mission, and size; since the NLK's mission is the comprehensive collection, preservation, and use of the nation's materials, it must assess the character, composition, and depth of the collection not simply against user demand but at the national level, maintaining strengths and remedying weaknesses.

From the mid-1970s, libraries interested in subject-level collection strength and collection levels for the sake of interlibrary cooperation sought quantitative, standardized evaluation techniques for subject collections. The most widely known conspectus is that prepared by the Research Libraries Group (RLG) in the United States, which is suited to describing collection levels by subject field and became a widely used collection evaluation tool.

In the 1980s the conspectus came into use not only in North America but in the U.K., Australia, New Zealand, France, the Scandinavian countries, the Netherlands, Hungary, Belgium, Italy, and elsewhere; in 1987 the association of European research libraries formally adopted the conspectus and formed a Conspectus Working Group. In the 1990s, IFLA meetings examined a worldwide conspectus. WLN (Western Library Network) developed PC-based software usable for collection evaluation in individual libraries, and after the merger of WLN and OCLC Pacific in 1999 the conspectus service was maintained and managed by OCLC/WLN; since RLG was merged into OCLC in July 2006, the conspectus has been operated under OCLC's overall responsibility.

Recently, as the share of electronic resources has grown, studies revising the existing conspectus have appeared, notably the models of Johnson (2004), Clayton & Gorman (2002), and Biblaz (2001), which retain the existing collection level codes as far as possible while presenting service levels by resource type.

4.3.2 Structure of the conspectus

The WLN Manual (Powell 1992, 27) describes the conspectus as 'an organized process for systematically describing and analyzing library collections using standardized definitions'.
Wood (1992, 6) defined the conspectus as "a multifaceted, multipurpose, collection-centered assessment process that provides the results of surveys of library collections." The conspectus is a simple device made up of a few elements: in general, a subject classification code, subject descriptors, a collection level code, a language coverage code, and comments.

1) Subject classification code

The subject classification codes of a conspectus follow the notation of the classification scheme each library uses. The conspectus subjects, which apply the hierarchical structures of LCC and DDC, comprise 24 divisions, some 500 categories, and some 5,000 subjects. The WLN conspectus provides subject indicators for both LCC and DDC.

2) Subject descriptors

Conspectus subject descriptors are made to correspond as closely as possible to the subject terms of the classification scheme; descriptors are added or modified where necessary.

3) Collection level

The conspectus generally divides the collection level for each subject into several grades and describes the strength of the collection using a standardized scale of collection level codes. The RLG conspectus is used by large libraries such as the Library of Congress and the National Library of Australia and expresses each collection level with a code on a six-point scale from '0' to '5'.

4) Language coverage code

The conspectus assigns language coverage codes according to the language characteristics of the collection. The language code is recorded in combination with the collection level code, and this notation is used mainly by research libraries whose collections fall at levels '3' to '5' (from the instructional support level to the comprehensive level). The RLG and WLN conspectuses assign language coverage codes, centered on English, according to the languages of the materials in the collection.

5) Collection description

When collections are described with the standardized collection level codes alone, it is difficult to indicate or explain special circumstances or exceptional situations. The conspectus therefore provides devices for overcoming the limits of the subject arrangement and explaining aspects of the collection that the level codes cannot convey; the RLG conspectus provides a 'comments' field in which additional information about the collection can be recorded. (A sample entry combining these elements is sketched later in this section.)

4.3.3 Applying the conspectus to the national collection

1) Conspectus levels

Evaluating a collection requires selecting and applying a technique suited to the library's type, size, and user characteristics. Since the National Library of Korea builds a collection at the level of a comprehensive research library, we propose applying the six RLG levels shown in <Table 4>.

<Table 4> The RLG conspectus

Level 0 - Out of scope: materials outside the library's collecting scope that are not collected.
Level 1 - Minimal level: very little is selected beyond the most basic materials.
Level 2 - Basic information level: provides introductory knowledge of and an overview of a subject.
Level 3 - Instructional support level: adequate for maintaining and supplementing knowledge of a subject field systematically; below the level needed to support specialized research; adequate for independent study, undergraduate and graduate instruction, and the scholarly needs of public and special library users.
Level 4 - Research level: includes research reports, new research results, scientific findings, and other information useful to researchers, together with the major sources required for dissertations and independent research; supports doctoral work and other original research.
Level 5 - Comprehensive level: the library strives to hold all recorded knowledge, that is, all significant materials, in all applicable languages within a narrowly defined subject field; the aim is to maintain a special collection with comprehensive holdings in that field.

2) The conspectus subject system

The OCLC/WLN conspectus, currently the most widely used in the world, provides both LC and DDC numbers for its subjects. The National Library of Korea can use this conspectus with the DDC portion replaced by KDC. A conspectus is not used once and discarded: when new subjects emerge in the course of evaluation, they are newly described and accumulated, so that the conspectus is maintained as a database.

In constructing a conspectus for the national collection, this study developed one by applying the OCLC/WLN conspectus. OCLC/WLN defines 24 subject categories in all; of these, we developed a conspectus for the categories belonging to the social sciences, namely business and economics, education, geography and earth science, law, library and information science, political science, psychology, and sociology. Subject terms were supplemented from the relative index of the 5th edition of KDC, and KDC numbers were assigned as the SUBJECT RANGE; for future international cooperation and the processing of Western-language books, DDC and LCC numbers were recorded alongside them. A partial example of the conspectus developed is shown in <Table 5>.

<Table 5> A conspectus for evaluating the national collection (partial example from the social sciences)

DIVISION / CATEGORY: SUBJECT RANGE (CALL NUMBER) as KDC; LCC; DDC
- Business & Economics / Statistics: KDC 310-310.09; LCC HA0-9999; DDC 310-311, 313-319
- Business & Economics / Economic theory: KDC 320.1; LCC HB0-9999; DDC 304.6, 307.2, 312, 330-330.8, 332.83, 338.5, 339.5, 339-339.2
- Business & Economics / Economic history & conditions: KDC 320.9-320.906; LCC HC0-9999; DDC 330.9, 339.3-339.41, 339.43-339.49
- Business & Economics / Economics (industry, land use, labor): KDC 323, 321.32, 321.5; LCC HD0-2320; DDC 333-333.6, 333.73-333.74, 333.76-333.77, 333.9-333.91, 338-338.1, 338.9, 658-658.2, 658.4, 658.9, 659.2
- (further rows omitted)
- Sociology / Criminology, criminal justice: KDC 364.4, 367; LCC HV6001-9999; DDC 362.28, 363.2, 363.42, 364-365

3) Methods of determining the collection level

The methods of evaluating collection levels with the conspectus are as follows.

① Shelflist analysis: the library's shelflist is analyzed to survey, for each subject in the collection under evaluation, the number of items held, the language coverage, the publication years, and so on.

② Bibliography checking: specialist bibliographies are selected or compiled for the evaluation and checked against the catalog of the library under evaluation to survey the proportion of the listed materials held.

The WLN Collection Manual specifies quantitative guidelines as auxiliary tools for determining collection levels: for monographs, the level is determined by the number of volumes per main class and by the proportion of holdings among the titles listed in standard subject bibliographies; for serials, the level is determined by the extent of holdings among the titles covered in indexing services. There are further guidelines for determining the level by such factors as the number of foreign-language items, currency, and the rate of acquisition. RLG prepared supplementary guidelines by discipline.
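To make the record structure concrete, a single conspectus entry can be sketched as a small record combining the five elements described in 4.3.2. The following minimal example is our own illustration, not an excerpt from the RLG or WLN documentation; the subject range is taken from <Table 5>, while the level codes, language code, and comment are hypothetical values chosen for demonstration.

# A hypothetical conspectus entry combining the five elements described above.
# The KDC range follows <Table 5>; the remaining values are illustrative only.
conspectus_entry = {
    "subject_classification_code": "364.4, 367",          # KDC subject range
    "subject_descriptor": "Criminology, criminal justice",
    "collection_level": {"current": "3W", "goal": "4W"},  # RLG level + language code
    "language_coverage": "W",   # wide selection of languages
    "comments": "Audiovisual holdings weak; pre-1980 serials incomplete.",
}

Accumulating entries of this kind for each subject, as noted above, is what turns the conspectus into a maintainable evaluation database rather than a one-off survey.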
4) Language coverage code

The language coverage code of a collection indicates the characteristics of the language range of the materials it contains. The RLG conspectus assigns the following codes according to the language coverage of the collection: collections in which English-language materials predominate (E); collections holding some foreign-language materials in addition to English (F); collections selecting a wide range of materials in many languages (W); and collections consisting of materials in a single foreign language other than English (Y).

The language coverage of a library's collection is determined by the volume of publication in each language within the subject field concerned and by the language range of the publications. The Resources and Technical Services Division of the American Library Association defines the codes E, F, W, and Y.

Language coverage codes can be adjusted to suit the languages used in a given country and the character of the library. To make conspectus use convenient for its member libraries, the Canadian Association of Research Libraries developed its own codes: French (P); English and French (V); English plus other foreign languages (S); French plus other foreign languages (T); and foreign languages (X).

In addition to Korean, the National Library of Korea will need to provide for Asian languages such as Japanese, classical Chinese, and Chinese; for English; for European languages; and for other languages.

4.4 Codifying the Collection Development Policy

As we saw above, most national libraries of the major countries post a written statement of their collection development policy on their websites. The National Library of Korea's website gives brief guidance on collection methods and status, but this falls short of a policy statement.

A written collection policy clarifies the duties of the departments concerned and the relationships and directions among departments, allows work to be carried out consistently, and, applied to selection and acquisition, planning, relations with suppliers, and external cooperation, enables collection development to proceed coherently. It also provides the basis on which the continuity of the collection can be maintained as time passes and staff change. A written policy may include a preface, subject collection policies, format collection policies, special collection policies, and mid- and long-term development plans, with contents such as the following.

4.4.1 Preface

The preface must be clear and concise. It should show that the purpose of the collection accords with the national library's mission and state the library's legal and administrative authority. It should set out the direction of collection development to be pursued in achieving the national library's purpose; describe the methods and scope of collecting and the preservation of the collection; and include provisions on the withdrawal of materials and the revision of the collecting policy.

The preface to the National Library of Korea's collection development policy should describe the purpose and standing of the national representative library under Article 18(2) of the Libraries Act (establishment of the National Library of Korea); the Library's duties under Article 19(1); the methods of acquisition under Article 20 (legal deposit and the collection of online materials) and by purchase, donation, and international exchange; the special collections policy; the building of cooperative linked collections for resource sharing; and collection management matters such as withdrawal.

4.4.2 Subject and format collection development policies

The subject collection policy describes, for each field, the scope of collection development, the strengths of the collection, the collecting policy, weaknesses, materials excluded from collecting, and sources of acquisition. The format collection policy describes, for each medium, the purpose of collection development, definitions and categories, development methods, means of access, and cooperation with other institutions.

What matters is how the subjects are divided. This study proposes dividing the subject fields according to the classification scheme the National Library of Korea currently uses and presenting a collection development policy for each field. By format, policies should be established beginning with the most problematic areas: online materials, grey literature, and government publications rank as the priorities for format policies. If branch libraries of the National Library of Korea are established, each branch should take charge of its assigned subjects and improve and advance the collection development policy for them.

4.4.3 Electronic resource development policy

The collection development policies of national libraries abroad show that they have already established policies for electronic resources and are collecting them systematically. Among its five supplementary guidelines (dust jackets, electronic resources, microforms, non-book materials, and web archiving), the Library of Congress addresses in 'electronic resources' the definition and collecting categories of digital materials and presents concrete collecting guidelines. The British Library likewise presents a direction and collecting guidelines for building its 'Web Collection'. The National Library of Australia presents a particularly systematic and strategic policy by distinguishing 'Australian Electronic Resources' from 'Overseas Electronic Resources'.

5. Conclusions and Suggestions

The National Library of Korea, the nation's representative library, has the duty to build the national collection by collecting through legal deposit all materials produced in Korea, both printed and electronic; by collecting the national documentary heritage, such as old books and cultural archetype materials; and by collecting materials written in Korean and materials about Korea published abroad.

This study surveyed the collection development policies of the national libraries of the major countries and systematically synthesized the prior research to derive a basic model composed of four sub-models: building a hybrid collection, building cooperative linked collections, collection evaluation, and codifying the collection development policy.

The hybrid collection integrates the offline materials the library physically holds with online electronic publications. For the various kinds of grey literature, for materials for users with disabilities, whose audience is limited, and for Korea-related materials held abroad, we proposed a linked collection model based on cooperation among the related institutions in addition to ordinary collection methods such as legal deposit. For the collection evaluation essential to building an excellent collection, we proposed the conspectus method, which suits a comprehensive collection and is established as the worldwide standard. Finally, we proposed codifying the national collection development policy so that systematic and consistent collection development can be pursued over the long term.

The National Library of Korea has built today's collection through more than sixty years of effort since liberation. Despite this short history its international standing has risen considerably, most recently with the opening of the National Digital Library, but compared with the advanced countries the collection remains insufficient. By maintaining its accumulated knowledge and procedures while preparing new, detailed collecting guidelines, the National Library of Korea will be able to build a more advanced national collection.

This study derived its collection development model within the current system. Given that Korea's circumstances differ from those of the advanced countries where bibliographic control was established early, and that legal and institutional change is unlikely in the near future, building a complete national collection will require the investment of enormous staffing and budget. The division of roles, the standardization, and the union bibliographic database required for the grey literature, the alternative materials for users with disabilities, and the overseas Korea-related materials that are the objects of cooperative linked collection building will likewise demand staff and budget. The collection of digital materials is at its earliest stage, and the publicity, guidance, and consultation with stakeholders that it requires will add to the library's workload.

To realize each sub-model of the basic national collection development model proposed here, follow-up research and concrete action plans must be drawn up. To pursue the building of the hybrid collection, we suggest implementing a system of constant monitoring of domestic publications aimed at the complete legal deposit and collection of both online and offline publications.
To build the cooperative linked collections of grey literature, alternative materials for users with disabilities, and overseas Korea-related materials, a process is needed in which steering committees composed of the participating institutions establish the collecting scope, the roles, and the mid- and long-term detailed plans and form consultative bodies. Collection evaluation, which assesses the present level of the collection and sets its goals, is a precondition of developmental collection building; the social science conspectus presented in this study should be extended to all fields and the evaluation work put into practice step by step. The codification of collection development policy by subject and by format, which sustains collection development, should be based on the methods and procedures now in use and, with reference to foreign examples, be given content detailed enough to serve as a guide. Subject fields in which the collection is currently weak, materials whose collecting scope and principles are unclear, and new information sources should be given priority and developed in turn.

References

(Entries [1]-[14], originally published in Korean, are given in the English translation or romanization supplied by the authors.)

[1] Seung-Jin Kwak et al. 2008. Digitaljaryo Napbonchegye mit Yiyongbosanggumae gwanhan Yeongu. Seoul: The National Library of Korea.
[2] The National Library Support Center for the Disabled. 2008. Doseogwan Jangaeinservice Gijun mit Jichim. Seoul: The National Library Support Center for the Disabled.
[3] You-Seung Kim. 2007. "A Study of Legal Issues for Web Archiving." Journal of the Korean Society for Library and Information Science, 41(3): 5-24.
[4] Yoong-Joon Nam. 2002. "A Study on the Utilization of the Grey Literature in Digital Age." Journal of the Korean Library and Information Management, 19(4): 233-255.
[5] Presidential Committee for the Library & Information Policy. 2008. Doeseogwanbaljeonjonghapgehoek: 2009-2013. Seoul: Presidential Committee for the Library & Information Policy.
[6] Jin-Hee Park. 1997. A Study of Collection Evaluation Using Conspectus Methodology. Ph.D. diss., Yonsei University.
[7] Hye-Rhan Seo. 2003. "Legal Deposit and Preservation of Digital Materials in Various Countries." Journal of the Korean Library and Information Management, 20(1): 379-399.
[8] Eun-Ja Shin. 1999. "A Study on the Use of Electronic Grey Literature." Journal of the Korean Library and Information Management, 16(3): 83-98.
[9] Hee-Yoon Yoon. 2001. "Archiving Strategies of Electronic Publications in the National Library of Korea." Doseogwan, 56(3): 3-48.
[10] Hee-Yoon Yoon. 2003. "A Study on the Reform Model of Legal Deposit System in Korea." Journal of the Korean Society for Library and Information Science, 37(4): 23-52.
[11] Hee-Yoon Yoon. 2008. Jangaeinyong Daechejaryo Gaebal Jiwonbangan Yeongu. Seoul: The National Library of Korea.
[12] Duk-Hyun Chang. 2009. "A Study on a Revised Conspectus Model for the Assessment of Electronic Resources." Journal of the Korean BIBLIA Society for Library and Information Science, 20(2): 31-44.
[13] Hye-Kyoung Jo. 2007. A Study for Integrated Use of Overseas Records on Korea During the Japanese Rule: Focus on Collection and Description. M.A. thesis, Myoungji University.
[14] Hyun-Jin Hong, & Young-Hee Noh. 2008. "A Study on Modeling a Unified Policy Information Service System in Korea." Journal of the Korean Library and Information Science Society, 42(1): 95-125.
[15] Biblarz, D. 2001. Guidelines for a Collection Development Policy Using the Conspectus Model. The Hague: International Federation of Library Associations and Institutions, Section on Acquisition and Collection Development.
[16] British Library. 1996. Proposal for the Legal Deposit of Non-print Publications to the Department of National Heritage from the British Library. London: British Library.
[17] British Library. 2002. Extension of Legal Deposit to Non-print Materials. [online]. [cited 2002.12.13].
[18] Calanag, M. L., Tabata, K., & Sugimoto, S. 2002. "Linking Collection Management Policy to Metadata for Preservation." Proceedings of the International Conference on Dublin Core and Metadata for e-Communities, 35-43.
[19] Carroll, B., & Cotter, G. 1993. "A New Generation of Grey Literature: The Impact of Advanced Information Technologies." Proceedings of the First International Conference on Grey Literature, 5-17.
[20] Cassell, Kay Ann. 2004. "Report on the 6th International Conference on Grey Literature." Collection Building, 24(2): 70.
[21] Chillag, J. P. 1985. "Grey Literature: Its Supply and Bibliographic Access at the BLLD." Catalogue and Index, 78/79: 6-8.
[22] Clayton, P., & Gorman, G. E. 2002. "Updating Conspectus for a Digital Age." Library Collections, Acquisitions & Technical Services, 26: 253-258.
[23] Committee on an Information Technology Strategy for the Library of Congress. 2000. LC 21: A Digital Strategy for the Library of Congress. Washington, D.C.: National Academy Press.
[24] Conference of Directors of National Libraries, Working Group. 1997. The Legal Deposit of Electronic Publications. Paris: Unesco.
[25] Conference of European National Librarians, Federation of European Publishers (CENL/FEP). 2002. Statement on the Development and Establishment of Codes of Practice for the Voluntary Deposit of Electronic Publications. [online]. [cited 2002.6.1].
[26] Dominic, J. et al. 2005. "Access to Grey Content: An Analysis of Grey Literature Based on Citation and Survey Data, A Follow-up Study." Proceedings of the 7th International Conference on Grey Literature.
[27] Dupont, Henrik. 1999. "Legal Deposit in Denmark: The New Law and Electronic Product." LIBER Quarterly, the Journal of European Research Libraries, 9(2). [online]. [cited 2002.1.20].
[28] Evans, G. Edward, & Saponaro, Margaret Zarnosky. 2005. Developing Library and Information Center Collections. 5th ed. Westport, Conn.: Libraries Unlimited.
[29] Ferguson, A. W., Grant, J., & Rutstein, J. S. 1988. "The RLG Conspectus: Its Uses and Benefits." College & Research Libraries, 49(2): 197-206.
[30] Gatenby, Pam. 2002. "Legal Deposit, Electronic Publications and Digital Archiving: The National Library of Australia's Experience." 68th IFLA Council and General Conference, Glasgow, 18-24 August 2002. [online]. [cited 2002.12.13].
[31] Hakala, Juha. 1999. "Electronic Publications as Legal Deposit Copies." Tietolinja News, 1/1999. [online]. [cited 2001.11.21].
[32] IFLA Committee on Copyright and Other Legal Matters (CLM). 2000. The IFLA Position on Copyright in the Digital Environment. [online]. [cited 2002.12.13].
[33] IFLA and IPA. 2002. Preserving the Memory of the World in Perpetuity: A Joint Statement on the Archiving and Preserving of Digital Information. [online]. [cited 2002.12.13].
[34] IFLA Headquarters. 2005. Libraries for the Blind in the Information Age: Guidelines for Development. The Hague: IFLA Headquarters.
[35] Jasion, Jan T. 1991. The International Guide to Legal Deposit. Aldershot: Ashgate.
[36] Johnson, P. 2004. Fundamentals of Collection Development & Management. Chicago: ALA.
[37] Joint, Nicholas. 2006. "Legal Deposit and Collection Development in a Digital World." Library Review, 55(8): 468-473.
[38] Owen, J. Mackenzie, & Walle, J. v. d. 1996. Deposit Collections of Electronic Publications. Brussels: European Commission.
[39] Wood, D. 1982. "Grey Literature: The Role of the British Library Lending Division." Aslib Proceedings, 34(11/12): 459-465.
work_v5y5tqeet5hktkc6dvbwoimmqe ---- The Journal of Academic Librarianship, 2007, Vol. 33, No. 2, p. 228-242. ISSN: 0099-1333 (print), 1879-1999 (online). DOI: 10.1016/j.acalib.2006.12.007 http://www.sciencedirect.com/science/journal/00991333/33/2 http://www.sciencedirect.com/science/article/pii/S009913330600228X Copyright © 2007 Published by Elsevier Inc.

New Strategies for Delivering Library Resources to Users: Rethinking the Mechanisms in which Libraries are Processing and Delivering Bibliographic Records

by Magda El-Sherbini and Amanda J. Wilson

Abstract

The focus of this paper is to examine the current library practice of processing and delivering information and to introduce alternative scenarios that may keep librarians relevant in the technological era. In the scenarios presented here, the authors will attempt to challenge basic assumptions about the usefulness of and need for OPAC systems, Web OPAC's, OCLC (as a major bibliographic utility), and consortia.
The authors will then identify ways that libraries can leverage their resources and available technology to create cost-effective ways of delivering improved services.

"The purpose of this paper is to introduce new models that might come to serve as alternative solutions to the existing systems of processing and delivery of information."

Literature Review

The world of librarianship has fundamentally changed since the advent of the Internet. Recently, some reports and articles have been published about the future of libraries, the future of cataloging and ways to revitalize and enhance the catalog, and retooling the cataloging workforce. 3 However, there is hardly any literature that specifically points to the possibility of radically reevaluating the entire technological infrastructure of the local and shared OPAC. New ideas emerging from the library community at this time look at ways of linking library catalogs and union catalogs, such as OCLC's WorldCat, to the Internet.

In his 1999 article "Building Earth's Largest Library: Driving into the Future," Steve Coffman presented a vision of the future library based on the Amazon.com business model. In his vision, he proposed that libraries cooperate to create a single catalog incorporating all library holdings. The need for local circulation, cataloging, and interlibrary loan systems would therefore be eliminated. With the cost and resource savings of the model, Coffman suggested that libraries enhance catalog records and focus collections to support their specific patron population. In addition, by applying this model, libraries would have the potential to "radically reduce the traditional costs of library operations, both by significantly reducing automation expenses, and by allowing libraries to restructure their physical collections to be more responsive to customer demand." 4

An initial critical reaction to Coffman's article pointed to some social and technological obstacles that would have to be resolved before his dream of building such a library could be possible. The primary concerns included the cost of implementation, the actual cost savings presented by Coffman, and the ability of technology to perform certain tasks better than a human. 5 Later, in 2000, Barbara Quint published an editorial entitled "With OCLC's New Strategy, Is the Earth's Largest Library in Sight?" in which she points to significant similarities between OCLC's strategies and the ideas put forth by Coffman. OCLC has since implemented many of Coffman's ideas by opening WorldCat to major search engines, such as Google and Yahoo! 6

Digitization of library collections and increasing full text access are making an impact on the user community. John Markoff and Edward Wyatt wrote an article in the New York Times on the agreement between Google and leading US research libraries, as well as Oxford University, to begin converting their holdings into digital format that would be freely searchable over the Internet. In their article, Markoff and Wyatt mentioned that this effort might be a "step on a long road toward the long-predicted global virtual library." The authors also included in their article a valuable quote from Michael A. Keller, Stanford University's head librarian, who stated, "Within two decades, most of the world's knowledge will be digitized and available; one hopes for free reading on the Internet, just as there is free reading in libraries today." 7
Thomas Frey, in his article "The Future of Libraries: Beginning the Great Transformation," presented an excellent view of the future of libraries from a user's perspective. He discussed ten key trends that are affecting the development of the next generation of libraries. In some of the trends, Frey emphasized that today's technology will be replaced by something new and that technology is constantly changing. He also emphasized that the demand for global information will continue to grow. His statement that "the Stage is being set for a new era of Global Systems" is becoming reality. 8

In his talk "Life beyond MARC: The Case for Revolutionary Change in Library Systems and Service," Roy Tennant gave a description of a future environment in which "libraries must switch their focus and resources toward the more efficient use of technology in their processes" to remain relevant in the technological age. 9

The sources cited above allow for some speculation concerning current thinking about the future of libraries. From the review of the literature the authors made the assumption that libraries and librarians have begun to think about options for the transformation of access to library information into the future. A gap still exists, however, in the discussion of how libraries will deliver their information to users.

Current Mechanism for Delivering and Accessing Bibliographic Information

In the history of delivering and accessing bibliographic information, libraries have used catalog cards as surrogate records. In the 1970s and 1980s, libraries began to automate their processing and created the online public access catalog (OPAC) from catalog cards. This was accomplished through massive retrospective conversion that transferred data from the card catalog format to a machine-readable format. The OPAC mechanism delivered the same bibliographic description of library holdings to users, with more features and methods for users to access information. During the same period, the concept of shared bibliographic records was developed and libraries recognized the merit and cost-effective benefits of sharing records. This sharing of bibliographic information was made possible through the introduction of two major bibliographic utilities, OCLC (Online Computer Library Center) and RLIN (Research Libraries Information Network). With the impending merger of OCLC and RLIN, OCLC will be examined here as the only bibliographic utility throughout the scenarios. 10

The idea behind creating bibliographic utilities was to generate a record for each library item once, then share this record through the bibliographic utilities with other member libraries. Member libraries could then use and attach their holdings to a record so that patrons would be able to locate that item in their respective libraries, as illustrated in Fig. 1. In this diagram, which reflects the workflow common to most libraries, library staff will search their online catalog (OPAC) to find a record for the item being cataloged. If the record is found, staff will add the copy or the volume of the item. If the record is not found in the OPAC, staff will search OCLC's WorldCat. If the required record is located in the database, staff will update the holdings and export the record into their OPAC. If the record that is found does not exactly match the item, the cataloger will then edit the record to match the item on hand and export the record into the OPAC.
If the item on hand is a different edition from that described in the catalog record, the cataloger will derive a new record using the information from the earlier edition and export the record into the OPAC. In this case, the cataloger sometimes chooses to edit only for the local online catalog and not to make changes in the master record in the bibliographic utility. If the cataloger does not find a matching record, the cataloger will use the online work-form and key in the information following national and international standards (such as AACR2, MARC21, LCSH, LCC, and others). The record will then be "produced" into the bibliographic utility and then exported into the local library's OPAC.

Figure 1 Current Mechanism for Delivering and Accessing Bibliographic Information

In the late 1980s and early 1990s, some libraries began to organize local and regional consortia to further leverage their resources by taking advantage of the benefits offered by such cooperative arrangements. 11 Some consortia provide support for major functions, such as providing a central catalog, ILL, and purchasing/licensing electronic information (e.g., OhioLINK), while other consortia provide a central catalog and purchasing/licensing, but not ILL (e.g., CARL). In this paper, the authors will take into consideration those consortia that provide a shared union catalog and borrowing services.

The bibliographic record that each library created resides in the library's own OPAC system, in the large database maintained by a bibliographic utility such as OCLC, and in the regional network database managed by a local consortium. The authors of this study raise the question of whether the current model that includes library OPAC's, bibliographic utilities, and the consortia offers the optimal solution for the library of the 21st century. In other terms, can libraries streamline the mechanisms used to deliver and access information? Are libraries making the best use of today's technologies? Are libraries comfortable with these scattered services, assuming that users are provided with the best service? To answer these questions, the authors will introduce several alternative scenarios with an aim to increase awareness of what new technologies can provide. Each of these scenarios is presented in outline form and is not intended as a complete solution. Rather, it is a suggestion or a hypothesis that may lead to other solutions or ideas.

In presenting these scenarios, the authors are making the following assumptions:

1. Search engines and the Internet are here to stay.
2. Search engines could be a new form of a bibliographic utility with more flexibility and accessibility.
3. Internet use will increase in the future.

Scenario 1: OPAC/Repository/Search Engines

In this scenario the authors looked at the institutional repository as an interface that would provide the link between a library's OPAC and Internet users (see Fig. 2). In doing so, we examined our own institutional repository and the way it operates. Although this repository is still in its infancy and does not yet contain a large number of resources compared to our OPAC, all materials in the repository are indexed and accessible through Internet search engines. Each record in the repository includes two parts: metadata, or information describing a digital resource, and the file(s) described by that metadata.
In the repository, metadata are displayed with a link to a video file, audio file, image, application, data set, or the full text of the resource (if the resource is text-based). The metadata include descriptive data (author, title, associated date(s), use restrictions, an identifier, and any other data relating to the content of the resource), technical data (file size, file type, date entered into the database), and data required for preservation. The Dublin Core metadata standard is used to structure metadata in the repository. At this time, metadata records and the full text of digital objects (if applicable) in the repository are accessible through Google, Yahoo!, and other Open Archives Initiative (OAI) harvesting Internet search engines. The software used to link the repository records to the Internet works very effectively and provides quick and easy access to stored digital files. The question that arises is whether the repository software can serve as effectively as an interface between a library's OPAC and Google and Yahoo!, thereby replacing the OCLC and consortium models.

Figure 2 Scenario 1: OPAC/Repository/Search Engines

Can the repository software be used to access bibliographic data, that is, metadata, only? In other words, can a repository record exist without a file attached - a non-traditional use of the software? In order to experiment with this idea, a few bibliographic records were selected from the library OPAC system and input into the repository using the existing Dublin Core descriptive metadata structure. After allowing time for the records to be indexed by search engines, the records were searched by author name and were retrieved in Google (for example, see Fig. 3). A link to the library's OPAC system guided the user to the holdings information about the items. Searching author names only does not always produce highly ranked results. In fact, using popular names such as "Allan" and "Tucker" as search terms without quotes failed to return a repository hit in the first 10 pages. However, a search on another author and a keyword from the title produced a repository hit on the first Google results page (see Fig. 4). Fig. 5 shows the repository page that appears in the link from Google. Clicking on the title in that page will take users to the bibliographic record data and a link to the library catalog record for holdings and other library services, as shown in Fig. 6.

How This Scenario Works

1) First, information from the MARC fields needs to be assigned to Dublin Core fields. The MARC Content Designation Utility portion of the Z-Interop Project at the University of North Texas has taken steps toward identifying the MARC fields most frequently used by catalogers and the retrieval efficiency of those fields. 12 This and similar projects will help libraries choose what MARC fields to translate to repository records. If desired, all of the information in the MARC record can be used in the repository record. Figs. 6 and 7 show bibliographic data for the same master's thesis in both the repository and the Web OPAC. Format information (microfiche, etc.), the OCLC number, and holdings information were not included. However, authoritative name and subject headings were retained.

2) Second, records must be input into the repository. The input process can be manual or automated.

Manual input: Bibliographic records have to be input one at a time. Figs. 8 and 9 show our repository's first two manual submission screens.
After entering the metadata, a dummy file must be uploaded to complete the creation of a record in the repository. After the repository record has been created, the dummy file must be manually removed, leaving only the bibliographic data.

Batch input: Existing open source and proprietary tools can be used to repurpose MARC metadata into a more universal data format, XML, which is the format needed for import and export from the repository.

Figure 3 First Google Results Page Based on an Author Search

Figure 4 First Google Results Page for a Search Including Author Name and Keyword from Title (Query: Allan tucker Attrition)

One example of such software is MARCedit. 13 By using these tools, records can be imported into the repository in automatic batches, and dummy files need not be attached and removed in the process. (A minimal sketch of such a batch conversion appears at the end of this scenario.)

3) Third, libraries will maintain their current local OPAC systems to manage the acquisition, ILL, circulation, and collection maintenance functions of the OPACs.

Advantages

• Libraries will repurpose MARC format metadata into XML, an open format. The XML format is becoming ubiquitous outside the library community, and XML tools are increasingly available. Library data can be easily integrated into users' infosphere, and bibliographic data can be easily decoded both within and outside of the library community.
• Libraries will increase access to the "hidden collections" (e.g., ERIC collections, archival and special collections, and thousands of "short records" or k-level records in OCLC or our catalogs). If libraries do not wish to have some materials available to users, they may still suppress these records in their local OPAC; suppressed records will then not be exported to the repository.
• Bibliographic records can be easily edited in the repository software and can be customized or expanded to meet users' needs.
• Implementation of this scenario will eliminate the need to store bibliographic records in bibliographic utilities (OCLC/RLIN). This will eliminate the fees that libraries pay for record retrieval and for membership.
• Library records will go directly from the library repository to the search engines.

Disadvantages

• Libraries may need the same or higher levels of IT support to integrate open source tools into the traditional cataloging workflow.
• Increased IT support may be needed for the use, configuration, and customization of open source repository tools. To support this scenario, the repository software must be programmed to expand its intended uses and purposes, for example, to allow a repository record to exist without an attached file.

Figure 5 Following the Link from Google into our Repository

• New policies and standards are needed to determine how to represent complex bibliographic relationships expressed in MARC records (e.g., series, successive titles) in a less complex metadata structure.
• If each individual library inputs its bibliographic records into a repository that is searched by the Internet, then hundreds of records for the same resource will appear through an Internet search. Search engine designers or libraries will have to find a way to eliminate this duplication.
• Libraries will have to develop a new way to conduct ILL transactions without OCLC.
• Loading hundreds of thousands of rich MARC catalog records into the repository is an enormous task.
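The batch path just described can be made concrete with a short sketch. The following example is our own illustration, not MARCedit itself or any official workflow; it uses the open source pymarc library with a minimal, hypothetical field mapping and file names, and writes each MARC record as a simple Dublin Core XML record of the kind a repository can ingest.

# Minimal sketch: crosswalk MARC records to simple Dublin Core XML for batch
# repository import. The file names and the four-field mapping are illustrative.
import xml.etree.ElementTree as ET
from pymarc import MARCReader

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# A deliberately small MARC-to-Dublin Core mapping; a production crosswalk
# would cover many more fields and handle subfields individually.
MAPPING = {"245": "title", "100": "creator", "260": "publisher", "650": "subject"}

with open("records.mrc", "rb") as marc_file:
    for i, record in enumerate(MARCReader(marc_file)):
        if record is None:  # pymarc yields None for records it cannot parse
            continue
        dc = ET.Element("record")
        for tag, dc_name in MAPPING.items():
            for field in record.get_fields(tag):
                elem = ET.SubElement(dc, "{%s}%s" % (DC_NS, dc_name))
                elem.text = field.format_field()  # joins the field's subfields
        ET.ElementTree(dc).write("dc_record_%05d.xml" % i,
                                 encoding="utf-8", xml_declaration=True)

Records produced this way can be loaded in batches without the dummy-file workaround, and once in the repository they are exposed to the OAI harvesters mentioned above.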
Scenario 2: Consortia/OCLC WorldCat/Search Engines

This model proposes the elimination of the local OPAC and the creation of a consortium-level OPAC among several libraries (see Fig. 10). The consortium will serve as a MEGA OPAC for those member libraries. The consortium will contribute records to OCLC's WorldCat, and these records will then be made available to Internet users through Open WorldCat.

How This Scenario Works

1) Libraries will use workstations to access and input records into the consortium database. The consortium will contribute records to OCLC and Open WorldCat.

2) Libraries will not have an Integrated Library System (ILS) or an individual catalog, but will submit all their catalog records to the consortium database. The records will then be migrated to the OCLC WorldCat database for indexing and made available through Internet utilities such as Google and Yahoo!. The consortium database will be used to identify local holdings, execute ILL transactions, and circulate materials. For example:

a) Users will search Google, Google Scholar, or Yahoo! as usual. Users will enter their search as follows: "Find in a Library: Harry Potter and the Goblet of Fire."
b) Users will select the item they want from the Google, Google Scholar, or Yahoo! list (see Fig. 11).
c) Users will enter their zip code to locate the item in a nearby library. In this example, "Ohio State University" (OSU) shows that the work is held by the OSU Libraries.
d) When the user clicks on "Ohio State University," the link will take the user to the consortium catalog to get to ILL, etc. In this case, users will be taken to the OhioLINK Web catalog (Fig. 12), instead of OSU's local webPAC.

3) Consortia will have memberships to OCLC, so that non-subscribing libraries will have access to OCLC's WorldCat and functions such as ILL linking.

Figure 6 Master's Thesis Bibliographic Record in Ohio State University Repository

Advantages

• Libraries will be able to centralize their operations and reduce redundancies.
• Libraries will be able to eliminate their local OPAC systems and manage their catalogs through consortia.
• Circulation, inventory control, and ILL would be handled at the consortium level.
• Redundant costs for maintaining individual local OPACs and for processing an item several times at multiple libraries will be eliminated.
• Increased standardization and cooperation among libraries will be encouraged.
• Holdings information stored in regional consortium databases can help libraries in their selection and acquisition process by allowing them to eliminate the redundancy that results from buying the same items for many libraries in the same region. This is likely to foster cooperation and reduce acquisition costs.
• Libraries not currently using online systems may move to automated systems. For example, school libraries that cannot afford to have their own OPAC will be able to contribute holdings to the consortium through workstations linked solely to the consortium.
• IT support for OPAC maintenance and functions will be centralized at the consortium level. Other IT functions, such as public workstations, will be maintained by individual libraries.

Disadvantages

• Consortia will assume a larger role in regional cooperation and coordination. Consortia will need to develop and employ methods to enable member libraries to manage their local collections through a central catalog (ILL, circulation, etc.). The authority of the consortium will not include oversight of individual library budgets.
• Libraries might encounter difficulties in establishing cooperative models for regional collection development policies.
• Difficulties may arise in coordinating acquisition activities among the libraries.
• Coordinating communication and activities among member libraries could be difficult.
• Libraries will have to give up local OPACs, which may be seen as losing a measure of control.
• Some non-OCLC libraries may have to become members to take advantage of ILL and the other functionalities WorldCat offers.

Figure 7 Master's Thesis Bibliographic Record in Ohio State University Web OPAC

Scenario 3: OCLC WorldCat/Search Engines

In this scenario, libraries no longer maintain their catalogs (see Fig. 13). Instead, each library submits its cataloging records electronically to OCLC via workstations linked solely to OCLC. WorldCat becomes a de facto catalog for each library. This scenario also utilizes an OCLC tool in development, OCLC WorldCat Resource Sharing, in an effort to leverage available resources. Libraries retain the circulating/lending/borrowing functionality locally through the acquisitions and inventory modules of current ILS systems. In this scenario, libraries will need to do the following:

• Link workstations to OCLC (e.g., using Connexion).
• Search OCLC as usual to find a copy and attach holdings to existing records.
• Create and input a new record into OCLC through a template, if a record for an item does not already exist in OCLC.

How This Scenario Works

1) OCLC will make library bibliographic records available to Google, Yahoo!, and other Internet search engines.
2) Users will search the WorldCat database through the Internet search engines or OCLC's FirstSearch to retrieve a list of records.
3) Users will select the record they want and be prompted to another screen with a brief record, holdings information, and a "request this item" option.
4) Users can click on the "holdings information" field to determine which library owns the item. Then they may select the "request this item" option.
5) Initiating the request will prompt the user to OCLC WorldCat Resource Sharing.
6) The Resource Sharing process will allow users to select the library from which they want to borrow materials, as illustrated in Fig. 14.
7) For in-library users, terminals linked to WorldCat or to any Internet search engine will be the starting point. If users find a material housed in their local library, they have the option to retrieve and borrow the item in person.
8) Libraries will need to retain some sort of circulation and ILL modules that will interface with OCLC Resource Sharing.
9) Libraries will have to keep some circulation tools for on-site users. In addition, libraries will need to maintain control over ILL for borrowing. Perhaps the circulation and acquisitions modules of current ILS systems will suffice.

Figure 8 Manual Submission Screen - First Page of Descriptive Information

Advantages

• This scenario eliminates some of the costs associated with purchasing full ILS systems and maintaining library catalogs internally. It may also eliminate the cost of consortium membership, if the library participates in any consortia. OCLC will assume an even greater role in cooperative cataloging and will save libraries time, and perhaps cost, in managing their collections.
• Without the OPAC, libraries will be forced to add all of their records to the OCLC database. Short records will be submitted to WorldCat to be shared with other users.
Other libraries will be able to upgrade these if they want to have full bibliographic records. This will enable all types of libraries to participate in a truly cooperative cataloging venture.

• Smaller libraries that do not have OPACs will be encouraged to contribute records directly to OCLC via the Internet.
• The need to download records into local OPACs will be eliminated.
• Internet users will promptly access libraries' materials through Internet search engines.
• Libraries' dependency on ILS vendors will be radically reduced. This is an important issue because libraries have spent tremendous amounts of time and money purchasing, upgrading, and maintaining their ILS systems. ILS vendors would have to adapt and supply stand-alone circulation, acquisition, and ILL modules. In addition, ILS vendors could help integrate libraries' collections with the major search engines through XML exporting and conversion capabilities.
• Library staff will be able to devote more time to the actual enhancement of bibliographic records.
• IT maintenance and support needs might be reduced at the local library level. For example, the need for communication with the ILS vendors for system enhancements, for implementing new modules, and for staff training will be greatly reduced.
• Libraries will have the capability to add acquisition order records into OCLC. If OCLC develops such a module, it will help determine how many copies of an item are ordered and help make the OCLC database an innovative tool for collection development. A selector could check local and regional holdings of titles to make informed collection management decisions.
• Libraries will retain control over their own budgets.
• Libraries will follow the same standards for creating catalog records.
• This scenario is a safe option to consider because OCLC is an established service provider that has been working with libraries for over thirty years.
• This scenario presents libraries with an opportunity to rethink the ownership vs. access issue. The advantage is that there would be less duplication of owned copies.

Figure 9 Manual Submission Screen - Second Page of Descriptive Information

Figure 10 Scenario 2: Consortia/OCLC WorldCat/Search Engines

Disadvantages

• Libraries will have to maintain some functions, such as ILL, acquisitions, and circulation, on the local level, or OCLC will have to take on the entire ILS functionality. Currently, all of a library's ILS subsystems are linked to the institution's bibliographic records.
• OCLC will have to redesign the interface that displays item-level information, holdings information, and request-item capabilities. For example, OCLC will not only indicate which libraries own a particular item, but also provide specific information about the item in a specific library (e.g., copy number, volume holdings, circulation status, location within the individual institution, check-in records, etc.). ILS vendors provide this information and customization for individual libraries. Would OCLC be willing to take on this role?
• If the OCLC network goes down, the entire library network goes down, unless the software and hardware are sophisticated enough to handle the backup of the database.
• Creating and maintaining a complex database or database network that works smoothly, consistently, and flexibly will be a challenge.

Figure 11
• Libraries will cut costs of maintaining their own OPAC, but there may be a significant increase in the cost of membership to OCLC and for handling information through OCLC. Scenario 4: Local OPAC/Search Engines Search engines can be a virtual catalog and replace the bibliographic utilities (see Fig. 15). All intermediate steps including consortia and OCLC will be bypassed. How This Scenario Works 1) Libraries will catalog in their own OPACs and maintain their databases for all normal library functions. 2) These databases will be directly searchable by search engines. 3) Library cataloging staff will search the Internet to find an existing record for an item. 4) The record will be downloaded by the owner library through the Internet (this functionality already exists in online banking). 5) If no record is found through the search, libraries will catalog the item in their own system. The record will subsequently be available on the Internet for other libraries to use. Advantages • Libraries will have access to international bibliographic records. • Libraries do not need specific software designed to harvest bibliographic information only (i.e., OCLC's Connexion). • The Internet would provide true international tools for collection development and selection. • Users will have quicker access to bibliographic holdings of libraries from anywhere in the world. • Internet users will have true visibility of libraries' holdings. • Many of the costs associated with contributing records to bibliographic utilities and subscriptions will be eliminated. For example, now libraries have to subscribe to OCLC's WorldCat and pay fees for any transaction, in a complicated cost structure. Libraries are paying to get their own records into the Open WorldCat and, by extension, into Google. • Cost savings may occur due to changes or elimination of consortium member fees. Figure 12 Disadvantages • Search engines will display many records for the same item held at different libraries. A remedy might be a programming solution that will eliminate duplication and allow for holdings display. • The capability for search engines to harvest library records directly from the OPACs will need to be implemented. • A central database such as OCLC's WorldCat is no longer available. • Libraries will have to establish mechanisms to share records among themselves. • All participating libraries will need to reinforce their commitment to use standard protocols and cataloging rules. • ILS vendors will need to work together to establish standard practices and displays. • The danger that search engines may collapse, go bankrupt, or no longer want to support this service in the future exists. Figure 13 Senario 3: OCLC WorldCat/Search Engines Figure 14 Resource Sharing Process Figure 15 Senario 4: OPACs/Search Engines Scenario 5: OPACs/OCLC's WorldCat/Search Engines for Printed Materials Repository/Search Engines for Digital Materials In this scenario, libraries will use OPACs to catalog print materials, as they currently do, and use repositories to catalog digital resources as it is illustrated in Fig. 16. Many libraries have been establishing their own repositories to serve their campuses. These repositories can accommodate any digital resources. In examining other library repositories, including the repository at the authors' institution, the authors found that they can contain digital theses and dissertations, articles, images, transcripts, monographs, data sets, HTML files, etc. 
Scenario 5: OPACs/OCLC WorldCat/Search Engines for Printed Materials; Repository/Search Engines for Digital Materials

In this scenario, libraries will use OPACs to catalog print materials, as they currently do, and use repositories to catalog digital resources, as illustrated in Fig. 16. Many libraries have been establishing their own repositories to serve their campuses. These repositories can accommodate any digital resources. In examining other library repositories, including the repository at the authors' institution, the authors found that they can contain digital theses and dissertations, articles, images, transcripts, monographs, data sets, HTML files, etc. All of these materials are accessible and indexed via the Web. Libraries are making substantial additions to repositories. The question that arises is why libraries have to go through a long processing cycle that includes cataloging digital resources in the MARC format, adding them to their OPAC, adding them to OCLC, and then making them available to Internet search engines.

How This Scenario Works

1) Libraries will catalog print materials as in the current scenario.
2) Libraries will deposit digital resources into institutional repositories, which are indexed and accessible to users via the Internet.

Advantages

• Local OPACs focused on describing printed materials will be maintained. These records can be made available to the Internet search engines through Open WorldCat.
• The redundancy of having records for a single resource in multiple places will be eliminated.
• Using institutional repositories provides the opportunity for a fresh start with a different technology for describing and accessing digital resources, before libraries continue to change, adapt, and modify their practices, codes, and standards to accommodate digital resources further.
• The institutional repository format (as well as databases and other digital library formats) could be viewed as a potential OPAC for digital resources.
• In examining some repositories' materials, including the repository at the authors' institution, we found that the majority of the digital materials are already indexed and displayed in the major search engines.
• The cost of cataloging and adding digital materials to the OPAC may be saved.
• This scenario is not as radical a change for libraries as the other scenarios presented.
• Users may find resources more efficiently because they do not have to go through an intermediary to access digital resources, which are quickly indexed in search engines.
• Both sets of records, from the repository and the OPAC, are indexed by the Internet search engines.

Disadvantages

• This is not a large obstacle, but the process for accessing licensed digital resources is already established with the OPACs and might have to be translated to work in the repository environment. Also, electronic resource management systems have built-in functions for payment, licensing information, use statistics, URL management, etc., that will need to be addressed.

Figure 16 Scenario 5: OPACs/OCLC WorldCat/Search Engines for Printed Materials; Repository/Search Engines for Digital Materials

• Libraries will have to maintain both an OPAC and a repository. Some libraries do this already.
• So far, libraries are using DSpace for storing their digital materials. This software might be too limited to handle massive digitization projects.

Conclusion

By presenting a number of options for consideration, the authors of this paper highlight potential solutions to some important issues facing libraries. The options presented here are intended to initiate a debate in the library community about the mechanisms that libraries currently use to share bibliographic records in order to facilitate access to information. These options do not exhaust all the possibilities that might be available, and it is our hope that more ideas and potential scenarios will be generated in response to the thesis of this project.
It is our belief that library administrators, who until now have focused most of their attention on streamlining the internal operations of their technical and public services, ought to include in their considerations those mechanisms and tools that connect library collections to their users. Thirty-odd years ago libraries took a huge leap into the future by beginning to automate their catalogs and other operations. Technology once again presents libraries with an opportunity to take the next leap. This time, the leap will take us fully into the Internet world, where most of our users already reside. To begin this process we need to analyze some basic assumptions about the legacy systems that have served us so well for so long, and to seek new solutions.

"By eliminating the middle steps of creating, accessing, and retrieving information via intermediaries, such as regional consortia, OCLC, and costly OPAC's, libraries might realize substantial savings that could be diverted to enrich bibliographic records that form the foundation of the current bibliographic structure."

We take our cue from the user community itself. There is less interest in retrieving bibliographic records with pointers to a building and a shelf where the desired information is housed. Instead, today's users want and prefer easy online access to full text information. What makes this scenario exciting is that libraries are in a good position to provide full text access to some materials that exist in digital formats. Libraries can also satisfy the need for prompt delivery of materials where full text is unavailable. This can be achieved by offering easy-to-understand, access-enhanced bibliographic information linked to a delivery system that offers service directly to the user. By eliminating the middle steps of creating, accessing, and retrieving information via intermediaries, such as regional consortia, OCLC, and costly OPAC's, libraries might realize substantial savings that could be diverted to enriching the bibliographic records that form the foundation of the current bibliographic structure. One of the key questions is whether librarians can envision a future without the OPACs, the consortia, and the bibliographic utilities, and embrace the major Internet search engines as the "Earth's Largest Library".

Acknowledgment: The authors wish to thank George Klim for reading this article and making valuable comments.

Notes and References

1. "Innovative Redesign and Reorganization of Library Technical Services: Paths for the Future and Case Studies," edited by Bradford Lee Eden (Libraries Unlimited, Westport, 2004), p. 480; Kathleen L. Wells, "Hard Times in Technical Services: How Do Academic Libraries Manage? A Survey," Technical Services Quarterly 21 (4) (2004): 17-30, Available at: http://www.haworthpress.com.proxy.lib.ohio-state.edu/store/E-Text/View_EText.asp?sid=5SA5DBCMR6F19J87PUWS885REB055TB0&a=4&s=J124&v=21&i^&fn=J124v21n04%255F02 (Accessed June 8, 2006); Karen Calhoun, "Technology, Productivity and Change in Library Technical Services," Library Collections, Acquisitions, and Technical Services 27 (3) (Autumn 2003): 281-9, Available at: http://journals.ohiolink.edu.proxy.lib.ohio-state.edu/local-cgi/send-pdf/060502133946493345.pdf

2. "Levels of cataloguing treatment applied by the National Library of Canada" (Revised August 2003), Available at: http://www.collectionscanada.ca/6/17/s17-201-e.html (Accessed June 8, 2006); "LC to implement Core Level Cataloging," Available at:
http://www.loc.gov/catdir/cpso/corelev.html (Accessed June 8, 2006); "Definition of Cataloging Levels," Available at: http://165.112.6.70/tsd/cataloging/DefCatLev.html (Accessed June 8, 2006).

3. "The Changing Nature of the Catalog and its Integration with Other Discovery Tools: Final Report," prepared for the Library of Congress by Karen Calhoun, Cornell University Library (March 2006): 52, Available at: http://www.loc.gov/catdir/calhoun-report-final.pdf (Accessed June 8, 2006); Deanna B. Marcum, "Future of Cataloging," Library Resources and Technical Services 50 (1) (January 2006): 5-9, Available at: http://wilsontxt.hwwilson.com/pdffull/01866/zg4ee/8sb.pdf (Accessed June 8, 2006); "Rethinking How We Provide Bibliographic Services for the University of California: Final Report" (December 2005): 80, Available at: http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (Accessed June 8, 2006); "A White Paper on the Future of Cataloging at Indiana University" (January 2006), Available at: http://www.iub.edu/~libtserv/pub/Future_of_Cataloging_White_Paper.doc (Accessed June 8, 2006); North Carolina State Libraries, "NCSU Libraries Unveils Revolutionary, Endeca-Powered Online Catalog," News Release (January 2006), Available at: http://www.ncsu.edu/news/press_releases/06_01/007.htm (Accessed June 8, 2006).

4. Steve Coffman, "Building Earth's Largest Library: Driving into the Future," Searcher 7 (3) (March 1999), pp. 34, 12.

5. Mark E. Napier and Kathleen A. Smith, "Earth's Largest Library - Panacea or Anathema? A Socio-Technical Analysis: A detailed critique of Coffman's proposal" (May 2000), Available at: http://rkcsi.indiana.edu/archive/CSI/WP/wp00-02B.html (Accessed June 8, 2006).

6. Barbara Quint, "With OCLC's New Strategy, Is Earth's Largest Library in Sight?" Searcher (October 30, 2000), Available at: http://www.infotoday.com/newsbreaks/nb001030-1.htm (Accessed June 8, 2006).

7. John Markoff and Edward Wyatt, "Google is Adding Major Libraries into its Databases," New York Times (December 14, 2004), Available at: http://www.nytimes.com/2004/12/14/technology/14google.html?ex=1141794000&en=a58f11ae9e54b47b&ei=5070 (Accessed June 8, 2006).

8. Thomas Frey, "The Future of Libraries: Beginning the Great Transformation," DaVinci Institute, Available at: http://www.davinciinstitute.com/page.php?ID=120 (Accessed June 8, 2006).

9. Roy Tennant, "Life Beyond MARC: The Case for Revolutionary Change in Library Systems and Service," Speech at the Library of Congress (September 15, 2005), Available at: http://www.loc.gov/today/cyberlc/feature_wdesc.php?rec=3774 (Accessed June 8, 2006).

10. "RLG to Combine with OCLC," Available at: http://www.oclc.org/news/releases/200618.htm (Accessed June 8, 2006).

11. "More Information on Consortia and their Services," Available at: http://www.library.yale.edu/consortia/icolcmembers.html (Accessed June 8, 2006).

12. "Z-Interop: A Z39.50 Interoperability Testbed Study," Available at: http://www.unt.edu/zinterop/index.htm (Accessed June 8, 2006).

13. "MARCedit," Available at: http://oregonstate.edu/~reeset/marcedit/html/ (Accessed June 8, 2006).

work_v6fxeffjrzhj7jr6vqwotptlya ---- Overwhelmed to Action: Digital Preservation Challenges at the Under-resourced Institution

Rinehart, Amanda, Patrice-Andre Prud'homme, and Andrew Huot

Suggested citation: Rinehart, Amanda, Patrice-Andre Prud'homme, and Andrew Huot. 2014. Overwhelmed to action: digital preservation at the under-resourced institution.
OCLC Systems and Services, Digital Preservation Special Edition, 30(1): 28-42. Available at: http://www.emeraldinsight.com/journals.htm?articleid=17103689

Overwhelmed to Action: Digital Preservation Challenges at the Under-resourced Institution

Introduction
Librarians have been hearing about digital preservation (DP) for a while. As a data librarian, the subject comes up when discussing data management plan mandates from various funding agencies. As a preservationist, DP lies outside the daily workflow, yet digital material still needs to be preserved. As a librarian who selects and transforms unique materials into digital collections, it is important to ensure that these digital assets are not lost in a few years. Regardless of the job description, it is important that valuable collections do not deteriorate or become lost. However, the reality for a small or mid-sized institution is that competition for dwindling resources is high, and little DP training translates into action. When DP is not part of existing position descriptions, and there is no funding to hire an expert, how does an institution begin to tackle the challenge? The pressure to start working on it immediately, before any content is lost, just adds to the impression of impossibility. As a partner institution in an Institute of Museum and Library Services (IMLS) National Leadership Grant, Milner Library has had the privilege of consulting with some of the best DP experts in the nation. This paper reveals some lessons learned and how Milner Library has gone from overwhelmed to action.

"Preservation is the state at which everything that is important about a file is still capable of being examined by a human being…it's not about individual records or bits, it's about allowing humans to understand" (McDonough, 2012). Often it can be easy to get lost in the technological details of DP and lose sight of the ultimate goal. For instance, the CARLI white paper (2010, p. 2) defines DP as "a commitment to maintain long-term access to digital objects through standardization, migration, and replication of those objects on numerous servers in multiple locations." Therefore, DP is really "a multi-faceted problem that is viewed differently by different institutions and different professionals" (Hedstrom and Montgomery, 1998, p. 7). The term is defined broadly because it encompasses a variety of activities pertaining to the continued value of digital information (Hedstrom, 1998). Certainly, those who are concerned about DP at Milner Library found that choosing which facet to tackle is the first challenge. Initially, in order to facilitate a common understanding between library staff, DP advocates at Milner focused on the misconceptions about DP. The following will further explain some of these misconceptions.

Misconceptions of digital preservation
One misconception of DP is that if an item is accessible, then it is preserved. "Preservation and access are different" (McDonough, 2012). In fact, access is not required for DP at all, although it may be a desired component. As well, access may be seen as a higher priority and provide more immediate satisfaction than DP, particularly to users or other constituents. However, without preservation, access is not reliable over time.
Even as early as 1998, "some collection managers expressed concerns that institutions would sidestep the issue of long-term preservation [in favor of providing] access to materials, and that at some point this approach would fail, leaving institutions with a preservation crisis" (Hedstrom and Montgomery, 1998, p. 21). In order to avoid this crisis, libraries need to be thoughtful about how resources are distributed in order to make sure that DP is not forgotten.

The difference between digitization and preservation is another potentially misunderstood component of DP. "Digitization does not equal preservation" (Halbert, 2012). The act of digitally transforming analog materials remains confusing given the array of standards and formats for digital surrogates, and poor-quality digitization prevents adequate long-term digital preservation. When NASA accidentally erased the high-quality recordings of the Apollo 11 moon landing in the 1980s, they eliminated any possibility of future high-resolution analysis with more modern technology (Greenfieldboyce, 2009). Recordings of the event still exist, but only of low-quality TV broadcasts. This is a prime example of how non-experts can interpret any digitization as digital preservation. Digitization must meet current standards and guidelines in order to achieve suitable quality for long-term preservation. Some of these standards are well-established (Puglia, Reed, and Rhodes, 2004), while others are on-going conversations (Library of Congress, n.d., NDSA Standards). Knowing how to seek the most relevant information and adhere to the best possible practice is essential to both adequate DP and the efficient use of limited resources.

Most institutions and individuals understand the need to back up data, but may not realize the difficulty of assuring the preservation of knowledge for years to come (Harvey, 2005). "Backup is a component of preservation, not preservation itself" (Bishoff, 2012). At the end of the day, there is the false assurance that data are safe because they exist in multiple copies. However, typical storage media, such as the recommended fast magnetic disks or tapes, are still vulnerable to damage from such dangers as electromagnetic fields (McGath, 2011). Digital media, similar to the materials that make up physical texts, are also at risk of corruption or degradation. This is colloquially called "bit rot." Additionally, any collection of files that fails to have accompanying documentation (or metadata) is left without context. Metadata becomes comparatively more important in the non-tangible world, as it is the sole guarantor of the provenance and meaning of the digital files. Without metadata, backups are the organizational equivalent of throwing all of one's papers in one disorganized pile in a large drawer. Depending on the type of digital material, and the intent of the collection, appropriate metadata could vary widely. Becker and Riley (2010) created a visualization of accepted metadata standards for a variety of digital materials, and part of DP is to choose the most appropriate schema.

Even if materials are digitized appropriately, metadata attached, and backed up, they may become inaccessible very quickly. The fast rate of technological change calls for greater attention to format and media storage. How many files have been made inaccessible due to the lack of floppy drives? A survey of 54 institutions found that collection managers view technology obsolescence as the greatest threat to sustaining continuous access to digital resources (Hedstrom and Montgomery, 1998). DP requires periodic transformations, which is a marked difference from traditional preservation, where minimal handling is desirable.
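One routine that makes the backup-versus-preservation distinction concrete is fixity checking: recording a checksum for every file and periodically re-verifying it, so that silent corruption is detected instead of being faithfully copied into every backup. The sketch below shows the idea in Python; the manifest and collection paths are hypothetical names chosen for illustration, and established checksum-manifest approaches such as the Library of Congress BagIt specification add the logging, scheduling, and repair workflows a production tool would need.

```python
import hashlib
import json
import os

MANIFEST = "manifest.json"        # hypothetical checksum manifest
COLLECTION = "digital_assets"     # hypothetical collection directory

def sha256(path, chunk_size=1 << 20):
    """Compute a SHA-256 checksum, reading the file in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def build_manifest():
    """First run: record a checksum for every file in the collection."""
    manifest = {}
    for root, _dirs, files in os.walk(COLLECTION):
        for name in files:
            path = os.path.join(root, name)
            manifest[path] = sha256(path)
    with open(MANIFEST, "w") as fh:
        json.dump(manifest, fh, indent=2)

def audit():
    """Later runs: re-hash every file and report loss or silent change."""
    with open(MANIFEST) as fh:
        manifest = json.load(fh)
    for path, recorded in manifest.items():
        if not os.path.exists(path):
            print("MISSING:", path)
        elif sha256(path) != recorded:
            # Possible bit rot: restore from a known-good copy while one exists.
            print("CHANGED:", path)

if __name__ == "__main__":
    audit() if os.path.exists(MANIFEST) else build_manifest()
```

A backup regime copies whatever is on disk, damaged or not; an audit like this is what allows a damaged file to be noticed and replaced while good copies still exist.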
Restricted resources
Institutions are between a rock and a hard place when facing rapidly changing technologies and the sheer volume of digital creation. Across the nation, university archives, libraries, and other types of repositories are trying to meet "escalating user expectations with limited financial and technical resources" (Hedstrom, 1998, p. 193). Due to the exponential creation of born-digital materials, information is being lost nearly as soon as the digital assets are produced. Although Harvey (2005, p. 188) points out that there is a "lack of concrete knowledge of how much [digital preservation will] cost", there is a general assumption that any cost is too much, particularly when overall funding is being cut. Although only 41% of surveyed institutions that are not participating in DP programs cited cost concerns, all institutions ranked the three top DP concerns as additional costs, lack of staff resources, and budgets (Meddings, 2011).

Change fatigue
Lack of funding is not the only restricted resource. It is not uncommon to find that librarians suffer from change fatigue, lack of practical knowledge, and lack of engagement. Being prepared to encounter and work through these issues is key to overcoming frustration and discouragement. Like many libraries, Milner Library has been through a number of organizational and physical changes in recent years. Originally, each level of our six-floor building operated with a great deal of autonomy. As duplication was identified, the six floors were merged into one modern library structure, de facto creating a reorganization of staff. As well, the entire history of the physical building has been fraught with difficulty (Boyd, 2001). A combination of delayed building maintenance and initial poor construction resulted in portions of the building being unfit for public use. These spaces are currently used for closed stacks, but not without significant changes to retrieval workflows and on-going water control measures. With the influx of technology in the last thirty years, electrical requirements have increased and student expectations have changed. These trends have placed additional pressures on space planning decisions and print collection deselection. These organizational and physical changes contribute to change fatigue. Similar to many other academic libraries across the nation, Milner has seen leadership come and go, technology become pervasive, and the budget significantly cut. As several University Librarians have noted, the "upheaval and disruptive events that had preceded their arrival at the respective institution...restricted innovative activities for extended periods of time" and "we're our own worst enemy - the big threat is ourselves. Especially being unwilling to accept and project ourselves into new environments" (Jantz, 2012, p. 13). This change fatigue can mean that the burden of another activity is often met with valid expressions of disbelief and resistance. Understandably, the scope and volume of DP often invoke this reaction. Again, the refrain that "this is not a technical problem, it's an organizational problem," rings true (Kolowich, 2012).
Change inherently causes uncertainty and fear of failure or, as one study phrased it, waiting to see "if someone gets fired" (Jantz, 2012, p. 12). This can create a difficult environment for discussing DP. Change can result in "significant stress on librarians and staff to adapt and respond" (Jantz, 2012, p. 18). As well, the nature of librarianship in the late twentieth and early twenty-first century has not been conducive to new processes. "Rigidly defined job classifications...encourages ritualistic and unimaginative behavior" and "librarians are trained to follow certain processes - repetitive work that does not lend itself to the generation of new ideas" (Jantz, 2012, p. 12). These professional norms may lead to concern about taking the initiative, with one observation that "librarians [do] not recognize it as their responsibility to speak up" (Jantz, 2012, p. 14). In addition to staff training, creating non-ideological objectives (or clear goals) for the library may help overcome these fears (Jantz, 2012).

Lack of training
In addition to change fatigue, some librarians have had, or are aware of, negative experiences with digitization projects. One of the dangers of the digital world is expending effort and resources on projects that are unimportant or badly executed. The lack of awareness of standards and best practices is still highly prevalent.

Getting the right information in the right hands at the right time is a problem that has plagued the library community for decades. When adding in the incredible pace of change in the digital environment, limited resources for training and travel, and work days that are already overburdened, it is not surprising that at the local level people forge ahead on projects blissfully unaware of standards and best practices. (Molinaro, 2010, p. 47)

It doesn't help that "digital preservation is an extremely complex, evolving field that requires a great deal of knowledge to understand" (Duff et al., 2006, p. 203). Concepts that may appear simple to those in the field are often confusing, especially to fellow librarians. Meddings (2011, p. 57) noted that "some respondents were confusing print and digital preservation as well as confusing longer-term preservation with post-cancellation access." Without a basic overview of the scope and content under discussion, many colleagues are left adrift, attempting to synthesize fragmented information while simultaneously concerned about limited resources and rebounding from periods of change and stress. Instead of pressure and crisis, Duff et al. (2006) identify practical content, real-life experiences, more emphasis on tools, and a greater need for contextualization as key components for understanding DP. They also emphasize the need for "a nonthreatening setting where [people] can discuss the problems they are facing in their workplaces" (Duff et al., 2006, p. 201).

As well, emphasizing traditional librarian strengths may make DP concepts relatable. Although librarians may not always be adequately fluent with computer technology, their expertise may be employed in the decision making process of basic preservation and archival principles. Bishoff (2012) notes, "We don't have to preserve everything at the same level." It can also be argued that not everything needs to be preserved; it is a question of prioritization. Collection development techniques of selection, organization, and preservation all apply in the digital world.
Lack of engagement
Even when there is willingness and training, it is difficult to engage in DP. Some librarians may feel that they have the mandate to perform digital preservation, but not the authority. In his world-wide study of DP, Meddings (2011, p. 57) found that despite 85% of respondents claiming that "digital preservation is either important or very important to their library", "less than half of respondents (46.1 percent) stated that they were currently taking steps to ensure the long-term preservation of digital content." In Duff et al. (2006, p. 188), "networking, increasing confidence levels, and future collaboration were identified as important benefits of the workshops", however "very few participants were able to implement the skills once they returned to their work environments." This was not due to the lack of dissemination or awareness of DP issues. Ninety-six percent of the Electronic Resource Preservation and Access Network (ERPANET) workshop participants shared their training with their institutions (Duff et al., 2006). However, "only 35% of respondents said that they implemented the ideas that they learned about at the event" (Duff et al., 2006, p. 199). This lack of "empowering the front lines in the fight for sustainability of our digital heritage" is a real challenge to implementing digital preservation initiatives (Molinaro, 2010, p. 47).

It is difficult to pinpoint the exact nature of this lack of empowerment. Certainly, an organizational culture that is suffering from change fatigue can be discouraging, and budgets are universally tight. However, other digital initiatives, such as access points and digitization, receive funding and support. Some possible DP engagement barriers may be organizational, procedural, and leadership related. It is possible that organizational structures are inhibiting DP, particularly if the digital collections are maintained by separate departments that have different reporting lines. For example, imagine an organization that has an archive that collects historical digital material, subject librarians that initiate and maintain their own digital collections, a digitization center that has its own criteria and collections, and a separate institutional repository. This creates confusion regarding who has ownership of DP. Without a unifying DP organizational structure, each unit is a silo that must tackle DP on its own.

While cross-departmental collaboration may overcome the challenge of organizational fragmentation, librarian engagement is complicated by unclear DP processes. If there were easy workflows for all participants then many more people would participate. Developing clear processes requires a like-minded core group that makes recommendations to administration about standards, technology options, feasibility, and training. If the group does not have a clear mandate from administration to provide recommendations, or if there is not a mandate to even form a group, then DP collaboration is stymied. In short, grassroots efforts to educate and raise awareness only go so far - ultimately, library administration must provide leadership for DP implementation in order for it to be effective.

Talking to Library Administration
The importance of raising awareness about DP needs to be well articulated in order to gain legitimacy. Specifically, administrators need to "understand that digital preservation is not peripheral; it is a cultural change; an institutional activity" (Halbert, 2012).
Further, for DP to be successful it needs to be included in strategic planning and allocated funding.

Program versus project
A common pitfall of all technology initiatives in the library is to think of them as projects instead of programs. "It's not a project, it's a program" (Rudersdorf, 2012). Projects have a discrete beginning and ending, implying a lack of long-term commitment. While the technology may change rapidly, and staff turn-over is inevitable, DP is a long-term commitment. There needs to be recognition that DP is "an outcome of the organization's successful day-to-day management of its digital assets" at the outset (Fyffe et al., 2005). There are a few externally-driven arguments for DP that might resound with administrators. The following have been shown to carry more weight than arguments grounded in librarianship ideology, such as ensuring institutional memory or providing a public good.

NSF Data Management Plan
The first national grant funding agency to require data preservation was the National Science Foundation (NSF). In February of 2011, the NSF implemented a requirement for all grant proposals to include a data management plan. Within the plan, both metadata and preservation must be addressed (National Science Foundation, 2011). Since then, the Digital Humanities Division of the National Endowment for the Humanities also began requiring a data management plan that includes preservation (National Endowment for the Humanities, 2013). While the National Institutes of Health has required data sharing for large grants since 2003, and it does require data documentation and archiving, this has not necessarily translated into preservation at the institutional level (National Institutes of Health, 2003). As most data is now born digital, these requirements are a natural motivator for a DP program. Collaborating with the librarian who assists with data management plans is a good first step to unifying these activities. Although "the hard scientists had been regularly contributing their papers and publications to discipline-specific digital archives for years, [they are] now facing a mandate from granting organizations to contribute their research data to repositories and have [a] sustainable data management plan" (Colati and Colati, 2011, p. 166). If this is an activity that the library has not yet undertaken, a DP program would allow the library to support this effort. As well, this can be the basis for collaboration with the grant management office and a very substantial service for the scientists and science educators.

Legislation
Recent changes in local legislation may also provide an argument for elevating DP status in the library. In Illinois, Senator Biss has sponsored a bill called the Open Access to Research Articles Act. This bill would require that, among other things, all faculty at public institutions must provide:

long-term preservation of, and free public access to, published research articles: (A) in a stable digital repository maintained by the employing institution; or (B) if consistent with the purposes of the employing agency, in any repository meeting conditions determined favorable by the employing institution, including free public access, interoperability, and long-term preservation. (Illinois General Assembly, 2013, pp. 3-4)

At the time of this writing, the bill has passed both houses and been sent to the Governor for approval.
If it becomes law, then each public institution will be required to appoint a task force in order to address the requirements. This is an ideal opportunity for library administration and DP advocates to take part in the conversation and promote DP standards.

Disaster recovery
It is easy to not think beyond the need for access, but what happens when files get corrupted or the server becomes obsolete? DP can easily be forgotten until a crisis emerges (Waters, 2002). When speaking to administrators, using the disaster recovery (or risk management) approach may be a good option. Although the term 'disaster recovery' brings natural disasters to mind, about 40% of digital data loss is due to hardware failure, 29% human error, and 13% software corruption (Smith, 2003). Certainly, any digital material loss could be colloquially termed a 'disaster', depending on the importance of the digital material, how much it would cost to recover it, and if it is recoverable at all. Without a digital inventory, it is difficult to know if file loss is a true disaster or not - after all, the loss may or may not be important. For important digital material, the cost of recovery in the US is estimated at $18.2 billion per year (Smith, 2003). In order for a library to avoid a disaster, and thus manage this potentially expensive risk, a DP program needs to be implemented. This will ensure that important files are safe and recoverable. For administrators, risk management and disaster recovery may be a relatable argument for DP.

Collaboration
Like many small and mid-sized institutions, Milner Library has no DP staff. Milner has a preservation department with four personnel and several other departments that collect or create unique local digital materials. While an inventory of these materials has yet to be completed, it is assumed that the volume is likely to be fairly large. The Digital Collections department consists of four personnel and focuses on creating an array of unique collections for teaching and research. This requires digitally transforming analog material according to accepted standards (including copyright concerns), creating metadata, and administering collections of nearly 40,000 items. The Archives department is run by two personnel and has been given an estimated seven terabytes of digital material. The Digital and Data Services Department consists of two personnel and recently launched an institutional repository. To complicate the organizational infrastructure, the Digital Collections and Digital and Data Services Departments are in a different unit than Preservation and Archives (Figure 1). Rather than rely on a reorganization to facilitate DP, the heads of these departments have opted for informal discussions, presentations, and joint training ventures. At the time of this article's publication, it is not known where digital preservation will occur, but it is agreed that it will occur. The first opportunity for collaboration is within the library.

[Insert Figure 1 here]

With respect to the entire library, three of eleven departments (27%) are already staunch DP advocates (Figure 1). Another five departments (45%) have some stake in DP and need assistance in understanding how DP affects them. These departments are targeted for awareness and inclusion in future conversations. By cultivating cross-departmental alliances, we intend to demonstrate that DP is a concern for our library as a whole. Forming a community of support is key to meeting the challenges of DP.
"You can't do it on your own" (Bishoff, 2012). This involves understanding and raising awareness within the library, educating and demonstrating competency to campus stakeholders, and ultimately, leveraging limited resources to foster multiple collaborations. "Digital preservation cannot be left to a small team of specialists within an organization; it needs to be embedded within an organization" (Jones, 2005, p. 99). There needs to be a marriage of the technology-savvy and preservation-savvy advocates in order to elucidate a fundamental "framework of basic concepts" to support DP (Verheul, 2006, p. 268). "It is not about technology, it is about people" (Bishoff, 2012). Forging these partnerships is difficult, but necessary (Stewart, 2012). Creating cross-organizational collaborations is not easy, as some will be wary of sharing resources, desire greater authority over decisions, or have legitimate concerns about privacy and regulatory requirements (Stewart, 2012). An excellent argument for collaboration is that most technology is more cost effective when it can be scaled up. That is, funding can be pooled to stretch further. Kolowich (2012, para. 15) observed that "we're either going to solve this problem institution by institution at great expense and with little chance of solutions that last...[or] solve it together at scale, just like we did with high-performance networks." Therefore, it is important to build collaborations at all levels: library, campus, consortium, and national.

Balancing print and digital
Small institutions have limited staff and time for the preservation of all collections within the library. Moving traditional preservation staff into caring for digital collections can be confusing and frustrating, as new workflows and focus require new skills and thought processes. Preservation of paper and book materials is based on techniques that slow the natural decay by controlling the environmental temperature, humidity, and chemical makeup of the items in the collections. With book and paper materials, print copies and surrogates are made only when the original is too fragile to be handled safely. With proper storage conditions, items can be housed safely and don't require constant maintenance and testing. Fixing an item and putting it back on the shelf takes care of a majority of the print collection. How is this different from DP? The DP concept of producing many copies and checking for bit rot regularly creates a need for a different awareness and plan within a preservation department. Balancing the preservation needs of the physical collection, which fills a conservation lab's shelves, and the unseen digital data, which is out-of-sight in the digital world, requires the preservation staff to rethink how they use their time and resources. All this requires "better coordination among the various parties involved in digital preservation; and the development of tools for appraisal and risk assessment" (Hedstrom and Montgomery, 1998, p. 22). Integrating the DP workflow into the existing preservation organization involves staff and departments that do not usually coordinate with the preservation department. Unlike traditional preservation, "digital preservation would permeate all organizations and institutions, including many who did not regard themselves as playing a digital preservation role and who may initially have regarded digital materials as fairly peripheral to their needs" (Jones, 2005, p. 96).
Especially with the need for balancing priorities, the importance of collaborating with the preservation department cannot be overstated.

Common language
When it comes to DP, a common language is just as important as clearly defined processes and objectives. "Agree on vocabulary – especially with IT" (Bishoff, 2012). As noted above, 'backup' is a common term for computer technologists and programmers. To equate backup to DP is erroneous, but due to the familiarity of this concept, it may be a good place to begin the conversation. Ironically, it is using the same words, but assuming subtly different meanings, that is often the crux of miscommunications. Indeed, several groups on campus may be experimenting with some aspects of DP, but may not be using the same terminology, or have the expertise, that exists in the library (Joint, 2007). This may be the case for entities that are in charge of sensitive university data or trying to meet National Science Foundation Data Management Plan requirements (Smith, 2011). As a result, disparate groups may be duplicating efforts without knowing it. The lack of commonality in the language of DP across campus is a handicap in attaining constructive cross-organizational collaborations. How can librarians bridge the communication gap that exists between them and other computing experts? It is important to take the time to define terminology and establish trust. This should be a key component of the first inter-departmental conversation; its importance cannot be underrated, particularly if librarians aspire to administering their own DP tools.

Next Steps
It is not uncommon to be overwhelmed when one begins to consider DP. The lack of resources makes meeting national standards an improbability. Combine this with the sheer volume and urgency of the problem, and it is tempting to believe that DP is not achievable. Chris Prom of the University of Illinois at Urbana-Champaign Archives states, "first, do no harm...[and]...don't try to do everything at once." Steve Bromage of the Maine Historical Society advises people to "have priorities, not all information and data are collected equally." Many advocates at small or mid-size institutions are struggling with similar issues. Advocates at Milner Library are approaching DP by raising awareness, taking a comprehensive inventory of digital materials, and participating in grants and training opportunities.

Inventory
After awareness-raising presentations and attending a DP workshop, Milner Library will begin by taking an inventory of the digital files and collections within the library. A practicum student from the School of Library and Information Science at the University of Illinois at Urbana-Champaign will be working with Milner to inventory the file types, sizes, media, and locations within the building, as well as gathering any metadata available on those files. All Milner faculty and staff will be interviewed, with special attention paid to the administration, digitization, archives, and special collections departments. This inventory will help the library organize, evaluate, and prioritize the content for safe keeping. The volume of material will help push the conversation regarding required storage space and workflows. This inventory is essential in deciding "whether to build, buy, or outsource (or some combination of all three) its digital preservation activities. This would require a systematic review and evaluation of the University's current projects and the 'preservation-readiness' of the digital content itself" (Colati and Colati, 2011, p. 171).
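As a rough illustration of what a first automated pass over such an inventory might look like, the sketch below walks a storage location and records each file's path, extension, size, and modification date, then prints summary totals. The scan root and output file are hypothetical, and a real survey would also cover removable media, network shares, and the context that only staff interviews can supply.

```python
import csv
import os
from collections import Counter
from datetime import datetime

SCAN_ROOT = "collections_share"   # hypothetical storage location

def inventory(root, out_csv="inventory.csv"):
    """Walk a storage location, recording path, extension, size, and mtime."""
    by_ext = Counter()
    total_bytes = 0
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "extension", "bytes", "modified"])
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    info = os.stat(path)
                except OSError:
                    continue  # unreadable file; a real tool would log this
                ext = os.path.splitext(name)[1].lower() or "(none)"
                by_ext[ext] += 1
                total_bytes += info.st_size
                modified = datetime.fromtimestamp(info.st_mtime).isoformat()
                writer.writerow([path, ext, info.st_size, modified])
    print(f"{sum(by_ext.values())} files, {total_bytes / 1e9:.1f} GB total")
    for ext, count in by_ext.most_common(10):
        print(f"{ext:10} {count}")

inventory(SCAN_ROOT)
```

Even a crude tally like this yields the volume figures needed to push the storage and workflow conversation described above.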
Once the inventory is performed, it will be beneficial to follow the example of Welch et al. (2011, p. 60), who "analyzed its internal user groups - public relations, development, and central administration - and recast its collections to appeal to their unique needs."

Participate
Milner Library also participates in the Institute of Museum and Library Services (IMLS) sponsored National Leadership Grant to investigate multiple collaborative and scalable DP solutions and evaluate them for small and medium-sized college and university libraries. The initiative gathers the expertise of an advisory group from across the United States and partner institutions across the State of Illinois. Within the course of implementing the project, a selection of DP tools will be tested by each of the partners. The testing and documentation of this are reported regularly at http://digitalpowrr.niu.edu/. Although the final output of this project is the publication of a white paper, there are several additional incentives to participate. The first is that piloting DP solutions may gain organizational buy-in with institutional decision makers. It may also empower the 'front lines' via hands-on experience with DP solutions that are custom chosen for their needs. It provides legitimacy to the claim DP advocates make: that DP is necessary for accessing information for future use and that it should be prioritized. Although Milner advocates have had the honor of participating in this particular grant, joining any type of local coalition sends a message to colleagues and administration that DP is important. This effort complements the Train-the-Trainer workshop sessions developed and taught by the Library of Congress Digital Preservation Outreach and Education (DPOE) program. The Consortium of Academic and Research Libraries in Illinois (CARLI) sponsored a DPOE workshop in July 2013. Two of the authors of this paper were able to attend. There is hope that this "increased visibility at a local level supported by national organizations will finally make it possible for all of the talk to become reality" (Molinaro, 2010, p. 47).

Planning
Nancy McGovern, noted digital preservation pioneer, has referenced the idea of the three-legged stool of digital preservation. It consists of organizational infrastructure, technological infrastructure, and a resources framework (Library of Congress, n.d., Nancy McGovern). Milner Library DP advocates are building the three legs of that stool with the IMLS POWRR grant, the digital materials inventory, DPOE training, and cross-departmental collaboration (Figure 2). In addition, discussions have begun regarding the short-term hire of a Visiting Librarian to create a customized DP plan (Figure 3). This position would be advised by various Milner Library stakeholders and would be focused on bridging the gap between what is ideal and what is possible. This position would ensure that progress would continue without further burdening existing staff. As well, it is not a long-term financial commitment, making it more appealing to administration.

[insert Figure 2] [insert Figure 3]

Conclusion
It is difficult to provide an exact map for every institution to take on its path to establishing DP.
This is particularly true for resource-restricted institutions, where the first barrier is the acceptance that DP can and should be done. However, there are several approaches that can be explored. Certainly, training on the practice and importance of DP is essential. Training dispels DP myths and develops a common language. As well, articulating the human challenges is also important. Without recognition, there can be no resolution. Some typical challenges are minimal funds, change fatigue, lack of training, lack of engagement, and lack of recognition from decision makers. Seeking collaborations helps to overcome these challenges, particularly if the collaborations are regional or multi-institutional. Milner Library DP advocates have found that dispelling DP myths, developing a common language, and articulating the human challenges have brought people together and created an environment where a short-term DP position can be discussed. This required that department heads pursue training, allocate what small resources they have to DP activities, and most importantly, collaborate and communicate. No one staff member has the resources to implement DP, but together, the group is moving the process forward. No matter the challenges, DP advocates can go from overwhelmed to taking action if they seek out and take advantage of every opportunity they can.

References
Becker, D. and Riley, J. (2010), "Seeing Standards: A Visualization of the Metadata Universe", in K. Börner and M. Stamper (Eds), 7th Iteration (2011): Science Maps as Visual Interfaces to Digital Libraries, Places & Spaces: Mapping Science, available at: http://scimaps.org.
Bishoff, L. (2012), personal communication, October 16.
Boyd, M. (September 2001), "ISU History: A Tribute to Perseverance, The Construction of Milner Library at Illinois State University", available at: http://library.illinoisstate.edu/unique-collections/history-digital/ISU-history/online/milner.php (accessed June 13, 2013).
Bromage, S. (2012), personal communication, October 15.
Consortium of Academic and Research Libraries in Illinois (2010), "CARLI and digital preservation: A white paper", working paper 7, Digital Collections Users' Group, 1 March.
Colati, J.B. and Colati, G.C. (2011), "Road tripping down the digital preservation highway, part II: Road signs, billboards, and rest stops along the way", Journal of Electronic Resources Librarianship, Vol. 23 No. 2, pp. 165-173.
Digital Preservation Outreach & Education (2013), "DPOE Train-the-Trainer Workshops", available at: http://www.digitalpreservation.gov/education/ttt.html (accessed April 7, 2013).
Duff, W.M., Limkilde, C. and van Ballegooie, M. (2006), "Digital preservation education: educating or networking?", The American Archivist, Vol. 69, pp. 188-212.
Fyffe, R., Ludwig, D. and Warner, B.F. (2005), "Digital preservation in action: Towards a campus-wide program", ECAR Research Bulletin, Vol. 2005 No. 19, pp. 2-14.
Greenfieldboyce, N. (2009), "Houston, we erased the Apollo 11 tapes", NPR, July 16, available at: http://www.npr.org/templates/story/story.php?storyId=106637066 (accessed June 21, 2013).
Halbert, M. (2012), personal communication, October 16.
Harvey, R. (2005), Preserving Digital Materials, Strauss GmbH, Mörlenbach, Germany.
Hedstrom, M. (1998), "Digital preservation: A time bomb for digital libraries", Computers and the Humanities, Vol. 31, pp. 189–202.
Hedstrom, M. and Montgomery, S.
(1998), "Digital preservation needs and requirements in RLG member institutions", working paper, Research Libraries Group, Mountain View, CA, December.
Illinois General Assembly (2013), Open Access to Research Articles Act, Bill SB1900, available at: http://www.ilga.gov/legislation/98/SB/09800SB1900.htm
Jantz, R.C. (2012), "Innovation in academic libraries: An analysis of university librarians' perspectives", Library & Information Science Research, Vol. 34 No. 1, pp. 3-12.
Joint, N. (2007), "Data preservation, the new science and the practitioner librarian", Library Review, Vol. 56 No. 6, pp. 450-455.
Jones, M. (2006), "The Digital Preservation Coalition: Building a national infrastructure for preserving digital resources in the UK", The Serials Librarian, Vol. 49 No. 3, pp. 95-104.
Kolowich, S. (2012), "Giving digital preservation a backbone", available at: http://www.insidehighered.com/news/2012/11/09/educause-call-digital-preservation-will-outlast-individual-institutions-and (accessed June 1, 2013).
Library of Congress (n.d.), "Nancy McGovern: Digital Preservation Pioneer", available at: http://www.digitalpreservation.gov/series/pioneers/mcgovern.html (accessed July 29, 2013).
Library of Congress (n.d.), "NDSA Standards and Practices Working Group", available at: http://www.digitalpreservation.gov/ndsa/working_groups/standards.html (accessed 13 June 2013).
McDonough, J. (2012), personal communication, October 15.
McGath, G. (2011), "Choosing your media", available at: http://filesthatlast.com/2011/12/13/media/ (accessed May 12, 2013).
Meddings, C. (2011), "Digital preservation: The library perspective", The Serials Librarian, Vol. 60, pp. 55-60.
Molinaro, M. (2010), "How do you know what you don't know? Digital Preservation Education", Information Standards Quarterly, Vol. 22 No. 2, pp. 45-47.
National Endowment for the Humanities (2013), "Data Management Plans for NEH Office of Digital Humanities", available at: http://www.neh.gov/files/grants/data_management_plans_2013.pdf (accessed June 21, 2013).
National Institutes of Health (2003), "NIH Data Sharing Policy and Implementation Guidance", available at: http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#archive (accessed 21 June 2013).
National Science Foundation (2011), Chapter II, Proposal Preparation Instructions, available at: http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/gpg_2.jsp#IIC2j (accessed June 21, 2013).
POWRR (2012), "Preserving (Digital) Objects With Restricted Resources", available at: http://digitalpowrr.niu.edu/ (accessed June 24, 2012).
Prom, C. (2012), personal communication, October 15.
Puglia, S., Reed, J., and Rhodes, E. (2004), Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files – Raster Images, available at: http://www.archives.gov/research/arc/techguide-raster-june2004.pdf (accessed June 13, 2013).
Rudersdorf, A.E. (2012), personal communication, October 15.
Smith, D. (2003), "The cost of lost data", Graziadio Business Review, Vol. 6 No. 3, available at: http://gbr.pepperdine.edu/2010/08/the-cost-of-lost-data/ (accessed 21 June 2013).
Smith, P.L. (2011), "Developing small worlds of e-science: using quantum mechanics, biological science, and oceanography for education and outreach strategies for engaging research communities within a university", The Grey Journal, Vol. 7 No. 3, pp. 121-126.
Stewart, C.
(2012), "Preservation and access in an age of e-science and electronic records: sharing the problem and discovering common solutions", Journal of Library Administration, Vol. 52 No. 3-4, pp. 265-278.
Verheul, I. (2006), Networking for Digital Preservation: Current Practices in 15 National Libraries, K.G. Saur, München.
Waters, D. (2002), "Good Archives Make Good Scholars: Reflections on Recent Steps Toward the Archiving of Digital Information", in The State of Digital Preservation: An International Perspective, Council on Library and Information Resources, Washington, D.C., pp. 78-95.
Welch, J.M., Hoffius, M.S., and Fox, E.B. (Jan 2011), "Archives, accessibility, and advocacy: A case study of strategies for creating and maintaining relevance", Journal of the Medical Library Association, Vol. 99 No. 1, pp. 57-60.

Figure 1. Milner Library's organizational infrastructure and digital preservation advocates. Departments in bold are staunch digital preservation advocates and those with asterisks are likely converts. (The chart shows Milner Library divided into three units: Information Assets - Preservation and Archives; Access Services; Cataloging, Acquisitions and Processing*; Special Collections*. Strategic Technology - Digital and Data Services; Digital Collections; Library Information Technology*. Public Services - Liaison and Reference Services*; Collection Development*; Information Use and Fluency; Government Documents.)

Figure 2. Milner Library digital preservation advocates' building of Nancy McGovern's three-legged stool: "Digital Preservation Prepared" resting on Technology (the "how"), Organizational (the "what"), and Resources (the "how much").

Acknowledgments
The authors would like to gratefully acknowledge the POWRR board of advisors, Liz Bishoff, Steve Bromage, Martin Halbert, Jerry McDonough, Chris Prom, and Amy E. Rudersdorf; Lynne Thomas and Drew VandeCreek, Principal Investigators, Northern Illinois University; and Teresa Mitchell for her copy editing services and assistance with manuscript preparation.

Biographical Details
Amanda K. Rinehart is Head of Data and Digital Services at Milner Library, Illinois State University, Normal, USA. She is a representative of Illinois State University for the IMLS National Leadership grant, Preserving (Digital) Objects With Restricted Resources (POWRR) -- a collaboration with other Illinois academic institutions to evaluate suitable digital preservation solutions for small and medium-sized college and university libraries.
Patrice-Andre Prud'homme is Head of Digital Collections at Milner Library, Illinois State University, Normal, USA, and also a PhD student in the College of Education, Higher Education Administration. He is the project lead for Illinois State University in the IMLS National Leadership grant, Preserving (Digital) Objects With Restricted Resources (POWRR) -- a partnership with other Illinois academic institutions led by Northern Illinois University in a collaborative effort to evaluate suitable digital preservation solutions for small and medium-sized college and university libraries.
Andrew Huot is the Conservator and Preservation Specialist at Milner Library, Illinois State University, Normal, IL, USA. He also teaches Conservation, Preservation, and Bookbinding for the School of Library and Information Science at the University of Illinois at Urbana-Champaign.
work_vdj5qi7ph5fbzb6omc6uoqazva ----

Türk Kütüphaneciliği [Turkish Librarianship], 7, 3, (1993), 238-240

NEWS (SUMMER 1993)

A COLLEAGUE ON THE INTERNATIONAL STAGE
Librarianship and Information Work Worldwide 1992 was published by Bowker & Saur. Among the members of its Consultant Editorial Board, we were proud to read the name of Hilmi Çelik, Director of the Library, Documentation and Translation Department of the TBMM (Grand National Assembly of Turkey). We congratulate our colleague.

DEWEY DECIMAL CLASSIFICATION (DDC)
The DDC Editorial Policy Committee has begun work on the 21st edition. The subject areas to be comprehensively revised are 350-354 Public administration, 370 Education, and 560-590 Life sciences. In addition, Tables 1, 2, and 5 will be reviewed, along with 130 Paranormal phenomena, 150 Psychology, 290 Comparative religion, 310 Statistics, 320 Political science, 340 Law, 398 Folk literature, 630-635 Agriculture, 670-680 Manufacturing, 796 General aspects of sports and games, and 900 History and geography. Subcommittees have been formed for the areas to undergo comprehensive revision: Martin Kuth is responsible for Public Administration, Gail P. Huoting for Education, and Arnold Wajenberg and Pat Thomas for Life Sciences. The correspondence address for the subcommittees is OCLC, Forest Press, 85 Watervliet Avenue, Albany, NY 12206-2082.
OCLC Forest Press has released DDC20 on CD-ROM. With advanced search techniques and full-text indexing, this medium offers faster and more effective classification.

SERIALS PRICE INDEX
Cost estimates are an important element of budget preparation. On the basis of its analyses, B. H. Blackwell has published its 1993 price index for the United Kingdom. The index is arranged by the countries in which the serials are published and by main subject groups. Interested research libraries may contact our Association.

PUBLICATIONS OF THE HATAY BRANCH
Copies of "Bekir Sıtkı Kunt" and "Kuruluşunun 50. Yılında Hatay İl Halk Kütüphanesi" [The Hatay Provincial Public Library in the Fiftieth Year of Its Founding], published by this branch in 1990 and 1991, are still available, and those wishing to obtain them may apply to the Hatay Branch. Bekir Sıtkı Kunt was a short-story writer from Hatay; the book contains information about the author and the work, together with some of his stories. The second publication, on the library, covers in detail the history of Antakya, the place of libraries in that historical development, those who have served the profession in the region, and the present-day work of the Hatay Provincial Public Library.

NUMBER OF PUBLIC LIBRARIES RISES TO 1,084
The Ministry of Culture is working to extend public libraries throughout the country. Within this project, the Samsun-Salıpazarı, Samsun-Tekkeköy, Gaziantep-Nurdağı, Ankara-Şereflikoçhisar-Çalören, Diyarbakır-Bismil, Ordu-Mesudiye-Topçam, İçel-Merkez (Branch), Muğla-Bodrum-Gündoğan, Muğla-Bodrum-Yalıkavak, Muğla-Yatağan-Yeşilbağcılar, Denizli-Çal-Akkent, İçel-Bozyazı, Antalya-Elmalı-Yuva, İstanbul-Kartal, Sivas-Gemerek-İnkışla, Kırşehir-Merkez-Hamit, İzmir-Ödemiş-Birgi, Yozgat-Sorgun-Doğankent, Bilecik-Pazaryeri, İzmir-Ödemiş-Konaklı, Antalya-Akseki-Akşahap, and Aydın-Koçarlı-Bıyıklı libraries have been opened to service with personnel temporarily assigned by local administrations, and the Antalya-Tekelioğlu Provincial Public Library and the Burdur Provincial Public Library, built through capital investment, have also been opened.
LARGE NUMBERS OF CURRENT PUBLICATIONS ACQUIRED FOR PUBLIC LIBRARIES
In order to enrich public library collections with current publications and thereby raise readership and the quality of service, the Ministry of Culture's Publication Selection Board has decided to subscribe to 7 daily newspapers for 150 libraries and 2 daily newspapers for 790 libraries, to subscribe to 93 periodicals including the Official Gazette, and to purchase 249,339 copies of 1,094 book titles.

PROJECT FOR THE EXPANSION OF LIBRARIES
The "Expansion of Libraries" project, whose implementation began in 1993, aims to spread cultural awareness and the reading habit to wider audiences by providing financial support at the design and construction stages to local administrations that wish to establish libraries in their own areas. Under this project, applications from municipalities that had begun library construction and requested assistance from the Ministry of Culture to complete their buildings were evaluated, and to date TL 2,890,000,000 in funding has been sent.

A PROFESSIONAL BOOK TO BE PUBLISHED BY THE GENERAL DIRECTORATE OF LIBRARIES
Through the work of librarians Oya Gürdal, Serap Narinç, Aytaç Yıldızeli, and Bülent Yılmaz, an index of the Bulletin of the Turkish Librarians' Association covering the years 1952-1992 has been prepared under the title "Türk Kütüphaneciliği Dergisi Dizini" [Index of the Journal Türk Kütüphaneciliği]. This index, which contains a great deal of information on librarianship and is of particular importance to researchers as a collective bibliographic work, will be published by the Ministry of Culture in its Librarianship Profession Series.

SYMPOSIUM ON "PUBLIC LIBRARIANSHIP" IN PREPARATION
The General Directorate of Libraries will organize a national symposium on "Public Librarianship" from 29 November to 1 December 1993, with scholarly papers presented by experts from university departments of librarianship and from various libraries.

WE HAVE LOST MAİDE (ÜNLÜ) VİDİNLİ
We are saddened by the loss of Maide (Ünlü) Vidinli, one of the first graduates of the Department of Librarianship of Ankara University's Faculty of Language, History and Geography, and founder of the Ege University Central Library, as the result of a traffic accident on 15.06.1993, and we extend our condolences to the librarianship community.

PROF. DR. MUSTAFA AKBULUT BECOMES HEAD OF THE YÖK DOCUMENTATION DEPARTMENT
Prof. Dr. Mustafa Akbulut, Head of the Department of Librarianship at Ankara University's Faculty of Language, History and Geography (A.Ü. DTCF), has been appointed acting head of the Documentation and Information Provision Department of YÖK (the Council of Higher Education), replacing Prof. Dr. Nilüfer Tuncer, who left the post on 07.05.1993. We congratulate him and wish him success.

BENGÜ ÇAPAR PROMOTED TO PROFESSOR
Bengü Çapar, a faculty member of the A.Ü. DTCF Department of Librarianship, became a Professor on 26.07.1993. We congratulate Çapar and wish her continued success.

GET-WELL WISHES
Prof. Dr. Osman Ersoy, a retired faculty member of the A.Ü. DTCF Department of Librarianship, underwent successful prostate surgery, and Department Head Prof. Dr. Mustafa Akbulut successful gallbladder surgery, at the A.Ü. İbni Sina Hospital. We wish both faculty members a speedy recovery and good health.

CREDIT AND DORMITORIES INSTITUTION ORGANIZES SEMINAR FOR LIBRARIANS
The General Directorate of the Credit and Dormitories Institution (Kredi ve Yurtlar Kurumu) organized a Professional Training Seminar in Sinop on 12-16 July 1993 for the librarians and administrative staff working in the libraries of its higher-education student dormitories in various regions of the country. The seminar, in which Asst. Prof. Dr. Doğan Atılgan and Research Assistants Özlem Gökkurt, Oya Gürdal,
and Fahrettin Özdemirci of the A.Ü. DTCF Department of Librarianship took part as instructors, was attended by 59 trainees, 28 of them librarians.

FIRST GRADUATES
The Archival Science and Information-Documentation programs, which began instruction within the A.Ü. DTCF Department of Librarianship in the 1989-1990 academic year, have produced their first graduates. We wish the young graduates success in their future lives.

OUR PROFESSION IS ON THE RISE
A promising news item about our profession appeared in the 9 June 1993 issue of the newspaper Milliyet. According to the results of an American study reported there, librarianship will rise to 6th place, at 61.9 percent, in the ranking of "rising occupations" for the years 1990-2005.

PATENT CD-ROM DATABASE
Patent Express, a service of the British Library, has created a large international CD-ROM database consisting of 1,000 CD-ROMs containing 1 million American, British, European, and PCT patent documents. Opening the system to remote access 24 hours a day is planned for the future.

TODAİE SEEKS A LIBRARIAN
The TODAİE Library will hire a librarian for its reader services unit. Appointments can only be made by transfer from institutions subject to Law No. 657. Applications: Filiz Yücel, Tel: 230 42 91

work_vfj2jnzebvdjzchxvea3jegrai ----

Tracking the History of Romani Publications: Challenges Presented by Flawed Data
Geoff Husic1

1 Slavic & Near East Studies Librarian. BA Russian and German (Middlebury College), MA Slavic Languages and Literatures (University of Kansas), MS Library and Information Science (University of Illinois). Room 519, Watson Library, University of Kansas, 1425 Jayhawk Blvd, Lawrence, KS 66045-7544. (husic@ku.edu).

Abstract: Romani is a language of northern Indic origin spoken natively by an estimated 2.5 million people, primarily in Eurasia but also in North America. The history of publication patterns in Romani has not been well documented. Extracting data about this history based on available information in large bibliographic databases such as OCLC WorldCat has been hampered by unfortunate misapplication of certain language codes, making it all but impossible to efficiently filter search results using the Romani language as a parameter. The author discusses how he was able to correct much of this inaccurate data in OCLC WorldCat.

Keywords: Romani (Romany) language publications, OCLC WorldCat, Cataloging databases, Language codes.

The history of publication in the Romani language or on the topic of Romani has not been very well documented.2 This holds true especially for Internet publications, but
I have been monitoring the development of Romani-language Web publishing over the last ten years in this very context. Due to the nature of Romani, it can be written in a variety of dialects and rather chaotic writing systems, so that discovering Romani sources on the Web can be challenging. Because of the rapid movement of many aspects of publishing to the Internet, it seemed to me a good time, as a companion project, to try to create a thorough retrospective bibliography of materials published about the Romani language, or in any of the several Romani dialects, that have appeared in print in the last 300 or so years. By necessity this bibliography, when completed, will primarily index materials on the whole book or journal level. Unlike the Web, in which data is, for the most part, still very unstructured, library bibliographic databases are universally based on very structured data 2 Romani is Library of Congress spelling of the language, but is more conventionally spelled Romany in English. schemes. Theoretically this should make it very easy to extract bibliographic data concerning the publication history of a particular language, or, in my case, Romani. However, the path to even beginning an analysis of Romani publications has been somewhat complicated. The database, based on which I wished to conduct the analysis, was OCLC WorldCat3, the bibliographic database used by the majority of North American academic libraries for cataloging and reference purposes. Its use has also steadily expanded to libraries worldwide. Bibliographic records in this database are encoded in the MARC format, which allows for very granular encoding of information for each bibliographic entity it represents. These entities most typically represent texts (books, journals, manuscripts, etc.) but can also be scores, maps, computer files, sound files, etc. Catalogers, who use this database for cataloging locally held library materials, the records of which then get uploaded to their library’s catalog, usually do their work through a technical-services interface called Connexion. Some libraries employ another method, where they create records in their local library catalog client, and then upload their records back to OCLC WorldCat. The latter libraries, when cataloging ‘copy’ records (those that already have some kind of record in OCLC WorldCat and which can vary widely in quality and fullness), may choose to make corrections and enhancements in their local catalog only. This sometimes makes sense from a workflow perspective, but it does not benefit other libraries, that subsequently need to use these bibliographic records, if errors have not been corrected in the WorldCat master records. As a result of 3 WorldCat has a variety of public and cataloging interfaces. The client version used by libraries for cataloging purposes is called Connexion. this practice, some libraries will have somewhat more accurate data in their local catalogs then was originally imported from WorldCat. Among the kinds of information encoded about each book or journal are the language or languages represented in the work. Each languages represented is encoded using OCLC’s three-letter language designation, on which the ISO Codes for the Representation of Names of Languages is also based.4 These codes are added to a dedicated portion (the fixed-field language field) of the MARC record and can be used in WorldCat and other local library catalogs, into which MARC records have been uploaded, to limit catalog search results by language. 
For items that are multilingual, there is an additional MARC field, the 041 field, in which further language information can be recorded, such as the languages of multilingual texts, the original language of translations, and the languages of summaries. Most of the common language codes are transparent and easy to remember, e.g., eng for English and rus for Russian. Occasionally, codes for less commonly encountered languages must be looked up by catalogers, who cannot be expected to have memorized all of the thousands of language codes available. However, due to oversight by some catalogers, a situation had developed in WorldCat that had eroded the ability to identify Romani-language materials in the database.

4 See http://www.loc.gov/standards/iso639-2/php/code_list.php for a full list of these language codes.

This is the essence of the problem: the official OCLC language code for Romanian is rum. This code was chosen because, at the time these codes were established, the spelling Rumanian was still the more common one in English. The spelling Romanian became more common starting in the late 1960s and is now the standard in English. Understandably, most libraries are much more likely to encounter and catalog materials in Romanian than in Romani. What has occurred over the years is that many libraries, when cataloging Romanian-language materials in the OCLC WorldCat database, have been miscoding Romanian (language code rum) with the language code rom. However, rom is actually the language code assigned to Romani. These coding errors have resulted in many Romanian records being miscoded as Romani. As there are obviously several orders of magnitude more published works in Romanian than in Romani, this has made the task of extracting information about Romani publications all but impossible.

In late 2011, I contacted OCLC to alert them to this problem and to solicit their cooperation in correcting it. I informed them that my goal was to extract information from the database about Romani materials in order to construct a thorough retrospective bibliography and chronology, as well as my desire, as a cataloger, to see these coding errors corrected. OCLC technical staff were very cooperative and eager to help. They provided me with an initial spreadsheet of all records in OCLC that had the language code rom (Romani), either in the MARC language fixed field or in the MARC 041 field. This spreadsheet was helpful for getting an initial overview of the scope of the problem. For a variety of reasons, I soon decided to abandon the spreadsheet approach and to do my work directly through the OCLC Connexion interface. The main impediment was that the spreadsheet also included hundreds of duplicate records for many books, based on cataloging done mainly by European national libraries. In these records the language of cataloging, i.e., of the description, notes, subject headings, etc., is not English but the language of the national cataloging agency, and I felt it was unmanageable to attempt to correct all of those records. A recent enhancement to the OCLC Connexion software allowed me to limit the set easily to bibliographic records produced by English-language cataloging agencies. These are the records that will be used by North American and British libraries, so I felt my efforts were best placed in correcting them.
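Records at risk of this mix-up can also be screened mechanically before any manual review. The following is a minimal sketch in Python of such a screen, assuming the pymarc library and a local file of exported MARC records (records.mrc is a hypothetical file name); it illustrates the idea rather than the workflow described in this article:

    # Flag records whose fixed-field or 041 language codes contain
    # "rom" (Romani) so they can be reviewed for the rum/rom mix-up.
    from pymarc import MARCReader

    with open("records.mrc", "rb") as fh:
        for record in MARCReader(fh):
            codes = set()
            fixed = record["008"]
            if fixed is not None:
                # Positions 35-37 of the 008 field hold the language code.
                codes.add(fixed.data[35:38])
            for field in record.get_fields("041"):
                codes.update(field.get_subfields("a"))
            if "rom" in codes:
                ctl = record["001"]
                ttl = record["245"]
                print(ctl.value() if ctl else "?", "--",
                      ttl.value() if ttl else "")

A list produced this way still requires human judgment on each record since, as noted below, a rom code is sometimes correct even on an item with a Romanian title.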
Sifting through and correcting the records that needed review (over 2,600) required familiarity with Romanian, Romani, and a number of other languages. Fortunately, as a specialist in Eastern European languages who is proficient in Romanian and very knowledgeable in several Romani dialects, I was eager to lend my assistance to the task. Scouring the records encoded rom, I first set aside those that clearly had some Romani content, eliminating them from the problematic set. The remainder I had to scrutinize with care. While the problem originated in the confusion of the language codes for Romanian and Romani, there are in fact many works that contain both Romanian and Romani content, so I needed to ensure that I did not eliminate works purely on the basis of, say, a Romanian title. Ultimately I ended up with approximately 1,400 items that required further scrutiny. I then downloaded the OCLC records for these items so that I could view the full bibliographic information, such as subject headings, to help me ascertain the content.

The following is a brief overview of the kinds of errors I identified in the problematic set. Approximately 1,200 records were actually Romanian-language materials that had been miscoded rom. Forty or so were Romansh-language materials, the correct code for which is roh. There were also quite a few other oddities, such as rom coded for Latin texts, or assigned apparently out of confusion with the place of publication, as in the case of an English text published in Rome. Several dozen more were texts in Hungarian and other languages, printed in Romania, that were incorrectly coded rom; in these cases a cataloger unfamiliar with the languages presumably extrapolated the incorrect language code from the place of publication, or perhaps the codes were artifacts of an automated conversion project. A small number were coded rom because a cataloger apparently thought this was the proper way to indicate something in the roman script. Finally, in a few cases, not only was the record coded rom but it also contained textual language notes such as "In Hungarian and Romanian." This is rather curious; perhaps the notes were generated automatically from the fixed-field or 041 language codes. It is difficult to tell for certain.

In those cases where I found the language code rom to be incorrectly assigned, as a cataloger in an OCLC Enhance Program library I was able, in most cases, to correct the OCLC WorldCat master record to reflect the correct language.5 In many cases I also fixed incorrect language notes as appropriate. There were quite a few cases where I was unable to correct the codes, mainly records containing other incorrect coding that made it impossible to update the master record. Some errors also appeared in records for music scores; I am not familiar with the scores format, so I was reluctant to make corrections to those records for fear of causing unforeseen problems.

5 The Enhance Program allows qualified member libraries to correct and add information to bibliographic records in OCLC WorldCat.

The bulk of this project has now been completed. Users of WorldCat will now be able to filter results using the Romani language as a search parameter much more reliably, if not yet perfectly, than before. There remain a number of items I will need to ask OCLC or other libraries to fix in the master records.
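One rough, automated cross-check that could complement this kind of manual review: tally the place-of-publication codes of the suspect records. A minimal sketch under the same assumptions as above (pymarc; suspect.mrc is a hypothetical file of the downloaded records); a heavy concentration of the MARC country code for Romania would corroborate the rum/rom hypothesis, though it can never replace inspection, precisely because genuine Romani materials are also published in Romania:

    # Tally 008 place-of-publication codes (positions 15-17) for the
    # suspect set; "rm" is the MARC country code for Romania.
    from collections import Counter
    from pymarc import MARCReader

    tally = Counter()
    with open("suspect.mrc", "rb") as fh:
        for record in MARCReader(fh):
            f008 = record["008"]
            if f008 is not None:
                tally[f008.data[15:18].strip()] += 1

    for place, count in tally.most_common(10):
        print(place, count)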
I intend to monitor items newly added to WorldCat periodically in order to catch the incorrectly coded records that are sure to appear. I would encourage other WorldCat users to correct these mistaken language codes as well, especially for minority languages such as Romani, whenever they encounter them. A few major academic libraries have also made corrections in their local catalogs on the basis of my personal communications with them about this issue.

Now begins the hard part! Having cleaned up much of the data in WorldCat, I have begun to examine how best to extract it. I will likely import the data into EndNote (a citation management program), into Zotero (a bibliographic tool that plugs into the Mozilla Firefox browser), or into both tools. Certain character-encoding issues that occur when importing bibliographic data into these programs will also have to be addressed to minimize the amount of manual editing I must do. When the majority of the data has been imported satisfactorily, I can begin to correct any additional errors and to add useful metadata as well as my annotations. I can then finally begin an analysis of the actual publication patterns by time period, dialect, place of publication, genre, and other parameters of interest. The results will be published upon completion in a venue to be determined.

work_vjlpjwtuaje5rcdonj5trptpmi ---- PII: 0098-7913(91)90021-A

"A DREAM UNFOLDING": A GUIDE TO SELECTED JOURNALS, MAGAZINES, AND NEWSLETTERS ON PEACE, DISARMAMENT, AND ARMS CONTROL

Grant Burns

Burns is a reference librarian at the University of Michigan-Flint Library. This article is based in part on his book The Nuclear Present, a guide to current literature on nuclear war, nuclear weapons, and the peace movement, to be published by Scarecrow Press.

Why talk "peace movement" in a world that has recently been described as seeing peace "breaking out all over," where "velvet revolutions" have deposed communist dictatorships throughout Eastern Europe, and where the prospect of a head-on nuclear "exchange" between the U.S. and the Soviet Union seems to be the stuff of memory? If the recent experience in the Persian Gulf is not a sufficient reminder that peaceful resolution of human conflict is scarcely an entrenched habit of the species, then brief perusal of such documents as Amnesty International's annual reports should relieve most readers of any unwarranted rosy feelings about peace on earth and good will prevailing among men, women, and children. From Indonesia to Ethiopia, from the Philippines to El Salvador, thousands of people are being killed, tortured, and otherwise physically intimidated for political purposes. Crippled by overwhelming military demands, national budgets fail to meet basic civil needs. Arms merchants, Bob Dylan's "Masters of War," swoop down to satisfy the hardware hungers of any state that can ante up the cash for the latest hot new missile or tank. Peace may be a "dream unfolding," as Penney Kome and Patrick Crean say in Peace: A Dream Unfolding (Sierra Club, 1986), but large numbers of people are not yet a part of the dream. The peace movement exists to help make the dream real. In a world where the U.S.
and the Soviet Union still possess some 50,000 nuclear warheads, and where, according to a recent Brookings Institution report, at least sixteen nations possess ballistic missiles with ranges of up to 1,500 miles, that elusive reality is in clear need of assistance.

Peace Periodicals

Talking about the periodicals of the peace movement requires some further reflection on what that movement is--and what it is not. The "Peace Movement" is not a monolithic, unified force with a single, clear objective, but a loose assembly of individual social, political, and religious movements with diverse concerns. The assembly is composed of local, regional, national, and international organizations, from the church group that meets down the street to the Council for a Livable World, Women's Action for Nuclear Disarmament, and International Physicians for Social Responsibility.

The peace movement is also composed of individuals who belong to no voluntary associations, but whose awareness of the destructiveness of war and other forms of institutionalized violence as tools for addressing social and political problems leads them to question and criticize policies related to these tools. Those who contribute to the peace movement may do so by making financial donations to groups like Pax Christi or the CCCO (formerly the Central Committee for Conscientious Objectors). They may contribute by taking part in mass demonstrations against their governments' use of force, whether in Afghanistan or Panama or South Africa or the Middle East. They may contribute by writing letters to their local newspapers, or by talking with friends and work acquaintances about peaceful approaches to national and global problems.

Yet the peace movement extends far deeper than any of those activities. It entails a commitment to ways of living that honor life at large. This commitment can manifest itself in solitary reflection on the shared goals and trials of humanity, in prayer, in the practical application of environmental awareness (for the ways of peace necessitate peaceful treatment of the planet every bit as much as a peaceful approach to other people), and in the education of one's children in enlightened thinking about war and violence.

What is the peace movement? Is it a current of mingled hope and realization issuing from the soul of humanity and manifesting itself in a thousand very different yet complementary ways? Is it the sign of a slowly dawning but relentless consciousness that survival--of the species and of the planet--depends on cooperation rather than conflict? It may be nice to think so. But whatever the peace movement's origins, it is by now far too varied in its people and its activities to permit simplistic definitions.

A Diverse Literature

Given the great diversity of the peace movement, it stands to reason that its literature is equally diverse. It is. The objectives of the peace movement involve far more than the mere absence of people trying to kill one another to achieve their goals, although that absence is the fundamental reason for the movement's existence. Periodicals advocate the causes of peace from many different perspectives informed by a wide variety of values and experience.
People come to the peace movement with religious and philosophical motivations, with environmental concerns, with basic human compassion for the suffering of others, with legal and medical perspectives, even with enlightened business sense; the cynical slogan of the Vietnam War era, "War is good business; invest your son," is one whose irony many executives have come to recognize. (Many more, alas, have not; General Electric may claim to "bring good things to life," but it is also one of the nation's premier nuclear weapons contractors, and as such has been for several years the target of a nationwide boycott by the peace movement, as well as the object of various direct protest actions.)

Since Vietnam

The Vietnam War did more than any other recent event to stimulate the development of peace (or at least antiwar) publishing in the United States, especially through the briefly-flourishing underground press movement. Since the war's end, the periodicals of the peace movement have proliferated and diversified. Further, they have strengthened their theoretical underpinnings and have broadened their scope, moving beyond the gut issue of opposition to a specific war to address the multiple issues of peace, justice, and freedom.

With the election of Ronald Reagan and the U.S. military buildup of the early 1980s, and with the intense focus early in the Reagan years on renewed fears of nuclear war, the past decade witnessed an impressive resurgence and maturing of publishing on issues of war and peace. It was a resurgence particularly strong at the grassroots level, the level of citizen action suggesting that the "slowly dawning but relentless consciousness" is a force real and insistent.

The most dramatic example of the grassroots peace movement in the 1980s was the Nuclear Freeze movement, a movement that did as much as anything to bring mainstream legitimacy to nuclear weapons protest, even to the extent of a congressional resolution in its favor. The freeze movement was eventually manipulated and co-opted by the Reagan administration's assertions about making nuclear weapons "impotent and obsolete" through the Strategic Defense Initiative, and about their complete abolition, but the movement's accomplishments were real and significant.

The Persian Gulf War brought some soul-searching to many peace movement activists and sympathizers; it is a search that can be traced in grassroots periodicals. The overwhelmingly sympathetic mass media treatment of the Bush administration's pursuit of the war led some long-time peace advocates to a cringing support of the U.S.-led war effort; other activists, who maintained a strong opposition to the war, nevertheless were sidetracked into devoting an embarrassing amount of time to showing "support for the troops." With the illusion of what General Colin Powell called in a post-war speech to the Veterans of Foreign Wars (VFW) "a clean win" having dissipated in the bloody aftermath of the Gulf ceasefire, there will be further discussion in the grassroots press about the maintenance of clear thinking regarding war as a necessary evil.

At any rate, it is ultimately to the grassroots periodicals that one must turn to sense the depth of emotional and intellectual commitment that comprises the peace movement. These publications can, if one is in the mood, prove almost overwhelming in the purity of their commitment to peace on earth.
The depth of caring in these periodicals is often so intense, so profound, and so selfless that reading them can be an almost transcendent experience, carrying one into the very realm of passion felt by those who would "save the world."

The arguments against subscriptions to grassroots periodicals are well-known: they aren't indexed, they tend to be irregular, sometimes the editorial style seems more than a little homespun to those accustomed to the banal slickness of the mass newsweeklies. Forget those threadbare arguments. The library that fails to make such literature available to its readers, and I am ashamed to say that most libraries fail on this point, is depriving them of the opportunity for a powerful intellectual and emotional adventure, one that has the potential to be life-changing.

Some Notes on the Selected List

The periodical list here is not comprehensive. It omits a lot of newsletterish publications, but it includes others for flavoring. It makes no real effort to cover periodicals published outside North America, although some do show up. Some titles that I tried to obtain for review eluded my grasp.

The titles appearing in this article come to light through a combination of circumstances, some deliberate, some fortuitous. I first began paying serious attention to peace periodicals in the early 1980s. Some titles here I have known and admired for a number of years, others are new to me. For The Nuclear Present, forthcoming from Scarecrow Press, I annotated a substantial number of peace movement periodicals, along with other titles dealing with nuclear issues from military and political perspectives. This task entailed use of such standard periodical guides as Ulrich's as well as recent reference books noting likely periodicals. Many of those titles appear in this article.

My intention here is to present titles of potential use in almost any U.S. library. The list omits some interesting titles, such as some religious denominational publications, because their focus is too constricted for a general audience. Subscription prices, dates of first publication, circulation, ISSNs, and OCLC numbers are noted when ascertained. If there are two subscription costs, the first is for individuals, the second for institutions. Given today's rapid changes in periodical prices, the figures listed here cannot be expected to prevail for long. When titles are identified as covered by indexing or abstracting services, such tools are noted.

Titles are grouped for reader convenience in subject categories. Treating periodicals in this fashion is usually a risky game, for it compels forcing some against their will into boxes that don't really fit their natures. Nuclear Times, for example, noted in the "Professional" section, is at the same time very much a product of grassroots sensibilities, as is Ground Zero, located in the "Religious" division. No one thinking straight attempts to arrange periodicals in anything but alphabetical order. I went ahead and did it, anyhow.

Incomplete though the list below is, few libraries even begin to offer their users a healthy sample of the periodicals it covers. Perhaps some will be inspired by this article to take some corrective measures (see sidebar 1).

GENERAL TITLES WITH PEACE FOCUSES

Greenpeace Magazine (see figure 1). Edited by Andre Carothers. 1436 U St., NW, Washington DC 20009. 6/year. $20. 1981-. OCLC 16718179. ISSN 0899-0190. Circ. 800,000. Indexed: Alternative Press Index.
Next to the Sierra Club, Greenpeace is probably the world's best-known environmental organization. Its magazine is one of the leading journals of environmental activism, providing coverage on a broad range of issues. One of the organization's persistent interests has been pollution from nuclear weapons and nuclear power operations; its ship, the "Rainbow Warrior," was the target of a lethal 1985 terrorist bombing by French government functionaries in New Zealand. Among the articles in the magazine, one finds frequent pieces on peace-related topics. Early 1991 issues, for example, featured reports on Greenpeace actions in the Soviet nuclear weapons test territory in the Barents Sea and on French nuclear testing at Moruroa. A highly desirable addition to any library's peace and environmental offerings.

New Outlook. Edited by Robert Berls. American Committee on U.S.-Soviet Relations, 109 11th St. SE, Washington, DC 20003. Quarterly. $25. 1990-.

New Outlook is the official journal of the American Committee on U.S.-Soviet Relations. This independent, nonpartisan group, established in 1974, dedicates itself "to strengthen official and public understanding of U.S.-Soviet relations by providing accurate information and expert analysis." Only one issue could be reviewed for this guide; that one, the Winter 1990/91 number, contained in its 90 pages an extensive analysis of "Reform and the Soviet Armed Forces," addressing the U.S.-Soviet strategic balance, the Soviet defense conversion process, "The Troubled Soviet Armed Forces," and other topics. The report included pertinent Soviet documents, such as a public opinion poll from August 1990 indicating that only 12 percent of Soviet citizens believed that a threat of military attack against the Soviet Union then existed. The periodical reflects a thorough journalistic rather than a scholarly approach. It should prove a useful source of information and opinion.

Sidebar 1: From the Belly of the Beast

If a library cannot pretend to offer an adequate collection of periodicals on peace without including a decent sample of representative grassroots titles, it also cannot do the job unless it covers the other side of the coin with periodicals issuing from and devoted to the military-industrial complex. I'll not take up space to describe the following listings, but some titles worth carrying are:

Air Force Magazine. Edited by John T. Correll. Air Force Association, 1501 Lee Hwy., Arlington, VA 22209. Monthly. $21. 1942-. OCLC 5169825. ISSN 0730-6784. Circ. 235,000. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; America: History and Life; Historical Abstracts; International Aerospace Abstracts.

Air University Library Index to Military Periodicals. Edited by Emily J. Adams. Air University Library, Maxwell AFB, AL 36112-5564. Quarterly; cumulated annually. Free to libraries. 1949-. OCLC 2500050. ISSN 0002-2586. Circ. 1,500.

A subject index to approximately 80 English-language military and aeronautical periodicals. The substantial book review index could be helpful in locating reviews not indexed in other sources and in identifying the books themselves. A single issue runs to approximately 160 pages; publication lags a year or so behind the period being indexed. Was Air University Periodical Index until 1962.

Airpower Journal. Edited by Col. Keith W. Geiger. Air University, Maxwell AFB, AL 36112. Quarterly. $9.50. 1947-. OCLC 16481534. ISSN 0897-0823. Circ. 20,000. (U.S. federal depository serial D 301.26/24:2/4). Indexed: Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; America: History & Life; Abstracts of Military Bibliography; Engineering Index Monthly; Historical Abstracts; Index to U.S. Government Periodicals; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies.

Comparative Strategy. Edited by Richard B. Foster. Taylor & Francis, 1900 Frost Rd., Ste. 101, Bristol, PA 19007. Quarterly. $89. 1978-. ISSN 0149-5933. Indexed: Abstracts of Military Bibliography; American Bibliography of Slavic & East European Studies; Current Contents; International Political Science Abstracts; PAIS; Peace Research Abstracts; Social Science Citation Index.

Defense Analysis. Edited by Martin Edmonds. Pergamon Press Journals Div., Maxwell House, Fairview Park, Elmsford, NY 10523. Quarterly. $100. 1985-. OCLC 10490881. ISSN 0743-0175. Indexed: Current Contents.

Global Affairs. Edited by Charles M. Lichenstein. International Security Council, 1155 Fifteenth St. NW, Suite 502, Washington, DC 20005. Quarterly. $24. 1986-. OCLC 12954805. ISSN 0886-6198. Circ. 16,600.

International Security. Edited by Steven E. Miller. Harvard University Center for Science and International Affairs, 79 John F.
Kennedy St., Cambridge, MA 02138. Quarterly. $25/$65. 1976-. OCLC 2682087. ISSN 0162-2889. Circ. 5,500. Indexed: ABC Pol Sci; Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; America: History & Life; Future Survey; Historical Abstracts; International Bibliography of the Social Sciences; International Political Science Abstracts; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Political Science Abstracts; Risk Abstracts; Social Science Citation Index.

Jane's Defence Weekly. Edited by Peter Howard. Sentinel House, 163 Brighton Rd., Coulsdon, Surrey CR5 2NH, England. U.S. subscriptions: 1340 Braddock Pl., Alexandria, VA 22314. Weekly. $145. 1980-. OCLC 10366120. ISSN 0265-3818. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology.

National Defense. Edited by F. Clifton Berry, Jr. American Defense Preparedness Assoc., 2101 Wilson Blvd., Ste. 400, Arlington, VA 22201. 10/year. $35. 1920-. OCLC 4867930. ISSN 0092-1491. Circ. 40,200. Indexed: Abstracts of Military Bibliography; Aerospace Defense Markets & Technology; Air University Library Index to Military Periodicals; Chemical Abstracts; Engineering Index; Predicasts Overview of Markets and Technologies.

Strategic Review. Edited by Walter F. Hahn. U.S. Strategic Institute, PO Box 618, Kenmore Sta., Boston MA 02215. Quarterly. $15. 1973-. ISSN 0091-6846. Circ. 3,500. Indexed: Abstracts of Military Bibliography; Air University Library Index to Military Periodicals; American Bibliography of Slavic & East European Studies; Chicano Periodical Index; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Social Sciences Index.

Other useful military titles can be identified in Katz's Magazines for Libraries and in Michael E.
Unsworth's "Professional Military Journals: An Overlooked Resource" (Serials Librarian 10 (Summer 1986): 143-54). These journals and magazines will give the reader interested in peace and related issues some highly useful and enlightening perspectives on military thinking, the weapons industry, and the political connections between the two. As John Adams said, "I must study politics and war that my sons may have liberty to study mathematics and philosophy." Today one hopes that Adams would include his daughters in this statement. At any rate, the advocates of peace must expose themselves to the arguments of warriors and their kin. To do otherwise is to go as a sheep among wolves, with the likely result being mutton. Readers will also want to remain alert to the pertinent work that appears from time to time in such titles as Foreign Affairs, Foreign Policy, World Politics, Orbis, and many others that emphasize international relations.

Figure 1: Greenpeace Magazine, January/February 1991

Nuclear Times. Edited by John Tirman. 401 Commonwealth Ave., Boston, MA 02215. Quarterly. $18. 1982-. OCLC 8771147. ISSN 0734-5836. Circ. 60,000. Indexed: Alternative Press Index; Human Rights Internet Reporter.

Nuclear Times has evolved to serve as a wide-angle guide to the antiwar and antinuclear movements. It retains a primary focus on nuclear weapons and nuclear war issues, but also features commentary and assessments concerning political and military hotspots around the world (e.g., the Soviet crackdown in the Baltics, militarism in Japan, the Persian Gulf) that harbor the potential for far wider conflict. On the nuclear front, the magazine has recently featured articles on proliferation, nuclear deterrence in the context of the declining Cold War, and nuclear test protests in the Soviet Union. Contributors are journalists, scholars, and activists. Contains a good list of organizational resources keyed to each issue's articles. Belongs in all libraries.

Positive Alternatives (see figure 2). Edited by Jim Wake. Center for Economic Conversion, 222 View St., Suite C, Mountain View, CA 94041. Quarterly. $35. 1990-. Circ. 7,500.
This newsletter is intended as a source of informa- tion on non-violent civilian-based defense (CBD) as an alternative policy for national defense, and as a vehicle for the exchange of international news, opinion, and research on CBD. It features some interesting articles by international contributors on CBD, a form of "defense" which has been discussed for many decades, and which was reintroduced in the 1950s by such figures as Sir Stephen King-Hall (Defence in the NuclearAge, Fellowship of Reconciliation, 1961), who proposed CBD as the best way to oppose Soviet expansion. The most recent issue focused on the relationship between CBD and the "Velvet Revolutions" that took place in Eastern Europe (some of which, of course, were more velvety than others). Resource notes and occasional substantial book reviews heighten the title's utility. Something of a hybrid between a grass- roots and scholarly effort; the spirit is of the former, but the academic qualifications of many contributors lend it an air of the latter. Fellowship. Edited by Virginia Baron. Fellowship of Reconciliation, 523 N. Broadway, Box 271, Nyack, NY 10960.8/year. $15. 1934-. OCLC 1569084. ISSN 0014-9810. Circ. 8,000. Human Rights lnternet Reporter; PALS; Peace Research Abstracts. Fellowship contains peace movement news from around the world, news of Fellowship activities, personal accounts o f peace activists (such as Joseph J. Fahey's "From Bluejacket to Pacifist" in the March 1991 issue), analysis of military events, and discussion of more subtle forms of violence, such as homelessness and war toys. Includes approximately a half-dozen book reviews in each issue, ranging from one or two para- graphs to several hundred words. Global Report: Progress Toward a World of Peace With Justice. Edited by Richard Hudson. Center for War/Peace Studies, 218 E. 18th St., N.Y., NY 10003. Quarterly. $35 (membership). 1977-. ISSN0730-9112. Circ. 2,500. The 2,500-member Center for War/Peace Studies advocates a much-enhanced role for the United Nations in achieving and maintaining international peace. The centerpiece of the organization's current efforts is a campaign to make the U.N. the top level of a global federal system, with considerably-strengthened power to make and enforce decisions. Organizational member- ship brings this newsletter, along with other materials, such as Benjamin Ferencz's and Ken Keyes, Jr.'s Planethood: The Key to Your Future. Global Report provides in its four pages news and features relevant to the center's objectives. Recent issues have included an interview with Andrei D. Sakharov and critical discussion of the Bush administration's use of the U.N. during the Persian Gulf War. 12 SEmAt, S REVIEW -- GRANT BURNS- INFACT: Nuclear Weaponmakers Campaign Update. INFACT National Field Campaign, PO Box 3223, S. Pasadena, CA 91031. Quarterly. $15. 1986-. For five years INFACT has been leading a consumer boycott of General Electric, one of the nation's leading nuclear weapons contractors. This brief newsletter reports on progress in the campaign and on GE activities on the nuclear front, including current work and historical events, such as the company's involvement in the notorious 1949 release of radioactive iodine into the atmosphere from the Hanford nuclear facility. INFACT has published numerous materials concerning GE and the boycott. The $15 charge is more a campaign donation than_ a subscription fee. A signifi- cant grassroots contribution to the nuclear debate. 
The Nonviolent Activist: The Magazine of the War Resisters League. Edited by Ruth Benn. War Resisters League, 339 Lafayette St., New York, NY 10012. 8/year. $15/$25. 1984-. ISSN 8755-7428. Circ. 15,000. Indexed: Alternative Press Index.

A 24-page magazine published by the nation's oldest secular pacifist organization, The Nonviolent Activist contains political analysis from a pacifist perspective, feature articles, and information relating to nonviolence, feminism, disarmament, international issues, resistance to registration and the draft, war tax resistance, and other topics. Although it occasionally runs articles on nuclear subjects (such as "Hiroshima and Nagasaki Remembered" in June 1990), the magazine's scope attempts to cover the whole of the peace and anti-militarist movement insofar as possible within its available space. Includes one or two book reviews per issue, up to 600 words long. Its value is enhanced by annual indexing.

The Nuclear Resister. Edited by Jack and Felice Cohen-Joppa. PO Box 43383, Tucson, AZ 85733. 8/year. $18/10 issues. 1980-. ISSN 0883-9875. Circ. 1,000.

This 16-page tabloid "works to foster a wider public awareness of imprisoned nuclear resisters, their motivations and their action." It facilitates a support network for such activists in the U.S., Canada, and Great Britain. The paper reports on arrests and jailings of civil disobedients, and provides analysis and commentary on underlying issues, as in the article on "The Militarization of the Academic Community" in the 21 September 1990 issue. Features statements of resisters themselves and a listing of forthcoming nonviolent direct actions at nuclear sites. An excellent source of information and opinion on this most committed segment of the peace movement, particularly in view of the mass media's almost total disregard in this area.

Nukewatch Pathfinder. The Progressive Foundation, PO Box 5658, Madison, WI 53701. Quarterly. $15.

Nukewatch (the informal name of the Progressive Foundation) came into being in 1979 following a federal court's decree restraining The Progressive from publishing information about the U.S. nuclear weapons program. The foundation, founded by the magazine, has developed into an independent action group working for peace and justice. Its 4-page tabloid newsletter reports on organizational activities, including the Nukewatch "H-Bomb Truck Watch," which monitors Department of Energy convoys that transport nuclear warheads and their components throughout the U.S., and the Missile Silo Campaign, designed to map the 1,000 ICBM missiles and 100 launch control centers in the Midwest and Great Plains. A good source of information on the secular arm of the grassroots peace movement.

The Objector: A Journal of Draft and Military Information. Edited by Jeff Schutts. PO Box 42249, San Francisco, CA 94142. 6/year. $15/$20. 1980-. OCLC 7534019. ISSN 0279-103X. Circ. 3,000.

The Objector covers in 12-16 pages Selective Service laws and activities, military regulations and life in the military, issues of conscientious objection, anti-militarism, draft registration and resistance, and other information of concern to those facing "compulsory" military service. Published by the CCCO, an agency founded in 1948 as the Central Committee for Conscientious Objectors. Includes news of life in the Soviet and other foreign military establishments. Strongly recommended as an information tool in any environment where young men and women ponder their futures and search their consciences.

On Beyond War.
Edited by Mac Lawrence and Marilyn Rea. Beyond War, 222 High St., Palo Alto, CA 94301. 10/year. $25. 1984?-. ISSN 0887-9567.

Beyond War is an eight-year-old educational foundation dedicated to building a cooperative, sustainable world. It is active in a number of areas, including citizen diplomacy efforts with people in the Soviet Union and proposing initiatives for global security and cooperation. The 8- to 12-page newsletter contains discussions of current conflicts, ideas for positive change, interviews with peace activists and scholars, commentary on socio-psychological aspects of international relations, and occasional book reviews. Beyond War's dominant message is that all humanity shares the same vital need to preserve the planet; it believes that recognizing the common interest everyone has as a citizen of planet earth in its preservation is a logical and necessary step toward achieving preservation of the planet and humanity. "The Earth and all life are interdependent and interconnected," says one Beyond War document. "The well-being of each individual is inextricably linked to the well-being of the whole. All is one." The spirit here is grassroots, the execution professional.

Peace and Freedom. Edited by Roberta Spivek. Women's International League for Peace & Freedom, 1213 Race St., Philadelphia, PA 19107. 6/year. $10. 1941-. OCLC 2265762. ISSN 0015-9093. Circ. 11,000. Indexed: Alternative Press Index.

Billing itself as "the only U.S. magazine devoted solely to the women's peace movement," each issue of P & F covers a full spectrum of international peace-related issues, from advocacy of a comprehensive nuclear test ban to children's books, racism, sexism, disarmament, peace education, and WILPF activities. Useful to any peace activist or researcher, Peace and Freedom should be a basic item on any woman peace worker's periodical shelf.

Peace Brigades International Project Newsletters. Peace Brigades International, Box 1233, Harvard Sq. Sta., Cambridge, MA 02238. Monthly. $25. 1989-.

Peace Brigades International (PBI) sends unarmed international peace teams, on invitation, into areas of repression or conflict, acting on the belief that "citizens can act boldly as peacemakers when their governments cannot." The newsletter provides information about the activities of the teams and the organizations with which they work, as well as background information on the situations in the countries where PBI has projects. Formerly published separately by country, the newsletter began including information about all projects (Central America, Southeast Asia, North America) in summer 1991. An effective tool for staying informed about troubled local situations in countries and regions with the potential to serve as catalysts for broader violence and military confrontation.

Peace Conversion Times. Edited by Will Loob. Alliance for Survival, 200 N. Main St., Suite M-2, Santa Ana, CA 92701. 9/year. $25. Circ. 8,000. 1983-.

The Alliance for Survival is a grassroots group whose major goals include the abolition of nuclear arms and power, reversal of the arms race, and an end to military interventions. It is primarily active in the city of Los Angeles and in Orange County, California. Peace Conversion Times is an 8-page tabloid featuring organizational news and articles on narrower aspects of the broad goals noted above. Included here as a good example of a local peace periodical produced on a slender budget.
Peace Magazine. Edited by Metta Spencer. Canadian Disarmament Information Service, 736 Bathurst St., Toronto, Ont. M5S 2R4 Canada. 6/year. $20. 1985-. ISSN 0826-9521. Circ. 8,000.

"To inform, enlighten, and inspire. To save Earth from the scourge of war." With this motto, Peace Magazine addresses a wide variety of issues and readers. The 32-page magazine endorses multilateral disarmament, but otherwise takes no editorial position and presents a variety of views. It contains a listing of upcoming peace events in Canada, notes on the Canadian peace movement, reviews of books, films, and videos, letters from abroad, and other regular features. Recent issues have offered articles on the Persian Gulf War, the General Electric boycott, Greenham Common, nuclear accidents, and many other relevant issues. A well put-together magazine that will be useful to peace activists and scholars in the U.S., and essential to those in Canada.

Peace Reporter. Edited by Kathleen J. Lansing. National Peace Institute Foundation, 110 Maryland Ave. NE, Suite 409, Washington, DC 20002. Quarterly. $35 (membership). 1986-. ISSN 1049-0779.

Peace Reporter is a six-page newsletter providing information on the growth and development of the United States Institute of Peace, activities and programs of the foundation, and other articles on peacebuilding, peacemaking, and conflict resolution. A recent issue contained articles on conflict management seminars in Armenia, establishment by the Institute of Peace of a Middle East program, networking notes, and other information. The foundation is an independent organization, not affiliated with the U.S. Institute of Peace, although its activities on behalf of the Institute helped enable its creation. Membership opens opportunities to meet in Regional Council workshops and seminars on peacemaking and conflict resolution.

RECON. Edited by Chris Robinson. RECON Publications, PO Box 14602, Philadelphia, PA 19134. 9/year. $15. 1973-. ISSN 0093-5336. Circ. 2,000.

This newsletter of approximately 14 pages "covers Pentagon activities around the world. RECON exposes little-known events and explains the reasons behind the mass-media headlines." Produced by volunteers, RECON reflects its editor's belief that what one reader calls "a goofy bunch of idealists" can help effect positive social change, in spite of the vast financial and political power of the military-industrial complex. "We have faith that the change will come," says Robinson. RECON often publishes articles on nuclear resistance, nuclear weapons and warfare issues, and SDI. Includes eight to ten paragraph-long book and document reviews in each issue.

The Reporter for Conscience' Sake. Edited by David W. Treber. National Interreligious Service Board for Conscientious Objectors, Suite 750, 1601 Connecticut Ave., NW, Washington, DC 20009. Monthly. $20. 1940-. OCLC 2244974. ISSN 0034-4796.

This publication is an update on legislation and developments affecting conscientious objectors to participation in war. Each 8-page issue is likely to offer discussion of individual CO cases, commentary on military action, analysis of pro-military propaganda in the media, a number of brief book reviews and other leads to pertinent literature, coverage of congressional action, and more.
A valuable source to help anyone understand contemporary conscientious objection to participation in war, but especially useful to those in a position to counsel young people concerned about the draft and what constitutes their "duty" to their country.

Space and Security News. Edited by Robert M. Bowman. Institute for Space and Security Studies, 5115 Hwy. A1A S., Melbourne Beach, FL 32951. Quarterly. $25. 1984-.

Editor Bowman, the author of Star Wars: A Defense Insider's Case Against the Strategic Defense Initiative (J.P. Tarcher, 1986), is a retired Air Force Lt. Colonel. He conducts an energetic campaign against the militarization of space and the continued funding of defense programs he considers wasteful and a threat to U.S. and global security. Each issue of his 8- to 16-page S & S News contains Bowman's analysis of global events and military programs, chiefly SDI. Like Thomas Liggett's World Peace News, Bowman's periodical reflects the thinking of a former military man who has seen a new light. He describes the publication as providing "an independent voice for the American people on space and other high-tech issues affecting national security.... We specialize in those areas where we feel the government has lied to the American people and their elected representatives to Congress. We 'Speak Truth to Power' on issues like 'Star Wars,' the KAL-007 shootdown, the Challenger explosion, nuclear testing, and the war against Iraq. We have vigorously opposed weapons in space since 1980." The format is homey (2-column, typed), the message urgent and clearly presented.

Surviving Together: A Journal on Soviet-American Relations. Edited by Harriet Crosby, et al. Institute for Soviet-American Relations, 1601 Connecticut Ave., NW, Suite 301, Washington, DC 20009. 3/year. $25/$30. 1983-. ISSN 0895-6286. Circ. 6,000.

This journal's parent institute is a nonpartisan service organization working to improve Soviet-American relations through better communication, facilitating working relationships between individual Soviet and U.S. citizens, cultural exchanges, and other means. Surviving Together presents news and editorial opinion on U.S.-Soviet relations and chronicles exchanges between the two countries, especially private-sector contacts. Each 90-page issue's coverage is divided among approximately 20 subjects, such as health, education, world security, environment, city affiliations, and citizen diplomacy. It includes articles reprinted from other sources and those based on information retrieved from interested organizations. Both U.S. and Soviet sources are cited. An effective tool for keeping informed on healthy developments in U.S.-Soviet relations. Features a good number of resource and new book notes. Readable and exciting.

The Test Banner. American Peace Test, PO Box 26725, Las Vegas, NV 89126. Monthly. $10. 198?-.

American Peace Test is a grassroots group dedicated to nonviolent action to end the arms race. It advocates a comprehensive nuclear test ban as a first step toward disarmament, and engages in education and outreach to communities affected by nuclear weapons testing and the arms race. The organization's Testing Alert Network monitors U.S. and British tests at the Nevada Test Site and shares information on foreign tests with a global network of activists. The Test Banner reports both U.S. and international opposition to nuclear testing, including protests by Soviet citizens.
The tabloid helps the reader keep up with a variety of testing issues, including environmental and legal matters. Readers seriously interested in participating in the movement for a comprehensive test ban will welcome access to this title.

WAND Bulletin. Women's Action for Nuclear Disarmament. PO Box B, Arlington, MA 02174. Quarterly. $30 (membership). Circ. 20,000. 1982-.

WAND was founded in 1980 by Dr. Helen Caldicott as a women's initiative to eliminate weapons of mass destruction and redirect military resources to human and environmental needs. WAND engages in congressional lobbying, grassroots organization, support of women congressional candidates, and other measures serving its objectives. The WAND Bulletin, an 8-page newsletter, includes notes from affiliates around the U.S. as well as discussions of a variety of political and military issues. A desirable addition to feminist and peace collections.

Washington Peace Letter (see figure 3). Washington Peace Center, 2111 Florida Ave., NW, Washington, DC 20008. Monthly. $25. 1963-. ISSN 1050-2823. Circ. 5,000.

An affiliate of the national grassroots network, Mobilization for Survival, the Washington Peace Center focuses on peace education and action in the metropolitan Washington area. The 8-page tabloid aims, in its editors' words, "to support the work of local progressive grassroots activists, and provide information on issues of local, national, and international importance." The paper concentrates on such issues as militarism, racism, sexism, homelessness, protection of the environment, homophobia, and economic justice, as well as the conventional peace issues of outright military confrontation. Occasional book reviews. In its broad spectrum of concerns, the Washington Peace Letter is emblematic of the contemporary peace movement's realization that institutionalized violence extends far beyond traditional ideas of "war" to include more subtle but still devastating affronts to the rights of both humanity and nature.

Figure 3: Washington Peace Letter, December 1990

World Peace News: A World Government Report. Edited and published by Thomas Liggett. 300 E. 33d St., New York, NY 10016. 6/year. $20/3 years. 1970-. ISSN 0049-8130. Circ. 2,000.
Editor-publisher Liggett, a journalist and decorated World War II Marine Corps fighter pilot, dedicates WPN to "All the World-Government News That's Fit to Print and Almost Free of Cant, Hype and Twaddle." The tabloid's single, overriding interest is the objective of its subtitle, world government, and the sooner the better. The whole of WPN is given to short news notes and commentary, with occasional longer pieces, analyzing global affairs in light of that objective. It is relentlessly critical of efforts to preserve nationalism and the sovereignty of the nation-state; Liggett sees nuclear weaponry as the death knell--one way or another--of the present system of competing states. Each 8-page issue is full of information and opinion of interest to advocates of world government, contributed not only by Liggett but by other advocates of the rule of international law as well. WPN is currently campaigning for Czechoslovak President Vaclav Havel's designation as U.N. Secretary-General in the belief that Havel has a better understanding of internationalism than "the U.N.'s line of nationalist Secretaries-General." It is especially interesting for its quick takes on political attitudes expressed in the mass media.

RELIGIOUS PERSPECTIVES

The Advocate. Edited by Kathleen Hayes. Evangelicals for Social Action, 10 Lancaster Ave., Wynnewood, PA 19096. Monthly. $20 (membership). 1988-.

This nicely-designed 16-page newsletter includes attention to nuclear issues within its broad embrace of topics concerned with peace and justice. It describes its mission as seeking "to contribute to the development of social awareness and a consistently pro-life social ethic in the American Christian evangelical community, in order to, in the words of our slogan, 'promote shalom in public life.'" Each issue contains a feature article on an important public policy matter, federal legislative updates, news on developments abroad, and other organizational information.

Briefly. Edited by Nancy Lee Head. Presbyterian Peace Fellowship, Box 271, Nyack, NY 10960. Quarterly. $25 (membership). 1944-.

A newsletter designed to inform Presbyterian Church members of peacemaking ideas, activities, resources, and backgrounds, Briefly, in its 8 pages, covers issues of peace in general, including attention to nuclear matters such as the General Electric boycott led by INFACT and nuclear weapons facility investigations. It also features notes on resources and kindred organizations, plus occasional book reviews.

Christian Social Action (see figure 4). Edited by Lee Ranck and Stephen Brockwell. General Board of Church and Society, United Methodist Church, 100 Maryland Ave., N.E., Washington, DC 20002. 11/year. $13.50. 1968-. ISSN 0164-5528. Circ. 4,500.

Figure 4: Christian Social Action, March 1991

Editor Ranck describes Christian Social Action as a magazine that "builds on the premise that faithful witness involves constant grappling with current issues in light of biblical and theological reflection. CSA is intended to stimulate thought, discussion and further study on a number of complex, sometimes controversial issues." These issues have recently included the Persian Gulf War, the situation in Panama, women's rights, the death penalty, and gay and lesbian concerns. Letters, a "U.N. Report," occasional book reviews, and other features round out the 40-page magazine. A good addition to libraries trying to offer readers access to a variety of religiously-informed views on the many aspects of peace and violence in today's world.

Desert Voices. Nevada Desert Experience, PO Box 4487, Las Vegas, NV 89127. Quarterly. Free (donations welcome). 1988-.

The Nevada Desert Experience describes itself as "a faith-based organization with Franciscan origins working to end nuclear weapons testing through a campaign of prayer, dialog, and nonviolent direct action." Organized in 1984, the Experience conducts prayer vigils at the Nevada Test Site and sponsors annual commemorations of Hiroshima and Nagasaki in August. "NDE is a voice in the desert calling people of faith to nonviolence in the face of violence, truth in the face of illusion, hope in the face of despair, love in the face of fear." The 6-page newsletter features articles on the comprehensive test ban issue, organizational news, notes on activities of kindred groups, and occasional book reviews.

Episcopal Peace Fellowship Newsletter. Edited by Dana S. Grubb. PO Box 28156, Washington, DC 20038. Quarterly. $25 (membership). ca. 1965-.
This newsletter is primarily for the encouragement and information of EPF members and friends, and to keep bishops, church press, and others informed of organizational activities and objectives. The Episcopal Church has been an active peace and anti-nuclear weapons advocate for some time; Episcopalians seeking connections with other Church members will find this newsletter helpful.

Ground Zero (see figure 5). Ground Zero Center for Nonviolent Action, 16159 Clear Creek Rd. NW, Poulsbo, WA 98370. Quarterly. Donation.

The root of Ground Zero's orientation is secured in the tradition of Christian nonviolence, but, as the "Dear Gandhi" letters column suggests, the point of view is anything but narrowly sectarian, and not without a sense of humor. The 12-page tabloid dwells on peace issues at large, from testimonies to the power of prayer to sustain the peace activist to analysis of current U.S. military projects and protests around the nation. It includes the regular feature "Voices from Prison," in which peace activists jailed for their actions reflect on their situations and the meanings implicit in them. As in most grassroots publications, there is a strong sense of community evoked by Ground Zero, in this case a spiritual community. Recommended as a good example of its kind.

Figure 5: Ground Zero, Vol. 9, No. 2, Fall 1990

The Other Side. Edited by Mark Olson, Doug Davidson, and Dee Dee Risher. 300 W. Apsley St., Philadelphia, PA 19144. Bi-monthly. $29.50. 1965-. ISSN 0145-7675. Circ. 14,500.

This is an independent, ecumenical Christian magazine tending to the broad issues of peace and social justice. It addresses war, racism, nationalism, and the oppression of the disenfranchised. The magazine has published such writers as Daniel Berrigan, Mary Lou Kownacki, bell hooks, Margaret Drabble, William O'Brien, and many others; it maintains a very selective approach to its submissions. It includes poetry and fiction in addition to non-fiction pieces. "We abhor political rationalizing and the social posturing of the right and left," say the editors. "We welcome critical thinking about ourselves and those 'movements' of which we sometimes are a part." Good illustrations; a nice title for public libraries.

Pastoral Care Network for Social Responsibility Newsletter. Edited by G. Michael Cordner, Th.D. PO Box 9243, Ft. Myers, FL 33902. Quarterly. $25 (membership). 1984-.

This organizational communication tool serves persons with training and interest in pastoral psychology and issues related to peace with justice and the "integrity of creation." The 16-page newsletter informs members of the network and other interested persons about important related events, issues, resources, and concerns. The strong antiwar theme is accompanied by discussion of such social justice issues as adequate housing. It includes numerous notes from foreign readers and resource notes.

Pax Christi USA. Edited by Mary Lou Kownacki, OSB. National Catholic Peace Movement, 348 East Tenth St., Erie, PA 16503. Quarterly. $20 (membership). 1985-. ISSN 0897-9545. Circ. 10,000.

The primary goal of Pax Christi, the international Catholic peace movement, is "to work with all people for peace for all humankind, always witnessing to the peace of Christ.
Its priorities are a Christian vision of disarmament, a just world order, primacy of conscience, education for peace and alternatives to violence." Pax Christi USA covers the Catholic peace movement in depth, with articles by and about activists, actions, and events, from analysis of the Persian Gulf War to campaigning for a Comprehensive Test Ban treaty. Each 38-page issue contains a variety of feature articles, columns, two or three book reviews, news of Pax Christi organizational matters, and "Network," a resources listing. Essential reading for Catholic peace activists and a desirable item for libraries that wish to make Catholic peace perspectives more readily available to their users.

Peace Office Newsletter. Mennonite Central Committee, International Peace Section, 21 S. 12th St., Box 500, Akron, PA 17501. 6/year. $10. The Mennonite Central Committee is "the cooperative relief and service agency of North American Mennonite and Brethren in Christ conferences. It carries on community development, peacemaking and material aid 'in the name of Christ,' in response to His command to teach all nations the way of discipleship, love and peace." The 12-page newsletter features biblical perspectives on war and peace, examination of psychological issues, peace activism among different groups ("Seniors for Peace" is a current project), and reflections on the meaning of peacemaking.

World Peacemaker Quarterly. Edited by Dr. William J. Price. World Peacemakers, Inc., 2025 Massachusetts Ave. NW, Washington, DC 20036. Quarterly. $5. Circ. 2,500. 1979-. This Christian, non-denominational newsletter emphasizes the importance of following the teachings of Christ in working for a peaceful world. The newsletter reflects editor Price's statement, drawn from his book Seasons of Faith and Conscience: Kairos, Confession, & Liturgy (Orbis, 1991), that "Every act of worship, every occasion where the sovereignty of the Word of God is celebrated, every instance where the realm of God is acknowledged, is always and everywhere expressly political." Church and state may be separate, but World Peacemakers is a group that approaches politics informed by religious conviction. The 20-page newsletter contains essays and notes concerning the spiritual motivations and rationales for turning away from war as a "solution" to international problems.

PROFESSIONAL PERIODICALS

The Arms Control Reporter: A Chronicle of Treaties, Negotiations, Proposals. Institute for Defense & Disarmament Studies, 2001 Beacon St., Brookline, MA 0216. Monthly. $325 libraries/$500 profit-making institutions. 1982-. OCLC 16159509. ISSN 0886-3490. Circ. 400. This looseleaf service, useful but prohibitively costly for all but major research facilities, provides up-to-date information on the status of arms control negotiations, the positions of governments, the record of events leading to the current situation, and an update on weapons involved in negotiations. Each supplement contains 100-160 pages. The binder arranges material by topic; the 1991 cumulation, for instance, covers close to 40 arms negotiation areas, including short-range nuclear forces, nuclear-weapon-free zones, the Non-Proliferation Treaty, and missile proliferation. Although full of valuable information, the title's cost will inevitably keep it out of the hands of many researchers.

Arms Control Today. Edited by Matthew Bunn. Arms Control Association, 11 Dupont Circle NW, Washington, DC 20036.
Monthly except two bimonthly issues, Jan./Feb. and July/Aug. $25/$30. 1972-. OCLC 2197658. ISSN 0196-125X. Circ. 4,000. Abstracts of Military Bibliography; Aerospace Defense Markets and Technology; PAIS; Predicasts Overview of Markets and Technologies. The Arms Control Association, a national membership organization, "seeks to create broad public appreciation of the need for positive steps toward the limitation of armaments and the implementation of other measures to reduce international tensions and promote world peace." Its journal is essential for any serious collection on peace, nuclear weapons, and strategic issues in general; ACT's typical 40-page issue contains interviews with influential figures and informed articles on such topics as nuclear proliferation, verification, movement toward a comprehensive test ban, and strategic defense. The regular departments, "News Briefs" and "Factfile," afford quick access to developments in or affecting arms control. One of the most valuable points for the researcher is "Arms Control in Print," a timely, two-page bibliography identifying books, pamphlets, government documents, and articles in various categories. One or two long book reviews per issue allow reviewers to address the topic at hand as well as the books under consideration. Contributors are prominent and varied in their viewpoints.

Barometer. Edited by Tariq Rauf. Canadian Centre for Arms Control and Disarmament, 151 Slater, Suite 710, Ottawa, Ontario, Canada K1P 5H3. Quarterly. $30/$45. 1990-. ISSN 0825-1894. Circ. 3,000. The Canadian Centre for Arms Control and Disarmament was established in 1983 to encourage informed debate and to provide independent, non-partisan research and information on arms control and disarmament. Barometer, although subsidized to some extent by the government, maintains an independent editorial position. An 8-page tabloid printed on quality paper, its emphasis is on Canadian involvement in global issues of arms control and disarmament. 1990 issues contained articles on nuclear testing in the Arctic, International Atomic Energy Agency (IAEA) safeguards, trends in the arms trade, and Canadian-Soviet cooperation initiatives, among other topics, plus occasional book reviews.

Bulletin of the Atomic Scientists. Edited by Len Ackland. Educational Foundation for Nuclear Science, 6042 S. Kimbark Ave., Chicago, IL 60637. 10/year. $30. 1945-. OCLC 1242732. ISSN 0096-3402. Circ. 20,000. A.B.C. Pol. Sci.; Academic Index; American Bibliography of Slavic & East European Studies; America: History & Life; Bibliography & Index of Geology; Biography Index; Biol. Dig.; Biological Abstracts; Book Review Index; Book Review Digest; Chemical Abstracts; Current Advances in Ecological and Environmental Sciences; Current Contents; Current Index to Journals in Education; Energy Review; Environmental Periodicals Bibliography; Excerpta Medica; Future Survey; General Science Index; Historical Abstracts; Index to Scientific Reviews; INIS Atomindex; Magazine Index; Media Review Digest; Metals Abstracts; Middle East: Abstracts & Index; Pollution Abstracts; Readers' Guide to Periodical Literature; Social Science Citation Index; Sociological Abstracts; Risk Abstracts; South Pacific Periodicals Index; World Aluminum Abstracts. The BAS debuted in December 1945.
Home of the famous "Doomsday Clock" logo indicating its editors' estimation of humanity's proximity to nuclear annihilation, the magazine is rather more optimistic about the future than it was a few years ago, or even at its inception, when it warned of atomic catastrophe being "inevitable if we do not succeed in banishing war from the world." In its 45th anniversary issue, editor Ackland wrote, "The race to nuclear destruction between the world's two military behemoths has been reversed and the opportunity exists to dismantle the dangerous Cold War arsenals and superstructures." If that reversal has taken place, BAS can claim as much credit as any periodical. Throughout its history it has been at the forefront of "responsible" (i.e., professional, expert) forums for addressing the many and intricate aspects of the nuclear threat. Proliferation, testing, the arms race, nuclear weapon facility problems, and many other nuclear issues come into its scope. With articles by recognized authorities, a lively format with good illustrations and good book reviews, BAS is a must for all libraries.

CEASE News. Edited by Peggy Schirmer. Concerned Educators Allied for a Safe Environment, 17 Gerry St., Cambridge, MA 02138. 3/year. $5. Circ. 700. 1982-. CEASE is a national network of parents, teachers, and other young children's advocates concerned about the dangers of violence, pollution, nuclear power, nuclear war, and a global military budget that drains resources from programs designed to help children and their families. CEASE News is a modest but neatly produced little newsletter reporting organizational activities and featuring brief articles on various facets of the peace movement. Recent issues have offered articles on the children of Hiroshima, war toys in the classroom, and the Middle East crisis, with some book and audiovisual reviews of materials intended either for children or for their adult teachers and guides.

Council for a Livable World Newsletter. Council for a Livable World, 100 Maryland Ave. NE, Washington, DC 20002. Irreg.; free. 198?-. Although in the words of Council office manager Chris Peterson, "This newsletter is published with no regularity whatsoever," it remains of interest when it does appear. The Council works on behalf of establishing a majority in the U.S. Senate supporting nuclear disarmament and "a big cut in the military budget." The 4-page newsletter contains updates on the current state of that budget, the status of weapons programs, arms control agreements, and other topics. The Council also publishes irregular "Fact Sheets," also free, on specific weapons and military issues, and operates a "Nuclear Arms Control Hotline" (202-543-0006), a 3-minute taped message.

CPSR Newsletter. Edited by Gary Chapman. Computer Professionals for Social Responsibility, PO Box 717, Palo Alto, CA 94302. Quarterly. $50. 1983-. This desktop-published 30-page newsletter turns its attention generally to the socially responsible uses of computers, and has recently covered such issues as telephone privacy and how computers contribute to the ecological crisis. It has also published many articles in its history on nuclear war and related topics, including nuclear education, strategy, computer unreliability and nuclear war, SDI, and other topics.
Articles contain references, but the style is accessible to the average educated reader; one need not be a computer scientist, or even use a computer, to make sense of it. Recently CPSR called for an end to the "Star Wars" program, and published a response to that call by the Strategic Defense Information Office. Given the importance of computers in contemporary weaponry and defense systems, this newsletter is worth the attention of anyone concerned about the relationship of high technology to war and peace.

ESR Journal: Educating for Social Responsibility. Edited by Sonja Latimore. Educators for Social Responsibility, 23 Garden St., Cambridge, MA 02138. Annual. $12. 1990-. ESR Journal devotes itself to new ideas on educating students for their involvement in the world. It includes both theoretical and practical essays by ESR leaders and other experts in education. "Skilled, courageous, and creative teachers are essential for our country to survive and thrive," states a journal representative. "ESR exists to enable such teachers to work together to develop and share ideas." The 120-page 1990 issue, in the format of a typical scholarly journal, featured articles on human rights education, conflict management for students, the role of education for social responsibility in American culture, and other topics. Many of the articles contain footnotes and bibliographies. One hopes this welcome addition to educational literature will be able to evolve to a more frequent publication status.

F.A.S. Public Interest Report. Edited by Jeremy J. Stone and Steven Aftergood. Federation of American Scientists, 307 Massachusetts Ave. NE, Washington, DC 20002. 6/year. $25/$50. 1970-. ISSN 0092-9824. Circ. 4,000. The Federation of American Scientists was founded in 1945 by Manhattan Project scientists to promote the peaceful and humane uses of science and technology. Its journal describes itself "as a means to disseminate the research and analysis produced by various projects of the F.A.S. Fund (educational and research arm of the Federation) which deal primarily in the areas of nuclear proliferation, chemical/biological weapons, international scientific exchange, disarmament verification and the environmental and political implications of the U.S. space policy." Occasional book reviews are included.

LAWS Quarterly. Edited by Laura McGough. Lawyers Alliance for World Security, 1120 19th St., NW, Washington, DC 20036. Quarterly. $20. 1982-. Recently revamped from a 4-page newsletter to a more substantial 20-page magazine, LAWS Quarterly is designed to assist its parent organization in providing a forum for the analysis and exchange of ideas concerning reduction of the threat of nuclear war, advancing non-proliferation, and enhancing movement towards the rule of law in the Soviet Union. In addition to organizational news, the most recent issue featured essays by a scholar from the Center for International Security and Arms Control of Stanford University and by a former director of the U.S. Arms Control and Disarmament Agency. Previously published as the newsletter of the Lawyers Alliance for Nuclear Arms Control. LAWS Quarterly is a desirable addition to law libraries.

Meiklejohn Civil Liberties Institute PeaceNet Bulletin. Edited by Ann F. Ginger. Meiklejohn Civil Liberties Institute, Box 673, Berkeley, CA 94701. Monthly. $12. 1990-. The Meiklejohn Civil Liberties Institute is active on a variety of fronts; a commitment to peace and social justice is one of them.
The PeaceNet Bulletin is a four- to six-page newsletter devoted to single-issue analysis of "crucial current events and the central issues of peace law" regarding such topics as the U.S. invasion of Panama, the Persian Gulf War, and nuclear deterrence. The organization's goal "is to fulfill our responsibilities in the nuclear age by helping inform U.S. public discussion and debate on these events and to support appropriate action by U.S. policymakers, organizations, and also specifically by lawyers and lawmakers." Contributors are legal authorities. Chiefly of interest to those in the legal profession who want to explore the opportunities for pursuing peace and justice afforded by their professional expertise.

Nucleus. Edited by Steven Krauss. Union of Concerned Scientists, 26 Church St., Cambridge, MA 02238. Quarterly. Donation. 1978-. ISSN 0888-5729. Circ. 130,000. Nucleus covers arms control, national security and energy policy issues, and nuclear power safety. The oversize 8-page tabloid contains news and analysis of all these issues, and benefits from good graphs, charts, and other illustrations. The Union of Concerned Scientists is dedicated to environmental health, renewable energy, and "a world without the threat of nuclear war." The organization also publishes books and brochures on these issues, along with its 4- to 6-page "Briefing Papers" on such topics as nuclear proliferation, antisatellite weapons, and other aspects of nuclear war and peace.

The PSR Quarterly: A Journal of Medicine and Global Survival. Edited by Jennifer Leaning, M.D. Williams & Wilkins, PO Box 23921, Baltimore, MD 21203. (Editorial offices: 10 Brookline Place West, Brookline, MA 02146.) Quarterly. $48/$85. 1991-. This most welcome new journal began in the thirtieth anniversary year of Physicians for Social Responsibility, a national organization of 25,000 health professionals and supporters working to prevent nuclear war and other environmental catastrophes. PSR is the U.S. affiliate of the International Physicians for the Prevention of Nuclear War. The journal provides the first peer-reviewed periodical coverage of the medical, scientific, public health, and bioethical problems related to the nuclear age. It features editorials, debate and rebuttal, news notes, letters, and book and journal reviews. The 65-page debut issue of March 1991 contained scholarly articles on the neutron bomb, health effects of radioactive fallout on Marshall Islanders, and other significant contributions to an informed understanding of medical issues in the context of a world bristling with weapons of mass destruction. Any library serving a clientele with an interest in medicine and allied health fields will want to give this title serious consideration.

PSR Reports. Edited by Burton Glass. Physicians for Social Responsibility, 1000 16th St., NW, Suite 810, Washington, DC 20036. 3/year. $80 physicians/$40 associates/$15 students (membership). ISSN 0894-6264. Circ. 50,000. 1985-. (Was PSR Newsletter, 1980-.) The official membership newsletter for Physicians for Social Responsibility, this 8-page tabloid informs readers of the organization's campaigns against nuclear weapons testing and production, federal budget priorities, and environmental protection and restoration. Some book reviews are included.

Psychologists for Social Responsibility Newsletter. Edited by Anne Anderson. 1841 Columbia Rd., NW, Suite 207, Washington, DC 20009. Quarterly. $35. 1982-.
This 12-page newsletter, in addition to covering activities of Psychologists for Social Responsibility, focuses on projects in which professional psychologists are involved concerning peace, war, conflict resolution, and related topics. The newsletter also features articles on such topics as the psychological case for a comprehensive test ban, profiles of antiwar psychologists, and commentary on current international crises. The organization defines its mission as using psychological principles and tools "to promote conversion from a war system to a world dedicated to peace and social justice." An annotated resource list is a regular feature; occasional book reviews are included. Psychologists who want to stay abreast of professional developments regarding war and peace will find this title useful; so would lay readers interested in psychology.

Research Report of the Council on Economic Priorities (see figure 6). Edited by Alice T. Marlin. Council on Economic Priorities, 30 Irving Pl., New York, NY 10003. Monthly. $25. 1969-. ISSN 0898-4328. The Council on Economic Priorities is an independent, public interest research organization. A focus on arms control, military spending, and national security has long been one of the Council's interests. Recent issues of the 6-page Research Report have dealt with the economic effects of the Cold War's decline, particularly the need for conversion from military to civilian industry in both the U.S.S.R. and the United States. Succinct but informative.
Figure 6: Research Report of the Council on Economic Priorities, October 1990 ("Beating Swords into Washing Machines")

SCHOLARLY JOURNALS

Bulletin of Peace Proposals. Edited by Magne Barth. Sage Publications, PO Box 5096, Newbury Park, CA 91359. Quarterly. $37/$83. 1970-. OCLC 1537766. ISSN 0007-5035. Abstracts of Military Bibliography; America: History and Life; Historical Abstracts; Human Rights Internet Reporter; INIS Atomindex; Middle East: Abstracts & Index; PAIS; Risk Abstracts. Recent issues of this scholarly journal have addressed such topics as religion and armed conflict, the alleged obsolescence of major war between developed countries, international environmental cooperation, current change in Europe, and the arms industry, technology, and democracy in Brazil. It includes the occasional article on nuclear and related issues, such as Sven Hellman's "The Risks of Accidental Nuclear War" in the March 1990 issue. Authors are an international lot, including those from the U.S., Western and Eastern Europe, Latin America, Africa, Canada, and elsewhere. The journal's motto is "To motivate research, to inspire future oriented thinking, to promote activities for peace." It concentrates on international policy in the light of general peace research theory. Perhaps a bit intimidating for undergraduates and the public at large.

Conflict Management and Peace Science. Edited by Walter Isard. Peace Science Society (International), Dept. of Political Science, SUNY Binghamton, Binghamton, NY 13901. Irreg. $20. 1974-. OCLC 8055590. ISSN 0738-8942. Circ. 1,000. A.B.C. Pol. Sci.; America: History & Life; Current Contents; Historical Abstracts; Middle East: Abstracts & Index; PAIS; Social Science Citation Index. It may not publish more than one issue in a year, but this journal nevertheless contributes some worthwhile points of view on peace issues. This scholarly title has featured articles on long-term effects of nuclear weapons, the high-technology arms race, and the relationship between trade and conflict. For advanced students and scholars; others will be frequently stymied by mathematical formulae in the articles. Contributors are almost exclusively U.S. scholars.

Current Research on Peace and Violence. Edited by Pertti Joenniemi. Tampere Peace Research Institute, Hameenkatu 13 b A, PO Box 447, SF-33101, Tampere, Finland. Quarterly. $40. 1971-. ISSN 0356-7893. Circ. 600. Abstracts of Military Bibliography; Current Contents; International Political Science Abstracts; Middle East: Abstracts & Index; Sociological Abstracts; Social Science Citation Index.
An interdisciplinary scholarly journal that publishes articles on a wide variety of topics in its 60 to 70 pages. Recent issues have featured articles on the U.N. and nuclear disarmament, Soviet military doctrine, "peace research as critical research," and other issues. A diversity of viewpoints and contributors, from Scandinavia, North America, Great Britain, and elsewhere, gives the journal appeal to peace activists, scholars, and students.

Disarmament: A Periodic Review by the United Nations. Edited by Lucy Webster. United Nations Dept. of Disarmament Affairs. Publications Sales Office, Rm. DC2-853, New York, NY 10017. Quarterly. $18. 1978-. American Bibliography of Slavic & East European Studies; PAIS. Disarmament is intended to serve as a source of information and a forum for ideas concerning the activities of the United Nations and the wider international community with regard to arms limitation and disarmament issues. The periodical is issued in English, French, Russian, and Spanish editions. As one might expect, the breadth of subjects covered is extensive and its contributors are international. Recent issues have offered articles on economic conversion in the U.S.S.R., coverage of the Non-Proliferation Treaty Review Conference that took place in the fall of 1990, tactical nuclear weapons, international arms transfers, and other significant topics. Contributors come to their tasks with well-informed backgrounds in the issues. The majority of the articles contain references to other literature. From 20 to 30 brief book reviews, a list of publications received, recent documents on disarmament, and a chronology of disarmament activities round out each issue. At the price, Disarmament is an economical and desirable addition to most libraries.

International Journal on World Peace. Edited by Panos D. Bardis. Professors World Peace Academy, GPO Box 1311, New York, NY 10116. Quarterly. $15/$30. 1984-. ISSN 0742-3640. Circ. 10,000. Current Contents; Psychological Abstracts; Social Science Citation Index; Social Work Research & Abstracts; Sociology of Education Abstracts; Geographical Abstracts; International Political Science Abstracts; Key to Economic Science; LLBA Linguistics and Language Behavior Abstracts; PAIS; Peace Research Abstracts; Sociological Abstracts. This is another title ranging widely over the world of peace issues. A typical number contains two or three major articles; recent issues have focused on national self-determination, the link between Locke and Kant and ecological theories, the historical paradox of religious sects' lip-service to peace while engaging in war, apartheid, and wars of development in Latin America. A brief "News" section takes an equally broad approach to current political developments, such as the independence movements in the Soviet Union. It includes notes on new books and journals. Book reviews are lengthy, if not plentiful (8 to 10 per issue). Some of the books chosen for review are curious entries in a journal devoted to peace (e.g., E.D. Hirsch's Cultural Literacy) but the reviews also turn up some interesting and generally overlooked titles. Clearly a reflection of its editor's worldview, even to the inclusion of his long "Miscellany" column, in which he may offer anything from his own reflections on global affairs to poems sent in by readers to his "Pandebars," brief poetic musings on whatever catches his fancy.

Journal of Conflict Resolution. Edited by Bruce M. Russett.
Sage Publications, 2455 Teller Rd., Newbury Park, CA 91320. $130. 1957-. OCLC 1623560. ISSN 0022-0027. A.B.C. Pol. Sci.; America: History & Life; American Bibliography of Slavic & East European Studies; Abstracts of Military Bibliography; Academic Index; Current Contents; EI (Excerpta Indonesica); Educational Administration Abstracts; Historical Abstracts; International Political Science Abstracts; Psychological Abstracts; Middle East: Abstracts & Index; PAIS; Predicasts Overview of Markets and Technologies; Peace Research Abstracts; Psycscan; Social Science Citation Index; Social Sciences Index; Social Work Research & Abstracts; Sociology of Education Abstracts. Although war and its avoidance is a consistent theme in JCR, the journal is greatly varied in its subjects, and its focus is both historical and contemporary. The March 1991 issue, for example, offered articles on economic causes of a breakdown in military balance, another on Chinese community mediation, and an essay on foreign policy crises, 1929-1985. JCR often includes articles on nuclear deterrence and other facets of strategic arms. Contributors are chiefly U.S. academics, with occasional appearances by foreign scholars. The typical JCR essay is heavily annotated, laden with mathematical formulae, and more-or-less impenetrable to the lay reader. Abstracts precede the articles. Desirable for most academic collections; most public libraries can live without it.

Journal of Peace Research. Edited by Nils P. Gleditsch and Stein Tonnesson. Sage Publications, Box 5096, Newbury Park, CA 91359. Quarterly. $37/$83. 1964-. OCLC 1607337. ISSN 0022-3433. Circ. 1,200. A.B.C. Pol. Sci.; America: History & Life; Current Contents; Future Survey; Historical Abstracts; International Labor Documentation; LLBA Linguistics and Language Behavior Abstracts; Middle East: Abstracts & Index; PAIS; Peace Research Abstracts; Risk Abstracts; Social Sciences Index. Published under the auspices of the International Peace Research Association, JPR "is committed to theoretical rigour, methodological sophistication, and policy orientation." The journal produces an occasional special theme issue; the February 1991 number is given over to international mediation and contains ten selections on the topic, including an introduction by former President Jimmy Carter. Other contributors to JPR are political scientists, sociologists, and psychologists from the U.S., U.K., Scandinavia, and elsewhere. Articles contain abstracts and end notes. Thematic issues feature an issuewide bibliography listing citations to all the items referred to in the issue in hand. JPR publishes numerous articles on nuclear issues; recent essays have dealt with ICBM trajectories, assumptions of British nuclear weapon decision makers, and factors predisposing individuals to support nuclear disarmament. The "Book Notes" section provides fairly substantial reviews of up to a dozen recent books. A good addition to most peace collections.

Peace and Change. Edited by Robert D. Schulzinger and Paul Wehr. Sage Publications, 2455 Teller Rd., Newbury Park, CA 91320. Quarterly. 1972-. ISSN 0149-0508. Circ. 1,000. Historical Abstracts; Abstracts of Military Bibliography; Human Rights Internet Reporter; International Political Science Abstracts; Middle East: Abstracts & Index; PAIS; Peace Research Abstracts; Sage Public Administration Abstracts; Sage Urban Studies Abstracts.
Peace and Change publishes scholarly articles on many peace issues, but focuses especially on work concerning the development of a just and humane society. The chronological scope is historical as well as contemporary; the January 1991 issue, for instance, features an assessment of the peace movement in the 1980s and a special section on Bertha von Suttner (1843-1914), author of the famous 1889 antiwar novel Die Waffen nieder! (Lay Down Your Arms). Contributors, both foreign and U.S., to each issue's 6 to 9 articles typically represent a variety of disciplines: anthropology, history, literature, political science, sociology, physics, and others. The journal's openness to work from different spheres gives it a healthy and stimulating eclecticism: few readers at all interested in peace topics will fail to find at least one or two articles per issue that strike sparks for them. Book reviews are few; it is an area the journal could bolster.

Peace and the Sciences. Edited by Peter Stania. International Institute for Peace, Mollwaldplatz 5, A-1040, Vienna, Austria. Quarterly. $240. 1969-. OCLC 6158329. ISSN 0031-3513. Circ. 800. This journal reports discussions at international meetings of both Western and Eastern scientists organized by its publisher. It also recently inaugurated a more thorough attention to the research activities of the IIP. Chiefly of interest to those looking for a journal with a strong emphasis on European perspectives on peace issues; contributors are mostly European, although some U.S. scholars find their way into the journal's pages. Recent issues have dealt in depth with the future of Europe, economic conversion following disarmament, and ecological security. Contains a mix of research and reflective pieces.

Survival. Edited by Hans Binnendijk. International Institute for Strategic Studies, 23 Tavistock St., London WC2E 7NQ, England. U.S. subscriptions to Brassey's, Maxwell House, Fairview Park, Elmsford, NY 10523. 6/year. $30. 1959-. OCLC 5010177. ISSN 0039-6338. Circ. 6,500. Abstracts of Military Bibliography; Historical Abstracts. A scholarly journal devoted to conflict and peacemaking, Survival covers the globe; articles range from Sri Lanka and Cambodia to Central America and South Africa. It contains occasional articles on explicitly nuclear issues, such as coverage of the 1990 Non-Proliferation Treaty Review and evaluation of SDI deployment options. Each issue's book reviews are relatively few but lengthy, and often focus on works concerned with nuclear topics.

INDEXES AND ABSTRACTS

Alternative Press Index. Alternative Press Center, Inc., PO Box 33109, Baltimore, MD 21218. Quarterly. $30/$125. 1969-. OCLC 1479213. ISSN 0002-662X. Circ. 550. Subject and author access to articles in close to 250 alternative and radical publications, many of which cover peace issues on a regular basis. Most of the periodicals indexed here are not well represented in other indexes; most of them are not well represented in libraries. The majority of the titles are U.S. publications, but the list includes many from Canada, Great Britain, Australia, and other nations.

Peace Research Abstracts Journal. Edited by Hanna Newcombe and Alan Newcombe. Peace Research Institute, Dundas, 252 Dundana Ave., Dundas, Ont. L9H 4E5, Canada. Monthly. $210. 1964-. OCLC 1605735. ISSN 0031-3599. Circ. 400. A very useful tool for peace professionals, this abstracting journal cites and annotates (frequently at considerable length) over 3,000 documents annually.
Coverage includes books, scholarly and semi-popular periodicals representing a large number of disciplines, institutional reports, newspapers, films, and other materials. Access is by author and subject indexes and by a code index that classifies entries by subject. Back issues are available from the publisher. Indispensable for researchers investigating Canada's role in affairs of peace and war because of its strong coverage of Canadian publications, the journal also treats a copious quantity of American and British materials. Some coverage of non-English language documents can also be found.

work_vlmvezww45go3d7y2n6wazz2ci ---- Libraries' Role in Curating and Exposing Big Data

Future Internet 2013, 5, 429-438; doi:10.3390/fi5030429. ISSN 1999-5903. www.mdpi.com/journal/futureinternet

Article

Libraries' Role in Curating and Exposing Big Data

Michael Teets 1 and Matthew Goldner 2,*

1 OCLC Innovation Lab; OCLC, 6565 Kilgour Place, Dublin, OH 43017, USA; E-Mail: teetsm@oclc.org
2 OCLC Library Services Division; 6565 Kilgour Place, Dublin, OH 43017, USA
* Author to whom correspondence should be addressed; E-Mail: goldnerm@oclc.org; Tel. +1-614-764-6405.

Received: 21 March 2013; in revised form: 16 May 2013 / Accepted: 12 July 2013 / Published: 20 August 2013

Abstract: This article examines how one data hub is working to become a relevant and useful source in the Web of big data and cloud computing. The focus is on OCLC's WorldCat database of global library holdings and includes work by other library organizations to expose their data using big data concepts and standards. Explanation is given of how OCLC has begun work on the knowledge graph for this data and its active involvement with Schema.org in working to make this data useful throughout the Web.

Keywords: WorldCat; knowledge graph; library data; bibliographic data; authority data; Schema.org; OCLC

1. Introduction

Libraries have amassed an enormous amount of machine-readable data about library collections, both physical and electronic, over the last 50 years. However, this data is currently in proprietary formats understood only by the library community and is not easily reusable with other data stores or across the Web. This has necessitated that organizations like OCLC and major libraries around the world begin the work of exposing this data in ways that make it machine-accessible beyond the library world using commonly accepted standards. This article examines how the OCLC data hub is working to become a relevant and useful source in the web of big data and cloud computing. For centuries libraries have carefully cataloged and described the resources they hold and curate. OCLC was formed in 1967 to bring this data together in electronic form in a single database. Today WorldCat, formed by libraries around the world, has over
“The idea that a library should provide the opportunity for study of the texts and a means to discover the original words of the authors, even those who had lived long before, first became the manifest of the library of Alexandria.… In the middle of the third century BC, Callimachus the poet, was employed here. Organizing the material and carrying on scholarly work at the same time.… Callimachus’ so-called Pinakes, in which the works of the authors were organized in alphabetical order, was the first ever library catalogue.” [1] This work advanced over the next thousand years with small improvements on how these lists were maintained, but they were intended only as inventory devices rather than finding aids. It wasn’t until the seventeenth century that printed book catalogs were introduced at Oxford and began to have alphabetical arrangements of authors to serve as finding aids [2]. The end of the nineteenth century saw the introduction of printed cards and the card catalog [3]. This was the dominant library finding aid for the next century until the first commercially available Online Public Access Catalog (OPAC) was introduced in 1980 by CLSI in the United States [4]. However, as David Weinberger points out in his work, Everything Is Miscellaneous, these catalogs were organized for an analog world [5]. Somebody had to decide how to organize the information, whether it be alphabetical by author or title, or using a subject thesaurus or classification scheme, which predetermined how materials would be organized and then located. For the most part these were closed schemas disallowing reuse of parts of the metadata in other systems. There were exceptions, such as authority files of authors’ names, which could then be linked to multiple cataloging records, but even then there was no common schema for linking to these files to open them up for broader use. So the question raised by OCLC Research and OCLC’s Innovation Lab is whether this metadata about library materials has a role in the current world of cloud computing and big data, and if so, how is that role defined and who fills it? Gartner defines big data as “high-volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision-making and process automation.” [6] This ties to Weinberger’s concept of the Web as leaves that can be made smart and piled together to meet any specific users need. He states “Smart leaves are not like catalog cards with more room and an extra forth IQ points. … An identifier such as an ISBN that enables distributed information to come together when needed turns a C-student leaf into a genius.” [7] We are often asked, “Is library data really big data?” If you consider just the metadata representing the collection of printed and electronic works held by libraries, it really cannot be considered big data in its current meaning. [8] Even when you consider the full-text of those works, the data management does not require “big data” techniques. However, when we look at these collections as a training set for all human knowledge, we can follow obvious data trails to generate massive collections of new relationship assertions. From a single work, we can extract relationships from co-authors, citations, geo-location, dates, named entities, subject classification, institution affiliations, publishers and historical circulation information. 
From Future Internet 2013, 5 431 these relationships, we can connect to other works, people, patents, events, etc. Creating, processing and making available this graph of new assertions at scale is big data. It requires the source data to be structured in such a way that many can consume and operate on the corpus in common ways. In 2008, Nature published an entire issue on big data. Authors Frankel and Reid [9] made the prescient statement: “To search successfully for new science in large datasets, we must find unexpected patterns and interpret evidence in ways that frame new questions and suggest further explorations. Old habits of representing data can fail to meet these challenges, preventing us from reaching beyond the familiar questions and answers.” This is precisely the issue being grappled with concerning library metadata. Though it is structured, it is not usable in a Web world of big data. This has led us to using linked data technologies as a primary enabler for interoperability at scale on the Web. In 2006, Tim Berners-Lee [10] recommended four rules—or “expectations of behavior”—that are useful in this discussion. Early innovation in exposing library data as linked data has shown there is a useful place in the larger web of big data. Some examples are shown in Table 1. Table 1. Examples of linked data. Organization Type of data British Library Bibliographic (item descriptions) Deutsche National Bibliothek Authority (names) and bibliographic Library of Congress Bibliographic OCLC Classification (Dewey Decimal), authority, bibliographic In 2011, the British Library (BL) [11] announced the release of the British National Bibliography as linked data. Its purpose went beyond just encoding bibliographic data as RDF; rather, “...they set out to model ‘things of interest,’ such as people, places and events, related to a book you might hold in your hand.” The Deutsche National Bibliothek (DNB) [12] first released its authority data as linked data in 2010. Authority data are authorized name headings that are used by libraries to ensure the works of a single author are associated with each other. DNB followed this in 2012 when it released its bibliographic data, which describes specific titles and works, joining the BL in releasing the data that describes a major national library collection as open linked data. In May 2012, the US Library of Congress (LC) [13] announced it also would pursue releasing its bibliographic data as linked data. This was part of a larger project “to help accelerate the launch of the Bibliographic Framework Initiative. A major focus of the project is to translate the MARC 21 format to a Linked Data (LD) model while retaining as much as possible the robust and beneficial aspects of the historical format.” MARC (MAchine Readable Cataloging) [14] had been developed since the late 1960s to provide a method to encode bibliographic and authority data so it could be moved between systems; however, it was used only in the library community and is not appropriate for use in today’s Semantic Web. OCLC’s own efforts to expose linked data have spurred several innovative uses of existing data and increased traffic from search engines. Over this same time period OCLC has released several datasets Future Internet 2013, 5 432 as linked data. Starting in late 2009, OCLC released Dewey classification data and has extended this data over the years to include deeper levels of the classification scheme [15]. 
This was followed in 2009 with the release of the Virtual International Authority File (VIAF) as linked data [16], and then the 2012 OCLC release of WorldCat as linked data using Schema.org as the primary vocabulary. As shown at Datahub [17], this trend is rapidly accelerating as more and more library data is released as open linked data. 2. Problem Faced The transition that OCLC and the library industry are going through is simultaneously a significant change and a return to the roots of the industry. Organizing the authoritative descriptions of library objects is the root of library cataloging and a well-understood concept by library staff. However, the industry participants have primarily interoperated by exchanging text strings encoded in industry-proprietary MARC record formats. Our systems have evolved collectively to consume, manage and reproduce these text strings. Authoritative text strings have been stored and managed separately and are somewhat loosely connected through applications to item descriptions. The transition to effectively operating with data at scale requires much more focus on accessible structures, persistent identifiers and a comprehensive modeling of the data under management. Linked data is the formalized method of publishing this structured data so that large data repositories can be interlinked using standard Web protocols. 3. Solution in Knowledge Graph and Linked Data In order to manage authority of text strings, we must first think about the organization of top-level entities relevant to our industry needs. This organization is often referred to as an “upper ontology” or, more recently, as entities in a knowledge graph. A knowledge graph is a model of relationships between entities or objects in a given space. Given libraries’ broad responsibility in organizing knowledge, the entities in a library knowledge graph must be quite broad and include the following key entities: • People: Traditionally people who have formally published works, but increasingly we must account for all people whether they are writing, reviewing, publishing or simply using library content; • Items: Physical items held within libraries such as books and media, but also electronic-only information such as e-books, journal articles and digital scans of real objects; • Places: Geographic locations past and present must be maintained to understand context of published works. Political names are important, but so are more abstract names such as Northern Gulf States; • Events: Events can account for grand events such as a public performance by a rock music group, and also minor events such as a user viewing a book text; • Organizations: Organization entities encompass corporate names, publishers, political parties and bodies, and associations; • Concepts: Subject classification systems such as the Dewey Decimal Classification system and Library of Congress Class system are well-known concept organization schemes. More free-form tagging systems also round out this category. Future Internet 2013, 5 433 The models produced are being defined in RDF schemas. These schemas or vocabularies represent the most tangible output of the work in the knowledge organization space. Two recent efforts are attracting the most attention. The first is the Schema.org effort led by Google [18] and the library extension to this vocabulary initiated by OCLC and evaluated within the W3C Community Groups [19]. Some parts of these extensions are being integrated back into the core schema. 
The models produced are being defined in RDF schemas. These schemas or vocabularies represent the most tangible output of the work in the knowledge organization space. Two recent efforts are attracting the most attention. The first is the Schema.org effort led by Google [18] and the library extension to this vocabulary initiated by OCLC and evaluated within the W3C Community Groups [19]. Some parts of these extensions are being integrated back into the core schema. The second effort is the Bibliographic Framework Transition Initiative launched by the Library of Congress [20]. The former focuses on simplicity and interoperability, including systems outside of the direct sphere of the library. The latter focuses on the transition from MARC21 to modern standards necessary to preserve the precision and detail required for library workflows that include cataloging, resource sharing, preservation and inventory management. The selection of a schema, framework or vocabulary can be daunting, technically complex and fraught with emotion and politics within a given community. We have adopted the strategy of selecting broadly used models and developing accessible systems to achieve the most interoperability. It is not our strategy to select and prove a single schema is the best for all.

We have selected the Schema.org vocabulary as the root of our data model as it provides an accessible core. Schema.org was launched as a collaboration among Google, Bing and Yahoo! to create a common set of schemas to encourage structured data markup within Web pages. While initially viewed as overly simplistic, the organization has been very collaborative with established communities of practice to extend and generally improve the quality of their schemas. OCLC was an early collaborator focusing on items typically held by libraries. We worked with staff within Google to understand the possibilities and determine the best places to contribute. Some of our initial extensions were to recognize the differences between content and the carrier in published works. For example, when requesting or offering a movie within libraries, or anywhere on the Web for that matter, it is important to distinguish whether it is a digital download, a DVD, a CD or VHS. This is data well structured and used in libraries but not immediately necessary for a movie review site. This ability to cross industries with similar data is met through common schemas that allow multiple industries to improve simultaneously [21].

Once primary entities are identified, three primary activities are required to make them usable. First, the entities must be modeled, but using existing schemas will reduce the work substantially. Modeling experts must work with expected contributors and consumers of the entities to understand their requirements. Those requirements must be transformed into conceptual and formal data models that govern how the data is stored and accessed. Secondly, the relationships between the entities must be defined in a similar manner. Finally, those entities must be given unique, persistent and accessible identifiers. And this is where we transition from just having lots of data to what is known as "big data." With accessible data and identifiers, the relationships can be auto-discovered by our systems. These new relationships can be turned immediately into action by these same systems.

The use of linked data in a larger knowledge graph, and its application in big data, is not just theory. It provides immediate and tangible benefits to the consumers of the data being managed. In our case, one of the most prominent examples is an author's name. While a person's name may seem obvious in text form, it can be quite complex in a global data landscape. A common problem to solve is to distinguish between authors of the same name. There could be hundreds of people named "Ji Li" who publish journal articles even within a specific technology discipline. The name can have multiple Romanized transcriptions of the original Chinese name. Birth and death dates, co-authors, countries of publication and publishers can be included with the name to provide clarity.
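A hypothetical sketch of that ambiguity, with invented identifiers and dates: two distinct people share the same transcribed name, and, as the next paragraph argues, it is ultimately the distinct identifiers rather than the descriptive attributes that keep them apart.

```turtle
@prefix schema: <http://schema.org/> .
@prefix ex:     <http://example.org/> .   # invented identifiers and dates

# Two distinct people whose names transcribe identically
ex:person884 a schema:Person ;
    schema:name      "Ji Li" ;
    schema:birthDate "1906" .

ex:person982 a schema:Person ;
    schema:name      "Ji Li" ;
    schema:birthDate "1948" .

# A string match on "Ji Li" conflates the two; the identifiers do not.
```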
The name can have multiple Romanized transcriptions of the original Chinese name. Birth and death dates, co-authors, countries of Future Internet 2013, 5 434 publication and publishers can be included with the name to provide clarity. In reality, however, it is not until you get to a globally unique identifier for each that the problem can be mostly solved. Storing the identifier instead of the name, as in the Virtual International Authority File [22], and resolving at use allows disambiguation that would be impossible on any single data store (see Figure 1). Beyond disambiguation, it allows immediate updating should the author’s information change. When Ji Li becomes Dr. Ji Li, or even completely changes his name, all systems using the authoritative identifier could be immediately current. These same benefits apply to geographic place names, titles, events, subjects, etc. Figure 1. Author representation in Virtual International Authority File (VIAF). Following from the author example above, a metadata record describing a creative work in a collection can now be expressed as links rather than text strings to be maintained. The following human-readable display of a “record” (see Figure 2) is actually not a record at all but instead is a series of entities with defined relationships. The entities have identifiers resolvable on the Web using standard http requests. We can see that Ji Li published a book in 1961 titled, Beijing di 1 ban. From the Library of Congress link through the VIAF identifier we can discover that Ji Li passed away in 1980 even though this specific reference was unaware of the death date. You can see how this method of data organization is critical as we move toward massive, globally accessible data stores. Our first task was to select and model our primary data. Rather than attempt to model all of the data under management at OCLC, we chose to be opportunistic by modeling and exposing the data we viewed as important in WorldCat and nearest to the model exposed by Schema.org. We had a good handle on concepts with Dewey [23], LC Subject Authorities [24] and FAST subject headings [25], and have services to crosswalk between these and other classification systems. The new work was to make them accessible in the Schema.org vocabulary and embed this as html markup. We followed the same path for the data elements the library community is comfortable modeling that falls under the schema:creative work hierarchy such as: author, name, publisher, inLanguage, etc. See an example of schema description below (Figure 3). We applied these models across our own WorldCat catalog and we also made the data accessible to include in others’ catalogs with the intent of improving library and Future Internet 2013, 5 435 nonlibrary descriptions. We exposed this data directly in our production systems in html markup in the human-accessible systems and also in downloadable datasets. Because this was new to the community we serve, we made the data visible in a human-readable “view source” display in the bottom of each page we produced (seen above in the book example, Figure 2). Figure 2. Human-readable display of a “record”. Figure 3. Schema description. From the first public step, we set out on two parallel but complementary paths. One path took an internal focus; the other took an external focus. Internally, we began this formal modeling for the most important data resources under our management. 
This required modeling and a program of education among the consumers of this data within our product and engineering teams. Externally, we sought Future Internet 2013, 5 436 partners that would augment our data, provide connectivity through modeled relationships, or would simply exploit this data within their systems. Momentum on both internal and external fronts has been increasing, and we find our data being used to create innovative relationships inside and outside our services. Library developers have begun to integrate our data into their services. Zepheira [26] is a Semantic Web company that has assisted both OCLC and the Library of Congress in implementing these concepts. In the midst of our efforts to move to more modern data structures for interoperating both inside and outside the library community through data, the Library of Congress kicked off an effort to replace the MARC cataloging record structure with a modern solution. In many ways, LC followed a similar path as described above but with a more specific focus on the detailed information required in specific library services. The rich detail required at an item level has led LC to a more industrial-strength model optimized for its needs. OCLC was invited to the table early in this discussion and has been both a contributor and consumer of LC’s efforts. Because we are both operating with accessible and machine-understandable data models, our collective efforts do not produce significant conflicts. When either group looks at an object from the other, it will see not only the string of text data but also the identifier of that object, the model under which it was created, and even the authority under which it is managed. Essentially, the data becomes interoperable at scale. Using big data to automatically discover relationships is opening the doors for rapid innovation. Our systems could recognize that a PhD student at a small European institution is focusing on the same subject matter as a professor at a major US academic research organization. The systems could recognize that research in a specific area is rapidly increasing while an adjacent category is rapidly decreasing. Collaboration forums, collections and repositories could be spawned or dismantled without distracting the researchers from their primary task. In theory, better solutions to research problems can be found more quickly. 4. Conclusions We stated earlier that OCLC and major libraries around the world need to expose the vast wealth of library collections data produced in the last 50 years beyond the library community. As pointed out by analyst Anne Lapkin in a Gartner report [27]: “Big data is not just about MapReduce and Hadoop. Although many organizations consider these distributed processing technologies to be the only relevant “big data technology,” there are alternatives. In addition, many organizations are using these technologies for more traditional use cases, such as preprocessing and the staging of information to be loaded into a data warehouse.” In other words, just making big data sets accessible is not a desired end point. It is about making the data reusable in combination with other data sets across the Web. Just moving to RDF alone could not accomplish this, leading OCLC to make the decision to work with an already accepted cross-industry vocabulary to improve access. OCLC and others involved are still early in this process. 
However, there is already strong evidence that once exposed, library data is useful to other communities and is accessed and repurposed. Because librarians have invested so much time and energy into authoritatively describing resources, their creators and the subjects covered, this data serves a valuable role in connecting the many facets of people, items, places, events, organizations and concepts into a meaningful knowledge graph. As Weinberger said in Everything Is Miscellaneous, this work allows any searcher on the Web to pull together information based on relationships that are meaningful to the searcher, not based on a predefined organization created for an analog world [28]. By exposing massively aggregated library data in ways that make this possible, information seekers around the world will find items of interest in ways not previously possible.

Acknowledgments

The authors wish to thank Jeff Young, Software Architect, OCLC, for his contributions to the article content, and Brad Gauder, Editor, OCLC, for editorial assistance.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Eliot, S.; Rose, J. A Companion to the History of the Book; John Wiley and Sons: Hoboken, NJ, USA, 2009; p. 90.
2. Kent, A.; Lancour, H.; Nasri, W.Z.; Daily, J.E. Encyclopedia of Library and Information Science; Marcel Dekker: New York, NY, USA, 1968; Volume 4, p. 255.
3. Kent, A.; Lancour, H.; Nasri, W.Z.; Daily, J.E. Encyclopedia of Library and Information Science; Marcel Dekker: New York, NY, USA, 1968; Volume 4, p. 277.
4. Kent, A.; Lancour, H.; Nasri, W.Z.; Daily, J.E. Encyclopedia of Library and Information Science; Marcel Dekker: New York, NY, USA, 1968; Volume 58, p. 154.
5. Weinberger, D. Everything is Miscellaneous: The Power of the New Digital Disorder; Times Books: New York, NY, USA, 2007.
6. LeHong, H.; Laney, D. Toolkit: Board-Ready Slides on Big Data Trends and Opportunities. Gartner, 1 March 2013, G00238695.
7. Weinberger, D. Everything is Miscellaneous: The Power of the New Digital Disorder; Times Books: New York, NY, USA, 2007; p. 120.
8. Hessman, T. Putting big data to work. Ind. Week 2013, 262, 14–18.
9. Frankel, F.; Reid, R. Big data: Distilling meaning from data. Nature 2008, 455, doi:10.1038/455030a.
10. W3C. Design Issues Website. Available online: http://www.w3.org/DesignIssues/LinkedData.html (accessed on 8 May 2013).
11. Talis Systems Website. Available online: http://talis-systems.com/2011/07/significant-bibliographic-linked-data-release-from-the-british-library/ (accessed on 3 March 2013).
12. Svensson, L.G.; Jahns, Y. PDF, CSS, RSS and other Acronyms: Redefining the Bibliographic Services of the German National Library. Available online: http://conference.ifla.org/past/ifla76/91-svensson-en.pdf (accessed on 3 July 2013).
13. Library Journal. Info Docket. Available online: http://www.infodocket.com/2012/05/23/linked-data-the-library-of-congress-announces-modeling-initiative-contracts-with-zepheira/ (accessed on 3 March 2013).
14. Library of Congress. Bibliographic Framework As a Web of Data: Linked Data Model and Supporting Services; Library of Congress: Washington, DC, USA, 2012. Available online: http://www.loc.gov/marc/transition/pdf/marcld-report-11-21-2012.pdf (accessed on 5 August 2013).
15. The OCLC Cooperative Blog. Available online: http://community.oclc.org/cooperative/2012/07/dewey-linked-data-interview-with-michael-panzer.html (accessed on 3 March 2013).
16. OCLC Website. Available online: https://www.oclc.org/en-US/news/releases/2012/201224.html (accessed on 3 March 2013).
17. Datahub. Available online: http://datahub.io/ (accessed on 12 August 2013).
18. Schema.org. Available online: http://schema.org/ (accessed on 12 August 2013).
19. Schema Bib Extend Community Group. Available online: http://www.w3.org/community/schemabibex/ (accessed on 12 August 2013).
20. Bibliographic Framework Initiative. Available online: http://www.loc.gov/bibframe/ (accessed on 12 August 2013).
21. Godby, C.J. The Relationship between BIBFRAME and the Schema.org "Bib Extensions" Model: A Working Paper; OCLC Research: Dublin, OH, USA, 2013. Available online: http://www.oclc.org/content/dam/research/publications/library/2013/2013-05.pdf (accessed on 5 August 2013).
22. Virtual International Authority File. Available online: http://viaf.org/viaf/69247245/ (accessed on 12 August 2013).
23. OCLC Website. Available online: http://www.oclc.org/dewey (accessed on 3 March 2013).
24. Library of Congress Subject Headings. Available online: http://id.loc.gov/authorities/subjects.html (accessed on 3 March 2013).
25. OCLC Research Website. Available online: http://www.oclc.org/research/activities/fast.html (accessed on 3 March 2013).
26. Zepheira Website. Available online: http://zepheira.com/ (accessed on 3 March 2013).
27. Lapkin, A. Hype Cycle for Big Data, 2012. Gartner, 31 July 2012, G00235042.
28. Weinberger, D. Everything is Miscellaneous: The Power of the New Digital Disorder; Times Books: New York, NY, USA, 2007; pp. 94–96.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

work_voc5scbh4nadtpivoofqxbpu74 ----

Automation and Technical Services Organization

By: Rosann Bazirjian

Bazirjian, Rosann. "Automation and Technical Services Organization." Library Acquisitions: Practice and Theory 17(1): 73-77 (1993).

CERTAIN FIGURES OR CHARTS ARE OMITTED FROM THIS FORMATTED DOCUMENT.

The pre-automation era dictated that functions be organized around physical files. These files retained order information, contained bibliographic holdings and maintained auditing and accounting records. These files also encouraged the traditional divisions of technical services: cataloging and acquisitions. As we automate, files begin to disappear, and as they disappear, so do the traditional organizational structures with which we are familiar. Departments begin to merge, the sharp lines marking divisions begin to blur, and work becomes interrelated. There are many factors about automation that prompt reorganization and that must be considered in the process, and I will highlight four of them.

The primary goal should be the streamlining of functions, and this might very well be the basis on which any reorganization is planned. It is important to make sure that duplication of effort is eliminated, i.e., that the same person does not have to handle the material twice and that the item moves through technical services in an organized way. Cost-effectiveness also comes into the picture, both in terms of human resource allocation and the actual cost of searches, record transfers, and bibliographic utilities. In addition, automation has provided for immediate access to all we do in technical services. The patron is able to track how a book is acquired and cataloged virtually every step of the way.
How our orders look and how the database is maintained suddenly take on a more global meaning. Finally, the integrated database binds all of our functions together. One record is used for order, receipt, and cataloging. Therefore, whatever we do in one section automatically affects the work of another. It is that integration which one needs to exploit. No longer can one department work in a vacuum, because of this interrelatedness of function.

The reorganization of technical services at Syracuse University coincided, for the most part, with a physical reconfiguration of the library building itself. This greatly facilitated our reorganization, as past history, in terms of physical surroundings, was soon shed. We moved from a basement to beautiful fifth floor headquarters, with windows and carpeting. Prior to July, 1991, the technical services division was traditionally organized. We had a cataloging and an acquisitions department, physically separated by an enormous shelflist. Each department had a head and reported to the Associate University Librarian for Technical Services. The two departments shared a bank of eight OCLC terminals. There were scattered shared terminals throughout both departments that accessed a home-grown system, which has since been replaced by NOTIS. With our physical reconfiguration, the shelflist was discreetly placed in a far corner of the room, so that any attempt at barriers was eliminated. The eight OCLC terminals grew to ten and were placed into three smaller clusters rather than one. In addition, every staff member now has a terminal at his/her desk to access NOTIS.

In July, 1991, we combined acquisitions and cataloging into one department, and called that department Bibliographic Services. The new department is divided into three units and one section: the Monograph Unit, the Serials Unit, the Receiving/Accounting Unit, and the Database Management Section. All three units and the one section report to the Head of Bibliographic Services (Figure 1; the figure is omitted in this formatted document).

The first step in our reorganization was the combination of the receiving and accounting sections into one unit in May, 1991, and this is the one unit where a true merger has taken place. This merge was precipitated by the implementation of NOTIS and, as I said earlier, a desire to streamline and not duplicate effort. Within the NOTIS system, receiving and paying is done at the same time. The individual calls up a record, receives it, creates an invoice, and indicates payment at one time. Prior to our combining of these sections, the book would have been received by one section, then moved down the hall to a different section, which would have had to retrieve the very same record for payment purposes. The duplication would have been wasteful and foolish. We have one further link, and that is between the library and university accounting. That is an electronic link: all invoice payments made by the library interface with the university's accounting office. Whatever we do, and vice versa, affects the other. Again, the interrelatedness of function is apparent.

It was two months after the creation of the Receiving/Accounting Unit that the former acquisitions and cataloging departments were merged into one. The Monograph Unit comprises monographic searchers and catalogers from the two former departments.
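The efficiency argument in the receive-and-pay paragraph above can be pictured as a single operation on a shared record. The schematic Python sketch below is purely illustrative: the class, field and function names are hypothetical and do not represent the NOTIS interface.

```python
# Schematic sketch of the workflow argument above: when receiving and
# paying operate on the same record in one pass, the record is
# retrieved once instead of twice. All names here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class OrderRecord:
    title: str
    received: bool = False
    invoices: list = field(default_factory=list)

catalog = {"B123": OrderRecord(title="Example Monograph")}

def receive_and_pay(record_id: str, invoice_no: str, amount: float) -> None:
    """Single retrieval: receipt and payment recorded together."""
    record = catalog[record_id]                   # record called up once
    record.received = True                        # receipt noted...
    record.invoices.append((invoice_no, amount))  # ...and invoice created
    # In the pre-merger workflow this would have been two separate
    # retrievals of the same record by two different sections.

receive_and_pay("B123", "INV-0042", 57.50)
print(catalog["B123"])
```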
As the database was integrated, and as searchers began to actually select OCLC records for transfer into the local system, it became more evident that the database was a shared resource. Each section of this Unit realized that what one did directly affected the other's work. Since good communication and the ability to share in the decision-making process are crucial, it was clear that the decision to merge was a sound one. Training is also shared, as searchers learn cataloging rules in greater detail. We notice more of a concern on the part of searchers that they are choosing the records that catalogers want to use. They are much more aware of and sensitive to cataloging rules and interpretations, as they constantly ask which is the better record to choose, and it is in this way that their training continues. We have even asked catalogers to do some preorder searching. That was an eye-opener, as they realized the difficulty of selecting an OCLC record without the book in hand. This type of flexibility of staffing was not possible prior to the merging of the two departments. There is also a greater implicit trust in the work of the preorder searchers on the part of the catalogers. They trust that the record selected was the proper one to choose. With a paper search history no longer accompanying each book, that trust needs to become implicit.

Serials functions were combined during this reorganization as well. The Serials Unit now comprises serials receiving, serials adds, and serials cataloging. As we proceed with NOTIS and the closing of the serials shelflist, we have now begun to merge the serials receiving and adding functions. Again, the reason is to eliminate duplication of effort. It seems redundant to have the receiver receive an issue of a periodical and not, on the same NOTIS record, create the volume holding as well. Why pass the issue along to a different individual? Within serials, acquisitions, adds, and cataloging are, as Jennifer A. Younger and D. Kaye Gapen say, "so intertwined as to be inseparable" [1]. They continue by asking the question: why spend time separating the problems?

Technical Services has a much greater profile in libraries with integrated systems. We realize that whatever we do affects public services. The way we record current receipts, the way we indicate which volume is on order, the way we list our volume holdings all have a public profile. It is partially for this reason that the Database Management Section has been created; its prime goal is to maintain the integrity of the database. The section is responsible for authority control as well as error detection and correction. It also incorporates serials maintenance functions. This is one more way we have worked toward a true combination of not only departments but functions. This is an important section because it works to assure that the records all of the other sections have worked on fit into the larger database properly. It is a way of assuring quality as well as accuracy.

The benefits of the combined department have revealed themselves through some recent projects we have undertaken. A recent library-wide weeding project called upon searchers, receivers, catalogers, and serials staff as well as database management to come up with a shared set of procedures to work together to ensure the withdrawal of weeded titles. In the past, this would have crossed departmental lines and, as a result, would not have gone so smoothly.
Probably the burden would have fallen only upon the old cataloging department. In addition, recent training sessions for OCLC's Prism called upon the cooperation of supervisors from three sections to work as one to develop and implement training. Again, this cooperation might not have been possible prior to our reorganization. We have come a long way. The single department approach has resulted in a staff working together for the good of the library patron. Our shared automated functions have led to a mutual concern for the database. We are all a part of the end product and work together to ensure its quality and integrity.

But what might the future hold in terms of technical services organization? There are several products and innovations which are mandating additional change in the traditional technical service divisions that we have in our libraries. One is online access to vendor and publisher bibliographic tools, as well as EDI. We are also able to access bibliographic records beyond our own local online databases via the Internet. This ability has significant implications for searching, cataloging, and collection development. In addition, we have the prospect of workstations where the staff member can search, transfer a record, create and receive an order, and perform interlibrary loan functions. This ability to integrate acquiring, cataloging, and borrowing functions into one functional work area has far-reaching implications. Will we be seeing a marriage of ILL, document delivery, and acquisitions? If we order periodical titles on subscription, is it so different to order an article as well, for the library? For the patron?

Finally, technical service functions are becoming scattered throughout public service areas, and I strongly believe this is what the future holds in store for us. As our automated systems grow more and more sophisticated, so do they grow in interrelationship. The amount of detail that public services needs to know about how the system works increases, to the same extent that technical services needs to understand the demands and wishes of our patrons. The amount of information that the automated system is capable of providing increases that demand for crossover. The fine line that divided both divisions is disappearing. This, I believe, is a contributing factor in a trend toward the decentralization of technical service functions. Many of our branch libraries are beginning to check in periodical issues and create item records. Departments are transferring titles and assuming the technical responsibility for that as well. Other branches are cataloging maps and documents, and I see a future where bibliographers in collection development will soon have the capability of placing orders into the database for retrieval by the technical services staff for final verification and placement. The role that technical services will play once decentralization is taken further still needs to be established. How we ensure the quality of functions once these functions are scattered throughout the campus, and what our roles as technical service librarians will be, are the questions that we will be asked to address at some point in the very near future.

Note

1. Younger, Jennifer A. and D. Kaye Gapen. "Technical Services Organization: Where We Have Been and Where We Are Going," Technical Services Today and Tomorrow, ed. Michael Gorman. Englewood Cliffs: Libraries Unlimited, 1990, 180.
work_w5c6m3zyhfhwhlpwhgofsexb3i ----

Part B: Methods

Using Faculty Focus Groups to Launch a Scholarly Communication Program

Martin P. Courtois and Elizabeth C. Turtle
Kansas State University, Manhattan, Kansas USA

Abstract

Purpose – This paper explores the benefits of using faculty focus groups as an early component of a scholarly communications program, with suggestions for planning and conducting sessions, recruiting participants and analyzing outcomes.

Design/methodology/approach – Based on the authors' experiences and findings in using focus groups during the initial stages of organizing a scholarly communications program at Kansas State University.

Findings – Focus groups are an effective method to begin identifying scholarly communication issues that resonate with faculty on a particular campus. Focus groups can be helpful in targeting efforts to begin a scholarly communications program.

Practical implications – Focus groups are effective in generating insights, opinions and attitudes and are low cost in terms of time and resource commitments.
Originality/value – There is very little in the literature about using faculty focus groups to start a campus scholarly communication program. This article provides practical and useful information that other libraries can use to incorporate this method into their planning.

Keywords: Focus groups, Scholarly communications, Faculty

Paper type: Case study

Introduction

Scholarly communication is an umbrella term for a complex array of related issues, including authors' rights, copyright, access to information, peer review, and publishing, all of which have a direct impact on libraries, universities, and particularly, faculty. Driven by years of journal price increases, dwindling serials budgets, and the potential for new distribution channels, organizations such as ACRL, ARL, and others have intensified focus on scholarly communication. Academic libraries are responding by creating scholarly communication programs and, as with many areas of library service, involving faculty in discussions of these issues. Developing a dialog with campus groups, especially faculty, to increase their awareness of scholarly communication issues is important (Duncan, 2006). Libraries and universities employ a number of methods to involve faculty in this discussion, including surveys, blogs, individual interviews, seminars/speakers, departmental visits, and campus committees. Because of the broad nature of scholarly communication and the impact it has on many aspects of academic life, focus groups are a particularly apt method to use, especially in the early stages of program development. This article will explore the benefits of focus groups as a component of a scholarly communication program and offer suggestions for planning focus groups, recruiting participants, conducting the session, and analyzing outcomes.

About Focus Groups

Focus groups have been used as a social science research method since the 1920s and have been referred to as "focused interviews," "group interviews," "group depth interviews," and "focus group interviews" (Walden, 2006). The basic framework of a focus group is an open, in-depth discussion with a small group of individuals purposely selected to explore a predetermined topic of shared interest. This discussion is typically led by a moderator, but the setting is usually informal and encourages interaction among group members. Focus groups are effective in generating insights and providing qualitative data on participants' feelings, values, opinions, and attitudes. The group setting allows for probing answers, clarifying responses, asking follow-up questions, and testing assumptions. The process of interaction within the group will often stimulate new ideas and provide more detail than can be obtained with other survey techniques. Although planning and preparation are required to hold focus groups, overall costs in terms of time and resources are low.

There are some drawbacks. Focus groups will not provide quantitative data, and though it may be tempting, it is not valid to generalize or draw sweeping conclusions based on opinions expressed by only a few individuals. Even with an aggressive approach, focus groups will reach only a small percentage of a group. The open nature of focus groups makes them vulnerable to domination by the moderator or a participant, and all members of the group may not be equally represented in the discussion.
Despite the best efforts of the moderator to elicit discussion, some participants may be unwilling to share their true views and feelings in a group setting.

Focus Groups and Scholarly Communication

As libraries begin to focus on scholarly communication, focus groups are an excellent way to introduce faculty to the issues and learn their perspectives. Publications from the University of California's Office of Scholarly Communication refer to the use of "structured interviews" (focus groups) to promote and encourage university-wide planning and action to develop scholarly communication systems (The University of California, 2007). Participating in a focus group may be one of the few opportunities faculty have to interact with peers outside their department and to hear perspectives from fields whose traditions for peer review and scholarly publishing may be quite different from their own. For librarians, focus groups will generate ideas for educating faculty and promoting scholarly communication. The process of selecting participants for a focus group is a valuable exercise in identifying key faculty, and interaction within the group may set the stage for further work and projects with interested faculty.

Much of the attention on scholarly communication is focused on the need to educate faculty on issues such as retaining copyright, publishing in open access journals, and depositing articles in an institutional or subject repository. While it is important for faculty to be aware of these issues, it is equally important for librarians to become aware of faculty concerns related to scholarly communication. Are faculty under pressure to publish in certain journals? Do they see potential problems with depositing pre-prints of their research in an institutional repository? The open nature of focus groups encourages faculty to articulate their concerns and gives librarians the opportunity to hear faculty perspectives. Focus groups are an effective medium for raising new issues with faculty, but librarians also need to be ready to listen and learn from faculty.

Conducting a Focus Group

At Kansas State University, there was interest in launching a scholarly communication program, but uncertainty as to where to begin. Focus groups seemed like a good way to generate ideas on how to proceed and did not require a large investment of time and resources. Two librarians volunteered to serve as moderators, and the first faculty focus groups were held during Spring semester 2007. The next section describes methods used for planning the sessions, recruiting participants, and conducting the sessions.

Planning

Since the library was in the earliest stages of addressing scholarly communication, the goal for the focus groups was to hear views and opinions of faculty on a number of issues. During initial planning, the moderators identified the following issues to address with the focus group:

• Faculty awareness of scholarly communication issues
• Alternatives to traditional scholarly publication
• Barriers to implementing solutions to scholarly communication problems
• How tenure/promotion criteria affect these issues
• Open access journals
• Self-archiving
• Authors' rights and copyright
• Strategies for promoting scholarly communication at K-State

To address these topics, the moderators prepared a short list of questions (see Appendix) to pose to the group. It was decided, however, to be flexible and not set a time limit for discussion of any one topic.
If interest in a particular topic was high, the moderators would allow discussion to run its course before introducing the next question. The moderators also discussed the need to monitor the discussion and to make sure the session was not dominated by one individual. Focus group sessions were planned for 90 minutes. Moderators prepared brief presentations to give an overview of the "crisis" in scholarly communication and some of the possible solutions, including retaining copyright, self-archiving, and new publishing models. These presentations took about 15 minutes, with the rest of the session devoted to open discussion. A date toward the middle of the semester was selected and a small conference room in the library was reserved for the meeting.

Recruiting Participants

The moderators decided it would be best to start with faculty who were likely aware of issues in scholarly communication and who could offer ideas on how the library and university could address these concerns. Subject librarians were asked to suggest faculty who might be interested in attending a focus group, including those who were frequent library users, editors of journals, and prolific authors. Other key faculty were identified, which gave the moderators a list of 25 names. During this process, attention was paid to selecting faculty at various points in their careers and from a variety of disciplines. The moderators prepared an invitation which included a description of the focus group and a list of topics to address in the session. This was sent by e-mail to target faculty about 10 days before the session. Faculty were asked to RSVP, and 12 responses were received. Seven faculty stated they were interested but could not attend at the scheduled time. These faculty were contacted and another session was scheduled for a time when most of them could attend.

As the moderators worked on recruiting participants, several key guidelines emerged. One is that repeated contact and follow-up is necessary with faculty in order to elicit their participation. They may be interested in the topic, but their schedules and workloads present formidable hurdles to attending a session. Ask faculty when they have free time, send reminders, and make phone calls. These are small efforts to make when compared to the valuable insight gained through the focus groups. Secondly, it is beneficial to ask for help in identifying potential participants, and to think broadly in terms of who may be interested. Many excellent participants were identified by polling other librarians, many of whom work closely with faculty in academic departments. In addition, department chairs, members of the library committee, and faculty senators are all potential candidates.

Conducting the Session

Of the 8 faculty who indicated they would attend the first focus group, only 5 attended. As it turned out, this proved to be an ideal size for the group. Everyone in the group was able to participate freely, whereas a larger group would have likely prevented free discussion. For the second group, 4 faculty attended, and again this size group allowed for a free and open discussion. Although these groups seem small, at no point in either session did discussion drag. Place cards with the name and department of each participant were prepared beforehand, and proved helpful since faculty were from several different departments and had not met previously. The session began with introductions, followed by a brief overview by the moderators, and then open discussion.
The moderators posed questions to the group to move from one topic to the next. The moderators knew they would need to pay attention to the discussion and did not want to be burdened with having to take extensive written notes. Video recording was thought to be too intrusive, so a small digital audio recorder was used to record each session. Most modern voice recorders can record for several hours, and the device required no attention during the session.

Analysis and Outcome

After the sessions, the moderators reviewed the audio recordings. No attempt was made to extract quantitative data (e.g., 80% of participants thought journal prices are too high). Rather, the sessions were effective in identifying the range of faculty views on specific issues. For example, none of the faculty had made efforts to retain copyright for their works. Several faculty noted the need to publish in high profile journals for promotion and tenure considerations. Reaction to open access was mixed; some faculty viewed it as very important, while others said that dissemination and access to their work was not a concern. Faculty were not aware of the potential citation advantage of self-archiving, but all indicated they and their colleagues would be interested in participating in a self-archiving program as long as the process was simple and required little or no effort on their part.

The following points were also raised during the sessions:

• Several faculty mentioned interlibrary loan is effective in providing access to journals they need. Most do not perceive a "crisis" in scholarly communication, although they do see that access may be an issue in developing countries. Some faculty mentioned they try to avoid publishing in very high priced journals.
• There was confusion over self-archiving and its relation to peer review. The initial reaction in most cases was that self-archiving would replace peer review.
From the analysis of the ideas, feelings, and attitudes expressed in the focus groups, it became clear the first phase of the Library’s scholarly communication program needed to focus on working with faculty and library staff to raise awareness and understanding of the issues. To address this, the following projects were outlined: • Hold staff seminars to identify “talking points” subject librarians can use to engage their faculty in discussions on scholarly communication; • Establish a library-based web site that identifies and defines key issues within scholarly communication; 6 • Create an automated, online presentation that will serve both as a tool for librarians in opening presentations to faculty and as a resource for interested faculty to view on their own; • Plan a seminar with invited speakers to focus both faculty and administrator attention on scholarly communication. The most tangible outcomes, however, were projects launched through faculty contacts made at the sessions. Two faculty members, energized by the potential of self-archiving and open access, agreed to initiate pilot projects for adding the scholarly work of faculty in their departments to K-State’s budding institutional repository. Although these projects have a specific goal of creating subject- based collections within the repository, they also provide the opportunity to further engage faculty in discussion and exploration of the many facets of scholarly communication. Conclusion While the Library has a long way to go in establishing a scholarly communication program, focus groups proved to be an effective mechanism for identifying the first steps to take. For other libraries seeking to address scholarly communication issues, focus groups can help identify topics that resonate with faculty at a particular institution and help target the initial thrust of the program. References Duncan, J., Walsh, W., and Daniels, T. (2006) “Issues in scholarly communications: creating a campus-wide dialog”, The Serials Librarian, Vol 50 No 3/4, pp. 243-248. The University of California Office of Scholarly Communication and the California Digital Library eScholarship Program in association with Greenhouse Associates, Inc. (2007) “Faculty attitudes and behaviors regarding scholarly communication: survey findings from The University of California”. Available http://osc.universityofcalifornia.edu/responses/materials/OSC-survey-full- 20070828.pdf . Walden, Graham R. (2006) “Focus group interviewing in the library literature”. Reference Services Review, Vol 34 No 2, pp. 222-41. Appendix: Discussion Questions These are questions the moderators had prepared to pose to the focus group. 7 http://osc.universityofcalifornia.edu/responses/materials/OSC-survey-full-20070828.pdf http://osc.universityofcalifornia.edu/responses/materials/OSC-survey-full-20070828.pdf 1. Why do you publish (communicate findings to peers, advance career, gain funding, financial reward, prestige, etc) and which of these reasons is most central to your work? 2. How do you choose which journals you publish in? What are the factors that determine an acceptable level of quality? What values (university, departmental, discipline) come into play? 3. What are some of the challenges and changes to publishing in your field? 4. Have you ever published or considered publishing in an open access journal? How would this be accepted by your department? What institutional capacity exists to support open access and/or author-pays models? 5. 
5. Of the professional associations you belong to, are you familiar with their standing on scholarly communication issues (reasonable journal prices; whether they contract or sell their publications to a commercial publisher; whether they have launched an open access journal; whether they have disciplinary repositories)?
6. How do you handle copyright for your publications? Have you ever retained, modified, or negotiated your copyright?
7. We're considering expanding the scope of K-Rex to include collections of faculty papers by creating a disciplinary or institutional repository. Is this something that would be of value to you? Would you voluntarily use it?
8. What kind of follow-up to this meeting would you like to see? How should the university respond to these issues? What strategies should we use to educate and involve more faculty?

Additional questions to consider:

• Do you retain copyright when you publish an article?
• Do you make copies of your articles available on a web site?
• Do you ever have problems getting articles you need for your research or teaching?
• Have you ever used an article or other paper that was freely available on the web?

Author Biographies

Martin P. Courtois coordinates the electronic theses and dissertations program and the institutional repository at Kansas State University, Manhattan, Kansas.

Elizabeth C. Turtle is a science librarian and member of the newly-created Repository Services Team at Kansas State University Libraries, Kansas State University, Manhattan, KS.

work_whltpsnd45arnozfbl6awpizli ----

DOCUMENT RESUME

ED 433 819  IR 057 449
AUTHOR Darzentas, Jenny
TITLE Sharing Metadata: Enabling Online Information Provision.
PUB DATE 1999-05-00
NOTE 10p.; In: The Future of Libraries in Human Communication: Abstracts and Fulltext Documents of Papers and Demos Given at the [International Association of Technological University Libraries] IATUL Conference (Chania, Greece, May 17-21, 1999) Volume 19; see IR 057 443.
AVAILABLE FROM Web site: http://educate.lib.chalmers.se/IATUL/proceedcontents/chanpap/darzenta.html
PUB TYPE Reports - Evaluative (142); Speeches/Meeting Papers (150)
EDRS PRICE MF01/PC01 Plus Postage.
DESCRIPTORS *Academic Libraries; Bibliographic Utilities; *Cataloging; *Cooperative Programs; Distance Education; *Electronic Libraries; Foreign Countries; Higher Education; Information Management; Instructional Materials; *Library Cooperation; Library Role; Library Services; *Shared Library Resources; Union Catalogs
IDENTIFIERS *Electronic Resources; Greece; Learning Environments; MARC; Metadata; OCLC

ABSTRACT This paper describes work being carried out in the fields of online education provision and library systems, beginning with a description of the current state of the art with regard to online learning environments and educational materials management. Suggestions and solutions for librarians dealing with the management of educational digital content for education service providers are presented, including cooperation and collaboration, linking publishers and national bibliographies, MARC and metadata, and union catalogs/virtual union catalogs. An overview is provided of UNIverse, a European-funded project that is developing a library system to support a virtual union catalog and that offers mechanisms for facilitating cataloging activities by enabling record supply.
The paper then focuses on the experiences of a sub-group of the UNIverse project, headed by the National Library of Greece, to test and evaluate the record sharing capabilities of the system and collaborative cataloging in practice. (Contains 24 references.) (MES)

SHARING METADATA: ENABLING ONLINE INFORMATION PROVISION

Darzentas, Jenny
Kyros, 55 Evripidou & Thisseos Av, Kallithea 176 74, Athens, Greece
E-mail: kyros@compulink.gr

Introduction

Online education, although by no means perfected, is now a reality. Hand in hand with its development are the continuing advances in education materials management. This paper describes work being carried out both in the field of online education provision and in library systems. It briefly describes a prototype online learning environment (GESTALT) [1] and highlights the implications of such environments on libraries in terms of discovery of course components and relevant support material. The task of cataloguing, already one of the heaviest in terms of human resources, becomes an increased burden when it relates to digital material. It becomes necessary to describe not only the content, form and location of such material, but also other metadata concerning its accessibility and delivery media. Again, digital material may be composed of many separate components which each carry separate cataloguing requirements. In the context of the learning environment, a lecture may have text, sound, graphics, video, self-assessment exercises, and a bibliography with hyperlinks. It is possible to tag all these digital objects with metadata in order to describe them and also to aggregate/disaggregate them so that the material may be used in a highly modular way. Such a vision of online information provision requires the capability of searching through online repositories of information in an efficient manner, and requires libraries to be able to support the cataloguing activities about their collections to this degree of detail. The UNIverse project [2] is developing a library system to support a virtual union catalogue. It also offers mechanisms for facilitating cataloguing activities by enabling record supply. This activity can be viewed in the wider context of setting up infrastructures for libraries to share information not only about their catalogues and material, in a traditional sense, but also to prepare for what can be seen as a future enhancement of their role: sharing information about digital objects. The UNIverse system is already capable of processing the whole of the retrieval process from search and locate to order and delivery of digital objects over networks.
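To suggest what such tagging of a modular digital object might look like in practice, the Python sketch below describes one component of a lecture with descriptive metadata plus the accessibility and delivery metadata discussed above, serialised as simple XML. The element names are Dublin Core-flavoured examples chosen for illustration; they are not the GESTALT element set.

```python
# Illustrative sketch: a modular learning object tagged with descriptive
# metadata plus access/delivery metadata, serialised as simple XML.
# Element names are Dublin Core-flavoured examples, not the actual
# GESTALT or other standard element set.
import xml.etree.ElementTree as ET

def describe_component(values: dict) -> ET.Element:
    """Build a small XML description from name/value pairs."""
    obj = ET.Element("learning-object")
    for name, value in values.items():
        ET.SubElement(obj, name).text = value
    return obj

lecture_video = describe_component({
    "title": "Lecture 3: Metadata Basics (video segment)",
    "creator": "J. Darzentas",
    "format": "video/mpeg",               # delivery medium, not just content
    "requires": "MPEG-capable player",    # access platform metadata
    "relation": "partOf: Lecture 3",      # supports aggregation/disaggregation
})

print(ET.tostring(lecture_video, encoding="unicode"))
```

The point of the "relation" element in this sketch is the modularity argument above: if each component carries its own description, components can be recombined into new aggregates without re-cataloguing.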
This paper focuses on the experiences of a sub-group set up within the UNIverse project to specifically test and evaluate the record sharing capabilities of the system, and collaborative cataloguing in practice. These experiences, not only as they relate to the system but also as they relate to the wider context of networked information and metadata tags for retrieval, are presented here. The paper begins with a description of the current state of the art in regard to online learning environments and metadata descriptions of the learning objects which constitute the course and other relevant material, along with current practice in union catalogue assembly and maintenance. It continues with an overview of the UNIverse project and the collaborative cataloguing experiment that was conducted within it. Finally, concluding remarks about the nature of the implications upon libraries and their present and future modes of cataloguing activity are made.

Acknowledgements

Some of the work described here is being carried out under the EU's Telematics for Libraries Programme, project UNIverse, and the EU's ACTS Programme, project GESTALT. I thank the teams of people in these projects and in particular that of the Greek Special Interest Group from UNIverse for their support.

Online learning and education materials management

The overall importance of the role of libraries in education, and moreover in distance education [3], is well-recognised [4]. Two important factors can be cited which, among others, contribute to their increasing participation in educational practice. On the one hand, there is the constructivist pedagogical model influencing much of present day educational thinking, putting great emphasis on the notions of learning by discovery and exploration; on the other, there are the technological innovations which enable access to an increasingly wide range of materials. As has been extensively documented [4, 5], what this means for the librarian is that the task of mediating between learner and resources becomes more imperative, with the added pressure that they must combine elements of professional librarianship, such as enquiry and research activities, with technical expertise [6]. In addition, with both remote and on-campus users, they are often the primary source of instruction for students in the use of email, database querying, and other skills.

For librarians, mediating between users and resources is but one, albeit very important, facet of their mission. They are, of course, also required to select, acquire, organise, make accessible, and preserve material. All this, while they are being subjected to enormous increases in both the numbers of users and the amount of material they can mediate access to. One example of the increase in material, which is relevant to the education scenario, is the increasing tendency for academic institutions to consider all sorts of content production by their teaching staff as valuable commodities, and to be looking for some kind of asset management system to handle this in-house material. This content is typically primary content material, made up of lecture notes and assignments, reading lists and exam questions. But as tutors begin to explore the possibilities of new technologies for teaching, and bow to the pressure to provide content which can be transmitted to remote students, the material becomes increasingly multimedia.

Historically, either the content authors kept control of such material, or in some other cases the computing services department, as technical experts, were given custody. However, as the volume of such material increases, and with the realisation by education service providers of the potential for exploitation of this material, the need for adequate management becomes more and more pressing. Furthermore, the philosophy of treating this material as reusable modules is increasingly prevalent. For both the educationalist and the information science professional, this calls into play questions of granularity. What is the smallest unit of knowledge, and what should be visible from the catalogue for that material? There is also the question of what other information about the resource should be recorded. Sufficient descriptions of the modules are required, so that they can be searched and located, and in addition displayed and manipulated. Digital resources have other descriptive needs, especially when they exist not in any tangible form, such as a CD or a video, but only as bits and bytes that can only be apprehended by the correct access platform of software and hardware. It is not surprising that the Library should be called upon to manage these assets, since it has amassed the most expertise in these areas.

At the present time, there is much research and effort going into the design of metadata for educational software, and into trying to pin down standards that will enable interoperability of implemented metadata,
Historically, either the content authors kept control of such material, or in some other cases, the computing services department, as technical experts, were given custody. However, as the volume of such material increases, and with the realisation by education service providers of the potential for exploitation of this material, the need for adequate management becomes more and more pressing. Furthermore, the philosophy of treating this material as reusable modules is increasingly prevalent. For both the educationalist and the information scientist professional, this calls into play questions of granularity. What is the smallest unit of knowledge, and what should be visible from the catalogue for that material? There is also the question of what other information about the resource should be recorded. Sufficient descriptions of the modules are required, so that they can be searched and located, and in addition displayed and manipulated. Digital resources have other descriptive needs, and more especially when they exist not in any tangible form, such as a CD or a video, but only as bits and bytes that can only be apprehended by the correct access platform of software and hardware. It is not surprising that the Library should be called upon to manage these assets, since it has amassed the most expertise in these areas. At the present time, there is much research and effort going into the design of metadata for educational software, and into tying to pin down standards that will enable interoperablity of implemented metadata, 3 10/7/99 11:33 AM Paper http://educatelib.chalmers.se/IATUL/proceedcontents/chanpap/darzenta.html and p.articularly in regard to learning object metadata. In this respect, one can mention, the work of the Dublin Core 2, the IEEE LTSC 8-, and the IMS 2 in the States, and the CEN/ISSS working group on Learning Technologies IQ in Europe and the ACTS funded GESTALT project I. The GESTALT project looks at the process of online learning from a holistic viewpoint, seeing the whole of the process from searching for a course, via an electronic broker, or Resource Discovery Service, to the student enrolling in a Learning Environment, to follow a programme of study and making use of assets (both primary educational content and supporting material) from the Asset Management System. GESTALT is in the process of defining metadata sets based upon the emerging standards for ensuring interoperability of the whole system. Again, in accordance with emerging standards, the encoding of the metadata will be done in XML II. This paper is not the place for discussion of these very interesting developments, instead, it wishes to point out the very real burden that will be placed afresh on librarians who will be asked to manage educational digital content for education service providers. For whereas professional publishers of digital material may go some way to help with pre-cataloguing items, it is doubtful whether educational content providers will do so, or will be able to do so, unaided. For the librarian to be able to cope with the new influx, some re engineering of present modus operandi may have to be undertaken. 
In the next section, suggestions and solutions for addressing various parts of this complex activity are presented Co-operation and Collaboration: Linking publishers and national bibliographies; MARC and metadata; Union catalogues and virtual union catalogues It has been recognised by the library world that bibliographic control over electronic publications (especially those published via networks) is not adequate in the face of the continuous growth in the amount of material being published chiefly or solely in electronic form. Equally disturbing is the recognition that there is no agreed standard of bibliographic description for electronic publications. These were two of the issues that the BIBlink -E2 project, funded by the EU, attempted to go some way to tackling. The BIBLink project, grew out of the CoBRa project la which recognised that the significant growth in electronic publishing raised issues that needed to be addressed at an international level. Project BIBLINK called upon the bibliographic expertise of the national libraries of Europe, working in conjunction with partners in the book industry, to examine ways that electronic publications are described for catalogues and other listings. Thus BIBlink spent effort mapping from various MARC formats to various metadata schema. They found that several MARC formats were going through the process of being updated to enable cataloguing of electronic publications, in particular on-line publications. MARC format has unique value for integrating metadata describing electronic resources into existing legacy systems. If libraries wish to integrate metadata into their existing systems, and use existing software (albeit with some updating to deal with new fields) then MARC offers a solution. Indeed, most work has been done on adapting the USMARC format for the cataloguing items accessible through the Internet. OCLC's Intercat -1-4 project has served as a test bed for the cataloguing of network resources, and as a means to introduce and verify new fields and fine tune as required. Over 200 libraries participated in this project, the majority of them academic (60%) and nearly all of them situated in the US. There are at present nearly 83,000 records in the Inter Cat database. To understand why MARC formats should be extended, it is necessary to understand something of the topology of metadata. An essential aspect of the level of richness of a format is the extent of the content, both in terms of range and depth. The attempt to describe more or less aspects of an object will be reflected in the overall level of complexity, for example designation or format rules for content. In order to identify the extent of content the elements describing an object can be clustered into groups. An example may be seen in a reference model for business-acceptable communication proposed by 3 of 9 4 10/7/99 11:33 AM Paper http://educate.lib.chalmers.se/IATUL/proceedcontents/chanpap/darzenta.html 4 of 9 Bearman 15. This defines clusters of data elements which would be required to fulfil the range of functions of a record. The functions of records are identified as the provision access and use rights management, networked information discovery and retrieval, registration of intellectual property, and authenticity. The clusters of data elements are defined in six layers: 1. Handle Layer registration metadata or properties o record identifier o information discovery and retrieval 2. 
Terms and Conditions Layer o rights status metadata o access metadata o use metadata O retention metadata 3. Structural Layer o file identification o file encoding metadata O file rendering metadata o record rendering metadata o content structure metadata o source metadata 4. Contextual Layer o transaction content o responsibility o business function 5. Content Layer O content description 6. Use History Layer From the above, it is clear that Bearman's model looks at the record in a wider context than the bibliographic context alone, and it is particularly relevant to this paper as it takes account of the business context in which metadata is used. Bearman includes metadata elements that are appropriate for metadata in the context of publishing and supply. In the new model of educational content suppliers, some of these business related metadata will be needed, if education service providers are to market their courses in a global competitive market, and if they are to deliver globally, then it is essential that the metadata take account of delivery mechanisms. Taking the issue of cataloguing electronic resources from another angle, there have been several attempts to catalogue resources on the Internet in both automated and collaborative fashions. Take for instance, the amount of work on subject gateways 16.. Subject gateways are labour intensive to develop and maintain. They require the constant input of staff who hand pick, classify and catalogue each Internet resource. This is both the strength and the weakness of gateways. The human input allows for semantic judgements and decisions that are the key ingredient for creating a quality controlled gateway. This ingredient is lacking in automated indexes or search engines which can not filter information in such a meaningful way. However, considerable time and effort is needed to make these judgements and decisions and this means that the collection of resources is often small and slow to grow. As the number of resources available over the Internet increases, gateways need to develop ways of increasing the number of resources they can catalogue. The DESIRE project 12. has identified two ways in which this might be done: firstly by distributed cataloguing, which increases the number of people adding resources, and secondly by automatic metadata entry: improving the efficiency of the cataloguing process. 5 10/7/99 11:33 AM Paper http://educatelib.chalmers.se/IATUUproceedcontents/chanpap/darzenta.httn1 5 of 9 In order to perform automatic metadata entry, subject gateways would harvest the metadata produced by subject communities into templates. One of the main issues of automatically generating templates is ensuring that the high standards (that set apart gateways from automatic search engines) are maintained. This means that both the resources included in the database should be of high quality as well as the catalogue records themselves. The DESIRE researchers suggest that ensuring the integrity of resources could be achieved by only harvesting automatically metadata from 'trusted' information providers. A trusted provider would be a site or organisation that had been previously evaluated by the information gateway as a high quality resource. 
To ensure that the catalogue records remained of a high and consistent quality, the gateways would need to promote 'good use guidelines' (including the use of controlled vocabularies) for the production of metadata within their subject community.

Along the same lines, in September 1998 OCLC launched a worldwide call for participants in its Co-operative Online Resource Catalog (CORC) project, seeking to automate the cataloguing of Internet resources. The aim of the project is to explore the co-operative creation and sharing of metadata by libraries. Besides libraries, museums, archives, publishers and other institutions that face similar problems with the proliferation of resources on the Web are invited to participate. The project will build upon OCLC's prior activities in creating Internet resource databases through such projects as the OCLC NetFirst [19] and InterCat [20] databases, but the CORC project will rely more heavily on automated means to build its database. Both NetFirst and InterCat records will be used initially to seed the CORC database. Both full USMARC cataloguing and an enhanced Dublin Core metadata mode will be used.

As can be seen from the above two projects, fundamental to these efforts is the co-operation and collaboration of library and other staff. They have been able to build on pre-existing shared cataloguing activities to create networks that enable quicker responses to the problem of the influx of the Web. These shared cataloguing activities are at the heart of this paper, and so deserve further scrutiny. The idea of collaborative cataloguing is not new, but it was enabled by technology. From the time MARC was introduced, and libraries began the tremendous job of converting from physical card catalogues to machine-readable ones, the idea of commercial record supply and union catalogues began to take hold. In the late 1960s, the convergence of technology and a good idea brought the library world into a new era of shared goals and resources. According to OCLC, the "visionary dream" of co-operative cataloguing is now deeply embedded in library economics, and the result has been the most widely used academic database on the Internet, WorldCat (the OCLC Online Union Catalog) [21].

The step from union catalogues to virtual union catalogues has had to wait until technology was mature enough to support networking, but there remain the known problems of rights of access, etc. The best-publicised example of virtual union catalogues is that of the Virtual Canadian Union Catalogue (vCUC) [22]. The concept of the vCUC involves a decentralised, electronically accessible catalogue created by linking the databases of several institutions. The full implementation of a distributed, linked union catalogue to support all aspects of resource sharing is a complex process involving the resolution of technical, policy and service issues. Obviously, these issues cannot be tackled all at once, and the initiative is therefore limited to five interlinked issues. These are: the primary use of union catalogues in support of interlibrary loan, and identifying and resolving issues related to the record syntax to be used (USMARC and/or CAN/MARC); the provision of holdings information (accessibility and coding); the roles and responsibilities of the union catalogue participants; the standardisation of the use of library symbols; and finally, the format and degree of detail for holdings data.
For some, virtual union catalogues are still too fraught with insoluble issues to be viable. For instance, in a nationally funded project to produce specifications for a union catalogue of university libraries in Greece [23], the decision was made to design a union catalogue with a centralised database, rather than the virtual model with distributed databases. Although this decision was not considered by all those involved to be the most forward-thinking, it was seen as the most pragmatic in a region lagging far behind in terms of organised library co-operation. As their report explains, many libraries have automated systems and have processed part of their collections, but there is no shared cataloguing activity; every library does its own cataloguing independently. The only co-operation patterns to have evolved are among academic and research libraries that subscribe to a serials co-operative catalogue operated by the National Documentation Centre. As with most countries in this situation, a certain amount of leapfrogging will take place, and the design of the centralised catalogue of 32 higher education institutions can be seen as a first step in bringing collaboration and networking to Greek academic librarians, and breaking the mould of isolation. The case of the Greek academic and research libraries has been picked out as it provides the background for the collaborative cataloguing experiment taking place within the Greek group of libraries that is testing the UNIverse system.

UNIverse and collaborative cataloguing

Within the European-funded project UNIverse, a large-scale project based on the concept of a virtual union catalogue, a series of advanced library services is offered to both end-users and librarians, namely:

- Search and retrieve: very large scale, transparent multi-database searching
- Mixed-media document delivery: integrated into the search and retrieve process
- Inter-library loans: integrated into the search and retrieve process
- Collaborative cataloguing/record supply: an efficiency gain for the librarian

The virtual union catalogue forms the core of the UNIverse system, around which a number of key features have been built. Firstly, the ability to perform parallel searches upon multiple physical databases which have a variety of access methods, record syntaxes, character sets and languages, and to see the results as if a single logical database were being searched. Secondly, the multiplicity of data sources is hidden from the user, and a high quality of service is achieved both in terms of performance and data quality through record de-duplication and merging. Thirdly, through the use of Open Distributed Processing techniques, the architecture has potentially unlimited scalability whilst maintaining high performance. The libraries that are testing and validating the collaborative cataloguing aspects of the system are those in the Greek group headed by the National Library of Greece. This group comprises universities, a professional society library, and the library of an internationally renowned college. While there are some overlaps in their collections, the group's main cohesion derives from the willingness of its librarians to enter into such experiments, and their hopes that this will lead to greater collaboration between their institutions.
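The parallel search-and-merge behaviour described above can be sketched in miniature as follows. The stub search functions stand in for the heterogeneous physical databases (which a real system would reach through Z39.50 or similar access methods), and the de-duplication key is deliberately crude; everything here is illustrative, not a description of the actual UNIverse implementation.

```python
# Sketch: a virtual-union-catalogue search that queries several physical
# databases in parallel, then de-duplicates and merges the results into
# one logical result set. The stub targets and record fields are
# hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def search_library_a(query):
    return [{"title": "Greek Serials Handbook", "author": "Pappas, N.", "holder": "Library A"}]

def search_library_b(query):
    return [{"title": "GREEK SERIALS HANDBOOK ", "author": "Pappas, N", "holder": "Library B"}]

TARGETS = [search_library_a, search_library_b]

def dedup_key(record):
    # Normalise just enough to catch the same work catalogued slightly
    # differently; real de-duplication rules are far more elaborate.
    return (record["title"].strip().lower(),
            record["author"].strip().rstrip(".").lower())

def union_search(query):
    with ThreadPoolExecutor() as pool:                      # parallel searches
        result_sets = list(pool.map(lambda target: target(query), TARGETS))
    merged = {}
    for record in (r for rs in result_sets for r in rs):
        key = dedup_key(record)
        if key in merged:                                   # duplicate: merge holdings
            merged[key]["holders"].append(record["holder"])
        else:
            merged[key] = {"title": record["title"].strip(),
                           "author": record["author"],
                           "holders": [record["holder"]]}
    return list(merged.values())

print(union_search("greek serials"))
# -> one logical record, held by both Library A and Library B
```

The user sees only the merged list, which is what gives the impression of a single logical database being searched.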
In the wider context, some of the aims and benefits of a collaborative cataloguing service are: better use of staff resources; enhanced records; mutual benefit to specialist libraries; contribution to a virtual union catalogue; and a potential source of revenue for supplying libraries. However, in the context of the Greek group of libraries, whose history of collaborative cataloguing is non-existent, their hopes are more specific. UNIverse offers the attraction of a virtual union catalogue, with all the advantages of immediacy, flexibility, and scalability. Each institution involved employs a substantial number of cataloguers as a proportion of its total staff, and they hope UNIverse will offer a better use of these staff resources in terms of quicker throughput of material, a substantial lessening of the cataloguing backlog, and better-quality records. They understand the virtues of collaborative cataloguing, as opposed to simple record supply, which will also enable them to share specialist subject knowledge.

The libraries are at present engaged in evaluating the system. The plan is to test the use of the collaborative cataloguing scenario over five features of the record supply service. These features are: searching for and retrieving records for download using a variety of fields; merging records/multiple records; creating records; enhancing records; and testing the use of the audit trail, where libraries use the UNIverse client and the server is able to record data for the supplying library. Wherever possible each library will play both supplier and recipient roles.

Technically, the system is simple to understand. Initially, the user will search a number of targets using the UNIverse client. This search process will generate a result list from which the user can select records. When the user selects the option to download (or export) a record, a dialogue is presented to allow the file name and the required record format/syntax to be specified. Typically the local catalogue system will have a daemon process running that looks for files appearing in a pre-determined directory. When new files appear, the process will import the records into the local database. The record download system will then be used to place records into this directory, causing them to be automatically uploaded into the local catalogue. (This daemon process is not part of the UNIverse system.)
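A minimal sketch of the kind of import daemon just described might look as follows. The spool directory, the file suffix and the import_records stub are hypothetical placeholders for local-system specifics; as noted above, such a process sits outside UNIverse itself.

```python
# Sketch: a local-catalogue import daemon that watches a pre-determined
# directory for downloaded record files. Paths, suffix, and the import
# routine are placeholders for whatever the local system actually uses.
import time
from pathlib import Path

INCOMING = Path("/var/spool/catalogue/incoming")   # the pre-determined directory

def import_records(path: Path):
    """Placeholder: parse the record file and load it into the local database."""
    print(f"importing {path.name} into the local catalogue")

def watch(poll_seconds=30):
    INCOMING.mkdir(parents=True, exist_ok=True)    # ensure the spool directory exists
    while True:
        for path in sorted(INCOMING.glob("*.mrc")):   # e.g. files of exported records
            import_records(path)
            path.unlink()   # remove once imported; a real daemon would also
                            # guard against files that are still being written
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch()
```

The UNIverse record-download dialogue then only has to write the exported file into this directory for it to appear in the local catalogue.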
Some predictions for the future

MARC has been with us for nearly 30 years and has been very useful, but the new Internet and web-enabled communications require new indexing paradigms, or at least extensions to existing MARC. However, the vision of embedding, or attaching, other digital information to the bibliographic record is strong. The influx of digital resources is already overwhelming, and the expected influx of educational material promises to place even more urgent demands upon education service providers' asset management staff. The problems are still in search of the best solutions. The technological change affects the objects to be described and the systems used to manage bibliographic data. The issue was laid out succinctly by Hickey: "Now, libraries need a system to create and share metadata for online resources to help automate resource selection, creation of the metadata itself and maintenance of links." [24]

Fundamental to the technical system of creating and sharing metadata will be the same types of human-centred networks already existing for collaborative cataloguing activities. The metadata will probably exceed by far the level of detail found in the average bibliographic record. As we have tried to show in this paper, and as is the experience of the UNIverse Greek SIG, collaborative cataloguing and, eventually, metadata sharing will in the end depend as much on the technology as on the co-operative networks of participants involved.

References

1. GESTALT (Getting Education Systems Talking Across Leading-edge Technologies)
2. UNIverse: http://www.fdgroup.co.uk/research/universe/
3. Stork, H-G. Digital Libraries and their Impact on Distance Learning: A European Perspective, IATUL News, Vol. 6, 1997, no. 4. http://educate.lib.chalmers.se/IATUL/4-97.html
4. Holowachuk, D. The Role of Librarians in Distance Education, 1997. http://hollyhock.slis.ualberta.ca/598/darlene/distance.htm
5. Prestamo, A.M. Development of Web-Based Tutorials for Online Databases, 1998. http://www.library.ucsb.edu/istl/98-winter/article3.html
6. Hastings, K. and Tennant, R. How to Build a Digital Librarian, D-Lib Magazine, November 1996. http://www.dlib.org/dlib/november96/ucb/11hastings.html
7. http://purl.org/metadata/dublin_core/
8. http://grouper.ieee.org/groups/ltsc/index.html
9. http://www.imsproject.org/index.html
10. http://cenorm.be/isss/news/default.html
11. There are a wide variety of resources on XML; for an introduction see http://www.w3.org/XML/, and also the XML zone at http://xml-zone.com/
12. http://hosted.ukoln.ac.uk/biblink/
13. http://www.bl.uk/information/cobra.html
14. http://orc.rsch.oclc.org:6990/
15. Bearman, David and Sochats, Ken. Metadata Requirements for Evidence. Available at URL: http://www.lis.pitt.edu/~nhprc/model.htm
16. Subject gateways mentioned here: DutchESS (The Dutch Electronic Subject Service, Netherlands) http://www.konbib.nl/dutchess/docs/info.html#8; EEVL (The Edinburgh Engineering Virtual Library, UK) http://www.eevl.ac.uk/volunt.html; ADAM (Arts, Design, Architecture and Media Information Gateway, UK) http://adam.ac.uk/friends/; Biz/ed (Economics and Business Education Gateway, UK) http://www.bized.ac.uk/inform/infohome.htm; EELS (Engineering Electronic Library, Sweden) http://www.ub2.lu.se/eel/about.html
17. Worsfold, E. Distributed and Part-Automated Cataloguing (A DESIRE Issues Paper), March 1998. http://www.sosig.ac.uk/desire/cat/cataloguing.html and http://www.desire.org/; there is also a DESIRE 2 project.
18. ftp://ftp.rsch.oclc.org/pub/Internet_cataloguing_project/Manual.txt
19. See http://www.oclc.org/oclc/new/n234/prod_netfirst_continues_growth.htm for the most recent news.
20. See [14].
21. http://www.oclc.org/oclc/promo/7775os/worldcat.htm
22. http://nlc-bnc.ca/resource/vcuc/index.htm
23. This report, in Greek but with an English-language summary, can be found at http://www.ntua.gr/library/deliv01.htm
24. ftp://ftp.rsch.oclc.org/pub/Internet_cataloguing_project/Manual.txt
work_whq6nqtgf5flfjdp6xcumwqazq ---- Microsoft Word - 2. BP_2016_12 INDEX.docx

BAJO PALABRA. Revista de Filosofía II Época, Nº 12 (2016)
The Journal Bajo Palabra successfully diffuses the authors' research results mainly through the Institutional Repository of the Humanities Library at the Autonomous University of Madrid, as well as through different databases, catalogues, institutional repositories, specialized blogs, etc.

Bajo Palabra has a good ranking in several quality editorial indexes:

• LATINDEX
• DICE. Diffusion and Editorial Quality of Spanish Journals of the Humanities and of Social and Legal Sciences
• BDDOC CSIC: Journals of Social Sciences and Humanities
• ESCI. Emerging Sources Citation Index by Thomson Reuters: +3.5
• CIRC: Integrated Classification of Scientific Journals
• ANEP: The National Evaluation and Foresight Agency. Category ANEP: B
• ISOC, CIENCIAS SOCIALES Y HUMANIDADES
• RESH. Revistas españolas de Ciencias Sociales y Humanidades
• ERIH PLUS. European Reference Index for the Humanities and Social Sciences (Norwegian Centre for Research Data)
• Ulrich's Periodicals Directory
• CECIES. Journals of Latin-American Thought and Studies
• I2OR. International Institute of Organized Research
• DRJI. Directory of Research Journals Indexing
• IN-RECH. Índice de impacto. Revistas españolas de Ciencias Humanas
• MIAR (System for quantitatively measuring the visibility of social science journals based on their presence in different types of databases; ICDS of BAJO PALABRA: 4.230)
• The Philosopher's Index

It is freely accessible through:

• INSTITUTIONAL REPOSITORY OF THE UAM. BIBLOS-E ARCHIVE
• DIALNET, Web portal for the diffusion of Spanish scientific production
• UNIVERSIA LIBRARY
• E-REVISTAS. Plataforma de Open Access de Revistas Científicas Electrónicas (CSIC)
• REDIB. Red Iberoamericana de Innovación y Conocimiento Científico
• REBIUN. NETWORK OF UNIVERSITY LIBRARIES
• VIRTUAL LIBRARY OF BIOTECHNOLOGY FOR THE AMERICAS
• AL-DIA. SPECIALIZED JOURNALS
• COPAC. National, Academic and Specialist Library Catalogue (United Kingdom)
• ZDB. Deutsche Digitale Bibliothek (Germany)
• EZB (Elektronische Zeitschriftenbibliothek) (Germany)
• SUDOC Catalogue (France)
• OCLC WorldCat
• DULCINEA. SHERPA/RoMEO
• EBSCO's database products
• Fuente Académica Plus
• DOAJ, Directory of Open Access Journals

It has been quoted in multiple blogs and Web sites:

• CANAL BIBLOS: Blog of the Library and Archive of the UAM
• LA CRIÉE: PÉRIODIQUES EN LIGNE
• HISPANA. Directory and collector of digital resources
• Essential Philosophical Library
Thanks to the Nuevo Portal de Revistas electrónicas UAM (https://revistas.uam.es/bajopalabra) and to the excellent journal-exchange service provided by the Humanities Library of the Autonomous University of Madrid, issues of Bajo Palabra can be consulted in numerous libraries, and the journal currently conducts exchanges with more than 40 different humanities journals.

*NEWS: Bajo Palabra. Journal of Philosophy has been indexed this year in ESCI (Emerging Sources Citation Index by Thomson Reuters), Journal Index, Fuente Académica Plus, EZB (Elektronische Zeitschriftenbibliothek), AE Global Index and ERIH PLUS (European Reference Index for the Humanities and Social Sciences, Norwegian Centre for Research Data). The journal is currently undergoing the indexing process with CARHUS, Arts and Humanities Citation Index (ISI) and SCOPUS. More information in: https://revistas.uam.es/bajopalabra/pages/view/INDIZACI%C3%93N

Comparison of Spanish scientific journals according to Google Scholar Metrics (2011-2015): Bajo Palabra's H index in Google Scholar Metrics is 11 (11th of 50 journals in Philosophy and Theology); h5-index: 4; h5-median: 5.
https://scholar.google.com/citations?hl=es&view_op=search_venues&vq=BAJO+PALABRA

Source: Ayllón, J.M.; Martín-Martín, A.; Orduña-Malea, E.; Delgado López-Cózar, E. (2016). Índice H de las revistas científicas españolas según Google Scholar Metrics (2011-2015). EC3 Reports, 17. Granada, 27 July 2016. https://www.google.es/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0ahUKEwj_yMSh_LfPAhWoDcAKHda_D7AQFggwMAQ&url=http%3A%2F%2Fwww.um.es%2Fdocuments%2F793464%2F4343909%2FINDICE%2BH%2BREVISTAS%2BESPA%25C3%2591OLAS%2B2011-2015.pdf%2Ffddc675c-92c3-4fbf-ba7d-c4059d63cf72&usg=AFQjCNEGLuDFKaiJ3YkIatz7xnm3n4yJTg&sig2=mEq2QM_m-EeYCedB77xwbA&cad=rja

work_vk2u6by5m5clrafwsq4sauxczm ---- Taking Our Pulse: The OCLC Research survey of special collections and archives

Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives

Jackie M. Dooley, Program Officer
Katherine Luce, Research Intern
OCLC Research

A publication of OCLC Research
http://www.oclc.org/research/publications/library/2010/2010-11.pdf
October 2010

© 2010 OCLC Online Computer Library Center, Inc. Reuse of this document is permitted as long as it is consistent with the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 (USA) license (CC-BY-NC-SA): http://creativecommons.org/licenses/by-nc-sa/3.0/.

Updates:
- 15 November 2010, p. 75: corrected percentage in final sentence.
- 17 November 2010, p. 2: added Creative Commons license statement.
- 28 January 2011: p. 25, penultimate para., line 3: deleted "or more" following "300%"; p. 26, final para., 5th line: changed 89 million to 90 million; p. 30, final para.: changed 2009-10 to 2010-11; p. 75, final para.: changed 400 to 80; p. 76, 2nd para.: corrected funding figures; p. 90, final line: changed 67% to 75%.

OCLC Research, Dublin, Ohio 43017 USA
www.oclc.org
ISBN: 1-55653-387-X (978-1-55653-387-7)
OCLC (WorldCat): 651793026

Please direct correspondence to: Jackie Dooley, Program Officer, jackie_dooley@oclc.org

Suggested citation: Dooley, Jackie M., and Katherine Luce. 2010. Taking our pulse: The OCLC Research survey of special collections and archives. Dublin, Ohio: OCLC Research. http://www.oclc.org/research/publications/library/2010/2010-11.pdf.

Contents

Executive Summary
Introduction
  Background
  Definition of Special Collections
  Project Objectives
  Survey Population
  Acknowledgements
1. Overview of Survey Data
  Overall library size and budget
  Collections
  User Services
  Cataloging and metadata
  Archival collections management
  Digital special collections
  Staffing
  Most challenging issues
2. Overviews of Membership Organizations
  Association of Research Libraries
  Canadian Academic and Research Libraries
  Independent Research Libraries Association
  Oberlin Group
  RLG Partnership
3. Conclusion and Recommendations
Appendix A. Survey Instrument
Appendix B. Responding institutions
Appendix C. Overview of Museum Data
Appendix D. Methodology
References

Tables

Table 0.1. Survey respondents
Table 0.2. Respondents by type of institution
Table 1.1. Printed volumes by membership organization
Table 1.2. Branch libraries reported
Table 1.3. Special collections size
Table 1.4. Impetus for establishing a new collecting area
Table 1.5. Acquisitions funding
Table 1.6. Onsite visits
Table 1.7. Presentations
Table 1.8. Catalog records
Table 1.9. Mean number of staff FTE
Table 1.10. Most challenging issues
Table 2.1. ARL overall library size
Table 2.2. ARL change in overall library funding
Table 2.3. ARL special collections size
Table 2.4. ARL acquisitions funding, 2010 and 1998
Table 2.5. Percentage of all survey holdings held by ARL libraries
Table 2.6. ARL onsite visits
Table 2.7. ARL presentations
Table 2.8. ARL catalog records
Table 2.9. ARL online catalog records (2010 and 1998)
Table 2.10. ARL archival finding aids (1998 and 2010)
Table 2.11. CARL overall library size
Table 2.12. CARL change in overall library funding
Table 2.13. CARL special collections size
Table 2.14. CARL onsite visits
Table 2.15. CARL presentations
Table 2.16. CARL catalog records
Table 2.17. IRLA overall library size
Table 2.18. IRLA change in overall library funding
Table 2.19. IRLA special collections size
Table 2.20. IRLA onsite visits
Table 2.21. IRLA presentations
Table 2.22. IRLA catalog records
Table 2.23. Oberlin overall library size
Table 2.24. Oberlin change in overall library funding
Table 2.25. Oberlin special collections size
Table 2.26. Range of Oberlin special collections sizes
Table 2.27. Oberlin onsite visits
Table 2.28. Oberlin presentations
Table 2.29. Oberlin catalog records
Table 2.30. RLG Partnership overall library size
Table 2.31. RLG Partnership change in overall library funding
Table 2.32. RLG Partnership special collections size
Table 2.33. Percentage of all survey holdings held by RLG Partnership libraries
Table 2.34. RLG Partnership onsite visits
Table 2.35. RLG Partnership presentations
Table 2.36. RLG Partnership catalog records

Figures

Figure 1.1. Printed volumes across overall population
Figure 1.2. Change in overall library funding
Figure 1.3. Changes in acquisitions funding
Figure 1.4. Cooperative collection development
Figure 1.5. Special collections in secondary storage
Figure 1.6. Preservation needs
Figure 1.7. Changes in level of use
Figure 1.8. Changes in use by format
Figure 1.9. Changes in users' methods of contact
Figure 1.10. Access to uncataloged/unprocessed materials
Figure 1.11. Interlibrary loan
Figure 1.12. Reasons to disallow use of digital cameras
Figure 1.13. Average charge for a digital scan
Figure 1.14. Internet access to finding aids
Figure 1.15. Web-based communication methods
Figure 1.16. Change in size of backlogs
Figure 1.17. Encoding of archival finding aids
Figure 1.18. Software for creating finding aids
Figure 1.19. Responsibility for records management
Figure 1.20. Digitization activity
Figure 1.21. Involvement in digitization projects
Figure 1.22. Large-scale digitization
Figure 1.23. Responsibility for born-digital archival materials
Figure 1.24. Born-digital archival materials already held
Figure 1.25. Impediments to born-digital management
Figure 1.26. Institutional repositories
Figure 1.27. Changes in staffing levels
Figure 1.28. Education and training needs
Figure 1.29. Demographic diversity
Figure 1.30. Integration of separate units

Executive Summary

Special collections and archives are increasingly seen as elements of distinction that serve to differentiate an academic or research library from its peers. In recognition of this, the Association of Research Libraries conducted a survey in 1998 (reported in Panitch 2001) that was transformative and led directly to many high-profile initiatives to "expose hidden collections." As this OCLC Research report reveals, however, much rare and unique material remains undiscoverable, and monetary resources are shrinking at the same time that user demand is growing. The balance sheet is both encouraging and sobering:

• The size of ARL collections has grown dramatically, up to 300% for some formats
• Use of all types of material has increased across the board
• Half of archival collections have no online presence
• While many backlogs have decreased, almost as many continue to grow
• User demand for digitized collections remains insatiable
• Management of born-digital archival materials is still in its infancy
• Staffing is generally stable, but has grown for digital services
• 75% of general library budgets have been reduced
• The current tough economy renders "business as usual" impossible

The top three "most challenging issues" in managing special collections were space (105 respondents), born-digital materials, and digitization. We updated ARL's survey instrument and extended the subject population to encompass the 275 libraries in the following five overlapping membership organizations:

• Association of Research Libraries (124 universities and others)
• Canadian Academic and Research Libraries (30 universities and others)
• Independent Research Libraries Association (19 private research libraries)
• Oberlin Group (80 liberal arts colleges)
• RLG Partnership, U.S. and Canadian members (85 research institutions)
The rate of response was 61% (169 responses).

Key Findings

A core goal of this research is to incite change to transform special collections, and we have threaded recommended actions throughout this section. We focused on issues that warrant shared action, but individual institutions could take immediate steps locally. Regardless, responsibility for accomplishing change must necessarily be distributed. All concerned must take ownership.

Assessment

A lack of established metrics limits collecting, analyzing, and comparing statistics across the special collections community. Norms for tracking and assessing user services, metadata creation, archival processing, digital production, and other activities are necessary for measuring institutions against community norms and for demonstrating locally that primary constituencies are being well served.

ACTION: Develop and promulgate metrics that enable standardized measurement of key aspects of special collections use and management.

Collections

ARL collections have grown dramatically since 1998, ranging from a 50% increase in the mean for printed volumes and archival collections to 300% for visual and moving-image materials.

Two thirds of respondents have special collections in secondary storage. As general print collections stabilize, such as through shared print initiatives and digital publication, a need for more stacks space for special collections will become all the more conspicuous. The arguments to justify it will have to be powerful.

The amount of born-digital archival material reported by respondents is minuscule relative to the extant content of permanent value: the mean collection size is 1.5 terabytes, the median
User services policies are evolving in positive ways: most institutions permit use of digital cameras and 90% allow access to materials in backlogs. More than one third send original printed volumes on interlibrary loan, while nearly half supply reproductions. Conservative vetting of requests may, however, result in unwarranted denial of all three types of access. ACTION: Develop and liberally implement exemplary policies to facilitate rather than inhibit access to and interlibrary loan of rare and unique materials. Cataloging and Metadata The extent to which materials appear in online catalogs varies widely by format: 85% of printed volumes, 50% of archival materials, 42% of maps, and 25% of visual materials are accessible online. Relative to ARL’s 1998 data, 12% more printed volumes have an online record, as do 15% more archival materials and 6% more maps. This limited progress may be Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives http://www.oclc.org/research/publications/library/2010/2010-11.pdf October 2010 Jackie M. Dooley and Katherine Luce, for OCLC Research Page 12 attributable in part to lack of sustainable, widely replicable methodologies to improve efficiencies. ACTION: Compile, disseminate, and adopt a slate of replicable, sustainable methodologies for cataloging and processing to facilitate exposure of materials that remain hidden and stop the growth of backlogs. ACTION: Develop shared capacities to create metadata for published materials such as maps and printed graphics for which cataloging resources appear to be scarce. On the other hand, great strides have been made with archival finding aids: 52% of ARL collection guides are now accessible online, up from 16% in 1998. Across the entire population the figure is 44%, which would increase to 74% if all extant finding aids available locally were converted. The other 26% reveals the archival processing backlogs that remain. ACTION: Convert legacy finding aids using affordable methodologies to enable Internet access. Resist the urge to upgrade or expand the data. Develop tools to facilitate conversion from local databases. Backlogs of printed volumes have decreased at more than half of institutions, while one fourth have increased. For materials in other formats, increases and decreases are roughly equal. Archival Collections Management The progress made in backlog reduction for archival materials is aided by the fact that 75% of respondents are using minimal-level processing techniques, either some or all of the time. Tools for creation of finding aids have not, however, been standardized; some institutions use four or more. The institutional archives reports to the library in 87% of institutions, while two thirds have responsibility for records management (of active business records). The challenges specific to these materials should therefore be core concerns of most libraries—and it is in this context that the impact of born-digital content is currently the most pervasive. Digitization Nearly all respondents have completed at least one special collections digitization project and/or have an active digitization program for special collections. One fourth have no active program, and the same number can undertake projects only with special funding. Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives http://www.oclc.org/research/publications/library/2010/2010-11.pdf October 2010 Jackie M. 
attributable in part to lack of sustainable, widely replicable methodologies to improve efficiencies.

ACTION: Compile, disseminate, and adopt a slate of replicable, sustainable methodologies for cataloging and processing to facilitate exposure of materials that remain hidden and stop the growth of backlogs.

ACTION: Develop shared capacities to create metadata for published materials such as maps and printed graphics for which cataloging resources appear to be scarce.

On the other hand, great strides have been made with archival finding aids: 52% of ARL collection guides are now accessible online, up from 16% in 1998. Across the entire population the figure is 44%, which would increase to 74% if all extant finding aids available locally were converted. The other 26% reveals the archival processing backlogs that remain.

ACTION: Convert legacy finding aids using affordable methodologies to enable Internet access. Resist the urge to upgrade or expand the data. Develop tools to facilitate conversion from local databases.

Backlogs of printed volumes have decreased at more than half of institutions, while one fourth have increased. For materials in other formats, increases and decreases are roughly equal.

Archival Collections Management

The progress made in backlog reduction for archival materials is aided by the fact that 75% of respondents are using minimal-level processing techniques, either some or all of the time. Tools for creation of finding aids have not, however, been standardized; some institutions use four or more.

The institutional archives reports to the library in 87% of institutions, while two thirds have responsibility for records management (of active business records). The challenges specific to these materials should therefore be core concerns of most libraries—and it is in this context that the impact of born-digital content is currently the most pervasive.

Digitization

Nearly all respondents have completed at least one special collections digitization project and/or have an active digitization program for special collections. One fourth have no active program, and the same number can undertake projects only with special funding.

More than one third state that they have done large-scale digitization of special collections, which we defined as a systematic effort to digitize complete collections—rather than being selective at the item level, as has been the norm—using production methods that are as streamlined as possible. Subsequent follow-up with respondents has revealed, however, that the quantities of material digitized and/or production levels achieved generally were not impressive or scalable.

ACTION: Develop models for large-scale digitization of special collections, including methodologies for selection of appropriate collections, security, safe handling, sustainable metadata creation, and ambitious productivity levels.

One quarter of responding institutions have licensing contracts with commercial vendors to digitize materials and sell access. It would be useful to learn more about the existing corpus of digitized materials, particularly rare books, some important collections of which are not available via open-access repositories.

ACTION: Determine the scope of the existing corpus of digitized rare books, differentiating those available as open access from those that are licensed. Identify the most important gaps and implement collaborative projects to complete the corpus.

Born-Digital Archival Materials

The data clearly reveal a widespread lack of basic infrastructure for collecting and managing born-digital materials: more than two thirds cited lack of funding as an impediment, while more than half noted lack of both expertise and time for planning. As a result, many institutions do not even know what they have, access and metadata are limited, only half of institutions have assigned responsibility for managing this content, few have collected more than a handful of formats, and virtually none have collected at scale. Clearly, this activity has yet to receive priority attention due to its cost and complexity. Community action could help break the logjam in several ways.

ACTION: Define the characteristics of born-digital materials that warrant their management as "special collections."

ACTION: Define a reasonable set of basic steps for initiating an institutional program for responsibly managing born-digital archival materials.

ACTION: Develop use cases and cost models for selection, management, and preservation of born-digital archival materials.

Staffing

The norm is no change in staff size except in technology and digital services, which increased at nearly half of institutions. Even though more than 60% of respondents reported increased use of collections, staffing decreased in public services more frequently (23%) than any other area. Across the population, 9% of permanent special collections staff are likely to retire within the next five years.

The areas most often mentioned in which education or training are needed to fulfill the institution's needs were born-digital materials (83%), information technology (65%), intellectual property (56%), and cataloging and metadata (51%).

ACTION: Confirm high-priority areas in which education and training opportunities are not adequate for particular segments of the professional community. Exert pressure on appropriate organizations to fill the gaps.
The gradual trend in recent decades toward integration of once-separate special collections continues; 20% of respondents have done this within the past decade. Multiple units continue to exist at one in four institutions.

Introduction

Background

In 1998 the Association of Research Libraries (ARL) conducted a survey of special collections in ARL libraries that provided an unprecedented view of 99 member libraries with regard to special collections access, use, preservation, organizational structure, budgets, and more (Panitch 2001). In 2007 the ARL Special Collections Working Group made the decision not to update the 1998 survey. This OCLC Research survey owes a debt of gratitude to ARL's transformative project, which served as both inspiration and model for our own work. The issues and questions raised in ARL's report gave us much food for thought.

Even before the survey report was published in 2001, ARL began considering a new program agenda to highlight for university library directors the role that special collections can play in bringing distinction and uniqueness to each of its member libraries, as well as to closely examine the purpose and significance of special collections. A series of conferences was launched to envision and debate the role of special collections within the academic research library, including how distinctive collections could be managed and promulgated to the greatest possible benefit of the academy. These activities, in concert with the data from the 1998 survey, led to various high-profile initiatives within ARL and beyond to "expose hidden collections"—that is, to enable online access to the massive quantities of rare and unique material unknown to the user community. Relevant reports and other documents are in the special collections section of the ARL Web site (ARL 2009).

Given the success of ARL's efforts to study and raise the profile of special collections, OCLC Research felt the time was right for a follow-up survey. We wanted to see how effective the efforts of the past decade have been, explore new issues that have emerged, and encompass a larger and more diverse population of academic and research libraries. We recognize that libraries find themselves in very different circumstances today than in 1998. One recent influential variable has been the decline in the global economy, which has deeply
The 1998 ARL report included a question that remains central: What are the most appropriate measures by which to evaluate and compare usage of special collections, and what are the most appropriate terms in which to convey the centrality of special collections to all levels of research and scholarship? (Panitch 2001, 9) Definition of Special Collections We defined special collections as library and archival materials in any format (e.g., rare books, manuscripts, photographs, institutional archives) that are generally characterized by their artifactual or monetary value, physical format, uniqueness or rarity, and/or an institutional commitment to long-term preservation and access. They generally are housed in a separate unit with specialized security and user services. Circulation of materials usually is restricted. The term “special collections" is used throughout this report to refer to all such materials. The definition is intended to exclude general collections characterized by format or subject specialization—such as published audiovisual materials or general library strength in Asian history—as well as materials managed as museum objects. Project Objectives We began with five objectives: • Obtain current data to identify changes across ARL libraries since 1998. • Expand ARL’s survey population to include four organizations for which no such survey had been conducted. • Enable institutions to place themselves in the context of norms across the community. • Provide data to support decision making and priority setting. • Make recommendations for action. Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives http://www.oclc.org/research/publications/library/2010/2010-11.pdf October 2010 Jackie M. Dooley and Katherine Luce, for OCLC Research Page 17 In designing the survey instrument, we identified several areas of high current interest that warranted significant attention: user services, archival collections management, and digital special collections. In order to enable longitudinal comparisons with ARL’s 1998 data, we retained many of their questions, including those that focused on basic measures such as collection size, type and number of users, status of online access, and number of staff. To keep the survey as lean as possible, we excluded some ARL questions relating to facilities, preservation management, organizational structure, and fundraising. All are significant and remain of interest, but we saw limited potential in these areas for actionable outcomes. Survey Population We received 169 responses (61%) out of the overall survey population of 275 institutions, which encompassed the membership of each of these five overlapping academic and research library organizations in the United States and Canada: • Association of Research Libraries (ARL) • Canadian Association of Research Libraries (CARL) • Independent Research Libraries Association (IRLA) • Oberlin Group • RLG Partnership (U.S. and Canada)1 Chapter Two consists of an overview of selected data for each of the five organizations. Appendix B lists the 169 respondents, first by organizational membership(s) and then by type of institution. Each institution was required to submit one unified response for all special collections units. 2 Slightly more than half of respondents (52%) are private institutions and 41% are public. 
Seven percent (7%) consider themselves "hybrid," with financial support coming from both private and public sources.3 Five respondents reported having no special collections.4

Table 0.1. Survey respondents (n=169)

                        All    ARL    CARL   IRLA   Oberlin   RLG
Population   Total      275    124     31     19      80       85
             Percent           45%    11%     7%     29%      31%
Respondents  Total      169     86     20     15      39       55
             Percent           51%    12%     9%     23%      33%

Note: Percentages total more than 100 due to some institutions' membership in two or three of the five organizations.

Two of the organizations consist principally of university libraries (ARL and CARL),5 one solely of private liberal arts college libraries (Oberlin), and another of independent research libraries not affiliated with academic institutions (IRLA). The RLG Partnership is heterogeneous in its membership. The memberships of the five groups overlap significantly, as detailed in the organizational profiles in Chapter Two.

Table 0.2. Respondents by type of institution (n=169)

                                 Number of responses   Percent of responses
Universities                            100                    59%
Colleges                                 32                    19%
Independent research libraries           13                     8%
Museums                                   8                     5%
Historical societies                      6                     3%
National institutions                     5                     3%
Governmental libraries                    2                     1%
Public libraries                          2                     1%
Consortium                                1                     1%
Total                                   169                   100%

Nine institutional types are represented among the 169 respondents. Universities and colleges predominate, followed by independent research libraries and museums. Because most of the universities are members of ARL and/or CARL, and because all college libraries are members of the Oberlin Group, the overviews of those groups in Chapter Two generally express the overall norms for these two institution types. The same is true for independent research libraries, given that most are IRLA members.

The eight museum respondents are members of the RLG Partnership. Because of their special nature relative to the rest of the population, we have summarized selected data for them in Appendix C.

The number of responses from each of the other five types of institution is too few to warrant characterization. Nevertheless, their data enhance our overall view of the practices of research libraries. Most are RLG Partners, seven are in ARL, and one is in CARL. Brief observations follow about each of these small cadres of respondents.

The six historical societies are all at the state level (rather than county or other jurisdiction). Three are RLG Partners (California, Minnesota, and New York), and three are IRLA members (Pennsylvania, New York, and Virginia). Only the Minnesota Historical Society is a government agency; the others are private institutions.

We defined "national institutions" as those for which the primary audience is the citizenry of a nation rather than affiliates of a particular institution, city or region, or government agency. The five national institutions that responded are in the U.S.: the Library of Congress, the National Archives and Records Administration, the Smithsonian Institution, the National Library of Medicine, and the National Agricultural Library. They vary greatly in mandate, size, and scope of collections and services.
The two public libraries are those in the cities of Boston (a public institution and ARL member) and New York (supported by both private and public funds, and a member of both ARL and the RLG Partnership).

We considered "governmental" those libraries that report to and serve a governmental entity and are not national in scope. The two in our population are the Library of Parliament (Canada) and the New York State Library.

The one consortium is the Center for Research Libraries, a member of ARL that holds no special collections.

Acknowledgements

Many colleagues contributed generously of their time and expertise throughout this project. First and foremost, the Association of Research Libraries endorsed our work after making a decision not to update its 1998 survey. Liaisons to the five membership organizations facilitated communications and offered advice about methodology: Jaia Barrett and Julia Blixrud (ARL), Tom Hickerson (CARL), Ellen Dunlap (IRLA), and Bob Kieft and Sherrie Bergman (Oberlin). Judith Panitch, author of the ARL survey report, remembered far more than she expected to about methodology and significant decisions, thereby smoothing our course. Alice Schreyer provided the valuable perspective of a founding member of ARL's special collections working group.

More than thirty colleagues from across the five organizations served as reviewers of two drafts of the survey instrument. Their comments, questions, and advice improved it greatly and ensured that it spoke to the needs of the entire population. We particularly thank Stephen Enniss, Bill Joyce, Richard Lindemann, Nina Nazionale, Sarah Pritchard, Michael Ryan, Elaine Smyth, Rob Spindler, and Laura Stalker for their incisive responses. Norman Reid and Marie-Louise Ayers offered thoughtful input that led us to realize that a single instrument would not meet the needs of RLG Partners beyond North America.

Nancy Elkington authored the profiles of the five membership organizations that open each section of Chapter Two. John Cole of the University of Calgary helped us understand some key differences between Canadian and U.S. archival practice. Finally, OCLC Research colleagues Ricky Erway, Constance Malpas, Dennis Massie, Jim Michalko, Merrilee Proffitt, Jennifer Schaffner, Karen Smith-Yoshimura, Günter Waibel, and Bruce Washburn contributed immeasurably to the substance and clarity of both the report and the action items.

Notes

1 We did not survey RLG Partners outside North America after preliminary testing revealed that significant differences in the survey instrument would be necessary in order to address the needs of those institutions.

2 In taking this approach, we followed the precedent set by ARL in 1998. We felt that permitting individual units within an institution to report separately would inappropriately skew the results by over-representing large institutions. We did not define "unit" in recognition of the fact that departments, areas of collecting focus, branch libraries, and other organizational units could be administratively and/or physically separate. Respondents made their own determinations.

3 Initial testing of the survey instrument revealed that some institutions feel strongly about their "hybrid" status, leading us to include this as an option. We did not, however, attempt to define it.
4 Those having no special collections are the California Digital Library, the Center for Research Libraries, the Leo T. Kissam Library of Fordham University School of Law, the Kimbell Art Museum, and the Université de Sherbrooke. The maximum number of responses is therefore 164 for questions other than those that identify the responding institutions.

5 ARL and CARL each have a small percentage of non-academic members, as detailed in Chapter Two.

1. Overview of Survey Data

Chapter One has eight sections, generally following the flow of the survey instrument:
• Overall library size and budget
• Collections
• User services
• Cataloging and metadata
• Archival collections management
• Digital special collections
• Staffing
• Most challenging issues

The rate of response generally was 95% or more for multiple-choice questions and noticeably less for some of the questions that required numerical data. The latter likely means that some institutions either do not record certain statistics or do not collect them in a manner that sufficiently matched the categories we provided.

Throughout this chapter, we intermittently pose questions that the data raised for us, some of which led to formulation of the action items. Many more could be asked, and we invite readers to do so. Full data will be published in a supplement to this report.

Overall Library Size and Budget

We asked two questions about the overall library to bring perspective to the situation of special collections within the broader institutional context: overall collection size and the effect of the current economy on funding.

Figure 1.1. Printed volumes across overall population (Q. 7, n=163)
[Pie chart: fewer than 1 million, 36.2%; 1–3 million, 26.4%; 3–6 million, 22.1%; more than 6 million, 14.7%]

The diverse nature of the survey population is reflected in the distribution of libraries by collection size. Most of the respondents that hold fewer than one million volumes are members of IRLA or Oberlin, or are non-academic institutions in the RLG Partnership. At the other end of the spectrum, all 24 institutions holding more than six million volumes are ARL members (some are also members of CARL and the RLG Partnership). One institution, the U.S. National Archives and Records Administration, does not collect printed volumes.

Table 1.1. Printed volumes by membership organization (Q. 7, n=163)

Volumes                 All   ARL   CARL   IRLA   Oberlin   RLG
None                      1    —     —      —       —        1
Fewer than 1 million     58    —     4     11      32       19
1–3 million              43   24    11      3       7        7
3–6 million              37   36     4      —       —        8
More than 6 million      24   24     2      1       —       16
Total                   163   84    21     15      39       51

This broad differential in library size is meaningful in the analysis of numerical data across the five organizations, particularly for collections, funding, users, and staffing. In general, means and medians for these differ greatly. The organizational profiles in Chapter Two highlight these variations.
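The gap between mean and median recurs throughout this report wherever collection sizes or budgets are skewed by a few very large institutions. As a minimal illustration (using made-up figures, not survey data), the following Python sketch shows how a single outlier pulls the mean far above the median:

```python
from statistics import mean, median

# Hypothetical printed-volume counts for seven libraries (not survey data).
# One very large research library skews the distribution.
holdings = [40_000, 75_000, 80_000, 95_000, 120_000, 250_000, 1_300_000]

print(f"mean:   {mean(holdings):>10,.0f}")    # 280,000: dominated by the outlier
print(f"median: {median(holdings):>10,.0f}")  # 95,000: the 'typical' library
```

This is why the report pairs means with medians: the median better characterizes the typical institution, while the mean reflects aggregate scale.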
On the other hand, the overall norms for the 39 multiple-choice questions, many of which focused on policy and practice issues, generally varied only negligibly across the survey population and were thereby revealed to be relatively independent of overall library size. Noticeable variations were more commonly based in the type of institution (university, independent research library, etc.).

Figure 1.2. Change in overall library funding (Q. 77, n=160)
[Bar chart: decreased 1–5%, 26%; decreased 6–10%, 24%; decreased 11–15%, 12%; decreased 16–20%, 5%; decreased more than 20%, 9%; no change, 16%; increased, 9%]

The data show that 75% of respondents saw their 2008–09 budgets drop as a result of the recent decline in the global economy. Endowments have fallen significantly in value, and governmental budgets have been severely reduced. The inevitable belt tightening is well underway. The data might be even more dire if gathered again for the 2010–11 year, during which many libraries are experiencing even deeper budget cuts.

Collections

In today's academic and research library context, special collections are increasingly seen as an element of distinction that serves to differentiate an institution from its peers. Many original primary source materials reside in special collections and serve both as basic fodder for scholarly work and as a source of inspiration to students and others who may be undertaking their first research project. The array of disparate formats—from rare books to photographs, from large archival collections to born-digital archival records—and the myriad methods of managing collections present daunting challenges for librarians and archivists who curate and interpret these rare and unique materials.

In this section we explore size of collections, changes in collecting foci, the extent and stability of acquisitions funding, cooperative collecting, offsite storage, and preservation.

We established a context for the level of completeness of the collections data by asking respondents to state how many separate special collections units exist across their institution and for which of these data was, and was not, being reported.1 Data was reported for 69% of the 568 special collections units enumerated by respondents. We feel comfortable asserting that the special collections materials held by the other 31% constitute less than 31% of the extant materials, given that every institution reported data for its principal unit, which is generally the largest. Nevertheless, the overall magnitude of special collections holdings across the survey population clearly is appreciably larger than that reflected by our data.

Several institutions commented on the difficulty of compiling statistics for multiple units across large universities. This was due to a variety of factors, including different methods of keeping statistics, variation in policies, and communication challenges. Some also noted the utility of having collaborated internally to prepare a combined response; this was particularly true for those contemplating future integration of separate units.

The mean number of units reported per institution was 3.6. This would be lower but for the four institutions that have 22 or more units (39% of institutions have only one unit).
The mean varied significantly across the five organizations.

Table 1.2. Branch libraries reported (Q. 8, n=161)

Type of Branch           Units Reported   Units Not Reported
Arts                           10                 8
Institutional archives         45                11
Law                             9                12
Medicine                        8                15
Museum                          5                 6
Music                           4                 4
Science                         0                 5
Total                          82                61

The nature of the units named illustrates the general character of absent data. Some types of branch library were fairly equally split in terms of being reported or not, with the exception of university archives (80% of those named were reported) and science libraries (none of the five named were reported). Many more institutional archives were "silently" reported as an integrated part of the primary (or sole) special collections unit within an institution.

Table 1.3. Special collections size (Q. 11, n=161)2

Format                                 n    n as Percent    Total Items     Mean        Median
                                            of Population   Reported
Printed volumes                       155        95%        30,000,000      191,000      80,000
Archival and manuscript collections   151        92%        3,000,000 lf    20,100 lf    10,300 lf
Manuscripts (managed as items)         61        37%        44,000,000      717,000      950
Cartographic materials                 90        55%        2,000,000       20,600       800
Visual materials (two-dimensional)    101        62%        90,000,000      880,000      171,000
Audio materials                        92        56%        3,000,000       32,400       2,300
Moving-image materials3                84        51%        700,000         8,300        700
Born-digital materials                 58        35%        85,000 GB       1,500 GB     90 GB
Microforms                             78        48%        1,300,000       17,300       3,000
Artifacts                              83        51%        154,000         1,850        500

Note: Archival and manuscript collections were counted in linear feet (lf) and born-digital materials in gigabytes (GB).

The mean collection size for every format varies dramatically from the corresponding median—as expected, since the survey population includes such a wide range of library types and sizes. The largest collection of printed volumes is more than 1.3 million; the smallest is 100. The largest archival holdings are more than 200,000 linear feet; the smallest are 25. Some institutions have exceptionally large holdings in particular formats (e.g., manuscripts managed as items or visual materials), which drives up the means.

Longitudinal comparison with ARL's 1998 data (detailed in Chapter Two) is revealing: mean increases over the ensuing decade ranged from 50% (printed volumes and archival collections) to 300% (visual, audio, and moving-image materials). It would be valuable to know whether such dramatic increases occurred for the other organizations in the survey population.

The number of respondents who provided data for each format varied significantly. Several factors are probably relevant: not all institutions have materials in all formats; many institutions manage special formats as part of archival and manuscript collections; and not all institutions record statistics for all formats.

Determining uniform metrics for counting special collections across institutions is not straightforward, given the multiple ways in which libraries manage these materials.4 To ensure that all materials in a specific format were slotted consistently into one broader category, we included a supplement to the survey instrument defining the scope of each format (see Appendix A.2).
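One practical difficulty behind these numbers is that different formats are counted in different units (items, linear feet, gigabytes), and a count is meaningless without its unit. A minimal sketch of a normalized holdings record that keeps the unit attached to each count (illustrative only; this is not part of the survey instrument):

```python
from dataclasses import dataclass

@dataclass
class FormatCount:
    """A holdings figure that cannot be separated from its unit of measure."""
    format_name: str
    count: float
    unit: str  # e.g., "items", "linear feet", "GB"

# Hypothetical report from one institution (not survey data).
holdings = [
    FormatCount("printed volumes", 80_000, "items"),
    FormatCount("archival and manuscript collections", 10_300, "linear feet"),
    FormatCount("born-digital materials", 90, "GB"),
]

# Aggregation across institutions is only valid within a single unit.
total_lf = sum(h.count for h in holdings if h.unit == "linear feet")
print(f"archival holdings: {total_lf:,.0f} linear feet")
```

Keeping the unit explicit is what makes cross-institutional totals such as those in Table 1.3 possible without mixing linear feet into item counts.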
We combined all archival and manuscript materials managed as collections into one category, including institutional records such as those held by university archives, because the two are managed inseparably in many institutions. We added a category for manuscripts managed as items, since some institutions acquire, count, and describe them in this way; in fact, in some multi-unit institutions, separate departments use different approaches. The low number of responses for manuscripts managed as items reflects the fact that many institutions manage all such materials as collections. Accepted metrics of any sort are lacking in this area, with the result that we can safely draw fewer conclusions from the data than would be optimal.

The lowest rate of response was for born-digital material (58 responses, or 35%). The data reported in the Digital Special Collections section reveal, however, that 79% reported having at least some holdings. This discrepancy may exist because some respondents are not yet actively managing their holdings. For example, some have not determined the number of gigabytes of material that they have acquired, and doing so is a challenge if content is dispersed across numerous physical media and/or file servers. It is striking that two institutions hold 51% of the 85,000 gigabytes of born-digital material reported overall, and thirteen hold 93%.5 In general, libraries' current holdings are a drop in the bucket of the archival content that warrants long-term preservation. It is evident that this activity is in its infancy and presents a difficult challenge.

Question 12, which was optional, enabled respondents to report items in specific formats that would otherwise be reflected only within the linear-foot count of archival and manuscript collections.6 Only 33 institutions responded, but their data reveal the tip of a metaphorical iceberg of photographs, recordings, moving-image formats, and other materials. For example, 28 institutions reported a total of 35 million visual items—adding nearly 40% to the 90 million visual items reported in question 11. This is powerful evidence of the extraordinary quantities of non-textual materials contained within archival collections.

Some questions:
• Is the dramatic growth of ARL special collections since 1998 necessarily a good thing?
• Do such growth rates extend across the entire survey population?
• Is such growth sustainable? If not, what should change?

New collecting areas have been established by 61% of respondents since 2000. Some named a single new area of interest, while others listed ten or more. The hundreds of topics described are too varied to characterize usefully in any detail (see the data supplement). Documentation of contemporary social and political issues is widespread, including racial and ethnic groups, gender issues, the environment, the media, and human rights. The specific area most frequently named, however, was artists' books (ten respondents), a contemporary genre that combines characteristics of artistic creation and traditional book arts.
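Tables such as Table 1.4, which follows, are produced by coding free-text survey answers into a small set of categories and counting them. A minimal sketch of that tally step, with hypothetical coded responses rather than the survey's actual data pipeline:

```python
from collections import Counter

# Hypothetical coded answers to "What prompted the new collecting area?"
coded_responses = [
    "gift", "faculty suggestion", "gift", "new institutional direction",
    "curator's decision", "gift", "administrator's decision",
]

tally = Counter(coded_responses)
for reason, n in tally.most_common():
    print(f"{reason:<28} {n}")
```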
Table 1.4. Impetus for establishing a new collecting area (Q. 13, n=164)

Reason                        Number of institutions
Gift                                  38
New institutional direction           28
Faculty suggestion                    22
Curator's decision                    15
Administrator's decision               5

Respondents cited five typical reasons for establishing a new collecting emphasis: gift, new institutional direction, faculty suggestion, curator's decision, or administrator's decision. Receipt of a gift was the most common reason given, which serves as a reminder that collectors and other donors continue to enhance library collections; many of the greatest special collections have been built largely through the generosity of private donors. It can be difficult, however, to refuse an attractive gift—particularly if it has substantial monetary value or is offered by a valued institutional supporter—even if it does not directly support institutional needs. Some respondents noted that they were unlikely to add to a particular donated collection. This can sometimes hint at gift materials that could become an expensive burden in the future—a problem that plagues some special collections as a legacy of earlier days, when collection development practices often were more expansive than is now practical.

Thirty-four respondents (21%) described collecting areas for which acquisition of new materials has been discontinued. Not all explained their decisions, but several reasons were cited by more than one respondent: transfer to general collections, a topic better collected by another institution, lack of space, insufficient funding, and tighter collection development policies. The discontinued topical areas were too diverse for detection of any pattern.

Thirty-three respondents (20%) reported having physically withdrawn collections, most of which were archival materials. While the topics of the collections revealed no particular pattern, six reasons for deaccessioning surfaced more than once:
• Transferred to a more appropriate institution (13 respondents)
• Returned to donor, usually at donor's request (5)
• Transferred publications and microforms from special to general collections (4)
• Reunited split collections (3)
• Reappraised research value of materials acquired long ago but never processed (2)
• Withdrew originals lacking value as artifacts following digitization (2)

One respondent mentioned space, and two stated that the materials were out of scope. It is likely that one or both of those factors are implicit in much deaccessioning. Transfer to another institution where a collection will be welcomed as valuable and within scope reflects a value generally held by archivists: they strive to avoid acquiring a collection when closely related materials, such as another part of the papers of an individual, are already in another institution. Re-uniting split collections might be thought of as retroactive collaborative collection development.

Withdrawal following digitization may be another area worthy of investigation. While materials that have special features as original artifacts generally are not considered candidates for withdrawal, much material in contemporary collections lacks any such characteristics. For example, one deaccessioned collection consisted of photographic slides; many archivists would find such withdrawal inappropriate, since original photographs have higher resolution that may add to their information value.
In contrast, the other post-digitization withdrawal consisted of routine business records in a university archives.

Table 1.5. Acquisitions funding (Q. 75, n=140; Q. 76, n=132)7

                          All        ARL        CARL       IRLA       Oberlin   RLG
n                         105        59         14         10         25        33
Institutional funds
  Mean                    $130,000   $170,000   $44,200    $347,100   $18,600   $288,800
  Median                  $43,700    $60,000    $14,600    $100,000   $5,400    $71,200
Special funds
  Mean                    $215,400   $318,000   $248,800   $474,300   $34,100   $435,200
  Median                  $54,300    $140,000   $88,400    $66,500    $12,600   $197,000
Combined institutional + special funds
  Mean                    $273,100   $417,000   $174,000   $652,000   $37,100   $639,600
  Median                  $83,000    $182,600   $37,400    $142,000   $12,500   $254,800

Fifty-seven percent (57%) of printed volumes were purchased, using nearly equal percentages of institutional (29%) and special funds (28%). In contrast, only 18% of materials in other formats were purchased. Twenty-three percent (23%) of respondents acquired 100% of non-print materials as gifts or transfers, and another third acquired more than 90% of materials in this way. These statistics do not suggest that most archival materials are unsolicited gifts—archivists very actively pursue collection donations in their areas of emphasis. They do, however, signal that many archival collections have little or no monetary value, irrespective of the strength of their research value. Institutional records (such as those found in a university or governmental archives) are, by definition, not offered for sale.

We asked respondents to differentiate two types of funding: institutional and "special." The latter encompassed endowments, gifts, grants, and any other funding sources beyond the institutional budget.8 The data show that 38% of collections funds are institutional and 62% are special across the overall population.

The large gaps between mean and median budgets, both across the overall population and within each membership organization, are a reflection of the diversity of institutional sizes and types. IRLA members have the highest mean acquisitions budgets in both institutional and special funds. These libraries generally consist solely of special collections, which therefore need not compete with general collections for purchasing priority. In academic libraries, on the other hand, special collections necessarily receive a tiny percentage of the overall budget from institutional funds. In contrast with the high mean for IRLA libraries, however, the IRLA median is well below those of ARL and RLG Partnership libraries.

The RLG Partnership has the second highest mean and median budgets in both institutional and special funds. The Partnership includes thirteen IRLA libraries and a number of the largest ARLs, which contributes to this outcome.

Figure 1.3. Changes in acquisitions funding (Q. 77, n=161)
[Pie chart: more funding in 2008, 48%; no change, 22%; less funding in 2008, 24%; not sure, 6%]

Nearly half of respondents reported having more acquisitions funding in 2008–09 than in 2000, while 24% reported having less. Increased funding helps account for the dramatic increases in collection size described earlier, particularly for purchased printed volumes.
In fact, the overall mean and median for ARL acquisitions funding were a remarkable three to four times higher than reported in the 1998 survey. This stands in stark contrast to general library trends. The survey data for acquisitions funding might look very different if gathered again for the 2010–11 year, during which many libraries are seeing even deeper budget cuts than those reflected by our data.

Figure 1.4. Cooperative collection development (Q. 20, n=163)

Partner type                          None   Informal   Formal
Local/regional institutions            45%      50%       5%
Members of your consortium             74%      21%       6%
Other institutions in your nation      82%      16%       2%
Institutions in other nations          93%       6%       1%

Most cooperative collection development arrangements reported are informal and with local or regional partners; 50% of respondents have such arrangements. Formal collaborations, on the other hand, are rare: 5% collaborate formally with local or regional partners, 6% with consortial partners, and 2% with national partners. Even fewer international collaborations were reported. In her report of ARL's 1998 survey, Panitch (2001, 48) cited a similarly low percentage (11%) of formal arrangements. While our population is different, it appears that this is an area in which little progress has been made.

Some questions:
• Are meaningful collecting collaborations feasible for special collections?
• How would an effective formal collaboration be defined?
• Are special collections librarians sufficiently familiar with the techniques used in collaborations focused on general materials? Would those techniques be relevant?
• How will the gradual shift to "shared print" for general collections affect special collections?

Figure 1.5. Special collections in secondary storage (Q. 21, n=163)
[Pie chart: yes, 67%; no, 28%; in planning stages, 5%]

Special collections materials are housed in offsite or other remote storage at two-thirds of responding institutions—and space was by far the most frequently cited "most challenging issue" in response to question 79. The data described earlier that reveal enormous growth of ARL special collections carry a corollary implication: space needs for special collections will continue to grow, perhaps dramatically. As print general collections stabilize, a need for more stacks space for special collections will become all the more conspicuous. The arguments to justify it will have to be powerful.

Some questions:
• If special collections growth continues at a strong pace, will institutions be able to satisfy the ensuing need for more shelf space?
• Will libraries have to become more cautious about acquiring large archival collections and/or weed them more aggressively during processing?
• Will deaccessioning of general print collections that are available digitally become the norm and free existing space for growth of special collections?
• Will deaccessioning or transfer of special collections materials of minimal or out-of-scope research value become more common and accepted?
Figure 1.6. Preservation needs (Q. 22, n=163)

Level of need   Printed volumes   Archives and manuscripts   Visual materials   Audiovisual materials
High                  17%                  20%                     30%                  61%
Medium                46%                  40%                     38%                  17%
Low                   32%                  36%                     24%                   6%
No problems            2%                   2%                      2%                   3%

We asked respondents to characterize the relative extent of preservation needs across their special collections. We chose a non-numerical approach in the belief that few institutions collect data on the percentage of materials by format that have particular levels of preservation need.

A majority of institutions ranked the preservation needs of visual and audiovisual materials much higher than those of other materials. This reflects the inherent instability of materials such as photographic prints and negatives; audio recordings on analog media such as wax cylinder, reel-to-reel tape, and cassette tape; and moving images recorded on either film or video. Such materials inherently must be duplicated if they are to survive more than a few decades, and best practices dictate that use copies be made in order not to further threaten the stability of originals. A high percentage of these materials in special collections is archival in nature, and the content may therefore exist only as one unique original. If that original deteriorates beyond recovery, its content will be lost forever.

These formats present costly preservation needs for which funding rarely is sufficient. Given economic realities, this situation is unlikely to improve in the foreseeable future. Stringent appraisal and prioritization—particularly if done collaboratively—would help ensure that scarce preservation resources are dedicated to the most important content. For some collections, transfer to another institution at which the content would merit high preservation priority may be the best solution.

Some questions:
• Can means of collaboration be developed to achieve cost-effective preservation of the highest-priority audiovisual materials?
• What should an institution do if it holds material of high importance that it is unlikely to be able to preserve before major deterioration occurs?
• Should we recognize that much analog audiovisual material simply will not survive?

User Services

User services are particularly rich in issues of current interest, including levels of use, effective communication with users, accessibility of materials in backlogs, cost-effective delivery of both originals and reproductions, and new methods of outreach to foster widespread and meaningful use. We highlight the difficulty of obtaining consistent statistics across responding institutions multiple times in this section. Panitch (2001, 61) called out the same issue in her 1998 ARL report when she noted the lack of appropriate measures by which to evaluate and compare usage.

Nearly 575,000 visits were made to the special collections and archives units of the 140 responding institutions in 2008–09. This finding demonstrates that many rare and unique materials are serving their purpose.
Both library directors and special collections librarians may want to evaluate dispassionately, however, to what extent this level of activity justifies the resources being expended, as well as what additional programmatic metrics add strength to the special collections value proposition.

Table 1.6. Onsite visits (Q. 24, n=140)

                         n     Onsite Visits   Percentage of   Mean    Median
                                               Total Visits
Faculty and staff        92        52,523           9%           571      139
Graduate students        49        28,847           5%           589      184
Undergraduates           81        69,773          12%           861      456
Visiting researchers     89       138,352          24%         1,555      146
Local community          58        38,298           7%           660      211
Other                    74       245,839          43%         3,300      660
Total                             573,632         100%         4,218    1,571

We asked respondents to report the number of onsite visits to special collections rather than individual users or all user contacts. We had several reasons for this approach: statistics for onsite visits commonly exist, the number of visits best reflects reading room workloads, special collections libraries remain very interested in onsite use of original materials, and inclusion of off-site users would make it difficult to distinguish between reference transactions and use of materials. This said, it would be valuable also to have data about all reference transactions, as well as virtual use of digitized collections.

Twenty-four percent (24%) of respondents reported all user visits as "other" rather than using any of the categories provided. In fact, "other" visits comprise 43% of the overall total reported. Respondents who commented revealed two reasons: either their local categories did not sufficiently mesh with those we used, or they routinely tabulate only one aggregate number. We had refined our categories based on feedback from reviewers of the draft survey and therefore knew that we could not satisfy all needs. We learned, for example, that some IRLA libraries would report all users as "visiting scholars and researchers" because their own categories are too granular to be appropriate across our broader population.

These results convey how difficult it is to evaluate data usefully without standard metrics in use across the special collections community. More granular comparisons should be feasible, at minimum, across relatively homogeneous populations such as universities or colleges. We cannot demonstrate the level of value delivered to primary constituencies unless we can reliably characterize our users.

Figure 1.7. Changes in level of use (Q. 25, n=162)

                                      Decreased   No change   Increased   Not sure   Category not used
Faculty and staff                          8          21          95          6            25
Graduate students                          4          17          66          4            64
Undergraduates                             5          13         101          3            33
Visiting scholars and researchers          8          24         103          5            19
Local community                            7          23          75         10            40
Other                                      7          10          53         11            44

The percentage of respondents who reported increased use of collections is dramatically higher than the percentage who reported no change or decreased use. Depending on the user category, 43% to 65% of respondents reported increased use; in contrast, only 3% to 6% reported decreased use in any category.
Use by faculty and staff, undergraduates, and visiting scholars and researchers increased at more than 60% of responding institutions.9 These results may be traceable both to the high priority that many special collections librarians and archivists place on education and outreach activities and to the discoverability of increasing quantities of material.10

Some questions:
• Does the level of onsite use of special collections justify the resources being expended?
• What are the most appropriate measures by which to evaluate use?
• What additional values can we ascribe to special collections to convey their importance for all levels of study, scholarship, research, and the role of the library overall?

Figure 1.8. Changes in use by format (Q. 27, n=161)

                           Increased   No change   Decreased   No materials of this type
Pre-1801 books                 77          43          10                 3
Post-1800 books                91          37           4                 2
Archives and manuscripts      140           9           2                 0
Visual materials              121          15           1                 5
Audiovisual materials          93          27           1                15
Born-digital materials         72          13           0                48

Increased use of materials in all formats is the norm across the survey population. Depending on the format, increased use was reported by 45% to 88% of respondents. The most dramatic increases were for archives and manuscripts (88%) and visual materials (76%).

Thirty percent (30%) of respondents to question 27 reported that they have no born-digital materials. This is one third more than the 21% who gave the same answer in response to question 62 on current holdings. As mentioned earlier, this discrepancy may reflect that few institutions are actively managing their born-digital materials and therefore do not yet have an accurate sense of their holdings. Also, it is likely that no born-digital materials are yet available for public use in some libraries.

Figure 1.9. Changes in users' methods of contact (Q. 26, n=164)

                             Increased   No change   Decreased   Method not used
Onsite                          100          26          25              0
E-mail                          159           2           1              1
Web site comment feature         78           2           3             72
Interactive chat reference       22           4           0            129
Telephone                        45          50          50              0
Mail                             13          22         119              0

Sixty-two percent (62%) of respondents noted an increase in onsite use over the past decade. It is no surprise that e-mail transactions increased, while telephone and mail decreased. Of all the methods of contact listed as response options, the one used by the fewest respondents is interactive chat reference (18%). An ARL report published in 2008 sets this finding in the broader library context: Social Software in Libraries showed that 94% of ARL respondents (64 members) offered central interactive reference services, which perhaps suggests that only a small minority of special collections units participate in a service that their parent library has implemented (Bejune and Ronan 2008).
Figure 1.10. Access to uncataloged/unprocessed materials (Q. 28, n=164)

                            Printed   Archives and   Visual      Audiovisual   Born-digital
                            volumes   manuscripts    materials   materials     materials
Yes                           135         148           144          124            71
No                             19          14            12           23            34
No materials of this type       2           0             6           13            48

Materials that lack online metadata are effectively "hidden." It is therefore encouraging that 90% of respondents permit use of uncataloged and/or unprocessed materials, at least selectively.11 But how selectively? The results of one recent user study showed that 50% of nearly 500 respondents had been denied use of a collection, though 63% had used one or more (Greene 2010).

As general practice, special collections staff review requests for use of unprocessed materials and then make a decision based on a variety of considerations. The principal reasons stated for disallowing use (question 29) of unprocessed archival materials are readily understandable: a collection may have been acquired in such disorder that use is virtually impossible; lack of physical processing may mean that handling would endanger fragile materials; access copies may not yet exist for unstable or fragile originals; or items that must be restricted for reasons of privacy and confidentiality may not yet have been identified and isolated.12

The rationale is not as self-evident, however, for withholding materials from use because catalog records or finding aids are incomplete or below standards. Ten institutions do not permit printed volumes to be used for this reason, and fifteen withhold books due to concerns about security. We know anecdotally that the lack of copy-specific notes for unambiguous identification of particular copies is a reason sometimes given, but practitioners may want to consider whether a properly supervised and secure reading room sufficiently mitigates concerns about potential theft. Or are there other reasons that legitimately justify withholding books from use?

Figure 1.11. Interlibrary loan (Q. 30, n=163)
[Bar chart: yes, printed volumes, 38%; yes, materials in other formats, 18%; yes, but only within our consortium, 4%; yes, but only reproductions/copies, 44%; no, 31%]

More respondents (44%) loan reproductions of special collections items than original materials: 38% loan original printed volumes and 18% loan materials in other formats. Loan of rare and unique material is fraught with legitimate risks to security and safe handling; nevertheless, the special collections community would earn political capital by developing—and generously implementing—best practices to facilitate more widespread participation in resource sharing. The ensuing benefits for scholars and students for whom travel is not possible are obvious. The current emphasis on exploring "shared print" initiatives across the research library community bolsters this imperative.13

Some questions:
• How selective is approval of requests for use of unprocessed collections and/or interlibrary loan?
• Are instances of non-approval generally reasonable, or are decision makers overly cautious?
• Do libraries have policies for justifying non-approval, or are decisions often ad hoc?

Figure 1.12. Reasons to disallow use of digital cameras (Q. 33, n=27)
[Bar chart: copyright/inappropriate use, 70%; potential loss of revenue, 41%; improper handling of materials, 63%; reading room disruption, 48%; existing reproduction services are sufficient, 59%; other, 48%]

Many of today's special collections users would prefer to use personal digital cameras rather than place orders for reproductions for later fulfillment by the library. Because providing reproductions is a key service in most special collections reading rooms, enabling use of cameras increases user convenience and lessens staff intervention. It is therefore good news that 87% of respondents permit users to employ digital cameras.14 Enabling this service has been controversial within the special collections community, but user convenience clearly is taking precedence.

The reasons most often stated for not permitting digital cameras include perceived potential for inappropriate re-use (generally meaning copyright infringement), damage to fragile materials, and disruption to a quiet reading room environment. Inappropriate re-use was the concern most frequently reported.15 It is debatable, however, whether this is actually a significant risk, given that most libraries and archives have long provided publication-quality photographs for sale with little or no ill effect. Standard practice mitigates the risk of misuse by requiring the user's signature on a permission form accepting responsibility for honoring copyrights, and this practice remains the norm in the digital context.16

Figure 1.13. Average charge for a digital scan (Q. 34, n=164)
[Pie chart: we provide scans at no charge, 11.6%; $0–$5, 28.7%; $5.01–$10, 20.1%; $10.01–$20, 22.6%; more than $20, 15.9%; we do not offer this service, 1.2%]

Users who need publication-quality reproductions or cannot consult materials on site often order digital scans made by library staff. Forty percent (40%) of respondents charge an average of $10 or less, including the 12% that provide scans at no cost.17

Two further outcomes are sometimes desirable once a scan of a collection item has been made for a user: 1) avoid rescanning the item when repeat requests are received, and 2) make the image publicly available online after a copy has been delivered to the user. Our data indicate that 96% of respondents retain scans made by and/or for users for potential inclusion in a digital library (36% always, 59% sometimes). We did not ask about the status of deployment to a public site, but anecdotal evidence suggests that many institutions have large "backlogs" of digital files that are not yet discoverable; this may be because the files are not yet actively managed internally (e.g., not stored on a dedicated server, no metadata) or because the library does not yet have the technical infrastructure for making digital content available to users.
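Avoiding rescans and preparing files for later public release both come down to recording minimal metadata at the moment of capture. A sketch of such a record follows; the field names and example values are illustrative assumptions, not a standard or the practice of any surveyed institution:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ScanRecord:
    """Minimal metadata captured when a scan is made for a user request."""
    file_name: str            # e.g., a structured name tying the file to its source
    collection_id: str        # identifier of the source collection
    source_description: str   # box/folder or item-level description
    scan_date: date
    rights_reviewed: bool = False  # cleared for public online release?

    def ready_for_digital_library(self) -> bool:
        # A file should be published only after rights have been reviewed.
        return self.rights_reviewed

# Hypothetical example: a scan retained for possible later publication.
rec = ScanRecord("ms0042_b03_f17_001.tif", "ms0042",
                 "Box 3, folder 17, letter of 1891", date(2010, 6, 1))
print(rec.ready_for_digital_library())  # False until rights are reviewed
```

Even this much structure distinguishes an actively managed file from the undiscoverable "backlog" files described above.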
Figure 1.14. Internet access to finding aids (Q. 36, n=164)
[Bar chart: local Web site, 84%; Internet search engines, 76%; ArchiveGrid, 30%; Archive Finder, 14%; consortial finding aid database or catalog, 42%; not Internet-accessible, 7%]

Most respondents make their archival finding aids accessible on the Internet, both on a local Web site (84%) and via a Web server (76%) that can be crawled by search engines such as Google. In addition, nearly half (42%) contribute to a consortial database, and 30% contribute to ArchiveGrid, which is the largest aggregation of finding aids in existence.18 All told, these multiple avenues expand users' opportunities to discover unique primary research materials.

Figure 1.15. Web-based communication methods (Q. 37, n=162)
[Bar chart of ten methods, each categorized as "using now," "will implement within a year," or "no current plans to implement"]

We explored the extent to which respondents have implemented "Web 2.0" social media for outreach or feedback. Half of respondents have implemented an institutional blog, and 40% have a social networking presence such as a Facebook page. Anecdotal evidence suggests that linking from Wikipedia articles to a library's Web site (used by 38% of respondents) can draw measurable use of archival collections.19 Visual and audiovisual materials are posted to Flickr (31%) and YouTube (25%) and disseminated via podcasts (26%); the popular appeal of visual content may cause these percentages to rise over time. On the other hand, the majority have no current plans to implement any Web 2.0 methods other than blogs.

Table 1.7. Presentations (Q. 38, n=154)

                                   n     Number of       Percent    Mean   Median
                                         Presentations   of Total
College/university courses        143       8,366          52%        59      28
Others affiliated with
responding institution            127       2,634          16%        21       6
Local community                   125       3,358          21%        27       8
Other visitors                    108       1,749          11%        16       5
Total                                      16,107         100%       105      47

A library's capacity to make presentations is inherently limited by the size of its staff, and the organization-specific tables in Chapter Two reflect wide variation in the means and medians of the five organizations. The RLG Partnership mean is 194, IRLA's is 164, and ARL's is 156; the means for CARL and Oberlin libraries are far lower, as are their mean numbers of staff. RLG Partnership libraries also have the highest mean (103) for presentations to college and university courses. The ARL mean is 91, and that for IRLA is 72. The mean number of presentations for ARL libraries has increased by two-thirds since 1998.
Given the strong emphasis placed on instructional use of special collections, it would be interesting to know whether such increases have occurred across the other organizations as well.

Some questions:
• To what extent do presentations of various types result in use of collections?
• To what extent do presentations to primary user groups such as students improve the quality of the work they produce?
• To what extent do presentations and instruction sessions given by non-special collections staff add to measures of overall impact?
• To what extent do public presentations to non-users add to the overall value delivered by special collections?

More than one third (37%) of respondents have a fellowship or grant program to enable on-site user visits—a major aid to scholars, especially in an era of decreased funding for research travel. Such programs are far more common at private institutions than public, and above all at independent research libraries that do not have a permanently affiliated user group.

Cataloging and Metadata

The question that looms largest for many readers of this report may be: To what extent have we succeeded in "exposing hidden collections" in the decade since ARL's benchmark survey in 1998? The short answer: far from enough. Some progress has been made, but vast quantities of special collections material are not yet discoverable online. (See the ARL section of Chapter Two for comparison with the 1998 data.) In this section we examine the extent to which special collections materials in all formats have online access.

Table 1.8. Catalog records (Q. 41-47)20

Format                    n     Online   Offline   No Records   Described within
                                                                Archival Collections
Printed volumes          154     85%       7%         8%              n/a
Archival collections     153     56%      14%        30%              n/a
Manuscripts (items)       96     51%      23%        26%              n/a
Cartographic materials   129     42%      16%        23%              24%
Visual materials         136     21%      13%        35%              35%
Audiovisual materials    128     25%       7%        36%              36%
Born-digital materials    89     29%       1%        34%              40%

The current state of online catalog records can be summarized briefly:
• Printed volumes: 15% are not in online catalogs.
• Archives and manuscripts: 44% are not in online catalogs.
• Cartographic materials: 58% are not in online catalogs.
• Visual and audiovisual materials: Barely 25% were reported as having records in online catalogs. Because 35% are managed within archival collections, however, more may be accessible at the collection level.
• Born-digital materials: 71% are not in online catalogs, but more of these materials (40%) are managed within archival collections than any other format.

We did not ask respondents to distinguish between full and less-than-full catalog records. Online data therefore could be at any level of detail, from skeletal to highly detailed. This can include brief records made at the time of acquisition, which many libraries later expand upon for special collections materials. Detailed records have justifiable value when descriptive information and access points beyond the norms of general cataloging practice reveal special characteristics of rare and unique materials.
On the other hand, detailed editing of existing cataloging copy may not always be justified; community consensus about appropriate circumstances for streamlining would be valuable.

We asked whether non-print materials were cataloged within archival and manuscript collections, since standard practice is that only a collection-level record is then made in lieu of an individual record. Depending on the format, 24% to 40% of non-print materials are managed within collections.21

Some questions:
• Under what circumstances can detailed item-level cataloging be justified?
• Should more non-print materials such as maps and published visual materials be managed archivally to enable collection-level rather than item-level cataloging?

An Internet-accessible finding aid exists for 44% of archival collections. This percentage would rise to 74% if the 30% of finding aids that are "hidden"—i.e., those available only locally—were converted for Internet accessibility. Much retrospective conversion already has been done, particularly as institutions have implemented Encoded Archival Description (discussed in the Archival Collections Management section). It therefore seems possible that those not yet converted are the furthest from meeting contemporary standards and may present various challenges; for example, the physical arrangement of the corresponding collection may no longer match the finding aid, or the structure and content of the data may be far below current standards. Nevertheless, imperfect metadata is preferable to none at all. It is the rare potential user who does not want to know, above all, that materials exist, and where they are located.

Some questions:
• Are the reasons not to convert finding aid data more powerful than the reasons to do so?
• If a finding aid's structure is problematic for conversion to EAD, would cost-effective conversion to a format such as PDF be preferable to no online access?
• Where legacy finding aids exist in quantity, have libraries given sufficient priority to conversion, as they did for library card catalogs in decades past?

Figure 1.16. Change in size of backlogs (Q. 49, n=161)

                   Increased   No change   Decreased
Printed volumes        39          22          94
Other formats          65          17          70

It is encouraging that 59% of respondents reported a decrease in their backlogs of printed volumes since 2000, and 44% reported decreased backlogs for materials in other formats. On the other hand, backlogs increased at 25% and 41% of institutions, respectively. Efficient cataloging and processing methodologies may be in use, yet challenges clearly remain in balancing collection growth with the need for backlog reduction.

We did not gather data about the actual size of backlogs, but another research team surveyed rare book catalogers in 2010 regarding backlog size, awareness of the discourse about "hidden collections," and any changes in rare book cataloging practices in response. Their data indicate that 72% of respondents believe their efforts have been "successful," and 65% believe their approach is sustainable for preventing further backlog growth (Myers 2010). More such research would help us better understand the reasons behind the rise and fall of special collections backlogs.
Some questions:
• Why are so many backlogs continuing to increase?
• Why hasn’t the increased emphasis on sustainable metadata methodologies had more payoff?

Archival Collections Management

In recent years, archival and manuscript materials have earned a much higher profile within libraries and across the teaching and research communities, due at least in part to promulgation of Internet-accessible finding aids and increased visibility of collections via digital libraries. Relevant issues are discussed throughout this report.22 This section presents several topics that pertain only to archival materials.

For purposes of this survey, we defined archival and manuscript collections as materials in any format that are managed as collections, including those within institutional archives (see Appendix A.2). In contrast, we defined “manuscripts” as textual materials managed and cataloged at the item level.23 Throughout this section, the phrase “archival materials” is used to encompass all of these.

In 2005, Mark Greene and Dennis Meissner published the seminal article “More product, less process,” which proved catalytic in raising archivists’ consciousness of the need to reduce the vast backlogs languishing in libraries and archives.24 To address this dilemma, “MPLP” (the acronym by which the article has become known) articulates the steps in processing that can most productively be eliminated in order to improve efficiencies, emphasizing a continuum of possible approaches to processing based on the nature and expected use of particular materials.

The Greene/Meissner recommendations have been controversial, in part because less detailed processing can lead to difficulty using collections that have not been physically arranged to facilitate research, while less granular finding aids can reduce discoverability. Given these factors, public services staff in some institutions are experiencing increased workloads that may offset savings in processing time (Greene 2010).

The data show that 75% of respondents use an MPLP-style approach, either sometimes (57%) or always (18%). It is likely, however, that one respondent’s practice is more generally true: “While we apply MPLP to all processing, that does not mean that every collection is minimally processed.” In other words, “applying” MPLP can sometimes result in a decision that the materials warrant full processing.

When correlated with the responses to question 49 regarding changes in backlog size—44% of non-print backlogs have decreased—we can posit that widespread adoption of simplified methods has had positive results in terms of exposing hidden collections.

Some questions:
• To what extent is use of MPLP-style simplified processing responsible for decreases in archival backlogs?
• Can we establish processing metrics across the archival community?
• Are the increased public service challenges that can arise in using minimally processed collections being cited, where appropriate, to justify more detailed processing?

Figure 1.17. Encoding of archival finding aids (Q. 52, n=162)
[Bar chart: EAD 69%; HTML 36%; no encoding scheme used 24%; other 8%]

Encoded Archival Description, first released in 1998, is the first standard to define the data elements used in archival finding aids and the relationships among them.25 EAD has led to improved standardization of finding aids in structure and appearance, easier migration of data across platforms, and design of user interfaces that are both navigable and flexible. (A minimal structural sketch of an EAD finding aid appears at the end of this section.) Our data reveal that 69% of respondents use EAD. As with minimal processing, implementation has met with resistance in some quarters: staff must be trained, software evaluated and implemented, workflow re-examined, and a public interface designed. It is generally accepted across the library community, however, that the benefits of using standards justify the affiliated expenses.26

Figure 1.18. Software for creating finding aids (Q. 53, n=160)
[Bar chart: word processing 73%; database 56%; Archon 11%; Archivists’ Toolkit 34%; EAD Cookbook 18%; XML markup tool 47%]

Respondents use an array of software tools for creating and encoding finding aid data, and some institutions use four or more.27 Word-processing software is the most widely used, likely because virtually any new staff member arrives knowing how to use it, including part-time and temporary employees. It would be useful to know the extent to which respondents find the existing array of available tools satisfactory.28

Many institutions mandate the existence of an institutional archives (“university archives” in academic institutions) and designate the responsible organizational unit. The nature of these collections is very different from other archival holdings of special collections libraries, presenting an overlapping but somewhat different set of issues. Collection development, types of material, and the primary user base (often the institution’s administration and staff) all differ. In addition, born-digital materials are far more prevalent in institutional archives than in most other types of collecting, at least at present. The institutional archives reports to the library at 87% of responding institutions. The above issues therefore loom large for our survey population.

Figure 1.19. Responsibility for records management (Q. 55, n=162)
[Pie chart: yes, formally 30%; shared responsibility 19%; yes, informally 22%; no 29%]

Managing the institutional archives requires close coordination with the unit responsible for the parent institution’s records management program to ensure that materials of permanent value are not discarded before being evaluated for transfer to the archives.29 Accomplishing this is straightforward when both functions reside in a single organizational unit; it is more challenging when the records management function reports elsewhere.
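Since EAD recurs throughout this section, a minimal structural sketch may help readers who have not seen a finding aid’s encoding. The Python below builds a skeletal EAD 2002 document using only the standard library; the identifier, titles, dates, and extents are hypothetical, and a real, conformant finding aid would carry many more elements and attributes than shown here.

    import xml.etree.ElementTree as ET

    # Illustrative EAD 2002 skeleton. Element names follow the EAD 2002
    # tag library; all content values here are hypothetical.
    ead = ET.Element("ead")

    header = ET.SubElement(ead, "eadheader")
    ET.SubElement(header, "eadid").text = "example-0001"  # hypothetical ID
    filedesc = ET.SubElement(header, "filedesc")
    titlestmt = ET.SubElement(filedesc, "titlestmt")
    ET.SubElement(titlestmt, "titleproper").text = "Guide to the Example Family Papers"

    # <archdesc> carries the collection-level description.
    archdesc = ET.SubElement(ead, "archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = "Example Family Papers"
    ET.SubElement(did, "unitdate").text = "1890-1955"
    physdesc = ET.SubElement(did, "physdesc")
    ET.SubElement(physdesc, "extent").text = "12 linear feet"

    # <dsc> holds the hierarchical description of series and files.
    dsc = ET.SubElement(archdesc, "dsc")
    series = ET.SubElement(dsc, "c01", level="series")
    series_did = ET.SubElement(series, "did")
    ET.SubElement(series_did, "unittitle").text = "Correspondence"

    ET.indent(ead)  # pretty-print; available in Python 3.9+
    print(ET.tostring(ead, encoding="unicode"))

The nested component elements (c01, c02, and so on) are what allow a single encoded finding aid to describe a whole collection down to whatever level of granularity processing has produced.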
A library or archives is responsible for records management in 71% of responding institutions—sometimes independently (30%), sometimes with shared responsibility (19%), and sometimes informally (22%) because the parent institution has no formal records management program. The latter circumstance is fraught with difficulty, since archivists are faced with seeking cooperation from offices throughout the institution that may not recognize the importance of saving their business records. The sad reality is that no formal records management program exists in many academic and research institutions.

Some questions:
• What arguments would help libraries obtain both the authority and the necessary resources in order to formalize records management programs in institutions that have none?
• How many special collections and archives have staff qualified to do records management?

Digital Special Collections

The increasing availability of special collections materials in digital form over the past decade has been nothing short of revolutionary for both users of special collections and the professionals who manage them. User expectations typically are high: how many of us have not been asked why everything is not yet online? At the same time, the advent of born-digital archival materials has presented a new challenge that has proven daunting, given the need for complex technical skills and challenging new types of intra-institutional collaboration. This section covers these two topics, which emerged as two of the three top challenges faced today in the special collections context.30

Figure 1.20. Digitization activity (Q. 57, n=163)
[Bar chart: one or more projects completed 78%; active program within special collections 52%; active library-wide program that includes special collections 50%; can undertake projects only with special funding 21%; no projects done yet 3%]

The organizational placement of digitization programs for special collections materials varies.31 Respondents could select multiple responses as appropriate to their circumstances. Half have a program based in special collections and half have a library-wide program; the two groups overlap, and 25% have both. This leaves 25% of respondents that do not have an active program.

Ninety-seven percent (97%) have completed one or more digitization projects and/or have an active program. This statistic is fairly constant across the entire population, regardless of library size or institutional type. Twenty-two percent (22%) can undertake projects only with special funding, suggesting that these libraries have not prioritized digitization of primary sources as an integral element of their programs and services.

Figure 1.21. Involvement in digitization projects (Q. 58, n=161)
[Bar chart: project management 87%; selection of materials 99%; cataloging/metadata 84%; digital image production 71%]

Special collections staff in more than 75% of institutions perform three or four of the digitization activities given as options (project management, selection of materials, cataloging/metadata creation, and digital image production).
For the 15% that are involved in only one activity, selection of materials was invariably that one. Twenty-four institutions reported additional activities: Web design (5 responses), grant writing (5), information technology (5), administration (3), and scanning on demand for users (2). It is likely that others would have selected one or more of these had we included them among the multiple-choice options.

Figure 1.22. Large-scale digitization (Q. 59, n=163)
[Pie chart: projects completed 38.0%; intended in future 36.2%; no plans 17.8%; not sure 8.0%]

Thirty-eight percent (38%) of respondents stated that they have already done large-scale digitization of special collections. This result was unexpected, given that special collections have been excluded from some high-profile mass digitization projects for reasons of efficiency. Subsequent follow-up with respondents has revealed, however, that the quantities of material digitized and/or production levels achieved generally were not impressive or scalable.32

We used the term “large-scale” to distinguish special collections activity from “mass digitization.” The latter generally is understood to mean conversion of library holdings at “industrial scale”: no selection of individual items, limited human intervention in the capture process, and exceptionally high productivity (Coyle 2006). Digitization of special collections, on the other hand, often requires a measure of selectivity to ensure that certain materials receive special handling to prevent damage or, if necessary, be excluded. We therefore defined “large-scale” digitization as a systematic effort to consider complete collections—rather than being selective at the item level, as has been the norm for many projects—and using production methods that are as streamlined as possible while also accounting for the needs of special materials. Some “large-scale” projects may be among those done under contract with commercial vendors, particularly those that digitize collections of exceptional depth. A better overall understanding of the nature and scope of large-scale digitization of special collections would be valuable.

Some questions:
• To what extent have libraries switched from doing highly selective “boutique” projects to digitization of entire collections?
• What methodologies are being used for large-scale projects? What production levels are achievable? Scalable?
• Can we develop replicable methodologies for large-scale projects that include metrics for efficiency and effectiveness?

The incidence of licensing contracts with commercial firms for digitization and subsequent sale of digitized special collections content varies enormously across the five organizations surveyed. The overall mean is 26%, while the percentages for the organizations surveyed are 11% (CARL), 13% (Oberlin), 27% (ARL), 39% (RLG Partnership), and, most notably, 73% (IRLA).
The exceptional depth and distinction of IRLA collections in specialized areas are likely a key factor. In addition, the fact that some IRLAs rely in part on earned income to support their programs, unlike governmental libraries or those affiliated with universities, offers added incentive.

Some questions:
• To what extent are key segments of the corpus of digitized rare books not available online and/or as open-access digital content?
• For how long are libraries that license their content likely to maintain contracts that result in access being available to subscribers only?

Figure 1.23. Responsibility for born-digital archival materials (Q. 61, n=161)
[Bar chart: special collections/archives 13%; library-wide level 17%; institutional level 3%; decentralized 11%; not formally determined 27%; not yet addressed 18%]

In addition to the challenges associated with digitization, the daunting requirements of born-digital archival materials have begun to loom large among the concerns of academic and research libraries, as our data reveal in multiple ways.

What is the intersection between born-digital content and special collections? Some born-digital materials, such as scholarly e-journals that have no print version, and reference databases, are easily disregarded in the special collections context; print originals of such materials were never located in special collections. In contrast, original archival and manuscript materials such as institutional office records, authors’ drafts that exist only on floppy discs, and digital photographs are the born-digital equivalents of materials traditionally collected by special collections. Other types of exclusively digital content, such as Web sites and scholars’ data sets, have characteristics that may or may not warrant special collections involvement.

Various types of expertise held by special collections librarians and archivists are relevant for developing the context of a digital collection and interpreting its content. Such skills include selecting materials of permanent rather than temporary value, negotiating ownership, resolving legal issues, determining and enforcing any restrictions, ensuring authenticity, determining file arrangement, and creating collective metadata.

Addressing such considerations would be valuable in planning for the management of born-digital materials in an academic or research library. Anecdotal evidence shows that in some institutions special collections is assigned responsibility for all born-digital materials; in others, special collections has no role. A more nuanced approach is necessary.

Only 55% of responding institutions have assigned responsibility for managing born-digital materials to one or more organizational units—10% more than was reported by ARL in 1998. Of these, 30% have given this responsibility to the library, within either special collections, the institutional archives, or at the library-wide level. Only 3% have consciously assigned responsibility elsewhere.
Time will tell whether this pattern will continue as the undecided 45% move forward. Organizations rarely assign responsibility for such a complex activity until a need has been defined and accepted—or, in some cases, in response to a precipitating crisis. Initial actions include development of infrastructure, shared planning and communication, and assignment of resources (both financial and human).

The 2010 report of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access emphasizes that a variety of players have a stake in born-digital preservation and management (BRTF 2010). Even so, only librarians and archivists are likely to assert the use case for preserving archival materials of permanent historical or evidentiary value.

Figure 1.24. Born-digital archival materials already held (Q. 62, n=159)

Seventy-nine percent (79%) of respondents reported having collected born-digital materials in one or more formats; these data are in stark contrast to the 35% who reported the size of their born-digital holdings in response to question 11. Visual and audiovisual materials (such as photographs, audio, and video) are the most frequently collected born-digital formats, closely followed by institutional records and other archives and manuscripts.

Two respondents expressed the ad-hoc nature of their collecting in a way that may apply more generally across the survey population:

“We have and manage some born-digital materials in all or most of these categories, but these are all ad hoc items or groups of items—not something we set out to ‘collect’ or manage. We seem to be moving toward a model in which SC and UA share responsibility for setting policy, perhaps for making decisions in specific cases, but where the materials themselves are folded into Digital Library Services/institutional repository.”

“Can’t really answer this question because our ‘collecting’ is so sporadic.”

A few respondents reported having collected formats not listed in the survey: e-mail, electronic theses and dissertations, cartographic materials, oral histories, undergraduate honors papers, digital arts, scholarly output of various types, and blogs.

Figure 1.25. Impediments to born-digital management (Q. 63, n=157)

Lack of funding was the impediment to implementation of born-digital materials management most often cited (69%), followed by lack of time for planning (54%) and lack of expertise (52%). All three are essential to any program; until they are in place, most collecting that takes place is likely to be reactive. Active management of digital files is also unlikely, since substantial resources are necessary for metadata creation, computer server space, and much more. The 52% of respondents that cited lack of expertise as an impediment stands in contrast to the 83% needing education or training in this area (question 71).
Figure 1.26. Institutional repositories (Q. 64, n=158)
[Bar chart: contribute metadata 43%; contribute collections content 53%; contribute project management 38%; participate in other ways 25%; no involvement 9%; no institutional repository 31%]

Sixty-nine percent (69%) of respondents have an institutional repository (IR). Half of all respondents reported that special collections units contribute collections content, which reflects the varying scope of IRs: some focus principally on the scholarly output of faculty and other researchers, while others include institutional records and other materials typically collected by special collections or archives.

The 2007 report of the MIRACLE project, a census of IRs in U.S. academic institutions, explored the involvement of archivists and archives. The MIRACLE data indicate that both participation and contribution of content by archivists have been minimal. For example, the report states (Markey 2007, Section 9.3), “[We] have no census data that would help explain the marginalization of the archivist with respect to IRs. There may be merit to Crow’s ([2002]) observation that the IR competes with the university archives.” We can think of no legitimate reason for an IR project management team to allow competition to enter the picture. Collaboration, not competition.

Some questions:
• Under what circumstances should special collections staff play a role in management of born-digital materials?
• Which skills and knowledge held by special collections librarians and archivists are essential for managing born-digital materials of any kind?
• What are the basic steps an institution should take to jump-start progress on managing born-digital archival materials?
• What are the elements of an effective use case for born-digital materials?
• Are any institutions assigning a role to special collections staff in curation of large data sets?
• Who should be responsible for institutional Web sites that have almost completely replaced countless physical brochures, newsletters, and other publications, but which, in physical form, were the responsibility of the university archives?
• What role should special collections play in the context of an institutional repository?

Staffing

In many academic and research libraries special collections staff are responsible for the full array of functional duties, including selection and interpretation of materials in any format, public services, teaching, specialized cataloging, archival processing, preservation, public outreach, exhibits, publications, digitization projects, born-digital management, fundraising, and more. We explored a variety of staffing issues of interest in the special collections context. These include number of staff, expected retirements, demographic diversity, and education and training needs. We also examined the extent to which separate special collections units have been integrated, which can lead to increased efficiencies and lower costs.
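Table 1.9 (below) reports staffing means with the two largest U.S. respondents excluded (see note 33), since a single Library of Congress-scale institution would swamp a mean. A minimal Python sketch of that kind of trimmed summary; the institution names and FTE figures here are hypothetical, not survey data:

    import statistics

    # Hypothetical FTE counts; "Library E" stands in for an LC-scale outlier.
    fte = {
        "Library A": 4.0,
        "Library B": 12.5,
        "Library C": 21.0,
        "Library D": 6.5,
        "Library E": 650.0,  # excluded outlier
    }
    EXCLUDED = {"Library E"}

    values = [v for name, v in fte.items() if name not in EXCLUDED]
    print(f"mean   = {statistics.mean(values):.1f} FTE")    # 11.0
    print(f"median = {statistics.median(values):.1f} FTE")  # 9.5

Including the outlier would more than double the mean while barely moving the median, which is why the report gives both statistics throughout.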
Table 1.9. Mean number of staff FTE33

                          All   ARL   CARL  IRLA  Oberlin  RLG
n                         158   80    19    15    39       50
Permanent (Q. 66, n=161)
  FTE                     13    20    8     32    3        25
  Professional            8     12    4     21    2        15
  Paraprofessional        5     8     4     11    1        10
Temporary (Q. 67, n=142)
  FTE                     2     3     2     4     0        5
  Professional            1     2     1     2     0        3
  Paraprofessional        1     1     1     2     0        2
Total
  FTE                     15    23    10    36    3        30
  Professional            9     14    5     23    2        18
  Paraprofessional        6     9     5     13    1        12

We solicited staffing statistics in FTE (full-time equivalents), either whole or decimal numbers. Positions to be reported were those “focused on special collections-related functions” rather than only those located within a special collections unit.34

The data reveal that IRLA special collections have, in the aggregate, far more staff (the IRLA mean is 32 FTE) than the members of the other four organizations. The next highest mean, that of RLG Partnership libraries (25 FTE), is positively influenced by the thirteen IRLAs that are also RLG Partners.

Figure 1.27. Changes in staffing levels (Q. 70, n=163)
[Bar chart of respondent counts by functional area (administrative; curatorial; public services; technical services, print materials; technical services, other materials; technology and/or digital services), with responses of no staff in this area, increased, no change, and decreased]

The norm across the survey population is that staff size within special collections generally has remained stable since 2000. The predominant response was “No change” in all functional areas of responsibility, with the exception of technology and digital services, for which 44% of respondents reported an increase. That said, stable staffing was not universal; decreases by functional area ranged from 7% (technology) to 23% (public services), and increases ranged from 21% (administration) to 32% (curatorial).

Decreased public services staffing may be of particular concern, given that a majority of respondents reported increased use across most user categories and formats. Increased staffing in technology and digital services may be related in some way to the fact that education and training needs in technology-related areas are higher than any other area, as described under Most Challenging Issues. Some respondents commented on staffing challenges relative to activities such as digitization, management of born-digital materials, and conversion of in-house databases.

Figure 1.28. Education and training needs (Q. 71, n=156)

The survey instrument listed sixteen areas in which staff may need development in order to meet the institution’s needs. We selected these areas based on review of education and core competency guidelines issued by the two principal U.S. professional societies for special collections librarians and archivists (ACRL 2008; SAA 2002, 2006). The two most frequently named areas of need were born-digital materials (83%) and information technology (65%).
Intellectual property was third (56%), perhaps reflecting that many institutions struggle to determine the risks and legal responsibilities associated with digitizing materials that have not gone out of copyright.35 Training in born-digital management and intellectual property is widely available at archival conferences and in continuing education programs, but less so in programs that target the broader special collections population.36 This may be a fruitful area in which organizations such as the ACRL Rare Books and Manuscripts Section could direct members toward more opportunities.

Half of respondents cited a need for staff development in cataloging and metadata. This may relate in part to the desire to employ non-MARC metadata for digitization projects, since special collections staff often does this work, and original cataloging is often necessary for lack of existing cataloging copy.37 Also, about one third need greater expertise in archival processing or records management. Taken together, these data suggest that placing a higher priority on such training could help institutions more efficiently expose hidden collections.

Based on respondents’ estimates, 9% of special collections staff across the survey population are likely to retire in the next five years. Percentages varied somewhat by organization: ARL (10%), CARL (16%), IRLA (8%), Oberlin (7%), and the RLG Partnership (8%).

Figure 1.29. Demographic diversity (Q. 69, n=160)
[Bar chart: Asian 29%; Black or African American 35%; Hispanic or Latino 30%; Native American 8%; Pacific Islander 6%; White 96%]

We asked which demographic groups are represented among the special collections staff of each institution.38 The data show that slightly more than one third have Black/African American staff members, while slightly fewer have Asians or Hispanics/Latinos on staff.39 Note that these percentages reflect the percentage of institutions that have special collections staff in each of these population groups, not the percentage of individual staff.40

Figure 1.30. Integration of separate units (Q. 72, n=159)
[Pie chart: since 2000, 20%; before 2001, 16%; have always had only one, 30%; multiple units, all remain separate, 28%; entire institution is special collections, 7%]

Until recent decades, it was common for multiple special collections and archives units to exist within one institution. These tended to be departments or branch libraries segregated by type of material (e.g., rare books, manuscripts, institutional archives, oral history) or collecting focus (e.g., local history or other topical areas). While autonomy has its virtues, supporting multiple independent units often results in added expense and inefficiency, such as the need to staff multiple facilities and public service desks, uneven expertise in specialized skills such as cataloging or archival processing, and variable policies and practices across the institution.
In recent decades, the prevalence of separate units has diminished: more than one third of respondents have integrated all formerly separate units, while 28% continue to maintain separate units and do not plan to change this.

Some questions:
• Is the total number of staff in special collections remaining stable?
• Why is public services staff decreasing at the same time that use is increasing?
• How can a library determine the appropriate level of staffing for special collections in the context of both the library’s overall goals and the areas in which special collections is expected to help fulfill those goals?
• Why are educational needs in several areas so widely unmet?

Most Challenging Issues

We asked respondents to name their “three most challenging issues” in open-ended comments. (We disallowed staffing and funding, since they can serve to mask more specific challenges.) The data give an interesting overview of what respondents see as their most significant pain points. Some answered in a few words; others at considerable length. (See the data supplement for complete data.) We slotted the stated challenges into thirteen categories.

Table 1.10. Most challenging issues (Q. 79, n=158)

Issue                                       Number of responses
Space and facilities                        105
Born-digital materials                      60
Digitization                                57
Meeting user needs                          48
Cataloging and archival processing          47
Preservation                                36
Information technology                      30
Administration and institutional relations  20
Collection development                      19
Staff development                           9
Rights and permissions                      7
Fundraising                                 5
Records management                          2

Note: The total equals far more than the number of respondents because each could name up to three challenges.

Space was cited nearly twice as often as any other issue. Some respondents stated space as a problem in general terms, while others focused on collections, staff, and/or public services space as being inadequate in size or configuration. About 25 institutions described the need for improved environmental conditions (temperature and humidity controls) or security. A few were in the midst of renovation projects.

The second most frequently named challenge was born-digital materials. This echoes the wide range of other contexts (collecting, access, preservation, management, training, etc.) in which it rose to the top as an area of concern.

Digitization was also widely seen as a continuing challenge. Some respondents couched this in terms of an implicit mandate to put as much material as possible online, and as soon as possible. Some conveyed a sense that, short of digitizing everything in special collections, libraries can never do enough.
Respondents expressed a wide variety of challenges relating to meeting user needs, such as how to understand the changing needs and nature of users, attract new users, improve insufficiently discoverable metadata, integrate special collections materials into academic courses, maintain a strong web presence, expand outreach programs, and implement social networking tools.

Cataloging and archival processing ranked no higher than fifth despite the fact that “hidden” materials remain numerous. This may suggest that gradual success in making more materials available has led libraries and archives to focus on the many other challenges that have been lying in wait. Some respondents mentioned a need for cataloging to facilitate digitization, including determining what constitutes adequate metadata in that context.

Preservation of physical materials also remains an important issue. Most respondents who voiced concern about preserving originals emphasized that audiovisual materials are the problem. This matches the data for question 22, for which 62% of respondents stated that audiovisual materials have a high level of need.

Thirty-three respondents added a final comment (question 80); those of potential interest beyond the responding institution are transcribed in the data supplement. While these varied greatly in both substance and length, several issues were raised by multiple respondents:
• The difficulty of compiling some of the data requested for this survey, particularly at institutions that amalgamated data for multiple units
• The inherent challenges that exist when one individual is responsible for a disparate variety of basic functions within a very small department
• For how long can we keep doing more with less?

Our recommendations for action in Chapter Three offer some possible concrete steps for moving forward.

Notes

1 The survey instructions asked that all known units be reported and named whether or not they report administratively to a library system rather than to an academic school or other organizational unit. Some institutions nevertheless reported only those special collections that are part of a library system.

2 Data for the two largest U.S. respondents were excluded to avoid skewing the overall means: the Library of Congress and the National Archives and Records Administration. Numbers are rounded in this table. See the data supplement for exact figures.

3 This total does not include the two largest film and television archives in the United States: the Library of Congress and UCLA. LC collection statistics were excluded, as noted earlier. UCLA did not include its Film & Television Archive in its response.

4 The 1998 ARL data revealed this in various ways. Judith Panitch elaborated on this and other collections issues in a phone conversation on June 15, 2009.

5 The two largest born-digital holdings reported were 26,000 and 17,000 gigabytes. The range for the other eleven institutions in the top thirteen is much smaller, from 7,500 GB down to 1,500 GB.

6 ARL did not include such a question in the 1998 survey; per Judith Panitch, some respondents would have welcomed one.
7 Data for the two largest U.S. respondents were excluded to avoid skewing the overall means: the Library of Congress, which has an exceptionally large materials budget, and the National Archives and Records Administration, which has no acquisitions funds because it acquires all materials by transfer from government agencies. Note that the combined figures are not simple combinations of “institutional” and “special” because it is not statistically valid to sum means or medians across subgroups; they were therefore recalculated from the combined data.

8 We collected statistics in three funding categories—collections, staff, and other expenses—but found that only the collections data warranted analysis. Data submitted for staff and other expenses are detailed in the data supplement, Tables 75-76.

9 We asked for a relative indication of change, not specific statistics, since we felt that the latter would have been far too time consuming, if not impossible, for many respondents to provide. This approach was validated by the data from question 24. The same is true for questions 26 and 27.

10 ARL recently published a report that focuses on various forms of special collections outreach (Berenbak et al. 2010).

11 This finding correlates closely with the findings of a 2009 ARL SPEC Kit in which 92% of respondents reported that they permit access to minimally or unprocessed collections (Hackbart-Dean and Slomba 2009).

12 Only 59 respondents answered question 29, perhaps because the wording implied that it could be skipped if the response to question 28 was “yes.”

13 The Rare Books and Manuscripts Section of ACRL has long actively encouraged interlibrary loan via promulgation of a set of guidelines, currently undergoing scheduled review as of 2010 (ACRL 2004).

14 We did not explicitly ask whether camera use is approved selectively, but a review of numerous policies by an RLG Partnership working group suggests that this is generally the case. See Miller, Galbraith, and RLG (2010).

15 Only 27 respondents answered question 33, perhaps because the wording implied that it could be skipped if the response to question 32 was “yes.”

16 The document, Well-intentioned practice for digitizing collections of unpublished materials, seeks to define reasonable community practice to minimize the risk of copyright violations (OCLC Research 2010c). It was issued as an outcome of the RLG Partnership symposium Undue Diligence: Low-Risk Strategies for Making Collections of Unpublished Materials More Accessible, held on March 11, 2010 (OCLC Research 2010b).

17 Many respondents charge a variety of prices for scans based on factors such as size and format of the original and the required image resolution. Some have differential pricing based on the user’s status (e.g., lower cost for students) or affiliation (e.g., higher cost for unaffiliated users).

18 ArchiveGrid is available without charge to all OCLC FirstSearch subscribers and by subscription to others (OCLC 2006-2010).

19 Such links were initially disallowed under Wikipedia’s conflict-of-interest policy, but after significant lobbying by the archival community, this ban was lifted in September 2009. See, for example, Theimer (2009).

20 Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of up to +10% in each response.
Individual responses totaling more than 110% were dropped from all calculations.

21 We did not ask respondents to indicate the extent to which the relevant archival collections themselves have catalog records. This would have added a level of complexity that would have been difficult to convey in the survey and likely would have been impossible for most respondents to determine.

22 Archival issues are discussed under these topics: preservation (question 22), levels of use and access policies (questions 27-30), access to finding aids (question 36), catalog records (questions 42-43), existence of finding aids (question 48), size of backlogs (question 49), born-digital materials (questions 61-63), and training needs (question 71). The 1998 ARL survey addressed archival materials only in terms of collection size, level of access, and preservation of electronic (a.k.a. born-digital) records.

23 The two categories of material used in the ARL survey were “manuscripts” and “university archives.”

24 Roughly 20% of archival materials were reported as unprocessed or uncataloged in the 1998 ARL survey; the combination of “unprocessed” and “uncataloged” leaves unclear whether respondents were referring to absence of any catalog record, as opposed to lack of physical processing and creation of a finding aid. Eighty-four percent (84%) had no Internet-accessible finding aid.

25 The EAD DTD and XML schema, tag library, and other documentation are hosted by the Library of Congress (LC 2010).

26 The EAD Help Pages include extensive examples of EAD implementations and much more (SAA 2010b).

27 We did not ask respondents to name the specific tools that they use (e.g., which database software or which XML markup tool); had we done so, responses almost certainly would have varied widely. Those who added other information mentioned use of spreadsheets (principally Microsoft Excel), web authoring tools (such as Dreamweaver), and particular commercial database products.

28 The current plan to integrate the Archivists’ Toolkit and Archon, two Mellon-funded archival management systems, as ArchivesSpace offers the latest promise for a tool that is sophisticated, while not requiring sophisticated technological resources to manage it (ArchivesSpace 2010).

29 Records management is concerned with the disposition of “active” records—i.e., those needed by the office of origin in order to conduct its daily business. Only those records deemed of permanent value should be sent to the institutional archives.

30 The number one challenge was space (question 79).

31 We did not attempt to define the nature of an active program; respondents made their own determination.

32 Determined as an outcome of interviews conducted for the Rapid Capture project (OCLC Research 2010a).

33 We excluded the Library of Congress and the National Archives and Records Administration, both of which have hundreds of special collections and archives staff, from our calculations to avoid inappropriate skew in the data.

34 We did not ask respondents to differentiate in this regard, nor did we ask whether an FTE was filled or vacant.

35 See the discussion of this issue in the User Services section of this report.

36 The Society of American Archivists is particularly active in this area.
See, for example, the course description for Copyright: The archivist and the law (SAA 2010a). As of 2010, the Rare Book School at the University of Virginia is offering a course on born-digital archival records (RBS 2010).

37 This is true for 85% of the 161 respondents to question 58 on participation in digitization projects.

38 We used a subset of the categories in the 2000 U.S. Census; see U.S. Department of Commerce n.d.

39 Several respondents noted that they either included or excluded student employees in their reporting; we do not know whether one or the other approach was prevalent. Several Canadian institutions noted that such statistics are not kept in Canada, though most CARL members did provide data.

40 ARL tracks the number of staff by demographic group in its ARL Annual Salary Survey, 2009-2010 (ARL 2010a, Graph 1).

2. Overviews of Membership Organizations

Chapter Two presents a partial profile of each of the five membership organizations surveyed. Each begins with an overview of the organization and then highlights selected results that differ noticeably from the overall norms detailed in Chapter One. Where an issue covered in Chapter One is not addressed here, it is because either the organization’s data roughly matched that of the overall population, or we judged the particular issue to be not especially noteworthy. Chapter Two therefore supplements rather than replaces Chapter One for a complete view of each organization.

Brief tables summarize data for overall library size and budget, collection size, onsite visits, presentations, and catalog records. Note that three tables in Chapter One also include data for each organization: those for number of respondents, acquisitions funding, and number of staff.

Association of Research Libraries
http://www.arl.org/

The rate of response by ARL members was 69% (86 of 124 members), or 51% of survey respondents overall. This includes all 24 ARL members that hold more than six million volumes. When ARL’s 1998 survey was conducted there were 110 members, of which 99 responded (90%), including all 18 members that had more than five million volumes at that time.1 Seventy-one of the 99 ARLs that participated in 1998 also did so in 2010, comprising 84% of our ARL respondents. Some that responded in 2010 were not yet ARL members in 1998. Throughout this section, we highlight comparisons between our data and that from 1998.

Organizational profile

The Association of Research Libraries was founded in 1932 and had 124 members in the United States and Canada at the time our survey was being conducted.2 The classic ARL member is a comprehensive, research-intensive university. Member libraries come in all sizes, however, ranging from Kent State and Guelph at the smaller end of the spectrum to the Library of Congress and Toronto as two of the largest.
Additionally, ten non-academic institutions are members (seven of which responded to the survey): Library and Archives Canada, the Library of Congress, the National Agricultural Library, the National Library of Medicine, the Smithsonian Institution Libraries, the Center for Research Libraries, the Canada Institute for Scientific and Technical Information (CISTI), the New York Public Library, the New York State Library, and the Boston Public Library.

ARL is a non-profit organization with an agile agenda that focuses on strategic directions, currently these: Influencing Public Policies, Reshaping Scholarly Communication, and Transforming Research Libraries. ARL collects and maintains detailed annual statistics about physical and digital library holdings, finances, human resources, and selected special issues (some of which are published in the ongoing SPEC Kit series). Active publication and training programs also are important aspects of ARL’s services to members. ARL works with other organizations to lobby the U.S. Congress on behalf of research libraries. The ARL offices are located in Washington, D.C. The directors meet twice annually. One third of ARL members are also in the RLG Partnership and 18 are in CARL.

Overall library size and budget

Table 2.1. ARL overall library size (Q. 7, n=84)

Number of Volumes     Number of ARLs  Percent of ARLs
< 1,000,000 volumes   –               –
1-3 million volumes   24              29%
3-6 million volumes   36              43%
> 6,000,000 volumes   24              29%

Every ARL library has more than one million volumes, reflecting the organization’s membership consisting principally of research-intensive universities.

Table 2.2. ARL change in overall library funding (Q. 77, n=81)

Change Reported          Percent of Responses
Decreased 1-5%           24%
Decreased 6-10%          29%
Decreased 11-15%         8%
Decreased 16-20%         10%
Decreased more than 20%  14%
No change                10%
Increased                6%

The pattern of change in overall library funding for ARL libraries is fairly similar to respondents overall.

Collections

Table 2.3. ARL special collections size (Q. 11, n=79)3

Format                                  n   Mean         Median
Printed volumes                         79  285,000      202,000
Archives and manuscripts (collections)  79  32,200 l.f.  23,500 l.f.
Manuscripts (managed as items)          34  1,100,000    974
Cartographic materials                  54  23,600       1,800
Visual materials                        57  1,350,000    372,000
Audio materials                         54  53,000       6,300
Moving-image materials                  47  14,000       3,400
Born-digital materials                  35  2,000 GB     50 GB
Microforms                              45  15,000       3,100
Artifacts                               46  2,500        500

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

A comparison of the data from the 1998 and 2010 surveys indicates that growth of collections in some formats across ARL libraries has been extraordinary over the past decade.4 The mean number of printed volumes and archival collections has increased by slightly more than 50%, audio materials by 240%, and visual and moving-image materials both by around 300%. The mean number of microforms, on the other hand, decreased by 80%, perhaps due to transfers to general collections or deaccessioning of microform sets that have been digitized.
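The growth figures just cited are simple percent changes between the 1998 and 2010 means. A minimal sketch of the arithmetic; the 2010 value is the ARL mean for printed volumes from Table 2.3, but the 1998 baseline below is a hypothetical illustration, not a figure from the published 1998 tables:

    # Percent change between two survey means.
    def pct_change(old: float, new: float) -> float:
        return (new - old) / old * 100

    mean_1998 = 187_000   # hypothetical baseline, for illustration only
    mean_2010 = 285_000   # Table 2.3, printed volumes (ARL mean)
    print(f"{pct_change(mean_1998, mean_2010):+.0f}%")  # prints +52%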
What factors explain this rapid growth? In recent years many institutions have focused intensively on building archival and manuscript holdings, and these collections sometimes grow significantly with a single large acquisition; this is particularly true for certain types of collections such as political papers and institutional records. It is also possible that some apparent growth is actually the result of increased collections processing that has revealed non-print materials in far greater numbers than were previously thought to exist.

Increased acquisitions funding also likely accounts for some growth, particularly for printed volumes, half of which are acquired by purchase: 58% of ARLs reported having more funding for special collections than in 2000. In fact, the means and medians were remarkably higher: in 1998, mean funding was $210,000, and in 2010, it was $417,000. Median acquisitions funding increased from $59,000 to $182,600.

Table 2.4. ARL acquisitions funding, 2010 and 1998 (Q. 75, n=54; Q. 76, n=59)5

               2010                   1998
               Mean       Median      Mean       Median
Institutional  $170,000   $60,000     $46,000    $19,000
Special        $318,000   $140,000    $149,000   $13,000
Total          $417,000   $182,600    $210,000   $59,000

A very high percentage of total collection materials across the entire survey population are held by ARL libraries, ranging by format from 97% of audio materials and 95% of moving-image materials down to 65% of cartographic materials and 51% of microforms. ARLs hold 75% of the printed volumes.

Table 2.5. Percentage of all survey holdings held by ARL libraries

Format                               Total items across survey population  Total items in ARL  Percent in ARL
Printed volumes                      30,000,000                            22,500,000          75%
Archival and manuscript collections  3,000,000 l.f.                        2,500,000 l.f.      83%
Manuscripts (managed as items)       44,000,000                            37,400,000⁶         85%
Cartographic materials               2,000,000                             1,300,000           65%
Visual materials (two-dimensional)   90,000,000                            77,000,000          86%
Audio materials                      3,000,000                             2,900,000           97%
Moving-image materials               700,000                               666,000             95%
Born-digital materials               85,000 GB                             70,000 GB           82%
Microforms                           1,300,000                             666,000             51%
Artifacts                            154,000                               113,000             73%

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

The percent of ARLs that have special collections materials in storage has risen minimally since 1998, from 73% to 80%. Formal arrangements for collaborative collection development remain rare, as they were in 1998. At that time, about 6% of ARLs had formal collaborations; as of 2010, about 10% do.

ARL libraries rank preservation problems for several formats noticeably higher than the overall means: archives and manuscripts (51% of respondents reported a medium level of need), visual materials (44% reported a high level of need), and audiovisual materials (73% reported a high level of need).

User services
Table 2.6. ARL onsite visits (Q. 24, n=66)

                      n   Number of Visits  Percent of Total  Mean   Median
Faculty and staff     40  32,121            8%                803    410
Graduate students     38  28,245            7%                743    361
Undergraduates        39  42,810            10%               1,098  662
Visiting researchers  38  93,592            23%               2,463  413
Local community       25  13,532            3%                541    403
Other                 44  199,551           49%               4,535  1,677
Total                     409,851           100%              6,210  3,088

Only 77% of ARL respondents provided data on onsite visits, and nearly half of the visits reported were categorized as “Other.” As noted in Chapter One, these two outcomes suggest that nearly one-fourth of ARL respondents do not record statistics in a way that is compatible with our approach. The relative percentage of use by each type of user for those that did break down their statistics is, however, at least somewhat indicative of norms across ARL libraries. In this regard it is striking that unaffiliated scholars and researchers were by far the most numerous across the specific user types—only slightly less numerous than all users who are affiliated with their institution (faculty, staff, and students).

The mean and median numbers of onsite users in ARL’s 1998 survey data were 3,696 and 2,280, respectively. Comparison with our data reveals that the mean has risen by nearly 70% and the median by one third. On the other hand, the means for each type of user are very different in 2010 than in 1998. For example, the mean for graduate students was 943 in 1998; in 2010, it was 743 (Panitch 2001, 99). The degree to which partial reporting and use of “Other” may explain this decrease is unknown.

ARLs reported 71% of the overall onsite visits across the survey population. The level of use by graduate students (64%) and undergraduates (74%) increased in noticeably more ARL institutions than across the overall population. In terms of use by type of material, audiovisual use increased in markedly more ARLs (73%) than the overall mean (58%).

Table 2.7. ARL presentations (Q. 38, n=80)

                                               n   Number of Presentations  Percent of Total  Mean  Median
College/University courses                     76  6,946                    56%               91    57
Others affiliated with responding institution  65  1,890                    15%               22    12
Local community                                68  2,400                    19%               35    14
Other visitors                                 60  1,215                    10%               20    10
Total                                              12,451                   100%              156   87

ARL members reported 75% of the presentations across the overall population, yet the ARL mean is somewhat less than those for IRLA and RLG Partners. It is, however, encouraging to note that the mean has increased by 77% since 1998, when it was 88.

Somewhat more ARLs permit interlibrary loan of original printed volumes (45%) and other materials (27%) than the overall means (38% and 18%, respectively). Of the 17 ARLs that gave reasons for not permitting cameras in the reading room, 65% cited concern about loss of revenue from reproduction services. ARLs noticeably exceeded the overall means in their implementation of blogs (57%) and Wikipedia links (49%). Forty-seven percent (47%) have a fellowship or grant program for visiting researchers.

Cataloging and metadata
Cataloging and metadata

Table 2.8. ARL catalog records (Q. 41-47) [7]

Format                   n    Online   Offline   No Record   Described within Archival Collections
Printed volumes          76   84%      9%        7%          n/a
Archival collections     77   63%      13%       25%         n/a
Manuscripts (items)      47   55%      26%       20%         n/a
Cartographic materials   68   46%      13%       18%         28%
Visual materials         68   21%      13%       25%         45%
Audiovisual materials    65   25%      7%        27%         48%
Born-digital materials   51   30%      1%        28%         46%

The 2010 ARL data for online catalog records is nearly identical to that of the overall population.

Table 2.9. ARL online catalog records (2010 and 1998)

Format                          2010   1998   Percent Change
Printed volumes                 84%    73%    12%
Manuscripts                     —      46%    —
University archives             —      29%    —
Archives & mss. (collections)   63%    —      —
Manuscripts (items)             55%    —      —
Cartographic materials          46%    36%    6%
Visual materials                21%    33%    –12%
Audio materials                 —      37%    —
Video and film                  —      43%    —
Audiovisual materials           25%    —      —
Born-digital materials          30%    —      —
Computer files                  —      43%    —

Note: Blank cells indicate formats for which ARL and OCLC used different format categories and for which relative percentages therefore cannot be calculated.

A comparison of our data with that from the 1998 survey reveals relatively modest improvements overall in the percent of online catalog records: [8]

• Printed volumes: 12% increase, from 73% to 85%.
• Archives and manuscripts: The use of different categories in the two surveys obviates clean comparison. Combining each pair, however, we see an increase in online records of ca. 15%. Regardless, 37%-45% of these unique materials are not yet in online catalogs.
• Cartographic materials: 6% increase, from 40% to 46%.
• Visual, audio, and moving-image materials: Online records for these formats decreased. A variety of factors may be in play, including dramatic growth in collection sizes and the possibility that recent inventories of holdings have revealed that more materials were extant in 1998 than had been known to exist at that time. On the other hand, given that close to 50% of these materials are managed with archival collections, the percentage of materials accessible online could be much higher if all collections were cataloged.
• Born-digital materials: Comparison is not feasible due to completely different format definitions.

The data for born-digital materials warrants particular mention because ARL's metric changed radically in the past decade. The 1998–99 statistical methodology counted "computer files" as physical media (e.g., disks, tapes, CDs), the content of which was not necessarily either born-digital or "archival" in nature (Kyrillidou and O'Connor 2000). The mean number of items held was 288, and the largest collection was 2,782 physical items. How this compares with current holdings in gigabytes (85,000 GB across the entire population) cannot be determined. Cataloging of physical media is more straightforward than are aggregations of born-digital archival files; this may explain the higher percentage of material with online records reported in 1998 (43%) relative to 2010 (29%). Regardless, 29% is promising, considering that it is still early days for providing access to these materials. Some online records probably describe discrete documents held in institutional repositories rather than digital content that has not been "published" in such fashion. It is important that we learn more about born-digital archival materials and the extent to which public access exists.
Table 2.10. ARL archival finding aids (1998 and 2010)

Format                                            n    Internet Finding Aid   Non-Internet Finding Aid   Machine-readable Finding Aid   No Finding Aid
Archival and manuscript collections (OCLC 2010)   81   52%                    30%                        —                              19%
Manuscripts (ARL 1998)                            82   16%                    —                          31%                            —
University archives (ARL 1998)                    71   16%                    —                          36%                            —

Note: Blank cells indicate where ARL and OCLC used different format categories.

The increase in online finding aids since 1998 is dramatic: from 16% to 52%. This is in part because thousands of legacy finding aids that had existed for years were converted to EAD or HTML in the past decade. As with catalog records, however, only an imperfect comparison is feasible, since the two surveys used different definitions for formats of material and types of access. ARL collected data on "machine-readable" finding aids in addition to "Internet" finding aids. Although the first term was not defined, we assume it was intended for data created digitally but not available on the Internet; this would include finding aids stored in local databases or word processing files and available only locally. In addition, ARL combined "uncataloged" and "unprocessed" rather than clearly separating processing status and existence of finding aids from that of library catalog records. Based on ARL's finding aid statistics, however, we surmise that about 50% of collections had either no finding aid or only one on paper.

Regardless, our data clearly show that far more finding aids are online now than in 1998. Ten years of effort to make descriptions of archival collections accessible have been successful—but ARL libraries are only halfway there.

Archival collections management

EAD is used by 85% of ARLs, noticeably above the overall mean of 69%. Forty-two percent (42%) use the Archivists' Toolkit. Two thirds of ARL libraries have responsibility for records management at some level.

Digital special collections

An active library-wide digitization program is in place at 72% of ARLs, and 47% report having done large-scale digitization of special collections.

Slightly more than half of ARL institutions have assigned responsibility for management of born-digital archival materials (52%, up from 45% in 1998), whether to the library or elsewhere in the institution. This minimal increase over twelve years is not encouraging. In addition, more ARLs have collected every born-digital format listed in the survey than the overall mean, while 13% have collected none.

Eighty-six percent (86%) of ARLs have an institutional repository, and special collections staff within ARLs are more often involved in all aspects of IR implementation than the overall mean. The most striking difference is in contribution of collections content (69%, overall mean of 53%).

Staffing

The mean number of permanent FTE for ARL respondents is twenty (twelve professionals and eight paraprofessionals). The median is twelve (seven professionals and five paraprofessionals). Considerably more ARL respondents have Black/African American staff in special collections (52%, overall mean is 35%) than the members of the other four organizations.
As mentioned in Chapter One, our data measure the number of institutions, not the number of staff. [9] We focused on changes in staffing by functional responsibilities, whereas in 1998 ARL asked whether the overall number of permanent special collections staff had changed in the previous ten years. It is noteworthy in the context of the current difficult economic times that 48% of respondents in 1998 reported that staffing had increased in the prior decade, whereas ARL staffing in most areas has been stable since 2000. The one exception is technology and digital services, in which 54% of ARL respondents reported increased staffing.

The areas in which the largest percentage of ARLs reported a need for education or training are born-digital records (88%), information technology (72%), intellectual property (65%), cataloging and metadata (48%), preservation (43%), and management/supervision (40%).

More ARL libraries have integrated formerly separate special collections units (45%) than the overall mean, presumably because ARLs are larger and have the potential for more units (only 18% have always had a single unit). Nevertheless, 37% continue to have multiple separate units.

Canadian Association of Research Libraries
http://www.carl-abrc.ca/about/about-e.html

The rate of response by CARL members was 65% (20 of 31 members), comprising 12% of respondents overall. Eleven CARL members responded to ARL's 1998 survey; seven of the eleven also participated in 2010. Given that only one third of our 2010 CARL respondents participated in both surveys, comparisons between the two sets of CARL data seemed unlikely to be meaningful. Indeed, our examination of the data on several issues revealed illogical patterns; we therefore opted not to report on similarities or differences (other than one comment under special collections size). The ARL report included a chapter focused on the Canadian responses, in which it was reported that means were far below U.S. norms for collection size, staffing levels, and expenditures. This is still the case.

Organizational profile

The Canadian Association of Research Libraries was founded in 1976. The membership currently consists of 28 universities and three national institutions: Library and Archives Canada, the Canada Institute for Scientific and Technical Information (CISTI), and the Library of Parliament. [10] CARL strives to enhance the capacity of member libraries to partner in research and higher education, and to seek effective and sustainable scholarly communication and public policy that encourages research and broad access to scholarly information. CARL considers the collective human and physical resources of its members a strategic national information resource. Activities include a wide array of initiatives in support of the mission, including enhancing skills in data management, reciprocal interlibrary lending and document supply, construction and use of the Digital Collection Builder, [11] and development of open repositories.

Eighteen (58%) members of CARL are also members of the Association of Research Libraries, and four are in the RLG Partnership (Alberta, British Columbia, Calgary, and Toronto).
Overall library size and budget

Table 2.11. CARL overall library size (Q. 7, n=19)

Number of Volumes     Number of CARLs   Percent of CARLs
< 1,000,000 volumes   2                 11%
1-3 million volumes   11                58%
3-6 million volumes   4                 21%
> 6,000,000 volumes   2                 11%

CARL is unique among the five organizations in having a majority of libraries in the mid-size range. Two, both of which are ARL members, have more than six million volumes.

Table 2.12. CARL change in overall library funding (Q. 77, n=18)

Change Reported           Percent of Responses
Decreased 1-5%            24%
Decreased 6-10%           29%
Decreased 11-16%          8%
Decreased 16-20%          10%
Decreased more than 20%   14%
No change                 10%
Increased                 6%

The budget data show that 84% of CARL members have seen their overall funding drop due to the current global economy. The pattern of change in overall library funding is somewhat more negative for CARL libraries than for respondents overall.

Collections

Table 2.13. CARL special collections size (Q. 11, n=16) [12]

Format                                   n    Total Items    Mean         Median
Printed volumes                          16   1,700,000      107,000      72,000
Archives and manuscripts (collections)   16   150,000 l.f.   9,200 l.f.   5,100 l.f.
Manuscripts (managed as items)           5    2,400          500          200
Cartographic materials                   9    44,000         4,900        1,200
Visual materials                         8    1,300,000      165,000      130,000
Audio materials                          11   66,000         6,000        1,100
Moving-image materials                   8    14,000         1,700        700
Born-digital materials                   6    2,100 GB       359 GB       150 GB
Microforms                               7    27,000         3,900        1,700
Artifacts                                6    1,035          173          100

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

It seems that few CARLs manage manuscripts as items; only five libraries provided data, and the holdings are tiny. Moving-image collections across the CARL libraries also are very limited in size.

We sought to measure changes in mean and median collection sizes between 1998 and 2010 for CARL respondents, as we did for the ARLs. Recalling that two thirds of our respondents did not participate in 1998, however, we hypothesized that any comparison would be meaningless. We believe this was confirmed by the fact that while two means increased greatly (archival collections and audio materials), most others either stayed the same or dropped significantly. The mean for printed volumes, for example, dropped by 80%. We therefore disregarded collection size comparisons, considering them invalid.

Far more CARLs (37%) reported stable funding for collections than the overall mean (6%). Concomitantly, fewer had increased funding (32%, compared to 48%). The CARL mean for institutional acquisition funds is $44,000, which is about 25% that of ARL libraries. Somewhat more (25%) have informal collaborative collection development arrangements with other non-regional institutions in their nation than the overall mean (16%). No CARL institution has a formal collaboration in any category, nor do any collaborate internationally. Some special collections are in offsite storage in 58% of CARL libraries.
More CARL libraries reported lower levels of preservation needs for some formats than the overall population: archives and manuscripts (50% of CARLs had low needs, compared to 40% medium) and audiovisual materials (44% had high needs, compared to 62% high overall).

User services

Table 2.14. CARL onsite visits (Q. 24, n=16)

User type              n    Number of Visits   Percent of Total   Mean    Median
Faculty and staff      9    11,295             9%                 1,255   188
Graduate students      6    2,769              5%                 462     —
Undergraduates         7    15,283             12%                2,183   753
Visiting researchers   8    3,975              24%                497     99
Local community        6    5,721              7%                 954     355
Other                  10   29,121             43%                2,912   1,603
Total                       68,164             100%               4,869   2,341

While many CARLs reported increased use by faculty/staff (41% of respondents), undergraduates (47%), visiting scholars (35%), and the local community (35%), these percentages of libraries are well below the overall means. In addition, onsite use has increased at fewer CARLs (44%) than the overall mean (62%). In terms of use by type of material, use of books has increased in a higher percentage of CARL libraries (ca. 65%) than across the rest of the population.

Table 2.15. CARL presentations (Q. 38, n=16)

Audience                                        n    Number of Presentations   Percent of Total   Mean   Median
College/University courses                      16   441                       61%                28     20
Others affiliated with responding institution   12   67                        9%                 6      5
Local community                                 12   113                       16%                9      5
Other visitors                                  9    103                       14%                11     5
Total                                                724                       100%               45     35

The mean number of presentations across CARL libraries is less than 30% that of ARL, IRLA, and RLG Partners.

Noticeably fewer CARLs permit use of unprocessed/uncataloged materials (ca. 63%) than the overall mean (ca. 80%). On the other hand, far more CARLs permit interlibrary loan of original printed volumes (63%; the overall mean is 38%). Of the five CARLs that gave reasons for not permitting cameras in the reading room, none cited concern about loss of revenue from reproduction services (overall mean is 41%), and only one cited potential disruption in the reading room. On the other hand, four of the five were concerned about improper handling of materials.

CARL libraries' average fees for a digital scan are similar to the overall means across the population, with one exception: a far lower percentage of CARLs (5%) charge more than $20. Two libraries do not offer scanning as a service. Far fewer contribute finding aids to ArchiveGrid (5%) than the overall mean (30%), but more CARLs (53%) contribute to a consortial database than the overall mean. Only 18% have implemented a blog (overall mean is 49%), but another 47% intend to do so within a year. Very few CARLs have adopted any other web 2.0 communication methods, with the exception of Wikipedia links (43%). Only 17% have a fellowship or grant program for visiting researchers.
Cataloging and metadata

Table 2.16. CARL catalog records (Q. 41-47) [13]

Format                   n    Online   Offline   No Record   Described within Archival Collections
Printed volumes          18   79%      10%       13%         n/a
Archival collections     17   63%      8%        28%         n/a
Manuscripts (items)      8    52%      27%       21%         n/a
Cartographic materials   14   58%      16%       15%         24%
Visual materials         13   21%      9%        31%         47%
Audiovisual materials    12   31%      11%       19%         55%
Born-digital materials   8    57%      0%        0.1%        44%

While catalog record statistics for CARL libraries are very similar to the overall norms for some formats, they differ in several respects. Significantly more cartographic materials (58%, compared to 42%) have online records, as do born-digital materials (57%, compared to 29%). In addition, substantially more visual and audiovisual materials are managed within archival collections, rather than as individual items, than the overall mean.

Backlogs of both printed volumes and other materials have increased in 60% of Canadian institutions. Only 17% reported a decrease in backlogs of either type (compared to overall means of 59% and 44%, respectively). CARL members have 52% of finding aids online, which is 8% more than across the full survey population.

Archival collections management

Some archival management practices are quite different in Canada than in the United States, as the data show in several ways. Only 28% do minimal collections processing, the complete reverse of the overall population, across which 75% do at least some minimal processing. Twenty-eight percent (28%) use EAD, far below the overall mean of 69%. Most use word processing and/or database software for finding aid preparation; few use the Archivists' Toolkit or XML markup tools. The institutional archives reports to the library at 78% of institutions, and the same percentage of libraries have some level of responsibility for institution-wide records management.

Digital special collections

Ninety percent (90%) of CARLs have completed at least one digitization project. On the other hand, 32% cannot undertake projects unless they have special funding. Large-scale digitization of special collections has been more common in Canada than across the overall population (47%; the overall mean is 33%). Few CARLs (11%) have licensing contracts with commercial vendors.

Far more CARL institutions have assigned sole responsibility for management of born-digital materials to libraries or archives (48%) than across the overall population (30%). Born-digital materials have been collected by far fewer CARL libraries, however, than the overall mean for every format, while 33% have collected none whatsoever (overall mean is 21%). In contrast, lack of funding is an impediment to born-digital management in far fewer institutions (44%) than the overall mean (69%). The same is true for lack of administrative support outside the library (22%; overall mean is 41%).

A higher percentage of CARL members have an institutional repository (89%) than any of the other four organizations, but special collections departments in CARL libraries are somewhat less involved with the IR than the overall mean.
Staffing

The mean number of permanent FTE for CARL respondents is eight (four professionals and four paraprofessionals), and the median is six (three professionals and three paraprofessionals). Demographic diversity in special collections is low in CARL libraries: 17% of respondents have Asian staff, 6% Black/African American, 6% Pacific Islander, and none have Native American or Hispanic/Latino staff. Several noted that in Canada such data typically is not tracked, though most provided data.

Staffing in curatorial and public services areas decreased more than the overall means. Staffing for technology and digital services increased, but less than the overall mean. The areas in which the highest percentage of CARLs reported a need for education or training are born-digital records (84%), information technology (74%), cataloging and metadata (68%), intellectual property (63%), history of the book (47%), preservation (47%), and archival processing (47%).

One third of CARL respondents maintain separate special collections units (the same as ARL), and another third have always had only one unit.

Independent Research Libraries Association
http://irla.lindahall.org/

The rate of response by IRLA members was 79% (15 of 19 members), comprising 9% of respondents overall.

Organizational profile

The Independent Research Libraries Association (IRLA) was established in 1972 and currently includes 19 U.S. libraries and one European institution as a Foreign Corresponding Member. [14] IRLA is an informal confederation intended to address the future of independent, privately supported research libraries, including issues such as preserving collections, serving both the public and the world of scholarship, and financing these costly private institutions that lack the stability of public or university support.

Together, IRLA members could be said to comprise a "who's who" of American private research libraries. Some have the most comprehensive collection in the world in their particular area of focus; examples include the Folger Shakespeare Library and the American Antiquarian Society. Many IRLA institutions have a museum in addition to library and archival resources. IRLAs tend to have strong outreach and public programming components, including research fellowships for visiting scholars. Many IRLAs hold only special collections, in contrast with the other organizations, in which general collections have the vast majority of both holdings and users.

IRLA directors meet annually at a member institution and communicate informally throughout the year. The librarians and research directors of four of the largest members (Folger, Antiquarian Society, Huntington, and Newberry, known as FAHN) also meet annually. Thirteen of the 19 regular members (68%) are also in the RLG Partnership, one of which (the New York Public Library) is also an ARL member.

Overall library size and budget

Table 2.17. IRLA overall library size (Q. 7, n=15)

Number of Volumes     Number of IRLAs   Percent of IRLAs
< 1,000,000 volumes   11                73%
1-3 million volumes   3                 20%
3-6 million volumes   —                 —
> 6,000,000 volumes   1                 7%

Nearly three quarters of IRLA respondents have fewer than one million volumes overall.
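The same four size bands recur in each organizational profile (Tables 2.11, 2.17, 2.23, and 2.30). A minimal sketch of how such a tabulation is produced; the per-library volume counts below are made up for illustration:

```python
from collections import Counter

# Bucket overall library sizes into the survey's four bands.
def band(volumes):
    if volumes < 1_000_000:
        return "< 1,000,000 volumes"
    if volumes < 3_000_000:
        return "1-3 million volumes"
    if volumes < 6_000_000:
        return "3-6 million volumes"
    return "> 6,000,000 volumes"

libraries = [450_000, 800_000, 950_000, 2_100_000, 7_500_000]  # hypothetical
counts = Counter(band(v) for v in libraries)
for label, n in counts.most_common():
    print(f"{label}: {n} ({100 * n / len(libraries):.0f}%)")
```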
As noted above, many IRLAs have minimal non-rare holdings.

Table 2.18. IRLA change in overall library funding (Q. 77, n=15)

Change Reported           Percent of Responses
Decreased 1-5%            13%
Decreased 6-10%           53%
Decreased 11-16%          —
Decreased 16-20%          13%
Decreased more than 20%   20%
No change                 —
Increased                 —

The budget data show that 100% of IRLA members have seen their budgets drop as a result of the current global economy, as compared to the overall mean of 75% across the population.

Collections

Table 2.19. IRLA special collections size (Q. 11, n=15) [15]

Format                                   n    Total Items    Mean          Median
Printed volumes                          15   4,800,000      320,000       250,000
Archives and manuscripts (collections)   14   246,000 l.f.   17,500 l.f.   16,500 l.f.
Manuscripts (managed as items)           5    6,100,000      1,220,000     35,000
Cartographic materials                   11   970,000        88,100        4,000
Visual materials                         11   11,500,000     1,000,000     425,000
Audio materials                          5    690,000        138,000       100
Moving-image materials                   4    62,000         15,600        300
Born-digital materials                   3    3,200 GB       1,100 GB      163 GB
Microforms                               10   374,000        38,000        26,000
Artifacts                                6    12,000         2,000         950

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

Most IRLA members reported printed volume holdings in special collections of 200,000 to 900,000, while three that principally have non-rare collections reported between 10,000 and 33,000 printed volumes in special collections. The mean and median holdings of printed volumes, manuscript items, cartographic materials, and microforms are higher across the IRLA libraries than any of the other four organizations surveyed. About 45% of the aggregate cartographic holdings are in IRLAs.

Forty-seven percent (47%) of IRLAs had less funding for collections in 2008–09 than in 2000, while 40% had more funding. Significantly more IRLA libraries have informal collaborative collection development arrangements with other non-regional institutions in their nation (27%) than the overall mean (16%). Only one IRLA institution has a formal collaboration in any category; that same relationship is the only international one undertaken by any IRLA. Twenty-seven percent (27%) have special collections housed in offsite storage, far below the overall mean (67%).

IRLAs generally rank their preservation needs somewhat lower than the overall means. This is particularly true for audiovisual materials (only 40% of IRLA respondents reported having a high level of need, compared to 62% across the population). Given both the collection size statistics and the nature of IRLA collecting priorities, it seems likely that they hold far fewer unstable visual media such as photographs than is the case across the overall population, and overall IRLA holdings of audio and moving-image materials are small.
User services

Table 2.20. IRLA onsite visits (Q. 24, n=14)

User type              n    Number of Visits   Percent of Total   Mean    Median
Faculty and staff      2    1,575              1%                 788     788
Graduate students      0    —                  —                  —       —
Undergraduates         0    —                  —                  —       —
Visiting researchers   10   76,838             66%                7,683   2,432
Local community        4    4,402              4%                 1,100   751
Other                  5    33,673             29%                6,735   5,043
Total                       116,488            100%               8,321   4,386

IRLA respondents have the highest mean number of onsite users across the overall population: 33% more than ARL members and 10% more than RLG Partners. The level of use speaks to the high profile of IRLA libraries' strengths within their collecting foci. Most IRLA libraries reported user statistics only for staff and visiting researchers, since they have no affiliated students. Four members provided counts of local community users, while 29% of the total users reported were ambiguously categorized as "Other." IRLA libraries share some user categories that were too granular for use in this survey, such as research fellows funded by the library.

Onsite use has increased at a smaller percentage of IRLAs (40%) than the overall mean (62%). Use of books has increased at some (36%), but less than the overall mean (52%).

Table 2.21. IRLA presentations (Q. 38, n=12)

Audience                                        n    Number of Presentations   Percent of Total   Mean   Median
College/University courses                      12   723                       37%                72     21
Others affiliated with responding institution   10   354                       18%                35     23
Local community                                 12   611                       31%                51     28
Other visitors                                  11   285                       14%                26     10
Total                                                1,973                     100%               164    78

Given that IRLA institutions have no affiliated students, the mean number (72) of college or university course presentations is impressive relative to the academic library respondents. It is, for example, 80% of the ARL mean.

No IRLA libraries permit interlibrary loan of original materials, while 33% loan reproductions. IRLAs generally charge more for digital scans than the rest of the survey population: 40% charge $10-$20, and 40% charge more than $20. Fewer contribute finding aids to a consortial database (33%) than the overall mean (42%).

IRLAs are far above the overall means in implementation of nearly all web 2.0 communication methods, including YouTube (50% vs. 25% overall), podcasting (57% vs. 26%), applications for mobile devices (23% vs. 11%), a social networking presence such as Facebook (79% vs. 40%), and Twitter (57% vs. 40%). This suggests that external publicity and marketing are a high priority. Eighty-six percent (86%) have a fellowship or grant program for visiting researchers (overall mean is 37%).

Cataloging and metadata

Table 2.22. IRLA catalog records [16]

Format                   n    Online   Offline   No Record   Described within Archival Collections
Printed volumes          15   87%      6%        8%          n/a
Archival collections     14   35%      18%       51%         n/a
Manuscripts (items)      10   35%      19%       48%         n/a
Cartographic materials   15   38%      38%       20%         8%
Visual materials         13   21%      28%       28%         23%
Audiovisual materials    11   31%      5%        39%         17%
Born-digital materials   5    5%       0%        70%         25%

Percentages of online catalog records at IRLA libraries are similar to the overall means, with three exceptions, for each of which IRLAs have a much lower percentage online: archival collections (35%), manuscript items (35%), and born-digital materials (5%).
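Comparisons like the one above are simple gap calculations against the overall means. A minimal sketch: the IRLA figures come from Table 2.22, the overall means are the ones cited elsewhere in this chapter (roughly 63% for archival collections and 29% for born-digital materials), and the 10-point threshold is an arbitrary stand-in for "much lower":

```python
# Flag formats where an organization's share of online catalog records
# falls well below the overall survey mean.
overall_online = {"Archival collections": 63, "Born-digital materials": 29}
irla_online = {"Archival collections": 35, "Born-digital materials": 5}

THRESHOLD = 10  # percentage points; an arbitrary illustrative cutoff

for fmt, overall in overall_online.items():
    gap = overall - irla_online[fmt]
    if gap >= THRESHOLD:
        print(f"{fmt}: {irla_online[fmt]}% online, {gap} points below the mean")
```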
A much lower percentage of finding aids (25%) is online at IRLA libraries than the overall mean (44%).

Archival collections management

Some minimal archival processing is done by 60% of IRLAs. EAD is used by 87%, at about the same level as ARL and far above the overall mean (69%). Few IRLAs use database software for finding aid creation, while more than the overall average use the Archivists' Toolkit and XML markup tools. The institutional archives reports to the library in 80% of IRLA libraries, and 80% are responsible, at some level, for institution-wide records management.

Digital special collections

Nearly half (47%) of IRLAs must have special funding in order to undertake digitization projects, more than double the overall mean (22%). An exceptionally high 73% have licensing contracts with commercial vendors, nearly triple the mean of 26%. The fact that IRLAs attract so much interest from vendors is another indication of the world-class depth of collections in IRLA libraries' areas of emphasis, as well as of the necessity of earned income as a factor in ensuring financial stability.

Only 21% of IRLA members have assigned responsibility for management of born-digital materials to any unit. Regardless, members have collected most born-digital formats at roughly the overall mean levels, with the exception of Web sites (collected by only 7%). Thirty-six percent (36%) have collected no born-digital materials of any kind. Lack of funding is an impediment to implementing born-digital management for 79%, and a higher percentage of IRLAs than the overall norm also report that lack of expertise is a major impediment (64%). Few (14%) cite lack of administrative support as a barrier. Only 36% of IRLAs have an institutional repository.

Staffing

The mean number of permanent FTE for IRLA respondents is 32 (21 professionals and 11 paraprofessionals). The median is 29 (20 professionals and 9 paraprofessionals). This is by far the highest mean across the overall population. Demographic diversity exists in more IRLA libraries than the overall means: 60% of respondents have Asian American staff in special collections, 40% have Black/African American, and 47% have Hispanic/Latino.

The areas in which the highest percentage of IRLAs reported a need for education or training are born-digital records (80%), information technology (80%), records management (67%), archival processing (67%), cataloging and metadata (60%), foreign languages (53%), intellectual property (53%), and history of the book (40%).

Oberlin Group
http://www.oberlingroup.org/

The rate of response by Oberlin members was 49% (39 of 80 members), comprising 23% of respondents overall.

Organizational profile

The Oberlin Group is an unincorporated, informal confederation of 80 liberal arts colleges, many of whose directors have been meeting annually since 1986. Sixteen members are universities rather than colleges (e.g., Wesleyan University). The principal areas of focus for the organization are library issues of common concern, best practices in library operations and services, licensing of electronic resources of interest to member institutions, cooperative resource sharing, and establishing communities of practice.
Named for the site of the group's first conference, Oberlin College, the Group is successful not only at hosting discussions of, but also at implementing solutions to, the challenges faced by liberal arts college libraries today. The group's activities include reciprocal interlibrary loan, annual statistical surveys, other in-year surveys, and a small number of consortial contracts for electronic journals and reference services subscriptions brokered by one member on behalf of any who wish to participate. Subsets of the group engage in other cooperative projects such as advocacy for open access and new forms of scholarly communication, collaborative collecting, digital access agreements, and a digital repository. The library directors meet annually at a member institution. Swarthmore College is also a member of the RLG Partnership.

Overall library size and budget

Table 2.23. Oberlin overall library size (Q. 7, n=39)

Number of Volumes     Number of Oberlins   Percent of Oberlins
< 1,000,000 volumes   32                   82%
1-3 million volumes   7                    18%
3-6 million volumes   —                    0%
> 6,000,000 volumes   —                    0%

The relatively small size of Oberlin libraries' overall collections reflects their support of campuses that principally educate undergraduates and therefore do not require the intensive research collections needed to support doctoral courses and research.

Table 2.24. Oberlin change in overall library funding (Q. 77, n=39)

Change Reported           Percent of Responses
Decreased 1-5%            36%
Decreased 6-10%           8%
Decreased 11-16%          18%
Decreased 16-20%          —
Decreased more than 20%   5%
No change                 23%
Increased                 10%

The pattern of change in overall library funding for Oberlin libraries is fairly similar to that of respondents overall.

Collections

Table 2.25. Oberlin special collections size (Q. 11, n=39) [17]

Format                                   n    Total Items    Mean         Median
Printed volumes                          39   1,100,000      28,600       15,000
Archives and manuscripts (collections)   37   114,000 l.f.   3,100 l.f.   2,750 l.f.
Manuscripts (managed as items)           12   37,000         3,100        1,000
Cartographic materials                   17   6,300          373          168
Visual materials                         18   1,140,000      64,000       22,500
Audio materials                          21   36,000         1,700        450
Moving-image materials                   19   17,500         919          300
Born-digital materials                   11   2,200 GB       200 GB       70 GB
Microforms                               12   99,000         6,300        400
Artifacts                                20   17,500         875          175

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

The aggregate Oberlin special collections form a very small percentage of all materials reported across the population. For example, Oberlin libraries hold 4% of the printed volumes. Again, this follows readily from the fact that Oberlin colleges focus on undergraduate education rather than postgraduate-level research.

Table 2.26. Range of Oberlin special collections sizes (Q. 11, n=39)

Size range                   Number of Oberlins   Percent of Oberlins
174,000 to 190,000 volumes   2                    5%
50,000-86,000 volumes        3                    7%
25,000-50,000 volumes        8                    21%
10,000-25,000 volumes        12                   31%
2,000-10,000 volumes         12                   31%
< 250 volumes                2                    5%

Two Oberlin libraries hold 33% of the printed volumes in special collections across the 39 responding libraries.
Other Oberlins' special collections holdings are far smaller than these two.

Somewhat more Oberlins have informal collaborative collection development arrangements with other regional or local institutions (54%) than the overall mean (45%). Only one Oberlin institution has a formal collaboration in any category, and none collaborate internationally.

User services

Table 2.27. Oberlin onsite visits (Q. 24, n=37)

User type              n    Number of Visits   Percent of Total   Mean   Median
Faculty and staff      34   5,717              20%                168    85
Graduate students      3    71                 0.2%               24     25
Undergraduates         33   13,649             47%                414    337
Visiting researchers   27   2,929              10%                108    62
Local community        18   1,804              6%                 100    37
Other                  15   4,996              17%                333    77
Total                       29,166             100%               788    731

The data for onsite visits to Oberlin special collections show that undergraduates comprise nearly half of all users. Given the core mission of these colleges, it is intriguing that the percentage is not even higher. Only 17% of users were reported as "Other" (overall mean is 43%). More Oberlin libraries reported increased use by affiliated faculty/staff (77% of respondents) and undergraduates (82%) than in the other four organizations surveyed.

Ten percent (10%) of users are visiting researchers, which indicates that at least some Oberlins hold special collections of research caliber and serve a population beyond their primary, college-affiliated users. In fact, 15% have a fellowship or grant program for visiting researchers.

Table 2.28. Oberlin presentations (Q. 38, n=39)

Audience                                        n    Number of Presentations   Percent of Total   Mean   Median
College/University courses                      38   788                       59%                21     16
Others affiliated with responding institution   32   185                       14%                6      4
Local community                                 27   181                       13%                7      3
Other visitors                                  24   187                       14%                8      3
Total                                                1,341                     100%               34     27

The mean number of staff in Oberlin special collections is three (the median is two), and many individuals therefore have a wide range of responsibilities. This is meaningful in evaluating the low numbers of presentations.

Ninety percent (90%) of Oberlins permit digital cameras in the reading room. Four gave reasons for not permitting their use: the concern most commonly cited (by three of the four) was the potential for inappropriate use of the digital files (such as violation of copyright). Most Oberlin libraries charge much less for digital scans than the rest of the survey population: 31% provide scans at no charge, 39% charge less than $5, and only 5% charge more than $10. Ten percent (10%) contribute archival finding aids to ArchiveGrid, while 13% have no finding aids online. Far fewer Oberlins have implemented web 2.0 communication methodologies than the norms across the survey population.
Cataloging and metadata

Table 2.29. Oberlin catalog records (Q. 41-47) [18]

Format                   n    Online   Offline   No Record   Described within Archival Collections
Printed volumes          39   87%      6%        8%          n/a
Archival collections     38   35%      18%       51%         n/a
Manuscripts (items)      28   35%      19%       48%         n/a
Cartographic materials   31   32%      10%       36%         26%
Visual materials         34   10%      10%       58%         24%
Audiovisual materials    34   18%      9%        46%         28%
Born-digital materials   23   18%      4%        50%         32%

The Oberlin statistics for online catalog records differ markedly from—and are much lower than—the overall means in two areas: only 35% of archival collections and manuscript items have an online record, as do only 1% of born-digital materials. Oberlin respondents have 31% of their finding aids online (overall mean is 44%).

Archival collections management

EAD is used at 44% of Oberlins (overall mean is 69%). The institutional archives reports to the library at every responding institution. Formal responsibility for records management is assigned to 31% of the libraries, while informal responsibility falls to another 36% because no other institutional unit is responsible.

Digital special collections

While nearly half have a digitization program within special collections, only 26% have such a program library-wide. Noticeably more Oberlins do digital image production within special collections (82%) than the overall mean (71%). Thirteen percent (13%) of Oberlins have licensing contracts with commercial vendors, far fewer than the overall mean.

Only 21% of Oberlin respondents have assigned responsibility for management of born-digital materials to any unit. Few have collected born-digital materials in private archival and manuscript collections (16%), and only one has collected data sets. A striking 74% state that lack of administrative support outside the library is an impediment to implementation of born-digital management.

Fifty-six percent (56%) of Oberlins have an institutional repository. Special collections staff are involved with its implementation at all those that have an IR.

Staffing

The mean number of permanent FTE for Oberlin respondents is three (two professionals and one paraprofessional). The median is two (one professional and one paraprofessional). Demographic diversity is much lower across the Oberlin libraries than the overall means for two population groups: Black/African American (10% of respondents; overall mean is 35%) and Hispanic/Latino (10% of respondents; overall mean is 30%).

Staff size has generally been stable in all functional areas of special collections across Oberlin libraries, including technology and digital services (for which the overall mean is a 44% increase). The top areas in which Oberlin libraries reported a need for education or training are born-digital records (80%), records management (49%), cataloging and metadata (43%), information technology (40%), intellectual property (40%), and preservation (40%).

Sixty-two percent (62%) have always had only one special collections unit—not surprising, since most of the member libraries are relatively small. Of the others, 23% have integrated formerly separate units, and 15% remain separate.
RLG Partnership
www.oclc.org/research/partnership/

The rate of response by members of the RLG Partnership was 65% (55 of 85 U.S. and Canadian partners), comprising 33% of respondents overall.

Organizational profile

Approximately 100 institutions are currently affiliated with the RLG Partnership. Unlike the other four organizations surveyed, the Partnership is heterogeneous with regard to the types of affiliated institutions, which include universities, independent research libraries, public and national libraries, museum libraries and archives, historical societies, public libraries, and other archival institutions (including the U.S. National Archives and Records Administration). The Partnership also has members beyond North America; the largest group is located in the United Kingdom, while others are in continental Europe, Japan, Australia, and New Zealand. Members outside North America were not included in the survey population due to variations in practice across nations.

The Partnership is most beneficial to libraries and other research institutions that want to invest in collaboratively designing future services. It is a global alliance of like-minded institutions that focuses on making operational processes more efficient and shaping new scholarly services by directly engaging senior managers. The Partnership is supported by the full capacities of OCLC Research, informed by an international, system-wide perspective, and connected to the broad array of OCLC products and services. Activities include, among others, reciprocal interlibrary lending and document supply through the SHARES program, applied research into challenges and questions facing research libraries and museums, numerous projects focused on the concerns of special collections and archives, and active programs of webinars and publications. All partners are invited to an annual meeting and topical symposium.

The RLG Partnership traces its development to the founding of the Research Libraries Group in 1974. In 2006, RLG combined with OCLC, and the programmatic staff and activities were integrated into OCLC Research, which has long been one of the world's leading centers devoted exclusively to the challenges facing libraries in a rapidly changing information technology environment. Since being founded in 1978, the office has investigated trends in technology and library practice with a view to enhancing the value of library services. The RLG Committee of the OCLC Board of Trustees is entrusted with governance of the Partnership. OCLC is an international not-for-profit library cooperative whose members work together to improve access to the information held in libraries around the globe.

Approximately one third of RLG Partners are also members of ARL. Thirteen of the 19 members of IRLA are also in the RLG Partnership, as are four CARL members. Swarthmore College is a member of the Oberlin Group.
Overall library size and budget

Table 2.30. RLG Partnership overall library size (Q. 7, n=51)

Number of Volumes         Number of RLGs   Percent of RLGs
No printed volumes [19]   1                2%
< 1,000,000 volumes       19               37%
1-3 million volumes       7                14%
3-6 million volumes       8                16%
> 6,000,000 volumes       16               31%

The distribution of library sizes reflects the wide range of institution types across the Partnership. For example, many of the libraries holding less than one million volumes are also IRLA members, while most of the largest are also in ARL.

Table 2.31. RLG Partnership change in overall library funding (Q. 77, n=51)

Type of Change Reported   Percent of Responses
Decreased 1-5%            24%
Decreased 6-10%           29%
Decreased 11-16%          8%
Decreased 16-20%          10%
Decreased more than 20%   14%
No change                 10%
Increased                 6%

The pattern of change in overall library funding for RLG Partnership libraries is fairly similar to that of respondents overall.

Collections

Table 2.32. RLG Partnership special collections size (Q. 11, n=48) [20]

Format                                   n    Total Items      Mean          Median
Printed volumes                          47   13,500,000       287,000       202,000
Archives and manuscripts (collections)   48   1,510,000 l.f.   31,400 l.f.   18,000 l.f.
Manuscripts (managed as items)           28   6,400,000        228,000       958
Cartographic materials                   32   1,400,000        44,400        3,100
Visual materials                         37   45,500,000       1,230,000     400,000
Audio materials                          26   1,600,000        61,500        4,700
Moving-image materials                   26   206,000          7,900         1,500
Born-digital materials                   19   50,000 GB        2,600 GB      114 GB
Microforms                               31   700,000          22,500        5,000
Artifacts                                25   76,000           465           2,000

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

Fifty-eight percent (58%) of RLG Partner respondents had more funding for collections in 2008–09 than in 2000. The percentage of holdings across the survey population that are held in RLG Partnership libraries ranges from 70% of cartographic materials down to 29% of moving-image materials and 15% of manuscripts managed as items. RLG Partners hold roughly half of the aggregate total of each of the other formats.

Table 2.33. Percentage of all survey holdings held by RLG Partnership libraries (Q. 11, n=48)

Format                                Total items across survey population   Total items in RLG   Percent in RLG
Printed volumes                       30,000,000       13,500,000       45%
Archival and manuscript collections   3,000,000 l.f.   1,510,000 l.f.   50%
Manuscripts (managed as items)        44,000,000       6,400,000        15%
Cartographic materials                2,000,000        1,400,000        70%
Visual materials (two-dimensional)    90,000,000       45,500,000       51%
Audio materials                       3,000,000        1,600,000        53%
Moving-image materials                700,000          206,000          29%
Born-digital materials                85,000 GB        50,000 GB        59%
Microforms                            1,300,000        700,000          54%
Artifacts                             154,000          76,000           49%

Note: Archival and manuscript collections were counted in linear feet (l.f.) and born-digital materials in gigabytes (GB).

Somewhat more RLG Partnership libraries have informal collaborative collection development arrangements with other non-regional institutions in their nation (24%) than is the case across the overall population. Seven RLG Partner institutions have formal collaborations, and two have formal international arrangements.
User services

Table 2.34. RLG Partnership onsite visits (Q. 24, n=43)

User type              n    Number of Visits   Percent of Total   Mean    Median
Faculty and staff      26   27,569             8%                 1,060   612
Graduate students      18   16,867             5%                 957     423
Undergraduates         18   15,526             5%                 863     550
Visiting researchers   30   110,996            34%                3,700   863
Local community        17   22,625             7%                 1,331   435
Other                  22   129,879            40%                5,904   1,854
Total                       323,462            100%               7,522   4,482

The mean number of onsite users at RLG Partnership libraries is the second highest across the overall population (IRLAs are 10% higher). The medians for RLG Partners and IRLA libraries are very similar.

Onsite use has increased across noticeably more RLG Partnership libraries (75%) than across the rest of the population. In particular, use by visiting researchers has more often increased (78% of RLG Partnership respondents).

Table 2.35. RLG Partnership presentations (Q. 38, n=48)

Audience                                        n    Number of Presentations   Percent of Total   Mean   Median
College/University courses                      43   4,417                     47%                103    59
Others affiliated with responding institution   42   1,601                     17%                38     18
Local community                                 40   2,282                     24%                57     20
Other visitors                                  38   1,009                     11%                27     16
Total                                                9,309                     100%               194    101

The RLG Partnership means for all types of presentation audiences except local community members are noticeably higher than those of ARL libraries.

Of the twelve RLGs that gave reasons for not permitting cameras in the reading room, 83% cited concerns about both copyright and improper handling of materials. Loss of revenue from reproduction services was of concern to 58%, and 67% cited potential disruption in the reading room. These three concerns were expressed by far fewer respondents across the other organizations surveyed.

RLG Partnership libraries generally charge more for digital scans than the overall population: although 35% charge less than $10, 25% charge more than $20. Far more contribute finding aids to ArchiveGrid (48%) than the overall mean (30%), reflecting this database's origins within the Research Libraries Group. Significantly more RLGs have implemented web 2.0 communication methods than the overall population: 61% have a blog, 46% create podcasts, 30% use an institutional wiki, and 54% have a social networking presence such as a Facebook page. A fellowship or grant program for visiting researchers is in place at 57% of RLG Partnership libraries.

Cataloging and metadata

Table 2.36. RLG Partnership catalog records (Q. 41-47) [21]

Format                   n    Online   Offline   No Record   Described within Archival Collections
Printed volumes          47   86%      6%        9%          n/a
Archival collections     47   64%      12%       26%         n/a
Manuscripts (items)      34   64%      22%       13%         n/a
Cartographic materials   39   45%      24%       21%         16%
Visual materials         44   33%      10%       24%         37%
Audiovisual materials    38   27%      8%        34%         36%
Born-digital materials   30   28%      1%        32%         39%

The 2010 RLG Partnership data for online catalog records is nearly identical to that of the overall population, with one exception: about 10% more archival and manuscript holdings have online records. RLG Partners have 49% of their finding aids online.

Archival collections management

The institutional archives reports to the library at 75% of RLGs.
Fifty percent (50%) have formal responsibility for records management, far above the overall mean (30%), and a total of 87% have some level of responsibility for this activity.

Digital special collections

Most RLG Partners either have already done large-scale digitization of special collections (46%) or plan to do so (42%); both figures are well above the overall means. More RLGs have licensing contracts with commercial vendors (39%) than the overall mean.

RLG Partnership libraries have collected every born-digital format listed in the survey at a somewhat higher rate than the overall means: 71% have collected digital photographs, which is the highest percentage for any born-digital format across the survey population. A lower percentage of RLGs report various impediments to born-digital management than members of the other organizations; in particular, lack of administrative support is far less often an issue (28%, compared to 41%).

Staffing

The mean number of permanent FTE for RLG Partnership respondents is 25 (fifteen professionals and ten paraprofessionals). The median is twelve (eight professionals and four paraprofessionals). Every demographic group listed in the survey is represented among special collections staff in up to 15% more RLG Partnership libraries than the overall means. Staff size decreased in public services at a higher percentage of RLGs (39%) than the overall mean decrease across the population (23%).

The areas in which the highest percentage of RLG Partners cited a need for education or training are born-digital materials (84%), information technology (77%), intellectual property (57%), cataloging and metadata (55%), records management (45%), and archival processing (41%).

Thirty-six percent (36%) of RLG Partnership libraries maintain separate special collections units, 10% have always had only one, and 16% consist entirely of special collections (the latter generally are IRLA members that also belong to the RLG Partnership).

Notes

[1] Two additional ARL members that hold more than six million volumes were not members in 1998: the National Library of Medicine and the New York Public Library.
[2] ARL added its 125th member, the University of Ottawa, in the spring of 2010 after we had closed data collection.
[3] Numbers were rounded in this and other collections tables. See the data supplement for exact figures.
[4] This comparison is imperfect due to differences in the respondent population, yet we feel it can reasonably be made because 84% of our ARL respondents also responded to the 1998 survey.
[5] Data for the two largest U.S. respondents were excluded to avoid skewing the overall means: the Library of Congress, which has an exceptionally large materials budget, and the National Archives and Records Administration, which has no acquisitions funds because it acquires all materials by transfer from government agencies. Note that the Total figures are not simple combinations of "Institutional" and "Special" because it is not statistically valid to sum means or medians across subgroups; they were therefore recalculated from the combined data.
[6] One ARL respondent reported 20 million manuscript items, which dramatically skews the mean upward.
7. Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of +10% in each response. Data from an individual respondent totaling more than 110% for a particular format were omitted from all calculations.
8. ARL used "uncataloged," "card catalog," and "MARC record" as the three possible categories for status of access; in contrast, we used "no record of any kind," "print-only," and "online." We intended that each of our terms be roughly equivalent to the corresponding ARL term, but some respondents may have interpreted them somewhat differently.
9. In contrast, in its annual salary survey, ARL counts the percentage of staff in each demographic group. ARL's 2009–10 data shows these percentages: 85.7% Caucasian/Other, 6.4% Asian or Pacific Islander, 4.6% Black, 2.8% Hispanic, and 0.5% American Indian or Native Alaskan.
10. CARL added its 32nd member, Ryerson University, in 2010 after we had closed data collection.
11. See Canadiana.org n.d.
12. Numbers were rounded in this and other collections tables. See the data supplement for exact figures.
13. Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of +10% in each response. Data from an individual respondent totaling more than 110% for a particular format were omitted from all calculations.
14. The foreign member is the Herzog August Bibliothek in Wolfenbüttel, Germany, which was not surveyed.
15. Numbers were rounded in this and other collections tables. See the data supplement for exact figures.
16. Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of +10% in each response. Data from an individual respondent totaling more than 110% for a particular format were omitted from all calculations.
17. Numbers were rounded in this and other collections tables. See the data supplement for exact figures.
18. Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of +10% in each response. Data from an individual respondent totaling more than 110% for a particular format were omitted from all calculations.
19. The U.S. National Archives and Records Administration does not collect printed volumes.
20. Numbers were rounded in this and other collections tables. See the data supplement for exact figures.
21. Percentages for each row sometimes add up to slightly more than 100% because we allowed a margin of error of +10% in each response. Data from an individual respondent totaling more than 110% for a particular format were omitted from all calculations.
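Notes 7, 13, 16, 18, and 21 all describe the same screening rule for per-format percentage responses. A minimal sketch of that rule follows (Python; the field names and data layout are illustrative assumptions, not the survey's actual data model):

# Screening rule from the notes: per-format percentages may total up to
# 110% (a +10% margin of error); a respondent exceeding that threshold
# is omitted from all calculations for that format.

def within_margin(percentages, limit=110):
    """Return True if a respondent's percentages for one format are usable."""
    return sum(percentages) <= limit

responses = [
    {"respondent": "A", "format": "Visual materials", "pcts": [33, 10, 24, 37]},
    {"respondent": "B", "format": "Visual materials", "pcts": [60, 30, 25, 10]},
]

usable = [r for r in responses if within_margin(r["pcts"])]
# Respondent A totals 104% and is kept; respondent B totals 125% and is
# dropped for this format, as the notes describe.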
3. Conclusion and Recommendations

The 1998 ARL report (Panitch 2001) concluded with a chapter titled "Areas of concern" in which observations were framed around five questions, three of which are of particular interest in the context of our work:

• Will ARL institutions be able to continue collecting the special collections materials needed for teaching and scholarship?
• Is adequate intellectual access being provided for special collections materials?
• Are staff levels and available skills appropriate to support the growing size and scope of special collections?

These are big issues, not easily addressed, and more than a decade later, they remain outstanding. To ARL's unanswered questions, we add some of our own that we see as among the most central of those that we posed throughout this report:

• Is dramatic growth of collections sustainable? If not, what should change?
• Why are formal collaborative collection development partnerships so rare?
• Why are so many backlogs continuing to increase?
• Why hasn't the emphasis on sustainable metadata methodologies had more payoff?
• Does the level of use of special collections justify the resources being expended?
• What constitutes an effective large-scale digitization project?
• Can we collaborate to complete the corpus of digitized rare books?
• What would best help us jump-start progress on managing born-digital archival materials?

The recommendations for action that follow echo many of these questions and suggest concrete steps for moving forward.

Action Items

A core goal of this research is to incite change that transforms special collections. In that spirit, we present a set of recommended action items (also threaded throughout the Executive Summary). We focused on issues that warrant shared action, but individual institutions could take immediate steps to effect change locally. Regardless, responsibility for accomplishing change must be distributed: all concerned must take ownership.

Assessment

• Develop and promulgate metrics that enable standardized measurement of key aspects of special collections use and management.

Collections

• Identify barriers that limit collaborative collection development. Define key characteristics and desired outcomes of effective collaboration.
• Take collective action to share resources for cost-effective preservation of at-risk audiovisual materials.

User Services

• Develop and liberally implement exemplary policies that facilitate, rather than inhibit, access to and interlibrary loan of rare and unique materials.

Cataloging and Metadata

• Compile, disseminate, and adopt a slate of replicable, sustainable methodologies for cataloging and processing to facilitate continued exposure of materials that remain hidden and to stop the growth of backlogs.
• Develop shared capacities to create metadata for published materials, such as maps and printed graphics, for which cataloging resources appear to be scarce.
• Convert legacy finding aids using affordable methodologies to enable Internet access. Resist the urge to upgrade or expand the data prior to conversion of print-only finding aids. Develop tools to facilitate conversion from local databases.

Digitization

• Develop models for large-scale digitization of special collections, including methodologies for selection of appropriate collections, security, safe handling, sustainable metadata creation, and ambitious productivity levels.
• Determine the scope of the existing corpus of digitized rare books, differentiating those that are available as open access from those that are licensed. Identify the most important gaps and implement collaborative projects to complete the corpus.

Born-Digital Archival Materials

• Define the characteristics of born-digital materials that warrant their management as "special collections."
• Define a reasonable set of basic steps for initiating an institutional program for managing born-digital archival materials.
• Develop use cases and cost models for selection, management, and preservation of born-digital archival materials.

Staffing

• Confirm high-priority areas in which education and training opportunities are not adequate for particular segments of the professional community. Exert pressure on appropriate organizations to fill the gaps.

Next Steps

We invite readers to challenge themselves, their parent institutions, the membership organizations to which their institutions belong, and their professional societies to engage with the issues raised by this report. Which recommended actions warrant high priority is open to debate, and we look forward to participating in the conversation. In some cases, relevant projects are already underway. Examples include these:

• The Council on Library and Information Resources is in the third year of its Hidden Collections initiative, which encourages grantees to devise sustainable methodologies for cataloging or processing. We look to CLIR (2010) to actively promulgate the best of these approaches.
• ARL is a leader in the development of metrics. Its Statistics and Assessment Committee is beginning to revisit the value of some statistics and how best to take into consideration issues such as collective collecting, digital libraries, and special collections.[1]
• The ARL Transforming Special Collections in the Digital Age Working Group is studying four broad areas as its 2010-11 agenda: digitization, born-digital materials, legal issues, and collections (ARL 2010b).
• Professional associations generally promote and support the educational needs of their members. Within the special collections and archives realm, ACRL's Rare Books and Manuscripts Section and the Society of American Archivists play leading roles.
• OCLC Research has projects underway in two areas: streamlining workflows for interlibrary loan of special collections (2009) and identifying successful approaches to large-scale digital capture (2010a).

Note

1. E-mail exchange with Sue Baughman, Associate Deputy Executive Director of ARL, 5 July 2010.

Appendix A. Survey Instrument

Survey questions

A facsimile of the survey instrument is provided on the following pages.

Special Collections & Archives in Academic & Research Libraries

Introduction

This OCLC Research survey explores the state of special collections and archives in academic and research libraries in the United States and Canada.
We seek to identify norms across the community and thereby help define needs for community action and research.

Only one response per institution is permitted. If you have more than one special collections or archives unit, please combine data from all units. We recognize that surveying all units may not be feasible for some respondents. Supplying the broadest possible data will, however, make clear your institution's overall level of distinction and add to our view of the rare and unique materials held across the community.

The survey may take from one to several hours to complete, depending on the availability of statistical data and whether or not you will be combining data from multiple units. You may wish to print the PDF version as a working copy for data gathering.

Your responses on a particular page are saved each time you click on a "forward" or "back" button. Do not use your browser's navigation arrows. You need not complete the survey in one sitting; you can re-enter to update or correct your data at any time until the survey closes on December 18th. Always enter using the URL in the survey invitation that we sent to your director by email.

If your institution has no special collections, please provide your contact information and respond to the yes/no question that follows. Your response will help complete our overall view of academic and research library collections.

Please submit your completed response by December 18, 2009. OCLC Research will publish the survey results in mid-2010. Participating institutions will be identified, but no data will be associated with individual respondents. Contact information will be held confidential.

Address questions to Jackie Dooley, Consulting Archivist, OCLC Research (dooleyj@oclc.org or 949.492.5060). For technical problems of any kind, contact Jeanette McNicol (mcnicolj@oclc.org or 650.287.2133).

Respondent Information

1. Contact Information*
Name / Title / Institution / Country / E-mail / Telephone

2. How would you prefer to be contacted if we have any follow-up questions?*
• E-mail
• Telephone
• I prefer not to be contacted

3. Consortial memberships (check all that apply)*
• ARL
• CARL (Canada)
• IRLA
• Oberlin
• RLG Partnership

4. Type of institution*
• University
• College
• Independent research library
• Museum
• National library
• Historical society
• Governmental library
• Public library
• Other (please specify)
5. Public or private institution*
• Public (base funding source is governmental)
• Private (base funding is from a non-governmental source)
• Both/Hybrid

Definition of Special Collections

Special collections are library and archival materials in any format (e.g., rare books, manuscripts, photographs, institutional archives) that are generally characterized by their artifactual or monetary value, physical format, uniqueness or rarity, and/or an institutional commitment to long-term preservation and access. They generally are housed in a separate unit with specialized security and user services. Circulation of materials usually is restricted. The term "special collections" is used throughout to refer to all such types of materials. This definition excludes general collections characterized by format or subject specialization (e.g., published audiovisual materials, general library strength in Asian history), as well as materials managed as museum objects.

6. Does your institution have special collections?*
• Yes
• No

Instructions

Please respond with regard to materials held in special collections and archives units only. If your library consists primarily of special collections (i.e., you have no "general" collections or reading room), respond with regard to the entire library. Exclude other organizational units (e.g., museum curatorial units; research or fellowship programs) that do not report under a library or archives in your institution.

Use your institution's latest twelve-month "statistical year" that ended prior to July 1, 2009 for statistical questions. (In cases where you do not have formal statistics, we encourage reasonable estimates to minimize the time you will spend.) Respond to all other questions based on your current practices.

Practices vary across institutions, which may render some questions ambiguous for a particular respondent. Use your best judgment to interpret each question for your circumstances.

Text boxes have no word limit; you may exceed the size of any box. Each page concludes with an open comment box for any additional thoughts or details.

Please submit the survey online to avoid inadvertent data input errors on our part.
If you prefer to respond on paper, please print the PDF version, clearly enter all data, and mail to:

Special Collections and Archives Survey
OCLC Research
777 Mariners Island Blvd., Suite 550
San Mateo, CA 94404 USA

Collections

7. Indicate the total number of printed volumes (refer to Appendix A for definition of scope) in your institution's overall library collections, both general and special. (For libraries that report annual statistics to ARL, this is your total "Volumes in library.")
• No printed volumes
• Fewer than 1 million volumes
• 1 to 3 million volumes
• 3 to 6 million volumes
• More than 6 million volumes

8. Information about your institution's separate special collections libraries and archives will help us understand the scope of your data. Units may be separate administratively and/or physically.
Total separate units across the institution / Number of separate units included in your response

9. Name the special collections unit(s) for which you are reporting data.

10. Name any special collections units for which you are not reporting data.

11. Estimate the size of your special collections by physical unit (except where indicated below) for each format of material as of 2008/2009. Important: Consult Appendix A to determine in which category to report formats more specific than those listed below (e.g., count pamphlets as volumes, postcards as visual materials). Special collections and archives often manage materials in certain formats as integral parts of archival or manuscript collections. When this is the case, 1) include them in the linear foot count for archival and manuscript collections, and 2) enter "0" on the line for the specific format (you may optionally report item counts for such formats in Question 12). Conversely, enter below the counts for any special formats that you manage as items.
• Printed volumes
• Archives and manuscripts (managed as collections; count linear ft.)
• Manuscripts (managed as items; count physical units)
• Cartographic materials
• Visual materials
• Audio materials
• Moving image materials
• Born-digital materials (gigabytes)
• Microforms
• Artifacts
• Other

12. This optional question is for item-level counts of materials included within archival and manuscript collections (counted in Question 11); for example, to report how many photographs your institution manages within archival collections. Leave blank for any formats already counted as items in Question 11.
• Cartographic materials
• Visual materials
• Audio materials
• Moving image materials
• Born-digital materials (gigabytes)
• Microforms
• Artifacts

13. Have you established any significant new collecting areas within special collections since 2000?
• No
• Yes (Describe briefly and note impetus; e.g., a major gift, curator's decision, faculty suggestion, new institutional direction.)

14. Have you discontinued new acquisitions in any collecting areas within special collections since 2000?
• No
• Yes (Describe briefly and note impetus as above.)

15. Have you deaccessioned any significant bodies of special collections materials since 2000? (Deaccessioning is physical withdrawal of cataloged or processed materials. It does not include weeding during processing.)
• No
• Yes (Describe briefly and note impetus as above.)

16. Any additional comments about this page?

Collections (continued)

17. Estimate the percentage of printed volumes in special collections acquired by each of the following methods during 2008/2009. Enter "0" where appropriate.
• Purchase (Institutional funds)
• Purchase (Special funds)
• Gifts-in-kind
• Transfer from elsewhere in your institution

18. Estimate the percentage of materials other than printed volumes (e.g., archives and manuscripts, visual materials) in special collections acquired by each of the following methods during 2008/2009. Enter "0" where appropriate.
• Purchase (Institutional funds)
• Purchase (Special funds)
• Gifts-in-kind
• Transfer from elsewhere in your institution

19. Did the amount of acquisitions funding that you had for purchasing special collections materials in 2008/2009 differ relative to what you had in 2000?
• Less funding in 2008
• No change
• More funding in 2008
• Not sure

20. Do special collections units participate in any cooperative collection development arrangements?
(For each of the following, indicate: No arrangements / Informal arrangements / Formal arrangements)
• Local/Regional institutions
• Members of your consortium
• Other institutions in your nation
• Institutions in other nations

21. Are any special collections materials housed in off-site or other secondary storage?
• No
• In planning stages
• Yes

22. Indicate the relative extent of preservation needs across your special collections in the following formats. (For each row, indicate: No problems / Low / Medium / High / Not sure / No materials of this type)
• Printed volumes
• Archives and manuscripts
• Visual materials
• Audiovisual materials

23. Any additional comments about this page?

User Services

24. State the number of onsite visits (the "gate count" or "reader days") by special collections users during 2008/2009. If you do not use a category, leave it blank.
• Affiliated faculty and staff
• Affiliated graduate students
• Affiliated undergraduate students
• Visiting scholars and researchers
• Local community
• Other

25. Has the level of use of your special collections changed since 2000? (For each row, indicate: Decreased / No change / Increased / Not sure / This user category not used)
• Affiliated faculty and staff
• Affiliated graduate students
• Affiliated undergraduate students
• Visiting scholars and researchers
• Local community
• Other

26. Have users' methods of contacting your special collections changed since 2000? (For each row, indicate: Decreased / No change / Increased / Not sure / This method not used)
• Onsite
• E-mail
• Website comment feature
• Interactive chat reference
• Telephone
• Mail

27. Has use of the following types of special collections materials changed since 2000? (For each row, indicate: Decreased / No change / Increased / Not sure / No materials of this type)
• Books printed before 1801
• Books printed 1801 or later
• Archives and manuscripts
• Visual materials
• Audiovisual materials
• Born-digital materials

28. Does special collections permit use of uncataloged and/or unprocessed materials? Select "yes" even if requests are approved selectively. (For each row, indicate: Yes / No / No uncataloged or unprocessed materials of this type / No materials of this type)
• Printed volumes
• Archives and manuscripts
• Visual materials
• Audiovisual materials
• Born-digital materials

29. If special collections does not permit use of uncataloged and/or unprocessed materials in certain formats, why not? Check all that apply for each row: Descriptions incomplete / Descriptions below standards / Insufficiently processed to be usable / Preservation / Security / Privacy and confidentiality / Other reason(s).
• Printed volumes
• Archives and manuscripts
• Visual materials
• Audiovisual materials
• Born-digital materials

30. Do you permit interlibrary loan of original special collections materials? Answer "yes" even if requests are approved selectively. Check all that apply.
• Yes, printed volumes
• Yes, materials in other formats
• Yes, only to institutions within our parent institution or consortium
• Yes, but only reproductions/copies
• No

31. Any additional comments about this page?

User services (continued)

32. Does special collections allow the use of digital cameras in the reading room by users for copying collection materials?
• Yes
• Considering it
• No

33. If you do not permit use of digital cameras in the reading room, please state your reasons. Check all that apply.
• Concern about inappropriate use of the digital files (e.g., copyright violations)
• Concern about potential loss of revenue from reproduction services
• Concern about improper handling of materials
• Concern about disruption in the reading room
• Existing reproduction services (e.g., photocopying, microfilming, scanning done by staff) are sufficient
• Other (please describe)

34. How much does special collections charge, on average, for a digital scan of a collection item?
• We provide scans at no charge
• $0-$5
• $5.01-$10
• $10.01-$20
• More than $20
• We do not offer this service

35. Does special collections retain copies of images scanned by and/or for users for potential inclusion in your digital library? (This does not include retention for internal purposes only.)
• Always
• Sometimes
• Never
36. By which method(s) do you make archival finding aids Internet-accessible? Check all that apply.
• On a local website
• Available to web crawlers for use by search engines (files are available on a local web server)
• Contributed to ArchiveGrid (formerly RLG Archival Resources)
• Contributed to Archive Finder (formerly ArchivesUSA)
• Contributed to a consortial database or catalog (e.g., Online Archive of California)
• Our finding aids are not Internet-accessible
• Other method (please describe)

37. Indicate which web-based communication methods special collections uses for outreach or to gather feedback. Limit your response to communications intended to promote or raise awareness of your institution's activities and collections; do not include uses by individuals, such as via personal blogs or Twitter accounts. (For each row, indicate: Using now / Will implement within a year / No current plans to implement)
• Institutional blog
• Flickr
• YouTube
• Podcasting
• Wikipedia links
• Institutional wiki
• Applications for mobile devices
• User-contributed feedback (e.g., social tagging)
• Social networking presence (e.g., Facebook page)
• Twitter
• Other (please describe)

38. Estimate how many presentations (e.g., course sessions, public lectures, tours) special collections staff made during the 2008-2009 year.
• College/university courses
• Non-course groups affiliated with your institution
• Visitors from your local community
• Visitors from elsewhere
39. Do you have a program (e.g., fellowships or grants) for awarding funds to users to visit your special collections?
• Yes
• No

40. Any additional comments about this page?

Cataloging and Metadata

Estimate the percentage of special collections material that has each type of library catalog record (e.g., MARC records) for materials in the following formats. Refer to Appendix A for the scope of materials within each format.

41. Printed volumes
• No catalog record of any kind
• Print catalog record only
• Online catalog record

42. Archives and manuscripts (managed as collections)
• No catalog record of any kind
• Print catalog record only
• Online catalog record

43. Manuscripts (managed as items)
• No catalog record of any kind
• Print catalog record only
• Online catalog record

44. Cartographic materials
• No catalog record of any kind
• Print catalog record only
• Online catalog record
• Cataloged as part of archival and manuscript collections

45. Visual materials
• No catalog record of any kind
• Print catalog record only
• Online catalog record
• Cataloged as part of archival and manuscript collections

46. Audiovisual materials
• No catalog record of any kind
• Print catalog record only
• Online catalog record
• Cataloged as part of archival and manuscript collections

47. Born-digital materials
• No catalog record of any kind
• Print catalog record only
• Online catalog record
• Cataloged as part of archival and manuscript collections

48. Estimate the percentage of archival and manuscript collections for which each type of archival finding aid exists.
• No finding aid
• Finding aid that is not Internet-accessible
• Internet-accessible finding aid

49. Has the size of your special collections uncataloged/unprocessed backlogs changed since 2000? (For each row, indicate: Decreased / No change / Increased / Not sure / No materials of this type)
• Printed volumes
• Materials in other formats

50. Any additional comments about this page?

Archival Collections Management

51. Have you implemented a simplified approach to archival processing, such as that advocated in Greene and Meissner's article "More Product, Less Process" in The American Archivist, to facilitate backlog reduction, higher rates of production, and/or more timely access to collections?
• Yes, for all processing
• Yes, for some processing
• No

52. Do you create and/or maintain archival finding aids using an encoding scheme? Check all that apply.
• EAD
• HTML
• No encoding scheme used
• Other (please describe)

53. Indicate which of the following software tools you currently use, or plan to use in the near future, for creating archival finding aids. Check all that apply.
• Word processing software (of any type)
• Database software (of any type)
• Archon
• Archivists' Toolkit
• EAD Cookbook
• XML markup tool (e.g., XMetaL)
• Other (please describe)

54. Does your institutional archives report within the library or to another administrative unit?
• Library
• Chief executive officer (e.g., president, chancellor)
• Chief information officer
• We have no institutional archives
• Other (describe below)
55. Is a library or archives unit responsible for records management for your institution?
• Yes, sole responsibility
• Yes, responsibility is shared with other institutional unit(s)
• Yes, informally, because no other unit takes responsibility
• No

56. Any additional comments about this page?

Digital Special Collections

57. Describe the nature of your digitization program (i.e., digital reproduction of original physical materials) for special collections materials. Check all that apply.
• We have completed one or more projects
• We have an active digitization program within special collections
• We have an active library-wide digitization program that includes special collections materials
• We can undertake projects only when we secure special funding
• We have not yet undertaken any projects

58. In which ways are special collections staff involved in digitization projects? Check all that apply.
• Project management
• Selection of materials
• Cataloging/metadata creation
• Digital image production
• Other (please describe)

59. Indicate whether you are considering large-scale digitization of special collections materials. (This generally involves a systematic effort to convert entire collections, rather than being selective at the item level, using streamlined digitization methods.)
• We have already done such projects
• We intend to do this in the future
• We have no plans to do this
• Not sure

60. Do you have any licensing contracts in place, or being negotiated, to give commercial firms the right to digitize materials from your special collections and sell access?
• Yes
• No

61. Where within your institution is responsibility assigned for management and preservation of born-digital archival materials?
• Responsibility is assigned to special collections and/or the institutional archives
• Responsibility is at the library-wide level
• Responsibility is at the institutional level
• Responsibility is decentralized
• Responsibility has not been formally determined
• This issue has not yet been addressed
• Other (please describe)

62. Which types of born-digital archival material does your special collections and/or institutional archives currently "collect" or manage? Check all that apply.
• Institutional archival records
• Other archives and manuscripts
• Publications and reports
• Serials
• Photographs
• Websites
• Audio
• Video
• Data sets
• None
• Other (please describe)
63. Which of the following are impediments to implementing management and preservation of born-digital archival materials in your institution? Check all that apply.
• Lack of expertise
• Lack of time for planning
• Lack of funding
• Lack of administrative support within the library
• Lack of administrative support elsewhere in the institution
• This is not the library's responsibility
• We do not expect to acquire any such materials
• No known impediments
• Other (please describe)

64. How is special collections involved in implementation of your library's institutional repository? Check all that apply.
• We contribute metadata
• We contribute collections content
• We contribute to project management
• We participate in other ways
• We are not involved with the repository
• We have no institutional repository

65. Any additional comments about this page?

Staffing

66. How many permanent staff positions were focused on special collections-related functions during 2008/2009? Use your local job classifications to differentiate categories. Report in FTE (full-time equivalents), either whole or decimal numbers.
• Professional/Exempt
• Paraprofessional/Non-exempt
• Student/Volunteer/Intern

67. How many temporary staff positions (e.g., grant funded) were focused on special collections-related functions during 2008/2009? Use your local job classifications to differentiate categories. Report in FTE (full-time equivalents), either whole or decimal numbers.
• Professional/Exempt
• Paraprofessional/Non-exempt
• Student/Volunteer/Intern

68. How many special collections staff are likely to retire in the next five years?

69. Improving the demographic diversity of staff has been a key focus of the special collections and archives communities in recent years. Which population groups currently are represented among your special collections staff? Check all that apply.
• Asian
• Black or African American
• Hispanic or Latino
• Native American
• Pacific Islander
• White
• Other (please state)
70. Have your staffing levels changed for the following activities in special collections since 2000? (For each row, indicate: Decreased / No change / Increased / No staff in this area)
• Administrative
• Curatorial
• Public services
• Technical services (print materials)
• Technical services (other materials)
• Technology and/or digital services

71. In which areas do special collections staff particularly need education or training in order to meet the institution's needs? Check all that apply.
• Archival processing
• Born-digital records
• Cataloging and metadata
• Collection development
• Foreign languages
• Fundraising
• History of the book
• Information technology
• Intellectual property
• Management/supervision
• Outreach
• Preservation
• Public relations
• Public services
• Records management
• Teaching
• Other (please describe)

72. Have any separate special collections units within your institution been integrated since 2000?
• Yes
• All units were integrated before 2000
• We have always had only one special collections unit
• We have multiple special collections units and all remain separate
• Our entire institution is solely or primarily special collections

73. Any additional comments about this page?

Funding

74. Indicate the monetary unit in which you are reporting.
• U.S. dollars
• Canadian dollars

Please estimate your library's expenditures for special collections during 2008/2009.

75. Institutional funds
• Collection materials
• Salaries/wages
• Other

76. Special funds (e.g., endowments, gifts, grants)
• Collection materials
• Salaries/wages
• Other

77. Has overall funding for your library and/or archives changed in the context of the current global economic crisis?
• Decreased 1-5%
• Decreased 6-10%
• Decreased 11-15%
• Decreased 16-20%
• Decreased more than 20%
• No change
• Increased

78. Any additional comments about this page?

Reflections

79. Please state what you consider the three most challenging issues currently facing your special collections, not including staffing or funding.
1.
2.
3.

80. Is there anything else you'd like to add?
End of Survey

Thank you!! We appreciate your participation in this survey.

Counting specific formats

Several survey questions ask for data by format of material. Use the lists below to map specific formats to the categories used in the survey to ensure consistency across institutions. These are not necessarily comprehensive lists.

Printed volumes: Count each physical volume or other physical item
• Books
• E-books
• Serials
• Codex manuscripts (bound volumes)
• Atlases
• Government documents
• Newspapers
• Pamphlets
• Theses and dissertations

Archives and manuscripts (managed as collections): Count in linear feet
• Archival and manuscript materials in any format that are described and managed as collections
• Materials managed as collections as part of the institutional archives

Manuscripts (managed as items)
• Manuscripts, generally textual, managed and cataloged at the item level

Cartographic materials: Count each physical item
• Two-dimensional maps
• Globes

Visual materials: Count each physical item
• Architectural materials
• Drawings
• Ephemera
• Paintings
• Photographs
• Postcards
• Posters
• Prints
• Slides and transparencies

Audiovisual materials: Count each physical item
• Audio materials (music recordings; spoken word recordings)
• Moving image materials (film; video)

Microforms: Count each physical item

Born-digital archival materials: Count the number of gigabytes of data
• Data files
• Digital audio, film and video
• Digital cartographic materials
• Digital personal papers or organizational records
• Digital photographs
• Digital reports or publications
• E-mail
• Web sites

Artifacts: Count each physical item
• Three-dimensional objects other than globes
• Realia
• Architectural models
• Scrolls
• Papyri
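Where an institution combines counts from several units, a mapping like the one above can be applied programmatically before summing. A minimal sketch follows (Python; the category map and function are hypothetical illustrations, not part of the survey instrument):

# Normalize unit-level format names to the survey's reporting categories,
# then sum counts across units. The mapping shown is a small excerpt of
# the guidance above, not a complete list.
CATEGORY = {
    "pamphlets": "Printed volumes",
    "postcards": "Visual materials",
    "globes": "Cartographic materials",
    "film": "Audiovisual materials",
    "e-mail": "Born-digital materials",  # born-digital is reported in gigabytes
}

def combine(unit_counts):
    """Sum (format, count) pairs into the survey's format categories."""
    totals = {}
    for fmt, count in unit_counts:
        category = CATEGORY.get(fmt.lower(), fmt)
        totals[category] = totals.get(category, 0) + count
    return totals

print(combine([("Pamphlets", 1200), ("Postcards", 300), ("Globes", 4)]))
# {'Printed volumes': 1200, 'Visual materials': 300, 'Cartographic materials': 4}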
Appendix B. Responding institutions

Respondents by Membership Organization

Association of Research Libraries (ARL): 86 respondents (of 124 members)
Note: Seventy-one ARL members responded to both the 1998 and 2010 surveys, as indicated by asterisks (*).

*Arizona State University; *Auburn University; Boston College; *Boston Public Library; *Brigham Young University; *Brown University; Center for Research Libraries; *Columbia University; *Cornell University; *Dartmouth College; *Duke University; *Emory University; *Florida State University; George Washington University; *Georgia Institute of Technology; *Harvard University; *Indiana University; *Iowa State University; *Johns Hopkins University; *Library of Congress; *Louisiana State University; *McMaster University; *Michigan State University; National Agricultural Library; National Library of Medicine; New York Public Library; New York State Library; *New York University; *North Carolina State University; *Northwestern University; *The Ohio State University; Pennsylvania State University; *Princeton University; *Purdue University; *Rice University; *Rutgers University; Smithsonian Institution; *Southern Illinois University; *Syracuse University; *Temple University; *Texas A&M University; *Tulane University; *Université de Montréal; *University at Buffalo; *University of Alberta; *University of Arizona; *University of British Columbia; University of Calgary; *University of California, Berkeley; *University of California, Los Angeles; University of California, San Diego; *University of California, Riverside; *University of California, Santa Barbara; *University of Chicago; *University of Colorado; *University of Connecticut; *University of Georgia; *University of Hawaii; *University of Illinois, Chicago; *University of Illinois, Urbana; *University of Iowa; *University of Kansas; *University of Kentucky; University of Louisville; *University of Manitoba; *University of Miami; *University of Michigan; *University of Minnesota; University of Nebraska; *University of New Mexico; *University of North Carolina; University of Oregon; *University of Pennsylvania; *University of Southern California; *University of Tennessee; *University of Texas; *University of Toronto; University of Utah; *University of Virginia; *University of Washington; *University of Waterloo; *University of Wisconsin; *Vanderbilt University; *Washington University, St. Louis; *Yale University; *York University

Canadian Association of Research Libraries (CARL): 20 respondents (of 31 members)
Note: Seven CARL members responded to both the 1998 and 2010 surveys, as indicated by asterisks (*).
Brock University; Carleton University; Dalhousie University; Library of Parliament; *McMaster University; Memorial University of Newfoundland; Université de Montréal; Université de Sherbrooke; *University of Alberta; *University of British Columbia; University of Calgary; *University of Manitoba; University of New Brunswick; University of Ottawa; University of Saskatchewan; *University of Toronto; University of Victoria; *University of Waterloo; University of Windsor; *York University

Independent Research Libraries Association (IRLA): 15 respondents (of 19 members)

American Antiquarian Society; Folger Shakespeare Library; Getty Research Institute; Hagley Museum and Library; Historical Society of Pennsylvania; Huntington Library; John Carter Brown Library; Library Company of Philadelphia; Linda Hall Library; Massachusetts Historical Society; New York Academy of Medicine Library; New York Public Library; New-York Historical Society; Newberry Library; Virginia Historical Society

Oberlin Group: 39 respondents (of 80 members)

Agnes Scott College; Amherst College; Augustana College; Austin College; Bates College; Beloit College; Berea College; Bowdoin College; Bucknell University; Carleton College; Coe College; Colby College; College of Wooster; Colorado College; Connecticut College; Denison University; DePauw University; Dickinson College; Franklin and Marshall College; Gettysburg College; Grinnell College; Gustavus Adolphus College; Haverford College; Kalamazoo College; Kenyon College; Macalester College; Mills College; Occidental College; Reed College; Rollins College; Saint John's University; Skidmore College; Smith College; Trinity College; Vassar College; Washington and Lee University; Wesleyan University; Whitman College; Willamette University

RLG Partnership: 55 respondents (of 85 U.S. and Canadian members)
Amon Carter Museum; Art Institute of Chicago; Athenaeum of Philadelphia; Brigham Young University; Brooklyn Museum; California Digital Library; California Historical Society; Chemical Heritage Foundation; Columbia University; Cornell University; Emory University; Fordham University School of Law; George Washington University, Jacob Burns Law Library; Getty Research Institute; Hagley Museum and Library; Huntington Library; Indiana University; Institute for Advanced Study; John Carter Brown Library; Kimbell Art Museum; Library Company of Philadelphia; Library of Congress; Linda Hall Library; Minnesota Historical Society; Museum of Fine Arts, Houston; National Archives and Records Administration; National Gallery of Art; Nelson-Atkins Museum of Art; New York Public Library; New York University; New-York Historical Society; Newberry Library; Oregon State University; Philadelphia Museum of Art; Princeton University; Rice University; Rutgers University; Smithsonian Institution; Temple University; Pennsylvania State University; University of Alberta; University of Arizona; University of Calgary; University of California, Berkeley; University of California, Los Angeles; University of Chicago; University of Miami; University of Michigan; University of Minnesota; University of Ottawa; University of Texas; University of Toronto; University of Washington; Yale University; Yeshiva University

Respondents by Type of Institution

Colleges (32): Agnes Scott College; Amherst College; Augustana College; Austin College; Bates College; Beloit College; Berea College; Bowdoin College; Carleton College; Coe College; Colby College; College of Wooster; Colorado College; Connecticut College; Dickinson College; Franklin and Marshall College; Gettysburg College; Grinnell College; Gustavus Adolphus College; Haverford College; Kalamazoo College; Kenyon College; Macalester College; Mills College; Occidental College; Reed College; Rollins College; Skidmore College; Smith College; Trinity College; Vassar College; Whitman College

Consortium (1): Center for Research Libraries

Governmental Libraries (2): Library of Parliament; New York State Library

Historical Societies (6): California Historical Society; Historical Society of Pennsylvania; Massachusetts Historical Society; Minnesota Historical Society; New-York Historical Society; Virginia Historical Society

Independent Research Libraries (13): American Antiquarian Society; Athenaeum of Philadelphia; Chemical Heritage Foundation; Folger Shakespeare Library; Getty Research Institute; Hagley Museum and Library; Huntington Library; Institute for Advanced Study; John Carter Brown Library; Library Company of Philadelphia; Linda Hall Library for Science, Engineering, and Technology; New York Academy of Medicine Library; Newberry Library

Museums (8): Amon Carter Museum; Art Institute of Chicago; Brooklyn Museum; Kimbell Art Museum; Museum of Fine Arts, Houston; National Gallery of Art; Nelson-Atkins Museum of Art; Philadelphia Museum of Art

National Institutions (5): Library of Congress; National Agricultural Library; National Archives and Records Administration; National Library of Medicine; Smithsonian Institution

Public Libraries (2):
Universities (100): Arizona State University; Auburn University; Boston College; Brigham Young University; Brock University; Brown University; Bucknell University; California Digital Library; Carleton University; Columbia University; Cornell University; Dalhousie University; Dartmouth College; Denison University; DePauw University; Duke University; Emory University; Florida State University; Fordham University School of Law; George Washington University; Georgia Institute of Technology; Harvard University; Indiana University; Iowa State University; Johns Hopkins University; Louisiana State University; McMaster University; Memorial University of Newfoundland; Michigan State University; New York University; North Carolina State University; Northwestern University; The Ohio State University; Oregon State University; Pennsylvania State University; Princeton University; Purdue University; Rice University; Rutgers University; Saint John's University; Southern Illinois University; Syracuse University; Temple University; Texas A&M University; Tulane University; Université de Montréal; Université de Sherbrooke; University at Buffalo; University of Alberta; University of Arizona; University of British Columbia; University of Calgary; University of California, Berkeley; University of California, Los Angeles; University of California, Riverside; University of California, San Diego; University of California, Santa Barbara; University of Chicago; University of Colorado; University of Connecticut; University of Georgia; University of Hawaii; University of Illinois, Chicago; University of Illinois, Urbana; University of Iowa; University of Kansas; University of Kentucky; University of Louisville; University of Manitoba; University of Miami; University of Michigan; University of Minnesota; University of Nebraska; University of New Brunswick; University of New Mexico; University of North Carolina; University of Oregon; University of Ottawa; University of Pennsylvania; University of Saskatchewan; University of Southern California; University of Tennessee; University of Texas; University of Toronto; University of Utah; University of Victoria; University of Virginia; University of Washington; University of Waterloo; University of Windsor; University of Wisconsin; Vanderbilt University; Washington and Lee University; Washington University, St. Louis; Wesleyan University; Willamette University; Yale University; Yeshiva University; York University
Appendix C. Overview of Museum Data
Eight museums from the RLG Partnership responded to the survey regarding the special collections they hold in libraries and archives. Such a small number of respondents cannot conclusively represent the broader population; regardless, this selective overview may hint at characteristics these institutions share.
Overall library size and budget
All eight respondents have fewer than one million volumes in their overall libraries. Budgets decreased at six institutions, more or less equally distributed across the range of decrease from 1%-5% to more than 20%.
Collections
Special collections of printed volumes range in size from fewer than 1,500 volumes up to 200,000; the mean is 41,000, while the median is 9,000. Archival collections range in size from negligible to 3,400 linear feet. Visual materials are the most common other format: five institutions reported holdings ranging from 50,000 to 200,000 items. Responses regarding acquisition of special collections printed volumes by purchase or gift were widely disparate, such that a mean or median would be meaningless. Regardless, several things stand out: only one museum acquires more than 10% of printed volumes by purchase using institutional funds; only one acquires more than 25% using special funds; and a far higher percentage of printed volumes overall are received as gifts than as purchases. Institutional funding for special collections and archives acquisitions is minimal: the three figures reported ranged from $10,000 to $33,000. Special funds are also in very short supply, with the exception of one museum that reported $600,000. Materials other than printed volumes were acquired almost exclusively by gift or transfer from within the institution: only one respondent reported making any purchases. Five reported having less funding for special collections and archives acquisitions than in 2000. Four have established new collecting areas since 2000, while none have discontinued any areas. With regard to deaccessioned materials, one noted that the slide collection is gradually being withdrawn as items are digitized. Six have informal collaborative collection development arrangements for special collections and archives with local or regional institutions, and three with members of their consortium. None have formal collecting collaborations. As is the case across the overall population, audiovisual materials have the highest level of preservation need at more than half of institutions.
User services
Five respondents reported statistics for onsite visits by affiliated staff, visiting researchers, and local community users. Staff visits ranged in number from 25 to 3,218, visiting researchers from 30 to 250, and community users from 20 to 55. Six reported that onsite visits by visiting scholars and researchers and members of the local community have increased. Five stated that use has increased for books printed both before and after 1800, archives and manuscripts, and visual materials. The number of public presentations ranged from five to 100; the mean was 30 and the median 15. One museum gave 85 presentations to local community visitors; with that exception, college and university courses were the most frequent audiences. Four respondents permit use of uncataloged/unprocessed materials. Four do not permit interlibrary loan. Six allow researchers to use digital cameras in the reading room. Three charge more than $20 for a digital scan of a collections item, while one provides scans at no charge. Six retain scans made for users for addition to the digital library. Finding aids are most often made available either via a local Web site or contributed to ArchiveGrid. Only one respondent has no finding aids accessible. Three respondents have an institutional blog, while use of all other Web 2.0 communication methodologies is almost nonexistent.
One museum has a fellowship program for researchers.
Cataloging and metadata
Ninety-six percent (96%) of printed volumes have an online catalog record, well above the overall mean (85%). The percentage of archival collections that are Internet-accessible is 74%, which is impressive relative to the overall mean of 44%. Statistics for materials in other formats are too few and disparate to be meaningful. More backlogs decreased than increased: printed volume backlogs decreased at four institutions, and backlogs of other materials decreased at three. (Each of the other possible choices was given by only one respondent.)
Archival collections management
Three museum respondents sometimes use minimal archival processing. Four use EAD for encoding of finding aids. Five respondents use word processing and/or database software for creating archival finding aids. None use Archon, the Archivists' Toolkit, or the EAD cookbook. Two use XML markup tools. The institutional archives reports to the library or archives at five museums. Records management responsibility is held by the library or archives at every responding institution.
Digital special collections
Three respondents have completed digitization projects, one has an active digitization program, two can undertake projects only with special funding, and three have had no activity at all. One museum respondent has done large-scale digitization, and four more intend to do so in future. One has licensing agreements with commercial vendors for digitization. Three respondents have assigned institutional responsibility for managing born-digital materials. Institutional archival records, photographs, and video are the most often-held born-digital formats. None have collected Web sites. Only one respondent reported having no born-digital materials. As is the case across the overall population, lack of time for planning and lack of funding are the two most frequently cited impediments to born-digital management. Two museums have an institutional repository.
Staffing
The mean number of special collections staff is 2.9 professionals and 1.3 paraprofessionals, for a total of 4.2 FTE. The median for professionals is two and for paraprofessionals is one. When compared to the overall population relative to the size of the museums' collections, this level of staffing is strong. Demographic diversity is limited: all of the respondents' staff members in special collections are white/Caucasian, except at two museums that have Asian staff. Stable staffing was the norm, with the exception of public services: for the latter, three reported a decrease, two no change, and two an increase. Respondents reported very few increases or decreases in staffing in other functional areas. Five topics emerged in which education and training are needed by more than two respondents: born-digital materials (needed by all), archival processing, metadata creation, information technology, and records management. Two have more than one special collections unit.
Most challenging issues
Responses were disparate.
The only two issues reported more than once were space and digitization.
Appendix D. Methodology
Survey Design
The instrument includes a total of 80 questions of the following five types:
• Respondent identification (5 questions)
• Multiple-choice (39)
• Numeric (20)
• Open-ended (6)
• Optional comments (10)
The instructions encouraged respondents to use informed estimates where they lacked formal data; we felt this was preferable to receiving few responses to particular questions, while also recognizing that estimating gives the data a lesser guarantee of accuracy. We would have been able to frame some questions more precisely and have a higher level of confidence in the data if meaningful metrics were used across the special collections community. We formulated questions in accordance with how we believe institutions most commonly record statistics. For example, we felt it more likely that respondents would tally onsite visitors than all users and transactions. We knew, however, that it would not be possible to include the many user categories that are employed in various institutions; we thus added "Other" as a catchall option, and it was the only user category utilized by 24% of respondents. We requested numerical data from the 2008-09 year, defined as the institution's latest twelve-month "statistical year" that ended prior to 1 July 2009. For other questions, respondents were to answer based on current circumstances at the time of their response. We included an optional comment box at the end of every page to facilitate comments. These were used extensively and, in some cases, led to correction of respondents' initial data based on comments that clarified intent. We conducted two rounds of testing our draft instrument with two groups of reviewers, a total of about 30 individuals from across the five organizations. In addition to helping us focus on the most important issues and make each question as clear and unambiguous as possible, reviewers identified differing nuances of understanding due to factors such as multiple meanings of terminology and varying methods of recording statistics. We therefore took care to employ nomenclature that is as ecumenical as possible to bridge such differences; examples include the use of "institutional archives" instead of "university archives," and clarification that both "gate count" and "reader days" are terms used to refer to onsite visits. We guaranteed to respondents that their data would be kept confidential.
Survey Dissemination
We used a Web-based survey tool (SurveyMonkey) [1] to send invitations and to gather responses. The official invitation was sent by e-mail on 6 November 2009 to the director (or designate) of each of the 275 libraries represented in the survey population. Responses were permitted either online or on a printout of a PDF file; OCLC Research staff input those received on paper. The initial closing date of 17 December 2009 was extended to 29 January 2010 to accommodate requests from numerous respondents.
A single request was sent to each institution in the population, including those known to have multiple special collections units. This matched the methodology used by ARL in the 1998 survey. The purpose was to avoid potential overrepresentation of particular large institutions based on the nature of their organizational structures. All special collections and archives units were eligible for inclusion, whether or not they report to a broader library system or another organizational entity. Other types of collecting units, such as museum curatorial units or research institutes, were excluded.
Data Analysis
After the data collection period closed, we exported all data to Microsoft Excel for computation and analysis and normalized it in several ways:
• Corrected clear errors of fact, such as inaccurate organizational membership or type of institution
• Enabled calculations on numeric data by dropping alpha characters (l.f., vols., etc.) and made a decision for each numerical question about how to deal with blanks vs. zeroes to render the data consistent
• Deleted numeric data that could not be normalized, such as counts of artifacts given in linear feet rather than items
• Entered the appropriate response when it was revealed by an open-ended comment
• Discarded data that were clearly in error, such as numeric data that added up to more than 110% for questions 41-47
• For Canadian respondents, converted linear meters to linear feet and Canadian dollars to U.S. dollars, at the rate valid on the day that data collection was closed (these conversions are sketched in code below)
We selectively contacted respondents individually, as necessary, to give them an opportunity to correct data that revealed a misunderstanding of the instructions, as well as to clarify inconsistent or unclear responses. We did not, however, seek completion of data for questions that were skipped, under the assumption that respondents were aware of their omissions and had decided for some reason not to provide data for those questions. We excluded from our data analysis the statistics for special collections size, funding, onsite visits, and staff reported by the Library of Congress and the National Archives and Records Administration, the two largest institutions by far in the survey population, in order to avoid inappropriate skew in the data overall.
Notes
[1] http://www.surveymonkey.com
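A minimal sketch of the kind of normalization described under Data Analysis above (the actual project worked in Excel): the helpers below drop alpha characters from numeric responses and apply the Canadian unit and currency conversions. The function names and the placeholder exchange rate are illustrative assumptions, not the project's actual code.

```python
import re

METERS_TO_FEET = 3.28084
CAD_TO_USD = 0.93  # placeholder; the report used the rate on the closing day

def clean_numeric(raw):
    """Drop alpha characters such as 'l.f.' or 'vols.' and return a float,
    or None when nothing numeric survives (the blank-vs.-zero decision
    described above is left to the per-question rules)."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    return float(match.group().replace(",", "")) if match else None

def normalize_extent(raw, canadian=False):
    """Normalize a linear-extent response, converting meters to feet
    for Canadian respondents."""
    value = clean_numeric(raw)
    if value is not None and canadian:
        value *= METERS_TO_FEET
    return value

def normalize_funds(raw, canadian=False):
    """Normalize a funding response, converting CAD to USD
    for Canadian respondents."""
    value = clean_numeric(raw)
    if value is not None and canadian:
        value *= CAD_TO_USD
    return value

print(normalize_extent("3,400 l.f."))              # 3400.0
print(normalize_funds("$33,000", canadian=True))   # 30690.0
```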
work_wlhfhoc6qjdpvlv7gpathksjy4 ---- User-Centered Provisioning of Interlibrary Loan: a Framework – In the Library with the Lead Pipe
2018 Mar 21, Kurt Munson
In Brief: Interlibrary loan (ILL) has grown from a niche service limited to a few privileged scholars to a ubiquitous, expected service. Yet workflows still assume specialness. Users' needs should come first, and that means redesigning ILL into a unified, linear, user-centered process. ILL is not just a request form; rather, we need improved mechanisms for users to track, manage, and communicate about their requests. This article explores how ILL developed, problems with the current ILL ecosystem, and changes that can make ILL centered on users' needs and processes rather than on backend library systems.
By Kurt Munson
Introduction
Interlibrary loan (ILL) provides library users with a critical tool to acquire resources they need for their information consumption and evaluation activities, whether research, teaching, learning, or something else. The 129% increase in ILL volume between 1991 and 2015 in the Association of Research Libraries (ARL) statistics clearly shows that ILL has grown from a niche service to an expected one (ARL, 2016). Yet our library processes for providing this service have not kept pace with technological development. Thus, the provision of ILL is less effective than it could be because it is predicated upon library processes and systems rather than on most effectively meeting users' needs.
This article explores the development of ILL as a service, suggests areas in need of improvement, provides a framework for redesigning this service in a user-centered way, and finally outlines efforts to create such a user-centered ILL.
Interlibrary loan holds a unique place within the suite of services libraries provide. ILL is entirely user initiated and driven by demonstrated user need. It provides a mechanism for users to acquire materials they have discovered and determined to be worthy of additional investigation but for which no local copy is available. ILL expands the resources available to users to whatever can be delivered, not just the contents of the local collection. The modern research library offers a range of services under the 'Resource Sharing' umbrella, including consortial sharing of returnables, interlibrary loan of returnables and non-returnables, and local document delivery operations. The ILL process discussed in this article is restricted to ILL as a brokered process whereby a library requests and arranges the loan of a physical item for use by an affiliated user. ILL practitioners refer to this process as traditional ILL of returnables, as the item will be returned to the owning library. Scans or reproductions of articles or portions of a work provided from a local collection or by another library fall outside this article's scope because the workflows for sourcing and providing those items are quite different. This article primarily concentrates on ILL between academic libraries, though its recommendations are generalizable to public, medical, and other libraries.
Historical Development
ILL has a long history as a library service, but for most of that history it was a niche service provided to only a select group of library users, most often faculty members and perhaps graduate students. ILL was difficult, time consuming, and required a great deal of staff effort. Simply identifying an owning library was a challenge before the introduction of shared computerized catalogs. Citations needed careful verification to ensure accuracy, particularly for items created prior to the introduction of the International Standard Book Number (ISBN) system in 1968. Identifying holdings and ownership represented huge challenges. While tools like the Pre-1956 Union Catalog existed, these were out of date as soon as they were printed. Requests were made via mailed paper request forms. The library that owned the item would likely know nothing of the requesting library, so the trusted relationships we take for granted had not yet developed. A library might send an item or it might not. An owning library might respond in the negative or it might not. It was at best an arduous process, analogous to weaving cloth and sewing garments by hand rather than purchasing ready-made, off-the-rack clothing.
The creation of the OCLC cooperative in 1967, specifically its shared index of items, provided the opportunity to vastly improve ILL processes and workflows. The OCLC database, eventually to be known as WorldCat, contained a single record for each work, to which libraries could attach holdings indicating that they owned a copy. It was now possible to identify ownership easily. Moreover, this identification could be done in one place, with simultaneous citation verification. OCLC introduced the first of its interlibrary loan subsystems in 1979 (Goldner and Birch, 2012, p. 5) because there were by then enough item records and holdings records in the shared OCLC index to support ILL processing.
Over time, additional ancillary ILL services for library staff were introduced by OCLC. For example, in the ILL policies directory a library can provide contact and address information and explain what it will and will not lend, with any associated costs for these services. The OCLC ILL Fee Management (IFM) system provides automated billing as part of the transaction process. ILL became markedly easier to do, or at least portions of the process did. The development of WorldCat and other union catalogs made the process of identifying owning libraries and placing requests much easier, but these were closed systems with limited functionality. These systems did one thing: placed a request. Yet ILL is a multi-part process consisting of many disparate steps that library staff perform. Files of request forms require maintenance. Users need to be contacted when items arrive or need to be returned. Circulating items necessitates tracking over time. Physical items require packing and shipping. Invoices require payment. For the library user, ILL is just one of many tools to acquire materials, and the user's interest is in accessing the materials, not in how the library chooses to source the requested item. Users once filled out a paper form which staff keyed into the requesting system. Then the user patiently waited until they received a phone call or postcard alerting them that the item had arrived. To be sure, verification and ordering had become easier, but the process still involved many handoffs between different systems with minimal communication. Easier ordering allowed ILL request volumes to increase markedly (Goldner and Birch, 2012, p. 5). ILL management systems were developed to automate the management and tracking of requests over their lifespan, to handle communication with users, and to circulate the items. ILLiad is the most common ILL management system used today in academic libraries. Both owning libraries and requesting libraries came to rely upon these systems to manage requests over their lifespan. Request databases replaced file folders. Data could be pushed from one system into another. Routine tasks, such as sending overdue notices, could be automated. ILL had become a standard, mainstream, expected service rather than a niche one. Improved staff processing was not the only driver for increased volume. OpenURL and other outgrowths of user-facing databases, together with the ubiquity of the internet, made discovery easier (Musser and Coopey, 2016, p. 646). The easy transfer of metadata via OpenURL increased request volume because users could request items by pressing a button instead of filling out a paper form; the request went straight into the request database for staff processing, as sketched below.
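As a minimal sketch of that handoff, the snippet below assembles an OpenURL 1.0 (Z39.88-2004) query string of the kind a database or discovery layer might send to an ILL request form. The resolver address and the citation values are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical base URL for a library's ILL request form / link resolver.
RESOLVER = "https://ill.example.edu/request"

# A book citation expressed as OpenURL 1.0 key/encoded-value (KEV) pairs.
citation = {
    "ctx_ver": "Z39.88-2004",                    # OpenURL context object version
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",  # metadata format: book
    "rft.btitle": "The Faerie Queene",
    "rft.aulast": "Spenser",
    "rft.aufirst": "Edmund",
    "rft.date": "1590",
    "rfr_id": "info:sid/discovery.example.edu",  # which service referred the user
}

openurl = RESOLVER + "?" + urlencode(citation)
print(openurl)
# The request form parses these pairs to pre-populate its fields,
# so the user presses a button instead of rekeying the citation.
```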
Nonetheless, the improvements ILL management systems provided remained rooted in ILL's traditional union catalog-based requesting workflows. They focused on making it easier for library staff to provide items rather than on users' workflows or needs. Issues with this approach are explored below.
Problems with Our Current Approach
A number of issues limit the usability of ILL service, which in turn limits its effectiveness for both users and library staff. To be sure, ILL services are valued by users and play an integral part in the suite of services libraries provide to source materials, but the service can be improved by reconceptualizing the process whereby it is provided. Libraries can rethink how the individual parts of the process, be they software or workflow, are put together. Areas for reconceptualization fall into five broad categories, discussed below.
First, existing systems are based on identifying libraries that own a requested item. But for the purposes of ILL, ownership is only the first step in the process. An on-shelf, loanable copy must be located, because only an item that fits these criteria can fill the user's need. WorldCat can tell us who owns an item, but what we need is a library that can loan the item. Owning libraries, or lenders as ILL practitioners call them, still need to search their local catalog to determine whether the item is on shelf and loanable. This is a time-consuming, antiquated, manual workflow that fails to take advantage of tools such as Z39.50 for automated catalog lookup. Workflows have not kept up with technological advancements. Consortial borrowing systems, such as Relais D2D or VDX, where a group of libraries share a discovery layer that displays availability, mitigate the issue described above, but these systems also have a serious shortcoming: they force users to execute the same search in multiple discovery layers to find an available copy. Users, having identified an item, cannot simply submit a request and have the library source it for them. Rather, libraries expect users to navigate across disparate interfaces with unique request processes to request an item. Thus discovery and delivery become a fractured process, as libraries push the work of finding a loanable copy onto their users.
Second, identifying owning libraries remains tied to the searching of union catalogs because metadata is not recycled efficiently. A user searches their local library's discovery tool and finds that an item they want is checked out, so they fill out an ILL request form populated with metadata from that local discovery tool. Library staff, or preferably automated systems, then re-execute a similar search using the same metadata against a larger database to identify potential lending libraries, and the request is ported into a different system. Since the metadata populating the local discovery tool likely came from WorldCat in the first place, and that metadata will be used to search against WorldCat again, the metadata should be trusted rather than treated as a citation needing verification by library staff. This too is an antiquated workflow rooted in past practices.
Third, ILL is very much predicated on the terms imposed by the owning library. While the OCLC policies directory provides library staff with information about terms of use for borrowed items, the lack of consistent, agreed-upon standards for loan periods between libraries creates a situation ripe for confusion on the part of users. Again, this harks back to an era when ILL was rare, difficult, and unique rather than the current situation where ILL is a standard service. Too much emphasis is placed on unique, locally defined rules rather than on setting broadly agreed-upon standards or considering users' needs for materials.
Fourth, the process uses siloed systems with weak integrations and poor interoperability. Discovery happens in one system. Requests are managed in a separate ILL management system, which ties to an external ordering system for sourcing items. When the item arrives at the borrowing library, these respective systems must be updated, and then the item needs to be handled as a circulation, likely in yet another system, separate from the one that manages the user's loans for locally owned materials.
Yes, the systems can communicate with each other, but this process is staff intensive and lacking in automation. Crosswalks, bridges, and information exchange protocols are not employed fully or efficiently.
Finally, and most importantly, providing ILL services is predicated on library processes and library tools rather than user processes and needs. Users must learn and jump between disparate systems, often with jarring handoffs, to acquire materials. Depending on how the item is sourced by the library, the user must find the system where the library has chosen to process that request. Communication is scant; it comes from different systems and mostly consists of silence until a pick-up notification is sent. This confusing process is followed by inconsistent rules surrounding use, based on the lending library's terms of use. Usability studies have demonstrated how this confuses users (Foran, 2015, p. 6). Presented with multiple, often contradictory delivery options and unclear explanations of the differences between them, users tend to place requests in each system in the hope that one will work. Not only is this poor customer service, but it also increases staff workloads and costs for the library through duplicated work. Why? Because libraries define ILL success as having acquired a copy for the user. The user's needs (required turnaround time, format, how long they will need the item, even its relative importance to their intended use) are secondary, when considered at all. Libraries need to gain a better understanding of how ILL fits into users' activities and how they can more effectively support those activities. ILL needs to be borrower-centered, not lender-centered.
In many ways the issues outlined above are a natural outcome of a service's gradual evolution within a fairly stable ecosystem. The foundational systems that undergird the service were able to absorb the increased request volume, and processes simply continued without redesign or rethinking. Yet the environment in which the service exists is evolving rapidly, and the time for a radical rethinking of the technology that supports the service, its workflows, and its metrics for success is here.
Recommendations for Developing an Alternative Framework
At the International ILLiad Conference in March 2016, Katie Birch of OCLC announced that OCLC intended to "move ILLiad to the cloud". Far more than any other change in ILL processing or systems, including the introduction of WorldShare ILL, this announcement shook the foundations of academic library ILL in the United States. We were presented with the opportunity to reimagine how we provide ILL services. We began to ask the question "what should the ILL workflows be?" How could we make them more user-centered rather than continuing the historic workflows mandated by vendor-supplied platforms? Concurrently, and partially in response to this announcement, the Big Ten Academic Alliance (BTAA), previously known as the Committee on Institutional Cooperation (CIC), embarked on a project to explore, redefine, document, and share a user-centered discovery-to-delivery process. The project's goal was to describe an easy-to-understand user experience that shielded users from the disparate library staff systems and provided a more linear discovery-to-delivery process. Usability studies confirmed library staff members' impression that the process was confusing and disjointed for users (Big, 2016, pp. 19-22; Big, 2017b, pp. 19-21).
Cooperatively with the Ivies Plus Libraries and the Greater Western Library Alliance (GWLA), we defined base requirements and system functionalities for a new, user-centered vision of ILL. A one-page summary document entitled "Next Generation Discovery to Delivery: A Vision" was released in February 2017. Staff from BTAA libraries, including the author of this article, wrote two reports entitled "A Vision for Next Generation Resource Delivery" and "Next Generation Resource Delivery: Management System and UX Functional Requirements". These works, in part, inform the three broad recommendations outlined below, which can be described as user-process, technological, and cultural.
To start, the library tools that support users' processes must be based upon those workflows rather than upon the processes library systems staff use to manage the work. Where in the past a user interface was tacked onto a library staff system, this should no longer be the case. Users deserve a simple, universal request mechanism, a "get it" button (Foran, 2015, p. 5) that connects to a smart fulfillment system (Big, 2017b, p. 9). Requests should display in a single dashboard-like interface that allows users to manage all their library interactions in one place (Big, 2017b, p. 9). No longer should users be expected to hunt across disparate library system interfaces to locate their request for a specific item. Achieving this requires that we rethink how we, library staff, present library systems to users. Since the primary local discovery layer is the user's main entry point into the library and the place where they manage their library interactions, this interface needs to be the place where we display all request information to them. Thus, vendors who provide discovery layer tools must make them open and capable of incorporating data from external sources so we can provide users a unified display. Users should be shielded from the systems libraries use to perform the work of fulfilling requests. Users need items, and which library staff process is invoked is immaterial to them. Getting the item is paramount. This notion must inform how libraries design, combine, and present their backroom systems to our customers.
Second, what matters is delivery of an available, on-shelf, loanable copy to the user who needs it and made the effort to ask for it, not identifying owning libraries. ILL loans are simply more complicated circulations. Discovery tools should be separated from delivery options, as these two do not need to be interconnected. The metadata from discovery is all that is needed to initiate delivery. Requests should be managed via a lightweight system specifically designed around the efficient and timely fulfillment of the user's request, with user satisfaction serving as the primary metric for defining success. The BTAA reports named this new idea the "Resource Delivery Management System" (RDMS) (Big, 2017b, p. 12). Working off a list of potential partner libraries maintained and defined in the RDMS, a simple Z39.50 search using the recycled metadata should identify a potential lending partner; when a loanable copy is found, a request should be placed via NCIP, with routing and courier tracking/shipping information included in the RDMS's request record (a sketch of this handoff follows below). Circulations of ILL items should occur in the local Library Services Platform (LSP) so users can manage all loans, regardless of how they are sourced, in one place.
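As a rough illustration of the partner-lookup and request handoff just described, the code below checks one partner's catalog over SRU (the HTTP-based successor to Z39.50, substituted here for simplicity) and, if a hit is found, posts an NCIP RequestItem message. The endpoints, the CQL index name, the naive availability test, and the simplified NCIP elements are all assumptions for illustration; a production RDMS would walk its partner list, parse MARCXML holdings, and handle responses properly.

```python
import requests

# Hypothetical endpoints for one partner library.
SRU_BASE = "https://catalog.partner.example.edu/sru"
NCIP_URL = "https://ils.partner.example.edu/ncip"

def find_loanable_copy(isbn):
    """Search the partner catalog via an SRU searchRetrieve request.
    Here we only test for a nonzero hit count; a real implementation
    would parse the XML and check holdings/availability."""
    params = {
        "version": "1.2",
        "operation": "searchRetrieve",
        "query": f'bath.isbn="{isbn}"',  # CQL index names vary by server
        "maximumRecords": "1",
    }
    resp = requests.get(SRU_BASE, params=params, timeout=10)
    resp.raise_for_status()
    return "numberOfRecords>0<" not in resp.text

def place_ncip_request(user_id, isbn):
    """Send an NCIP (Z39.83) RequestItem message; element names follow
    NCIP version 2 but are simplified for illustration."""
    message = f"""<?xml version="1.0" encoding="UTF-8"?>
<NCIPMessage xmlns="http://www.niso.org/2008/ncip" version="2.0">
  <RequestItem>
    <UserId><UserIdentifierValue>{user_id}</UserIdentifierValue></UserId>
    <ItemId><ItemIdentifierValue>{isbn}</ItemIdentifierValue></ItemId>
    <RequestType>Loan</RequestType>
    <RequestScopeType>Item</RequestScopeType>
  </RequestItem>
</NCIPMessage>"""
    resp = requests.post(NCIP_URL, data=message.encode("utf-8"),
                         headers={"Content-Type": "application/xml"},
                         timeout=10)
    return resp.status_code

if find_loanable_copy("9780140422078"):
    place_ncip_request("patron-123", "9780140422078")
```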
The ideas above represent, in many ways, a somewhat radical break from past processes and practices. They decouple sourcing of materials from a shared index. Instead, they are based on library-defined partnerships and the identification of a loanable copy at a partner. Moreover, this approach promotes interoperability across different systems, as the request is not tied to any legacy or monolithic system; multiple micro-systems each play a part in completing a multistep process. Finally, it limits the functional scope of the RDMS to just the management of delivery, avoiding the current problem of (often subpar) duplication of functionality across systems. While no such system as described above yet exists, potential development is under exploration by vendors. The ideas outlined above further move us from the current siloed systems to an environment where integrations are central and where the best, most appropriate system manages or provides the required information (Big, 2017a, p. 1). Thus, the local LSP handles all aspects of notification, circulation, and fines or blocks. Viewing this as a process consisting of many parts also allows us to reimagine it so that we can incorporate previously excluded information, such as shipping status derived from the UPS or FedEx APIs. Additional communications to users about the status of their requests should be included too. Companies provide these updates on orders and shipping as a matter of course, so libraries can as well; users reasonably expect them. Authoritative sources, rather than poorly maintained duplicates, should be called upon to provide information as needed: local address information, for example, should come from the campus identity management system. This system consists of many parts communicating with each other via protocols and APIs as needed. Binding these parts together, with each assigned a specific task, provides a new framework for the workaday provisioning of ILL services.
Technology is easy to change. Culture is more difficult, particularly entrenched library policies. These policies' efficacy at guiding user behavior and promoting shared stewardship of materials is almost never tested. Yet users and library staff are both engaged in the management of loaned items. Libraries need to embrace the early slogan of the Rethinking Resource Sharing Initiative, "throw down your policies and embrace your collections", and libraries need to manage this sharing efficiently in a data-driven way. It is important to remember that users need materials to complete their work. The use of materials by users is predicated upon their need, the associated timeline, and the perceived value of the item. As the Big Ten Academic Alliance has stressed, "All that matters is format, time to delivery, loan period, and costs to the patron, if any" (Big, 2016, p. 9). These items have value to the user; they put effort into acquiring them. ILL is entirely user-driven, unlike many other library processes. Arbitrary loan periods set by an owning library can, and in fact do, come into conflict with users' needs (Foran, 2015, p. 4). Libraries can resolve these conflicts easily by moving to standardized loan periods for ILL. Standards should replace the boutique exceptionality encouraged by the OCLC policies directory. Stated differently, the emphasis needs to shift from lender-imposed restrictions to borrowing libraries having the ability to communicate standard policies.
For example, the BTAA shared twelve-week loan period, complemented by the equivalent Northwestern University local loan period and coupled with user blocks and assessment of replacement-cost fines after thirty days, provides a consistent user experience that, in turn, encourages the timely return of items: only 29 of 29,137 total ILL loans (about 0.1%) were lost by Northwestern University users in 2016. This example demonstrates how consistent policies promote compliance. Why? Because they are easy to understand, and failure to comply with communicated expectations has direct consequences, specifically the loss of library privileges. Further, research done by the Ivies Plus Libraries demonstrates that almost all items are returned to the owning library after the user has completed their use of the item: only 70 items of roughly 750,000 over three years (fewer than 0.01%) were truly lost by patrons or never returned. These data clearly demonstrate the need to rethink policies across libraries and reconsider shared assumptions. In other words, the emphasis needs to be on understanding user behavior, grounded in users' needs, and on developing effective ways to shape that behavior toward agreed-upon, reasonable outcomes. Libraries must also shift from the historic lender-centric ILL system to one where an ILL user receives an item under national standards that provide a consistent, easy-to-understand experience. This would promote an environment where borrowing libraries can more effectively manage their users. Appropriate, effective tools, tested by data, are needed. Ineffective tools need to be discarded, like overdue notices sent via email from the lending library to the borrowing library's ILL staff; these will never affect user behavior. Making the process easier for users to understand in terms of policy is critical. The introduction of standardized loan periods, replacement costs, and the like across libraries would simplify the management of ILL for both users and library staff. It would also greatly assist in achieving compliance and in reducing (often pointless) staff work. Rather than starting with the question of which library system can perform a specific job, we need to rethink this process and backfill the appropriate system, library or other, from the starting point: the initial discovery and request by the user. The BTAA phrased this as smart fulfillment: a linear path for users to follow, with effective automated handoffs between the library systems that source and manage requests in the most appropriate place.
Conclusion
ILL has grown from a niche service to an expected, standard one, growing 129% between 1991 and 2015 in ARL libraries (ARL, 2016). Yet workflows and system integrations have not evolved as much as they should have in response to this growth. A confluence of announcements and work to redefine processes now presents libraries with a unique opportunity to rethink ILL, transition from legacy practices, and unify the fractured discovery-to-delivery process we present to our users. If we integrate library systems, and the systems that support them, differently, and effectively leverage each system's strengths, we can create an easy-to-use service that meets demonstrated user needs. We can provide a service that delivers smart fulfillment of requests and improves both the user and staff experience. This should be our goal.
The author wishes to extend his deepest thanks to Heidi Nance, Director of Resource Sharing Initiatives for the Ivy Plus Libraries, for her willingness to review this article, applying her deep knowledge of ILL while doing so, and for her thoughtful comments and suggestions. Thank you, Heidi.
References
Association of Research Libraries. (2016). ARL Statistics 2014-15. Association of Research Libraries. Retrieved from http://www.arl.org/storage/documents/service-trends.pdf
Big Ten Academic Alliance. (2016). A Vision for Next Generation Resource Delivery. Retrieved from https://www.btaa.org/docs/default-source/library/d2dnov2016report.pdf?sfvrsn=4
Big Ten Academic Alliance. (2017a). Next Generation Discovery to Delivery System: a Vision. Retrieved from https://www.btaa.org/docs/default-source/library/discoverysystemsvisiononepage.pdf?sfvrsn=2
Big Ten Academic Alliance. (2017b). Next Generation Resource Delivery: Management System and UX Functional Requirements. Retrieved from http://www.btaa.org/docs/default-source/library/next-generation-resource-delivery–functional-requirements.pdf
Foran, K. (2015). "New Zealand Library Patron Expectations of an Interloan Service." New Zealand Library & Information Management Journal, 55(3), 3-9. https://lianza.org.nz/nzlimj-volume-55-no-3-october-2015
Goldner, M., & Birch, K. (2012). "Resource Sharing in a Cloud Computing Age." Interlending & Document Supply, 40(1), 4-11. https://doi.org/10.1108/02641611211214224
Musser, L., & Coopey, B. (2016). "Impact of a Discovery System on Interlibrary Loan." College & Research Libraries, 77(5), 643-653. https://doi.org/10.5860/crl.77.5.643
Stapel, J. (2016). "Interlibrary Loan and Document Supply in the Netherlands." Interlending & Document Supply, 44(3), 104-107. https://doi.org/10.1108/ILDS-03-2016-0015
4 Responses
Luke M 2018-04-03 at 8:27 pm
Hi Kurt, An interesting insight into how ILL operates in North American libraries! The thing that stood out the most as a difference between US and Australian ILL systems (if I have read your paper correctly) is the onus placed on users to find out which libraries hold the item they want and send separate requests to each institution or public library service. I think the Libraries Australia Document Delivery System (LADD), managed by the National Library of Australia, could prove a good model for a more user-friendly experience. Australian and New Zealander users can search for an item across the National Bibliographic Database. Once they identify that an item exists in Australia or New Zealand, they can simply 'get it'. You can view the database for yourself here – https://librariesaustralia.nla.gov.au/search/simpleSearch?action=Login&mode=login&main=true&querystring=null . Depending on the rarity of an item or where it is held, there may be a fee charged for the item, but many public libraries will supply an item for free. The request is mediated by a librarian at the library the user belongs to, who double-checks the availability of an item and then forwards the request to every holding library on the user's behalf. This is managed by a rota, so the ILL officers at holding libraries will see the requests one at a time until someone can fulfill it.
In this way a request can often be fulfilled within about 5-10 days. This is how it works in a public library in Queensland, anyway; the process could be a bit different for academic libraries. This paper from the National Library of Australia goes into some more detail about how LADD works – https://www.nla.gov.au/content/libraries-australia-document-delivery-a-system-for-a-variety-of-users Hope this is of interest!
Luke M 2018-04-03 at 8:34 pm
I have incorrectly addressed my comment to Heidi rather than the author. Sorry Kurt! Could the moderators please amend this?
Kurt Munson 2018-04-04 at 9:58 am
The National Bibliographic Database is akin to OCLC's WorldCat for the US. Remember that there is no equivalent national database for the United States, a key difference. Ours is a much more distributed system where we start with a local system, often for one library only, and then move out from that in concentric circles of fulfillment systems, ending with OCLC's WorldShare ILL.
A Kay 2019-03-13 at 11:02 pm
Totally agree with Kurt.
This work is licensed under a CC Attribution 4.0 License. ISSN 1944-6195
work_wq3kn7vdpnd3xatwx34origep4 ---- Open Access Dissemination Challenges: A Case Study
Philip Young
University Libraries at Virginia Tech, Blacksburg, Virginia, USA
Abstract
Purpose- This paper explores dissemination, broadly considered, of an open access database as part of a librarian-faculty collaboration currently in progress.
Design/methodology/approach- Dissemination of an online database by librarians is broadly considered, including metadata optimization for multiple access points and user notification methods.
Findings- Librarians address open access dissemination challenges by investigating search engine optimization and seeking new opportunities for dissemination on the web. Differences in library metadata formats inhibit metadata optimization and need resolution.
Research limitations/implications- The collaboration is in progress and many of the ideas and conclusions listed have not been implemented.
Practical implications- Libraries should consider their role in scholarly publishing, develop workflows to enable it, and extend their efforts to the web.
Originality/value- This paper contributes to the scant literature on dissemination by libraries, and discusses dissemination challenges encountered by a non-peer reviewed, dynamic scholarly resource.
Keywords- Open access, Dissemination, Metadata optimization
Paper type- Case study
Introduction
The online environment has brought about a revolution in scholarly publishing. Scholars now publish on the web, whether in an online journal or to their own web site. They are also creating new forms of scholarship and providing access to scholarly resources. This paper discusses a database that came online in 2006, "Spenser and the Tradition: English Poetry 1579-1830" [http://spenserians.cath.vt.edu]. The database was brought to the attention of librarians at Virginia Tech by its creator, Dr. David Radcliffe, a faculty member in the English department. The database, more than 15 years in the making and still being updated, offers full text of poems, although its uniqueness lies in its links between poets as authors and readers: in short, a genealogy of influence. The complex web of links between poets is far easier to express and explore in the digital environment.
The database is written in MySQL and is hosted on a server by the Center for Technology in the Humanities [http://wiz.cath.vt.edu/cath/cath.html]. Although the database is not hosted by the University Libraries at Virginia Tech, Dr. Radcliffe sought help from librarians for its dissemination: clearly, just putting it online wasn't enough. Faculty are turning to librarians for assistance in producing electronic resources (Brown et al. 2007) and the trend is likely to continue. Dr. Radcliffe's interests include the cataloging of the database so it can be entered into Virginia Tech's library catalog (as well as other library catalogs), and how to ensure that it ranks highly in the results returned by search engines. What information should he provide on his website, or in his code, to best enable bibliographic and web indexing?

This faculty-library collaboration faces several challenges. First, libraries are not often called upon to assist in the dissemination of information in the broadest sense. In most cases, simply making a scholarly resource available online qualifies as dissemination. However, this collaboration chose to examine dissemination more deeply, in terms of how metadata might be optimized for a variety of online dissemination purposes, and more broadly, in terms of functions traditionally associated with publishers. Relatively little has been published about optimizing metadata (Dawson 2005). Second, this case study concerns a scholarly resource rather than scholarship itself. As such, it is not peer-reviewed, and its dynamic, updating format offers challenges.

While open access (OA) definitions can be extensive, for the purposes of this paper OA simply means "digital, online, free of charge, and free of most copyright and licensing restrictions" (Suber 2007). A search reveals little published literature on the dissemination of OA resources. Dissemination has been identified as the most difficult problem of the OA movement (Morgan 2004). A broad exploration of dissemination possibilities for the English Poetry database suggests that librarians could optimize metadata for cataloging, repository harvesting, and the web, as well as find ways to notify interested users of the resource.

Dissemination was an integral role of the earliest libraries, but over time publishing and libraries became separate (Grafton 2007). The two are converging again in the digital environment. A recent report urges universities to renew their commitment to publishing by combining skills at libraries, university presses, and information technology departments (Brown et al. 2007). Libraries should seek new ways to support scholars in the online environment (ACRL 2007, Morgan 2004, OCLC 2004), and should focus on services including "the provision of a mechanism for the dissemination of information" (Manuel and Oppenheim 2007).

Cataloging
The provision of access best known by librarians is traditional cataloging using machine-readable cataloging (MARC) records. Online resources pose unique challenges to catalogers. Important information is not found in a standard place, and is often missing altogether. In recent years, several methods to extract metadata automatically from a web site's source code have been created (Library of Congress 2005, OCLC 2007, Su et al. 2002). However, in most cases catalogers will need information from the plain text of the site as well as the source code.
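What that automatic extraction involves can be made concrete in a few lines of code. The following is a minimal sketch, not any of the tools cited above: it assumes Python with the third-party requests and BeautifulSoup libraries, and any metadata names shown in comments are illustrative.

    # Minimal sketch: harvest the structured metadata a page header exposes,
    # the raw material that extraction tools crosswalk into catalog records.
    import requests
    from bs4 import BeautifulSoup

    def harvest_meta(url):
        html = requests.get(url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        found = {}
        if soup.title and soup.title.string:
            found["title"] = soup.title.string.strip()
        for tag in soup.find_all("meta"):
            name, content = tag.get("name"), tag.get("content")
            if name and content:
                # e.g., "author", "keywords", or Dublin Core names like "DC.title"
                found.setdefault(name.lower(), content.strip())
        return found

    print(harvest_meta("http://spenserians.cath.vt.edu/"))

Run against a page whose header carries only author and keywords meta tags, as described below, the harvest is correspondingly thin, whatever tool is doing the harvesting.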
Catalogers will typically look for a clear and consistent title, a statement of responsibility, a summary or description, software requirements (if any), an indication of whether the resource is updated (and with what frequency), and a date or dates, among other information. Sometimes this information can be found on a web site's "About" page, if present. However, the web site's source code header will likely become the de facto provider of metadata, since it provides a place for structured metadata that can be extracted automatically.

The use of OCLC's metadata extraction tool (OCLC 2007) for the cataloging of the English Poetry database proved unsatisfactory, despite the tool's crosswalk for more than 1,000 metadata fields (OCLC 2007a). The poor result, however, is likely due to the lack of metadata in the source code (the "meta" tags were limited to author and keywords) rather than the performance of the tool. Catalogers rarely have influence on the coding of the websites that they are cataloging, but in such situations, how should a website be coded to obtain maximum value from the metadata extraction tool? According to OCLC (2007a), the tool will work better as more metadata is provided. Metadata harvesting to create MARC records is unlikely to result in complete records, but can limit cataloger intervention to a minimum. Automated, or at least hybrid, means of metadata creation are needed (Weibel 2005), and are already in place at some abstracting and indexing services, as well as newspapers like the New York Times (Harris 2007). Additional cataloger time can be saved by using constant data in combination with metadata extraction.

The proliferation of online updating resources like the English Poetry database calls attention to the need for automated updating of MARC records. Metadata extraction tools should be designed to check updating resources on a regular basis. Repository aggregators harvest metadata regularly, and Google recommends that webmasters use an "If-Modified-Since" HTTP header so its crawlers will recognize updated content (Google 2007). E-mail alerts for updated content have been employed by the Library of Congress (2005), although this works better for discrete rather than integrated updates.
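The header-based approach works in both directions: a harvester can ask whether a page has changed before re-fetching it. A minimal sketch, assuming Python's requests library and an illustrative date; note that not all dynamically generated sites honor the header.

    # Minimal sketch: conditional request to see whether a cataloged page
    # has changed since the record was last updated. The date is illustrative.
    import requests

    url = "http://spenserians.cath.vt.edu/"
    resp = requests.get(
        url,
        headers={"If-Modified-Since": "Mon, 01 Oct 2007 00:00:00 GMT"},
        timeout=30,
    )
    if resp.status_code == 304:
        print("Unchanged: the existing MARC record is still current.")
    else:
        print("Content returned: the record may need updating.")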
Once the full catalog record is created, it can be entered into the OCLC database and exported to the local catalog. Once in OCLC's database, the record appears in WorldCat [http://www.worldcat.org]. The resource record includes a list of libraries that have added it to their catalog. WorldCat can be accessed and searched by anyone, and more importantly, its contents are indexed by search engines through Open WorldCat [http://www.oclc.org/worldcat/help/en/#howget]. Likewise, the record can be found in library catalogs that are pilot testing WorldCat Local, an implementation of the OCLC database as a local catalog [http://www.oclc.org/news/releases/200659.htm].

Repositories
Although the English Poetry database is on the web, there are advantages to also depositing the database in a repository. For example, a repository would address digital preservation, especially format migration (Davis and Connolly 2007, SHERPA 2006). This aspect of repositories is especially important to Dr. Radcliffe, who has already experienced two format migration problems in the long gestation of the database. Repositories, and especially aggregation interfaces like OAIster [http://www.oaister.org], provide another means of user discovery, and better ranking in search engine results (SHERPA 2006). Additionally, automated processes offer the potential to create a record for each of the 25,000 items in the database rather than one record for the database as a whole. Access to authors too numerous to list in a MARC record would be greatly improved.

However, repository deposit presents several problems for the English Poetry database. First, some repositories define open access in such a way that it would exclude the database either by format or its lack of peer review. These repositories are restricted to works of scholarship rather than scholarly resources, although the Open Archives Initiative (OAI) was originally intended for non-peer reviewed materials (Rusch-Feja 2002) and most repositories use a broader definition of open access (Hood 2007). Second, repositories use a download model (Smith et al. 2003) that is not as user-friendly for a resource like a database, which is easier to use on the web. Third, updates to the database are problematic because repositories are designed to save versions of resources. This seems incompatible with an environment in which scholars increasingly will be contributing growing data sets (ACRL 2007). Depending on the frequency of the updates, the repository version could be out of sync with the web version, and there would be no need to save numerous older versions. Preservation and access functions conflict here, much as remote storage for physical items enhances preservation while hampering access. Differences between repository software implementations in the handling of these resources have implications for repository selection and policy. Fourth, an appropriate repository may not be available for a particular resource. Dr. Radcliffe has been unable to identify a disciplinary repository for the database, and Virginia Tech does not yet have an institutional repository. Lack of repository access has implications for numerous subject areas, particularly in the humanities, which have been slower to develop them than the sciences (Suber 2004), and for independent and developing world scholars.

A third type of repository, the static repository, offers a low-effort, low-cost option (Moffat 2006), but is not well-suited for the size and dynamic nature of the database. A new initiative called Object Reuse and Exchange may provide a better model for complex objects like databases, and improve search engine behavior (OAI 2007). The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) requires metadata exposed as simple Dublin Core, but encourages additional metadata formats, including MARC (Moffat 2006, OAIster 2007). Dublin Core metadata would also assist OCLC metadata extraction, although its simplicity would limit its usefulness in the more detailed MARC record. And even Dublin Core's minimal metadata scheme is not fully utilized by current OAI-PMH data providers (Ward 2004). The significant discrepancies in metadata detail between repositories and library catalogs must be addressed to achieve metadata optimization.

Search engine optimization
"Despite a mantra of interoperability, attention is rarely given to the question of how to ensure that meticulously crafted metadata is used beyond the confines of its immediate surroundings. The existence of search engines is ignored or denigrated." (Dawson and Hamilton 2006)

Search engines are the primary means of discovering and selecting digital content (Dawson 2004).
The prominence of Google and other search engines among searchers has been noted, as well as increased collaboration between information providers and Google (Tenopir 2004). A study has shown that 89% of college students use search engines to begin an information search (De Rosa et al. 2006), and some claim that 95% of scholarly inquiries start at Google (Grafton 2007). Even in the hard sciences, search engines are a common choice. Kahn and Drey (2002) found that Google was the second choice of analytical and organic chemists, and the first choice among chemists in management and development positions. Search engines are easily accessible, and that is the most important variable governing the use of information (Morville 2005).

The practice of coding websites for the highest possible search result ranking is referred to as search engine optimization (SEO), and its importance for scholarly resources cannot be overstated:

"To reach users wherever they are, we as a community need to disclose more metadata to OAI harvesters [and] Web crawlers... search engine optimization is crucial." (Smith-Yoshimura 2007)

"Connecting users with the content and services we design and build is part of our broader mission. It's not good enough to create a great product and expect someone else to worry about how people will find it. Together with form and function, findability is a required element of good design and engineering. I relentlessly make this case to government agencies and nonprofits that don't have marketing departments. They tend to shy away from SEO as overly commercial, but they're missing a great opportunity to fulfill their mission by helping people find what they need." (Morville 2005)

"OA-OAI archiving and Google indexing are completely compatible. We can do both, and we should." (Suber 2004a)

Designing scholarly web resources for high placement in search results makes sense. In addition to increased visibility, top results are perceived as authoritative (Morville 2005). When the disciplinary repository ArXiv [http://arxiv.org] redesigned its site for improved indexing by Google, usage increased 50% (Inger 2004).

Search engine optimization has largely been employed by commercial web sites. Because Google and other search engines do not reveal the details of their search algorithms, a small industry has been created to help webmasters optimize the coding of their web sites so the sites appear as high as possible in search engine results. Search engines also offer advice to webmasters for optimal indexing (Google 2007). Much of SEO is geared toward making sites easy to access and navigate by the crawlers, or automated robots, that search engines use to index the web. Generally, positive factors for indexing include clear title tags (Dawson 2004, Sullivan 2002), alternate text for image tags (Google 2007), a site map, incoming links (Brooks 2004), and top domain (Dawson and Hamilton 2006). Negative factors are primarily those inhibiting the crawlers, such as frames, JavaScript (Weideman and Schwenke 2006), Flash, and redirects (these factors also inhibit the metadata extraction tools used in cataloging). Some of these features increase usability yet are in conflict with SEO (Bosworth 2007, Weideman and Schwenke 2006). Some aspects of the English Poetry database's source code, such as its extensive use of JavaScript, are in this category.
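Of the positive factors just listed, a site map is the easiest to supply programmatically. A minimal sketch, with invented record identifiers and an invented URL pattern standing in for the database's real ones:

    # Minimal sketch: generate a sitemap.xml listing every record page so
    # crawlers need not navigate the site. IDs and URL pattern are invented.
    from xml.sax.saxutils import escape

    record_ids = [1, 2, 3]  # in practice, pulled from the MySQL database

    with open("sitemap.xml", "w") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for rid in record_ids:
            url = "http://spenserians.cath.vt.edu/record.php?id=%d" % rid  # invented pattern
            out.write("  <url><loc>%s</loc></url>\n" % escape(url))
        out.write("</urlset>\n")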
Some steps to improve the indexing of the database have already been taken. Frames have been eliminated, alternate text for images added, and the original "org" domain reverted to "edu." Dawson and Hamilton (2006) advise that Google seems to privilege "gov", "edu" and "ac.uk" domains, so one should avoid using other domains merely to give a special project a memorable URL. Specific title tags (Dawson 2004) for each record in the database are in place. The influence of JavaScript on indexing can be mitigated, and a site map added.

The database needs usability improvements of the kind that should not affect indexing. Preliminary feedback from reference librarians indicates that undergraduates will have difficulty navigating the database, and will expect a search box. The database currently enables searching, but not on the front page. In addition to a built-in search box, OpenSearch [http://www.opensearch.org/Home] code can be added to allow toolbar searching of the database, so that users can easily search the database wherever they are on the web. As a scholarly resource, citations for each record in the database should be provided. A stable URL is an important part of the citation. Permanent links can appear on each page, and full citations could be generated automatically from Dublin Core or other metadata (Jorgensen 2005), or exported to citation software.

SEO differs significantly from the cataloging and repository worlds, where explicit metadata is highly valued. Metadata extraction, for example, performs better with more metadata (OCLC 2007a). However, search engines mostly ignore metadata added by webmasters due to a history of abuse and misrepresentation (Beall 2006, Brooks 2004), particularly keywords (Dornfest et al. 2006, Sullivan 2002). While the importance of incoming links is frequently cited (Brooks 2004), Dawson and Hamilton (2006) demonstrate that a library resource can achieve top listing without any incoming links.

Other dissemination methods
While metadata optimization can enhance access, more explicit methods of dissemination deserve examination. Some of these methods are currently employed by publishers to notify interested users, and others emerge from the increasing interactivity of the web. This kind of dissemination can result in the incoming links that further enhance search engine indexing.

Submission of a site's URL to search engines, directories and portals is one method. Search engines recommend site submission as part of their guidelines for webmasters (Google 2007). Indexing by Google Scholar was pursued for the database since it is a scholarly resource, but the search engine is limited to scholarship, that is, textual narrative in the form of articles and books, much in the same way that some repositories are restricted. A number of general portals encourage submission, such as the Open Directory Project [http://www.dmoz.org], the Yahoo Directory [http://dir.yahoo.com], and Intute [http://www.intute.ac.uk]. Mattison (2006) provides an extensive overview of disciplinary portals in the humanities. Among the portals linking to the database are two well-known sites, Voice of the Shuttle [http://vos.ucsb.edu] and Early Modern Resources [http://earlymodernweb.org.uk/emr]. However, Dr. Radcliffe reports a portal submission success rate of only 1 in 4, which was discouraging enough to make him give up. Sowards (1999) likewise found little success with URL submission to portals until news events created interest in his content. Many journals are now online and publish reviews of scholarly resources.
An online review provides awareness as well as an incoming link. The increasing interactivity of the web offers opportunities for dissemination. Lally and Dunford (2007) relate the use of the online encyclopedia Wikipedia to drive usage. In most cases this simply involves adding a link to a relevant article, although it sometimes entails writing a new article. An examination of incoming links to the database found several links from Wikipedia already in place. This examination also revealed that a link from a community blog such as MetaFilter [http://www.metafilter.com] or MonkeyFilter [http://monkeyfilter.com] can greatly increase awareness. Disciplinary mailing lists notify recipients of new resources, and the database received mention on the Byron list. Librarians can use collection development lists to alert other libraries that might want to add the database to their catalog. Hood (2007) suggests adding new online resources to pathfinders and subject guides, and the use of targeted e-mail alerts by subject bibliographers. Really Simple Syndication (RSS) has potential, but may not be appropriate where resources are changing frequently or numerous resources come online at once (as in the case of repositories).

Publishers commonly use RSS table of contents alerts, print flyers, mail postcards, advertise in journals, and send e-mail. While publishers have more tools for generating awareness, their content is hidden behind a subscription wall. The general public and libraries that cannot afford a subscription have no access. Ironically, it is OA resources like the English Poetry database that are difficult to disseminate, and librarians should be as creative as possible in assisting their faculty in doing so.

Concluding discussion
The future of this collaboration in dissemination involves numerous tasks. The source code of the database's web site needs refinement following the recommendations of Brooks (2004) and especially Dawson and Hamilton (2006). A full metadata header and citation functionality need to be added, and navigation and search tools improved. Then more explicit dissemination methods can be employed. In addition, Dr. Radcliffe would like to produce a guide for other faculty who are interested in the dissemination of their online content. Libraries may want to consider similar recommendations as a service to their faculty, particularly since the proliferation of digital centers on campuses means that much online scholarship will not be produced or hosted by libraries. Also, digital centers may not be as familiar with metadata uses for multiple purposes.

Measurement of access and dissemination after applying SEO principles will be a difficult task due to the variety and simultaneous influence of factors. Measurement might include indicators such as rank in search engines, library holdings in WorldCat, direct linking by libraries, statistics from website management applications as well as from the server and MySQL database, number of incoming links, success rate with portal submission, citations in scholarly papers, and improvements in metadata extraction.

Metadata optimization is necessary if efficiency and effective dissemination are to be realized by libraries. The current standard for online resources is some combination of extensible markup language (XML), Resource Description Framework (RDF), and Dublin Core (DC).
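For concreteness, a minimal sketch of what that combination can look like for a single resource: Dublin Core elements carried in RDF/XML, generated here with Python's standard library. The description is illustrative rather than the database's actual metadata.

    # Minimal sketch: one resource described with Dublin Core elements in
    # RDF/XML, the XML/RDF/DC combination named above. Values are illustrative.
    import xml.etree.ElementTree as ET

    RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)

    root = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(root, "{%s}Description" % RDF,
                         {"{%s}about" % RDF: "http://spenserians.cath.vt.edu/"})
    for element, value in [
        ("title", "Spenser and the Tradition: English Poetry 1579-1830"),
        ("creator", "Radcliffe, David"),
        ("type", "Dataset"),
        ("format", "text/html"),
    ]:
        ET.SubElement(desc, "{%s}%s" % (DC, element)).text = value

    print(ET.tostring(root, encoding="unicode"))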
Repository harvesting requires DC metadata and XML as its syntax; DC can be used in metadata extraction for cataloging; and RDF will be necessary for any future Semantic Web implementation. Also, citations can be extracted from DC. Libraries must first bridge the gap between Dublin Core and MARC. Omitting digital collections from the catalog results in information "silos" and requires users to search in different places. Workflows for electronic resources (such as electronic theses and dissertations) have already been created in libraries, and a similar but more comprehensive workflow should be created that provides access in catalogs, repositories, and on the web. One workflow integration tool recently became available (OCLC 2007b) that addresses the metadata problem by starting with the MARC record and deriving qualified DC upon deposit of the digital resource. This crosswalk direction may prove more effective than deriving MARC from DC.

Libraries must address the problem of providing access to online resources in an environment that largely ignores explicit metadata. The XML/RDF/DC metadata scheme may be useless to most search engine crawlers, yet the web is where most information seekers are going first. The invisibility of metadata to search engines may be one cause of so little effort by libraries toward SEO (Beall 2006). While this metadata scheme is compatible with the Semantic Web, much skepticism remains about user-supplied metadata (Weibel 2005). A more realistic scenario in which metadata could be fully utilized is that of "closed applications" (Brooks 2004) such as intranets or digital libraries, or dividing the web by top domain or other means. Google Scholar's harvesting of citations from scholarly publishers may be one example. Until ways can be found to utilize the XML/RDF/DC scheme in web indexing, libraries should probably heed the recommendations of Dawson and Hamilton (2006).

Genre will become increasingly important in the online environment (Morville 2005). The library community may want to consider metadata identification of databases and other online resources, as well as creating a genre for scholarly resources. A category for scholarly resources (i.e., the materials on which scholarship is based) may become important as more primary research material is digitized and as more data sets are made available.

While faculty-created online resources such as the English Poetry database may not be common, or faculty not as concerned with dissemination, libraries should consider their role in scholarly publishing. As Manuel and Oppenheim (2007) state, "Google, repositories and libraries all have a part to play in improving dissemination, and thus research impact." The knowledge of metadata and open access in libraries positions them well for increased faculty collaboration. As the volume of information increases, our ability to find particular items decreases, and we spend more time searching (Morville 2005). Online resources need more metadata, and libraries can fill the need in the scholarly arena.

References
ACRL (2007), "Establishing a research agenda for scholarly communication: a call for community engagement", available at: http://www.ala.org/ala/acrl/acrlissues/scholarlycomm/SCResearchAgenda.pdf
Beall, J. (2006), "The death of metadata", The Serials Librarian, Vol. 51, No. 2.
Bosworth, A. (2007), "Google is destroying the web and you don't even know it", available at: http://blog.alexbosworth.net/article/google_destroying_the_web
(2004), “The nature of meaning in the age of Google”, Information Research, Vol. 9 No. 3, paper 180, available at: http://InformationR.net/ir/9- 3/paper180.html Brown, L., Griffiths, R., and Rascoff, M. (2007), “University publishing in a digital age”, Ithaka Report, available at: http://www.ithaka.org/strategic- services/Ithaka%20University%20Publishing%20Report.pdf Davis, P.M. and Connolly, M.J.L. (2007) “Institutional repositories: evaluating the reasons for non-use of Cornell University’s installation of DSpace”, D-Lib Magazine, Vol. 13 No. 3/4, available at: http://www.dlib.org/dlib/march07/davis/03davis.html Dawson, A. (2004), “Creating metadata that works for digital libraries and Google”, Library Review, Vol. 53, No. 27, pp. 347-350, available at: http://eprints.cdlr.strath.ac.uk/2289/01/ad200402.htm Dawson, A. (2005), “Optimising publications for Google users”, pp. 177-194, in Miller, W. and Pellen, R.M. (eds.), Libraries and Google, Haworth Press, Binghamton, N.Y. Dawson, A. and Hamilton, V. (2006), “Optimising metadata to make high-value content more accessible to Google users”, Journal of Documentation, Vol. 62 No. 3, pp. 307-327 10 De Rosa, C. et al., (2006), College Students’ Perceptions of Libraries and Information Resources, OCLC, Dublin, Ohio. Dornfest, R., Bausch, P. and Calishain, T. (2006), Google Hacks (3rd ed.), O’Reilly, Sebastopol, Calif. Google (2007), “Webmaster guidelines”, available at: http://www.google.com/support/webmasters/bin/answer.py?answer=35769 Grafton, A. (2007), “Future reading: digitization and its discontents”, The New Yorker, available at: http://www.newyorker.com/reporting/2007/11/05/071105fa_fact_grafton Harris, J. (2007), “Messing around with metadata”, Open, [New York Times blog], available at: http://open.blogs.nytimes.com/2007/10/23/messing-around-with-metadata/ Hood, A.K. (2007), “Open Access Resources: Executive Summary”, SPEC Kit 300, pp. 11-14, Association of Research Libraries, Washington, D.C., available at: http://www.arl.org/bm~doc/spec300web.pdf Hunter, P. and Guy, M. (2004), “Metadata for harvesting: the Open Archives Initiative, and how to find things on the Web”, Electronic Library, Vol. 22 No. 2, pp. 168-174, available at: http://homes.ukoln.ac.uk/~lispjh/tel-metadata/metadata-final5.pdf Inger, S. (2004), "Google vs traditional information services: a comparison of search results", National Federation of Abstracting and Indexing Services (NFAIS), 22 February, available at: http://www.scholinfo.com/GoogleversusTraitionalInformationServices.pdf Jorgensen, P. (2005), “Citations in hypermedia: implementation issues”, Information Technology and Libraries, Vol. 24 No. 4, available at: http://news.ala.org/ala/lita/litapublications/ital/volume242005/number4december/content v424/jorgensen.pdf Kahn, D. and Drey, J. (2002), "Finding Chemical Information on the Web - the User's Viewpoint", Free Pint, Issue 109, available at: http://www.freepint.com/issues/040402.htm#feature Lally, A.M. and Dunford, C.E. (2007), “Using Wikipedia to extend digital collections”, D-Lib Magazine, Vol. 13 No. 5/6, available at: http://www.dlib.org/dlib/may07/lally/05lally.html Library of Congress (2005), “Web Cataloging Assistant”, Bibliographic Enrichment Advisory Team, available at: http://catdir.loc.gov/catdir/beat/webcat.html Manuel, S. and Oppenheim, C. (2007), “Googlepository and the university library”, Ariadne, Issue 53, available at http://www.ariadne.ac.uk/issue53/manuel-oppenheim/ 11 Mattison, D. 
(2006), “The digital humanities revolution”, Searcher, Vol. 14, No. 5, pp. 25-34. Moffat, M. (2006), “Marketing with metadata—how metadata can increase exposure and visibility of online content”, New Review of Information Networking, Vol. 12, Nos. 1-2, pp. 23-40. Morgan, E.L. (2004), “Open access publishing”, available at http://infomotions.com/musings/open-access/open-access.pdf Morville, P. (2005), Ambient findability, O’Reilly, Sebastapol, Calif. OAI (2007), “Open Archives Initiative announces public meeting on March 3, 2008 to release Object Reuse and Exchange specifications”, press release available at: http://www.openarchives.org/ore/documents/ore-hopkins-press-release.pdf OAIster (2007), “How to become a data contributor”, available at: http://www.oaister.org/dataproviders.html OCLC (2004), The 2003 OCLC Environmental Scan: Pattern recognition, OCLC, Dublin, Ohio. OCLC (2007), “Cataloging: Create Bibliographic Records”, OCLC Connexion Client Guides, pp. 15-23, available at http://www.oclc.org/support/documentation/connexion/client/cataloging/createbib/create bib.pdf OCLC (2007a), E-mail from OCLC Connexion-Support, August 20, 2007. OCLC (2007b), “Attach digital content to WorldCat records”, available at: http://www.oclc.org/support/documentation/connexion/client/cataloging/bibactions/#cat_ act_attach_digital_files Rieh, S.Y., Markey, K., St. Jean, B., Yakel, E., and Kim, J. (2007), “Census of institutional repositories in the U.S.: a comparison across institutions at different stages of IR development”, D-Lib Magazine, Vol. 13 No. 11/12, available at: http://www.dlib.org/dlib/november07/rieh/11rieh.html Rusch-Feja, D. (2002), “The Open Archives Initiative and the OAI Protocol for Metadata Harvesting: rapidly forming a new tier in the scholarly communication infrastructure”, Learned Publishing, Vol. 15, No. 3, pp. 179-186. SHERPA (2006), “Fifteen common concerns- and clarifications”, available at: http://www.sherpa.ac.uk/documents/15concerns.html 12 Smith, M., Barton, M., Bass, M., Branschofsky, M., McClellan, G., Stuve, D., Tansley, R., and Walker, J.H. (2003), “DSpace: an open source dynamic digital repository”, D-Lib Magazine, Vol. 9 No. 1, available at: http://www.dlib.org/dlib/january03/smith/01smith.html Smith-Yoshimura, K. (2007), “RLG Programs Descriptive Metadata Practices Survey Results”, OCLC Programs and Research, available at: http://www.oclc.org/programs/publications/reports/2007-03.pdf Sowards, S.W. (1999), “Practical lessons for small-scale web publishers”, Journal of Electronic Publishing, Vol. 5, Issue 2, available at: http://www.press.umich.edu/jep/05- 02/sowards.html Su, S.T., Long, Y. and Cromwell, D.E., (2002), “Metadata by crawling E-publications”, Information Technology and Libraries, Vol. 21 No. 4, available at: http://news.ala.org/ala/lita/litapublications/ital/2104su.cfm Suber, P. (2004), “Open access in the humanities”, SPARC Open Access Newsletter, Issue 70, available at http://www.earlham.edu/~peters/fos/newsletter/02-02- 04.htm#humanities Suber, P. (2004a), “The case for OAI in the age of Google”, SPARC Open Access Newsletter, Issue 73, available at http://www.earlham.edu/~peters/fos/newsletter/05-03- 04.htm#oai-google Suber, P. (2007), “Open Access overview”, available at: http://www.earlham.edu/~peters/fos/overview.htm Sullivan, D. (2002), “Death of a meta tag”, available at http://searchenginewatch.com/2165061/print Tenopir, C. 
(2004), “Is Google the competition?”, Library Journal, Issue 6, available at: http://www.libraryjournal.com/article/CA405423.html?display=Online+ Ward, J. (2004), “Unqualified Dublin Core usage in OAI-PMH data providers”, OCLC Systems & Services, Vol. 20, No. 1, pp. 40-47. Weibel, S.L. (2005), “Border crossings: reflections on a decade of metadata consensus building”, D-Lib Magazine, Vol. 11 No. 7/8, available at: http://www.dlib.org/dlib/july05/weibel/07weibel.html Weideman, M. and Schwenke, F. (2006), “The influence that JavaScript has on the visibility of a website to search engines- a pilot study”, Information Research, Vol. 11 No. 4, available at http://informationr.net/ir/11-4/paper268.html 13 About the author Philip Young has served since 2006 as Catalog Librarian for science and technology at the University Libraries at Virginia Tech. Philip Young can be contacted at: pyoung1@vt.edu. work_wvnozgdfgrhjtayleb47zjdf4u ---- Microsoft Word - E-LIS_OTDCF_v23no4.doc by Norm Medeiros Associate Librarian of the College Haverford College Haverford, PA Bibliographic Challenges in Historical Context: Looking Back to 1982 ___________________________________________________________________________________________________ {A published version of this article appears in the 23:4 (2007) issue of OCLC Systems & Services.} “A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell.”1 -- William Strunk Jr. ABSTRACT This article comments on a paper written in 1982 in celebration of the 25th anniversary of the Association for Library Collections & Technical Services (ALCTS). The piece describes persistent challenges faced by the cataloging community. It includes comments made during the ALCTS 50th Anniversary National Conference as a means of placing these challenges in historical context. KEYWORDS online catalog ; OPAC ; cataloging ; information access I’m returning home in the comfort of Amtrak’s coach class from the Association for Library Collections & Technical Services’ (ALCTS) National Conference, a celebration of the organization’s 50 years. The two-day event held in Washington D.C. featured visionaries who forecasted an increasingly chaotic future, penetrating anxiety into the collated world of the 125 members in attendance. Being a practical and responsible lot, librarians like to plan ahead. Given the exponential rate of change and the inability for our profession to keep up with it, I wonder if it’s not better to proceed like the hitter who reacts naturally to a pitch rather than trying to guess what’s coming. Not many librarians are ballplayers I’ve learned, so aboard the S.S. Unknown we go with a course set for the horizon. Don’t forget the Dramamine. I’ve had occasion recently to review past issues of Library Resources & Technical Services (LRTS), the official journal of ALCTS. To the satisfaction of serials librarians worldwide, LRTS has stayed true to its founding title, unlike its sponsor which was known as the Resources and Technical Services Division (RTSD) until 1989. Since its inception in 1957, LRTS has published peer-reviewed papers on collection development and technical services. In 1982, a silver anniversary issue of LRTS was commissioned. 
While browsing this volume, I was struck by the title of one paper: "Is there a catalog in your future?" by Nancy J. Williamson.2 The question is timeless, having as much significance today as it did 25 years earlier. Intrigued, I began reading the article, which unlike so many works that predict a false future, proved remarkably prophetic.

THE BIBLIOGRAPHIC WORLD OF 1982
When Williamson wrote her article in 1982, the Anglo-American Cataloguing Rules, Second edition (AACR2) had been in service for four years; first-generation library management systems were being implemented in some libraries; indexing and abstracting services were developing Boolean and keyword search capabilities; and commercial information systems such as BRS and Dialog, through modem telecommunications, were allowing individuals from home or office to access citation, and in some cases, full-text data. At the onset of these developments, Williamson and others began to question the future role of the catalog and the library, particularly as the expectations of users grew in accordance with these technological advances. One of Williamson's concerns remains constant today: the need to improve access within online catalogs. Enriched catalog tools such as Endeca seek to maximize the rich data available in most bibliographic records. Yet it remains in 2007 as it did in 1982 that most catalogs are ineffective when compared to sophisticated, easy-to-use interfaces built for the commercial sector.

FULL-TEXT IS KING
Williamson's crystal ball noted the challenge of trying to serve a populace increasingly demanding of full-text sources. Williamson thought full-text would be "doubly attractive to users;"3 we know such sources to be infinitely more attractive to users, having become the expectation, rather than the exception. For example, undergraduates love JSTOR, the giving tree of online resources; it never fails to satisfy. Given the culture of convenience that permeates the lives of college students, one wonders about the future viability of information providers that fail to provide full-text. Abstracting and indexing services seem at risk, their niche being subsumed by Google Scholar. Although canceling citation-only services may be premature, especially given Google's secrecy about the sources it indexes, might there be a return to per-search database pricing models for those resources whose use is waning?

THE PLACE OF THE CATALOG
With stunning precision, Williamson predicts the future place of the catalog and role of the library upon the 50th anniversary of ALCTS:

I see a catalog in our future, but a catalog which will not be the major focal point in gaining access to information, and one which will play a diminished role in that world. While the role of the library as a recreational institution does not appear to be in serious question, its survival as an information agency will be dependent on its ability to redefine its procedures and goals in terms of the bibliographic universe as a whole. In doing so, it will be necessary to place its basic tool - the catalog - in its proper perspective with other access tools. In brief, librarians must consider the ways and means of developing information services as opposed to providing access to specific collections or particular databases.4

The persistent problems of the catalog exist less with its business modules and more with its front-end.
R. David Lankes, an invited speaker at the ALCTS National Conference, made the observation that only in libraries are customers given access to the inventory system; in no other line of business are customers given such a privilege.5 Yet libraries not only provide such access, they do so knowing that the interface provided is not good. An undercurrent of Lankes's speech, which was even more pronounced during the breakout sessions that followed, is that the time has come to peel away the discovery component of the catalog from its business core. Other information players do search better than libraries; we should let them do it.

CONCLUSION
Near the close of Williamson's insightful article, she cautions, "In the marriage between new technology and access to information, let us not be left standing at the altar."6 Twenty-five years later, are we even at the wedding? Enormous supplies of energy during the ALCTS National Conference were spent strategizing ways for libraries to remain relevant to our visually-oriented, socially-networked clientele. Pushing services into venues such as MySpace, YouTube, and Second Life was seen as one option; building relationships across campuses and communities in the more traditional ways, another. If I live to comment on the 75th anniversary of ALCTS, I suspect the issue of library relevance will still be with us. Will the catalog be there as well? I'll let you know when we reach the horizon.

REFERENCES
1. Strunk Jr., William, and E.B. White. The Elements of Style, 3rd ed. New York: Collier Macmillan, 1979.
2. Williamson, Nancy J. "Is there a catalog in your future?" Library Resources & Technical Services 25 (1982): 122-135.
3. Ibid, p. 124.
4. Ibid, p. 127.
5. Lankes, R. David. "Collecting Conversations in a Massive Scale World." Paper presented at the ALCTS National Conference, Washington, D.C., June 21, 2007.
6. Williamson, p. 134.

work_wyl7e6innrewvhogvcqiqd2ske ----

A Survey of Batch Cataloging Practices and Problems
PHILIP YOUNG
Virginia Tech, Blacksburg, VA

Abstract. Groups of bibliographic records are added to library catalogs with increasing frequency. Batch cataloging requires knowledge of bulk record transfer as well as current cataloging standards. While more efficient than cataloging items individually, batch cataloging requires different skills and creates new challenges. Responses to a wide-ranging online survey document the workload, tools, practices, and problems of batch cataloging. The unique characteristics of electronic resources affect many aspects of batch cataloging. Survey respondents lack consensus on how to share improved batch records, and recent trends in the bibliographic environment seem to make a networked solution less likely.

KEYWORDS: batch cataloging, vendors, bibliographic record quality, cataloging tools, sharing, cooperative cataloging

Address correspondence to Philip Young, University Libraries, Virginia Tech, 560 Drillfield Drive, Blacksburg, VA 24061. E-mail: pyoung1@vt.edu

Batch cataloging in libraries has become increasingly common in recent years. This is in part due to large sets of electronic resources that need to have bibliographic records entered in the catalog when a subscription becomes valid. Machine-readable cataloging (MARC) records, sometimes by the hundreds or thousands, are obtained, edited, and entered into the library's online catalog at one time.
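What that looks like in practice varies by library, but a small script conveys the pattern. A minimal sketch using Python's pymarc library (one alternative to the MarcEdit and ILS-based tools surveyed below); the file names and proxy prefix are hypothetical placeholders.

    # Minimal sketch: read a vendor's file of MARC records, prefix each
    # e-book URL (856 $u) with a proxy string, and write the edited batch.
    # File names and the proxy prefix are hypothetical.
    from pymarc import MARCReader, MARCWriter

    PROXY = "https://proxy.example.edu/login?url="

    with open("vendor_batch.mrc", "rb") as infile, \
            open("edited_batch.mrc", "wb") as outfile:
        writer = MARCWriter(outfile)
        for record in MARCReader(infile):
            if record is None:  # skip records the reader cannot parse
                continue
            for field in record.get_fields("856"):
                url = field.delete_subfield("u")  # first $u only, for brevity
                if url and not url.startswith(PROXY):
                    url = PROXY + url
                if url:
                    field.add_subfield("u", url)
            writer.write(record)
        writer.close()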
While problems with individual records can be identified in batch view and corrected prior to or after load by a headings report or authority control vendor, all records are not examined individually as in traditional cataloging practice. The continuing proliferation of information resources, especially in electronic form, has required this workflow, which is more efficient than older methods but presents new and complex problems.

The term "batch cataloging" as used in this article refers to obtaining (or creating), transferring, manipulating, and editing groups of MARC bibliographic records. Those performing this work must know cataloging standards as well as how to transfer records into the local catalog. While it is possible for staff in other areas of the library to load records into the catalog, they are not always trained in cataloging practices. Those performing batch cataloging are also responsible for the integrity of the metadata and are therefore usually part of the cataloging department.

Batch cataloging was added to the author's responsibilities in 2009, and since then the practice has generated numerous questions about how other libraries address related problems. How significant is the batch cataloging workload at other libraries, and who is responsible for it? What tools and methods are employed to improve records? How often are holdings set in the Online Computer Library Center (OCLC)? How is authority control addressed? What is the effect of discovery layers on batch cataloging? How are duplicate records detected? How many catalogers are creating record batches? Is there an automated way to determine which of multiple links on an e-book record are valid for a library and which are not? And perhaps most importantly, how can libraries improve the quality of record batches and share those improvements with others? As the frequency and volume of batch cataloging increases and staffing levels decrease or remain steady, answers to these and other batch-related questions will be of interest to libraries desiring to improve both workflow and record quality.

Literature Review
While a number of articles have been published about batch cataloging, this literature review focuses on the most recent articles relevant to one or more survey questions. No surveys focused on batch cataloging practices and issues in general have been conducted before, though Kemp (2008) surveyed librarians about MARC record services for serials. Respondents indicated that they used a variety of methods to manipulate and improve records, including the free software MarcEdit (http://people.oregonstate.edu/~reeset/marcedit/html/index.php), scripts, global update in the integrated library system (ILS), and individual record editing. The survey documented a wide range of record errors, in addition to problems associated with duplicate and brief records, and about 40% of respondents expressed concerns about record quality.

Martin and Mundle (2010) reported a case study of managing and improving the quality of bibliographic records for a large e-book collection in a consortial setting. They created a discussion list for the consortium and shared record quality improvements, though not authority cleanup. Collaboration between the consortium and the vendor "to improve records before receiving them was the most productive route to quality data in the catalog" (p. 234).

Mugridge and Edmunds (2009) relate a batch-cataloging workflow and bring to light a wide variety of issues.
While a significant increase in usage results from batch loading title-level records, the process is a complex one with technological and organizational challenges. The latter requires upgraded skills, accurate documentation, and improved inter-departmental communication.

Wu and Mitchell (2010) discuss batch cataloging of e-books using MarcEdit, as well as issues raised by vendor-supplied records and the impact of guidelines for provider-neutral e-book records. Librarians rather than paraprofessionals performed most batch cataloging due to the complexity of the process. The library's shift to an e-book knowledgebase with a MARC record service, while not solving problems of record quality and duplication, streamlines workflow so that work can be delegated.

Chen and Wynn (2009) surveyed academic libraries to determine if and how e-journals were cataloged and found that 30 of 36 libraries were using record batches. The poor quality of vendor records was a common complaint. Twenty-nine percent of libraries rejected MARC record services as too expensive, and 52% did not set e-journal holdings with OCLC.

Martin (2007), in an overview of e-book cataloging problems, discussed delays in offering title-level access in the catalog and documented several common errors in records. Martin also found that when vendor records are used, holdings are often not recorded in OCLC, resulting in a loss of reliable information for interlibrary loan and less sharing of catalog records.

Sanchez et al. (2006) described a method for improving a large set of e-book records, including authority control, before loading them into the catalog. In addition to documenting numerous record-quality problems, they emphasized the importance of interdepartmental communication. While Sanchez et al. used an ILS report to identify authority work needed, Finn (2009) described a process using an authority vendor and MarcEdit to complete authority work before loading records into the catalog.

Methodology
The author created an online survey containing a wide range of batch cataloging questions in order to provide a landscape view of the practice that would also indicate possible avenues for future research. The survey responses document the workflow, tools, practices, and problems of batch cataloging. Because the survey focused on larger issues associated with batches, there was no attempt to document specific errors in MARC records. While the survey was focused on e-book batch records, some questions were format-neutral. The survey did not distinguish between one-time and continuing batch cataloging.

The online survey asked 31 questions in either multiple-choice (or "select all that apply") or short-answer (free text) format. This article presents a subset of the survey (22 questions, see Appendix), with some questions omitted in the interest of space and relevance. The survey begins with questions about the respondents and their libraries, moves on to questions about tools and practices, and ends with "big-picture" questions and an opportunity for open comment.

The survey was announced by e-mail on three listservs, BATCH (administered by the author, with 104 subscribers), MARCEDIT-L (694 subscribers), and AUTOCAT (about 6,200 subscribers). The online survey utilized an in-house survey instrument and was active for two weeks, from November 9-23, 2010. A reminder e-mail was sent to the same listservs a few days before the survey closed.
The survey recorded 128 unique respondents, though some respondents did not answer every question. Therefore the number of responses varies by question, and is reported with the results below. Respondents were a self-selected group. The open comments at the end of the survey are incorporated into the results wherever possible.

Results

Respondents and Their Libraries
Respondents to the survey were overwhelmingly academic librarians located in the United States: 102 of 128 respondents (80%) worked in academic libraries, while 17 (13%) worked in public libraries and 6 (5%) worked in special libraries. Additionally, two respondents worked for consortia, and one for a school library. One hundred eight of 128 respondents (84%) were located in the U.S., with the remainder from Canada, the United Kingdom, and Australia. One hundred nine of 128 respondents (85%) held librarian status (defined as possessing a M.L.S. or equivalent degree), while 19 (15%) were paraprofessionals.

The survey attempted to address the workload associated with record batches by asking questions about the number of people involved in batch cataloging at each library and the percentage of time spent on it (see Tables 1 and 2). In the comment section at the end of the survey, two respondents clarified that they spent much less than 25% of their time on batch cataloging.

Table 1. Number of batch catalogers per library (responses and percentage).
1 cataloger: 51 (40%)
2 catalogers: 36 (28%)
3 catalogers: 30 (23%)
4 or more: 11 (9%)
Total: 128 (100%)

Table 2. Percentage of time spent working with batch records (choose closest answer).
25%: 96 (77%)
50%: 20 (16%)
75%: 7 (5%)
100%: 2 (2%)
Total: 125 (100%)

A question about types of material represented in record batches allowed for multiple selections, and the numbers and percentages are from a total of 128 respondents (see Table 3). E-books were by far the most frequently batch-loaded record format, selected by 116 respondents (91%). A wide variety of formats, however, are batch loaded. In the "other" category, the most common responses identified e-journals, various audio-visual materials (such as video, music, and audio books, both online-accessible and in a physical carrier), and government documents.

Table 3. Types of material batch cataloged (check all that apply; 128 responses).
E-books: 116 (91%)
Print books: 64 (50%)
Online video: 51 (40%)
Microform: 37 (29%)
Maps: 16 (13%)
Other: 50 (39%)

Tools
Five survey questions asked respondents about the tools used to acquire and manage MARC record batches. While many record sets are obtained through a service such as WorldCat Collection Sets or downloaded directly from a vendor's web site, MARC records can also be obtained in other ways. Asked whether they had used the Z39.50 protocol to obtain records from another catalog, 46 of 126 respondents (37%) said yes, while 80 (63%) said no. A slightly greater number have used OCLC Connexion's batch-processing capabilities to create and export a batch, with 54 of 127 respondents (43%) answering yes, and 73 (57%) answering no. However, in a later question about setting holdings in OCLC, 8 of 95 respondents stated that their libraries were not OCLC members. Therefore, some of those who answered "no" did so because OCLC Connexion was not available to them.

Respondents were asked to identify the editing tools they use, and/or name other tools (the number of responses and percentages are not cumulative because multiple selections were permitted).
Of 128 total respondents, 107 (84%) selected MarcEdit. Fifty-seven (45%) used their system's global update function, 26 (20%) used Microsoft's Excel spreadsheet application, and 24 (19%) named another tool. Of the "other" responses, 4 respondents mentioned MARC Report, while 2 respondents each mentioned Perl, shell scripts, Connexion, OCLC macros, and Bibliofile. Receiving one mention apiece were MARC Magician, home-grown ruby scripts, emacs macros, TextPad, Notepad++, gedit, MARCTOOL, UNIX editor, NANO, MARC Global, Regex Buddy, Mitinet, and load profiling (the subject of another question).

Respondents were asked how often they used regular expressions, a simple computer language used to identify records or parts of records for editing or manipulation. Regular expressions can be used in MarcEdit and the global update function of some library systems. Of 124 respondents, 16 (13%) selected "frequently," 40 (32%) "sometimes," and 68 (55%) "never."

Duplicate record detection is one of the thorniest problems in batch cataloging. Respondents were asked about their methods of detecting duplicates within a single batch and for records already in the catalog. Results should be viewed with caution, because the free-text answers were sometimes hard to interpret, respondents often did not indicate which question they were answering (or both), and there was some confusion over the meaning of "duplicate." While the intended meaning of duplicate was an exact copy requiring deletion, some respondents interpreted duplicate to mean multiple e-book records for the same content but from different publishers or vendors. The author attempted to categorize most of the 115 responses and eliminated others due to lack of clarity. Fifty respondents (43%) mentioned that load tables identified duplicates based on a MARC tag of 001, 010, 020, or 035. Twenty-three respondents mentioned a headings report or other information generated by the load table. It is likely these responses were referring to the same or a similar process, and these categories could be combined for a total of 73 (63%) responses. Seven respondents used a visual check for deduplication, three used scripts, two used Excel, and ten had no deduplication method. For deduplicating records within a batch, ten respondents mentioned MarcEdit, while six had no method.

Practices
The most significant recent change in the batch cataloging of e-books is the transition to provider-neutral records (Program for Cooperative Cataloging, 2009b). Previously, a separate e-book record for each vendor or publisher was entered into OCLC, resulting in records that varied in quality and were difficult to find. The provider-neutral standard allows multiple links (MARC 856 fields) to be attached to one record and eschews vendor-specific information in other fields. This means that some batches must be edited to delete links to e-books that the library cannot access.
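One scripted way to perform that deletion at the batch level combines the regular-expression matching discussed above with Python's pymarc library. The provider domain and file names below are hypothetical, and this is an illustration rather than a method reported by respondents.

    # Minimal sketch: drop 856 fields whose URLs do not match the provider a
    # library actually licenses. Domain and file names are hypothetical.
    import re
    from pymarc import MARCReader, MARCWriter

    LICENSED = re.compile(r"https?://(\w+\.)*provider\.example\.com/", re.IGNORECASE)

    with open("provider_neutral.mrc", "rb") as infile, \
            open("local_links_only.mrc", "wb") as outfile:
        writer = MARCWriter(outfile)
        for record in MARCReader(infile):
            if record is None:
                continue
            for field in list(record.get_fields("856")):
                urls = field.get_subfields("u")
                if urls and not any(LICENSED.match(u) for u in urls):
                    record.remove_field(field)  # a link the library cannot use
            writer.write(record)
        writer.close()

MarcEdit's find-and-replace functions accept similar regular expressions, so comparable cleanup can be done there without scripting.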
Ten respondents have not encountered the problem because they only receive records from the vendor or WorldCat Collection Sets, which includes only the vendor link. Three respondents indicated that they use a URL checker, two respondents said they add their own links, and one uses a link resolver to coordinate links. Other responses varied from checking the links manually to moving them to another field in case they were needed later. Almost all e-books have print equivalents, and e-book records usually have a 776 MARC field for linking to the other format. However, in a batch of e-book records it is not efficient to check for a library’s print equivalent for each record. There were 118 responses to the question “What is your library’s approach to the 776 linking field in e-book batches?” (see Table 4). The majority of respondents do not address this field, have display turned off, or use the single-record approach that negates the need for a 776. Only two respondents check individual records for print equivalents in their library. In the “other” category, one library deletes all 776 fields. Some of the “other” responses were eliminated or moved to other categories. Table 4. 776 Approach. Policy Responses Percentage We leave them as is/no policy 79 67% We turn/have display off 23 19% We use the single-record approach 13 11% We check record by record whether display should be on or off 2 2% Other: 1 1% Total 118 100% Respondents were asked whether they set holdings in OCLC for record batches, and why. Of 127 respondents, 28 (22%) answered yes, 42 (33%) answered no, and 57 (45%) answered sometimes/depends. Ninety-five respondents explained their answer. Twenty-three respondents said they set holdings if they had OCLC records or record numbers; vendor records did not have holdings set in OCLC. Fifteen respondents did not set holdings for e-book records, with some explaining that this was due to interlibrary loan (ILL) restrictions, vendor restrictions, or because titles changed frequently. Nine respondents upload library holdings to OCLC at regular intervals (weekly, monthly, or quarterly). Eight respondents were from libraries that were not OCLC members. Conversely, three respondents used WorldCat Local, making accurate holdings imperative. Three respondents had holdings set automatically by vendors, and one said holdings were set only for WorldCat Collection Sets. One respondent using the single-record approach attempts to set holdings on both record formats, but with some inconsistency. One respondent generates search keys from vendor records and attempts to set holdings for as many as possible. One respondent sets holdings only for purchased e-books but not for those leased. Asked about the relationship between electronic resources management (ERM) systems and e-book batches, the vast majority of respondents (81 of 110, or 74%) answered that they do not use ERM for e-books. Sixteen respondents (14%) use ERM for e-books but not for batch cataloging, and 11 (10%) use ERM with some or all e-book batches. In the “other” category, two respondents said they were investigating ERM for e-books. Some responses in the “other” category were moved or eliminated. Respondents were asked whether they use customized load profiles for batches, and if so, when or why. Eleven of the 105 responses were eliminated due to lack of clarity. Of the 7 remaining 94 responses, 76 (81%) answered yes, and 18 (19%) answered no. 
Of the affirmative answers, the most common reasons given were to determine record overlay and to create item or holdings records. Others used load profiles to strip or add MARC fields, add proxy information, and adjust control fields. Those answering "no" said that customized load profiles were not possible in their system, or they felt that using MarcEdit was easier than adjusting the profile.

Authority control for batches is accomplished in a variety of ways. All 128 survey respondents answered a question about their library's approach to authority control for batches. Fifty respondents (39%) said that authority control was done in-house. Forty-seven respondents (37%) sent batches to an authority-control vendor and then did local cleanup. Fourteen respondents (11%) do not have authority control for their catalogs. Seventeen respondents (13%) gave a text response in the "other" category. Of these, 3 stated that their authority control was done in-house, but only occasionally, with lower quality control, or only for certain fields. Three said batches were not treated differently from the rest of the catalog, from which new records were sent to an authority-control vendor periodically. Two respondents said they use a combination of in-house and vendor-supplied authority control. Two said they do no authority control on batches. Six respondents said some but not all batches were sent to a vendor. One respondent said the library's consortium handles authority control. In the comment area at the end of the survey, four respondents cited authority control as a problem, especially for names: "Authority control of names in vendor-produced e-book records leaves much to be desired. We are always having to clean them up."

Big-Picture Questions

The survey asked four questions that situated batch cataloging in a larger context. Fewer responses were received than for previous questions, perhaps because they raise issues that are the responsibility of other people in the library. Furthermore, since the questions solicited a free-text response, compiling and categorizing answers was often challenging.

The recent advent of discovery platforms that can recognize electronic access has the potential to affect batch cataloging. Respondents with such a discovery platform, or who were planning to implement one, were asked what changes this might mean for batch cataloging. Fifteen of 56 total respondents said they did not know, including several using discovery platforms such as Summon, Primo, and Encore. Nine respondents said that the discovery platform has not affected batch loading (including users of Summon, EBSCOhost integrated search, and VuFind). Of these, two attributed this in part to the knowledgebase not being comprehensive enough; one did not subscribe to the e-book knowledgebase; and one said title searches were problematic. Two respondents said they would still require full catalog records with Library of Congress subject headings. One respondent said Primo had changed the library's policy because it de-duplicated records. One respondent said e-book records are purchased from Serials Solutions (the creator of Summon), and another respondent is considering discontinuing the batch loading of e-books. One respondent said, "We are currently moving to Serials Solutions for all our e-content MARC records.
It has become too much work for us to keep up with all the updates to individual record sets and all the individual editing," but another respondent using the MARC records from that service expressed serious concerns about the quality of the records supplied.

Electronic resources are available to library users immediately, and there are often delays before a MARC record can be created, distributed, and entered into the catalog. Respondents were asked whether delays between print availability, online availability, and batch cataloging have caused ordering or collection-management problems in their libraries, and how these problems are addressed. Of 68 responses, 16 experienced no problems with delays, and 7 said problems were infrequent and handled on a one-by-one basis. Seven respondents download records in advance that display to users as on order, or use a brief record for this purpose and then overlay it with a full record later. Three respondents said their acquisitions department checks print orders to ensure that there is not already online access. Three respondents reported having to wait for records to become available for e-resources: "Our collection managers would like the records available immediately, and we sometimes have to wait for vendors to create them." Another respondent said that the "record set for our NetLibrary subscription does not always match what is available. Also, their MARC records don't become available right away (takes up to a month)." Other respondents mentioned cataloging items individually on OCLC when records do not arrive in a timely manner, and a need for better notification from vendors when records are ready. In some cases, delays were caused internally due to lack of communication: "Our biggest headaches with respect to e-content (and in particular e-books) has less to do with getting/maintaining decent MARC records and more to do with communications between acquisitions and cataloging. Because e-book packages can be selected and acquired so quickly, there is often a significant and frustrating gap between the point of order and the point at which catalogers know what is now available and needs to be cataloged in the OPAC." Another respondent said "we have worked with collections to improve communication on this issue."

One problem with batch records is that when catalogers edit them to improve their quality, they are duplicating the effort of other libraries (to clarify, these are not local edits but edits correcting series treatment or typographical errors, adding classification numbers, etc.). The survey noted this problem and asked for practical suggestions to make this process more efficient. The 84 responses fell primarily into four categories.

The first category includes record sharing and the use of consortia. In this category, 10 respondents suggested a repository of corrected record sets, though some were concerned about licensing issues. Five respondents are already using their consortia to share batch-record improvements. Two respondents mentioned that they had shared records for a specific set with another library, and another respondent suggested this kind of peer-to-peer networking among libraries.

The second category involves working with vendors to improve the quality of records. Twelve respondents suggested reporting errors in batches to the vendor, with two of these offering specific examples of vendors correcting batches as a result of communication.
Other suggestions in this category included vendors creating a central place for libraries to share record sets, and vendors hiring libraries to create records. Two respondents suggested reporting errors to vendors and then uploading records to OCLC to be enhanced and have headings controlled.

The third category involves requiring better record quality from vendors on the front end. Seven respondents said libraries should petition, pressure, or require vendors to provide full and accurate catalog records. An additional five respondents thought vendors should provide record customization for libraries. Two respondents called for standardization of control numbers and e-ISBNs, and one said more vendors should contribute records to OCLC.

The latter comment leads into a fourth category of respondents desiring better integration of vendor records with OCLC. Seven respondents made suggestions such as submitting corrections to OCLC, enabling record enhancement for batches, using local-holdings records, and improved set support from OCLC.

Finally, one respondent noted that the most time-consuming part of batch loading is authority control, and two respondents said they spend little or no time correcting batch sets.

The survey offered respondents an opportunity to mention other concerns or problems with batch MARC records, and/or to clarify earlier answers. These responses have been integrated into the response summaries above or the discussion below.

Discussion

Academic librarians in the U.S. dominated the survey responses. This may reflect a greater volume of batch-cataloged resources, or a greater variety of them, in academic libraries. Respondents were overwhelmingly librarians (85%) rather than paraprofessionals (15%), perhaps lending support to the view that the coordination required for batch cataloging often makes delegation difficult (Wu & Mitchell, 2010). As one respondent commented, batch loading reduces the number of materials that would otherwise go to copy cataloging, but "the work and follow up are done by higher level staff." Librarians may be better positioned to facilitate communication with other departments; they may also have record-retrieval skills such as Z39.50 searching and OCLC Connexion batch processing, experience with manipulation tools such as regular expressions, scripts, global update, and MarcEdit, and access to load table configuration. However, batch cataloging is likely to be delegated as the process becomes more stable and routine at each library. Since each record batch is unique, librarians will likely continue to evaluate record quality, and both librarians and paraprofessionals will need skills in manipulating large groups of records in addition to their cataloging skills.

Evidence of batch workload has not been previously available. This survey shows that a majority of libraries have two or more catalogers responsible for loading and/or editing batch records, but that the majority spend 25% or less of their time on the process. These results may indicate that as a greater variety of resources becomes available in batches, the workload becomes more distributed, while the inherent efficiency of the process means that each person spends a small amount of time on it. However, batch processing for print or government documents may account for the number of personnel involved. At the author's institution, for example, all or most records in these batches are checked individually, but that is not the case for e-book record batches.
Record transfer itself takes little time, but large differences in time can occur depending on whether metadata quality is checked for each record individually or for the batch as a whole. Additionally, the relatively small amount of time spent on batch loading and editing may not include time for policy and planning, maintaining documentation, and authority work.

While e-books were by far the most commonly batch-loaded material type, the survey also indicates that virtually every type of resource has been cataloged in batches. Batch processes will likely become more frequent due to the ever-increasing volume of resources, the transition from print to electronic, the marketing appeal of MARC records for vendor packages, and above all, the efficiency of the process.

It is hardly surprising that MarcEdit was by far the most popular tool for batch catalogers, since the MarcEdit listserv was solicited for survey responses. Of greater interest was the wide variety of tools listed by respondents, especially those involving scripting. This provides evidence of the usefulness of programming skills for cataloging functions and perhaps helps explain the predominance of more broadly skilled librarians over paraprofessionals as batch catalogers. While a majority of respondents never use regular expressions, a significant minority frequently or sometimes does. Scripts are also used for duplicate record detection, though respondents are largely dependent on load or headings reports for this purpose.

The relatively new provider-neutral standard for e-books raises the issue of how the matching of e-book records from different vendors will occur in local catalogs. Guidance on the provider-neutral approach defers to libraries whether to use single or separate e-book records (Program for Cooperative Cataloging, 2009b). However, because many libraries use the OCLC control number as an overlay point, it may be difficult to maintain a separate-record approach for OCLC records. The provider-neutral policy does not suggest how the ILS might recognize library subscriptions to various vendor e-book packages. One scenario is that a load table will recognize the OCLC control-number match and simply add the 856 associated with the specific vendor. This scenario, however, involves a trade-off. On the one hand, if the original record in the catalog is "protected" and only a new 856 is added to it, authority control and other record editing are preserved. On the other hand, the incoming bibliographic record may have updated fields that the catalog record does not, such as a summary, contents notes, or corrections. In this case it may be possible to adjust a load table to accept only new fields in addition to the 856 field. If a vendor is not an OCLC participant and uses its own control number, a different match point will be needed. ERM is a possible solution to this problem, but only 10% of survey respondents are currently using it for e-book batches, and none mentioned its use in this regard. The survey question on multiple links sought to discover automated means of checking which links offer valid access, and it appears that only those using a URL checker or link resolver are in this category.
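A load table is vendor- and ILS-specific configuration rather than code, but the matching-and-merging logic in the scenario just described is easy to model. The following sketch assumes greatly simplified records keyed on an OCLC-style control number; the record structure and sample data are invented for illustration and stand in for real MARC processing:

```python
# Simplified stand-ins for catalog records: control number -> {MARC tag: list of values}.
catalog = {
    "ocm12345678": {"245": ["Example e-book title"],
                    "856": ["http://vendor-a.example.com/book/1"]},
}

def load_record(incoming: dict) -> None:
    """Mimic a load table that matches on the OCLC control number: on a match,
    protect the existing record and append only the new 856 link(s);
    otherwise, add the incoming record to the catalog as new."""
    key = incoming["001"]
    existing = catalog.get(key)
    if existing is None:
        catalog[key] = {tag: list(vals) for tag, vals in incoming.items() if tag != "001"}
        return
    for url in incoming.get("856", []):
        if url not in existing.setdefault("856", []):
            existing["856"].append(url)

load_record({"001": "ocm12345678",
             "856": ["http://vendor-b.example.com/book/1"]})
print(catalog["ocm12345678"]["856"])  # one record now carries both vendors' links
```

Accepting "only new fields in addition to the 856," as suggested above, would amount to one more branch in load_record that copies over any tags absent from the existing record.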
E-book records commonly provide a linking field to the print format (MARC 776), and reciprocal linking fields on the print and electronic versions of a record allow library users to select their preferred format. With e-book batches of hundreds or thousands of records, however, it is often not possible to quickly determine the instances in which the library also offers the print version, and doing so manually would undermine the efficiency of batch loading. Automated methods usually depend on ILS reports showing matches on ISBN, call number, or another number. Still, determining which 776 fields should display requires viewing individual records, as well as adding a reciprocal 776 to the print record. For the 13 respondents using a single-record approach, this is not a problem, but it appears that most respondents have been unable to provide navigation between formats in their catalogs. One concern is that print and e-book records for the same content might differ, such as in contents or summary notes, which might result in the retrieval of only one of the records by a keyword search. A process to link formats via the 776 in this situation has been described (Simpson, Lundgren, & Barr, 2007) but may be too time-consuming or beyond the capabilities of some libraries. Some catalog discovery layers collocate resources available in different formats and may make manipulation of the 776 unnecessary.

Recording holdings in OCLC for batches is affected by several factors. Some vendors do not use OCLC records, and without OCLC control numbers, setting holdings for these records can be more time-consuming. A significant number of libraries do not set holdings if they perceive that the only purpose in doing so is ILL, because most electronic resources are not eligible for loan. However, setting holdings would benefit library patrons using WorldCat as a means of discovery, and would also be of benefit if the library decides to use WorldCat Local as its catalog. Some libraries do not set holdings for particular sets because their catalog records are uploaded to OCLC on a regular basis. Setting holdings based on whether content is owned or leased reinforces the importance of communication between cataloging and acquisitions or collection development.

When libraries use vendor records for non-lending resources, they often have little or no interaction with the de facto union catalog, OCLC. The potential for large-scale cooperation on record quality, including authority control, is thereby lessened. Other record sources exacerbate this situation: Z39.50 is used to obtain records by 37% of survey respondents, and MARC record services and new bibliographic utilities are emerging as record sources. Additionally, three survey respondents were concerned about the licensing of MARC records as a barrier to sharing.

Discovery platforms have the potential to influence batch cataloging practice through their ability to collocate resources automatically and through their associated knowledgebases. As the volume of resources increases, and staffing in cataloging departments remains stable or decreases, outsourcing of tasks to knowledgebases and discovery services will likely increase. Knowledgebases serve administrative functions by tracking subscribed online resources and by "recognizing" when new resources come online. The latter function is difficult to perform otherwise: given the hundreds or thousands of resources available in a set, and the delays in the creation and distribution of MARC records, it is currently difficult to know about new resources or to identify those not yet represented by a record. However, there are several problems with these services.
As one respondent commented, knowledgebases are often incomplete, perhaps due in part to lack of vendor cooperation, and another respondent felt that the record quality of such services seems to be no better, and sometimes worse, than that of records obtained previously. Additionally, record licensing poses a barrier to record sharing and results in duplicate record creation by different entities. Cost could also be a barrier for smaller libraries, with a MARC record service an additional expense above that of the knowledgebase.

While the survey question about the delay between the availability of online resources and the availability of MARC records for them focused on ordering issues in the library, this delay can also decrease catalog use. The online environment reverses the traditional cataloging workflow for print: when a subscription or a new e-book comes online, it is immediately available to library users, whereas print resources have traditionally been cataloged before they are made available. The timeliness of cataloging has therefore increased in importance. If library users know that a vendor's newly available e-books are slow to be reflected in the library catalog, there is added incentive to bypass it. This is a difficult problem for libraries to address, because they are dependent on a vendor or utility for record delivery. While knowledgebases partly address delays by offering brief records so that library users can discover resources, they come with many problems, as noted above. Additionally, as some respondents noted, delays are caused or exacerbated by lack of internal communication. More frequent use of print records as a basis for e-book records might minimize delays as well as improve record quality.

If one were to measure delays until the time a MARC record was more or less "finished," then authority-control work would likely be the most time-consuming part of batch cataloging, especially considering the number and quality of the records. While this survey did not solicit specific record-quality problems, a few respondents offered examples, and common issues are well documented in the literature (Kemp, 2008; Sanchez et al., 2006). Vendor batches with poor or nonexistent authority control create a tremendous amount of work for libraries, especially for the 39% of survey respondents doing this work in-house. Some record sets may be excluded from authority control due to some combination of size, quality, or set changeability. For example, problems might be caused by patron-driven collections in which the majority of e-books have not been purchased by the library, or by a set that needs to be deleted and then re-loaded. Applying authority control in these situations would expend additional time, money, or both. Large efficiencies would likely be gained by performing authority work at a more networked level, for example by a vendor or bibliographic utility.

Reducing or eliminating the duplication of effort in correcting batches of records is a difficult problem because it would involve a high degree of coordination between libraries, utilities, and/or vendors. Improvements to record batches, including authority control, seem to be infrequently shared, and even then rarely at a networked level available to all. Sharing within consortia makes workflow more efficient but does not benefit libraries outside the consortia. Direct sharing of records can be prohibited by the licensing or record-use policies of vendors or utilities.
Libraries can make edits to the "master record" on a bibliographic utility. But some libraries cannot afford OCLC membership, and making corrections on OCLC would greatly reduce, if not eliminate, the time savings of batch processing. Corrections to record batches, including edits to individual records, take place in record-editing tools or in the library catalog, external to OCLC. After cleaning up the local catalog, there is little incentive to repeat this work in OCLC. Batch upload into OCLC is possible but problematic: it is open only to PCC libraries, and record overlay depends on algorithms that do not always work properly or do not replace the fields needed. Sharing authority control would greatly enhance the effectiveness of batch cataloging, although it would likely add to delays in obtaining records. However, this may not be significant if records are improved in a multi-step iterative process, moving from a brief record to register availability, to an initial MARC record, to a "finished" MARC record with authority work and other quality control completed.

Working with vendors to improve records is successful in some cases but, in the author's experience, not in others. As one survey respondent advocated, vendors need to be convinced that record errors, as well as delays, affect the findability of the resource. Vendors have an incentive to enhance discovery, particularly for patron-driven collections. Guidance for vendors creating MARC records is available, and it includes the suggestion that authority control be implemented (Program for Cooperative Cataloging, 2009a). The quality of many batch sets, however, seems to indicate that these guidelines are not followed. Requiring vendors to provide high-quality records seems problematic due to poor communication between collection development and cataloging departments in some libraries, in addition to the time needed to fully examine a subset of records. Subscription decisions in libraries are based primarily on the content offered, but they need to incorporate the quality of the available metadata as well. Better integration of vendor records with OCLC is dependent on the OCLC-vendor relationship and, while allowing for record improvements, still does not eliminate the duplicate work required. Greater effort toward cooperative cataloging at the "network level," for the benefit of all, could make batch cataloging far more efficient.

Conclusion

Practices related to batch cataloging of MARC records are extremely variable and are influenced by differing record batches, editing tools, and local policies. Librarians are usually responsible for this workflow, probably because they are in the best position to facilitate communication with other departments, determine policies, and perform system functions such as adjusting load tables. As the workflow becomes established, batch cataloging will likely be delegated more frequently, especially where ERM and MARC record services are used. Those performing batch cataloging will need to build on their cataloging knowledge to learn new tools and methods of manipulating large groups of records. Increases in the number of electronic resources make it likely that batch cataloging will become more frequent. Automation of some batch-related tasks, such as collocating formats and combining links for online access on a single record, is currently limited but seems likely to increase.
The unique characteristics of electronic resources—immediate availability, varied acquisition models, and lack of lending permissions—have implications for batch cataloging. Delays in the delivery of MARC records may lead more libraries to move to an incremental metadata model beginning with brief records for immediate access, followed by overlay by a full MARC record, and finally, record editing and authority control. Some libraries choose not to address record quality for sets of electronic resources that are leased, changeable, or part of a patron-driven acquisitions plan.

Survey respondents are divided on the best way to share upgraded batch MARC records. Greater use of consortia, peer-to-peer sharing, vendor communication, requirements for vendors, and involvement by OCLC are suggested. Batch cataloging of vendor records often reduces interaction with OCLC, due to the perception that holdings do not need to be set and the inefficiency of transferring improved records there. This trend, along with the use of Z39.50, a proliferation of other record sources such as MARC record services and new utilities, as well as concerns about MARC record licensing, seems to make large-scale sharing of record improvements less likely. A new emphasis on cooperative cataloging, including authority control, will be necessary to make batch cataloging more efficient.

Received: June 8, 2011
Accepted: July 22, 2011

References

Chen, X. & Wynn, S. (2009). E-journal cataloging in an age of alternatives: A survey of academic libraries. The Serials Librarian 57(1), 96-110.
Finn, M. (2009). Batch-load authority control cleanup using MarcEdit and LTI. Technical Services Quarterly 26(1), 44-50.
Kemp, R. (2008). MARC record services: A comparative study of library practices and perceptions. The Serials Librarian 55(3), 379-410.
Martin, K. E. (2007). Cataloging eBooks: An overview of issues and challenges. Against the Grain 19(1), 45-47.
Martin, K. E. & Mundle, K. (2010). Cataloging e-books and vendor records: A case study at the University of Illinois at Chicago. Library Resources & Technical Services 54(4), 227-237.
Mugridge, R. L. & Edmunds, J. (2009). Using batchloading to improve access to electronic and microform collections. Library Resources & Technical Services 53(1), 53-61.
Program for Cooperative Cataloging. (2009a). MARC record guide for monograph aggregator vendors. (Includes revisions to September 2010). Retrieved April 11, 2011 from http://www.loc.gov/catdir/pcc/sca/FinalVendorGuide.pdf
Program for Cooperative Cataloging. (2009b). Provider-Neutral E-Monograph MARC Record Guide. (Prepared by Becky Culbertson, Yael Mandelstam, and George Prager; includes revisions to September 2010). Retrieved April 11, 2011 from http://www.loc.gov/catdir/pcc/bibco/PN-Final-Report.pdf
Sanchez, E., Fatout, L., Howser, A. & Vance, C. (2006). Cleanup of NetLibrary cataloging records: A methodical front-end process. Technical Services Quarterly 23(4), 51-71.
Simpson, B., Lundgren, J., & Barr, T. (2007). Linking print and electronic books: One approach. Library Resources & Technical Services 51(2), 146-152.
Wu, A. & Mitchell, A. M. (2010). Mass management of e-book catalog records. Library Resources & Technical Services 54(3), 164-174.

Appendix
Batch Cataloging Survey

What type of library do you work in?
Academic
Public
Special
Other:

My library is located in:
U.S.
Canada
U.K.
Australia
Other:

I am a:
Librarian (i.e., have an M.L.S. or equivalent)
Paraprofessional/staff member

How many catalogers in your library are responsible for loading/editing batches of MARC records?
1
2
3
4 or more

How much of your time do you spend working with batch MARC records (choose closest answer)?
25%
50%
75%
100%

I use the following tools to edit MARC record batches (check all that apply):
MarcEdit
Global Update in my ILS
Excel
Other:

Do you use regular expressions (RegEx) to identify and edit MARC data?
Frequently
Sometimes
Never

Since provider-neutral e-book records can have two or more links to content, do you have a way of checking which links are valid for your library, or do you simply delete all links that aren't from that particular provider?

What is your library's approach to the 776 linking field in e-book batches?
We turn/have display off
We check record by record whether display should be on or off
We leave them as is/no policy
We use the single-record approach
Other:

What material types are batch loaded by your library (check all that apply)?
E-books
Online video
Microform
Print books
Maps
Other:

Do you set holdings in OCLC for record batches?
Yes
No
Sometimes/Depends

Explain your answer above:

Have you used OCLC Connexion's batch processing to create and export a batch?
Yes
No

Have you used Z39.50 to retrieve records from another catalog?
Yes
No

What is the relationship between ERM and e-book record batches at your library?
ERM for e-books is linked with some or all batches we load
ERM for e-books is separate from our batch loads
We do not use ERM for e-books
Other:

Do you have a uniform practice for control numbers (001, 035, etc.) used in your catalog? Describe.

Do you use customized load profiles/programs for batches? If so, when or why?

How is authority control for batches addressed?
Sent to a vendor followed by local cleanup
Authority control is done in-house
We do not currently do authority control
Other:

What are your methods for detecting duplicate records (1) within a single batch, and (2) for records already in your catalog?

If you have or are planning to purchase a discovery platform like Summon, with a knowledgebase that recognizes access to e-books, how does/will this affect batch loading of e-books?

If delays between print availability, online availability, and the loading of records into the catalog have caused ordering/collection management problems in your library, please describe how these are addressed.

Often many libraries are locally correcting the same batches, which results in a tremendous duplication of effort. In your view, what would be the most practical way to make this process more efficient?

Please feel free to add other concerns or problems with batch MARC records, and/or clarify answers above:

work_x3wdkm4qvnc6lacmkofyqxpgvi ----

DNB - DDC
Dewey Decimal Classification (DDC)

Contents
Diversity counts – how we use the DDC
Classifying and searching with WebDewey
Origin and international spread of the DDC
Frequently asked questions (FAQ)
Contacts

Here you can learn about DDC Deutsch and the Dewey Decimal Classification (DDC), the most widely used universal classification system internationally. The Dewey Decimal Classification is a system for organizing knowledge that is used in many libraries around the world. First published in 1876 in the form developed by its namesake, Melvil Dewey, it has managed to keep pace with the times thanks to constant development and adaptation of its content. Unlike in its country of origin, the USA, where "Dewey" is traditionally used in the majority of libraries as a shelving scheme, our emphasis is on catalog searching.

Diversity counts – how we use the DDC

DDC subject categories structure the Deutsche Nationalbibliografie. DDC abridged notations (Kurznotationen) are used for automated subject classification. The Deutsche Nationalbibliothek has been assigning full notations since 2007, thereby following international practice, for example that of the Library of Congress and the British Library.
More: DDC in the Deutsche Nationalbibliothek

Classifying and searching with WebDewey

WebDewey Deutsch contains the current state of the classification system and is the working tool for cataloging publications with the DDC. WebDewey Search offers the possibility of finding literature with the help of the DDC in several catalogs at once.
More: WebDewey | DDC Deutsch – translation and use | DDC summaries

Origin and international spread of the DDC

The Dewey Decimal Classification (DDC) comes from the USA; its namesake was Melvil Dewey (1851–1931). Whether in Germany, France, Iceland, Norway, or Vietnam, the DDC is used worldwide, and increasingly so. That this is the case is due not least to successful international cooperation.
More: DDC international

Frequently asked questions (FAQ)

I am not familiar with the DDC but would like to include it in my literature searches. How do I do that?

There are two simple ways to find titles in the Deutsche Nationalbibliothek that have been cataloged with the DDC:
- WebDewey Search: Here you can browse intuitively, with no prior knowledge of the DDC, through the DDC notations, which are organized hierarchically in a decimal system. Explore very general as well as highly specific topics. Once you have reached your goal, you can open the result sets directly in the catalog of the Deutsche Nationalbibliothek and, if necessary, refine your search further there. A tip: select several library catalogs at the same time; this increases your chances of success. A note: in our catalog you will find WebDewey Search via the link "Browsen (DDC)".
- Advanced search in the catalog of the Deutsche Nationalbibliothek: If you are already somewhat familiar with the DDC, select "DDC-Notation" in the drop-down list and enter a DDC notation. You can also combine your search with other search filters.
More: WebDewey Search | To the catalog
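The browsing described above works because DDC notations largely encode their own hierarchy: broader classes can usually be read off by truncating a notation digit by digit. A rough sketch of that idea follows; the real DDC hierarchy has exceptions and built numbers that simple truncation cannot capture, and the sample notation is only illustrative.

```python
def broader_classes(notation: str) -> list[str]:
    """Walk up the (mostly) positional DDC hierarchy by truncation,
    e.g. '782.42' -> ['782.4', '782', '780', '700'].
    Assumes the standard three-digit base before any decimal part."""
    chain = []
    if "." in notation:
        integer, decimals = notation.split(".")
        while decimals:
            decimals = decimals[:-1]
            chain.append(integer + ("." + decimals if decimals else ""))
    else:
        integer = notation
    if integer[2] != "0":
        chain.append(integer[:2] + "0")   # division level, e.g. 780
    if integer[1:] != "00":
        chain.append(integer[0] + "00")   # main class level, e.g. 700
    return chain

print(broader_classes("782.42"))
```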
Is there a print edition of DDC 23?

The English-language print edition of DDC 23 appeared in 2011; the 15th abridged edition (Abridged Edition 15) was published in English in 2012. Since then OCLC has not issued a continuously updated print edition, but it is possible to obtain an annually updated English-language print-on-demand version from OCLC. No print version is planned for the German edition of DDC 23. The classification application WebDewey Deutsch contains the complete content of the classification as well as all ongoing updates.
More: Information on the English-language print editions from OCLC | Information on the German print edition of DDC 22

May I assign DDC notations to my own collection free of charge?

In principle, DDC notations can be used to classify objects without a license. The complete content of the DDC, that is, everything beyond individual DDC notations and class headings, is protected by copyright held by OCLC (Online Computer Library Center). We are happy to answer questions about licensing of the German DDC edition.
More: WebDewey Deutsch and WebDewey Search | Translation and use of the DDC | DDC summaries

Why might I be unable to find an update on the updates page in WebDewey Deutsch?

DDC Deutsch has been updated continuously since its first translation (2005). Before the switch to WebDewey Deutsch in 2012, a different workflow was used to document DDC updates. As a result, all changes documented before 2012 cannot be retrieved from the updates page in WebDewey Deutsch. However, an archive file containing the update information from before 2012 is available to you in WebDewey Deutsch.
More: WebDewey Deutsch

Is there technical support for building difficult DDC notations?

Yes, there is. In the classification application WebDewey Deutsch, synthesized notations can be built with the help of a so-called number-building assistant (Syntheseassistent). It automatically evaluates the DDC rules and thus supports appending notations from the schedules and tables. If the synthesized notation is to be saved in the system, the assistant additionally suggests a class heading and DDC index entries.
More: WebDewey Deutsch

How can I tell whether a DDC notation was assigned intellectually or by machine?

In the catalog of the Deutsche Nationalbibliothek, machine-assigned DDC abridged notations are marked in the title data with the note "maschinell ermittelte DDC-Kurznotation" (machine-determined DDC abridged notation). Full, intellectually assigned DDC notations do not carry this note; instead, the DDC edition is given in square brackets after the notation: [DDC22ger] or [DDC23ger].
More: The DDC in the Deutsche Nationalbibliothek | Cataloging of publications
Contacts

Dr. Heidrun Alex – h.alex@dnb.de
Tina Mengel – t.mengel@dnb.de
Short URL: https://www.dnb.de/ddc

work_x46ggun26vai3lnnudlgzlbwvq ----

Expansions and Challenges to the Impact Factor: An Enduring Legacy of the Open Access Movement

Emerging Alternatives to the Impact Factor
General Review

Keywords: impact factor; h index; Y factor; evaluation metrics

Purpose: The authors document the proliferating range of alternatives to the impact factor that have arisen within the past five years, coincident with the increased prominence of open access publishing.

Methodology/Approach: This paper offers an overview of the history of the impact factor as a measure of scholarly merit; a summary of frequent criticisms of the impact factor's calculation and usage; and a framework for understanding some of the leading alternatives to the impact factor.

Findings: This paper identifies five categories of alternatives to the impact factor:
a. Measures that build upon the same data that informs the impact factor.
b. Measures that refine impact factor data with "page rank" indices that weight electronic resources or Web sites through the number of resources that link to them.
c. Measures of article downloads and other usage factors.
d. Recommender systems, in which individual scholars rate the value of articles and a group's evaluations pool together collectively.
e. Ambitious measures that attempt to encompass the interactions and influence of all inputs in the scholarly communications system.

Value of Paper: Librarians can utilize the measures described in this paper to support more robust collection development than is possible through reliance on the impact factor alone.

History and Calculation of the Impact Factor

In 1955 Eugene Garfield proposed a "bibliographic system for science literature that can eliminate the uncritical citation of fraudulent, incomplete, or obsolete data by making it possible for the conscientious scholar to be aware of criticisms of earlier papers" (Garfield 1955). Using the example of legal documents, which depend heavily on the precedent of previous rulings, Garfield proposed the development of an "impact factor" that would rate the worth of a scientific article based upon the number of articles that cite it subsequently. The impact factor would also be useful for tracing the "eclectic" connections between ideas that would not be obvious through more established methods like subject indexing. Finally, provided with a convenient way to access a paper's subsequent citations, the "conscientious scholar" would be in a better position to build upon the chain of reaction to any given paper.
In 1961 Garfield and Sher began to operationalize these goals with the creation of the Science Citation Index (SCI) (Garfield 2006). Although the SCI focused upon the impact factor for authors, by the end of the decade Garfield became interested in quantifying the impact factor for journals (Garfield 1972). The journal-focused metric eventually became quantified in the Journal Citation Reports (JCR) database. Anyone who inquires about the "high impact journals in my field" probably wants data from the JCR.

The impact factor for a journal "x" is a ratio based upon the previous two years of citation data [1]. As an example, the 2007 impact factor for "x" is calculated as follows:

    number of 2007 articles in journals that reference citable articles published in "x" in 2005-2006
    -------------------------------------------------------------------------------------------------
    number of all citable articles published in "x" in 2005-2006

If the numerator is 200 and the denominator is 100, then the impact factor for "x" would be 2. In general, the higher impact factor journals in a given field are perceived to be more prestigious than other titles. Crucially, the impact factor's calculation depends on the number of "citable items" a journal published. Original research papers are always included, but letters and editorials often are not. This stipulation alone can have a significant effect on a given journal's impact factor.

As of 2007, the journal impact factor remains a foremost metric in scholars' minds. Everyone wants their work to appear in high impact journals. Librarians must respond to this impulse in collection decisions by privileging these journals, even if they are very expensive to obtain.

Criticism of the Impact Factor

Well before the Internet created space for open access publishing, criticisms of the impact factor abounded. Seglen summarizes many of these critiques in a 1997 paper that built upon earlier work by himself and others (Seglen 1997). Among other problems, Seglen concludes that a relatively small number of highly cited articles can disproportionately skew the impact factor for a journal; that review articles are cited so frequently that this favorably influences the impact factors of journals that publish many review articles; and that the impact factor arbitrarily favors research in fields whose literature rapidly becomes obsolete. More recent criticism by Smith emphasizes the deleterious role of "citation cartels," in which authors indiscriminately cite themselves and others in order to boost the impact factor (Smith 2006). Another negative consequence of obeisance to the impact factor is the devaluing of "softer" elements of a journal—such as editorials and layman's overviews—that are less likely to be cited by researchers. Citing Malcolm Chiswick, Smith proclaims that the result of this diminution is an "impacted journal." The editors of PLoS Medicine brand all such maneuvers as playing the "impact factor game" (PLoS Medicine Editors 2006).

In a recent lecture, Garfield defends the impact factor while acknowledging some shortcomings in its calculation (Garfield 2006). He argues that journal impact factors for discrete fields tend to be stable over time, which validates their usefulness as an indicator of the most prestigious journals. He also charges that critics of the impact factor exaggerate the effects of anomalies in its calculation. With those lines drawn, Garfield admits that the exclusion of non-citable items like editorials can mildly distort the impact factor (although he admits this only for high-profile medical titles).
Next Garfield describes his continuing work to improve the value of the impact factor. He then points readers to another database, Journal Performance Indicators, which "eliminate[s]" "many of the discrepancies inherent" in traditional impact factor calculation. But although he acknowledges some defects, Garfield is naturally partial to his creation. He concludes with a quote from Hoeffel: "'Experience has shown that in each specialty the best journals are those in which it is most difficult to have an article accepted, and these are the journals that have a high impact factor.'"

The criticism of the impact factor has gained some traction. In 1997, Seglen was particularly concerned about the prospect that national governments would unfairly utilize impact factor scores as a convenient metric for evaluating scholars; those who published in high impact journals would reap future grants, while those who did not would go without. Smith echoed this fear in 2006. Both authors should be pleased with this statement from the British government about the UK's 2008 Research Assessment Exercise: "No panel will use journal impact factors as a proxy measure for assessing quality" (Research Assessment Exercise Team 2006).

Alternatives to the Impact Factor

A. Measures that build upon the same data that informs the impact factor

Proposed by Hirsch in 2005, the h index is computed as "the number of papers with citation number ≥ h" (Hirsch 2005). An author with an h index of 30 has published 30 papers that received at least 30 citations in subsequent work. If one scholar has published 100 papers and another has published 35, but both have 30 papers that received at least 30 citations, each would have an h index of 30. Hirsch argues that his index provides a way to gauge the relative accomplishments of different researchers that is more rigorous than simply toting up their number of publications. Hirsch describes calculating the h index using the data in the "times cited" field for author records in Thomson ISI's Web of Science database; this is the same data source that informs the calculation of a journal's impact factor. Although the h index is relatively straightforward, Hirsch recognizes that it should be only one factor in any evaluation of a scholar's impact.

Almost as soon as Hirsch proposed the h index, researchers began to develop variants of it. A recent review summarizes the analytic potential of nine h index variants in the field of biomedicine (Bornmann et al. 2008). Barendse recently built upon the h index in his proposal for a "strike rate index" (Barendse 2007). Using logarithmic techniques to ensure that journals in different fields can be compared objectively, Barendse transforms the h index from a strictly author-focused measurement into one that can also inform the evaluation of journals. This is analogous to Garfield's shift in emphasis from authors to journals almost 40 years ago. However, Barendse argues that the strike rate index controls for citation patterns over the long term much better than the native impact factor. He finds the two approaches to be complementary—not competitive—means of understanding scholarly worth.
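Because the measures in this category all start from plain citation counts, the basic h index takes only a few lines to compute once those counts are in hand. A minimal sketch, with invented citation counts standing in for Web of Science "times cited" data:

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that the author has h papers with at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    while h < len(ranked) and ranked[h] >= h + 1:
        h += 1
    return h

# Hypothetical citation counts for one author's papers.
print(h_index([50, 40, 33, 30, 30, 12, 5, 5, 1, 0]))  # -> 6
```

Sorting in descending order and walking down the list mirrors Hirsch's definition directly: the walk stops at the last rank whose paper still has at least that many citations.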
B. Measures that refine the impact factor with "page rank" data

In 2006 Bollen and colleagues introduced the "Y factor," a measure that combines traditional impact factor data with "page rank" data similar to what Google uses in its search algorithm (Bollen et al. 2006). To some extent, Google simply keeps track of which Web sites link to other Web sites, which is akin to rudimentary article citation. However, Google also factors the prestige of a linking site into its calculations; if the National Institutes of Health site links to a Web page, this means more than a link to the same page from a personal blog. The Y factor—which the authors acknowledge has not yet been fully justified scientifically—applies the page rank approach to journal citation networks. They found significant differences between the highest impact factor journals and the highest Y factor journals in physics, computer science, and medicine, and less difference in a subspecialty of medicine, dermatology (Dellavalle et al. 2007).

Developed by Bergstrom, the Eigenfactor provides an online suite of tools that "ranks journals much as Google ranks websites" [2]. Available at no charge, the Eigenfactor attempts to account for the prestige of citing journals; incorporates many non-standard items, such as newspapers and PhD dissertations, into the citation network; and evaluates items over a 5-year (rather than 2-year) period. Of particular interest to librarians, the "cost-effectiveness search" relates this data to the going subscription rates for journals as a means of determining value for money. The "Article Influence" metric within the Eigenfactor is comparable to the impact factor, but that is just one aspect of the broader framework.

C. Measures of article downloads and other usage factors

The alternatives to the impact factor described thus far are refinements and enhancements of it. However, the profusion of online journal content in recent years has prompted calls for entirely new evaluation metrics; the impact factor's origin dates from a period when all scholarship was in print. The Usage Factor, which is still in its formative stages, is one alternative currently receiving attention. Its calculation relies upon statistics generated by COUNTER, which tracks monthly full-text article requests for many scholarly journals [3]. Internet usage data is often obtained from the web site Alexa [4]. The Usage Factor would be calculated as follows (Shepard 2007):

    total usage of a journal (COUNTER journal usage for a specified period)
    -----------------------------------------------------------------------
    total number of articles published online during the same period

An even more straightforward alternative metric for the impact of an article is the number of times it has been downloaded, which is easy to track online. For the fields of physics, astrophysics, and mathematics, Brody and colleagues found a positive correlation between the frequency with which an article is downloaded soon after publication and the eventual rate at which it will be cited (Brody et al. 2006). The attention to downloads refocuses evaluation on individual papers and away from journals. This is a positive development in light of the pernicious incentives that high impact journals can provide for author behavior. Because open access articles are available to anyone with an Internet connection, they are likely to be downloaded more frequently than articles that are only available via personal or library subscriptions. Hitchcock continues to update a useful bibliography about the relationship between downloads and citation impact (Hitchcock 2007). As a practical matter, the open access download-to-citation advantage may be small (Kurtz et al. 2007); this is a topic for continuing debate. But it does seem evident that tracking downloads will be important to the evaluation of scholarly impact for the foreseeable future. In a 2005 survey—which has somewhat limited utility, due to the small sample size—established researchers rated downloads slightly higher than citations as an indicator of the impact of an article (Rowlands et al. 2005).
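The Usage Factor arithmetic is exactly as simple as the formula above suggests. In the sketch below, the monthly COUNTER-style request counts and the article total are invented for illustration; the metric itself was still unsettled when Shepard reported on it:

```python
# Hypothetical monthly full-text request counts for one journal over a year,
# of the kind COUNTER reports would supply.
monthly_requests = [812, 790, 934, 1010, 880, 760, 640, 655, 901, 988, 1042, 1100]
articles_published_online = 96  # articles the journal published online in the same period

# Usage Factor: total usage divided by the number of articles published online.
usage_factor = sum(monthly_requests) / articles_published_online
print(round(usage_factor, 1))   # -> 109.5
```

Per-article download tracking, by contrast, needs no ratio at all: it is just a counter incremented on each full-text request.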
D. Recommender systems and other communitarian metrics

It is possible to track article downloads and calculate usage factors for all articles, open access or not. However, several community-based measurements are emerging that are much more practical in an open access environment. At a 2003 symposium sponsored by the National Academies, Resnick promoted the concept of "recommender systems" as a means of expanding the peer review process (Resnick 2004). Rather than cloaking the entire process in secrecy, Resnick argued that journals should post the contents of peer reviews, and possibly even the reviewers' names, as a means of increasing transparency and ensuring that reviews are more thorough. In this more public context, one's colleagues would be encouraged to evaluate the cumulative value of one's peer reviews, and one's prestige would rise (or fall) accordingly.

PLoS ONE, one of the most innovative open access journals published by the Public Library of Science, follows the spirit if not the letter of Resnick's proposals [5]. Papers published in PLoS ONE must pass an initial threshold review by an editor, but most of the reviewing (and rating) occurs publicly after an article appears. The home page for the journal provides easy access to recent ratings—on the dimensions of insight, reliability, and style—and reader comments. The entire enterprise depends on "openness" in a full sense, from the author's willingness to receive public scrutiny to a reviewer's willingness to sign their name.

Schnell recently described an even more novel evaluation metric, which he termed the "Blog Citation Index," or BCI (Schnell 2007). An increasing (but still small) amount of scholarly discourse occurs via blog posts, the comments on these posts, and the links to the posts from other sites (which are akin to citations). The BCI would formally track these connections. The blogosphere is a radically open place that has blossomed in recent years; Schnell's proposal is a natural extension of earlier calls for "recommender systems" to enrich peer review.

E. Package of evaluation metrics designed for modern scholarly communication practices

Funded by the Andrew W. Mellon Foundation, the MESUR (MEtrics from Scholarly Usage of Resources) project is a two-year effort to enrich "the toolkit used for the assessment of the impact of scholarly communication items, and hence of scholars, with metrics that derive from usage data" [6]. Researchers at the Los Alamos National Laboratory—who also developed the Y factor—are leading the MESUR (pronounced "measure") initiative. MESUR is the most comprehensive effort to date to align article impact evaluation techniques with modern scholarly communication practices, which are very different today than just a decade ago. The final report is due to Mellon in October 2008.

Conclusion

The impact factor is a scholarly evaluation metric that continues to have some utility. But it was conceived in a very different context than exists today, and it was controversial even before the birth of the Internet and the advent of open access publishing fundamentally altered the landscape. The impact factor is a tool whose usefulness is waning, but there is not yet a fully viable alternative to it.
This paper describes several alternatives to the impact factor, some well developed and some still in formative stages. Librarians should inform themselves about the strengths and weaknesses of the alternatives; contribute to the discussion about their ongoing development; encourage and facilitate faculty awareness of these alternatives; and begin to use them in deciding how to develop and promote scholarly resources.

References

1. Barendse, W. (2007) "The strike rate index: a new index for journal quality based on journal size and the h-index of citations," Biomedical Digital Libraries, Vol 4 No 3. Available http://www.bio-diglib.com/content/4/1/3.
2. Bollen, J., Rodriguez, M.A., Van de Sompel, H. (2006) "Journal status," Scientometrics, Vol 69 No 3, pp. 669-687. Available http://arxiv.org/abs/cs/0601030.
3. Bornmann, L., Mutz, R., Daniel, H-D. (2008) "Are there better indices for evaluation purposes than the h index? A comparison of nine different variants of the h index using data from biomedicine," Journal of the American Society for Information Science and Technology, Vol 59 No 5, pp. 830-837.
4. Brody, T., Harnad, S., Carr, L. (2006) "Earlier Web usage statistics as predictors of later citation impact," Journal of the American Society for Information Science and Technology, Vol 57 No 8, pp. 1060-1072. Available http://eprints.ecs.soton.ac.uk/10713/.
5. Dellavalle, R.P., Schilling, L.M., Rodriguez, M.A., Van de Sompel, H., Bollen, J. (2007) "Refining dermatology impact factors using PageRank," Journal of the American Academy of Dermatology Vol 57 No 1, pp. 116-119.
6. Garfield, E. (2006) "The history and meaning of the journal impact factor," Journal of the American Medical Association Vol 295 No 1, pp. 90-93. Available http://jama.ama-assn.org/cgi/content/full/295/1/90.
7. Garfield, E. (1972) "Citation analysis as a tool in journal evaluation," Science Vol 178 No 60, pp. 471-479.
8. Garfield, E. (1955) "Citation indexes for science: a new dimension in documentation through association of ideas," Science Vol 122 No 3159, pp. 108-11. Available http://ije.oxfordjournals.org/cgi/content/full/35/5/1123.
9. Hirsch, J.E. (2005) "An index to quantify an individual's scientific research output," Proceedings of the National Academy of Sciences Vol 102 No 46, pp. 16569-16572. Available http://www.pnas.org/cgi/content/full/102/46/16569.
10. Hitchcock, S. (2007) "Effect of open access on citation impact: a bibliography of studies." Available http://opcit.eprints.org/oacitation-biblio.html.
11. Kurtz, M.J., Henneken, E.A. (2007) "Open access does not increase citations for articles from The Astrophysical Journal." Available http://arxiv.org/abs/0709.0896.
12. Public Library of Science Medicine Editors (2006) "The impact factor game," Public Library of Science Medicine Vol 3 No 6, pp. e291. Available http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=16749869.
13. Research Assessment Exercise Team (2006) "RAE2008: panel criteria and working methods." Available http://www.rae.ac.uk/pubs/2006/01/.
14. Resnick, P. (2004) "Implications of Emerging Recommender and Reputation Systems," in Committee on Electronic, Scientific, Technical, and Medical Journal Publishing (ed.), Electronic Scientific, Technical and Medical Journal Publishing and Its Implications: Report of a Symposium, Washington, pp. 49-50.
15. Rowlands, I., Nicholas, D. (2005) "New journal publishing models: an international survey of senior researchers." Available http://www.ucl.ac.uk/ciber/ciber_2005_survey_final.pdf.
16. Schnell, E. (2007) "A blog citation index?" Available http://ericschnell.blogspot.com/2007/11/blog-citation-index-bci.html.
17. Seglen, P.O. (1997) "Why the impact factor of journals should not be used for evaluating research," British Medical Journal, Vol 314 No 7079, pp. 498-502. Available http://www.bmj.com/cgi/content/full/314/7079/497.
18. Shepard, P.T. (2007) "Final report on the investigation into the feasibility of developing and implementing journal usage factors." Available http://www.uksg.org/sites/uksg.org/files/Final%20Report%20on%20Usage%20Factor%20project.pdf.
19. Smith, R. (2006) "Commentary: The power of the unrelenting impact factor—is it a force for good or harm?," International Journal of Epidemiology, Vol 35 No 5, pp. 1129-1130. Available http://ije.oxfordjournals.org/cgi/content/full/35/5/1129.

Web Sites

1. "The Thomson Scientific Impact Factor." Available http://scientific.thomson.com/free/essays/journalcitationreports/impactfactor/.
2. "Why Eigenfactor.org: Ranking and Mapping Scientific Journals." Available http://www.eigenfactor.org/whyeigenfactor.htm.
3. "COUNTER: Online Usage of Electronic Resources." Available http://www.projectcounter.org/.
4. "Alexa the Web Information Company." Available http://alexa.com/.
5. "PLoS ONE: Publishing science, accelerating research." Available http://www.plosone.org/home.action.
6. "MESUR: Metrics from Scholarly Usage of Resources." Available http://mesur.org/Home.html.
work_x6ipvmkmzjfrzkgzwy7mvg3mde ----

UC Santa Barbara
UC Santa Barbara Previously Published Works

Title: "Insourcing" of Cataloging in a Consortial Environment: The UC Santa Barbara-UC San Diego Music Copy Cataloging Project
Permalink: https://escholarship.org/uc/item/55h2t40v
Journal: Cataloging & Classification Quarterly, 51(1-3)
ISSN: 0163-9374, 1544-4554
Authors: Nyun, James Soe; Peters, Karen A; DeVore, Anna
Publication Date: 2013
DOI: 10.1080/01639374.2012.729553
License: https://creativecommons.org/licenses/by-sa/4.0/
Peer reviewed

eScholarship.org, powered by the California Digital Library, University of California

"Insourcing" of Cataloging in a Consortial Environment: The UC Santa Barbara-UC San Diego Music Copy Cataloging Project

JAMES SOE NYUN
Geisel Library, University of California–San Diego, La Jolla, California

KAREN A. PETERS
The George and Helen Ladd Library, Bates College, Lewiston, Maine

ANNA DeVORE
Davidson Library, University of California–Santa Barbara, Santa Barbara, California

ABSTRACT
A collaborative cataloging project for music sound recordings between two University of California campuses matches available staffing at UC San Diego with the need for better access to a high-priority collection of audio CDs at UC Santa Barbara, with promising results. This article discusses the decision to collaborate, the project planning process, cataloging standards and workflow issues, network-level cataloging within an international database (OCLC), communication between personnel on the two campuses, managing cataloging review, an assessment of the project's achievements to date, and implications and future directions for similar cooperative projects.

Keywords: Descriptive cataloging, Cataloging administration/management, Cooperative cataloging, Types of materials…Sound recordings, Cataloging research…Case studies, Types of libraries…College and university libraries

INTRODUCTION: SETTING THE SCENE

Since the mid-20th century, the University of California Libraries have had a strong history of building collections consortially. Over the past decade or two, individual University of California libraries have also increased efficiencies in the acquisition, cataloging, processing, and storage of library materials. System-wide storage efficiencies were addressed beginning in the 1980s with the construction of the Northern and Southern Regional Library Facilities. In 2000, the UC System developed a model shared cataloging and record distribution program, the Shared Cataloging Program (SCP), based at UC San Diego, to provide records for electronic serials and monographs purchased for the System by the California Digital Library.1 In 2002, the University's Collection Development Committee formed a task force charged with planning workflows for a pilot project to explore centralizing the cataloging of system-wide shared print materials.2 Later, in 2005, the Bibliographic Services Task Force (BSTF) report, Rethinking How We Provide Bibliographic Services for the University of California,3 proposed that UC library cataloging should be viewed and organized as a system-wide enterprise.
In 2007, UC library administrators began work with OCLC on a next-generation consortial catalog on the WorldCat Local platform; it is now the system-wide public access catalog interface.4 In 2009, the Next Generation Technical Services (NGTS) initiative was launched with the objective of integrating collections and technical services functions across the system. Included in the initiative are proposals for a system-wide shelf-ready approval plan, enterprise-level financial mechanisms, and system-wide cataloging standards for consortial and collaborative cataloging.5

In addition to the formal University-wide initiatives, several smaller cataloging arrangements have been developed in recent years among members of the ten UC campuses. An outgrowth of a 2008 survey of system-wide cataloging format and language expertise conducted by the University's Heads of Technical Services (HOTS) group, these arrangements typically involve two or three campuses, with one campus providing foreign language cataloging or support to another lacking the necessary expertise. These initiatives and arrangements form the backdrop against which the collaborative music cataloging project described in this paper was initiated.

At the end of 2010, UC San Diego's Head of Technical Services forwarded an open call offering music cataloging support to her counterparts across the system. The offer was quite broad—cataloging of scores, sound recordings, and video materials were all possibilities—and stipulated that the San Diego catalogers would prefer working with surrogates in order to avoid the necessity of transporting library materials back and forth. UC Santa Barbara's Head of Technical Services, aware of the Library's sizeable music backlog, forwarded the offer to the Head of UC Santa Barbara's Music Library who, welcoming the opportunity, agreed that the offer should be pursued.

In the discussion that follows, the authors will place this undertaking in the context of published accounts of collaborative cataloging projects. We will follow with discussions of our project planning, implementation, and management, as well as a consideration of the shared cataloging standards used for the project. Finally, we will share some of the lessons we have learned from this project, and delineate potential issues that might arise in other shared cataloging projects.

Throughout our discussion, we have opted to use the term "insourcing" to characterize the nature of this project. The word stands in contrast to the well-established term "outsourcing," in which a business or other organization contracts out work (e.g., to an external vendor) that would previously have been carried out internally. The emerging term, insourcing, describes a business practice that the Oxford English Dictionary defines simply as "the action or process of obtaining goods or services in-house, esp[ecially] by using existing resources or employees."6 In calling the project an example of insourcing, we emphasize the project's alignment with UC's Next Generation Technical Services vision of moving from separate library operations at each campus towards one large operation with pooled resources.
LITERATURE REVIEW

The genesis of the insourcing project proposed by UC San Diego's Head of Technical Services can be found in the UC Bibliographic Services Task Force report mentioned above.7 Included in this rethinking is a recommendation that the UC System "rearchitect" its cataloging workflow on a system-wide basis:

To maximize the effectiveness of our metadata creation, University of California cataloging should be viewed as a single enterprise. We need to move beyond a practice of shared cataloging to a practice of integrated cataloging, in which the system adopts a single set of cataloging standards and policy, eliminates duplication of effort and local variability in practice, provides system wide access to language, format, and subject expertise, and creates a single copy of each bibliographic record for the entire system [emphasis ours].8

While the University of California has not yet adopted a single system-wide cataloging practice and may never entirely succeed in doing so, the UC San Diego Library's offer to "lend" its music copy cataloging expertise to other UC Libraries without expectation of reciprocation certainly appears to represent an experiment in the possibility of system-wide cataloging integration. Thus, while UC San Diego may appear to derive little immediate benefit from the arrangement discussed in this paper, it nevertheless has the potential to benefit from any future system-wide integration, a point to which we will return in our conclusion.

The literature on cooperative and shared cataloging arrangements that bear similarity to the San Diego-Santa Barbara insourcing project is extensive.9 The arrangements generally described, however, differ from the project being discussed here in at least two important ways. First, the institution is sharing work done for its own benefit—that is, cataloging materials that it owns—and sharing the resulting records with other institutions that also own those materials. In the present case, as it turns out, UC San Diego owns copies of only a small percentage of the items its staff is copy cataloging for UC Santa Barbara. Second, and related to the first, in a cooperative or shared arrangement, both institutions apply shared cataloging standards. This will not necessarily be the case in an arrangement such as San Diego-Santa Barbara's, forcing the catalogers performing the work to modify their standards and practices to a greater or lesser extent in order to conform with the requirements of the institution for which they are performing this service.

The literature discussing small-scale cooperative or insourcing arrangements of this nature is rather sparse and generally speculative. It tends to focus on foreign language materials and involves, or carries an expectation of, reciprocity of expertise or an exchange of money for services.
One of the first such accounts, by Joseph Kiegel and Merry Schellinger, describes a cooperative exchange between the Universities of Minnesota and Washington in which the former provided Scandinavian language expertise, and the latter, Arabic.10 Not long thereafter, noting that vendor-supplied cataloging was too expensive for many libraries with foreign language cataloging backlogs, James Chervinko published a study on the then-current state of, and possibilities for, cooperative exchanges between or among libraries with complementary language expertise, or of arrangements similar to outsourcing, in which a library with a particular language expertise would charge a fee for cataloging another library's materials.11 While Chervinko's study reports in detail what he found, he does not provide a summary of its results or draw any overall conclusions other than that these arrangements, while not a complete solution to the problem of foreign language backlogs, could substantially reduce them.

The most extensive study of this subject is Magda El-Sherbini's largely theoretical examination of possibilities for sharing cataloging expertise as an alternative to outsourcing12 (on which see below), bolstered by her long experience with the outsourcing of Slavic language materials at Ohio State University.13 Pointing to the Library of Congress Working Group on the Future of Bibliographic Control's report On the Record,14 as well as similar reports from the Indiana University Task Force on the Future of Cataloging15 and that of the University of California Libraries Bibliographic Services Task Force mentioned above, El-Sherbini writes, "Libraries have to think beyond their walls and go beyond sharing bibliographic records through OCLC, [by] also sharing unique expertise among them."16 To this end, she presents five models, three for bibliographic records and two for authority control, and expresses the hope that her study will generate other models and ideas.

Insourcing bears a good deal of resemblance to outsourcing as a solution to the reduction of cataloging backlogs, particularly in the absence of specific cataloging expertise, and as an attempt at cost savings or a rational distribution of expenses. There are a number of studies and essays on outsourcing besides the one by El-Sherbini mentioned above. Among these are Sheila Ayers's essay on the effect on libraries of outsourcing their cataloging,17 which provides an overview of the practice, including the reasons for it, and its advantages and disadvantages. Rebecca Lubas's chapter in a book on practical cataloging discusses how to approach and manage outsourcing, as well as its rationale.18 Kenneth Bierman and Judith Carter describe the University of Nevada Las Vegas's generally positive experience with shelf-ready books.19 On the other hand, there is Faye Leibowitz's 2007 NASIG conference presentation on the outsourcing of a retrospective cataloging project involving a special collection of serials at the University of Pittsburgh.20 In her presentation, which is entitled "Risky Business," Leibowitz, like El-Sherbini, suggests that an alternative to outsourcing—in this case, hiring project staff—might have been preferable, had that alternative been possible.
Finally, little has been written specifically in regards to the insourcing, outsourcing, or collaborative cataloging of music materials, although music is sometimes mentioned along with foreign language materials as something that requires special cataloging expertise that may not be available or otherwise sufficient to a particular library's needs. Ruth Tucker's discussion of a cooperative cataloging project in a quasi-consortial environment that involved the retrospective conversion of music scores at the University of California, Berkeley,21 is useful in pointing out some of the specific issues and requirements pertinent to the cataloging of music materials, as well as its sheer expense. Tucker concludes that the project's cooperative approach was gratifyingly cost effective. More recently, Nancy Lorimer discusses a joint project, funded by a Mellon grant, to catalog 78 rpm recordings held by sound archives at Stanford and Yale Universities and the New York Public Library. Discussed in particular were the goals of the participating institutions, decisions taken in common regarding cataloging standards, a division of labor aimed at avoiding duplication of work, and especially Stanford's development for the project of a methodology for efficient batch searching and processing of cataloging copy in OCLC.22

BACKGROUND

The Music & Media Cataloging Unit at UC San Diego consists of two full-time and two part-time staff whose hours add up to 3.5 FTE (full-time equivalents), a number that has proven stable over the last decade. Two catalogers (1.7 FTE) concentrate on copy cataloging of sound recordings, scores, and moving image material. Another part-time cataloger (.8 FTE) serves as the unit's primary original cataloger of scores and sound recordings. A full-time unit head is dedicated to administration of the unit, occasional original cataloging in most formats, and participation in metadata projects and committees in the Metadata Services Department and University Libraries. Three staff members hold advanced music degrees, and the fourth has strong generalist music knowledge and a doctorate in library science.

This level of staffing had served well through a period of strong purchasing and the presence of a large backlog, primarily the result of a major gift of approximately 27,000 compact disc titles, out of which nearly 20,000 were retained. As the unit neared completion of the copy cataloging component of the backlog, however, it became clear that cataloging capacity would soon exceed the incoming and deferred workload.

In contrast, cataloging resources at UC Santa Barbara had dwindled since the mid-1990s, from 2 full-time music catalogers plus student assistants, to 1 full-time music cataloger with occasional help from other cataloging staff. In 2009, upon the music cataloger's retirement, this dwindled to approximately 1 FTE with no occasional help, as that help—UC Santa Barbara's then-Special Formats Cataloger—added responsibility for the Music Library's original cataloging to her duties; and shortly thereafter, the Music Library's administrative assistant took on the copy cataloging of scores and eventually sound recordings on a half-time basis. Both of these individuals hold advanced degrees in music; the newly designated Music, Media, and Slavic23 Materials Cataloger also holds a master's degree in Library and Information Studies and a doctorate in ethnomusicology.
At the same time that UC Santa Barbara's music cataloging capacity was being gradually reduced, the collections in its Music Library were continuing to grow. The resulting backlog situation was exacerbated when the Library's heavily used CD collection was suddenly enlarged by a gift of over 2,000 titles with content that was of high interest to the Library's users. In order to provide potential users with at least minimal bibliographic access to materials in its backlogs, the practice at Santa Barbara is to import OCLC cataloging copy, if available, into its online catalog, from which expedited processing of these materials can be requested by patrons seeking access. In the Music Library, items such as CDs and DVDs that cannot be checked out are further provided with barcodes and call numbers, shelved in closed stacks, and made available for in-library use while awaiting the attention of cataloging staff—a necessity due to the wide range of quality and completeness represented in OCLC bibliographic records.24

UC Santa Barbara Library is not unique in its music backlog. In the mid-1990s, Judy MacLeod and Kim Lloyd reported the findings of a survey in which 75% of their respondents reported backlogs of print and recorded music items; the percentage for academic libraries was slightly higher, at 77%. The most important factors contributing to these backlogs included gifts and acquisitions in excess of staff and funding to process, and a lack of catalogers with appropriate subject expertise.25

It might also seem, however, that the library profession as a whole is not convinced of the necessity or desirability of fostering such expertise. At the turn of the millennium, A. Ralph Papakhian attributed this attitude to a rather paradoxical situation: the expense of cataloging music items relative to that of cataloging books had resulted, he believed, in "a prejudice against music in libraries." Regarding the expense, commercial recordings in particular often contain a number of different pieces of music recorded at different times and places, each of which may involve the participation of different creators and performers. In other words, the cataloging of a sound recording or score made up of numerous pieces of music bears more resemblance to the cataloging of several books than to the cataloging of a single one—a situation that has become more acute as technological advances in media allow for the inclusion of increasing amounts of content on a single item. This expense, then, is the direct result of the provision of increased access to materials, something that the professional ethic of librarians would seem to valorize.26 While it will ultimately be up to library administrators to determine whether they will continue to support the provision of detailed access to music materials, the music library profession clearly believes in the need for such provision.
Recommended enhancements to WorldCat Local by the Reference Services Committee of the Music OCLC Users Group (MOUG)27 and a Music Discovery Requirements document drafted by the Emerging Technologies Committee of the Music Library Association28 both indicate the importance to music users of "clear identification and display of information regarding the musical work,"29 as well as of particular versions (expressions and manifestations) of works; of the importance of persons and corporate bodies as access points to music scores and recordings; and of the resulting necessity of providing information sufficient to distinguish among similar or identical names and titles.

ESTABLISHING THE RELATIONSHIP AND DETERMINING THE MATERIALS TO BE CATALOGED

Once the Head of UC Santa Barbara's Music Library indicated her interest in pursuing UC San Diego's offer of music cataloging assistance, Santa Barbara's Head of Technical Services instructed two of this article's authors—UC Santa Barbara's primary monographs cataloger, and its Music, Media, and Slavic Materials Cataloger—to contact the third author, UC San Diego's Music & Media Cataloging Unit Head, in order to confirm the proposed project's feasibility and to work out the details of its implementation. In the course of the ensuing telephone conversation, the three of us established that our cataloging units shared similar philosophies and standards for music cataloging, and that the San Diego catalogers were well qualified for the work that Santa Barbara required.

Shortly thereafter, the two Santa Barbara catalogers met with the Head of our Music Library to determine which materials would be most suitable for the project. Initially, at her suggestion, and with encouragement from our contact in San Diego, our discussion centered on Santa Barbara's music score backlog. At UC San Diego, experience with sending scans of parts of print monographs to a cataloging partner had demonstrated the feasibility of assembling a portable package of files that could serve as a basic cataloging surrogate for a print title. Scans of pertinent parts of a publication were sent to the remote cataloger, along with a simple form filled out with basic metadata such as size and pagination. As the project peaked in the days before the full blossoming of wikis and cloud file sharing resources, managing these file packages occasionally proved problematic. Additionally, creating the surrogates took some time, a situation mitigated somewhat by using student assistants to carry out the bulk of the work. These issues aside, the process ultimately proved to be a success for small batches of materials, and our current technological infrastructure would surely lessen some of these difficulties.

Scores can nevertheless prove more resistant than general monographs to the creation of effective digital or photocopied cataloging surrogates, and further pose multiple difficulties for a distance cataloging project that requires the catalogers to work from these surrogates. In general, scores as unique bibliographic entities can be difficult to distinguish from one another. Numerous issues, for example, may be printed from the same plates at very different points in time. As these issues carry the same plate number(s) and copyright date, but often lack a publication date, the cataloger may be forced to differentiate between them by evaluating factors such as condition, type or quality of paper, and size, and these differences may not be clearly reflected in a scan or photocopy.
In addition, scores may be accompanied by individual parts that may or may not be the same size as the score—or each other. Finally, information important to an item's identification or description is not always found in the same location on every score: contents, dates and circumstances of creation, and editorial remarks, for example, may be found at or near the beginning or the end of a score. The situation requires the staff creating the surrogates to apply a level of care and judgment to each individual item that we did not initially envision for this project.

As a result, our discussion turned to the possibility that insourcing Santa Barbara's much larger CD backlog might be a more suitable project. Not only, as mentioned previously, were these materials of high interest to the Library's users, but the relatively small and uniform CD packaging format would, in most cases, permit a simple reproduction by staff of the disc surface and container back that could then be transmitted by mail, together with the CDs' inserts or accompanying booklets, to San Diego's catalogers. All parties thus came to agree that UC Santa Barbara's CD copy cataloging backlog would be the focus of the San Diego-Santa Barbara insourcing project.

CATALOGING STANDARDS

At this point, the UC San Diego Unit Head developed a Scope of Work plan, modeled on contracts created for vendor cataloging projects, that outlined the cataloging services that the San Diego catalogers would provide for Santa Barbara (Appendix). In drafting the plan, he collaborated with the UC Santa Barbara music cataloger on determining the cataloging standards to be used. Both parties were committed to a fundamental principle of cataloging as much as possible at the network level. Using Santa Barbara's full-level OCLC cataloging authorization, the catalogers at UC San Diego would be directed to search for and catalog each item in the OCLC Connexion client, using the set of standards defined for the project. All edits and record enrichment appropriate for sharing at the network level were to be made to the OCLC master record under OCLC's Expert Community program; any further institution-specific changes would be applied after replacing the network-level record. Cataloging for this project, then, balances the strengths and limitations of collaborative cataloging on a network level, permitting the use of available records and the provision of enriched access to the resources represented by the shared records, while at the same time, in the spirit of cooperation and the Expert Community program, respecting any previous catalogers' judgment.

The parties involved also agreed that, in general, work would progress more smoothly if UC Santa Barbara's requirements mirrored UC San Diego's practices whenever possible.
Potentially the most difficult, but perhaps also the most interesting, part of these discussions centered on what one might call “standardizing the non- standard.” Bibliographic records for sound recordings cataloged under AACR2 can reflect different levels of completeness, ranging from concise descriptions that provide no more than the most basic information, to records that provide detailed access to all of a sound recording’s content by means of a complete listing of its contents and the creation of controlled access points registering all performers, composers, titles, and genres/mediums of performance represented. Both San Diego and Santa Barbara generally create or upgrade existing bibliographic records according to the latter standard for much of the material cataloged, believing such a level of access to be helpful to music users at both of our institutions. Such a level of access also aligns with the goals and standards expressed in OCLC’s Expert Community guidelines,30 as well as with the intent of the documents produced by the Reference Services Committee of the Music OCLC Users Group and the Emerging Technologies Committee of the Music Library Association referenced earlier. As a result, this is the level of access San 20 Diego’s catalogers agreed to provide for Santa Barbara in the Scope of Work document. An example of the problems that can arise, however, when attempting to harmonize the practices of two or more cataloging institutions can be seen in questions regarding the application of relator codes31 to the access points for performers, composers, and other individuals in MARC bibliographic records. Generally used by sound recording catalogers to identify the relationships of these individuals with the content of the item being cataloged, the list of codes invites inconsistency by providing similar codes that operate at different levels of granularity. There is, for example, a general code for “performer,” as well as more granular codes for such roles as “instrumentalist,” “vocalist,” and “conductor.” There are also different codes, such as those for “singer” and “vocalist,” that might be used to signify the same relationship. While UC Santa Barbara’s practice is generally to provide relator codes at a more granular level, OCLC’s Expert Community guidelines would suggest that relator codes present in OCLC Master Records should not be changed to incorporate higher levels of granularity. The network-level restriction does not preclude a cataloging agency from making changes in its local catalog. The standards adopted for this project 21 nevertheless direct San Diego’s catalogers to leave relator codes in the records destined for Santa Barbara’s catalog as found. The use of relator codes may seem a fairly small point, but depending on how local online public access catalogs are implemented, different relator codes can index differently, causing split files; this is, in fact, the case with both of our institutions’ OPACs. And even in cases where indexing is not an issue, the use of different codes for identical functions can lead to user confusion. 
The new cataloging code, RDA (Resource Description and Access), slated to be adopted by the Library of Congress on March 31, 2013,32 places heightened importance on delineating relationships, and its Appendix I lays out a rich set of terms for them.33 Our actions on the project in this regard align with the intent of the new cataloging code to develop a focus on relationships in bibliographic records: one of the overall goals of RDA is to enable catalogers and other metadata specialists to create bibliographic records that will function more effectively in the wider, and increasingly interconnected, data environment.34

IMPLEMENTATION

Having agreed on cataloging standards, our attention turned to implementation of the project.
Both have been trained to evaluate and formulate access points, and one has been trained to supply original description for several formats, making their skill sets perfect for this project. In their workflow, incoming batches of work are divided between them in a ratio based on their incoming workloads and their official work assignments. Cataloging then commences based on the Scope of Work document. The San Diego catalogers, using Santa Barbara's OCLC authorization, make all changes appropriate for the OCLC master record and save the records in Santa Barbara's online save file after making any further edits needed for the UCSB local catalog. When completed, the catalogers record the save file number on the tracking spreadsheet. Each cataloger then sends their work to the other for a final review. Any questions are bounced to one of the two project managers: on San Diego’s end, this is the Music & Media Cataloging Unit Head. Up to this point, the cataloging surrogates have proven altogether satisfactory in providing enough information for copy- searching the materials and the subsequent cataloging and enrichment of the OCLC master records. Before the project commenced, we predicted that discs would need to be shipped occasionally so that the cataloger could listen to the contents in order to answer certain questions not properly dealt with by the 25 surrogate. Surprisingly, however, this has not proven necessary. Once the batch is reviewed, the surrogates are collected and shipped back to Santa Barbara. When the completed materials are received by Santa Barbara, the student assistant completes the processing, using the tracking spreadsheet to cross match each CD’s call number with the OCLC Connexion client online save file number. The assistant retrieves the bibliographic record cataloged by San Diego from the OCLC online save file and exports it into Santa Barbara’s Aleph ILS, overlaying the existing record and completing or revising the holdings and items records in Aleph. Finally, the assistant reunites the disc, booklet, and case, recycles the photocopies, and reshelves the CDs. To keep the affected parties informed, the project manager at UC San Diego provides periodic updates on the overall status of the project. These e-mail updates include such details as dates when batches are assigned to catalogers, moved into final review status, or readied for shipment. E-mail is also used for the catalogers’ occasional questions for Santa Barbara, and by the project manager at each location to alert the other when surrogates are shipped or received. At UC San Diego, Confluence, the content collaboration tool created by software firm Atlassian, is used to manage a common wiki space that functions as a shared collection point for all important documents and communications. 26 The entire lifecycle of project materials from receipt in San Diego to shipment back to Santa Barbara is now tracked using one of the wiki pages. Current plans are to authorize selected collaborators at UC Santa Barbara to access the local project wiki pages and view the detailed workflow status. Initially, Santa Barbara staff shipped batches of 30 surrogates for cataloging. As the San Diego catalogers adjusted to their new routine, the batch size was increased to 50. 
The two libraries developed a rhythm, rotating three batches at a time: while one batch (C) at Santa Barbara is being prepared to ship to San Diego, there is a batch (B) being cataloged at San Diego; at the same time, a completed batch (A) is on its way back to Santa Barbara. In practice, when San Diego sends Santa Barbara an email indicating that Batch A is ready to ship back, Santa Barbara sends Batch C to San Diego, with the result that, ideally, no more than 100 of Santa Barbara’s items are out at any given time. Cataloging for most batches is completed in approximately four to seven working days. Use of the most economical mode of transport and campus mail services can add an additional week to either end of the timeline, resulting in a three week turnaround on average for each batch. CONCLUSION: LESSONS LEARNED AND FUTURE DIRECTIONS 27 So far, the UC Santa Barbara-UC San Diego insourcing project has been a success. Twelve months into the project, 880 titles have been cataloged by UC San Diego for the UC Santa Barbara Music Library, and the project is still ongoing. The Head of UC Santa Barbara’s Music Library has expressed her satisfaction with the project, writing that it has provided a highly cost effective way for us to implement high-quality copy cataloging for the CD backlog in the Music Library … [T]he collection that we targeted for the project is [of] high value for our campus, so the opportunity to have it cataloged has served us very well.35 Much of the success of this project has been facilitated by the partner institutions being closely aligned in their cataloging policies: at the start of the project, UC San Diego’s cataloging policies were similar to those of UC Santa Barbara, necessitating only a brief training period for the San Diego catalogers involved. Although subtle differences in cataloging between the campuses proved troublesome to keep separate in practice, working in large batches has helped UC San Diego’s catalogers keep the two campuses’ requirements properly segregated. Exacerbating this situation, however, is something that was not anticipated when the project began: the possibility that one or both institutions’ standards and practices might change in ways that could affect the consistent 28 workflow established by the project managers and set out in the Scope of Work document. Several months into the project, San Diego moved towards simplifying its cataloging processes by adopting new guidelines that were informed by the BIBCO Standard Record Metadata Application Profiles (MAPS).36 Although the sound recording standards adopted by San Diego were richer than the baseline defined in the MAPS, they marked a further distancing from the standards used to process CDs for UC Santa Barbara. Maintaining a bifurcated workflow has injected additional complexity into the process, requiring catalogers to continue doing some things for some items, while at the same time retraining themselves for a different group of materials. That being said, processing two streams of materials according to different standards has proven manageable, if not exactly optimal. In working on this project, we have learned and considered many other things about the ways a shared project can be structured and managed, as well as how to deal with unforeseen circumstances such as the one just described. 
Earlier, we discussed impediments to attempting a similar project with printed scores, and we acknowledge that the project participants chose a format of material that better accommodates remote cataloging. We nevertheless believe that a project involving the cataloging of scores from the items themselves might 29 be feasible under certain circumstances, and possibly mandatory for some projects involving original cataloging. This issue could be addressed by building into the project the occasional transport of material that is oversized or otherwise difficult to capture in a surrogate. An even more ambitious approach might be to pair a cataloging project with a digitization effort, a step that could help produce superior surrogates while at the same time eliminating the labor required to handle and scan an item twice. In such a situation, however, accommodations to standard digitization workflows would have to be worked out: for example, it would likely be desirable to include a scale that a cataloger could use to ascertain the size of the original. Our experience further suggests that enlarging such a project for a wider pool of participants, each with different cataloging standards and practices, would magnify some of the problems we have noted here; and with the impending adoption of RDA, the possibility that participating institutions might choose different implementation dates and practices would further complicate matters. For example, the earlier-discussed potential for inconsistency created by the different levels of granularity permitted in the choice of relationship terms/codes under AACR2 is continued in RDA, with Appendix I.1 directing the cataloger to “[u]se relationship designators at the level of specificity that is 30 considered appropriate for the purposes of the agency creating the data.” In such situations, some problems could be avoided by coordinating a common implementation date, while the move to a new cataloging code could serve as an opportunity for participating institutions to work towards a more consistent policy unencumbered by institutional history, should that be a possibility. Our experience on the project clearly demonstrates that a unitary cataloging policy is easier to implement than parallel workflows. The informal pilot-project nature of this partnership poses other interesting questions, the most prominent of which is that of funding. Clearly, UC San Diego is shouldering the greater part of this project’s expenses, dedicating portions of the time of two catalogers and a manager to process the CDs and oversee the project. Further highlighting the fiscal imbalance, as we noted earlier in our literature review, UC San Diego derives only slight apparent benefit from performing this work, and its cataloging operations are complicated and made less efficient by the need to operate using multiple standards. When viewed more generously from the wider University level, however, the local imbalance could be viewed as an intra-University sharing of scarce resources, in the spirit of the University's Next Generation Technical Services initiative discussed in the introduction to this paper. The project, then, clearly functions as 31 an early experiment in bringing some of these principles to life. The reality that campus libraries are funded separately, and that this is not a formally funded pilot project, however, reinforces our initial sense that this is an example of filial largesse within the University. 
One potential model for “funding” consortial projects such as this one in a more equitable way across the system could be that of a cooperative in which libraries accrue cataloging credits or debits in a central “bank” based on non-monetary currency such as cataloging hours or the number of titles cataloged. Continuing discussions on further integrating the University's cataloging operations doubtless will be looking at parts of this project to model potential new shared cataloging services. Whether this particular project is viewed as an area to develop on a larger scale remains to be seen. In any event, a more comprehensive planning process would need to precede any attempt at a larger or more formal project. Another side-effect of the somewhat informal nature of this project highlights the fact that the project CDs rank lower in priority at UC San Diego than its own incoming materials. While this ensures that the hosting institution's workload does not suffer, it does leave UC Santa Barbara in the position of not being able to count on their materials being processed against a guaranteed throughput target. In practice, this has not been a problem, as the OPAC records 32 of items sent to San Diego for cataloging have been suppressed to temporarily prevent their retrieval by potential users. This does beg the question of their availability, however. A more binding arrangement with formalized priorities and formally restructured percentages of time for the catalogers would help create an environment where the institution with the items needing to be cataloged could be assured of more reliable throughput. We have described several aspects of the project that could benefit from a more formal approach. Informality, however, does have its advantages, and the fact that the project exists at all is a testament to it. Only a very few weeks elapsed between the original call for interest and the start of actual cataloging. Setting project standards and expectations received proper attention, but was eased along by both sides’ flexibility and willingness to compromise on points important to the other. Proceeding as an exploratory pilot project has allowed UC Santa Barbara to get a significant block of their music CD backlog processed without major expense. At the same time, catalogers at San Diego have been able to focus their work on materials that correspond with their main area of expertise and interest. As the Head of UC Santa Barbara’s Music Library put it, “this has been a win-win situation.”37 33 NOTES 1. For background documents, see Shared Cataloging Program (SCP) (August 4, 2010). http://www.cdlib.org/services/collections/scp/ (accessed February 24, 2012). See also Patricia Sheldahl French, Rebecca Culbertson, and Lai-Ying Hsiung, "One for Nine: The Shared Cataloging Program of the California Digital Library," Serials Review 28, no. 1 (2002): 4-12. 2. For a planning document, see Working Group on the UC Shared Print Collection Pilot, Report (2003). http://libraries.universityofcalifornia.edu/cdc/taskforces/ucsharedcoll-pilot-rpt.pdf (accessed February 24, 2012). For an evaluation of the test, see Elsevier/ACM Pilot Assessment Team, Report (2004). http://libraries.universityofcalifornia.edu/cdc/taskforces/elsevier_acm_assessment.doc (accessed February 24, 2012). 3. University of California Libraries Bibliographic Services Task Force (BSTF), Rethinking How We Provide Bibliographic Services for the University of California (2005). 
http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (accessed September 14, 2007).
4. See UC/OCLC Pilot Implementation (2012). http://libraries.universityofcalifornia.edu/about/uc_oclc.html (accessed February 24, 2012).
5. For general background on past and current phases, see Next-Generation Technical Services (NGTS) (2012). http://libraries.universityofcalifornia.edu/about/uls/ngts/index.html (accessed February 24, 2012).
6. "Insourcing, n." OED Online. March 2012. Oxford University Press. http://www.oed.com/view/Entry/267477?redirectedFrom=insourcing (accessed April 20, 2012).
7. UC Libraries BSTF, Rethinking How We Provide Bibliographic Services.
8. Ibid., 21.
9. For a historical overview of library cooperation in general, see Joseph E. Straw, "When the Walls Came Tumbling Down: The Development of Cooperative Service and Resource Sharing in Libraries: 1876-2002," The Reference Librarian 83-84 (2003): 263-276.
10. Joseph Kiegel and Merry Schellinger, "A Cooperative Cataloging Project between Two Large Academic Libraries," Library Resources and Technical Services 37, no. 2 (1993): 221-225.
11. James S. Chervinko, "Cooperative and Contract Cataloging of Foreign-Language Materials in Academic and Research Libraries," Cataloging & Classification Quarterly 21, no. 1 (1995): 29-65.
12. Magda El-Sherbini, "Sharing Cataloging Expertise: Options for Libraries to Share Their Skilled Catalogers with Other Libraries," Cataloging & Classification Quarterly 48, no. 6-7 (2010): 525-540.
13. See, for example, Magda El-Sherbini, "Contract Cataloging: A Pilot Project for Outsourcing Slavic Books," Cataloging & Classification Quarterly 20, no. 3 (1995): 57-73; and Magda El-Sherbini, "Outsourcing of Slavic Cataloging at the Ohio State University Libraries: Evaluation and Cost Analysis," Library Management 23, no. 6-7 (2002): 325-329.
14. Library of Congress Working Group on the Future of Bibliographic Control, On the Record (2008). http://www.loc.gov/bibliographic-future/ (accessed September 19, 2007).
15. Indiana University Task Force on the Future of Cataloging, A White Paper on the Future of Cataloging at Indiana University (2006). http://www1.iub.edu/~libtserv/pub/Future_of_Cataloging_White_Paper.doc (accessed September 3, 2007).
16. El-Sherbini, "Sharing Cataloging Expertise," 529.
17. Sheila Ayers, "The Outsourcing of Cataloging: The Effect on Libraries," Current Studies in Librarianship 27, no. 1-2 (2003): 17-28.
18. Rebecca L. Lubas, "Managing Vendor Cataloging to Maximize Access," in Practical Strategies for Cataloging Departments, ed. Rebecca L. Lubas (Santa Barbara, California: Libraries Unlimited, 2011), 65-72.
19. Kenneth J. Bierman and Judith A. Carter, "Outsourcing Monograph Cataloging at the UNLV Libraries," Technical Services Quarterly 25, no. 3 (2008): 49-65.
20. Faye R. Leibowitz and Michael A. Arthur, "Risky Business: Outsourcing Serials Cataloging," The Serials Librarian 54, no. 3-4 (2008): 253-260.
21. Ruth W. Tucker, "Music Retrospective Conversion at the University of California at Berkeley: Conversion of Musical Scores through a Consortium," Technical Services Quarterly 7, no. 2 (1989): 13-28.
22. Nancy Lorimer, "Unlocking Historical Audio Collections: Collaborative Cataloging and Batch Searching of 78 rpm Recordings," Technical Services Quarterly 29, no. 1 (2012): 1-12.
23. The Slavic duties were added due to an additional retirement.
24.
Such treatment is not uncommon: UCSD had a similar practice for its sound recordings in the past but was able to eliminate its backlog in 2010. And while one of the authors was attending library school at the University of Wisconsin-Madison (2004-2006), the Music Library there applied a similar practice to its music score backlog, filing the scores by accession number in closed stacks and making them available for checkout upon request.
25. Judy MacLeod and Kim Lloyd, "A Study of Music Cataloging Backlogs," Library Resources and Technical Services 38, no. 1 (1994): 11, 14. The authors also cite AACR2 as a major contributing factor, especially in regards to its uniform title rules (8).
26. A. Ralph Papakhian, "Cataloging," Notes 56, no. 3 (2000): 584-585.
27. Music OCLC Users Group Reference Services Committee, WorldCat Local Enhancement Recommendations for Music (August 2009, revised April 2010). http://www.musicoclcusers.org/WorldCatLocal20100412.pdf; and Music OCLC Users Group Reference Services Committee, Music Recommendations for WorldCat Local: Status Report (January 21, 2011). http://www.musicoclcusers.org/WorldCatLocal20101222rev.pdf (accessed January 25, 2012).
28. Music Library Association (MLA) Emerging Technologies Committee, Music Discovery Requirements, Draft 2 (February 9, 2012). http://personal.ecu.edu/newcomern/musicdiscoveryrequirementsfeb92012.pdf (accessed February 10, 2012).
29. Ibid., 6.
30. Jay Weitz, Expert Community: Guidelines for Experts (November 2009). http://www.oclc.org/support/documentation/worldcat/cataloging/ece/default.htm (accessed February 21, 2012).
31. MARC Code List for Relators: Term Sequence (December 7, 2010). http://www.loc.gov/marc/relators/relaterm.html (accessed February 26, 2012).
32. Beacher Wiggins, "Library of Congress Announces Its Long-Range RDA Training Plan (Updated March 2, 2012)." http://www.loc.gov/catdir/cpso/news_rda_implementation_date.html (accessed April 21, 2012).
33. RDA: Resource Description and Access, Appendix I, "Relationship Designators: Relationships between a Resource and Persons, Families, and Corporate Bodies Associated with the Resource" (August 16, 2010). http://access.rdatoolkit.org/rdaappi.htm (accessed February 22, 2012).
34. Presentations from a session on RDA and Linked Data at the Music Library Association's 2012 Annual Meeting demonstrate the potential for realization of this goal: see Jenn Riley, "RDA and Linked Data: Moving Beyond the Rules" (February 18, 2012). https://docs.google.com/viewer?a=v&pid=sites&srcid=bXVzaWNsaWJyYXJ5YXNzb2Mub3JnfG1sYWRhbGxhczIwMTJ8Z3g6NzRlOGIwZTZjYjQ2NjAwYg (accessed February 27, 2012); and Kimmy Szeto, "A Brief Introduction to RDF/Linked Data and RDA Registered Properties" (February 18, 2012). https://docs.google.com/viewer?a=v&pid=sites&srcid=bXVzaWNsaWJyYXJ5YXNzb2Mub3JnfG1sYWRhbGxhczIwMTJ8Z3g6M2VmZTk1OGEyY2U5NDZhNA (accessed February 27, 2012).
35. Eunice Schroeder, personal communication (February 26, 2012).
36. BIBCO Standard Record Metadata Application Profiles (September 16, 2011). http://www.loc.gov/catdir/pcc/bibco/BSR-MAPS.html (accessed February 21, 2012).
37. Schroeder, personal communication.

Appendix: Scope of Work: UCSD Guidelines for UCSB-UCSD CD Cataloging Project
Draft 4, last edited 110330

Workflow

Cataloger/UCSD project manager will:
- Acknowledge receipt of material to be cataloged.
- Examine scans of disc and container and verify that they are legible.
- Copy-search item on OCLC, using UCSB login.
- If copy is found, catalog item according to the guidelines below; if no copy, bounce to original cataloger.
- Apply OCLC record validation and fix any errors.
- Control all headings (unless doing so will render the form of heading incorrect) and replace existing OCLC record with edited version when appropriate.
- Save completed, edited record in UCSB save file; note save number on spreadsheet.
- Notify UCSB if clarification or further information is needed to catalog an item.
- Re-save records in file before sending shipment back to UCSB.
- Notify UCSB when a batch is completed.

Cataloging Guidelines

Examine entire record for typographical errors and completeness. Pay special attention to fields indicated below. Note that UCSB may have provided additional important information that may not be apparent in the scans of the disc and container.

OCLC fixed fields: Type, Lang, AccM, Ctry, DtSt, Dates.

007: Physical Description Fixed Field. Correct or add. Code subfield e to correspond to stereo or mono as found in subfield b of the 300 field; if not stated in 300, use value "u" for unknown.

020 |a: ISBN (not likely to appear for a sound recording, but may be present in accompanying print material).

024: Other Standard Identifier. Code ISRC (1st indicator 0); UPC (1st indicator 1); ISMN (1st indicator 2); EAN (1st indicator 3).

028: Label Number. Enter with indicators "02" when there is only one label number; enter with indicators "00" if more than one is present. When more than one is present, also supply an eye-readable note: 500bb [Label name]: [number 1]; [number 2]…

041: Language Code. Provide only when more than one language is present. Place language of main item (sung/spoken text) in |d; language of libretti in |e; language of other accompanying material in |g.

043: Geographic Area Code. Supply when subject headings indicate a geographical focus. Use OCLC 043 generator macro.

1xx: Name or UT main entry. Supply/correct as required. Control heading or verify in OCLC Name Authority File (NAF).

240: Uniform Title. Supply when required. Verify form in OCLC NAF.

245: Title Statement. Correct as necessary.

246: Varying Form of Title. Correct or supply as necessary. Generally supply subfield a only.

250: Edition Statement. Correct or supply as necessary.

260: Publication, Distribution, etc. Correct. Serious differences may constitute a different item; consult OCLC When to Input a New Record guidelines.

300: Physical Description. Correct. Serious differences may constitute a different item; consult OCLC When to Input a New Record guidelines.

490: Series Statement. Correct or supply as appropriate.

500: General Note. Supply or verify notes pertaining to: nature or form of work; multiple publisher's numbers; source of title if not from chief source; duration (when not presented in conjunction with a contents note); presence of significant accompanying material (such as a booklet). If durations are many and would be overly complex to enumerate in 500 or 505, make generic note, "Durations listed [on container/in booklet]."

505: Formatted Contents Note. Supply or verify contents. Durations and performers may appear here instead of in separate duration and performer notes if the cataloger determines that this is the clearest and most economical way of recording the information. Prefer basic formatting, without subfields, unless copy already uses enhanced formatting.

511: Participant/Performer Note. Correct or supply if participants are not recorded in conjunction with a contents note.
If the item names the individual performers in a larger group, list the performers parenthetically after the name of the group unless more than eleven performers are named.
518: Date/Time and Place of Event. Correct or supply recording details presented on the item.
546: Language Note. Give language of sung/spoken text if not deducible from the rest of the description; give language of libretto and significant commentary.
6XX: Subject Headings. Supply up to a total of 5 unless cataloger judgment requires more. Control headings or verify forms in OCLC SAF.
7XX: Added Entries. For classical albums, provide name and uniform title access to all performers and works listed on the container unless cataloger judgment decides this would be excessive. For popular music, provide performer name access only. When a group is named, don't provide individual access points for its members. Add relator codes to performers, using cnd, itr or voc; however, if the bib record already has a pattern established where itr and voc are collapsed into prf, follow the technique found in the record.
830: Uniform Title. Provide/verify as necessary. Trace traceable series.

work_xb2akbhlljafpn4qig2lapassq ----
Issue Information: Geography Compass
Overview
Geography Compass is an online-only journal publishing original, peer-reviewed surveys of current research from across the entire discipline.
Aims and Scope
Unique in its range, Geography Compass is an online-only journal publishing original, peer-reviewed surveys of current research from across the entire discipline. Geography Compass is inclusive: it does not privilege any one perspective over another, it is open to all authors, and it publishes articles that are both theoretical and practical in orientation, or concerned with methodology, as well as issue-oriented reviews. The journal's emphasis is upon state-of-the-art reviews, supported by a comprehensive bibliography and accessible to an international readership of geographers and scholars in related disciplines. Geography Compass is aimed at students, researchers and non-specialist scholars, and will provide a unique reference tool for researching essays, preparing lectures, writing a research proposal, or just keeping up with new developments in a specific area of interest.
Geography Compass...
• …supports your research with over 100 new articles per year, sourced from an international scholarly community. Gain an introduction to new fields, an overview of unfamiliar topics, and familiarity with the latest scholarship and debate.
• …informs your teaching with lively original articles that are quickly and continuously replenished, and supplemented with teaching guides. Geography Compass will provide you with up-to-date bibliographies and expert analysis on key themes to inspire and engage your students.
Explore Geography Compass for:
• A new kind of core content: state-of-the-art surveys of current research discuss the major topics, issues, viewpoints, and controversies within each area of the discipline.
• Coverage of the entire field highlights connections across sub-disciplines within geographical research.
• Reference-linked bibliographies for each article, providing the ideal entry point into specialist literature.
• Teaching Guides from article authors to inspire and engage your students.
• 100 articles per year: 3 times more than a standard journal.
• Fast continuous publication: articles typically available 6-8 weeks after acceptance and, as an online-only journal, there are no issue restrictions.
Fields covered by Geography Compass include: Atmosphere & Biosphere, Cultural Geography, Development, Economic, Environment & Society, Geomorphology & Hydrology, GIS & Earth Observation, Political Geography, Social Geography and Urban Geography.
Visit www.geography-compass.com
For information on other Wiley-Blackwell Compass Journals visit www.blackwell-compass.com
Keywords
geography, compass, Atmosphere & Biosphere, Cultural, Development, Economic, Environment & Society, Geomorphology & Hydrology, GIS & Earth Observation, Political, Social, Urban
Abstracting and Indexing Information
• Academic Search Complete (EBSCO Publishing)
• CSA Environmental Sciences & Pollution Management Database (ProQuest)
• CSA Sustainability Science Abstracts (ProQuest)
• Environmental Sciences and Pollution Management (OCLC)
• GEOBASE (Elsevier)
• SCOPUS (Elsevier)
• Water Resources Abstracts (ProQuest)
Editorial Board
Editors-in-Chief
Mike Bradshaw, University of Leicester, UK
Associate Editor for Physical Geography
John Kupfer, University of South Carolina, USA
Section Editors
Atmosphere & Biosphere: John Kupfer, University of South Carolina, USA; Scott Curtis, East Carolina University, USA
Cultural Geography: Gail Davies, University of Exeter, UK
Development Geography: Cheryl McEwan, Durham University, UK
Economic Geography: Andrew Wood, University of Kentucky, USA; James Faulconbridge, Lancaster University, UK
Environment and Society: Geoff Wilson, The University of Plymouth, UK; Trevor Birkenholtz, University of Illinois at Urbana-Champaign, USA
Geomorphology & Hydrology: Paul F. Hudson, University of Texas at Austin, USA; Sheryl Luzzadder Beach, The University of Texas at Austin, USA
GIS & Earth Observation: Ryan Jensen, Brigham Young University, USA; John Jensen, University of South Carolina, USA; Michael N. DeMers, New Mexico State University, USA
Political Geography: Fiona McConnell, University of Oxford, UK
Social Geography: Jon May, Queen Mary College, London, UK
Urban Geography: Mark Jayne, University of Manchester, UK
Editorial Board Members
Atmosphere & Biosphere: David M. Cairns P. Grady Dixon Taly Drezner Lesley-Ann Dupigny-Giroux Lesley Rigg Richard Field Scott Franklin Michelle Goman Georg Grabherr Paul Knapp Rezaul Mahmood George Malanson Patrick Moss Corene Matyas Andrew Oliphant George Perry Scott Robeson John Rogan
Cultural Geography: Gavin Brown Ian Cook Mike Crang Philip Crang Dydia DeLyser Caitlin DeSilvey David Lambert Denis Linehan Hayden Lorimer Fraser MacDonald Catherine Nash Divya Praful Tolia-Kelly
Development: Anthony Bebbington Reginald Cline-Cole Sara Kindon Claire Mercer Giles Mohan Warwick Murray Richa Nagar Rob Potter Saraswati Raju Jonathan Rigg Jennifer Robinson Alison Stenning
Economic Geography: Roger Hayter Philip Kelly George C.S. Lin Danny MacKinnon Phillip O'Neill Bae-Gyoon Park John Pickles Jane Pollard Jessie Poon Dominic Power Norma Rantisi Adrian Smith Peter Sunley Andrew Wood Matthew Zook
Environment & Society: Ian Bailey Anne Chin Julian Clark Anna Davies Kirstie Fryirs Mike Goodman David Laurence Higgitt Richard Howitt Richard Huggett Kate Rowntree
Geomorphology & Hydrology: Ramon Batalla Mike Bonell Sean Carey Helmut Elsenbeer Jeff McDonnell Jim McNamarra Tom Meixner Allan Rodhe Jan Seibert Murugesu Sivapalan John Wainwright William Graf Carol P. Harden Stuart Lane Richard Marston Takashi Oguchi Francisco L. Pérez André G. Roy Randall Schaetzl Douglas J. Sherman Catherine Souch Markus Stoffel Martin Thoms Ellen Wohl
GIS & Earth Observation: Sharolyn Anderson Gennady Andrienko Robert Edsall Sarah Elwood Douglas Flewelling Mark Gahegan Rina Ghose Randy Gimblett Michael Goodchild Carl Grundy-Warr Mark Harrower R. Daniel Jacobsen Chris Jones Brian Lees Steven Manson Robert McMaster Jeremy Mennis Morton O'Kelly Martin Raubal Nadine Schuurman Shashi Shekhar Kathleen Stewart (Hornsby) Monica Wachowicz Mike Worboys Nick Mount Andrew Lovett Kelley A. Crews-Meyer Reginald Fletcher Perry Hardin R. Douglas Ramsey Jason Tullis Yong Wang Timothy Warner Dawn Wright
Political Geography: Peter Adey Veit Bachmann Alana Boland Angharad Closs Stephens Mathew Coleman Simon Dalby Lorraine Dowle Klaus Dodds Colin Flint Kevin Grove Merje Kuus Katie Meehan Jo Sharp Chih Yuan Woon
Social Geography: Paul Cloke Damian Collins Robyn Dowling Isabel Dyck Robyn Longhurst Robin Kearns Don Mitchell Virginia Parks Deborah Phillips Gill Valentine
Urban Geography: Ben Derudder Kevin Dunn Bethan Evans Steve Herbert Phil Hubbard Kurt Iveson Andy Jonas Paul Knox Loretta Lees David Ley Eugene McCann Deborah Martin Pauline McGuirk Byron Miller Ugo Rossi Elvin Wyly

work_xb32r2fe4vfixcpjh2xvflssle ----
ALEXANDRIA, 14(3), 2002
Automation of Processes in the National Library of China: Historical Review and Future Perspective
BEN GU
Ben Gu has an MS in Mathematics from Fudan University, Shanghai, and a PhD in Management Science from Renmin University of China, Beijing. He began work in the National Library of China as a book selector in 1987, and became Chief of Book Selection in 1989. He was appointed Director of the Centre for Acquisitions & Cataloguing of Foreign Books in 1998 and Deputy Director of the Acquisitions & Cataloguing Department in 1999. He has more than 100 publications to his name, covering library science, the book industry, music and Chinese studies.
INTRODUCTION
The National Library of China (NLC) was established in 1909. As of December 2001, it had a collection of more than 21 million items, including 5.5 million volumes of Chinese monographs, 3 million volumes of monographs in foreign languages, 45,230 titles of Chinese periodicals, 41,436 titles of periodicals in foreign languages, and various special collections. About 1,400 full-time employees and several hundred part-time employees work in the library.
In the last two decades, the automation of processes in the National Library of China has progressed smoothly, but only up to a certain point. The processing of publications in foreign languages is still to be computerized. This situation does not match the present state of library automation in the countries from which these publications are imported. There are several reasons for this. As the country's national library, the NLC must give top priority to the processing of Chinese publications; the prices of library systems for the processing of publications in foreign languages are comparatively high, and we can purchase an efficient system only when the budget is sufficient; and limited technologies have not previously allowed a library system to process all languages on a single platform. In this paper, I review the history of automation in the NLC and put forward some proposals for future development.
THE 1980s
In the 1980s, the NLC used LC MARC tapes to print bibliographic records of books in specific subjects, such as on China and Marxism, for acquisitions purposes.
This was the earliest form of library automation; it simply provided more information for book selectors and did not change the manual work processes. During this period, the acquisitions librarians corresponded with foreign publishers or booksellers by mail, at best by telex.
In 1987, when the present library building was opened to the public, the NLC formally began its library automation programme. It specified keyboards in Western, Japanese and Russian languages, and drafted system requirements. Because there were no software systems and software developers, the NEC computers were not used for the processing of publications in foreign languages. At the same time, staff tried to make more use of them. They used Japanese and Russian terminals for word processing and compiled some bibliographies. Because terminals did not have hard discs, data could be stored only on 360KB or 1.2MB floppies, which could not hold large amounts of information or accommodate databases. The bibliographies compiled at that time could not be converted into IBM-compatible formats, and they cannot be used now. It might be said that computers were used as typewriters.
In 1990, acquisitions librarians of Western monographs used dBase II to create an ISBN database on an NEC Chinese terminal with a 20MB hard disc. Since then, book selectors have been using it for duplicate-checking when they select new titles. The database was later updated and switched to IBM-compatible computers. Although quite a simple system, it has saved much manual work.
THE 1990s
Acquisitions
One of the main characteristics of the 1990s was the large-scale application of CD-ROMs. The first CD-ROM title we used was BIP (Books in Print with Reviews Plus). It replaced the three volumes of Books in Print, the four volumes of the Subject Guide to Books in Print and the biannual Forthcoming Books, saved much shelf space, and improved search efficiency and timeliness. During this period, fax was becoming popular, and e-mail was becoming a main method of communication with foreign booksellers.
Cataloguing
Western languages
Automation of cataloguing began in the early 1990s with the introduction of the Bibliofile CD-ROM produced by TLC (The Library Corporation). Limited by the network environment and the computer expertise of staff, all MARC records downloaded were stored on floppies. Because of poor physical conditions and disc quality, many of the several hundred floppy diskettes were found unusable when bibliographic records were converted into ISO 2709 format in recent years. This caused a waste of human resources in that some downloading operations had to be done again.
With consideration of prices and search interfaces, we began to replace Bibliofile with OCLC CatCD for the cataloguing of books in Western languages in 1998. The latter can be run on a Windows NT-based network and shared by many users.
Chinese books
During this period, Chinese monograph cataloguers began to use the ACOS system based on NEC mainframe computers to catalogue new Chinese monographs and produce the China National Bibliography. After a few years, ACOS was replaced by a Windows-based system produced by Wenjin IT Centre, which is located in the NLC building and consists of former NLC computer technicians.
Japanese books
In the Section for Monographs in Oriental Languages, a Novell network was installed for access to bibliographical CD-ROMs produced by the National Diet Library and Nippan. At first, staff used floppies to store downloaded MARC records for future conversion. Because the library did not plan to introduce a special system for processing Japanese books, the section stopped using the floppies in 1998. The CD-ROMs are now used solely for printing card catalogues of books searched in them.
Russian books
In 1998, some Chinese library software vendors were interested in testing the processing of Russian in their Chinese software. Theoretically, Russian can be processed only if necessary changes are made to the interface and a Russian operating system is used. Because of the limited market in China, the vendors did not pursue this application. Even today very few Chinese libraries can process Chinese, Russian and Japanese in a single system.
International exchange
In the early 1990s, the International Exchange Section used dBase II (later FoxBase 2.0) to make a simple database for the management of the addresses of exchange partners. The work in this section was thus automated to a certain degree and became more efficient.
Reader Services
OPAC
In 1999, Wenjin IT Centre produced a Web-based OPAC system for public access and online reservation. As the first open access catalogue system in the library, it can be used to search bibliographic records of Chinese publications and publications in Western languages, although Latin characters with diacritics appear as strange Chinese characters. However, the system has several disadvantages:
• it is based on the Chinese version of Windows, and does not support Western languages, especially characters with diacritics, let alone other languages, such as Russian and Japanese;
• it does not have an English-language interface and is not easy to use for foreign users;
• it does not have many search options.
Closed-stack service
In the late 1990s, Tsinghua University helped the library to make a system specially for the management of reader status and the delivery of call-slip information. This has replaced the former mechanical call-slip delivery system.
Open-stack service
A circulation system purchased from an American company in the late 1980s went out of service and was finally replaced by a new system produced by Wenjin IT Centre. With this system, the library can use smart cards jointly issued by the NLC and the Industrial and Commercial Bank of China to process user information (including ID numbers and colour photographs) and circulation records.
Infrastructure
In 1999, the NLC completed the construction of a gigabit library-wide network, with every information node sharing 10 Mbps. It was the first Gigabit Ethernet library-wide network ever built in China. In 2000, the library implemented second-stage library-wide network engineering, upgrading network equipment, adding information nodes and computers and developing a VOD system.
The NLC began to be connected to CERNet (China Education and Research Network), based at Tsinghua University, via microwave in 1995, and has since been connected to Peking University and the Chinese Academy of Sciences. Its connection with ChinaNet began in 1996, and with BCTV in 1997; it established a 100-Mbps connection with the State Council and a 1-Gbps connection with China CATV Net in 1999.
The NLC now has Internet connections with almost all backbone networks in China.
Summary
We can summarize the following characteristics of library automation during the 1990s:
• Improvisation: automation in the acquisitions and cataloguing departments was not performed by computer professionals, and librarians had to spend quite a lot of time learning to write computer programs.
• Independence: various processes used different systems and were not linked, and data could not be shared.
• Incompleteness: only acquisitions and cataloguing of books in some languages were automated to a certain degree, leaving all other languages with a manual or semi-manual operation.
• A good network without an integrated system: a gigabit library-wide network was constructed, but there was no integrated library management system for all the processes.
• No generally used standards: different sections, such as those for new Chinese books, Chinese rare books, audiovisual materials, etc., used different cataloguing formats for different systems, or different rules for the same formats.
At this time, library automation in the National Library of China lagged far behind not only national libraries of developed countries, but also other major libraries in China, such as Shanghai Library, Peking University Library and Tsinghua University Library. The reasons include the inadequacy of budgets, the large size of the collections, the pursuit of processing of all languages in a single system, and some management problems.
PRESENT POSITION
Present processes for foreign publications
Acquisitions
1. New title information is obtained from publishers, booksellers, CD-ROMs, the Internet, etc.
2. Book selectors make their selection on bookseller order forms or print interim order slips for acquisitions librarians.
3. Book selectors check the ISBN database and the author/title card catalogues for duplicates.
4. Acquisitions librarians prepare orders on interim order slips prepared by book selectors or order forms produced by booksellers.
5. Acquisitions librarians check library holdings for duplicates again and maintain author and title card catalogues.
6. Orders are sent to booksellers via e-mail and printed order forms.
7. Acquisitions librarians use typewriters to put short titles line-by-line on individual check-in forms.
8. Acquisitions librarians calculate total amounts of invoices for payment.
9. When a title has been processed by cataloguers, acquisitions librarians have to replace the card order form with a catalogue card.
Almost all the processes are manual. Acquisitions staff have to manually type bibliographic records two or three times. If the manual work of cataloguing is taken into account, there is even more duplication of work.
Cataloguing
Russian librarians use typewriters to type catalogue cards; librarians with knowledge of Korean, Arabic and Hindi use PCs to print catalogue cards; and librarians who know Japanese use CD-ROMs to print catalogue cards.
As for cataloguers of monographs in Western languages, they classify books according to the Chinese Library Classification, manually check staff catalogues to assign author numbers, use OCLC CatCD to download bibliographic records, and manually print spine labels. They then convert OCLC files into ISO 2709 format and forward them to computer technicians, who have to convert these records again for uploading to the OPAC system.
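For readers unfamiliar with the exchange format involved, an ISO 2709 record consists of a 24-byte leader, a directory of fixed-length entries (12 bytes each in MARC's profile of the standard), and the data fields themselves. The Python fragment below is a minimal illustrative sketch written for this review, not the NLC's actual conversion program, and it deliberately ignores the character-set handling (MARC-8, GB 2312, Unicode and so on) that makes real batch conversion difficult:

    FIELD_TERMINATOR = b"\x1e"

    def parse_iso2709(record: bytes) -> dict:
        """Return a {tag: raw field data} map for one ISO 2709 record."""
        leader = record[:24]
        base_address = int(leader[12:17])        # where the data fields begin
        directory = record[24:base_address - 1]  # 12-byte entries, minus terminator
        fields = {}
        for i in range(0, len(directory), 12):
            entry = directory[i:i + 12]
            tag = entry[:3].decode("ascii")      # e.g. "245" for the title field
            length = int(entry[3:7])             # field length, incl. terminator
            start = int(entry[7:12])             # offset from the base address
            data = record[base_address + start : base_address + start + length]
            fields[tag] = data.rstrip(FIELD_TERMINATOR)  # repeated tags overwrite
        return fields

A production converter would also have to keep repeated tags (multiple notes, for instance), split subfields at the 0x1f delimiter, and map between the character sets used by OCLC and by the local system.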
Meanwhile, reader service librarians have to check the OPAC record by record when they receive newly catalogued books from cataloguing sections. The OPAC search has no authority control, but the cataloguing section maintains some reference cards for authority control according to LC authority file microfiches. Although the library has an OPAC for searching books in Western languages, catalogue cards are still used for all foreign languages, and cataloguing staff have to organize all the card catalogues in addition to processing MARC records. If any changes in bibliographic records are needed, all the cumbersome procedures have to be repeated.
Reader services
Systems for reader services are separated from the OPAC system, and we are unable to link bibliographic records with user data. For example, when users search the OPAC, they cannot view the status of books and so do not know if an item is available or not. Likewise, the check-out desk staff simply record the call number and user ID, and there is no means of knowing what titles a user has borrowed.
Looking for a good integrated library system
System selection
To solve the problem of not having a library system for the processing of books in foreign languages, the NLC began to select an integrated system for this part of the work in late 1999. Among the principal requirements are that it must be technically up-to-date, able to do multilingual processing, extendible, capable of integration and consistent with the present processes.
Because there was no Chinese local library software vendor able to provide such a system, the system selection group visited major Chinese libraries using systems developed by foreign vendors and held discussions with representatives of major library software vendors in the world. After two years of consideration, the NLC decided to select Aleph 500, produced by Ex Libris, an Israel-based company, and signed the contract in September 2001. We find that Aleph 500 can meet most of our requirements; it supports Unicode, and is flexible, extendible and cost-effective. We also find that the Chinese and the Israelis are similar in their way of thinking, because they are both people with long traditions and with customs that differ from those of westerners. It has been decided that the system will be used not only for publications in foreign languages but for Chinese publications.
Preparatory work
In November 2001, the NLC and Ex Libris drafted a timetable for the implementation of the new system. It is expected that the preparatory work will take about a year, and the STP (Switch to Production) will be completed by the end of 2002.
In January 2002, a working group was established to coordinate preparatory work on the new Aleph system. It consists of managers, librarians and computer professionals from different departments, who are given responsibility for organizing training, setting up parameters and representing the NLC in contacts with Ex Libris.
Since Chinese publications have been catalogued in CNMARC format, which is based on UNIMARC, and publications in Western languages have been catalogued in MARC21 (USMARC), we have to create two bibliographic databases, in CNMARC and MARC21 formats respectively.
For simplicity, we plan to use MARC21 for all other foreign languages, including Russian, Japanese, Korean, Arabic, Vietnamese, Mongolian and Hindi, and are trying to find sources of bibliographic records for these languages.
FUTURE PERSPECTIVES: THE 21ST CENTURY
New processes
Acquisitions
1. New title information is acquired from publishers, booksellers, CD-ROMs, the Internet, etc., in MARC format, and then downloaded into temporary bibliographic databases; if the interfaces between Aleph 500 and booksellers are completed, direct downloading from bookseller databases can be used.
2. Book selectors make their selections from booksellers' MARC data and check for duplicates directly by searching data in related fields (author, title and ISBN).
3. Acquisitions librarians add order and vendor information.
4. Orders can be sent to booksellers via e-mail and EDI.
5. Check-in can be done by adding the necessary information to bibliographic and administrative records.
6. Total amounts can be generated automatically for payment.
7. Line items for general invoices can be generated from bibliographic records, and manual input is no longer necessary.
In addition, book selectors can utilize user statistics to analyse use rates of library collections so that necessary adjustments can be made to meet users' needs.
International exchange
Exchange librarians can use the acquisitions and serials modules by treating their exchange partners as if they were booksellers.
Cataloguing
Cataloguers can do original cataloguing, change bibliographic records created by acquisitions librarians, or use MARC records downloaded from OCLC CatCD and OCLC WorldCat or other services. Some booksellers can provide MARC records when they supply books. All the other work can be done in the system with little or no manual work. Card catalogues will be obsolete.
For Japanese monographs, we plan to use the Z39.50 service provided by the National Institute of Informatics in Japan to download MARC21 records for cataloguing.
For Chinese monographs, we can use the authority files prepared by the NLC, which have never been used before. We have purchased LC authority files for the authority control of records in Western languages. For other languages, we can create authority records or purchase them from relevant countries if they are available.
Because the NLC does not have librarians for all languages, publications in some languages cannot be processed. For example, the Section of Oriental Books is responsible for Vietnamese books, but it has had no Vietnamese specialists for some time. With the application of the new system, Vietnamese books and books in other Latin-script languages can be processed by Western-language librarians by downloading records from OCLC WorldCat. This is cost-effective for languages in which the library acquires fewer than 1,000 new monographs annually.
Reader services
With the application of the new system, reader service librarians can manage all items and users in a single system; reference librarians can use the system to provide an SDI service; and readers can use the Chinese and English interfaces and the powerful search functions of the OPAC to find information more easily.
Problems and solutions
Hardware
In principle, most computers in use at present should be upgraded or replaced by new ones to accommodate Windows 2000, as required by Aleph 500.
If we can find a way of using the present computers with Windows 98 in the related languages, we can save much money and time.
Bibliographic records for cataloguing and acquisitions
In the new system, bibliographic records are shared in all the processes, including acquisitions, cataloguing and reader services. However, bibliographic information from different sources, such as booksellers and cataloguing tools, differs. For example, a multi-volume title may be described by acquisitions librarians as such but by cataloguers as a series title consisting of various separate titles. Ways need to be found to deal with this sort of problem.
Authority control
Although Chinese authority databases have been maintained by the NLC for several years, they have never been used for authority control. The staff have to find means to establish a connection between the authority and bibliographic databases.
After purchasing the authority files from LC, we should make the necessary changes in the authority records, especially those for Chinese names, and add Chinese characters so that users can find titles in all languages by searching a single name. The question is whether we should maintain one authority file, or one for our own use and another for the original LC data.
MARC formats
Theoretically, Aleph 500 can accommodate many MARC formats. If we wish, we can create many bibliographic databases with different MARC formats for different languages, e.g. CNMARC, MARC21, RUSMARC and JapanMARC. However, this would lead to difficulty in system management and low efficiency in OPAC searching. After investigations and discussions, we have decided to use only two MARC formats, CNMARC and MARC21.
Payment
Many new integrated library management systems have e-business functions that allow users to use electronic invoices and electronic payment. However, almost no Chinese booksellers provide such services. We should try to persuade them to add such functions and facilitate our acquisitions work.
Reader services
Aleph 500 does not fully meet all the requirements of the library, especially those of reader services, such as reading room statistics, the processing of readers' cards, and the display of photographs for control purposes. (The smart cards issued jointly by the NLC and the Industrial and Commercial Bank of China do not contain photographs. We therefore require the system to display photographs of card-holders when they use the library. Ex Libris promises to develop this function, but has not yet done so.) To solve these problems, we need to persuade the vendors to add new functions to meet our requirements on the one hand, and on the other we should adjust our own processes to the system.
CONCLUSION
'Therefore its name was called Babel, because there the LORD confused the language of all the earth; and from there the LORD scattered them abroad over the face of all the earth.' (Genesis XI). It is impossible for us to unify all the languages in the world. However, it is our dream to process all of them, particularly in a single system. What we have been doing is to work towards the realization of that dream. With China's rapid economic development and greater attention from the government, we think that we can do so in the near future. A brand new National Library of China will come into being.
ABSTRACT
As the largest library in China, the National Library of China (NLC) has spent more than 20 years in automating its processes, and has not yet had an integrated library management system, lagging behind libraries of developed countries and also other major libraries in China. In the 1980s and 1990s the NLC made slow progress; reasons for lack of success in the past include the inadequacy of budgets, the extent of the collections, the pursuit of processing of materials in all languages in a single system, and some management problems. In 2001, the NLC signed a contract with Ex Libris to implement its Aleph 500 system and aimed at comprehensive solutions. Aleph 500 meets most of the library's requirements, supports Unicode and is flexible, extendible and cost-effective. The new system is expected to help the library to fully automate all of its processes, provide services better than other Chinese libraries and process all languages in a single system.

work_xcklbtlqqzbr3doojzjzzgrdb4 ----
A multifaceted approach to promote a university repository: the University of Kansas' experience.
Holly Mercer, Brian Rosenblum, Ada Emmett
Holly Mercer: University of Kansas Libraries, hmercer@ku.edu, Anschutz Library Room 320K, 1301 Hoch Auditoria Dr., Lawrence, Kansas 66045
Holly Mercer is the Coordinator of Digital Content Development at the University of Kansas. A member of the Digital Initiatives program since 2003, she is a consultant on metadata, digitization standards, project management, and scholarly communication issues for the KU campus, and administrator of the KU ScholarWorks repository.
Brian Rosenblum: University of Kansas Libraries, brianlee@ku.edu, Anschutz Library Room 320A, 1301 Hoch Auditoria Dr., Lawrence, Kansas 66045
Brian Rosenblum has served as Scholarly Digital Initiatives Librarian at the University of Kansas since 2005, where he helps promote KU ScholarWorks and develop other digital projects. Previously, from 2000-2005, Brian worked at the Scholarly Publishing Office at the University of Michigan Library. His professional interests include scholarly communication issues, electronic publishing, and library development in Central and Eastern Europe.
Ada Emmett: University of Kansas Libraries, aemmett@ku.edu, Anschutz Library Room 300, 1301 Hoch Auditoria Dr., Lawrence, Kansas 66045
Ada Emmett has worked at the University of Kansas since 2002. She serves as the subject specialist for chemistry and molecular biosciences and has been involved in projects to foster greater awareness and use of KU's institutional repository, KU ScholarWorks. She has been interested in the complex issues currently facing the system of dissemination and access of scholarship since graduate school.
Article Type: Case Study
Purpose of this paper
To describe the history of KU ScholarWorks, the University of Kansas' institutional repository, and the various strategies used to promote and populate it.
Design/methodology/approach
This paper describes how KU ScholarWorks came into being, and discusses the variety of activities employed to publicize the repository and encourage faculty to deposit their work. In addition, the paper discusses some of the concerns expressed by faculty members, and some of the obstacles encountered in getting them to use the repository. The paper concludes with some observations about KU's efforts, an assessment of the success of the program to date, and some suggestions for next steps the program may take.
Findings
KU ScholarWorks has relied on a "self-archiving" model, which requires regular communication with faculty and long-term community building. Repository content continues to grow at a steady pace, but uptake among faculty has been slow. In the absence of mandates requiring faculty to deposit work, organizations running institutional repositories must continue to aggressively pursue a variety of strategies to promote repositories to faculty and encourage them to deposit their scholarship.
Originality/value
KU's experience will help other institutions develop institutional repositories by providing examples of marketing strategies, and by promoting a greater understanding of faculty behavior and concerns with regard to institutional repositories.
INTRODUCTION
In a September 2005 article assessing institutional repository deployment in the United States, Clifford Lynch and Joan Lippincott conclude that "institutional repositories are now clearly and broadly being recognized as essential infrastructure for scholarship in the digital world" and that they are "being positioned decisively as general-purpose infrastructure within the context of changing scholarly practice, within e-research and cyberinfrastructure, and in visions of the university in the digital age" (2005). However, although repositories may be "recognized as essential infrastructure," it is not necessarily faculty-authors doing the recognizing, and persuading faculty to fill institutional repositories (IRs) through self-archiving remains a challenge.
The University of Kansas (KU) established its institutional repository, KU ScholarWorks, in spring 2003, early in the IR movement, with solid support from the Provost, who was instrumental in helping launch the repository as part of a broader scholarly communications program. Since its introduction, library staff have employed a variety of strategies and approaches, none of which are unique to KU, to market KU ScholarWorks to the KU community. However, despite active building and promotion for nearly three years, making the campus aware of its existence and purpose has not been easy, and uptake among faculty has been slow, though content has continued to grow steadily.
KU's experience is typical. A 2004 report on the state of institutional repositories asserts, "The biggest problem facing those setting up IRs is persuading faculty to use them. Outside a few disciplines (e.g. physics, computer science, and economics) there is little tradition of preprints or working papers and apparently still little interest in self-archiving. Academics may be radical in their thought but they are conservative in their behavior, and there is a good deal of inertia in the current publishing systems…. The data quoted in this report shows that take-up rates for IRs have to date been very patchy, especially where the deposit of materials depends on the decision by individuals to self-archive their material" (Ware, 2004). "Archivangelist" Stevan Harnad states that encouragement to deposit items "is not sufficient to raise the self-archiving rate appreciably above the 15% baseline for spontaneous self-archiving" (2006). He argues forcefully for institutions to require faculty to self-archive all research.
In the absence of those mandates (and perhaps as a necessary preliminary to them) institutions operating IRs will continue to employ a variety of small- and large-scale, labor-intensive methods to reach out to faculty, solicit their material, and further engage them in applying alternative methods to disseminate their research. This can be "a slow, incremental, somewhat piecemeal process" (Lynch and Lippincott, 2005), which has been compared elsewhere to throwing spaghetti at a wall and seeing what sticks (Salo, 2006). This kind of advocacy and grass-roots activism may be part of the preliminary groundwork needed to create an environment in which such mandates will be possible. Jones, Andrew, and MacColl (2006) place these advocacy efforts in a theoretical framework that relates Everett Rogers' diffusion of innovation concepts to issues of faculty adoption of IRs, the challenges of getting widespread use of an innovation, and the time and effort involved. They describe a social system of repository use where innovators introduce IRs and advocacy builds support for them, but wholesale adoption does not occur until use is mandated.
Success for institutional repositories is usually defined by the number of items held in relation to the number of faculty, and, though less often articulated, by how often the archived items are downloaded by others (use by authors and readers) (Shearer, 2003). Jones, Andrew, and MacColl compare an IR to a library and ask, "…who in their right mind would want to visit a library without books?" (2006). The more items deposited that are representative of the faculty output, the better. But this definition of success, if based solely on numbers, belies one purpose of IRs, which is to create opportunities for change in the system and its stakeholders, such as authors, publishers, and readers. In essence, universities are widely adopting institutional repositories as dissemination engines because the successful IR will create an opportunity for behavior changes in both authors and readers, two key stakeholders in the system.
Thus, gauging the success of KU's repository (and other repositories) is not simply a numbers game, especially not at this early stage when IRs are still largely in embryonic form. Although the number of items in KU ScholarWorks is modest, the repository has several very active communities and contributors, and has generated interest among faculty in a variety of departments on campus. Moreover, the early establishment of an institutional repository has given KU librarians a great deal of feedback and knowledge about the campus environment and faculty members' perceptions and needs with regard to scholarly communication. KU has gained valuable experience in the policy and technical requirements of setting up and maintaining a repository, and librarians have established relationships with academic units that will likely prove beneficial in the long term as more faculty are persuaded to use KU ScholarWorks.
This paper discusses the history of KU ScholarWorks to date, including the strategies used to populate it. Part One describes how KU ScholarWorks came into being, and discusses the variety of activities employed to publicize the repository and encourage faculty to deposit their work. Part Two explains how some KU ScholarWorks communities have evolved, and includes several observations and an assessment of KU's efforts. The paper concludes with thoughts about measuring the "success" of KU's repository, and suggests next steps for the program.
PART ONE / BIRTH AND GROWTH OF KU SCHOLARWORKS
[Figure 1: IR development at the University of Kansas]
THE KU ENVIRONMENT
The University of Kansas (KU) is a comprehensive educational and research institution with over 29,000 students and 2,200 faculty members. KU includes the main campus in Lawrence; the Medical Center in Kansas City, Kansas; the Edwards Campus in Overland Park; a clinical campus of the School of Medicine in Wichita; and educational and research facilities throughout the state. KU offers more than 170 fields of study and has a research budget of more than $274 million.
The KU ScholarWorks repository includes scholarship created primarily by faculty, staff, and students at the Lawrence and Edwards campuses. This repository service is offered and maintained by KU Digital Initiatives, a program of Information Services (IS). The Vice Provost of Information Services oversees the Libraries, Information Technology (IT), and Networking and Telecommunication Services divisions. Staff from both IT and the Libraries take part in providing technical and administrative support for KU ScholarWorks.
LAYING THE GROUNDWORK
Discussion of scholarly communication issues on campus preceded the launch of KU ScholarWorks as a pilot project. David Shulenburger, an early advocate for scholarly communication reform, was Provost and Chief Operating Officer at KU until June 2006. Shulenburger, an economist, proposed developing a national eprint archive, the National Electronic Article Repository (NEAR) (1998), and wrote and spoke on the topic extensively while KU Provost. He provided a campus forum for discussion of scholarly communication issues through the Provost's Seminar on Scholarly Communication, sponsored by the Office of the Provost and the University Libraries.
Following on the heels of national efforts to manage the rising costs of library subscriptions to scholarly journals, such as the enumeration of the Tempe Principles for Emerging Systems of Scholarly Communication (Association of Research Libraries, 2000) and the formation of the Scholarly Publishing and Academic Resources Coalition (SPARC), the first Provost's Seminar, "From Crisis to Reform: Scholarly Communication and the Tempe Principles," was held on November 8, 2000. The primary focus of the seminar was engaging faculty in discussing the Tempe Principles for Emerging Systems of Scholarly Communication. Speakers addressed KU's role in the scholarly communication movement, reactions to the Tempe Principles, and discipline-based solutions to the serials crisis. While this seminar did not focus on establishing an institutional repository at KU, it laid the groundwork for development of a KU repository by raising awareness of issues that a repository might help address.
LAUNCH OF A PILOT REPOSITORY
Efforts to establish an institutional repository at the University of Kansas began in earnest in 2002 when the Libraries hosted a forum for KU librarians to discuss scholarly communication issues and the open access movement. Provost Shulenburger focused on changing scholarly practices, and Information Services leadership focused on establishing a repository for preservation and dissemination. A 2003 white paper explains, "…scholarly works scattered across a variety of Web sites can be difficult for other researchers to locate. Opportunities for effective exchange may be lost in the chaotic sprawl of the World Wide Web….
Institutional repositories—digital collections that organize, preserve, and make accessible the intellectual output of a single institution—are emerging at leading universities as one response to this new environment" (Fyffe and Warner, 2003).
Information Services leadership developed a repository implementation plan that called for a series of working groups to address various aspects of establishing and maintaining an institutional repository. These working groups were organized in the spring of 2003 and each group was to complete its charges and submit a report by summer 2003. The system selection group recommended installing DSpace, then in beta test (version 1.0 was released by the time KU was ready to proceed with the installation), and KU began with a "proof-of-concept" test repository to build further administrative and faculty support. In all, 28 staff members from the Library and IT units on the Lawrence and Medical Center campuses participated in the working groups. KU ScholarWorks launched as a pilot repository in September 2003.
EARLY AND ONGOING FACULTY INVOLVEMENT
KU ScholarWorks was conceived as a service for faculty, and KU Libraries sought ongoing faculty involvement from the earliest stages of planning and development. One of the IR working groups, the early adopters group, identified faculty from across KU who might learn to use the system, submit some items, and provide feedback to refine the IR. Some early adopters were faculty who had previously expressed an interest in digital scholarship. Richard Fyffe, Associate Dean for Scholarly Communication, and Holly Mercer, Coordinator for Digital Content Development, met with each early adopter at least once to demonstrate system functionality, discuss policies and procedures, and assist in uploading documents. The early adopters group submitted items to the test repository, then met together in January 2004 for a focus group discussion on policy issues as well as system functionality. Feedback received from these focus groups influenced subsequent decisions in the planning and development process.
Early adopters believed that KU ScholarWorks communities should reflect epistemic communities rather than administrative campus units (such as schools and departments). Therefore, KU ScholarWorks supports three community types: formal communities, associated with academic departments and research units; informal communities, for individuals to contribute without a formalized community structure; and communities of practice, for interdisciplinary groups that lack a formalized administrative structure. Interestingly, while early adopters stressed the need for communities of practice, none have been requested yet.
While responses from the focus groups were generally positive, few of the "early adopters" in fact became users of the repository. However, one early adopter did establish KU ScholarWorks' first formal community, the Policy Research Institute community, and several others became members of a KU ScholarWorks advisory committee. While KU ScholarWorks' policies are ultimately the decision of Information Services leadership, this advisory group brings an important faculty and user perspective to the planning process. Staff working on repository development had hoped that members of the advisory committee would also act as "ambassadors" who would advocate the use of KU ScholarWorks to faculty peers, but to date the group has not yielded dramatic results in terms of advocacy or activism.
In fact, few members of the committee are associated with departments or research centers with KU ScholarWorks communities or have actually submitted items themselves. Future plans may call for an expanded or altered membership so that actual KU ScholarWorks participants will have a greater voice in developing and refining the service.
In addition, the KU Libraries held a separate focus group in conjunction with KU Continuing Education (KUCE) in February 2005 to learn more about principal investigators' needs for meeting grant dissemination requirements. The Libraries and KUCE invited recent federal grant recipients in various disciplines to participate. The participants stated that dissemination of research was often only considered as an afterthought, because by the time there were results to report they had already moved on to the next project. They indicated an interest in having boilerplate language to describe how KU ScholarWorks meets preservation and dissemination requirements for inclusion on grant applications. Consequently, KU Libraries added a section to the "About KU ScholarWorks" Web site titled "Support for Grant Applicants" which includes a link to text that grant applicants can copy and paste into their grant proposals (http://www2.ku.edu/~scholar/docs/grantsupport.shtml).
ROMEO GREEN (I)
KU ScholarWorks launched with the expectation that faculty would self-archive their work—that is, they would decide to upload their work themselves or submit via a departmental proxy. However, it was clear there would be a number of barriers to immediate faculty participation, ranging from complex copyright clearance issues, to confusion about appropriate content for the repository, to simply getting the attention of busy faculty and researchers who may not pay much attention to a new service whose benefits are not immediately clear to them. Library staff believed that departments would be more likely to join as communities if faculty could see high-quality content already in the repository, and therefore launched a project to populate KU ScholarWorks.
KU Libraries launched the RoMEO Green project in September 2004 to explore some of these issues. Phase one of RoMEO Green (named after the RoMEO/SHERPA project from which much of the initial publisher policy data was derived) focused on alternative, staff-mediated strategies to populate the repository. By combining KU faculty citation data with "green" publisher policy data (publishers that allow their authors to post versions of their articles on web sites or in repositories), staff determined which papers by KU authors might be deposited in KU ScholarWorks. Staff then contacted those authors and asked permission to deposit the articles on their behalf. This initiative was based in part on a similar initiative undertaken at the University of Glasgow (Mackie, 2004). The RoMEO Green project goals were to add content to KU ScholarWorks, explore services that might be offered to faculty to support their use of KU ScholarWorks, and create interest in an institutional repository at KU. Staff identified and requested 2,210 articles from faculty. Ninety-two articles, about 4% of the total requested, were deposited. The percentage is low, but this was the first time many faculty had heard of KU ScholarWorks.
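The heart of that staff-mediated, phase-one workflow was a simple match between local citation data and publisher policy data. The Python fragment below is an illustrative reconstruction written for this discussion; the data structures and field names are invented for the example and do not represent KU's actual tooling:

    # Sketch of the RoMEO Green phase-one matching step: keep only
    # articles whose journals' publishers are coded "green", i.e.
    # permit some form of author self-archiving.

    citations = [
        {"ku_author": "Smith, J.", "issn": "1234-5678", "title": "..."},
        {"ku_author": "Jones, K.", "issn": "8765-4321", "title": "..."},
    ]

    # Publisher colour codes derived from RoMEO/SHERPA policy data,
    # keyed here by journal ISSN (hypothetical values).
    policy_by_issn = {
        "1234-5678": "green",
        "8765-4321": "white",
    }

    candidates = [c for c in citations
                  if policy_by_issn.get(c["issn"]) == "green"]

    # The authors of each candidate article were then contacted for
    # permission to deposit the paper on their behalf.

In practice the policy data also distinguishes which version of an article (preprint or author's final draft) may be posted, which is why staff asked authors for particular versions of their papers.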
The 4% response rate is also consistent with the compliance rate in the initial eight-month period after the National Institutes of Health (NIH) implemented its Public Access Policy requesting and encouraging (but not requiring) that NIH-funded investigators submit their final, peer-reviewed manuscripts to the National Library of Medicine's PubMed Central Database upon acceptance for publication in a journal (Zerhouni, 2006).
At KU, in addition to the 92 articles added to the repository, the RoMEO Green project did provide several, perhaps less quantifiable, benefits. It provided a way for the Libraries to continue to reach out to faculty about scholarly communication issues; staff received feedback about faculty behavior and attitudes, and gained a better understanding of the complexity of working around publishers' self-archiving policies; and it helped KU Libraries form relationships with some faculty members who later deposited more material in the repository. This is important because, as will be discussed later, one of the ways communities in KU ScholarWorks become active submitters is through long-term relationship building with individual faculty members and departments. The library hopes that getting an early start in developing these relationships will pay off later. (For a full description of KU's RoMEO Green project, its methods and findings, see Mercer and Emmett, 2005.)
FACULTY RESOLUTION AND SECOND SCHOLARLY COMMUNICATION SEMINAR
In March 2005, the KU University Council, the governance body for faculty and professional and academic staff of the University, passed a broad "Resolution on Access to Scholarly Information." KU was the first member of the Association of American Universities (AAU) to pass a resolution calling on its faculty to self-archive (Suber, 2005). The resolution, a result of strong advocacy and involvement from Provost Shulenburger and Associate Dean Fyffe, addresses current issues in scholarly communication, and calls on faculty to take such actions as amending their copyright transfer statements to allow them to deposit their work in KU ScholarWorks, and to become familiar with the publishing and business practices of journals and support those that permit dissemination through university repositories and other open access models. The resolution also calls on the academy (University, professional and scholarly associations and administrators) to establish clear "guidelines for merit and salary review…and promotion and tenure…that will allow the assessment of and the attribution of appropriate credit for works published in such venues" as KU ScholarWorks (University of Kansas University Council, 2005). It calls on KU Libraries to provide resources to help faculty better understand the business practices of journal publishers and their impact on the scholarly communication system.
Passage of the resolution was timed to coincide both with the second Provost's Seminar on Scholarly Communication (http://www.lib.ku.edu/scholcommSeminar.shtml), held in early March 2005, and with the official launch of the KU ScholarWorks repository. The second Provost's Seminar focused specifically on the role of digital repositories in the scholarly communication system, and brought leaders in the scholarly communication movement to the KU campus. The Seminar also included a demonstration of KU ScholarWorks.
KU is not alone in choosing to announce its IR at a scholarly communication seminar; the University of New Mexico, for example, planned a similar event to announce its repository, also in March 2005 (Phillips et al., 2005).
While librarians had been talking informally about KU ScholarWorks, and giving formal presentations to academic departments, research centers, and governance bodies for some time, there was a noticeable spike in interest in KU ScholarWorks following the Provost's Seminar and passage of the University Council resolution. Some academic departments requested that library subject liaisons attend a departmental meeting to discuss KU ScholarWorks, and individual faculty contacted KU ScholarWorks administrators to inquire about the submission process and items accepted for deposit. The University Council resolution is a significant accomplishment and is an indication of the importance of this issue to KU leadership and their commitment to addressing it, but the lasting impact of the resolution on the KU ScholarWorks repository is still unclear. When the summer break approached, direct inquiries from faculty declined. Clearly, there is a need for a continued and sustained effort at keeping faculty aware of these issues, as they seem to respond when the opportunities are presented to them.
ONGOING OUTREACH AND EDUCATION
Since the events and publicity surrounding the official launch of KU ScholarWorks in March 2005, KU Libraries has continued to promote the repository on a smaller scale. Library staff have been communicating formally and informally with academic departments, making presentations at departmental meetings, working with individual faculty members to deposit their materials, taking advantage of personal connections, and generally looking for opportunities to discuss the repository program. The combination of education and outreach efforts has resulted in small but growing KU ScholarWorks communities.
Staff are also increasing outreach to and involvement of library subject liaisons. Subject liaisons have more regular contact with faculty members in their subject areas than KU Digital Initiatives staff do, and it is clear that their participation and support will be crucial for a successful repository program (Bell et al., 2005). The Libraries currently offer workshops on KU ScholarWorks to subject librarians so that they can become more familiar with the program and better able to discuss it with faculty. Recently, usage statistics have been sent out monthly to library liaisons with data on the most-downloaded items of the month. Liaisons can then send this information on to their faculty colleagues if they feel it is appropriate.
An "About KU ScholarWorks" Web site (http://www2.ku.edu/~scholar/) provides information about the repository service. The Web site includes a detailed FAQ, policy documents, text for grant applicants, and links to other pages about scholarly communication issues. A section on "Working with Publishers" is intended to help educate users about intellectual property issues and give them some guidance in retaining or obtaining rights for their work.
This section includes links to the Securing a Hybrid Environment for Research Preservation and Access (SHERPA) Web site so that faculty may determine the policies of particular journals in which they publish, letter templates they can use when seeking permission from publishers to post articles in the repository, and an "author's addendum" that authors can use to modify their copyright transfer agreement with their publisher. (This addendum is based on the addendum created by SPARC, and was reviewed and approved by KU General Counsel.)

ROMEO GREEN (II)
In early 2006, KU Libraries continued gathering faculty input by following up with a second phase of the RoMEO Green project. This phase focused on assessing faculty perceptions of KU ScholarWorks, and identifying what conditions would encourage KU faculty to adopt greater use of the repository. Faculty who had responded favorably to requests to participate in the first phase of RoMEO Green (by granting permission to have some of their published articles posted in KU ScholarWorks) were invited to attend focus groups. During the focus groups, they discussed their knowledge and impressions of KU ScholarWorks, the submission process, departmental and disciplinary concerns about the repository, and any barriers to depositing their work. The twelve faculty who participated offered enthusiastic support for KU ScholarWorks. Some, though not all, regularly submitted their work to the repository. Several broad issues emerged from the focus groups:
- Financial and administrative support. Faculty feel overburdened as it is and feel that they and their departments do not have the time or infrastructure to take on new responsibilities, to become familiar with copyright issues, or to learn the archiving policies of different publishers. They think that centralization of these activities would be more efficient.
- Policy and community issues. Staff detected some tension between the desire to set submission and content policies at the community level, and the need to understand and be assured of the consistency and quality of content in the repository across the entire institution. This suggests a possible need to illuminate more clearly the distinction between the access and preservation functions of the repository, and the peer-review functions of formal publication. Staff working with the IR need to better articulate to faculty that KU ScholarWorks is not intended to displace the traditional peer-review process.
- Technological barriers. There were several suggestions for repository software changes or technology add-ons that would increase efficiency or lower technology barriers to participation (for example, the ability to automatically create PDF files as part of the submission process).
- Marketing and education. There is a need for continued and more aggressive marketing about KU ScholarWorks and scholarly communication issues. Participants offered many suggestions for ways to publicize these issues. They also suggested that library staff make discussions of scholarly communication issues more concrete, rather than presenting abstract and formulaic explanations about the scholarly communication system. The Libraries would be more effective if they "told success stories." Faculty want to hear concrete examples of real benefits of participating in these programs, in terms they understand.
A report was made to the KU Libraries Dean's Council with recommendations for future actions based on IR user feedback.
The recommendations included providing greater support for teaching faculty through staff-mediated projects, developing and implementing detailed marketing and education campaigns, and providing technology support to simplify the submission process. The report was well received, and the Dean of Libraries presented the report to IS leadership. Information Services is currently involved in strategic planning, and it is expected that many ideas will be implemented in support of the planning process.

PART TWO / OBSERVATIONS AND ASSESSMENT
SUCCESSFUL KU SCHOLARWORKS COMMUNITIES
KU has adopted a somewhat labor-intensive approach to encourage submissions to KU ScholarWorks that relies on building relationships with individual faculty authors but, more importantly, with potential KU ScholarWorks communities. Informal communities include those communities established as part of the RoMEO Green project, as well as those that were created at the request of an individual faculty member or researcher, without departmental support. Thirty-one (72%) of the forty-three KU ScholarWorks communities are informal, and lack a designated community administrator or signed memorandum of agreement. Formal communities have an identified community administrator who acts as a point of contact, and is empowered to make decisions on behalf of the community. A memorandum of agreement outlines the formal relationship between a community and KU ScholarWorks (http://www2.ku.edu/~scholar/docs/memorandum.shtml). Although informal communities make up 72% of the total number of communities, they account for only 19% of total items deposited. Formal commitments with campus units seem to build stronger relationships and provide structure for ongoing community development, content recruitment, and faculty support. Gibbons noted that understanding the needs of faculty is necessary to build a repository program, and implementers must create "a tailored and personalized impression" to which faculty can relate (2004). Communities also have their own personalities, needs, and uses for a repository, and it is important to develop relationships with them to understand those needs. KU ScholarWorks communities have come into being and grown in a variety of ways. The following three examples of successful communities illustrate this process:
Author Advocacy. The personal communications established through the RoMEO Green project increased many participants' awareness of their rights as authors. When one faculty member was asked to supply the author final draft of his work, he initially declined, but did express an interest in understanding why he was not asked to supply the publisher's versions. He preferred to have the final published version available, rather than the author final draft. This professor had served as editor of a scholarly society journal, and he used those professional connections to gain permission for KU to post in KU ScholarWorks the publisher versions of all articles, present and future, authored by KU faculty in that society's journals. In this case, staff efforts did not result in one of the desired outcomes of the project (for faculty to deposit their own work), but it did lead to one author's better understanding of publisher policies and author rights, and 35 additional articles posted. Perhaps even more importantly, a faculty member became an agent of change.
As Rogers states in his work Diffusion of Innovations, a "change agent's position is often midway between the change agency" (in this case, the University) and the client system (the scholarly society). The faculty member was able to effect change because he was an effective "linker" between the interests of the University and its faculty, and the scholarly society as publisher (2003).
Department-Mediated Submissions. The School of Law and the Department of Public Administration have adopted a mediated process whereby an appointee from the academic unit submits all work on behalf of authors. While to date only two items have been submitted to the Public Administration community using this method, the School of Law has over 120 items in its community. Public Administration and Law have experienced different outcomes based on this model, and RoMEO Green faculty focus groups expressed doubts that all departments would have the resources to take on such a task. Still, a centralized approach to community development may prove an effective submission method for other campus units.
Graduate Student Project Submissions. A final example demonstrates how the first student content was deposited into KU ScholarWorks. The School of Engineering offers professionals employed in engineering firms the opportunity to pursue an advanced degree in Engineering Management at KU's Edwards Campus. The Engineering Management program does not have a thesis requirement, but instead requires students to submit a field project. The field projects were submitted in print to the program and retained in the program offices, and a second copy was placed on reserve in the library on the Edwards Campus. After the library director at the Edwards Campus attended a KU ScholarWorks information session, she determined KU ScholarWorks would be a more efficient method to disseminate and store the field projects. She approached the Engineering Management program director, and he supported adoption of a new procedure using KU ScholarWorks. Students continue to submit field projects to the Engineering Management program, and Edwards Campus library staff then deposit an electronic copy in KU ScholarWorks.

BY THE NUMBERS
In an earlier paper describing efforts to populate KU ScholarWorks, Mercer and Emmett stated, "KU ScholarWorks will fill its role as an institutional repository when its contents are representative of the vast research output from the many disciplines at KU" (2005). As of September 1, 2006, there are 759 items in forty-three KU ScholarWorks communities or, on average, 17.65 items per community. While the number of items available in KU ScholarWorks continues to increase, it hardly represents the depth or breadth of scholarship produced by KU faculty. In addition, the number of items available in KU ScholarWorks is far fewer than the median for Association of Research Libraries (ARL) members with repositories (University of Houston Libraries' Institutional Repository Task Force, 2006). This is despite the extensive promotion of the repository over the course of several years. Why are the numbers lower than expected at this stage, and what can staff learn from this? First, one must be careful not to read too much into these numbers. Lynch and Lippincott, in their survey of U.S. repositories, recognized that comparing repositories by size is problematic because "...no two institutions are counting the same things.
We received reports of the number of objects ranging from hundreds of thousands to, at the low end, a few dozen. The diversity in both the definition of what constitutes an "object" and in the nature of the objects being stored (massive videos or groups of datasets as opposed to individual articles or images) makes repository size very hard to interpret, or to relate to space measurements" (2005). In addition, a count of total items in a repository does not take into account factors such as whether items were archived by authors or by proxies. The Libraries have not been proactive in identifying for submission items such as working papers and technical reports that are already available on departmental Web sites. KU has taken an approach that relies on building relationships with individual faculty authors and potential communities, and encourages self-archiving. Most of the content in KU ScholarWorks has been self-archived by individuals or submitted through their community administrator, as opposed to a library staff-mediated model. Another metric for measuring the success and impact of a repository is usage, which can be measured by the number of searches performed and the number of items downloaded from the repository (Shearer, 2003). The DSpace usage logs at KU show that the repository is searched regularly and items are frequently accessed.

CONCLUSIONS
KU has employed a variety of methods to encourage its faculty to take more control of the intellectual rights of their future works using the IR as a dissemination tool. As outlined in this paper, staff's multifaceted approach has utilized the efforts of University and Library top administrators, IR staff, library subject specialists, early adopters, and advisory board members to populate the repository. KU ScholarWorks continues to grow at a slow but steady pace, with several successful and active communities. Still, KU Libraries are striving for higher participation, and can make some general observations and conclusions about its approach so far. Based on the experiences at KU and those reported by colleagues at other institutions, library staff know there is work yet to do to increase the rate of adoption of the IR. KU ScholarWorks has relied heavily on the "self-archiving" model for institutional repositories, where authors deposit their own works with little assistance from their academic units or the Libraries. This model assumes faculty have made, or are willing to make, the behavioral change required to deposit their published and unpublished scholarship. While ultimately this behavioral shift is a desired outcome, the reality may well be that faculty will be more willing to self-archive when there is more content available in the repository. Indeed, faculty stated as much during the RoMEO Green focus groups. More content in the IR can serve as indirect evidence that current practice is shifting. Until contributing to an IR is an integral part of the scholars' social system (and hence normal practice), they are not likely to use a repository (Jones et al., 2006). Institutional repositories are still in the early stages of development. Everett Rogers' innovation diffusion model defines five stages of progression: knowledge, persuasion, decision, implementation, and confirmation (2003). KU is firmly in the decision stage, with some enthusiastic early adopters and departments committed to using the repository.
The University of Kansas is an early adopter of an institutional repository, although individual faculty are at various stages along the adoption continuum. A handful of authors regularly submit their work to KU ScholarWorks, but they are not yet activists who encourage and persuade their peers to submit. The challenge will be to continue developing methods to encourage uptake so that KU ScholarWorks will move through the implementation phase and become part of the fabric of faculty practice at KU. While mandates may eventually be the best way to ensure comprehensive capture of the output of an institution, those running IRs must continue to pursue other means of applying social and administrative pressure to persuade faculty to deposit their works. Other institutions, such as the Massachusetts Institute of Technology, have found that identifying and working with an "insider advocate" is a more effective means of increasing deposits (Baudoin and Branschofsky, 2003). A respected member of the faculty might influence behavior more than administrative encouragement. Identifying more insider advocates or activists who will promote KU ScholarWorks is a logical next step for continued development of KU's institutional repository program. KU has experienced several changes in leadership in 2006. With a new Provost and several new deans (including a new Dean of Libraries), the Libraries have an opportunity to work with these new campus leaders to market the KU ScholarWorks service and spark changes in faculty behavior. Staff are hopeful that IS leadership will act on faculty recommendations outlined in the RoMEO Green report. The report calls for increased support for library-mediated submissions, and enhancements that will make faculty self-archiving easier, such as conversion to the PDF format as part of the submission process. Expanding KU ScholarWorks to include more graduate student work is a priority for Digital Initiatives. During focus groups, faculty expressed strong support for inclusion of theses and dissertations in KU ScholarWorks. Inclusion of electronic theses and dissertations (ETDs) will increase total submissions to the repository, but will also provide greater exposure for graduate student work. Staff will also expand the number of KU ScholarWorks contributors by offering to host papers and presentations given at conferences and symposia sponsored by KU. KU will continue a personalized approach to encouraging use of KU ScholarWorks. While staff will continue to work with individual faculty, more energy will be directed toward establishing formal communities, where the most significant growth in items has occurred. As the number of KU ScholarWorks communities continues to rise, staff will work even more closely with library subject specialists, so that they can effectively market the repository service. Staff will continue to sponsor periodic focus groups with KU ScholarWorks users, and others engaged in alternative methods for research dissemination. KU ScholarWorks community practices will be documented by "telling stories," so that faculty understand how KU ScholarWorks reflects their own disciplinary work practices. IR administrators and advocates have the responsibility and challenge to continue to make faculty aware of the repository and related scholarly communication issues. This can be done by promoting the repository and engaging in dialogue with faculty as much as possible.
Use of the repository by KU faculty is tied in part to larger trends in the academic world. As self-archiving becomes an increasingly accepted part of academic practice, KU faculty will wish to participate in that practice, and KU Libraries must position KU ScholarWorks to meet their needs as well as the needs of the institution as a whole.

Bibliography:
Association of Research Libraries (2000), "Principles for emerging systems of scholarly publishing", available at: http://www.arl.org/scomm/tempe.html (accessed September 4, 2006).
Baudoin, P. and Branschofsky, M. (2003), "Implementing an institutional repository: the DSpace experience at MIT", Science & Technology Libraries, Vol. 24 No. 1/2, pp. 31-45.
Bell, S., Foster, N. F. and Gibbons, S. (2005), "Reference librarians and the success of institutional repositories", Reference Services Review, Vol. 33 No. 3, pp. 283-90.
Fyffe, R. and Warner, B. F. (2003), "Scholarly communication in a digital world: the role of an institutional repository", available at: http://hdl.handle.net/1808/126 (accessed September 14, 2006).
Gibbons, S. (2004), "Establishing an institutional repository", Library Technology Reports, Vol. 40 No. 4, pp. 57-8.
Harnad, S. (2006), "Maximizing research impact through institutional and national open-access self-archiving mandates", Proceedings CRIS2006: Current Research Information Systems: Open Access Institutional Repositories, Bergen, Norway, available at: http://cogprints.org/4787/ (accessed September 10, 2006).
Jones, R., Andrew, T. and MacColl, J. (2006), "Advocacy", in The Institutional Repository, Chandos Publishing, Oxford.
Lynch, C. A. and Lippincott, J. K. (2005), "Institutional repository deployment in the United States as of early 2005", D-Lib Magazine, Vol. 11 No. 9, available at: http://www.dlib.org/dlib/september05/lynch/09lynch.html (accessed August 20, 2006).
Mackie, M. (2004), "Filling institutional repositories: practical strategies from the DAEDALUS Project", Ariadne, Vol. 39, available at: http://www.ariadne.ac.uk/issue39/mackie/ (accessed August 20, 2006).
Mercer, H. and Emmett, A. (2005), "RoMEO Green Project at the University of Kansas: an experiment to encourage interest and participation among faculty and jumpstart populating the KU ScholarWorks repository", Proceedings of the 68th Annual Meeting of the American Society for Information Science and Technology (ASIST), New Orleans, pp. 1433-41, available at: http://hdl.handle.net/1808/873 (accessed September 3, 2006).
Phillips, H., Carr, R. and Teal, J. (2005), "Leading roles for reference librarians in institutional repositories: one library's experience", Reference Services Review, Vol. 33 No. 3, pp. 301-11.
Rogers, E. M. (2003), Diffusion of Innovations, Free Press, New York.
Salo, D. (2006), "A messy metaphor", Caveat Lector, available at: http://cavlec.yarinareth.net/archives/2006/01/09/a-messy-metaphor/ (accessed September 12, 2006).
Shearer, M. K. (2003), "Institutional repositories: towards the identification of critical success factors", Canadian Journal of Information and Library Science-Revue Canadienne des Sciences de l'Information et de Bibliotheconomie, Vol. 27 No. 3, pp. 89-108.
Shulenburger, D. (1998), "Moving with dispatch to resolve the scholarly communication crisis: from here to NEAR", Proceedings of the 133rd ARL Membership Meeting, Association of Research Libraries, Washington, DC, available at: http://www.arl.org/arl/proceedings/133/shulenburger.html (accessed September 18, 2006).
Suber, P.
(2005), "More on the Kansas OA policy", available at: http://www.earlham.edu/~peters/fos/2005_04_03_fosblogarchive.html (accessed September 10, 2006).
University of Houston Libraries' Institutional Repository Task Force (2006), "SPEC Kit 292: Institutional repositories: Executive summary", available at: http://www.arl.org/spec/SPEC292web.pdf (accessed September 18, 2006).
University of Kansas University Council (2005), "Resolution on Access to Scholarly Information: Passed by the KU University Council 3/10/05", available at: http://www.provost.ku.edu/policy/scholarly_information/scholarly_resolution.htm (accessed September 11, 2006).
Ware, M. (2004), "Pathfinder research on web-based repositories: Final report", Bristol, UK, Publisher and Library/Learning Systems (PALS), available at: http://www.palsgroup.org.uk/palsweb/palsweb.nsf/79b0d164e01a6cb880256ae0004a0e34/8c43ce800a9c67cd80256e370051e88a/$FILE/PALS%20report%20on%20Institutional%20Repositories.pdf (accessed September 14, 2006).
Zerhouni, E. (2006), "Report on the NIH Public Access Policy", available at: http://publicaccess.nih.gov/Final_Report_20060201.pdf (accessed September 18, 2006).

work_xfoiu6gb25edjjqdqpwmtmyhbm ----

Linking and Sharing EAC Authority Records Using RAMP: Focusing on the Records of "Park, Kyung-ni"
Zi-young Park

Table of contents: 1. Introduction; 2. Theoretical Background (2.1 Separate Description of Archival Content and Context, 2.2 Integration Cases of EAC-Based Authority Records, 2.3 Prior Research); 3. Experimental Integration and Linking of Authority Records Using RAMP (3.1 Overview and Characteristics of RAMP, 3.2 External Authority Records for Linking, 3.3 Experimental Integration and Linking of Authority Records, 3.4 Implications and Future Tasks); 4. Conclusion

Abstract: Archival authority records support users in accessing and understanding archival information. The creator of the archives, on the other hand, is also the creator of other informative materials, including published products, and the users want to access information in a seamless manner.
Moreover, the authority record has common attributes with the authority records for bibliographic control as well as its distinctive characteristics. Therefore, this research aims to link legacy authority records for constructing and expanding archival authority records and provide the expanded archival records to the Web environment, including Wikipedia, for data sharing. Finally, some issues and suggestions for further research based on the findings that resulted from experimental linking and sharing are discussed.
Keywords: archival authority records, Encoded Archival Context (EAC), Remixing Archival Metadata Project (RAMP), WorldCat Identities, Virtual International Authority File (VIAF), Wikipedia
* This research was supported by a Hansung University research grant.
** Assistant Professor, School of Knowledge and Information, Hansung University (zgpark@hansung.ac.kr)
Received: April 19, 2014; first review: May 2, 2014; accepted: May 13, 2014. Journal of Records Management & Archives Society of Korea, 14(2), 61-82, 2014.

1. Introduction
An individual not only publishes works that are his or her intellectual and artistic creations, but also leaves records that can explain the journey toward completing those works and many aspects of the person's life. Park Kyung-ni left numerous works that will endure in Korean literary history, and her works were introduced abroad in translation as well as at home. Many institutions, including the Toji Culture Center of the Toji Cultural Foundation and the Park Kyung-ni Memorial Hall, manage not only her works but also personal effects and photographs from her lifetime, which will likewise become precious cultural heritage. In the same way, the architect Chung Guyon, known for designing the "Miracle Library" buildings, left not only the buildings that are his works but also records of a life spent forming and practicing his view of architecture; drawing on those records, the National Museum of Modern and Contemporary Art, Korea held the exhibition "Drawing Diary: The Chung Guyon Architecture Archive." Users, likewise, want to access the works of a writer or artist and, at the same time, learn about that person's life.
From the standpoint of organizing information, then, organizing tools should be linked so that users can access works and records in an integrated way, using both direct and indirect means of linkage. For linking the organizing tools of libraries and archives, the first method is to link the library's bibliographic records and the archives' records directly, mainly by mapping the descriptive elements of the two record types themselves. The second, indirect method links authority records to each other: elements common to the authority records of both fields, such as identification elements, are linked, while descriptive elements that do not overlap are added to the linked record as a union. Authority records for bibliographic control and archival authority records both guide users to the bibliographic or archival records linked to them and help users understand the information those records describe. The second method is possible, of course, only when the authority portion is separated from bibliographic or archival description and described independently. Even though authority records in different fields play similar roles and have similar structures, however, the purposes and structures of authority records built mainly in libraries differ from those for archival materials, so linking them requires choosing which field's authority records to take as the basis and understanding the structure and characteristics of each field's records.
This study therefore explored ways of linking authority records to support integrated user access in cases such as Park Kyung-ni, for whom both bibliographic and archival records can exist. Specifically, it analyzed a method of building an initial record on the basis of an archival authority record, importing name authority records built by libraries to extend the existing record, and exporting the extended archival authority record back to the web, for example to Wikipedia, to contribute to the sharing of authority records. The editing tool developed in the Remixing Archival Metadata Project (RAMP) was used to build and link authority records experimentally, because RAMP provides linkage among diverse records on the basis of Encoded Archival Context (EAC) without requiring a complicated installation environment. In the process, the study analyzed OCLC's WorldCat Identities and the Virtual International Authority File (VIAF) as the external authority records to be imported, and on the basis of the experimental linking it proposes points to consider when linking authority records across fields or exporting existing authority records to shared spaces.

2. Theoretical Background
2.1 Separate Description of Archival Content and Context
ISAD(G), the descriptive standard for archival materials, did not separate the description of the materials themselves from the description of their creation context; both were described in a single record. As the recognition spread that separating out an authority record describing the creation context would be effective, the ICA issued ISAAR(CPF) in 1996 to separate the description of records from that of their creators. Reviewing the existing literature, Roe identified the benefits obtainable through archival authority records as (1) guaranteeing consistency of terminology, (2) providing contextual information that supports the retrieval of archival information, and (3) expressing multidimensional and multilevel relationships among entities, such as the persons or bodies that created the records and the institutions that hold them (1993, pp. 119-121). Archival authority records, that is, support users in accessing records consistently and in obtaining the varied information needed to understand them. The creator of archives, however, is also the creator of other publications, and users want to use the information they seek in an integrated way, without distinguishing archives from publications. Archival authority records and the name authority records for publications, moreover, have mutually linkable attributes in common alongside their differences. Seol likewise presented, as one direction for the development of archival finding aids, linking the finding aids of cultural heritage institutions such as libraries, archives, and museums to provide integrated search (2010, pp. 26-28).
The representative encoding standard for archival authority records is Encoded Archival Context for Corporate Bodies, Persons, and Families (EAC-CPF).
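To make the shape of such a record concrete before turning to the standard's history, the sketch below assembles a minimal EAC-CPF-style document for Park Kyung-ni using Python's standard library. The element names (control, cpfDescription, identity, nameEntry) and the namespace URI follow the published EAC-CPF tag library as the author of this sketch understands it, but the record identifier is invented and the skeleton is illustrative only; it is not claimed to validate against the official schema or to match RAMP's internal output.

# Minimal sketch of an EAC-CPF-style authority record.
# Element names follow the EAC-CPF tag library (control / cpfDescription /
# identity / nameEntry / existDates); this is an illustrative skeleton,
# not a schema-validated instance.
import xml.etree.ElementTree as ET

NS = "urn:isbn:1-931666-33-4"  # EAC-CPF namespace (assumed here)
ET.register_namespace("", NS)

def el(parent, tag, text=None, **attrs):
    """Append a namespaced child element and return it."""
    node = ET.SubElement(parent, f"{{{NS}}}{tag}", attrs)
    if text:
        node.text = text
    return node

root = ET.Element(f"{{{NS}}}eac-cpf")

control = el(root, "control")
el(control, "recordId", "KR-demo-0001")  # local identifier (made up)
el(control, "maintenanceStatus", "new")

cpf = el(root, "cpfDescription")
identity = el(cpf, "identity")
el(identity, "entityType", "person")
name = el(identity, "nameEntry")
el(name, "part", "Park, Kyung-ni")

desc = el(cpf, "description")
el(desc, "existDates")  # 1926-2008 would go in fromDate/toDate children

print(ET.tostring(root, encoding="unicode"))

The point of the skeleton is the separation the text describes: identification, description, and relations each live in their own area, so external data imported later (name variants, related persons, related resources) has a well-defined place to land.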
The draft of EAC was released in 2004, the final version was completed in 2010, and it was adopted as a standard of the Society of American Archivists (SAA) in 2011 (EAC-CPF Web Site). Formally, EAC is a kind of XML schema, developed as an encoding standard for authority records so that contextual information could be extracted from archival records encoded in Encoded Archival Description (EAD) and managed as separate authority records. EAC is used not only for building authority records but also as a base standard for linking and integrating them. EAC could be linked with authority data from related fields because it shifted the unit of archival description from document-centered to data-centered: although EAC absorbed the structure of the document-centered standard ISAAR(CPF), it supports detailed expression of semantics and structure, which allows descriptive elements to be separated out as independent data (Pitti, 2005, p. 18). EAC's descriptive areas are divided into an identity area, a description area, and a relations area; the relations area, in particular, supports relationships with authority entities encoded in structures other than EAC (Thompson, Little, González, Darby, & Carruthers, 2013).

2.2 Integration Cases of EAC-Based Authority Records
2.2.1 The LEAF Project
The Linking and Exchanging Authority Files (LEAF) project ran from 2001 to 2004; fifteen institutions (libraries, archives, research centers, and others) in ten European countries cooperated to convert each institution's authority files into EAC format for integrated management and service (Kim, 2005). The main targets for linking were personal name authority records, and each institution's records were harvested to the LEAF server through FTP, the Open Archives Initiative (OAI) protocol, or Z39.50. All collected authority records were stored on the server in EAC format as LEAF Authority Records (LARs); during storage the system checked whether records described the same person, and records judged identical were stored in a Shared LEAF Authority Record (SLAR) (Lieder, 2006; Ottosson, 2005). On the LEAF prototype's search result screen, shown in Figure 1, the names of the institutions that supplied the record appear at the top, and the record view shows each EAC descriptive area.
(Figure 1. Example screen of the LEAF prototype (Lieder, 2006))

2.2.2 The SNAC Project
The Social Networks and Archival Context (SNAC) project, a research project begun in 2010, extracts authority information from archival information dispersed across institutions in order to build and link integrated authority records. Its main goals are (1) to discover and identify persons, corporate bodies, and families through as many names and name forms as possible; (2) to improve access to archival information through the variety of information related to a person; (3) to record relationships among persons systematically, providing access to social and professional networks; and (4) to support archives in exploring archival descriptive information in an efficient and reliable way.
SNAC uses EAC so that authority records improve the retrieval of, and access to, the archival records themselves while providing the contextual information related to them. The project was built from data contributed by many institutions, among them the U.S. National Archives and Records Administration, the Smithsonian Institution, the Library of Congress (LC), and the British Library; OCLC's WorldCat and VIAF and the Getty Vocabulary Program provided large bodies of standardized authority data. As summarized in Table 1, SNAC builds its authority data in three steps: step 1 extracts EAC records from the collected EAD files, step 2 removes duplicates among the extracted EAC records, and step 3 links the EAC records to the LC and Getty authority records (SNAC Project Web Site).

Table 1. SNAC's authority data construction procedure
- Step 1: Extract EAC-CPF creator records from two descriptive elements in EAD finding aids. Target data: (1) creator names and (2) the biographical/historical notes describing the creator. Additional target data: (1) controlled entries and (2) identifiable titles. (This step is immediately possible when the EAD finding aid is encoded correctly, but extraction can be difficult when creator names and related data are mixed together.)
- Step 2: Identify the extracted EAC-CPF records one by one and remove duplicates.
- Step 3: Match the deduplicated EAC-CPF set against authority records in LCNAF and ULAN. (Matching compares the authorized forms first, but non-preferred terms are also checked.)

2.3 Prior Research
Among domestic studies related to archival authority control, Seol (2002) conducted research on separating contextual information from the description of the records themselves and building separate authority records, analyzing ISAAR(CPF) and the Australian series system and proposing a plan for building archival authority records in Korea. Kim (2005) compared and analyzed the content standard ISAAR(CPF) and the structural standard EAC-CPF and, on that basis, proposed national-level cooperative tasks for building and exchanging authority files. Lee (2006) analyzed the authority control system built on ISAAR(CPF) at the archives of the Korea Democracy Foundation, a private-sector collecting repository, in which authority data were entered in EAC format and linked to archival records through authority record numbers. There are also a study by Park (2012) that structured the authority data of the National Archives of Korea as linked data, and a study by Lee (2013) that compared the archival authority standard ISAAR(CPF) with IFLA's Functional Requirements of Authority Data (FRAD) model for bibliographic authority records.
Among studies abroad, Pitti (2004) and Wisser (2011) analyzed the development process and structural characteristics of EAC-CPF, emphasizing that EAC is not only a standard for building authority records but can also provide a tool through which many institutions, internationally, can share them. The Journal of Archival Organization, a journal in archival information organization, devoted a 2005 special issue to archival description standards that included several papers on EAC; among them, Szary (2006) analyzed ISAAR(CPF) and EAC-CPF in connection with each other and introduced the strengths of the EAC-CPF standard. Vitali (2006) described in detail, through the SIASFI (Sistema Informatico of the State Archives of Florence) project that built the state archives' online finding aid, the process of providing authority records and other archival descriptive information online. Farokhzad and Nikfarjam (2011) suggested that EAC can be a descriptive tool complementing EAD, and Larson and Janakiraman (2011) introduced the SNAC project.

3. Experimental Integration and Linking of Authority Records Using RAMP
3.1 Overview and Characteristics of RAMP
The Remixing Archival Metadata Project (RAMP) was carried out to publish the archival authority data on persons and corporate bodies built by the University of Miami Libraries, and in the process a web-based editing tool, also called RAMP, was developed. RAMP version 1.3.2 was released in January 2014; it can extract biographical or historical information from EAD documents to generate authority records in EAC-CPF format, or build EAC documents directly. It can also import authority records from external sources to extend existing records, and it supports converting the extended records into wiki markup (Thompson et al., 2013) (see Figure 2).
(Figure 2. RAMP start screen. Source: http://demo.rampeditor.info/)
In January 2014, OCLC also held a seminar introducing the characteristics of RAMP and xEAC, tools that support sharing and linking authority records on the basis of EAC, the structural standard for archival authority records, and exploring the possibility of archivists and librarians sharing name authority records through them (OCLC, 2014). Both xEAC and RAMP not only allow archival authority records to be entered directly, but can also import external authority records built to standards other than EAC and integrate them with existing EAC records, and can export existing EAC-format records in other forms. RAMP, in particular, has a simpler installation environment than xEAC while still generating EAC-CPF authority records, including a function for extracting personal biographical information or corporate history from EAD-format documents; it also has the advantage of importing data from WorldCat Identities and VIAF and exporting data back out in Wikipedia format.

3.2 External Authority Records for Linking
3.2.1 WorldCat Identities
WorldCat Identities is an integrated authority service provided by OCLC; it offers rich information about each identity and is linked to the extensive bibliographic records in WorldCat (WorldCat Identities Web Site). A keyword search for "park, kyung ni" in WorldCat Identities returns, as an overview of the person, information on works, genres, creator roles, and classification numbers; a publication timeline in bar-graph form; works about the identity; works created by the identity; the main audience levels for those works; other authority entities linked to the identity; and subject headings associated with it. Table 2 summarizes the information WorldCat Identities provides for Park Kyung-ni.

Table 2. Types of information provided by WorldCat Identities (the case of Park Kyung-ni)
- Works: "182 works in 381 publications in 4 languages and 1,535 library holdings"; Park's works are held in 1,535 libraries, with 182 works in 381 publications.
- Genres: Fiction; History; Interviews; Criticism, interpretation, etc. (genres of works related to Park).
- Roles: Bibliographic antecedent, Honoree (roles in works related to Park, such as original author or honoree).
- Classifications: PL992.62.K9; 895.734 (classification numbers of related works).
- Publication timeline: a timeline of related publications, beginning in 1950.
- Works about her: the most widely held works about Park, such as "Pak Kyŏng-ni wa T'oji" and "T'oji sajŏn", among others.
- Works by her: the most widely held works by Park, such as "Kim yakkuk ŭi ttaldŭl" and "Land : a novel = T'oji", among others.
- Audience level: the main user groups for the related works.
- Related identities: other entities related to Park, such as Kang, Choonwon and Tennant, Agnita 1934- (translator).
- Related links: external links for the record, such as the Library of Congress Authority File, the Virtual International Authority File, and Wikipedia.
- Related subjects: subject terms linked to Park, presented as a tag cloud, including the writer Cho Myong-hui.

3.2.2 Virtual International Authority File
VIAF, short for Virtual International Authority File, clusters the authority records built by representative national libraries and provides a service for sharing them internationally (VIAF Web Site). VIAF is linked to ISNI, the international standard name identifier, and provides permalinks that uniquely identify authority records on the web. Table 3 summarizes the information provided in the VIAF authority record for "Park, Kyung-ni."

Table 3. Types of information provided by VIAF (the case of Park Kyung-ni)
- Name information: "Pak, Kyǒng-ni, 1926-2008", "Park, Kyung-ni 1926-2008", and other preferred names, grouped by country.
- Identifiers: VIAF ID: 40224767 (Personal); Permalink: http://viaf.org/viaf/40224767; ISNI-test: 0000 0000 8119 6148.
- Preferred headings: the fields carrying each national authority file's preferred form of the name (field 100 or 200), shown together, e.g. "200 _ | ‡a Park ‡b Kyung-ni ‡f 1926-2008" and "100 1 _ ‡a Pak, Kyŏng-ni, ‡d 1926-2008".
- Uniform titles: works such as "Kim yakkuk ŭi ttaldŭl" and "T'ochi", each with expression-level information indicating the language of the expression.
- Variant headings: the fields carrying variant name forms (field 400), shown together, e.g. "400 1 _ ‡a Bak, Gyeong-ri ‡d 1926-2008" and "400 1 _ ‡a Kyongni, Pak ‡d 1926-2008".
- Works: selected works displayed together, e.g. "T'oji : Pak Kyŏng-ni taeha sosŏl" (23) and "土地 : 박경리 대하소설" (21).
- Co-contributors: contributors to related works, such as Kang, Choonwon (3), Han Kye-jin, and the Toji Cultural Foundation (Korea).
- Publication countries: countries that published related works, marked on a map; Korea is shaded most deeply.
- Publication statistics: yearly publication counts for related works, aggregated from ISBNs.
- Publishers: publishers of related works, such as Nanam and Jisik Sanup Sa.
- Related information: personal information such as gender and nationality, and external links such as WorldCat Identities.
- Record export: formats for exporting the record, including a MARC-21 record, the VIAF cluster in XML, an RDF record, and "Just Links" in JSON.
- Record history: the history of additions and changes to the record, with the contributing institution and timestamp.

3.3 Experimental Integration and Linking of Authority Records
3.3.1 Building the Initial Authority Record
When building an authority record experimentally, no initial authority record exists, so before importing external authority data a basic archival authority record must be created through a menu such as the one in Figure 3. Because this study focused on linking between authority records, the initial record was composed of minimal data. Selecting the "create new record" menu on the RAMP start screen opens the record creation screen; choosing "Person" as the entity type displays two options under the "Name and dates" item, and choosing "Single Field" allows the remaining name information to be entered. Besides the personal name, the authority record entry menu provides input elements that conform to the EAC format. Entering each field and saving the data produces an XML-based EAC record. Because it is an XML document, it must pass validation; a document that is not valid can be corrected to conform to the standard. Once validation succeeds, the result screen shown in Figure 4 appears, and one can move to the next step, the menu for linking with WorldCat Identities.
(Figure 3. Part of RAMP's initial authority data creation screen)
(Figure 4. A validated XML-based authority record in EAC format)

3.3.2 Linking WorldCat Identities Records
To link a record provided by WorldCat Identities to the archival authority record, a suitable record must first be retrieved through RAMP's "WorldCat Name Search" screen. The most suitable value is selected from the search results, and WorldCat Identities may present multiple candidate authority records (see Figure 5). WorldCat Identities must therefore maintain accurate authority records for the entities under authority control and supply them to RAMP, since the quality of the information WorldCat Identities provides affects the quality of the linked authority record. Clicking a link in the result list allows one to confirm whether the selected value is a suitable target for linking; the WorldCat Identities result screen presented within RAMP provided more concise information than the WorldCat Identities screen accessed directly at OCLC.
(Figure 5. WorldCat Identities search screen for linking authority records)
On the basis of the supplementary information provided in the WorldCat Identities record, the WorldCat authority record best suited for linking with the existing RAMP record was selected. A screen then asks whether the subject heading "Korea," contained in the WorldCat Identity authority record, should be linked to the RAMP record; if the subject information is imported, it can later be used as category information when the archival authority record is exported to Wikipedia. The imported authority information can be confirmed through the added element tags (see Figure 6).
(Figure 6. Result screen after importing from WorldCat Identities)
Table 4 shows part of what was added: the imported FAST subject heading "Korea" was added as a subject, "Kang, Choonwon" was added as a related person, and "The curse of Kim's daughters : a novel" was added as a related resource together with its ISBN.

Table 4. Examples of elements added through WorldCat Identities
- Korea (subject, truncated in the display)
- Kang, Choonwon (related person)
- The curse of Kim's daughters : a novel; ISBN 1931907102 (related resource)

3.3.3 Linking Virtual International Authority File Records
The process of linking data from the Virtual International Authority File (VIAF) to a RAMP authority record resembles the process of linking with WorldCat Identities: when a query is entered in the "VIAF name search" screen, RAMP presents suitable results from among the linked VIAF records. The VIAF search results were more accurate and simpler than those of WorldCat Identities, and their suitability can be confirmed through the links provided in the result list (see Figure 7).
(Figure 7. RAMP's search box for VIAF linking (front) and the search results (back))
After confirming that the initial archival authority record created in RAMP and the authority record linked from VIAF describe the same person, choosing to import the record opens a popup window listing the newly added tags, as in the WorldCat Identities case. For VIAF, source and name-entry tags were added; the newly imported source tags mean that VIAF was added as an information source, so the data actually added are the name entries. Part of the imported content is shown in Table 5: the various forms of the name "Park Kyung-ni" supplied by national libraries were added.

Table 5. Part of the name entries imported from VIAF
- Park, Kyung-ni 1926-2008 (BNF)
- Park Kyeong-ri 1926-2008 (BNF)
- Park Gyoung-li 1926-2008 (BNF)
- Pak, Kyong-Ni (ISNI)

3.3.4 Exporting the Archival Authority Record to Wikipedia
Wikipedia is an open encyclopedia built through collaboration over the web (Wikipedia Web Site). Unlike WorldCat Identities or VIAF, Wikipedia is semi-structured data, but it provides document-creation templates according to the content of an article. Before exporting the record, RAMP goes through a step of saving the XML as extended up to this point. The record code converted to the Wikipedia template is shown in Figure 8, and the document presented in the Wikipedia interface is shown in Figure 9. RAMP thus converts the authority record to fit the Wikipedia page template corresponding to a "person." This is not meant to import Wikipedia data into RAMP, but to export the authority record and contribute it to Wikipedia; the converted record can be revised before it is uploaded to Wikipedia.
(Figure 8. Conversion screen for the Wikipedia upload record)
(Figure 9. Example screen of the Park Kyung-ni authority record uploaded to Wikipedia)

3.4 Implications and Future Tasks
3.4.1 Analyzing the Characteristics of External Sources for Linking
Extending the scope of the information carried in archival authority records by linking external authority records is desirable not only for the archival field but for the information ecosystem as a whole. From the experimental linking, several points were drawn for the effective use of external sources.
First, the record quality and service methods of the external source must be understood. Because external records are not under direct control from construction through service, it is desirable to link records built by national libraries or comparably trustworthy institutions; the linking currently under way mostly takes as its targets authority records built by national libraries, library consortia, or equivalent cultural heritage institutions. If data imported from an external institution duplicates or conflicts with the institution's own data, the system should be designed so that the local authority record takes precedence.
Second, the benefits obtainable from importing an external source must be identified. When importing authority information from multiple sources, information of little value once added to the initial record may come in, and even sound authority information loses its case for import when it heavily duplicates the initial record. One must therefore analyze the particular strengths of each external source and identify which descriptive areas or elements become richer when it is imported. In the case of WorldCat Identities, for example, linkage with the rich bibliographic records contained in WorldCat is an advantage of importing: the tags newly added from WorldCat Identities were mainly subject headings, the names of other persons appearing in works related to Park Kyung-ni, and information on works related to her. Using WorldCat Identities would be effective for linking the library's bibliographic records with the archives' records, and because subject headings are imported together with the authority information, linking to other persons in the subject fields related to Park Kyung-ni, and to their works, also becomes possible. VIAF, being a collection of national authority records, allowed importing information on the various national forms of the name designating Park Kyung-ni; and because linkage to international name identifiers such as ISNI is assigned, not only the several forms of her name but also the authority numbers for identifying her internationally could be confirmed. VIAF was thus especially useful for linking the multiple forms of a personal name and authority numbers.
The portion corresponding to Park Kyung-ni's biographical information, however, was not effectively supplied by either WorldCat Identities or VIAF. This follows from the particular character of library authority records; biographical information was provided far more richly in Wikipedia. Since Wikipedia exports its information through DBpedia, importing DBpedia information into archival authority records could also be considered where necessary.

3.4.2 Setting the Significance and Method of Publishing Authority Records
If the scope or particular descriptive elements of an archival authority record have been enriched through the import of library authority records, exporting the archival authority record back out through Wikipedia can contribute to raising the reliability of authority information on the web. It can also provide a means for web users to reach the archival materials themselves that are related to the archival authority record. That is, publishing an archival authority record linked with national library authority records to the web, for instance through Wikipedia, can contribute in three respects.
First, sharing archival authority records can supply trustworthy information to web users. Although archival authority records carry richer explanatory information, such as personal biographies, than library authority records, Wikipedia provided far more information in most areas, including biographical information and name variants; the contribution of archival authority records is therefore greater on the qualitative side of authority information than on the quantitative side. Wikipedia, too, tries to maintain the reliability of its data through self-correction, but the data provided by national libraries or archives can offer reliability of a different kind.
Second, it can redress the imbalance of authority information in web space. The web may hold abundant authority information on individuals of high prominence, but there are cases where almost no authority information exists for individuals of low prominence or whose related materials see little use. Authority records built from the standpoint of comprehensive archival description, rather than according to commercial interest, can therefore relieve the imbalance of authority records built through collaboration on the web.
Third, it can supply the web with archival authority information in structured form. Wikipedia also assigns codes when documents are created and provides entry templates by document type, and many of its documents show a high degree of structure; still, compared with archival authority records created according to a standard, their degree of structure is low. Uploading archival authority records to Wikipedia can therefore have the significance of adding structured authority information to it.
Because Wikipedia already holds a considerable amount of information, including many documents related to Korea, one must consider, when posting an archival authority record, whether it would duplicate or conflict with existing information. If an existing Wikipedia article related to Park Kyung-ni exists, for example, replacing the existing article with the archival authority record would be undesirable, and with the many wiki editors it would not even be possible; when an existing article exists, suitable parts of the archival authority record could instead be selected and added to it. Beyond posting records to Wikipedia, one can also consider publishing entire archival authority records on the web in a shareable form. This means publishing and sharing data in the manner called "linked data," through which individuals and groups wishing to share archival authority records can use the authority information freely within the bounds of the basic rules.

3.4.3 The Need to Develop Authority Record Construction and Linking Tools Suited to the Domestic Environment
This study used RAMP, an editing and linking tool for archival authority records, to build an initial authority record, link it with the authority records of WorldCat Identities and VIAF, and export it to Wikipedia experimentally. Because both the editing tool and the linked authority data are foreign sources in this case, domestic users need an editing tool and external target authority records suited to the Korean environment; the related issues can be summarized in three points.
First, EAD records or existing EAC files from which initial authority records can be extracted must be built. The National Archives of Korea has built and published a database of the administrative histories of records-creating agencies, which can serve as initial records for linking in the public sector. The archives management system of the Korea Democracy Foundation also built authority records in EAC format to control creation contexts (Lee, 2006); authority records from public-interest institutions of this kind should be put to use (see Figure 10).
(Figure 10. Example screen of authority records in the archives management system of the Korea Democracy Foundation. Source: Lee, 2006, p. 125)
Second, domestic external authority sources that can be linked with archival authority records must be secured. One linkable domestic source is the authority records of the National Library of Korea. The National Library of Korea's linked data service opened publicly in early 2014; in Figure 11, the access-point heading "Park Kyung-ni" can be confirmed, links to related linked data are provided, and the record can also be viewed in MARC format. Figure 12 shows part of the linked data for Park Kyung-ni: link information for her birth year and related works; her death year; the controlled access point for linking with other web resources (the foaf:name "박경리"); the VIAF link; the related-work information corresponding to the record's source note; and her type (author).
(Figure 11. National Library of Korea linked data for "Park Kyung-ni" (top) and the corresponding MARC display (bottom). Source: http://lod.nl.go.kr/)
(Figure 12. Part of the National Library of Korea linked data for "Park Kyung-ni". Source: http://lod.nl.go.kr/page/KAC201110881)
Finally, although this study used RAMP as the tool for building and linking authority records, a tool for organizing and linking authority records that fits the domestic information environment should be developed in the future. Organizing information requires a tool suited to the purpose, and archival authority records are no exception. For building archival authority records, the Open Source Archive Software Forum (OSASF, formerly the AtoM forum) is currently preparing a Korean version of AtoM, distributed by the SAA, and making various attempts at applying it domestically (OSASF Web Site). As such efforts accumulate field by field, the integrated construction and linking of data across fields should become still more active.

4. Conclusion
This study attempted to link and extend authority records for managing and accessing, in an integrated way, the information of an individual who produced both publications and archives. With the authority portion now separated out of archival records, the possibility of integrating or linking authority records built by multiple institutions has grown; and because the individuals, corporate bodies, and families that are the objects of authority records are agents of diverse activities, related authority information is built and used in many fields beyond archival science. Accordingly, an archival authority record was built and linked experimentally through RAMP, an editing tool that can create a basic authority record or generate EAC-format records from existing EAD files, import external authority records such as WorldCat Identities and VIAF, and export the finally constructed XML-format authority record to fit Wikipedia's document template for persons.
On the basis of the experimental integration and linking, the following implications and future tasks were drawn. First, one must grasp the characteristics of the external sources that can be linked with archival authority records and be able to analyze what advantages importing an external source brings to the existing archival authority record, while also considering the reliability of the source. Second, the significance and scope of publishing the finally constructed integrated record to Wikipedia and the wider web environment need to be established. Authority records are materials with a higher degree of structure and reliability than much of the information on the web, including Wikipedia, yet in quantitative terms they are often smaller than existing Wikipedia articles; when exporting an existing authority record, it should therefore be inserted in a manner suited to the existing Wikipedia article, and providing authority records in open forms such as linked data, so that web sources other than Wikipedia can use them, can also be considered. Finally, only when the initial authority records suited to the domestic environment become richer can linking with external sources yield more useful authority records; and an editing tool should be developed domestically that can convert authority records from various fields into a base format such as EAC for import into archival authority records and can export the extended records back out.

References
Farokhzad, B.F., Nikfarjam, H., & Arbabi, V. (2011). Challenges and opportunities involved in implementing EAC. World Academy of Science, Engineering and Technology, 78, 157-159.
Kim, Sung-Hee (2005). Establishing and exchanging contextual information based on the authority control of creators of archives. Journal of the Korean Biblia Society for Library and Information Science, 16(2), 61-88.
Larson, R.R., & Janakiraman, K. (2011). Connecting archival collections: The social networks and archival context project. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6966 LNCS, 3-14.
Lee, Hyewon (2013). The comparison and analysis of interoperability between archival authority and library authority. Journal of Records Management & Archives Society of Korea, 13(2), 227-246.
Lee, Hyun-Jeong (2006). Analysis of authority control system in collecting repository from the case of archival management system in Korea Democracy Foundation. Korean Journal of Archival Studies, 13, 91-134.
Lieder, Hans-Jörg (2006). Linking and exploring authority files. TEL-ME-MOR/M-CAST Seminar, Prague, November 23, 2006.
OCLC (2014). Beyond EAD: Tools for creating and editing EAC-CPF records and "remixing" archival metadata. Retrieved May 2, 2014 from http://oclc.org/research/events/2014/01-09.html.
Ottosson, Per-Gunnar (2005). EAC and the development of national and European gateways to archives. Journal of Archival Organization, 3(2/3), 261-274.
Park, Ok Nam (2012). The design and development of linked data from authority data in National Archives of Korea. Journal of the Korean Biblia Society for Library and Information Science, 23(2), 5-25.
Pitti, Daniel V. (2004). Creator description: Encoded Archival Context. Cataloging & Classification Quarterly, 38(3/4), 201-226.
Rieh, Hae-young, Lee, Mi Young, Lee, Eun Young, Lee, Hyuk Joon, Lee, Hyun Jeoung, Choi, Young Sil, & Park, Mi Ja (2008).
Study on the development of guidelines for thesaurus construction at university archives: Case study of Myongji University Archives Center. Journal of Records Management & Archives Society of Korea, 8(1), 189-210.
Roe, Kathleen (1993). Enhanced authority control: Is it time? Archivaria, 35, 119-129.
Seol, Moon-Won (2002). A study of archival authority records for corporate bodies. Journal of Records Management & Archives Society of Korea, 2(2), 39-68.
Seol, Moon-Won (2010). A study on development and prospects of archival finding aids. Korean Journal of Archival Studies, 23, 3-43.
Szary, Richard V. (2006). Encoded Archival Context (EAC) and archival description: Rationale and background. Journal of Archival Organization, 3(2/3), 217-227.
Thompson, Timothy A., Little, James, González, David, Darby, Andrew, & Carruthers, Matt (2013). From finding aids to wiki pages: Remixing archival metadata with RAMP. code{4}lib journal, 22. http://journal.code4lib.org/articles/8962
Vitali, Stefano (2006). What are the boundaries of archival context? The SIASFI project and the online guide to the Florence State Archives, Italy. Journal of Archival Organization, 3(2/3), 243-260.
Wisser, Katherine M. (2011). Describing entities and identities: The development and structure of Encoded Archival Context - Corporate Bodies, Persons, and Families. Journal of Library Metadata, 11(3/4), 166-175.

[Websites]
Encoded Archival Context for Corporate Bodies, Persons, and Families (EAC-CPF). Retrieved May 2, 2014 from http://eac.staatsbibliothek-berlin.de/.
Open Source Archive Software Forum (OSASF). Retrieved May 2, 2014 from http://osasf.net/.
Social Networks and Archival Context (SNAC) Project. Retrieved May 2, 2014 from http://socialarchive.iath.virginia.edu/.
Virtual International Authority File (VIAF). Retrieved May 2, 2014 from http://viaf.org/.
Wikipedia. Retrieved May 2, 2014 from http://en.wikipedia.org/.
WorldCat Identities. Retrieved May 2, 2014 from http://www.worldcat.org/identities/.

work_xi3gjm7ikngyhibfqyv4hutdom ---- Technical Services Quarterly. 1983, vol.1, n.1, p.117-120. ISSN: 1555-3337 (electronic) 0731-7131 (paper) DOI: 10.1300/J124v01n01_19 http://www.informaworld.com/smpp/title~content=t792306978~db=all http://www.informaworld.com/smpp/title~db=all~content=g904836689 http://www.informaworld.com/smpp/content~db=all~content=a904836666~frm=titlelink © 1983 The Haworth Press, Inc.

Libraries on the Line
Carol R. Krumm, BA, BS and Beverly I. McDonald, BA, MA

Abstract
The utilization of online systems has necessitated many changes in technical services, including the following: 1. Increased cooperation among libraries is obvious; 2. Card catalogs have been closed or frozen; 3. The focus of cataloging has changed from manual to online access with the publication of AACR2; 4. There is a growing realization of the importance of authority control; 5. The automation of serials is in the beginning stages; and 6. Libraries are exploring implications for user services.

Since the early 1900s librarians have discussed ways of streamlining technical services. However, implementing proposed changes was slow. For example, the first edition of A.L.A. Catalog Rules appeared in 1908. Although a committee for the second edition was appointed in 1930, the second edition of A.L.A. Catalog Rules and Rules for Descriptive Cataloging were not issued until 1949, nineteen years later. From the advent of automation in the late 1960s to the present, more changes have occurred in technical services than in all the years preceding 1967. Much credit is due the academic librarians of Ohio who conceived the idea of a cooperative automated cataloging system, now the Online Computer Library Center (OCLC).

Cooperation
Cooperation among libraries is evident today in the widespread use of bibliographic utilities such as OCLC, the Research Libraries Information Network (RLIN), and the Washington Library Network (WLN) for acquisitions, cataloging, authority work, serial processing, and interlibrary loan. Greater attention to certain areas will make cooperation even more effective. One goal must be the development of better national and international standards of rules and formats. Also imperative is the interface between the Library of Congress (LC) and bibliographic utilities, as well as the interface among the various bibliographic utilities. Additionally, the practice of assigning libraries specific responsibilities in cataloging, name authority work, serials, and interlibrary loan must be expanded.

Selected Aspects of Technical Services
Catalogs
Many libraries are now freezing or closing card catalogs and developing or using catalogs in online or microform format.
Some catalogs are new while others have evolved from existing http://www.informaworld.com/smpp/title~content=t792306978~db=all http://www.informaworld.com/smpp/title~db=all~content=g904836689 http://www.informaworld.com/smpp/content~db=all~content=a904836666~frm=titlelink systems. For example, at the Ohio State University Libraries (OSUL) the automated circulation system has been expanded so that it now includes an online public catalog as well as online access to detailed serial holdings. Before the online catalog can completely replace the card catalog, some problems need to be resolved. In the first place, the online catalog must provide at least as much information and as many access points as the card catalog. Cross references are necessary. Moreover, large academic libraries must decide whether or not such non-Roman scripts as Arabic, Hebrew, and Chinese should be transliterated. Essential, too, are improved search capabilities for national and local online systems. Terminals must be less costly and more efficient to operate and have less down time. Cataloging Codes With the publication of AACR2, the focus changed from manual to online access. It has been difficult to keep up with the numerous AACR2 rule interpretations and changes. The inevitability of AACR3 raises the following questions: When will AACR3 be published and implemented? Will there be major changes? Can these changes be carried out without serious upheaval to the system? Will practicing catalogers at the local level have input into future catalog codes? Authority Files There is a growing realization that authority files are of primary importance to an online catalog. While some libraries have changed from paper to online or microform authority files, others are establishing authority files for the first time. Links between online catalogs and online authority files enable changes in an online catalog to be made much more quickly and accurately than in a manual catalog. However, decisions concerning authorized forms of headings and links must be made by library personnel. Especially helpful for technical services processing is online access to the LC name authority index which is available through the bibliographic utilities. Perhaps in the near future there will be access to LC subject, series, and uniform title authority files as well. Serials Some libraries are beginning to automate serial processes, such as online bibliographic records, online check-in, and online union lists. An example of online serial activity is the Ohio State University Libraries serial holdings file, which displays detailed holdings statements and permits access by volume and year. These serial holdings are kept up-to-date by online maintenance. Future online capabilities for individual records should include serial publication patterns, date of next expected issue, check-in, and claiming. Among other areas of interest are online periodical indexes, electronic journals, batch and online searches for journal articles, document delivery, and copyright law problems. Implications for User Services Recent cooperative activities have pervaded all areas of the library world from international standardization to bibliographic utilities to individual libraries. Technical services support is essential for public service to patrons. If technical services do not use innovative methods to eliminate backlogs, patrons will not be able to locate needed information. 
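The efficiency the authors attribute to linked authority files is easy to see in miniature. The sketch below is a toy model only (hypothetical data structures, not the design of any actual online catalog or of the LC authority file): bibliographic records store an authority identifier rather than a heading string, so one correction to the authority record is reflected in every linked record at display time.

# Toy model of authority-linked cataloging: bib records point at an
# authority ID, so one change to the authority record "updates" every
# linked heading without touching the bib records themselves.
authorities = {
    "n0001": {"heading": "Twain, Mark, 1835-1910",
              "see_from": ["Clemens, Samuel Langhorne, 1835-1910"]},
}

bib_records = [
    {"title": "Adventures of Huckleberry Finn", "author_id": "n0001"},
    {"title": "The Innocents Abroad",           "author_id": "n0001"},
]

def display(record):
    """Resolve the author heading through the authority file."""
    return f'{authorities[record["author_id"]]["heading"]} -- {record["title"]}'

# One correction in the authority file...
authorities["n0001"]["heading"] = "Twain, Mark, 1835-1910."

# ...is reflected in every linked bibliographic record at once.
for rec in bib_records:
    print(display(rec))

In a card catalog the same correction would mean refiling every affected card, which is the manual cost the passage above contrasts with online maintenance.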
As patrons become excited about the use of the online catalog, they may prefer to use it to search for information about monographs and serials. Two factors which make the online catalog a viable tool are AACR2 and online authority files. Helpful to patrons are the use of natural language and user-oriented choice and form of entry as indicated by AACR2. Good authority control, including cross references, also assists the reader. User-friendly terminals will permit patrons to find needed information and materials with minimal or no assistance. Recent and projected changes in technical services have enabled library staff and patrons to find what they need, when they need it, in the least amount of time.
Carol R. Krumm is Cataloger and Assistant Professor of Library Administration and Beverly I. McDonald is Cataloger and Instructor of Library Administration, The Ohio State University Libraries, 1858 Neil Avenue Mall, Columbus, OH 43210.

work_xk3fwr2i3bcibgxtjjqo5sobca ---- Amalia Monroe and Lea Currie (2011): Using the WorldCat Collection Analysis Tool: Experiences From the University of Kansas Libraries, Collection Management, 36:4, 203-216. Publisher's official version: http://dx.doi.org/10.1080/01462679.2011.604907. Open Access version: http://kuscholarworks.ku.edu/dspace/.
The WCA was perceived as another method of collecting data to make collection development decisions. An implementation committee was appointed by the deans and led by the authors, the Head of Collection Development and a Social Sciences Librarian who had experience with the WCA at another institution. The implementation committee set institutional goals and priorities for the project, prepared informational documents, and conducted training sessions for subject librarians. Librarians submitted reports for each of their collections. Although the project coordinators dealt with the many frustrations experienced by the subject librarians because of the flaws associated with the tool, and would change the process for future WCA projects, overall, KU librarians were pleased to discover that the quality of the collections at KU is very high.

Text of paper:

Using the WorldCat Collection Analysis Tool: Experiences from The University of Kansas Libraries

Amalia Monroe-Gulick, almonroe@ku.edu
Lea Currie, lcurrie@ku.edu

Author Final Draft

Introduction

The KU Libraries is a founding member of the Association of Research Libraries (ARL) with 4,235,542 total cataloged items. KU has a sizable special collections library which holds manuscripts, rare books, maps, photographs, and ephemera. KU also holds many unique collections in international area studies, including its East Asian; Spanish, Portuguese, and Latin American; Slavic; and African Studies collections, which combined total more than one-third of the KU Libraries' collections. KU is a longtime member of and contributor to OCLC.

The WorldCat Collection Analysis (WCA) tool is an instrument for evaluating one library's collection against the holdings of the entire WorldCat database and of selected OCLC members. When the WCA tool was first introduced, OCLC representatives were invited to visit KU and demonstrate its capabilities. After reading negative reviews and hearing that OCLC was promising to make improvements in the future, KU Libraries decided to wait until the problems were ameliorated before subscribing to the tool. KU Libraries' Associate Dean of Technical Services and the Assistant Dean of Collections and Scholar Services had been paying attention to the WCA, and when they thought the proper improvements had been made, they announced that KU would subscribe for a year. The KU Libraries began a subscription to the WCA in March of 2009. The subscription included three designated OCLC sites: KU General Libraries; the Kenneth Spencer Research Library, which houses rare books, manuscripts, and other special collections; and the KU Medical Center's Dykes Library.

KU Libraries has a long history of collecting data to analyze its collections. Circulation and interlibrary loan statistics are collected, as are the numbers of faculty, students, and graduation rates for each KU academic discipline. This data is used to inform allocation decisions each year. The WCA comparison data provides additional information for identifying strengths and weaknesses of the collections for allocation purposes.
The results of this analysis are being used to better understand the KU collections and to realign development priorities for the foreseeable future. Librarians at KU will also use the comparison information to identify potential subject areas for collaborative collection development with other libraries in the state of Kansas, most notably Kansas State University.

Literature Review

The available literature on the WCA addresses the functionality of the product and case studies of individual libraries. There is very little literature discussing a large institution implementing the tool across collections. The leaders of the implementation committee reviewed articles that addressed WCA utilization, strengths, and weaknesses. All subject librarians participating in the project were encouraged to read the articles to develop a better understanding of the process and what types of analyses could be conducted. The literature also informed the KU Libraries' implementation committee's documentation and training sessions.

The data generated from WCA can be used in several different types of analyses to help achieve the goals of individual institutions. Sneary (2006), an OCLC Creative Services Analyst, suggests that WorldCat Collection Analysis (WCA) can help ensure that current collection development is in alignment with the "strategic goals of the university." With the data collected from the WCA, libraries can determine if certain subject areas are heavily collected. Comparing local holdings with other institutions' collections is the most common approach to utilizing the WCA, but there are other goals as well. Intner (2003) suggests that the knowledge gained from evaluations, specifically comparison projects, allows for informed justifications when discussing collections at the university level or requesting additional library funding. Discovering that a library collection is smaller or older than its peers' collections is a simple and understandable way to communicate to university administration that additional library resources are needed. At Saint Leo University, Henry et al. (2008) used the WCA as a collection evaluation in response to the university's Institutional Effectiveness Plan. One component of their project was the comparison of their holdings to holdings at similar institutions. As a secondary result of gathering WCA data, lists of titles currently owned by SLU were generated for weeding purposes. St. Leo's librarians plan to conduct a second analysis in a few years to provide the library with additional data and a longitudinal study of the collection. Spires (2006) used the WCA to run comparisons between Bradley University and a defined peer group of libraries in the Consortium of Academic and Research Libraries (CARL) in Illinois. The librarians at Bradley University found the WCA helpful in comparing collection size, age of collection, collection overlap, and collection uniqueness with libraries in CARL. At Colorado State University, Culbertson and Wilde (2009) used the WCA and other metrics to assess the library's support of doctoral programs in twelve disciplines.
The purpose of the study was to support a request for additional funding to the university administration. Librarians worked with teaching faculty to identify comparable institutions. Using the WCA, the monograph collections were evaluated, but evaluating journal collections was challenging because of the lack of accurate serials records in WorldCat. For example, WorldCat does not include records for electronic serials in aggregator databases. Librarians incorporated faculty input, accreditation criteria, Journal Citation Reports, statistics from CSU's open URL server, Local Journal Utilization Reports, and interlibrary loan statistics to develop a list of essential core journals.

One of the most important aspects of the literature review is the identification of the problems and drawbacks of the WCA, and ways to circumvent them. Negrucci (2008) identified some of the weaknesses of the WCA. The two biggest problems identified were the over- or under-reporting of unique titles because of multiple editions and formats, and the rigidity of the OCLC subject conspectus. The author was not able to apply the WCA Functional Requirements for Bibliographic Records (FRBR) filter to specific comparison groups, only to the general WorldCat analysis, because FRBR was not available for individual subjects at the time. When FRBR is not working, it is easy to misinterpret the data, because titles that appear to be unique may actually be different editions or formats of works a library already holds.

At North Carolina State University, Orcutt and Powell (2006) used the WCA to run comparisons against groups of research libraries in their consortium. They found that, by design, the WCA works better when running comparisons against a single institution and not as well with multiple institutional comparisons. NCSU librarians also realized that WCA data is only updated quarterly and that the tool could not accommodate sampling methods. Obtaining workable data required gathering information within restrictions inherent to the tool. They were forced to exclude all formats other than monographs because of inconsistent reporting to OCLC. They also had to exclude titles with imprint dates within the most recent two years in order to account for differences in cataloging and acquisition rates across their consortium. In many cases, the WCA subject categories were inflexible and often less than helpful.

Some academic libraries chose to evaluate specific collections in their libraries. Beals (2007) used the brief test assessment model and the WCA to evaluate zoology collections at three universities. The brief test assessment was developed based on the Research Libraries Group (RLG) Conspectus model as an attempt to quantify collection strengths. Like Orcutt and Powell (2006), Beals had similar difficulties when using the WCA. The author encountered problems running analyses
The author argued that combining both the brief test assessment and the WCA provides a more complete picture of a collection since they each fulfill a different role in collection assessment and provide a more complete picture of a collection. Cox and Gushrowski (2008) used the WCA to determine the publication date span and median publication date for a weeding project in a dental library. Library staff were able to determine the quantity and age of titles in specific subject areas and compare them to the collection as a whole. A less documented but useful way to utilize the product is the use of interlibrary loan statistics. Way (2009) used the WCA in a unique manner. Grand Valley State University Libraries used the interlibrary loan (ILL) analysis in the WCA to generate a list of titles that had been borrowed. A review of the list seemed to indicate a large number of titles would likely be appropriate for their collection. As a result, the library decided to pursue the development of a patron-initiated purchase program via ILL to enhance the library’s collection. Implementation Following the decision to purchase a year-long subscription to OCLC’s WorldCat Collection Analysis (WCA) tool, the Associate Dean of Technical Services and the Assistant Dean of Collections and Scholar Services appointed an implementation committee that was led by the Head of Collection Development and a Social Sciences Librarian. The implementation committee reviewed and identified institutional goals and priorities related to the project, and taught subject librarians in a classroom setting and one-on-one sessions to use the WCA. The committee wanted to establish and facilitate an efficient process because all subject librarians with collection development responsibilities (12) were required to complete reports using WCA. The committee compiled a document outlining the overall goals of the project, a timeline, and a list of limitations within the WCA product. The priorities, goals, and timeline for KU Libraries were as follows: 1. Between May 1, 2009 and March 3,1 , 2010, compare our collections with ARL peers and other groups or individual libraries as identified by subject liaisons (see Appendix A) (10). a. Identify strengths and weaknesses that characterize our collections generally. b. Identify unique collections or unique material. c. Identify gaps in our collections, overlaps, and duplication. d. Identify resources needed to support new and expanding programs and to support formal accreditations. e. Identify possibilities for collaborative collection development within a region or consortium. 2. By June 1 , 2010, create a formal collection development plan resulting from the analyses conducted a. Recommend specific areas for increased budget allocations. b. Recommend specific areas where collections budget will be cut. c. Report on significant weaknesses and subject areas where collections may be stronger than necessary (due to the goals and mission of the institution). http://dx.doi.org/10.1080/01462679.2011.604907 http://kuscholarworks.ku.edu/dspace/ Amalia Monroe and Lea Currie (2011): Using the WorldCat Collection Analysis Tool: Experiences From the University of Kansas Libraries, Collection Management, 36:4, 203-216. Publisher’s official version: http://dx.doi.org/10.1080/01462679.2011.604907. Open Access version: http://kuscholarworks.ku.edu/dspace/. 5 3. During Fiscal Year 2011 and beyond, coordinate discussion with other regional libraries regarding cooperative collection development opportunities. 
Before the WCA project was implemented, fallacies and inconsistencies of the WCA product and of records in OCLC, as well as problems unique to KU, were identified to better inform the collection analyses. Those included:

• WCA uses the WorldCat accession number as the unique identifier of a bibliographic record for matching purposes, and not the title, author, or edition statements. Negrucci (2008) describes how the same edition of a work may have multiple bibliographic records in WorldCat, resulting in a comparative analysis that over-reports uniqueness and under-reports overlap. OCLC has attempted to mitigate the over-reporting of uniqueness by providing the option of applying a Functional Requirements for Bibliographic Records (FRBR) algorithm so that the same titles with different formats or editions are compared.
• Orcutt and Powell (2006) reported difficulty in obtaining reliable samples from different subject areas. Libraries have been forced to use the limit function to exclude non-book formats and recent publication dates to obtain workable data sets for their core collection analysis.
• Orcutt and Powell (2006) complained about the rigid conspectus structure of the WCA. The lack of functionality to conduct a user-defined search of LC subjects and classifications limits the detailed view needed for an in-depth collection analysis.
• The currency of WCA data is also an issue. The tool relies on an extract from WorldCat taken once per quarter. For more current imprints, the infrequency of updates precludes tenable comparisons. WCA is not the tool for comparing recent acquisitions.
• The WCA does not permit sorting by language.

Problems identified that are unique to KU collections included:

• Many of KU's electronic resources, including e-record sets like Early English Books Online, 18th Century Collections Online, ACLS (American Council of Learned Societies) Humanities E-Book, etc., are not in WorldCat.
• A few of KU's e-journal titles will be found in WorldCat; however, the KU Catalog's records for the e-journals do not contain an OCLC number.
• Most U.S. government documents and international documents are not in WorldCat.
• Many of the East Asian collection titles are not in WorldCat.
• Approximately 50,000 maps in the Map Library are not in WorldCat.
• Many microforms are not in WorldCat.
• Most of the sound recordings from the KU Archive of Recorded Sound are not in WorldCat.
• Over 3 million photographs from the KU Libraries are not found in WorldCat.
• Some manuscript collections from Kenneth Spencer Research Library are not in WorldCat, particularly those in Special Collections.
• OCLC records without a Dewey or LC call number will be designated as "Unknown Classification" in WorldCat Collection Analysis. This includes most Kenneth Spencer Research Library special collections materials, theses and dissertations, some microforms, some sound recordings, as well as catalog records that were contributed to OCLC but which lacked a Dewey or LC call number in the master record.
• Some records from Kenneth Spencer Research Library appear in the main KU General Libraries WorldCat Collection Analysis account. They may, or may not, also appear in the Spencer Research Library WorldCat Collection Analysis account, because an item record may be connected to both OCLC records.

In addition, the committee broadly outlined the process that subject librarians would take to achieve the priorities and goals stated above. Subject librarians were directed to:

1. Conduct a basic WorldCat comparison.
2. Run reports comparing collections to our ARL peers (see Appendix B).
3. Choose other appropriate pre-identified groups, such as the Big 12 or the Regents Libraries (four-year public colleges and universities in Kansas) (see Appendix B).
4. Consult with teaching faculty to compile lists of peer institutions appropriate for the unique disciplines to conduct additional analyses.
5. Choose from authoritative lists, such as Best Books for College Libraries.
6. Consider different types of tools for unique collections – all collections are not created equal!
7. Share ideas for analysis with other subject librarians while working through the process.

As the participants began working on the project, it became clear that more guidance and training were necessary. Therefore, additional documentation was created throughout the project in response to questions from subject librarians. At their request, a report template was created to help analyze the data collected (see Appendix A). The committee agreed that a template would provide a certain amount of consistency in the data reported.

Training and User Support

In addition to the documentation, three different kinds of support were offered to further assist subject librarians with using the WCA: workshops, user groups, and desk-side coaching. All three provided basic information on the product, how to use the product, and useful Excel features. In addition to the initial documentation previously mentioned, a wiki was created that included login information, lists of the libraries in each comparison group, and a bibliography of articles discussing WCA projects and product reviews, and later included completed reports.

Introductory workshops designed specifically for KU Libraries' needs were conducted by three members of the WCA implementation committee and a library technology instructor, who trained the subject librarians to use advanced options in Excel. During the workshops a practical demonstration was presented on using the WCA. Subject librarians were shown how to choose comparison groups, limits, and display options.

The monthly WCA Users Group meetings were held for any interested WCA project participants. These meetings were designed for users to come and work on reports, ask questions, and share ideas. The meetings also provided users with a venue to express their concerns and the problems they encountered when running reports. Later in the year, in this venue, subject librarians shared their completed reports and the WCA features they used to analyze their own collections.
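A recurring topic in these sessions was the multiple-editions problem noted earlier: because matching is done on WorldCat accession numbers, several records for the same work can inflate apparent uniqueness unless they are clustered first. The Python sketch below is purely illustrative of that clustering idea; the field names and the crude title/author key are assumptions for this example, and OCLC's actual FRBR algorithm is far more sophisticated.

import re
from collections import defaultdict

def work_key(title, author):
    # Crude normalization: lowercase and strip non-alphanumerics.
    norm = lambda s: re.sub(r"[^a-z0-9]", "", s.lower())
    return (norm(title), norm(author))

rows = [
    {"oclc": "100", "title": "Moby-Dick", "author": "Melville, Herman"},
    {"oclc": "200", "title": "Moby Dick", "author": "Melville, Herman"},
]

works = defaultdict(list)
for row in rows:
    works[work_key(row["title"], row["author"])].append(row["oclc"])

# Two OCLC records collapse to one work: counting raw records rather
# than clustered works is how uniqueness gets over-reported.
print(len(rows), "records ->", len(works), "work(s)")

Hand-collapsing editions in Excel, as some participants demonstrated, follows the same basic logic, if far less formally.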
These report-sharing demonstrations created a context and provided support for subject librarians who had been struggling with the project. Observing how other librarians utilized features such as exporting and copying charts, filtering and sorting in Excel, and working with multiple editions was extremely helpful.

One of the biggest challenges discussed in the WCA Users Group meetings was the ability to understand exactly what participants were trying to learn from the analysis. It was not possible to establish strict guidelines for everyone because there were too many variations among collections, and different flaws within the product affected specific subject collections differently. The social science disciplines, for example, were affected less by the FRBR issue than the humanities. Also, science librarians found the project difficult since their collections are mainly serials dependent. Most of their serials come from aggregator databases and do not appear as KU holdings in WorldCat.

Desk-side coaching was also provided for running WCA reports and Excel training. These one-on-one sessions were helpful in assisting users to better understand what they were attempting to extract from the tool. It was also more effective to train people on different features of Excel one-on-one because of the varying needs and skill levels of individual users. KU subject librarians greatly improved their Excel knowledge and skills while working on this project.

Results

WCA reports were run comparing the entire KU Libraries' collections to several peer groups, including KU's ARL peers, Big 12 peers, and Kansas State University (see Appendix B), the latter with whom KU has collaborated on several collection development projects. These broad comparisons gave a clear indication that KU has relatively strong collections, based on the high number of titles that overlapped with our aspirational peers. Compared to the ARL peers, KU has 77.48% overlap, while KU has 70.86% overlap with Big 12 peers. KU's overlap with Kansas State was only 35.12%, which is not surprising since KU has a larger collection and the two schools have many differing academic programs.

Forty-five individual WCA reports for specific subjects were submitted by subject librarians. Overall, the results were positive and can be used for several different collection management activities, including retrospective collecting, approval plan adjustments, changing future firm orders, and augmenting collection development policy statements. For example, the Political Science collection analysis found an overall homogeneity between its holdings and the comparison peers (Big 12, ARL peers, and Political Science peers). This is a positive result, as KU does not strive to have a unique collection in this area, and there were no significant weaknesses when compared with its peers. All subject areas were strong except for the U classification, but KU does not have a military science program, so this was not an area identified for adjustment. The Map collection was described in the subject librarian's report as "on par with the very largest research libraries." The list of titles not held by KU will be used for retrospective collecting in the areas of history of cartography and environmental sciences. The United States History collection was also found to be strong in the E-F classifications.
The only significant weakness observed was in the area of Pacific States and Territories, but this was not cause for concern, as this is not a widely studied area at KU. The Journalism collection was found to be strong, except in the areas that focus on reporting specific issues, e.g., the Iraq War. These findings resulted in ordering titles not previously held by KU, and in adjustments to future firm order priorities.

Some collections were found to be weak overall. The subject librarian for the Women's Studies collection stated that the collection "contains the minimum of what should be in a research library collection, and is behind both our ARL Peers and our Women's Studies Peers." These results are significant because a PhD program in Women's Studies has recently been approved, and the current collection may not support advanced research in this area. Interlibrary loan and circulation data, in conjunction with the list of titles not held by KU, will be used to expand the collection. The African Studies collection analysis also found that the KU collection is far behind its peers in collecting African materials.

Even with the majority of analyses finding positive results, most subject librarians reported varying levels of dissatisfaction because they felt they did not gain significant insight into their collections from the WCA comparisons; rather, the results reaffirmed what they already assumed about their collections, or the data was inaccurate. Even though there was dissatisfaction with the product, KU Libraries now has documentation about its collections based on the reports of individual librarians.

• Thirty-nine reports found that KU collections were comparable to aspirational peer libraries, based on the number of titles that overlapped with titles in peer libraries and unique titles held by KU.
• Weaknesses in collections were identified in 44% of the reports submitted by librarians, based upon low overlap percentages. However, 29% noted that the titles not owned by KU would not typically be selected for our collections because they fall outside of KU's collecting scope (i.e., commercial publishers, professional literature, subject content, textbooks).
• 42% of the reports stated that books held by peers would be purchased as a result of the peer analyses.
• 16% reported that they would make adjustments to the approval plan in their subject areas.

Two librarians stated that they would begin collaborative collection development projects with other libraries based on their findings in the WCA reports.

There were many negative comments in the final reports regarding the difficulty of using the WCA and the lack of usable data collected.

• 16% of librarians reported that they found the WCA reports of no use, because their subject areas are serials dependent.
• 44% of librarians reported problems with the WCA data, including too much duplication due to multiple editions, items not cataloged in WorldCat, and data from WorldCat not being uniform.
• 11% of librarians thought that the WCA was too difficult to use.
• 7% of librarians were disappointed that a particular language was not available in the WCA.

Conclusion

After reflecting on the entire WCA project at KU Libraries, some elements of the implementation and process were considered to have worked well, and others need improvement. When the project was completed, the implementation committee met to discuss the successes and failures of the project. After reading all of the librarians' reports, the committee members agreed that they would have conducted the process differently had they been aware of the challenges beforehand. Initially the implementation committee thought it would be beneficial for all subject librarians to be involved in the project, but later agreed that a smaller, core group of librarians should have run all of the reports so that there was more consistency in the data that was collected. Even though a template for reporting data and analyzing the reports was designed and shared with the subject librarians, a few librarians did not use it, and even those that did often supplied inconsistent data, making it difficult to analyze the big picture.

One of the benefits of the WCA is the ability to download lists of titles that the library does not own into Excel. These lists have several uses, including providing titles for retrospective collection development purchases to fill in gaps and providing information for future comparisons. However, if KU were to subscribe to the WCA again, all of the title lists would be run by a smaller group of participants and stored in a centralized repository for future use.

An additional benefit of the project emerged because librarians were simultaneously writing new subject collection development policies. The WCA results frequently reaffirmed claims made about collections over a number of decades, including the claim that KU has research-level collections that cover all major areas in its academic disciplines. These results then informed the newly written policies. Both the WCA reports and the collection development policies will provide documentation for future collection managers that will enhance their understanding of the history of the collections and the reasons behind collecting decisions.

One of the major setbacks experienced by KU Libraries during the year-long WCA project was technical difficulties. Subject librarians consistently reported problems related to the inability to download reports because the WCA would "time out." The problems were repeatedly reported to OCLC, but they were not resolved immediately. The "time out" issue not only prevented librarians from running reports when they had scheduled time to do so, but also created a significant amount of frustration with the project as a whole. Although many of the technical problems KU encountered throughout the year are not documented elsewhere in the literature, we would recommend that OCLC make the debugging of these problems a priority. As noted by Negrucci (2008) and Orcutt and Powell (2006), the rigid subject conspectus persists and is a problem that OCLC also needs to address.
KU also agrees with Orcutt and Powell (2006) and Beals (2007) that OCLC needs to update the WCA on a more frequent basis. A monthly update would be a significant improvement, and improvements to FRBR would also make the product much more useful.

The WCA would be a much more useful product if it could analyze all of the serials to which we have access. KU subscribes to many of the largest aggregator databases, but the serials we access through these databases do not display as owned by KU in WorldCat. WorldCat could provide a knowledge base of serials title lists that are accessible from aggregators to add to the serials cataloged in OCLC. Making the subject conspectus less rigid would also make the WCA far more functional than it is now.

Other libraries would be advised to consider the overall benefits and drawbacks of the WCA before implementing any library-wide projects. Libraries will want to identify their goals, and the potential problems that might be caused when running WCA reports due to inadequate records in individual library catalogs. They will want to ensure that all those participating understand what the WCA can and cannot do, as well as the technological components of the product. KU found that standardizing the data collected as much as possible is advantageous for all involved in any analysis project and will produce stronger results. Furthermore, assigning a smaller group of librarians to conduct the analyses would also ensure that the results are more consistent among collections, as well as maximizing expertise and minimizing staff time devoted to the project.

Overall, KU Libraries found some value in subscribing to the WorldCat Collection Analysis tool even though there were challenges because of problems with the tool. Many subject librarians concluded that their specific collections are strong. Librarians also identified publishers with whom they were previously unfamiliar. They identified sub-areas of specific collections that are strong, wrote more informed collection development policies, and improved Excel skills. The collection documentation (title lists of monographs not owned by KU and collection policies) that was produced is one of the primary benefits resulting from the project. KU may subscribe to the WCA in the future if the current drawbacks to the product are addressed by OCLC. However, if KU Libraries subscribes to the WCA in the future, changes will be made in the implementation process. Collecting data over time to compare the KU collection with its peers would produce useful information to track the changing nature of the KU Libraries collections. Gathering longitudinal data would also assist KU Libraries in developing an awareness of how collections are changing at a national level. Tracking these changes will help KU Libraries understand what a research library is and will be in these quickly changing times.
Appendix A

Report Template: WorldCat Collection Analysis Tool Report, FY10

Subject Librarian:
Subject Area:

Methodology: Explanation of Reports
1. Comparison groups used (e.g., ARL peers, WorldCat, custom peers, standard list)
2. Limits utilized (if applicable):
   a. Subject
   b. Years
   c. Call number range
   d. Holding count
   e. Language
   f. Format
3. How were reports sorted:
   a. Call number
   b. Publisher
   c. Language
   d. Holding count

Results/analysis (please provide supporting data where applicable):
1. Please provide a general description of your collection.
2. Strengths of collection (e.g., unique items, completeness of collections).
3. Weaknesses of collection (e.g., missing call number ranges, publishers, years).
4. Application of results (how will you use the data collected to make decisions about the collection; e.g., approval plan adjustment, budgetary requests, retrospective collecting, accreditation purposes, collection development policies).
5. Difficulties (fallacies that impeded results; e.g., multiple editions, serial-dependent, records not in OCLC, alternative formats)

Additional feedback:
1. Was this analysis of your collections useful? Why or why not?
2. How successful were you in getting the information you wanted from the WCA reports?
3. What information were you hoping to find from the WCA?
4. How should the WCA tool be changed to make it more useful?

Appendix B

KU Libraries Peer Comparison Groups

KU Libraries' ARL Peers
• University of Colorado, Boulder
• University of Iowa
• University of North Carolina, Chapel Hill
• University of Oklahoma
• University of Oregon

KU Libraries' One-to-One Comparisons
• KU Spencer
• Kansas State University
• University of Michigan
• University of Missouri
• University of Nebraska, Lincoln

KU Libraries' Big 12 Peers
• Baylor
• Iowa State
• University of Missouri
• Oklahoma State
• Texas A&M
• Texas Tech
• University of Colorado
• University of Nebraska
• University of Oklahoma
• University of Texas

References

Beals, Jennifer Benedetto. 2007. Assessing collections using brief tests and WorldCat Collection Analysis. Collection Building 26(4): 104-107.

Cox, Janice E., and Barbara A. Gushrowski. 2008. A dental library book collection intervention: from diagnosis to cure. Journal of Hospital Librarianship 8(3): 353, 354.

Culbertson, Michael, and Michelle Wilde. 2009. Collection analysis to enhance funding for research materials. Collection Building 28(1): 9-17.

Henry, Elizabeth, Rachel Longstaff, and Doris Van Kampen. 2008. Collection analysis outcomes in an academic library. Collection Building 27(3): 113-117.
Intner, Sheila. 2003. Making your collections work for you: collection evaluation myths and realities. Library Collections, Acquisitions and Technical Services 27(3): 339-350.

Negrucci, Teresa. 2008. Advisor review: WorldCat Collection Analysis. The Charleston Advisor 9(4): 50-54.

Orcutt, Darby, and Tracy Powell. 2006. Reflections on the OCLC WorldCat Collection Analysis tool: we still need the next step. Against the Grain 18(5): 44, 46, 48.

Sneary, Alice. 2006. Moving towards a data-driven development: WorldCat Collection Analysis. Against the Grain 18(5): 54, 56, 58, 60.

Spires, Todd. 2006. Using OCLC's WorldCat Collection Analysis to evaluate peer institutions. Illinois Libraries 86(2): 11-19.

Way, Doug. 2009. The assessment of patron-initiated collection development via interlibrary loan at a comprehensive university. Journal of Interlibrary Loan, Document Delivery, and Electronic Reserve 19: 299-308.

work_xyj5zpvsn5aanglbslwsxp4fpu ----

American Archivist
Volume 58 / Number 1 / Winter 1995
The Society of American Archivists
[Cover cartoon caption: EVOLUTION OF FEDERAL RECORD-KEEPING PROCEDURES]

The American Archivist
Richard J. Cox, Editor, University of Pittsburgh
Teresa M. Brinati, Managing Editor, Society of American Archivists
Nancy Fleming, Copy Editor
Mitchell Bjerke, Editorial Assistant
Barbara L. Craig, Reviews Editor, University of Toronto

EDITORIAL BOARD
Terry Cook (1994-1997), National Archives of Canada
Anne Gilliland-Swetland (1994-1997), University of Michigan
Joan D. Krizack (1994-1995), Northeastern University
Lawrence J. McCrank (1994-1996), Ferris State University
Robert S. Martin (1994-1998), Louisiana State University
James M. O'Toole (1994-1998), University of Massachusetts at Boston
Victoria Irons Walch (1994-1996), Consultant
Maxine Trost (1994-1995), University of Wyoming

The Society of American Archivists
PRESIDENT: Maygene Daniels, National Gallery of Art
VICE PRESIDENT: Brenda Banks, Georgia Department of Archives and History
TREASURER: Lee Stout, Pennsylvania State University
EXECUTIVE DIRECTOR: Susan E. Fox
COUNCIL MEMBERS
Karen Benedict (1993-1996), Consultant
Susan E. Davis (1994-1997), State Historical Society of Wisconsin
Luciana Duranti (1992-1995), University of British Columbia
Timothy Ericson (1993-1996), University of Wisconsin at Milwaukee
Margaret L. Hedstrom (1992-1995), New York State Archives & Records Administration
Steve Hensen (1994-1997), Duke University
H. Thomas Hickerson (1993-1996), Cornell University
Sharon Gibbs Thibodeau (1994-1997), National Archives and Records Administration
Elizabeth Yakel (1992-1995), Student

About the cover: Issues about the management of electronic records continue to vex both archivists and records managers. This cartoon appeared in the September 4, 1995, issue of Federal Computer Week accompanying an editorial about the recent National Archives' electronic mail regulations and the apparent lack of e-mail management systems that would allow agencies not to have to print out electronic mail messages. This issue of the American Archivist carries several essays concerning the challenges of managing electronic mail. (Illustration courtesy of Richard Tennant)

The American Archivist
Volume 58 / Number 1

Forum / 4

From the Editor
Easy Distinctions / 6
Richard J. Cox

Presidential Address
Expanding the Foundation / 10
Edie Hedlin

Pease Award
"No Documents—No History": Mary Ritter Beard and the Early History of Women's Archives / 16
Anke Voss-Hubbard

Research Article
Beneficial Shocks: The Place of Processing-Cost Analysis in Archival Administration / 32
Paul Ericksen and Robert Shuster

©The Society of American Archivists, 1995. All Rights Reserved. ISSN 0360-9081

Perspectives
Legal Admissibility of Electronic Records as Evidence and Implications for Records Management / 54
Sara J. Piasecki

Integrating Archival Management and the ARCHIVES Listserv in the Classroom: A Case Study / 66
Diana L. Shenk and Jackie R. Esposito

International Scene
Revolution in Records: A Strategy for Information Resources Management and Records Management / 74
Peter M. H. Waters and Henk Nagelhout

From the Reviews Editor
What's Ahead in Reviews / 84
Barbara L. Craig

Review Article
James M. O'Toole
Toward a Usable Archival Past: Recent Studies in the History of Literacy / 86
(Works reviewed: Michael T. Clanchy, From Memory to Written Record: England, 1066-1307; Brian Stock, The Implications of Literacy: Written Language and Models of Interpretation in the Eleventh and Twelfth Centuries; Rosamund McKitterick, The Carolingians and the Written Word; Rosamund McKitterick, ed., The Uses of Literacy in Early Medieval Europe; Rosalind Thomas, Literacy and Orality in Ancient Greece; and William V. Harris, Ancient Literacy)

The Society of American Archivists
SAA Council Meeting Minutes, 3-5 June 1994 / 100
SAA Council Meeting Minutes, 6 September 1994 / 107
SAA Council Meeting Minutes, 10 September 1994 / 112
Editorial Policy / 114

Postal Notice
The following statement of ownership, management, and circulation was filed in accordance with the provisions of Section 4369, Title 39, U.S.
Code, on 29 September 1995, by Teresa M. Brinati, Managing Editor. The American Archivist is published quarterly by the Society of American Archivists, 600 S. Federal St., Suite 504, Chicago, Illinois 60605. The managing editor is Teresa M. Brinati. The owner is the Society of American Archivists, 600 S. Federal St., Suite 504, Chicago, Illinois 60605. There are no stockholders, bondholders, mortgagees, or other security holders in the organization. The average number of copies of each issue printed during the preceding twelve months was 5,514; sales through dealers and carriers, street vendors, and counter sales were 0; mail subscriptions to members and subscribers were 4,925; total paid circulation was 4,925; free distribution was 119; total distribution was 5,040; and 470 copies were for office use, leftover, or spoiled after printing. For the most recent issue (Summer 1994), the total number of copies printed was 5,547; sales through dealers and carriers were 0; mail subscriptions to members and subscribers were 4,925; total paid circulation was 4,925; free distribution was 121; total distribution was 5,046; and 501 copies were for office use, leftover, or spoiled after printing.

Subscription Information
The American Archivist (ISSN 0360-9081) is published quarterly by the Society of American Archivists, 600 S. Federal, Suite 504, Chicago, Illinois 60605. Second class postage paid at Chicago, Illinois, and additional mailing office. Subscriptions: $85 a year to North American addresses, $100 a year to other addresses. Single copies are $25 for magazine copies and $30 for photocopies. Articles and related communications should be sent to Teresa M. Brinati, Managing Editor, Society of American Archivists, 600 S. Federal, Suite 504, Chicago, Illinois 60605. Telephone: (312) 922-0140. Advertising correspondence, membership and subscription correspondence, and orders for back issues should be sent to SAA at the address above. Requests for permission to reprint an article should be sent in writing to SAA at the above address. Claims for issues not received must be received by SAA headquarters within four months of issue publication date for domestic subscribers and within six months for international subscribers. The American Archivist is available on 16 mm microfilm, 35 mm microfilm, and 105 mm microfiche from University Microfilms International, 300 N. Zeeb Road, Ann Arbor, MI 48106-1346. When an issue is out of stock, article and issue photocopies may also be obtained from UMI. The American Archivist is indexed in Library Literature and is abstracted in Historical Abstracts; book reviews are indexed in Book Review Index. The American Archivist is printed on an alkaline, acid-free printing paper manufactured with no groundwood pulp that meets the requirements of the American National Standards Institute—Permanence of Paper, ANSI Z39.48-1992. Typesetting and printing of The American Archivist is done by Imperial Printing Company of St. Joseph, Michigan. The American Archivist and the Society of American Archivists assume no responsibility for statements made by contributors. ©The Society of American Archivists 1995. All rights reserved. Postmaster: send address changes to The American Archivist, 600 S. Federal, Suite 504, Chicago, Illinois 60605.

Forum
Appraisal and Oral Evidence

To the editor:

Luciana Duranti's otherwise excellent article, "The Concept of Appraisal and Archival Theory" (American Archivist 57 [Spring 1994]: 328-44), has in it a glaring error. Duranti believes "that documents purposely created to provide evidence of oral actions must not be included in the societal archives: They do not constitute evidence but interpretation, and their inclusion among archival material would be an infringement of our historical accountability" (p. 343).

To follow Duranti's direction to exclude all "documents purposely created to provide evidence of oral actions" would impoverish society, archives, and history by forcing archivists to throw out many critically important documents such as:

• all written minutes of meetings conducted orally.
• all written memoranda of oral conversation, statements, or interviews, even court stenographers' typed transcriptions of legal testimony during trials.
• all segments of written memoirs, biographies, or autobiographies that are based on "oral actions."
• all written diplomatic, legal, economic, social, and political reports or memoranda based solely on what one heard or said.

Instead of prolonging the list of types of items Duranti would remove from the historical record, let's remember specific documents from James Madison's notes on the U.S. Constitutional Convention to John Dean's and H. R. Haldeman's memoranda of conversations with President Richard M. Nixon. Surely, most archivists would agree that to not include such "documents purposely created to provide evidence of oral actions . . . among archival material would be an infringement of our historical accountability."

ROBERT G. SHERER
University Archivist
Tulane University

Reply from the author:

Thank you for giving me the opportunity to respond to the letter of Robert G. Sherer. Mr. Sherer is absolutely correct in each and every one of his statements. I would never suggest that any of the examples he lists is not the direct competence of the archivist and should be removed "from the historical record." I did not refer to those types of records when I made the statement quoted by Sherer. As a matter of fact, most of those records belong to one of the two most important diplomatic categories of records, the probative records. (See Luciana Duranti, "Diplomatics: New Uses for an Old Science. Part II," Archivaria 29 [Winter 1989-90]: 9.)

The key to my intended meaning is the adverb "purposely," as opposed to "naturally." Minutes, memoranda, and similar reports of oral actions are generated in the natural course of affairs, not to provide a historical record for future researchers. In other words, they are needed for carrying
I will make a better effort in the future. I also wish to thank Robert Sherer for having brought the issue to my attention, and for giving me the opportunity to clarify my statement, as I am certain that many people have been wondering about it. LUCIANA DURANTI Master of Archival Studies Program University of British Columbia MicroMARC and Importing/ Exporting To the editor: In her article, "Automating the Archives: A Case Study," (American Archivist 57 [Spring 1994] 364-73) Carole Prietto mis- represents the capabilities and functions of MicroMARC :amc. From its initial release in 1986, MicroMARC:amc has always had the capability to import and export US- MARC AMC records. This includes im- porting and exporting records to OCLC. There has never been a question with the ability of MicroMARC :amc to export re- cords, whether to OCLC or other MARC systems. The only question has been in what medium. Until a few years ago, OCLC required the records be sent on a 9- track tape. For MicroMARC:amc users who did not have the capability to generate a 9-track tape for export to OCLC, we pro- vided such a service. Today MicroMARC: amc users can easily transport their USMARC AMC records to OCLC, RLIN, and so on, via the Internet. FREDERICK L. HONHART Michigan State University Reply from the author Thank you for the comment concerning my article. Please note that in footnote number 13, I do note the fact that at the time I evaluated MicroMARC for use at Wash- ington University, "MicroMARC users had to copy completed records to a floppy disk and send them to Michigan State Uni- versity. At Michigan State, records were tape-loaded into OCLC via the university's mainframe. Both MicroMARC and Mina- ret have since added modules for importing and exporting MARC records." The larger point being made at that place in the article was that, as of 1991, MARC records cre- ated in either Minaret and MicroMARC re- quired some form of conversion routine before they could be loaded into OCLC. In both cases, that has since changed, as I also stated in footnote 13. I believe this ad- dresses your points concerning OCLC con- version and MicroMARC, but if it does not, I would appreciate hearing from you so that the record may be set straight. CAROLE PRIETTO Washington University in St. Louis With the exception of editing for con- formity of capitalization, punctuation, and citation style, letters to the Forum are published verbatim. D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.58.1.h915873284443778 by C arnegie M ellon U niversity user on 06 A pril 2021 American Archivist / Vol. 58 / Winter 1995 From the Editor Easy Distinctions HISTORICAL PRESERVATIONIST Hugh How- ard has written: "The world is full of easy distinctions.... a convenient one is be- tween the savers and the throwers."1 The essays in this issue of the American Archi- vist are also about easy distinctions in our own world: The champions of archives, versus those who are not advocates. The need to conduct research about basic ar- chival functions versus the need to manage potential damage against providing greater detail on the costs of maintaining our doc- umentary heritage. The growing use of electronic recordkeeping systems, moving against the tide of legal systems and archi- val practices still tied to a paper world. Ed- ucation in the classroom, versus "street smarts" acquired over the information highway. 
Archives and records manage- ment objectives, weighed against organi- zations' interest in meeting them and supporting such objectives. The notion of our present professional practice, con- trasted with the historical evolution of the field. Easy distinctions. The initial essay on the early develop- ment of women's archives is a good place to begin considering some easy distinctions in our own work. Anke Voss-Hubbard's history of the origins of the Sophia Smith Collection at Smith College is more than a •Hugh Howard, The Preservationist's Progress: Architectural Adventures in Conserving Yesterday's Houses (New York: Farrar, Straus and Giroux, 1991), 5. chronicle of the early formation of wom- en's archives in this country. It is also an interesting exploration of the value and challenge of archival history. In previous essays I have argued for the relevancy of archival history, as have others (including James O'Toole, who does so again in his essay in this issue), so there is no need to repeat the arguments here. Voss-Hubbard's article, however, is an insightful view into the tenuous foundations of such subject ar- chives, as well as our ability to go back and understand the origins of our pro- grams. At several points, Voss-Hubbard comments on Mary Beard's own lack of interest in or care for her records. I suspect that many archivists have made little pro- vision for their own papers, and that the future historians of our profession will face similar detective sleuthing. Does this strike anyone as peculiarly ironic, that the pre- servers of archival records are not admin- istering their own archives? For a long time archivists have operated as if arrangement and description were the primary functions of their work and re- sponsibility. While appraisal and reference or use have at times competed for priority, other forces—the extent of writings, efforts to develop standards, and the emphases of graduate and continuing education—have kept arrangement and description at the fore. The easy distinction here is that ar- rangement and description equal archival knowledge and practice, whereas other ac- tivities are merely diversions from such D ow nloaded from http://m eridian.allenpress.com /doi/pdf/10.17723/aarc.58.1.h915873284443778 by C arnegie M ellon U niversity user on 06 A pril 2021 From the Editor work. Yet, as Paul Ericksen and Robert Shuster convey in their essay, the supposed centrality of this function has not been ac- companied by serious efforts to analyze its costs and procedures. With the details of their study, Ericksen and Shuster con- firmed "that the resources we devoted to processing exceeded the value we placed on what we had accomplished." Is there another irony here, in that this function's importance as the primary user of archival resources has not been worthy of substan- tial study itself?2 For thirty years archivists and records managers have debated both the signifi- cance of electronic records and how to manage them. While this discussion has gone on, often generating more theoretical discourse rather than reflecting experience, our courts have slowly evolved to the point of treating electronic records as fundamen- tally different or distinct from paper re- cords. Sara J. Piasecki's essay on the legal admissibility of electronic records as evi- dence is a straightforward account of the evolution of law and legal decisions. 
While Piasecki sees many uncertainties in the direction our courts may be heading, leading to a certain "highly contentious legal future," her reading of the law and legislation also identifies trends that force organizations, records managers, and archivists to develop more effective programs for ensuring the maintenance of electronic records systems. Although she does not write in this tone, it appears likely that the electronic technology sweeping through our organizations and society represents more opportunity for strengthening records and archival management if we position ourselves with the right advice in our institutions. The easy distinction between electronic and paper recordkeeping systems that has caused us to break our services and approaches so neatly between the two is nearing the end of its utility.

Archivists have also long debated the relevance of practice and theory and methodology gained in graduate classrooms. Some of these debates are cooling, as graduate education enters a new realm of sophistication and comprehensiveness. Yet, as the article by Diana Shenk and Jackie Esposito reveals, there remains a need and value in maintaining a strong and steady connection between training and education. Their discussion of the use of the ARCHIVES Listserv outlines the potential of bringing the practical, daily work of the archivist into the classroom, a value I certainly see as I require my archives students to read and discuss this and other listservs. Questions remain about the use of electronic discussion vehicles. For example, Shenk and Esposito comment on the generally lower quality of the resulting student papers; is this attributable to the listserv, or is it more a reflection of what we should expect from a one-course introduction? In the program in which I teach, with a cluster of six courses, the quality of papers is high and the use of the Internet more sophisticated. Shenk and Esposito also point out that the use of the ARCHIVES Listserv provides "virtually unlimited access to the great archival minds in our profession." However, many leading archivists do not participate in the public discussions, at the same time that anyone (including nonarchivists) can join and participate in the discussions. (How are these sorted out?) And we must still ask if the best access to the best thinking about archival science is not in the print (or electronic) journals rather than in listservs.

A gap in reality between aim and practice is also often a problem for the purpose of organizational and governmental records management and archives programs. The contribution in this issue by Peter Waters and Henk Nagelhout about recent efforts by the National Archives of the Netherlands offers ways to deal with these challenges.
Rather than trying to force procedures and policies that cut across the organizational grain, archivists and records managers are striving to determine and then meet the needs and wishes of the agency staff creating and maintaining the records. These European archivists also confirm the need, long accepted but seldom practiced, for identifying at an early stage of creation those records that are archival. Their approach also suggests what is happening with our late twentieth-century institutions, when they discuss the abandonment of uniform approaches in favor of a greater diversity for records management.

James O'Toole's review essay on the history of literacy is an important contribution to our professional literature because it shows, with no doubt, that there is a rich and vital scholarship with direct relevance to our own discipline. As he states, those who think they know all they need to know about our professional past from a quick reading of Posner and a few others are very sadly mistaken. Perhaps an easy distinction here is the irony that a profession concerned with preserving historical records seems blissfully unaware both of its own past and of the need to preserve the records of its own institutions, leaders, and profession. If it has no other impact, O'Toole's essay ought to convey the message that the burdens and challenges of the modern electronic age may not be far removed from our ancient predecessors' challenge of coping with the transition from orality to writing and from manuscript to printed texts.

Although Edie Hedlin's essay on building foundations appears first in this issue, as the Presidential Address, it is an appropriate thought for concluding this introductory editorial. My focus has been on easy distinctions, but Edie's emphasis is on hard ones. She argues—and does it well, in my opinion—for the need to build partnerships and professional infrastructure. She describes, well again, how the problems we face are big and require coordinated actions and new initiatives. The issues and concerns raised by the other authors in this American Archivist are exactly the kinds that could be tackled by the types of consortia, institutes, centers, and think-tanks Edie describes in her stirring call for new actions. It is the role of our presidents to paint the big picture and to point us toward brave new worlds. Generally, we forget what they have said or (and just as bad) we view their messages as historical documents reflecting where we were at a particular juncture. Edie Hedlin has given us a document that should not be shelved and forgotten. If we fail to heed this advice, society may shelve us and forget what we have to say.

work_xyj5zpvsn5aanglbslwsxp4fpu ----

Valparaiso University
From the SelectedWorks of Ruth S. Connell
2006

Turnaround Time Between ILLiad's Odyssey and Ariel Delivery Methods: A Comparison

Ruth S. Connell, Valparaiso University
Karen L. Janke

Available at: https://works.bepress.com/ruthconnell/5/

ABSTRACT.
Interlibrary loan departments are frequently looking for methods to reduce turnaround time. The advent of electronic delivery in the past decade has greatly reduced turnaround time for articles, but recent developments in this arena have the potential to decrease turnaround time even further. The ILLiad interlibrary loan management system has an electronic delivery component entitled Odyssey. Odyssey has a setting that allows articles to be sent to patrons without borrowing staff intervention. Using the tracking data created by the ILLiad management system, the turnaround time data for two delivery methods, Ariel and Odyssey, was captured for two different academic institutions. With the Trusted Sender setting turned on, Odyssey delivery was faster than Ariel for the institutions studied.

KEYWORDS. Odyssey, Ariel, ILLiad, interlibrary loan, document delivery, turnaround time

Ruth S. Connell is Reference Services Librarian, Interlibrary Loan Department, Christopher Center for Library and Information Resources, Valparaiso University, 1410 Chapel Drive, Valparaiso, IN 46383 (E-mail: Ruth.Connell@valpo.edu). Karen L. Janke is Assistant Librarian, Acting Access Services Team Leader, and Former Interlibrary Loan Librarian, IUPUI University Library, 755 West Michigan Street, Indianapolis, IN 46202 (E-mail: kjanke@iupui.edu). The authors acknowledge the assistance of Greg Stinson, Director of Institutional Research at Valparaiso University, with statistical analysis. The authors also acknowledge the assistance of Genie Powell of Atlas Systems, Inc. with turnaround time queries.

Journal of Interlibrary Loan, Document Delivery & Electronic Reserve Vol. 16(3) 2006
Available online at http://www.haworthpress.com/web/JILDD
© 2006 by The Haworth Press, Inc. All rights reserved.
doi:10.1300/J474v16n03_07

INTRODUCTION

Electronic document delivery has been a vital component of interlibrary lending and borrowing operations since the mid-1990s through the use of popular software programs such as Ariel, Prospero, and ILLiad. Ariel allows lending libraries to scan and send articles as .TIFF files to borrowing libraries via the Internet, eliminating mailing time and postage costs between the lending and borrowing libraries.1 With the release of version 3.01 in 2001, Ariel includes patron delivery via e-mail or the Internet.2 The ILLiad software also allows for articles to be received via Ariel. Once the .TIFF files are imported into ILLiad they are converted to .PDF, posted to a web server, and patrons are notified by e-mail that their article is available. Patrons can then go to the Web to view or print their request.

A new option for electronic article delivery emerged with the development of the Odyssey component of the OCLC ILLiad management software, released in April 2003 in ILLiad version 6.2.0.1. Odyssey is a protocol that allows libraries using ILLiad to scan and send articles to one another from within the ILLiad software, essentially creating a conversation between two ILLiad servers that includes each library's transaction number as well as the OCLC interlibrary loan number, if applicable.3

Perhaps the most important innovation in electronic document delivery using Odyssey is the Trusted Sender setting. If fully enabled, a borrowing library's ILLiad server can receive an article sent from a lending library via Odyssey, convert it to PDF format, deliver the article to the web, and notify the customer, all without staff intervention. A borrowing library must decide which of three levels of staff review/trustworthiness to use when implementing Odyssey: Always, Trusted, or Never. If set at "Never," all articles must be reviewed by a staff member before conversion to PDF and notification of the patron can be made. All articles sent via Ariel are also subject to the required review process. If set at "Trusted," only articles from those libraries previously designated as Trusted Senders will be posted to a web server without staff intervention. If set at "Always," every article from every lender will be sent directly to the Web without staff intervention.
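As a minimal sketch of this three-level decision, the routing of an incoming Odyssey article might be expressed against the ILLiad database as follows. The Senders table, its IsTrustedSender column, and the local variable are hypothetical stand-ins for however ILLiad actually stores the setting, not the product's documented schema; only the key name OdysseyAutoElecDel comes from the discussion later in this article.

-- Sketch only: route incoming Odyssey articles by review level.
-- dbo.Senders and IsTrustedSender are hypothetical illustrations.
DECLARE @OdysseyAutoElecDel varchar(10)
SET @OdysseyAutoElecDel = 'Trusted'

SELECT t.TransactionNumber,
       CASE
           WHEN @OdysseyAutoElecDel = 'Always' THEN 'Deliver to Web'
           WHEN @OdysseyAutoElecDel = 'Trusted' AND s.IsTrustedSender = 1
                THEN 'Deliver to Web'
           ELSE 'Queue for staff review'      -- 'Never', or an untrusted lender
       END AS RoutingAction
FROM dbo.Transactions t
     LEFT JOIN dbo.Senders s ON t.LendingLibrary = s.Symbol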
LITERATURE REVIEW

Turnaround time is one component to consider when measuring the performance of interlibrary loan departments, as well as measuring patron satisfaction with interlibrary loan service. For this study, turnaround time is defined as the period of time beginning when a patron submits an article request and ending when the patron is notified that the article has arrived. Turnaround time has been shown to be an important component of user satisfaction, though it is not the only determining factor.4 In 1996, Weaver-Meyers and Stolt discovered that although turnaround time has little relationship to patron satisfaction, there is a strong correlation between a patron's satisfaction and their perception of timeliness, defined as the "window of usefulness."5 Fong's analysis of the unsolicited comments from the participants in the Weaver-Meyers/Stolt study indicates that there was a "strong desire for speedier delivery," at least by those participants who left free-form, anonymous comments.6 Yang's 2002 study of customer satisfaction at Texas A&M University also indicates that user expectations for prompt turnaround have not waned, but expectations vary in what is acceptable turnaround time. Confirming results reported by Weaver-Meyers and Stolt, Yang also found that users do not require their turnaround time expectations to be met in order to rate service as "Satisfactory" or "Very satisfactory."7

Despite these results, libraries remain focused on reducing turnaround time, often with the aid of new technologies. According to the 2002 ARL Assessing ILL/DD Services Survey, the time it takes from when a borrowing library sends an article request to a lender to when the borrowing library receives the item from the lender accounts for the majority of the turnaround time.8 The impact of electronic transmission (e.g., via Fax or Ariel) on reducing turnaround time has also been well-documented.9 However, because Odyssey is a relatively new delivery method, no turnaround time study has been done comparing the turnaround time of unmediated electronic delivery (Odyssey with Trusted Sender) with mediated electronic delivery methods (Ariel or Odyssey without Trusted Sender).

PARTICIPATING LIBRARIES

Indiana University Purdue University Indianapolis (IUPUI) is a public, urban research university created in 1969 as a partnership by and between Indiana and Purdue Universities, with IU as the managing partner.
Indianapolis is the capital and largest city in Indiana (population 860,000) and is centrally located in the state, approximately 180 miles southeast of Chicago, IL and 175 miles west of Columbus, OH. IUPUI has a Carnegie Classification of Doctoral/Research Universities–Intensive, with over 29,000 students and over 180 degree programs. The five libraries on the IUPUI campus (Medical, Law, Dental, Art, and University) are part of the Indiana University library system. Medical, Law, Dental, and University Library have separate interlibrary loan operations and separate OCLC symbols. The collections of the Ruth Lilly Art Library are listed under the University Library OCLC symbol, IUP, and are part of University Library's interlibrary loan operations. University Library serves approximately 26,000 students, including the undergraduate population, university administration, and all graduate and professional programs except for Law, Dentistry, and Medicine.

University Library's interlibrary loan operations consist of two full-time equivalent clerical staff and the equivalent of 2.0 full-time equivalent student workers (undergraduate and library science graduate students). There is a 0.5 full-time equivalent librarian administrator. In FY July 2004-June 2005, IUP received 9,861 interlibrary loan borrowing requests and filled 6,716, or 68%. During the same period, it received 23,677 lending requests and filled 15,317, or 65%. Turnaround time for all borrowing articles received via all methods was 7.46 days in FY 04/05.10 IUP has been using Ariel since 1998, Prospero from 2000-2003, OCLC ILLiad management software since August 2003, and the Odyssey component of ILLiad since February 2005.

Valparaiso University is a private 4-year liberal arts institution serving a primarily undergraduate student population with some graduate programs as well. The university is located in Valparaiso, Indiana, approximately 50 miles from Chicago. Valparaiso University has a Carnegie Classification of Master's Colleges and Universities I. Enrollment consists of over 3,000 undergraduate students and over 300 graduate students, including a law school with an enrollment of over 500 students. The law school is served by a separate library with its own interlibrary loan program, and is not included in this study. VU's Christopher Center for Library and Information Resources uses the OCLC symbol IVU and employs one full-time paraprofessional Interlibrary Loan Manager and 1.25 full-time equivalent student employees. The Reference Services Librarian oversees the department and assists with day-to-day operations during busy time periods, and when the interlibrary loan paraprofessional is on vacation. During the time of this study, the Librarian spent about five hours per week working in ILL.

From August 11, 2004 until August 10, 2005, IVU received 10,015 borrowing requests and filled 7,905, or 79%. During the same period, they received 3,347 lending requests and filled 2,424, or 72%.11 Turnaround time for all borrowing articles received from all sources was 10.79 days for the year listed above. For five months of this time period, IVU was not using Ariel or Odyssey, so all articles were received via surface mail or fax. IVU has been using ILLiad since August 2004, and Ariel and Odyssey since February 2005.
METHODS

IUP and IVU tracked the turnaround time for unmediated delivery via Odyssey and mediated delivery via Ariel over a three-month period from February 14 to May 15, 2005. The sample was taken during one of the busiest times of the semester to maximize the sample size. The ILLiad management system consists of a relational database, and each step in the interlibrary loan process is recorded in the database. Tracking turnaround time does not require exhaustive record-keeping by interlibrary loan staff, but rather the manipulation of the existing data using SQL queries to retrieve the information from the ILLiad database. Therefore, neither institution's interlibrary loan staff was required to alter their daily work habits in order to calculate turnaround time. The three-month period of study resulted in a combined sample of 2,195 articles.

Both institutions implemented Odyssey in February 2005. Because IVU did not have electronic delivery via any method until January of 2005, their interlibrary loan department had used a Custom Holdings path based solely on cost and location, regardless of the format of the item being supplied. In early February, in order to maximize receipt of articles via Odyssey and Ariel, the IVU Reference Services Librarian created a Custom Holdings path for photocopy requests to identify libraries that did not charge interlibrary loan fees and supplied photocopies using Odyssey or Ariel. IUP had already prioritized holdings groups who were free lenders and delivered via Ariel. Because Ariel is so widely used and Odyssey is an emerging delivery method, we decided it would be more difficult to receive articles sent via Odyssey. It was necessary to get as many articles sent via Odyssey as possible to collect enough data to make this study statistically reliable. Therefore, both institutions set their Custom Holdings paths to direct article requests to Odyssey institutions first, and then to libraries who send via Ariel. For example, the IVU photocopy Custom Holdings path was set up to identify lenders in this order:

1. Odyssey libraries located anywhere that do not charge
2. Ariel libraries located in Indiana that do not charge
3. Ariel libraries located anywhere that do not charge
4. Indiana libraries that do not use electronic delivery and that do not charge
5. Libraries in the Midwest that do not use electronic delivery and that do not charge
6. Libraries in the contiguous United States that do not use electronic delivery and that do not charge
7. Libraries in Hawaii, Alaska and Canada that do not use electronic delivery and that do not charge
8. Libraries that charge.

At the end of the three-month data-collection period, the IUP Interlibrary Loan Librarian created two SQL queries to retrieve the ILLiad transaction data. The first SQL query determined turnaround time for Odyssey with the trusted sender option turned on; in other words, unmediated turnaround time (see Appendix 1). The second query returned data to determine mediated electronic delivery turnaround time for both institutions. For this study, mediated requests are the equivalent of Ariel requests, although if an institution used Odyssey without the trusted sender option turned on, these articles would also be a part of this mediated subset (see Appendix 2).
In order to address the question of staff time devoted to electronic delivery processing, both IUP and IVU tracked the amount of time necessary to process article requests delivered electronically from other institutions by using a log sheet located at the Ariel receiving station. Each staff member who processed electronic delivery articles recorded the date, start time, stop time, number of articles delivered to the Web, and number of articles with quality problems that were not delivered to the patron. The manually created log sheets were transferred to a Microsoft Excel spreadsheet in order to determine the total time spent processing during each session, as well as the averages for the number of articles sent to the Web, the number of articles not sent to the Web, and the number of articles processed in one hour.

RESULTS

The following questions were studied:

• Is there a significant difference in the turnaround time between mediated and unmediated electronic delivery methods?
• How much staff time is devoted to processing electronically delivered articles that could be re-allocated if more articles were sent via Odyssey with Trusted Sender?

The SPSS statistical package was used to analyze the data and test if unmediated article delivery by Odyssey with Trusted Sender is significantly faster than mediated delivery by either Ariel or Odyssey without Trusted Sender. We used an independent samples t-test when analyzing the data, eliminating requests that took longer than 27 days to deliver by any method. We removed these outliers because these requests had problems unrelated to the turnaround time we were studying. This eliminated only 23 of 2,195 requests (1%). Looking at the aggregate data, requests delivered via Odyssey (unmediated delivery) had a shorter turnaround time than those delivered via Ariel (see Table 1). The data indicate a significant difference (p < 0.001) between delivery methods for the data as a whole. We then analyzed the data separately for each institution (see Table 2). The data indicate a significant difference (p < 0.001) between delivery methods for IUP alone. The difference in turnaround time by delivery methods is not significant for IVU alone.

IVU tracked staff time for electronic delivery processing over 26 sessions. The interlibrary loan paraprofessional is able to process an average of 40 articles per hour; an average of 1 article per session had quality problems and was not sent to the Web. Student workers also processed incoming Ariel articles, but their data was discarded due to improper recording. IUP's results were similar. Over the course of 66 sessions, IUP processed an average of 43 articles per hour, including an average of 1 article per session with quality problems. Using these figures, it is possible to calculate the cost savings from Odyssey delivery because of the savings in staff time. Using the lowest rate of electronic delivery staff time of 40 articles per hour and a total of 400 articles received via Odyssey, 10 hours of labor were saved between both institutions. An average wage of $7/hour would represent a savings of $70.
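The estimate can be restated as a quick calculation; this sketch only rewrites the figures given above (40 articles per hour, 400 Odyssey articles, an assumed $7/hour wage) and adds no new analysis:

-- Restating the labor-savings estimate from the paragraph above.
DECLARE @ArticlesPerHour float
DECLARE @OdysseyArticles int
DECLARE @HourlyWage money
SET @ArticlesPerHour = 40    -- lowest observed processing rate
SET @OdysseyArticles = 400   -- articles received via Odyssey during the study
SET @HourlyWage = 7.00       -- assumed average wage

SELECT @OdysseyArticles / @ArticlesPerHour AS HoursSaved,                       -- 10 hours
       (@OdysseyArticles / @ArticlesPerHour) * @HourlyWage AS EstimatedSavings  -- $70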
DISCUSSION

Although the aggregate data show that unmediated electronic delivery was faster than mediated electronic delivery at both institutions studied, this difference was significant in only one of the two institutions. One possible explanation for this difference could be that IVU processes electronically delivered requests twice as often as IUP (twice a day, as opposed to once a day), decreasing the amount of time that mediated article requests wait in the system before staff process them for delivery. The number of times per day that articles are processed is an indication of the impact of the ratio of number of staff to total request volume. IUP has 4 full-time equivalent staff to 33,538 requests, a ratio of 1:8,385. IVU has 2.25 full-time equivalent staff to 13,362 requests, a ratio of 1:5,938.

TABLE 1. Turnaround Time by Delivery Method

  Delivery Method   Mean Turnaround Time (in days)   N (Requests)   t-stat   Degrees of Freedom   Significance
  Mediated          7.3                              1,772          5.417    2,170                <0.001
  Unmediated        5.9                              400

TABLE 2. Turnaround Time by Delivery Method and Institution

  College   Delivery Method   Mean Turnaround Time (in days)   N (Requests)   t-stat   Degrees of Freedom   Significance   Number of Lenders
  IUP       Mediated          7.5                              1,319          6.615    1,513                <0.001         187
  IUP       Unmediated        5.2                              196                                                         22
  IVU       Mediated          6.8                              453            0.367    655                  0.714          124
  IVU       Unmediated        6.7                              204                                                         14

For many interlibrary loan departments, it is of utmost importance to send patrons photocopied or scanned articles that are complete and of high quality. Traditionally, articles that are checked in quality control can be intercepted before being sent to the patron in an incomplete state. Setting the OdysseyAutoElecDel key value to "Always" means that an interlibrary loan department relinquishes quality control and is trusting lending libraries to send complete articles. Does turning on the Trusted Sender setting adversely affect the quality of documents delivered to patrons? In the case of IUP, of all the 196 Odyssey-delivered articles, only 1 patron contacted the interlibrary loan office to request a re-send of an incomplete article. This is 0.5% of all Odyssey-delivered documents. Similarly, of the 1,319 staff-mediated articles that were received and supposedly reviewed by staff before being sent to the patron, 4 patrons (0.3% of all mediated documents) contacted the interlibrary loan office to ask for a re-send, despite the article passing through quality control.

What does this study say about quality control and electronic delivery of articles? We speculate that the reasons for the few requests for re-sends of Odyssey-delivered documents are one or both of the following. The Odyssey-delivered articles may actually be of high quality. Unless we were to examine each Odyssey-delivered article to confirm that the articles are complete, we cannot prove this. If the quality is high, one possible explanation is that Odyssey senders are philosophically invested in the promise of Odyssey with Trusted Sender as a method of article delivery that reduces staff time and turnaround time. Odyssey lenders are also aware of the potential for the article to bypass the review process and go directly to the borrowing library's patron, and may therefore exercise extra caution when sending. Another explanation may be that the unmediated electronic documents are of no higher quality than mediated electronic documents, but for some reason, patrons do not complain. Yang's customer satisfaction study shows that quality is only one component of user satisfaction with interlibrary loan service.
In fact, when asked to choose reasons they are satisfied with interlibrary loan and document delivery services, "The quality of the scanned item is good" was rated as a reason for satisfaction only 34.3% of the time, eighth behind other reasons such as electronic delivery, no cost, turnaround time, convenience, e-mail notification of request availability, request tracking, and helpful interlibrary loan staff.12 Perhaps patrons assume articles are of high quality and do not emphasize this when stating the reasons they are satisfied, but another explanation could be that interlibrary loan departments overestimate the importance of error-free article delivery to patrons. A word cut off at the end of a sentence might not have enough of an impact to warrant contacting the interlibrary loan office to request a re-send, or the anticipated time delay in requesting the re-send is a deterrent. The patron status may also indicate the prevalence of this reason. All of the re-send requests at IUP came from faculty members. A future study measuring satisfaction as it relates to a potential tradeoff between turnaround time and quality of items received would be helpful for libraries to consider when deciding to turn on Odyssey with Trusted Sender.

CURRENT ENVIRONMENT AND FUTURE IMPLICATIONS

A new Odyssey scanning interface may hold the key to wider implementation. As of August 2005, the current Odyssey scanning interface in ILLiad 7.0.3.0 does not allow for advanced manipulation of scanned images. Interlibrary loan staff currently cannot re-scan a single page as needed, but rather must re-scan the entire article from page one. However, with a future release of ILLiad 7.1 scheduled in the fourth quarter of 2005, these barriers may no longer be a factor. After the ILLiad 7.1 release, the only barrier to obtaining a turnaround time and/or cost benefit from Odyssey with Trusted Sender is each interlibrary loan department's ability to relinquish the control gained from mediated electronic delivery. Further enhancements to ILLiad that would give borrowing libraries another level of quality control would include an option to turn off Trusted Sender unmediated delivery for specific patrons. This would give libraries the choice to prohibit Trusted Sender articles from being sent to specific patrons or patron types (e.g., faculty) that are known to have stringent quality requirements.

At its core, this study shows the differences in turnaround time for two interlibrary loan departments from two different types of academic libraries, one using a population of 199 lenders (IUP) and the other using a population of 133 lenders (IVU). This is a small number of lenders in the total population of possible lenders, or even of all lenders used by a borrowing library in one year. As of writing, there were 732 units listed in the "ILLD" group in the OCLC Interlibrary Loan Policies Directory, which is automatically set when a library licenses ILLiad. Of these, 73 (10%) identify themselves as delivering via Odyssey.
Of all costs associated with interlibrary lending and bor- rowing, staff costs represent the largest component of interlibrary loan unit costs.14 Both institutions in this study recorded a similar amount of staff time devoted to the manual process of quality checking Ariel articles and delivering them to the Web using ILLiad, between 40 and 43 articles per hour. If the time devoted to mediated electronic article delivery pro- cessing could be reduced by switching to Odyssey with Trusted Sender, the cost savings could be dramatic. The number of ILLiad libraries and the amount of traffic on the OCLC Resource Sharing system that origi- nates from ILLiad libraries is not inconsequential. For example, of all interlibrary loan requests that were placed on the OCLC Resource Shar- ing system in January, 2005, 42% originated in ILLiad systems.15 Al- though the number of requests varies each month, approximately 1 million requests were placed in October 2004, so it is not difficult to estimate that the number of requests that are coming from and going to other ILLiad libraries is in the hundreds of thousands.16 Reducing the staff time for these requests could represent a considerable savings. More specific data about the type and origin of requests on the OCLC Resource Sharing, or a larger study across multiple/high volume institu- tions could further serve to quantify the impact of Odyssey with Trusted Sender. CONCLUSION Unmediated delivery via Odyssey with Trusted Sender was faster than mediated delivery via Ariel for the combined data set collected during this study. In addition, mediated delivery via Ariel required con- siderable staff time that unmediated delivery via Odyssey with Trusted Sender did not. Ariel is at a disadvantage because there is no unmedi- ated option available. In order to take advantage of the cost and time benefits of unmediated delivery, ILLiad libraries should use Odyssey with Trusted Sender. As more libraries adopt Odyssey, the advantages to libraries using unmediated delivery will increase. Ruth S. Connell and Karen L. Janke 51 NOTES 1. Mary E. Jackson, Bruce R. Kingma, and Tom Delaney, Assessing ILL/DD Ser- vices: New Cost-Effective Alternatives (Washington, DC: Association of Research Li- braries, 2004), 52. 2. Infotrieve, Inc. Version 2.x: Frequently Asked Questions. January 2003. Infotrieve, Inc. 13 August, 2005 . 3. Powell, Genie. “Odyssey questions.” E-mail to Karen Janke. 4 Aug 2005. 4. Zhen Ye (Lan) Yang, “Customer Satisfaction with Interlibrary Loan Ser- vice–deliverEdocs: A Case Study,” Journal of Interlibrary Loan, Document Delivery & Information Supply 14, no. 4 (2004). doi:10.1300/J110v14n04_07; Sonja Landes, “Interlibrary Loan Survey: State University of New York College at Geneseo Librar- ies,” Journal of Interlibrary Loan, Document Delivery & Information Supply 11, no. 4 (2001). doi:10.1300/J110v11n04_06; Patricia L. Weaver-Meyers and Wilbur A. Stolt, “Delivery Speed, Timeliness and Satisfaction: Patrons’ Perceptions About interlibrary loan Service: Customer Satisfaction in GMRLC Libraries,” Journal of Library Admin- istration 23, no. 1-2 (1996). 5. Weaver-Meyers and Stolt, 35-37. 6. Fong, “The Value of Interlibrary Loan: An Analysis of Customer Satisfaction Comments,” Journal of Library Administration 23, no. 1/2 (1996): 49. 7. Yang, 85-87. doi:10.1300/J110v14n04_07. 8. Jackson, Kingma, and Delaney, 53. 9. Mary K. 
9. Mary K. Sellen, "Turnaround Time and Journal Article Delivery: A Study of Four Delivery Systems," Journal of Interlibrary Loan, Document Delivery & Information Supply 9, no. 4 (1999). doi:10.1300/J110v09n04_08; Jackson, Kingma, and Delaney, 52.
10. Fiscal year statistics are from the reports that are part of the ILLiad system. The reports used for borrowing statistics were Fill Rate Statistics and Turnaround Time. The Fill Rate Statistics report was used to calculate the lending statistics.
11. The IVU statistics do not follow the traditional fiscal year dates due to a move to a new library building and the implementation of ILLiad during the first month of the fiscal year.
12. Yang, 84. doi:10.1300/J110v14n04_07; In Yang's survey, 213 responses were collected for the question "If you are satisfied with Interlibrary Services and deliverEdocs, it is because:" and respondents were encouraged to choose as many answers as applicable. The reasons that rated lower than "The quality of scanned item is good" were: I can resubmit the requests online (32.7%, n = 70), I can renew the requests online 24/7 (28.6%, n = 61), and I can cancel my requests online 24/7 (24.9%, n = 53).
13. OCLC, Inc. OCLC Interlibrary Loan Policies Directory. 2002-2005. OCLC, Inc. 1 August 2005 <http://illpolicies.oclc.org>.
14. Jackson, Kingma, and Delaney, 29-33.
15. Kriz, Harry. "ILLiad 8th Anniversary." International OCLC ILLiad Meeting. OCLC Headquarters, Dublin, Ohio. March 17, 2005.
16. OCLC, Inc. 140 Millionth Interlibrary Loan Request Created. 6 December 2004. OCLC, Inc. August 25, 2005 <http://www.oclc.org/news/announcements/announcement138.htm>.

REFERENCES

Fong, Yem S. "The Value of Interlibrary Loan: An Analysis of Customer Satisfaction Comments." Journal of Library Administration 23, no. 1/2 (1996): 43-54.
Infotrieve, Inc. Version 2.x: Frequently Asked Questions. January 2003. Infotrieve, Inc. 13 August 2005 <http://www.Infotrieve.com/Ariel/Arifaq1x2x.Html#2>.
Jackson, Mary E., Bruce R. Kingma, and Tom Delaney. Assessing ILL/DD Services: New Cost-Effective Alternatives. Washington, DC: Association of Research Libraries, 2004.
Landes, Sonja. "Interlibrary Loan Survey: State University of New York College at Geneseo Libraries." Journal of Interlibrary Loan, Document Delivery & Information Supply 11, no. 4 (2001): 75-80.
OCLC, Inc. OCLC Interlibrary Loan Policies Directory. 2002-2005. OCLC, Inc. 1 Aug. 2005 <http://illpolicies.oclc.org>.
Powell, Genie. "Odyssey Questions." ed. Karen L. Janke. Virginia Beach, VA, 2005.
Sellen, Mary K. "Turnaround Time and Journal Article Delivery: A Study of Four Delivery Systems." Journal of Interlibrary Loan, Document Delivery & Information Supply 9, no. 4 (1999): 65-72.
Weaver-Meyers, Patricia L., and Wilbur A. Stolt. "Delivery Speed, Timeliness and Satisfaction: Patrons' Perceptions About Interlibrary Loan Service: Customer Satisfaction in GMRLC Libraries." Journal of Library Administration 23, no. 1-2 (1996): 23-42.
Yang, Zhen Ye (Lan). "Customer Satisfaction with Interlibrary Loan Service–deliverEdocs: A Case Study." Journal of Interlibrary Loan, Document Delivery & Information Supply 14, no. 4 (2004): 79-94.

Received: 08/13/05
Revised: 08/28/05
Accepted: 09/30/05

APPENDIX 1.
SQL Query for Unmediated Turnaround (Odyssey with Trusted Sender)

USE ILLData
-- Elapsed minutes from request creation to web delivery for borrowing
-- article requests that never entered staff-mediated electronic delivery
-- processing (i.e., unmediated Odyssey deliveries).
SELECT DISTINCT t.TransactionNumber, u.Status, t.LendingLibrary,
       datediff(mi, a.DateTime, b.DateTime) AS TimeInMin
FROM dbo.Transactions t
     LEFT JOIN dbo.UsersALL u ON t.UserName = u.UserName
     LEFT JOIN dbo.Tracking a ON t.TransactionNumber = a.TransactionNumber
     LEFT JOIN dbo.Tracking b ON a.TransactionNumber = b.TransactionNumber
     LEFT JOIN dbo.History h ON t.TransactionNumber = h.TransactionNumber
WHERE (a.ChangedTo = 'Submitted by Customer'
       OR a.ChangedTo = 'Request Added through Client'
       OR a.ChangedTo LIKE 'Imported from%')          -- request creation event
  AND b.ChangedTo LIKE 'Delivered to Web'             -- delivery event
  AND t.TransactionNumber NOT IN
      (SELECT DISTINCT TransactionNumber FROM Tracking
       WHERE ChangedTo = 'In Electronic Delivery Processing')
  AND t.TransactionNumber NOT IN
      (SELECT DISTINCT TransactionNumber FROM History
       WHERE Entry = 'EMail Added: Requested Item Delivered Electronically')
  AND t.RequestType = 'Article'
  AND t.ProcessType = 'Borrowing'
  AND b.DateTime >= '2/14/2005' AND b.DateTime < '5/15/2005'
  AND u.NVTGC = 'IUP'
ORDER BY t.TransactionNumber

APPENDIX 2. SQL Query for Mediated Electronic Delivery (Ariel or Odyssey without Trusted Sender)

USE ILLData
-- Same measurement, restricted to requests that did pass through staff
-- electronic delivery processing (mediated deliveries).
SELECT DISTINCT t.TransactionNumber, u.Status, t.LendingLibrary,
       datediff(mi, a.DateTime, b.DateTime) AS TimeInMin
FROM dbo.Transactions t
     LEFT JOIN dbo.UsersALL u ON t.UserName = u.UserName
     LEFT JOIN dbo.Tracking a ON t.TransactionNumber = a.TransactionNumber
     LEFT JOIN dbo.Tracking b ON a.TransactionNumber = b.TransactionNumber
     LEFT JOIN dbo.History h ON t.TransactionNumber = h.TransactionNumber
WHERE (a.ChangedTo = 'Submitted by Customer'
       OR a.ChangedTo = 'Request Added through Client'
       OR a.ChangedTo LIKE 'Imported from%')          -- request creation event
  AND b.ChangedTo LIKE 'Delivered to Web'             -- delivery event
  AND t.TransactionNumber IN
      (SELECT DISTINCT TransactionNumber FROM Tracking
       WHERE ChangedTo = 'In Electronic Delivery Processing')
  AND t.RequestType = 'Article'
  AND t.ProcessType = 'Borrowing'
  AND b.DateTime >= '2/14/2005' AND b.DateTime < '5/15/2005'
  AND u.NVTGC = 'IUP'
ORDER BY t.TransactionNumber
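Both queries report elapsed time in minutes (TimeInMin). As a minimal sketch of the conversion to the mean turnaround in days reported in Tables 1 and 2, assuming the query output has been saved into a hypothetical TurnaroundResults table:

USE ILLData
-- TurnaroundResults is a hypothetical holding table for the TimeInMin
-- values produced by the appendix queries; 1,440 minutes = 1 day.
SELECT AVG(CAST(TimeInMin AS float)) / 1440.0 AS MeanTurnaroundDays
FROM TurnaroundResults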
work_y3rcecbmvbgyzpexok2lvc2xie ----

Retrieval Issues for the Colorado Digitization Project's Heritage Database

D-Lib Magazine
October 2001
Volume 7 Number 10
ISSN 1082-9873

William A. Garrison
University of Colorado, Boulder
garrisow@colorado.edu

Abstract

The Colorado Digitization Project (CDP), begun in the fall of 1998, is a collaborative initiative involving Colorado's archives, historical societies, libraries, and museums. The project is creating a union catalog of metadata records and has developed tools for the creators of metadata records, the assignment of subject headings, and the use of name headings. The CDP is also investigating the use of Dewey Decimal Classification numbers through WebDewey to allow linkage of general subject terms and highly specialized subject terms within a subject browse feature of the union catalog.

1. Project Overview

The Colorado Digitization Project (CDP), a collaborative of Colorado's archives, historical societies, libraries, and museums, has undertaken an initiative to increase user access to the special collections and unique resources held by these institutions. Through digitization and distribution via the Internet, the CDP is creating a virtual digital collection of resources to provide the people of Colorado access to the rich historical, scientific and cultural resources of their state. The virtual collection will include such resources as letters, diaries, government documents, manuscripts, music scores, and digital versions of exhibits, artifacts, oral histories, and maps [1]. Project participants can contribute content that has been reformatted into digital format as well as content that was born digital.

2. Development of a Union Catalog of Metadata

Since a key objective of the project is to increase access to digital collections, the first effort undertaken was identifying the approaches used by the existing project participants to provide access to their collections. Even at the early stages of the project, it became clear that the participating institutions used different approaches and that there was no dominant standard or approach used by all. The CDP reviewed the differing approaches for common elements and current and emerging standards, including Encoded Archival Description (EAD), MARC, Government Information Locator Service (GILS), Dublin Core (DC), Visual Resources Association (VRA), etc. As web searching would not provide the desired access, and a single centralized metadata and image system would not be politically or financially feasible, the CDP recommended the development of a union catalog of metadata to provide the desired level of access, hoping that future developments in web searching would negate the long-term need for the union catalog. Based on an analysis of the metadata standards, the CDP adopted the Dublin Core/XML metadata standard for the union catalog.
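For illustration, a minimal sketch of what a Dublin Core record serialized in XML might look like follows. The record wrapper, the element selection, and all values (an imagined photograph) are hypothetical, and are not taken from the CDP's actual record syntax; only the Dublin Core element namespace is standard.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: the wrapper element and all values are hypothetical. -->
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>East Portal of the Moffat Tunnel</dc:title>
  <dc:creator>Unknown photographer</dc:creator>
  <dc:subject>Tunnels</dc:subject>
  <dc:description>Black-and-white photograph of the tunnel entrance.</dc:description>
  <dc:date>1927</dc:date>
  <dc:format>image/jpeg</dc:format>
  <dc:identifier>http://example.org/images/moffat-east-01.jpg</dc:identifier>
</record>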
Creation of a union catalog of metadata involves a wide range of issues. This paper focuses primarily on the retrieval issues involved. CDP cultural heritage institutions represent many specialized institutions, for example the Florissant Fossil Beds National Monument with its large collection of unique fossils, the Crow Canyon Archaeological Center with its collection of archaeological materials, and the Boulder History Museum with its collection of more than 4,000 costumes and accessories. The first two institutions use taxonomies from their specialized fields to provide subject access to their collections, while the Boulder History Museum uses Chenhall's classification system [2]. At the same time, some of the smaller, more general collections in the CDP contain the same types of resources or subjects but use a more generalized subject heading list for subject analysis, such as the Library of Congress Subject Headings (LCSH), or they may use only uncontrolled vocabulary for subject access.

The CDP union catalog will provide access to the entire range of subject terms used by participating institutions and will do so without an authority control system. As a result, unless the user knows both the general and specialized taxonomy, retrieval will be limited to term input. Name headings, both personal and corporate, also present a unique challenge. For the most part, libraries will use authorized forms of headings (i.e., headings appearing in the authority files in OCLC, RLIN, or other authority files) or headings established according to AACR2. Other types of institutions may have no resource or authority files to use and will use only the forms of headings available to them. As a result of these differences in name headings used, there will most likely be multiple forms of headings for a single person or corporate body, and there may be a single heading for multiple persons (i.e., non-unique, unestablished name headings).

3. Colorado Term and Name Lists

While no overall authority control exists as part of the CDP union catalog, two specific areas of authority control are being addressed. In order to assure some level of consistency in terminology, the CDP has developed a list of Colorado terms that a user can use to perform web searches. This list includes terms for Colorado geographic names and Library of Congress (LC) subject headings with the word Colorado or the abbreviation Colo. appearing in the subject string. Users can search the list by specific term or can browse the list. The term list is being generated from subject headings in the Prospector database, which is a union catalog reflecting the collections and holdings of sixteen major public and academic research libraries in Colorado and Wyoming. The term list may be searched directly, or it may be searched while creating or modifying records on the CDP input site [3].
CDP Input Site   The CDP has also created a name list of Colorado authors (both personal and corporate) using the Prospector database. The name list has been much more difficult to create than the subject term list. As with the subject list, the user has the option of searching the Prospector catalog from the CDP data entry system to retrieve names or to use the name list. The CDP is creating the subject term list and the names list to achieve consistency in terminology used in the union catalog even though the CDP recognizes that some project participants will not use the lists [4]. Figure 2 shows how a search from the subject list might look, and Figure 3 illustrates a search from the names list. Figure 2. Sample Subject Term Search   Figure 3. Sample Name Search   The term list and the name list contain no duplicate entries; however, it should be noted that the lists contain some errors (e.g., Colorado as a whole word when the abbreviation Colo. should have been used and occasional misspellings of terms). The CDP has not yet begun to deal with the errors in these lists because as long as the database remains relatively small, putting the access points in the Heritage database under authority control is not a priority issue. Nevertheless, as the database grows the issue of authority control will loom larger. 4. The Union Catalog's Use of Dublin Core The CDP has named its union catalog Heritage, and it uses the OCLC SiteSearch software for the public interface. Project participants may use ftp (file transfer protocol) to send records to the CDP for loading into the Heritage database or participants may create records in a metadata input system developed and programmed by the CDP. Records created in the input system are marked for transfer to Heritage when complete. Participants contributing records in MARC or another format provide the CDP with a profile so that the fields coming into the CDP database can be mapped to the CDP Dublin Core record format [5]. It should be noted that the CDP has departed slightly from the DC elements as currently defined by the W3C [6]. A core set of mandatory elements has been defined similar to the "core record" developed by the Program for Cooperative Cataloging. The mandatory elements are Title, Creator, Subject, Description, Identifier, Date and Format. 5. Links from the Union Catalog to Digital Images The CDP does not store the actual digital image files, but instead provides links from the union catalog to the images residing on the project particpants' servers. Participants may create metadata records for each image residing locally, or they may create one metadata record for a "collection" consisting of many images. For example, one of the participating sites stores several digital images of the Moffat Tunnel but has created only one metadata record and access point for the "collection" of tunnel images. In this example, the "collection" includes four images of the East Portal of the Moffat Tunnel and four images of the West Portal. The user can view eight images of the Moffat Tunnel but retrieves only one metadata record. Figures 4 - 6 below illustrate a search for the Moffat Tunnel records in Heritage, the resulting display record, and the appropriate website link. Figure 4. Moffat Tunnel Search   Figure 5. Moffat Display Record   Figure 6. Moffat Tunnel Website Link   The Moffat Tunnel display record in Figure 5 provides an excellent example of the authority control problems faced by the CDP. 
The first subject entry reads "Moffat Tunnel" when the correct subject heading from LCSH is "Moffat Tunnel (Colo.)." The second subject entry displayed is "Denver and Rio Grande Railroad, Moffat Road," which has no equivalent in LCSH other than the corporate entry for "Denver and Rio Grande Railroad Company," and even that probably should be under the later name of "Denver and Rio Grande Western Railroad Company" for the years 1923-1927. The third subject entry is "Railroad Construction and Tunnel Construction," which in LCSH would be split into two subject headings: "Railroads--Design and construction" and "Tunneling." Bringing all these headings under authority control would be desirable but would require a considerable amount of human intervention, something the CDP is not currently staffed to perform. The above example, of course, assumes that the controlled vocabulary of choice is LCSH. 6. Subject Retrieval and Mapping to Dewey Decimal Classification The CDP proposes to map the subject terms in the Heritage database to Dewey Decimal Classification numbers. Currently, if project participants input Dewey numbers into the metadata, they must either use a printed version of Dewey and manually key the Dewey number or must search WebDewey to obtain a classification number and then type or copy/paste the numbers into the metadata record. A model such as the one developed for use in the Cooperative Online Resource Catalog (CORC) seems desirable for the CDP to follow [7]. Automatic machine classification needs to be further developed before it can be implemented in a project like the CDP. The classification process as described by Hickey and Vizine-Goetz, where the Dewey schedules may be displayed with scope notes, related topics, and references to the DDC manual, provides a tremendous tool for the classifier. The ability to index and search DDC numbers in a catalog like Heritage may also prove useful [8]. However, the use of the actual classification schedules with notes and references to the manual would most likely not help the public patron searching the CDP database. For public patrons, a display of terms with context would be more useful. The example below models how the retrieval might look in the CDP catalog for a user entering the subject search term "gold". Search term: Gold     Results and context No. of hits Gold mines and mining 3 Gold prospecting/prospectors 4 Gold tableware 1 Gold coins 2 When the term "gold" is entered, the system retrieves the following Dewey Decimal Classification numbers: 622.3422 (Gold mines and mining), 622.1841 (Gold prospecting), 739.2283 (Gold tableware), and 737.43 (Gold coins). The classification numbers would have matched or been mapped to terms in the metadata records. The end user does not see the DDC numbers retrieved but instead views the terms/context needed to select records that match his or her needs. Using this model, the user retrieves records for digital objects by subject regardless of subject thesaurus or taxonomy used. The CDP also needs to address the issue of which display terms should appear to the user. The terms appearing in the example above do not necessarily match the terms as used in the Dewey schedules. "Gold mines and mining" appears as 622.3422 in the schedules with the caption text "*Gold". Simply displaying "*Gold" to the end user is not going to be meaningful, as the context is not clear without the hierarchy. 
The CDP also needs to address the issue of which display terms should appear to the user. The terms appearing in the example above do not necessarily match the terms as used in the Dewey schedules. "Gold mines and mining" appears as 622.3422 in the schedules with the caption text "*Gold". Simply displaying "*Gold" to the end user is not going to be meaningful, as the context is not clear without the hierarchy. Therefore, a method needs to be developed to get the display term to include the hierarchical context as it would appear in the Dewey schedule. Figure 7 shows the WebDewey display for "Technology" in which the term "*Gold" appears.

Figure 7. WebDewey Display of Search for Term "Technology"

There is a particular problem with mapping subject terms in the Heritage database to DDC numbers for an institution such as the Florissant Fossil Beds National Monument that uses a specific taxonomy to classify digital images of fossils. The DDC numbers for fossil invertebrates in class number 569 displayed in Figure 8 below provide an example of the detailed breakdown with taxonomic names in the schedules.

Figure 8. Dewey Decimal Classification Numbers for Fossil Invertebrates
One solution might be to develop a link to a system or site with GIS data to achieve this expansion.

7. Conclusions

For the foreseeable future, retrieval problems will continue in Heritage. No easy or simple solutions to the authority control problems exist. Dewey Decimal Classification seems to offer the most promising method for subject retrieval in databases like Heritage that contain subject vocabulary from multiple thesauri. Further testing will be necessary to determine the feasibility of using Dewey Decimal Classification for subject retrieval as described in this article. Retrieval and authority control issues for names (both personal and corporate) will be somewhat more difficult to resolve without human intervention and control. The possible statewide NACO/SACO project to create name headings and subject headings through the Program for Cooperative Cataloging might provide a solution to the problem of access and retrieval in Heritage. Heritage has just been opened to public view [9]. The author looks forward to the research continuing as the Heritage database is used by participating institutions and public patrons [10].

Notes and References

[1] Liz Bishoff and William A. Garrison, "Metadata, Cataloging, Digitization, and Retrieval: Who's Doing What to Whom: The Colorado Digitization Project Experience," Proceedings of the Bicentennial Conference on Bibliographic Control for the New Millennium (Washington, D.C.: Library of Congress, Cataloging Distribution Service, 2001), 377.
[2] James R. Blackaby and Patricia Greeno, The Revised Nomenclature for Museum Cataloging: A Revised and Expanded Version of Robert G. Chenhall's System for Classifying Man-Made Objects (Nashville, TN: American Association for State and Local History, 1988).
[3] For more information on the Prospector database, see: Carmel Bush, William A. Garrison, George Machovec, and Helen I. Reed, "Prospector: A Multivendor, Multitype, and Multistate Western Union Catalog," Information Technology and Libraries 19, no. 2 (June 2000): 71-83.
[4] Both the subject term and name term lists may be found at: .
[5] The CDP Dublin Core element set may be found at: .
[6] The Dublin Core element set may be found at: .
[7] Thomas B. Hickey and Diane Vizine-Goetz, "The Role of Classification in CORC," Annual Review of OCLC Research, 1999 [on-line]; available from ; Internet; accessed 28 September 2001.
[8] Karen Markey Drabenstott, "Dewey Decimal Online Classification Project: Integration of a Library Schedule and Index into the Subject Searching Capabilities of an Online Catalogue," International Cataloguing 14 (July 1985): 31-34.
[9] Heritage may be accessed from the CDP website by clicking on "Heritage" at: or from the Access Colorado Library and Information Network (ACLIN) Colorado Virtual Library website under the "Create your own group" section at .
[10] Portions of this paper were presented at and appear in "Subject Retrieval in a Networked Environment: Papers presented at an IFLA Satellite Meeting sponsored by the IFLA Section on Classification and Indexing & IFLA Section on Information Technology, OCLC, Dublin, Ohio, USA, 14-16 August 2001."

Copyright 2001 William A. Garrison
DOI: 10.1045/october2001-garrison

work_yclmweznpfc3fpozqydipx5waa ---- HKUST Institutional Repository

Building an institutional repository: sharing experiences at the HKUST Library

Ki-Tat LAM and Diana L. H. CHAN

The authors: Ki-Tat Lam is the Head of Library Systems at HKUST Library. Diana L. H. Chan was the Head of Reference at HKUST Library and is now the Associate Librarian of Public Services at the City University of Hong Kong.

Keywords: Institutional repositories, open access, self-archiving rights, content recruitment, DSpace, digital libraries, HKUST

Abstract

Purpose - To document HKUST's experiences in developing its Institutional Repository and to highlight its programming developments in full-text linking and indexing, and cross-institutional searching.

Design/methodology/approach - This paper describes how HKUST Library planned and set up its Institutional Repository, how it acquired and processed the scholarly output, and what procedures and guidelines were established. It also discusses some new developments in systems, including the implementation of OpenURL linking from the pre-published version in the Repository to the published sources; the partnership with Scirus to enable full-text searching; and the development of a cross-searching platform for institutional repositories in Hong Kong.

Findings - It illustrates what and why some policy issues should be adopted, including paper versioning, authority control, and withdrawal of items. It discusses what proactive approaches should be adopted to harvest research output. It also shows how programming work can be done to provide usage data, facilitate searching and publicize the repository so that scholarly output can be more accessible to the research community.

Practical implications - Provides a very useful case study for other academic libraries who want to develop their own institutional repositories.

Originality/value - HKUST is an early implementer of institutional repositories in Asia, and its unique experience in policy issues, harvesting contents, standardization, software customization, and measures adopted in enhancing global access will be useful to similar institutions.

Introduction

The Hong Kong University of Science and Technology (HKUST) is a young institution opened in October 1991. It offers taught and research programs in science, engineering, business, humanities and social science, with 430 full-time faculty members, 5,600 undergraduates and 3,200 postgraduates. Despite its short history, HKUST has rapidly evolved into a world-class institution, and was ranked number 43 in the world by The Times Higher Education Supplement in 2005.

The HKUST Library has been engaged in library digitization projects since its foundation 15 years ago, including the early project on Course Reserve Imaging in 1993 and the CJK (Chinese, Japanese, Korean) capable systems for Digital University Archives and Electronic Theses in 1997. The experiences gained through these projects have facilitated a smooth creation of its Institutional Repository. The Library showed its early support to the open access concept by joining SPARC in 2001.
And in November 2002, Kimberly Douglas, the University Librarian of the California Institute of Technology, was invited to the Library to give a staff development workshop on E-prints, OAI (Open Archives Initiative) and institutional repositories. The Library decided to build the HKUST Institutional Repository after the workshop, aiming to create a permanent record of the institution's scholarly output in digital format, and to make the Repository globally and openly accessible.

The HKUST Institutional Repository (see Figure 1) was launched in February 2003 with its first batch of 105 computer science technical reports. It has grown to 2,369 documents from 42 academic departments as of September 2006, holding preprints, technical reports, working papers, conference papers, journal articles, presentations, book chapters, patents, and PhD theses. They are mainly PDF files with some PowerPoint and program files. These scholarly works were accessed 69,000 times, excluding robots, in the period from September 2005 to August 2006. A research study on how Hong Kong Chinese students learned English was downloaded 800 times in just a month.

Figure 1. Home page of the HKUST Institutional Repository (http://library.ust.hk/repository/)

The HKUST Institutional Repository was reported in a number of conference presentations (Chan 2004a; Chan 2004b; Lam 2004; Lam 2006) and journal articles (Chan, Kwok, Yip 2005; Kwok, Chan, Wong 2006). This paper will summarize and update the issues discussed in these reports, including how to plan and set up the Repository, how to acquire and process the scholarly output, and what procedures and guidelines were established. It will also discuss some of the new developments, including the implementation of OpenURL linking from the pre-published version in the Repository to the published sources; the partnership with Scirus to enable full-text searching; and the development of the HKIR, the cross-searching platform for institutional repositories in Hong Kong.

Planning Stage

The Library adopted a bottom-up approach for building its Institutional Repository. As the concept of the institutional repository was quite new in Hong Kong in the year 2002, it would be easier to begin the project on a small scale but with gradual expansion in scope and institutional participation. With a tangible repository at hand, the Library could approach the faculty and university administration, and explain and demonstrate what an IR was and how they could benefit from it. Another advantage of starting small was that the investment of resources would be relatively small as compared to a large-scale project, and the Library could have more freedom to test the water before moving on.

Establishing a Task Force

The project began with the establishment of a task force in December 2002, consisting of four librarians from the Reference and Systems departments and the Associate University Librarian. The task force's charge was to identify the issues involved in creating the IR, to evaluate and select the software for hosting the Repository, and to develop action plans. Findings and recommendations were reported to the Library Administration Committee, the main decision-making body, for approval. A number of policy issues were resolved during the early stage of planning. For example:

- Make the IR totally open and accessible to the world. If a faculty member wishes to restrict access, then the document will not be accepted.
- The IR is a deposit of research documents, not merely an index with links to external sources; if the Library does not have the right to deposit the full-text papers, they cannot be included in the IR.
- Undertake retrospective work to include documents previously published in addition to the current ones.
- Do not include ephemeral materials such as faculty-produced course notes, popular works or feature columns from newspapers, but limit the coverage to published material and grey literature only.
- Allow authors to submit documents online; they will sign a permission agreement granting the Library non-exclusive distribution rights.
- Adopt Adobe's PDF format as the default document format.
- Build a single database, and do not have multiple databases such as the model adopted by the California Institute of Technology.

Selecting IR Software

While there are many options for selecting IR software and hosting services today, the choices were extremely limited in the year 2002. Like many of its digital library projects, the Library decided to use open source software for its IR. The main advantage of open source software was that it provided flexibility for local customization and feature enhancements. The significant software cost savings was also a consideration, as the Library did not receive extra funding for the IR project.

The task force decided to focus on open source software that supported OAI-PMH (Open Archives Initiative - Protocol for Metadata Harvesting). Two such IR software programs were evaluated, namely EPrints and DSpace. EPrints was developed by the University of Southampton and was widely used by IR implementers in 2002. DSpace was jointly developed by MIT Libraries and the Hewlett-Packard Company, and had its first release on SourceForge at the time of the Library's evaluation. DSpace was developed with experience gained from EPrints, but with a clever move from the Perl programming language to Java and Servlets. And at that time, it also had better Unicode support, which was essential to a Repository that would contain Chinese materials. With the above considerations, the Library decided to adopt DSpace.

Once DSpace was selected, the Library began to develop an initial prototype, using the 105 working papers from the Department of Computer Science, which were freely available on their website in PostScript format. During this prototyping, a number of design issues were resolved, including how to organize the documents by departments and by document types, and what fields were required in the metadata.

Staffing

As there was no extra funding and manpower provided for the project, the Library relied on existing library staff to create and maintain the Repository. In addition to the initial planning and system setup done by the task force, an IR Team of eight reference librarians and five data entry staff was established to handle ongoing work such as faculty liaison, document acquisition and processing, and the actual data input tasks. It was estimated that about 350 man-days for librarians and another 350 man-days for support staff were spent in the first three years of the project. It was found that the time and efforts taken to acquire and process the content so as to achieve a critical mass were quite substantial. Institutions that are interested in creating IRs should be aware of the staffing implications and should request sufficient funding for the project.
Organizing the Repository

In DSpace, a repository is made up of a hierarchy of communities, collections, items, metadata and bitstreams. A document is represented by an item, which contains metadata, i.e., a description of the document, and a bundle of bitstreams, such as PDF and PowerPoint files, that hold the actual content of the document. Items are held in collections, which are further grouped under communities. Documents in the HKUST Institutional Repository are organized by academic departments (communities), and within a department, they are further grouped according to the document types (collections), such as conference papers and journal articles. As of September 2006, the Repository has 2,369 documents in 42 communities and 139 collections. As expected, disciplines which have an established tradition of sharing preprints and working papers, such as computer science and engineering, are ranked as top contributors to the Repository (see Table 1). Conference papers, journal articles, preprints, working papers and doctoral theses constitute the major document types held in the Repository (see Table 2).

Table 1. Top 10 contributing departments (September 2006)

  Academic Departments and Centers         Size   Percentage
  Computer Science                          478   20.2%
  Electrical and Electronic Engineering     324   13.7%
  Mechanical Engineering                    164   6.9%
  Marketing                                 154   6.5%
  Mathematics                               130   5.5%
  Physics                                   126   5.3%
  Chemistry                                 106   4.5%
  Social Science                            106   4.5%
  Biology                                    84   3.5%
  Language Center                            84   3.5%
  Others                                    613   25.9%
  Total                                    2369   100.0%

Table 2. Total number of documents by document types (September 2006)

  Document Types                                                  Size   Percentage
  Conference papers                                                636   26.8%
  Working papers, technical papers, research reports, preprints    549   23.2%
  Journal articles                                                 537   22.7%
  Doctoral theses                                                  473   20.0%
  Presentations                                                     70   3.0%
  Patents                                                           58   2.4%
  Book chapters                                                     38   1.6%
  Others                                                             8   0.3%
  Total                                                           2369   100.0%

The metadata of the documents is encoded in the qualified Dublin Core schema. DSpace's default DC registry was followed, except for a locally defined qualifier, openurl, for the element identifier. The purpose of defining this local field will be discussed later in this paper.

Document Submission and Processing

To make document submission as simple and effortless as possible, the Library decided to develop its own Faculty Submission Form, a web-based interface outside of the DSpace workflow. Faculty members are only required to input minimal bibliographic data, such as title, author and citation source, when submitting the actual files to the server. The Form contains a Non-Exclusive Distribution License (Figure 2). They need to check the "I Agree" box to grant permission to the Library. The IR Team then verify and enhance the metadata, ascertain publishers' policies to avoid depositing the wrong versions, harvest and convert the files to PDF format as needed, and add the documents to the Repository. A web-based Add Item program was also developed for the IR Team so that they can integrate these document submission and processing tasks seamlessly with DSpace.

Figure 2. Non-Exclusive Distribution License in the Faculty Submission Form

Harvesting Research Output

While some faculty members and researchers had the initiative to submit their documents via the submission form, most of them were not responsive at all. Therefore, the Library had to take a more proactive approach to discover and harvest research output for the Repository.
For example, the IR Team had:

- Visited faculty members' personal and departmental websites as well as the websites of the research centers and institutes on campus to harvest full-text research papers and publications posted on the web.
- Surveyed academic departments to harvest collections of working papers and technical reports.
- Searched the library catalog to identify proceedings of conferences held at HKUST.
- Scanned through boxes of pre-published research papers held in the University Archives.
- Searched electronic databases and open access sources such as Web of Science and DOAJ to identify papers published by the HKUST researchers.
- Contacted individual faculty members to ask for their complete publication lists and their full-text documents.

In most of the above cases, the IR Team had to contact the original authors to obtain permissions before loading the harvested documents to the Repository. And if the electronic version was unavailable, the paper document would be digitized.

The HKUST Electronic Theses database was built a few years earlier than the IR. Not all these theses are open access. The Library decided to include only those PhD theses with author permission in the IR so that all of them would be openly accessible. Instead of depositing a second copy to the IR, only metadata was created, together with a link to retrieve the full text from the Electronic Theses database.

HKUST faculty members need to report annually to the Research Output Collection System. The Library asked the office in charge of this system to include a checkbox in the submission page to indicate the reporters' willingness to deposit their publications into the Repository. If the box is checked, an email containing the citations will be sent to the IR Team for follow-up actions. This automatic alert mechanism has enabled us to harvest research output on an annual basis.

Publishers' Policies and Deposit Guidelines

Verifying and selecting a version of the document for depositing to the Repository is far from a trivial task. While more and more publishers nowadays have their self-archiving policies clearly spelt out on their websites, this was not the case in 2003. Project RoMEO, which provides a directory of publisher self-archiving policies, was just launched in 2002. The IR Team surveyed the publishers' policies for the journal articles via SHERPA/RoMEO and publisher websites. Findings were recorded in the IR Staff Working Manual for easy future reference. The list currently contains more than 60 publisher policies (see Figure 3). Publishers' policies can be summarized as follows:

- no archiving allowed
- allow pre-refereed version only
- allow post-refereed version only
- allow pre- and post-refereed versions
- allow publisher's version
- allow all versions
- not specified

Figure 3. List of publishers' policies collected by the HKUST Library, showing links to special notes and acknowledgment text, and links to publishers' websites

If the policy is unknown, the IR Team will write to the publishers for clarification and ask them for the archiving permission. And if the Library's version is not usable, the IR Team will contact the authors to ask for an acceptable version. The Library also encourages authors to negotiate with publishers so as to retain their self-archiving rights and the rights for personal educational use. They should also avoid granting an exclusive long-term license that extends beyond first publication.
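To keep deposit decisions consistent, the seven policy categories above lend themselves to a simple lookup structure. The following Java sketch is purely illustrative: the PolicyLookup class and the sample publisher names are hypothetical, not part of the HKUST workflow, but it shows how a deposit script could map a journal's publisher to one of the categories before accepting a file.

    import java.util.HashMap;
    import java.util.Map;

    public class PolicyLookup {
        // The seven policy categories summarized in the text.
        enum Policy {
            NO_ARCHIVING, PRE_REFEREED_ONLY, POST_REFEREED_ONLY,
            PRE_AND_POST_REFEREED, PUBLISHER_VERSION, ALL_VERSIONS, NOT_SPECIFIED
        }

        private final Map<String, Policy> policies = new HashMap<>();

        PolicyLookup() {
            // Hypothetical sample entries; a real list would hold 60+ publishers.
            policies.put("Example University Press", Policy.PRE_AND_POST_REFEREED);
            policies.put("Example Science Publisher", Policy.POST_REFEREED_ONLY);
        }

        // Returns NOT_SPECIFIED when the publisher is unknown, prompting the
        // IR Team to write to the publisher for clarification.
        Policy lookup(String publisher) {
            return policies.getOrDefault(publisher, Policy.NOT_SPECIFIED);
        }

        public static void main(String[] args) {
            PolicyLookup lookup = new PolicyLookup();
            System.out.println(lookup.lookup("Example Science Publisher")); // POST_REFEREED_ONLY
            System.out.println(lookup.lookup("Unknown Press"));             // NOT_SPECIFIED
        }
    }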
Versioning

It is essential that users know whether the version deposited is the published version or not. To avoid confusion, a watermark "This is the Pre-Published version" is stamped on the first page of the document if it is a pre-refereed or post-refereed version. A note "pre-published version" is also displayed in the "Files in This Item" box of the item record display page (Figure 4).

A piece of scholarly work may undergo several rounds of revisions. Authors are encouraged to submit revised versions as separate documents. They can also replace the previous versions as long as they are not published items. By doing so, the revised version will bear the same identifier (handle) number as the previous one.

Author Name Authority Control

Authors may have their works published under different names. It is essential to perform some form of authority control for consistency. Policies for entering the names of HKUST researchers were established. Name authority records for some authors are readily available from the library catalog. For those without records, several university publications such as the Academic Calendar, the Faculty Profile and the Communications Directory are consulted. In a few cases, emails were sent to the authors to seek their preferences on the names used, especially in the case of maiden and married names, names in Chinese, and the inclusion of Christian names.

For Chinese documents, the English name of an HKUST author will be taken from the title page, should it appear in English or bilingually in English and Chinese. If HKUST authors do not provide their English names in their Chinese documents, the IR Team will look up their English names and add them to the metadata. For bilingual names of the same author, the Chinese name will be entered in parentheses after the English one, e.g., "Chan, Diana L. H. (陳麗霞)". Some documents in the Repository were jointly written with non-HKUST authors. Since it is difficult to identify these non-HKUST affiliated authors, the Library decided not to perform authority checks on them.

Subject Keywords

When authors submit records, they can supply three to eight keywords or phrases for indexing. If the subject field is not filled, the IR Team will extract the keywords from the abstracts. English keywords are used for Chinese documents as well. The Library decided not to use thesauri or LC subject headings in assigning subjects.

Withdrawal of Items from the Repository

At the specific request of authors, documents in the Repository can be permanently removed. To retain the historical record, such transactions will be noted in the metadata record. Since these documents may have been cited by others, the system will supply a "tombstone" when they are requested. A withdrawal statement will be displayed in place of the view document link.

Programming Efforts

The advantages of using open source software such as DSpace as the platform for the Institutional Repository became more apparent when the needs and requests to customize the software flooded in. Apart from the Faculty Submission Form and the Add Item Form mentioned in the previous section, the following are other customizations that are worth mentioning.

Linking to the Published Version

Some publishers only allow institutions to archive the pre-published version. From time to time, the Library receives authors' feedback that they prefer users to read the published version rather than the pre-published version archived in the IR.
One can easily resolve this problem by adding a direct URL link to point users to the published version residing on an aggregator's or the publisher's website to which the institution has a subscription. The Library objected to this approach because such links would become broken due to subscription changes. After much study, the Library decided to implement an OpenURL linking mechanism on DSpace so that users can be dynamically redirected to library-subscribed resources that host the published version.

To enable this linking feature, the metadata (item record) must contain the OpenURL string. This is made possible by using a locally defined Dublin Core field, identifier.openurl. DSpace's item record display page was modified to enable a link-resolver button when the item contains an OpenURL (see Figure 4). A program was developed to query OCLC's OpenURL Resolver Registry on-the-fly while displaying the button. By doing so, users will see their own institution's link-resolver, such as WebBridge for HKUST. When users click on the button, the link-resolver will try its best to identify external sources that contain the published version.

Constructing OpenURLs manually is extremely painful. To automate this process, a web-based program was developed. It can intelligently parse the title, contributor.author and identifier.citation fields to obtain most of the required key-value pairs for constructing the OpenURL. As the journal's ISSN is not available in the Repository's metadata, the program searches the library catalog to obtain the numbers. With this program, the OpenURL string can be quickly created within a few mouse clicks. Figure 5 shows the web interface for this highly user-friendly OpenURL Builder program.

Figure 4. A pre-published document record, showing versioning information, with WebBridge link to the published sources

Figure 5. OpenURL Builder - automating the construction of the OpenURL

Usage Statistics and Top 20 Most Accessed Documents

In-house usage analyzing programs were developed to supplement the usage reports that come with DSpace. They are based on the web access logs captured by the server when users issue requests to download the bitstreams, e.g., the PDF files. The Repository is open to the world and allows visits from search engines, robots and OAI harvesters. As a result, it receives tens of thousands of web requests per day. A program was developed to enable the Library to know how many times the IR documents were downloaded by "real" users, excluding robot accesses. This figure was updated monthly on the Repository home page.

Another customized program was the monthly listing of the Top 20 most accessed documents. It is interesting to analyze these Top 20 lists, as they give a good account of documents, topics and authors that users are most interested in. Such information is useful for IR promotion. For example, the Library wrote to the authors in the lists to inform them about the high usage of their papers. The IR Team also showed the lists to the faculty members during departmental visits.
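As a rough illustration of what the OpenURL Builder assembles, the following Java sketch composes an OpenURL 1.0 key/encoded-value (KEV) query string for a journal article from a title, an author and a citation that has already been parsed into its parts. The sample field values and the buildOpenUrl helper are hypothetical; the real Builder additionally looks up the ISSN in the library catalog as described above.

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class OpenUrlSketch {
        static String buildOpenUrl(Map<String, String> kev) {
            // Join the key/value pairs into a KEV context-object query string.
            return kev.entrySet().stream()
                    .map(e -> e.getKey() + "=" + encode(e.getValue()))
                    .collect(Collectors.joining("&"));
        }

        static String encode(String s) {
            try {
                return URLEncoder.encode(s, "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            Map<String, String> kev = new LinkedHashMap<>();
            kev.put("ctx_ver", "Z39.88-2004");                  // OpenURL 1.0 context version
            kev.put("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal");
            kev.put("rft.genre", "article");
            kev.put("rft.atitle", "Changing roles of reference librarians"); // from title
            kev.put("rft.au", "Chan, D.");                      // from contributor.author
            kev.put("rft.jtitle", "Reference Services Review"); // parsed from identifier.citation
            kev.put("rft.volume", "33");
            kev.put("rft.issue", "3");
            kev.put("rft.issn", "0090-7324");                   // looked up in the library catalog
            System.out.println(buildOpenUrl(kev));
        }
    }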
While the majority of the documents are from the academic departments, it is worth mentioning that a number of documents authored by the HKUST Language Center made their way into the lists, together with the ones HKUST Library wrote on institutional repositories and virtual reference.

CJK Search and Display

In the early versions of DSpace, there were problems in searching and displaying Chinese characters. The authors managed to fix these problems by revising and replacing some of the DSpace source code. While some of these problems were eventually fixed in DSpace's later versions, the timing of fixing them was critical to the Library's IR software selection. Had they not been fixed during software evaluation, the Library would not have selected DSpace. Thanks to open source, one could dig into the source code and fix problems quickly.

The main CJK problem was attributed to the use of a string tokenizer that could not handle CJK text properly. DSpace is Unicode capable, meaning that it supports data and strings in multiple scripts, including CJK. However, like many other non-Roman scripts, the way Chinese strings are sorted, indexed and searched can be quite different from that for English. Global software developers should be aware of these differences in order to avoid problems similar to the ones encountered with DSpace.

Enhancing Global Access

It is essential to publicize an institutional repository so that the research output can be made known to the world. In addition to making the Repository readily available and openly accessible, the Library has implemented the following measures to allow search engines, agents and harvesters around the world to discover documents in the Repository.

OAI-PMH Compliance

OAI-PMH (Open Archives Initiative - Protocol for Metadata Harvesting) is a protocol that allows metadata to be easily harvested by computer programs. Like other IR systems, DSpace is OAI-PMH compliant. It is useful to register the OAI base path of the IR with various OAI registries, such as ROAR (Registry of Open Access Repositories) maintained by EPrints. OAI harvesters can then follow this registered path and retrieve the metadata for their own searching and indexing services. At least two well-known services, namely OAIster and Scirus, are constantly harvesting HKUST's IR metadata via this protocol.

Indexed by OAIster

OAIster is a project of the University of Michigan Digital Library Production Service. By using the OAI protocol, it has collected almost 10 million metadata records of academically-oriented digital resources from 680 institutions around the world. The Library contacted OAIster in June 2003, and since then the Repository records have been included in OAIster. HKUST research output is therefore available to all OAIster users via its one-stop searching interface.

Full-text Searching on Scirus

Scirus is Elsevier's free search engine for scientific information. In addition to web pages, it also harvests content from selected institutional repositories. In November 2005, Scirus proposed to index the HKUST Institutional Repository. The project involved building the mechanism to harvest the content of the Repository, indexing both the metadata and full text of the documents, making them searchable on the Scirus platform, and integrating the Scirus search form within the Repository home page. This feature was rolled out in May 2006. Thanks to Scirus, the Library is able to offer full-text searching external to DSpace as well as to open up the content to a larger scientific research community.
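For readers unfamiliar with the protocol, an OAI-PMH harvest is just an HTTP GET against the registered base path. The sketch below, with a hypothetical base URL standing in for a real repository endpoint, issues a ListRecords request for unqualified Dublin Core and prints the raw XML response; a real harvester would parse the XML and follow resumption tokens.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class OaiPmhSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical OAI base path; registries such as ROAR list real ones.
            String base = "http://repository.example.edu/oai/request";
            // ListRecords with the mandatory oai_dc metadata format.
            URL url = new URL(base + "?verb=ListRecords&metadataPrefix=oai_dc");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // raw OAI-PMH XML envelope
                }
            }
        }
    }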
Crawling by Google and Yahoo

Robots from search engines are allowed to visit and crawl the web pages of the Repository. By enabling robot access, HKUST's research output is readily available via popular search engines such as Google and Yahoo, as well as their subsets, such as Google Scholar. The following story, as told by the Library's reference librarians, shows the effectiveness of using these search engines to discover documents in the Repository:

"Once, we received an email from someone in the U.K. who wanted to contact the author of a PhD thesis. It turned out that the requestor was the father of a son suffering from a type of cancer called Ewing Sarcoma. He discovered the thesis on the IR via the web. We acted as the intermediary and passed his enquiry to the author concerned." (Kwok, Chan, Wong 2006)

Searching with SRW/U

SRW/U (Search and Retrieval for the Web, or by URL) is a protocol for searching heterogeneous databases using XML and HTTP. It retains the core functionality of Z39.50 but in the form of web services. With SRW/U, search service providers can broadcast a search to various institutional repositories and deliver the search results in their own GUI interface. To allow such federated searching, the Library implemented the SRW/U layer for the Repository in October 2004, based on OCLC's SRW/U open source software.

The HKIR Experiment

Other universities in Hong Kong have also started to build their institutional repositories. There is an emerging need to share IR experiences among them and to collaborate. One of the possibilities is to develop a union repository for scholarly output in Hong Kong. To demonstrate the feasibility of such collaboration and to study the issues involved, the Library developed an experimental system called HKIR (Hong Kong Institutional Repositories) in February 2006. The system is powered by the DSpace software, with OCLC's OAIHarvester2 software for harvesting OAI metadata. As of September 2006, six collections of ETDs and institutional repositories from five Hong Kong universities were created, allowing cross-searching of local scholarly output.

A number of issues were identified during the study. Many of them are related to the standardization of metadata description among institutions. These include standardization in author names, subject analysis, document types and metadata schemas (Figure 6).

Figure 6. Two records of the same article in HKIR (http://lbapps.ust.hk/hkir/), showing different metadata descriptions

Another problem related to OAI harvesting was also identified during the study. While DSpace uses qualified Dublin Core as the metadata schema, OAI's default metadata format oai_dc requires unqualified Dublin Core. As a result, metadata that contains qualifiers, such as identifier.citation, would become identifier after the OAI harvesting. Unless there is a revision of the oai_dc schema, local institutions will be required to use an HKIR-defined metadata format.

Conclusions

The open access movement began with the establishment of SPARC to address market dysfunctions in scholarly publishing, followed by the formation of the OAI to promote author self-archiving and interoperable standards.
After almost a decade of hard work, the authors see a number of good converging signs: publishers are more supportive of open access, the number of open access journals continues to grow, research funding bodies understand and better embrace open access, and, more importantly, institutional repositories are flourishing to preserve scholarly output and to make it openly accessible.

Installing IR software such as DSpace is straightforward, but tailoring the software and setting up policies and procedures to make it work effectively in one's institutional environment are uphill tasks. Even more difficult is the effort needed to recruit content. IR providers need to continue to educate researchers about the IR and encourage them to deposit their research in the Repository. They also need to campaign for government support. While more and more institutions in Asia are beginning to develop their own repositories, the authors see the need for experience sharing, collaboration and standardization. HKUST is an early implementer of institutional repositories in Asia, and its unique experience will be useful to similar institutions in this region.

References

Chan, D. (2004a). "Managing the challenges: acquiring content for the HKUST Institutional Repository", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/1973 (accessed September 28, 2006).

Chan, D. (2004b). "Strategies for acquiring content: experiences at HKUST", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/1974 (accessed September 28, 2006).

Chan, D., Kwok, C. and Yip, S. (2005). "Changing roles of reference librarians: the case of the HKUST Institutional Repository", Reference Services Review, v. 33, no. 3, pp. 268-282, available at http://hdl.handle.net/1783.1/2039 (accessed September 28, 2006).

Kwok, C., Chan, D. and Wong, G. (2006). "From idea to reality: building the HKUST Institutional Repository", University Library Journal, v. 10, no. 1, March, available at http://hdl.handle.net/1783.1/2528 (accessed September 28, 2006).

Lam, K.T. (2004). "DSpace in action: implementing the HKUST Institutional Repository system", International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries, Pasadena, CA, and the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/2023 (accessed September 28, 2006).

Lam, K.T. (2006). "Exploring IR technologies", Workshop on Managing Scholarly Assets in Institutional Repositories: Sharing Experiences Among JULAC Libraries, Hong Kong, February 24, 2006, the Hong Kong University of Science and Technology Library, Hong Kong, available at http://hdl.handle.net/1783.1/2501 (accessed September 28, 2006).
work_yd5xne2kxrewhbfnjmboxbvk7u ---- A Study on the Bibliographic Records and the Expansion of Library Catalog Using Open API (Open API를 이용한 서지레코드와 목록 확장에 관한 연구)*

Jung-Eok Gu (구중억)** and Eung-Bong Lee (이응봉)***

Contents: 1. Introduction; 2. Analysis of the Current State and Usability of Open APIs; 3. Open API Application Experiments; 4. Approaches to Expanding Bibliographic Records and the Catalog Based on Open APIs; 5. Conclusion

ABSTRACT

In this study, the present condition of 8 Open APIs for books, such as Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing, and OCLC, were compared and analyzed. In addition, the bibliographic records of 167 samples of visible 'light & paraphotic' (DDC 535) classifications were selected from the bibliographic records of the library of C university, and the effectiveness of the Open APIs were compared and analyzed. On the basis of the results of this analysis, 12 Open API application models, which were judged to be useful in expanding the function of the access point of the bibliographic records and enhancing the library catalog using Open API, were experimentally implemented. Various methods of augmenting access points of bibliographic records and expanded library catalogue were investigated using the unique identifier described in the KORMARC fields of 001, 010, 012, 020, 035, 246, 505, 520, 850, 856 for the integrated format for bibliographic data, and Open API. Finally, using Open API for the brief display and detailed display screen as a result of search of the OPAC of the sample bibliographic records, the devices for the expanding library catalog were implemented as examples.

Keywords: Bibliographic Records, Library Catalog, Open API, KORMARC

* This study revises and extends a paper presented at the 2009 spring conference of the Korean Society for Library and Information Science (April 24, 2009).
** Director, Division of Research Equipment Promotion, Korea Basic Science Institute (jekoo@kbsi.re.kr)
*** Professor, Department of Library and Information Science, College of Social Sciences, Chungnam National University (eblee@cnu.ac.kr)
Received: May 27, 2009. First review: June 2, 2009. Accepted: June 12, 2009.

1. Introduction

1.1 Necessity and Purpose of the Study

Traditional MARC-based bibliographic records, centered on descriptive cataloging, fail to fully reflect users' information needs and expectations because of the rigidity of their format and structure. Nevertheless, despite the emergence of new types of metadata, bibliographic records still play a central role in connecting local information resources with users.

Recently, Google has been expanding its book preview service as a way of securing the objective credibility of its site, and Amazon provides detailed and varied bibliographic information; the traditional library catalog, by contrast, struggles to keep existing users from drifting away because of its lack of information, and to attract new users.

In this catalog environment, the Open APIs of Web 2.0 are very useful tools for augmenting the access-point function of bibliographic records and for expanding the catalog so as to relieve its information shortage. Using the unique identifiers in bibliographic records together with Open APIs, rich content on the deep web (cover images, descriptions, tables of contents, reviews, recommended books, previews, other editions of a work, holding institutions, and so on) held by online bookstores, portals and search engines can be embedded in the catalog or linked from it.

By sharing and exploiting this rich content through the Open APIs published by online bookstores, portals and search engines, libraries can help users easily discover the books they want and ultimately obtain the full text. The rich content available through the book Open APIs of Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing and OCLC constitutes electronic resources. Libraries need to build a service foundation of openness, sharing, cooperation and participation on Open APIs, and to move from a cataloger-centered to a user-centered catalog.

This study therefore set out to propose ways of augmenting the access points of bibliographic records and expanding the catalog using the Open APIs of Web 2.0. Libraries can use the results as baseline data for employing Open APIs in the most effective way, reducing cataloging costs, improving the efficiency of cataloging work, and raising the information competitiveness of the catalog through better catalog quality.

1.2 Content and Method of the Study

The content and methods of the study, directed at this purpose, are as follows.
First, the current state of eight book Open APIs, those of Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing and OCLC, was compared and analyzed, and the rich content available through the APIs' output fields was compared with that available on the actual websites.

Second, 234 bibliographic records in the 'optics' classification (DDC 535) were extracted from the holdings of the C University Library, and the 167 records with an ISBN in the MARC 020 field were selected as the sample for comparing and analyzing the usability of the Open APIs.

Third, twelve application models judged useful for augmenting the access points of bibliographic records and expanding the catalog with the eight Open APIs above were implemented experimentally, and the problems revealed in the experiments and ways of putting the models to use were presented. The experimental system was implemented on a Unix server with the Apache 2.0.48/Tomcat 5.5.26 web servers and the Servlet 2.4/JSP 2.0 web programming environment.

Fourth, ways of augmenting the access points of bibliographic records and expanding the catalog using Open APIs were examined for fields 001, 010, 012, 020, 035, 246, 505, 520, 850 and 856 of KORMARC (integrated format for bibliographic data).

Fifth, catalog expansion schemes using Open APIs were implemented as examples for the brief and detailed OPAC result displays of the sample bibliographic records.

1.3 Related Studies

Hildreth (1991) proposed the 'E3 OPAC' as a future catalog function for improving library services, arguing for an 'expanded catalog' that adds article indexes, tables of contents, abstracts, images and full text to the traditional library catalog.

In 2001 the Library of Congress conducted an online survey on the usefulness of providing table-of-contents information on the web. Of the 360 web users surveyed, 60% had found TOC information through links in catalogs such as local library catalogs, the Library of Congress catalog and OCLC WorldCat, and 84% responded that TOC information was useful (Byrum and Williamson 2006).

In the catalog of CLIC (Cooperating Libraries In Consortium), in which eight private American academic libraries participate, circulation rose by 20.4% for records to which a MARC 505 table of contents and a 520 summary had been added (Faiks, Radermacher, and Sheehan 2007).

A study of some 5,000 sample records with tables of contents added to the MARC 505 field in the Northwestern University library catalog found that their circulation increased by more than 24% (Miller and Babinec 2008).

Breeding (2007) argued that a key function of next-generation library catalogs is to include cover images in the catalog using the Open APIs of Amazon, Google, LibraryThing and others, and to provide visualized information.

The Library of Congress PCC (Program for Cooperative Cataloging) defines the MARC 856 field as mandatory-if-applicable for monographic electronic resources (LC 2003).

KORMARC (integrated format for bibliographic data) defines the application level of the 856 tag as 'optional', and defines both the tag and its subfields as 'repeatable' (National Library of Korea 2006).

Tennant (2007) extracted 1,000,000 MARC records from the UC Berkeley library catalog and analyzed the use of the 856 field. According to this study, the first indicator was '4' (HTTP) in 98.4% of the 856 fields, and the second indicator was '1' (electronic version of the resource) in 64.9%. Subfield '▾3' (materials specified) was recorded in more than 50% of the fields, and at least 37% of those specified a table of contents.

2. Analysis of the Current State and Usability of Open APIs

2.1 Analysis of the Current State of Open APIs

2.1.1 Comparison of Book Open APIs

Table 1 lists the book Open APIs currently published by online bookstores, portals, search engines, union catalogs and social networking sites. These Open APIs mostly accept search requests through REST-style HTTP GET, return responses in data formats such as XML, RSS and JSON, and use UTF-8 character encoding by default. To use an Open API, one applies for a service account and is issued an authentication key, which the provider needs for user management and usage tracking; traffic limits are imposed for service stability and security, and commercial use requires a partnership agreement with the API provider. OCLC's xISBN API is free only for non-commercial use within a limit of 500 search requests per day.

Using Open APIs, libraries can share and exploit external rich content and thereby save the manpower, time and cost of expanding the catalog themselves. The drawback is that they can do so only within the functions and the scope of information that each Open API provides.

2.1.2 Comparison of Rich Content

Table 2 compares the availability of rich content in the output fields of the Open APIs. Naver, Daum, Aladdin, Amazon, Google, KERIS and OCLC deliver rich content through dynamic links driven by search requests, while LibraryThing delivers it through static links.
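The request conventions summarized in Table 1 below are all variations on one pattern: a keyed HTTP GET returning XML, RSS or JSON. The following Java sketch shows that pattern with a deliberately hypothetical endpoint and key parameter (the real base URLs and parameter names differ from provider to provider), searching for a book by ISBN and printing the response body.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;

    public class BookApiSketch {
        public static void main(String[] args) throws Exception {
            String isbn = "0470017864";
            // Hypothetical REST endpoint and authentication key parameter;
            // each provider in Table 1 defines its own names and base URL.
            String endpoint = "http://openapi.example.com/book/search"
                    + "?key=" + URLEncoder.encode("MY-APP-KEY", "UTF-8")
                    + "&isbn=" + isbn
                    + "&output=xml";
            HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) { // UTF-8 by default
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // XML/RSS/JSON response to be parsed by the caller
                }
            }
        }
    }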
Table 1. Current state of book Open APIs

  Category / Provider / Open API                             Auth  Request  Response formats    Traffic limit  Fee-based tier
  Online bookstore / Aladdin / Search API                    Y     REST     XML, RSS, JSON      Y              -
  Online bookstore / Aladdin / Product API                   Y     REST     XML, RSS, JSON      Y              -
  Online bookstore / Amazon / Amazon Associates Web Service  Y     REST     XML, HTML           Y              -
  Portal / Naver / Book API                                  Y     REST     RSS                 Y              -
  Portal / Daum / Book Search API                            Y     REST     XML, RSS, JSON      Y              -
  Search engine / Google / AJAX Search API                   Y     REST     HTML                Y              -
  Search engine / Google / Book Search APIs - Dynamic Links  -     REST     HTML                -              -
  Search engine / Google / Book Search APIs - Static Links   -     REST     HTML                -              -
  Union catalog / KERIS / Monograph Search API               Y     REST     XML                 -              -
  Union catalog / OCLC / xISBN API                           Y     REST     XML, JSON           Y              Y
  Union catalog / OCLC / WorldCat Link                       -     REST     HTML                -              -
  Union catalog / OCLC / WorldCat Search API                 Y     REST     XML, RSS, DC, etc.  Y              -
  Social network / LibraryThing / Easy Linking               -     REST     HTML                -              -
  Social network / LibraryThing / ISBN Check                 -     REST     XML                 Y              -
  Social network / LibraryThing / Web Services API           Y     REST     XML                 Y              -

Table 2. Availability of rich content in Open API output fields

  Content               Naver  Daum  Aladdin  Amazon  Google  KERIS  LibraryThing  OCLC
  Cover image           O      O     O        O       O       -      -             -
  Description           O      O     O        O       -       -      -             -
  Table of contents     -      -     O        -       -       O      -             -
  Reviews               -      -     -        O       -       -      -             -
  Recommended books     -      -     -        O       -       -      -             -
  Preview               -      -     O        -       O       -      -             -
  Other editions        -      -     -        -       -       -      -             O
  Book price            O      O     O        O       -       -      -             -
  Holding institutions  -      -     -        -       -       O      -             -
  Detailed-record URL   O      O     O        O       O       O      O             O
  Note: Google = Book Search APIs - Dynamic Links; LibraryThing = Easy Linking; OCLC = xISBN API.

Table 3 presents some errors found in the output fields of the Open APIs, and some suggested improvements, for raising data quality and ease of access.

Table 3. Errors and suggested improvements in Open API output fields

  Errors:
  - Naver / description: the field content includes 'overseas order' notices.
  - Aladdin / cover image: the title and statement of responsibility are delivered instead.
  - KERIS / table of contents: a substitute output field is used.
  Improvements:
  - Naver / description: output the full description rather than a passage summary.
  - Naver, Daum, Aladdin, Amazon / preview: add a field carrying the preview URL.
  - Aladdin / item lookup: support 13-digit ISBNs in the search function.
  - KERIS / table of contents: add a field carrying the TOC URL; avoid substitute output fields; note that the output content includes HTML <p> and <br> tags.

Following the detailed-record URLs provided in the Open API output fields, Table 4 compares the rich content actually available on the providers' websites. Common features include book introductions, author and translator profiles, tables of contents, reviews (media, expert, publisher and reader reviews), ratings and comments. Where agreements exist with authors, publishers and copyright holders, full-text search and preview of the book are also possible, and recommended books are offered based on analysis of users' interests and purchasing behavior. A drawback is that, except for the TOC pages shown in previews, the level of TOC description (chapters, sections, subsections, items) differs from site to site, and TOC page information is sometimes missing. As for individual features, Naver shows whether the National Library of Korea and the National Assembly Library hold the book. Amazon, Google, LibraryThing and OCLC provide information on other editions of a work. KERIS reuses Naver's cover images, descriptions and price information and links to purchasing. LibraryThing uses Amazon's cover images and product descriptions and OCLC's xISBN API. OCLC's WorldCat offers previews through Google's embedded viewer API and reuses Amazon's reader reviews.

Table 4. Rich content on the websites behind the Open APIs

  Content                           Naver  Daum  Aladdin  Amazon  Google  KERIS  LibraryThing  OCLC
  Cover image                       O      O     O        O       O       O      O             O
  Description                       O      O     O        O       O       O      O             O
  Table of contents                 O      O     O        O       O       O      -             O
  Reviews                           O      O     O        O       O       -      O             O
  Comments                          O      O     O        O       -       -      -             -
  Ratings                           O      O     O        O       O       -      O             O
  Recommended books                 O      O     O        O       O       -      O             O
  Preview                           O      O     O        O       O       -      -             O
  Full-text search                  O      O     O        O       O       -      -             -
  Price comparison                  O      -     -        O       O       -      -             O
  Holding institutions              O      -     -        -       O       O      -             O
  Other editions                    -      -     -        O       O       -      O             O
  Books citing this book            -      -     -        O       O       -      -             -
  Scholarly works citing this book  -      -     -        -       O       -      -             -
  Web pages citing this book        -      -     -        -       O       -      -             -

2.2 Analysis of Open API Usability

2.2.1 Selection of Sample Bibliographic Records

As shown in Table 5, 234 bibliographic records in the 'optics' classification were extracted from the holdings of the C University Library, and the 167 records with an ISBN in the MARC 020 field were selected as the sample for comparing the usability of the Open APIs. The main characteristics of the sample are that TOC and URL data appear mostly in recent books, 27 records (16.2%) and 10 records (5.9%) respectively, and that no abstracts or full-text links are present.

Table 5. Sample bibliographic records

  Publication years  Records     With ISBN    Abstract  TOC         URL        Full text
  2001-2008          45          45 (100%)    -         18          8          -
  1991-2000          55          45 (81.8%)   -         9           2          -
  1981-1990          50          35 (70.0%)   -         -           -          -
  1971-1980          45          36 (80.0%)   -         -           -          -
  1961-1970          15          4 (26.7%)    -         -           -          -
  1951-1960          13          -            -         -           -          -
  1941-1950          5           1 (20.0%)    -         -           -          -
  1931-1940          3           1 (33.3%)    -         -           -          -
  1921-1930          3           -            -         -           -          -
  Total              234 (100%)  167 (71.4%)  -         27 (16.2%)  10 (5.9%)  -

2.2.2 Analysis of Detailed-Record URLs

Table 6 compares the availability of detailed-record URLs when each of the eight Open APIs, those of Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing and OCLC, is used on its own. Overall, Google yielded the most detailed-record URLs, 155 (92.8%); for Korean books, Naver and KERIS each yielded 30 (100%), and for foreign books, Amazon and Google each yielded 129 (94.2%). Notably, Google covered 26 (86.7%) of the Korean books. The detailed-record URLs of Company A's service, used by the C University Library, covered 88 records overall (52.7%): 28 Korean books (93.3%) and 60 foreign books (43.8%).

Table 6. Availability of detailed-record URLs per individual Open API

  Category       Total        Naver       Daum        Aladdin     Amazon       Google       KERIS        LibraryThing  OCLC         Company A
  Korean books   30 (18.0%)   30 (100%)   26 (86.7%)  25 (83.3%)  -            26 (86.7%)   30 (100%)    -             1 (0.6%)     28 (93.3%)
  Foreign books  137 (82.0%)  59 (43.1%)  -           23 (16.8%)  129 (94.2%)  129 (94.2%)  124 (90.5%)  40 (29.2%)    136 (81.4%)  60 (43.8%)
  Total          167 (100%)   89 (53.3%)  26 (15.6%)  48 (28.7%)  129 (77.2%)  155 (92.8%)  154 (92.2%)  40 (23.9%)    137 (82.0%)  88 (52.7%)

Table 7 compares the availability of detailed-record URLs when all eight Open APIs are used together. In that case a detailed-record URL was available for all 167 records (100%).

Table 7. Availability of detailed-record URLs with all eight Open APIs (by number of APIs providing the URL)

  Category       1  2  3   4   5   6   7   8  Total
  Korean books   -  2  2   3   22  1   -   -  30
  Foreign books  -  4  15  48  35  25  10  -  137
  Total          -  6  17  51  57  26  10  -  167

2.2.3 Analysis of Rich Content

Table 8 compares the availability of rich content through the providers' websites when each Open API is used on its own. Cover images were most available through Google (77; 46.1%), descriptions through Google (75; 44.9%), TOCs through Google (53; 31.7%), editorial reviews through Amazon (76; 45.5%), reader reviews through Google (31; 18.6%), recommended books through Amazon (32; 19.2%), previews through Amazon (35; 20.9%), and other editions through Amazon (66; 39.5%). Looking specifically at other-edition information through OCLC's xISBN API, it covered 1 of the 30 Korean books (3.3%) and 57 of the 137 foreign books (41.6%). The detailed records of Company A's service at the C University Library provided cover images for 43 records (25.7%), descriptions for 29 (17.4%), TOCs for 21 (12.6%) and an editorial review for 1 (0.6%).

Table 8. Availability of rich content per individual Open API

  Content            Naver  Daum  Aladdin  Amazon  Google  KERIS  LibraryThing  OCLC  Company A
  Cover image        53     20    33       55      77      53     30            58    43
  Description        35     18    16       35      75      35     21            38    29
  Table of contents  43     21    14       35      53      46     -             33    21
  Editorial reviews  -      4     1        76      -       -      -             -     1
  Reader reviews     9      6     8        28      31      -      4             29    -
  Recommended books  -      7     8        32      27      -      14            3     -
  Preview            -      2     3        35      26      -      -             29    -
  Other editions     -      -     -        66      54      -      13            58    -

Table 9 compares the availability of rich content when all eight Open APIs are used together. Overall, cover images were available for 124 records (74.3%), descriptions for 102 (61.1%), TOCs for 108 (64.7%), reader reviews for 64 (38.3%), recommended books for 67 (40.1%), previews for 47 (28.1%) and other editions for 91 (54.5%), and the more recent the book, the higher the availability tended to be.

Table 9. Availability of rich content with all eight Open APIs

  Publication years  Records  Cover image  Description  TOC          Reader reviews  Recommended  Preview      Other editions
  1999-2008          59       55 (93.2%)   51 (86.4%)   54 (91.5%)   29 (49.2%)      37 (62.7%)   31 (52.5%)   31 (52.5%)
  1989-1998          34       26 (76.5%)   26 (76.5%)   22 (64.7%)   7 (20.6%)       9 (26.5%)    8 (23.5%)    11 (32.4%)
  1979-1988          37       22 (59.5%)   15 (40.5%)   16 (43.2%)   14 (37.8%)      7 (18.9%)    3 (8.1%)     21 (56.8%)
  1969-1978          35       19 (54.3%)   9 (25.7%)    14 (40.0%)   12 (34.3%)      13 (37.1%)   3 (8.6%)     26 (74.3%)
  1936-1968          2        2 (100%)     1 (50.0%)    2 (100%)     2 (100%)        1 (50.0%)    2 (100%)     2 (100%)
  Total              167      124 (74.3%)  102 (61.1%)  108 (64.7%)  64 (38.3%)      67 (40.1%)   47 (28.1%)   91 (54.5%)

Table 10 compares the availability of cover images when the five Open APIs of Naver, Daum, Aladdin, Amazon and Google are all used together: cover images were unavailable for 43 records (25.7%), and available from only one site for 50 records (29.9%).

Table 10. Availability of cover images with all five Open APIs (by number of sites providing the image)

  Category       0   1   2   3   4  Total
  Korean books   4   5   7   12  2  30
  Foreign books  39  45  31  15  7  137
  Total          43  50  38  27  9  167

The American company Syndetic Solutions charges, for importing data files into local records, $0.50 per hit for TOCs and $0.30 per hit for descriptions. Using these rates as a yardstick, Table 11 compares the economic value of sharing and exploiting rich content through the Open APIs; Google, with the highest hit rates for TOC and description information, can be said to carry the greatest economic value.

Table 11. Economic value of rich content obtained through the Open APIs

  Content      Naver       Daum        Aladdin    Amazon      Google      KERIS       LibraryThing  OCLC        Company A
  TOC          43 ($21.5)  21 ($10.5)  14 ($7.0)  35 ($17.5)  53 ($26.5)  46 ($23.0)  -             33 ($16.5)  21 ($10.5)
  Description  35 ($17.5)  18 ($9.0)   7 ($3.5)   35 ($17.5)  75 ($37.5)  35 ($17.5)  21 ($10.5)    38 ($19.0)  29 ($14.5)

3. Open API Application Experiments

3.1 Writing the MARC 856 Field

The MARC 856 field is used to describe the location of and access to an electronic resource, and systems use it to give users direct, automatic access to that resource. As Table 12 shows, when 856 fields are generated with Open APIs the first indicator is '4' (HTTP) in almost every case; the second indicator can be '0' (resource itself), '1' (electronic version of the resource) or '2' (related resource); and subfields such as '▾3' (materials specified) and '▾u' (URI) can be recorded.

For Aladdin, the presence of content in the CDATA section of the toc field supports a '▾3 table of contents' entry, and an image path URL in the letslookimg field supports a '▾3 partial preview' entry. For Amazon, the presence of the EditorialReviews, CustomerReviews and SimilarProducts fields supports differentiated '▾3 editorial reviews / reader reviews / recommended books' entries. For Google, after the dynamic-link JSON result confirms that the book exists, the content of the preview field determines whether to write '▾3 full view' or '▾3 partial preview'. For KERIS, after checking the Y/N content of the riss.toc field, a TOC URL can be composed from the control number contained in the permalink and recorded with '▾3 table of contents'.

Table 12. Types of MARC 856 fields generated with Open APIs (first indicator / second indicator / field checked for ▾3 / field supplying ▾u : purpose)

  Naver Book API:                               4 / 2 / - / link : bibliographic record
  Daum Book Search API:                         4 / 2 / - / link : bibliographic record
  Aladdin Product Lookup API:                   4 / 1 / toc / link : TOC;  4 / 1 / letslookimg / link : partial preview;  4 / 2 / - / link : bibliographic record
  Amazon Associates Web Service:                4 / 2 / EditorialReviews, CustomerReviews, SimilarProducts / DetailPageURL : bibliographic record
  Google Book Search APIs - Dynamic Links/ISBN: 4 / 0 / preview / preview_url : full view;  4 / 1 / preview / preview_url : partial preview;  4 / 2 / preview / info_url : bibliographic record
  KERIS Monograph Search API:                   4 / 1 / riss.toc / url : TOC;  4 / 2 / - / url : bibliographic record
  LibraryThing Easy Linking/ISBN:               4 / 2 / - / url : bibliographic record
  OCLC WorldCat Link - Permalink/ISBN:          4 / 2 / - / url : bibliographic record
  OCLC WorldCat Link - Permalink/OCLC Number:   4 / 2 / - / url : other editions; bibliographic record

In this study, the generation of MARC 856 fields from the output fields of the eight Open APIs of Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing and OCLC was implemented. Figure 1 shows 856 fields generated for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), a title chosen arbitrarily by the author.

Figure 1. Example of the 856-field experiment using Open APIs

In the 856 field, subfield '▾u' (URI) should be displayed so that users can copy the address and reach the resource (Maxwell 2004); users should be able to identify the URL easily, and given layout constraints it is desirable to record the URL explicitly and concisely. For Amazon, the DetailPageURL can be replaced by a permanent link built from the ASIN code; for Google, once a dynamic-link lookup confirms that the book exists, a static link built from the ISBN, LCCN or OCLC number can be used instead. As shown in Table 13, Google Book Search and OCLC WorldCat can also produce links to result pages for a search term, a title or an author, and these can be recorded in the 856 field with subfield '▾3 finding aid'.

3.2 Cover Images

Library users generally have to look through the brief OPAC result display one record at a time to find the book they want, and the color, size and font of the text alone are a limited basis for identifying it intuitively. In this study, cover images of every available kind (thumbnail, small, medium, large) retrieved by ISBN search were displayed together using the five Open APIs of Naver, Daum, Aladdin, Amazon and Google. A thumbnail is a reduction of the full-size image; enlarging it merely through HTML encoding can produce poor quality. Figure 2 displays the cover images for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), chosen arbitrarily by the author.

Figure 2. Example of the cover-image experiment using Open APIs

In the OPAC result display, libraries should visualize information through a layout that places the bibliographic text and the cover image appropriately, so that users can identify the books they are looking for more quickly and easily. Depending on its size, a cover image can serve as a 'preview' or an 'enlarged view', and cover images can be added to the brief display alongside the table, list and portal layouts. Since books on similar subjects stand adjacent on the shelves by call number, cover images can also be used to provide a virtual-shelf browsing function.

3.3 Descriptions

When users want to buy or borrow a book in an offline bookstore or at the shelves, they naturally read the varied bibliographic information on the front cover, front flap, back flap and back cover. The Open APIs mostly deliver part of the book introduction: Naver provides a description that excerpts and summarizes the book introduction, and Amazon provides product descriptions. In this study, the product description retrieved by ISBN search through Amazon's Open API was displayed. Figure 3 shows the product description for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), chosen arbitrarily.

Figure 3. Example of the product-description experiment using Open APIs

Description information from the Open API output fields can be flagged as available or shown as a preview in the brief OPAC display, and can be embedded in the detailed display.

3.4 Tables of Contents

Korean libraries provide TOC information by keying it in themselves, purchasing it from publishers or vendors, downloading and importing the National Library of Korea's TOC data, or linking to the TOC URLs of the KERIS union catalog or the Library of Congress. The drawback of URL linking is that the TOC cannot be searched within the library system. In this study, the presence of a TOC and the TOC text itself, retrieved by ISBN search through KERIS's Monograph Search Open API, were displayed. Figure 4 shows part of the TOC of '분광학의 이해' (the Korean edition of Introduction to Spectroscopy; ISBN-10: 8995737794), chosen arbitrarily.

As shown in Table 14, Aladdin's TOC output includes HTML tags that can be used directly in a web browser, whereas KERIS delivers the TOC as one undifferentiated run of text without sentence or paragraph breaks. In principle a TOC should be delivered exactly as in the original; for now the level of TOC provision differs from site to site, which limits how well users' varied information needs can be met. TOC information from the Open API output fields can be flagged as available or shown as a preview in the brief OPAC display, and can be embedded in the detailed display.
Figure 4. Example of the TOC experiment using Open APIs

Table 14. Comparison of TOC information provided by the Open APIs

  Aladdin (CDATA excerpt; HTML line-break tags preserved in the raw output):
    1. 분자식과 그 화학적 정보
    2. 적외선 분광학
    ... (middle omitted) ...
    10. 핵자기공명 분광학
    ]]>
  KERIS (excerpt; one undifferentiated run of text):
    목차제1장 분자식과 그 화학적 정보 1.1 원소 분석 계산법 = 13 1.2 분자질량 결정법 = 15 1.3 분자식 = 16 1.4 수소모자람 지수 = 16 1.5 십삼의 규칙 = 19 1.6 질량 스펙트럼 맛보기 = 22 연습 문제 = 23 참고 문헌 = 24 ... (middle omitted) ... 부록 14 스펙트럼 찾아보기 = 659

TOC information can also link to the TOC pages of the previews provided by Naver, Daum, Aladdin, Amazon and Google.

3.5 Reviews

Reviews can be classed as publisher, expert, media or reader reviews; their character and content differ with the reviewing party, and they help readers judge whether a book is worth reading and select it. In this study, the reader reviews of a book retrieved by ISBN search through Amazon's Open API were displayed. Figure 5 shows part of the reader reviews for 'Modern Raman Spectroscopy' (ISBN-10: 0471497940), chosen arbitrarily by the author.

Figure 5. Example of the reader-review experiment using Open APIs

Amazon emphasizes reader reviews over expert reviews, and improves the quality and credibility of reader reviews through a participatory rating system. By default Amazon's Open API delivers up to five reader reviews in reverse date order, while the website shows all reader reviews sorted by rating. The reader-review information provided by Amazon can be flagged as available or shown as a preview in the brief OPAC display, and can be embedded in the detailed display.

3.6 Recommended Books

Online bookstores widely apply recommender systems to help users select the books they want and to increase sales by analyzing users' interests and purchasing behavior. Amazon uses item-to-item collaborative filtering, so that merely viewing an item of interest prompts recommendations of related items. The University of Huddersfield library in the UK has implemented a recommender system of more than 37,000 'people who borrowed this book also borrowed' entries, built on three million circulation records accumulated over about thirteen years (Pattern 2008).

In this study, the recommended books for a title retrieved by ISBN search through Amazon's Open API were displayed. Figure 6 shows the 'customers who bought this book also bought' recommendations for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), chosen arbitrarily. By default Amazon's Open API delivers up to five recommended books, while the website shows them all.

Figure 6. Example of the recommended-books experiment using Open APIs

As shown in Table 15, of the 23 books Amazon recommended, the C University Library holds 5 (21.7%), and the KERIS union catalog shows 22 (95.7%) held by Korean university libraries.

Table 15. Availability of the Amazon-recommended titles (23 ISBNs)

  Held by both the C University Library and KERIS member libraries (5): 0070277303, 0470516348, 0470844167, 0521370957, 0849324637
  Held by KERIS member libraries only (17): 0471497940, 0486673553, 048663941X, 0123694701, 048666144X, 0935702253, 0471358320, 3540431802, 0201610914, 0849324610, 0471607312, 0306472929, 047141526X, 0387312781, 0521642221, 1402019009, 0470093072
  Held by neither (1): 0471743399

With recommended-book information from online bookstores, library users can request loans of locally held titles, suggest purchases, and request interlibrary loan and document delivery, while libraries can use the information for collection development, acquisitions, reference services and subject information services.

3.7 Full-text Search and Preview

Online bookstores and portals have recently been strengthening their full-text search and preview services for books. Full-text search shows part of the pages containing the search term, while preview shows certain designated pages. Under the November 2006 agreement among the Korean Publishers Association, the Korean Publishers Society and the portal companies, a logged-in user may search within at most 5% of a book's total pages over a 30-day period; for books of fewer than 100 pages, full-text search is limited to at most 5 pages. The preview service shows the cover, preface, table of contents, afterword, recommendations, colophon, title page and index, and up to 10 further pages of the body (Korean Publishers Association 2006).

In the Open API output fields, the availability of preview can be checked through Daum's ebook_barcode field, Aladdin's letslookimg field and Google's preview_url field. The preview services of Naver, Daum, Aladdin and Amazon can also be linked after the system checks the response status of the HTTP GET request (e.g., 404 Not Found) or the processing status (e.g., status="ok").

In this study, previews retrieved by ISBN search were displayed using Amazon and the dynamic-link method of the Google Book Search APIs. Figure 7 shows the preview service for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), chosen arbitrarily; for Amazon's recommended books, cover thumbnails and preview buttons were displayed using the Google Book Search dynamic-link method.

Figure 7. Example of the full-text search and preview experiment using Open APIs

Full-text search and preview services require copyright problems to be resolved, and there is a concern that users who can examine a book's main content through search may bypass libraries and bookstores. Nevertheless, libraries need to exploit the full-text search and preview services of online bookstores and others to help users ultimately obtain the full text.
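The existence-then-preview check of Section 3.7 can be sketched compactly. The request below follows the classic Google Book Search dynamic-link form, and the field names (preview, preview_url, info_url) follow the paper's description, but the sketch itself, including the naive string matching, is only an assumption-laden illustration, not the system implemented in this study.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class PreviewCheckSketch {
        public static void main(String[] args) throws Exception {
            String isbn = "0470017864";
            // Dynamic-link lookup: returns a small JSON(P) object when the book exists.
            URL url = new URL("http://books.google.com/books?jscmd=viewapi&bibkeys=ISBN:"
                    + isbn + "&callback=cb");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            StringBuilder json = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) json.append(line);
            }
            String body = json.toString();
            if (body.contains("\"ISBN:" + isbn + "\"")) {          // the book exists
                if (body.contains("\"preview\":\"full\"")) {
                    System.out.println("856 40: full view available");
                } else if (body.contains("\"preview\":\"partial\"")) {
                    System.out.println("856 41: partial preview available");
                } else {
                    System.out.println("856 42: bibliographic information only");
                }
            } else {
                System.out.println("Not found in Google Book Search");
            }
        }
    }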
3.8 Other Editions of This Book

OCLC's WorldCat maintains an xISBN table built with the FRBR Work-Set algorithm, and the xISBN API supplies the ISBNs associated with a given work. In this study, OCLC's xISBN API was used to display the other editions of a book retrieved by ISBN search. <Figure 8> shows all of the editions of 'Introduction to Spectroscopy. 4th ed.' (ISBN-10: 0495114782), a title chosen arbitrarily by the author. Through the xISBN API's getMetadata request, designated metadata for the book corresponding to an ISBN can be obtained, including the OCLC number, LCCN, form of publication, year of publication, original language, language of publication, edition statement, title, author, publisher, city of publication, and URL.

<Figure 8> Experimental display of other editions of a book using the Open API

<Table 16> compares how many other editions of the book are held by the C University library, how many are held by Korean university libraries in the KERIS union catalog, and how many translations of the original exist. Applying the FRBR model by means of the xISBN API can, for some books, help users find, identify, select, and obtain the edition they want. By offering users FRBR-based search results through the xISBN API, libraries can improve their displays. The free subscription tier of the xISBN API can be used to implement the FRBR model, and by presenting the other editions and translations available in their own OPACs, libraries can expect increases in circulation and reading-room use as well as in interlibrary loan and document delivery.

<Table 16> Comparison of the other editions of this book

Type         ISBN        Edition   Place of publication    Publisher                      Year   C Univ.   KERIS (holding inst.)
Original     0495114782  4th ed.   Belmont, CA             Brooks/Cole                    2009   ○         ○ (10)
             0495555754  4th ed.   Pacific Grove, Calif.   Brooks/Cole                    2008   -         ○ (7)
             0030319617  3rd ed.   Fort Worth, Tex.        Harcourt Brace College Pub.    2001   ○         ○ (33)
             0030584272  2nd ed.   Fort Worth, Tex.        Harcourt Brace College Pub.    1996   -         ○ (26)
             0721671195  1st ed.   Philadelphia            W. B. Saunders Co.             1979   ○         ○ (72)
Translation  8995737794  3rd ed.   Seoul                   사이프러스                       2007   -         ○ (43)
             8973381512  2nd ed.   Seoul                   自由아카데미                     1998   ○         ○ (49)
             -           1st ed.   Seoul                   自由아카데미                     1985   -         ○ (70)

3.9 Union Catalog Holding Institutions

When the book a user wants cannot be borrowed or read locally, the user can turn to a union catalog to request interlibrary loan or document delivery. In this study, KERIS's monograph search API was used to display, for a book retrieved by ISBN search, the number of holding institutions and the list of those institutions. <Figure 9> shows the Korean university libraries holding '분광학의 이해' [Understanding Spectroscopy] (ISBN-10: 8995737794), a title chosen arbitrarily by the author, together with their library codes. Because the holding institutions' library codes follow the Korean library code table, search functions by library type or by region can be implemented. In the future, if the participating institutions of the KERIS union catalog provide persistent links to their bibliographic records and the union catalog links to them, users will be able to confirm holdings and circulation status in real time.

<Figure 9> Experimental display of union catalog holding institutions using the Open API

3.10 Map Display of Holding Institutions

If, from union catalog search results, users can see at a glance on a map the locations of the libraries nationwide or within their region that hold the book they want, it becomes easier for them to find a library they can visit for on-site reading. In this study, KERIS's monograph search API and the Naver Maps API were used to display on a map the locations of the institutions holding a book retrieved by ISBN search. <Figure 10> shows, for '비선형 레이저 분광학' [Nonlinear Laser Spectroscopy] (ISBN-10: 8992592108), 27 of its 81 holding institutions plotted on the map (23 four-year national and public universities nationwide and 4 four-year private universities in one region), with the map coordinates of the 27 institutions stored in advance within the JSP.

<Figure 10> Experimental map display of holding institutions using the Open APIs

As <Table 17> shows, coordinate conversion from a postal code or address in Naver Maps cannot yield the coordinates of a specific spot, so to mark library locations accurately on the map it is necessary to locate the optimal point in advance and add its coordinates to the master institution records of the shared cataloging system.

<Table 17> Comparison of library map coordinates obtained via the Open APIs

Target                 Method                                         X axis    Y axis
C University           Naver Local API, by address                    341335    418987
                       Geocoding Open API, by postal code             341450    418785
                       Geocoding Open API, by address                 341450    418785
C University Library   Optimal point read off Naver Maps (example)    341484    419060

The KERIS union catalog and the National Library of Korea's national union cataloging system could display the holding institutions of a search result on a map, and if the holding institutions supplied holdings and circulation status through Open APIs, the related information could be presented on the map with markers and information windows.

3.11 Book Price Comparison

Users who intend to purchase a book need price information. At present, new books published within the past 18 months may be discounted by up to 20% (10% off the list price plus an additional 10% in incentive discounts) at both online and offline bookstores. In this study, the Open APIs of Naver, Daum, Aladdin, and Amazon were used to display the list price and sale price of a book retrieved by ISBN search. <Figure 11> shows the price information for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), a title chosen arbitrarily by the author, with links out to the Noranbook and Google book price-comparison sites. Library OPACs should provide price information and links to price-comparison sites so that users can purchase the books they want.

<Figure 11> Experimental book price comparison using the Open APIs

3.12 Using 13-Digit ISBNs

Since January 2007, 13-digit ISBNs have been in use worldwide. For books carrying both a 10-digit and a 13-digit ISBN, both must be recorded in repeated MARC 020 fields, and library systems must support search by both forms. Internet bookstores display the 13-digit ISBN alongside the 10-digit form wherever possible.
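The 13-digit form is derived from the 10-digit one deterministically, so the conversion the study delegates below to the xISBN API can be illustrated directly. The sketch that follows is a simplified stand-in (it assumes the 978 prefix and omits hyphenation and ISBN-10 validation), not the xISBN service itself; for 'Understanding NMR Spectroscopy' (0470017864) it yields 9780470017869.

```java
// Minimal sketch of the standard 10-to-13-digit ISBN conversion with its
// EAN-13 check digit; prefix handling ("978" only) is simplified.
public class IsbnConverter {

    // Converts a 10-digit ISBN (no hyphens) to its 13-digit form.
    public static String toIsbn13(String isbn10) {
        if (isbn10 == null || isbn10.length() != 10) {
            throw new IllegalArgumentException("expected a 10-digit ISBN");
        }
        String body = "978" + isbn10.substring(0, 9); // drop the old check digit
        int sum = 0;
        for (int i = 0; i < 12; i++) {
            int digit = body.charAt(i) - '0';
            sum += (i % 2 == 0) ? digit : 3 * digit;  // weights alternate 1, 3
        }
        int check = (10 - sum % 10) % 10;             // EAN-13 check digit
        return body + check;
    }

    public static void main(String[] args) {
        // 'Understanding NMR Spectroscopy' from the examples above
        System.out.println(toIsbn13("0470017864"));   // prints 9780470017869
    }
}
```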
In this study, OCLC's xISBN API was used to implement conversion to the 10-digit form, conversion to the 13-digit form, hyphen insertion, and checksum verification. <Figure 12> shows the 13-digit conversion and related operations for 'Understanding NMR Spectroscopy' (ISBN-10: 0470017864), a title chosen arbitrarily by the author. OCLC's xISBN API and LibraryThing's ISBN Check API can be used to improve library systems so that, where necessary, they no longer restrict the number of ISBN digits, and, applied to existing bibliographic records, for retrospective ISBN conversion, validation, and search conditions.

<Figure 12> Experimental ISBN length conversion using the Open APIs

4. Approaches to Extending Bibliographic Records and the Catalog Based on Open APIs

4.1 Using the Fields of the Integrated KORMARC Format

4.1.1 The MARC 001 Field

The 001 field records the control number of a bibliographic record in the library's own system. The control number is generated by the system according to a defined structure and input rules, and it can be used to provide a persistent link, that is, the record's own URL on the Internet. Persistent links are extremely useful for securing access points through which Open APIs can connect to detailed bibliographic information. A library can implement persistent links for its bibliographic records, reflecting the characteristics of its own system, by consulting the persistent-link services based on unique identifiers shown in <Table 18>.

<Table 18> Examples of persistent-link services based on unique identifiers

Service         Unique identifier              Persistent-link examples
Amazon          ASIN                           http://www.amazon.com/gp/product/048663941X
KERIS           control number                 http://www.riss4u.net/link?id=U10858875_
                                               http://www.riss4u.net/keris_abstoc.jsp?no=10858875
Google          ISBN / OCLC / LCCN             http://books.google.co.kr/books?vid=ISBN0451522907
                                               http://books.google.co.kr/books?vid=OCLC99995956&printsec=toc
                                               http://books.google.co.kr/books?vid=LCCN2006933867&printsec=backcover
LC              LCCN                           http://lccn.loc.gov/2006933867
LibraryThing    ISBN                           http://www.librarything.com/isbn/3540734155
MIT Libraries   control number                 http://library.mit.edu/item/001534404
OCLC            ISBN / OCLC no. / postal code  http://www.worldcat.org/isbn/354042332X
                                               http://www.worldcat.org/oclc/99995956
                                               http://www.worldcat.org/isbn/354042332X&loc=90101

4.1.2 The MARC 010 Field

The 010 field records the control number (LCCN) assigned to a bibliographic record by the Library of Congress. Subfield ▾a (LC control number) can be used to link to the book's detailed bibliographic information through the LCCN persistent-link service.

4.1.3 The MARC 012 Field

The 012 field records the control number assigned to a bibliographic record by the National Library of Korea. The control number in subfield ▾a (National Library of Korea control number) can be used to provide persistent links so that participating institutions of the national union catalog can link to the book's detailed bibliographic information.

4.1.4 The MARC 020 Field

The 020 field records the International Standard Book Number (ISBN). Subfield ▾a (ISBN) can be used to link to the detailed bibliographic information at Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing, and others. Applied to bibliographic records, OCLC's xISBN API and LibraryThing's ISBN Check API can be used, via ISBN search, for retrospective conversion between the 10-digit and 13-digit forms and for displaying either form on OPAC results screens. OCLC's xISBN API and LibraryThing's thingISBN API can implement the FRBR model via ISBN search, and OCLC's xISBN API can also be used for retrospective conversion of LCCNs and OCLC numbers.

4.1.5 The MARC 035 Field

The 035 field is used mainly to record the control numbers of partner institutions such as the National Library of Korea, KERIS, and OCLC when union catalog data are downloaded and loaded into a local system. If the union catalogs of the National Library of Korea and KERIS developed Open APIs that return the control number of a bibliographic record for an ISBN search, subfield ▾a (library code and control number) would be very useful to local libraries for retrospective conversion of partner control numbers. Partner control numbers can be used to link to the detailed bibliographic information at Google, KERIS, OCLC, and others.

4.1.6 The MARC 246 Field

The 246 field records the original title of a translated book. If subfields ▾w (record control number) and ▾z (ISBN) were newly established in the integrated KORMARC format, the bibliographic records of a translation and its original could be connected mechanically, helping users identify the books they want.

4.1.7 The MARC 505 Field

The 505 field records contents notes. Subfield ▾u (URI) can record the URI of table-of-contents information available over the Internet. Table-of-contents information in the 505 field can be replaced by subfield ▾u of the 856 field, in which case the 505 field may even be deleted.

4.1.8 The MARC 520 Field

The 520 field records summaries, annotations, abstracts, reviews, and the like. Subfield ▾u (URI) can record the URIs of summaries, reviews, and similar information available over the Internet. Such notes in the 520 field can be replaced by subfield ▾u of the 856 field, in which case the 520 field may even be deleted.

4.1.9 The MARC 850 Field

The 850 field records the library codes of the institutions holding the book described in the bibliographic record. Subfield ▾a (holding institution) can be generated automatically in the course of downloading union catalog data. A newly established subfield ▾w (record control number) could be used to link to the detailed bibliographic information of the holding institution. A drawback, however, is that local libraries and union catalogs make almost no use of the 850 field.

4.1.10 The MARC 856 Field

The 856 field records the information through which users can access the electronic resources associated with the book described in the bibliographic record.
Subfield ▾u (URI) can record the book's description, table of contents, reviews, recommended books, holding institutions, full-text search and preview, other editions, price comparison, finding aids, and other resources available through the Open APIs. In cataloging electronic resources, the 865 field can be used, or a database can be built according to a separate metadata format and connected to the bibliographic record through the local control number in the 001 field.

4.2 Examples of the Extension Approach

4.2.1 KORMARC Fields

Taking the bibliographic records of the original and translated editions of 'Introduction to Spectroscopy. 4th ed.' (ISBN-10: 0495114782), a title chosen arbitrarily by the author, extension approaches were examined for the integrated-format KORMARC and the bibliographic-data KORMARC. As <Table 19> shows for the integrated KORMARC, newly establishing subfields ▾w (record control number) and ▾z (ISBN) in the 246 field would allow the system to connect the records of the original and the translation, which should prove useful in extending the catalog. For the bibliographic-data KORMARC, additionally recording the ISBN in subfield ▾z of the 507 field would make it possible, using OCLC's xISBN API, to present every edition of the original and its translations available in one's own library.

<Table 19> Examples of KORMARC monograph bibliographic records

Original editions:
- 0495114782, 4th ed. (C Univ. Library: held; KERIS union catalog: held):
  001 000000712567
  020 ▾a9780495114789(pbk.):▾cUS$178.95
  020 ▾a0495114782(pbk.)
  245 00 ▾aIntroduction to spectroscopy /▾cDonald L. Pavia ... [et al.].
  250 ▾a4th ed.
  500 ▾aPrevious ed. entered under Pavia.
- 0495555754, 4th ed. (record omitted; KERIS union catalog only)
- 0030319617, 3rd ed. (C Univ. Library: held; KERIS union catalog: held):
  001 000000600401
  020 ▾a0030319617:▾cUS$123.95
  245 00 ▾aIntroduction to spectroscopy: ▾ba guide for students of organic chemistry /▾cDonald L. Pavia, Gary M. Lampman, George S. Kriz.
  250 ▾a3rd ed.
- 0030584272, 2nd ed. (record omitted; KERIS union catalog only)
- 0721671195, 1st ed. (C Univ. Library: held; KERIS union catalog: held):
  001 000000007439
  020 ▾a0721671195
  245 10 ▾aIntroduction to spectroscopy: ▾ba guide for students of organic chemistry /▾cDonald L. Pavia, Gary M. Lampman, George S. Kriz, Jr

Translations:
- 8995737794, 3rd ed. (record omitted; KERIS union catalog only)
- 8973381512, 2nd ed. (C Univ. Library: held; KERIS union catalog: held):
  001 000000559111
  020 ▾a8973381512:▾c25,000
  245 10 ▾a분광학 분석입문 /▾dPavia; ▾eLampman; ▾eKriz [著]; ▾e문석식 [외역]
  246 10 ▾aIntroduction to spectroscopy: a guide for students of organic chemistry ▾w(KERIS)BIB000000208474 ▾z0030584272
  250 ▾a제2판
  507 10 ▾tIntroduction to spectroscopy: a guide for students of organic chemistry ▾z0030584272
- (no ISBN), 1st ed. (record omitted; KERIS union catalog only)

4.2.2 The Brief Results Screen

Taking the brief results screen for 'Quantum Nonlinear Optics' (ISBN-10: 354042332X), a title chosen arbitrarily by the author, extension of the catalog using Open APIs was implemented as an example. As <Figure 13> shows, the current brief screen can link to the Library of Congress table-of-contents and publisher-description URLs recorded in subfield ▾u of the 856 field. In the extended brief screen, a Google cover image was displayed with the title, and the full-text entry was linked to Google's preview. The URL was replaced with Amazon, where editorial reviews, recommended books, and previews are available. The table of contents was linked directly to the table-of-contents pages available in Google's preview.

<Figure 13> Example of extending the brief results screen using Open APIs

4.2.3 The Detailed Results Screen

Taking the detailed results screen for 'Quantum Nonlinear Optics' (ISBN-10: 354042332X), a title chosen arbitrarily by the author, extension of the catalog using Open APIs was implemented as an example. As <Figure 14> shows, the current detailed bibliographic display links 'Online Access' to the Library of Congress table of contents and publisher description. In the detailed display extended with Open APIs, links were made to Google Book Search's thumbnail cover image and preview, through which users can open specific pages of the text (front cover, title page, copyright page, table of contents, index, back cover) and, notably, can read and search the table-of-contents pages exactly as printed. A summary note was added to display Amazon's product description. Links were provided to the detailed bibliographic information at Naver, Daum, Aladdin, Amazon, Google, and KERIS; from KERIS, the holding institutions among Korean university libraries can be obtained, and information on other editions of the book can be obtained through OCLC's WorldCat. Finding aids were linked so that a title search retrieves results from Google Book Search and OCLC WorldCat. Naver, Aladdin, Amazon, and others supplied the list price and sale price, and links were made to the Noranbook and Google price-comparison sites.

<Figure 14> Example of extending the detailed results screen using Open APIs

5. Conclusion

To relieve the lack of information that is one of the problems of traditional descriptive cataloging, libraries need to expand the access-point function of bibliographic records by means of unique identifiers and to extend the catalog by sharing and exploiting rich external content through Open APIs. This study comparatively analyzed the current state of eight book-related Open APIs (Naver, Daum, Aladdin, Amazon, Google, KERIS, LibraryThing, and OCLC) and, using a sample of 167 bibliographic records selected at the C University library, comparatively analyzed the usability of those Open APIs. Twelve Open API application models judged useful for expanding the access-point function of bibliographic records and for extending the catalog were implemented experimentally.
Taking the fields of the integrated KORMARC format, approaches to expanding the access-point function of bibliographic records with unique identifiers and Open APIs were examined, and, taking the brief and detailed OPAC result screens of the sample records, approaches to extending the current catalog with Open APIs were demonstrated by example.

When all eight Open APIs were applied to the 167 sample bibliographic records of the C University library, rich content proved to be available: detailed bibliographic URLs for 167 records (100%), cover images for 124 (74.3%), descriptions for 102 (61.1%), tables of contents for 108 (64.7%), reader reviews for 64 (38.3%), recommended books for 67 (40.1%), previews for 47 (28.1%), and other editions for 91 (54.5%).

By applying Open APIs to the 001, 010, 012, 020, 035, 246, 505, 520, 850, and 856 fields of the integrated KORMARC, the access-point function of bibliographic records can be expanded by the system beyond a certain level, and manual work in cataloging can be minimized. Local libraries and union catalogs need to lay a service foundation for the openness and sharing of, and cooperation and participation in, bibliographic records by providing persistent-link services based on the control number in the 001 field. For machine linking between the records of originals and translations, subfields ▾w (record control number) and ▾z (ISBN) should be newly established in the 246 field of the integrated KORMARC, and the ISBN should be recorded in subfield ▾z of the 507 field of the existing bibliographic-data KORMARC.

To extend the catalog with Open APIs, local libraries need to raise the match rate between bibliographic records and electronic resources; that is, catalog quality control is needed, covering retrospective conversion of unique identifiers such as ISBNs, LCCNs, OCLC numbers, and partner control numbers, and the correction of omissions and errors. To reduce local cataloging costs, union catalogs should use Open APIs to provide lookup tools or metadata for describing the 856 field. Library system vendors need to improve the MARC editing functions of their cataloging systems and the interfaces of their OPAC search systems so that electronic resources can be cataloged using Open APIs. In the future, it should also become possible to systematically manage and analyze usage with respect to the gains achieved by expanding the access-point function of bibliographic records and extending the catalog with Open APIs: improved satisfaction among catalog users, improved efficiency of cataloging work, and increased use of the collection.

References

[1] The National Library of Korea, ed. 2006. Korean Machine Readable Cataloging Format: Integrated Format for Bibliographic Data. Seoul: Korean Library Association. (in Korean)
[2] Naver. 2009. "Open API - Book Search." [online]. [cited 2009.3.19]. (in Korean)
[3] Daum. 2009. "Book Search API." [online]. [cited 2009.3.19]. (in Korean)
[4] Korean Publishers Association. 2006. "Book Search & Preview Service Guidelines." [online]. [cited 2009.3.19]. (in Korean)
[5] Aladdin. 2009. "Book (Product) Search API." [online]. [cited 2009.3.19]. (in Korean)
[6] KERIS. 2008. "RISS Open API." [online]. [cited 2009.3.19]. (in Korean)
[7] Amazon. 2009. "Amazon Web Services Developer Community: Docs: Amazon Associates Web Service (API Version: 2009-02-01)." [online]. [cited 2009.3.19].
[8] Breeding, Marshall. 2007. "Next-Generation Library Catalogs." Library Technology Reports, 43(4).
[9] Byrum, J. D., and Williamson, D. W. 2006. "Enriching Traditional Cataloging for Improved Access to Information: Library of Congress Tables of Contents Projects." Information Technology and Libraries, 25(1): 4-11.
[10] Faiks, Angi, Radermacher, Amy, and Sheehan, Amy. 2007. "What ABOUT the Book? Google-izing the Catalog with Tables of Contents." Library Philosophy and Practice, 9(3): 1-12.
[11] Google. 2008. "Google Book Search APIs." [online]. [cited 2009.3.19].
[12] Hildreth, C. R. 1991. "Advancing toward the E3 OPAC: The Imperative and the Path." In Van Pulis, N. (ed.), Think Tank on the Present and Future of the Online Catalog: Proceedings (ALA Midwinter Meeting, Chicago, January 11-12, 1991), Reference and Adult Services Division, American Library Association, 17-38.
[13] LC. 2003. "Monographic Electronic Resources: BIBCO Core Record Standards." [online]. [cited 2008.3.19].
[14] LibraryThing. 2008. "LibraryThing APIs." [online]. [cited 2009.3.19].
[15] Maxwell, Robert L. 2004. Maxwell's Handbook for AACR2: Explaining and Illustrating the Anglo-American Cataloguing Rules Through the 2003 Update. Chicago: American Library Association.
[16] Miller, Karen D., and Babinec, Michael S. 2008. "Tables of Contents in the Catalog and Their Impact on Usage." [online]. [cited 2009.3.19].
[17] OCLC. 2007. "WorldCat Web service: xISBN." [online]. [cited 2009.3.19].
[18] Pattern, Dave. 2008. "University of Huddersfield -- Circulation and Recommendation Data." [online]. [cited 2009.3.19].
[19] Tennant, Roy.
2007, “Trouble in Online Paradise: An Analysis of MARC 856 Usage at One 328 한국문헌정보학회지 제43권 제2호 2009 Institution.” [online]. [cited 2009.3.19]. . •국문 참고자료의 영어 표기 (English translation / romanization of references originally written in Korean) [1] The National Library of Korea. 2006. Korean machine readable cataloging format: integrated format for bibliographic data. Seoul: Korean Library Association. [2] Naver. 2009. “Open API - Book Search.” [online]. [cited 2009.3.19]. . [3] Daum. 2009. “Book Search API.” [online]. [cited 2009.3.19]. . [4] Korean Publishers Association. 2006. “Book search & preview service guidlines.” [online]. [cited 2009.3.19]. . [5] Aladdin. 2009. “Book Search API.” [online]. [cited 2009.3.19]. . [6] KERIS. 2008. “RISS Open API.” [online]. [cited 2009.3.19]. . work_yfptd5zbezef3aqrroansnre7u ---- wp-p1m-38.ebi.ac.uk Params is empty 404 sys_1000 exception wp-p1m-38.ebi.ac.uk no 216586360 Params is empty 216586360 exception Params is empty 2021/04/06-01:36:58 if (typeof jQuery === "undefined") document.write('[script type="text/javascript" src="/corehtml/pmc/jig/1.14.8/js/jig.min.js"][/script]'.replace(/\[/g,String.fromCharCode(60)).replace(/\]/g,String.fromCharCode(62))); // // // window.name="mainwindow"; .pmc-wm {background:transparent repeat-y top left;background-image:url(/corehtml/pmc/pmcgifs/wm-nobrand.png);background-size: auto, contain} .print-view{display:block} Page not available Reason: The web page address (URL) that you used may be incorrect. Message ID: 216586360 (wp-p1m-38.ebi.ac.uk) Time: 2021/04/06 01:36:58 If you need further help, please send an email to PMC. Include the information from the box above in your message. Otherwise, click on one of the following links to continue using PMC: Search the complete PMC archive. Browse the contents of a specific journal in PMC. Find a specific article by its citation (journal, date, volume, first page, author or article title). http://europepmc.org/abstract/MED/ work_yfvk7xdluba4fh5lkt4hfo3pya ---- The Online Library Catalog: Paradise Lost and Paradise Regained? Search   |   Back Issues   |   Author Index   |   Title Index   |   Contents D-Lib Magazine January/February 2007 Volume 13 Number 1/2 ISSN 1082-9873 The Online Library Catalog Paradise Lost and Paradise Regained?   Karen Markey University of Michigan (This Opinion piece presents the opinions of the author. It does not necessarily reflect the views of D-Lib Magazine, its publisher, the Corporation for National Research Initiatives, or its sponsor.) Purpose Statement This think piece tells why the online library catalog fell from grace and why new directions pertaining to cataloging simplification and primary sources will not attract people back to the online catalog. It proposes an alternative direction that has greater likelihood of regaining the online catalog's lofty status and longtime users. Such a direction will require paradigm shifts in library cataloging and in the design and development of online library catalogs that heed catalog users' longtime demands for improvements to the searching experience. Our failure to respond accordingly may permanently exile scholarly and scientific information to a netherworld where no one searches while less reliable, accurate, and objective sources of information thrive in a paradise where people prefer to search for information. 
The Impetus for this Essay

The impetus for this essay is the library community's uncertainty regarding the present and future direction of the library catalog in the era of Google and mass digitization projects. The uncertainty is evident at the highest levels. Deanna Marcum, Associate Librarian for Library Services at the Library of Congress (LC), is struck by undergraduate students who favor digital resources over the online library catalog because such resources are available at anytime and from anywhere (Marcum, 2006). She suggests that "the detailed attention that we have been paying to descriptive cataloging may no longer be justified ... retooled catalogers could give more time to authority control, subject analysis, [and] resource identification and evaluation" (Marcum, 2006, 8). In an abrupt about-face, LC terminated series added entries in cataloging records, one of the few subject-rich fields in such records (Cataloging Policy and Support Office, 2006). Mann (2006b) and Schniderman (2006) cite evidence of LC's prevailing viewpoint in favor of simplifying cataloging at the expense of subject cataloging.

LC commissioned Karen Calhoun (2006) to prepare a report on "revitalizing" the online library catalog. Calhoun's directive is clear: divert resources from cataloging mass-produced formats (e.g., books) to cataloging unique primary sources (e.g., archives, special collections, teaching objects, research by-products). She sums up her rationale for such a directive: "The existing local catalog's market position has eroded to the point where there is real concern for its ability to weather the competition for information seekers' attention" (p. 10). At the University of California Libraries (2005), a task force's recommendations parallel those in the Calhoun report, especially regarding the elimination of subject headings in favor of automatically generated metadata.

Contemplating these events prompted me to revisit the glorious past of the online library catalog. For a decade and a half beginning in the early 1980s, the online library catalog was the jewel in the crown when people eagerly queued at its terminals to find information written by the world's experts. I despair how eagerly people now embrace Google because of the suspect provenance of the information Google retrieves. Long ago, we could have added more value to the online library catalog, but the only thing we changed was the catalog's medium. Our failure to act back then cost the online catalog the crown. Now that the era of mass digitization has begun, we have a second chance at redesigning the online library catalog, getting it right, coaxing back old users, and attracting new ones. Let's revisit the past, reconsidering missed opportunities, reassessing their merits, combining them with new directions, making bold decisions, and acting decisively on them.

Why the Online Catalog Fell from Grace

This brief account of end-user searching tells why the online catalog fell from grace.

The Reign of the Online Catalog

By the early 1980s, a critical mass of online catalog deployment had been achieved across the United States. A nationwide survey demonstrated that over 80% of library users held favorable views of this new form of the catalog (Markey, 1984, 2; Matthews, Lawrence, and Ferguson, 1983, 152). The decade and a half beginning in the early 1980s was the golden age of the online catalog, because library users depended on it almost exclusively for finding information on the topics that interested them (Farber, 1984).
The online catalog was and still is an appropriate place for people to start their search for information because books synthesize human knowledge about particular phenomena in and across disciplines. They span large intellectual spaces, tackle mammoth problems, make more intensive cases than all other literary genres, and undergo rigorous editorial review.

Paradise Lost

From the start, users wanted subject searching improved in online catalogs (Besant, 1982), they told us subject searching was difficult (Markey, 1984, 81-84; Matthews, Lawrence, and Ferguson, 1983, 155-164), and they wanted tables of contents and journal articles added to the catalog's database (Markey, 1984, 84-87; Matthews, Lawrence, and Ferguson, 1983, 118-120). Through its Bibliographic Service Development Program (Haas, 1979), the Council on Library Resources sponsored a long list of researchers to demonstrate subject access improvements to online catalogs (see list specifics in Drabenstott, 1991). By the early 1990s, researchers recommended these solutions:

- Make subject searching in online catalogs easier using post-Boolean probabilistic searching with automatic spelling correction, term weighting, intelligent stemming, relevance feedback, and output ranking (Hildreth, 1989, 1995; Drabenstott, 1991; Walker, 1989)
- Streamline users' book selection decisions at the catalog by adding tables of contents and back-of-the-book indexes to cataloging (i.e., metadata) records (Atherton, 1978; Wormell, 1981; Markey and Calhoun, 1987)
- Reduce the many failed subject searches by expanding the online catalog with full texts—journal and newspaper articles, encyclopedias, dissertations, government documents, etc. (Drabenstott, 1991; Tiefel, 1991)
- Increase finding strategies in online catalogs through the library classification (Markey and Demeyer, 1986; Larson, 1991)

The reasons why these solutions were not applied to online library catalogs to transform the user experience are subtle, nuanced, and varied: (1) the library profession's longtime obsession with descriptive cataloging, (2) the focus of the technical services department on other priorities, e.g., retrospective conversion, cataloging backlogs, authority control, etc., (3) the profession's conscious shift away from supporting technical services in favor of public services, (4) the ever-increasing per-item cataloging cost, (5) the failure of the research community to arrive at a consensus about the most pressing needs for online catalog system improvement and to field cost-conscious solutions, (6) the failure of the library staff issuing Requests for Proposals (RFPs) to act in concert about needed system improvements, (7) lower-than-inflation funding allocations for libraries, (8) the costs of building collections and licensing resources pushing well beyond the rate of inflation, giving rise to the open-access movement, (9) the high cost of integrated library system (ILS) technology generally, and (10) the failure of ILS vendors to monitor shifts in information-retrieval technology and respond accordingly with system improvements. In the end, widely disconnected organizations and market forces failed to converge in a direction that kept users queuing at the online catalog.

The Reign of Google

In the late 1990s, the World-Wide Web grew exponentially. For-profit software vendors deployed search engines such as Alta Vista, Excite, and Hotbot to showcase full-text searching to prospective software purchasers specifically and to Internet searchers generally.
Ironically, these systems embraced post-Boolean searching, the very technology that online catalog vendors eschewed (Calhoun, 2006, 26; University of California Libraries, 2005, 17; Yu and Young, 2004, 168). By the early 2000s, Google, a for-profit company with the objective of "organizing the world's knowledge" (Google, 2005), registered 700 times more searches on a daily basis than the online library catalog for the statewide campuses of the University of California served on a monthly basis (Cooper, 2001; Sherman, 2003). Google now reigns. Given the company's tremendous investment in digitization projects (Google, 2004), Google has every intention of keeping its exalted position for some time to come. The company has deep pockets, innovative leadership, high-level technical talent, and a proven track record of delivering successful products to the marketplace.

Why Do People Prefer Google as a Starting Point?

To answer this question, this section summarizes a quarter-century of research findings about people's information-seeking behavior.

Searching for Information in the Library Puts People on an Emotional Roller Coaster

"I despise searching the library for books and other sources. It takes a long time and rarely can you find sources needed. This difficult process is the first thing I think of when I think of using the library" (De Rosa et al., 2005, 1-22). The frustration that this 18-year-old expresses about searching for library resources fits the Information Search Process (ISP) model to a "T" (Kuhlthau, 1993). Not only does the ISP model tell us that people experience a wide range of negative and positive emotions during their search for information, it tells us exactly what they are doing when their emotions roller-coaster up and down (Kracker, 2002; Kracker & Wang, 2002; Swain, 1996).

Putting One's Information Needs into Words Is Downright Difficult

Many researchers express surprise at the brevity (from one to three words) of the queries people submit to online systems (Markey, in press). Belkin (1980, 137) tells why so few words make up their queries: "Precisely because of the inquirer's lack of knowledge about a problem area, it is impossible to specify what would resolve it." For Belkin, the saving grace is the inquirer's ability to recognize what he or she wants or does not want during the course of the search. Therein lies an important solution to the problem—information systems that report results for easy eyeballing and instantaneous recognition of relevant possibilities.

Domain Expertise—It's All about Knowing What You Want and Where to Look

Domain experts—scholars, scientists, and experienced researchers who have expert knowledge of their discipline as a whole and in-depth knowledge about a couple of ideas that ranks them amongst the world's experts—know the unanswered research questions, sticky controversies, and active scholars in their discipline. Rarely, if ever, do they need to conduct the brute-force subject searches that characterize the searches of domain novices (Ellis, 1989; Land & Greene, 2000; Drabenstott, 2003). When they are stumped, their standing in the field gives them carte blanche to contact the world's experts to get answers to questions about who is doing research or has published on a topic. Primary sources are truly the intellectual playground of domain experts: they use primary sources to make new discoveries, and the by-products of their research are the creation of new primary sources.
Most people are domain novices about their topics of interest. Undergraduate students especially are just beginning to learn the summary knowledge of a discipline. They have no depth and do not know the discipline's influential authors, important questions, cutting-edge research, or research methodologies. Building a catalog of the future that is biased toward primary sources does not serve the interests of domain novices. Imagine a future University of Michigan co-ed whose professor assigns her a term paper on Kukulcan. Before cracking open her textbook to learn the absolute basics about Kukulcan, she searches Calhoun's online library catalog of the future and retrieves images of Kukulcan sculptures from the University's Kelsey Museum. Because she has no knowledge of Kukulcan nor of the Mesoamerican culture from which Kukulcan derives, she would not understand what the sculptures mean, how to make sense of the minimal metadata usually associated with museum objects such as these, or how the images now figure into her ongoing search for information or the term paper her instructor has assigned her to write. Diverting our existing online library catalogs away from books to primary sources will drive this co-ed and her peers back to the simplicity of Google as quickly as one can say "Kukulcan."

Searching for Something One Does Not Know Is Frenetic, Aimless, and Random

Because many people are searching online systems for something they do not know, their behavior is neither targeted nor direct. "Students often use very chaotic, what they themselves term 'random,' methods for finding materials for their papers. A characteristic comment is: 'I felt kind of aimless, kind of like shooting in the dark, you're going to get something eventually'" (Valentine, 2001, 112). Debowski (2001, 378) makes similar observations: "It was evident that [people] spent more time inputting, rather than planning a suitable search process. There was little evidence of search quality assessment ... with most entering the next search statement very rapidly ... [People] who search without a solid foundation fail to gain a stronger understanding of the search process. Instead, they appear to develop further erroneous habits as they continue." Land & Greene (2000, 57) attribute such meandering to low levels of metacognitive knowledge, "the process of reflecting on or monitoring the effectiveness of the search process and then refining the process when necessary," and note its pervasiveness in the searches of domain novices.

People's Starting Point Is Google and the Web

People start their quest for knowledge and understanding with Google (De Rosa et al., 2005; Awre et al., 2005; Griffiths and Brophy, 2005; Fast and Campbell, 2004; Pew Research Center, 2003; OCLC, 2002; Outsell, 2000). It ranks the most basic, elementary, and easy-to-understand information at or near the top of the heap. If you are not convinced, do a search for something you know nothing about, like "kukulcan." Right at the top of Google's list are web sites that tell who Kukulcan is, alternative names for this Maya god, and, in the case of the Wikipedia entry, links to both online and print sources for learning more. The World-Wide Web has become the people's encyclopedia of choice. Google and other web search engines give people a good start and, in fact, with Wikipedia links in hand, a running start for building on their bare-bones, basic knowledge of a topic.
The web also satisfies people's voracious appetites for full texts (Bar-Ilan and Fink, 2005). Instead of strolling in the library stacks to find a book, people want to stay put in their homes and offices and retrieve full texts with a click of a button. Asked about the reliability, accuracy, and objectivity of the information they retrieve on the web, people express concern, but there is little evidence that they act on their concern (De Rosa et al., 2005; Griffiths and Brophy, 2005; Fast and Campbell, 2005; Marcum and George, 2003; Outsell, 2000). As such, searching the web specifically, and searching for information generally, conforms to the principle of least effort: "The design of any ... information system should be the system's ease of use ... If an organization desires to have a high quality of information used, it must make ease of use of primary importance" (Rosenberg, 1966, 19).

A Second Chance to Redesign the Online Library Catalog

To regain the online catalog's lofty status and win back its fair-weather fans, let's redesign an online library catalog that embraces: (1) post-Boolean probabilistic searching, to ensure the precision of searches in online library catalogs bearing the full texts of digitized books, journal articles, encyclopedias, conference papers, etc.; (2) subject cataloging, to take advantage of the user's ability to recognize what they want or do not want during the course of the search; and (3) qualification cataloging, to enable users to customize retrievals that are in keeping with their level of understanding and expertise in a domain.

Embrace Post-Boolean Probabilistic Searching

Long overdue is the replacement of our outdated Boolean-based catalogs with the post-Boolean, probabilistic retrieval methods that characterize Google and other web search engines (University of California Libraries, 2005, 17). Why does post-Boolean probabilistic searching do so well? Susan Feldman (1998, 40-41) sums it up best: "These systems are doing what you as [expert] searchers have learned to do yourselves. They look for terms that can distinguish one document from another, they ask for the terms to appear close together in the document, they stem words, and they count words that appear in the title more heavily than those appearing in the rest of the text ... Some systems also try to match query concepts... They enlarge a search beyond the boundaries that the query originally defined." In the post mass digitization era, every word and phrase from millions of digital texts of all literary genres will be at the fingertips of online library catalog users. Giving users a Boolean-based system to search digitized texts is comparable to giving Captain Kirk a Mercury-era space capsule to travel the galaxy.

Embrace Subject Cataloging

When people can search every word that has ever been written, one is hard pressed to find evidence in support of subject cataloging. Yet such evidence has been right under our noses for several years, thanks to a report Marcia Bates prepared for the Library of Congress. The evidence pertains to the 30-to-1 ratios that characterize access to stores of information (Dolby and Resnikoff, 1971). With respect to books, titles and subject headings are 1/30 the length of a table of contents, tables of contents are 1/30 the length of a back-of-the-book index, and the back-of-the-book index is 1/30 the length of a text. Similar 30-to-1 ratios are reported for the journal article, card catalog, and college class.
"The persistence of these ratios suggests that they represent the end result of a shaking down process, in which, through experience, people became most comfortable when access to information is staged in 30-to-1 ratios" (Bates, 2003, 27). Recognizing the implications of the 30-to-1 rule, Atherton (1978) demonstrated the usefulness of an online catalog that filled the two 30-to-1 gaps between subject headings and full-length texts with tables of contents and back-of-the-book indexes. In the post mass digitization era, subject headings, class numbers, classification captions, and entries from tables of contents entries and back-of-the-book indexes should figure prominently in the post-Boolean probabilistic catalog's: Ranking algorithms. Such algorithms should be profiled to give much higher weights to subject headings, classification captions, and entries from tables of contents and book indexes than to words buried deep in the text. Brief-document displays. Everyone is familiar with Google's brief-document displays that list keywords, phrases, and sentence fragments from retrieved web pages. Users scan these displays to determine what the page is about and whether to link it. Even better would be document titles, subject headings and classification captions to expedite scanning for relevant items in long lists of retrievals. Relevance feedback ("find more like") mechanisms. Relevance feedback algorithms should weight titles and subject headings much higher than words buried deep inside texts. North Carolina State University's new Endeca online catalog gives us previews of relevance feedback for virtual classification browsing points and for faceted LC subject headings (Antelman, Lynema, and Pace, 2006). Mann (2005; 2006a) extols the benefits of maintaining LC subject headings in their current form; Anderson and Hofmann (2006) advocate faceting LC headings. Expand with Qualification Metadata Metadata that is essential for users in the post mass digitization age must facilitate their document-selection decisions. Here is a list of document attributes that would enable users to qualify retrievals with greater precision and customize them according to their level of understanding and knowledge of a domain: In a discipline: in biology, in computer science, in the history of art, in mathematics, in meteorology, in physics, in theology, etc. With knowledge of this subject at a particular academic level: with an elementary education, with a high school education, with a college education, etc. To what extent the author is an authority on the topic at hand. For a particular class of people: for teens, for seniors, for shut-ins, etc. Is a particular genre or of a particular literary nature: encyclopedias, law, newspapers, poetry, history, bibliography, research, diaries, statistics, state-of-the-art review, dissertation, first-person account, fiction, etc. When the particular subject took place: 16th century, Age of Enlightenment, Victorian Era, 1939-1945, etc. What can be done with the document: buy, read, solve, calculate, download, play games, chat, sell, gamble, search, listen, watch, etc. How others benefited from using the document, i.e., reviews and ratings. What kind of experience the user gets from the document: scary stories, sad pictures, funny jokes, break-your-heart lyrics, etc. The above list is by no means comprehensive. Examine the major databases across the disciplines to expand on the above list and to gather the controlled vocabularies these databases use for each attribute. 
If I was starting my search for Kukulcan, I might be inclined to qualify retrievals by setting the with attribute at "a high school education" and the is-of attribute at "encyclopedias." If I was farther along in my exploration, I might up the ante by setting with to "a college education" and is-of to "history," "research," and "bibliography." If I was settling a bet, I might not be concerned about the to-what-extent setting, but if I was integrating what I had found into my senior thesis, I would be tempted to set to-what-extent at a "high" level to limit retrievals to domain experts writing in their chosen field. Again, North Carolina State University's Endeca online catalog gives us a preview—it shows how existing metadata elements can be used to qualify search results. Adding the qualification metadata listed above could make our future post-Boolean probabilistic catalogs even more versatile than what is possible with the metadata in today's cataloging records.

Today, people voluntarily add metadata (they call them tags) to texts and multimedia, e.g., web sites (del.icio.us, Shadows, and MyWeb), blog posts (Technorati and RawSugar), images (Flickr), and videos (YouTube) (Wash and Rader, 2006; Xu et al., 2006). Instead of eliminating metadata, our field should be studying user-added metadata and adding what users want to the metadata in the future online catalog.

Ameliorating the Full-Text Retrieval Problem

The recommendations presented in this think piece about post-Boolean searching, subject cataloging, and qualification metadata are intended to ameliorate the full-text retrieval problems inherent in Google/Open Content Alliance digitized text (Tennant, 2005). In the online catalog of the post mass digitization era that searches millions and millions of full texts, imagine the results of your searches for the queries "kukulcan," "aztecs," or "spanish conquest." Each search will result in millions of hits with no guarantee that the top-ranked ones will address your desired topic in depth or at your level of understanding. Enlisting post-Boolean retrieval algorithms on rich, authority-driven metadata is imperative for ensuring the precision of search results in the online catalog of the future.

Building the Future Online Catalog Now

Before mass digitization projects make significant headway, the library community must act on building the future online catalog, joining forces with researchers, practitioners, and system designers in related and allied fields to: (1) gather relevant information, (2) test prototype post-digitization-era catalogs, (3) evaluate results and make decisions, (4) assign tasks to willing parties, and (5) execute them.

The information-gathering phase must include definitions of the future online library catalog. Will books dominate, or will future catalogs feature the full gamut of scholarly products and by-products? To get us started is Christine Borgman (2006, 2007) with her extensive research on the future of scholarly communication. With regard to subject access in the catalog of the future, we should consider all options, e.g., continuing the status quo, enlisting human indexers to apply faceting, restricting faceting to computer-based approaches, assessing automatic subject cataloging and classification, and eliminating subject analysis altogether.
Here are examples of subject-access functionality in future online catalog prototypes that should be assessed in the testing phase:

- Ranking algorithms that give the highest weights to the summary data in metadata records, such as titles, subject headings, class numbers, and qualification metadata, to ensure the precision of ranked output
- Relevance feedback (i.e., "find more like this") mechanisms that weight subject headings, titles, class numbers, and qualification metadata higher than words and phrases buried deep inside digitized texts
- Data elements that users want to see in the catalog's brief displays of retrieved items
- Document attributes that are most useful for qualifying retrievals so that retrievals are relevant and users are intellectually prepared to understand their contents
- Qualification attribute selection routines that are easy for searchers to understand and use
- The role of citation data for searching, ranking, retrieval, relevance feedback, and display
- Ability to display and manipulate full texts, e.g., searching, navigating, underlining, note-taking, writing in the margins, sharing with peers, etc.
- Metadata assignment (i.e., tagging) procedures that encourage users to participate, perhaps by rewarding them for their assignments
- Integration of online library catalog searching into the larger scenario of information seeking generally—Google and the Internet generally, journal searching, searching the invisible web, institutional repository searching, etc.

In the past, the library community has left decision-making to a few key individuals, advisory groups, organizations, or professional societies for reasons that deserve examination elsewhere. No longer should decisions be left to a few. First, we have the technology to be inclusive in the decision-making phase. Second, we are facing an uncertain future in which we may experience a shift in the balance from the primacy of a few large institutions, their collections, authority, and staff expertise to a federation that requires the participation of all in the creation of a new and different comprehensive whole. Third, successful deployment of shared, technology-based decision-making could set the standard for future decision-making within the discipline and inspire other disciplines to embrace the approach. Being inclusive during the decision-making process may be a necessity to secure everyone's participation during the task-assignment and execution phases. Finding today's equivalent to yesterday's Bibliographic Service Development Program to support such an ambitious plan of action would certainly facilitate the building of the future online library catalog.

Conclusion

Whether the library community adopts this think piece's recommendations or goes in a different direction, the time is right to rethink library cataloging and online catalogs. Reading and synthesizing Marcum, Calhoun, Bates, Mann, Hildreth, Borgman, Anderson, etc., should be mandatory for everyone who cares about the future of the online library catalog. The next steps must be to engage all interested parties in serious dialogue, system prototyping, decision making, and action so the online library catalog of the future hits the ground running just as mass digitization projects end. Should we fail to act until all the books are digitized and the copyright problems are solved, the last person to leave the digitization workroom may be turning off the lights on the library.

Acknowledgment

The author is grateful to Pauline Cochrane, Martha O'Hara Conway, Paul Conway,
C. Olivia Frost, Charles R. Hildreth, Thomas Mann, William H. Mischo, and Soo Young Rieh, who provided thoughtful reviews of early drafts.

References

Anderson, James D., and Melissa A. Hofmann. 2006. A Fully Faceted Syntax for Library of Congress Subject Headings. Cataloging & Classification Quarterly 43, 1: 7-38.

Antelman, Kristin, Emily Lynema, and Andrew K. Pace. 2006. Toward a Twenty-first Century Library Catalog. Information Technology & Libraries 25, 3 (September): 128-139.

Atherton, Pauline. 1978. Books Are for Use: Final Report of the Subject Access Project to the Council on Library Resources. Syracuse, NY: School of Information Studies.

Awre, Chris et al. 2005. The CREE Project: Investigating User Requirements for Searching within Institutional Environments. D-Lib Magazine 11, 10 (October).

Bar-Ilan, Judith, and Noa Fink. 2005. Preference for Electronic Format of Scientific Journals: A Case Study of the Science Library Users at the Hebrew University. Library and Information Science Research 27, 3: 363-376.

Bates, Marcia J. 2003. Task Force Recommendation 2.3 Research and Design Review: Improving User Access to Library Catalog and Portal Information; Final Report (Version 3). June 1.

Belkin, Nicholas J. 1980. Anomalous States of Knowledge as a Basis for Information Retrieval. Canadian Journal of Information Science 5: 133-143.

Besant, Larry. 1982. Early Survey Findings: Users of Public Access Catalogs Want Sophisticated Subject Access. American Libraries 13, 3: 160.

Borgman, Christine L. 2006. Disciplines, Documents, and Data: Convergence and Divergence in the Scholarly Information Infrastructure. ISI Samuel Lazerow Memorial Lecture, University of Tennessee, October 18.

Borgman, Christine L. 2007. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, MA: MIT Press.

Calhoun, Karen. 2006. The Changing Nature of the Catalog and its Integration with Other Discovery Tools; Final Report, Prepared for the Library of Congress. March 17.

Cataloging Policy and Support Office. 2006. Series at the Library of Congress.

Cooper, Michael D. 2001. Usage Patterns of a Web-based Library Catalog. Journal of the American Society for Information Science & Technology 52, 2: 137-148.

De Rosa, Cathy et al. 2005. Perceptions of Libraries and Information Resources: A Report to the OCLC Membership. Dublin, Ohio: OCLC.

Debowski, Shelda. 2001. Wrong Way, Go Back! An Exploration of Novice Search Behaviours while Conducting an Information Search. The Electronic Library 19, 6: 371-382.

Dolby, J. L., and Howard L. Resnikoff. 1971. On the Multiplicative Structure of Information Storage and Access Systems. Interfaces, The Bulletin of the Institute of Management Sciences 1: 23-30.

Drabenstott, Karen M. 1991. Online Catalog User Needs and Behaviors. In Think Tank on the Present and Future of the Online Catalog: Proceedings, edited by Noelle Van Pulis, 59-84. Chicago: Reference and Adult Services Division, American Library Association.

Drabenstott, Karen M. 2003. Do Non-domain Experts Enlist the Strategies of Domain Experts? Journal of the American Society for Information Science and Technology 54, 9: 836-854.

Ellis, David. 1989. A Behavioural Approach to Information Retrieval System Design. Journal of Documentation 45: 171-212.

Farber, Evan. 1984. Catalog Dependency. Library Journal 109, 3 (February 15): 325-328.

Fast, Karl V., and D. Grant Campbell. 2004. "I Still Like Google:" University Student Perceptions of Searching OPACs and the Web.
In Proceedings of the ASIS Annual Meeting (2004), 41: 138-146. Medford, NJ: Information Today.

Feldman, Susan. 1998. Where Do We Put the Web Search Engines? Searcher 6, 10: 40-57.

Google. 2004. Google Checks Out Library Books.

Google. 2005. Corporate Information: Company Overview.

Griffiths, Jillian, and Peter Brophy. 2005. Student Searching Behavior and the Web: Use of Academic Resources and Google. Library Trends 53, 4 (Spring): 539-554.

Haas, Warren J. 1979. Managing the Information Revolution: CLR's Bibliographic Service Development Program. Library Journal 104, 16 (September 15): 1867-1870.

Hildreth, Charles R. 1989. Intelligent Interfaces and Retrieval Methods for Subject Searching in Bibliographic Systems; Prepared for the Library of Congress. Washington, DC: Cataloging Distribution Service.

Hildreth, Charles R. 1995. Online Catalog Design Models: Are We Moving in the Right Direction? Report Prepared for the Council on Library Resources.

Kracker, Jacqueline. 2002. Research Anxiety and Students' Perceptions of Research: An Experiment; Part I: Effect of Teaching Kuhlthau's ISP Model. Journal of the American Society for Information Science & Technology 53, 4: 282-294.

Kracker, Jacqueline, and Peiling Wang. 2002. Research Anxiety and Students' Perceptions of Research: An Experiment; Part II: Content Analysis of their Writings on Two Experiences. Journal of the American Society for Information Science & Technology 53, 4: 295-307.

Kuhlthau, Carol C. 1993. Seeking Meaning: A Process Approach to Library and Information Services. Norwood, NJ: Ablex.

Land, Susan M., and Barbara A. Greene. 2000. Project-based Learning with the World Wide Web: A Qualitative Study of Resource Integration. ETR&D 48, 1: 45-67.

Larson, Ray R. 1991. Classification Clustering, Probabilistic Information Retrieval, and the Online Catalog. Library Quarterly 61, 2 (April): 133-173.

Mann, Thomas. 2005. Will Google's Keyword Searching Eliminate the Need for LC Cataloging and Classification?

Mann, Thomas. 2006a. The Changing Nature of the Catalog and its Integration with Other Discovery Tools. Final Report. Prepared for the Library of Congress. Critical Review. April 3.

Mann, Thomas. 2006b. What is Going on at the Library of Congress? June 19.

Marcum, Deanna B. 2006. The Future of Cataloging. Library Resources & Technical Services 50, 1 (January): 5-9.

Marcum, Deanna B., and Gerald George. 2003. Who Uses What? Report on a National Survey of Information Users in Colleges and Universities. D-Lib Magazine 9, 10 (October).

Markey, Karen. 1984. Subject Searching in Library Catalogs. Dublin, Ohio: OCLC.

Markey, Karen, and Anh N. Demeyer. 1986. Dewey Decimal Online Classification Project. Dublin, OH: OCLC.

Markey, Karen. In press. 25 Years of Research on End-user Searching. Journal of the American Society for Information Science & Technology.

Markey, Karen, and Karen Calhoun. 1987. Unique Words Contributed by MARC Records with Summary and/or Contents Notes. In ASIS '87, Proceedings of the 50th ASIS Annual Meeting, edited by Ching-chih Chen, 153-162. Medford, NJ: Learned Information.

Matthews, Joseph R., Gary S. Lawrence, and Douglas K. Ferguson. 1983. Using Online Catalogs: A Nationwide Survey. New York: Neal-Schuman.

OCLC. 2002. How Academic Librarians can Influence Students' Web-Based Information Choices. Dublin, Ohio: OCLC.

Outsell Inc. 2000. Today's Students, Tomorrow's FGUs. Information About Information Briefing 3, 24 (October 16): 1-25.

Pew Research Center. 2003.
America's Online Pursuits: The Changing Picture of Who's Online and What They Do.

Rosenberg, Victor. 1966. The Application of Psychometric Techniques to Determine the Attitudes of Individuals toward Information Seeking. Bethlehem, PA: Lehigh University. AD 637 713.

Schniderman, Saul. 2006. Statement of Saul Schniderman representing The Library of Congress Professional Guild, AFSCME Local 2910, before the Committee on House Administration concerning the World Digital Library. July 27.

Sherman, Chris. 2003. ComScore Launches Search Engines Tracking System.

Swain, Deborah E. 1996. Information Search Process Model: How Freshmen Begin Research. In Proceedings of the 59th Annual Meeting of the American Society for Information Science, 33, 95-99. Medford, NJ: Information Today.

Tennant, Roy. 2005. Google, the Naked Emperor. Library Journal 130, 13 (August): 29.

Tiefel, Virginia. 1991. The Gateway to Information: A System Redefines How Libraries are Used. American Libraries 22, 9 (October): 858-860.

University of California Libraries. 2005. Rethinking How We Provide Bibliographic Services for the University of California: Final Report, December 2005, Bibliographic Services Task Force.

Valentine, Barbara. 2001. The Legitimate Effort in Research Papers: Student Commitment versus Faculty Expectations. Journal of Academic Librarianship 27, 2 (November): 107-115.

Walker, Stephen. 1989. The OKAPI Online Catalogue Research Projects. In The Online Catalogue: Developments and Directions, edited by Charles R. Hildreth, 84-106. London: The Library Association.

Wash, Rick, and Emilee Rader. 2006. Incentives for Contribution in del.icio.us: The Role of Tagging in Information Discovery. Manuscript submitted to CHI'07 Conference.

Wormell, Irene. 1981. Subject Access Project: The Use of Book Indexes for Subject Retrieval Systems in Libraries. International Forum on Information and Documentation 6, 4: 24-28.

Xu, Zhichen et al. 2006. Toward the Semantic Web: Collaborative Tag Suggestions. Collaborative Web Tagging Workshop.

Yu, Holly, and Margaret Young. 2004. The Impact of Web Search Engines on Subject Searching in OPAC. Information Technology & Libraries 23, 4 (December): 168-180.

Copyright © 2007 Karen Markey
doi:10.1045/january2007-markey

----

Sharing Cataloging Expertise: Options for Libraries to Share Their Skilled Catalogers with Other Libraries

Cataloging & Classification Quarterly (2010), v. 48, n. 6-7, pp. 525-540. ISSN: 0163-9374 (print) / 1544-4554 (online). DOI: 10.1080/01639374.2010.495694
http://www.tandfonline.com/
http://www.tandfonline.com/loi/wccq20
http://www.tandfonline.com/doi/pdf/10.1080/01639374.2010.495694
©2010, Taylor & Francis Group, LLC

MAGDA EL-SHERBINI
The Ohio State University Libraries, Columbus, Ohio, USA

Library cooperation is a flexible concept that involves practically all aspects of library technical operations. Until recently, areas of cooperation have included mostly interlibrary borrowing and the union catalogs. Materials processing remains a domain of each individual library that maintains its own experts and uniquely skilled staff to process their own materials.
This study raises the question of whether libraries can also share their cataloging expertise with other institutions. The five models presented here will demonstrate how libraries can leverage existing library expertise and reduce duplication of effort, while at the same time enhancing cooperation among libraries and maintaining the high cataloging standards that are a must in the new technology era.

INTRODUCTION

A recent article published by this author in collaboration with another author examined the current library practice of processing and delivering information.1 The objectives of that article were to examine the mechanism for delivering and processing bibliographic information and to propose alternative solutions. These alternative solutions were presented in the form of conceptual models. The current article is a continuation of that earlier reflection.

In the earlier article, the authors looked at some typical ways in which libraries share bibliographic records. A brief study of the problem revealed that, for the most part, libraries engaged in traditional forms of cooperation and continued to rely on their Online Public Access Catalog (OPAC), OCLC, and, in some cases, library consortia. The article concluded with the idea that "by eliminating the middle steps of creating, accessing, and retrieving information via intermediaries such as regional consortia, OCLC, and costly OPACs, libraries might realize substantial savings that could be diverted to enrich bibliographic records that form the foundation of the current bibliographic structure."2

Examining the status of cataloging and its many aspects in the dynamically changing technology environment is a work in progress. Cataloging as such can be described as the process of describing the information package following well-established guidelines and applying time-tested standards. Technological developments make that process easier and frequently offer opportunities to review some assumptions and common practices.

Although the terms "cooperative" and "sharing" are often used interchangeably in library literature, there is a slight difference between the two. According to the dictionary definition, "cooperative" means "done with or working with others for a common purpose or benefit." The term "cooperative cataloging" is a prime example from library literature. The term "sharing," however, means "to allow someone to use or enjoy something that one possesses."3 These are the concepts that are being explored in the current study.

Library cooperation is a flexible concept that spans practically all aspects of library technical operations. Until recently, areas of cooperation have included mostly interlibrary borrowing and the union catalogs. Materials processing remains the domain of each individual library, which maintains its own experts and uniquely skilled staff to process its own materials. Libraries continue to cooperate in processing their materials and making them available to each other through the OCLC database. This study raises the question of whether libraries can expand their services by sharing their cataloging expertise with other institutions. As library administrators continue to seek solutions to the budgetary restrictions facing them, and as the number of professional catalogers continues to decline,4 libraries have an opportunity to expand the concept of cooperative cataloging to include the idea of sharing their cataloger expertise with other institutions on a cooperative basis.
The author of this article has put together some ideas about ways in which libraries can take advantage of their cataloging prowess and leverage their existing resources to further expand the fundamental idea of cooperation in new ways. What is being proposed is a way for libraries to reduce the cataloging load of individual institutions, improve the quality of records, share cataloging resources, and bring the concept of cooperation to a new level. The models presented here might allow libraries to go beyond providing the current cataloging service to their own institutions. The proposed options call for an examination of the cataloging operations of libraries and suggest the possibility of creating a more comprehensive cooperative cataloging program at the consortium level. A consortium might include multiple consortia, the Association of Research Libraries (ARL), the Committee on Institutional Cooperation (CIC), school libraries, or any other type of library grouping. More specifically, the ideas presented in these models are an attempt to raise awareness of the possibilities for libraries to eliminate redundancies and duplication in cataloging, share cataloging expertise among institutions, create new and enhanced levels of specialization in the cataloging community, improve the quality of records, reduce or eliminate local practice, and make cooperation a tool for long-term planning. These concepts are nothing more than a set of frameworks that can be applied directly, or adapted to satisfy the requirements of any existing or newly formed group or consortium. They may be used by large or small institutions, regional groups, or statewide networks, and may be suitable for academic, public, school, or special libraries. They share some common features, while each model introduces something unique.

LITERATURE REVIEW

Current literature is replete with articles that point out the flaws in current cataloging practice. In his article "Tomorrow Never Knows: The End of Cataloging?" Danskin points out that cataloging needs to change in order to survive. 5 Marcum noted that the Library of Congress (LC) spends about forty-four million dollars on cataloging every year. She also raised the important question of the costs involved in the creation of detailed catalog records, and this is likely to be an issue that will be discussed in the future. 6 Calhoun reported that American research libraries spent about $239 million on technical services labor in 2004. 7 These figures identify one area in library processing that deserves some, and perhaps significant, change. Cost is the one common factor in many of these deliberations. In a recent article, "Columbia, Cornell Join Up," the authors announced that the two large academic libraries at Columbia University in New York City and Cornell University in Ithaca, NY (2CUL) are collaborating on collection development and will begin with $385,000 from the Andrew W. Mellon Foundation. 8 They also announced that the second focus would be on cooperative acquisition, cataloging, and e-resource management. The 2CUL initiative is an example of sharing budgets and expertise in the processing of library collections. Libraries tend to customize cataloging to satisfy their own customers and needs. Applying local practices to each institution's records affects the sharing of bibliographic records, not only in print formats but also in non-book formats.
Naun and Braxton addressed this issue in their "Developing Recommendations for Consortial Cataloging of Electronic Resources: Lessons Learned." 9 They acknowledge that the historical data and local practices that each library within a consortium applies can greatly affect the feasibility of migrating data and, as a consequence, can constrain future practice. One of the major problems facing libraries today is the shrinking of resources and the reduction of budgets that support library operations. Cataloging is one of the most affected areas in the library. In an earlier study, Riemer and Morgenroth underscored the growing importance and value of cooperative cataloging in the library community. 10 Wolven discussed the shared cataloging model currently used by libraries, and discussed the need for new cataloging models that take into consideration the issues facing libraries. He proposed a simplified, cost-effective solution to cataloging that would be streamlined and more transparent to users. 11 What is behind this idea is the notion that cataloging can be outsourced, and performed outside the library, with limited adherence to existing standards. The proposed solution may be cost-effective in the short term, but it raises the question of what effect it will have on the user's ability to identify and retrieve the item that is needed. Steinhagen, Hanson, and Moynahan's article continues the main line of thought of their predecessors by focusing on the changes taking place in cataloging. 12 It addresses the golden era of international cooperative cataloging in the 1970s and 1980s, when libraries' budgets were abundant and the publishing industry was supplying academic libraries with large numbers of titles in order for them to compete for the top spot among the ARL. The authors discussed how this golden era changed in the 1990s and early 2000s, when budgets ceased to grow at the previous rates or decreased. At the same time, new developments and issues began to impact the cataloging world. New or changed cataloging rules were being introduced, in part to manage new and increasingly complex formats; huge unmanaged backlogs continued to grow; and demand for more access gave rise to the blooming trend of outsourcing. The profession experienced a rising rate of retirements among skilled catalogers, as the Internet exploded on the information scene and the demand for access to online resources grew. In their discussion they emphasized that "library administrators should cultivate local cataloging expertise through on-the-job training and professional workshops for catalogers. In the longer run, administrators must recognize that outsourcing cataloging to vendors and/or utilities has its limitations, if fewer original catalogers are left to populate and refine the databases." 13 They continued their discussion by emphasizing that "cooperative cataloging activities should continue so that all can benefit from the growth of international databases that will bring us closer to the dream of universal bibliographic control." 14 They concluded by pointing out that there will be more changes in cataloging in the coming years, and that catalogers are accepting and welcoming these changes. Their skills will be needed not for bibliographic description, but for providing access to the intellectual content through controlled vocabulary and authority control.
Three major reports have been issued in the last five years addressing the issues that affect cataloging, the future of the catalog, and cataloging in general. "On the Record: Report of the Library of Congress Working Group on the Future of Bibliographic Control" 15 presents many statements about the need for more cooperation among libraries in producing bibliographic records. In their introduction to this report, the Working Group stated that "the future of bibliographic control will be collaborative, decentralized, international in scope, and web based." 16 They also "recognized that there are many other institutions and organizations that have the expertise and capacity to play significant roles in the bibliographic future." 17 This report is a "call to action" that revives the concept of cooperative cataloging and the sharing of expertise. Libraries have to think beyond their walls: not only sharing bibliographic records through OCLC, but also sharing unique expertise among themselves. The report of the University of California Libraries Bibliographic Services Task Force, "Rethinking How We Provide Bibliographic Services for the University of California," 18 offers a vision for improving access to materials and points to existing ideas or techniques, such as a centralized catalog for the whole system, or using OCLC as the single database for all University of California system bibliographic records. The report emphasizes the "need to centralize and/or better coordinate services and data, while maintaining appropriate local control, as a way of reducing effort and complexity and of redirecting resources to focus on improving the user experience." 19 In 2006, Indiana University issued a white paper on the future of cataloging at Indiana University. 20 It provided an overview of current trends in libraries and technical services, and identified possible new roles for cataloging staff and strategies aimed at revitalizing cataloging operations at Indiana. The authors of this report viewed catalogers as key players in the era of scholarly communication and digital content, and stated that catalogers, like all librarians, "…must collaborate with other disciplines and within their own consortia and networks to be successful." 21 The first strategic direction in this report emphasized the need for cooperation between cataloging departments and other units within and outside the library boundaries. Small libraries, or those libraries that do not meet the Name Authority Cooperative (NACO) minimum submission requirements, have been creating NACO funnels to enable them to contribute records when they are not able to join the NACO program directly. These funnels facilitate cooperation in creating NACO records. In his paper, Larmore provided a step-by-step explanation of how a NACO funnel was established for four academic libraries and one state library in South Dakota. 22 Promoting collaboration and cooperation among libraries should be the foundation of long-term planning for an effectively managed library. Libraries and consortia have been successful in implementing initiatives that benefited libraries in a number of ways. Some of these have been described in "Cooperation Among Research Libraries: The Committee on Institutional Cooperation." 23 Recently launched cooperative projects include digitization of archives, institutional repositories, sharing of "born digital" collections, and many more. 24
The sources cited above are examples of ways in which librarians are beginning to search for solutions and alternatives to current practices that go beyond individual libraries, and they suggest that new plans need to be developed for enhanced cooperation among libraries in areas that until now have been underrepresented.

BACKGROUND

Many libraries are currently selecting, acquiring, and cataloging their items individually. They are using the same mechanisms to describe each item. These mechanisms include the application of national cataloging standards, such as the Anglo-American Cataloguing Rules, Second Edition Revised (AACR2Rev), the Library of Congress Subject Headings (LCSH), the Library of Congress Classification (LCC), the Dewey Decimal Classification (DDC), and so on. Some if not all libraries are using vendor record services, such as OCLC PromptCat, Backstage Library Works for authority control, and MARCIVE for government documents. All of these libraries acquire materials in a variety of formats and languages. Some libraries are creating institutional repositories and digitizing selected materials. In addition, they input and export their records to and from OCLC WorldCat. These examples illustrate the similarities of services and activities used by most libraries (Figure 1). Although there are similarities in what libraries are acquiring, there are also substantial differences in what they are collecting, based on their users and the community they serve. Every library has certain strengths and tends to focus its collection development in certain specific areas. For example, some libraries might focus on certain foreign languages, such as French and German, while others might focus on non-Roman languages. In other instances, libraries might focus on acquiring more materials in special formats, such as CD-ROM, DVD, microfilm, and so on. Regardless of how similar or different library collections are, each institution continues to work and operate independently of all others. The concept of sharing currently applies only to sharing bibliographic records. Libraries continue to recruit and hire their own staff. They also continue to process their own materials. As libraries lose staff or eliminate positions, they lose cataloging and technical expertise in those areas. They usually attempt to solve the problem internally and do not look to other institutions for support. Other libraries feel no obligation to provide assistance in getting their neighbors' collections cataloged.

FIGURE 1 The Current Processing and Sharing of Bibliographic Records among Libraries.

In today's economy, libraries are facing even greater difficulties in recruiting catalogers and filling vacant positions. 25 Recent cuts in library budgets have had a great impact on library services and personnel. 26 Even with the budget cuts and the problems in filling library positions, individual libraries continue to look at their problems as their own. Library administrations tend to justify their growing backlogs by citing staff shortages, lost positions, or a lack of expertise to handle cataloging. The most popular and easiest option for libraries to consider is "outsourcing" their collection to a vendor. Although outsourcing continues to be a viable choice for some libraries, not every library is able to afford this option. Another option is to create brief bibliographic records that provide users with the item title and perhaps its author.
Some libraries have chosen not to catalog these items at all and have backlogged them indefinitely. A good example of this situation involves foreign language and special format materials, where the library has neither the expertise to read these languages and create a brief record, nor the budget to outsource them to a cataloging vendor. Since so many libraries are facing similar issues in the area of cataloging and materials processing, it is reasonable to envision opportunities for a sharing effort among libraries. It seems possible to establish sharing arrangements in which these libraries could work together to reduce the workload for each other and overcome the shortage of staff and budget constraints. Each library could assume responsibility for one part of the overall cataloging burden that is now carried by each library individually, and produce higher quality records while maintaining high cataloging standards. The models presented below offer potential answers to how this can be accomplished, and illustrate some ideas for cooperative cataloging and the sharing of expert resources among libraries. In proposing these hypothetical models, it was necessary to make certain generalizations about the volume of materials acquired and processed by libraries. The figures used in the models are introduced for illustration purposes only, and should not be construed as actual. These are the basic assumptions that help in constructing the models:

- Major libraries select, acquire, and process a substantial amount of identical materials, acquired from almost the same vendors.
- All libraries catalog almost the same materials (original or copy).
- There are numerous institutions and catalogers cataloging the same materials at approximately the same time.
- Most major libraries process post-cataloging authority control for the same records and through the same vendor.
- All libraries acquire a percentage of materials that are unique to their own collections.
- Some libraries maintain expertise in specific areas, such as foreign languages and special formats.
- Some libraries are acquiring materials that they do not have the expertise to process.
- Vendors and publishers will increasingly provide bibliographic records and contribute them to the OCLC database.

MODELS FOR SHARING CATALOGING EXPERTISE

Model 1

The first model focuses on the concept of making the best use of existing cataloging expertise by sharing these resources among a group of libraries. In this scenario, each library will identify the specific strengths of its collection. A variety of criteria could be used, including subjects, languages, and formats. These divisions can be simple or more complex, as each library may choose to refine the broad categories and include specific strengths of collections. In some cases, a library's collection strength is accompanied by a corresponding strength in staff. To provide a brief illustration, library 1 could be cataloging all materials in all formats and subjects in Hebrew; library 2, cataloging materials in Chinese, Japanese, and Korean (CJK); library 3, cataloging materials in French; and so on (Figure 2). This model offers a number of advantages to prospective participating libraries. Elimination of redundancy may be the most important advantage. The proposed model eliminates the need for each library to maintain its own specialized catalogers, and at the same time eliminates the need to catalog the same items cataloged by its neighbors.
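How such a division of labor might operate can be made concrete with a small sketch. The profile table, MARC-style language codes, and routing function below are hypothetical illustrations of a Model 1 routing step, not a description of any existing system:

```python
# Minimal sketch of Model 1 routing: each cooperative member registers the
# languages and formats it has agreed to catalog, and each incoming item is
# routed to the responsible member. All names and codes are hypothetical.

CATALOGING_PROFILES = {
    "Library 1": {"languages": {"heb"}, "formats": {"*"}},                # Hebrew, all formats
    "Library 2": {"languages": {"chi", "jpn", "kor"}, "formats": {"*"}},  # CJK, all formats
    "Library 3": {"languages": {"fre"}, "formats": {"*"}},                # French, all formats
}

def route_item(language_code: str, format_code: str) -> str:
    """Return the cooperative member responsible for cataloging an item."""
    for library, profile in CATALOGING_PROFILES.items():
        if language_code in profile["languages"] and (
            "*" in profile["formats"] or format_code in profile["formats"]
        ):
            return library
    # No specialist registered: the acquiring library catalogs the item itself.
    return "acquiring library"

print(route_item("jpn", "book"))  # -> Library 2
```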
Such an arrangement would also eliminate duplication of effort. A by-product may be a reduction in reliance on some vendor services, such as shelf-ready processing and the OCLC PromptCat service. At this time, many libraries are receiving catalog records from vendors, and some continue to spend tremendous amounts of time fixing these records locally. If participating libraries work cooperatively and share their expertise with their neighbors, there will be no need to backlog materials or to spend each library's budget on the same products. This model has the potential to create a mutual interest among these libraries in following the same cataloging standards and eliminating each library's local practice, and to move libraries toward a true cooperative sharing environment.

FIGURE 2 Shared Cataloging Responsibility.

Membership in a cooperative will entail participation in the Library of Congress Program for Cooperative Cataloging (PCC) and all its components (the Bibliographic Record Cooperative Program [BIBCO], NACO, the Subject Authority Cooperative Program [SACO], and the Cooperative Online Serials Program [CONSER]). The cooperative will continue to contribute records to the Library of Congress and OCLC. The important difference is that the records will be input into OCLC WorldCat only once. This will help eliminate multiple records in the database. The cataloging library will be responsible for all authority work for the materials it is assigned to catalog. This includes sending the records out for vendor authority control maintenance. Other advantages include enhancing the skills of professional catalogers. This will result in higher quality records produced by experienced catalogers. Other libraries will be able to adopt these records without any modification or change. The actual process will require long-range planning and a level of commitment from library management. Libraries will follow their acquisitions profiles. Some things will not change, as libraries will continue to order their materials as usual and pay for their acquisitions. Instead of being received by the acquiring library, however, materials will be sent directly from the vendor to the cataloging libraries. The cataloging libraries will catalog the materials in OCLC WorldCat as usual, set up holdings, and provide labels. The cataloging library will then send the cataloged materials back to the acquiring library already shelf-ready. Parity is achieved when the volume of cataloging contributed to the cooperative by each institution is roughly equivalent to the volume of records obtained.

Model 2

The first model presented above discusses a cataloging cooperative based on the exchange of records among institutions that are similar in size and cataloging volume. In the following scenario, libraries will be able to maximize their own cataloging resources and expertise by providing specialized cataloging service to their partners for a small fee. This is not a contract-cataloging plan per se, but it allows libraries to share their expertise with other institutions. For example, a library that has staff experienced in cataloging audiovisual (A/V) formats would provide this service to other libraries that do not have A/V catalogers. In this scenario, the cataloging library will decide what service it can offer to other libraries. The assumption is that the cataloging library will maintain high cataloging quality and provide a service that pays for itself. The cost of cataloging should not be based on profit, but only on cost recovery.
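Because the fee is meant to recover costs rather than generate profit, the calculation reduces to the loaded staff cost per record. A minimal sketch follows; the hourly rate, minutes per record, and overhead rate are illustrative assumptions, not actual cataloging costs:

```python
# Minimal sketch of a Model 2 cost-recovery fee. All figures are
# illustrative assumptions, not actual cataloging costs.

def cost_recovery_fee(hourly_cost: float, minutes_per_record: float,
                      overhead_rate: float = 0.25) -> float:
    """Per-record fee: loaded staff cost per record, with no profit margin."""
    staff_cost = hourly_cost * (minutes_per_record / 60.0)
    return round(staff_cost * (1 + overhead_rate), 2)

# For example, an A/V cataloger costing $40/hour who averages 20 minutes
# per record yields a fee of $16.67 per record:
print(cost_recovery_fee(40.0, 20.0))  # -> 16.67
```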
Advantages of this model are similar to those of the first scenario. One important difference is the ability to obtain low-cost, professionally prepared cataloging records. As in the first scenario, a sense of sharing and cooperation among the participating libraries is likely to increase. Shared goals and objectives make this option a win-win situation for both the cataloging and the client libraries. The actual process will involve the following steps. Each library will identify the collection that needs to be cataloged. The cataloging library will catalog sample materials and estimate the cost. The cataloging library and the client library will write specifications for the project detailing all aspects, including the cost, the terms of completion, standards, items versus surrogates, and how the bibliographic records will be delivered. The cataloging library will be responsible for the authority control work. The client library will obtain a bibliographic record that includes authority processing. Other details will have to be negotiated among the partner libraries.

Model 3

The next model proposes a scaled-down version of Models 1 and 2. This type of sharing may be more suitable for a smaller regional group, a county-wide library system, or a group of small academic libraries. The benefits achieved are the same, while the limited size of the group and the relative proximity of the participating institutions might make this model a little less cumbersome and easier to implement. To make this plan work effectively, each library would need to agree to create a set number of catalog records for commonly ordered materials, such as those coming from approval plans. Each library would charge a fixed fee for each record created, if that option is selected; otherwise, each library simply shares catalog records with its partners. Accounting should be rather simple, assuming a balance of records shared over time. A number of advantages can be derived from this plan. Cataloging costs shared by member institutions could result in substantial savings for each. The quality of records is likely to improve as cataloging volume decreases at each institution and catalogers can develop expertise in their areas. Since the plan is designed to include a small number of institutions, it is potentially fairly simple to implement and manage.

Model 4

This model is designed to handle authority control workflows and related issues. Outsourcing authority maintenance means that, periodically, each library sends its new bibliographic records to a company for authority processing, and the vendor notifies the library of updated headings. All of these libraries are using vendors to perform their authority control, and each library pays for this processing separately. In this scenario, the option of centralizing authority control processing is introduced. The author also introduces the PCC funnel idea. A funnel allows libraries to work together as NACO reviewers and contributors. This would mean that when a library does not have the expertise to create NACO headings, it could funnel its processing workload to a library that is already a NACO participant. The same process can also apply to the other components of the PCC program, such as BIBCO, SACO, and CONSER. The obvious advantage of this scenario is that it reduces the burden of authority control by sharing the process among the participating libraries. It also reduces the burden of post-cataloging authority processing for those libraries that follow this practice.
For example, libraries that participate in cooperative authority control can distribute among themselves the fallouts (vendor reports delivered back to the library for headings that have no authority records or that have problems requiring manual resolution) and the associated authority work. In most cases, libraries receive several fallout reports, and due to staffing shortages these reports usually do not get processed. As a result, wrong or missing headings remain in the system and cause problems in retrieval. The cooperative authority control model will eliminate this problem and will assure the quality of headings in the online catalog. To implement this type of cooperative model, every library will continue to catalog as usual, using the OCLC database. All the participating libraries will perform post-cataloging authority control. This means that libraries do not need to search the OCLC Authority File to verify headings when they are performing copy cataloging; they will verify headings and perform all authority control work when they perform original cataloging. OCLC will collect all the new records periodically (e.g., monthly) by a symbol selected by the consortium and send these records to a vendor for authority processing. After processing the new records, the vendor will send the records back to the libraries, dividing them by library symbol (Figure 3). Alternatively, the consortium can create a shared authority file that can be used by all member libraries. The vendor will also send the non-matched authority records, in the form of reports, to a designated library or libraries. Libraries will be responsible for problem-solving and for creating new authority records, if needed. To create NACO, SACO, and BIBCO records, libraries can create a funnel among themselves and designate a library or libraries to be responsible for reviewing other libraries' records and contributing them to the PCC program.

FIGURE 3 Shared Authority Control Processing.

Model 5

Scenario 5 is an alternative to the preceding one. It proposes that OCLC perform the authority control processing (Figure 4). OCLC has the capability to centralize all authority maintenance. OCLC provides access to the National Authority File for names, subjects, and series. It also provides the interface for submitting names and series. When libraries perform authority control, they process only their own records. This means that the headings in their online catalog are assumed to be correct and to match the authorized headings in the OCLC authority file. In most cases, libraries make changes to the headings locally instead of going back and correcting the heading in the OCLC authority file.

FIGURE 4 OCLC-Based Authority Control Model.

OCLC also has a Quality Control Division that is responsible for making corrections to headings and bibliographic data. This includes merging records, adding missing data, creating new authority records, fixing tags, and so on. These revisions and updates are made to the master record in the OCLC database, and libraries might not be aware of these changes and corrections. As a result, libraries are making changes to their headings locally, while OCLC is making changes and updates to the master record. This mechanism produces redundancy in authority control processing by individual libraries and OCLC.
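Both of these authority control models turn on the same batch round-trip: new records are collected by institution symbol, sent out for processing, and the returns, including the fallout reports, are split back out by symbol for delivery or manual problem-solving. A minimal sketch of that partitioning step follows; the record structure, field names, and symbols are hypothetical:

```python
from collections import defaultdict

# Minimal sketch of the Model 4/5 partitioning step: processed records come
# back from the vendor (or OCLC) in one batch and are split out by holding
# library symbol, with fallouts routed to a designated problem-solving queue.
# The record structure, field names, and symbols are hypothetical.

def partition_returns(records: list[dict]) -> tuple[dict, list[dict]]:
    """Split processed records by library symbol; collect fallouts separately."""
    by_symbol = defaultdict(list)
    fallouts = []  # headings with no authority record, or needing manual work
    for record in records:
        if record.get("fallout"):
            fallouts.append(record)
        else:
            by_symbol[record["symbol"]].append(record)
    return dict(by_symbol), fallouts

batch = [
    {"symbol": "LIB1", "heading": "Angelou, Maya", "fallout": False},
    {"symbol": "LIB2", "heading": "Unmatched heading", "fallout": True},
]
deliveries, problem_queue = partition_returns(batch)
print(sorted(deliveries))   # -> ['LIB1']
print(len(problem_queue))   # -> 1
```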
Missing data and typographical errors make it difficult for users, including librarians, to identify, retrieve, and request items. Implementation of this model reduces the redundancy in authority control processing between member libraries and OCLC. It helps assure consistency in the form of headings among all libraries and OCLC. Centralizing authority control at OCLC might also reduce costs by distributing them among the participating libraries and OCLC. Having the same, correct records in both the OCLC database and the library catalogs will expedite searching. This is particularly important for interlibrary loan operations.

SUMMARY

Implementation of any of these models will require extensive planning, some internal restructuring, and a strong commitment to the idea of sharing and cooperation. It may also require an attitude shift, as libraries undertake the task of managing problems in innovative and more complex ways. The list of requirements for implementing any of the models described in this article will have to include several basic components. Each library will assume responsibility for certain functions. Libraries will identify each other's strengths and make their acquisitions profiles known to the other partner libraries. Based on these strengths, each library will be assigned responsibility for cataloging a subject or format, according to its area of expertise. It is assumed that libraries in this scenario will follow the same cataloging standards (no local practice!). Each library creates records and submits them to OCLC. Libraries will need to identify their areas of cataloging expertise and be willing to share this expertise with the other institutions. Each library will build a cooperative cataloging team and commit to maintaining strength in its chosen area. A basic team may consist of one cataloger and 1–2 support staff. The size of this unit will be determined by the size of the cooperative and the volume of materials in the given area. Specific requirements and procedures will have to be articulated and developed for each individual scenario according to local needs and conditions. Tracking and routing mechanisms will need to be developed in each case. In the fee-based cataloging cooperative, special attention needs to be paid to financial matters and the transfer of funds. In the scenario involving other consortia or vendors, terms and conditions will have to be negotiated to accommodate everyone. Other challenges and difficulties are likely to arise as specific plans are developed. Institutional cooperation and commitment are what will make these cooperatives a success. It may be difficult to coordinate policies and procedures, and this may call for the development of new management skills among library managers. Libraries do not all use the same integrated library system (ILS), and that could be a problem unless they agree to redesign the architecture of their online systems; this could become a cost factor. It may also be difficult to maintain specialized cataloging staff, which will require commitment from the participating libraries. These options do not exhaust all the possibilities that might be available. It is the author's hope that more ideas and potential models will be generated in response to the thesis of this project.

CONCLUSION

These five models introduce an alternative way of looking at library sharing and cooperation. The ideas presented here may be an alternative to current trends that look to the outsourcing of cataloging as the way of the future.
This article proposes viable models that offer solutions that could go a long way toward addressing the budget concerns of today's library administrators. The solutions proposed here will foster a spirit of cooperation among libraries and reinvigorate the profession. Most importantly, they will allow libraries to remain in charge of the process, and will ensure the integrity of the library record that is based on universally accepted standards. The five models presented here offer a set of ideas. They are not completed roadmaps or plans that can be implemented without refinement. The ideas for cooperation in the area of cataloging presented here offer possible solutions to libraries that are attempting to grapple with questions regarding the future of cataloging and the library catalog.

NOTES

1. Magda El-Sherbini and Amanda Wilson, "New Strategies for Delivering Library Resources to Users: Rethinking the Mechanisms in which Libraries are Processing and Delivering Bibliographic Records," The Journal of Academic Librarianship 33, no. 2 (March 2007): 228–242.
2. Ibid., 241.
3. Dictionaries, Encyclopedia and Thesaurus. http://www.thefreedictionary.com/ (accessed November 30, 2009).
4. Elizabeth N. Steinhagen, Mary Ellen Hanson, and Sharon A. Moynahan, "Quo Vadis, Cataloging?" Cataloging & Classification Quarterly 44, no. 3–4 (2008): 271–280.
5. Alan Danskin, "Tomorrow Never Knows: The End of Cataloging?" IFLA Journal 33, no. 3 (2006): 205–209. www.ifla.org/IV/ifla72/papers/102-Danskin-en.pdf (accessed November 30, 2009).
6. Deanna B. Marcum, "The Future of Cataloging," Library Resources & Technical Services 50, no. 1 (2006): 5–9.
7. Karen Calhoun, "The Changing Nature of the Catalog and its Integration with Other Discovery Tools: Final Report," prepared for the Library of Congress (March 2006). www.loc.gov/catdir/calhoun-report-final.pdf (accessed November 30, 2009).
8. Norman Oder, Lynn Blumenstein, and Josh Hadro, "Columbia, Cornell Join Up," Library Journal 134, no. 18 (2009): 12.
9. Chew Chiat Naun and Susan M. Braxton, "Developing Recommendations for Consortial Cataloging of Electronic Resources: Lessons Learned," Library Collections, Acquisitions, & Technical Services 29, no. 3 (September 2005): 307–325.
10. John J. Riemer and Karen Morgenroth, "Hang Together or Hang Separately: The Cooperative Authority Work Component of NACO," Cataloging & Classification Quarterly 17, no. 3–4 (1993): 127–161.
11. Robert Wolven, "In Search of a New Model," Library Journal 133 (2008): 6–9.
12. Elizabeth N. Steinhagen, Mary Ellen Hanson, and Sharon A. Moynahan, "Quo Vadis, Cataloging?" Cataloging & Classification Quarterly 44, no. 3–4 (2008): 271–280.
13. Ibid., 278.
14. Ibid.
15. On the Record: Report of the Library of Congress Working Group on the Future of Bibliographic Control (January 9, 2008). www.loc.gov/bibliographic-future/news-ontherecord-jan08.pdf (accessed November 30, 2009).
16. Ibid., 9.
17. Ibid.
18. University of California Bibliographic Services Task Force, "Rethinking How We Provide Bibliographic Services for the University: Final Report" (December 2005). www.libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (accessed November 30, 2009).
19. Ibid., 3.
20. A White Paper on the Future of Cataloging at Indiana University, January 15, 2006. Prepared by Jackie Byrd et al. www.iub.edu/~libtserv/pub/Future of Cataloging White Paper.pdf (accessed November 30, 2009).
21. Ibid., 14.
22. Dustin P. Larmore, "A New Kid on the Block: The Start of a NACO Funnel Project and What is Needed to Start Your Own," Cataloging & Classification Quarterly 42, no. 2 (2006): 75–81.
23. CIC Strategic Directions 2007–2010. www.cic.net/Libraries/NewsPub/StrategicDirections2007–2010.sflb.ashx (accessed November 30, 2009).
24. Thomas W. Shaughnessy, "Cooperation Among Research Libraries: The Committee on Institutional Cooperation," Journal of Library Administration 25, no. 2/3 (1998): 73.
25. Responses to Question: If You Think Libraries are Having Difficulty Recruiting and Filling Library Openings Today, What Do You Believe is the Primary Cause of the Difficulty? http://www.ischool.utexas.edu/~rpollock/tla2009/why recruiting difficulties.pdf (accessed November 30, 2009).
26. The Impact of Budget Cuts on Colorado Academic Libraries. http://www.lrs.org/documents/fastfacts/207 AcademicBudgetCuts.pdf (accessed November 30, 2009).

JOURNAL OF BENEFIT-COST ANALYSIS
2013 · VOLUME 4 · ISSUE 2

EDITOR-IN-CHIEF: Scott Farrow, University of Maryland, Baltimore County, USA
ASSOCIATE EDITORS: Joseph Cordes, George Washington University, USA; Lynn Karoly, RAND Corporation, USA; David Salkever, University of Maryland, Baltimore County, USA
MANAGING EDITOR: Mary Kokoski, University of Maryland, Baltimore County, USA

Journal of Benefit-Cost Analysis (JBCA) seeks to improve the analytical practice of benefit-cost analysis and to expand scholarly knowledge. Its scope includes topics in social policy such as education, crime, poverty, and employment, as well as environment, health, energy, natural hazards, terrorism, defense, and other areas to which benefit-cost analysis and related tools can be applied. Articles that skillfully apply benefit-cost tools are especially encouraged, as are theoretical papers with some link to empirical application. All levels of analysis are sought: international, national, state, regional, and local. Authors focusing on principles and standards, reviews of areas of application, and shorter pieces that demonstrate implementable "skills of the trade" are also encouraged, but they should contact the Editor at JBCA@umbc.edu prior to submission.

ABSTRACTED/INDEXED IN: EconLit, OCLC: WorldCat, Research Papers in Economics (RePEc).
ISSN 2194-5888 · e-ISSN 2152-2812

All information regarding notes for contributors, subscriptions, Open Access, back volumes, and orders is available online at http://www.degruyter.com/jbca.

RESPONSIBLE EDITOR: Scott Farrow, UMBC Department of Economics, 1000 Hilltop Circle, Baltimore, MD 21250, USA, Email: JBCA@umbc.edu
JOURNAL MANAGER: Eike Wannick, De Gruyter, Genthiner Straße 13, 10785 Berlin, Germany.
Tel.: +49 (0)30 260 05-376, Fax: +49 (0)30 260 05-250, Email: eike.wannick@degruyter.com
RESPONSIBLE FOR ADVERTISEMENTS: Panagiota Herbrand, De Gruyter, Mies-van-der-Rohe-Straße 1, 80807 München, Germany, Tel.: +49 (0)89 769 02-394, Fax: +49 (0)89 769 02-350, Email: panagiota.herbrand@degruyter.com
TYPESETTING: Compuscript Ltd, Shannon, Ireland
PRINTING: Franz X. Stückle Druck und Verlag e.K., Ettenheim
© 2013 Walter de Gruyter GmbH, Berlin/Boston. Printed in Germany.

Journal of Benefit-Cost Analysis, 2013, Volume 4, Issue 2

Contents
J. Scott Holladay and Michael Livermore, Regional variation, holdouts, and climate treaty negotiations, 131
Robert J. Brent, A cost-benefit framework for evaluating conditional cash-transfer programs, 159
Dan F. Jacoby, A cost-benefit analysis: implementing temporary disability insurance in Washington State, 181
Henrik Jaldell, Cost-benefit analyses of sprinklers in nursing homes for elderly, 209
Cass R. Sunstein, The value of a statistical life: some clarifications and puzzles, 237

Metadata: The Foundations of Resource Description

Stuart Weibel
Office of Research, OCLC Online Computer Library Center, Inc.
weibel@oclc.org
D-Lib Magazine, July 1995

This paper is an abbreviated version of the Summary Report of the OCLC/NCSA Metadata Workshop. It sets forth a proposal for the content of a simple resource description record (the Dublin Core Metadata Element Set) and outlines a series of further steps to advance the standards for the description of networked information resources.

Introduction

The explosive growth of interest in the Internet in recent years has created a digital extension of the academic research library for certain kinds of materials.
As the scope of their coverage expands, indexes succumb to problems of large retrieval sets and problems of cross-disciplinary semantic drift. Richer records, created by content experts, are necessary to improve search and retrieval. Formal standards (such as the TEI Header and MARC cataloging) will provide the necessary richness, but such records are time-consuming to create and maintain, and hence may be created for only the most important resources. An alternative solution that promises to mediate these extremes involves the creation of a record that is more informative than an index entry but is less complete than a formal cataloging record. If only a small amount of human effort were required to create such records, more objects could be described, especially if the author of the resource could be encouraged to create the description. And if the description followed an established standard, only the creation of the record would require human intervention; automated tools could discover these descriptions and collect them.

Can a simple metadata record be defined that sufficiently describes a wide range of electronic objects? The Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA) convened the invitational Metadata Workshop on March 1-3, 1995, in Dublin, Ohio, to address this issue. Fifty-two librarians, archivists, humanities scholars, and geographers, as well as standards makers in the Internet, Z39.50, and Standard Generalized Markup Language (SGML) communities, met to identify the scope of the problem, to achieve consensus on a list of metadata elements that would yield simple descriptions of data in a wide range of subject areas, and to lay the groundwork for achieving further progress in the definition of metadata elements that describe electronic information.

Goals

Goals of the workshop included fostering a common understanding of the problems and potential solutions among the stakeholders and promoting a consensus on a core set of metadata elements to describe networked resources.

Scope

Since the Internet contains more information than professional abstractors, indexers, and catalogers can manage using existing methods and systems, it was agreed that a reasonable alternative way to obtain usable metadata for electronic resources is to give authors and information providers a means to describe the resources themselves. The major task of the Metadata Workshop was to identify and define a simple set of elements for describing networked electronic resources. To make this task manageable, it was limited in two ways. First, only those elements necessary for the discovery of the resource were considered. It was believed that resource discovery is the most pressing need that metadata can satisfy, and one that would have to be satisfied regardless of the subject matter or internal complexity of the object. Second, the discussion was further restricted to the metadata elements required for the discovery of what were called document-like objects, or DLOs, by the workshop participants. It was believed that DLOs are still the most common type of resource sought on the Internet and that whatever solution could be proposed for DLOs could be extended to other kinds of resources. More importantly, the likelihood of making progress on this challenging problem would be increased if attention could initially be restricted to something familiar. DLOs were not rigorously defined, but were understood by example.
For example, an electronic version of a newspaper article or a dictionary is a DLO, while an unannotated collection of slides is not. Of course, the crux of the problem is that in a networked environment, DLOs can be arbitrarily complex, because they can consist of text with callouts to images, audio or video clips, or to other hypertext documents. The Metadata Workshop participants made no attempt to limit the complexity of DLOs, except to say that the intellectual content of a DLO is primarily text, and that the metadata required for describing DLOs will bear a strong resemblance to the metadata that describes traditional printed texts. As a result of the restricted focus of the workshop, certain issues required for a complete description of DLOs, such as cost, archival status, and copyright information, were eliminated from the scope of the discussion. Elements required for the description of objects other than DLOs, such as the elements required for the description of complex geological strata in a geospatial resource, were also beyond the scope of the discussion. The goal was to define a core set of metadata elements that would allow authors and information providers to describe their work and to facilitate interoperability among resource discovery tools. But because the core elements do not yield a complete description of objects in a networked environment, careful consideration was also given to mechanisms for extending the element set.

The primary deliverable from the workshop was a set of thirteen metadata elements, named the Dublin Core Metadata Element Set (or Dublin Core, for short). The Dublin Core was proposed as the minimum number of metadata elements required to facilitate the discovery of document-like objects in a networked environment such as the Internet. The syntax was deliberately left unspecified as an implementation detail. The semantics of these elements was intended to be clear enough to be understood by a wide range of users. Below is a brief description of the elements in the Dublin Core:

Subject: The topic addressed by the work
Title: The name of the object
Author: The person(s) primarily responsible for the intellectual content of the object
Publisher: The agent or agency responsible for making the object available
OtherAgent: The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work
Date: The date of publication
ObjectType: The genre of the object, such as novel, poem, or dictionary
Form: The physical manifestation of the object, such as Postscript file or Windows executable file
Identifier: String or number used to uniquely identify the object
Relation: Relationship to other objects
Source: Objects, either print or electronic, from which this object is derived, if applicable
Language: Language of the intellectual content
Coverage: The spatial locations and temporal durations characteristic of the object

To make this discussion concrete, consider a record, created with the relevant portions of the Dublin Core and a sample syntax, that describes an electronic version of Maya Angelou's poem "On the Pulse of Morning". This description is based on a record created by the University of Virginia Library's Electronic Text Center. (For a description of that project, see Gaynor [Gaynor].)
Subject: Poetry
Title: On the Pulse of Morning
Author: Maya Angelou
Publisher: University of Virginia Library Electronic Text Center
OtherAgent: Transcribed by the University of Virginia Electronic Text Center
Date: 1993
Object: Poem
Form: 1 ASCII file
Identifier: AngPuls1
Source: Newspaper stories and oral performance of text at the presidential inauguration of Bill Clinton
Language: English

Underlying Assumptions

The discussions at the Metadata Workshop revealed several principles that should guide the further development of the element set. Adherence to these principles increases the likelihood that the core element set will be kept as small as possible, that the meanings of the elements will be understood by most users, and that the element set will be flexible enough for the description of resources in a wide range of subject areas. These principles are intrinsicality, extensibility, syntax independence, optionality, repeatability, and modifiability.

Intrinsicality

The Dublin Core concentrates on describing intrinsic properties of the object. Intrinsic data refer to the properties of the work that could be discovered by having the work in hand, such as its intellectual content and physical form. This is distinguished from extrinsic data, which describe the context in which the work is used. For example, the "Subject" element is intrinsic data, while transaction information such as cost and access considerations is extrinsic data. The focus on intrinsic data in no way demeans the importance of other varieties of data, but simply reflects the need to keep the scope of deliberations narrowly focused.

Extensibility

In addition to its use in dealing with extrinsic data, extension mechanisms will allow the inclusion of intrinsic data for objects that cannot be adequately described by a small set of elements. Extensibility is important because users may wish to add extra descriptive material for site-specific purposes or specialized fields. In addition, the specification of the Dublin Core itself will change over time, and the extension mechanism will allow revisions while maintaining some backward compatibility with the originally defined element set.

Syntax Independence

Syntactic bindings are avoided because it is too early to propose formal definitions and because the Dublin Core is intended to be eventually used in a range of disciplines and application programs.

Optionality

All the elements are optional. The Dublin Core may eventually be applied to objects for which some elements have no meaning (who is the author of a satellite image?). It also seems counterproductive to mandate complex descriptions if the creators of the content are expected to provide the descriptive material. A simple description is better than no description at all.

Repeatability

All elements in the Dublin Core are repeatable. For example, multiple Author elements would be used when a resource has multiple authors.

Modifiability

Each element in the Dublin Core has a definition that is intended to be self-explanatory. However, it is also necessary that the definitions of the elements satisfy the needs of different communities. This goal is accomplished by allowing each element to be modified by an optional qualifier. If no qualifier is present, the element has its common-sense meaning; otherwise, the definition of the element is modified by the value of the qualifier. Qualifiers will typically be derived from well-known conventions in the library community or from the field of knowledge appropriate to the resource.
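Although the Dublin Core itself is syntax-independent, these principles are easy to illustrate in code. The sketch below represents the Angelou record as a list of (element, qualifier, value) triples, showing optionality (unused elements are simply omitted), repeatability (two Subject entries), and a scheme qualifier of the kind discussed next. The triple encoding and the LCSH-style subject term are illustrative assumptions, not part of the Dublin Core proposal:

```python
# Minimal sketch of a Dublin Core description as a data structure. The Dublin
# Core is deliberately syntax-independent; this encoding is illustrative only.
# Elements are optional (omit what does not apply) and repeatable, and an
# optional qualifier can modify an element's meaning.

angelou_record = [
    # (element, qualifier or None, value)
    ("Subject", None, "Poetry"),
    ("Subject", "scheme=LCSH", "American poetry--20th century"),  # hypothetical LCSH term
    ("Title", None, "On the Pulse of Morning"),
    ("Author", None, "Maya Angelou"),
    ("Publisher", None, "University of Virginia Library Electronic Text Center"),
    ("Date", None, "1993"),
    ("Language", None, "English"),
]

def values_for(record, element):
    """Collect all values of a (repeatable) element."""
    return [value for (name, qualifier, value) in record if name == element]

print(values_for(angelou_record, "Subject"))
# -> ['Poetry', 'American poetry--20th century']
```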
Qualifiers are important because they give the Dublin Core a mechanism for bridging the gap between casual and sophisticated users. For example, the data in the Subject element consists of any word or phrase that describes the object's content. However, a professional cataloger may wish to supply the name of the authoritative source from which the subject terms are taken. In such a case, the element may be written as Subject (scheme=LCSH), indicating that the subject terms are taken from the Library of Congress Subject Headings.

Implementations

One of the goals of the OCLC/NCSA Metadata Workshop was to promote prototype resource description projects based on a common model of resource description. A number of Metadata Workshop conferees represent organizations that have ongoing activities or are starting activities that will be influenced by the results of the workshop. These include:

The OCLC Spectrum Project. Contact: Diane Vizine-Goetz, vizine@oclc.org
The OCLC Internet Resources Cataloging Project. Contact: Erik Jul, jul@oclc.org
Library of Congress. Contact: Rebecca Guenther, rgue@loc.gov
O'Reilly Associates. Contact: Terry Allen, terry@ora.com
Los Alamos National Laboratory and Indiana University. Contacts: Ron Daniel Jr., rdaniel@acl.lanl.gov; Pete Percival, percival@bronze.ucs.indiana.edu
Bunyip Systems. Contact: Chris Weider, clw@bunyip.com
Georgia Institute of Technology. Contact: Michael Mealling, michael.mealling@oit.gatech.edu, http://www.gatech.edu/iiir
SoftQuad. Contact: Yuri Rubinsky, yuri@sq.com
Concordia University. Contact: Bipin Desai, bcdesai@cs.concordia.ca, http://www.cs.concordia.ca/~faculty/bcdesai/cindi-system-1.0.html

Next Steps

Refinement and standardization of the metadata element set defined in this document will be an ongoing, dynamic process involving many stakeholder communities. No single forum will suffice to air all concerns, and no single standard can be expected to accommodate the needs of all communities. The problem must be divided into manageable chunks, and the process must engage the relevant stakeholder communities. Implicit in the present activity is the proposition that there are core elements common to many object types, and that a simple, extensible framework of such elements can be defined to support more complete resource descriptions. The initial objective, the specification of elements for the discovery of document-like objects, can be extended in a variety of directions:

- Expansion of the Dublin Core to include other object types, such as services or collections.
- Expansion of the Dublin Core to embrace functionality other than resource discovery, such as archival control and the authentication of users and charging mechanisms.
- Establishment of standardized methods for extensibility.
- Refinement of existing work. The Dublin Core is an untested approach to the description of resources that will need to be modified with experience.

OCLC and NCSA will establish a workshop series to address aspects of this agenda. A Metadata Workshop Steering Committee will be established to define topics and assure appropriate representation of stakeholders. Design groups of perhaps a dozen or fewer individuals will be solicited to prepare discussion papers to focus workshop activities. Participants will be invited based on their publicly evident accomplishments in relevant areas or by reviewed application. Workshops will be limited to 50 or fewer participants and conducted in roughly the style of the March 1995 Workshop.
Other work will be done in coordination with the IETF working group on Uniform Resource Identifiers (URIs) to assure that the results can be integrated into the emerging protocols for resource location and persistent naming. Finally, active promotion of results will be carried out by establishing liaison with formal associations of stakeholders. In the library community, MARC standards evolve under the guidance of the Machine-Readable Bibliographic Information Committee (MARBI), composed of representatives of the Library of Congress and other stakeholders in the library community. A close relationship should be sustained between this committee and the Metadata Work Group. Relationships should also be established with publishers, document vendors, SGML vendors, and theoreticians working on the problem of text encoding. Other communities also have requirements that must be accommodated in any framework for resource description. These communities include the GIS community, government information providers, and business communication groups.

References

[MARC] Network Development and MARC Standards Office, ed. 1994. USMARC Format for Bibliographic Data. Washington, DC: Cataloging Distribution Service, Library of Congress.
[TEI] Sperberg-McQueen, C. M., and Lou Burnard, eds. 1994. Guidelines for Electronic Text Encoding and Interchange. Chicago and Oxford: Text Encoding Initiative.
[Gaynor] Gaynor, Edward. 1994. "Cataloging Electronic Texts: The University of Virginia Library Experience." Library Resources and Technical Services 38(4): 403-413 (October 1994).

Copyright © 1995 OCLC
hdl:cnri.dlib/july95-weibel

ALA Annual Conference, Washington DC: LITA Top Tech Trends Update
Brown, M. Library Hi Tech News 24, no. 8 (2007). ISSN 0741-9058. DOI: 10.1108/07419050710835983. Publication date: 2007-10-30. Permalink: https://escholarship.org/uc/item/0gq1385z. License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). Peer reviewed.
Podcasts and blog entries for the individual speakers are available from the LITA website (http://litablog.org/category/top-technology-trends/).

Trend: open source ILS

Marshall Breeding (http://staffweb.library.vanderbilt.edu/breeding/), director for innovative technologies and research at Vanderbilt University Libraries (TN), began as first speaker on the Top Tech Trends panel by referencing his LJ Automation Marketplace article, "An Industry Redefined" (www.libraryjournal.com/article/CA6429251.html), in which he predicted "unprecedented disruption" in the integrated library system (ILS) market. He spoke about the pressures on libraries being forced to change integrated systems due to products being discontinued. Breeding said 60 per cent of the libraries in one state are facing a migration due to the SirsiDynix product roadmap being changed, but he said "not all ILS companies are the same". Breeding said open source software such as Apache and Linux has been used as infrastructure in libraries for many years, but open source is new to the ILS world as a product. Interest has now expanded from the technology adventurers to the decision makers. The Evergreen PINES project (http://libraryjournal.com/article/CA6396354.html) in Georgia is a "most successful" example, with 55 of 58 counties participating. With the recent decision to adopt Evergreen in British Columbia and other libraries exploring Koha (www.koha.org/), there is movement to open source solutions. However, Breeding cautioned that the relative numbers of open source adoptions are "miniscule" compared to libraries with commercial ILS. Will the switch to open source become an avalanche? Breeding said several commercial support companies have sprung up to serve the open source ILS market, including LibLime (http://liblime.com/), Equinox (http://esilibrary.com/), and CARE Affiliates (www.libraryjournal.com/article/CA6453007.html). Breeding predicted an era of "new decoupled interfaces". There is a new emphasis on changing the front-end interfaces ("front ends") of library systems to match the expectations of library users.

Trend: ILS "backend" support and RFID

John Blyberg (www.blyberg.net/), head of technology and digital initiatives at Darien Public Library (CT), said the "back end [in the ILS] needs to be shored up because it has a ripple effect" on other services. The operational infrastructure side of the ILS needs to be robust enough to support interfaces and user actions, or else ILS performance will suffer. Blyberg talked about RFID as a coming technology that makes sense for use in sorting and book storage, echoing the point from Lori Bowen Ayre's earlier talk at ALA (www.galecia.com/weblog/mt/archives/000268.php) that libraries need to create a market for and "support the distribution demands of the Long Tail". For more on the "Long Tail" see Lorcan Dempsey, "Libraries and the Long Tail: Some Thoughts about Libraries in a Network Age", D-Lib Magazine, April 2006 (www.dlib.org/dlib/april06/dempsey/04dempsey.html). RFID privacy concerns have been raised for tagging books, but Blyberg counters that "privacy concerns are non-starters, because RFID is essentially a barcode". With RFID, information is stored in a database; the focus of security concerns should be the protection of that data and not the detection of RFID tags.
Finally, Blyberg said that vendor interoperability and a democratic approach to development are needed in the age of Innovative's Encore and Ex Libris' Primo, both of which can be used with different ILS systems and can decouple the public catalog from the ILS. With the eXtensible Catalog (XC) (www.libraryjournal.com/article/CA6365210.html) and Evergreen coming along, Blyberg said there was a need for funding and partners to further enhance their development. There was some discussion between Walt Crawford and John Blyberg about what Crawford described as a practice that could "lead to the erosion of patron privacy" by introducing RFID to patron barcodes. Could details of individuals' reading habits be data-mined by an intruder to the system database? Joan Frye Williams countered that both Blyberg and Crawford were "insisting on using logic on what is essentially a political problem". Williams commented that the RFID issue for libraries was about getting the RFID message out, a political rather than a legal challenge.

Trend: end user as contributor, digital as format of choice, desktop and web become one

Karen Coombs (www.librarywebchic.net/), head of web services at the University of Houston (TX), discussed three trends:

• The end user as content contributor, though the long-term disposition of the material is unclear. Coombs commented that currently more than 62 per cent of all US households own digital cameras and are using YouTube, Blip.tv, Flickr, and other web-based services to distribute their content. "What happens if YouTube goes under and people lose their memories?" There is a huge potential for born-electronic material to be lost, and libraries need to think about capturing it. Coombs referred to her grandfather, who sent letters back from war; today the soldiers in Iraq are emailing, blogging, and posting digital photos. Who is preserving that? Coombs described the Picture Australia project (www.pictureaustralia.org/) by the National Library of Australia and its partnership with Flickr as a positive development.

• Digital as format of choice for users, referring to examples such as iTunes for music and Joost for video. Coombs said the library has no provision for supplying streaming video, audio, and other online services, "especially in public libraries". Though companies like OverDrive and Recorded Books exist to serve this need, perhaps her point was that consumer adoption has superseded current library demand. Coombs challenged the audience to think about a broader definition of user support. "I know everyone will cringe if I mention e-books," but we have to see that e-books are not the problem; the problem is the reading mechanism. Coombs has 12 bookcases of books and would really appreciate this stuff digitally. We have to get in this game; how do we get in this game?

• A blurred line between desktop and web applications, which Coombs demonstrated with Google Docs (http://docs.google.com/), the YouTube remixer (www.youtube.com/ytremixer) and Google Gears (http://gears.google.com/), "which lets you read your feeds when you're offline" and which blurs the lines for offline editing. "This blurring of lines is only going to continue. We haven't figured out how to get content to desktops; how do we get it into web applications?"
John Blyberg responded to Coombs' trends, saying that he sees academic libraries pursuing semantic web technologies, including developing ontologies. Coombs disagreed with this assessment, saying that "libraries have lots of badly-tagged HTML pages". Roy Tennant agreed: "If the semantic web arrives, buy yourself some ice skates, because hell will have frozen over". Breeding said that he longs for services-oriented architecture (SOA) but "I'm not holding my breath". SOA can deliver true information applications built from the start, but current systems wrap around legacy systems. True Web 3.0 applications are a long way off. Walt Crawford replied, "Roy [Tennant] is right - most content providers don't provide enough detail, and they make easy things complicated and don't tackle the hard things". Coombs pointed out most users do not want to do what is necessary to populate the XML documents that a semantic web requires. Coombs said "people are too concerned with what things look like", but Crawford interjected, "not too concerned".

Trends: "demise" of the catalog, "Software as a Service", shakeups in the ILS marketplace

Roy Tennant (www.libraryjournal.com/blog/1090000309.html), OCLC senior program manager, began his comments with a disclaimer that the panelists do not consider themselves experts but "lucky people who get to spout". Tennant listed his trends:

• Demise of the catalog, which should push the OPAC into the back room "where it belongs, where it started its life", and elevate discovery tools like Primo, Verde and Encore, as well as OCLC WorldCat Local, to help people find information. The tools can unify more information sources than just the online catalog. Tennant suggests we "kill the term OPAC".

• Software as a Service, formerly known as ASP and hosted services, which means librarians "don't have to babysit machines, and is a great thing for lots of librarians", and libraries can get out of the business of running software. Library vendors, SirsiDynix, and OCLC can support the software so libraries can use the systems that vendors support for them. The interface and configuration can be tailored to individual libraries, and the benefits include software updates, transparency of service, and painless operation.

• Intense marketplace uncertainty due to the private equity buyouts of Ex Libris and SirsiDynix and the rise of Evergreen and Koha as looming open source options. Intense marketplace uncertainty aids a push towards open source systems. Tennant also said he sees "WorldCat Local as a disruptive influence". Aside from the ILS, the abstract and indexing (A&I) services are being bypassed as Google and OCLC go direct to publishers to license content. Where do indexers fit in when someone like Google goes directly to the publishers and full text? Will Google, with direct content access and internal cross-linking of citations, make the business of creating an index irrelevant? Eventually, an ILS will be used mostly for back room maintenance, not the front end.

An audience member asked if libraries should get rid of local catalogs, and Tennant said "only when it fits local needs".

Trends: privacy issues, "slow library movement", library as publisher

Walt Crawford (http://walt.lishost.org/) spoke next and stood for the benefit of the people at the back of a very large room. Crawford's trends include:

• Privacy still matters.
Crawford questioned if patrons really wanted libraries to turn into Amazon in an era of government data mining and inferences that could track a ten-year patron borrowing pattern. Before libraries rush to emulate commercial services, be sure people understand what that level of referral and personalized service means and whether this is what people want. Intellectual freedom is key to democracy.

• The slow library movement (http://loomware.typepad.com/slowlibrary/), which argues that locality, where the library is part of the community, is vital to libraries; that mindfulness matters; and that open source software should be used "where it works". Crawford defines "mindfulness" as thinking about what you are doing and why. Pay attention to open source issues but use them in meaningful ways.

• The role of the public library as publisher, where libraries are doing this with very small teams by helping local people get published. Crawford pointed out libraries in Charlotte-Mecklenburg County, Vermont libraries that work with Jessamyn West (www.librarian.net/), and Wyoming as farther along this path, and said the "tools are good enough that it's becoming practical". Crawford described local publishing as a key role for libraries in the world of citizen content.

Blyberg commented on Crawford's presentation, saying systems "need to be more open to the data that we put in there" and there is room in the online catalog for more than MARC records. Williams said that content must be "disaggregatable and remixable", and Coombs pointed out the current difficulty of swapping out ILS modules and said Electronic Resource Management (ERM) (www.libraryjournal.com/article/CA6440577.html) was a huge issue. Tennant referenced the Talis platform (www.talis.com/), and said one of Evergreen's innovations is its use of the XMPP (Jabber) protocol (www.xmpp.org/), which is "easier than SOAP web services, which are too heavyweight".

Marshall Breeding responded to a question from the audience asking if MARC was dead, saying "I'm married to a cataloger, but we do need things in addition to MARC, which is good for books, like Dublin Core and ONIX". Coombs pointed out that MARCXML is a mess because it is retrofitted and does not leverage the power of XML. Crawford said, "I like to give Roy [Tennant] a hard time about his phrase 'MARC is dead', and for a dying format, the Moen panel was full at 8 a.m." The 8:00 a.m. meeting from the previous day on MARC cataloging drew an audience that filled the room and had people standing out in the hall. "There's obviously still interest".

A question from the audience asked what happens when the server goes down, and Blyberg responded, "What if your T-1 line goes down?" What happens if the electricity in the library goes out? Joan Frye Williams exhorted the audience to "examine your consciences when you ask vendors how to spend their time". Coombs agreed, saying that her experience on user groups had exposed her to the "crazy competing needs that vendors are faced with - [they] are spread way too thin". Williams said there are natural transition points and she spoke darkly of a "pyramid scheme" and that you "get the vendors you deserve". Coombs agreed, saying, "Feature creep and managing expectations is a fiercely difficult job, and open source developers and support staff are different people".

Trends: behaviors; new menu of end-user focused technologies; grasping the full potential; learning from mistakes

Joan Frye Williams (www.jfwilliams.com/), information technology consultant, offered trends that were not specifically about technology but about behavior:

• The circular path of systems that run in cycles is part of the process by which libraries confront technologies and find where they fit. Libraries can get caught in cycles of doing things in the same way but with new technologies.

• A new menu of end-user focused technologies. Williams said she worked in libraries when the typewriter was replaced by an OCLC machine, which did not change the workflow processes but "automated" them. Libraries are still not using technology strategically. "Technology is not a checklist." She talked about how she related this to her niece, who uses a mobile as a phone, as a flashlight, as a camera, for texting, and so on, while Joan still considers it just to be a phone. This is the difference between simply seeing a new technology and recognizing how it changes the possibilities. Williams chided that the 23 Things (http://plcmcl2-about.blogspot.com/) movement of teaching new skills to library staff was insufficient, since people were being motivated to just look at or try technology. Williams said you do not just have to try or spot a technology, you have to grasp its potential, and that is threatening for many people. Williams also talked about how we have to remember not to implement a new technology and then abandon it. Do not stop once the new technology is implemented but do more with it than turn it on and step back. Learn from the mistakes and the successes of your new technology and take it a bit further. Even though the technology scares some people, we need to be able to grasp its full potential. "We're toast if we don't grasp the full potential."

• The ability for libraries to assume development responsibility in concert with end users. Joan described comments from people she works with on the utility of online book sites, where half the people find the book site really cool with huge upside potential, but the other half say "it's not a library / you know it's gonna break / I'm not sure we can guarantee quality". "Well hello, discovery has left the building, fulfillment is not far behind", Williams chided. Libraries are holding back from developing useful services by being afraid of irrelevance, which is self-fulfilling. "If the civilians don't need us, will they still want us?"

• Have to make things more convenient, adopting artificial intelligence (AI) principles of self-organizing systems. Williams said, "If computers can learn from their mistakes, why can't we?" We have a reluctance to be involved more directly in the development cycle. Modify ourselves based on what we learn in real life. Currently there is an absence of feedback to intelligently evolve the system.

Questions from the audience to all panelists followed. A questioner asked why libraries are still using the ILS. Coombs said it is a financial issue, but Breeding responded sharply, "How can we not automate our libraries?" Walt Crawford agreed, "Are we going to return to index cards?" When the panel was asked if library home pages would disappear, Crawford and Blyberg both said they would be surprised. Williams said "the product of the [library] website is the user experience". She said Yorba Linda Public Library (CA) (www.ylpl.lib.ca.us/) is enhancing its site with a live book feed that updates "as books are checked in, a feed scrolls on the site".
The panelists agreed the nature of the library website as a place will change just like the physical library is changing. It will become more interactive and collaborative as mashups of library data increase and are used directly instead of visiting the library website. The library website will still be necessary. When asked by an audience member why the panel did not cover toys and protocols, Crawford replied "outcomes matter", and Coombs agreed, saying "I'm a toy geek but it's the user that matters". Many participants talked about their use of Twitter (www.twitter.com/), and Coombs said portable applications on a USB drive have the potential to change public computing in libraries. Users' interaction with information is changing, and we are responding. This is where much of the current environment of change comes from. Tennant recommended viewing the Photosynth demo (www.ted.com/index.php/talks/view/id/129), first shown by Blaise Aguera y Arcas at the TED2007 (Technology, Entertainment, Design) conference (www.ted.com/). Finally, when asked how to keep up with trends, especially for new systems librarians, Coombs said, "It depends what kind of library you're working in. Find a network - ask questions on the code4lib (IRC) channel (www.code4lib.org/)". People asked for recommendations on how to keep aware of technology trends. Blyberg recommended constructing a "well-rounded blogroll" that includes sites from the humanities, sciences, and library and information science, to help you be a well-rounded feed reader. Tennant recommended a (gasp) dead-tree magazine, Business 2.0. Coombs said the commercial Gartner website (www.gartner.com/) has good information about technology adoptions, and Williams recommended trendwatch.com (http://trendwatch.com/).

Links to other trends:
Karen Coombs' Top Technology Trends (http://litablog.org/2007/06/20/karen-coombs-top-technology-trends/)
Meredith Farkas' Top Technology Trends (http://litablog.org/2007/06/15/meredith-farkas-top-technology-trends/)
3 Trends and a Baby (Jeremy Frumkin) (http://litablog.org/2007/06/24/466/)
Some Trends from the LiB (Sarah Houghton-Jan) (http://litablog.org/2007/06/20/some-trends-from-the-lib/)
"Sum" Top Tech Trends for the Summer of 2007 (Eric Lease Morgan) (http://litablog.org/2007/06/15/sum-top-tech-trends-for-the-summer-of-2007/)

Mitchell Brown (mcbrown@uci.edu) is a co-editor of LHTN and the Chemistry and Systems Science Librarian at the University of California, Irvine Libraries, California, USA.

work_yti4mxs76bauvjlq6sjirnupvq ---- Usability of digital libraries: a study based on the areas of information science and human-computer-interaction

World Library and Information Congress: 71st IFLA General Conference and Council, "Libraries - A voyage of discovery", August 14th-18th 2005, Oslo, Norway. Conference Programme: http://www.ifla.org/IV/ifla71/Programme.htm
July 21, 2005
Code Number: 165-E
Meeting: 157 Statistics and Evaluation with Information Technology and with University and Research Libraries

Usability of digital libraries: a study based on the areas of information science and human-computer-interaction1

Sueli Mara Ferreira2
Denise Nunes Pithan3

ABSTRACT
The conception, planning and implementation of digital libraries, in any area of knowledge, demand innumerable studies in order to verify and guarantee their adequacy to the users' necessities.
Such studies find methodological, conceptual and theoretical support in some areas of knowledge, such as Human-Computer Interaction (HCI) (usability studies, in particular) and Information Science (IS) (especially studies about users' necessities and behavior in information search and use). This research, therefore, intends to integrate concepts and techniques from these two areas; that is, it analyzes the usability of the InfoHab digital library, having as theoretical base the Constructivist model of user study proposed by Carol Kuhlthau and the criteria of usability established by Jacob Nielsen. In order to do so, a qualitative study with six users with different levels of academic formation and experience in the use of retrieval systems was developed. Data was collected through personal interviews, a prototype of the library, direct observation, and image and sound records. The variables of this study included the following criteria: learnability, efficiency and effectiveness of the digital library, management of errors, memorability and the user's satisfaction, from the perspective of cognitive and affective aspects and the actions taken by the users during the information search process. The aspects identified in the collected data are discussed and the results are evidence of the possible synergy between the HCI and IS fields. We thus expect to contribute conceptually to a discussion about a model of usability study that can be more inclusive and incorporate the aspects pointed out by the Constructivist model.

Keywords: Digital Libraries Design; Constructivist Model; Usability; Usability Studies

1 Paper presented at the Session 'Measures and standards in the electronic age' organized by the Statistics and Evaluation Section, University and General Research Libraries Section and Information Technology Section in the World Library and Information Congress: 71st IFLA General Conference and Council. August 18th, 2005, Oslo, Norway.
2 Ph.D. Professor from the Librarianship and Documentation Department of the Communication and Art School of the University of São Paulo, Brazil, and Coordinator of the Nucleus of Research 'Design of User-centered Virtual Systems'. Email: smferrei@usp.br
3 Researcher from the Nucleus of Research 'Design of User-centered Virtual Systems'. Special student from the Post-graduation Program of Communication Sciences of the Communication and Art School of the University of São Paulo, Brazil. Email: denise.pithan@poli.usp.br

1 INTRODUCTION

The conception, planning and implementation of digital libraries, in any area of knowledge, demand innumerable studies in order to verify and guarantee the final product's adequacy to the users' necessities. Such studies find methodological, conceptual and theoretical support in some areas, such as Human-Computer Interaction (HCI), for the usability studies, and Information Science (IS), for the studies about information needs and users' behavior during the information search and use processes. According to Norman and Draper (1986), the area of HCI studies the contact between computer systems and human use, more specifically, the interaction that occurs in this process. Norman continues, "the properties attributed to the system as the interface, the language, the orientation on the tools and devices, the work load, flexibility, compatibility with other systems, communication, as well as the effort to work, intervene directly in this interaction."

In this context, usability is understood as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use" (ISO 9241-11, 1998). Information Science, in turn, proposes "the holistic understanding of the human being while individuals with cognitive, affective and physiological needs and they operate inside of projects that are part of an environment with socio-cultural, economic and political restrictions. These needs, the projects and the environment form the base of the context of the behavior of information search" (FERREIRA, 1995). Studies with such focus make it possible for the planners of digital systems to better understand the users' mental models and facilitate the development of designs that are more useful and adherent to the target public's necessities.

This research intends to integrate concepts and techniques of these two fields, carrying out a usability study theoretically based on Carol Kuhlthau's Constructivist model of user study and Jacob Nielsen's quality components of usability. Based on Kuhlthau (1991), the presupposition of this study is that by observing the information search process from the user's perspective and analyzing the cognitive and affective aspects present during the interaction with the system, we can diminish the gap between the user's natural process of information use and the one proposed by the information systems.

Therefore, InfoHab, Center of Reference and Information in Habitation, was studied. This center is a digital library in the area of Construction that offers researchers, professionals and companies a free digital databank on Brazilian technical and academic production in the construction field. Since 2000, this library has intended to integrate associate entities, government agencies and universities4. In order to incorporate new scientific communication support services, InfoHab reorganized its system and, in particular, its interface of access. Currently, the Library allows the user to search for publications about the subject, as well as opportunities for divulgation and participation in events of Civil Engineering. This article, thus, describes our qualitative study. It starts with a review of the fundamental concepts of HCI and IS used in this research. Then it presents the methodology applied, analyzes the results found and recommends future studies.

4 Supporters of InfoHab: Associação Nacional de Tecnologia do Ambiente Construído (ANTAC), FINEP (Studies and Projects Supporter) via the Habitare Program, CNPq (RHAE Program), Caixa Econômica Federal (CEF, a national bank) and the Brazilian State Department of Science and Technology (MCT). Academic entities that participate in the process of collection, treatment and feeding of the system: Federal University of Santa Catarina, University of São Paulo, Federal University of Rio Grande do Sul, Federal University of Fluminense, Federal University of Bahia, Federal University of São Carlos, University of Campinas, Federal University of Rio de Janeiro and Regional University of Chapecó.

2 LITERATURE REVIEW

The late twentieth century was marked by the following characteristics derived from the Internet: a boom of available information and a fast growth in the number of connected computers. Researchers like Castells (2003) and Lévy (2003) have argued about the social, economic and political changes originated by the use of the new technologies of information and communication in the network-connected society. These great alterations in all the scopes of human activity have only become possible to the extent that the new technological resources of information and communication have been accessible to people without specialized formation in computer science. The proliferation of information systems (including databases, digital libraries, websites, among others) shows the difficulty designers are faced with in the attempt to capture and to satisfy users' expectations and interests.

This situation implies a rethinking of systems planning and design, in order to add differentiated values. As a result of innumerable research projects in this direction, one can note that to guarantee and to add value to the systems implies drawing and projecting products and services centered on the users' needs and focused on the way users perform their tasks. Therefore, it is essential to consider both cognitive and operational aspects involved in the process of information search and use (NORMAN, 1986; DERVIN; NILAN, 1986). Norman (1986, p. 61) defines the user-centered system as the design carried out from the user's point of view, thus emphasizing people rather than technologies. This proposal refers to planning and developing a system, especially interfaces, focusing on the users' necessities, perceptions, mental models and information processing structures. Researchers of various areas of knowledge have studied methodologies and developed methods and techniques aiming at guaranteeing systems with the characteristics mentioned above. Some examples are the usability studies detailed by the area of Human-Computer Interaction and the studies about information search and use behavior prescribed by the alternative approaches of user studies in the Information Science field.

2.1 Usability

According to Nielsen (2000a), usability has become a question of survival in the economy of the Internet. The author affirms "there is an abundance of available sites, [therefore] to leave is the first defense mechanism when the users find difficulties". These difficulties are usually related to the organization schemes, navigation systems, search systems and labeling systems of information in the web. That is, because of the great number of available options today, the information architecture can determine the user's permanence in or abandonment of virtual systems (ROSENFELD; MORVILLE, 2002).

Usability, as Nielsen (2003) argues, is a quality attribute that assesses how easy user interfaces are to use, making it possible for customers to carry out tasks in a clear, transparent, agile and useful way. This concept corroborates the one prescribed by norm ISO 9241-11 (1998), which considers usability the "capacity that an interactive system offers its users, in a determined operation context, for the accomplishment of tasks, in an effective, efficient and pleasant way." For the Usability Professionals' Association (UPA), usability is directly related to the quality of the product, as well as to the user's efficiency, effectiveness and satisfaction. This same association defines usability as a set of techniques developed to create usable products, with a user-centered approach. Nielsen (2003) considers that the usability of a system can have five quality components:

• Learnability: How easy is it for the users to accomplish basic tasks the first time they encounter the design?
• Efficiency: Once users have learned the design, how quickly can they perform tasks?
• Memorability: When users return to the design after a period not using it, how easily can they reestablish proficiency?
• Errors: How many errors do users make, how severe are these errors, and how easily can they recover from the errors?
• Satisfaction: How pleasant is it to use the design?

Usability tests play an important role in each stage of the process of virtual systems development, especially in the design of the interface, the "space" in which the interaction between the user and the system's available content, services and products occurs.

2.2 User Studies – Information search and use behavior

The "user-centered" perspective (or alternative studies, as referred to by Dervin and Nilan, 1986) was initiated in the 1970s, in the Information Science field, when the necessity appeared to extend the focus of research, concentrating on the individual actors of the information search and use processes, in social, practical and cultural contexts. "The approach focuses on the user's problems and on the production of meaning, pointing out that the efficiency of the information recovery depends on the integration of the results with the user's life and especially on the evaluation the user makes about the utility of the information to solve problems" (JAMES5, 1983; HALL6, 1981; INGWERSEN7, 1982 apud KUHLTHAU, 1991). While the system-oriented studies (studies of use and usability) examine what happens in the informational environment external to the individual, the user-oriented studies also examine the individual's psychological and cognitive necessities and preferences and how they affect the patterns of search and use of information (CHOO, 1998). Therefore, such studies focus on the analysis of internal behavior and/or behavior externalized through non-verbal communication, allowing individuals to construct and project their movement through time and space.

5 JAMES, R. Libraries in the mind: how can we see users' perceptions of libraries. Journal of Librarianship, p. 19-28, 1983.
6 HALL, H.J. Patterns in the use of information: the right to be different. Journal of the American Society for Information Science, v. 32, p. 103-112, 1981.
7 INGWERSEN, P. Search procedures in the library analyzed from the cognitive point of view. Journal of Documentation, v. 38, p. 583-600, 1983.

The development of user studies, from this perspective, has been pursued and described by three distinct approaches: the User-Values Approach by Robert Taylor (1994), the Constructivist Model by Carol Kuhlthau (1991) and the Sense-Making Approach by Brenda Dervin (1986). Among these, the Constructivist Model suggested by Carol Kuhlthau (1994) emphasizes the occurrence of the affective and cognitive states that certainly appear in an information search process. Its central axis is the "Information Search Process" (ISP), considered as "the user's constructive activity of finding meaning from information in order to extend his or her state of knowledge on a particular problem or topic". This process occurs in phases experienced by individuals as they build their view of the world by assimilating new information (KELLY8, 1963 apud KUHLTHAU, 1991). The analysis of these phases must incorporate three aspects of activities: physical (real actions performed by the users); affective (experienced feelings); and cognitive (ideas related both to the process and to the content).

8 KELLY, G.A. A theory of personality: the psychology of personal constructs. New York: Norton, 1963.

Kuhlthau (1991) identified, analyzed and described the six phases of her model of the ISP: initiation, selection, exploration, formulation, collection and presentation. The first one, initiation, is marked by feelings such as uncertainty and apprehension, and its commonest thought is the search for something vague and general. Here the user's task is to recognize the necessity of information and to talk to other researchers so as to look for similar experiences. In the second stage, selection, the task is to identify the general topic of the research, in which feelings of optimism appear after the task is completed. After the general topic is selected, the user goes to the third stage of the ISP, exploration, in which confusion, frustration and doubtful feelings occur, since the user's task is to investigate the general topic and search for new and relevant information. The fourth stage is formulation, in which the task is to formulate a perspective focused on the needed information. Here, feelings of uncertainty and doubt turn into confidence and clarity. Kuhlthau considers this stage the critical point of the ISP, because if the user cannot determine the focus of the search, he or she will probably have difficulties in the following stages. In the fifth stage, collection, a sense of direction starts to appear, as well as the researcher's interest in the subject. The commonest actions are seeking the pertinent or focused information in the more appropriate sources, such as the libraries. The sixth stage, presentation, is the moment to finish the search and the production and to present the final knowledge resultant from the research. The table below summarizes the set of feelings, thoughts, actions and tasks that occur during each stage of the Information Search Process.
Nielsen (2000a), for example, affirms that the users can feel anguish and uncertainty during a visit to a website, and he attributes these feelings to mistakes in the systems interfaces. Based on these evidences, this research aims at checking out whether the integration between the concepts and methods proposed by Carol Kuhlthau’s Constructivist model and the quality components of usability established by Jacob Nielsen, in a study with a specific digital library, contributes to the enlargement of our knowledge about the subject. 3 RESEARCH METHODS This is a qualitative empirical research that analyzed the interaction and use made by a group of users of the InfoHab digital library, considering specifically the affective and cognitive aspects found and the actions the users took to solve situations presented to them. This study also describes the selected sample, the variables and the methods of data collection. 7 3.1 Definition of the Sample To Nielsen (2000b), the number of test users can influence the identification of the problems of usability of a website. One user makes it possible to identify about 25% of usability problems, while fifteen users allow us to identify 100% of the problems. The number of usability problems found in a usability test with n users (NIELSEN, 2000a, p. 1) On the one hand, Nielsen (2000a) shows that the usability evaluation would have to be made with fifteen users but, on the other hand, he considers the test can be trustful enough with five users. According to him, by testing the site with five users it is possible to identify a great part of the usability problems (about 85%) without the unnecessary involvement of many resources or users. However, he recommends that studies should be made systematically each time the site project is reformulated so as to correct errors of usability pointed out by the users and other errors generated by the reformulation itself (NIELSEN, 2000a). For this research, we first analyzed the users registrations in the InfoHab Library and noticed a significant presence (about 80% of the 7.789 registrations) of students (under-graduate and graduate students) and professors. Among this academic public, this research selected users from the Department of Civil Engineering of the Polytechnical School of the University of São Paulo (USP) in Brazil. From the 33 professors and 65 students in this department, 19 professors and 44 students are registered in InfoHab. This research works only with academic public, following Nielsen’s (2000b) recommendation. Six users were invited to compose the sample of this study: an experienced doctor professor, a professor recently awarded a doctorate, a master’s course freshman, a doctorate student and two under- graduate students - fifty percent of them already use InfoHab. 3.2 Variables of the study This research considered the five variables by Nilsen (2003): learnability, efficiency, memorability, errors and satisfaction - all observed from the perspective of the feelings, cognitive process and actions taken by the users during the interaction with the digital library (following the model by Kuhlthau, 1991), which, therefore, have become variables of the study. Each variable used in this study is defined as follows: • Learnability: users’ assimilation of distinct ways of solving problems or using InfoHab. • Efficiency: easiness of task accomplishment, verified through the fluency and difficulty felt by the user during the task performance in InfoHab. 
8 • Memorability: the possibility of the user to remember interactions with the system, explaining them or acting in order to repeat correctness and prevent errors. • Errors: errors occurred due to internal problems of the system or to users’ misuse, as well as the analysis of the answers that the system emits in the various interactions with the users. • Satisfaction: pleasantness in the use of the site as well as the way efficiency and effectiveness of the system was perceived by the user. • Feelings: user’s feelings revealed during each phase of the information search process: first contact and knowledge of the new InfoHab interface and its use for the solution of a given task and its conclusion. • Cognitive processes: thoughts formulated by the user during the phases involved in the accomplishment of a task. • Action: actions taken by the users to know the new interface and to accomplish the task. 3.3 Data Collection Appealing to an on-line prototype of the InfoHab digital library, the data collection was divided in three phases: random exploration of the new interface of InfoHab, performance of a task predefined for the research team and an interview at the end of the meeting. In the phase of the random exploration, the researcher explained the objectives of the research and questioned the users about how they used to search information about events and scientific publication as well as about their expectation about the services that a library should offer. After that, the digital library was presented and the user was requested to visit InfoHab freely and to say out loud each and every idea and thought that came to their minds. After a short period of navigation, the researcher made some questions about their impressions and opinions. After that, the users had to perform two predefined tasks of information search request, which demand the use of the system. Also, during this interaction, the user was requested to say out loud his/her thoughts and actions. The tasks, designed to demand the use of some available functionalities in InfoHab, were: (1) to save a list of the master’s theses on "rice rind ash" defended at the Federal University of Santa Catarina and to access the complete document; (2) to identify the events that will take place in 2005. After the tasks were completed, the researcher carried out a half-structured interview, whose objective was to identify the user’s perception about his/her performance and difficulties, strong and weak points of the system, level of satisfaction, emotional and cognitive aspects involved in the interaction with the system. Also, we made prospective questions aiming at identifying the user’s expectations, priorities and suggestions. All the three phases of data collection were filmed with a digital camera placed so as to follow the face and corporal expressions of the participants; also sound files were generated to record the interviews. The software Screen Record was used to follow and to register the users’ actions during the interaction with the systems. Also, direct observation of the interaction with the prototype of the system was used as a tool of data collection. In order to increase the degree of trustworthiness of the research, we used triangulation of empirical information collected from various sources of evidence. The duration of user participation in every stage of this research was about 35 minutes. 
9 4 DATA ANALYSIS AND RESULTS It was possible to obtain much data as evidence of the validity of studies that integrate criteria of the two fields (IS and HCI), identification of the users’ mental model when faced with the InfoHab website, finding of information architecture and content implementation problems. Due to the great diversity and depth of the data and results, this article presents some aspects of the results specifically related to the convergence and possible synergy between the mentioned areas, presenting them in accordance to the collection phases described in the methodology section. 4.1 Phase 1 – Random Exploration During the phase of Random Exploration, we aimed at collecting the initial impressions about InfoHab, as well as the main references of other informational resources periodically used by the participants as a source of scientific information. InfoHab was immediately identified as a service for the access of scientific publications in the area. It was also compared by some of the users to other systems of the University of São Paulo (Dedalus System of the Library). Normally, the users use as information source databases Scirus for Scientific Information, Science Direct and CAPES journals portal. Beyond these, sites of research groups from other universities and scientific associations were cited. The users considered themselves capable to use InfoHab easily and believed that they would find the same logical structure of the systems mentioned above: "I tend to remember logical things, if the path is logical, I’ll remember it.” Systems that apply and/or adopt a design that is familiar to the users’ cognitive model tend to become more logical, which increases the possibility of memorization of its characteristics and functionalities. All the users’ first impression about InfoHab was that it is a pleasant site in terms of visual aspects, organization and distribution of information. However, one of the users specifically commented on the used labeling system, inferring that it used unnecessary and unclear terms for the understanding of the content: “I would like to have a more defined image of what it is, for example, virtual nuclei or management of events [...] I would like to get a brief glimpse and already understand a little more before going on". 4.2 Phase 2 – Task Performance The users, in this phase, provided evidence that the use of their previous experiences with other search systems (cited in phase 1) when they tried to talk back their models of development of the requested task, for example the use of the word-key strategy, simple and advanced search. There was no consensus about the strategies used by the users, so various paths were followed, but all of them remade the search more than once. Some examples of the users’ commentaries during these activities are: "can the research be refined?”, "what happened with my search that it generated zero register". Other users brought suggestions of new available interesting and complementary functionalities in similar systems "maybe here in the key-words, look! There are ash, rice. This might be a link to other works. Dedalus is like this. There you have the key-word, so you click...” In general, in the information search, not all the users showed fluency to deal with the specific site (it took them from fifteen to twenty-five minutes to complete the task), although they are experienced users in activities of bibliographical searching. 
It happened due to problems of usability and architecture of information in the digital library that generated feelings of unreliability, anguish and discomfort during the process. Besides, the task involved more actions than a mere simple search, since users were supposed to save a list of references and to access the documents. Despite being able to locate the documents, none of users completed the task, although some declared they did. Therefore, although the user was apparently satisfied with his/her own 10 performance, the system offered more possibilities and these were not identified by the user, what certainly would have had a direct impact on the understanding of the mentioned satisfaction. It should be noted that the subject was not of interest to the user and this might not have generated great motivation. This is why a more detailed inquiry should be carried out to verify how the variable personal motivation derived from a necessity of real information influences the users’ actions and decisions. The same happened to the second search, for 2005 events. The users used the cognitive model they had consolidated from their experiences with the other systems of the University, e none of them was able to complete the task, generating a direct impact onto their self-esteem and confidence: "I feel a little... I don’t know if I got everything..." --- "I felt insecure, for example, did I do right, there in the events?" Another user compared the system with the previous one and considered that the events were divulged better in the previous system. Both results corroborate the proposal by Borgman (19869, adapted by BISHOP, HOUSE, BUTTENFIELD, 2003) when it describes the three abilities that are necessary to the users to carry out a search in digital libraries: conceptual knowledge about the process of recovery of information, semantic and syntactic knowledge of the area to implement an adequate query and, finally, technical abilities in the use of the selected source to perform the search. In relation to the memorization and learning variables, we observed that the problems mentioned above generated an impact directly onto these components of the digital library in question, since when we change all a preexisting model, we must justify it logically so that the user recognizes it as valid and rethink and reformat his/her previous standard. In the analysis of the digital library efficiency, some technical errors were observed (for example, once the site did not offer the user the chance to return to the previous movement and obliged him/her to restart the task from the beginning), others were due to the search tool specification, which did not offer certain expected functionalities (such as: refinement of the search result). In those situations, the most common feelings were insecurity, anguish, scare, discomfort, impatience and frustration, besides the great deal of time spent to finish the task. This demanded intervention by the interviewer to maintain users’ motivation, and assistance so that they could solve the problems and continue the task. One of the users felt embarrassed and thought: "I think I do not know how to make a research any more." 4.3 Phase 3 - Interview The final interview was important to analyze some aspects of the memorization easiness, learning of the dynamics, general impressions after use and final satisfaction with the InfoHab digital library. 
In general, the users demonstrated easiness in learning and remembering the steps they had taken to perform the task, when asked during the interview. However, it is not possible to affirm that the page is easy to memorize, since there are many conflicts between the proposed model and the users’ mental models. It is also necessary to verify whether after a period of absence the user remembers the steps to perform those tasks made during this research. However, although the users were faced with difficulties during the accomplishment of the tasks, they felt satisfied at the end, in part due to the interaction in the interview process, in which they were presented with other services not identified before or other forms of search task performance that they did not know. Thus, Dervin’s (1984) comment can be confirmed when he says that 9 BORGMAN, C.L. (1986). Why are online catalogs hard to use? Lessons Learned from Information Retrieval Studies. Journal of the American Society for Information Science, 37 , 387-400. 11 qualitative studies that make the user remember and speak out his/her previous experiences help his/her learning, because they lead to a process of systematization and understanding of the problem that, usually, ends up extending his/her initial perception of the problem and of the information search process. In relation to the satisfaction with the site, the users suggested ways to improve it, such as: to diminish the amount of text, to increase the size of the font type, to allow refinement in the search and inclusion of new search criteria (like by date and geographical location). 5 FINAL CONSIDERATIONS The results evidenced the synergy between the areas of Human-Computer-Interaction and Information Science, according to the theory by Carol Kuhlthau (1991) and the proposal by Jacob Nielsen (2000a; 2003). Therefore, through the test of usability in the site of a digital library it was possible to evidence that to analyze information search and use behavior validates and adds new perspectives to the analysis of usability aspects. Thus, it was possible to observe that the users’ actions, feelings and thoughts, as well as their experiences disclose significant indications to learning components, memorization, errors, efficiency of the digital library and mainly users’ satisfaction. However, this synergy still needs other deeper studies that incorporate contributions from other areas of knowledge to explain still not investigated phenomena about this relation between usability, information necessity, information search process and users’ satisfaction. Another item that also deserves attention is related to specific studies on users’ nonverbal communication, since, as identified in this research, their body movements (noted by the interviewer and also registered by the tools of data collection) can evidence other factors related to cognitive and/or affective aspects that can contribute to the design of digital libraries. BIBLIOGRAPHICAL REFERENCES BISHOP,A.P.; HOUSE,N.A.Van/ BUTTENFIELD, B. Eds. (2003). Digital Library Use: Social Practice in Design and Evaluation. Cambridge, MA: MIT Press CASTELLS, Manuel. Lições da Internet. In: MORAES, Denis de. (Org). Por uma Outra Comunicação: mídia, mundialização, cultura e poder. Rio de Janeiro: Record. DERVIN, 1984. A theoric perspective ans research approach for generating research helpful to communication practice. Public Relations Research and Education l (1), 30-45. DERVIN, Brenda; NILAN, Michael.(1986) Information needs and uses. 
ARIST, v. 21, p. 3-33.
FERREIRA, S.M.S.P. (1996). Novos paradigmas e nova percepção dos usuários. Ciência da Informação, v. 25, n. 2.
HUTCHINS, E.L. et al. (1986). Direct Manipulation Interfaces. In: NORMAN, Donald A. User-Centered System Design: New Perspectives on Human-Computer Interaction. New Jersey.
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION (ISO). (1998). ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs) -- Part 11: Guidance on usability. Geneva.
KUHLTHAU, Carol. (1991). Inside the search process: information seeking from the user's perspective. Journal of the American Society for Information Science, v. 42, n. 5, p. 361-371.
______. (1994). Students and the Information Search Process: zones of intervention for librarians. Advances in Librarianship, v. 18, p. 57-72.
LÉVY, Pierre. (2003). Cibercultura na rede. In: MORAES, Denis de. (Org.) Por uma Outra Comunicação: mídia, mundialização, cultura e poder. Rio de Janeiro: Record.
MORRIS, Ruth. (1994). Toward a user-centered information service. Journal of the American Society for Information Science, v. 45, n. 1, p. 20-30.
NORMAN, Donald A.; DRAPER, Stephen W. (1986). Cognitive Engineering. In: ______. User-Centered System Design: New Perspectives on Human-Computer Interaction. New Jersey.
NIELSEN, Jacob (2000a). Projetando Websites: Designing Web Usability. Rio de Janeiro: Campus.
______. (2003). Usability 101: Introduction to Usability. Useit.com: Usable Information Technology. UseNet Alertbox, August. Available at: http://www.useit.com/alertbox/20030825.html. Accessed: 7 Dec. 2004.
______. (2000b). Why you only need to test with 5 users. Useit.com: Usable Information Technology. UseNet Alertbox, March. Available at: http://www.useit.com/alertbox/20000319.html. Accessed: 27 Jan. 2004.
ROSENFELD, Louis; MORVILLE, Peter (2002). Information Architecture for the World Wide Web. 2nd edition. Beijing: O'Reilly.
TAYLOR, Robert. (1994). Value-added processes in information systems. Washington: Ablex.
USABILITY PROFESSIONALS' ASSOCIATION. Institutional website. Available at: http://www.upassoc.org/

work_yulxrmrmabe2foroc54wcfeg5a ---- Added value of information and information systems: A conceptual approach

Rahmatollah Fattahi, Ph.D.1
Ebrahim Afshar, Ph.D.2

Abstract:
Purpose – Information, due to its nature, has numerous capabilities. Through utilizing these capabilities, information systems can add to the value of information. The purpose of this paper is to explain where and how added value emerges from the work processes in the library and information professions.
Design/methodology/approach – The paper begins with a review of the related literature and then takes a conceptual approach to discuss different values of information and IR systems; it elaborates on how each of the processes of needs assessment, selection, description/organization, storage/processing, search/retrieval, and dissemination generates capabilities that lead to added value.
Findings – The paper identifies that added value is generated through processes such as reproduction, exchange, transfer, refinement, analysis, interpretation, synthesis, and regeneration of information. Many of such processes turn information into knowledge.
Practical implications – Librarians and information specialists need to find practical ways, in their work processes, of designing information systems and services that can generate added value for information.
Research limitations/implications – This paper is based on the authors' reflections on the added value generated by library and information practice. Further empirical studies are needed to substantiate the extent to which such values are generated through information systems and services in the real world.
Originality/value – In the present evolving conditions, library and information professionals are able to add to the value of information by combining their knowledge with the expertise of computer scientists and by finding up-to-date methods of optimizing existing systems as well as designing new ones. These are the two strategies along which the profession should guide its educational, research, and practical endeavors.

Keywords: Information, Information systems, Information services, Added value.
Article Type: Conceptual paper

(1) Associate Professor, Department of Library and Information Science, Ferdowsi University of Mashhad, Iran (fattahi@ferdowsi.um.ac.ir)
(2) Assistant Professor, Department of Library and Information Science, Isfahan University, Iran (e_afshar@hotmail.com)

Introduction

Library and information services, two professions whose subject of interest is information and knowledge, generate added value in the strict financial sense besides serving the community's needs. That is, added value is generated from the processes and functions they undertake. The main functions of the two include needs assessment, selection, provision, processing, organization, and dissemination of information. The subject matter of these functions is information – the engine of progress for individuals, communities, and nations.

A considerable body of literature has been published on the scientific, industrial, economic, social, and political value of information. Cleveland (1982, 1985), Burk and Horton (1988), Maguire (1990), Matarazzo and Prusak (1995), the Volpe National Transportation Systems Center (1998), and Woldring (2001) have enriched our thinking about the value of information to the contemporary organization, particularly where library managers are under pressure to demonstrate their worth and the value of their library. Robert Hayes (1997), in his contribution to the International Encyclopedia of Information and Library Science, has comprehensively examined the value of information from micro- and macro-economic points of view.

An overview of the library literature and the literature of other fields indicates that nowadays not only library and information service professionals but also many other professions highly value information. Today the motto "information is power" has become a commonplace truth. Nevertheless, it appears that members of the community, and even library and information managers, do not have a clear understanding of what constitutes added value in respect to information (Matarazzo and Prusak, 1990). The same is true of many top managers, who rarely use the library directly and thus have an unclear understanding of its value (Saracevic and Kantor, 1997). The aim of the present paper is to explain where and how added value emerges from the work of library and information professionals.
But it seems in order to first ask: what is added value?

What is added value?

The Oxford English Dictionary quotes the definition provided in the Terminology of Management and Financial Accounting (1974) for added value as "the increase in market value resulting from an alteration in the form, location, or availability of a product or service, excluding the cost of bought-out materials or services". Another definition, also quoted in the OED, holds added value to be "the gap between what the customer pays and what the manufacturer and supplier has to pay for the material". In the profit-making sector, all effort is directed towards either generating added value or boosting it, that is, maximizing the gap between the value of the input and the output. The problem is how added value could be calculated, or even considered in strict terms, for the products of library and information services, as long as they do not sell their products or, even if they do sell them, are not looking for profit (i.e., they aim only to cover expenses partially or entirely). Obviously, for those sections of library and information services which are run for profit, answering this question is a task in the hands of accountants.

Added value in information systems and services

The value of information can be addressed from different perspectives. Top managers see the value in decision making and operational management (Marshall, 1993; Hayes, 1997). Some researchers emphasize the monetary value of information and argue that the cost of a professional user's time and effort to obtain information elsewhere far exceeds the cost of providing a library or an information system (Griffiths and King, 1993; Keyes, 1995). The value of information in time saved, productivity, and improved work quality is highlighted in research carried out by the Volpe National Transportation Systems Center (1998). Similarly, Koenig (1992) stresses the correlation between the costs of information services and corporate productivity. To McGee and Prusak (1993), the value of information lies in its use for competitive corporate strategy. The ability to acquire, manipulate, interpret, and use information makes it possible for organizations not only to survive but also to stay ahead of their rivals.

Reports on the added value of information released in some countries or economic sectors indicate the strategic attention paid to this matter. For example, the Georgia Institute of Technology installed a campus-wide online library system in 1986 and reduced the costs of its literature searches by $1.2 million a year. Another case study was the library at the Houston division of Texas Instruments. In a survey conducted by the library, users' responses indicated that the library saved the company $268,800 a year and increased users' job proficiency by a value of $523,000 a year. From an annual investment in the library of $186,000, Texas Instruments netted $959,000 in benefits, a 515 percent rate of return (quoted in Matarazzo, et al., 1987). From an Asian perspective, the added value of the information market has gained much attention. The Government of Shanghai's official website (2004) states that the information industry in that municipality generated 35.04 million Yuan in the year 2000, a 28.8% growth over the previous year. The added value gained from the production of information was 1.25 billion Yuan, and the figure for information services was 12.09 billion Yuan, a 13% growth over the previous year.
Information in electronic form is capable of generating even more value. Bothma (1996) points to some properties of electronic information as being among the most important factors in its added value. For example, by synthesizing text, voice, and images, an increasing number of multimedia resources are now generated, and hundreds of thousands of job opportunities are created, even in developing countries. This has created considerable added value. Overall, one could conclude that added value depends on factors such as the quality of the product itself, the method and quality of innovation, the type of utilization, the conditions of use, the individual user (or customer), the time of use, and even the place of investment and use. Such factors, as will be discussed later in this paper, are applicable to information and information systems. Also, the notion of added value in relation to information services is not limited to economics: such services spread knowledge in society. To elaborate this, we first analyze the capabilities and values of information; we then elaborate on the capabilities of information systems for generating added value.

A. Capabilities and values of information

Information, due to its very nature, is capable of generating added value. In this section we elaborate on the characteristics that make this possible; the next section of the present paper discusses the capabilities of information retrieval systems for the same purpose.

1. Information can be purchased and sold

Many purchasable commodities are capable of generating added value as well. That is, they can be purchased first and then sold for a higher price. Information is no exception to this rule. The producer of information may sell it many times over, and the buyer in turn may also sell that information under certain conditions. Robert Hayes (1997: 119) considers this attribute a "capital resource". He stresses that information can be sold or given away without losing its value and content. The more a piece of information is capable of being bought and sold, the more its value may increase. No doubt, because of increasing educational, research, industrial, and economic activities on a global scale, the trend of production (provision) and consumption (demand) is surging. The increasing number of publishers, information suppliers, and Internet service providers (ISPs), and the overwhelming production of a variety of electronic information resources and databases, are indications of demand. This shows the capability of information as a commodity.

2. Information can be used and reused repeatedly

One distinctive attribute of information is that it can be used repeatedly. In other words, unlike many other assets, information is reusable. For example, information that an individual or a company generates or buys for some purpose can be used for other purposes, even by other individuals and companies. Think of the photocopy made of a journal article through a document delivery system. Another example is when a library's resources, acquired to meet the needs of its own target users, are used by others through interlibrary loan. This indicates that information potentially has added value, i.e., it is capable of returning much more than what has been paid to obtain it. Hayes (1997: 120) maintains that the cost of information is independent of the measure of utilization.
That means that the degree of utilization is not proportionate to the investment.

3. Information can be shared

Some commodities are capable of being used simultaneously by multiple users. Information enjoys this capability at the highest level: it can be used heavily without any depreciation caused by multiple uses. For example, many users might use a certain book. In the same way, tens of users or several libraries can use an electronic source or a database at the same time. Overall, "resource sharing" has allowed the optimization of information resources over the last two decades. Resource sharing has proven its economic value, and computer technology and networks have greatly facilitated it.

4. Information can be transferred through time and space

Information, when recorded on "hard material" (i.e., paper, metal, plastic, etc.), is similar to other commodities with regard to transportation. Books, journals, and audio-visual materials are transported like other goods: it takes comparable amounts of time, energy, and money to convey them through time and space, and despite all the measures taken, transportation of material goods involves time delays, disruption, loss, and so on. However, since the transmission of electrons became the vehicle for conveying information from one point to another, the time and energy factors have become almost non-existent and the cost has decreased steadily, ever since the telegraph was invented. In our time, electronic information can be transferred easily at minimum cost. It may be dispatched over long distances to consumers via telecommunication channels. Distribution of e-books, e-periodicals, e-files, e-records, and so on, on regional, national, and global scales, is a common practice nowadays. The ease of transfer of information in a networked world (i.e., the Internet) is the very core of the concept of an information society, expected to trigger unprecedented cultural and political changes around the world. Also, information can be accessed and "consumed" whenever the user wishes. This attribute is much valued by consumers of information: they need not be present at a specific time and place in order to use it.

5. Information can be processed

Like any raw material, information is capable of being processed, according to a certain plan or program and under certain conditions, to generate new information. The new information can be used for new and higher purposes. An important point in this respect is that, compared to other raw materials, processing information is in many cases simpler and less costly. In other words, investment in developing information processing systems (such as library software) requires fewer resources, yet its output in the long run is greater. As will be discussed later, information in electronic format is capable of being processed further; hence it generates a higher degree of added value.

6. Information can be reproduced

Some commodities can be produced in large quantities after the initial investment, and can be reproduced later with minimum expense. This attribute increases added value. Information resources enjoy such a capability. They can be reproduced in multiple copies once generated for the first time through writing, compiling, translating, etc. Similarly, reprints or new editions may be produced and sold in large quantities.
Offset printing reuses the technology with which earlier editions were prepared. Card catalogs can be reproduced; databases can be built by copying bibliographic records. Another prominent example is the legal copying of full-text electronic resources from the Internet.

7. Information can be refined

One interesting characteristic of information that increases its added value is the fact that it is refinable. Normally, as the amount of information grows over time and retrieval becomes more difficult, attempts to refine information (i.e., to identify and retain useful information and delete useless portions) become necessary. This in fact increases the level of control over existing information, improves access to it, facilitates retrieval, and saves time. All of these factors, especially time saving, involve added value. Weeding in libraries is performed for the same reason (i.e., to refine information). Preparing a new edition of an information resource while putting aside the older edition is, in fact, an act of refinement. The same is true of databases: with each edition the information content of records is further improved, and the deletion of unnecessary records adds to the usefulness of the database. Another example is that, by narrowing their search results, users of information systems can refine the information they retrieve.

8. Information can be interpreted, inferred from, and adapted

Information is a commodity that may be used in a variety of ways. Different understandings of a piece of information, in other words different interpretations of it, are always possible. Throughout history this property has been the reason for many intellectual, cultural, and political developments. This property of motivating consumers to react and to respond in the form of new interpretations or adaptations and the creation of new works is probably a prime example of how value is added to information.

9. Information can be synthesized and converted into knowledge

The capacity of information for synthesis is huge. At the most basic level, the survival of many living creatures, particularly human beings, and their adaptability to a changing environment depend on such a capability. In the case of human beings, the capacity of information for analysis and synthesis leads to the generation of new information (particularly in recorded forms). Books, articles, documents, records, and so on are products of the compilation, analysis, and synthesis of information. The production of new knowledge is, in fact, the transcendent stage of such processes. Compared to information, knowledge enjoys a higher level of added value; it can be utilized more effectively in strategic decision making. For the same reason, in some societies the emphasis has turned from generating and improving access to information toward exploring ways of transforming information into knowledge. When conditions are right, information, in synthesis with existing knowledge (approaches and epistemological frameworks), is capable of turning into knowledge. Robert Hayes (1997: 120) considers information a public good and knowledge a private good. He believes that information may be synthesized with other information, transforming it, shaping new theories, and eventually generating new knowledge. This is the most valuable capability of information, vital to the prosperity of society.
Information is the essential constituent in the generation of knowledge. The content of an authoritative encyclopedia, such as the Encyclopaedia Britannica, represents the accumulated knowledge of human beings; it is based on the synthesis of information created over many centuries and is, in fact, new knowledge.

B. How do information systems generate added value?

Having discussed the nature of information and its capacity for generating added value, we now consider how such added value is actually created. The added value of information does not emerge by itself: certain processes must take place. Since achieving this is impossible or extremely difficult for individuals, it is reasonable to assign this social responsibility to one or more professions with the required knowledge and skills. The library profession, from its early days (by whatever name it was then known), has had responsibility for the processes and functions that create added value out of information (i.e., selection, acquisition, processing, organization, and dissemination of information). Gradually and over a long time, the profession defined and implemented such processes within the framework of information systems. Essentially, the library itself is an information system that performs the variety of functions mentioned above within the framework of an integrated process, with the purpose of providing society with easy and quick access to information. In this way, libraries and information centers create different systems, ranging from very simple ones (such as lists and catalogs of resources, i.e., manual systems) to advanced forms (i.e., intelligent and networked computer systems).

Ease and speed of access to desired information are two major parameters in designing information systems. The aim is to save users' time, in particular experts' time, and the outcome of such time saving is the generation of added value. In this sense the library, the information center, and even the Internet are information systems. In most cases, of course, by "information system" we mean one or more databases with certain structures designed for the storage and retrieval of information. For example, a library OPAC, a database of journal articles, a dissertation abstracts database and the like, as well as library websites, are major, applied parts of the information system. Over the last several decades, librarians have gradually succeeded in increasing the capabilities of information systems with respect to the capacity, speed, and accuracy of information storage and retrieval. They have attempted to make it possible to maximize the benefit (or added value) obtained from every piece of information. A number of capabilities that add to the value of information can only be achieved through information systems, particularly computerized and electronic ones. Examples are the simultaneous use of information, the electronic exchange of information, the copying and reproduction of bibliographic information, transformation of the storage format, and so on. Hayes (1997: 124) states that information technology has exponentially increased the value of information. In this section we analyze the processes and functions of libraries and information service centers to show how they add to the value of information. What takes place in creating an information system is a process involving a number of specific operations that librarians and information experts perform routinely.
Most of these operations, which are performed on bibliographic information, generate added value. Taylor (1986) considers that three processes lead to the generation of added value with regard to information: "organizing", "analyzing", and "judgmental" processes. Within each of these three major processes there are additional specific activities, but the library and information professions consider these three their main functions. "Organization" is performed through operations such as descriptive and subject cataloging, classification, and indexing. "Analysis" is part of needs assessment, organization, and the study of information-seeking behavior. "Judgment" is an integral component of selection and of collection development policy (choosing resources on the basis of agreed standards). Judgment is also exercised in information retrieval (evaluating and ranking the results on the basis of their relevance) and is part of the performance assessment of the library. Similarly, the three main functions of the library profession (i.e., selection and acquisition, organization, and dissemination of information) are processes that make it possible to add more value to information and that provide a framework for transforming information into knowledge. The processes performed in information systems, and the resulting capabilities that add to the value of information, are as follows.

1. Selection and acquisition of information resources

Selecting useful information resources that best meet the needs of users is one of the first and most important requirements of an information system. If this is done properly, users of the system will have access to useful and relevant information, which can be critically important for them. Resources rightly selected for an information system will therefore see more and better use, in addition to saving the time users spend in gaining access to that information. This is one source of added value.

2. Description and organization of information resources

Compared with the function of selection, the description and organization of information add even more to the usefulness of the information stored in the system. In other words, most of the capabilities that add value to information are created at the stage of organization. Even the capabilities which surface at the stage of search and dissemination depend on the quality of description and organization. An ideal approach to the description and organization of information, capable of creating the utmost added value, is one that requires each information resource to be catalogued and organized prior to its publication (CIP: Cataloging in Publication). Other libraries and information centers can then use the bibliographic record created for that item instead of repeatedly recreating it. In this way a huge saving of time and effort takes place, and the organization of material is done at minimum cost. Recent advances in the production of electronic resources and their description at the very stage of file creation, based on metadata formats, have multiplied the added value of information. Resources based on a metadata format can be readily and automatically described and organized, and Internet search engines identify and index metadata-bearing resources more easily and in increasingly effective ways. The organization and representation of knowledge (i.e., subject analysis, indexing, and classification), carried out to create cohesion and order, are among the most important capabilities of an information system. They also add to the value of data.
In other words, the organization and representation of information based on thesauri and classification schemes make quick and easy retrieval and location of information possible. In addition, the organization of knowledge improves the user's understanding of the structure of knowledge and of the interrelationships among different disciplines. Among the existing approaches to organizing knowledge, some are more capable than others of adding to the value of information. Bothma (1996) observes that providing access to information via hierarchical structuring and hyperlinks is one of the most distinct ways of adding to the value of information. Similarly, Fattahi and Parirokh (2002) have considered a multi-level, hierarchical structure in the description and organization of information a most desirable approach for translating information into knowledge. Through a pre-arrangement of the categories and sub-categories of bibliographic families (i.e., works and their various editions and manifestations) and links between information entities at higher and lower levels, the systematic multi-level structure called the "super record" makes the organization and representation of information sources more comprehensible to the user.

In more general terms, building a library collection is itself an act of generating added value. By carefully selecting a small portion of the universe of information and forming a collection, librarians in fact create a "super text". Such a "super text" shortens the time and effort that potential users would otherwise need to spend in order to gain access to the information they need. In fact, it might be correct to say that assigning a Collection Level Description (CLD) to segments of a collection is an example of how the concept of "super text" comes into play. By assigning level descriptions to a collection, the user is provided with structured, open, standardized, and machine-readable metadata that convey an understanding of what is in the collection beyond its individual items. Macgregor (2003) has discussed the application of CLDs in digital libraries for both user resource discovery and institutional collection management.

3. Storage and processing of information

The storage of bibliographic data in information systems, particularly in a standard format (such as the MARC format and/or Dublin Core metadata), creates a variety of capabilities which add considerably to the value of information. In fact, by deconstructing bibliographic information and storing the data in different fields and subfields, information systems provide the potential on which a number of new functions become possible.

a. Capability of copying, exchange, and transfer of information

Perhaps one of the most important capabilities that add to the value of information in computerized systems, particularly in bibliographic databases, is the possibility of copying and transferring records from one system to another. Bibliographic records have a considerable market value. For this reason, some cataloging institutions and bibliographic agencies have invested heavily in creating huge bibliographic databases that allow customers to copy and transfer records between information systems. For example, OCLC, as a bibliographic agency, pays a certain amount to obtain each bibliographic record from a member library. Based on a cost-benefit analysis, each MARC record in OCLC databases (e.g., WorldCat, the authority file, the interlibrary loan system) is worth around $27 (Mathews, 2000).
In fact, such agencies invest once to create a record and then sell it an unlimited number of times. In this way they not only recoup what was invested in the purchase of the record, but also add to the value of their database. Over a period of ten years (1988 to 1998) OCLC earned more than $486 million (OCLC Annual Report, 1998). Similarly, in many countries a number of institutions, including national libraries, have been able to add a good deal of value to their bibliographic databases. Many libraries now use national bibliographies, either online or on disk, at minimum cost, and catalog their collections by transferring the relevant records into their own library systems. Beyond cataloging agencies and national bibliographic agencies, the copying and transfer of records is available in many ordinary databases. Libraries, and even individuals, can buy the records they need at minimum cost and add them to their own databases. This has become routine practice in many countries over the last decade. Libraries save a great deal of time and money in this way, while achieving consistency in data exchange and storage, an important factor in easy access. In the web environment, too, a similar possibility of copying and transferring bibliographic data has greatly added to the value of information. Users, including librarians, can now make free copies of what they retrieve on the web and forward them to others via email. At present, searching the catalogs of large libraries, particularly national libraries, has become routine: catalogers easily transfer desired records from the database of, for example, the Library of Congress to their own database, albeit taking copyright issues into consideration.

b. Capability of further processing of information

In addition to the copying and reproduction of information made possible by storage, further processing is another possibility that adds to the value of information. As mentioned previously, each record consists of different fields and subfields, and certain tasks can be defined for the bibliographic record according to the needs of the library. By tagging indexable and searchable fields (such as the fields for authors, titles, subjects, and call numbers), by defining data elements that help the user carry out advanced searches, and by arranging and sorting the results (i.e., the output) in desired formats (e.g., by author, subject, publication date, language, type of material, etc.), the same data in a single database are used in different ways to make search and retrieval quicker, easier, and more comprehensible, and thus to push the added value further. Other possibilities for further processing of the data stored in bibliographic databases include flexibility of output and display of information, and housekeeping and report generation for different purposes and for information management, which will be discussed later in this paper.

c. Possibility of global change in the database

One important advantage of computerized systems in terms of processing is the possibility of performing global changes, modifications, or corrections to certain data elements in all the records stored in the database. For example, it is possible to modify or replace a personal or corporate name heading, or a subject heading, in all the records within the database, in order to change an existing heading that is no longer considered appropriate or correct.
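As a minimal sketch of this idea (the record structure and function below are illustrative, not drawn from any particular library system), a global change amounts to a single routine that rewrites one heading in every record that carries it:

    # A sketch of a global heading change. Records here are simple Python
    # dictionaries; real systems store MARC fields, but the principle is
    # the same: one operation updates every affected record in the database.
    def global_change(records, field, old_heading, new_heading):
        changed = 0
        for record in records:
            headings = record.get(field, [])
            if old_heading in headings:
                record[field] = [new_heading if h == old_heading else h
                                 for h in headings]
                changed += 1
        return changed  # number of records corrected in a single pass

    catalog = [
        {"title": "Subject access to catalogs", "subjects": ["Cataloguing"]},
        {"title": "Indexing practice", "subjects": ["Cataloguing", "Indexing"]},
    ]
    print(global_change(catalog, "subjects", "Cataloguing", "Cataloging"))  # -> 2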
Such a global change means a huge saving of time and effort in data entry that would be impossible otherwise. The consistency achieved in data entry greatly enhances the accuracy of search and retrieval. Altogether, the possibility of global change makes better and easier processing of bibliographic data possible.

4. Integration of information

One very clear advantage of electronic information systems is the capability to link and integrate components of the system with one another, using linking techniques and standards of data exchange. This concept is called integration. The purpose of maintaining integration is easy access to separate modules as well as the exchange and sharing of data. For example, in a computerized library system all modules (such as acquisitions, cataloging, and loans) are linked, making it possible to share and/or exchange information across the different files in the system. In such a situation, the librarian can use the same terminal to access information stored in other sections, search in them, transfer data, and carry out quality control (i.e., editing records). The value of such a capability, in terms of saving time and money (no need to retype or re-enter data already existing in one section into another), simplicity of work (no need to exit one module in order to enter another), consistency of data, and control, is immense.

Integration in an information system also entails added value for end users, who are able to lodge their requests in one spot ("one-stop shopping"). One advantage of this can be seen in integrated library systems: end users easily move among the different sections of the system and access the information they need, reserve books, check their borrowing records, and so on, without needing to change the user interface. The same is true of websites and web-based library systems. The hyperlink technology available in the web environment is likewise a dramatic development for integrating information regardless of type and location. As mentioned before, data and information sharing among multiple systems (libraries and information centers), made possible by computer and networking technology, is one basis for the increase in the added value of information. In other words, sharing requires information systems, in particular those that are computerized and networked. Over the last two decades, information systems have been developed at the local, national, and international levels in which data sharing has been an important activity. Added value resulting from information sharing occurs when a specific investment, such as the purchase of a database or an electronic journal or book, is made; what is purchased is then downloaded into a system and made available to tens or even hundreds of end users. In effect, the initial investment is divided among tens of libraries or institutions. In another dimension, the Internet, and in particular the WWW as a global information system, has made the concept of information sharing tangible by providing end users with global access to millions of sites.

5. Information search and retrieval

The variety of capabilities that have enhanced the performance of information systems also enables such systems to retrieve easily information whose retrieval by mechanical systems would be impossible, difficult, or time-consuming. Such capabilities are invaluable to librarians, information workers, and end users, in terms of accuracy as well as speed.
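A minimal sketch of why such retrieval is fast appears below; the miniature records and terms are illustrative, not taken from any actual system. An inverted index maps each term to the set of records containing it, so the combined (Boolean) searches discussed under "Advanced search/retrieval capabilities" below reduce to cheap set operations:

    # A sketch of the inverted index behind fast on-line retrieval.
    from collections import defaultdict

    records = {
        1: "value of information services",
        2: "information retrieval systems",
        3: "library information systems",
    }

    index = defaultdict(set)
    for rec_id, text in records.items():
        for term in text.split():
            index[term].add(rec_id)

    # Combined search: information AND systems -> set intersection;
    # OR would be set union (|), NOT set difference (-).
    print(sorted(index["information"] & index["systems"]))  # -> [2, 3]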
In acquisitions and cataloging, for example, librarians use advanced and accurate search techniques, even based on a single data element, to retrieve the records they want, and so avoid creating new records (i.e., doing original cataloging), which is expensive and time-consuming. Examples of such advanced capabilities can be seen in OCLC's networked system, where thousands of catalogers and acquisitions staff, and even end users, visit the OCLC database every day from all around the world. The fact that end users now use OCLC is one aspect of the added value of the bibliographic information stored in OCLC databases.

a. Advanced search/retrieval capabilities

Advanced search capabilities offer advantages such as accuracy and exactness in the retrieval and identification of relevant information for end users as well. This prevents repeated searches and yields considerable savings in time and energy. The value of advanced searching is even more obvious when searching millions of web pages through a variety of advanced search options. Some of the advanced search capabilities used most frequently in information systems include combined search (using the AND, OR, NOT, and AND NOT operators), limiting a search (by, for example, language, place, date, type of material, publisher, or type of site), concept searching to avoid false drops, searching with controlled vocabulary authority files (subject headings or thesaurus descriptors) in the form of relational files, and so on.

b. Using relational files to expand or limit a search

As mentioned before, information systems increase the accuracy of retrieval by utilizing relational files (such as subject heading files, thesauri, or authority files). This is of high value to users. In addition, the ability to search relational files, in particular subject heading and thesaurus files, enables users to review these files and develop a relative understanding of the relevance of subject terms prior to searching. Hierarchical relationships, such as broader, narrower, and related terms, and preferred and non-preferred vocabulary (i.e., "see" references), articulated in such files, guarantee the effectiveness of searches in the database. Displaying the relationships between and among different subject areas makes the structure of knowledge easier to understand.

c. Facilities for hypertext searching in electronic resources

In addition to the capabilities mentioned above, hierarchical, and in particular hypertext, search facilities in databases and electronic resources lead to much quicker searching and more relevant results, adding considerably to the value of the information stored in such resources (Bothma, 1996). Nowadays many users of electronic databases and websites apply such searches to trace the information they need. In some library catalogs, hypertext links cover major elements of the bibliographic record, such as the main entry, added entries, and subject headings; in particular, where a thesaurus structure is linked to the database, hypertext technology is used to limit or expand searches. Examples of these hypertext links in web-based catalogs and other bibliographic databases are numerous. Another context in which the use of hyperlinks can add to the value of bibliographic information in OPACs is in relating and collocating on the screen all editions and manifestations of a work, and all works by or about an author.
The concept of super records, as described by Fattahi (1996, 1998), offers catalog users the advantage of searching and retrieving all the instances of a bibliographic family together, by presenting a more meaningful display.

d. Flexibility in output/display

Sorting search results by author, title, subject, publication date, or relevance ranking, and removing duplicates, are among the capabilities of many modern bibliographic databases. In addition to saving the user's time, such capabilities make the display of search results more comprehensible. Many producers of information systems develop their products by focusing on such characteristics; sorting capability is now part of the customization and user-friendliness of modern IR systems. Another advantage of electronic records is the possibility of varying the data display at the output stage. This is due to the structure of the electronic record, which contains fields and subfields, permitting outputs to be manipulated according to the needs of users. In other words, modern information systems have a kind of flexibility that enables users to obtain records from other systems and display them as desired without any extra expense. The purpose is to make information more comprehensible to the user (Yee and Layne, 1998).

6. Management of information

A function of information management is to add maximum value to the information stored in the information system. All the functions named to this point are, in fact, functions of good management; here, however, we focus specifically on report generation for managerial and housekeeping purposes.

a. Capability of report production and analysis in information systems

Report generation is a requirement of almost every management system, and information systems are no exception. The development of more sophisticated applications permits the generation of a wide range of managerial reports for almost every function of the system. In fact, a system manager is able to add maximum value to the data stored in the system by producing such reports. Managerial reports may include statistics concerning the following:

- records in the database;
- records created in, or added to, the database in a specific period of time, by different individuals;
- records generated originally (e.g., through original cataloging);
- records modified, updated, or deleted;
- records transferred or copied into the system;
- records for each individual author, translator, compiler, etc.;
- items borrowed and returned in each period of time, by each borrower;
- overdue items;
- etc.

All these types of reports are, in fact, prepared from the same records originally created for the purpose of retrieval. The system manager prepares and analyzes such reports to understand the general state of affairs or to learn about specific aspects. Decision making, planning, and the review of existing programs are the aims of report generation. In this way the system lends added value to information by using it in ways not predicted at the outset. Chemical Abstracts Service (CAS)(3), one of the most active and heavily used information systems in chemistry, is exemplary for its report-generating capability.

Conclusion

Based on what has been said in this paper, we can conclude that information is a vital product with the capacity to provide added value, and that as time passes the strategic importance of information increases. That is because of the characteristics and capacities of information and information systems.
In addition, the more an information system is used, the higher its value becomes. The matrix below shows the different functions of information systems and their capabilities. Achieving these capabilities depends on the management of the system establishing the relevant structure and a plan to implement it. A point worth mentioning is that, based on the matrix, most of the functions in the first column fall exclusively within the purview of library and information professionals. Library and information professionals have developed principles and processes for such activities; functions such as selection, acquisition, description, dissemination, and management are among them. Other functions, such as storage, integration, and search and retrieval, also fall within the realm of the library profession.

Matrix 1. Matching information system functions and capabilities producing added value. The matrix cross-tabulates seven system functions (selection and acquisition; description and organization; storage and processing; integration; search and retrieval; dissemination; management/housekeeping) against eight capabilities that produce added value (multi-purpose/shared use; simultaneous use; duplication/copying; transfer/exchange; modification/refinement; analysis/interpretation; synthesis/reproduction; knowledge production). Each function is marked against the capabilities it supports, with dissemination and management/housekeeping supporting the widest range.

(3) Chemical Abstracts Service (CAS) (www.cas.org/infopro/infoprovide.html) provides the following routine reports: journal reports, patent reports, journal and patent reports, substance reports with properties, substance tables, and substance tables with properties.

Under existing conditions, the computer science and networking professions have been able, through automation and the optimization of storage and retrieval methods, to enter this realm and take control of it. However, in instances where the knowledge and experience of librarians and information workers have supported the work of computer scientists in developing systems, a higher added value of information has become possible. When speed, accuracy, and relevance are essential to information systems, utilizing the expertise of computer scientists is unavoidable. Therefore, librarians and information scientists need to explore ways of interacting more closely with computer experts, and need to identify ways and methods of adding value to information systems. The report of a task force of the Special Libraries Association (quoted in Matarazzo, et al., 1987) stresses that information professionals must be prepared to prove the value of their services through one or more of the following approaches: 1) measuring time saved; 2) determining actual monetary savings or gains; or 3) providing qualitative, anecdotal evidence of value. Another important point LIS professionals can rely on is that as long as information is produced and consumed as a strategic commodity, and as long as access to relevant information requires selection, description, storage, organization, and dissemination, their professional services will continue to be indispensable to society. This, as discussed earlier, depends furthermore on their ability to identify strategies for adding value to information through professional education, research, and practice. Current needs, the pressures of time, and the scarcity of resources require us to enhance our knowledge of the ways in which we can create more value for information and information systems.
This, in turn, requires that education for library and information science be continually revised in this direction. If this proves to be the case, we will continue to perform as the leading profession with regard to the management of information and knowledge in society.

References:

Bothma, T.J.D. (1996), "Added value in electronic publications", in Raitt, D.I. and Jeapes, B. (Eds), Proceedings of the 20th International Online Meeting, London, 3-5 December 1996, Learned Information Europe, Oxford, pp. 459-470.
China Shanghai official website: www.Shanghai.Gov.cn/gb/shanghai/english/economy/
Fattahi, R. (1996), "Super records: an approach towards the description of works appearing in various manifestations", Library Review, Vol. 45 No. 4, pp. 19-29.
Fattahi, R. (1998), "AACR2 and catalogue production technology", in Weihs, J. (Ed.), Proceedings of the International Conference on the Principles and Future Development of AACR, Toronto, 23-25 October 1997, ALA, Canadian Library Association, Library Association, Chicago, pp. 17-43.
Fattahi, R. and Parirokh, M. (2002), "Restructuring the bibliographic record for better management, organization and representation of knowledge", in Lopez-Huertas, M.J. and Munoz-Fernandez, F.J. (Eds), Challenges in Knowledge Representation and Organization for the 21st Century: Integration of Knowledge across Boundaries; Proceedings of the Seventh International ISKO Conference, Granada, 10-13 July 2002, Ergon Verlag, Granada, pp. 107-112.
Griffiths, J. and King, D. (1993), Special Libraries: Increasing the Information Edge, Special Libraries Association, Washington, DC.
Hayes, R. (1997), "Economics of information", in Feather, J. and Sturges, P. (Eds), International Encyclopedia of Information and Library Science, Routledge, London and New York, pp. 116-129.
Keyes, A. (1995), "The value of the special library: review and analysis", Special Libraries, Vol. 86 No. 3, pp. 172-187.
Koenig, M. (1992), "The importance of information services for productivity: under-recognized and under-invested", Special Libraries, Vol. 83 No. 4, pp. 199-210.
McGee, J. and Prusak, L. (1993), Managing Information Strategically, The Ernst & Young Information Management Series, John Wiley & Sons, New York.
Macgregor, G. (2003), "Collection-level descriptions: metadata of the future?", Library Review, Vol. 52 No. 6, pp. 247-250.
Maguire, C. (1990), "An Australian perspective on the value of information", in Exon, F. and Smith, K. (Eds), National Think Tank on Library Statistics, Perth.
Matarazzo, J., et al. (1987), The President's Task Force on the Value of the Information Professional: Final Report, Special Libraries Association, Washington, DC.
Matarazzo, J. and Prusak, J. (1990), "Valuing corporate libraries: a senior management survey", Special Libraries, Vol. 81 No. 2, pp. 102-110.
The New Palgrave Dictionary of Economics and Law, edited by Peter Newman, Macmillan Reference, London, 1998.
Online Computer Library Center (1998), OCLC Annual Report, Online Computer Library Center, Dublin, OH.
Saracevic, T. and Kantor, P. (1997), "Studying the value of library and information services. II. Methodology and taxonomy", Journal of the American Society for Information Science, Vol. 48 No. 6, pp. 543-563.
Taylor, R. S. (1986), Value-Added Processes in Information Systems, Ablex, Norwood, NJ.
Volpe National Transportation Systems Center
(1998), "Value of information and information services", available at: www.fhwa.dot.gov/reports/viiscov.htm (accessed: 12 February 2005). Woldring, E. (2001). “Strategies to measure the value of special libraries”, Rivers of knowledge: 9th Specials, Health and Law Libraries Conference, Canberra, 26-29 August 2001, available at: http://conferences.alia.org.au/shllc2001/papers/woldring.html (accessed: 1 August 2005) Yee, M. and Layne, S. (1998), Improving Online Public Access Catalogs. Chicago (Illinois): American Library Association. http://www.fhwa.dot.gov/reports/viiscov.htm http://conferences.alia.org.au/shllc2001/papers/woldring.html work_ywgkyp5pczgzpbwdpx7vo7vjni ---- Microsoft Word - OTDCF_v23no2.doc by Norm Medeiros Associate Librarian of the College Haverford College Haverford, PA ERMS Implementation: Navigating the Wilderness ___________________________________________________________________________________________________ {A published version of this article appears in the 23:2 (2007) issue of OCLC Systems & Services.} “Tis easy to see, hard to foresee.” -- Benjamin Franklin ABSTRACT This article describes important considerations for commercial ERMS implementers. It identifies the value proposition in choosing to purchase an ERMS. The paper describes challenges common to all libraries, irrespective of commercial ERMS chosen. KEYWORDS electronic resource management ; electronic resource management systems ; ERMS implementation Electronic resource management is the area in which I’ve had to focus my attention the past several years. It’s been the cause of numerous headaches and other physical ailments. The mental consequences of this work have yet to be diagnosed, but they too I’m sure will prove chronic. Late last year, I was invited to talk to a group of librarians who were preparing to implement electronic resource management systems (ERMS). Having been down that bumpy road a couple of years earlier, the speaking invitation provided an opportunity to reflect on the challenges my colleagues and I endured as we migrated from a locally-developed ERMS to a commercial product. Given the hotness of this topic, I think there’s value in recounting these considerations. Becoming familiar with the ERMI specification The report1 of the Digital Library Federation (DLF) Electronic Resource Management Initiative (ERMI) is masterful. Rich and visionary, it accommodates an impressive array of functionality and data. Since commercial e-resource system vendors have used the ERMI specification as a roadmap on which to model their systems, it’s important for libraries implementing ERMS to understand the ERMI framework. This understanding should include the functional requirements -- Appendix A of the report -- that describe what an ERMS ought to do. Equally important is the data structure -- Appendix E of the report -- which defines the entities and their associated elements. A modest understanding of these concepts facilitates communication with ERMS vendors. __________________________________________________________________________________________ 1. Jewell, T., et al. Electronic Resource Management: Report of the DLF ERM Initiative (Washington, D.C.: Digital Library Federation, 2004) Importing data from existing systems/spreadsheets The value of understanding the ERMI data structure manifests during the process of migrating data into the ERMS. Administrative metadata within a container such as a locally-developed ERMS or a spreadsheet can be mapped to the ERMI data structure. 
This exercise will result in exact matches – one-to-one element correspondence with the ERMI data structure; partial matches – where a single element used locally is more discretely defined within the ERMI specification, or vice versa; and failed matches, where a local field has no ERMI equivalent. Working with your vendor to determine how and where to move elements from the latter two categories will minimize manual processing following the load. Assigning values Several dozen ERMI fields are defined to use value lists. Examples include resource type, license status, and pricing model. The ERMI report points to pre-existing value lists in some cases; in others, it provides recommended values. Nevertheless, it’s useful to develop institution-specific values ahead of implementation if possible. In working through this tedious task, you will ensure that data values are standard, as well as results and reports generated from these values. It would be beneficial for the e-resource community to share library-derived value lists in order to assist those currently in the throes of ERMS implementation. Staffing and workflows Understanding the functional and data needs of e-resource management activities, and who will do the work, are major components of ERMS implementation, but also areas in which it is difficult to feel confident. It may be worthwhile to inventory current practices as a way of identifying areas that can be improved through change, be that centralization or elimination of unnecessary steps. It is also important to determine whether e-resources will move through a dedicated e-only workflow, or be handled by the same set of staff that handles materials in traditional formats. Individual library circumstances will likely dictate the better approach, though I’d contend that insinuating the ERMS into the everyday work lives of the entire technical/electronic services staff is more challenging than an organizational structure that carves out a core set of staff who deals only with e-resources, and by extension, the ERMS. Workflow tracking within the ERMS Revamped workflows are only as strong as their ability to be performed in a timely manner. The great promise of electronic resource management systems, in my opinion, is workflow communication and tracking. Unlike purchasing a physical object such as a book that can be seen as it weaves its way through the various processes associated with it, the status of an electronic resource can be very hard to pinpoint. The ambitious use of ERMS as a communication tool, alerting staff when tasks need to be done or information needs to be disseminated, can fill a void in current e- resource management practice. Ticklers that email staff based on the date of some occurrence such as beginning of a trial, renewal reminder, or termination date serve a purpose, but are relatively unsophisticated compared to a workflow tracking mechanism predicated on the status of a resource, from the moment a decision is made to evaluate it through renewal or termination. Data propagation Several libraries have purchased e-resource management systems from vendors other than their ILS provider, warranting the need to build interoperability across platforms. Acquisitions data include elements where necessary redundancy between the ILS and ERMS occurs, and with the SUSHI protocol now a NISO draft standard, it’s even more crucial for ERMS implementers to find an automated means of moving acquisitions data into their systems. 
A subcommittee of the DLF ERMI Phase 2 steering group is currently investigating the feasibility of such interoperability. Their preliminary report is available at < http://www.haverford.edu/library/DLF_ERMI2/ACQ_ERMS_white_paper.pdf>. Summary remarks Libraries need sophisticated systems that facilitate communication and workflow, especially as the prospect of mainstream purchase of e-books grows closer. Moreover, as collection decisions are made consortially and the majority of resources purchased become electronic, libraries will require an ERMS to maintain effective control of these coveted, expensive resources. ERMS implementation strategies must include staff buy-in such that all involved recognize the importance of incorporating a new tool into their work. Library administrators and implementation managers need to foster a culture within their institutions where e-resource management in its many forms is seen as mission critical. In accomplishing this task, the toughest implementation challenge will be behind you. work_ywy3nnyfhbgjxe64qjqpeza43y ---- On-Line Bibliographic System Instruction TRUDI BELLARDO, GAIL KENNEDY, AND GRETCHEN TREMOULET A course in on-line bibliographic systems was introduced into the curriculum of the College of Library Science at the University of Kentucky. It was taught in five-week sections by three instructors who were practicing librarians and each an expert in one type of bibliographic network: OCLC, MEDLINE, or Lockheed DIALOG. Library space, equipment, and materials were utilized. The over-all goals of the course were to develop terminal skills and related proficien- cies and to instill a knowledge of the administrative considerations relative to various kinds of networks. Despite problems encountered related to class size, scheduling, theft of equipment, and supplementary readings, the students evaluated the course highly and the instructors felt it was an over-all success and worth repeating. WITH THE widespread availability of on-line bibliographic sys- tems (e.g., ORBIT, DIALOG, OCLC, MEDLINE, etc.), and in recogni- tion of the fact that such systems have clearly become a permanent part of the library profession, graduate programs of library education are cur- rently trying to decide how instruction relating to such systems can be most efficiently and effectively introduced into the curriculum. A number of different patterns for offering on-line instruction are available to the library educator, and a state-of-the-art survey of practice among library programs accredited by the American Library Association has recently been published by S. P. Harter. 1 The approach taken at the University of Kentucky College of Library Science is sufficiently different from those taken at other institutions to suggest that a discussion of this experience might be helpful to others introducing such an instructional Bellardo is Data Services Librarian, Kennedy is Acquisitions Librarian, both at the M. I. King Library, and Tremoulet is Search Analyst, Medical Center Library and Communication System, all at the University of Kentucky, Lexington, KY 40506. 21 JOURNAL OF EDUCATION FOR LIBRARIANSHIP component to their program of library education. Program Structure. Through 1975 student exposure to on-line biblio- graphic systems at the College of Library Science was essentially limited to classes which emphasized the conceptual-theoretical issues of computer- based information storage and retrieval. 
These classes were supplemented by system demonstrations which were provided by various elements of the University of Kentucky Libraries and the Medical Center Library. The primary goal of such instruction was to provide the students with a conceptual understanding of the systems and to develop what A. Kent refers to as system "literacy."2 It soon became apparent that this approach to on-line instruction was inadequate to the professional needs of the students. While the conceptual aspects of information storage and retrieval were adequately treated within the structure of a number of courses ranging from cataloging and classification to information storage and retrieval, the experiential or practical aspects of system operation were being given considerably less attention. In short, the students understood the structural nature of on-line systems but did not understand operational aspects or procedures.

The first step in expanding the on-line instructional program took the form of providing students with an opportunity for "hands-on" experience with ORBIT and DIALOG through an existing course entitled Information Storage and Retrieval Systems. The instructional component was designed jointly by the instructor of the course and the data services librarian at King Library. The experiential portion of the course was taught by the data services librarian and while it was judged a valuable unit of the course by the students, it was agreed that the approach fell short of providing students with a desired level of "operational functionality."3 In order to improve the students' operational performance it was decided to further expand the program during the 1977 spring semester by introducing a completely separate course called Computer-Based Bibliographic Networks. The course was designed to provide students with exposure to a range of existing on-line systems and consisted of three distinct instructional units: one unit devoted to a cooperative library network (OCLC), another to a subject-specific system (MEDLINE), and a third to a multiple-database system which would have application in general reference (DIALOG). While the course was to emphasize experiential learning, it was expected that the instructors would consider a number of administrative and operational issues related to the provision of on-line bibliographic services and computer-based information storage and retrieval as well.

In implementing the course the College of Library Science had essentially two alternative approaches which it could take: (1) it could develop an independent capability within the college, requiring it to acquire qualified instructors and the necessary hardware and facilities; or (2) it could rely upon instructors and facilities already existing within the university library system. Given the experiential nature of the intended course, the existence of qualified and experienced professional librarians who were willing to serve as instructors, and the relatively high cost of developing an independent capability, it was decided that the most cost-effective approach would be to build on existing staff and facilities. The professional librarians selected as instructors for the course each had extensive experience with one of the three on-line bibliographic systems included in the course.
The MEDLINE and DIALOG instructors were full-time search analysts, and the OCLC instructor had been instrumental in implementing OCLC in the university library system and had supervised the on-line cataloging section for two years.

The three-hour class was scheduled for one evening each week, taking advantage of the reduced rates available from MEDLINE and DIALOG and avoiding competition with normal library use of the available terminals. The students were divided into three groups of approximately 12 students each; each group studied one of the systems for five weeks and then rotated to another system.

A number of questions arose relative to the scope and complexity of the three separate sections. How much could the students absorb in each of the three short courses? What kinds of background and preparation would they - should they - bring to each section? Would their absorption level increase as they rotated from one section to another or would they suffer learning interference and confusion? From a survey of the students who enrolled in the course it was learned that virtually all had previously taken an introductory cataloging-classification course (the only prerequisite for the course); 17 per cent had taken, or were in the process of taking, one or more advanced cataloging courses; 69 per cent had taken a course in information science; 17 per cent were taking concurrently a course in information storage and retrieval systems; and 17 per cent had previously enrolled in a course dealing with library automation. Two of the students had some training on OCLC. In actuality this statistical data provided little help in answering the above questions, and the instructors had to be sensitive to cues from the students as the course progressed and to modify the pace and content of the course accordingly.

The syllabus for the course was actually three individual syllabi compiled independently by the instructors and then coordinated in order to ensure a degree of consistency and uniformity in style. The instructors agreed to give a written examination at the end of each five-week session and the final grade of a student represented a combination of the student's grades received in each of the individual units.

Each of the individual instructional units is discussed in the following sections. This discussion is followed by an over-all summary which highlights the major problems encountered in teaching the course and changes which have been made in the program as a result of this experience.

DIALOG Instructional Unit. The DIALOG system was chosen for training on-line bibliographic retrieval in general reference largely because its command language, unlike that of ORBIT, is very different from MEDLINE, and also because of the availability of C. P. Bourne's DIALOG Lab Workbook, which guided the "hands-on" work. The workbook was really too long and detailed for the five-week course but proved helpful nonetheless.4 The DIALOG retrieval system provides access to almost 60 databases (as of July, 1977), but only six were available with Lockheed's classroom instruction program, and Bourne's workbook concentrated on the ERIC database.
The three original objectives of the DIALOG section were: to develop skills in on-line procedures and protocols; to develop skills in question negotiation and search strategy formulation; and to provide an overview and history of the database industry, library administration of on-line reference services, and key concepts in automated information storage and retrieval.

The first objective was realized through on-line sessions during which each of the ten to twelve students present took a turn at the terminal to work through a few of the exercises at the end of the chapters in Bourne's workbook while the rest of the students watched. The students complained initially of being self-conscious in front of their classmates, but most of them gained confidence as the course progressed. This method was awkward in some respects, but it was the instructor's solution to the large class size, too few terminals, and the limited amount of searching funds. In addition, it enabled the class to cover many more exercises and to observe more on-line techniques than if the time had been divided up into short sessions with only a few students present. The instructor also spent an hour with each group demonstrating the other on-line systems available at the University of Kentucky. The demonstrations were done at the end of each five-week session, after the students were very familiar with DIALOG. Consequently they were much more perceptive to the points of comparison among the systems than a group that had had no previous training would have been.

The second objective of the section was only partially realized because of lack of time. A few class discussions covered search design, but negotiation skills and variations in search tactics were not explored very deeply.

The third objective was achieved largely through class discussions, outside readings, and a glossary of initialisms and retrieval jargon prepared by the instructor. The most helpful items on the reading list were D. M. Wax, A Handbook for the Introduction of On-line Bibliographic Search Services Into Academic Libraries and R. W. Christian, The Electronic Library: Bibliographic Data Bases 1975-76.5

A fourth objective, which evolved from the class discussions, was to increase the students' awareness of the current controversial issues that are being debated at conferences and in the literature. These topics included charging for on-line services, "price wars" and other kinds of competition among database vendors, the need for standards and standardization, the proper role of the federal government, etc.

In spite of the emphasis in class on achievement of the first objective, the students were graded mainly on the written test rather than on terminal skills or class participation. Without clear performance standards established in advance, it proved extremely difficult to assess these aspects fairly. In the future, fair performance criteria and evaluation measures need to be established that will help both student and teacher. The instructor felt that the section was most valuable for those students who had taken other information science courses, and would recommend that a course in information storage and retrieval systems be a prerequisite for this course.
This is not likely to be implemented soon, but other changes that will likely be made the next time the course is taught include increasing the importance of on-line performance as a grading factor; breaking the section into two groups, with each group spending shorter but more intensive terminal sessions; using videotape for analysis of the pre-search interview; and revamping the reading list to add a broader range of viewpoints on controversial issues.

OCLC Instructional Unit. The OCLC unit aimed at fulfilling a dual purpose. The first goal was to provide an introduction to cooperative automated library networking - history, purpose, and current status - emphasizing OCLC, but covering other regional networks as well. Secondly, it was hoped that a series of on-line exercises would enable students to develop some skill in the use of the OCLC 100 terminal and familiarity with the OCLC database.

Class time during each of the five-week units was divided between discussion and on-line practice. An assigned group of readings covering a fairly broad spectrum of network activities provided the bases for class discussions. The class addressed such topics as the reasons networks began and flourished; the impact of networks on libraries' technical and public services; the effects of networks on interlibrary cooperation; products and services of networks; and the relationship between OCLC and its affiliate networks. The discussions also included currently debated issues concerning network governance and administration, network system growth strategy, and libraries' utilization of emerging computer capabilities.

Hardware and software of the OCLC system were examined at a basic, non-technical level. Since a beginning cataloging course was a prerequisite for this course, database records and their use for cataloging purposes could be covered in some depth.

On-line exercises concentrated on building skill in three major terminal operations: database searching; editing of bibliographic records; and inputting original cataloging. Self-Instructional Introduction to the OCLC Model-100 Terminal by B. Juergens was selected as an introductory text for terminal operation and searching.6 After an in-class demonstration students completed the exercises in Juergens' manual on their own. For editing and inputting practice the SOLINET Terminal Training Manual,7 which combines audio cassettes and printed exercises, was used. On-line exercises were done independently by students in most cases. Problems and experiences were brought to class for discussion each week. The OCLC instructional videotapes produced at Kent State University were received late in the semester, in time for only the last unit. The latter two tapes of the four-tape series proved especially appropriate for the class in that they are designed for librarians and library science students and demonstrate some of the more complicated terminal operations.

In addition to terminal operations and discussions of topics relevant to networking, fixed and variable field tagging of records for input cataloging was also introduced. The final group of students went into tagging in some depth while the earlier groups received a more cursory introduction. The test was given during the fifth class period for each group and was composed of short answer and essay questions covering the readings and class discussions along with a brief quiz on terminal operations.
Some problems and needed changes in the OCLC unit were obvious as the course progressed and others were brought to light by student observations. An unexpected development that became apparent after the first rotation and intensified in the third unit was that each progressive group was initially more advanced than the previous one. Because the students were grouped by random selection and there was no concentration of expertise in any one group, this phenomenon (which was observed by all instructors) was no doubt attributable to basic system similarities and terminology learned in the first unit for each group. As a result, each succeeding group required less basic instruction and could cover more material in five weeks. Adjustments in class structure and methodology were required with each unit. These adjustments should be no problem when anticipated, but care should be taken to minimize the difference in instruction given to consecutive groups in the same semester.

Flexible availability of the OCLC system and the library's terminals made possible out-of-class on-line practice assignments. The outside assignments were initially made to purposefully free class meetings for group discussions intended to develop understanding of networks' goals and operations. However, these discussions wore thin when extended over three-hour class periods. In addition, the reading list, which formed the base for discussion, was occasionally redundant and could have been pared down with no real loss of content. A balance of discussion and on-line practice during the class period seemed to be more palatable than the excessively long discussions. When this approach was adopted during the last unit, the productive atmosphere during group terminal practice was immediately apparent. The students learned from each other and enjoyed attacking more difficult procedures together.

MEDLINE Instructional Unit. Four objectives were identified for the MEDLINE section. The first was to develop familiarity with on-line searching procedures; the second, to develop the ability to design search strategies; the third, to become acquainted with MeSH; and the last, to gain an overview of the use and usefulness of MEDLINE. Fulfillment of the first three was dependent largely on lecture material and terminal practice. The last was dependent upon the readings.

The content and structure of the unit was governed somewhat by the length and frequency of the class meetings. The five-week session was not long enough to cover both machine mechanics and vocabulary in great detail. A decision was made to emphasize machine mechanics (commands, format of entry, Boolean logic, etc.), since that was considered the primary objective of the unit. Thus the MeSH vocabulary, so important to the system, was introduced only to the extent that the students could understand the relation of the tree structure and subheadings to the alphabetic list.

The five class meetings were used for a lecture, on-line practice, and the test. The lecture, given during the first class meeting, introduced the alphabetic MeSH, the tree structure, MEDLINE, the backfiles, Boolean logic, some of the commands, and some techniques such as offsearch, offline printing, author searching, textword searching, etc. The lecture was heavily supported by transparencies. For the next three meetings the class was divided into four groups of three students each; each group had a 30-minute terminal session.
The students were given search questions to formulate before they arrived. These queries were designed to illustrate the techniques discussed in the lecture. On arrival they were given a "correct" formulation and each student executed a search while the other two watched. In addition they had some free time to practice various commands. The final examination covered both the lecture and reading assignments and consisted of short essay questions, search questions to formulate, and a long essay question. Grades were determined largely by test scores.

The reading assignment consisted of several parts. Nine journal articles were included which aimed largely at the extent of use of MEDLINE by various groups and countries and also at the mediated versus non-mediated controversy. The reading assignment also included copies of selected transparencies from the lecture and portions of the Midcontinental Regional Medical Library's A Guide to Using MeSH's Alphabetical List and Categorized Lists (Trees).8

One unique problem encountered was theft of the MEDLINE terminal. Before it could be replaced the first group of students had finished the MEDLINE unit. They were able to use borrowed equipment for only one evening. The loss pointed up the frustrating dependency of such a course on machines and computer systems. Theft, malfunction of the equipment, system down-time, and similar problems can cripple the progress of the class if alternative plans for instruction are not made in advance.

The restriction of terminal access proved to be another considerable problem. Because of budget limitations and terminal security considerations, the terminal was available to students only the evenings on which the class met. The students needed to have more terminal time in order to understand the system adequately. Since their time was so limited they were very strictly supervised and not really allowed to make their own errors. The restrictions on terminal usage also resulted in the decision not to use MEDLEARN, an on-line instructional package on the use of MEDLINE.

The students' lack of background in health science vocabulary created still another problem. Most of them had little, if any, experience in a health science setting and were thus unaware of the sorts of questions which arise. They handled their assigned search questions fairly well but their lack of background left them floundering when it came to using their free time. They needed quite a bit of supervision in order to make good use of it.

An improvement for the future will be to use the National Library of Medicine's On-line Services Reference Manual9 as an assigned reading. Initially it was thought to be too detailed and extensive for such a short and introductory course. By the end of the course, however, it had become evident that at least excerpts from this tool would have been very beneficial to the students despite their limited acquaintance with the system.

Student Evaluation. The students were given the opportunity to evaluate the course near the end of the semester. They were asked to rank various aspects of the course instruction on a one to five scale and also to add any specific or general comments. The amount of narrative response was significant and although there was a natural divergence on various issues, certain repeated points appeared to be valid assessments of the semester's experience. Recurring criticisms were:
1. The class size was unwieldy and should be reduced to maximize hands-on experience for all students.
2. Supplementary readings were somewhat redundant and not appropriate for an introductory-level course. They should be de-emphasized, with more time allotted for on-line experience.

Several suggestions were made about the course organization. A few students felt that each of the three course sections could be included in other existing courses. Others suggested that one of the three sections be dropped so that more time could be devoted to the remaining two. Still others recommended adding a fourth database, e.g., LEXIS, and giving students a choice of three out of four.

With regard to the numerical rankings, the students generally rated the course instructors slightly above average in such areas as organization, enthusiasm, self-confidence, and availability of instructor to students. The rankings were somewhat higher on the instructor's command of the subject and encouragement of participation and class discussion. Across the board the students ranked the over-all value of the course as high and in their comments reiterated the need for continuing this type of course.

Assessments and Alternatives. As has been stated, the course was experimental in structure, format, and content. A few similar courses have been reported in detail, but in all cases the course objectives and resources differed substantially from this effort.10 Symptomatic of some of the frustrations of the course were the suggestions that either the enrollment be reduced or the equipment for student use be increased. In the MEDLINE and DIALOG sections, the equipment needed to serve the information needs of library users was just not adequate to meet the needs of this course. The organization of the course was also an area of concern; probably only the students who wanted a 'splash in the face' were satisfied. Even if one isolated topic from each network had been examined, five weeks would not have been sufficient to provide in-depth treatment. Several students articulated their preference for more - or less - attention to particular systems.

Since the close of the semester several alternatives have been considered for the future. Most of these are ones suggested by the students themselves and each creates its own challenges. One suggestion is to include each of the three types of systems in a traditional course: the OCLC system in a cataloging or academic libraries course; DIALOG in a reference or bibliography course; MEDLINE in a medical librarianship course. This approach would not increase terminal time - would in fact probably decrease it - but would provide an opportunity to explore the relationship of the networks to library services and functions. However, the concept of teaching library networks by comparing different types of databases and system configurations would not be supported by burying the networks in separate courses. Another suggestion is to remove one of the systems from the course, include it in another course, and leave two systems to be taught in this one. The extra time could be used for more on-line training. The students who were frustrated with the brief exposure to each system would probably find this more satisfying. The problem here would be deciding which two to retain and which one to remove. Depending on the particular interests of students, any one of the three could be dropped.
A further suggestion is to go a step further and add a system, allowing students to choose three out of four. This arrangement would necessitate more elaborate scheduling, especially if the students' choices were not evenly balanced.

The suggestions to reorganize the course and explore other alternatives could be symptomatic of a problem that really cannot be solved by juggling sections. Modifying the course in any one way might satisfy the needs of one group of students but not another. More significantly, the systems chosen for the course are truly disparate. Not only do they have different purposes and operations, but they require different kinds of training to produce various kinds of proficiencies or skills. OCLC terminal operation is largely a mechanical, easily-learned function. The true intellectual effort is in the manipulation of the database for cataloging purposes and in the administration of the system in the library. DIALOG represents just one of several complex (and continually changing) commercially-available retrieval systems, any one of which is difficult to master, and which also requires skills in query negotiation and search strategy formulation. The mechanics of MEDLINE are on the surface deceptively simple; proficiency in the use of the system is difficult without extensive training in the use of MeSH and a health science background. The common thread of the units is tenuous and becomes even more so under close scrutiny. Perhaps the incongruity of the networks produces a degree of frustration in adjusting to the three within one semester. The problems inherent in the structure warrant close monitoring as the course is taught in the future.

For the immediate future more terminals will be used and the course will be taught in both the fall and spring semesters in order to fill student demand without increasing class size. The course as it stands is definitely filling a curriculum need and the students, while complaining about the details of execution, are generally extremely enthusiastic about the idea for the course. On-line systems themselves are in a state of flux and rapid expansion, with new features, capabilities and subsystems becoming available all the time. From both a curriculum and an instructional viewpoint, what is needed is to keep the over-all goals of the course in mind while maintaining a flexible and adaptable mode of implementation.

References

1. Harter, S. P.: Instruction Provided by Library Schools in Machine-Readable Bibliographic Data Bases. Proceedings of the ASIS Annual Meeting, 14: 49, 1977, and microfiche in pocket.
2. Kent, A.: Information Science. Journal of Education for Librarianship, 17: 131-139, Winter, 1977.
3. Ibid.
4. Bourne, C. P.: DIALOG Lab Workbook; Training Exercises for the Lockheed Information Retrieval Service. Berkeley: Institute of Library Research, University of California, 1976. A Brief Guide to DIALOG Searching, Palo Alto, Calif., Lockheed Information Systems, 1976, was also available to the students.
5. Wax, D. M.: A Handbook for the Introduction of On-line Bibliographic Search Services into Academic Libraries. Office of University Library Management Studies, Occasional Papers, 4, 1976; and Christian, R. W.: The Electronic Library: Bibliographic Data Bases 1975-76. White Plains, N.Y., Knowledge Industry Publications, 1975.
6. Juergens, B.: Self-Instructional Introduction to the OCLC Model-100 Terminal. Richardson, Texas, Amigos Bibliographic Council, 1976.
7. Thomas, K. A.: SOLINET Terminal Training Manual. Atlanta, Southeastern Library Network, 1976, parts 3-4.
8. Midcontinental Regional Medical Library. A Guide to Using MeSH's Alphabetical List and Categorized Lists (Trees). Omaha, Nebraska, Unpublished paper, 1975.
9. National Library of Medicine, Bibliographic Service Division, MEDLARS Management Section. On-line Services Reference Manual. Bethesda, Maryland, 1976.
10. These include: Rees, A. M., Holian, L., and Schaap, A.: An Experiment in Teaching MEDLINE. Bulletin of the Medical Library Association 64:176-202, April, 1976; Sewell, W.: Use of MEDLINE in a Medical Literature Course. Journal of Education for Librarianship 15:35-40, Summer, 1974; and Bourne, C. P., and Anderson, B. E.: Observation on the Use of the Lockheed DIALOG System for Laboratory Work in a Fall 1975 Course on Computer-Based Reference Services at the UCB School of Librarianship. Berkeley: Institute of Library Research, University of California, Unpublished papers, January 1976. A comprehensive summary of the key issues in training can be found in Williams, M. E.: Education and Training for Online Use of Data Bases. (Unpublished paper presented at the EUSIDIC Conference at Graz, Austria, Dec. 1, 1976).

work_z5se3tjpangm7nlnp56wrh2nni ---- EDITORIAL NOTES

Message from the Editor-in-Chief

Da-Wen Sun

Received: 23 November 2009 / Accepted: 23 November 2009 / Published online: 4 December 2009
© Springer Science+Business Media, LLC 2009

Dear reader,

Welcome to Volume 3 of Food and Bioprocess Technology (FABT). Since FABT was launched in 2008, it has made some good progress.

- Within a short period of just over one year, FABT was indexed by ISI Web of Science, i.e., Science Citation Index (SCI). Now FABT is abstracted/indexed by all the major databases, including Academic Search, AGRICOLA, Aquatic Sciences & Fisheries Abstracts, Biotechnology Abstracts, Biotechnology and Bio-Engineering Abstracts, CAB International, Chemical Abstracts Service (CAS), ChemWeb, Compendex, CSA Biological Sciences, Current Contents/Agriculture, Biology & Environmental Sciences, EMBiology, Food Science and Technology Abstracts, Global Health, Google Scholar, Journal Citation Reports/Science Edition, OCLC, Science Citation Index Expanded (SciSearch), SCOPUS, Summon by Serial Solutions, and VINITI (Russian Academy of Sciences).

- Manuscript submission to FABT is overwhelming. At the time of writing we have 194 papers published on Online First™, which are waiting to be assigned to regular issues. In order to accommodate this high demand for paper publication, from the current volume FABT is expanded from four issues (100 printed pages per issue) per year to six issues (150 printed pages per issue) per year.

- FABT will have its first Impact Factor in June 2010.

I should emphasize here that without your support, it would not have been possible for FABT to make such achievements. As the Editor-in-Chief of the journal, I would therefore like to take this opportunity to express my sincere thanks to our readers; in particular to our reviewers, for your critical comments on manuscripts, which have greatly contributed to maintaining the high standard of the journal; and to our authors, for publishing cutting-edge, high-quality original papers in the journal.
D.-W. Sun (*)
Biosystems Engineering, Agriculture & Food Science Centre, School of Agriculture, Food Science & Veterinary Medicine, University College Dublin, Belfield, Dublin 4, Ireland
e-mail: dawen.sun@ucd.ie
URL: www.ucd.ie/refrig
URL: www.ucd.ie/sun

Food Bioprocess Technol (2010) 3:1
DOI 10.1007/s11947-009-0304-x
work_zdeqeszzqrblfn75rervpoqu3m ---- PII: S1464-9055(00)00171-8

Library Collections, Acquisitions, & Technical Services, 2000, Volume 24, Issue 4, Pages 443-458. ISSN: 1464-9055
DOI: 10.1016/S1464-9055(00)00171-8
http://www.sciencedirect.com/science/journal/14649055
http://www.sciencedirect.com/science/article/B6VSH-41V35RT-1/2/a5dfdebbfefa3a28205d89b44c369039
© 2000 Elsevier Science Inc. All rights reserved.

Making the connection between processing and access: do cataloging decisions affect user access?

Ruey L. Rodman

Abstract

One function of a call number is to organize the library collection to promote browsability either on the shelf or in an online catalog. This study, based on research done at the Ohio State University Libraries, examines the impact on library collection organization if call numbers are not changed to fit into the shelf list sequence. The browsability of items was tracked by assessing how many screens away titles appear from like items in the online public access catalog if call numbers provided by a bibliographic utility were not changed. The study assesses whether not reviewing the call numbers affects patrons' ability to find the items.

1. Introduction

With the ever-increasing "information explosion," libraries face difficult decisions on purchasing, book processing, and space allocation. Libraries must seek ways to cut costs and increase efficiency. One area under continuous scrutiny is book cataloging. A study by Magda El-Sherbini and John Stalker entitled "A Study of Cutter Number Adjustment at the Ohio State University Libraries" is one example of research on this time-consuming process. Their study examines "existing copy cataloging procedures to assess whether it was feasible to eliminate the review and adjustment of cutter numbers in producing copy cataloging records. A change in this procedure might reduce processing costs and improve productivity" [1]. As a public service librarian at the Ohio State University (OSU) Libraries, this author questioned whether a change of this type would affect the accessibility of materials.
More specifically, this author wondered if the non-adjustment of cutter number or non-adherence to strict alphabetic order would affect the "browsability" of materials in an online public access catalog (OPAC). The study takes the research of El-Sherbini and Stalker one step further. It compares the shelf-listed call number with the non-shelf-listed call number provided on a catalog record, and then attempts to assess the effect on the display of the title in an OPAC.

1.1. Scope and definitions

Available cataloging copy in bibliographic utilities such as OCLC, Inc. (OCLC) or the Research Libraries Information Network (RLIN) has done much to increase the speed of processing a book, but processing units still look for ways to increase efficiency and production. Also, services such as PromptCat, developed by OCLC, or shelf-ready materials provided by vendors, support library processing units in their efforts to receive an item and get it to the shelf as quickly as possible. Can libraries accept the copy of the record provided by a bibliographic utility without reviewing the content of the record? This study examines one part of the record, the call number.

Classifying and cuttering, or the assignment of call numbers, is a primary activity in cataloging. In general, bibliographic classification is designed to organize materials in a chosen way. A call number is designed in parts using established symbols, including a class number (representing subject), one or two cutters (representing geographic, topical, or specific author), and a book number (representing the alphabetic scheme). Call number assignment is the most prominent method used in libraries to organize collections systematically according to the subject matter of each item. The call number file is called the shelf list because it is arranged in the order the items are found on the shelf. This file promotes browsability among items that are grouped together by subject through call number assignment.

Classifying materials to permit effective browsing became more crucial with the rise of open access to materials by library patrons. As Osborn relates, "The provision of self-service on the part of readers grew out of conditions that were encountered for the first time in history in the 1820's when in the British Museum some 200 readers a day presented requests for materials and subjects which were beyond the capacity of the librarian-as-a-living-catalog to fill, for example a request to see all of the library's holdings of materials printed in France during the French Revolution or a request for information on new discoveries around the world or new developments in all fields of science" [2].

For the purpose of this study the following definitions will be used:

1. Class number: a system of alphas and numerics used to keep like items together by subject, whether on the shelf, in a card catalog, or in an OPAC display. Part of the class number may be a cutter for subject, topic, or specific author.
2. Class number change: an adjustment made to the cutter.
3. Book number: the alphas and numerics used to order items by author or title within a class number.
4. Shelf listing: the process of adjusting the book number to fit an item into an existing sequence of materials.
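To make these parts concrete, the following Python sketch splits a simplified LC-style call number into its class and book number components, files a small shelf list, and measures how far apart the adjusted and unadjusted forms of the same title would display in a call number index. This is not the study's actual procedure (which counted lines in the live OPAC display); the call numbers are invented, the filing logic is deliberately naive (real LC filing rules are richer), and the 15-lines-per-screen figure is an assumption for illustration only.

```python
import re

def filing_key(call_number):
    """Very naive LC filing key: class letters, class number as a number,
    then each cutter with its digits read as a decimal fraction
    (so .R64 files before .R7)."""
    m = re.match(r"([A-Z]+)\s*(\d+(?:\.\d+)?)(.*)", call_number)
    letters, class_num, rest = m.group(1), float(m.group(2)), m.group(3)
    cutters = [(c, float("0." + d)) for c, d in re.findall(r"([A-Z])(\d+)", rest)]
    return (letters, class_num, cutters)

# An invented stretch of shelf list in class Z (library science).
shelf = ["Z678.9 .A4", "Z678.9 .R63", "Z678.9 .R64", "Z678.9 .S25"]

def display_position(call_number):
    """Index at which a new title would display in the sorted call number file."""
    return sorted(shelf + [call_number], key=filing_key).index(call_number)

# Shelf listing would adjust the book number to .R642 so it files after .R64;
# left unadjusted, the provided .R64 duplicates an existing number.
adjusted, unadjusted = "Z678.9 .R642", "Z678.9 .R64"
lines_apart = abs(display_position(adjusted) - display_position(unadjusted))
screens_apart = lines_apart // 15  # assuming roughly 15 call numbers per screen
print(lines_apart, "lines apart;", screens_apart, "screens apart")
```

Under this toy filing scheme the unadjusted number displays one line from its shelf-listed position, i.e., zero screens away; larger displacements of the same kind are what the study later counts in screens.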
Throughout the study these definitions are used to differentiate between subject organization (the class number portion) and alphabetic organization (the book number portion). The phrases "shelf listing" and "call number assignment" both refer to the act of adjusting the book number. By eliminating shelf listing, the call number functions more as a shelf position indicator and less as a means of keeping like items together. In many libraries the bar code, rather than the call number, is the unique number assigned to each item. With the use of bar codes it may no longer be critical to assign a unique call number to each item. Therefore, duplication of call numbers will be tracked as a possible factor that might have an impact on the library collection.

1.2. Literature survey

A review of the literature did not reveal any research addressing the impact on call number display in an OPAC if shelf listing is eliminated in the cataloging process. Besides the El-Sherbini and Stalker article, Massey and Malinconico have also contributed research on cutting processing costs by accepting call numbers from records provided by a bibliographic utility. Massey and Malinconico reach a similar conclusion to El-Sherbini and Stalker: "the results of this study indicate that local shelf listing is not a cost-effective operation for the University of Alabama libraries . . . The small number of errors detected produced a small amount of shelflist disorder and would, therefore, be expected to have a low impact on the browsability of the collections" [3]. The University of Alabama Libraries is no longer shelf listing call numbers on provided copy. Many years ago at the OSU Libraries, shelf listing was suspended in all classes with the exception of classes M, N, and P. These exceptions were made in order to maintain the established single cutter for musicians, artists, and literary authors.

This author examined other research areas that might influence a library to consider eliminating shelf listing as a part of book processing. The research can be categorized into three broad areas: 1) classification schemes in an online environment, 2) the quality of bibliographic records in the online databases of bibliographic utilities, and 3) catalog use studies and/or information seeking behavior studies.

1.3. Classification schemes in an online environment

There are many studies that discuss the use of classification schemes as a means of improving access to items in an online environment. Most of these reports concern the enhancement of classification schemes through direct linking of class numbers to subject index files. Broadbent [4] highlights the issues by exploring whether an online catalog can function both as a dictionary and a classified catalog without requiring additional time or intellectual effort on the part of the cataloger. Drabenstott [5] discusses the importance of incorporating a classification scheme into the retrieval protocols of an online catalog, to introduce a logical approach to subject searching and to increase the amount of information in subject indexes from the subjects in bibliographic records. There is also research being done on multiple class number assignments in bibliographic records in an online environment.
Huestis [6] describes the development of clusters of classification numbers in an index that are associated with bibliographic records and accessible in the online index-searching program. Past and present classification practices are summarized by Hill [7], who proposes that catalogers provide enhanced subject access through multiple classification numbers. Losee [8] examines the clustering of similar books provided by a classification system. He examines the relationship between the relative shelf location of a book and the books that users choose to circulate, to inform the design of classification and clustering systems supporting browsing in online public access catalogs and in full-text electronic storage and retrieval systems.

Classification schemes being used as independent online retrieval tools are also of interest. Cochrane and Markey [9] present research on data elements that have been enumerated for the purpose of constructing files of library classification records to assist in information retrieval. Williamson [10] addresses innovations in thesaurus design and standards to examine how classification structure will support information retrieval. Both of these articles conclude that an online classification index can aid in retrieval, although research into its design, users, and expected results still needs to be addressed.

The above research implies that current classification practices alone are not an effective tool for the retrieval of information, or are not used to the fullest advantage. In a 1986 survey of ARL libraries, seventy-seven (77) libraries were still maintaining a card shelf list file [11]. The reasons for doing this were that a true shelf list function was not available online, that parts of the collection needed retrospective conversion, and that better browsability functions were needed in online systems. As Chan states, "Classification holds great promise for augmenting effectiveness in online retrieval. While certain characteristics of classification prevent its being a totally reliable retrieval tool by itself, it can be a useful supplementary device" [12]. Gertrude Koh [13] supports Chan's statement in her research on a "subject tool box," or a combined system of subject headings and classification, which will meet user learning styles and vocabulary and assist in online "shelf browsing." By itself, classification may not be an effective retrieval protocol, but in combination with other search mechanisms it provides added value for user searching. However, non-adjustment of call numbers can be viewed as a possible development in the use of classification in online systems. If many libraries use standard call numbers provided on records in a bibliographic utility, the development of classification schemes and their uses as search tools may become more acceptable. The results may be applied to many libraries rather than one library at a time.

1.4. Quality of records in bibliographic utilities

The second area of research examined for this study concerns the accuracy of copy provided by bibliographic utilities. These studies include all fields in the provided record, of which the call number is but one element. In 1987 at the Mann Library of Cornell University, Janet McCue and others found that in an analysis of cataloging copy from the RLIN database, 57.44% of a total of 85.3 changes were modifications to the classification number.
The authors state, "The fact that one or more Mann catalogers changed the classification on 39 of 80 records (including 4 L[ibrary of] C[ongress records]) illustrates the latitude possible in determining classification" [14]. The authors do not define their use of the term classification, but one gets a sense from the content of the article that the term is applied to the class number portion of the call number. They recommend more in-depth training on choice and form of classification numbers by copy catalogers.

Shared cataloging as accepted or applied by local libraries is of great interest to the library community. In a study on the accuracy of LC copy, Arlene Taylor and Charles Simpson [15] also included classification as an access point worth consideration in their research. They found that there were 4.3% problems with call numbers in the Cataloging in Publication (CIP) sample and 5.5% problems with call numbers in the non-CIP sample in their study. The article does not present data on the types of problems found in the classification, but the problems are considered significant because classification is seen as a major access point.

There seems to be a general perception that classifying a document, or assigning just the class number portion, is a very individualized process. Thus, one classifier's subject analysis and classification assignment might be different from another classifier's for the same item. The inconsistencies of classification through subject analysis do point to another possible weakness in the sharing of call numbers without adjustment. Does class number assignment have an impact on a decision to accept call numbers without review? In an investigation of information retrieval from a classification of words used to group documents together, Jones states, "By this [a certain sort of classification] I mean a classification in which members of a class do not necessarily all share one or more common properties, and in which individual items can appear in more than one class. . . . This is a natural consequence of the fact that the documents in a collection, though they may be topically related, are not likely to be identical in both subject matter and vocabulary" [16].
In these areas there is a wealth of research. The following from R. Hyman‟s introduction in his Access to Library Collections sums up the issues involved. “An investigation of any aspect of the direct shelf approach involves one immediately in a central problem which ramifies [i.e., divides], often unexpectedly, into almost every major concern, theoretical and practical, of librarianship. Thus, one may easily become entangled in: selection and acquisition policy . . . ; the function of cataloging, particularly of subject heading, vis-à-vis classification; general versus special classification schemes; documentation as related to librarianship; the utility of mnemonic and of expressive notation; bibliothecal as against bibliographical classification; the differing interpretations of the browsing concept (and of browsability) for research and for non-scholarly library use; how to determine and store less- used or obsolescent materials; the divergent philosophies on the desirable extent of readers‟ services and reference assistance; the worth and form of independent study in the library; the suitability of the LC Classification (LC) or of the Dewey Decimal Classification (DC) for various types and sizes of libraries–an issue often complicated by concomitant problems of reclassification; the encroachments on direct access resulting from increased use of microforms and from possible mechanized information storage and retrieval; the proper educational, social, or scholarly functions of libraries. Nor is this by any means a full listing of the threatening entanglements” [18]. Even though this statement was written in 1972, it seems to hold true today. When studying the organizational structure of the library‟s collection, or the direct shelf approach, all of the library‟s parts or activities come under scrutiny. Use of card files or online files is usually the initial contact by a patron when beginning to seek materials on a subject or to look for a specific item. Making a change in just one of the available files could affect many aspects of how a library is organized and operates. Catalog use studies investigate not only how information is organized and retrieved, but also the schemes used to organize the information in the physical arrangement of the library as well as in online systems and their retrieval capabilities. A common conclusion reached in many reports is that when seeking information, users do not use the call number file as their initial search option. For the most part, users approach the search for information from a known item point of view (author or title search) or from a subject perspective [19–21]. After completing the search, they use call numbers to locate the item on the shelf. Patrons will then browse nearby items for other appropriate titles. They use call numbers as pointers to the physical item, and when they find the shelf area in the library they browse titles, not call numbers. Although the following statement by Thomas Mann is not from a user study, it does summarize another aspect of patron behavior that influences search strategies. Mann‟s Principle of Least Effort “states that most researchers (even „serious‟ scholars) will tend to choose easily available information sources, even when they are objectively of low quality, and further, will tend to be satisfied with whatever can be found easily in preference to pursuing high-quality sources whose use would require a greater expenditure of effort” [22]. 
In general, users want their information search to be quick, easy, usable, and limited in number of items retrieved. Another common thread in user study reports is the classification scheme itself and how it is manifested in the physical arrangement of items in the library. The classification of the store of human knowledge is indeed a very complex issue. As stated by Langridge, “In the bibliographic context, „classification‟ is commonly taken to imply „classification schemes‟. These represent the fullest use of classificatory methods, but the term „classification‟ by itself really means a way of thinking that pervades all subject work” [23]. A “way of thinking” is the crux of the issue facing libraries today. Each and every user may have his or her own way of approaching a search for information. It is difficult to assess how an “average user” thinks in order to design the best scheme for organization. The LC Classification schedules are very complex, and without some explanation, patrons may not be able to use them. The full call number is used by patrons to locate the item on the shelf, and only in its broadest sense (class number only) will “classification” assist the patron in browsing by linking like items together. There is much activity, study, and discussion in the area of classification research. Classification schemes or call number assignments are revised to meet the continuing changes in information, are examined in records found in bibliographic utilities, and are studied to determine their use by those searching for information. This study may raise more questions than it answers, but it is hoped that the research will shed some light on the non- adjustment of call numbers as a viable option for librarians to consider. 1.6. Research objectives and methodology From home, office, or in the library, the online catalog serves as the initial point for users to begin their search for information. The research questions examined by this study are: Is it necessary to adjust the book number to maintain alphabetic order of items within a class? If not, how does this affect the call number display in an OPAC? In other words, to what degree will the browsability of a collection in an online catalog change if call numbers are not shelf listed? The preponderance of literature describes the need for research and development in the use and application of classification systems and the need for more analysis of searching behaviors. No research has been done to show whether or not the suspension of concatenation or linking in a series affects the browsability for information in an OPAC. Libraries might be able to abandon strict alphabetic order for speedier, more efficient processing of materials if browsability in the call number file is not greatly affected. Data are collected on the call numbers in the bibliographic record. Data are also collected on the edited or shelf listed call number to compare with the original call number provided by a bibliographic utility. The impact on the browsability of an item or the effect on the display of the title in an OPAC is then assessed. The initial results will also be examined over three years to track any change in OPAC display position of the titles. The data for this study was taken from items copy cataloged during three months in 1992 at the OSU Libraries, which uses LC Classification. Data have been compiled on books that received copy cataloging using bibliographic records found in the OCLC Union Catalog. 
These bibliographic records were considered acceptable if they included an LC Classification number and subject entries. Records that did not have an LC call number were eliminated from the sample. Because this study is primarily concerned with the effect of shelf listing on the OPAC display of titles, no attempt has been made to ascertain the correctness of the class assignment, and it is assumed that the class number on the bibliographic record as found in OCLC was valid.

In order to provide a description of the overall sample as found in OCLC, the following data elements were tracked from the supplied copy:

1. cataloging input agency: LC or member institution
2. encoding level: blank (LC), I (member institution), 8 (CIP cataloging), and other (e.g., 5 for minimal-level cataloging)
3. bibliographic description: blank (non-ISBD), a (AACR2), I (ISBD form)
4. call number field tag: 050 (assigned by the LC), 090 (assigned by member institutions using the LC Classification scheme)

An analysis of the portions of the call numbers that were changed was also done to identify the types of changes made to the call number for shelf listing purposes. The categories used to track the call number changes were:

1. classification (includes topical, geographic, or author cutter)
2. book number (cutter used to alphabetize into the shelf list)
3. changes required for local practice (adding a date, adding a number one for English translation, etc.)
4. no change required

In addition to the above, it was noted whether an unchanged call number matched or duplicated a call number already in the call number file. It was also noted whether a changed call number was in literature (LC Classification P).

In summary, to assess the browsability of like items in the OPAC, three basic steps were used to analyze the sample: 1) a description of the type of copy used in book processing; 2) call number analysis to assess how many call numbers were changed; and 3) of the call numbers that were changed, a count of how many would have been one, two, or three or more screens away if not changed. The last step could only be done after step two, which eliminates those items in which the call number was not changed.

Approximately 250-300 items in all formats were cataloged daily at the OSU Libraries. Every tenth title was selected for the sample as representative of the approximately 12,000 to 15,000 items normally added to the collection every three months. The sample was selected according to the following conditions (sketched in code at the end of this section):

1. Only those items that were copy cataloged were used. Any items that were "originally cataloged" at the library were removed during the analysis of the overall sample.
2. Only monographs (including microforms) were used as source data.
3. Those items cataloged with a locally constructed call number (not LC Classification) were eliminated.

The source data used in this study were three years old when analysis began. Approximately 200,000-225,000 items had been added to the online catalog since the sample items were cataloged. By counting lines in the online catalog display for the unedited call numbers, an estimate of the effect on browsability in the OPAC over a period of three years could be made. The sample yielded a total of 1,130 titles.

The analysis began with a brief description of the type of copy provided. The fields chosen to describe the sample were the cataloging source field, the encoding level field, the description field, and the call number field tag.
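The following is a minimal Python sketch of the systematic sampling and eligibility screen described above. The record layout (dictionary keys) and function name are illustrative assumptions, not part of the study's actual workflow; only the selection logic mirrors the stated conditions.

```python
# Hypothetical record layout: each item is a dict with "origin",
# "format", and "call_number_tag" keys.

def draw_sample(cataloged_items):
    """Select every tenth title, then apply the study's eligibility screen."""
    sample = []
    for position, item in enumerate(cataloged_items, start=1):
        if position % 10 != 0:
            continue  # every tenth title only
        if item["origin"] == "original":
            continue  # condition 1: copy-cataloged items only
        if item["format"] != "monograph":
            continue  # condition 2: monographs (incl. microforms) only
        if item["call_number_tag"] not in ("050", "090"):
            continue  # condition 3: locally constructed call numbers excluded
        sample.append(item)
    return sample
```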
The definitions for these fields were taken from the document Bibliographic Formats and Standards (1993, FF:3-75. 079-83) issued by the OCLC Online Computer Library Center, Inc. In all cases the first part, or subfield "a," of the 040 field was used to identify the original source of the cataloging data. The LC supplied 753 titles (66.6%) and OCLC member institutions supplied 377 titles (33.4%).

The encoding level field was examined next. Encoding level indicates the degree of completeness of the machine-readable record. The LC, National Library of Medicine, British Library, National Library of Canada, National Library of Australia, and the National Series Data Program use blank and numeric codes in this field; member institutions use capital letter codes. Encoding level "blank" is defined as full-level cataloging; encoding level 8 is the code for prepublication-level cataloging or the Cataloging-in-Publication program (CIP); encoding level I indicates full-level cataloging input by OCLC member institutions. These codes were examined because they are indicative of full-level cataloging, which should include a complete call number. Other codes used in the field, e.g., 5 or M, usually indicate less than full-level cataloging; these are grouped together into a category titled "other." The results for the encoding level field are: blank = 511 titles (45.2%); I = 337 titles (29.8%); 8 = 243 titles (21.5%); and other = 39 titles (3.4%).

The description field indicates whether the item has been cataloged according to the provisions of the International Standard Bibliographic Description (ISBD). The three possible indicators for this field are: "blank," which indicates that the record is in non-ISBD form; "a," which indicates that the record is in AACR2 form (Anglo-American Cataloguing Rules, second edition); and "I," which indicates that the record is in ISBD form and is known to be a non-AACR2 record. The description codes are concerned with the bibliographic description of the record's content, and do not imply whether the choice and form of the headings used in the record follow AACR2 standards and rules. The results were: level a = 1,064 titles (94.2%); level blank = 38 titles (3.4%); and level I = 28 titles (2.5%).

Based on these three data elements (cataloging source, encoding level, and description), 66.6% of the copy used was provided by national libraries, and 75.0% was full-level cataloging. One-third of the sample (33.4%) was input by member institutions, of which 29.8% was input at full-level cataloging. Overall, 94.2% of the sample used in this study was input in AACR2 form; only 3.4% was less than full cataloging, and 5.9% of the records were in earlier forms of bibliographic description. To summarize, 96.5% of the sample (encoding levels blank, I, and 8) and 94.2% of the sample (description a) indicated usable, full-level available copy.

The call number field tag was the next element examined. Acceptable copy at the OSU Libraries is defined as having an LC call number in field tag 050 or 090. When neither of these tags was present, the record was tracked and defined as having other tags, e.g., 060, 070, 082, 092, which are not used by the OSU Libraries for cataloging. The field tag 050 is defined as a call number assigned by the LC, and the 090 field tag is defined as a call number based on the LC Classification schedules but assigned by an OCLC member institution. The results were that 1,065 titles (94.2%) had 050 and 090 tags, and 65 titles (5.8%) had other or no tags.
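As a rough illustration of this bucketing step, the sketch below assigns a record to the reported source, encoding level, description, and call number categories. The flat record layout is a hypothetical simplification; the code values themselves ("DLC" as the LC's MARC organization symbol in 040 $a, the encoding level and description codes, and the 050/090 tags) follow the definitions cited above.

```python
# Hypothetical flat record layout; only the category logic follows the text.
def describe_copy(record):
    """Bucket a bibliographic record the way the sample description does."""
    source = "LC" if record["040a"] == "DLC" else "member institution"
    level = {" ": "full (national library)",
             "I": "full (member institution)",
             "8": "CIP"}.get(record["encoding_level"], "other")
    description = {"a": "AACR2",
                   "I": "ISBD, non-AACR2"}.get(record["description"], "non-ISBD")
    acceptable_call_number = record["call_number_tag"] in ("050", "090")
    return source, level, description, acceptable_call_number
```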
The 65 titles in the "other or not present" category were eliminated from further analysis because this type of call number is always shelf listed and would not have been accepted without review. With the elimination of the 65 titles, the sample size was reduced to 1,065.

Another book processing requirement of the OSU Libraries is that the selected copy must have valid LC subject headings (650 field tag). This category does not affect the study except that items without valid subject entries would be forwarded to the original cataloging section; it was counted to determine how many items would have been removed from processing without review. Of the 93 titles without subject entries, 67 were classed in literature, which does not require subject analysis, leaving only 26 titles that truly lacked subject entries. None of these titles were eliminated from the sample at this point because they were processed using the call number found on the copy, although they all required expert attention to other fields before cataloging was complete.

The last category used in this study to define the sample answers the question: Does a record have original cataloging input by the OSU Libraries? This question is important because it means that no bibliographic record was available in OCLC; a cataloger at the OSU Libraries would catalog the item, including a shelf listed call number, before input to the OCLC Union Catalog. This study examined those items that were copy cataloged using a bibliographic record already in OCLC. Forty-five titles were found to be originally cataloged by the OSU Libraries.

To summarize, the initial sample was reduced by 65 titles that did not have an LC call number and by 45 more titles that were originally cataloged. The total sample size was now 1,020 titles. Of these, only the call number was examined further. The initial examination determined whether the call number on the bibliographic record was changed or whether it was accepted as found. Seven hundred ninety-six titles (78.0%) contained call numbers originally provided on the bibliographic record that were accepted without revision during the copy cataloging process. The sample size for further call number analysis was therefore reduced to 224 titles.

1.7. Results

Of these 224 titles, three categories were tracked to identify which part of the call number was changed. First, it was noted whether the class number, which includes the author, geographical, or topical cutter, had been changed; this change was counted first and as the only change even if other parts of that call number were also changed. Second, the book number, which alphabetizes the title into the collection, was examined; this category was counted as the only change if it was the only element changed in the call number. Third, changes due to local practice were counted as the one and only change provided that the class and book numbers were not changed. There were three local practices included in this study: 1) adding a number one to the book number to indicate English translation; 2) adding a cutter, Z8, to show literary criticism; and 3) adding a year to the call number. The results of the analysis of the parts of the changed call numbers can be seen in Table 1.

Table 1. Call number changes

By checking where the record would file if the call number had not been edited and comparing it to where the record would file with an edited call number, the "browsability," or how close together on the screen the two call numbers are, can be estimated.
The OPAC of OSU's Innovative Interfaces system displays eight call numbers on one screen. When a call number is input that does not match an existing call number, the input call number is displayed in the middle of the screen with four call numbers above and four below. For this study, call number lines were translated into OPAC screen displays as follows (restated in code at the end of this section):

1-4 lines equal the same screen;
5-12 lines equal one screen away;
13-20 lines equal two screens away;
21-28 lines equal three screens away;
29 or more lines equal four or more screens away.

The results of the OPAC search on the unchanged call numbers in relation to the shelf listed call numbers appear in Table 2. Note that the 187 items (83.5%) that were within twelve lines of the shelf listed call number would probably have been found or seen by patrons if they follow Mann's Principle of Least Effort. In essence, for these titles the change to the call number was relatively slight when position in the OPAC display was examined. This does leave 37 titles (16.6%) that are two or more screens away; for a user following the principle, this would result in a missed or failed search.

Table 2. OPAC display result for provided call number

In the OSU Libraries' OPAC system, the call number does not have to be unique when a record is added to the database; the unique number for each item is the barcode. It is technically possible to have two different items with the same call number and still retrieve them for circulation purposes. It is not known whether this would be confusing to patrons when seen in the OPAC display or on the shelf. Thus, an additional category was tracked to determine the percentage of duplicated call numbers if a call number was accepted without review, noting as well whether the titles were different or the same. Of the 224 titles, eight (3.6%) duplicated an existing call number. In six of the eight (75%), the titles were different, which means that the same call number was assigned to two different items. Of the remaining two (25%), one matched a call number input to this OPAC by another library; the other represented the second edition of a title and matched the call number used for the cataloged first edition.

Since approximately 25% of this collection is in the literature classifications, two additional categories of information about the changed call numbers were tracked:

1. whether the item is literature, and
2. whether the call number change was made to keep literary authors together.

Of the 71 changes made to the class number, 55 (77.5%) were classed in literature. If these adjustments had not been made to the call numbers, a "new" class number sequence would have been established for these authors, and the works of these authors would have been found in two shelf locations. The remaining 16 titles (22.5%) were not literature. Upon review of these titles, the author determined that the class number portion was changed because of a topical or geographical cutter; these changes were made to keep the same topics or geographical areas together in the same shelf location.

After the compilation of the results of the first search in December 1995, the author intended to do a time series projection based on the results and to check the OPAC displays two more times. However, when the OPAC displays were examined in March 1997 and May 1998, no change had occurred in the display positions of the 224 titles.
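The line-to-screen translation defined earlier in this section is mechanical enough to restate directly. The following is a minimal Python sketch of the mapping; the function name and return labels are ours, not the study's.

```python
def screens_away(line_offset: int) -> str:
    """Translate the line distance between an unedited call number and its
    shelf-listed position into the screen-distance categories of Table 2,
    assuming the eight-call-numbers-per-screen Innovative display."""
    if line_offset <= 4:
        return "same screen"
    if line_offset <= 12:
        return "one screen away"
    if line_offset <= 20:
        return "two screens away"
    if line_offset <= 28:
        return "three screens away"
    return "four or more screens away"

# Example: a title filing 17 lines from its shelf-listed position falls
# two screens away, i.e., outside the "easy to find" band used above.
print(screens_away(17))
```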
Either the size of the collection did not increase enough, or collecting in the subject areas of the 224 titles was not significant enough, to make any change in the OPAC display positions. Another possibility is that, since 1995, further unchanged call numbers would have compounded the out-of-sequence items. This aspect of the OPAC display results has not been tracked or factored into the results of this investigation.

2. Summary

Since the size of the library collection seems to have an effect on the OPAC displays, some overall projections might be made for one year of production against the size of the database. Of the original sample, 224 titles (21.9%) had a call number change. If these call numbers had not been changed, then 224 titles would not be in correct order in the OPAC display. However, 187 (83.5%) of the 224 titles appear on the same screen or one screen away and are considered easy to find if a search of the OPAC is done by call number. The remaining 37 titles (16.6%) would fall two or more screens away and are considered not easy to find. Note that 796 titles (78%) of the sample would fit perfectly into the collection without call number adjustment.

Based on these results, the following projections can be made for one year of production (restated as a worked calculation at the end of this section). There are approximately 45,200 monograph titles added to OSU's collection in one year. Of these titles, 43,256 (90.2%) could be processed because there was available, acceptable copy. There would be approximately 9,473 (21.9%) call number changes; if these call numbers were not changed, these titles would be out of order in the OPAC display. However, of the unchanged call numbers, 7,909 (83.5%) titles would be on the same screen or one screen away from the shelf listed call number. This leaves 1,572 (16.6%) titles with unchanged call numbers that would be two or more screens away.

The first OPAC search was done in 1995, three years after the sample titles had been processed. The estimated size of the database at that time was 2,865,000 titles. Following the line of reasoning above, after three years of production there would be 4,716 titles (0.16% of the entire database) out of sequence by two or more screens in the OPAC display. Using 0.16% as the percentage of out-of-sequence titles against yearly database growth, predictions can be made of the number of titles in the database that would be two or more screens away from a shelf listed call number. The above results do not take into account any compounding that may occur because of the out-of-sequence items; this study has not examined whether compounding is a significant factor in increasing the number of out-of-sequence items over time.

Literature titles are more of a problem when out of sequence. With literature, it is the class number that becomes the key element in accepting call numbers without review, and only a cursory review of literature titles was done in this investigation. Of the 224 titles searched in the OPAC, 55 (24.6%) were literature titles with changed call numbers. Of these fifty-five titles, 53 (96.4%) had a change made to the author cutter, which is the element used to keep the works of an author together on the shelf. Without this call number adjustment, the works of an author would be shelved in two or more locations.
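The one-year projection above is easy to restate as a worked calculation. The sketch below simply reproduces the study's own figures (it takes the reported 43,256 acceptable-copy count as given rather than recomputing it), truncating fractions as the reported counts appear to do.

```python
# One-year production projection, using the figures reported above.
acceptable_copy = 43_256                  # titles/year with acceptable copy
changed = int(acceptable_copy * 0.219)    # ~9,473 would need a call number change
nearby  = int(changed * 0.835)            # ~7,909 same screen or one screen away
far     = int(changed * 0.166)            # ~1,572 two or more screens away

# After three years of production, against the 1995 database size:
out_of_sequence = 3 * far                 # ~4,716 titles
database_size = 2_865_000
print(f"{out_of_sequence} titles = "
      f"{out_of_sequence / database_size:.2%} of the database")  # ~0.16%
```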
This investigation did not review the literature titles further, but it would be interesting to note how far from the established class number an unchanged literature call number would fall, not only in the OPAC but also on the shelf.

3. Conclusions

The research question asked in this study is: To what degree will the browsability of a collection in an OPAC change if call numbers are not shelf listed? The results indicate that for this library's collection, after three years, 0.16% of total titles cataloged without call number review may not be easily found in the OPAC. This is not a large percentage, and therefore non-review of call numbers in cataloging would seem to be an acceptable decision for cutting costs and increasing productivity.

There are, however, serious questions raised by this study that have not been answered, and more research is recommended. This research was limited to a call number search and the display results of titles in an OPAC. The decision on what would be "findable" was based on readings about user retrieval preferences: patrons do not like to retrieve too many titles for review, and they prefer a known-item or subject approach, so it is assumed that all titles would be retrieved by such a search protocol no matter what the call number assignment.

An important constraint of this research is that the OPAC results were not translated to the actual shelf positions in the library. This decision was based on the assumption that in this digital age, the user will search online and then use the call number to find the item on the shelf. Accepting call numbers without review may have one result in the OPAC display and an entirely different result when the actual shelf position of the item is examined. Assume that a user selects an item from a search of the OPAC, jots down the call number, and goes to the shelf to retrieve the item. The item selected is one whose call number was not changed during processing and is found five shelf ranges away from like items in the collection. Would the patron be satisfied with this result? Would the patron realize that more items exist but are not shelved in close proximity to the selected title? How does screen display position translate to actual shelf position? How are the items actually shelved in each library? In this OPAC, the call number sequence display is continuous no matter what the format or material type; if a library shelves formats separately, e.g., monographs in one area and serials in another, a shelf position examination might have very different results.

By accepting call numbers as found on copy without review, how many classification sequences would actually be established for a given topic? It was established in the review of available copy that 66.6% of the records used were provided by the LC and 33.4% were provided by member institutions. If a call number input by the LC for a topic has a cutter of R66 and is accepted without review, and the library had already established this topic as R6, the result is that two sequences have been established for one topic. It is assumed that the LC class assignment will remain consistent; if member institution call numbers are accepted without review for the same topic, yet another cutter might be established. A library collection could thus contain quite a few class sequences for items that are traditionally classed together. This could be a problem not only in the browsing of the OPAC, but also in the browsing of the shelves.
This leads one to question the extent to which a library's processing/maintenance policy extends to the re-cataloging of items to keep them together. Classification schemes by their very nature are under constant revision to codify new information and new research areas, or to change already established class number notations. Do libraries go back and adjust the class numbers of items if a change has been made to the scheme? It is assumed that they do not, because of limited resources. If they do not re-catalog because of schematic changes, would it be necessary to re-catalog items that are out of order because of processing choices?

The above brief discussion does not include all of the issues associated with this study. However, the study shows that approximately 78% of the copy cataloged items in the sample group fit into this library's collection without needing any call number adjustment. It also shows that 21.9% of processed items required a call number adjustment; for 83.5% of these, the adjustment was so slight that the unchanged call number would have fallen on the same screen or the next screen in the OPAC display. This leaves 16.6% of the adjusted items out of sequence by two or more screens, which, when factored into the entire collection, results in 0.16% of total titles not easily found in the OPAC. Taken by themselves, these statistics seem to make the proposition of processing items without call number review somewhat attractive; when translated to the collection's physical arrangement, however, it may become less attractive. This author proposes that the size of the library collection does make a difference; a similar study on a small library collection would make an interesting comparison.

In the virtual world, should the core method of systematic classification that organizes our collections be suspended? As LeBlanc states, "Will the access potential of the virtual library prove healthily cornucopian, or will the browsability of this new informational format permit the retrieval of only so much fodder from the cybernetic trough - enough to sustain users, but not enough to satisfy them" [24]? The author hopes that this examination of call number assignment, and how it might be applied or not applied in processing, provides some new ideas or insights.

References

[1] El-Sherbini M, Stalker JC. A study of cutter number adjustments at the Ohio State University Libraries. Library Resources & Technical Services, 40 (October 1996):320.
[2] Osborn AD. From Cutter and Dewey to Mortimer Taube and beyond: a complete century of change in cataloguing and classification. Cataloging & Classification Quarterly, 12 (1991):36.
[3] Massey SA, Malinconico SM. Cutting cataloging costs: accepting LC classification call numbers from OCLC cataloging copy. Library Resources & Technical Services, 41 (January 1997):38.
[4] Broadbent E. The online catalog: dictionary, classified, or both? Cataloging & Classification Quarterly, 12 (1991):108.
[5] Drabenstott KM, et al. Analysis of a bibliographic database enhanced with a library classification. Library Resources & Technical Services, 34 (April 1990):179.
[6] Hill JS. Online classification number access: some practical considerations. The Journal of Academic Librarianship, 10 (March 1984):21.
[7] Huestis JC. Clustering LC classification numbers in an online catalog for improved browsability. Information Technology and Libraries, 7 (December 1988):383.
[8] Losee RM. The relative shelf location of circulated books: a study of classification, users, and browsing. Library Resources & Technical Services, 37 (April 1993):197-8.
[9] Cochrane PA, Markey K. Preparing for the use of classification in online cataloging systems and in online catalogs. Information Technology and Libraries, 4 (June 1985):108-9.
[10] Williamson NJ. The role of classification in online systems. Cataloging & Classification Quarterly, 10 (1989):99.
[11] Epple M, Ginder B. Online catalogs and shelflist files: a survey of ARL libraries. Information Technology and Libraries, 6 (December 1987):294.
[12] Chan LM. Library of Congress class numbers in online catalog searching. RQ, 28 (Summer 1989):536.
[13] Koh GS. Options in classification available through modern technology. Cataloging & Classification Quarterly, 19 (1995):196.
[14] McCue J, Weiss PJ, Wilson M. An analysis of cataloging copy: Library of Congress vs. selected RLIN members. Library Resources & Technical Services, 35 (1991):73.
[15] Taylor AG, Simpson CW. Accuracy of LC copy: a comparison between copy that began as CIP and other LC cataloging. Library Resources & Technical Services, 30 (October/December 1986):377.
[16] Jones KS. Some thoughts on classification for retrieval. Journal of Documentation, 26 (June 1970):91.
[17] Davis CC. Results of a survey on record quality in the OCLC database. Technical Services Quarterly, 7 (1989):44.
[18] Hyman RJ. Access to Library Collections: An Inquiry into the Validity of the Direct Shelf Approach, with Special Reference to Browsing. Metuchen, NJ: Scarecrow Press, 1972, p. 2.
[19] Wallace PM. How do patrons search the online catalog when no one's looking? Transaction log analysis and implications for bibliographic instruction and system design. RQ, 33 (1993):239.
[20] Tagliacozzo R, Rosenberg L, Kochen M. Access and recognition: from user's data to catalog entries. Journal of Documentation, 26 (September 1970):248.
[21] Hancock M. Subject searching behavior at the library catalog and at the shelves: implications for online interactive catalogues. Journal of Documentation, 43 (December 1987):306.
[22] Mann T. Library Research Models: A Guide to Classification, Cataloging, and Computers. New York: Oxford, 1993, p. 91.
[23] Langridge DW. Classification: Its Kinds, Elements, Systems, and Applications. London: Bowker-Saur, 1992.
[24] LeBlanc J. Classification and shelflisting as value added: some remarks on the relative worth and price of predictability, serendipity, and depth of access. Library Resources & Technical Services, 39 (July 1995):302.
work_zm2dhejhrbdd7hw3u3vro6rbeq ---- From Tech Services to Leadership

The Serials Librarian, 2008, Vol. 54, No. 1-2, pp. 127-134. doi:10.1080/03615260801973935. ISSN: 0361-526X (Print), 1541-1095 (Online). © 2008 The Haworth Press.

From Tech Services to Leadership

Anne McKee, Moderator
Joyce Ogburn, Carol Pitts Diedrichs, Karen Calhoun, Panelists
Sarah E. Morris, Recorder

ABSTRACT. The panel at this strategy session was composed of three distinguished women from technical services who currently hold leadership positions. The session topic was divided into two sections: 1) What is leadership?; and 2) How can technical services help you be a leader? Each panelist spoke to the first topic and then reversed order to explore the second. The panelists all spoke to a definition of leadership based on the personal, social, and professional skills possessed by leaders. The panel concurred that technical services staff are uniquely prepared for leadership, in part because the personal, social, and professional skills discussed in the first half of the session are prevalent in technical services departments.

INTRODUCTION

If the full crowd at NASIG 2007 was any indication, technical services and leadership remains a timely topic. The panel at this strategy session was composed of three distinguished women from technical services who currently hold leadership positions: Joyce Ogburn, University Librarian at the University of Utah; Carol Pitts Diedrichs, Dean of Libraries and William T. Young Endowed Chair at the University of Kentucky; and Karen Calhoun, Vice President, OCLC WorldCat and Metadata Services at OCLC Online Computer Library Center. The moderator was Anne McKee, Program Officer at the Greater Western Library Alliance. The session topic was divided into two sections: 1) What is leadership?; and 2) How can technical services help you be a leader? Each panelist spoke to the first topic and then reversed order to explore the second. This report will cover both halves, focusing first on the common threads discussed by all three panelists and concluding with the question and answer period.

PART ONE: WHAT IS LEADERSHIP?

The panelists all spoke to a definition of leadership based on the personal, social, and professional skills possessed by leaders. Their list of personal principles was not surprising: courage, honesty, integrity, fairness, tact, patience, passion, energy, optimism, and flexibility. Carol Pitts Diedrichs emphasized the works of Daniel Goleman, an internationally known psychologist, on the importance of emotional intelligence in the workplace. Having the emotional intelligence skills to be both self-aware and self-regulated, and personal emotional skills such as being emotionally stable or "centered," is important.
Additional personal skills needed by leaders focused more on their relationship to their work: to have clarity of vision, perseverance, a restlessness with the status quo, an orientation toward the future, a pursuit of opportunities, lots of questions, and a need to do your best. None of the panelists left out the importance of having a sense of humor and a desire to have fun while pursuing your dreams.

The personal skills developed by practiced leaders are important, but the panelists focused just as much on these elements as on the way leaders relate to others and, most especially, to those who follow them. Joyce Ogburn noted in particular that leadership involves both the process and the people. The social skills the panelists noted included being a good listener, being respectful, having and showing empathy, learning the art of persuasion, being able to find common ground among varied parties, and having influence as opposed to power. Because leadership depends on followers, the panelists emphasized working in a team and remembering that no one is a leader by him or herself.

The panelists also spoke to the leadership characteristics displayed professionally, stemming from personal and social skills that combine in the workplace to produce particular traits. These included being able to take responsibility (for yourself and your work), to manage conflict rather than avoid it, to have a strong commitment to your organization, to build coalitions, and to stay informed and connected.

Joyce focused on what stops people from being leaders. To go from "I can't" or "This won't work" to "I can do it," Joyce recommended starting small. For example, consider the future. Rather than be afraid of it, think about where you would really like to be. While initially it may look too big to handle, breaking the process down into manageable parts can make it easy to get started and really make a difference in achieving your goals. Look at where you are and what you can do, even if it's a small thing, and you will be amazed at how quickly you can move forward.

Carol focused on the social aspect of leadership, those other people so necessary to its practice. Carol's advice was to remember that "it's not all about me." Focusing on those who are being led is difficult because you need to listen to all the voices, especially the dissident ones, while remembering to be civil and respectful. This means you may have to adjust your style to deal with a myriad of personalities. While doing so, you must take the feelings of your followers into account, which requires more effort. The more you lead, Carol noted, the more you will appreciate great followers. She offered some tips to bring followers together, including working with the assumption that nothing important is done alone (no leader is an island).

Using the concept of the movie Pay It Forward, Karen Calhoun reminded us that leadership is thinking about what will make the world better and what we can do about it. Building on the inspiration of Elizabeth Cady Stanton and civil rights leaders of the 20th century, Karen emphasized taking small steps and persevering through the messy process of leadership. Once you are willing to be the first one to go forward, you have to communicate your vision and seek to influence others, not to command them. Expanding on Carol's comments, Karen emphasized having respect for others: listening to and understanding the other person's perspective, and knowing what they want.
Karen also discussed Ned Herrman's Brain Dominance model, noting that technical services staff members tend to self-select for the limbic-left model (organizers), but that we should instead seek to use our whole brain and incorporate the other types (theorists, humanists, and innovators), regardless of our preferences and tendencies, in order to be successful leaders.

PART TWO: WHAT IS TECHNICAL SERVICES' CONTRIBUTION TO LEADERSHIP?

In the second half of the strategy session, the panelists turned to the topic of how experience in technical services can help you be a leader. The panel concurred that technical services staff are uniquely prepared for leadership for a variety of reasons: the personal, social, and professional skills discussed by the panel in the first half of the session are prevalent in technical services departments.

Karen started this section by remarking that technical services departments have been so squeezed for staff and resources over the last twenty years that necessity has become the mother of invention. All the panelists agreed that such an environment prepared them well by reinforcing leadership skills. By managing people and resources, adhering to a code of ethics and accountability standards, applying new technologies, working as part of a team, dealing with constant change, and serving customers—all while cultivating an environment ripe for innovation—technical services staff members hone their leadership skills on a daily basis at all levels, not just that of senior management. In stressing the importance of the people you work with, Karen commented that people are the softest of software and that social skills are essential to ensure that the work is done. The interesting part, she noted, is that technical services departments tend to draw the most interesting personalities, creating an excellent opportunity to apply the whole brain model.

Carol remarked that technical services enables you to become a leader because the environment allows you to pick up the skills faster. Especially for those with acquisitions experience, you really "get" budgets and have a practiced ability to make the money work out. Aside from the money, technical services departments understand serving internal customers and dealing with happy and unhappy people. Carol stressed the importance of practicing and developing these skills through organized training, although such opportunities must often be sought out.

To close the second half of the presentation, Joyce gave some concrete examples of leadership from technical services. On the cataloging side, there is the MARC record, early library automation efforts such as the old OCLC Beehive terminals, online catalogs, and, of course, OCLC. In terms of technology, technical services has been automated for many years, is accustomed to using online integrated library systems, and has more recently added electronic resource management systems, digitization, and digital library development to its portfolio. Technical services has been at the forefront of standards development and adoption, as evidenced by EDI (electronic data interchange), DOI (digital object identifier), MARC 21, the Serial Item and Contribution Identifier (NISO Z39.56), the holdings statement standard (NISO Z39.71), and many others. Technical services has also shown professional leadership in metadata, process improvement and efficiency, dealing with the serials pricing crisis, changes in scholarly communication, and essential disaster planning and recovery.
Finally, Joyce noted that the hallmark of technical services is collaboration—sharing resources and simply "sharing the load."

In each of their final remarks, the panelists noted some "tips of the trade" for leaders. Karen noted that patience is important to try to learn, even though leaders tend to be impatient people by nature. She reminded the audience that everyone can be a leader; the most important thing to do, though, is "pay it forward."

Carol offered three specific pieces of advice. First, as you rise higher in an organization you will have fewer peers inside your organization, so it is important to cultivate peers outside of your organization to serve as mentors, sounding boards, or objective listeners. Second, mentors, both formal and informal, are crucial, and may reappear over the course of your career. Lastly, while transparency and open communication are important, sometimes you have to keep information confidential. Carol suggested saying "I can't talk about that (yet)" in those situations so that peers and followers will not feel as though you are deliberately withholding important information.

Joyce suggested keeping a "feel good file" where you can save a record of positive feedback you receive that may inspire or reinvigorate you and help you survive difficult periods in your career. Technical services tends to receive fewer thanks or positive comments than a more visible public services department, and there may be even less acknowledgement as you rise to the top of an organization, when the appreciation appropriately goes to those on the front line who are doing the work. Reminding us to lead from the middle, Joyce cautioned against running too far ahead and losing those behind you, and against leading from the rear while nipping at the heels of those in front of you. She also noted that a "can do" attitude really pays off. In terms of professional development, she recommended reading contemporary business literature to glean leadership lessons from great business leaders, but also exploring other aspects of leadership, such as leadership for the common good. A list of the panel's recommended reading is included at the end of this report.

Question & Answer Section

To close out the session, the panelists held a question and answer section, fielding as many questions from their lively audience as time allowed.

The first question asked how to successfully build coalitions, as the audience member had had a bad experience doing so. The panel's consensus was that you need to start by building relationships. You need to get out of your office so that you know the players and they know you, because once those relationships are built the collaboration comes more easily. They suggested first approaching those who are likely to agree with you, which will build momentum for others to follow. However, they also pointed out that some people will never buy in, which is okay.

The second question was whether we are preparing staff members to lead in our departments and organizations—whether we are building success in the institution itself. Sometimes promotions come to those who are best at a particular task, not necessarily to those who are the best leaders. The presenters noted that the Association of Research Libraries (ARL) has many leadership development programs, as do other institutions, and they recommended attending as many formal programs as interest, aptitude, time, and money allow.
The next questioner observed that many senior managers seem to spend an inordinate amount of time on work, and asked for advice on how to achieve a work/life balance. The panel agreed that sacrifices need to be made, especially at the top level, but you make your own choices about what sacrifices you are willing to make; as your life and your priorities change, the blend of work/life balance changes, too. Balance is a very personal issue that will be with you throughout your career. Because leadership can happen at every level, you do not always have to be at the top to be a leader, and it may be easier to achieve the balance you desire from a lower position in the organization. The panel suggested consciously developing the leadership skills of more staff to relieve some of the pressure of being at the top of the organization chart. Carol also made a plea to everyone to let your Dean have some balance, too!

The fourth question dealt with generational differences, a topic that came up in several other sessions during the conference. All of the panelists agreed that generational differences are very situational. You need to be aware of and appreciate what you don't know and how American cultural mores impact the workplace. Libraries often have a very international workforce with different ideas about how the workplace should work, and sometimes other issues can masquerade as generational differences. The panel also reminded the audience that you are not your mentor, and your perspective will change as your stage in life changes. In terms of leadership, they suspect that a new definition of leadership will come as a new generation moves into the top positions.

With only a few minutes left in the session, the last audience question was how the panelists had learned to accept help from support staff, since secretaries and administrative assistants in technical services are a rare commodity. The panelists all had varying experiences with and without assigned support staff, but noted that you really will live or die by support staff. Joyce noted that it is a process of learning how to use help, and how much they really can help you, but in the end it's worth it. Going back to those original personal skills mentioned earlier, flexibility is pivotal because it's a give-and-take situation and delegation is key. You need to have time to do your job, and you need to let your support person help you. Carol recommended using technology, such as synchronized PDAs, as a tool, and asking for the tools you need to do your job.

From Tech Services to Leadership was a well-attended session that sparked discussion in other sessions during the rest of the conference, a testament to the quality and timeliness of its advice. No doubt, future NASIG leaders will reference this presentation as a catalyst for their success.

RECOMMENDED READING

1. Block, P. (2000). Flawless Consulting: A Guide to Getting Your Expertise Used. San Francisco: Jossey-Bass/Pfeiffer.
2. Bridges, W. (1991). Managing Transitions: Making the Most of Change. Reading, MA: Addison-Wesley.
3. Crosby, B. C., & Bryson, J. M. (2004). Leadership for the common good. In G. R. Goethals, G. J. Sorenson, & J. M. Burns (Eds.), Encyclopedia of Leadership. Thousand Oaks, CA: Sage.
4. Diedrichs, C. P. (1998). Rethinking and transforming acquisitions: The acquisitions librarian's perspective. Library Resources & Technical Services, 42(2), 113-125.
5. Goleman, D. (1995). Emotional Intelligence. New York: Bantam Books.
6. Goleman, D., Boyatzis, R. E., & McKee, A. (2002). Primal Leadership: Realizing the Power of Emotional Intelligence. Boston, MA: Harvard Business School Press.
7. Goethals, G. R., Sorenson, G., & Burns, J. M. (Eds.). (2004). Encyclopedia of Leadership. Thousand Oaks, CA: Sage Publications.
8. Gregor, D., & Mandel, C. (1991). Cataloging must change! Library Journal, 116(6), 42-47.
9. Herrman, N. (1989). The Creative Brain. Lake Lure, NC: Brain Books.
10. Lakos, A., & Phipps, S. (2004). Creating a culture of assessment: A catalyst for organizational change. portal: Libraries and the Academy, 4(3), 345-361.
11. Mitchell, M. (Ed.). (2007). Library Workflow Redesign: Six Case Studies. Washington, DC: Council on Library and Information Resources.
12. Patterson, K. (2002). Crucial Conversations: Tools for Talking When Stakes Are High. New York: McGraw-Hill.
13. Von Oech, R. (1983). A Whack on the Side of the Head: How to Unlock Your Mind for Innovation. New York: Warner Books.

work_zrphq7smmrdadpj55qfjurv5au ---- EDITORIAL

Editorial

Mick Wycoff

Published online: 21 November 2009
© Springer Science + Business Media, LLC 2009

Here we are at the end of 2009 already. This final issue of Evolution: Education and Outreach, volume 2, closes our celebration of the Darwin Bicentennial and our full second year of publishing. As Stephen Jay Gould wrote on the flyleaf when he gave his friend and collaborator Niles Eldredge a copy of his last book, The Structure of Evolutionary Theory, "what a ride we've had!"

Our journal was born at an intense planning session in New Orleans. The winter of 2007 was just ending, and the flood-torn city was still a wreck. A quorum of our editors and Board of Directors, headed up by prime mover and Executive Editor Andrea Macaluso, convened at Springer's invitation to invent a new kind of journal. We didn't just want to publish about evolution: we wanted to teach it. We knew that teachers needed clear, accurate sources dealing with evolution; they needed to know more about what works best to prepare high school students for college—and what doesn't; they needed models for teaching evolution in elementary school; they needed information, encouragement—respect, even. But could we reach them? And who could we find to speak to them? Were our hopes for teaching the great lessons of evolution any more promising than the future of the ruined city we visited?

We quickly found that we had the advantage in a fresh start and new ground to build on. Like a wise contractor, our publisher, Springer, has given us the tools and support we called for to get the job done. That includes going over budget to double the page count for Kristin Jenkins' weighty special issue, "Teaching Evolution" (EEO, Vol. 2, Issue 3, 2009). And our stellar board of teachers and scientists has reliably shown up for the publishing equivalent of a barn-raising: lots of hard work for no pay but the vast satisfaction of helping to erect a structure serving the needs of the community.

By now, we have succeeded so far beyond our hopes that our (recently promoted) Editorial Director Ms. Macaluso assures us we are "the hottest new evolution journal on the planet." In a rugged and competitive landscape, EEO attracted so much attention that libraries were buying subscriptions in spite of finding the journal free online, and one indexer actually invited us to apply for inclusion in their database.
So far, the journal has been indexed/abstracted in Abstracts in Anthropology, Google Scholar, Summon by Serial Solutions, and the online library and research consortium OCLC. It takes a while to gather readership information, but we just got the download report for late 2008 and early 2009. The top few articles people hit on included T. Ryan Gregory's two masterful analytical essays, "Understanding Evolutionary Trees" and "Evolution as Fact, Theory and Path." Next came Kristin Jenkins' book review of Guns, Germs and Steel by Jared Diamond, followed by Ian Tattersall discussing "What's so Special about Science?" and rounded out with two classroom pieces designed for use in lesson planning: Annastasia Thanukos' regular column, Views From Understanding Evolution, devoted to "Parsimonious Explanations for Punctuated Patterns," and Gregory Eldredge on "The Five Major Divisions (Kingdoms) of Life."

To interpret for a moment: the first two articles by Ryan Gregory belong on every evolutionary biologist's bookshelf, yet they are so clearly written that a good high school biology teacher could build entire units of study around either one. As for the book review, given Diamond's popularity, may we guess that it has been downloaded by the proverbial General Audience of All Ages and Interests. And the noted primatologist Ian Tattersall's engaging look at science as a human endeavor embedded within culture is also a great read for anybody from scientists to students.

Last of all, the Thanukos and Eldredge downloads give us real proof that teachers want and will access the tools to teach evolution when they are available. It seems that EEO is a hit on every level, from the upper reaches of the evolutionary high table right down to the communal trough we call the Internet.

Clearly, our success is a "new media" story. EEO represents a networking of the linear world of print and scholarship with the meshed, linked, and intersecting pathways of the electronic cloud. What an exciting time to start a journal. Yes, there are a few hard copies floating around, but online is where the action is—and the pictures are all in color there, too. Interestingly, old-fashioned, brick-and-mortar libraries may stand at the heart of the revolution. Look at the work of our first prize winner for this year's Springer award for the Darwin Year Celebration Contest, entitled "Evolve Your Library." This notable contestant is the science and technology library of the New University of Lisbon in Portugal. They won for a wide-ranging celebration of Darwin and evolution at their Caparica Campus that included theater performances, a book and video exhibit, a recreation of Darwin's study, an outdoor installation inspired by Darwin's tree of life, and outreach activities for high school students and senior citizens (http://eventos.fct.unl.pt/darwin2009).

Don't miss our own latest nod to networked information: First, co-editors Niles and Gregory Eldredge present a radical new integrated approach to teaching K-16 science that highlights evolution, folds the natural world and personal experience back into the curriculum, and turns to alternate resources including the Internet as much as to formal textbooks.
Also, Adam Goldstein, our book review editor, starts a new column this issue devoted to Evolution and Education Resources—feel free to send in suggestions for future columns and get updates on our blog and twitter sites as well as our MySpace (http://www.myspace.com/springer_evoo) and Facebook pages (http://www.facebook.com/group.php?gid=23672835224&ref=ts). We love to hear from our fans, supporters, and fellow evolutionists, so stay in touch. We particularly urge teachers, scholars, and students from outside the USA to contribute and make this a truly international effort.

Next year should be another smash, a dynamic series of special issues starting with another sure-fire instant classic, edited by our esteemed board member John Thompson and his Chilean colleague Rodrigo Medel, on coevolution. Plan to visit us in the New Year for this original issue; then, share your own adventures, lessons, and insights about Evolution: Education and Outreach at http://www.springer.com/life+sci/journal/12052

work_zt62z56qerhjdooizoeovmyzmi ---- Networked Digital Library of Theses and Dissertations: Bridging the Gaps for Global Access - Part 1: Mission and Progress

D-Lib Magazine, September 2001
Volume 7 Number 9
ISSN 1082-9873

Networked Digital Library of Theses and Dissertations
Bridging the Gaps for Global Access - Part 1: Mission and Progress

Hussein Suleman, Anthony Atkins, Marcos A. Gonçalves, Robert K. France, Edward A. Fox
(hussein, anthony.atkins, mgoncalv, france, fox) @vt.edu
Virginia Tech

Vinod Chachra, Murray Crowder
(chachra, crowderm) @vtls.com
VTLS Inc.

Jeff Young
jyoung@oclc.org
OCLC

Abstract

The Networked Digital Library of Theses and Dissertations (NDLTD) is a collaborative effort of universities around the world to promote creating, archiving, distributing and accessing Electronic Theses and Dissertations (ETDs).
Since its inception in 1996, over a hundred universities have joined the initiative, underscoring the importance institutions place on training their graduates in the emerging forms of digital publishing and information access. The outreach and training mission of NDLTD is an ongoing project, so in this article we report on the current status of membership and support activities. Recent research has focused on creating a union database that will provide a means to search and retrieve ETDs from the combined collections of NDLTD member institutions. The Virtua system developed by VTLS will serve as the heart of this union database. In order to bridge the gap between the existing distributed institutional archives and a unified collection of ETDs, we have developed a metadata standard especially suited to ETDs; this is then used by partner sites to export their freely available metadata using the Metadata Harvesting Protocol of the Open Archives Initiative. We also link name authority information into the metadata records to support unique identification of authors and others associated with the works. Additional research efforts include advanced search mechanisms, semantic interoperability, the design and development of multi- and cross-lingual search systems, and software modules that support the development of higher-level services to aid researchers in seeking relevant ETDs.

Introduction

The Networked Digital Library of Theses and Dissertations (NDLTD, see http://www.ndltd.org) has emerged as a result of the efforts of thousands of students, faculty, and staff at hundreds of universities around the world, as well as the assistance of interested parties at scores of companies, government agencies, and other organizations. This federation has multiple objectives, including:

- to improve graduate education by allowing students to produce electronic documents, use digital libraries, and understand issues in publishing;
- to increase the availability of student research for scholars and to preserve it electronically;
- to lower the cost of submitting and handling theses and dissertations;
- to empower students to convey a richer message through the use of multimedia and hypermedia technologies;
- to empower universities to unlock their information resources; and
- to advance digital library technology.

Work toward those objectives has proceeded since November 1987, the date of the first meeting devoted to exploring how advanced electronic publishing technologies could be applied to the preparation of electronic theses and dissertations (ETDs). Early efforts are summarized in two D-Lib articles in 1996 and 1997 [Fox et al., 1996; Fox et al., 1997]. A third article summarizes the first attempts to support, through federated search, access to the collection (see also http://www.theses.org) of ETDs that is emerging in distributed fashion [Powell & Fox, 1998].

NDLTD activities are coordinated by an international steering committee that meets each spring and fall. Its members include those who lead the diverse regional and national efforts promoting ETDs. Committees help with strategic planning, standards (see http://www.ndltd.org/standards), training, and meetings. A good deal of effort by steering committee members has gone into fund-raising, so that single institutions and groups of institutions could implement ETD initiatives. There have been national projects in the USA [Kipp, et al., 1999], South Africa, Germany, Australia, and other countries.
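To make the harvesting approach mentioned in the abstract concrete, here is a minimal Python sketch of an OAI-PMH ListRecords request against a hypothetical member archive. The base URL and set name are assumptions; "oai_dc" is the Dublin Core metadata prefix that every OAI-PMH repository must support (an ETD-specific prefix would be repository-dependent), and the XML namespaces shown are those of OAI-PMH 2.0, which slightly postdates this article's protocol version.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

BASE_URL = "http://etd.example.edu/cgi-bin/oai"  # hypothetical archive endpoint

def list_records(metadata_prefix="oai_dc", set_spec=None):
    """Yield (identifier, title) pairs from one ListRecords response."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    with urlopen(BASE_URL + "?" + urlencode(params)) as response:
        tree = ET.parse(response)
    oai = "{http://www.openarchives.org/OAI/2.0/}"
    dc = "{http://purl.org/dc/elements/1.1/}"
    for record in tree.iter(oai + "record"):
        identifier = record.findtext(f"{oai}header/{oai}identifier")
        title = record.findtext(f".//{dc}title")
        yield identifier, title

# A union catalog builder would loop over member archives, continuing
# with the resumptionToken element when a response is incomplete.
for identifier, title in list_records(set_spec="etd"):  # set name assumed
    print(identifier, "-", title)
```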
Supporting research work has been funded by NSF (in projects IIS-9986089 [Fox, 2000], IIS-0086227 [Fox et al., 2000], and IIS-0080748 [Fox et al., 2001]), as well as DFG (in Germany) and CONACyT (in Mexico). At the grass-roots level, one line of support for NDLTD emerged from efforts at Virginia Tech, which has developed training materials and workflow management software that have been adapted by diverse groups. Many other projects and programs interested in ETDs have arisen around the world, some independently, but all are welcome to collaborate through the growing federation that is NDLTD. This is important since open sharing of methods helps others know how to address problems as well as ongoing changes in technology. Already there have been four international symposia on ETDs, with approximately 200 attendees at each of the last two. The next two will be held May 30 - June 1, 2002 at Brigham Young University in Provo, Utah, and late spring, 2003, in Berlin. The NDLTD steering committee has its spring meetings in conjunction with the ETD conferences. These efforts should have a strong positive effect on expanding awareness at universities around the globe. One important agent promoting learning in this arena is the UNESCO International Guide for the Creation of ETDs (see ). To be released late in 2001 in a number of different languages, this book/web site should help students, faculty, and administrators participate in NDLTD. This should extend the considerable progress already made, as is discussed in the next section.

NDLTD Progress

NDLTD has experienced constant progress since its formation. We have registered growth in all major facets, including membership (with increasing international participation), collection size, access, multimedia use, and worldwide availability.

Membership

Table 1 shows NDLTD membership as of August 2001. In less than two and a half years, NDLTD has more than doubled the number of registered members (from 59 members in May 1999). There are currently 120 members: 52 U.S. universities, 52 non-U.S. universities, and 16 institutions, regional centers and organizations (such as UNESCO). These various partners represent 23 countries: Australia, Brazil, Canada, China, Colombia, Germany, Greece, Hong Kong, India, Italy, Mexico, Netherlands, Norway, Russia, Singapore, South Africa, South Korea, Spain, Sudan, Sweden, Taiwan, the USA, and the United Kingdom. These numbers also emphasize the growth of global interest in NDLTD, as international participation grew from less than one third in 1998 to half of the total membership in 2001. Also, by early 2002, at least 11 of the registered NDLTD members will have begun requiring electronic submission of theses and dissertations. (In the table below, those are marked with an asterisk.)
USA Universities (52): Air University (Alabama); Alicante University; Baylor University; Brigham Young University; California Institute of Technology; Clemson University; College of William and Mary; Concordia University (Illinois); East Carolina University; East Tennessee State University*; Florida Institute of Technology; Florida International University; George Washington University; Louisiana State University*; Marshall University; Massachusetts Institute of Technology; Miami University of Ohio; Michigan Tech; Mississippi State University; Montana State University; Naval Postgraduate School; New Jersey Institute of Technology; New Mexico Tech; North Carolina State University*; Northwestern University; Pennsylvania State University; Regis University; Rochester Institute of Technology; Texas A&M University; University of Colorado; University of Florida; University of Georgia; University of Hawaii at Manoa; University of Iowa; University of Kentucky; University of Maine*; University of North Texas*; University of Oklahoma; University of Pittsburgh; University of Rochester; University of South Florida; University of Tennessee, Knoxville; University of Tennessee, Memphis; University of Texas at Austin*; University of Utrecht; University of Virginia; University of West Florida; University of Wisconsin, Madison; Vanderbilt University; Virginia Commonwealth University; Virginia Tech*; West Virginia University*; Western Michigan University; Worcester Polytechnic Institute

International Universities (52): Alicante University (Spain); Australian National University (Australia); Biblioteca de Catalunya (Spain); Chinese University of Hong Kong (Hong Kong); Chungnam National U., Dept of CS (S. Korea); City University, London (UK); Curtin University of Technology (Australia); Darmstadt University of Technology (Germany); Freie Universitat Berlin (Germany); Gerhard Mercator Universitat Duisburg (Germany); Griffith University (Australia); Gyeongsang National University, Chinju (Korea); Humboldt-Universität zu Berlin (Germany); Indian Institute of Technology, Bombay (India); Lund University (Sweden); McGill University (Canada); National Sun Yat-Sen University (Taiwan); Nanyang Technological University (Singapore); National University of Singapore (Singapore); Rand Afrikaans University (South Africa); Rhodes University (South Africa)*; Shanghai Jiao Tong University (China); St. Petersburg State Technical U. (Russia); State University of Campinas (Brazil); Sudanese National Electronic Library (Sudan); Universidad de las Américas Puebla (México); Universitat Autonoma de Barcelona (Spain)*; Universitat d'Alacant (Spain); Universitat de Barcelona (Spain); Universitat de Girona (Spain); Universitat de Lleida (Spain); Universitat Oberta de Catalunya (Spain); Universitat Politecnica de Catalunya (Spain); Universitat Politecnica de Valencia (Spain); Universitat Pompeu Fabra (Spain); Universitat Rovira i Virgili (Spain); Université Laval (Québec, Canada); University of Bergen (Norway); University of Antioquia (Medellin, Colombia); University of British Columbia (Canada); University of Guelph (Ontario, Canada); University of Hong Kong*; University of Melbourne (Australia); University of Mysore (India); University of New South Wales (Australia); University of Pisa (Italy); University of Queensland (Australia); University of Sao Paulo (Brazil); University of Sydney (Australia); University of Utrecht (Netherlands); University of Waterloo (Canada); Uppsala University (Sweden); Wilfrid Laurier University (Canada)

Institutions (16): Cinemedia; Coalition for Networked Information; Committee on Institutional Cooperation; Consorci de Biblioteques Univers. Catalunya; Diplomica.com; Dissertationen Online; Dissertation.com; ETDweb; Ibero-American Sci. & Tech. Ed. Cons. (ISTEC); National Documentation Centre (NDC, Greece); National Library of Portugal; OhioLINK; OCLC; Organization of American States (OAS); SOLINET; Sudanese National Electronic Library (Sudan); UNESCO

Table 1. NDLTD Membership

Collection Size

The number of ETDs across the NDLTD universities/institutions has grown at an even faster pace. From a few dozen at Virginia Tech in 1996, to 4,328 ETDs at 21 institutions in March 2000, we accounted for a total of 7,268 ETDs at 25 member institutions in July 2001. Table 2 shows a breakdown of the current numbers of ETDs as of July 2001 organized by member institution. This data is largely the result of an on-line survey conducted by Gail McMillan and represents only those institutions that responded to the survey.

ADT: Australian Digital Thesis Program (Australia): 238
University of Bergen (Norway): 45
California Institute of Technology: 2
Consorci de Biblioteques Universitaries de Catalunya (Spain): 151
East Tennessee State University: 106
Humboldt-University (Germany): 430
Louisiana State University: 3
Mississippi State University: 33
MIT: 62
North Carolina State University: 301
Pennsylvania State University: 83
Pontifical Catholic University (PUC) (Brazil): 90
Gerhard Mercator Universitat Duisburg (Germany): 126
Universitat Politecnica de Valencia (Spain): 189
University of Florida: 174
University of Georgia: 121
University of Iowa: 6
University of Kentucky: 19
University of Maine: 27
University of North Texas: 337
University of South Florida: 25
University of Tennessee: 12
University of Tennessee, Knoxville: 28
Uppsala University (Sweden): 178
Virginia Tech: 3,393
West Virginia University: 1,006
Worcester Polytechnic Institute: 83
TOTAL: 7,268

Table 2. NDLTD collection size

These statistics do not take into account scanned theses and dissertations, which make up a substantial portion of the total NDLTD collection. There are 26 scanned documents at the New Jersey Institute of Technology, 150 at the University of South Florida, 5,581 at MIT, and 12,000 at the National Documentation Center in Greece.
These result in a total of 17,763 scanned theses and dissertations at these institutions, and quite conceivably thousands of unreported ones at other institutions.

Access Statistics

To demonstrate the potential of NDLTD for global access and sharing of the knowledge produced by universities worldwide, we have periodically analyzed the access logs of the Virginia Tech ETD (VT-ETD) collections. The results for the period 1997-2000 are shown below in Table 3.

Requests for PDF files (mostly full ETDs): 221,679 (1997/98); 481,038 (1998/99, +117.0%); 578,152 (1999/00, +20.2%)
Requests for HTML files (mostly tables of contents and abstracts): 165,710 (1997/98); 215,539 (1998/99, +30.1%); 260,699 (1999/00, +21.0%)
Requests for multimedia: 1,714 (1997/98); 4,468 (1998/99, +160.7%); 12,633 (1999/00, +182.7%)
Distinct files requested: 6,419 (1997/98); 21,451 (1998/99, +234.2%); 16,409 (1999/00, -23.5%)
Distinct hosts served: 29,816 (1997/98); 57,901 (1998/99, +94.2%); 87,804 (1999/00, +51.6%)
Average data transferred daily: 156,089 KB (1997/98); 219,132 KB (1998/99, +40.4%); 382 MB (1999/00, +74.4%)
Data transferred: 55,637 MB (1997/98); 78,107 MB (1998/99, +40.4%); 137 GB (1999/00, +75.6%)

Table 3. Access Log Statistics from the VT-ETD collection

We can see that the number of accesses tends to increase each year. As the collection grows and gains popularity, the number of accesses will most likely continue to increase. More specifically, Table 4 indicates that each of the seven countries with the most accesses has an increasing number of accesses each year (with the exception of Germany in the 97/98 - 98/99 period). The United Kingdom, and surprisingly Malaysia, dominated accesses from outside the US. Apart from Canada, the other leading countries are European, a fact that is probably related to advances in network infrastructure in those countries.

United Kingdom: 6,735 (1997/98, rank 1); 11,347 (1998/99, rank 1, +68.5%); 25,583 (1999/00, rank 1, +125.5%)
Malaysia: 876 (1997/98, rank 16); 4,190 (1998/99, rank 6, +378.3%); 16,147 (1999/00, rank 2, +285.4%)
France: 2,138 (1997/98, rank 7); 4,797 (1998/99, rank 5, +124.4%); 14,960 (1999/00, rank 3, +211.9%)
Germany: 6,727 (1997/98, rank 2); 3,374 (1998/99, rank 9, -49.8%); 14,384 (1999/00, rank 4, +326.3%)
Canada: 3,413 (1997/98, rank 4); 9,632 (1998/99, rank 3, +182.2%); 13,543 (1999/00, rank 5, +40.6%)
Spain: 590 (1997/98, rank 18); 3,647 (1998/99, rank 8, +518.1%); 9,918 (1999/00, rank 6, +171.9%)
Italy: 1,430 (1997/98, rank 12); 3,095 (1998/99, rank 10, +116.4%); 9,300 (1999/00, rank 7, +200.5%)

Table 4. Access by Non-US Sites

Multimedia Use in ETDs

One of the main objectives of NDLTD is to promote student creativity through the use of diverse types of multimedia content in ETDs, while making students comfortable with the use of this technology to exploit richer modes of self-expression. Table 5 indicates how much of this objective has been achieved in the VT-ETD collection, with a breakdown of the 8,056 multimedia files contained in a selection of 2,180 available ETDs. This illustrates both that authors are beginning to shift towards non-textual media and that some are moving away from the early single-file paradigm of digitization.

Still image (BMP, DXF, GIF, JPG, TIFF): 328
Video (AVI, MOV, MPG, QT): 58
Audio (AIFF, WAV): 18
Text (PDF, HTML, TXT, DOC, XLS): 7,601
Other (Macromedia, SGML, XML): 51

Table 5. Multimedia use in VT-ETD collection

Worldwide Release

In terms of copyright, a significant issue is whether to allow the electronic document to be viewed worldwide, on campus only, or not at all. The “mixed” case, which is a unique capability of electronic documents, occurs when some portions (e.g., particular chapters or multimedia files) have restricted access while others are more widely available.
The majority of Virginia Tech students allow their documents to be viewable worldwide (see Figure 1), but some initially choose not to grant worldwide access in order to protect their publication rights. To address this concern, there are ongoing discussions with publishers to help them understand the goals and benefits of NDLTD [NDLTD, 1999]. We are pleased to see a change in attitude by some publishers over the course of the project. The American Chemical Society developed a policy more favorable to NDLTD as a result of lengthy discussions, and the American Physical Society has been receptive to issues concerning the Open Archives Initiative and NDLTD.

Figure 1. Student and committee choice for ETD availability from Virginia Tech (2668 ETDs as of July 17, 2000).

Standards Activity

In order to support many of the current and future research and service-related activities, work has begun to define standards that will enable more consistent exchange of information in an interoperable environment. Among the first of these projects is ETDMS - the Electronic Thesis and Dissertation Metadata Standard - and a related project for name authority control.

Electronic Thesis and Dissertation Metadata Standard (ETDMS)

ETDMS was developed in conjunction with the NDLTD, and has been refined over the course of the last year. The initial goal was to develop a single standard XML DTD for encoding the full text of an ETD. Among other things, an ETD encoded in XML could include rich metadata about the author and work that could easily be extracted for use in union databases and the like. During initial discussions it became clear that the methods used by different institutions to prepare and deal with theses and dissertations would make it all but impossible to agree on a single DTD for encoding the full text of an ETD. Many institutions were unwilling or unprepared to use XML to encode ETDs at all. Thus, instead of an XML DTD for encoding the full text of an ETD, ETDMS emerged as a flexible set of guidelines for encoding and sharing very basic metadata regarding ETDs among institutions. Separate work continues in parallel on a suite of DTDs, building on a common framework, for full ETDs.

ETDMS is based on the Dublin Core Element Set [DCMI, 1999], but includes an additional element specific to metadata regarding theses and dissertations. Despite its name, ETDMS is designed to deal with metadata associated with both paper and electronic theses and dissertations. It is also designed to handle metadata in many languages, including metadata regarding a single work that has been recorded in different languages. The ETDMS standard [Atkins et al., 2001] provides detailed guidelines on mapping information about an ETD to metadata elements. ETDMS is already supported as an output format for the Open Archives interface to the Virginia Tech ETD collection. ETDMS will be accepted as an input format for the union catalog currently being developed in conjunction with VTLS [VTLS, 2001]. NDLTD strongly encourages use of ETDMS.

Authority Linking

Each reference to an individual or institution in an ETDMS field should contain a string representing the name of the individual or institution as it appears in the work. In addition, these references also may contain a URI that points to an authoritative record for that individual or institution.
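To make this concrete, the sketch below shows what a record following these guidelines might look like: Dublin Core elements, the ETD-specific degree information, and a name string accompanied by a URI pointing to an authority record. This is an illustrative sketch only; the element set follows the description above, but the exact serialization, the attribute carrying the URI, and the URLs shown are assumptions rather than the normative ETDMS syntax.

  <!-- Hypothetical ETDMS-style record; names and URLs are placeholders,
       and the attribute syntax is illustrative, not normative ETDMS. -->
  <thesis xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Federated Search Across Distributed ETD Collections</dc:title>
    <!-- Name as it appears in the work, plus a URI resolving to an
         authority record, as described under Authority Linking above. -->
    <dc:creator resource="http://purl.example.org/laf/12345">Doe, Jane</dc:creator>
    <dc:date>2001-07-15</dc:date>
    <dc:language>en</dc:language>
    <!-- The element specific to theses and dissertations. -->
    <thesis.degree>
      <name>Master of Science</name>
      <level>masters</level>
      <discipline>Computer Science</discipline>
      <grantor>Virginia Tech</grantor>
    </thesis.degree>
  </thesis>

A URI like the one above could be dereferenced with an ordinary OAI GetRecord request, which is one reason the Linked Authority File goals that follow lean on the OAI protocol and on PURL indirection.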
Associating authority control with NDLTD seems particularly appropriate since universities know a great deal about those to whom they award degrees and since a thesis or dissertation often is the first significant publication of a student. The “NDLTD: Authority Linking Proposal” [Young, 2001] identifies several goals for a Linked Authority File (LAF) system to support this requirement:

• LAF records should be freely created and shared among participants. While a central authority database is an option, the LAF design expects the database to be distributed to share cost. Individual participants or groups should be able to host a copy of the LAF database and share changes they make to local copies of LAF records with other hosts using the Open Archives Initiative (OAI) protocol [Lagoze and Van de Sompel, 2001]. The mechanism for keeping records in sync is described in the proposal.
• The URIs should be meaningful and useful to anyone outside NDLTD’s domain. A benefit of using the OAI protocol is that individual LAF records will be accessible via an OAI GetRecord request (discussed in the second part of this article).
• The URIs should be persistent and current. This raises a number of challenges, such as duplicate resolution. By using PURLs [OCLC, 2001] in ETDMS records, the underlying OAI GetRecord URLs can be rearranged without affecting the ETDMS records that rely on them.
• The model should be scalable and applicable beyond NDLTD. The LAF model was designed to work entirely with open standards and open-source software.

The LAF design has other advantages over alternatives such as the Library of Congress Name Authority Database [Library of Congress, 2001]. Only the level of participation among decentralized participants limits the coverage of the collection. Because the records are based on XML, the content of LAF records can be as broad or narrow as needed. Finally, because they are distributed using the OAI protocol, multiple metadata formats can be supported.

Future of NDLTD

The statistics presented illustrate that the production and archiving of electronic theses and dissertations is fast becoming an accepted part of the normal operation of universities in the new electronic age. NDLTD is dedicated to supporting this trend with tools, standards, and services that empower individual institutions to set up and maintain their own collections of ETDs. At the same time NDLTD promotes the use of these ETDs through institutional websites as well as portal-type websites that aggregate the individual sites and create seamless views of the NDLTD collection. Ongoing research and service-provision projects are addressing the problems of how to merge together the currently distributed and somewhat isolated collections hosted at each member institution. The second part of this article discusses some of these projects in detail, including development of the Union Catalog Portal based on VTLS’s Virtua system and the many research efforts investigating how to provide better services to researchers with specific information-seeking needs and behaviors.

References

Atkins, Anthony, Edward A. Fox, Robert France and Hussein Suleman (editors). 2001. ETD-ms: an Interoperability Metadata Standard for Electronic Theses and Dissertations -- version 1.00. Available from .

DCMI. 1999. Dublin Core Metadata Element Set, Version 1.1: Reference Description. Available from .

Fox, Edward A. 2000. Core Research for the Networked University Digital Library (NUDL), NSF IIS-9986089 (SGER), 5/15/2000 - 3/1/2002.
Project director, E. Fox.

Fox, Edward A., John L. Eaton, Gail McMillan, Neill A. Kipp, Laura Weiss, Emilio Arce, and Scott Guyer. 1996. National Digital Library of Theses and Dissertations: A Scalable and Sustainable Approach to Unlock University Resources, D-Lib Magazine, September 1996.

Fox, Edward A., Brian DeVane, John L. Eaton, Neill A. Kipp, Paul Mather, Tim McGonigle, Gail McMillan, and William Schweiker. 1997. Networked Digital Library of Theses and Dissertations: An International Effort Unlocking University Resources, D-Lib Magazine, September 1997.

Fox, Edward A., Royce Zia, and Eberhard Hilf. 2000. Open Archives: Distributed services for physicists and graduate students (OAD), NSF IIS-0086227, 9/1/2000-8/31/2003. Project director, E. Fox (w. Royce Zia, Physics, VT, and E. Hilf, U. Oldenburg, PI on matching German DFG project).

Fox, Edward A., J. Alfredo Sánchez, and David Garza-Salazar. 2001. High Performance Interoperable Digital Libraries in the Open Archives Initiative, NSF IIS-0080748, 3/1/2001-2/28/2003. Project director, E. Fox (with co-PIs J. Alfredo Sánchez, Universidad de las Américas-Puebla --- UDLA, and David Garza-Salazar, Monterrey Technology Institute --- ITESM, both funded by CONACyT in Mexico).

Kipp, Neill, Edward A. Fox, Gail McMillan, and John L. Eaton. 1999. FIPSE Final Report, 11/30/99. Available from (PDF version) and (MS-Word version).

Lagoze, Carl and Herbert Van de Sompel. 2001. The Open Archives Initiative Protocol for Metadata Harvesting. Open Archives Initiative. January 2001.

Library of Congress. 2001. Program for Cooperative Cataloguing Name Authority Component Home Page.

NDLTD. 1999. Publishers and the NDLTD. NDLTD, July 1999.

OCLC. 2001. Persistent URL Home Page. Dublin, OH: OCLC Online Computer Library Center.

Powell, James and Edward A. Fox. 1998. Multilingual Federated Searching Across Heterogeneous Collections. D-Lib Magazine, September 1998.

VTLS. 2001. Virtua ILS.

Young, Jeffrey A. 2001. NDLTD: Authority Linking Proposal. Dublin, OH: OCLC Online Computer Library Center.

Copyright 2001 Hussein Suleman, Anthony Atkins, Marcos A. Gonçalves, Robert K. France, Edward A. Fox, Vinod Chachra, Murray Crowder, and Jeff Young

DOI: 10.1045/september2001-suleman-pt1

work_zwfodkbj25ev3kynhlrxxg3jsm ---- An Integrated Web-Based ILL System for Singapore Libraries

Foo, S., & Lim, E.P. (1998). Interlending & Document Supply, 26(1), 10-20. Re-published (with permission) in: Foo, S., & Lim, E.P. (1999). OCLC Systems and Services, 15(1), 24-34.

An Integrated Web-Based ILL System for Singapore Libraries

Schubert Foo & Ee-Peng Lim
School of Applied Science, Nanyang Technological University, Nanyang Avenue, Singapore 639798

Abstract

The paper proposes an integrated Web-based inter-library loan (ILL) system to replace and enhance the existing manual-based ILL system used by Singapore libraries. It describes the system requirements that must be supported in order to make it a viable and acceptable solution to all participating libraries. Subsequently, it presents the client-server Web-based system architecture, database design and Java development platform that are used to implement the system.
The new system exhibits a host of advantages over the manual system, including minimising human resource by eliminating form-filling and other forms of paperwork completely, improving the access and speed of the ILL process by allowing participating libraries to update each other's databases directly, ensuring data integrity, simplifying status tracking, and supporting instantaneous status and statistical reporting.

Keywords: Integrated ILL system, Web application, database, client-server architecture, Java.

Introduction

The inter-library loan (ILL) service is one of the many services provided by libraries that offers users a way to access library resources beyond their affiliated libraries. Through inter-library cooperation, loans from participating libraries and organizations, both local and abroad, are arranged on behalf of the requester. Such a service allows participating libraries to share and maximize the utilisation of their resources.

Existing ILL systems in Singapore are all basically manual systems that require substantial librarian intervention in the ILL process. This is also the likely scenario of many other ILL systems around the world. However, there does exist a very small number of computerized systems: universities such as the University of Stirling [1], the University of Arkansas [2] and Nanyang Technological University (NTU) [3] support World-Wide-Web-based ILL systems, and commercial ILL modules such as Softlink's Library Automation Software [4] are available. However, it is impossible to infer if these systems are just front-end systems to obtain the ILL input information from the requester (such as NTU library), or additionally include back-end processing to manage the ILL process among participating lending libraries. Nonetheless, the number of manual or semi-manual systems far exceeds totally computerized ones due to a number of reasons. First, there is the need for each participating library to use the same system or software in order to carry out the ILL process. Second, the difference in library policies among participating libraries makes it difficult to realize a generic system that would be acceptable to all. Third, as the number of ILL requests is traditionally low compared to normal borrowing, few resources have been allocated to administer or computerize the ILL service. Finally, different systems, even if they exist, suffer from interoperability problems due to the absence of standards. Thus, two different ILL systems will not be able to exchange information or data. This problem will only be solved over time when ILL systems become more commonplace. The need for interoperability among systems across national boundaries will eventually yield a de-facto standard for ILL transactions and communications. Thus, the present situation will continue unless some form of acceptable solution to all can be found and realized.

Review of current ILL process

In attempting to provide such an acceptable solution, the NTU library has been used as a basis of study and subsequent derivation of system requirements of the proposed integrated Web-based ILL system to replace the existing ILL system. NTU Library is one of a number of participating libraries of the Library Association of Singapore (LAS) ILL service. Other participating libraries include national institutions (e.g. Monetary Authority of Singapore, Institute of Southeast Asian Studies, National Library Board, National Archives, Productivity and Standards Board, Trade and Development Board), academic institutions (e.g.
National University of Singapore, National Institute of Education, Singapore Polytechnic, Temasek Polytechnic), schools and junior colleges (e.g. United World College of SEA, St. Andrew's Junior College, Raffles Institution) and other organisations (e.g. Strait Times Editorial, Singapore Broadcasting Corporation, British Council). Participating libraries have ILL reciprocity and adopt a similar process for processing outgoing and incoming ILL requests. There are basically no charges for ILL transactions of books or periodicals, although some libraries charge a flat rate for photocopies of periodicals.

Figure 1 NTU Library's ILL request form on the Web

In an outgoing ILL request, a requester will first complete either an ILL-request form or provide the information through a library web page. Figure 1 shows the NTU library's web page used to obtain the input information of an ILL request. With the request, the librarian has to verify the validity and contents of the request by checking the requester's particulars and the correctness of the requested publication against a union catalogue such as the Singapore Integrated Library Automated Service (SILAS). SILAS constitutes the national database for Singapore libraries and allows checks to be carried out for bibliographic details as well as the libraries holding the particular title. This check is necessary to confirm that the requested item is not available in the library's own holdings and to assist in deciding on the lending library to be used to fulfill the ILL request.

Figure 2 Library Association of Singapore ILL Form

With this information, a standardised LAS-ILL form (Figure 2) is completed and submitted to the lending library. This is a three-page carbon-copy form that is used for handling "Request", "Interim Report" and "Notice of Return". With the ILL approved, arrangements are made for the collection of material by the borrowing library. The requester is subsequently informed to collect the material. The normal process of tracking and controlling of the loaned material, and collecting overdue fines, applies. Finally, arrangements are made to return the material to the lending library.

In the reverse role as a lending library, the incoming ILL request is first verified against the library catalogue and holding information. If the material is available for loan, then it is physically retrieved from the shelf, put on charge, and placed aside to await collection. Arrangements are made to inform and arrange for collection by the borrowing library. The process of tracking the collection and return of loan items is similar to that for normal items from the library, except that it is carried out with a borrowing library instead of a normal library user. The process ends with the confirmation with the borrowing library of the safe and sound return of the loaned item.

The current manual ILL system is laborious, time-consuming, error-prone and inefficient. The majority of time is spent on searching, documentation and updating of loans' statuses, tasks that require minimal decision making. Duplication of effort is required to complete the LAS ILL loan form, since the requester completes a separate form when requesting an ILL. Access to SILAS is confined to certain access points in the library. Miscommunication with the lending library may delay the process of approval. Time is spent in contacting requesters when urgent requests are not fulfilled, filing ILL forms, carrying out manual periodic checking of overdue items and collating ILL statistics.
The manual system is manageable if the requests are few, but is likely to lead to a proportionate increase of administrative problems and overloads when ILL demand increases. Such reasons, coupled with the rapid advancement of Internet technology and the availability of on-line library systems and OPACs, have provided the impetus to develop a new computerized ILL system to manage the complete ILL process.

System architecture and requirements

Figure 3 Integrated Web-based ILL system architecture

The proposed integrated Web-based ILL system adopts a client-server architecture as shown in Figure 3. It allows the system to be integrated with the library's existing functions on the Web to enable widespread access by anyone on the Internet. This implies that both users and librarians are no longer confined to using special access points to submit or process ILL requests.

The client can be a personal computer, Macintosh or workstation with a suitable Web browser such as Netscape Navigator 2.x/Gold [5] or Microsoft Internet Explorer 3.0 [6], or higher equivalent versions of them. Residing on the Web server is a database server that supports two separate databases for maintaining incoming and outgoing ILL requests respectively. The use of separate databases distinguishes the two different roles of borrowing and lending clearly, and provides a neat solution that improves speed, efficiency and maintainability. These stand-alone databases are used solely to support the ILL functionality and, as such, are not the same library databases used for catalog or holding information. This server architecture is replicated at each participating library. A communication processor exists to allow libraries to communicate with and update each other's databases directly. Users interact with the system using the normal Web access techniques of browsing and filling in forms.

The architecture supports a number of essential system requirements and characteristics to make it a viable replacement of the existing ILL system:

• The system is differentiated from other automated library systems that are used to support the various functions (e.g. acquisition, cataloguing, circulation, media, resource, etc.) of the library.
• It is a 'stand-alone' system that remains functional even if one or more other participating libraries' systems are down.
• The system is easily accessible, supports ease-of-use to both users and librarians, and minimizes the amount of human effort in the ILL process.
• The system is integrated with the library's Web page to provide the added ILL functionality.
• An authentication process is provided to control legitimate access by registered library users, librarians and participating libraries.
• All ILL requests are electronically completed by requesters and verified by the system before they are processed by librarians.
• New requests are automatically presented to librarians for approval. Approved requests are forwarded to the lending libraries by updating their databases directly. Rejected requests are accompanied by reasons provided by the library.
• Incoming ILL requests from other libraries are automatically presented for approval. Likewise, the outcomes of these requests are forwarded to the borrowing libraries by updating their databases directly.
• Progress tracking exists to enable users to check and be kept up-to-date on the statuses of their ILL requests.
• Transaction updating exists to allow both libraries to keep track of and monitor the loan status after collection from the lending library.
• Overdue items are automatically flagged and made known to the librarians by the system.
• Completed ILL transactions are kept in the databases for a period of time (e.g. two years) prior to being archived. Statistical information on the data in the active databases can be generated for viewing and analysis.
• The system maintains participating libraries' information, allowing libraries to join or leave the ILL system.

System design

Figure 4 Module structure and access control

Three separate Web pages on the client machine provide different levels of access to the ILL system for the users and librarians as shown in Figure 4. The About_ILL module provides information on the ILL service that is being offered by the library. The Registration module allows first-time users to submit a one-time membership application to use the ILL services offered by the library. Users have to read and agree to abide by the ILL service policy in the process of submitting their application. Depending on library policies, ILL services may be offered to certain groups of staff of the organization. For example, NTU library's ILL service is accessible by academic staff or postgraduate students through their supervisors only.

Registered users have access to four modules to carry out their ILL transactions. Prior to that, an Authentication module provides the necessary safeguard and access control to restrict the ILL service to registered users and authorized librarians. The ILL_Book Request and ILL_Periodical Request modules allow users to submit an ILL book or periodical request by defining the necessary details of the publication. The Status_Request module is used to query and display the status of outstanding ILL requests of users. Finally, the Help module provides on-line help assistance to users and provides answers to a list of Frequently-Asked-Questions.

A separate web page that is only accessible by librarians provides the necessary modules to support the ILL service. The Process_Registration and Outgoing_ILL modules mirror the users' Registration and Request modules to provide the tools for processing new membership and ILL requests respectively. The Incoming_ILL module is used to process ILL requests from borrowing libraries. The outcomes of outgoing and incoming ILL requests are automatically updated on the lending and borrowing library's database respectively via the Communication Processor on the local server. In addition, a Statistics module is used to generate statistical data on incoming and outgoing ILL requests. This includes information on the number of ILL requests that are accepted or rejected over a defined time period, the different types of user groups requesting the ILL service, and the subject categories of ILL requests. Such information can be analyzed and used by the library to plan for future ILL services, promote reciprocity and cooperation among participating libraries, or to build up a collection of specialised material in the library.
Finally, an Administration module allows various administrative functions such as database backup or archiving to be carried out periodically. Additionally, it is used to maintain an up-to-date record of the participating libraries and contact information.

As the ILL is a distinct service of the library, a set of customized databases is used to support multiple independent users having concurrent access to a central repository of consistent and shared information. Two databases, namely InILLDB and OutILLDB (as shown in Figure 3), are used to store the necessary information and distinguish the role played by the library in processing incoming and outgoing ILL requests respectively. The overall database model is depicted in Figures 5 and 6 using an entity-relationship diagram consisting of entity types represented by boxes, their attributes represented by ovals, and relationship types between entities represented by diamonds. As shown in the data model, each ILL request is distinguished uniquely by means of a serial number (s_no), with a date of request, date of processing and the status in the ILL process. The comment attribute stores the message associated with the status, including the reason for an ILL request rejection.

Figure 5 Entity relationship diagram for outgoing ILL database (OutILLDB)

The ILL request is submitted by either a user (user_id) or requesting_library (req_lib_name) whose details are defined in user_data and library_data entities respectively. Each ILL request is either a book or periodical, with the normal bibliographic information defined using the book_data or periodical_data attributes. This is to take into account the different information requirements for a book and periodical (such as periodical title/editor, article title/author, volume number, part number, range of page numbers, and so on). Approved outgoing ILL requests of the borrowing library are tracked using a transaction record (transact_record) that contains attributes to define the lending library, date of response from the lending library, collection date, and due date for return to the lending library. The normal user's loan data of collection date, due date, return date and overdue fine are also included in the same transaction record.

Figure 6 Entity relationship diagram for incoming ILL database (InILLDB)

Additionally, statistical information of the ILL request is defined using a loan statistics entity (loan_stat) that contains attributes to define the subject of the ILL request, type of user, type of document, type of request, date and outcome of request. Similarly, all approved incoming ILL requests by the lending library are tracked by a transaction record and loan statistics that contain the corresponding set of information, which includes approval date, collection date, due date, return date, overdue fine, and so on. The process of the ILL service using the new system is shown in Figures 7 and 8 for outgoing and incoming ILL requests respectively.
Figure 7 Outgoing ILL Process

Figure 8 Incoming ILL Process

It is basically identical to the existing manual system except that all information is now electronically handled by the system. Manual form-filling, filing and generation of statistics are completely eliminated in the new system. Users and librarians are provided with appropriate user interfaces at various stages to manage the complete ILL process.

The new system provides an effective and efficient tool to both users and librarians in managing the ILL process. From the users' point of view, ILL requests may be conveniently submitted, incomplete or erroneous inputs easily rectified, turnover time for ILL approvals and loan collection reduced, and status checks on outstanding ILL requests and loans easily accomplished. From the librarians' point of view, all forms of paperwork are eliminated. Misplaced or lost ILL requests no longer occur. Approved outgoing ILL requests reach the lending library almost instantaneously for further processing. The outcome of incoming ILL requests likewise reaches the borrowing library quickly, and management of collection, loan tracking and return becomes much easier.

System implementation

The proposed integrated Web-based ILL system has been developed in the School of Applied Science, NTU. It employs the system architecture and design described in the previous sections and satisfies the associated system requirements. The client is implemented as a set of Java applets [7] that are downloaded and invoked when the Web page is browsed. The client-server concept of the architecture is only apparent when these applets are downloaded to the client, since there is no prior installation of applets at the client machine.
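As a rough illustration of this delivery model, the sketch below shows the skeleton of one such client applet. The class name and form-building details are hypothetical and not taken from the actual system; only the java.applet lifecycle shown (init being called when the browser loads the embedding page) reflects how applets of this era were invoked.

  // Hypothetical skeleton of a client applet; the class name and UI details
  // are illustrative placeholders.
  import java.applet.Applet;
  import java.awt.Graphics;

  public class ILLRequestApplet extends Applet {
      // Called by the browser when the page embedding the applet is loaded.
      public void init() {
          // Build the ILL request form UI here (fields for title, author, etc.).
      }

      // Simple placeholder rendering so the sketch is self-contained.
      public void paint(Graphics g) {
          g.drawString("ILL request form would be rendered here", 20, 20);
      }
  }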
Java has been selected as the development platform due to its superior features, being platform-independent, and its emerging role as the de-facto language for Web-based applications. A publicly available relational database management system (RDBMS), Mini-SQL [8], that supports a standard subset of ANSI SQL is used as the database engine. This is sufficient in most instances as the number of ILL transactions is still relatively small. Data associated with incoming and outgoing ILL requests are stored in the separate InILLDB and OutILLDB databases respectively. The client uses a set of object classes to communicate with the host and remote library databases. The object classes are built using Java and the Java-mSQL Application Programming Interfaces (APIs) [9]. Each class provides functionality to support connection establishment, database queries, retrieval and presentation of results, and disconnection from remote hosts. If necessary, the existing Mini-SQL database can be "unplugged" (removed) and replaced by a production-grade RDBMS to support higher ILL demands. This poses no problem as it requires only minor changes to existing code due to the use of standard SQL APIs.

A status flag is used to keep track of the state of an ILL request in the process. The flag can assume one of the eight states shown in Table I. If an ILL application is rejected, a message predetermined by the librarian is displayed to the user to inform the latter about the reason for rejection. This table also defines the library that is responsible at the various action stages and the databases that are updated in the process.

As an example of the user interface used in the system, the librarian's ILL Web page is shown in Figure 9. The icon-based menu provides the functionality to handle new membership requests and to process and update outgoing ILL transactions. New ILL requests are automatically displayed for processing by the librarian. Upon approval of the request, the librarian will identify and select a lending library to send the request. The dialog box used to do this is shown in Figure 10. The request is subsequently sent by the system by automatically updating the lending library's incoming database directly. The lower half of the librarian's ILL Web page is used to process incoming requests as the library plays the reverse role as a lending library to process and update incoming ILL transactions from borrowing libraries. In addition, it provides links to the administrative and statistical functions of the system. Similar interfaces and dialog boxes to those of Figures 9 and 10 are used in various modules throughout the system.
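Since the paper names the attributes (Figures 5 and 6) and the status codes (Table I, below) but gives no schema listing, the following is a minimal sketch of how the outgoing ill_request table might be declared in the ANSI-style SQL that Mini-SQL supports, and of how a status transition might be recorded. Column types, lengths, and the sample values are assumptions for illustration only.

  -- Hypothetical DDL for the outgoing database (OutILLDB); types and sizes
  -- are assumptions based on the attributes shown in Figure 5.
  CREATE TABLE ill_request (
      s_no      INT,        -- unique serial number of the request
      user_id   CHAR(16),   -- requester, detailed in the user_data entity
      doc_type  CHAR(1),    -- book or periodical
      title     CHAR(200),
      status    CHAR(1),    -- one of the eight states in Table I
      comment   CHAR(200),  -- message associated with the status
      req_date  CHAR(10),   -- date of request, stored as text
      proc_date CHAR(10)    -- date of processing, stored as text
  );

  -- Recording the transition from state A (new request) to state B
  -- (approved by the borrowing library), as enumerated in Table I.
  UPDATE ill_request
     SET status = 'B', proc_date = '1998-02-01'
   WHERE s_no = 42 AND status = 'A';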
Table I Status in ILL Process
Status | ILL Process Stage | Action By | Database Update
A | New ILL request | Registered user (BL) | OutILLDB (BL)
B | ILL approved by borrowing library | Librarian (BL) | OutILLDB (BL), InILLDB (LL)
C | ILL approved by lending library | Librarian (LL) | OutILLDB (BL), InILLDB (LL)
D | Material collected; collect date and due date updated | Librarian (BL), Librarian (LL) | OutILLDB (BL), InILLDB (LL)
E | Material collected by registered user | Librarian (BL) | OutILLDB (BL)
F | Material returned by registered user | Librarian (BL) | OutILLDB (BL)
G | Material returned to lending library | Librarian (BL), Librarian (LL) | OutILLDB (BL), InILLDB (LL)
X | ILL request rejected | Librarian (BL) or Librarian (LL) | OutILLDB (BL), or both OutILLDB (BL) & InILLDB (LL)
(BL: borrowing library; LL: lending library)

Current status

The system will be initially installed in the libraries of NTU and NIE (National Institute of Education) for a pilot run and further evaluation. As part of future work, the system will be enhanced to integrate bibliographic search results with ILL requests in order to further minimize user input as well as input error. Likewise, a function can be provided to make the necessary checks against the union catalogue and libraries' holding information to validate the request and provide a list of lending libraries automatically for selection by the librarian. Additionally, some form of automatic electronic mailing facility can be incorporated either at the borrowing or lending library so that requesters are informed of the outcome of their requests at the shortest possible time after submission.

By gathering feedback on the pilot system, additional system requirements from other participating libraries, and outputs from continuing work, it is envisaged that a production version will be developed and implemented on a nation-wide basis.

Figure 9 Librarian's Web page

Figure 10 Interface to send approved ILL request

Conclusion

An integrated Web-based ILL system is proposed and developed. The system has demonstrated the advantages of having a Web-based system to replace the existing paper-based, manual-intensive ILL system. The system is small, secure, robust, and easily installed and maintained. It eliminates form-filling completely and minimizes human errors as information is electronically captured, updated and transmitted among participating libraries. The system simplifies status tracking and allows users to obtain the information directly and easily. Use and administration of the system is carried out through the Web and is therefore made accessible Internet-wide and not just confined to limited access points in the library. Data in the system is handled efficiently and effectively, with options for backup/archival and statistics generation. The overall improvement in speed and efficiency of the ILL service will enable a larger number of ILL requests to be handled and further promote cooperation among participating libraries.

References

[1] Stirling University, "Stirling University Library: Inter-Library Loan Document Requests", 1997.
[2] Arkansas University, "University of Arkansas Libraries", 1995.
[3] Nanyang Technological University, "Nanyang Technological University Library", 1997.
[4] Softlink International, "Softlink Library Automation Software", 1997.
[5] Netscape Communications Inc., "Netscape Navigator 2.02 WWW Browser", 1997.
[6] Microsoft Corporation, "Internet Explorer 3.0", 1997.
[7] Walsh, A.E., "Foundations of Java programming for the World Wide Web", IDG Books Worldwide, Inc., 1996.
[8] Hughes, D.J., "Mini SQL: A Lightweight Database Engine Version 1.0.6", 1995.
[9] Collins, D., "Java-mSQL APIs", 1997.

BIOGRAPHY

Schubert Foo is the Head of the Division of Information Studies, School of Applied Science, Nanyang Technological University. His current research interests include multimedia technology, Internet technology, CSCW systems, digital libraries and project management.

Ee-Peng Lim is a lecturer in the Division of Software Systems, School of Applied Science, Nanyang Technological University. His current research interests include database integration, multi-database systems, and digital libraries.

work_zyp3zahgp5cgtbn2or3qklpivq ---- Methods for in-sourcing authority control with MarcEdit, SQL, and regular expressions

The University of Akron
From the SelectedWorks of Michael Monaco
2020

Methods for in-sourcing authority control with MarcEdit, SQL, and regular expressions
Mike Monaco, The University of Akron
Available at: https://works.bepress.com/michael-monaco/24/

Methods for in-sourcing authority control with MarcEdit, SQL, and regular expressions

Mike Monaco
Coordinator, Cataloging Services
The University of Akron, Akron, Ohio, USA
https://orcid.org/0000-0001-7244-5154
The University of Akron, 302 Buchtel Common, Akron, Ohio 44325-1712
Office: 330-972-2446
mmonaco@uakron.edu

This is an Accepted Manuscript of an article published by Taylor & Francis in the Journal of Library Metadata on December 20, 2019, available online: http://www.tandfonline.com/10.1080/19386389.2019.1703497

ABSTRACT

This is a report on a developing method to automate authority control in-house (that is, without an outside vendor), especially for batch-loaded bibliographic records for electronic resources. A SQL query of the Innovative Sierra database retrieves entries from the “Headings used for the first time” report. These entries are then processed with some regular expression substitutions to create a list of terms suitable for batch searching in the OCLC Connexion client. Two approaches to this problem are described in detail and results compared. A similar method for using the “Unauthorized headings” report from the SirsiDynix Symphony ILS is also described.

Keywords: Authority control, automation, Regular expressions, SQL, batch processes, workflows, Sierra ILS

Shorter title: Methods for in-sourcing authority control

Background

Like many, perhaps most, university libraries in the United States, the majority of The University of Akron University Libraries’ collection budget has shifted from print/tangible to electronic resources (e-resources), while cataloging staff time has been reduced through attrition and the assignment of additional non-cataloging responsibilities. Shifting from print resources, where manual authority work for individually vetted bibliographic (bib) records is possible, to batches of e-resource bib records loaded en masse without the same individual vetting has meant changing the approach to authority work for most of the new titles added to the collection.
Because outsourced authority vendors are not an option, authority control remains in-house, but by working with the systems unit, the cataloging unit is developing a method to automate much of the authority workload associated with loading large batches of bib records.

Literature review

Authority control is a fundamental and perennial challenge in librarianship, and a great deal has been written about authority control, automation, and challenges posed by vendor-supplied bib records. This review focuses on recent work on (1) how the quality of vendor-supplied batches of bib records has been assessed, (2) attempts to control the quality of such records and the headings in them, and (3) efforts to automate authority work. The importance of quality metadata for access and discoverability, and the centrality of authority control to quality control in a bibliographic database, is assumed rather than argued for in the present paper. Snow (2017) provides an in-depth review of the literature on the importance of metadata quality.

No discussion of authority control can ignore the use of vendors to “outsource” the drudgery of managing headings and loading authority records (ARs). Park (1992) presents a relatively early look at moving from manual (in-house) authority files to automated authority files maintained by vendors. Tsui & Hinders (1999) look at outsourcing authority work with vendors, which at the time was the only real alternative to finding ARs manually for each access point. Their cost-benefit analysis compared OCLC charges and credits and the associated staff time to the cost of a vendor contract and associated staff time. Aschmann (2002) gives a detailed plan for outsourcing that includes creating an RFP, working with a vendor, and forming an in-house authority control team. Ten years after Park’s hopeful outline, Aschmann found that outsourcing did not necessarily save staff time. Jackson (2003) discusses some of the advantages and perils of automated authority control. While the ability to automatically update headings in the catalog is a great benefit, the downsides include the limits of computer matching (which may produce amusing errors) and the need to manually identify authorities to add to the catalog. Jackson concludes that vendor-supplied authority control carries the same benefits and perils, and hopes that future integrated library systems (ILS) will provide better options for in-house authority control. Velluci (2004) surveys the major vendors of authority control and explains the services they offer. It is an interesting snapshot of the state of authority control in 2004. One of the leading vendors mentioned is no longer in business, and the author notes resistance, on the part of vendors, to the idea of an international authority file. It is an accurate and detailed overview of the services available, excepting only services that have developed since 2004, such as updating bib records to use RDA forms of headings and adding URIs to MARC fields. Zhu & Von Sheggern (2005) outline various quality control services that can be provided by vendors and try to set realistic
Williams (2010) gives a case study of database cleanup with Marcive, noting some of the limits of automation identified by Zhu & Von Sheggern (2005) and some local issues such as the strain loading batches of bib records put on their ILS, and how they dealt with those challenges. Some recent work has taken up the idea of managing large scale quality control and authority work projects internally. Kreyche, Lisius, & Park (2010) describe a process at Kent State University Libraries for updating name ARs with death dates added since the NACO policy change which made adding death dates more routine. While it is an important example of a large-scale in-house project, the vendor Backstage Library Works eventually began offering the same service. Ziso, LeVan, & Morgan (2010) describe a method that, rather than using ARs within the database, queries OCLC’s WorldCat Identities file to direct users to authorized access points and related works for searches. It is certainly an outside-the-box approach, but most library catalogs still rely on internal authority files, and the method does not help update headings in bib records, so bib record quality would need to be addressed by other workflows. Mak (2013) gives a detailed look at a process at Michigan State University Libraries to cope with the mass re-issue of name authority records (NARs) by the Library of Congress, when many NARs were revised to meet RDA standards. Mak describes a process where ARs in the local catalog are exported, converted to a format allowing the extraction of control numbers for batch searching, and then comparing the retrieved ARs to the exported ARs to identify and select 4 updated records to load. An AutoIT script automates most of this this process and even updates bib records within the ILS. Cook (2014) provides a roundup of useful tools for manipulating metadata, including programs, development environments, and programming languages that can be used to manipulate MARC records. Some of these tools were utilized in the present project. Carrasco, Serrano, & Castillo-Buergo (2016) describe a tool for matching headings in the context of a large database with possible duplicates. Their work is notable for relying on bib records to disambiguate names. This was accomplished by analyzing time periods and dates in the bib records associated with names rather than using ARs. Dong, Glerum, & Fenichel (2017) describes a process for resolving a problem that was more or less unique to their shared database: duplicate series data. This is useful to other libraries because the authors detail their planning process and practical lessons learned. The article also includes a good literature review on database quality and describes some approaches and projects in large-scale authority control undertaken elsewhere. Wolf (2019) describes processes that use existing lists of changed or updated ARs (the Library of Congress “Weekly Lists” and OCLC’s “Closed Dates in Authority Records”) to extract record identifier numbers, and then queries these numbers in the local Sierra ILS (hereafter, Sierra) to determine which ARs need to be re-loaded into Sierra. Wolf’s process involves using regular expressions to extract the relevant data, JSON queries of the Sierra database, and batch searches in the OCLC Connexion Client (Connexion), making her work somewhat similar to the present project. 
Indeed, it is a complementary project: where the present project begins with internal notifications (headings reports), Wolf's begins with external notifications (the aforementioned Library of Congress and OCLC lists).

A natural catalyst for seeking large-scale authority control solutions has been the increasing practice of batch loading bib records, especially records for intangible e-resources. The batch loading of e-resource bib records creates several challenges for authority control, as vendor-supplied bib records may be of varying quality and are loaded in volumes that make evaluating the records individually impractical. Sanchez, Fatout, Howser, & Vance (2006) is one of the first publications to address the use of "non-traditional, non-ILS supplied editing utilities to correct MARC records prior to loading" (p. 54). Their paper describes their use of MarcEdit, Word, and Excel to correct errors in bib records provided by NetLibrary. These corrections were carried out in batches, but authority work was carried out manually by catalogers. While manual authority work on small batches of e-resource bib records was feasible in 2006, the growth of e-resources in library collections and the reduction of library staff render such an approach less practical today. Heinrich (2008) details quality enhancements made to electronic book (e-book) bib records both pre- and post-load. The pre-load work included vetting different collections and requesting customizations based on local practices. The post-load work included deduplication of titles, transferring local information to the batch-loaded records, and establishing overlay protection for the local data fields. However, no attempt was made at authority control for electronic serials (e-serials) because the "records are unstable" (p. 15) -- that is, e-serial bib records are frequently updated and redistributed, making local changes to records less permanent than they are for e-books. Moreover, the batch-loaded bib records were also excluded from the headings reports out of concern that the large number of records would "overwhelm the capacity of the headings reports" (p. 15). Finn (2009) describes pre-load authority work on batches of bib records at Virginia Tech. Their procedures are a mix of outsourced and internal work. First, Library Technologies, Inc. (LTI) edits the batch files to correct certain common errors and creates a report on "unlinked" (that is, uncontrolled) headings. Then library staff use MarcEdit to make changes to the access points in the bib records based on the LTI reports. LTI also supplies ARs for the library to load. Global updates and headings reports in the ILS are used after loading the bib records to cover additional corrections. Martin & Mundle (2010) offer a typology of authority problems (broadly: access issues, load issues, and record quality issues), explain their procedures for dealing with them, and emphasize the usefulness of talking to vendors as a tactic for maintaining quality control. They focus in particular on Springer e-books, a collection that also proved vexing to other consortia. Wu & Mitchell (2010) describe some of the e-book record quality issues the University of Houston Libraries has found and how they are addressed with MarcEdit batch processes, as well as the difficulties posed by changing cataloging standards, particularly the preference for provider-neutral records. Panchyshyn (2013) introduces a procedure for quality control via a checklist.
Authority control is managed at Kent State by isolating batches of e-resource bib records from other records (pp. 27-28). Like Heinrich (2008), Panchyshyn warns that the costs associated with authority control may make it inadvisable for certain kinds of resources -- in this case, e-resources that will not be held "in perpetuity" (p. 34). Beisler & Kurt (2012) describe a task force used to deal with issues with e-resources and batch loading workflows, developing a form similar to Panchyshyn's checklist for managing workflow; however, they say very little about quality control and automated authority processing. David & Thomas (2015) look at the quality of bib records for e-resources. They note that the quality of bib records is especially important for e-resources because these resources cannot be found on the shelf: user browsing and selection take place in the catalog, based mainly on the bibliographic metadata displayed there (p. 802). They focus on the types of errors that occur in access points and the time and cost of correcting them. Their study of user searches confirmed that title, author, and subject fields are the most important access points, both because they are the most frequently chosen for single-field searches and because their analysis of keyword searches found that title, author, and subject terms were the three most commonly entered kinds of search terms. Of course, all three of these access points are controlled by ARs, further highlighting the importance of authority work for access. Flynn & Kilkenny (2017) describe dealing with the problem of e-resource bib record quality at the consortial level. Their paper describes the evolving policies and procedures that were put in place to improve record quality in OhioLINK. These are focused on changes to bib records -- some manual and some automated. They also include a helpful review of the literature on vendor record quality and discuss how they worked with various vendors to improve record quality at the source. Van Kleeck, Nakano, Langford, Shelton, Lundgren, & O'Dell (2017) examine record sources, again highlighting the importance of record quality for e-resources. They conclude that OCLC bib records distributed via WorldShare Collection Manager (WCM) are generally equal or superior to the vendor-supplied records from the other sources that they examined. The record sources are identified in this study (as opposed to David & Thomas (2015) and Flynn & Kilkenny (2017), who anonymize the vendors and publishers), making this an especially helpful article for librarians developing their own workflows. Their emphasis on record quality (especially authorized access points) underscores the importance of authority control for e-resources, which are primarily accessed through OPACs or discovery layers dependent on these access points. Thompson & Traill (2017) describe a method to check record quality with Python scripts that evaluate quality using a rubric that gives credit for the presence of authorized access points, call numbers, and descriptive fields that affect discovery, such as summaries and contents notes. The records' scores according to the rubric are used to separate records that can be batch loaded from those that will need human intervention to assure completeness and correctness. This project has had the added value of helping compare the relative quality of different sources of bib records, confirming Van Kleeck et al.'s observation that WCM provides better records than most vendors.
Tingle & Teeter (2018) describe an effort to make e-resources visible in a fairly literal manner: proxies for titles and topics were placed on the shelves among print resources. The project highlights how significant an issue the discoverability of e-resources remains, but it does not particularly address record quality within the catalog.

Automation of authority work at The University of Akron

Even libraries with authority control vendors often find it impractical or cost-ineffective to outsource authority work for e-resource bib records. As discussed in the literature review, e-resources pose a particularly vexing problem because the records are often of low quality, because the records are not expected to remain for long or will be updated with new records at regular intervals, and/or because the sheer number of incoming records can be daunting. At The University of Akron (UA), a large public research university which does not use an authority control vendor, a process was developed that leverages free software, simple database queries, and capabilities already present in the ILS and bibliographic utilities to improve and control the access points in bib records with minimal staff effort, and to retrieve supporting ARs in batches. The process is an example of successful collaboration between librarians with different areas of functional expertise and at different institutions, and we hope our initial successes will inspire other librarians to push themselves to develop skills beyond those traditionally employed within their units.

In developing this method to download batches of ARs to support (and update) headings in incoming bib records, the goal is to automate authority control in-house, especially for batch-loaded bib records for e-resources. Before loading, batches of bib records for e-resources have their access points for names and topics compared to the Library of Congress' Linked Data Service (LDS) via the MarcEdit report "Validate Headings." This report changes headings in bib records that match variant access points for authorities in the LDS to the authorized forms. The bib records are then loaded into Sierra, which triggers headings reports. The "Headings used for the first time" report lists entries for headings that are new to the catalog and therefore do not match ARs in the catalog. This report can be queried with SQL to retrieve text strings to search against the authority file in OCLC via a batch process in Connexion, and the matching ARs can be downloaded in batches. An earlier version of the process will also be described, which involved using a text editor to sort access points by type and then run a series of find/replace operations using regular expressions (regexes; singular: regex) to normalize the access points for batch searching. Some pointers for applying the method in the SirsiDynix Symphony ILS follow: Symphony has a different approach to headings reports than Sierra, but Symphony's reports can still yield usable textual search strings if the report output is processed with a series of regexes similar to those used in the Sierra methods. Statistics collected to track the success rates of the headings validation tool in MarcEdit and of the batch searching of ARs based on the SQL queries are provided. The conclusion assesses the cost in staff time versus the benefit in improved access, discusses the lessons learned by the authors in this collaboration, and suggests possible refinements and improvements of the process and areas for further exploration.
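Since the workflow spans several tools, it may help to see its overall shape in code first. The sketch below is purely illustrative: every function is a hypothetical stand-in for a manual or semi-automated step described in the sections that follow, and the sample data is invented.

# Illustrative outline of the in-house authority workflow.
# Every function is a hypothetical stand-in for a step described
# later in this paper; none of this is the actual UA tooling.

def query_new_headings() -> list[str]:
    """SQL against the Sierra database for 'Headings used for the
    first time' entries (see 'The Alpha method: background and query')."""
    return ["b7001 |aAdolph, Wolfram,|earranger of music,|einstrumentalist"]

def normalize(field: str) -> str:
    """Regex find/replace substitutions that strip MARC coding,
    stop words, and operators (see the 'processing' sections)."""
    return "Adolph Wolfram"  # illustrative result only

def batch_search_and_load(terms: list[str]) -> None:
    """Stand-in for steps performed in OCLC Connexion and the ILS:
    batch search the terms, keep single-hit matches, load the ARs."""
    print(f"{len(terms)} search terms ready for Connexion")

batch_search_and_load([normalize(f) for f in query_new_headings()])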
Pre-load authority work with MarcEdit

At UA, several procedures are followed to improve the quality of bib records before loading them into Sierra. There are two categories of procedures: collection-specific tasks and heading validation. Unlike Virginia Tech's procedures as reported in Finn (2009), this work is carried out entirely in-house. Most e-resource bib record collections have specific sets of edits that are always applied, either before loading (in MarcEdit's MarcEditor program) or during the load (with specialized load tables for the collections). These edits may be local customizations (collocation fields to identify the collection, local call numbers and location codes, etc.), or for a few collections they may be more extensive, such as adding form/genre headings to streaming video collections. For a few collections, recurrent errors that have not been adequately addressed by the record suppliers have their own sets of tasks in MarcEditor. The most extreme case is a streaming video collection that has recurring errors in access points, such as incorrect forms of names, qualifiers incorrectly added to corporate body headings, and problems with subdivision coding (missing or improperly coded delimiters and subfield codes). In many cases these edits are made in MarcEdit because the applicable ARs do not have matching variant access points that would enable the ILS to automatically "flip" the access points in the bib records, and because Sierra reports, but does not automatically flip, variant forms of headings when a bib record is loaded (rather, automated processing is triggered when the ARs are loaded). These pre-load edits improve record quality, but the most dramatic and efficient processing belongs to the second category, utilizing MarcEdit's "Validate Headings" report.

Heading validation in MarcEdit compares access points in bib records to the authorities in the Library of Congress' LDS. Variant headings for names are flipped to the authorized form if there is an exact match. The Validate Headings report is a routine part of the workflow at UA for many batches of records. Because the validation report provides a statistical log of the changed headings, the statistics for each set processed are compiled to determine the relative quality of records from different publishers and whether the time required to run the report is justifiable. It was determined that there was little benefit in running the report on the brief bib records supplied for the discovery layer; on the other hand, some collections benefited significantly, especially those that had been harvested at some point from the Library of Congress or OCLC and which therefore had older forms of headings. Two years of data collection (March 2017-March 2019) showed that of a total of 1,230,195 bib records loaded in batches, 32,249 access points were changed from a variant form to the authorized form. Because the Validate Headings report notes headings in 1xx, 6xx, and 7xx fields separately, it was possible to track name access points and topical access points separately. This was helpful, as some sets, such as streaming video, tended to have far more name access points than would be typical of e-books or serials.
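The decision rule described in the next paragraph -- comparing each set's rate of correction to an OCLC benchmark -- reduces to simple arithmetic. A minimal sketch, using three of the (records loaded, headings corrected) pairs later reported in table 1:

# Corrections-per-title rates computed from compiled counts;
# the figures are those reported in table 1.
sets = {
    "OCLC WCM": (209_777, 5_468),        # the benchmark set
    "EBSCO discovery": (158_106, 1_108),
    "Films on Demand": (10_064, 1_273),
}

benchmark = sets["OCLC WCM"][1] / sets["OCLC WCM"][0]

for name, (loaded, corrected) in sets.items():
    rate = corrected / loaded
    verdict = "keep running the report" if rate > benchmark else "skip it"
    print(f"{name}: {rate:.6f} corrections per title -> {verdict}")

Running this reproduces the rates in table 1 (for example, 5,468 / 209,777 is about .026066 corrections per title for the OCLC set).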
The results for sets of bib records from different vendors were compared to the results for bib records from OCLC (via WorldShare Collection Manager), with the assumption (supported by Flynn & Kilkenny (2017)) that OCLC records were a reasonable benchmark for acceptable record quality. Using this benchmark, UA only continued using the Validate Headings report for sets that had a rate of correction higher than that of the OCLC record sets. Some selected collections are summarized in table 1, Summary of MarcEdit Validate Headings on selected record sets; the OCLC row is bolded for emphasis, as it served as the benchmark for deciding whether the time invested in the report was worthwhile.

[place table 1 here]

Pre-load authority work is particularly beneficial to the workflow because Sierra can identify headings that have not been used before in the catalog ("Headings used for the first time"), which perforce do not have corresponding ARs in the catalog: an AR's 1xx field would constitute a previous use of the heading. But because many of the headings in the bib records have been validated or changed to match existing ARs, it is more likely that ARs corresponding to the headings can be retrieved. The reported "new" headings are supplied in a report within Sierra, which made the next step -- automated retrieval of ARs in-house -- possible. Two versions of the method are detailed below, because the two slightly different approaches have different strengths and weaknesses.

Post-load authority work with headings reports

The remainder of this paper will describe the development and implementation of a method to accomplish authority control by loading ARs matching headings that have been flagged as "new" to the catalog by headings reports in the ILS. For clarity, the two different versions of the method are referred to as "Alpha" and "Beta." A third process, for another ILS, is dubbed "Gamma." The method has three components, which will be referred to as a query, processing, and batch searching. The query retrieves data; the processing prepares the data to be batch searched; batch searching uses the batch processing module in Connexion to retrieve ARs. The query and the processing vary in each version of the method, and it is hoped that the discussion of how they developed and how the different methods compare in terms of efficiency and success will be helpful to others adapting the method to their own libraries.

The Alpha method: background and query

The initial project began with the somewhat obvious thought that it would be nice to be able to gather the headings in the "Headings used for the first time" report in Sierra and batch search them in Connexion. See figure 1, Sample "Headings used for the first time" report entry. A cataloger will recognize the MARC field listed as "Field" in the report; corresponding MARC fields also appear in ARs. The challenge would be collecting the MARC data in a form that could be entered into Connexion searches.

[place figure 1 here]

A colleague in Systems (Susan DiRenzo Ashby, Coordinator, Systems, The University of Akron) identified the location of the report's components in Sierra's database, and another colleague (Michael Dowdell, Systems Administrator, The University of Akron) devised a simple SQL query to collect the MARC fields with the triggering headings. pgAdmin, a user interface for accessing databases, executing SQL queries, and managing the results, is used to run these queries and place the results in a comma-separated values (.csv) file.
The .csv file, once the contents are processed (normalized to remove MARC and Sierra codes and tags, and potential stop words, operators, or commands), can in turn be entered into Connexion's batch searching tool to retrieve matching ARs. These ARs are ultimately loaded in support of the bib records. Over time, through trial and error and with help from Craig Boman (Discovery Systems Librarian, Miami University), the query was refined.

The Alpha method queries the Sierra database for the terms listed under "Field:" in the report. The SQL query was:

SELECT field
FROM sierra_view.catmaint
WHERE condition_code_num=1
ORDER BY field;

The SQL query is asking for a particular column of data (field), in a particular table (sierra_view.catmaint), where another column in the table (condition_code_num) has a particular value (1). This has exactly the desired effect: the query returns the data labeled "Field:" from all entries in the "Headings used for the first time" report. In the case of the entry depicted in figure 1, the data is:

b7001 |aAdolph, Wolfram,|earranger of music,|einstrumentalist

Thus, all of the MARC coding (tags, indicators, subfield delimiters) and also the Sierra field group tag (here, the initial "b") are returned by the query. This data would interfere with a batch search in Connexion, since the Connexion search is querying the WorldCat authority file, which contains only authorized access points and variants. Additional data such as the relator terms in the example ("|earranger of music,|einstrumentalist") also interfere with searching. This problem is addressed later, in the "processing" component of this procedure.

The field group tag is useful, as it distinguishes name headings (tagged "a" for "author" or "b" for "other author") from subject headings (tagged "d"). This is important because Sierra requires separately loaded ARs for names when they are used as name access points (Sierra tag "a" or "b" and MARC tags 1xx or 7xx) or as subjects ("d" and 6xx). The Connexion batch searching tool, on the other hand, requires separate searches for topical headings and name headings. Fortunately, it is possible to search the index of Library of Congress (LC) names, which includes personal names, corporate bodies, conferences, and uniform titles (including name/title headings). The possible combinations of tags and headings are laid out in table 2, Headings types in Sierra and WorldCat. The shaded area highlights situations where name headings are used as subject access points.

[place table 2 here]

This is why the SQL script includes the command to ORDER the output by "field". ORDER sorts the data alphanumerically: the fields starting with "a" or "b" will be separated from the "d"s. Furthermore, those starting with "d600" through "d630" would all be grouped together, regardless of the order in which they appeared in the headings report. Sorting the full fields, with the initial Sierra and MARC tags, effectively groups these different uses of headings. That is, the sorted list is ordered into three groups: name headings used as name access points (or "names-as-names"), name headings used as subject access points (or "names-as-subjects," shown shaded in table 2), and subject headings. (A few other field group tags may also appear in the report, depending on the local settings used, but these too would be gathered by tags.)
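The grouping logic of table 2 can be made concrete in a few lines of code. The following Python sketch is an illustration only -- the UA workflow did this sorting by hand, as the next paragraph describes -- and classifies a raw report field by its Sierra field group tag and MARC tag:

import re

def classify(field: str) -> str:
    """Group a raw Sierra field from the headings report,
    following the combinations laid out in table 2."""
    if re.match(r"^[ab]", field):     # "author"/"other author" tags: names
        return "names-as-names"
    if re.match(r"^d6[0-3]", field):  # 600-630 under the subject tag
        return "names-as-subjects"
    if re.match(r"^d65", field):      # 650/651: topical and geographic
        return "subjects"
    return "other"                    # other local field group tags

# The field from figure 1:
print(classify("b7001 |aAdolph, Wolfram,|earranger of music,|einstrumentalist"))
# -> names-as-names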
The three types of headings were then manually "cut and pasted" in a text processing application (in this case, Editpad) into three distinct files to be searched and loaded with slightly different criteria: the names-as-names, which are searched as LC names and loaded as name authorities; the names-as-subjects, which are searched as LC names and loaded as subject headings; and the subject headings, which are searched as LC Subject Headings (LCSH) and loaded as subject headings.

The Alpha method: processing

The Connexion batch searching tool can import text lists of terms to search. The problem, though, remained: how to search just the data in MARC subfields that would be useable in these searches. Returning to the example in figure 1, the goal is to search just the words "Adolph" and "Wolfram" and not the words "b7001", "|aAdolph,", "Wolfram,|earranger", "of", and "music,|einstrumentalist", which is how Connexion would parse the field as retrieved. The solution arrived at for the Alpha method was to use a series of regexes to find and delete the extraneous data, which is mostly readily identified by MARC codes, and also to strip out punctuation and common stop words and operators that would confound the searches. The stop words and operators can appear in both subject and name headings -- especially name/title or uniform title headings. Consider headings such as "Same-sex divorce," "Actors with disabilities," "Cyrus, the Great, King of Persia, -530 B.C. or 529 B.C.," and "Gone with the wind (Motion picture : 1939)". The underlined words are interpreted by Connexion as potential operators or stop words, and the punctuation is interpreted as syntax for commands, any of which can interfere with a keyword search. The stop words slow down the batch process: they are not indexed and waste effort. The operators and command syntax can cause errors that stop the affected searches. Occasionally, some name elements are identical to WorldCat index labels and will not be readily searched as keywords, because the batch process interprets them as commands lacking proper punctuation. For example, the family name "Su" will be interpreted as the label "su" (for the LCSH index of WorldCat's authorities) and regarded as an error, as it is missing the ":" or "=" which would tell Connexion whether it is a keyword or browse search of that index. There is little to be done in such cases, as removing these name elements is unlikely to create a search with just one match. However, the stop words and operators can generally be removed with no loss of precision.

A somewhat complicated series of "find/replace" operations using regexes was therefore performed in the separated text files of names and subjects. The complete list of expressions used follows:

1. (.*\|a)
2. (\|db\. ca\. |\|db\. |\|d\. ca\.|\|dd\. |\|dca\. |-ca\. |\|dfl\. ca\. |\|dfl\.)
3. (\|e.*|\|4.*|\|0.*|\|j.*)
4. (\|.)
5. (“|;|:|\(|\)|\?| and | or |&c\.|&| in | an |,| the | for | on | so | with | to | by |”|’| be | that |\.{3}| near | same )

The first expression simply selects everything up to, and including, "|a", which is how Sierra represents subfield a in the MARC field. So, for the example from figure 1, this selects "b7001 |a". This selection is replaced with nothing; that is, it is simply deleted. The other expressions are all replaced with a blank space, so that the remaining terms do not run together. This is important because the Sierra database does not store the spaces that appear before or after subfield delimiters in the MARC record. The second expression selects commonly occurring AACR2 abbreviations that occur in names with uncertain or incomplete dates. These abbreviations are generally selected in the context of a name heading's subfield d (hence the "\|d" preceding some tokens); other likely contexts are signified in the expression, such as "b. ca.," "-ca." and so on. These are likely to occur in the older record sets which some vendors distribute. They may also exist in older bib records in the catalog and appear in the report because of some other edit that was made to the record. The example in figure 1 does not have any such abbreviations, however. The third expression selects relationship terms and identifiers, again including the subfield delimiters themselves. In figure 1, "|earranger of music,|einstrumentalist" would be selected. The fourth expression selects any remaining subfield delimiters and codes, such as subfield q (marking a fuller form of name). The last expression selects a variety of punctuation marks and common stop words and operators. Running these find and replace substitutions is not especially time-consuming, but they must be run in order and require some attention to detail.
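For readers who prefer to see the substitutions in executable form, here is a minimal Python rendering of the Alpha processing. It is a sketch only: the patterns are the five expressions listed above, transcribed verbatim, and leftover runs of spaces are not collapsed (the original workflow tolerated them).

import re

# The five Alpha substitutions in order. The first is replaced with
# nothing; the rest with a space so the remaining terms do not run together.
ALPHA_SUBS = [
    (r"(.*\|a)", ""),
    (r"(\|db\. ca\. |\|db\. |\|d\. ca\.|\|dd\. |\|dca\. |-ca\. |\|dfl\. ca\. |\|dfl\.)", " "),
    (r"(\|e.*|\|4.*|\|0.*|\|j.*)", " "),
    (r"(\|.)", " "),
    (r"(“|;|:|\(|\)|\?| and | or |&c\.|&| in | an |,| the | for | on | so "
     r"| with | to | by |”|’| be | that |\.{3}| near | same )", " "),
]

def alpha_normalize(field: str) -> str:
    for pattern, replacement in ALPHA_SUBS:
        field = re.sub(pattern, replacement, field)
    return field.strip()

# The example from figure 1:
print(alpha_normalize("b7001 |aAdolph, Wolfram,|earranger of music,|einstrumentalist"))
# -> "Adolph  Wolfram" (the doubled space is an artifact the workflow tolerated)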
Figure 2 shows two screenshots of some actual Sierra fields output by the Alpha query: on the left is the raw output, on the right the same screen after processing.

[place figure 2 here]

Batch searching

At this point the data, saved as a plain text file, can be imported into Connexion for batch searching. Names (whether used as names or subjects in Sierra) should be searched with the default index "LC Names" (nw:) selected; subjects with "LCSH" (su:). The batch was run with a limit of one hit per match. This limitation to a single hit avoids situations where human intervention might be required to decide between two or more similar headings that are partial matches to the entry. Such partial matches might be name/titles that matched just the name, modified name headings that matched an entry with no modifier, and so on. As an example, consider the name "Colombo, Maria" from figure 2. A search of the name authority file for "nw:colombo maria" yields twelve hits for names containing those words, but none exactly matches the entry. On closer inspection, none can be identified with the Maria Colombo in UA's catalog anyway, but even if one of the multiple hits were a match, there would be no way to automate selection of the correct heading. Moreover, there is a limit to the number of records that can be stored in a Connexion save file (10,000), and including more than one match would potentially fill the save file before all the terms in the batch are searched.

This procedure was carried out at UA for four months in 2017, with queries made about once a week. The reports had a mean of 4,412 entries, mostly due to bibliographic batch loads and a simultaneous project of re-loading certain e-resource collection bib records. About 52% of the entries were names-as-names, 6% names-as-subjects, and 42% subjects. The greatest success by far was had searching the names-as-names: 59% of the names-as-names entries returned a unique AR, while just 23% of names-as-subjects and 5% of subjects did the same. The lower success rates for name and topical subject headings can be partly explained by the fact that subdivisions were always included in the authority searches, but ARs established with main headings plus subdivisions are relatively rare.
Because the local installation of Sierra was not a version that could ignore subdivisions when creating the headings report, only subdivided ARs would match subdivided headings. So, it made sense to try to find ARs that also have the subdivisions.

In August of 2017 the project was put on hold as upgrades to Sierra were planned, and by good fortune another librarian at a conference (Craig Boman, Miami University) suggested a tweak that could eliminate (1) the need to separate the data retrieved in the query and (2) most of the processing.

The Beta method: query

Mr. Boman suggested altering the query to SELECT index_entry rather than field (C. Boman, personal communication, May 14, 2018). The "index_entry" is the data labeled "Indexed as AUTHOR" (or "Indexed as SUBJECT", etc.) in the headings report. In figure 1, this is simply "adolph wolfram." These index entries are ready to batch search, for the most part. Because the UA implementation of Sierra does not index the title part of name/title headings in the author index, there is less need to remove stop words and operators from the names. Punctuation is not present in the index entries either. But of course there remains the issue of separating names-as-names, names-as-subjects, and subjects. This was accomplished with another tweak: a condition was added to the query, based on the prefixes in the field. Instead of running one query and then separating and normalizing the output with the Alpha processing, the separation could be accomplished by running three distinct queries. The resulting data needs less processing, because the MARC coding and punctuation are already absent.

Names-as-names were selected with the following query, which exploits a regex in the search; the use of a regex is indicated by the tilde (~) and the expression enclosed in single quotes:

SELECT index_entry
FROM sierra_view.catmaint
WHERE condition_code_num=1 AND field ~'^a|^b';

The WHERE conditional now focuses on fields that begin with an "a" or "b" -- that is, on fields with the index group tag for "name" (a) or "other name" (b). As mentioned above, the index_entry will not contain subfield t, so articles and other stop words and operators are less common. Even so, conference names, place names, and uniform titles may occur in these as "names" or "other names," so there may still be some terms that will confound Connexion batch searches. For example, the abbreviation for Oregon ("Or.") will appear in the index_entry as "or", which will be interpreted as an operator in Connexion, and since it is likely to be at the end of a string, it will be an operator with bad syntax. More commonly, corporate or conference names may have words like "the" or "and," and personal names might have an "or" in uncertain dates, or AACR2 abbreviations that might not be recorded in the AR's variant (4xx) fields. Thus, some processing is still carried out.

Names-as-subjects are handled similarly, with the following query:

SELECT index_entry
FROM sierra_view.catmaint
WHERE condition_code_num=1 AND field ~'^d6[0-3]';

Here the conditional selects fields beginning with a "d" (subjects) and the MARC tags 600 through 630. This therefore selects personal names (tag 600), corporate and conference names (tags 610 and 611), or uniform titles (630). In principle a MARC tag 620 could also be selected, but in practice this should not happen, because 620 is undefined in MARC 21.

And topicals are selected with a third query:

SELECT index_entry
FROM sierra_view.catmaint
WHERE condition_code_num=1 AND field ~'^d65';

Here, any subject (d) tagged 65x is selected. UA's implementation of Sierra tags only MARC fields 650 and 651 as subjects; 653 and 655 are placed in indexes with other tag codes.
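At UA these queries were run interactively in pgAdmin, but sites comfortable with scripting could run all three in one pass with any PostgreSQL client library. A minimal sketch using psycopg2 follows; the host, port, database name, and credentials are placeholders to be adjusted for the local Sierra SQL gateway.

# A sketch of running the three Beta queries in one pass with psycopg2.
# Connection details below are placeholders, not the UA configuration.
import csv
import psycopg2

PREFIXES = {
    "names": "^a|^b",                 # field group tags a/b: names-as-names
    "names_as_subjects": "^d6[0-3]",  # d + MARC 600-630
    "subjects": "^d65",               # d + MARC 650/651
}

conn = psycopg2.connect(host="sierra-db.example.edu", port=1032,
                        dbname="iii", user="reportreader", password="...")
with conn:
    with conn.cursor() as cur:
        for name, prefix in PREFIXES.items():
            cur.execute(
                "SELECT index_entry FROM sierra_view.catmaint "
                "WHERE condition_code_num=1 AND field ~ %s",
                (prefix,),
            )
            # One output file per heading type, ready for processing.
            with open(f"{name}.csv", "w", newline="") as out:
                csv.writer(out).writerows(cur.fetchall())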
The Beta method: processing

For each query, the output .csv files are opened in Editpad and the fifth expression from the Alpha processing (the stop word and operator substitution) is used to remove stop words and operators. A minor hiccup was introduced to the process when batches of files processed by a vendor began to be loaded, as these included one or more instances of subfield 0 in MARC 6xx fields. Because UA's implementation of Sierra had not been set up to exclude subfield zero from indexes, the content of the subfield was included in the text. For example, a personal name subject access point for Derrida, Jacques--Criticism and interpretation uses the MARC coding:

600 10|aDerrida, Jacques|0http://id.loc.gov/authorities/names/n79092610|xCriticism and interpretation.|0http://id.loc.gov/authorities/subjects/sh99005576

and appeared in the index as:

derrida jacques http id loc gov authorities names n79092610 criticism and interpretation http id loc gov authorities subjects sh99005576

An additional regex was needed to strip out the content of subfield zero: (http id loc gov authorities subjects sh[\d]+)|(http id loc gov authorities names n[a-z]?[\d]+). In the future, when subfield zero is excluded from the indexes, it will not be necessary to remove these strings of characters. Thus, for this heading, after running the regexes to remove stop words, operators, and the subfield zero identifiers, the remaining data is:

derrida jacques criticism interpretation

The text file is now ready for import into a Connexion batch search. The Beta method removed a few steps from processing, and was also simpler in the sense that there was no need to cut vast selections of data from a single spreadsheet. This made the Beta method a bit less demanding of attention than the Alpha method.
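Expressed in Python, the Beta processing for such entries amounts to the two substitutions just described. This is a sketch; the patterns are transcribed from the text above, and the final whitespace tidy-up is a convenience addition.

import re

# Stop word/operator substitution (the fifth Alpha expression) and
# the subfield zero stripper described above.
STOPWORDS = (r"(“|;|:|\(|\)|\?| and | or |&c\.|&| in | an |,| the | for "
             r"| on | so | with | to | by |”|’| be | that |\.{3}| near | same )")
SUBFIELD_ZERO = (r"(http id loc gov authorities subjects sh[\d]+)"
                 r"|(http id loc gov authorities names n[a-z]?[\d]+)")

def beta_normalize(index_entry: str) -> str:
    index_entry = re.sub(STOPWORDS, " ", index_entry)
    index_entry = re.sub(SUBFIELD_ZERO, " ", index_entry)
    return re.sub(r"\s+", " ", index_entry).strip()  # convenience tidy-up

entry = ("derrida jacques http id loc gov authorities names n79092610 "
         "criticism and interpretation http id loc gov authorities "
         "subjects sh99005576")
print(beta_normalize(entry))
# -> "derrida jacques criticism interpretation"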
Results

The Alpha method was tested on 91,491 entries in the headings report over a four-month period (April 11, 2017-August 11, 2017). This ultimately yielded 31,891 ARs of all types. The Beta method was tested on a slightly smaller number of entries -- 87,077 -- collected over a six-month period (July 12, 2018-January 9, 2019). The results are summarized in table 3, Alpha and Beta results.

[place table 3 here]

The Alpha processing took noticeably longer to perform than the Beta, because the query results had to be sorted and saved into different files; the Beta process, involving just a single regex substitution, could be performed rapidly. However, the majority of the time needed for both versions was simply allowing the batch searching to run, exporting the ARs from Connexion, and loading the ARs into the ILS. Therefore the overall time spent on each method was nearly the same for a given headings report. The difference was more qualitative, as the Alpha method involved more attention to detail in selecting, reformatting, cutting and pasting, and saving data from spreadsheets. Notably, the Alpha query results always included some "junk" headings: non-MARC headings from brief bib records and headings in local 970 tags. The non-MARC headings were added to brief records by staff outside of the cataloging unit and were often incomplete; as they were not intended to be authorized access points, it made no sense to search for matching authorities. The 970 tags had been added to provide access points for the tables of contents of monographs and were in an idiosyncratic format which Sierra's automated authority processing (AAP) could not access. These "junk" headings had to be excluded from the batch processing as well.

To compare success rates, the number of successful AR retrievals is divided by the total number of entries searched in the batch to arrive at a success ratio (for example, the Alpha names: 27,935 ARs retrieved from 44,410 entries, or about .63). Comparing the success ratios for the Alpha and Beta processes, the difference is rather small overall -- about a 35% success rate for the Alpha and 33% for the Beta. Differences emerge when comparing the success ratios for specific types of headings, and the total number of headings of each type. The Alpha data shows a 63% success rate for names, versus 49% in the Beta. The rates for names-as-subjects are closer, and based on smaller sample sizes. The rates for subjects are very small, at 5% and 9% respectively. One would expect less success in subject (and name-as-subject) searches because it is not often the case that an extended string of headings and subdivisions will match an identical and unique AR. Some ILSs will ignore subdivisions when verifying subject headings, but UA's installation of Sierra checks the entire string, including subdivisions. Similarly, name/title headings can pose problems, because NACO practice is not to create an AR for every title, but only for those needing qualifiers or cross references. The issue is that the indexed fields lack the subfield delimiters which would allow subdivisions to be removed before searching. While some subject authority records (SARs) are established with subdivisions, these are a minority of all SARs, and the number of possible combinations of headings and subdivisions in bib records is vast. In principle one might search the batch of names-as-subjects in the subject index (su:). This would double the time and effort spent searching for names used as subjects, but it may be an avenue worth pursuing in the future.

The most glaring difference -- the difference in success rates for name entries -- may be explained by several factors. First, the Beta process does not provide an easy way to remove AACR2 abbreviations from dates used to qualify names, such as "b." (for born) and "d." (for died). Because these would generally occur after a subfield d in the MARC field, the second regex in the Alpha could identify and remove them. But Beta selects indexed entries rather than the full MARC, so "b." and "d." in the names could be AACR2 abbreviations or they could be initials. It may prove helpful to devise a regex that removes such abbreviations when they occur near numbers as a workaround. Secondly, the Alpha and Beta tests were not undertaken simultaneously. Because the Beta test was run later, the entries in its "Headings used for the first time" reports would be less likely to have corresponding ARs, simply because they were already being compared to a more robust authority file in the ILS due to the ARs already loaded from the Alpha method.
Ideally, the two methods should be compared using the same day's headings report. Thirdly, there is the simple fact that the bib records loaded during the two test periods were different. This would be impossible to account for completely, as different staff and faculty were loading different sets of bib records for different purposes in the normal course of the library's operation. The higher success rate for name headings in the Alpha method is thus a problem requiring more investigation to explain. All in all, there were far too many variables in the MARC ecosystem of a functioning ILS to make a truly controlled comparison. Another complicating factor is that the second set of entries, which was used to test the second version of the process, had relatively fewer subject entries overall. This accounts for the similar overall success rates (35% and 33%) despite the Alpha process seeing significantly more successful name searches. This increases the suspicion that the difference in success rates owes more to the different bib records loaded than to the processes themselves. Thus, it was clear that a direct comparison of the methods was in order.

Alpha and Beta head-to-head

A more direct and meaningful comparison would be to run the two processes against a single headings report and compare the results. This comparison was made by allowing the "Headings used for the first time" report to accumulate for several weeks until there were 21,488 entries. Then both the Alpha and Beta methods were tested, with a stopwatch running to determine the exact amount of time the queries and processing took, beginning with opening the pgAdmin tool and stopping when the three files (names, names-as-subjects, and subjects) were saved. The results confirmed that the Beta method was considerably faster: the Alpha method took fourteen minutes and two seconds, while the Beta method took five minutes and forty-nine seconds. So, the Beta process clearly has the advantage in terms of time and effort.

In the single headings report, the Alpha and Beta queries yielded similar but slightly different counts for the total number of entries in each category. These are summarized in table 4, Search strings retrieved by the Alpha and Beta queries.

[place table 4 here]

The totals were reasonably close, but were not exactly the same. This discrepancy could be explained by two factors. First, some non-MARC entries from brief bib records made their way into the Alpha list. These still begin with an index tag of "a", so the Alpha query selects them along with the MARC fields. The non-MARC fields were obvious in the Alpha results and omitted during the sorting. This would necessarily leave the Alpha lists shorter. But another unavoidable factor was that fourteen minutes had elapsed between the Alpha and Beta queries, so in effect the Beta query was querying a slightly larger report. Checking the report again after these tests revealed that another 40 entries had been added to the "Headings used for the first time" report since running the Alpha query. This small difference in totals was tolerated as insignificant.

Running the two batches of results in Connexion yielded very similar results. The Alpha process had 4345 successful searches, while the Beta had 4348. De-duplicating the results of each batch reduced the hits to 4338 and 4341, respectively. Moreover, comparing the two sets to each other showed that the Alpha batch had 36 ARs not in the Beta batch, and there were 39 ARs in the Beta but not in the Alpha. Examination of the two sets of ARs did reveal some patterns to the discrepancies. These fell into two classes: conference name authorities and name/title authorities.

Conference headings were particularly problematic for both methods. Alpha returned the AR n50062132 (International Wheat Genetics Symposium) but the Beta did not.
This can be accounted for by the fact that the MARC field which triggered the entry in the report was:

a1112 |aInternational Wheat Genetics Symposium|0http://id.loc.gov/authorities/names/n50062132|n(12th :|d2013 :|cYokohama-shi, Japan)

Note that there is a subfield zero embedded in the heading. This is an artifact of an authority vendor's processing of the record for the consortium that provides the e-resource bib records. The expression (\|e.*|\|4.*|\|0.*|\|j.*), which was used to trim relator terms and URIs from fields, removed everything following the subfield zero. Thus the Alpha method batch searched only the portion in subfield a. Meanwhile the Beta batch searched the entire conference heading, including the specific numbering, year, and place. Because this was not established separately, there was no matching authority to return. A case might be made for wanting to retrieve the general conference name AR, even if it does not match a specific index entry, much as one might retain a topical AR for subjects that only appear further subdivided in the indexes.

On the other hand, the Beta method was able to retrieve the conference authority n 86042368 (Palestine Arab Congress. Executive Committee), while Alpha did not. In this case, the conference name uses subfield e for the subordinate unit (Executive Committee). In most 1xx and 7xx tags, subfield e is used for relator terms and is therefore removed from the fields in the Alpha process. But because it is used for part of the name in 111 and 711 fields, removing subfield e creates a less specific search, and the batch process, which accepts only single matches, rejected the multiple matches in the authority file. In this case, the unmodified Palestine Arab Congress and a specific meeting in 1921 were also established, making the Alpha version of the search find three matches. Of course, since only single hits are retained, none of these were retrieved. It should be possible to further improve the queries to retrieve conference headings separately, so that they can be processed differently, with revised regex substitutions, and searched separately. This may be a project for the future.

In the case of name/title entries, there is a difference between how Sierra indexes name/title combinations and how they appear in the MARC fields. The MARC fields may be a single line, as in the case of 7xx fields with names in subfield a (and possibly qualifiers in b, c, d, and q) and titles in subfield t (and possibly qualifiers in l, s, etc.), or they may appear in two fields (1xx + 240). But Sierra reports only the name portion when a new name/title 7xx is added to the catalog. Therefore, the index entry retrieved by the Beta SQL query might be:

furman james 1937 1989

while the MARC field retrieved by the Alpha SQL query on the same entry is:

b70012|aFurman, James,|d1937-1989.|tHehlehlooyuh.

When the Alpha batch searches for "Furman James 1937 1989 Hehlehlooyuh," it will not find a match, but the Beta batch search for just the name will. It would be possible to adjust the substitution regex to remove subfield t (and anything following it) for the Alpha processing, but this would be a two-edged sword: it would avoid missing some retrievable names, but it would also be unable to retrieve name/titles. This is ultimately a special case of the recall versus precision problem. Precision was favored in this case, and subfield t retained. At this point it would seem that the two processes are quite comparable in effectiveness. They have different strengths.
Alpha can be a bit more precise; Beta is a bit less time-consuming. Which is more suitable for use at a given library will depend on the resources that can be devoted to implementing and possibly improving them.

As a final test, the ARs from each set were loaded in "test" mode, so that a count of overlays (that is, ARs which are already in the catalog) and inserts (ARs new to the catalog) was reported without actually loading the ARs. The Alpha file had 746 overlays and 3592 inserts. The Beta had 749 overlays and 3592 inserts. Overlays would generally be "harmless" in the sense that, at worst, they duplicated records already in Sierra. They might beneficially update an existing record, if the iteration in the catalog was out of date (pre-RDA forms, open dates, or changes to names). But inserts are generally the goal of this process.

One possible refinement would be to compare the relevance of the search results in terms of how many "blind references" the different methods produce. It is obviously likely that some of the successful batch searches will be "false hits" -- matches that are only partial and/or refer to different names or topics than the heading in use. The ARs thus retrieved will become "blind references," authorities that do not support any headings in bib records. Such blind references are normally suppressed or deleted in regular database maintenance. Anticipating this problem, at UA a MARC 910 field is inserted into all the ARs retrieved by the batch searches to identify the records as batch-loaded rather than manually added. This makes it possible to select the blind references originating from these batches in the "Blind references" headings report for summary deletion.

The Alpha and Beta processes can help Sierra libraries, but is this in-sourcing approach applicable to other ILSs? The answer is yes, provided the ILS has some mechanism for reporting unauthorized or uncontrolled headings.

Gamma method for SirsiDynix Symphony

Another opportunity suggested itself with the SirsiDynix Symphony ILS. Symphony has a report that will export a text file identifying "unauthorized headings." These are, like the headings in the Sierra report already discussed, headings that do not match any ARs in the system. Because the types of headings (names, topical, etc.) and even the MARC tags involved (100, 700, etc.) can be preselected before running the report, no SQL query or sorting is required. Moreover, name ARs can control both name access points and subject access points, so there is no need to search and load names-as-subjects separately from names. Figure 3 is an illustration of part of one such report's output. Note that the report was run with the options to "format report" and "view log" unchecked -- leaving these options checked produces a less useable report with line breaks and page breaks that complicate normalization.

[place figure 3 here]

In the above sample, a few differences from the output of the SQL query will be evident. First, there is the presence of some header information in the first few lines. These lines are generated by the system and can simply be selected and deleted manually in the text editor. Secondly, diacritics in this report are displayed as an additional character, typically a character with a diacritic of its own. For example, Abū Dāʼūd Sulaymān ibn al-Ashʻath al-Sijistānī is displayed here as:

Abåu Dåa®åud Sulaymåan ibn al-Ash°ath al-Sijiståanåi

The additional characters precede the characters that should have diacritics applied.
This is a character encoding issue, as the Unicode encoding does not translate correctly into the output. Third, a subfield "?" with the term "UNAUTHORIZED" is appended to each line; these appear in the staff view of bib records in Symphony as well. Finally, each line is preceded by a number indicating the number of occurrences of the heading in the database. Because these counts come before the MARC tags, it is important that the report be run for each type of heading (1xx/600/630/7xx names and 65x topical headings). But because Symphony allows NARs to authorize both name and subject uses of a name, there is no need to segregate names-as-subjects from names.

The other issues require some simple alterations to the regexes used to normalize the data in the Alpha processing described earlier. The first expression will remove the counts along with everything else preceding subfield a, and is fine as it is. Changing the third expression to (\|e.*|\|4.*|\|0.*|\|\?.*) will remove the "|?UNAUTHORIZED" along with relator terms, URIs, and the like. Removing the characters representing diacritics is a bit more complicated, but is doable. After stripping out all the other MARC coding, the expression [^a-zA-Z0-9- \x00-\x1F\x7F] will select all the special characters standing in for diacritics. (The expression matches characters that are NOT letters, numbers, a dash, a space, or special characters like line breaks.) These are to be replaced with nothing (i.e., not a blank space). The rest of the processing should be relatively obvious. Because the subfield delimiter symbol in both Sierra and Symphony is a "pipe," the other regexes in the Alpha processing will work the same way. Figure 4 shows the same entries of the sample report after running the following regex find/replace substitutions; the first and last expressions should be replaced with nothing rather than a blank space.

1. (.*\|a)
2. (\|db\. ca\. |\|db\. |\|d\. ca\.|\|dd\. |\|dca\. |-ca\. |\|dfl\. ca\. |\|dfl\.)
3. (\|e.*|\|4.*|\|0.*|\|j.*|\|\?.*)
4. (\|.|\|$)
5. ("|;|:|\(|\)|\?| and | or |&c\.|&| in | an |,| the | for | on | so | with | to | by |”|’| be | that |\.{3}| near | same |\.)
6. [^a-zA-Z0-9- \x00-\x1F\x7F]

It would be reasonable to expect success rates similar to those of the Alpha method for Sierra, though this was not tested.
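Here too a Python rendering may be useful. This sketch applies the six Gamma substitutions listed above to one line of the figure 3 sample; the final whitespace collapse is a convenience addition, producing output like that in figure 4.

import re

# The six Gamma substitutions in order; per the text, the first and last
# are replaced with nothing and the middle four with a space.
GAMMA_SUBS = [
    (r"(.*\|a)", ""),
    (r"(\|db\. ca\. |\|db\. |\|d\. ca\.|\|dd\. |\|dca\. |-ca\. |\|dfl\. ca\. |\|dfl\.)", " "),
    (r"(\|e.*|\|4.*|\|0.*|\|j.*|\|\?.*)", " "),
    (r"(\|.|\|$)", " "),
    (r'("|;|:|\(|\)|\?| and | or |&c\.|&| in | an |,| the | for | on | so '
     r'| with | to | by |”|’| be | that |\.{3}| near | same |\.)', " "),
    (r"[^a-zA-Z0-9- \x00-\x1F\x7F]", ""),
]

def gamma_normalize(line: str) -> str:
    for pattern, replacement in GAMMA_SUBS:
        line = re.sub(pattern, replacement, line)
    return re.sub(r"\s+", " ", line).strip()  # convenience whitespace collapse

# One entry from the figure 3 sample:
print(gamma_normalize(
    "1 100: 0 : |aAbåu Nuwåas,|dapproximately 756-approximately 810.|?UNAUTHORIZED"
))
# -> "Abu Nuwas approximately 756-approximately 810"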
The ongoing development 2 D. Green (personal communication, May 23, 2019) 33 of these methods through collaboration among librarians in different functional areas and different institutions has been gratifying and promises to further refine these methods. Looking ahead, further manipulations such as removing subdivisions from subject entries should improve success rates. The detailed results log from the batch searches may also be worth examination to identify headings that should be checked manually when time or staffing permits. Moreover there is likely more that can be accomplished with other headings reports in Sierra (and other ILSs). For example, the Sierra “Near matches” report, which identifies entries that are partial matches to ARs, could be used to identify ARs that may need to be checked against the authority file (either the Library of Congress Name Authority File or OCLC) for updates. It may also be practical to use a SQL query to extract the “Correct heading is:” entry from the “Invalid headings” report, which notes fields in bib records that match variant forms in ARs. Batch searching the “Correct heading is:” entries would be a way to confirm that the ARs in the catalog are current, and re-loading them would trigger Sierra’s AAP (at least in those Sierra implementations that have this feature turned on). Further study is also warranted to determine the relative effectiveness of this method versus those achieved by different vendors, in terms of the number of headings correctly matched to ARs. 34 References Aschman, A. (2002). The lowdown on automated vendor supplied authority control. Technical Services Quarterly, 20(3), 33-44. DOI: 10.1300/J124v20n03_03 Beisler, A., & Kurt, L. (2012). E-book workflow from inquiry to access: Facing the challenges to implementing e-book access at the University of Nevada, Reno. Collaborative Librarianship, 4(3), 96–116. Carrasco, R. C., Serrano, A., & Castillo-Buergo, R. (2016). A parser for authority control of author names in bibliographic records. Information Processing and Management, 52(5), 753–764. DOI: 10.1016/j.ipm.2016.02.002 Cook, D. (2014). Metadata management on a budget. Feliciter, 60(2), 24–29. David, R. H., & Thomas, D. (2015). Assessing metadata and controlling quality in scholarly ebooks. Cataloging and Classification Quarterly, 53(7), 801–824. DOI: 10.1080/01639374.2015.1018397 Dong, E., Glerum, M. A., & Fenichel, E. (2017). Using automation and batch processing to remediate duplicate series data in a shared bibliographic catalog. Library Resources & Technical Services, 61(3), 143–161. Retrieved from: https://journals.ala.org/index.php/lrts/article/view/6395/8442 Finn, M. (2009). Batch-load authority control cleanup using MarcEdit and LTI. Technical Services Quarterly, 26(1), 44–50. DOI: 10.1080/07317130802225605 Flynn, E. A., & Kilkenny, E. (2017). Cataloging from the center: Improving e-book cataloging on a consortial level. Cataloging and Classification Quarterly, 55(7–8), 630–643. DOI: 10.1080/01639374.2017.1358787 Heinrich, H. (2008). Navigating the currents of vendor-supplied cataloging. IFLA Conference Proceedings, 1–18. Jackson, R. V. (2003). Authority control is alive and...well? OLA Quarterly, 9(1), 9-12. DOI: 10.7710/1093-7374.1636 Kreyche, M., Lisius, P. H., & Park, A. (2010). The DeathFlip project: Automating death date revisions to name headings in bibliographic records. Cataloging & Classification Quarterly, 48(8), 684–695. DOI: 10.0.4.56/01639374.2010.497721 Mak, L. (2013). 
Mak, L. (2013). Coping with the storm: Automating name authority record updates and bibliographic file maintenance. OCLC Systems & Services, 29(4), 235-245. DOI: 10.1108/OCLC-02-2013-0006

Martin, K. E., & Mundle, K. (2010). Cataloging e-books and vendor records: A case study at the University of Illinois at Chicago. Library Resources & Technical Services, 54(4), 227-237. DOI: 10.5860/lrts.54n4.227

Panchyshyn, R. S. (2013). Asking the right questions: An e-resource checklist for documenting cataloging decisions for batch cataloging projects. Technical Services Quarterly, 30(1), 15-37. DOI: 10.1080/07317131.2013.735951

Park, A. L. (1992). Automated authority control: Making the transition. Special Libraries, 83(2), 75-85.

Sanchez, E., Fatout, L., Howser, A., & Vance, C. (2006). Cleanup of NetLibrary cataloging records: A methodical front-end process. Technical Services Quarterly, 23(4), 51-71. DOI: 10.1300/J124v23n04_04

Snow, K. (2017). Defining, assessing, and rethinking quality cataloging. Cataloging & Classification Quarterly, 55(7-8), 438-455. DOI: 10.1080/01639374.2017.1350774

Thompson, K., & Traill, S. (2017). Leveraging Python to improve ebook metadata selection, ingest, and management. Code4Lib, (38), 1-17.

Tingle, N., & Teeter, K. (2018). Browsing the intangible: Does visibility lead to increased use? Technical Services Quarterly, 35(2), 164-174. DOI: 10.1080/07317131.2018.1422884

Tsui, S. L., & Hinders, C. F. (1999). Cost-effectiveness and benefits of outsourcing authority control. Cataloging & Classification Quarterly, 26(4), 43-61. DOI: 10.1300/J104v26n04_04

Van Kleeck, D., Nakano, H., Langford, G., Shelton, T., Lundgren, J., & O'Dell, A. J. (2017). Managing bibliographic data quality for electronic resources. Cataloging and Classification Quarterly, 55(7/8), 560-577. DOI: 10.1080/01639374.2017.1350777

Vellucci, S. L. (2004). Commercial services for providing authority control: Outsourcing the process. Cataloging & Classification Quarterly, 39(1/2), 443-456.

Williams, H. (2010). Cleaning up the catalogue. Library & Information Update, (Jan/Feb), 46-48.

Wolf, S. (2019). Automating the authority control process. Presented at the Ohio Valley Group of Technical Services Librarians Annual Conference 2019. Retrieved from https://uknowledge.uky.edu/ovgtsl2019/conf/schedule/17/

Wu, A., & Mitchell, A. M. (2010). Mass management of e-book catalog records: Approaches, challenges, and solutions. Library Resources & Technical Services, 54(3), 164-174. DOI: 10.5860/lrts.54n3.164

Zhu, L., & von Seggern, M. (2005). Vendor-supplied authority control: Some realistic expectations. Technical Services Quarterly, 23(2), 49-65. DOI: 10.1300/J124v23n02_04

Ziso, Y., LeVan, R., & Morgan, E. L. (2010). Querying OCLC Web Services for name, subject, and ISBN. Code4Lib, (9), 1-8.

Table 1. Summary of MarcEdit Validate Headings on selected record sets

Record source | Bibliographic records loaded | Number of headings corrected | Corrections per title | % 1xx and 7xx corrected | % 6xx corrected
EBSCO* | 158,106 | 1,108 | .007008 | .002414 | .000467
OCLC WCM | 209,777 | 5,468 | .026066 | .008498 | .000903
Kanopy*** | 461,562** | 20,447 | .044300 | .01123 | .01346
Films on Demand*** | 10,064 | 1,273 | .126490 | .056359 | .000162
Alexander Street Press | 5,304 | 394 | .074284 | .012495 | .008153

*EBSCO discovery layer records. These were often brief records with few or no access points, accounting for the relatively small number of corrections.
**Kanopy records were routinely re-loaded as a collection, at the vendor’s recommendation, because corrections or changes to the records were continuous. UA’s set of Kanopy records contained fewer than 20,000 titles, but the set was reloaded in its entirety each month.
***Kanopy and Films on Demand records were also pre-edited with MarcEdit tasks that addressed certain recurring errors, as mentioned above in the text. This somewhat decreased the overall number of changes made by the Validate Headings report, but the correction rates are nonetheless still greater than the OCLC benchmarks.

Table 2. Heading types in Sierra and WorldCat

Tagging prefix   Type of authority   Sierra index   WorldCat authority index   Sierra load table
a100             Personal name       Author         LC Names                   Name authority
a110             Corporate body      Author         LC Names                   Name authority
a111             Conference name     Author         LC Names                   Name authority
b700             Personal name       Other author   LC Names                   Name authority
b710             Corporate body      Other author   LC Names                   Name authority
b711             Conference name     Other author   LC Names                   Name authority
d600             Personal name       Subject        LC Names                   Subject authority
d610             Corporate body      Subject        LC Names                   Subject authority
d611             Conference name     Subject        LC Names                   Subject authority
d630             Uniform title      Subject        LC Names                   Subject authority
d650             Subject             Subject        LCSH                       Subject authority
d651             Geographic name     Subject        LCSH                       Subject authority

Table 3. Alpha and Beta results

                                   Names     Names-as-subjects   Subjects   Total
Alpha query                        44,410    5,774               41,307     91,491
Alpha ARs retrieved                27,935    1,753               2,203      31,891
Alpha success rate (ARs/entries)   .629025   .303602             .053332    .34857
Beta query                         49,570    5,916               31,591     87,077
Beta ARs retrieved                 24,424    1,657               2,893      28,974
Beta success rate (ARs/entries)    .492717   .280088             .091577    .33274

Note: each success rate is the number of ARs retrieved divided by the number of entries in the corresponding query (e.g., 27,935 / 44,410 = .629025 for Alpha names).

Table 4. Search strings retrieved by the Alpha and Beta queries

                     Alpha    Beta
Names                10,771   10,788
Names-as-subjects    1,403    1,408
Subjects             9,320    9,339

Figure 1. Sample “Headings used for the first time” report entry

Field: b7001 |aAdolph, Wolfram,|earranger of music,|einstrumentalist
Indexed as AUTHOR: adolph wolfram
Preceded by “a”: adolph vincent r
Followed by “a”: adolphe bruce
From: b6097185x Bach, Johann Sebastian, 1685-1750, composer Rèveries 15

Figure 2. Sierra fields before and after Alpha processing

Before:
a1001 |aChough, Sung Kwun,|d1985-
a1001 |aChua, Hui Tong,|eauthor
a1001 |aChubb, Kit,|d1936-
a1001 |aCohen, Louis H.|q(Louis Harold),|d1906-|eauthor
a1001 |aColombo, Maria,|eauthor
a1001 |aCranburne, Charles,|d-1696,|edefendant
a1001 |aCrozier, C. W.,|d1807?-|eauthor

After:
Chough Sung Kwun 1985-
Chua Hui Tong
Chubb Kit 1936-
Cohen Louis H Louis Harold 1906-
Colombo Maria
Cranburne Charles -1696
Crozier C W 1807 -
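Figure 2 illustrates the kind of normalization the Alpha processing applies before searching: relator subfields are dropped, remaining subfield markers are removed, and punctuation and diacritics are stripped. The Python sketch below is a rough approximation of that behavior, not the actual processing; it reproduces the before/after pairs in Figure 2 apart from minor spacing.

    import re
    import unicodedata

    def normalize_heading(field: str) -> str:
        """Approximate the Alpha processing shown in Figure 2: drop relator
        subfields, remove subfield markers, diacritics, and punctuation."""
        # Remove relator subfields (|e and |4) along with their contents.
        field = re.sub(r'\|[e4][^|]*', '', field)
        # Replace remaining subfield markers (|a, |d, |q, |c, ...) with spaces.
        field = re.sub(r'\|[a-z0-9]', ' ', field)
        # Strip diacritics via Unicode decomposition.
        field = ''.join(ch for ch in unicodedata.normalize('NFD', field)
                        if not unicodedata.combining(ch))
        # Drop punctuation other than hyphens, then collapse whitespace.
        field = re.sub(r'[^\w\s-]', '', field)
        return ' '.join(field.split())

    # normalize_heading('|aCohen, Louis H.|q(Louis Harold),|d1906-|eauthor')
    # -> 'Cohen Louis H Louis Harold 1906-'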
Figure 3. SirsiDynix Symphony “Unauthorized Headings” report

.folddata
.report
.report
.title $(14810) .end
.subtitle $(14180)Wed May 1 10:02:02 2019 .end
.footing r .end
1 100: 0 : |a'ãAolåi,|eauthor.|?UNAUTHORIZED
1 100: 0 : |aA mi.|?UNAUTHORIZED
1 100: 0 : |aAQ,|eauthor.|?UNAUTHORIZED
1 100: 0 : |aAbraham bar Hiyya Savasorda,|dapproximately 1065-approximately 1136.|?UNAUTHORIZED
1 100: 0 : |aAbram,|cder Tate,|d1874-1962.|?UNAUTHORIZED
1 100: 0 : |aAbåu Dåa®åud Sulaymåan ibn al-Ash°ath al-Sijiståanåi,|d817 or 818-889.|?UNAUTHORIZED
1 100: 0 : |aAbåu Nuwåas,|dapproximately 756-approximately 810.|?UNAUTHORIZED
2 100: 0 : |aAbåu al-Faraj al-Iòsbahåanåi,|d897 or 898-967.|?UNAUTHORIZED
1 100: 0 : |aAbåu °Ubayd al-Qåasim ibn Sallåam,|dapproximately 773-approximately 837,|eauthor.|?UNAUTHORIZED
2 100: 0 : |aAbåu °Ubayd al-Qåasim ibn Sallåam,|dapproximately 773-approximately 837.|?UNAUTHORIZED
1 100: 0 : |aAce Hood,|d1988-|4prf|?UNAUTHORIZED
1 100: 0 : |aAce Hood.|4prf|?UNAUTHORIZED
1 100: 0 : |aAce Hood.|?UNAUTHORIZED
1 100: 0 : |aAce.|?UNAUTHORIZED
1 100: 0 : |aAchad,|cFrater,|d1886-|?UNAUTHORIZED
1 100: 0 : |aAcharya Shunya,|eauthor.|?UNAUTHORIZED
1 100: 0 : |aAchdâe,|d1961-|?UNAUTHORIZED
1 100: 0 : |aAding,|d1972-|eauthor.|?UNAUTHORIZED

Figure 4. Processed “Unauthorized Headings” report

Aoli
A mi
AQ
Abraham bar Hiyya Savasorda approximately 1065-approximately 1136
Abram der Tate 1874-1962
Abu Daud Sulayman ibn al-Ashth al-Sijistani 817 818-889
Abu Nuwas approximately 756-approximately 810
Abu al-Faraj al-Isbahani 897 898-967
Abu Ubayd al-Qasim ibn Sallam approximately 773-approximately 837
Abu Ubayd al-Qasim ibn Sallam approximately 773-approximately 837
Ace Hood 1988-
Ace Hood
Ace Hood
Ace
Achad Frater 1886-
Acharya Shunya
Achde 1961-
Ading 1972-